Index: projects/clang390-import/UPDATING
===================================================================
--- projects/clang390-import/UPDATING	(revision 305686)
+++ projects/clang390-import/UPDATING	(revision 305687)
@@ -1,1641 +1,1647 @@
Updating Information for FreeBSD current users.

This file is maintained and copyrighted by M. Warner Losh <imp@village.org>.
See end of file for further details.  For commonly done items, please see the
COMMON ITEMS: section later in the file.  These instructions assume that you
basically know what you are doing.  If not, then please consult the FreeBSD
handbook:

	http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html

Items affecting the ports and packages system can be found in
/usr/ports/UPDATING.  Please read that file before running portupgrade.

NOTE: FreeBSD has switched from gcc to clang.  If you have trouble
bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC
to bootstrap to the tip of head, and then rebuild without this option.  The
bootstrap process from older versions of current across the gcc/clang cutover
is a bit fragile.

NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW:
	FreeBSD 12.x has many debugging features turned on, in both the kernel
	and userland.  These features attempt to detect incorrect use of
	system primitives, and encourage loud failure through extra sanity
	checking and fail stop semantics.  They also substantially impact
	system performance.  If you want to do performance measurement,
	benchmarking, and optimization, you'll want to turn them off.  This
	includes various WITNESS-related kernel options, INVARIANTS, malloc
	debugging flags in userland, and various verbose features in the
	kernel.  Many developers choose to disable these features on build
	machines to maximize performance.  (To completely disable malloc
	debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely
	disable the most expensive debugging functionality run
	"ln -s 'abort:false,junk:false' /etc/malloc.conf".)

+20160908:
+	The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into
+	two separate components, QUEUE_MACRO_DEBUG_TRACE and
+	QUEUE_MACRO_DEBUG_TRASH.  Define both for the original
+	QUEUE_MACRO_DEBUG behavior.
+
20160824:
	r304787 changed some ioctl interfaces between the iSCSI userspace
	programs and the kernel.  ctladm, ctld, iscsictl, and iscsid must be
	rebuilt to work with new kernels.  __FreeBSD_version has been bumped
	to 1200005.

20160818:
	The UDP receive code has been updated to only treat incoming UDP
	packets that were addressed to an L2 broadcast address as L3
	broadcast packets.  It is not expected that this will affect any
	standards-conforming UDP application.  The new behaviour can be
	disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0.

20160818:
	Remove the openbsd_poll system call.  __FreeBSD_version has been
	bumped because of this.

20160622:
	The libc stub for the pipe(2) system call has been replaced with
	a wrapper that calls the pipe2(2) system call, and the pipe(2)
	system call is now only implemented by the kernels that include
	"options COMPAT_FREEBSD10" in their config file (this is the
	default).  Users should ensure that this option is enabled in their
	kernel or upgrade userspace to r302092 before upgrading their kernel.

20160527:
	CAM will now strip leading spaces from SCSI disks' serial numbers.
	This will affect users who create UFS filesystems on SCSI disks
	using those disks' diskid device nodes.
	For example, if /etc/fstab previously contained a line like
	"/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should
	change it to "/dev/diskid/DISK-ABCDEFG0123456".  Users of geom
	transforms like gmirror may also be affected.  ZFS users should
	generally be fine.

20160523:
	The bitstring(3) API has been updated with new functionality and
	improved performance, but it is binary-incompatible with the old API.
	Objects built with the new headers may not be linked against objects
	built with the old headers.

20160520:
	The brk and sbrk functions have been removed from libc on arm64.
	Binutils from ports has been updated to not link to these functions
	and should be updated to the latest version before installing a new
	libc.

20160517:
	The armv6 port now defaults to hard float ABI.  Limited support for
	running both hardfloat and soft float on the same system is available
	using the libraries installed with -DWITH_LIBSOFT.  This has only been
	tested as an upgrade path for installworld and packages may fail or
	need manual intervention to run.  New packages will be needed.

	To update an existing self-hosted armv6hf system, you must add
	TARGET_ARCH=armv6 on the make command line for both the build and
	the install steps.

20160510:
	Kernel modules compiled outside of a kernel build now default to
	installing to /boot/modules instead of /boot/kernel.  Many kernel
	modules built this way (such as those in ports) already overrode
	KMODDIR explicitly to install into /boot/modules.  However, manually
	building and installing a module from /sys/modules will now install
	to /boot/modules instead of /boot/kernel.

20160414:
	The CAM I/O scheduler has been committed to the kernel.  There should
	be no user visible impact.  This does enable NCQ Trim on ada SSDs.
	While the list of known rogues that claim support for this but
	actually corrupt data is believed to be complete, be on the lookout
	for data corruption.  The known rogues are:

	o Crucial MX100, M550 drives with MU01 firmware.
	o Micron M510 and M550 drives with MU01 firmware.
	o Micron M500 prior to MU07 firmware.
	o Samsung 830, 840, and 850, all firmwares.
	o FCCT M500, all firmwares.

	Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware
	with working NCQ TRIM.  For Micron branded drives, see your sales rep
	for updated firmware.  Blacklisted drives will work correctly, since
	no NCQ TRIMs are ever sent to them.  Given that this list is the same
	as found in Linux, it is believed there are no other rogues in the
	marketplace.  All other models from the above vendors work.

	To be safe, if you are at all concerned, you can quirk each of your
	drives to prevent NCQ from being sent by setting:

		kern.cam.ada.X.quirks="0x2"

	in loader.conf.  If the drive requires the 4k sector quirk, set the
	quirks entry to 0x3.

20160330:
	The FAST_DEPEND build option has been removed and its functionality
	is now the one true way.  The old mkdep(1) style of 'make depend'
	has been removed.  See 20160311 for further details.

20160317:
	Resource range types have grown from unsigned long to uintmax_t.  All
	drivers, and anything using libdevinfo, need to be recompiled.

20160311:
	WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree
	builds.  It no longer runs mkdep(1) during 'make depend', and the
	'make depend' stage can safely be skipped now, as it is run
	automatically when building 'make all' and will generate all SRCS and
	DPSRCS before building anything else.
	Dependencies are gathered at compile time with -MF flags kept in
	separate .depend files per object file.  Users should run
	'make cleandepend' once if using -DNO_CLEAN to clean out older stale
	.depend files.

20160306:
	On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND
	into kernel modules.  Therefore, if you load any kernel modules at
	boot time, please install the boot loaders after you install the
	kernel, but before rebooting, e.g.:

	make buildworld
	make kernel KERNCONF=YOUR_KERNEL_HERE
	make -C sys/boot install

	Then follow the usual steps, described in the General Notes section,
	below.

20160305:
	Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0.  Please
	see the 20141231 entry below for information about prerequisites and
	upgrading, if you are not already using clang 3.5.0 or higher.

20160301:
	The AIO subsystem is now a standard part of the kernel.  The VFS_AIO
	kernel option and aio.ko kernel module have been removed.  Due to
	stability concerns, asynchronous I/O requests are only permitted on
	sockets and raw disks by default.  To enable asynchronous I/O requests
	on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero
	value.

20160226:
	The ELF object manipulation tool objcopy is now provided by the ELF
	Tool Chain project rather than by GNU binutils.  It should be a
	drop-in replacement, with the addition of arm64 support.  The
	(temporary) src.conf(5) knob WITHOUT_ELFCOPY_AS_OBJCOPY may be set to
	obtain the GNU version if necessary.

20160129:
	Building ZFS pools on top of zvols is prohibited by default.  That
	feature has never worked safely; it's always been prone to deadlocks.
	Using a zvol as the backing store for a VM guest's virtual disk will
	still work, even if the guest is using ZFS.  Legacy behavior can be
	restored by setting vfs.zfs.vol.recursive=1.

20160119:
	The NONE and HPN patches have been removed from OpenSSH.  They are
	still available in the security/openssh-portable port.

20160113:
	With the addition of ypldap(8), a new _ypldap user is now required
	during installworld.  "mergemaster -p" can be used to add the user
	prior to installworld, as documented in the handbook.

20151216:
	The tftp loader (pxeboot) now uses the option root-path directive.
	As a consequence it no longer looks for a pxeboot.4th file on the
	tftp server.  Instead it uses the regular /boot infrastructure as
	with the other loaders.

20151211:
	The code to start recording plug and play data into the modules has
	been committed.  While the old tools will properly build a new kernel,
	a number of warnings about "unknown metadata record 4" will be
	produced by an older kldxref.  To avoid such warnings, make sure to
	rebuild the kernel toolchain (or world).  Make sure that you have
	r292078 or later when trying to build r292077 or later before
	rebuilding.

20151207:
	Debug data files are now built by default with 'make buildworld' and
	installed with 'make installworld'.  This facilitates debugging but
	requires more disk space both during the build and for the installed
	world.  Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes
	in src.conf(5).

20151130:
	r291527 changed the internal interface between the nfsd.ko and
	nfscommon.ko modules.  As such, they must both be upgraded together.
	__FreeBSD_version has been bumped because of this.

20151108:
	Added support for Unicode collation strings leads to a change in the
	order of files listed by ls(1), for example.  To get back to the old
	behaviour, set the LC_COLLATE environment variable to "C".
	Database administrators will need to reindex their databases, since
	collation results will be different.

	Due to a bug in install(1) it is recommended to remove the ancient
	locales before running make installworld:

	rm -rf /usr/share/locale/*

20151030:
	OpenSSL has been upgraded to 1.0.2d.  Any binaries requiring
	libcrypto.so.7 or libssl.so.7 must be recompiled.

20151020:
	Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0.
	Kernel modules isp_2400_multi and isp_2500_multi were removed and
	should be replaced with the isp_2400 and isp_2500 modules
	respectively.

20151017:
	The build previously allowed using 'make -n' to not recurse into
	sub-directories while showing what commands would be executed, and
	'make -n -n' to recursively show commands.  Now 'make -n' will recurse
	and 'make -N' will not.

20151012:
	If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster
	and etcupdate will now use this file.  A custom sendmail.cf is now
	updated via this mechanism rather than via installworld.  If you had
	excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want
	to remove the exclusion or change it to "always install".
	/etc/mail/sendmail.cf is now managed the same way regardless of
	whether SENDMAIL_MC/SENDMAIL_CF is used.  If you are not using
	SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior.

20151011:
	Compatibility shims for legacy ATA device names have been removed.
	This includes the ATA_STATIC_ID kernel option, the
	kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader
	tunables, the kern.devalias.* environment variables, and the /dev/ad*
	and /dev/ar* symbolic links.

20151006:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to
	3.7.0.  Please see the 20141231 entry below for information about
	prerequisites and upgrading, if you are not already using clang 3.5.0
	or higher.

20150924:
	Kernel debug files have been moved to /usr/lib/debug/boot/kernel/,
	and renamed from .symbols to .debug.  This reduces the size
	requirements on the boot partition or file system and provides
	consistency with userland debug files.

	When using the supported kernel installation method the
	/usr/lib/debug/boot/kernel directory will be renamed (to kernel.old)
	as is done with /boot/kernel.

	Developers wishing to maintain the historical behavior of installing
	debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5).

20150827:
	The wireless drivers have undergone changes that remove the 'parent
	interface' from the ifconfig -l output.  The rc.d network scripts
	used to check for the presence of a parent interface in the list, so
	old scripts would fail to start wireless networking.  Thus, an
	etcupdate(8) or mergemaster(8) run is required after a kernel update,
	to update your rc.d scripts in /etc.

20150827:
	pf no longer supports 'scrub fragment crop' or 'scrub fragment
	drop-ovl'.  These configurations are now automatically interpreted as
	'scrub fragment reassemble'.

20150817:
	Kernel-loadable modules for the random(4) device are back.  To use
	them, the kernel must have

	device	random
	options	RANDOM_LOADABLE

	kldload(8) can then be used to load random_fortuna.ko or
	random_yarrow.ko.  Please note that due to the indirect function calls
	that the loadable modules need to provide, the built-in variants will
	be slightly more efficient.

	The random(4) kernel option RANDOM_DUMMY has been retired due to
	unpopularity.  It was not all that useful anyway.

20150813:
	The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired.
	Control over building the ELF Tool Chain tools is now provided by
	the WITHOUT_TOOLCHAIN knob.
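	A minimal src.conf(5) sketch for this change; note that
	WITHOUT_TOOLCHAIN is a much broader knob that disables the entire
	toolchain, not just the ELF Tool Chain tools:

	# /etc/src.conf
	#WITHOUT_ELFTOOLCHAIN_TOOLS=yes	# retired knob, no longer honored
	WITHOUT_TOOLCHAIN=yes		# current knob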
20150810:
	The polarity of Pulse Per Second (PPS) capture events with the
	uart(4) driver has been corrected.  Prior to this change the PPS
	"assert" event corresponded to the trailing edge of a positive PPS
	pulse and the "clear" event was the leading edge of the next pulse.

	As the width of a PPS pulse in a typical GPS receiver is on the order
	of 1 millisecond, most users will not notice any significant
	difference with this change.

	Anyone who has compensated for the historical polarity reversal by
	configuring a negative offset equal to the pulse width will need to
	remove that workaround.

20150809:
	The default group assigned to /dev/dri entries has been changed from
	'wheel' to 'video' with the id of '44'.  If you want to have access
	to the dri devices please add yourself to the video group with:

	# pw groupmod video -m $USER

20150806:
	The menu.rc and loader.rc files will now be replaced during upgrades.
	Please migrate local changes to menu.rc.local and loader.rc.local
	instead.

20150805:
	GNU Binutils versions of addr2line, c++filt, nm, readelf, size,
	strings and strip have been removed.  The src.conf(5) knob
	WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools.

20150728:
	As ZFS requires more kernel stack pages than is the default on some
	architectures, e.g. i386, it now warns if KSTACK_PAGES is less than
	ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing).

	Please consider using 'options KSTACK_PAGES=X', where X is greater
	than or equal to ZFS_MIN_KSTACK_PAGES, i.e. 4, in such
	configurations.

20150706:
	sendmail has been updated to 8.15.2.  Starting with FreeBSD 11.0 and
	sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default,
	i.e., they will not contain "::".  For example, instead of ::1, it
	will be 0:0:0:0:0:0:0:1.  This permits a zero subnet to have a more
	specific match, such as different map entries for IPv6:0:0 vs
	IPv6:0.  This change requires that configuration data (including
	maps, files, classes, custom ruleset, etc.) must use the same format,
	so make certain such configuration data is upgraded.  As a very
	simple check, search for patterns like 'IPv6:[0-9a-fA-F:]*::' and
	'IPv6::'.  To return to the old behavior, set the m4 option
	confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option
	UseCompressedIPv6Addresses.

20150630:
	The default kernel entropy-processing algorithm is now Fortuna,
	replacing Yarrow.

	Assuming you have 'device random' in your kernel config file, the
	configurations allow a kernel option to override this default.  You
	may choose *ONE* of:

	options	RANDOM_YARROW	# Legacy /dev/random algorithm.
	options	RANDOM_DUMMY	# Blocking-only driver.

	If you have neither, you get Fortuna.  For most people, read no
	further: Fortuna will give a /dev/random that works like it always
	used to, and the difference will be irrelevant.

	If you remove 'device random', you get *NO* kernel-processed entropy
	at all.  This may be acceptable to folks building embedded systems,
	but has complications.  Carry on reading, and it is assumed you know
	what you need.

	*PLEASE* read random(4) and random(9) if you are in the habit of
	tweaking kernel configs, and/or if you are a member of the embedded
	community, wanting specific and not-usual behaviour from your
	security subsystems.

	NOTE!!  If you use RANDOM_DUMMY and/or have no 'device random', you
	will NOT have a functioning /dev/random, and many cryptographic
	features will not work, including SSH.  You may also find strange
	behaviour from the random(3) set of library functions, in particular
	sranddev(3), srandomdev(3) and arc4random(3).
	The reason for this is that the KERN_ARND sysctl only returns
	entropy if it thinks it has some to share, and with RANDOM_DUMMY or
	no 'device random' this will never happen.

20150623:
	An additional fix for the issue described in the 20150614 sendmail
	entry below has been committed in revision 284717.

20150616:
	FreeBSD's old make (fmake) has been removed from the system.  It is
	available as the devel/fmake port or via pkg install fmake.

20150615:
	The fix for the issue described in the 20150614 sendmail entry below
	has been committed in revision 284436.  The work around described in
	that entry is no longer needed unless the default setting is
	overridden by a confDH_PARAMETERS configuration setting of '5' or
	pointing to a 512 bit DH parameter file.

20150614:
	ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from
	atf.test.mk (included from bsd.test.mk).  Please upgrade devel/atf
	and devel/kyua to version 0.20+ and adjust any calling code to work
	with Kyuafile and kyua.

20150614:
	The import of openssl to address the FreeBSD-SA-15:10.openssl
	security advisory includes a change which rejects handshakes with DH
	parameters below 768 bits.  sendmail releases prior to 8.15.2 (not
	yet released) defaulted to a 512 bit DH parameter setting for client
	connections.  To work around this interoperability issue, sendmail
	can be configured to use a 2048 bit DH parameter by:

	1. Edit /etc/mail/`hostname`.mc
	2. If a setting for confDH_PARAMETERS does not exist or exists and is
	   set to a string beginning with '5', replace it with '2'.
	3. If a setting for confDH_PARAMETERS exists and is set to a file
	   path, create a new file with:
	   openssl dhparam -out /path/to/file 2048
	4. Rebuild the .cf file:
	   cd /etc/mail/; make; make install
	5. Restart sendmail:
	   cd /etc/mail/; make restart

	A sendmail patch is coming, at which time this file will be updated.

20150604:
	Generation of legacy formatted entries has been disabled by default
	in pwd_mkdb(8), as all base system consumers of the legacy formatted
	entries were converted to use the new format by default when the new,
	machine-independent format was added and supported since FreeBSD 5.x.

	Please see the pwd_mkdb(8) manual page for further details.

20150525:
	Clang and llvm have been upgraded to 3.6.1 release.  Please see the
	20141231 entry below for information about prerequisites and
	upgrading, if you are not already using 3.5.0 or higher.

20150521:
	TI platform code switched to using vendor DTS files and this update
	may break existing systems running on Beaglebone, Beaglebone Black,
	and Pandaboard:

	- dtb files should be regenerated/reinstalled.  Filenames are the
	  same but content is different now
	- GPIO addressing was changed, now each GPIO bank (32 pins per bank)
	  has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in the
	  old addressing scheme is now pin 25 on /dev/gpioc3.
	- Pandaboard: /etc/ttys should be updated, serial console device is
	  now /dev/ttyu2, not /dev/ttyu0

20150501:
	soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim.
	If you need the GNU extension from groff soelim(1), install groff
	from package: pkg install groff, or via ports: textproc/groff.

20150423:
	chmod, chflags, chown and chgrp now affect symlinks in -R mode as
	defined in symlink(7); previously symlinks were silently ignored.
	(A short sketch follows the 20150415 entry below.)

20150415:
	The const qualifier has been removed from iconv(3) to comply with
	POSIX.  The ports tree is aware of this from r384038 onwards.
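	A short sketch of the 20150423 change, using chown(8) and a
	throwaway directory; the -R pass now operates on the symlink itself
	instead of silently skipping it:

	mkdir -p /tmp/t/dir
	ln -s dir /tmp/t/link
	chown -R nobody /tmp/t	# now also changes the owner of the link itself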
20150416:
	Libraries specified by LIBADD in Makefiles must have a corresponding
	DPADD_<lib> variable to ensure correct dependencies.  This is now
	enforced in src.libnames.mk.

20150324:
	Support for SATA controllers that are handled by the more functional
	ahci(4), siis(4) and mvs(4) drivers has been removed from the legacy
	ata(4) driver.  The kernel modules ataahci and ataadaptec were
	removed completely, replaced by the ahci and mvs modules
	respectively.

20150315:
	Clang, llvm and lldb have been upgraded to 3.6.0 release.  Please see
	the 20141231 entry below for information about prerequisites and
	upgrading, if you are not already using 3.5.0 or higher.

20150307:
	The 32-bit PowerPC kernel has been changed to a position-independent
	executable.  This can only be booted with a version of loader(8)
	newer than January 31, 2015, so make sure to update both world and
	kernel before rebooting.

20150217:
	If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014),
	but before r278950, the RNG was not seeded properly.  Immediately
	upgrade the kernel to r278950 or later and regenerate any keys (e.g.
	ssh keys or openssl keys) that were generated with a kernel from that
	range.  This does not affect programs that directly used /dev/random
	or /dev/urandom.  All userland uses of arc4random(3) are affected.

20150210:
	The autofs(4) ABI was changed in order to restore binary
	compatibility with 10.1-RELEASE.  The automountd(8) daemon needs to
	be rebuilt to work with the new kernel.

20150131:
	The powerpc64 kernel has been changed to a position-independent
	executable.  This can only be booted with a new version of loader(8),
	so make sure to update both world and kernel before rebooting.

20150118:
	Clang and llvm have been upgraded to 3.5.1 release.  This is a bugfix
	only release, no new features have been added.  Please see the
	20141231 entry below for information about prerequisites and
	upgrading, if you are not already using 3.5.0.

20150107:
	ELF tools addr2line, elfcopy (strip), nm, size, and strings are now
	taken from the ELF Tool Chain project rather than GNU binutils.  They
	should be drop-in replacements, with the addition of arm64 support.
	The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the
	binutils tools, if necessary.  See 20150805 for updated information.

20150105:
	The default Unbound configuration now enables remote control using a
	local socket.  Users who have already enabled the local_unbound
	service should regenerate their configuration by running
	"service local_unbound setup" as root.

20150102:
	The GNU texinfo and GNU info pages have been removed.  To be able to
	view GNU info pages please install texinfo from ports.

20141231:
	Clang, llvm and lldb have been upgraded to 3.5.0 release.

	As of this release, a prerequisite for building clang, llvm and lldb
	is a C++11 capable compiler and C++11 standard library.  This means
	that to be able to successfully build the cross-tools stage of
	buildworld, with clang as the bootstrap compiler, your system
	compiler or cross compiler should either be clang 3.3 or later, or
	gcc 4.8 or later, and your system C++ library should be libc++, or
	libstdc++ from gcc 4.8 or later.

	On any standard FreeBSD 10.x or 11.x installation, where clang and
	libc++ are on by default (that is, on x86 or arm), this should work
	out of the box.

	On 9.x installations where clang is enabled by default, e.g. on x86
	and powerpc, libc++ will not be enabled by default, so libc++ should
	be built (with clang) and installed first.  If both clang and libc++
	are missing, build clang first, then use it to build libc++.
	On 8.x and earlier installations, upgrade to 9.x first, and then
	follow the instructions for 9.x above.

	Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by
	default, and do not build clang.

	Many embedded systems are resource constrained, and will not be able
	to build clang in a reasonable time, or in some cases at all.  In
	those cases, cross building bootable systems on amd64 is a
	workaround.

	This new version of clang introduces a number of new warnings, of
	which the following are most likely to appear:

	-Wabsolute-value

	This warns in two cases, for both C and C++:
	* When the code is trying to take the absolute value of an unsigned
	  quantity, which is effectively a no-op, and almost never what was
	  intended.  The code should be fixed, if at all possible.  If you
	  are sure that the unsigned quantity can be safely cast to signed,
	  without loss of information or undefined behavior, you can add an
	  explicit cast, or disable the warning.

	* When the code is trying to take an absolute value, but the called
	  abs() variant is for the wrong type, which can lead to truncation.
	  If you want to disable the warning instead of fixing the code,
	  please make sure that truncation will not occur, or it might lead
	  to unwanted side-effects.

	-Wtautological-undefined-compare and
	-Wundefined-bool-conversion

	These warn when C++ code is trying to compare 'this' against NULL,
	while 'this' should never be NULL in well-defined C++ code.  However,
	there is some legacy (pre-C++11) code out there, which actively
	abuses this feature, which was less strictly defined in previous C++
	versions.

	Squid and openjdk do this, for example.  The warning can be turned
	off for C++98 and earlier, but compiling the code in C++11 mode might
	result in unexpected behavior; for example, the parts of the program
	that are unreachable could be optimized away.

20141222:
	The old NFS client and server (kernel options NFSCLIENT, NFSSERVER)
	kernel sources have been removed.  The .h files remain, since some
	utilities include them.  This will need to be fixed later.
	If "mount -t oldnfs ..." is attempted, it will fail.
	If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the
	utilities will report errors.

20141121:
	The handling of LOCAL_LIB_DIRS has been altered to skip addition of
	directories to the top level SUBDIR variable when their parent
	directory is included in LOCAL_DIRS.  Users with build systems with
	such hierarchies and without SUBDIR entries in the parent directory
	Makefiles should add them or add the directories to LOCAL_DIRS.

20141109:
	faith(4) and faithd(8) have been removed from the base system.  Faith
	has been obsolete for a very long time.

20141104:
	vt(4), the new console driver, is enabled by default.  It brings
	support for Unicode and double-width characters, as well as support
	for UEFI and integration with the KMS kernel video drivers.

	You may need to update your console settings in /etc/rc.conf, most
	probably the keymap.  During boot, /etc/rc.d/syscons will indicate
	what you need to do.

	vt(4) still has issues and lacks some features compared to
	syscons(4).  See the wiki for up-to-date information:
	  https://wiki.freebsd.org/Newcons

	If you want to keep using syscons(4), you can do so by adding the
	following line to /boot/loader.conf:
	  kern.vty=sc

20141102:
	pjdfstest has been integrated into kyua as an opt-in test suite.
	Please see share/doc/pjdfstest/README for more details on how to
	execute it.

20141009:
	gperf has been removed from the base system for architectures that
	use clang.
	Ports that require gperf will obtain it from the devel/gperf port.

20140923:
	pjdfstest has been moved from tools/regression/pjdfstest to
	contrib/pjdfstest.

20140922:
	At svn r271982, the default linux compat kernel ABI has been adjusted
	to 2.6.18 in support of the linux-c6 compat ports infrastructure
	update.  If you wish to continue using the linux-f10 compat ports,
	add compat.linux.osrelease=2.6.16 to your local sysctl.conf.  Users
	are encouraged to update their linux-compat packages to linux-c6
	during their next update cycle.

20140729:
	The ofwfb driver, used to provide a graphics console on PowerPC when
	using vt(4), no longer allows mmap() of all physical memory.  This
	will prevent Xorg on PowerPC with some ATI graphics cards from
	initializing properly unless x11-servers/xorg-server is updated to
	1.12.4_8 or newer.

20140723:
	The xdev targets have been converted to using TARGET and TARGET_ARCH
	instead of XDEV and XDEV_ARCH.

20140719:
	The default unbound configuration has been modified to address issues
	with reverse lookups on networks that use private address ranges.  If
	you use the local_unbound service, run "service local_unbound setup"
	as root to regenerate your configuration, then
	"service local_unbound reload" to load the new configuration.

20140709:
	The GNU texinfo and GNU info pages are no longer built and installed;
	the WITH_INFO knob has been added to allow building and installing
	them again.

	UPDATE: see the 20150102 entry on texinfo's removal

20140708:
	The GNU readline library is now an INTERNALLIB - that is, it is
	statically linked into consumers (GDB and variants) in the base
	system, and the shared library is no longer installed.  The
	devel/readline port is available for third party software that
	requires readline.

20140702:
	The Itanium architecture (ia64) has been removed from the list of
	known architectures.  This is the first step in the removal of the
	architecture.

20140701:
	Commit r268115 has added NFSv4.1 server support, merged from
	projects/nfsv4.1-server.  Since this includes changes to the internal
	interfaces between the NFS related modules, a full build of the
	kernel and modules will be necessary.  __FreeBSD_version has been
	bumped.

20140629:
	The WITHOUT_VT_SUPPORT kernel config knob has been renamed
	WITHOUT_VT.  (The other _SUPPORT knobs have a consistent meaning
	which differs from the behaviour controlled by this knob.)

20140619:
	The maximal length of the serial number in CTL was increased from 16
	to 64 chars, which breaks the ABI.  All CTL-related tools, such as
	ctladm and ctld, need to be rebuilt to work with a new kernel.

20140606:
	The libatf-c and libatf-c++ major versions were downgraded to 0 and 1
	respectively to match the upstream numbers.  They were out of sync
	because, when they were originally added to FreeBSD, the upstream
	versions were not respected.  These libraries are private and not yet
	built by default, so renumbering them should be a non-issue.
	However, unclean source trees will yield broken test programs once
	the operator executes "make delete-old-libs" after a
	"make installworld".

	Additionally, the atf-sh binary was made private by moving it into
	/usr/libexec/.  Already-built shell test programs will keep the path
	to the old binary so they will break after "make delete-old" is run.

	If you are using WITH_TESTS=yes (not the default), wipe the object
	tree and rebuild from scratch to prevent spurious test failures.
	This is only needed once: the misnumbered libraries and misplaced
	binaries have been added to OptionalObsoleteFiles.inc so they will
	be removed during a clean upgrade.
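	The usual cleanup sequence for this change, run from the top of the
	source tree once the new world is installed ("make delete-old-libs"
	only after checking that nothing still uses the old libraries):

	cd /usr/src
	make installworld
	make delete-old
	make delete-old-libs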
20140512:
	Clang and llvm have been upgraded to 3.4.1 release.

20140508:
	We bogusly installed src.opts.mk in /usr/share/mk.  This file should
	be removed to avoid issues in the future (and has been added to
	ObsoleteFiles.inc).

20140505:
	/etc/src.conf now affects only builds of the FreeBSD src tree.  In
	the past, it affected all builds that used the bsd.*.mk files.  The
	old behavior was a bug, but people may have relied upon it.  To get
	this behavior back, you can .include /etc/src.conf from
	/etc/make.conf (which is still global and isn't changed).  This also
	changes the behavior of incremental builds inside the tree of
	individual directories.  Set MAKESYSPATH to ".../share/mk" to do
	that.  Although this has survived make universe and some upgrade
	scenarios, other upgrade scenarios may have broken.  At least one
	form of temporary breakage was fixed with MAKESYSPATH settings for
	buildworld as well...  In cases where MAKESYSPATH isn't working with
	this setting, you'll need to set it to the full path to your tree.

	One side effect of all this cleaning up is that bsd.compiler.mk is no
	longer implicitly included by bsd.own.mk.  If you wish to use
	COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as
	well.

20140430:
	The lindev device has been removed since /dev/full has been made a
	standard device.  __FreeBSD_version has been bumped.

20140424:
	The knob WITHOUT_VI was added to the base system, which controls
	building ex(1), vi(1), etc.  Older releases of FreeBSD required ex(1)
	in order to reorder files share/termcap and didn't build ex(1) as a
	build tool, so building/installing with WITH_VI is highly advised for
	build hosts for older releases.

	This issue has been fixed in stable/9 and stable/10 in r277022 and
	r276991, respectively.

20140418:
	The YES_HESIOD knob has been removed.  It has been obsolete for a
	decade.  Please move to using WITH_HESIOD instead or your builds will
	silently lack HESIOD.

20140405:
	The uart(4) driver has been changed with respect to its handling of
	the low-level console.  Previously the uart(4) driver prevented any
	process from changing the baudrate or the CLOCAL and HUPCL control
	flags.  By removing the restrictions, operators can make changes to
	the serial console port without having to reboot.  However, when
	getty(8) is started on the serial device that is associated with the
	low-level console, a misconfigured terminal line in /etc/ttys will
	now have a real impact.

	Before upgrading the kernel, make sure that /etc/ttys has the serial
	console device configured as 3wire without baudrate to preserve the
	previous behaviour.  E.g:

	ttyu0 "/usr/libexec/getty 3wire" vt100 on secure

20140306:
	Support for libwrap (TCP wrappers) in rpcbind was disabled by default
	to improve performance.  To re-enable it, if needed, run rpcbind with
	command line option -W.

20140226:
	Switched back to the GPL dtc compiler due to updates in the upstream
	dts files not being supported by the BSDL dtc compiler.  You will
	need to rebuild your kernel toolchain to pick up the new compiler.
	Core dumps may result while building dtb files during a kernel build
	if you fail to do so.  Set WITHOUT_GPL_DTC if you require the BSDL
	compiler.

20140216:
	Clang and llvm have been upgraded to 3.4 release.

20140216:
	The nve(4) driver has been removed.  Please use the nfe(4) driver for
	NVIDIA nForce MCP Ethernet adapters instead.

20140212:
	An ABI incompatibility crept into the libc++ 3.4 import in r261283.
	This could cause certain C++ applications using shared libraries
	built against the previous version of libc++ to crash.
	The incompatibility has now been fixed, but any C++ applications or
	shared libraries built between r261283 and r261801 should be
	recompiled.

20140204:
	OpenSSH will now ignore errors caused by the kernel lacking Capsicum
	capability mode support.  Please note that enabling the feature in
	the kernel is still highly recommended.

20140131:
	OpenSSH is now built with sandbox support, and will use sandbox as
	the default privilege separation method.  This requires Capsicum
	capability mode support in the kernel.

20140128:
	The libelf and libdwarf libraries have been updated to newer versions
	from upstream.  Shared library version numbers for these two
	libraries were bumped.  Any ports or binaries requiring these two
	libraries should be recompiled.  __FreeBSD_version is bumped to
	1100006.

20140110:
	If a Makefile in a tests/ directory was auto-generating a Kyuafile
	instead of providing an explicit one, this would prevent such a
	Makefile from providing its own Kyuafile in the future during
	NO_CLEAN builds.  This has been fixed in the Makefiles but manual
	intervention is needed to clean an objdir if you use NO_CLEAN:

	# find /usr/obj -name Kyuafile | xargs rm -f

20131213:
	The behavior of gss_pseudo_random() for the krb5 mechanism has
	changed, for applications requesting a longer random string than
	produced by the underlying enctype's pseudo-random() function.  In
	particular, the random string produced from a session key of enctype
	aes128-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different
	at the 17th octet and later, after this change.  The counter used in
	the PRF+ construction is now encoded as a big-endian integer in
	accordance with RFC 4402.  __FreeBSD_version is bumped to 1100004.

20131108:
	The WITHOUT_ATF build knob has been removed and its functionality has
	been subsumed into the more generic WITHOUT_TESTS.  If you were using
	the former to disable the build of the ATF libraries, you should
	change your settings to use the latter.

20131025:
	The default version of mtree is nmtree which is obtained from NetBSD.
	The output is generally the same, but may vary slightly.  If you find
	you need identical output, adding "-F freebsd9" to the command line
	should do the trick.  For the time being, the old mtree is available
	as fmtree.

20131014:
	libbsdyml has been renamed to libyaml and moved to /usr/lib/private.
	This will break ports-mgmt/pkg.  Rebuild the port, or upgrade to pkg
	1.1.4_8 and verify that bsdyml is not linked in, before running
	"make delete-old-libs":

	# make -C /usr/ports/ports-mgmt/pkg build deinstall install clean
	or
	# pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml

20131010:
	The stable/10 branch has been created in subversion from head
	revision r256279.

20131010:
	The rc.d/jail script has been updated to support the jail(8)
	configuration file.  The "jail_<jname>_*" rc.conf(5) variables for
	per-jail configuration are automatically converted to
	/var/run/jail.<jname>.conf before the jail(8) utility is invoked.
	This is transparently backward compatible.  See below about some
	incompatibilities and the rc.conf(5) manual page for more details.

	These variables are now deprecated in favor of the jail(8)
	configuration file.  One can use the "rc.d/jail config <jname>"
	command to generate a jail(8) configuration file in
	/var/run/jail.<jname>.conf without running the jail(8) utility.  The
	default pathname of the configuration file is /etc/jail.conf and can
	be specified by using the $jail_conf or $jail_<jname>_conf variables.

	Please note that jail_devfs_ruleset accepts an integer at this
	moment.  Please consider rewriting the ruleset name as an integer.
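	A sketch of the new mechanism for a hypothetical jail named "www":
	generate a jail(8) configuration file from the legacy jail_www_*
	rc.conf(5) variables without starting the jail, then inspect the
	result:

	service jail config www
	cat /var/run/jail.www.conf

	A minimal hand-written /etc/jail.conf equivalent would look something
	like:

	www {
		path = "/usr/jails/www";
		host.hostname = "www.example.org";
		ip4.addr = 192.0.2.10;
		exec.start = "/bin/sh /etc/rc";
		exec.stop = "/bin/sh /etc/rc.shutdown";
		mount.devfs;
	}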
20130930:
	BIND has been removed from the base system.  If all you need is a
	local resolver, simply enable and start the local_unbound service
	instead.  Otherwise, several versions of BIND are available in the
	ports tree.  The dns/bind99 port is one example.

	With this change, nslookup(1) and dig(1) are no longer in the base
	system.  Users should instead use host(1) and drill(1), which are in
	the base system.  Alternatively, nslookup and dig can be obtained by
	installing the dns/bind-tools port.

20130916:
	With the addition of unbound(8), a new unbound user is now required
	during installworld.  "mergemaster -p" can be used to add the user
	prior to installworld, as documented in the handbook.

20130911:
	OpenSSH is now built with DNSSEC support, and will by default
	silently trust signed SSHFP records.  This can be controlled with
	the VerifyHostKeyDNS client configuration setting.  DNSSEC support
	can be disabled entirely with the WITHOUT_LDNS option in src.conf.

20130906:
	The GNU Compiler Collection and C++ standard library (libstdc++) are
	no longer built by default on platforms where clang is the system
	compiler.  You can enable them with the WITH_GCC and WITH_GNUCXX
	options in src.conf.

20130905:
	The PROCDESC kernel option is now part of the GENERIC kernel
	configuration and is required for rwhod(8) to work.  If you are
	using a custom kernel configuration, you should include
	'options PROCDESC'.

20130905:
	The API and ABI related to the Capsicum framework was modified in a
	backward-incompatible way.  The userland libraries and programs have
	to be recompiled to work with the new kernel.  This includes the
	following libraries and programs, but a full buildworld is advised:
	libc, libprocstat, dhclient, tcpdump, hastd, hastctl, kdump,
	procstat, rwho, rwhod, uniq.

20130903:
	AES-NI intrinsic support has been added to gcc.  The AES-NI module
	has been updated to use this support.  A new gcc is required to build
	the aesni module on both i386 and amd64.

20130821:
	The PADLOCK_RNG and RDRAND_RNG kernel options are now devices.  Thus
	"device padlock_rng" and "device rdrand_rng" should be used instead
	of "options PADLOCK_RNG" and "options RDRAND_RNG".

20130813:
	WITH_ICONV has been split into two feature sets.  WITH_ICONV now
	enables just the iconv* functionality and is now on by default.
	WITH_LIBICONV_COMPAT enables the libiconv API and link time
	compatibility.  Set WITHOUT_ICONV to build the old way.  If you have
	been using WITH_ICONV before, you will very likely need to turn on
	WITH_LIBICONV_COMPAT.

20130806:
	The INVARIANTS option now enables DEBUG for code with OpenSolaris and
	Illumos origin, including ZFS.  If you have INVARIANTS in your kernel
	configuration, then there is no need to set DEBUG or ZFS_DEBUG
	explicitly.

	DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS)
	locks if the WITNESS option was set.  Because that generated a lot of
	witness(9) reports and all of them were believed to be false
	positives, this is no longer done.  A new option OPENSOLARIS_WITNESS
	can be used to achieve the previous behavior.

20130806:
	Timer values in IPv6 data structures now use time_uptime instead of
	time_second.  Although this is not a user-visible functional change,
	userland utilities which directly use them---ndp(8), rtadvd(8), and
	rtsold(8) in the base system---need to be updated to r253970 or
	later.

20130802:
	find -delete can now delete the pathnames given as arguments, instead
	of only files found below them or if the pathname did not contain any
	slashes.
	Formerly, the following error message would result:

	find: -delete: <path>: relative path potentially not safe

	Deleting the pathnames given as arguments can be prevented without
	error messages using -mindepth 1 or by changing directory and passing
	"." as argument to find.  This works in the old as well as the new
	version of find.

20130726:
	Behavior of devfs rules path matching has been changed.  Patterns are
	now always matched against the fully qualified devfs path, and slash
	characters must be explicitly matched by slashes in the pattern
	(FNM_PATHNAME).  Rulesets involving devfs subdirectories must be
	reviewed.

20130716:
	The default ARM ABI has changed to the ARM EABI.  The old ABI is
	incompatible with the ARM EABI and all programs and modules will need
	to be rebuilt to work with a new kernel.

	To keep using the old ABI, ensure the WITHOUT_ARM_EABI knob is set.

	NOTE: Support for the old ABI will be removed in the future and users
	are advised to upgrade.

20130709:
	pkg_install has been disconnected from the build.  If you really need
	it, add WITH_PKGTOOLS to your src.conf(5).

20130709:
	Most of the network statistics structures were changed to be able to
	keep 64-bit counters.  Thus all tools that work with networking
	statistics must be rebuilt (netstat(1), bsnmpd(1), etc.)

20130618:
	Fix a bug that allowed a tracing process (e.g. gdb) to write to a
	memory-mapped file in the traced process's address space even if
	neither the traced process nor the tracing process had write access
	to that file.

20130615:
	CVS has been removed from the base system.  An exact copy of the code
	is available from the devel/cvs port.

20130613:
	Some people report the following error after the switch to bmake:

	make: illegal option -- J
	usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable] ...
	*** [buildworld] Error code 2

	This is likely due to an old instance of make in ${MAKEPATH}
	(${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}), which src/Makefile
	will use blindly if it exists.  If you see the above error:

	rm -rf `make -V MAKEPATH`

	should resolve it.

20130516:
	Use bmake by default.  Whereas before one could choose to build with
	bmake via -DWITH_BMAKE, one must now use -DWITHOUT_BMAKE to use the
	old make.  The goal is to remove these knobs for 10-RELEASE.

	It is worth noting that bmake (like gmake) treats the command line as
	the unit of failure, rather than statements within the command line.
	Thus '(cd some/where && dosomething)' is safer than
	'cd some/where; dosomething'.  The '()' allows consistent behavior in
	a parallel build.

20130429:
	Fix a bug that allowed NFS clients to issue READDIR on files.

20130426:
	The WITHOUT_IDEA option has been removed because the IDEA patent
	expired.

20130426:
	The sysctl which controls TRIM support under ZFS has been renamed
	from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been
	enabled by default.

20130425:
	The mergemaster command now uses the default MAKEOBJDIRPREFIX rather
	than creating its own in the temporary directory, in order to allow
	access to bootstrapped versions of tools such as install and mtree.
	When upgrading from a version of FreeBSD where the install command
	does not support -l, you will need to install a new mergemaster
	command if mergemaster -p is required.  This can be accomplished with
	the command (cd src/usr.sbin/mergemaster && make install).

20130404:
	The legacy ATA stack, disabled and replaced by the new CAM-based one
	since FreeBSD 9.0, has been completely removed from the sources.  The
	kernel modules atadisk and atapi*, and the user-level tools atacontrol
	and burncd have been removed.
	The kernel option `options ATA_CAM` is now permanently enabled and
	has been removed as an option.

20130319:
	SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2)
	and socketpair(2).  Software, in particular Kerberos, may
	automatically detect and use these during building.  The resulting
	binaries will not work on older kernels.

20130308:
	CTL_DISABLE has also been added to the sparc64 GENERIC (for further
	information, see the respective 20130304 entry).

20130304:
	Recent commits to callout(9) changed the size of struct callout, so
	the KBI is probably heavily disturbed.  Also, some functions in the
	callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced by
	macros.  Every kernel module using them won't load, so a rebuild is
	required.

	The ctl device has been re-enabled in GENERIC for i386 and amd64,
	but does not initialize by default (because of the new CTL_DISABLE
	option) to save memory.  To re-enable it, remove the CTL_DISABLE
	option from the kernel config file or set kern.cam.ctl.disable=0 in
	/boot/loader.conf.

20130301:
	The ctl device has been disabled in GENERIC for i386 and amd64.
	This was done due to the extra memory being allocated at system
	initialisation time by the ctl driver, which was only used if a CAM
	target device was created.  This made a FreeBSD system unusable on
	128MB or less of RAM.

20130208:
	A new compression method (lz4) has been merged to -HEAD.  Please
	refer to zpool-features(7) for more information.

	Please refer to the "ZFS notes" section of this file for information
	on upgrading boot ZFS pools.

20130129:
	A BSD-licensed patch(1) variant has been added and is installed as
	bsdpatch, with the GNU version remaining the default patch.  To
	invert the logic and use the BSD-licensed one as default, while
	having the GNU version installed as gnupatch, rebuild and install
	world with the WITH_BSD_PATCH knob set.

20130121:
	Due to the use of the new -l option to install(1) during build and
	install, you must take care not to directly set the INSTALL make
	variable in your /etc/make.conf, /etc/src.conf, or on the command
	line.  If you wish to use the -C flag for all installs you may be
	able to add INSTALL+=-C to /etc/make.conf or /etc/src.conf.

20130118:
	The install(1) option -M has changed meaning and now takes an
	argument that is a file or path to append logs to.  In the unlikely
	event that -M was the last option on the command line and the command
	line contained at least two files and a target directory the first
	file will have logs appended to it.  The -M option served little
	practical purpose in the last decade so its use is expected to be
	extremely rare.

20121223:
	After switching to Clang as the default compiler some users of ZFS
	on i386 systems started to experience stack overflow kernel panics.
	Please consider using 'options KSTACK_PAGES=4' in such
	configurations.

20121222:
	GEOM_LABEL now mangles label names read from file system metadata.
	Mangling affects labels containing spaces, non-printable characters,
	'%' or '"'.  Device names in /etc/fstab and other places may need to
	be updated (a sketch follows the 20121201 entry below).

20121217:
	By default, only the 10 most recent kernel dumps will be saved.  To
	restore the previous behaviour (no limit on the number of kernel
	dumps stored in the dump directory) add the following line to
	/etc/rc.conf:

	savecore_flags=""

20121201:
	With the addition of auditdistd(8), a new auditdistd user is now
	required during installworld.  "mergemaster -p" can be used to add
	the user prior to installworld, as documented in the handbook.
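	A sketch of the 20121222 GEOM_LABEL change, assuming a hypothetical
	UFS label "My Disk"; the mangled name uses %-escapes in the same
	style as the diskid example in the 20160527 entry above, so the
	corresponding /etc/fstab device field would now need to read
	roughly:

	/dev/ufs/My%20Disk	/data	ufs	rw	2	2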
20121117:
	The sin6_scope_id member variable in struct sockaddr_in6 is now
	filled by the kernel before passing the structure to the userland via
	sysctl or routing socket.  This means the KAME-specific embedded
	scope id in sin6_addr.s6_addr[2] is always cleared in userland
	applications.  This behavior can be controlled by
	net.inet6.ip6.deembed_scopeid.  __FreeBSD_version is bumped to
	1000025.

20121105:
	On i386 and amd64 systems WITH_CLANG_IS_CC is now the default.  This
	means that the world and kernel will be compiled with clang and that
	clang will be installed as /usr/bin/cc, /usr/bin/c++, and
	/usr/bin/cpp.  To disable this behavior and revert to building with
	gcc, compile with WITHOUT_CLANG_IS_CC.  Really old versions of
	current may need to bootstrap WITHOUT_CLANG first if the clang build
	fails (its compatibility window doesn't extend to the 9 stable branch
	point).

20121102:
	The IPFIREWALL_FORWARD kernel option has been removed.  Its
	functionality is now turned on by default.

20121023:
	The ZERO_COPY_SOCKET kernel option has been removed and split into
	SOCKET_SEND_COW and SOCKET_RECV_PFLIP.
	NB: SOCKET_SEND_COW uses the VM page based copy-on-write mechanism
	which is not safe and may result in kernel crashes.
	NB: The SOCKET_RECV_PFLIP mechanism is useless as no current driver
	supports disposable external page sized mbuf storage.
	Proper replacements for both zero-copy mechanisms are under
	consideration and will eventually lead to complete removal of the two
	kernel options.

20121023:
	The IPv4 network stack has been converted to network byte order.  The
	following modules need to be recompiled together with the kernel:
	carp(4), divert(4), gif(4), siftr(4), gre(4), pf(4), ipfw(4),
	ng_ipfw(4), stf(4).

20121022:
	Support for non-MPSAFE filesystems was removed from VFS.  The
	VFS_VERSION was bumped; all filesystem modules shall be recompiled.

20121018:
	All the non-MPSAFE filesystems have been disconnected from the build.
	The full list includes: codafs, hpfs, ntfs, nwfs, portalfs, smbfs,
	xfs.

20121016:
	The interface cloning API and ABI has changed.  The following modules
	need to be recompiled together with the kernel: ipfw(4), pfsync(4),
	pflog(4), usb(4), wlan(4), stf(4), vlan(4), disc(4), edsc(4),
	if_bridge(4), gif(4), tap(4), faith(4), epair(4), enc(4), tun(4),
	if_lagg(4), gre(4).

20121015:
	The sdhci driver was split into two parts: sdhci (generic SD Host
	Controller logic) and sdhci_pci (actual hardware driver).  No kernel
	config modifications are required, but if you load sdhci as a module
	you must switch to sdhci_pci instead.

20121014:
	Import the FUSE kernel and userland support into the base system.

20121013:
	The GNU sort(1) program has been removed since the BSD-licensed
	sort(1) has been the default for quite some time and no serious
	problems have been reported.  The corresponding WITH_GNU_SORT knob
	has also gone.

20121006:
	The pfil(9) API/ABI for the AF_INET family has been changed.  The
	packet filtering modules pf(4), ipfw(4) and ipfilter(4) need to be
	recompiled with the new kernel.

20121001:
	The net80211(4) ABI has been changed to allow for improved driver
	PS-POLL and power-save support.  All wireless drivers need to be
	recompiled to work with the new kernel.

20120913:
	The random(4) support for the VIA hardware random number generator
	(`PADLOCK') is no longer enabled unconditionally.  Add the
	padlock_rng device in the custom kernel config if needed.  The
	GENERIC kernels on i386 and amd64 do include the device, so the
	change only affects custom kernel configurations.

20120908:
	The pf(4) packet filter ABI has been changed.
	pfctl(8) and the snmp_pf module need to be recompiled to work with
	the new kernel.

20120828:
	A new ZFS feature flag "com.delphix:empty_bpobj" has been merged to
	-HEAD.  Pools that have empty_bpobj in active state cannot be
	imported read-write with ZFS implementations that do not support
	this feature.  For more information read the zpool-features(5)
	manual page.

20120727:
	The sparc64 ZFS loader has been changed to no longer try to
	auto-detect ZFS providers based on diskN aliases but now requires
	these to be explicitly listed in the OFW boot-device environment
	variable.

20120712:
	OpenSSL has been upgraded to 1.0.1c.  Any binaries requiring
	libcrypto.so.6 or libssl.so.6 must be recompiled.  Also, there are
	configuration changes.  Make sure to merge /etc/ssl/openssl.cnf.

20120712:
	The following sysctls and tunables have been renamed for consistency
	with other variables:

	kern.cam.da.da_send_ordered	-> kern.cam.da.send_ordered
	kern.cam.ada.ada_send_ordered	-> kern.cam.ada.send_ordered

20120628:
	The sort utility has been replaced with BSD sort.  For now, GNU sort
	is also available as "gnusort" or the default can be set back to GNU
	sort by setting WITH_GNU_SORT.  In this case, BSD sort will be
	installed as "bsdsort".

20120611:
	A new version of ZFS (pool version 5000) has been merged to -HEAD.
	Starting with this version the old system of ZFS pool versioning is
	superseded by "feature flags".  This concept enables forward
	compatibility against certain future changes in functionality of ZFS
	pools.  The first read-only compatible "feature flag" for ZFS pools
	is named "com.delphix:async_destroy".  For more information read the
	new zpool-features(5) manual page.

	Please refer to the "ZFS notes" section of this file for information
	on upgrading boot ZFS pools.

20120417:
	The malloc(3) implementation embedded in libc now uses sources
	imported as contrib/jemalloc.  The most disruptive API change is to
	/etc/malloc.conf.  If your system has an old-style /etc/malloc.conf,
	delete it prior to installworld, and optionally re-create it using
	the new format after rebooting.  See malloc.conf(5) for details
	(specifically the TUNING section and the "opt.*" entries in the
	MALLCTL NAMESPACE section).

20120328:
	Big-endian MIPS TARGET_ARCH values no longer end in "eb".  mips64eb
	is now spelled mips64.  mipsn32eb is now spelled mipsn32.  mipseb is
	now spelled mips.  This is to aid compatibility with third-party
	software that expects this naming scheme in uname(3).  Little-endian
	settings are unchanged.  If you are updating a big-endian mips64
	machine from before this change, you may need to set
	MACHINE_ARCH=mips64 in your environment before the new build system
	will recognize your machine.

20120306:
	Disable by default the option VFS_ALLOW_NONMPSAFE for all supported
	platforms.

20120229:
	Now unix domain sockets behave "as expected" on nullfs(5).
	Previously nullfs(5) did not pass through all behaviours to the
	underlying layer; as a result, if we bound to a socket on the lower
	layer we could connect only to the lower path, and if we bound to
	the upper layer we could connect only to the upper path.  The new
	behavior is that one can connect to both the lower and the upper
	paths regardless of which layer path one binds to.

20120211:
	The getifaddrs upgrade path broken with 20111215 has been restored.
	If you have upgraded in between 20111215 and 20120209 you need to
	recompile libc again with your kernel.  You still need to recompile
	world to be able to configure CARP, but this restriction already
	comes from 20111215.
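	One way to rebuild just libc for this change, assuming a source tree
	in /usr/src (a full buildworld/installworld also works):

	cd /usr/src/lib/libc
	make obj && make depend && make all install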
20120114:
	The set_rcvar() function has been removed from /etc/rc.subr.  All
	base and ports rc.d scripts have been updated, so if you have a port
	installed with a script in /usr/local/etc/rc.d you can either
	hand-edit the rcvar= line, or reinstall the port.

	An easy way to handle the mass-update of /etc/rc.d:

	rm /etc/rc.d/* && mergemaster -i

20120109:
	panic(9) now stops other CPUs in SMP systems, disables interrupts on
	the current CPU and prevents other threads from running.  This
	behavior can be reverted using the kern.stop_scheduler_on_panic
	tunable/sysctl.  The new behavior can be incompatible with
	kern.sync_on_panic.

20111215:
	The carp(4) facility has been changed significantly.  Configuration
	of the CARP protocol via ifconfig(8) has changed, as has the format
	of CARP events submitted to devd(8).  See the manual pages for more
	information.  The arpbalance feature of carp(4) is no longer
	supported.  The size of struct in_aliasreq and struct in6_aliasreq
	has changed.  User utilities using SIOCAIFADDR and SIOCAIFADDR_IN6,
	e.g. ifconfig(8), need to be recompiled.

20111122:
	The acpi_wmi(4) status device /dev/wmistat has been renamed to
	/dev/wmistat0.

20111108:
	The VFS_ALLOW_NONMPSAFE option has been added in order to explicitly
	support non-MPSAFE filesystems.  It is on by default for all
	supported platforms at this time.

20111101:
	The broken amd(4) driver has been replaced with esp(4) in the amd64,
	i386 and pc98 GENERIC kernel configuration files.

20110930:
	sysinstall has been removed.

20110923:
	The stable/9 branch was created in subversion.  This corresponds to
	the RELENG_9 branch in CVS.

COMMON ITEMS:

	General Notes
	-------------
	Avoid using make -j when upgrading.  While generally safe, there are
	sometimes problems using -j to upgrade.  If your upgrade fails with
	-j, please try again without -j.  From time to time in the past there
	have been problems using -j with buildworld and/or installworld.
	This is especially true when upgrading between "distant" versions
	(e.g. ones that cross a major release boundary or several minor
	releases, or when several months have passed on the -current branch).

	Sometimes, obscure build problems are the result of environment
	poisoning.  This can happen because the make utility reads its
	environment when searching for values for global variables.  To run
	your build attempts in an "environmental clean room", prefix all make
	commands with 'env -i '.  See the env(1) manual page for more
	details.  (A sketch of such an invocation follows at the end of
	these notes.)

	When upgrading from one major version to another it is generally
	best to upgrade to the latest code in the currently installed branch
	first, then do an upgrade to the new branch.  This is the best-tested
	upgrade path, and has the highest probability of being successful.
	Please try this approach before reporting problems with a major
	version upgrade.

	When upgrading a live system, having a root shell around before
	installing anything can help undo problems.  Not having a root shell
	around can lead to problems if pam has changed too much from your
	starting point to allow continued authentication after the upgrade.

	This file should be read as a log of events.  When a later event
	changes information of a prior event, the prior event should not be
	deleted.  Instead, a pointer to the entry with the new information
	should be placed in the old entry.  Readers of this file should also
	sanity check older entries before relying on them blindly.  Authors
	of new entries should write them with this in mind.
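	A sketch of such an "environmental clean room" invocation, using the
	usual default values for PATH and MAKEOBJDIRPREFIX (adjust to taste;
	env -i sets these as environment variables, which is also what the
	MAKEOBJDIRPREFIX note below requires):

	cd /usr/src
	env -i PATH=/sbin:/bin:/usr/sbin:/usr/bin MAKEOBJDIRPREFIX=/usr/obj \
		make buildworld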
ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe procedure, as it uses a /usr/obj tree with a fresh mini-buildworld: make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information), run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To just build a kernel when you know that it won't mess you up -------------------------------------------------------------- This assumes you are already running a CURRENT system. Replace ${arch} with the architecture of your machine (e.g. "i386", "arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc). cd src/sys/${arch}/conf config KERNEL_NAME_HERE cd ../compile/KERNEL_NAME_HERE make depend make make install If this fails, go to the "To build a kernel" section. To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you may have to do more than # is listed here if you are upgrading from a really old current. make buildworld make kernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make kernel KERNCONF=YOUR_KERNEL_HERE [8] [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that you have read and understood all the recent messages there before you update your sources. If in doubt, please track -stable, which has far fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot.
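For example, if the port enabled its kernel module via loader.conf(5), comment that line out before rebooting ("vmmon_load" below is only an illustration; the knob name varies by port):

	sed -i.bak -e 's/^vmmon_load="YES"/#vmmon_load="YES"/' /boot/loader.conf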
[3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. Attempting to do it by hand is not recommended and those who pursue this avenue should read this file carefully, as well as the archives of the freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a noop. However, from time to time you may need to do this if you get an "unknown user" error in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] When checking out sources, you must include the -P flag to have cvs prune empty directories. If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in the environment, and not on the command line or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: projects/clang390-import/contrib/bmake/ChangeLog =================================================================== --- projects/clang390-import/contrib/bmake/ChangeLog (revision 305686) +++ projects/clang390-import/contrib/bmake/ChangeLog (revision 305687) @@ -1,1936 +1,1965 @@ +2016-08-18 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20160818 + its a neater number; pick up whitespace fixes to man page. + +2016-08-17 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20160817 + Merge with NetBSD make, pick up + o meta.c: move handling of .MAKE.META.IGNORE_* to meta_ignore() + so we can call it before adding entries to missingFiles. + Thus we do not track files we have been told to ignore. + +2016-08-15 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20160815 + Merge with NetBSD make, pick up + o meta_oodate: apply .MAKE.META.IGNORE_FILTER (if defined) to + pathnames, and skip if the expansion is empty. + Useful for dirdeps.mk when checking DIRDEPS_CACHE. + +2016-08-12 Simon J. Gerraty + + * Makefile (_MAKE_VERSION): 20160812 + Merge with NetBSD make, pick up + o meta.c: remove all missingFiles entries that match a deleted + dir. + o main.c: set .ERROR_CMD if possible. + 2016-06-06 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160606 Merge with NetBSD make, pick up o dir.c: extend mtimes cache to others via cached_stat() 2016-06-04 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160604 Merge with NetBSD make, pick up o meta.c: missing filemon data is only relevant if we read a meta file. Also do not return oodate for a missing metafile if gn->path points to .CURDIR 2016-06-02 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160602 Merge with NetBSD make, pick up o cached_realpath(): avoid hitting filesystem more than necessary. o meta.c: refactor need_meta decision, add knobs for missing meta file and filemon data wrt out-of-datedness. 2016-05-28 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160528 * boot-strap, make-bootstrap.sh.in: Makefile now uses _MAKE_VERSION 2016-05-12 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160512 Merge with NetBSD make, pick up o meta.c: ignore paths that match .MAKE.META.IGNORE_PATTERNS this is useful for gcov builds. o propagate errors from filemon(4). 2016-05-09 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160509 Merge with NetBSD make, pick up o remove use of non-standard types u_int etc. o meta.c: apply realpath() before matching against metaIgnorePaths 2016-04-04 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160404 Merge with NetBSD make, pick up o allow makefile to set .MAKE.JOBS * Makefile (PROG_NAME): use ${_MAKE_VERSION} 2016-03-15 Simon J. Gerraty * Makefile (_MAKE_VERSION): 20160315 Merge with NetBSD make, pick up o fix handling of archive members 2016-03-13 Simon J. Gerraty * Makefile (_MAKE_VERSION): rename variable to avoid interference with checks for ${MAKE_VERSION} 2016-03-10 Simon J. Gerraty * Makefile (MAKE_VERSION): 20160310 Merge with NetBSD make, pick up o meta.c: treat missing Read file same as Write, incase we Delete it. 2016-03-07 Simon J. Gerraty * Makefile (MAKE_VERSION): 20160307 Merge with NetBSD make, pick up o var.c: fix :ts\nnn to be octal by default. o meta.c: meta_finish() to cleanup memory. 2016-02-26 Simon J. Gerraty * Makefile (MAKE_VERSION): 20160226 Merge with NetBSD make, pick up o meta.c: allow meta file for makeDepend if makefiles want it. 2016-02-19 Simon J. 
Gerraty * var.c: default .MAKE.SAVE_DOLLARS to FALSE for backwards compatability. * Makefile (MAKE_VERSION): 20160220 Merge with NetBSD make, pick up o var.c: add knob to control handling of '$$' in := 2016-02-18 Simon J. Gerraty * Makefile (MAKE_VERSION): 20160218 Merge with NetBSD make, pick up o var.c: add .export-literal allows us to fix sys.clean-env.mk post the changes to Var_Subst. Var_Subst now takes flags, and does not consume '$$' in := 2016-02-17 Simon J. Gerraty * Makefile (MAKE_VERSION): 20160217 Merge with NetBSD make, pick up o var.c: preserve '$$' in := o parse.c: add .dinclude for handling included makefile like .depend 2015-12-20 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151220 Merge with NetBSD make, pick up o suff.c: re-initialize suffNull when clearing suffixes. 2015-12-01 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151201 Merge with NetBSD make, pick up o cond.c: CondCvtArg: avoid access beyond end of empty buffer. o meta.c: meta_oodate: use lstat(2) for checking link target in case it is a symlink. o var.c: avoid calling brk_string and Var_Export1 with empty strings. 2015-11-26 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151126 Merge with NetBSD make, pick up o parse.c: ParseTrackInput don't access beyond end of old value. 2015-10-22 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151022 * Add support for BSD/OS which lacks inttypes.h and really needs sys/param.h for sys/sysctl.h also 'type' is not a shell builtin. * var.c: eliminate uint32_t and need for inttypes.h * main.c: PrintOnError flush stdout before run .ERROR * parse.c: cope with _SC_PAGESIZE not being defined. 2015-10-20 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151020 Merge with NetBSD make, pick up o var.c: fix uninitialized var 2015-10-12 Simon J. Gerraty * var.c: the conditional expressions used with ':?' can be expensive, if already discarding do not evaluate or expand anything. 2015-10-10 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151010 Merge with NetBSD make, pick up o Add Boolean wantit flag to Var_Subst and Var_Parse when FALSE we know we are discarding the result and can skip operations like Cmd_Exec. 2015-10-09 Simon J. Gerraty * Makefile (MAKE_VERSION): 20151009 Merge with NetBSD make, pick up o var.c: don't check for NULL before free() o meta.c: meta_oodate, do not hard code ignore of makeDependfile 2015-09-10 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150910 Merge with NetBSD make, pick up o main.c: with -w print Enter/Leaving messages for objdir too if necessary. o centralize shell metachar handling * FILES: add metachar.[ch] 2015-06-06 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150606 Merge with NetBSD make, pick up o make.1: document .OBJDIR target 2015-05-05 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150505 Merge with NetBSD make, pick up o cond.c: be strict about lhs of comparison when evaluating .if but less so when called from variable expansion. o unit-tests/cond2.mk: test various error conditions 2015-05-04 Simon J. Gerraty * machine.sh (MACHINE): Add Bitrig patch from joerg@netbsd.org 2015-04-18 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150418 Merge with NetBSD make, pick up o job.c: use memmove() rather than memcpy() * unit-tests/varshell.mk: SunOS cannot handle the TERMINATED_BY_SIGNAL case, so skip it. 2015-04-11 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150411 bump version - only mk/ changes. 2015-04-10 Simon J. 
Gerraty * Makefile (MAKE_VERSION): 20150410 Merge with NetBSD make, pick up o document different handling of '-' in jobs mode vs compat o fix jobs mode so that '-' only applies to whole job when shell lacks hasErrCtl o meta.c: use separate vars to track lcwd and latestdir (read) per process 2015-04-01 Simon J. Gerraty * Makefile (MAKE_VERSION): 20150401 Merge with NetBSD make, pick up o meta.c: close meta file in child * Makefile: use BINDIR.bmake if set. Same for MANDIR and SHAREDIR Handy for testing release candidates in various environments. 2015-03-26 Simon J. Gerraty * move initialization of savederr to block where it is used to avoid spurious warning from gcc5 2014-11-11 Simon J. Gerraty * Makefile (MAKE_VERSION): 20141111 just a cooler number 2014-11-05 Simon J. Gerraty * Makefile (MAKE_VERSION): 20141105 Merge with NetBSD make, pick up o revert major overhaul of suffix handling and POSIX compliance - too much breakage and impossible to make backwards compatible. o we still have the new unit test structure which is ok. o meta.c ensure "-- filemon" is at start of line. 2014-09-17 Simon J. Gerraty * configure.in: test that result of getconf PATH_MAX is numeric and discard if not. Apparently needed for Hurd. 2014-08-30 Simon J. Gerraty * Makefile (MAKE_VERSION): 20140830 Merge with NetBSD make, pick up o major overhaul of suffix handling o improved POSIX compliance o overhauled unit-tests 2014-06-20 Simon J. Gerraty * Makefile (MAKE_VERSION): 20140620 Merge with NetBSD make, pick up o var.c return varNoError rather than var_Error for ::= modifiers. 2014-05-22 Simon J. Gerraty * Makefile (MAKE_VERSION): 20140522 Merge with NetBSD make, pick up o var.c detect some parse errors. 2014-04-05 Simon J. Gerraty * Fix spelling errors - patch from Pedro Giffuni 2014-02-14 Simon J. Gerraty * Makefile (MAKE_VERSION): 20140214 Merge with NetBSD make, pick up o .INCLUDEFROM* o use Var_Value to get MAKEOBJDIR[PREFIX] o reduced realloc'ign in brk_string. * configure.in: add a check for compiler supporting __func__ 2014-01-03 Simon J. Gerraty * boot-strap: ignore mksrc=none 2014-01-02 Simon J. Gerraty * Makefile (DEFAULT_SYS_PATH?): use just ${prefix}/share/mk 2014-01-01 Simon J. Gerraty * Makefile (MAKE_VERSION): 20140101 * configure.in: set bmake_path_max to min(_SC_PATH_MAX,1024) * Makefile.config: defined BMAKE_PATH_MAX to bmake_path_max * make.h: use BMAKE_PATH_MAX if MAXPATHLEN not defined (needed for Hurd) * configure.in: Add AC_PREREQ and check for sysctl; patch from Andrew Shadura andrewsh at debian.org 2013-10-16 Simon J. Gerraty * Makefile (MAKE_VERSION): 20131010 * lose the const from arg to systcl to avoid problems on older BSDs. 2013-10-01 Simon J. Gerraty * Makefile (MAKE_VERSION): 20131001 Merge with NetBSD make, pick up o main.c: for NATIVE build sysctl to get MACHINE_ARCH from hw.machine_arch if necessary. o meta.c: meta_oodate - need to look at src of Link and target of Move as well. * main.c: check that CTL_HW and HW_MACHINE_ARCH exist. provide __arraycount() if needed. 2013-09-04 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130904 Merge with NetBSD make, pick up o Add VAR_INTERNAL context, so that internal setting of MAKEFILE does not override value set by makefiles. 2013-09-02 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130902 Merge with NetBSD make, pick up o CompatRunCommand: only apply shellErrFlag when errCheck is true 2013-08-28 Simon J. 
Gerraty * Makefile (MAKE_VERSION): 20130828 Merge with NetBSD make, pick up o Fix VAR :sh = syntax from Will Andrews at freebsd.org o Call Job_SetPrefix() from Job_Init() so makefiles have opportunity to set .MAKE.JOB.PREFIX 2013-07-30 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130730 Merge with NetBSD make, pick up o Allow suppression of --- job -- tokens by setting .MAKE.JOB.PREFIX empty. 2013-07-16 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130716 Merge with NetBSD make, pick up o number of gmake compatibility tweaks -w for gmake style entering/leaving messages if .MAKE.LEVEL > 0 indicate it in progname "make[1]" etc. handle MAKEFLAGS containing only letters. o when overriding a GLOBAL variable on the command line, delete it from GLOBAL context so -V doesn't show the wrong value. 2013-07-06 Simon J. Gerraty * configure.in: We don't need MAKE_LEVEL_SAFE anymore. * Makefile (MAKE_VERSION): 20130706 Merge with NetBSD make, pick up o Shell_Init(): export shellErrFlag if commandShell hasErrCtl is true so that CompatRunCommand() can use it, to ensure consistent behavior with jobs mode. o use MAKE_LEVEL_ENV to define the variable to propagate .MAKE.LEVEL - currently set to MAKELEVEL (same as gmake). o meta.c: use .MAKE.META.IGNORE_PATHS to allow customization of paths to ignore. 2013-06-04 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130604 Merge with NetBSD make, pick up o job.c: JobCreatePipe: do fcntl() after any tweaking of fd's to avoid leaking descriptors. 2013-05-28 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130528 Merge with NetBSD make, pick up o var.c: cleanup some left-overs in VarHash() 2013-05-20 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130520 generate manifest from component FILES rather than have to update FILES when mk/FILES changes. 2013-05-18 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130518 Merge with NetBSD make, pick up o suff.c: don't skip all processing for .PHONY targets else wildcard srcs do not get expanded. o var.c: expand name of variable to delete if necessary. 2013-03-30 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130330 Merge with NetBSD make, pick up o meta.c: refine the handling of .OODATE in commands. Rather than suppress command comparison for the entire script as though .NOMETA_CMP had been used, only suppress it for the one command line. This allows something like ${.OODATE:M.NOMETA_CMP} to be used to suppress comparison of a command without otherwise affecting it. o make.1: document that 2013-03-22 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130321 yes, not quite right but it's a cooler number. Merge with NetBSD make, pick up o parse.c: fix ParseGmakeExport to be portable and add a unit-test. * meta.c: call meta_init() before makefiles are read and if built with filemon support set .MAKE.PATH_FILEMON to _PATH_FILEMON this lets makefiles test for support. Call meta_mode_init() to process .MAKE.MODE. 2013-03-13 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130305 Merge with NetBSD make, pick up o run .STALE: target when a dependency from .depend is missing. o job.c: add Job_RunTarget() for the above and .BEGIN 2013-03-03 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130303 Merge with NetBSD make, pick up o main.c: set .MAKE.OS to utsname.sysname o job.c: more checks for read and poll errors o var.c: lose VarChangeCase() saves 4% time 2013-03-02 Simon J. Gerraty * boot-strap: remove MAKEOBJDIRPREFIX from environment since we want to use MAKEOBJDIR 2013-01-27 Simon J.
Gerraty * Merge with NetBSD make, pick up o make.1: more info on how shell commands are handled. o job.c,main.c: detect write errors to job pipes. 2013-01-25 Simon J. Gerraty * Makefile (MAKE_VERSION): 20130123 Merge with NetBSD make, pick up o meta.c: if script uses .OODATE and meta_oodate() decides rebuild is needed, .OODATE will be empty - set it to .ALLSRC. o var.c: in debug output indicate which variabale modifiers apply to. o remove Check_Cwd logic the makefiles have been fixed. 2012-12-12 Simon J. Gerraty * makefile.in: add a simple makefile for folk who insist on ./configure; make; make install it just runs boot-strap * include mk/* to accommodate the above * boot-strap: re-work to accommodate the above mksrc defaults to $Mydir/mk allow op={configure,build,install,clean,all} add options to facilitate install * Makefile.config.in: just the bits set by configure * Makefile: bump version to 20121212 abandon Makefile.in (NetBSD Makefile) leverage mk/* instead * configure.in: ensure srcdir is absolute 2012-11-11 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20121111 fix generation of bmake.cat1 2012-11-09 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20121109 Merge with NetBSD make, pick up o make.c: MakeBuildChild: return 0 so search continues if a .ORDER dependency is detected. o unit-tests/order: test the above 2012-11-02 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20121102 Merge with NetBSD make, pick up o cond.c: allow cond_state[] to grow. In meta mode with a very large tree, we can hit the limit while processing dirdeps. 2012-10-25 Simon J. Gerraty * Makefile.in: we need to use ${srcdir} not ${.CURDIR} 2012-10-10 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20121010 o protect syntax that only bmake parses correctly. o remove auto setting of FORCE_MACHINE, use configure's --with-force-machine=whatever if that is desired. 2012-10-08 Simon J. Gerraty * Makefile.in: do not lose history from make.1 when generating bmake.1 2012-10-07 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20121007 Merge with NetBSD make, pick up o compat.c: ignore empty commands - same as jobs mode. o make.1: document meta chars that cause use of shell 2012-09-11 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120911 * bsd.after-import.mk: include Makefile.inc early and allow it to override PROG 2012-08-31 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120831 Merge with NetBSD make, pick up o cast sizeof() to int for comparison o minor make.1 tweak 2012-08-30 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120830 Merge with NetBSD make, pick up o .MAKE.EXPAND_VARIABLES knob can control default behavior of -V o debug flag -dV causes -V to show raw value regardless. 2012-07-05 Simon J. Gerraty * bsd.after-import.mk (after-import): ensure unit-tests/Makefile gets SRCTOP set. 2012-07-04 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120704 Merge with NetBSD make, pick up o Job_ParseShell should call Shell_Init if it has been previously called. * Makefile.in: set USE_META based on configure result. also .PARSEDIR is safer indicator of bmake. 2012-06-26 Simon J. Gerraty * Makefile.in: bump version to 20120626 ensure CPPFLAGS is in CFLAGS * meta.c: avoid nested externs * bsd.after-import.mk: avoid ${.CURDIR}/Makefile as target 2012-06-20 Simon J. 
Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120620 Merge with NetBSD make, pick up o make_malloc.c: avoid including make_malloc.h again * Makefile.in: avoid bmake only syntax or protect with .if defined(.MAKE.LEVEL) * bsd.after-import.mk: replace .-include with .sinclude ensure SRCTOP gets a value * configure.in: look for filemon.h in /usr/include/dev/filemon first. 2012-06-19 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120612 Merge with NetBSD make, pick up o use MAKE_ATTR_* rather than those defined by cdefs.h or compiler for greater portability. o unit-tests/forloop: check that .for works as expected wrt number of times and with "quoted strings". 2012-06-06 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120606 Merge with NetBSD make, pick up o compat.c: use kill(2) rather than raise(3). * configure.in: look for sys/dev/filemon * bsd.after-import.mk: add a .-include "Makefile.inc" to Makefile and pass BOOTSTRAP_XTRAS to boot-strap. 2012-06-04 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120604 Merge with NetBSD make, pick up o util.c and var.c share same var for tracking if environ has been reallocated. o util.c provide getenv with setenv. * Add MAKE_LEVEL_SAFE as an alternate means of passing MAKE_LEVEL when the shell actively strips .MAKE.* from the environment. We still refer to the variable always as .MAKE.LEVEL * util.c fix bug in findenv() was finding prefix of name. * compat.c: re-raising SIGINT etc after running .INTERRUPT results in more reliable termination of all activity on many platforms. 2012-06-02 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120602 Merge with NetBSD make, pick up o for.c: handle quoted items in .for list 2012-05-30 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120530 Merge with NetBSD make, pick up o compat.c: ignore empty command. 2012-05-24 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120524 * FILES: add bsd.after-import.mk: A simple means of integrating bmake into a BSD build system. 2012-05-20 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120520 Merge with NetBSD make, pick up o increased limit for nested conditionals. 2012-05-18 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120518 Merge with NetBSD make, pick up o use _exit(2) in signal handler o Don't use the [dir] cache when building nodes that might have changed since the last exec. o Avoid nested extern declaration warnings. 2012-04-27 Simon J. Gerraty * meta.c (fgetLine): avoid %z - not portable. * parse.c: Since we moved include of sys/mman.h and def's of MAP_COPY etc. we got dups from a merge. 2012-04-24 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120420 Merge with NetBSD make, pick up o restore duplicate suppression in .MAKE.MAKEFILES runtime saving can be significant. o Var_Subst() uses Buf_DestroyCompact() to reduce memory consumption up to 20%. 2012-04-20 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120420 Merge with NetBSD make, pick up o remove duplicate suppression in .MAKE.MAKEFILES o improved dir cache behavior o gmake'ish export command 2012-03-25 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20120325 Merge with NetBSD make, pick up o fix parsing of :[#] in conditionals. 2012-02-10 Simon J. Gerraty * Makefile.in: replace use of .Nx in bmake.1 with NetBSD since some systems cannot cope with .Nx 2011-11-14 Simon J.
Gerraty * Makefile.in (MAKE_VERSION): bump version to 20111111 Merge with NetBSD make, pick up o debug output for .PARSEDIR and .PARSEFILE 2011-10-10 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20111010 2011-10-09 Simon J. Gerraty * boot-strap: check for an expected file in the dirs we look for. * make-bootstrap.sh: pass on LDSTATIC 2011-10-01 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20111001 Merge with NetBSD make, pick up o ensure .PREFIX is set for .PHONY and .TARGET set for .PHONY run via .END o __dead used consistently 2011-09-10 Simon J. Gerraty * Makefile.in (MAKE_VERSION): 20110909 is a better number ;-) 2011-09-05 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110905 Merge with NetBSD make, pick up o meta_oodate: ignore makeDependfile 2011-08-28 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110828 Merge with NetBSD make, pick up o silent=yes in .MAKE.MODE causes meta mode to mark targets as SILENT if a .meta file is created 2011-08-18 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110818 Merge with NetBSD make, pick up o in meta mode, if target flagged .META a missing .meta file means target is out-of-date o fixes for gcc 4.5 warnings o simplify job printing code 2011-08-09 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110808 Merge with NetBSD make, pick up o do not touch OP_SPECIAL targets when doing make -t 2011-06-22 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110622 Merge with NetBSD make, pick up o meta_oodate detect corrupted .meta file and declare oodate. * configure.in: add check for setsid 2011-06-07 Simon J. Gerraty * Merge with NetBSD make, pick up o unit-tests/modts now works on MirBSD 2011-06-04 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110606 Merge with NetBSD make, pick up o ApplyModifiers: when we parse a variable which is not the entire modifier string, or not followed by ':', do not consider it as containing modifiers. o loadfile: ensure newline at end of mapped file. 2011-05-05 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110505 Merge with NetBSD make, pick up o .MAKE.META.BAILIWICK - list of prefixes which define the scope of make's control. In meta mode, any generated file within said bailiwick, which is found to be missing, causes current target to be out-of-date. 2011-04-11 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110411 Merge with NetBSD make, pick up o when long modifiers fail to match, check sysV style. - add a test case 2011-04-10 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110410 Merge with NetBSD make, pick up o :hash - cheap 32bit hash of value o :localtime, :gmtime - use value as format string for strftime. 2011-03-30 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110330 mostly because its a cooler version. Merge with NetBSD make, pick up o NetBSD tags for meta.[ch] o job.c call meta_job_finish() after meta_job_error(). o meta_job_error() should call meta_job_finish() to ensure .meta file is closed, and safe to copy - if .ERROR target wants. meta_job_finish() is safe to call repeatedly. 2011-03-29 Simon J. Gerraty * unit-tests/modts: use printf if it is a builtin, to save us from MirBSD * Makefile.in (MAKE_VERSION): bump version to 20110329 Merge with NetBSD make, pick up o fix for use after free() in CondDoExists(). o meta_oodate() report extra commands and return earlier. 2011-03-27 Simon J. 
Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110327 Merge with NetBSD make, pick up o meta.c, if .MAKE.MODE contains curdirOk=yes allow creating .meta files in .CURDIR * boot-strap (TOOL_DIFF): apparently at least one Linux distro formats the output of 'type' differently - so eat any "()" 2011-03-06 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110306 Merge with NetBSD make, pick up o meta.c, only do getcwd() once 2011-03-05 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110305 Merge with NetBSD make, pick up o correct sysV substitution handling of empty lhs and variable o correct exists() check for dir with trailing / o correct handling of modifiers for non-existent variables during evaluation of conditionals. o ensure MAP_FILE is defined. o meta.c use curdir[] now exported by main.c 2011-02-25 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110225 Merge with NetBSD make, pick up o fix for incorrect .PARSEDIR when .OBJDIR is re-computed after makefiles have been read. o fix example of :? modifier in man page. 2011-02-13 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110214 Merge with NetBSD make, pick up o meta.c handle realpath() failing when generating meta file name. * sigcompat.c: convert to ansi so we can use higher warning levels. 2011-02-07 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110207 Merge with NetBSD make, pick up o fix for bug in meta mode. 2011-01-03 Simon J. Gerraty * parse.c: SunOS 5.8 at least does not have MAP_FILE 2011-01-01 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20110101 Merge with NetBSD make, pick up o use mmap(2) if available, for reading makefiles 2010-12-15 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20101215 Merge with NetBSD make, pick up o ensure meta_job_error() does not report a previous .meta file as being culprit. 2010-12-10 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20101210 Merge with NetBSD make, pick up o meta_oodate: track cwd per process, and only consider target out-of-date if missing file is outside make's CWD. Ignore files in /tmp/ etc. o to ensure unit-tests results match, need to control LC_ALL as well as LANG. o fix for parsing bug in var.c 2010-11-26 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20101126 Merge with NetBSD make, pick up o if stale dependency is an IMPSRC, search via .PATH o meta_oodate: if a referenced file is missing, target is out-of-date. o meta_oodate: if a target uses .OODATE in its commands, it (.OODATE) needs to be recomputed. o keep a pointer to youngest child node, rather than just its mtime. 2010-11-02 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20101101 2010-10-16 Simon J. Gerraty * machine.sh: like os.sh, allow for uname -p producing useless drivel 2010-09-13 Simon J. Gerraty * boot-strap: document configure knobs for meta and filemon. * Makefile.in (MAKE_VERSION): bump version to 20100911 Merge with NetBSD make, pick up o meta.c - meta mode * make-bootstrap.sh.in: handle meta.c * configure.in: add knobs for use_meta and filemon_h also, look for dirname, str[e]sep and strlcpy * util.c: add simple err[x] and warn[x] 2010-08-08 Simon J. Gerraty * boot-strap (TOOL_DIFF): set this to ensure tests use the same version of diff that configure tested * Makefile.in (MAKE_VERSION): bump version to 20100808 Merge with NetBSD make, pick up o in jobs mode, when we discover we cannot make something, call PrintOnError before exit.
2010-08-06 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100806 Merge with NetBSD make, pick up o formatting fixes for ignored errors o ensure jobs are cleaned up regardless of where wait() was called. 2010-06-28 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100618 * os.sh (MACHINE_ARCH): watch out for drivel from uname -p 2010-06-16 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100616 Merge with NetBSD make, pick up o man page update o call PrintOnError from JobFinish when we detect an error we are not ignoring. 2010-06-06 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100606 Merge with NetBSD make, pick up o man page update 2010-06-05 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100605 Merge with NetBSD make, pick up o use bmake_signal() which is a wrapper around sigaction() in place of signal() o add .export-env to allow exporting variables to environment without tracking (so no re-export when the internal value is changed). 2010-05-24 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100524 Merge with NetBSD make, pick up o fix for .info et al being greedy. 2010-05-23 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100520 Merge with NetBSD make, pick up o back to using realpath on argv[0] but only if contains '/' and does not start with '/'. 2010-05-10 Simon J. Gerraty * boot-strap: use absolute path for bmake when running tests. * Makefile.in (MAKE_VERSION): bump version to 20100510 Merge with NetBSD make, pick up o revert use of realpath on argv[0] too many corner cases. o print MAKE_PRINT_VAR_ON_ERROR before running .ERROR target. 2010-05-05 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100505 Merge with NetBSD make, pick up o fix for missed SIGCHLD when compiled with SunPRO actually for bmake, defining FORCE_POSIX_SIGNALS would have done the job. 2010-04-30 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100430 Merge with NetBSD make, pick up o fflush stdout before writing to stdout 2010-04-23 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100423 Merge with NetBSD make, pick up o updated unit tests for Haiku (this time for sure). * boot-strap: based on patch from joerg honor --with-default-sys-path better. * boot-strap: remove mention of --with-prefix-sys-path 2010-04-22 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100422 * Merge with NetBSD make, pick up o fix for vfork() on Darwin. o fix for bogus $TMPDIR. o set .MAKE.MODE=compat for -B o set .MAKE.JOBS=max_jobs for -j max_jobs o allow unit-tests to run without any *.mk o unit-tests/modmisc be more conservative in dirs presumed to exist. * boot-strap: ignore /usr/share/mk except on NetBSD. * unit-tests/Makefile.in: set LANG=C when running unit-tests to ensure sort(1) behaves as expected. 2010-04-21 Simon J. Gerraty * boot-strap: add FindHereOrAbove so we can use -m .../mk 2010-04-20 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100420 * Merge with NetBSD make, pick up o fix for variable realpath() behavior. we have to stat(2) the result to be sure. o fix for .export (all) when nested vars use :sh 2010-04-14 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100414 * Merge with NetBSD make, pick up o use realpath to resolve argv[0] (for .MAKE) if needed. o add realpath from libc. o add :tA to resolve variable via realpath(3) if possible. 2010-04-08 Simon J. 
Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100408 * Merge with NetBSD make, pick up o unit tests for .ERROR, .error o fix for .ERROR to ensure it cannot be default target. 2010-04-06 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100406 * Merge with NetBSD make, pick up o fix for compat mode "Error code" going to debug_file. o fix for .ALLSRC being populated twice. o support for .info, .warning and .error directives o .MAKE.MODE to control make's operational mode o .MAKE.MAKEFILE_PREFERENCE to control the preferred makefile name(s). o .MAKE.DEPENDFILE to control the name of the depend file o .ERROR target - run on failure. 2010-03-18 Simon J. Gerraty * make-bootstrap.sh.in: extract MAKE_VERSION from Makefile * os.sh,arch.c: patch for Haiku from joerg at netbsd 2010-03-17 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100222 * Merge with NetBSD make, pick up o better error msg for .for with multiple inter vars * boot-strap: o use make-bootstrap.sh from joerg at netbsd to avoid the need for a native make when bootstrapping. o add "" everywhere ;-) o if /usr/share/tmac/andoc.tmac exists install nroff bmake.1 otherwise the pre-formatted version. 2010-01-04 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20100102 * Merge with NetBSD make, pick up: o fix for -m .../ 2009-11-18 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20091118 * Merge with NetBSD make, pick up: o .unexport o report lines that start with '.' and should have ':' (catch typos of .el*if). 2009-10-30 Simon J. Gerraty * configure.in: Ensure that srcdir and mksrc are absolute paths. 2009-10-09 Simon J. Gerraty * Makefile.in (MAKE_VERSION): fix version to 20091007 2009-10-07 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 200910007 * Merge with NetBSD make, pick up: o fix for parsing of :S;...;...; applied to .for loop iterator appearing in a dependency line. 2009-09-09 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20090909 * Merge with NetBSD make, pick up: o fix for -C, .CURDIR and .OBJDIR * boot-strap: o allow share_dir to be set independent of prefix. o select default share_dir better when prefix ends in $HOST_TARGET o if FORCE_BSD_MK etc were set, include them in the suggested install-mk command. 2009-09-08 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20090908 * Merge with NetBSD make, pick up: o .MAKE.LEVEL for recursion tracking o fix for :M scanning \: 2009-09-03 Simon J. Gerraty * configure.in: Don't -D__EXTENSIONS__ if AC_USE_SYSTEM_EXTENSIONS says "no". 2009-08-26 Simon J. Gerraty * Makefile.in (MAKE_VERSION): bump version to 20090826 Simplify MAKE_VERSION to just the bare date. * Merge with NetBSD make, pick up: o -C directory support. o support for SIGINFO o use $TMPDIR for temp files. o child of vfork should be careful about modifying parent's state. 2009-03-26 Simon J. Gerraty * Apply some patches for MiNT from David Brownlee 2009-02-26 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20090222 * Merge with NetBSD make, pick up: o Possible null pointer de-ref in Var_Set. 2009-02-08 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20090204 * Merge with NetBSD make, pick up: o bmake_malloc et al moved to their own .c o Count both () and {} when looking for the end of a :M pattern o Change 'Buffer' so that it is the actual struct, not a pointer to it. o strlist.c - functions for processing extendable arrays of pointers to strings.
o ClientData replaced with void *, so const void * can be used. o New debug flag C for DEBUG_CWD 2008-11-11 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20081111 Apply patch from Joerg Sonnenberge to configure.in: o remove some redundant checks o check for emlloc etc only in libutil and require the whole family. util.c: o remove [v]asprintf which is no longer used. 2008-11-04 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20081101 * Merge with NetBSD make, pick up: o util.c: avoid use of putenv() - christos 2008-10-30 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20081030 pick up man page tweaks. 2008-10-29 Simon J. Gerraty * Makefile.in: move processing of LIBOBJS to after is definition! thus we'll have getenv.c in SRCS only if needed. * make.1: add examples of how to use :? * Makefile.in (BMAKE_VERSION): bump version to 20081029 * Merge with NetBSD make, pick up: o fix for .END processing with -j o segfault from Parse_Error when no makefile is open o handle numeric expressions in any variable expansion o debug output now defaults to stderr, -dF to change it - apb o make now uses bmake_malloc etc so that it can build natively on A/UX - wasn't an issue for bmake, but we want to keep in sync. 2008-09-27 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20080808 * Merge with NetBSD make, pick up: o fix for PR/38840: Pierre Pronchery: make crashes while parsing long lines in Makefiles o optimizations for VarQuote by joerg o fix for PR/38756: dominik: make dumps core on invalid makefile 2008-05-15 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20080515 * Merge with NetBSD make, pick up: o fix skip setting vars in VAR_GLOBAL context, to handle cases where VAR_CMD is used for other than command line vars. 2008-05-14 Simon J. Gerraty * boot-strap (make_version): we may need to look in $prefix/share/mk for sys.mk * Makefile.in (BMAKE_VERSION): bump version to 20080514 * Merge with NetBSD make, pick up: o skip setting vars in VAR_GLOBAL context, when already set in VAR_CMD which takes precedence. 2008-03-30 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20080330 * Merge with NetBSD make, pick up: o fix for ?= when LHS contains variable reference. 2008-02-15 Simon J. Gerraty * merge some patches from NetBSD pkgsrc. * makefile.boot.in (BOOTSTRAP_SYS_PATH): Allow better control of the MAKSYSPATH used during bootstrap. * Makefile.in (BMAKE_VERSION): bump version to 20080215 * Merge with NetBSD make, pick up: o warn if non-space chars follow 'empty' in a conditional. 2008-01-18 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20080118 * Merge with NetBSD make, pick up: o consider dependencies read from .depend as optional - dsl o remember when buffer for reading makefile grows - dsl o add -dl (aka LOUD) - David O'Brien 2007-10-22 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20071022 * Merge with NetBSD make, pick up: o Allow .PATH to be used for .include "" * boot-strap: source default settings from .bmake-boot-strap.rc 2007-10-16 Simon J. Gerraty * Makefile.in: fix maninstall on various systems provided that our man.mk is used. For non-BSD systems we install the preformatted page into $MANDIR/cat1 2007-10-15 Simon J. Gerraty * boot-strap: make bmake.1 too, so maninstall works. 2007-10-14 Simon J. 
Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20071014 * Merge with NetBSD make, pick up: o revamped handling of defshell - configure no longer needs to know the content of the shells array - apb o stop Var_Subst modifying its input - apb o avoid calling ParseTrackInput too often - dsl 2007-10-11 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20071011 * Merge with NetBSD make, pick up: o fix Shell_Init for case that _BASENAME_DEFSHELL is absolute path. * sigcompat.c: some tweaks for HP-UX 11.x based on patch from Tobias Nygren * configure.in: update handling of --with-defshell to match new make behavior. --with-defshell=/usr/xpg4/bin/sh will now do what one might hope - provided the chosen shell behaves enough like sh. 2007-10-08 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20071008 * Merge with NetBSD make, pick up: o .MAKE.JOB.PREFIX - control the token output before jobs - sjg o .export/.MAKE.EXPORTED - export of variables - sjg o .MAKE.MAKEFILES - track all makefiles read - sjg o performance improvements - dsl o revamp parallel job scheduling - dsl 2006-07-28 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060728 * Merge with NetBSD make, pick up: o extra debug info during variable and cond processing - sjg o shell definition now covers newline - rillig o minor mem leak in PrintOnError - sjg 2006-05-11 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060511 * Merge with NetBSD make, pick up: o more memory leaks - coverity o possible overflow in ArchFindMember - coverity o extract variable modifier code out of Var_Parse() so it can be called recursively - sjg o unit-tests/moderrs - sjg 2006-04-12 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060412 * Merge with NetBSD make, pick up: o fixes for some memory leaks - coverity o only read first sys.mk etc when searching sysIncPath - sjg * main.c (ReadMakefile): remove hack for __INTERIX that prevented setting ${MAKEFILE} - OBATA Akio 2006-03-18 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060318 * Merge with NetBSD make, pick up: o cleanup of job.c to remove remote handling, distcc is more useful and this code was likely bit-rotting - dsl o fix for :P modifier - sjg * boot-strap: set default prefix to something reasonable (for me anyway). 2006-03-01 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060301 * Merge with NetBSD make, pick up: o make .WAIT apply recursively, document and test case - apb o allow variable modifiers in a variable appear anywhere in modifier list, document and test case - sjg 2006-02-22 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20060222 * Merge with NetBSD make, pick up: o improved job token handling - dsl o SIG_DFL the correct signal before exec - dsl o more debug info during parsing - dsl o allow variable modifiers to be specified via variable - sjg * boot-strap: explain why we died if no mksrc 2005-11-05 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20051105 * configure.in: always set default_sys_path default is ${prefix}/share/mk - remove prefix_sys_path, anyone wanting more than above needs to set it manually. 2005-11-04 Simon J. Gerraty * boot-strap: make this a bit easier for pkgsrc folk. bootstrap still fails on IRIX64 since MACHINE_ARCH gets set to 'mips' while pkgsrc wants 'mipseb' or 'mipsel' 2005-11-02 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20051102 * job.c (JobFinish): fix likely ancient merge lossage fix from Todd Vierling. 
* boot-strap (srcdir): allow setting mksrc=none 2005-10-31 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20051031 * ranlib.h: skip on OSF too. (NetBSD PR 31864) 2005-10-10 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20051002 fix a silly typo 2005-10-09 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20051001 support for UnixWare and some other systems, based on patches from pkgsrc/bootstrap 2005-09-03 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20050901 * Merge with NetBSD make, pick up: o possible parse error causing us to wander off. 2005-06-06 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20050606 * Merge with NetBSD make, pick up: o :0x modifier for randomizing a list o fixes for a number of -Wuninitialized issues. 2005-05-30 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20050530 * Merge with NetBSD make, pick up: o Handle dependencies for .BEGIN, .END and .INTERRUPT * README: was seriously out of date. 2005-03-22 Simon J. Gerraty * Important to use .MAKE rather than MAKE. 2005-03-15 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20050315 * Merge with NetBSD make, pick up: o don't mistake .elsefoo for .else o use suffix-specific search path correctly o bunch of style nits 2004-05-11 Simon J. Gerraty * boot-strap: o ensure that args to --src and --with-mksrc are resolved before giving them to configure. o add -o "objdir" so that builder can control it, default is $OS as determined by os.sh o add -q to suppress all the install instructions. 2004-05-08 Simon J. Gerraty * Remove __IDSTRING() * Makefile.in (BMAKE_VERSION): bump to 20040508 * Merge with NetBSD make, pick up: o posix fixes - remove '-e' from compat mode - add support for '+' command-line prefix. o fix for handling '--' on command-line. o fix include in lst.lib/lstInt.h to simplify '-I's o we also picked up replacement of MAKE_BOOTSTRAP with !MAKE_NATIVE which is a noop, but possibly confusing. 2004-04-14 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20040414 * Merge with NetBSD make, pick up: o allow quoted strings on lhs of conditionals o issue warning when extra .else is seen o print line numer when errors encountered during parsing from string. 2004-02-20 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20040220 * Merge with NetBSD make, pick up: o fix for old :M parsing bug. o re-jigged unit-tests 2004-02-15 Simon J. Gerraty * Makefile.in (accept test): use ${.MAKE:S,^./,${.CURDIR}/,} so that './bmake -f Makefile test' works. 2004-02-14 Simon J. Gerraty * Makefile.in: (BMAKE_VERSION): bump to 20040214 * Merge with NetBSD make, pick up: o search upwards for *.mk o fix for double free of var substitution buffers o use of getopt replaced with custom code, since the usage (re-scanning) isn't posix compatible. 2004-02-12 Simon J. Gerraty * arch.c: don't include ranlib.h on ELF systems (thanks to Chuck Cranor ). 2004-01-18 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump to 20040118 * boot-strap (while): export vars we assign to on cmdline * unit-test/Makefile.in: ternary is .PHONY 2004-01-08 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20040108 * Merge with NetBSD make, pick up: o fix for ternary modifier 2004-01-06 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20040105 * Merge with NetBSD make, pick up: o fix for cond.c to handle compound expressions better o variable expansion within sysV style replacements 2003-12-22 Simon J. 
Gerraty * Make portable snprintf safer - output to /dev/null first to check space needed. * Makefile.in (BMAKE_VERSION): bump version to 20031222 * Merge with NetBSD make, pick up: o -dg3 to show input graph when things go wrong. o explicitly look for makefiles in objdir if not found in curdir so that errors in .depend etc will be reported accurarely. o avoid use of -e in shell scripts in jobs mode, use '|| exit $?' instead as it more accurately reflects the expected behavior and is more consistently implemented. o avoid use of asprintf. 2003-09-28 Simon J. Gerraty * util.c: Add asprintf and vasprintf. * Makefile.in (BMAKE_VERSION): bump version to 20030928 * Merge with NetBSD make, pick up: :[] modifier - allows picking words from a variable. :tW modifier - allows treating value as one big word. W flag for :C and :S - allows treating value as one big word. 2003-09-12 Simon J. Gerraty * Merge with NetBSD make pick up -de flag to enable printing failed command. don't skip 1st two dir entries (normally . and ..) since coda does not have them. 2003-09-09 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20030909 * Merge with NetBSD make, pick up: - changes for -V '${VAR}' to print fully expanded value cf. -V VAR - CompatRunCommand now prints the command that failed. - several files got updated 3 clause Berkeley license. 2003-08-02 Simon J. Gerraty * boot-strap: Allow setting configure args on command line. 2003-07-31 Simon J. Gerraty * configure.in: add --with-defshell to allow sh or ksh to be selected as default shell. * Makefile.in: bump version to 20030731 * Merge with NetBSD make Pick up .SHELL spec for ksh and associate man page changes. Also compat mode now uses the same shell specs. 2003-07-29 Simon J. Gerraty * var.c (Var_Parse): ensure delim is initialized. * unit-tests/Makefile.in: use single quotes to avoid problems from some shells. * makefile.boot.in: Run the unit-tests as part of the bootstrap procedure. 2003-07-28 Simon J. Gerraty * unit-tests/Makefile.in: always force complaints from ${TEST_MAKE} to be from 'make'. * configure.in: add check for 'diff -u' also fix some old autoconf'isms * Makefile.in (BMAKE_VERSION): bump version to 20030728. if using GCC add -Wno-cast-qual to CFLAGS for var.o * Merge with NetBSD make Pick up fix for :ts parsing error in some cases. Pick unit-tests. 2003-07-23 Simon J. Gerraty * Makefile.in (BMAKE_VERSION): bump version to 20030723. * var.c (Var_Parse): fix bug in :ts modifier, after const correctness fixes, must pass nstr to VarModify. 2003-07-14 Simon J. Gerraty * Makefile.in: BMAKE_VERSION switch to a date based version. We'll generally use the date of last import from NetBSD. * Merge with NetBSD make Pick up fixes for const-correctness, now passes WARNS=3 on NetBSD. Pick up :ts modifier, allows controlling the separator used between words in variable expansion. 2003-07-11 Simon J. Gerraty * FILES: include boot-strap and os.sh * Makefile.in: only set WARNS if we are NetBSD, the effect on FreeBSD is known to be bad. * makefile.boot.in (bootstrap): make this the default target. * Makefile.in: bump version to 3.1.19 * machine.sh: avoid A-Z with tr as it is bound to lose. 2003-07-10 Simon J. Gerraty * Merge with NetBSD make Pick up fix for PR/19781 - unhelpful error msg on unclosed ${var:foo Plus some doc fixes. 2003-04-27 Simon J. Gerraty * Merge with NetBSD make Pick up fix for PR/1523 - don't count a library as built, if there is no way to build it * Bump version to 3.1.18 2003-03-23 Simon J. 
Gerraty * Merge with NetBSD make Pick up fix for ParseDoSpecialSrc - we only use it if .WAIT appears in src list. 2003-03-21 Simon J. Gerraty * Merge with NetBSD make (mmm 10th anniversary!) pick up fix for .WAIT in srcs that refer to $@ or $* (PR#20828) pick up -X which tells us to not export VAR=val via setenv if we are already doing so via MAKEFLAGS. This saves valuable env space on systems like Darwin. set MAKE_VERSION to 3.1.17 * parse.c: pix up fix for suffix rules 2003-03-06 Simon J. Gerraty * Merge with NetBSD make. pick up fix for propagating -B via MAKEFLAGS. set MAKE_VERSION to 3.1.16 * Apply some patches from pkgsrc-bootstrap/bmake Originally by Grant Beattie I may have missed some - since they are based on bmake-3.1.12 2002-12-03 Simon J. Gerraty * makefile.boot.in (bmake): update install targets for those that use them, also clear MAKEFLAGS when invoking bmake.boot to avoid havoc from gmake -w. Thanks to Harlan Stenn . * bmake.cat1: update the pre-formatted man page! 2002-11-30 Simon J. Gerraty * Merge with NetBSD make. pick up fix for premature free of pointer used in call to Dir_InitCur(). set MAKE_VERSION to 3.1.15 2002-11-26 Simon J. Gerraty * configure.in: determine suitable value for MKSRC. override using --with-mksrc=PATH. * machine.sh: use `uname -p` for MACHINE_ARCH on modern SunOS systems. configs(8) will use 'sun4' as an alias for 'sparc'. 2002-11-25 Simon J. Gerraty * Merge with NetBSD make. pick up ${.PATH} pick up fix for finding ../cat.c via .PATH when .CURDIR=.. set MAKE_VERSION to 3.1.14 add configure checks for killpg and sys/socket.h 2002-09-16 Simon J. Gerraty * tag bmake-3-1-13 * makefile.boot.in (bmake): use install-mk Also setup ./mk before trying to invoke bmake.boot incase we needed install-mk to create a sys.mk for us. * configure.in: If we need to add -I${srcdir}/missing, make it an absolute path so that it works for lst.lib too. * make.h: always include sys/cdefs.h since we provide one if the host does not. * Makefile.in (install-mk): use MKSRC/install-mk which will do the right thing. use uname -p for ARCH if possible. since install-mk will setup links bsd.prog.mk -> prog.mk if needed, just .include bsd.prog.mk * Merge with NetBSD make (NetBSD-1.6) Code is ansi-C only now. Bug in handling of dotLast is fixed. Can now assign .OBJDIR and make will reset its notions of life. New modifiers :tu :tl for toUpper and toLower. Tue Oct 16 12:18:42 2001 Simon J. Gerraty * Merge with NetBSD make pick up fix for .END failure in compat mode. pick up fix for extra va_end() in ParseVErrorInternal. Thu Oct 11 13:20:06 2001 Simon J. Gerraty * configure.in: for systems that have sys/cdefs.h check if it is compatible. If not, include the one under missing, but tell it to include the native one too - necessary on Linux. * missing/sys/cdefs.h: if NEED_HOST_CDEFS_H is defined, use include_next (for gcc) to get the native sys/cdefs.h Tue Aug 21 02:29:34 2001 Simon J. Gerraty * job.c (JobFinish): Fix an earlier merge bug that resulted in leaking descriptors when using -jN. * job.c (JobPrintCommand): See if "curdir" exists before attempting to chdir(). Doing the chdir directly in make (when in compat mode) fails silently, so let the -jN version do the same. This can happen when building kernels in an object tree and playing clever games to reset .CURDIR. * Merged with NetBSD make pick up .USEBEFORE Tue Jun 26 23:45:11 2001 Simon J. Gerraty * makefile.boot.in: Give bmake.boot a MAKESYSPATH that might work. Tue Jun 12 16:48:57 2001 Simon J. 
Gerraty * var.c (Var_Set): Add 4th (flags) arg so VarLoopExpand can tell us not to export the iterator variable when using VAR_CMD context. Sun Jun 10 21:55:21 2001 Simon J. Gerraty * job.c (Job_CatchChildren): don't call Job_CatchOutput() here, it's the wrong "fix". Sat Jun 9 00:11:24 2001 Simon J. Gerraty * Redesigned export of VAR_CMD's via MAKEFLAGS. We now simply append the variable names to .MAKEOVERRIDES, and handle duplicate suppression and quoting in ExportMAKEFLAGS using: ${.MAKEOVERRIDES:O:u:@v@$v=${$v:Q}@} Apart from fixing quoting bugs in previous version, this allows us to export vars to the environment by simply doing: .MAKEOVERRIDES+= PATH Merged again with NetBSD make, but the above is the only change. * configure.in: added --disable-pwd-override disable $PWD overriding getcwd() --disable-check-make-chdir disable make trying to guess when it should automatically cd ${.CURDIR} * Merge with NetBSD make, changes include: parse.c (ParseDoDependency): Spot that the syntax error is caused by an unresolved cvs/rcs conflict and say so. var.c: most of Var* functions now take a ctxt as 1st arg. now does variable substitution on rhs of sysv style modifiers. * var.c (Var_Set): exporting of command line variables (VAR_CMD) is now done here. We append the name='value' to .MAKEOVERRIDES rather than directly into MAKEFLAGS as this allows a Makefile to use .MAKEOVERRIDES= to disable this behaviour. GNU make uses a very similar mechanism. Note that in adding name='value' to .MAKEOVERRIDES we do the moral equivalent of: .MAKEOVERRIDES:= ${.MAKEOVERRIDES:Nname=*} name='val' Fri Jun 1 14:08:02 2001 Simon J. Gerraty * make-conf.h (USE_IOVEC): make it conditional on HAVE_SYS_UIO_H * Merged with NetBSD make make -dx can now be used to run commands via sh -x better error messages on exec failures. Thu May 31 01:44:54 2001 Simon J. Gerraty * Makefile.in (main.o): depends on ${SRCS} ${MAKEFILE} so that MAKE_VERSION gets updated. Also don't use ?= for MAKE_VERSION, MACHINE etc otherwise they propagate from the previous bmake. * configure.in (machine): allow --with-machine=generic to make configure use machine.sh to set MACHINE. * job.c (JobInterrupt): convert to using WAIT_T and friends. * Makefile.in: mention in bmake.1 that we use autoconf. * make.1: mention MAKE_PRINT_VAR_ON_ERROR. Wed May 30 23:17:18 2001 Simon J. Gerraty * main.c (ReadMakefile): don't set MAKEFILE if reading ".depend" as that rather defeats the usefulness of ${MAKEFILE}. * main.c (MainParseArgs): append command line variable assignments to MAKEFLAGS so that they get propagated to child makes. Apparently this is required POSIX behaviour? It's useful anyway. Tue May 29 02:20:07 2001 Simon J. Gerraty * compat.c (CompatRunCommand): don't use perror() since stdio may cause problems in child of vfork(). * compat.c, main.c: Call PrintOnError() when we are going to bail. This routine prints out the .curdir where we stopped and will also display any vars listed in ${MAKE_PRINT_VAR_ON_ERROR}. * main.c: add ${.newline} to hold a "\n" - sometimes handy in :@ expansion. * var.c: VarLoopExpand: ignore addSpace if a \n is present. * Added RCSid's for the files we've touched. Thu May 24 15:41:37 2001 Simon J. Gerraty * configure.in: Thanks to some clues from mdb@juniper.net, added autoconf magic to control setting of MACHINE, MACHINE_ARCH as well as what ends up in _PATH_DEFSYSPATH.
We now have: --with-machine=MACHINE explicitly set MACHINE --with-force-machine=MACHINE set FORCE_MACHINE --with-machine_arch=MACHINE_ARCH explicitly set MACHINE_ARCH --with-default-sys-path=PATH:DIR:LIST use an explicit _PATH_DEFSYSPATH --with-prefix-sys-path=PATH:DIR:LIST prefix _PATH_PREFIX_SYSPATH --with-path-objdirprefix=PATH override _PATH_OBJDIRPREFIX If _PATH_OBJDIRPREFIX is set to "no" we won't define it. * makefile: added a pathetically simple makefile to drive bootstrapping. Running configure by hand is more useful. * Makefile.in: added MAKE_VERSION, and reworked things to be less dependent on NetBSD bsd.*.mk * pathnames.h: allow NO_PATH_OBJDIRPREFIX to stop us defining _PATH_OBJDIRPREFIX for those that don't want a default. construct _PATH_DEFSYSPATH from the info we get from configure. * main.c: allow for no _PATH_OBJDIRPREFIX, set ${MAKE_VERSION} if MAKE_VERSION is defined. * compat.c: when we bail, print out the .CURDIR we were in. Sat May 12 00:34:12 2001 Simon J. Gerraty * Merged with NetBSD make * var.c: fixed a bug in the handling of the modifier :P if the node was found but the path was null, we segfaulted trying to duplicate it. Mon Mar 5 16:20:33 2001 Simon J. Gerraty * Merged with NetBSD make * make.c: Make_OODate's test for a library out of date was using cmtime where it should have used mtime (my bug). * compat.c: Use perror() to tell us what really went wrong when we cannot exec a command. Fri Dec 15 10:11:08 2000 Simon J. Gerraty * Merged with NetBSD make Sat Jun 10 10:11:08 2000 Simon J. Gerraty * Merged with NetBSD make Thu Jun 1 10:11:08 2000 Simon J. Gerraty * Merged with NetBSD make Tue May 30 10:11:08 2000 Simon J. Gerraty * Merged with NetBSD make Thu Apr 27 00:07:47 2000 Simon J. Gerraty * util.c: don't provide signal() since we use sigcompat.c * Makefile.in: added a build target. * var.c (Var_Parse): added ODE modifiers :U, :D, :L, :P, :@ and :! These allow some quite clever magic. * main.c (main): added support for getenv(MAKESYSPATH). Mon Apr 2 16:25:13 2000 Simon J. Gerraty * Disable $PWD overriding getcwd() if MAKEOBJDIRPREFIX is set. This avoids objdir having a different value depending on how a directory was reached (via command line, or subdir.mk). * If FORCE_MACHINE is defined, ignore getenv("MACHINE"). Mon Apr 2 23:15:31 2000 Simon J. Gerraty * Do a chdir(${.CURDIR}) before invoking ${.MAKE} or ${.MAKE:T} if MAKEOBJDIRPREFIX is set and NOCHECKMAKECHDIR is not. I've been testing this in NetBSD's make for some weeks. * Turn Makefile into Makefile.in and make it useful. Tue Feb 29 22:08:00 2000 Simon J. Gerraty * Imported NetBSD's -current make(1) and resolve conflicts.
* Applied autoconf patches from bmake v2 * Imported clean code base from NetBSD-1.0 Index: projects/clang390-import/contrib/bmake/Makefile =================================================================== --- projects/clang390-import/contrib/bmake/Makefile (revision 305686) +++ projects/clang390-import/contrib/bmake/Makefile (revision 305687) @@ -1,222 +1,222 @@ -# $Id: Makefile,v 1.67 2016/06/07 00:46:12 sjg Exp $ +# $Id: Makefile,v 1.72 2016/08/18 23:02:26 sjg Exp $ # Base version on src date -_MAKE_VERSION= 20160606 +_MAKE_VERSION= 20160818 PROG= bmake SRCS= \ arch.c \ buf.c \ compat.c \ cond.c \ dir.c \ for.c \ hash.c \ job.c \ main.c \ make.c \ make_malloc.c \ meta.c \ metachar.c \ parse.c \ str.c \ strlist.c \ suff.c \ targ.c \ trace.c \ util.c \ var.c # from lst.lib/ SRCS+= \ lstAppend.c \ lstAtEnd.c \ lstAtFront.c \ lstClose.c \ lstConcat.c \ lstDatum.c \ lstDeQueue.c \ lstDestroy.c \ lstDupl.c \ lstEnQueue.c \ lstFind.c \ lstFindFrom.c \ lstFirst.c \ lstForEach.c \ lstForEachFrom.c \ lstInit.c \ lstInsert.c \ lstIsAtEnd.c \ lstIsEmpty.c \ lstLast.c \ lstMember.c \ lstNext.c \ lstOpen.c \ lstPrev.c \ lstRemove.c \ lstReplace.c \ lstSucc.c # this file gets generated by configure .-include "Makefile.config" .if !empty(LIBOBJS) SRCS+= ${LIBOBJS:T:.o=.c} .endif # just in case prefix?= /usr srcdir?= ${.CURDIR} DEFAULT_SYS_PATH?= ${prefix}/share/mk CPPFLAGS+= -DUSE_META CFLAGS+= ${CPPFLAGS} CFLAGS+= -D_PATH_DEFSYSPATH=\"${DEFAULT_SYS_PATH}\" CFLAGS+= -I. -I${srcdir} ${XDEFS} -DMAKE_NATIVE CFLAGS+= ${COPTS.${.ALLSRC:M*.c:T:u}} COPTS.main.c+= "-DMAKE_VERSION=\"${_MAKE_VERSION}\"" # meta mode can be useful even without filemon FILEMON_H ?= /usr/include/dev/filemon/filemon.h .if exists(${FILEMON_H}) && ${FILEMON_H:T} == "filemon.h" COPTS.meta.c += -DHAVE_FILEMON_H -I${FILEMON_H:H} .endif .PATH: ${srcdir} .PATH: ${srcdir}/lst.lib .if make(obj) || make(clean) SUBDIR+= unit-tests .endif # start-delete1 for bsd.after-import.mk # we skip a lot of this when building as part of FreeBSD etc. # list of OS's which are derived from BSD4.4 BSD44_LIST= NetBSD FreeBSD OpenBSD DragonFly MirBSD Bitrig # we are... OS!= uname -s # are we 4.4BSD ? isBSD44:=${BSD44_LIST:M${OS}} .if ${isBSD44} == "" MANTARGET= cat INSTALL?=${srcdir}/install-sh .if (${MACHINE} == "sun386") # even I don't have one of these anymore :-) CFLAGS+= -DPORTAR .elif (${MACHINE} != "sunos") SRCS+= sigcompat.c CFLAGS+= -DSIGNAL_FLAGS=SA_RESTART .endif .else MANTARGET?= man .endif # turn this on by default - ignored if we are root WITH_INSTALL_AS_USER= # suppress with -DWITHOUT_* OPTIONS_DEFAULT_YES+= \ AUTOCONF_MK \ INSTALL_MK \ PROG_LINK OPTIONS_DEFAULT_NO+= \ PROG_VERSION # process options now .include .if ${MK_PROG_VERSION} == "yes" PROG_NAME= ${PROG}-${_MAKE_VERSION} .if ${MK_PROG_LINK} == "yes" SYMLINKS+= ${PROG_NAME} ${BINDIR}/${PROG} .endif .endif EXTRACT_MAN=no # end-delete1 MAN= ${PROG}.1 MAN1= ${MAN} .if (${PROG} != "make") CLEANFILES+= my.history .if make(${MAN}) || !exists(${srcdir}/${MAN}) my.history: ${MAKEFILE} @(echo ".Nm"; \ echo "is derived from NetBSD"; \ echo ".Xr make 1 ."; \ echo "It uses autoconf to facilitate portability to other platforms."; \ echo ".Pp") > $@ .NOPATH: ${MAN} ${MAN}: make.1 my.history @echo making $@ @sed -e 's/^.Nx/NetBSD/' -e '/^.Nm/s/make/${PROG}/' \ -e '/^.Sh HISTORY/rmy.history' \ -e '/^.Sh HISTORY/,$$s,^.Nm,make,' ${srcdir}/make.1 > $@ all beforeinstall: ${MAN} _mfromdir=.
.endif .endif MANTARGET?= cat MANDEST?= ${MANDIR}/${MANTARGET}1 .if ${MANTARGET} == "cat" _mfromdir=${srcdir} .endif .include CPPFLAGS+= -DMAKE_NATIVE -DHAVE_CONFIG_H COPTS.var.c += -Wno-cast-qual COPTS.job.c += -Wno-format-nonliteral COPTS.parse.c += -Wno-format-nonliteral COPTS.var.c += -Wno-format-nonliteral # Force these SHAREDIR= ${SHAREDIR.bmake:U${prefix}/share} BINDIR= ${BINDIR.bmake:U${prefix}/bin} MANDIR= ${MANDIR.bmake:U${SHAREDIR}/man} .if !exists(.depend) ${OBJS}: config.h .endif # make sure that MAKE_VERSION gets updated. main.o: ${SRCS} ${MAKEFILE} # start-delete2 for bsd.after-import.mk .if ${MK_AUTOCONF_MK} == "yes" .include .endif SHARE_MK?=${SHAREDIR}/mk MKSRC=${srcdir}/mk INSTALL?=${srcdir}/install-sh .if ${MK_INSTALL_MK} == "yes" install: install-mk .endif beforeinstall: test -d ${DESTDIR}${BINDIR} || ${INSTALL} -m 775 -d ${DESTDIR}${BINDIR} test -d ${DESTDIR}${MANDEST} || ${INSTALL} -m 775 -d ${DESTDIR}${MANDEST} install-mk: .if exists(${MKSRC}/install-mk) test -d ${DESTDIR}${SHARE_MK} || ${INSTALL} -m 775 -d ${DESTDIR}${SHARE_MK} sh ${MKSRC}/install-mk -v -m 644 ${DESTDIR}${SHARE_MK} .else @echo need to unpack mk.tar.gz under ${srcdir} or set MKSRC; false .endif # end-delete2 # A simple unit-test driver to help catch regressions accept test: cd ${.CURDIR}/unit-tests && MAKEFLAGS= ${.MAKE} -r -m / TEST_MAKE=${TEST_MAKE:U${.OBJDIR}/${PROG:T}} ${.TARGET} Index: projects/clang390-import/contrib/bmake/bmake.1 =================================================================== --- projects/clang390-import/contrib/bmake/bmake.1 (revision 305686) +++ projects/clang390-import/contrib/bmake/bmake.1 (revision 305687) @@ -1,2331 +1,2346 @@ -.\" $NetBSD: make.1,v 1.259 2016/06/03 07:07:37 wiz Exp $ +.\" $NetBSD: make.1,v 1.262 2016/08/18 19:23:20 wiz Exp $ .\" .\" Copyright (c) 1990, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" from: @(#)make.1 8.4 (Berkeley) 3/19/94 .\" -.Dd June 2, 2016 +.Dd August 15, 2016 .Dt MAKE 1 .Os .Sh NAME .Nm bmake .Nd maintain program dependencies .Sh SYNOPSIS .Nm .Op Fl BeikNnqrstWwX .Op Fl C Ar directory .Op Fl D Ar variable .Op Fl d Ar flags .Op Fl f Ar makefile .Op Fl I Ar directory .Op Fl J Ar private .Op Fl j Ar max_jobs .Op Fl m Ar directory .Op Fl T Ar file .Op Fl V Ar variable .Op Ar variable=value .Op Ar target ... .Sh DESCRIPTION .Nm is a program designed to simplify the maintenance of other programs. Its input is a list of specifications as to the files upon which programs and other files depend. If no .Fl f Ar makefile makefile option is given, .Nm will try to open .Ql Pa makefile then .Ql Pa Makefile in order to find the specifications. If the file .Ql Pa .depend exists, it is read (see .Xr mkdep 1 ) . .Pp This manual page is intended as a reference document only. For a more thorough description of .Nm and makefiles, please refer to .%T "PMake \- A Tutorial" . .Pp .Nm will prepend the contents of the .Va MAKEFLAGS environment variable to the command line arguments before parsing them. .Pp The options are as follows: .Bl -tag -width Ds .It Fl B Try to be backwards compatible by executing a single shell per command and by executing the commands to make the sources of a dependency line in sequence. .It Fl C Ar directory Change to .Ar directory before reading the makefiles or doing anything else. If multiple .Fl C options are specified, each is interpreted relative to the previous one: .Fl C Pa / Fl C Pa etc is equivalent to .Fl C Pa /etc . .It Fl D Ar variable Define .Ar variable to be 1, in the global context. .It Fl d Ar [-]flags Turn on debugging, and specify which portions of .Nm are to print debugging information. Unless the flags are preceded by .Ql \- they are added to the .Va MAKEFLAGS environment variable and will be processed by any child make processes. By default, debugging information is printed to standard error, but this can be changed using the .Ar F debugging flag. The debugging output is always unbuffered; in addition, if debugging is enabled but debugging output is not directed to standard output, then the standard output is line buffered. .Ar Flags is one or more of the following: .Bl -tag -width Ds .It Ar A Print all possible debugging information; equivalent to specifying all of the debugging flags. .It Ar a Print debugging information about archive searching and caching. .It Ar C Print debugging information about current working directory. .It Ar c Print debugging information about conditional evaluation. .It Ar d Print debugging information about directory searching and caching. .It Ar e Print debugging information about failed commands and targets. .It Ar F Ns Oo Sy \&+ Oc Ns Ar filename Specify where debugging output is written. This must be the last flag, because it consumes the remainder of the argument. If the character immediately after the .Ql F flag is .Ql \&+ , then the file will be opened in append mode; otherwise the file will be overwritten. If the file name is .Ql stdout or .Ql stderr then debugging output will be written to the standard output or standard error output file descriptors respectively (and the .Ql \&+ option has no effect). Otherwise, the output will be written to the named file. If the file name ends .Ql .%d then the .Ql %d is replaced by the pid. .It Ar f Print debugging information about loop evaluation. .It Ar "g1" Print the input graph before making anything. 
.It Ar "g2" Print the input graph after making everything, or before exiting on error. .It Ar "g3" Print the input graph before exiting on error. .It Ar j Print debugging information about running multiple shells. .It Ar l Print commands in Makefiles regardless of whether or not they are prefixed by .Ql @ or other "quiet" flags. Also known as "loud" behavior. .It Ar M Print debugging information about "meta" mode decisions about targets. .It Ar m Print debugging information about making targets, including modification dates. .It Ar n Don't delete the temporary command scripts created when running commands. These temporary scripts are created in the directory referred to by the .Ev TMPDIR environment variable, or in .Pa /tmp if .Ev TMPDIR is unset or set to the empty string. The temporary scripts are created by .Xr mkstemp 3 , and have names of the form .Pa makeXXXXXX . .Em NOTE : This can create many files in .Ev TMPDIR or .Pa /tmp , so use with care. .It Ar p Print debugging information about makefile parsing. .It Ar s Print debugging information about suffix-transformation rules. .It Ar t Print debugging information about target list maintenance. .It Ar V Force the .Fl V option to print raw values of variables. .It Ar v Print debugging information about variable assignment. .It Ar x Run shell commands with .Fl x so the actual commands are printed as they are executed. .El .It Fl e Specify that environment variables override macro assignments within makefiles. .It Fl f Ar makefile Specify a makefile to read instead of the default .Ql Pa makefile . If .Ar makefile is .Ql Fl , standard input is read. Multiple makefiles may be specified, and are read in the order specified. .It Fl I Ar directory Specify a directory in which to search for makefiles and included makefiles. The system makefile directory (or directories, see the .Fl m option) is automatically included as part of this list. .It Fl i Ignore non-zero exit of shell commands in the makefile. Equivalent to specifying .Ql Fl before each command line in the makefile. .It Fl J Ar private This option should .Em not be specified by the user. .Pp When the .Ar j option is in use in a recursive build, this option is passed by a make to child makes to allow all the make processes in the build to cooperate to avoid overloading the system. .It Fl j Ar max_jobs Specify the maximum number of jobs that .Nm may have running at any one time. The value is saved in .Va .MAKE.JOBS . Turns compatibility mode off, unless the .Ar B flag is also specified. When compatibility mode is off, all commands associated with a target are executed in a single shell invocation as opposed to the traditional one shell invocation per line. This can break traditional scripts which change directories on each command invocation and then expect to start with a fresh environment on the next line. It is more efficient to correct the scripts rather than turn backwards compatibility on. .It Fl k Continue processing after errors are encountered, but only on those targets that do not depend on the target whose creation caused the error. .It Fl m Ar directory Specify a directory in which to search for sys.mk and makefiles included via the .Ao Ar file Ac Ns -style include statement. The .Fl m option can be used multiple times to form a search path. This path will override the default system include path: /usr/share/mk. Furthermore the system include path will be appended to the search path used for .Qo Ar file Qc Ns -style include statements (see the .Fl I option). 
.Pp If a file or directory name in the .Fl m argument (or the .Ev MAKESYSPATH environment variable) starts with the string .Qq \&.../ then .Nm will search for the specified file or directory named in the remaining part of the argument string. The search starts with the current directory of the Makefile and then works upward towards the root of the file system. If the search is successful, then the resulting directory replaces the .Qq \&.../ specification in the .Fl m argument. If used, this feature allows .Nm to easily search in the current source tree for customized sys.mk files (e.g., by using .Qq \&.../mk/sys.mk as an argument). .It Fl n Display the commands that would have been executed, but do not actually execute them unless the target depends on the .MAKE special source (see below). .It Fl N Display the commands which would have been executed, but do not actually execute any of them; useful for debugging top-level makefiles without descending into subdirectories. .It Fl q Do not execute any commands, but exit 0 if the specified targets are up-to-date and 1 otherwise. .It Fl r Do not use the built-in rules specified in the system makefile. .It Fl s Do not echo any commands as they are executed. Equivalent to specifying .Ql Ic @ before each command line in the makefile. .It Fl T Ar tracefile When used with the .Fl j flag, append a trace record to .Ar tracefile for each job started and completed. .It Fl t Rather than re-building a target as specified in the makefile, create it or update its modification time to make it appear up-to-date. .It Fl V Ar variable Print .Nm Ns 's idea of the value of .Ar variable , in the global context. Do not build any targets. Multiple instances of this option may be specified; the variables will be printed one per line, with a blank line for each null or undefined variable. If .Ar variable contains a .Ql \&$ then the value will be expanded before printing. .It Fl W Treat any warnings during makefile parsing as errors. .It Fl w Print entering and leaving directory messages, pre- and post-processing. .It Fl X Don't export variables passed on the command line to the environment individually. Variables passed on the command line are still exported via the .Va MAKEFLAGS environment variable. This option may be useful on systems which have a small limit on the size of command arguments. .It Ar variable=value Set the value of the variable .Ar variable to .Ar value . Normally, all values passed on the command line are also exported to sub-makes in the environment. The .Fl X flag disables this behavior. Variable assignments should follow options for POSIX compatibility but no ordering is enforced. .El .Pp There are seven different types of lines in a makefile: file dependency specifications, shell commands, variable assignments, include statements, conditional directives, for loops, and comments. .Pp In general, lines may be continued from one line to the next by ending them with a backslash .Pq Ql \e . The trailing newline character and initial whitespace on the following line are compressed into a single space. .Sh FILE DEPENDENCY SPECIFICATIONS Dependency lines consist of one or more targets, an operator, and zero or more sources. This creates a relationship where the targets .Dq depend on the sources and are usually created from them. The exact relationship between the target and the source is determined by the operator that separates them.
The three operators are as follows: .Bl -tag -width flag .It Ic \&: A target is considered out-of-date if its modification time is less than those of any of its sources. Sources for a target accumulate over dependency lines when this operator is used. The target is removed if .Nm is interrupted. .It Ic \&! Targets are always re-created, but not until all sources have been examined and re-created as necessary. Sources for a target accumulate over dependency lines when this operator is used. The target is removed if .Nm is interrupted. .It Ic \&:: If no sources are specified, the target is always re-created. Otherwise, a target is considered out-of-date if any of its sources has been modified more recently than the target. Sources for a target do not accumulate over dependency lines when this operator is used. The target will not be removed if .Nm is interrupted. .El .Pp Targets and sources may contain the shell wildcard values .Ql \&? , .Ql * , .Ql [] , and .Ql {} . The values .Ql \&? , .Ql * , and .Ql [] may only be used as part of the final component of the target or source, and must be used to describe existing files. The value .Ql {} need not necessarily be used to describe existing files. Expansion is in directory order, not alphabetically as done in the shell. .Sh SHELL COMMANDS Each target may have associated with it one or more lines of shell commands, normally used to create the target. Each of the lines in this script .Em must be preceded by a tab. (For historical reasons, spaces are not accepted.) While targets can appear in many dependency lines if desired, by default only one of these rules may be followed by a creation script. If the .Ql Ic \&:: operator is used, however, all rules may include scripts and the scripts are executed in the order found. .Pp Each line is treated as a separate shell command, unless the end of line is escaped with a backslash .Pq Ql \e in which case that line and the next are combined. .\" The escaped newline is retained and passed to the shell, which .\" normally ignores it. .\" However, the tab at the beginning of the following line is removed. If the first characters of the command are any combination of .Ql Ic @ , .Ql Ic + , or .Ql Ic \- , the command is treated specially. A .Ql Ic @ causes the command not to be echoed before it is executed. A .Ql Ic + causes the command to be executed even when .Fl n is given. This is similar to the effect of the .MAKE special source, except that the effect can be limited to a single line of a script. A .Ql Ic \- in compatibility mode causes any non-zero exit status of the command line to be ignored. .Pp When .Nm is run in jobs mode with .Fl j Ar max_jobs , the entire script for the target is fed to a single instance of the shell. In compatibility (non-jobs) mode, each command is run in a separate process. If the command contains any shell meta characters .Pq Ql #=|^(){};&<>*?[]:$`\e\en it will be passed to the shell; otherwise .Nm will attempt direct execution. If a line starts with .Ql Ic \- and the shell has ErrCtl enabled then failure of the command line will be ignored as in compatibility mode. Otherwise .Ql Ic \- affects the entire job; the script will stop at the first command line that fails, but the target will not be deemed to have failed. .Pp Makefiles should be written so that the mode of .Nm operation does not change their behavior. 
For example, any command which needs to use .Dq cd or .Dq chdir without potentially changing the directory for subsequent commands should be put in parentheses so it executes in a subshell. To force the use of one shell, escape the line breaks so as to make the whole script one command. For example: .Bd -literal -offset indent avoid-chdir-side-effects: @echo Building $@ in `pwd` @(cd ${.CURDIR} && ${MAKE} $@) @echo Back in `pwd` ensure-one-shell-regardless-of-mode: @echo Building $@ in `pwd`; \e (cd ${.CURDIR} && ${MAKE} $@); \e echo Back in `pwd` .Ed .Pp Since .Nm will .Xr chdir 2 to .Ql Va .OBJDIR before executing any targets, each child process starts with that as its current working directory. .Sh VARIABLE ASSIGNMENTS Variables in make are much like variables in the shell, and, by tradition, consist of all upper-case letters. .Ss Variable assignment modifiers The five operators that can be used to assign values to variables are as follows: .Bl -tag -width Ds .It Ic \&= Assign the value to the variable. Any previous value is overridden. .It Ic \&+= Append the value to the current value of the variable. .It Ic \&?= Assign the value to the variable if it is not already defined. .It Ic \&:= Assign with expansion, i.e., expand the value before assigning it to the variable. Normally, expansion is not done until the variable is referenced. .Em NOTE : References to undefined variables are .Em not expanded. This can cause problems when variable modifiers are used. .It Ic \&!= Expand the value and pass it to the shell for execution and assign the result to the variable. Any newlines in the result are replaced with spaces. .El .Pp Any white-space before the assigned .Ar value is removed; if the value is being appended, a single space is inserted between the previous contents of the variable and the appended value. .Pp Variables are expanded by surrounding the variable name with either curly braces .Pq Ql {} or parentheses .Pq Ql () and preceding it with a dollar sign .Pq Ql \&$ . If the variable name contains only a single letter, the surrounding braces or parentheses are not required. This shorter form is not recommended. .Pp If the variable name contains a dollar, then the name itself is expanded first. This allows almost arbitrary variable names; however, names containing dollar, braces, parentheses, or whitespace are really best avoided! .Pp If the result of expanding a variable contains a dollar sign .Pq Ql \&$ the string is expanded again. .Pp Variable substitution occurs at three distinct times, depending on where the variable is being used. .Bl -enum .It Variables in dependency lines are expanded as the line is read. .It Variables in shell commands are expanded when the shell command is executed. .It .Dq .for loop index variables are expanded on each loop iteration. Note that other variables are not expanded inside loops, so the following example code: .Bd -literal -offset indent .Dv .for i in 1 2 3 a+= ${i} j= ${i} b+= ${j} .Dv .endfor all: @echo ${a} @echo ${b} .Ed will print: .Bd -literal -offset indent 1 2 3 3 3 3 .Ed Because while ${a} contains .Dq 1 2 3 after the loop is executed, ${b} contains .Dq ${j} ${j} ${j} which expands to .Dq 3 3 3 since after the loop completes ${j} contains .Dq 3 . .El .Ss Variable classes The four different classes of variables (in order of increasing precedence) are: .Bl -tag -width Ds .It Environment variables Variables defined as part of .Nm Ns 's environment. .It Global variables Variables defined in the makefile or in included makefiles.
.It Command line variables Variables defined as part of the command line. .It Local variables Variables that are defined specific to a certain target. .El .Pp Local variables are all built in and their values vary magically from target to target. It is not currently possible to define new local variables. The seven local variables are as follows: .Bl -tag -width ".ARCHIVE" -offset indent .It Va .ALLSRC The list of all sources for this target; also known as .Ql Va \&\*[Gt] . .It Va .ARCHIVE The name of the archive file; also known as .Ql Va \&! . .It Va .IMPSRC In suffix-transformation rules, the name/path of the source from which the target is to be transformed (the .Dq implied source); also known as .Ql Va \&\*[Lt] . It is not defined in explicit rules. .It Va .MEMBER The name of the archive member; also known as .Ql Va % . .It Va .OODATE The list of sources for this target that were deemed out-of-date; also known as .Ql Va \&? . .It Va .PREFIX The file prefix of the target, containing only the file portion, no suffix or preceding directory components; also known as .Ql Va * . The suffix must be one of the known suffixes declared with .Ic .SUFFIXES or it will not be recognized. .It Va .TARGET The name of the target; also known as .Ql Va @ . For compatibility with other makes this is an alias for .Ic .ARCHIVE in archive member rules. .El .Pp The shorter forms .Ql ( Va \*[Gt] , .Ql Va \&! , .Ql Va \*[Lt] , .Ql Va % , .Ql Va \&? , .Ql Va * , and .Ql Va @ ) are permitted for backward compatibility with historical makefiles and legacy POSIX make and are not recommended. .Pp Variants of these variables with the punctuation followed immediately by .Ql D or .Ql F , e.g. .Ql Va $(@D) , are legacy forms equivalent to using the .Ql :H and .Ql :T modifiers. These forms are accepted for compatibility with .At V makefiles and POSIX but are not recommended. .Pp Four of the local variables may be used in sources on dependency lines because they expand to the proper value for each target on the line. These variables are .Ql Va .TARGET , .Ql Va .PREFIX , .Ql Va .ARCHIVE , and .Ql Va .MEMBER . .Ss Additional built-in variables In addition, .Nm sets or knows about the following variables: .Bl -tag -width .MAKEOVERRIDES .It Va \&$ A single dollar sign .Ql \&$ , i.e. .Ql \&$$ expands to a single dollar sign. .It Va .ALLTARGETS The list of all targets encountered in the Makefile. If evaluated during Makefile parsing, lists only those targets encountered thus far. .It Va .CURDIR A path to the directory where .Nm was executed. Refer to the description of .Ql Ev PWD for more details. .It Va .INCLUDEDFROMDIR The directory of the file this Makefile was included from. .It Va .INCLUDEDFROMFILE The filename of the file this Makefile was included from. .It Ev MAKE The name that .Nm was executed with .Pq Va argv[0] . For compatibility .Nm also sets .Va .MAKE with the same value. The preferred variable to use is the environment variable .Ev MAKE because it is more compatible with other versions of .Nm and cannot be confused with the special target with the same name. .It Va .MAKE.DEPENDFILE Names the makefile (default .Ql Pa .depend ) from which generated dependencies are read. .It Va .MAKE.EXPAND_VARIABLES A boolean that controls the default behavior of the .Fl V option. .It Va .MAKE.EXPORTED The list of variables exported by .Nm . .It Va .MAKE.JOBS The argument to the .Fl j option. 
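.Pp
For example, a makefile can consume this value and fall back to 1 when
.Fl j
was not given (a minimal sketch; the
.Ql JOBS
name is illustrative and the
.Ql Cm \&:U
modifier is described below):
.Bd -literal -offset indent
# defaults to 1 unless make was invoked with -j
JOBS= ${.MAKE.JOBS:U1}
.Ed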
.It Va .MAKE.JOB.PREFIX If .Nm is run with .Ar j then output for each target is prefixed with a token .Ql --- target --- the first part of which can be controlled via .Va .MAKE.JOB.PREFIX . If .Va .MAKE.JOB.PREFIX is empty, no token is printed. .br For example: .Li .MAKE.JOB.PREFIX=${.newline}---${.MAKE:T}[${.MAKE.PID}] would produce tokens like .Ql ---make[1234] target --- making it easier to track the degree of parallelism being achieved. .It Ev MAKEFLAGS The environment variable .Ql Ev MAKEFLAGS may contain anything that may be specified on .Nm Ns 's command line. Anything specified on .Nm Ns 's command line is appended to the .Ql Ev MAKEFLAGS variable which is then entered into the environment for all programs which .Nm executes. .It Va .MAKE.LEVEL The recursion depth of .Nm . The initial instance of .Nm will be 0, and an incremented value is put into the environment to be seen by the next generation. This allows tests like: .Li .if ${.MAKE.LEVEL} == 0 to protect things which should only be evaluated in the initial instance of .Nm . .It Va .MAKE.MAKEFILE_PREFERENCE The ordered list of makefile names (default .Ql Pa makefile , .Ql Pa Makefile ) that .Nm will look for. .It Va .MAKE.MAKEFILES The list of makefiles read by .Nm , which is useful for tracking dependencies. Each makefile is recorded only once, regardless of the number of times read. .It Va .MAKE.MODE Processed after reading all makefiles. Can affect the mode that .Nm runs in. It can contain a number of keywords: .Bl -hang -width missing-filemon=bf. .It Pa compat Like .Fl B , puts .Nm into "compat" mode. .It Pa meta Puts .Nm into "meta" mode, where meta files are created for each target to capture the command run, the output generated, and, if .Xr filemon 4 is available, the system calls which are of interest to .Nm . The captured output can be very useful when diagnosing errors. .It Pa curdirOk= Ar bf Normally .Nm will not create .meta files in .Ql Va .CURDIR . This can be overridden by setting .Va bf to a value which represents True. .It Pa missing-meta= Ar bf If .Va bf is True, then a missing .meta file makes the target out-of-date. .It Pa missing-filemon= Ar bf If .Va bf is True, then missing filemon data makes the target out-of-date. .It Pa nofilemon Do not use .Xr filemon 4 . .It Pa env For debugging, it can be useful to include the environment in the .meta file. .It Pa verbose If in "meta" mode, print a clue about the target being built. This is useful if the build is otherwise running silently. The message printed is the value of .Va .MAKE.META.PREFIX . .It Pa ignore-cmd Some makefiles have commands which are simply not stable. This keyword causes them to be ignored for determining whether a target is out of date in "meta" mode. See also .Ic .NOMETA_CMP . .It Pa silent= Ar bf If .Va bf is True, when a .meta file is created, mark the target .Ic .SILENT . .El .It Va .MAKE.META.BAILIWICK In "meta" mode, provides a list of prefixes which match the directories controlled by .Nm . If a file that was generated outside of .Va .OBJDIR but within said bailiwick is missing, the current target is considered out-of-date. .It Va .MAKE.META.CREATED In "meta" mode, this variable contains a list of all the meta files updated. If not empty, it can be used to trigger processing of .Va .MAKE.META.FILES . .It Va .MAKE.META.FILES In "meta" mode, this variable contains a list of all the meta files used (updated or not). This list can be used to process the meta files to extract dependency information.
.It Va .MAKE.META.IGNORE_PATHS Provides a list of path prefixes that should be ignored, because the contents are expected to change over time. The default list includes: .Ql Pa /dev /etc /proc /tmp /var/run /var/tmp .It Va .MAKE.META.IGNORE_PATTERNS Provides a list of patterns to match against pathnames. Ignore any that match. +.It Va .MAKE.META.IGNORE_FILTER +Provides a list of variable modifiers to apply to each pathname. +Ignore if the expansion is an empty string. .It Va .MAKE.META.PREFIX Defines the message printed for each meta file updated in "meta verbose" mode. The default value is: .Dl Building ${.TARGET:H:tA}/${.TARGET:T} .It Va .MAKEOVERRIDES This variable is used to record the names of variables assigned to on the command line, so that they may be exported as part of .Ql Ev MAKEFLAGS . This behavior can be disabled by assigning an empty value to .Ql Va .MAKEOVERRIDES within a makefile. Extra variables can be exported from a makefile by appending their names to .Ql Va .MAKEOVERRIDES . .Ql Ev MAKEFLAGS is re-exported whenever .Ql Va .MAKEOVERRIDES is modified. .It Va .MAKE.PATH_FILEMON If .Nm was built with .Xr filemon 4 support, this is set to the path of the device node. This allows makefiles to test for this support. .It Va .MAKE.PID The process-id of .Nm . .It Va .MAKE.PPID The parent process-id of .Nm . .It Va .MAKE.SAVE_DOLLARS value should be a boolean that controls whether .Ql $$ are preserved when doing .Ql := assignments. The default is false, for backwards compatibility. Set to true for compatibility with other makes. If set to false, .Ql $$ becomes .Ql $ per normal evaluation rules. .It Va MAKE_PRINT_VAR_ON_ERROR When .Nm -stops due to an error, it prints its name and the value of +stops due to an error, it sets +.Ql Va .ERROR_TARGET +to the name of the target that failed, +.Ql Va .ERROR_CMD +to the commands of the failed target, +and in "meta" mode, it also sets +.Ql Va .ERROR_CWD +to the +.Xr getcwd 3 , +and +.Ql Va .ERROR_META_FILE +to the path of the meta file (if any) describing the failed target. +It then prints its name and the value of .Ql Va .CURDIR as well as the value of any variables named in .Ql Va MAKE_PRINT_VAR_ON_ERROR . .It Va .newline This variable is simply assigned a newline character as its value. This allows expansions using the .Cm \&:@ modifier to put a newline between iterations of the loop rather than a space. For example, the printing of .Ql Va MAKE_PRINT_VAR_ON_ERROR could be done as ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@}. .It Va .OBJDIR A path to the directory where the targets are built. Its value is determined by trying to .Xr chdir 2 to the following directories in order and using the first match: .Bl -enum .It .Ev ${MAKEOBJDIRPREFIX}${.CURDIR} .Pp (Only if .Ql Ev MAKEOBJDIRPREFIX is set in the environment or on the command line.) .It .Ev ${MAKEOBJDIR} .Pp (Only if .Ql Ev MAKEOBJDIR is set in the environment or on the command line.) .It .Ev ${.CURDIR} Ns Pa /obj. Ns Ev ${MACHINE} .It .Ev ${.CURDIR} Ns Pa /obj .It .Pa /usr/obj/ Ns Ev ${.CURDIR} .It .Ev ${.CURDIR} .El .Pp Variable expansion is performed on the value before it's used, so expressions such as .Dl ${.CURDIR:S,^/usr/src,/var/obj,} may be used. This is especially useful with .Ql Ev MAKEOBJDIR . .Pp .Ql Va .OBJDIR may be modified in the makefile via the special target .Ql Ic .OBJDIR . In all cases, .Nm will .Xr chdir 2 to the specified directory if it exists, and set .Ql Va .OBJDIR and .Ql Ev PWD to that directory before executing any targets. .
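.Pp
For example (an illustrative sketch; the paths are hypothetical), with
.Ql Ev MAKEOBJDIRPREFIX
set to
.Pa /usr/obj
and
.Ql Va .CURDIR
being
.Pa /usr/src/bin/ls ,
.Nm
will use
.Pa /usr/obj/usr/src/bin/ls
as
.Ql Va .OBJDIR
if that directory exists.
A trivial target can confirm the directory in use:
.Bd -literal -offset indent
show-objdir:
	@echo ${.OBJDIR}
.Ed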
.It Va .PARSEDIR A path to the directory of the current .Ql Pa Makefile being parsed. .It Va .PARSEFILE The basename of the current .Ql Pa Makefile being parsed. This variable and .Ql Va .PARSEDIR are both set only while the .Ql Pa Makefiles are being parsed. If you want to retain their current values, assign them to a variable using assignment with expansion: .Pq Ql Cm \&:= . .It Va .PATH A variable that represents the list of directories that .Nm will search for files. The search list should be updated using the target .Ql Va .PATH rather than the variable. .It Ev PWD Alternate path to the current directory. .Nm normally sets .Ql Va .CURDIR to the canonical path given by .Xr getcwd 3 . However, if the environment variable .Ql Ev PWD is set and gives a path to the current directory, then .Nm sets .Ql Va .CURDIR to the value of .Ql Ev PWD instead. This behavior is disabled if .Ql Ev MAKEOBJDIRPREFIX is set or .Ql Ev MAKEOBJDIR contains a variable transform. .Ql Ev PWD is set to the value of .Ql Va .OBJDIR for all programs which .Nm executes. .It Ev .TARGETS The list of targets explicitly specified on the command line, if any. .It Ev VPATH Colon-separated .Pq Dq \&: lists of directories that .Nm will search for files. The variable is supported for compatibility with old make programs only; use .Ql Va .PATH instead. .El .Ss Variable modifiers Variable expansion may be modified to select or modify each word of the variable (where a .Dq word is a white-space delimited sequence of characters). The general format of a variable expansion is as follows: .Pp .Dl ${variable[:modifier[:...]]} .Pp Each modifier begins with a colon, which may be escaped with a backslash .Pq Ql \e . .Pp A set of modifiers can be specified via a variable, as follows: .Pp .Dl modifier_variable=modifier[:...] .Dl ${variable:${modifier_variable}[:...]} .Pp In this case the first modifier in the modifier_variable does not start with a colon, since that must appear in the referencing variable. If any of the modifiers in the modifier_variable contain a dollar sign .Pq Ql $ , these must be doubled to avoid early expansion. .Pp The supported modifiers are: .Bl -tag -width EEE .It Cm \&:E Replaces each word in the variable with its suffix. .It Cm \&:H Replaces each word in the variable with everything but the last component. .It Cm \&:M Ns Ar pattern Select only those words that match .Ar pattern . The standard shell wildcard characters .Pf ( Ql * , .Ql \&? , and .Ql Oo Oc ) may be used. The wildcard characters may be escaped with a backslash .Pq Ql \e . As a consequence of the way values are split into words, matched, and then joined, a construct like .Dl ${VAR:M*} will normalize the inter-word spacing, removing all leading and trailing space, and converting multiple consecutive spaces to single spaces. . .It Cm \&:N Ns Ar pattern This is identical to .Ql Cm \&:M , but selects all words which do not match .Ar pattern . .It Cm \&:O Order every word in variable alphabetically. To sort words in reverse order use the .Ql Cm \&:O:[-1..1] combination of modifiers. .It Cm \&:Ox Randomize words in variable. The results will be different each time you are referring to the modified variable; use the assignment with expansion .Pq Ql Cm \&:= to prevent such behavior.
For example, .Bd -literal -offset indent LIST= uno due tre quattro RANDOM_LIST= ${LIST:Ox} STATIC_RANDOM_LIST:= ${LIST:Ox} all: @echo "${RANDOM_LIST}" @echo "${RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" .Ed may produce output similar to: .Bd -literal -offset indent quattro due tre uno tre due quattro uno due uno quattro tre due uno quattro tre .Ed .It Cm \&:Q Quotes every shell meta-character in the variable, so that it can be passed safely through recursive invocations of .Nm . .It Cm \&:R Replaces each word in the variable with everything but its suffix. .It Cm \&:gmtime The value is a format string for .Xr strftime 3 , using the current .Xr gmtime 3 . .It Cm \&:hash Compute a 32-bit hash of the value and encode it as hex digits. .It Cm \&:localtime The value is a format string for .Xr strftime 3 , using the current .Xr localtime 3 . .It Cm \&:tA Attempt to convert variable to an absolute path using .Xr realpath 3 ; if that fails, the value is unchanged. .It Cm \&:tl Converts variable to lower-case letters. .It Cm \&:ts Ns Ar c Words in the variable are normally separated by a space on expansion. This modifier sets the separator to the character .Ar c . If .Ar c is omitted, then no separator is used. The common escapes (including octal numeric codes) work as expected. .It Cm \&:tu Converts variable to upper-case letters. .It Cm \&:tW Causes the value to be treated as a single word (possibly containing embedded white space). See also .Ql Cm \&:[*] . .It Cm \&:tw Causes the value to be treated as a sequence of words delimited by white space. See also .Ql Cm \&:[@] . .Sm off .It Cm \&:S No \&/ Ar old_string No \&/ Ar new_string No \&/ Op Cm 1gW .Sm on Modify the first occurrence of .Ar old_string in the variable's value, replacing it with .Ar new_string . If a .Ql g is appended to the last slash of the pattern, all occurrences in each word are replaced. If a .Ql 1 is appended to the last slash of the pattern, only the first word is affected. If a .Ql W is appended to the last slash of the pattern, then the value is treated as a single word (possibly containing embedded white space). If .Ar old_string begins with a caret .Pq Ql ^ , .Ar old_string is anchored at the beginning of each word. If .Ar old_string ends with a dollar sign .Pq Ql \&$ , it is anchored at the end of each word. Inside .Ar new_string , an ampersand .Pq Ql \*[Am] is replaced by .Ar old_string (without any .Ql ^ or .Ql \&$ ) . Any character may be used as a delimiter for the parts of the modifier string. The anchoring, ampersand and delimiter characters may be escaped with a backslash .Pq Ql \e . .Pp Variable expansion occurs in the normal fashion inside both .Ar old_string and .Ar new_string with the single exception that a backslash is used to prevent the expansion of a dollar sign .Pq Ql \&$ , not a preceding dollar sign as is usual. .Sm off .It Cm \&:C No \&/ Ar pattern No \&/ Ar replacement No \&/ Op Cm 1gW .Sm on The .Cm \&:C modifier is just like the .Cm \&:S modifier except that the old and new strings, instead of being simple strings, are an extended regular expression (see .Xr regex 3 ) string .Ar pattern and an .Xr ed 1 Ns \-style string .Ar replacement . Normally, the first occurrence of the pattern .Ar pattern in each word of the value is substituted with .Ar replacement .
The .Ql 1 modifier causes the substitution to apply to at most one word; the .Ql g modifier causes the substitution to apply to as many instances of the search pattern .Ar pattern as occur in the word or words it is found in; the .Ql W modifier causes the value to be treated as a single word (possibly containing embedded white space). Note that .Ql 1 and .Ql g are orthogonal; the former specifies whether multiple words are potentially affected, the latter whether multiple substitutions can potentially occur within each affected word. .Pp As for the .Cm \&:S modifier, the .Ar pattern and .Ar replacement are subjected to variable expansion before being parsed as regular expressions. .It Cm \&:T Replaces each word in the variable with its last component. .It Cm \&:u Remove adjacent duplicate words (like .Xr uniq 1 ) . .Sm off .It Cm \&:\&? Ar true_string Cm \&: Ar false_string .Sm on If the variable name (not its value), when parsed as a .if conditional expression, evaluates to true, return as its value the .Ar true_string , otherwise return the .Ar false_string . Since the variable name is used as the expression, \&:\&? must be the first modifier after the variable name itself - which will, of course, usually contain variable expansions. A common error is trying to use expressions like .Dl ${NUMBERS:M42:?match:no} which actually tests defined(NUMBERS); to determine if any words match "42" you need to use something like: .Dl ${"${NUMBERS:M42}" != \&"\&":?match:no} . .It Ar :old_string=new_string This is the .At V style variable substitution. It must be the last modifier specified. If .Ar old_string or .Ar new_string do not contain the pattern matching character .Ar % then it is assumed that they are anchored at the end of each word, so only suffixes or entire words may be replaced. Otherwise .Ar % is the substring of .Ar old_string to be replaced in .Ar new_string . .Pp Variable expansion occurs in the normal fashion inside both .Ar old_string and .Ar new_string with the single exception that a backslash is used to prevent the expansion of a dollar sign .Pq Ql \&$ , not a preceding dollar sign as is usual. .Sm off .It Cm \&:@ Ar temp Cm @ Ar string Cm @ .Sm on This is the loop expansion mechanism from the OSF Development Environment (ODE) make. Unlike .Cm \&.for loops, expansion occurs at the time of reference. Assign .Ar temp to each word in the variable and evaluate .Ar string . The ODE convention is that .Ar temp should start and end with a period. For example: .Dl ${LINKS:@.LINK.@${LN} ${TARGET} ${.LINK.}@} .Pp However a single character variable is often more readable: .Dl ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@} .It Cm \&:U Ns Ar newval If the variable is undefined .Ar newval is the value. If the variable is defined, the existing value is returned. This is another ODE make feature. It is handy for setting per-target CFLAGS for instance: .Dl ${_${.TARGET:T}_CFLAGS:U${DEF_CFLAGS}} If a value is only required if the variable is undefined, use: .Dl ${VAR:D:Unewval} .It Cm \&:D Ns Ar newval If the variable is defined .Ar newval is the value. .It Cm \&:L The name of the variable is the value. .It Cm \&:P The path of the node which has the same name as the variable is the value. If no such node exists or its path is null, then the name of the variable is used. In order for this modifier to work, the name (node) must at least have appeared on the rhs of a dependency. .Sm off .It Cm \&:\&! Ar cmd Cm \&! .Sm on The output of running .Ar cmd is the value.
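.Pp
For example (a minimal sketch;
.Xr date 1
is just an illustration, and the empty variable name before the modifier is assumed to be accepted as a placeholder, as with
.Ql Cm \&:U ) :
.Bd -literal -offset indent
# date(1) runs each time ${NOW} is referenced
NOW= ${:!date!}
.Ed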
.It Cm \&:sh If the variable is non-empty it is run as a command and the output becomes the new value. .It Cm \&::= Ns Ar str The variable is assigned the value .Ar str after substitution. This modifier and its variations are useful in obscure situations such as wanting to set a variable when shell commands are being parsed. These assignment modifiers always expand to nothing, so if they appear in a rule line by themselves they should be preceded with something to keep .Nm happy. .Pp The .Ql Cm \&:: helps avoid false matches with the .At V style .Cm \&:= modifier, and since substitution always occurs the .Cm \&::= form is vaguely appropriate. .It Cm \&::?= Ns Ar str As for .Cm \&::= but only if the variable does not already have a value. .It Cm \&::+= Ns Ar str Append .Ar str to the variable. .It Cm \&::!= Ns Ar cmd Assign the output of .Ar cmd to the variable. .It Cm \&:\&[ Ns Ar range Ns Cm \&] Selects one or more words from the value, or performs other operations related to the way in which the value is divided into words. .Pp Ordinarily, a value is treated as a sequence of words delimited by white space. Some modifiers suppress this behavior, causing a value to be treated as a single word (possibly containing embedded white space). An empty value, or a value that consists entirely of white-space, is treated as a single word. For the purposes of the .Ql Cm \&:[] modifier, the words are indexed both forwards using positive integers (where index 1 represents the first word), and backwards using negative integers (where index \-1 represents the last word). .Pp The .Ar range is subjected to variable expansion, and the expanded result is then interpreted as follows: .Bl -tag -width index .\" :[n] .It Ar index Selects a single word from the value. .\" :[start..end] .It Ar start Ns Cm \&.. Ns Ar end Selects all words from .Ar start to .Ar end , inclusive. For example, .Ql Cm \&:[2..-1] selects all words from the second word to the last word. If .Ar start is greater than .Ar end , then the words are output in reverse order. For example, .Ql Cm \&:[-1..1] selects all the words from last to first. .\" :[*] .It Cm \&* Causes subsequent modifiers to treat the value as a single word (possibly containing embedded white space). Analogous to the effect of \&"$*\&" in Bourne shell. .\" :[0] .It 0 Means the same as .Ql Cm \&:[*] . .\" :[@] .It Cm \&@ Causes subsequent modifiers to treat the value as a sequence of words delimited by white space. Analogous to the effect of \&"$@\&" in Bourne shell. .\" :[#] .It Cm \&# Returns the number of words in the value. .El \" :[range] .El .Sh INCLUDE STATEMENTS, CONDITIONALS AND FOR LOOPS Makefile inclusion, conditional structures and for loops reminiscent of the C programming language are provided in .Nm . All such structures are identified by a line beginning with a single dot .Pq Ql \&. character. Files are included with either .Cm \&.include Aq Ar file or .Cm \&.include Pf \*q Ar file Ns \*q . Variables between the angle brackets or double quotes are expanded to form the file name. If angle brackets are used, the included makefile is expected to be in the system makefile directory. If double quotes are used, the including makefile's directory and any directories specified using the .Fl I option are searched before the system makefile directory. For compatibility with other versions of .Nm .Ql include file ... is also accepted. .Pp If the include statement is written as .Cm .-include or as .Cm .sinclude then errors locating and/or opening include files are ignored.
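.Pp
For example (only
.Pa sys.mk
is a real system makefile here; the other names are illustrative):
.Bd -literal -offset indent
\&.include <sys.mk>
\&.include \*qdefs.mk\*q
\&.-include \*qlocal.mk\*q
.Ed
.Pp
The first is searched for in the system makefile directory, the second relative to the including makefile (and any
.Fl I
directories), and the third is skipped without error if it cannot be found.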
.Pp If the include statement is written as .Cm .dinclude not only are errors locating and/or opening include files ignored, but stale dependencies within the included file will be ignored just like .Va .MAKE.DEPENDFILE . .Pp Conditional expressions are also preceded by a single dot as the first character of a line. The possible conditionals are as follows: .Bl -tag -width Ds .It Ic .error Ar message The message is printed along with the name of the makefile and line number, then .Nm will exit. .It Ic .export Ar variable ... Export the specified global variable. If no variable list is provided, all globals are exported except for internal variables (those that start with .Ql \&. ) . This is not affected by the .Fl X flag, so should be used with caution. For compatibility with other .Nm programs .Ql export variable=value is also accepted. .Pp Appending a variable name to .Va .MAKE.EXPORTED is equivalent to exporting a variable. .It Ic .export-env Ar variable ... The same as .Ql .export , except that the variable is not appended to .Va .MAKE.EXPORTED . This allows exporting a value to the environment which is different from that used by .Nm internally. .It Ic .export-literal Ar variable ... The same as .Ql .export-env , except that variables in the value are not expanded. .It Ic .info Ar message The message is printed along with the name of the makefile and line number. .It Ic .undef Ar variable Un-define the specified global variable. Only global variables may be un-defined. .It Ic .unexport Ar variable ... The opposite of .Ql .export . The specified global .Va variable will be removed from .Va .MAKE.EXPORTED . If no variable list is provided, all globals are unexported, and .Va .MAKE.EXPORTED deleted. .It Ic .unexport-env Unexport all globals previously exported and clear the environment inherited from the parent. This operation will cause a memory leak of the original environment, so should be used sparingly. Testing for .Va .MAKE.LEVEL being 0 would make sense. Also note that any variables which originated in the parent environment should be explicitly preserved if desired. For example: .Bd -literal -offset indent .Li .if ${.MAKE.LEVEL} == 0 PATH := ${PATH} .Li .unexport-env .Li .export PATH .Li .endif .Pp .Ed Would result in an environment containing only .Ql Ev PATH , which is the minimal useful environment. Actually .Ql Ev .MAKE.LEVEL will also be pushed into the new environment. .It Ic .warning Ar message The message prefixed by .Ql Pa warning: is printed along with the name of the makefile and line number. .It Ic \&.if Oo \&! Oc Ns Ar expression Op Ar operator expression ... Test the value of an expression. .It Ic .ifdef Oo \&! Oc Ns Ar variable Op Ar operator variable ... Test the value of a variable. .It Ic .ifndef Oo \&! Oc Ns Ar variable Op Ar operator variable ... Test the value of a variable. .It Ic .ifmake Oo \&! Oc Ns Ar target Op Ar operator target ... Test the target being built. .It Ic .ifnmake Oo \&! Ns Oc Ar target Op Ar operator target ... Test the target being built. .It Ic .else Reverse the sense of the last conditional. .It Ic .elif Oo \&! Ns Oc Ar expression Op Ar operator expression ... A combination of .Ql Ic .else followed by .Ql Ic .if . .It Ic .elifdef Oo \&! Oc Ns Ar variable Op Ar operator variable ... A combination of .Ql Ic .else followed by .Ql Ic .ifdef . .It Ic .elifndef Oo \&! Oc Ns Ar variable Op Ar operator variable ... A combination of .Ql Ic .else followed by .Ql Ic .ifndef . .It Ic .elifmake Oo \&! Oc Ns Ar target Op Ar operator target ...
A combination of .Ql Ic .else followed by .Ql Ic .ifmake . .It Ic .elifnmake Oo \&! Oc Ns Ar target Op Ar operator target ... A combination of .Ql Ic .else followed by .Ql Ic .ifnmake . .It Ic .endif End the body of the conditional. .El .Pp The .Ar operator may be any one of the following: .Bl -tag -width "Cm XX" .It Cm \&|\&| Logical OR. .It Cm \&\*[Am]\*[Am] Logical .Tn AND ; of higher precedence than .Dq \&|\&| . .El .Pp As in C, .Nm will only evaluate a conditional as far as is necessary to determine its value. Parentheses may be used to change the order of evaluation. The boolean operator .Ql Ic \&! may be used to logically negate an entire conditional. It is of higher precedence than .Ql Ic \&\*[Am]\*[Am] . .Pp The value of .Ar expression may be any of the following: .Bl -tag -width defined .It Ic defined Takes a variable name as an argument and evaluates to true if the variable has been defined. .It Ic make Takes a target name as an argument and evaluates to true if the target was specified as part of .Nm Ns 's command line or was declared the default target (either implicitly or explicitly, see .Va .MAIN ) before the line containing the conditional. .It Ic empty Takes a variable, with possible modifiers, and evaluates to true if the expansion of the variable would result in an empty string. .It Ic exists Takes a file name as an argument and evaluates to true if the file exists. The file is searched for on the system search path (see .Va .PATH ) . .It Ic target Takes a target name as an argument and evaluates to true if the target has been defined. .It Ic commands Takes a target name as an argument and evaluates to true if the target has been defined and has commands associated with it. .El .Pp .Ar Expression may also be an arithmetic or string comparison. Variable expansion is performed on both sides of the comparison, after which the integral values are compared. A value is interpreted as hexadecimal if it is preceded by 0x, otherwise it is decimal; octal numbers are not supported. The standard C relational operators are all supported. If after variable expansion, either the left or right hand side of a .Ql Ic == or .Ql Ic "!=" operator is not an integral value, then string comparison is performed between the expanded variables. If no relational operator is given, it is assumed that the expanded variable is being compared against 0 or an empty string in the case of a string comparison. .Pp When .Nm is evaluating one of these conditional expressions, and it encounters a (white-space separated) word it doesn't recognize, either the .Dq make or .Dq defined expression is applied to it, depending on the form of the conditional. If the form is .Ql Ic .ifdef , .Ql Ic .ifndef , or .Ql Ic .if the .Dq defined expression is applied. Similarly, if the form is .Ql Ic .ifmake or .Ql Ic .ifnmake , the .Dq make expression is applied. .Pp If the conditional evaluates to true the parsing of the makefile continues as before. If it evaluates to false, the following lines are skipped. In both cases this continues until a .Ql Ic .else or .Ql Ic .endif is found. .Pp For loops are typically used to apply a set of rules to a list of files. The syntax of a for loop is: .Pp .Bl -tag -compact -width Ds .It Ic \&.for Ar variable Oo Ar variable ... Oc Ic in Ar expression .It Aq make-rules .It Ic \&.endfor .El .Pp After the for .Ic expression is evaluated, it is split into words. 
On each iteration of the loop, one word is taken and assigned to each .Ic variable , in order, and these .Ic variables are substituted into the .Ic make-rules inside the body of the for loop. The number of words must come out even; that is, if there are three iteration variables, the number of words provided must be a multiple of three. .Sh COMMENTS Comments begin with a hash .Pq Ql \&# character, anywhere but in a shell command line, and continue to the end of an unescaped new line. .Sh SPECIAL SOURCES (ATTRIBUTES) .Bl -tag -width .IGNOREx .It Ic .EXEC Target is never out of date, but always execute commands anyway. .It Ic .IGNORE Ignore any errors from the commands associated with this target, exactly as if they all were preceded by a dash .Pq Ql \- . .\" .It Ic .INVISIBLE .\" XXX .\" .It Ic .JOIN .\" XXX .It Ic .MADE Mark all sources of this target as being up-to-date. .It Ic .MAKE Execute the commands associated with this target even if the .Fl n or .Fl t options were specified. Normally used to mark recursive .Nm Ns s . .It Ic .META Create a meta file for the target, even if it is flagged as .Ic .PHONY , .Ic .MAKE , or .Ic .SPECIAL . Usage in conjunction with .Ic .MAKE is the most likely case. In "meta" mode, the target is out-of-date if the meta file is missing. .It Ic .NOMETA Do not create a meta file for the target. Meta files are also not created for .Ic .PHONY , .Ic .MAKE , or .Ic .SPECIAL targets. .It Ic .NOMETA_CMP Ignore differences in commands when deciding if target is out of date. This is useful if the command contains a value which always changes. If the number of commands change, though, the target will still be out of date. The same effect applies to any command line that uses the variable .Va .OODATE , which can be used for that purpose even when not otherwise needed or desired: .Bd -literal -offset indent skip-compare-for-some: @echo this will be compared @echo this will not ${.OODATE:M.NOMETA_CMP} @echo this will also be compared .Ed The .Cm \&:M pattern suppresses any expansion of the unwanted variable. .It Ic .NOPATH Do not search for the target in the directories specified by .Ic .PATH . .It Ic .NOTMAIN Normally .Nm selects the first target it encounters as the default target to be built if no target was specified. This source prevents this target from being selected. .It Ic .OPTIONAL If a target is marked with this attribute and .Nm can't figure out how to create it, it will ignore this fact and assume the file isn't needed or already exists. .It Ic .PHONY The target does not correspond to an actual file; it is always considered to be out of date, and will not be created with the .Fl t option. Suffix-transformation rules are not applied to .Ic .PHONY targets. .It Ic .PRECIOUS When .Nm is interrupted, it normally removes any partially made targets. This source prevents the target from being removed. .It Ic .RECURSIVE Synonym for .Ic .MAKE . .It Ic .SILENT Do not echo any of the commands associated with this target, exactly as if they all were preceded by an at sign .Pq Ql @ . .It Ic .USE Turn the target into .Nm Ns 's version of a macro. When the target is used as a source for another target, the other target acquires the commands, sources, and attributes (except for .Ic .USE ) of the source. If the target already has commands, the .Ic .USE target's commands are appended to them. .It Ic .USEBEFORE Exactly like .Ic .USE , but prepend the .Ic .USEBEFORE target commands to the target. 
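.Pp
For example (an illustrative sketch; the target and directory names
are hypothetical), a
.Ic .USE
target acts as a reusable command macro:
.Bd -literal -offset indent
MKDIR_RULE: .USE
	mkdir \-p ${.TARGET}

obj/docs obj/tests: MKDIR_RULE
.Ed
.Pp
Both
.Ql obj/docs
and
.Ql obj/tests
acquire the
.Ql mkdir
command, with
.Va .TARGET
expanding to each target's own name;
.Ic .USEBEFORE
behaves the same way except that the acquired commands are prepended.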
.It Ic .WAIT If .Ic .WAIT appears in a dependency line, the sources that precede it are made before the sources that succeed it in the line. Since the dependents of files are not made until the file itself could be made, this also stops the dependents being built unless they are needed for another branch of the dependency tree. So given: .Bd -literal x: a .WAIT b echo x a: echo a b: b1 echo b b1: echo b1 .Ed the output is always .Ql a , .Ql b1 , .Ql b , .Ql x . .br The ordering imposed by .Ic .WAIT is only relevant for parallel makes. .El .Sh SPECIAL TARGETS Special targets may not be included with other targets, i.e. they must be the only target specified. .Bl -tag -width .BEGINx .It Ic .BEGIN Any command lines attached to this target are executed before anything else is done. .It Ic .DEFAULT This is sort of a .Ic .USE rule for any target (that was used only as a source) that .Nm can't figure out any other way to create. Only the shell script is used. The .Ic .IMPSRC variable of a target that inherits .Ic .DEFAULT Ns 's commands is set to the target's own name. .It Ic .END Any command lines attached to this target are executed after everything else is done. .It Ic .ERROR Any command lines attached to this target are executed when another target fails. The .Ic .ERROR_TARGET variable is set to the target that failed. See also .Ic MAKE_PRINT_VAR_ON_ERROR . .It Ic .IGNORE Mark each of the sources with the .Ic .IGNORE attribute. If no sources are specified, this is the equivalent of specifying the .Fl i option. .It Ic .INTERRUPT If .Nm is interrupted, the commands for this target will be executed. .It Ic .MAIN If no target is specified when .Nm is invoked, this target will be built. .It Ic .MAKEFLAGS This target provides a way to specify flags for .Nm when the makefile is used. The flags are as if typed to the shell, though the .Fl f option will have no effect. .\" XXX: NOT YET!!!! .\" .It Ic .NOTPARALLEL .\" The named targets are executed in non parallel mode. .\" If no targets are .\" specified, then all targets are executed in non parallel mode. .It Ic .NOPATH Apply the .Ic .NOPATH attribute to any specified sources. .It Ic .NOTPARALLEL Disable parallel mode. .It Ic .NO_PARALLEL Synonym for .Ic .NOTPARALLEL , for compatibility with other pmake variants. .It Ic .OBJDIR The source is a new value for .Ql Va .OBJDIR . If it exists, .Nm will .Xr chdir 2 to it and update the value of .Ql Va .OBJDIR . .It Ic .ORDER The named targets are made in sequence. This ordering does not add targets to the list of targets to be made. Since the dependents of a target do not get built until the target itself could be built, unless .Ql a is built by another part of the dependency graph, the following is a dependency loop: .Bd -literal \&.ORDER: b a b: a .Ed .Pp The ordering imposed by .Ic .ORDER is only relevant for parallel makes. .\" XXX: NOT YET!!!! .\" .It Ic .PARALLEL .\" The named targets are executed in parallel mode. .\" If no targets are .\" specified, then all targets are executed in parallel mode. .It Ic .PATH The sources are directories which are to be searched for files not found in the current directory. If no sources are specified, any previously specified directories are deleted. If the source is the special .Ic .DOTLAST target, then the current working directory is searched last. .It Ic .PATH. Ns Va suffix Like .Ic .PATH but applies only to files with a particular suffix. The suffix must have been previously declared with .Ic .SUFFIXES . 
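.Pp
For example (a sketch; the directory name is illustrative), to have
only C sources searched for in a sibling directory:
.Bd -literal -offset indent
\&.SUFFIXES: .c .o
\&.PATH.c: ${.CURDIR}/../common
.Ed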
.It Ic .PHONY Apply the .Ic .PHONY attribute to any specified sources. .It Ic .PRECIOUS Apply the .Ic .PRECIOUS attribute to any specified sources. If no sources are specified, the .Ic .PRECIOUS attribute is applied to every target in the file. .It Ic .SHELL Sets the shell that .Nm will use to execute commands. The sources are a set of .Ar field=value pairs. .Bl -tag -width hasErrCtls .It Ar name This is the minimal specification, used to select one of the built-in shell specs; .Ar sh , .Ar ksh , and .Ar csh . .It Ar path Specifies the path to the shell. .It Ar hasErrCtl Indicates whether the shell supports exit on error. .It Ar check The command to turn on error checking. .It Ar ignore The command to disable error checking. .It Ar echo The command to turn on echoing of commands executed. .It Ar quiet The command to turn off echoing of commands executed. .It Ar filter The output to filter after issuing the .Ar quiet command. It is typically identical to .Ar quiet . .It Ar errFlag The flag to pass the shell to enable error checking. .It Ar echoFlag The flag to pass the shell to enable command echoing. .It Ar newline The string literal to pass the shell that results in a single newline character when used outside of any quoting characters. .El Example: .Bd -literal \&.SHELL: name=ksh path=/bin/ksh hasErrCtl=true \e check="set \-e" ignore="set +e" \e echo="set \-v" quiet="set +v" filter="set +v" \e echoFlag=v errFlag=e newline="'\en'" .Ed .It Ic .SILENT Apply the .Ic .SILENT attribute to any specified sources. If no sources are specified, the .Ic .SILENT attribute is applied to every command in the file. .It Ic .STALE This target gets run when a dependency file contains stale entries, having .Va .ALLSRC set to the name of that dependency file. .It Ic .SUFFIXES Each source specifies a suffix to .Nm . If no sources are specified, any previously specified suffixes are deleted. It allows the creation of suffix-transformation rules. .Pp Example: .Bd -literal \&.SUFFIXES: .o \&.c.o: cc \-o ${.TARGET} \-c ${.IMPSRC} .Ed .El .Sh ENVIRONMENT .Nm uses the following environment variables, if they exist: .Ev MACHINE , .Ev MACHINE_ARCH , .Ev MAKE , .Ev MAKEFLAGS , .Ev MAKEOBJDIR , .Ev MAKEOBJDIRPREFIX , .Ev MAKESYSPATH , .Ev PWD , and .Ev TMPDIR . .Pp .Ev MAKEOBJDIRPREFIX and .Ev MAKEOBJDIR may only be set in the environment or on the command line to .Nm and not as makefile variables; see the description of .Ql Va .OBJDIR for more details. .Sh FILES .Bl -tag -width /usr/share/mk -compact .It .depend list of dependencies .It Makefile list of dependencies .It makefile list of dependencies .It sys.mk system makefile .It /usr/share/mk system makefile directory .El .Sh COMPATIBILITY The basic make syntax is compatible between different versions of make; however the special variables, variable modifiers and conditionals are not. .Ss Older versions An incomplete list of changes in older versions of .Nm : .Pp The way that .for loop variables are substituted changed after NetBSD 5.0 so that they still appear to be variable expansions. In particular this stops them being treated as syntax, and removes some obscure problems using them in .if statements. .Pp The way that parallel makes are scheduled changed in NetBSD 4.0 so that .ORDER and .WAIT apply recursively to the dependent nodes. The algorithms used may change again in the future. .Ss Other make dialects Other make dialects (GNU make, SVR4 make, POSIX make, etc.) do not support most of the features of .Nm as described in this manual. 
Most notably: .Bl -bullet -offset indent .It The .Ic .WAIT and .Ic .ORDER declarations and most functionality pertaining to parallelization. (GNU make supports parallelization but lacks these features needed to control it effectively.) .It Directives, including for loops and conditionals and most of the forms of include files. (GNU make has its own incompatible and less powerful syntax for conditionals.) .It All built-in variables that begin with a dot. .It Most of the special sources and targets that begin with a dot, with the notable exception of .Ic .PHONY , .Ic .PRECIOUS , and .Ic .SUFFIXES . .It Variable modifiers, except for the .Dl :old=new string substitution, which does not portably support globbing with .Ql % and historically only works on declared suffixes. .It The .Ic $> variable even in its short form; most makes support this functionality but its name varies. .El .Pp Some features are somewhat more portable, such as assignment with .Ic += , .Ic ?= , and .Ic != . The .Ic .PATH functionality is based on an older feature .Ic VPATH found in GNU make and many versions of SVR4 make; however, historically its behavior is too ill-defined (and too buggy) to rely upon. .Pp The .Ic $@ and .Ic $< variables are more or less universally portable, as is the .Ic $(MAKE) variable. Basic use of suffix rules (for files only in the current directory, not trying to chain transformations together, etc.) is also reasonably portable. .Sh SEE ALSO .Xr mkdep 1 .Sh HISTORY .Nm is derived from NetBSD .Xr make 1 . It uses autoconf to facilitate portability to other platforms. .Pp A make command appeared in .At v7 . This make implementation is based on Adam De Boor's pmake program which was written for Sprite at Berkeley. It was designed to be a parallel distributed make running jobs on different machines using a daemon called .Dq customs . .Pp Historically the target/dependency .Dq FRC has been used to FoRCe rebuilding (since the target/dependency does not exist... unless someone creates an .Dq FRC file). .Sh BUGS The make syntax is difficult to parse without actually acting on the data. For instance, finding the end of a variable use should involve scanning each of the modifiers, using the correct terminator for each field. In many places make just counts {} and () in order to find the end of a variable expansion. .Pp There is no way of escaping a space character in a filename. Index: projects/clang390-import/contrib/bmake/bmake.cat1 =================================================================== --- projects/clang390-import/contrib/bmake/bmake.cat1 (revision 305686) +++ projects/clang390-import/contrib/bmake/bmake.cat1 (revision 305687) @@ -1,1492 +1,1501 @@ MAKE(1) NetBSD General Commands Manual MAKE(1) NNAAMMEE bbmmaakkee -- maintain program dependencies SSYYNNOOPPSSIISS bbmmaakkee [--BBeeiikkNNnnqqrrssttWWwwXX] [--CC _d_i_r_e_c_t_o_r_y] [--DD _v_a_r_i_a_b_l_e] [--dd _f_l_a_g_s] [--ff _m_a_k_e_f_i_l_e] [--II _d_i_r_e_c_t_o_r_y] [--JJ _p_r_i_v_a_t_e] [--jj _m_a_x___j_o_b_s] [--mm _d_i_r_e_c_t_o_r_y] [--TT _f_i_l_e] [--VV _v_a_r_i_a_b_l_e] [_v_a_r_i_a_b_l_e_=_v_a_l_u_e] [_t_a_r_g_e_t _._._.] DDEESSCCRRIIPPTTIIOONN bbmmaakkee is a program designed to simplify the maintenance of other pro- grams. Its input is a list of specifications as to the files upon which programs and other files depend. If no --ff _m_a_k_e_f_i_l_e makefile option is given, bbmmaakkee will try to open `_m_a_k_e_f_i_l_e' then `_M_a_k_e_f_i_l_e' in order to find the specifications.
If the file `_._d_e_p_e_n_d' exists, it is read (see mkdep(1)). This manual page is intended as a reference document only. For a more thorough description of bbmmaakkee and makefiles, please refer to _P_M_a_k_e _- _A _T_u_t_o_r_i_a_l. bbmmaakkee will prepend the contents of the _M_A_K_E_F_L_A_G_S environment variable to the command line arguments before parsing them. The options are as follows: --BB Try to be backwards compatible by executing a single shell per command and by executing the commands to make the sources of a dependency line in sequence. --CC _d_i_r_e_c_t_o_r_y Change to _d_i_r_e_c_t_o_r_y before reading the makefiles or doing any- thing else. If multiple --CC options are specified, each is inter- preted relative to the previous one: --CC _/ --CC _e_t_c is equivalent to --CC _/_e_t_c. --DD _v_a_r_i_a_b_l_e Define _v_a_r_i_a_b_l_e to be 1, in the global context. --dd _[_-_]_f_l_a_g_s Turn on debugging, and specify which portions of bbmmaakkee are to print debugging information. Unless the flags are preceded by `-' they are added to the _M_A_K_E_F_L_A_G_S environment variable and will be processed by any child make processes. By default, debugging information is printed to standard error, but this can be changed using the _F debugging flag. The debugging output is always unbuffered; in addition, if debugging is enabled but debugging output is not directed to standard output, then the standard out- put is line buffered. _F_l_a_g_s is one or more of the following: _A Print all possible debugging information; equivalent to specifying all of the debugging flags. _a Print debugging information about archive searching and caching. _C Print debugging information about current working direc- tory. _c Print debugging information about conditional evaluation. _d Print debugging information about directory searching and caching. _e Print debugging information about failed commands and targets. _F[++]_f_i_l_e_n_a_m_e Specify where debugging output is written. This must be the last flag, because it consumes the remainder of the argument. If the character immediately after the `F' flag is `+', then the file will be opened in append mode; otherwise the file will be overwritten. If the file name is `stdout' or `stderr' then debugging output will be written to the standard output or standard error output file descriptors respectively (and the `+' option has no effect). Otherwise, the output will be written to the named file. If the file name ends `.%d' then the `%d' is replaced by the pid. _f Print debugging information about loop evaluation. _g_1 Print the input graph before making anything. _g_2 Print the input graph after making everything, or before exiting on error. _g_3 Print the input graph before exiting on error. _j Print debugging information about running multiple shells. _l Print commands in Makefiles regardless of whether or not they are prefixed by `@' or other "quiet" flags. Also known as "loud" behavior. _M Print debugging information about "meta" mode decisions about targets. _m Print debugging information about making targets, includ- ing modification dates. _n Don't delete the temporary command scripts created when running commands. These temporary scripts are created in the directory referred to by the TMPDIR environment vari- able, or in _/_t_m_p if TMPDIR is unset or set to the empty string. The temporary scripts are created by mkstemp(3), and have names of the form _m_a_k_e_X_X_X_X_X_X. _N_O_T_E: This can create many files in TMPDIR or _/_t_m_p, so use with care. 
_p Print debugging information about makefile parsing. _s Print debugging information about suffix-transformation rules. _t Print debugging information about target list mainte- nance. _V Force the --VV option to print raw values of variables. _v Print debugging information about variable assignment. _x Run shell commands with --xx so the actual commands are printed as they are executed. --ee Specify that environment variables override macro assignments within makefiles. --ff _m_a_k_e_f_i_l_e Specify a makefile to read instead of the default `_m_a_k_e_f_i_l_e'. If _m_a_k_e_f_i_l_e is `--', standard input is read. Multiple makefiles may be specified, and are read in the order specified. --II _d_i_r_e_c_t_o_r_y Specify a directory in which to search for makefiles and included makefiles. The system makefile directory (or directories, see the --mm option) is automatically included as part of this list. --ii Ignore non-zero exit of shell commands in the makefile. Equiva- lent to specifying `--' before each command line in the makefile. --JJ _p_r_i_v_a_t_e This option should _n_o_t be specified by the user. When the _j option is in use in a recursive build, this option is passed by a make to child makes to allow all the make processes in the build to cooperate to avoid overloading the system. --jj _m_a_x___j_o_b_s Specify the maximum number of jobs that bbmmaakkee may have running at any one time. The value is saved in _._M_A_K_E_._J_O_B_S. Turns compati- bility mode off, unless the _B flag is also specified. When com- patibility mode is off, all commands associated with a target are executed in a single shell invocation as opposed to the tradi- tional one shell invocation per line. This can break traditional scripts which change directories on each command invocation and then expect to start with a fresh environment on the next line. It is more efficient to correct the scripts rather than turn backwards compatibility on. --kk Continue processing after errors are encountered, but only on those targets that do not depend on the target whose creation caused the error. --mm _d_i_r_e_c_t_o_r_y Specify a directory in which to search for sys.mk and makefiles included via the <_f_i_l_e>-style include statement. The --mm option can be used multiple times to form a search path. This path will override the default system include path: /usr/share/mk. Fur- thermore the system include path will be appended to the search path used for "_f_i_l_e"-style include statements (see the --II option). If a file or directory name in the --mm argument (or the MAKESYSPATH environment variable) starts with the string ".../" then bbmmaakkee will search for the specified file or directory named in the remaining part of the argument string. The search starts with the current directory of the Makefile and then works upward towards the root of the file system. If the search is success- ful, then the resulting directory replaces the ".../" specifica- tion in the --mm argument. If used, this feature allows bbmmaakkee to easily search in the current source tree for customized sys.mk files (e.g., by using ".../mk/sys.mk" as an argument). --nn Display the commands that would have been executed, but do not actually execute them unless the target depends on the .MAKE spe- cial source (see below). --NN Display the commands which would have been executed, but do not actually execute any of them; useful for debugging top-level makefiles without descending into subdirectories. 
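           For example (the target name is illustrative), to preview an
           entire top-level build without recursing into subdirectories:

                 bmake -N all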
--qq Do not execute any commands, but exit 0 if the specified targets are up-to-date and 1, otherwise. --rr Do not use the built-in rules specified in the system makefile. --ss Do not echo any commands as they are executed. Equivalent to specifying `@@' before each command line in the makefile. --TT _t_r_a_c_e_f_i_l_e When used with the --jj flag, append a trace record to _t_r_a_c_e_f_i_l_e for each job started and completed. --tt Rather than re-building a target as specified in the makefile, create it or update its modification time to make it appear up- to-date. --VV _v_a_r_i_a_b_l_e Print bbmmaakkee's idea of the value of _v_a_r_i_a_b_l_e, in the global con- text. Do not build any targets. Multiple instances of this option may be specified; the variables will be printed one per line, with a blank line for each null or undefined variable. If _v_a_r_i_a_b_l_e contains a `$' then the value will be expanded before printing. --WW Treat any warnings during makefile parsing as errors. --ww Print entering and leaving directory messages, pre and post pro- cessing. --XX Don't export variables passed on the command line to the environ- ment individually. Variables passed on the command line are still exported via the _M_A_K_E_F_L_A_G_S environment variable. This option may be useful on systems which have a small limit on the size of command arguments. _v_a_r_i_a_b_l_e_=_v_a_l_u_e Set the value of the variable _v_a_r_i_a_b_l_e to _v_a_l_u_e. Normally, all values passed on the command line are also exported to sub-makes in the environment. The --XX flag disables this behavior. Vari- able assignments should follow options for POSIX compatibility but no ordering is enforced. There are seven different types of lines in a makefile: file dependency specifications, shell commands, variable assignments, include statements, conditional directives, for loops, and comments. In general, lines may be continued from one line to the next by ending them with a backslash (`\'). The trailing newline character and initial whitespace on the following line are compressed into a single space. FFIILLEE DDEEPPEENNDDEENNCCYY SSPPEECCIIFFIICCAATTIIOONNSS Dependency lines consist of one or more targets, an operator, and zero or more sources. This creates a relationship where the targets ``depend'' on the sources and are usually created from them. The exact relationship between the target and the source is determined by the operator that sep- arates them. The three operators are as follows: :: A target is considered out-of-date if its modification time is less than those of any of its sources. Sources for a target accumulate over dependency lines when this operator is used. The target is removed if bbmmaakkee is interrupted. !! Targets are always re-created, but not until all sources have been examined and re-created as necessary. Sources for a target accumu- late over dependency lines when this operator is used. The target is removed if bbmmaakkee is interrupted. :::: If no sources are specified, the target is always re-created. Oth- erwise, a target is considered out-of-date if any of its sources has been modified more recently than the target. Sources for a target do not accumulate over dependency lines when this operator is used. The target will not be removed if bbmmaakkee is interrupted. Targets and sources may contain the shell wildcard values `?', `*', `[]', and `{}'. The values `?', `*', and `[]' may only be used as part of the final component of the target or source, and must be used to describe existing files. 
The value `{}' need not necessarily be used to describe existing files. Expansion is in directory order, not alphabetically as done in the shell. SSHHEELLLL CCOOMMMMAANNDDSS Each target may have associated with it one or more lines of shell com- mands, normally used to create the target. Each of the lines in this script _m_u_s_t be preceded by a tab. (For historical reasons, spaces are not accepted.) While targets can appear in many dependency lines if desired, by default only one of these rules may be followed by a creation script. If the `::::' operator is used, however, all rules may include scripts and the scripts are executed in the order found. Each line is treated as a separate shell command, unless the end of line is escaped with a backslash (`\') in which case that line and the next are combined. If the first characters of the command are any combination of `@@', `++', or `--', the command is treated specially. A `@@' causes the command not to be echoed before it is executed. A `++' causes the command to be executed even when --nn is given. This is similar to the effect of the .MAKE special source, except that the effect can be limited to a sin- gle line of a script. A `--' in compatibility mode causes any non-zero exit status of the command line to be ignored. When bbmmaakkee is run in jobs mode with --jj _m_a_x___j_o_b_s, the entire script for the target is fed to a single instance of the shell. In compatibility (non-jobs) mode, each command is run in a separate process. If the com- mand contains any shell meta characters (`#=|^(){};&<>*?[]:$`\\n') it will be passed to the shell; otherwise bbmmaakkee will attempt direct execu- tion. If a line starts with `--' and the shell has ErrCtl enabled then failure of the command line will be ignored as in compatibility mode. Otherwise `--' affects the entire job; the script will stop at the first command line that fails, but the target will not be deemed to have failed. Makefiles should be written so that the mode of bbmmaakkee operation does not change their behavior. For example, any command which needs to use ``cd'' or ``chdir'' without potentially changing the directory for subse- quent commands should be put in parentheses so it executes in a subshell. To force the use of one shell, escape the line breaks so as to make the whole script one command. For example: avoid-chdir-side-effects: @echo Building $@ in `pwd` @(cd ${.CURDIR} && ${MAKE} $@) @echo Back in `pwd` ensure-one-shell-regardless-of-mode: @echo Building $@ in `pwd`; \ (cd ${.CURDIR} && ${MAKE} $@); \ echo Back in `pwd` Since bbmmaakkee will chdir(2) to `_._O_B_J_D_I_R' before executing any targets, each child process starts with that as its current working directory. VVAARRIIAABBLLEE AASSSSIIGGNNMMEENNTTSS Variables in make are much like variables in the shell, and, by tradi- tion, consist of all upper-case letters. VVaarriiaabbllee aassssiiggnnmmeenntt mmooddiiffiieerrss The five operators that can be used to assign values to variables are as follows: == Assign the value to the variable. Any previous value is overrid- den. ++== Append the value to the current value of the variable. ??== Assign the value to the variable if it is not already defined. ::== Assign with expansion, i.e. expand the value before assigning it to the variable. Normally, expansion is not done until the vari- able is referenced. _N_O_T_E: References to undefined variables are _n_o_t expanded. This can cause problems when variable modifiers are used. 
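           For example (variable names are illustrative), `::==' captures
           the value at the moment of assignment:

                 X= before
                 Y:= ${X}
                 X= after

           Here `${Y}' still expands to ``before''; had `Y' been assigned
           with plain `==', it would expand to ``after''.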
!!== Expand the value and pass it to the shell for execution and assign the result to the variable. Any newlines in the result are replaced with spaces. Any white-space before the assigned _v_a_l_u_e is removed; if the value is being appended, a single space is inserted between the previous contents of the variable and the appended value. Variables are expanded by surrounding the variable name with either curly braces (`{}') or parentheses (`()') and preceding it with a dollar sign (`$'). If the variable name contains only a single letter, the surround- ing braces or parentheses are not required. This shorter form is not recommended. If the variable name contains a dollar, then the name itself is expanded first. This allows almost arbitrary variable names, however names con- taining dollar, braces, parenthesis, or whitespace are really best avoided! If the result of expanding a variable contains a dollar sign (`$') the string is expanded again. Variable substitution occurs at three distinct times, depending on where the variable is being used. 1. Variables in dependency lines are expanded as the line is read. 2. Variables in shell commands are expanded when the shell command is executed. 3. ``.for'' loop index variables are expanded on each loop iteration. Note that other variables are not expanded inside loops so the fol- lowing example code: .for i in 1 2 3 a+= ${i} j= ${i} b+= ${j} .endfor all: @echo ${a} @echo ${b} will print: 1 2 3 3 3 3 Because while ${a} contains ``1 2 3'' after the loop is executed, ${b} contains ``${j} ${j} ${j}'' which expands to ``3 3 3'' since after the loop completes ${j} contains ``3''. VVaarriiaabbllee ccllaasssseess The four different classes of variables (in order of increasing prece- dence) are: Environment variables Variables defined as part of bbmmaakkee's environment. Global variables Variables defined in the makefile or in included makefiles. Command line variables Variables defined as part of the command line. Local variables Variables that are defined specific to a certain target. Local variables are all built in and their values vary magically from target to target. It is not currently possible to define new local vari- ables. The seven local variables are as follows: _._A_L_L_S_R_C The list of all sources for this target; also known as `_>'. _._A_R_C_H_I_V_E The name of the archive file; also known as `_!'. _._I_M_P_S_R_C In suffix-transformation rules, the name/path of the source from which the target is to be transformed (the ``implied'' source); also known as `_<'. It is not defined in explicit rules. _._M_E_M_B_E_R The name of the archive member; also known as `_%'. _._O_O_D_A_T_E The list of sources for this target that were deemed out- of-date; also known as `_?'. _._P_R_E_F_I_X The file prefix of the target, containing only the file portion, no suffix or preceding directory components; also known as `_*'. The suffix must be one of the known suffixes declared with ..SSUUFFFFIIXXEESS or it will not be recog- nized. _._T_A_R_G_E_T The name of the target; also known as `_@'. For compati- bility with other makes this is an alias for ..AARRCCHHIIVVEE in archive member rules. The shorter forms (`_>', `_!', `_<', `_%', `_?', `_*', and `_@') are permitted for backward compatibility with historical makefiles and legacy POSIX make and are not recommended. Variants of these variables with the punctuation followed immediately by `D' or `F', e.g. `_$_(_@_D_)', are legacy forms equivalent to using the `:H' and `:T' modifiers. 
These forms are accepted for compatibility with AT&T System V UNIX makefiles and POSIX but are not recommended. Four of the local variables may be used in sources on dependency lines because they expand to the proper value for each target on the line. These variables are `_._T_A_R_G_E_T', `_._P_R_E_F_I_X', `_._A_R_C_H_I_V_E', and `_._M_E_M_B_E_R'. AAddddiittiioonnaall bbuuiilltt--iinn vvaarriiaabblleess In addition, bbmmaakkee sets or knows about the following variables: _$ A single dollar sign `$', i.e. `$$' expands to a single dollar sign. _._A_L_L_T_A_R_G_E_T_S The list of all targets encountered in the Makefile. If evaluated during Makefile parsing, lists only those tar- gets encountered thus far. _._C_U_R_D_I_R A path to the directory where bbmmaakkee was executed. Refer to the description of `PWD' for more details. _._I_N_C_L_U_D_E_D_F_R_O_M_D_I_R The directory of the file this Makefile was included from. _._I_N_C_L_U_D_E_D_F_R_O_M_F_I_L_E The filename of the file this Makefile was included from. MAKE The name that bbmmaakkee was executed with (_a_r_g_v_[_0_]). For compatibility bbmmaakkee also sets _._M_A_K_E with the same value. The preferred variable to use is the environment variable MAKE because it is more compatible with other versions of bbmmaakkee and cannot be confused with the special target with the same name. _._M_A_K_E_._D_E_P_E_N_D_F_I_L_E Names the makefile (default `_._d_e_p_e_n_d') from which gener- ated dependencies are read. _._M_A_K_E_._E_X_P_A_N_D___V_A_R_I_A_B_L_E_S A boolean that controls the default behavior of the --VV option. _._M_A_K_E_._E_X_P_O_R_T_E_D The list of variables exported by bbmmaakkee. _._M_A_K_E_._J_O_B_S The argument to the --jj option. _._M_A_K_E_._J_O_B_._P_R_E_F_I_X If bbmmaakkee is run with _j then output for each target is prefixed with a token `--- target ---' the first part of which can be controlled via _._M_A_K_E_._J_O_B_._P_R_E_F_I_X. If _._M_A_K_E_._J_O_B_._P_R_E_F_I_X is empty, no token is printed. For example: .MAKE.JOB.PREFIX=${.newline}---${.MAKE:T}[${.MAKE.PID}] would produce tokens like `---make[1234] target ---' mak- ing it easier to track the degree of parallelism being achieved. MAKEFLAGS The environment variable `MAKEFLAGS' may contain anything that may be specified on bbmmaakkee's command line. Anything specified on bbmmaakkee's command line is appended to the `MAKEFLAGS' variable which is then entered into the envi- ronment for all programs which bbmmaakkee executes. _._M_A_K_E_._L_E_V_E_L The recursion depth of bbmmaakkee. The initial instance of bbmmaakkee will be 0, and an incremented value is put into the environment to be seen by the next generation. This allows tests like: .if ${.MAKE.LEVEL} == 0 to protect things which should only be evaluated in the initial instance of bbmmaakkee. _._M_A_K_E_._M_A_K_E_F_I_L_E___P_R_E_F_E_R_E_N_C_E The ordered list of makefile names (default `_m_a_k_e_f_i_l_e', `_M_a_k_e_f_i_l_e') that bbmmaakkee will look for. _._M_A_K_E_._M_A_K_E_F_I_L_E_S The list of makefiles read by bbmmaakkee, which is useful for tracking dependencies. Each makefile is recorded only once, regardless of the number of times read. _._M_A_K_E_._M_O_D_E Processed after reading all makefiles. Can affect the mode that bbmmaakkee runs in. It can contain a number of key- words: _c_o_m_p_a_t Like --BB, puts bbmmaakkee into "compat" mode. 
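                               For example, a makefile can force
                               "compat" mode for itself (a minimal
                               sketch):

                                     .MAKE.MODE= compat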
_m_e_t_a Puts bbmmaakkee into "meta" mode, where meta files are created for each tar- get to capture the command run, the output generated and, if filemon(4) is available, the system calls which are of interest to bbmmaakkee. The cap- tured output can be very useful when diagnosing errors. _c_u_r_d_i_r_O_k_= _b_f Normally bbmmaakkee will not create .meta files in `_._C_U_R_D_I_R'. This can be overridden by setting _b_f to a value which represents True. _m_i_s_s_i_n_g_-_m_e_t_a_= _b_f If _b_f is True, then a missing .meta file makes the target out-of-date. _m_i_s_s_i_n_g_-_f_i_l_e_m_o_n_= _b_f If _b_f is True, then missing filemon data makes the target out-of-date. _n_o_f_i_l_e_m_o_n Do not use filemon(4). _e_n_v For debugging, it can be useful to include the environment in the .meta file. _v_e_r_b_o_s_e If in "meta" mode, print a clue about the target being built. This is useful if the build is otherwise running silently. The message printed is the value of _._M_A_K_E_._M_E_T_A_._P_R_E_F_I_X. _i_g_n_o_r_e_-_c_m_d Some makefiles have commands which are simply not stable. This keyword causes them to be ignored for deter- mining whether a target is out of date in "meta" mode. See also ..NNOOMMEETTAA__CCMMPP. _s_i_l_e_n_t_= _b_f If _b_f is True, when a .meta file is created, mark the target ..SSIILLEENNTT. _._M_A_K_E_._M_E_T_A_._B_A_I_L_I_W_I_C_K In "meta" mode, provides a list of prefixes which match the directories controlled by bbmmaakkee. If a file that was generated outside of _._O_B_J_D_I_R but within said bailiwick is missing, the current target is considered out-of-date. _._M_A_K_E_._M_E_T_A_._C_R_E_A_T_E_D In "meta" mode, this variable contains a list of all the meta files updated. If not empty, it can be used to trigger processing of _._M_A_K_E_._M_E_T_A_._F_I_L_E_S. _._M_A_K_E_._M_E_T_A_._F_I_L_E_S In "meta" mode, this variable contains a list of all the meta files used (updated or not). This list can be used to process the meta files to extract dependency informa- tion. _._M_A_K_E_._M_E_T_A_._I_G_N_O_R_E___P_A_T_H_S Provides a list of path prefixes that should be ignored, because the contents are expected to change over time. The default list includes: `_/_d_e_v _/_e_t_c _/_p_r_o_c _/_t_m_p _/_v_a_r_/_r_u_n _/_v_a_r_/_t_m_p' _._M_A_K_E_._M_E_T_A_._I_G_N_O_R_E___P_A_T_T_E_R_N_S Provides a list of patterns to match against pathnames. Ignore any that match. + _._M_A_K_E_._M_E_T_A_._I_G_N_O_R_E___F_I_L_T_E_R + Provides a list of variable modifiers to apply to each + pathname. Ignore if the expansion is an empty string. + _._M_A_K_E_._M_E_T_A_._P_R_E_F_I_X Defines the message printed for each meta file updated in "meta verbose" mode. The default value is: Building ${.TARGET:H:tA}/${.TARGET:T} _._M_A_K_E_O_V_E_R_R_I_D_E_S This variable is used to record the names of variables assigned to on the command line, so that they may be exported as part of `MAKEFLAGS'. This behavior can be disabled by assigning an empty value to `_._M_A_K_E_O_V_E_R_R_I_D_E_S' within a makefile. Extra variables can be exported from a makefile by appending their names to `_._M_A_K_E_O_V_E_R_R_I_D_E_S'. `MAKEFLAGS' is re-exported whenever `_._M_A_K_E_O_V_E_R_R_I_D_E_S' is modified. _._M_A_K_E_._P_A_T_H___F_I_L_E_M_O_N If bbmmaakkee was built with filemon(4) support, this is set to the path of the device node. This allows makefiles to test for this support. _._M_A_K_E_._P_I_D The process-id of bbmmaakkee. _._M_A_K_E_._P_P_I_D The parent process-id of bbmmaakkee.
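                        For example (the variable name and path are
                        illustrative), a per-invocation scratch file can
                        be derived from the process-id:

                              SCRATCH= ${TMPDIR:U/tmp}/build.${.MAKE.PID}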
_._M_A_K_E_._S_A_V_E___D_O_L_L_A_R_S value should be a boolean that controls whether `$$' are preserved when doing `:=' assignments. The default is false, for backwards compatibility. Set to true for com- patibility with other makes. If set to false, `$$' becomes `$' per normal evaluation rules. _M_A_K_E___P_R_I_N_T___V_A_R___O_N___E_R_R_O_R - When bbmmaakkee stops due to an error, it prints its name and - the value of `_._C_U_R_D_I_R' as well as the value of any vari- - ables named in `_M_A_K_E___P_R_I_N_T___V_A_R___O_N___E_R_R_O_R'. + When bbmmaakkee stops due to an error, it sets `_._E_R_R_O_R___T_A_R_G_E_T' + to the name of the target that failed, `_._E_R_R_O_R___C_M_D' to + the commands of the failed target, and in "meta" mode, it + also sets `_._E_R_R_O_R___C_W_D' to the getcwd(3), and + `_._E_R_R_O_R___M_E_T_A___F_I_L_E' to the path of the meta file (if any) + describing the failed target. It then prints its name + and the value of `_._C_U_R_D_I_R' as well as the value of any + variables named in `_M_A_K_E___P_R_I_N_T___V_A_R___O_N___E_R_R_O_R'. _._n_e_w_l_i_n_e This variable is simply assigned a newline character as its value. This allows expansions using the ::@@ modifier to put a newline between iterations of the loop rather than a space. For example, the printing of `_M_A_K_E___P_R_I_N_T___V_A_R___O_N___E_R_R_O_R' could be done as ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@}. _._O_B_J_D_I_R A path to the directory where the targets are built. Its value is determined by trying to chdir(2) to the follow- ing directories in order and using the first match: 1. ${MAKEOBJDIRPREFIX}${.CURDIR} (Only if `MAKEOBJDIRPREFIX' is set in the environ- ment or on the command line.) 2. ${MAKEOBJDIR} (Only if `MAKEOBJDIR' is set in the environment or on the command line.) 3. ${.CURDIR}_/_o_b_j_.${MACHINE} 4. ${.CURDIR}_/_o_b_j 5. _/_u_s_r_/_o_b_j_/${.CURDIR} 6. ${.CURDIR} Variable expansion is performed on the value before it's used, so expressions such as ${.CURDIR:S,^/usr/src,/var/obj,} may be used. This is especially useful with `MAKEOBJDIR'. `_._O_B_J_D_I_R' may be modified in the makefile via the special target `..OOBBJJDDIIRR'. In all cases, bbmmaakkee will chdir(2) to the specified directory if it exists, and set `_._O_B_J_D_I_R' and `PWD' to that directory before executing any targets. _._P_A_R_S_E_D_I_R A path to the directory of the current `_M_a_k_e_f_i_l_e' being parsed. _._P_A_R_S_E_F_I_L_E The basename of the current `_M_a_k_e_f_i_l_e' being parsed. This variable and `_._P_A_R_S_E_D_I_R' are both set only while the `_M_a_k_e_f_i_l_e_s' are being parsed. If you want to retain their current values, assign them to a variable using assignment with expansion: (`::=='). _._P_A_T_H A variable that represents the list of directories that bbmmaakkee will search for files. The search list should be updated using the target `_._P_A_T_H' rather than the vari- able. PWD Alternate path to the current directory. bbmmaakkee normally sets `_._C_U_R_D_I_R' to the canonical path given by getcwd(3). However, if the environment variable `PWD' is set and gives a path to the current directory, then bbmmaakkee sets `_._C_U_R_D_I_R' to the value of `PWD' instead. This behavior is disabled if `MAKEOBJDIRPREFIX' is set or `MAKEOBJDIR' contains a variable transform. `PWD' is set to the value of `_._O_B_J_D_I_R' for all programs which bbmmaakkee executes. .TARGETS The list of targets explicitly specified on the command line, if any.
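                        For example (a sketch; the target and variable
                        names are illustrative), a makefile can react to
                        the targets requested on the command line:

                              .if !empty(.TARGETS:Minstall)
                              SUBDIR+= doc
                              .endif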
VPATH Colon-separated (``:'') lists of directories that bbmmaakkee will search for files. The variable is supported for compatibility with old make programs only; use `_._P_A_T_H' instead. VVaarriiaabbllee mmooddiiffiieerrss Variable expansion may be modified to select or modify each word of the variable (where a ``word'' is a white-space delimited sequence of charac- ters). The general format of a variable expansion is as follows: ${variable[:modifier[:...]]} Each modifier begins with a colon, which may be escaped with a backslash (`\'). A set of modifiers can be specified via a variable, as follows: modifier_variable=modifier[:...] ${variable:${modifier_variable}[:...]} In this case the first modifier in the modifier_variable does not start with a colon, since that must appear in the referencing variable. If any of the modifiers in the modifier_variable contain a dollar sign (`$'), these must be doubled to avoid early expansion. The supported modifiers are: ::EE Replaces each word in the variable with its suffix. ::HH Replaces each word in the variable with everything but the last com- ponent. ::MM_p_a_t_t_e_r_n Select only those words that match _p_a_t_t_e_r_n. The standard shell wildcard characters (`*', `?', and `[]') may be used. The wildcard characters may be escaped with a backslash (`\'). As a consequence of the way values are split into words, matched, and then joined, a construct like ${VAR:M*} will normalize the inter-word spacing, removing all leading and trailing space, and converting multiple consecutive spaces to single spaces. ::NN_p_a_t_t_e_r_n This is identical to `::MM', but selects all words which do not match _p_a_t_t_e_r_n. ::OO Order every word in variable alphabetically. To sort words in reverse order use the `::OO::[[--11....11]]' combination of modifiers. ::OOxx Randomize words in variable. The results will be different each time you are referring to the modified variable; use the assignment with expansion (`::==') to prevent such behavior. For example, LIST= uno due tre quattro RANDOM_LIST= ${LIST:Ox} STATIC_RANDOM_LIST:= ${LIST:Ox} all: @echo "${RANDOM_LIST}" @echo "${RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" may produce output similar to: quattro due tre uno tre due quattro uno due uno quattro tre due uno quattro tre ::QQ Quotes every shell meta-character in the variable, so that it can be passed safely through recursive invocations of bbmmaakkee. ::RR Replaces each word in the variable with everything but its suffix. ::ggmmttiimmee The value is a format string for strftime(3), using the current gmtime(3). ::hhaasshh Compute a 32-bit hash of the value and encode it as hex digits. ::llooccaallttiimmee The value is a format string for strftime(3), using the current localtime(3). ::ttAA Attempt to convert variable to an absolute path using realpath(3); if that fails, the value is unchanged. ::ttll Converts variable to lower-case letters. ::ttss_c Words in the variable are normally separated by a space on expan- sion. This modifier sets the separator to the character _c. If _c is omitted, then no separator is used. The common escapes (including octal numeric codes) work as expected. ::ttuu Converts variable to upper-case letters. ::ttWW Causes the value to be treated as a single word (possibly containing embedded white space). See also `::[[**]]'. ::ttww Causes the value to be treated as a sequence of words delimited by white space. See also `::[[@@]]'.
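        For example (the variable name is illustrative), the `::ttss'
        modifier described above can join the words of a list with a
        custom separator:

              DIRS= /bin /usr/bin /usr/local/bin
              # ${DIRS:ts,} expands to ``/bin,/usr/bin,/usr/local/bin''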
::SS/_o_l_d___s_t_r_i_n_g/_n_e_w___s_t_r_i_n_g/[11ggWW] Modify the first occurrence of _o_l_d___s_t_r_i_n_g in the variable's value, replacing it with _n_e_w___s_t_r_i_n_g. If a `g' is appended to the last slash of the pattern, all occurrences in each word are replaced. If a `1' is appended to the last slash of the pattern, only the first word is affected. If a `W' is appended to the last slash of the pattern, then the value is treated as a single word (possibly con- taining embedded white space). If _o_l_d___s_t_r_i_n_g begins with a caret (`^'), _o_l_d___s_t_r_i_n_g is anchored at the beginning of each word. If _o_l_d___s_t_r_i_n_g ends with a dollar sign (`$'), it is anchored at the end of each word. Inside _n_e_w___s_t_r_i_n_g, an ampersand (`&') is replaced by _o_l_d___s_t_r_i_n_g (without any `^' or `$'). Any character may be used as a delimiter for the parts of the modifier string. The anchoring, ampersand and delimiter characters may be escaped with a backslash (`\'). Variable expansion occurs in the normal fashion inside both _o_l_d___s_t_r_i_n_g and _n_e_w___s_t_r_i_n_g with the single exception that a backslash is used to prevent the expansion of a dollar sign (`$'), not a pre- ceding dollar sign as is usual. ::CC/_p_a_t_t_e_r_n/_r_e_p_l_a_c_e_m_e_n_t/[11ggWW] The ::CC modifier is just like the ::SS modifier except that the old and new strings, instead of being simple strings, are an extended regu- lar expression (see regex(3)) string _p_a_t_t_e_r_n and an ed(1)-style string _r_e_p_l_a_c_e_m_e_n_t. Normally, the first occurrence of the pattern _p_a_t_t_e_r_n in each word of the value is substituted with _r_e_p_l_a_c_e_m_e_n_t. The `1' modifier causes the substitution to apply to at most one word; the `g' modifier causes the substitution to apply to as many instances of the search pattern _p_a_t_t_e_r_n as occur in the word or words it is found in; the `W' modifier causes the value to be treated as a single word (possibly containing embedded white space). Note that `1' and `g' are orthogonal; the former specifies whether multiple words are potentially affected, the latter whether multiple substitutions can potentially occur within each affected word. As for the ::SS modifier, the _p_a_t_t_e_r_n and _r_e_p_l_a_c_e_m_e_n_t are subjected to variable expansion before being parsed as regular expressions. ::TT Replaces each word in the variable with its last component. ::uu Remove adjacent duplicate words (like uniq(1)). ::??_t_r_u_e___s_t_r_i_n_g::_f_a_l_s_e___s_t_r_i_n_g If the variable name (not its value), when parsed as a .if condi- tional expression, evaluates to true, return as its value the _t_r_u_e___s_t_r_i_n_g, otherwise return the _f_a_l_s_e___s_t_r_i_n_g. Since the variable name is used as the expression, :? must be the first modifier after the variable name itself - which will, of course, usually contain variable expansions. A common error is trying to use expressions like ${NUMBERS:M42:?match:no} which actually tests defined(NUMBERS); to determine if any words match "42", you need to use something like: ${"${NUMBERS:M42}" != "":?match:no}. _:_o_l_d___s_t_r_i_n_g_=_n_e_w___s_t_r_i_n_g This is the AT&T System V UNIX style variable substitution. It must be the last modifier specified. If _o_l_d___s_t_r_i_n_g or _n_e_w___s_t_r_i_n_g do not contain the pattern matching character _% then it is assumed that they are anchored at the end of each word, so only suffixes or entire words may be replaced.
Otherwise _% is the substring of _o_l_d___s_t_r_i_n_g to be replaced in _n_e_w___s_t_r_i_n_g. Variable expansion occurs in the normal fashion inside both _o_l_d___s_t_r_i_n_g and _n_e_w___s_t_r_i_n_g with the single exception that a backslash is used to prevent the expansion of a dollar sign (`$'), not a pre- ceding dollar sign as is usual. ::@@_t_e_m_p@@_s_t_r_i_n_g@@ This is the loop expansion mechanism from the OSF Development Envi- ronment (ODE) make. Unlike ..ffoorr loops expansion occurs at the time of reference. Assign _t_e_m_p to each word in the variable and evaluate _s_t_r_i_n_g. The ODE convention is that _t_e_m_p should start and end with a period. For example. ${LINKS:@.LINK.@${LN} ${TARGET} ${.LINK.}@} However a single character variable is often more readable: ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@} ::UU_n_e_w_v_a_l If the variable is undefined _n_e_w_v_a_l is the value. If the variable is defined, the existing value is returned. This is another ODE make feature. It is handy for setting per-target CFLAGS for instance: ${_${.TARGET:T}_CFLAGS:U${DEF_CFLAGS}} If a value is only required if the variable is undefined, use: ${VAR:D:Unewval} ::DD_n_e_w_v_a_l If the variable is defined _n_e_w_v_a_l is the value. ::LL The name of the variable is the value. ::PP The path of the node which has the same name as the variable is the value. If no such node exists or its path is null, then the name of the variable is used. In order for this modifier to work, the name (node) must at least have appeared on the rhs of a dependency. ::!!_c_m_d!! The output of running _c_m_d is the value. ::sshh If the variable is non-empty it is run as a command and the output becomes the new value. ::::==_s_t_r The variable is assigned the value _s_t_r after substitution. This modifier and its variations are useful in obscure situations such as wanting to set a variable when shell commands are being parsed. These assignment modifiers always expand to nothing, so if appearing in a rule line by themselves should be preceded with something to keep bbmmaakkee happy. The `::::' helps avoid false matches with the AT&T System V UNIX style ::== modifier and since substitution always occurs the ::::== form is vaguely appropriate. ::::??==_s_t_r As for ::::== but only if the variable does not already have a value. ::::++==_s_t_r Append _s_t_r to the variable. ::::!!==_c_m_d Assign the output of _c_m_d to the variable. ::[[_r_a_n_g_e]] Selects one or more words from the value, or performs other opera- tions related to the way in which the value is divided into words. Ordinarily, a value is treated as a sequence of words delimited by white space. Some modifiers suppress this behavior, causing a value to be treated as a single word (possibly containing embedded white space). An empty value, or a value that consists entirely of white- space, is treated as a single word. For the purposes of the `::[[]]' modifier, the words are indexed both forwards using positive inte- gers (where index 1 represents the first word), and backwards using negative integers (where index -1 represents the last word). The _r_a_n_g_e is subjected to variable expansion, and the expanded result is then interpreted as follows: _i_n_d_e_x Selects a single word from the value. _s_t_a_r_t...._e_n_d Selects all words from _s_t_a_r_t to _e_n_d, inclusive. For example, `::[[22....--11]]' selects all words from the second word to the last word. If _s_t_a_r_t is greater than _e_n_d, then the words are out- put in reverse order. 
For example, `::[[--11....11]]' selects all the words from last to first. ** Causes subsequent modifiers to treat the value as a single word (possibly containing embedded white space). Analogous to the effect of "$*" in Bourne shell. 0 Means the same as `::[[**]]'. @@ Causes subsequent modifiers to treat the value as a sequence of words delimited by white space. Analogous to the effect of "$@" in Bourne shell. ## Returns the number of words in the value. IINNCCLLUUDDEE SSTTAATTEEMMEENNTTSS,, CCOONNDDIITTIIOONNAALLSS AANNDD FFOORR LLOOOOPPSS Makefile inclusion, conditional structures and for loops reminiscent of the C programming language are provided in bbmmaakkee. All such structures are identified by a line beginning with a single dot (`.') character. Files are included with either ..iinncclluuddee <_f_i_l_e> or ..iinncclluuddee "_f_i_l_e". Vari- ables between the angle brackets or double quotes are expanded to form the file name. If angle brackets are used, the included makefile is expected to be in the system makefile directory. If double quotes are used, the including makefile's directory and any directories specified using the --II option are searched before the system makefile directory. For compatibility with other versions of bbmmaakkee `include file ...' is also accepted. If the include statement is written as ..--iinncclluuddee or as ..ssiinncclluuddee then errors locating and/or opening include files are ignored. If the include statement is written as ..ddiinncclluuddee not only are errors locating and/or opening include files ignored, but stale dependencies within the included file will be ignored just like _._M_A_K_E_._D_E_P_E_N_D_F_I_L_E. Conditional expressions are also preceded by a single dot as the first character of a line. The possible conditionals are as follows: ..eerrrroorr _m_e_s_s_a_g_e The message is printed along with the name of the makefile and line number, then bbmmaakkee will exit. ..eexxppoorrtt _v_a_r_i_a_b_l_e _._._. Export the specified global variable. If no variable list is provided, all globals are exported except for internal variables (those that start with `.'). This is not affected by the --XX flag, so should be used with caution. For compatibility with other bbmmaakkee programs `export variable=value' is also accepted. Appending a variable name to _._M_A_K_E_._E_X_P_O_R_T_E_D is equivalent to exporting a variable. ..eexxppoorrtt--eennvv _v_a_r_i_a_b_l_e _._._. The same as `.export', except that the variable is not appended to _._M_A_K_E_._E_X_P_O_R_T_E_D. This allows exporting a value to the environ- ment which is different from that used by bbmmaakkee internally. ..eexxppoorrtt--lliitteerraall _v_a_r_i_a_b_l_e _._._. The same as `.export-env', except that variables in the value are not expanded. ..iinnffoo _m_e_s_s_a_g_e The message is printed along with the name of the makefile and line number. ..uunnddeeff _v_a_r_i_a_b_l_e Un-define the specified global variable. Only global variables may be un-defined. ..uunneexxppoorrtt _v_a_r_i_a_b_l_e _._._. The opposite of `.export'. The specified global _v_a_r_i_a_b_l_e will be removed from _._M_A_K_E_._E_X_P_O_R_T_E_D. If no variable list is provided, all globals are unexported, and _._M_A_K_E_._E_X_P_O_R_T_E_D deleted. ..uunneexxppoorrtt--eennvv Unexport all globals previously exported and clear the environ- ment inherited from the parent. This operation will cause a mem- ory leak of the original environment, so should be used spar- ingly. Testing for _._M_A_K_E_._L_E_V_E_L being 0, would make sense. 
           Also note that any variables which originated in the parent
           environment should be explicitly preserved if desired.  For
           example:

                 .if ${.MAKE.LEVEL} == 0
                 PATH := ${PATH}
                 .unexport-env
                 .export PATH
                 .endif

           Would result in an environment containing only `PATH', which
           is the minimal useful environment.  Actually `.MAKE.LEVEL'
           will also be pushed into the new environment.

   .warning message
           The message prefixed by `warning:' is printed along with the
           name of the makefile and line number.

   .if [!]expression [operator expression ...]
           Test the value of an expression.

   .ifdef [!]variable [operator variable ...]
           Test the value of a variable.

   .ifndef [!]variable [operator variable ...]
           Test the value of a variable.

   .ifmake [!]target [operator target ...]
           Test the target being built.

   .ifnmake [!]target [operator target ...]
           Test the target being built.

   .else   Reverse the sense of the last conditional.

   .elif [!]expression [operator expression ...]
           A combination of `.else' followed by `.if'.

   .elifdef [!]variable [operator variable ...]
           A combination of `.else' followed by `.ifdef'.

   .elifndef [!]variable [operator variable ...]
           A combination of `.else' followed by `.ifndef'.

   .elifmake [!]target [operator target ...]
           A combination of `.else' followed by `.ifmake'.

   .elifnmake [!]target [operator target ...]
           A combination of `.else' followed by `.ifnmake'.

   .endif  End the body of the conditional.

   The operator may be any one of the following:

   ||      Logical OR.

   &&      Logical AND; of higher precedence than ``||''.

   As in C, bmake will only evaluate a conditional as far as is
   necessary to determine its value.  Parentheses may be used to change
   the order of evaluation.  The boolean operator `!' may be used to
   logically negate an entire conditional.  It is of higher precedence
   than `&&'.

   The value of expression may be any of the following:

   defined   Takes a variable name as an argument and evaluates to true
             if the variable has been defined.

   make      Takes a target name as an argument and evaluates to true if
             the target was specified as part of bmake's command line or
             was declared the default target (either implicitly or
             explicitly, see .MAIN) before the line containing the
             conditional.

   empty     Takes a variable, with possible modifiers, and evaluates to
             true if the expansion of the variable would result in an
             empty string.

   exists    Takes a file name as an argument and evaluates to true if
             the file exists.  The file is searched for on the system
             search path (see .PATH).

   target    Takes a target name as an argument and evaluates to true if
             the target has been defined.

   commands  Takes a target name as an argument and evaluates to true if
             the target has been defined and has commands associated
             with it.

   Expression may also be an arithmetic or string comparison.  Variable
   expansion is performed on both sides of the comparison, after which
   the integral values are compared.  A value is interpreted as
   hexadecimal if it is preceded by 0x, otherwise it is decimal; octal
   numbers are not supported.
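   As an illustration (this fragment is not part of the original manual;
   the variable LEVEL and the flag values are made up), the comparison
   rules above can be exercised as follows:

         # LEVEL is compared numerically; 0x10 is read as hexadecimal 16.
         .if defined(LEVEL) && ${LEVEL} >= 0x10
         CFLAGS+= -DBIG
         .elif empty(LEVEL)
         CFLAGS+= -DSMALL
         .endif

   Because the conditional is only evaluated as far as is necessary, the
   numeric comparison after `&&' is skipped when LEVEL is undefined.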
   The standard C relational operators are all supported.  If after
   variable expansion, either the left or right hand side of a `==' or
   `!=' operator is not an integral value, then string comparison is
   performed between the expanded variables.  If no relational operator
   is given, it is assumed that the expanded variable is being compared
   against 0 or an empty string in the case of a string comparison.

   When bmake is evaluating one of these conditional expressions, and it
   encounters a (white-space separated) word it doesn't recognize,
   either the ``make'' or ``defined'' expression is applied to it,
   depending on the form of the conditional.  If the form is `.ifdef',
   `.ifndef', or `.if' the ``defined'' expression is applied.
   Similarly, if the form is `.ifmake' or `.ifnmake', the ``make''
   expression is applied.

   If the conditional evaluates to true the parsing of the makefile
   continues as before.  If it evaluates to false, the following lines
   are skipped.  In both cases this continues until a `.else' or
   `.endif' is found.

   For loops are typically used to apply a set of rules to a list of
   files.  The syntax of a for loop is:

         .for variable [variable ...] in expression
         <make-rules>
         .endfor

   After the for expression is evaluated, it is split into words.  On
   each iteration of the loop, one word is taken and assigned to each
   variable, in order, and these variables are substituted into the
   make-rules inside the body of the for loop.  The number of words must
   come out even; that is, if there are three iteration variables, the
   number of words provided must be a multiple of three.

COMMENTS
   Comments begin with a hash (`#') character, anywhere but in a shell
   command line, and continue to the end of an unescaped new line.

SPECIAL SOURCES (ATTRIBUTES)
   .EXEC   Target is never out of date, but always execute commands
           anyway.

   .IGNORE Ignore any errors from the commands associated with this
           target, exactly as if they all were preceded by a dash (`-').

   .MADE   Mark all sources of this target as being up-to-date.

   .MAKE   Execute the commands associated with this target even if the
           -n or -t options were specified.  Normally used to mark
           recursive bmakes.

   .META   Create a meta file for the target, even if it is flagged as
           .PHONY, .MAKE, or .SPECIAL.  Usage in conjunction with .MAKE
           is the most likely case.  In "meta" mode, the target is
           out-of-date if the meta file is missing.

   .NOMETA Do not create a meta file for the target.  Meta files are
           also not created for .PHONY, .MAKE, or .SPECIAL targets.

   .NOMETA_CMP
           Ignore differences in commands when deciding if target is out
           of date.  This is useful if the command contains a value
           which always changes.  If the number of commands change,
           though, the target will still be out of date.  The same
           effect applies to any command line that uses the variable
           .OODATE, which can be used for that purpose even when not
           otherwise needed or desired:

                 skip-compare-for-some:
                         @echo this will be compared
                         @echo this will not ${.OODATE:M.NOMETA_CMP}
                         @echo this will also be compared

           The :M pattern suppresses any expansion of the unwanted
           variable.

   .NOPATH Do not search for the target in the directories specified by
           .PATH.
   .NOTMAIN
           Normally bmake selects the first target it encounters as the
           default target to be built if no target was specified.  This
           source prevents this target from being selected.

   .OPTIONAL
           If a target is marked with this attribute and bmake can't
           figure out how to create it, it will ignore this fact and
           assume the file isn't needed or already exists.

   .PHONY  The target does not correspond to an actual file; it is
           always considered to be out of date, and will not be created
           with the -t option.  Suffix-transformation rules are not
           applied to .PHONY targets.

   .PRECIOUS
           When bmake is interrupted, it normally removes any partially
           made targets.  This source prevents the target from being
           removed.

   .RECURSIVE
           Synonym for .MAKE.

   .SILENT Do not echo any of the commands associated with this target,
           exactly as if they all were preceded by an at sign (`@').

   .USE    Turn the target into bmake's version of a macro.  When the
           target is used as a source for another target, the other
           target acquires the commands, sources, and attributes (except
           for .USE) of the source.  If the target already has commands,
           the .USE target's commands are appended to them.

   .USEBEFORE
           Exactly like .USE, but prepend the .USEBEFORE target commands
           to the target.

   .WAIT   If .WAIT appears in a dependency line, the sources that
           precede it are made before the sources that succeed it in the
           line.  Since the dependents of files are not made until the
           file itself could be made, this also stops the dependents
           being built unless they are needed for another branch of the
           dependency tree.  So given:

                 x: a .WAIT b
                         echo x
                 a:
                         echo a
                 b: b1
                         echo b
                 b1:
                         echo b1

           the output is always `a', `b1', `b', `x'.  The ordering
           imposed by .WAIT is only relevant for parallel makes.

SPECIAL TARGETS
   Special targets may not be included with other targets, i.e. they
   must be the only target specified.

   .BEGIN  Any command lines attached to this target are executed before
           anything else is done.

   .DEFAULT
           This is sort of a .USE rule for any target (that was used
           only as a source) that bmake can't figure out any other way
           to create.  Only the shell script is used.  The .IMPSRC
           variable of a target that inherits .DEFAULT's commands is set
           to the target's own name.

   .END    Any command lines attached to this target are executed after
           everything else is done.

   .ERROR  Any command lines attached to this target are executed when
           another target fails.  The .ERROR_TARGET variable is set to
           the target that failed.  See also MAKE_PRINT_VAR_ON_ERROR.

   .IGNORE Mark each of the sources with the .IGNORE attribute.  If no
           sources are specified, this is the equivalent of specifying
           the -i option.

   .INTERRUPT
           If bmake is interrupted, the commands for this target will be
           executed.

   .MAIN   If no target is specified when bmake is invoked, this target
           will be built.

   .MAKEFLAGS
           This target provides a way to specify flags for bmake when
           the makefile is used.  The flags are as if typed to the
           shell, though the -f option will have no effect.

   .NOPATH Apply the .NOPATH attribute to any specified sources.

   .NOTPARALLEL
           Disable parallel mode.

   .NO_PARALLEL
           Synonym for .NOTPARALLEL, for compatibility with other pmake
           variants.

   .OBJDIR The source is a new value for `.OBJDIR'.
           If it exists, bmake will chdir(2) to it and update the value
           of `.OBJDIR'.

   .ORDER  The named targets are made in sequence.  This ordering does
           not add targets to the list of targets to be made.  Since the
           dependents of a target do not get built until the target
           itself could be built, unless `a' is built by another part of
           the dependency graph, the following is a dependency loop:

                 .ORDER: b a
                 b: a

           The ordering imposed by .ORDER is only relevant for parallel
           makes.

   .PATH   The sources are directories which are to be searched for
           files not found in the current directory.  If no sources are
           specified, any previously specified directories are deleted.
           If the source is the special .DOTLAST target, then the
           current working directory is searched last.  (An illustrative
           sketch follows the ENVIRONMENT section below.)

   .PATH.suffix
           Like .PATH but applies only to files with a particular
           suffix.  The suffix must have been previously declared with
           .SUFFIXES.

   .PHONY  Apply the .PHONY attribute to any specified sources.

   .PRECIOUS
           Apply the .PRECIOUS attribute to any specified sources.  If
           no sources are specified, the .PRECIOUS attribute is applied
           to every target in the file.

   .SHELL  Sets the shell that bmake will use to execute commands.  The
           sources are a set of field=value pairs.

           name       This is the minimal specification, used to select
                      one of the built-in shell specs; sh, ksh, and csh.

           path       Specifies the path to the shell.

           hasErrCtl  Indicates whether the shell supports exit on
                      error.

           check      The command to turn on error checking.

           ignore     The command to disable error checking.

           echo       The command to turn on echoing of commands
                      executed.

           quiet      The command to turn off echoing of commands
                      executed.

           filter     The output to filter after issuing the quiet
                      command.  It is typically identical to quiet.

           errFlag    The flag to pass the shell to enable error
                      checking.

           echoFlag   The flag to pass the shell to enable command
                      echoing.

           newline    The string literal to pass the shell that results
                      in a single newline character when used outside of
                      any quoting characters.

           Example:

           .SHELL: name=ksh path=/bin/ksh hasErrCtl=true \
                   check="set -e" ignore="set +e" \
                   echo="set -v" quiet="set +v" filter="set +v" \
                   echoFlag=v errFlag=e newline="'\n'"

   .SILENT Apply the .SILENT attribute to any specified sources.  If no
           sources are specified, the .SILENT attribute is applied to
           every command in the file.

   .STALE  This target gets run when a dependency file contains stale
           entries, having .ALLSRC set to the name of that dependency
           file.

   .SUFFIXES
           Each source specifies a suffix to bmake.  If no sources are
           specified, any previously specified suffixes are deleted.  It
           allows the creation of suffix-transformation rules.

           Example:

                 .SUFFIXES: .o
                 .c.o:
                         cc -o ${.TARGET} -c ${.IMPSRC}

ENVIRONMENT
   bmake uses the following environment variables, if they exist:
   MACHINE, MACHINE_ARCH, MAKE, MAKEFLAGS, MAKEOBJDIR,
   MAKEOBJDIRPREFIX, MAKESYSPATH, PWD, and TMPDIR.

   MAKEOBJDIRPREFIX and MAKEOBJDIR may only be set in the environment or
   on the command line to bmake and not as makefile variables; see the
   description of `.OBJDIR' for more details.
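   The following fragment is an illustrative sketch, not part of the
   original manual (the directory names are hypothetical).  It shows the
   .PATH and .PATH.suffix targets described under SPECIAL TARGETS above:

         .SUFFIXES: .c .h
         # Search ../common for any file not found in the current
         # directory, and ../include for headers (.h files) only.
         .PATH: ${.CURDIR}/../common
         .PATH.h: ${.CURDIR}/../include

   With this in place, a dependency such as `prog: util.c' would find
   util.c in ../common when it is not present in the current directory.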
FILES
   .depend        list of dependencies
   Makefile       list of dependencies
   makefile       list of dependencies
   sys.mk         system makefile
   /usr/share/mk  system makefile directory

COMPATIBILITY
   The basic make syntax is compatible between different versions of
   make; however the special variables, variable modifiers and
   conditionals are not.

 Older versions
   An incomplete list of changes in older versions of bmake:

   The way that .for loop variables are substituted changed after NetBSD
   5.0 so that they still appear to be variable expansions.  In
   particular this stops them being treated as syntax, and removes some
   obscure problems using them in .if statements.

   The way that parallel makes are scheduled changed in NetBSD 4.0 so
   that .ORDER and .WAIT apply recursively to the dependent nodes.  The
   algorithms used may change again in the future.

 Other make dialects
   Other make dialects (GNU make, SVR4 make, POSIX make, etc.) do not
   support most of the features of bmake as described in this manual.
   Most notably:

         o  The .WAIT and .ORDER declarations and most functionality
            pertaining to parallelization.  (GNU make supports
            parallelization but lacks these features needed to control
            it effectively.)

         o  Directives, including for loops and conditionals and most of
            the forms of include files.  (GNU make has its own
            incompatible and less powerful syntax for conditionals.)

         o  All built-in variables that begin with a dot.

         o  Most of the special sources and targets that begin with a
            dot, with the notable exception of .PHONY, .PRECIOUS, and
            .SUFFIXES.

         o  Variable modifiers, except for the :old=new string
            substitution, which does not portably support globbing with
            `%' and historically only works on declared suffixes.

         o  The $> variable even in its short form; most makes support
            this functionality but its name varies.

   Some features are somewhat more portable, such as assignment with +=,
   ?=, and != (see the sketch following the BUGS section below).  The
   .PATH functionality is based on an older feature VPATH found in GNU
   make and many versions of SVR4 make; however, historically its
   behavior is too ill-defined (and too buggy) to rely upon.  The $@ and
   $< variables are more or less universally portable, as is the
   $(MAKE) variable.  Basic use of suffix rules (for files only in the
   current directory, not trying to chain transformations together,
   etc.) is also reasonably portable.

SEE ALSO
   mkdep(1)

HISTORY
   bmake is derived from NetBSD make(1).  It uses autoconf to facilitate
   portability to other platforms.

   A make command appeared in Version 7 AT&T UNIX.  This make
   implementation is based on Adam De Boor's pmake program which was
   written for Sprite at Berkeley.  It was designed to be a parallel
   distributed make running jobs on different machines using a daemon
   called ``customs''.

   Historically the target/dependency ``FRC'' has been used to FoRCe
   rebuilding (since the target/dependency does not exist... unless
   someone creates an ``FRC'' file).

BUGS
   The make syntax is difficult to parse without actually acting on the
   data.  For instance finding the end of a variable use should involve
   scanning each of the modifiers using the correct terminator for each
   field.  In many places make just counts {} and () in order to find
   the end of a variable expansion.

   There is no way of escaping a space character in a filename.
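   As an illustrative sketch of the comparatively portable subset noted
   under COMPATIBILITY above (not part of the original manual; the flag
   values are arbitrary), the following fragment limits itself to `?='
   and `+=' assignment, a declared suffix, a basic suffix rule, and the
   `$<' and `$(CFLAGS)' forms:

         # Conditional default, then append; both are fairly portable.
         CFLAGS?=  -O
         CFLAGS+=  -g

         .SUFFIXES: .c .o
         .c.o:
                 cc $(CFLAGS) -c $<

   How portable such a fragment really is depends on the dialect;
   `?=' and `+=' in particular are not universal.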
-NetBSD 5.1 June 2, 2016 NetBSD 5.1 +NetBSD 5.1 August 15, 2016 NetBSD 5.1 Index: projects/clang390-import/contrib/bmake/main.c =================================================================== --- projects/clang390-import/contrib/bmake/main.c (revision 305686) +++ projects/clang390-import/contrib/bmake/main.c (revision 305687) @@ -1,2103 +1,2112 @@ -/* $NetBSD: main.c,v 1.247 2016/06/05 01:39:17 christos Exp $ */ +/* $NetBSD: main.c,v 1.250 2016/08/11 19:53:17 sjg Exp $ */ /* * Copyright (c) 1988, 1989, 1990, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Adam de Boor. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Copyright (c) 1989 by Berkeley Softworks * All rights reserved. * * This code is derived from software contributed to Berkeley by * Adam de Boor. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef MAKE_NATIVE -static char rcsid[] = "$NetBSD: main.c,v 1.247 2016/06/05 01:39:17 christos Exp $"; +static char rcsid[] = "$NetBSD: main.c,v 1.250 2016/08/11 19:53:17 sjg Exp $"; #else #include #ifndef lint __COPYRIGHT("@(#) Copyright (c) 1988, 1989, 1990, 1993\ The Regents of the University of California. All rights reserved."); #endif /* not lint */ #ifndef lint #if 0 static char sccsid[] = "@(#)main.c 8.3 (Berkeley) 3/19/94"; #else -__RCSID("$NetBSD: main.c,v 1.247 2016/06/05 01:39:17 christos Exp $"); +__RCSID("$NetBSD: main.c,v 1.250 2016/08/11 19:53:17 sjg Exp $"); #endif #endif /* not lint */ #endif /*- * main.c -- * The main file for this entire program. Exit routines etc * reside here. * * Utility functions defined in this file: * Main_ParseArgLine Takes a line of arguments, breaks them and * treats them as if they were given when first * invoked. Used by the parse module to implement * the .MFLAGS target. * * Error Print a tagged error message. The global * MAKE variable must have been defined. This * takes a format string and optional arguments * for it. * * Fatal Print an error message and exit. Also takes * a format string and arguments for it. * * Punt Aborts all jobs and exits with a message. Also * takes a format string and arguments for it. * * Finish Finish things up by printing the number of * errors which occurred, as passed to it, and * exiting. 
*/ #include #include #include #include #include #if defined(MAKE_NATIVE) && defined(HAVE_SYSCTL) #include #endif #include #include "wait.h" #include #include #include #include #include #include #include #include "make.h" #include "hash.h" #include "dir.h" #include "job.h" #include "pathnames.h" #include "trace.h" #ifdef USE_IOVEC #include #endif #ifndef DEFMAXLOCAL #define DEFMAXLOCAL DEFMAXJOBS #endif /* DEFMAXLOCAL */ #ifndef __arraycount # define __arraycount(__x) (sizeof(__x) / sizeof(__x[0])) #endif Lst create; /* Targets to be made */ time_t now; /* Time at start of make */ GNode *DEFAULT; /* .DEFAULT node */ Boolean allPrecious; /* .PRECIOUS given on line by itself */ static Boolean noBuiltins; /* -r flag */ static Lst makefiles; /* ordered list of makefiles to read */ static Boolean printVars; /* print value of one or more vars */ static Lst variables; /* list of variables to print */ int maxJobs; /* -j argument */ static int maxJobTokens; /* -j argument */ Boolean compatMake; /* -B argument */ int debug; /* -d argument */ Boolean debugVflag; /* -dV */ Boolean noExecute; /* -n flag */ Boolean noRecursiveExecute; /* -N flag */ Boolean keepgoing; /* -k flag */ Boolean queryFlag; /* -q flag */ Boolean touchFlag; /* -t flag */ Boolean enterFlag; /* -w flag */ Boolean enterFlagObj; /* -w and objdir != srcdir */ Boolean ignoreErrors; /* -i flag */ Boolean beSilent; /* -s flag */ Boolean oldVars; /* variable substitution style */ Boolean checkEnvFirst; /* -e flag */ Boolean parseWarnFatal; /* -W flag */ Boolean jobServer; /* -J flag */ static int jp_0 = -1, jp_1 = -1; /* ends of parent job pipe */ Boolean varNoExportEnv; /* -X flag */ Boolean doing_depend; /* Set while reading .depend */ static Boolean jobsRunning; /* TRUE if the jobs might be running */ static const char * tracefile; static void MainParseArgs(int, char **); static int ReadMakefile(const void *, const void *); static void usage(void) MAKE_ATTR_DEAD; static Boolean ignorePWD; /* if we use -C, PWD is meaningless */ static char objdir[MAXPATHLEN + 1]; /* where we chdir'ed to */ char curdir[MAXPATHLEN + 1]; /* Startup directory */ char *progname; /* the program name */ char *makeDependfile; pid_t myPid; int makelevel; Boolean forceJobs = FALSE; /* * On some systems MACHINE is defined as something other than * what we want. */ #ifdef FORCE_MACHINE # undef MACHINE # define MACHINE FORCE_MACHINE #endif extern Lst parseIncPath; /* * For compatibility with the POSIX version of MAKEFLAGS that includes * all the options with out -, convert flags to -f -l -a -g -s. 
*/ static char * explode(const char *flags) { size_t len; char *nf, *st; const char *f; if (flags == NULL) return NULL; for (f = flags; *f; f++) if (!isalpha((unsigned char)*f)) break; if (*f) return bmake_strdup(flags); len = strlen(flags); st = nf = bmake_malloc(len * 3 + 1); while (*flags) { *nf++ = '-'; *nf++ = *flags++; *nf++ = ' '; } *nf = '\0'; return st; } static void parse_debug_options(const char *argvalue) { const char *modules; const char *mode; char *fname; int len; for (modules = argvalue; *modules; ++modules) { switch (*modules) { case 'A': debug = ~0; break; case 'a': debug |= DEBUG_ARCH; break; case 'C': debug |= DEBUG_CWD; break; case 'c': debug |= DEBUG_COND; break; case 'd': debug |= DEBUG_DIR; break; case 'e': debug |= DEBUG_ERROR; break; case 'f': debug |= DEBUG_FOR; break; case 'g': if (modules[1] == '1') { debug |= DEBUG_GRAPH1; ++modules; } else if (modules[1] == '2') { debug |= DEBUG_GRAPH2; ++modules; } else if (modules[1] == '3') { debug |= DEBUG_GRAPH3; ++modules; } break; case 'j': debug |= DEBUG_JOB; break; case 'l': debug |= DEBUG_LOUD; break; case 'M': debug |= DEBUG_META; break; case 'm': debug |= DEBUG_MAKE; break; case 'n': debug |= DEBUG_SCRIPT; break; case 'p': debug |= DEBUG_PARSE; break; case 's': debug |= DEBUG_SUFF; break; case 't': debug |= DEBUG_TARG; break; case 'V': debugVflag = TRUE; break; case 'v': debug |= DEBUG_VAR; break; case 'x': debug |= DEBUG_SHELL; break; case 'F': if (debug_file != stdout && debug_file != stderr) fclose(debug_file); if (*++modules == '+') { modules++; mode = "a"; } else mode = "w"; if (strcmp(modules, "stdout") == 0) { debug_file = stdout; goto debug_setbuf; } if (strcmp(modules, "stderr") == 0) { debug_file = stderr; goto debug_setbuf; } len = strlen(modules); fname = malloc(len + 20); memcpy(fname, modules, len + 1); /* Let the filename be modified by the pid */ if (strcmp(fname + len - 3, ".%d") == 0) snprintf(fname + len - 2, 20, "%d", getpid()); debug_file = fopen(fname, mode); if (!debug_file) { fprintf(stderr, "Cannot open debug file %s\n", fname); usage(); } free(fname); goto debug_setbuf; default: (void)fprintf(stderr, "%s: illegal argument to d option -- %c\n", progname, *modules); usage(); } } debug_setbuf: /* * Make the debug_file unbuffered, and make * stdout line buffered (unless debugfile == stdout). */ setvbuf(debug_file, NULL, _IONBF, 0); if (debug_file != stdout) { setvbuf(stdout, NULL, _IOLBF, 0); } } /*- * MainParseArgs -- * Parse a given argument vector. Called from main() and from * Main_ParseArgLine() when the .MAKEFLAGS target is used. 
* * XXX: Deal with command line overriding .MAKEFLAGS in makefile * * Results: * None * * Side Effects: * Various global and local flags will be set depending on the flags * given */ static void MainParseArgs(int argc, char **argv) { char *p; int c = '?'; int arginc; char *argvalue; const char *getopt_def; char *optscan; Boolean inOption, dashDash = FALSE; char found_path[MAXPATHLEN + 1]; /* for searching for sys.mk */ #define OPTFLAGS "BC:D:I:J:NST:V:WXd:ef:ij:km:nqrstw" /* Can't actually use getopt(3) because rescanning is not portable */ getopt_def = OPTFLAGS; rearg: inOption = FALSE; optscan = NULL; while(argc > 1) { char *getopt_spec; if(!inOption) optscan = argv[1]; c = *optscan++; arginc = 0; if(inOption) { if(c == '\0') { ++argv; --argc; inOption = FALSE; continue; } } else { if (c != '-' || dashDash) break; inOption = TRUE; c = *optscan++; } /* '-' found at some earlier point */ getopt_spec = strchr(getopt_def, c); if(c != '\0' && getopt_spec != NULL && getopt_spec[1] == ':') { /* - found, and should have an arg */ inOption = FALSE; arginc = 1; argvalue = optscan; if(*argvalue == '\0') { if (argc < 3) goto noarg; argvalue = argv[2]; arginc = 2; } } else { argvalue = NULL; } switch(c) { case '\0': arginc = 1; inOption = FALSE; break; case 'B': compatMake = TRUE; Var_Append(MAKEFLAGS, "-B", VAR_GLOBAL); Var_Set(MAKE_MODE, "compat", VAR_GLOBAL, 0); break; case 'C': if (chdir(argvalue) == -1) { (void)fprintf(stderr, "%s: chdir %s: %s\n", progname, argvalue, strerror(errno)); exit(1); } if (getcwd(curdir, MAXPATHLEN) == NULL) { (void)fprintf(stderr, "%s: %s.\n", progname, strerror(errno)); exit(2); } ignorePWD = TRUE; break; case 'D': if (argvalue == NULL || argvalue[0] == 0) goto noarg; Var_Set(argvalue, "1", VAR_GLOBAL, 0); Var_Append(MAKEFLAGS, "-D", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'I': if (argvalue == NULL) goto noarg; Parse_AddIncludeDir(argvalue); Var_Append(MAKEFLAGS, "-I", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'J': if (argvalue == NULL) goto noarg; if (sscanf(argvalue, "%d,%d", &jp_0, &jp_1) != 2) { (void)fprintf(stderr, "%s: internal error -- J option malformed (%s)\n", progname, argvalue); usage(); } if ((fcntl(jp_0, F_GETFD, 0) < 0) || (fcntl(jp_1, F_GETFD, 0) < 0)) { #if 0 (void)fprintf(stderr, "%s: ###### warning -- J descriptors were closed!\n", progname); exit(2); #endif jp_0 = -1; jp_1 = -1; compatMake = TRUE; } else { Var_Append(MAKEFLAGS, "-J", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); jobServer = TRUE; } break; case 'N': noExecute = TRUE; noRecursiveExecute = TRUE; Var_Append(MAKEFLAGS, "-N", VAR_GLOBAL); break; case 'S': keepgoing = FALSE; Var_Append(MAKEFLAGS, "-S", VAR_GLOBAL); break; case 'T': if (argvalue == NULL) goto noarg; tracefile = bmake_strdup(argvalue); Var_Append(MAKEFLAGS, "-T", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'V': if (argvalue == NULL) goto noarg; printVars = TRUE; (void)Lst_AtEnd(variables, argvalue); Var_Append(MAKEFLAGS, "-V", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'W': parseWarnFatal = TRUE; break; case 'X': varNoExportEnv = TRUE; Var_Append(MAKEFLAGS, "-X", VAR_GLOBAL); break; case 'd': if (argvalue == NULL) goto noarg; /* If '-d-opts' don't pass to children */ if (argvalue[0] == '-') argvalue++; else { Var_Append(MAKEFLAGS, "-d", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); } parse_debug_options(argvalue); break; case 'e': checkEnvFirst = TRUE; Var_Append(MAKEFLAGS, "-e", 
VAR_GLOBAL); break; case 'f': if (argvalue == NULL) goto noarg; (void)Lst_AtEnd(makefiles, argvalue); break; case 'i': ignoreErrors = TRUE; Var_Append(MAKEFLAGS, "-i", VAR_GLOBAL); break; case 'j': if (argvalue == NULL) goto noarg; forceJobs = TRUE; maxJobs = strtol(argvalue, &p, 0); if (*p != '\0' || maxJobs < 1) { (void)fprintf(stderr, "%s: illegal argument to -j -- must be positive integer!\n", progname); exit(1); } Var_Append(MAKEFLAGS, "-j", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); Var_Set(".MAKE.JOBS", argvalue, VAR_GLOBAL, 0); maxJobTokens = maxJobs; break; case 'k': keepgoing = TRUE; Var_Append(MAKEFLAGS, "-k", VAR_GLOBAL); break; case 'm': if (argvalue == NULL) goto noarg; /* look for magic parent directory search string */ if (strncmp(".../", argvalue, 4) == 0) { if (!Dir_FindHereOrAbove(curdir, argvalue+4, found_path, sizeof(found_path))) break; /* nothing doing */ (void)Dir_AddDir(sysIncPath, found_path); } else { (void)Dir_AddDir(sysIncPath, argvalue); } Var_Append(MAKEFLAGS, "-m", VAR_GLOBAL); Var_Append(MAKEFLAGS, argvalue, VAR_GLOBAL); break; case 'n': noExecute = TRUE; Var_Append(MAKEFLAGS, "-n", VAR_GLOBAL); break; case 'q': queryFlag = TRUE; /* Kind of nonsensical, wot? */ Var_Append(MAKEFLAGS, "-q", VAR_GLOBAL); break; case 'r': noBuiltins = TRUE; Var_Append(MAKEFLAGS, "-r", VAR_GLOBAL); break; case 's': beSilent = TRUE; Var_Append(MAKEFLAGS, "-s", VAR_GLOBAL); break; case 't': touchFlag = TRUE; Var_Append(MAKEFLAGS, "-t", VAR_GLOBAL); break; case 'w': enterFlag = TRUE; Var_Append(MAKEFLAGS, "-w", VAR_GLOBAL); break; case '-': dashDash = TRUE; break; default: case '?': #ifndef MAKE_NATIVE fprintf(stderr, "getopt(%s) -> %d (%c)\n", OPTFLAGS, c, c); #endif usage(); } argv += arginc; argc -= arginc; } oldVars = TRUE; /* * See if the rest of the arguments are variable assignments and * perform them if so. Else take them to be targets and stuff them * on the end of the "create" list. */ for (; argc > 1; ++argv, --argc) if (Parse_IsVar(argv[1])) { Parse_DoVar(argv[1], VAR_CMD); } else { if (!*argv[1]) Punt("illegal (null) argument."); if (*argv[1] == '-' && !dashDash) goto rearg; (void)Lst_AtEnd(create, bmake_strdup(argv[1])); } return; noarg: (void)fprintf(stderr, "%s: option requires an argument -- %c\n", progname, c); usage(); } /*- * Main_ParseArgLine -- * Used by the parse module when a .MFLAGS or .MAKEFLAGS target * is encountered and by main() when reading the .MAKEFLAGS envariable. * Takes a line of arguments and breaks it into its * component words and passes those words and the number of them to the * MainParseArgs function. * The line should have all its leading whitespace removed. * * Input: * line Line to fracture * * Results: * None * * Side Effects: * Only those that come from the various arguments. 
*/ void Main_ParseArgLine(const char *line) { char **argv; /* Manufactured argument vector */ int argc; /* Number of arguments in argv */ char *args; /* Space used by the args */ char *buf, *p1; char *argv0 = Var_Value(".MAKE", VAR_GLOBAL, &p1); size_t len; if (line == NULL) return; for (; *line == ' '; ++line) continue; if (!*line) return; #ifndef POSIX { /* * $MAKE may simply be naming the make(1) binary */ char *cp; if (!(cp = strrchr(line, '/'))) cp = line; if ((cp = strstr(cp, "make")) && strcmp(cp, "make") == 0) return; } #endif buf = bmake_malloc(len = strlen(line) + strlen(argv0) + 2); (void)snprintf(buf, len, "%s %s", argv0, line); free(p1); argv = brk_string(buf, &argc, TRUE, &args); if (argv == NULL) { Error("Unterminated quoted string [%s]", buf); free(buf); return; } free(buf); MainParseArgs(argc, argv); free(args); free(argv); } Boolean Main_SetObjdir(const char *path) { struct stat sb; char *p = NULL; char buf[MAXPATHLEN + 1]; Boolean rc = FALSE; /* expand variable substitutions */ if (strchr(path, '$') != 0) { snprintf(buf, MAXPATHLEN, "%s", path); path = p = Var_Subst(NULL, buf, VAR_GLOBAL, VARF_WANTRES); } if (path[0] != '/') { snprintf(buf, MAXPATHLEN, "%s/%s", curdir, path); path = buf; } /* look for the directory and try to chdir there */ if (stat(path, &sb) == 0 && S_ISDIR(sb.st_mode)) { if (chdir(path)) { (void)fprintf(stderr, "make warning: %s: %s.\n", path, strerror(errno)); } else { strncpy(objdir, path, MAXPATHLEN); Var_Set(".OBJDIR", objdir, VAR_GLOBAL, 0); setenv("PWD", objdir, 1); Dir_InitDot(); rc = TRUE; if (enterFlag && strcmp(objdir, curdir) != 0) enterFlagObj = TRUE; } } free(p); return rc; } /*- * ReadAllMakefiles -- * wrapper around ReadMakefile() to read all. * * Results: * TRUE if ok, FALSE on error */ static int ReadAllMakefiles(const void *p, const void *q) { return (ReadMakefile(p, q) == 0); } int str2Lst_Append(Lst lp, char *str, const char *sep) { char *cp; int n; if (!sep) sep = " \t"; for (n = 0, cp = strtok(str, sep); cp; cp = strtok(NULL, sep)) { (void)Lst_AtEnd(lp, cp); n++; } return (n); } #ifdef SIGINFO /*ARGSUSED*/ static void siginfo(int signo MAKE_ATTR_UNUSED) { char dir[MAXPATHLEN]; char str[2 * MAXPATHLEN]; int len; if (getcwd(dir, sizeof(dir)) == NULL) return; len = snprintf(str, sizeof(str), "%s: Working in: %s\n", progname, dir); if (len > 0) (void)write(STDERR_FILENO, str, (size_t)len); } #endif /* * Allow makefiles some control over the mode we run in. */ void MakeMode(const char *mode) { char *mp = NULL; if (!mode) mode = mp = Var_Subst(NULL, "${" MAKE_MODE ":tl}", VAR_GLOBAL, VARF_WANTRES); if (mode && *mode) { if (strstr(mode, "compat")) { compatMake = TRUE; forceJobs = FALSE; } #if USE_META if (strstr(mode, "meta")) meta_mode_init(mode); #endif } free(mp); } /*- * main -- * The main function, for obvious reasons. Initializes variables * and a few modules, then parses the arguments give it in the * environment and on the command line. Reads the system makefile * followed by either Makefile, makefile or the file given by the * -f argument. Sets the .MAKEFLAGS PMake variable based on all the * flags it has received by then uses either the Make or the Compat * module to create the initial list of targets. * * Results: * If -q was given, exits -1 if anything was out-of-date. Else it exits * 0. * * Side Effects: * The program exits when done. Targets are created. etc. etc. etc. 
*/ int main(int argc, char **argv) { Lst targs; /* target nodes to create -- passed to Make_Init */ Boolean outOfDate = FALSE; /* FALSE if all targets up to date */ struct stat sb, sa; char *p1, *path; char mdpath[MAXPATHLEN]; #ifdef FORCE_MACHINE const char *machine = FORCE_MACHINE; #else const char *machine = getenv("MACHINE"); #endif const char *machine_arch = getenv("MACHINE_ARCH"); char *syspath = getenv("MAKESYSPATH"); Lst sysMkPath; /* Path of sys.mk */ char *cp = NULL, *start; /* avoid faults on read-only strings */ static char defsyspath[] = _PATH_DEFSYSPATH; char found_path[MAXPATHLEN + 1]; /* for searching for sys.mk */ struct timeval rightnow; /* to initialize random seed */ struct utsname utsname; /* default to writing debug to stderr */ debug_file = stderr; #ifdef SIGINFO (void)bmake_signal(SIGINFO, siginfo); #endif /* * Set the seed to produce a different random sequence * on each program execution. */ gettimeofday(&rightnow, NULL); srandom(rightnow.tv_sec + rightnow.tv_usec); if ((progname = strrchr(argv[0], '/')) != NULL) progname++; else progname = argv[0]; #if defined(MAKE_NATIVE) || (defined(HAVE_SETRLIMIT) && defined(RLIMIT_NOFILE)) /* * get rid of resource limit on file descriptors */ { struct rlimit rl; if (getrlimit(RLIMIT_NOFILE, &rl) != -1 && rl.rlim_cur != rl.rlim_max) { rl.rlim_cur = rl.rlim_max; (void)setrlimit(RLIMIT_NOFILE, &rl); } } #endif if (uname(&utsname) == -1) { (void)fprintf(stderr, "%s: uname failed (%s).\n", progname, strerror(errno)); exit(2); } /* * Get the name of this type of MACHINE from utsname * so we can share an executable for similar machines. * (i.e. m68k: amiga hp300, mac68k, sun3, ...) * * Note that both MACHINE and MACHINE_ARCH are decided at * run-time. */ if (!machine) { #ifdef MAKE_NATIVE machine = utsname.machine; #else #ifdef MAKE_MACHINE machine = MAKE_MACHINE; #else machine = "unknown"; #endif #endif } if (!machine_arch) { #if defined(MAKE_NATIVE) && defined(HAVE_SYSCTL) && defined(CTL_HW) && defined(HW_MACHINE_ARCH) static char machine_arch_buf[sizeof(utsname.machine)]; int mib[2] = { CTL_HW, HW_MACHINE_ARCH }; size_t len = sizeof(machine_arch_buf); if (sysctl(mib, __arraycount(mib), machine_arch_buf, &len, NULL, 0) < 0) { (void)fprintf(stderr, "%s: sysctl failed (%s).\n", progname, strerror(errno)); exit(2); } machine_arch = machine_arch_buf; #else #ifndef MACHINE_ARCH #ifdef MAKE_MACHINE_ARCH machine_arch = MAKE_MACHINE_ARCH; #else machine_arch = "unknown"; #endif #else machine_arch = MACHINE_ARCH; #endif #endif } myPid = getpid(); /* remember this for vFork() */ /* * Just in case MAKEOBJDIR wants us to do something tricky. */ Var_Init(); /* Initialize the lists of variables for * parsing arguments */ Var_Set(".MAKE.OS", utsname.sysname, VAR_GLOBAL, 0); Var_Set("MACHINE", machine, VAR_GLOBAL, 0); Var_Set("MACHINE_ARCH", machine_arch, VAR_GLOBAL, 0); #ifdef MAKE_VERSION Var_Set("MAKE_VERSION", MAKE_VERSION, VAR_GLOBAL, 0); #endif Var_Set(".newline", "\n", VAR_GLOBAL, 0); /* handy for :@ loops */ /* * This is the traditional preference for makefiles. 
*/ #ifndef MAKEFILE_PREFERENCE_LIST # define MAKEFILE_PREFERENCE_LIST "makefile Makefile" #endif Var_Set(MAKEFILE_PREFERENCE, MAKEFILE_PREFERENCE_LIST, VAR_GLOBAL, 0); Var_Set(MAKE_DEPENDFILE, ".depend", VAR_GLOBAL, 0); create = Lst_Init(FALSE); makefiles = Lst_Init(FALSE); printVars = FALSE; debugVflag = FALSE; variables = Lst_Init(FALSE); beSilent = FALSE; /* Print commands as executed */ ignoreErrors = FALSE; /* Pay attention to non-zero returns */ noExecute = FALSE; /* Execute all commands */ noRecursiveExecute = FALSE; /* Execute all .MAKE targets */ keepgoing = FALSE; /* Stop on error */ allPrecious = FALSE; /* Remove targets when interrupted */ queryFlag = FALSE; /* This is not just a check-run */ noBuiltins = FALSE; /* Read the built-in rules */ touchFlag = FALSE; /* Actually update targets */ debug = 0; /* No debug verbosity, please. */ jobsRunning = FALSE; maxJobs = DEFMAXLOCAL; /* Set default local max concurrency */ maxJobTokens = maxJobs; compatMake = FALSE; /* No compat mode */ ignorePWD = FALSE; /* * Initialize the parsing, directory and variable modules to prepare * for the reading of inclusion paths and variable settings on the * command line */ /* * Initialize various variables. * MAKE also gets this name, for compatibility * .MAKEFLAGS gets set to the empty string just in case. * MFLAGS also gets initialized empty, for compatibility. */ Parse_Init(); if (argv[0][0] == '/' || strchr(argv[0], '/') == NULL) { /* * Leave alone if it is an absolute path, or if it does * not contain a '/' in which case we need to find it in * the path, like execvp(3) and the shells do. */ p1 = argv[0]; } else { /* * A relative path, canonicalize it. */ p1 = cached_realpath(argv[0], mdpath); if (!p1 || *p1 != '/' || stat(p1, &sb) < 0) { p1 = argv[0]; /* realpath failed */ } } Var_Set("MAKE", p1, VAR_GLOBAL, 0); Var_Set(".MAKE", p1, VAR_GLOBAL, 0); Var_Set(MAKEFLAGS, "", VAR_GLOBAL, 0); Var_Set(MAKEOVERRIDES, "", VAR_GLOBAL, 0); Var_Set("MFLAGS", "", VAR_GLOBAL, 0); Var_Set(".ALLTARGETS", "", VAR_GLOBAL, 0); /* some makefiles need to know this */ Var_Set(MAKE_LEVEL ".ENV", MAKE_LEVEL_ENV, VAR_CMD, 0); /* * Set some other useful macros */ { char tmp[64], *ep; makelevel = ((ep = getenv(MAKE_LEVEL_ENV)) && *ep) ? atoi(ep) : 0; if (makelevel < 0) makelevel = 0; snprintf(tmp, sizeof(tmp), "%d", makelevel); Var_Set(MAKE_LEVEL, tmp, VAR_GLOBAL, 0); snprintf(tmp, sizeof(tmp), "%u", myPid); Var_Set(".MAKE.PID", tmp, VAR_GLOBAL, 0); snprintf(tmp, sizeof(tmp), "%u", getppid()); Var_Set(".MAKE.PPID", tmp, VAR_GLOBAL, 0); } if (makelevel > 0) { char pn[1024]; snprintf(pn, sizeof(pn), "%s[%d]", progname, makelevel); progname = bmake_strdup(pn); } #ifdef USE_META meta_init(); #endif /* * First snag any flags out of the MAKE environment variable. * (Note this is *not* MAKEFLAGS since /bin/make uses that and it's * in a different format). */ #ifdef POSIX p1 = explode(getenv("MAKEFLAGS")); Main_ParseArgLine(p1); free(p1); #else Main_ParseArgLine(getenv("MAKE")); #endif /* * Find where we are (now). * We take care of PWD for the automounter below... */ if (getcwd(curdir, MAXPATHLEN) == NULL) { (void)fprintf(stderr, "%s: getcwd: %s.\n", progname, strerror(errno)); exit(2); } MainParseArgs(argc, argv); if (enterFlag) printf("%s: Entering directory `%s'\n", progname, curdir); /* * Verify that cwd is sane. 
*/ if (stat(curdir, &sa) == -1) { (void)fprintf(stderr, "%s: %s: %s.\n", progname, curdir, strerror(errno)); exit(2); } /* * All this code is so that we know where we are when we start up * on a different machine with pmake. * Overriding getcwd() with $PWD totally breaks MAKEOBJDIRPREFIX * since the value of curdir can vary depending on how we got * here. Ie sitting at a shell prompt (shell that provides $PWD) * or via subdir.mk in which case its likely a shell which does * not provide it. * So, to stop it breaking this case only, we ignore PWD if * MAKEOBJDIRPREFIX is set or MAKEOBJDIR contains a transform. */ #ifndef NO_PWD_OVERRIDE if (!ignorePWD) { char *pwd, *ptmp1 = NULL, *ptmp2 = NULL; if ((pwd = getenv("PWD")) != NULL && Var_Value("MAKEOBJDIRPREFIX", VAR_CMD, &ptmp1) == NULL) { const char *makeobjdir = Var_Value("MAKEOBJDIR", VAR_CMD, &ptmp2); if (makeobjdir == NULL || !strchr(makeobjdir, '$')) { if (stat(pwd, &sb) == 0 && sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev) (void)strncpy(curdir, pwd, MAXPATHLEN); } } free(ptmp1); free(ptmp2); } #endif Var_Set(".CURDIR", curdir, VAR_GLOBAL, 0); /* * Find the .OBJDIR. If MAKEOBJDIRPREFIX, or failing that, * MAKEOBJDIR is set in the environment, try only that value * and fall back to .CURDIR if it does not exist. * * Otherwise, try _PATH_OBJDIR.MACHINE, _PATH_OBJDIR, and * finally _PATH_OBJDIRPREFIX`pwd`, in that order. If none * of these paths exist, just use .CURDIR. */ Dir_Init(curdir); (void)Main_SetObjdir(curdir); if ((path = Var_Value("MAKEOBJDIRPREFIX", VAR_CMD, &p1)) != NULL) { (void)snprintf(mdpath, MAXPATHLEN, "%s%s", path, curdir); (void)Main_SetObjdir(mdpath); free(p1); } else if ((path = Var_Value("MAKEOBJDIR", VAR_CMD, &p1)) != NULL) { (void)Main_SetObjdir(path); free(p1); } else { (void)snprintf(mdpath, MAXPATHLEN, "%s.%s", _PATH_OBJDIR, machine); if (!Main_SetObjdir(mdpath) && !Main_SetObjdir(_PATH_OBJDIR)) { (void)snprintf(mdpath, MAXPATHLEN, "%s%s", _PATH_OBJDIRPREFIX, curdir); (void)Main_SetObjdir(mdpath); } } /* * Initialize archive, target and suffix modules in preparation for * parsing the makefile(s) */ Arch_Init(); Targ_Init(); Suff_Init(); Trace_Init(tracefile); DEFAULT = NULL; (void)time(&now); Trace_Log(MAKESTART, NULL); /* * Set up the .TARGETS variable to contain the list of targets to be * created. If none specified, make the variable empty -- the parser * will fill the thing in with the default or .MAIN target. */ if (!Lst_IsEmpty(create)) { LstNode ln; for (ln = Lst_First(create); ln != NULL; ln = Lst_Succ(ln)) { char *name = (char *)Lst_Datum(ln); Var_Append(".TARGETS", name, VAR_GLOBAL); } } else Var_Set(".TARGETS", "", VAR_GLOBAL, 0); /* * If no user-supplied system path was given (through the -m option) * add the directories from the DEFSYSPATH (more than one may be given * as dir1:...:dirn) to the system include path. 
*/ if (syspath == NULL || *syspath == '\0') syspath = defsyspath; else syspath = bmake_strdup(syspath); for (start = syspath; *start != '\0'; start = cp) { for (cp = start; *cp != '\0' && *cp != ':'; cp++) continue; if (*cp == ':') { *cp++ = '\0'; } /* look for magic parent directory search string */ if (strncmp(".../", start, 4) != 0) { (void)Dir_AddDir(defIncPath, start); } else { if (Dir_FindHereOrAbove(curdir, start+4, found_path, sizeof(found_path))) { (void)Dir_AddDir(defIncPath, found_path); } } } if (syspath != defsyspath) free(syspath); /* * Read in the built-in rules first, followed by the specified * makefile, if it was (makefile != NULL), or the default * makefile and Makefile, in that order, if it wasn't. */ if (!noBuiltins) { LstNode ln; sysMkPath = Lst_Init(FALSE); Dir_Expand(_PATH_DEFSYSMK, Lst_IsEmpty(sysIncPath) ? defIncPath : sysIncPath, sysMkPath); if (Lst_IsEmpty(sysMkPath)) Fatal("%s: no system rules (%s).", progname, _PATH_DEFSYSMK); ln = Lst_Find(sysMkPath, NULL, ReadMakefile); if (ln == NULL) Fatal("%s: cannot open %s.", progname, (char *)Lst_Datum(ln)); } if (!Lst_IsEmpty(makefiles)) { LstNode ln; ln = Lst_Find(makefiles, NULL, ReadAllMakefiles); if (ln != NULL) Fatal("%s: cannot open %s.", progname, (char *)Lst_Datum(ln)); } else { p1 = Var_Subst(NULL, "${" MAKEFILE_PREFERENCE "}", VAR_CMD, VARF_WANTRES); if (p1) { (void)str2Lst_Append(makefiles, p1, NULL); (void)Lst_Find(makefiles, NULL, ReadMakefile); free(p1); } } /* In particular suppress .depend for '-r -V .OBJDIR -f /dev/null' */ if (!noBuiltins || !printVars) { makeDependfile = Var_Subst(NULL, "${.MAKE.DEPENDFILE:T}", VAR_CMD, VARF_WANTRES); doing_depend = TRUE; (void)ReadMakefile(makeDependfile, NULL); doing_depend = FALSE; } if (enterFlagObj) printf("%s: Entering directory `%s'\n", progname, objdir); MakeMode(NULL); Var_Append("MFLAGS", Var_Value(MAKEFLAGS, VAR_GLOBAL, &p1), VAR_GLOBAL); free(p1); if (!forceJobs && !compatMake && Var_Exists(".MAKE.JOBS", VAR_GLOBAL)) { char *value; int n; value = Var_Subst(NULL, "${.MAKE.JOBS}", VAR_GLOBAL, VARF_WANTRES); n = strtol(value, NULL, 0); if (n < 1) { (void)fprintf(stderr, "%s: illegal value for .MAKE.JOBS -- must be positive integer!\n", progname); exit(1); } if (n != maxJobs) { Var_Append(MAKEFLAGS, "-j", VAR_GLOBAL); Var_Append(MAKEFLAGS, value, VAR_GLOBAL); } maxJobs = n; maxJobTokens = maxJobs; forceJobs = TRUE; free(value); } /* * Be compatible if user did not specify -j and did not explicitly * turned compatibility on */ if (!compatMake && !forceJobs) { compatMake = TRUE; } if (!compatMake) Job_ServerStart(maxJobTokens, jp_0, jp_1); if (DEBUG(JOB)) fprintf(debug_file, "job_pipe %d %d, maxjobs %d, tokens %d, compat %d\n", jp_0, jp_1, maxJobs, maxJobTokens, compatMake); Main_ExportMAKEFLAGS(TRUE); /* initial export */ /* * For compatibility, look at the directories in the VPATH variable * and add them to the search path, if the variable is defined. The * variable's value is in the same format as the PATH envariable, i.e. * ::... 
*/ if (Var_Exists("VPATH", VAR_CMD)) { char *vpath, savec; /* * GCC stores string constants in read-only memory, but * Var_Subst will want to write this thing, so store it * in an array */ static char VPATH[] = "${VPATH}"; vpath = Var_Subst(NULL, VPATH, VAR_CMD, VARF_WANTRES); path = vpath; do { /* skip to end of directory */ for (cp = path; *cp != ':' && *cp != '\0'; cp++) continue; /* Save terminator character so know when to stop */ savec = *cp; *cp = '\0'; /* Add directory to search path */ (void)Dir_AddDir(dirSearchPath, path); *cp = savec; path = cp + 1; } while (savec == ':'); free(vpath); } /* * Now that all search paths have been read for suffixes et al, it's * time to add the default search path to their lists... */ Suff_DoPaths(); /* * Propagate attributes through :: dependency lists. */ Targ_Propagate(); /* print the initial graph, if the user requested it */ if (DEBUG(GRAPH1)) Targ_PrintGraph(1); /* print the values of any variables requested by the user */ if (printVars) { LstNode ln; Boolean expandVars; if (debugVflag) expandVars = FALSE; else expandVars = getBoolean(".MAKE.EXPAND_VARIABLES", FALSE); for (ln = Lst_First(variables); ln != NULL; ln = Lst_Succ(ln)) { char *var = (char *)Lst_Datum(ln); char *value; if (strchr(var, '$')) { value = p1 = Var_Subst(NULL, var, VAR_GLOBAL, VARF_WANTRES); } else if (expandVars) { char tmp[128]; if (snprintf(tmp, sizeof(tmp), "${%s}", var) >= (int)(sizeof(tmp))) Fatal("%s: variable name too big: %s", progname, var); value = p1 = Var_Subst(NULL, tmp, VAR_GLOBAL, VARF_WANTRES); } else { value = Var_Value(var, VAR_GLOBAL, &p1); } printf("%s\n", value ? value : ""); free(p1); } } else { /* * Have now read the entire graph and need to make a list of * targets to create. If none was given on the command line, * we consult the parsing module to find the main target(s) * to create. */ if (Lst_IsEmpty(create)) targs = Parse_MainName(); else targs = Targ_FindList(create, TARG_CREATE); if (!compatMake) { /* * Initialize job module before traversing the graph * now that any .BEGIN and .END targets have been read. * This is done only if the -q flag wasn't given * (to prevent the .BEGIN from being executed should * it exist). */ if (!queryFlag) { Job_Init(); jobsRunning = TRUE; } /* Traverse the graph, checking on all the targets */ outOfDate = Make_Run(targs); } else { /* * Compat_Init will take care of creating all the * targets as well as initializing the module. */ Compat_Run(targs); } } #ifdef CLEANUP Lst_Destroy(targs, NULL); Lst_Destroy(variables, NULL); Lst_Destroy(makefiles, NULL); Lst_Destroy(create, (FreeProc *)free); #endif /* print the graph now it's been processed if the user requested it */ if (DEBUG(GRAPH2)) Targ_PrintGraph(2); Trace_Log(MAKEEND, 0); if (enterFlagObj) printf("%s: Leaving directory `%s'\n", progname, objdir); if (enterFlag) printf("%s: Leaving directory `%s'\n", progname, curdir); #ifdef USE_META meta_finish(); #endif Suff_End(); Targ_End(); Arch_End(); Var_End(); Parse_End(); Dir_End(); Job_End(); Trace_End(); return outOfDate ? 1 : 0; } /*- * ReadMakefile -- * Open and parse the given makefile. * * Results: * 0 if ok. -1 if couldn't open file. 
* * Side Effects: * lots */ static int ReadMakefile(const void *p, const void *q MAKE_ATTR_UNUSED) { const char *fname = p; /* makefile to read */ int fd; size_t len = MAXPATHLEN; char *name, *path = bmake_malloc(len); if (!strcmp(fname, "-")) { Parse_File(NULL /*stdin*/, -1); Var_Set("MAKEFILE", "", VAR_INTERNAL, 0); } else { /* if we've chdir'd, rebuild the path name */ if (strcmp(curdir, objdir) && *fname != '/') { size_t plen = strlen(curdir) + strlen(fname) + 2; if (len < plen) path = bmake_realloc(path, len = 2 * plen); (void)snprintf(path, len, "%s/%s", curdir, fname); fd = open(path, O_RDONLY); if (fd != -1) { fname = path; goto found; } /* If curdir failed, try objdir (ala .depend) */ plen = strlen(objdir) + strlen(fname) + 2; if (len < plen) path = bmake_realloc(path, len = 2 * plen); (void)snprintf(path, len, "%s/%s", objdir, fname); fd = open(path, O_RDONLY); if (fd != -1) { fname = path; goto found; } } else { fd = open(fname, O_RDONLY); if (fd != -1) goto found; } /* look in -I and system include directories. */ name = Dir_FindFile(fname, parseIncPath); if (!name) name = Dir_FindFile(fname, Lst_IsEmpty(sysIncPath) ? defIncPath : sysIncPath); if (!name || (fd = open(name, O_RDONLY)) == -1) { free(name); free(path); return(-1); } fname = name; /* * set the MAKEFILE variable desired by System V fans -- the * placement of the setting here means it gets set to the last * makefile specified, as it is set by SysV make. */ found: if (!doing_depend) Var_Set("MAKEFILE", fname, VAR_INTERNAL, 0); Parse_File(fname, fd); } free(path); return(0); } /*- * Cmd_Exec -- * Execute the command in cmd, and return the output of that command * in a string. * * Results: * A string containing the output of the command, or the empty string * If errnum is not NULL, it contains the reason for the command failure * * Side Effects: * The string must be freed by the caller. */ char * Cmd_Exec(const char *cmd, const char **errnum) { const char *args[4]; /* Args for invoking the shell */ int fds[2]; /* Pipe streams */ int cpid; /* Child PID */ int pid; /* PID from wait() */ char *res; /* result */ WAIT_T status; /* command exit status */ Buffer buf; /* buffer to store the result */ char *cp; int cc; /* bytes read, or -1 */ int savederr; /* saved errno */ *errnum = NULL; if (!shellName) Shell_Init(); /* * Set up arguments for shell */ args[0] = shellName; args[1] = "-c"; args[2] = cmd; args[3] = NULL; /* * Open a pipe for fetching its output */ if (pipe(fds) == -1) { *errnum = "Couldn't create pipe for \"%s\""; goto bad; } /* * Fork */ switch (cpid = vFork()) { case 0: /* * Close input side of pipe */ (void)close(fds[0]); /* * Duplicate the output stream to the shell's output, then * shut the extra thing down. Note we don't fetch the error * stream...why not? Why? */ (void)dup2(fds[1], 1); (void)close(fds[1]); Var_ExportVars(); (void)execv(shellPath, UNCONST(args)); _exit(1); /*NOTREACHED*/ case -1: *errnum = "Couldn't exec \"%s\""; goto bad; default: /* * No need for the writing half */ (void)close(fds[1]); savederr = 0; Buf_Init(&buf, 0); do { char result[BUFSIZ]; cc = read(fds[0], result, sizeof(result)); if (cc > 0) Buf_AddBytes(&buf, cc, result); } while (cc > 0 || (cc == -1 && errno == EINTR)); if (cc == -1) savederr = errno; /* * Close the input side of the pipe. */ (void)close(fds[0]); /* * Wait for the process to exit. 
*/ while(((pid = waitpid(cpid, &status, 0)) != cpid) && (pid >= 0)) { JobReapChild(pid, status, FALSE); continue; } cc = Buf_Size(&buf); res = Buf_Destroy(&buf, FALSE); if (savederr != 0) *errnum = "Couldn't read shell's output for \"%s\""; if (WIFSIGNALED(status)) *errnum = "\"%s\" exited on a signal"; else if (WEXITSTATUS(status) != 0) *errnum = "\"%s\" returned non-zero status"; /* * Null-terminate the result, convert newlines to spaces and * install it in the variable. */ res[cc] = '\0'; cp = &res[cc]; if (cc > 0 && *--cp == '\n') { /* * A final newline is just stripped */ *cp-- = '\0'; } while (cp >= res) { if (*cp == '\n') { *cp = ' '; } cp--; } break; } return res; bad: res = bmake_malloc(1); *res = '\0'; return res; } /*- * Error -- * Print an error message given its format. * * Results: * None. * * Side Effects: * The message is printed. */ /* VARARGS */ void Error(const char *fmt, ...) { va_list ap; FILE *err_file; err_file = debug_file; if (err_file == stdout) err_file = stderr; (void)fflush(stdout); for (;;) { va_start(ap, fmt); fprintf(err_file, "%s: ", progname); (void)vfprintf(err_file, fmt, ap); va_end(ap); (void)fprintf(err_file, "\n"); (void)fflush(err_file); if (err_file == stderr) break; err_file = stderr; } } /*- * Fatal -- * Produce a Fatal error message. If jobs are running, waits for them * to finish. * * Results: * None * * Side Effects: * The program exits */ /* VARARGS */ void Fatal(const char *fmt, ...) { va_list ap; va_start(ap, fmt); if (jobsRunning) Job_Wait(); (void)fflush(stdout); (void)vfprintf(stderr, fmt, ap); va_end(ap); (void)fprintf(stderr, "\n"); (void)fflush(stderr); PrintOnError(NULL, NULL); if (DEBUG(GRAPH2) || DEBUG(GRAPH3)) Targ_PrintGraph(2); Trace_Log(MAKEERROR, 0); exit(2); /* Not 1 so -q can distinguish error */ } /* * Punt -- * Major exception once jobs are being created. Kills all jobs, prints * a message and exits. * * Results: * None * * Side Effects: * All children are killed indiscriminately and the program Lib_Exits */ /* VARARGS */ void Punt(const char *fmt, ...) { va_list ap; va_start(ap, fmt); (void)fflush(stdout); (void)fprintf(stderr, "%s: ", progname); (void)vfprintf(stderr, fmt, ap); va_end(ap); (void)fprintf(stderr, "\n"); (void)fflush(stderr); PrintOnError(NULL, NULL); DieHorribly(); } /*- * DieHorribly -- * Exit without giving a message. * * Results: * None * * Side Effects: * A big one... */ void DieHorribly(void) { if (jobsRunning) Job_AbortAll(); if (DEBUG(GRAPH2)) Targ_PrintGraph(2); Trace_Log(MAKEERROR, 0); exit(2); /* Not 1, so -q can distinguish error */ } /* * Finish -- * Called when aborting due to errors in child shell to signal * abnormal exit. * * Results: * None * * Side Effects: * The program exits */ void Finish(int errors) /* number of errors encountered in Make_Make */ { Fatal("%d error%s", errors, errors == 1 ? "" : "s"); } /* * eunlink -- * Remove a file carefully, avoiding directories. */ int eunlink(const char *file) { struct stat st; if (lstat(file, &st) == -1) return -1; if (S_ISDIR(st.st_mode)) { errno = EISDIR; return -1; } return unlink(file); } /* * execError -- * Print why exec failed, avoiding stdio. 
*/ void execError(const char *af, const char *av) { #ifdef USE_IOVEC int i = 0; struct iovec iov[8]; #define IOADD(s) \ (void)(iov[i].iov_base = UNCONST(s), \ iov[i].iov_len = strlen(iov[i].iov_base), \ i++) #else #define IOADD(s) (void)write(2, s, strlen(s)) #endif IOADD(progname); IOADD(": "); IOADD(af); IOADD("("); IOADD(av); IOADD(") failed ("); IOADD(strerror(errno)); IOADD(")\n"); #ifdef USE_IOVEC while (writev(2, iov, 8) == -1 && errno == EAGAIN) continue; #endif } /* * usage -- * exit with usage message */ static void usage(void) { char *p; if ((p = strchr(progname, '[')) != NULL) *p = '\0'; (void)fprintf(stderr, "usage: %s [-BeikNnqrstWwX] \n\ [-C directory] [-D variable] [-d flags] [-f makefile]\n\ [-I directory] [-J private] [-j max_jobs] [-m directory] [-T file]\n\ [-V variable] [variable=value] [target ...]\n", progname); exit(2); } /* * realpath(3) can get expensive, cache results... */ char * cached_realpath(const char *pathname, char *resolved) { static GNode *cache; char *rp, *cp; if (!pathname || !pathname[0]) return NULL; if (!cache) { cache = Targ_NewGN("Realpath"); #ifndef DEBUG_REALPATH_CACHE cache->flags = INTERNAL; #endif } - rp = Var_Value(pathname, cache, &cp); - if (rp) { + if ((rp = Var_Value(pathname, cache, &cp)) != NULL) { /* a hit */ strlcpy(resolved, rp, MAXPATHLEN); - } else if ((rp = realpath(pathname, resolved))) { + } else if ((rp = realpath(pathname, resolved)) != NULL) { Var_Set(pathname, rp, cache, 0); } free(cp); return rp ? resolved : NULL; } int PrintAddr(void *a, void *b) { printf("%lx ", (unsigned long) a); return b ? 0 : 0; } +static int +addErrorCMD(void *cmdp, void *gnp) +{ + if (cmdp == NULL) + return 1; /* stop */ + Var_Append(".ERROR_CMD", cmdp, VAR_GLOBAL); + return 0; +} void PrintOnError(GNode *gn, const char *s) { static GNode *en = NULL; char tmp[64]; char *cp; if (s) printf("%s", s); printf("\n%s: stopped in %s\n", progname, curdir); if (en) return; /* we've been here! */ if (gn) { /* * We can print this even if there is no .ERROR target. */ Var_Set(".ERROR_TARGET", gn->name, VAR_GLOBAL, 0); + Var_Delete(".ERROR_CMD", VAR_GLOBAL); + Lst_ForEach(gn->commands, addErrorCMD, gn); } strncpy(tmp, "${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'\n@}", sizeof(tmp) - 1); cp = Var_Subst(NULL, tmp, VAR_GLOBAL, VARF_WANTRES); if (cp) { if (*cp) printf("%s", cp); free(cp); } fflush(stdout); /* * Finally, see if there is a .ERROR target, and run it if so. */ en = Targ_FindNode(".ERROR", TARG_NOCREATE); if (en) { en->type |= OP_SPECIAL; Compat_Make(en, en); } } void Main_ExportMAKEFLAGS(Boolean first) { static int once = 1; char tmp[64]; char *s; if (once != first) return; once = 0; strncpy(tmp, "${.MAKEFLAGS} ${.MAKEOVERRIDES:O:u:@v@$v=${$v:Q}@}", sizeof(tmp)); s = Var_Subst(NULL, tmp, VAR_CMD, VARF_WANTRES); if (s && *s) { #ifdef POSIX setenv("MAKEFLAGS", s, 1); #else setenv("MAKE", s, 1); #endif } } char * getTmpdir(void) { static char *tmpdir = NULL; if (!tmpdir) { struct stat st; /* * Honor $TMPDIR but only if it is valid. * Ensure it ends with /. */ tmpdir = Var_Subst(NULL, "${TMPDIR:tA:U" _PATH_TMP "}/", VAR_GLOBAL, VARF_WANTRES); if (stat(tmpdir, &st) < 0 || !S_ISDIR(st.st_mode)) { free(tmpdir); tmpdir = bmake_strdup(_PATH_TMP); } } return tmpdir; } /* * Create and open a temp file using "pattern". * If "fnamep" is provided set it to a copy of the filename created. * Otherwise unlink the file once open. 
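 * Returns the open file descriptor; on failure Punt() is called,
 * so the caller never sees an error return.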
*/ int mkTempFile(const char *pattern, char **fnamep) { static char *tmpdir = NULL; char tfile[MAXPATHLEN]; int fd; if (!pattern) pattern = TMPPAT; if (!tmpdir) tmpdir = getTmpdir(); if (pattern[0] == '/') { snprintf(tfile, sizeof(tfile), "%s", pattern); } else { snprintf(tfile, sizeof(tfile), "%s%s", tmpdir, pattern); } if ((fd = mkstemp(tfile)) < 0) Punt("Could not create temporary file %s: %s", tfile, strerror(errno)); if (fnamep) { *fnamep = bmake_strdup(tfile); } else { unlink(tfile); /* we just want the descriptor */ } return fd; } /* * Convert a string representation of a boolean. * Anything that looks like "No", "False", "Off", "0" etc, * is FALSE, otherwise TRUE. */ Boolean s2Boolean(const char *s, Boolean bf) { if (s) { switch(*s) { case '\0': /* not set - the default wins */ break; case '0': case 'F': case 'f': case 'N': case 'n': bf = FALSE; break; case 'O': case 'o': switch (s[1]) { case 'F': case 'f': bf = FALSE; break; default: bf = TRUE; break; } break; default: bf = TRUE; break; } } return (bf); } /* * Return a Boolean based on setting of a knob. * * If the knob is not set, the supplied default is the return value. * If set, anything that looks or smells like "No", "False", "Off", "0" etc, * is FALSE, otherwise TRUE. */ Boolean getBoolean(const char *name, Boolean bf) { char tmp[64]; char *cp; if (snprintf(tmp, sizeof(tmp), "${%s:U:tl}", name) < (int)(sizeof(tmp))) { cp = Var_Subst(NULL, tmp, VAR_GLOBAL, VARF_WANTRES); if (cp) { bf = s2Boolean(cp, bf); free(cp); } } return (bf); } Index: projects/clang390-import/contrib/bmake/make.1 =================================================================== --- projects/clang390-import/contrib/bmake/make.1 (revision 305686) +++ projects/clang390-import/contrib/bmake/make.1 (revision 305687) @@ -1,2337 +1,2352 @@ -.\" $NetBSD: make.1,v 1.259 2016/06/03 07:07:37 wiz Exp $ +.\" $NetBSD: make.1,v 1.262 2016/08/18 19:23:20 wiz Exp $ .\" .\" Copyright (c) 1990, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" from: @(#)make.1 8.4 (Berkeley) 3/19/94 .\" -.Dd June 2, 2016 +.Dd August 15, 2016 .Dt MAKE 1 .Os .Sh NAME .Nm make .Nd maintain program dependencies .Sh SYNOPSIS .Nm .Op Fl BeikNnqrstWwX .Op Fl C Ar directory .Op Fl D Ar variable .Op Fl d Ar flags .Op Fl f Ar makefile .Op Fl I Ar directory .Op Fl J Ar private .Op Fl j Ar max_jobs .Op Fl m Ar directory .Op Fl T Ar file .Op Fl V Ar variable .Op Ar variable=value .Op Ar target ... .Sh DESCRIPTION .Nm is a program designed to simplify the maintenance of other programs. Its input is a list of specifications as to the files upon which programs and other files depend. If no .Fl f Ar makefile makefile option is given, .Nm will try to open .Ql Pa makefile then .Ql Pa Makefile in order to find the specifications. If the file .Ql Pa .depend exists, it is read (see .Xr mkdep 1 ) . .Pp This manual page is intended as a reference document only. For a more thorough description of .Nm and makefiles, please refer to .%T "PMake \- A Tutorial" . .Pp .Nm will prepend the contents of the .Va MAKEFLAGS environment variable to the command line arguments before parsing them. .Pp The options are as follows: .Bl -tag -width Ds .It Fl B Try to be backwards compatible by executing a single shell per command and by executing the commands to make the sources of a dependency line in sequence. .It Fl C Ar directory Change to .Ar directory before reading the makefiles or doing anything else. If multiple .Fl C options are specified, each is interpreted relative to the previous one: .Fl C Pa / Fl C Pa etc is equivalent to .Fl C Pa /etc . .It Fl D Ar variable Define .Ar variable to be 1, in the global context. .It Fl d Ar [-]flags Turn on debugging, and specify which portions of .Nm are to print debugging information. Unless the flags are preceded by .Ql \- they are added to the .Va MAKEFLAGS environment variable and will be processed by any child make processes. By default, debugging information is printed to standard error, but this can be changed using the .Ar F debugging flag. The debugging output is always unbuffered; in addition, if debugging is enabled but debugging output is not directed to standard output, then the standard output is line buffered. .Ar Flags is one or more of the following: .Bl -tag -width Ds .It Ar A Print all possible debugging information; equivalent to specifying all of the debugging flags. .It Ar a Print debugging information about archive searching and caching. .It Ar C Print debugging information about current working directory. .It Ar c Print debugging information about conditional evaluation. .It Ar d Print debugging information about directory searching and caching. .It Ar e Print debugging information about failed commands and targets. .It Ar F Ns Oo Sy \&+ Oc Ns Ar filename Specify where debugging output is written. This must be the last flag, because it consumes the remainder of the argument. If the character immediately after the .Ql F flag is .Ql \&+ , then the file will be opened in append mode; otherwise the file will be overwritten. If the file name is .Ql stdout or .Ql stderr then debugging output will be written to the standard output or standard error output file descriptors respectively (and the .Ql \&+ option has no effect). Otherwise, the output will be written to the named file. If the file name ends .Ql .%d then the .Ql %d is replaced by the pid. .It Ar f Print debugging information about loop evaluation. .It Ar "g1" Print the input graph before making anything. 
.It Ar "g2" Print the input graph after making everything, or before exiting on error. .It Ar "g3" Print the input graph before exiting on error. .It Ar j Print debugging information about running multiple shells. .It Ar l Print commands in Makefiles regardless of whether or not they are prefixed by .Ql @ or other "quiet" flags. Also known as "loud" behavior. .It Ar M Print debugging information about "meta" mode decisions about targets. .It Ar m Print debugging information about making targets, including modification dates. .It Ar n Don't delete the temporary command scripts created when running commands. These temporary scripts are created in the directory referred to by the .Ev TMPDIR environment variable, or in .Pa /tmp if .Ev TMPDIR is unset or set to the empty string. The temporary scripts are created by .Xr mkstemp 3 , and have names of the form .Pa makeXXXXXX . .Em NOTE : This can create many files in .Ev TMPDIR or .Pa /tmp , so use with care. .It Ar p Print debugging information about makefile parsing. .It Ar s Print debugging information about suffix-transformation rules. .It Ar t Print debugging information about target list maintenance. .It Ar V Force the .Fl V option to print raw values of variables. .It Ar v Print debugging information about variable assignment. .It Ar x Run shell commands with .Fl x so the actual commands are printed as they are executed. .El .It Fl e Specify that environment variables override macro assignments within makefiles. .It Fl f Ar makefile Specify a makefile to read instead of the default .Ql Pa makefile . If .Ar makefile is .Ql Fl , standard input is read. Multiple makefiles may be specified, and are read in the order specified. .It Fl I Ar directory Specify a directory in which to search for makefiles and included makefiles. The system makefile directory (or directories, see the .Fl m option) is automatically included as part of this list. .It Fl i Ignore non-zero exit of shell commands in the makefile. Equivalent to specifying .Ql Fl before each command line in the makefile. .It Fl J Ar private This option should .Em not be specified by the user. .Pp When the .Ar j option is in use in a recursive build, this option is passed by a make to child makes to allow all the make processes in the build to cooperate to avoid overloading the system. .It Fl j Ar max_jobs Specify the maximum number of jobs that .Nm may have running at any one time. The value is saved in .Va .MAKE.JOBS . Turns compatibility mode off, unless the .Ar B flag is also specified. When compatibility mode is off, all commands associated with a target are executed in a single shell invocation as opposed to the traditional one shell invocation per line. This can break traditional scripts which change directories on each command invocation and then expect to start with a fresh environment on the next line. It is more efficient to correct the scripts rather than turn backwards compatibility on. .It Fl k Continue processing after errors are encountered, but only on those targets that do not depend on the target whose creation caused the error. .It Fl m Ar directory Specify a directory in which to search for sys.mk and makefiles included via the .Ao Ar file Ac Ns -style include statement. The .Fl m option can be used multiple times to form a search path. This path will override the default system include path: /usr/share/mk. Furthermore the system include path will be appended to the search path used for .Qo Ar file Qc Ns -style include statements (see the .Fl I option). 
.Pp If a file or directory name in the .Fl m argument (or the .Ev MAKESYSPATH environment variable) starts with the string .Qq \&.../ then .Nm will search for the specified file or directory named in the remaining part of the argument string. The search starts with the current directory of the Makefile and then works upward towards the root of the file system. If the search is successful, then the resulting directory replaces the .Qq \&.../ specification in the .Fl m argument. If used, this feature allows .Nm to easily search in the current source tree for customized sys.mk files (e.g., by using .Qq \&.../mk/sys.mk as an argument). .It Fl n Display the commands that would have been executed, but do not actually execute them unless the target depends on the .MAKE special source (see below). .It Fl N Display the commands which would have been executed, but do not actually execute any of them; useful for debugging top-level makefiles without descending into subdirectories. .It Fl q Do not execute any commands, but exit 0 if the specified targets are up-to-date and 1, otherwise. .It Fl r Do not use the built-in rules specified in the system makefile. .It Fl s Do not echo any commands as they are executed. Equivalent to specifying .Ql Ic @ before each command line in the makefile. .It Fl T Ar tracefile When used with the .Fl j flag, append a trace record to .Ar tracefile for each job started and completed. .It Fl t Rather than re-building a target as specified in the makefile, create it or update its modification time to make it appear up-to-date. .It Fl V Ar variable Print .Nm Ns 's idea of the value of .Ar variable , in the global context. Do not build any targets. Multiple instances of this option may be specified; the variables will be printed one per line, with a blank line for each null or undefined variable. If .Ar variable contains a .Ql \&$ then the value will be expanded before printing. .It Fl W Treat any warnings during makefile parsing as errors. .It Fl w Print entering and leaving directory messages, pre and post processing. .It Fl X Don't export variables passed on the command line to the environment individually. Variables passed on the command line are still exported via the .Va MAKEFLAGS environment variable. This option may be useful on systems which have a small limit on the size of command arguments. .It Ar variable=value Set the value of the variable .Ar variable to .Ar value . Normally, all values passed on the command line are also exported to sub-makes in the environment. The .Fl X flag disables this behavior. Variable assignments should follow options for POSIX compatibility but no ordering is enforced. .El .Pp There are seven different types of lines in a makefile: file dependency specifications, shell commands, variable assignments, include statements, conditional directives, for loops, and comments. .Pp In general, lines may be continued from one line to the next by ending them with a backslash .Pq Ql \e . The trailing newline character and initial whitespace on the following line are compressed into a single space. .Sh FILE DEPENDENCY SPECIFICATIONS Dependency lines consist of one or more targets, an operator, and zero or more sources. This creates a relationship where the targets .Dq depend on the sources and are usually created from them. The exact relationship between the target and the source is determined by the operator that separates them. 
The three operators are as follows: .Bl -tag -width flag .It Ic \&: A target is considered out-of-date if its modification time is less than those of any of its sources. Sources for a target accumulate over dependency lines when this operator is used. The target is removed if .Nm is interrupted. .It Ic \&! Targets are always re-created, but not until all sources have been examined and re-created as necessary. Sources for a target accumulate over dependency lines when this operator is used. The target is removed if .Nm is interrupted. .It Ic \&:: If no sources are specified, the target is always re-created. Otherwise, a target is considered out-of-date if any of its sources has been modified more recently than the target. Sources for a target do not accumulate over dependency lines when this operator is used. The target will not be removed if .Nm is interrupted. .El .Pp Targets and sources may contain the shell wildcard values .Ql \&? , .Ql * , .Ql [] , and .Ql {} . The values .Ql \&? , .Ql * , and .Ql [] may only be used as part of the final component of the target or source, and must be used to describe existing files. The value .Ql {} need not necessarily be used to describe existing files. Expansion is in directory order, not alphabetically as done in the shell. .Sh SHELL COMMANDS Each target may have associated with it one or more lines of shell commands, normally used to create the target. Each of the lines in this script .Em must be preceded by a tab. (For historical reasons, spaces are not accepted.) While targets can appear in many dependency lines if desired, by default only one of these rules may be followed by a creation script. If the .Ql Ic \&:: operator is used, however, all rules may include scripts and the scripts are executed in the order found. .Pp Each line is treated as a separate shell command, unless the end of line is escaped with a backslash .Pq Ql \e in which case that line and the next are combined. .\" The escaped newline is retained and passed to the shell, which .\" normally ignores it. .\" However, the tab at the beginning of the following line is removed. If the first characters of the command are any combination of .Ql Ic @ , .Ql Ic + , or .Ql Ic \- , the command is treated specially. A .Ql Ic @ causes the command not to be echoed before it is executed. A .Ql Ic + causes the command to be executed even when .Fl n is given. This is similar to the effect of the .MAKE special source, except that the effect can be limited to a single line of a script. A .Ql Ic \- in compatibility mode causes any non-zero exit status of the command line to be ignored. .Pp When .Nm is run in jobs mode with .Fl j Ar max_jobs , the entire script for the target is fed to a single instance of the shell. In compatibility (non-jobs) mode, each command is run in a separate process. If the command contains any shell meta characters .Pq Ql #=|^(){};&<>*?[]:$`\e\en it will be passed to the shell; otherwise .Nm will attempt direct execution. If a line starts with .Ql Ic \- and the shell has ErrCtl enabled then failure of the command line will be ignored as in compatibility mode. Otherwise .Ql Ic \- affects the entire job; the script will stop at the first command line that fails, but the target will not be deemed to have failed. .Pp Makefiles should be written so that the mode of .Nm operation does not change their behavior. 
For example, any command which needs to use .Dq cd or .Dq chdir without potentially changing the directory for subsequent commands should be put in parentheses so it executes in a subshell. To force the use of one shell, escape the line breaks so as to make the whole script one command. For example: .Bd -literal -offset indent avoid-chdir-side-effects: @echo Building $@ in `pwd` @(cd ${.CURDIR} && ${MAKE} $@) @echo Back in `pwd` ensure-one-shell-regardless-of-mode: @echo Building $@ in `pwd`; \e (cd ${.CURDIR} && ${MAKE} $@); \e echo Back in `pwd` .Ed .Pp Since .Nm will .Xr chdir 2 to .Ql Va .OBJDIR before executing any targets, each child process starts with that as its current working directory. .Sh VARIABLE ASSIGNMENTS Variables in make are much like variables in the shell, and, by tradition, consist of all upper-case letters. .Ss Variable assignment modifiers The five operators that can be used to assign values to variables are as follows: .Bl -tag -width Ds .It Ic \&= Assign the value to the variable. Any previous value is overridden. .It Ic \&+= Append the value to the current value of the variable. .It Ic \&?= Assign the value to the variable if it is not already defined. .It Ic \&:= Assign with expansion, i.e. expand the value before assigning it to the variable. Normally, expansion is not done until the variable is referenced. .Em NOTE : References to undefined variables are .Em not expanded. This can cause problems when variable modifiers are used. .It Ic \&!= Expand the value and pass it to the shell for execution and assign the result to the variable. Any newlines in the result are replaced with spaces. .El .Pp Any white-space before the assigned .Ar value is removed; if the value is being appended, a single space is inserted between the previous contents of the variable and the appended value. .Pp Variables are expanded by surrounding the variable name with either curly braces .Pq Ql {} or parentheses .Pq Ql () and preceding it with a dollar sign .Pq Ql \&$ . If the variable name contains only a single letter, the surrounding braces or parentheses are not required. This shorter form is not recommended. .Pp If the variable name contains a dollar, then the name itself is expanded first. This allows almost arbitrary variable names; however, names containing dollar signs, braces, parentheses, or whitespace are best avoided. .Pp If the result of expanding a variable contains a dollar sign .Pq Ql \&$ the string is expanded again. .Pp Variable substitution occurs at three distinct times, depending on where the variable is being used. .Bl -enum .It Variables in dependency lines are expanded as the line is read. .It Variables in shell commands are expanded when the shell command is executed. .It .Dq .for loop index variables are expanded on each loop iteration. Note that other variables are not expanded inside loops, so the following example code: .Bd -literal -offset indent .Dv .for i in 1 2 3 a+= ${i} j= ${i} b+= ${j} .Dv .endfor all: @echo ${a} @echo ${b} .Ed will print: .Bd -literal -offset indent 1 2 3 3 3 3 .Ed This is because, while ${a} contains .Dq 1 2 3 after the loop is executed, ${b} contains .Dq ${j} ${j} ${j} which expands to .Dq 3 3 3 since after the loop completes ${j} contains .Dq 3 . .El .Ss Variable classes The four different classes of variables (in order of increasing precedence) are: .Bl -tag -width Ds .It Environment variables Variables defined as part of .Nm Ns 's environment. .It Global variables Variables defined in the makefile or in included makefiles.
.It Command line variables Variables defined as part of the command line. .It Local variables Variables that are defined specific to a certain target. .El .Pp Local variables are all built in and their values vary magically from target to target. It is not currently possible to define new local variables. The seven local variables are as follows: .Bl -tag -width ".ARCHIVE" -offset indent .It Va .ALLSRC The list of all sources for this target; also known as .Ql Va \&\*[Gt] . .It Va .ARCHIVE The name of the archive file; also known as .Ql Va \&! . .It Va .IMPSRC In suffix-transformation rules, the name/path of the source from which the target is to be transformed (the .Dq implied source); also known as .Ql Va \&\*[Lt] . It is not defined in explicit rules. .It Va .MEMBER The name of the archive member; also known as .Ql Va % . .It Va .OODATE The list of sources for this target that were deemed out-of-date; also known as .Ql Va \&? . .It Va .PREFIX The file prefix of the target, containing only the file portion, no suffix or preceding directory components; also known as .Ql Va * . The suffix must be one of the known suffixes declared with .Ic .SUFFIXES or it will not be recognized. .It Va .TARGET The name of the target; also known as .Ql Va @ . For compatibility with other makes this is an alias for .Ic .ARCHIVE in archive member rules. .El .Pp The shorter forms .Ql ( Va \*[Gt] , .Ql Va \&! , .Ql Va \*[Lt] , .Ql Va % , .Ql Va \&? , .Ql Va * , and .Ql Va @ ) are permitted for backward compatibility with historical makefiles and legacy POSIX make and are not recommended. .Pp Variants of these variables with the punctuation followed immediately by .Ql D or .Ql F , e.g. .Ql Va $(@D) , are legacy forms equivalent to using the .Ql :H and .Ql :T modifiers. These forms are accepted for compatibility with .At V makefiles and POSIX but are not recommended. .Pp Four of the local variables may be used in sources on dependency lines because they expand to the proper value for each target on the line. These variables are .Ql Va .TARGET , .Ql Va .PREFIX , .Ql Va .ARCHIVE , and .Ql Va .MEMBER . .Ss Additional built-in variables In addition, .Nm sets or knows about the following variables: .Bl -tag -width .MAKEOVERRIDES .It Va \&$ A single dollar sign .Ql \&$ , i.e. .Ql \&$$ expands to a single dollar sign. .It Va .ALLTARGETS The list of all targets encountered in the Makefile. If evaluated during Makefile parsing, lists only those targets encountered thus far. .It Va .CURDIR A path to the directory where .Nm was executed. Refer to the description of .Ql Ev PWD for more details. .It Va .INCLUDEDFROMDIR The directory of the file this Makefile was included from. .It Va .INCLUDEDFROMFILE The filename of the file this Makefile was included from. .It Ev MAKE The name that .Nm was executed with .Pq Va argv[0] . For compatibility .Nm also sets .Va .MAKE with the same value. The preferred variable to use is the environment variable .Ev MAKE because it is more compatible with other versions of .Nm and cannot be confused with the special target with the same name. .It Va .MAKE.ALWAYS_PASS_JOB_QUEUE Tells .Nm whether to pass the descriptors of the job token queue even if the target is not tagged with .Ic .MAKE . The default is .Ql Pa yes for backwards compatibility with .Fx 9.0 and earlier. .It Va .MAKE.DEPENDFILE Names the makefile (default .Ql Pa .depend ) from which generated dependencies are read. .It Va .MAKE.EXPAND_VARIABLES A boolean that controls the default behavior of the .Fl V option.
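.Pp
For example (an illustrative sketch; the makefile content is hypothetical):
.Bd -literal -offset indent
FIRST=  foo
FULL=   ${FIRST} bar
.Ed
.Pp
.Ql make -V FULL
prints the value of
.Va FULL
subject to this default, while
.Ql make -dV -V FULL
always prints the raw value
.Ql ${FIRST} bar .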
.It Va .MAKE.EXPORTED The list of variables exported by .Nm . .It Va .MAKE.JOBS The argument to the .Fl j option. .It Va .MAKE.JOB.PREFIX If .Nm is run with .Ar j then output for each target is prefixed with a token .Ql --- target --- the first part of which can be controlled via .Va .MAKE.JOB.PREFIX . If .Va .MAKE.JOB.PREFIX is empty, no token is printed. .br For example: .Li .MAKE.JOB.PREFIX=${.newline}---${.MAKE:T}[${.MAKE.PID}] would produce tokens like .Ql ---make[1234] target --- making it easier to track the degree of parallelism being achieved. .It Ev MAKEFLAGS The environment variable .Ql Ev MAKEFLAGS may contain anything that may be specified on .Nm Ns 's command line. Anything specified on .Nm Ns 's command line is appended to the .Ql Ev MAKEFLAGS variable which is then entered into the environment for all programs which .Nm executes. .It Va .MAKE.LEVEL The recursion depth of .Nm . The initial instance of .Nm has level 0, and an incremented value is put into the environment to be seen by the next generation. This allows tests like: .Li .if ${.MAKE.LEVEL} == 0 to protect things which should only be evaluated in the initial instance of .Nm . .It Va .MAKE.MAKEFILE_PREFERENCE The ordered list of makefile names (default .Ql Pa makefile , .Ql Pa Makefile ) that .Nm will look for. .It Va .MAKE.MAKEFILES The list of makefiles read by .Nm , which is useful for tracking dependencies. Each makefile is recorded only once, regardless of the number of times read. .It Va .MAKE.MODE Processed after reading all makefiles. Can affect the mode that .Nm runs in. It can contain a number of keywords: .Bl -hang -width missing-filemon=bf. .It Pa compat Like .Fl B , puts .Nm into "compat" mode. .It Pa meta Puts .Nm into "meta" mode, where meta files are created for each target to capture the command run, the output generated and, if .Xr filemon 4 is available, the system calls which are of interest to .Nm . The captured output can be very useful when diagnosing errors. .It Pa curdirOk= Ar bf Normally .Nm will not create .meta files in .Ql Va .CURDIR . This can be overridden by setting .Va bf to a value which represents True. .It Pa missing-meta= Ar bf If .Va bf is True, then a missing .meta file makes the target out-of-date. .It Pa missing-filemon= Ar bf If .Va bf is True, then missing filemon data makes the target out-of-date. .It Pa nofilemon Do not use .Xr filemon 4 . .It Pa env For debugging, it can be useful to include the environment in the .meta file. .It Pa verbose If in "meta" mode, print a clue about the target being built. This is useful if the build is otherwise running silently. The message printed is the value of .Va .MAKE.META.PREFIX . .It Pa ignore-cmd Some makefiles have commands which are simply not stable. This keyword causes them to be ignored for determining whether a target is out of date in "meta" mode. See also .Ic .NOMETA_CMP . .It Pa silent= Ar bf If .Va bf is True, when a .meta file is created, mark the target .Ic .SILENT . .El .It Va .MAKE.META.BAILIWICK In "meta" mode, provides a list of prefixes which match the directories controlled by .Nm . If a file that was generated outside of .Va .OBJDIR but within said bailiwick is missing, the current target is considered out-of-date. .It Va .MAKE.META.CREATED In "meta" mode, this variable contains a list of all the meta files updated. If not empty, it can be used to trigger processing of .Va .MAKE.META.FILES . .It Va .MAKE.META.FILES In "meta" mode, this variable contains a list of all the meta files used (updated or not).
This list can be used to process the meta files to extract dependency information. .It Va .MAKE.META.IGNORE_PATHS Provides a list of path prefixes that should be ignored because the contents are expected to change over time. The default list includes: .Ql Pa /dev /etc /proc /tmp /var/run /var/tmp .It Va .MAKE.META.IGNORE_PATTERNS Provides a list of patterns to match against pathnames. Ignore any that match. +.It Va .MAKE.META.IGNORE_FILTER +Provides a list of variable modifiers to apply to each pathname. +Ignore if the expansion is an empty string. .It Va .MAKE.META.PREFIX Defines the message printed for each meta file updated in "meta verbose" mode. The default value is: .Dl Building ${.TARGET:H:tA}/${.TARGET:T} .It Va .MAKEOVERRIDES This variable is used to record the names of variables assigned to on the command line, so that they may be exported as part of .Ql Ev MAKEFLAGS . This behavior can be disabled by assigning an empty value to .Ql Va .MAKEOVERRIDES within a makefile. Extra variables can be exported from a makefile by appending their names to .Ql Va .MAKEOVERRIDES . .Ql Ev MAKEFLAGS is re-exported whenever .Ql Va .MAKEOVERRIDES is modified. .It Va .MAKE.PATH_FILEMON If .Nm was built with .Xr filemon 4 support, this is set to the path of the device node. This allows makefiles to test for this support. .It Va .MAKE.PID The process-id of .Nm . .It Va .MAKE.PPID The parent process-id of .Nm . .It Va .MAKE.SAVE_DOLLARS Its value should be a boolean that controls whether .Ql $$ are preserved when doing .Ql := assignments. The default is false, for backwards compatibility. Set to true for compatibility with other makes. If set to false, .Ql $$ becomes .Ql $ per normal evaluation rules. .It Va MAKE_PRINT_VAR_ON_ERROR When .Nm -stops due to an error, it prints its name and the value of +stops due to an error, it sets +.Ql Va .ERROR_TARGET +to the name of the target that failed, +.Ql Va .ERROR_CMD +to the commands of the failed target, +and in "meta" mode, it also sets +.Ql Va .ERROR_CWD +to the value of +.Xr getcwd 3 , +and +.Ql Va .ERROR_META_FILE +to the path of the meta file (if any) describing the failed target. +It then prints its name and the value of .Ql Va .CURDIR as well as the value of any variables named in .Ql Va MAKE_PRINT_VAR_ON_ERROR . .It Va .newline This variable is simply assigned a newline character as its value. This allows expansions using the .Cm \&:@ modifier to put a newline between iterations of the loop rather than a space. For example, the printing of .Ql Va MAKE_PRINT_VAR_ON_ERROR could be done as ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@}. .It Va .OBJDIR A path to the directory where the targets are built. Its value is determined by trying to .Xr chdir 2 to the following directories in order and using the first match: .Bl -enum .It .Ev ${MAKEOBJDIRPREFIX}${.CURDIR} .Pp (Only if .Ql Ev MAKEOBJDIRPREFIX is set in the environment or on the command line.) .It .Ev ${MAKEOBJDIR} .Pp (Only if .Ql Ev MAKEOBJDIR is set in the environment or on the command line.) .It .Ev ${.CURDIR} Ns Pa /obj. Ns Ev ${MACHINE} .It .Ev ${.CURDIR} Ns Pa /obj .It .Pa /usr/obj/ Ns Ev ${.CURDIR} .It .Ev ${.CURDIR} .El .Pp Variable expansion is performed on the value before it's used, so expressions such as .Dl ${.CURDIR:S,^/usr/src,/var/obj,} may be used. This is especially useful with .Ql Ev MAKEOBJDIR . .Pp .Ql Va .OBJDIR may be modified in the makefile via the special target .Ql Ic .OBJDIR .
In all cases, .Nm will .Xr chdir 2 to the specified directory if it exists, and set .Ql Va .OBJDIR and .Ql Ev PWD to that directory before executing any targets. . .It Va .PARSEDIR A path to the directory of the current .Ql Pa Makefile being parsed. .It Va .PARSEFILE The basename of the current .Ql Pa Makefile being parsed. This variable and .Ql Va .PARSEDIR are both set only while the .Ql Pa Makefiles are being parsed. If you want to retain their current values, assign them to a variable using assignment with expansion: .Pq Ql Cm \&:= . .It Va .PATH A variable that represents the list of directories that .Nm will search for files. The search list should be updated using the target .Ql Va .PATH rather than the variable. .It Ev PWD Alternate path to the current directory. .Nm normally sets .Ql Va .CURDIR to the canonical path given by .Xr getcwd 3 . However, if the environment variable .Ql Ev PWD is set and gives a path to the current directory, then .Nm sets .Ql Va .CURDIR to the value of .Ql Ev PWD instead. This behavior is disabled if .Ql Ev MAKEOBJDIRPREFIX is set or .Ql Ev MAKEOBJDIR contains a variable transform. .Ql Ev PWD is set to the value of .Ql Va .OBJDIR for all programs which .Nm executes. .It Ev .TARGETS The list of targets explicitly specified on the command line, if any. .It Ev VPATH Colon-separated .Pq Dq \&: lists of directories that .Nm will search for files. The variable is supported for compatibility with old make programs only, use .Ql Va .PATH instead. .El .Ss Variable modifiers Variable expansion may be modified to select or modify each word of the variable (where a .Dq word is white-space delimited sequence of characters). The general format of a variable expansion is as follows: .Pp .Dl ${variable[:modifier[:...]]} .Pp Each modifier begins with a colon, which may be escaped with a backslash .Pq Ql \e . .Pp A set of modifiers can be specified via a variable, as follows: .Pp .Dl modifier_variable=modifier[:...] .Dl ${variable:${modifier_variable}[:...]} .Pp In this case the first modifier in the modifier_variable does not start with a colon, since that must appear in the referencing variable. If any of the modifiers in the modifier_variable contain a dollar sign .Pq Ql $ , these must be doubled to avoid early expansion. .Pp The supported modifiers are: .Bl -tag -width EEE .It Cm \&:E Replaces each word in the variable with its suffix. .It Cm \&:H Replaces each word in the variable with everything but the last component. .It Cm \&:M Ns Ar pattern Select only those words that match .Ar pattern . The standard shell wildcard characters .Pf ( Ql * , .Ql \&? , and .Ql Oo Oc ) may be used. The wildcard characters may be escaped with a backslash .Pq Ql \e . As a consequence of the way values are split into words, matched, and then joined, a construct like .Dl ${VAR:M*} will normalize the inter-word spacing, removing all leading and trailing space, and converting multiple consecutive spaces to single spaces. . .It Cm \&:N Ns Ar pattern This is identical to .Ql Cm \&:M , but selects all words which do not match .Ar pattern . .It Cm \&:O Order every word in variable alphabetically. To sort words in reverse order use the .Ql Cm \&:O:[-1..1] combination of modifiers. .It Cm \&:Ox Randomize words in variable. The results will be different each time you are referring to the modified variable; use the assignment with expansion .Pq Ql Cm \&:= to prevent such behavior. 
For example, .Bd -literal -offset indent LIST= uno due tre quattro RANDOM_LIST= ${LIST:Ox} STATIC_RANDOM_LIST:= ${LIST:Ox} all: @echo "${RANDOM_LIST}" @echo "${RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" @echo "${STATIC_RANDOM_LIST}" .Ed may produce output similar to: .Bd -literal -offset indent quattro due tre uno tre due quattro uno due uno quattro tre due uno quattro tre .Ed .It Cm \&:Q Quotes every shell meta-character in the variable, so that it can be passed safely through recursive invocations of .Nm . .It Cm \&:R Replaces each word in the variable with everything but its suffix. .It Cm \&:gmtime The value is a format string for .Xr strftime 3 , using the current .Xr gmtime 3 . .It Cm \&:hash Compute a 32-bit hash of the value and encode it as hex digits. .It Cm \&:localtime The value is a format string for .Xr strftime 3 , using the current .Xr localtime 3 . .It Cm \&:tA Attempt to convert variable to an absolute path using .Xr realpath 3 , if that fails, the value is unchanged. .It Cm \&:tl Converts variable to lower-case letters. .It Cm \&:ts Ns Ar c Words in the variable are normally separated by a space on expansion. This modifier sets the separator to the character .Ar c . If .Ar c is omitted, then no separator is used. The common escapes (including octal numeric codes), work as expected. .It Cm \&:tu Converts variable to upper-case letters. .It Cm \&:tW Causes the value to be treated as a single word (possibly containing embedded white space). See also .Ql Cm \&:[*] . .It Cm \&:tw Causes the value to be treated as a sequence of words delimited by white space. See also .Ql Cm \&:[@] . .Sm off .It Cm \&:S No \&/ Ar old_string No \&/ Ar new_string No \&/ Op Cm 1gW .Sm on Modify the first occurrence of .Ar old_string in the variable's value, replacing it with .Ar new_string . If a .Ql g is appended to the last slash of the pattern, all occurrences in each word are replaced. If a .Ql 1 is appended to the last slash of the pattern, only the first word is affected. If a .Ql W is appended to the last slash of the pattern, then the value is treated as a single word (possibly containing embedded white space). If .Ar old_string begins with a caret .Pq Ql ^ , .Ar old_string is anchored at the beginning of each word. If .Ar old_string ends with a dollar sign .Pq Ql \&$ , it is anchored at the end of each word. Inside .Ar new_string , an ampersand .Pq Ql \*[Am] is replaced by .Ar old_string (without any .Ql ^ or .Ql \&$ ) . Any character may be used as a delimiter for the parts of the modifier string. The anchoring, ampersand and delimiter characters may be escaped with a backslash .Pq Ql \e . .Pp Variable expansion occurs in the normal fashion inside both .Ar old_string and .Ar new_string with the single exception that a backslash is used to prevent the expansion of a dollar sign .Pq Ql \&$ , not a preceding dollar sign as is usual. .Sm off .It Cm \&:C No \&/ Ar pattern No \&/ Ar replacement No \&/ Op Cm 1gW .Sm on The .Cm \&:C modifier is just like the .Cm \&:S modifier except that the old and new strings, instead of being simple strings, are an extended regular expression (see .Xr regex 3 ) string .Ar pattern and an .Xr ed 1 Ns \-style string .Ar replacement . Normally, the first occurrence of the pattern .Ar pattern in each word of the value is substituted with .Ar replacement . 
The .Ql 1 modifier causes the substitution to apply to at most one word; the .Ql g modifier causes the substitution to apply to as many instances of the search pattern .Ar pattern as occur in the word or words it is found in; the .Ql W modifier causes the value to be treated as a single word (possibly containing embedded white space). Note that .Ql 1 and .Ql g are orthogonal; the former specifies whether multiple words are potentially affected, the latter whether multiple substitutions can potentially occur within each affected word. .Pp As for the .Cm \&:S modifier, the .Ar pattern and .Ar replacement are subjected to variable expansion before being parsed as regular expressions. .It Cm \&:T Replaces each word in the variable with its last component. .It Cm \&:u Remove adjacent duplicate words (like .Xr uniq 1 ) . .Sm off .It Cm \&:\&? Ar true_string Cm \&: Ar false_string .Sm on If the variable name (not its value), when parsed as a .if conditional expression, evaluates to true, return as its value the .Ar true_string , otherwise return the .Ar false_string . Since the variable name is used as the expression, \&:\&? must be the first modifier after the variable name itself - which will, of course, usually contain variable expansions. A common error is trying to use expressions like .Dl ${NUMBERS:M42:?match:no} which actually tests defined(NUMBERS); to determine if any words match "42", you need to use something like: .Dl ${"${NUMBERS:M42}" != \&"\&":?match:no} . .It Ar :old_string=new_string This is the .At V style variable substitution. It must be the last modifier specified. If .Ar old_string or .Ar new_string do not contain the pattern matching character .Ar % then it is assumed that they are anchored at the end of each word, so only suffixes or entire words may be replaced. Otherwise .Ar % is the substring of .Ar old_string to be replaced in .Ar new_string . .Pp Variable expansion occurs in the normal fashion inside both .Ar old_string and .Ar new_string with the single exception that a backslash is used to prevent the expansion of a dollar sign .Pq Ql \&$ , not a preceding dollar sign as is usual. .Sm off .It Cm \&:@ Ar temp Cm @ Ar string Cm @ .Sm on This is the loop expansion mechanism from the OSF Development Environment (ODE) make. Unlike .Cm \&.for loops, expansion occurs at the time of reference. Assign .Ar temp to each word in the variable and evaluate .Ar string . The ODE convention is that .Ar temp should start and end with a period. For example: .Dl ${LINKS:@.LINK.@${LN} ${TARGET} ${.LINK.}@} .Pp However, a single-character variable is often more readable: .Dl ${MAKE_PRINT_VAR_ON_ERROR:@v@$v='${$v}'${.newline}@} .It Cm \&:U Ns Ar newval If the variable is undefined .Ar newval is the value. If the variable is defined, the existing value is returned. This is another ODE make feature. It is handy for setting per-target CFLAGS, for instance: .Dl ${_${.TARGET:T}_CFLAGS:U${DEF_CFLAGS}} If a value is only required if the variable is undefined, use: .Dl ${VAR:D:Unewval} .It Cm \&:D Ns Ar newval If the variable is defined .Ar newval is the value. .It Cm \&:L The name of the variable is the value. .It Cm \&:P The path of the node which has the same name as the variable is the value. If no such node exists or its path is null, then the name of the variable is used. In order for this modifier to work, the name (node) must at least have appeared on the rhs of a dependency. .Sm off .It Cm \&:\&! Ar cmd Cm \&! .Sm on The output of running .Ar cmd is the value.
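.Pp
For example (a hedged illustration; the command shown is arbitrary):
.Dl ${X:!date +%Y%m%d!}
expands to the output of
.Xr date 1 ,
regardless of the current value of
.Va X .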
.It Cm \&:sh If the variable is non-empty it is run as a command and the output becomes the new value. .It Cm \&::= Ns Ar str The variable is assigned the value .Ar str after substitution. This modifier and its variations are useful in obscure situations such as wanting to set a variable when shell commands are being parsed. These assignment modifiers always expand to nothing, so if they appear in a rule line by themselves they should be preceded with something to keep .Nm happy. .Pp The .Ql Cm \&:: helps avoid false matches with the .At V style .Cm \&:= modifier and, since substitution always occurs, the .Cm \&::= form is vaguely appropriate. .It Cm \&::?= Ns Ar str As for .Cm \&::= but only if the variable does not already have a value. .It Cm \&::+= Ns Ar str Append .Ar str to the variable. .It Cm \&::!= Ns Ar cmd Assign the output of .Ar cmd to the variable. .It Cm \&:\&[ Ns Ar range Ns Cm \&] Selects one or more words from the value, or performs other operations related to the way in which the value is divided into words. .Pp Ordinarily, a value is treated as a sequence of words delimited by white space. Some modifiers suppress this behavior, causing a value to be treated as a single word (possibly containing embedded white space). An empty value, or a value that consists entirely of white-space, is treated as a single word. For the purposes of the .Ql Cm \&:[] modifier, the words are indexed both forwards using positive integers (where index 1 represents the first word), and backwards using negative integers (where index \-1 represents the last word). .Pp The .Ar range is subjected to variable expansion, and the expanded result is then interpreted as follows: .Bl -tag -width index .\" :[n] .It Ar index Selects a single word from the value. .\" :[start..end] .It Ar start Ns Cm \&.. Ns Ar end Selects all words from .Ar start to .Ar end , inclusive. For example, .Ql Cm \&:[2..-1] selects all words from the second word to the last word. If .Ar start is greater than .Ar end , then the words are output in reverse order. For example, .Ql Cm \&:[-1..1] selects all the words from last to first. .\" :[*] .It Cm \&* Causes subsequent modifiers to treat the value as a single word (possibly containing embedded white space). Analogous to the effect of \&"$*\&" in Bourne shell. .\" :[0] .It 0 Means the same as .Ql Cm \&:[*] . .\" :[@] .It Cm \&@ Causes subsequent modifiers to treat the value as a sequence of words delimited by white space. Analogous to the effect of \&"$@\&" in Bourne shell. .\" :[#] .It Cm \&# Returns the number of words in the value. .El \" :[range] .El .Sh INCLUDE STATEMENTS, CONDITIONALS AND FOR LOOPS Makefile inclusion, conditional structures and for loops reminiscent of the C programming language are provided in .Nm . All such structures are identified by a line beginning with a single dot .Pq Ql \&. character. Files are included with either .Cm \&.include Aq Ar file or .Cm \&.include Pf \*q Ar file Ns \*q . Variables between the angle brackets or double quotes are expanded to form the file name. If angle brackets are used, the included makefile is expected to be in the system makefile directory. If double quotes are used, the including makefile's directory and any directories specified using the .Fl I option are searched before the system makefile directory. For compatibility with other versions of .Nm .Ql include file ... is also accepted. .Pp If the include statement is written as .Cm .-include or as .Cm .sinclude then errors locating and/or opening include files are ignored.
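.Pp
For example (an illustrative sketch; the file names are hypothetical):
.Bd -literal -offset indent
\&.include <sys.mk>
\&.include "${.CURDIR}/local.mk"
\&.sinclude "optional.mk"
.Ed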
.Pp If the include statement is written as .Cm .dinclude not only are errors locating and/or opening include files ignored, but stale dependencies within the included file will be ignored just like .Va .MAKE.DEPENDFILE . .Pp Conditional expressions are also preceded by a single dot as the first character of a line. The possible conditionals are as follows: .Bl -tag -width Ds .It Ic .error Ar message The message is printed along with the name of the makefile and line number, then .Nm will exit. .It Ic .export Ar variable ... Export the specified global variable. If no variable list is provided, all globals are exported except for internal variables (those that start with .Ql \&. ) . This is not affected by the .Fl X flag, so should be used with caution. For compatibility with other .Nm programs .Ql export variable=value is also accepted. .Pp Appending a variable name to .Va .MAKE.EXPORTED is equivalent to exporting a variable. .It Ic .export-env Ar variable ... The same as .Ql .export , except that the variable is not appended to .Va .MAKE.EXPORTED . This allows exporting a value to the environment which is different from that used by .Nm internally. .It Ic .export-literal Ar variable ... The same as .Ql .export-env , except that variables in the value are not expanded. .It Ic .info Ar message The message is printed along with the name of the makefile and line number. .It Ic .undef Ar variable Un-define the specified global variable. Only global variables may be un-defined. .It Ic .unexport Ar variable ... The opposite of .Ql .export . The specified global .Va variable will be removed from .Va .MAKE.EXPORTED . If no variable list is provided, all globals are unexported, and .Va .MAKE.EXPORTED deleted. .It Ic .unexport-env Unexport all globals previously exported and clear the environment inherited from the parent. This operation will cause a memory leak of the original environment, so should be used sparingly. Testing for .Va .MAKE.LEVEL being 0, would make sense. Also note that any variables which originated in the parent environment should be explicitly preserved if desired. For example: .Bd -literal -offset indent .Li .if ${.MAKE.LEVEL} == 0 PATH := ${PATH} .Li .unexport-env .Li .export PATH .Li .endif .Pp .Ed Would result in an environment containing only .Ql Ev PATH , which is the minimal useful environment. Actually .Ql Ev .MAKE.LEVEL will also be pushed into the new environment. .It Ic .warning Ar message The message prefixed by .Ql Pa warning: is printed along with the name of the makefile and line number. .It Ic \&.if Oo \&! Oc Ns Ar expression Op Ar operator expression ... Test the value of an expression. .It Ic .ifdef Oo \&! Oc Ns Ar variable Op Ar operator variable ... Test the value of a variable. .It Ic .ifndef Oo \&! Oc Ns Ar variable Op Ar operator variable ... Test the value of a variable. .It Ic .ifmake Oo \&! Oc Ns Ar target Op Ar operator target ... Test the target being built. .It Ic .ifnmake Oo \&! Ns Oc Ar target Op Ar operator target ... Test the target being built. .It Ic .else Reverse the sense of the last conditional. .It Ic .elif Oo \&! Ns Oc Ar expression Op Ar operator expression ... A combination of .Ql Ic .else followed by .Ql Ic .if . .It Ic .elifdef Oo \&! Oc Ns Ar variable Op Ar operator variable ... A combination of .Ql Ic .else followed by .Ql Ic .ifdef . .It Ic .elifndef Oo \&! Oc Ns Ar variable Op Ar operator variable ... A combination of .Ql Ic .else followed by .Ql Ic .ifndef . .It Ic .elifmake Oo \&! Oc Ns Ar target Op Ar operator target ... 
A combination of .Ql Ic .else followed by .Ql Ic .ifmake . .It Ic .elifnmake Oo \&! Oc Ns Ar target Op Ar operator target ... A combination of .Ql Ic .else followed by .Ql Ic .ifnmake . .It Ic .endif End the body of the conditional. .El .Pp The .Ar operator may be any one of the following: .Bl -tag -width "Cm XX" .It Cm \&|\&| Logical OR. .It Cm \&\*[Am]\*[Am] Logical .Tn AND ; of higher precedence than .Dq \&|\&| . .El .Pp As in C, .Nm will only evaluate a conditional as far as is necessary to determine its value. Parentheses may be used to change the order of evaluation. The boolean operator .Ql Ic \&! may be used to logically negate an entire conditional. It is of higher precedence than .Ql Ic \&\*[Am]\*[Am] . .Pp The value of .Ar expression may be any of the following: .Bl -tag -width defined .It Ic defined Takes a variable name as an argument and evaluates to true if the variable has been defined. .It Ic make Takes a target name as an argument and evaluates to true if the target was specified as part of .Nm Ns 's command line or was declared the default target (either implicitly or explicitly, see .Va .MAIN ) before the line containing the conditional. .It Ic empty Takes a variable, with possible modifiers, and evaluates to true if the expansion of the variable would result in an empty string. .It Ic exists Takes a file name as an argument and evaluates to true if the file exists. The file is searched for on the system search path (see .Va .PATH ) . .It Ic target Takes a target name as an argument and evaluates to true if the target has been defined. .It Ic commands Takes a target name as an argument and evaluates to true if the target has been defined and has commands associated with it. .El .Pp .Ar Expression may also be an arithmetic or string comparison. Variable expansion is performed on both sides of the comparison, after which the integral values are compared. A value is interpreted as hexadecimal if it is preceded by 0x, otherwise it is decimal; octal numbers are not supported. The standard C relational operators are all supported. If after variable expansion, either the left or right hand side of a .Ql Ic == or .Ql Ic "!=" operator is not an integral value, then string comparison is performed between the expanded variables. If no relational operator is given, it is assumed that the expanded variable is being compared against 0 or an empty string in the case of a string comparison. .Pp When .Nm is evaluating one of these conditional expressions, and it encounters a (white-space separated) word it doesn't recognize, either the .Dq make or .Dq defined expression is applied to it, depending on the form of the conditional. If the form is .Ql Ic .ifdef , .Ql Ic .ifndef , or .Ql Ic .if the .Dq defined expression is applied. Similarly, if the form is .Ql Ic .ifmake or .Ql Ic .ifnmake , the .Dq make expression is applied. .Pp If the conditional evaluates to true the parsing of the makefile continues as before. If it evaluates to false, the following lines are skipped. In both cases this continues until a .Ql Ic .else or .Ql Ic .endif is found. .Pp For loops are typically used to apply a set of rules to a list of files. The syntax of a for loop is: .Pp .Bl -tag -compact -width Ds .It Ic \&.for Ar variable Oo Ar variable ... Oc Ic in Ar expression .It Aq make-rules .It Ic \&.endfor .El .Pp After the for .Ic expression is evaluated, it is split into words. 
On each iteration of the loop, one word is taken and assigned to each .Ic variable , in order, and these .Ic variables are substituted into the .Ic make-rules inside the body of the for loop. The number of words must come out even; that is, if there are three iteration variables, the number of words provided must be a multiple of three. .Sh COMMENTS Comments begin with a hash .Pq Ql \&# character, anywhere but in a shell command line, and continue to the end of an unescaped new line. .Sh SPECIAL SOURCES (ATTRIBUTES) .Bl -tag -width .IGNOREx .It Ic .EXEC Target is never out of date, but always execute commands anyway. .It Ic .IGNORE Ignore any errors from the commands associated with this target, exactly as if they all were preceded by a dash .Pq Ql \- . .\" .It Ic .INVISIBLE .\" XXX .\" .It Ic .JOIN .\" XXX .It Ic .MADE Mark all sources of this target as being up-to-date. .It Ic .MAKE Execute the commands associated with this target even if the .Fl n or .Fl t options were specified. Normally used to mark recursive .Nm Ns s . .It Ic .META Create a meta file for the target, even if it is flagged as .Ic .PHONY , .Ic .MAKE , or .Ic .SPECIAL . Usage in conjunction with .Ic .MAKE is the most likely case. In "meta" mode, the target is out-of-date if the meta file is missing. .It Ic .NOMETA Do not create a meta file for the target. Meta files are also not created for .Ic .PHONY , .Ic .MAKE , or .Ic .SPECIAL targets. .It Ic .NOMETA_CMP Ignore differences in commands when deciding if target is out of date. This is useful if the command contains a value which always changes. If the number of commands change, though, the target will still be out of date. The same effect applies to any command line that uses the variable .Va .OODATE , which can be used for that purpose even when not otherwise needed or desired: .Bd -literal -offset indent skip-compare-for-some: @echo this will be compared @echo this will not ${.OODATE:M.NOMETA_CMP} @echo this will also be compared .Ed The .Cm \&:M pattern suppresses any expansion of the unwanted variable. .It Ic .NOPATH Do not search for the target in the directories specified by .Ic .PATH . .It Ic .NOTMAIN Normally .Nm selects the first target it encounters as the default target to be built if no target was specified. This source prevents this target from being selected. .It Ic .OPTIONAL If a target is marked with this attribute and .Nm can't figure out how to create it, it will ignore this fact and assume the file isn't needed or already exists. .It Ic .PHONY The target does not correspond to an actual file; it is always considered to be out of date, and will not be created with the .Fl t option. Suffix-transformation rules are not applied to .Ic .PHONY targets. .It Ic .PRECIOUS When .Nm is interrupted, it normally removes any partially made targets. This source prevents the target from being removed. .It Ic .RECURSIVE Synonym for .Ic .MAKE . .It Ic .SILENT Do not echo any of the commands associated with this target, exactly as if they all were preceded by an at sign .Pq Ql @ . .It Ic .USE Turn the target into .Nm Ns 's version of a macro. When the target is used as a source for another target, the other target acquires the commands, sources, and attributes (except for .Ic .USE ) of the source. If the target already has commands, the .Ic .USE target's commands are appended to them. .It Ic .USEBEFORE Exactly like .Ic .USE , but prepend the .Ic .USEBEFORE target commands to the target. 
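.Pp
For example (an illustrative sketch; the target names are hypothetical):
.Bd -literal -offset indent
announce: .USEBEFORE
	@echo building ${.TARGET}

prog: announce
	cc \-o prog prog.c
.Ed
.Pp
Here
.Ql prog
runs the
.Ql announce
commands before its own.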
.It Ic .WAIT If .Ic .WAIT appears in a dependency line, the sources that precede it are made before the sources that succeed it in the line. Since the dependents of files are not made until the file itself could be made, this also stops the dependents being built unless they are needed for another branch of the dependency tree. So given: .Bd -literal x: a .WAIT b echo x a: echo a b: b1 echo b b1: echo b1 .Ed the output is always .Ql a , .Ql b1 , .Ql b , .Ql x . .br The ordering imposed by .Ic .WAIT is only relevant for parallel makes. .El .Sh SPECIAL TARGETS Special targets may not be included with other targets, i.e. they must be the only target specified. .Bl -tag -width .BEGINx .It Ic .BEGIN Any command lines attached to this target are executed before anything else is done. .It Ic .DEFAULT This is sort of a .Ic .USE rule for any target (that was used only as a source) that .Nm can't figure out any other way to create. Only the shell script is used. The .Ic .IMPSRC variable of a target that inherits .Ic .DEFAULT Ns 's commands is set to the target's own name. .It Ic .END Any command lines attached to this target are executed after everything else is done. .It Ic .ERROR Any command lines attached to this target are executed when another target fails. The .Ic .ERROR_TARGET variable is set to the target that failed. See also .Ic MAKE_PRINT_VAR_ON_ERROR . .It Ic .IGNORE Mark each of the sources with the .Ic .IGNORE attribute. If no sources are specified, this is the equivalent of specifying the .Fl i option. .It Ic .INTERRUPT If .Nm is interrupted, the commands for this target will be executed. .It Ic .MAIN If no target is specified when .Nm is invoked, this target will be built. .It Ic .MAKEFLAGS This target provides a way to specify flags for .Nm when the makefile is used. The flags are as if typed to the shell, though the .Fl f option will have no effect. .\" XXX: NOT YET!!!! .\" .It Ic .NOTPARALLEL .\" The named targets are executed in non parallel mode. .\" If no targets are .\" specified, then all targets are executed in non parallel mode. .It Ic .NOPATH Apply the .Ic .NOPATH attribute to any specified sources. .It Ic .NOTPARALLEL Disable parallel mode. .It Ic .NO_PARALLEL Synonym for .Ic .NOTPARALLEL , for compatibility with other pmake variants. .It Ic .OBJDIR The source is a new value for .Ql Va .OBJDIR . If it exists, .Nm will .Xr chdir 2 to it and update the value of .Ql Va .OBJDIR . .It Ic .ORDER The named targets are made in sequence. This ordering does not add targets to the list of targets to be made. Since the dependents of a target do not get built until the target itself could be built, unless .Ql a is built by another part of the dependency graph, the following is a dependency loop: .Bd -literal \&.ORDER: b a b: a .Ed .Pp The ordering imposed by .Ic .ORDER is only relevant for parallel makes. .\" XXX: NOT YET!!!! .\" .It Ic .PARALLEL .\" The named targets are executed in parallel mode. .\" If no targets are .\" specified, then all targets are executed in parallel mode. .It Ic .PATH The sources are directories which are to be searched for files not found in the current directory. If no sources are specified, any previously specified directories are deleted. If the source is the special .Ic .DOTLAST target, then the current working directory is searched last. .It Ic .PATH. Ns Va suffix Like .Ic .PATH but applies only to files with a particular suffix. The suffix must have been previously declared with .Ic .SUFFIXES . 
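.Pp
For example (an illustrative sketch), to search a sibling directory for
.Pa .c
files only:
.Bd -literal -offset indent
\&.SUFFIXES: .c
\&.PATH.c: ${.CURDIR}/../common
.Ed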
.It Ic .PHONY Apply the .Ic .PHONY attribute to any specified sources. .It Ic .PRECIOUS Apply the .Ic .PRECIOUS attribute to any specified sources. If no sources are specified, the .Ic .PRECIOUS attribute is applied to every target in the file. .It Ic .SHELL Sets the shell that .Nm will use to execute commands. The sources are a set of .Ar field=value pairs. .Bl -tag -width hasErrCtls .It Ar name This is the minimal specification, used to select one of the built-in shell specs; .Ar sh , .Ar ksh , and .Ar csh . .It Ar path Specifies the path to the shell. .It Ar hasErrCtl Indicates whether the shell supports exit on error. .It Ar check The command to turn on error checking. .It Ar ignore The command to disable error checking. .It Ar echo The command to turn on echoing of commands executed. .It Ar quiet The command to turn off echoing of commands executed. .It Ar filter The output to filter after issuing the .Ar quiet command. It is typically identical to .Ar quiet . .It Ar errFlag The flag to pass the shell to enable error checking. .It Ar echoFlag The flag to pass the shell to enable command echoing. .It Ar newline The string literal to pass the shell that results in a single newline character when used outside of any quoting characters. .El Example: .Bd -literal \&.SHELL: name=ksh path=/bin/ksh hasErrCtl=true \e check="set \-e" ignore="set +e" \e echo="set \-v" quiet="set +v" filter="set +v" \e echoFlag=v errFlag=e newline="'\en'" .Ed .It Ic .SILENT Apply the .Ic .SILENT attribute to any specified sources. If no sources are specified, the .Ic .SILENT attribute is applied to every command in the file. .It Ic .STALE This target gets run when a dependency file contains stale entries, having .Va .ALLSRC set to the name of that dependency file. .It Ic .SUFFIXES Each source specifies a suffix to .Nm . If no sources are specified, any previously specified suffixes are deleted. It allows the creation of suffix-transformation rules. .Pp Example: .Bd -literal \&.SUFFIXES: .o \&.c.o: cc \-o ${.TARGET} \-c ${.IMPSRC} .Ed .El .Sh ENVIRONMENT .Nm uses the following environment variables, if they exist: .Ev MACHINE , .Ev MACHINE_ARCH , .Ev MAKE , .Ev MAKEFLAGS , .Ev MAKEOBJDIR , .Ev MAKEOBJDIRPREFIX , .Ev MAKESYSPATH , .Ev PWD , and .Ev TMPDIR . .Pp .Ev MAKEOBJDIRPREFIX and .Ev MAKEOBJDIR may only be set in the environment or on the command line to .Nm and not as makefile variables; see the description of .Ql Va .OBJDIR for more details. .Sh FILES .Bl -tag -width /usr/share/mk -compact .It .depend list of dependencies .It Makefile list of dependencies .It makefile list of dependencies .It sys.mk system makefile .It /usr/share/mk system makefile directory .El .Sh COMPATIBILITY The basic make syntax is compatible between different versions of make; however the special variables, variable modifiers and conditionals are not. .Ss Older versions An incomplete list of changes in older versions of .Nm : .Pp The way that .for loop variables are substituted changed after .Nx 5.0 so that they still appear to be variable expansions. In particular this stops them being treated as syntax, and removes some obscure problems using them in .if statements. .Pp The way that parallel makes are scheduled changed in .Nx 4.0 so that .ORDER and .WAIT apply recursively to the dependent nodes. The algorithms used may change again in the future. .Ss Other make dialects Other make dialects (GNU make, SVR4 make, POSIX make, etc.) do not support most of the features of .Nm as described in this manual. 
Most notably: .Bl -bullet -offset indent .It The .Ic .WAIT and .Ic .ORDER declarations and most functionality pertaining to parallelization. (GNU make supports parallelization but lacks these features needed to control it effectively.) .It Directives, including for loops and conditionals and most of the forms of include files. (GNU make has its own incompatible and less powerful syntax for conditionals.) .It All built-in variables that begin with a dot. .It Most of the special sources and targets that begin with a dot, with the notable exception of .Ic .PHONY , .Ic .PRECIOUS , and .Ic .SUFFIXES . .It Variable modifiers, except for the .Dl :old=new string substitution, which does not portably support globbing with .Ql % and historically only works on declared suffixes. .It The .Ic $> variable even in its short form; most makes support this functionality but its name varies. .El .Pp Some features are somewhat more portable, such as assignment with .Ic += , .Ic ?= , and .Ic != . The .Ic .PATH functionality is based on an older feature .Ic VPATH found in GNU make and many versions of SVR4 make; however, historically its behavior is too ill-defined (and too buggy) to rely upon. .Pp The .Ic $@ and .Ic $< variables are more or less universally portable, as is the .Ic $(MAKE) variable. Basic use of suffix rules (for files only in the current directory, not trying to chain transformations together, etc.) is also reasonably portable. .Sh SEE ALSO .Xr mkdep 1 .Sh HISTORY A .Nm command appeared in .At v7 . This .Nm implementation is based on Adam De Boor's pmake program which was written for Sprite at Berkeley. It was designed to be a parallel distributed make running jobs on different machines using a daemon called .Dq customs . .Pp Historically the target/dependency .Dq FRC has been used to FoRCe rebuilding (since the target/dependency does not exist... unless someone creates an .Dq FRC file). .Sh BUGS The .Nm syntax is difficult to parse without actually acting on the data. For instance finding the end of a variable use should involve scanning each of the modifiers using the correct terminator for each field. In many places .Nm just counts {} and () in order to find the end of a variable expansion. .Pp There is no way of escaping a space character in a filename. Index: projects/clang390-import/contrib/bmake/meta.c =================================================================== --- projects/clang390-import/contrib/bmake/meta.c (revision 305686) +++ projects/clang390-import/contrib/bmake/meta.c (revision 305687) @@ -1,1555 +1,1633 @@ -/* $NetBSD: meta.c,v 1.61 2016/06/07 00:40:00 sjg Exp $ */ +/* $NetBSD: meta.c,v 1.67 2016/08/17 15:52:42 sjg Exp $ */ /* * Implement 'meta' mode. * Adapted from John Birrell's patches to FreeBSD make. * --sjg */ /* * Copyright (c) 2009-2016, Juniper Networks, Inc. * Portions Copyright (c) 2009, John Birrell. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution.
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #if defined(USE_META) #ifdef HAVE_CONFIG_H # include "config.h" #endif #include <sys/stat.h> #include <sys/ioctl.h> #ifdef HAVE_LIBGEN_H #include <libgen.h> #elif !defined(HAVE_DIRNAME) char * dirname(char *); #endif #include <errno.h> #if !defined(HAVE_CONFIG_H) || defined(HAVE_ERR_H) #include <err.h> #endif #include "make.h" #include "job.h" #ifdef HAVE_FILEMON_H # include <filemon.h> #endif #if !defined(USE_FILEMON) && defined(FILEMON_SET_FD) # define USE_FILEMON #endif static BuildMon Mybm; /* for compat */ static Lst metaBailiwick; /* our scope of control */ static char *metaBailiwickStr; /* string storage for the list */ static Lst metaIgnorePaths; /* paths we deliberately ignore */ static char *metaIgnorePathsStr; /* string storage for the list */ #ifndef MAKE_META_IGNORE_PATHS #define MAKE_META_IGNORE_PATHS ".MAKE.META.IGNORE_PATHS" #endif #ifndef MAKE_META_IGNORE_PATTERNS #define MAKE_META_IGNORE_PATTERNS ".MAKE.META.IGNORE_PATTERNS" #endif +#ifndef MAKE_META_IGNORE_FILTER +#define MAKE_META_IGNORE_FILTER ".MAKE.META.IGNORE_FILTER" +#endif Boolean useMeta = FALSE; static Boolean useFilemon = FALSE; static Boolean writeMeta = FALSE; static Boolean metaMissing = FALSE; /* oodate if missing */ static Boolean filemonMissing = FALSE; /* oodate if missing */ static Boolean metaEnv = FALSE; /* don't save env unless asked */ static Boolean metaVerbose = FALSE; static Boolean metaIgnoreCMDs = FALSE; /* ignore CMDs in .meta files */ static Boolean metaIgnorePatterns = FALSE; /* do we need to do pattern matches */ +static Boolean metaIgnoreFilter = FALSE; /* do we have more complex filtering? */ static Boolean metaCurdirOk = FALSE; /* write .meta in .CURDIR Ok? */ static Boolean metaSilent = FALSE; /* if we have a .meta be SILENT */ extern Boolean forceJobs; extern Boolean compatMake; extern char **environ; #define MAKE_META_PREFIX ".MAKE.META.PREFIX" #ifndef N2U # define N2U(n, u) (((n) + ((u) - 1)) / (u)) #endif #ifndef ROUNDUP # define ROUNDUP(n, u) (N2U((n), (u)) * (u)) #endif #if !defined(HAVE_STRSEP) # define strsep(s, d) stresep((s), (d), 0) #endif /* * Filemon is a kernel module which snoops certain syscalls. * * C chdir * E exec * F [v]fork * L [sym]link * M rename * R read * W write * S stat * * See meta_oodate below - we mainly care about 'E' and 'R'. * * We can still use meta mode without filemon, but * the benefits are more limited. */ #ifdef USE_FILEMON # ifndef _PATH_FILEMON # define _PATH_FILEMON "/dev/filemon" # endif /* * Open the filemon device.
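 * Note: the descriptor opened here is handed to the FILEMON_SET_FD ioctl
 * below, and meta_job_child() later registers the child process with the
 * FILEMON_SET_PID ioctl.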
*/ static void filemon_open(BuildMon *pbm) { int retry; pbm->mon_fd = pbm->filemon_fd = -1; if (!useFilemon) return; for (retry = 5; retry >= 0; retry--) { if ((pbm->filemon_fd = open(_PATH_FILEMON, O_RDWR)) >= 0) break; } if (pbm->filemon_fd < 0) { useFilemon = FALSE; warn("Could not open %s", _PATH_FILEMON); return; } /* * We use a file outside of '.' * to avoid a FreeBSD kernel bug where unlink invalidates * cwd causing getcwd to do a lot more work. * We only care about the descriptor. */ pbm->mon_fd = mkTempFile("filemon.XXXXXX", NULL); if (ioctl(pbm->filemon_fd, FILEMON_SET_FD, &pbm->mon_fd) < 0) { err(1, "Could not set filemon file descriptor!"); } /* we don't need these once we exec */ (void)fcntl(pbm->mon_fd, F_SETFD, FD_CLOEXEC); (void)fcntl(pbm->filemon_fd, F_SETFD, FD_CLOEXEC); } /* * Read the build monitor output file and write records to the target's * metadata file. */ static int filemon_read(FILE *mfp, int fd) { char buf[BUFSIZ]; int n; int error; /* Check if we're not writing to a meta data file.*/ if (mfp == NULL) { if (fd >= 0) close(fd); /* not interested */ return 0; } /* rewind */ (void)lseek(fd, (off_t)0, SEEK_SET); error = 0; fprintf(mfp, "\n-- filemon acquired metadata --\n"); while ((n = read(fd, buf, sizeof(buf))) > 0) { if ((int)fwrite(buf, 1, n, mfp) < n) error = EIO; } fflush(mfp); if (close(fd) < 0) error = errno; return error; } #endif /* * when realpath() fails, * we use this, to clean up ./ and ../ */ static void eat_dots(char *buf, size_t bufsz, int dots) { char *cp; char *cp2; const char *eat; size_t eatlen; switch (dots) { case 1: eat = "/./"; eatlen = 2; break; case 2: eat = "/../"; eatlen = 3; break; default: return; } do { cp = strstr(buf, eat); if (cp) { cp2 = cp + eatlen; if (dots == 2 && cp > buf) { do { cp--; } while (cp > buf && *cp != '/'); } if (*cp == '/') { strlcpy(cp, cp2, bufsz - (cp - buf)); } else { return; /* can't happen? */ } } } while (cp); } static char * meta_name(struct GNode *gn, char *mname, size_t mnamelen, const char *dname, const char *tname, const char *cwd) { char buf[MAXPATHLEN]; char *rp; char *cp; char *tp; /* * Weed out relative paths from the target file name. * We have to be careful though since if target is a * symlink, the result will be unstable. * So we use realpath() just to get the dirname, and leave the * basename as given to us. */ if ((cp = strrchr(tname, '/'))) { if (cached_realpath(tname, buf)) { if ((rp = strrchr(buf, '/'))) { rp++; cp++; if (strcmp(cp, rp) != 0) strlcpy(rp, cp, sizeof(buf) - (rp - buf)); } tname = buf; } else { /* * We likely have a directory which is about to be made. * We pretend realpath() succeeded, to have a chance * of generating the same meta file name that we will * next time through. */ if (tname[0] == '/') { strlcpy(buf, tname, sizeof(buf)); } else { snprintf(buf, sizeof(buf), "%s/%s", cwd, tname); } eat_dots(buf, sizeof(buf), 1); /* ./ */ eat_dots(buf, sizeof(buf), 2); /* ../ */ tname = buf; } } /* on some systems dirname may modify its arg */ tp = bmake_strdup(tname); if (strcmp(dname, dirname(tp)) == 0) snprintf(mname, mnamelen, "%s.meta", tname); else { snprintf(mname, mnamelen, "%s/%s.meta", dname, tname); /* * Replace path separators in the file name after the * current object directory path. 
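 * That is, every '/' after the object directory prefix becomes '_',
 * so a target outside .OBJDIR still gets a single, flat .meta file
 * name inside .OBJDIR.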
*/ cp = mname + strlen(dname) + 1; while (*cp != '\0') { if (*cp == '/') *cp = '_'; cp++; } } free(tp); return (mname); } /* * Return true if running ${.MAKE} * Bypassed if target is flagged .MAKE */ static int is_submake(void *cmdp, void *gnp) { static char *p_make = NULL; static int p_len; char *cmd = cmdp; GNode *gn = gnp; char *mp = NULL; char *cp; char *cp2; int rc = 0; /* keep looking */ if (!p_make) { p_make = Var_Value(".MAKE", gn, &cp); p_len = strlen(p_make); } cp = strchr(cmd, '$'); if ((cp)) { mp = Var_Subst(NULL, cmd, gn, VARF_WANTRES); cmd = mp; } cp2 = strstr(cmd, p_make); if ((cp2)) { switch (cp2[p_len]) { case '\0': case ' ': case '\t': case '\n': rc = 1; break; } if (cp2 > cmd && rc > 0) { switch (cp2[-1]) { case ' ': case '\t': case '\n': break; default: rc = 0; /* no match */ break; } } } free(mp); return (rc); } typedef struct meta_file_s { FILE *fp; GNode *gn; } meta_file_t; static int printCMD(void *cmdp, void *mfpp) { meta_file_t *mfp = mfpp; char *cmd = cmdp; char *cp = NULL; if (strchr(cmd, '$')) { cmd = cp = Var_Subst(NULL, cmd, mfp->gn, VARF_WANTRES); } fprintf(mfp->fp, "CMD %s\n", cmd); free(cp); return 0; } /* * Certain node types never get a .meta file */ #define SKIP_META_TYPE(_type) do { \ if ((gn->type & __CONCAT(OP_, _type))) { \ if (verbose) { \ fprintf(debug_file, "Skipping meta for %s: .%s\n", \ gn->name, __STRING(_type)); \ } \ return FALSE; \ } \ } while (0) /* * Do we need/want a .meta file ? */ static Boolean meta_needed(GNode *gn, const char *dname, const char *tname, char *objdir, int verbose) { struct stat fs; if (verbose) verbose = DEBUG(META); /* This may be a phony node which we don't want meta data for... */ /* Skip .meta for .BEGIN, .END, .ERROR etc as well. */ /* Or it may be explicitly flagged as .NOMETA */ SKIP_META_TYPE(NOMETA); /* Unless it is explicitly flagged as .META */ if (!(gn->type & OP_META)) { SKIP_META_TYPE(PHONY); SKIP_META_TYPE(SPECIAL); SKIP_META_TYPE(MAKE); } /* Check if there are no commands to execute. */ if (Lst_IsEmpty(gn->commands)) { if (verbose) fprintf(debug_file, "Skipping meta for %s: no commands\n", gn->name); return FALSE; } if ((gn->type & (OP_META|OP_SUBMAKE)) == OP_SUBMAKE) { /* OP_SUBMAKE is a bit too aggressive */ if (Lst_ForEach(gn->commands, is_submake, gn)) { if (DEBUG(META)) fprintf(debug_file, "Skipping meta for %s: .SUBMAKE\n", gn->name); return FALSE; } } /* The object directory may not exist. Check it.. */ if (cached_stat(dname, &fs) != 0) { if (verbose) fprintf(debug_file, "Skipping meta for %s: no .OBJDIR\n", gn->name); return FALSE; } /* make sure these are canonical */ if (cached_realpath(dname, objdir)) dname = objdir; /* If we aren't in the object directory, don't create a meta file. 
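 * (That is, .OBJDIR and .CURDIR are the same directory; this check can
 * be relaxed with the 'curdirok=' token parsed by meta_mode_init() below.)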
*/ if (!metaCurdirOk && strcmp(curdir, dname) == 0) { if (verbose) fprintf(debug_file, "Skipping meta for %s: .OBJDIR == .CURDIR\n", gn->name); return FALSE; } return TRUE; } static FILE * meta_create(BuildMon *pbm, GNode *gn) { meta_file_t mf; char buf[MAXPATHLEN]; char objdir[MAXPATHLEN]; char **ptr; const char *dname; const char *tname; char *fname; const char *cp; char *p[4]; /* >= possible uses */ int i; mf.fp = NULL; i = 0; dname = Var_Value(".OBJDIR", gn, &p[i++]); tname = Var_Value(TARGET, gn, &p[i++]); /* if this succeeds objdir is realpath of dname */ if (!meta_needed(gn, dname, tname, objdir, TRUE)) goto out; dname = objdir; if (metaVerbose) { char *mp; /* Describe the target we are building */ mp = Var_Subst(NULL, "${" MAKE_META_PREFIX "}", gn, VARF_WANTRES); if (*mp) fprintf(stdout, "%s\n", mp); free(mp); } /* Get the basename of the target */ if ((cp = strrchr(tname, '/')) == NULL) { cp = tname; } else { cp++; } fflush(stdout); if (!writeMeta) /* Don't create meta data. */ goto out; fname = meta_name(gn, pbm->meta_fname, sizeof(pbm->meta_fname), dname, tname, objdir); #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "meta_create: %s\n", fname); #endif if ((mf.fp = fopen(fname, "w")) == NULL) err(1, "Could not open meta file '%s'", fname); fprintf(mf.fp, "# Meta data file %s\n", fname); mf.gn = gn; Lst_ForEach(gn->commands, printCMD, &mf); fprintf(mf.fp, "CWD %s\n", getcwd(buf, sizeof(buf))); fprintf(mf.fp, "TARGET %s\n", tname); if (metaEnv) { for (ptr = environ; *ptr != NULL; ptr++) fprintf(mf.fp, "ENV %s\n", *ptr); } fprintf(mf.fp, "-- command output --\n"); fflush(mf.fp); Var_Append(".MAKE.META.FILES", fname, VAR_GLOBAL); Var_Append(".MAKE.META.CREATED", fname, VAR_GLOBAL); gn->type |= OP_META; /* in case anyone wants to know */ if (metaSilent) { gn->type |= OP_SILENT; } out: for (i--; i >= 0; i--) { free(p[i]); } return (mf.fp); } static Boolean boolValue(char *s) { switch(*s) { case '0': case 'N': case 'n': case 'F': case 'f': return FALSE; } return TRUE; } /* * Initialization we need before reading makefiles. */ void meta_init(void) { #ifdef USE_FILEMON /* this allows makefiles to test if we have filemon support */ Var_Set(".MAKE.PATH_FILEMON", _PATH_FILEMON, VAR_GLOBAL, 0); #endif } #define get_mode_bf(bf, token) \ if ((cp = strstr(make_mode, token))) \ bf = boolValue(&cp[sizeof(token) - 1]) /* * Initialization we need after reading makefiles. */ void meta_mode_init(const char *make_mode) { static int once = 0; char *cp; useMeta = TRUE; useFilemon = TRUE; writeMeta = TRUE; if (make_mode) { if (strstr(make_mode, "env")) metaEnv = TRUE; if (strstr(make_mode, "verb")) metaVerbose = TRUE; if (strstr(make_mode, "read")) writeMeta = FALSE; if (strstr(make_mode, "nofilemon")) useFilemon = FALSE; if (strstr(make_mode, "ignore-cmd")) metaIgnoreCMDs = TRUE; if (useFilemon) get_mode_bf(filemonMissing, "missing-filemon="); get_mode_bf(metaCurdirOk, "curdirok="); get_mode_bf(metaMissing, "missing-meta="); get_mode_bf(metaSilent, "silent="); } if (metaVerbose && !Var_Exists(MAKE_META_PREFIX, VAR_GLOBAL)) { /* * The default value for MAKE_META_PREFIX * prints the absolute path of the target. * This works because :H will generate '.' if there is no / * and :tA will resolve that to cwd.
*/ Var_Set(MAKE_META_PREFIX, "Building ${.TARGET:H:tA}/${.TARGET:T}", VAR_GLOBAL, 0); } if (once) return; once = 1; memset(&Mybm, 0, sizeof(Mybm)); /* * We consider ourselves master of all within ${.MAKE.META.BAILIWICK} */ metaBailiwick = Lst_Init(FALSE); metaBailiwickStr = Var_Subst(NULL, "${.MAKE.META.BAILIWICK:O:u:tA}", VAR_GLOBAL, VARF_WANTRES); if (metaBailiwickStr) { str2Lst_Append(metaBailiwick, metaBailiwickStr, NULL); } /* * We ignore any paths that start with ${.MAKE.META.IGNORE_PATHS} */ metaIgnorePaths = Lst_Init(FALSE); Var_Append(MAKE_META_IGNORE_PATHS, "/dev /etc /proc /tmp /var/run /var/tmp ${TMPDIR}", VAR_GLOBAL); metaIgnorePathsStr = Var_Subst(NULL, "${" MAKE_META_IGNORE_PATHS ":O:u:tA}", VAR_GLOBAL, VARF_WANTRES); if (metaIgnorePathsStr) { str2Lst_Append(metaIgnorePaths, metaIgnorePathsStr, NULL); } /* * We ignore any paths that match ${.MAKE.META.IGNORE_PATTERNS} */ cp = NULL; if (Var_Value(MAKE_META_IGNORE_PATTERNS, VAR_GLOBAL, &cp)) { metaIgnorePatterns = TRUE; free(cp); } + cp = NULL; + if (Var_Value(MAKE_META_IGNORE_FILTER, VAR_GLOBAL, &cp)) { + metaIgnoreFilter = TRUE; + free(cp); + } } /* * In each case below we allow for job==NULL */ void meta_job_start(Job *job, GNode *gn) { BuildMon *pbm; if (job != NULL) { pbm = &job->bm; } else { pbm = &Mybm; } pbm->mfp = meta_create(pbm, gn); #ifdef USE_FILEMON_ONCE /* compat mode we open the filemon dev once per command */ if (job == NULL) return; #endif #ifdef USE_FILEMON if (pbm->mfp != NULL && useFilemon) { filemon_open(pbm); } else { pbm->mon_fd = pbm->filemon_fd = -1; } #endif } /* * The child calls this before doing anything. * It does not disturb our state. */ void meta_job_child(Job *job) { #ifdef USE_FILEMON BuildMon *pbm; if (job != NULL) { pbm = &job->bm; } else { pbm = &Mybm; } if (pbm->mfp != NULL) { close(fileno(pbm->mfp)); if (useFilemon) { pid_t pid; pid = getpid(); if (ioctl(pbm->filemon_fd, FILEMON_SET_PID, &pid) < 0) { err(1, "Could not set filemon pid!"); } } } #endif } void meta_job_error(Job *job, GNode *gn, int flags, int status) { char cwd[MAXPATHLEN]; BuildMon *pbm; if (job != NULL) { pbm = &job->bm; if (!gn) gn = job->node; } else { pbm = &Mybm; } if (pbm->mfp != NULL) { fprintf(pbm->mfp, "*** Error code %d%s\n", status, (flags & JOB_IGNERR) ? "(ignored)" : ""); } if (gn) { Var_Set(".ERROR_TARGET", gn->path ? 
gn->path : gn->name, VAR_GLOBAL, 0); } getcwd(cwd, sizeof(cwd)); Var_Set(".ERROR_CWD", cwd, VAR_GLOBAL, 0); if (pbm->meta_fname[0]) { Var_Set(".ERROR_META_FILE", pbm->meta_fname, VAR_GLOBAL, 0); } meta_job_finish(job); } void meta_job_output(Job *job, char *cp, const char *nl) { BuildMon *pbm; if (job != NULL) { pbm = &job->bm; } else { pbm = &Mybm; } if (pbm->mfp != NULL) { if (metaVerbose) { static char *meta_prefix = NULL; static int meta_prefix_len; if (!meta_prefix) { char *cp2; meta_prefix = Var_Subst(NULL, "${" MAKE_META_PREFIX "}", VAR_GLOBAL, VARF_WANTRES); if ((cp2 = strchr(meta_prefix, '$'))) meta_prefix_len = cp2 - meta_prefix; else meta_prefix_len = strlen(meta_prefix); } if (strncmp(cp, meta_prefix, meta_prefix_len) == 0) { cp = strchr(cp+1, '\n'); if (!cp++) return; } } fprintf(pbm->mfp, "%s%s", cp, nl); } } int meta_cmd_finish(void *pbmp) { int error = 0; #ifdef USE_FILEMON BuildMon *pbm = pbmp; int x; if (!pbm) pbm = &Mybm; if (pbm->filemon_fd >= 0) { if (close(pbm->filemon_fd) < 0) error = errno; x = filemon_read(pbm->mfp, pbm->mon_fd); if (error == 0 && x != 0) error = x; pbm->filemon_fd = pbm->mon_fd = -1; } #endif return error; } int meta_job_finish(Job *job) { BuildMon *pbm; int error = 0; int x; if (job != NULL) { pbm = &job->bm; } else { pbm = &Mybm; } if (pbm->mfp != NULL) { error = meta_cmd_finish(pbm); x = fclose(pbm->mfp); if (error == 0 && x != 0) error = errno; pbm->mfp = NULL; pbm->meta_fname[0] = '\0'; } return error; } void meta_finish(void) { Lst_Destroy(metaBailiwick, NULL); free(metaBailiwickStr); Lst_Destroy(metaIgnorePaths, NULL); free(metaIgnorePathsStr); } /* * Fetch a full line from fp - growing bufp if needed * Return length in bufp. */ static int fgetLine(char **bufp, size_t *szp, int o, FILE *fp) { char *buf = *bufp; size_t bufsz = *szp; struct stat fs; int x; if (fgets(&buf[o], bufsz - o, fp) != NULL) { check_newline: x = o + strlen(&buf[o]); if (buf[x - 1] == '\n') return x; /* * We need to grow the buffer. * The meta file can give us a clue. */ if (fstat(fileno(fp), &fs) == 0) { size_t newsz; char *p; newsz = ROUNDUP((fs.st_size / 2), BUFSIZ); if (newsz <= bufsz) newsz = ROUNDUP(fs.st_size, BUFSIZ); if (DEBUG(META)) fprintf(debug_file, "growing buffer %u -> %u\n", (unsigned)bufsz, (unsigned)newsz); p = bmake_realloc(buf, newsz); if (p) { *bufp = buf = p; *szp = bufsz = newsz; /* fetch the rest */ if (!fgets(&buf[x], bufsz - x, fp)) return x; /* truncated! 
*/ goto check_newline; } } } return 0; } +/* Lst_ForEach wants 1 to stop search */ static int prefix_match(void *p, void *q) { const char *prefix = p; const char *path = q; size_t n = strlen(prefix); return (0 == strncmp(path, prefix, n)); } +/* + * looking for exact or prefix/ match to + * Lst_Find wants 0 to stop search + */ static int +path_match(const void *p, const void *q) +{ + const char *prefix = q; + const char *path = p; + size_t n = strlen(prefix); + int rc; + + if ((rc = strncmp(path, prefix, n)) == 0) { + switch (path[n]) { + case '\0': + case '/': + break; + default: + rc = 1; + break; + } + } + return rc; +} + +/* Lst_Find wants 0 to stop search */ +static int string_match(const void *p, const void *q) { const char *p1 = p; const char *p2 = q; return strcmp(p1, p2); } +static int +meta_ignore(GNode *gn, const char *p) +{ + char fname[MAXPATHLEN]; + + if (p == NULL) + return TRUE; + + if (*p == '/') { + cached_realpath(p, fname); /* clean it up */ + if (Lst_ForEach(metaIgnorePaths, prefix_match, fname)) { +#ifdef DEBUG_META_MODE + if (DEBUG(META)) + fprintf(debug_file, "meta_oodate: ignoring path: %s\n", + p); +#endif + return TRUE; + } + } + + if (metaIgnorePatterns) { + char *pm; + + snprintf(fname, sizeof(fname), + "${%s:@m@${%s:L:M$m}@}", + MAKE_META_IGNORE_PATTERNS, p); + pm = Var_Subst(NULL, fname, gn, VARF_WANTRES); + if (*pm) { +#ifdef DEBUG_META_MODE + if (DEBUG(META)) + fprintf(debug_file, "meta_oodate: ignoring pattern: %s\n", + p); +#endif + free(pm); + return TRUE; + } + free(pm); + } + + if (metaIgnoreFilter) { + char *fm; + + /* skip if filter result is empty */ + snprintf(fname, sizeof(fname), + "${%s:L:${%s:ts:}}", + p, MAKE_META_IGNORE_FILTER); + fm = Var_Subst(NULL, fname, gn, VARF_WANTRES); + if (*fm == '\0') { +#ifdef DEBUG_META_MODE + if (DEBUG(META)) + fprintf(debug_file, "meta_oodate: ignoring filtered: %s\n", + p); +#endif + free(fm); + return TRUE; + } + free(fm); + } + return FALSE; +} + /* * When running with 'meta' functionality, a target can be out-of-date * if any of the references in its meta data file is more recent. * We have to track the latestdir on a per-process basis. */ #define LCWD_VNAME_FMT ".meta.%d.lcwd" #define LDIR_VNAME_FMT ".meta.%d.ldir" /* * It is possible that a .meta file is corrupted, * if we detect this we want to reproduce it. * Setting oodate TRUE will have that effect. */ #define CHECK_VALID_META(p) if (!(p && *p)) { \ warnx("%s: %d: malformed", fname, lineno); \ oodate = TRUE; \ continue; \ } #define DEQUOTE(p) if (*p == '\'') { \ char *ep; \ p++; \ if ((ep = strchr(p, '\''))) \ *ep = '\0'; \ } Boolean meta_oodate(GNode *gn, Boolean oodate) { static char *tmpdir = NULL; static char cwd[MAXPATHLEN]; char lcwd_vname[64]; char ldir_vname[64]; char lcwd[MAXPATHLEN]; char latestdir[MAXPATHLEN]; char fname[MAXPATHLEN]; char fname1[MAXPATHLEN]; char fname2[MAXPATHLEN]; char fname3[MAXPATHLEN]; const char *dname; const char *tname; char *p; char *cp; char *link_src; char *move_target; static size_t cwdlen = 0; static size_t tmplen = 0; FILE *fp; Boolean needOODATE = FALSE; Lst missingFiles; char *pa[4]; /* >= possible uses */ int i; int have_filemon = FALSE; if (oodate) return oodate; /* we're done */ i = 0; dname = Var_Value(".OBJDIR", gn, &pa[i++]); tname = Var_Value(TARGET, gn, &pa[i++]); /* if this succeeds fname3 is realpath of dname */ if (!meta_needed(gn, dname, tname, fname3, FALSE)) goto oodate_out; dname = fname3; missingFiles = Lst_Init(FALSE); /* * We need to check if the target is out-of-date. 
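 * (References matched by meta_ignore() above, via .MAKE.META.IGNORE_PATHS,
 * .MAKE.META.IGNORE_PATTERNS or the new .MAKE.META.IGNORE_FILTER, are
 * skipped; e.g. an ignore filter of 'M*Makefile.depend*', to pick an
 * illustrative value, would limit the checking to Makefile.depend files.)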
This includes * checking if the expanded command has changed. This in turn * requires that all variables are set in the same way that they * would be if the target needs to be re-built. */ Make_DoAllVar(gn); meta_name(gn, fname, sizeof(fname), dname, tname, dname); #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "meta_oodate: %s\n", fname); #endif if ((fp = fopen(fname, "r")) != NULL) { static char *buf = NULL; static size_t bufsz; int lineno = 0; int lastpid = 0; int pid; int x; LstNode ln; struct stat fs; if (!buf) { bufsz = 8 * BUFSIZ; buf = bmake_malloc(bufsz); } if (!cwdlen) { if (getcwd(cwd, sizeof(cwd)) == NULL) err(1, "Could not get current working directory"); cwdlen = strlen(cwd); } strlcpy(lcwd, cwd, sizeof(lcwd)); strlcpy(latestdir, cwd, sizeof(latestdir)); if (!tmpdir) { tmpdir = getTmpdir(); tmplen = strlen(tmpdir); } /* we want to track all the .meta we read */ Var_Append(".MAKE.META.FILES", fname, VAR_GLOBAL); ln = Lst_First(gn->commands); while (!oodate && (x = fgetLine(&buf, &bufsz, 0, fp)) > 0) { lineno++; if (buf[x - 1] == '\n') buf[x - 1] = '\0'; else { warnx("%s: %d: line truncated at %u", fname, lineno, x); oodate = TRUE; break; } link_src = NULL; move_target = NULL; /* Find the start of the build monitor section. */ if (!have_filemon) { if (strncmp(buf, "-- filemon", 10) == 0) { have_filemon = TRUE; continue; } if (strncmp(buf, "# buildmon", 10) == 0) { have_filemon = TRUE; continue; } } /* Delimit the record type. */ p = buf; #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: %s\n", fname, lineno, buf); #endif strsep(&p, " "); if (have_filemon) { /* * We are in the 'filemon' output section. * Each record from filemon follows the general form: * * * * Where: * is a single letter, denoting the syscall. * is the process that made the syscall. * is the arguments (of interest). */ switch(buf[0]) { case '#': /* comment */ case 'V': /* version */ break; default: /* * We need to track pathnames per-process. * * Each process run by make, starts off in the 'CWD' * recorded in the .meta file, if it chdirs ('C') * elsewhere we need to track that - but only for * that process. If it forks ('F'), we initialize * the child to have the same cwd as its parent. * * We also need to track the 'latestdir' of * interest. This is usually the same as cwd, but * not if a process is reading directories. * * Each time we spot a different process ('pid') * we save the current value of 'latestdir' in a * variable qualified by 'lastpid', and * re-initialize 'latestdir' to any pre-saved * value for the current 'pid' and 'CWD' if none. */ CHECK_VALID_META(p); pid = atoi(p); if (pid > 0 && pid != lastpid) { char *ldir; char *tp; if (lastpid > 0) { /* We need to remember these. */ Var_Set(lcwd_vname, lcwd, VAR_GLOBAL, 0); Var_Set(ldir_vname, latestdir, VAR_GLOBAL, 0); } snprintf(lcwd_vname, sizeof(lcwd_vname), LCWD_VNAME_FMT, pid); snprintf(ldir_vname, sizeof(ldir_vname), LDIR_VNAME_FMT, pid); lastpid = pid; ldir = Var_Value(ldir_vname, VAR_GLOBAL, &tp); if (ldir) { strlcpy(latestdir, ldir, sizeof(latestdir)); free(tp); } ldir = Var_Value(lcwd_vname, VAR_GLOBAL, &tp); if (ldir) { strlcpy(lcwd, ldir, sizeof(lcwd)); free(tp); } } /* Skip past the pid. */ if (strsep(&p, " ") == NULL) continue; #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: %d: %c: cwd=%s lcwd=%s ldir=%s\n", fname, lineno, pid, buf[0], cwd, lcwd, latestdir); #endif break; } CHECK_VALID_META(p); /* Process according to record type. 
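 * ('X' exit, 'F' fork, 'C' chdir, 'M' rename, 'D' unlink, 'L' link,
 * 'W' write, 'R' read, 'E' exec; compare the filemon letter table above.)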
*/ switch (buf[0]) { case 'X': /* eXit */ Var_Delete(lcwd_vname, VAR_GLOBAL); Var_Delete(ldir_vname, VAR_GLOBAL); lastpid = 0; /* no need to save ldir_vname */ break; case 'F': /* [v]Fork */ { char cldir[64]; int child; child = atoi(p); if (child > 0) { snprintf(cldir, sizeof(cldir), LCWD_VNAME_FMT, child); Var_Set(cldir, lcwd, VAR_GLOBAL, 0); snprintf(cldir, sizeof(cldir), LDIR_VNAME_FMT, child); Var_Set(cldir, latestdir, VAR_GLOBAL, 0); #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: %d: cwd=%s lcwd=%s ldir=%s\n", fname, lineno, child, cwd, lcwd, latestdir); #endif } } break; case 'C': /* Chdir */ /* Update lcwd and latest directory. */ strlcpy(latestdir, p, sizeof(latestdir)); strlcpy(lcwd, p, sizeof(lcwd)); Var_Set(lcwd_vname, lcwd, VAR_GLOBAL, 0); Var_Set(ldir_vname, lcwd, VAR_GLOBAL, 0); #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: cwd=%s ldir=%s\n", fname, lineno, cwd, lcwd); #endif break; case 'M': /* renaMe */ /* * For 'M'oves we want to check * the src as for 'R'ead * and the target as for 'W'rite. */ cp = p; /* save this for a second */ /* now get target */ if (strsep(&p, " ") == NULL) continue; CHECK_VALID_META(p); move_target = p; p = cp; /* 'L' and 'M' put single quotes around the args */ DEQUOTE(p); DEQUOTE(move_target); /* FALLTHROUGH */ case 'D': /* unlink */ if (*p == '/' && !Lst_IsEmpty(missingFiles)) { - /* remove p from the missingFiles list if present */ - if ((ln = Lst_Find(missingFiles, p, string_match)) != NULL) { - char *tp = Lst_Datum(ln); - Lst_Remove(missingFiles, ln); - free(tp); - ln = NULL; /* we're done with it */ + /* remove any missingFiles entries that match p */ + if ((ln = Lst_Find(missingFiles, p, + path_match)) != NULL) { + LstNode nln; + char *tp; + + do { + nln = Lst_FindFrom(missingFiles, Lst_Succ(ln), + p, path_match); + tp = Lst_Datum(ln); + Lst_Remove(missingFiles, ln); + free(tp); + } while ((ln = nln) != NULL); } } if (buf[0] == 'M') { /* the target of the mv is a file 'W'ritten */ #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "meta_oodate: M %s -> %s\n", p, move_target); #endif p = move_target; goto check_write; } break; case 'L': /* Link */ /* * For 'L'inks check * the src as for 'R'ead * and the target as for 'W'rite. */ link_src = p; /* now get target */ if (strsep(&p, " ") == NULL) continue; CHECK_VALID_META(p); /* 'L' and 'M' put single quotes around the args */ DEQUOTE(p); DEQUOTE(link_src); #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "meta_oodate: L %s -> %s\n", link_src, p); #endif /* FALLTHROUGH */ case 'W': /* Write */ check_write: /* * If a file we generated within our bailiwick * but outside of .OBJDIR is missing, * we need to do it again. 
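 * ("Our bailiwick" means the directories listed in ${.MAKE.META.BAILIWICK},
 * collected into metaBailiwick by meta_mode_init().)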
*/ /* ignore non-absolute paths */ if (*p != '/') break; if (Lst_IsEmpty(metaBailiwick)) break; /* ignore cwd - normal dependencies handle those */ if (strncmp(p, cwd, cwdlen) == 0) break; if (!Lst_ForEach(metaBailiwick, prefix_match, p)) break; /* tmpdir might be within */ if (tmplen > 0 && strncmp(p, tmpdir, tmplen) == 0) break; /* ignore anything containing the string "tmp" */ if ((strstr("tmp", p))) break; if ((link_src != NULL && cached_lstat(p, &fs) < 0) || (link_src == NULL && cached_stat(p, &fs) < 0)) { - if (Lst_Find(missingFiles, p, string_match) == NULL) + if (!meta_ignore(gn, p)) { + if (Lst_Find(missingFiles, p, string_match) == NULL) Lst_AtEnd(missingFiles, bmake_strdup(p)); + } } break; check_link_src: p = link_src; link_src = NULL; #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "meta_oodate: L src %s\n", p); #endif /* FALLTHROUGH */ case 'R': /* Read */ case 'E': /* Exec */ /* * Check for runtime files that can't * be part of the dependencies because * they are _expected_ to change. */ - if (*p == '/') { - cached_realpath(p, fname1); /* clean it up */ - if (Lst_ForEach(metaIgnorePaths, prefix_match, fname1)) { -#ifdef DEBUG_META_MODE - if (DEBUG(META)) - fprintf(debug_file, "meta_oodate: ignoring path: %s\n", - p); -#endif - break; - } - } - - if (metaIgnorePatterns) { - char *pm; - - snprintf(fname1, sizeof(fname1), - "${%s:@m@${%s:L:M$m}@}", - MAKE_META_IGNORE_PATTERNS, p); - pm = Var_Subst(NULL, fname1, gn, VARF_WANTRES); - if (*pm) { -#ifdef DEBUG_META_MODE - if (DEBUG(META)) - fprintf(debug_file, "meta_oodate: ignoring pattern: %s\n", - p); -#endif - free(pm); - break; - } - free(pm); - } - + if (meta_ignore(gn, p)) + break; + /* * The rest of the record is the file name. * Check if it's not an absolute path. */ { char *sdirs[4]; char **sdp; int sdx = 0; int found = 0; if (*p == '/') { sdirs[sdx++] = p; /* done */ } else { if (strcmp(".", p) == 0) continue; /* no point */ /* Check vs latestdir */ snprintf(fname1, sizeof(fname1), "%s/%s", latestdir, p); sdirs[sdx++] = fname1; if (strcmp(latestdir, lcwd) != 0) { /* Check vs lcwd */ snprintf(fname2, sizeof(fname2), "%s/%s", lcwd, p); sdirs[sdx++] = fname2; } if (strcmp(lcwd, cwd) != 0) { /* Check vs cwd */ snprintf(fname3, sizeof(fname3), "%s/%s", cwd, p); sdirs[sdx++] = fname3; } } sdirs[sdx++] = NULL; for (sdp = sdirs; *sdp && !found; sdp++) { #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: looking for: %s\n", fname, lineno, *sdp); #endif if (cached_stat(*sdp, &fs) == 0) { found = 1; p = *sdp; } } if (found) { #ifdef DEBUG_META_MODE if (DEBUG(META)) fprintf(debug_file, "%s: %d: found: %s\n", fname, lineno, p); #endif if (!S_ISDIR(fs.st_mode) && fs.st_mtime > gn->mtime) { if (DEBUG(META)) fprintf(debug_file, "%s: %d: file '%s' is newer than the target...\n", fname, lineno, p); oodate = TRUE; } else if (S_ISDIR(fs.st_mode)) { /* Update the latest directory. */ cached_realpath(p, latestdir); } } else if (errno == ENOENT && *p == '/' && strncmp(p, cwd, cwdlen) != 0) { /* * A referenced file outside of CWD is missing. * We cannot catch every eventuality here... */ if (Lst_Find(missingFiles, p, string_match) == NULL) Lst_AtEnd(missingFiles, bmake_strdup(p)); } } if (buf[0] == 'E') { /* previous latestdir is no longer relevant */ strlcpy(latestdir, lcwd, sizeof(latestdir)); } break; default: break; } if (!oodate && buf[0] == 'L' && link_src != NULL) goto check_link_src; } else if (strcmp(buf, "CMD") == 0) { /* * Compare the current command with the one in the * meta data file. 
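 * (Commands that expand .OODATE cannot be compared meaningfully, and
 * targets flagged .NOMETA_CMP, i.e. OP_NOMETA_CMP, skip the comparison
 * entirely; see the .NOMETA_CMP description in the manual page above.)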
*/ if (ln == NULL) { if (DEBUG(META)) fprintf(debug_file, "%s: %d: there were more build commands in the meta data file than there are now...\n", fname, lineno); oodate = TRUE; } else { char *cmd = (char *)Lst_Datum(ln); Boolean hasOODATE = FALSE; if (strstr(cmd, "$?")) hasOODATE = TRUE; else if ((cp = strstr(cmd, ".OODATE"))) { /* check for $[{(].OODATE[:)}] */ if (cp > cmd + 2 && cp[-2] == '$') hasOODATE = TRUE; } if (hasOODATE) { needOODATE = TRUE; if (DEBUG(META)) fprintf(debug_file, "%s: %d: cannot compare command using .OODATE\n", fname, lineno); } cmd = Var_Subst(NULL, cmd, gn, VARF_WANTRES|VARF_UNDEFERR); if ((cp = strchr(cmd, '\n'))) { int n; /* * This command contains newlines, we need to * fetch more from the .meta file before we * attempt a comparison. */ /* first put the newline back at buf[x - 1] */ buf[x - 1] = '\n'; do { /* now fetch the next line */ if ((n = fgetLine(&buf, &bufsz, x, fp)) <= 0) break; x = n; lineno++; if (buf[x - 1] != '\n') { warnx("%s: %d: line truncated at %u", fname, lineno, x); break; } cp = strchr(++cp, '\n'); } while (cp); if (buf[x - 1] == '\n') buf[x - 1] = '\0'; } if (!hasOODATE && !(gn->type & OP_NOMETA_CMP) && strcmp(p, cmd) != 0) { if (DEBUG(META)) fprintf(debug_file, "%s: %d: a build command has changed\n%s\nvs\n%s\n", fname, lineno, p, cmd); if (!metaIgnoreCMDs) oodate = TRUE; } free(cmd); ln = Lst_Succ(ln); } } else if (strcmp(buf, "CWD") == 0) { /* * Check if there are extra commands now * that weren't in the meta data file. */ if (!oodate && ln != NULL) { if (DEBUG(META)) fprintf(debug_file, "%s: %d: there are extra build commands now that weren't in the meta data file\n", fname, lineno); oodate = TRUE; } if (strcmp(p, cwd) != 0) { if (DEBUG(META)) fprintf(debug_file, "%s: %d: the current working directory has changed from '%s' to '%s'\n", fname, lineno, p, curdir); oodate = TRUE; } } } fclose(fp); if (!Lst_IsEmpty(missingFiles)) { if (DEBUG(META)) fprintf(debug_file, "%s: missing files: %s...\n", fname, (char *)Lst_Datum(Lst_First(missingFiles))); oodate = TRUE; } if (!oodate && !have_filemon && filemonMissing) { if (DEBUG(META)) fprintf(debug_file, "%s: missing filemon data\n", fname); oodate = TRUE; } } else { if (writeMeta && metaMissing) { cp = NULL; /* if target is in .CURDIR we do not need a meta file */ if (gn->path && (cp = strrchr(gn->path, '/')) && cp > gn->path) { if (strncmp(curdir, gn->path, (cp - gn->path)) != 0) { cp = NULL; /* not in .CURDIR */ } } if (!cp) { if (DEBUG(META)) fprintf(debug_file, "%s: required but missing\n", fname); oodate = TRUE; needOODATE = TRUE; /* assume the worst */ } } } Lst_Destroy(missingFiles, (FreeProc *)free); if (oodate && needOODATE) { /* * Target uses .OODATE which is empty; or we wouldn't be here. * We have decided it is oodate, so .OODATE needs to be set. * All we can sanely do is set it to .ALLSRC. */ Var_Delete(OODATE, gn); Var_Set(OODATE, Var_Value(ALLSRC, gn, &cp), gn, 0); free(cp); } oodate_out: for (i--; i >= 0; i--) { free(pa[i]); } return oodate; } /* support for compat mode */ static int childPipe[2]; void meta_compat_start(void) { #ifdef USE_FILEMON_ONCE /* * We need to re-open filemon for each cmd. 
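 * (In compat mode each command runs in its own shell and the previous
 * command's monitor has already been drained by meta_cmd_finish(), so a
 * fresh filemon instance is needed per command.)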
*/ BuildMon *pbm = &Mybm; if (pbm->mfp != NULL && useFilemon) { filemon_open(pbm); } else { pbm->mon_fd = pbm->filemon_fd = -1; } #endif if (pipe(childPipe) < 0) Punt("Cannot create pipe: %s", strerror(errno)); /* Set close-on-exec flag for both */ (void)fcntl(childPipe[0], F_SETFD, FD_CLOEXEC); (void)fcntl(childPipe[1], F_SETFD, FD_CLOEXEC); } void meta_compat_child(void) { meta_job_child(NULL); if (dup2(childPipe[1], 1) < 0 || dup2(1, 2) < 0) { execError("dup2", "pipe"); _exit(1); } } void meta_compat_parent(void) { FILE *fp; char buf[BUFSIZ]; close(childPipe[1]); /* child side */ fp = fdopen(childPipe[0], "r"); while (fgets(buf, sizeof(buf), fp)) { meta_job_output(NULL, buf, ""); printf("%s", buf); - (void)fflush(stdout); + fflush(stdout); } fclose(fp); } #endif /* USE_META */ Index: projects/clang390-import/contrib/bmake/mk/ChangeLog =================================================================== --- projects/clang390-import/contrib/bmake/mk/ChangeLog (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/ChangeLog (revision 305687) @@ -1,1180 +1,1211 @@ +2016-08-15 Simon J. Gerraty + + * install-mk (MK_VERSION): 20160815 + + * dirdeps.mk (.MAKE.META.IGNORE_FILTER): set filter to only + consider Makefile.depend* when checking if DIRDEPS_CACHE is up-to-date. + +2016-08-13 Simon J. Gerraty + + * meta.sys.mk (.MAKE.META.IGNORE_PATHS): + in meta mode we can ignore the mtime of makefiles + +2016-08-02 Simon J. Gerraty + + * install-mk (MK_VERSION): 20160802 + + * lib.mk (libinstall): depends on beforeinstall + + * prog.mk (proginstall): depends on beforeinstall + patch from Lauri Tirkkonen + + * dirdeps.mk (bootstrap): When bootstrapping; create + .MAKE.DEPENDFILE_DEFAULT and allow additional filtering via + .MAKE.DEPENDFILE_BOOTSTRAP_SED + + * dirdeps.mk: move some comments to where they make sense. + +2016-07-27 Simon J. Gerraty + + * dirdeps.mk (DIRDEPS_CACHE): no dirname. + 2016-06-02 Simon J. Gerraty * install-mk (MK_VERSION): 20160602 * meta.autodep.mk: when passing META_FILES to gendirdeps.mk do not apply :T to META_XTRAS patch from Bryan Drewery at FreeBSD.org. 2016-05-30 Simon J. Gerraty * install-mk (MK_VERSION): 20160530 * meta.stage.mk: we assume ${CLEANFILES} gets .NOPATH make it so. 2016-05-12 Simon J. Gerraty * install-mk (MK_VERSION): 20160512 * dpadd.mk: always include local.dpadd.mk if it exists remove some things that better belong in local.dpadd.mk skip INCLUDES_* for staged libs unless SRC_* defined. * own.mk: add INCLUDEDIR 2016-04-18 Simon J. Gerraty * dirdeps.mk: when doing -f dirdeps.mk if target suppies no TARGET_MACHINE - :E will be empty or match part of path, use ${MACHINE} 2016-04-07 Simon J. Gerraty * meta.autodep.mk: issue a warning if UPDATE_DEPENDFILE=NO due to NO_FILEMON_COOKIE * dirdeps.mk: move the logic that allows for make -f dirdeps.mk some/dir.${TARGET_SPEC} inside the check for !target(_DIRDEP_USE) 2016-04-04 Simon J. Gerraty * Use <> when including local*.mk and others which may exist elsewhere so that user can better control what they get. * meta.autodep.mk (NO_FILEMON_COOKIE): create a cookie if we ever build dir with nofilemon so that UPDATE_DEPENDFILE will be forced to NO until cleaned. 2016-04-01 Simon J. Gerraty * install-mk (MK_VERSION): 20160401 * meta2deps.py: fix old print statement when debugging. * gendirdeps.mk: META2DEPS_CMD append M2D_EXCLUDES with -X patch from Bryan Drewery 2016-03-22 Simon J. Gerraty * install-mk (MK_VERSION): 20160317 (St.
Pats) * warnings.mk: g++ does not like -Wimplicit * sys.mk sys/*.mk lib.mk prog.mk: use CXX_SUFFIXES to handle the pelthora of common suffixes for C++ * lib.mk: use .So for shared objects 2016-03-15 Simon J. Gerraty * install-mk (MK_VERSION): 20160315 * meta.stage.mk (LN_CP_SCRIPT): do not ln(1) if we have to chmod(1) normally only applies to scripts. * dirdeps.mk: NO_DIRDEPS_BELOW to supress DIRDEPS below RELDIR as well as outside it. 2016-03-10 Simon J. Gerraty * install-mk (MK_VERSION): 20160310 * dirdeps.mk: use targets rather than a list to track DIRDEPS that we have processed; the list gets very inefficient as number of DIRDEPS gets large. * sys.dependfile.mk: fix comment wrt MACHINE * meta.autodep.mk: ignore staged DPADDs when bootstrapping. patch from Bryan Drewery 2016-03-02 Simon J. Gerraty * meta2deps.sh: don't ignore subdirs. patch from Bryan Drewery 2016-02-26 Simon J. Gerraty * install-mk (MK_VERSION): 20160226 * gendirdeps.mk: mark _DEPENDFILE .NOMETA 2016-02-20 Simon J. Gerraty * dirdeps.mk: we shouldn't normally include .depend but if we do use .dinclude if we can. 2016-02-18 Simon J. Gerraty * install-mk (MK_VERSION): 20160218 * sys.clean-env.mk: with recent change to Var_Subst() we cannot use the '$$' trick, but .export-literal does the job we need. * auto.dep.mk: make use .dinclude if we can. 2016-02-05 Simon J. Gerraty * dirdeps.mk: Add _build_all_dirs such that local.dirdeps.mk can add fully qualified dirs to it. These will be built normally but the current DEP_RELDIR will not depend on then (to avoid cycles). This makes it easy to hook things like unit-tests into build. 2016-01-21 Simon J. Gerraty * dirdeps.mk: add bootstrap-empty 2015-12-12 Simon J. Gerraty * install-mk (MK_VERSION): 20151212 * auto.obj.mk: do not require MAKEOBJDIRPREFIX to exist. only apply :tA to __objdir when comparing to .OBJDIR 2015-11-14 Simon J. Gerraty * install-mk (MK_VERSION): 20151111 * meta.sys.mk: include sys.dependfile.mk * sys.mk (OPTIONS_DEFAULT_NO): use options.mk to set MK_AUTO_OBJ and MK_DIRDEPS_BUILD include local.sys.env.mk early include local.sys.mk later * own.mk (OPTIONS_DEFAULT_NO): AUTO_OBJ etc moved to sys.mk 2015-11-13 Simon J. Gerraty * meta.sys.mk (META_COOKIE_TOUCH): add ${META_COOKIE_TOUCH} to the end of scripts to touch cookie * meta.stage.mk: stage_libs should ignore SYMLINKS. 2015-10-23 Simon J. Gerraty * install-mk (MK_VERSION): 20151022 * sys.mk: BSD/OS does not have 'type' as a shell builtin. 2015-10-20 Simon J. Gerraty * install-mk (MK_VERSION): 20151020 * dirdeps.mk: Add logic for make -f dirdeps.mk some/dir.${TARGET_SPEC} 2015-10-14 Simon J. Gerraty * install-mk (MK_VERSION): 20151010 2015-10-02 Simon J. Gerraty * meta.stage.mk: use staging: ${STAGE_TARGETS:... to have stage_lins run last in non-jobs mode. Use .ORDER only for jobs mode. 2015-09-02 Simon J. Gerraty * rst2htm.mk: allow for per target flags etc. 2015-09-01 Simon J. Gerraty * install-mk (MK_VERSION): 20150901 * doc.mk: create dir if needed use DOC_INSTALL_OWN 2015-06-15 Simon J. Gerraty * install-mk (MK_VERSION): 20150615 * auto.obj.mk: allow use of MAKEOBJDIRPREFIX too. Follow make's normal precedence rules. * gendirdeps.mk: allow customization of the header. eg. for FreeBSD: GENDIRDEPS_HEADER= echo '\# ${FreeBSD:L:@v@$$$v$$ @:M*F*}'; * meta.autodep.mk: ignore dirdeps.cache* * meta.stage.mk: when bootstrapping options it can be handy to throw warnings rather than errors for staging conflicts. * meta.sys.mk: include local.meta.sys.mk for customization 2015-06-06 Simon J. 
Gerraty * install-mk (MK_VERSION): 20150606 * dirdeps.mk: don't rely on manually maintained Makefile.depend to set DEP_RELDIR and reset DIRDEPS. By setting DEP_RELDIR ourselves we can skip :tA * gendirdeps.mk: skip setting DEP_RELDIR. 2015-05-24 Simon J. Gerraty * dirdeps.mk: avoid wildcards like make(bootstrap*) 2015-05-20 Simon J. Gerraty * install-mk (MK_VERSION): 20150520 * dirdeps.mk: when we are building dirdeps cache file we *want* meta_oodate to look at all the Makefile.depend files, so set .MAKE.DEPENDFILE to something that won't match. * meta.stage.mk: for STAGE_AS_* basename of file may not be unique so first use absolute path as key. Also skip staging at level 0. 2015-04-30 Simon J. Gerraty * install-mk (MK_VERSION): 20150430 * dirdeps.mk: fix _count_dirdeps for non-cache case. 2015-04-16 Simon J. Gerraty * install-mk (MK_VERSION): 20150411 bump version * own.mk: put AUTO_OBJ in OPTIONS_DEFAULT_NO rather than YES. it is here mainly for documentation purposes, since if using auto.obj.mk it is better done via sys.mk 2015-04-01 Simon J. Gerraty * install-mk (MK_VERSION): 20150401 * meta2deps.sh: support @list * meta2deps.py: updates from Juniper o add EXCLUDES o skip bogus input files. o treat 'M' and 'L' as both an 'R' and a 'W' 2015-03-03 Simon J. Gerraty * install-mk (MK_VERSION): 20150303 * dirdeps.mk: if MK_DIRDEPS_CACHE is yes, use dirdeps-cache which is built via sub-make so we have a .meta file to tell if it is out-of-date. The dirdeps-cache contains the same dependency rules that we normaly construct on the fly. This adds a few seconds overhead when the cache is out of date, but for a large target, the savings can be significant (10-20min). 2014-11-18 Simon J. Gerraty * install-mk (MK_VERSION): 20141118 * meta.stage.mk: add stale_staged * dirdeps.mk (_DIRDEP_USE_LEVEL): allow this to be tweaked only useful under very rare conditions such as FreeBSD's make universe. * auto.obj.mk: Allow MK_AUTO_OBJ to set MKOBJDIRS=auto 2014-11-11 Simon J. Gerraty * install-mk (MK_VERSION): 20141111 * mkopt.sh: use consistent semantics for _mk_opt and _mk_opts 2014-11-09 Simon J. Gerraty * FILES: include mkopt.sh which allows handling options in shell scripts in a manner compatible with options.mk 2014-10-12 Simon J. Gerraty * meta.stage.mk: ensure only _STAGED_DIRS under objroot are used for GENDIRDEPS_FILTER to avoid surprises. 2014-10-10 Simon J. Gerraty * dirdeps.mk (NSkipHostDir): this needs SRCTOP prepended since by the time it is applied to __depdirs they have. * dirdeps.mk fix filtering of _machines since M_dep_qual_fixes expects patterns like *.${MACHINE} * cython.mk (pyprefix?): use pyprefix to find python bits since prefix might be something else (where we install our stuff) 2014-09-11 Simon J. Gerraty * install-mk (MK_VERSION): 20140911 * dirdeps.mk: add bootstrap target to simplify adding support for new MACHINE. 2014-09-01 Simon J. Gerraty * gendirdeps.mk: Add handling of GENDIRDEPS_FILTER_DIR_VARS and GENDIRDEPS_FILTER_VARS to make it easier to produce sharable Makefile.depend files. 2014-08-28 Simon J. Gerraty * install-mk (MK_VERSION): 20140828 * cython.mk: capture logic for building python extension modules with Cython. 2014-08-08 Simon J. Gerraty * meta.stage.mk (_STAGE_AS_BASENAME_USE): Add StageAs variant 2014-08-02 Simon J. Gerraty * install-mk (MK_VERSION): 20140801 * dep.mk: use explicit MKDEP_MK rather than overload MKDEP to identify the autodep.mk variant. 
* sys.dependfile.mk: delete .MAKE.DEPENDFILE if its initial value does not match .MAKE.DEPENDFILE_PREFIX * meta.autodep.mk: if _bootstrap_dirdeps add RELDIR to DIRDEPS 2014-05-22 Simon J. Gerraty * install-mk (MK_VERSION): 20140522 * lib.mk: use CC to link shlib for linux too patch from Brendan MacDonell 2014-05-05 Simon J. Gerraty * meta.autodep.mk: add _reldir_{finish,failed} for gathering stats if WITH_META_STATS is defined. 2014-05-02 Simon J. Gerraty * dirdeps.mk: accept -DWITHOUT_DIRDEPS (same a as -DNO_DIRDEPS) to supress dirdeps outside of .CURDIR. 2014-04-05 Simon J. Gerraty * Fix spelling errors - patch from Pedro Giffuni 2014-03-14 Simon J. Gerraty * install-mk (MK_VERSION): 20140314 * dirdeps.mk (beforedirdeps): a handy hook * dirdeps.mk (DIRDEP_MAKE): allow the actual command we run to visit leaf dirs to be intercepted (eg. for distributed build). * dirdeps.mk (__depdirs): ensure // don't sneak in * gendirdeps.mk (DIRDEPS): ensure // don't sneak in 2014-02-21 Simon J. Gerraty * rst2htm.mk (RST2PDF): add support for rst2pdf 2014-02-14 Simon J. Gerraty * install-mk (MK_VERSION): bump version * dirdeps.mk (_last_dependfile): use .INCLUDEDFROMFILE if available. 2014-02-10 Simon J. Gerraty * options.mk: avoid :U so this isn't bmake dependent 2014-02-09 Simon J. Gerraty * options.mk: cleanup and simplify semanitcs NO_* dominates all, if both WITH_* and WITHOUT_* are defined then result is DOMINATE_* which defaults to "no". Ie. WITHOUT_ normally wins. 2013-12-12 Simon J. Gerraty * install-mk (MK_VERSION): bump version * meta2deps.py: convert to print function for python3 compat. we also need to open files with mode 'r' rather than 'rb' otherwise we get bytes instead of strings. 2013-10-10 Simon J. Gerraty * install-mk (MK_VERSION): bump version * dirdeps.mk: when TARGET_SPEC_VARS is more than just MACHINE apply the same filtering (M_dep_qual_fixes) when setting _machines as _build_dirs. Also fix the filtering of Makefile.depend files - for reporting what we are looking for (M_dep_qual_fixes can get confused by Makefile.depend) Add some more debug info. 2013-09-04 Simon J. Gerraty * gendirdeps.mk (_objtops): fix typo also while processing M2D_OBJROOTS to gather qualdir_list qualify $ql with loop iterator to ensure correct results. 2013-08-01 Simon J. Gerraty * install-mk (MK_VERSION): 20130801 * libs.mk: update to match progs.mk 2013-07-26 Simon J. Gerraty * install-mk (MK_VERSION): 20130726 some updates from Juniper and FreeBSD o meta2deps.py: indicate file and line number when we hit parse errors also allow @file to provide huge list of .meta files. * meta2deps.py: add try_parse() to cleanup the above. 2013-07-16 Simon J. Gerraty * install-mk (MK_VERSION): 20130716 * own.mk: add GPROG as an option * prog.mk: honor MK_GPROF==yes 2013-05-10 Simon J. Gerraty * install-mk (MK_VERSION): 20130505 * gendirdeps.mk, meta2deps.py, meta2deps.sh: handle $TARGET_SPEC for when $MACHINE isn't enough for objdir distinction. Bring meta2deps.sh closer to par with meta2deps.py. 2013-04-18 Simon J. Gerraty * meta.stage.mk: set INSTALL to STAGE_INSTALL when making 'all' also if the target 'beforeinstall' exists, make it depend on .dirdep (incase it uses STAGE_INSTALL). 2013-04-17 Simon J. Gerraty * install-mk (MK_VERSION): 20130401 ;-) * meta.stage.mk (STAGE_INSTALL_SH): add stage-install.sh as wrapper around install(1). * options.mk (OPTION_PREFIX): Allow a prefix other than MK_ 2013-03-30 Simon J. Gerraty * meta2deps.py (MetaFile.__init__): ensure self.cwd is initialized. 
* install-mk (MK_VERSION): bump version 2013-03-21 Simon J. Gerraty * install-mk (MK_VERSION): bump version * gendirdeps.mk: do not apply :tA to DPADD entries, since we lose any trailing /., rather apply :tA only when needed. * gendirdeps.mk: better mimic meta2deps handling of .dirdep files. * meta.stage.mk (LN_CP_SCRIPT): Add LnCp to do the ln||cp dance consistently. * dirdeps.mk: better describe the dance in sys.mk for TARGET_SPEC. 2013-03-18 Simon J. Gerraty * gendirdeps.mk: revert the dance around .MAKE.DEPENDFILE_DEFAULT; it is simpler to just not update when, say, building for "host" (where we know we apply filters to DIRDEPS), and using a non-machine qualified dependfile. 2013-03-16 Simon J. Gerraty * dirdeps.mk: improve DIRDEPS filtering by allowing DEP_SKIP_DIR and DEP_DIRDEPS_FILTER to vary by DEP_MACHINE and DEP_TARGET_SPEC * gendirdeps.mk: ensure _objroot has trailing / if it needs it. * meta2deps.py: if machine is "host", then also trim self.host_target from any OBJROOTS. 2013-03-11 Simon J. Gerraty * gendirdeps.mk: if .MAKE.DEPENDFILE_DEFAULT is not machine qualified but _DEPENDFILE is, and .MAKE.DEPENDFILE_DEFAULT exists but _DEPENDFILE does not, compare the new _DEPENDFILE against .MAKE.DEPENDFILE_DEFAULT and discard if the same. 2013-03-08 Simon J. Gerraty * meta.stage.mk: use STAGE_TARGETS to control .ORDER and hook to all: via staging: 2013-03-07 Simon J. Gerraty * sys.dependfile.mk (.MAKE.DEPENDFILE_DEFAULT): use a separate variable for the default .MAKE.DEPENDFILE value so that it can be controlled independently of .MAKE.DEPENDFILE_PREFERENCE * meta.stage.mk: throw error if cp fails etc. Stage*() return early if passed no args. .ORDER stage_* 2013-03-03 Simon J. Gerraty * install-mk (MK_VERSION): bump version * gendirdeps.mk: handle multiple M2D_OBJROOTS better. 2013-02-10 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20130210 * import latest dirdeps.mk, gendirdeps.mk and meta2deps.py from Juniper. o dirdeps.mk now fully supports TARGET_SPEC consisting of more than just MACHINE. o no longer use DEP_MACHINE from Makefile.depend* so remove it. 2013-01-23 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20130123 * meta.stage.mk: add stage_links (hard links). If doing hard links, we add dest to link as well. Default the stage dir for [sym]links to STAGE_OBJTOP since these are typically specified as absolute paths. Add -m "mode" flag to StageFiles and StageAs. 2012-11-11 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20121111 * autoconf.mk: avoid meta mode seeing changed commands for config.status * meta.autodep.mk: pass resolved MAKESYSPATH to gendirdeps in case we were found via .../mk * sys.clean-env.mk: move it from examples, we and others use it "as is". * FILES: add srctop.mk and options.mk * own.mk: convert to using options.mk which is modeled after FreeBSD's handling of MK_* but more flexible. This allows MK_* for boolean knobs to not be confused with MK* which can be commands. * examples/sys.clean-env.mk: add WITH[OUT]_ to MAKE_ENV_SAVE_PREFIX_LIST. Mention that HOME=/var/empty might be a good idea. 2012-11-08 Simon J. Gerraty * sys.dependfile.mk: if no depend file exists, $MACHINE specific ones are supported but not the default, check if any exist and follow suit. 2012-11-06 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20121106 2012-11-05 Simon J. Gerraty * import latest dirdeps.mk and meta2deps.py from Juniper.
* progs.mk: add MAN and CXXFLAGS to PROG_VARS; also add PROGS_TARGETS and pass on PROG_CXX if it seems appropriate. 2012-11-04 Simon J. Gerraty * meta.stage.mk: update CLEANFILES; remove redundant cp of .dirdep from STAGE_AS_SCRIPT. * progs.mk: Add LDADD to PROG_VARS 2012-10-12 Simon J. Gerraty * meta.stage.mk (STAGE_DIR_FILTER): track dirs we stage to in _STAGED_DIRS so that these can be turned into filters for GENDIRDEPS_FILTER. 2012-10-10 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20121010 * meta.stage.mk (STAGE_DIRDEP_SCRIPT): check that an existing target.dirdep matches .dirdep 2012-08-08 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120808 * import latest meta2deps.py from Juniper. 2012-07-11 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120711 * dep.mk: add explicit dependencies on SRCS after applying SRCS_DEP_FILTER * meta.autodep.mk: add explicit dependencies on SRCS after applying SRCS_DEP_FILTER * meta.autodep.mk: ensure GENDIRDEPS_FILTER is exported if needed. 2012-06-26 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120626 * meta.sys.mk: ignore PYTHON if it does not exist; comparing ${.MAKE.DEPENDFILE:E} against ${MACHINE} is more reliable. * meta.stage.mk: examine .MAKE.DEPENDFILE_PREFERENCE for any entries ending in .${MACHINE} to decide if qualified _dirdep is needed. * gendirdeps.mk: only produce unqualified deps if no .MAKE.DEPENDFILE_PREFERENCE ends in .${MACHINE} * meta.subdir.mk: apply SUBDIRDEPS_FILTER 2012-04-20 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120420 * add sys.dependfile.mk so we can experiment with .MAKE.DEPENDFILE_PREFERENCE * meta.autodep.mk: _DEPENDFILE is precious! 2012-03-15 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120315 * install-new.mk: avoid being interrupted 2012-02-26 Simon J. Gerraty * man.mk: MAN might have multiple values so be careful with exists(). 2012-01-19 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20120112 * fix examples/sys.clean-env.mk so that MAKEOBJDIR is handled as: MAKEOBJDIR='${.CURDIR:S,${SRCTOP},${OBJTOP},}' 2011-12-03 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20111201 * import dirdeps.mk from Juniper sjg@ o more consistent handling of DEP_MACHINE, especially when dealing with an odd Makefile.depend, when normally using Makefile.depend.${MACHINE} 2011-11-22 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20111122 * meta.autodep.mk: add some debug output, be more crisp about updating. Use ${.ALLTARGETS:M*.o} as a clue for .depend 2011-11-13 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20111111 it's too cool to miss * import meta* updates from Juniper sjg@ o dirdeps.mk set DEP_MACHINE for Makefile.depend (when we are normally using Makefile.depend.${MACHINE}), handy for read-only manually maintained dependencies. o meta2deps.py add a clear 'ERROR:' token if an exception is raised. o gendirdeps.mk if ERROR: from meta2deps.py do not update anything. 2011-10-30 Simon J. Gerraty * install-new.mk: separate the cmp and copy logic into its own function. 2011-10-28 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20111028 * sys.mk: include auto.obj.mk if MKOBJDIRS is set to auto * subdir.mk: ensure _SUBDIRUSE is provided * meta.autodep.mk: remove dependency of gendirdeps.mk on auto.obj.mk * meta.subdir.mk: always allow for Makefile.depend 2011-10-10 Simon J.
Gerraty * install-mk (MK_VERSION): bump version to 20111010 o minor tweak to *dirdeps.mk from Juniper sjg@ 2011-10-01 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20111001 o add meta2deps.py from Juniper sjg@ o tweak gendirdeps.mk to work with meta2deps.py when not cross-building * autoconf.mk: add autoconf-input as a hook for regenerating AUTOCONF_INPUTS (configure). 2011-08-24 Simon J. Gerraty * meta.autodep.mk: if we do not have OBJS, .depend isn't a useful trigger for updating Makefile.depend* 2011-08-08 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110808 * obj.mk: minor cleanup * auto.obj.mk: improve description of Mkdirs and honor NO_OBJ too. 2011-08-01 Simon J. Gerraty * auto.obj.mk (.OBJDIR): throw an error if we cannot use the specified dir. 2011-06-28 Simon J. Gerraty * meta.autodep.mk: if XMAKE_META_FILE is set the makefile uses a foreign make, and so dependencies can only be gathered from a clean tree build. 2011-06-24 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110622 * meta.autodep.mk: improve bootstrapping 2011-06-10 Simon J. Gerraty * yacc.mk: handle the corner case of .c being removed while .h remains. 2011-06-08 Simon J. Gerraty * yacc.mk: do .y.h and .y.c separately 2011-06-04 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110606 * don't store SRC_DIRDEPS in Makefile.depend* by default; not everyone needs it. 2011-05-04 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110505 first release including meta mode makefiles 2011-05-02 Simon J. Gerraty * meta.stage.mk: add STAGE_AS_SETS and stage_as for things that need to be staged with different names. 2011-05-01 Simon J. Gerraty * meta.stage.mk: add notion of STAGE_SETS so a makefile can stage to multiple dirs 2011-04-03 Simon J. Gerraty * rst2htm.mk: convert rst to s5 (slides) or plain html depending on target name. 2011-03-30 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110330 2011-03-29 Simon J. Gerraty * sys.mk (_DEBUG_MAKE_FLAGS): use indirection so that DEBUG_MAKE_FLAGS0 can be used to debug level 0 only and DEBUG_MAKE_FLAGS for the rest. * sys.mk: re-define M_whence in terms of M_type. M_type is useful for checking if something is a builtin. 2011-03-16 Simon J. Gerraty * meta.stage.mk: add stage_symlinks and leverage StageLinks for stage_libs 2011-03-10 Simon J. Gerraty * dirdeps.mk: correct value for _depdir_files depends on .MAKE.DEPENDFILE. Add our copyright - just to make it clear we have frobbed this quite a bit. DEP_MACHINE needs to be set to MACHINE each time, if using only Makefile.depend (cf. Makefile.depend.${MACHINE}) * meta.stage.mk: meta mode version of staging * init.mk, final.mk: include local.*.mk to simplify customization 2011-03-03 Simon J. Gerraty * auto.obj.mk: just because we are doing mk destroy, we should still set .OBJDIR correctly if it exists. * install-mk (mksrc): do not exclude meta.sys.mk 2011-03-01 Simon J. Gerraty * host-target.mk: set/export _HOST_ARCH etc separately, catch junk resulting from uname -p, so we can find sys/Linux.mk correctly. 2011-02-18 Simon J. Gerraty * meta.sys.mk: throw an error if /dev/filemon is missing and we expected to be updating Makefile.depend* 2011-02-14 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20110214 * meta.subdir.mk: add support for -DBOOTSTRAP_DEPENDFILES 2010-09-25 Simon J. Gerraty * meta.sys.mk: not valid for older bmake 2010-09-24 Simon J.
Gerraty * install-mk (MK_VERSION): bump version to 20100919 include dirdeps.mk et al from Juniper Networks, for meta mode - requires filemon(9). * sys.mk, subdir.mk: Add hooks for meta mode. We do this as meta.sys.mk, meta.autodep.mk and meta.subdir.mk to make turning it on/off simple. 2010-06-16 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100616 * fix typo in sys.mk 2010-06-12 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100612 * lib.mk: remove duplicate addition to SOBJS 2010-06-10 Simon J. Gerraty * sys.mk: Add a means of selectively turning on debug flags. Eg. DEBUG_MAKE_FLAGS=-dv DEBUG_MAKE_DIRS="*lib/sjg" will act as if we did make -dv if .CURDIR ends in lib/sjg DEBUG_MAKE_SYS_DIRS does the same thing, but we set the flags at the start of sys.mk rather than the end. This only makes sense for leaf dirs, so we check that .MAKE.LEVEL > 0 2010-06-09 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100608 * sys.mk: include sys.env.mk later so it can use M_ListToSkip et al. * examples/sys.clean-env.mk: require MAKE_VERSION >= 20100606; also make it easier for folk to tweak 2010-06-08 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100606 do not install examples/* * FILES: add examples/sys.clean-env.mk * examples/sys.clean-env.mk: use .export-env to handle MAKEOBJDIR; this requires bmake-20100606 or later to work. 2010-05-13 Simon J. Gerraty * sys.mk (M_tA): better simulate the result of :tA if not available. 2010-05-04 Simon J. Gerraty * sys.mk: canonicalize MAKE_VERSION; old versions reported bmake-<version> build-<date> whereas we only care about <version> 2010-04-25 Simon J. Gerraty * install-mk: just warn about FORCE_{BSD,SYS}_MK being ignored * lib.mk: we only build the shared lib if SHLIB_FULLVERSION is !empty 2010-04-22 Simon J. Gerraty * dpadd.mk: use LDADD_* if defined. 2010-04-21 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100420 * sys/NetBSD.mk: add MACHINE_CPU to keep netbsd makefiles happy * autoconf.mk: allow AUTO_AUTOCONF 2010-04-19 Simon J. Gerraty * obj.mk: add objwarn to keep freebsd makefiles happy * auto.obj.mk: ensure Mkdirs is available. * FILES: add auto.dep.mk - a simpler version of autodep.mk * dep.mk: auto.dep.mk does not do 'make depend' so ignore it if asked to do that. fix/simplify the tests for when to run mkdep. * auto.dep.mk: add some explanation of how/what we do. * autodep.mk: skip the .OPTIONAL frobbing of .depend; bmake's FROM_DEPEND flag makes it redundant. 2010-04-13 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100404 * subdir.mk: protect from multiple inclusion using _SUBDIRUSE. * obj.mk: protect from multiple inclusion even as bsd.obj.mk. Also create a target _SUBDIRUSE so that we can be used without subdir.mk 2010-04-12 Simon J. Gerraty * dep.mk: use <> when .including so we can override. 2010-01-11 Simon J. Gerraty * lib.mk (SHLIB_LINKS): ensure a string comparison. 2010-01-04 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20100102 * own.mk: ensure PRINTOBJDIR works * autoconf.mk: pass on CONFIGURE_ARGS * init.mk: handle COPTS.${.IMPSRC:T} etc. * lib.mk: allow sys.mk to control SHLIB_FULLVERSION; fix handling of symlinks for darwin * libnames.mk: add DSHLIBEXT for libs which only exist as shared. * man.mk: suppress chown when not root. * rst2htm.mk: allow srcs from multiple locations. * sys.mk: M_whence, stop after 1st line of output. * sys/Darwin.mk: Use .dylib for DSHLIBEXT and HOST_LIBEXT * sys/SunOS.mk: we need to export PATH 2009-12-23 Simon J.
Gerraty * install-mk (MK_VERSION): bump version; include rst2htm.mk 2009-12-17 Simon J. Gerraty * sys.mk, libnames.mk: add .-include; this allows local customization without the need to edit the distributed files. 2009-12-14 Simon J. Gerraty * dpadd.mk (__dpadd_libdirs): order -L's to avoid picking up older versions already installed. 2009-12-13 Simon J. Gerraty * stage.mk (.stage-install): generalize lib.mk's .libinstall * rules.mk: rules for generic Makefile. * inc.mk: install for includes. 2009-12-11 Simon J. Gerraty * sys/NetBSD.mk (MAKE_VERSION): some of our *.mk want to check this, so provide it if using native make. 2009-12-10 Simon J. Gerraty * FILES: move all the platform *.sys.mk files to sys/*.mk * Rename Generic.sys.mk to sys.mk - we always want it. 2009-11-17 Simon J. Gerraty * install-mk (MK_VERSION): bump version * host-target.mk: only export the expensive stuff * Generic.sys.mk (sys_mk): for SunOS we need to look for ${HOST_OS}.${HOST_OSMAJOR} too! 2009-11-07 Simon J. Gerraty * install-mk (MK_VERSION): bump version * lib.mk: if sys.mk doesn't give us an lorder, don't use it. based on patch from Greg Olszewski. * Generic.sys.mk: if we have nothing to work with, set LORDER etc. only if we can find it. 2009-09-08 Simon J. Gerraty * install-mk (MK_VERSION): bump version * man.mk: cleanman: remove CLEANMAN if defined. 2009-09-04 Simon J. Gerraty * SunOS.5.sys.mk (CC): Use ?= like the other *sys.mk 2009-07-17 Simon J. Gerraty * install-mk (MK_VERSION): bump version; include auto.obj.mk 2009-03-26 Simon J. Gerraty * prog.mk,lib.mk: ensure test of USE_DPADD_MK doesn't fail. 2008-11-11 Simon J. Gerraty * install-mk (MK_VERSION): bump version * man.mk: ensure we generate *.cat1 etc in . 2008-07-16 Simon J. Gerraty * install-mk (MK_VERSION): bump version; add prlist.mk 2007-11-25 Simon J. Gerraty * Generic.sys.mk: Allow os specific sys.mk to be in a subdir of ${.PARSEDIR} 2007-11-22 Simon J. Gerraty * install-mk (MK_VERSION): bump version * general cleanup * dpadd.mk: introduce DPMAGIC_LIBS_* 2007-04-30 Simon J. Gerraty * install-mk (MK_VERSION): bump version * libs.mk, progs.mk, autodep.mk: allow for per lib/prog depend files and ensure clean is called for each lib/prog. 2007-03-27 Simon J. Gerraty * autodep.mk (.depend): delete lines that do not start with space and do not contain ':' 2007-02-16 Simon J. Gerraty * autodep.mk (.depend): gcc may wrap lines if pathnames are long so make sure the transform for .OPTIONAL copes. 2007-02-03 Simon J. Gerraty * install-mk (MK_VERSION): bump version * own.mk: make sure RM and LN are defined. * obj.mk: fix a typo, and objlink target. 2006-12-30 Simon J. Gerraty * install-mk (MK_VERSION): bump version * added libs.mk - analogous to progs.mk; make both of them always include {lib,prog}.mk 2006-12-28 Simon J. Gerraty * progs.mk: add a means of building multiple apps in one dir. 2006-11-26 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20061126 * warnings.mk: detect invalid WARNINGS_SET * warnings.mk: use ${.TARGET:T:R}.o when looking for target specific warnings. * For .cc sources, turn off warnings that g++ vomits on. 2006-11-08 Simon J. Gerraty * own.mk: if __initialized__ target doesn't exist and we are FreeBSD we got here directly from sys.mk 2006-11-06 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20061106; add scripts.mk 2006-03-18 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20060318 * autodep.mk: avoid := when modifying OBJS into __dependsrcs 2006-03-02 Simon J.
Gerraty * install-mk (MK_VERSION): bump version to 20060302 * autodep.mk: use -MF et al to help gcc+ccache DTRT. 2006-03-01 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20060301 * autodep.mk (.depend): if MAKE_VERSION is newer than 20050530 we can make .END depend on .depend and make .depend depend on __depsrcs that exist. * dpadd.mk: add SRC_PATHADD 2005-11-04 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20051104 * prog.mk: remove all the LIBC?= junk, use .-include libnames.mk instead (none by default). Also, if USE_DPADD_MK is set, include that. 2005-10-09 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20051001 Add UnixWare.sys.mk from Klaus Heinz. 2005-04-05 Simon J. Gerraty * install-mk: always install *.sys.mk and if need be symlink one to sys.mk 2005-03-22 Simon J. Gerraty * subdir.mk, own.mk: use .MAKE rather than MAKE 2004-02-15 Simon J. Gerraty * own.mk: don't use NetBSD's _SRC_TOP_; it can cause confusion. Also don't take just 'mk' as a srctop indicator. 2004-02-14 Simon J. Gerraty * warnings.mk: overhauled, now very powerful. 2004-02-03 Simon J. Gerraty * Generic.sys.mk: need to use ${.PARSEDIR} with exists(). 2004-02-01 Simon J. Gerraty * install-mk (MK_VERSION): bump version to 20040201 * extract HOST_TARGET stuff to host-target.mk so own.mk and Generic.sys.mk can share. * fix typo in autodep.mk: _SUBDIRUSE not _SUBDIR. 2003-09-30 Simon J. Gerraty * install-mk (MK_VERSION): 20030930 * rename generic.sys.mk to Generic.sys.mk so that it does not get installed (unless being used as sys.mk) * set OS and ROOT_GROUP for those that we know the value for. For others (eg. Generic.sys.mk) wrap the != in an .ifndef so we don't do it again for each sub-make. 2003-09-28 Simon J. Gerraty * install-mk (MK_VERSION): 20030928 Add some extra *.sys.mk from bootstrap-pkgsrc; some of these likely still need work. Make everything default to root:wheel ownership, sys.mk can set ROOT_GROUP accordingly. 2003-08-07 Simon J. Gerraty * install-mk: if FORCE_BSD_MK={cp,ln} use the ones in SYS_MK_DIR not the portable ones. 2003-07-31 Simon J. Gerraty * install-mk: add ability to use cp -f when updating destination .mk files. Also now possible to play games with FORCE_SYS_MK=ln etc on *BSD machines to link /usr/share/mk/sys.mk into dest - not recommended unless you seriously want to. 2003-07-28 Simon J. Gerraty * own.mk (IMPFLAGS): add support for COPTS.${IMPSRC:T} etc for semi-compatibility with NetBSD. 2003-07-23 Simon J. Gerraty * install-mk: add a version indicator 2003-07-22 Simon J. Gerraty * prog.mk: don't try to use ${LIBCRT0} if it's /dev/null * install-mk: Allow FORCE_SYS_MK to come from env Index: projects/clang390-import/contrib/bmake/mk/dirdeps.mk =================================================================== --- projects/clang390-import/contrib/bmake/mk/dirdeps.mk (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/dirdeps.mk (revision 305687) @@ -1,711 +1,724 @@ -# $Id: dirdeps.mk,v 1.67 2016/04/18 21:50:47 sjg Exp $ +# $Id: dirdeps.mk,v 1.73 2016/08/15 19:28:13 sjg Exp $ # Copyright (c) 2010-2013, Juniper Networks, Inc. # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2.
Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # Much of the complexity here is for supporting cross-building. # If a tree does not support that, simply using plain Makefile.depend # should provide sufficient clue. # Otherwise the recommendation is to use Makefile.depend.${MACHINE} # as expected below. # Note: this file gets multiply included. # This is what we do with DIRDEPS # DIRDEPS: # This is a list of directories - relative to SRCTOP, it is # normally only of interest to .MAKE.LEVEL 0. # In some cases the entry may be qualified with a .<machine> # or .<target_spec> suffix (see TARGET_SPEC_VARS below), # for example to force building something for the pseudo # machines "host" or "common" regardless of current ${MACHINE}. # # All unqualified entries end up being qualified with .${TARGET_SPEC} # and partially qualified (if TARGET_SPEC_VARS has multiple # entries) are also expanded to a full .<target_spec>. # The _DIRDEP_USE target uses the suffix to set TARGET_SPEC # correctly when visiting each entry. # # The fully qualified directory entries are used to construct a # dependency graph that will drive the build later. # # Also, for each fully qualified directory target, we will search # using ${.MAKE.DEPENDFILE_PREFERENCE} to find additional # dependencies. We use Makefile.depend (default value for # .MAKE.DEPENDFILE_PREFIX) to refer to these makefiles to # distinguish them from others. # # Each Makefile.depend file sets DEP_RELDIR to be # the RELDIR (path relative to SRCTOP) for its directory, and # since each Makefile.depend file includes dirdeps.mk, this # processing is recursive and results in .MAKE.LEVEL 0 learning the # dependencies of the tree wrt the initial directory (_DEP_RELDIR). # # BUILD_AT_LEVEL0 # Indicates whether .MAKE.LEVEL 0 builds anything: # if "no" sub-makes are used to build everything, # if "yes" sub-makes are only used to build for other machines. # It is best to use "no", but this can require fixing some # makefiles to not do anything at .MAKE.LEVEL 0. # # TARGET_SPEC_VARS # The default value is just MACHINE, and for most environments # this is sufficient. The _DIRDEP_USE target actually sets # both MACHINE and TARGET_SPEC to the suffix of the current # target so that in the general case TARGET_SPEC can be ignored. # # If more than MACHINE is needed then sys.mk needs to decompose # TARGET_SPEC and set the relevant variables accordingly. # It is important that MACHINE be included in and actually be # the first member of TARGET_SPEC_VARS. This allows other # variables to be considered optional, and some of the treatment # below relies on MACHINE being the first entry.
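#
# For example (a sketch only - the directory names are hypothetical),
# a typical Makefile.depend is little more than:
#
#	DIRDEPS = \
#		lib/libfoo \
#		usr.bin/tool.host \
#
#	.include <dirdeps.mk>
#
# The bare entry is qualified with .${TARGET_SPEC}, the .host entry
# is left as is, and the .include is what makes the processing
# recursive as described above.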
# Note: TARGET_SPEC cannot contain any '.'s so the target # triple used by compiler folk won't work (directly anyway). # # For example: # # # Always list MACHINE first, # # other variables might be optional. # TARGET_SPEC_VARS = MACHINE TARGET_OS # .if ${TARGET_SPEC:Uno:M*,*} != "" # _tspec := ${TARGET_SPEC:S/,/ /g} # MACHINE := ${_tspec:[1]} # TARGET_OS := ${_tspec:[2]} # # etc. # # We need to stop TARGET_SPEC from affecting any submakes # # and deal with MACHINE=${TARGET_SPEC} in the environment. # TARGET_SPEC = # # export but do not track # .export-env TARGET_SPEC # .export ${TARGET_SPEC_VARS} # .for v in ${TARGET_SPEC_VARS:O:u} # .if empty($v) # .undef $v # .endif # .endfor # .endif # # make sure we know what TARGET_SPEC is # # as we may need it to find Makefile.depend* # TARGET_SPEC = ${TARGET_SPEC_VARS:@v@${$v:U}@:ts,} # # touch this at your peril _DIRDEP_USE_LEVEL?= 0 .if ${.MAKE.LEVEL} == ${_DIRDEP_USE_LEVEL} # only the first instance is interested in all this .if !target(_DIRDEP_USE) +# do some setup we only need once +_CURDIR ?= ${.CURDIR} +_OBJDIR ?= ${.OBJDIR} + +now_utc = ${%s:L:gmtime} +.if !defined(start_utc) +start_utc := ${now_utc} +.endif + .if ${MAKEFILE:T} == ${.PARSEFILE} && empty(DIRDEPS) && ${.TARGETS:Uall:M*/*} != "" # This little trick lets us do # # mk -f dirdeps.mk some/dir.${TARGET_SPEC} # all: ${.TARGETS:Nall}: all DIRDEPS := ${.TARGETS:M*[/.]*} # so that -DNO_DIRDEPS works DEP_RELDIR := ${DIRDEPS:[1]:R} # this will become DEP_MACHINE below TARGET_MACHINE := ${DIRDEPS:[1]:E:C/,.*//} .if ${TARGET_MACHINE:N*/*} == "" TARGET_MACHINE := ${MACHINE} .endif # disable DIRDEPS_CACHE as it does not like this trick MK_DIRDEPS_CACHE = no .endif # make sure we get the behavior we expect .MAKE.SAVE_DOLLARS = no -# do some setup we only need once -_CURDIR ?= ${.CURDIR} -_OBJDIR ?= ${.OBJDIR} - -now_utc = ${%s:L:gmtime} -.if !defined(start_utc) -start_utc := ${now_utc} -.endif - # make sure these are empty to start with _DEP_TARGET_SPEC = # If TARGET_SPEC_VARS is other than just MACHINE # it should be set by sys.mk or similar by now. # TARGET_SPEC must not contain any '.'s. TARGET_SPEC_VARS ?= MACHINE # this is what we started with TARGET_SPEC = ${TARGET_SPEC_VARS:@v@${$v:U}@:ts,} # this is what we mostly use below DEP_TARGET_SPEC = ${TARGET_SPEC_VARS:S,^,DEP_,:@v@${$v:U}@:ts,} # make sure we have defaults .for v in ${TARGET_SPEC_VARS} DEP_$v ?= ${$v} .endfor .if ${TARGET_SPEC_VARS:[#]} > 1 # Ok, this gets more complex (putting it mildly). # In order to stay sane, we need to ensure that all the build_dirs # we compute below are fully qualified wrt DEP_TARGET_SPEC. # The makefiles may only partially specify (eg. MACHINE only), # so we need to construct a set of modifiers to fill in the gaps. # jot 10 should output 1 2 3 .. 10 JOT ?= jot _tspec_x := ${${JOT} ${TARGET_SPEC_VARS:[#]}:L:sh} # this handles unqualified entries M_dep_qual_fixes = C;(/[^/.,]+)$$;\1.$${DEP_TARGET_SPEC}; # there needs to be at least one item missing for these to make sense .for i in ${_tspec_x:[2..-1]} _tspec_m$i := ${TARGET_SPEC_VARS:[2..$i]:@w@[^,]+@:ts,} _tspec_a$i := ,${TARGET_SPEC_VARS:[$i..-1]:@v@$$$${DEP_$v}@:ts,} M_dep_qual_fixes += C;(\.${_tspec_m$i})$$;\1${_tspec_a$i}; .endfor .else # A harmless? default.
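# (harmless because a bare :U, applied as a modifier below, leaves an
# already-set value untouched - which is all we need from a default)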
M_dep_qual_fixes = U .endif .if !defined(.MAKE.DEPENDFILE_PREFERENCE) # .MAKE.DEPENDFILE_PREFERENCE makes the logic below neater? # you really want this set by sys.mk or similar .MAKE.DEPENDFILE_PREFERENCE = ${_CURDIR}/${.MAKE.DEPENDFILE:T} .if ${.MAKE.DEPENDFILE:E} == "${TARGET_SPEC}" .if ${TARGET_SPEC} != ${MACHINE} .MAKE.DEPENDFILE_PREFERENCE += ${_CURDIR}/${.MAKE.DEPENDFILE:T:R}.$${MACHINE} .endif .MAKE.DEPENDFILE_PREFERENCE += ${_CURDIR}/${.MAKE.DEPENDFILE:T:R} .endif .endif _default_dependfile := ${.MAKE.DEPENDFILE_PREFERENCE:[1]:T} _machine_dependfiles := ${.MAKE.DEPENDFILE_PREFERENCE:T:M*${MACHINE}*} # for machine specific dependfiles we require ${MACHINE} to be at the end # also for the sake of sanity we require a common prefix .if !defined(.MAKE.DEPENDFILE_PREFIX) # knowing .MAKE.DEPENDFILE_PREFIX helps .if !empty(_machine_dependfiles) .MAKE.DEPENDFILE_PREFIX := ${_machine_dependfiles:[1]:T:R} .else .MAKE.DEPENDFILE_PREFIX := ${_default_dependfile:T} .endif .endif # this is how we identify non-machine specific dependfiles N_notmachine := ${.MAKE.DEPENDFILE_PREFERENCE:E:N*${MACHINE}*:${M_ListToSkip}} .endif # !target(_DIRDEP_USE) +# First off, we want to know what ${MACHINE} to build for. +# This can be complicated if we are using a mixture of ${MACHINE} specific +# and non-specific Makefile.depend* + # if we were included recursively _DEP_TARGET_SPEC should be valid. .if empty(_DEP_TARGET_SPEC) # we may or may not have included a dependfile yet .if defined(.INCLUDEDFROMFILE) _last_dependfile := ${.INCLUDEDFROMFILE:M${.MAKE.DEPENDFILE_PREFIX}*} .else _last_dependfile := ${.MAKE.MAKEFILES:M*/${.MAKE.DEPENDFILE_PREFIX}*:[-1]} .endif .if ${_debug_reldir:U0} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: _last_dependfile='${_last_dependfile}' .endif .if empty(_last_dependfile) || ${_last_dependfile:E:${N_notmachine}} == "" # this is all we have to work with DEP_MACHINE = ${TARGET_MACHINE:U${MACHINE}} _DEP_TARGET_SPEC := ${DEP_TARGET_SPEC} .else _DEP_TARGET_SPEC = ${_last_dependfile:${M_dep_qual_fixes:ts:}:E} .endif .if !empty(_last_dependfile) # record that we've read dependfile for this _dirdeps_checked.${_CURDIR}.${TARGET_SPEC}: .endif .endif # by now _DEP_TARGET_SPEC should be set, parse it. .if ${TARGET_SPEC_VARS:[#]} > 1 # we need to parse DEP_MACHINE may or may not contain more info _tspec := ${_DEP_TARGET_SPEC:S/,/ /g} .for i in ${_tspec_x} DEP_${TARGET_SPEC_VARS:[$i]} := ${_tspec:[$i]} .endfor .for v in ${TARGET_SPEC_VARS:O:u} .if empty(DEP_$v) .undef DEP_$v .endif .endfor .else DEP_MACHINE := ${_DEP_TARGET_SPEC} .endif # reset each time through _build_all_dirs = # the first time we are included the _DIRDEP_USE target will not be defined # we can use this as a clue to do initialization and other one time things. .if !target(_DIRDEP_USE) # make sure this target exists dirdeps: beforedirdeps .WAIT beforedirdeps: # We normally expect to be included by Makefile.depend.* # which sets the DEP_* macros below. DEP_RELDIR ?= ${RELDIR} # this can cause lots of output! # set to a set of glob expressions that might match RELDIR DEBUG_DIRDEPS ?= no # remember the initial value of DEP_RELDIR - we test for it below. _DEP_RELDIR := ${DEP_RELDIR} .endif # pickup customizations # as below you can use !target(_DIRDEP_USE) to protect things # which should only be done once. 
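# For example, a tree-level customization file might contain (a
# sketch; the directory name is hypothetical):
#
#	.if !target(_DIRDEP_USE)
#	SKIP_DIRDEPS += tools/experimental
#	.endif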
.-include <local.dirdeps.mk> .if !target(_DIRDEP_USE) # things we skip for host tools SKIP_HOSTDIR ?= NSkipHostDir = ${SKIP_HOSTDIR:N*.host*:S,$,.host*,:N.host*:S,^,${SRCTOP}/,:${M_ListToSkip}} # things we always skip # SKIP_DIRDEPS allows for adding entries on command line. SKIP_DIR += .host *.WAIT ${SKIP_DIRDEPS} SKIP_DIR.host += ${SKIP_HOSTDIR} DEP_SKIP_DIR = ${SKIP_DIR} \ ${SKIP_DIR.${DEP_TARGET_SPEC}:U} \ ${SKIP_DIR.${DEP_MACHINE}:U} \ ${SKIP_DIRDEPS.${DEP_MACHINE}:U} NSkipDir = ${DEP_SKIP_DIR:${M_ListToSkip}} .if defined(NODIRDEPS) || defined(WITHOUT_DIRDEPS) NO_DIRDEPS = .elif defined(WITHOUT_DIRDEPS_BELOW) NO_DIRDEPS_BELOW = .endif .if defined(NO_DIRDEPS) # confine ourselves to the original dir and below. DIRDEPS_FILTER += M${_DEP_RELDIR}* .elif defined(NO_DIRDEPS_BELOW) DIRDEPS_FILTER += M${_DEP_RELDIR} .endif # this is what we run below DIRDEP_MAKE?= ${.MAKE} # we suppress SUBDIR when visiting the leaves # we assume sys.mk will set MACHINE_ARCH # you can add extras to DIRDEP_USE_ENV # if there is no makefile in the target directory, we skip it. _DIRDEP_USE: .USE .MAKE @for m in ${.MAKE.MAKEFILE_PREFERENCE}; do \ test -s ${.TARGET:R}/$$m || continue; \ echo "${TRACER}Checking ${.TARGET:R} for ${.TARGET:E} ..."; \ MACHINE_ARCH= NO_SUBDIR=1 ${DIRDEP_USE_ENV} \ TARGET_SPEC=${.TARGET:E} \ MACHINE=${.TARGET:E} \ ${DIRDEP_MAKE} -C ${.TARGET:R} || exit 1; \ break; \ done .ifdef ALL_MACHINES # this is how you limit it to only the machines we have been built for # previously. .if empty(ONLY_MACHINE_LIST) .if !empty(ALL_MACHINE_LIST) # ALL_MACHINE_LIST is the list of all legal machines - ignore anything else _machine_list != cd ${_CURDIR} && 'ls' -1 ${ALL_MACHINE_LIST:O:u:@m@${.MAKE.DEPENDFILE:T:R}.$m@} 2> /dev/null; echo .else _machine_list != 'ls' -1 ${_CURDIR}/${.MAKE.DEPENDFILE_PREFIX}.* 2> /dev/null; echo .endif _only_machines := ${_machine_list:${NIgnoreFiles:UN*.bak}:E:O:u} .else _only_machines := ${ONLY_MACHINE_LIST} .endif .if empty(_only_machines) # we must be boot-strapping _only_machines := ${TARGET_MACHINE:U${ALL_MACHINE_LIST:U${DEP_MACHINE}}} .endif .else # ! ALL_MACHINES # if ONLY_MACHINE_LIST is set, we are limited to that # if TARGET_MACHINE is set - it is really the same as ONLY_MACHINE_LIST # otherwise DEP_MACHINE is it - so DEP_MACHINE will match. _only_machines := ${ONLY_MACHINE_LIST:U${TARGET_MACHINE:U${DEP_MACHINE}}:M${DEP_MACHINE}} .endif .if !empty(NOT_MACHINE_LIST) _only_machines := ${_only_machines:${NOT_MACHINE_LIST:${M_ListToSkip}}} .endif # make sure we have a starting place? DIRDEPS ?= ${RELDIR} .endif # target # if repeatedly building the same target, # we can avoid the overhead of re-computing the tree dependencies.
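# For example (a sketch - the knob is consulted below; where you set
# it is up to the tree, e.g. the command line or a local *.sys.mk):
#
#	MK_DIRDEPS_CACHE = yes
#
# makes the level 0 make build ${DIRDEPS_CACHE} via a sub-make (so
# there is a .meta file to tell when it is stale) and replay it on
# later runs until a Makefile.depend* changes.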
MK_DIRDEPS_CACHE ?= no BUILD_DIRDEPS_CACHE ?= no BUILD_DIRDEPS ?= yes .if !defined(NO_DIRDEPS) && !defined(NO_DIRDEPS_BELOW) .if ${MK_DIRDEPS_CACHE} == "yes" # this is where we will cache all our work -DIRDEPS_CACHE?= ${_OBJDIR}/dirdeps.cache${.TARGETS:Nall:O:u:ts-:S,/,_,g:S,^,.,:N.} +DIRDEPS_CACHE?= ${_OBJDIR:tA}/dirdeps.cache${.TARGETS:Nall:O:u:ts-:S,/,_,g:S,^,.,:N.} # just ensure this exists build-dirdeps: M_oneperline = @x@\\${.newline} $$x@ .if ${BUILD_DIRDEPS_CACHE} == "no" .if !target(dirdeps-cached) # we do this via sub-make BUILD_DIRDEPS = no +# ignore anything but these +.MAKE.META.IGNORE_FILTER = M*/${.MAKE.DEPENDFILE_PREFIX}* + dirdeps: dirdeps-cached dirdeps-cached: ${DIRDEPS_CACHE} .MAKE @echo "${TRACER}Using ${DIRDEPS_CACHE}" @MAKELEVEL=${.MAKE.LEVEL} ${.MAKE} -C ${_CURDIR} -f ${DIRDEPS_CACHE} \ dirdeps MK_DIRDEPS_CACHE=no BUILD_DIRDEPS=no # these should generally do BUILD_DIRDEPS_MAKEFILE ?= ${MAKEFILE} BUILD_DIRDEPS_TARGETS ?= ${.TARGETS} # we need the .meta file to ensure we update if # any of the Makefile.depend* changed. # We do not want to compare the command line though. ${DIRDEPS_CACHE}: .META .NOMETA_CMP +@{ echo '# Autogenerated - do NOT edit!'; echo; \ echo 'BUILD_DIRDEPS=no'; echo; \ echo '.include <dirdeps.mk>'; \ } > ${.TARGET}.new +@MAKELEVEL=${.MAKE.LEVEL} DIRDEPS_CACHE=${DIRDEPS_CACHE} \ DIRDEPS="${DIRDEPS}" \ MAKEFLAGS= ${.MAKE} -C ${_CURDIR} -f ${BUILD_DIRDEPS_MAKEFILE} \ ${BUILD_DIRDEPS_TARGETS} BUILD_DIRDEPS_CACHE=yes \ .MAKE.DEPENDFILE=.none \ ${.MAKEFLAGS:tW:S,-D ,-D,g:tw:M*WITH*} \ 3>&1 1>&2 | sed 's,${SRCTOP},$${SRCTOP},g' >> ${.TARGET}.new && \ mv ${.TARGET}.new ${.TARGET} .endif .elif !target(_count_dirdeps) # we want to capture the dirdeps count in the cache .END: _count_dirdeps _count_dirdeps: .NOMETA @echo '.info $${.newline}$${TRACER}Makefiles read: total=${.MAKE.MAKEFILES:[#]} depend=${.MAKE.MAKEFILES:M*depend*:[#]} dirdeps=${.ALLTARGETS:M${SRCTOP}*:O:u:[#]}' >&3 .endif .elif !make(dirdeps) && !target(_count_dirdeps) beforedirdeps: _count_dirdeps _count_dirdeps: .NOMETA @echo "${TRACER}Makefiles read: total=${.MAKE.MAKEFILES:[#]} depend=${.MAKE.MAKEFILES:M*depend*:[#]} dirdeps=${.ALLTARGETS:M${SRCTOP}*:O:u:[#]} seconds=`expr ${now_utc} - ${start_utc}`" .endif .endif .if ${BUILD_DIRDEPS} == "yes" .if ${DEBUG_DIRDEPS:@x@${DEP_RELDIR:M$x}${${DEP_RELDIR}.${DEP_MACHINE}:L:M$x}@} != "" _debug_reldir = 1 .else _debug_reldir = 0 .endif .if ${DEBUG_DIRDEPS:@x@${DEP_RELDIR:M$x}${${DEP_RELDIR}.depend:L:M$x}@} != "" _debug_search = 1 .else _debug_search = 0 .endif # the rest is done repeatedly for every Makefile.depend we read. # if we are anything but the original dir we care only about the # machine type we were included for.. .if ${DEP_RELDIR} == "." _this_dir := ${SRCTOP} .else _this_dir := ${SRCTOP}/${DEP_RELDIR} .endif # on rare occasions, there can be a need for extra help _dep_hack := ${_this_dir}/${.MAKE.DEPENDFILE_PREFIX}.inc .-include <${_dep_hack}> .if ${DEP_RELDIR} != ${_DEP_RELDIR} || ${DEP_TARGET_SPEC} != ${TARGET_SPEC} # this should be all _machines := ${DEP_MACHINE} .else # this is the machine list we actually use below _machines := ${_only_machines} .if defined(HOSTPROG) || ${DEP_MACHINE} == "host" # we need to build this guy's dependencies for host as well. _machines += host .endif _machines := ${_machines:O:u} .endif .if ${TARGET_SPEC_VARS:[#]} > 1 # we need to tweak _machines _dm := ${DEP_MACHINE} # apply the same filtering that we do when qualifying DIRDEPS. # M_dep_qual_fixes expects .${MACHINE}* so add (and remove) '.'
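# (illustrative: with TARGET_SPEC_VARS = MACHINE TARGET_OS as in the
# example near the top of this file and DEP_TARGET_SPEC = amd64,linux,
# a bare "amd64" in _machines would be completed to "amd64,linux")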
_machines := ${_machines:@DEP_MACHINE@${DEP_TARGET_SPEC}@:S,^,.,:${M_dep_qual_fixes:ts:}:O:u:S,^.,,} DEP_MACHINE := ${_dm} .endif # reset each time through _build_dirs = .if ${DEP_RELDIR} == ${_DEP_RELDIR} # pickup other machines for this dir if necessary .if ${BUILD_AT_LEVEL0:Uyes} == "no" _build_dirs += ${_machines:@m@${_CURDIR}.$m@} .else _build_dirs += ${_machines:N${DEP_TARGET_SPEC}:@m@${_CURDIR}.$m@} .if ${DEP_TARGET_SPEC} == ${TARGET_SPEC} # pickup local dependencies now .if ${MAKE_VERSION} < 20160220 .-include <.depend> .else .dinclude <.depend> .endif .endif .endif .endif .if ${_debug_reldir} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: DIRDEPS='${DIRDEPS}' .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: _machines='${_machines}' .endif .if !empty(DIRDEPS) # these we reset each time through as they can depend on DEP_MACHINE DEP_DIRDEPS_FILTER = \ ${DIRDEPS_FILTER.${DEP_TARGET_SPEC}:U} \ ${DIRDEPS_FILTER.${DEP_MACHINE}:U} \ ${DIRDEPS_FILTER:U} .if empty(DEP_DIRDEPS_FILTER) # something harmless DEP_DIRDEPS_FILTER = U .endif # this is what we start with __depdirs := ${DIRDEPS:${NSkipDir}:${DEP_DIRDEPS_FILTER:ts:}:C,//+,/,g:O:u:@d@${SRCTOP}/$d@} # some entries may be qualified with .<machine> # the :M*/*/*.* just tries to limit the dirs we check to likely ones. # the ${d:E:M*/*} ensures we don't consider junos/usr.sbin/mgd __qual_depdirs := ${__depdirs:M*/*/*.*:@d@${exists($d):?:${"${d:E:M*/*}":?:${exists(${d:R}):?$d:}}}@} __unqual_depdirs := ${__depdirs:${__qual_depdirs:Uno:${M_ListToSkip}}} .if ${DEP_RELDIR} == ${_DEP_RELDIR} # if it was called out - we likely need it. __hostdpadd := ${DPADD:U.:M${HOST_OBJTOP}/*:S,${HOST_OBJTOP}/,,:H:${NSkipDir}:${DIRDEPS_FILTER:ts:}:S,$,.host,:N.*:@d@${SRCTOP}/$d@} __qual_depdirs += ${__hostdpadd} .endif .if ${_debug_reldir} .info depdirs=${__depdirs} .info qualified=${__qual_depdirs} .info unqualified=${__unqual_depdirs} .endif # _build_dirs is what we will feed to _DIRDEP_USE _build_dirs += \ ${__qual_depdirs:M*.host:${NSkipHostDir}:N.host} \ ${__qual_depdirs:N*.host} \ ${_machines:Mhost*:@m@${__unqual_depdirs:@d@$d.$m@}@:${NSkipHostDir}:N.host} \ ${_machines:Nhost*:@m@${__unqual_depdirs:@d@$d.$m@}@} # qualify everything now _build_dirs := ${_build_dirs:${M_dep_qual_fixes:ts:}:O:u} _build_all_dirs += ${_build_dirs} _build_all_dirs := ${_build_all_dirs:O:u} .endif # empty DIRDEPS # Normally if doing make -V something, # we do not want to waste time chasing DIRDEPS # but if we want to count the number of Makefile.depend* read, we do.
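# (so e.g. "make -V DIRDEPS" can answer quickly; if you do want the
# tree walked despite -V, define _V_READ_DIRDEPS so that the pattern
# -V${_V_READ_DIRDEPS} matches none of the actual flags - inferred
# from the :M-V${_V_READ_DIRDEPS} test that follows)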
.if ${.MAKEFLAGS:M-V${_V_READ_DIRDEPS}} == "" .if !empty(_build_all_dirs) .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '\# ${DEP_RELDIR}.${DEP_TARGET_SPEC}'; \ echo 'dirdeps: ${_build_all_dirs:${M_oneperline}}'; echo; } >&3; echo x!= { ${_build_all_dirs:@x@${target($x):?:echo '$x: _DIRDEP_USE';}@} echo; } >&3; echo .else # this makes it all happen dirdeps: ${_build_all_dirs} .endif ${_build_all_dirs}: _DIRDEP_USE .if ${_debug_reldir} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: needs: ${_build_dirs} .endif # this builds the dependency graph .for m in ${_machines} # it would be nice to do :N${.TARGET} .if !empty(__qual_depdirs) .for q in ${__qual_depdirs:${M_dep_qual_fixes:ts:}:E:O:u:N$m} .if ${_debug_reldir} || ${DEBUG_DIRDEPS:@x@${${DEP_RELDIR}.$m:L:M$x}${${DEP_RELDIR}.$q:L:M$x}@} != "" .info ${DEP_RELDIR}.$m: graph: ${_build_dirs:M*.$q} .endif .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '${_this_dir}.$m: ${_build_dirs:M*.$q:${M_oneperline}}'; echo; } >&3; echo .else ${_this_dir}.$m: ${_build_dirs:M*.$q} .endif .endfor .endif .if ${_debug_reldir} .info ${DEP_RELDIR}.$m: graph: ${_build_dirs:M*.$m:N${_this_dir}.$m} .endif .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '${_this_dir}.$m: ${_build_dirs:M*.$m:N${_this_dir}.$m:${M_oneperline}}'; echo; } >&3; echo .else ${_this_dir}.$m: ${_build_dirs:M*.$m:N${_this_dir}.$m} .endif .endfor .endif # Now find more dependencies - and recurse. .for d in ${_build_all_dirs} .if !target(_dirdeps_checked.$d) # once only _dirdeps_checked.$d: .if ${_debug_search} .info checking $d .endif # Note: _build_all_dirs is fully qualified so d:R is always the directory .if exists(${d:R}) # Warning: there is an assumption here that MACHINE is always # the first entry in TARGET_SPEC_VARS. # If TARGET_SPEC and MACHINE are insufficient, you have a problem. _m := ${.MAKE.DEPENDFILE_PREFERENCE:T:S;${TARGET_SPEC}$;${d:E};:S;${MACHINE};${d:E:C/,.*//};:@m@${exists(${d:R}/$m):?${d:R}/$m:}@:[1]} .if !empty(_m) # M_dep_qual_fixes isn't geared to Makefile.depend _qm := ${_m:C;(\.depend)$;\1.${d:E};:${M_dep_qual_fixes:ts:}} .if ${_debug_search} .info Looking for ${_qm} .endif # we pass _DEP_TARGET_SPEC to tell the next step what we want _DEP_TARGET_SPEC := ${d:E} # some makefiles may still look at this _DEP_MACHINE := ${d:E:C/,.*//} # set this "just in case" # we can skip :tA since we computed the path above DEP_RELDIR := ${_m:H:S,${SRCTOP}/,,} # and reset this DIRDEPS = .if ${_debug_reldir} && ${_qm} != ${_m} .info loading ${_m} for ${d:E} .endif .include <${_m}> .endif .endif .endif .endfor .endif # -V .endif # BUILD_DIRDEPS .elif ${.MAKE.LEVEL} > 42 .error You should have stopped recursing by now. .else # we are building something DEP_RELDIR := ${RELDIR} _DEP_RELDIR := ${RELDIR} # pickup local dependencies .if ${MAKE_VERSION} < 20160220 .-include <.depend> .else .dinclude <.depend> .endif .endif # bootstrapping new dependencies made easy?
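# (a sketch of the intended use, based on the targets below: in a
# directory with no dependfile yet, something like
#
#	cd ${SRCTOP}/some/new/dir && ${.MAKE} -f dirdeps.mk bootstrap
#
# seeds Makefile.depend* there and in the directories it depends on;
# bootstrap-empty just creates a minimal one. The path is hypothetical.)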
.if !target(bootstrap) && (make(bootstrap) || \ make(bootstrap-this) || \ make(bootstrap-recurse) || \ make(bootstrap-empty)) -.if exists(${.CURDIR}/${.MAKE.DEPENDFILE:T}) +# if we are bootstrapping, create the default +_want = ${.CURDIR}/${.MAKE.DEPENDFILE_DEFAULT:T} + +.if exists(${_want}) # stop here ${.TARGETS:Mboot*}: .elif !make(bootstrap-empty) # find a Makefile.depend to use as _src _src != cd ${.CURDIR} && for m in ${.MAKE.DEPENDFILE_PREFERENCE:T:S,${MACHINE},*,}; do test -s $$m || continue; echo $$m; break; done; echo .if empty(_src) .error cannot find any of ${.MAKE.DEPENDFILE_PREFERENCE:T}${.newline}Use: bootstrap-empty .endif -_src?= ${.MAKE.DEPENDFILE:T} +_src?= ${.MAKE.DEPENDFILE} +.MAKE.DEPENDFILE_BOOTSTRAP_SED+= -e 's,${_src:E},${MACHINE},g' + # just create Makefile.depend* for this dir bootstrap-this: .NOTMAIN - @echo Bootstrapping ${RELDIR}/${.MAKE.DEPENDFILE:T} from ${_src:T} - (cd ${.CURDIR} && sed 's,${_src:E},${MACHINE},g' ${_src} > ${.MAKE.DEPENDFILE:T}) + @echo Bootstrapping ${RELDIR}/${_want:T} from ${_src:T}; \ + echo You need to build ${RELDIR} to correctly populate it. +.if ${_src:T} != ${.MAKE.DEPENDFILE_PREFIX:T} + (cd ${.CURDIR} && sed ${.MAKE.DEPENDFILE_BOOTSTRAP_SED} ${_src} > ${_want}) +.else + cp ${.CURDIR}/${_src} ${_want} +.endif # create Makefile.depend* for this dir and its dependencies bootstrap: bootstrap-recurse bootstrap-recurse: bootstrap-this _mf := ${.PARSEFILE} bootstrap-recurse: .NOTMAIN .MAKE @cd ${SRCTOP} && \ for d in `cd ${RELDIR} && ${.MAKE} -B -f ${"${.MAKEFLAGS:M-n}":?${_src}:${.MAKE.DEPENDFILE:T}} -V DIRDEPS`; do \ test -d $$d || d=$${d%.*}; \ test -d $$d || continue; \ echo "Checking $$d for bootstrap ..."; \ (cd $$d && ${.MAKE} -f ${_mf} bootstrap-recurse); \ done .endif # create an empty Makefile.depend* to get the ball rolling. bootstrap-empty: .NOTMAIN .NOMETA - @echo Creating empty ${RELDIR}/${.MAKE.DEPENDFILE:T}; \ + @echo Creating empty ${RELDIR}/${_want:T}; \ echo You need to build ${RELDIR} to correctly populate it. - @{ echo DIRDEPS=; echo ".include <dirdeps.mk>"; } > ${.CURDIR}/${.MAKE.DEPENDFILE:T} + @{ echo DIRDEPS=; echo ".include <dirdeps.mk>"; } > ${_want} .endif Index: projects/clang390-import/contrib/bmake/mk/install-mk =================================================================== --- projects/clang390-import/contrib/bmake/mk/install-mk (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/install-mk (revision 305687) @@ -1,185 +1,185 @@ : # NAME: # install-mk - install mk files # # SYNOPSIS: # install-mk [options] [var=val] [dest] # # DESCRIPTION: # This tool installs mk files in a semi-intelligent manner into # "dest". # # Options: # # -n just say what we want to do, but don't touch anything. # # -f use -f when copying sys.mk. # # -v be verbose # # -q be quiet # # -m "mode" # Use "mode" for installed files (444). # # -o "owner" # Use "owner" for installed files. # # -g "group" # Use "group" for installed files. # # var=val # Set "var" to "val". See below. # # All our *.mk files are copied to "dest" with appropriate # ownership and permissions. # # By default if a sys.mk can be found in a standard location # (that bmake will find) then no sys.mk will be put in "dest". # # SKIP_SYS_MK: # If set, we will avoid installing our 'sys.mk' # This is probably a bad idea. # # SKIP_BSD_MK: # If set, we will skip making bsd.*.mk links to *.mk # # sys.mk: # # By default (and provided we are not installing to the system # mk dir - '/usr/share/mk') we install our own 'sys.mk' which # includes a sys specific file, or a generic one.
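#
# EXAMPLE:
#	An illustrative invocation using the options above (the
#	destination directory is arbitrary):
#
#	install-mk -n -v -m 444 /usr/local/share/mk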
# # # AUTHOR: # Simon J. Gerraty # RCSid: -# $Id: install-mk,v 1.128 2016/06/03 17:22:32 sjg Exp $ +# $Id: install-mk,v 1.130 2016/08/15 19:28:13 sjg Exp $ # # @(#) Copyright (c) 1994 Simon J. Gerraty # # This file is provided in the hope that it will # be of use. There is absolutely NO WARRANTY. # Permission to copy, redistribute or otherwise # use this file is hereby granted provided that # the above copyright notice and this notice are # left intact. # # Please send copies of changes and bug-fixes to: # sjg@crufty.net # -MK_VERSION=20160602 +MK_VERSION=20160815 OWNER= GROUP= MODE=444 BINMODE=555 ECHO=: SKIP= cp_f=-f while : do case "$1" in *=*) eval "$1"; shift;; +f) cp_f=; shift;; -f) cp_f=-f; shift;; -m) MODE=$2; shift 2;; -o) OWNER=$2; shift 2;; -g) GROUP=$2; shift 2;; -v) ECHO=echo; shift;; -q) ECHO=:; shift;; -n) ECHO=echo SKIP=:; shift;; --) shift; break;; *) break;; esac done case $# in 0) echo "$0 [options] <dest> [<os> [<osrel>]]" echo "eg." echo "$0 -o bin -g bin -m 444 /usr/local/share/mk" exit 1 ;; esac dest=$1 os=${2:-`uname`} osrel=${3:-`uname -r`} Do() { $ECHO "$@" $SKIP "$@" } Error() { echo "ERROR: $@" >&2 exit 1 } Warning() { echo "WARNING: $@" >&2 } [ "$FORCE_SYS_MK" ] && Warning "ignoring: FORCE_{BSD,SYS}_MK (no longer supported)" SYS_MK_DIR=${SYS_MK_DIR:-/usr/share/mk} SYS_MK=${SYS_MK:-$SYS_MK_DIR/sys.mk} realpath() { [ -d $1 ] && cd $1 && 'pwd' && return echo $1 } if [ -s $SYS_MK -a -d $dest ]; then # if this is a BSD system we don't want to touch $SYS_MK dest=`realpath $dest` sys_mk_dir=`realpath $SYS_MK_DIR` if [ $dest = $sys_mk_dir ]; then case "$os" in *BSD*) SKIP_SYS_MK=: SKIP_BSD_MK=: ;; *) # could be fake? if [ ! -d $dest/sys -a ! -s $dest/Generic.sys.mk ]; then SKIP_SYS_MK=: # play safe SKIP_BSD_MK=: fi ;; esac fi fi [ -d $dest/sys ] || Do mkdir -p $dest/sys [ -d $dest/sys ] || Do mkdir $dest/sys || exit 1 [ -z "$SKIP" ] && dest=`realpath $dest` cd `dirname $0` mksrc=`'pwd'` if [ $mksrc = $dest ]; then SKIP_MKFILES=: else # we do not install the examples mk_files=`grep '^[a-z].*\.mk' FILES | egrep -v '(examples/|^sys\.mk|sys/)'` mk_scripts=`egrep '^[a-z].*\.(sh|py)' FILES | egrep -v '/'` sys_mk_files=`grep 'sys/.*\.mk' FILES` SKIP_MKFILES= [ -z "$SKIP_SYS_MK" ] && mk_files="sys.mk $mk_files" fi $SKIP_MKFILES Do cp $cp_f $mk_files $dest $SKIP_MKFILES Do cp $cp_f $sys_mk_files $dest/sys $SKIP_MKFILES Do cp $cp_f $mk_scripts $dest $SKIP cd $dest $SKIP_MKFILES Do chmod $MODE $mk_files $sys_mk_files $SKIP_MKFILES Do chmod $BINMODE $mk_scripts [ "$GROUP" ] && $SKIP_MKFILES Do chgrp $GROUP $mk_files $sys_mk_files [ "$OWNER" ] && $SKIP_MKFILES Do chown $OWNER $mk_files $sys_mk_files # if this is a BSD system the bsd.*.mk should exist and be used. if [ -z "$SKIP_BSD_MK" ]; then for f in dep doc init lib links man nls obj own prog subdir do b=bsd.$f.mk [ -s $b ] || Do ln -s $f.mk $b done fi exit 0 Index: projects/clang390-import/contrib/bmake/mk/lib.mk =================================================================== --- projects/clang390-import/contrib/bmake/mk/lib.mk (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/lib.mk (revision 305687) @@ -1,603 +1,604 @@ -# $Id: lib.mk,v 1.53 2016/03/22 20:45:14 sjg Exp $ +# $Id: lib.mk,v 1.54 2016/08/02 20:52:17 sjg Exp $ .if !target(__${.PARSEFILE}__) __${.PARSEFILE}__: .include <init.mk> .if ${OBJECT_FMT} == "ELF" NEED_SOLINKS?= yes .endif .if exists(${.CURDIR}/shlib_version) SHLIB_MAJOR != . ${.CURDIR}/shlib_version ; echo $$major SHLIB_MINOR != .
${.CURDIR}/shlib_version ; echo $$minor .endif print-shlib-major: .if defined(SHLIB_MAJOR) && ${MK_PIC} != "no" @echo ${SHLIB_MAJOR} .else @false .endif print-shlib-minor: .if defined(SHLIB_MINOR) && ${MK_PIC} != "no" @echo ${SHLIB_MINOR} .else @false .endif print-shlib-teeny: .if defined(SHLIB_TEENY) && ${MK_PIC} != "no" @echo ${SHLIB_TEENY} .else @false .endif SHLIB_FULLVERSION ?= ${${SHLIB_MAJOR} ${SHLIB_MINOR} ${SHLIB_TEENY}:L:ts.} SHLIB_FULLVERSION := ${SHLIB_FULLVERSION} # add additional suffixes not exported. # .po is used for profiling object files. # .So is used for PIC object files. .SUFFIXES: .out .a .ln .So .po .o .s .S .c .cc .C .m .F .f .r .y .l .cl .p .h .SUFFIXES: .sh .m4 .m CFLAGS+= ${COPTS} # Derived from NetBSD-1.6 # Set PICFLAGS to cc flags for producing position-independent code, # if not already set. Includes -DPIC, if required. # Data-driven table using make variables to control how shared libraries # are built for different platforms and object formats. # OBJECT_FMT: currently either "ELF" or "a.out", from <bsd.own.mk> # SHLIB_SOVERSION: version number to be compiled into a shared library # via -soname. Usually ${SHLIB_MAJOR} on ELF. # NetBSD/pmax used to use ${SHLIB_MAJOR}[.${SHLIB_MINOR} # [.${SHLIB_TEENY}]] # SHLIB_SHFLAGS: Flags to tell ${LD} to emit shared library. # with ELF, also set shared-lib version for ld.so. # SHLIB_LDSTARTFILE: support .o file, call C++ file-level constructors # SHLIB_LDENDFILE: support .o file, call C++ file-level destructors # FPICFLAGS: flags for ${FC} to compile .[fF] files to .So objects. # CPPICFLAGS: flags for ${CPP} to preprocess .[sS] files for ${AS} # CPICFLAGS: flags for ${CC} to compile .[cC] files to .So objects. # CAPICFLAGS: flags for ${CC} to compile .[Ss] files # (usually just ${CPPPICFLAGS} ${CPICFLAGS}) # APICFLAGS: flags for ${AS} to assemble .[sS] to .So objects. .if ${TARGET_OSNAME} == "NetBSD" .if ${MACHINE_ARCH} == "alpha" # Alpha-specific shared library flags FPICFLAGS ?= -fPIC CPICFLAGS ?= -fPIC -DPIC CPPPICFLAGS?= -DPIC CAPICFLAGS?= ${CPPPICFLAGS} ${CPICFLAGS} APICFLAGS ?= .elif ${MACHINE_ARCH} == "mipsel" || ${MACHINE_ARCH} == "mipseb" # mips-specific shared library flags # On mips, all libs are compiled with ABIcalls, not just sharedlibs. MKPICLIB= no # so turn shlib PIC flags on for ${AS}. AINC+=-DABICALLS AFLAGS+= -fPIC AS+= -KPIC .elif ${MACHINE_ARCH} == "vax" && ${OBJECT_FMT} == "ELF" # On the VAX, all objects are PIC by default, not just sharedlibs.
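# (i.e. no separate lib${LIB}_pic.a is needed on such platforms: the
# regular objects are already position independent, so the shared
# library can be linked from the normal archive - compare the SOLIB
# selection under MK_PICLIB further down)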
MKPICLIB= no .elif (${MACHINE_ARCH} == "sparc" || ${MACHINE_ARCH} == "sparc64") && \ ${OBJECT_FMT} == "ELF" # If you use -fPIC you need to define BIGPIC to turn on 32-bit # relocations in asm code FPICFLAGS ?= -fPIC CPICFLAGS ?= -fPIC -DPIC CPPPICFLAGS?= -DPIC -DBIGPIC CAPICFLAGS?= ${CPPPICFLAGS} ${CPICFLAGS} APICFLAGS ?= -KPIC .else # Platform-independent flags for NetBSD a.out shared libraries SHLIB_SOVERSION=${SHLIB_FULLVERSION} SHLIB_SHFLAGS= FPICFLAGS ?= -fPIC CPICFLAGS?= -fPIC -DPIC CPPPICFLAGS?= -DPIC CAPICFLAGS?= ${CPPPICFLAGS} ${CPICFLAGS} APICFLAGS?= -k .endif # Platform-independent linker flags for ELF shared libraries .if ${OBJECT_FMT} == "ELF" SHLIB_SOVERSION= ${SHLIB_MAJOR} SHLIB_SHFLAGS= -soname lib${LIB}.so.${SHLIB_SOVERSION} SHLIB_LDSTARTFILE?= /usr/lib/crtbeginS.o SHLIB_LDENDFILE?= /usr/lib/crtendS.o .endif # for compatibility with the following CC_PIC?= ${CPICFLAGS} LD_shared=${SHLIB_SHFLAGS} .endif # NetBSD .if ${TARGET_OSNAME} == "FreeBSD" .if ${OBJECT_FMT} == "ELF" SHLIB_SOVERSION= ${SHLIB_MAJOR} SHLIB_SHFLAGS= -soname lib${LIB}.so.${SHLIB_SOVERSION} .else SHLIB_SHFLAGS= -assert pure-text .endif SHLIB_LDSTARTFILE= SHLIB_LDENDFILE= CC_PIC?= -fpic LD_shared=${SHLIB_SHFLAGS} .endif # FreeBSD MKPICLIB?= yes # sys.mk can override these LD_X?=-X LD_x?=-x LD_r?=-r # Non BSD machines will be using bmake. .if ${TARGET_OSNAME} == "SunOS" LD_shared=-assert pure-text .if ${OBJECT_FMT} == "ELF" || ${MACHINE} == "solaris" # Solaris LD_shared=-h lib${LIB}.so.${SHLIB_MAJOR} -G .endif .elif ${TARGET_OSNAME} == "HP-UX" LD_shared=-b LD_so=sl DLLIB= # HPsUX lorder does not grok anything but .o LD_sobjs=`${LORDER} ${OBJS} | ${TSORT} | sed 's,\.o,.So,'` LD_pobjs=`${LORDER} ${OBJS} | ${TSORT} | sed 's,\.o,.po,'` .elif ${TARGET_OSNAME} == "OSF1" LD_shared= -msym -shared -expect_unresolved '*' LD_solib= -all lib${LIB}_pic.a DLLIB= # lorder does not grok anything but .o LD_sobjs=`${LORDER} ${OBJS} | ${TSORT} | sed 's,\.o,.So,'` LD_pobjs=`${LORDER} ${OBJS} | ${TSORT} | sed 's,\.o,.po,'` AR_cq= -cqs .elif ${TARGET_OSNAME} == "FreeBSD" LD_solib= lib${LIB}_pic.a .elif ${TARGET_OSNAME} == "Linux" SHLIB_LD = ${CC} # this is ambiguous of course LD_shared=-shared -Wl,"-h lib${LIB}.so.${SHLIB_MAJOR}" LD_solib= -Wl,--whole-archive lib${LIB}_pic.a -Wl,--no-whole-archive # Linux uses GNU ld, which is a multi-pass linker # so we don't need to use lorder or tsort LD_objs = ${OBJS} LD_pobjs = ${POBJS} LD_sobjs = ${SOBJS} .elif ${TARGET_OSNAME} == "Darwin" SHLIB_LD = ${CC} SHLIB_INSTALL_VERSION ?= ${SHLIB_MAJOR} SHLIB_COMPATABILITY_VERSION ?= ${SHLIB_MAJOR}.${SHLIB_MINOR:U0} SHLIB_COMPATABILITY ?= \ -compatibility_version ${SHLIB_COMPATABILITY_VERSION} \ -current_version ${SHLIB_FULLVERSION} LD_shared = -dynamiclib \ -flat_namespace -undefined suppress \ -install_name ${LIBDIR}/lib${LIB}.${SHLIB_INSTALL_VERSION}.${LD_solink} \ ${SHLIB_COMPATABILITY} SHLIB_LINKS = .for v in ${SHLIB_COMPATABILITY_VERSION} ${SHLIB_INSTALL_VERSION} .if "$v" != "${SHLIB_FULLVERSION}" SHLIB_LINKS += lib${LIB}.$v.${LD_solink} .endif .endfor .if ${MK_LINKLIB} != "no" SHLIB_LINKS += lib${LIB}.${LD_solink} .endif LD_so = ${SHLIB_FULLVERSION}.dylib LD_sobjs = ${SOBJS:O:u} LD_solib = ${LD_sobjs} SOLIB = ${LD_sobjs} LD_solink = dylib .if ${MACHINE_ARCH} == "i386" PICFLAG ?= -fPIC .else PICFLAG ?= -fPIC -fno-common .endif RANLIB = : .endif SHLIB_LD ?= ${LD} .if !empty(SHLIB_MAJOR) .if ${NEED_SOLINKS} && empty(SHLIB_LINKS) .if ${MK_LINKLIB} != "no" SHLIB_LINKS = lib${LIB}.${LD_solink} .endif .if "${SHLIB_FULLVERSION}" != "${SHLIB_MAJOR}" 
SHLIB_LINKS += lib${LIB}.${LD_solink}.${SHLIB_MAJOR} .endif .endif .endif LIBTOOL?=libtool LD_shared ?= -Bshareable -Bforcearchive LD_so ?= so.${SHLIB_FULLVERSION} LD_solink ?= so .if empty(LORDER) LD_objs ?= ${OBJS} LD_pobjs ?= ${POBJS} LD_sobjs ?= ${SOBJS} .else LD_objs ?= `${LORDER} ${OBJS} | ${TSORT}` LD_sobjs ?= `${LORDER} ${SOBJS} | ${TSORT}` LD_pobjs ?= `${LORDER} ${POBJS} | ${TSORT}` .endif LD_solib ?= ${LD_sobjs} AR_cq ?= cq .if exists(/netbsd) && exists(${DESTDIR}/usr/lib/libdl.so) DLLIB ?= -ldl .endif # some libs have lots of objects, and scanning all .o, .po and .So meta files # is a waste of time, this tells meta.autodep.mk to just pick one # (typically .So) # yes, 42 is a random number. .if ${MK_DIRDEPS_BUILD} == "yes" && ${SRCS:Uno:[\#]} > 42 OPTIMIZE_OBJECT_META_FILES ?= yes .endif .if ${MK_LIBTOOL} == "yes" # because libtool is so fascist about naming the object files, # we cannot (yet) build profiled libs MK_PROFILE=no _LIBS=lib${LIB}.a .if exists(${.CURDIR}/shlib_version) SHLIB_AGE != . ${.CURDIR}/shlib_version ; echo $$age .endif .else # for the normal .a we do not want to strip symbols .c.o: ${COMPILE.c} ${.IMPSRC} # for the normal .a we do not want to strip symbols ${CXX_SUFFIXES:%=%.o}: ${COMPILE.cc} ${.IMPSRC} .S.o .s.o: @echo ${COMPILE.S} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} @${COMPILE.S} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} .if (${LD_X} == "") .c.po: ${COMPILE.c} ${CC_PG} ${PROFFLAGS} ${.IMPSRC} -o ${.TARGET} ${CXX_SUFFIXES:%=%.po}: ${COMPILE.cc} -pg ${.IMPSRC} -o ${.TARGET} .S.So .s.So: ${COMPILE.S} ${PICFLAG} ${CC_PIC} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET} .else .c.po: @echo ${COMPILE.c} ${CC_PG} ${PROFFLAGS} ${.IMPSRC} -o ${.TARGET} @${COMPILE.c} ${CC_PG} ${PROFFLAGS} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_X} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o ${CXX_SUFFIXES:%=%.po}: @echo ${COMPILE.cc} ${CXX_PG} ${PROFFLAGS} ${.IMPSRC} -o ${.TARGET} @${COMPILE.cc} ${CXX_PG} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_X} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o .S.So .s.So: @echo ${COMPILE.S} ${PICFLAG} ${CC_PIC} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET} @${COMPILE.S} ${PICFLAG} ${CC_PIC} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_x} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o .endif .if (${LD_x} == "") .c.So: ${COMPILE.c} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET} ${CXX_SUFFIXES:%=%.So}: ${COMPILE.cc} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET} .S.po .s.po: ${COMPILE.S} ${PROFFLAGS} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET} .else .c.So: @echo ${COMPILE.c} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET} @${COMPILE.c} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_x} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o ${CXX_SUFFIXES:%=%.So}: @echo ${COMPILE.cc} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET} @${COMPILE.cc} ${PICFLAG} ${CC_PIC} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_x} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o .S.po .s.po: @echo ${COMPILE.S} ${PROFFLAGS} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET} @${COMPILE.S} ${PROFFLAGS} ${CFLAGS:M-[ID]*} ${AINC} ${.IMPSRC} -o ${.TARGET}.o @${LD} ${LD_X} ${LD_r} ${.TARGET}.o -o ${.TARGET} @rm -f ${.TARGET}.o .endif .endif .c.ln: ${LINT} ${LINTFLAGS} ${CFLAGS:M-[IDU]*} -i ${.IMPSRC} .if ${MK_LIBTOOL} != "yes" .if !defined(PICFLAG) PICFLAG=-fpic .endif _LIBS= .if ${MK_ARCHIVE} != "no" _LIBS += lib${LIB}.a .endif .if ${MK_PROFILE} != "no" _LIBS+=lib${LIB}_p.a POBJS+=${OBJS:.o=.po} .endif .if ${MK_PIC} 
!= "no" .if ${MK_PICLIB} == "no" SOLIB ?= lib${LIB}.a .else SOLIB=lib${LIB}_pic.a _LIBS+=${SOLIB} .endif .if !empty(SHLIB_FULLVERSION) _LIBS+=lib${LIB}.${LD_so} .endif .endif .if ${MK_LINT} != "no" _LIBS+=llib-l${LIB}.ln .endif # here is where you can define what LIB* are .-include .if ${MK_DPADD_MK} == "yes" # lots of cool magic, but might not suit everyone. .include .endif .if !defined(_SKIP_BUILD) all: prebuild .WAIT ${_LIBS} # a hook for things that must be done early prebuild: .if !defined(.PARSEDIR) # no-op is the best we can do if not bmake. .WAIT: .endif .endif all: _SUBDIRUSE .for s in ${SRCS:N*.h:M*/*} ${.o .So .po .lo:L:@o@${s:T:R}$o@}: $s .endfor OBJS+= ${SRCS:T:N*.h:R:S/$/.o/g} .NOPATH: ${OBJS} .if ${MK_LIBTOOL} == "yes" .if ${MK_PIC} == "no" LT_STATIC=-static .else LT_STATIC= .endif SHLIB_AGE?=0 # .lo's are created as a side effect .s.o .S.o .c.o: ${LIBTOOL} --mode=compile ${CC} ${LT_STATIC} ${CFLAGS} ${CPPFLAGS} ${IMPFLAGS} -c ${.IMPSRC} # can't really do profiled libs with libtool - its too fascist about # naming the output... lib${LIB}.a:: ${OBJS} @rm -f ${.TARGET} ${LIBTOOL} --mode=link ${CC} ${LT_STATIC} -o ${.TARGET:.a=.la} ${OBJS:.o=.lo} -rpath ${SHLIBDIR}:/usr/lib -version-info ${SHLIB_MAJOR}:${SHLIB_MINOR}:${SHLIB_AGE} @ln .libs/${.TARGET} . lib${LIB}.${LD_so}:: lib${LIB}.a @[ -s ${.TARGET}.${SHLIB_AGE} ] || { ln -s .libs/lib${LIB}.${LD_so}* . 2>/dev/null; : } @[ -s ${.TARGET} ] || ln -s ${.TARGET}.${SHLIB_AGE} ${.TARGET} .else # MK_LIBTOOL=yes lib${LIB}.a:: ${OBJS} @echo building standard ${LIB} library @rm -f ${.TARGET} @${AR} ${AR_cq} ${.TARGET} ${LD_objs} ${RANLIB} ${.TARGET} POBJS+= ${OBJS:.o=.po} .NOPATH: ${POBJS} lib${LIB}_p.a:: ${POBJS} @echo building profiled ${LIB} library @rm -f ${.TARGET} @${AR} ${AR_cq} ${.TARGET} ${LD_pobjs} ${RANLIB} ${.TARGET} SOBJS+= ${OBJS:.o=.So} .NOPATH: ${SOBJS} lib${LIB}_pic.a:: ${SOBJS} @echo building shared object ${LIB} library @rm -f ${.TARGET} @${AR} ${AR_cq} ${.TARGET} ${LD_sobjs} ${RANLIB} ${.TARGET} #SHLIB_LDADD?= ${LDADD} # bound to be non-portable... 
# this is known to work for NetBSD 1.6 and FreeBSD 4.2 lib${LIB}.${LD_so}: ${SOLIB} ${DPADD} @echo building shared ${LIB} library \(version ${SHLIB_FULLVERSION}\) @rm -f ${.TARGET} .if ${TARGET_OSNAME} == "NetBSD" || ${TARGET_OSNAME} == "FreeBSD" .if ${OBJECT_FMT} == "ELF" ${SHLIB_LD} -x -shared ${SHLIB_SHFLAGS} -o ${.TARGET} \ ${SHLIB_LDSTARTFILE} \ --whole-archive ${SOLIB} --no-whole-archive ${SHLIB_LDADD} \ ${SHLIB_LDENDFILE} .else ${SHLIB_LD} ${LD_x} ${LD_shared} \ -o ${.TARGET} ${SOLIB} ${SHLIB_LDADD} .endif .else ${SHLIB_LD} -o ${.TARGET} ${LD_shared} ${LD_solib} ${DLLIB} ${SHLIB_LDADD} .endif .endif .if !empty(SHLIB_LINKS) rm -f ${SHLIB_LINKS}; ${SHLIB_LINKS:O:u:@x@ln -s ${.TARGET} $x;@} .endif LOBJS+= ${LSRCS:.c=.ln} ${SRCS:M*.c:.c=.ln} .NOPATH: ${LOBJS} LLIBS?= -lc llib-l${LIB}.ln: ${LOBJS} @echo building llib-l${LIB}.ln @rm -f llib-l${LIB}.ln @${LINT} -C${LIB} ${LOBJS} ${LLIBS} .if !target(clean) cleanlib: .PHONY rm -f a.out [Ee]rrs mklog core *.core ${CLEANFILES} rm -f lib${LIB}.a ${OBJS} rm -f lib${LIB}_p.a ${POBJS} rm -f lib${LIB}_pic.a lib${LIB}.so.*.* ${SOBJS} rm -f llib-l${LIB}.ln ${LOBJS} .if !empty(SHLIB_LINKS) rm -f ${SHLIB_LINKS} .endif clean: _SUBDIRUSE cleanlib cleandir: _SUBDIRUSE cleanlib .else cleandir: _SUBDIRUSE clean .endif .if defined(SRCS) && (!defined(MKDEP) || ${MKDEP} != autodep) afterdepend: .depend @(TMP=/tmp/_depend$$$$; \ sed -e 's/^\([^\.]*\).o[ ]*:/\1.o \1.po \1.So \1.ln:/' \ < .depend > $$TMP; \ mv $$TMP .depend) .endif .if !target(install) .if !target(beforeinstall) beforeinstall: .endif .if !empty(LIBOWN) LIB_INSTALL_OWN ?= -o ${LIBOWN} -g ${LIBGRP} .endif .include .if !target(realinstall) realinstall: libinstall .endif .if !target(libinstall) libinstall: [ -d ${DESTDIR}/${LIBDIR} ] || \ ${INSTALL} -d ${LIB_INSTALL_OWN} -m 775 ${DESTDIR}${LIBDIR} .if ${MK_ARCHIVE} != "no" ${INSTALL} ${COPY} ${LIB_INSTALL_OWN} -m 600 lib${LIB}.a \ ${DESTDIR}${LIBDIR} ${RANLIB} ${DESTDIR}${LIBDIR}/lib${LIB}.a chmod ${LIBMODE} ${DESTDIR}${LIBDIR}/lib${LIB}.a .endif .if ${MK_PROFILE} != "no" ${INSTALL} ${COPY} ${LIB_INSTALL_OWN} -m 600 \ lib${LIB}_p.a ${DESTDIR}${LIBDIR} ${RANLIB} ${DESTDIR}${LIBDIR}/lib${LIB}_p.a chmod ${LIBMODE} ${DESTDIR}${LIBDIR}/lib${LIB}_p.a .endif .if ${MK_PIC} != "no" .if ${MK_PICLIB} != "no" ${INSTALL} ${COPY} ${LIB_INSTALL_OWN} -m 600 \ lib${LIB}_pic.a ${DESTDIR}${LIBDIR} ${RANLIB} ${DESTDIR}${LIBDIR}/lib${LIB}_pic.a chmod ${LIBMODE} ${DESTDIR}${LIBDIR}/lib${LIB}_pic.a .endif .if !empty(SHLIB_MAJOR) ${INSTALL} ${COPY} ${LIB_INSTALL_OWN} -m ${LIBMODE} \ lib${LIB}.${LD_so} ${DESTDIR}${LIBDIR} .if !empty(SHLIB_LINKS) (cd ${DESTDIR}${LIBDIR} && { rm -f ${SHLIB_LINKS}; ${SHLIB_LINKS:O:u:@x@ln -s lib${LIB}.${LD_so} $x;@} }) .endif .endif .endif .if ${MK_LINT} != "no" && ${MK_LINKLIB} != "no" && !empty(LOBJS) ${INSTALL} ${COPY} ${LIB_INSTALL_OWN} -m ${LIBMODE} \ llib-l${LIB}.ln ${DESTDIR}${LINTLIBDIR} .endif .if defined(LINKS) && !empty(LINKS) @set ${LINKS}; ${_LINKS_SCRIPT} .endif .endif install: maninstall _SUBDIRUSE maninstall: afterinstall afterinstall: realinstall +libinstall: beforeinstall realinstall: beforeinstall .endif .if ${MK_MAN} != "no" .include .endif .if ${MK_NLS} != "no" .include .endif .include .include .include .include .endif # during building we usually need/want to install libs somewhere central # note that we do NOT ch{own,grp} as that would likely fail at this point. 
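# (sketch: while staging into a central location, plain copy semantics
# are enough; ownership and group are applied by the real install later.)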
# otherwise it is the same as realinstall # Note that we don't need this when using dpadd.mk .libinstall: ${_LIBS} test -d ${DESTDIR}${LIBDIR} || ${INSTALL} -d -m775 ${DESTDIR}${LIBDIR} .for _lib in ${_LIBS:M*.a} ${INSTALL} ${COPY} -m 644 ${_lib} ${DESTDIR}${LIBDIR} ${RANLIB} ${DESTDIR}${LIBDIR}/${_lib} .endfor .for _lib in ${_LIBS:M*.${LD_solink}*:O:u} ${INSTALL} ${COPY} -m ${LIBMODE} ${_lib} ${DESTDIR}${LIBDIR} .if !empty(SHLIB_LINKS) (cd ${DESTDIR}${LIBDIR} && { ${SHLIB_LINKS:O:u:@x@ln -sf ${_lib} $x;@}; }) .endif .endfor @touch ${.TARGET} .include .endif Index: projects/clang390-import/contrib/bmake/mk/meta.sys.mk =================================================================== --- projects/clang390-import/contrib/bmake/mk/meta.sys.mk (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/meta.sys.mk (revision 305687) @@ -1,153 +1,166 @@ -# $Id: meta.sys.mk,v 1.28 2016/04/05 15:58:37 sjg Exp $ +# $Id: meta.sys.mk,v 1.29 2016/08/13 17:51:45 sjg Exp $ # # @(#) Copyright (c) 2010, Simon J. Gerraty # # This file is provided in the hope that it will # be of use. There is absolutely NO WARRANTY. # Permission to copy, redistribute or otherwise # use this file is hereby granted provided that # the above copyright notice and this notice are # left intact. # # Please send copies of changes and bug-fixes to: # sjg@crufty.net # # include this if you want to enable meta mode # for maximum benefit, requires filemon(4) driver. .if ${MAKE_VERSION:U0} > 20100901 .if !target(.ERROR) .-include # absolute path to what we are reading. _PARSEDIR = ${.PARSEDIR:tA} +.if !defined(SYS_MK_DIR) +SYS_MK_DIR := ${_PARSEDIR} +.endif + META_MODE += meta verbose .MAKE.MODE ?= ${META_MODE} .if ${.MAKE.LEVEL} == 0 _make_mode := ${.MAKE.MODE} ${META_MODE} .if ${_make_mode:M*read*} != "" || ${_make_mode:M*nofilemon*} != "" # tell everyone we are not updating Makefile.depend* UPDATE_DEPENDFILE = NO .export UPDATE_DEPENDFILE .endif .if ${UPDATE_DEPENDFILE:Uyes:tl} == "no" && !exists(/dev/filemon) # we should not get upset META_MODE += nofilemon .export META_MODE .endif .endif .if !defined(NO_SILENT) .if ${MAKE_VERSION} > 20110818 # only be silent when we have a .meta file META_MODE += silent=yes .else .SILENT: .endif .endif # we use the pseudo machine "host" for the build host. # this should be taken care of before we get here .if ${OBJTOP:Ua} == ${HOST_OBJTOP:Ub} MACHINE = host .endif .if ${.MAKE.LEVEL} == 0 # it can be handy to know which MACHINE kicked off the build # for example, if using Makefile.depend for multiple machines, # allowing only MACHINE0 to update can keep things simple.
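# (a hedged sketch, not part of the original file: a Makefile.depend
# could guard its own regeneration with
#	.if ${MACHINE} != ${MACHINE0:Uno}
#	UPDATE_DEPENDFILE = no
#	.endif
# so that only the machine which kicked off the build updates it.)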
MACHINE0 := ${MACHINE} .export MACHINE0 .if defined(PYTHON) && exists(${PYTHON}) # we prefer the python version of this - it is much faster META2DEPS ?= ${.PARSEDIR}/meta2deps.py .else META2DEPS ?= ${.PARSEDIR}/meta2deps.sh .endif META2DEPS := ${META2DEPS} .export META2DEPS .endif MAKE_PRINT_VAR_ON_ERROR += \ .ERROR_TARGET \ .ERROR_META_FILE \ .MAKE.LEVEL \ MAKEFILE \ .MAKE.MODE .if !defined(SB) && defined(SRCTOP) SB = ${SRCTOP:H} .endif ERROR_LOGDIR ?= ${SB}/error meta_error_log = ${ERROR_LOGDIR}/meta-${.MAKE.PID}.log # we are not interested in make telling us a failure happened elsewhere .ERROR: _metaError _metaError: .NOMETA .NOTMAIN -@[ "${.ERROR_META_FILE}" ] && { \ grep -q 'failure has been detected in another branch' ${.ERROR_META_FILE} && exit 0; \ mkdir -p ${meta_error_log:H}; \ cp ${.ERROR_META_FILE} ${meta_error_log}; \ echo "ERROR: log ${meta_error_log}" >&2; }; : .endif META_COOKIE_TOUCH= # some targets need to be .PHONY in non-meta mode META_NOPHONY= .PHONY # Are we, after all, in meta mode? .if ${.MAKE.MODE:Uno:Mmeta*} != "" MKDEP_MK = meta.autodep.mk .if ${.MAKE.MAKEFILES:M*sys.dependfile.mk} == "" # this does all the smarts of setting .MAKE.DEPENDFILE .-include # check if we got anything sane .if ${.MAKE.DEPENDFILE} == ".depend" .undef .MAKE.DEPENDFILE .endif .MAKE.DEPENDFILE ?= Makefile.depend .endif # we can afford to use cookies to prevent some targets # re-running needlessly META_COOKIE_TOUCH= touch ${COOKIE.${.TARGET}:U${.OBJDIR}/${.TARGET}} META_NOPHONY= + +# some targets involve old pre-built targets +# ignore mtime of shell +# and mtime of makefiles does not matter in meta mode +.MAKE.META.IGNORE_PATHS += \ + ${MAKEFILE} \ + ${SHELL} \ + ${SYS_MK_DIR} + .if ${UPDATE_DEPENDFILE:Uyes:tl} != "no" .if ${.MAKEFLAGS:Uno:M-k} != "" # make this more obvious .warning Setting UPDATE_DEPENDFILE=NO due to -k UPDATE_DEPENDFILE= NO .export UPDATE_DEPENDFILE .elif !exists(/dev/filemon) .error ${.newline}ERROR: The filemon module (/dev/filemon) is not loaded. .endif .endif .if ${.MAKE.LEVEL} == 0 # make sure dirdeps target exists and do it first all: dirdeps .WAIT dirdeps: .NOPATH: dirdeps .if defined(ALL_MACHINES) # the first .MAIN: is what counts # by default dirdeps is all we want at level0 .MAIN: dirdeps # tell dirdeps.mk what we want BUILD_AT_LEVEL0 = no .endif .if ${.TARGETS:Nall} == "" # it works best if we do everything via sub-makes BUILD_AT_LEVEL0 ?= no .endif .endif .endif .endif Index: projects/clang390-import/contrib/bmake/mk/prog.mk =================================================================== --- projects/clang390-import/contrib/bmake/mk/prog.mk (revision 305686) +++ projects/clang390-import/contrib/bmake/mk/prog.mk (revision 305687) @@ -1,222 +1,223 @@ -# $Id: prog.mk,v 1.26 2016/03/22 20:45:14 sjg Exp $ +# $Id: prog.mk,v 1.27 2016/08/02 20:52:17 sjg Exp $ .if !target(__${.PARSEFILE}__) __${.PARSEFILE}__: .include # FreeBSD at least expects MAN8 etc. 
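# (illustrative: with MAN=make.1 the :E modifier extracts the suffix,
# so _sect becomes 1 and the assignment below yields MAN1=make.1.)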
.if defined(MAN) && !empty(MAN) _sect:=${MAN:E} MAN${_sect}=${MAN} .endif .SUFFIXES: .out .o .c .cc .C .y .l .s .8 .7 .6 .5 .4 .3 .2 .1 .0 CFLAGS+= ${COPTS} .if ${TARGET_OSNAME} == "NetBSD" .if ${MACHINE_ARCH} == "sparc64" CFLAGS+= -mcmodel=medlow .endif # ELF platforms depend on crtbegin.o and crtend.o .if ${OBJECT_FMT} == "ELF" .ifndef LIBCRTBEGIN LIBCRTBEGIN= ${DESTDIR}/usr/lib/crtbegin.o .MADE: ${LIBCRTBEGIN} .endif .ifndef LIBCRTEND LIBCRTEND= ${DESTDIR}/usr/lib/crtend.o .MADE: ${LIBCRTEND} .endif _SHLINKER= ${SHLINKDIR}/ld.elf_so .else LIBCRTBEGIN?= LIBCRTEND?= _SHLINKER= ${SHLINKDIR}/ld.so .endif .ifndef LIBCRT0 LIBCRT0= ${DESTDIR}/usr/lib/crt0.o .MADE: ${LIBCRT0} .endif .endif # NetBSD # here is where you can define what LIB* are .-include .if ${MK_DPADD_MK} == "yes" # lots of cool magic, but might not suit everyone. .include .endif .if ${MK_GPROF} == "yes" CFLAGS+= ${CC_PG} ${PROFFLAGS} LDADD+= ${CC_PG} .if ${MK_DPADD_MK} == "no" LDADD_LIBC_P?= -lc_p LDADD_LAST+= ${LDADD_LIBC_P} .endif .endif .if defined(SHAREDSTRINGS) CLEANFILES+=strings .c.o: ${CC} -E ${CFLAGS} ${.IMPSRC} | xstr -c - @${CC} ${CFLAGS} -c x.c -o ${.TARGET} @rm -f x.c ${CXX_SUFFIXES:%=%.o}: ${CXX} -E ${CXXFLAGS} ${.IMPSRC} | xstr -c - @mv -f x.c x.cc @${CXX} ${CXXFLAGS} -c x.cc -o ${.TARGET} @rm -f x.cc .endif .if defined(PROG) SRCS?= ${PROG}.c .for s in ${SRCS:N*.h:N*.sh:M*/*} ${.o .po .lo:L:@o@${s:T:R}$o@}: $s .endfor .if !empty(SRCS:N*.h:N*.sh) OBJS+= ${SRCS:T:N*.h:N*.sh:R:S/$/.o/g} LOBJS+= ${LSRCS:.c=.ln} ${SRCS:M*.c:.c=.ln} .endif .if defined(OBJS) && !empty(OBJS) .NOPATH: ${OBJS} ${PROG} ${SRCS:M*.[ly]:C/\..$/.c/} ${YHEADER:D${SRCS:M*.y:.y=.h}} # this is known to work for NetBSD 1.6 and FreeBSD 4.2 .if ${TARGET_OSNAME} == "NetBSD" || ${TARGET_OSNAME} == "FreeBSD" _PROGLDOPTS= .if ${SHLINKDIR} != "/usr/libexec" # XXX: change or remove if ld.so moves _PROGLDOPTS+= -Wl,-dynamic-linker=${_SHLINKER} .endif .if defined(LIBDIR) && ${SHLIBDIR} != ${LIBDIR} _PROGLDOPTS+= -Wl,-rpath-link,${DESTDIR}${SHLIBDIR}:${DESTDIR}/usr/lib \ -L${DESTDIR}${SHLIBDIR} .endif _PROGLDOPTS+= -Wl,-rpath,${SHLIBDIR}:/usr/lib .if defined(PROG_CXX) _CCLINK= ${CXX} _SUPCXX= -lstdc++ -lm .endif .endif # NetBSD _CCLINK?= ${CC} .if defined(DESTDIR) && exists(${LIBCRT0}) && ${LIBCRT0} != "/dev/null" ${PROG}: ${LIBCRT0} ${OBJS} ${LIBC} ${DPADD} ${_CCLINK} ${LDFLAGS} ${LDSTATIC} -o ${.TARGET} -nostdlib ${_PROGLDOPTS} -L${DESTDIR}/usr/lib ${LIBCRT0} ${LIBCRTBEGIN} ${OBJS} ${LDADD} -L${DESTDIR}/usr/lib ${_SUPCXX} -lgcc -lc -lgcc ${LIBCRTEND} .else ${PROG}: ${LIBCRT0} ${OBJS} ${LIBC} ${DPADD} ${_CCLINK} ${LDFLAGS} ${LDSTATIC} -o ${.TARGET} ${_PROGLDOPTS} ${OBJS} ${LDADD} .endif # defined(DESTDIR) .endif # defined(OBJS) && !empty(OBJS) .if !defined(MAN) MAN= ${PROG}.1 .endif # !defined(MAN) .endif # defined(PROG) .if !defined(_SKIP_BUILD) all: ${PROG} .endif all: _SUBDIRUSE .if !target(clean) cleanprog: rm -f a.out [Ee]rrs mklog core *.core \ ${PROG} ${OBJS} ${LOBJS} ${CLEANFILES} clean: _SUBDIRUSE cleanprog cleandir: _SUBDIRUSE cleanprog .else cleandir: _SUBDIRUSE clean .endif .if defined(SRCS) && (!defined(MKDEP) || ${MKDEP} != autodep) afterdepend: .depend @(TMP=/tmp/_depend$$$$; \ sed -e 's/^\([^\.]*\).o[ ]*:/\1.o \1.ln:/' \ < .depend > $$TMP; \ mv $$TMP .depend) .endif .if !target(install) .if !target(beforeinstall) beforeinstall: .endif .if !target(afterinstall) afterinstall: .endif .if !empty(BINOWN) PROG_INSTALL_OWN ?= -o ${BINOWN} -g ${BINGRP} .endif .if !target(realinstall) realinstall: proginstall .endif .if !target(proginstall) 
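# (a hedged example of the default rule below: "make install" with
# PROG=cat and BINDIR=/usr/bin copies ./cat to ${DESTDIR}/usr/bin/cat,
# creating the target directory first if it does not exist.)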
proginstall: .if defined(PROG) [ -d ${DESTDIR}${BINDIR} ] || \ ${INSTALL} -d ${PROG_INSTALL_OWN} -m 775 ${DESTDIR}${BINDIR} ${INSTALL} ${COPY} ${STRIP_FLAG} ${PROG_INSTALL_OWN} -m ${BINMODE} \ ${PROG} ${DESTDIR}${BINDIR}/${PROG_NAME} .endif .if defined(HIDEGAME) (cd ${DESTDIR}/usr/games; rm -f ${PROG}; ln -s dm ${PROG}) .endif .endif .include install: maninstall install_links _SUBDIRUSE install_links: .if !empty(SYMLINKS) @set ${SYMLINKS}; ${_SYMLINKS_SCRIPT} .endif .if !empty(LINKS) @set ${LINKS}; ${_LINKS_SCRIPT} .endif maninstall: afterinstall afterinstall: realinstall +proginstall: beforeinstall realinstall: beforeinstall .endif .if !target(lint) lint: ${LOBJS} .if defined(LOBJS) && !empty(LOBJS) @${LINT} ${LINTFLAGS} ${LDFLAGS:M-L*} ${LOBJS} ${LDADD} .endif .endif .NOPATH: ${PROG} .if defined(OBJS) && !empty(OBJS) .NOPATH: ${OBJS} .endif .if ${MK_MAN} != "no" .include .endif .if ${MK_NLS} != "no" .include .endif .include .include .include .include .endif Index: projects/clang390-import/contrib/bmake/os.sh =================================================================== --- projects/clang390-import/contrib/bmake/os.sh (revision 305686) +++ projects/clang390-import/contrib/bmake/os.sh (revision 305687) @@ -1,242 +1,243 @@ : # NAME: # os.sh - operating system specifics # # DESCRIPTION: # This file is included at the start of processing. Its role is # to set the variables OS, OSREL, OSMAJOR, MACHINE and MACHINE_ARCH to # reflect the current system. # # It also sets variables such as MAILER, LOCAL_FS, PS_AXC to hide # certain aspects of different UNIX flavours. # # SEE ALSO: # site.sh,funcs.sh # # AUTHOR: # Simon J. Gerraty # RCSid: -# $Id: os.sh,v 1.50 2015/12/17 17:06:29 sjg Exp $ +# $Id: os.sh,v 1.52 2016/06/17 05:15:14 sjg Exp $ # # @(#) Copyright (c) 1994 Simon J. Gerraty # # This file is provided in the hope that it will # be of use. There is absolutely NO WARRANTY. # Permission to copy, redistribute or otherwise # use this file is hereby granted provided that # the above copyright notice and this notice are # left intact. # # Please send copies of changes and bug-fixes to: # sjg@crufty.net # # this lets us skip sourcing it again _OS_SH=: OS=`uname` OSREL=`uname -r` OSMAJOR=`IFS=.; set $OSREL; echo $1` MACHINE=`uname -m` MACHINE_ARCH=`uname -p 2>/dev/null || echo $MACHINE` # there is at least one case of `uname -p` outputting # a bunch of useless drivel case "$MACHINE_ARCH" in unknown|*[!A-Za-z0-9_-]*) MACHINE_ARCH="$MACHINE";; esac # we need this here, and it is not always available... Which() { case "$1" in -*) t=$1; shift;; *) t=-x;; esac case "$1" in /*) test $t $1 && echo $1;; *) # some shells cannot correctly handle `IFS` # in conjunction with the for loop. _dirs=`IFS=:; echo ${2:-$PATH}` for d in $_dirs do test $t $d/$1 && { echo $d/$1; break; } done ;; esac } # tr is insanely non-portable wrt char classes, so we need to # spell out the alphabet. sed y/// would work too. toUpper() { ${TR:-tr} abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ } toLower() { ${TR:-tr} ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz } K= case $OS in AIX) # everyone loves to be different... OSMAJOR=`uname -v` OSREL="$OSMAJOR.`uname -r`" LOCAL_FS=jfs PS_AXC=-e SHARE_ARCH=$OS/$OSMAJOR.X ;; SunOS) CHOWN=`Which chown /usr/etc:/usr/bin` export CHOWN # Great! Solaris keeps moving arch(1) # should just bite the bullet and use uname -p arch=`Which arch /usr/bin:/usr/ucb` MAILER=/usr/ucb/Mail LOCAL_FS=4.2 case "$OSREL" in 4.0*) # uname -m just says sun which could be anything # so use arch(1).
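# (hypothetical example: on a sun4 workstation arch(1) prints "sun4",
# so MACHINE and MACHINE_ARCH both become sun4 below.)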
MACHINE_ARCH=`arch` MACHINE=$MACHINE_ARCH ;; 4*) MACHINE_ARCH=`arch` ;; 5*) K=-k LOCAL_FS=ufs MAILER=mailx PS_AXC=-e # can you believe that ln on Solaris defaults to # overwriting an existing file!!!!! We want one that works! test -x /usr/xpg4/bin/ln && LN=${LN:-/usr/xpg4/bin/ln} # wonderful, 5.8's tr again requires []'s # but /usr/xpg4/bin/tr causes problems if LC_COLLATE is set! # use toUpper/toLower instead. ;; esac case "$OS/$MACHINE_ARCH" in *sun386) SHARE_ARCH=$MACHINE_ARCH;; esac ;; *BSD) K=-k MAILER=/usr/bin/Mail LOCAL_FS=local : $-,$ENV case "$-,$ENV" in *i*,*) ;; *,|*ENVFILE*) ;; *) ENV=;; esac # NetBSD at least has good backward compatibility # so NetBSD/i386 is good enough case $OS in NetBSD) HOST_ARCH=$MACHINE - SHARE_ARCH=$OS/$HOST + SHARE_ARCH=$OS/$HOST_ARCH ;; OpenBSD) arch=`Which arch /usr/bin:/usr/ucb:$PATH` MACHINE_ARCH=`$arch -s` ;; esac NAWK=awk export NAWK ;; HP-UX) TMP_DIRS="/tmp /usr/tmp" LOCAL_FS=hfs MAILER=mailx # don't rely on /bin/sh, it's broken _shell=/bin/ksh; ENV= # also, no one would be interested in OSMAJOR=A case "$OSREL" in ?.09*) OSMAJOR=9; PS_AXC=-e;; ?.10*) OSMAJOR=10; PS_AXC=-e;; esac ;; IRIX) LOCAL_FS=efs ;; Interix) MACHINE=i386 MACHINE_ARCH=i386 ;; UnixWare) OSREL=`uname -v` OSMAJOR=`IFS=.; set $OSREL; echo $1` MACHINE_ARCH=`uname -m` ;; Linux) # Not really any such thing as Linux, but # this covers red-hat and hopefully others. case $MACHINE in i?86) MACHINE_ARCH=i386;; # we don't care about i686 vs i586 esac LOCAL_FS=ext2 PS_AXC=axc [ -x /usr/bin/md5sum ] && { MD5=/usr/bin/md5sum; export MD5; } ;; QNX) case $MACHINE in x86pc) MACHINE_ARCH=i386;; esac ;; Haiku) case $MACHINE in BeBox) MACHINE_ARCH=powerpc;; BeMac) MACHINE_ARCH=powerpc;; BePC) MACHINE_ARCH=i386;; esac ;; esac HOSTNAME=${HOSTNAME:-`( hostname ) 2>/dev/null`} HOSTNAME=${HOSTNAME:-`( uname -n ) 2>/dev/null`} case "$HOSTNAME" in *.*) HOST=`IFS=.; set -- $HOSTNAME; echo $1`;; *) HOST=$HOSTNAME;; esac TMP_DIRS=${TMP_DIRS:-"/tmp /var/tmp"} MACHINE_ARCH=${MACHINE_ARCH:-$MACHINE} HOST_ARCH=${HOST_ARCH:-$MACHINE_ARCH} # we mount server:/share/arch/$SHARE_ARCH as /usr/local -SHARE_ARCH=${SHARE_ARCH:-$OS/$OSMAJOR.X/$HOST_ARCH} +SHARE_ARCH_DEFAULT=$OS/$OSMAJOR.X/$HOST_ARCH +SHARE_ARCH=${SHARE_ARCH:-$SHARE_ARCH_DEFAULT} LN=${LN:-ln} TR=${TR:-tr} # Some people like to have /share/$HOST_TARGET/bin etc. HOST_TARGET=`echo ${OS}${OSMAJOR}-$HOST_ARCH | tr -d / | toLower` export HOST_TARGET case `echo -n .` in -n*) N=; C="\c";; *) N=-n; C=;; esac Echo() { case "$1" in -n) _n=$N _c=$C; shift;; *) _n= _c=;; esac echo $_n "$@" $_c } export HOSTNAME HOST export OS MACHINE MACHINE_ARCH OSREL OSMAJOR LOCAL_FS TMP_DIRS MAILER N C K PS_AXC export LN SHARE_ARCH TR case /$0 in */os.sh) for v in $* do eval vv=\$$v echo "$v='$vv'" done ;; esac Index: projects/clang390-import/contrib/bmake/suff.c =================================================================== --- projects/clang390-import/contrib/bmake/suff.c (revision 305686) +++ projects/clang390-import/contrib/bmake/suff.c (revision 305687) @@ -1,2667 +1,2675 @@ -/* $NetBSD: suff.c,v 1.81 2016/03/15 18:30:14 matthias Exp $ */ +/* $NetBSD: suff.c,v 1.84 2016/06/30 05:34:04 dholland Exp $ */ /* * Copyright (c) 1988, 1989, 1990, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Adam de Boor. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1.
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* * Copyright (c) 1989 by Berkeley Softworks * All rights reserved. * * This code is derived from software contributed to Berkeley by * Adam de Boor. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #ifndef MAKE_NATIVE -static char rcsid[] = "$NetBSD: suff.c,v 1.81 2016/03/15 18:30:14 matthias Exp $"; +static char rcsid[] = "$NetBSD: suff.c,v 1.84 2016/06/30 05:34:04 dholland Exp $"; #else #include #ifndef lint #if 0 static char sccsid[] = "@(#)suff.c 8.4 (Berkeley) 3/21/94"; #else -__RCSID("$NetBSD: suff.c,v 1.81 2016/03/15 18:30:14 matthias Exp $"); +__RCSID("$NetBSD: suff.c,v 1.84 2016/06/30 05:34:04 dholland Exp $"); #endif #endif /* not lint */ #endif /*- * suff.c -- * Functions to maintain suffix lists and find implicit dependents * using suffix transformation rules * * Interface: * Suff_Init Initialize all things to do with suffixes. * * Suff_End Cleanup the module * * Suff_DoPaths This function is used to make life easier * when searching for a file according to its * suffix. It takes the global search path, * as defined using the .PATH: target, and appends * its directories to the path of each of the * defined suffixes, as specified using * .PATH: targets. In addition, all * directories given for suffixes labeled as * include files or libraries, using the .INCLUDES * or .LIBS targets, are played with using * Dir_MakeFlags to create the .INCLUDES and * .LIBS global variables. * * Suff_ClearSuffixes Clear out all the suffixes and defined * transformations. * * Suff_IsTransform Return TRUE if the passed string is the lhs * of a transformation rule. * * Suff_AddSuffix Add the passed string as another known suffix. * * Suff_GetPath Return the search path for the given suffix. * * Suff_AddInclude Mark the given suffix as denoting an include * file. * * Suff_AddLib Mark the given suffix as denoting a library. * * Suff_AddTransform Add another transformation to the suffix * graph. Returns GNode suitable for framing, I * mean, tacking commands, attributes, etc. on. * * Suff_SetNull Define the suffix to consider the suffix of * any file that doesn't have a known one. * * Suff_FindDeps Find implicit sources for and the location of * a target based on its suffix. Returns the * bottom-most node added to the graph or NULL * if the target had no implicit sources. * * Suff_FindPath Return the appropriate path to search in * order to find the node. */ #include #include "make.h" #include "hash.h" #include "dir.h" static Lst sufflist; /* Lst of suffixes */ #ifdef CLEANUP static Lst suffClean; /* Lst of suffixes to be cleaned */ #endif static Lst srclist; /* Lst of sources */ static Lst transforms; /* Lst of transformation rules */ static int sNum = 0; /* Counter for assigning suffix numbers */ /* * Structure describing an individual suffix. */ typedef struct _Suff { char *name; /* The suffix itself */ int nameLen; /* Length of the suffix */ short flags; /* Type of suffix */ #define SUFF_INCLUDE 0x01 /* One which is #include'd */ #define SUFF_LIBRARY 0x02 /* One which contains a library */ #define SUFF_NULL 0x04 /* The empty suffix */ Lst searchPath; /* The path along which files of this suffix * may be found */ int sNum; /* The suffix number */ int refCount; /* Reference count of list membership */ Lst parents; /* Suffixes we have a transformation to */ Lst children; /* Suffixes we have a transformation from */ Lst ref; /* List of lists this suffix is referenced */ } Suff; /* * for SuffSuffIsSuffix */ typedef struct { char *ename; /* The end of the name */ int len; /* Length of the name */ } SuffixCmpData; /* * Structure used in the search for implied sources. 
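 * (An illustrative note, not from the original source: when looking for
 * implied sources of "foo.o", the prefix is "foo", and every suffix with
 * a transformation to ".o" contributes a candidate Src such as "foo.c"
 * or "foo.y".)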
*/ typedef struct _Src { char *file; /* The file to look for */ char *pref; /* Prefix from which file was formed */ Suff *suff; /* The suffix on the file */ struct _Src *parent; /* The Src for which this is a source */ GNode *node; /* The node describing the file */ int children; /* Count of existing children (so we don't free * this thing too early or never nuke it) */ #ifdef DEBUG_SRC Lst cp; /* Debug; children list */ #endif } Src; /* * A structure for passing more than one argument to the Lst-library-invoked * function... */ typedef struct { Lst l; Src *s; } LstSrc; typedef struct { GNode **gn; Suff *s; Boolean r; } GNodeSuff; static Suff *suffNull; /* The NULL suffix for this run */ static Suff *emptySuff; /* The empty suffix required for POSIX * single-suffix transformation rules */ static const char *SuffStrIsPrefix(const char *, const char *); static char *SuffSuffIsSuffix(const Suff *, const SuffixCmpData *); static int SuffSuffIsSuffixP(const void *, const void *); static int SuffSuffHasNameP(const void *, const void *); static int SuffSuffIsPrefix(const void *, const void *); static int SuffGNHasNameP(const void *, const void *); static void SuffUnRef(void *, void *); static void SuffFree(void *); static void SuffInsert(Lst, Suff *); static void SuffRemove(Lst, Suff *); static Boolean SuffParseTransform(char *, Suff **, Suff **); static int SuffRebuildGraph(void *, void *); static int SuffScanTargets(void *, void *); static int SuffAddSrc(void *, void *); static int SuffRemoveSrc(Lst); static void SuffAddLevel(Lst, Src *); static Src *SuffFindThem(Lst, Lst); static Src *SuffFindCmds(Src *, Lst); static void SuffExpandChildren(LstNode, GNode *); static void SuffExpandWildcards(LstNode, GNode *); static Boolean SuffApplyTransform(GNode *, GNode *, Suff *, Suff *); static void SuffFindDeps(GNode *, Lst); static void SuffFindArchiveDeps(GNode *, Lst); static void SuffFindNormalDeps(GNode *, Lst); static int SuffPrintName(void *, void *); static int SuffPrintSuff(void *, void *); static int SuffPrintTrans(void *, void *); /*************** Lst Predicates ****************/ /*- *----------------------------------------------------------------------- * SuffStrIsPrefix -- * See if pref is a prefix of str. * * Input: * pref possible prefix * str string to check * * Results: * NULL if it ain't, pointer to character in str after prefix if so * * Side Effects: * None *----------------------------------------------------------------------- */ static const char * SuffStrIsPrefix(const char *pref, const char *str) { while (*str && *pref == *str) { pref++; str++; } return (*pref ? NULL : str); } /*- *----------------------------------------------------------------------- * SuffSuffIsSuffix -- * See if suff is a suffix of str. sd->ename should point to THE END * of the string to check. (THE END == the null byte) * * Input: * s possible suffix * sd string to examine * * Results: * NULL if it ain't, pointer to character in str before suffix if * it is. * * Side Effects: * None *----------------------------------------------------------------------- */ static char * SuffSuffIsSuffix(const Suff *s, const SuffixCmpData *sd) { char *p1; /* Pointer into suffix name */ char *p2; /* Pointer into string being examined */ if (sd->len < s->nameLen) return NULL; /* this string is shorter than the suffix */ p1 = s->name + s->nameLen; p2 = sd->ename; while (p1 >= s->name && *p1 == *p2) { p1--; p2--; } return (p1 == s->name - 1 ? 
p2 : NULL); } /*- *----------------------------------------------------------------------- * SuffSuffIsSuffixP -- * Predicate form of SuffSuffIsSuffix. Passed as the callback function * to Lst_Find. * * Results: * 0 if the suffix is the one desired, non-zero if not. * * Side Effects: * None. * *----------------------------------------------------------------------- */ static int SuffSuffIsSuffixP(const void *s, const void *sd) { return(!SuffSuffIsSuffix(s, sd)); } /*- *----------------------------------------------------------------------- * SuffSuffHasNameP -- * Callback procedure for finding a suffix based on its name. Used by * Suff_GetPath. * * Input: * s Suffix to check * sd Desired name * * Results: * 0 if the suffix is of the given name. non-zero otherwise. * * Side Effects: * None *----------------------------------------------------------------------- */ static int SuffSuffHasNameP(const void *s, const void *sname) { return (strcmp(sname, ((const Suff *)s)->name)); } /*- *----------------------------------------------------------------------- * SuffSuffIsPrefix -- * See if the suffix described by s is a prefix of the string. Care * must be taken when using this to search for transformations and * what-not, since there could well be two suffixes, one of which * is a prefix of the other... * * Input: * s suffix to compare * str string to examine * * Results: * 0 if s is a prefix of str. non-zero otherwise * * Side Effects: * None *----------------------------------------------------------------------- */ static int SuffSuffIsPrefix(const void *s, const void *str) { return SuffStrIsPrefix(((const Suff *)s)->name, str) == NULL; } /*- *----------------------------------------------------------------------- * SuffGNHasNameP -- * See if the graph node has the desired name * * Input: * gn current node we're looking at * name name we're looking for * * Results: * 0 if it does. non-zero if it doesn't * * Side Effects: * None *----------------------------------------------------------------------- */ static int SuffGNHasNameP(const void *gn, const void *name) { return (strcmp(name, ((const GNode *)gn)->name)); } /*********** Maintenance Functions ************/ static void SuffUnRef(void *lp, void *sp) { Lst l = (Lst) lp; LstNode ln = Lst_Member(l, sp); if (ln != NULL) { Lst_Remove(l, ln); ((Suff *)sp)->refCount--; } } /*- *----------------------------------------------------------------------- * SuffFree -- * Free up all memory associated with the given suffix structure. 
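 * (The name, the search path, the parent and child lists, and the
 * back-reference list are all released.)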
* * Results: * none * * Side Effects: * the suffix entry is destroyed *----------------------------------------------------------------------- */ static void SuffFree(void *sp) { Suff *s = (Suff *)sp; if (s == suffNull) suffNull = NULL; if (s == emptySuff) emptySuff = NULL; #ifdef notdef /* We don't delete suffixes in order, so we cannot use this */ if (s->refCount) Punt("Internal error deleting suffix `%s' with refcount = %d", s->name, s->refCount); #endif Lst_Destroy(s->ref, NULL); Lst_Destroy(s->children, NULL); Lst_Destroy(s->parents, NULL); Lst_Destroy(s->searchPath, Dir_Destroy); free(s->name); free(s); } /*- *----------------------------------------------------------------------- * SuffRemove -- * Remove the suffix from the list * * Results: * None * * Side Effects: * The reference count for the suffix is decremented and the * suffix is possibly freed *----------------------------------------------------------------------- */ static void SuffRemove(Lst l, Suff *s) { SuffUnRef(l, s); if (s->refCount == 0) { SuffUnRef(sufflist, s); SuffFree(s); } } /*- *----------------------------------------------------------------------- * SuffInsert -- * Insert the suffix into the list keeping the list ordered by suffix * numbers. * * Input: * l the list in which s should be inserted * s the suffix to insert * * Results: * None * * Side Effects: * The reference count of the suffix is incremented *----------------------------------------------------------------------- */ static void SuffInsert(Lst l, Suff *s) { LstNode ln; /* current element in l we're examining */ Suff *s2 = NULL; /* the suffix descriptor in this element */ if (Lst_Open(l) == FAILURE) { return; } while ((ln = Lst_Next(l)) != NULL) { s2 = (Suff *)Lst_Datum(ln); if (s2->sNum >= s->sNum) { break; } } Lst_Close(l); if (DEBUG(SUFF)) { fprintf(debug_file, "inserting %s(%d)...", s->name, s->sNum); } if (ln == NULL) { if (DEBUG(SUFF)) { fprintf(debug_file, "at end of list\n"); } (void)Lst_AtEnd(l, s); s->refCount++; (void)Lst_AtEnd(s->ref, l); } else if (s2->sNum != s->sNum) { if (DEBUG(SUFF)) { fprintf(debug_file, "before %s(%d)\n", s2->name, s2->sNum); } (void)Lst_InsertBefore(l, ln, s); s->refCount++; (void)Lst_AtEnd(s->ref, l); } else if (DEBUG(SUFF)) { fprintf(debug_file, "already there\n"); } } /*- *----------------------------------------------------------------------- * Suff_ClearSuffixes -- * This is gross. Nuke the list of suffixes but keep all transformation * rules around. The transformation graph is destroyed in this process, * but we leave the list of rules so when a new graph is formed the rules * will remain. * This function is called from the parse module when a * .SUFFIXES:\n line is encountered.
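 * (Hedged example: a bare ".SUFFIXES:" line wipes the known suffixes;
 * a later ".SUFFIXES: .c .o" re-adds them, and retained rules such as
 * a .c.o transformation are relinked into the new graph.)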
* * Results: * none * * Side Effects: * the sufflist and its graph nodes are destroyed *----------------------------------------------------------------------- */ void Suff_ClearSuffixes(void) { #ifdef CLEANUP Lst_Concat(suffClean, sufflist, LST_CONCLINK); #endif sufflist = Lst_Init(FALSE); sNum = 0; if (suffNull) SuffFree(suffNull); emptySuff = suffNull = bmake_malloc(sizeof(Suff)); suffNull->name = bmake_strdup(""); suffNull->nameLen = 0; suffNull->searchPath = Lst_Init(FALSE); Dir_Concat(suffNull->searchPath, dirSearchPath); suffNull->children = Lst_Init(FALSE); suffNull->parents = Lst_Init(FALSE); suffNull->ref = Lst_Init(FALSE); suffNull->sNum = sNum++; suffNull->flags = SUFF_NULL; suffNull->refCount = 1; } /*- *----------------------------------------------------------------------- * SuffParseTransform -- * Parse a transformation string to find its two component suffixes. * * Input: * str String being parsed * srcPtr Place to store source of trans. * targPtr Place to store target of trans. * * Results: * TRUE if the string is a valid transformation and FALSE otherwise. * * Side Effects: * The passed pointers are overwritten. * *----------------------------------------------------------------------- */ static Boolean SuffParseTransform(char *str, Suff **srcPtr, Suff **targPtr) { LstNode srcLn; /* element in suffix list of trans source*/ Suff *src; /* Source of transformation */ LstNode targLn; /* element in suffix list of trans target*/ char *str2; /* Extra pointer (maybe target suffix) */ LstNode singleLn; /* element in suffix list of any suffix * that exactly matches str */ Suff *single = NULL;/* Source of possible transformation to * null suffix */ srcLn = NULL; singleLn = NULL; /* * Loop looking first for a suffix that matches the start of the * string and then for one that exactly matches the rest of it. If * we can find two that meet these criteria, we've successfully * parsed the string. */ for (;;) { if (srcLn == NULL) { srcLn = Lst_Find(sufflist, str, SuffSuffIsPrefix); } else { srcLn = Lst_FindFrom(sufflist, Lst_Succ(srcLn), str, SuffSuffIsPrefix); } if (srcLn == NULL) { /* * Ran out of source suffixes -- no such rule */ if (singleLn != NULL) { /* * Not so fast Mr. Smith! There was a suffix that encompassed * the entire string, so we assume it was a transformation * to the null suffix (thank you POSIX). We still prefer to * find a double rule over a singleton, hence we leave this * check until the end. * * XXX: Use emptySuff over suffNull? */ *srcPtr = single; *targPtr = suffNull; return(TRUE); } return (FALSE); } src = (Suff *)Lst_Datum(srcLn); str2 = str + src->nameLen; if (*str2 == '\0') { single = src; singleLn = srcLn; } else { targLn = Lst_Find(sufflist, str2, SuffSuffHasNameP); if (targLn != NULL) { *srcPtr = src; *targPtr = (Suff *)Lst_Datum(targLn); return (TRUE); } } } } /*- *----------------------------------------------------------------------- * Suff_IsTransform -- * Return TRUE if the given string is a transformation rule * * * Input: * str string to check * * Results: * TRUE if the string is a concatenation of two known suffixes. 
* FALSE otherwise * * Side Effects: * None *----------------------------------------------------------------------- */ Boolean Suff_IsTransform(char *str) { Suff *src, *targ; return (SuffParseTransform(str, &src, &targ)); } /*- *----------------------------------------------------------------------- * Suff_AddTransform -- * Add the transformation rule described by the line to the * list of rules and place the transformation itself in the graph * * Input: * line name of transformation to add * * Results: * The node created for the transformation in the transforms list * * Side Effects: * The node is placed on the end of the transforms Lst and links are * made between the two suffixes mentioned in the target name *----------------------------------------------------------------------- */ GNode * Suff_AddTransform(char *line) { GNode *gn; /* GNode of transformation rule */ Suff *s, /* source suffix */ *t; /* target suffix */ LstNode ln; /* Node for existing transformation */ ln = Lst_Find(transforms, line, SuffGNHasNameP); if (ln == NULL) { /* * Make a new graph node for the transformation. It will be filled in * by the Parse module. */ gn = Targ_NewGN(line); (void)Lst_AtEnd(transforms, gn); } else { /* * New specification for transformation rule. Just nuke the old list * of commands so they can be filled in again... We don't actually * free the commands themselves, because a given command can be * attached to several different transformations. */ gn = (GNode *)Lst_Datum(ln); Lst_Destroy(gn->commands, NULL); Lst_Destroy(gn->children, NULL); gn->commands = Lst_Init(FALSE); gn->children = Lst_Init(FALSE); } gn->type = OP_TRANSFORM; (void)SuffParseTransform(line, &s, &t); /* * link the two together in the proper relationship and order */ if (DEBUG(SUFF)) { fprintf(debug_file, "defining transformation from `%s' to `%s'\n", s->name, t->name); } SuffInsert(t->children, s); SuffInsert(s->parents, t); return (gn); } /*- *----------------------------------------------------------------------- * Suff_EndTransform -- * Handle the finish of a transformation definition, removing the * transformation from the graph if it has neither commands nor * sources. This is a callback procedure for the Parse module via * Lst_ForEach * * Input: * gnp Node for transformation * dummy Node for transformation * * Results: * === 0 * * Side Effects: * If the node has no commands or children, the children and parents * lists of the affected suffixes are altered. * *----------------------------------------------------------------------- */ int Suff_EndTransform(void *gnp, void *dummy) { GNode *gn = (GNode *)gnp; + (void)dummy; + if ((gn->type & OP_DOUBLEDEP) && !Lst_IsEmpty (gn->cohorts)) gn = (GNode *)Lst_Datum(Lst_Last(gn->cohorts)); if ((gn->type & OP_TRANSFORM) && Lst_IsEmpty(gn->commands) && Lst_IsEmpty(gn->children)) { Suff *s, *t; /* * SuffParseTransform() may fail for special rules which are not * actual transformation rules. (e.g. .DEFAULT) */ if (SuffParseTransform(gn->name, &s, &t)) { Lst p; if (DEBUG(SUFF)) { fprintf(debug_file, "deleting transformation from `%s' to `%s'\n", s->name, t->name); } /* * Store s->parents because s could be deleted in SuffRemove */ p = s->parents; /* * Remove the source from the target's children list. We check for a * nil return to handle a beanhead saying something like * .c.o .c.o: * * We'll be called twice when the next target is seen, but .c and .o * are only linked once... 
*/ SuffRemove(t->children, s); /* * Remove the target from the source's parents list */ SuffRemove(p, t); } } else if ((gn->type & OP_TRANSFORM) && DEBUG(SUFF)) { fprintf(debug_file, "transformation %s complete\n", gn->name); } - return(dummy ? 0 : 0); + return 0; } /*- *----------------------------------------------------------------------- * SuffRebuildGraph -- * Called from Suff_AddSuffix via Lst_ForEach to search through the * list of existing transformation rules and rebuild the transformation * graph when it has been destroyed by Suff_ClearSuffixes. If the * given rule is a transformation involving this suffix and another, * existing suffix, the proper relationship is established between * the two. * * Input: * transformp Transformation to test * sp Suffix to rebuild * * Results: * Always 0. * * Side Effects: * The appropriate links will be made between this suffix and * others if transformation rules exist for it. * *----------------------------------------------------------------------- */ static int SuffRebuildGraph(void *transformp, void *sp) { GNode *transform = (GNode *)transformp; Suff *s = (Suff *)sp; char *cp; LstNode ln; Suff *s2; SuffixCmpData sd; /* * First see if it is a transformation from this suffix. */ cp = UNCONST(SuffStrIsPrefix(s->name, transform->name)); if (cp != NULL) { ln = Lst_Find(sufflist, cp, SuffSuffHasNameP); if (ln != NULL) { /* * Found target. Link in and return, since it can't be anything * else. */ s2 = (Suff *)Lst_Datum(ln); SuffInsert(s2->children, s); SuffInsert(s->parents, s2); return(0); } } /* * Not from, maybe to? */ sd.len = strlen(transform->name); sd.ename = transform->name + sd.len; cp = SuffSuffIsSuffix(s, &sd); if (cp != NULL) { /* * Null-terminate the source suffix in order to find it. */ cp[1] = '\0'; ln = Lst_Find(sufflist, transform->name, SuffSuffHasNameP); /* * Replace the start of the target suffix */ cp[1] = s->name[0]; if (ln != NULL) { /* * Found it -- establish the proper relationship */ s2 = (Suff *)Lst_Datum(ln); SuffInsert(s->children, s2); SuffInsert(s2->parents, s); } } return(0); } /*- *----------------------------------------------------------------------- * SuffScanTargets -- * Called from Suff_AddSuffix via Lst_ForEach to search through the * list of existing targets and find if any of the existing targets * can be turned into a transformation rule. * * Results: * 1 if a new main target has been selected, 0 otherwise. * * Side Effects: * If such a target is found and the target is the current main * target, the main target is set to NULL and the next target * examined (if that exists) becomes the main target. 
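 * (Illustrative: adding ".c" via .SUFFIXES can turn an ordinary target
 * named ".c.o" into a transformation rule; if that target was the main
 * target, the next eligible target becomes the main one instead.)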
* *----------------------------------------------------------------------- */ static int SuffScanTargets(void *targetp, void *gsp) { GNode *target = (GNode *)targetp; GNodeSuff *gs = (GNodeSuff *)gsp; Suff *s, *t; char *ptr; if (*gs->gn == NULL && gs->r && (target->type & OP_NOTARGET) == 0) { *gs->gn = target; Targ_SetMain(target); return 1; } if ((unsigned int)target->type == OP_TRANSFORM) return 0; if ((ptr = strstr(target->name, gs->s->name)) == NULL || ptr == target->name) return 0; if (SuffParseTransform(target->name, &s, &t)) { if (*gs->gn == target) { gs->r = TRUE; *gs->gn = NULL; Targ_SetMain(NULL); } Lst_Destroy(target->children, NULL); target->children = Lst_Init(FALSE); target->type = OP_TRANSFORM; /* * link the two together in the proper relationship and order */ if (DEBUG(SUFF)) { fprintf(debug_file, "defining transformation from `%s' to `%s'\n", s->name, t->name); } SuffInsert(t->children, s); SuffInsert(s->parents, t); } return 0; } /*- *----------------------------------------------------------------------- * Suff_AddSuffix -- * Add the suffix in string to the end of the list of known suffixes. * Should we restructure the suffix graph? Make doesn't... * * Input: * str the name of the suffix to add * * Results: * None * * Side Effects: * A GNode is created for the suffix and a Suff structure is created and * added to the suffixes list unless the suffix was already known. * The mainNode passed can be modified if a target mutated into a * transform and that target happened to be the main target. *----------------------------------------------------------------------- */ void Suff_AddSuffix(char *str, GNode **gn) { Suff *s; /* new suffix descriptor */ LstNode ln; GNodeSuff gs; ln = Lst_Find(sufflist, str, SuffSuffHasNameP); if (ln == NULL) { s = bmake_malloc(sizeof(Suff)); s->name = bmake_strdup(str); s->nameLen = strlen(s->name); s->searchPath = Lst_Init(FALSE); s->children = Lst_Init(FALSE); s->parents = Lst_Init(FALSE); s->ref = Lst_Init(FALSE); s->sNum = sNum++; s->flags = 0; s->refCount = 1; (void)Lst_AtEnd(sufflist, s); /* * We also look at our existing targets list to see if adding * this suffix will make one of our current targets mutate into * a suffix rule. This is ugly, but other makes treat all targets * that start with a . as suffix rules. */ gs.gn = gn; gs.s = s; gs.r = FALSE; Lst_ForEach(Targ_List(), SuffScanTargets, &gs); /* * Look for any existing transformations from or to this suffix. * XXX: Only do this after a Suff_ClearSuffixes? */ Lst_ForEach(transforms, SuffRebuildGraph, s); } } /*- *----------------------------------------------------------------------- * Suff_GetPath -- * Return the search path for the given suffix, if it's defined. * * Results: * The searchPath for the desired suffix or NULL if the suffix isn't * defined. * * Side Effects: * None *----------------------------------------------------------------------- */ Lst Suff_GetPath(char *sname) { LstNode ln; Suff *s; ln = Lst_Find(sufflist, sname, SuffSuffHasNameP); if (ln == NULL) { return NULL; } else { s = (Suff *)Lst_Datum(ln); return (s->searchPath); } } /*- *----------------------------------------------------------------------- * Suff_DoPaths -- * Extend the search paths for all suffixes to include the default * search path. * * Results: * None. * * Side Effects: * The searchPath field of all the suffixes is extended by the * directories in dirSearchPath. 
If paths were specified for the * ".h" suffix, the directories are stuffed into a global variable * called ".INCLUDES" with each directory preceded by a -I. The same * is done for the ".a" suffix, except the variable is called * ".LIBS" and the flag is -L. *----------------------------------------------------------------------- */ void Suff_DoPaths(void) { Suff *s; LstNode ln; char *ptr; Lst inIncludes; /* Cumulative .INCLUDES path */ Lst inLibs; /* Cumulative .LIBS path */ if (Lst_Open(sufflist) == FAILURE) { return; } inIncludes = Lst_Init(FALSE); inLibs = Lst_Init(FALSE); while ((ln = Lst_Next(sufflist)) != NULL) { s = (Suff *)Lst_Datum(ln); if (!Lst_IsEmpty (s->searchPath)) { #ifdef INCLUDES if (s->flags & SUFF_INCLUDE) { Dir_Concat(inIncludes, s->searchPath); } #endif /* INCLUDES */ #ifdef LIBRARIES if (s->flags & SUFF_LIBRARY) { Dir_Concat(inLibs, s->searchPath); } #endif /* LIBRARIES */ Dir_Concat(s->searchPath, dirSearchPath); } else { Lst_Destroy(s->searchPath, Dir_Destroy); s->searchPath = Lst_Duplicate(dirSearchPath, Dir_CopyDir); } } Var_Set(".INCLUDES", ptr = Dir_MakeFlags("-I", inIncludes), VAR_GLOBAL, 0); free(ptr); Var_Set(".LIBS", ptr = Dir_MakeFlags("-L", inLibs), VAR_GLOBAL, 0); free(ptr); Lst_Destroy(inIncludes, Dir_Destroy); Lst_Destroy(inLibs, Dir_Destroy); Lst_Close(sufflist); } /*- *----------------------------------------------------------------------- * Suff_AddInclude -- * Add the given suffix as a type of file which gets included. * Called from the parse module when a .INCLUDES line is parsed. * The suffix must have already been defined. * * Input: * sname Name of the suffix to mark * * Results: * None. * * Side Effects: * The SUFF_INCLUDE bit is set in the suffix's flags field * *----------------------------------------------------------------------- */ void Suff_AddInclude(char *sname) { LstNode ln; Suff *s; ln = Lst_Find(sufflist, sname, SuffSuffHasNameP); if (ln != NULL) { s = (Suff *)Lst_Datum(ln); s->flags |= SUFF_INCLUDE; } } /*- *----------------------------------------------------------------------- * Suff_AddLib -- * Add the given suffix as a type of file which is a library. * Called from the parse module when parsing a .LIBS line. The * suffix must have been defined via .SUFFIXES before this is * called. * * Input: * sname Name of the suffix to mark * * Results: * None. * * Side Effects: * The SUFF_LIBRARY bit is set in the suffix's flags field * *----------------------------------------------------------------------- */ void Suff_AddLib(char *sname) { LstNode ln; Suff *s; ln = Lst_Find(sufflist, sname, SuffSuffHasNameP); if (ln != NULL) { s = (Suff *)Lst_Datum(ln); s->flags |= SUFF_LIBRARY; } } /********** Implicit Source Search Functions *********/ /*- *----------------------------------------------------------------------- * SuffAddSrc -- * Add a suffix as a Src structure to the given list with its parent * being the given Src structure. If the suffix is the null suffix, * the prefix is used unaltered as the file name in the Src structure. 
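 * (Illustrative: for prefix "foo" and suffix ".c" the candidate file is
 * "foo.c"; for the null suffix the candidate is simply "foo".)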
* * Input: * sp suffix for which to create a Src structure * lsp list and parent for the new Src * * Results: * always returns 0 * * Side Effects: * A Src structure is created and tacked onto the end of the list *----------------------------------------------------------------------- */ static int SuffAddSrc(void *sp, void *lsp) { Suff *s = (Suff *)sp; LstSrc *ls = (LstSrc *)lsp; Src *s2; /* new Src structure */ Src *targ; /* Target structure */ targ = ls->s; if ((s->flags & SUFF_NULL) && (*s->name != '\0')) { /* * If the suffix has been marked as the NULL suffix, also create a Src * structure for a file with no suffix attached. Two birds, and all * that... */ s2 = bmake_malloc(sizeof(Src)); s2->file = bmake_strdup(targ->pref); s2->pref = targ->pref; s2->parent = targ; s2->node = NULL; s2->suff = s; s->refCount++; s2->children = 0; targ->children += 1; (void)Lst_AtEnd(ls->l, s2); #ifdef DEBUG_SRC s2->cp = Lst_Init(FALSE); Lst_AtEnd(targ->cp, s2); - fprintf(debug_file, "1 add %x %x to %x:", targ, s2, ls->l); + fprintf(debug_file, "1 add %p %p to %p:", targ, s2, ls->l); Lst_ForEach(ls->l, PrintAddr, NULL); fprintf(debug_file, "\n"); #endif } s2 = bmake_malloc(sizeof(Src)); s2->file = str_concat(targ->pref, s->name, 0); s2->pref = targ->pref; s2->parent = targ; s2->node = NULL; s2->suff = s; s->refCount++; s2->children = 0; targ->children += 1; (void)Lst_AtEnd(ls->l, s2); #ifdef DEBUG_SRC s2->cp = Lst_Init(FALSE); Lst_AtEnd(targ->cp, s2); - fprintf(debug_file, "2 add %x %x to %x:", targ, s2, ls->l); + fprintf(debug_file, "2 add %p %p to %p:", targ, s2, ls->l); Lst_ForEach(ls->l, PrintAddr, NULL); fprintf(debug_file, "\n"); #endif return(0); } /*- *----------------------------------------------------------------------- * SuffAddLevel -- * Add all the children of targ as Src structures to the given list * * Input: * l list to which to add the new level * targ Src structure to use as the parent * * Results: * None * * Side Effects: * Lots of structures are created and added to the list *----------------------------------------------------------------------- */ static void SuffAddLevel(Lst l, Src *targ) { LstSrc ls; ls.s = targ; ls.l = l; Lst_ForEach(targ->suff->children, SuffAddSrc, &ls); } /*- *---------------------------------------------------------------------- * SuffRemoveSrc -- * Free all src structures in list that don't have a reference count * * Results: * True if a src was removed * * Side Effects: * The memory is free'd.
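 * (Note: each call removes at most one childless Src and returns
 * immediately, so callers invoke it in a loop until it returns 0.)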
*---------------------------------------------------------------------- */ static int SuffRemoveSrc(Lst l) { LstNode ln; Src *s; int t = 0; if (Lst_Open(l) == FAILURE) { return 0; } #ifdef DEBUG_SRC fprintf(debug_file, "cleaning %lx: ", (unsigned long) l); Lst_ForEach(l, PrintAddr, NULL); fprintf(debug_file, "\n"); #endif while ((ln = Lst_Next(l)) != NULL) { s = (Src *)Lst_Datum(ln); if (s->children == 0) { free(s->file); if (!s->parent) free(s->pref); else { #ifdef DEBUG_SRC - LstNode ln = Lst_Member(s->parent->cp, s); - if (ln != NULL) - Lst_Remove(s->parent->cp, ln); + LstNode ln2 = Lst_Member(s->parent->cp, s); + if (ln2 != NULL) + Lst_Remove(s->parent->cp, ln2); #endif --s->parent->children; } #ifdef DEBUG_SRC - fprintf(debug_file, "free: [l=%x] p=%x %d\n", l, s, s->children); + fprintf(debug_file, "free: [l=%p] p=%p %d\n", l, s, s->children); Lst_Destroy(s->cp, NULL); #endif Lst_Remove(l, ln); free(s); t |= 1; Lst_Close(l); return TRUE; } #ifdef DEBUG_SRC else { - fprintf(debug_file, "keep: [l=%x] p=%x %d: ", l, s, s->children); + fprintf(debug_file, "keep: [l=%p] p=%p %d: ", l, s, s->children); Lst_ForEach(s->cp, PrintAddr, NULL); fprintf(debug_file, "\n"); } #endif } Lst_Close(l); return t; } /*- *----------------------------------------------------------------------- * SuffFindThem -- * Find the first existing file/target in the list srcs * * Input: * srcs list of Src structures to search through * * Results: * The lowest structure in the chain of transformations * * Side Effects: * None *----------------------------------------------------------------------- */ static Src * SuffFindThem(Lst srcs, Lst slst) { Src *s; /* current Src */ Src *rs; /* returned Src */ char *ptr; rs = NULL; while (!Lst_IsEmpty (srcs)) { s = (Src *)Lst_DeQueue(srcs); if (DEBUG(SUFF)) { fprintf(debug_file, "\ttrying %s...", s->file); } /* * A file is considered to exist if either a node exists in the * graph for it or the file actually exists. */ if (Targ_FindNode(s->file, TARG_NOCREATE) != NULL) { #ifdef DEBUG_SRC - fprintf(debug_file, "remove %x from %x\n", s, srcs); + fprintf(debug_file, "remove %p from %p\n", s, srcs); #endif rs = s; break; } if ((ptr = Dir_FindFile(s->file, s->suff->searchPath)) != NULL) { rs = s; #ifdef DEBUG_SRC - fprintf(debug_file, "remove %x from %x\n", s, srcs); + fprintf(debug_file, "remove %p from %p\n", s, srcs); #endif free(ptr); break; } if (DEBUG(SUFF)) { fprintf(debug_file, "not there\n"); } SuffAddLevel(srcs, s); Lst_AtEnd(slst, s); } if (DEBUG(SUFF) && rs) { fprintf(debug_file, "got it\n"); } return (rs); } /*- *----------------------------------------------------------------------- * SuffFindCmds -- * See if any of the children of the target in the Src structure is * one from which the target can be transformed. If there is one, * a Src structure is put together for it and returned. * * Input: * targ Src structure to play with * * Results: * The Src structure of the "winning" child, or NULL if no such beast. * * Side Effects: * A Src structure may be allocated. 
* *----------------------------------------------------------------------- */ static Src * SuffFindCmds(Src *targ, Lst slst) { LstNode ln; /* General-purpose list node */ GNode *t, /* Target GNode */ *s; /* Source GNode */ int prefLen;/* The length of the defined prefix */ Suff *suff; /* Suffix on matching beastie */ Src *ret; /* Return value */ char *cp; t = targ->node; (void)Lst_Open(t->children); prefLen = strlen(targ->pref); for (;;) { ln = Lst_Next(t->children); if (ln == NULL) { Lst_Close(t->children); return NULL; } s = (GNode *)Lst_Datum(ln); if (s->type & OP_OPTIONAL && Lst_IsEmpty(t->commands)) { /* * We haven't looked to see if .OPTIONAL files exist yet, so * don't use one as the implicit source. * This allows us to use .OPTIONAL in .depend files so make won't * complain "don't know how to make xxx.h" when a dependent file * has been moved/deleted. */ continue; } cp = strrchr(s->name, '/'); if (cp == NULL) { cp = s->name; } else { cp++; } if (strncmp(cp, targ->pref, prefLen) != 0) continue; /* * The node matches the prefix ok, see if it has a known * suffix. */ ln = Lst_Find(sufflist, &cp[prefLen], SuffSuffHasNameP); if (ln == NULL) continue; /* * It even has a known suffix, see if there's a transformation * defined between the node's suffix and the target's suffix. * * XXX: Handle multi-stage transformations here, too. */ suff = (Suff *)Lst_Datum(ln); if (Lst_Member(suff->parents, targ->suff) != NULL) break; } /* * Hot Damn! Create a new Src structure to describe * this transformation (making sure to duplicate the * source node's name so Suff_FindDeps can free it * again (ick)), and return the new structure. */ ret = bmake_malloc(sizeof(Src)); ret->file = bmake_strdup(s->name); ret->pref = targ->pref; ret->suff = suff; suff->refCount++; ret->parent = targ; ret->node = s; ret->children = 0; targ->children += 1; #ifdef DEBUG_SRC ret->cp = Lst_Init(FALSE); - fprintf(debug_file, "3 add %x %x\n", targ, ret); + fprintf(debug_file, "3 add %p %p\n", targ, ret); Lst_AtEnd(targ->cp, ret); #endif Lst_AtEnd(slst, ret); if (DEBUG(SUFF)) { fprintf(debug_file, "\tusing existing source %s\n", s->name); } return (ret); } /*- *----------------------------------------------------------------------- * SuffExpandChildren -- * Expand the names of any children of a given node that contain * variable invocations or file wildcards into actual targets. * * Input: * cln Child to examine * pgn Parent node being processed * * Results: * None. * * Side Effects: * The expanded node is removed from the parent's list of children, * and the parent's unmade counter is decremented, but other nodes * may be added. * *----------------------------------------------------------------------- */ static void SuffExpandChildren(LstNode cln, GNode *pgn) { GNode *cgn = (GNode *)Lst_Datum(cln); GNode *gn; /* New source 8) */ char *cp; /* Expanded value */ if (!Lst_IsEmpty(cgn->order_pred) || !Lst_IsEmpty(cgn->order_succ)) /* It is all too hard to process the result of .ORDER */ return; if (cgn->type & OP_WAIT) /* Ignore these (& OP_PHONY ?) */ return; /* * First do variable expansion -- this takes precedence over * wildcard expansion. If the result contains wildcards, they'll be gotten * to later since the resulting words are tacked on to the end of * the children list.
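* A hypothetical illustration (invented names, not part of this change): given * SRCS= main.c lex*.c * prog: ${SRCS} * the child "${SRCS}" is first expanded into the words "main.c" and "lex*.c"; each word becomes a new child of "prog", and "lex*.c" is then caught by the wildcard pass (SuffExpandWildcards) in its turn.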
*/ if (strchr(cgn->name, '$') == NULL) { SuffExpandWildcards(cln, pgn); return; } if (DEBUG(SUFF)) { fprintf(debug_file, "Expanding \"%s\"...", cgn->name); } cp = Var_Subst(NULL, cgn->name, pgn, VARF_UNDEFERR|VARF_WANTRES); if (cp != NULL) { Lst members = Lst_Init(FALSE); if (cgn->type & OP_ARCHV) { /* * Node was an archive(member) target, so we want to call * on the Arch module to find the nodes for us, expanding * variables in the parent's context. */ char *sacrifice = cp; (void)Arch_ParseArchive(&sacrifice, members, pgn); } else { /* * Break the result into a vector of strings whose nodes * we can find, then add those nodes to the members list. * Unfortunately, we can't use brk_string b/c it * doesn't understand about variable specifications with * spaces in them... */ char *start; char *initcp = cp; /* For freeing... */ for (start = cp; *start == ' ' || *start == '\t'; start++) continue; for (cp = start; *cp != '\0'; cp++) { if (*cp == ' ' || *cp == '\t') { /* * White-space -- terminate element, find the node, * add it, skip any further spaces. */ *cp++ = '\0'; gn = Targ_FindNode(start, TARG_CREATE); (void)Lst_AtEnd(members, gn); while (*cp == ' ' || *cp == '\t') { cp++; } /* * Adjust cp for increment at start of loop, but * set start to first non-space. */ start = cp--; } else if (*cp == '$') { /* * Start of a variable spec -- contact variable module * to find the end so we can skip over it. */ char *junk; int len; void *freeIt; junk = Var_Parse(cp, pgn, VARF_UNDEFERR|VARF_WANTRES, &len, &freeIt); if (junk != var_Error) { cp += len - 1; } free(freeIt); - } else if (*cp == '\\' && *cp != '\0') { + } else if (*cp == '\\' && cp[1] != '\0') { /* * Escaped something -- skip over it */ cp++; } } if (cp != start) { /* * Stuff left over -- add it to the list too */ gn = Targ_FindNode(start, TARG_CREATE); (void)Lst_AtEnd(members, gn); } /* * Point cp back at the beginning again so the variable value * can be freed. */ cp = initcp; } /* * Add all elements of the members list to the parent node. */ while(!Lst_IsEmpty(members)) { gn = (GNode *)Lst_DeQueue(members); if (DEBUG(SUFF)) { fprintf(debug_file, "%s...", gn->name); } /* Add gn to the parents child list before the original child */ (void)Lst_InsertBefore(pgn->children, cln, gn); (void)Lst_AtEnd(gn->parents, pgn); pgn->unmade++; /* Expand wildcards on new node */ SuffExpandWildcards(Lst_Prev(cln), pgn); } Lst_Destroy(members, NULL); /* * Free the result */ free(cp); } if (DEBUG(SUFF)) { fprintf(debug_file, "\n"); } /* * Now the source is expanded, remove it from the list of children to * keep it from being processed. 
*/ pgn->unmade--; Lst_Remove(pgn->children, cln); Lst_Remove(cgn->parents, Lst_Member(cgn->parents, pgn)); } static void SuffExpandWildcards(LstNode cln, GNode *pgn) { GNode *cgn = (GNode *)Lst_Datum(cln); GNode *gn; /* New source 8) */ char *cp; /* Expanded value */ Lst explist; /* List of expansions */ if (!Dir_HasWildcards(cgn->name)) return; /* * Expand the word along the chosen path */ explist = Lst_Init(FALSE); Dir_Expand(cgn->name, Suff_FindPath(cgn), explist); while (!Lst_IsEmpty(explist)) { /* * Fetch next expansion off the list and find its GNode */ cp = (char *)Lst_DeQueue(explist); if (DEBUG(SUFF)) { fprintf(debug_file, "%s...", cp); } gn = Targ_FindNode(cp, TARG_CREATE); /* Add gn to the parents child list before the original child */ (void)Lst_InsertBefore(pgn->children, cln, gn); (void)Lst_AtEnd(gn->parents, pgn); pgn->unmade++; } /* * Nuke what's left of the list */ Lst_Destroy(explist, NULL); if (DEBUG(SUFF)) { fprintf(debug_file, "\n"); } /* * Now the source is expanded, remove it from the list of children to * keep it from being processed. */ pgn->unmade--; Lst_Remove(pgn->children, cln); Lst_Remove(cgn->parents, Lst_Member(cgn->parents, pgn)); } /*- *----------------------------------------------------------------------- * Suff_FindPath -- * Find a path along which to expand the node. * * If the word has a known suffix, use that path. * If it has no known suffix, use the default system search path. * * Input: * gn Node being examined * * Results: * The appropriate path to search for the GNode. * * Side Effects: * XXX: We could set the suffix here so that we don't have to scan * again. * *----------------------------------------------------------------------- */ Lst Suff_FindPath(GNode* gn) { Suff *suff = gn->suffix; if (suff == NULL) { SuffixCmpData sd; /* Search string data */ LstNode ln; sd.len = strlen(gn->name); sd.ename = gn->name + sd.len; ln = Lst_Find(sufflist, &sd, SuffSuffIsSuffixP); if (DEBUG(SUFF)) { fprintf(debug_file, "Wildcard expanding \"%s\"...", gn->name); } if (ln != NULL) suff = (Suff *)Lst_Datum(ln); /* XXX: Here we can save the suffix so we don't have to do this again */ } if (suff != NULL) { if (DEBUG(SUFF)) { fprintf(debug_file, "suffix is \"%s\"...", suff->name); } return suff->searchPath; } else { /* * Use default search path */ return dirSearchPath; } } /*- *----------------------------------------------------------------------- * SuffApplyTransform -- * Apply a transformation rule, given the source and target nodes * and suffixes. * * Input: * tGn Target node * sGn Source node * t Target suffix * s Source suffix * * Results: * TRUE if successful, FALSE if not. * * Side Effects: * The source and target are linked and the commands from the * transformation are added to the target node's commands list. * All attributes but OP_DEPMASK and OP_TRANSFORM are applied * to the target. The target also inherits all the sources for * the transformation rule. * *----------------------------------------------------------------------- */ static Boolean SuffApplyTransform(GNode *tGn, GNode *sGn, Suff *t, Suff *s) { LstNode ln, nln; /* General node */ char *tname; /* Name of transformation rule */ GNode *gn; /* Node for same */ /* * Form the proper links between the target and source. 
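* (Illustrative example, invented names: applying a ".c.o" rule to build "foo.o" (tGn) from "foo.c" (sGn) makes "foo.c" a child of "foo.o" and "foo.o" a parent of "foo.c"; the rule node looked up below is named by concatenating the source and target suffixes, i.e. ".c.o".)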
*/ (void)Lst_AtEnd(tGn->children, sGn); (void)Lst_AtEnd(sGn->parents, tGn); tGn->unmade += 1; /* * Locate the transformation rule itself */ tname = str_concat(s->name, t->name, 0); ln = Lst_Find(transforms, tname, SuffGNHasNameP); free(tname); if (ln == NULL) { /* * Not really such a transformation rule (can happen when we're * called to link an OP_MEMBER and OP_ARCHV node), so return * FALSE. */ return(FALSE); } gn = (GNode *)Lst_Datum(ln); if (DEBUG(SUFF)) { fprintf(debug_file, "\tapplying %s -> %s to \"%s\"\n", s->name, t->name, tGn->name); } /* * Record last child for expansion purposes */ ln = Lst_Last(tGn->children); /* * Pass the buck to Make_HandleUse to apply the rule */ (void)Make_HandleUse(gn, tGn); /* * Deal with wildcards and variables in any acquired sources */ for (ln = Lst_Succ(ln); ln != NULL; ln = nln) { nln = Lst_Succ(ln); SuffExpandChildren(ln, tGn); } /* * Keep track of another parent to which this beast is transformed so * the .IMPSRC variable can be set correctly for the parent. */ (void)Lst_AtEnd(sGn->iParents, tGn); return(TRUE); } /*- *----------------------------------------------------------------------- * SuffFindArchiveDeps -- * Locate dependencies for an OP_ARCHV node. * * Input: * gn Node for which to locate dependencies * * Results: * None * * Side Effects: * Same as Suff_FindDeps * *----------------------------------------------------------------------- */ static void SuffFindArchiveDeps(GNode *gn, Lst slst) { char *eoarch; /* End of archive portion */ char *eoname; /* End of member portion */ GNode *mem; /* Node for member */ static const char *copy[] = { /* Variables to be copied from the member node */ TARGET, /* Must be first */ PREFIX, /* Must be second */ }; LstNode ln, nln; /* Next suffix node to check */ int i; /* Index into copy and vals */ Suff *ms; /* Suffix descriptor for member */ char *name; /* Start of member's name */ /* * The node is an archive(member) pair, so we must find a * suffix for both of them. */ eoarch = strchr(gn->name, '('); eoname = strchr(eoarch, ')'); *eoname = '\0'; /* Nuke parentheses during suffix search */ *eoarch = '\0'; /* So a suffix can be found */ name = eoarch + 1; /* * To simplify things, call Suff_FindDeps recursively on the member now, * so we can simply compare the member's .PREFIX and .TARGET variables * to locate its suffix. This allows us to figure out the suffix to * use for the archive without having to do a quadratic search over the * suffix list, backtracking for each one... */ mem = Targ_FindNode(name, TARG_CREATE); SuffFindDeps(mem, slst); /* * Create the link between the two nodes right off */ (void)Lst_AtEnd(gn->children, mem); (void)Lst_AtEnd(mem->parents, gn); gn->unmade += 1; /* * Copy in the variables from the member node to this one. */ for (i = (sizeof(copy)/sizeof(copy[0]))-1; i >= 0; i--) { char *p1; Var_Set(copy[i], Var_Value(copy[i], mem, &p1), gn, 0); free(p1); } ms = mem->suffix; if (ms == NULL) { /* * Didn't know what it was -- use .NULL suffix if not in make mode */ if (DEBUG(SUFF)) { fprintf(debug_file, "using null suffix\n"); } ms = suffNull; } /* * Set the other two local variables required for this target. */ Var_Set(MEMBER, name, gn, 0); Var_Set(ARCHIVE, gn->name, gn, 0); /* * Set $@ for compatibility with other makes */ Var_Set(TARGET, gn->name, gn, 0); /* * Now we've got the important local variables set, expand any sources * that still contain variables or wildcards in their names.
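* (An invented example: for the target "libfoo.a(bar.o)", .MEMBER is now "bar.o", .ARCHIVE is "libfoo.a", and .TARGET is likewise "libfoo.a" while the parentheses are still nulled out -- matching what other makes provide for archive members -- so a source such as "${.MEMBER:.o=.c}" can be expanded sensibly in the loop below.)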
*/ for (ln = Lst_First(gn->children); ln != NULL; ln = nln) { nln = Lst_Succ(ln); SuffExpandChildren(ln, gn); } if (ms != NULL) { /* * Member has a known suffix, so look for a transformation rule from * it to a possible suffix of the archive. Rather than searching * through the entire list, we just look at suffixes to which the * member's suffix may be transformed... */ SuffixCmpData sd; /* Search string data */ /* * Use first matching suffix... */ sd.len = eoarch - gn->name; sd.ename = eoarch; ln = Lst_Find(ms->parents, &sd, SuffSuffIsSuffixP); if (ln != NULL) { /* * Got one -- apply it */ if (!SuffApplyTransform(gn, mem, (Suff *)Lst_Datum(ln), ms) && DEBUG(SUFF)) { fprintf(debug_file, "\tNo transformation from %s -> %s\n", ms->name, ((Suff *)Lst_Datum(ln))->name); } } } /* * Replace the opening and closing parens now we've no need of the separate * pieces. */ *eoarch = '('; *eoname = ')'; /* * Pretend gn appeared to the left of a dependency operator so * the user needn't provide a transformation from the member to the * archive. */ if (OP_NOP(gn->type)) { gn->type |= OP_DEPENDS; } /* * Flag the member as such so we remember to look in the archive for * its modification time. The OP_JOIN | OP_MADE is needed because this * target should never get made. */ mem->type |= OP_MEMBER | OP_JOIN | OP_MADE; } /*- *----------------------------------------------------------------------- * SuffFindNormalDeps -- * Locate implicit dependencies for regular targets. * * Input: * gn Node for which to find sources * * Results: * None. * * Side Effects: * Same as Suff_FindDeps... * *----------------------------------------------------------------------- */ static void SuffFindNormalDeps(GNode *gn, Lst slst) { char *eoname; /* End of name */ char *sopref; /* Start of prefix */ LstNode ln, nln; /* Next suffix node to check */ Lst srcs; /* List of sources at which to look */ Lst targs; /* List of targets to which things can be * transformed. They all have the same file, * but different suff and pref fields */ Src *bottom; /* Start of found transformation path */ Src *src; /* General Src pointer */ char *pref; /* Prefix to use */ Src *targ; /* General Src target pointer */ SuffixCmpData sd; /* Search string data */ sd.len = strlen(gn->name); sd.ename = eoname = gn->name + sd.len; sopref = gn->name; /* * Begin at the beginning... */ ln = Lst_First(sufflist); srcs = Lst_Init(FALSE); targs = Lst_Init(FALSE); /* * We're caught in a catch-22 here. On the one hand, we want to use any * transformation implied by the target's sources, but we can't examine * the sources until we've expanded any variables/wildcards they may hold, * and we can't do that until we've set up the target's local variables * and we can't do that until we know what the proper suffix for the * target is (in case there are two suffixes one of which is a suffix of * the other) and we can't know that until we've found its implied * source, which we may not want to use if there's an existing source * that implies a different transformation. * * In an attempt to get around this, which may not work all the time, * but should work most of the time, we look for implied sources first, * checking transformations to all possible suffixes of the target, * use what we find to set the target's local variables, expand the * children, then look for any overriding transformations they imply. * Should we find one, we discard the one we found before. 
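* (A hypothetical walk-through: for a target "foo.o" with ".o" and ".c" known and a ".c.o" rule defined, the search below first proposes "foo.c" as the implied source; if expanding the children later reveals, say, a "foo.y" child with a ".y.o" rule, that transformation overrides the first and the "foo.c" path is discarded.)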
*/ bottom = NULL; targ = NULL; if (!(gn->type & OP_PHONY)) { while (ln != NULL) { /* * Look for next possible suffix... */ ln = Lst_FindFrom(sufflist, ln, &sd, SuffSuffIsSuffixP); if (ln != NULL) { int prefLen; /* Length of the prefix */ /* * Allocate a Src structure to which things can be transformed */ targ = bmake_malloc(sizeof(Src)); targ->file = bmake_strdup(gn->name); targ->suff = (Suff *)Lst_Datum(ln); targ->suff->refCount++; targ->node = gn; targ->parent = NULL; targ->children = 0; #ifdef DEBUG_SRC targ->cp = Lst_Init(FALSE); #endif /* * Allocate room for the prefix, whose end is found by * subtracting the length of the suffix from * the end of the name. */ prefLen = (eoname - targ->suff->nameLen) - sopref; targ->pref = bmake_malloc(prefLen + 1); memcpy(targ->pref, sopref, prefLen); targ->pref[prefLen] = '\0'; /* * Add nodes from which the target can be made */ SuffAddLevel(srcs, targ); /* * Record the target so we can nuke it */ (void)Lst_AtEnd(targs, targ); /* * Search from this suffix's successor... */ ln = Lst_Succ(ln); } } /* * Handle target of unknown suffix... */ if (Lst_IsEmpty(targs) && suffNull != NULL) { if (DEBUG(SUFF)) { fprintf(debug_file, "\tNo known suffix on %s. Using .NULL suffix\n", gn->name); } targ = bmake_malloc(sizeof(Src)); targ->file = bmake_strdup(gn->name); targ->suff = suffNull; targ->suff->refCount++; targ->node = gn; targ->parent = NULL; targ->children = 0; targ->pref = bmake_strdup(sopref); #ifdef DEBUG_SRC targ->cp = Lst_Init(FALSE); #endif /* * Only use the default suffix rules if we don't have commands * defined for this gnode; traditional make programs used to * not define suffix rules if the gnode had children but we * don't do this anymore. */ if (Lst_IsEmpty(gn->commands)) SuffAddLevel(srcs, targ); else { if (DEBUG(SUFF)) fprintf(debug_file, "not "); } if (DEBUG(SUFF)) fprintf(debug_file, "adding suffix rules\n"); (void)Lst_AtEnd(targs, targ); } /* * Using the list of possible sources built up from the target * suffix(es), try and find an existing file/target that matches. */ bottom = SuffFindThem(srcs, slst); if (bottom == NULL) { /* * No known transformations -- use the first suffix found * for setting the local variables. */ if (!Lst_IsEmpty(targs)) { targ = (Src *)Lst_Datum(Lst_First(targs)); } else { targ = NULL; } } else { /* * Work up the transformation path to find the suffix of the * target to which the transformation was made. */ for (targ = bottom; targ->parent != NULL; targ = targ->parent) continue; } } Var_Set(TARGET, gn->path ? gn->path : gn->name, gn, 0); pref = (targ != NULL) ? targ->pref : gn->name; Var_Set(PREFIX, pref, gn, 0); /* * Now we've got the important local variables set, expand any sources * that still contain variables or wildcards in their names. */ for (ln = Lst_First(gn->children); ln != NULL; ln = nln) { nln = Lst_Succ(ln); SuffExpandChildren(ln, gn); } if (targ == NULL) { if (DEBUG(SUFF)) { fprintf(debug_file, "\tNo valid suffix on %s\n", gn->name); } sfnd_abort: /* * Deal with finding the thing on the default search path. We * always do that, not only if the node is only a source (not * on the lhs of a dependency operator or [XXX] it has neither * children or commands) as the old pmake did. */ if ((gn->type & (OP_PHONY|OP_NOPATH)) == 0) { free(gn->path); gn->path = Dir_FindFile(gn->name, (targ == NULL ? 
dirSearchPath : targ->suff->searchPath)); if (gn->path != NULL) { char *ptr; Var_Set(TARGET, gn->path, gn, 0); if (targ != NULL) { /* * Suffix known for the thing -- trim the suffix off * the path to form the proper .PREFIX variable. */ int savep = strlen(gn->path) - targ->suff->nameLen; char savec; if (gn->suffix) gn->suffix->refCount--; gn->suffix = targ->suff; gn->suffix->refCount++; savec = gn->path[savep]; gn->path[savep] = '\0'; if ((ptr = strrchr(gn->path, '/')) != NULL) ptr++; else ptr = gn->path; Var_Set(PREFIX, ptr, gn, 0); gn->path[savep] = savec; } else { /* * The .PREFIX gets the full path if the target has * no known suffix. */ if (gn->suffix) gn->suffix->refCount--; gn->suffix = NULL; if ((ptr = strrchr(gn->path, '/')) != NULL) ptr++; else ptr = gn->path; Var_Set(PREFIX, ptr, gn, 0); } } } goto sfnd_return; } /* * If the suffix indicates that the target is a library, mark that in * the node's type field. */ if (targ->suff->flags & SUFF_LIBRARY) { gn->type |= OP_LIB; } /* * Check for overriding transformation rule implied by sources */ if (!Lst_IsEmpty(gn->children)) { src = SuffFindCmds(targ, slst); if (src != NULL) { /* * Free up all the Src structures in the transformation path * up to, but not including, the parent node. */ while (bottom && bottom->parent != NULL) { if (Lst_Member(slst, bottom) == NULL) { Lst_AtEnd(slst, bottom); } bottom = bottom->parent; } bottom = src; } } if (bottom == NULL) { /* * No idea from where it can come -- return now. */ goto sfnd_abort; } /* * We now have a list of Src structures headed by 'bottom' and linked via * their 'parent' pointers. What we do next is create links between * source and target nodes (which may or may not have been created) * and set the necessary local variables in each target. The * commands for each target are set from the commands of the * transformation rule used to get from the src suffix to the targ * suffix. Note that this causes the commands list of the original * node, gn, to be replaced by the commands of the final * transformation rule. Also, the unmade field of gn is incremented. * Etc. */ if (bottom->node == NULL) { bottom->node = Targ_FindNode(bottom->file, TARG_CREATE); } for (src = bottom; src->parent != NULL; src = src->parent) { targ = src->parent; if (src->node->suffix) src->node->suffix->refCount--; src->node->suffix = src->suff; src->node->suffix->refCount++; if (targ->node == NULL) { targ->node = Targ_FindNode(targ->file, TARG_CREATE); } SuffApplyTransform(targ->node, src->node, targ->suff, src->suff); if (targ->node != gn) { /* * Finish off the dependency-search process for any nodes * between bottom and gn (no point in questing around the * filesystem for their implicit source when it's already * known). Note that the node can't have any sources that * need expanding, since SuffFindThem will stop on an existing * node, so all we need to do is set the standard and System V * variables. */ targ->node->type |= OP_DEPS_FOUND; Var_Set(PREFIX, targ->pref, targ->node, 0); Var_Set(TARGET, targ->node->name, targ->node, 0); } } if (gn->suffix) gn->suffix->refCount--; gn->suffix = src->suff; gn->suffix->refCount++; /* * Nuke the transformation path and the Src structures left over in the * two lists. 
*/ sfnd_return: if (bottom) if (Lst_Member(slst, bottom) == NULL) Lst_AtEnd(slst, bottom); while (SuffRemoveSrc(srcs) || SuffRemoveSrc(targs)) continue; Lst_Concat(slst, srcs, LST_CONCLINK); Lst_Concat(slst, targs, LST_CONCLINK); } /*- *----------------------------------------------------------------------- * Suff_FindDeps -- * Find implicit sources for the target described by the graph node * gn * * Results: * Nothing. * * Side Effects: * Nodes are added to the graph below the passed-in node. The nodes * are marked to have their IMPSRC variable filled in. The * PREFIX variable is set for the given node and all its * implied children. * * Notes: * The path found by this target is the shortest path in the * transformation graph, which may pass through non-existent targets, * to an existing target. The search continues on all paths from the * root suffix until a file is found. I.e. if there's a path * .o -> .c -> .l -> .l,v from the root and the .l,v file exists but * the .c and .l files don't, the search will branch out in * all directions from .o and again from all the nodes on the * next level until the .l,v node is encountered. * *----------------------------------------------------------------------- */ void Suff_FindDeps(GNode *gn) { SuffFindDeps(gn, srclist); while (SuffRemoveSrc(srclist)) continue; } /* * Input: * gn node we're dealing with * */ static void SuffFindDeps(GNode *gn, Lst slst) { if (gn->type & OP_DEPS_FOUND) { /* * If dependencies already found, no need to do it again... */ return; } else { gn->type |= OP_DEPS_FOUND; } /* * Make sure we have these set, may get revised below. */ Var_Set(TARGET, gn->path ? gn->path : gn->name, gn, 0); Var_Set(PREFIX, gn->name, gn, 0); if (DEBUG(SUFF)) { fprintf(debug_file, "SuffFindDeps (%s)\n", gn->name); } if (gn->type & OP_ARCHV) { SuffFindArchiveDeps(gn, slst); } else if (gn->type & OP_LIB) { /* * If the node is a library, it is the arch module's job to find it * and set the TARGET variable accordingly. We merely provide the * search path, assuming all libraries end in ".a" (if the suffix * hasn't been defined, there's nothing we can do for it, so we just * set the TARGET variable to the node's name in order to give it a * value). */ LstNode ln; Suff *s; ln = Lst_Find(sufflist, LIBSUFF, SuffSuffHasNameP); if (gn->suffix) gn->suffix->refCount--; if (ln != NULL) { gn->suffix = s = (Suff *)Lst_Datum(ln); gn->suffix->refCount++; Arch_FindLib(gn, s->searchPath); } else { gn->suffix = NULL; Var_Set(TARGET, gn->name, gn, 0); } /* * Because a library (-lfoo) target doesn't follow the standard * filesystem conventions, we don't set the regular variables for * the thing. .PREFIX is simply made empty... */ Var_Set(PREFIX, "", gn, 0); } else { SuffFindNormalDeps(gn, slst); } } /*- *----------------------------------------------------------------------- * Suff_SetNull -- * Define which suffix is the null suffix. * * Input: * name Name of null suffix * * Results: * None. * * Side Effects: * 'suffNull' is altered. * * Notes: * Need to handle the changing of the null suffix gracefully so the * old transformation rules don't just go away. 
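* (Background, illustrative only: the null suffix is what backs POSIX single-suffix rules, e.g. a ".c:" rule that builds "foo" directly from "foo.c"; a line such as ".NULL: .out" merely moves that role to another suffix, and ideally rules attached to the old null suffix should survive the switch.)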
* *----------------------------------------------------------------------- */ void Suff_SetNull(char *name) { Suff *s; LstNode ln; ln = Lst_Find(sufflist, name, SuffSuffHasNameP); if (ln != NULL) { s = (Suff *)Lst_Datum(ln); if (suffNull != NULL) { suffNull->flags &= ~SUFF_NULL; } s->flags |= SUFF_NULL; /* * XXX: Here's where the transformation mangling would take place */ suffNull = s; } else { Parse_Error(PARSE_WARNING, "Desired null suffix %s not defined.", name); } } /*- *----------------------------------------------------------------------- * Suff_Init -- * Initialize suffixes module * * Results: * None * * Side Effects: * Many *----------------------------------------------------------------------- */ void Suff_Init(void) { #ifdef CLEANUP suffClean = Lst_Init(FALSE); #endif srclist = Lst_Init(FALSE); transforms = Lst_Init(FALSE); /* * Create null suffix for single-suffix rules (POSIX). The thing doesn't * actually go on the suffix list or everyone will think that's its * suffix. */ Suff_ClearSuffixes(); } /*- *---------------------------------------------------------------------- * Suff_End -- * Clean up this module * * Results: * None * * Side Effects: * The memory is free'd. *---------------------------------------------------------------------- */ void Suff_End(void) { #ifdef CLEANUP Lst_Destroy(sufflist, SuffFree); Lst_Destroy(suffClean, SuffFree); if (suffNull) SuffFree(suffNull); Lst_Destroy(srclist, NULL); Lst_Destroy(transforms, NULL); #endif } /********************* DEBUGGING FUNCTIONS **********************/ static int SuffPrintName(void *s, void *dummy) { + (void)dummy; + fprintf(debug_file, "%s ", ((Suff *)s)->name); - return (dummy ? 0 : 0); + return 0; } static int SuffPrintSuff(void *sp, void *dummy) { Suff *s = (Suff *)sp; int flags; int flag; + (void)dummy; + fprintf(debug_file, "# `%s' [%d] ", s->name, s->refCount); flags = s->flags; if (flags) { fputs(" (", debug_file); while (flags) { flag = 1 << (ffs(flags) - 1); flags &= ~flag; switch (flag) { case SUFF_NULL: fprintf(debug_file, "NULL"); break; case SUFF_INCLUDE: fprintf(debug_file, "INCLUDE"); break; case SUFF_LIBRARY: fprintf(debug_file, "LIBRARY"); break; } fputc(flags ? '|' : ')', debug_file); } } fputc('\n', debug_file); fprintf(debug_file, "#\tTo: "); Lst_ForEach(s->parents, SuffPrintName, NULL); fputc('\n', debug_file); fprintf(debug_file, "#\tFrom: "); Lst_ForEach(s->children, SuffPrintName, NULL); fputc('\n', debug_file); fprintf(debug_file, "#\tSearch Path: "); Dir_PrintPath(s->searchPath); fputc('\n', debug_file); - return (dummy ? 0 : 0); + return 0; } static int SuffPrintTrans(void *tp, void *dummy) { GNode *t = (GNode *)tp; + (void)dummy; + fprintf(debug_file, "%-16s: ", t->name); Targ_PrintType(t->type); fputc('\n', debug_file); Lst_ForEach(t->commands, Targ_PrintCmd, NULL); fputc('\n', debug_file); - return(dummy ?
0 : 0); + return 0; } void Suff_PrintAll(void) { fprintf(debug_file, "#*** Suffixes:\n"); Lst_ForEach(sufflist, SuffPrintSuff, NULL); fprintf(debug_file, "#*** Transformations:\n"); Lst_ForEach(transforms, SuffPrintTrans, NULL); } Index: projects/clang390-import/contrib/bmake =================================================================== --- projects/clang390-import/contrib/bmake (revision 305686) +++ projects/clang390-import/contrib/bmake (revision 305687) Property changes on: projects/clang390-import/contrib/bmake ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,2 ## Merged /head/contrib/bmake:r303250-305686 Merged /vendor/NetBSD/bmake/dist:r301637-305632 Index: projects/clang390-import/etc/mtree/BSD.tests.dist =================================================================== --- projects/clang390-import/etc/mtree/BSD.tests.dist (revision 305686) +++ projects/clang390-import/etc/mtree/BSD.tests.dist (revision 305687) @@ -1,680 +1,692 @@ # $FreeBSD$ # # Please see the file src/etc/mtree/README before making changes to this file. # /set type=dir uname=root gname=wheel mode=0755 . bin cat .. date .. dd .. expr .. ls .. mv .. pax .. pkill .. sh builtins .. errors .. execution .. expansion .. parameters .. parser .. set-e .. .. sleep .. test .. .. cddl lib .. sbin .. usr.bin .. usr.sbin dtrace common aggs .. arithmetic .. arrays .. assocs .. begin .. bitfields .. buffering .. builtinvar .. cg .. clauses .. cpc .. decls .. drops .. dtraceUtil .. end .. enum .. error .. exit .. fbtprovider .. funcs .. grammar .. include .. inline .. io .. ip .. java_api .. json .. lexer .. llquantize .. mdb .. mib .. misc .. multiaggs .. offsetof .. operators .. pid .. plockstat .. pointers .. pragma .. predicates .. preprocessor .. print .. printa .. printf .. privs .. probes .. proc .. profile-n .. providers .. raise .. rates .. safety .. scalars .. sched .. scripting .. sdt .. sizeof .. speculation .. stability .. stack .. stackdepth .. stop .. strlen .. strtoll .. struct .. sugar .. syscall .. sysevent .. tick-n .. trace .. tracemem .. translators .. typedef .. types .. uctf .. union .. usdt .. ustack .. vars .. version .. .. .. zfsd .. .. .. etc rc.d .. .. games .. gnu lib .. usr.bin diff .. .. .. lib atf libatf-c detail .. .. libatf-c++ detail .. .. test-programs .. .. libarchive .. libc c063 .. db .. gen execve .. posix_spawn .. .. hash data .. .. iconv .. inet .. locale .. net getaddrinfo data .. .. .. nss .. regex data .. .. resolv .. rpc .. ssp .. setjmp .. stdio .. stdlib .. string .. sys .. time .. tls dso .. .. termios .. ttyio .. .. + libcasper + services + cap_dns + .. + cap_grp + .. + cap_pwd + .. + cap_sysctl + .. + .. + .. libcrypt .. libdevdctl .. libmp .. libnv .. libpam .. libproc .. librt .. libthr dlopen .. .. libutil .. libxo .. msun .. .. libexec atf atf-check .. atf-sh .. .. rtld-elf .. .. sbin dhclient .. devd .. growfs .. ifconfig .. mdconfig .. .. secure lib .. libexec .. usr.bin .. usr.sbin .. .. share examples tests atf .. plain .. .. .. .. sys acl .. aio .. fifo .. file .. geom class concat .. eli .. gate .. gpt .. mirror .. nop .. raid3 .. shsec .. stripe .. uzip etalon .. .. .. .. kern acct .. execve .. pipe .. .. kqueue libkqueue .. .. mac bsdextended .. portacl .. .. mqueue .. netinet .. opencrypto .. pjdfstest chflags .. chmod .. chown .. ftruncate .. granular .. link .. mkdir .. mkfifo .. mknod .. open .. rename .. rmdir .. symlink .. truncate .. unlink .. .. posixshm .. sys .. vfs .. vm .. .. usr.bin apply .. 
basename .. bmake archives fmt_44bsd .. fmt_44bsd_mod .. fmt_oldbsd .. .. basic t0 .. t1 .. t2 .. t3 .. .. execution ellipsis .. empty .. joberr .. plus .. .. shell builtin .. meta .. path .. path_select .. replace .. select .. .. suffixes basic .. src_wild1 .. src_wild2 .. .. syntax directive-t0 .. enl .. funny-targets .. semi .. .. sysmk t0 2 1 .. .. mk .. .. t1 2 1 .. .. mk .. .. t2 2 1 .. .. mk .. .. .. variables modifier_M .. modifier_t .. opt_V .. t0 .. .. .. bsdcat .. calendar .. cmp .. cpio .. col .. comm .. cut .. dirname .. file2c .. grep .. gzip .. ident .. join .. jot .. lastcomm .. limits .. m4 .. mkimg .. ncal .. opensm .. printf .. sdiff .. sed regress.multitest.out .. .. soelim .. tar .. timeout .. tr .. truncate .. units .. uudecode .. uuencode .. xargs .. xinstall .. xo .. yacc yacc .. .. .. usr.sbin chown .. etcupdate .. extattr .. fstyp .. makefs .. newsyslog .. nmtree .. pw .. rpcbind .. sa .. .. .. # vim: set expandtab ts=4 sw=4: Index: projects/clang390-import/lib/libc/aarch64/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/aarch64/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/aarch64/sys/Makefile.inc (revision 305687) @@ -1,23 +1,15 @@ # $FreeBSD$ MIASM:= ${MIASM:Nfreebsd[467]_*} SRCS+= __vdso_gettc.c MDASM= cerror.S \ shmat.S \ sigreturn.S \ syscall.S \ vfork.S # Don't generate default code for these syscalls: -NOASM= break.o \ - exit.o \ - getlogin.o \ - sbrk.o \ - sstk.o \ - vfork.o \ - yield.o - -PSEUDO= _exit.o \ - _getlogin.o +NOASM+= sbrk.o \ + vfork.o Index: projects/clang390-import/lib/libc/amd64/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/amd64/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/amd64/sys/Makefile.inc (revision 305687) @@ -1,13 +1,11 @@ # from: Makefile.inc,v 1.1 1993/09/03 19:04:23 jtc Exp # $FreeBSD$ SRCS+= amd64_get_fsbase.c amd64_get_gsbase.c amd64_set_fsbase.c \ amd64_set_gsbase.c MDASM= vfork.S brk.S cerror.S exect.S getcontext.S \ sbrk.S setlogin.S sigreturn.S # Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o vfork.o yield.o - -PSEUDO= _getlogin.o _exit.o +NOASM+= vfork.o Index: projects/clang390-import/lib/libc/arm/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/arm/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/arm/sys/Makefile.inc (revision 305687) @@ -1,10 +1,8 @@ # $FreeBSD$ SRCS+= __vdso_gettc.c MDASM= Ovfork.S brk.S cerror.S sbrk.S shmat.S sigreturn.S syscall.S # Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o vfork.o yield.o - -PSEUDO= _exit.o _getlogin.o +NOASM+= vfork.o Index: projects/clang390-import/lib/libc/i386/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/i386/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/i386/sys/Makefile.inc (revision 305687) @@ -1,23 +1,20 @@ # from: Makefile.inc,v 1.1 1993/09/03 19:04:23 jtc Exp # $FreeBSD$ .if !defined(COMPAT_32BIT) SRCS+= i386_clr_watch.c i386_set_watch.c i386_vm86.c .endif SRCS+= i386_get_fsbase.c i386_get_gsbase.c i386_get_ioperm.c i386_get_ldt.c \ i386_set_fsbase.c i386_set_gsbase.c i386_set_ioperm.c i386_set_ldt.c MDASM= Ovfork.S brk.S cerror.S exect.S getcontext.S \ sbrk.S setlogin.S 
sigreturn.S syscall.S -# Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o vfork.o yield.o - -PSEUDO= _getlogin.o _exit.o +NOASM+= vfork.o MAN+= i386_get_ioperm.2 i386_get_ldt.2 i386_vm86.2 MAN+= i386_set_watch.3 MLINKS+=i386_get_ioperm.2 i386_set_ioperm.2 MLINKS+=i386_get_ldt.2 i386_set_ldt.2 MLINKS+=i386_set_watch.3 i386_clr_watch.3 Index: projects/clang390-import/lib/libc/mips/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/mips/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/mips/sys/Makefile.inc (revision 305687) @@ -1,11 +1,9 @@ # $FreeBSD$ SRCS+= trivial-vdso_tc.c MDASM= Ovfork.S brk.S cerror.S exect.S \ sbrk.S syscall.S # Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o vfork.o yield.o - -PSEUDO= _exit.o _getlogin.o +NOASM+= vfork.o Index: projects/clang390-import/lib/libc/powerpc/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/powerpc/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/powerpc/sys/Makefile.inc (revision 305687) @@ -1,8 +1,3 @@ # $FreeBSD$ MDASM+= brk.S cerror.S exect.S sbrk.S setlogin.S - -# Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o yield.o - -PSEUDO= _getlogin.o _exit.o Index: projects/clang390-import/lib/libc/powerpc64/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/powerpc64/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/powerpc64/sys/Makefile.inc (revision 305687) @@ -1,8 +1,3 @@ # $FreeBSD$ MDASM+= brk.S cerror.S exect.S sbrk.S setlogin.S - -# Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o yield.o - -PSEUDO= _getlogin.o _exit.o Index: projects/clang390-import/lib/libc/riscv/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/riscv/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/riscv/sys/Makefile.inc (revision 305687) @@ -1,22 +1,13 @@ # $FreeBSD$ SRCS+= trivial-vdso_tc.c #MDASM= ptrace.S MDASM= cerror.S \ shmat.S \ sigreturn.S \ syscall.S \ vfork.S # Don't generate default code for these syscalls: -NOASM= break.o \ - exit.o \ - getlogin.o \ - sbrk.o \ - sstk.o \ - vfork.o \ - yield.o - -PSEUDO= _exit.o \ - _getlogin.o +NOASM+= vfork.o Index: projects/clang390-import/lib/libc/sparc64/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/sparc64/sys/Makefile.inc (revision 305686) +++ projects/clang390-import/lib/libc/sparc64/sys/Makefile.inc (revision 305687) @@ -1,20 +1,15 @@ # $FreeBSD$ SRCS+= __sparc_sigtramp_setup.c \ __sparc_utrap.c \ __sparc_utrap_align.c \ __sparc_utrap_emul.c \ __sparc_utrap_fp_disabled.S \ __sparc_utrap_gen.S \ __sparc_utrap_install.c \ __sparc_utrap_setup.c \ sigcode.S CFLAGS+= -I${LIBC_SRCTOP}/sparc64/fpu MDASM+= brk.S cerror.S exect.S sbrk.S setlogin.S sigaction1.S - -# Don't generate default code for these syscalls: -NOASM= break.o exit.o getlogin.o sstk.o yield.o - -PSEUDO= _getlogin.o _exit.o Index: projects/clang390-import/lib/libc/sys/Makefile.inc =================================================================== --- projects/clang390-import/lib/libc/sys/Makefile.inc (revision 305686) +++ 
projects/clang390-import/lib/libc/sys/Makefile.inc (revision 305687) @@ -1,468 +1,479 @@ # @(#)Makefile.inc 8.3 (Berkeley) 10/24/94 # $FreeBSD$ # sys sources .PATH: ${LIBC_SRCTOP}/${LIBC_ARCH}/sys ${LIBC_SRCTOP}/sys # Include the generated makefile containing the *complete* list # of syscall names in MIASM. .include "${LIBC_SRCTOP}/../../sys/sys/syscall.mk" # Include machine dependent definitions. # # MDASM names override the default syscall names in MIASM. # NOASM will prevent the default syscall code from being generated. +# PSEUDO generates _<sys>() and __sys_<sys>() symbols, but not <sys>(). # +# While historically machine dependent, all architectures have the following +# declarations in common: +NOASM= break.o \ + exit.o \ + getlogin.o \ + sstk.o \ + yield.o +PSEUDO= _exit.o \ + _getlogin.o .sinclude "${LIBC_SRCTOP}/${LIBC_ARCH}/sys/Makefile.inc" SRCS+= clock_gettime.c gettimeofday.c __vdso_gettimeofday.c NOASM+= clock_gettime.o gettimeofday.o PSEUDO+= _clock_gettime.o _gettimeofday.o # Sources common to both syscall interfaces: SRCS+= \ __error.c \ interposing_table.c SRCS+= futimens.c utimensat.c NOASM+= futimens.o utimensat.o PSEUDO+= _futimens.o _utimensat.o SRCS+= pipe.c INTERPOSED = \ accept \ accept4 \ aio_suspend \ close \ connect \ fcntl \ fdatasync \ fsync \ fork \ kevent \ msync \ nanosleep \ open \ openat \ poll \ ppoll \ pselect \ ptrace \ read \ readv \ recvfrom \ recvmsg \ select \ sendmsg \ sendto \ setcontext \ sigprocmask \ sigsuspend \ sigtimedwait \ sigwait \ sigwaitinfo \ swapcontext \ wait4 \ wait6 \ write \ writev .if ${MACHINE_CPUARCH} == "sparc64" SRCS+= sigaction.c NOASM+= sigaction.o .else INTERPOSED+= sigaction .endif SRCS+= ${INTERPOSED:S/$/.c/} NOASM+= ${INTERPOSED:S/$/.o/} PSEUDO+= ${INTERPOSED:C/^.*$/_&.o/} # Add machine dependent asm sources: SRCS+=${MDASM} # Look through the complete list of syscalls (MIASM) for names that are # not defined with machine dependent implementations (MDASM) and are # not declared for no generation of default code (NOASM). Add each # syscall that satisfies these conditions to the ASM list.
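# (A hypothetical illustration: if MIASM lists getpid.o, sbrk.o and vfork.o while an architecture's Makefile.inc sets MDASM= sbrk.S and NOASM+= vfork.o, only getpid.o survives the filter below, and a stock getpid.S stub is then generated from the RSYSCALL() template further down.)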
.for _asm in ${MIASM} .if (${MDASM:R:M${_asm:R}} == "") .if (${NOASM:R:M${_asm:R}} == "") ASM+=$(_asm) .endif .endif .endfor SASM= ${ASM:S/.o/.S/} SPSEUDO= ${PSEUDO:S/.o/.S/} SRCS+= ${SASM} ${SPSEUDO} SYM_MAPS+= ${LIBC_SRCTOP}/sys/Symbol.map # Generated files CLEANFILES+= ${SASM} ${SPSEUDO} .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386" || \ ${MACHINE_CPUARCH} == "powerpc" || ${MACHINE_ARCH:Marmv6*} NOTE_GNU_STACK='\t.section .note.GNU-stack,"",%%progbits\n' .else NOTE_GNU_STACK='' .endif ${SASM}: printf '#include "compat.h"\n' > ${.TARGET} printf '#include "SYS.h"\nRSYSCALL(${.PREFIX})\n' >> ${.TARGET} printf ${NOTE_GNU_STACK} >>${.TARGET} ${SPSEUDO}: printf '#include "compat.h"\n' > ${.TARGET} printf '#include "SYS.h"\nPSEUDO(${.PREFIX:S/_//})\n' \ >> ${.TARGET} printf ${NOTE_GNU_STACK} >>${.TARGET} MAN+= abort2.2 \ accept.2 \ access.2 \ acct.2 \ adjtime.2 \ aio_cancel.2 \ aio_error.2 \ aio_fsync.2 \ aio_mlock.2 \ aio_read.2 \ aio_return.2 \ aio_suspend.2 \ aio_waitcomplete.2 \ aio_write.2 \ bind.2 \ bindat.2 \ brk.2 \ cap_enter.2 \ cap_fcntls_limit.2 \ cap_ioctls_limit.2 \ cap_rights_limit.2 \ chdir.2 \ chflags.2 \ chmod.2 \ chown.2 \ chroot.2 \ clock_gettime.2 \ close.2 \ closefrom.2 \ connect.2 \ connectat.2 \ cpuset.2 \ cpuset_getaffinity.2 \ dup.2 \ execve.2 \ _exit.2 \ extattr_get_file.2 \ fcntl.2 \ ffclock.2 \ fhopen.2 \ flock.2 \ fork.2 \ fsync.2 \ getdirentries.2 \ getdtablesize.2 \ getfh.2 \ getfsstat.2 \ getgid.2 \ getgroups.2 \ getitimer.2 \ getlogin.2 \ getloginclass.2 \ getpeername.2 \ getpgrp.2 \ getpid.2 \ getpriority.2 \ getrlimit.2 \ getrusage.2 \ getsid.2 \ getsockname.2 \ getsockopt.2 \ gettimeofday.2 \ getuid.2 \ intro.2 \ ioctl.2 \ issetugid.2 \ jail.2 \ kenv.2 \ kill.2 \ kldfind.2 \ kldfirstmod.2 \ kldload.2 \ kldnext.2 \ kldstat.2 \ kldsym.2 \ kldunload.2 \ kqueue.2 \ ktrace.2 \ link.2 \ lio_listio.2 \ listen.2 \ lseek.2 \ madvise.2 \ mincore.2 \ minherit.2 \ mkdir.2 \ mkfifo.2 \ mknod.2 \ mlock.2 \ mlockall.2 \ mmap.2 \ modfind.2 \ modnext.2 \ modstat.2 \ mount.2 \ mprotect.2 \ mq_close.2 \ mq_getattr.2 \ mq_notify.2 \ mq_open.2 \ mq_receive.2 \ mq_send.2 \ mq_setattr.2 \ msgctl.2 \ msgget.2 \ msgrcv.2 \ msgsnd.2 \ msync.2 \ munmap.2 \ nanosleep.2 \ nfssvc.2 \ ntp_adjtime.2 \ numa_getaffinity.2 \ open.2 \ pathconf.2 \ pdfork.2 \ pipe.2 \ poll.2 \ posix_fadvise.2 \ posix_fallocate.2 \ posix_openpt.2 \ procctl.2 \ profil.2 \ pselect.2 \ ptrace.2 \ quotactl.2 \ read.2 \ readlink.2 \ reboot.2 \ recv.2 \ rename.2 \ revoke.2 \ rfork.2 \ rmdir.2 \ rtprio.2 .if !defined(NO_P1003_1B) MAN+= sched_get_priority_max.2 \ sched_setparam.2 \ sched_setscheduler.2 \ sched_yield.2 .endif MAN+= sctp_generic_recvmsg.2 \ sctp_generic_sendmsg.2 \ sctp_peeloff.2 \ select.2 \ semctl.2 \ semget.2 \ semop.2 \ send.2 \ setfib.2 \ sendfile.2 \ setgroups.2 \ setpgid.2 \ setregid.2 \ setresuid.2 \ setreuid.2 \ setsid.2 \ setuid.2 \ shmat.2 \ shmctl.2 \ shmget.2 \ shm_open.2 \ shutdown.2 \ sigaction.2 \ sigaltstack.2 \ sigpending.2 \ sigprocmask.2 \ sigqueue.2 \ sigreturn.2 \ sigstack.2 \ sigsuspend.2 \ sigwait.2 \ sigwaitinfo.2 \ socket.2 \ socketpair.2 \ stat.2 \ statfs.2 \ swapon.2 \ symlink.2 \ sync.2 \ sysarch.2 \ syscall.2 \ thr_exit.2 \ thr_kill.2 \ thr_new.2 \ thr_self.2 \ thr_set_name.2 \ timer_create.2 \ timer_delete.2 \ timer_settime.2 \ truncate.2 \ umask.2 \ undelete.2 \ unlink.2 \ utimensat.2 \ utimes.2 \ utrace.2 \ uuidgen.2 \ vfork.2 \ wait.2 \ write.2 \ _umtx_op.2 MLINKS+=accept.2 accept4.2 MLINKS+=access.2 eaccess.2 \ access.2 faccessat.2 MLINKS+=brk.2 
sbrk.2 MLINKS+=cap_enter.2 cap_getmode.2 MLINKS+=cap_fcntls_limit.2 cap_fcntls_get.2 MLINKS+=cap_ioctls_limit.2 cap_ioctls_get.2 MLINKS+=cap_rights_limit.2 cap_rights_get.2 MLINKS+=chdir.2 fchdir.2 MLINKS+=chflags.2 chflagsat.2 \ chflags.2 fchflags.2 \ chflags.2 lchflags.2 MLINKS+=chmod.2 fchmod.2 \ chmod.2 fchmodat.2 \ chmod.2 lchmod.2 MLINKS+=chown.2 fchown.2 \ chown.2 fchownat.2 \ chown.2 lchown.2 MLINKS+=clock_gettime.2 clock_getres.2 \ clock_gettime.2 clock_settime.2 MLINKS+=cpuset.2 cpuset_getid.2 \ cpuset.2 cpuset_setid.2 MLINKS+=cpuset_getaffinity.2 cpuset_setaffinity.2 MLINKS+=dup.2 dup2.2 MLINKS+=execve.2 fexecve.2 MLINKS+=extattr_get_file.2 extattr.2 \ extattr_get_file.2 extattr_delete_fd.2 \ extattr_get_file.2 extattr_delete_file.2 \ extattr_get_file.2 extattr_delete_link.2 \ extattr_get_file.2 extattr_get_fd.2 \ extattr_get_file.2 extattr_get_link.2 \ extattr_get_file.2 extattr_list_fd.2 \ extattr_get_file.2 extattr_list_file.2 \ extattr_get_file.2 extattr_list_link.2 \ extattr_get_file.2 extattr_set_fd.2 \ extattr_get_file.2 extattr_set_file.2 \ extattr_get_file.2 extattr_set_link.2 MLINKS+=ffclock.2 ffclock_getcounter.2 \ ffclock.2 ffclock_getestimate.2 \ ffclock.2 ffclock_setestimate.2 MLINKS+=fhopen.2 fhstat.2 fhopen.2 fhstatfs.2 MLINKS+=fsync.2 fdatasync.2 MLINKS+=getdirentries.2 getdents.2 MLINKS+=getfh.2 lgetfh.2 MLINKS+=getgid.2 getegid.2 MLINKS+=getitimer.2 setitimer.2 MLINKS+=getlogin.2 getlogin_r.3 MLINKS+=getlogin.2 setlogin.2 MLINKS+=getloginclass.2 setloginclass.2 MLINKS+=getpgrp.2 getpgid.2 MLINKS+=getpid.2 getppid.2 MLINKS+=getpriority.2 setpriority.2 MLINKS+=getrlimit.2 setrlimit.2 MLINKS+=getsockopt.2 setsockopt.2 MLINKS+=gettimeofday.2 settimeofday.2 MLINKS+=getuid.2 geteuid.2 MLINKS+=intro.2 errno.2 MLINKS+=jail.2 jail_attach.2 \ jail.2 jail_get.2 \ jail.2 jail_remove.2 \ jail.2 jail_set.2 MLINKS+=kldunload.2 kldunloadf.2 MLINKS+=kqueue.2 kevent.2 \ kqueue.2 EV_SET.3 MLINKS+=link.2 linkat.2 MLINKS+=madvise.2 posix_madvise.2 MLINKS+=mkdir.2 mkdirat.2 MLINKS+=mkfifo.2 mkfifoat.2 MLINKS+=mknod.2 mknodat.2 MLINKS+=mlock.2 munlock.2 MLINKS+=mlockall.2 munlockall.2 MLINKS+=modnext.2 modfnext.2 MLINKS+=mount.2 nmount.2 \ mount.2 unmount.2 MLINKS+=mq_receive.2 mq_timedreceive.2 MLINKS+=mq_send.2 mq_timedsend.2 MLINKS+=ntp_adjtime.2 ntp_gettime.2 MLINKS+=numa_getaffinity.2 numa_setaffinity.2 MLINKS+=open.2 openat.2 MLINKS+=pathconf.2 fpathconf.2 MLINKS+=pathconf.2 lpathconf.2 MLINKS+=pdfork.2 pdgetpid.2\ pdfork.2 pdkill.2 \ pdfork.2 pdwait4.2 MLINKS+=pipe.2 pipe2.2 MLINKS+=poll.2 ppoll.2 MLINKS+=read.2 pread.2 \ read.2 preadv.2 \ read.2 readv.2 MLINKS+=readlink.2 readlinkat.2 MLINKS+=recv.2 recvfrom.2 \ recv.2 recvmsg.2 MLINKS+=rename.2 renameat.2 MLINKS+=rtprio.2 rtprio_thread.2 .if !defined(NO_P1003_1B) MLINKS+=sched_get_priority_max.2 sched_get_priority_min.2 \ sched_get_priority_max.2 sched_rr_get_interval.2 MLINKS+=sched_setparam.2 sched_getparam.2 MLINKS+=sched_setscheduler.2 sched_getscheduler.2 .endif MLINKS+=select.2 FD_CLR.3 \ select.2 FD_ISSET.3 \ select.2 FD_SET.3 \ select.2 FD_ZERO.3 MLINKS+=send.2 sendmsg.2 \ send.2 sendto.2 MLINKS+=setpgid.2 setpgrp.2 MLINKS+=setresuid.2 getresgid.2 \ setresuid.2 getresuid.2 \ setresuid.2 setresgid.2 MLINKS+=setuid.2 setegid.2 \ setuid.2 seteuid.2 \ setuid.2 setgid.2 MLINKS+=shmat.2 shmdt.2 MLINKS+=shm_open.2 shm_unlink.2 MLINKS+=sigwaitinfo.2 sigtimedwait.2 MLINKS+=stat.2 fstat.2 \ stat.2 fstatat.2 \ stat.2 lstat.2 MLINKS+=statfs.2 fstatfs.2 MLINKS+=swapon.2 swapoff.2 MLINKS+=symlink.2 symlinkat.2 MLINKS+=syscall.2 
__syscall.2 MLINKS+=timer_settime.2 timer_getoverrun.2 \ timer_settime.2 timer_gettime.2 MLINKS+=thr_kill.2 thr_kill2.2 MLINKS+=truncate.2 ftruncate.2 MLINKS+=unlink.2 unlinkat.2 MLINKS+=utimensat.2 futimens.2 MLINKS+=utimes.2 futimes.2 \ utimes.2 futimesat.2 \ utimes.2 lutimes.2 MLINKS+=wait.2 wait3.2 \ wait.2 wait4.2 \ wait.2 waitpid.2 \ wait.2 waitid.2 \ wait.2 wait6.2 MLINKS+=write.2 pwrite.2 \ write.2 pwritev.2 \ write.2 writev.2 Index: projects/clang390-import/lib/libc/sys/_exit.2 =================================================================== --- projects/clang390-import/lib/libc/sys/_exit.2 (revision 305686) +++ projects/clang390-import/lib/libc/sys/_exit.2 (revision 305687) @@ -1,123 +1,125 @@ .\" Copyright (c) 1980, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 4. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)_exit.2 8.1 (Berkeley) 6/4/93 .\" $FreeBSD$ .\" -.Dd June 4, 1993 +.Dd September 8, 2016 .Dt EXIT 2 .Os .Sh NAME .Nm _exit .Nd terminate the calling process .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In unistd.h .Ft void .Fn _exit "int status" .Sh DESCRIPTION The .Fn _exit system call terminates a process with the following consequences: .Bl -bullet .It All of the descriptors open in the calling process are closed. This may entail delays, for example, waiting for output to drain; a process in this state may not be killed, as it is already dying. .It If the parent process of the calling process has an outstanding .Xr wait 2 call or catches the .Dv SIGCHLD signal, it is notified of the calling process's termination and the .Fa status is set as defined by .Xr wait 2 . .It The parent process-ID of all of the calling process's existing child -processes are set to 1; the initialization process +processes are set to the process-ID of the calling process's reaper; +the reaper (normally the initialization process) inherits each of these processes (see +.Xr procctl 2 , .Xr init 8 and the .Sx DEFINITIONS section of .Xr intro 2 ) . 
.It If the termination of the process causes any process group to become orphaned (usually because the parents of all members of the group have now exited; see .Dq orphaned process group in .Xr intro 2 ) , and if any member of the orphaned group is stopped, the .Dv SIGHUP signal and the .Dv SIGCONT signal are sent to all members of the newly-orphaned process group. .It If the process is a controlling process (see .Xr intro 2 ) , the .Dv SIGHUP signal is sent to the foreground process group of the controlling terminal, and all current access to the controlling terminal is revoked. .El .Pp Most C programs call the library routine .Xr exit 3 , which flushes buffers, closes streams, unlinks temporary files, etc., before calling .Fn _exit . .Sh RETURN VALUES The .Fn _exit system call can never return. .Sh SEE ALSO .Xr fork 2 , .Xr sigaction 2 , .Xr wait 2 , .Xr exit 3 , .Xr init 8 .Sh STANDARDS The .Fn _exit system call is expected to conform to .St -p1003.1-90 . .Sh HISTORY The .Fn _exit function appeared in .At v7 . Index: projects/clang390-import/lib/libc/sys/intro.2 =================================================================== --- projects/clang390-import/lib/libc/sys/intro.2 (revision 305686) +++ projects/clang390-import/lib/libc/sys/intro.2 (revision 305687) @@ -1,751 +1,754 @@ .\" Copyright (c) 1980, 1983, 1986, 1991, 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 4. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)intro.2 8.5 (Berkeley) 2/27/95 .\" $FreeBSD$ .\" -.Dd May 4, 2013 +.Dd September 8, 2016 .Dt INTRO 2 .Os .Sh NAME .Nm intro .Nd introduction to system calls and error numbers .Sh LIBRARY .Lb libc .Sh SYNOPSIS .In errno.h .Sh DESCRIPTION This section provides an overview of the system calls, their error returns, and other common definitions and concepts. .\".Pp .\".Sy System call restart .\".Pp .\"(more later...) .Sh RETURN VALUES Nearly all of the system calls provide an error number referenced via the external identifier errno. 
This identifier is defined in .In sys/errno.h as .Pp .Dl extern int * __error(); .Dl #define errno (* __error()) .Pp The .Va __error() function returns a pointer to a field in the thread specific structure for threads other than the initial thread. For the initial thread and non-threaded processes, .Va __error() returns a pointer to a global .Va errno variable that is compatible with the previous definition. .Pp When a system call detects an error, it returns an integer value indicating failure (usually -1) and sets the variable .Va errno accordingly. (This allows interpretation of the failure on receiving a -1 and to take action accordingly.) Successful calls never set .Va errno ; once set, it remains until another error occurs. It should only be examined after an error. Note that a number of system calls overload the meanings of these error numbers, and that the meanings must be interpreted according to the type and circumstances of the call. .Pp The following is a complete list of the errors and their names as given in .In sys/errno.h . .Bl -hang -width Ds .It Er 0 Em "Undefined error: 0" . Not used. .It Er 1 EPERM Em "Operation not permitted" . An attempt was made to perform an operation limited to processes with appropriate privileges or to the owner of a file or other resources. .It Er 2 ENOENT Em "No such file or directory" . A component of a specified pathname did not exist, or the pathname was an empty string. .It Er 3 ESRCH Em "No such process" . No process could be found corresponding to that specified by the given process ID. .It Er 4 EINTR Em "Interrupted system call" . An asynchronous signal (such as .Dv SIGINT or .Dv SIGQUIT ) was caught by the process during the execution of an interruptible function. If the signal handler performs a normal return, the interrupted system call will seem to have returned the error condition. .It Er 5 EIO Em "Input/output error" . Some physical input or output error occurred. This error will not be reported until a subsequent operation on the same file descriptor and may be lost (over written) by any subsequent errors. .It Er 6 ENXIO Em "Device not configured" . Input or output on a special file referred to a device that did not exist, or made a request beyond the limits of the device. This error may also occur when, for example, a tape drive is not online or no disk pack is loaded on a drive. .It Er 7 E2BIG Em "Argument list too long" . The number of bytes used for the argument and environment list of the new process exceeded the current limit .Dv ( NCARGS in .In sys/param.h ) . .It Er 8 ENOEXEC Em "Exec format error" . A request was made to execute a file that, although it has the appropriate permissions, was not in the format required for an executable file. .It Er 9 EBADF Em "Bad file descriptor" . A file descriptor argument was out of range, referred to no open file, or a read (write) request was made to a file that was only open for writing (reading). .It Er 10 ECHILD Em "\&No child processes" . A .Xr wait 2 or .Xr waitpid 2 function was executed by a process that had no existing or unwaited-for child processes. .It Er 11 EDEADLK Em "Resource deadlock avoided" . An attempt was made to lock a system resource that would have resulted in a deadlock situation. .It Er 12 ENOMEM Em "Cannot allocate memory" . The new process image required more memory than was allowed by the hardware or by system-imposed memory management constraints. A lack of swap space is normally temporary; however, a lack of core is not. 
Soft limits may be increased to their corresponding hard limits.
.It Er 13 EACCES Em "Permission denied" .
An attempt was made to access a file in a way forbidden by its file access permissions.
.It Er 14 EFAULT Em "Bad address" .
The system detected an invalid address in attempting to use an argument of a call.
.It Er 15 ENOTBLK Em "Block device required" .
A block device operation was attempted on a non-block device or file.
.It Er 16 EBUSY Em "Device busy" .
An attempt was made to use a system resource that was in use at the time, in a manner that would have conflicted with the request.
.It Er 17 EEXIST Em "File exists" .
An existing file was mentioned in an inappropriate context, for instance, as the new link name in a
.Xr link 2
system call.
.It Er 18 EXDEV Em "Cross-device link" .
A hard link to a file on another file system was attempted.
.It Er 19 ENODEV Em "Operation not supported by device" .
An attempt was made to apply an inappropriate function to a device, for example, trying to read a write-only device such as a printer.
.It Er 20 ENOTDIR Em "Not a directory" .
A component of the specified pathname existed, but it was not a directory, when a directory was expected.
.It Er 21 EISDIR Em "Is a directory" .
An attempt was made to open a directory with write mode specified.
.It Er 22 EINVAL Em "Invalid argument" .
Some invalid argument was supplied.
(For example, specifying an undefined signal to a
.Xr signal 3
function or a
.Xr kill 2
system call).
.It Er 23 ENFILE Em "Too many open files in system" .
The maximum number of open files allowable on the system has been reached and requests for an open cannot be satisfied until at least one has been closed.
.It Er 24 EMFILE Em "Too many open files" .
The maximum number of file descriptors allowable in the process has been reached and requests for an open cannot be satisfied until at least one has been closed.
The
.Xr getdtablesize 2
system call will obtain the current limit.
.It Er 25 ENOTTY Em "Inappropriate ioctl for device" .
A control function (see
.Xr ioctl 2 )
was attempted for a file or special device for which the operation was inappropriate.
.It Er 26 ETXTBSY Em "Text file busy" .
The new process was a pure procedure (shared text) file which was open for writing by another process, or while the pure procedure file was being executed an
.Xr open 2
call requested write access.
.It Er 27 EFBIG Em "File too large" .
The size of a file exceeded the maximum.
.It Er 28 ENOSPC Em "No space left on device" .
A
.Xr write 2
to an ordinary file, the creation of a directory or symbolic link, or the creation of a directory entry failed because no more disk blocks were available on the file system, or the allocation of an inode for a newly created file failed because no more inodes were available on the file system.
.It Er 29 ESPIPE Em "Illegal seek" .
An
.Xr lseek 2
system call was issued on a socket, pipe or
.Tn FIFO .
.It Er 30 EROFS Em "Read-only file system" .
An attempt was made to modify a file or directory on a file system that was read-only at the time.
.It Er 31 EMLINK Em "Too many links" .
The maximum allowable number of hard links to a single file has been exceeded (limit of 32767 hard links per file).
.It Er 32 EPIPE Em "Broken pipe" .
A write was attempted on a pipe, socket or
.Tn FIFO
for which there is no process to read the data.
.It Er 33 EDOM Em "Numerical argument out of domain" .
A numerical input argument was outside the defined domain of the mathematical function.
.It Er 34 ERANGE Em "Result too large" .
A numerical result of the function was too large to fit in the available space (perhaps exceeded precision). .It Er 35 EAGAIN Em "Resource temporarily unavailable" . This is a temporary condition and later calls to the same routine may complete normally. .It Er 36 EINPROGRESS Em "Operation now in progress" . An operation that takes a long time to complete (such as a .Xr connect 2 ) was attempted on a non-blocking object (see .Xr fcntl 2 ) . .It Er 37 EALREADY Em "Operation already in progress" . An operation was attempted on a non-blocking object that already had an operation in progress. .It Er 38 ENOTSOCK Em "Socket operation on non-socket" . Self-explanatory. .It Er 39 EDESTADDRREQ Em "Destination address required" . A required address was omitted from an operation on a socket. .It Er 40 EMSGSIZE Em "Message too long" . A message sent on a socket was larger than the internal message buffer or some other network limit. .It Er 41 EPROTOTYPE Em "Protocol wrong type for socket" . A protocol was specified that does not support the semantics of the socket type requested. For example, you cannot use the .Tn ARPA Internet .Tn UDP protocol with type .Dv SOCK_STREAM . .It Er 42 ENOPROTOOPT Em "Protocol not available" . A bad option or level was specified in a .Xr getsockopt 2 or .Xr setsockopt 2 call. .It Er 43 EPROTONOSUPPORT Em "Protocol not supported" . The protocol has not been configured into the system or no implementation for it exists. .It Er 44 ESOCKTNOSUPPORT Em "Socket type not supported" . The support for the socket type has not been configured into the system or no implementation for it exists. .It Er 45 EOPNOTSUPP Em "Operation not supported" . The attempted operation is not supported for the type of object referenced. Usually this occurs when a file descriptor refers to a file or socket that cannot support this operation, for example, trying to .Em accept a connection on a datagram socket. .It Er 46 EPFNOSUPPORT Em "Protocol family not supported" . The protocol family has not been configured into the system or no implementation for it exists. .It Er 47 EAFNOSUPPORT Em "Address family not supported by protocol family" . An address incompatible with the requested protocol was used. For example, you should not necessarily expect to be able to use .Tn NS addresses with .Tn ARPA Internet protocols. .It Er 48 EADDRINUSE Em "Address already in use" . Only one usage of each address is normally permitted. .It Er 49 EADDRNOTAVAIL Em "Can't assign requested address" . Normally results from an attempt to create a socket with an address not on this machine. .It Er 50 ENETDOWN Em "Network is down" . A socket operation encountered a dead network. .It Er 51 ENETUNREACH Em "Network is unreachable" . A socket operation was attempted to an unreachable network. .It Er 52 ENETRESET Em "Network dropped connection on reset" . The host you were connected to crashed and rebooted. .It Er 53 ECONNABORTED Em "Software caused connection abort" . A connection abort was caused internal to your host machine. .It Er 54 ECONNRESET Em "Connection reset by peer" . A connection was forcibly closed by a peer. This normally results from a loss of the connection on the remote socket due to a timeout or a reboot. .It Er 55 ENOBUFS Em "\&No buffer space available" . An operation on a socket or pipe was not performed because the system lacked sufficient buffer space or because a queue was full. .It Er 56 EISCONN Em "Socket is already connected" . 
A
.Xr connect 2
request was made on an already connected socket; or, a
.Xr sendto 2
or
.Xr sendmsg 2
request on a connected socket specified a destination when already connected.
.It Er 57 ENOTCONN Em "Socket is not connected" .
A request to send or receive data was disallowed because the socket was not connected and (when sending on a datagram socket) no address was supplied.
.It Er 58 ESHUTDOWN Em "Can't send after socket shutdown" .
A request to send data was disallowed because the socket had already been shut down with a previous
.Xr shutdown 2
call.
.It Er 60 ETIMEDOUT Em "Operation timed out" .
A
.Xr connect 2
or
.Xr send 2
request failed because the connected party did not properly respond after a period of time.
(The timeout period is dependent on the communication protocol.)
.It Er 61 ECONNREFUSED Em "Connection refused" .
No connection could be made because the target machine actively refused it.
This usually results from trying to connect to a service that is inactive on the foreign host.
.It Er 62 ELOOP Em "Too many levels of symbolic links" .
A path name lookup involved more than 32
.Pq Dv MAXSYMLINKS
symbolic links.
.It Er 63 ENAMETOOLONG Em "File name too long" .
A component of a path name exceeded
.Brq Dv NAME_MAX
characters, or an entire path name exceeded
.Brq Dv PATH_MAX
characters.
(See also the description of
.Dv _PC_NO_TRUNC
in
.Xr pathconf 2 . )
.It Er 64 EHOSTDOWN Em "Host is down" .
A socket operation failed because the destination host was down.
.It Er 65 EHOSTUNREACH Em "No route to host" .
A socket operation was attempted to an unreachable host.
.It Er 66 ENOTEMPTY Em "Directory not empty" .
A directory with entries other than
.Ql .\&
and
.Ql ..\&
was supplied to a remove directory or rename call.
.It Er 67 EPROCLIM Em "Too many processes" .
.It Er 68 EUSERS Em "Too many users" .
The quota system ran out of table entries.
.It Er 69 EDQUOT Em "Disc quota exceeded" .
A
.Xr write 2
to an ordinary file, the creation of a directory or symbolic link, or the creation of a directory entry failed because the user's quota of disk blocks was exhausted, or the allocation of an inode for a newly created file failed because the user's quota of inodes was exhausted.
.It Er 70 ESTALE Em "Stale NFS file handle" .
An attempt was made to access an open file (on an
.Tn NFS
file system) which is now unavailable as referenced by the file descriptor.
This may indicate the file was deleted on the
.Tn NFS
server or some other catastrophic event occurred.
.It Er 72 EBADRPC Em "RPC struct is bad" .
Exchange of
.Tn RPC
information was unsuccessful.
.It Er 73 ERPCMISMATCH Em "RPC version wrong" .
The version of
.Tn RPC
on the remote peer is not compatible with the local version.
.It Er 74 EPROGUNAVAIL Em "RPC prog. not avail" .
The requested program is not registered on the remote host.
.It Er 75 EPROGMISMATCH Em "Program version wrong" .
The requested version of the program is not available on the remote host
.Pq Tn RPC .
.It Er 76 EPROCUNAVAIL Em "Bad procedure for program" .
An
.Tn RPC
call was attempted for a procedure which does not exist in the remote program.
.It Er 77 ENOLCK Em "No locks available" .
A system-imposed limit on the number of simultaneous file locks was reached.
.It Er 78 ENOSYS Em "Function not implemented" .
Attempted a system call that is not available on this system.
.It Er 79 EFTYPE Em "Inappropriate file type or format" .
The file was the wrong type for the operation, or a data file had the wrong format.
.It Er 80 EAUTH Em "Authentication error" .
Attempted to use an invalid authentication ticket to mount an
.Tn NFS
file system.
.It Er 81 ENEEDAUTH Em "Need authenticator" .
An authentication ticket must be obtained before the given
.Tn NFS
file system may be mounted.
.It Er 82 EIDRM Em "Identifier removed" .
An IPC identifier was removed while the current process was waiting on it.
.It Er 83 ENOMSG Em "No message of desired type" .
An IPC message queue does not contain a message of the desired type, or a message catalog does not contain the requested message.
.It Er 84 EOVERFLOW Em "Value too large to be stored in data type" .
A numerical result of the function was too large to be stored in the caller-provided space.
.It Er 85 ECANCELED Em "Operation canceled" .
The scheduled operation was canceled.
.It Er 86 EILSEQ Em "Illegal byte sequence" .
While decoding a multibyte character, the function encountered an invalid or incomplete sequence of bytes, or the given wide character is invalid.
.It Er 87 ENOATTR Em "Attribute not found" .
The specified extended attribute does not exist.
.It Er 88 EDOOFUS Em "Programming error" .
A function or API is being abused in a way which could only be detected at run-time.
.It Er 89 EBADMSG Em "Bad message" .
A corrupted message was detected.
.It Er 90 EMULTIHOP Em "Multihop attempted" .
This error code is unused, but present for compatibility with other systems.
.It Er 91 ENOLINK Em "Link has been severed" .
This error code is unused, but present for compatibility with other systems.
.It Er 92 EPROTO Em "Protocol error" .
A device or socket encountered an unrecoverable protocol error.
.It Er 93 ENOTCAPABLE Em "Capabilities insufficient" .
An operation on a capability file descriptor requires greater privilege than the capability allows.
.It Er 94 ECAPMODE Em "Not permitted in capability mode" .
The system call or operation is not permitted for capability mode processes.
.It Er 95 ENOTRECOVERABLE Em "State not recoverable" .
The state protected by a robust mutex is not recoverable.
.It Er 96 EOWNERDEAD Em "Previous owner died" .
The owner of a robust mutex terminated while holding the mutex lock.
.El
.Sh DEFINITIONS
.Bl -tag -width Ds
.It Process ID .
Each active process in the system is uniquely identified by a non-negative integer called a process ID.
The range of this ID is from 0 to 99999.
.It Parent process ID
A new process is created by a currently active process (see
.Xr fork 2 ) .
The parent process ID of a process is initially the process ID of its creator.
If the creating process exits,
-the parent process ID of each child is set to the ID of a system process,
+the parent process ID of each child is set to the ID of the calling process's
+reaper (see
+.Xr procctl 2 ) ,
+normally
.Xr init 8 .
.It Process Group
Each active process is a member of a process group that is identified by a non-negative integer called the process group ID.
This is the process ID of the group leader.
This grouping permits the signaling of related processes (see
.Xr termios 4 )
and the job control mechanisms of
.Xr csh 1 .
.It Session
A session is a set of one or more process groups.
A session is created by a successful call to
.Xr setsid 2 ,
which causes the caller to become the only member of the only process group in the new session.
.It Session leader
A process that has created a new session by a successful call to
.Xr setsid 2
is known as a session leader.
Only a session leader may acquire a terminal as its controlling terminal (see
.Xr termios 4 ) .
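.Pp
The reaper facility mentioned above under
.Em "Parent process ID"
is acquired through
.Xr procctl 2 .
A minimal sketch (error handling abbreviated; the process remains the reaper for its descendants until it exits or releases the status):
.Bd -literal -offset indent
#include <sys/procctl.h>
#include <sys/wait.h>
#include <unistd.h>

int
main(void)
{
	/*
	 * Make the current process the reaper for its descendants;
	 * orphaned grandchildren are then reparented to this
	 * process rather than to init(8).
	 */
	if (procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL) == -1)
		return (1);
	/* ... fork children; wait(2) also collects orphans ... */
	return (0);
}
.Ed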
.It Controlling process
A session leader with a controlling terminal is a controlling process.
.It Controlling terminal
A terminal that is associated with a session is known as the controlling terminal for that session and its members.
.It "Terminal Process Group ID"
A terminal may be acquired by a session leader as its controlling terminal.
Once a terminal is associated with a session, any of the process groups within the session may be placed into the foreground by setting the terminal process group ID to the ID of the process group.
This facility is used to arbitrate between multiple jobs contending for the same terminal (see
.Xr csh 1
and
.Xr tty 4 ) .
.It "Orphaned Process Group"
A process group is considered to be
.Em orphaned
if it is not under the control of a job control shell.
More precisely, a process group is orphaned when none of its members has a parent process that is in the same session as the group, but is in a different process group.
Note that when a process exits, the parent process for its children
-is changed to be
+is normally changed to be
.Xr init 8 ,
which is in a separate session.
Not all members of an orphaned process group are necessarily orphaned processes (those whose creating process has exited).
The process group of a session leader is orphaned by definition.
.It "Real User ID and Real Group ID"
Each user on the system is identified by a positive integer termed the real user ID.
.Pp
Each user is also a member of one or more groups.
One of these groups is distinguished from others and used in implementing accounting facilities.
The positive integer corresponding to this distinguished group is termed the real group ID.
.Pp
All processes have a real user ID and real group ID.
These are initialized from the equivalent attributes of the process that created them.
.It "Effective User Id, Effective Group Id, and Group Access List"
Access to system resources is governed by two values: the effective user ID, and the group access list.
The first member of the group access list is also known as the effective group ID.
(In POSIX.1, the group access list is known as the set of supplementary group IDs, and it is unspecified whether the effective group ID is a member of the list.)
.Pp
The effective user ID and effective group ID are initially the process's real user ID and real group ID respectively.
Either may be modified through execution of a set-user-ID or set-group-ID file (possibly by one of its ancestors) (see
.Xr execve 2 ) .
By convention, the effective group ID (the first member of the group access list) is duplicated, so that the execution of a set-group-ID program does not result in the loss of the original (real) group ID.
.Pp
The group access list is a set of group IDs used only in determining resource accessibility.
Access checks are performed as described below in ``File Access Permissions''.
.It "Saved Set User ID and Saved Set Group ID"
When a process executes a new file, the effective user ID is set to the owner of the file if the file is set-user-ID, and the effective group ID (first element of the group access list) is set to the group of the file if the file is set-group-ID.
The effective user ID of the process is then recorded as the saved set-user-ID, and the effective group ID of the process is recorded as the saved set-group-ID.
These values may be used to regain those values as the effective user or group ID after reverting to the real ID (see
.Xr setuid 2 ) .
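.Pp
For example, a set-user-ID program may drop its privileges for most of its run and regain them only when needed (a minimal sketch; error checking omitted):
.Bd -literal -offset indent
#include <unistd.h>

int
main(void)
{
	uid_t euid;

	euid = geteuid();	/* the privileged, saved set-user-ID */
	(void)seteuid(getuid());	/* revert to the real user ID */
	/* ... unprivileged work ... */
	(void)seteuid(euid);	/* regain the saved set-user-ID */
	return (0);
}
.Ed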
.Pp
(In POSIX.1, the saved set-user-ID and saved set-group-ID are optional, and are used in setuid and setgid, but this does not work as desired for the super-user.)
.It Super-user
A process is recognized as a
.Em super-user
process and is granted special privileges if its effective user ID is 0.
.It Descriptor
An integer assigned by the system when a file is referenced by
.Xr open 2
or
.Xr dup 2 ,
or when a pipe or socket is created by
.Xr pipe 2 ,
.Xr socket 2
or
.Xr socketpair 2 ,
which uniquely identifies an access path to that file or socket from a given process or any of its children.
.It File Name
Names consisting of up to
.Brq Dv NAME_MAX
characters may be used to name an ordinary file, special file, or directory.
.Pp
These characters may be arbitrary eight-bit values, excluding
.Dv NUL
.Tn ( ASCII
0) and the
.Ql \&/
character (slash,
.Tn ASCII
47).
.Pp
Note that it is generally unwise to use
.Ql \&* ,
.Ql \&? ,
.Ql \&[
or
.Ql \&]
as part of file names because of the special meaning attached to these characters by the shell.
.It Path Name
A path name is a
.Dv NUL Ns -terminated
character string starting with an optional slash
.Ql \&/ ,
followed by zero or more directory names separated by slashes, optionally followed by a file name.
The total length of a path name must be less than
.Brq Dv PATH_MAX
characters.
(On some systems, this limit may be infinite.)
.Pp
If a path name begins with a slash, the path search begins at the
.Em root
directory.
Otherwise, the search begins from the current working directory.
A slash by itself names the root directory.
An empty pathname refers to the current directory.
.It Directory
A directory is a special type of file that contains entries that are references to other files.
Directory entries are called links.
By convention, a directory contains at least two links,
.Ql .\&
and
.Ql \&.. ,
referred to as
.Em dot
and
.Em dot-dot
respectively.
Dot refers to the directory itself and dot-dot refers to its parent directory.
.It "Root Directory and Current Working Directory"
Each process has associated with it a concept of a root directory and a current working directory for the purpose of resolving path name searches.
A process's root directory need not be the root directory of the root file system.
.It File Access Permissions
Every file in the file system has a set of access permissions.
These permissions are used in determining whether a process may perform a requested operation on the file (such as opening a file for writing).
Access permissions are established at the time a file is created.
They may be changed at some later time through the
.Xr chmod 2
call.
.Pp
File access is broken down according to whether a file may be: read, written, or executed.
Directory files use the execute permission to control whether the directory may be searched.
.Pp
File access permissions are interpreted by the system as they apply to three different classes of users: the owner of the file, those users in the file's group, anyone else.
Every file has an independent set of access permissions for each of these classes.
When an access check is made, the system decides if permission should be granted by checking the access information applicable to the caller.
.Pp
Read, write, and execute/search permissions on a file are granted to a process if:
.Pp
The process's effective user ID is that of the super-user.
(Note: even the super-user cannot execute a non-executable file.)
.Pp The process's effective user ID matches the user ID of the owner of the file and the owner permissions allow the access. .Pp The process's effective user ID does not match the user ID of the owner of the file, and either the process's effective group ID matches the group ID of the file, or the group ID of the file is in the process's group access list, and the group permissions allow the access. .Pp Neither the effective user ID nor effective group ID and group access list of the process match the corresponding user ID and group ID of the file, but the permissions for ``other users'' allow access. .Pp Otherwise, permission is denied. .It Sockets and Address Families A socket is an endpoint for communication between processes. Each socket has queues for sending and receiving data. .Pp Sockets are typed according to their communications properties. These properties include whether messages sent and received at a socket require the name of the partner, whether communication is reliable, the format used in naming message recipients, etc. .Pp Each instance of the system supports some collection of socket types; consult .Xr socket 2 for more information about the types available and their properties. .Pp Each instance of the system supports some number of sets of communications protocols. Each protocol set supports addresses of a certain format. An Address Family is the set of addresses for a specific group of protocols. Each socket has an address chosen from the address family in which the socket was created. .El .Sh SEE ALSO .Xr intro 3 , .Xr perror 3 Index: projects/clang390-import/lib/libcasper/services/cap_dns/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_dns/Makefile (revision 305686) +++ projects/clang390-import/lib/libcasper/services/cap_dns/Makefile (revision 305687) @@ -1,18 +1,24 @@ # $FreeBSD$ +.include + PACKAGE=libcasper LIB= cap_dns SHLIB_MAJOR= 0 SHLIBDIR?= /lib/casper INCSDIR?= ${INCLUDEDIR}/casper SRCS= cap_dns.c INCS= cap_dns.h LIBADD= nv CFLAGS+=-I${.CURDIR} + +.if ${MK_TESTS} != "no" +SUBDIR+= tests +.endif .include Index: projects/clang390-import/lib/libcasper/services/cap_dns/tests/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_dns/tests/Makefile (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_dns/tests/Makefile (revision 305687) @@ -0,0 +1,11 @@ +# $FreeBSD$ + +TAP_TESTS_C= dns_test + +LIBADD+= casper +LIBADD+= cap_dns +LIBADD+= nv + +WARNS?= 3 + +.include Property changes on: projects/clang390-import/lib/libcasper/services/cap_dns/tests/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_dns/tests/dns_test.c =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_dns/tests/dns_test.c (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_dns/tests/dns_test.c (revision 305687) @@ -0,0 +1,702 @@ +/*- + * Copyright (c) 2013 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Pawel Jakub Dawidek under sponsorship from + * the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include + +static int ntest = 1; + +#define CHECK(expr) do { \ + if ((expr)) \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + else \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + ntest++; \ +} while (0) +#define CHECKX(expr) do { \ + if ((expr)) { \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + } else { \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + exit(1); \ + } \ + ntest++; \ +} while (0) + +#define GETHOSTBYNAME 0x01 +#define GETHOSTBYNAME2_AF_INET 0x02 +#define GETHOSTBYNAME2_AF_INET6 0x04 +#define GETHOSTBYADDR_AF_INET 0x08 +#define GETHOSTBYADDR_AF_INET6 0x10 +#define GETADDRINFO_AF_UNSPEC 0x20 +#define GETADDRINFO_AF_INET 0x40 +#define GETADDRINFO_AF_INET6 0x80 + +static bool +addrinfo_compare(struct addrinfo *ai0, struct addrinfo *ai1) +{ + struct addrinfo *at0, *at1; + + if (ai0 == NULL && ai1 == NULL) + return (true); + if (ai0 == NULL || ai1 == NULL) + return (false); + + at0 = ai0; + at1 = ai1; + while (true) { + if ((at0->ai_flags == at1->ai_flags) && + (at0->ai_family == at1->ai_family) && + (at0->ai_socktype == at1->ai_socktype) && + (at0->ai_protocol == at1->ai_protocol) && + (at0->ai_addrlen == at1->ai_addrlen) && + (memcmp(at0->ai_addr, at1->ai_addr, + at0->ai_addrlen) == 0)) { + if (at0->ai_canonname != NULL && + at1->ai_canonname != NULL) { + if (strcmp(at0->ai_canonname, + at1->ai_canonname) != 0) { + return (false); + } + } + + if (at0->ai_canonname == NULL && + at1->ai_canonname != NULL) { + return (false); + } + if (at0->ai_canonname != NULL && + at1->ai_canonname == NULL) { + return (false); + } + + if (at0->ai_next == NULL && at1->ai_next == NULL) + return (true); + if (at0->ai_next == NULL || at1->ai_next == NULL) + return (false); + + at0 = at0->ai_next; + at1 = at1->ai_next; + } else { + return (false); + } + } + + /* NOTREACHED */ + fprintf(stderr, "Dead code reached in addrinfo_compare()\n"); + exit(1); +} + +static bool +hostent_aliases_compare(char **aliases0, char **aliases1) +{ + int i0, i1; + + if (aliases0 == NULL && aliases1 == NULL) + return (true); + 
if (aliases0 == NULL || aliases1 == NULL) + return (false); + + for (i0 = 0; aliases0[i0] != NULL; i0++) { + for (i1 = 0; aliases1[i1] != NULL; i1++) { + if (strcmp(aliases0[i0], aliases1[i1]) == 0) + break; + } + if (aliases1[i1] == NULL) + return (false); + } + + return (true); +} + +static bool +hostent_addr_list_compare(char **addr_list0, char **addr_list1, int length) +{ + int i0, i1; + + if (addr_list0 == NULL && addr_list1 == NULL) + return (true); + if (addr_list0 == NULL || addr_list1 == NULL) + return (false); + + for (i0 = 0; addr_list0[i0] != NULL; i0++) { + for (i1 = 0; addr_list1[i1] != NULL; i1++) { + if (memcmp(addr_list0[i0], addr_list1[i1], length) == 0) + break; + } + if (addr_list1[i1] == NULL) + return (false); + } + + return (true); +} + +static bool +hostent_compare(const struct hostent *hp0, const struct hostent *hp1) +{ + + if (hp0 == NULL && hp1 != NULL) + return (true); + + if (hp0 == NULL || hp1 == NULL) + return (false); + + if (hp0->h_name != NULL || hp1->h_name != NULL) { + if (hp0->h_name == NULL || hp1->h_name == NULL) + return (false); + if (strcmp(hp0->h_name, hp1->h_name) != 0) + return (false); + } + + if (!hostent_aliases_compare(hp0->h_aliases, hp1->h_aliases)) + return (false); + if (!hostent_aliases_compare(hp1->h_aliases, hp0->h_aliases)) + return (false); + + if (hp0->h_addrtype != hp1->h_addrtype) + return (false); + + if (hp0->h_length != hp1->h_length) + return (false); + + if (!hostent_addr_list_compare(hp0->h_addr_list, hp1->h_addr_list, + hp0->h_length)) { + return (false); + } + if (!hostent_addr_list_compare(hp1->h_addr_list, hp0->h_addr_list, + hp0->h_length)) { + return (false); + } + + return (true); +} + +static unsigned int +runtest(cap_channel_t *capdns) +{ + unsigned int result; + struct addrinfo *ais, *aic, hints, *hintsp; + struct hostent *hps, *hpc; + struct in_addr ip4; + struct in6_addr ip6; + + result = 0; + + hps = gethostbyname("example.com"); + if (hps == NULL) + fprintf(stderr, "Unable to resolve %s IPv4.\n", "example.com"); + hpc = cap_gethostbyname(capdns, "example.com"); + if (hostent_compare(hps, hpc)) + result |= GETHOSTBYNAME; + + hps = gethostbyname2("example.com", AF_INET); + if (hps == NULL) + fprintf(stderr, "Unable to resolve %s IPv4.\n", "example.com"); + hpc = cap_gethostbyname2(capdns, "example.com", AF_INET); + if (hostent_compare(hps, hpc)) + result |= GETHOSTBYNAME2_AF_INET; + + hps = gethostbyname2("example.com", AF_INET6); + if (hps == NULL) + fprintf(stderr, "Unable to resolve %s IPv6.\n", "example.com"); + hpc = cap_gethostbyname2(capdns, "example.com", AF_INET6); + if (hostent_compare(hps, hpc)) + result |= GETHOSTBYNAME2_AF_INET6; + + hints.ai_flags = 0; + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = 0; + hints.ai_protocol = 0; + hints.ai_addrlen = 0; + hints.ai_addr = NULL; + hints.ai_canonname = NULL; + hints.ai_next = NULL; + + hintsp = &hints; + + if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { + fprintf(stderr, + "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", + gai_strerror(errno)); + } + if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { + if (addrinfo_compare(ais, aic)) + result |= GETADDRINFO_AF_UNSPEC; + freeaddrinfo(ais); + freeaddrinfo(aic); + } + + hints.ai_family = AF_INET; + if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { + fprintf(stderr, + "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", + gai_strerror(errno)); + } + if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { + if 
(addrinfo_compare(ais, aic)) + result |= GETADDRINFO_AF_INET; + freeaddrinfo(ais); + freeaddrinfo(aic); + } + + hints.ai_family = AF_INET6; + if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { + fprintf(stderr, + "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", + gai_strerror(errno)); + } + if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { + if (addrinfo_compare(ais, aic)) + result |= GETADDRINFO_AF_INET6; + freeaddrinfo(ais); + freeaddrinfo(aic); + } + + /* + * 8.8.178.135 is IPv4 address of freefall.freebsd.org + * as of 27 October 2013. + */ + inet_pton(AF_INET, "8.8.178.135", &ip4); + hps = gethostbyaddr(&ip4, sizeof(ip4), AF_INET); + if (hps == NULL) + fprintf(stderr, "Unable to resolve %s.\n", "8.8.178.135"); + hpc = cap_gethostbyaddr(capdns, &ip4, sizeof(ip4), AF_INET); + if (hostent_compare(hps, hpc)) + result |= GETHOSTBYADDR_AF_INET; + + /* + * 2001:1900:2254:206c::16:87 is IPv6 address of freefall.freebsd.org + * as of 27 October 2013. + */ + inet_pton(AF_INET6, "2001:1900:2254:206c::16:87", &ip6); + hps = gethostbyaddr(&ip6, sizeof(ip6), AF_INET6); + if (hps == NULL) { + fprintf(stderr, "Unable to resolve %s.\n", + "2001:1900:2254:206c::16:87"); + } + hpc = cap_gethostbyaddr(capdns, &ip6, sizeof(ip6), AF_INET6); + if (hostent_compare(hps, hpc)) + result |= GETHOSTBYADDR_AF_INET6; + + return (result); +} + +int +main(void) +{ + cap_channel_t *capcas, *capdns, *origcapdns; + const char *types[2]; + int families[2]; + + printf("1..91\n"); + + capcas = cap_init(); + CHECKX(capcas != NULL); + + origcapdns = capdns = cap_service_open(capcas, "system.dns"); + CHECKX(capdns != NULL); + + cap_close(capcas); + + /* No limits set. */ + + CHECK(runtest(capdns) == + (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6 | + GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_UNSPEC | GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); + + /* + * Allow: + * type: NAME, ADDR + * family: AF_INET, AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + + CHECK(runtest(capdns) == + (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6 | + GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); + + cap_close(capdns); + + /* + * Allow: + * type: NAME + * family: AF_INET, AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + + CHECK(runtest(capdns) == + (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6)); + + cap_close(capdns); + + /* + * Allow: + * type: ADDR + * family: AF_INET, AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + 
families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + + CHECK(runtest(capdns) == + (GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); + + cap_close(capdns); + + /* + * Allow: + * type: NAME, ADDR + * family: AF_INET + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == + (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYADDR_AF_INET | + GETADDRINFO_AF_INET)); + + cap_close(capdns); + + /* + * Allow: + * type: NAME, ADDR + * family: AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == + (GETHOSTBYNAME2_AF_INET6 | GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_INET6)); + + cap_close(capdns); + + /* Below we also test further limiting capability. */ + + /* + * Allow: + * type: NAME + * family: AF_INET + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET)); + + cap_close(capdns); + + /* + * Allow: + * type: NAME + * family: AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == 
GETHOSTBYNAME2_AF_INET6); + + cap_close(capdns); + + /* + * Allow: + * type: ADDR + * family: AF_INET + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET | GETADDRINFO_AF_INET)); + + cap_close(capdns); + + /* + * Allow: + * type: ADDR + * family: AF_INET6 + */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == 0); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == 0); + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + types[1] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + families[1] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_INET6)); + + cap_close(capdns); + + /* Trying to raise the limits. */ + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(cap_dns_type_limit(capdns, NULL, 0) == -1 && + errno == ENOTCAPABLE); + CHECK(cap_dns_family_limit(capdns, NULL, 0) == -1 && + errno == ENOTCAPABLE); + + /* Do the limits still hold?
*/ + CHECK(runtest(capdns) == (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET)); + + cap_close(capdns); + + capdns = cap_clone(origcapdns); + CHECK(capdns != NULL); + + types[0] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 1) == 0); + families[0] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 1) == 0); + + types[0] = "NAME"; + types[1] = "ADDR"; + CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + families[1] = AF_INET6; + CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && + errno == ENOTCAPABLE); + + types[0] = "NAME"; + CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && + errno == ENOTCAPABLE); + families[0] = AF_INET; + CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(cap_dns_type_limit(capdns, NULL, 0) == -1 && + errno == ENOTCAPABLE); + CHECK(cap_dns_family_limit(capdns, NULL, 0) == -1 && + errno == ENOTCAPABLE); + + /* Do the limits still hold? */ + CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET6 | + GETADDRINFO_AF_INET6)); + + cap_close(capdns); + + cap_close(origcapdns); + + exit(0); +} Property changes on: projects/clang390-import/lib/libcasper/services/cap_dns/tests/dns_test.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_grp/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_grp/Makefile (revision 305686) +++ projects/clang390-import/lib/libcasper/services/cap_grp/Makefile (revision 305687) @@ -1,18 +1,24 @@ # $FreeBSD$ +.include + PACKAGE=libcasper LIB= cap_grp SHLIB_MAJOR= 0 SHLIBDIR?= /lib/casper INCSDIR?= ${INCLUDEDIR}/casper SRCS= cap_grp.c INCS= cap_grp.h LIBADD= nv CFLAGS+=-I${.CURDIR} + +.if ${MK_TESTS} != "no" +SUBDIR+= tests +.endif .include Index: projects/clang390-import/lib/libcasper/services/cap_grp/tests/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_grp/tests/Makefile (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_grp/tests/Makefile (revision 305687) @@ -0,0 +1,11 @@ +# $FreeBSD$ + +TAP_TESTS_C= grp_test + +LIBADD+= casper +LIBADD+= cap_grp +LIBADD+= nv + +WARNS?= 3 + +.include Property changes on: projects/clang390-import/lib/libcasper/services/cap_grp/tests/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_grp/tests/grp_test.c =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_grp/tests/grp_test.c (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_grp/tests/grp_test.c (revision 305687) @@ -0,0 +1,1550 @@ +/*- + * Copyright (c) 2013 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Pawel Jakub Dawidek under sponsorship from + * the FreeBSD Foundation. 
+ * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include +__FBSDID("$FreeBSD$"); + +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include + +static int ntest = 1; + +#define CHECK(expr) do { \ + if ((expr)) \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + else \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + ntest++; \ +} while (0) +#define CHECKX(expr) do { \ + if ((expr)) { \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + } else { \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + exit(1); \ + } \ + ntest++; \ +} while (0) + +#define GID_WHEEL 0 +#define GID_OPERATOR 5 + +#define GETGRENT0 0x0001 +#define GETGRENT1 0x0002 +#define GETGRENT2 0x0004 +#define GETGRENT (GETGRENT0 | GETGRENT1 | GETGRENT2) +#define GETGRENT_R0 0x0008 +#define GETGRENT_R1 0x0010 +#define GETGRENT_R2 0x0020 +#define GETGRENT_R (GETGRENT_R0 | GETGRENT_R1 | GETGRENT_R2) +#define GETGRNAM 0x0040 +#define GETGRNAM_R 0x0080 +#define GETGRGID 0x0100 +#define GETGRGID_R 0x0200 +#define SETGRENT 0x0400 + +static bool +group_mem_compare(char **mem0, char **mem1) +{ + int i0, i1; + + if (mem0 == NULL && mem1 == NULL) + return (true); + if (mem0 == NULL || mem1 == NULL) + return (false); + + for (i0 = 0; mem0[i0] != NULL; i0++) { + for (i1 = 0; mem1[i1] != NULL; i1++) { + if (strcmp(mem0[i0], mem1[i1]) == 0) + break; + } + if (mem1[i1] == NULL) + return (false); + } + + return (true); +} + +static bool +group_compare(const struct group *grp0, const struct group *grp1) +{ + + if (grp0 == NULL && grp1 == NULL) + return (true); + if (grp0 == NULL || grp1 == NULL) + return (false); + + if (strcmp(grp0->gr_name, grp1->gr_name) != 0) + return (false); + + if (grp0->gr_passwd != NULL || grp1->gr_passwd != NULL) { + if (grp0->gr_passwd == NULL || grp1->gr_passwd == NULL) + return (false); + if (strcmp(grp0->gr_passwd, grp1->gr_passwd) != 0) + return (false); + } + + if (grp0->gr_gid != grp1->gr_gid) + return (false); + + if (!group_mem_compare(grp0->gr_mem, grp1->gr_mem)) + return (false); + + return (true); +} + +static unsigned int +runtest_cmds(cap_channel_t *capgrp) +{ + char bufs[1024], bufc[1024]; + unsigned int result; + struct group *grps, *grpc; + struct group sts, stc; + + 
result = 0; + + (void)setgrent(); + if (cap_setgrent(capgrp) == 1) + result |= SETGRENT; + + grps = getgrent(); + grpc = cap_getgrent(capgrp); + if (group_compare(grps, grpc)) { + result |= GETGRENT0; + grps = getgrent(); + grpc = cap_getgrent(capgrp); + if (group_compare(grps, grpc)) + result |= GETGRENT1; + } + + getgrent_r(&sts, bufs, sizeof(bufs), &grps); + cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); + if (group_compare(grps, grpc)) { + result |= GETGRENT_R0; + getgrent_r(&sts, bufs, sizeof(bufs), &grps); + cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); + if (group_compare(grps, grpc)) + result |= GETGRENT_R1; + } + + (void)setgrent(); + if (cap_setgrent(capgrp) == 1) + result |= SETGRENT; + + getgrent_r(&sts, bufs, sizeof(bufs), &grps); + cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); + if (group_compare(grps, grpc)) + result |= GETGRENT_R2; + + grps = getgrent(); + grpc = cap_getgrent(capgrp); + if (group_compare(grps, grpc)) + result |= GETGRENT2; + + grps = getgrnam("wheel"); + grpc = cap_getgrnam(capgrp, "wheel"); + if (group_compare(grps, grpc)) { + grps = getgrnam("operator"); + grpc = cap_getgrnam(capgrp, "operator"); + if (group_compare(grps, grpc)) + result |= GETGRNAM; + } + + getgrnam_r("wheel", &sts, bufs, sizeof(bufs), &grps); + cap_getgrnam_r(capgrp, "wheel", &stc, bufc, sizeof(bufc), &grpc); + if (group_compare(grps, grpc)) { + getgrnam_r("operator", &sts, bufs, sizeof(bufs), &grps); + cap_getgrnam_r(capgrp, "operator", &stc, bufc, sizeof(bufc), + &grpc); + if (group_compare(grps, grpc)) + result |= GETGRNAM_R; + } + + grps = getgrgid(GID_WHEEL); + grpc = cap_getgrgid(capgrp, GID_WHEEL); + if (group_compare(grps, grpc)) { + grps = getgrgid(GID_OPERATOR); + grpc = cap_getgrgid(capgrp, GID_OPERATOR); + if (group_compare(grps, grpc)) + result |= GETGRGID; + } + + getgrgid_r(GID_WHEEL, &sts, bufs, sizeof(bufs), &grps); + cap_getgrgid_r(capgrp, GID_WHEEL, &stc, bufc, sizeof(bufc), &grpc); + if (group_compare(grps, grpc)) { + getgrgid_r(GID_OPERATOR, &sts, bufs, sizeof(bufs), &grps); + cap_getgrgid_r(capgrp, GID_OPERATOR, &stc, bufc, sizeof(bufc), + &grpc); + if (group_compare(grps, grpc)) + result |= GETGRGID_R; + } + + return (result); +} + +static void +test_cmds(cap_channel_t *origcapgrp) +{ + cap_channel_t *capgrp; + const char *cmds[7], *fields[4], *names[5]; + gid_t gids[5]; + + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + + names[0] = "wheel"; + names[1] = "daemon"; + names[2] = "kmem"; + names[3] = "sys"; + names[4] = "operator"; + + gids[0] = 0; + gids[1] = 1; + gids[2] = 2; + gids[3] = 3; + gids[4] = 5; + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == 0); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * 
fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == 0); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: setgrent + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "getgrent"; + cmds[1] = "getgrent_r"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "setgrent"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (GETGRENT0 | GETGRENT1 | GETGRENT_R0 | + GETGRENT_R1 | GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: setgrent + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "getgrent"; + cmds[1] = "getgrent_r"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "setgrent"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (GETGRENT0 | GETGRENT1 | GETGRENT_R0 | + GETGRENT_R1 | GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrent + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent_r"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + 
CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrent"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT_R2 | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrent + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent_r"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrent"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT_R2 | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrent_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrent_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT0 | GETGRENT1 | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrnam, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrent_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrnam"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrent_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) 
== 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT0 | GETGRENT1 | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrnam + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrnam"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam_r, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrnam + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam_r"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrnam"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM_R | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, + * getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrnam_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrnam_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, + 
* getgrgid, getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrnam_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrgid"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrnam_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRGID | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrgid + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrgid"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid_r + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrgid + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrgid"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID_R)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: wheel, daemon, kmem, sys, operator + * gids: + * Disallow: + * cmds: getgrgid_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + 
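/* + * The pattern throughout test_cmds(): clone the unrestricted channel, + * allow a six-command subset, then verify that widening the set back + * to all seven commands, or naming the dropped command on its own, + * fails with ENOTCAPABLE before runtest_cmds() probes what still + * works. + */ +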
CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID)); + + cap_close(capgrp); + + /* + * Allow: + * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, + * getgrgid + * fields: gr_name, gr_passwd, gr_gid, gr_mem + * groups: + * names: + * gids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getgrgid_r + * fields: + * groups: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); + cmds[0] = "setgrent"; + cmds[1] = "getgrent"; + cmds[2] = "getgrent_r"; + cmds[3] = "getgrnam"; + cmds[4] = "getgrnam_r"; + cmds[5] = "getgrgid"; + cmds[6] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getgrgid_r"; + CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID)); + + cap_close(capgrp); +} + +#define GR_NAME 0x01 +#define GR_PASSWD 0x02 +#define GR_GID 0x04 +#define GR_MEM 0x08 + +static unsigned int +group_fields(const struct group *grp) +{ + unsigned int result; + + result = 0; + + if (grp->gr_name != NULL && grp->gr_name[0] != '\0') + result |= GR_NAME; + + if (grp->gr_passwd != NULL && grp->gr_passwd[0] != '\0') + result |= GR_PASSWD; + + if (grp->gr_gid != (gid_t)-1) + result |= GR_GID; + + if (grp->gr_mem != NULL && grp->gr_mem[0] != NULL) + result |= GR_MEM; + + return (result); +} + +static bool +runtest_fields(cap_channel_t *capgrp, unsigned int expected) +{ + char buf[1024]; + struct group *grp; + struct group st; + + (void)cap_setgrent(capgrp); + grp = cap_getgrent(capgrp); + if (group_fields(grp) != expected) + return (false); + + (void)cap_setgrent(capgrp); + cap_getgrent_r(capgrp, &st, buf, sizeof(buf), &grp); + if (group_fields(grp) != expected) + return (false); + + grp = cap_getgrnam(capgrp, "wheel"); + if (group_fields(grp) != expected) + return (false); + + cap_getgrnam_r(capgrp, "wheel", &st, buf, sizeof(buf), &grp); + if (group_fields(grp) != expected) + return (false); + + grp = cap_getgrgid(capgrp, GID_WHEEL); + if (group_fields(grp) != expected) + return (false); + + cap_getgrgid_r(capgrp, GID_WHEEL, &st, buf, sizeof(buf), &grp); + if (group_fields(grp) != expected) + return (false); + + return (true); +} + +static void +test_fields(cap_channel_t *origcapgrp) +{ + cap_channel_t *capgrp; + const char *fields[4]; + + /* No limits. 
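With no cap_grp_limit_fields() call on the channel every gr_* member is expected to come back populated from all six lookup functions.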
*/ + + CHECK(runtest_fields(origcapgrp, GR_NAME | GR_PASSWD | GR_GID | GR_MEM)); + + /* + * Allow: + * fields: gr_name, gr_passwd, gr_gid, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_GID | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_passwd, gr_gid, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_passwd"; + fields[1] = "gr_gid"; + fields[2] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_PASSWD | GR_GID | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_gid, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_gid"; + fields[2] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_passwd"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_GID | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_passwd, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_passwd, gr_gid + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_GID)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_passwd + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_gid + */ + capgrp = cap_clone(origcapgrp); + 
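/* + * Field limits may only shrink: with the clone restricted to two + * fields, asking for the full four-field set again, or for an excluded + * field such as gr_mem on its own, must fail with ENOTCAPABLE. + */ +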
CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_GID)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_name, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_name"; + fields[1] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_passwd"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_NAME | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_passwd, gr_gid + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_passwd"; + fields[1] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_PASSWD | GR_GID)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_passwd, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_passwd"; + fields[1] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_gid"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_PASSWD | GR_MEM)); + + cap_close(capgrp); + + /* + * Allow: + * fields: gr_gid, gr_mem + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + fields[0] = "gr_gid"; + fields[1] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); + fields[0] = "gr_name"; + fields[1] = "gr_passwd"; + fields[2] = "gr_gid"; + fields[3] = "gr_mem"; + CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "gr_passwd"; + CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(capgrp, GR_GID | GR_MEM)); + + cap_close(capgrp); +} + +static bool +runtest_groups(cap_channel_t *capgrp, const char **names, const gid_t *gids, + size_t ngroups) +{ + char buf[1024]; + struct group *grp; + struct group st; + unsigned int i, got; + + (void)cap_setgrent(capgrp); + got = 0; + for (;;) { + grp = cap_getgrent(capgrp); + if (grp == NULL) + break; + got++; + for (i = 0; i < ngroups; i++) { + if (strcmp(names[i], grp->gr_name) == 0 && + gids[i] == grp->gr_gid) { + break; + } + } + if (i == ngroups) + return (false); + } + if (got != ngroups) + return (false); + + (void)cap_setgrent(capgrp); + got = 0; + for (;;) { + cap_getgrent_r(capgrp, &st, buf, sizeof(buf), &grp); + if (grp == NULL) + break; + got++; + for (i = 0; i < ngroups; i++) { + if 
(strcmp(names[i], grp->gr_name) == 0 && + gids[i] == grp->gr_gid) { + break; + } + } + if (i == ngroups) + return (false); + } + if (got != ngroups) + return (false); + + for (i = 0; i < ngroups; i++) { + grp = cap_getgrnam(capgrp, names[i]); + if (grp == NULL) + return (false); + } + + for (i = 0; i < ngroups; i++) { + cap_getgrnam_r(capgrp, names[i], &st, buf, sizeof(buf), &grp); + if (grp == NULL) + return (false); + } + + for (i = 0; i < ngroups; i++) { + grp = cap_getgrgid(capgrp, gids[i]); + if (grp == NULL) + return (false); + } + + for (i = 0; i < ngroups; i++) { + cap_getgrgid_r(capgrp, gids[i], &st, buf, sizeof(buf), &grp); + if (grp == NULL) + return (false); + } + + return (true); +} + +static void +test_groups(cap_channel_t *origcapgrp) +{ + cap_channel_t *capgrp; + const char *names[5]; + gid_t gids[5]; + + /* + * Allow: + * groups: + * names: wheel, daemon, kmem, sys, tty + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "wheel"; + names[1] = "daemon"; + names[2] = "kmem"; + names[3] = "sys"; + names[4] = "tty"; + CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); + gids[0] = 0; + gids[1] = 1; + gids[2] = 2; + gids[3] = 3; + gids[4] = 4; + + CHECK(runtest_groups(capgrp, names, gids, 5)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: kmem, sys, tty + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "kmem"; + names[1] = "sys"; + names[2] = "tty"; + CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == 0); + names[3] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 4, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "kmem"; + gids[0] = 2; + gids[1] = 3; + gids[2] = 4; + + CHECK(runtest_groups(capgrp, names, gids, 3)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: wheel, kmem, tty + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "wheel"; + names[1] = "kmem"; + names[2] = "tty"; + CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == 0); + names[3] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 4, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "wheel"; + gids[0] = 0; + gids[1] = 2; + gids[2] = 4; + + CHECK(runtest_groups(capgrp, names, gids, 3)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: + * gids: 2, 3, 4 + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "kmem"; + names[1] = "sys"; + names[2] = "tty"; + gids[0] = 2; + gids[1] = 3; + gids[2] = 4; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == 0); + gids[3] = 0; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 4) == -1 && + errno == ENOTCAPABLE); + gids[0] = 0; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && + errno == ENOTCAPABLE); + gids[0] = 2; + + CHECK(runtest_groups(capgrp, names, gids, 3)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: + * gids: 0, 2, 4 + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "wheel"; + names[1] = "kmem"; + names[2] = "tty"; + gids[0] = 0; + gids[1] = 2; + gids[2] = 4; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == 0); + gids[3] = 1; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 4) == -1 && + errno == ENOTCAPABLE); + gids[0] = 1; + 
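/* gid 1 (daemon) is outside the allowed {0, 2, 4} set, so even a single-entry limit naming it must be rejected. */ +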
CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && + errno == ENOTCAPABLE); + gids[0] = 0; + + CHECK(runtest_groups(capgrp, names, gids, 3)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: kmem + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "kmem"; + CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == 0); + names[1] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 2, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "kmem"; + gids[0] = 2; + + CHECK(runtest_groups(capgrp, names, gids, 1)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: wheel, tty + * gids: + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "wheel"; + names[1] = "tty"; + CHECK(cap_grp_limit_groups(capgrp, names, 2, NULL, 0) == 0); + names[2] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "wheel"; + gids[0] = 0; + gids[1] = 4; + + CHECK(runtest_groups(capgrp, names, gids, 2)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: + * gids: 2 + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "kmem"; + gids[0] = 2; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == 0); + gids[1] = 1; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 2) == -1 && + errno == ENOTCAPABLE); + gids[0] = 1; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && + errno == ENOTCAPABLE); + gids[0] = 2; + + CHECK(runtest_groups(capgrp, names, gids, 1)); + + cap_close(capgrp); + + /* + * Allow: + * groups: + * names: + * gids: 0, 4 + */ + capgrp = cap_clone(origcapgrp); + CHECK(capgrp != NULL); + + names[0] = "wheel"; + names[1] = "tty"; + gids[0] = 0; + gids[1] = 4; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 2) == 0); + gids[2] = 1; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == -1 && + errno == ENOTCAPABLE); + gids[0] = 1; + CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && + errno == ENOTCAPABLE); + gids[0] = 0; + + CHECK(runtest_groups(capgrp, names, gids, 2)); + + cap_close(capgrp); +} + +int +main(void) +{ + cap_channel_t *capcas, *capgrp; + + printf("1..199\n"); + + capcas = cap_init(); + CHECKX(capcas != NULL); + + capgrp = cap_service_open(capcas, "system.grp"); + CHECKX(capgrp != NULL); + + cap_close(capcas); + + /* No limits. 
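A fresh system.grp channel starts with every command allowed; a real consumer would typically narrow it straight after cap_service_open(), along the lines of this sketch (cmds being a hypothetical array of allowed command names): + * + *	cmds[0] = "getgrnam"; + *	cap_grp_limit_cmds(capgrp, cmds, 1); + * + * The baseline below instead verifies the full command set first.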
*/ + + CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | + GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); + + test_cmds(capgrp); + + test_fields(capgrp); + + test_groups(capgrp); + + cap_close(capgrp); + + exit(0); +} Property changes on: projects/clang390-import/lib/libcasper/services/cap_grp/tests/grp_test.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_pwd/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_pwd/Makefile (revision 305686) +++ projects/clang390-import/lib/libcasper/services/cap_pwd/Makefile (revision 305687) @@ -1,18 +1,24 @@ # $FreeBSD$ +.include <src.opts.mk> + PACKAGE=libcasper LIB= cap_pwd SHLIB_MAJOR= 0 SHLIBDIR?= /lib/casper INCSDIR?= ${INCLUDEDIR}/casper SRCS= cap_pwd.c INCS= cap_pwd.h LIBADD= nv CFLAGS+=-I${.CURDIR} + +.if ${MK_TESTS} != "no" +SUBDIR+= tests +.endif .include <bsd.lib.mk> Index: projects/clang390-import/lib/libcasper/services/cap_pwd/tests/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_pwd/tests/Makefile (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_pwd/tests/Makefile (revision 305687) @@ -0,0 +1,11 @@ +# $FreeBSD$ + +TAP_TESTS_C= pwd_test + +LIBADD+= casper +LIBADD+= cap_pwd +LIBADD+= nv + +WARNS?= 3 + +.include <bsd.test.mk> Property changes on: projects/clang390-import/lib/libcasper/services/cap_pwd/tests/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_pwd/tests/pwd_test.c =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_pwd/tests/pwd_test.c (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_pwd/tests/pwd_test.c (revision 305687) @@ -0,0 +1,1536 @@ +/*- + * Copyright (c) 2013 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Pawel Jakub Dawidek under sponsorship from + * the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/capsicum.h> + +#include <errno.h> +#include <pwd.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include <libcasper.h> + +#include <casper/cap_pwd.h> + +static int ntest = 1; + +#define CHECK(expr) do { \ + if ((expr)) \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + else \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__);\ + ntest++; \ +} while (0) +#define CHECKX(expr) do { \ + if ((expr)) { \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + } else { \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__);\ + exit(1); \ + } \ + ntest++; \ +} while (0) + +#define UID_ROOT 0 +#define UID_OPERATOR 2 + +#define GETPWENT0 0x0001 +#define GETPWENT1 0x0002 +#define GETPWENT2 0x0004 +#define GETPWENT (GETPWENT0 | GETPWENT1 | GETPWENT2) +#define GETPWENT_R0 0x0008 +#define GETPWENT_R1 0x0010 +#define GETPWENT_R2 0x0020 +#define GETPWENT_R (GETPWENT_R0 | GETPWENT_R1 | GETPWENT_R2) +#define GETPWNAM 0x0040 +#define GETPWNAM_R 0x0080 +#define GETPWUID 0x0100 +#define GETPWUID_R 0x0200 + +static bool +passwd_compare(const struct passwd *pwd0, const struct passwd *pwd1) +{ + + if (pwd0 == NULL && pwd1 == NULL) + return (true); + if (pwd0 == NULL || pwd1 == NULL) + return (false); + + if (strcmp(pwd0->pw_name, pwd1->pw_name) != 0) + return (false); + + if (pwd0->pw_passwd != NULL || pwd1->pw_passwd != NULL) { + if (pwd0->pw_passwd == NULL || pwd1->pw_passwd == NULL) + return (false); + if (strcmp(pwd0->pw_passwd, pwd1->pw_passwd) != 0) + return (false); + } + + if (pwd0->pw_uid != pwd1->pw_uid) + return (false); + + if (pwd0->pw_gid != pwd1->pw_gid) + return (false); + + if (pwd0->pw_change != pwd1->pw_change) + return (false); + + if (pwd0->pw_class != NULL || pwd1->pw_class != NULL) { + if (pwd0->pw_class == NULL || pwd1->pw_class == NULL) + return (false); + if (strcmp(pwd0->pw_class, pwd1->pw_class) != 0) + return (false); + } + + if (pwd0->pw_gecos != NULL || pwd1->pw_gecos != NULL) { + if (pwd0->pw_gecos == NULL || pwd1->pw_gecos == NULL) + return (false); + if (strcmp(pwd0->pw_gecos, pwd1->pw_gecos) != 0) + return (false); + } + + if (pwd0->pw_dir != NULL || pwd1->pw_dir != NULL) { + if (pwd0->pw_dir == NULL || pwd1->pw_dir == NULL) + return (false); + if (strcmp(pwd0->pw_dir, pwd1->pw_dir) != 0) + return (false); + } + + if (pwd0->pw_shell != NULL || pwd1->pw_shell != NULL) { + if (pwd0->pw_shell == NULL || pwd1->pw_shell == NULL) + return (false); + if (strcmp(pwd0->pw_shell, pwd1->pw_shell) != 0) + return (false); + } + + if (pwd0->pw_expire != pwd1->pw_expire) + return (false); + + if (pwd0->pw_fields != pwd1->pw_fields) + return (false); + + return (true); +} + +static unsigned int +runtest_cmds(cap_channel_t *cappwd) +{ + char bufs[1024], bufc[1024]; + unsigned int result; + struct passwd *pwds, *pwdc; + struct passwd sts, stc; + + result = 0; + + setpwent(); + cap_setpwent(cappwd); + + pwds = getpwent(); + pwdc = cap_getpwent(cappwd); + if (passwd_compare(pwds, pwdc)) { + result |= GETPWENT0; + pwds = getpwent(); + pwdc =
cap_getpwent(cappwd); + if (passwd_compare(pwds, pwdc)) + result |= GETPWENT1; + } + + getpwent_r(&sts, bufs, sizeof(bufs), &pwds); + cap_getpwent_r(cappwd, &stc, bufc, sizeof(bufc), &pwdc); + if (passwd_compare(pwds, pwdc)) { + result |= GETPWENT_R0; + getpwent_r(&sts, bufs, sizeof(bufs), &pwds); + cap_getpwent_r(cappwd, &stc, bufc, sizeof(bufc), &pwdc); + if (passwd_compare(pwds, pwdc)) + result |= GETPWENT_R1; + } + + setpwent(); + cap_setpwent(cappwd); + + getpwent_r(&sts, bufs, sizeof(bufs), &pwds); + cap_getpwent_r(cappwd, &stc, bufc, sizeof(bufc), &pwdc); + if (passwd_compare(pwds, pwdc)) + result |= GETPWENT_R2; + + pwds = getpwent(); + pwdc = cap_getpwent(cappwd); + if (passwd_compare(pwds, pwdc)) + result |= GETPWENT2; + + pwds = getpwnam("root"); + pwdc = cap_getpwnam(cappwd, "root"); + if (passwd_compare(pwds, pwdc)) { + pwds = getpwnam("operator"); + pwdc = cap_getpwnam(cappwd, "operator"); + if (passwd_compare(pwds, pwdc)) + result |= GETPWNAM; + } + + getpwnam_r("root", &sts, bufs, sizeof(bufs), &pwds); + cap_getpwnam_r(cappwd, "root", &stc, bufc, sizeof(bufc), &pwdc); + if (passwd_compare(pwds, pwdc)) { + getpwnam_r("operator", &sts, bufs, sizeof(bufs), &pwds); + cap_getpwnam_r(cappwd, "operator", &stc, bufc, sizeof(bufc), + &pwdc); + if (passwd_compare(pwds, pwdc)) + result |= GETPWNAM_R; + } + + pwds = getpwuid(UID_ROOT); + pwdc = cap_getpwuid(cappwd, UID_ROOT); + if (passwd_compare(pwds, pwdc)) { + pwds = getpwuid(UID_OPERATOR); + pwdc = cap_getpwuid(cappwd, UID_OPERATOR); + if (passwd_compare(pwds, pwdc)) + result |= GETPWUID; + } + + getpwuid_r(UID_ROOT, &sts, bufs, sizeof(bufs), &pwds); + cap_getpwuid_r(cappwd, UID_ROOT, &stc, bufc, sizeof(bufc), &pwdc); + if (passwd_compare(pwds, pwdc)) { + getpwuid_r(UID_OPERATOR, &sts, bufs, sizeof(bufs), &pwds); + cap_getpwuid_r(cappwd, UID_OPERATOR, &stc, bufc, sizeof(bufc), + &pwdc); + if (passwd_compare(pwds, pwdc)) + result |= GETPWUID_R; + } + + return (result); +} + +static void +test_cmds(cap_channel_t *origcappwd) +{ + cap_channel_t *cappwd; + const char *cmds[7], *fields[10], *names[6]; + uid_t uids[5]; + + fields[0] = "pw_name"; + fields[1] = "pw_passwd"; + fields[2] = "pw_uid"; + fields[3] = "pw_gid"; + fields[4] = "pw_change"; + fields[5] = "pw_class"; + fields[6] = "pw_gecos"; + fields[7] = "pw_dir"; + fields[8] = "pw_shell"; + fields[9] = "pw_expire"; + + names[0] = "root"; + names[1] = "toor"; + names[2] = "daemon"; + names[3] = "operator"; + names[4] = "bin"; + names[5] = "kmem"; + + uids[0] = 0; + uids[1] = 1; + uids[2] = 2; + uids[3] = 3; + uids[4] = 5; + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == 0); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = 
"getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == 0); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: setpwent + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cap_setpwent(cappwd); + + cmds[0] = "getpwent"; + cmds[1] = "getpwent_r"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "setpwent"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | GETPWENT_R0 | + GETPWENT_R1 | GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: setpwent + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cap_setpwent(cappwd); + + cmds[0] = "getpwent"; + cmds[1] = "getpwent_r"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "setpwent"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | GETPWENT_R0 | + GETPWENT_R1 | GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: getpwent + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent_r"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwent"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, 
fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT_R2 | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwent + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent_r"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwent"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT_R2 | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: getpwent_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwent_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwnam, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwent_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwnam"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwent_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | + GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + 
* Disallow: + * cmds: getpwnam + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwnam"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam_r, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwnam + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam_r"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwnam"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM_R | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, + * getpwuid, getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: getpwnam_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwnam_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, + * getpwuid, getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwnam_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwuid"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = 
"getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwnam_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWUID | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid_r + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: getpwuid + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwuid"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid_r + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwuid + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwuid"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID_R)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid + * users: + * names: root, toor, daemon, operator, bin, kmem + * uids: + * Disallow: + * cmds: getpwuid_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 
0) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID)); + + cap_close(cappwd); + + /* + * Allow: + * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, + * getpwuid + * users: + * names: + * uids: 0, 1, 2, 3, 5 + * Disallow: + * cmds: getpwuid_r + * users: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); + cmds[0] = "setpwent"; + cmds[1] = "getpwent"; + cmds[2] = "getpwent_r"; + cmds[3] = "getpwnam"; + cmds[4] = "getpwnam_r"; + cmds[5] = "getpwuid"; + cmds[6] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); + cmds[0] = "getpwuid_r"; + CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | + GETPWNAM | GETPWNAM_R | GETPWUID)); + + cap_close(cappwd); +} + +#define PW_NAME _PWF_NAME +#define PW_PASSWD _PWF_PASSWD +#define PW_UID _PWF_UID +#define PW_GID _PWF_GID +#define PW_CHANGE _PWF_CHANGE +#define PW_CLASS _PWF_CLASS +#define PW_GECOS _PWF_GECOS +#define PW_DIR _PWF_DIR +#define PW_SHELL _PWF_SHELL +#define PW_EXPIRE _PWF_EXPIRE + +/* + * Map the fields present in a passwd entry onto the PW_* mask, using + * pw_fields to recognize fields that are present but empty. + */ +static unsigned int +passwd_fields(const struct passwd *pwd) +{ + unsigned int result; + + result = 0; + + if (pwd->pw_name != NULL && pwd->pw_name[0] != '\0') + result |= PW_NAME; + + if (pwd->pw_passwd != NULL && pwd->pw_passwd[0] != '\0') + result |= PW_PASSWD; + else if ((pwd->pw_fields & _PWF_PASSWD) != 0) + result |= PW_PASSWD; + + if (pwd->pw_uid != (uid_t)-1) + result |= PW_UID; + + if (pwd->pw_gid != (gid_t)-1) + result |= PW_GID; + + if (pwd->pw_change != 0 || (pwd->pw_fields & _PWF_CHANGE) != 0) + result |= PW_CHANGE; + + if (pwd->pw_class != NULL && pwd->pw_class[0] != '\0') + result |= PW_CLASS; + else if ((pwd->pw_fields & _PWF_CLASS) != 0) + result |= PW_CLASS; + + if (pwd->pw_gecos != NULL && pwd->pw_gecos[0] != '\0') + result |= PW_GECOS; + else if ((pwd->pw_fields & _PWF_GECOS) != 0) + result |= PW_GECOS; + + if (pwd->pw_dir != NULL && pwd->pw_dir[0] != '\0') + result |= PW_DIR; + else if ((pwd->pw_fields & _PWF_DIR) != 0) + result |= PW_DIR; + + if (pwd->pw_shell != NULL && pwd->pw_shell[0] != '\0') + result |= PW_SHELL; + else if ((pwd->pw_fields & _PWF_SHELL) != 0) + result |= PW_SHELL; + + if (pwd->pw_expire != 0 || (pwd->pw_fields & _PWF_EXPIRE) != 0) + result |= PW_EXPIRE; + + return (result); +} + +static bool +runtest_fields(cap_channel_t *cappwd, unsigned int expected) +{ + char buf[1024]; + struct passwd *pwd; + struct passwd st; + + cap_setpwent(cappwd); + pwd = cap_getpwent(cappwd); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + cap_setpwent(cappwd); + cap_getpwent_r(cappwd, &st, buf, sizeof(buf), &pwd); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + pwd = cap_getpwnam(cappwd, "root"); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + cap_getpwnam_r(cappwd, "root", &st, buf, sizeof(buf), &pwd); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + pwd = cap_getpwuid(cappwd, UID_ROOT); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + cap_getpwuid_r(cappwd, UID_ROOT, &st, buf, sizeof(buf), &pwd); + if ((passwd_fields(pwd) & ~expected) != 0) + return (false); + + return (true); +} + +static void +test_fields(cap_channel_t *origcappwd) +{ + cap_channel_t *cappwd; + const char *fields[10]; + + /* No limits. */ + + CHECK(runtest_fields(origcappwd, PW_NAME | PW_PASSWD | PW_UID | + PW_GID | PW_CHANGE | PW_CLASS | PW_GECOS | PW_DIR | PW_SHELL | + PW_EXPIRE)); + + /* + * Allow: + * fields: pw_name, pw_passwd, pw_uid, pw_gid, pw_change, pw_class, + * pw_gecos, pw_dir, pw_shell, pw_expire + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_name"; + fields[1] = "pw_passwd"; + fields[2] = "pw_uid"; + fields[3] = "pw_gid"; + fields[4] = "pw_change"; + fields[5] = "pw_class"; + fields[6] = "pw_gecos"; + fields[7] = "pw_dir"; + fields[8] = "pw_shell"; + fields[9] = "pw_expire"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); + + CHECK(runtest_fields(cappwd, PW_NAME | PW_PASSWD | PW_UID | + PW_GID | PW_CHANGE | PW_CLASS | PW_GECOS | PW_DIR | PW_SHELL | + PW_EXPIRE)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_name, pw_passwd, pw_uid, pw_gid, pw_change + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_name"; + fields[1] = "pw_passwd"; + fields[2] = "pw_uid"; + fields[3] = "pw_gid"; + fields[4] = "pw_change"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); + fields[5] = "pw_class"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && + errno == ENOTCAPABLE); + fields[0] = "pw_class"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_NAME | PW_PASSWD | PW_UID | + PW_GID | PW_CHANGE)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_class, pw_gecos, pw_dir, pw_shell, pw_expire + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_class"; + fields[1] = "pw_gecos"; + fields[2] = "pw_dir"; + fields[3] = "pw_shell"; + fields[4] = "pw_expire"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); + fields[5] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && +
errno == ENOTCAPABLE); + fields[0] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_CLASS | PW_GECOS | PW_DIR | + PW_SHELL | PW_EXPIRE)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_name, pw_uid, pw_change, pw_gecos, pw_shell + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_name"; + fields[1] = "pw_uid"; + fields[2] = "pw_change"; + fields[3] = "pw_gecos"; + fields[4] = "pw_shell"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); + fields[5] = "pw_class"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && + errno == ENOTCAPABLE); + fields[0] = "pw_class"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_NAME | PW_UID | PW_CHANGE | + PW_GECOS | PW_SHELL)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_passwd, pw_gid, pw_class, pw_dir, pw_expire + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_passwd"; + fields[1] = "pw_gid"; + fields[2] = "pw_class"; + fields[3] = "pw_dir"; + fields[4] = "pw_expire"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); + fields[5] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && + errno == ENOTCAPABLE); + fields[0] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_PASSWD | PW_GID | PW_CLASS | + PW_DIR | PW_EXPIRE)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_uid, pw_class, pw_shell + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_uid"; + fields[1] = "pw_class"; + fields[2] = "pw_shell"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 3) == 0); + fields[3] = "pw_change"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 4) == -1 && + errno == ENOTCAPABLE); + fields[0] = "pw_change"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_UID | PW_CLASS | PW_SHELL)); + + cap_close(cappwd); + + /* + * Allow: + * fields: pw_change + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + fields[0] = "pw_change"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == 0); + fields[1] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 2) == -1 && + errno == ENOTCAPABLE); + fields[0] = "pw_uid"; + CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && + errno == ENOTCAPABLE); + + CHECK(runtest_fields(cappwd, PW_CHANGE)); + + cap_close(cappwd); +} + +static bool +runtest_users(cap_channel_t *cappwd, const char **names, const uid_t *uids, + size_t nusers) +{ + char buf[1024]; + struct passwd *pwd; + struct passwd st; + unsigned int i, got; + + cap_setpwent(cappwd); + got = 0; + for (;;) { + pwd = cap_getpwent(cappwd); + if (pwd == NULL) + break; + got++; + for (i = 0; i < nusers; i++) { + if (strcmp(names[i], pwd->pw_name) == 0 && + uids[i] == pwd->pw_uid) { + break; + } + } + if (i == nusers) + return (false); + } + if (got != nusers) + return (false); + + cap_setpwent(cappwd); + got = 0; + for (;;) { + cap_getpwent_r(cappwd, &st, buf, sizeof(buf), &pwd); + if (pwd == NULL) + break; + got++; + for (i = 0; i < nusers; i++) { + if (strcmp(names[i], pwd->pw_name) == 0 && + uids[i] == pwd->pw_uid) { + break; + } + } + if (i == nusers) + return (false); + } + if (got != nusers) + return (false); + + for (i = 0; i < nusers; i++) { + pwd = cap_getpwnam(cappwd, names[i]); + if (pwd == NULL) + return (false); + } 
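/* + * The same users must also be reachable through the reentrant and + * uid-based lookups below. + */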
+ + for (i = 0; i < nusers; i++) { + cap_getpwnam_r(cappwd, names[i], &st, buf, sizeof(buf), &pwd); + if (pwd == NULL) + return (false); + } + + for (i = 0; i < nusers; i++) { + pwd = cap_getpwuid(cappwd, uids[i]); + if (pwd == NULL) + return (false); + } + + for (i = 0; i < nusers; i++) { + cap_getpwuid_r(cappwd, uids[i], &st, buf, sizeof(buf), &pwd); + if (pwd == NULL) + return (false); + } + + return (true); +} + +static void +test_users(cap_channel_t *origcappwd) +{ + cap_channel_t *cappwd; + const char *names[6]; + uid_t uids[6]; + + /* + * Allow: + * users: + * names: root, toor, daemon, operator, bin, tty + * uids: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "root"; + names[1] = "toor"; + names[2] = "daemon"; + names[3] = "operator"; + names[4] = "bin"; + names[5] = "tty"; + CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); + uids[0] = 0; + uids[1] = 0; + uids[2] = 1; + uids[3] = 2; + uids[4] = 3; + uids[5] = 4; + + CHECK(runtest_users(cappwd, names, uids, 6)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: daemon, operator, bin + * uids: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "operator"; + names[2] = "bin"; + CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == 0); + names[3] = "tty"; + CHECK(cap_pwd_limit_users(cappwd, names, 4, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "tty"; + CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + uids[0] = 1; + uids[1] = 2; + uids[2] = 3; + + CHECK(runtest_users(cappwd, names, uids, 3)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: daemon, bin, tty + * uids: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "bin"; + names[2] = "tty"; + CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == 0); + names[3] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 4, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + uids[0] = 1; + uids[1] = 3; + uids[2] = 4; + + CHECK(runtest_users(cappwd, names, uids, 3)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: + * uids: 1, 2, 3 + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "operator"; + names[2] = "bin"; + uids[0] = 1; + uids[1] = 2; + uids[2] = 3; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == 0); + uids[3] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 4) == -1 && + errno == ENOTCAPABLE); + uids[0] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && + errno == ENOTCAPABLE); + uids[0] = 1; + + CHECK(runtest_users(cappwd, names, uids, 3)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: + * uids: 1, 3, 4 + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "bin"; + names[2] = "tty"; + uids[0] = 1; + uids[1] = 3; + uids[2] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == 0); + uids[3] = 5; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 4) == -1 && + errno == ENOTCAPABLE); + uids[0] = 5; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && + errno == ENOTCAPABLE); + uids[0] = 1; + + CHECK(runtest_users(cappwd, names, uids, 3)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: bin + * uids: + */ + cappwd = 
cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "bin"; + CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == 0); + names[1] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 2, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "bin"; + uids[0] = 3; + + CHECK(runtest_users(cappwd, names, uids, 1)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: daemon, tty + * uids: + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "tty"; + CHECK(cap_pwd_limit_users(cappwd, names, 2, NULL, 0) == 0); + names[2] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "operator"; + CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && + errno == ENOTCAPABLE); + names[0] = "daemon"; + uids[0] = 1; + uids[1] = 4; + + CHECK(runtest_users(cappwd, names, uids, 2)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: + * uids: 3 + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "bin"; + uids[0] = 3; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == 0); + uids[1] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 2) == -1 && + errno == ENOTCAPABLE); + uids[0] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && + errno == ENOTCAPABLE); + uids[0] = 3; + + CHECK(runtest_users(cappwd, names, uids, 1)); + + cap_close(cappwd); + + /* + * Allow: + * users: + * names: + * uids: 1, 4 + */ + cappwd = cap_clone(origcappwd); + CHECK(cappwd != NULL); + + names[0] = "daemon"; + names[1] = "tty"; + uids[0] = 1; + uids[1] = 4; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 2) == 0); + uids[2] = 3; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == -1 && + errno == ENOTCAPABLE); + uids[0] = 3; + CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && + errno == ENOTCAPABLE); + uids[0] = 1; + + CHECK(runtest_users(cappwd, names, uids, 2)); + + cap_close(cappwd); +} + +int +main(void) +{ + cap_channel_t *capcas, *cappwd; + + printf("1..188\n"); + + capcas = cap_init(); + CHECKX(capcas != NULL); + + cappwd = cap_service_open(capcas, "system.pwd"); + CHECKX(cappwd != NULL); + + cap_close(capcas); + + /* No limits. 
*/ + + CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | GETPWNAM | + GETPWNAM_R | GETPWUID | GETPWUID_R)); + + test_cmds(cappwd); + + test_fields(cappwd); + + test_users(cappwd); + + cap_close(cappwd); + + exit(0); +} Property changes on: projects/clang390-import/lib/libcasper/services/cap_pwd/tests/pwd_test.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_sysctl/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_sysctl/Makefile (revision 305686) +++ projects/clang390-import/lib/libcasper/services/cap_sysctl/Makefile (revision 305687) @@ -1,18 +1,24 @@ # $FreeBSD$ +.include <src.opts.mk> + PACKAGE=libcasper LIB= cap_sysctl SHLIB_MAJOR= 0 SHLIBDIR?= /lib/casper INCSDIR?= ${INCLUDEDIR}/casper SRCS= cap_sysctl.c INCS= cap_sysctl.h LIBADD= nv CFLAGS+=-I${.CURDIR} + +.if ${MK_TESTS} != "no" +SUBDIR+= tests +.endif .include <bsd.lib.mk> Index: projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/Makefile =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/Makefile (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/Makefile (revision 305687) @@ -0,0 +1,11 @@ +# $FreeBSD$ + +TAP_TESTS_C= sysctl_test + +LIBADD+= casper +LIBADD+= cap_sysctl +LIBADD+= nv + +WARNS?= 3 + +.include <bsd.test.mk> Property changes on: projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/Makefile ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/sysctl_test.c =================================================================== --- projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/sysctl_test.c (nonexistent) +++ projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/sysctl_test.c (revision 305687) @@ -0,0 +1,1510 @@ +/*- + * Copyright (c) 2013 The FreeBSD Foundation + * All rights reserved. + * + * This software was developed by Pawel Jakub Dawidek under sponsorship from + * the FreeBSD Foundation. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ + +#include <sys/cdefs.h> +__FBSDID("$FreeBSD$"); + +#include <sys/types.h> +#include <sys/capsicum.h> +#include <sys/sysctl.h> +#include <sys/nv.h> + +#include <assert.h> +#include <err.h> +#include <errno.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <unistd.h> + +#include <libcasper.h> + +#include <casper/cap_sysctl.h> + +/* + * We need some sysctls to perform the tests on. + * We remember their values and restore them after the test is done. + */ +#define SYSCTL0_PARENT "kern" +#define SYSCTL0_NAME "kern.sync_on_panic" +#define SYSCTL1_PARENT "debug" +#define SYSCTL1_NAME "debug.minidump" + +static int ntest = 1; + +#define CHECK(expr) do { \ + if ((expr)) \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + else \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + ntest++; \ +} while (0) +#define CHECKX(expr) do { \ + if ((expr)) { \ + printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + } else { \ + printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ + exit(1); \ + } \ + ntest++; \ +} while (0) + +#define SYSCTL0_READ0 0x0001 +#define SYSCTL0_READ1 0x0002 +#define SYSCTL0_READ2 0x0004 +#define SYSCTL0_WRITE 0x0008 +#define SYSCTL0_READ_WRITE 0x0010 +#define SYSCTL1_READ0 0x0020 +#define SYSCTL1_READ1 0x0040 +#define SYSCTL1_READ2 0x0080 +#define SYSCTL1_WRITE 0x0100 +#define SYSCTL1_READ_WRITE 0x0200 + +static unsigned int +runtest(cap_channel_t *capsysctl) +{ + unsigned int result; + int oldvalue, newvalue; + size_t oldsize; + + result = 0; + + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, &oldsize, + NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue)) + result |= SYSCTL0_READ0; + } + + newvalue = 123; + if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, NULL, NULL, &newvalue, + sizeof(newvalue)) == 0) { + result |= SYSCTL0_WRITE; + } + + if ((result & SYSCTL0_WRITE) != 0) { + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, + &oldsize, NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 123) + result |= SYSCTL0_READ1; + } + } + + oldsize = sizeof(oldvalue); + newvalue = 4567; + if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, &oldsize, + &newvalue, sizeof(newvalue)) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 123) + result |= SYSCTL0_READ_WRITE; + } + + if ((result & SYSCTL0_READ_WRITE) != 0) { + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, + &oldsize, NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 4567) + result |= SYSCTL0_READ2; + } + } + + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, &oldsize, + NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue)) + result |= SYSCTL1_READ0; + } + + newvalue = 506; + if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, NULL, NULL, &newvalue, + sizeof(newvalue)) == 0) { + result |= SYSCTL1_WRITE; + } + + if ((result & SYSCTL1_WRITE) != 0) { + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, + &oldsize, NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 506) + result |=
SYSCTL1_READ1; + } + } + + oldsize = sizeof(oldvalue); + newvalue = 7008; + if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, &oldsize, + &newvalue, sizeof(newvalue)) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 506) + result |= SYSCTL1_READ_WRITE; + } + + if ((result & SYSCTL1_READ_WRITE) != 0) { + oldsize = sizeof(oldvalue); + if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, + &oldsize, NULL, 0) == 0) { + if (oldsize == sizeof(oldvalue) && oldvalue == 7008) + result |= SYSCTL1_READ2; + } + } + + return (result); +} + +static void +test_operation(cap_channel_t *origcapsysctl) +{ + cap_channel_t *capsysctl; + nvlist_t *limits; + + /* + * Allow: + * SYSCTL0_PARENT/RDWR/RECURSIVE + * SYSCTL1_PARENT/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, "foo.bar", + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, "foo.bar", + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == SYSCTL0_READ0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/RDWR/RECURSIVE + * SYSCTL1_NAME/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, 
SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/RDWR + * SYSCTL1_PARENT/RDWR + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/RDWR + * SYSCTL1_NAME/RDWR + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + 
* SYSCTL0_PARENT/RDWR + * SYSCTL1_PARENT/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | + SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/RDWR + * SYSCTL1_NAME/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ/RECURSIVE + * SYSCTL1_PARENT/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == 
ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ/RECURSIVE + * SYSCTL1_NAME/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ + * SYSCTL1_PARENT/READ + */ + + capsysctl = 
cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ + * SYSCTL1_NAME/READ + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + 
CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ + * SYSCTL1_PARENT/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_READ0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ + * SYSCTL1_NAME/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/WRITE/RECURSIVE + * SYSCTL1_PARENT/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | 
CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/WRITE/RECURSIVE + * SYSCTL1_NAME/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == 
ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/WRITE + * SYSCTL1_PARENT/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/WRITE + * SYSCTL1_NAME/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + 
CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/WRITE + * SYSCTL1_PARENT/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_WRITE); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/WRITE + * SYSCTL1_NAME/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ/RECURSIVE + * SYSCTL1_PARENT/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, 
SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ/RECURSIVE + * SYSCTL1_NAME/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ + * SYSCTL1_PARENT/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ + * SYSCTL1_NAME/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ + * SYSCTL1_PARENT/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == SYSCTL1_WRITE); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_NAME/READ + * SYSCTL1_NAME/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); + + cap_close(capsysctl); +} + +static void +test_names(cap_channel_t *origcapsysctl) +{ + cap_channel_t *capsysctl; + nvlist_t *limits; + + /* + * Allow: + * SYSCTL0_PARENT/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, 
CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL0_READ0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/READ/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_READ0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL0_WRITE); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/WRITE/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + 
nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_WRITE); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/RDWR/RECURSIVE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + nvlist_add_number(limits, SYSCTL1_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, + CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | + SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/READ + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, 
SYSCTL1_PARENT, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/READ + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_READ0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/WRITE + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == SYSCTL1_WRITE); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL0_PARENT/RDWR + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == 0); + + cap_close(capsysctl); + + /* + * Allow: + * SYSCTL1_NAME/RDWR + */ + + capsysctl = cap_clone(origcapsysctl); + CHECK(capsysctl != NULL); + + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == 0); + limits = nvlist_create(0); + nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + limits = nvlist_create(0); + 
nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); + CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); + + CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | + SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); + + cap_close(capsysctl); +} + +int +main(void) +{ + cap_channel_t *capcas, *capsysctl; + int scvalue0, scvalue1; + size_t scsize; + + printf("1..256\n"); + + scsize = sizeof(scvalue0); + CHECKX(sysctlbyname(SYSCTL0_NAME, &scvalue0, &scsize, NULL, 0) == 0); + CHECKX(scsize == sizeof(scvalue0)); + scsize = sizeof(scvalue1); + CHECKX(sysctlbyname(SYSCTL1_NAME, &scvalue1, &scsize, NULL, 0) == 0); + CHECKX(scsize == sizeof(scvalue1)); + + capcas = cap_init(); + CHECKX(capcas != NULL); + + capsysctl = cap_service_open(capcas, "system.sysctl"); + CHECKX(capsysctl != NULL); + + cap_close(capcas); + + /* No limits set. */ + + CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | + SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | + SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | + SYSCTL1_READ_WRITE)); + + test_operation(capsysctl); + + test_names(capsysctl); + + cap_close(capsysctl); + + CHECK(sysctlbyname(SYSCTL0_NAME, NULL, NULL, &scvalue0, + sizeof(scvalue0)) == 0); + CHECK(sysctlbyname(SYSCTL1_NAME, NULL, NULL, &scvalue1, + sizeof(scvalue1)) == 0); + + exit(0); +} Property changes on: projects/clang390-import/lib/libcasper/services/cap_sysctl/tests/sysctl_test.c ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/share/man/man3/Makefile =================================================================== --- projects/clang390-import/share/man/man3/Makefile (revision 305686) +++ projects/clang390-import/share/man/man3/Makefile (revision 305687) @@ -1,340 +1,341 @@ # @(#)Makefile 8.2 (Berkeley) 12/13/93 # $FreeBSD$ .include PACKAGE=runtime-manuals MAN= assert.3 \ ATOMIC_VAR_INIT.3 \ bitstring.3 \ end.3 \ fpgetround.3 \ intro.3 \ makedev.3 \ offsetof.3 \ ${PTHREAD_MAN} \ queue.3 \ sigevent.3 \ siginfo.3 \ stdarg.3 \ sysexits.3 \ tgmath.3 \ timeradd.3 \ tree.3 MLINKS= ATOMIC_VAR_INIT.3 atomic_compare_exchange_strong.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_strong_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_weak.3 \ ATOMIC_VAR_INIT.3 atomic_compare_exchange_weak_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_exchange.3 \ ATOMIC_VAR_INIT.3 atomic_exchange_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_add.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_add_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_and.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_and_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_or.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_or_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_sub.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_sub_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_xor.3 \ ATOMIC_VAR_INIT.3 atomic_fetch_xor_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_init.3 \ ATOMIC_VAR_INIT.3 atomic_is_lock_free.3 \ ATOMIC_VAR_INIT.3 atomic_load.3 \ ATOMIC_VAR_INIT.3 atomic_load_explicit.3 \ ATOMIC_VAR_INIT.3 atomic_store.3 \ ATOMIC_VAR_INIT.3 atomic_store_explicit.3 MLINKS+= bitstring.3 bit_alloc.3 \ bitstring.3 bit_clear.3 \ bitstring.3 bit_decl.3 \ bitstring.3 bit_ffc.3 \ bitstring.3 bit_ffc_at.3 \ bitstring.3 bit_ffs.3 \ bitstring.3 bit_ffs_at.3 \ bitstring.3 bit_nclear.3 \ bitstring.3 
bit_nset.3 \ bitstring.3 bit_set.3 \ bitstring.3 bitstr_size.3 \ bitstring.3 bit_test.3 MLINKS+= end.3 edata.3 \ end.3 etext.3 MLINKS+= fpgetround.3 fpgetmask.3 \ fpgetround.3 fpgetprec.3 \ fpgetround.3 fpgetsticky.3 \ fpgetround.3 fpresetsticky.3 \ fpgetround.3 fpsetmask.3 \ fpgetround.3 fpsetprec.3 \ fpgetround.3 fpsetround.3 MLINKS+= makedev.3 major.3 \ makedev.3 minor.3 MLINKS+= ${PTHREAD_MLINKS} MLINKS+= queue.3 LIST_CLASS_ENTRY.3 \ queue.3 LIST_CLASS_HEAD.3 \ queue.3 LIST_EMPTY.3 \ queue.3 LIST_ENTRY.3 \ queue.3 LIST_FIRST.3 \ queue.3 LIST_FOREACH.3 \ queue.3 LIST_FOREACH_FROM.3 \ queue.3 LIST_FOREACH_FROM_SAFE.3 \ queue.3 LIST_FOREACH_SAFE.3 \ queue.3 LIST_HEAD.3 \ queue.3 LIST_HEAD_INITIALIZER.3 \ queue.3 LIST_INIT.3 \ queue.3 LIST_INSERT_AFTER.3 \ queue.3 LIST_INSERT_BEFORE.3 \ queue.3 LIST_INSERT_HEAD.3 \ queue.3 LIST_NEXT.3 \ queue.3 LIST_PREV.3 \ queue.3 LIST_REMOVE.3 \ queue.3 LIST_SWAP.3 \ queue.3 SLIST_CLASS_ENTRY.3 \ queue.3 SLIST_CLASS_HEAD.3 \ queue.3 SLIST_EMPTY.3 \ queue.3 SLIST_ENTRY.3 \ queue.3 SLIST_FIRST.3 \ queue.3 SLIST_FOREACH.3 \ queue.3 SLIST_FOREACH_FROM.3 \ queue.3 SLIST_FOREACH_FROM_SAFE.3 \ queue.3 SLIST_FOREACH_SAFE.3 \ queue.3 SLIST_HEAD.3 \ queue.3 SLIST_HEAD_INITIALIZER.3 \ queue.3 SLIST_INIT.3 \ queue.3 SLIST_INSERT_AFTER.3 \ queue.3 SLIST_INSERT_HEAD.3 \ queue.3 SLIST_NEXT.3 \ queue.3 SLIST_REMOVE.3 \ queue.3 SLIST_REMOVE_AFTER.3 \ queue.3 SLIST_REMOVE_HEAD.3 \ + queue.3 SLIST_REMOVE_PREVPTR.3 \ queue.3 SLIST_SWAP.3 \ queue.3 STAILQ_CLASS_ENTRY.3 \ queue.3 STAILQ_CLASS_HEAD.3 \ queue.3 STAILQ_CONCAT.3 \ queue.3 STAILQ_EMPTY.3 \ queue.3 STAILQ_ENTRY.3 \ queue.3 STAILQ_FIRST.3 \ queue.3 STAILQ_FOREACH.3 \ queue.3 STAILQ_FOREACH_FROM.3 \ queue.3 STAILQ_FOREACH_FROM_SAFE.3 \ queue.3 STAILQ_FOREACH_SAFE.3 \ queue.3 STAILQ_HEAD.3 \ queue.3 STAILQ_HEAD_INITIALIZER.3 \ queue.3 STAILQ_INIT.3 \ queue.3 STAILQ_INSERT_AFTER.3 \ queue.3 STAILQ_INSERT_HEAD.3 \ queue.3 STAILQ_INSERT_TAIL.3 \ queue.3 STAILQ_LAST.3 \ queue.3 STAILQ_NEXT.3 \ queue.3 STAILQ_REMOVE.3 \ queue.3 STAILQ_REMOVE_AFTER.3 \ queue.3 STAILQ_REMOVE_HEAD.3 \ queue.3 STAILQ_SWAP.3 \ queue.3 TAILQ_CLASS_ENTRY.3 \ queue.3 TAILQ_CLASS_HEAD.3 \ queue.3 TAILQ_CONCAT.3 \ queue.3 TAILQ_EMPTY.3 \ queue.3 TAILQ_ENTRY.3 \ queue.3 TAILQ_FIRST.3 \ queue.3 TAILQ_FOREACH.3 \ queue.3 TAILQ_FOREACH_FROM.3 \ queue.3 TAILQ_FOREACH_FROM_SAFE.3 \ queue.3 TAILQ_FOREACH_REVERSE.3 \ queue.3 TAILQ_FOREACH_REVERSE_FROM.3 \ queue.3 TAILQ_FOREACH_REVERSE_FROM_SAFE.3 \ queue.3 TAILQ_FOREACH_REVERSE_SAFE.3 \ queue.3 TAILQ_FOREACH_SAFE.3 \ queue.3 TAILQ_HEAD.3 \ queue.3 TAILQ_HEAD_INITIALIZER.3 \ queue.3 TAILQ_INIT.3 \ queue.3 TAILQ_INSERT_AFTER.3 \ queue.3 TAILQ_INSERT_BEFORE.3 \ queue.3 TAILQ_INSERT_HEAD.3 \ queue.3 TAILQ_INSERT_TAIL.3 \ queue.3 TAILQ_LAST.3 \ queue.3 TAILQ_NEXT.3 \ queue.3 TAILQ_PREV.3 \ queue.3 TAILQ_REMOVE.3 \ queue.3 TAILQ_SWAP.3 MLINKS+= stdarg.3 va_arg.3 \ stdarg.3 va_copy.3 \ stdarg.3 va_end.3 \ stdarg.3 varargs.3 \ stdarg.3 va_start.3 MLINKS+= timeradd.3 timerclear.3 \ timeradd.3 timercmp.3 \ timeradd.3 timerisset.3 \ timeradd.3 timersub.3 MLINKS+= tree.3 RB_EMPTY.3 \ tree.3 RB_ENTRY.3 \ tree.3 RB_FIND.3 \ tree.3 RB_FOREACH.3 \ tree.3 RB_FOREACH_REVERSE.3 \ tree.3 RB_GENERATE.3 \ tree.3 RB_GENERATE_STATIC.3 \ tree.3 RB_HEAD.3 \ tree.3 RB_INIT.3 \ tree.3 RB_INITIALIZER.3 \ tree.3 RB_INSERT.3 \ tree.3 RB_LEFT.3 \ tree.3 RB_MAX.3 \ tree.3 RB_MIN.3 \ tree.3 RB_NEXT.3 \ tree.3 RB_NFIND.3 \ tree.3 RB_PARENT.3 \ tree.3 RB_PREV.3 \ tree.3 RB_PROTOTYPE.3 \ tree.3 RB_PROTOTYPE_STATIC.3 \ tree.3 RB_REMOVE.3 \ 
tree.3 RB_RIGHT.3 \ tree.3 RB_ROOT.3 \ tree.3 SPLAY_EMPTY.3 \ tree.3 SPLAY_ENTRY.3 \ tree.3 SPLAY_FIND.3 \ tree.3 SPLAY_FOREACH.3 \ tree.3 SPLAY_GENERATE.3 \ tree.3 SPLAY_HEAD.3 \ tree.3 SPLAY_INIT.3 \ tree.3 SPLAY_INITIALIZER.3 \ tree.3 SPLAY_INSERT.3 \ tree.3 SPLAY_LEFT.3 \ tree.3 SPLAY_MAX.3 \ tree.3 SPLAY_MIN.3 \ tree.3 SPLAY_NEXT.3 \ tree.3 SPLAY_PROTOTYPE.3 \ tree.3 SPLAY_REMOVE.3 \ tree.3 SPLAY_RIGHT.3 \ tree.3 SPLAY_ROOT.3 .if ${MK_LIBTHR} != "no" PTHREAD_MAN= pthread.3 \ pthread_affinity_np.3 \ pthread_atfork.3 \ pthread_attr.3 \ pthread_attr_affinity_np.3 \ pthread_attr_get_np.3 \ pthread_attr_setcreatesuspend_np.3 \ pthread_barrierattr.3 \ pthread_barrier_destroy.3 \ pthread_cancel.3 \ pthread_cleanup_pop.3 \ pthread_cleanup_push.3 \ pthread_condattr.3 \ pthread_cond_broadcast.3 \ pthread_cond_destroy.3 \ pthread_cond_init.3 \ pthread_cond_signal.3 \ pthread_cond_timedwait.3 \ pthread_cond_wait.3 \ pthread_create.3 \ pthread_detach.3 \ pthread_equal.3 \ pthread_exit.3 \ pthread_getconcurrency.3 \ pthread_getcpuclockid.3 \ pthread_getspecific.3 \ pthread_getthreadid_np.3 \ pthread_join.3 \ pthread_key_create.3 \ pthread_key_delete.3 \ pthread_kill.3 \ pthread_main_np.3 \ pthread_multi_np.3 \ pthread_mutexattr.3 \ pthread_mutexattr_getkind_np.3 \ pthread_mutex_consistent.3 \ pthread_mutex_destroy.3 \ pthread_mutex_init.3 \ pthread_mutex_lock.3 \ pthread_mutex_timedlock.3 \ pthread_mutex_trylock.3 \ pthread_mutex_unlock.3 \ pthread_once.3 \ pthread_resume_all_np.3 \ pthread_resume_np.3 \ pthread_rwlockattr_destroy.3 \ pthread_rwlockattr_getpshared.3 \ pthread_rwlockattr_init.3 \ pthread_rwlockattr_setpshared.3 \ pthread_rwlock_destroy.3 \ pthread_rwlock_init.3 \ pthread_rwlock_rdlock.3 \ pthread_rwlock_timedrdlock.3 \ pthread_rwlock_timedwrlock.3 \ pthread_rwlock_unlock.3 \ pthread_rwlock_wrlock.3 \ pthread_schedparam.3 \ pthread_self.3 \ pthread_set_name_np.3 \ pthread_setspecific.3 \ pthread_sigmask.3 \ pthread_spin_init.3 \ pthread_spin_lock.3 \ pthread_suspend_all_np.3 \ pthread_suspend_np.3 \ pthread_switch_add_np.3 \ pthread_testcancel.3 \ pthread_yield.3 PTHREAD_MLINKS= pthread_affinity_np.3 pthread_getaffinity_np.3 \ pthread_affinity_np.3 pthread_setaffinity_np.3 PTHREAD_MLINKS+=pthread_attr.3 pthread_attr_destroy.3 \ pthread_attr.3 pthread_attr_getdetachstate.3 \ pthread_attr.3 pthread_attr_getguardsize.3 \ pthread_attr.3 pthread_attr_getinheritsched.3 \ pthread_attr.3 pthread_attr_getschedparam.3 \ pthread_attr.3 pthread_attr_getschedpolicy.3 \ pthread_attr.3 pthread_attr_getscope.3 \ pthread_attr.3 pthread_attr_getstack.3 \ pthread_attr.3 pthread_attr_getstackaddr.3 \ pthread_attr.3 pthread_attr_getstacksize.3 \ pthread_attr.3 pthread_attr_init.3 \ pthread_attr.3 pthread_attr_setdetachstate.3 \ pthread_attr.3 pthread_attr_setguardsize.3 \ pthread_attr.3 pthread_attr_setinheritsched.3 \ pthread_attr.3 pthread_attr_setschedparam.3 \ pthread_attr.3 pthread_attr_setschedpolicy.3 \ pthread_attr.3 pthread_attr_setscope.3 \ pthread_attr.3 pthread_attr_setstack.3 \ pthread_attr.3 pthread_attr_setstackaddr.3 \ pthread_attr.3 pthread_attr_setstacksize.3 PTHREAD_MLINKS+=pthread_attr_affinity_np.3 pthread_attr_getaffinity_np.3 \ pthread_attr_affinity_np.3 pthread_attr_setaffinity_np.3 PTHREAD_MLINKS+=pthread_barrierattr.3 pthread_barrierattr_destroy.3 \ pthread_barrierattr.3 pthread_barrierattr_getpshared.3 \ pthread_barrierattr.3 pthread_barrierattr_init.3 \ pthread_barrierattr.3 pthread_barrierattr_setpshared.3 PTHREAD_MLINKS+=pthread_barrier_destroy.3 pthread_barrier_init.3 \ 
pthread_barrier_destroy.3 pthread_barrier_wait.3 PTHREAD_MLINKS+=pthread_condattr.3 pthread_condattr_destroy.3 \ pthread_condattr.3 pthread_condattr_init.3 \ pthread_condattr.3 pthread_condattr_getclock.3 \ pthread_condattr.3 pthread_condattr_setclock.3 \ pthread_condattr.3 pthread_condattr_getpshared.3 \ pthread_condattr.3 pthread_condattr_setpshared.3 PTHREAD_MLINKS+=pthread_getconcurrency.3 pthread_setconcurrency.3 PTHREAD_MLINKS+=pthread_multi_np.3 pthread_single_np.3 PTHREAD_MLINKS+=pthread_mutexattr.3 pthread_mutexattr_destroy.3 \ pthread_mutexattr.3 pthread_mutexattr_getprioceiling.3 \ pthread_mutexattr.3 pthread_mutexattr_getprotocol.3 \ pthread_mutexattr.3 pthread_mutexattr_getrobust.3 \ pthread_mutexattr.3 pthread_mutexattr_gettype.3 \ pthread_mutexattr.3 pthread_mutexattr_init.3 \ pthread_mutexattr.3 pthread_mutexattr_setprioceiling.3 \ pthread_mutexattr.3 pthread_mutexattr_setprotocol.3 \ pthread_mutexattr.3 pthread_mutexattr_setrobust.3 \ pthread_mutexattr.3 pthread_mutexattr_settype.3 PTHREAD_MLINKS+=pthread_mutexattr_getkind_np.3 pthread_mutexattr_setkind_np.3 PTHREAD_MLINKS+=pthread_rwlock_rdlock.3 pthread_rwlock_tryrdlock.3 PTHREAD_MLINKS+=pthread_rwlock_wrlock.3 pthread_rwlock_trywrlock.3 PTHREAD_MLINKS+=pthread_schedparam.3 pthread_getschedparam.3 \ pthread_schedparam.3 pthread_setschedparam.3 PTHREAD_MLINKS+=pthread_spin_init.3 pthread_spin_destroy.3 \ pthread_spin_lock.3 pthread_spin_trylock.3 \ pthread_spin_lock.3 pthread_spin_unlock.3 PTHREAD_MLINKS+=pthread_switch_add_np.3 pthread_switch_delete_np.3 PTHREAD_MLINKS+=pthread_testcancel.3 pthread_setcancelstate.3 \ pthread_testcancel.3 pthread_setcanceltype.3 PTHREAD_MLINKS+=pthread_join.3 pthread_timedjoin_np.3 .endif .include Index: projects/clang390-import/share/man/man3/queue.3 =================================================================== --- projects/clang390-import/share/man/man3/queue.3 (revision 305686) +++ projects/clang390-import/share/man/man3/queue.3 (revision 305687) @@ -1,1293 +1,1337 @@ .\" Copyright (c) 1993 .\" The Regents of the University of California. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. Neither the name of the University nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .\" @(#)queue.3 8.2 (Berkeley) 1/24/94 .\" $FreeBSD$ .\" -.Dd August 15, 2016 +.Dd September 8, 2016 .Dt QUEUE 3 .Os .Sh NAME .Nm SLIST_CLASS_ENTRY , .Nm SLIST_CLASS_HEAD , .Nm SLIST_CONCAT , .Nm SLIST_EMPTY , .Nm SLIST_ENTRY , .Nm SLIST_FIRST , .Nm SLIST_FOREACH , .Nm SLIST_FOREACH_FROM , .Nm SLIST_FOREACH_FROM_SAFE , .Nm SLIST_FOREACH_SAFE , .Nm SLIST_HEAD , .Nm SLIST_HEAD_INITIALIZER , .Nm SLIST_INIT , .Nm SLIST_INSERT_AFTER , .Nm SLIST_INSERT_HEAD , .Nm SLIST_NEXT , .Nm SLIST_REMOVE , .Nm SLIST_REMOVE_AFTER , .Nm SLIST_REMOVE_HEAD , .Nm SLIST_SWAP , .Nm STAILQ_CLASS_ENTRY , .Nm STAILQ_CLASS_HEAD , .Nm STAILQ_CONCAT , .Nm STAILQ_EMPTY , .Nm STAILQ_ENTRY , .Nm STAILQ_FIRST , .Nm STAILQ_FOREACH , .Nm STAILQ_FOREACH_FROM , .Nm STAILQ_FOREACH_FROM_SAFE , .Nm STAILQ_FOREACH_SAFE , .Nm STAILQ_HEAD , .Nm STAILQ_HEAD_INITIALIZER , .Nm STAILQ_INIT , .Nm STAILQ_INSERT_AFTER , .Nm STAILQ_INSERT_HEAD , .Nm STAILQ_INSERT_TAIL , .Nm STAILQ_LAST , .Nm STAILQ_NEXT , .Nm STAILQ_REMOVE , .Nm STAILQ_REMOVE_AFTER , .Nm STAILQ_REMOVE_HEAD , .Nm STAILQ_SWAP , .Nm LIST_CLASS_ENTRY , .Nm LIST_CLASS_HEAD , .Nm LIST_CONCAT , .Nm LIST_EMPTY , .Nm LIST_ENTRY , .Nm LIST_FIRST , .Nm LIST_FOREACH , .Nm LIST_FOREACH_FROM , .Nm LIST_FOREACH_FROM_SAFE , .Nm LIST_FOREACH_SAFE , .Nm LIST_HEAD , .Nm LIST_HEAD_INITIALIZER , .Nm LIST_INIT , .Nm LIST_INSERT_AFTER , .Nm LIST_INSERT_BEFORE , .Nm LIST_INSERT_HEAD , .Nm LIST_NEXT , .Nm LIST_PREV , .Nm LIST_REMOVE , .Nm LIST_SWAP , .Nm TAILQ_CLASS_ENTRY , .Nm TAILQ_CLASS_HEAD , .Nm TAILQ_CONCAT , .Nm TAILQ_EMPTY , .Nm TAILQ_ENTRY , .Nm TAILQ_FIRST , .Nm TAILQ_FOREACH , .Nm TAILQ_FOREACH_FROM , .Nm TAILQ_FOREACH_FROM_SAFE , .Nm TAILQ_FOREACH_REVERSE , .Nm TAILQ_FOREACH_REVERSE_FROM , .Nm TAILQ_FOREACH_REVERSE_FROM_SAFE , .Nm TAILQ_FOREACH_REVERSE_SAFE , .Nm TAILQ_FOREACH_SAFE , .Nm TAILQ_HEAD , .Nm TAILQ_HEAD_INITIALIZER , .Nm TAILQ_INIT , .Nm TAILQ_INSERT_AFTER , .Nm TAILQ_INSERT_BEFORE , .Nm TAILQ_INSERT_HEAD , .Nm TAILQ_INSERT_TAIL , .Nm TAILQ_LAST , .Nm TAILQ_NEXT , .Nm TAILQ_PREV , .Nm TAILQ_REMOVE , .Nm TAILQ_SWAP .Nd implementations of singly-linked lists, singly-linked tail queues, lists and tail queues .Sh SYNOPSIS .In sys/queue.h .\" .Fn SLIST_CLASS_ENTRY "CLASSTYPE" .Fn SLIST_CLASS_HEAD "HEADNAME" "CLASSTYPE" .Fn SLIST_CONCAT "SLIST_HEAD *head1" "SLIST_HEAD *head2" "TYPE" "SLIST_ENTRY NAME" .Fn SLIST_EMPTY "SLIST_HEAD *head" .Fn SLIST_ENTRY "TYPE" .Fn SLIST_FIRST "SLIST_HEAD *head" .Fn SLIST_FOREACH "TYPE *var" "SLIST_HEAD *head" "SLIST_ENTRY NAME" .Fn SLIST_FOREACH_FROM "TYPE *var" "SLIST_HEAD *head" "SLIST_ENTRY NAME" .Fn SLIST_FOREACH_FROM_SAFE "TYPE *var" "SLIST_HEAD *head" "SLIST_ENTRY NAME" "TYPE *temp_var" .Fn SLIST_FOREACH_SAFE "TYPE *var" "SLIST_HEAD *head" "SLIST_ENTRY NAME" "TYPE *temp_var" .Fn SLIST_HEAD "HEADNAME" "TYPE" .Fn SLIST_HEAD_INITIALIZER "SLIST_HEAD head" .Fn SLIST_INIT "SLIST_HEAD *head" .Fn SLIST_INSERT_AFTER "TYPE *listelm" "TYPE *elm" "SLIST_ENTRY NAME" .Fn SLIST_INSERT_HEAD "SLIST_HEAD *head" "TYPE *elm" 
"SLIST_ENTRY NAME" .Fn SLIST_NEXT "TYPE *elm" "SLIST_ENTRY NAME" .Fn SLIST_REMOVE "SLIST_HEAD *head" "TYPE *elm" "TYPE" "SLIST_ENTRY NAME" .Fn SLIST_REMOVE_AFTER "TYPE *elm" "SLIST_ENTRY NAME" .Fn SLIST_REMOVE_HEAD "SLIST_HEAD *head" "SLIST_ENTRY NAME" .Fn SLIST_SWAP "SLIST_HEAD *head1" "SLIST_HEAD *head2" "TYPE" .\" .Fn STAILQ_CLASS_ENTRY "CLASSTYPE" .Fn STAILQ_CLASS_HEAD "HEADNAME" "CLASSTYPE" .Fn STAILQ_CONCAT "STAILQ_HEAD *head1" "STAILQ_HEAD *head2" .Fn STAILQ_EMPTY "STAILQ_HEAD *head" .Fn STAILQ_ENTRY "TYPE" .Fn STAILQ_FIRST "STAILQ_HEAD *head" .Fn STAILQ_FOREACH "TYPE *var" "STAILQ_HEAD *head" "STAILQ_ENTRY NAME" .Fn STAILQ_FOREACH_FROM "TYPE *var" "STAILQ_HEAD *head" "STAILQ_ENTRY NAME" .Fn STAILQ_FOREACH_FROM_SAFE "TYPE *var" "STAILQ_HEAD *head" "STAILQ_ENTRY NAME" "TYPE *temp_var" .Fn STAILQ_FOREACH_SAFE "TYPE *var" "STAILQ_HEAD *head" "STAILQ_ENTRY NAME" "TYPE *temp_var" .Fn STAILQ_HEAD "HEADNAME" "TYPE" .Fn STAILQ_HEAD_INITIALIZER "STAILQ_HEAD head" .Fn STAILQ_INIT "STAILQ_HEAD *head" .Fn STAILQ_INSERT_AFTER "STAILQ_HEAD *head" "TYPE *listelm" "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_INSERT_HEAD "STAILQ_HEAD *head" "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_INSERT_TAIL "STAILQ_HEAD *head" "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_LAST "STAILQ_HEAD *head" "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_NEXT "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_REMOVE "STAILQ_HEAD *head" "TYPE *elm" "TYPE" "STAILQ_ENTRY NAME" .Fn STAILQ_REMOVE_AFTER "STAILQ_HEAD *head" "TYPE *elm" "STAILQ_ENTRY NAME" .Fn STAILQ_REMOVE_HEAD "STAILQ_HEAD *head" "STAILQ_ENTRY NAME" .Fn STAILQ_SWAP "STAILQ_HEAD *head1" "STAILQ_HEAD *head2" "TYPE" .\" .Fn LIST_CLASS_ENTRY "CLASSTYPE" .Fn LIST_CLASS_HEAD "HEADNAME" "CLASSTYPE" .Fn LIST_CONCAT "LIST_HEAD *head1" "LIST_HEAD *head2" "TYPE" "LIST_ENTRY NAME" .Fn LIST_EMPTY "LIST_HEAD *head" .Fn LIST_ENTRY "TYPE" .Fn LIST_FIRST "LIST_HEAD *head" .Fn LIST_FOREACH "TYPE *var" "LIST_HEAD *head" "LIST_ENTRY NAME" .Fn LIST_FOREACH_FROM "TYPE *var" "LIST_HEAD *head" "LIST_ENTRY NAME" .Fn LIST_FOREACH_FROM_SAFE "TYPE *var" "LIST_HEAD *head" "LIST_ENTRY NAME" "TYPE *temp_var" .Fn LIST_FOREACH_SAFE "TYPE *var" "LIST_HEAD *head" "LIST_ENTRY NAME" "TYPE *temp_var" .Fn LIST_HEAD "HEADNAME" "TYPE" .Fn LIST_HEAD_INITIALIZER "LIST_HEAD head" .Fn LIST_INIT "LIST_HEAD *head" .Fn LIST_INSERT_AFTER "TYPE *listelm" "TYPE *elm" "LIST_ENTRY NAME" .Fn LIST_INSERT_BEFORE "TYPE *listelm" "TYPE *elm" "LIST_ENTRY NAME" .Fn LIST_INSERT_HEAD "LIST_HEAD *head" "TYPE *elm" "LIST_ENTRY NAME" .Fn LIST_NEXT "TYPE *elm" "LIST_ENTRY NAME" .Fn LIST_PREV "TYPE *elm" "LIST_HEAD *head" "TYPE" "LIST_ENTRY NAME" .Fn LIST_REMOVE "TYPE *elm" "LIST_ENTRY NAME" .Fn LIST_SWAP "LIST_HEAD *head1" "LIST_HEAD *head2" "TYPE" "LIST_ENTRY NAME" .\" .Fn TAILQ_CLASS_ENTRY "CLASSTYPE" .Fn TAILQ_CLASS_HEAD "HEADNAME" "CLASSTYPE" .Fn TAILQ_CONCAT "TAILQ_HEAD *head1" "TAILQ_HEAD *head2" "TAILQ_ENTRY NAME" .Fn TAILQ_EMPTY "TAILQ_HEAD *head" .Fn TAILQ_ENTRY "TYPE" .Fn TAILQ_FIRST "TAILQ_HEAD *head" .Fn TAILQ_FOREACH "TYPE *var" "TAILQ_HEAD *head" "TAILQ_ENTRY NAME" .Fn TAILQ_FOREACH_FROM "TYPE *var" "TAILQ_HEAD *head" "TAILQ_ENTRY NAME" .Fn TAILQ_FOREACH_FROM_SAFE "TYPE *var" "TAILQ_HEAD *head" "TAILQ_ENTRY NAME" "TYPE *temp_var" .Fn TAILQ_FOREACH_REVERSE "TYPE *var" "TAILQ_HEAD *head" "HEADNAME" "TAILQ_ENTRY NAME" .Fn TAILQ_FOREACH_REVERSE_FROM "TYPE *var" "TAILQ_HEAD *head" "HEADNAME" "TAILQ_ENTRY NAME" .Fn TAILQ_FOREACH_REVERSE_FROM_SAFE "TYPE *var" "TAILQ_HEAD *head" "HEADNAME" "TAILQ_ENTRY NAME" "TYPE *temp_var" .Fn 
TAILQ_FOREACH_REVERSE_SAFE "TYPE *var" "TAILQ_HEAD *head" "HEADNAME" "TAILQ_ENTRY NAME" "TYPE *temp_var" .Fn TAILQ_FOREACH_SAFE "TYPE *var" "TAILQ_HEAD *head" "TAILQ_ENTRY NAME" "TYPE *temp_var" .Fn TAILQ_HEAD "HEADNAME" "TYPE" .Fn TAILQ_HEAD_INITIALIZER "TAILQ_HEAD head" .Fn TAILQ_INIT "TAILQ_HEAD *head" .Fn TAILQ_INSERT_AFTER "TAILQ_HEAD *head" "TYPE *listelm" "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_INSERT_BEFORE "TYPE *listelm" "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_INSERT_HEAD "TAILQ_HEAD *head" "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_INSERT_TAIL "TAILQ_HEAD *head" "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_LAST "TAILQ_HEAD *head" "HEADNAME" .Fn TAILQ_NEXT "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_PREV "TYPE *elm" "HEADNAME" "TAILQ_ENTRY NAME" .Fn TAILQ_REMOVE "TAILQ_HEAD *head" "TYPE *elm" "TAILQ_ENTRY NAME" .Fn TAILQ_SWAP "TAILQ_HEAD *head1" "TAILQ_HEAD *head2" "TYPE" "TAILQ_ENTRY NAME" .\" .Sh DESCRIPTION These macros define and operate on four types of data structures which can be used in both C and C++ source code: .Bl -enum -compact -offset indent .It Lists .It Singly-linked lists .It Singly-linked tail queues .It Tail queues .El All four structures support the following functionality: .Bl -enum -compact -offset indent .It Insertion of a new entry at the head of the list. .It Insertion of a new entry after any element in the list. .It O(1) removal of an entry from the head of the list. .It Forward traversal through the list. .It Swapping the contents of two lists. .El .Pp Singly-linked lists are the simplest of the four data structures. Singly-linked lists are ideal for applications with large datasets and few or no removals, or for implementing a LIFO queue. Singly-linked lists add the following functionality: .Bl -enum -compact -offset indent .It O(n) removal of any entry in the list. .It O(n) concatenation of two lists. .El .Pp Singly-linked tail queues add the following functionality: .Bl -enum -compact -offset indent .It Entries can be added at the end of a list. .It O(n) removal of any entry in the list. .It They may be concatenated. .El However: .Bl -enum -compact -offset indent .It All list insertions must specify the head of the list. .It Each head entry requires two pointers rather than one. .It Code size is about 15% greater and operations run about 20% slower than singly-linked lists. .El .Pp Singly-linked tail queues are ideal for applications with large datasets and few or no removals, or for implementing a FIFO queue. .Pp All doubly linked types of data structures (lists and tail queues) additionally allow: .Bl -enum -compact -offset indent .It Insertion of a new entry before any element in the list. .It O(1) removal of any entry in the list. .El However: .Bl -enum -compact -offset indent .It Each element requires two pointers rather than one. .It Code size and execution time of operations (except for removal) are about twice those of the singly-linked data structures. .El .Pp Linked lists are the simplest of the doubly linked data structures. They add the following functionality over the above: .Bl -enum -compact -offset indent .It O(n) concatenation of two lists. .It They may be traversed backwards. .El However: .Bl -enum -compact -offset indent .It To traverse backwards, an entry to begin the traversal and the list in which it is contained must be specified. .El .Pp Tail queues add the following functionality: .Bl -enum -compact -offset indent .It Entries can be added at the end of a list.
.It They may be traversed backwards, from tail to head. .It They may be concatenated. .El However: .Bl -enum -compact -offset indent .It All list insertions and removals must specify the head of the list. .It Each head entry requires two pointers rather than one. .It Code size is about 15% greater and operations run about 20% slower than singly-linked lists. .El .Pp In the macro definitions, .Fa TYPE is the name of a user defined structure. The structure must contain a field called .Fa NAME which is of type .Li SLIST_ENTRY , .Li STAILQ_ENTRY , .Li LIST_ENTRY , or .Li TAILQ_ENTRY . In the macro definitions, .Fa CLASSTYPE is the name of a user defined class. The class must contain a field called .Fa NAME which is of type .Li SLIST_CLASS_ENTRY , .Li STAILQ_CLASS_ENTRY , .Li LIST_CLASS_ENTRY , or .Li TAILQ_CLASS_ENTRY . The argument .Fa HEADNAME is the name of a user defined structure that must be declared using the macros .Li SLIST_HEAD , .Li SLIST_CLASS_HEAD , .Li STAILQ_HEAD , .Li STAILQ_CLASS_HEAD , .Li LIST_HEAD , .Li LIST_CLASS_HEAD , .Li TAILQ_HEAD , or .Li TAILQ_CLASS_HEAD . See the examples below for further explanation of how these macros are used. .Sh SINGLY-LINKED LISTS A singly-linked list is headed by a structure defined by the .Nm SLIST_HEAD macro. This structure contains a single pointer to the first element on the list. The elements are singly linked for minimum space and pointer manipulation overhead at the expense of O(n) removal for arbitrary elements. New elements can be added to the list after an existing element or at the head of the list. An .Fa SLIST_HEAD structure is declared as follows: .Bd -literal -offset indent SLIST_HEAD(HEADNAME, TYPE) head; .Ed .Pp where .Fa HEADNAME is the name of the structure to be defined, and .Fa TYPE is the type of the elements to be linked into the list. A pointer to the head of the list can later be declared as: .Bd -literal -offset indent struct HEADNAME *headp; .Ed .Pp (The names .Li head and .Li headp are user selectable.) .Pp The macro .Nm SLIST_HEAD_INITIALIZER evaluates to an initializer for the list .Fa head . .Pp The macro .Nm SLIST_CONCAT concatenates the list headed by .Fa head2 onto the end of the one headed by .Fa head1 removing all entries from the former. Use of this macro should be avoided as it traverses the entirety of the .Fa head1 list. A singly-linked tail queue should be used if this macro is needed in high-usage code paths or to operate on long lists. .Pp The macro .Nm SLIST_EMPTY evaluates to true if there are no elements in the list. .Pp The macro .Nm SLIST_ENTRY declares a structure that connects the elements in the list. .Pp The macro .Nm SLIST_FIRST returns the first element in the list or NULL if the list is empty. .Pp The macro .Nm SLIST_FOREACH traverses the list referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . .Pp The macro .Nm SLIST_FOREACH_FROM behaves identically to .Nm SLIST_FOREACH when .Fa var is NULL, else it treats .Fa var as a previously found SLIST element and begins the loop at .Fa var instead of the first element in the SLIST referenced by .Fa head . .Pp The macro .Nm SLIST_FOREACH_SAFE traverses the list referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . However, unlike .Fn SLIST_FOREACH here it is permitted to both remove .Fa var as well as free it from within the loop safely without interfering with the traversal. 
.Pp The macro .Nm SLIST_FOREACH_FROM_SAFE behaves identically to .Nm SLIST_FOREACH_SAFE when .Fa var is NULL, else it treats .Fa var as a previously found SLIST element and begins the loop at .Fa var instead of the first element in the SLIST referenced by .Fa head . .Pp The macro .Nm SLIST_INIT initializes the list referenced by .Fa head . .Pp The macro .Nm SLIST_INSERT_HEAD inserts the new element .Fa elm at the head of the list. .Pp The macro .Nm SLIST_INSERT_AFTER inserts the new element .Fa elm after the element .Fa listelm . .Pp The macro .Nm SLIST_NEXT returns the next element in the list. .Pp The macro .Nm SLIST_REMOVE_AFTER removes the element after .Fa elm from the list. Unlike .Fa SLIST_REMOVE , this macro does not traverse the entire list. .Pp The macro .Nm SLIST_REMOVE_HEAD removes the element at the head of the list. For optimum efficiency, elements being removed from the head of the list should explicitly use this macro instead of the generic .Fa SLIST_REMOVE macro. .Pp The macro .Nm SLIST_REMOVE removes the element .Fa elm from the list. Use of this macro should be avoided as it traverses the entire list. A doubly-linked list should be used if this macro is needed in high-usage code paths or to operate on long lists. .Pp The macro .Nm SLIST_SWAP swaps the contents of .Fa head1 and .Fa head2 . .Sh SINGLY-LINKED LIST EXAMPLE .Bd -literal SLIST_HEAD(slisthead, entry) head = SLIST_HEAD_INITIALIZER(head); struct slisthead *headp; /* Singly-linked List head. */ struct entry { ... SLIST_ENTRY(entry) entries; /* Singly-linked List. */ ... } *n1, *n2, *n3, *np, *np_temp; SLIST_INIT(&head); /* Initialize the list. */ n1 = malloc(sizeof(struct entry)); /* Insert at the head. */ SLIST_INSERT_HEAD(&head, n1, entries); n2 = malloc(sizeof(struct entry)); /* Insert after. */ SLIST_INSERT_AFTER(n1, n2, entries); SLIST_REMOVE(&head, n2, entry, entries);/* Deletion. */ free(n2); n3 = SLIST_FIRST(&head); SLIST_REMOVE_HEAD(&head, entries); /* Deletion from the head. */ free(n3); /* Forward traversal. */ SLIST_FOREACH(np, &head, entries) np-> ... /* Safe forward traversal. */ SLIST_FOREACH_SAFE(np, &head, entries, np_temp) { np->do_stuff(); ... SLIST_REMOVE(&head, np, entry, entries); free(np); } while (!SLIST_EMPTY(&head)) { /* List Deletion. */ n1 = SLIST_FIRST(&head); SLIST_REMOVE_HEAD(&head, entries); free(n1); } .Ed .Sh SINGLY-LINKED TAIL QUEUES A singly-linked tail queue is headed by a structure defined by the .Nm STAILQ_HEAD macro. This structure contains a pair of pointers, one to the first element in the tail queue and the other to the last element in the tail queue. The elements are singly linked for minimum space and pointer manipulation overhead at the expense of O(n) removal for arbitrary elements. New elements can be added to the tail queue after an existing element, at the head of the tail queue, or at the end of the tail queue. A .Fa STAILQ_HEAD structure is declared as follows: .Bd -literal -offset indent STAILQ_HEAD(HEADNAME, TYPE) head; .Ed .Pp where .Li HEADNAME is the name of the structure to be defined, and .Li TYPE is the type of the elements to be linked into the tail queue. A pointer to the head of the tail queue can later be declared as: .Bd -literal -offset indent struct HEADNAME *headp; .Ed .Pp (The names .Li head and .Li headp are user selectable.) .Pp The macro .Nm STAILQ_HEAD_INITIALIZER evaluates to an initializer for the tail queue .Fa head .
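.Pp
A minimal sketch of compile-time initialization (the structure and
variable names here are hypothetical): because
.Nm STAILQ_HEAD_INITIALIZER
expands to a static initializer, a file-scope tail queue head can be
made usable without a run-time
.Nm STAILQ_INIT
call:
.Bd -literal -offset indent
#include <sys/queue.h>

struct entry {
	int value;
	STAILQ_ENTRY(entry) link;
};

/* Initialized at compile time; no STAILQ_INIT needed. */
static STAILQ_HEAD(entryhead, entry) pending =
    STAILQ_HEAD_INITIALIZER(pending);
.Ed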
.Pp The macro .Nm STAILQ_CONCAT concatenates the tail queue headed by .Fa head2 onto the end of the one headed by .Fa head1 removing all entries from the former. .Pp The macro .Nm STAILQ_EMPTY evaluates to true if there are no items on the tail queue. .Pp The macro .Nm STAILQ_ENTRY declares a structure that connects the elements in the tail queue. .Pp The macro .Nm STAILQ_FIRST returns the first item on the tail queue or NULL if the tail queue is empty. .Pp The macro .Nm STAILQ_FOREACH traverses the tail queue referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . .Pp The macro .Nm STAILQ_FOREACH_FROM behaves identically to .Nm STAILQ_FOREACH when .Fa var is NULL, else it treats .Fa var as a previously found STAILQ element and begins the loop at .Fa var instead of the first element in the STAILQ referenced by .Fa head . .Pp The macro .Nm STAILQ_FOREACH_SAFE traverses the tail queue referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . However, unlike .Fn STAILQ_FOREACH here it is permitted to both remove .Fa var as well as free it from within the loop safely without interfering with the traversal. .Pp The macro .Nm STAILQ_FOREACH_FROM_SAFE behaves identically to .Nm STAILQ_FOREACH_SAFE when .Fa var is NULL, else it treats .Fa var as a previously found STAILQ element and begins the loop at .Fa var instead of the first element in the STAILQ referenced by .Fa head . .Pp The macro .Nm STAILQ_INIT initializes the tail queue referenced by .Fa head . .Pp The macro .Nm STAILQ_INSERT_HEAD inserts the new element .Fa elm at the head of the tail queue. .Pp The macro .Nm STAILQ_INSERT_TAIL inserts the new element .Fa elm at the end of the tail queue. .Pp The macro .Nm STAILQ_INSERT_AFTER inserts the new element .Fa elm after the element .Fa listelm . .Pp The macro .Nm STAILQ_LAST returns the last item on the tail queue. If the tail queue is empty the return value is .Dv NULL . .Pp The macro .Nm STAILQ_NEXT returns the next item on the tail queue, or NULL if this item is the last. .Pp The macro .Nm STAILQ_REMOVE_AFTER removes the element after .Fa elm from the tail queue. Unlike .Fa STAILQ_REMOVE , this macro does not traverse the entire tail queue. .Pp The macro .Nm STAILQ_REMOVE_HEAD removes the element at the head of the tail queue. For optimum efficiency, elements being removed from the head of the tail queue should use this macro explicitly rather than the generic .Fa STAILQ_REMOVE macro. .Pp The macro .Nm STAILQ_REMOVE removes the element .Fa elm from the tail queue. Use of this macro should be avoided as it traverses the entire tail queue. A doubly-linked tail queue should be used if this macro is needed in high-usage code paths or to operate on long tail queues. .Pp The macro .Nm STAILQ_SWAP swaps the contents of .Fa head1 and .Fa head2 . .Sh SINGLY-LINKED TAIL QUEUE EXAMPLE .Bd -literal STAILQ_HEAD(stailhead, entry) head = STAILQ_HEAD_INITIALIZER(head); struct stailhead *headp; /* Singly-linked tail queue head. */ struct entry { ... STAILQ_ENTRY(entry) entries; /* Tail queue. */ ... } *n1, *n2, *n3, *np, *np_temp; STAILQ_INIT(&head); /* Initialize the queue. */ n1 = malloc(sizeof(struct entry)); /* Insert at the head. */ STAILQ_INSERT_HEAD(&head, n1, entries); n1 = malloc(sizeof(struct entry)); /* Insert at the tail. */ STAILQ_INSERT_TAIL(&head, n1, entries); n2 = malloc(sizeof(struct entry)); /* Insert after. */ STAILQ_INSERT_AFTER(&head, n1, n2, entries); /* Deletion.
*/ STAILQ_REMOVE(&head, n2, entry, entries); free(n2); /* Deletion from the head. */ n3 = STAILQ_FIRST(&head); STAILQ_REMOVE_HEAD(&head, entries); free(n3); /* Forward traversal. */ STAILQ_FOREACH(np, &head, entries) np-> ... /* Safe forward traversal. */ STAILQ_FOREACH_SAFE(np, &head, entries, np_temp) { np->do_stuff(); ... STAILQ_REMOVE(&head, np, entry, entries); free(np); } /* TailQ Deletion. */ while (!STAILQ_EMPTY(&head)) { n1 = STAILQ_FIRST(&head); STAILQ_REMOVE_HEAD(&head, entries); free(n1); } /* Faster TailQ Deletion. */ n1 = STAILQ_FIRST(&head); while (n1 != NULL) { n2 = STAILQ_NEXT(n1, entries); free(n1); n1 = n2; } STAILQ_INIT(&head); .Ed .Sh LISTS A list is headed by a structure defined by the .Nm LIST_HEAD macro. This structure contains a single pointer to the first element on the list. The elements are doubly linked so that an arbitrary element can be removed without traversing the list. New elements can be added to the list after an existing element, before an existing element, or at the head of the list. A .Fa LIST_HEAD structure is declared as follows: .Bd -literal -offset indent LIST_HEAD(HEADNAME, TYPE) head; .Ed .Pp where .Fa HEADNAME is the name of the structure to be defined, and .Fa TYPE is the type of the elements to be linked into the list. A pointer to the head of the list can later be declared as: .Bd -literal -offset indent struct HEADNAME *headp; .Ed .Pp (The names .Li head and .Li headp are user selectable.) .Pp The macro .Nm LIST_HEAD_INITIALIZER evaluates to an initializer for the list .Fa head . .Pp The macro .Nm LIST_CONCAT concatenates the list headed by .Fa head2 onto the end of the one headed by .Fa head1 removing all entries from the former. Use of this macro should be avoided as it traverses the entirety of the .Fa head1 list. A tail queue should be used if this macro is needed in high-usage code paths or to operate on long lists. .Pp The macro .Nm LIST_EMPTY evaluates to true if there are no elements in the list. .Pp The macro .Nm LIST_ENTRY declares a structure that connects the elements in the list. .Pp The macro .Nm LIST_FIRST returns the first element in the list or NULL if the list is empty. .Pp The macro .Nm LIST_FOREACH traverses the list referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . .Pp The macro .Nm LIST_FOREACH_FROM behaves identically to .Nm LIST_FOREACH when .Fa var is NULL, else it treats .Fa var as a previously found LIST element and begins the loop at .Fa var instead of the first element in the LIST referenced by .Fa head . .Pp The macro .Nm LIST_FOREACH_SAFE traverses the list referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . However, unlike .Fn LIST_FOREACH here it is permitted to both remove .Fa var as well as free it from within the loop safely without interfering with the traversal. .Pp The macro .Nm LIST_FOREACH_FROM_SAFE behaves identically to .Nm LIST_FOREACH_SAFE when .Fa var is NULL, else it treats .Fa var as a previously found LIST element and begins the loop at .Fa var instead of the first element in the LIST referenced by .Fa head . .Pp The macro .Nm LIST_INIT initializes the list referenced by .Fa head . .Pp The macro .Nm LIST_INSERT_HEAD inserts the new element .Fa elm at the head of the list. .Pp The macro .Nm LIST_INSERT_AFTER inserts the new element .Fa elm after the element .Fa listelm . .Pp The macro .Nm LIST_INSERT_BEFORE inserts the new element .Fa elm before the element .Fa listelm . 
.Pp The macro .Nm LIST_NEXT returns the next element in the list, or NULL if this is the last. .Pp The macro .Nm LIST_PREV returns the previous element in the list, or NULL if this is the first. List .Fa head must contain element .Fa elm . .Pp The macro .Nm LIST_REMOVE removes the element .Fa elm from the list. .Pp The macro .Nm LIST_SWAP swaps the contents of .Fa head1 and .Fa head2 . .Sh LIST EXAMPLE .Bd -literal LIST_HEAD(listhead, entry) head = LIST_HEAD_INITIALIZER(head); struct listhead *headp; /* List head. */ struct entry { ... LIST_ENTRY(entry) entries; /* List. */ ... } *n1, *n2, *n3, *np, *np_temp; LIST_INIT(&head); /* Initialize the list. */ n1 = malloc(sizeof(struct entry)); /* Insert at the head. */ LIST_INSERT_HEAD(&head, n1, entries); n2 = malloc(sizeof(struct entry)); /* Insert after. */ LIST_INSERT_AFTER(n1, n2, entries); n3 = malloc(sizeof(struct entry)); /* Insert before. */ LIST_INSERT_BEFORE(n2, n3, entries); LIST_REMOVE(n2, entries); /* Deletion. */ free(n2); /* Forward traversal. */ LIST_FOREACH(np, &head, entries) np-> ... /* Safe forward traversal. */ LIST_FOREACH_SAFE(np, &head, entries, np_temp) { np->do_stuff(); ... LIST_REMOVE(np, entries); free(np); } while (!LIST_EMPTY(&head)) { /* List Deletion. */ n1 = LIST_FIRST(&head); LIST_REMOVE(n1, entries); free(n1); } n1 = LIST_FIRST(&head); /* Faster List Deletion. */ while (n1 != NULL) { n2 = LIST_NEXT(n1, entries); free(n1); n1 = n2; } LIST_INIT(&head); .Ed .Sh TAIL QUEUES A tail queue is headed by a structure defined by the .Nm TAILQ_HEAD macro. This structure contains a pair of pointers, one to the first element in the tail queue and the other to the last element in the tail queue. The elements are doubly linked so that an arbitrary element can be removed without traversing the tail queue. New elements can be added to the tail queue after an existing element, before an existing element, at the head of the tail queue, or at the end of the tail queue. A .Fa TAILQ_HEAD structure is declared as follows: .Bd -literal -offset indent TAILQ_HEAD(HEADNAME, TYPE) head; .Ed .Pp where .Li HEADNAME is the name of the structure to be defined, and .Li TYPE is the type of the elements to be linked into the tail queue. A pointer to the head of the tail queue can later be declared as: .Bd -literal -offset indent struct HEADNAME *headp; .Ed .Pp (The names .Li head and .Li headp are user selectable.) .Pp The macro .Nm TAILQ_HEAD_INITIALIZER evaluates to an initializer for the tail queue .Fa head . .Pp The macro .Nm TAILQ_CONCAT concatenates the tail queue headed by .Fa head2 onto the end of the one headed by .Fa head1 removing all entries from the former. .Pp The macro .Nm TAILQ_EMPTY evaluates to true if there are no items on the tail queue. .Pp The macro .Nm TAILQ_ENTRY declares a structure that connects the elements in the tail queue. .Pp The macro .Nm TAILQ_FIRST returns the first item on the tail queue or NULL if the tail queue is empty. .Pp The macro .Nm TAILQ_FOREACH traverses the tail queue referenced by .Fa head in the forward direction, assigning each element in turn to .Fa var . .Fa var is set to .Dv NULL if the loop completes normally, or if there were no elements. .Pp The macro .Nm TAILQ_FOREACH_FROM behaves identically to .Nm TAILQ_FOREACH when .Fa var is NULL, else it treats .Fa var as a previously found TAILQ element and begins the loop at .Fa var instead of the first element in the TAILQ referenced by .Fa head . 
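.Pp
A minimal sketch of the resume pattern (the types and names here are
hypothetical): because
.Nm TAILQ_FOREACH_FROM
behaves like
.Nm TAILQ_FOREACH
when
.Fa var
is NULL, one loop can start from the head on the first call and
continue from a saved element on later calls:
.Bd -literal -offset indent
#include <stddef.h>
#include <sys/queue.h>

struct entry {
	int value;
	TAILQ_ENTRY(entry) link;
};
TAILQ_HEAD(entryhead, entry);

/* Scan for 'value', resuming at 'start' when it is not NULL,
 * or starting from the head of the queue otherwise. */
static struct entry *
find_from(struct entryhead *head, struct entry *start, int value)
{
	struct entry *np = start;

	TAILQ_FOREACH_FROM(np, head, link) {
		if (np->value == value)
			return (np);
	}
	return (NULL);
}
.Ed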
.Pp The macro .Nm TAILQ_FOREACH_REVERSE traverses the tail queue referenced by .Fa head in the reverse direction, assigning each element in turn to .Fa var . .Pp The macro .Nm TAILQ_FOREACH_REVERSE_FROM behaves identically to .Nm TAILQ_FOREACH_REVERSE when .Fa var is NULL, else it treats .Fa var as a previously found TAILQ element and begins the reverse loop at .Fa var instead of the last element in the TAILQ referenced by .Fa head . .Pp The macros .Nm TAILQ_FOREACH_SAFE and .Nm TAILQ_FOREACH_REVERSE_SAFE traverse the list referenced by .Fa head in the forward or reverse direction respectively, assigning each element in turn to .Fa var . However, unlike their unsafe counterparts .Nm TAILQ_FOREACH and .Nm TAILQ_FOREACH_REVERSE , they permit .Fa var to be removed and freed from within the loop safely without interfering with the traversal. .Pp The macro .Nm TAILQ_FOREACH_FROM_SAFE behaves identically to .Nm TAILQ_FOREACH_SAFE when .Fa var is NULL, else it treats .Fa var as a previously found TAILQ element and begins the loop at .Fa var instead of the first element in the TAILQ referenced by .Fa head . .Pp The macro .Nm TAILQ_FOREACH_REVERSE_FROM_SAFE behaves identically to .Nm TAILQ_FOREACH_REVERSE_SAFE when .Fa var is NULL, else it treats .Fa var as a previously found TAILQ element and begins the reverse loop at .Fa var instead of the last element in the TAILQ referenced by .Fa head . .Pp The macro .Nm TAILQ_INIT initializes the tail queue referenced by .Fa head . .Pp The macro .Nm TAILQ_INSERT_HEAD inserts the new element .Fa elm at the head of the tail queue. .Pp The macro .Nm TAILQ_INSERT_TAIL inserts the new element .Fa elm at the end of the tail queue. .Pp The macro .Nm TAILQ_INSERT_AFTER inserts the new element .Fa elm after the element .Fa listelm . .Pp The macro .Nm TAILQ_INSERT_BEFORE inserts the new element .Fa elm before the element .Fa listelm . .Pp The macro .Nm TAILQ_LAST returns the last item on the tail queue. If the tail queue is empty the return value is .Dv NULL . .Pp The macro .Nm TAILQ_NEXT returns the next item on the tail queue, or NULL if this item is the last. .Pp The macro .Nm TAILQ_PREV returns the previous item on the tail queue, or NULL if this item is the first. .Pp The macro .Nm TAILQ_REMOVE removes the element .Fa elm from the tail queue. .Pp The macro .Nm TAILQ_SWAP swaps the contents of .Fa head1 and .Fa head2 . .Sh TAIL QUEUE EXAMPLE .Bd -literal TAILQ_HEAD(tailhead, entry) head = TAILQ_HEAD_INITIALIZER(head); struct tailhead *headp; /* Tail queue head. */ struct entry { ... TAILQ_ENTRY(entry) entries; /* Tail queue. */ ... } *n1, *n2, *n3, *np, *np_temp; TAILQ_INIT(&head); /* Initialize the queue. */ n1 = malloc(sizeof(struct entry)); /* Insert at the head. */ TAILQ_INSERT_HEAD(&head, n1, entries); n1 = malloc(sizeof(struct entry)); /* Insert at the tail. */ TAILQ_INSERT_TAIL(&head, n1, entries); n2 = malloc(sizeof(struct entry)); /* Insert after. */ TAILQ_INSERT_AFTER(&head, n1, n2, entries); n3 = malloc(sizeof(struct entry)); /* Insert before. */ TAILQ_INSERT_BEFORE(n2, n3, entries); TAILQ_REMOVE(&head, n2, entries); /* Deletion. */ free(n2); /* Forward traversal. */ TAILQ_FOREACH(np, &head, entries) np-> ... /* Safe forward traversal. */ TAILQ_FOREACH_SAFE(np, &head, entries, np_temp) { np->do_stuff(); ... TAILQ_REMOVE(&head, np, entries); free(np); } /* Reverse traversal. */ TAILQ_FOREACH_REVERSE(np, &head, tailhead, entries) np-> ... /* TailQ Deletion.
*/ while (!TAILQ_EMPTY(&head)) { n1 = TAILQ_FIRST(&head); TAILQ_REMOVE(&head, n1, entries); free(n1); } /* Faster TailQ Deletion. */ n1 = TAILQ_FIRST(&head); while (n1 != NULL) { n2 = TAILQ_NEXT(n1, entries); free(n1); n1 = n2; } TAILQ_INIT(&head); .Ed +.Sh DIAGNOSTICS +When debugging +.Nm queue(3) , +it can be useful to trace queue changes. +To enable tracing, define the macro +.Va QUEUE_MACRO_DEBUG_TRACE +at compile time. +.Pp +It can also be useful to trash pointers that have been unlinked from a queue, +to detect use after removal. +To enable pointer trashing, define the macro +.Va QUEUE_MACRO_DEBUG_TRASH +at compile time. +The macro +.Fn QMD_IS_TRASHED "void *ptr" +returns true if +.Fa ptr +has been trashed by the +.Va QUEUE_MACRO_DEBUG_TRASH +option. +.Pp +In the kernel (with +.Va INVARIANTS +enabled), the +.Fn SLIST_REMOVE_PREVPTR +macro is available to aid debugging: +.Bl -hang -offset indent +.It Fn SLIST_REMOVE_PREVPTR "TYPE **prev" "TYPE *elm" "SLIST_ENTRY NAME" +.Pp +Removes +.Fa elm , +which must directly follow the element whose +.Va &SLIST_NEXT() +is +.Fa prev , +from the SLIST. +This macro validates that +.Fa elm +follows +.Fa prev +in +.Va INVARIANTS +mode. +.El .Sh SEE ALSO .Xr tree 3 .Sh HISTORY The .Nm queue functions first appeared in .Bx 4.4 . Index: projects/clang390-import/share/man/man4/ddb.4 =================================================================== --- projects/clang390-import/share/man/man4/ddb.4 (revision 305686) +++ projects/clang390-import/share/man/man4/ddb.4 (revision 305687) @@ -1,1565 +1,1576 @@ .\" .\" Mach Operating System .\" Copyright (c) 1991,1990 Carnegie Mellon University .\" Copyright (c) 2007 Robert N. M. Watson .\" All Rights Reserved. .\" .\" Permission to use, copy, modify and distribute this software and its .\" documentation is hereby granted, provided that both the copyright .\" notice and this permission notice appear in all copies of the .\" software, derivative works or modified versions, and any portions .\" thereof, and that both notices appear in supporting documentation. .\" .\" CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" .\" CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR .\" ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. .\" .\" Carnegie Mellon requests users of this software to return to .\" .\" Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU .\" School of Computer Science .\" Carnegie Mellon University .\" Pittsburgh PA 15213-3890 .\" .\" any improvements or extensions that they make and grant Carnegie Mellon .\" the rights to redistribute these changes. .\" .\" changed a \# to #, since groff choked on it. .\" .\" HISTORY .\" ddb.4,v .\" Revision 1.1 1993/07/15 18:41:02 brezak .\" Man page for DDB .\" .\" Revision 2.6 92/04/08 08:52:57 rpd .\" Changes from OSF. .\" [92/01/17 14:19:22 jsb] .\" Changes for OSF debugger modifications. .\" [91/12/12 tak] .\" .\" Revision 2.5 91/06/25 13:50:22 rpd .\" Added some watchpoint explanation. .\" [91/06/25 rpd] .\" .\" Revision 2.4 91/06/17 15:47:31 jsb .\" Added documentation for continue/c, match, search, and watchpoints. .\" I've not actually explained what a watchpoint is; maybe Rich can .\" do that (hint, hint). .\" [91/06/17 10:58:08 jsb] .\" .\" Revision 2.3 91/05/14 17:04:23 mrt .\" Correcting copyright .\" .\" Revision 2.2 91/02/14 14:10:06 mrt .\" Changed to new Mach copyright .\" [91/02/12 18:10:12 mrt] .\" .\" Revision 2.2 90/08/30 14:23:15 dbg .\" Created. 
.\" [90/08/30 dbg] .\" .\" $FreeBSD$ .\" .Dd July 13, 2016 .Dt DDB 4 .Os .Sh NAME .Nm ddb .Nd interactive kernel debugger .Sh SYNOPSIS In order to enable kernel debugging facilities include: .Bd -ragged -offset indent .Cd options KDB .Cd options DDB .Ed .Pp To prevent activation of the debugger on kernel .Xr panic 9 : .Bd -ragged -offset indent .Cd options KDB_UNATTENDED .Ed .Pp In order to print a stack trace of the current thread on the console for a panic: .Bd -ragged -offset indent .Cd options KDB_TRACE .Ed .Pp To print the numerical value of symbols in addition to the symbolic representation, define: .Bd -ragged -offset indent .Cd options DDB_NUMSYM .Ed .Pp To enable the .Xr gdb 1 backend, so that remote debugging with .Xr kgdb 1 is possible, include: .Bd -ragged -offset indent .Cd options GDB .Ed .Sh DESCRIPTION The .Nm kernel debugger is an interactive debugger with a syntax inspired by .Xr gdb 1 . If linked into the running kernel, it can be invoked locally with the .Ql debug .Xr keymap 5 action. The debugger is also invoked on kernel .Xr panic 9 if the .Va debug.debugger_on_panic .Xr sysctl 8 MIB variable is set non-zero, which is the default unless the .Dv KDB_UNATTENDED option is specified. .Pp The current location is called .Va dot . The .Va dot is displayed with a hexadecimal format at a prompt. The commands .Ic examine and .Ic write update .Va dot to the address of the last line examined or the last location modified, and set .Va next to the address of the next location to be examined or changed. Other commands do not change .Va dot , and set .Va next to be the same as .Va dot . .Pp The general command syntax is: .Ar command Ns Op Li / Ns Ar modifier -.Ar address Ns Op Li , Ns Ar count +.Oo Ar addr Oc Ns Op Li , Ns Ar count .Pp A blank line repeats the previous command from the address .Va next with count 1 and no modifiers. Specifying -.Ar address +.Ar addr sets .Va dot to the address. Omitting -.Ar address +.Ar addr uses .Va dot . A missing .Ar count is taken to be 1 for printing commands or infinity for stack traces. +A +.Ar count +of -1 is equivalent to a missing +.Ar count . +Options that are supplied but not supported by the given +.Ar command +are usually ignored. .Pp The .Nm debugger has a pager feature (like the .Xr more 1 command) for the output. If an output line exceeds the number set in the .Va lines variable, it displays .Dq Li --More-- and waits for a response. The valid responses for it are: .Pp .Bl -tag -compact -width ".Li SPC" .It Li SPC one more page .It Li RET one more line .It Li q abort the current command, and return to the command input mode .El .Pp Finally, .Nm provides a small (currently 10 items) command history, and offers simple .Nm emacs Ns -style command line editing capabilities. In addition to the .Nm emacs control keys, the usual .Tn ANSI arrow keys may be used to browse through the history buffer, and move the cursor within the current line. .Sh COMMANDS .Bl -tag -width indent -compact -.It Ic examine -.It Ic x +.It Xo +.Ic examine Ns Op Li / Ns Cm AISabcdghilmorsuxz ... +.Oo Ar addr Oc Ns Op Li , Ns Ar count +.Xc +.It Xo +.Ic x Ns Op Li / Ns Cm AISabcdghilmorsuxz ... +.Oo Ar addr Oc Ns Op Li , Ns Ar count +.Xc Display the addressed locations according to the formats in the modifier. Multiple modifier formats display multiple locations. If no format is specified, the last format specified for this command is used. 
.Pp The format characters are: .Bl -tag -compact -width indent .It Cm b look at by bytes (8 bits) .It Cm h look at by half words (16 bits) .It Cm l look at by long words (32 bits) .It Cm g look at by quad words (64 bits) .It Cm a print the location being displayed .It Cm A print the location with a line number if possible .It Cm x display in unsigned hex .It Cm z display in signed hex .It Cm o display in unsigned octal .It Cm d display in signed decimal .It Cm u display in unsigned decimal .It Cm r display in current radix, signed .It Cm c display low 8 bits as a character. Non-printing characters are displayed as an octal escape code (e.g., .Ql \e000 ) . .It Cm s display the null-terminated string at the location. Non-printing characters are displayed as octal escapes. .It Cm m display in unsigned hex with character dump at the end of each line. The location is also displayed in hex at the beginning of each line. .It Cm i display as an instruction .It Cm I display as an instruction with possible alternate formats depending on the -machine, but none of the supported architectures have an alternate format. +machine, but none of the supported architectures have an alternate format .It Cm S display a symbol name for the pointer stored at the address .El .Pp .It Ic xf Examine forward: execute an .Ic examine command with the last specified parameters to it except that the next address displayed by it is used as the start address. .Pp .It Ic xb Examine backward: execute an .Ic examine command with the last specified parameters to it except that the last start address subtracted by the size displayed by it is used as the start address. .Pp .It Ic print Ns Op Li / Ns Cm acdoruxz .It Ic p Ns Op Li / Ns Cm acdoruxz Print .Ar addr Ns s according to the modifier character (as described above for .Cm examine ) . Valid formats are: .Cm a , x , z , o , d , u , r , and .Cm c . If no modifier is specified, the last one specified to it is used. The argument .Ar addr can be a string, in which case it is printed as it is. For example: .Bd -literal -offset indent print/x "eax = " $eax "\enecx = " $ecx "\en" .Ed .Pp will print like: .Bd -literal -offset indent eax = xxxxxx ecx = yyyyyy .Ed .Pp .It Xo .Ic write Ns Op Li / Ns Cm bhl .Ar addr expr1 Op Ar expr2 ... .Xc .It Xo .Ic w Ns Op Li / Ns Cm bhl .Ar addr expr1 Op Ar expr2 ... .Xc Write the expressions specified after .Ar addr on the command line at succeeding locations starting with .Ar addr . The write unit size can be specified in the modifier with a letter .Cm b (byte), .Cm h (half word) or .Cm l (long word) respectively. If omitted, long word is assumed. .Pp .Sy Warning : since there is no delimiter between expressions, strange things may happen. It is best to enclose each expression in parentheses. .Pp .It Ic set Li $ Ns Ar variable Oo Li = Oc Ar expr Set the named variable or register with the value of .Ar expr . Valid variable names are described below. .Pp -.It Ic break Ns Op Li / Ns Cm u -.It Ic b Ns Op Li / Ns Cm u +.It Ic break Ns Oo Li / Ns Cm u Oc Oo Ar addr Oc Ns Op Li , Ns Ar count +.It Ic b Ns Oo Li / Ns Cm u Oc Oo Ar addr Oc Ns Op Li , Ns Ar count Set a break point at .Ar addr . If .Ar count -is supplied, continues +is supplied, the +.Ic continue +command will not stop at this break point on the first .Ar count -\- 1 times before stopping at the -break point. +\- 1 times that it is hit. If the break point is set, a break point number is printed with .Ql # . This number can be used in deleting the break point or adding conditions to it. 
.Pp If the .Cm u modifier is specified, this command sets a break point in user address space. Without the .Cm u option, the address is considered to be in the kernel space, and a wrong space address is rejected with an error message. This modifier can be used only if it is supported by machine dependent routines. .Pp .Sy Warning : If a user text is shadowed by a normal user space debugger, user space break points may not work correctly. Setting a break point at the low-level code paths may also cause strange behavior. .Pp -.It Ic delete Ar addr -.It Ic d Ar addr +.It Ic delete Op Ar addr +.It Ic d Op Ar addr .It Ic delete Li # Ns Ar number -.It Ic d Li # Ns Ar number -Delete the break point. -The target break point can be specified by a +.It Ic d Li # Ns Ar number +Delete the specified break point. +The break point can be specified by a break point number with .Ql # , or by using the same .Ar addr specified in the original .Ic break -command. +command, or by omitting +.Ar addr +to get the default address of +.Va dot . .Pp -.It Ic watch Ar addr Ns Li , Ns Ar size +.It Ic watch Oo Ar addr Oc Ns Op Li , Ns Ar size Set a watchpoint for a region. Execution stops when an attempt to modify the region occurs. The .Ar size argument defaults to 4. If you specify a wrong space address, the request is rejected with an error message. .Pp .Sy Warning : Attempts to watch wired kernel memory may cause unrecoverable error in some systems such as i386. Watchpoints on user addresses work best. .Pp -.It Ic hwatch Ar addr Ns Li , Ns Ar size +.It Ic hwatch Oo Ar addr Oc Ns Op Li , Ns Ar size Set a hardware watchpoint for a region if supported by the architecture. Execution stops when an attempt to modify the region occurs. The .Ar size argument defaults to 4. .Pp .Sy Warning : The hardware debug facilities do not have a concept of separate address spaces like the watch command does. Use .Ic hwatch for setting watchpoints on kernel address locations only, and avoid its use on user mode address spaces. .Pp -.It Ic dhwatch Ar addr Ns Li , Ns Ar size +.It Ic dhwatch Oo Ar addr Oc Ns Op Li , Ns Ar size Delete specified hardware watchpoint. .Pp -.It Ic step Ns Op Li / Ns Cm p -.It Ic s Ns Op Li / Ns Cm p +.It Ic step Ns Oo Li / Ns Cm p Oc Ns Op Li , Ns Ar count +.It Ic s Ns Oo Li / Ns Cm p Oc Ns Op Li , Ns Ar count Single step .Ar count -times (the comma is a mandatory part of the syntax). +times. If the .Cm p modifier is specified, print each instruction at each step. Otherwise, only print the last instruction. .Pp .Sy Warning : depending on machine type, it may not be possible to single-step through some low-level code paths or user space code. On machines with software-emulated single-stepping (e.g., pmax), stepping through code executed by interrupt handlers will probably do the wrong thing. .Pp .It Ic continue Ns Op Li / Ns Cm c .It Ic c Ns Op Li / Ns Cm c Continue execution until a breakpoint or watchpoint. If the .Cm c modifier is specified, count instructions while executing. Some machines (e.g., pmax) also count loads and stores. .Pp .Sy Warning : when counting, the debugger is really silently single-stepping. This means that single-stepping on low-level code may cause strange behavior. .Pp .It Ic until Ns Op Li / Ns Cm p Stop at the next call or return instruction. If the .Cm p modifier is specified, print the call nesting depth and the cumulative instruction count at each call or return. Otherwise, only print when the matching return is hit. 
.Pp .It Ic next Ns Op Li / Ns Cm p .It Ic match Ns Op Li / Ns Cm p Stop at the matching return instruction. If the .Cm p modifier is specified, print the call nesting depth and the cumulative instruction count at each call or return. Otherwise, only print when the matching return is hit. .Pp .It Xo .Ic trace Ns Op Li / Ns Cm u -.Op Ar pid | tid +.Op Ar pid | tid Ns .Op Li , Ns Ar count .Xc .It Xo .Ic t Ns Op Li / Ns Cm u -.Op Ar pid | tid +.Op Ar pid | tid Ns .Op Li , Ns Ar count .Xc .It Xo .Ic where Ns Op Li / Ns Cm u -.Op Ar pid | tid +.Op Ar pid | tid Ns .Op Li , Ns Ar count .Xc .It Xo .Ic bt Ns Op Li / Ns Cm u -.Op Ar pid | tid +.Op Ar pid | tid Ns .Op Li , Ns Ar count .Xc Stack trace. The .Cm u option traces user space; if omitted, .Ic trace only traces kernel space. The optional argument .Ar count is the number of frames to be traced. If .Ar count is omitted, all frames are printed. .Pp .Sy Warning : User space stack trace is valid only if the machine dependent code supports it. .Pp .It Xo .Ic search Ns Op Li / Ns Cm bhl .Ar addr .Ar value -.Op Ar mask +.Op Ar mask Ns .Op Li , Ns Ar count .Xc Search memory for .Ar value . -This command might fail in interesting -ways if it does not find the searched-for value. -This is because -.Nm -does not always recover from touching bad memory. The optional .Ar count argument limits the search. .\" .Pp .It Xo .Ic findstack .Ar addr .Xc Prints the address of the thread whose kernel-mode stack contains the specified address. If no such thread is found, searches the thread stack cache and prints the cached stack address. Otherwise, prints nothing. .Pp .It Ic show Cm all procs Ns Op Li / Ns Cm m .It Ic ps Ns Op Li / Ns Cm m Display all process information. The process information may not be shown if it is not supported on the machine, or if the bottom of the stack of the target process is not in main memory at that time. The .Cm m modifier will alter the display to show VM map addresses for the process and not show other information. .\" .Pp .It Ic show Cm all trace .It Ic alltrace -.Xc Show a stack trace for every thread in the system. .Pp .It Ic show Cm all ttys Show all TTYs within the system. Output is similar to .Xr pstat 8 , but also includes the address of the TTY structure. .\" .Pp .It Ic show Cm all vnets Show the same output as "show vnet" does, but for all virtualized network stacks within the system. .\" .Pp .It Ic show Cm allchains Show the same information as "show lockchain" does, but for every thread in the system. .\" .Pp .It Ic show Cm alllocks Show all locks that are currently held. This command is only available if .Xr witness 4 is included in the kernel. .\" .Pp .It Ic show Cm allpcpu The same as "show pcpu", but for every CPU present in the system. .\" .Pp .It Ic show Cm allrman Show information related to resource management, including interrupt request lines, DMA request lines, I/O ports, I/O memory addresses, and Resource IDs. .\" .Pp .It Ic show Cm apic Dump data about APIC IDT vector mappings. .\" .Pp .It Ic show Cm breaks Show breakpoints set with the "break" command. .\" .Pp .It Ic show Cm bio Ar addr Show information about the bio structure .Vt struct bio present at .Ar addr . See the .Pa sys/bio.h header file and .Xr g_bio 9 for more details on the exact meaning of the structure fields. .\" .Pp .It Ic show Cm buffer Ar addr Show information about the buf structure .Vt struct buf present at .Ar addr . See the .Pa sys/buf.h header file for more details on the exact meaning of the structure fields.
.\" .Pp .It Ic show Cm callout Ar addr Show information about the callout structure .Vt struct callout present at .Ar addr . .\" .Pp .It Ic show Cm cbstat Show brief information about the TTY subsystem. .\" .Pp .It Ic show Cm cdev Without argument, show the list of all created cdev's, consisting of devfs node name and struct cdev address. When address of cdev is supplied, show some internal devfs state of the cdev. .\" .Pp .It Ic show Cm conifhk Lists hooks currently waiting for completion in run_interrupt_driven_config_hooks(). .\" .Pp .It Ic show Cm cpusets Print numbered root and assigned CPU affinity sets. See .Xr cpuset 2 for more details. .\" .Pp .It Ic show Cm cyrixreg Show registers specific to the Cyrix processor. .\" .Pp .It Ic show Cm devmap Prints the contents of the static device mapping table. Currently only available on the ARM architecture. .\" .Pp .It Ic show Cm domain Ar addr Print protocol domain structure .Vt struct domain at address .Ar addr . See the .Pa sys/domain.h header file for more details on the exact meaning of the structure fields. .\" .Pp .It Ic show Cm ffs Op Ar addr Show brief information about ffs mount at the address .Ar addr , if argument is given. Otherwise, provides the summary about each ffs mount. .\" .Pp .It Ic show Cm file Ar addr Show information about the file structure .Vt struct file present at address .Ar addr . .\" .Pp .It Ic show Cm files Show information about every file structure in the system. .\" .Pp .It Ic show Cm freepages Show the number of physical pages in each of the free lists. .\" .Pp .It Ic show Cm geom Op Ar addr If the .Ar addr argument is not given, displays the entire GEOM topology. If .Ar addr is given, displays details about the given GEOM object (class, geom, provider or consumer). .\" .Pp .It Ic show Cm idt Show IDT layout. The first column specifies the IDT vector. The second one is the name of the interrupt/trap handler. Those functions are machine dependent. .\" .Pp .It Ic show Cm igi_list Ar addr Show information about the IGMP structure .Vt struct igmp_ifsoftc present at .Ar addr . .\" .Pp .It Ic show Cm inodedeps Op Ar addr Show brief information about each inodedep structure. If .Ar addr is given, only inodedeps belonging to the fs located at the supplied address are shown. .\" .Pp .It Ic show Cm inpcb Ar addr Show information on IP Control Block .Vt struct in_pcb present at .Ar addr . .\" .Pp .It Ic show Cm intr Dump information about interrupt handlers. .\" .Pp .It Ic show Cm intrcnt Dump the interrupt statistics. .\" .Pp .It Ic show Cm irqs Show interrupt lines and their respective kernel threads. .\" .Pp .It Ic show Cm jails Show the list of .Xr jail 8 instances. In addition to what .Xr jls 8 shows, also list kernel internal details. .\" .Pp .It Ic show Cm lapic Show information from the local APIC registers for this CPU. .\" .Pp .It Ic show Cm lock Ar addr Show lock structure. The output format is as follows: .Bl -tag -width "flags" .It Ic class: Class of the lock. Possible types include .Xr mutex 9 , .Xr rmlock 9 , .Xr rwlock 9 , .Xr sx 9 . .It Ic name: Name of the lock. .It Ic flags: Flags passed to the lock initialization function. For exact possibilities see manual pages of possible lock types. .It Ic state: Current state of a lock. As well as .Ic flags it's lock-specific. .It Ic owner: Lock owner. .El .\" .Pp .It Ic show Cm lockchain Ar addr Show all threads a particular thread at address .Ar addr is waiting on based on non-sleepable and non-spin locks. 
.\" .Pp .It Ic show Cm lockedbufs Show the same information as "show buf", but for every locked .Vt struct buf object. .\" .Pp .It Ic show Cm lockedvnods List all locked vnodes in the system. .\" .Pp .It Ic show Cm locks Prints all locks that are currently acquired. This command is only available if .Xr witness 4 is included in the kernel. .\" .Pp .It Ic show Cm locktree .\" .Pp .It Ic show Cm malloc Prints .Xr malloc 9 memory allocator statistics. The output format is as follows: .Pp .Bl -tag -compact -offset indent -width "Requests" .It Ic Type Specifies a type of memory. It is the same as a description string used while defining the given memory type with .Xr MALLOC_DECLARE 9 . .It Ic InUse Number of memory allocations of the given type, for which .Xr free 9 has not been called yet. .It Ic MemUse Total memory consumed by the given allocation type. .It Ic Requests Number of memory allocation requests for the given memory type. .El .Pp The same information can be gathered in userspace with .Dq Nm vmstat Fl m . .\" .Pp .It Ic show Cm map Ns Oo Li / Ns Cm f Oc Ar addr Prints the VM map at .Ar addr . If the .Cm f modifier is specified the complete map is printed. .\" .Pp .It Ic show Cm msgbuf Print the system's message buffer. It is the same output as in the .Dq Nm dmesg case. It is useful if you got a kernel panic, attached a serial cable to the machine and want to get the boot messages from before the system hang. .\" .It Ic show Cm mount Displays short info about all currently mounted file systems. .Pp .It Ic show Cm mount Ar addr Displays details about the given mount point. .\" .Pp .It Ic show Cm object Ns Oo Li / Ns Cm f Oc Ar addr Prints the VM object at .Ar addr . If the .Cm f option is specified the complete object is printed. .\" .Pp .It Ic show Cm panic Print the panic message if set. .\" .Pp .It Ic show Cm page Show statistics on VM pages. .\" .Pp .It Ic show Cm pageq Show statistics on VM page queues. .\" .Pp .It Ic show Cm pciregs Print PCI bus registers. The same information can be gathered in userspace by running .Dq Nm pciconf Fl lv . .\" .Pp .It Ic show Cm pcpu Print current processor state. The output format is as follows: .Pp .Bl -tag -compact -offset indent -width "spin locks held:" .It Ic cpuid Processor identifier. .It Ic curthread Thread pointer, process identifier and the name of the process. .It Ic curpcb Control block pointer. .It Ic fpcurthread FPU thread pointer. .It Ic idlethread Idle thread pointer. .It Ic APIC ID CPU identifier coming from APIC. .It Ic currentldt LDT pointer. .It Ic spin locks held Names of spin locks held. .El .\" .Pp .It Ic show Cm pgrpdump Dump process groups present within the system. .\" .Pp .It Ic show Cm proc Op Ar addr If no .Op Ar addr is specified, print information about the current process. Otherwise, show information about the process at address .Ar addr . .\" .Pp .It Ic show Cm procvm Show process virtual memory layout. .\" .Pp .It Ic show Cm protosw Ar addr Print protocol switch structure .Vt struct protosw at address .Ar addr . .\" .Pp .It Ic show Cm registers Ns Op Li / Ns Cm u Display the register set. If the .Cm u modifier is specified, it displays user registers instead of kernel registers or the currently saved one. .Pp .Sy Warning : The support of the .Cm u modifier depends on the machine. If not supported, incorrect information will be displayed. .\" .Pp .It Ic show Cm rman Ar addr Show resource manager object .Vt struct rman at address .Ar addr . 
Addresses of particular pointers can be gathered with the "show allrman" command. .\" .Pp .It Ic show Cm rtc Show the real time clock value. Useful for long debugging sessions. .\" .Pp .It Ic show Cm sleepchain Show all the threads a particular thread is waiting on based on sleepable locks. .\" .Pp .It Ic show Cm sleepq .It Ic show Cm sleepqueue Both commands provide the same functionality. They show the sleepqueue .Vt struct sleepqueue structure. Sleepqueues are used within the .Fx kernel to implement sleepable synchronization primitives (a thread holding a lock might sleep or be context switched), which at the time of writing are: .Xr condvar 9 , .Xr sx 9 and the standard .Xr msleep 9 interface. .\" .Pp .It Ic show Cm sockbuf Ar addr .It Ic show Cm socket Ar addr These commands print .Vt struct sockbuf and .Vt struct socket objects placed at .Ar addr . Output consists of all values present in the structures mentioned. For exact interpretation and more details, see the .Pa sys/socket.h header file. .\" .Pp .It Ic show Cm sysregs Show system registers (e.g., .Li cr0-4 on i386). Not present on some platforms. .\" .Pp .It Ic show Cm tcpcb Ar addr Print TCP control block .Vt struct tcpcb lying at address .Ar addr . For exact interpretation of the output, see the .Pa netinet/tcp.h header file. .\" .Pp .It Ic show Cm thread Op Ar addr If no .Ar addr is specified, show detailed information about the current thread. Otherwise, information about the thread at .Ar addr is printed. .\" .Pp .It Ic show Cm threads Show all threads within the system. Output format is as follows: .Pp .Bl -tag -compact -offset indent -width "Second column" .It Ic First column Thread identifier (TID) .It Ic Second column Thread structure address .It Ic Third column Backtrace. .El .\" .Pp .It Ic show Cm tty Ar addr Display the contents of a TTY structure in a readable form. .\" .Pp .It Ic show Cm turnstile Ar addr Show turnstile .Vt struct turnstile structure at address .Ar addr . Turnstiles are structures used within the .Fx kernel to implement synchronization primitives which, while holding a specific type of lock, cannot sleep or context switch to another thread. Currently, those are: .Xr mutex 9 , .Xr rwlock 9 , .Xr rmlock 9 . .\" .Pp .It Ic show Cm uma Show UMA allocator statistics. Output consists of five columns: .Pp .Bl -tag -compact -offset indent -width "Requests" .It Cm "Zone" Name of the UMA zone. The same string that was passed to .Xr uma_zcreate 9 as the first argument. .It Cm "Size" Size of a given memory object (slab). .It Cm "Used" Number of slabs being currently used. .It Cm "Free" Number of free slabs within the UMA zone. .It Cm "Requests" Number of allocation requests to the given zone. .El .Pp The same information can be gathered in userspace with .Dq Nm vmstat Fl z . .\" .Pp .It Ic show Cm unpcb Ar addr Shows UNIX domain socket private control block .Vt struct unpcb present at the address .Ar addr . .\" .Pp .It Ic show Cm vmochk Prints whether the internal VM objects are in a map somewhere and none have zero ref counts. .\" .Pp .It Ic show Cm vmopag This is supposed to show physical addresses consumed by a VM object. Currently, it is not possible to use this command when .Xr witness 4 is compiled in the kernel. .\" .Pp .It Ic show Cm vnet Ar addr Prints virtualized network stack .Vt struct vnet structure present at the address .Ar addr . .\" .Pp .It Ic show Cm vnode Op Ar addr Prints vnode .Vt struct vnode structure lying at .Ar addr .
For the exact interpretation of the output, look at the .Pa sys/vnode.h header file. .\" .Pp .It Ic show Cm vnodebufs Ar addr Shows clean/dirty buffer lists of the vnode located at .Ar addr . .\" .Pp .It Ic show Cm watches Displays all watchpoints set with the "watch" command. .\" .Pp .It Ic show Cm witness Shows information about lock acquisition coming from the .Xr witness 4 subsystem. .\" .Pp .It Ic gdb Toggles between remote GDB and DDB mode. In remote GDB mode, another machine is required that runs .Xr gdb 1 using the remote debug feature, with a connection to the serial console port on the target machine. Currently only available on the i386 architecture. .Pp .It Ic halt Halt the system. .Pp .It Ic kill Ar sig pid Send signal .Ar sig to process .Ar pid . The signal is acted on upon returning from the debugger. This command can be used to kill a process causing resource contention in the case of a hung system. See .Xr signal 3 for a list of signals. Note that the arguments are reversed relative to .Xr kill 2 . .Pp .It Ic reboot Op Ar seconds .It Ic reset Op Ar seconds Hard reset the system. If the optional argument .Ar seconds is given, the debugger will wait for this long, at most a week, before rebooting. .Pp .It Ic help Print a short summary of the available commands and command abbreviations. .Pp .It Ic capture on .It Ic capture off .It Ic capture reset .It Ic capture status .Nm supports a basic output capture facility, which can be used to retrieve the results of debugging commands from userspace using .Xr sysctl 3 . .Ic capture on enables output capture; .Ic capture off disables capture. .Ic capture reset will clear the capture buffer and disable capture. .Ic capture status will report current buffer use, buffer size, and disposition of output capture. .Pp Userspace processes may inspect and manage .Nm capture state using .Xr sysctl 8 : .Pp .Dv debug.ddb.capture.bufsize may be used to query or set the current capture buffer size. .Pp .Dv debug.ddb.capture.maxbufsize may be used to query the compile-time limit on the capture buffer size. .Pp .Dv debug.ddb.capture.bytes may be used to query the number of bytes of output currently in the capture buffer. .Pp .Dv debug.ddb.capture.data returns the contents of the buffer as a string to an appropriately privileged process. .Pp This facility is particularly useful in concert with the scripting and .Xr textdump 4 facilities, allowing scripted debugging output to be captured and committed to disk as part of a textdump for later analysis. The contents of the capture buffer may also be inspected in a kernel core dump using .Xr kgdb 1 . .Pp .It Ic run .It Ic script .It Ic scripts .It Ic unscript Run, define, list, and delete scripts. See the .Sx SCRIPTING section for more information on the scripting facility. .Pp .It Ic textdump dump .It Ic textdump set .It Ic textdump status .It Ic textdump unset Use the .Ic textdump dump command to immediately perform a textdump. More information may be found in .Xr textdump 4 . The .Ic textdump set command may be used to force the next kernel core dump to be a textdump rather than a traditional memory dump or minidump. .Ic textdump status reports whether a textdump has been scheduled. .Ic textdump unset cancels a request to perform a textdump as the next kernel core dump. .El .Sh VARIABLES The debugger accesses registers and variables as .Li $ Ns Ar name . Register names are as in the .Dq Ic show Cm registers command.
Some variables are suffixed with numbers, and may have some modifier following a colon immediately after the variable name. For example, register variables can have a .Cm u modifier to indicate a user register (e.g., .Dq Li $eax:u ) . .Pp Built-in variables currently supported are: .Pp .Bl -tag -width ".Va tabstops" -compact .It Va radix Input and output radix. .It Va maxoff Addresses are printed as .Dq Ar symbol Ns Li + Ns Ar offset unless .Ar offset is greater than .Va maxoff . .It Va maxwidth The width of the displayed line. .It Va lines The number of lines. It is used by the built-in pager. .It Va tabstops Tab stop width. .It Va work Ns Ar xx Work variable; .Ar xx can take values from 0 to 31. .El .Sh EXPRESSIONS Most expression operators in C are supported except .Ql ~ , .Ql ^ , and unary .Ql & . Special rules in .Nm are: .Bl -tag -width ".No Identifiers" .It Identifiers The name of a symbol is translated to the value of the symbol, which is the address of the corresponding object. .Ql \&. and .Ql \&: can be used in the identifier. If supported by an object format dependent routine, .Sm off .Oo Ar filename : Oc Ar func : lineno , .Sm on .Oo Ar filename : Oc Ns Ar variable , and .Oo Ar filename : Oc Ns Ar lineno can be accepted as a symbol. .It Numbers Radix is determined by the first two letters: .Ql 0x : hex, .Ql 0o : octal, .Ql 0t : decimal; otherwise, the current radix is used. .It Li \&. .Va dot .It Li + .Va next .It Li .. address of the start of the last line examined. Unlike .Va dot or .Va next , this is only changed by the .Ic examine or .Ic write commands. .It Li ' last address explicitly specified. .It Li $ Ns Ar variable Translated to the value of the specified variable. It may be followed by a .Ql \&: and modifiers as described above. .It Ar a Ns Li # Ns Ar b A binary operator which rounds up the left hand side to the next multiple of the right hand side. For example, .Ql 0x1234#0x200 evaluates to .Ql 0x1400 . .It Li * Ns Ar expr Indirection. It may be followed by a .Ql \&: and modifiers as described above. .El .Sh SCRIPTING .Nm supports a basic scripting facility to allow automating tasks or responses to specific events. Each script consists of a list of DDB commands to be executed sequentially, and is assigned a unique name. Certain script names have special meaning, and will be automatically run on various .Nm events if scripts by those names have been defined. .Pp The .Ic script command may be used to define a script by name. Scripts consist of a series of .Nm commands separated with the .Ql \&; character. For example: .Bd -literal -offset indent script kdb.enter.panic=bt; show pcpu script lockinfo=show alllocks; show lockedvnods .Ed .Pp The .Ic scripts command lists currently defined scripts. .Pp The .Ic run command executes a script by name. For example: .Bd -literal -offset indent run lockinfo .Ed .Pp The .Ic unscript command may be used to delete a script by name. For example: .Bd -literal -offset indent unscript kdb.enter.panic .Ed .Pp These functions may also be performed from userspace using the .Xr ddb 8 command. .Pp Certain scripts are run automatically, if defined, for specific .Nm events. The following scripts are run when various events occur: .Bl -tag -width kdb.enter.powerfail .It Dv kdb.enter.acpi The kernel debugger was entered as a result of an .Xr acpi 4 event. .It Dv kdb.enter.bootflags The kernel debugger was entered at boot as a result of the debugger boot flag being set. .It Dv kdb.enter.break The kernel debugger was entered as a result of a serial or console break.
.It Dv kdb.enter.cam The kernel debugger was entered as a result of a .Xr CAM 4 event. .It Dv kdb.enter.mac The kernel debugger was entered as a result of an assertion failure in the .Xr mac_test 4 module of the TrustedBSD MAC Framework. .It Dv kdb.enter.ndis The kernel debugger was entered as a result of an .Xr ndis 4 breakpoint event. .It Dv kdb.enter.netgraph The kernel debugger was entered as a result of a .Xr netgraph 4 event. .It Dv kdb.enter.panic .Xr panic 9 was called. .It Dv kdb.enter.powerfail The kernel debugger was entered as a result of a powerfail NMI on the sparc64 platform. .It Dv kdb.enter.powerpc The kernel debugger was entered as a result of an unimplemented interrupt type on the powerpc platform. .It Dv kdb.enter.sysctl The kernel debugger was entered as a result of the .Dv debug.kdb.enter sysctl being set. .It Dv kdb.enter.trapsig The kernel debugger was entered as a result of a trapsig event on the sparc64 platform. .It Dv kdb.enter.unionfs The kernel debugger was entered as a result of an assertion failure in the union file system. .It Dv kdb.enter.unknown The kernel debugger was entered, but no reason has been set. .It Dv kdb.enter.vfslock The kernel debugger was entered as a result of a VFS lock violation. .It Dv kdb.enter.watchdog The kernel debugger was entered as a result of a watchdog firing. .It Dv kdb.enter.witness The kernel debugger was entered as a result of a .Xr witness 4 violation. .El .Pp In the event that none of these scripts is found, .Nm will attempt to execute a default script: .Bl -tag -width kdb.enter.powerfail .It Dv kdb.enter.default The kernel debugger was entered, but a script exactly matching the reason for entering was not defined. This can be used as a catch-all to handle cases not specifically of interest; for example, .Dv kdb.enter.witness might be defined to have special handling, and .Dv kdb.enter.default might be defined to simply panic and reboot. .El .Sh HINTS On machines with an ISA expansion bus, a simple NMI generation card can be constructed by connecting a push button between the A01 and B01 (CHCHK# and GND) card fingers. Momentarily shorting these two fingers together may cause the bridge chipset to generate an NMI, which causes the kernel to pass control to .Nm . Some bridge chipsets do not generate an NMI on CHCHK#, so your mileage may vary. The NMI allows one to break into the debugger on a wedged machine to diagnose problems. Bridge chipsets for other buses may be able to generate an NMI using bus-specific methods. There are many PCI and PCIe add-in cards which can generate an NMI for debugging. Modern server systems typically use IPMI to generate signals to enter the debugger. The .Dv devel/ipmitool port can be used to send the .Cd chassis power diag command, which delivers an NMI to the processor. Embedded systems often use JTAG for debugging, but rarely use it in combination with .Nm . .Pp For serial consoles, you can enter the debugger by sending a BREAK condition on the serial line if .Cd options BREAK_TO_DEBUGGER is specified in the kernel. Most terminal emulation programs can send a break sequence with a special key sequence or via a menu item. However, in some setups, sending the break can be difficult to arrange or happens spuriously, so if the kernel contains .Cd options ALT_BREAK_TO_DEBUGGER then the sequence of CR TILDE CTRL-B enters the debugger; CR TILDE CTRL-P causes a panic instead of entering the debugger; and CR TILDE CTRL-R causes an immediate reboot.
In all the above sequences, CR is a Carriage Return and is usually sent by hitting the Enter or Return key. TILDE is the ASCII tilde character (~). CTRL-x is Control x, produced by holding down the Control key while typing x and then releasing both. .Pp The break-to-debugger behavior may be enabled at run-time by setting the .Xr sysctl 8 .Dv debug.kdb.break_to_debugger to 1. The alternate break-to-debugger sequence may be enabled at run-time by setting the .Xr sysctl 8 .Dv debug.kdb.alt_break_to_debugger to 1. The debugger may be entered by setting the .Xr sysctl 8 .Dv debug.kdb.enter to 1. .Sh FILES Header files mentioned in this manual page can be found below the .Pa /usr/include directory. .Pp .Bl -dash -compact .It .Pa sys/buf.h .It .Pa sys/domain.h .It .Pa netinet/in_pcb.h .It .Pa sys/socket.h .It .Pa sys/vnode.h .El .Sh SEE ALSO .Xr gdb 1 , .Xr kgdb 1 , .Xr acpi 4 , .Xr CAM 4 , .Xr mac_test 4 , .Xr ndis 4 , .Xr netgraph 4 , .Xr textdump 4 , .Xr witness 4 , .Xr ddb 8 , .Xr sysctl 8 , .Xr panic 9 .Sh HISTORY The .Nm debugger was developed for Mach, and ported to .Bx 386 0.1 . This manual page was translated from .Xr man 7 macros by .An Garrett Wollman . .Pp .An Robert N. M. Watson added support for .Nm output capture, .Xr textdump 4 and scripting in .Fx 7.1 . Index: projects/clang390-import/share/man/man4/pci.4 =================================================================== --- projects/clang390-import/share/man/man4/pci.4 (revision 305686) +++ projects/clang390-import/share/man/man4/pci.4 (revision 305687) @@ -1,342 +1,525 @@ .\" .\" Copyright (c) 1999 Kenneth D. Merry. .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. The name of the author may not be used to endorse or promote products .\" derived from this software without specific prior written permission. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE.
.\" .\" $FreeBSD$ .\" -.Dd August 9, 2016 +.Dd September 8, 2016 .Dt PCI 4 .Os .Sh NAME .Nm pci -.Nd generic PCI driver +.Nd generic PCI bus driver .Sh SYNOPSIS +To compile the PCI bus driver into the kernel, +place the following line in your +kernel configuration file: +.Bd -ragged -offset indent .Cd device pci +.Ed +.Pp +To compile in support for Single Root I/O Virtualization +.Pq SR-IOV : +.Bd -ragged -offset indent +.Cd options PCI_IOV +.Ed +.Pp +To compile in support for native PCI-express HotPlug: +.Bd -ragged -offset indent +.Cd options PCI_HP +.Ed .Sh DESCRIPTION The .Nm -driver provides a way for userland programs to read and write +driver provides support for .Tn PCI +devices in the kernel and limited access to +.Tn PCI +devices for userland. +.Pp +The +.Nm +driver provides a +.Pa /dev/pci +character device that can be used by userland programs to read and write +.Tn PCI configuration registers. -It also provides a way for userland programs to get a list of all +Programs can also use this device to get a list of all .Tn PCI devices, or all .Tn PCI devices that match various patterns. .Pp Since the .Nm driver provides a write interface for .Tn PCI configuration registers, system administrators should exercise caution when granting access to the .Nm device. If used improperly, this driver can allow userland applications to crash a machine or cause data loss. .Pp The .Nm driver implements the .Tn PCI bus in the kernel. It enumerates any devices on the .Tn PCI bus and gives .Tn PCI client drivers the chance to attach to them. It assigns resources to children, when the BIOS does not. It takes care of routing interrupts when necessary. It reprobes the unattached .Tn PCI children when .Tn PCI client drivers are dynamically loaded at runtime. -.Sh KERNEL CONFIGURATION The .Nm -device is included in the kernel as described in the SYNOPSIS section. -The -.Nm -driver cannot be built as a -.Xr kld 4 . +driver also includes support for PCI-PCI bridges, +various platform-specific Host-PCI bridges, +and basic support for +.Tn PCI +VGA adapters. .Sh IOCTLS The following .Xr ioctl 2 calls are supported by the .Nm driver. They are defined in the header file .In sys/pciio.h . .Bl -tag -width 012345678901234 .It PCIOCGETCONF This .Xr ioctl 2 takes a .Va pci_conf_io structure. It allows the user to retrieve information on all .Tn PCI devices in the system, or on .Tn PCI devices matching patterns supplied by the user. The call may set .Va errno to any value specified in either .Xr copyin 9 or .Xr copyout 9 . The .Va pci_conf_io structure consists of a number of fields: .Bl -tag -width match_buf_len .It pat_buf_len The length, in bytes, of the buffer filled with user-supplied patterns. .It num_patterns The number of user-supplied patterns. .It patterns Pointer to a buffer filled with user-supplied patterns. .Va patterns is a pointer to .Va num_patterns .Va pci_match_conf structures. The .Va pci_match_conf structure consists of the following elements: .Bl -tag -width pd_vendor .It pc_sel .Tn PCI domain, bus, slot and function. .It pd_name .Tn PCI device driver name. .It pd_unit .Tn PCI device driver unit number. .It pc_vendor .Tn PCI vendor ID. .It pc_device .Tn PCI device ID. .It pc_class .Tn PCI device class. .It flags The flags describe which of the fields the kernel should match against. A device must match all specified fields in order to be returned. The match flags are enumerated in the .Va pci_getconf_flags structure. 
Hopefully the flag values are obvious enough that they do not need to be described in detail. .El .It match_buf_len Length of the .Va matches buffer allocated by the user to hold the results of the .Dv PCIOCGETCONF query. .It num_matches Number of matches returned by the kernel. .It matches Buffer containing matching devices returned by the kernel. The items in this buffer are of type .Va pci_conf , which consists of the following items: .Bl -tag -width pc_subvendor .It pc_sel .Tn PCI domain, bus, slot and function. .It pc_hdr .Tn PCI header type. .It pc_subvendor .Tn PCI subvendor ID. .It pc_subdevice .Tn PCI subdevice ID. .It pc_vendor .Tn PCI vendor ID. .It pc_device .Tn PCI device ID. .It pc_class .Tn PCI device class. .It pc_subclass .Tn PCI device subclass. .It pc_progif .Tn PCI device programming interface. .It pc_revid .Tn PCI revision ID. .It pd_name Driver name. .It pd_unit Driver unit number. .El .It offset The offset is passed in by the user to tell the kernel where it should start traversing the device list. The value passed out by the kernel points to the record immediately after the last one returned. The user may pass the value returned by the kernel in subsequent calls to the .Dv PCIOCGETCONF ioctl. If the user does not intend to use the offset, it must be set to zero. .It generation .Tn PCI configuration generation. This value only needs to be set if the offset is set. The kernel will compare the current generation number of its internal device list to the generation passed in by the user to determine whether its device list has changed since the user last called the .Dv PCIOCGETCONF ioctl. If the device list has changed, a status of .Va PCI_GETCONF_LIST_CHANGED will be passed back. .It status The status tells the user the disposition of his request for a device list. The possible status values are: .Bl -ohang .It PCI_GETCONF_LAST_DEVICE This means that there are no more devices in the PCI device list matching the specified criteria after the ones returned in the .Va matches buffer. .It PCI_GETCONF_LIST_CHANGED This status tells the user that the .Tn PCI device list has changed since his last call to the .Dv PCIOCGETCONF ioctl and he must reset the .Va offset and .Va generation to zero to start over at the beginning of the list. .It PCI_GETCONF_MORE_DEVS This tells the user that his buffer was not large enough to hold all of the remaining devices in the device list that match his criteria. .It PCI_GETCONF_ERROR This indicates a general error while servicing the user's request. If the .Va pat_buf_len is not equal to .Va num_patterns times .Fn sizeof "struct pci_match_conf" , .Va errno will be set to .Er EINVAL . .El .El .It PCIOCREAD This .Xr ioctl 2 reads the .Tn PCI configuration registers specified by the passed-in .Va pci_io structure. The .Va pci_io structure consists of the following fields: .Bl -tag -width pi_width .It pi_sel A .Va pcisel structure which specifies the domain, bus, slot and function the user would like to query. If the specific bus is not found, errno will be set to ENODEV and -1 returned from the ioctl. .It pi_reg The .Tn PCI configuration register the user would like to access. .It pi_width The width, in bytes, of the data the user would like to read. This value may be either 1, 2, or 4. 3-byte reads and reads larger than 4 bytes are not supported. If an invalid width is passed, errno will be set to EINVAL. .It pi_data The data returned by the kernel.
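.Pp For example, the following sketch (an illustrative example, not part of the interface definition; the device address and register offset are arbitrary) uses the .Va pci_io fields described above with the .Dv PCIOCREAD ioctl to read a 32-bit configuration register of the device at domain 0, bus 0, slot 0, function 0: .Bd -literal -offset indent
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/pciio.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>

int
main(void)
{
	struct pci_io io;
	int fd;

	/* Open the pci driver's character device. */
	fd = open("/dev/pci", O_RDONLY);
	if (fd < 0)
		err(1, "open(/dev/pci)");

	/* Select the device and register to read. */
	io.pi_sel.pc_domain = 0;
	io.pi_sel.pc_bus = 0;
	io.pi_sel.pc_dev = 0;
	io.pi_sel.pc_func = 0;
	io.pi_reg = 0;		/* vendor/device ID register */
	io.pi_width = 4;	/* may be 1, 2, or 4 bytes */

	if (ioctl(fd, PCIOCREAD, &io) < 0)
		err(1, "PCIOCREAD");
	printf("register 0x%x: 0x%08x\en", io.pi_reg, io.pi_data);
	return (0);
}
.Ed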
.El .It PCIOCWRITE This .Xr ioctl 2 allows users to write to the .Tn PCI configuration registers specified in the passed-in .Va pci_io structure. The .Va pci_io structure is described above. The limitations on data width described for reading registers, above, also apply to writing .Tn PCI configuration registers. +.El +.Sh LOADER TUNABLES +Tunables can be set at the +.Xr loader 8 +prompt before booting the kernel, or stored in +.Xr loader.conf 5 . +The current value of these tunables can be examined at runtime via +.Xr sysctl 8 +nodes of the same name. +Unless otherwise specified, +each of these tunables is a boolean that can be enabled by setting the +tunable to a non-zero value. +.Bl -tag -width indent +.It Va hw.pci.clear_bars Pq Defaults to 0 +Ignore any firmware-assigned memory and I/O port resources. +This forces the +.Tn PCI +bus driver to allocate resource ranges for memory and I/O port resources +from scratch. +.It Va hw.pci.clear_buses Pq Defaults to 0 +Ignore any firmware-assigned bus number registers in PCI-PCI bridges. +This forces the +.Tn PCI +bus driver and PCI-PCI bridge driver to allocate bus numbers for secondary +buses behind PCI-PCI bridges. +.It Va hw.pci.clear_pcib Pq Defaults to 0 +Ignore any firmware-assigned memory and I/O port resource windows in PCI-PCI +bridges. +This forces the PCI-PCI bridge driver to allocate memory and I/O port resources +for resource windows from scratch. +.Pp +By default the PCI-PCI bridge driver will allocate windows that +contain the firmware-assigned resources of devices behind the bridge. +In addition, the PCI-PCI bridge driver will suballocate from existing window +regions when possible to satisfy a resource request. +As a result, +both +.Va hw.pci.clear_bars +and +.Va hw.pci.clear_pcib +must be enabled to fully ignore firmware-supplied resource assignments. +.It Va hw.pci.default_vgapci_unit Pq Defaults to -1 +By default, +the first +.Tn PCI +VGA adapter encountered by the system is assumed to be the boot display device. +This tunable can be set to choose a specific VGA adapter by specifying the +unit number of the associated +.Va vgapci Ns Ar X +device. +.It Va hw.pci.do_power_nodriver Pq Defaults to 0 +Place devices into a low power state +.Pq D3 +when a suitable device driver is not found. +Can be set to one of the following values: +.Bl -tag -width indent +.It 3 +Powers down all +.Tn PCI +devices without a device driver. +.It 2 +Powers down most devices without a device driver. +PCI devices with the display, memory, and base peripheral device classes +are not powered down. +.It 1 +Similar to a setting of 2 except that storage controllers are also not +powered down. +.It 0 +All devices are left fully powered. +.El +.Pp +A +.Tn PCI +device must support power management to be powered down. +Placing a device into a low power state may not reduce power consumption. +.It Va hw.pci.do_power_resume Pq Defaults to 1 +Place +.Tn PCI +devices into the fully powered state when resuming either the system or an +individual device. +Setting this to zero is discouraged as the system will not attempt to power +up non-powered PCI devices after a suspend. +.It Va hw.pci.do_power_suspend Pq Defaults to 1 +Place +.Tn PCI +devices into a low power state when suspending either the system or individual +devices. +Normally the D3 state is used as the low power state, +but firmware may override the desired power state during a system suspend. +.It Va hw.pci.enable_ari Pq Defaults to 1 +Enable support for PCI-express Alternative RID Interpretation.
+This is often used in conjunction with SR-IOV. +.It Va hw.pci.enable_io_modes Pq Defaults to 1 +Enable memory or I/O port decoding in a PCI device's command register if it has +firmware-assigned memory or I/O port resources. +The firmware +.Pq BIOS +in some systems does not enable memory or I/O port decoding for some devices +even when it has assigned resources to the device. +This enables decoding for such resources during bus probe. +.It Va hw.pci.enable_msi Pq Defaults to 1 +Enable support for Message Signalled Interrupts +.Pq MSI . +MSI interrupts can be disabled by setting this tunable to 0. +.It Va hw.pci.enable_msix Pq Defaults to 1 +Enable support for extended Message Signalled Interrupts +.Pq MSI-X . +MSI-X interrupts can be disabled by setting this tunable to 0. +.It Va hw.pci.enable_pcie_hp Pq Defaults to 1 +Enable support for native PCI-express HotPlug. +.It Va hw.pci.honor_msi_blacklist Pq Defaults to 1 +MSI and MSI-X interrupts are disabled for certain chipsets known to have +broken MSI and MSI-X implementations when this tunable is set. +It can be set to zero to permit use of MSI and MSI-X interrupts if the +chipset match is a false positive. +.It Va hw.pci.iov_max_config Pq Defaults to 1MB +The maximum amount of memory permitted for the configuration parameters +used when creating Virtual Functions via SR-IOV. +This tunable can also be changed at runtime via +.Xr sysctl 8 . +.It Va hw.pci.realloc_bars Pq Defaults to 0 +Attempt to allocate a new resource range during the initial device scan +for any memory or I/O port resources with firmware-assigned ranges that +conflict with another active resource. +.It Va hw.pci.usb_early_takeover Pq Defaults to 1 on Tn amd64 and Tn i386 +Disable legacy device emulation of USB devices during the initial device +scan. +Set this tunable to zero to use USB devices via legacy emulation when +using a custom kernel without USB controller drivers. +.It Va hw.pci.<domain>.<bus>.<slot>.INT<pin>.irq +These tunables can be used to override the interrupt routing for legacy +PCI INTx interrupts. +Unlike other tunables in this list, +these do not have corresponding sysctl nodes. +The tunable name includes the address of the PCI device as well as the +pin of the desired INTx IRQ to override: +.Bl -tag -width indent +.It <domain> +The domain +.Pq or segment +of the PCI device in decimal. +.It <bus> +The bus address of the PCI device in decimal. +.It <slot> +The slot of the PCI device in decimal. +.It <pin>
+The interrupt pin of the PCI slot to override. +One of +.Ql A , +.Ql B , +.Ql C , +or +.Ql D . +.El +.Pp +The value of the tunable is the raw IRQ value to use for the INTx interrupt +pin identified by the tunable name. +Mapping of IRQ values to platform interrupt sources is machine dependent. .El .Sh FILES .Bl -tag -width /dev/pci -compact .It Pa /dev/pci Character device for the .Nm driver. .El .Sh SEE ALSO .Xr pciconf 8 .Sh HISTORY The .Nm driver (not the kernel's .Tn PCI support code) first appeared in .Fx 2.2 , and was written by Stefan Esser and Garrett Wollman. Support for device listing and matching was re-implemented by Kenneth Merry, and first appeared in .Fx 3.0 . .Sh AUTHORS .An Kenneth Merry Aq Mt ken@FreeBSD.org .Sh BUGS It is not possible for users to specify an accurate offset into the device list without calling the .Dv PCIOCGETCONF ioctl at least once, since they have no way of knowing the current generation number otherwise. This probably is not a serious problem, though, since users can easily narrow their search by specifying a pattern or patterns for the kernel to match against. Index: projects/clang390-import/share/mk/bsd.subdir.mk =================================================================== --- projects/clang390-import/share/mk/bsd.subdir.mk (revision 305686) +++ projects/clang390-import/share/mk/bsd.subdir.mk (revision 305687) @@ -1,193 +1,194 @@ # from: @(#)bsd.subdir.mk 5.9 (Berkeley) 2/1/91 # $FreeBSD$ # # The include file <bsd.subdir.mk> contains the default targets # for building subdirectories. # # For all of the directories listed in the variable SUBDIRS, the # specified directory will be visited and the target made. There is # also a default target which allows the command "make subdir" where # subdir is any directory listed in the variable SUBDIRS. # # # +++ variables +++ # # DISTRIBUTION Name of distribution. [base] # # SUBDIR A list of subdirectories that should be built as well. # Each of the targets will execute the same target in the # subdirectories. SUBDIR.yes is automatically appended # to this list. # # +++ targets +++ # # distribute: # This is a variant of install, which will # put the stuff into the right "distribution". # # See SUBDIR_TARGETS for list of targets that will recurse. # # Targets defined in STANDALONE_SUBDIR_TARGETS will always be run # with SUBDIR_PARALLEL and will not respect .WAIT or SUBDIR_DEPEND_ # values. # # SUBDIR_TARGETS and STANDALONE_SUBDIR_TARGETS can be appended to # via make.conf or src.conf. # .if !target(____) ____: SUBDIR_TARGETS+= \ all all-man analyze buildconfig buildfiles buildincludes \ checkdpadd clean cleandepend cleandir cleanilinks \ cleanobj depend distribute files includes installconfig \ installfiles installincludes print-dir realinstall lint \ maninstall manlint obj objlink tags \ # Described above. STANDALONE_SUBDIR_TARGETS+= \ all-man buildconfig buildfiles buildincludes check checkdpadd \ clean cleandepend cleandir cleanilinks cleanobj files includes \ installconfig installincludes installfiles print-dir \ maninstall manlint obj objlink # It is safe to install in parallel when staging. .if defined(NO_ROOT) STANDALONE_SUBDIR_TARGETS+= realinstall .endif .include <bsd.init.mk> .if make(print-dir) NEED_SUBDIR= 1 ECHODIR= : .SILENT: .if ${RELDIR:U.} != "."
print-dir: .PHONY @echo ${RELDIR} .endif .endif .if !defined(NEED_SUBDIR) .if ${.MAKE.LEVEL} == 0 && ${MK_DIRDEPS_BUILD} == "yes" && !empty(SUBDIR) && !(make(clean*) || make(destroy*)) .include <meta.subdir.mk> # ignore this _SUBDIR: .endif .endif DISTRIBUTION?= base .if !target(distribute) distribute: .MAKE .for dist in ${DISTRIBUTION} ${_+_}cd ${.CURDIR}; \ ${MAKE} install installconfig -DNO_SUBDIR DESTDIR=${DISTDIR}/${dist} SHARED=copies .endfor .endif # Convenience targets to run 'build${target}' and 'install${target}' when # calling 'make ${target}'. .for __target in files includes .if !target(${__target}) ${__target}: build${__target} install${__target} .ORDER: build${__target} install${__target} .endif .endfor # Make 'install' supports a before and after target. Actual install # hooks are placed in 'realinstall'. .if !target(install) .for __stage in before real after .if !target(${__stage}install) ${__stage}install: .endif .endfor install: beforeinstall realinstall afterinstall .ORDER: beforeinstall realinstall afterinstall .endif .ORDER: all install # SUBDIR recursing may be disabled for MK_DIRDEPS_BUILD .if !target(_SUBDIR) .if defined(SUBDIR) SUBDIR:=${SUBDIR} ${SUBDIR.yes} SUBDIR:=${SUBDIR:u} .endif # Subdir code shared among 'make <subdir>', 'make <target>' and SUBDIR_PARALLEL. _SUBDIR_SH= \ if test -d ${.CURDIR}/$${dir}.${MACHINE_ARCH}; then \ dir=$${dir}.${MACHINE_ARCH}; \ fi; \ ${ECHODIR} "===> ${DIRPRFX}$${dir} ($${target})"; \ cd ${.CURDIR}/$${dir}; \ ${MAKE} $${target} DIRPRFX=${DIRPRFX}$${dir}/ # This is kept for compatibility only. The normal handling of attaching to # SUBDIR_TARGETS will create a target for each directory. _SUBDIR: .USEBEFORE .if defined(SUBDIR) && !empty(SUBDIR) && !defined(NO_SUBDIR) @${_+_}target=${.TARGET:realinstall=install}; \ for dir in ${SUBDIR:N.WAIT}; do ( ${_SUBDIR_SH} ); done .endif # Create 'make subdir' targets to run the real 'all' target. .for __dir in ${SUBDIR:N.WAIT} ${__dir}: all_subdir_${DIRPRFX}${__dir} .PHONY .endfor .for __target in ${SUBDIR_TARGETS} # Can ordering be skipped for this and SUBDIR_PARALLEL forced? .if ${STANDALONE_SUBDIR_TARGETS:M${__target}} _is_standalone_target= 1 -SUBDIR:= ${SUBDIR:N.WAIT} +_subdir_filter= N.WAIT .else _is_standalone_target= 0 +_subdir_filter= .endif __subdir_targets= -.for __dir in ${SUBDIR} +.for __dir in ${SUBDIR:${_subdir_filter}} .if ${__dir} == .WAIT __subdir_targets+= .WAIT .else __deps= .if ${_is_standalone_target} == 0 .if defined(SUBDIR_PARALLEL) # Apply SUBDIR_DEPEND dependencies for SUBDIR_PARALLEL. .for __dep in ${SUBDIR_DEPEND_${__dir}} __deps+= ${__target}_subdir_${DIRPRFX}${__dep} .endfor .else # For non-parallel builds, directories depend on all targets before them. __deps:= ${__subdir_targets} .endif # defined(SUBDIR_PARALLEL) .endif # ${_is_standalone_target} == 0 ${__target}_subdir_${DIRPRFX}${__dir}: .PHONY .MAKE .SILENT ${__deps} @${_+_}target=${__target:realinstall=install}; \ dir=${__dir}; \ ${_SUBDIR_SH}; __subdir_targets+= ${__target}_subdir_${DIRPRFX}${__dir} .endif # ${__dir} == .WAIT .endfor # __dir in ${SUBDIR} # Attach the subdir targets to the real target. # Only recurse on directly-called targets. I.e., don't recurse on dependencies # such as 'install' becoming {before,real,after}install, just recurse # 'install'. Despite that, 'realinstall' is special due to ordering issues # with 'afterinstall'.
.if !defined(NO_SUBDIR) && (make(${__target}) || \ (${__target} == realinstall && make(install))) ${__target}: ${__subdir_targets} .PHONY .endif # make(${__target}) .endfor # __target in ${SUBDIR_TARGETS} .endif # !target(_SUBDIR) # Ensure all targets exist .for __target in ${SUBDIR_TARGETS} .if !target(${__target}) ${__target}: .endif .endfor .endif Index: projects/clang390-import/share/mk/dirdeps.mk =================================================================== --- projects/clang390-import/share/mk/dirdeps.mk (revision 305686) +++ projects/clang390-import/share/mk/dirdeps.mk (revision 305687) @@ -1,712 +1,725 @@ # $FreeBSD$ -# $Id: dirdeps.mk,v 1.62 2016/03/16 00:11:53 sjg Exp $ +# $Id: dirdeps.mk,v 1.73 2016/08/15 19:28:13 sjg Exp $ # Copyright (c) 2010-2013, Juniper Networks, Inc. # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR # A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT # OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, # SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT # LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, # DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY # THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. # Much of the complexity here is for supporting cross-building. # If a tree does not support that, simply using plain Makefile.depend # should provide sufficient clue. # Otherwise the recommendation is to use Makefile.depend.${MACHINE} # as expected below. # Note: this file gets multiply included. # This is what we do with DIRDEPS # DIRDEPS: # This is a list of directories - relative to SRCTOP, it is # normally only of interest to .MAKE.LEVEL 0. # In some cases the entry may be qualified with a .<machine> # or .<target_spec> suffix (see TARGET_SPEC_VARS below), # for example to force building something for the pseudo # machines "host" or "common" regardless of current ${MACHINE}. # # All unqualified entries end up being qualified with .${TARGET_SPEC} # and partially qualified (if TARGET_SPEC_VARS has multiple # entries) are also expanded to a full .<target_spec>. # The _DIRDEP_USE target uses the suffix to set TARGET_SPEC # correctly when visiting each entry. # # The fully qualified directory entries are used to construct a # dependency graph that will drive the build later. # # Also, for each fully qualified directory target, we will search # using ${.MAKE.DEPENDFILE_PREFERENCE} to find additional # dependencies. We use Makefile.depend (default value for # .MAKE.DEPENDFILE_PREFIX) to refer to these makefiles to # distinguish them from others.
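#
# For example, a leaf directory's Makefile.depend typically just lists
# the directories it depends on and then includes dirdeps.mk
# (a hypothetical minimal sketch - real files list whichever
# directories the build actually visited):
#
#	DIRDEPS = \
#		include \
#		lib/libc \
#
#	.include <dirdeps.mk>
#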
# # Each Makefile.depend file sets DEP_RELDIR to be the # RELDIR (path relative to SRCTOP) for its directory, and # since each Makefile.depend file includes dirdeps.mk, this # processing is recursive and results in .MAKE.LEVEL 0 learning the # dependencies of the tree wrt the initial directory (_DEP_RELDIR). # # BUILD_AT_LEVEL0 # Indicates whether .MAKE.LEVEL 0 builds anything: # if "no" sub-makes are used to build everything, # if "yes" sub-makes are only used to build for other machines. # It is best to use "no", but this can require fixing some # makefiles to not do anything at .MAKE.LEVEL 0. # # TARGET_SPEC_VARS # The default value is just MACHINE, and for most environments # this is sufficient. The _DIRDEP_USE target actually sets # both MACHINE and TARGET_SPEC to the suffix of the current # target so that in the general case TARGET_SPEC can be ignored. # # If more than MACHINE is needed then sys.mk needs to decompose # TARGET_SPEC and set the relevant variables accordingly. # It is important that MACHINE be included in and actually be # the first member of TARGET_SPEC_VARS. This allows other # variables to be considered optional, and some of the treatment # below relies on MACHINE being the first entry. # Note: TARGET_SPEC cannot contain any '.'s so the target # triple used by compiler folk won't work (directly anyway). # # For example: # # # Always list MACHINE first, # # other variables might be optional. # TARGET_SPEC_VARS = MACHINE TARGET_OS # .if ${TARGET_SPEC:Uno:M*,*} != "" # _tspec := ${TARGET_SPEC:S/,/ /g} # MACHINE := ${_tspec:[1]} # TARGET_OS := ${_tspec:[2]} # # etc. # # We need to stop TARGET_SPEC from affecting any submakes # # and deal with MACHINE=${TARGET_SPEC} in the environment. # TARGET_SPEC = # # export but do not track # .export-env TARGET_SPEC # .export ${TARGET_SPEC_VARS} # .for v in ${TARGET_SPEC_VARS:O:u} # .if empty($v) # .undef $v # .endif # .endfor # .endif # # make sure we know what TARGET_SPEC is # # as we may need it to find Makefile.depend* # TARGET_SPEC = ${TARGET_SPEC_VARS:@v@${$v:U}@:ts,} # # touch this at your peril _DIRDEP_USE_LEVEL?= 0 .if ${.MAKE.LEVEL} == ${_DIRDEP_USE_LEVEL} # only the first instance is interested in all this -# First off, we want to know what ${MACHINE} to build for. -# This can be complicated if we are using a mixture of ${MACHINE} specific -# and non-specific Makefile.depend* - .if !target(_DIRDEP_USE) +# do some setup we only need once +_CURDIR ?= ${.CURDIR} +_OBJDIR ?= ${.OBJDIR} + +now_utc = ${%s:L:gmtime} +.if !defined(start_utc) +start_utc := ${now_utc} +.endif + .if ${MAKEFILE:T} == ${.PARSEFILE} && empty(DIRDEPS) && ${.TARGETS:Uall:M*/*} != "" # This little trick lets us do # # mk -f dirdeps.mk some/dir.${TARGET_SPEC} # all: ${.TARGETS:Nall}: all DIRDEPS := ${.TARGETS:M*[/.]*} # so that -DNO_DIRDEPS works DEP_RELDIR := ${DIRDEPS:[1]:R} # this will become DEP_MACHINE below TARGET_MACHINE := ${DIRDEPS:[1]:E:C/,.*//} .if ${TARGET_MACHINE:N*/*} == "" TARGET_MACHINE := ${MACHINE} .endif # disable DIRDEPS_CACHE as it does not like this trick MK_DIRDEPS_CACHE = no .endif # make sure we get the behavior we expect .MAKE.SAVE_DOLLARS = no -# do some setup we only need once -_CURDIR ?= ${.CURDIR} -_OBJDIR ?= ${.OBJDIR} - -now_utc = ${%s:L:gmtime} -.if !defined(start_utc) -start_utc := ${now_utc} -.endif - # make sure these are empty to start with _DEP_TARGET_SPEC = # If TARGET_SPEC_VARS is other than just MACHINE # it should be set by sys.mk or similar by now. # TARGET_SPEC must not contain any '.'s.
TARGET_SPEC_VARS ?= MACHINE # this is what we started with TARGET_SPEC = ${TARGET_SPEC_VARS:@v@${$v:U}@:ts,} # this is what we mostly use below DEP_TARGET_SPEC = ${TARGET_SPEC_VARS:S,^,DEP_,:@v@${$v:U}@:ts,} # make sure we have defaults .for v in ${TARGET_SPEC_VARS} DEP_$v ?= ${$v} .endfor .if ${TARGET_SPEC_VARS:[#]} > 1 # Ok, this gets more complex (putting it mildly). # In order to stay sane, we need to ensure that all the build_dirs # we compute below are fully qualified wrt DEP_TARGET_SPEC. # The makefiles may only partially specify (eg. MACHINE only), # so we need to construct a set of modifiers to fill in the gaps. # jot 10 should output 1 2 3 .. 10 JOT ?= jot _tspec_x := ${${JOT} ${TARGET_SPEC_VARS:[#]}:L:sh} # this handles unqualified entries M_dep_qual_fixes = C;(/[^/.,]+)$$;\1.$${DEP_TARGET_SPEC}; # there needs to be at least one item missing for these to make sense .for i in ${_tspec_x:[2..-1]} _tspec_m$i := ${TARGET_SPEC_VARS:[2..$i]:@w@[^,]+@:ts,} _tspec_a$i := ,${TARGET_SPEC_VARS:[$i..-1]:@v@$$$${DEP_$v}@:ts,} M_dep_qual_fixes += C;(\.${_tspec_m$i})$$;\1${_tspec_a$i}; .endfor .else # A harmless? default. M_dep_qual_fixes = U .endif .if !defined(.MAKE.DEPENDFILE_PREFERENCE) # .MAKE.DEPENDFILE_PREFERENCE makes the logic below neater? # you really want this set by sys.mk or similar .MAKE.DEPENDFILE_PREFERENCE = ${_CURDIR}/${.MAKE.DEPENDFILE:T} .if ${.MAKE.DEPENDFILE:E} == "${TARGET_SPEC}" .if ${TARGET_SPEC} != ${MACHINE} .MAKE.DEPENDFILE_PREFERENCE += ${_CURDIR}/${.MAKE.DEPENDFILE:T:R}.$${MACHINE} .endif .MAKE.DEPENDFILE_PREFERENCE += ${_CURDIR}/${.MAKE.DEPENDFILE:T:R} .endif .endif _default_dependfile := ${.MAKE.DEPENDFILE_PREFERENCE:[1]:T} _machine_dependfiles := ${.MAKE.DEPENDFILE_PREFERENCE:T:M*${MACHINE}*} # for machine specific dependfiles we require ${MACHINE} to be at the end # also for the sake of sanity we require a common prefix .if !defined(.MAKE.DEPENDFILE_PREFIX) # knowing .MAKE.DEPENDFILE_PREFIX helps .if !empty(_machine_dependfiles) .MAKE.DEPENDFILE_PREFIX := ${_machine_dependfiles:[1]:T:R} .else .MAKE.DEPENDFILE_PREFIX := ${_default_dependfile:T} .endif .endif # this is how we identify non-machine specific dependfiles N_notmachine := ${.MAKE.DEPENDFILE_PREFERENCE:E:N*${MACHINE}*:${M_ListToSkip}} .endif # !target(_DIRDEP_USE) +# First off, we want to know what ${MACHINE} to build for. +# This can be complicated if we are using a mixture of ${MACHINE} specific +# and non-specific Makefile.depend* + # if we were included recursively _DEP_TARGET_SPEC should be valid. .if empty(_DEP_TARGET_SPEC) # we may or may not have included a dependfile yet .if defined(.INCLUDEDFROMFILE) _last_dependfile := ${.INCLUDEDFROMFILE:M${.MAKE.DEPENDFILE_PREFIX}*} .else _last_dependfile := ${.MAKE.MAKEFILES:M*/${.MAKE.DEPENDFILE_PREFIX}*:[-1]} .endif .if ${_debug_reldir:U0} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: _last_dependfile='${_last_dependfile}' .endif .if empty(_last_dependfile) || ${_last_dependfile:E:${N_notmachine}} == "" # this is all we have to work with DEP_MACHINE = ${TARGET_MACHINE:U${MACHINE}} _DEP_TARGET_SPEC := ${DEP_TARGET_SPEC} .else _DEP_TARGET_SPEC = ${_last_dependfile:${M_dep_qual_fixes:ts:}:E} .endif .if !empty(_last_dependfile) # record that we've read dependfile for this _dirdeps_checked.${_CURDIR}.${TARGET_SPEC}: .endif .endif # by now _DEP_TARGET_SPEC should be set, parse it. 
.if ${TARGET_SPEC_VARS:[#]} > 1 # we need to parse this; DEP_MACHINE may or may not contain more info _tspec := ${_DEP_TARGET_SPEC:S/,/ /g} .for i in ${_tspec_x} DEP_${TARGET_SPEC_VARS:[$i]} := ${_tspec:[$i]} .endfor .for v in ${TARGET_SPEC_VARS:O:u} .if empty(DEP_$v) .undef DEP_$v .endif .endfor .else DEP_MACHINE := ${_DEP_TARGET_SPEC} .endif # reset each time through _build_all_dirs = # the first time we are included the _DIRDEP_USE target will not be defined # we can use this as a clue to do initialization and other one-time things. .if !target(_DIRDEP_USE) # make sure this target exists dirdeps: beforedirdeps .WAIT beforedirdeps: # We normally expect to be included by Makefile.depend.* # which sets the DEP_* macros below. DEP_RELDIR ?= ${RELDIR} # this can cause lots of output! # set to a set of glob expressions that might match RELDIR DEBUG_DIRDEPS ?= no # remember the initial value of DEP_RELDIR - we test for it below. _DEP_RELDIR := ${DEP_RELDIR} .endif # pickup customizations # as below you can use !target(_DIRDEP_USE) to protect things # which should only be done once. .-include <local.dirdeps.mk> .if !target(_DIRDEP_USE) # things we skip for host tools SKIP_HOSTDIR ?= NSkipHostDir = ${SKIP_HOSTDIR:N*.host*:S,$,.host*,:N.host*:S,^,${SRCTOP}/,:${M_ListToSkip}} # things we always skip # SKIP_DIRDEPS allows for adding entries on command line. SKIP_DIR += .host *.WAIT ${SKIP_DIRDEPS} SKIP_DIR.host += ${SKIP_HOSTDIR} DEP_SKIP_DIR = ${SKIP_DIR} \ ${SKIP_DIR.${DEP_TARGET_SPEC}:U} \ ${SKIP_DIR.${DEP_MACHINE}:U} \ ${SKIP_DIRDEPS.${DEP_MACHINE}:U} NSkipDir = ${DEP_SKIP_DIR:${M_ListToSkip}} .if defined(NODIRDEPS) || defined(WITHOUT_DIRDEPS) NO_DIRDEPS = .elif defined(WITHOUT_DIRDEPS_BELOW) NO_DIRDEPS_BELOW = .endif .if defined(NO_DIRDEPS) # confine ourselves to the original dir and below. DIRDEPS_FILTER += M${_DEP_RELDIR}* .elif defined(NO_DIRDEPS_BELOW) DIRDEPS_FILTER += M${_DEP_RELDIR} .endif # this is what we run below DIRDEP_MAKE?= ${.MAKE} # we suppress SUBDIR when visiting the leaves # we assume sys.mk will set MACHINE_ARCH # you can add extras to DIRDEP_USE_ENV # if there is no makefile in the target directory, we skip it. _DIRDEP_USE: .USE .MAKE @for m in ${.MAKE.MAKEFILE_PREFERENCE}; do \ test -s ${.TARGET:R}/$$m || continue; \ echo "${TRACER}Checking ${.TARGET:R} for ${.TARGET:E} ..."; \ MACHINE_ARCH= NO_SUBDIR=1 ${DIRDEP_USE_ENV} \ TARGET_SPEC=${.TARGET:E} \ MACHINE=${.TARGET:E} \ ${DIRDEP_MAKE} -C ${.TARGET:R} || exit 1; \ break; \ done .ifdef ALL_MACHINES # this is how you limit it to only the machines we have been built for # previously. .if empty(ONLY_MACHINE_LIST) .if !empty(ALL_MACHINE_LIST) # ALL_MACHINE_LIST is the list of all legal machines - ignore anything else _machine_list != cd ${_CURDIR} && 'ls' -1 ${ALL_MACHINE_LIST:O:u:@m@${.MAKE.DEPENDFILE:T:R}.$m@} 2> /dev/null; echo .else _machine_list != 'ls' -1 ${_CURDIR}/${.MAKE.DEPENDFILE_PREFIX}.* 2> /dev/null; echo .endif _only_machines := ${_machine_list:${NIgnoreFiles:UN*.bak}:E:O:u} .else _only_machines := ${ONLY_MACHINE_LIST} .endif .if empty(_only_machines) # we must be boot-strapping _only_machines := ${TARGET_MACHINE:U${ALL_MACHINE_LIST:U${DEP_MACHINE}}} .endif .else # ! ALL_MACHINES # if ONLY_MACHINE_LIST is set, we are limited to that # if TARGET_MACHINE is set - it is really the same as ONLY_MACHINE_LIST # otherwise DEP_MACHINE is it - so DEP_MACHINE will match.
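# e.g. with ONLY_MACHINE_LIST and TARGET_MACHINE both unset and
# DEP_MACHINE = amd64, the expression below reduces to just "amd64".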
_only_machines := ${ONLY_MACHINE_LIST:U${TARGET_MACHINE:U${DEP_MACHINE}}:M${DEP_MACHINE}} .endif .if !empty(NOT_MACHINE_LIST) _only_machines := ${_only_machines:${NOT_MACHINE_LIST:${M_ListToSkip}}} .endif # make sure we have a starting place? DIRDEPS ?= ${RELDIR} .endif # target # if repeatedly building the same target, # we can avoid the overhead of re-computing the tree dependencies. MK_DIRDEPS_CACHE ?= no BUILD_DIRDEPS_CACHE ?= no BUILD_DIRDEPS ?= yes .if !defined(NO_DIRDEPS) && !defined(NO_DIRDEPS_BELOW) .if ${MK_DIRDEPS_CACHE} == "yes" # this is where we will cache all our work -DIRDEPS_CACHE?= ${_OBJDIR}/dirdeps.cache${.TARGETS:Nall:O:u:ts-:S,/,_,g:S,^,.,:N.} +DIRDEPS_CACHE?= ${_OBJDIR:tA}/dirdeps.cache${.TARGETS:Nall:O:u:ts-:S,/,_,g:S,^,.,:N.} # just ensure this exists build-dirdeps: M_oneperline = @x@\\${.newline} $$x@ .if ${BUILD_DIRDEPS_CACHE} == "no" .if !target(dirdeps-cached) # we do this via sub-make BUILD_DIRDEPS = no +# ignore anything but these +.MAKE.META.IGNORE_FILTER = M*/${.MAKE.DEPENDFILE_PREFIX}* + dirdeps: dirdeps-cached dirdeps-cached: ${DIRDEPS_CACHE} .MAKE @echo "${TRACER}Using ${DIRDEPS_CACHE}" @MAKELEVEL=${.MAKE.LEVEL} ${.MAKE} -C ${_CURDIR} -f ${DIRDEPS_CACHE} \ dirdeps MK_DIRDEPS_CACHE=no BUILD_DIRDEPS=no # these should generally do BUILD_DIRDEPS_MAKEFILE ?= ${MAKEFILE} BUILD_DIRDEPS_TARGETS ?= ${.TARGETS} # we need the .meta file to ensure we update if # any of the Makefile.depend* changed. # We do not want to compare the command line though. ${DIRDEPS_CACHE}: .META .NOMETA_CMP +@{ echo '# Autogenerated - do NOT edit!'; echo; \ echo 'BUILD_DIRDEPS=no'; echo; \ echo '.include <dirdeps.mk>'; \ } > ${.TARGET}.new +@MAKELEVEL=${.MAKE.LEVEL} DIRDEPS_CACHE=${DIRDEPS_CACHE} \ DIRDEPS="${DIRDEPS}" \ MAKEFLAGS= ${.MAKE} -C ${_CURDIR} -f ${BUILD_DIRDEPS_MAKEFILE} \ ${BUILD_DIRDEPS_TARGETS} BUILD_DIRDEPS_CACHE=yes \ .MAKE.DEPENDFILE=.none \ ${.MAKEFLAGS:tW:S,-D ,-D,g:tw:M*WITH*} \ 3>&1 1>&2 | sed 's,${SRCTOP},$${SRCTOP},g' >> ${.TARGET}.new && \ mv ${.TARGET}.new ${.TARGET} .endif .elif !target(_count_dirdeps) # we want to capture the dirdeps count in the cache .END: _count_dirdeps _count_dirdeps: .NOMETA @echo '.info $${.newline}$${TRACER}Makefiles read: total=${.MAKE.MAKEFILES:[#]} depend=${.MAKE.MAKEFILES:M*depend*:[#]} dirdeps=${.ALLTARGETS:M${SRCTOP}*:O:u:[#]}' >&3 .endif .elif !make(dirdeps) && !target(_count_dirdeps) beforedirdeps: _count_dirdeps _count_dirdeps: .NOMETA @echo "${TRACER}Makefiles read: total=${.MAKE.MAKEFILES:[#]} depend=${.MAKE.MAKEFILES:M*depend*:[#]} dirdeps=${.ALLTARGETS:M${SRCTOP}*:O:u:[#]} seconds=`expr ${now_utc} - ${start_utc}`" .endif .endif .if ${BUILD_DIRDEPS} == "yes" .if ${DEBUG_DIRDEPS:@x@${DEP_RELDIR:M$x}${${DEP_RELDIR}.${DEP_MACHINE}:L:M$x}@} != "" _debug_reldir = 1 .else _debug_reldir = 0 .endif .if ${DEBUG_DIRDEPS:@x@${DEP_RELDIR:M$x}${${DEP_RELDIR}.depend:L:M$x}@} != "" _debug_search = 1 .else _debug_search = 0 .endif # the rest is done repeatedly for every Makefile.depend we read. # if we are anything but the original dir we care only about the # machine type we were included for. .if ${DEP_RELDIR} == "."
_this_dir := ${SRCTOP} .else _this_dir := ${SRCTOP}/${DEP_RELDIR} .endif # on rare occasions, there can be a need for extra help _dep_hack := ${_this_dir}/${.MAKE.DEPENDFILE_PREFIX}.inc .-include <${_dep_hack}> .if ${DEP_RELDIR} != ${_DEP_RELDIR} || ${DEP_TARGET_SPEC} != ${TARGET_SPEC} # this should be all _machines := ${DEP_MACHINE} .else # this is the machine list we actually use below _machines := ${_only_machines} .if defined(HOSTPROG) || ${DEP_MACHINE} == "host" # we need to build this guy's dependencies for host as well. _machines += host .endif _machines := ${_machines:O:u} .endif .if ${TARGET_SPEC_VARS:[#]} > 1 # we need to tweak _machines _dm := ${DEP_MACHINE} # apply the same filtering that we do when qualifying DIRDEPS. # M_dep_qual_fixes expects .${MACHINE}* so add (and remove) '.' _machines := ${_machines:@DEP_MACHINE@${DEP_TARGET_SPEC}@:S,^,.,:${M_dep_qual_fixes:ts:}:O:u:S,^.,,} DEP_MACHINE := ${_dm} .endif # reset each time through _build_dirs = .if ${DEP_RELDIR} == ${_DEP_RELDIR} # pickup other machines for this dir if necessary .if ${BUILD_AT_LEVEL0:Uyes} == "no" _build_dirs += ${_machines:@m@${_CURDIR}.$m@} .else _build_dirs += ${_machines:N${DEP_TARGET_SPEC}:@m@${_CURDIR}.$m@} .if ${DEP_TARGET_SPEC} == ${TARGET_SPEC} # pickup local dependencies now .if ${MAKE_VERSION} < 20160220 .-include <.depend> .else .dinclude <.depend> .endif .endif .endif .endif .if ${_debug_reldir} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: DIRDEPS='${DIRDEPS}' .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: _machines='${_machines}' .endif .if !empty(DIRDEPS) # these we reset each time through as they can depend on DEP_MACHINE DEP_DIRDEPS_FILTER = \ ${DIRDEPS_FILTER.${DEP_TARGET_SPEC}:U} \ ${DIRDEPS_FILTER.${DEP_MACHINE}:U} \ ${DIRDEPS_FILTER:U} .if empty(DEP_DIRDEPS_FILTER) # something harmless DEP_DIRDEPS_FILTER = U .endif # this is what we start with __depdirs := ${DIRDEPS:${NSkipDir}:${DEP_DIRDEPS_FILTER:ts:}:C,//+,/,g:O:u:@d@${SRCTOP}/$d@} # some entries may be qualified with . # the :M*/*/*.* just tries to limit the dirs we check to likely ones. # the ${d:E:M*/*} ensures we don't consider junos/usr.sbin/mgd __qual_depdirs := ${__depdirs:M*/*/*.*:@d@${exists($d):?:${"${d:E:M*/*}":?:${exists(${d:R}):?$d:}}}@} __unqual_depdirs := ${__depdirs:${__qual_depdirs:Uno:${M_ListToSkip}}} .if ${DEP_RELDIR} == ${_DEP_RELDIR} # if it was called out - we likely need it. __hostdpadd := ${DPADD:U.:M${HOST_OBJTOP}/*:S,${HOST_OBJTOP}/,,:H:${NSkipDir}:${DIRDEPS_FILTER:ts:}:S,$,.host,:N.*:@d@${SRCTOP}/$d@} __qual_depdirs += ${__hostdpadd} .endif .if ${_debug_reldir} .info depdirs=${__depdirs} .info qualified=${__qual_depdirs} .info unqualified=${__unqual_depdirs} .endif # _build_dirs is what we will feed to _DIRDEP_USE _build_dirs += \ ${__qual_depdirs:M*.host:${NSkipHostDir}:N.host} \ ${__qual_depdirs:N*.host} \ ${_machines:Mhost*:@m@${__unqual_depdirs:@d@$d.$m@}@:${NSkipHostDir}:N.host} \ ${_machines:Nhost*:@m@${__unqual_depdirs:@d@$d.$m@}@} # qualify everything now _build_dirs := ${_build_dirs:${M_dep_qual_fixes:ts:}:O:u} _build_all_dirs += ${_build_dirs} _build_all_dirs := ${_build_all_dirs:O:u} .endif # empty DIRDEPS # Normally if doing make -V something, # we do not want to waste time chasing DIRDEPS # but if we want to count the number of Makefile.depend* read, we do. 
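# e.g. a plain "${.MAKE} -f dirdeps.mk -V RELDIR" answers immediately;
# setting _V_READ_DIRDEPS (so the :M-V pattern below no longer matches)
# lets a -V query still chase DIRDEPS and report the count.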
.if ${.MAKEFLAGS:M-V${_V_READ_DIRDEPS}} == "" .if !empty(_build_all_dirs) .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '\# ${DEP_RELDIR}.${DEP_TARGET_SPEC}'; \ echo 'dirdeps: ${_build_all_dirs:${M_oneperline}}'; echo; } >&3; echo x!= { ${_build_all_dirs:@x@${target($x):?:echo '$x: _DIRDEP_USE';}@} echo; } >&3; echo .else # this makes it all happen dirdeps: ${_build_all_dirs} .endif ${_build_all_dirs}: _DIRDEP_USE .if ${_debug_reldir} .info ${DEP_RELDIR}.${DEP_TARGET_SPEC}: needs: ${_build_dirs} .endif # this builds the dependency graph .for m in ${_machines} # it would be nice to do :N${.TARGET} .if !empty(__qual_depdirs) .for q in ${__qual_depdirs:${M_dep_qual_fixes:ts:}:E:O:u:N$m} .if ${_debug_reldir} || ${DEBUG_DIRDEPS:@x@${${DEP_RELDIR}.$m:L:M$x}${${DEP_RELDIR}.$q:L:M$x}@} != "" .info ${DEP_RELDIR}.$m: graph: ${_build_dirs:M*.$q} .endif .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '${_this_dir}.$m: ${_build_dirs:M*.$q:${M_oneperline}}'; echo; } >&3; echo .else ${_this_dir}.$m: ${_build_dirs:M*.$q} .endif .endfor .endif .if ${_debug_reldir} .info ${DEP_RELDIR}.$m: graph: ${_build_dirs:M*.$m:N${_this_dir}.$m} .endif .if ${BUILD_DIRDEPS_CACHE} == "yes" x!= { echo; echo '${_this_dir}.$m: ${_build_dirs:M*.$m:N${_this_dir}.$m:${M_oneperline}}'; echo; } >&3; echo .else ${_this_dir}.$m: ${_build_dirs:M*.$m:N${_this_dir}.$m} .endif .endfor .endif # Now find more dependencies - and recurse. .for d in ${_build_all_dirs} .if !target(_dirdeps_checked.$d) # once only _dirdeps_checked.$d: .if ${_debug_search} .info checking $d .endif # Note: _build_all_dirs is fully qualified so d:R is always the directory .if exists(${d:R}) # Warning: there is an assumption here that MACHINE is always # the first entry in TARGET_SPEC_VARS. # If TARGET_SPEC and MACHINE are insufficient, you have a problem. _m := ${.MAKE.DEPENDFILE_PREFERENCE:T:S;${TARGET_SPEC}$;${d:E};:S;${MACHINE};${d:E:C/,.*//};:@m@${exists(${d:R}/$m):?${d:R}/$m:}@:[1]} .if !empty(_m) # M_dep_qual_fixes isn't geared to Makefile.depend _qm := ${_m:C;(\.depend)$;\1.${d:E};:${M_dep_qual_fixes:ts:}} .if ${_debug_search} .info Looking for ${_qm} .endif # we pass _DEP_TARGET_SPEC to tell the next step what we want _DEP_TARGET_SPEC := ${d:E} # some makefiles may still look at this _DEP_MACHINE := ${d:E:C/,.*//} # set this "just in case" # we can skip :tA since we computed the path above DEP_RELDIR := ${_m:H:S,${SRCTOP}/,,} # and reset this DIRDEPS = .if ${_debug_reldir} && ${_qm} != ${_m} .info loading ${_m} for ${d:E} .endif .include <${_m}> .endif .endif .endif .endfor .endif # -V .endif # BUILD_DIRDEPS .elif ${.MAKE.LEVEL} > 42 .error You should have stopped recursing by now. .else # we are building something DEP_RELDIR := ${RELDIR} _DEP_RELDIR := ${RELDIR} # pickup local dependencies .if ${MAKE_VERSION} < 20160220 .-include <.depend> .else .dinclude <.depend> .endif .endif # bootstrapping new dependencies made easy?
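# A usage sketch (the "mk" wrapper and directory are illustrative):
#
#	cd ${SRCTOP}/some/dir && mk -f dirdeps.mk bootstrap
#
# creates this directory's Makefile.depend* and then recursively visits
# the directories it lists; "bootstrap-this" does only the current
# directory, and "bootstrap-empty" just creates a stub to start from.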
.if !target(bootstrap) && (make(bootstrap) || \ make(bootstrap-this) || \ make(bootstrap-recurse) || \ make(bootstrap-empty)) -.if exists(${.CURDIR}/${.MAKE.DEPENDFILE:T}) +# if we are bootstrapping, create the default +_want = ${.CURDIR}/${.MAKE.DEPENDFILE_DEFAULT:T} + +.if exists(${_want}) # stop here ${.TARGETS:Mboot*}: .elif !make(bootstrap-empty) # find a Makefile.depend to use as _src _src != cd ${.CURDIR} && for m in ${.MAKE.DEPENDFILE_PREFERENCE:T:S,${MACHINE},*,}; do test -s $$m || continue; echo $$m; break; done; echo .if empty(_src) .error cannot find any of ${.MAKE.DEPENDFILE_PREFERENCE:T}${.newline}Use: bootstrap-empty .endif -_src?= ${.MAKE.DEPENDFILE:T} +_src?= ${.MAKE.DEPENDFILE} +.MAKE.DEPENDFILE_BOOTSTRAP_SED+= -e 's,${_src:E},${MACHINE},g' + # just create Makefile.depend* for this dir bootstrap-this: .NOTMAIN - @echo Bootstrapping ${RELDIR}/${.MAKE.DEPENDFILE:T} from ${_src:T} - (cd ${.CURDIR} && sed 's,${_src:E},${MACHINE},g' ${_src} > ${.MAKE.DEPENDFILE:T}) + @echo Bootstrapping ${RELDIR}/${_want:T} from ${_src:T}; \ + echo You need to build ${RELDIR} to correctly populate it. +.if ${_src:T} != ${.MAKE.DEPENDFILE_PREFIX:T} + (cd ${.CURDIR} && sed ${.MAKE.DEPENDFILE_BOOTSTRAP_SED} ${_src} > ${_want}) +.else + cp ${.CURDIR}/${_src:T} ${_want} +.endif # create Makefile.depend* for this dir and its dependencies bootstrap: bootstrap-recurse bootstrap-recurse: bootstrap-this _mf := ${.PARSEFILE} bootstrap-recurse: .NOTMAIN .MAKE @cd ${SRCTOP} && \ for d in `cd ${RELDIR} && ${.MAKE} -B -f ${"${.MAKEFLAGS:M-n}":?${_src}:${.MAKE.DEPENDFILE:T}} -V DIRDEPS`; do \ test -d $$d || d=$${d%.*}; \ test -d $$d || continue; \ echo "Checking $$d for bootstrap ..."; \ (cd $$d && ${.MAKE} -f ${_mf} bootstrap-recurse); \ done .endif # create an empty Makefile.depend* to get the ball rolling. bootstrap-empty: .NOTMAIN .NOMETA - @echo Creating empty ${RELDIR}/${.MAKE.DEPENDFILE:T}; \ + @echo Creating empty ${RELDIR}/${_want:T}; \ echo You need to build ${RELDIR} to correctly populate it. - @{ echo DIRDEPS=; echo ".include <dirdeps.mk>"; } > ${.CURDIR}/${.MAKE.DEPENDFILE:T} + @{ echo DIRDEPS=; echo ".include <dirdeps.mk>"; } > ${_want} .endif Index: projects/clang390-import/share/mk/meta.sys.mk =================================================================== --- projects/clang390-import/share/mk/meta.sys.mk (revision 305686) +++ projects/clang390-import/share/mk/meta.sys.mk (revision 305687) @@ -1,143 +1,163 @@ # $FreeBSD$ # $Id: meta.sys.mk,v 1.19 2014/08/02 23:16:02 sjg Exp $ # # @(#) Copyright (c) 2010, Simon J. Gerraty # # This file is provided in the hope that it will # be of use. There is absolutely NO WARRANTY. # Permission to copy, redistribute or otherwise # use this file is hereby granted provided that # the above copyright notice and this notice are # left intact. # # Please send copies of changes and bug-fixes to: # sjg@crufty.net # # include this if you want to enable meta mode # for maximum benefit, requires filemon(4) driver. .if ${MAKE_VERSION:U0} > 20100901 .if !target(.ERROR) .-include "local.meta.sys.mk" # absolute path to what we are reading.
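# (:tA resolves ${.PARSEDIR} to an absolute path, as with realpath(3))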
_PARSEDIR = ${.PARSEDIR:tA} +.if !defined(SYS_MK_DIR) +SYS_MK_DIR := ${_PARSEDIR} +.endif + META_MODE += meta verbose .MAKE.MODE ?= ${META_MODE} .if ${.MAKE.LEVEL} == 0 _make_mode := ${.MAKE.MODE} ${META_MODE} .if ${_make_mode:M*read*} != "" || ${_make_mode:M*nofilemon*} != "" # tell everyone we are not updating Makefile.depend* UPDATE_DEPENDFILE = NO .export UPDATE_DEPENDFILE .endif .if ${UPDATE_DEPENDFILE:Uyes:tl} == "no" && !exists(/dev/filemon) # we should not get upset META_MODE += nofilemon .export META_MODE .endif .endif .if !defined(NO_SILENT) .if ${MAKE_VERSION} > 20110818 # only be silent when we have a .meta file META_MODE += silent=yes .else .SILENT: .endif .endif # make defaults .MAKE.DEPENDFILE to .depend # that won't work for us. .if ${.MAKE.DEPENDFILE} == ".depend" .undef .MAKE.DEPENDFILE .endif # if you don't cross build for multiple MACHINEs concurrently, then # .MAKE.DEPENDFILE = Makefile.depend # probably makes sense - you can set that in local.sys.mk .MAKE.DEPENDFILE ?= Makefile.depend.${MACHINE} # we use the pseudo machine "host" for the build host. # this should be taken care of before we get here .if ${OBJTOP:Ua} == ${HOST_OBJTOP:Ub} MACHINE = host .endif .if ${.MAKE.LEVEL} == 0 # it can be handy to know which MACHINE kicked off the build # for example, if using Makefile.depend for multiple machines, # allowing only MACHINE0 to update can keep things simple. MACHINE0 := ${MACHINE} .export MACHINE0 .if defined(PYTHON) && exists(${PYTHON}) # we prefer the python version of this - it is much faster META2DEPS ?= ${.PARSEDIR}/meta2deps.py .else META2DEPS ?= ${.PARSEDIR}/meta2deps.sh .endif META2DEPS := ${META2DEPS} .export META2DEPS .endif MAKE_PRINT_VAR_ON_ERROR += \ .ERROR_TARGET \ .ERROR_META_FILE \ .MAKE.LEVEL \ MAKEFILE \ .MAKE.MODE .if !defined(SB) && defined(SRCTOP) SB = ${SRCTOP:H} .endif ERROR_LOGDIR ?= ${SB}/error meta_error_log = ${ERROR_LOGDIR}/meta-${.MAKE.PID}.log # we are not interested in make telling us a failure happened elsewhere .ERROR: _metaError _metaError: .NOMETA .NOTMAIN -@[ "${.ERROR_META_FILE}" ] && { \ grep -q 'failure has been detected in another branch' ${.ERROR_META_FILE} && exit 0; \ mkdir -p ${meta_error_log:H}; \ cp ${.ERROR_META_FILE} ${meta_error_log}; \ echo "ERROR: log ${meta_error_log}" >&2; }; : .endif +META_COOKIE_TOUCH= +# some targets need to be .PHONY in non-meta mode +META_NOPHONY= .PHONY # Are we, after all, in meta mode? .if ${.MAKE.MODE:Uno:Mmeta*} != "" MKDEP_MK = meta.autodep.mk + +# we can afford to use cookies to prevent some targets +# from re-running needlessly +META_COOKIE_TOUCH= touch ${COOKIE.${.TARGET}:U${.OBJDIR}/${.TARGET:T}} +META_NOPHONY= + +# some targets involve old pre-built targets +# ignore mtime of shell +# and mtime of makefiles does not matter in meta mode +.MAKE.META.IGNORE_PATHS += \ + ${MAKEFILE} \ + ${SHELL} \ + ${SYS_MK_DIR} # if we think we are updating dependencies, # then filemon had better be present .if ${UPDATE_DEPENDFILE:Uyes:tl} != "no" && !exists(/dev/filemon) .error ${.newline}ERROR: The filemon module (/dev/filemon) is not loaded.
.endif .if ${.MAKE.LEVEL} == 0 # make sure dirdeps target exists and do it first all: dirdeps .WAIT dirdeps: .NOPATH: dirdeps .if defined(ALL_MACHINES) # the first .MAIN: is what counts # by default dirdeps is all we want at level0 .MAIN: dirdeps # tell dirdeps.mk what we want BUILD_AT_LEVEL0 = no .endif .if ${.TARGETS:Nall} == "" # it works best if we do everything via sub-makes BUILD_AT_LEVEL0 ?= no .endif .endif .endif .endif Index: projects/clang390-import/sys/amd64/amd64/pmap.c =================================================================== --- projects/clang390-import/sys/amd64/amd64/pmap.c (revision 305686) +++ projects/clang390-import/sys/amd64/amd64/pmap.c (revision 305687) @@ -1,7274 +1,7268 @@ /*- * Copyright (c) 1991 Regents of the University of California. * All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * Copyright (c) 2003 Peter Wemm * All rights reserved. * Copyright (c) 2005-2010 Alan L. Cox * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ /*- * Copyright (c) 2003 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Jake Burkholder, * Safeport Network Services, and Network Associates Laboratories, the * Security Research Division of Network Associates, Inc. under * DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA * CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #define AMD64_NPT_AWARE #include <sys/cdefs.h> __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is * also stored by the logical address mapping module, * this module may throw away valid virtual-to-physical * mappings at almost any time. However, invalidations * of virtual-to-physical mappings must be done as * requested. * * In order to cope with hardware architectures which * make virtual-to-physical map invalidates expensive, * this module may delay invalidate or reduce protection * operations until such time as they are actually * necessary. This module is given full information as * to which processors are currently using which maps, * and to when physical maps must be made correct.
*/ #include "opt_pmap.h" #include "opt_vm.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #endif static __inline boolean_t pmap_type_guest(pmap_t pmap) { return ((pmap->pm_type == PT_EPT) || (pmap->pm_type == PT_RVI)); } static __inline boolean_t pmap_emulate_ad_bits(pmap_t pmap) { return ((pmap->pm_flags & PMAP_EMULATE_AD_BITS) != 0); } static __inline pt_entry_t pmap_valid_bit(pmap_t pmap) { pt_entry_t mask; switch (pmap->pm_type) { case PT_X86: case PT_RVI: mask = X86_PG_V; break; case PT_EPT: if (pmap_emulate_ad_bits(pmap)) mask = EPT_PG_EMUL_V; else mask = EPT_PG_READ; break; default: panic("pmap_valid_bit: invalid pm_type %d", pmap->pm_type); } return (mask); } static __inline pt_entry_t pmap_rw_bit(pmap_t pmap) { pt_entry_t mask; switch (pmap->pm_type) { case PT_X86: case PT_RVI: mask = X86_PG_RW; break; case PT_EPT: if (pmap_emulate_ad_bits(pmap)) mask = EPT_PG_EMUL_RW; else mask = EPT_PG_WRITE; break; default: panic("pmap_rw_bit: invalid pm_type %d", pmap->pm_type); } return (mask); } static __inline pt_entry_t pmap_global_bit(pmap_t pmap) { pt_entry_t mask; switch (pmap->pm_type) { case PT_X86: mask = X86_PG_G; break; case PT_RVI: case PT_EPT: mask = 0; break; default: panic("pmap_global_bit: invalid pm_type %d", pmap->pm_type); } return (mask); } static __inline pt_entry_t pmap_accessed_bit(pmap_t pmap) { pt_entry_t mask; switch (pmap->pm_type) { case PT_X86: case PT_RVI: mask = X86_PG_A; break; case PT_EPT: if (pmap_emulate_ad_bits(pmap)) mask = EPT_PG_READ; else mask = EPT_PG_A; break; default: panic("pmap_accessed_bit: invalid pm_type %d", pmap->pm_type); } return (mask); } static __inline pt_entry_t pmap_modified_bit(pmap_t pmap) { pt_entry_t mask; switch (pmap->pm_type) { case PT_X86: case PT_RVI: mask = X86_PG_M; break; case PT_EPT: if (pmap_emulate_ad_bits(pmap)) mask = EPT_PG_WRITE; else mask = EPT_PG_M; break; default: panic("pmap_modified_bit: invalid pm_type %d", pmap->pm_type); } return (mask); } extern struct pcpu __pcpu[]; #if !defined(DIAGNOSTIC) #ifdef __GNUC_GNU_INLINE__ #define PMAP_INLINE __attribute__((__gnu_inline__)) inline #else #define PMAP_INLINE extern inline #endif #else #define PMAP_INLINE #endif #ifdef PV_STATS #define PV_STAT(x) do { x ; } while (0) #else #define PV_STAT(x) do { } while (0) #endif #define pa_index(pa) ((pa) >> PDRSHIFT) #define pa_to_pvh(pa) (&pv_table[pa_index(pa)]) #define NPV_LIST_LOCKS MAXCPU #define PHYS_TO_PV_LIST_LOCK(pa) \ (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do { \ struct rwlock **_lockp = (lockp); \ struct rwlock *_new_lock; \ \ _new_lock = PHYS_TO_PV_LIST_LOCK(pa); \ if (_new_lock != *_lockp) { \ if (*_lockp != NULL) \ rw_wunlock(*_lockp); \ *_lockp = _new_lock; \ rw_wlock(*_lockp); \ } \ } while (0) #define CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m) \ CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, VM_PAGE_TO_PHYS(m)) #define RELEASE_PV_LIST_LOCK(lockp) do { \ struct rwlock **_lockp = (lockp); \ \ if (*_lockp != NULL) { \ rw_wunlock(*_lockp); \ *_lockp = NULL; \ } \ } while (0) #define VM_PAGE_TO_PV_LIST_LOCK(m) \ PHYS_TO_PV_LIST_LOCK(VM_PAGE_TO_PHYS(m)) struct pmap kernel_pmap_store; vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ 
vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ int nkpt; SYSCTL_INT(_machdep, OID_AUTO, nkpt, CTLFLAG_RD, &nkpt, 0, "Number of kernel page table pages allocated on bootup"); static int ndmpdp; vm_paddr_t dmaplimit; vm_offset_t kernel_vm_end = VM_MIN_KERNEL_ADDRESS; pt_entry_t pg_nx; static SYSCTL_NODE(_vm, OID_AUTO, pmap, CTLFLAG_RD, 0, "VM/pmap parameters"); static int pat_works = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, pat_works, CTLFLAG_RD, &pat_works, 1, "Is page attribute table fully functional?"); static int pg_ps_enabled = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, pg_ps_enabled, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pg_ps_enabled, 0, "Are large page mappings enabled?"); #define PAT_INDEX_SIZE 8 static int pat_index[PAT_INDEX_SIZE]; /* cache mode to PAT index conversion */ static u_int64_t KPTphys; /* phys addr of kernel level 1 */ static u_int64_t KPDphys; /* phys addr of kernel level 2 */ u_int64_t KPDPphys; /* phys addr of kernel level 3 */ u_int64_t KPML4phys; /* phys addr of kernel level 4 */ static u_int64_t DMPDphys; /* phys addr of direct mapped level 2 */ static u_int64_t DMPDPphys; /* phys addr of direct mapped level 3 */ static int ndmpdpphys; /* number of DMPDPphys pages */ /* * pmap_mapdev support pre initialization (i.e. console) */ #define PMAP_PREINIT_MAPPING_COUNT 8 static struct pmap_preinit_mapping { vm_paddr_t pa; vm_offset_t va; vm_size_t sz; int mode; } pmap_preinit_mapping[PMAP_PREINIT_MAPPING_COUNT]; static int pmap_initialized; /* * Data for the pv entry allocation mechanism. * Updates to pv_invl_gen are protected by the pv_list_locks[] * elements, but reads are not. */ static TAILQ_HEAD(pch, pv_chunk) pv_chunks = TAILQ_HEAD_INITIALIZER(pv_chunks); static struct mtx pv_chunks_mutex; static struct rwlock pv_list_locks[NPV_LIST_LOCKS]; static u_long pv_invl_gen[NPV_LIST_LOCKS]; static struct md_page *pv_table; static struct md_page pv_dummy; /* * All those kernel PT submaps that BSD is so fond of */ pt_entry_t *CMAP1 = 0; caddr_t CADDR1 = 0; static vm_offset_t qframe = 0; static struct mtx qframe_mtx; static int pmap_flags = PMAP_PDE_SUPERPAGE; /* flags for x86 pmaps */ int pmap_pcid_enabled = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, pcid_enabled, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pmap_pcid_enabled, 0, "Is TLB Context ID enabled ?"); int invpcid_works = 0; SYSCTL_INT(_vm_pmap, OID_AUTO, invpcid_works, CTLFLAG_RD, &invpcid_works, 0, "Is the invpcid instruction available ?"); static int pmap_pcid_save_cnt_proc(SYSCTL_HANDLER_ARGS) { int i; uint64_t res; res = 0; CPU_FOREACH(i) { res += cpuid_to_pcpu[i]->pc_pm_save_cnt; } return (sysctl_handle_64(oidp, &res, 0, req)); } SYSCTL_PROC(_vm_pmap, OID_AUTO, pcid_save_cnt, CTLTYPE_U64 | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, 0, pmap_pcid_save_cnt_proc, "QU", "Count of saved TLB context on switch"); static LIST_HEAD(, pmap_invl_gen) pmap_invl_gen_tracker = LIST_HEAD_INITIALIZER(&pmap_invl_gen_tracker); static struct mtx invl_gen_mtx; static u_long pmap_invl_gen = 0; /* Fake lock object to satisfy turnstiles interface. */ static struct lock_object invl_gen_ts = { .lo_name = "invlts", }; #define PMAP_ASSERT_NOT_IN_DI() \ KASSERT(curthread->td_md.md_invl_gen.gen == 0, ("DI already started")) /* * Start a new Delayed Invalidation (DI) block of code, executed by * the current thread. 
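 *
 * A sketch of the expected calling pattern (illustrative only):
 *
 *	pmap_delayed_invl_started();
 *	... destroy PTE and PV list entries for a mapping ...
 *	pmap_delayed_invl_page(m);
 *	... issue the required TLB shootdowns ...
 *	pmap_delayed_invl_finished();
 *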
Within a DI block, the current thread may * destroy both the page table and PV list entries for a mapping and * then release the corresponding PV list lock before ensuring that * the mapping is flushed from the TLBs of any processors with the * pmap active. */ static void pmap_delayed_invl_started(void) { struct pmap_invl_gen *invl_gen; u_long currgen; invl_gen = &curthread->td_md.md_invl_gen; PMAP_ASSERT_NOT_IN_DI(); mtx_lock(&invl_gen_mtx); if (LIST_EMPTY(&pmap_invl_gen_tracker)) currgen = pmap_invl_gen; else currgen = LIST_FIRST(&pmap_invl_gen_tracker)->gen; invl_gen->gen = currgen + 1; LIST_INSERT_HEAD(&pmap_invl_gen_tracker, invl_gen, link); mtx_unlock(&invl_gen_mtx); } /* * Finish the DI block, previously started by the current thread. All * required TLB flushes for the pages marked by * pmap_delayed_invl_page() must be finished before this function is * called. * * This function works by bumping the global DI generation number to * the generation number of the current thread's DI, unless there is a * pending DI that started earlier. In the latter case, bumping the * global DI generation number would incorrectly signal that the * earlier DI had finished. Instead, this function bumps the earlier * DI's generation number to match the generation number of the * current thread's DI. */ static void pmap_delayed_invl_finished(void) { struct pmap_invl_gen *invl_gen, *next; struct turnstile *ts; invl_gen = &curthread->td_md.md_invl_gen; KASSERT(invl_gen->gen != 0, ("missed invl_started")); mtx_lock(&invl_gen_mtx); next = LIST_NEXT(invl_gen, link); if (next == NULL) { turnstile_chain_lock(&invl_gen_ts); ts = turnstile_lookup(&invl_gen_ts); pmap_invl_gen = invl_gen->gen; if (ts != NULL) { turnstile_broadcast(ts, TS_SHARED_QUEUE); turnstile_unpend(ts, TS_SHARED_LOCK); } turnstile_chain_unlock(&invl_gen_ts); } else { next->gen = invl_gen->gen; } LIST_REMOVE(invl_gen, link); mtx_unlock(&invl_gen_mtx); invl_gen->gen = 0; } #ifdef PV_STATS static long invl_wait; SYSCTL_LONG(_vm_pmap, OID_AUTO, invl_wait, CTLFLAG_RD, &invl_wait, 0, "Number of times DI invalidation blocked pmap_remove_all/write"); #endif static u_long * pmap_delayed_invl_genp(vm_page_t m) { return (&pv_invl_gen[pa_index(VM_PAGE_TO_PHYS(m)) % NPV_LIST_LOCKS]); } /* * Ensure that all currently executing DI blocks, that need to flush * TLB for the given page m, actually flushed the TLB at the time the * function returned. If the page m has an empty PV list and we call * pmap_delayed_invl_wait(), upon its return we know that no CPU has a * valid mapping for the page m in either its page table or TLB. * * This function works by blocking until the global DI generation * number catches up with the generation number associated with the * given page m and its PV list. Since this function's callers * typically own an object lock and sometimes own a page lock, it * cannot sleep. Instead, it blocks on a turnstile to relinquish the * processor. */ static void pmap_delayed_invl_wait(vm_page_t m) { struct thread *td; struct turnstile *ts; u_long *m_gen; #ifdef PV_STATS bool accounted = false; #endif td = curthread; m_gen = pmap_delayed_invl_genp(m); while (*m_gen > pmap_invl_gen) { #ifdef PV_STATS if (!accounted) { atomic_add_long(&invl_wait, 1); accounted = true; } #endif ts = turnstile_trywait(&invl_gen_ts); if (*m_gen > pmap_invl_gen) turnstile_wait(ts, NULL, TS_SHARED_QUEUE); else turnstile_cancel(ts); } } /* * Mark the page m's PV list as participating in the current thread's * DI block. 
Any threads concurrently using m's PV list to remove or * restrict all mappings to m will wait for the current thread's DI * block to complete before proceeding. * * The function works by setting the DI generation number for m's PV * list to at least the DI generation number of the current thread. * This forces a caller of pmap_delayed_invl_wait() to block until * current thread calls pmap_delayed_invl_finished(). */ static void pmap_delayed_invl_page(vm_page_t m) { u_long gen, *m_gen; rw_assert(VM_PAGE_TO_PV_LIST_LOCK(m), RA_WLOCKED); gen = curthread->td_md.md_invl_gen.gen; if (gen == 0) return; m_gen = pmap_delayed_invl_genp(m); if (*m_gen < gen) *m_gen = gen; } /* * Crashdump maps. */ static caddr_t crashdumpmap; static void free_pv_chunk(struct pv_chunk *pc); static void free_pv_entry(pmap_t pmap, pv_entry_t pv); static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp); static int popcnt_pc_map_pq(uint64_t *map); static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp); static void reserve_pv_entries(pmap_t pmap, int needed, struct rwlock **lockp); static void pmap_pv_demote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp); static boolean_t pmap_pv_insert_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp); static void pmap_pv_promote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp); static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static int pmap_change_attr_locked(vm_offset_t va, vm_size_t size, int mode); static boolean_t pmap_demote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va); static boolean_t pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va, struct rwlock **lockp); static boolean_t pmap_demote_pdpe(pmap_t pmap, pdp_entry_t *pdpe, vm_offset_t va); static boolean_t pmap_enter_pde(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, struct rwlock **lockp); static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp); static void pmap_fill_ptp(pt_entry_t *firstpte, pt_entry_t newpte); static int pmap_insert_pt_page(pmap_t pmap, vm_page_t mpte); static void pmap_kenter_attr(vm_offset_t va, vm_paddr_t pa, int mode); static vm_page_t pmap_lookup_pt_page(pmap_t pmap, vm_offset_t va); static void pmap_pde_attr(pd_entry_t *pde, int cache_bits, int mask); static void pmap_promote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va, struct rwlock **lockp); static boolean_t pmap_protect_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t sva, vm_prot_t prot); static void pmap_pte_attr(pt_entry_t *pte, int cache_bits, int mask); static int pmap_remove_pde(pmap_t pmap, pd_entry_t *pdq, vm_offset_t sva, struct spglist *free, struct rwlock **lockp); static int pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t sva, pd_entry_t ptepde, struct spglist *free, struct rwlock **lockp); static void pmap_remove_pt_page(pmap_t pmap, vm_page_t mpte); static void pmap_remove_page(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, struct spglist *free); static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp); static void pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde); static void pmap_update_pde_invalidate(pmap_t, vm_offset_t va, pd_entry_t pde); static vm_page_t _pmap_allocpte(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock 
**lockp); static vm_page_t pmap_allocpde(pmap_t pmap, vm_offset_t va, struct rwlock **lockp); static vm_page_t pmap_allocpte(pmap_t pmap, vm_offset_t va, struct rwlock **lockp); static void _pmap_unwire_ptp(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free); static int pmap_unuse_pt(pmap_t, vm_offset_t, pd_entry_t, struct spglist *); static vm_offset_t pmap_kmem_choose(vm_offset_t addr); /* * Move the kernel virtual free pointer to the next * 2MB. This is used to help improve performance * by using a large (2MB) page for much of the kernel * (.text, .data, .bss) */ static vm_offset_t pmap_kmem_choose(vm_offset_t addr) { vm_offset_t newaddr = addr; newaddr = roundup2(addr, NBPDR); return (newaddr); } /********************/ /* Inline functions */ /********************/ /* Return a non-clipped PD index for a given VA */ static __inline vm_pindex_t pmap_pde_pindex(vm_offset_t va) { return (va >> PDRSHIFT); } /* Return various clipped indexes for a given VA */ static __inline vm_pindex_t pmap_pte_index(vm_offset_t va) { return ((va >> PAGE_SHIFT) & ((1ul << NPTEPGSHIFT) - 1)); } static __inline vm_pindex_t pmap_pde_index(vm_offset_t va) { return ((va >> PDRSHIFT) & ((1ul << NPDEPGSHIFT) - 1)); } static __inline vm_pindex_t pmap_pdpe_index(vm_offset_t va) { return ((va >> PDPSHIFT) & ((1ul << NPDPEPGSHIFT) - 1)); } static __inline vm_pindex_t pmap_pml4e_index(vm_offset_t va) { return ((va >> PML4SHIFT) & ((1ul << NPML4EPGSHIFT) - 1)); } /* Return a pointer to the PML4 slot that corresponds to a VA */ static __inline pml4_entry_t * pmap_pml4e(pmap_t pmap, vm_offset_t va) { return (&pmap->pm_pml4[pmap_pml4e_index(va)]); } /* Return a pointer to the PDP slot that corresponds to a VA */ static __inline pdp_entry_t * pmap_pml4e_to_pdpe(pml4_entry_t *pml4e, vm_offset_t va) { pdp_entry_t *pdpe; pdpe = (pdp_entry_t *)PHYS_TO_DMAP(*pml4e & PG_FRAME); return (&pdpe[pmap_pdpe_index(va)]); } /* Return a pointer to the PDP slot that corresponds to a VA */ static __inline pdp_entry_t * pmap_pdpe(pmap_t pmap, vm_offset_t va) { pml4_entry_t *pml4e; pt_entry_t PG_V; PG_V = pmap_valid_bit(pmap); pml4e = pmap_pml4e(pmap, va); if ((*pml4e & PG_V) == 0) return (NULL); return (pmap_pml4e_to_pdpe(pml4e, va)); } /* Return a pointer to the PD slot that corresponds to a VA */ static __inline pd_entry_t * pmap_pdpe_to_pde(pdp_entry_t *pdpe, vm_offset_t va) { pd_entry_t *pde; pde = (pd_entry_t *)PHYS_TO_DMAP(*pdpe & PG_FRAME); return (&pde[pmap_pde_index(va)]); } /* Return a pointer to the PD slot that corresponds to a VA */ static __inline pd_entry_t * pmap_pde(pmap_t pmap, vm_offset_t va) { pdp_entry_t *pdpe; pt_entry_t PG_V; PG_V = pmap_valid_bit(pmap); pdpe = pmap_pdpe(pmap, va); if (pdpe == NULL || (*pdpe & PG_V) == 0) return (NULL); return (pmap_pdpe_to_pde(pdpe, va)); } /* Return a pointer to the PT slot that corresponds to a VA */ static __inline pt_entry_t * pmap_pde_to_pte(pd_entry_t *pde, vm_offset_t va) { pt_entry_t *pte; pte = (pt_entry_t *)PHYS_TO_DMAP(*pde & PG_FRAME); return (&pte[pmap_pte_index(va)]); } /* Return a pointer to the PT slot that corresponds to a VA */ static __inline pt_entry_t * pmap_pte(pmap_t pmap, vm_offset_t va) { pd_entry_t *pde; pt_entry_t PG_V; PG_V = pmap_valid_bit(pmap); pde = pmap_pde(pmap, va); if (pde == NULL || (*pde & PG_V) == 0) return (NULL); if ((*pde & PG_PS) != 0) /* compat with i386 pmap_pte() */ return ((pt_entry_t *)pde); return (pmap_pde_to_pte(pde, va)); } static __inline void pmap_resident_count_inc(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, 
MA_OWNED); pmap->pm_stats.resident_count += count; } static __inline void pmap_resident_count_dec(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT(pmap->pm_stats.resident_count >= count, ("pmap %p resident count underflow %ld %d", pmap, pmap->pm_stats.resident_count, count)); pmap->pm_stats.resident_count -= count; } PMAP_INLINE pt_entry_t * vtopte(vm_offset_t va) { u_int64_t mask = ((1ul << (NPTEPGSHIFT + NPDEPGSHIFT + NPDPEPGSHIFT + NPML4EPGSHIFT)) - 1); KASSERT(va >= VM_MAXUSER_ADDRESS, ("vtopte on a uva/gpa 0x%0lx", va)); return (PTmap + ((va >> PAGE_SHIFT) & mask)); } static __inline pd_entry_t * vtopde(vm_offset_t va) { u_int64_t mask = ((1ul << (NPDEPGSHIFT + NPDPEPGSHIFT + NPML4EPGSHIFT)) - 1); KASSERT(va >= VM_MAXUSER_ADDRESS, ("vtopde on a uva/gpa 0x%0lx", va)); return (PDmap + ((va >> PDRSHIFT) & mask)); } static u_int64_t allocpages(vm_paddr_t *firstaddr, int n) { u_int64_t ret; ret = *firstaddr; bzero((void *)ret, n * PAGE_SIZE); *firstaddr += n * PAGE_SIZE; return (ret); } CTASSERT(powerof2(NDMPML4E)); /* number of kernel PDP slots */ #define NKPDPE(ptpgs) howmany(ptpgs, NPDEPG) static void nkpt_init(vm_paddr_t addr) { int pt_pages; #ifdef NKPT pt_pages = NKPT; #else pt_pages = howmany(addr, 1 << PDRSHIFT); pt_pages += NKPDPE(pt_pages); /* * Add some slop beyond the bare minimum required for bootstrapping * the kernel. * * This is quite important when allocating KVA for kernel modules. * The modules are required to be linked in the negative 2GB of * the address space. If we run out of KVA in this region then * pmap_growkernel() will need to allocate page table pages to map * the entire 512GB of KVA space which is an unnecessary tax on * physical memory. * * Secondly, device memory mapped as part of setting up the low- * level console(s) is taken from KVA, starting at virtual_avail. * This is because cninit() is called after pmap_bootstrap() but * before vm_init() and pmap_init(). 20MB for a frame buffer is * not uncommon. */ pt_pages += 32; /* 64MB additional slop. */ #endif nkpt = pt_pages; } static void create_pagetables(vm_paddr_t *firstaddr) { int i, j, ndm1g, nkpdpe; pt_entry_t *pt_p; pd_entry_t *pd_p; pdp_entry_t *pdp_p; pml4_entry_t *p4_p; /* Allocate page table pages for the direct map */ ndmpdp = howmany(ptoa(Maxmem), NBPDP); if (ndmpdp < 4) /* Minimum 4GB of dirmap */ ndmpdp = 4; ndmpdpphys = howmany(ndmpdp, NPDPEPG); if (ndmpdpphys > NDMPML4E) { /* * Each NDMPML4E allows 512 GB, so limit to that, * and then readjust ndmpdp and ndmpdpphys. */ printf("NDMPML4E limits system to %d GB\n", NDMPML4E * 512); Maxmem = atop(NDMPML4E * NBPML4); ndmpdpphys = NDMPML4E; ndmpdp = NDMPML4E * NPDEPG; } DMPDPphys = allocpages(firstaddr, ndmpdpphys); ndm1g = 0; if ((amd_feature & AMDID_PAGE1GB) != 0) ndm1g = ptoa(Maxmem) >> PDPSHIFT; if (ndm1g < ndmpdp) DMPDphys = allocpages(firstaddr, ndmpdp - ndm1g); dmaplimit = (vm_paddr_t)ndmpdp << PDPSHIFT; /* Allocate pages */ KPML4phys = allocpages(firstaddr, 1); KPDPphys = allocpages(firstaddr, NKPML4E); /* * Allocate the initial number of kernel page table pages required to * bootstrap. We defer this until after all memory-size dependent * allocations are done (e.g. direct map), so that we don't have to * build in too much slop in our estimate. * * Note that when NKPML4E > 1, we have an empty page underneath * all but the KPML4I'th one, so we need NKPML4E-1 extra (zeroed) * pages. (pmap_enter requires a PD page to exist for each KPML4E.) 
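 *
 * As a rough worked example (numbers illustrative): if the bootstrap
 * allocations end at *firstaddr ~= 64MB, nkpt_init() computes
 * howmany(64MB, 2MB) = 32 PT pages, adds NKPDPE(32) = 1, and then the
 * 32 pages of slop, giving nkpt = 65 (about 130MB of mapped KVA).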
*/ nkpt_init(*firstaddr); nkpdpe = NKPDPE(nkpt); KPTphys = allocpages(firstaddr, nkpt); KPDphys = allocpages(firstaddr, nkpdpe); /* Fill in the underlying page table pages */ /* Nominally read-only (but really R/W) from zero to physfree */ /* XXX not fully used, underneath 2M pages */ pt_p = (pt_entry_t *)KPTphys; for (i = 0; ptoa(i) < *firstaddr; i++) pt_p[i] = ptoa(i) | X86_PG_RW | X86_PG_V | X86_PG_G; /* Now map the page tables at their location within PTmap */ pd_p = (pd_entry_t *)KPDphys; for (i = 0; i < nkpt; i++) pd_p[i] = (KPTphys + ptoa(i)) | X86_PG_RW | X86_PG_V; /* Map from zero to end of allocations under 2M pages */ /* This replaces some of the KPTphys entries above */ for (i = 0; (i << PDRSHIFT) < *firstaddr; i++) pd_p[i] = (i << PDRSHIFT) | X86_PG_RW | X86_PG_V | PG_PS | X86_PG_G; /* And connect up the PD to the PDP (leaving room for L4 pages) */ pdp_p = (pdp_entry_t *)(KPDPphys + ptoa(KPML4I - KPML4BASE)); for (i = 0; i < nkpdpe; i++) pdp_p[i + KPDPI] = (KPDphys + ptoa(i)) | X86_PG_RW | X86_PG_V | PG_U; /* * Now, set up the direct map region using 2MB and/or 1GB pages. If * the end of physical memory is not aligned to a 1GB page boundary, * then the residual physical memory is mapped with 2MB pages. Later, * if pmap_mapdev{_attr}() uses the direct map for non-write-back * memory, pmap_change_attr() will demote any 2MB or 1GB page mappings * that are partially used. */ pd_p = (pd_entry_t *)DMPDphys; for (i = NPDEPG * ndm1g, j = 0; i < NPDEPG * ndmpdp; i++, j++) { pd_p[j] = (vm_paddr_t)i << PDRSHIFT; /* Preset PG_M and PG_A because demotion expects it. */ pd_p[j] |= X86_PG_RW | X86_PG_V | PG_PS | X86_PG_G | X86_PG_M | X86_PG_A; } pdp_p = (pdp_entry_t *)DMPDPphys; for (i = 0; i < ndm1g; i++) { pdp_p[i] = (vm_paddr_t)i << PDPSHIFT; /* Preset PG_M and PG_A because demotion expects it. */ pdp_p[i] |= X86_PG_RW | X86_PG_V | PG_PS | X86_PG_G | X86_PG_M | X86_PG_A; } for (j = 0; i < ndmpdp; i++, j++) { pdp_p[i] = DMPDphys + ptoa(j); pdp_p[i] |= X86_PG_RW | X86_PG_V | PG_U; } /* And recursively map PML4 to itself in order to get PTmap */ p4_p = (pml4_entry_t *)KPML4phys; p4_p[PML4PML4I] = KPML4phys; p4_p[PML4PML4I] |= X86_PG_RW | X86_PG_V | PG_U; /* Connect the Direct Map slot(s) up to the PML4. */ for (i = 0; i < ndmpdpphys; i++) { p4_p[DMPML4I + i] = DMPDPphys + ptoa(i); p4_p[DMPML4I + i] |= X86_PG_RW | X86_PG_V | PG_U; } /* Connect the KVA slots up to the PML4 */ for (i = 0; i < NKPML4E; i++) { p4_p[KPML4BASE + i] = KPDPphys + ptoa(i); p4_p[KPML4BASE + i] |= X86_PG_RW | X86_PG_V | PG_U; } } /* * Bootstrap the system enough to run with virtual memory. * * On amd64 this is called after mapping has already been enabled * and just syncs the pmap module with what has already been done. * [We can't call it easily with mapping off since the kernel is not * mapped with PA == VA, hence we would have to relocate every address * from the linked base (virtual) address "KERNBASE" to the actual * (physical) address starting relative to 0] */ void pmap_bootstrap(vm_paddr_t *firstaddr) { vm_offset_t va; pt_entry_t *pte; int i; /* * Create an initial set of page tables to run the kernel in. */ create_pagetables(firstaddr); /* * Add a physical memory segment (vm_phys_seg) corresponding to the * preallocated kernel page table pages so that vm_page structures * representing these pages will be created. The vm_page structures * are required for promotion of the corresponding kernel virtual * addresses to superpage mappings. 
*/ vm_phys_add_seg(KPTphys, KPTphys + ptoa(nkpt)); virtual_avail = (vm_offset_t) KERNBASE + *firstaddr; virtual_avail = pmap_kmem_choose(virtual_avail); virtual_end = VM_MAX_KERNEL_ADDRESS; /* XXX do %cr0 as well */ load_cr4(rcr4() | CR4_PGE); load_cr3(KPML4phys); if (cpu_stdext_feature & CPUID_STDEXT_SMEP) load_cr4(rcr4() | CR4_SMEP); /* * Initialize the kernel pmap (which is statically allocated). */ PMAP_LOCK_INIT(kernel_pmap); kernel_pmap->pm_pml4 = (pdp_entry_t *)PHYS_TO_DMAP(KPML4phys); kernel_pmap->pm_cr3 = KPML4phys; CPU_FILL(&kernel_pmap->pm_active); /* don't allow deactivation */ TAILQ_INIT(&kernel_pmap->pm_pvchunk); kernel_pmap->pm_flags = pmap_flags; /* * Initialize the TLB invalidations generation number lock. */ mtx_init(&invl_gen_mtx, "invlgn", NULL, MTX_DEF); /* * Reserve some special page table entries/VA space for temporary * mapping of pages. */ #define SYSMAP(c, p, v, n) \ v = (c)va; va += ((n)*PAGE_SIZE); p = pte; pte += (n); va = virtual_avail; pte = vtopte(va); /* * Crashdump maps. The first page is reused as CMAP1 for the * memory test. */ SYSMAP(caddr_t, CMAP1, crashdumpmap, MAXDUMPPGS) CADDR1 = crashdumpmap; virtual_avail = va; /* Initialize the PAT MSR. */ pmap_init_pat(); /* Initialize TLB Context Id. */ TUNABLE_INT_FETCH("vm.pmap.pcid_enabled", &pmap_pcid_enabled); if ((cpu_feature2 & CPUID2_PCID) != 0 && pmap_pcid_enabled) { /* Check for INVPCID support */ invpcid_works = (cpu_stdext_feature & CPUID_STDEXT_INVPCID) != 0; for (i = 0; i < MAXCPU; i++) { kernel_pmap->pm_pcids[i].pm_pcid = PMAP_PCID_KERN; kernel_pmap->pm_pcids[i].pm_gen = 1; } __pcpu[0].pc_pcid_next = PMAP_PCID_KERN + 1; __pcpu[0].pc_pcid_gen = 1; /* * pcpu area for APs is zeroed during AP startup. * pc_pcid_next and pc_pcid_gen are initialized by AP * during pcpu setup. */ load_cr4(rcr4() | CR4_PCIDE); } else { pmap_pcid_enabled = 0; } } /* * Setup the PAT MSR. */ void pmap_init_pat(void) { int pat_table[PAT_INDEX_SIZE]; uint64_t pat_msr; u_long cr0, cr4; int i; /* Bail if this CPU doesn't implement PAT. */ if ((cpu_feature & CPUID_PAT) == 0) panic("no PAT??"); /* Set default PAT index table. */ for (i = 0; i < PAT_INDEX_SIZE; i++) pat_table[i] = -1; pat_table[PAT_WRITE_BACK] = 0; pat_table[PAT_WRITE_THROUGH] = 1; pat_table[PAT_UNCACHEABLE] = 3; pat_table[PAT_WRITE_COMBINING] = 3; pat_table[PAT_WRITE_PROTECTED] = 3; pat_table[PAT_UNCACHED] = 3; /* Initialize default PAT entries. */ pat_msr = PAT_VALUE(0, PAT_WRITE_BACK) | PAT_VALUE(1, PAT_WRITE_THROUGH) | PAT_VALUE(2, PAT_UNCACHED) | PAT_VALUE(3, PAT_UNCACHEABLE) | PAT_VALUE(4, PAT_WRITE_BACK) | PAT_VALUE(5, PAT_WRITE_THROUGH) | PAT_VALUE(6, PAT_UNCACHED) | PAT_VALUE(7, PAT_UNCACHEABLE); if (pat_works) { /* * Leave the indices 0-3 at the default of WB, WT, UC-, and UC. * Program 5 and 6 as WP and WC. * Leave 4 and 7 as WB and UC. */ pat_msr &= ~(PAT_MASK(5) | PAT_MASK(6)); pat_msr |= PAT_VALUE(5, PAT_WRITE_PROTECTED) | PAT_VALUE(6, PAT_WRITE_COMBINING); pat_table[PAT_UNCACHED] = 2; pat_table[PAT_WRITE_PROTECTED] = 5; pat_table[PAT_WRITE_COMBINING] = 6; } else { /* * Just replace PAT Index 2 with WC instead of UC-. */ pat_msr &= ~PAT_MASK(2); pat_msr |= PAT_VALUE(2, PAT_WRITE_COMBINING); pat_table[PAT_WRITE_COMBINING] = 2; } /* Disable PGE. */ cr4 = rcr4(); load_cr4(cr4 & ~CR4_PGE); /* Disable caches (CD = 1, NW = 0). */ cr0 = rcr0(); load_cr0((cr0 & ~CR0_NW) | CR0_CD); /* Flushes caches and TLBs. */ wbinvd(); invltlb(); /* Update PAT and index table. 
*/ wrmsr(MSR_PAT, pat_msr); for (i = 0; i < PAT_INDEX_SIZE; i++) pat_index[i] = pat_table[i]; /* Flush caches and TLBs again. */ wbinvd(); invltlb(); /* Restore caches and PGE. */ load_cr0(cr0); load_cr4(cr4); } /* * Initialize a vm_page's machine-dependent fields. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.pv_list); m->md.pat_mode = PAT_WRITE_BACK; } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ void pmap_init(void) { struct pmap_preinit_mapping *ppim; vm_page_t mpte; vm_size_t s; int error, i, pv_npg; /* * Initialize the vm page array entries for the kernel pmap's * page table pages. */ for (i = 0; i < nkpt; i++) { mpte = PHYS_TO_VM_PAGE(KPTphys + (i << PAGE_SHIFT)); KASSERT(mpte >= vm_page_array && mpte < &vm_page_array[vm_page_array_size], ("pmap_init: page table page is out of range")); mpte->pindex = pmap_pde_pindex(KERNBASE) + i; mpte->phys_addr = KPTphys + (i << PAGE_SHIFT); } /* * If the kernel is running on a virtual machine, then it must assume * that MCA is enabled by the hypervisor. Moreover, the kernel must * be prepared for the hypervisor changing the vendor and family that * are reported by CPUID. Consequently, the workaround for AMD Family * 10h Erratum 383 is enabled if the processor's feature set does not * include at least one feature that is only supported by older Intel * or newer AMD processors. */ if (vm_guest != VM_GUEST_NO && (cpu_feature & CPUID_SS) == 0 && (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI | CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP | AMDID2_FMA4)) == 0) workaround_erratum383 = 1; /* * Are large page mappings enabled? */ TUNABLE_INT_FETCH("vm.pmap.pg_ps_enabled", &pg_ps_enabled); if (pg_ps_enabled) { KASSERT(MAXPAGESIZES > 1 && pagesizes[1] == 0, ("pmap_init: can't assign to pagesizes[1]")); pagesizes[1] = NBPDR; } /* * Initialize the pv chunk list mutex. */ mtx_init(&pv_chunks_mutex, "pmap pv chunk list", NULL, MTX_DEF); /* * Initialize the pool of pv list locks. */ for (i = 0; i < NPV_LIST_LOCKS; i++) rw_init(&pv_list_locks[i], "pmap pv list"); /* * Calculate the size of the pv head table for superpages. */ pv_npg = howmany(vm_phys_segs[vm_phys_nsegs - 1].end, NBPDR); /* * Allocate memory for the pv head table for superpages. 
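 * For example (sizes illustrative): if the last vm_phys segment ends at
 * 16GB, pv_npg = howmany(16GB, NBPDR = 2MB) = 8192 entries, and the
 * table occupies 8192 * sizeof(struct md_page) bytes, rounded up to
 * whole pages below.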
*/ s = (vm_size_t)(pv_npg * sizeof(struct md_page)); s = round_page(s); pv_table = (struct md_page *)kmem_malloc(kernel_arena, s, M_WAITOK | M_ZERO); for (i = 0; i < pv_npg; i++) TAILQ_INIT(&pv_table[i].pv_list); TAILQ_INIT(&pv_dummy.pv_list); pmap_initialized = 1; for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == 0) continue; /* Make the direct map consistent */ if (ppim->pa < dmaplimit && ppim->pa + ppim->sz < dmaplimit) { (void)pmap_change_attr(PHYS_TO_DMAP(ppim->pa), ppim->sz, ppim->mode); } if (!bootverbose) continue; printf("PPIM %u: PA=%#lx, VA=%#lx, size=%#lx, mode=%#x\n", i, ppim->pa, ppim->va, ppim->sz, ppim->mode); } mtx_init(&qframe_mtx, "qfrmlk", NULL, MTX_SPIN); error = vmem_alloc(kernel_arena, PAGE_SIZE, M_BESTFIT | M_WAITOK, (vmem_addr_t *)&qframe); if (error != 0) panic("qframe allocation failed"); } static SYSCTL_NODE(_vm_pmap, OID_AUTO, pde, CTLFLAG_RD, 0, "2MB page mapping counters"); static u_long pmap_pde_demotions; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, demotions, CTLFLAG_RD, &pmap_pde_demotions, 0, "2MB page demotions"); static u_long pmap_pde_mappings; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, mappings, CTLFLAG_RD, &pmap_pde_mappings, 0, "2MB page mappings"); static u_long pmap_pde_p_failures; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, p_failures, CTLFLAG_RD, &pmap_pde_p_failures, 0, "2MB page promotion failures"); static u_long pmap_pde_promotions; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, promotions, CTLFLAG_RD, &pmap_pde_promotions, 0, "2MB page promotions"); static SYSCTL_NODE(_vm_pmap, OID_AUTO, pdpe, CTLFLAG_RD, 0, "1GB page mapping counters"); static u_long pmap_pdpe_demotions; SYSCTL_ULONG(_vm_pmap_pdpe, OID_AUTO, demotions, CTLFLAG_RD, &pmap_pdpe_demotions, 0, "1GB page demotions"); /*************************************************** * Low level helper routines..... ***************************************************/ static pt_entry_t pmap_swap_pat(pmap_t pmap, pt_entry_t entry) { int x86_pat_bits = X86_PG_PTE_PAT | X86_PG_PDE_PAT; switch (pmap->pm_type) { case PT_X86: case PT_RVI: /* Verify that both PAT bits are not set at the same time */ KASSERT((entry & x86_pat_bits) != x86_pat_bits, ("Invalid PAT bits in entry %#lx", entry)); /* Swap the PAT bits if one of them is set */ if ((entry & x86_pat_bits) != 0) entry ^= x86_pat_bits; break; case PT_EPT: /* * Nothing to do - the memory attributes are represented * the same way for regular pages and superpages. */ break; default: panic("pmap_switch_pat_bits: bad pm_type %d", pmap->pm_type); } return (entry); } /* * Determine the appropriate bits to set in a PTE or PDE for a specified * caching mode. */ static int pmap_cache_bits(pmap_t pmap, int mode, boolean_t is_pde) { int cache_bits, pat_flag, pat_idx; if (mode < 0 || mode >= PAT_INDEX_SIZE || pat_index[mode] < 0) panic("Unknown caching mode %d\n", mode); switch (pmap->pm_type) { case PT_X86: case PT_RVI: /* The PAT bit is different for PTE's and PDE's. */ pat_flag = is_pde ? X86_PG_PDE_PAT : X86_PG_PTE_PAT; /* Map the caching mode to a PAT index. */ pat_idx = pat_index[mode]; /* Map the 3-bit index value into the PAT, PCD, and PWT bits. 
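 * E.g. when pat_works, PAT_WRITE_COMBINING maps to pat_idx = 6 (binary
 * 110), so the PAT flag and PG_NC_PCD are set while PG_NC_PWT is not.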
		 */
		cache_bits = 0;
		if (pat_idx & 0x4)
			cache_bits |= pat_flag;
		if (pat_idx & 0x2)
			cache_bits |= PG_NC_PCD;
		if (pat_idx & 0x1)
			cache_bits |= PG_NC_PWT;
		break;

	case PT_EPT:
		cache_bits = EPT_PG_IGNORE_PAT | EPT_PG_MEMORY_TYPE(mode);
		break;

	default:
		panic("unsupported pmap type %d", pmap->pm_type);
	}

	return (cache_bits);
}

static int
pmap_cache_mask(pmap_t pmap, boolean_t is_pde)
{
	int mask;

	switch (pmap->pm_type) {
	case PT_X86:
	case PT_RVI:
		mask = is_pde ? X86_PG_PDE_CACHE : X86_PG_PTE_CACHE;
		break;
	case PT_EPT:
		mask = EPT_PG_IGNORE_PAT | EPT_PG_MEMORY_TYPE(0x7);
		break;
	default:
		panic("pmap_cache_mask: invalid pm_type %d", pmap->pm_type);
	}

	return (mask);
}

static __inline boolean_t
pmap_ps_enabled(pmap_t pmap)
{

	return (pg_ps_enabled && (pmap->pm_flags & PMAP_PDE_SUPERPAGE) != 0);
}

static void
pmap_update_pde_store(pmap_t pmap, pd_entry_t *pde, pd_entry_t newpde)
{

	switch (pmap->pm_type) {
	case PT_X86:
		break;
	case PT_RVI:
	case PT_EPT:
		/*
		 * XXX
		 * This is a little bogus since the generation number is
		 * supposed to be bumped up when a region of the address
		 * space is invalidated in the page tables.
		 *
		 * In this case the old PDE entry is valid but yet we want
		 * to make sure that any mappings using the old entry are
		 * invalidated in the TLB.
		 *
		 * The reason this works as expected is because we rendezvous
		 * "all" host cpus and force any vcpu context to exit as a
		 * side-effect.
		 */
		atomic_add_acq_long(&pmap->pm_eptgen, 1);
		break;
	default:
		panic("pmap_update_pde_store: bad pm_type %d", pmap->pm_type);
	}

	pde_store(pde, newpde);
}

/*
 * After changing the page size for the specified virtual address in the page
 * table, flush the corresponding entries from the processor's TLB.  Only the
 * calling processor's TLB is affected.
 *
 * The calling thread must be pinned to a processor.
 */
static void
pmap_update_pde_invalidate(pmap_t pmap, vm_offset_t va, pd_entry_t newpde)
{
	pt_entry_t PG_G;

	if (pmap_type_guest(pmap))
		return;

	KASSERT(pmap->pm_type == PT_X86,
	    ("pmap_update_pde_invalidate: invalid type %d", pmap->pm_type));

	PG_G = pmap_global_bit(pmap);

	if ((newpde & PG_PS) == 0)
		/* Demotion: flush a specific 2MB page mapping. */
		invlpg(va);
	else if ((newpde & PG_G) == 0)
		/*
		 * Promotion: flush every 4KB page mapping from the TLB
		 * because there are too many to flush individually.
		 */
		invltlb();
	else {
		/*
		 * Promotion: flush every 4KB page mapping from the TLB,
		 * including any global (PG_G) mappings.
		 */
		invltlb_glob();
	}
}
#ifdef SMP
/*
 * For SMP, these functions have to use the IPI mechanism for coherence.
 *
 * N.B.: Before calling any of the following TLB invalidation functions,
 * the calling processor must ensure that all stores updating a non-
 * kernel page table are globally performed.  Otherwise, another
 * processor could cache an old, pre-update entry without being
 * invalidated.  This can happen one of two ways: (1) The pmap becomes
 * active on another processor after its pm_active field is checked by
 * one of the following functions but before a store updating the page
 * table is globally performed. (2) The pmap becomes active on another
 * processor before its pm_active field is checked but due to
 * speculative loads one of the following functions still reads the
 * pmap as inactive on the other processor.
 *
 * The kernel page table is exempt because its pm_active field is
 * immutable.  The kernel page table is always active on every
 * processor.
 */

/*
 * Interrupt the cpus that are executing in the guest context.
* This will force the vcpu to exit and the cached EPT mappings * will be invalidated by the host before the next vmresume. */ static __inline void pmap_invalidate_ept(pmap_t pmap) { int ipinum; sched_pin(); KASSERT(!CPU_ISSET(curcpu, &pmap->pm_active), ("pmap_invalidate_ept: absurd pm_active")); /* * The TLB mappings associated with a vcpu context are not * flushed each time a different vcpu is chosen to execute. * * This is in contrast with a process's vtop mappings that * are flushed from the TLB on each context switch. * * Therefore we need to do more than just a TLB shootdown on * the active cpus in 'pmap->pm_active'. To do this we keep * track of the number of invalidations performed on this pmap. * * Each vcpu keeps a cache of this counter and compares it * just before a vmresume. If the counter is out-of-date an * invept will be done to flush stale mappings from the TLB. */ atomic_add_acq_long(&pmap->pm_eptgen, 1); /* * Force the vcpu to exit and trap back into the hypervisor. */ ipinum = pmap->pm_flags & PMAP_NESTED_IPIMASK; ipi_selected(pmap->pm_active, ipinum); sched_unpin(); } void pmap_invalidate_page(pmap_t pmap, vm_offset_t va) { cpuset_t *mask; u_int cpuid, i; if (pmap_type_guest(pmap)) { pmap_invalidate_ept(pmap); return; } KASSERT(pmap->pm_type == PT_X86, ("pmap_invalidate_page: invalid type %d", pmap->pm_type)); sched_pin(); if (pmap == kernel_pmap) { invlpg(va); mask = &all_cpus; } else { cpuid = PCPU_GET(cpuid); if (pmap == PCPU_GET(curpmap)) invlpg(va); else if (pmap_pcid_enabled) pmap->pm_pcids[cpuid].pm_gen = 0; if (pmap_pcid_enabled) { CPU_FOREACH(i) { if (cpuid != i) pmap->pm_pcids[i].pm_gen = 0; } } mask = &pmap->pm_active; } smp_masked_invlpg(*mask, va); sched_unpin(); } /* 4k PTEs -- Chosen to exceed the total size of Broadwell L2 TLB */ #define PMAP_INVLPG_THRESHOLD (4 * 1024 * PAGE_SIZE) void pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { cpuset_t *mask; vm_offset_t addr; u_int cpuid, i; if (eva - sva >= PMAP_INVLPG_THRESHOLD) { pmap_invalidate_all(pmap); return; } if (pmap_type_guest(pmap)) { pmap_invalidate_ept(pmap); return; } KASSERT(pmap->pm_type == PT_X86, ("pmap_invalidate_range: invalid type %d", pmap->pm_type)); sched_pin(); cpuid = PCPU_GET(cpuid); if (pmap == kernel_pmap) { for (addr = sva; addr < eva; addr += PAGE_SIZE) invlpg(addr); mask = &all_cpus; } else { if (pmap == PCPU_GET(curpmap)) { for (addr = sva; addr < eva; addr += PAGE_SIZE) invlpg(addr); } else if (pmap_pcid_enabled) { pmap->pm_pcids[cpuid].pm_gen = 0; } if (pmap_pcid_enabled) { CPU_FOREACH(i) { if (cpuid != i) pmap->pm_pcids[i].pm_gen = 0; } } mask = &pmap->pm_active; } smp_masked_invlpg_range(*mask, sva, eva); sched_unpin(); } void pmap_invalidate_all(pmap_t pmap) { cpuset_t *mask; struct invpcid_descr d; u_int cpuid, i; if (pmap_type_guest(pmap)) { pmap_invalidate_ept(pmap); return; } KASSERT(pmap->pm_type == PT_X86, ("pmap_invalidate_all: invalid type %d", pmap->pm_type)); sched_pin(); if (pmap == kernel_pmap) { if (pmap_pcid_enabled && invpcid_works) { bzero(&d, sizeof(d)); invpcid(&d, INVPCID_CTXGLOB); } else { invltlb_glob(); } mask = &all_cpus; } else { cpuid = PCPU_GET(cpuid); if (pmap == PCPU_GET(curpmap)) { if (pmap_pcid_enabled) { if (invpcid_works) { d.pcid = pmap->pm_pcids[cpuid].pm_pcid; d.pad = 0; d.addr = 0; invpcid(&d, INVPCID_CTX); } else { load_cr3(pmap->pm_cr3 | pmap->pm_pcids [PCPU_GET(cpuid)].pm_pcid); } } else { invltlb(); } } else if (pmap_pcid_enabled) { pmap->pm_pcids[cpuid].pm_gen = 0; } if (pmap_pcid_enabled) { CPU_FOREACH(i) { if 
(cpuid != i)
					pmap->pm_pcids[i].pm_gen = 0;
			}
		}
		mask = &pmap->pm_active;
	}
	smp_masked_invltlb(*mask, pmap);
	sched_unpin();
}

void
pmap_invalidate_cache(void)
{

	sched_pin();
	wbinvd();
	smp_cache_flush();
	sched_unpin();
}

struct pde_action {
	cpuset_t invalidate;	/* processors that invalidate their TLB */
	pmap_t pmap;
	vm_offset_t va;
	pd_entry_t *pde;
	pd_entry_t newpde;
	u_int store;		/* processor that updates the PDE */
};

static void
pmap_update_pde_action(void *arg)
{
	struct pde_action *act = arg;

	if (act->store == PCPU_GET(cpuid))
		pmap_update_pde_store(act->pmap, act->pde, act->newpde);
}

static void
pmap_update_pde_teardown(void *arg)
{
	struct pde_action *act = arg;

	if (CPU_ISSET(PCPU_GET(cpuid), &act->invalidate))
		pmap_update_pde_invalidate(act->pmap, act->va, act->newpde);
}

/*
 * Change the page size for the specified virtual address in a way that
 * prevents any possibility of the TLB ever having two entries that map the
 * same virtual address using different page sizes.  This is the recommended
 * workaround for Erratum 383 on AMD Family 10h processors.  It prevents a
 * machine check exception for a TLB state that is improperly diagnosed as a
 * hardware error.
 */
static void
pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde)
{
	struct pde_action act;
	cpuset_t active, other_cpus;
	u_int cpuid;

	sched_pin();
	cpuid = PCPU_GET(cpuid);
	other_cpus = all_cpus;
	CPU_CLR(cpuid, &other_cpus);
	if (pmap == kernel_pmap || pmap_type_guest(pmap))
		active = all_cpus;
	else {
		active = pmap->pm_active;
	}
	if (CPU_OVERLAP(&active, &other_cpus)) {
		act.store = cpuid;
		act.invalidate = active;
		act.va = va;
		act.pmap = pmap;
		act.pde = pde;
		act.newpde = newpde;
		CPU_SET(cpuid, &active);
		smp_rendezvous_cpus(active, smp_no_rendevous_barrier,
		    pmap_update_pde_action, pmap_update_pde_teardown, &act);
	} else {
		pmap_update_pde_store(pmap, pde, newpde);
		if (CPU_ISSET(cpuid, &active))
			pmap_update_pde_invalidate(pmap, va, newpde);
	}
	sched_unpin();
}
#else /* !SMP */
/*
 * Normal, non-SMP, invalidation functions.
 */
void
pmap_invalidate_page(pmap_t pmap, vm_offset_t va)
{

	if (pmap->pm_type == PT_RVI || pmap->pm_type == PT_EPT) {
		pmap->pm_eptgen++;
		return;
	}
	KASSERT(pmap->pm_type == PT_X86,
	    ("pmap_invalidate_page: unknown type %d", pmap->pm_type));

	if (pmap == kernel_pmap || pmap == PCPU_GET(curpmap))
		invlpg(va);
	else if (pmap_pcid_enabled)
		pmap->pm_pcids[0].pm_gen = 0;
}

void
pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva)
{
	vm_offset_t addr;

	if (pmap->pm_type == PT_RVI || pmap->pm_type == PT_EPT) {
		pmap->pm_eptgen++;
		return;
	}
	KASSERT(pmap->pm_type == PT_X86,
	    ("pmap_invalidate_range: unknown type %d", pmap->pm_type));

	if (pmap == kernel_pmap || pmap == PCPU_GET(curpmap)) {
		for (addr = sva; addr < eva; addr += PAGE_SIZE)
			invlpg(addr);
	} else if (pmap_pcid_enabled) {
		pmap->pm_pcids[0].pm_gen = 0;
	}
}

void
pmap_invalidate_all(pmap_t pmap)
{
	struct invpcid_descr d;

	if (pmap->pm_type == PT_RVI || pmap->pm_type == PT_EPT) {
		pmap->pm_eptgen++;
		return;
	}
	KASSERT(pmap->pm_type == PT_X86,
	    ("pmap_invalidate_all: unknown type %d", pmap->pm_type));

	if (pmap == kernel_pmap) {
		if (pmap_pcid_enabled && invpcid_works) {
			bzero(&d, sizeof(d));
			invpcid(&d, INVPCID_CTXGLOB);
		} else {
			invltlb_glob();
		}
	} else if (pmap == PCPU_GET(curpmap)) {
		if (pmap_pcid_enabled) {
			if (invpcid_works) {
				d.pcid = pmap->pm_pcids[0].pm_pcid;
				d.pad = 0;
				d.addr = 0;
				invpcid(&d, INVPCID_CTX);
			} else {
				load_cr3(pmap->pm_cr3 |
				    pmap->pm_pcids[0].pm_pcid);
			}
		} else {
			invltlb();
		}
	} else if (pmap_pcid_enabled) {
		pmap->pm_pcids[0].pm_gen = 0;
	}
}

PMAP_INLINE void
pmap_invalidate_cache(void)
{

	wbinvd();
}

static void
pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde)
{

	pmap_update_pde_store(pmap, pde, newpde);
	if (pmap == kernel_pmap || pmap == PCPU_GET(curpmap))
		pmap_update_pde_invalidate(pmap, va, newpde);
	else
		pmap->pm_pcids[0].pm_gen = 0;
}
#endif /* !SMP */

#define PMAP_CLFLUSH_THRESHOLD	(2 * 1024 * 1024)

void
pmap_invalidate_cache_range(vm_offset_t sva, vm_offset_t eva, boolean_t force)
{

	if (force) {
		/* Align the start address on a cache line boundary. */
		sva &= ~(vm_offset_t)(cpu_clflush_line_size - 1);
	} else {
		KASSERT((sva & PAGE_MASK) == 0,
		    ("pmap_invalidate_cache_range: sva not page-aligned"));
		KASSERT((eva & PAGE_MASK) == 0,
		    ("pmap_invalidate_cache_range: eva not page-aligned"));
	}

	if ((cpu_feature & CPUID_SS) != 0 && !force)
		; /* If "Self Snoop" is supported and allowed, do nothing. */
	else if ((cpu_stdext_feature & CPUID_STDEXT_CLFLUSHOPT) != 0 &&
	    eva - sva < PMAP_CLFLUSH_THRESHOLD) {
		/*
		 * XXX: Some CPUs fault, hang, or trash the local APIC
		 * registers if we use CLFLUSH on the local APIC
		 * range.  The local APIC is always uncached, so we
		 * don't need to flush for that range anyway.
		 */
		if (pmap_kextract(sva) == lapic_paddr)
			return;
		/*
		 * Otherwise, do per-cache line flush.  Use the mfence
		 * instruction to ensure that previous stores are
		 * included in the write-back.  The processor
		 * propagates flush to other processors in the cache
		 * coherence domain.
		 */
		mfence();
		for (; sva < eva; sva += cpu_clflush_line_size)
			clflushopt(sva);
		mfence();
	} else if ((cpu_feature & CPUID_CLFSH) != 0 &&
	    eva - sva < PMAP_CLFLUSH_THRESHOLD) {
		if (pmap_kextract(sva) == lapic_paddr)
			return;
		/*
		 * Writes are ordered by CLFLUSH on Intel CPUs.
		 */
		if (cpu_vendor_id != CPU_VENDOR_INTEL)
			mfence();
		for (; sva < eva; sva += cpu_clflush_line_size)
			clflush(sva);
		if (cpu_vendor_id != CPU_VENDOR_INTEL)
			mfence();
	} else {
		/*
		 * No targeted cache flush methods are supported by CPU,
		 * or the supplied range is bigger than 2MB.
		 * Globally invalidate cache.
		 */
		pmap_invalidate_cache();
	}
}

/*
 * Remove the specified set of pages from the data and instruction caches.
 *
 * In contrast to pmap_invalidate_cache_range(), this function does not
 * rely on the CPU's self-snoop feature, because it is intended for use
 * when moving pages into a different cache domain.
 */
void
pmap_invalidate_cache_pages(vm_page_t *pages, int count)
{
	vm_offset_t daddr, eva;
	int i;
	bool useclflushopt;

	useclflushopt = (cpu_stdext_feature & CPUID_STDEXT_CLFLUSHOPT) != 0;
	if (count >= PMAP_CLFLUSH_THRESHOLD / PAGE_SIZE ||
	    ((cpu_feature & CPUID_CLFSH) == 0 && !useclflushopt))
		pmap_invalidate_cache();
	else {
		if (useclflushopt || cpu_vendor_id != CPU_VENDOR_INTEL)
			mfence();
		for (i = 0; i < count; i++) {
			daddr = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pages[i]));
			eva = daddr + PAGE_SIZE;
			for (; daddr < eva; daddr += cpu_clflush_line_size) {
				if (useclflushopt)
					clflushopt(daddr);
				else
					clflush(daddr);
			}
		}
		if (useclflushopt || cpu_vendor_id != CPU_VENDOR_INTEL)
			mfence();
	}
}

/*
 * Routine:	pmap_extract
 * Function:
 *	Extract the physical page address associated
 *	with the given map/virtual_address pair.
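 *
 *	(Explanatory sketch, added for clarity; it only restates the three
 *	cases the walk below distinguishes:)
 *
 *	1GB PDPE mapping:  pa = (*pdpe & PG_PS_FRAME) | (va & PDPMASK)
 *	2MB PDE mapping:   pa = (*pde & PG_PS_FRAME) | (va & PDRMASK)
 *	4KB PTE mapping:   pa = (*pte & PG_FRAME) | (va & PAGE_MASK)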
*/ vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va) { pdp_entry_t *pdpe; pd_entry_t *pde; pt_entry_t *pte, PG_V; vm_paddr_t pa; pa = 0; PG_V = pmap_valid_bit(pmap); PMAP_LOCK(pmap); pdpe = pmap_pdpe(pmap, va); if (pdpe != NULL && (*pdpe & PG_V) != 0) { if ((*pdpe & PG_PS) != 0) pa = (*pdpe & PG_PS_FRAME) | (va & PDPMASK); else { pde = pmap_pdpe_to_pde(pdpe, va); if ((*pde & PG_V) != 0) { if ((*pde & PG_PS) != 0) { pa = (*pde & PG_PS_FRAME) | (va & PDRMASK); } else { pte = pmap_pde_to_pte(pde, va); pa = (*pte & PG_FRAME) | (va & PAGE_MASK); } } } } PMAP_UNLOCK(pmap); return (pa); } /* * Routine: pmap_extract_and_hold * Function: * Atomically extract and hold the physical page * with the given pmap and virtual address pair * if that mapping permits the given protection. */ vm_page_t pmap_extract_and_hold(pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pd_entry_t pde, *pdep; pt_entry_t pte, PG_RW, PG_V; vm_paddr_t pa; vm_page_t m; pa = 0; m = NULL; PG_RW = pmap_rw_bit(pmap); PG_V = pmap_valid_bit(pmap); PMAP_LOCK(pmap); retry: pdep = pmap_pde(pmap, va); if (pdep != NULL && (pde = *pdep)) { if (pde & PG_PS) { if ((pde & PG_RW) || (prot & VM_PROT_WRITE) == 0) { if (vm_page_pa_tryrelock(pmap, (pde & PG_PS_FRAME) | (va & PDRMASK), &pa)) goto retry; m = PHYS_TO_VM_PAGE((pde & PG_PS_FRAME) | (va & PDRMASK)); vm_page_hold(m); } } else { pte = *pmap_pde_to_pte(pdep, va); if ((pte & PG_V) && ((pte & PG_RW) || (prot & VM_PROT_WRITE) == 0)) { if (vm_page_pa_tryrelock(pmap, pte & PG_FRAME, &pa)) goto retry; m = PHYS_TO_VM_PAGE(pte & PG_FRAME); vm_page_hold(m); } } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } vm_paddr_t pmap_kextract(vm_offset_t va) { pd_entry_t pde; vm_paddr_t pa; if (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS) { pa = DMAP_TO_PHYS(va); } else { pde = *vtopde(va); if (pde & PG_PS) { pa = (pde & PG_PS_FRAME) | (va & PDRMASK); } else { /* * Beware of a concurrent promotion that changes the * PDE at this point! For example, vtopte() must not * be used to access the PTE because it would use the * new PDE. It is, however, safe to use the old PDE * because the page table page is preserved by the * promotion. */ pa = *pmap_pde_to_pte(&pde, va); pa = (pa & PG_FRAME) | (va & PAGE_MASK); } } return (pa); } /*************************************************** * Low level mapping routines..... ***************************************************/ /* * Add a wired page to the kva. * Note: not SMP coherent. */ PMAP_INLINE void pmap_kenter(vm_offset_t va, vm_paddr_t pa) { pt_entry_t *pte; pte = vtopte(va); pte_store(pte, pa | X86_PG_RW | X86_PG_V | X86_PG_G); } static __inline void pmap_kenter_attr(vm_offset_t va, vm_paddr_t pa, int mode) { pt_entry_t *pte; int cache_bits; pte = vtopte(va); cache_bits = pmap_cache_bits(kernel_pmap, mode, 0); pte_store(pte, pa | X86_PG_RW | X86_PG_V | X86_PG_G | cache_bits); } /* * Remove a page from the kernel pagetables. * Note: not SMP coherent. */ PMAP_INLINE void pmap_kremove(vm_offset_t va) { pt_entry_t *pte; pte = vtopte(va); pte_clear(pte); } /* * Used to map a range of physical addresses into kernel * virtual address space. * * The value passed in '*virt' is a suggested virtual address for * the mapping. Architectures which can support a direct-mapped * physical to virtual region can return the appropriate address * within that region, leaving '*virt' unchanged. Other * architectures should map the pages starting at '*virt' and * update '*virt' with the first usable address after the mapped * region. 
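 *
 * (Illustrative note, added: amd64 takes the direct-map shortcut below,
 * so a caller written as, with hypothetical names,
 *
 *	va = some_suggested_va;
 *	va_mapped = pmap_map(&va, pa_start, pa_end, VM_PROT_READ);
 *
 * receives PHYS_TO_DMAP(pa_start) while 'va' itself is left unchanged.)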
*/ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { return PHYS_TO_DMAP(start); } /* * Add a list of wired pages to the kva * this routine is only used for temporary * kernel mappings that do not need to have * page modification or references recorded. * Note that old mappings are simply written * over. The page *must* be wired. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qenter(vm_offset_t sva, vm_page_t *ma, int count) { pt_entry_t *endpte, oldpte, pa, *pte; vm_page_t m; int cache_bits; oldpte = 0; pte = vtopte(sva); endpte = pte + count; while (pte < endpte) { m = *ma++; cache_bits = pmap_cache_bits(kernel_pmap, m->md.pat_mode, 0); pa = VM_PAGE_TO_PHYS(m) | cache_bits; if ((*pte & (PG_FRAME | X86_PG_PTE_CACHE)) != pa) { oldpte |= *pte; pte_store(pte, pa | X86_PG_G | X86_PG_RW | X86_PG_V); } pte++; } if (__predict_false((oldpte & X86_PG_V) != 0)) pmap_invalidate_range(kernel_pmap, sva, sva + count * PAGE_SIZE); } /* * This routine tears out page mappings from the * kernel -- it is meant only for temporary mappings. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qremove(vm_offset_t sva, int count) { vm_offset_t va; va = sva; while (count-- > 0) { KASSERT(va >= VM_MIN_KERNEL_ADDRESS, ("usermode va %lx", va)); pmap_kremove(va); va += PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /*************************************************** * Page table page management routines..... ***************************************************/ static __inline void pmap_free_zero_pages(struct spglist *free) { vm_page_t m; while ((m = SLIST_FIRST(free)) != NULL) { SLIST_REMOVE_HEAD(free, plinks.s.ss); /* Preserve the page's PG_ZERO setting. */ vm_page_free_toq(m); } } /* * Schedule the specified unused page table page to be freed. Specifically, * add the page to the specified list of pages that will be released to the * physical memory manager after the TLB has been updated. */ static __inline void pmap_add_delayed_free_list(vm_page_t m, struct spglist *free, boolean_t set_PG_ZERO) { if (set_PG_ZERO) m->flags |= PG_ZERO; else m->flags &= ~PG_ZERO; SLIST_INSERT_HEAD(free, m, plinks.s.ss); } /* * Inserts the specified page table page into the specified pmap's collection * of idle page table pages. Each of a pmap's page table pages is responsible * for mapping a distinct range of virtual addresses. The pmap's collection is * ordered by this virtual address range. */ static __inline int pmap_insert_pt_page(pmap_t pmap, vm_page_t mpte) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); return (vm_radix_insert(&pmap->pm_root, mpte)); } /* * Looks for a page table page mapping the specified virtual address in the * specified pmap's collection of idle page table pages. Returns NULL if there * is no page table page corresponding to the specified virtual address. */ static __inline vm_page_t pmap_lookup_pt_page(pmap_t pmap, vm_offset_t va) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); return (vm_radix_lookup(&pmap->pm_root, pmap_pde_pindex(va))); } /* * Removes the specified page table page from the specified pmap's collection * of idle page table pages. The specified page table page must be a member of * the pmap's collection. */ static __inline void pmap_remove_pt_page(pmap_t pmap, vm_page_t mpte) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); vm_radix_remove(&pmap->pm_root, mpte->pindex); } /* * Decrements a page table page's wire count, which is used to record the * number of valid page table entries within the page. 
If the wire count * drops to zero, then the page table page is unmapped. Returns TRUE if the * page table page was unmapped and FALSE otherwise. */ static inline boolean_t pmap_unwire_ptp(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { --m->wire_count; if (m->wire_count == 0) { _pmap_unwire_ptp(pmap, va, m, free); return (TRUE); } else return (FALSE); } static void _pmap_unwire_ptp(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * unmap the page table page */ if (m->pindex >= (NUPDE + NUPDPE)) { /* PDP page */ pml4_entry_t *pml4; pml4 = pmap_pml4e(pmap, va); *pml4 = 0; } else if (m->pindex >= NUPDE) { /* PD page */ pdp_entry_t *pdp; pdp = pmap_pdpe(pmap, va); *pdp = 0; } else { /* PTE page */ pd_entry_t *pd; pd = pmap_pde(pmap, va); *pd = 0; } pmap_resident_count_dec(pmap, 1); if (m->pindex < NUPDE) { /* We just released a PT, unhold the matching PD */ vm_page_t pdpg; pdpg = PHYS_TO_VM_PAGE(*pmap_pdpe(pmap, va) & PG_FRAME); pmap_unwire_ptp(pmap, va, pdpg, free); } if (m->pindex >= NUPDE && m->pindex < (NUPDE + NUPDPE)) { /* We just released a PD, unhold the matching PDP */ vm_page_t pdppg; pdppg = PHYS_TO_VM_PAGE(*pmap_pml4e(pmap, va) & PG_FRAME); pmap_unwire_ptp(pmap, va, pdppg, free); } /* * This is a release store so that the ordinary store unmapping * the page table page is globally performed before TLB shoot- * down is begun. */ atomic_subtract_rel_int(&vm_cnt.v_wire_count, 1); /* * Put page on a list so that it is released after * *ALL* TLB shootdown is done */ pmap_add_delayed_free_list(m, free, TRUE); } /* * After removing a page table entry, this routine is used to * conditionally free the page, and manage the hold/wire counts. */ static int pmap_unuse_pt(pmap_t pmap, vm_offset_t va, pd_entry_t ptepde, struct spglist *free) { vm_page_t mpte; if (va >= VM_MAXUSER_ADDRESS) return (0); KASSERT(ptepde != 0, ("pmap_unuse_pt: ptepde != 0")); mpte = PHYS_TO_VM_PAGE(ptepde & PG_FRAME); return (pmap_unwire_ptp(pmap, va, mpte, free)); } void pmap_pinit0(pmap_t pmap) { int i; PMAP_LOCK_INIT(pmap); pmap->pm_pml4 = (pml4_entry_t *)PHYS_TO_DMAP(KPML4phys); pmap->pm_cr3 = KPML4phys; pmap->pm_root.rt_root = 0; CPU_ZERO(&pmap->pm_active); TAILQ_INIT(&pmap->pm_pvchunk); bzero(&pmap->pm_stats, sizeof pmap->pm_stats); pmap->pm_flags = pmap_flags; CPU_FOREACH(i) { pmap->pm_pcids[i].pm_pcid = PMAP_PCID_NONE; pmap->pm_pcids[i].pm_gen = 0; } PCPU_SET(curpmap, kernel_pmap); pmap_activate(curthread); CPU_FILL(&kernel_pmap->pm_active); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ int pmap_pinit_type(pmap_t pmap, enum pmap_type pm_type, int flags) { vm_page_t pml4pg; vm_paddr_t pml4phys; int i; /* * allocate the page directory page */ while ((pml4pg = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) VM_WAIT; pml4phys = VM_PAGE_TO_PHYS(pml4pg); pmap->pm_pml4 = (pml4_entry_t *)PHYS_TO_DMAP(pml4phys); CPU_FOREACH(i) { pmap->pm_pcids[i].pm_pcid = PMAP_PCID_NONE; pmap->pm_pcids[i].pm_gen = 0; } pmap->pm_cr3 = ~0; /* initialize to an invalid value */ if ((pml4pg->flags & PG_ZERO) == 0) pagezero(pmap->pm_pml4); /* * Do not install the host kernel mappings in the nested page * tables. These mappings are meaningless in the guest physical * address space. */ if ((pmap->pm_type = pm_type) == PT_X86) { pmap->pm_cr3 = pml4phys; /* Wire in kernel global address entries. 
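		 * (Layout sketch, added for clarity: NKPML4E slots starting
		 * at KPML4BASE map the kernel, ndmpdpphys slots starting at
		 * DMPML4I map the direct map, and the single PML4PML4I slot
		 * installed below points the PML4 at itself, providing the
		 * recursive page-table mapping.)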
		 */
		for (i = 0; i < NKPML4E; i++) {
			pmap->pm_pml4[KPML4BASE + i] = (KPDPphys + ptoa(i)) |
			    X86_PG_RW | X86_PG_V | PG_U;
		}
		for (i = 0; i < ndmpdpphys; i++) {
			pmap->pm_pml4[DMPML4I + i] = (DMPDPphys + ptoa(i)) |
			    X86_PG_RW | X86_PG_V | PG_U;
		}

		/* install self-referential address mapping entry(s) */
		pmap->pm_pml4[PML4PML4I] = VM_PAGE_TO_PHYS(pml4pg) |
		    X86_PG_V | X86_PG_RW | X86_PG_A | X86_PG_M;
	}

	pmap->pm_root.rt_root = 0;
	CPU_ZERO(&pmap->pm_active);
	TAILQ_INIT(&pmap->pm_pvchunk);
	bzero(&pmap->pm_stats, sizeof pmap->pm_stats);
	pmap->pm_flags = flags;
	pmap->pm_eptgen = 0;

	return (1);
}

int
pmap_pinit(pmap_t pmap)
{

	return (pmap_pinit_type(pmap, PT_X86, pmap_flags));
}

/*
 * This routine is called if the desired page table page does not exist.
 *
 * If page table page allocation fails, this routine may sleep before
 * returning NULL.  It sleeps only if a lock pointer was given.
 *
 * Note: If a page allocation fails at page table level two or three,
 * one or two pages may be held during the wait, only to be released
 * afterwards.  This conservative approach is easily argued to avoid
 * race conditions.
 */
static vm_page_t
_pmap_allocpte(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock **lockp)
{
	vm_page_t m, pdppg, pdpg;
	pt_entry_t PG_A, PG_M, PG_RW, PG_V;

	PMAP_LOCK_ASSERT(pmap, MA_OWNED);

	PG_A = pmap_accessed_bit(pmap);
	PG_M = pmap_modified_bit(pmap);
	PG_V = pmap_valid_bit(pmap);
	PG_RW = pmap_rw_bit(pmap);

	/*
	 * Allocate a page table page.
	 */
	if ((m = vm_page_alloc(NULL, ptepindex, VM_ALLOC_NOOBJ |
	    VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) {
		if (lockp != NULL) {
			RELEASE_PV_LIST_LOCK(lockp);
			PMAP_UNLOCK(pmap);
			PMAP_ASSERT_NOT_IN_DI();
			VM_WAIT;
			PMAP_LOCK(pmap);
		}

		/*
		 * Indicate the need to retry.  While waiting, the page table
		 * page may have been allocated.
		 */
		return (NULL);
	}
	if ((m->flags & PG_ZERO) == 0)
		pmap_zero_page(m);

	/*
	 * Map the pagetable page into the process address space, if
	 * it isn't already there.
	 */

	if (ptepindex >= (NUPDE + NUPDPE)) {
		pml4_entry_t *pml4;
		vm_pindex_t pml4index;

		/* Wire up a new PDPE page */
		pml4index = ptepindex - (NUPDE + NUPDPE);
		pml4 = &pmap->pm_pml4[pml4index];
		*pml4 = VM_PAGE_TO_PHYS(m) | PG_U | PG_RW | PG_V | PG_A | PG_M;

	} else if (ptepindex >= NUPDE) {
		vm_pindex_t pml4index;
		vm_pindex_t pdpindex;
		pml4_entry_t *pml4;
		pdp_entry_t *pdp;

		/* Wire up a new PDE page */
		pdpindex = ptepindex - NUPDE;
		pml4index = pdpindex >> NPML4EPGSHIFT;

		pml4 = &pmap->pm_pml4[pml4index];
		if ((*pml4 & PG_V) == 0) {
			/* Have to allocate a new pdp, recurse */
			if (_pmap_allocpte(pmap, NUPDE + NUPDPE + pml4index,
			    lockp) == NULL) {
				--m->wire_count;
				atomic_subtract_int(&vm_cnt.v_wire_count, 1);
				vm_page_free_zero(m);
				return (NULL);
			}
		} else {
			/* Add reference to pdp page */
			pdppg = PHYS_TO_VM_PAGE(*pml4 & PG_FRAME);
			pdppg->wire_count++;
		}
		pdp = (pdp_entry_t *)PHYS_TO_DMAP(*pml4 & PG_FRAME);

		/* Now find the pdp page */
		pdp = &pdp[pdpindex & ((1ul << NPDPEPGSHIFT) - 1)];
		*pdp = VM_PAGE_TO_PHYS(m) | PG_U | PG_RW | PG_V | PG_A | PG_M;

	} else {
		vm_pindex_t pml4index;
		vm_pindex_t pdpindex;
		pml4_entry_t *pml4;
		pdp_entry_t *pdp;
		pd_entry_t *pd;

		/* Wire up a new PTE page */
		pdpindex = ptepindex >> NPDPEPGSHIFT;
		pml4index = pdpindex >> NPML4EPGSHIFT;

		/* First, find the pdp and check that it's valid.
*/ pml4 = &pmap->pm_pml4[pml4index]; if ((*pml4 & PG_V) == 0) { /* Have to allocate a new pd, recurse */ if (_pmap_allocpte(pmap, NUPDE + pdpindex, lockp) == NULL) { --m->wire_count; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } pdp = (pdp_entry_t *)PHYS_TO_DMAP(*pml4 & PG_FRAME); pdp = &pdp[pdpindex & ((1ul << NPDPEPGSHIFT) - 1)]; } else { pdp = (pdp_entry_t *)PHYS_TO_DMAP(*pml4 & PG_FRAME); pdp = &pdp[pdpindex & ((1ul << NPDPEPGSHIFT) - 1)]; if ((*pdp & PG_V) == 0) { /* Have to allocate a new pd, recurse */ if (_pmap_allocpte(pmap, NUPDE + pdpindex, lockp) == NULL) { --m->wire_count; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } } else { /* Add reference to the pd page */ pdpg = PHYS_TO_VM_PAGE(*pdp & PG_FRAME); pdpg->wire_count++; } } pd = (pd_entry_t *)PHYS_TO_DMAP(*pdp & PG_FRAME); /* Now we know where the page directory page is */ pd = &pd[ptepindex & ((1ul << NPDEPGSHIFT) - 1)]; *pd = VM_PAGE_TO_PHYS(m) | PG_U | PG_RW | PG_V | PG_A | PG_M; } pmap_resident_count_inc(pmap, 1); return (m); } static vm_page_t pmap_allocpde(pmap_t pmap, vm_offset_t va, struct rwlock **lockp) { vm_pindex_t pdpindex, ptepindex; pdp_entry_t *pdpe, PG_V; vm_page_t pdpg; PG_V = pmap_valid_bit(pmap); retry: pdpe = pmap_pdpe(pmap, va); if (pdpe != NULL && (*pdpe & PG_V) != 0) { /* Add a reference to the pd page. */ pdpg = PHYS_TO_VM_PAGE(*pdpe & PG_FRAME); pdpg->wire_count++; } else { /* Allocate a pd page. */ ptepindex = pmap_pde_pindex(va); pdpindex = ptepindex >> NPDPEPGSHIFT; pdpg = _pmap_allocpte(pmap, NUPDE + pdpindex, lockp); if (pdpg == NULL && lockp != NULL) goto retry; } return (pdpg); } static vm_page_t pmap_allocpte(pmap_t pmap, vm_offset_t va, struct rwlock **lockp) { vm_pindex_t ptepindex; pd_entry_t *pd, PG_V; vm_page_t m; PG_V = pmap_valid_bit(pmap); /* * Calculate pagetable page index */ ptepindex = pmap_pde_pindex(va); retry: /* * Get the page directory entry */ pd = pmap_pde(pmap, va); /* * This supports switching from a 2MB page to a * normal 4K page. */ if (pd != NULL && (*pd & (PG_PS | PG_V)) == (PG_PS | PG_V)) { if (!pmap_demote_pde_locked(pmap, pd, va, lockp)) { /* * Invalidation of the 2MB page mapping may have caused * the deallocation of the underlying PD page. */ pd = NULL; } } /* * If the page table page is mapped, we just increment the * hold count, and activate it. */ if (pd != NULL && (*pd & PG_V) != 0) { m = PHYS_TO_VM_PAGE(*pd & PG_FRAME); m->wire_count++; } else { /* * Here if the pte page isn't mapped, or if it has been * deallocated. */ m = _pmap_allocpte(pmap, ptepindex, lockp); if (m == NULL && lockp != NULL) goto retry; } return (m); } /*************************************************** * Pmap allocation/deallocation routines. ***************************************************/ /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. * Should only be called if the map contains no valid mappings. 
*/ void pmap_release(pmap_t pmap) { vm_page_t m; int i; KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); KASSERT(vm_radix_is_empty(&pmap->pm_root), ("pmap_release: pmap has reserved page table page(s)")); KASSERT(CPU_EMPTY(&pmap->pm_active), ("releasing active pmap %p", pmap)); m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pmap->pm_pml4)); for (i = 0; i < NKPML4E; i++) /* KVA */ pmap->pm_pml4[KPML4BASE + i] = 0; for (i = 0; i < ndmpdpphys; i++)/* Direct Map */ pmap->pm_pml4[DMPML4I + i] = 0; pmap->pm_pml4[PML4PML4I] = 0; /* Recursive Mapping */ m->wire_count--; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); } static int kvm_size(SYSCTL_HANDLER_ARGS) { unsigned long ksize = VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS; return sysctl_handle_long(oidp, &ksize, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_size, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_size, "LU", "Size of KVM"); static int kvm_free(SYSCTL_HANDLER_ARGS) { unsigned long kfree = VM_MAX_KERNEL_ADDRESS - kernel_vm_end; return sysctl_handle_long(oidp, &kfree, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_free, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_free, "LU", "Amount of KVM free"); /* * grow the number of kernel page table entries, if needed */ void pmap_growkernel(vm_offset_t addr) { vm_paddr_t paddr; vm_page_t nkpg; pd_entry_t *pde, newpdir; pdp_entry_t *pdpe; mtx_assert(&kernel_map->system_mtx, MA_OWNED); /* * Return if "addr" is within the range of kernel page table pages * that were preallocated during pmap bootstrap. Moreover, leave * "kernel_vm_end" and the kernel page table as they were. * * The correctness of this action is based on the following * argument: vm_map_insert() allocates contiguous ranges of the * kernel virtual address space. It calls this function if a range * ends after "kernel_vm_end". If the kernel is mapped between * "kernel_vm_end" and "addr", then the range cannot begin at * "kernel_vm_end". In fact, its beginning address cannot be less * than the kernel. Thus, there is no immediate need to allocate * any new kernel page table pages between "kernel_vm_end" and * "KERNBASE". 
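 *
 * (Worked example, added with made-up numbers: NBPDR is 2MB, so a
 * request ending one page past "kernel_vm_end" is first rounded up by
 * roundup2(addr, NBPDR) to the next 2MB boundary; the loop below then
 * grows the kernel page tables one full page directory entry at a
 * time.)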
*/ if (KERNBASE < addr && addr <= KERNBASE + nkpt * NBPDR) return; addr = roundup2(addr, NBPDR); if (addr - 1 >= kernel_map->max_offset) addr = kernel_map->max_offset; while (kernel_vm_end < addr) { pdpe = pmap_pdpe(kernel_pmap, kernel_vm_end); if ((*pdpe & X86_PG_V) == 0) { /* We need a new PDP entry */ nkpg = vm_page_alloc(NULL, kernel_vm_end >> PDPSHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); paddr = VM_PAGE_TO_PHYS(nkpg); *pdpe = (pdp_entry_t)(paddr | X86_PG_V | X86_PG_RW | X86_PG_A | X86_PG_M); continue; /* try again */ } pde = pmap_pdpe_to_pde(pdpe, kernel_vm_end); if ((*pde & X86_PG_V) != 0) { kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } continue; } nkpg = vm_page_alloc(NULL, pmap_pde_pindex(kernel_vm_end), VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); paddr = VM_PAGE_TO_PHYS(nkpg); newpdir = paddr | X86_PG_V | X86_PG_RW | X86_PG_A | X86_PG_M; pde_store(pde, newpdir); kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } } } /*************************************************** * page management routines. ***************************************************/ CTASSERT(sizeof(struct pv_chunk) == PAGE_SIZE); CTASSERT(_NPCM == 3); CTASSERT(_NPCPV == 168); static __inline struct pv_chunk * pv_to_chunk(pv_entry_t pv) { return ((struct pv_chunk *)((uintptr_t)pv & ~(uintptr_t)PAGE_MASK)); } #define PV_PMAP(pv) (pv_to_chunk(pv)->pc_pmap) #define PC_FREE0 0xfffffffffffffffful #define PC_FREE1 0xfffffffffffffffful #define PC_FREE2 0x000000fffffffffful static const uint64_t pc_freemask[_NPCM] = { PC_FREE0, PC_FREE1, PC_FREE2 }; #ifdef PV_STATS static int pc_chunk_count, pc_chunk_allocs, pc_chunk_frees, pc_chunk_tryfail; SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_count, CTLFLAG_RD, &pc_chunk_count, 0, "Current number of pv entry chunks"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_allocs, CTLFLAG_RD, &pc_chunk_allocs, 0, "Current number of pv entry chunks allocated"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_frees, CTLFLAG_RD, &pc_chunk_frees, 0, "Current number of pv entry chunks frees"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_tryfail, CTLFLAG_RD, &pc_chunk_tryfail, 0, "Number of times tried to get a chunk page but failed."); static long pv_entry_frees, pv_entry_allocs, pv_entry_count; static int pv_entry_spare; SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_frees, CTLFLAG_RD, &pv_entry_frees, 0, "Current number of pv entry frees"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_allocs, CTLFLAG_RD, &pv_entry_allocs, 0, "Current number of pv entry allocs"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_count, CTLFLAG_RD, &pv_entry_count, 0, "Current number of pv entries"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_spare, CTLFLAG_RD, &pv_entry_spare, 0, "Current number of spare pv entries"); #endif /* * We are in a serious low memory condition. Resort to * drastic measures to free some pages so we can allocate * another pv entry chunk. * * Returns NULL if PV entries were reclaimed from the specified pmap. 
* * We do not, however, unmap 2mpages because subsequent accesses will * allocate per-page pv entries until repromotion occurs, thereby * exacerbating the shortage of free pv entries. */ static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp) { struct pch new_tail; struct pv_chunk *pc; struct md_page *pvh; pd_entry_t *pde; pmap_t pmap; pt_entry_t *pte, tpte; pt_entry_t PG_G, PG_A, PG_M, PG_RW; pv_entry_t pv; vm_offset_t va; vm_page_t m, m_pc; struct spglist free; uint64_t inuse; int bit, field, freed; PMAP_LOCK_ASSERT(locked_pmap, MA_OWNED); KASSERT(lockp != NULL, ("reclaim_pv_chunk: lockp is NULL")); pmap = NULL; m_pc = NULL; PG_G = PG_A = PG_M = PG_RW = 0; SLIST_INIT(&free); TAILQ_INIT(&new_tail); pmap_delayed_invl_started(); mtx_lock(&pv_chunks_mutex); while ((pc = TAILQ_FIRST(&pv_chunks)) != NULL && SLIST_EMPTY(&free)) { TAILQ_REMOVE(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); if (pmap != pc->pc_pmap) { if (pmap != NULL) { pmap_invalidate_all(pmap); if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } pmap_delayed_invl_finished(); pmap_delayed_invl_started(); pmap = pc->pc_pmap; /* Avoid deadlock and lock recursion. */ if (pmap > locked_pmap) { RELEASE_PV_LIST_LOCK(lockp); PMAP_LOCK(pmap); } else if (pmap != locked_pmap && !PMAP_TRYLOCK(pmap)) { pmap = NULL; TAILQ_INSERT_TAIL(&new_tail, pc, pc_lru); mtx_lock(&pv_chunks_mutex); continue; } PG_G = pmap_global_bit(pmap); PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); } /* * Destroy every non-wired, 4 KB page mapping in the chunk. */ freed = 0; for (field = 0; field < _NPCM; field++) { for (inuse = ~pc->pc_map[field] & pc_freemask[field]; inuse != 0; inuse &= ~(1UL << bit)) { bit = bsfq(inuse); pv = &pc->pc_pventry[field * 64 + bit]; va = pv->pv_va; pde = pmap_pde(pmap, va); if ((*pde & PG_PS) != 0) continue; pte = pmap_pde_to_pte(pde, va); if ((*pte & PG_W) != 0) continue; tpte = pte_load_clear(pte); if ((tpte & PG_G) != 0) pmap_invalidate_page(pmap, va); m = PHYS_TO_VM_PAGE(tpte & PG_FRAME); if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if ((tpte & PG_A) != 0) vm_page_aflag_set(m, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) { vm_page_aflag_clear(m, PGA_WRITEABLE); } } pmap_delayed_invl_page(m); pc->pc_map[field] |= 1UL << bit; pmap_unuse_pt(pmap, va, *pde, &free); freed++; } } if (freed == 0) { TAILQ_INSERT_TAIL(&new_tail, pc, pc_lru); mtx_lock(&pv_chunks_mutex); continue; } /* Every freed mapping is for a 4 KB page. */ pmap_resident_count_dec(pmap, freed); PV_STAT(atomic_add_long(&pv_entry_frees, freed)); PV_STAT(atomic_add_int(&pv_entry_spare, freed)); PV_STAT(atomic_subtract_long(&pv_entry_count, freed)); TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); if (pc->pc_map[0] == PC_FREE0 && pc->pc_map[1] == PC_FREE1 && pc->pc_map[2] == PC_FREE2) { PV_STAT(atomic_subtract_int(&pv_entry_spare, _NPCPV)); PV_STAT(atomic_subtract_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_frees, 1)); /* Entire chunk is free; return it. */ m_pc = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pc)); dump_drop_page(m_pc->phys_addr); mtx_lock(&pv_chunks_mutex); break; } TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&new_tail, pc, pc_lru); mtx_lock(&pv_chunks_mutex); /* One freed pv entry in locked_pmap is sufficient. 
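		 * (Clarifying note, added: the caller only needs room for a
		 * single pv entry, so once a chunk belonging to locked_pmap
		 * has a free slot there is no reason to keep scanning other
		 * pmaps.)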
*/ if (pmap == locked_pmap) break; } TAILQ_CONCAT(&pv_chunks, &new_tail, pc_lru); mtx_unlock(&pv_chunks_mutex); if (pmap != NULL) { pmap_invalidate_all(pmap); if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } pmap_delayed_invl_finished(); if (m_pc == NULL && !SLIST_EMPTY(&free)) { m_pc = SLIST_FIRST(&free); SLIST_REMOVE_HEAD(&free, plinks.s.ss); /* Recycle a freed page table page. */ m_pc->wire_count = 1; atomic_add_int(&vm_cnt.v_wire_count, 1); } pmap_free_zero_pages(&free); return (m_pc); } /* * free the pv_entry back to the free list */ static void free_pv_entry(pmap_t pmap, pv_entry_t pv) { struct pv_chunk *pc; int idx, field, bit; PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_frees, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, 1)); PV_STAT(atomic_subtract_long(&pv_entry_count, 1)); pc = pv_to_chunk(pv); idx = pv - &pc->pc_pventry[0]; field = idx / 64; bit = idx % 64; pc->pc_map[field] |= 1ul << bit; if (pc->pc_map[0] != PC_FREE0 || pc->pc_map[1] != PC_FREE1 || pc->pc_map[2] != PC_FREE2) { /* 98% of the time, pc is already at the head of the list. */ if (__predict_false(pc != TAILQ_FIRST(&pmap->pm_pvchunk))) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); } return; } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } static void free_pv_chunk(struct pv_chunk *pc) { vm_page_t m; mtx_lock(&pv_chunks_mutex); TAILQ_REMOVE(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); PV_STAT(atomic_subtract_int(&pv_entry_spare, _NPCPV)); PV_STAT(atomic_subtract_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_frees, 1)); /* entire chunk is free, return it */ m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pc)); dump_drop_page(m->phys_addr); vm_page_unwire(m, PQ_NONE); vm_page_free(m); } /* * Returns a new PV entry, allocating a new PV chunk from the system when * needed. If this PV chunk allocation fails and a PV list lock pointer was * given, a PV chunk is reclaimed from an arbitrary pmap. Otherwise, NULL is * returned. * * The given PV list lock may be released. 
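 *
 * (Illustrative sketch of the chunk bitmap, added for clarity: each
 * chunk tracks _NPCPV = 168 entries in three 64-bit words, 64 + 64 +
 * 40 set bits, which is why PC_FREE2 above is 0x000000fffffffffful.
 * A set bit marks a free slot, so a free index found by bsfq() maps
 * back to its entry as
 *
 *	pv = &pc->pc_pventry[field * 64 + bit];
 *
 * exactly as the scan below does.)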
*/ static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp) { int bit, field; pv_entry_t pv; struct pv_chunk *pc; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_allocs, 1)); retry: pc = TAILQ_FIRST(&pmap->pm_pvchunk); if (pc != NULL) { for (field = 0; field < _NPCM; field++) { if (pc->pc_map[field]) { bit = bsfq(pc->pc_map[field]); break; } } if (field < _NPCM) { pv = &pc->pc_pventry[field * 64 + bit]; pc->pc_map[field] &= ~(1ul << bit); /* If this was the last item, move it to tail */ if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 && pc->pc_map[2] == 0) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_subtract_int(&pv_entry_spare, 1)); return (pv); } } /* No free items, allocate another chunk */ m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (m == NULL) { if (lockp == NULL) { PV_STAT(pc_chunk_tryfail++); return (NULL); } m = reclaim_pv_chunk(pmap, lockp); if (m == NULL) goto retry; } PV_STAT(atomic_add_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_allocs, 1)); dump_add_page(m->phys_addr); pc = (void *)PHYS_TO_DMAP(m->phys_addr); pc->pc_pmap = pmap; pc->pc_map[0] = PC_FREE0 & ~1ul; /* preallocated bit 0 */ pc->pc_map[1] = PC_FREE1; pc->pc_map[2] = PC_FREE2; mtx_lock(&pv_chunks_mutex); TAILQ_INSERT_TAIL(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); pv = &pc->pc_pventry[0]; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, _NPCPV - 1)); return (pv); } /* * Returns the number of one bits within the given PV chunk map. * * The erratas for Intel processors state that "POPCNT Instruction May * Take Longer to Execute Than Expected". It is believed that the * issue is the spurious dependency on the destination register. * Provide a hint to the register rename logic that the destination * value is overwritten, by clearing it, as suggested in the * optimization manual. It should be cheap for unaffected processors * as well. * * Reference numbers for erratas are * 4th Gen Core: HSD146 * 5th Gen Core: BDM85 * 6th Gen Core: SKL029 */ static int popcnt_pc_map_pq(uint64_t *map) { u_long result, tmp; __asm __volatile("xorl %k0,%k0;popcntq %2,%0;" "xorl %k1,%k1;popcntq %3,%1;addl %k1,%k0;" "xorl %k1,%k1;popcntq %4,%1;addl %k1,%k0" : "=&r" (result), "=&r" (tmp) : "m" (map[0]), "m" (map[1]), "m" (map[2])); return (result); } /* * Ensure that the number of spare PV entries in the specified pmap meets or * exceeds the given count, "needed". * * The given PV list lock may be released. */ static void reserve_pv_entries(pmap_t pmap, int needed, struct rwlock **lockp) { struct pch new_tail; struct pv_chunk *pc; int avail, free; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT(lockp != NULL, ("reserve_pv_entries: lockp is NULL")); /* * Newly allocated PV chunks must be stored in a private list until * the required number of PV chunks have been allocated. Otherwise, * reclaim_pv_chunk() could recycle one of these chunks. In * contrast, these chunks must be added to the pmap upon allocation. 
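 *
 * (Worked example, added: a 2MB demotion reserves NPTEPG - 1 = 511
 * spare pv entries, and each chunk supplies at most _NPCPV = 168, so
 * up to four new chunks may be allocated here when the pmap's
 * existing chunks are full.)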
*/ TAILQ_INIT(&new_tail); retry: avail = 0; TAILQ_FOREACH(pc, &pmap->pm_pvchunk, pc_list) { #ifndef __POPCNT__ if ((cpu_feature2 & CPUID2_POPCNT) == 0) bit_count((bitstr_t *)pc->pc_map, 0, sizeof(pc->pc_map) * NBBY, &free); else #endif free = popcnt_pc_map_pq(pc->pc_map); if (free == 0) break; avail += free; if (avail >= needed) break; } for (; avail < needed; avail += _NPCPV) { m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (m == NULL) { m = reclaim_pv_chunk(pmap, lockp); if (m == NULL) goto retry; } PV_STAT(atomic_add_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_allocs, 1)); dump_add_page(m->phys_addr); pc = (void *)PHYS_TO_DMAP(m->phys_addr); pc->pc_pmap = pmap; pc->pc_map[0] = PC_FREE0; pc->pc_map[1] = PC_FREE1; pc->pc_map[2] = PC_FREE2; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&new_tail, pc, pc_lru); PV_STAT(atomic_add_int(&pv_entry_spare, _NPCPV)); } if (!TAILQ_EMPTY(&new_tail)) { mtx_lock(&pv_chunks_mutex); TAILQ_CONCAT(&pv_chunks, &new_tail, pc_lru); mtx_unlock(&pv_chunks_mutex); } } /* * First find and then remove the pv entry for the specified pmap and virtual * address from the specified pv list. Returns the pv entry if found and NULL * otherwise. This operation can be performed on pv lists for either 4KB or * 2MB page mappings. */ static __inline pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (pmap == PV_PMAP(pv) && va == pv->pv_va) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; break; } } return (pv); } /* * After demotion from a 2MB page mapping to 512 4KB page mappings, * destroy the pv entry for the 2MB page mapping and reinstantiate the pv * entries for each of the 4KB page mappings. */ static void pmap_pv_demote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp) { struct md_page *pvh; struct pv_chunk *pc; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; int bit, field; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((pa & PDRMASK) == 0, ("pmap_pv_demote_pde: pa is not 2mpage aligned")); CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa); /* * Transfer the 2mpage's pv entry for this mapping to the first * page's pv list. Once this transfer begins, the pv list lock * must not be released until the last pv entry is reinstantiated. */ pvh = pa_to_pvh(pa); va = trunc_2mpage(va); pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pv_demote_pde: pv not found")); m = PHYS_TO_VM_PAGE(pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; /* Instantiate the remaining NPTEPG - 1 pv entries. 
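	 * (Added note: NPTEPG = 512 on amd64, so the loop below creates the
	 * 511 remaining 4KB pv entries, stopping at va_last = va + NBPDR -
	 * PAGE_SIZE, the final 4KB page of the former 2MB mapping.)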
*/ PV_STAT(atomic_add_long(&pv_entry_allocs, NPTEPG - 1)); va_last = va + NBPDR - PAGE_SIZE; for (;;) { pc = TAILQ_FIRST(&pmap->pm_pvchunk); KASSERT(pc->pc_map[0] != 0 || pc->pc_map[1] != 0 || pc->pc_map[2] != 0, ("pmap_pv_demote_pde: missing spare")); for (field = 0; field < _NPCM; field++) { while (pc->pc_map[field]) { bit = bsfq(pc->pc_map[field]); pc->pc_map[field] &= ~(1ul << bit); pv = &pc->pc_pventry[field * 64 + bit]; va += PAGE_SIZE; pv->pv_va = va; m++; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_pv_demote_pde: page %p is not managed", m)); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if (va == va_last) goto out; } } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } out: if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 && pc->pc_map[2] == 0) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } PV_STAT(atomic_add_long(&pv_entry_count, NPTEPG - 1)); PV_STAT(atomic_subtract_int(&pv_entry_spare, NPTEPG - 1)); } /* * After promotion from 512 4KB page mappings to a single 2MB page mapping, * replace the many pv entries for the 4KB page mappings by a single pv entry * for the 2MB page mapping. */ static void pmap_pv_promote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; KASSERT((pa & PDRMASK) == 0, ("pmap_pv_promote_pde: pa is not 2mpage aligned")); CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa); /* * Transfer the first page's pv entry for this mapping to the 2mpage's * pv list. Aside from avoiding the cost of a call to get_pv_entry(), * a transfer avoids the possibility that get_pv_entry() calls * reclaim_pv_chunk() and that reclaim_pv_chunk() removes one of the * mappings that is being promoted. */ m = PHYS_TO_VM_PAGE(pa); va = trunc_2mpage(va); pv = pmap_pvh_remove(&m->md, pmap, va); KASSERT(pv != NULL, ("pmap_pv_promote_pde: pv not found")); pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; /* Free the remaining NPTEPG - 1 pv entries. */ va_last = va + NBPDR - PAGE_SIZE; do { m++; va += PAGE_SIZE; pmap_pvh_free(&m->md, pmap, va); } while (va < va_last); } /* * First find and then destroy the pv entry for the specified pmap and virtual * address. This operation can be performed on pv lists for either 4KB or 2MB * page mappings. */ static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pvh_free: pv not found")); free_pv_entry(pmap, pv); } /* * Conditionally create the PV entry for a 4KB page mapping if the required * memory can be allocated without resorting to reclamation. */ static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp) { pv_entry_t pv; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* Pass NULL instead of the lock pointer to disable reclamation. */ if ((pv = get_pv_entry(pmap, NULL)) != NULL) { pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; return (TRUE); } else return (FALSE); } /* * Conditionally create the PV entry for a 2MB page mapping if the required * memory can be allocated without resorting to reclamation. 
*/ static boolean_t pmap_pv_insert_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp) { struct md_page *pvh; pv_entry_t pv; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* Pass NULL instead of the lock pointer to disable reclamation. */ if ((pv = get_pv_entry(pmap, NULL)) != NULL) { pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa); pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; return (TRUE); } else return (FALSE); } /* * Fills a page table page with mappings to consecutive physical pages. */ static void pmap_fill_ptp(pt_entry_t *firstpte, pt_entry_t newpte) { pt_entry_t *pte; for (pte = firstpte; pte < firstpte + NPTEPG; pte++) { *pte = newpte; newpte += PAGE_SIZE; } } /* * Tries to demote a 2MB page mapping. If demotion fails, the 2MB page * mapping is invalidated. */ static boolean_t pmap_demote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va) { struct rwlock *lock; boolean_t rv; lock = NULL; rv = pmap_demote_pde_locked(pmap, pde, va, &lock); if (lock != NULL) rw_wunlock(lock); return (rv); } static boolean_t pmap_demote_pde_locked(pmap_t pmap, pd_entry_t *pde, vm_offset_t va, struct rwlock **lockp) { pd_entry_t newpde, oldpde; pt_entry_t *firstpte, newpte; pt_entry_t PG_A, PG_G, PG_M, PG_RW, PG_V; vm_paddr_t mptepa; vm_page_t mpte; struct spglist free; int PG_PTE_CACHE; PG_G = pmap_global_bit(pmap); PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_PTE_CACHE = pmap_cache_mask(pmap, 0); PMAP_LOCK_ASSERT(pmap, MA_OWNED); oldpde = *pde; KASSERT((oldpde & (PG_PS | PG_V)) == (PG_PS | PG_V), ("pmap_demote_pde: oldpde is missing PG_PS and/or PG_V")); if ((oldpde & PG_A) != 0 && (mpte = pmap_lookup_pt_page(pmap, va)) != NULL) pmap_remove_pt_page(pmap, mpte); else { KASSERT((oldpde & PG_W) == 0, ("pmap_demote_pde: page table page for a wired mapping" " is missing")); /* * Invalidate the 2MB page mapping and return "failure" if the * mapping was never accessed or the allocation of the new * page table page fails. If the 2MB page mapping belongs to * the direct map region of the kernel's address space, then * the page allocation request specifies the highest possible * priority (VM_ALLOC_INTERRUPT). Otherwise, the priority is * normal. Page table pages are preallocated for every other * part of the kernel address space, so the direct map region * is the only part of the kernel address space that must be * handled here. */ if ((oldpde & PG_A) == 0 || (mpte = vm_page_alloc(NULL, pmap_pde_pindex(va), (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS ? VM_ALLOC_INTERRUPT : VM_ALLOC_NORMAL) | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { SLIST_INIT(&free); pmap_remove_pde(pmap, pde, trunc_2mpage(va), &free, lockp); pmap_invalidate_page(pmap, trunc_2mpage(va)); pmap_free_zero_pages(&free); CTR2(KTR_PMAP, "pmap_demote_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } if (va < VM_MAXUSER_ADDRESS) pmap_resident_count_inc(pmap, 1); } mptepa = VM_PAGE_TO_PHYS(mpte); firstpte = (pt_entry_t *)PHYS_TO_DMAP(mptepa); newpde = mptepa | PG_M | PG_A | (oldpde & PG_U) | PG_RW | PG_V; KASSERT((oldpde & PG_A) != 0, ("pmap_demote_pde: oldpde is missing PG_A")); KASSERT((oldpde & (PG_M | PG_RW)) != PG_RW, ("pmap_demote_pde: oldpde is missing PG_M")); newpte = oldpde & ~PG_PS; newpte = pmap_swap_pat(pmap, newpte); /* * If the page table page is new, initialize it. 
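	 * (Added note: a freshly allocated page table page arrives with
	 * wire_count == 1; raising it to NPTEPG gives each of the 512 new
	 * 4KB entries a wiring, and pmap_fill_ptp() replicates newpte
	 * across all slots, advancing the physical address by PAGE_SIZE
	 * per slot.)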
*/ if (mpte->wire_count == 1) { mpte->wire_count = NPTEPG; pmap_fill_ptp(firstpte, newpte); } KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME), ("pmap_demote_pde: firstpte and newpte map different physical" " addresses")); /* * If the mapping has changed attributes, update the page table * entries. */ if ((*firstpte & PG_PTE_PROMOTE) != (newpte & PG_PTE_PROMOTE)) pmap_fill_ptp(firstpte, newpte); /* * The spare PV entries must be reserved prior to demoting the * mapping, that is, prior to changing the PDE. Otherwise, the state * of the PDE and the PV lists will be inconsistent, which can result * in reclaim_pv_chunk() attempting to remove a PV entry from the * wrong PV list and pmap_pv_demote_pde() failing to find the expected * PV entry for the 2MB page mapping that is being demoted. */ if ((oldpde & PG_MANAGED) != 0) reserve_pv_entries(pmap, NPTEPG - 1, lockp); /* * Demote the mapping. This pmap is locked. The old PDE has * PG_A set. If the old PDE has PG_RW set, it also has PG_M * set. Thus, there is no danger of a race with another * processor changing the setting of PG_A and/or PG_M between * the read above and the store below. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, newpde); else pde_store(pde, newpde); /* * Invalidate a stale recursive mapping of the page table page. */ if (va >= VM_MAXUSER_ADDRESS) pmap_invalidate_page(pmap, (vm_offset_t)vtopte(va)); /* * Demote the PV entry. */ if ((oldpde & PG_MANAGED) != 0) pmap_pv_demote_pde(pmap, va, oldpde & PG_PS_FRAME, lockp); atomic_add_long(&pmap_pde_demotions, 1); CTR2(KTR_PMAP, "pmap_demote_pde: success for va %#lx" " in pmap %p", va, pmap); return (TRUE); } /* * pmap_remove_kernel_pde: Remove a kernel superpage mapping. */ static void pmap_remove_kernel_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va) { pd_entry_t newpde; vm_paddr_t mptepa; vm_page_t mpte; KASSERT(pmap == kernel_pmap, ("pmap %p is not kernel_pmap", pmap)); PMAP_LOCK_ASSERT(pmap, MA_OWNED); mpte = pmap_lookup_pt_page(pmap, va); if (mpte == NULL) panic("pmap_remove_kernel_pde: Missing pt page."); pmap_remove_pt_page(pmap, mpte); mptepa = VM_PAGE_TO_PHYS(mpte); newpde = mptepa | X86_PG_M | X86_PG_A | X86_PG_RW | X86_PG_V; /* * Initialize the page table page. */ pagezero((void *)PHYS_TO_DMAP(mptepa)); /* * Demote the mapping. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, newpde); else pde_store(pde, newpde); /* * Invalidate a stale recursive mapping of the page table page. */ pmap_invalidate_page(pmap, (vm_offset_t)vtopte(va)); } /* * pmap_remove_pde: do the things to unmap a superpage in a process */ static int pmap_remove_pde(pmap_t pmap, pd_entry_t *pdq, vm_offset_t sva, struct spglist *free, struct rwlock **lockp) { struct md_page *pvh; pd_entry_t oldpde; vm_offset_t eva, va; vm_page_t m, mpte; pt_entry_t PG_G, PG_A, PG_M, PG_RW; PG_G = pmap_global_bit(pmap); PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PDRMASK) == 0, ("pmap_remove_pde: sva is not 2mpage aligned")); oldpde = pte_load_clear(pdq); if (oldpde & PG_W) pmap->pm_stats.wired_count -= NBPDR / PAGE_SIZE; /* * Machines that don't support invlpg, also don't support * PG_G. 
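	 * (Added note: consequently a PG_G superpage is flushed here with a
	 * targeted pmap_invalidate_page() call rather than being left to
	 * the caller's deferred full-TLB invalidation.)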
 */
	if (oldpde & PG_G)
		pmap_invalidate_page(kernel_pmap, sva);
	pmap_resident_count_dec(pmap, NBPDR / PAGE_SIZE);
	if (oldpde & PG_MANAGED) {
		CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, oldpde & PG_PS_FRAME);
		pvh = pa_to_pvh(oldpde & PG_PS_FRAME);
		pmap_pvh_free(pvh, pmap, sva);
		eva = sva + NBPDR;
		for (va = sva, m = PHYS_TO_VM_PAGE(oldpde & PG_PS_FRAME);
		    va < eva; va += PAGE_SIZE, m++) {
			if ((oldpde & (PG_M | PG_RW)) == (PG_M | PG_RW))
				vm_page_dirty(m);
			if (oldpde & PG_A)
				vm_page_aflag_set(m, PGA_REFERENCED);
			if (TAILQ_EMPTY(&m->md.pv_list) &&
			    TAILQ_EMPTY(&pvh->pv_list))
				vm_page_aflag_clear(m, PGA_WRITEABLE);
			pmap_delayed_invl_page(m);
		}
	}
	if (pmap == kernel_pmap) {
		pmap_remove_kernel_pde(pmap, pdq, sva);
	} else {
		mpte = pmap_lookup_pt_page(pmap, sva);
		if (mpte != NULL) {
			pmap_remove_pt_page(pmap, mpte);
			pmap_resident_count_dec(pmap, 1);
			KASSERT(mpte->wire_count == NPTEPG,
			    ("pmap_remove_pde: pte page wire count error"));
			mpte->wire_count = 0;
			pmap_add_delayed_free_list(mpte, free, FALSE);
			atomic_subtract_int(&vm_cnt.v_wire_count, 1);
		}
	}
	return (pmap_unuse_pt(pmap, sva, *pmap_pdpe(pmap, sva), free));
}

/*
 * pmap_remove_pte: do the things to unmap a page in a process
 */
static int
pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va,
    pd_entry_t ptepde, struct spglist *free, struct rwlock **lockp)
{
	struct md_page *pvh;
	pt_entry_t oldpte, PG_A, PG_M, PG_RW;
	vm_page_t m;

	PG_A = pmap_accessed_bit(pmap);
	PG_M = pmap_modified_bit(pmap);
	PG_RW = pmap_rw_bit(pmap);
	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
	oldpte = pte_load_clear(ptq);
	if (oldpte & PG_W)
		pmap->pm_stats.wired_count -= 1;
	pmap_resident_count_dec(pmap, 1);
	if (oldpte & PG_MANAGED) {
		m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME);
		if ((oldpte & (PG_M | PG_RW)) == (PG_M | PG_RW))
			vm_page_dirty(m);
		if (oldpte & PG_A)
			vm_page_aflag_set(m, PGA_REFERENCED);
		CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m);
		pmap_pvh_free(&m->md, pmap, va);
		if (TAILQ_EMPTY(&m->md.pv_list) &&
		    (m->flags & PG_FICTITIOUS) == 0) {
			pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m));
			if (TAILQ_EMPTY(&pvh->pv_list))
				vm_page_aflag_clear(m, PGA_WRITEABLE);
		}
		pmap_delayed_invl_page(m);
	}
	return (pmap_unuse_pt(pmap, va, ptepde, free));
}

/*
 * Remove a single page from a process address space
 */
static void
pmap_remove_page(pmap_t pmap, vm_offset_t va, pd_entry_t *pde,
    struct spglist *free)
{
	struct rwlock *lock;
	pt_entry_t *pte, PG_V;

	PG_V = pmap_valid_bit(pmap);
	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
	if ((*pde & PG_V) == 0)
		return;
	pte = pmap_pde_to_pte(pde, va);
	if ((*pte & PG_V) == 0)
		return;
	lock = NULL;
	pmap_remove_pte(pmap, pte, va, *pde, free, &lock);
	if (lock != NULL)
		rw_wunlock(lock);
	pmap_invalidate_page(pmap, va);
}

/*
 * Remove the given range of addresses from the specified map.
 *
 * It is assumed that the start and end are properly
 * rounded to the page size.
 */
void
pmap_remove(pmap_t pmap, vm_offset_t sva, vm_offset_t eva)
{
	struct rwlock *lock;
	vm_offset_t va, va_next;
	pml4_entry_t *pml4e;
	pdp_entry_t *pdpe;
	pd_entry_t ptpaddr, *pde;
	pt_entry_t *pte, PG_G, PG_V;
	struct spglist free;
	int anyvalid;

	PG_G = pmap_global_bit(pmap);
	PG_V = pmap_valid_bit(pmap);

	/*
	 * Perform an unsynchronized read.  This is, however, safe.
	 */
	if (pmap->pm_stats.resident_count == 0)
		return;

	anyvalid = 0;
	SLIST_INIT(&free);

	pmap_delayed_invl_started();
	PMAP_LOCK(pmap);

	/*
	 * Special handling for removing a single page: it is a very
	 * common operation, so it is worth short-circuiting the full
	 * page-table walk below.
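 */

/*
 * Editor's note: the removal loop below walks pml4e -> pdpe -> pde ->
 * pte.  As a standalone reminder of where those indices come from,
 * here is a minimal sketch using the standard amd64 field positions;
 * the X_-prefixed names are hypothetical stand-ins for the kernel's
 * own macros, and the fragment is guarded out so it is never compiled
 * as part of this file.
 */
#if 0
#include <stdint.h>

#define	X_PML4SHIFT	39		/* VA bits 47..39 */
#define	X_PDPSHIFT	30		/* VA bits 38..30 */
#define	X_PDRSHIFT	21		/* VA bits 29..21 */
#define	X_PAGE_SHIFT	12		/* VA bits 20..12 */
#define	X_IDXMASK	0x1ffu		/* 9 index bits per level */

static unsigned int
x_pml4_index(uint64_t va)
{
	return ((va >> X_PML4SHIFT) & X_IDXMASK);
}

static unsigned int
x_pdp_index(uint64_t va)
{
	return ((va >> X_PDPSHIFT) & X_IDXMASK);
}

static unsigned int
x_pd_index(uint64_t va)
{
	return ((va >> X_PDRSHIFT) & X_IDXMASK);
}

static unsigned int
x_pt_index(uint64_t va)
{
	return ((va >> X_PAGE_SHIFT) & X_IDXMASK);
}
#endif

/*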
*/ if (sva + PAGE_SIZE == eva) { pde = pmap_pde(pmap, sva); if (pde && (*pde & PG_PS) == 0) { pmap_remove_page(pmap, sva, pde, &free); goto out; } } lock = NULL; for (; sva < eva; sva = va_next) { if (pmap->pm_stats.resident_count == 0) break; pml4e = pmap_pml4e(pmap, sva); if ((*pml4e & PG_V) == 0) { va_next = (sva + NBPML4) & ~PML4MASK; if (va_next < sva) va_next = eva; continue; } pdpe = pmap_pml4e_to_pdpe(pml4e, sva); if ((*pdpe & PG_V) == 0) { va_next = (sva + NBPDP) & ~PDPMASK; if (va_next < sva) va_next = eva; continue; } /* * Calculate index for next page table. */ va_next = (sva + NBPDR) & ~PDRMASK; if (va_next < sva) va_next = eva; pde = pmap_pdpe_to_pde(pdpe, sva); ptpaddr = *pde; /* * Weed out invalid mappings. */ if (ptpaddr == 0) continue; /* * Check for large page. */ if ((ptpaddr & PG_PS) != 0) { /* * Are we removing the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == va_next && eva >= va_next) { /* * The TLB entry for a PG_G mapping is * invalidated by pmap_remove_pde(). */ if ((ptpaddr & PG_G) == 0) anyvalid = 1; pmap_remove_pde(pmap, pde, sva, &free, &lock); continue; } else if (!pmap_demote_pde_locked(pmap, pde, sva, &lock)) { /* The large page mapping was destroyed. */ continue; } else ptpaddr = *pde; } /* * Limit our scan to either the end of the va represented * by the current page table page, or to the end of the * range being removed. */ if (va_next > eva) va_next = eva; va = va_next; for (pte = pmap_pde_to_pte(pde, sva); sva != va_next; pte++, sva += PAGE_SIZE) { if (*pte == 0) { if (va != va_next) { pmap_invalidate_range(pmap, va, sva); va = va_next; } continue; } if ((*pte & PG_G) == 0) anyvalid = 1; else if (va == va_next) va = sva; if (pmap_remove_pte(pmap, pte, sva, ptpaddr, &free, &lock)) { sva += PAGE_SIZE; break; } } if (va != va_next) pmap_invalidate_range(pmap, va, sva); } if (lock != NULL) rw_wunlock(lock); out: if (anyvalid) pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); pmap_delayed_invl_finished(); pmap_free_zero_pages(&free); } /* * Routine: pmap_remove_all * Function: * Removes this physical page from * all physical maps in which it resides. * Reflects back modify bits to the pager. * * Notes: * Original versions of this routine were very * inefficient because they iteratively called * pmap_remove (slow...) */ void pmap_remove_all(vm_page_t m) { struct md_page *pvh; pv_entry_t pv; pmap_t pmap; struct rwlock *lock; pt_entry_t *pte, tpte, PG_A, PG_M, PG_RW; pd_entry_t *pde; vm_offset_t va; struct spglist free; int pvh_gen, md_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_all: page %p is not managed", m)); SLIST_INIT(&free); lock = VM_PAGE_TO_PV_LIST_LOCK(m); pvh = (m->flags & PG_FICTITIOUS) != 0 ? 
&pv_dummy : pa_to_pvh(VM_PAGE_TO_PHYS(m)); retry: rw_wlock(lock); while ((pv = TAILQ_FIRST(&pvh->pv_list)) != NULL) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { rw_wunlock(lock); PMAP_UNLOCK(pmap); goto retry; } } va = pv->pv_va; pde = pmap_pde(pmap, va); (void)pmap_demote_pde_locked(pmap, pde, va, &lock); PMAP_UNLOCK(pmap); } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != NULL) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { rw_wunlock(lock); PMAP_UNLOCK(pmap); goto retry; } } PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); pmap_resident_count_dec(pmap, 1); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_remove_all: found" " a 2mpage in page %p's pv list", m)); pte = pmap_pde_to_pte(pde, pv->pv_va); tpte = pte_load_clear(pte); if (tpte & PG_W) pmap->pm_stats.wired_count--; if (tpte & PG_A) vm_page_aflag_set(m, PGA_REFERENCED); /* * Update the vm_page_t clean and reference bits. */ if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); pmap_unuse_pt(pmap, pv->pv_va, *pde, &free); pmap_invalidate_page(pmap, pv->pv_va); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; free_pv_entry(pmap, pv); PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(lock); pmap_delayed_invl_wait(m); pmap_free_zero_pages(&free); } /* * pmap_protect_pde: do the things to protect a 2mpage in a process */ static boolean_t pmap_protect_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t sva, vm_prot_t prot) { pd_entry_t newpde, oldpde; vm_offset_t eva, va; vm_page_t m; boolean_t anychanged; pt_entry_t PG_G, PG_M, PG_RW; PG_G = pmap_global_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PDRMASK) == 0, ("pmap_protect_pde: sva is not 2mpage aligned")); anychanged = FALSE; retry: oldpde = newpde = *pde; if (oldpde & PG_MANAGED) { eva = sva + NBPDR; for (va = sva, m = PHYS_TO_VM_PAGE(oldpde & PG_PS_FRAME); va < eva; va += PAGE_SIZE, m++) if ((oldpde & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); } if ((prot & VM_PROT_WRITE) == 0) newpde &= ~(PG_RW | PG_M); if ((prot & VM_PROT_EXECUTE) == 0) newpde |= pg_nx; if (newpde != oldpde) { if (!atomic_cmpset_long(pde, oldpde, newpde)) goto retry; if (oldpde & PG_G) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } return (anychanged); } /* * Set the physical protection on the * specified range of this map as requested. 
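 */

/*
 * Editor's note: pmap_protect_pde() above downgrades a superpage
 * mapping with a compare-and-swap retry loop so that a concurrent
 * hardware update of PG_M or PG_A between the read and the write is
 * never lost.  The guarded fragment below is a minimal userspace
 * model of that pattern using C11 atomics; the X_ bit values are the
 * x86 ones and the names are hypothetical.
 */
#if 0
#include <stdatomic.h>
#include <stdint.h>

#define	X_PG_RW	0x002ULL		/* writable */
#define	X_PG_M	0x040ULL		/* modified (dirty) */

static void
x_downgrade_readonly(_Atomic uint64_t *pte)
{
	uint64_t obits, pbits;

	do {
		obits = atomic_load(pte);
		pbits = obits & ~(X_PG_RW | X_PG_M);
		/* Nothing to do if the entry was already read-only. */
		if (pbits == obits)
			return;
		/* Retry if the entry changed underneath us. */
	} while (!atomic_compare_exchange_weak(pte, &obits, pbits));
}
#endif

/*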
*/ void pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va_next; pml4_entry_t *pml4e; pdp_entry_t *pdpe; pd_entry_t ptpaddr, *pde; pt_entry_t *pte, PG_G, PG_M, PG_RW, PG_V; boolean_t anychanged; KASSERT((prot & ~VM_PROT_ALL) == 0, ("invalid prot %x", prot)); if (prot == VM_PROT_NONE) { pmap_remove(pmap, sva, eva); return; } if ((prot & (VM_PROT_WRITE|VM_PROT_EXECUTE)) == (VM_PROT_WRITE|VM_PROT_EXECUTE)) return; PG_G = pmap_global_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); anychanged = FALSE; PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { pml4e = pmap_pml4e(pmap, sva); if ((*pml4e & PG_V) == 0) { va_next = (sva + NBPML4) & ~PML4MASK; if (va_next < sva) va_next = eva; continue; } pdpe = pmap_pml4e_to_pdpe(pml4e, sva); if ((*pdpe & PG_V) == 0) { va_next = (sva + NBPDP) & ~PDPMASK; if (va_next < sva) va_next = eva; continue; } va_next = (sva + NBPDR) & ~PDRMASK; if (va_next < sva) va_next = eva; pde = pmap_pdpe_to_pde(pdpe, sva); ptpaddr = *pde; /* * Weed out invalid mappings. */ if (ptpaddr == 0) continue; /* * Check for large page. */ if ((ptpaddr & PG_PS) != 0) { /* * Are we protecting the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == va_next && eva >= va_next) { /* * The TLB entry for a PG_G mapping is * invalidated by pmap_protect_pde(). */ if (pmap_protect_pde(pmap, pde, sva, prot)) anychanged = TRUE; continue; } else if (!pmap_demote_pde(pmap, pde, sva)) { /* * The large page mapping was destroyed. */ continue; } } if (va_next > eva) va_next = eva; for (pte = pmap_pde_to_pte(pde, sva); sva != va_next; pte++, sva += PAGE_SIZE) { pt_entry_t obits, pbits; vm_page_t m; retry: obits = pbits = *pte; if ((pbits & PG_V) == 0) continue; if ((prot & VM_PROT_WRITE) == 0) { if ((pbits & (PG_MANAGED | PG_M | PG_RW)) == (PG_MANAGED | PG_M | PG_RW)) { m = PHYS_TO_VM_PAGE(pbits & PG_FRAME); vm_page_dirty(m); } pbits &= ~(PG_RW | PG_M); } if ((prot & VM_PROT_EXECUTE) == 0) pbits |= pg_nx; if (pbits != obits) { if (!atomic_cmpset_long(pte, obits, pbits)) goto retry; if (obits & PG_G) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } } } if (anychanged) pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); } /* * Tries to promote the 512, contiguous 4KB page mappings that are within a * single page table page (PTP) to a single 2MB page mapping. For promotion * to occur, two conditions must be met: (1) the 4KB page mappings must map * aligned, contiguous physical memory and (2) the 4KB page mappings must have * identical characteristics. */ static void pmap_promote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va, struct rwlock **lockp) { pd_entry_t newpde; pt_entry_t *firstpte, oldpte, pa, *pte; pt_entry_t PG_G, PG_A, PG_M, PG_RW, PG_V; vm_page_t mpte; int PG_PTE_CACHE; PG_A = pmap_accessed_bit(pmap); PG_G = pmap_global_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); PG_PTE_CACHE = pmap_cache_mask(pmap, 0); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Examine the first PTE in the specified PTP. Abort if this PTE is * either invalid, unused, or does not map the first 4KB physical page * within a 2MB page. 
*/ firstpte = (pt_entry_t *)PHYS_TO_DMAP(*pde & PG_FRAME); setpde: newpde = *firstpte; if ((newpde & ((PG_FRAME & PDRMASK) | PG_A | PG_V)) != (PG_A | PG_V)) { atomic_add_long(&pmap_pde_p_failures, 1); CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx" " in pmap %p", va, pmap); return; } if ((newpde & (PG_M | PG_RW)) == PG_RW) { /* * When PG_M is already clear, PG_RW can be cleared without * a TLB invalidation. */ if (!atomic_cmpset_long(firstpte, newpde, newpde & ~PG_RW)) goto setpde; newpde &= ~PG_RW; } /* * Examine each of the other PTEs in the specified PTP. Abort if this * PTE maps an unexpected 4KB physical page or does not have identical * characteristics to the first PTE. */ pa = (newpde & (PG_PS_FRAME | PG_A | PG_V)) + NBPDR - PAGE_SIZE; for (pte = firstpte + NPTEPG - 1; pte > firstpte; pte--) { setpte: oldpte = *pte; if ((oldpte & (PG_FRAME | PG_A | PG_V)) != pa) { atomic_add_long(&pmap_pde_p_failures, 1); CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx" " in pmap %p", va, pmap); return; } if ((oldpte & (PG_M | PG_RW)) == PG_RW) { /* * When PG_M is already clear, PG_RW can be cleared * without a TLB invalidation. */ if (!atomic_cmpset_long(pte, oldpte, oldpte & ~PG_RW)) goto setpte; oldpte &= ~PG_RW; CTR2(KTR_PMAP, "pmap_promote_pde: protect for va %#lx" " in pmap %p", (oldpte & PG_FRAME & PDRMASK) | (va & ~PDRMASK), pmap); } if ((oldpte & PG_PTE_PROMOTE) != (newpde & PG_PTE_PROMOTE)) { atomic_add_long(&pmap_pde_p_failures, 1); CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx" " in pmap %p", va, pmap); return; } pa -= PAGE_SIZE; } /* * Save the page table page in its current state until the PDE * mapping the superpage is demoted by pmap_demote_pde() or * destroyed by pmap_remove_pde(). */ mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME); KASSERT(mpte >= vm_page_array && mpte < &vm_page_array[vm_page_array_size], ("pmap_promote_pde: page table page is out of range")); KASSERT(mpte->pindex == pmap_pde_pindex(va), ("pmap_promote_pde: page table page's pindex is wrong")); if (pmap_insert_pt_page(pmap, mpte)) { atomic_add_long(&pmap_pde_p_failures, 1); CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#lx in pmap %p", va, pmap); return; } /* * Promote the pv entries. */ if ((newpde & PG_MANAGED) != 0) pmap_pv_promote_pde(pmap, va, newpde & PG_PS_FRAME, lockp); /* * Propagate the PAT index to its proper position. */ newpde = pmap_swap_pat(pmap, newpde); /* * Map the superpage. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, PG_PS | newpde); else pde_store(pde, PG_PS | newpde); atomic_add_long(&pmap_pde_promotions, 1); CTR2(KTR_PMAP, "pmap_promote_pde: success for va %#lx" " in pmap %p", va, pmap); } /* * Insert the given physical page (p) at * the specified virtual address (v) in the * target physical map with the protection requested. * * If specified, the page will be wired down, meaning * that the related pte can not be reclaimed. * * NB: This is the only routine which MAY NOT lazy-evaluate * or lose information. That is, this routine must actually * insert this page into the given map NOW. * * When destroying both a page table and PV entry, this function * performs the TLB invalidation before releasing the PV list * lock, so we do not need pmap_delayed_invl_page() calls here. 
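 */

/*
 * Editor's note: a sketch of the promotion test that
 * pmap_promote_pde() above performs, reduced to its two conditions:
 * the 512 PTEs must map 2MB-aligned, consecutive physical frames and
 * must carry identical attribute bits.  The X_ names are hypothetical
 * stand-ins; the kernel version also handles PG_A/PG_V and clears
 * PG_RW on clean entries along the way.  Guarded out, never compiled
 * here.
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

#define	X_NPTEPG	512
#define	X_PAGE_SIZE	4096ULL
#define	X_PG_FRAME	0x000ffffffffff000ULL	/* physical frame bits */
#define	X_PDRMASK	((1ULL << 21) - 1)	/* offset within 2MB */

static bool
x_can_promote(const uint64_t pte[X_NPTEPG])
{
	uint64_t pa, attrs;
	int i;

	pa = pte[0] & X_PG_FRAME;
	attrs = pte[0] & ~X_PG_FRAME;
	/* The first frame must itself be 2MB-aligned. */
	if ((pa & X_PDRMASK) != 0)
		return (false);
	for (i = 1; i < X_NPTEPG; i++) {
		/* Frames must be physically consecutive... */
		if ((pte[i] & X_PG_FRAME) != pa + (uint64_t)i * X_PAGE_SIZE)
			return (false);
		/* ...with identical attribute bits. */
		if ((pte[i] & ~X_PG_FRAME) != attrs)
			return (false);
	}
	return (true);
}
#endif

/*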
*/ int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind __unused) { struct rwlock *lock; pd_entry_t *pde; pt_entry_t *pte, PG_G, PG_A, PG_M, PG_RW, PG_V; pt_entry_t newpte, origpte; pv_entry_t pv; vm_paddr_t opa, pa; vm_page_t mpte, om; boolean_t nosleep; PG_A = pmap_accessed_bit(pmap); PG_G = pmap_global_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); va = trunc_page(va); KASSERT(va <= VM_MAX_KERNEL_ADDRESS, ("pmap_enter: toobig")); KASSERT(va < UPT_MIN_ADDRESS || va >= UPT_MAX_ADDRESS, ("pmap_enter: invalid to pmap_enter page table pages (va: 0x%lx)", va)); KASSERT((m->oflags & VPO_UNMANAGED) != 0 || va < kmi.clean_sva || va >= kmi.clean_eva, ("pmap_enter: managed mapping within the clean submap")); if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); pa = VM_PAGE_TO_PHYS(m); newpte = (pt_entry_t)(pa | PG_A | PG_V); if ((flags & VM_PROT_WRITE) != 0) newpte |= PG_M; if ((prot & VM_PROT_WRITE) != 0) newpte |= PG_RW; KASSERT((newpte & (PG_M | PG_RW)) != PG_M, ("pmap_enter: flags includes VM_PROT_WRITE but prot doesn't")); if ((prot & VM_PROT_EXECUTE) == 0) newpte |= pg_nx; if ((flags & PMAP_ENTER_WIRED) != 0) newpte |= PG_W; if (va < VM_MAXUSER_ADDRESS) newpte |= PG_U; if (pmap == kernel_pmap) newpte |= PG_G; newpte |= pmap_cache_bits(pmap, m->md.pat_mode, 0); /* * Set modified bit gratuitously for writeable mappings if * the page is unmanaged. We do not want to take a fault * to do the dirty bit accounting for these mappings. */ if ((m->oflags & VPO_UNMANAGED) != 0) { if ((newpte & PG_RW) != 0) newpte |= PG_M; } mpte = NULL; lock = NULL; PMAP_LOCK(pmap); /* * In the case that a page table page is not * resident, we are creating it here. */ retry: pde = pmap_pde(pmap, va); if (pde != NULL && (*pde & PG_V) != 0 && ((*pde & PG_PS) == 0 || pmap_demote_pde_locked(pmap, pde, va, &lock))) { pte = pmap_pde_to_pte(pde, va); if (va < VM_MAXUSER_ADDRESS && mpte == NULL) { mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME); mpte->wire_count++; } } else if (va < VM_MAXUSER_ADDRESS) { /* * Here if the pte page isn't mapped, or if it has been * deallocated. */ nosleep = (flags & PMAP_ENTER_NOSLEEP) != 0; mpte = _pmap_allocpte(pmap, pmap_pde_pindex(va), nosleep ? NULL : &lock); if (mpte == NULL && nosleep) { if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); return (KERN_RESOURCE_SHORTAGE); } goto retry; } else panic("pmap_enter: invalid page directory va=%#lx", va); origpte = *pte; /* * Is the specified virtual address already mapped? */ if ((origpte & PG_V) != 0) { /* * Wiring change, just update stats. We don't worry about * wiring PT pages as they remain resident as long as there * are valid mappings in them. Hence, if a user page is wired, * the PT page will be also. */ if ((newpte & PG_W) != 0 && (origpte & PG_W) == 0) pmap->pm_stats.wired_count++; else if ((newpte & PG_W) == 0 && (origpte & PG_W) != 0) pmap->pm_stats.wired_count--; /* * Remove the extra PT page reference. */ if (mpte != NULL) { mpte->wire_count--; KASSERT(mpte->wire_count > 0, ("pmap_enter: missing reference to page table page," " va: 0x%lx", va)); } /* * Has the physical page changed? */ opa = origpte & PG_FRAME; if (opa == pa) { /* * No, might be a protection or wiring change. 
*/ if ((origpte & PG_MANAGED) != 0) { newpte |= PG_MANAGED; if ((newpte & PG_RW) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } if (((origpte ^ newpte) & ~(PG_M | PG_A)) == 0) goto unchanged; goto validate; } } else { /* * Increment the counters. */ if ((newpte & PG_W) != 0) pmap->pm_stats.wired_count++; pmap_resident_count_inc(pmap, 1); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { newpte |= PG_MANAGED; pv = get_pv_entry(pmap, &lock); pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if ((newpte & PG_RW) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } /* * Update the PTE. */ if ((origpte & PG_V) != 0) { validate: origpte = pte_load_store(pte, newpte); opa = origpte & PG_FRAME; if (opa != pa) { if ((origpte & PG_MANAGED) != 0) { om = PHYS_TO_VM_PAGE(opa); if ((origpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(om); if ((origpte & PG_A) != 0) vm_page_aflag_set(om, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, opa); pmap_pvh_free(&om->md, pmap, va); if ((om->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&om->md.pv_list) && ((om->flags & PG_FICTITIOUS) != 0 || TAILQ_EMPTY(&pa_to_pvh(opa)->pv_list))) vm_page_aflag_clear(om, PGA_WRITEABLE); } } else if ((newpte & PG_M) == 0 && (origpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if ((origpte & PG_MANAGED) != 0) vm_page_dirty(m); /* * Although the PTE may still have PG_RW set, TLB * invalidation may nonetheless be required because * the PTE no longer has PG_M set. */ } else if ((origpte & PG_NX) != 0 || (newpte & PG_NX) == 0) { /* * This PTE change does not require TLB invalidation. */ goto unchanged; } if ((origpte & PG_A) != 0) pmap_invalidate_page(pmap, va); } else pte_store(pte, newpte); unchanged: /* * If both the page table page and the reservation are fully * populated, then attempt promotion. */ if ((mpte == NULL || mpte->wire_count == NPTEPG) && pmap_ps_enabled(pmap) && (m->flags & PG_FICTITIOUS) == 0 && vm_reserv_level_iffullpop(m) == 0) pmap_promote_pde(pmap, pde, va, &lock); if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } /* * Tries to create a 2MB page mapping. Returns TRUE if successful and FALSE * otherwise. Fails if (1) a page table page cannot be allocated without * blocking, (2) a mapping already exists at the specified virtual address, or * (3) a pv entry cannot be allocated without reclaiming another pv entry. */ static boolean_t pmap_enter_pde(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, struct rwlock **lockp) { pd_entry_t *pde, newpde; pt_entry_t PG_V; vm_page_t mpde; struct spglist free; PG_V = pmap_valid_bit(pmap); PMAP_LOCK_ASSERT(pmap, MA_OWNED); if ((mpde = pmap_allocpde(pmap, va, NULL)) == NULL) { CTR2(KTR_PMAP, "pmap_enter_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } pde = (pd_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mpde)); pde = &pde[pmap_pde_index(va)]; if ((*pde & PG_V) != 0) { KASSERT(mpde->wire_count > 1, ("pmap_enter_pde: mpde's wire count is too low")); mpde->wire_count--; CTR2(KTR_PMAP, "pmap_enter_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } newpde = VM_PAGE_TO_PHYS(m) | pmap_cache_bits(pmap, m->md.pat_mode, 1) | PG_PS | PG_V; if ((m->oflags & VPO_UNMANAGED) == 0) { newpde |= PG_MANAGED; /* * Abort this mapping if its PV entry could not be created. 
*/ if (!pmap_pv_insert_pde(pmap, va, VM_PAGE_TO_PHYS(m), lockp)) { SLIST_INIT(&free); if (pmap_unwire_ptp(pmap, va, mpde, &free)) { /* * Although "va" is not mapped, paging- * structure caches could nonetheless have * entries that refer to the freed page table * pages. Invalidate those entries. */ pmap_invalidate_page(pmap, va); pmap_free_zero_pages(&free); } CTR2(KTR_PMAP, "pmap_enter_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } } if ((prot & VM_PROT_EXECUTE) == 0) newpde |= pg_nx; if (va < VM_MAXUSER_ADDRESS) newpde |= PG_U; /* * Increment counters. */ pmap_resident_count_inc(pmap, NBPDR / PAGE_SIZE); /* * Map the superpage. */ pde_store(pde, newpde); atomic_add_long(&pmap_pde_mappings, 1); CTR2(KTR_PMAP, "pmap_enter_pde: success for va %#lx" " in pmap %p", va, pmap); return (TRUE); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ void pmap_enter_object(pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { struct rwlock *lock; vm_offset_t va; vm_page_t m, mpte; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); mpte = NULL; m = m_start; lock = NULL; PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { va = start + ptoa(diff); if ((va & PDRMASK) == 0 && va + NBPDR <= end && m->psind == 1 && pmap_ps_enabled(pmap) && pmap_enter_pde(pmap, va, m, prot, &lock)) m = &m[NBPDR / PAGE_SIZE - 1]; else mpte = pmap_enter_quick_locked(pmap, va, m, prot, mpte, &lock); m = TAILQ_NEXT(m, listq); } if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); } /* * this code makes some *MAJOR* assumptions: * 1. Current pmap & pmap exists. * 2. Not wired. * 3. Read access. * 4. No page table pages. * but is *MUCH* faster than pmap_enter... */ void pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { struct rwlock *lock; lock = NULL; PMAP_LOCK(pmap); (void)pmap_enter_quick_locked(pmap, va, m, prot, NULL, &lock); if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); } static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp) { struct spglist free; pt_entry_t *pte, PG_V; vm_paddr_t pa; KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva || (m->oflags & VPO_UNMANAGED) != 0, ("pmap_enter_quick_locked: managed mapping within the clean submap")); PG_V = pmap_valid_bit(pmap); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * In the case that a page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { vm_pindex_t ptepindex; pd_entry_t *ptepa; /* * Calculate pagetable page index */ ptepindex = pmap_pde_pindex(va); if (mpte && (mpte->pindex == ptepindex)) { mpte->wire_count++; } else { /* * Get the page directory entry */ ptepa = pmap_pde(pmap, va); /* * If the page table page is mapped, we just increment * the hold count, and activate it. 
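 */

/*
 * Editor's note: pmap_enter_object() above attempts a 2MB mapping
 * only when the virtual address is superpage-aligned, the whole
 * superpage fits below "end", and the backing memory is a fully
 * populated 2MB reservation (psind == 1).  A guarded sketch of that
 * predicate, with hypothetical X_ names:
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

#define	X_NBPDR		(1ULL << 21)	/* bytes mapped by a 2MB page */
#define	X_PDRMASK	(X_NBPDR - 1)

static bool
x_superpage_ok(uint64_t va, uint64_t end, int psind)
{
	return ((va & X_PDRMASK) == 0 && va + X_NBPDR <= end &&
	    psind == 1);
}
#endif

/*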
Otherwise, we * attempt to allocate a page table page. If this * attempt fails, we don't retry. Instead, we give up. */ if (ptepa && (*ptepa & PG_V) != 0) { if (*ptepa & PG_PS) return (NULL); mpte = PHYS_TO_VM_PAGE(*ptepa & PG_FRAME); mpte->wire_count++; } else { /* * Pass NULL instead of the PV list lock * pointer, because we don't intend to sleep. */ mpte = _pmap_allocpte(pmap, ptepindex, NULL); if (mpte == NULL) return (mpte); } } pte = (pt_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mpte)); pte = &pte[pmap_pte_index(va)]; } else { mpte = NULL; pte = vtopte(va); } if (*pte) { if (mpte != NULL) { mpte->wire_count--; mpte = NULL; } return (mpte); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0 && !pmap_try_insert_pv_entry(pmap, va, m, lockp)) { if (mpte != NULL) { SLIST_INIT(&free); if (pmap_unwire_ptp(pmap, va, mpte, &free)) { /* * Although "va" is not mapped, paging- * structure caches could nonetheless have * entries that refer to the freed page table * pages. Invalidate those entries. */ pmap_invalidate_page(pmap, va); pmap_free_zero_pages(&free); } mpte = NULL; } return (mpte); } /* * Increment counters */ pmap_resident_count_inc(pmap, 1); pa = VM_PAGE_TO_PHYS(m) | pmap_cache_bits(pmap, m->md.pat_mode, 0); if ((prot & VM_PROT_EXECUTE) == 0) pa |= pg_nx; /* * Now validate mapping with RO protection */ if ((m->oflags & VPO_UNMANAGED) != 0) pte_store(pte, pa | PG_V | PG_U); else pte_store(pte, pa | PG_V | PG_U | PG_MANAGED); return (mpte); } /* * Make a temporary mapping for a physical address. This is only intended * to be used for panic dumps. */ void * pmap_kenter_temporary(vm_paddr_t pa, int i) { vm_offset_t va; va = (vm_offset_t)crashdumpmap + (i * PAGE_SIZE); pmap_kenter(va, pa); invlpg(va); return ((void *)crashdumpmap); } /* * This code maps large physical mmap regions into the * processor address space. Note that some shortcuts * are taken, but the code works. */ void pmap_object_init_pt(pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { pd_entry_t *pde; pt_entry_t PG_A, PG_M, PG_RW, PG_V; vm_paddr_t pa, ptepa; vm_page_t p, pdpg; int pat_mode; PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("pmap_object_init_pt: non-device object")); if ((addr & (NBPDR - 1)) == 0 && (size & (NBPDR - 1)) == 0) { if (!pmap_ps_enabled(pmap)) return; if (!vm_object_populate(object, pindex, pindex + atop(size))) return; p = vm_page_lookup(object, pindex); KASSERT(p->valid == VM_PAGE_BITS_ALL, ("pmap_object_init_pt: invalid page %p", p)); pat_mode = p->md.pat_mode; /* * Abort the mapping if the first page is not physically * aligned to a 2MB page boundary. */ ptepa = VM_PAGE_TO_PHYS(p); if (ptepa & (NBPDR - 1)) return; /* * Skip the first page. Abort the mapping if the rest of * the pages are not physically contiguous or have differing * memory attributes. */ p = TAILQ_NEXT(p, listq); for (pa = ptepa + PAGE_SIZE; pa < ptepa + size; pa += PAGE_SIZE) { KASSERT(p->valid == VM_PAGE_BITS_ALL, ("pmap_object_init_pt: invalid page %p", p)); if (pa != VM_PAGE_TO_PHYS(p) || pat_mode != p->md.pat_mode) return; p = TAILQ_NEXT(p, listq); } /* * Map using 2MB pages. Since "ptepa" is 2M aligned and * "size" is a multiple of 2M, adding the PAT setting to "pa" * will not affect the termination of this loop. 
*/ PMAP_LOCK(pmap); for (pa = ptepa | pmap_cache_bits(pmap, pat_mode, 1); pa < ptepa + size; pa += NBPDR) { pdpg = pmap_allocpde(pmap, addr, NULL); if (pdpg == NULL) { /* * The creation of mappings below is only an * optimization. If a page directory page * cannot be allocated without blocking, * continue on to the next mapping rather than * blocking. */ addr += NBPDR; continue; } pde = (pd_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(pdpg)); pde = &pde[pmap_pde_index(addr)]; if ((*pde & PG_V) == 0) { pde_store(pde, pa | PG_PS | PG_M | PG_A | PG_U | PG_RW | PG_V); pmap_resident_count_inc(pmap, NBPDR / PAGE_SIZE); atomic_add_long(&pmap_pde_mappings, 1); } else { /* Continue on if the PDE is already valid. */ pdpg->wire_count--; KASSERT(pdpg->wire_count > 0, ("pmap_object_init_pt: missing reference " "to page directory page, va: 0x%lx", addr)); } addr += NBPDR; } PMAP_UNLOCK(pmap); } } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range * must have the wired attribute set. In contrast, invalid mappings * cannot have the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware * feature, so there is no need to invalidate any TLB entries. * Since pmap_demote_pde() for the wired entry must never fail, * pmap_delayed_invl_started()/finished() calls around the * function are not needed. */ void pmap_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va_next; pml4_entry_t *pml4e; pdp_entry_t *pdpe; pd_entry_t *pde; pt_entry_t *pte, PG_V; PG_V = pmap_valid_bit(pmap); PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { pml4e = pmap_pml4e(pmap, sva); if ((*pml4e & PG_V) == 0) { va_next = (sva + NBPML4) & ~PML4MASK; if (va_next < sva) va_next = eva; continue; } pdpe = pmap_pml4e_to_pdpe(pml4e, sva); if ((*pdpe & PG_V) == 0) { va_next = (sva + NBPDP) & ~PDPMASK; if (va_next < sva) va_next = eva; continue; } va_next = (sva + NBPDR) & ~PDRMASK; if (va_next < sva) va_next = eva; pde = pmap_pdpe_to_pde(pdpe, sva); if ((*pde & PG_V) == 0) continue; if ((*pde & PG_PS) != 0) { if ((*pde & PG_W) == 0) panic("pmap_unwire: pde %#jx is missing PG_W", (uintmax_t)*pde); /* * Are we unwiring the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == va_next && eva >= va_next) { atomic_clear_long(pde, PG_W); pmap->pm_stats.wired_count -= NBPDR / PAGE_SIZE; continue; } else if (!pmap_demote_pde(pmap, pde, sva)) panic("pmap_unwire: demotion failed"); } if (va_next > eva) va_next = eva; for (pte = pmap_pde_to_pte(pde, sva); sva != va_next; pte++, sva += PAGE_SIZE) { if ((*pte & PG_V) == 0) continue; if ((*pte & PG_W) == 0) panic("pmap_unwire: pte %#jx is missing PG_W", (uintmax_t)*pte); /* * PG_W must be cleared atomically. Although the pmap * lock synchronizes access to PG_W, another processor * could be setting PG_M and/or PG_A concurrently. */ atomic_clear_long(pte, PG_W); pmap->pm_stats.wired_count--; } } PMAP_UNLOCK(pmap); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. 
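 */

/*
 * Editor's note: pmap_copy() below locks the two pmaps in address
 * order, the classic way to let two threads copy between the same
 * pair of maps in opposite directions without deadlocking.  A guarded
 * userspace sketch of the idiom (assumes a != b; names hypothetical):
 */
#if 0
#include <pthread.h>
#include <stdint.h>

static void
x_lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	/* Always take the lower-addressed lock first. */
	if ((uintptr_t)a < (uintptr_t)b) {
		pthread_mutex_lock(a);
		pthread_mutex_lock(b);
	} else {
		pthread_mutex_lock(b);
		pthread_mutex_lock(a);
	}
}
#endif

/*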
*/ void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { struct rwlock *lock; struct spglist free; vm_offset_t addr; vm_offset_t end_addr = src_addr + len; vm_offset_t va_next; pt_entry_t PG_A, PG_M, PG_V; if (dst_addr != src_addr) return; if (dst_pmap->pm_type != src_pmap->pm_type) return; /* * EPT page table entries that require emulation of A/D bits are * sensitive to clearing the PG_A bit (aka EPT_PG_READ). Although * we clear PG_M (aka EPT_PG_WRITE) concomitantly, the PG_U bit * (aka EPT_PG_EXECUTE) could still be set. Since some EPT * implementations flag an EPT misconfiguration for exec-only * mappings we skip this function entirely for emulated pmaps. */ if (pmap_emulate_ad_bits(dst_pmap)) return; lock = NULL; if (dst_pmap < src_pmap) { PMAP_LOCK(dst_pmap); PMAP_LOCK(src_pmap); } else { PMAP_LOCK(src_pmap); PMAP_LOCK(dst_pmap); } PG_A = pmap_accessed_bit(dst_pmap); PG_M = pmap_modified_bit(dst_pmap); PG_V = pmap_valid_bit(dst_pmap); for (addr = src_addr; addr < end_addr; addr = va_next) { pt_entry_t *src_pte, *dst_pte; vm_page_t dstmpde, dstmpte, srcmpte; pml4_entry_t *pml4e; pdp_entry_t *pdpe; pd_entry_t srcptepaddr, *pde; KASSERT(addr < UPT_MIN_ADDRESS, ("pmap_copy: invalid to pmap_copy page tables")); pml4e = pmap_pml4e(src_pmap, addr); if ((*pml4e & PG_V) == 0) { va_next = (addr + NBPML4) & ~PML4MASK; if (va_next < addr) va_next = end_addr; continue; } pdpe = pmap_pml4e_to_pdpe(pml4e, addr); if ((*pdpe & PG_V) == 0) { va_next = (addr + NBPDP) & ~PDPMASK; if (va_next < addr) va_next = end_addr; continue; } va_next = (addr + NBPDR) & ~PDRMASK; if (va_next < addr) va_next = end_addr; pde = pmap_pdpe_to_pde(pdpe, addr); srcptepaddr = *pde; if (srcptepaddr == 0) continue; if (srcptepaddr & PG_PS) { if ((addr & PDRMASK) != 0 || addr + NBPDR > end_addr) continue; dstmpde = pmap_allocpde(dst_pmap, addr, NULL); if (dstmpde == NULL) break; pde = (pd_entry_t *) PHYS_TO_DMAP(VM_PAGE_TO_PHYS(dstmpde)); pde = &pde[pmap_pde_index(addr)]; if (*pde == 0 && ((srcptepaddr & PG_MANAGED) == 0 || pmap_pv_insert_pde(dst_pmap, addr, srcptepaddr & PG_PS_FRAME, &lock))) { *pde = srcptepaddr & ~PG_W; pmap_resident_count_inc(dst_pmap, NBPDR / PAGE_SIZE); atomic_add_long(&pmap_pde_mappings, 1); } else dstmpde->wire_count--; continue; } srcptepaddr &= PG_FRAME; srcmpte = PHYS_TO_VM_PAGE(srcptepaddr); KASSERT(srcmpte->wire_count > 0, ("pmap_copy: source page table page is unused")); if (va_next > end_addr) va_next = end_addr; src_pte = (pt_entry_t *)PHYS_TO_DMAP(srcptepaddr); src_pte = &src_pte[pmap_pte_index(addr)]; dstmpte = NULL; while (addr < va_next) { pt_entry_t ptetemp; ptetemp = *src_pte; /* * we only virtual copy managed pages */ if ((ptetemp & PG_MANAGED) != 0) { if (dstmpte != NULL && dstmpte->pindex == pmap_pde_pindex(addr)) dstmpte->wire_count++; else if ((dstmpte = pmap_allocpte(dst_pmap, addr, NULL)) == NULL) goto out; dst_pte = (pt_entry_t *) PHYS_TO_DMAP(VM_PAGE_TO_PHYS(dstmpte)); dst_pte = &dst_pte[pmap_pte_index(addr)]; if (*dst_pte == 0 && pmap_try_insert_pv_entry(dst_pmap, addr, PHYS_TO_VM_PAGE(ptetemp & PG_FRAME), &lock)) { /* * Clear the wired, modified, and * accessed (referenced) bits * during the copy. */ *dst_pte = ptetemp & ~(PG_W | PG_M | PG_A); pmap_resident_count_inc(dst_pmap, 1); } else { SLIST_INIT(&free); if (pmap_unwire_ptp(dst_pmap, addr, dstmpte, &free)) { /* * Although "addr" is not * mapped, paging-structure * caches could nonetheless * have entries that refer to * the freed page table pages. 
					 * Invalidate those entries.
					 */
					pmap_invalidate_page(dst_pmap, addr);
					pmap_free_zero_pages(&free);
				}
				goto out;
			}
			if (dstmpte->wire_count >= srcmpte->wire_count)
				break;
		}
		addr += PAGE_SIZE;
		src_pte++;
	}
}
out:
	if (lock != NULL)
		rw_wunlock(lock);
	PMAP_UNLOCK(src_pmap);
	PMAP_UNLOCK(dst_pmap);
}

/*
 * Zero the specified hardware page.
 */
void
pmap_zero_page(vm_page_t m)
{
	vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));

	pagezero((void *)va);
}

/*
 * Zero an area within a single hardware page.  off and size must not
 * cover an area beyond a single hardware page.
 */
void
pmap_zero_page_area(vm_page_t m, int off, int size)
{
	vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));

	if (off == 0 && size == PAGE_SIZE)
		pagezero((void *)va);
	else
		bzero((char *)va + off, size);
}

/*
 * Copy one specified hardware page to another.
 */
void
pmap_copy_page(vm_page_t msrc, vm_page_t mdst)
{
	vm_offset_t src = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(msrc));
	vm_offset_t dst = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mdst));

	pagecopy((void *)src, (void *)dst);
}

int unmapped_buf_allowed = 1;

void
pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[],
    vm_offset_t b_offset, int xfersize)
{
	void *a_cp, *b_cp;
	vm_page_t pages[2];
	vm_offset_t vaddr[2], a_pg_offset, b_pg_offset;
	int cnt;
	boolean_t mapped;

	while (xfersize > 0) {
		a_pg_offset = a_offset & PAGE_MASK;
		pages[0] = ma[a_offset >> PAGE_SHIFT];
		b_pg_offset = b_offset & PAGE_MASK;
		pages[1] = mb[b_offset >> PAGE_SHIFT];
		cnt = min(xfersize, PAGE_SIZE - a_pg_offset);
		cnt = min(cnt, PAGE_SIZE - b_pg_offset);
		mapped = pmap_map_io_transient(pages, vaddr, 2, FALSE);
		a_cp = (char *)vaddr[0] + a_pg_offset;
		b_cp = (char *)vaddr[1] + b_pg_offset;
		bcopy(a_cp, b_cp, cnt);
		if (__predict_false(mapped))
			pmap_unmap_io_transient(pages, vaddr, 2, FALSE);
		a_offset += cnt;
		b_offset += cnt;
		xfersize -= cnt;
	}
}

/*
 * Returns true if the pmap's pv is one of the first
 * 16 pvs linked to from this page.  This count may
 * be changed upwards or downwards in the future; it
 * is only necessary that true be returned for a small
 * subset of pmaps for proper page aging.
 */
boolean_t
pmap_page_exists_quick(pmap_t pmap, vm_page_t m)
{
	struct md_page *pvh;
	struct rwlock *lock;
	pv_entry_t pv;
	int loops = 0;
	boolean_t rv;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_page_exists_quick: page %p is not managed", m));
	rv = FALSE;
	lock = VM_PAGE_TO_PV_LIST_LOCK(m);
	rw_rlock(lock);
	TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) {
		if (PV_PMAP(pv) == pmap) {
			rv = TRUE;
			break;
		}
		loops++;
		if (loops >= 16)
			break;
	}
	if (!rv && loops < 16 && (m->flags & PG_FICTITIOUS) == 0) {
		pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m));
		TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) {
			if (PV_PMAP(pv) == pmap) {
				rv = TRUE;
				break;
			}
			loops++;
			if (loops >= 16)
				break;
		}
	}
	rw_runlock(lock);
	return (rv);
}

/*
 * pmap_page_wired_mappings:
 *
 * Return the number of managed mappings to the given physical page
 * that are wired.
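 */

/*
 * Editor's note: pmap_copy_pages() above clamps each chunk so that it
 * never crosses a page boundary on either the source or destination
 * side.  A guarded, self-contained model of that chunking, with
 * hypothetical X_ names and ordinary pointers in place of the
 * transient mappings:
 */
#if 0
#include <stddef.h>
#include <string.h>

#define	X_PAGE_SIZE	4096

static void
x_copy_chunked(char *const apages[], size_t a_off,
    char *const bpages[], size_t b_off, size_t xfersize)
{
	size_t a_pg_off, b_pg_off, cnt;

	while (xfersize > 0) {
		a_pg_off = a_off % X_PAGE_SIZE;
		b_pg_off = b_off % X_PAGE_SIZE;
		/* Clamp the chunk to the end of both current pages. */
		cnt = xfersize;
		if (cnt > X_PAGE_SIZE - a_pg_off)
			cnt = X_PAGE_SIZE - a_pg_off;
		if (cnt > X_PAGE_SIZE - b_pg_off)
			cnt = X_PAGE_SIZE - b_pg_off;
		memcpy(bpages[b_off / X_PAGE_SIZE] + b_pg_off,
		    apages[a_off / X_PAGE_SIZE] + a_pg_off, cnt);
		a_off += cnt;
		b_off += cnt;
		xfersize -= cnt;
	}
}
#endif

/*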
*/ int pmap_page_wired_mappings(vm_page_t m) { struct rwlock *lock; struct md_page *pvh; pmap_t pmap; pt_entry_t *pte; pv_entry_t pv; int count, md_gen, pvh_gen; if ((m->oflags & VPO_UNMANAGED) != 0) return (0); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: count = 0; TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va); if ((*pte & PG_W) != 0) count++; PMAP_UNLOCK(pmap); } if ((m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; pvh_gen = pvh->pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen || pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pde(pmap, pv->pv_va); if ((*pte & PG_W) != 0) count++; PMAP_UNLOCK(pmap); } } rw_runlock(lock); return (count); } /* * Returns TRUE if the given page is mapped individually or as part of * a 2mpage. Otherwise, returns FALSE. */ boolean_t pmap_page_is_mapped(vm_page_t m) { struct rwlock *lock; boolean_t rv; if ((m->oflags & VPO_UNMANAGED) != 0) return (FALSE); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); rv = !TAILQ_EMPTY(&m->md.pv_list) || ((m->flags & PG_FICTITIOUS) == 0 && !TAILQ_EMPTY(&pa_to_pvh(VM_PAGE_TO_PHYS(m))->pv_list)); rw_runlock(lock); return (rv); } /* * Destroy all managed, non-wired mappings in the given user-space * pmap. This pmap cannot be active on any processor besides the * caller. * * This function cannot be applied to the kernel pmap. Moreover, it * is not intended for general use. It is only to be used during * process termination. Consequently, it can be implemented in ways * that make it faster than pmap_remove(). First, it can more quickly * destroy mappings by iterating over the pmap's collection of PV * entries, rather than searching the page table. Second, it doesn't * have to test and clear the page table entries atomically, because * no processor is currently accessing the user address space. In * particular, a page table entry's dirty bit won't change state once * this function starts. */ void pmap_remove_pages(pmap_t pmap) { pd_entry_t ptepde; pt_entry_t *pte, tpte; pt_entry_t PG_M, PG_RW, PG_V; struct spglist free; vm_page_t m, mpte, mt; pv_entry_t pv; struct md_page *pvh; struct pv_chunk *pc, *npc; struct rwlock *lock; int64_t bit; uint64_t inuse, bitmask; int allfree, field, freed, idx; boolean_t superpage; vm_paddr_t pa; /* * Assert that the given pmap is only active on the current * CPU. Unfortunately, we cannot block another CPU from * activating the pmap while this function is executing. 
*/ KASSERT(pmap == PCPU_GET(curpmap), ("non-current pmap %p", pmap)); #ifdef INVARIANTS { cpuset_t other_cpus; other_cpus = all_cpus; critical_enter(); CPU_CLR(PCPU_GET(cpuid), &other_cpus); CPU_AND(&other_cpus, &pmap->pm_active); critical_exit(); KASSERT(CPU_EMPTY(&other_cpus), ("pmap active %p", pmap)); } #endif lock = NULL; PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); SLIST_INIT(&free); PMAP_LOCK(pmap); TAILQ_FOREACH_SAFE(pc, &pmap->pm_pvchunk, pc_list, npc) { allfree = 1; freed = 0; for (field = 0; field < _NPCM; field++) { inuse = ~pc->pc_map[field] & pc_freemask[field]; while (inuse != 0) { bit = bsfq(inuse); bitmask = 1UL << bit; idx = field * 64 + bit; pv = &pc->pc_pventry[idx]; inuse &= ~bitmask; pte = pmap_pdpe(pmap, pv->pv_va); ptepde = *pte; pte = pmap_pdpe_to_pde(pte, pv->pv_va); tpte = *pte; if ((tpte & (PG_PS | PG_V)) == PG_V) { superpage = FALSE; ptepde = tpte; pte = (pt_entry_t *)PHYS_TO_DMAP(tpte & PG_FRAME); pte = &pte[pmap_pte_index(pv->pv_va)]; tpte = *pte; } else { /* * Keep track whether 'tpte' is a * superpage explicitly instead of * relying on PG_PS being set. * * This is because PG_PS is numerically * identical to PG_PTE_PAT and thus a * regular page could be mistaken for * a superpage. */ superpage = TRUE; } if ((tpte & PG_V) == 0) { panic("bad pte va %lx pte %lx", pv->pv_va, tpte); } /* * We cannot remove wired pages from a process' mapping at this time */ if (tpte & PG_W) { allfree = 0; continue; } if (superpage) pa = tpte & PG_PS_FRAME; else pa = tpte & PG_FRAME; m = PHYS_TO_VM_PAGE(pa); KASSERT(m->phys_addr == pa, ("vm_page_t %p phys_addr mismatch %016jx %016jx", m, (uintmax_t)m->phys_addr, (uintmax_t)tpte)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("pmap_remove_pages: bad tpte %#jx", (uintmax_t)tpte)); pte_clear(pte); /* * Update the vm_page_t clean/reference bits. 
*/ if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if (superpage) { for (mt = m; mt < &m[NBPDR / PAGE_SIZE]; mt++) vm_page_dirty(mt); } else vm_page_dirty(m); } CHANGE_PV_LIST_LOCK_TO_VM_PAGE(&lock, m); /* Mark free */ pc->pc_map[field] |= bitmask; if (superpage) { pmap_resident_count_dec(pmap, NBPDR / PAGE_SIZE); pvh = pa_to_pvh(tpte & PG_PS_FRAME); TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; if (TAILQ_EMPTY(&pvh->pv_list)) { for (mt = m; mt < &m[NBPDR / PAGE_SIZE]; mt++) if ((mt->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&mt->md.pv_list)) vm_page_aflag_clear(mt, PGA_WRITEABLE); } mpte = pmap_lookup_pt_page(pmap, pv->pv_va); if (mpte != NULL) { pmap_remove_pt_page(pmap, mpte); pmap_resident_count_dec(pmap, 1); KASSERT(mpte->wire_count == NPTEPG, ("pmap_remove_pages: pte page wire count error")); mpte->wire_count = 0; pmap_add_delayed_free_list(mpte, &free, FALSE); atomic_subtract_int(&vm_cnt.v_wire_count, 1); } } else { pmap_resident_count_dec(pmap, 1); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if ((m->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } pmap_unuse_pt(pmap, pv->pv_va, ptepde, &free); freed++; } } PV_STAT(atomic_add_long(&pv_entry_frees, freed)); PV_STAT(atomic_add_int(&pv_entry_spare, freed)); PV_STAT(atomic_subtract_long(&pv_entry_count, freed)); if (allfree) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } } if (lock != NULL) rw_wunlock(lock); pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } static boolean_t pmap_page_test_mappings(vm_page_t m, boolean_t accessed, boolean_t modified) { struct rwlock *lock; pv_entry_t pv; struct md_page *pvh; pt_entry_t *pte, mask; pt_entry_t PG_A, PG_M, PG_RW, PG_V; pmap_t pmap; int md_gen, pvh_gen; boolean_t rv; rv = FALSE; lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va); mask = 0; if (modified) { PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); mask |= PG_RW | PG_M; } if (accessed) { PG_A = pmap_accessed_bit(pmap); PG_V = pmap_valid_bit(pmap); mask |= PG_V | PG_A; } rv = (*pte & mask) == mask; PMAP_UNLOCK(pmap); if (rv) goto out; } if ((m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; pvh_gen = pvh->pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen || pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pde(pmap, pv->pv_va); mask = 0; if (modified) { PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); mask |= PG_RW | PG_M; } if (accessed) { PG_A = pmap_accessed_bit(pmap); PG_V = pmap_valid_bit(pmap); mask |= PG_V | PG_A; } rv = (*pte & mask) == mask; PMAP_UNLOCK(pmap); if (rv) goto out; } } out: rw_runlock(lock); return (rv); } /* * pmap_is_modified: * * Return whether or not the specified physical page was modified * in any physical maps. 
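 */

/*
 * Editor's note: pmap_page_test_mappings() above folds the two
 * queries into one mask test: "modified" requires PG_M together with
 * PG_RW, and "accessed" requires PG_A on a valid (PG_V) entry.  A
 * guarded sketch with the x86 bit values as hypothetical X_ names:
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

#define	X_PG_V	0x001ULL
#define	X_PG_RW	0x002ULL
#define	X_PG_A	0x020ULL
#define	X_PG_M	0x040ULL

static bool
x_pte_test(uint64_t pte, bool accessed, bool modified)
{
	uint64_t mask = 0;

	if (modified)
		mask |= X_PG_RW | X_PG_M;
	if (accessed)
		mask |= X_PG_V | X_PG_A;
	/* Every bit in the mask must be set for a positive answer. */
	return ((pte & mask) == mask);
}
#endif

/*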
*/ boolean_t pmap_is_modified(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_modified: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTEs can have PG_M set. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (FALSE); return (pmap_page_test_mappings(m, FALSE, TRUE)); } /* * pmap_is_prefaultable: * * Return whether or not the specified virtual address is eligible * for prefault. */ boolean_t pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr) { pd_entry_t *pde; pt_entry_t *pte, PG_V; boolean_t rv; PG_V = pmap_valid_bit(pmap); rv = FALSE; PMAP_LOCK(pmap); pde = pmap_pde(pmap, addr); if (pde != NULL && (*pde & (PG_PS | PG_V)) == PG_V) { pte = pmap_pde_to_pte(pde, addr); rv = (*pte & PG_V) == 0; } PMAP_UNLOCK(pmap); return (rv); } /* * pmap_is_referenced: * * Return whether or not the specified physical page was referenced * in any physical maps. */ boolean_t pmap_is_referenced(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_referenced: page %p is not managed", m)); return (pmap_page_test_mappings(m, TRUE, FALSE)); } /* * Clear the write and modified bits in each of the given page's mappings. */ void pmap_remove_write(vm_page_t m) { struct md_page *pvh; pmap_t pmap; struct rwlock *lock; pv_entry_t next_pv, pv; pd_entry_t *pde; pt_entry_t oldpte, *pte, PG_M, PG_RW; vm_offset_t va; int pvh_gen, md_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; lock = VM_PAGE_TO_PV_LIST_LOCK(m); pvh = (m->flags & PG_FICTITIOUS) != 0 ? 
&pv_dummy : pa_to_pvh(VM_PAGE_TO_PHYS(m)); retry_pv_loop: rw_wlock(lock); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); rw_wunlock(lock); goto retry_pv_loop; } } PG_RW = pmap_rw_bit(pmap); va = pv->pv_va; pde = pmap_pde(pmap, va); if ((*pde & PG_RW) != 0) (void)pmap_demote_pde_locked(pmap, pde, va, &lock); KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); PMAP_UNLOCK(pmap); } TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); rw_wunlock(lock); goto retry_pv_loop; } } PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_remove_write: found a 2mpage in page %p's pv list", m)); pte = pmap_pde_to_pte(pde, pv->pv_va); retry: oldpte = *pte; if (oldpte & PG_RW) { if (!atomic_cmpset_long(pte, oldpte, oldpte & ~(PG_RW | PG_M))) goto retry; if ((oldpte & PG_M) != 0) vm_page_dirty(m); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } rw_wunlock(lock); vm_page_aflag_clear(m, PGA_WRITEABLE); pmap_delayed_invl_wait(m); } static __inline boolean_t safe_to_clear_referenced(pmap_t pmap, pt_entry_t pte) { if (!pmap_emulate_ad_bits(pmap)) return (TRUE); KASSERT(pmap->pm_type == PT_EPT, ("invalid pm_type %d", pmap->pm_type)); /* * XWR = 010 or 110 will cause an unconditional EPT misconfiguration * so we don't let the referenced (aka EPT_PG_READ) bit to be cleared * if the EPT_PG_WRITE bit is set. */ if ((pte & EPT_PG_WRITE) != 0) return (FALSE); /* * XWR = 100 is allowed only if the PMAP_SUPPORTS_EXEC_ONLY is set. */ if ((pte & EPT_PG_EXECUTE) == 0 || ((pmap->pm_flags & PMAP_SUPPORTS_EXEC_ONLY) != 0)) return (TRUE); else return (FALSE); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * pmap_ts_referenced: * * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. - * * As an optimization, update the page's dirty field if a modified bit is * found while counting reference bits. This opportunistic update can be * performed at low cost and can eliminate the need for some future calls * to pmap_is_modified(). However, since this function stops after * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some * dirty pages. Those dirty pages will only be detected by a future call * to pmap_is_modified(). * * A DI block is not needed within this function, because * invalidations are performed before the PV list lock is * released. 
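 */

/*
 * Editor's note: in the superpage path below, one PG_A bit is shared
 * by 512 4KB pages, so it is cleared only when a cheap hash of the
 * physical page number, the virtual superpage number, and the pmap
 * selects this particular 4KB page.  A guarded sketch of that
 * selection, with hypothetical X_ names:
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

#define	X_PAGE_SHIFT	12
#define	X_PDRSHIFT	21
#define	X_NPTEPG	512

static bool
x_should_clear_ref(uint64_t pa, uint64_t va, uintptr_t pmap_id)
{
	/* Roughly one page out of every X_NPTEPG tests is selected. */
	return ((((pa >> X_PAGE_SHIFT) ^ (va >> X_PDRSHIFT) ^ pmap_id) &
	    (X_NPTEPG - 1)) == 0);
}
#endif

/*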
*/ int pmap_ts_referenced(vm_page_t m) { struct md_page *pvh; pv_entry_t pv, pvf; pmap_t pmap; struct rwlock *lock; pd_entry_t oldpde, *pde; pt_entry_t *pte, PG_A, PG_M, PG_RW; vm_offset_t va; vm_paddr_t pa; int cleared, md_gen, not_cleared, pvh_gen; struct spglist free; boolean_t demoted; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_ts_referenced: page %p is not managed", m)); SLIST_INIT(&free); cleared = 0; pa = VM_PAGE_TO_PHYS(m); lock = PHYS_TO_PV_LIST_LOCK(pa); pvh = (m->flags & PG_FICTITIOUS) != 0 ? &pv_dummy : pa_to_pvh(pa); rw_wlock(lock); retry: not_cleared = 0; if ((pvf = TAILQ_FIRST(&pvh->pv_list)) == NULL) goto small_mappings; pv = pvf; do { if (pvf == NULL) pvf = pv; pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto retry; } } PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); va = pv->pv_va; pde = pmap_pde(pmap, pv->pv_va); oldpde = *pde; if ((oldpde & (PG_M | PG_RW)) == (PG_M | PG_RW)) { /* * Although "oldpde" is mapping a 2MB page, because * this function is called at a 4KB page granularity, * we only update the 4KB page under test. */ vm_page_dirty(m); } - if ((*pde & PG_A) != 0) { + if ((oldpde & PG_A) != 0) { /* * Since this reference bit is shared by 512 4KB * pages, it should not be cleared every time it is * tested. Apply a simple "hash" function on the * physical page number, the virtual superpage number, * and the pmap address to select one 4KB page out of * the 512 on which testing the reference bit will * result in clearing that reference bit. This * function is designed to avoid the selection of the * same 4KB page for every 2MB page mapping. * * On demotion, a mapping that hasn't been referenced * is simply destroyed. To avoid the possibility of a * subsequent page fault on a demoted wired mapping, * always leave its reference bit set. Moreover, * since the superpage is wired, the current state of * its reference bit won't affect page replacement. */ if ((((pa >> PAGE_SHIFT) ^ (pv->pv_va >> PDRSHIFT) ^ (uintptr_t)pmap) & (NPTEPG - 1)) == 0 && - (*pde & PG_W) == 0) { + (oldpde & PG_W) == 0) { if (safe_to_clear_referenced(pmap, oldpde)) { atomic_clear_long(pde, PG_A); pmap_invalidate_page(pmap, pv->pv_va); demoted = FALSE; } else if (pmap_demote_pde_locked(pmap, pde, pv->pv_va, &lock)) { /* * Remove the mapping to a single page * so that a subsequent access may * repromote. Since the underlying * page table page is fully populated, * this removal never frees a page * table page. */ demoted = TRUE; va += VM_PAGE_TO_PHYS(m) - (oldpde & PG_PS_FRAME); pte = pmap_pde_to_pte(pde, va); pmap_remove_pte(pmap, pte, va, *pde, NULL, &lock); pmap_invalidate_page(pmap, va); } else demoted = TRUE; if (demoted) { /* * The superpage mapping was removed * entirely and therefore 'pv' is no * longer valid. */ if (pvf == pv) pvf = NULL; pv = NULL; } cleared++; KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); } else not_cleared++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. 
*/ if (pv != NULL && TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; } if (cleared + not_cleared >= PMAP_TS_REFERENCED_MAX) goto out; } while ((pv = TAILQ_FIRST(&pvh->pv_list)) != pvf); small_mappings: if ((pvf = TAILQ_FIRST(&m->md.pv_list)) == NULL) goto out; pv = pvf; do { if (pvf == NULL) pvf = pv; pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto retry; } } PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_ts_referenced: found a 2mpage in page %p's pv list", m)); pte = pmap_pde_to_pte(pde, pv->pv_va); if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if ((*pte & PG_A) != 0) { if (safe_to_clear_referenced(pmap, *pte)) { atomic_clear_long(pte, PG_A); pmap_invalidate_page(pmap, pv->pv_va); cleared++; } else if ((*pte & PG_W) == 0) { /* * Wired pages cannot be paged out so * doing accessed bit emulation for * them is wasted effort. We do the * hard work for unwired pages only. */ pmap_remove_pte(pmap, pte, pv->pv_va, *pde, &free, &lock); pmap_invalidate_page(pmap, pv->pv_va); cleared++; if (pvf == pv) pvf = NULL; pv = NULL; KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); } else not_cleared++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (pv != NULL && TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; } } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != pvf && cleared + not_cleared < PMAP_TS_REFERENCED_MAX); out: rw_wunlock(lock); pmap_free_zero_pages(&free); return (cleared + not_cleared); } /* * Apply the given advice to the specified range of addresses within the * given pmap. Depending on the advice, clear the referenced and/or * modified flags in each mapping and set the mapped page's dirty field. */ void pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice) { struct rwlock *lock; pml4_entry_t *pml4e; pdp_entry_t *pdpe; pd_entry_t oldpde, *pde; pt_entry_t *pte, PG_A, PG_G, PG_M, PG_RW, PG_V; vm_offset_t va_next; vm_page_t m; boolean_t anychanged; if (advice != MADV_DONTNEED && advice != MADV_FREE) return; /* * A/D bit emulation requires an alternate code path when clearing * the modified and accessed bits below. Since this function is * advisory in nature we skip it entirely for pmaps that require * A/D bit emulation. 
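 */

/*
 * Editor's note: the scan below, like pmap_remove() and
 * pmap_protect(), advances with "va_next = (sva + NBPDR) & ~PDRMASK",
 * falling back to eva when the addition wraps.  A guarded one-liner
 * capturing that idiom, with hypothetical X_ names:
 */
#if 0
#include <stdint.h>

#define	X_NBPDR		(1ULL << 21)
#define	X_PDRMASK	(X_NBPDR - 1)

static uint64_t
x_next_2m_boundary(uint64_t sva, uint64_t eva)
{
	uint64_t va_next;

	va_next = (sva + X_NBPDR) & ~X_PDRMASK;
	/* Clamp to eva if the computation wrapped around. */
	return (va_next < sva ? eva : va_next);
}
#endif

/*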
*/ if (pmap_emulate_ad_bits(pmap)) return; PG_A = pmap_accessed_bit(pmap); PG_G = pmap_global_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); anychanged = FALSE; pmap_delayed_invl_started(); PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { pml4e = pmap_pml4e(pmap, sva); if ((*pml4e & PG_V) == 0) { va_next = (sva + NBPML4) & ~PML4MASK; if (va_next < sva) va_next = eva; continue; } pdpe = pmap_pml4e_to_pdpe(pml4e, sva); if ((*pdpe & PG_V) == 0) { va_next = (sva + NBPDP) & ~PDPMASK; if (va_next < sva) va_next = eva; continue; } va_next = (sva + NBPDR) & ~PDRMASK; if (va_next < sva) va_next = eva; pde = pmap_pdpe_to_pde(pdpe, sva); oldpde = *pde; if ((oldpde & PG_V) == 0) continue; else if ((oldpde & PG_PS) != 0) { if ((oldpde & PG_MANAGED) == 0) continue; lock = NULL; if (!pmap_demote_pde_locked(pmap, pde, sva, &lock)) { if (lock != NULL) rw_wunlock(lock); /* * The large page mapping was destroyed. */ continue; } /* * Unless the page mappings are wired, remove the * mapping to a single page so that a subsequent * access may repromote. Since the underlying page * table page is fully populated, this removal never * frees a page table page. */ if ((oldpde & PG_W) == 0) { pte = pmap_pde_to_pte(pde, sva); KASSERT((*pte & PG_V) != 0, ("pmap_advise: invalid PTE")); pmap_remove_pte(pmap, pte, sva, *pde, NULL, &lock); anychanged = TRUE; } if (lock != NULL) rw_wunlock(lock); } if (va_next > eva) va_next = eva; for (pte = pmap_pde_to_pte(pde, sva); sva != va_next; pte++, sva += PAGE_SIZE) { if ((*pte & (PG_MANAGED | PG_V)) != (PG_MANAGED | PG_V)) continue; else if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if (advice == MADV_DONTNEED) { /* * Future calls to pmap_is_modified() * can be avoided by making the page * dirty now. */ m = PHYS_TO_VM_PAGE(*pte & PG_FRAME); vm_page_dirty(m); } atomic_clear_long(pte, PG_M | PG_A); } else if ((*pte & PG_A) != 0) atomic_clear_long(pte, PG_A); else continue; if ((*pte & PG_G) != 0) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } } if (anychanged) pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); pmap_delayed_invl_finished(); } /* * Clear the modify bits on the specified physical page. */ void pmap_clear_modify(vm_page_t m) { struct md_page *pvh; pmap_t pmap; pv_entry_t next_pv, pv; pd_entry_t oldpde, *pde; pt_entry_t oldpte, *pte, PG_M, PG_RW, PG_V; struct rwlock *lock; vm_offset_t va; int md_gen, pvh_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("pmap_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PGA_WRITEABLE, then no PTEs can have PG_M set. * If the object containing the page is locked and the page is not * exclusive busied, then PGA_WRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; pvh = (m->flags & PG_FICTITIOUS) != 0 ? 
&pv_dummy : pa_to_pvh(VM_PAGE_TO_PHYS(m)); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_wlock(lock); restart: TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); va = pv->pv_va; pde = pmap_pde(pmap, va); oldpde = *pde; if ((oldpde & PG_RW) != 0) { if (pmap_demote_pde_locked(pmap, pde, va, &lock)) { if ((oldpde & PG_W) == 0) { /* * Write protect the mapping to a * single page so that a subsequent * write access may repromote. */ va += VM_PAGE_TO_PHYS(m) - (oldpde & PG_PS_FRAME); pte = pmap_pde_to_pte(pde, va); oldpte = *pte; if ((oldpte & PG_V) != 0) { while (!atomic_cmpset_long(pte, oldpte, oldpte & ~(PG_M | PG_RW))) oldpte = *pte; vm_page_dirty(m); pmap_invalidate_page(pmap, va); } } } } PMAP_UNLOCK(pmap); } TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } PG_M = pmap_modified_bit(pmap); PG_RW = pmap_rw_bit(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_clear_modify: found" " a 2mpage in page %p's pv list", m)); pte = pmap_pde_to_pte(pde, pv->pv_va); if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { atomic_clear_long(pte, PG_M); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } rw_wunlock(lock); } /* * Miscellaneous support routines follow */ /* Adjust the cache mode for a 4KB page mapped via a PTE. */ static __inline void pmap_pte_attr(pt_entry_t *pte, int cache_bits, int mask) { u_int opte, npte; /* * The cache mode bits are all in the low 32-bits of the * PTE, so we can just spin on updating the low 32-bits. */ do { opte = *(u_int *)pte; npte = opte & ~mask; npte |= cache_bits; } while (npte != opte && !atomic_cmpset_int((u_int *)pte, opte, npte)); } /* Adjust the cache mode for a 2MB page mapped via a PDE. */ static __inline void pmap_pde_attr(pd_entry_t *pde, int cache_bits, int mask) { u_int opde, npde; /* * The cache mode bits are all in the low 32-bits of the * PDE, so we can just spin on updating the low 32-bits. */ do { opde = *(u_int *)pde; npde = opde & ~mask; npde |= cache_bits; } while (npde != opde && !atomic_cmpset_int((u_int *)pde, opde, npde)); } /* * Map a set of physical memory pages into the kernel virtual * address space. Return a pointer to where it is mapped. This * routine is intended to be used for mapping device memory, * NOT real memory. */ void * pmap_mapdev_attr(vm_paddr_t pa, vm_size_t size, int mode) { struct pmap_preinit_mapping *ppim; vm_offset_t va, offset; vm_size_t tmpsize; int i; offset = pa & PAGE_MASK; size = round_page(offset + size); pa = trunc_page(pa); if (!pmap_initialized) { va = 0; for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == 0) { ppim->pa = pa; ppim->sz = size; ppim->mode = mode; ppim->va = virtual_avail; virtual_avail += size; va = ppim->va; break; } } if (va == 0) panic("%s: too many preinit mappings", __func__); } else { /* * If we have a preinit mapping, re-use it. 
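 */

/*
 * The pmap_pte_attr()/pmap_pde_attr() helpers defined a little earlier
 * spin on the low 32 bits of an entry to swap the cache-attribute field.
 * Below is a hedged user-space analogue of that loop: C11 atomics stand
 * in for the kernel's atomic_cmpset_int(), and the entry is modeled as a
 * bare 32-bit word rather than a live PTE.
 */
#include <stdatomic.h>
#include <stdint.h>

static void
sk_set_cache_bits(_Atomic uint32_t *word, uint32_t cache_bits, uint32_t mask)
{
	uint32_t oval, nval;

	oval = atomic_load(word);
	do {
		nval = (oval & ~mask) | cache_bits;
		if (nval == oval)
			return;		/* attribute already in place */
	} while (!atomic_compare_exchange_weak(word, &oval, nval));
}

/*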
*/ for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->pa == pa && ppim->sz == size && ppim->mode == mode) return ((void *)(ppim->va + offset)); } /* * If the specified range of physical addresses fits within * the direct map window, use the direct map. */ if (pa < dmaplimit && pa + size < dmaplimit) { va = PHYS_TO_DMAP(pa); if (!pmap_change_attr(va, size, mode)) return ((void *)(va + offset)); } va = kva_alloc(size); if (va == 0) panic("%s: Couldn't allocate KVA", __func__); } for (tmpsize = 0; tmpsize < size; tmpsize += PAGE_SIZE) pmap_kenter_attr(va + tmpsize, pa + tmpsize, mode); pmap_invalidate_range(kernel_pmap, va, va + tmpsize); pmap_invalidate_cache_range(va, va + tmpsize, FALSE); return ((void *)(va + offset)); } void * pmap_mapdev(vm_paddr_t pa, vm_size_t size) { return (pmap_mapdev_attr(pa, size, PAT_UNCACHEABLE)); } void * pmap_mapbios(vm_paddr_t pa, vm_size_t size) { return (pmap_mapdev_attr(pa, size, PAT_WRITE_BACK)); } void pmap_unmapdev(vm_offset_t va, vm_size_t size) { struct pmap_preinit_mapping *ppim; vm_offset_t offset; int i; /* If we gave a direct map region in pmap_mapdev, do nothing */ if (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS) return; offset = va & PAGE_MASK; size = round_page(offset + size); va = trunc_page(va); for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == va && ppim->sz == size) { if (pmap_initialized) return; ppim->pa = 0; ppim->va = 0; ppim->sz = 0; ppim->mode = 0; if (va + size == virtual_avail) virtual_avail = va; return; } } if (pmap_initialized) kva_free(va, size); } /* * Tries to demote a 1GB page mapping. */ static boolean_t pmap_demote_pdpe(pmap_t pmap, pdp_entry_t *pdpe, vm_offset_t va) { pdp_entry_t newpdpe, oldpdpe; pd_entry_t *firstpde, newpde, *pde; pt_entry_t PG_A, PG_M, PG_RW, PG_V; vm_paddr_t mpdepa; vm_page_t mpde; PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); PMAP_LOCK_ASSERT(pmap, MA_OWNED); oldpdpe = *pdpe; KASSERT((oldpdpe & (PG_PS | PG_V)) == (PG_PS | PG_V), ("pmap_demote_pdpe: oldpdpe is missing PG_PS and/or PG_V")); if ((mpde = vm_page_alloc(NULL, va >> PDPSHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { CTR2(KTR_PMAP, "pmap_demote_pdpe: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } mpdepa = VM_PAGE_TO_PHYS(mpde); firstpde = (pd_entry_t *)PHYS_TO_DMAP(mpdepa); newpdpe = mpdepa | PG_M | PG_A | (oldpdpe & PG_U) | PG_RW | PG_V; KASSERT((oldpdpe & PG_A) != 0, ("pmap_demote_pdpe: oldpdpe is missing PG_A")); KASSERT((oldpdpe & (PG_M | PG_RW)) != PG_RW, ("pmap_demote_pdpe: oldpdpe is missing PG_M")); newpde = oldpdpe; /* * Initialize the page directory page. */ for (pde = firstpde; pde < firstpde + NPDEPG; pde++) { *pde = newpde; newpde += NBPDR; } /* * Demote the mapping. */ *pdpe = newpdpe; /* * Invalidate a stale recursive mapping of the page directory page. */ pmap_invalidate_page(pmap, (vm_offset_t)vtopde(va)); pmap_pdpe_demotions++; CTR2(KTR_PMAP, "pmap_demote_pdpe: success for va %#lx" " in pmap %p", va, pmap); return (TRUE); } /* * Sets the memory attribute for the specified page. */ void pmap_page_set_memattr(vm_page_t m, vm_memattr_t ma) { m->md.pat_mode = ma; /* * If "m" is a normal page, update its direct mapping. This update * can be relied upon to perform any cache operations that are * required for data coherence. 
*/ if ((m->flags & PG_FICTITIOUS) == 0 && pmap_change_attr(PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), PAGE_SIZE, m->md.pat_mode)) panic("memory attribute change on the direct map failed"); } /* * Changes the specified virtual address range's memory type to that given by * the parameter "mode". The specified virtual address range must be * completely contained within either the direct map or the kernel map. If * the virtual address range is contained within the kernel map, then the * memory type for each of the corresponding ranges of the direct map is also * changed. (The corresponding ranges of the direct map are those ranges that * map the same physical pages as the specified virtual address range.) These * changes to the direct map are necessary because Intel describes the * behavior of their processors as "undefined" if two or more mappings to the * same physical page have different memory types. * * Returns zero if the change completed successfully, and either EINVAL or * ENOMEM if the change failed. Specifically, EINVAL is returned if some part * of the virtual address range was not mapped, and ENOMEM is returned if * there was insufficient memory available to complete the change. In the * latter case, the memory type may have been changed on some part of the * virtual address range or the direct map. */ int pmap_change_attr(vm_offset_t va, vm_size_t size, int mode) { int error; PMAP_LOCK(kernel_pmap); error = pmap_change_attr_locked(va, size, mode); PMAP_UNLOCK(kernel_pmap); return (error); } static int pmap_change_attr_locked(vm_offset_t va, vm_size_t size, int mode) { vm_offset_t base, offset, tmpva; vm_paddr_t pa_start, pa_end, pa_end1; pdp_entry_t *pdpe; pd_entry_t *pde; pt_entry_t *pte; int cache_bits_pte, cache_bits_pde, error; boolean_t changed; PMAP_LOCK_ASSERT(kernel_pmap, MA_OWNED); base = trunc_page(va); offset = va & PAGE_MASK; size = round_page(offset + size); /* * Only supported on kernel virtual addresses, including the direct * map but excluding the recursive map. */ if (base < DMAP_MIN_ADDRESS) return (EINVAL); cache_bits_pde = pmap_cache_bits(kernel_pmap, mode, 1); cache_bits_pte = pmap_cache_bits(kernel_pmap, mode, 0); changed = FALSE; /* * Pages that aren't mapped aren't supported. Also break down 2MB pages * into 4KB pages if required. */ for (tmpva = base; tmpva < base + size; ) { pdpe = pmap_pdpe(kernel_pmap, tmpva); if (pdpe == NULL || *pdpe == 0) return (EINVAL); if (*pdpe & PG_PS) { /* * If the current 1GB page already has the required * memory type, then we need not demote this page. Just * increment tmpva to the next 1GB page frame. */ if ((*pdpe & X86_PG_PDE_CACHE) == cache_bits_pde) { tmpva = trunc_1gpage(tmpva) + NBPDP; continue; } /* * If the current offset aligns with a 1GB page frame * and there is at least 1GB left within the range, then * we need not break down this page into 2MB pages. */ if ((tmpva & PDPMASK) == 0 && tmpva + PDPMASK < base + size) { tmpva += NBPDP; continue; } if (!pmap_demote_pdpe(kernel_pmap, pdpe, tmpva)) return (ENOMEM); } pde = pmap_pdpe_to_pde(pdpe, tmpva); if (*pde == 0) return (EINVAL); if (*pde & PG_PS) { /* * If the current 2MB page already has the required * memory type, then we need not demote this page. Just * increment tmpva to the next 2MB page frame. */ if ((*pde & X86_PG_PDE_CACHE) == cache_bits_pde) { tmpva = trunc_2mpage(tmpva) + NBPDR; continue; } /* * If the current offset aligns with a 2MB page frame * and there is at least 2MB left within the range, then * we need not break down this page into 4KB pages. 
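 */

/*
 * The test that follows is the standard "can we keep the large page?"
 * check: the cursor must sit on a 2MB boundary and at least one full 2MB
 * page must remain in the request.  A small stand-alone sketch of the
 * same decision, with a local mask constant instead of the kernel's
 * PDRMASK:
 */
#include <stdbool.h>
#include <stdint.h>

#define	SK_CA_PDRMASK	((uint64_t)(2 * 1024 * 1024) - 1)	/* 2MB - 1 */

static bool
sk_keep_2m_page(uint64_t tmpva, uint64_t base, uint64_t size)
{

	/*
	 * Aligned on a 2MB frame and at least 2MB left within
	 * [base, base + size): no demotion needed, step by 2MB instead.
	 */
	return ((tmpva & SK_CA_PDRMASK) == 0 &&
	    tmpva + SK_CA_PDRMASK < base + size);
}

/*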
*/ if ((tmpva & PDRMASK) == 0 && tmpva + PDRMASK < base + size) { tmpva += NBPDR; continue; } if (!pmap_demote_pde(kernel_pmap, pde, tmpva)) return (ENOMEM); } pte = pmap_pde_to_pte(pde, tmpva); if (*pte == 0) return (EINVAL); tmpva += PAGE_SIZE; } error = 0; /* * Ok, all the pages exist, so run through them updating their * cache mode if required. */ pa_start = pa_end = 0; for (tmpva = base; tmpva < base + size; ) { pdpe = pmap_pdpe(kernel_pmap, tmpva); if (*pdpe & PG_PS) { if ((*pdpe & X86_PG_PDE_CACHE) != cache_bits_pde) { pmap_pde_attr(pdpe, cache_bits_pde, X86_PG_PDE_CACHE); changed = TRUE; } if (tmpva >= VM_MIN_KERNEL_ADDRESS && (*pdpe & PG_PS_FRAME) < dmaplimit) { if (pa_start == pa_end) { /* Start physical address run. */ pa_start = *pdpe & PG_PS_FRAME; pa_end = pa_start + NBPDP; } else if (pa_end == (*pdpe & PG_PS_FRAME)) pa_end += NBPDP; else { /* Run ended, update direct map. */ error = pmap_change_attr_locked( PHYS_TO_DMAP(pa_start), pa_end - pa_start, mode); if (error != 0) break; /* Start physical address run. */ pa_start = *pdpe & PG_PS_FRAME; pa_end = pa_start + NBPDP; } } tmpva = trunc_1gpage(tmpva) + NBPDP; continue; } pde = pmap_pdpe_to_pde(pdpe, tmpva); if (*pde & PG_PS) { if ((*pde & X86_PG_PDE_CACHE) != cache_bits_pde) { pmap_pde_attr(pde, cache_bits_pde, X86_PG_PDE_CACHE); changed = TRUE; } if (tmpva >= VM_MIN_KERNEL_ADDRESS && (*pde & PG_PS_FRAME) < dmaplimit) { if (pa_start == pa_end) { /* Start physical address run. */ pa_start = *pde & PG_PS_FRAME; pa_end = pa_start + NBPDR; } else if (pa_end == (*pde & PG_PS_FRAME)) pa_end += NBPDR; else { /* Run ended, update direct map. */ error = pmap_change_attr_locked( PHYS_TO_DMAP(pa_start), pa_end - pa_start, mode); if (error != 0) break; /* Start physical address run. */ pa_start = *pde & PG_PS_FRAME; pa_end = pa_start + NBPDR; } } tmpva = trunc_2mpage(tmpva) + NBPDR; } else { pte = pmap_pde_to_pte(pde, tmpva); if ((*pte & X86_PG_PTE_CACHE) != cache_bits_pte) { pmap_pte_attr(pte, cache_bits_pte, X86_PG_PTE_CACHE); changed = TRUE; } if (tmpva >= VM_MIN_KERNEL_ADDRESS && (*pte & PG_PS_FRAME) < dmaplimit) { if (pa_start == pa_end) { /* Start physical address run. */ pa_start = *pte & PG_FRAME; pa_end = pa_start + PAGE_SIZE; } else if (pa_end == (*pte & PG_FRAME)) pa_end += PAGE_SIZE; else { /* Run ended, update direct map. */ error = pmap_change_attr_locked( PHYS_TO_DMAP(pa_start), pa_end - pa_start, mode); if (error != 0) break; /* Start physical address run. */ pa_start = *pte & PG_FRAME; pa_end = pa_start + PAGE_SIZE; } } tmpva += PAGE_SIZE; } } if (error == 0 && pa_start != pa_end && pa_start < dmaplimit) { pa_end1 = MIN(pa_end, dmaplimit); if (pa_start != pa_end1) error = pmap_change_attr_locked(PHYS_TO_DMAP(pa_start), pa_end1 - pa_start, mode); } /* * Flush CPU caches if required to make sure any data isn't cached that * shouldn't be, etc. */ if (changed) { pmap_invalidate_range(kernel_pmap, base, tmpva); pmap_invalidate_cache_range(base, tmpva, FALSE); } return (error); } /* * Demotes any mapping within the direct map region that covers more than the * specified range of physical addresses. This range's size must be a power * of two and its starting address must be a multiple of its size. Since the * demotion does not change any attributes of the mapping, a TLB invalidation * is not mandatory. The caller may, however, request a TLB invalidation. 
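 */

/*
 * A quick user-space illustration of the two preconditions the function
 * below asserts: "len" must be a power of two and "base" a multiple of
 * "len".  Both reduce to simple mask tests, as sketched here (assert(3)
 * stands in for KASSERT).
 */
#include <assert.h>
#include <stdint.h>

static void
sk_check_demote_args(uint64_t base, uint64_t len)
{

	assert(len != 0 && (len & (len - 1)) == 0);	/* power of 2 */
	assert((base & (len - 1)) == 0);	/* base is multiple of len */
}

/*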
*/ void pmap_demote_DMAP(vm_paddr_t base, vm_size_t len, boolean_t invalidate) { pdp_entry_t *pdpe; pd_entry_t *pde; vm_offset_t va; boolean_t changed; if (len == 0) return; KASSERT(powerof2(len), ("pmap_demote_DMAP: len is not a power of 2")); KASSERT((base & (len - 1)) == 0, ("pmap_demote_DMAP: base is not a multiple of len")); if (len < NBPDP && base < dmaplimit) { va = PHYS_TO_DMAP(base); changed = FALSE; PMAP_LOCK(kernel_pmap); pdpe = pmap_pdpe(kernel_pmap, va); if ((*pdpe & X86_PG_V) == 0) panic("pmap_demote_DMAP: invalid PDPE"); if ((*pdpe & PG_PS) != 0) { if (!pmap_demote_pdpe(kernel_pmap, pdpe, va)) panic("pmap_demote_DMAP: PDPE failed"); changed = TRUE; } if (len < NBPDR) { pde = pmap_pdpe_to_pde(pdpe, va); if ((*pde & X86_PG_V) == 0) panic("pmap_demote_DMAP: invalid PDE"); if ((*pde & PG_PS) != 0) { if (!pmap_demote_pde(kernel_pmap, pde, va)) panic("pmap_demote_DMAP: PDE failed"); changed = TRUE; } } if (changed && invalidate) pmap_invalidate_page(kernel_pmap, va); PMAP_UNLOCK(kernel_pmap); } } /* * perform the pmap work for mincore */ int pmap_mincore(pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { pd_entry_t *pdep; pt_entry_t pte, PG_A, PG_M, PG_RW, PG_V; vm_paddr_t pa; int val; PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); PMAP_LOCK(pmap); retry: pdep = pmap_pde(pmap, addr); if (pdep != NULL && (*pdep & PG_V)) { if (*pdep & PG_PS) { pte = *pdep; /* Compute the physical address of the 4KB page. */ pa = ((*pdep & PG_PS_FRAME) | (addr & PDRMASK)) & PG_FRAME; val = MINCORE_SUPER; } else { pte = *pmap_pde_to_pte(pdep, addr); pa = pte & PG_FRAME; val = 0; } } else { pte = 0; pa = 0; val = 0; } if ((pte & PG_V) != 0) { val |= MINCORE_INCORE; if ((pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if ((pte & PG_A) != 0) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; } if ((val & (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER)) != (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER) && (pte & (PG_MANAGED | PG_V)) == (PG_MANAGED | PG_V)) { /* Ensure that "PHYS_TO_VM_PAGE(pa)->object" doesn't change. 
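 */

/*
 * Before the lock retry that follows, note how the MINCORE_SUPER case
 * earlier in this function recovered the 4KB physical address inside a
 * 2MB mapping: the superpage frame from the PDE is combined with the low
 * bits of the virtual address.  A stand-alone sketch; the mask values
 * below are illustrative stand-ins for PG_PS_FRAME, PDRMASK, and
 * PG_FRAME.
 */
#include <stdint.h>

#define	SK_MC_PDRMASK	((uint64_t)0x1fffff)		/* 2MB - 1 */
#define	SK_MC_PG_FRAME	((uint64_t)0x000ffffffffff000)	/* 4KB frame bits */
#define	SK_MC_PS_FRAME	((uint64_t)0x000fffffffe00000)	/* 2MB frame bits */

static uint64_t
sk_superpage_4k_pa(uint64_t pde, uint64_t addr)
{

	/* 2MB frame from the PDE, offset within the superpage from va. */
	return (((pde & SK_MC_PS_FRAME) | (addr & SK_MC_PDRMASK)) &
	    SK_MC_PG_FRAME);
}

/*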
*/ if (vm_page_pa_tryrelock(pmap, pa, locked_pa)) goto retry; } else PA_UNLOCK_COND(*locked_pa); PMAP_UNLOCK(pmap); return (val); } static uint64_t pmap_pcid_alloc(pmap_t pmap, u_int cpuid) { uint32_t gen, new_gen, pcid_next; CRITICAL_ASSERT(curthread); gen = PCPU_GET(pcid_gen); if (pmap->pm_pcids[cpuid].pm_pcid == PMAP_PCID_KERN || pmap->pm_pcids[cpuid].pm_gen == gen) return (CR3_PCID_SAVE); pcid_next = PCPU_GET(pcid_next); KASSERT(pcid_next <= PMAP_PCID_OVERMAX, ("cpu %d pcid_next %#x", cpuid, pcid_next)); if (pcid_next == PMAP_PCID_OVERMAX) { new_gen = gen + 1; if (new_gen == 0) new_gen = 1; PCPU_SET(pcid_gen, new_gen); pcid_next = PMAP_PCID_KERN + 1; } else { new_gen = gen; } pmap->pm_pcids[cpuid].pm_pcid = pcid_next; pmap->pm_pcids[cpuid].pm_gen = new_gen; PCPU_SET(pcid_next, pcid_next + 1); return (0); } void pmap_activate_sw(struct thread *td) { pmap_t oldpmap, pmap; uint64_t cached, cr3; u_int cpuid; oldpmap = PCPU_GET(curpmap); pmap = vmspace_pmap(td->td_proc->p_vmspace); if (oldpmap == pmap) return; cpuid = PCPU_GET(cpuid); #ifdef SMP CPU_SET_ATOMIC(cpuid, &pmap->pm_active); #else CPU_SET(cpuid, &pmap->pm_active); #endif cr3 = rcr3(); if (pmap_pcid_enabled) { cached = pmap_pcid_alloc(pmap, cpuid); KASSERT(pmap->pm_pcids[cpuid].pm_pcid >= 0 && pmap->pm_pcids[cpuid].pm_pcid < PMAP_PCID_OVERMAX, ("pmap %p cpu %d pcid %#x", pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); KASSERT(pmap->pm_pcids[cpuid].pm_pcid != PMAP_PCID_KERN || pmap == kernel_pmap, ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x", td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) { load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid | cached); if (cached) PCPU_INC(pm_save_cnt); } } else if (cr3 != pmap->pm_cr3) { load_cr3(pmap->pm_cr3); } PCPU_SET(curpmap, pmap); #ifdef SMP CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); #else CPU_CLR(cpuid, &oldpmap->pm_active); #endif } void pmap_activate(struct thread *td) { critical_enter(); pmap_activate_sw(td); critical_exit(); } void pmap_sync_icache(pmap_t pm, vm_offset_t va, vm_size_t sz) { } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more superpage mappings. 
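 */

/*
 * A compact sketch of the adjustment the function below performs: pick
 * the smallest address at or above "addr" whose offset within a 2MB
 * frame equals the mapping's "superpage_offset", so that object pages
 * and physical superpages line up.  The mask is a local stand-in for
 * PDRMASK.
 */
#include <stdint.h>

#define	SK_SP_PDRMASK	((uint64_t)0x1fffff)	/* 2MB - 1 */

static uint64_t
sk_align_superpage(uint64_t addr, uint64_t superpage_offset)
{

	if ((addr & SK_SP_PDRMASK) == superpage_offset)
		return (addr);		/* already suitably aligned */
	if ((addr & SK_SP_PDRMASK) < superpage_offset)
		return ((addr & ~SK_SP_PDRMASK) + superpage_offset);
	return (((addr + SK_SP_PDRMASK) & ~SK_SP_PDRMASK) +
	    superpage_offset);
}

/*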
*/ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { vm_offset_t superpage_offset; if (size < NBPDR) return; if (object != NULL && (object->flags & OBJ_COLORED) != 0) offset += ptoa(object->pg_color); superpage_offset = offset & PDRMASK; if (size - ((NBPDR - superpage_offset) & PDRMASK) < NBPDR || (*addr & PDRMASK) == superpage_offset) return; if ((*addr & PDRMASK) < superpage_offset) *addr = (*addr & ~PDRMASK) + superpage_offset; else *addr = ((*addr + PDRMASK) & ~PDRMASK) + superpage_offset; } #ifdef INVARIANTS static unsigned long num_dirty_emulations; SYSCTL_ULONG(_vm_pmap, OID_AUTO, num_dirty_emulations, CTLFLAG_RW, &num_dirty_emulations, 0, NULL); static unsigned long num_accessed_emulations; SYSCTL_ULONG(_vm_pmap, OID_AUTO, num_accessed_emulations, CTLFLAG_RW, &num_accessed_emulations, 0, NULL); static unsigned long num_superpage_accessed_emulations; SYSCTL_ULONG(_vm_pmap, OID_AUTO, num_superpage_accessed_emulations, CTLFLAG_RW, &num_superpage_accessed_emulations, 0, NULL); static unsigned long ad_emulation_superpage_promotions; SYSCTL_ULONG(_vm_pmap, OID_AUTO, ad_emulation_superpage_promotions, CTLFLAG_RW, &ad_emulation_superpage_promotions, 0, NULL); #endif /* INVARIANTS */ int pmap_emulate_accessed_dirty(pmap_t pmap, vm_offset_t va, int ftype) { int rv; struct rwlock *lock; vm_page_t m, mpte; pd_entry_t *pde; pt_entry_t *pte, PG_A, PG_M, PG_RW, PG_V; KASSERT(ftype == VM_PROT_READ || ftype == VM_PROT_WRITE, ("pmap_emulate_accessed_dirty: invalid fault type %d", ftype)); if (!pmap_emulate_ad_bits(pmap)) return (-1); PG_A = pmap_accessed_bit(pmap); PG_M = pmap_modified_bit(pmap); PG_V = pmap_valid_bit(pmap); PG_RW = pmap_rw_bit(pmap); rv = -1; lock = NULL; PMAP_LOCK(pmap); pde = pmap_pde(pmap, va); if (pde == NULL || (*pde & PG_V) == 0) goto done; if ((*pde & PG_PS) != 0) { if (ftype == VM_PROT_READ) { #ifdef INVARIANTS atomic_add_long(&num_superpage_accessed_emulations, 1); #endif *pde |= PG_A; rv = 0; } goto done; } pte = pmap_pde_to_pte(pde, va); if ((*pte & PG_V) == 0) goto done; if (ftype == VM_PROT_WRITE) { if ((*pte & PG_RW) == 0) goto done; /* * Set the modified and accessed bits simultaneously. * * Intel EPT PTEs that do software emulation of A/D bits map * PG_A and PG_M to EPT_PG_READ and EPT_PG_WRITE respectively. * An EPT misconfiguration is triggered if the PTE is writable * but not readable (WR=10). This is avoided by setting PG_A * and PG_M simultaneously. 
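 */

/*
 * The store that follows applies the rule just described: PG_M and PG_A
 * are set in a single operation, so an EPT PTE doing software A/D
 * emulation is never observed writable but not readable.  Sketched in
 * isolation below; the bit positions are placeholders, not the kernel's
 * definitions.
 */
#include <stdint.h>

#define	SK_PG_A	((uint64_t)1 << 5)	/* placeholder "accessed" bit */
#define	SK_PG_M	((uint64_t)1 << 6)	/* placeholder "modified" bit */

static void
sk_mark_write_fault(uint64_t *pte)
{

	/* One store: no writable-but-not-readable intermediate state. */
	*pte |= SK_PG_M | SK_PG_A;
}

/*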
*/ *pte |= PG_M | PG_A; } else { *pte |= PG_A; } /* try to promote the mapping */ if (va < VM_MAXUSER_ADDRESS) mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME); else mpte = NULL; m = PHYS_TO_VM_PAGE(*pte & PG_FRAME); if ((mpte == NULL || mpte->wire_count == NPTEPG) && pmap_ps_enabled(pmap) && (m->flags & PG_FICTITIOUS) == 0 && vm_reserv_level_iffullpop(m) == 0) { pmap_promote_pde(pmap, pde, va, &lock); #ifdef INVARIANTS atomic_add_long(&ad_emulation_superpage_promotions, 1); #endif } #ifdef INVARIANTS if (ftype == VM_PROT_WRITE) atomic_add_long(&num_dirty_emulations, 1); else atomic_add_long(&num_accessed_emulations, 1); #endif rv = 0; /* success */ done: if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); return (rv); } void pmap_get_mapping(pmap_t pmap, vm_offset_t va, uint64_t *ptr, int *num) { pml4_entry_t *pml4; pdp_entry_t *pdp; pd_entry_t *pde; pt_entry_t *pte, PG_V; int idx; idx = 0; PG_V = pmap_valid_bit(pmap); PMAP_LOCK(pmap); pml4 = pmap_pml4e(pmap, va); ptr[idx++] = *pml4; if ((*pml4 & PG_V) == 0) goto done; pdp = pmap_pml4e_to_pdpe(pml4, va); ptr[idx++] = *pdp; if ((*pdp & PG_V) == 0 || (*pdp & PG_PS) != 0) goto done; pde = pmap_pdpe_to_pde(pdp, va); ptr[idx++] = *pde; if ((*pde & PG_V) == 0 || (*pde & PG_PS) != 0) goto done; pte = pmap_pde_to_pte(pde, va); ptr[idx++] = *pte; done: PMAP_UNLOCK(pmap); *num = idx; } /** * Get the kernel virtual address of a set of physical pages. If there are * physical addresses not covered by the DMAP perform a transient mapping * that will be removed when calling pmap_unmap_io_transient. * * \param page The pages the caller wishes to obtain the virtual * address on the kernel memory map. * \param vaddr On return contains the kernel virtual memory address * of the pages passed in the page parameter. * \param count Number of pages passed in. * \param can_fault TRUE if the thread using the mapped pages can take * page faults, FALSE otherwise. * * \returns TRUE if the caller must call pmap_unmap_io_transient when * finished or FALSE otherwise. * */ boolean_t pmap_map_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; boolean_t needs_mapping; pt_entry_t *pte; int cache_bits, error, i; /* * Allocate any KVA space that we need, this is done in a separate * loop to prevent calling vmem_alloc while pinned. */ needs_mapping = FALSE; for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (__predict_false(paddr >= dmaplimit)) { error = vmem_alloc(kernel_arena, PAGE_SIZE, M_BESTFIT | M_WAITOK, &vaddr[i]); KASSERT(error == 0, ("vmem_alloc failed: %d", error)); needs_mapping = TRUE; } else { vaddr[i] = PHYS_TO_DMAP(paddr); } } /* Exit early if everything is covered by the DMAP */ if (!needs_mapping) return (FALSE); /* * NB: The sequence of updating a page table followed by accesses * to the corresponding pages used in the !DMAP case is subject to * the situation described in the "AMD64 Architecture Programmer's * Manual Volume 2: System Programming" rev. 3.23, "7.3.1 Special * Coherency Considerations". Therefore, issuing the INVLPG right * after modifying the PTE bits is crucial. */ if (!can_fault) sched_pin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (paddr >= dmaplimit) { if (can_fault) { /* * Slow path, since we can get page faults * while mappings are active don't pin the * thread to the CPU and instead add a global * mapping visible to all CPUs. 
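 */

/*
 * A user-space caricature of the decision this loop is making: pages
 * whose physical addresses fall under the direct-map limit are addressed
 * through the DMAP for free, while anything above it needs a transient
 * KVA mapping that must be torn down later.  "dmaplimit" here is a
 * parameter standing in for the kernel global.
 */
#include <stdint.h>

static int
sk_plan_mappings(const uint64_t paddr[], int count, uint64_t dmaplimit)
{
	int i, transients;

	transients = 0;
	for (i = 0; i < count; i++)
		if (paddr[i] >= dmaplimit)	/* not covered by DMAP */
			transients++;
	return (transients);
}

/*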
*/ pmap_qenter(vaddr[i], &page[i], 1); } else { pte = vtopte(vaddr[i]); cache_bits = pmap_cache_bits(kernel_pmap, page[i]->md.pat_mode, 0); pte_store(pte, paddr | X86_PG_RW | X86_PG_V | cache_bits); invlpg(vaddr[i]); } } } return (needs_mapping); } void pmap_unmap_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; int i; if (!can_fault) sched_unpin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (paddr >= dmaplimit) { if (can_fault) pmap_qremove(vaddr[i], 1); vmem_free(kernel_arena, vaddr[i], PAGE_SIZE); } } } vm_offset_t pmap_quick_enter_page(vm_page_t m) { vm_paddr_t paddr; paddr = VM_PAGE_TO_PHYS(m); if (paddr < dmaplimit) return (PHYS_TO_DMAP(paddr)); mtx_lock_spin(&qframe_mtx); KASSERT(*vtopte(qframe) == 0, ("qframe busy")); pte_store(vtopte(qframe), paddr | X86_PG_RW | X86_PG_V | X86_PG_A | X86_PG_M | pmap_cache_bits(kernel_pmap, m->md.pat_mode, 0)); return (qframe); } void pmap_quick_remove_page(vm_offset_t addr) { if (addr != qframe) return; pte_store(vtopte(qframe), 0); invlpg(qframe); mtx_unlock_spin(&qframe_mtx); } #include "opt_ddb.h" #ifdef DDB #include DB_SHOW_COMMAND(pte, pmap_print_pte) { pmap_t pmap; pml4_entry_t *pml4; pdp_entry_t *pdp; pd_entry_t *pde; pt_entry_t *pte, PG_V; vm_offset_t va; if (have_addr) { va = (vm_offset_t)addr; pmap = PCPU_GET(curpmap); /* XXX */ } else { db_printf("show pte addr\n"); return; } PG_V = pmap_valid_bit(pmap); pml4 = pmap_pml4e(pmap, va); db_printf("VA %#016lx pml4e %#016lx", va, *pml4); if ((*pml4 & PG_V) == 0) { db_printf("\n"); return; } pdp = pmap_pml4e_to_pdpe(pml4, va); db_printf(" pdpe %#016lx", *pdp); if ((*pdp & PG_V) == 0 || (*pdp & PG_PS) != 0) { db_printf("\n"); return; } pde = pmap_pdpe_to_pde(pdp, va); db_printf(" pde %#016lx", *pde); if ((*pde & PG_V) == 0 || (*pde & PG_PS) != 0) { db_printf("\n"); return; } pte = pmap_pde_to_pte(pde, va); db_printf(" pte %#016lx\n", *pte); } DB_SHOW_COMMAND(phys2dmap, pmap_phys2dmap) { vm_paddr_t a; if (have_addr) { a = (vm_paddr_t)addr; db_printf("0x%jx\n", (uintmax_t)PHYS_TO_DMAP(a)); } else { db_printf("show phys2dmap addr\n"); } } #endif Index: projects/clang390-import/sys/arm/arm/pmap-v6.c =================================================================== --- projects/clang390-import/sys/arm/arm/pmap-v6.c (revision 305686) +++ projects/clang390-import/sys/arm/arm/pmap-v6.c (revision 305687) @@ -1,6806 +1,6800 @@ /*- * Copyright (c) 1991 Regents of the University of California. * Copyright (c) 1994 John S. Dyson * Copyright (c) 1994 David Greenman * Copyright (c) 2005-2010 Alan L. Cox * Copyright (c) 2014-2016 Svatopluk Kraus * Copyright (c) 2014-2016 Michal Meloun * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ /*- * Copyright (c) 2003 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Jake Burkholder, * Safeport Network Services, and Network Associates Laboratories, the * Security Research Division of Network Associates, Inc. under * DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA * CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is * also stored by the logical address mapping module, * this module may throw away valid virtual-to-physical * mappings at almost any time. However, invalidations * of virtual-to-physical mappings must be done as * requested. * * In order to cope with hardware architectures which * make virtual-to-physical map invalidates expensive, * this module may delay invalidate or reduced protection * operations until such time as they are actually * necessary. This module is given full information as * to which processors are currently using which maps, * and to when physical maps must be made correct. 
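 */

/*
 * Purely illustrative model of the "delay invalidate ... until actually
 * necessary" strategy described above: record that a flush is owed and
 * perform one combined invalidation later, instead of paying for every
 * mapping change immediately.  Nothing below is kernel API; it only
 * sketches the idea in a single-threaded setting.
 */
#include <stdbool.h>

static bool sk_flush_pending;

static void
sk_mapping_changed(void)
{

	sk_flush_pending = true;	/* defer the expensive flush */
}

static void
sk_before_map_is_used(void (*flush_all)(void))
{

	if (sk_flush_pending) {
		flush_all();		/* one batched invalidation */
		sk_flush_pending = false;
	}
}

/*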
*/ #include "opt_vm.h" #include "opt_pmap.h" #include "opt_ddb.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #else #include #endif #ifdef DDB #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SMP #include #endif #ifndef PMAP_SHPGPERPROC #define PMAP_SHPGPERPROC 200 #endif #ifndef DIAGNOSTIC #define PMAP_INLINE __inline #else #define PMAP_INLINE #endif #ifdef PMAP_DEBUG static void pmap_zero_page_check(vm_page_t m); void pmap_debug(int level); int pmap_pid_dump(int pid); #define PDEBUG(_lev_,_stat_) \ if (pmap_debug_level >= (_lev_)) \ ((_stat_)) #define dprintf printf int pmap_debug_level = 1; #else /* PMAP_DEBUG */ #define PDEBUG(_lev_,_stat_) /* Nothing */ #define dprintf(x, arg...) #endif /* PMAP_DEBUG */ /* * Level 2 page tables map definion ('max' is excluded). */ #define PT2V_MIN_ADDRESS ((vm_offset_t)PT2MAP) #define PT2V_MAX_ADDRESS ((vm_offset_t)PT2MAP + PT2MAP_SIZE) #define UPT2V_MIN_ADDRESS ((vm_offset_t)PT2MAP) #define UPT2V_MAX_ADDRESS \ ((vm_offset_t)(PT2MAP + (KERNBASE >> PT2MAP_SHIFT))) /* * Promotion to a 1MB (PTE1) page mapping requires that the corresponding * 4KB (PTE2) page mappings have identical settings for the following fields: */ #define PTE2_PROMOTE (PTE2_V | PTE2_A | PTE2_NM | PTE2_S | PTE2_NG | \ PTE2_NX | PTE2_RO | PTE2_U | PTE2_W | \ PTE2_ATTR_MASK) #define PTE1_PROMOTE (PTE1_V | PTE1_A | PTE1_NM | PTE1_S | PTE1_NG | \ PTE1_NX | PTE1_RO | PTE1_U | PTE1_W | \ PTE1_ATTR_MASK) #define ATTR_TO_L1(l2_attr) ((((l2_attr) & L2_TEX0) ? L1_S_TEX0 : 0) | \ (((l2_attr) & L2_C) ? L1_S_C : 0) | \ (((l2_attr) & L2_B) ? L1_S_B : 0) | \ (((l2_attr) & PTE2_A) ? PTE1_A : 0) | \ (((l2_attr) & PTE2_NM) ? PTE1_NM : 0) | \ (((l2_attr) & PTE2_S) ? PTE1_S : 0) | \ (((l2_attr) & PTE2_NG) ? PTE1_NG : 0) | \ (((l2_attr) & PTE2_NX) ? PTE1_NX : 0) | \ (((l2_attr) & PTE2_RO) ? PTE1_RO : 0) | \ (((l2_attr) & PTE2_U) ? PTE1_U : 0) | \ (((l2_attr) & PTE2_W) ? PTE1_W : 0)) #define ATTR_TO_L2(l1_attr) ((((l1_attr) & L1_S_TEX0) ? L2_TEX0 : 0) | \ (((l1_attr) & L1_S_C) ? L2_C : 0) | \ (((l1_attr) & L1_S_B) ? L2_B : 0) | \ (((l1_attr) & PTE1_A) ? PTE2_A : 0) | \ (((l1_attr) & PTE1_NM) ? PTE2_NM : 0) | \ (((l1_attr) & PTE1_S) ? PTE2_S : 0) | \ (((l1_attr) & PTE1_NG) ? PTE2_NG : 0) | \ (((l1_attr) & PTE1_NX) ? PTE2_NX : 0) | \ (((l1_attr) & PTE1_RO) ? PTE2_RO : 0) | \ (((l1_attr) & PTE1_U) ? PTE2_U : 0) | \ (((l1_attr) & PTE1_W) ? PTE2_W : 0)) /* * PTE2 descriptors creation macros. */ #define PTE2_ATTR_DEFAULT vm_memattr_to_pte2(VM_MEMATTR_DEFAULT) #define PTE2_ATTR_PT vm_memattr_to_pte2(pt_memattr) #define PTE2_KPT(pa) PTE2_KERN(pa, PTE2_AP_KRW, PTE2_ATTR_PT) #define PTE2_KPT_NG(pa) PTE2_KERN_NG(pa, PTE2_AP_KRW, PTE2_ATTR_PT) #define PTE2_KRW(pa) PTE2_KERN(pa, PTE2_AP_KRW, PTE2_ATTR_DEFAULT) #define PTE2_KRO(pa) PTE2_KERN(pa, PTE2_AP_KR, PTE2_ATTR_DEFAULT) #define PV_STATS #ifdef PV_STATS #define PV_STAT(x) do { x ; } while (0) #else #define PV_STAT(x) do { } while (0) #endif /* * The boot_pt1 is used temporary in very early boot stage as L1 page table. 
* We can init many things with no memory allocation thanks to its static * allocation and this brings two main advantages: * (1) other cores can be started very simply, * (2) various boot loaders can be supported as its arguments can be processed * in virtual address space and can be moved to safe location before * first allocation happened. * Only disadvantage is that boot_pt1 is used only in very early boot stage. * However, the table is uninitialized and so lays in bss. Therefore kernel * image size is not influenced. * * QQQ: In the future, maybe, boot_pt1 can be used for soft reset and * CPU suspend/resume game. */ extern pt1_entry_t boot_pt1[]; vm_paddr_t base_pt1; pt1_entry_t *kern_pt1; pt2_entry_t *kern_pt2tab; pt2_entry_t *PT2MAP; static uint32_t ttb_flags; static vm_memattr_t pt_memattr; ttb_entry_t pmap_kern_ttb; struct pmap kernel_pmap_store; LIST_HEAD(pmaplist, pmap); static struct pmaplist allpmaps; static struct mtx allpmaps_lock; vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ static vm_offset_t kernel_vm_end_new; vm_offset_t kernel_vm_end = KERNBASE + NKPT2PG * NPT2_IN_PG * PTE1_SIZE; vm_offset_t vm_max_kernel_address; vm_paddr_t kernel_l1pa; static struct rwlock __aligned(CACHE_LINE_SIZE) pvh_global_lock; /* * Data for the pv entry allocation mechanism */ static TAILQ_HEAD(pch, pv_chunk) pv_chunks = TAILQ_HEAD_INITIALIZER(pv_chunks); static int pv_entry_count = 0, pv_entry_max = 0, pv_entry_high_water = 0; static struct md_page *pv_table; /* XXX: Is it used only the list in md_page? */ static int shpgperproc = PMAP_SHPGPERPROC; struct pv_chunk *pv_chunkbase; /* KVA block for pv_chunks */ int pv_maxchunks; /* How many chunks we have KVA for */ vm_offset_t pv_vafree; /* freelist stored in the PTE */ vm_paddr_t first_managed_pa; #define pa_to_pvh(pa) (&pv_table[pte1_index(pa - first_managed_pa)]) /* * All those kernel PT submaps that BSD is so fond of */ struct sysmaps { struct mtx lock; pt2_entry_t *CMAP1; pt2_entry_t *CMAP2; pt2_entry_t *CMAP3; caddr_t CADDR1; caddr_t CADDR2; caddr_t CADDR3; }; static struct sysmaps sysmaps_pcpu[MAXCPU]; caddr_t _tmppt = 0; struct msgbuf *msgbufp = NULL; /* XXX move it to machdep.c */ /* * Crashdump maps. */ static caddr_t crashdumpmap; static pt2_entry_t *PMAP1 = NULL, *PMAP2; static pt2_entry_t *PADDR1 = NULL, *PADDR2; #ifdef DDB static pt2_entry_t *PMAP3; static pt2_entry_t *PADDR3; static int PMAP3cpu __unused; /* for SMP only */ #endif #ifdef SMP static int PMAP1cpu; static int PMAP1changedcpu; SYSCTL_INT(_debug, OID_AUTO, PMAP1changedcpu, CTLFLAG_RD, &PMAP1changedcpu, 0, "Number of times pmap_pte2_quick changed CPU with same PMAP1"); #endif static int PMAP1changed; SYSCTL_INT(_debug, OID_AUTO, PMAP1changed, CTLFLAG_RD, &PMAP1changed, 0, "Number of times pmap_pte2_quick changed PMAP1"); static int PMAP1unchanged; SYSCTL_INT(_debug, OID_AUTO, PMAP1unchanged, CTLFLAG_RD, &PMAP1unchanged, 0, "Number of times pmap_pte2_quick didn't change PMAP1"); static struct mtx PMAP2mutex; static __inline void pt2_wirecount_init(vm_page_t m); static boolean_t pmap_demote_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va); void cache_icache_sync_fresh(vm_offset_t va, vm_paddr_t pa, vm_size_t size); /* * Function to set the debug level of the pmap code. 
*/ #ifdef PMAP_DEBUG void pmap_debug(int level) { pmap_debug_level = level; dprintf("pmap_debug: level=%d\n", pmap_debug_level); } #endif /* PMAP_DEBUG */ /* * This table must corespond with memory attribute configuration in vm.h. * First entry is used for normal system mapping. * * Device memory is always marked as shared. * Normal memory is shared only in SMP . * Not outer shareable bits are not used yet. * Class 6 cannot be used on ARM11. */ #define TEXDEF_TYPE_SHIFT 0 #define TEXDEF_TYPE_MASK 0x3 #define TEXDEF_INNER_SHIFT 2 #define TEXDEF_INNER_MASK 0x3 #define TEXDEF_OUTER_SHIFT 4 #define TEXDEF_OUTER_MASK 0x3 #define TEXDEF_NOS_SHIFT 6 #define TEXDEF_NOS_MASK 0x1 #define TEX(t, i, o, s) \ ((t) << TEXDEF_TYPE_SHIFT) | \ ((i) << TEXDEF_INNER_SHIFT) | \ ((o) << TEXDEF_OUTER_SHIFT | \ ((s) << TEXDEF_NOS_SHIFT)) static uint32_t tex_class[8] = { /* type inner cache outer cache */ TEX(PRRR_MEM, NMRR_WB_WA, NMRR_WB_WA, 0), /* 0 - ATTR_WB_WA */ TEX(PRRR_MEM, NMRR_NC, NMRR_NC, 0), /* 1 - ATTR_NOCACHE */ TEX(PRRR_DEV, NMRR_NC, NMRR_NC, 0), /* 2 - ATTR_DEVICE */ TEX(PRRR_SO, NMRR_NC, NMRR_NC, 0), /* 3 - ATTR_SO */ TEX(PRRR_MEM, NMRR_WT, NMRR_WT, 0), /* 4 - ATTR_WT */ TEX(PRRR_MEM, NMRR_NC, NMRR_NC, 0), /* 5 - NOT USED YET */ TEX(PRRR_MEM, NMRR_NC, NMRR_NC, 0), /* 6 - NOT USED YET */ TEX(PRRR_MEM, NMRR_NC, NMRR_NC, 0), /* 7 - NOT USED YET */ }; #undef TEX static uint32_t pte2_attr_tab[8] = { PTE2_ATTR_WB_WA, /* 0 - VM_MEMATTR_WB_WA */ PTE2_ATTR_NOCACHE, /* 1 - VM_MEMATTR_NOCACHE */ PTE2_ATTR_DEVICE, /* 2 - VM_MEMATTR_DEVICE */ PTE2_ATTR_SO, /* 3 - VM_MEMATTR_SO */ PTE2_ATTR_WT, /* 4 - VM_MEMATTR_WRITE_THROUGH */ 0, /* 5 - NOT USED YET */ 0, /* 6 - NOT USED YET */ 0 /* 7 - NOT USED YET */ }; CTASSERT(VM_MEMATTR_WB_WA == 0); CTASSERT(VM_MEMATTR_NOCACHE == 1); CTASSERT(VM_MEMATTR_DEVICE == 2); CTASSERT(VM_MEMATTR_SO == 3); CTASSERT(VM_MEMATTR_WRITE_THROUGH == 4); static inline uint32_t vm_memattr_to_pte2(vm_memattr_t ma) { KASSERT((u_int)ma < 5, ("%s: bad vm_memattr_t %d", __func__, ma)); return (pte2_attr_tab[(u_int)ma]); } static inline uint32_t vm_page_pte2_attr(vm_page_t m) { return (vm_memattr_to_pte2(m->md.pat_mode)); } /* * Convert TEX definition entry to TTB flags. */ static uint32_t encode_ttb_flags(int idx) { uint32_t inner, outer, nos, reg; inner = (tex_class[idx] >> TEXDEF_INNER_SHIFT) & TEXDEF_INNER_MASK; outer = (tex_class[idx] >> TEXDEF_OUTER_SHIFT) & TEXDEF_OUTER_MASK; nos = (tex_class[idx] >> TEXDEF_NOS_SHIFT) & TEXDEF_NOS_MASK; reg = nos << 5; reg |= outer << 3; if (cpuinfo.coherent_walk) reg |= (inner & 0x1) << 6; reg |= (inner & 0x2) >> 1; #ifdef SMP reg |= 1 << 1; #endif return reg; } /* * Set TEX remapping registers in current CPU. */ void pmap_set_tex(void) { uint32_t prrr, nmrr; uint32_t type, inner, outer, nos; int i; #ifdef PMAP_PTE_NOCACHE /* XXX fixme */ if (cpuinfo.coherent_walk) { pt_memattr = VM_MEMATTR_WB_WA; ttb_flags = encode_ttb_flags(0); } else { pt_memattr = VM_MEMATTR_NOCACHE; ttb_flags = encode_ttb_flags(1); } #else pt_memattr = VM_MEMATTR_WB_WA; ttb_flags = encode_ttb_flags(0); #endif prrr = 0; nmrr = 0; /* Build remapping register from TEX classes. 
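 */

/*
 * The loop below packs the eight TEX class descriptors into the PRRR and
 * NMRR register images: two memory-type bits per class in PRRR, the
 * not-outer-shareable bit per class at bits 24..31, and two inner/outer
 * cache bits per class in the low/high halves of NMRR.  A stand-alone
 * sketch of the same packing, with the field extraction simplified to
 * direct array parameters:
 */
#include <stdint.h>

static void
sk_pack_tex(const uint8_t type[8], const uint8_t inner[8],
    const uint8_t outer[8], const uint8_t nos[8],
    uint32_t *prrr, uint32_t *nmrr)
{
	int i;

	*prrr = 0;
	*nmrr = 0;
	for (i = 0; i < 8; i++) {
		*prrr |= (uint32_t)(type[i] & 0x3) << (i * 2);
		*prrr |= (uint32_t)(nos[i] & 0x1) << (i + 24);
		*nmrr |= (uint32_t)(inner[i] & 0x3) << (i * 2);
		*nmrr |= (uint32_t)(outer[i] & 0x3) << (i * 2 + 16);
	}
}

/*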
*/ for (i = 0; i < 8; i++) { type = (tex_class[i] >> TEXDEF_TYPE_SHIFT) & TEXDEF_TYPE_MASK; inner = (tex_class[i] >> TEXDEF_INNER_SHIFT) & TEXDEF_INNER_MASK; outer = (tex_class[i] >> TEXDEF_OUTER_SHIFT) & TEXDEF_OUTER_MASK; nos = (tex_class[i] >> TEXDEF_NOS_SHIFT) & TEXDEF_NOS_MASK; prrr |= type << (i * 2); prrr |= nos << (i + 24); nmrr |= inner << (i * 2); nmrr |= outer << (i * 2 + 16); } /* Add shareable bits for device memory. */ prrr |= PRRR_DS0 | PRRR_DS1; /* Add shareable bits for normal memory in SMP case. */ #ifdef SMP prrr |= PRRR_NS1; #endif cp15_prrr_set(prrr); cp15_nmrr_set(nmrr); /* Caches are disabled, so full TLB flush should be enough. */ tlb_flush_all_local(); } /* * KERNBASE must be multiple of NPT2_IN_PG * PTE1_SIZE. In other words, * KERNBASE is mapped by first L2 page table in L2 page table page. It * meets same constrain due to PT2MAP being placed just under KERNBASE. */ CTASSERT((KERNBASE & (NPT2_IN_PG * PTE1_SIZE - 1)) == 0); CTASSERT((KERNBASE - VM_MAXUSER_ADDRESS) >= PT2MAP_SIZE); /* * In crazy dreams, PAGE_SIZE could be a multiple of PTE2_SIZE in general. * For now, anyhow, the following check must be fulfilled. */ CTASSERT(PAGE_SIZE == PTE2_SIZE); /* * We don't want to mess up MI code with all MMU and PMAP definitions, * so some things, which depend on other ones, are defined independently. * Now, it is time to check that we don't screw up something. */ CTASSERT(PDRSHIFT == PTE1_SHIFT); /* * Check L1 and L2 page table entries definitions consistency. */ CTASSERT(NB_IN_PT1 == (sizeof(pt1_entry_t) * NPTE1_IN_PT1)); CTASSERT(NB_IN_PT2 == (sizeof(pt2_entry_t) * NPTE2_IN_PT2)); /* * Check L2 page tables page consistency. */ CTASSERT(PAGE_SIZE == (NPT2_IN_PG * NB_IN_PT2)); CTASSERT((1 << PT2PG_SHIFT) == NPT2_IN_PG); /* * Check PT2TAB consistency. * PT2TAB_ENTRIES is defined as a division of NPTE1_IN_PT1 by NPT2_IN_PG. * This should be done without remainder. */ CTASSERT(NPTE1_IN_PT1 == (PT2TAB_ENTRIES * NPT2_IN_PG)); /* * A PT2MAP magic. * * All level 2 page tables (PT2s) are mapped continuously and accordingly * into PT2MAP address space. As PT2 size is less than PAGE_SIZE, this can * be done only if PAGE_SIZE is a multiple of PT2 size. All PT2s in one page * must be used together, but not necessary at once. The first PT2 in a page * must map things on correctly aligned address and the others must follow * in right order. */ #define NB_IN_PT2TAB (PT2TAB_ENTRIES * sizeof(pt2_entry_t)) #define NPT2_IN_PT2TAB (NB_IN_PT2TAB / NB_IN_PT2) #define NPG_IN_PT2TAB (NB_IN_PT2TAB / PAGE_SIZE) /* * Check PT2TAB consistency. * NPT2_IN_PT2TAB is defined as a division of NB_IN_PT2TAB by NB_IN_PT2. * NPG_IN_PT2TAB is defined as a division of NB_IN_PT2TAB by PAGE_SIZE. * The both should be done without remainder. */ CTASSERT(NB_IN_PT2TAB == (NPT2_IN_PT2TAB * NB_IN_PT2)); CTASSERT(NB_IN_PT2TAB == (NPG_IN_PT2TAB * PAGE_SIZE)); /* * The implementation was made general, however, with the assumption * bellow in mind. In case of another value of NPG_IN_PT2TAB, * the code should be once more rechecked. */ CTASSERT(NPG_IN_PT2TAB == 1); /* * Get offset of PT2 in a page * associated with given PT1 index. */ static __inline u_int page_pt2off(u_int pt1_idx) { return ((pt1_idx & PT2PG_MASK) * NB_IN_PT2); } /* * Get physical address of PT2 * associated with given PT2s page and PT1 index. */ static __inline vm_paddr_t page_pt2pa(vm_paddr_t pgpa, u_int pt1_idx) { return (pgpa + page_pt2off(pt1_idx)); } /* * Get first entry of PT2 * associated with given PT2s page and PT1 index. 
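 */

/*
 * Illustration of the offset arithmetic used by page_pt2off() above and
 * page_pt2() below: several small L2 tables share one 4KB page, so the
 * PT1 index selects a table within the page by masking and scaling.  The
 * sizes below assume the usual ARM short-descriptor layout (1KB coarse
 * tables, so four per 4KB page); they are stated here as assumptions,
 * not taken from the kernel headers.
 */
#include <stdint.h>

#define	SK_NB_IN_PT2	1024u			/* 256 entries * 4 bytes */
#define	SK_NPT2_IN_PG	(4096u / SK_NB_IN_PT2)	/* 4 tables per page */
#define	SK_PT2PG_MASK	(SK_NPT2_IN_PG - 1)

static uint32_t
sk_page_pt2off(uint32_t pt1_idx)
{

	/* Which of the tables within the page, times the table size. */
	return ((pt1_idx & SK_PT2PG_MASK) * SK_NB_IN_PT2);
}

/*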
*/ static __inline pt2_entry_t * page_pt2(vm_offset_t pgva, u_int pt1_idx) { return ((pt2_entry_t *)(pgva + page_pt2off(pt1_idx))); } /* * Get virtual address of PT2s page (mapped in PT2MAP) * which holds PT2 which holds entry which maps given virtual address. */ static __inline vm_offset_t pt2map_pt2pg(vm_offset_t va) { va &= ~(NPT2_IN_PG * PTE1_SIZE - 1); return ((vm_offset_t)pt2map_entry(va)); } /***************************************************************************** * * THREE pmap initialization milestones exist: * * locore.S * -> fundamental init (including MMU) in ASM * * initarm() * -> fundamental init continues in C * -> first available physical address is known * * pmap_bootstrap_prepare() -> FIRST PMAP MILESTONE (first epoch begins) * -> basic (safe) interface for physical address allocation is made * -> basic (safe) interface for virtual mapping is made * -> limited not SMP coherent work is possible * * -> more fundamental init continues in C * -> locks and some more things are available * -> all fundamental allocations and mappings are done * * pmap_bootstrap() -> SECOND PMAP MILESTONE (second epoch begins) * -> phys_avail[] and virtual_avail is set * -> control is passed to vm subsystem * -> physical and virtual address allocation are off limit * -> low level mapping functions, some SMP coherent, * are available, which cannot be used before vm subsystem * is being inited * * mi_startup() * -> vm subsystem is being inited * * pmap_init() -> THIRD PMAP MILESTONE (third epoch begins) * -> pmap is fully inited * *****************************************************************************/ /***************************************************************************** * * PMAP first stage initialization and utility functions * for pre-bootstrap epoch. * * After pmap_bootstrap_prepare() is called, the following functions * can be used: * * (1) strictly only for this stage functions for physical page allocations, * virtual space allocations, and mappings: * * vm_paddr_t pmap_preboot_get_pages(u_int num); * void pmap_preboot_map_pages(vm_paddr_t pa, vm_offset_t va, u_int num); * vm_offset_t pmap_preboot_reserve_pages(u_int num); * vm_offset_t pmap_preboot_get_vpages(u_int num); * void pmap_preboot_map_attr(vm_paddr_t pa, vm_offset_t va, vm_size_t size, * vm_prot_t prot, vm_memattr_t attr); * * (2) for all stages: * * vm_paddr_t pmap_kextract(vm_offset_t va); * * NOTE: This is not SMP coherent stage. * *****************************************************************************/ #define KERNEL_P2V(pa) \ ((vm_offset_t)((pa) - arm_physmem_kernaddr + KERNVIRTADDR)) #define KERNEL_V2P(va) \ ((vm_paddr_t)((va) - KERNVIRTADDR + arm_physmem_kernaddr)) static vm_paddr_t last_paddr; /* * Pre-bootstrap epoch page allocator. */ vm_paddr_t pmap_preboot_get_pages(u_int num) { vm_paddr_t ret; ret = last_paddr; last_paddr += num * PAGE_SIZE; return (ret); } /* * The fundamental initialization of PMAP stuff. * * Some things already happened in locore.S and some things could happen * before pmap_bootstrap_prepare() is called, so let's recall what is done: * 1. Caches are disabled. * 2. We are running on virtual addresses already with 'boot_pt1' * as L1 page table. * 3. So far, all virtual addresses can be converted to physical ones and * vice versa by the following macros: * KERNEL_P2V(pa) .... physical to virtual ones, * KERNEL_V2P(va) .... virtual to physical ones. * * What is done herein: * 1. The 'boot_pt1' is replaced by real kernel L1 page table 'kern_pt1'. * 2. 
PT2MAP magic is brought to live. * 3. Basic preboot functions for page allocations and mappings can be used. * 4. Everything is prepared for L1 cache enabling. * * Variations: * 1. To use second TTB register, so kernel and users page tables will be * separated. This way process forking - pmap_pinit() - could be faster, * it saves physical pages and KVA per a process, and it's simple change. * However, it will lead, due to hardware matter, to the following: * (a) 2G space for kernel and 2G space for users. * (b) 1G space for kernel in low addresses and 3G for users above it. * A question is: Is the case (b) really an option? Note that case (b) * does save neither physical memory and KVA. */ void pmap_bootstrap_prepare(vm_paddr_t last) { vm_paddr_t pt2pg_pa, pt2tab_pa, pa, size; vm_offset_t pt2pg_va; pt1_entry_t *pte1p; pt2_entry_t *pte2p; u_int i; uint32_t actlr_mask, actlr_set, l1_attr; /* * Now, we are going to make real kernel mapping. Note that we are * already running on some mapping made in locore.S and we expect * that it's large enough to ensure nofault access to physical memory * allocated herein before switch. * * As kernel image and everything needed before are and will be mapped * by section mappings, we align last physical address to PTE1_SIZE. */ last_paddr = pte1_roundup(last); /* * Allocate and zero page(s) for kernel L1 page table. * * Note that it's first allocation on space which was PTE1_SIZE * aligned and as such base_pt1 is aligned to NB_IN_PT1 too. */ base_pt1 = pmap_preboot_get_pages(NPG_IN_PT1); kern_pt1 = (pt1_entry_t *)KERNEL_P2V(base_pt1); bzero((void*)kern_pt1, NB_IN_PT1); pte1_sync_range(kern_pt1, NB_IN_PT1); /* Allocate and zero page(s) for kernel PT2TAB. */ pt2tab_pa = pmap_preboot_get_pages(NPG_IN_PT2TAB); kern_pt2tab = (pt2_entry_t *)KERNEL_P2V(pt2tab_pa); bzero(kern_pt2tab, NB_IN_PT2TAB); pte2_sync_range(kern_pt2tab, NB_IN_PT2TAB); /* Allocate and zero page(s) for kernel L2 page tables. */ pt2pg_pa = pmap_preboot_get_pages(NKPT2PG); pt2pg_va = KERNEL_P2V(pt2pg_pa); size = NKPT2PG * PAGE_SIZE; bzero((void*)pt2pg_va, size); pte2_sync_range((pt2_entry_t *)pt2pg_va, size); /* * Add a physical memory segment (vm_phys_seg) corresponding to the * preallocated pages for kernel L2 page tables so that vm_page * structures representing these pages will be created. The vm_page * structures are required for promotion of the corresponding kernel * virtual addresses to section mappings. */ vm_phys_add_seg(pt2tab_pa, pmap_preboot_get_pages(0)); /* * Insert allocated L2 page table pages to PT2TAB and make * link to all PT2s in L1 page table. See how kernel_vm_end * is initialized. * * We play simple and safe. So every KVA will have underlaying * L2 page table, even kernel image mapped by sections. */ pte2p = kern_pt2tab_entry(KERNBASE); for (pa = pt2pg_pa; pa < pt2pg_pa + size; pa += PTE2_SIZE) pt2tab_store(pte2p++, PTE2_KPT(pa)); pte1p = kern_pte1(KERNBASE); for (pa = pt2pg_pa; pa < pt2pg_pa + size; pa += NB_IN_PT2) pte1_store(pte1p++, PTE1_LINK(pa)); /* Make section mappings for kernel. */ l1_attr = ATTR_TO_L1(PTE2_ATTR_DEFAULT); pte1p = kern_pte1(KERNBASE); for (pa = KERNEL_V2P(KERNBASE); pa < last; pa += PTE1_SIZE) pte1_store(pte1p++, PTE1_KERN(pa, PTE1_AP_KRW, l1_attr)); /* * Get free and aligned space for PT2MAP and make L1 page table links * to L2 page tables held in PT2TAB. * * Note that pages holding PT2s are stored in PT2TAB as pt2_entry_t * descriptors and PT2TAB page(s) itself is(are) used as PT2s. Thus * each entry in PT2TAB maps all PT2s in a page. 
This implies that * virtual address of PT2MAP must be aligned to NPT2_IN_PG * PTE1_SIZE. */ PT2MAP = (pt2_entry_t *)(KERNBASE - PT2MAP_SIZE); pte1p = kern_pte1((vm_offset_t)PT2MAP); for (pa = pt2tab_pa, i = 0; i < NPT2_IN_PT2TAB; i++, pa += NB_IN_PT2) { pte1_store(pte1p++, PTE1_LINK(pa)); } /* * Store PT2TAB in PT2TAB itself, i.e. self reference mapping. * Each pmap will hold own PT2TAB, so the mapping should be not global. */ pte2p = kern_pt2tab_entry((vm_offset_t)PT2MAP); for (pa = pt2tab_pa, i = 0; i < NPG_IN_PT2TAB; i++, pa += PTE2_SIZE) { pt2tab_store(pte2p++, PTE2_KPT_NG(pa)); } /* * Choose correct L2 page table and make mappings for allocations * made herein which replaces temporary locore.S mappings after a while. * Note that PT2MAP cannot be used until we switch to kern_pt1. * * Note, that these allocations started aligned on 1M section and * kernel PT1 was allocated first. Making of mappings must follow * order of physical allocations as we've used KERNEL_P2V() macro * for virtual addresses resolution. */ pte2p = kern_pt2tab_entry((vm_offset_t)kern_pt1); pt2pg_va = KERNEL_P2V(pte2_pa(pte2_load(pte2p))); pte2p = page_pt2(pt2pg_va, pte1_index((vm_offset_t)kern_pt1)); /* Make mapping for kernel L1 page table. */ for (pa = base_pt1, i = 0; i < NPG_IN_PT1; i++, pa += PTE2_SIZE) pte2_store(pte2p++, PTE2_KPT(pa)); /* Make mapping for kernel PT2TAB. */ for (pa = pt2tab_pa, i = 0; i < NPG_IN_PT2TAB; i++, pa += PTE2_SIZE) pte2_store(pte2p++, PTE2_KPT(pa)); /* Finally, switch from 'boot_pt1' to 'kern_pt1'. */ pmap_kern_ttb = base_pt1 | ttb_flags; cpuinfo_get_actlr_modifier(&actlr_mask, &actlr_set); reinit_mmu(pmap_kern_ttb, actlr_mask, actlr_set); /* * Initialize the first available KVA. As kernel image is mapped by * sections, we are leaving some gap behind. */ virtual_avail = (vm_offset_t)kern_pt2tab + NPG_IN_PT2TAB * PAGE_SIZE; } /* * Setup L2 page table page for given KVA. * Used in pre-bootstrap epoch. * * Note that we have allocated NKPT2PG pages for L2 page tables in advance * and used them for mapping KVA starting from KERNBASE. However, this is not * enough. Vectors and devices need L2 page tables too. Note that they are * even above VM_MAX_KERNEL_ADDRESS. */ static __inline vm_paddr_t pmap_preboot_pt2pg_setup(vm_offset_t va) { pt2_entry_t *pte2p, pte2; vm_paddr_t pt2pg_pa; /* Get associated entry in PT2TAB. */ pte2p = kern_pt2tab_entry(va); /* Just return, if PT2s page exists already. */ pte2 = pt2tab_load(pte2p); if (pte2_is_valid(pte2)) return (pte2_pa(pte2)); KASSERT(va >= VM_MAX_KERNEL_ADDRESS, ("%s: NKPT2PG too small", __func__)); /* * Allocate page for PT2s and insert it to PT2TAB. * In other words, map it into PT2MAP space. */ pt2pg_pa = pmap_preboot_get_pages(1); pt2tab_store(pte2p, PTE2_KPT(pt2pg_pa)); /* Zero all PT2s in allocated page. */ bzero((void*)pt2map_pt2pg(va), PAGE_SIZE); pte2_sync_range((pt2_entry_t *)pt2map_pt2pg(va), PAGE_SIZE); return (pt2pg_pa); } /* * Setup L2 page table for given KVA. * Used in pre-bootstrap epoch. */ static void pmap_preboot_pt2_setup(vm_offset_t va) { pt1_entry_t *pte1p; vm_paddr_t pt2pg_pa, pt2_pa; /* Setup PT2's page. */ pt2pg_pa = pmap_preboot_pt2pg_setup(va); pt2_pa = page_pt2pa(pt2pg_pa, pte1_index(va)); /* Insert PT2 to PT1. */ pte1p = kern_pte1(va); pte1_store(pte1p, PTE1_LINK(pt2_pa)); } /* * Get L2 page entry associated with given KVA. * Used in pre-bootstrap epoch. */ static __inline pt2_entry_t* pmap_preboot_vtopte2(vm_offset_t va) { pt1_entry_t *pte1p; /* Setup PT2 if needed. 
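 */

/*
 * The body that follows is a lazy-initialization pattern: look up the L1
 * entry for "va", create and link the missing L2 table on first use, and
 * only then hand back the L2 entry pointer.  A generic user-space sketch
 * of the same shape, with an invented table type in place of the real
 * PT1/PT2 machinery:
 */
#include <stdlib.h>

struct sk_l2table {
	unsigned entries[256];
};

static struct sk_l2table *
sk_get_l2(struct sk_l2table **l1_slot)
{

	if (*l1_slot == NULL) {		/* set up the L2 table if needed */
		*l1_slot = calloc(1, sizeof(struct sk_l2table));
		if (*l1_slot == NULL)
			abort();	/* no recovery this early */
	}
	return (*l1_slot);
}

/*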
*/ pte1p = kern_pte1(va); if (!pte1_is_valid(pte1_load(pte1p))) /* XXX - sections ?! */ pmap_preboot_pt2_setup(va); return (pt2map_entry(va)); } /* * Pre-bootstrap epoch page(s) mapping(s). */ void pmap_preboot_map_pages(vm_paddr_t pa, vm_offset_t va, u_int num) { u_int i; pt2_entry_t *pte2p; /* Map all the pages. */ for (i = 0; i < num; i++) { pte2p = pmap_preboot_vtopte2(va); pte2_store(pte2p, PTE2_KRW(pa)); va += PAGE_SIZE; pa += PAGE_SIZE; } } /* * Pre-bootstrap epoch virtual space allocator. */ vm_offset_t pmap_preboot_reserve_pages(u_int num) { u_int i; vm_offset_t start, va; pt2_entry_t *pte2p; /* Allocate virtual space. */ start = va = virtual_avail; virtual_avail += num * PAGE_SIZE; /* Zero the mapping. */ for (i = 0; i < num; i++) { pte2p = pmap_preboot_vtopte2(va); pte2_store(pte2p, 0); va += PAGE_SIZE; } return (start); } /* * Pre-bootstrap epoch page(s) allocation and mapping(s). */ vm_offset_t pmap_preboot_get_vpages(u_int num) { vm_paddr_t pa; vm_offset_t va; /* Allocate physical page(s). */ pa = pmap_preboot_get_pages(num); /* Allocate virtual space. */ va = virtual_avail; virtual_avail += num * PAGE_SIZE; /* Map and zero all. */ pmap_preboot_map_pages(pa, va, num); bzero((void *)va, num * PAGE_SIZE); return (va); } /* * Pre-bootstrap epoch page mapping(s) with attributes. */ void pmap_preboot_map_attr(vm_paddr_t pa, vm_offset_t va, vm_size_t size, vm_prot_t prot, vm_memattr_t attr) { u_int num; u_int l1_attr, l1_prot, l2_prot, l2_attr; pt1_entry_t *pte1p; pt2_entry_t *pte2p; l2_prot = prot & VM_PROT_WRITE ? PTE2_AP_KRW : PTE2_AP_KR; l2_prot |= (prot & VM_PROT_EXECUTE) ? PTE2_X : PTE2_NX; l2_attr = vm_memattr_to_pte2(attr); l1_prot = ATTR_TO_L1(l2_prot); l1_attr = ATTR_TO_L1(l2_attr); /* Map all the pages. */ num = round_page(size); while (num > 0) { if ((((va | pa) & PTE1_OFFSET) == 0) && (num >= PTE1_SIZE)) { pte1p = kern_pte1(va); pte1_store(pte1p, PTE1_KERN(pa, l1_prot, l1_attr)); va += PTE1_SIZE; pa += PTE1_SIZE; num -= PTE1_SIZE; } else { pte2p = pmap_preboot_vtopte2(va); pte2_store(pte2p, PTE2_KERN(pa, l2_prot, l2_attr)); va += PAGE_SIZE; pa += PAGE_SIZE; num -= PAGE_SIZE; } } } /* * Extract from the kernel page table the physical address * that is mapped by the given virtual address "va". */ vm_paddr_t pmap_kextract(vm_offset_t va) { vm_paddr_t pa; pt1_entry_t pte1; pt2_entry_t pte2; pte1 = pte1_load(kern_pte1(va)); if (pte1_is_section(pte1)) { pa = pte1_pa(pte1) | (va & PTE1_OFFSET); } else if (pte1_is_link(pte1)) { /* * We should beware of a concurrent promotion that changes * pte1 at this point. However, it's not a problem, as the PT2 * page is preserved by promotion in PT2TAB. So even if * it happens, using PT2MAP is still safe. * * QQQ: However, concurrent removal is a problem, which * ends in an abort on PT2MAP space. Locking must be used * to deal with this. */ pte2 = pte2_load(pt2map_entry(va)); pa = pte2_pa(pte2) | (va & PTE2_OFFSET); } else { panic("%s: va %#x pte1 %#x", __func__, va, pte1); } return (pa); } /* * Extract from the kernel page table the physical address * that is mapped by the given virtual address "va". Also * return the L2 page table entry which maps the address. * * This is only intended to be used for panic dumps.
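* (Worked example, purely illustrative: a va of 0xc0123456 covered by a 1 MB section whose PTE1 frame is 0x80100000 extracts as 0x80100000 | (va & PTE1_OFFSET), i.e. pa 0x80123456.)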
*/ vm_paddr_t pmap_dump_kextract(vm_offset_t va, pt2_entry_t *pte2p) { vm_paddr_t pa; pt1_entry_t pte1; pt2_entry_t pte2; pte1 = pte1_load(kern_pte1(va)); if (pte1_is_section(pte1)) { pa = pte1_pa(pte1) | (va & PTE1_OFFSET); pte2 = pa | ATTR_TO_L2(pte1) | PTE2_V; } else if (pte1_is_link(pte1)) { pte2 = pte2_load(pt2map_entry(va)); pa = pte2_pa(pte2); } else { pte2 = 0; pa = 0; } if (pte2p != NULL) *pte2p = pte2; return (pa); } /***************************************************************************** * * PMAP second stage initialization and utility functions * for bootstrap epoch. * * After pmap_bootstrap() is called, the following functions for * mappings can be used: * * void pmap_kenter(vm_offset_t va, vm_paddr_t pa); * void pmap_kremove(vm_offset_t va); * vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, * int prot); * * NOTE: This is not SMP coherent stage. And physical page allocation is not * allowed during this stage. * *****************************************************************************/ /* * Initialize kernel PMAP locks and lists, kernel_pmap itself, and * reserve various virtual spaces for temporary mappings. */ void pmap_bootstrap(vm_offset_t firstaddr) { pt2_entry_t *unused __unused; struct sysmaps *sysmaps; u_int i; /* * Initialize the kernel pmap (which is statically allocated). */ PMAP_LOCK_INIT(kernel_pmap); kernel_l1pa = (vm_paddr_t)kern_pt1; /* for libkvm */ kernel_pmap->pm_pt1 = kern_pt1; kernel_pmap->pm_pt2tab = kern_pt2tab; CPU_FILL(&kernel_pmap->pm_active); /* don't allow deactivation */ TAILQ_INIT(&kernel_pmap->pm_pvchunk); /* * Initialize the global pv list lock. */ rw_init(&pvh_global_lock, "pmap pv global"); LIST_INIT(&allpmaps); /* * Request a spin mutex so that changes to allpmaps cannot be * preempted by smp_rendezvous_cpus(). */ mtx_init(&allpmaps_lock, "allpmaps", NULL, MTX_SPIN); mtx_lock_spin(&allpmaps_lock); LIST_INSERT_HEAD(&allpmaps, kernel_pmap, pm_list); mtx_unlock_spin(&allpmaps_lock); /* * Reserve some special page table entries/VA space for temporary * mapping of pages. */ #define SYSMAP(c, p, v, n) do { \ v = (c)pmap_preboot_reserve_pages(n); \ p = pt2map_entry((vm_offset_t)v); \ } while (0) /* * Local CMAP1/CMAP2 are used for zeroing and copying pages. * Local CMAP3 is used for data cache cleaning. */ for (i = 0; i < MAXCPU; i++) { sysmaps = &sysmaps_pcpu[i]; mtx_init(&sysmaps->lock, "SYSMAPS", NULL, MTX_DEF); SYSMAP(caddr_t, sysmaps->CMAP1, sysmaps->CADDR1, 1); SYSMAP(caddr_t, sysmaps->CMAP2, sysmaps->CADDR2, 1); SYSMAP(caddr_t, sysmaps->CMAP3, sysmaps->CADDR3, 1); } /* * Crashdump maps. */ SYSMAP(caddr_t, unused, crashdumpmap, MAXDUMPPGS); /* * _tmppt is used for reading arbitrary physical pages via /dev/mem. */ SYSMAP(caddr_t, unused, _tmppt, 1); /* * PADDR1 and PADDR2 are used by pmap_pte2_quick() and pmap_pte2(), * respectively. PADDR3 is used by pmap_pte2_ddb(). */ SYSMAP(pt2_entry_t *, PMAP1, PADDR1, 1); SYSMAP(pt2_entry_t *, PMAP2, PADDR2, 1); #ifdef DDB SYSMAP(pt2_entry_t *, PMAP3, PADDR3, 1); #endif mtx_init(&PMAP2mutex, "PMAP2", NULL, MTX_DEF); /* * Note that in very short time in initarm(), we are going to * initialize phys_avail[] array and no further page allocation * can happen after that until vm subsystem will be initialized. 
*/ kernel_vm_end_new = kernel_vm_end; virtual_end = vm_max_kernel_address; } static void pmap_init_qpages(void) { struct pcpu *pc; int i; CPU_FOREACH(i) { pc = pcpu_find(i); pc->pc_qmap_addr = kva_alloc(PAGE_SIZE); if (pc->pc_qmap_addr == 0) panic("%s: unable to allocate KVA", __func__); } } SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, pmap_init_qpages, NULL); /* * This function can already be used in the second initialization stage. * As such, the function DOES NOT call pmap_growkernel() where PT2 * allocation can happen. So if it is used, be sure that the PT2 for the * given virtual address is already allocated! * * Add a wired page to the kva. * Note: not SMP coherent. */ static __inline void pmap_kenter_prot_attr(vm_offset_t va, vm_paddr_t pa, uint32_t prot, uint32_t attr) { pt1_entry_t *pte1p; pt2_entry_t *pte2p; pte1p = kern_pte1(va); if (!pte1_is_valid(pte1_load(pte1p))) { /* XXX - sections ?! */ /* * This is a very low level function, so the PT2 and particularly * the PT2PG associated with the given virtual address must already * be allocated. It's a pain mainly during the pmap initialization * stage. However, calling it after pmap initialization with a * virtual address not under kernel_vm_end will lead to * the same misery. */ if (!pte2_is_valid(pte2_load(kern_pt2tab_entry(va)))) panic("%s: kernel PT2 not allocated!", __func__); } pte2p = pt2map_entry(va); pte2_store(pte2p, PTE2_KERN(pa, prot, attr)); } PMAP_INLINE void pmap_kenter(vm_offset_t va, vm_paddr_t pa) { pmap_kenter_prot_attr(va, pa, PTE2_AP_KRW, PTE2_ATTR_DEFAULT); } /* * Remove a page from the kernel pagetables. * Note: not SMP coherent. */ PMAP_INLINE void pmap_kremove(vm_offset_t va) { pt2_entry_t *pte2p; pte2p = pt2map_entry(va); pte2_clear(pte2p); } /* * Share a new kernel PT2PG with all pmaps. * The caller is responsible for maintaining TLB consistency. */ static void pmap_kenter_pt2tab(vm_offset_t va, pt2_entry_t npte2) { pmap_t pmap; pt2_entry_t *pte2p; mtx_lock_spin(&allpmaps_lock); LIST_FOREACH(pmap, &allpmaps, pm_list) { pte2p = pmap_pt2tab_entry(pmap, va); pt2tab_store(pte2p, npte2); } mtx_unlock_spin(&allpmaps_lock); } /* * Share a new kernel PTE1 with all pmaps. * The caller is responsible for maintaining TLB consistency. */ static void pmap_kenter_pte1(vm_offset_t va, pt1_entry_t npte1) { pmap_t pmap; pt1_entry_t *pte1p; mtx_lock_spin(&allpmaps_lock); LIST_FOREACH(pmap, &allpmaps, pm_list) { pte1p = pmap_pte1(pmap, va); pte1_store(pte1p, npte1); } mtx_unlock_spin(&allpmaps_lock); } /* * Used to map a range of physical addresses into kernel * virtual address space. * * The value passed in '*virt' is a suggested virtual address for * the mapping. Architectures which can support a direct-mapped * physical to virtual region can return the appropriate address * within that region, leaving '*virt' unchanged. Other * architectures should map the pages starting at '*virt' and * update '*virt' with the first usable address after the mapped * region. * * NOTE: Read the comments above pmap_kenter_prot_attr() as * that function is used herein! */ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { vm_offset_t va, sva; vm_paddr_t pte1_offset; pt1_entry_t npte1; uint32_t l1prot, l2prot; uint32_t l1attr, l2attr; PDEBUG(1, printf("%s: virt = %#x, start = %#x, end = %#x (size = %#x)," " prot = %d\n", __func__, *virt, start, end, end - start, prot)); l2prot = (prot & VM_PROT_WRITE) ? PTE2_AP_KRW : PTE2_AP_KR; l2prot |= (prot & VM_PROT_EXECUTE) ?
PTE2_X : PTE2_NX; l1prot = ATTR_TO_L1(l2prot); l2attr = PTE2_ATTR_DEFAULT; l1attr = ATTR_TO_L1(l2attr); va = *virt; /* * Does the physical address range's size and alignment permit at * least one section mapping to be created? */ pte1_offset = start & PTE1_OFFSET; if ((end - start) - ((PTE1_SIZE - pte1_offset) & PTE1_OFFSET) >= PTE1_SIZE) { /* * Increase the starting virtual address so that its alignment * does not preclude the use of section mappings. */ if ((va & PTE1_OFFSET) < pte1_offset) va = pte1_trunc(va) + pte1_offset; else if ((va & PTE1_OFFSET) > pte1_offset) va = pte1_roundup(va) + pte1_offset; } sva = va; while (start < end) { if ((start & PTE1_OFFSET) == 0 && end - start >= PTE1_SIZE) { KASSERT((va & PTE1_OFFSET) == 0, ("%s: misaligned va %#x", __func__, va)); npte1 = PTE1_KERN(start, l1prot, l1attr); pmap_kenter_pte1(va, npte1); va += PTE1_SIZE; start += PTE1_SIZE; } else { pmap_kenter_prot_attr(va, start, l2prot, l2attr); va += PAGE_SIZE; start += PAGE_SIZE; } } tlb_flush_range(sva, va - sva); *virt = va; return (sva); } /* * Make a temporary mapping for a physical address. * This is only intended to be used for panic dumps. */ void * pmap_kenter_temporary(vm_paddr_t pa, int i) { vm_offset_t va; /* QQQ: 'i' should be less or equal to MAXDUMPPGS. */ va = (vm_offset_t)crashdumpmap + (i * PAGE_SIZE); pmap_kenter(va, pa); tlb_flush_local(va); return ((void *)crashdumpmap); } /************************************* * * TLB & cache maintenance routines. * *************************************/ /* * We inline these within pmap.c for speed. */ PMAP_INLINE void pmap_tlb_flush(pmap_t pmap, vm_offset_t va) { if (pmap == kernel_pmap || !CPU_EMPTY(&pmap->pm_active)) tlb_flush(va); } PMAP_INLINE void pmap_tlb_flush_range(pmap_t pmap, vm_offset_t sva, vm_size_t size) { if (pmap == kernel_pmap || !CPU_EMPTY(&pmap->pm_active)) tlb_flush_range(sva, size); } /* * Abuse the pte2 nodes for unmapped kva to thread a kva freelist through. * Requirements: * - Must deal with pages in order to ensure that none of the PTE2_* bits * are ever set, PTE2_V in particular. * - Assumes we can write to pte2s without pte2_store() atomic ops. * - Assumes nothing will ever test these addresses for 0 to indicate * no mapping instead of correctly checking PTE2_V. * - Assumes a vm_offset_t will fit in a pte2 (true for arm). * Because PTE2_V is never set, there can be no mappings to invalidate. */ static vm_offset_t pmap_pte2list_alloc(vm_offset_t *head) { pt2_entry_t *pte2p; vm_offset_t va; va = *head; if (va == 0) panic("pmap_ptelist_alloc: exhausted ptelist KVA"); pte2p = pt2map_entry(va); *head = *pte2p; if (*head & PTE2_V) panic("%s: va with PTE2_V set!", __func__); *pte2p = 0; return (va); } static void pmap_pte2list_free(vm_offset_t *head, vm_offset_t va) { pt2_entry_t *pte2p; if (va & PTE2_V) panic("%s: freeing va with PTE2_V set!", __func__); pte2p = pt2map_entry(va); *pte2p = *head; /* virtual! PTE2_V is 0 though */ *head = va; } static void pmap_pte2list_init(vm_offset_t *head, void *base, int npages) { int i; vm_offset_t va; *head = 0; for (i = npages - 1; i >= 0; i--) { va = (vm_offset_t)base + i * PAGE_SIZE; pmap_pte2list_free(head, va); } } /***************************************************************************** * * PMAP third and final stage initialization. * * After pmap_init() is called, PMAP subsystem is fully initialized. 
* *****************************************************************************/ SYSCTL_NODE(_vm, OID_AUTO, pmap, CTLFLAG_RD, 0, "VM/pmap parameters"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_max, CTLFLAG_RD, &pv_entry_max, 0, "Max number of PV entries"); SYSCTL_INT(_vm_pmap, OID_AUTO, shpgperproc, CTLFLAG_RD, &shpgperproc, 0, "Page share factor per proc"); static u_long nkpt2pg = NKPT2PG; SYSCTL_ULONG(_vm_pmap, OID_AUTO, nkpt2pg, CTLFLAG_RD, &nkpt2pg, 0, "Pre-allocated pages for kernel PT2s"); static int sp_enabled = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, sp_enabled, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &sp_enabled, 0, "Are large page mappings enabled?"); static SYSCTL_NODE(_vm_pmap, OID_AUTO, pte1, CTLFLAG_RD, 0, "1MB page mapping counters"); static u_long pmap_pte1_demotions; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, demotions, CTLFLAG_RD, &pmap_pte1_demotions, 0, "1MB page demotions"); static u_long pmap_pte1_mappings; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, mappings, CTLFLAG_RD, &pmap_pte1_mappings, 0, "1MB page mappings"); static u_long pmap_pte1_p_failures; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, p_failures, CTLFLAG_RD, &pmap_pte1_p_failures, 0, "1MB page promotion failures"); static u_long pmap_pte1_promotions; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, promotions, CTLFLAG_RD, &pmap_pte1_promotions, 0, "1MB page promotions"); static u_long pmap_pte1_kern_demotions; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, kern_demotions, CTLFLAG_RD, &pmap_pte1_kern_demotions, 0, "1MB page kernel demotions"); static u_long pmap_pte1_kern_promotions; SYSCTL_ULONG(_vm_pmap_pte1, OID_AUTO, kern_promotions, CTLFLAG_RD, &pmap_pte1_kern_promotions, 0, "1MB page kernel promotions"); static __inline ttb_entry_t pmap_ttb_get(pmap_t pmap) { return (vtophys(pmap->pm_pt1) | ttb_flags); } /* * Initialize a vm_page's machine-dependent fields. * * Variations: * 1. Pages for L2 page tables are never managed. So, pv_list and * pt2_wirecount can share the same physical space. However, proper * initialization on a page alloc for page tables and reinitialization * on the page free must be ensured. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.pv_list); pt2_wirecount_init(m); m->md.pat_mode = VM_MEMATTR_DEFAULT; } /* * An abstraction for a faster way to zero a whole page. */ static __inline void pagezero(void *page) { bzero(page, PAGE_SIZE); } /* * Zero an L2 page table page. * Uses the same KVA as pmap_zero_page(). */ static __inline vm_paddr_t pmap_pt2pg_zero(vm_page_t m) { vm_paddr_t pa; struct sysmaps *sysmaps; pa = VM_PAGE_TO_PHYS(m); /* * XXX: For now, we map the whole page even if it's already zero, * to sync it even if the sync is only a DSB. */ sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (pte2_load(sysmaps->CMAP2) != 0) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(pa, PTE2_AP_KRW, vm_page_pte2_attr(m))); /* Even a VM_ALLOC_ZERO request is only advisory. */ if ((m->flags & PG_ZERO) == 0) pagezero(sysmaps->CADDR2); pte2_sync_range((pt2_entry_t *)sysmaps->CADDR2, PAGE_SIZE); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); return (pa); } /* * Initialize a just-allocated page as an L2 page table(s) holder * and return its physical address. */ static __inline vm_paddr_t pmap_pt2pg_init(pmap_t pmap, vm_offset_t va, vm_page_t m) { vm_paddr_t pa; pt2_entry_t *pte2p; /* Check page attributes. */ if (m->md.pat_mode != pt_memattr) pmap_page_set_memattr(m, pt_memattr); /* Zero the page and init the wire counts.
*/ pa = pmap_pt2pg_zero(m); pt2_wirecount_init(m); /* * Map page to PT2MAP address space for given pmap. * Note that PT2MAP space is shared with all pmaps. */ if (pmap == kernel_pmap) pmap_kenter_pt2tab(va, PTE2_KPT(pa)); else { pte2p = pmap_pt2tab_entry(pmap, va); pt2tab_store(pte2p, PTE2_KPT_NG(pa)); } return (pa); } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ void pmap_init(void) { vm_size_t s; pt2_entry_t *pte2p, pte2; u_int i, pte1_idx, pv_npg; PDEBUG(1, printf("%s: phys_start = %#x\n", __func__, PHYSADDR)); /* * Initialize the vm page array entries for kernel pmap's * L2 page table pages allocated in advance. */ pte1_idx = pte1_index(KERNBASE - PT2MAP_SIZE); pte2p = kern_pt2tab_entry(KERNBASE - PT2MAP_SIZE); for (i = 0; i < nkpt2pg + NPG_IN_PT2TAB; i++, pte2p++) { vm_paddr_t pa; vm_page_t m; pte2 = pte2_load(pte2p); KASSERT(pte2_is_valid(pte2), ("%s: no valid entry", __func__)); pa = pte2_pa(pte2); m = PHYS_TO_VM_PAGE(pa); KASSERT(m >= vm_page_array && m < &vm_page_array[vm_page_array_size], ("%s: L2 page table page is out of range", __func__)); m->pindex = pte1_idx; m->phys_addr = pa; pte1_idx += NPT2_IN_PG; } /* * Initialize the address space (zone) for the pv entries. Set a * high water mark so that the system can recover from excessive * numbers of pv entries. */ TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc); pv_entry_max = shpgperproc * maxproc + vm_cnt.v_page_count; TUNABLE_INT_FETCH("vm.pmap.pv_entries", &pv_entry_max); pv_entry_max = roundup(pv_entry_max, _NPCPV); pv_entry_high_water = 9 * (pv_entry_max / 10); /* * Are large page mappings enabled? */ TUNABLE_INT_FETCH("vm.pmap.sp_enabled", &sp_enabled); if (sp_enabled) { KASSERT(MAXPAGESIZES > 1 && pagesizes[1] == 0, ("%s: can't assign to pagesizes[1]", __func__)); pagesizes[1] = PTE1_SIZE; } /* * Calculate the size of the pv head table for sections. * Handle the possibility that "vm_phys_segs[...].end" is zero. * Note that the table is only for sections which could be promoted. */ first_managed_pa = pte1_trunc(vm_phys_segs[0].start); pv_npg = (pte1_trunc(vm_phys_segs[vm_phys_nsegs - 1].end - PAGE_SIZE) - first_managed_pa) / PTE1_SIZE + 1; /* * Allocate memory for the pv head table for sections. */ s = (vm_size_t)(pv_npg * sizeof(struct md_page)); s = round_page(s); pv_table = (struct md_page *)kmem_malloc(kernel_arena, s, M_WAITOK | M_ZERO); for (i = 0; i < pv_npg; i++) TAILQ_INIT(&pv_table[i].pv_list); pv_maxchunks = MAX(pv_entry_max / _NPCPV, maxproc); pv_chunkbase = (struct pv_chunk *)kva_alloc(PAGE_SIZE * pv_maxchunks); if (pv_chunkbase == NULL) panic("%s: not enough kvm for pv chunks", __func__); pmap_pte2list_init(&pv_vafree, pv_chunkbase, pv_maxchunks); } /* * Add a list of wired pages to the kva * this routine is only used for temporary * kernel mappings that do not need to have * page modification or references recorded. * Note that old mappings are simply written * over. The page *must* be wired. * Note: SMP coherent. Uses a ranged shootdown IPI. 
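* * Illustrative usage sketch (a hypothetical caller, not taken from this * file; it assumes the stock kva_alloc()/kva_free() KVA allocator): * *	va = kva_alloc(3 * PAGE_SIZE); *	pmap_qenter(va, ma, 3);	-- one ranged TLB flush instead of three *	... access the three pages through va ... *	pmap_qremove(va, 3); *	kva_free(va, 3 * PAGE_SIZE);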
*/ void pmap_qenter(vm_offset_t sva, vm_page_t *ma, int count) { u_int anychanged; pt2_entry_t *epte2p, *pte2p, pte2; vm_page_t m; vm_paddr_t pa; anychanged = 0; pte2p = pt2map_entry(sva); epte2p = pte2p + count; while (pte2p < epte2p) { m = *ma++; pa = VM_PAGE_TO_PHYS(m); pte2 = pte2_load(pte2p); if ((pte2_pa(pte2) != pa) || (pte2_attr(pte2) != vm_page_pte2_attr(m))) { anychanged++; pte2_store(pte2p, PTE2_KERN(pa, PTE2_AP_KRW, vm_page_pte2_attr(m))); } pte2p++; } if (__predict_false(anychanged)) tlb_flush_range(sva, count * PAGE_SIZE); } /* * This routine tears out page mappings from the * kernel -- it is meant only for temporary mappings. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qremove(vm_offset_t sva, int count) { vm_offset_t va; va = sva; while (count-- > 0) { pmap_kremove(va); va += PAGE_SIZE; } tlb_flush_range(sva, va - sva); } /* * Are we current address space or kernel? */ static __inline int pmap_is_current(pmap_t pmap) { return (pmap == kernel_pmap || (pmap == vmspace_pmap(curthread->td_proc->p_vmspace))); } /* * If the given pmap is not the current or kernel pmap, the returned * pte2 must be released by passing it to pmap_pte2_release(). */ static pt2_entry_t * pmap_pte2(pmap_t pmap, vm_offset_t va) { pt1_entry_t pte1; vm_paddr_t pt2pg_pa; pte1 = pte1_load(pmap_pte1(pmap, va)); if (pte1_is_section(pte1)) panic("%s: attempt to map PTE1", __func__); if (pte1_is_link(pte1)) { /* Are we current address space or kernel? */ if (pmap_is_current(pmap)) return (pt2map_entry(va)); /* Note that L2 page table size is not equal to PAGE_SIZE. */ pt2pg_pa = trunc_page(pte1_link_pa(pte1)); mtx_lock(&PMAP2mutex); if (pte2_pa(pte2_load(PMAP2)) != pt2pg_pa) { pte2_store(PMAP2, PTE2_KPT(pt2pg_pa)); tlb_flush((vm_offset_t)PADDR2); } return (PADDR2 + (arm32_btop(va) & (NPTE2_IN_PG - 1))); } return (NULL); } /* * Releases a pte2 that was obtained from pmap_pte2(). * Be prepared for the pte2p being NULL. */ static __inline void pmap_pte2_release(pt2_entry_t *pte2p) { if ((pt2_entry_t *)(trunc_page((vm_offset_t)pte2p)) == PADDR2) { mtx_unlock(&PMAP2mutex); } } /* * Super fast pmap_pte2 routine best used when scanning * the pv lists. This eliminates many coarse-grained * invltlb calls. Note that many of the pv list * scans are across different pmaps. It is very wasteful * to do an entire tlb flush for checking a single mapping. * * If the given pmap is not the current pmap, pvh_global_lock * must be held and curthread pinned to a CPU. */ static pt2_entry_t * pmap_pte2_quick(pmap_t pmap, vm_offset_t va) { pt1_entry_t pte1; vm_paddr_t pt2pg_pa; pte1 = pte1_load(pmap_pte1(pmap, va)); if (pte1_is_section(pte1)) panic("%s: attempt to map PTE1", __func__); if (pte1_is_link(pte1)) { /* Are we current address space or kernel? */ if (pmap_is_current(pmap)) return (pt2map_entry(va)); rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT(curthread->td_pinned > 0, ("%s: curthread not pinned", __func__)); /* Note that L2 page table size is not equal to PAGE_SIZE. 
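* (The reason, assuming the standard short-descriptor format: an L2 page table is only 1 KB, a quarter of a 4 KB page, so the trunc_page() below is what locates the PT2PG holding it.)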
*/ pt2pg_pa = trunc_page(pte1_link_pa(pte1)); if (pte2_pa(pte2_load(PMAP1)) != pt2pg_pa) { pte2_store(PMAP1, PTE2_KPT(pt2pg_pa)); #ifdef SMP PMAP1cpu = PCPU_GET(cpuid); #endif tlb_flush_local((vm_offset_t)PADDR1); PMAP1changed++; } else #ifdef SMP if (PMAP1cpu != PCPU_GET(cpuid)) { PMAP1cpu = PCPU_GET(cpuid); tlb_flush_local((vm_offset_t)PADDR1); PMAP1changedcpu++; } else #endif PMAP1unchanged++; return (PADDR1 + (arm32_btop(va) & (NPTE2_IN_PG - 1))); } return (NULL); } /* * Routine: pmap_extract * Function: * Extract the physical page address associated * with the given map/virtual_address pair. */ vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va) { vm_paddr_t pa; pt1_entry_t pte1; pt2_entry_t *pte2p; PMAP_LOCK(pmap); pte1 = pte1_load(pmap_pte1(pmap, va)); if (pte1_is_section(pte1)) pa = pte1_pa(pte1) | (va & PTE1_OFFSET); else if (pte1_is_link(pte1)) { pte2p = pmap_pte2(pmap, va); pa = pte2_pa(pte2_load(pte2p)) | (va & PTE2_OFFSET); pmap_pte2_release(pte2p); } else pa = 0; PMAP_UNLOCK(pmap); return (pa); } /* * Routine: pmap_extract_and_hold * Function: * Atomically extract and hold the physical page * with the given pmap and virtual address pair * if that mapping permits the given protection. */ vm_page_t pmap_extract_and_hold(pmap_t pmap, vm_offset_t va, vm_prot_t prot) { vm_paddr_t pa, lockpa; pt1_entry_t pte1; pt2_entry_t pte2, *pte2p; vm_page_t m; lockpa = 0; m = NULL; PMAP_LOCK(pmap); retry: pte1 = pte1_load(pmap_pte1(pmap, va)); if (pte1_is_section(pte1)) { if (!(pte1 & PTE1_RO) || !(prot & VM_PROT_WRITE)) { pa = pte1_pa(pte1) | (va & PTE1_OFFSET); if (vm_page_pa_tryrelock(pmap, pa, &lockpa)) goto retry; m = PHYS_TO_VM_PAGE(pa); vm_page_hold(m); } } else if (pte1_is_link(pte1)) { pte2p = pmap_pte2(pmap, va); pte2 = pte2_load(pte2p); pmap_pte2_release(pte2p); if (pte2_is_valid(pte2) && (!(pte2 & PTE2_RO) || !(prot & VM_PROT_WRITE))) { pa = pte2_pa(pte2); if (vm_page_pa_tryrelock(pmap, pa, &lockpa)) goto retry; m = PHYS_TO_VM_PAGE(pa); vm_page_hold(m); } } PA_UNLOCK_COND(lockpa); PMAP_UNLOCK(pmap); return (m); } /* * Grow the number of kernel L2 page table entries, if needed. */ void pmap_growkernel(vm_offset_t addr) { vm_page_t m; vm_paddr_t pt2pg_pa, pt2_pa; pt1_entry_t pte1; pt2_entry_t pte2; PDEBUG(1, printf("%s: addr = %#x\n", __func__, addr)); /* * At all times, kernel_vm_end is the first KVA for which the underlying * L2 page table is either not allocated or not linked from the L1 page * table (not considering sections). Except for two possible cases: * * (1) in the very beginning, as long as pmap_growkernel() has not * been called, it could be the first unused KVA (which is not * rounded up to PTE1_SIZE), * * (2) when all KVA space is mapped and kernel_map->max_offset * address is not rounded up to PTE1_SIZE. (For example, * it could be 0xFFFFFFFF.) */ kernel_vm_end = pte1_roundup(kernel_vm_end); mtx_assert(&kernel_map->system_mtx, MA_OWNED); addr = roundup2(addr, PTE1_SIZE); if (addr - 1 >= kernel_map->max_offset) addr = kernel_map->max_offset; while (kernel_vm_end < addr) { pte1 = pte1_load(kern_pte1(kernel_vm_end)); if (pte1_is_valid(pte1)) { kernel_vm_end += PTE1_SIZE; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } continue; } /* * kernel_vm_end_new is used in pmap_pinit() when kernel * mappings are entered into a new pmap all at once, to avoid a race * between pmap_kenter_pte1() and a kernel_vm_end increase. * The same applies to pmap_kenter_pt2tab().
*/ kernel_vm_end_new = kernel_vm_end + PTE1_SIZE; pte2 = pt2tab_load(kern_pt2tab_entry(kernel_vm_end)); if (!pte2_is_valid(pte2)) { /* * Install a new PT2s page into the kernel PT2TAB. */ m = vm_page_alloc(NULL, pte1_index(kernel_vm_end) & ~PT2PG_MASK, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (m == NULL) panic("%s: no memory to grow kernel", __func__); /* * QQQ: Linking all new L2 page tables from the L1 page * table now, and so pmap_kenter_pte1()'ing them * at once together with pmap_kenter_pt2tab(), * could be a nice speedup. However, * pmap_growkernel() does not happen so often... * QQQ: The other TTBR is another option. */ pt2pg_pa = pmap_pt2pg_init(kernel_pmap, kernel_vm_end, m); } else pt2pg_pa = pte2_pa(pte2); pt2_pa = page_pt2pa(pt2pg_pa, pte1_index(kernel_vm_end)); pmap_kenter_pte1(kernel_vm_end, PTE1_LINK(pt2_pa)); kernel_vm_end = kernel_vm_end_new; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } } } static int kvm_size(SYSCTL_HANDLER_ARGS) { unsigned long ksize = vm_max_kernel_address - KERNBASE; return (sysctl_handle_long(oidp, &ksize, 0, req)); } SYSCTL_PROC(_vm, OID_AUTO, kvm_size, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_size, "IU", "Size of KVM"); static int kvm_free(SYSCTL_HANDLER_ARGS) { unsigned long kfree = vm_max_kernel_address - kernel_vm_end; return (sysctl_handle_long(oidp, &kfree, 0, req)); } SYSCTL_PROC(_vm, OID_AUTO, kvm_free, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_free, "IU", "Amount of KVM free"); /*********************************************** * * Pmap allocation/deallocation routines. * ***********************************************/ /* * Initialize the pmap for the swapper process. */ void pmap_pinit0(pmap_t pmap) { PDEBUG(1, printf("%s: pmap = %p\n", __func__, pmap)); PMAP_LOCK_INIT(pmap); /* * The kernel page table directory and the pmap machinery around it are * already initialized; we are using them right here and now. So, finish * only the PMAP structure initialization for process0 ... * * Since the L1 page table and PT2TAB are shared with the kernel pmap, * which is already included in the list "allpmaps", this pmap does * not need to be inserted into that list. */ pmap->pm_pt1 = kern_pt1; pmap->pm_pt2tab = kern_pt2tab; CPU_ZERO(&pmap->pm_active); PCPU_SET(curpmap, pmap); TAILQ_INIT(&pmap->pm_pvchunk); bzero(&pmap->pm_stats, sizeof pmap->pm_stats); CPU_SET(0, &pmap->pm_active); } static __inline void pte1_copy_nosync(pt1_entry_t *spte1p, pt1_entry_t *dpte1p, vm_offset_t sva, vm_offset_t eva) { u_int idx, count; idx = pte1_index(sva); count = (pte1_index(eva) - idx + 1) * sizeof(pt1_entry_t); bcopy(spte1p + idx, dpte1p + idx, count); } static __inline void pt2tab_copy_nosync(pt2_entry_t *spte2p, pt2_entry_t *dpte2p, vm_offset_t sva, vm_offset_t eva) { u_int idx, count; idx = pt2tab_index(sva); count = (pt2tab_index(eva) - idx + 1) * sizeof(pt2_entry_t); bcopy(spte2p + idx, dpte2p + idx, count); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ int pmap_pinit(pmap_t pmap) { pt1_entry_t *pte1p; pt2_entry_t *pte2p; vm_paddr_t pa, pt2tab_pa; u_int i; PDEBUG(6, printf("%s: pmap = %p, pm_pt1 = %p\n", __func__, pmap, pmap->pm_pt1)); /* * There is no need to allocate L2 page table space yet, but we do need * a valid L1 page table and PT2TAB. * * Install shared kernel mappings into these tables. It's a little * tricky, as some parts of KVA are reserved for vectors, devices, * and whatever else. These parts are supposed to be above * vm_max_kernel_address.
Thus two regions should be installed: * * (1) <KERNBASE, kernel_vm_end), * (2) <vm_max_kernel_address, 0xFFFFFFFF>. * * QQQ: The second region should be stable enough to be installed * only once, at the time the tables are allocated. * QQQ: Maybe copying both regions at once could be faster ... * QQQ: Maybe the other TTBR is an option. * * Finally, install the pmap's own PT2TAB into these tables. */ if (pmap->pm_pt1 == NULL) { pmap->pm_pt1 = (pt1_entry_t *)kmem_alloc_contig(kernel_arena, NB_IN_PT1, M_NOWAIT | M_ZERO, 0, -1UL, NB_IN_PT1, 0, pt_memattr); if (pmap->pm_pt1 == NULL) return (0); } if (pmap->pm_pt2tab == NULL) { /* * QQQ: (1) PT2TAB must be contiguous. If PT2TAB is only one * page, which should be the only size for 32-bit systems, * then we could allocate it with vm_page_alloc() and all * the handling needed, like the other L2 page table pages. * (2) Note that a process PT2TAB is a special L2 page table * page. Its mapping in kernel_arena is permanent and can * be used no matter which process is current. Its mapping * in PT2MAP can be used only for the current process. */ pmap->pm_pt2tab = (pt2_entry_t *)kmem_alloc_attr(kernel_arena, NB_IN_PT2TAB, M_NOWAIT | M_ZERO, 0, -1UL, pt_memattr); if (pmap->pm_pt2tab == NULL) { /* * QQQ: As struct pmap is allocated from UMA with the * UMA_ZONE_NOFREE flag, it's important to leave * no allocations in the pmap if initialization failed. */ kmem_free(kernel_arena, (vm_offset_t)pmap->pm_pt1, NB_IN_PT1); pmap->pm_pt1 = NULL; return (0); } /* * QQQ: Each L2 page table page's vm_page_t has its pindex set to * the pte1 index of the virtual address mapped by that page. * This is not valid for non-kernel PT2TABs themselves. * The pindex of these pages cannot be altered because of * the way they are allocated now. However, it * should not be a problem. */ } mtx_lock_spin(&allpmaps_lock); /* * To avoid a race with pmap_kenter_pte1() and pmap_kenter_pt2tab(), * kernel_vm_end_new is used here instead of kernel_vm_end. */ pte1_copy_nosync(kern_pt1, pmap->pm_pt1, KERNBASE, kernel_vm_end_new - 1); pte1_copy_nosync(kern_pt1, pmap->pm_pt1, vm_max_kernel_address, 0xFFFFFFFF); pt2tab_copy_nosync(kern_pt2tab, pmap->pm_pt2tab, KERNBASE, kernel_vm_end_new - 1); pt2tab_copy_nosync(kern_pt2tab, pmap->pm_pt2tab, vm_max_kernel_address, 0xFFFFFFFF); LIST_INSERT_HEAD(&allpmaps, pmap, pm_list); mtx_unlock_spin(&allpmaps_lock); /* * Store the PT2MAP PT2 pages (a.k.a. PT2TAB) in PT2TAB itself, * i.e. a self-reference mapping. The PT2TAB is private, but mapped * into the shared PT2MAP space, so the mapping must not be global. */ pt2tab_pa = vtophys(pmap->pm_pt2tab); pte2p = pmap_pt2tab_entry(pmap, (vm_offset_t)PT2MAP); for (pa = pt2tab_pa, i = 0; i < NPG_IN_PT2TAB; i++, pa += PTE2_SIZE) { pt2tab_store(pte2p++, PTE2_KPT_NG(pa)); } /* Insert PT2MAP PT2s into pmap PT1. */ pte1p = pmap_pte1(pmap, (vm_offset_t)PT2MAP); for (pa = pt2tab_pa, i = 0; i < NPT2_IN_PT2TAB; i++, pa += NB_IN_PT2) { pte1_store(pte1p++, PTE1_LINK(pa)); } /* * Now synchronize the new mappings made above. */ pte1_sync_range(pmap->pm_pt1, NB_IN_PT1); pte2_sync_range(pmap->pm_pt2tab, NB_IN_PT2TAB); CPU_ZERO(&pmap->pm_active); TAILQ_INIT(&pmap->pm_pvchunk); bzero(&pmap->pm_stats, sizeof pmap->pm_stats); return (1); } #ifdef INVARIANTS static boolean_t pt2tab_user_is_empty(pt2_entry_t *tab) { u_int i, end; end = pt2tab_index(VM_MAXUSER_ADDRESS); for (i = 0; i < end; i++) if (tab[i] != 0) return (FALSE); return (TRUE); } #endif /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. * Should only be called if the map contains no valid mappings.
*/ void pmap_release(pmap_t pmap) { #ifdef INVARIANTS vm_offset_t start, end; #endif KASSERT(pmap->pm_stats.resident_count == 0, ("%s: pmap resident count %ld != 0", __func__, pmap->pm_stats.resident_count)); KASSERT(pt2tab_user_is_empty(pmap->pm_pt2tab), ("%s: has allocated user PT2(s)", __func__)); KASSERT(CPU_EMPTY(&pmap->pm_active), ("%s: pmap %p is active on some CPU(s)", __func__, pmap)); mtx_lock_spin(&allpmaps_lock); LIST_REMOVE(pmap, pm_list); mtx_unlock_spin(&allpmaps_lock); #ifdef INVARIANTS start = pte1_index(KERNBASE) * sizeof(pt1_entry_t); end = (pte1_index(0xFFFFFFFF) + 1) * sizeof(pt1_entry_t); bzero((char *)pmap->pm_pt1 + start, end - start); start = pt2tab_index(KERNBASE) * sizeof(pt2_entry_t); end = (pt2tab_index(0xFFFFFFFF) + 1) * sizeof(pt2_entry_t); bzero((char *)pmap->pm_pt2tab + start, end - start); #endif /* * We are leaving PT1 and PT2TAB allocated on the released pmap, * so hopefully the UMA vmspace_zone will always be initialized with * the UMA_ZONE_NOFREE flag. */ } /********************************************************* * * L2 table pages and their page management routines. * *********************************************************/ /* * Virtual interface for L2 page table wire counting. * * Each L2 page table in a page has its own counter, which counts the number * of valid mappings in the table. The global page counter counts the mappings * in all tables in the page, plus the page's own single mapping in PT2TAB. * * During a promotion we leave the associated L2 page table counter * untouched, so the table (strictly speaking, the page which holds it) * is never freed if promoted. * * If a page's m->wire_count == 1, then no valid mappings exist in any L2 page * table in the page and the page itself is only mapped in PT2TAB. */ static __inline void pt2_wirecount_init(vm_page_t m) { u_int i; /* * Note: A page m is allocated with the VM_ALLOC_WIRED flag and * m->wire_count should already be set correctly. * So, there is no need to set it again herein. */ for (i = 0; i < NPT2_IN_PG; i++) m->md.pt2_wirecount[i] = 0; } static __inline void pt2_wirecount_inc(vm_page_t m, uint32_t pte1_idx) { /* * Note: A just-modified pte2 (i.e. already allocated) * acquires one extra reference, which must be * explicitly cleared. This influences the KASSERTs herein. * All L2 page tables in a page always belong to the same * pmap, so we allow only one extra reference for the page.
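* (Illustrative arithmetic: a PT2PG hosting two active L2 tables with 5 and 3 valid mappings has wire_count 1 + 5 + 3 == 9, the leading 1 being the page's own mapping in PT2TAB; pt2pg_is_empty() below checks for exactly that baseline of 1.)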
*/ KASSERT(m->md.pt2_wirecount[pte1_idx & PT2PG_MASK] < (NPTE2_IN_PT2 + 1), ("%s: PT2 is overflowing ...", __func__)); KASSERT(m->wire_count <= (NPTE2_IN_PG + 1), ("%s: PT2PG is overflowing ...", __func__)); m->wire_count++; m->md.pt2_wirecount[pte1_idx & PT2PG_MASK]++; } static __inline void pt2_wirecount_dec(vm_page_t m, uint32_t pte1_idx) { KASSERT(m->md.pt2_wirecount[pte1_idx & PT2PG_MASK] != 0, ("%s: PT2 is underflowing ...", __func__)); KASSERT(m->wire_count > 1, ("%s: PT2PG is underflowing ...", __func__)); m->wire_count--; m->md.pt2_wirecount[pte1_idx & PT2PG_MASK]--; } static __inline void pt2_wirecount_set(vm_page_t m, uint32_t pte1_idx, uint16_t count) { KASSERT(count <= NPTE2_IN_PT2, ("%s: invalid count %u", __func__, count)); KASSERT(m->wire_count > m->md.pt2_wirecount[pte1_idx & PT2PG_MASK], ("%s: PT2PG corrupting (%u, %u) ...", __func__, m->wire_count, m->md.pt2_wirecount[pte1_idx & PT2PG_MASK])); m->wire_count -= m->md.pt2_wirecount[pte1_idx & PT2PG_MASK]; m->wire_count += count; m->md.pt2_wirecount[pte1_idx & PT2PG_MASK] = count; KASSERT(m->wire_count <= (NPTE2_IN_PG + 1), ("%s: PT2PG is overflowed (%u) ...", __func__, m->wire_count)); } static __inline uint32_t pt2_wirecount_get(vm_page_t m, uint32_t pte1_idx) { return (m->md.pt2_wirecount[pte1_idx & PT2PG_MASK]); } static __inline boolean_t pt2_is_empty(vm_page_t m, vm_offset_t va) { return (m->md.pt2_wirecount[pte1_index(va) & PT2PG_MASK] == 0); } static __inline boolean_t pt2_is_full(vm_page_t m, vm_offset_t va) { return (m->md.pt2_wirecount[pte1_index(va) & PT2PG_MASK] == NPTE2_IN_PT2); } static __inline boolean_t pt2pg_is_empty(vm_page_t m) { return (m->wire_count == 1); } /* * This routine is called if the L2 page table * is not mapped correctly. */ static vm_page_t _pmap_allocpte2(pmap_t pmap, vm_offset_t va, u_int flags) { uint32_t pte1_idx; pt1_entry_t *pte1p; pt2_entry_t pte2; vm_page_t m; vm_paddr_t pt2pg_pa, pt2_pa; pte1_idx = pte1_index(va); pte1p = pmap->pm_pt1 + pte1_idx; KASSERT(pte1_load(pte1p) == 0, ("%s: pm_pt1[%#x] is not zero: %#x", __func__, pte1_idx, pte1_load(pte1p))); pte2 = pt2tab_load(pmap_pt2tab_entry(pmap, va)); if (!pte2_is_valid(pte2)) { /* * Install a new PT2s page into the pmap's PT2TAB. */ m = vm_page_alloc(NULL, pte1_idx & ~PT2PG_MASK, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (m == NULL) { if ((flags & PMAP_ENTER_NOSLEEP) == 0) { PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); VM_WAIT; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); } /* * Indicate the need to retry. While waiting, * the L2 page table page may have been allocated. */ return (NULL); } pmap->pm_stats.resident_count++; pt2pg_pa = pmap_pt2pg_init(pmap, va, m); } else { pt2pg_pa = pte2_pa(pte2); m = PHYS_TO_VM_PAGE(pt2pg_pa); } pt2_wirecount_inc(m, pte1_idx); pt2_pa = page_pt2pa(pt2pg_pa, pte1_idx); pte1_store(pte1p, PTE1_LINK(pt2_pa)); return (m); } static vm_page_t pmap_allocpte2(pmap_t pmap, vm_offset_t va, u_int flags) { u_int pte1_idx; pt1_entry_t *pte1p, pte1; vm_page_t m; pte1_idx = pte1_index(va); retry: pte1p = pmap->pm_pt1 + pte1_idx; pte1 = pte1_load(pte1p); /* * This supports switching from a 1MB page to a * normal 4K page. */ if (pte1_is_section(pte1)) { (void)pmap_demote_pte1(pmap, pte1p, va); /* * Reload pte1 after demotion. * * Note: Demotion can even fail, as either no PT2 is found for * the virtual address or a PT2PG cannot be allocated. */ pte1 = pte1_load(pte1p); } /* * If the L2 page table page is mapped, we just increment the * hold count, and activate it.
*/ if (pte1_is_link(pte1)) { m = PHYS_TO_VM_PAGE(pte1_link_pa(pte1)); pt2_wirecount_inc(m, pte1_idx); } else { /* * Here if the PT2 isn't mapped, or if it has * been deallocated. */ m = _pmap_allocpte2(pmap, va, flags); if (m == NULL && (flags & PMAP_ENTER_NOSLEEP) == 0) goto retry; } return (m); } static __inline void pmap_free_zero_pages(struct spglist *free) { vm_page_t m; while ((m = SLIST_FIRST(free)) != NULL) { SLIST_REMOVE_HEAD(free, plinks.s.ss); /* Preserve the page's PG_ZERO setting. */ vm_page_free_toq(m); } } /* * Schedule the specified unused L2 page table page to be freed. Specifically, * add the page to the specified list of pages that will be released to the * physical memory manager after the TLB has been updated. */ static __inline void pmap_add_delayed_free_list(vm_page_t m, struct spglist *free) { /* * Put page on a list so that it is released after * *ALL* TLB shootdown is done */ #ifdef PMAP_DEBUG pmap_zero_page_check(m); #endif m->flags |= PG_ZERO; SLIST_INSERT_HEAD(free, m, plinks.s.ss); } /* * Unwire L2 page tables page. */ static void pmap_unwire_pt2pg(pmap_t pmap, vm_offset_t va, vm_page_t m) { pt1_entry_t *pte1p, opte1 __unused; pt2_entry_t *pte2p; uint32_t i; KASSERT(pt2pg_is_empty(m), ("%s: pmap %p PT2PG %p wired", __func__, pmap, m)); /* * Unmap all L2 page tables in the page from L1 page table. * * QQQ: Individual L2 page tables (except the last one) can be unmapped * earlier. However, we are doing that this way. */ KASSERT(m->pindex == (pte1_index(va) & ~PT2PG_MASK), ("%s: pmap %p va %#x PT2PG %p bad index", __func__, pmap, va, m)); pte1p = pmap->pm_pt1 + m->pindex; for (i = 0; i < NPT2_IN_PG; i++, pte1p++) { KASSERT(m->md.pt2_wirecount[i] == 0, ("%s: pmap %p PT2 %u (PG %p) wired", __func__, pmap, i, m)); opte1 = pte1_load(pte1p); if (pte1_is_link(opte1)) { pte1_clear(pte1p); /* * Flush intermediate TLB cache. */ pmap_tlb_flush(pmap, (m->pindex + i) << PTE1_SHIFT); } #ifdef INVARIANTS else KASSERT((opte1 == 0) || pte1_is_section(opte1), ("%s: pmap %p va %#x bad pte1 %x at %u", __func__, pmap, va, opte1, i)); #endif } /* * Unmap the page from PT2TAB. */ pte2p = pmap_pt2tab_entry(pmap, va); (void)pt2tab_load_clear(pte2p); pmap_tlb_flush(pmap, pt2map_pt2pg(va)); m->wire_count = 0; pmap->pm_stats.resident_count--; /* * This is a release store so that the ordinary store unmapping * the L2 page table page is globally performed before TLB shoot- * down is begun. */ atomic_subtract_rel_int(&vm_cnt.v_wire_count, 1); } /* * Decrements a L2 page table page's wire count, which is used to record the * number of valid page table entries within the page. If the wire count * drops to zero, then the page table page is unmapped. Returns TRUE if the * page table page was unmapped and FALSE otherwise. */ static __inline boolean_t pmap_unwire_pt2(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { pt2_wirecount_dec(m, pte1_index(va)); if (pt2pg_is_empty(m)) { /* * QQQ: Wire count is zero, so whole page should be zero and * we can set PG_ZERO flag to it. * Note that when promotion is enabled, it takes some * more efforts. See pmap_unwire_pt2_all() below. */ pmap_unwire_pt2pg(pmap, va, m); pmap_add_delayed_free_list(m, free); return (TRUE); } else return (FALSE); } /* * Drop a L2 page table page's wire count at once, which is used to record * the number of valid L2 page table entries within the page. If the wire * count drops to zero, then the L2 page table page is unmapped. 
*/ static __inline void pmap_unwire_pt2_all(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { u_int pte1_idx = pte1_index(va); KASSERT(m->pindex == (pte1_idx & ~PT2PG_MASK), ("%s: PT2 page's pindex is wrong", __func__)); KASSERT(m->wire_count > pt2_wirecount_get(m, pte1_idx), ("%s: bad pt2 wire count %u > %u", __func__, m->wire_count, pt2_wirecount_get(m, pte1_idx))); /* * It's possible that the L2 page table was never used. * This happens when a section was created without promotion. */ if (pt2_is_full(m, va)) { pt2_wirecount_set(m, pte1_idx, 0); /* * QQQ: We clear the L2 page table now, so that when the L2 page * table page is going to be freed, we can set its PG_ZERO flag ... * This function is called only on section mappings, so * hopefully it's not too big an overhead. * * XXX: If the pmap is current, the existing PT2MAP mapping could be * used for zeroing. */ pmap_zero_page_area(m, page_pt2off(pte1_idx), NB_IN_PT2); } #ifdef INVARIANTS else KASSERT(pt2_is_empty(m, va), ("%s: PT2 is not empty (%u)", __func__, pt2_wirecount_get(m, pte1_idx))); #endif if (pt2pg_is_empty(m)) { pmap_unwire_pt2pg(pmap, va, m); pmap_add_delayed_free_list(m, free); } } /* * After removing an L2 page table entry, this routine is used to * conditionally free the page, and manage the hold/wire counts. */ static boolean_t pmap_unuse_pt2(pmap_t pmap, vm_offset_t va, struct spglist *free) { pt1_entry_t pte1; vm_page_t mpte; if (va >= VM_MAXUSER_ADDRESS) return (FALSE); pte1 = pte1_load(pmap_pte1(pmap, va)); mpte = PHYS_TO_VM_PAGE(pte1_link_pa(pte1)); return (pmap_unwire_pt2(pmap, va, mpte, free)); } /************************************* * * Page management routines. * *************************************/ CTASSERT(sizeof(struct pv_chunk) == PAGE_SIZE); CTASSERT(_NPCM == 11); CTASSERT(_NPCPV == 336); static __inline struct pv_chunk * pv_to_chunk(pv_entry_t pv) { return ((struct pv_chunk *)((uintptr_t)pv & ~(uintptr_t)PAGE_MASK)); } #define PV_PMAP(pv) (pv_to_chunk(pv)->pc_pmap) #define PC_FREE0_9 0xfffffffful /* Free values for index 0 through 9 */ #define PC_FREE10 0x0000fffful /* Free values for index 10 */ static const uint32_t pc_freemask[_NPCM] = { PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE10 }; SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_count, CTLFLAG_RD, &pv_entry_count, 0, "Current number of pv entries"); #ifdef PV_STATS static int pc_chunk_count, pc_chunk_allocs, pc_chunk_frees, pc_chunk_tryfail; SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_count, CTLFLAG_RD, &pc_chunk_count, 0, "Current number of pv entry chunks"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_allocs, CTLFLAG_RD, &pc_chunk_allocs, 0, "Current number of pv entry chunks allocated"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_frees, CTLFLAG_RD, &pc_chunk_frees, 0, "Current number of pv entry chunks frees"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_tryfail, CTLFLAG_RD, &pc_chunk_tryfail, 0, "Number of times tried to get a chunk page but failed."); static long pv_entry_frees, pv_entry_allocs; static int pv_entry_spare; SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_frees, CTLFLAG_RD, &pv_entry_frees, 0, "Current number of pv entry frees"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_allocs, CTLFLAG_RD, &pv_entry_allocs, 0, "Current number of pv entry allocs"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_spare, CTLFLAG_RD, &pv_entry_spare, 0, "Current number of spare pv entries"); #endif /* * Is the given page managed?
*/ static __inline boolean_t is_managed(vm_paddr_t pa) { vm_offset_t pgnum; vm_page_t m; pgnum = atop(pa); if (pgnum >= first_page) { m = PHYS_TO_VM_PAGE(pa); if (m == NULL) return (FALSE); if ((m->oflags & VPO_UNMANAGED) == 0) return (TRUE); } return (FALSE); } static __inline boolean_t pte1_is_managed(pt1_entry_t pte1) { return (is_managed(pte1_pa(pte1))); } static __inline boolean_t pte2_is_managed(pt2_entry_t pte2) { return (is_managed(pte2_pa(pte2))); } /* * We are in a serious low memory condition. Resort to * drastic measures to free some pages so we can allocate * another pv entry chunk. */ static vm_page_t pmap_pv_reclaim(pmap_t locked_pmap) { struct pch newtail; struct pv_chunk *pc; struct md_page *pvh; pt1_entry_t *pte1p; pmap_t pmap; pt2_entry_t *pte2p, tpte2; pv_entry_t pv; vm_offset_t va; vm_page_t m, m_pc; struct spglist free; uint32_t inuse; int bit, field, freed; PMAP_LOCK_ASSERT(locked_pmap, MA_OWNED); pmap = NULL; m_pc = NULL; SLIST_INIT(&free); TAILQ_INIT(&newtail); while ((pc = TAILQ_FIRST(&pv_chunks)) != NULL && (pv_vafree == 0 || SLIST_EMPTY(&free))) { TAILQ_REMOVE(&pv_chunks, pc, pc_lru); if (pmap != pc->pc_pmap) { if (pmap != NULL) { if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } pmap = pc->pc_pmap; /* Avoid deadlock and lock recursion. */ if (pmap > locked_pmap) PMAP_LOCK(pmap); else if (pmap != locked_pmap && !PMAP_TRYLOCK(pmap)) { pmap = NULL; TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); continue; } } /* * Destroy every non-wired, 4 KB page mapping in the chunk. */ freed = 0; for (field = 0; field < _NPCM; field++) { for (inuse = ~pc->pc_map[field] & pc_freemask[field]; inuse != 0; inuse &= ~(1UL << bit)) { bit = ffs(inuse) - 1; pv = &pc->pc_pventry[field * 32 + bit]; va = pv->pv_va; pte1p = pmap_pte1(pmap, va); if (pte1_is_section(pte1_load(pte1p))) continue; pte2p = pmap_pte2(pmap, va); tpte2 = pte2_load(pte2p); if ((tpte2 & PTE2_W) == 0) tpte2 = pte2_load_clear(pte2p); pmap_pte2_release(pte2p); if ((tpte2 & PTE2_W) != 0) continue; KASSERT(tpte2 != 0, ("pmap_pv_reclaim: pmap %p va %#x zero pte", pmap, va)); pmap_tlb_flush(pmap, va); m = PHYS_TO_VM_PAGE(pte2_pa(tpte2)); if (pte2_is_dirty(tpte2)) vm_page_dirty(m); if ((tpte2 & PTE2_A) != 0) vm_page_aflag_set(m, PGA_REFERENCED); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) { vm_page_aflag_clear(m, PGA_WRITEABLE); } } pc->pc_map[field] |= 1UL << bit; pmap_unuse_pt2(pmap, va, &free); freed++; } } if (freed == 0) { TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); continue; } /* Every freed mapping is for a 4 KB page. */ pmap->pm_stats.resident_count -= freed; PV_STAT(pv_entry_frees += freed); PV_STAT(pv_entry_spare += freed); pv_entry_count -= freed; TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); for (field = 0; field < _NPCM; field++) if (pc->pc_map[field] != pc_freemask[field]) { TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); /* * One freed pv entry in locked_pmap is * sufficient. */ if (pmap == locked_pmap) goto out; break; } if (field == _NPCM) { PV_STAT(pv_entry_spare -= _NPCPV); PV_STAT(pc_chunk_count--); PV_STAT(pc_chunk_frees++); /* Entire chunk is free; return it. 
*/ m_pc = PHYS_TO_VM_PAGE(pmap_kextract((vm_offset_t)pc)); pmap_qremove((vm_offset_t)pc, 1); pmap_pte2list_free(&pv_vafree, (vm_offset_t)pc); break; } } out: TAILQ_CONCAT(&pv_chunks, &newtail, pc_lru); if (pmap != NULL) { if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } if (m_pc == NULL && pv_vafree != 0 && SLIST_EMPTY(&free)) { m_pc = SLIST_FIRST(&free); SLIST_REMOVE_HEAD(&free, plinks.s.ss); /* Recycle a freed page table page. */ m_pc->wire_count = 1; atomic_add_int(&vm_cnt.v_wire_count, 1); } pmap_free_zero_pages(&free); return (m_pc); } static void free_pv_chunk(struct pv_chunk *pc) { vm_page_t m; TAILQ_REMOVE(&pv_chunks, pc, pc_lru); PV_STAT(pv_entry_spare -= _NPCPV); PV_STAT(pc_chunk_count--); PV_STAT(pc_chunk_frees++); /* entire chunk is free, return it */ m = PHYS_TO_VM_PAGE(pmap_kextract((vm_offset_t)pc)); pmap_qremove((vm_offset_t)pc, 1); vm_page_unwire(m, PQ_NONE); vm_page_free(m); pmap_pte2list_free(&pv_vafree, (vm_offset_t)pc); } /* * Free the pv_entry back to the free list. */ static void free_pv_entry(pmap_t pmap, pv_entry_t pv) { struct pv_chunk *pc; int idx, field, bit; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(pv_entry_frees++); PV_STAT(pv_entry_spare++); pv_entry_count--; pc = pv_to_chunk(pv); idx = pv - &pc->pc_pventry[0]; field = idx / 32; bit = idx % 32; pc->pc_map[field] |= 1ul << bit; for (idx = 0; idx < _NPCM; idx++) if (pc->pc_map[idx] != pc_freemask[idx]) { /* * 98% of the time, pc is already at the head of the * list. If it isn't already, move it to the head. */ if (__predict_false(TAILQ_FIRST(&pmap->pm_pvchunk) != pc)) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); } return; } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } /* * Get a new pv_entry, allocating a block from the system * when needed. */ static pv_entry_t get_pv_entry(pmap_t pmap, boolean_t try) { static const struct timeval printinterval = { 60, 0 }; static struct timeval lastprint; int bit, field; pv_entry_t pv; struct pv_chunk *pc; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(pv_entry_allocs++); pv_entry_count++; if (pv_entry_count > pv_entry_high_water) if (ratecheck(&lastprint, &printinterval)) printf("Approaching the limit on PV entries, consider " "increasing either the vm.pmap.shpgperproc or the " "vm.pmap.pv_entry_max tunable.\n"); retry: pc = TAILQ_FIRST(&pmap->pm_pvchunk); if (pc != NULL) { for (field = 0; field < _NPCM; field++) { if (pc->pc_map[field]) { bit = ffs(pc->pc_map[field]) - 1; break; } } if (field < _NPCM) { pv = &pc->pc_pventry[field * 32 + bit]; pc->pc_map[field] &= ~(1ul << bit); /* If this was the last item, move it to tail */ for (field = 0; field < _NPCM; field++) if (pc->pc_map[field] != 0) { PV_STAT(pv_entry_spare--); return (pv); /* not full, return */ } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(pv_entry_spare--); return (pv); } } /* * Access to the pte2list "pv_vafree" is synchronized by the pvh * global lock. If "pv_vafree" is currently non-empty, it will * remain non-empty until pmap_pte2list_alloc() completes. 
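* (For scale, given the CTASSERTs earlier in this file: one chunk page holds _NPCPV == 336 pv entries, i.e. 10 full 32-bit bitmap words plus a 16-bit tail, which is exactly what PC_FREE0_9 and PC_FREE10 encode.)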
*/ if (pv_vafree == 0 || (m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { if (try) { pv_entry_count--; PV_STAT(pc_chunk_tryfail++); return (NULL); } m = pmap_pv_reclaim(pmap); if (m == NULL) goto retry; } PV_STAT(pc_chunk_count++); PV_STAT(pc_chunk_allocs++); pc = (struct pv_chunk *)pmap_pte2list_alloc(&pv_vafree); pmap_qenter((vm_offset_t)pc, &m, 1); pc->pc_pmap = pmap; pc->pc_map[0] = pc_freemask[0] & ~1ul; /* preallocated bit 0 */ for (field = 1; field < _NPCM; field++) pc->pc_map[field] = pc_freemask[field]; TAILQ_INSERT_TAIL(&pv_chunks, pc, pc_lru); pv = &pc->pc_pventry[0]; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(pv_entry_spare += _NPCPV - 1); return (pv); } /* * Create a pv entry for page at pa for * (pmap, va). */ static void pmap_insert_entry(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); pv = get_pv_entry(pmap, FALSE); pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); } static __inline pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (pmap == PV_PMAP(pv) && va == pv->pv_va) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); break; } } return (pv); } static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pvh_free: pv not found")); free_pv_entry(pmap, pv); } static void pmap_remove_entry(pmap_t pmap, vm_page_t m, vm_offset_t va) { struct md_page *pvh; rw_assert(&pvh_global_lock, RA_WLOCKED); pmap_pvh_free(&m->md, pmap, va); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } static void pmap_pv_demote_pte1(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT((pa & PTE1_OFFSET) == 0, ("pmap_pv_demote_pte1: pa is not 1mpage aligned")); /* * Transfer the 1mpage's pv entry for this mapping to the first * page's pv list. */ pvh = pa_to_pvh(pa); va = pte1_trunc(va); pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pv_demote_pte1: pv not found")); m = PHYS_TO_VM_PAGE(pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); /* Instantiate the remaining NPTE2_IN_PT2 - 1 pv entries. */ va_last = va + PTE1_SIZE - PAGE_SIZE; do { m++; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_pv_demote_pte1: page %p is not managed", m)); va += PAGE_SIZE; pmap_insert_entry(pmap, va, m); } while (va < va_last); } static void pmap_pv_promote_pte1(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT((pa & PTE1_OFFSET) == 0, ("pmap_pv_promote_pte1: pa is not 1mpage aligned")); /* * Transfer the first page's pv entry for this mapping to the * 1mpage's pv list. Aside from avoiding the cost of a call * to get_pv_entry(), a transfer avoids the possibility that * get_pv_entry() calls pmap_pv_reclaim() and that pmap_pv_reclaim() * removes one of the mappings that is being promoted. 
*/ m = PHYS_TO_VM_PAGE(pa); va = pte1_trunc(va); pv = pmap_pvh_remove(&m->md, pmap, va); KASSERT(pv != NULL, ("pmap_pv_promote_pte1: pv not found")); pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); /* Free the remaining NPTE2_IN_PT2 - 1 pv entries. */ va_last = va + PTE1_SIZE - PAGE_SIZE; do { m++; va += PAGE_SIZE; pmap_pvh_free(&m->md, pmap, va); } while (va < va_last); } /* * Conditionally create a pv entry. */ static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); if (pv_entry_count < pv_entry_high_water && (pv = get_pv_entry(pmap, TRUE)) != NULL) { pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); return (TRUE); } else return (FALSE); } /* * Create the pv entries for each of the pages within a section. */ static boolean_t pmap_pv_insert_pte1(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); if (pv_entry_count < pv_entry_high_water && (pv = get_pv_entry(pmap, TRUE)) != NULL) { pv->pv_va = va; pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); return (TRUE); } else return (FALSE); } static inline void pmap_tlb_flush_pte1(pmap_t pmap, vm_offset_t va, pt1_entry_t npte1) { /* Kill all the small mappings or the big one only. */ if (pte1_is_section(npte1)) pmap_tlb_flush_range(pmap, pte1_trunc(va), PTE1_SIZE); else pmap_tlb_flush(pmap, pte1_trunc(va)); } /* * Update kernel pte1 on all pmaps. * * The following function is called only on one cpu with disabled interrupts. * In SMP case, smp_rendezvous_cpus() is used to stop other cpus. This way * nobody can invoke explicit hardware table walk during the update of pte1. * Unsolicited hardware table walk can still happen, invoked by speculative * data or instruction prefetch or even by speculative hardware table walk. * * The break-before-make approach should be implemented here. However, it's * not so easy to do that for kernel mappings as it would be unhappy to unmap * itself unexpectedly but voluntarily. */ static void pmap_update_pte1_kernel(vm_offset_t va, pt1_entry_t npte1) { pmap_t pmap; pt1_entry_t *pte1p; /* * Get current pmap. Interrupts should be disabled here * so PCPU_GET() is done atomically. */ pmap = PCPU_GET(curpmap); if (pmap == NULL) pmap = kernel_pmap; /* * (1) Change pte1 on current pmap. * (2) Flush all obsolete TLB entries on current CPU. * (3) Change pte1 on all pmaps. * (4) Flush all obsolete TLB entries on all CPUs in SMP case. */ pte1p = pmap_pte1(pmap, va); pte1_store(pte1p, npte1); /* Kill all the small mappings or the big one only. */ if (pte1_is_section(npte1)) { pmap_pte1_kern_promotions++; tlb_flush_range_local(pte1_trunc(va), PTE1_SIZE); } else { pmap_pte1_kern_demotions++; tlb_flush_local(pte1_trunc(va)); } /* * In SMP case, this function is called when all cpus are at smp * rendezvous, so there is no need to use 'allpmaps_lock' lock here. * In UP case, the function is called with this lock locked. */ LIST_FOREACH(pmap, &allpmaps, pm_list) { pte1p = pmap_pte1(pmap, va); pte1_store(pte1p, npte1); } #ifdef SMP /* Kill all the small mappings or the big one only. 
*/ if (pte1_is_section(npte1)) tlb_flush_range(pte1_trunc(va), PTE1_SIZE); else tlb_flush(pte1_trunc(va)); #endif } #ifdef SMP struct pte1_action { vm_offset_t va; pt1_entry_t npte1; u_int update; /* CPU that updates the PTE1 */ }; static void pmap_update_pte1_action(void *arg) { struct pte1_action *act = arg; if (act->update == PCPU_GET(cpuid)) pmap_update_pte1_kernel(act->va, act->npte1); } /* * Change pte1 on current pmap. * Note that kernel pte1 must be changed on all pmaps. * * According to the architecture reference manual published by ARM, * the behaviour is UNPREDICTABLE when two or more TLB entries map the same VA. * According to this manual, UNPREDICTABLE behaviours must never happen in * a viable system. In contrast, on x86 processors, it is not specified which * TLB entry mapping the virtual address will be used, but the MMU doesn't * generate a bogus translation the way it does on Cortex-A8 rev 2 (Beaglebone * Black). * * It's a problem when either promotion or demotion is being done. The pte1 * update and appropriate TLB flush must be done atomically in general. */ static void pmap_change_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va, pt1_entry_t npte1) { if (pmap == kernel_pmap) { struct pte1_action act; sched_pin(); act.va = va; act.npte1 = npte1; act.update = PCPU_GET(cpuid); smp_rendezvous_cpus(all_cpus, smp_no_rendevous_barrier, pmap_update_pte1_action, NULL, &act); sched_unpin(); } else { register_t cspr; /* * Use break-before-make approach for changing userland * mappings. It can cause L1 translation aborts on other * cores in SMP case. So, special treatment is implemented * in pmap_fault(). To reduce the likelihood that another core * will be affected by the broken mapping, disable interrupts * until the mapping change is completed. */ cspr = disable_interrupts(PSR_I | PSR_F); pte1_clear(pte1p); pmap_tlb_flush_pte1(pmap, va, npte1); pte1_store(pte1p, npte1); restore_interrupts(cspr); } } #else static void pmap_change_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va, pt1_entry_t npte1) { if (pmap == kernel_pmap) { mtx_lock_spin(&allpmaps_lock); pmap_update_pte1_kernel(va, npte1); mtx_unlock_spin(&allpmaps_lock); } else { register_t cspr; /* * Use break-before-make approach for changing userland * mappings. It's absolutely safe in UP case when interrupts * are disabled. */ cspr = disable_interrupts(PSR_I | PSR_F); pte1_clear(pte1p); pmap_tlb_flush_pte1(pmap, va, npte1); pte1_store(pte1p, npte1); restore_interrupts(cspr); } } #endif /* * Tries to promote the NPTE2_IN_PT2, contiguous 4KB page mappings that are * within a single page table page (PT2) to a single 1MB page mapping. * For promotion to occur, two conditions must be met: (1) the 4KB page * mappings must map aligned, contiguous physical memory and (2) the 4KB page * mappings must have identical characteristics. * * Managed (PG_MANAGED) mappings within the kernel address space are not * promoted. The reason is that kernel PTE1s are replicated in each pmap but * pmap_remove_write(), pmap_clear_modify(), and pmap_clear_reference() only * read the PTE1 from the kernel pmap. */ static void pmap_promote_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va) { pt1_entry_t npte1; pt2_entry_t *fpte2p, fpte2, fpte2_fav; pt2_entry_t *pte2p, pte2; vm_offset_t pteva __unused; vm_page_t m __unused; PDEBUG(6, printf("%s(%p): try for va %#x pte1 %#x at %p\n", __func__, pmap, va, pte1_load(pte1p), pte1p)); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Examine the first PTE2 in the specified PT2. 
Abort if this PTE2 is * either invalid, unused, or does not map the first 4KB physical page * within a 1MB page. */ fpte2p = pmap_pte2_quick(pmap, pte1_trunc(va)); fpte2 = pte2_load(fpte2p); if ((fpte2 & ((PTE2_FRAME & PTE1_OFFSET) | PTE2_A | PTE2_V)) != (PTE2_A | PTE2_V)) { pmap_pte1_p_failures++; CTR3(KTR_PMAP, "%s: failure(1) for va %#x in pmap %p", __func__, va, pmap); return; } if (pte2_is_managed(fpte2) && pmap == kernel_pmap) { pmap_pte1_p_failures++; CTR3(KTR_PMAP, "%s: failure(2) for va %#x in pmap %p", __func__, va, pmap); return; } if ((fpte2 & (PTE2_NM | PTE2_RO)) == PTE2_NM) { /* * When page is not modified, PTE2_RO can be set without * a TLB invalidation. */ fpte2 |= PTE2_RO; pte2_store(fpte2p, fpte2); } /* * Examine each of the other PTE2s in the specified PT2. Abort if this * PTE2 maps an unexpected 4KB physical page or does not have identical * characteristics to the first PTE2. */ fpte2_fav = (fpte2 & (PTE2_FRAME | PTE2_A | PTE2_V)); fpte2_fav += PTE1_SIZE - PTE2_SIZE; /* examine from the end */ for (pte2p = fpte2p + NPTE2_IN_PT2 - 1; pte2p > fpte2p; pte2p--) { pte2 = pte2_load(pte2p); if ((pte2 & (PTE2_FRAME | PTE2_A | PTE2_V)) != fpte2_fav) { pmap_pte1_p_failures++; CTR3(KTR_PMAP, "%s: failure(3) for va %#x in pmap %p", __func__, va, pmap); return; } if ((pte2 & (PTE2_NM | PTE2_RO)) == PTE2_NM) { /* * When page is not modified, PTE2_RO can be set * without a TLB invalidation. See note above. */ pte2 |= PTE2_RO; pte2_store(pte2p, pte2); pteva = pte1_trunc(va) | (pte2 & PTE1_OFFSET & PTE2_FRAME); CTR3(KTR_PMAP, "%s: protect for va %#x in pmap %p", __func__, pteva, pmap); } if ((pte2 & PTE2_PROMOTE) != (fpte2 & PTE2_PROMOTE)) { pmap_pte1_p_failures++; CTR3(KTR_PMAP, "%s: failure(4) for va %#x in pmap %p", __func__, va, pmap); return; } fpte2_fav -= PTE2_SIZE; } /* * The page table page in its current state will stay in PT2TAB * until the PTE1 mapping the section is demoted by pmap_demote_pte1() * or destroyed by pmap_remove_pte1(). * * Note that L2 page table size is not equal to PAGE_SIZE. */ m = PHYS_TO_VM_PAGE(trunc_page(pte1_link_pa(pte1_load(pte1p)))); KASSERT(m >= vm_page_array && m < &vm_page_array[vm_page_array_size], ("%s: PT2 page is out of range", __func__)); KASSERT(m->pindex == (pte1_index(va) & ~PT2PG_MASK), ("%s: PT2 page's pindex is wrong", __func__)); /* * Get pte1 from pte2 format. */ npte1 = (fpte2 & PTE1_FRAME) | ATTR_TO_L1(fpte2) | PTE1_V; /* * Promote the pv entries. */ if (pte2_is_managed(fpte2)) pmap_pv_promote_pte1(pmap, va, pte1_pa(npte1)); /* * Promote the mappings. */ pmap_change_pte1(pmap, pte1p, va, npte1); pmap_pte1_promotions++; CTR3(KTR_PMAP, "%s: success for va %#x in pmap %p", __func__, va, pmap); PDEBUG(6, printf("%s(%p): success for va %#x pte1 %#x(%#x) at %p\n", __func__, pmap, va, npte1, pte1_load(pte1p), pte1p)); } /* * Zero L2 page table page. */ static __inline void pmap_clear_pt2(pt2_entry_t *fpte2p) { pt2_entry_t *pte2p; for (pte2p = fpte2p; pte2p < fpte2p + NPTE2_IN_PT2; pte2p++) pte2_clear(pte2p); } /* * Removes a 1MB page mapping from the kernel pmap. */ static void pmap_remove_kernel_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va) { vm_page_t m; uint32_t pte1_idx; pt2_entry_t *fpte2p; vm_paddr_t pt2_pa; PMAP_LOCK_ASSERT(pmap, MA_OWNED); m = pmap_pt2_page(pmap, va); if (m == NULL) /* * QQQ: Is this function called only on promoted pte1? * We certainly do section mappings directly * (without promotion) in kernel !!! */ panic("%s: missing pt2 page", __func__); pte1_idx = pte1_index(va); /* * Initialize the L2 page table. 
*/ fpte2p = page_pt2(pt2map_pt2pg(va), pte1_idx); pmap_clear_pt2(fpte2p); /* * Remove the mapping. */ pt2_pa = page_pt2pa(VM_PAGE_TO_PHYS(m), pte1_idx); pmap_kenter_pte1(va, PTE1_LINK(pt2_pa)); /* * QQQ: We do not need to invalidate PT2MAP mapping * as we did not change it. I.e. the L2 page table page * was and still is mapped the same way. */ } /* * Do the things to unmap a section in a process. */ static void pmap_remove_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t sva, struct spglist *free) { pt1_entry_t opte1; struct md_page *pvh; vm_offset_t eva, va; vm_page_t m; PDEBUG(6, printf("%s(%p): va %#x pte1 %#x at %p\n", __func__, pmap, sva, pte1_load(pte1p), pte1p)); PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PTE1_OFFSET) == 0, ("%s: sva is not 1mpage aligned", __func__)); /* * Clear and invalidate the mapping. It should occupy one and only one * TLB entry. So, pmap_tlb_flush() called with an aligned address should * be sufficient. */ opte1 = pte1_load_clear(pte1p); pmap_tlb_flush(pmap, sva); if (pte1_is_wired(opte1)) pmap->pm_stats.wired_count -= PTE1_SIZE / PAGE_SIZE; pmap->pm_stats.resident_count -= PTE1_SIZE / PAGE_SIZE; if (pte1_is_managed(opte1)) { pvh = pa_to_pvh(pte1_pa(opte1)); pmap_pvh_free(pvh, pmap, sva); eva = sva + PTE1_SIZE; for (va = sva, m = PHYS_TO_VM_PAGE(pte1_pa(opte1)); va < eva; va += PAGE_SIZE, m++) { if (pte1_is_dirty(opte1)) vm_page_dirty(m); if (opte1 & PTE1_A) vm_page_aflag_set(m, PGA_REFERENCED); if (TAILQ_EMPTY(&m->md.pv_list) && TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } if (pmap == kernel_pmap) { /* * L2 page table(s) can't be removed from the kernel map as * the kernel counts on it (stuff around pmap_growkernel()). */ pmap_remove_kernel_pte1(pmap, pte1p, sva); } else { /* * Get associated L2 page table page. * It's possible that the page was never allocated. */ m = pmap_pt2_page(pmap, sva); if (m != NULL) pmap_unwire_pt2_all(pmap, sva, m, free); } } /* * Fills L2 page table page with mappings to consecutive physical pages. */ static __inline void pmap_fill_pt2(pt2_entry_t *fpte2p, pt2_entry_t npte2) { pt2_entry_t *pte2p; for (pte2p = fpte2p; pte2p < fpte2p + NPTE2_IN_PT2; pte2p++) { pte2_store(pte2p, npte2); npte2 += PTE2_SIZE; } } /* * Tries to demote a 1MB page mapping. If demotion fails, the * 1MB page mapping is invalidated. */ static boolean_t pmap_demote_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t va) { pt1_entry_t opte1, npte1; pt2_entry_t *fpte2p, npte2; vm_paddr_t pt2pg_pa, pt2_pa; vm_page_t m; struct spglist free; uint32_t pte1_idx, isnew = 0; PDEBUG(6, printf("%s(%p): try for va %#x pte1 %#x at %p\n", __func__, pmap, va, pte1_load(pte1p), pte1p)); PMAP_LOCK_ASSERT(pmap, MA_OWNED); opte1 = pte1_load(pte1p); KASSERT(pte1_is_section(opte1), ("%s: opte1 not a section", __func__)); if ((opte1 & PTE1_A) == 0 || (m = pmap_pt2_page(pmap, va)) == NULL) { KASSERT(!pte1_is_wired(opte1), ("%s: PT2 page for a wired mapping is missing", __func__)); /* * Invalidate the 1MB page mapping and return * "failure" if the mapping was never accessed or the * allocation of the new page table page fails. 
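 *
 * Destroying a never-accessed mapping this way is presumably safe:
 * a later access simply faults and re-creates the mapping on demand,
 * whereas an accessed mapping carries referenced (and possibly dirty)
 * state that the demotion below must preserve.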
*/ if ((opte1 & PTE1_A) == 0 || (m = vm_page_alloc(NULL, pte1_index(va) & ~PT2PG_MASK, VM_ALLOC_NOOBJ | VM_ALLOC_NORMAL | VM_ALLOC_WIRED)) == NULL) { SLIST_INIT(&free); pmap_remove_pte1(pmap, pte1p, pte1_trunc(va), &free); pmap_free_zero_pages(&free); CTR3(KTR_PMAP, "%s: failure for va %#x in pmap %p", __func__, va, pmap); return (FALSE); } if (va < VM_MAXUSER_ADDRESS) pmap->pm_stats.resident_count++; isnew = 1; /* * We init all L2 page tables in the page even if * we are going to change everything for just one L2 * page table in a moment. */ pt2pg_pa = pmap_pt2pg_init(pmap, va, m); } else { if (va < VM_MAXUSER_ADDRESS) { if (pt2_is_empty(m, va)) isnew = 1; /* Demoting section w/o promotion. */ #ifdef INVARIANTS else KASSERT(pt2_is_full(m, va), ("%s: bad PT2 wire" " count %u", __func__, pt2_wirecount_get(m, pte1_index(va)))); #endif } } pt2pg_pa = VM_PAGE_TO_PHYS(m); pte1_idx = pte1_index(va); /* * If the pmap is current, then the PT2MAP can provide access to * the page table page (promoted L2 page tables are not unmapped). * Otherwise, temporarily map the L2 page table page (m) into * the kernel's address space at either PADDR1 or PADDR2. * * Note that L2 page table size is not equal to PAGE_SIZE. */ if (pmap_is_current(pmap)) fpte2p = page_pt2(pt2map_pt2pg(va), pte1_idx); else if (curthread->td_pinned > 0 && rw_wowned(&pvh_global_lock)) { if (pte2_pa(pte2_load(PMAP1)) != pt2pg_pa) { pte2_store(PMAP1, PTE2_KPT(pt2pg_pa)); #ifdef SMP PMAP1cpu = PCPU_GET(cpuid); #endif tlb_flush_local((vm_offset_t)PADDR1); PMAP1changed++; } else #ifdef SMP if (PMAP1cpu != PCPU_GET(cpuid)) { PMAP1cpu = PCPU_GET(cpuid); tlb_flush_local((vm_offset_t)PADDR1); PMAP1changedcpu++; } else #endif PMAP1unchanged++; fpte2p = page_pt2((vm_offset_t)PADDR1, pte1_idx); } else { mtx_lock(&PMAP2mutex); if (pte2_pa(pte2_load(PMAP2)) != pt2pg_pa) { pte2_store(PMAP2, PTE2_KPT(pt2pg_pa)); tlb_flush((vm_offset_t)PADDR2); } fpte2p = page_pt2((vm_offset_t)PADDR2, pte1_idx); } pt2_pa = page_pt2pa(pt2pg_pa, pte1_idx); npte1 = PTE1_LINK(pt2_pa); KASSERT((opte1 & PTE1_A) != 0, ("%s: opte1 is missing PTE1_A", __func__)); KASSERT((opte1 & (PTE1_NM | PTE1_RO)) != PTE1_NM, ("%s: opte1 has PTE1_NM", __func__)); /* * Get pte2 from pte1 format. */ npte2 = pte1_pa(opte1) | ATTR_TO_L2(opte1) | PTE2_V; /* * If the L2 page table page is new, initialize it. If the mapping * has changed attributes, update the page table entries. */ if (isnew != 0) { pt2_wirecount_set(m, pte1_idx, NPTE2_IN_PT2); pmap_fill_pt2(fpte2p, npte2); } else if ((pte2_load(fpte2p) & PTE2_PROMOTE) != (npte2 & PTE2_PROMOTE)) pmap_fill_pt2(fpte2p, npte2); KASSERT(pte2_pa(pte2_load(fpte2p)) == pte2_pa(npte2), ("%s: fpte2p and npte2 map different physical addresses", __func__)); if (fpte2p == PADDR2) mtx_unlock(&PMAP2mutex); /* * Demote the mapping. This pmap is locked. The old PTE1 has * PTE1_A set. If the old PTE1 does not have PTE1_RO set, it also * does not have PTE1_NM set. Thus, there is no danger of a race with * another processor changing the setting of PTE1_A and/or PTE1_NM * between the read above and the store below. */ pmap_change_pte1(pmap, pte1p, va, npte1); /* * Demote the pv entry. This depends on the earlier demotion * of the mapping. Specifically, the (re)creation of a per- * page pv entry might trigger the execution of pmap_pv_reclaim(), * which might reclaim a newly (re)created per-page pv entry * and destroy the associated mapping. In order to destroy * the mapping, the PTE1 must have already changed from mapping * the 1mpage to referencing the page table page. 
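 *
 * Ordering sketch (illustrative):
 *
 *	pmap_change_pte1(pmap, pte1p, va, npte1);	(1) pte1 -> PT2 link
 *	pmap_pv_demote_pte1(pmap, va, pte1_pa(opte1));	(2) may allocate pv
 *							    entries and so call
 *							    pmap_pv_reclaim()
 *
 * Doing (2) before (1) could let pmap_pv_reclaim() tear down a 4KB
 * mapping while the old PTE1 still maps the whole 1MB section.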
*/ if (pte1_is_managed(opte1)) pmap_pv_demote_pte1(pmap, va, pte1_pa(opte1)); pmap_pte1_demotions++; CTR3(KTR_PMAP, "%s: success for va %#x in pmap %p", __func__, va, pmap); PDEBUG(6, printf("%s(%p): success for va %#x pte1 %#x(%#x) at %p\n", __func__, pmap, va, npte1, pte1_load(pte1p), pte1p)); return (TRUE); } /* * Insert the given physical page (p) at * the specified virtual address (v) in the * target physical map with the protection requested. * * If specified, the page will be wired down, meaning * that the related pte can not be reclaimed. * * NB: This is the only routine which MAY NOT lazy-evaluate * or lose information. That is, this routine must actually * insert this page into the given map NOW. */ int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { pt1_entry_t *pte1p; pt2_entry_t *pte2p; pt2_entry_t npte2, opte2; pv_entry_t pv; vm_paddr_t opa, pa; vm_page_t mpte2, om; boolean_t wired; va = trunc_page(va); mpte2 = NULL; wired = (flags & PMAP_ENTER_WIRED) != 0; KASSERT(va <= vm_max_kernel_address, ("%s: toobig", __func__)); KASSERT(va < UPT2V_MIN_ADDRESS || va >= UPT2V_MAX_ADDRESS, ("%s: invalid to pmap_enter page table pages (va: 0x%x)", __func__, va)); if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); sched_pin(); /* * In the case that a page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { mpte2 = pmap_allocpte2(pmap, va, flags); if (mpte2 == NULL) { KASSERT((flags & PMAP_ENTER_NOSLEEP) != 0, ("pmap_allocpte2 failed with sleep allowed")); sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_RESOURCE_SHORTAGE); } } pte1p = pmap_pte1(pmap, va); if (pte1_is_section(pte1_load(pte1p))) panic("%s: attempted on 1MB page", __func__); pte2p = pmap_pte2_quick(pmap, va); if (pte2p == NULL) panic("%s: invalid L1 page table entry va=%#x", __func__, va); om = NULL; pa = VM_PAGE_TO_PHYS(m); opte2 = pte2_load(pte2p); opa = pte2_pa(opte2); /* * Mapping has not changed, must be protection or wiring change. */ if (pte2_is_valid(opte2) && (opa == pa)) { /* * Wiring change, just update stats. We don't worry about * wiring PT2 pages as they remain resident as long as there * are valid mappings in them. Hence, if a user page is wired, * the PT2 page will be also. */ if (wired && !pte2_is_wired(opte2)) pmap->pm_stats.wired_count++; else if (!wired && pte2_is_wired(opte2)) pmap->pm_stats.wired_count--; /* * Remove extra pte2 reference */ if (mpte2) pt2_wirecount_dec(mpte2, pte1_index(va)); if (pte2_is_managed(opte2)) om = m; goto validate; } /* * QQQ: We think that changing physical address on writeable mapping * is not safe. Well, maybe on kernel address space with correct * locking, it can make a sense. However, we have no idea why * anyone should do that on user address space. Are we wrong? */ KASSERT((opa == 0) || (opa == pa) || !pte2_is_valid(opte2) || ((opte2 & PTE2_RO) != 0), ("%s: pmap %p va %#x(%#x) opa %#x pa %#x - gotcha %#x %#x!", __func__, pmap, va, opte2, opa, pa, flags, prot)); pv = NULL; /* * Mapping has changed, invalidate old range and fall through to * handle validating new mapping. 
*/ if (opa) { if (pte2_is_wired(opte2)) pmap->pm_stats.wired_count--; if (pte2_is_managed(opte2)) { om = PHYS_TO_VM_PAGE(opa); pv = pmap_pvh_remove(&om->md, pmap, va); } /* * Remove extra pte2 reference */ if (mpte2 != NULL) pt2_wirecount_dec(mpte2, va >> PTE1_SHIFT); } else pmap->pm_stats.resident_count++; /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva, ("%s: managed mapping within the clean submap", __func__)); if (pv == NULL) pv = get_pv_entry(pmap, FALSE); pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); } else if (pv != NULL) free_pv_entry(pmap, pv); /* * Increment counters */ if (wired) pmap->pm_stats.wired_count++; validate: /* * Now validate mapping with desired protection/wiring. */ npte2 = PTE2(pa, PTE2_NM, vm_page_pte2_attr(m)); if (prot & VM_PROT_WRITE) { if (pte2_is_managed(npte2)) vm_page_aflag_set(m, PGA_WRITEABLE); } else npte2 |= PTE2_RO; if ((prot & VM_PROT_EXECUTE) == 0) npte2 |= PTE2_NX; if (wired) npte2 |= PTE2_W; if (va < VM_MAXUSER_ADDRESS) npte2 |= PTE2_U; if (pmap != kernel_pmap) npte2 |= PTE2_NG; /* * If the mapping or permission bits are different, we need * to update the pte2. * * QQQ: Think again and again what to do * if the mapping is going to be changed! */ if ((opte2 & ~(PTE2_NM | PTE2_A)) != (npte2 & ~(PTE2_NM | PTE2_A))) { /* * Sync icache if exec permission and attribute VM_MEMATTR_WB_WA * is set. Do it now, before the mapping is stored and made * valid for hardware table walk. If done later, there is a race * for other threads of the current process in the lazy loading case. * Don't do it for kernel memory which is mapped with exec * permission even if the memory isn't going to hold executable * code. The only time when icache sync is needed is after a * kernel module is loaded and the relocation info is processed. * And it's done in elf_cpu_load_file(). * * QQQ: (1) Is there any better way where * or how to sync the icache? * (2) Now, we do it on a page basis. */ if ((prot & VM_PROT_EXECUTE) && pmap != kernel_pmap && m->md.pat_mode == VM_MEMATTR_WB_WA && (opa != pa || (opte2 & PTE2_NX))) cache_icache_sync_fresh(va, pa, PAGE_SIZE); npte2 |= PTE2_A; if (flags & VM_PROT_WRITE) npte2 &= ~PTE2_NM; if (opte2 & PTE2_V) { /* Change mapping with break-before-make approach. */ opte2 = pte2_load_clear(pte2p); pmap_tlb_flush(pmap, va); pte2_store(pte2p, npte2); if (opte2 & PTE2_A) { if (pte2_is_managed(opte2)) vm_page_aflag_set(om, PGA_REFERENCED); } if (pte2_is_dirty(opte2)) { if (pte2_is_managed(opte2)) vm_page_dirty(om); } if (pte2_is_managed(opte2) && TAILQ_EMPTY(&om->md.pv_list) && ((om->flags & PG_FICTITIOUS) != 0 || TAILQ_EMPTY(&pa_to_pvh(opa)->pv_list))) vm_page_aflag_clear(om, PGA_WRITEABLE); } else pte2_store(pte2p, npte2); } #if 0 else { /* * QQQ: Once both the access and the modified bits are * emulated by software, this should not happen. Some * analysis is needed if it really does. A missing * TLB flush somewhere could be the reason. */ panic("%s: pmap %p va %#x opte2 %x npte2 %x !!", __func__, pmap, va, opte2, npte2); } #endif /* * If both the L2 page table page and the reservation are fully * populated, then attempt promotion. */ if ((mpte2 == NULL || pt2_is_full(mpte2, va)) && sp_enabled && (m->flags & PG_FICTITIOUS) == 0 && vm_reserv_level_iffullpop(m) == 0) pmap_promote_pte1(pmap, pte1p, va); sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } /* * Do the things to unmap a page in a process. 
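 *
 * Typical usage, mirroring pmap_remove_page() below (the pv global
 * lock and the pmap lock must be held and the thread pinned, as
 * asserted in the function):
 *
 *	SLIST_INIT(&free);
 *	pte2p = pmap_pte2_quick(pmap, va);
 *	if (pte2p != NULL && pte2_is_valid(pte2_load(pte2p)))
 *		pmap_remove_pte2(pmap, pte2p, va, &free);
 *	pmap_free_zero_pages(&free);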
*/ static int pmap_remove_pte2(pmap_t pmap, pt2_entry_t *pte2p, vm_offset_t va, struct spglist *free) { pt2_entry_t opte2; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* Clear and invalidate the mapping. */ opte2 = pte2_load_clear(pte2p); pmap_tlb_flush(pmap, va); KASSERT(pte2_is_valid(opte2), ("%s: pmap %p va %#x not link pte2 %#x", __func__, pmap, va, opte2)); if (opte2 & PTE2_W) pmap->pm_stats.wired_count -= 1; pmap->pm_stats.resident_count -= 1; if (pte2_is_managed(opte2)) { m = PHYS_TO_VM_PAGE(pte2_pa(opte2)); if (pte2_is_dirty(opte2)) vm_page_dirty(m); if (opte2 & PTE2_A) vm_page_aflag_set(m, PGA_REFERENCED); pmap_remove_entry(pmap, m, va); } return (pmap_unuse_pt2(pmap, va, free)); } /* * Remove a single page from a process address space. */ static void pmap_remove_page(pmap_t pmap, vm_offset_t va, struct spglist *free) { pt2_entry_t *pte2p; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT(curthread->td_pinned > 0, ("%s: curthread not pinned", __func__)); PMAP_LOCK_ASSERT(pmap, MA_OWNED); if ((pte2p = pmap_pte2_quick(pmap, va)) == NULL || !pte2_is_valid(pte2_load(pte2p))) return; pmap_remove_pte2(pmap, pte2p, va, free); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly * rounded to the page size. */ void pmap_remove(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t nextva; pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, pte2; struct spglist free; /* * Perform an unsynchronized read. This is, however, safe. */ if (pmap->pm_stats.resident_count == 0) return; SLIST_INIT(&free); rw_wlock(&pvh_global_lock); sched_pin(); PMAP_LOCK(pmap); /* * Special handling of removing one page. A very common * operation and easy to short circuit some code. */ if (sva + PAGE_SIZE == eva) { pte1 = pte1_load(pmap_pte1(pmap, sva)); if (pte1_is_link(pte1)) { pmap_remove_page(pmap, sva, &free); goto out; } } for (; sva < eva; sva = nextva) { /* * Calculate address for next L2 page table. */ nextva = pte1_trunc(sva + PTE1_SIZE); if (nextva < sva) nextva = eva; if (pmap->pm_stats.resident_count == 0) break; pte1p = pmap_pte1(pmap, sva); pte1 = pte1_load(pte1p); /* * Weed out invalid mappings. Note: we assume that the L1 page * table is always allocated, and in kernel virtual. */ if (pte1 == 0) continue; if (pte1_is_section(pte1)) { /* * Are we removing the entire large page? If not, * demote the mapping and fall through. */ if (sva + PTE1_SIZE == nextva && eva >= nextva) { pmap_remove_pte1(pmap, pte1p, sva, &free); continue; } else if (!pmap_demote_pte1(pmap, pte1p, sva)) { /* The large page mapping was destroyed. */ continue; } #ifdef INVARIANTS else { /* Update pte1 after demotion. */ pte1 = pte1_load(pte1p); } #endif } KASSERT(pte1_is_link(pte1), ("%s: pmap %p va %#x pte1 %#x at %p" " is not link", __func__, pmap, sva, pte1, pte1p)); /* * Limit our scan to either the end of the va represented * by the current L2 page table page, or to the end of the * range being removed. */ if (nextva > eva) nextva = eva; for (pte2p = pmap_pte2_quick(pmap, sva); sva != nextva; pte2p++, sva += PAGE_SIZE) { pte2 = pte2_load(pte2p); if (!pte2_is_valid(pte2)) continue; if (pmap_remove_pte2(pmap, pte2p, sva, &free)) break; } } out: sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * Routine: pmap_remove_all * Function: * Removes this physical page from * all physical maps in which it resides. * Reflects back modify bits to the pager. 
* * Notes: * Original versions of this routine were very * inefficient because they iteratively called * pmap_remove (slow...) */ void pmap_remove_all(vm_page_t m) { struct md_page *pvh; pv_entry_t pv; pmap_t pmap; pt2_entry_t *pte2p, opte2; pt1_entry_t *pte1p; vm_offset_t va; struct spglist free; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); SLIST_INIT(&free); rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); while ((pv = TAILQ_FIRST(&pvh->pv_list)) != NULL) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, va); (void)pmap_demote_pte1(pmap, pte1p, va); PMAP_UNLOCK(pmap); } small_mappings: while ((pv = TAILQ_FIRST(&m->md.pv_list)) != NULL) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pmap->pm_stats.resident_count--; pte1p = pmap_pte1(pmap, pv->pv_va); KASSERT(!pte1_is_section(pte1_load(pte1p)), ("%s: found " "a 1mpage in page %p's pv list", __func__, m)); pte2p = pmap_pte2_quick(pmap, pv->pv_va); opte2 = pte2_load_clear(pte2p); pmap_tlb_flush(pmap, pv->pv_va); KASSERT(pte2_is_valid(opte2), ("%s: pmap %p va %x zero pte2", __func__, pmap, pv->pv_va)); if (pte2_is_wired(opte2)) pmap->pm_stats.wired_count--; if (opte2 & PTE2_A) vm_page_aflag_set(m, PGA_REFERENCED); /* * Update the vm_page_t clean and reference bits. */ if (pte2_is_dirty(opte2)) vm_page_dirty(m); pmap_unuse_pt2(pmap, pv->pv_va, &free); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); free_pv_entry(pmap, pv); PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); sched_unpin(); rw_wunlock(&pvh_global_lock); pmap_free_zero_pages(&free); } /* * Just subroutine for pmap_remove_pages() to reasonably satisfy * good coding style, a.k.a. 80 character line width limit hell. */ static __inline void pmap_remove_pte1_quick(pmap_t pmap, pt1_entry_t pte1, pv_entry_t pv, struct spglist *free) { vm_paddr_t pa; vm_page_t m, mt, mpt2pg; struct md_page *pvh; pa = pte1_pa(pte1); m = PHYS_TO_VM_PAGE(pa); KASSERT(m->phys_addr == pa, ("%s: vm_page_t %p addr mismatch %#x %#x", __func__, m, m->phys_addr, pa)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("%s: bad pte1 %#x", __func__, pte1)); if (pte1_is_dirty(pte1)) { for (mt = m; mt < &m[PTE1_SIZE / PAGE_SIZE]; mt++) vm_page_dirty(mt); } pmap->pm_stats.resident_count -= PTE1_SIZE / PAGE_SIZE; pvh = pa_to_pvh(pa); TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); if (TAILQ_EMPTY(&pvh->pv_list)) { for (mt = m; mt < &m[PTE1_SIZE / PAGE_SIZE]; mt++) if (TAILQ_EMPTY(&mt->md.pv_list)) vm_page_aflag_clear(mt, PGA_WRITEABLE); } mpt2pg = pmap_pt2_page(pmap, pv->pv_va); if (mpt2pg != NULL) pmap_unwire_pt2_all(pmap, pv->pv_va, mpt2pg, free); } /* * Just subroutine for pmap_remove_pages() to reasonably satisfy * good coding style, a.k.a. 80 character line width limit hell. 
*/ static __inline void pmap_remove_pte2_quick(pmap_t pmap, pt2_entry_t pte2, pv_entry_t pv, struct spglist *free) { vm_paddr_t pa; vm_page_t m; struct md_page *pvh; pa = pte2_pa(pte2); m = PHYS_TO_VM_PAGE(pa); KASSERT(m->phys_addr == pa, ("%s: vm_page_t %p addr mismatch %#x %#x", __func__, m, m->phys_addr, pa)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("%s: bad pte2 %#x", __func__, pte2)); if (pte2_is_dirty(pte2)) vm_page_dirty(m); pmap->pm_stats.resident_count--; TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(pa); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } pmap_unuse_pt2(pmap, pv->pv_va, free); } /* * Remove all pages from the specified address space; this aids process * exit speeds. Also, this code is special-cased for the current process * only, but can have the more generic (and slightly slower) mode enabled. * This is much faster than pmap_remove in the case of running down * an entire address space. */ void pmap_remove_pages(pmap_t pmap) { pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, pte2; pv_entry_t pv; struct pv_chunk *pc, *npc; struct spglist free; int field, idx; int32_t bit; uint32_t inuse, bitmask; boolean_t allfree; /* * Assert that the given pmap is only active on the current * CPU. Unfortunately, we cannot block another CPU from * activating the pmap while this function is executing. */ KASSERT(pmap == vmspace_pmap(curthread->td_proc->p_vmspace), ("%s: non-current pmap %p", __func__, pmap)); #if defined(SMP) && defined(INVARIANTS) { cpuset_t other_cpus; sched_pin(); other_cpus = pmap->pm_active; CPU_CLR(PCPU_GET(cpuid), &other_cpus); sched_unpin(); KASSERT(CPU_EMPTY(&other_cpus), ("%s: pmap %p active on other cpus", __func__, pmap)); } #endif SLIST_INIT(&free); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); sched_pin(); TAILQ_FOREACH_SAFE(pc, &pmap->pm_pvchunk, pc_list, npc) { KASSERT(pc->pc_pmap == pmap, ("%s: wrong pmap %p %p", __func__, pmap, pc->pc_pmap)); allfree = TRUE; for (field = 0; field < _NPCM; field++) { inuse = (~(pc->pc_map[field])) & pc_freemask[field]; while (inuse != 0) { bit = ffs(inuse) - 1; bitmask = 1UL << bit; idx = field * 32 + bit; pv = &pc->pc_pventry[idx]; inuse &= ~bitmask; /* * Note that we cannot remove wired pages * from a process' mapping at this time. */ pte1p = pmap_pte1(pmap, pv->pv_va); pte1 = pte1_load(pte1p); if (pte1_is_section(pte1)) { if (pte1_is_wired(pte1)) { allfree = FALSE; continue; } pte1_clear(pte1p); pmap_remove_pte1_quick(pmap, pte1, pv, &free); } else if (pte1_is_link(pte1)) { pte2p = pt2map_entry(pv->pv_va); pte2 = pte2_load(pte2p); if (!pte2_is_valid(pte2)) { printf("%s: pmap %p va %#x " "pte2 %#x\n", __func__, pmap, pv->pv_va, pte2); panic("bad pte2"); } if (pte2_is_wired(pte2)) { allfree = FALSE; continue; } pte2_clear(pte2p); pmap_remove_pte2_quick(pmap, pte2, pv, &free); } else { printf("%s: pmap %p va %#x pte1 %#x\n", __func__, pmap, pv->pv_va, pte1); panic("bad pte1"); } /* Mark free */ PV_STAT(pv_entry_frees++); PV_STAT(pv_entry_spare++); pv_entry_count--; pc->pc_map[field] |= bitmask; } } if (allfree) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } } tlb_flush_all_ng_local(); sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * This code makes some *MAJOR* assumptions: * 1. Current pmap & pmap exists. * 2. Not wired. * 3. Read access. * 4. No L2 page table pages. * but is *MUCH* faster than pmap_enter... 
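 *
 * Concretely (see the code below): the new mapping is entered with
 * PTE2_RO and PTE2_NM set and without PTE2_W, matching the read-only,
 * not-wired assumptions above; a later write access faults and the
 * modified-bit emulation takes over from there.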
*/ static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpt2pg) { pt2_entry_t *pte2p, pte2; vm_paddr_t pa; struct spglist free; uint32_t l2prot; KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva || (m->oflags & VPO_UNMANAGED) != 0, ("%s: managed mapping within the clean submap", __func__)); rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * In the case that an L2 page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { u_int pte1_idx; pt1_entry_t pte1, *pte1p; vm_paddr_t pt2_pa; /* * Get L1 page table things. */ pte1_idx = pte1_index(va); pte1p = pmap_pte1(pmap, va); pte1 = pte1_load(pte1p); if (mpt2pg && (mpt2pg->pindex == (pte1_idx & ~PT2PG_MASK))) { /* * Each of NPT2_IN_PG L2 page tables on the page can * come here. Make sure that the associated L1 page table * link is established. * * QQQ: It turns out that we don't establish all links to * L2 page tables for a newly allocated L2 page * tables page. */ KASSERT(!pte1_is_section(pte1), ("%s: pte1 %#x is section", __func__, pte1)); if (!pte1_is_link(pte1)) { pt2_pa = page_pt2pa(VM_PAGE_TO_PHYS(mpt2pg), pte1_idx); pte1_store(pte1p, PTE1_LINK(pt2_pa)); } pt2_wirecount_inc(mpt2pg, pte1_idx); } else { /* * If the L2 page table page is mapped, we just * increment the hold count, and activate it. */ if (pte1_is_section(pte1)) { return (NULL); } else if (pte1_is_link(pte1)) { mpt2pg = PHYS_TO_VM_PAGE(pte1_link_pa(pte1)); pt2_wirecount_inc(mpt2pg, pte1_idx); } else { mpt2pg = _pmap_allocpte2(pmap, va, PMAP_ENTER_NOSLEEP); if (mpt2pg == NULL) return (NULL); } } } else { mpt2pg = NULL; } /* * This call to pt2map_entry() makes the assumption that we are * entering the page into the current pmap. In order to support * quick entry into any pmap, one would likely use pmap_pte2_quick(). * But that isn't as quick as pt2map_entry(). */ pte2p = pt2map_entry(va); pte2 = pte2_load(pte2p); if (pte2_is_valid(pte2)) { if (mpt2pg != NULL) { /* * Remove extra pte2 reference */ pt2_wirecount_dec(mpt2pg, pte1_index(va)); mpt2pg = NULL; } return (NULL); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0 && !pmap_try_insert_pv_entry(pmap, va, m)) { if (mpt2pg != NULL) { SLIST_INIT(&free); if (pmap_unwire_pt2(pmap, va, mpt2pg, &free)) { pmap_tlb_flush(pmap, va); pmap_free_zero_pages(&free); } mpt2pg = NULL; } return (NULL); } /* * Increment counters */ pmap->pm_stats.resident_count++; /* * Now validate mapping with RO protection */ pa = VM_PAGE_TO_PHYS(m); l2prot = PTE2_RO | PTE2_NM; if (va < VM_MAXUSER_ADDRESS) l2prot |= PTE2_U | PTE2_NG; if ((prot & VM_PROT_EXECUTE) == 0) l2prot |= PTE2_NX; else if (m->md.pat_mode == VM_MEMATTR_WB_WA && pmap != kernel_pmap) { /* * Sync icache if exec permission and attribute VM_MEMATTR_WB_WA * is set. QQQ: For more info, see comments in pmap_enter(). */ cache_icache_sync_fresh(va, pa, PAGE_SIZE); } pte2_store(pte2p, PTE2(pa, l2prot, vm_page_pte2_attr(m))); return (mpt2pg); } void pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); (void)pmap_enter_quick_locked(pmap, va, m, prot, NULL); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * Tries to create a 1MB page mapping. Returns TRUE if successful and * FALSE otherwise. 
Fails if (1) a page table page cannot be allocated without * blocking, (2) a mapping already exists at the specified virtual address, or * (3) a pv entry cannot be allocated without reclaiming another pv entry. */ static boolean_t pmap_enter_pte1(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { pt1_entry_t *pte1p; vm_paddr_t pa; uint32_t l1prot; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); pte1p = pmap_pte1(pmap, va); if (pte1_is_valid(pte1_load(pte1p))) { CTR3(KTR_PMAP, "%s: failure for va %#lx in pmap %p", __func__, va, pmap); return (FALSE); } if ((m->oflags & VPO_UNMANAGED) == 0) { /* * Abort this mapping if its PV entry could not be created. */ if (!pmap_pv_insert_pte1(pmap, va, VM_PAGE_TO_PHYS(m))) { CTR3(KTR_PMAP, "%s: failure for va %#lx in pmap %p", __func__, va, pmap); return (FALSE); } } /* * Increment counters. */ pmap->pm_stats.resident_count += PTE1_SIZE / PAGE_SIZE; /* * Map the section. * * QQQ: Why is VM_PROT_WRITE not evaluated, and why is the mapping * made read-only? */ pa = VM_PAGE_TO_PHYS(m); l1prot = PTE1_RO | PTE1_NM; if (va < VM_MAXUSER_ADDRESS) l1prot |= PTE1_U | PTE1_NG; if ((prot & VM_PROT_EXECUTE) == 0) l1prot |= PTE1_NX; else if (m->md.pat_mode == VM_MEMATTR_WB_WA && pmap != kernel_pmap) { /* * Sync icache if exec permission and attribute VM_MEMATTR_WB_WA * is set. QQQ: For more info, see comments in pmap_enter(). */ cache_icache_sync_fresh(va, pa, PTE1_SIZE); } pte1_store(pte1p, PTE1(pa, l1prot, ATTR_TO_L1(vm_page_pte2_attr(m)))); pmap_pte1_mappings++; CTR3(KTR_PMAP, "%s: success for va %#lx in pmap %p", __func__, va, pmap); return (TRUE); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ void pmap_enter_object(pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_offset_t va; vm_page_t m, mpt2pg; vm_pindex_t diff, psize; PDEBUG(6, printf("%s: pmap %p start %#x end %#x m %p prot %#x\n", __func__, pmap, start, end, m_start, prot)); VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); mpt2pg = NULL; m = m_start; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { va = start + ptoa(diff); if ((va & PTE1_OFFSET) == 0 && va + PTE1_SIZE <= end && m->psind == 1 && sp_enabled && pmap_enter_pte1(pmap, va, m, prot)) m = &m[PTE1_SIZE / PAGE_SIZE - 1]; else mpt2pg = pmap_enter_quick_locked(pmap, va, m, prot, mpt2pg); m = TAILQ_NEXT(m, listq); } rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * This code maps large physical mmap regions into the * processor address space. Note that some shortcuts * are taken, but the code works. 
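 *
 * E.g. (illustrative): a 4MB, 1MB-aligned device object whose pages
 * are physically contiguous and share one memory attribute is entered
 * as four PTE1 section mappings; if any precondition fails, the
 * function silently returns and the region is mapped by ordinary page
 * faults instead.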
*/ void pmap_object_init_pt(pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { pt1_entry_t *pte1p; vm_paddr_t pa, pte2_pa; vm_page_t p; vm_memattr_t pat_mode; u_int l1attr, l1prot; VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("%s: non-device object", __func__)); if ((addr & PTE1_OFFSET) == 0 && (size & PTE1_OFFSET) == 0) { if (!vm_object_populate(object, pindex, pindex + atop(size))) return; p = vm_page_lookup(object, pindex); KASSERT(p->valid == VM_PAGE_BITS_ALL, ("%s: invalid page %p", __func__, p)); pat_mode = p->md.pat_mode; /* * Abort the mapping if the first page is not physically * aligned to a 1MB page boundary. */ pte2_pa = VM_PAGE_TO_PHYS(p); if (pte2_pa & PTE1_OFFSET) return; /* * Skip the first page. Abort the mapping if the rest of * the pages are not physically contiguous or have differing * memory attributes. */ p = TAILQ_NEXT(p, listq); for (pa = pte2_pa + PAGE_SIZE; pa < pte2_pa + size; pa += PAGE_SIZE) { KASSERT(p->valid == VM_PAGE_BITS_ALL, ("%s: invalid page %p", __func__, p)); if (pa != VM_PAGE_TO_PHYS(p) || pat_mode != p->md.pat_mode) return; p = TAILQ_NEXT(p, listq); } /* * Map using 1MB pages. * * QQQ: Well, we are mapping a section, so the same conditions must * hold as during promotion. It looks like only RW mappings are * done here, so read-only mappings must be done elsewhere. */ l1prot = PTE1_U | PTE1_NG | PTE1_RW | PTE1_M | PTE1_A; l1attr = ATTR_TO_L1(vm_memattr_to_pte2(pat_mode)); PMAP_LOCK(pmap); for (pa = pte2_pa; pa < pte2_pa + size; pa += PTE1_SIZE) { pte1p = pmap_pte1(pmap, addr); if (!pte1_is_valid(pte1_load(pte1p))) { pte1_store(pte1p, PTE1(pa, l1prot, l1attr)); pmap->pm_stats.resident_count += PTE1_SIZE / PAGE_SIZE; pmap_pte1_mappings++; } /* Else continue on if the PTE1 is already valid. */ addr += PTE1_SIZE; } PMAP_UNLOCK(pmap); } } /* * Do the things to protect a 1mpage in a process. */ static void pmap_protect_pte1(pmap_t pmap, pt1_entry_t *pte1p, vm_offset_t sva, vm_prot_t prot) { pt1_entry_t npte1, opte1; vm_offset_t eva, va; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PTE1_OFFSET) == 0, ("%s: sva is not 1mpage aligned", __func__)); opte1 = npte1 = pte1_load(pte1p); if (pte1_is_managed(opte1)) { eva = sva + PTE1_SIZE; for (va = sva, m = PHYS_TO_VM_PAGE(pte1_pa(opte1)); va < eva; va += PAGE_SIZE, m++) if (pte1_is_dirty(opte1)) vm_page_dirty(m); } if ((prot & VM_PROT_WRITE) == 0) npte1 |= PTE1_RO | PTE1_NM; if ((prot & VM_PROT_EXECUTE) == 0) npte1 |= PTE1_NX; /* * QQQ: Herein, execute permission is never set. * It can only be cleared. So, no icache * syncing is needed. */ if (npte1 != opte1) { pte1_store(pte1p, npte1); pmap_tlb_flush(pmap, sva); } } /* * Set the physical protection on the * specified range of this map as requested. */ void pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { boolean_t pv_lists_locked; vm_offset_t nextva; pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, opte2, npte2; KASSERT((prot & ~VM_PROT_ALL) == 0, ("invalid prot %x", prot)); if (prot == VM_PROT_NONE) { pmap_remove(pmap, sva, eva); return; } if ((prot & (VM_PROT_WRITE | VM_PROT_EXECUTE)) == (VM_PROT_WRITE | VM_PROT_EXECUTE)) return; if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } PMAP_LOCK(pmap); for (; sva < eva; sva = nextva) { /* * Calculate address for next L2 page table. 
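 *
 * Illustrative (1MB sections assumed): for sva 0x400ff000,
 * pte1_trunc(sva + PTE1_SIZE) yields 0x40100000; the nextva < sva
 * test catches the wrap-around when sva lies in the very last section
 * of the address space.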
*/ nextva = pte1_trunc(sva + PTE1_SIZE); if (nextva < sva) nextva = eva; pte1p = pmap_pte1(pmap, sva); pte1 = pte1_load(pte1p); /* * Weed out invalid mappings. Note: we assume that the L1 * page table is always allocated, and in kernel virtual. */ if (pte1 == 0) continue; if (pte1_is_section(pte1)) { /* * Are we protecting the entire large page? If not, * demote the mapping and fall through. */ if (sva + PTE1_SIZE == nextva && eva >= nextva) { pmap_protect_pte1(pmap, pte1p, sva, prot); continue; } else { if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { PMAP_UNLOCK(pmap); goto resume; } sched_pin(); } if (!pmap_demote_pte1(pmap, pte1p, sva)) { /* * The large page mapping * was destroyed. */ continue; } #ifdef INVARIANTS else { /* Update pte1 after demotion */ pte1 = pte1_load(pte1p); } #endif } } KASSERT(pte1_is_link(pte1), ("%s: pmap %p va %#x pte1 %#x at %p" " is not link", __func__, pmap, sva, pte1, pte1p)); /* * Limit our scan to either the end of the va represented * by the current L2 page table page, or to the end of the * range being protected. */ if (nextva > eva) nextva = eva; for (pte2p = pmap_pte2_quick(pmap, sva); sva != nextva; pte2p++, sva += PAGE_SIZE) { vm_page_t m; opte2 = npte2 = pte2_load(pte2p); if (!pte2_is_valid(opte2)) continue; if ((prot & VM_PROT_WRITE) == 0) { if (pte2_is_managed(opte2) && pte2_is_dirty(opte2)) { m = PHYS_TO_VM_PAGE(pte2_pa(opte2)); vm_page_dirty(m); } npte2 |= PTE2_RO | PTE2_NM; } if ((prot & VM_PROT_EXECUTE) == 0) npte2 |= PTE2_NX; /* * QQQ: Herein, execute permission is never set. * It can only be cleared. So, no icache * syncing is needed. */ if (npte2 != opte2) { pte2_store(pte2p, npte2); pmap_tlb_flush(pmap, sva); } } } if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * pmap_pvh_wired_mappings: * * Return the updated number "count" of managed mappings that are wired. */ static int pmap_pvh_wired_mappings(struct md_page *pvh, int count) { pmap_t pmap; pt1_entry_t pte1; pt2_entry_t pte2; pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1 = pte1_load(pmap_pte1(pmap, pv->pv_va)); if (pte1_is_section(pte1)) { if (pte1_is_wired(pte1)) count++; } else { KASSERT(pte1_is_link(pte1), ("%s: pte1 %#x is not link", __func__, pte1)); pte2 = pte2_load(pmap_pte2_quick(pmap, pv->pv_va)); if (pte2_is_wired(pte2)) count++; } PMAP_UNLOCK(pmap); } sched_unpin(); return (count); } /* * pmap_page_wired_mappings: * * Return the number of managed mappings to the given physical page * that are wired. */ int pmap_page_wired_mappings(vm_page_t m) { int count; count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&pvh_global_lock); count = pmap_pvh_wired_mappings(&m->md, count); if ((m->flags & PG_FICTITIOUS) == 0) { count = pmap_pvh_wired_mappings(pa_to_pvh(VM_PAGE_TO_PHYS(m)), count); } rw_wunlock(&pvh_global_lock); return (count); } /* * Returns TRUE if any of the given mappings were used to modify * physical memory. Otherwise, returns FALSE. Both page and 1mpage * mappings are supported. 
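 *
 * "Modify" here means the software-emulated modified bit: judging by
 * their use throughout this file, pte1_is_dirty() and pte2_is_dirty()
 * report a mapping whose NM and RO bits are both clear, i.e. one that
 * a write has already gone through.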
*/ static boolean_t pmap_is_modified_pvh(struct md_page *pvh) { pv_entry_t pv; pt1_entry_t pte1; pt2_entry_t pte2; pmap_t pmap; boolean_t rv; rw_assert(&pvh_global_lock, RA_WLOCKED); rv = FALSE; sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1 = pte1_load(pmap_pte1(pmap, pv->pv_va)); if (pte1_is_section(pte1)) { rv = pte1_is_dirty(pte1); } else { KASSERT(pte1_is_link(pte1), ("%s: pte1 %#x is not link", __func__, pte1)); pte2 = pte2_load(pmap_pte2_quick(pmap, pv->pv_va)); rv = pte2_is_dirty(pte2); } PMAP_UNLOCK(pmap); if (rv) break; } sched_unpin(); return (rv); } /* * pmap_is_modified: * * Return whether or not the specified physical page was modified * in any physical maps. */ boolean_t pmap_is_modified(vm_page_t m) { boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTE2s can have PG_M set. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (FALSE); rw_wlock(&pvh_global_lock); rv = pmap_is_modified_pvh(&m->md) || ((m->flags & PG_FICTITIOUS) == 0 && pmap_is_modified_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m)))); rw_wunlock(&pvh_global_lock); return (rv); } /* * pmap_is_prefaultable: * * Return whether or not the specified virtual address is eligible * for prefault. */ boolean_t pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr) { pt1_entry_t pte1; pt2_entry_t pte2; boolean_t rv; rv = FALSE; PMAP_LOCK(pmap); pte1 = pte1_load(pmap_pte1(pmap, addr)); if (pte1_is_link(pte1)) { pte2 = pte2_load(pt2map_entry(addr)); rv = !pte2_is_valid(pte2) ; } PMAP_UNLOCK(pmap); return (rv); } /* * Returns TRUE if any of the given mappings were referenced and FALSE * otherwise. Both page and 1mpage mappings are supported. */ static boolean_t pmap_is_referenced_pvh(struct md_page *pvh) { pv_entry_t pv; pt1_entry_t pte1; pt2_entry_t pte2; pmap_t pmap; boolean_t rv; rw_assert(&pvh_global_lock, RA_WLOCKED); rv = FALSE; sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1 = pte1_load(pmap_pte1(pmap, pv->pv_va)); if (pte1_is_section(pte1)) { rv = (pte1 & (PTE1_A | PTE1_V)) == (PTE1_A | PTE1_V); } else { pte2 = pte2_load(pmap_pte2_quick(pmap, pv->pv_va)); rv = (pte2 & (PTE2_A | PTE2_V)) == (PTE2_A | PTE2_V); } PMAP_UNLOCK(pmap); if (rv) break; } sched_unpin(); return (rv); } /* * pmap_is_referenced: * * Return whether or not the specified physical page was referenced * in any physical maps. */ boolean_t pmap_is_referenced(vm_page_t m) { boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); rw_wlock(&pvh_global_lock); rv = pmap_is_referenced_pvh(&m->md) || ((m->flags & PG_FICTITIOUS) == 0 && pmap_is_referenced_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m)))); rw_wunlock(&pvh_global_lock); return (rv); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * pmap_ts_referenced: * * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. - * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. 
* * As an optimization, update the page's dirty field if a modified bit is * found while counting reference bits. This opportunistic update can be * performed at low cost and can eliminate the need for some future calls * to pmap_is_modified(). However, since this function stops after * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some * dirty pages. Those dirty pages will only be detected by a future call * to pmap_is_modified(). */ int pmap_ts_referenced(vm_page_t m) { struct md_page *pvh; pv_entry_t pv, pvf; pmap_t pmap; pt1_entry_t *pte1p, opte1; pt2_entry_t *pte2p, opte2; vm_paddr_t pa; int rtval = 0; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); pa = VM_PAGE_TO_PHYS(m); pvh = pa_to_pvh(pa); rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0 || (pvf = TAILQ_FIRST(&pvh->pv_list)) == NULL) goto small_mappings; pv = pvf; do { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, pv->pv_va); opte1 = pte1_load(pte1p); if (pte1_is_dirty(opte1)) { /* * Although "opte1" is mapping a 1MB page, because * this function is called at a 4KB page granularity, * we only update the 4KB page under test. */ vm_page_dirty(m); } if ((opte1 & PTE1_A) != 0) { /* * Since this reference bit is shared by 256 4KB pages, * it should not be cleared every time it is tested. * Apply a simple "hash" function on the physical page * number, the virtual section number, and the pmap * address to select one 4KB page out of the 256 * on which testing the reference bit will result * in clearing that bit. This function is designed * to avoid the selection of the same 4KB page * for every 1MB page mapping. * * On demotion, a mapping that hasn't been referenced * is simply destroyed. To avoid the possibility of a * subsequent page fault on a demoted wired mapping, * always leave its reference bit set. Moreover, * since the section is wired, the current state of * its reference bit won't affect page replacement. */ if ((((pa >> PAGE_SHIFT) ^ (pv->pv_va >> PTE1_SHIFT) ^ (uintptr_t)pmap) & (NPTE2_IN_PG - 1)) == 0 && !pte1_is_wired(opte1)) { pte1_clear_bit(pte1p, PTE1_A); pmap_tlb_flush(pmap, pv->pv_va); } rtval++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); } if (rtval >= PMAP_TS_REFERENCED_MAX) goto out; } while ((pv = TAILQ_FIRST(&pvh->pv_list)) != pvf); small_mappings: if ((pvf = TAILQ_FIRST(&m->md.pv_list)) == NULL) goto out; pv = pvf; do { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, pv->pv_va); KASSERT(pte1_is_link(pte1_load(pte1p)), ("%s: not found a link in page %p's pv list", __func__, m)); pte2p = pmap_pte2_quick(pmap, pv->pv_va); opte2 = pte2_load(pte2p); if (pte2_is_dirty(opte2)) vm_page_dirty(m); if ((opte2 & PTE2_A) != 0) { pte2_clear_bit(pte2p, PTE2_A); pmap_tlb_flush(pmap, pv->pv_va); rtval++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); } } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != pvf && rtval < PMAP_TS_REFERENCED_MAX); out: sched_unpin(); rw_wunlock(&pvh_global_lock); return (rtval); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range * must have the wired attribute set. 
In contrast, invalid mappings * cannot have the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, * so there is no need to invalidate any TLB entries. */ void pmap_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t nextva; pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, pte2; boolean_t pv_lists_locked; if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } PMAP_LOCK(pmap); for (; sva < eva; sva = nextva) { nextva = pte1_trunc(sva + PTE1_SIZE); if (nextva < sva) nextva = eva; pte1p = pmap_pte1(pmap, sva); pte1 = pte1_load(pte1p); /* * Weed out invalid mappings. Note: we assume that the L1 * page table is always allocated, and in kernel virtual. */ if (pte1 == 0) continue; if (pte1_is_section(pte1)) { if (!pte1_is_wired(pte1)) panic("%s: pte1 %#x not wired", __func__, pte1); /* * Are we unwiring the entire large page? If not, * demote the mapping and fall through. */ if (sva + PTE1_SIZE == nextva && eva >= nextva) { pte1_clear_bit(pte1p, PTE1_W); pmap->pm_stats.wired_count -= PTE1_SIZE / PAGE_SIZE; continue; } else { if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { PMAP_UNLOCK(pmap); /* Repeat sva. */ goto resume; } sched_pin(); } if (!pmap_demote_pte1(pmap, pte1p, sva)) panic("%s: demotion failed", __func__); #ifdef INVARIANTS else { /* Update pte1 after demotion */ pte1 = pte1_load(pte1p); } #endif } } KASSERT(pte1_is_link(pte1), ("%s: pmap %p va %#x pte1 %#x at %p" " is not link", __func__, pmap, sva, pte1, pte1p)); /* * Limit our scan to either the end of the va represented * by the current L2 page table page, or to the end of the * range being unwired. */ if (nextva > eva) nextva = eva; for (pte2p = pmap_pte2_quick(pmap, sva); sva != nextva; pte2p++, sva += PAGE_SIZE) { pte2 = pte2_load(pte2p); if (!pte2_is_valid(pte2)) continue; if (!pte2_is_wired(pte2)) panic("%s: pte2 %#x is missing PTE2_W", __func__, pte2); /* * PTE2_W must be cleared atomically. Although the pmap * lock synchronizes access to PTE2_W, another processor * could be changing PTE2_NM and/or PTE2_A concurrently. */ pte2_clear_bit(pte2p, PTE2_W); pmap->pm_stats.wired_count--; } } if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * Clear the write and modified bits in each of the given page's mappings. */ void pmap_remove_write(vm_page_t m) { struct md_page *pvh; pv_entry_t next_pv, pv; pmap_t pmap; pt1_entry_t *pte1p; pt2_entry_t *pte2p, opte2; vm_offset_t va; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating. 
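 *
 * Strategy sketch: the first loop below demotes every writable 1MB
 * mapping of the page, so that the second loop only ever has to set
 * PTE2_RO | PTE2_NM on individual 4KB PTE2s.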
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, va); if (!(pte1_load(pte1p) & PTE1_RO)) (void)pmap_demote_pte1(pmap, pte1p, va); PMAP_UNLOCK(pmap); } small_mappings: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, pv->pv_va); KASSERT(!pte1_is_section(pte1_load(pte1p)), ("%s: found" " a section in page %p's pv list", __func__, m)); pte2p = pmap_pte2_quick(pmap, pv->pv_va); opte2 = pte2_load(pte2p); if (!(opte2 & PTE2_RO)) { pte2_store(pte2p, opte2 | PTE2_RO | PTE2_NM); if (pte2_is_dirty(opte2)) vm_page_dirty(m); pmap_tlb_flush(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); sched_unpin(); rw_wunlock(&pvh_global_lock); } /* * Apply the given advice to the specified range of addresses within the * given pmap. Depending on the advice, clear the referenced and/or * modified flags in each mapping and set the mapped page's dirty field. */ void pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice) { pt1_entry_t *pte1p, opte1; pt2_entry_t *pte2p, pte2; vm_offset_t pdnxt; vm_page_t m; boolean_t pv_lists_locked; if (advice != MADV_DONTNEED && advice != MADV_FREE) return; if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } PMAP_LOCK(pmap); for (; sva < eva; sva = pdnxt) { pdnxt = pte1_trunc(sva + PTE1_SIZE); if (pdnxt < sva) pdnxt = eva; pte1p = pmap_pte1(pmap, sva); opte1 = pte1_load(pte1p); if (!pte1_is_valid(opte1)) /* XXX */ continue; else if (pte1_is_section(opte1)) { if (!pte1_is_managed(opte1)) continue; if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { PMAP_UNLOCK(pmap); goto resume; } sched_pin(); } if (!pmap_demote_pte1(pmap, pte1p, sva)) { /* * The large page mapping was destroyed. */ continue; } /* * Unless the page mappings are wired, remove the * mapping to a single page so that a subsequent * access may repromote. Since the underlying L2 page * table is fully populated, this removal never * frees a L2 page table page. */ if (!pte1_is_wired(opte1)) { pte2p = pmap_pte2_quick(pmap, sva); KASSERT(pte2_is_valid(pte2_load(pte2p)), ("%s: invalid PTE2", __func__)); pmap_remove_pte2(pmap, pte2p, sva, NULL); } } if (pdnxt > eva) pdnxt = eva; for (pte2p = pmap_pte2_quick(pmap, sva); sva != pdnxt; pte2p++, sva += PAGE_SIZE) { pte2 = pte2_load(pte2p); if (!pte2_is_valid(pte2) || !pte2_is_managed(pte2)) continue; else if (pte2_is_dirty(pte2)) { if (advice == MADV_DONTNEED) { /* * Future calls to pmap_is_modified() * can be avoided by making the page * dirty now. */ m = PHYS_TO_VM_PAGE(pte2_pa(pte2)); vm_page_dirty(m); } pte2_set_bit(pte2p, PTE2_NM); pte2_clear_bit(pte2p, PTE2_A); } else if ((pte2 & PTE2_A) != 0) pte2_clear_bit(pte2p, PTE2_A); else continue; pmap_tlb_flush(pmap, sva); } } if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * Clear the modify bits on the specified physical page. 
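 *
 * With software-emulated modified bits, clearing the modify status
 * means setting PTE2_NM again (see the loops below), so that the next
 * write access faults and marks the page dirty once more.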
*/ void pmap_clear_modify(vm_page_t m) { struct md_page *pvh; pv_entry_t next_pv, pv; pmap_t pmap; pt1_entry_t *pte1p, opte1; pt2_entry_t *pte2p, opte2; vm_offset_t va; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("%s: page %p is exclusive busy", __func__, m)); /* * If the page is not PGA_WRITEABLE, then no PTE2s can have PTE2_NM * cleared. If the object containing the page is locked and the page * is not exclusive busied, then PGA_WRITEABLE cannot be concurrently * set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, va); opte1 = pte1_load(pte1p); if (!(opte1 & PTE1_RO)) { if (pmap_demote_pte1(pmap, pte1p, va) && !pte1_is_wired(opte1)) { /* * Write protect the mapping to a * single page so that a subsequent * write access may repromote. */ va += VM_PAGE_TO_PHYS(m) - pte1_pa(opte1); pte2p = pmap_pte2_quick(pmap, va); opte2 = pte2_load(pte2p); if ((opte2 & PTE2_V)) { pte2_set_bit(pte2p, PTE2_NM | PTE2_RO); vm_page_dirty(m); pmap_tlb_flush(pmap, va); } } } PMAP_UNLOCK(pmap); } small_mappings: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte1p = pmap_pte1(pmap, pv->pv_va); KASSERT(!pte1_is_section(pte1_load(pte1p)), ("%s: found" " a section in page %p's pv list", __func__, m)); pte2p = pmap_pte2_quick(pmap, pv->pv_va); if (pte2_is_dirty(pte2_load(pte2p))) { pte2_set_bit(pte2p, PTE2_NM); pmap_tlb_flush(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } sched_unpin(); rw_wunlock(&pvh_global_lock); } /* * Sets the memory attribute for the specified page. */ void pmap_page_set_memattr(vm_page_t m, vm_memattr_t ma) { struct sysmaps *sysmaps; vm_memattr_t oma; vm_paddr_t pa; oma = m->md.pat_mode; m->md.pat_mode = ma; CTR5(KTR_PMAP, "%s: page %p - 0x%08X oma: %d, ma: %d", __func__, m, VM_PAGE_TO_PHYS(m), oma, ma); if ((m->flags & PG_FICTITIOUS) != 0) return; #if 0 /* * If "m" is a normal page, flush it from the cache. * * First, try to find an existing mapping of the page by sf * buffer. sf_buf_invalidate_cache() modifies mapping and * flushes the cache. */ if (sf_buf_invalidate_cache(m, oma)) return; #endif /* * If the page is not mapped by an sf buffer, map the page * transiently and do the invalidation. */ if (ma != oma) { pa = VM_PAGE_TO_PHYS(m); sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP2) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(pa, PTE2_AP_KRW, vm_memattr_to_pte2(ma))); dcache_wbinv_poc((vm_offset_t)sysmaps->CADDR2, pa, PAGE_SIZE); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } } /* * Miscellaneous support routines follow */ /* * Returns TRUE if the given page is mapped individually or as part of * a 1mpage. Otherwise, returns FALSE. */ boolean_t pmap_page_is_mapped(vm_page_t m) { boolean_t rv; if ((m->oflags & VPO_UNMANAGED) != 0) return (FALSE); rw_wlock(&pvh_global_lock); rv = !TAILQ_EMPTY(&m->md.pv_list) || ((m->flags & PG_FICTITIOUS) == 0 && !TAILQ_EMPTY(&pa_to_pvh(VM_PAGE_TO_PHYS(m))->pv_list)); rw_wunlock(&pvh_global_lock); return (rv); } /* * Returns true if the pmap's pv is one of the first * 16 pvs linked to from this page.
This count may * be changed upwards or downwards in the future; it * is only necessary that true be returned for a small * subset of pmaps for proper page aging. */ boolean_t pmap_page_exists_quick(pmap_t pmap, vm_page_t m) { struct md_page *pvh; pv_entry_t pv; int loops = 0; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("%s: page %p is not managed", __func__, m)); rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } if (!rv && loops < 16 && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } } rw_wunlock(&pvh_global_lock); return (rv); } /* * pmap_zero_page zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. */ void pmap_zero_page(vm_page_t m) { struct sysmaps *sysmaps; sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (pte2_load(sysmaps->CMAP2) != 0) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(VM_PAGE_TO_PHYS(m), PTE2_AP_KRW, vm_page_pte2_attr(m))); pagezero(sysmaps->CADDR2); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * pmap_zero_page_area zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. * * off and size may not cover an area beyond a single hardware page. */ void pmap_zero_page_area(vm_page_t m, int off, int size) { struct sysmaps *sysmaps; sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (pte2_load(sysmaps->CMAP2) != 0) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(VM_PAGE_TO_PHYS(m), PTE2_AP_KRW, vm_page_pte2_attr(m))); if (off == 0 && size == PAGE_SIZE) pagezero(sysmaps->CADDR2); else bzero(sysmaps->CADDR2 + off, size); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * pmap_copy_page copies the specified (machine independent) * page by mapping the page into virtual memory and using * bcopy to copy the page, one machine dependent page at a * time. 
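* (The implementation below uses two per-CPU mapping windows: CMAP1 maps * the source page read-only and CMAP2 maps the destination page * read/write.)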
*/ void pmap_copy_page(vm_page_t src, vm_page_t dst) { struct sysmaps *sysmaps; sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (pte2_load(sysmaps->CMAP1) != 0) panic("%s: CMAP1 busy", __func__); if (pte2_load(sysmaps->CMAP2) != 0) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP1, PTE2_KERN_NG(VM_PAGE_TO_PHYS(src), PTE2_AP_KR | PTE2_NM, vm_page_pte2_attr(src))); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(VM_PAGE_TO_PHYS(dst), PTE2_AP_KRW, vm_page_pte2_attr(dst))); bcopy(sysmaps->CADDR1, sysmaps->CADDR2, PAGE_SIZE); pte2_clear(sysmaps->CMAP1); tlb_flush((vm_offset_t)sysmaps->CADDR1); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } int unmapped_buf_allowed = 1; void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[], vm_offset_t b_offset, int xfersize) { struct sysmaps *sysmaps; vm_page_t a_pg, b_pg; char *a_cp, *b_cp; vm_offset_t a_pg_offset, b_pg_offset; int cnt; sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP1 != 0) panic("pmap_copy_pages: CMAP1 busy"); if (*sysmaps->CMAP2 != 0) panic("pmap_copy_pages: CMAP2 busy"); while (xfersize > 0) { a_pg = ma[a_offset >> PAGE_SHIFT]; a_pg_offset = a_offset & PAGE_MASK; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); b_pg = mb[b_offset >> PAGE_SHIFT]; b_pg_offset = b_offset & PAGE_MASK; cnt = min(cnt, PAGE_SIZE - b_pg_offset); pte2_store(sysmaps->CMAP1, PTE2_KERN_NG(VM_PAGE_TO_PHYS(a_pg), PTE2_AP_KR | PTE2_NM, vm_page_pte2_attr(a_pg))); tlb_flush_local((vm_offset_t)sysmaps->CADDR1); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(VM_PAGE_TO_PHYS(b_pg), PTE2_AP_KRW, vm_page_pte2_attr(b_pg))); tlb_flush_local((vm_offset_t)sysmaps->CADDR2); a_cp = sysmaps->CADDR1 + a_pg_offset; b_cp = sysmaps->CADDR2 + b_pg_offset; bcopy(a_cp, b_cp, cnt); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } pte2_clear(sysmaps->CMAP1); tlb_flush((vm_offset_t)sysmaps->CADDR1); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } vm_offset_t pmap_quick_enter_page(vm_page_t m) { pt2_entry_t *pte2p; vm_offset_t qmap_addr; critical_enter(); qmap_addr = PCPU_GET(qmap_addr); pte2p = pt2map_entry(qmap_addr); KASSERT(pte2_load(pte2p) == 0, ("%s: PTE2 busy", __func__)); pte2_store(pte2p, PTE2_KERN_NG(VM_PAGE_TO_PHYS(m), PTE2_AP_KRW, vm_page_pte2_attr(m))); return (qmap_addr); } void pmap_quick_remove_page(vm_offset_t addr) { pt2_entry_t *pte2p; vm_offset_t qmap_addr; qmap_addr = PCPU_GET(qmap_addr); pte2p = pt2map_entry(qmap_addr); KASSERT(addr == qmap_addr, ("%s: invalid address", __func__)); KASSERT(pte2_load(pte2p) != 0, ("%s: PTE2 not in use", __func__)); pte2_clear(pte2p); tlb_flush(qmap_addr); critical_exit(); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. 
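* (Accordingly, the implementation below copies only managed mappings and * simply gives up when a destination page table page cannot be allocated.)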
*/ void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { struct spglist free; vm_offset_t addr; vm_offset_t end_addr = src_addr + len; vm_offset_t nextva; if (dst_addr != src_addr) return; if (!pmap_is_current(src_pmap)) return; rw_wlock(&pvh_global_lock); if (dst_pmap < src_pmap) { PMAP_LOCK(dst_pmap); PMAP_LOCK(src_pmap); } else { PMAP_LOCK(src_pmap); PMAP_LOCK(dst_pmap); } sched_pin(); for (addr = src_addr; addr < end_addr; addr = nextva) { pt2_entry_t *src_pte2p, *dst_pte2p; vm_page_t dst_mpt2pg, src_mpt2pg; pt1_entry_t src_pte1; u_int pte1_idx; KASSERT(addr < VM_MAXUSER_ADDRESS, ("%s: invalid to pmap_copy page tables", __func__)); nextva = pte1_trunc(addr + PTE1_SIZE); if (nextva < addr) nextva = end_addr; pte1_idx = pte1_index(addr); src_pte1 = src_pmap->pm_pt1[pte1_idx]; if (pte1_is_section(src_pte1)) { if ((addr & PTE1_OFFSET) != 0 || (addr + PTE1_SIZE) > end_addr) continue; if (dst_pmap->pm_pt1[pte1_idx] == 0 && (!pte1_is_managed(src_pte1) || pmap_pv_insert_pte1(dst_pmap, addr, pte1_pa(src_pte1)))) { dst_pmap->pm_pt1[pte1_idx] = src_pte1 & ~PTE1_W; dst_pmap->pm_stats.resident_count += PTE1_SIZE / PAGE_SIZE; pmap_pte1_mappings++; } continue; } else if (!pte1_is_link(src_pte1)) continue; src_mpt2pg = PHYS_TO_VM_PAGE(pte1_link_pa(src_pte1)); /* * We leave PT2s to be linked from PT1 even if they are not * referenced until all PT2s in a page are without reference. * * QQQ: It could be changed ... */ #if 0 /* single_pt2_link_is_cleared */ KASSERT(pt2_wirecount_get(src_mpt2pg, pte1_idx) > 0, ("%s: source page table page is unused", __func__)); #else if (pt2_wirecount_get(src_mpt2pg, pte1_idx) == 0) continue; #endif if (nextva > end_addr) nextva = end_addr; src_pte2p = pt2map_entry(addr); while (addr < nextva) { pt2_entry_t temp_pte2; temp_pte2 = pte2_load(src_pte2p); /* * we only virtual copy managed pages */ if (pte2_is_managed(temp_pte2)) { dst_mpt2pg = pmap_allocpte2(dst_pmap, addr, PMAP_ENTER_NOSLEEP); if (dst_mpt2pg == NULL) goto out; dst_pte2p = pmap_pte2_quick(dst_pmap, addr); if (!pte2_is_valid(pte2_load(dst_pte2p)) && pmap_try_insert_pv_entry(dst_pmap, addr, PHYS_TO_VM_PAGE(pte2_pa(temp_pte2)))) { /* * Clear the wired, modified, and * accessed (referenced) bits * during the copy. */ temp_pte2 &= ~(PTE2_W | PTE2_A); temp_pte2 |= PTE2_NM; pte2_store(dst_pte2p, temp_pte2); dst_pmap->pm_stats.resident_count++; } else { SLIST_INIT(&free); if (pmap_unwire_pt2(dst_pmap, addr, dst_mpt2pg, &free)) { pmap_tlb_flush(dst_pmap, addr); pmap_free_zero_pages(&free); } goto out; } if (pt2_wirecount_get(dst_mpt2pg, pte1_idx) >= pt2_wirecount_get(src_mpt2pg, pte1_idx)) break; } addr += PAGE_SIZE; src_pte2p++; } } out: sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(src_pmap); PMAP_UNLOCK(dst_pmap); } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more section mappings. 
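* For example (with illustrative numbers): sections are 1 MB here, so if * the object offset modulo 1 MB is 0x40000, choosing a start address whose * low 20 bits are also 0x40000 lets every fully covered 1 MB region of the * mapping be served by a single L1 section entry.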
*/ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { vm_offset_t pte1_offset; if (size < PTE1_SIZE) return; if (object != NULL && (object->flags & OBJ_COLORED) != 0) offset += ptoa(object->pg_color); pte1_offset = offset & PTE1_OFFSET; if (size - ((PTE1_SIZE - pte1_offset) & PTE1_OFFSET) < PTE1_SIZE || (*addr & PTE1_OFFSET) == pte1_offset) return; if ((*addr & PTE1_OFFSET) < pte1_offset) *addr = pte1_trunc(*addr) + pte1_offset; else *addr = pte1_roundup(*addr) + pte1_offset; } void pmap_activate(struct thread *td) { pmap_t pmap, oldpmap; u_int cpuid, ttb; PDEBUG(9, printf("%s: td = %08x\n", __func__, (uint32_t)td)); critical_enter(); pmap = vmspace_pmap(td->td_proc->p_vmspace); oldpmap = PCPU_GET(curpmap); cpuid = PCPU_GET(cpuid); #if defined(SMP) CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); CPU_SET_ATOMIC(cpuid, &pmap->pm_active); #else CPU_CLR(cpuid, &oldpmap->pm_active); CPU_SET(cpuid, &pmap->pm_active); #endif ttb = pmap_ttb_get(pmap); /* * pmap_activate is for the current thread on the current cpu */ td->td_pcb->pcb_pagedir = ttb; cp15_ttbr_set(ttb); PCPU_SET(curpmap, pmap); critical_exit(); } /* * Perform the pmap work for mincore. */ int pmap_mincore(pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, pte2; vm_paddr_t pa; boolean_t managed; int val; PMAP_LOCK(pmap); retry: pte1p = pmap_pte1(pmap, addr); pte1 = pte1_load(pte1p); if (pte1_is_section(pte1)) { pa = trunc_page(pte1_pa(pte1) | (addr & PTE1_OFFSET)); managed = pte1_is_managed(pte1); val = MINCORE_SUPER | MINCORE_INCORE; if (pte1_is_dirty(pte1)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if (pte1 & PTE1_A) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; } else if (pte1_is_link(pte1)) { pte2p = pmap_pte2(pmap, addr); pte2 = pte2_load(pte2p); pmap_pte2_release(pte2p); pa = pte2_pa(pte2); managed = pte2_is_managed(pte2); val = MINCORE_INCORE; if (pte2_is_dirty(pte2)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if (pte2 & PTE2_A) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; } else { managed = FALSE; val = 0; } if ((val & (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER)) != (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER) && managed) { /* Ensure that "PHYS_TO_VM_PAGE(pa)->object" doesn't change. */ if (vm_page_pa_tryrelock(pmap, pa, locked_pa)) goto retry; } else PA_UNLOCK_COND(*locked_pa); PMAP_UNLOCK(pmap); return (val); } void pmap_kenter_device(vm_offset_t va, vm_size_t size, vm_paddr_t pa) { vm_offset_t sva; uint32_t l2attr; KASSERT((size & PAGE_MASK) == 0, ("%s: device mapping not page-sized", __func__)); sva = va; l2attr = vm_memattr_to_pte2(VM_MEMATTR_DEVICE); while (size != 0) { pmap_kenter_prot_attr(va, pa, PTE2_AP_KRW, l2attr); va += PAGE_SIZE; pa += PAGE_SIZE; size -= PAGE_SIZE; } tlb_flush_range(sva, va - sva); } void pmap_kremove_device(vm_offset_t va, vm_size_t size) { vm_offset_t sva; KASSERT((size & PAGE_MASK) == 0, ("%s: device mapping not page-sized", __func__)); sva = va; while (size != 0) { pmap_kremove(va); va += PAGE_SIZE; size -= PAGE_SIZE; } tlb_flush_range(sva, va - sva); } void pmap_set_pcb_pagedir(pmap_t pmap, struct pcb *pcb) { pcb->pcb_pagedir = pmap_ttb_get(pmap); } /* * Clean L1 data cache range by physical address. * The range must be within a single page. 
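* (The page is mapped through the per-CPU CMAP3 window below, which is a * single page long; hence the restriction.)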
*/ static void pmap_dcache_wb_pou(vm_paddr_t pa, vm_size_t size, uint32_t attr) { struct sysmaps *sysmaps; KASSERT(((pa & PAGE_MASK) + size) <= PAGE_SIZE, ("%s: not on single page", __func__)); sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP3) panic("%s: CMAP3 busy", __func__); pte2_store(sysmaps->CMAP3, PTE2_KERN_NG(pa, PTE2_AP_KRW, attr)); dcache_wb_pou((vm_offset_t)sysmaps->CADDR3 + (pa & PAGE_MASK), size); pte2_clear(sysmaps->CMAP3); tlb_flush((vm_offset_t)sysmaps->CADDR3); sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * Sync an instruction cache range which is not mapped yet. */ void cache_icache_sync_fresh(vm_offset_t va, vm_paddr_t pa, vm_size_t size) { uint32_t len, offset; vm_page_t m; /* Write back d-cache on given address range. */ offset = pa & PAGE_MASK; for ( ; size != 0; size -= len, pa += len, offset = 0) { len = min(PAGE_SIZE - offset, size); m = PHYS_TO_VM_PAGE(pa); KASSERT(m != NULL, ("%s: vm_page_t is null for %#x", __func__, pa)); pmap_dcache_wb_pou(pa, len, vm_page_pte2_attr(m)); } /* * The i-cache is VIPT. The only way to flush all virtual mappings * of a given physical address is to invalidate the entire i-cache. */ icache_inv_all(); } void pmap_sync_icache(pmap_t pmap, vm_offset_t va, vm_size_t size) { /* Write back d-cache on given address range. */ if (va >= VM_MIN_KERNEL_ADDRESS) { dcache_wb_pou(va, size); } else { uint32_t len, offset; vm_paddr_t pa; vm_page_t m; offset = va & PAGE_MASK; for ( ; size != 0; size -= len, va += len, offset = 0) { pa = pmap_extract(pmap, va); /* offset is preserved */ len = min(PAGE_SIZE - offset, size); m = PHYS_TO_VM_PAGE(pa); KASSERT(m != NULL, ("%s: vm_page_t is null for %#x", __func__, pa)); pmap_dcache_wb_pou(pa, len, vm_page_pte2_attr(m)); } } /* * The i-cache is VIPT. The only way to flush all virtual mappings * of a given physical address is to invalidate the entire i-cache. */ icache_inv_all(); } /* * The implementation of pmap_fault() uses the IN_RANGE2() macro, which * depends on the fact that the given range size is a power of 2. */ CTASSERT(powerof2(NB_IN_PT1)); CTASSERT(powerof2(PT2MAP_SIZE)); #define IN_RANGE2(addr, start, size) \ ((vm_offset_t)(start) == ((vm_offset_t)(addr) & ~((size) - 1))) /* * Handle access and R/W emulation faults. */ int pmap_fault(pmap_t pmap, vm_offset_t far, uint32_t fsr, int idx, bool usermode) { pt1_entry_t *pte1p, pte1; pt2_entry_t *pte2p, pte2; if (pmap == NULL) pmap = kernel_pmap; /* * In the kernel, we should never get an abort with a FAR which is in * the range of the pmap->pm_pt1 or PT2MAP address spaces. If it * happens, stop here and print out a useful abort message and even * get to the debugger; otherwise it likely ends in a never-ending * loop of aborts. */ if (__predict_false(IN_RANGE2(far, pmap->pm_pt1, NB_IN_PT1))) { /* * All L1 tables should always be mapped and present. * However, we check only the current one here. For user mode, * only a permission abort from a malicious user is non-fatal, * as is an alignment abort, which may have higher priority. */ if (!usermode || (idx != FAULT_ALIGN && idx != FAULT_PERM_L2)) { CTR4(KTR_PMAP, "%s: pmap %#x pm_pt1 %#x far %#x", __func__, pmap, pmap->pm_pt1, far); panic("%s: pm_pt1 abort", __func__); } return (KERN_INVALID_ADDRESS); } if (__predict_false(IN_RANGE2(far, PT2MAP, PT2MAP_SIZE))) { /* * PT2MAP should be always mapped and present in the current * L1 table. However, only existing L2 tables are mapped * in PT2MAP. For user mode, only an L2 translation abort and * a permission abort from a malicious user are non-fatal,
* as is an alignment abort, which may have higher priority. */ if (!usermode || (idx != FAULT_ALIGN && idx != FAULT_TRAN_L2 && idx != FAULT_PERM_L2)) { CTR4(KTR_PMAP, "%s: pmap %#x PT2MAP %#x far %#x", __func__, pmap, PT2MAP, far); panic("%s: PT2MAP abort", __func__); } return (KERN_INVALID_ADDRESS); } /* * A pmap lock is used below for handling of access and R/W emulation * aborts. They were handled by atomic operations before, so some * analysis of the new situation is needed to answer the following * question: Is it safe to use the lock even for these aborts? * * In general, two cases may happen: * * (1) Aborts while the pmap lock is locked already - this should not * happen as the pmap lock is not recursive. However, under the pmap * lock only internal kernel data should be accessed and such data * should be mapped with the A bit set and the NM bit cleared. If a * double abort happens, then the mapping of the data which caused it * must be fixed. Further, all new mappings are always made with the * A bit set and the bit can be cleared only on managed mappings. * * (2) Aborts while one or more other locks are held - this can * already happen. However, there is no difference here if it's either * an access or R/W emulation abort, or if it's some other abort. */ PMAP_LOCK(pmap); #ifdef SMP /* * Special treatment is due to the break-before-make approach taken * when pte1 is updated for a userland mapping during section * promotion or demotion. If not caught here, pmap_enter() can find a * section mapping on the faulting address. That is not allowed. */ if (idx == FAULT_TRAN_L1 && usermode && cp15_ats1cur_check(far) == 0) { PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } #endif /* * Access bits for page and section. Note that the entry * is not in the TLB yet, so a TLB flush is not necessary. * * QQQ: This is hardware emulation, we do not call userret() * for aborts from user mode. */ if (idx == FAULT_ACCESS_L2) { pte2p = pt2map_entry(far); pte2 = pte2_load(pte2p); if (pte2_is_valid(pte2)) { pte2_store(pte2p, pte2 | PTE2_A); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } } if (idx == FAULT_ACCESS_L1) { pte1p = pmap_pte1(pmap, far); pte1 = pte1_load(pte1p); if (pte1_is_section(pte1)) { pte1_store(pte1p, pte1 | PTE1_A); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } } /* * Handle modify bits for page and section. Note that the modify * bit is emulated by software. So PTEx_RO is the software read-only * bit and the PTEx_NM flag is the real hardware read-only bit. * * QQQ: This is hardware emulation, we do not call userret() * for aborts from user mode. */ if ((fsr & FSR_WNR) && (idx == FAULT_PERM_L2)) { pte2p = pt2map_entry(far); pte2 = pte2_load(pte2p); if (pte2_is_valid(pte2) && !(pte2 & PTE2_RO) && (pte2 & PTE2_NM)) { pte2_store(pte2p, pte2 & ~PTE2_NM); tlb_flush(trunc_page(far)); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } } if ((fsr & FSR_WNR) && (idx == FAULT_PERM_L1)) { pte1p = pmap_pte1(pmap, far); pte1 = pte1_load(pte1p); if (pte1_is_section(pte1) && !(pte1 & PTE1_RO) && (pte1 & PTE1_NM)) { pte1_store(pte1p, pte1 & ~PTE1_NM); tlb_flush(pte1_trunc(far)); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } } /* * QQQ: The previous code, mainly the fast handling of access and * modify bit aborts, could be moved to ASM. Now we are starting * to deal with the aborts that cannot be handled fast. */ #ifdef INVARIANTS /* * Read an entry in PT2TAB associated with both pmap and far. * It's safe because PT2TAB is always mapped.
*/ pte2 = pt2tab_load(pmap_pt2tab_entry(pmap, far)); if (pte2_is_valid(pte2)) { /* * Now, when we know that L2 page table is allocated, * we can use PT2MAP to get L2 page table entry. */ pte2 = pte2_load(pt2map_entry(far)); if (pte2_is_valid(pte2)) { /* * If L2 page table entry is valid, make sure that * L1 page table entry is valid too. Note that we * leave L2 page entries untouched when promoted. */ pte1 = pte1_load(pmap_pte1(pmap, far)); if (!pte1_is_valid(pte1)) { panic("%s: missing L1 page entry (%p, %#x)", __func__, pmap, far); } } } #endif PMAP_UNLOCK(pmap); return (KERN_FAILURE); } #if defined(PMAP_DEBUG) /* * Reusing of KVA used in pmap_zero_page function !!! */ static void pmap_zero_page_check(vm_page_t m) { uint32_t *p, *end; struct sysmaps *sysmaps; sched_pin(); sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (pte2_load(sysmaps->CMAP2) != 0) panic("%s: CMAP2 busy", __func__); pte2_store(sysmaps->CMAP2, PTE2_KERN_NG(VM_PAGE_TO_PHYS(m), PTE2_AP_KRW, vm_page_pte2_attr(m))); end = (uint32_t*)(sysmaps->CADDR2 + PAGE_SIZE); for (p = (uint32_t*)sysmaps->CADDR2; p < end; p++) if (*p != 0) panic("%s: page %p not zero, va: %p", __func__, m, sysmaps->CADDR2); pte2_clear(sysmaps->CMAP2); tlb_flush((vm_offset_t)sysmaps->CADDR2); sched_unpin(); mtx_unlock(&sysmaps->lock); } int pmap_pid_dump(int pid) { pmap_t pmap; struct proc *p; int npte2 = 0; int i, j, index; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { if (p->p_pid != pid || p->p_vmspace == NULL) continue; index = 0; pmap = vmspace_pmap(p->p_vmspace); for (i = 0; i < NPTE1_IN_PT1; i++) { pt1_entry_t pte1; pt2_entry_t *pte2p, pte2; vm_offset_t base, va; vm_paddr_t pa; vm_page_t m; base = i << PTE1_SHIFT; pte1 = pte1_load(&pmap->pm_pt1[i]); if (pte1_is_section(pte1)) { /* * QQQ: Do something here! */ } else if (pte1_is_link(pte1)) { for (j = 0; j < NPTE2_IN_PT2; j++) { va = base + (j << PAGE_SHIFT); if (va >= VM_MIN_KERNEL_ADDRESS) { if (index) { index = 0; printf("\n"); } sx_sunlock(&allproc_lock); return (npte2); } pte2p = pmap_pte2(pmap, va); pte2 = pte2_load(pte2p); pmap_pte2_release(pte2p); if (!pte2_is_valid(pte2)) continue; pa = pte2_pa(pte2); m = PHYS_TO_VM_PAGE(pa); printf("va: 0x%x, pa: 0x%x, h: %d, w:" " %d, f: 0x%x", va, pa, m->hold_count, m->wire_count, m->flags); npte2++; index++; if (index >= 2) { index = 0; printf("\n"); } else { printf(" "); } } } } } sx_sunlock(&allproc_lock); return (npte2); } #endif #ifdef DDB static pt2_entry_t * pmap_pte2_ddb(pmap_t pmap, vm_offset_t va) { pt1_entry_t pte1; vm_paddr_t pt2pg_pa; pte1 = pte1_load(pmap_pte1(pmap, va)); if (!pte1_is_link(pte1)) return (NULL); if (pmap_is_current(pmap)) return (pt2map_entry(va)); /* Note that L2 page table size is not equal to PAGE_SIZE. 
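* (An L2 page table here holds 256 32-bit entries, i.e. 1 KB, so four L2 * tables share one 4 KB page; that is why the link physical address is * truncated to a page boundary below.)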
*/ pt2pg_pa = trunc_page(pte1_link_pa(pte1)); if (pte2_pa(pte2_load(PMAP3)) != pt2pg_pa) { pte2_store(PMAP3, PTE2_KPT(pt2pg_pa)); #ifdef SMP PMAP3cpu = PCPU_GET(cpuid); #endif tlb_flush_local((vm_offset_t)PADDR3); } #ifdef SMP else if (PMAP3cpu != PCPU_GET(cpuid)) { PMAP3cpu = PCPU_GET(cpuid); tlb_flush_local((vm_offset_t)PADDR3); } #endif return (PADDR3 + (arm32_btop(va) & (NPTE2_IN_PG - 1))); } static void dump_pmap(pmap_t pmap) { printf("pmap %p\n", pmap); printf(" pm_pt1: %p\n", pmap->pm_pt1); printf(" pm_pt2tab: %p\n", pmap->pm_pt2tab); printf(" pm_active: 0x%08lX\n", pmap->pm_active.__bits[0]); } DB_SHOW_COMMAND(pmaps, pmap_list_pmaps) { pmap_t pmap; LIST_FOREACH(pmap, &allpmaps, pm_list) { dump_pmap(pmap); } } static int pte2_class(pt2_entry_t pte2) { int cls; cls = (pte2 >> 2) & 0x03; cls |= (pte2 >> 4) & 0x04; return (cls); } static void dump_section(pmap_t pmap, uint32_t pte1_idx) { } static void dump_link(pmap_t pmap, uint32_t pte1_idx, boolean_t invalid_ok) { uint32_t i; vm_offset_t va; pt2_entry_t *pte2p, pte2; vm_page_t m; va = pte1_idx << PTE1_SHIFT; pte2p = pmap_pte2_ddb(pmap, va); for (i = 0; i < NPTE2_IN_PT2; i++, pte2p++, va += PAGE_SIZE) { pte2 = pte2_load(pte2p); if (pte2 == 0) continue; if (!pte2_is_valid(pte2)) { printf(" 0x%08X: 0x%08X", va, pte2); if (!invalid_ok) printf(" - not valid !!!"); printf("\n"); continue; } m = PHYS_TO_VM_PAGE(pte2_pa(pte2)); printf(" 0x%08X: 0x%08X, TEX%d, s:%d, g:%d, m:%p", va , pte2, pte2_class(pte2), !!(pte2 & PTE2_S), !(pte2 & PTE2_NG), m); if (m != NULL) { printf(" v:%d h:%d w:%d f:0x%04X\n", m->valid, m->hold_count, m->wire_count, m->flags); } else { printf("\n"); } } } static __inline boolean_t is_pv_chunk_space(vm_offset_t va) { if ((((vm_offset_t)pv_chunkbase) <= va) && (va < ((vm_offset_t)pv_chunkbase + PAGE_SIZE * pv_maxchunks))) return (TRUE); return (FALSE); } DB_SHOW_COMMAND(pmap, pmap_pmap_print) { /* XXX convert args. */ pmap_t pmap = (pmap_t)addr; pt1_entry_t pte1; pt2_entry_t pte2; vm_offset_t va, eva; vm_page_t m; uint32_t i; boolean_t invalid_ok, dump_link_ok, dump_pv_chunk; if (have_addr) { pmap_t pm; LIST_FOREACH(pm, &allpmaps, pm_list) if (pm == pmap) break; if (pm == NULL) { printf("given pmap %p is not in allpmaps list\n", pmap); return; } } else pmap = PCPU_GET(curpmap); eva = (modif[0] == 'u') ? VM_MAXUSER_ADDRESS : 0xFFFFFFFF; dump_pv_chunk = FALSE; /* XXX evaluate from modif[] */ printf("pmap: 0x%08X\n", (uint32_t)pmap); printf("PT2MAP: 0x%08X\n", (uint32_t)PT2MAP); printf("pt2tab: 0x%08X\n", (uint32_t)pmap->pm_pt2tab); for(i = 0; i < NPTE1_IN_PT1; i++) { pte1 = pte1_load(&pmap->pm_pt1[i]); if (pte1 == 0) continue; va = i << PTE1_SHIFT; if (va >= eva) break; if (pte1_is_section(pte1)) { printf("0x%08X: Section 0x%08X, s:%d g:%d\n", va, pte1, !!(pte1 & PTE1_S), !(pte1 & PTE1_NG)); dump_section(pmap, i); } else if (pte1_is_link(pte1)) { dump_link_ok = TRUE; invalid_ok = FALSE; pte2 = pte2_load(pmap_pt2tab_entry(pmap, va)); m = PHYS_TO_VM_PAGE(pte1_link_pa(pte1)); printf("0x%08X: Link 0x%08X, pt2tab: 0x%08X m: %p", va, pte1, pte2, m); if (is_pv_chunk_space(va)) { printf(" - pv_chunk space"); if (dump_pv_chunk) invalid_ok = TRUE; else dump_link_ok = FALSE; } else if (m != NULL) printf(" w:%d w2:%u", m->wire_count, pt2_wirecount_get(m, pte1_index(va))); if (pte2 == 0) printf(" !!! pt2tab entry is ZERO"); else if (pte2_pa(pte1) != pte2_pa(pte2)) printf(" !!! 
pt2tab entry is DIFFERENT - m: %p", PHYS_TO_VM_PAGE(pte2_pa(pte2))); printf("\n"); if (dump_link_ok) dump_link(pmap, i, invalid_ok); } else printf("0x%08X: Invalid entry 0x%08X\n", va, pte1); } } static void dump_pt2tab(pmap_t pmap) { uint32_t i; pt2_entry_t pte2; vm_offset_t va; vm_paddr_t pa; vm_page_t m; printf("PT2TAB:\n"); for (i = 0; i < PT2TAB_ENTRIES; i++) { pte2 = pte2_load(&pmap->pm_pt2tab[i]); if (!pte2_is_valid(pte2)) continue; va = i << PT2TAB_SHIFT; pa = pte2_pa(pte2); m = PHYS_TO_VM_PAGE(pa); printf(" 0x%08X: 0x%08X, TEX%d, s:%d, m:%p", va, pte2, pte2_class(pte2), !!(pte2 & PTE2_S), m); if (m != NULL) printf(" , h: %d, w: %d, f: 0x%04X pidx: %lld", m->hold_count, m->wire_count, m->flags, m->pindex); printf("\n"); } } DB_SHOW_COMMAND(pmap_pt2tab, pmap_pt2tab_print) { /* XXX convert args. */ pmap_t pmap = (pmap_t)addr; pt1_entry_t pte1; pt2_entry_t pte2; vm_offset_t va; uint32_t i, start; if (have_addr) { printf("supported only on current pmap\n"); return; } pmap = PCPU_GET(curpmap); printf("curpmap: 0x%08X\n", (uint32_t)pmap); printf("PT2MAP: 0x%08X\n", (uint32_t)PT2MAP); printf("pt2tab: 0x%08X\n", (uint32_t)pmap->pm_pt2tab); start = pte1_index((vm_offset_t)PT2MAP); for (i = start; i < (start + NPT2_IN_PT2TAB); i++) { pte1 = pte1_load(&pmap->pm_pt1[i]); if (pte1 == 0) continue; va = i << PTE1_SHIFT; if (pte1_is_section(pte1)) { printf("0x%08X: Section 0x%08X, s:%d\n", va, pte1, !!(pte1 & PTE1_S)); dump_section(pmap, i); } else if (pte1_is_link(pte1)) { pte2 = pte2_load(pmap_pt2tab_entry(pmap, va)); printf("0x%08X: Link 0x%08X, pt2tab: 0x%08X\n", va, pte1, pte2); if (pte2 == 0) printf(" !!! pt2tab entry is ZERO\n"); } else printf("0x%08X: Invalid entry 0x%08X\n", va, pte1); } dump_pt2tab(pmap); } #endif Index: projects/clang390-import/sys/arm64/arm64/pmap.c =================================================================== --- projects/clang390-import/sys/arm64/arm64/pmap.c (revision 305686) +++ projects/clang390-import/sys/arm64/arm64/pmap.c (revision 305687) @@ -1,4732 +1,4744 @@ /*- * Copyright (c) 1991 Regents of the University of California. * All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * Copyright (c) 2003 Peter Wemm * All rights reserved. * Copyright (c) 2005-2010 Alan L. Cox * All rights reserved. * Copyright (c) 2014 Andrew Turner * All rights reserved. * Copyright (c) 2014-2016 The FreeBSD Foundation * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. * * This software was developed by Andrew Turner under sponsorship from * the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ /*- * Copyright (c) 2003 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Jake Burkholder, * Safeport Network Services, and Network Associates Laboratories, the * Security Research Division of Network Associates, Inc. under * DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA * CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is * also stored by the logical address mapping module, * this module may throw away valid virtual-to-physical * mappings at almost any time. However, invalidations * of virtual-to-physical mappings must be done as * requested. * * In order to cope with hardware architectures which * make virtual-to-physical map invalidates expensive, * this module may delay invalidate or reduced protection * operations until such time as they are actually * necessary. This module is given full information as * to which processors are currently using which maps, * and to when physical maps must be made correct. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define NL0PG (PAGE_SIZE/(sizeof (pd_entry_t))) #define NL1PG (PAGE_SIZE/(sizeof (pd_entry_t))) #define NL2PG (PAGE_SIZE/(sizeof (pd_entry_t))) #define NL3PG (PAGE_SIZE/(sizeof (pt_entry_t))) #define NUL0E L0_ENTRIES #define NUL1E (NUL0E * NL1PG) #define NUL2E (NUL1E * NL2PG) #if !defined(DIAGNOSTIC) #ifdef __GNUC_GNU_INLINE__ #define PMAP_INLINE __attribute__((__gnu_inline__)) inline #else #define PMAP_INLINE extern inline #endif #else #define PMAP_INLINE #endif /* * These are configured by the mair_el1 register. This is set up in locore.S */ #define DEVICE_MEMORY 0 #define UNCACHED_MEMORY 1 #define CACHED_MEMORY 2 #ifdef PV_STATS #define PV_STAT(x) do { x ; } while (0) #else #define PV_STAT(x) do { } while (0) #endif #define pmap_l2_pindex(v) ((v) >> L2_SHIFT) #define pa_to_pvh(pa) (&pv_table[pmap_l2_pindex(pa)]) #define NPV_LIST_LOCKS MAXCPU #define PHYS_TO_PV_LIST_LOCK(pa) \ (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do { \ struct rwlock **_lockp = (lockp); \ struct rwlock *_new_lock; \ \ _new_lock = PHYS_TO_PV_LIST_LOCK(pa); \ if (_new_lock != *_lockp) { \ if (*_lockp != NULL) \ rw_wunlock(*_lockp); \ *_lockp = _new_lock; \ rw_wlock(*_lockp); \ } \ } while (0) #define CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m) \ CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, VM_PAGE_TO_PHYS(m)) #define RELEASE_PV_LIST_LOCK(lockp) do { \ struct rwlock **_lockp = (lockp); \ \ if (*_lockp != NULL) { \ rw_wunlock(*_lockp); \ *_lockp = NULL; \ } \ } while (0) #define VM_PAGE_TO_PV_LIST_LOCK(m) \ PHYS_TO_PV_LIST_LOCK(VM_PAGE_TO_PHYS(m)) struct pmap kernel_pmap_store; vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ vm_offset_t kernel_vm_end = 0; struct msgbuf *msgbufp = NULL; /* * Data for the pv entry allocation mechanism. * Updates to pv_invl_gen are protected by the pv_list_locks[] * elements, but reads are not. 
*/ static struct md_page *pv_table; static struct md_page pv_dummy; vm_paddr_t dmap_phys_base; /* The start of the dmap region */ vm_paddr_t dmap_phys_max; /* The limit of the dmap region */ vm_offset_t dmap_max_addr; /* The virtual address limit of the dmap */ /* This code assumes all L1 DMAP entries will be used */ CTASSERT((DMAP_MIN_ADDRESS & ~L0_OFFSET) == DMAP_MIN_ADDRESS); CTASSERT((DMAP_MAX_ADDRESS & ~L0_OFFSET) == DMAP_MAX_ADDRESS); #define DMAP_TABLES ((DMAP_MAX_ADDRESS - DMAP_MIN_ADDRESS) >> L0_SHIFT) extern pt_entry_t pagetable_dmap[]; static SYSCTL_NODE(_vm, OID_AUTO, pmap, CTLFLAG_RD, 0, "VM/pmap parameters"); static int superpages_enabled = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, superpages_enabled, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &superpages_enabled, 0, "Are large page mappings enabled?"); /* * Data for the pv entry allocation mechanism */ static TAILQ_HEAD(pch, pv_chunk) pv_chunks = TAILQ_HEAD_INITIALIZER(pv_chunks); static struct mtx pv_chunks_mutex; static struct rwlock pv_list_locks[NPV_LIST_LOCKS]; static void free_pv_chunk(struct pv_chunk *pc); static void free_pv_entry(pmap_t pmap, pv_entry_t pv); static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp); static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp); static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static int pmap_change_attr(vm_offset_t va, vm_size_t size, int mode); static int pmap_change_attr_locked(vm_offset_t va, vm_size_t size, int mode); static pt_entry_t *pmap_demote_l1(pmap_t pmap, pt_entry_t *l1, vm_offset_t va); static pt_entry_t *pmap_demote_l2_locked(pmap_t pmap, pt_entry_t *l2, vm_offset_t va, struct rwlock **lockp); static pt_entry_t *pmap_demote_l2(pmap_t pmap, pt_entry_t *l2, vm_offset_t va); static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp); static int pmap_remove_l3(pmap_t pmap, pt_entry_t *l3, vm_offset_t sva, pd_entry_t ptepde, struct spglist *free, struct rwlock **lockp); static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp); static vm_page_t _pmap_alloc_l3(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock **lockp); static void _pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free); static int pmap_unuse_l3(pmap_t, vm_offset_t, pd_entry_t, struct spglist *); /* * These load the old table data and store the new value. * They need to be atomic as the System MMU may write to the table at * the same time as the CPU. 
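* For example, a plain read-modify-write of an entry could lose an update * written by the System MMU between the load and the store, whereas * atomic_swap_64() installs the new value and returns the old one in a * single step.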
*/ #define pmap_load_store(table, entry) atomic_swap_64(table, entry) #define pmap_set(table, mask) atomic_set_64(table, mask) #define pmap_load_clear(table) atomic_swap_64(table, 0) #define pmap_load(table) (*table) /********************/ /* Inline functions */ /********************/ static __inline void pagecopy(void *s, void *d) { memcpy(d, s, PAGE_SIZE); } #define pmap_l0_index(va) (((va) >> L0_SHIFT) & L0_ADDR_MASK) #define pmap_l1_index(va) (((va) >> L1_SHIFT) & Ln_ADDR_MASK) #define pmap_l2_index(va) (((va) >> L2_SHIFT) & Ln_ADDR_MASK) #define pmap_l3_index(va) (((va) >> L3_SHIFT) & Ln_ADDR_MASK) static __inline pd_entry_t * pmap_l0(pmap_t pmap, vm_offset_t va) { return (&pmap->pm_l0[pmap_l0_index(va)]); } static __inline pd_entry_t * pmap_l0_to_l1(pd_entry_t *l0, vm_offset_t va) { pd_entry_t *l1; l1 = (pd_entry_t *)PHYS_TO_DMAP(pmap_load(l0) & ~ATTR_MASK); return (&l1[pmap_l1_index(va)]); } static __inline pd_entry_t * pmap_l1(pmap_t pmap, vm_offset_t va) { pd_entry_t *l0; l0 = pmap_l0(pmap, va); if ((pmap_load(l0) & ATTR_DESCR_MASK) != L0_TABLE) return (NULL); return (pmap_l0_to_l1(l0, va)); } static __inline pd_entry_t * pmap_l1_to_l2(pd_entry_t *l1, vm_offset_t va) { pd_entry_t *l2; l2 = (pd_entry_t *)PHYS_TO_DMAP(pmap_load(l1) & ~ATTR_MASK); return (&l2[pmap_l2_index(va)]); } static __inline pd_entry_t * pmap_l2(pmap_t pmap, vm_offset_t va) { pd_entry_t *l1; l1 = pmap_l1(pmap, va); if ((pmap_load(l1) & ATTR_DESCR_MASK) != L1_TABLE) return (NULL); return (pmap_l1_to_l2(l1, va)); } static __inline pt_entry_t * pmap_l2_to_l3(pd_entry_t *l2, vm_offset_t va) { pt_entry_t *l3; l3 = (pd_entry_t *)PHYS_TO_DMAP(pmap_load(l2) & ~ATTR_MASK); return (&l3[pmap_l3_index(va)]); } /* * Returns the lowest valid pde for a given virtual address. * The next level may or may not point to a valid page or block. */ static __inline pd_entry_t * pmap_pde(pmap_t pmap, vm_offset_t va, int *level) { pd_entry_t *l0, *l1, *l2, desc; l0 = pmap_l0(pmap, va); desc = pmap_load(l0) & ATTR_DESCR_MASK; if (desc != L0_TABLE) { *level = -1; return (NULL); } l1 = pmap_l0_to_l1(l0, va); desc = pmap_load(l1) & ATTR_DESCR_MASK; if (desc != L1_TABLE) { *level = 0; return (l0); } l2 = pmap_l1_to_l2(l1, va); desc = pmap_load(l2) & ATTR_DESCR_MASK; if (desc != L2_TABLE) { *level = 1; return (l1); } *level = 2; return (l2); } /* * Returns the lowest valid pte block or table entry for a given virtual * address. If there are no valid entries return NULL and set the level to * the first invalid level. 
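* For example, a 2MB block mapping yields the L2 entry with *level set to * 2, while an address whose L2 entry is invalid yields NULL with *level * set to 2, telling the caller at which level the walk stopped.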
*/ static __inline pt_entry_t * pmap_pte(pmap_t pmap, vm_offset_t va, int *level) { pd_entry_t *l1, *l2, desc; pt_entry_t *l3; l1 = pmap_l1(pmap, va); if (l1 == NULL) { *level = 0; return (NULL); } desc = pmap_load(l1) & ATTR_DESCR_MASK; if (desc == L1_BLOCK) { *level = 1; return (l1); } if (desc != L1_TABLE) { *level = 1; return (NULL); } l2 = pmap_l1_to_l2(l1, va); desc = pmap_load(l2) & ATTR_DESCR_MASK; if (desc == L2_BLOCK) { *level = 2; return (l2); } if (desc != L2_TABLE) { *level = 2; return (NULL); } *level = 3; l3 = pmap_l2_to_l3(l2, va); if ((pmap_load(l3) & ATTR_DESCR_MASK) != L3_PAGE) return (NULL); return (l3); } static inline bool pmap_superpages_enabled(void) { return (superpages_enabled != 0); } bool pmap_get_tables(pmap_t pmap, vm_offset_t va, pd_entry_t **l0, pd_entry_t **l1, pd_entry_t **l2, pt_entry_t **l3) { pd_entry_t *l0p, *l1p, *l2p; if (pmap->pm_l0 == NULL) return (false); l0p = pmap_l0(pmap, va); *l0 = l0p; if ((pmap_load(l0p) & ATTR_DESCR_MASK) != L0_TABLE) return (false); l1p = pmap_l0_to_l1(l0p, va); *l1 = l1p; if ((pmap_load(l1p) & ATTR_DESCR_MASK) == L1_BLOCK) { *l2 = NULL; *l3 = NULL; return (true); } if ((pmap_load(l1p) & ATTR_DESCR_MASK) != L1_TABLE) return (false); l2p = pmap_l1_to_l2(l1p, va); *l2 = l2p; if ((pmap_load(l2p) & ATTR_DESCR_MASK) == L2_BLOCK) { *l3 = NULL; return (true); } *l3 = pmap_l2_to_l3(l2p, va); return (true); } static __inline int pmap_is_current(pmap_t pmap) { return ((pmap == pmap_kernel()) || (pmap == curthread->td_proc->p_vmspace->vm_map.pmap)); } static __inline int pmap_l3_valid(pt_entry_t l3) { return ((l3 & ATTR_DESCR_MASK) == L3_PAGE); } /* Is a level 1 or 2 entry a valid, cacheable block? */ CTASSERT(L1_BLOCK == L2_BLOCK); static __inline int pmap_pte_valid_cacheable(pt_entry_t pte) { return (((pte & ATTR_DESCR_MASK) == L1_BLOCK) && ((pte & ATTR_IDX_MASK) == ATTR_IDX(CACHED_MEMORY))); } static __inline int pmap_l3_valid_cacheable(pt_entry_t l3) { return (((l3 & ATTR_DESCR_MASK) == L3_PAGE) && ((l3 & ATTR_IDX_MASK) == ATTR_IDX(CACHED_MEMORY))); } #define PTE_SYNC(pte) cpu_dcache_wb_range((vm_offset_t)pte, sizeof(*pte)) /* * Checks if the page is dirty. We currently lack proper tracking of this * on arm64, so for now assume that if a page is mapped read/write and has * been accessed, it is dirty.
*/ static inline int pmap_page_dirty(pt_entry_t pte) { return ((pte & (ATTR_AF | ATTR_AP_RW_BIT)) == (ATTR_AF | ATTR_AP(ATTR_AP_RW))); } static __inline void pmap_resident_count_inc(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); pmap->pm_stats.resident_count += count; } static __inline void pmap_resident_count_dec(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT(pmap->pm_stats.resident_count >= count, ("pmap %p resident count underflow %ld %d", pmap, pmap->pm_stats.resident_count, count)); pmap->pm_stats.resident_count -= count; } static pt_entry_t * pmap_early_page_idx(vm_offset_t l1pt, vm_offset_t va, u_int *l1_slot, u_int *l2_slot) { pt_entry_t *l2; pd_entry_t *l1; l1 = (pd_entry_t *)l1pt; *l1_slot = (va >> L1_SHIFT) & Ln_ADDR_MASK; /* Check locore has used a table L1 map */ KASSERT((l1[*l1_slot] & ATTR_DESCR_MASK) == L1_TABLE, ("Invalid bootstrap L1 table")); /* Find the address of the L2 table */ l2 = (pt_entry_t *)init_pt_va; *l2_slot = pmap_l2_index(va); return (l2); } static vm_paddr_t pmap_early_vtophys(vm_offset_t l1pt, vm_offset_t va) { u_int l1_slot, l2_slot; pt_entry_t *l2; l2 = pmap_early_page_idx(l1pt, va, &l1_slot, &l2_slot); return ((l2[l2_slot] & ~ATTR_MASK) + (va & L2_OFFSET)); } static void pmap_bootstrap_dmap(vm_offset_t kern_l1, vm_paddr_t min_pa, vm_paddr_t max_pa) { vm_offset_t va; vm_paddr_t pa; u_int l1_slot; pa = dmap_phys_base = min_pa & ~L1_OFFSET; va = DMAP_MIN_ADDRESS; for (; va < DMAP_MAX_ADDRESS && pa < max_pa; pa += L1_SIZE, va += L1_SIZE, l1_slot++) { l1_slot = ((va - DMAP_MIN_ADDRESS) >> L1_SHIFT); pmap_load_store(&pagetable_dmap[l1_slot], (pa & ~L1_OFFSET) | ATTR_DEFAULT | ATTR_IDX(CACHED_MEMORY) | L1_BLOCK); } /* Set the upper limit of the DMAP region */ dmap_phys_max = pa; dmap_max_addr = va; cpu_dcache_wb_range((vm_offset_t)pagetable_dmap, PAGE_SIZE * DMAP_TABLES); cpu_tlb_flushID(); } static vm_offset_t pmap_bootstrap_l2(vm_offset_t l1pt, vm_offset_t va, vm_offset_t l2_start) { vm_offset_t l2pt; vm_paddr_t pa; pd_entry_t *l1; u_int l1_slot; KASSERT((va & L1_OFFSET) == 0, ("Invalid virtual address")); l1 = (pd_entry_t *)l1pt; l1_slot = pmap_l1_index(va); l2pt = l2_start; for (; va < VM_MAX_KERNEL_ADDRESS; l1_slot++, va += L1_SIZE) { KASSERT(l1_slot < Ln_ENTRIES, ("Invalid L1 index")); pa = pmap_early_vtophys(l1pt, l2pt); pmap_load_store(&l1[l1_slot], (pa & ~Ln_TABLE_MASK) | L1_TABLE); l2pt += PAGE_SIZE; } /* Clean the L2 page table */ memset((void *)l2_start, 0, l2pt - l2_start); cpu_dcache_wb_range(l2_start, l2pt - l2_start); /* Flush the l1 table to ram */ cpu_dcache_wb_range((vm_offset_t)l1, PAGE_SIZE); return l2pt; } static vm_offset_t pmap_bootstrap_l3(vm_offset_t l1pt, vm_offset_t va, vm_offset_t l3_start) { vm_offset_t l2pt, l3pt; vm_paddr_t pa; pd_entry_t *l2; u_int l2_slot; KASSERT((va & L2_OFFSET) == 0, ("Invalid virtual address")); l2 = pmap_l2(kernel_pmap, va); l2 = (pd_entry_t *)rounddown2((uintptr_t)l2, PAGE_SIZE); l2pt = (vm_offset_t)l2; l2_slot = pmap_l2_index(va); l3pt = l3_start; for (; va < VM_MAX_KERNEL_ADDRESS; l2_slot++, va += L2_SIZE) { KASSERT(l2_slot < Ln_ENTRIES, ("Invalid L2 index")); pa = pmap_early_vtophys(l1pt, l3pt); pmap_load_store(&l2[l2_slot], (pa & ~Ln_TABLE_MASK) | L2_TABLE); l3pt += PAGE_SIZE; } /* Clean the L2 page table */ memset((void *)l3_start, 0, l3pt - l3_start); cpu_dcache_wb_range(l3_start, l3pt - l3_start); cpu_dcache_wb_range((vm_offset_t)l2, PAGE_SIZE); return l3pt; } /* * Bootstrap the system enough to run with virtual memory. 
*/ void pmap_bootstrap(vm_offset_t l0pt, vm_offset_t l1pt, vm_paddr_t kernstart, vm_size_t kernlen) { u_int l1_slot, l2_slot, avail_slot, map_slot, used_map_slot; uint64_t kern_delta; pt_entry_t *l2; vm_offset_t va, freemempos; vm_offset_t dpcpu, msgbufpv; vm_paddr_t pa, max_pa, min_pa; int i; kern_delta = KERNBASE - kernstart; physmem = 0; printf("pmap_bootstrap %lx %lx %lx\n", l1pt, kernstart, kernlen); printf("%lx\n", l1pt); printf("%lx\n", (KERNBASE >> L1_SHIFT) & Ln_ADDR_MASK); /* Set this early so we can use the pagetable walking functions */ kernel_pmap_store.pm_l0 = (pd_entry_t *)l0pt; PMAP_LOCK_INIT(kernel_pmap); /* Assume the address we were loaded to is a valid physical address */ min_pa = max_pa = KERNBASE - kern_delta; /* * Find the minimum physical address. physmap is sorted, * but may contain empty ranges. */ for (i = 0; i < (physmap_idx * 2); i += 2) { if (physmap[i] == physmap[i + 1]) continue; if (physmap[i] <= min_pa) min_pa = physmap[i]; if (physmap[i + 1] > max_pa) max_pa = physmap[i + 1]; } /* Create a direct map region early so we can use it for pa -> va */ pmap_bootstrap_dmap(l1pt, min_pa, max_pa); va = KERNBASE; pa = KERNBASE - kern_delta; /* * Start to initialise phys_avail by copying from physmap * up to the physical address KERNBASE points at. */ map_slot = avail_slot = 0; for (; map_slot < (physmap_idx * 2) && avail_slot < (PHYS_AVAIL_SIZE - 2); map_slot += 2) { if (physmap[map_slot] == physmap[map_slot + 1]) continue; if (physmap[map_slot] <= pa && physmap[map_slot + 1] > pa) break; phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } /* Add the memory before the kernel */ if (physmap[avail_slot] < pa && avail_slot < (PHYS_AVAIL_SIZE - 2)) { phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = pa; physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } used_map_slot = map_slot; /* * Read the page table to find out what is already mapped. * This assumes we have mapped a block of memory from KERNBASE * using a single L1 entry. */ l2 = pmap_early_page_idx(l1pt, KERNBASE, &l1_slot, &l2_slot); /* Sanity check the index, KERNBASE should be the first VA */ KASSERT(l2_slot == 0, ("The L2 index is non-zero")); /* Find how many pages we have mapped */ for (; l2_slot < Ln_ENTRIES; l2_slot++) { if ((l2[l2_slot] & ATTR_DESCR_MASK) == 0) break; /* Check locore used L2 blocks */ KASSERT((l2[l2_slot] & ATTR_DESCR_MASK) == L2_BLOCK, ("Invalid bootstrap L2 table")); KASSERT((l2[l2_slot] & ~ATTR_MASK) == pa, ("Incorrect PA in L2 table")); va += L2_SIZE; pa += L2_SIZE; } va = roundup2(va, L1_SIZE); freemempos = KERNBASE + kernlen; freemempos = roundup2(freemempos, PAGE_SIZE); /* Create the l2 tables up to VM_MAX_KERNEL_ADDRESS */ freemempos = pmap_bootstrap_l2(l1pt, va, freemempos); /* And the l3 tables for the early devmap */ freemempos = pmap_bootstrap_l3(l1pt, VM_MAX_KERNEL_ADDRESS - L2_SIZE, freemempos); cpu_tlb_flushID(); #define alloc_pages(var, np) \ (var) = freemempos; \ freemempos += (np * PAGE_SIZE); \ memset((char *)(var), 0, ((np) * PAGE_SIZE)); /* Allocate dynamic per-cpu area. */ alloc_pages(dpcpu, DPCPU_SIZE / PAGE_SIZE); dpcpu_init((void *)dpcpu, 0); /* Allocate memory for the msgbuf, e.g. 
for /sbin/dmesg */ alloc_pages(msgbufpv, round_page(msgbufsize) / PAGE_SIZE); msgbufp = (void *)msgbufpv; virtual_avail = roundup2(freemempos, L1_SIZE); virtual_end = VM_MAX_KERNEL_ADDRESS - L2_SIZE; kernel_vm_end = virtual_avail; pa = pmap_early_vtophys(l1pt, freemempos); /* Finish initialising physmap */ map_slot = used_map_slot; for (; avail_slot < (PHYS_AVAIL_SIZE - 2) && map_slot < (physmap_idx * 2); map_slot += 2) { if (physmap[map_slot] == physmap[map_slot + 1]) continue; /* Have we used the current range? */ if (physmap[map_slot + 1] <= pa) continue; /* Do we need to split the entry? */ if (physmap[map_slot] < pa) { phys_avail[avail_slot] = pa; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; } else { phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; } physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } phys_avail[avail_slot] = 0; phys_avail[avail_slot + 1] = 0; /* * Maxmem isn't the "maximum memory", it's one larger than the * highest page of the physical address space. It should be * called something like "Maxphyspage". */ Maxmem = atop(phys_avail[avail_slot - 1]); cpu_tlb_flushID(); } /* * Initialize a vm_page's machine-dependent fields. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.pv_list); m->md.pv_memattr = VM_MEMATTR_WRITE_BACK; } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ void pmap_init(void) { vm_size_t s; int i, pv_npg; /* * Are large page mappings enabled? */ TUNABLE_INT_FETCH("vm.pmap.superpages_enabled", &superpages_enabled); /* * Initialize the pv chunk list mutex. */ mtx_init(&pv_chunks_mutex, "pmap pv chunk list", NULL, MTX_DEF); /* * Initialize the pool of pv list locks. */ for (i = 0; i < NPV_LIST_LOCKS; i++) rw_init(&pv_list_locks[i], "pmap pv list"); /* * Calculate the size of the pv head table for superpages. */ pv_npg = howmany(vm_phys_segs[vm_phys_nsegs - 1].end, L2_SIZE); /* * Allocate memory for the pv head table for superpages. */ s = (vm_size_t)(pv_npg * sizeof(struct md_page)); s = round_page(s); pv_table = (struct md_page *)kmem_malloc(kernel_arena, s, M_WAITOK | M_ZERO); for (i = 0; i < pv_npg; i++) TAILQ_INIT(&pv_table[i].pv_list); TAILQ_INIT(&pv_dummy.pv_list); } static SYSCTL_NODE(_vm_pmap, OID_AUTO, l2, CTLFLAG_RD, 0, "2MB page mapping counters"); static u_long pmap_l2_demotions; SYSCTL_ULONG(_vm_pmap_l2, OID_AUTO, demotions, CTLFLAG_RD, &pmap_l2_demotions, 0, "2MB page demotions"); static u_long pmap_l2_p_failures; SYSCTL_ULONG(_vm_pmap_l2, OID_AUTO, p_failures, CTLFLAG_RD, &pmap_l2_p_failures, 0, "2MB page promotion failures"); static u_long pmap_l2_promotions; SYSCTL_ULONG(_vm_pmap_l2, OID_AUTO, promotions, CTLFLAG_RD, &pmap_l2_promotions, 0, "2MB page promotions"); /* * Invalidate a single TLB entry. 
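* The sequence below issues a DSB so that prior table updates are visible, * a broadcast "tlbi vaae1is" keyed by the page number of the VA (all * ASIDs, inner shareable), and a final DSB plus ISB to wait for * completion; the pmap argument is unused because invalidation is by VA * across all ASIDs.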
*/ PMAP_INLINE void pmap_invalidate_page(pmap_t pmap, vm_offset_t va) { sched_pin(); __asm __volatile( "dsb ishst \n" "tlbi vaae1is, %0 \n" "dsb ish \n" "isb \n" : : "r"(va >> PAGE_SHIFT)); sched_unpin(); } PMAP_INLINE void pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t addr; sched_pin(); dsb(ishst); for (addr = sva; addr < eva; addr += PAGE_SIZE) { __asm __volatile( "tlbi vaae1is, %0" : : "r"(addr >> PAGE_SHIFT)); } __asm __volatile( "dsb ish \n" "isb \n"); sched_unpin(); } PMAP_INLINE void pmap_invalidate_all(pmap_t pmap) { sched_pin(); __asm __volatile( "dsb ishst \n" "tlbi vmalle1is \n" "dsb ish \n" "isb \n"); sched_unpin(); } /* * Routine: pmap_extract * Function: * Extract the physical page address associated * with the given map/virtual_address pair. */ vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va) { pt_entry_t *pte, tpte; vm_paddr_t pa; int lvl; pa = 0; PMAP_LOCK(pmap); /* * Find the block or page map for this virtual address. pmap_pte * will return either a valid block/page entry, or NULL. */ pte = pmap_pte(pmap, va, &lvl); if (pte != NULL) { tpte = pmap_load(pte); pa = tpte & ~ATTR_MASK; switch(lvl) { case 1: KASSERT((tpte & ATTR_DESCR_MASK) == L1_BLOCK, ("pmap_extract: Invalid L1 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L1_OFFSET); break; case 2: KASSERT((tpte & ATTR_DESCR_MASK) == L2_BLOCK, ("pmap_extract: Invalid L2 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L2_OFFSET); break; case 3: KASSERT((tpte & ATTR_DESCR_MASK) == L3_PAGE, ("pmap_extract: Invalid L3 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L3_OFFSET); break; } } PMAP_UNLOCK(pmap); return (pa); } /* * Routine: pmap_extract_and_hold * Function: * Atomically extract and hold the physical page * with the given pmap and virtual address pair * if that mapping permits the given protection. 
*/ vm_page_t pmap_extract_and_hold(pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pt_entry_t *pte, tpte; vm_offset_t off; vm_paddr_t pa; vm_page_t m; int lvl; pa = 0; m = NULL; PMAP_LOCK(pmap); retry: pte = pmap_pte(pmap, va, &lvl); if (pte != NULL) { tpte = pmap_load(pte); KASSERT(lvl > 0 && lvl <= 3, ("pmap_extract_and_hold: Invalid level %d", lvl)); CTASSERT(L1_BLOCK == L2_BLOCK); KASSERT((lvl == 3 && (tpte & ATTR_DESCR_MASK) == L3_PAGE) || (lvl < 3 && (tpte & ATTR_DESCR_MASK) == L1_BLOCK), ("pmap_extract_and_hold: Invalid pte at L%d: %lx", lvl, tpte & ATTR_DESCR_MASK)); if (((tpte & ATTR_AP_RW_BIT) == ATTR_AP(ATTR_AP_RW)) || ((prot & VM_PROT_WRITE) == 0)) { switch(lvl) { case 1: off = va & L1_OFFSET; break; case 2: off = va & L2_OFFSET; break; case 3: default: off = 0; } if (vm_page_pa_tryrelock(pmap, (tpte & ~ATTR_MASK) | off, &pa)) goto retry; m = PHYS_TO_VM_PAGE((tpte & ~ATTR_MASK) | off); vm_page_hold(m); } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } vm_paddr_t pmap_kextract(vm_offset_t va) { pt_entry_t *pte, tpte; vm_paddr_t pa; int lvl; if (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS) { pa = DMAP_TO_PHYS(va); } else { pa = 0; pte = pmap_pte(kernel_pmap, va, &lvl); if (pte != NULL) { tpte = pmap_load(pte); pa = tpte & ~ATTR_MASK; switch(lvl) { case 1: KASSERT((tpte & ATTR_DESCR_MASK) == L1_BLOCK, ("pmap_kextract: Invalid L1 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L1_OFFSET); break; case 2: KASSERT((tpte & ATTR_DESCR_MASK) == L2_BLOCK, ("pmap_kextract: Invalid L2 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L2_OFFSET); break; case 3: KASSERT((tpte & ATTR_DESCR_MASK) == L3_PAGE, ("pmap_kextract: Invalid L3 pte found: %lx", tpte & ATTR_DESCR_MASK)); pa |= (va & L3_OFFSET); break; } } } return (pa); } /*************************************************** * Low level mapping routines..... ***************************************************/ static void pmap_kenter(vm_offset_t sva, vm_size_t size, vm_paddr_t pa, int mode) { pd_entry_t *pde; pt_entry_t *pte; vm_offset_t va; int lvl; KASSERT((pa & L3_OFFSET) == 0, ("pmap_kenter: Invalid physical address")); KASSERT((sva & L3_OFFSET) == 0, ("pmap_kenter: Invalid virtual address")); KASSERT((size & PAGE_MASK) == 0, ("pmap_kenter: Mapping is not page-sized")); va = sva; while (size != 0) { pde = pmap_pde(kernel_pmap, va, &lvl); KASSERT(pde != NULL, ("pmap_kenter: Invalid page entry, va: 0x%lx", va)); KASSERT(lvl == 2, ("pmap_kenter: Invalid level %d", lvl)); pte = pmap_l2_to_l3(pde, va); pmap_load_store(pte, (pa & ~L3_OFFSET) | ATTR_DEFAULT | ATTR_IDX(mode) | L3_PAGE); PTE_SYNC(pte); va += PAGE_SIZE; pa += PAGE_SIZE; size -= PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } void pmap_kenter_device(vm_offset_t sva, vm_size_t size, vm_paddr_t pa) { pmap_kenter(sva, size, pa, DEVICE_MEMORY); } /* * Remove a page from the kernel pagetables. 
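* (If the old mapping was valid and cacheable, the implementation below * writes the data cache back for the VA before clearing the PTE, because * the mapping is about to disappear.)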
*/ PMAP_INLINE void pmap_kremove(vm_offset_t va) { pt_entry_t *pte; int lvl; pte = pmap_pte(kernel_pmap, va, &lvl); KASSERT(pte != NULL, ("pmap_kremove: Invalid address")); KASSERT(lvl == 3, ("pmap_kremove: Invalid pte level %d", lvl)); if (pmap_l3_valid_cacheable(pmap_load(pte))) cpu_dcache_wb_range(va, L3_SIZE); pmap_load_clear(pte); PTE_SYNC(pte); pmap_invalidate_page(kernel_pmap, va); } void pmap_kremove_device(vm_offset_t sva, vm_size_t size) { pt_entry_t *pte; vm_offset_t va; int lvl; KASSERT((sva & L3_OFFSET) == 0, ("pmap_kremove_device: Invalid virtual address")); KASSERT((size & PAGE_MASK) == 0, ("pmap_kremove_device: Mapping is not page-sized")); va = sva; while (size != 0) { pte = pmap_pte(kernel_pmap, va, &lvl); KASSERT(pte != NULL, ("Invalid page table, va: 0x%lx", va)); KASSERT(lvl == 3, ("Invalid device pagetable level: %d != 3", lvl)); pmap_load_clear(pte); PTE_SYNC(pte); va += PAGE_SIZE; size -= PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /* * Used to map a range of physical addresses into kernel * virtual address space. * * The value passed in '*virt' is a suggested virtual address for * the mapping. Architectures which can support a direct-mapped * physical to virtual region can return the appropriate address * within that region, leaving '*virt' unchanged. Other * architectures should map the pages starting at '*virt' and * update '*virt' with the first usable address after the mapped * region. */ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { return PHYS_TO_DMAP(start); } /* * Add a list of wired pages to the kva * this routine is only used for temporary * kernel mappings that do not need to have * page modification or references recorded. * Note that old mappings are simply written * over. The page *must* be wired. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qenter(vm_offset_t sva, vm_page_t *ma, int count) { pd_entry_t *pde; pt_entry_t *pte, pa; vm_offset_t va; vm_page_t m; int i, lvl; va = sva; for (i = 0; i < count; i++) { pde = pmap_pde(kernel_pmap, va, &lvl); KASSERT(pde != NULL, ("pmap_qenter: Invalid page entry, va: 0x%lx", va)); KASSERT(lvl == 2, ("pmap_qenter: Invalid level %d", lvl)); m = ma[i]; pa = VM_PAGE_TO_PHYS(m) | ATTR_DEFAULT | ATTR_AP(ATTR_AP_RW) | ATTR_IDX(m->md.pv_memattr) | L3_PAGE; pte = pmap_l2_to_l3(pde, va); pmap_load_store(pte, pa); PTE_SYNC(pte); va += L3_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /* * This routine tears out page mappings from the * kernel -- it is meant only for temporary mappings. */ void pmap_qremove(vm_offset_t sva, int count) { pt_entry_t *pte; vm_offset_t va; int lvl; KASSERT(sva >= VM_MIN_KERNEL_ADDRESS, ("usermode va %lx", sva)); va = sva; while (count-- > 0) { pte = pmap_pte(kernel_pmap, va, &lvl); KASSERT(lvl == 3, ("Invalid device pagetable level: %d != 3", lvl)); if (pte != NULL) { if (pmap_l3_valid_cacheable(pmap_load(pte))) cpu_dcache_wb_range(va, L3_SIZE); pmap_load_clear(pte); PTE_SYNC(pte); } va += PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /*************************************************** * Page table page management routines..... ***************************************************/ static __inline void pmap_free_zero_pages(struct spglist *free) { vm_page_t m; while ((m = SLIST_FIRST(free)) != NULL) { SLIST_REMOVE_HEAD(free, plinks.s.ss); /* Preserve the page's PG_ZERO setting. */ vm_page_free_toq(m); } } /* * Schedule the specified unused page table page to be freed. 
Specifically, * add the page to the specified list of pages that will be released to the * physical memory manager after the TLB has been updated. */ static __inline void pmap_add_delayed_free_list(vm_page_t m, struct spglist *free, boolean_t set_PG_ZERO) { if (set_PG_ZERO) m->flags |= PG_ZERO; else m->flags &= ~PG_ZERO; SLIST_INSERT_HEAD(free, m, plinks.s.ss); } /* * Decrements a page table page's wire count, which is used to record the * number of valid page table entries within the page. If the wire count * drops to zero, then the page table page is unmapped. Returns TRUE if the * page table page was unmapped and FALSE otherwise. */ static inline boolean_t pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { --m->wire_count; if (m->wire_count == 0) { _pmap_unwire_l3(pmap, va, m, free); return (TRUE); } else return (FALSE); } static void _pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * unmap the page table page */ if (m->pindex >= (NUL2E + NUL1E)) { /* l1 page */ pd_entry_t *l0; l0 = pmap_l0(pmap, va); pmap_load_clear(l0); PTE_SYNC(l0); } else if (m->pindex >= NUL2E) { /* l2 page */ pd_entry_t *l1; l1 = pmap_l1(pmap, va); pmap_load_clear(l1); PTE_SYNC(l1); } else { /* l3 page */ pd_entry_t *l2; l2 = pmap_l2(pmap, va); pmap_load_clear(l2); PTE_SYNC(l2); } pmap_resident_count_dec(pmap, 1); if (m->pindex < NUL2E) { /* We just released an l3, unhold the matching l2 */ pd_entry_t *l1, tl1; vm_page_t l2pg; l1 = pmap_l1(pmap, va); tl1 = pmap_load(l1); l2pg = PHYS_TO_VM_PAGE(tl1 & ~ATTR_MASK); pmap_unwire_l3(pmap, va, l2pg, free); } else if (m->pindex < (NUL2E + NUL1E)) { /* We just released an l2, unhold the matching l1 */ pd_entry_t *l0, tl0; vm_page_t l1pg; l0 = pmap_l0(pmap, va); tl0 = pmap_load(l0); l1pg = PHYS_TO_VM_PAGE(tl0 & ~ATTR_MASK); pmap_unwire_l3(pmap, va, l1pg, free); } pmap_invalidate_page(pmap, va); /* * This is a release store so that the ordinary store unmapping * the page table page is globally performed before TLB shoot- * down is begun. */ atomic_subtract_rel_int(&vm_cnt.v_wire_count, 1); /* * Put page on a list so that it is released after * *ALL* TLB shootdown is done */ pmap_add_delayed_free_list(m, free, TRUE); } /* * After removing an l3 entry, this routine is used to * conditionally free the page, and manage the hold/wire counts. */ static int pmap_unuse_l3(pmap_t pmap, vm_offset_t va, pd_entry_t ptepde, struct spglist *free) { vm_page_t mpte; if (va >= VM_MAXUSER_ADDRESS) return (0); KASSERT(ptepde != 0, ("pmap_unuse_pt: ptepde != 0")); mpte = PHYS_TO_VM_PAGE(ptepde & ~ATTR_MASK); return (pmap_unwire_l3(pmap, va, mpte, free)); } void pmap_pinit0(pmap_t pmap) { PMAP_LOCK_INIT(pmap); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); pmap->pm_l0 = kernel_pmap->pm_l0; pmap->pm_root.rt_root = 0; } int pmap_pinit(pmap_t pmap) { vm_paddr_t l0phys; vm_page_t l0pt; /* * allocate the l0 page */ while ((l0pt = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) VM_WAIT; l0phys = VM_PAGE_TO_PHYS(l0pt); pmap->pm_l0 = (pd_entry_t *)PHYS_TO_DMAP(l0phys); if ((l0pt->flags & PG_ZERO) == 0) pagezero(pmap->pm_l0); pmap->pm_root.rt_root = 0; bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); return (1); } /* * This routine is called if the desired page table page does not exist. * * If page table page allocation fails, this routine may sleep before * returning NULL. It sleeps only if a lock pointer was given. 
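 *
 * The requested ptepindex encodes the level of the page table page:
 * pindexes below NUL2E name L3 pages, pindexes from NUL2E up to
 * NUL2E + NUL1E name L2 pages, and larger pindexes name L1 pages,
 * mirroring the decoding done by _pmap_unwire_l3() above.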
* * Note: If a page allocation fails at page table level two or three, * one or two pages may be held during the wait, only to be released * afterwards. This conservative approach is easily argued to avoid * race conditions. */ static vm_page_t _pmap_alloc_l3(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock **lockp) { vm_page_t m, l1pg, l2pg; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Allocate a page table page. */ if ((m = vm_page_alloc(NULL, ptepindex, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) { if (lockp != NULL) { RELEASE_PV_LIST_LOCK(lockp); PMAP_UNLOCK(pmap); VM_WAIT; PMAP_LOCK(pmap); } /* * Indicate the need to retry. While waiting, the page table * page may have been allocated. */ return (NULL); } if ((m->flags & PG_ZERO) == 0) pmap_zero_page(m); /* * Map the pagetable page into the process address space, if * it isn't already there. */ if (ptepindex >= (NUL2E + NUL1E)) { pd_entry_t *l0; vm_pindex_t l0index; l0index = ptepindex - (NUL2E + NUL1E); l0 = &pmap->pm_l0[l0index]; pmap_load_store(l0, VM_PAGE_TO_PHYS(m) | L0_TABLE); PTE_SYNC(l0); } else if (ptepindex >= NUL2E) { vm_pindex_t l0index, l1index; pd_entry_t *l0, *l1; pd_entry_t tl0; l1index = ptepindex - NUL2E; l0index = l1index >> L0_ENTRIES_SHIFT; l0 = &pmap->pm_l0[l0index]; tl0 = pmap_load(l0); if (tl0 == 0) { /* recurse for allocating page dir */ if (_pmap_alloc_l3(pmap, NUL2E + NUL1E + l0index, lockp) == NULL) { --m->wire_count; /* XXX: release mem barrier? */ atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } } else { l1pg = PHYS_TO_VM_PAGE(tl0 & ~ATTR_MASK); l1pg->wire_count++; } l1 = (pd_entry_t *)PHYS_TO_DMAP(pmap_load(l0) & ~ATTR_MASK); l1 = &l1[ptepindex & Ln_ADDR_MASK]; pmap_load_store(l1, VM_PAGE_TO_PHYS(m) | L1_TABLE); PTE_SYNC(l1); } else { vm_pindex_t l0index, l1index; pd_entry_t *l0, *l1, *l2; pd_entry_t tl0, tl1; l1index = ptepindex >> Ln_ENTRIES_SHIFT; l0index = l1index >> L0_ENTRIES_SHIFT; l0 = &pmap->pm_l0[l0index]; tl0 = pmap_load(l0); if (tl0 == 0) { /* recurse for allocating page dir */ if (_pmap_alloc_l3(pmap, NUL2E + l1index, lockp) == NULL) { --m->wire_count; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } tl0 = pmap_load(l0); l1 = (pd_entry_t *)PHYS_TO_DMAP(tl0 & ~ATTR_MASK); l1 = &l1[l1index & Ln_ADDR_MASK]; } else { l1 = (pd_entry_t *)PHYS_TO_DMAP(tl0 & ~ATTR_MASK); l1 = &l1[l1index & Ln_ADDR_MASK]; tl1 = pmap_load(l1); if (tl1 == 0) { /* recurse for allocating page dir */ if (_pmap_alloc_l3(pmap, NUL2E + l1index, lockp) == NULL) { --m->wire_count; /* XXX: release mem barrier? */ atomic_subtract_int( &vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } } else { l2pg = PHYS_TO_VM_PAGE(tl1 & ~ATTR_MASK); l2pg->wire_count++; } } l2 = (pd_entry_t *)PHYS_TO_DMAP(pmap_load(l1) & ~ATTR_MASK); l2 = &l2[ptepindex & Ln_ADDR_MASK]; pmap_load_store(l2, VM_PAGE_TO_PHYS(m) | L2_TABLE); PTE_SYNC(l2); } pmap_resident_count_inc(pmap, 1); return (m); } static vm_page_t pmap_alloc_l3(pmap_t pmap, vm_offset_t va, struct rwlock **lockp) { vm_pindex_t ptepindex; pd_entry_t *pde, tpde; #ifdef INVARIANTS pt_entry_t *pte; #endif vm_page_t m; int lvl; /* * Calculate pagetable page index */ ptepindex = pmap_l2_pindex(va); retry: /* * Get the page directory entry */ pde = pmap_pde(pmap, va, &lvl); /* * If the page table page is mapped, we just increment the hold count, * and activate it. If we get a level 2 pde it will point to a level 3 * table. 
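 * A lvl of -1 means no valid entry was found at any level.  Levels 0
 * and 1 mean the walk stopped at that level; the entry beneath must be
 * empty, since block (superpage) entries at those depths are not yet
 * supported, as the TODO assertions below document.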
*/ switch (lvl) { case -1: break; case 0: #ifdef INVARIANTS pte = pmap_l0_to_l1(pde, va); KASSERT(pmap_load(pte) == 0, ("pmap_alloc_l3: TODO: l0 superpages")); #endif break; case 1: #ifdef INVARIANTS pte = pmap_l1_to_l2(pde, va); KASSERT(pmap_load(pte) == 0, ("pmap_alloc_l3: TODO: l1 superpages")); #endif break; case 2: tpde = pmap_load(pde); if (tpde != 0) { m = PHYS_TO_VM_PAGE(tpde & ~ATTR_MASK); m->wire_count++; return (m); } break; default: panic("pmap_alloc_l3: Invalid level %d", lvl); } /* * Here if the pte page isn't mapped, or if it has been deallocated. */ m = _pmap_alloc_l3(pmap, ptepindex, lockp); if (m == NULL && lockp != NULL) goto retry; return (m); } /*************************************************** * Pmap allocation/deallocation routines. ***************************************************/ /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. * Should only be called if the map contains no valid mappings. */ void pmap_release(pmap_t pmap) { vm_page_t m; KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); KASSERT(vm_radix_is_empty(&pmap->pm_root), ("pmap_release: pmap has reserved page table page(s)")); m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pmap->pm_l0)); m->wire_count--; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); } static int kvm_size(SYSCTL_HANDLER_ARGS) { unsigned long ksize = VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS; return sysctl_handle_long(oidp, &ksize, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_size, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_size, "LU", "Size of KVM"); static int kvm_free(SYSCTL_HANDLER_ARGS) { unsigned long kfree = VM_MAX_KERNEL_ADDRESS - kernel_vm_end; return sysctl_handle_long(oidp, &kfree, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_free, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_free, "LU", "Amount of KVM free"); /* * grow the number of kernel page table entries, if needed */ void pmap_growkernel(vm_offset_t addr) { vm_paddr_t paddr; vm_page_t nkpg; pd_entry_t *l0, *l1, *l2; mtx_assert(&kernel_map->system_mtx, MA_OWNED); addr = roundup2(addr, L2_SIZE); if (addr - 1 >= kernel_map->max_offset) addr = kernel_map->max_offset; while (kernel_vm_end < addr) { l0 = pmap_l0(kernel_pmap, kernel_vm_end); KASSERT(pmap_load(l0) != 0, ("pmap_growkernel: No level 0 kernel entry")); l1 = pmap_l0_to_l1(l0, kernel_vm_end); if (pmap_load(l1) == 0) { /* We need a new PDP entry */ nkpg = vm_page_alloc(NULL, kernel_vm_end >> L1_SHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); paddr = VM_PAGE_TO_PHYS(nkpg); pmap_load_store(l1, paddr | L1_TABLE); PTE_SYNC(l1); continue; /* try again */ } l2 = pmap_l1_to_l2(l1, kernel_vm_end); if ((pmap_load(l2) & ATTR_AF) != 0) { kernel_vm_end = (kernel_vm_end + L2_SIZE) & ~L2_OFFSET; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } continue; } nkpg = vm_page_alloc(NULL, kernel_vm_end >> L2_SHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); paddr = VM_PAGE_TO_PHYS(nkpg); pmap_load_store(l2, paddr | L2_TABLE); PTE_SYNC(l2); pmap_invalidate_page(kernel_pmap, kernel_vm_end); kernel_vm_end = (kernel_vm_end + L2_SIZE) & 
~L2_OFFSET; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } } } /*************************************************** * page management routines. ***************************************************/ CTASSERT(sizeof(struct pv_chunk) == PAGE_SIZE); CTASSERT(_NPCM == 3); CTASSERT(_NPCPV == 168); static __inline struct pv_chunk * pv_to_chunk(pv_entry_t pv) { return ((struct pv_chunk *)((uintptr_t)pv & ~(uintptr_t)PAGE_MASK)); } #define PV_PMAP(pv) (pv_to_chunk(pv)->pc_pmap) #define PC_FREE0 0xfffffffffffffffful #define PC_FREE1 0xfffffffffffffffful #define PC_FREE2 0x000000fffffffffful static const uint64_t pc_freemask[_NPCM] = { PC_FREE0, PC_FREE1, PC_FREE2 }; #if 0 #ifdef PV_STATS static int pc_chunk_count, pc_chunk_allocs, pc_chunk_frees, pc_chunk_tryfail; SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_count, CTLFLAG_RD, &pc_chunk_count, 0, "Current number of pv entry chunks"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_allocs, CTLFLAG_RD, &pc_chunk_allocs, 0, "Current number of pv entry chunks allocated"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_frees, CTLFLAG_RD, &pc_chunk_frees, 0, "Current number of pv entry chunks frees"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_tryfail, CTLFLAG_RD, &pc_chunk_tryfail, 0, "Number of times tried to get a chunk page but failed."); static long pv_entry_frees, pv_entry_allocs, pv_entry_count; static int pv_entry_spare; SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_frees, CTLFLAG_RD, &pv_entry_frees, 0, "Current number of pv entry frees"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_allocs, CTLFLAG_RD, &pv_entry_allocs, 0, "Current number of pv entry allocs"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_count, CTLFLAG_RD, &pv_entry_count, 0, "Current number of pv entries"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_spare, CTLFLAG_RD, &pv_entry_spare, 0, "Current number of spare pv entries"); #endif #endif /* 0 */ /* * We are in a serious low memory condition. Resort to * drastic measures to free some pages so we can allocate * another pv entry chunk. * * Returns NULL if PV entries were reclaimed from the specified pmap. * * We do not, however, unmap 2mpages because subsequent accesses will * allocate per-page pv entries until repromotion occurs, thereby * exacerbating the shortage of free pv entries. */ static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp) { panic("ARM64TODO: reclaim_pv_chunk"); } /* * free the pv_entry back to the free list */ static void free_pv_entry(pmap_t pmap, pv_entry_t pv) { struct pv_chunk *pc; int idx, field, bit; PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_frees, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, 1)); PV_STAT(atomic_subtract_long(&pv_entry_count, 1)); pc = pv_to_chunk(pv); idx = pv - &pc->pc_pventry[0]; field = idx / 64; bit = idx % 64; pc->pc_map[field] |= 1ul << bit; if (pc->pc_map[0] != PC_FREE0 || pc->pc_map[1] != PC_FREE1 || pc->pc_map[2] != PC_FREE2) { /* 98% of the time, pc is already at the head of the list. 
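 * Moving it back to the head means get_pv_entry() can always take
 * TAILQ_FIRST() and expect to find a chunk with a free slot, without
 * walking the list.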
*/ if (__predict_false(pc != TAILQ_FIRST(&pmap->pm_pvchunk))) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); } return; } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } static void free_pv_chunk(struct pv_chunk *pc) { vm_page_t m; mtx_lock(&pv_chunks_mutex); TAILQ_REMOVE(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); PV_STAT(atomic_subtract_int(&pv_entry_spare, _NPCPV)); PV_STAT(atomic_subtract_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_frees, 1)); /* entire chunk is free, return it */ m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pc)); dump_drop_page(m->phys_addr); vm_page_unwire(m, PQ_NONE); vm_page_free(m); } /* * Returns a new PV entry, allocating a new PV chunk from the system when * needed. If this PV chunk allocation fails and a PV list lock pointer was * given, a PV chunk is reclaimed from an arbitrary pmap. Otherwise, NULL is * returned. * * The given PV list lock may be released. */ static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp) { int bit, field; pv_entry_t pv; struct pv_chunk *pc; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_allocs, 1)); retry: pc = TAILQ_FIRST(&pmap->pm_pvchunk); if (pc != NULL) { for (field = 0; field < _NPCM; field++) { if (pc->pc_map[field]) { bit = ffsl(pc->pc_map[field]) - 1; break; } } if (field < _NPCM) { pv = &pc->pc_pventry[field * 64 + bit]; pc->pc_map[field] &= ~(1ul << bit); /* If this was the last item, move it to tail */ if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 && pc->pc_map[2] == 0) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_subtract_int(&pv_entry_spare, 1)); return (pv); } } /* No free items, allocate another chunk */ m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (m == NULL) { if (lockp == NULL) { PV_STAT(pc_chunk_tryfail++); return (NULL); } m = reclaim_pv_chunk(pmap, lockp); if (m == NULL) goto retry; } PV_STAT(atomic_add_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_allocs, 1)); dump_add_page(m->phys_addr); pc = (void *)PHYS_TO_DMAP(m->phys_addr); pc->pc_pmap = pmap; pc->pc_map[0] = PC_FREE0 & ~1ul; /* preallocated bit 0 */ pc->pc_map[1] = PC_FREE1; pc->pc_map[2] = PC_FREE2; mtx_lock(&pv_chunks_mutex); TAILQ_INSERT_TAIL(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); pv = &pc->pc_pventry[0]; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, _NPCPV - 1)); return (pv); } /* * Ensure that the number of spare PV entries in the specified pmap meets or * exceeds the given count, "needed". * * The given PV list lock may be released. */ static void reserve_pv_entries(pmap_t pmap, int needed, struct rwlock **lockp) { struct pch new_tail; struct pv_chunk *pc; int avail, free; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT(lockp != NULL, ("reserve_pv_entries: lockp is NULL")); /* * Newly allocated PV chunks must be stored in a private list until * the required number of PV chunks have been allocated. Otherwise, * reclaim_pv_chunk() could recycle one of these chunks. In * contrast, these chunks must be added to the pmap upon allocation. 
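 * If a chunk page cannot be allocated, reclaim_pv_chunk() is invoked;
 * when it reclaims entries from this pmap itself (signalled by a NULL
 * return) the tally must be recomputed, hence the goto retry below.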
*/ TAILQ_INIT(&new_tail); retry: avail = 0; TAILQ_FOREACH(pc, &pmap->pm_pvchunk, pc_list) { bit_count((bitstr_t *)pc->pc_map, 0, sizeof(pc->pc_map) * NBBY, &free); if (free == 0) break; avail += free; if (avail >= needed) break; } for (; avail < needed; avail += _NPCPV) { m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (m == NULL) { m = reclaim_pv_chunk(pmap, lockp); if (m == NULL) goto retry; } PV_STAT(atomic_add_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_allocs, 1)); dump_add_page(m->phys_addr); pc = (void *)PHYS_TO_DMAP(m->phys_addr); pc->pc_pmap = pmap; pc->pc_map[0] = PC_FREE0; pc->pc_map[1] = PC_FREE1; pc->pc_map[2] = PC_FREE2; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&new_tail, pc, pc_lru); PV_STAT(atomic_add_int(&pv_entry_spare, _NPCPV)); } if (!TAILQ_EMPTY(&new_tail)) { mtx_lock(&pv_chunks_mutex); TAILQ_CONCAT(&pv_chunks, &new_tail, pc_lru); mtx_unlock(&pv_chunks_mutex); } } /* * First find and then remove the pv entry for the specified pmap and virtual * address from the specified pv list. Returns the pv entry if found and NULL * otherwise. This operation can be performed on pv lists for either 4KB or * 2MB page mappings. */ static __inline pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (pmap == PV_PMAP(pv) && va == pv->pv_va) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; break; } } return (pv); } /* * After demotion from a 2MB page mapping to 512 4KB page mappings, * destroy the pv entry for the 2MB page mapping and reinstantiate the pv * entries for each of the 4KB page mappings. */ static void pmap_pv_demote_l2(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp) { struct md_page *pvh; struct pv_chunk *pc; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; int bit, field; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((pa & L2_OFFSET) == 0, ("pmap_pv_demote_l2: pa is not 2mpage aligned")); CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa); /* * Transfer the 2mpage's pv entry for this mapping to the first * page's pv list. Once this transfer begins, the pv list lock * must not be released until the last pv entry is reinstantiated. */ pvh = pa_to_pvh(pa); va = va & ~L2_OFFSET; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pv_demote_l2: pv not found")); m = PHYS_TO_VM_PAGE(pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; /* Instantiate the remaining Ln_ENTRIES - 1 pv entries. 
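 * These entries are carved straight out of the pmap's pv chunks; the
 * caller is expected to have pre-reserved them (see
 * reserve_pv_entries()), which is what the "missing spare" assertion
 * below checks.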
*/ PV_STAT(atomic_add_long(&pv_entry_allocs, Ln_ENTRIES - 1)); va_last = va + L2_SIZE - PAGE_SIZE; for (;;) { pc = TAILQ_FIRST(&pmap->pm_pvchunk); KASSERT(pc->pc_map[0] != 0 || pc->pc_map[1] != 0 || pc->pc_map[2] != 0, ("pmap_pv_demote_l2: missing spare")); for (field = 0; field < _NPCM; field++) { while (pc->pc_map[field]) { bit = ffsl(pc->pc_map[field]) - 1; pc->pc_map[field] &= ~(1ul << bit); pv = &pc->pc_pventry[field * 64 + bit]; va += PAGE_SIZE; pv->pv_va = va; m++; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_pv_demote_l2: page %p is not managed", m)); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if (va == va_last) goto out; } } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } out: if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 && pc->pc_map[2] == 0) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } PV_STAT(atomic_add_long(&pv_entry_count, Ln_ENTRIES - 1)); PV_STAT(atomic_subtract_int(&pv_entry_spare, Ln_ENTRIES - 1)); } /* * First find and then destroy the pv entry for the specified pmap and virtual * address. This operation can be performed on pv lists for either 4KB or 2MB * page mappings. */ static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pvh_free: pv not found")); free_pv_entry(pmap, pv); } /* * Conditionally create the PV entry for a 4KB page mapping if the required * memory can be allocated without resorting to reclamation. */ static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp) { pv_entry_t pv; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* Pass NULL instead of the lock pointer to disable reclamation. */ if ((pv = get_pv_entry(pmap, NULL)) != NULL) { pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; return (TRUE); } else return (FALSE); } /* * pmap_remove_l3: do the things to unmap a page in a process */ static int pmap_remove_l3(pmap_t pmap, pt_entry_t *l3, vm_offset_t va, pd_entry_t l2e, struct spglist *free, struct rwlock **lockp) { struct md_page *pvh; pt_entry_t old_l3; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(va, L3_SIZE); old_l3 = pmap_load_clear(l3); PTE_SYNC(l3); pmap_invalidate_page(pmap, va); if (old_l3 & ATTR_SW_WIRED) pmap->pm_stats.wired_count -= 1; pmap_resident_count_dec(pmap, 1); if (old_l3 & ATTR_SW_MANAGED) { m = PHYS_TO_VM_PAGE(old_l3 & ~ATTR_MASK); if (pmap_page_dirty(old_l3)) vm_page_dirty(m); if (old_l3 & ATTR_AF) vm_page_aflag_set(m, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); pmap_pvh_free(&m->md, pmap, va); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } return (pmap_unuse_l3(pmap, va, l2e, free)); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly * rounded to the page size. */ void pmap_remove(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { struct rwlock *lock; vm_offset_t va, va_next; pd_entry_t *l0, *l1, *l2; pt_entry_t l3_paddr, *l3; struct spglist free; int anyvalid; /* * Perform an unsynchronized read. This is, however, safe. 
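 * A stale value can only be observed while the pmap is being modified
 * concurrently, and callers must already serialize against changes to
 * the range being removed.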
*/ if (pmap->pm_stats.resident_count == 0) return; anyvalid = 0; SLIST_INIT(&free); PMAP_LOCK(pmap); lock = NULL; for (; sva < eva; sva = va_next) { if (pmap->pm_stats.resident_count == 0) break; l0 = pmap_l0(pmap, sva); if (pmap_load(l0) == 0) { va_next = (sva + L0_SIZE) & ~L0_OFFSET; if (va_next < sva) va_next = eva; continue; } l1 = pmap_l0_to_l1(l0, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } /* * Calculate index for next page table. */ va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (l2 == NULL) continue; l3_paddr = pmap_load(l2); if ((l3_paddr & ATTR_DESCR_MASK) == L2_BLOCK) { /* TODO: Add pmap_remove_l2 */ if (pmap_demote_l2_locked(pmap, l2, sva & ~L2_OFFSET, &lock) == NULL) continue; l3_paddr = pmap_load(l2); } /* * Weed out invalid mappings. */ if ((l3_paddr & ATTR_DESCR_MASK) != L2_TABLE) continue; /* * Limit our scan to either the end of the va represented * by the current page table page, or to the end of the * range being removed. */ if (va_next > eva) va_next = eva; va = va_next; for (l3 = pmap_l2_to_l3(l2, sva); sva != va_next; l3++, sva += L3_SIZE) { if (l3 == NULL) panic("l3 == NULL"); if (pmap_load(l3) == 0) { if (va != va_next) { pmap_invalidate_range(pmap, va, sva); va = va_next; } continue; } if (va == va_next) va = sva; if (pmap_remove_l3(pmap, l3, sva, l3_paddr, &free, &lock)) { sva += L3_SIZE; break; } } if (va != va_next) pmap_invalidate_range(pmap, va, sva); } if (lock != NULL) rw_wunlock(lock); if (anyvalid) pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * Routine: pmap_remove_all * Function: * Removes this physical page from * all physical maps in which it resides. * Reflects back modify bits to the pager. * * Notes: * Original versions of this routine were very * inefficient because they iteratively called * pmap_remove (slow...) */ void pmap_remove_all(vm_page_t m) { struct md_page *pvh; pv_entry_t pv; pmap_t pmap; struct rwlock *lock; pd_entry_t *pde, tpde; pt_entry_t *pte, tpte; vm_offset_t va; struct spglist free; int lvl, pvh_gen, md_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_all: page %p is not managed", m)); SLIST_INIT(&free); lock = VM_PAGE_TO_PV_LIST_LOCK(m); pvh = (m->flags & PG_FICTITIOUS) != 0 ? 
&pv_dummy : pa_to_pvh(VM_PAGE_TO_PHYS(m)); retry: rw_wlock(lock); while ((pv = TAILQ_FIRST(&pvh->pv_list)) != NULL) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { rw_wunlock(lock); PMAP_UNLOCK(pmap); goto retry; } } va = pv->pv_va; pte = pmap_pte(pmap, va, &lvl); KASSERT(pte != NULL, ("pmap_remove_all: no page table entry found")); KASSERT(lvl == 2, ("pmap_remove_all: invalid pte level %d", lvl)); pmap_demote_l2_locked(pmap, pte, va, &lock); PMAP_UNLOCK(pmap); } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != NULL) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { rw_wunlock(lock); PMAP_UNLOCK(pmap); goto retry; } } pmap_resident_count_dec(pmap, 1); pde = pmap_pde(pmap, pv->pv_va, &lvl); KASSERT(pde != NULL, ("pmap_remove_all: no page directory entry found")); KASSERT(lvl == 2, ("pmap_remove_all: invalid pde level %d", lvl)); tpde = pmap_load(pde); pte = pmap_l2_to_l3(pde, pv->pv_va); tpte = pmap_load(pte); if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(tpte)) cpu_dcache_wb_range(pv->pv_va, L3_SIZE); pmap_load_clear(pte); PTE_SYNC(pte); pmap_invalidate_page(pmap, pv->pv_va); if (tpte & ATTR_SW_WIRED) pmap->pm_stats.wired_count--; if ((tpte & ATTR_AF) != 0) vm_page_aflag_set(m, PGA_REFERENCED); /* * Update the vm_page_t clean and reference bits. */ if (pmap_page_dirty(tpte)) vm_page_dirty(m); pmap_unuse_l3(pmap, pv->pv_va, tpde, &free); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; free_pv_entry(pmap, pv); PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(lock); pmap_free_zero_pages(&free); } /* * Set the physical protection on the * specified range of this map as requested. */ void pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va, va_next; pd_entry_t *l0, *l1, *l2; pt_entry_t *l3p, l3; if ((prot & VM_PROT_READ) == VM_PROT_NONE) { pmap_remove(pmap, sva, eva); return; } if ((prot & VM_PROT_WRITE) == VM_PROT_WRITE) return; PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { l0 = pmap_l0(pmap, sva); if (pmap_load(l0) == 0) { va_next = (sva + L0_SIZE) & ~L0_OFFSET; if (va_next < sva) va_next = eva; continue; } l1 = pmap_l0_to_l1(l0, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (pmap_load(l2) == 0) continue; if ((pmap_load(l2) & ATTR_DESCR_MASK) == L2_BLOCK) { l3p = pmap_demote_l2(pmap, l2, sva); if (l3p == NULL) continue; } KASSERT((pmap_load(l2) & ATTR_DESCR_MASK) == L2_TABLE, ("pmap_protect: Invalid L2 entry after demotion")); if (va_next > eva) va_next = eva; va = va_next; for (l3p = pmap_l2_to_l3(l2, sva); sva != va_next; l3p++, sva += L3_SIZE) { l3 = pmap_load(l3p); if (pmap_l3_valid(l3)) { pmap_set(l3p, ATTR_AP(ATTR_AP_RO)); PTE_SYNC(l3p); /* XXX: Use pmap_invalidate_range */ pmap_invalidate_page(pmap, va); } } } PMAP_UNLOCK(pmap); /* TODO: Only invalidate entries we are touching */ pmap_invalidate_all(pmap); } /* * Inserts the specified page table page into the specified pmap's collection * of idle page table pages. Each of a pmap's page table pages is responsible * for mapping a distinct range of virtual addresses. 
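 * (On arm64 the collection holds L3 page table pages that a 2MB
 * promotion has superseded, keyed by pmap_l2_pindex().)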
The pmap's collection is * ordered by this virtual address range. */ static __inline int pmap_insert_pt_page(pmap_t pmap, vm_page_t mpte) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); return (vm_radix_insert(&pmap->pm_root, mpte)); } /* * Looks for a page table page mapping the specified virtual address in the * specified pmap's collection of idle page table pages. Returns NULL if there * is no page table page corresponding to the specified virtual address. */ static __inline vm_page_t pmap_lookup_pt_page(pmap_t pmap, vm_offset_t va) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); return (vm_radix_lookup(&pmap->pm_root, pmap_l2_pindex(va))); } /* * Removes the specified page table page from the specified pmap's collection * of idle page table pages. The specified page table page must be a member of * the pmap's collection. */ static __inline void pmap_remove_pt_page(pmap_t pmap, vm_page_t mpte) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); vm_radix_remove(&pmap->pm_root, mpte->pindex); } /* * Performs a break-before-make update of a pmap entry. This is needed when * either promoting or demoting pages to ensure the TLB doesn't get into an * inconsistent state. */ static void pmap_update_entry(pmap_t pmap, pd_entry_t *pte, pd_entry_t newpte, vm_offset_t va, vm_size_t size) { register_t intr; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Ensure we don't get switched out with the page table in an * inconsistent state. We also need to ensure no interrupts fire * as they may make use of an address we are about to invalidate. */ intr = intr_disable(); critical_enter(); /* Clear the old mapping */ pmap_load_clear(pte); PTE_SYNC(pte); pmap_invalidate_range(pmap, va, va + size); /* Create the new mapping */ pmap_load_store(pte, newpte); PTE_SYNC(pte); critical_exit(); intr_restore(intr); } /* * After promotion from 512 4KB page mappings to a single 2MB page mapping, * replace the many pv entries for the 4KB page mappings by a single pv entry * for the 2MB page mapping. */ static void pmap_pv_promote_l2(pmap_t pmap, vm_offset_t va, vm_paddr_t pa, struct rwlock **lockp) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; KASSERT((pa & L2_OFFSET) == 0, ("pmap_pv_promote_l2: pa is not 2mpage aligned")); CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa); /* * Transfer the first page's pv entry for this mapping to the 2mpage's * pv list. Aside from avoiding the cost of a call to get_pv_entry(), * a transfer avoids the possibility that get_pv_entry() calls * reclaim_pv_chunk() and that reclaim_pv_chunk() removes one of the * mappings that is being promoted. */ m = PHYS_TO_VM_PAGE(pa); va = va & ~L2_OFFSET; pv = pmap_pvh_remove(&m->md, pmap, va); KASSERT(pv != NULL, ("pmap_pv_promote_l2: pv not found")); pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; /* Free the remaining NPTEPG - 1 pv entries. */ va_last = va + L2_SIZE - PAGE_SIZE; do { m++; va += PAGE_SIZE; pmap_pvh_free(&m->md, pmap, va); } while (va < va_last); } /* * Tries to promote the 512, contiguous 4KB page mappings that are within a * single level 2 table entry to a single 2MB page mapping. For promotion * to occur, two conditions must be met: (1) the 4KB page mappings must map * aligned, contiguous physical memory and (2) the 4KB page mappings must have * identical characteristics. 
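 *
 * The scan below checks contiguity and identical attributes with a
 * single comparison per entry: "pa" carries both the expected physical
 * address and the attribute bits of the first PTE, so an entry that
 * differs in either respect fails the oldl3 != pa test.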
 */
static void
pmap_promote_l2(pmap_t pmap, pd_entry_t *l2, vm_offset_t va,
    struct rwlock **lockp)
{
	pt_entry_t *firstl3, *l3, newl2, oldl3, pa;
	vm_page_t mpte;
	vm_offset_t sva;

	PMAP_LOCK_ASSERT(pmap, MA_OWNED);

	sva = va & ~L2_OFFSET;
	firstl3 = pmap_l2_to_l3(l2, sva);
	newl2 = pmap_load(firstl3);

	/* Check that the alignment is valid */
	if (((newl2 & ~ATTR_MASK) & L2_OFFSET) != 0) {
		atomic_add_long(&pmap_l2_p_failures, 1);
		CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
		    " in pmap %p", va, pmap);
		return;
	}

	pa = newl2 + L2_SIZE - PAGE_SIZE;
	for (l3 = firstl3 + NL3PG - 1; l3 > firstl3; l3--) {
		oldl3 = pmap_load(l3);
		if (oldl3 != pa) {
			atomic_add_long(&pmap_l2_p_failures, 1);
			CTR2(KTR_PMAP, "pmap_promote_l2: failure for va %#lx"
			    " in pmap %p", va, pmap);
			return;
		}
		pa -= PAGE_SIZE;
	}

	/*
	 * Save the page table page in its current state until the L2
	 * mapping the superpage is demoted by pmap_demote_l2() or
	 * destroyed by pmap_remove_l3().
	 */
	mpte = PHYS_TO_VM_PAGE(pmap_load(l2) & ~ATTR_MASK);
	KASSERT(mpte >= vm_page_array &&
	    mpte < &vm_page_array[vm_page_array_size],
	    ("pmap_promote_l2: page table page is out of range"));
	KASSERT(mpte->pindex == pmap_l2_pindex(va),
	    ("pmap_promote_l2: page table page's pindex is wrong"));
	if (pmap_insert_pt_page(pmap, mpte)) {
		atomic_add_long(&pmap_l2_p_failures, 1);
		CTR2(KTR_PMAP,
		    "pmap_promote_l2: failure for va %#lx in pmap %p", va,
		    pmap);
		return;
	}

	if ((newl2 & ATTR_SW_MANAGED) != 0)
		pmap_pv_promote_l2(pmap, va, newl2 & ~ATTR_MASK, lockp);

	newl2 &= ~ATTR_DESCR_MASK;
	newl2 |= L2_BLOCK;

	pmap_update_entry(pmap, l2, newl2, sva, L2_SIZE);

	atomic_add_long(&pmap_l2_promotions, 1);
	CTR2(KTR_PMAP, "pmap_promote_l2: success for va %#lx in pmap %p", va,
	    pmap);
}

/*
 * Insert the given physical page (p) at
 * the specified virtual address (v) in the
 * target physical map with the protection requested.
 *
 * If specified, the page will be wired down, meaning
 * that the related pte can not be reclaimed.
 *
 * NB:  This is the only routine which MAY NOT lazy-evaluate
 * or lose information.  That is, this routine must actually
 * insert this page into the given map NOW.
 */
int
pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot,
    u_int flags, int8_t psind __unused)
{
	struct rwlock *lock;
	pd_entry_t *pde;
	pt_entry_t new_l3, orig_l3;
	pt_entry_t *l2, *l3;
	pv_entry_t pv;
	vm_paddr_t opa, pa, l1_pa, l2_pa, l3_pa;
	vm_page_t mpte, om, l1_m, l2_m, l3_m;
	boolean_t nosleep;
	int lvl;

	va = trunc_page(va);
	if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m))
		VM_OBJECT_ASSERT_LOCKED(m->object);
	pa = VM_PAGE_TO_PHYS(m);
	new_l3 = (pt_entry_t)(pa | ATTR_DEFAULT | ATTR_IDX(m->md.pv_memattr) |
	    L3_PAGE);
	if ((prot & VM_PROT_WRITE) == 0)
		new_l3 |= ATTR_AP(ATTR_AP_RO);
	if ((flags & PMAP_ENTER_WIRED) != 0)
		new_l3 |= ATTR_SW_WIRED;
	if ((va >> 63) == 0)
		new_l3 |= ATTR_AP(ATTR_AP_USER);

	CTR2(KTR_PMAP, "pmap_enter: %.16lx -> %.16lx", va, pa);

	mpte = NULL;

	lock = NULL;
	PMAP_LOCK(pmap);

	pde = pmap_pde(pmap, va, &lvl);
	if (pde != NULL && lvl == 1) {
		l2 = pmap_l1_to_l2(pde, va);
		if ((pmap_load(l2) & ATTR_DESCR_MASK) == L2_BLOCK &&
		    (l3 = pmap_demote_l2_locked(pmap, l2, va & ~L2_OFFSET,
		    &lock)) != NULL) {
			l3 = &l3[pmap_l3_index(va)];
			if (va < VM_MAXUSER_ADDRESS) {
				mpte = PHYS_TO_VM_PAGE(
				    pmap_load(l2) & ~ATTR_MASK);
				mpte->wire_count++;
			}
			goto havel3;
		}
	}

	if (va < VM_MAXUSER_ADDRESS) {
		nosleep = (flags & PMAP_ENTER_NOSLEEP) != 0;
		mpte = pmap_alloc_l3(pmap, va, nosleep ?
NULL : &lock); if (mpte == NULL && nosleep) { CTR0(KTR_PMAP, "pmap_enter: mpte == NULL"); if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); return (KERN_RESOURCE_SHORTAGE); } pde = pmap_pde(pmap, va, &lvl); KASSERT(pde != NULL, ("pmap_enter: Invalid page entry, va: 0x%lx", va)); KASSERT(lvl == 2, ("pmap_enter: Invalid level %d", lvl)); l3 = pmap_l2_to_l3(pde, va); } else { /* * If we get a level 2 pde it must point to a level 3 entry * otherwise we will need to create the intermediate tables */ if (lvl < 2) { switch(lvl) { default: case -1: /* Get the l0 pde to update */ pde = pmap_l0(pmap, va); KASSERT(pde != NULL, ("...")); l1_m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (l1_m == NULL) panic("pmap_enter: l1 pte_m == NULL"); if ((l1_m->flags & PG_ZERO) == 0) pmap_zero_page(l1_m); l1_pa = VM_PAGE_TO_PHYS(l1_m); pmap_load_store(pde, l1_pa | L0_TABLE); PTE_SYNC(pde); /* FALLTHROUGH */ case 0: /* Get the l1 pde to update */ pde = pmap_l1_to_l2(pde, va); KASSERT(pde != NULL, ("...")); l2_m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (l2_m == NULL) panic("pmap_enter: l2 pte_m == NULL"); if ((l2_m->flags & PG_ZERO) == 0) pmap_zero_page(l2_m); l2_pa = VM_PAGE_TO_PHYS(l2_m); pmap_load_store(pde, l2_pa | L1_TABLE); PTE_SYNC(pde); /* FALLTHROUGH */ case 1: /* Get the l2 pde to update */ pde = pmap_l1_to_l2(pde, va); l3_m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (l3_m == NULL) panic("pmap_enter: l3 pte_m == NULL"); if ((l3_m->flags & PG_ZERO) == 0) pmap_zero_page(l3_m); l3_pa = VM_PAGE_TO_PHYS(l3_m); pmap_load_store(pde, l3_pa | L2_TABLE); PTE_SYNC(pde); break; } } l3 = pmap_l2_to_l3(pde, va); pmap_invalidate_page(pmap, va); } havel3: om = NULL; orig_l3 = pmap_load(l3); opa = orig_l3 & ~ATTR_MASK; /* * Is the specified virtual address already mapped? */ if (pmap_l3_valid(orig_l3)) { /* * Wiring change, just update stats. We don't worry about * wiring PT pages as they remain resident as long as there * are valid mappings in them. Hence, if a user page is wired, * the PT page will be also. */ if ((flags & PMAP_ENTER_WIRED) != 0 && (orig_l3 & ATTR_SW_WIRED) == 0) pmap->pm_stats.wired_count++; else if ((flags & PMAP_ENTER_WIRED) == 0 && (orig_l3 & ATTR_SW_WIRED) != 0) pmap->pm_stats.wired_count--; /* * Remove the extra PT page reference. */ if (mpte != NULL) { mpte->wire_count--; KASSERT(mpte->wire_count > 0, ("pmap_enter: missing reference to page table page," " va: 0x%lx", va)); } /* * Has the physical page changed? */ if (opa == pa) { /* * No, might be a protection or wiring change. */ if ((orig_l3 & ATTR_SW_MANAGED) != 0) { new_l3 |= ATTR_SW_MANAGED; if ((new_l3 & ATTR_AP(ATTR_AP_RW)) == ATTR_AP(ATTR_AP_RW)) { vm_page_aflag_set(m, PGA_WRITEABLE); } } goto validate; } /* Flush the cache, there might be uncommitted data in it */ if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(orig_l3)) cpu_dcache_wb_range(va, L3_SIZE); } else { /* * Increment the counters. */ if ((new_l3 & ATTR_SW_WIRED) != 0) pmap->pm_stats.wired_count++; pmap_resident_count_inc(pmap, 1); } /* * Enter on the PV list if part of our managed memory. 
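 * The pv entry records this (pmap, va) pair on the page itself, which
 * is what allows pmap_remove_all() and pmap_page_test_mappings() to
 * find every mapping of a given physical page.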
*/ if ((m->oflags & VPO_UNMANAGED) == 0) { new_l3 |= ATTR_SW_MANAGED; pv = get_pv_entry(pmap, &lock); pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if ((new_l3 & ATTR_AP_RW_BIT) == ATTR_AP(ATTR_AP_RW)) vm_page_aflag_set(m, PGA_WRITEABLE); } /* * Update the L3 entry. */ if (orig_l3 != 0) { validate: orig_l3 = pmap_load(l3); opa = orig_l3 & ~ATTR_MASK; if (opa != pa) { pmap_update_entry(pmap, l3, new_l3, va, PAGE_SIZE); if ((orig_l3 & ATTR_SW_MANAGED) != 0) { om = PHYS_TO_VM_PAGE(opa); if (pmap_page_dirty(orig_l3)) vm_page_dirty(om); if ((orig_l3 & ATTR_AF) != 0) vm_page_aflag_set(om, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, opa); pmap_pvh_free(&om->md, pmap, va); if ((om->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&om->md.pv_list) && ((om->flags & PG_FICTITIOUS) != 0 || TAILQ_EMPTY(&pa_to_pvh(opa)->pv_list))) vm_page_aflag_clear(om, PGA_WRITEABLE); } } else { pmap_load_store(l3, new_l3); PTE_SYNC(l3); pmap_invalidate_page(pmap, va); if (pmap_page_dirty(orig_l3) && (orig_l3 & ATTR_SW_MANAGED) != 0) vm_page_dirty(m); } } else { pmap_load_store(l3, new_l3); } PTE_SYNC(l3); pmap_invalidate_page(pmap, va); if (pmap != pmap_kernel()) { if (pmap == &curproc->p_vmspace->vm_pmap && (prot & VM_PROT_EXECUTE) != 0) cpu_icache_sync_range(va, PAGE_SIZE); if ((mpte == NULL || mpte->wire_count == NL3PG) && pmap_superpages_enabled() && (m->flags & PG_FICTITIOUS) == 0 && vm_reserv_level_iffullpop(m) == 0) { pmap_promote_l2(pmap, pde, va, &lock); } } if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ void pmap_enter_object(pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { struct rwlock *lock; vm_offset_t va; vm_page_t m, mpte; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); mpte = NULL; m = m_start; lock = NULL; PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { va = start + ptoa(diff); mpte = pmap_enter_quick_locked(pmap, va, m, prot, mpte, &lock); m = TAILQ_NEXT(m, listq); } if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); } /* * this code makes some *MAJOR* assumptions: * 1. Current pmap & pmap exists. * 2. Not wired. * 3. Read access. * 4. No page table pages. * but is *MUCH* faster than pmap_enter... 
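 *
 * (It is intended for prefaulting, e.g. via pmap_enter_object():
 * mappings are installed read-only and unwired, and a later real
 * fault can upgrade them through the full pmap_enter() path.)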
*/ void pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { struct rwlock *lock; lock = NULL; PMAP_LOCK(pmap); (void)pmap_enter_quick_locked(pmap, va, m, prot, NULL, &lock); if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); } static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp) { struct spglist free; pd_entry_t *pde; pt_entry_t *l2, *l3; vm_paddr_t pa; int lvl; KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva || (m->oflags & VPO_UNMANAGED) != 0, ("pmap_enter_quick_locked: managed mapping within the clean submap")); PMAP_LOCK_ASSERT(pmap, MA_OWNED); CTR2(KTR_PMAP, "pmap_enter_quick_locked: %p %lx", pmap, va); /* * In the case that a page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { vm_pindex_t l2pindex; /* * Calculate pagetable page index */ l2pindex = pmap_l2_pindex(va); if (mpte && (mpte->pindex == l2pindex)) { mpte->wire_count++; } else { /* * Get the l2 entry */ pde = pmap_pde(pmap, va, &lvl); /* * If the page table page is mapped, we just increment * the hold count, and activate it. Otherwise, we * attempt to allocate a page table page. If this * attempt fails, we don't retry. Instead, we give up. */ if (lvl == 1) { l2 = pmap_l1_to_l2(pde, va); if ((pmap_load(l2) & ATTR_DESCR_MASK) == L2_BLOCK) return (NULL); } if (lvl == 2 && pmap_load(pde) != 0) { mpte = PHYS_TO_VM_PAGE(pmap_load(pde) & ~ATTR_MASK); mpte->wire_count++; } else { /* * Pass NULL instead of the PV list lock * pointer, because we don't intend to sleep. */ mpte = _pmap_alloc_l3(pmap, l2pindex, NULL); if (mpte == NULL) return (mpte); } } l3 = (pt_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mpte)); l3 = &l3[pmap_l3_index(va)]; } else { mpte = NULL; pde = pmap_pde(kernel_pmap, va, &lvl); KASSERT(pde != NULL, ("pmap_enter_quick_locked: Invalid page entry, va: 0x%lx", va)); KASSERT(lvl == 2, ("pmap_enter_quick_locked: Invalid level %d", lvl)); l3 = pmap_l2_to_l3(pde, va); } if (pmap_load(l3) != 0) { if (mpte != NULL) { mpte->wire_count--; mpte = NULL; } return (mpte); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0 && !pmap_try_insert_pv_entry(pmap, va, m, lockp)) { if (mpte != NULL) { SLIST_INIT(&free); if (pmap_unwire_l3(pmap, va, mpte, &free)) { pmap_invalidate_page(pmap, va); pmap_free_zero_pages(&free); } mpte = NULL; } return (mpte); } /* * Increment counters */ pmap_resident_count_inc(pmap, 1); pa = VM_PAGE_TO_PHYS(m) | ATTR_DEFAULT | ATTR_IDX(m->md.pv_memattr) | ATTR_AP(ATTR_AP_RO) | L3_PAGE; /* * Now validate mapping with RO protection */ if ((m->oflags & VPO_UNMANAGED) == 0) pa |= ATTR_SW_MANAGED; pmap_load_store(l3, pa); PTE_SYNC(l3); pmap_invalidate_page(pmap, va); return (mpte); } /* * This code maps large physical mmap regions into the * processor address space. Note that some shortcuts * are taken, but the code works. */ void pmap_object_init_pt(pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("pmap_object_init_pt: non-device object")); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range * must have the wired attribute set. In contrast, invalid mappings * cannot have the wired attribute set, so they are ignored. 
* * The wired attribute of the page table entry is not a hardware feature, * so there is no need to invalidate any TLB entries. */ void pmap_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va_next; pd_entry_t *l0, *l1, *l2; pt_entry_t *l3; PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { l0 = pmap_l0(pmap, sva); if (pmap_load(l0) == 0) { va_next = (sva + L0_SIZE) & ~L0_OFFSET; if (va_next < sva) va_next = eva; continue; } l1 = pmap_l0_to_l1(l0, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (pmap_load(l2) == 0) continue; if ((pmap_load(l2) & ATTR_DESCR_MASK) == L2_BLOCK) { l3 = pmap_demote_l2(pmap, l2, sva); if (l3 == NULL) continue; } KASSERT((pmap_load(l2) & ATTR_DESCR_MASK) == L2_TABLE, ("pmap_unwire: Invalid l2 entry after demotion")); if (va_next > eva) va_next = eva; for (l3 = pmap_l2_to_l3(l2, sva); sva != va_next; l3++, sva += L3_SIZE) { if (pmap_load(l3) == 0) continue; if ((pmap_load(l3) & ATTR_SW_WIRED) == 0) panic("pmap_unwire: l3 %#jx is missing " "ATTR_SW_WIRED", (uintmax_t)pmap_load(l3)); /* * PG_W must be cleared atomically. Although the pmap * lock synchronizes access to PG_W, another processor * could be setting PG_M and/or PG_A concurrently. */ atomic_clear_long(l3, ATTR_SW_WIRED); pmap->pm_stats.wired_count--; } } PMAP_UNLOCK(pmap); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. */ void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { } /* * pmap_zero_page zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. */ void pmap_zero_page(vm_page_t m) { vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); pagezero((void *)va); } /* * pmap_zero_page_area zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. * * off and size may not cover an area beyond a single hardware page. */ void pmap_zero_page_area(vm_page_t m, int off, int size) { vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); if (off == 0 && size == PAGE_SIZE) pagezero((void *)va); else bzero((char *)va + off, size); } /* * pmap_copy_page copies the specified (machine independent) * page by mapping the page into virtual memory and using * bcopy to copy the page, one machine dependent page at a * time. 
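 * On arm64 no transient mapping is actually needed; both pages are
 * reached through the direct map, as the PHYS_TO_DMAP() calls below
 * show.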
*/ void pmap_copy_page(vm_page_t msrc, vm_page_t mdst) { vm_offset_t src = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(msrc)); vm_offset_t dst = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mdst)); pagecopy((void *)src, (void *)dst); } int unmapped_buf_allowed = 1; void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[], vm_offset_t b_offset, int xfersize) { void *a_cp, *b_cp; vm_page_t m_a, m_b; vm_paddr_t p_a, p_b; vm_offset_t a_pg_offset, b_pg_offset; int cnt; while (xfersize > 0) { a_pg_offset = a_offset & PAGE_MASK; m_a = ma[a_offset >> PAGE_SHIFT]; p_a = m_a->phys_addr; b_pg_offset = b_offset & PAGE_MASK; m_b = mb[b_offset >> PAGE_SHIFT]; p_b = m_b->phys_addr; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); cnt = min(cnt, PAGE_SIZE - b_pg_offset); if (__predict_false(!PHYS_IN_DMAP(p_a))) { panic("!DMAP a %lx", p_a); } else { a_cp = (char *)PHYS_TO_DMAP(p_a) + a_pg_offset; } if (__predict_false(!PHYS_IN_DMAP(p_b))) { panic("!DMAP b %lx", p_b); } else { b_cp = (char *)PHYS_TO_DMAP(p_b) + b_pg_offset; } bcopy(a_cp, b_cp, cnt); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } } vm_offset_t pmap_quick_enter_page(vm_page_t m) { return (PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m))); } void pmap_quick_remove_page(vm_offset_t addr) { } /* * Returns true if the pmap's pv is one of the first * 16 pvs linked to from this page. This count may * be changed upwards or downwards in the future; it * is only necessary that true be returned for a small * subset of pmaps for proper page aging. */ boolean_t pmap_page_exists_quick(pmap_t pmap, vm_page_t m) { struct md_page *pvh; struct rwlock *lock; pv_entry_t pv; int loops = 0; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_page_exists_quick: page %p is not managed", m)); rv = FALSE; lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } if (!rv && loops < 16 && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } } rw_runlock(lock); return (rv); } /* * pmap_page_wired_mappings: * * Return the number of managed mappings to the given physical page * that are wired. */ int pmap_page_wired_mappings(vm_page_t m) { struct rwlock *lock; struct md_page *pvh; pmap_t pmap; pt_entry_t *pte; pv_entry_t pv; int count, lvl, md_gen, pvh_gen; if ((m->oflags & VPO_UNMANAGED) != 0) return (0); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: count = 0; TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va, &lvl); if (pte != NULL && (pmap_load(pte) & ATTR_SW_WIRED) != 0) count++; PMAP_UNLOCK(pmap); } if ((m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; pvh_gen = pvh->pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen || pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va, &lvl); if (pte != NULL && (pmap_load(pte) & ATTR_SW_WIRED) != 0) count++; PMAP_UNLOCK(pmap); } } rw_runlock(lock); return (count); } /* * Destroy all managed, non-wired mappings in the given user-space * pmap. 
This pmap cannot be active on any processor besides the * caller. * * This function cannot be applied to the kernel pmap. Moreover, it * is not intended for general use. It is only to be used during * process termination. Consequently, it can be implemented in ways * that make it faster than pmap_remove(). First, it can more quickly * destroy mappings by iterating over the pmap's collection of PV * entries, rather than searching the page table. Second, it doesn't * have to test and clear the page table entries atomically, because * no processor is currently accessing the user address space. In * particular, a page table entry's dirty bit won't change state once * this function starts. */ void pmap_remove_pages(pmap_t pmap) { pd_entry_t *pde; pt_entry_t *pte, tpte; struct spglist free; vm_page_t m, ml3, mt; pv_entry_t pv; struct md_page *pvh; struct pv_chunk *pc, *npc; struct rwlock *lock; int64_t bit; uint64_t inuse, bitmask; int allfree, field, freed, idx, lvl; vm_paddr_t pa; lock = NULL; SLIST_INIT(&free); PMAP_LOCK(pmap); TAILQ_FOREACH_SAFE(pc, &pmap->pm_pvchunk, pc_list, npc) { allfree = 1; freed = 0; for (field = 0; field < _NPCM; field++) { inuse = ~pc->pc_map[field] & pc_freemask[field]; while (inuse != 0) { bit = ffsl(inuse) - 1; bitmask = 1UL << bit; idx = field * 64 + bit; pv = &pc->pc_pventry[idx]; inuse &= ~bitmask; pde = pmap_pde(pmap, pv->pv_va, &lvl); KASSERT(pde != NULL, ("Attempting to remove an unmapped page")); switch(lvl) { case 1: pte = pmap_l1_to_l2(pde, pv->pv_va); tpte = pmap_load(pte); KASSERT((tpte & ATTR_DESCR_MASK) == L2_BLOCK, ("Attempting to remove an invalid " "block: %lx", tpte)); tpte = pmap_load(pte); break; case 2: pte = pmap_l2_to_l3(pde, pv->pv_va); tpte = pmap_load(pte); KASSERT((tpte & ATTR_DESCR_MASK) == L3_PAGE, ("Attempting to remove an invalid " "page: %lx", tpte)); break; default: panic( "Invalid page directory level: %d", lvl); } /* * We cannot remove wired pages from a process' mapping at this time */ if (tpte & ATTR_SW_WIRED) { allfree = 0; continue; } pa = tpte & ~ATTR_MASK; m = PHYS_TO_VM_PAGE(pa); KASSERT(m->phys_addr == pa, ("vm_page_t %p phys_addr mismatch %016jx %016jx", m, (uintmax_t)m->phys_addr, (uintmax_t)tpte)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("pmap_remove_pages: bad pte %#jx", (uintmax_t)tpte)); if (pmap_is_current(pmap)) { if (lvl == 2 && pmap_l3_valid_cacheable(tpte)) { cpu_dcache_wb_range(pv->pv_va, L3_SIZE); } else if (lvl == 1 && pmap_pte_valid_cacheable(tpte)) { cpu_dcache_wb_range(pv->pv_va, L2_SIZE); } } pmap_load_clear(pte); PTE_SYNC(pte); pmap_invalidate_page(pmap, pv->pv_va); /* * Update the vm_page_t clean/reference bits. 
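 * A writable (AP_RW) mapping is assumed dirty because there is no
 * hardware-managed dirty bit; for a 2MB block every constituent 4KB
 * page must be dirtied.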
*/ if ((tpte & ATTR_AP_RW_BIT) == ATTR_AP(ATTR_AP_RW)) { switch (lvl) { case 1: for (mt = m; mt < &m[L2_SIZE / PAGE_SIZE]; mt++) vm_page_dirty(mt); break; case 2: vm_page_dirty(m); break; } } CHANGE_PV_LIST_LOCK_TO_VM_PAGE(&lock, m); /* Mark free */ pc->pc_map[field] |= bitmask; switch (lvl) { case 1: pmap_resident_count_dec(pmap, L2_SIZE / PAGE_SIZE); pvh = pa_to_pvh(tpte & ~ATTR_MASK); TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; if (TAILQ_EMPTY(&pvh->pv_list)) { for (mt = m; mt < &m[L2_SIZE / PAGE_SIZE]; mt++) if ((mt->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&mt->md.pv_list)) vm_page_aflag_clear(mt, PGA_WRITEABLE); } ml3 = pmap_lookup_pt_page(pmap, pv->pv_va); if (ml3 != NULL) { pmap_remove_pt_page(pmap, ml3); pmap_resident_count_dec(pmap, 1); KASSERT(ml3->wire_count == NL3PG, ("pmap_remove_pages: l3 page wire count error")); ml3->wire_count = 0; pmap_add_delayed_free_list(ml3, &free, FALSE); atomic_subtract_int( &vm_cnt.v_wire_count, 1); } break; case 2: pmap_resident_count_dec(pmap, 1); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if ((m->aflags & PGA_WRITEABLE) != 0 && TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh( VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } break; } pmap_unuse_l3(pmap, pv->pv_va, pmap_load(pde), &free); freed++; } } PV_STAT(atomic_add_long(&pv_entry_frees, freed)); PV_STAT(atomic_add_int(&pv_entry_spare, freed)); PV_STAT(atomic_subtract_long(&pv_entry_count, freed)); if (allfree) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } } pmap_invalidate_all(pmap); if (lock != NULL) rw_wunlock(lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * This is used to check if a page has been accessed or modified. Since * we have no hardware bit that records modification, we must assume a * page has been modified if it is mapped read/write.
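 *
 * Editor's note: callers request the "accessed" and/or "modified" test
 * and the two are folded into one mask/value pair so a single load of
 * the PTE suffices.  For an accessed L3 page the code below effectively
 * computes (sketch):
 *
 *	mask  = ATTR_AF | ATTR_DESCR_MASK;
 *	value = ATTR_AF | L3_PAGE;
 *	rv    = (pmap_load(pte) & mask) == value;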
*/ static boolean_t pmap_page_test_mappings(vm_page_t m, boolean_t accessed, boolean_t modified) { struct rwlock *lock; pv_entry_t pv; struct md_page *pvh; pt_entry_t *pte, mask, value; pmap_t pmap; int lvl, md_gen, pvh_gen; boolean_t rv; rv = FALSE; lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va, &lvl); KASSERT(lvl == 3, ("pmap_page_test_mappings: Invalid level %d", lvl)); mask = 0; value = 0; if (modified) { mask |= ATTR_AP_RW_BIT; value |= ATTR_AP(ATTR_AP_RW); } if (accessed) { mask |= ATTR_AF | ATTR_DESCR_MASK; value |= ATTR_AF | L3_PAGE; } rv = (pmap_load(pte) & mask) == value; PMAP_UNLOCK(pmap); if (rv) goto out; } if ((m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; pvh_gen = pvh->pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen || pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } pte = pmap_pte(pmap, pv->pv_va, &lvl); KASSERT(lvl == 2, ("pmap_page_test_mappings: Invalid level %d", lvl)); mask = 0; value = 0; if (modified) { mask |= ATTR_AP_RW_BIT; value |= ATTR_AP(ATTR_AP_RW); } if (accessed) { mask |= ATTR_AF | ATTR_DESCR_MASK; value |= ATTR_AF | L2_BLOCK; } rv = (pmap_load(pte) & mask) == value; PMAP_UNLOCK(pmap); if (rv) goto out; } } out: rw_runlock(lock); return (rv); } /* * pmap_is_modified: * * Return whether or not the specified physical page was modified * in any physical maps. */ boolean_t pmap_is_modified(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_modified: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTEs can be dirty. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (FALSE); return (pmap_page_test_mappings(m, FALSE, TRUE)); } /* * pmap_is_prefaultable: * * Return whether or not the specified virtual address is eligible * for prefault. */ boolean_t pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr) { pd_entry_t *pde; pt_entry_t *pte; boolean_t rv; int lvl; /* * Return TRUE if and only if the L3 entry for the specified virtual * address is allocated but invalid, i.e. the address is not already * mapped. */ rv = FALSE; PMAP_LOCK(pmap); pde = pmap_pde(pmap, addr, &lvl); if (pde != NULL && lvl == 2) { pte = pmap_l2_to_l3(pde, addr); rv = pmap_load(pte) == 0; } PMAP_UNLOCK(pmap); return (rv); } /* * pmap_is_referenced: * * Return whether or not the specified physical page was referenced * in any physical maps. */ boolean_t pmap_is_referenced(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_referenced: page %p is not managed", m)); return (pmap_page_test_mappings(m, TRUE, FALSE)); } /* * Clear the write and modified bits in each of the given page's mappings. */ void pmap_remove_write(vm_page_t m) { struct md_page *pvh; pmap_t pmap; struct rwlock *lock; pv_entry_t next_pv, pv; pt_entry_t oldpte, *pte; vm_offset_t va; int lvl, md_gen, pvh_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating.
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; lock = VM_PAGE_TO_PV_LIST_LOCK(m); pvh = (m->flags & PG_FICTITIOUS) != 0 ? &pv_dummy : pa_to_pvh(VM_PAGE_TO_PHYS(m)); retry_pv_loop: rw_wlock(lock); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); rw_wunlock(lock); goto retry_pv_loop; } } va = pv->pv_va; pte = pmap_pte(pmap, pv->pv_va, &lvl); if ((pmap_load(pte) & ATTR_AP_RW_BIT) == ATTR_AP(ATTR_AP_RW)) pmap_demote_l2_locked(pmap, pte, va & ~L2_OFFSET, &lock); KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); PMAP_UNLOCK(pmap); } TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); rw_wunlock(lock); goto retry_pv_loop; } } pte = pmap_pte(pmap, pv->pv_va, &lvl); retry: oldpte = pmap_load(pte); if ((oldpte & ATTR_AP_RW_BIT) == ATTR_AP(ATTR_AP_RW)) { if (!atomic_cmpset_long(pte, oldpte, oldpte | ATTR_AP(ATTR_AP_RO))) goto retry; if ((oldpte & ATTR_AF) != 0) vm_page_dirty(m); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } rw_wunlock(lock); vm_page_aflag_clear(m, PGA_WRITEABLE); } static __inline boolean_t safe_to_clear_referenced(pmap_t pmap, pt_entry_t pte) { return (FALSE); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * pmap_ts_referenced: * * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. + * As an optimization, update the page's dirty field if a modified bit is + * found while counting reference bits. This opportunistic update can be + * performed at low cost and can eliminate the need for some future calls + * to pmap_is_modified(). However, since this function stops after + * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some + * dirty pages. Those dirty pages will only be detected by a future call + * to pmap_is_modified(). */ int pmap_ts_referenced(vm_page_t m) { struct md_page *pvh; pv_entry_t pv, pvf; pmap_t pmap; struct rwlock *lock; pd_entry_t *pde, tpde; pt_entry_t *pte, tpte; pt_entry_t *l3; vm_offset_t va; vm_paddr_t pa; int cleared, md_gen, not_cleared, lvl, pvh_gen; struct spglist free; bool demoted; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_ts_referenced: page %p is not managed", m)); SLIST_INIT(&free); cleared = 0; pa = VM_PAGE_TO_PHYS(m); lock = PHYS_TO_PV_LIST_LOCK(pa); pvh = (m->flags & PG_FICTITIOUS) != 0 ? 
&pv_dummy : pa_to_pvh(pa); rw_wlock(lock); retry: not_cleared = 0; if ((pvf = TAILQ_FIRST(&pvh->pv_list)) == NULL) goto small_mappings; pv = pvf; do { if (pvf == NULL) pvf = pv; pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen) { PMAP_UNLOCK(pmap); goto retry; } } va = pv->pv_va; pde = pmap_pde(pmap, pv->pv_va, &lvl); KASSERT(pde != NULL, ("pmap_ts_referenced: no l1 table found")); KASSERT(lvl == 1, ("pmap_ts_referenced: invalid pde level %d", lvl)); tpde = pmap_load(pde); KASSERT((tpde & ATTR_DESCR_MASK) == L1_TABLE, ("pmap_ts_referenced: found an invalid l1 table")); pte = pmap_l1_to_l2(pde, pv->pv_va); tpte = pmap_load(pte); + if (pmap_page_dirty(tpte)) { + /* + * Although "tpte" is mapping a 2MB page, because + * this function is called at a 4KB page granularity, + * we only update the 4KB page under test. + */ + vm_page_dirty(m); + } if ((tpte & ATTR_AF) != 0) { /* * Since this reference bit is shared by 512 4KB * pages, it should not be cleared every time it is * tested. Apply a simple "hash" function on the * physical page number, the virtual superpage number, * and the pmap address to select one 4KB page out of * the 512 on which testing the reference bit will * result in clearing that reference bit. This * function is designed to avoid the selection of the * same 4KB page for every 2MB page mapping. * * On demotion, a mapping that hasn't been referenced * is simply destroyed. To avoid the possibility of a * subsequent page fault on a demoted wired mapping, * always leave its reference bit set. Moreover, * since the superpage is wired, the current state of * its reference bit won't affect page replacement. */ if ((((pa >> PAGE_SHIFT) ^ (pv->pv_va >> L2_SHIFT) ^ (uintptr_t)pmap) & (Ln_ENTRIES - 1)) == 0 && (tpte & ATTR_SW_WIRED) == 0) { if (safe_to_clear_referenced(pmap, tpte)) { /* * TODO: We don't handle the access * flag at all. We need to be able * to set it in the exception handler. */ panic("ARM64TODO: " "safe_to_clear_referenced\n"); } else if (pmap_demote_l2_locked(pmap, pte, pv->pv_va, &lock) != NULL) { demoted = true; va += VM_PAGE_TO_PHYS(m) - (tpte & ~ATTR_MASK); l3 = pmap_l2_to_l3(pte, va); pmap_remove_l3(pmap, l3, va, pmap_load(pte), NULL, &lock); } else demoted = true; if (demoted) { /* * The superpage mapping was removed * entirely and therefore 'pv' is no * longer valid. */ if (pvf == pv) pvf = NULL; pv = NULL; } cleared++; KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); } else not_cleared++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. 
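 * Moving the entry just examined to the tail means the next call
 * resumes with a different mapping, so reference bits age evenly
 * across all of the page's pmaps; e.g. a list [A, B, C] becomes
 * [B, C, A] once A has been tested.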
*/ if (pv != NULL && TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; } if (cleared + not_cleared >= PMAP_TS_REFERENCED_MAX) goto out; } while ((pv = TAILQ_FIRST(&pvh->pv_list)) != pvf); small_mappings: if ((pvf = TAILQ_FIRST(&m->md.pv_list)) == NULL) goto out; pv = pvf; do { if (pvf == NULL) pvf = pv; pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { pvh_gen = pvh->pv_gen; md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (pvh_gen != pvh->pv_gen || md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto retry; } } pde = pmap_pde(pmap, pv->pv_va, &lvl); KASSERT(pde != NULL, ("pmap_ts_referenced: no l2 table found")); KASSERT(lvl == 2, ("pmap_ts_referenced: invalid pde level %d", lvl)); tpde = pmap_load(pde); KASSERT((tpde & ATTR_DESCR_MASK) == L2_TABLE, ("pmap_ts_referenced: found an invalid l2 table")); pte = pmap_l2_to_l3(pde, pv->pv_va); tpte = pmap_load(pte); + if (pmap_page_dirty(tpte)) + vm_page_dirty(m); if ((tpte & ATTR_AF) != 0) { if (safe_to_clear_referenced(pmap, tpte)) { /* * TODO: We don't handle the access flag * at all. We need to be able to set it in * the exception handler. */ panic("ARM64TODO: safe_to_clear_referenced\n"); } else if ((tpte & ATTR_SW_WIRED) == 0) { /* * Wired pages cannot be paged out so * doing accessed bit emulation for * them is wasted effort. We do the * hard work for unwired pages only. */ pmap_remove_l3(pmap, pte, pv->pv_va, tpde, &free, &lock); pmap_invalidate_page(pmap, pv->pv_va); cleared++; if (pvf == pv) pvf = NULL; pv = NULL; KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); } else not_cleared++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (pv != NULL && TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; } } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != pvf && cleared + not_cleared < PMAP_TS_REFERENCED_MAX); out: rw_wunlock(lock); pmap_free_zero_pages(&free); return (cleared + not_cleared); } /* * Apply the given advice to the specified range of addresses within the * given pmap. Depending on the advice, clear the referenced and/or * modified flags in each mapping and set the mapped page's dirty field. */ void pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice) { } /* * Clear the modify bits on the specified physical page. */ void pmap_clear_modify(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("pmap_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PGA_WRITEABLE, then no PTEs can have PG_M set. * If the object containing the page is locked and the page is not * exclusive busied, then PGA_WRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; /* ARM64TODO: We lack support for tracking if a page is modified */ } void * pmap_mapbios(vm_paddr_t pa, vm_size_t size) { return ((void *)PHYS_TO_DMAP(pa)); } void pmap_unmapbios(vm_paddr_t pa, vm_size_t size) { } /* * Sets the memory attribute for the specified page. */ void pmap_page_set_memattr(vm_page_t m, vm_memattr_t ma) { m->md.pv_memattr = ma; /* * If "m" is a normal page, update its direct mapping. 
This update * can be relied upon to perform any cache operations that are * required for data coherence. */ if ((m->flags & PG_FICTITIOUS) == 0 && pmap_change_attr(PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)), PAGE_SIZE, m->md.pv_memattr) != 0) panic("memory attribute change on the direct map failed"); } /* * Changes the specified virtual address range's memory type to that given by * the parameter "mode". The specified virtual address range must be * completely contained within either the direct map or the kernel map. If * the virtual address range is contained within the kernel map, then the * memory type for each of the corresponding ranges of the direct map is also * changed. (The corresponding ranges of the direct map are those ranges that * map the same physical pages as the specified virtual address range.) These * changes to the direct map are necessary because the behavior of the * processor is undefined if two or more mappings to the same physical page * have different memory types. * * Returns zero if the change completed successfully, and either EINVAL or * ENOMEM if the change failed. Specifically, EINVAL is returned if some part * of the virtual address range was not mapped, and ENOMEM is returned if * there was insufficient memory available to complete the change. In the * latter case, the memory type may have been changed on some part of the * virtual address range or the direct map. */ static int pmap_change_attr(vm_offset_t va, vm_size_t size, int mode) { int error; PMAP_LOCK(kernel_pmap); error = pmap_change_attr_locked(va, size, mode); PMAP_UNLOCK(kernel_pmap); return (error); } static int pmap_change_attr_locked(vm_offset_t va, vm_size_t size, int mode) { vm_offset_t base, offset, tmpva; pt_entry_t l3, *pte, *newpte; int lvl; PMAP_LOCK_ASSERT(kernel_pmap, MA_OWNED); base = trunc_page(va); offset = va & PAGE_MASK; size = round_page(offset + size); if (!VIRT_IN_DMAP(base)) return (EINVAL); for (tmpva = base; tmpva < base + size; ) { pte = pmap_pte(kernel_pmap, tmpva, &lvl); if (pte == NULL) return (EINVAL); if ((pmap_load(pte) & ATTR_IDX_MASK) == ATTR_IDX(mode)) { /* * We already have the correct attribute, * ignore this entry. */ switch (lvl) { default: panic("Invalid DMAP table level: %d\n", lvl); case 1: tmpva = (tmpva & ~L1_OFFSET) + L1_SIZE; break; case 2: tmpva = (tmpva & ~L2_OFFSET) + L2_SIZE; break; case 3: tmpva += PAGE_SIZE; break; } } else { /* * Split the entry to a level 3 table, then * set the new attribute. */ switch (lvl) { default: panic("Invalid DMAP table level: %d\n", lvl); case 1: newpte = pmap_demote_l1(kernel_pmap, pte, tmpva & ~L1_OFFSET); if (newpte == NULL) return (EINVAL); pte = pmap_l1_to_l2(pte, tmpva); /* FALLTHROUGH */ case 2: newpte = pmap_demote_l2(kernel_pmap, pte, tmpva & ~L2_OFFSET); if (newpte == NULL) return (EINVAL); pte = pmap_l2_to_l3(pte, tmpva); /* FALLTHROUGH */ case 3: /* Update the entry */ l3 = pmap_load(pte); l3 &= ~ATTR_IDX_MASK; l3 |= ATTR_IDX(mode); pmap_update_entry(kernel_pmap, pte, l3, tmpva, PAGE_SIZE); /* * If moving to a non-cacheable entry, flush * the cache. */ if (mode == VM_MEMATTR_UNCACHEABLE) cpu_dcache_wbinv_range(tmpva, L3_SIZE); break; } tmpva += PAGE_SIZE; } } return (0); } /* * Create an L2 table to map all addresses within an L1 mapping.
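 *
 * Editor's note: with 4KB granules an L1 block maps 1GB and demotes
 * into Ln_ENTRIES (512) L2 blocks of L2_SIZE (2MB) each.  The fill
 * loop below therefore amounts to (sketch, same macros as the code):
 *
 *	for (i = 0; i < Ln_ENTRIES; i++)
 *		l2[i] = (oldl1 & ATTR_MASK) |
 *		    ((oldl1 & ~ATTR_MASK) + i * L2_SIZE);
 *
 * so every new entry inherits the old attributes and maps a 2MB slice
 * of the original 1GB range.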
*/ static pt_entry_t * pmap_demote_l1(pmap_t pmap, pt_entry_t *l1, vm_offset_t va) { pt_entry_t *l2, newl2, oldl1; vm_offset_t tmpl1; vm_paddr_t l2phys, phys; vm_page_t ml2; int i; PMAP_LOCK_ASSERT(pmap, MA_OWNED); oldl1 = pmap_load(l1); KASSERT((oldl1 & ATTR_DESCR_MASK) == L1_BLOCK, ("pmap_demote_l1: Demoting a non-block entry")); KASSERT((va & L1_OFFSET) == 0, ("pmap_demote_l1: Invalid virtual address %#lx", va)); KASSERT((oldl1 & ATTR_SW_MANAGED) == 0, ("pmap_demote_l1: Level 1 table shouldn't be managed")); tmpl1 = 0; if (va <= (vm_offset_t)l1 && va + L1_SIZE > (vm_offset_t)l1) { tmpl1 = kva_alloc(PAGE_SIZE); if (tmpl1 == 0) return (NULL); } if ((ml2 = vm_page_alloc(NULL, 0, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { CTR2(KTR_PMAP, "pmap_demote_l1: failure for va %#lx" " in pmap %p", va, pmap); return (NULL); } l2phys = VM_PAGE_TO_PHYS(ml2); l2 = (pt_entry_t *)PHYS_TO_DMAP(l2phys); /* The address the range points at */ phys = oldl1 & ~ATTR_MASK; /* The attributes from the old l1 table to be copied */ newl2 = oldl1 & ATTR_MASK; /* Create the new entries */ for (i = 0; i < Ln_ENTRIES; i++) { l2[i] = newl2 | phys; phys += L2_SIZE; } cpu_dcache_wb_range((vm_offset_t)l2, PAGE_SIZE); KASSERT(l2[0] == ((oldl1 & ~ATTR_DESCR_MASK) | L2_BLOCK), ("Invalid l2 page (%lx != %lx)", l2[0], (oldl1 & ~ATTR_DESCR_MASK) | L2_BLOCK)); if (tmpl1 != 0) { pmap_kenter(tmpl1, PAGE_SIZE, DMAP_TO_PHYS((vm_offset_t)l1) & ~L3_OFFSET, CACHED_MEMORY); l1 = (pt_entry_t *)(tmpl1 + ((vm_offset_t)l1 & PAGE_MASK)); } pmap_update_entry(pmap, l1, l2phys | L1_TABLE, va, PAGE_SIZE); if (tmpl1 != 0) { pmap_kremove(tmpl1); kva_free(tmpl1, PAGE_SIZE); } return (l2); } /* * Create an L3 table to map all addresses within an L2 mapping. */ static pt_entry_t * pmap_demote_l2_locked(pmap_t pmap, pt_entry_t *l2, vm_offset_t va, struct rwlock **lockp) { pt_entry_t *l3, newl3, oldl2; vm_offset_t tmpl2; vm_paddr_t l3phys, phys; vm_page_t ml3; int i; PMAP_LOCK_ASSERT(pmap, MA_OWNED); l3 = NULL; oldl2 = pmap_load(l2); KASSERT((oldl2 & ATTR_DESCR_MASK) == L2_BLOCK, ("pmap_demote_l2: Demoting a non-block entry")); KASSERT((va & L2_OFFSET) == 0, ("pmap_demote_l2: Invalid virtual address %#lx", va)); tmpl2 = 0; if (va <= (vm_offset_t)l2 && va + L2_SIZE > (vm_offset_t)l2) { tmpl2 = kva_alloc(PAGE_SIZE); if (tmpl2 == 0) return (NULL); } if ((ml3 = pmap_lookup_pt_page(pmap, va)) != NULL) { pmap_remove_pt_page(pmap, ml3); } else { ml3 = vm_page_alloc(NULL, pmap_l2_pindex(va), (VIRT_IN_DMAP(va) ? VM_ALLOC_INTERRUPT : VM_ALLOC_NORMAL) | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (ml3 == NULL) { CTR2(KTR_PMAP, "pmap_demote_l2: failure for va %#lx" " in pmap %p", va, pmap); goto fail; } if (va < VM_MAXUSER_ADDRESS) pmap_resident_count_inc(pmap, 1); } l3phys = VM_PAGE_TO_PHYS(ml3); l3 = (pt_entry_t *)PHYS_TO_DMAP(l3phys); /* The address the range points at */ phys = oldl2 & ~ATTR_MASK; /* The attributes from the old l2 table to be copied */ newl3 = (oldl2 & (ATTR_MASK & ~ATTR_DESCR_MASK)) | L3_PAGE; /* * If the page table page is new, initialize it. */ if (ml3->wire_count == 1) { for (i = 0; i < Ln_ENTRIES; i++) { l3[i] = newl3 | phys; phys += L3_SIZE; } cpu_dcache_wb_range((vm_offset_t)l3, PAGE_SIZE); } KASSERT(l3[0] == ((oldl2 & ~ATTR_DESCR_MASK) | L3_PAGE), ("Invalid l3 page (%lx != %lx)", l3[0], (oldl2 & ~ATTR_DESCR_MASK) | L3_PAGE)); /* * Map the temporary page so we don't lose access to the l2 table.
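 *
 * Editor's note: the earlier guard ("va <= (vm_offset_t)l2 && va +
 * L2_SIZE > (vm_offset_t)l2") catches the self-referential case where
 * the 2MB block being demoted maps the page holding the l2 entry
 * itself; the break-before-make update would make that mapping
 * momentarily invalid, so the entry is rewritten through a temporary
 * KVA mapping instead.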
*/ if (tmpl2 != 0) { pmap_kenter(tmpl2, PAGE_SIZE, DMAP_TO_PHYS((vm_offset_t)l2) & ~L3_OFFSET, CACHED_MEMORY); l2 = (pt_entry_t *)(tmpl2 + ((vm_offset_t)l2 & PAGE_MASK)); } /* * The spare PV entries must be reserved prior to demoting the * mapping, that is, prior to changing the PDE. Otherwise, the state * of the L2 and the PV lists will be inconsistent, which can result * in reclaim_pv_chunk() attempting to remove a PV entry from the * wrong PV list and pmap_pv_demote_l2() failing to find the expected * PV entry for the 2MB page mapping that is being demoted. */ if ((oldl2 & ATTR_SW_MANAGED) != 0) reserve_pv_entries(pmap, Ln_ENTRIES - 1, lockp); pmap_update_entry(pmap, l2, l3phys | L2_TABLE, va, PAGE_SIZE); /* * Demote the PV entry. */ if ((oldl2 & ATTR_SW_MANAGED) != 0) pmap_pv_demote_l2(pmap, va, oldl2 & ~ATTR_MASK, lockp); atomic_add_long(&pmap_l2_demotions, 1); CTR3(KTR_PMAP, "pmap_demote_l2: success for va %#lx" " in pmap %p %lx", va, pmap, l3[0]); fail: if (tmpl2 != 0) { pmap_kremove(tmpl2); kva_free(tmpl2, PAGE_SIZE); } return (l3); } static pt_entry_t * pmap_demote_l2(pmap_t pmap, pt_entry_t *l2, vm_offset_t va) { struct rwlock *lock; pt_entry_t *l3; lock = NULL; l3 = pmap_demote_l2_locked(pmap, l2, va, &lock); if (lock != NULL) rw_wunlock(lock); return (l3); } /* * Perform the pmap work for mincore(2). */ int pmap_mincore(pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { pd_entry_t *l1p, l1; pd_entry_t *l2p, l2; pt_entry_t *l3p, l3; vm_paddr_t pa; bool managed; int val; PMAP_LOCK(pmap); retry: pa = 0; val = 0; managed = false; l1p = pmap_l1(pmap, addr); if (l1p == NULL) /* No l1 */ goto done; l1 = pmap_load(l1p); if ((l1 & ATTR_DESCR_MASK) == L1_INVAL) goto done; if ((l1 & ATTR_DESCR_MASK) == L1_BLOCK) { pa = (l1 & ~ATTR_MASK) | (addr & L1_OFFSET); managed = (l1 & ATTR_SW_MANAGED) == ATTR_SW_MANAGED; val = MINCORE_SUPER | MINCORE_INCORE; if (pmap_page_dirty(l1)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if ((l1 & ATTR_AF) == ATTR_AF) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; goto done; } l2p = pmap_l1_to_l2(l1p, addr); if (l2p == NULL) /* No l2 */ goto done; l2 = pmap_load(l2p); if ((l2 & ATTR_DESCR_MASK) == L2_INVAL) goto done; if ((l2 & ATTR_DESCR_MASK) == L2_BLOCK) { pa = (l2 & ~ATTR_MASK) | (addr & L2_OFFSET); managed = (l2 & ATTR_SW_MANAGED) == ATTR_SW_MANAGED; val = MINCORE_SUPER | MINCORE_INCORE; if (pmap_page_dirty(l2)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if ((l2 & ATTR_AF) == ATTR_AF) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; goto done; } l3p = pmap_l2_to_l3(l2p, addr); if (l3p == NULL) /* No l3 */ goto done; l3 = pmap_load(l3p); if ((l3 & ATTR_DESCR_MASK) == L3_INVAL) goto done; if ((l3 & ATTR_DESCR_MASK) == L3_PAGE) { pa = (l3 & ~ATTR_MASK) | (addr & L3_OFFSET); managed = (l3 & ATTR_SW_MANAGED) == ATTR_SW_MANAGED; val = MINCORE_INCORE; if (pmap_page_dirty(l3)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if ((l3 & ATTR_AF) == ATTR_AF) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; } done: if ((val & (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER)) != (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER) && managed) { /* Ensure that "PHYS_TO_VM_PAGE(pa)->object" doesn't change.
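 * vm_page_pa_tryrelock() may have to drop the pmap lock to take the
 * correct page lock; when it does, it returns non-zero and the walk
 * restarts at "retry" so the entries are re-read under consistent
 * locking.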
*/ if (vm_page_pa_tryrelock(pmap, pa, locked_pa)) goto retry; } else PA_UNLOCK_COND(*locked_pa); PMAP_UNLOCK(pmap); return (val); } void pmap_activate(struct thread *td) { pmap_t pmap; critical_enter(); pmap = vmspace_pmap(td->td_proc->p_vmspace); td->td_pcb->pcb_l0addr = vtophys(pmap->pm_l0); __asm __volatile("msr ttbr0_el1, %0" : : "r"(td->td_pcb->pcb_l0addr)); pmap_invalidate_all(pmap); critical_exit(); } void pmap_sync_icache(pmap_t pmap, vm_offset_t va, vm_size_t sz) { if (va >= VM_MIN_KERNEL_ADDRESS) { cpu_icache_sync_range(va, sz); } else { u_int len, offset; vm_paddr_t pa; /* Find the length of data in this page to flush */ offset = va & PAGE_MASK; len = imin(PAGE_SIZE - offset, sz); while (sz != 0) { /* Extract the physical address & find it in the DMAP */ pa = pmap_extract(pmap, va); if (pa != 0) cpu_icache_sync_range(PHYS_TO_DMAP(pa), len); /* Move to the next page */ sz -= len; va += len; /* Set the length for the next iteration */ len = imin(PAGE_SIZE, sz); } } } int pmap_fault(pmap_t pmap, uint64_t esr, uint64_t far) { #ifdef SMP uint64_t par; #endif switch (ESR_ELx_EXCEPTION(esr)) { case EXCP_DATA_ABORT_L: case EXCP_DATA_ABORT: break; default: return (KERN_FAILURE); } #ifdef SMP PMAP_LOCK(pmap); switch (esr & ISS_DATA_DFSC_MASK) { case ISS_DATA_DFSC_TF_L0: case ISS_DATA_DFSC_TF_L1: case ISS_DATA_DFSC_TF_L2: case ISS_DATA_DFSC_TF_L3: /* Ask the MMU to check the address */ if (pmap == kernel_pmap) par = arm64_address_translate_s1e1r(far); else par = arm64_address_translate_s1e0r(far); /* * If the translation was successful the address was invalid * due to a break-before-make sequence. We can unlock and * return success to the trap handler. */ if (PAR_SUCCESS(par)) { PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } break; default: break; } PMAP_UNLOCK(pmap); #endif return (KERN_FAILURE); } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more superpage mappings. */ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { vm_offset_t superpage_offset; if (size < L2_SIZE) return; if (object != NULL && (object->flags & OBJ_COLORED) != 0) offset += ptoa(object->pg_color); superpage_offset = offset & L2_OFFSET; if (size - ((L2_SIZE - superpage_offset) & L2_OFFSET) < L2_SIZE || (*addr & L2_OFFSET) == superpage_offset) return; if ((*addr & L2_OFFSET) < superpage_offset) *addr = (*addr & ~L2_OFFSET) + superpage_offset; else *addr = ((*addr + L2_OFFSET) & ~L2_OFFSET) + superpage_offset; } /** * Get the kernel virtual address of a set of physical pages. If there are * physical addresses not covered by the DMAP, perform a transient mapping * that will be removed when calling pmap_unmap_io_transient. * * \param page The pages the caller wishes to obtain the virtual * address of in the kernel memory map. * \param vaddr On return contains the kernel virtual memory address * of the pages passed in the page parameter. * \param count Number of pages passed in. * \param can_fault TRUE if the thread using the mapped pages can take * page faults, FALSE otherwise. * * \returns TRUE if the caller must call pmap_unmap_io_transient when * finished or FALSE otherwise. * */ boolean_t pmap_map_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; boolean_t needs_mapping; int error, i; /* * Allocate any KVA space that we need; this is done in a separate * loop to prevent calling vmem_alloc while pinned.
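 * vmem_alloc() with M_WAITOK may sleep, which must not happen while
 * the thread is pinned by sched_pin() below, so all allocations are
 * completed in this first pass.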
*/ needs_mapping = FALSE; for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (__predict_false(!PHYS_IN_DMAP(paddr))) { error = vmem_alloc(kernel_arena, PAGE_SIZE, M_BESTFIT | M_WAITOK, &vaddr[i]); KASSERT(error == 0, ("vmem_alloc failed: %d", error)); needs_mapping = TRUE; } else { vaddr[i] = PHYS_TO_DMAP(paddr); } } /* Exit early if everything is covered by the DMAP */ if (!needs_mapping) return (FALSE); if (!can_fault) sched_pin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (!PHYS_IN_DMAP(paddr)) { panic( "pmap_map_io_transient: TODO: Map out of DMAP data"); } } return (needs_mapping); } void pmap_unmap_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; int i; if (!can_fault) sched_unpin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (!PHYS_IN_DMAP(paddr)) { panic("ARM64TODO: pmap_unmap_io_transient: Unmap data"); } } } Index: projects/clang390-import/sys/cddl/compat/opensolaris/sys/random.h =================================================================== --- projects/clang390-import/sys/cddl/compat/opensolaris/sys/random.h (revision 305686) +++ projects/clang390-import/sys/cddl/compat/opensolaris/sys/random.h (revision 305687) @@ -1,37 +1,37 @@ /*- * Copyright (c) 2007 Pawel Jakub Dawidek * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _OPENSOLARIS_SYS_RANDOM_H_ #define _OPENSOLARIS_SYS_RANDOM_H_ #include_next #define random_get_bytes(p, s) read_random((p), (int)(s)) -#define random_get_pseudo_bytes(p, s) read_random((p), (int)(s)) +#define random_get_pseudo_bytes(p, s) arc4rand((p), (int)(s), 0) #endif /* !_OPENSOLARIS_SYS_RANDOM_H_ */ Index: projects/clang390-import/sys/conf/options.mips =================================================================== --- projects/clang390-import/sys/conf/options.mips (revision 305686) +++ projects/clang390-import/sys/conf/options.mips (revision 305687) @@ -1,148 +1,149 @@ # Copyright (c) 2001, 2008, Juniper Networks, Inc. # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. 
Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # 3. Neither the name of the Juniper Networks, Inc. nor the names of its # contributors may be used to endorse or promote products derived from # this software without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY JUNIPER NETWORKS AND CONTRIBUTORS ``AS IS'' AND # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL JUNIPER NETWORKS OR CONTRIBUTORS BE LIABLE # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF # SUCH DAMAGE. # # JNPR: options.mips,v 1.2 2006/09/15 12:52:34 # $FreeBSD$ CPU_MIPS4KC opt_global.h CPU_MIPS24K opt_global.h CPU_MIPS34K opt_global.h CPU_MIPS74K opt_global.h CPU_MIPS1004K opt_global.h CPU_MIPS1074K opt_global.h CPU_INTERAPTIV opt_global.h CPU_PROAPTIV opt_global.h CPU_MIPS32 opt_global.h CPU_MIPS64 opt_global.h CPU_SENTRY5 opt_global.h CPU_HAVEFPU opt_global.h CPU_SB1 opt_global.h CPU_CNMIPS opt_global.h CPU_RMI opt_global.h CPU_NLM opt_global.h CPU_BERI opt_global.h +CPU_MALTA opt_global.h # which MACHINE_ARCH architecture MIPS MIPSEL MIPS64 MIPS64EL MIPSN32 COMPAT_FREEBSD32 opt_compat.h YAMON opt_global.h CFE opt_global.h CFE_CONSOLE opt_global.h CFE_ENV opt_global.h CFE_ENV_SIZE opt_global.h GFB_DEBUG opt_gfb.h GFB_NO_FONT_LOADING opt_gfb.h GFB_NO_MODE_CHANGE opt_gfb.h NOFPU opt_global.h TICK_USE_YAMON_FREQ opt_global.h TICK_USE_MALTA_RTC opt_global.h # # The highest memory address that can be used by the kernel in units of KB. # MAXMEM opt_global.h # # Manual override of cache config # MIPS_DISABLE_L1_CACHE opt_global.h # # Options that control the Cavium Simple Executive. # OCTEON_MODEL opt_cvmx.h OCTEON_VENDOR_LANNER opt_cvmx.h OCTEON_VENDOR_UBIQUITI opt_cvmx.h OCTEON_VENDOR_RADISYS opt_cvmx.h OCTEON_VENDOR_GEFES opt_cvmx.h OCTEON_BOARD_CAPK_0100ND opt_cvmx.h # # Options specific to the BERI platform. # BERI_LARGE_TLB opt_global.h # # Options that control the NetFPGA-10G Embedded CPU Ethernet Core. # NF10BMAC_64BIT opt_netfpga.h # # Options that control the Atheros SoC peripherals # ARGE_DEBUG opt_arge.h ARGE_MDIO opt_arge.h # # At least one of the AR71XX ubiquiti boards has a Redboot configuration # that "lies" about the amount of RAM it has. Until a cleaner method is # defined, this option will suffice in overriding what Redboot says. # AR71XX_REALMEM opt_ar71xx.h AR71XX_ENV_UBOOT opt_ar71xx.h AR71XX_ENV_REDBOOT opt_ar71xx.h AR71XX_ENV_ROUTERBOOT opt_ar71xx.h AR71XX_ATH_EEPROM opt_ar71xx.h # # Options that control the Ralink RT305xF Ethernet MAC. # IF_RT_DEBUG opt_if_rt.h IF_RT_PHY_SUPPORT opt_if_rt.h IF_RT_RING_DATA_COUNT opt_if_rt.h # # Options that control the Ralink/Mediatek SoC type.
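#
# Editor's note: a kernel configuration file selects one of these with
# an "options" line; a hypothetical config fragment might read:
#
#	options 	RT5350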
# MT7620 opt_rt305x.h RT5350 opt_rt305x.h RT305XF opt_rt305x.h RT3052F opt_rt305x.h RT3050F opt_rt305x.h RT305X opt_rt305x.h RT305X_UBOOT opt_rt305x.h RT305X_USE_UART opt_rt305x.h # # Options that affect the pmap. # PV_STATS opt_pmap.h # # Options to use INTRNG code # INTRNG opt_global.h MIPS_NIRQ opt_global.h Index: projects/clang390-import/sys/ddb/db_command.c =================================================================== --- projects/clang390-import/sys/ddb/db_command.c (revision 305686) +++ projects/clang390-import/sys/ddb/db_command.c (revision 305687) @@ -1,888 +1,888 @@ /*- * Mach Operating System * Copyright (c) 1991,1990 Carnegie Mellon University * All Rights Reserved. * * Permission to use, copy, modify and distribute this software and its * documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR * ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. */ /* * Author: David B. Golub, Carnegie Mellon University * Date: 7/90 */ /* * Command dispatcher. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Exported global variables */ -bool db_cmd_loop_done; +int db_cmd_loop_done; db_addr_t db_dot; db_addr_t db_last_addr; db_addr_t db_prev; db_addr_t db_next; static db_cmdfcn_t db_dump; static db_cmdfcn_t db_fncall; static db_cmdfcn_t db_gdb; static db_cmdfcn_t db_halt; static db_cmdfcn_t db_kill; static db_cmdfcn_t db_reset; static db_cmdfcn_t db_stack_trace; static db_cmdfcn_t db_stack_trace_active; static db_cmdfcn_t db_stack_trace_all; static db_cmdfcn_t db_watchdog; /* * 'show' commands */ static struct command db_show_active_cmds[] = { { "trace", db_stack_trace_active, 0, NULL }, }; struct command_table db_show_active_table = LIST_HEAD_INITIALIZER(db_show_active_table); static struct command db_show_all_cmds[] = { { "trace", db_stack_trace_all, 0, NULL }, }; struct command_table db_show_all_table = LIST_HEAD_INITIALIZER(db_show_all_table); static struct command db_show_cmds[] = { { "active", 0, 0, &db_show_active_table }, { "all", 0, 0, &db_show_all_table }, { "registers", db_show_regs, 0, NULL }, { "breaks", db_listbreak_cmd, 0, NULL }, { "threads", db_show_threads, 0, NULL }, }; struct command_table db_show_table = LIST_HEAD_INITIALIZER(db_show_table); static struct command db_cmds[] = { { "print", db_print_cmd, 0, NULL }, { "p", db_print_cmd, 0, NULL }, { "examine", db_examine_cmd, CS_SET_DOT, NULL }, { "x", db_examine_cmd, CS_SET_DOT, NULL }, { "search", db_search_cmd, CS_OWN|CS_SET_DOT, NULL }, { "set", db_set_cmd, CS_OWN, NULL }, { "write", db_write_cmd, CS_MORE|CS_SET_DOT, NULL }, { "w", db_write_cmd, CS_MORE|CS_SET_DOT, NULL }, { "delete", db_delete_cmd, 0, NULL }, { "d", db_delete_cmd, 0, NULL }, { "dump", db_dump, 0, NULL }, {
"break", db_breakpoint_cmd, 0, NULL }, { "b", db_breakpoint_cmd, 0, NULL }, { "dwatch", db_deletewatch_cmd, 0, NULL }, { "watch", db_watchpoint_cmd, CS_MORE,NULL }, { "dhwatch", db_deletehwatch_cmd, 0, NULL }, { "hwatch", db_hwatchpoint_cmd, 0, NULL }, { "step", db_single_step_cmd, 0, NULL }, { "s", db_single_step_cmd, 0, NULL }, { "continue", db_continue_cmd, 0, NULL }, { "c", db_continue_cmd, 0, NULL }, { "until", db_trace_until_call_cmd,0, NULL }, { "next", db_trace_until_matching_cmd,0, NULL }, { "match", db_trace_until_matching_cmd,0, NULL }, { "trace", db_stack_trace, CS_OWN, NULL }, { "t", db_stack_trace, CS_OWN, NULL }, /* XXX alias for active trace */ { "acttrace", db_stack_trace_active, 0, NULL }, /* XXX alias for all trace */ { "alltrace", db_stack_trace_all, 0, NULL }, { "where", db_stack_trace, CS_OWN, NULL }, { "bt", db_stack_trace, CS_OWN, NULL }, { "call", db_fncall, CS_OWN, NULL }, { "show", 0, 0, &db_show_table }, { "ps", db_ps, 0, NULL }, { "gdb", db_gdb, 0, NULL }, { "halt", db_halt, 0, NULL }, { "reboot", db_reset, 0, NULL }, { "reset", db_reset, 0, NULL }, { "kill", db_kill, CS_OWN, NULL }, { "watchdog", db_watchdog, CS_OWN, NULL }, { "thread", db_set_thread, CS_OWN, NULL }, { "run", db_run_cmd, CS_OWN, NULL }, { "script", db_script_cmd, CS_OWN, NULL }, { "scripts", db_scripts_cmd, 0, NULL }, { "unscript", db_unscript_cmd, CS_OWN, NULL }, { "capture", db_capture_cmd, CS_OWN, NULL }, { "textdump", db_textdump_cmd, CS_OWN, NULL }, { "findstack", db_findstack_cmd, 0, NULL }, }; struct command_table db_cmd_table = LIST_HEAD_INITIALIZER(db_cmd_table); static struct command *db_last_command = NULL; /* * if 'ed' style: 'dot' is set at start of last item printed, * and '+' points to next line. * Otherwise: 'dot' points to next item, '..' points to last. */ static bool db_ed_style = true; /* * Utility routine - discard tokens through end-of-line. */ void db_skip_to_eol(void) { int t; do { t = db_read_token(); } while (t != tEOL); } /* * Results of command search. */ #define CMD_UNIQUE 0 #define CMD_FOUND 1 #define CMD_NONE 2 #define CMD_AMBIGUOUS 3 #define CMD_HELP 4 static void db_cmd_match(char *name, struct command *cmd, struct command **cmdp, int *resultp); static void db_cmd_list(struct command_table *table); static int db_cmd_search(char *name, struct command_table *table, struct command **cmdp); static void db_command(struct command **last_cmdp, struct command_table *cmd_table, int dopager); /* * Initialize the command lists from the static tables. */ void db_command_init(void) { #define N(a) (sizeof(a) / sizeof(a[0])) int i; for (i = 0; i < N(db_cmds); i++) db_command_register(&db_cmd_table, &db_cmds[i]); for (i = 0; i < N(db_show_cmds); i++) db_command_register(&db_show_table, &db_show_cmds[i]); for (i = 0; i < N(db_show_active_cmds); i++) db_command_register(&db_show_active_table, &db_show_active_cmds[i]); for (i = 0; i < N(db_show_all_cmds); i++) db_command_register(&db_show_all_table, &db_show_all_cmds[i]); #undef N } /* * Register a command. */ void db_command_register(struct command_table *list, struct command *cmd) { struct command *c, *last; last = NULL; LIST_FOREACH(c, list, next) { int n = strcmp(cmd->name, c->name); /* Check that the command is not already present. 
*/ if (n == 0) { printf("%s: Warning, the command \"%s\" already exists;" " ignoring request\n", __func__, cmd->name); return; } if (n < 0) { /* NB: keep list sorted lexicographically */ LIST_INSERT_BEFORE(c, cmd, next); return; } last = c; } if (last == NULL) LIST_INSERT_HEAD(list, cmd, next); else LIST_INSERT_AFTER(last, cmd, next); } /* * Remove a command previously registered with db_command_register. */ void db_command_unregister(struct command_table *list, struct command *cmd) { struct command *c; LIST_FOREACH(c, list, next) { if (cmd == c) { LIST_REMOVE(cmd, next); return; } } /* NB: intentionally quiet */ } /* * Helper function to match a single command. */ static void db_cmd_match(char *name, struct command *cmd, struct command **cmdp, int *resultp) { char *lp, *rp; int c; lp = name; rp = cmd->name; while ((c = *lp) == *rp) { if (c == 0) { /* complete match */ *cmdp = cmd; *resultp = CMD_UNIQUE; return; } lp++; rp++; } if (c == 0) { /* end of name, not end of command - partial match */ if (*resultp == CMD_FOUND) { *resultp = CMD_AMBIGUOUS; /* but keep looking for a full match - this lets us match single letters */ } else { *cmdp = cmd; *resultp = CMD_FOUND; } } } /* * Search for command prefix. */ static int db_cmd_search(char *name, struct command_table *table, struct command **cmdp) { struct command *cmd; int result = CMD_NONE; LIST_FOREACH(cmd, table, next) { db_cmd_match(name,cmd,cmdp,&result); if (result == CMD_UNIQUE) break; } if (result == CMD_NONE) { /* check for 'help' */ if (name[0] == 'h' && name[1] == 'e' && name[2] == 'l' && name[3] == 'p') result = CMD_HELP; } return (result); } static void db_cmd_list(struct command_table *table) { struct command *cmd; LIST_FOREACH(cmd, table, next) { db_printf("%-16s", cmd->name); db_end_line(16); } } static void db_command(struct command **last_cmdp, struct command_table *cmd_table, int dopager) { struct command *cmd = NULL; int t; char modif[TOK_STRING_SIZE]; db_expr_t addr, count; bool have_addr = false; int result; t = db_read_token(); if (t == tEOL) { /* empty line repeats last command, at 'next' */ cmd = *last_cmdp; addr = (db_expr_t)db_next; have_addr = false; count = 1; modif[0] = '\0'; } else if (t == tEXCL) { db_fncall((db_expr_t)0, (bool)false, (db_expr_t)0, (char *)0); return; } else if (t != tIDENT) { db_printf("?\n"); db_flush_lex(); return; } else { /* * Search for command */ while (cmd_table) { result = db_cmd_search(db_tok_string, cmd_table, &cmd); switch (result) { case CMD_NONE: db_printf("No such command\n"); db_flush_lex(); return; case CMD_AMBIGUOUS: db_printf("Ambiguous\n"); db_flush_lex(); return; case CMD_HELP: db_cmd_list(cmd_table); db_flush_lex(); return; default: break; } if ((cmd_table = cmd->more) != NULL) { t = db_read_token(); if (t != tIDENT) { db_cmd_list(cmd_table); db_flush_lex(); return; } } } if ((cmd->flag & CS_OWN) == 0) { /* * Standard syntax: * command [/modifier] [addr] [,count] */ t = db_read_token(); if (t == tSLASH) { t = db_read_token(); if (t != tIDENT) { db_printf("Bad modifier\n"); db_flush_lex(); return; } db_strcpy(modif, db_tok_string); } else { db_unread_token(t); modif[0] = '\0'; } if (db_expression(&addr)) { db_dot = (db_addr_t) addr; db_last_addr = db_dot; have_addr = true; } else { addr = (db_expr_t) db_dot; have_addr = false; } t = db_read_token(); if (t == tCOMMA) { if (!db_expression(&count)) { db_printf("Count missing\n"); db_flush_lex(); return; } } else { db_unread_token(t); count = -1; } if ((cmd->flag & CS_MORE) == 0) { db_skip_to_eol(); } } } *last_cmdp = cmd; if 
(cmd != NULL) { /* * Execute the command. */ if (dopager) db_enable_pager(); else db_disable_pager(); (*cmd->fcn)(addr, have_addr, count, modif); if (dopager) db_disable_pager(); if (cmd->flag & CS_SET_DOT) { /* * If command changes dot, set dot to * previous address displayed (if 'ed' style). */ if (db_ed_style) { db_dot = db_prev; } else { db_dot = db_next; } } else { /* * If command does not change dot, * set 'next' location to be the same. */ db_next = db_dot; } } } /* * At least one non-optional command must be implemented using * DB_COMMAND() so that db_cmd_set gets created. Here is one. */ DB_COMMAND(panic, db_panic) { db_disable_pager(); panic("from debugger"); } void db_command_loop(void) { /* * Initialize 'prev' and 'next' to dot. */ db_prev = db_dot; db_next = db_dot; db_cmd_loop_done = 0; while (!db_cmd_loop_done) { if (db_print_position() != 0) db_printf("\n"); db_printf("db> "); (void) db_read_line(); db_command(&db_last_command, &db_cmd_table, /* dopager */ 1); } } /* * Execute a command on behalf of a script. The caller is responsible for * making sure that the command string is < DB_MAXLINE or it will be * truncated. * * XXXRW: Runs by injecting faked input into DDB input stream; it would be * nicer to use an alternative approach that didn't mess with the previous * command buffer. */ void db_command_script(const char *command) { db_prev = db_next = db_dot; db_inject_line(command); db_command(&db_last_command, &db_cmd_table, /* dopager */ 0); } void db_error(const char *s) { if (s) db_printf("%s", s); db_flush_lex(); kdb_reenter(); } static void db_dump(db_expr_t dummy, bool dummy2, db_expr_t dummy3, char *dummy4) { int error; if (textdump_pending) { db_printf("textdump_pending set.\n" "run \"textdump unset\" first or \"textdump dump\" for a textdump.\n"); return; } error = doadump(false); if (error) { db_printf("Cannot dump: "); switch (error) { case EBUSY: db_printf("debugger got invoked while dumping.\n"); break; case ENXIO: db_printf("no dump device specified.\n"); break; default: db_printf("unknown error (error=%d).\n", error); break; } } } /* * Call random function: * !expr(arg,arg,arg) */ /* The generic implementation supports a maximum of 10 arguments. 
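 *
 * Editor's note: at the ddb prompt this is reached with "call" or the
 * "!" shorthand; a hypothetical session (myfunc is a placeholder for
 * any kernel function) might look like:
 *
 *	db> call myfunc(1, 2)
 *	= 0
 *
 * where the "=" line is the return value printed by db_fncall() below.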
*/ typedef db_expr_t __db_f(db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t, db_expr_t); static __inline int db_fncall_generic(db_expr_t addr, db_expr_t *rv, int nargs, db_expr_t args[]) { __db_f *f = (__db_f *)addr; if (nargs > 10) { db_printf("Too many arguments (max 10)\n"); return (0); } *rv = (*f)(args[0], args[1], args[2], args[3], args[4], args[5], args[6], args[7], args[8], args[9]); return (1); } static void db_fncall(db_expr_t dummy1, bool dummy2, db_expr_t dummy3, char *dummy4) { db_expr_t fn_addr; db_expr_t args[DB_MAXARGS]; int nargs = 0; db_expr_t retval; int t; if (!db_expression(&fn_addr)) { db_printf("Bad function\n"); db_flush_lex(); return; } t = db_read_token(); if (t == tLPAREN) { if (db_expression(&args[0])) { nargs++; while ((t = db_read_token()) == tCOMMA) { if (nargs == DB_MAXARGS) { db_printf("Too many arguments (max %d)\n", DB_MAXARGS); db_flush_lex(); return; } if (!db_expression(&args[nargs])) { db_printf("Argument missing\n"); db_flush_lex(); return; } nargs++; } db_unread_token(t); } if (db_read_token() != tRPAREN) { db_printf("?\n"); db_flush_lex(); return; } } db_skip_to_eol(); db_disable_pager(); if (DB_CALL(fn_addr, &retval, nargs, args)) db_printf("= %#lr\n", (long)retval); } static void db_halt(db_expr_t dummy, bool dummy2, db_expr_t dummy3, char *dummy4) { cpu_halt(); } static void db_kill(db_expr_t dummy1, bool dummy2, db_expr_t dummy3, char *dummy4) { db_expr_t old_radix, pid, sig; struct proc *p; #define DB_ERROR(f) do { db_printf f; db_flush_lex(); goto out; } while (0) /* * PIDs and signal numbers are typically represented in base * 10, so make that the default here. It can, of course, be * overridden by specifying a prefix. */ old_radix = db_radix; db_radix = 10; /* Retrieve arguments. */ if (!db_expression(&sig)) DB_ERROR(("Missing signal number\n")); if (!db_expression(&pid)) DB_ERROR(("Missing process ID\n")); db_skip_to_eol(); if (!_SIG_VALID(sig)) DB_ERROR(("Signal number out of range\n")); /* * Find the process in question. allproc_lock is not needed * since we're in DDB. */ /* sx_slock(&allproc_lock); */ FOREACH_PROC_IN_SYSTEM(p) if (p->p_pid == pid) break; /* sx_sunlock(&allproc_lock); */ if (p == NULL) DB_ERROR(("Can't find process with pid %ld\n", (long) pid)); /* If it's already locked, bail; otherwise, do the deed. */ if (PROC_TRYLOCK(p) == 0) DB_ERROR(("Can't lock process with pid %ld\n", (long) pid)); else { pksignal(p, sig, NULL); PROC_UNLOCK(p); } out: db_radix = old_radix; #undef DB_ERROR } /* * Reboot. In case there is an additional argument, take it as delay in * seconds. Default to 15s if we cannot parse it and make sure we will * never wait longer than 1 week. Some code is similar to * kern_shutdown.c:shutdown_panic(). */ #ifndef DB_RESET_MAXDELAY #define DB_RESET_MAXDELAY (3600 * 24 * 7) #endif static void db_reset(db_expr_t addr, bool have_addr, db_expr_t count __unused, char *modif __unused) { int delay, loop; if (have_addr) { delay = (int)db_hex2dec(addr); /* If we fail to parse, use 15s. */ if (delay == -1) delay = 15; /* Cap at one week. */ if ((uintmax_t)delay > (uintmax_t)DB_RESET_MAXDELAY) delay = DB_RESET_MAXDELAY; db_printf("Automatic reboot in %d seconds - " "press a key on the console to abort\n", delay); for (loop = delay * 10; loop > 0; --loop) { DELAY(1000 * 100); /* 1/10th second */ /* Did user type a key?
*/ if (cncheckc() != -1) return; } } cpu_reset(); } static void db_watchdog(db_expr_t dummy1, bool dummy2, db_expr_t dummy3, char *dummy4) { db_expr_t old_radix, tout; int err, i; old_radix = db_radix; db_radix = 10; err = db_expression(&tout); db_skip_to_eol(); db_radix = old_radix; /* If no argument is provided the watchdog will just be disabled. */ if (err == 0) { db_printf("No argument provided, disabling watchdog\n"); tout = 0; } else if ((tout & WD_INTERVAL) == WD_TO_NEVER) { db_error("Out of range watchdog interval\n"); return; } EVENTHANDLER_INVOKE(watchdog_list, tout, &i); } static void db_gdb(db_expr_t dummy1, bool dummy2, db_expr_t dummy3, char *dummy4) { if (kdb_dbbe_select("gdb") != 0) { db_printf("The remote GDB backend could not be selected.\n"); return; } /* * Mark that we are done in the debugger. kdb_trap() * should re-enter with the new backend. */ db_cmd_loop_done = 1; db_printf("(ctrl-c will return control to ddb)\n"); } static void db_stack_trace(db_expr_t tid, bool hastid, db_expr_t count, char *modif) { struct thread *td; db_expr_t radix; pid_t pid; int t; /* * We parse our own arguments. We don't like the default radix. */ radix = db_radix; db_radix = 10; hastid = db_expression(&tid); t = db_read_token(); if (t == tCOMMA) { if (!db_expression(&count)) { db_printf("Count missing\n"); db_flush_lex(); return; } } else { db_unread_token(t); count = -1; } db_skip_to_eol(); db_radix = radix; if (hastid) { td = kdb_thr_lookup((lwpid_t)tid); if (td == NULL) td = kdb_thr_from_pid((pid_t)tid); if (td == NULL) { db_printf("Thread %d not found\n", (int)tid); return; } } else td = kdb_thread; if (td->td_proc != NULL) pid = td->td_proc->p_pid; else pid = -1; db_printf("Tracing pid %d tid %ld td %p\n", pid, (long)td->td_tid, td); db_trace_thread(td, count); } static void _db_stack_trace_all(bool active_only) { struct proc *p; struct thread *td; jmp_buf jb; void *prev_jb; FOREACH_PROC_IN_SYSTEM(p) { prev_jb = kdb_jmpbuf(jb); if (setjmp(jb) == 0) { FOREACH_THREAD_IN_PROC(p, td) { if (td->td_state == TDS_RUNNING) db_printf("\nTracing command %s pid %d" " tid %ld td %p (CPU %d)\n", p->p_comm, p->p_pid, (long)td->td_tid, td, td->td_oncpu); else if (active_only) continue; else db_printf("\nTracing command %s pid %d" " tid %ld td %p\n", p->p_comm, p->p_pid, (long)td->td_tid, td); db_trace_thread(td, -1); if (db_pager_quit) { kdb_jmpbuf(prev_jb); return; } } } kdb_jmpbuf(prev_jb); } } static void db_stack_trace_active(db_expr_t dummy, bool dummy2, db_expr_t dummy3, char *dummy4) { _db_stack_trace_all(true); } static void db_stack_trace_all(db_expr_t dummy, bool dummy2, db_expr_t dummy3, char *dummy4) { _db_stack_trace_all(false); } /* * Take the parsed expression value from the command line that was parsed * as a hexadecimal value and convert it as if the expression was parsed * as a decimal value. Returns -1 if the expression was not a valid * decimal value. */ db_expr_t db_hex2dec(db_expr_t expr) { uintptr_t x, y; db_expr_t val; y = 1; val = 0; x = expr; while (x != 0) { if (x % 16 > 9) return (-1); val += (x % 16) * (y); x >>= 4; y *= 10; } return (val); } Index: projects/clang390-import/sys/ddb/db_main.c =================================================================== --- projects/clang390-import/sys/ddb/db_main.c (revision 305686) +++ projects/clang390-import/sys/ddb/db_main.c (revision 305687) @@ -1,282 +1,279 @@ /*- * Mach Operating System * Copyright (c) 1991,1990 Carnegie Mellon University * All Rights Reserved. 
* * Permission to use, copy, modify and distribute this software and its * documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR * ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include SYSCTL_NODE(_debug, OID_AUTO, ddb, CTLFLAG_RW, 0, "DDB settings"); static dbbe_init_f db_init; static dbbe_trap_f db_trap; static dbbe_trace_f db_trace_self_wrapper; static dbbe_trace_thread_f db_trace_thread_wrapper; KDB_BACKEND(ddb, db_init, db_trace_self_wrapper, db_trace_thread_wrapper, db_trap); /* * Symbols can be loaded by specifying the exact addresses of * the symtab and strtab in memory. This is used when loaded from * boot loaders different from the native one (like Xen). */ vm_offset_t ksymtab, kstrtab, ksymtab_size; bool X_db_line_at_pc(db_symtab_t *symtab, c_db_sym_t sym, char **file, int *line, db_expr_t off) { return (false); } c_db_sym_t X_db_lookup(db_symtab_t *symtab, const char *symbol) { c_linker_sym_t lsym; Elf_Sym *sym; if (symtab->private == NULL) { return ((c_db_sym_t)((!linker_ddb_lookup(symbol, &lsym)) ? lsym : NULL)); } else { sym = (Elf_Sym *)symtab->start; while ((char *)sym < symtab->end) { if (sym->st_name != 0 && !strcmp(symtab->private + sym->st_name, symbol)) return ((c_db_sym_t)sym); sym++; } } return (NULL); } c_db_sym_t X_db_search_symbol(db_symtab_t *symtab, db_addr_t off, db_strategy_t strat, db_expr_t *diffp) { c_linker_sym_t lsym; Elf_Sym *sym, *match; unsigned long diff; if (symtab->private == NULL) { if (!linker_ddb_search_symbol((caddr_t)off, &lsym, &diff)) { *diffp = (db_expr_t)diff; return ((c_db_sym_t)lsym); } return (NULL); } diff = ~0UL; match = NULL; for (sym = (Elf_Sym*)symtab->start; (char*)sym < symtab->end; sym++) { if (sym->st_name == 0 || sym->st_shndx == SHN_UNDEF) continue; if (off < sym->st_value) continue; if (ELF_ST_TYPE(sym->st_info) != STT_OBJECT && ELF_ST_TYPE(sym->st_info) != STT_FUNC && ELF_ST_TYPE(sym->st_info) != STT_NOTYPE) continue; if ((off - sym->st_value) > diff) continue; if ((off - sym->st_value) < diff) { diff = off - sym->st_value; match = sym; } else { if (match == NULL) match = sym; else if (ELF_ST_BIND(match->st_info) == STB_LOCAL && ELF_ST_BIND(sym->st_info) != STB_LOCAL) match = sym; } if (diff == 0) { if (strat == DB_STGY_PROC && ELF_ST_TYPE(sym->st_info) == STT_FUNC && ELF_ST_BIND(sym->st_info) != STB_LOCAL) break; if (strat == DB_STGY_ANY && ELF_ST_BIND(sym->st_info) != STB_LOCAL) break; } } *diffp = (match == NULL) ?
off : diff; return ((c_db_sym_t)match); } bool X_db_sym_numargs(db_symtab_t *symtab, c_db_sym_t sym, int *nargp, char **argp) { return (false); } void X_db_symbol_values(db_symtab_t *symtab, c_db_sym_t sym, const char **namep, db_expr_t *valp) { linker_symval_t lval; if (symtab->private == NULL) { linker_ddb_symbol_values((c_linker_sym_t)sym, &lval); if (namep != NULL) *namep = (const char*)lval.name; if (valp != NULL) *valp = (db_expr_t)lval.value; } else { if (namep != NULL) *namep = (const char *)symtab->private + ((const Elf_Sym *)sym)->st_name; if (valp != NULL) *valp = (db_expr_t)((const Elf_Sym *)sym)->st_value; } } int db_fetch_ksymtab(vm_offset_t ksym_start, vm_offset_t ksym_end) { Elf_Size strsz; if (ksym_end > ksym_start && ksym_start != 0) { ksymtab = ksym_start; ksymtab_size = *(Elf_Size*)ksymtab; ksymtab += sizeof(Elf_Size); kstrtab = ksymtab + ksymtab_size; strsz = *(Elf_Size*)kstrtab; kstrtab += sizeof(Elf_Size); if (kstrtab + strsz > ksym_end) { /* Sizes doesn't match, unset everything. */ ksymtab = ksymtab_size = kstrtab = 0; } } if (ksymtab == 0 || ksymtab_size == 0 || kstrtab == 0) return (-1); return (0); } static int db_init(void) { db_command_init(); if (ksymtab != 0 && kstrtab != 0 && ksymtab_size != 0) { db_add_symbol_table((char *)ksymtab, (char *)(ksymtab + ksymtab_size), "elf", (char *)kstrtab); } db_add_symbol_table(NULL, NULL, "kld", NULL); return (1); /* We're the default debugger. */ } static int db_trap(int type, int code) { jmp_buf jb; void *prev_jb; bool bkpt, watchpt; const char *why; /* * Don't handle the trap if the console is unavailable (i.e. it * is in graphics mode). */ if (cnunavailable()) return (0); - bkpt = IS_BREAKPOINT_TRAP(type, code); - watchpt = IS_WATCHPOINT_TRAP(type, code); - - if (db_stop_at_pc(&bkpt)) { + if (db_stop_at_pc(type, code, &bkpt, &watchpt)) { if (db_inst_count) { db_printf("After %d instructions (%d loads, %d stores),\n", db_inst_count, db_load_count, db_store_count); } prev_jb = kdb_jmpbuf(jb); if (setjmp(jb) == 0) { db_dot = PC_REGS(); db_print_thread(); if (bkpt) db_printf("Breakpoint at\t"); else if (watchpt) db_printf("Watchpoint at\t"); else db_printf("Stopped at\t"); db_print_loc_and_inst(db_dot); } why = kdb_why; db_script_kdbenter(why != KDB_WHY_UNSET ? why : "unknown"); db_command_loop(); (void)kdb_jmpbuf(prev_jb); } db_restart_at_pc(watchpt); return (1); } static void db_trace_self_wrapper(void) { jmp_buf jb; void *prev_jb; prev_jb = kdb_jmpbuf(jb); if (setjmp(jb) == 0) db_trace_self(); (void)kdb_jmpbuf(prev_jb); } static void db_trace_thread_wrapper(struct thread *td) { jmp_buf jb; void *prev_jb; prev_jb = kdb_jmpbuf(jb); if (setjmp(jb) == 0) db_trace_thread(td, -1); (void)kdb_jmpbuf(prev_jb); } Index: projects/clang390-import/sys/ddb/db_run.c =================================================================== --- projects/clang390-import/sys/ddb/db_run.c (revision 305686) +++ projects/clang390-import/sys/ddb/db_run.c (revision 305687) @@ -1,384 +1,386 @@ /*- * Mach Operating System * Copyright (c) 1991,1990 Carnegie Mellon University * All Rights Reserved. * * Permission to use, copy, modify and distribute this software and its * documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS * CONDITION. 
CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR * ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. * * Carnegie Mellon requests users of this software to return to * * Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU * School of Computer Science * Carnegie Mellon University * Pittsburgh PA 15213-3890 * * any improvements or extensions that they make and grant Carnegie the * rights to redistribute these changes. */ /* * Author: David B. Golub, Carnegie Mellon University * Date: 7/90 */ /* * Commands to run process. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include static int db_run_mode; #define STEP_NONE 0 #define STEP_ONCE 1 #define STEP_RETURN 2 #define STEP_CALLT 3 #define STEP_CONTINUE 4 #define STEP_INVISIBLE 5 #define STEP_COUNT 6 static bool db_sstep_print; static int db_loop_count; static int db_call_depth; int db_inst_count; int db_load_count; int db_store_count; #ifdef SOFTWARE_SSTEP db_breakpoint_t db_not_taken_bkpt = 0; db_breakpoint_t db_taken_bkpt = 0; #endif #ifndef db_set_single_step void db_set_single_step(void); #endif #ifndef db_clear_single_step void db_clear_single_step(void); #endif #ifndef db_pc_is_singlestep static bool db_pc_is_singlestep(db_addr_t pc) { #ifdef SOFTWARE_SSTEP if ((db_not_taken_bkpt != 0 && pc == db_not_taken_bkpt->address) || (db_taken_bkpt != 0 && pc == db_taken_bkpt->address)) return (true); #endif return (false); } #endif bool -db_stop_at_pc(bool *is_breakpoint) +db_stop_at_pc(int type, int code, bool *is_breakpoint, bool *is_watchpoint) { db_addr_t pc; db_breakpoint_t bkpt; + *is_breakpoint = IS_BREAKPOINT_TRAP(type, code); + *is_watchpoint = IS_WATCHPOINT_TRAP(type, code); pc = PC_REGS(); - if (db_pc_is_singlestep(pc)) *is_breakpoint = false; db_clear_single_step(); db_clear_breakpoints(); db_clear_watchpoints(); #ifdef FIXUP_PC_AFTER_BREAK if (*is_breakpoint) { /* * Breakpoint trap. Fix up the PC if the * machine requires it. */ FIXUP_PC_AFTER_BREAK pc = PC_REGS(); } #endif /* * Now check for a breakpoint at this address. 
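 *
 * The per-breakpoint count implements "stop every Nth hit"
 * semantics: it is decremented on each hit and the debugger stops
 * only when it reaches zero, at which point it is re-armed from
 * init_count.  An illustrative ddb session (hypothetical symbol
 * name) might be:
 *
 *	db> break vfs_busy,5
 *
 * which skips four hits of the breakpoint and stops on the fifth.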
*/ bkpt = db_find_breakpoint_here(pc); if (bkpt) { if (--bkpt->count == 0) { bkpt->count = bkpt->init_count; *is_breakpoint = true; return (true); /* stop here */ } + return (false); /* continue the countdown */ } else if (*is_breakpoint) { #ifdef BKPT_SKIP BKPT_SKIP; #endif } *is_breakpoint = false; if (db_run_mode == STEP_INVISIBLE) { db_run_mode = STEP_CONTINUE; return (false); /* continue */ } if (db_run_mode == STEP_COUNT) { return (false); /* continue */ } if (db_run_mode == STEP_ONCE) { if (--db_loop_count > 0) { if (db_sstep_print) { db_printf("\t\t"); db_print_loc_and_inst(pc); } return (false); /* continue */ } } if (db_run_mode == STEP_RETURN) { /* continue until matching return */ db_expr_t ins; ins = db_get_value(pc, sizeof(int), false); if (!inst_trap_return(ins) && (!inst_return(ins) || --db_call_depth != 0)) { if (db_sstep_print) { if (inst_call(ins) || inst_return(ins)) { int i; db_printf("[after %6d] ", db_inst_count); for (i = db_call_depth; --i > 0; ) db_printf(" "); db_print_loc_and_inst(pc); } } if (inst_call(ins)) db_call_depth++; return (false); /* continue */ } } if (db_run_mode == STEP_CALLT) { /* continue until call or return */ db_expr_t ins; ins = db_get_value(pc, sizeof(int), false); if (!inst_call(ins) && !inst_return(ins) && !inst_trap_return(ins)) { return (false); /* continue */ } } db_run_mode = STEP_NONE; return (true); } void db_restart_at_pc(bool watchpt) { db_addr_t pc = PC_REGS(); if ((db_run_mode == STEP_COUNT) || (db_run_mode == STEP_RETURN) || (db_run_mode == STEP_CALLT)) { /* * We are about to execute this instruction, * so count it now. */ #ifdef SOFTWARE_SSTEP db_expr_t ins = #endif db_get_value(pc, sizeof(int), false); db_inst_count++; db_load_count += inst_load(ins); db_store_count += inst_store(ins); #ifdef SOFTWARE_SSTEP /* XXX works on mips, but... */ if (inst_branch(ins) || inst_call(ins)) { ins = db_get_value(next_instr_address(pc,1), sizeof(int), false); db_inst_count++; db_load_count += inst_load(ins); db_store_count += inst_store(ins); } #endif /* SOFTWARE_SSTEP */ } if (db_run_mode == STEP_CONTINUE) { if (watchpt || db_find_breakpoint_here(pc)) { /* * Step over breakpoint/watchpoint. */ db_run_mode = STEP_INVISIBLE; db_set_single_step(); } else { db_set_breakpoints(); db_set_watchpoints(); } } else { db_set_single_step(); } } #ifdef SOFTWARE_SSTEP /* * Software implementation of single-stepping. * If your machine does not have a trace mode * similar to the vax or sun ones you can use * this implementation, done for the mips. * Just define the above conditional and provide * the functions/macros defined below. * * extern bool * inst_branch(), returns true if the instruction might branch * extern unsigned * branch_taken(), return the address the instruction might * branch to * db_getreg_val(); return the value of a user register, * as indicated in the hardware instruction * encoding, e.g. 8 for r8 * * next_instr_address(pc,bd) returns the address of the first * instruction following the one at "pc", * which is either in the taken path of * the branch (bd==1) or not. This is * for machines (mips) with branch delays. * * A single-step may involve at most 2 breakpoints - * one for branch-not-taken and one for branch taken. * If one of these addresses does not already have a breakpoint, * we allocate a breakpoint and save it here. * These breakpoints are deleted on return. */ void db_set_single_step(void) { db_addr_t pc = PC_REGS(), brpc; unsigned inst; /* * User was stopped at pc, e.g. the instruction * at pc was not executed. 
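 *
 * Software single-step therefore plants up to two temporary
 * breakpoints: one at the fall-through address and, for a branch,
 * call or return, one at the target address.  Whichever fires
 * first ends the step; db_clear_single_step() removes both.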
*/ inst = db_get_value(pc, sizeof(int), false); if (inst_branch(inst) || inst_call(inst) || inst_return(inst)) { brpc = branch_taken(inst, pc); if (brpc != pc) { /* self-branches are hopeless */ db_taken_bkpt = db_set_temp_breakpoint(brpc); } pc = next_instr_address(pc, 1); } pc = next_instr_address(pc, 0); db_not_taken_bkpt = db_set_temp_breakpoint(pc); } void db_clear_single_step(void) { if (db_not_taken_bkpt != 0) { db_delete_temp_breakpoint(db_not_taken_bkpt); db_not_taken_bkpt = 0; } if (db_taken_bkpt != 0) { db_delete_temp_breakpoint(db_taken_bkpt); db_taken_bkpt = 0; } } #endif /* SOFTWARE_SSTEP */ extern int db_cmd_loop_done; /* single-step */ /*ARGSUSED*/ void db_single_step_cmd(db_expr_t addr, bool have_addr, db_expr_t count, char *modif) { bool print = false; if (count == -1) count = 1; if (modif[0] == 'p') print = true; db_run_mode = STEP_ONCE; db_loop_count = count; db_sstep_print = print; db_inst_count = 0; db_load_count = 0; db_store_count = 0; db_cmd_loop_done = 1; } /* trace and print until call/return */ /*ARGSUSED*/ void db_trace_until_call_cmd(db_expr_t addr, bool have_addr, db_expr_t count, char *modif) { bool print = false; if (modif[0] == 'p') print = true; db_run_mode = STEP_CALLT; db_sstep_print = print; db_inst_count = 0; db_load_count = 0; db_store_count = 0; db_cmd_loop_done = 1; } /*ARGSUSED*/ void db_trace_until_matching_cmd(db_expr_t addr, bool have_addr, db_expr_t count, char *modif) { bool print = false; if (modif[0] == 'p') print = true; db_run_mode = STEP_RETURN; db_call_depth = 1; db_sstep_print = print; db_inst_count = 0; db_load_count = 0; db_store_count = 0; db_cmd_loop_done = 1; } /* continue */ /*ARGSUSED*/ void db_continue_cmd(db_expr_t addr, bool have_addr, db_expr_t count, char *modif) { if (modif[0] == 'c') db_run_mode = STEP_COUNT; else db_run_mode = STEP_CONTINUE; db_inst_count = 0; db_load_count = 0; db_store_count = 0; db_cmd_loop_done = 1; } Index: projects/clang390-import/sys/ddb/ddb.h =================================================================== --- projects/clang390-import/sys/ddb/ddb.h (revision 305686) +++ projects/clang390-import/sys/ddb/ddb.h (revision 305687) @@ -1,295 +1,296 @@ /*- * Copyright (c) 1993, Garrett A. Wollman. * Copyright (c) 1993, University of Vermont and State Agricultural College. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ */ /* * Necessary declarations for the `ddb' kernel debugger. */ #ifndef _DDB_DDB_H_ #define _DDB_DDB_H_ #ifdef SYSCTL_DECL SYSCTL_DECL(_debug_ddb); #endif #include /* type definitions */ #include /* LIST_* */ #include /* SYSINIT */ #ifndef DB_MAXARGS #define DB_MAXARGS 10 #endif #ifndef DB_MAXLINE #define DB_MAXLINE 120 #endif #ifndef DB_MAXSCRIPTS #define DB_MAXSCRIPTS 8 #endif #ifndef DB_MAXSCRIPTNAME #define DB_MAXSCRIPTNAME 32 #endif #ifndef DB_MAXSCRIPTLEN #define DB_MAXSCRIPTLEN 128 #endif #ifndef DB_MAXSCRIPTRECURSION #define DB_MAXSCRIPTRECURSION 3 #endif #ifndef DB_CALL #define DB_CALL db_fncall_generic #else int DB_CALL(db_expr_t, db_expr_t *, int, db_expr_t[]); #endif /* * Extern variables to set the address and size of the symtab and strtab. * Most users should use db_fetch_symtab in order to set them from the * boot loader provided values. */ extern vm_offset_t ksymtab, kstrtab, ksymtab_size; /* * There are three "command tables": * - One for simple commands; a list of these is displayed * by typing 'help' at the debugger prompt. * - One for sub-commands of 'show'; to see this type 'show' * without any arguments. * - The last one for sub-commands of 'show all'; type 'show all' * without any argument to get a list. */ struct command; LIST_HEAD(command_table, command); extern struct command_table db_cmd_table; extern struct command_table db_show_table; extern struct command_table db_show_all_table; /* * Type signature for a function implementing a ddb command. */ typedef void db_cmdfcn_t(db_expr_t addr, bool have_addr, db_expr_t count, char *modif); /* * Command table entry. */ struct command { char * name; /* command name */ db_cmdfcn_t *fcn; /* function to call */ int flag; /* extra info: */ #define CS_OWN 0x1 /* non-standard syntax */ #define CS_MORE 0x2 /* standard syntax, but may have other words * at end */ #define CS_SET_DOT 0x100 /* set dot after command */ struct command_table *more; /* another level of command */ LIST_ENTRY(command) next; /* next entry in the command table */ }; /* * Arrange for the specified ddb command to be defined and * bound to the specified function. Commands can be defined * in modules in which case they will be available only when * the module is loaded. 
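 *
 * For example, a module might add a command with (hypothetical
 * name and body):
 *
 *	DB_COMMAND(hello, db_hello_cmd)
 *	{
 *		db_printf("hello from ddb\n");
 *	}
 *
 * which registers "hello" in db_cmd_table at module load time and
 * unregisters it again on unload.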
*/ #define _DB_SET(_suffix, _name, _func, list, _flag, _more) \ static struct command __CONCAT(_name,_suffix) = { \ .name = __STRING(_name), \ .fcn = _func, \ .flag = _flag, \ .more = _more \ }; \ static void __CONCAT(__CONCAT(_name,_suffix),_add)(void *arg __unused) \ { db_command_register(&list, &__CONCAT(_name,_suffix)); } \ SYSINIT(__CONCAT(_name,_suffix), SI_SUB_KLD, SI_ORDER_ANY, \ __CONCAT(__CONCAT(_name,_suffix),_add), NULL); \ static void __CONCAT(__CONCAT(_name,_suffix),_del)(void *arg __unused) \ { db_command_unregister(&list, &__CONCAT(_name,_suffix)); } \ SYSUNINIT(__CONCAT(_name,_suffix), SI_SUB_KLD, SI_ORDER_ANY, \ __CONCAT(__CONCAT(_name,_suffix),_del), NULL); /* * Like _DB_SET but also create the function declaration which * must be followed immediately by the body; e.g. * _DB_FUNC(_cmd, panic, db_panic, db_cmd_table, 0, NULL) * { * ...panic implementation... * } * * This macro is mostly used to define commands placed in one of * the ddb command tables; see DB_COMMAND, etc. below. */ #define _DB_FUNC(_suffix, _name, _func, list, _flag, _more) \ static db_cmdfcn_t _func; \ _DB_SET(_suffix, _name, _func, list, _flag, _more); \ static void \ _func(db_expr_t addr, bool have_addr, db_expr_t count, char *modif) /* common idom provided for backwards compatibility */ #define DB_FUNC(_name, _func, list, _flag, _more) \ _DB_FUNC(_cmd, _name, _func, list, _flag, _more) #define DB_COMMAND(cmd_name, func_name) \ _DB_FUNC(_cmd, cmd_name, func_name, db_cmd_table, 0, NULL) #define DB_ALIAS(alias_name, func_name) \ _DB_SET(_cmd, alias_name, func_name, db_cmd_table, 0, NULL) #define DB_SHOW_COMMAND(cmd_name, func_name) \ _DB_FUNC(_show, cmd_name, func_name, db_show_table, 0, NULL) #define DB_SHOW_ALIAS(alias_name, func_name) \ _DB_SET(_show, alias_name, func_name, db_show_table, 0, NULL) #define DB_SHOW_ALL_COMMAND(cmd_name, func_name) \ _DB_FUNC(_show_all, cmd_name, func_name, db_show_all_table, 0, NULL) #define DB_SHOW_ALL_ALIAS(alias_name, func_name) \ _DB_SET(_show_all, alias_name, func_name, db_show_all_table, 0, NULL) extern db_expr_t db_maxoff; extern int db_indent; extern int db_inst_count; extern int db_load_count; extern int db_store_count; extern volatile int db_pager_quit; extern db_expr_t db_radix; extern db_expr_t db_max_width; extern db_expr_t db_tab_stop_width; extern db_expr_t db_lines_per_page; struct thread; struct vm_map; void db_check_interrupt(void); void db_clear_watchpoints(void); db_addr_t db_disasm(db_addr_t loc, bool altfmt); /* instruction disassembler */ void db_error(const char *s); int db_expression(db_expr_t *valuep); int db_get_variable(db_expr_t *valuep); void db_iprintf(const char *,...) __printflike(1, 2); struct proc *db_lookup_proc(db_expr_t addr); struct thread *db_lookup_thread(db_expr_t addr, bool check_pid); struct vm_map *db_map_addr(vm_offset_t); bool db_map_current(struct vm_map *); bool db_map_equal(struct vm_map *, struct vm_map *); int db_md_set_watchpoint(db_expr_t addr, db_expr_t size); int db_md_clr_watchpoint(db_expr_t addr, db_expr_t size); void db_md_list_watchpoints(void); void db_print_loc_and_inst(db_addr_t loc); void db_print_thread(void); int db_printf(const char *fmt, ...) 
__printflike(1, 2); int db_read_bytes(vm_offset_t addr, size_t size, char *data); /* machine-dependent */ int db_readline(char *lstart, int lsize); void db_restart_at_pc(bool watchpt); int db_set_variable(db_expr_t value); void db_set_watchpoints(void); void db_skip_to_eol(void); -bool db_stop_at_pc(bool *is_breakpoint); +bool db_stop_at_pc(int type, int code, bool *is_breakpoint, + bool *is_watchpoint); #define db_strcpy strcpy void db_trace_self(void); int db_trace_thread(struct thread *, int); bool db_value_of_name(const char *name, db_expr_t *valuep); bool db_value_of_name_pcpu(const char *name, db_expr_t *valuep); bool db_value_of_name_vnet(const char *name, db_expr_t *valuep); int db_write_bytes(vm_offset_t addr, size_t size, char *data); void db_command_register(struct command_table *, struct command *); void db_command_unregister(struct command_table *, struct command *); int db_fetch_ksymtab(vm_offset_t ksym_start, vm_offset_t ksym_end); db_cmdfcn_t db_breakpoint_cmd; db_cmdfcn_t db_capture_cmd; db_cmdfcn_t db_continue_cmd; db_cmdfcn_t db_delete_cmd; db_cmdfcn_t db_deletehwatch_cmd; db_cmdfcn_t db_deletewatch_cmd; db_cmdfcn_t db_examine_cmd; db_cmdfcn_t db_findstack_cmd; db_cmdfcn_t db_hwatchpoint_cmd; db_cmdfcn_t db_listbreak_cmd; db_cmdfcn_t db_scripts_cmd; db_cmdfcn_t db_print_cmd; db_cmdfcn_t db_ps; db_cmdfcn_t db_run_cmd; db_cmdfcn_t db_script_cmd; db_cmdfcn_t db_search_cmd; db_cmdfcn_t db_set_cmd; db_cmdfcn_t db_set_thread; db_cmdfcn_t db_show_regs; db_cmdfcn_t db_show_threads; db_cmdfcn_t db_single_step_cmd; db_cmdfcn_t db_textdump_cmd; db_cmdfcn_t db_trace_until_call_cmd; db_cmdfcn_t db_trace_until_matching_cmd; db_cmdfcn_t db_unscript_cmd; db_cmdfcn_t db_watchpoint_cmd; db_cmdfcn_t db_write_cmd; /* * Interface between DDB and the DDB output capture facility. */ struct dumperinfo; void db_capture_dump(struct dumperinfo *di); void db_capture_enterpager(void); void db_capture_exitpager(void); void db_capture_write(char *buffer, u_int buflen); void db_capture_writech(char ch); /* * Interface between DDB and the script facility. */ void db_script_kdbenter(const char *eventname); /* KDB enter event. */ /* * Interface between DDB and the textdump facility. * * Text dump blocks are of a fixed size; textdump_block_buffer is a * statically allocated buffer that code interacting with textdumps can use * to prepare and hold a pending block in when calling writenextblock(). */ #define TEXTDUMP_BLOCKSIZE 512 extern char textdump_block_buffer[TEXTDUMP_BLOCKSIZE]; void textdump_mkustar(char *block_buffer, const char *filename, u_int size); void textdump_restoreoff(off_t offset); void textdump_saveoff(off_t *offsetp); int textdump_writenextblock(struct dumperinfo *di, char *buffer); /* * Interface between the kernel and textdumps. */ extern int textdump_pending; /* Call textdump_dumpsys() instead. */ void textdump_dumpsys(struct dumperinfo *di); #endif /* !_DDB_DDB_H_ */ Index: projects/clang390-import/sys/dev/ath/ath_hal/ah.c =================================================================== --- projects/clang390-import/sys/dev/ath/ath_hal/ah.c (revision 305686) +++ projects/clang390-import/sys/dev/ath/ath_hal/ah.c (revision 305687) @@ -1,1480 +1,1497 @@ /* * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * Copyright (c) 2002-2008 Atheros Communications, Inc. * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. 
* * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. * * $FreeBSD$ */ #include "opt_ah.h" #include "ah.h" #include "ah_internal.h" #include "ah_devid.h" #include "ah_eeprom.h" /* for 5ghz fast clock flag */ #include "ar5416/ar5416reg.h" /* NB: includes ar5212reg.h */ #include "ar9003/ar9300_devid.h" /* linker set of registered chips */ OS_SET_DECLARE(ah_chips, struct ath_hal_chip); /* * Check the set of registered chips to see if any recognize * the device as one they can support. */ const char* ath_hal_probe(uint16_t vendorid, uint16_t devid) { struct ath_hal_chip * const *pchip; OS_SET_FOREACH(pchip, ah_chips) { const char *name = (*pchip)->probe(vendorid, devid); if (name != AH_NULL) return name; } return AH_NULL; } /* * Attach detects device chip revisions, initializes the hwLayer * function list, reads EEPROM information, * selects reset vectors, and performs a short self test. * Any failures will return an error that should cause a hardware * disable. */ struct ath_hal* ath_hal_attach(uint16_t devid, HAL_SOFTC sc, HAL_BUS_TAG st, HAL_BUS_HANDLE sh, uint16_t *eepromdata, HAL_OPS_CONFIG *ah_config, HAL_STATUS *error) { struct ath_hal_chip * const *pchip; OS_SET_FOREACH(pchip, ah_chips) { struct ath_hal_chip *chip = *pchip; struct ath_hal *ah; /* XXX don't have vendorid, assume atheros one works */ if (chip->probe(ATHEROS_VENDOR_ID, devid) == AH_NULL) continue; ah = chip->attach(devid, sc, st, sh, eepromdata, ah_config, error); if (ah != AH_NULL) { /* copy back private state to public area */ ah->ah_devid = AH_PRIVATE(ah)->ah_devid; ah->ah_subvendorid = AH_PRIVATE(ah)->ah_subvendorid; ah->ah_macVersion = AH_PRIVATE(ah)->ah_macVersion; ah->ah_macRev = AH_PRIVATE(ah)->ah_macRev; ah->ah_phyRev = AH_PRIVATE(ah)->ah_phyRev; ah->ah_analog5GhzRev = AH_PRIVATE(ah)->ah_analog5GhzRev; ah->ah_analog2GhzRev = AH_PRIVATE(ah)->ah_analog2GhzRev; return ah; } } return AH_NULL; } const char * ath_hal_mac_name(struct ath_hal *ah) { switch (ah->ah_macVersion) { case AR_SREV_VERSION_CRETE: case AR_SREV_VERSION_MAUI_1: return "AR5210"; case AR_SREV_VERSION_MAUI_2: case AR_SREV_VERSION_OAHU: return "AR5211"; case AR_SREV_VERSION_VENICE: return "AR5212"; case AR_SREV_VERSION_GRIFFIN: return "AR2413"; case AR_SREV_VERSION_CONDOR: return "AR5424"; case AR_SREV_VERSION_EAGLE: return "AR5413"; case AR_SREV_VERSION_COBRA: return "AR2415"; case AR_SREV_2425: /* Swan */ return "AR2425"; case AR_SREV_2417: /* Nala */ return "AR2417"; case AR_XSREV_VERSION_OWL_PCI: return "AR5416"; case AR_XSREV_VERSION_OWL_PCIE: return "AR5418"; case AR_XSREV_VERSION_HOWL: return "AR9130"; case AR_XSREV_VERSION_SOWL: return "AR9160"; case AR_XSREV_VERSION_MERLIN: if (AH_PRIVATE(ah)->ah_ispcie) return "AR9280"; return "AR9220"; case AR_XSREV_VERSION_KITE: return "AR9285"; case AR_XSREV_VERSION_KIWI: if (AH_PRIVATE(ah)->ah_ispcie) return "AR9287"; return "AR9227"; case AR_SREV_VERSION_AR9380: if (ah->ah_macRev >= AR_SREV_REVISION_AR9580_10) return "AR9580"; return "AR9380"; case AR_SREV_VERSION_AR9460: return "AR9460"; case AR_SREV_VERSION_AR9330: return "AR9330"; case AR_SREV_VERSION_AR9340: 
return "AR9340"; case AR_SREV_VERSION_QCA9550: return "QCA9550"; case AR_SREV_VERSION_AR9485: return "AR9485"; case AR_SREV_VERSION_QCA9565: return "QCA9565"; case AR_SREV_VERSION_QCA9530: return "QCA9530"; } return "????"; } /* * Return the mask of available modes based on the hardware capabilities. */ u_int ath_hal_getwirelessmodes(struct ath_hal*ah) { return ath_hal_getWirelessModes(ah); } /* linker set of registered RF backends */ OS_SET_DECLARE(ah_rfs, struct ath_hal_rf); /* * Check the set of registered RF backends to see if * any recognize the device as one they can support. */ struct ath_hal_rf * ath_hal_rfprobe(struct ath_hal *ah, HAL_STATUS *ecode) { struct ath_hal_rf * const *prf; OS_SET_FOREACH(prf, ah_rfs) { struct ath_hal_rf *rf = *prf; if (rf->probe(ah)) return rf; } *ecode = HAL_ENOTSUPP; return AH_NULL; } const char * ath_hal_rf_name(struct ath_hal *ah) { switch (ah->ah_analog5GhzRev & AR_RADIO_SREV_MAJOR) { case 0: /* 5210 */ return "5110"; /* NB: made up */ case AR_RAD5111_SREV_MAJOR: case AR_RAD5111_SREV_PROD: return "5111"; case AR_RAD2111_SREV_MAJOR: return "2111"; case AR_RAD5112_SREV_MAJOR: case AR_RAD5112_SREV_2_0: case AR_RAD5112_SREV_2_1: return "5112"; case AR_RAD2112_SREV_MAJOR: case AR_RAD2112_SREV_2_0: case AR_RAD2112_SREV_2_1: return "2112"; case AR_RAD2413_SREV_MAJOR: return "2413"; case AR_RAD5413_SREV_MAJOR: return "5413"; case AR_RAD2316_SREV_MAJOR: return "2316"; case AR_RAD2317_SREV_MAJOR: return "2317"; case AR_RAD5424_SREV_MAJOR: return "5424"; case AR_RAD5133_SREV_MAJOR: return "5133"; case AR_RAD2133_SREV_MAJOR: return "2133"; case AR_RAD5122_SREV_MAJOR: return "5122"; case AR_RAD2122_SREV_MAJOR: return "2122"; } return "????"; } /* * Poll the register looking for a specific value. */ HAL_BOOL ath_hal_wait(struct ath_hal *ah, u_int reg, uint32_t mask, uint32_t val) { #define AH_TIMEOUT 1000 return ath_hal_waitfor(ah, reg, mask, val, AH_TIMEOUT); #undef AH_TIMEOUT } HAL_BOOL ath_hal_waitfor(struct ath_hal *ah, u_int reg, uint32_t mask, uint32_t val, uint32_t timeout) { int i; for (i = 0; i < timeout; i++) { if ((OS_REG_READ(ah, reg) & mask) == val) return AH_TRUE; OS_DELAY(10); } HALDEBUG(ah, HAL_DEBUG_REGIO | HAL_DEBUG_PHYIO, "%s: timeout on reg 0x%x: 0x%08x & 0x%08x != 0x%08x\n", __func__, reg, OS_REG_READ(ah, reg), mask, val); return AH_FALSE; } /* * Reverse the bits starting at the low bit for a value of * bit_count in size */ uint32_t ath_hal_reverseBits(uint32_t val, uint32_t n) { uint32_t retval; int i; for (i = 0, retval = 0; i < n; i++) { retval = (retval << 1) | (val & 1); val >>= 1; } return retval; } /* 802.11n related timing definitions */ #define OFDM_PLCP_BITS 22 #define HT_L_STF 8 #define HT_L_LTF 8 #define HT_L_SIG 4 #define HT_SIG 8 #define HT_STF 4 #define HT_LTF(n) ((n) * 4) -#define HT_RC_2_MCS(_rc) ((_rc) & 0xf) +#define HT_RC_2_MCS(_rc) ((_rc) & 0x1f) #define HT_RC_2_STREAMS(_rc) ((((_rc) & 0x78) >> 3) + 1) #define IS_HT_RATE(_rc) ( (_rc) & IEEE80211_RATE_MCS) /* * Calculate the duration of a packet whether it is 11n or legacy. */ uint32_t ath_hal_pkt_txtime(struct ath_hal *ah, const HAL_RATE_TABLE *rates, uint32_t frameLen, uint16_t rateix, HAL_BOOL isht40, HAL_BOOL shortPreamble, HAL_BOOL includeSifs) { uint8_t rc; int numStreams; rc = rates->info[rateix].rateCode; /* Legacy rate? Return the old way */ if (! 
IS_HT_RATE(rc)) return ath_hal_computetxtime(ah, rates, frameLen, rateix, shortPreamble, includeSifs); /* 11n frame - extract out the number of spatial streams */ numStreams = HT_RC_2_STREAMS(rc); KASSERT(numStreams > 0 && numStreams <= 4, ("number of spatial streams needs to be 1..3: MCS rate 0x%x!", rateix)); /* XXX TODO: Add SIFS */ return ath_computedur_ht(frameLen, rc, numStreams, isht40, shortPreamble); } static const uint16_t ht20_bps[32] = { 26, 52, 78, 104, 156, 208, 234, 260, 52, 104, 156, 208, 312, 416, 468, 520, 78, 156, 234, 312, 468, 624, 702, 780, 104, 208, 312, 416, 624, 832, 936, 1040 }; static const uint16_t ht40_bps[32] = { 54, 108, 162, 216, 324, 432, 486, 540, 108, 216, 324, 432, 648, 864, 972, 1080, 162, 324, 486, 648, 972, 1296, 1458, 1620, 216, 432, 648, 864, 1296, 1728, 1944, 2160 }; /* * Calculate the transmit duration of an 11n frame. */ uint32_t ath_computedur_ht(uint32_t frameLen, uint16_t rate, int streams, HAL_BOOL isht40, HAL_BOOL isShortGI) { uint32_t bitsPerSymbol, numBits, numSymbols, txTime; KASSERT(rate & IEEE80211_RATE_MCS, ("not mcs %d", rate)); KASSERT((rate &~ IEEE80211_RATE_MCS) < 31, ("bad mcs 0x%x", rate)); if (isht40) - bitsPerSymbol = ht40_bps[rate & 0x1f]; + bitsPerSymbol = ht40_bps[HT_RC_2_MCS(rate)]; else - bitsPerSymbol = ht20_bps[rate & 0x1f]; + bitsPerSymbol = ht20_bps[HT_RC_2_MCS(rate)]; numBits = OFDM_PLCP_BITS + (frameLen << 3); numSymbols = howmany(numBits, bitsPerSymbol); if (isShortGI) txTime = ((numSymbols * 18) + 4) / 5; /* 3.6us */ else txTime = numSymbols * 4; /* 4us */ return txTime + HT_L_STF + HT_L_LTF + HT_L_SIG + HT_SIG + HT_STF + HT_LTF(streams); } /* * Compute the time to transmit a frame of length frameLen bytes * using the specified rate, phy, and short preamble setting. */ uint16_t ath_hal_computetxtime(struct ath_hal *ah, const HAL_RATE_TABLE *rates, uint32_t frameLen, uint16_t rateix, HAL_BOOL shortPreamble, HAL_BOOL includeSifs) { uint32_t bitsPerSymbol, numBits, numSymbols, phyTime, txTime; uint32_t kbps; /* Warn if this function is called for 11n rates; it should not be! */ if (IS_HT_RATE(rates->info[rateix].rateCode)) ath_hal_printf(ah, "%s: MCS rate? (index %d; hwrate 0x%x)\n", __func__, rateix, rates->info[rateix].rateCode); kbps = rates->info[rateix].rateKbps; /* * index can be invalid during dynamic Turbo transitions. 
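 * In that case kbps reads back as 0 and the code below returns a
 * zero duration instead of dividing by zero.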
* XXX */ if (kbps == 0) return 0; switch (rates->info[rateix].phy) { case IEEE80211_T_CCK: phyTime = CCK_PREAMBLE_BITS + CCK_PLCP_BITS; if (shortPreamble && rates->info[rateix].shortPreamble) phyTime >>= 1; numBits = frameLen << 3; txTime = phyTime + ((numBits * 1000)/kbps); if (includeSifs) txTime += CCK_SIFS_TIME; break; case IEEE80211_T_OFDM: bitsPerSymbol = (kbps * OFDM_SYMBOL_TIME) / 1000; HALASSERT(bitsPerSymbol != 0); numBits = OFDM_PLCP_BITS + (frameLen << 3); numSymbols = howmany(numBits, bitsPerSymbol); txTime = OFDM_PREAMBLE_TIME + (numSymbols * OFDM_SYMBOL_TIME); if (includeSifs) txTime += OFDM_SIFS_TIME; break; case IEEE80211_T_OFDM_HALF: bitsPerSymbol = (kbps * OFDM_HALF_SYMBOL_TIME) / 1000; HALASSERT(bitsPerSymbol != 0); numBits = OFDM_HALF_PLCP_BITS + (frameLen << 3); numSymbols = howmany(numBits, bitsPerSymbol); txTime = OFDM_HALF_PREAMBLE_TIME + (numSymbols * OFDM_HALF_SYMBOL_TIME); if (includeSifs) txTime += OFDM_HALF_SIFS_TIME; break; case IEEE80211_T_OFDM_QUARTER: bitsPerSymbol = (kbps * OFDM_QUARTER_SYMBOL_TIME) / 1000; HALASSERT(bitsPerSymbol != 0); numBits = OFDM_QUARTER_PLCP_BITS + (frameLen << 3); numSymbols = howmany(numBits, bitsPerSymbol); txTime = OFDM_QUARTER_PREAMBLE_TIME + (numSymbols * OFDM_QUARTER_SYMBOL_TIME); if (includeSifs) txTime += OFDM_QUARTER_SIFS_TIME; break; case IEEE80211_T_TURBO: bitsPerSymbol = (kbps * TURBO_SYMBOL_TIME) / 1000; HALASSERT(bitsPerSymbol != 0); numBits = TURBO_PLCP_BITS + (frameLen << 3); numSymbols = howmany(numBits, bitsPerSymbol); txTime = TURBO_PREAMBLE_TIME + (numSymbols * TURBO_SYMBOL_TIME); if (includeSifs) txTime += TURBO_SIFS_TIME; break; default: HALDEBUG(ah, HAL_DEBUG_PHYIO, "%s: unknown phy %u (rate ix %u)\n", __func__, rates->info[rateix].phy, rateix); txTime = 0; break; } return txTime; } int ath_hal_get_curmode(struct ath_hal *ah, const struct ieee80211_channel *chan) { /* * Pick a default mode at bootup. A channel change is inevitable. */ if (!chan) return HAL_MODE_11NG_HT20; if (IEEE80211_IS_CHAN_TURBO(chan)) return HAL_MODE_TURBO; /* check for NA_HT before plain A, since IS_CHAN_A includes NA_HT */ if (IEEE80211_IS_CHAN_5GHZ(chan) && IEEE80211_IS_CHAN_HT20(chan)) return HAL_MODE_11NA_HT20; if (IEEE80211_IS_CHAN_5GHZ(chan) && IEEE80211_IS_CHAN_HT40U(chan)) return HAL_MODE_11NA_HT40PLUS; if (IEEE80211_IS_CHAN_5GHZ(chan) && IEEE80211_IS_CHAN_HT40D(chan)) return HAL_MODE_11NA_HT40MINUS; if (IEEE80211_IS_CHAN_A(chan)) return HAL_MODE_11A; /* check for NG_HT before plain G, since IS_CHAN_G includes NG_HT */ if (IEEE80211_IS_CHAN_2GHZ(chan) && IEEE80211_IS_CHAN_HT20(chan)) return HAL_MODE_11NG_HT20; if (IEEE80211_IS_CHAN_2GHZ(chan) && IEEE80211_IS_CHAN_HT40U(chan)) return HAL_MODE_11NG_HT40PLUS; if (IEEE80211_IS_CHAN_2GHZ(chan) && IEEE80211_IS_CHAN_HT40D(chan)) return HAL_MODE_11NG_HT40MINUS; /* * XXX For FreeBSD, will this work correctly given the DYN * chan mode (OFDM+CCK dynamic) ? We have pure-G versions DYN-BG.. */ if (IEEE80211_IS_CHAN_G(chan)) return HAL_MODE_11G; if (IEEE80211_IS_CHAN_B(chan)) return HAL_MODE_11B; HALASSERT(0); return HAL_MODE_11NG_HT20; } typedef enum { WIRELESS_MODE_11a = 0, WIRELESS_MODE_TURBO = 1, WIRELESS_MODE_11b = 2, WIRELESS_MODE_11g = 3, WIRELESS_MODE_108g = 4, WIRELESS_MODE_MAX } WIRELESS_MODE; +/* + * XXX TODO: for some (?) chips, an 11b mode still runs at 11bg. + * Maybe AR5211 has separate 11b and 11g only modes, so 11b is 22MHz + * and 11g is 44MHz, but AR5416 and later run 11b in 11bg mode, right? 
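+ *
+ * For reference, the CLOCK_RATE[] table below encodes the current
+ * assumption: 11b is clocked at 22MHz and 11g at 44MHz, so mapping
+ * an 11b channel to the 11g mode would double the conversion rate.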
+ */ static WIRELESS_MODE ath_hal_chan2wmode(struct ath_hal *ah, const struct ieee80211_channel *chan) { if (IEEE80211_IS_CHAN_B(chan)) return WIRELESS_MODE_11b; if (IEEE80211_IS_CHAN_G(chan)) return WIRELESS_MODE_11g; if (IEEE80211_IS_CHAN_108G(chan)) return WIRELESS_MODE_108g; if (IEEE80211_IS_CHAN_TURBO(chan)) return WIRELESS_MODE_TURBO; return WIRELESS_MODE_11a; } /* * Convert between microseconds and core system clocks. */ /* 11a Turbo 11b 11g 108g */ static const uint8_t CLOCK_RATE[] = { 40, 80, 22, 44, 88 }; #define CLOCK_FAST_RATE_5GHZ_OFDM 44 u_int ath_hal_mac_clks(struct ath_hal *ah, u_int usecs) { const struct ieee80211_channel *c = AH_PRIVATE(ah)->ah_curchan; u_int clks; /* NB: ah_curchan may be null when called attach time */ /* XXX merlin and later specific workaround - 5ghz fast clock is 44 */ if (c != AH_NULL && IS_5GHZ_FAST_CLOCK_EN(ah, c)) { clks = usecs * CLOCK_FAST_RATE_5GHZ_OFDM; if (IEEE80211_IS_CHAN_HT40(c)) clks <<= 1; } else if (c != AH_NULL) { clks = usecs * CLOCK_RATE[ath_hal_chan2wmode(ah, c)]; if (IEEE80211_IS_CHAN_HT40(c)) clks <<= 1; } else clks = usecs * CLOCK_RATE[WIRELESS_MODE_11b]; /* Compensate for half/quarter rate */ if (c != AH_NULL && IEEE80211_IS_CHAN_HALF(c)) clks = clks / 2; else if (c != AH_NULL && IEEE80211_IS_CHAN_QUARTER(c)) clks = clks / 4; return clks; } u_int ath_hal_mac_usec(struct ath_hal *ah, u_int clks) { + uint64_t psec; + + psec = ath_hal_mac_psec(ah, clks); + return (psec / 1000000); +} + +/* + * XXX TODO: half, quarter rates. + */ +uint64_t +ath_hal_mac_psec(struct ath_hal *ah, u_int clks) +{ const struct ieee80211_channel *c = AH_PRIVATE(ah)->ah_curchan; - u_int usec; + uint64_t psec; /* NB: ah_curchan may be null when called attach time */ /* XXX merlin and later specific workaround - 5ghz fast clock is 44 */ if (c != AH_NULL && IS_5GHZ_FAST_CLOCK_EN(ah, c)) { - usec = clks / CLOCK_FAST_RATE_5GHZ_OFDM; + psec = (clks * 1000000ULL) / CLOCK_FAST_RATE_5GHZ_OFDM; if (IEEE80211_IS_CHAN_HT40(c)) - usec >>= 1; + psec >>= 1; } else if (c != AH_NULL) { - usec = clks / CLOCK_RATE[ath_hal_chan2wmode(ah, c)]; + psec = (clks * 1000000ULL) / CLOCK_RATE[ath_hal_chan2wmode(ah, c)]; if (IEEE80211_IS_CHAN_HT40(c)) - usec >>= 1; + psec >>= 1; } else - usec = clks / CLOCK_RATE[WIRELESS_MODE_11b]; - return usec; + psec = (clks * 1000000ULL) / CLOCK_RATE[WIRELESS_MODE_11b]; + return psec; } /* * Setup a h/w rate table's reverse lookup table and * fill in ack durations. This routine is called for * each rate table returned through the ah_getRateTable * method. The reverse lookup tables are assumed to be * initialized to zero (or at least the first entry). * We use this as a key that indicates whether or not * we've previously setup the reverse lookup table. 
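 *
 * The reverse lookup table maps a hardware rate code (with or
 * without the short preamble bit set) back to its index in the
 * rate table, so a rate code taken from a descriptor can be
 * resolved without a linear search.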
* * XXX not reentrant, but shouldn't matter */ void ath_hal_setupratetable(struct ath_hal *ah, HAL_RATE_TABLE *rt) { #define N(a) (sizeof(a)/sizeof(a[0])) int i; if (rt->rateCodeToIndex[0] != 0) /* already setup */ return; for (i = 0; i < N(rt->rateCodeToIndex); i++) rt->rateCodeToIndex[i] = (uint8_t) -1; for (i = 0; i < rt->rateCount; i++) { uint8_t code = rt->info[i].rateCode; uint8_t cix = rt->info[i].controlRate; HALASSERT(code < N(rt->rateCodeToIndex)); rt->rateCodeToIndex[code] = i; HALASSERT((code | rt->info[i].shortPreamble) < N(rt->rateCodeToIndex)); rt->rateCodeToIndex[code | rt->info[i].shortPreamble] = i; /* * XXX for 11g the control rate to use for 5.5 and 11 Mb/s * depends on whether they are marked as basic rates; * the static tables are setup with an 11b-compatible * 2Mb/s rate which will work but is suboptimal */ rt->info[i].lpAckDuration = ath_hal_computetxtime(ah, rt, WLAN_CTRL_FRAME_SIZE, cix, AH_FALSE, AH_TRUE); rt->info[i].spAckDuration = ath_hal_computetxtime(ah, rt, WLAN_CTRL_FRAME_SIZE, cix, AH_TRUE, AH_TRUE); } #undef N } HAL_STATUS ath_hal_getcapability(struct ath_hal *ah, HAL_CAPABILITY_TYPE type, uint32_t capability, uint32_t *result) { const HAL_CAPABILITIES *pCap = &AH_PRIVATE(ah)->ah_caps; switch (type) { case HAL_CAP_REG_DMN: /* regulatory domain */ *result = AH_PRIVATE(ah)->ah_currentRD; return HAL_OK; case HAL_CAP_DFS_DMN: /* DFS Domain */ *result = AH_PRIVATE(ah)->ah_dfsDomain; return HAL_OK; case HAL_CAP_CIPHER: /* cipher handled in hardware */ case HAL_CAP_TKIP_MIC: /* handle TKIP MIC in hardware */ return HAL_ENOTSUPP; case HAL_CAP_TKIP_SPLIT: /* hardware TKIP uses split keys */ return HAL_ENOTSUPP; case HAL_CAP_PHYCOUNTERS: /* hardware PHY error counters */ return pCap->halHwPhyCounterSupport ? HAL_OK : HAL_ENXIO; case HAL_CAP_WME_TKIPMIC: /* hardware can do TKIP MIC when WMM is turned on */ return HAL_ENOTSUPP; case HAL_CAP_DIVERSITY: /* hardware supports fast diversity */ return HAL_ENOTSUPP; case HAL_CAP_KEYCACHE_SIZE: /* hardware key cache size */ *result = pCap->halKeyCacheSize; return HAL_OK; case HAL_CAP_NUM_TXQUEUES: /* number of hardware tx queues */ *result = pCap->halTotalQueues; return HAL_OK; case HAL_CAP_VEOL: /* hardware supports virtual EOL */ return pCap->halVEOLSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_PSPOLL: /* hardware PS-Poll support works */ return pCap->halPSPollBroken ? HAL_ENOTSUPP : HAL_OK; case HAL_CAP_COMPRESSION: return pCap->halCompressSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_BURST: return pCap->halBurstSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_FASTFRAME: return pCap->halFastFramesSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_DIAG: /* hardware diagnostic support */ *result = AH_PRIVATE(ah)->ah_diagreg; return HAL_OK; case HAL_CAP_TXPOW: /* global tx power limit */ switch (capability) { case 0: /* facility is supported */ return HAL_OK; case 1: /* current limit */ *result = AH_PRIVATE(ah)->ah_powerLimit; return HAL_OK; case 2: /* current max tx power */ *result = AH_PRIVATE(ah)->ah_maxPowerLevel; return HAL_OK; case 3: /* scale factor */ *result = AH_PRIVATE(ah)->ah_tpScale; return HAL_OK; } return HAL_ENOTSUPP; case HAL_CAP_BSSIDMASK: /* hardware supports bssid mask */ return pCap->halBssIdMaskSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_MCAST_KEYSRCH: /* multicast frame keycache search */ return pCap->halMcastKeySrchSupport ? 
HAL_OK : HAL_ENOTSUPP; case HAL_CAP_TSF_ADJUST: /* hardware has beacon tsf adjust */ return HAL_ENOTSUPP; case HAL_CAP_RFSILENT: /* rfsilent support */ switch (capability) { case 0: /* facility is supported */ return pCap->halRfSilentSupport ? HAL_OK : HAL_ENOTSUPP; case 1: /* current setting */ return AH_PRIVATE(ah)->ah_rfkillEnabled ? HAL_OK : HAL_ENOTSUPP; case 2: /* rfsilent config */ *result = AH_PRIVATE(ah)->ah_rfsilent; return HAL_OK; } return HAL_ENOTSUPP; case HAL_CAP_11D: return HAL_OK; case HAL_CAP_HT: return pCap->halHTSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_GTXTO: return pCap->halGTTSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_FAST_CC: return pCap->halFastCCSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_TX_CHAINMASK: /* mask of TX chains supported */ *result = pCap->halTxChainMask; return HAL_OK; case HAL_CAP_RX_CHAINMASK: /* mask of RX chains supported */ *result = pCap->halRxChainMask; return HAL_OK; case HAL_CAP_NUM_GPIO_PINS: *result = pCap->halNumGpioPins; return HAL_OK; case HAL_CAP_CST: return pCap->halCSTSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_RTS_AGGR_LIMIT: *result = pCap->halRtsAggrLimit; return HAL_OK; case HAL_CAP_4ADDR_AGGR: return pCap->hal4AddrAggrSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_EXT_CHAN_DFS: return pCap->halExtChanDfsSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_RX_STBC: return pCap->halRxStbcSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_TX_STBC: return pCap->halTxStbcSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_COMBINED_RADAR_RSSI: return pCap->halUseCombinedRadarRssi ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_AUTO_SLEEP: return pCap->halAutoSleepSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_MBSSID_AGGR_SUPPORT: return pCap->halMbssidAggrSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_SPLIT_4KB_TRANS: /* hardware handles descriptors straddling 4k page boundary */ return pCap->hal4kbSplitTransSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_REG_FLAG: *result = AH_PRIVATE(ah)->ah_currentRDext; return HAL_OK; case HAL_CAP_ENHANCED_DMA_SUPPORT: return pCap->halEnhancedDmaSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_NUM_TXMAPS: *result = pCap->halNumTxMaps; return HAL_OK; case HAL_CAP_TXDESCLEN: *result = pCap->halTxDescLen; return HAL_OK; case HAL_CAP_TXSTATUSLEN: *result = pCap->halTxStatusLen; return HAL_OK; case HAL_CAP_RXSTATUSLEN: *result = pCap->halRxStatusLen; return HAL_OK; case HAL_CAP_RXFIFODEPTH: switch (capability) { case HAL_RX_QUEUE_HP: *result = pCap->halRxHpFifoDepth; return HAL_OK; case HAL_RX_QUEUE_LP: *result = pCap->halRxLpFifoDepth; return HAL_OK; default: return HAL_ENOTSUPP; } case HAL_CAP_RXBUFSIZE: case HAL_CAP_NUM_MR_RETRIES: *result = pCap->halNumMRRetries; return HAL_OK; case HAL_CAP_BT_COEX: return pCap->halBtCoexSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_SPECTRAL_SCAN: return pCap->halSpectralScanSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_HT20_SGI: return pCap->halHTSGI20Support ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_RXTSTAMP_PREC: /* rx desc tstamp precision (bits) */ *result = pCap->halRxTstampPrecision; return HAL_OK; case HAL_CAP_ANT_DIV_COMB: /* AR9285/AR9485 LNA diversity */ return pCap->halAntDivCombSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_ENHANCED_DFS_SUPPORT: return pCap->halEnhancedDfsSupport ? HAL_OK : HAL_ENOTSUPP; /* FreeBSD-specific entries for now */ case HAL_CAP_RXORN_FATAL: /* HAL_INT_RXORN treated as fatal */ return AH_PRIVATE(ah)->ah_rxornIsFatal ? 
HAL_OK : HAL_ENOTSUPP; case HAL_CAP_INTRMASK: /* mask of supported interrupts */ *result = pCap->halIntrMask; return HAL_OK; case HAL_CAP_BSSIDMATCH: /* hardware has disable bssid match */ return pCap->halBssidMatchSupport ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_STREAMS: /* number of 11n spatial streams */ switch (capability) { case 0: /* TX */ *result = pCap->halTxStreams; return HAL_OK; case 1: /* RX */ *result = pCap->halRxStreams; return HAL_OK; default: return HAL_ENOTSUPP; } case HAL_CAP_RXDESC_SELFLINK: /* hardware supports self-linked final RX descriptors correctly */ return pCap->halHasRxSelfLinkedTail ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_BB_READ_WAR: /* Baseband read WAR */ return pCap->halHasBBReadWar? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_SERIALISE_WAR: /* PCI register serialisation */ return pCap->halSerialiseRegWar ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_MFP: /* Management frame protection setting */ *result = pCap->halMfpSupport; return HAL_OK; case HAL_CAP_RX_LNA_MIXING: /* Hardware uses an RX LNA mixer to map 2 antennas to a 1 stream receiver */ return pCap->halRxUsingLnaMixing ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_DO_MYBEACON: /* Hardware supports filtering my-beacons */ return pCap->halRxDoMyBeacon ? HAL_OK : HAL_ENOTSUPP; case HAL_CAP_TXTSTAMP_PREC: /* tx desc tstamp precision (bits) */ *result = pCap->halTxTstampPrecision; return HAL_OK; default: return HAL_EINVAL; } } HAL_BOOL ath_hal_setcapability(struct ath_hal *ah, HAL_CAPABILITY_TYPE type, uint32_t capability, uint32_t setting, HAL_STATUS *status) { switch (type) { case HAL_CAP_TXPOW: switch (capability) { case 3: if (setting <= HAL_TP_SCALE_MIN) { AH_PRIVATE(ah)->ah_tpScale = setting; return AH_TRUE; } break; } break; case HAL_CAP_RFSILENT: /* rfsilent support */ /* * NB: allow even if halRfSilentSupport is false * in case the EEPROM is misprogrammed. */ switch (capability) { case 1: /* current setting */ AH_PRIVATE(ah)->ah_rfkillEnabled = (setting != 0); return AH_TRUE; case 2: /* rfsilent config */ /* XXX better done per-chip for validation? */ AH_PRIVATE(ah)->ah_rfsilent = setting; return AH_TRUE; } break; case HAL_CAP_REG_DMN: /* regulatory domain */ AH_PRIVATE(ah)->ah_currentRD = setting; return AH_TRUE; case HAL_CAP_RXORN_FATAL: /* HAL_INT_RXORN treated as fatal */ AH_PRIVATE(ah)->ah_rxornIsFatal = setting; return AH_TRUE; default: break; } if (status) *status = HAL_EINVAL; return AH_FALSE; } /* * Common support for getDiagState method. 
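 *
 * The register-dump helper below writes, for each requested range,
 * a header of the start and end addresses followed by one word per
 * register in the range, and returns the number of bytes emitted.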
*/ static u_int ath_hal_getregdump(struct ath_hal *ah, const HAL_REGRANGE *regs, void *dstbuf, int space) { uint32_t *dp = dstbuf; int i; for (i = 0; space >= 2*sizeof(uint32_t); i++) { uint32_t r = regs[i].start; uint32_t e = regs[i].end; *dp++ = r; *dp++ = e; space -= 2*sizeof(uint32_t); do { *dp++ = OS_REG_READ(ah, r); r += sizeof(uint32_t); space -= sizeof(uint32_t); } while (r <= e && space >= sizeof(uint32_t)); } return (char *) dp - (char *) dstbuf; } static void ath_hal_setregs(struct ath_hal *ah, const HAL_REGWRITE *regs, int space) { while (space >= sizeof(HAL_REGWRITE)) { OS_REG_WRITE(ah, regs->addr, regs->value); regs++, space -= sizeof(HAL_REGWRITE); } } HAL_BOOL ath_hal_getdiagstate(struct ath_hal *ah, int request, const void *args, uint32_t argsize, void **result, uint32_t *resultsize) { switch (request) { case HAL_DIAG_REVS: *result = &AH_PRIVATE(ah)->ah_devid; *resultsize = sizeof(HAL_REVS); return AH_TRUE; case HAL_DIAG_REGS: *resultsize = ath_hal_getregdump(ah, args, *result,*resultsize); return AH_TRUE; case HAL_DIAG_SETREGS: ath_hal_setregs(ah, args, argsize); *resultsize = 0; return AH_TRUE; case HAL_DIAG_FATALERR: *result = &AH_PRIVATE(ah)->ah_fatalState[0]; *resultsize = sizeof(AH_PRIVATE(ah)->ah_fatalState); return AH_TRUE; case HAL_DIAG_EEREAD: if (argsize != sizeof(uint16_t)) return AH_FALSE; if (!ath_hal_eepromRead(ah, *(const uint16_t *)args, *result)) return AH_FALSE; *resultsize = sizeof(uint16_t); return AH_TRUE; #ifdef AH_PRIVATE_DIAG case HAL_DIAG_SETKEY: { const HAL_DIAG_KEYVAL *dk; if (argsize != sizeof(HAL_DIAG_KEYVAL)) return AH_FALSE; dk = (const HAL_DIAG_KEYVAL *)args; return ah->ah_setKeyCacheEntry(ah, dk->dk_keyix, &dk->dk_keyval, dk->dk_mac, dk->dk_xor); } case HAL_DIAG_RESETKEY: if (argsize != sizeof(uint16_t)) return AH_FALSE; return ah->ah_resetKeyCacheEntry(ah, *(const uint16_t *)args); #ifdef AH_SUPPORT_WRITE_EEPROM case HAL_DIAG_EEWRITE: { const HAL_DIAG_EEVAL *ee; if (argsize != sizeof(HAL_DIAG_EEVAL)) return AH_FALSE; ee = (const HAL_DIAG_EEVAL *)args; return ath_hal_eepromWrite(ah, ee->ee_off, ee->ee_data); } #endif /* AH_SUPPORT_WRITE_EEPROM */ #endif /* AH_PRIVATE_DIAG */ case HAL_DIAG_11NCOMPAT: if (argsize == 0) { *resultsize = sizeof(uint32_t); *((uint32_t *)(*result)) = AH_PRIVATE(ah)->ah_11nCompat; } else if (argsize == sizeof(uint32_t)) { AH_PRIVATE(ah)->ah_11nCompat = *(const uint32_t *)args; } else return AH_FALSE; return AH_TRUE; case HAL_DIAG_CHANSURVEY: *result = &AH_PRIVATE(ah)->ah_chansurvey; *resultsize = sizeof(HAL_CHANNEL_SURVEY); return AH_TRUE; } return AH_FALSE; } /* * Set the properties of the tx queue with the parameters * from qInfo. 
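 *
 * Note that the contention window parameters are normalised to the
 * (2^n - 1) form the hardware expects: a requested CWmin of, say,
 * 1000 is rounded up to the next such value, 1023.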
*/ HAL_BOOL ath_hal_setTxQProps(struct ath_hal *ah, HAL_TX_QUEUE_INFO *qi, const HAL_TXQ_INFO *qInfo) { uint32_t cw; if (qi->tqi_type == HAL_TX_QUEUE_INACTIVE) { HALDEBUG(ah, HAL_DEBUG_TXQUEUE, "%s: inactive queue\n", __func__); return AH_FALSE; } /* XXX validate parameters */ qi->tqi_ver = qInfo->tqi_ver; qi->tqi_subtype = qInfo->tqi_subtype; qi->tqi_qflags = qInfo->tqi_qflags; qi->tqi_priority = qInfo->tqi_priority; if (qInfo->tqi_aifs != HAL_TXQ_USEDEFAULT) qi->tqi_aifs = AH_MIN(qInfo->tqi_aifs, 255); else qi->tqi_aifs = INIT_AIFS; if (qInfo->tqi_cwmin != HAL_TXQ_USEDEFAULT) { cw = AH_MIN(qInfo->tqi_cwmin, 1024); /* make sure that the CWmin is of the form (2^n - 1) */ qi->tqi_cwmin = 1; while (qi->tqi_cwmin < cw) qi->tqi_cwmin = (qi->tqi_cwmin << 1) | 1; } else qi->tqi_cwmin = qInfo->tqi_cwmin; if (qInfo->tqi_cwmax != HAL_TXQ_USEDEFAULT) { cw = AH_MIN(qInfo->tqi_cwmax, 1024); /* make sure that the CWmax is of the form (2^n - 1) */ qi->tqi_cwmax = 1; while (qi->tqi_cwmax < cw) qi->tqi_cwmax = (qi->tqi_cwmax << 1) | 1; } else qi->tqi_cwmax = INIT_CWMAX; /* Set retry limit values */ if (qInfo->tqi_shretry != 0) qi->tqi_shretry = AH_MIN(qInfo->tqi_shretry, 15); else qi->tqi_shretry = INIT_SH_RETRY; if (qInfo->tqi_lgretry != 0) qi->tqi_lgretry = AH_MIN(qInfo->tqi_lgretry, 15); else qi->tqi_lgretry = INIT_LG_RETRY; qi->tqi_cbrPeriod = qInfo->tqi_cbrPeriod; qi->tqi_cbrOverflowLimit = qInfo->tqi_cbrOverflowLimit; qi->tqi_burstTime = qInfo->tqi_burstTime; qi->tqi_readyTime = qInfo->tqi_readyTime; switch (qInfo->tqi_subtype) { case HAL_WME_UPSD: if (qi->tqi_type == HAL_TX_QUEUE_DATA) qi->tqi_intFlags = HAL_TXQ_USE_LOCKOUT_BKOFF_DIS; break; default: break; /* NB: silence compiler */ } return AH_TRUE; } HAL_BOOL ath_hal_getTxQProps(struct ath_hal *ah, HAL_TXQ_INFO *qInfo, const HAL_TX_QUEUE_INFO *qi) { if (qi->tqi_type == HAL_TX_QUEUE_INACTIVE) { HALDEBUG(ah, HAL_DEBUG_TXQUEUE, "%s: inactive queue\n", __func__); return AH_FALSE; } qInfo->tqi_qflags = qi->tqi_qflags; qInfo->tqi_ver = qi->tqi_ver; qInfo->tqi_subtype = qi->tqi_subtype; qInfo->tqi_qflags = qi->tqi_qflags; qInfo->tqi_priority = qi->tqi_priority; qInfo->tqi_aifs = qi->tqi_aifs; qInfo->tqi_cwmin = qi->tqi_cwmin; qInfo->tqi_cwmax = qi->tqi_cwmax; qInfo->tqi_shretry = qi->tqi_shretry; qInfo->tqi_lgretry = qi->tqi_lgretry; qInfo->tqi_cbrPeriod = qi->tqi_cbrPeriod; qInfo->tqi_cbrOverflowLimit = qi->tqi_cbrOverflowLimit; qInfo->tqi_burstTime = qi->tqi_burstTime; qInfo->tqi_readyTime = qi->tqi_readyTime; return AH_TRUE; } /* 11a Turbo 11b 11g 108g */ static const int16_t NOISE_FLOOR[] = { -96, -93, -98, -96, -93 }; /* * Read the current channel noise floor and return. * If nf cal hasn't finished, channel noise floor should be 0 * and we return a nominal value based on band and frequency. * * NB: This is a private routine used by per-chip code to * implement the ah_getChanNoise method. */ int16_t ath_hal_getChanNoise(struct ath_hal *ah, const struct ieee80211_channel *chan) { HAL_CHANNEL_INTERNAL *ichan; ichan = ath_hal_checkchannel(ah, chan); if (ichan == AH_NULL) { HALDEBUG(ah, HAL_DEBUG_NFCAL, "%s: invalid channel %u/0x%x; no mapping\n", __func__, chan->ic_freq, chan->ic_flags); return 0; } if (ichan->rawNoiseFloor == 0) { WIRELESS_MODE mode = ath_hal_chan2wmode(ah, chan); HALASSERT(mode < WIRELESS_MODE_MAX); return NOISE_FLOOR[mode] + ath_hal_getNfAdjust(ah, ichan); } else return ichan->rawNoiseFloor + ichan->noiseFloorAdjust; } /* * Fetch the current setup of ctl/ext noise floor values. 
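 * These are per-chain values: entry i of each array describes
 * chain i on the control and extension channel respectively.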
* * If the CHANNEL_MIMO_NF_VALID flag isn't set, the array is simply * populated with values from NOISE_FLOOR[] + ath_hal_getNfAdjust(). * * The caller must supply ctl/ext NF arrays which are at least * AH_MAX_CHAINS entries long. */ int ath_hal_get_mimo_chan_noise(struct ath_hal *ah, const struct ieee80211_channel *chan, int16_t *nf_ctl, int16_t *nf_ext) { #ifdef AH_SUPPORT_AR5416 HAL_CHANNEL_INTERNAL *ichan; int i; ichan = ath_hal_checkchannel(ah, chan); if (ichan == AH_NULL) { HALDEBUG(ah, HAL_DEBUG_NFCAL, "%s: invalid channel %u/0x%x; no mapping\n", __func__, chan->ic_freq, chan->ic_flags); for (i = 0; i < AH_MAX_CHAINS; i++) { nf_ctl[i] = nf_ext[i] = 0; } return 0; } /* Return 0 if there's no valid MIMO values (yet) */ if (! (ichan->privFlags & CHANNEL_MIMO_NF_VALID)) { for (i = 0; i < AH_MAX_CHAINS; i++) { nf_ctl[i] = nf_ext[i] = 0; } return 0; } if (ichan->rawNoiseFloor == 0) { WIRELESS_MODE mode = ath_hal_chan2wmode(ah, chan); HALASSERT(mode < WIRELESS_MODE_MAX); /* * See the comment below - this could cause issues for * stations which have a very low RSSI, below the * 'normalised' NF values in NOISE_FLOOR[]. */ for (i = 0; i < AH_MAX_CHAINS; i++) { nf_ctl[i] = nf_ext[i] = NOISE_FLOOR[mode] + ath_hal_getNfAdjust(ah, ichan); } return 1; } else { /* * The value returned here from a MIMO radio is presumed to be * "good enough" as a NF calculation. As RSSI values are calculated * against this, an adjusted NF may be higher than the RSSI value * returned from a vary weak station, resulting in an obscenely * high signal strength calculation being returned. * * This should be re-evaluated at a later date, along with any * signal strength calculations which are made. Quite likely the * RSSI values will need to be adjusted to ensure the calculations * don't "wrap" when RSSI is less than the "adjusted" NF value. * ("Adjust" here is via ichan->noiseFloorAdjust.) */ for (i = 0; i < AH_MAX_CHAINS; i++) { nf_ctl[i] = ichan->noiseFloorCtl[i] + ath_hal_getNfAdjust(ah, ichan); nf_ext[i] = ichan->noiseFloorExt[i] + ath_hal_getNfAdjust(ah, ichan); } return 1; } #else return 0; #endif /* AH_SUPPORT_AR5416 */ } /* * Process all valid raw noise floors into the dBm noise floor values. * Though our device has no reference for a dBm noise floor, we perform * a relative minimization of NF's based on the lowest NF found across a * channel scan. */ void ath_hal_process_noisefloor(struct ath_hal *ah) { HAL_CHANNEL_INTERNAL *c; int16_t correct2, correct5; int16_t lowest2, lowest5; int i; /* * Find the lowest 2GHz and 5GHz noise floor values after adjusting * for statistically recorded NF/channel deviation. */ correct2 = lowest2 = 0; correct5 = lowest5 = 0; for (i = 0; i < AH_PRIVATE(ah)->ah_nchan; i++) { WIRELESS_MODE mode; int16_t nf; c = &AH_PRIVATE(ah)->ah_channels[i]; if (c->rawNoiseFloor >= 0) continue; /* XXX can't identify proper mode */ mode = IS_CHAN_5GHZ(c) ? 
WIRELESS_MODE_11a : WIRELESS_MODE_11g; nf = c->rawNoiseFloor + NOISE_FLOOR[mode] + ath_hal_getNfAdjust(ah, c); if (IS_CHAN_5GHZ(c)) { if (nf < lowest5) { lowest5 = nf; correct5 = NOISE_FLOOR[mode] - (c->rawNoiseFloor + ath_hal_getNfAdjust(ah, c)); } } else { if (nf < lowest2) { lowest2 = nf; correct2 = NOISE_FLOOR[mode] - (c->rawNoiseFloor + ath_hal_getNfAdjust(ah, c)); } } } /* Correct the channels to reach the expected NF value */ for (i = 0; i < AH_PRIVATE(ah)->ah_nchan; i++) { c = &AH_PRIVATE(ah)->ah_channels[i]; if (c->rawNoiseFloor >= 0) continue; /* Apply correction factor */ c->noiseFloorAdjust = ath_hal_getNfAdjust(ah, c) + (IS_CHAN_5GHZ(c) ? correct5 : correct2); HALDEBUG(ah, HAL_DEBUG_NFCAL, "%u raw nf %d adjust %d\n", c->channel, c->rawNoiseFloor, c->noiseFloorAdjust); } } /* * INI support routines. */ int ath_hal_ini_write(struct ath_hal *ah, const HAL_INI_ARRAY *ia, int col, int regWr) { int r; HALASSERT(col < ia->cols); for (r = 0; r < ia->rows; r++) { OS_REG_WRITE(ah, HAL_INI_VAL(ia, r, 0), HAL_INI_VAL(ia, r, col)); /* Analog shift register delay seems needed for Merlin - PR kern/154220 */ if (HAL_INI_VAL(ia, r, 0) >= 0x7800 && HAL_INI_VAL(ia, r, 0) < 0x7900) OS_DELAY(100); DMA_YIELD(regWr); } return regWr; } void ath_hal_ini_bank_setup(uint32_t data[], const HAL_INI_ARRAY *ia, int col) { int r; HALASSERT(col < ia->cols); for (r = 0; r < ia->rows; r++) data[r] = HAL_INI_VAL(ia, r, col); } int ath_hal_ini_bank_write(struct ath_hal *ah, const HAL_INI_ARRAY *ia, const uint32_t data[], int regWr) { int r; for (r = 0; r < ia->rows; r++) { OS_REG_WRITE(ah, HAL_INI_VAL(ia, r, 0), data[r]); DMA_YIELD(regWr); } return regWr; } /* * These are EEPROM board related routines which should likely live in * a helper library of some sort. */ /************************************************************** * ath_ee_getLowerUppderIndex * * Return indices surrounding the value in sorted integer lists. * Requirement: the input list must be monotonically increasing * and populated up to the list size * Returns: match is set if an index in the array matches exactly * or a the target is before or after the range of the array. */ HAL_BOOL ath_ee_getLowerUpperIndex(uint8_t target, uint8_t *pList, uint16_t listSize, uint16_t *indexL, uint16_t *indexR) { uint16_t i; /* * Check first and last elements for beyond ordered array cases. 
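 *
 * For example, with a pList of { 10, 20, 30 }: a target of 5 sets
 * both indices to 0 (match set, as the target is below the range),
 * a target of 20 yields indexL == indexR == 1 with match set, and
 * a target of 25 yields the surrounding pair (1, 2) with match
 * clear.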
*/ if (target <= pList[0]) { *indexL = *indexR = 0; return AH_TRUE; } if (target >= pList[listSize-1]) { *indexL = *indexR = (uint16_t)(listSize - 1); return AH_TRUE; } /* look for value being near or between 2 values in list */ for (i = 0; i < listSize - 1; i++) { /* * If value is close to the current value of the list * then target is not between values, it is one of the values */ if (pList[i] == target) { *indexL = *indexR = i; return AH_TRUE; } /* * Look for value being between current value and next value * if so return these 2 values */ if (target < pList[i + 1]) { *indexL = i; *indexR = (uint16_t)(i + 1); return AH_FALSE; } } HALASSERT(0); *indexL = *indexR = 0; return AH_FALSE; } /************************************************************** * ath_ee_FillVpdTable * * Fill the Vpdlist for indices Pmax-Pmin * Note: pwrMin, pwrMax and Vpdlist are all in dBm * 4 */ HAL_BOOL ath_ee_FillVpdTable(uint8_t pwrMin, uint8_t pwrMax, uint8_t *pPwrList, uint8_t *pVpdList, uint16_t numIntercepts, uint8_t *pRetVpdList) { uint16_t i, k; uint8_t currPwr = pwrMin; uint16_t idxL, idxR; HALASSERT(pwrMax > pwrMin); for (i = 0; i <= (pwrMax - pwrMin) / 2; i++) { ath_ee_getLowerUpperIndex(currPwr, pPwrList, numIntercepts, &(idxL), &(idxR)); if (idxR < 1) idxR = 1; /* extrapolate below */ if (idxL == numIntercepts - 1) idxL = (uint16_t)(numIntercepts - 2); /* extrapolate above */ if (pPwrList[idxL] == pPwrList[idxR]) k = pVpdList[idxL]; else k = (uint16_t)( ((currPwr - pPwrList[idxL]) * pVpdList[idxR] + (pPwrList[idxR] - currPwr) * pVpdList[idxL]) / (pPwrList[idxR] - pPwrList[idxL]) ); HALASSERT(k < 256); pRetVpdList[i] = (uint8_t)k; currPwr += 2; /* half dB steps */ } return AH_TRUE; } /************************************************************************** * ath_ee_interpolate * * Returns signed interpolated or the scaled up interpolated value */ int16_t ath_ee_interpolate(uint16_t target, uint16_t srcLeft, uint16_t srcRight, int16_t targetLeft, int16_t targetRight) { int16_t rv; if (srcRight == srcLeft) { rv = targetLeft; } else { rv = (int16_t)( ((target - srcLeft) * targetRight + (srcRight - target) * targetLeft) / (srcRight - srcLeft) ); } return rv; } /* * Adjust the TSF. */ void ath_hal_adjusttsf(struct ath_hal *ah, int32_t tsfdelta) { /* XXX handle wrap/overflow */ OS_REG_WRITE(ah, AR_TSF_L32, OS_REG_READ(ah, AR_TSF_L32) + tsfdelta); } /* * Enable or disable CCA. */ void ath_hal_setcca(struct ath_hal *ah, int ena) { /* * NB: fill me in; this is not provided by default because disabling * CCA in most locales violates regulatory. */ } /* * Get CCA setting. */ int ath_hal_getcca(struct ath_hal *ah) { u_int32_t diag; if (ath_hal_getcapability(ah, HAL_CAP_DIAG, 0, &diag) != HAL_OK) return 1; return ((diag & 0x500000) == 0); } /* * This routine is only needed when supporting EEPROM-in-RAM setups * (eg embedded SoCs and on-board PCI/PCIe devices.) */ /* NB: This is in 16 bit words; not bytes */ /* XXX This doesn't belong here! */ #define ATH_DATA_EEPROM_SIZE 2048 HAL_BOOL ath_hal_EepromDataRead(struct ath_hal *ah, u_int off, uint16_t *data) { if (ah->ah_eepromdata == AH_NULL) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: no eeprom data!\n", __func__); return AH_FALSE; } if (off > ATH_DATA_EEPROM_SIZE) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: offset %x > %x\n", __func__, off, ATH_DATA_EEPROM_SIZE); return AH_FALSE; } (*data) = ah->ah_eepromdata[off]; return AH_TRUE; } /* * Do a 2GHz specific MHz->IEEE based on the hardware * frequency. * * This is the unmapped frequency which is programmed into the hardware. 
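 *
 * For example: 2412 MHz maps to channel 1, 2472 MHz to channel 13 and
 * 2484 MHz (the Japan-only DSSS channel) to channel 14; frequencies
 * above 2484 MHz map to channel 15 and upwards in 20 MHz steps, so
 * 2512 MHz maps to channel 15.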
*/ int ath_hal_mhz2ieee_2ghz(struct ath_hal *ah, int freq) { if (freq == 2484) return 14; if (freq < 2484) return ((int) freq - 2407) / 5; else return 15 + ((freq - 2512) / 20); } /* * Clear the current survey data. * * This should be done during a channel change. */ void ath_hal_survey_clear(struct ath_hal *ah) { OS_MEMZERO(&AH_PRIVATE(ah)->ah_chansurvey, sizeof(AH_PRIVATE(ah)->ah_chansurvey)); } /* * Add a sample to the channel survey. */ void ath_hal_survey_add_sample(struct ath_hal *ah, HAL_SURVEY_SAMPLE *hs) { HAL_CHANNEL_SURVEY *cs; cs = &AH_PRIVATE(ah)->ah_chansurvey; OS_MEMCPY(&cs->samples[cs->cur_sample], hs, sizeof(*hs)); cs->samples[cs->cur_sample].seq_num = cs->cur_seq; cs->cur_sample = (cs->cur_sample + 1) % CHANNEL_SURVEY_SAMPLE_COUNT; cs->cur_seq++; } Index: projects/clang390-import/sys/dev/ath/ath_hal/ah.h =================================================================== --- projects/clang390-import/sys/dev/ath/ath_hal/ah.h (revision 305686) +++ projects/clang390-import/sys/dev/ath/ath_hal/ah.h (revision 305687) @@ -1,1677 +1,1684 @@ /* * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * Copyright (c) 2002-2008 Atheros Communications, Inc. * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. * * $FreeBSD$ */ #ifndef _ATH_AH_H_ #define _ATH_AH_H_ /* * Atheros Hardware Access Layer * * Clients of the HAL call ath_hal_attach to obtain a reference to an ath_hal * structure for use with the device. Hardware-related operations that * follow must call back into the HAL through interface, supplying the * reference as the first parameter. */ #include "ah_osdep.h" /* * The maximum number of TX/RX chains supported. * This is intended to be used by various statistics gathering operations * (NF, RSSI, EVM). */ #define AH_MAX_CHAINS 3 #define AH_MIMO_MAX_EVM_PILOTS 6 /* * __ahdecl is analogous to _cdecl; it defines the calling * convention used within the HAL. For most systems this * can just default to be empty and the compiler will (should) * use _cdecl. For systems where _cdecl is not compatible this * must be defined. See linux/ah_osdep.h for an example. */ #ifndef __ahdecl #define __ahdecl #endif /* * Status codes that may be returned by the HAL. Note that * interfaces that return a status code set it only when an * error occurs--i.e. you cannot check it for success. 
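 *
 * A minimal sketch of the resulting call pattern (the devid/sc/st/sh
 * and ah_config arguments below are placeholders, not real values):
 *
 *   HAL_STATUS status = HAL_OK;    (must be pre-initialized by caller)
 *   struct ath_hal *ah = ath_hal_attach(devid, sc, st, sh,
 *       AH_NULL, &ah_config, &status);
 *   if (ah == AH_NULL)
 *       printf("attach failed: %d\n", status);
 *   else
 *       ... 'status' must NOT be consulted on success ...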
*/ typedef enum { HAL_OK = 0, /* No error */ HAL_ENXIO = 1, /* No hardware present */ HAL_ENOMEM = 2, /* Memory allocation failed */ HAL_EIO = 3, /* Hardware didn't respond as expected */ HAL_EEMAGIC = 4, /* EEPROM magic number invalid */ HAL_EEVERSION = 5, /* EEPROM version invalid */ HAL_EELOCKED = 6, /* EEPROM unreadable */ HAL_EEBADSUM = 7, /* EEPROM checksum invalid */ HAL_EEREAD = 8, /* EEPROM read problem */ HAL_EEBADMAC = 9, /* EEPROM mac address invalid */ HAL_EESIZE = 10, /* EEPROM size not supported */ HAL_EEWRITE = 11, /* Attempt to change write-locked EEPROM */ HAL_EINVAL = 12, /* Invalid parameter to function */ HAL_ENOTSUPP = 13, /* Hardware revision not supported */ HAL_ESELFTEST = 14, /* Hardware self-test failed */ HAL_EINPROGRESS = 15, /* Operation incomplete */ HAL_EEBADREG = 16, /* EEPROM invalid regulatory contents */ HAL_EEBADCC = 17, /* EEPROM invalid country code */ HAL_INV_PMODE = 18, /* Couldn't bring out of sleep state */ } HAL_STATUS; typedef enum { AH_FALSE = 0, /* NB: lots of code assumes false is zero */ AH_TRUE = 1, } HAL_BOOL; typedef enum { HAL_CAP_REG_DMN = 0, /* current regulatory domain */ HAL_CAP_CIPHER = 1, /* hardware supports cipher */ HAL_CAP_TKIP_MIC = 2, /* handle TKIP MIC in hardware */ HAL_CAP_TKIP_SPLIT = 3, /* hardware TKIP uses split keys */ HAL_CAP_PHYCOUNTERS = 4, /* hardware PHY error counters */ HAL_CAP_DIVERSITY = 5, /* hardware supports fast diversity */ HAL_CAP_KEYCACHE_SIZE = 6, /* number of entries in key cache */ HAL_CAP_NUM_TXQUEUES = 7, /* number of hardware xmit queues */ HAL_CAP_VEOL = 9, /* hardware supports virtual EOL */ HAL_CAP_PSPOLL = 10, /* hardware has working PS-Poll support */ HAL_CAP_DIAG = 11, /* hardware diagnostic support */ HAL_CAP_COMPRESSION = 12, /* hardware supports compression */ HAL_CAP_BURST = 13, /* hardware supports packet bursting */ HAL_CAP_FASTFRAME = 14, /* hardware supports fast frames */ HAL_CAP_TXPOW = 15, /* global tx power limit */ HAL_CAP_TPC = 16, /* per-packet tx power control */ HAL_CAP_PHYDIAG = 17, /* hardware phy error diagnostic */ HAL_CAP_BSSIDMASK = 18, /* hardware supports bssid mask */ HAL_CAP_MCAST_KEYSRCH = 19, /* hardware has multicast key search */ HAL_CAP_TSF_ADJUST = 20, /* hardware has beacon tsf adjust */ /* 21 was HAL_CAP_XR */ HAL_CAP_WME_TKIPMIC = 22, /* hardware can support TKIP MIC when WMM is turned on */ /* 23 was HAL_CAP_CHAN_HALFRATE */ /* 24 was HAL_CAP_CHAN_QUARTERRATE */ HAL_CAP_RFSILENT = 25, /* hardware has rfsilent support */ HAL_CAP_TPC_ACK = 26, /* ack txpower with per-packet tpc */ HAL_CAP_TPC_CTS = 27, /* cts txpower with per-packet tpc */ HAL_CAP_11D = 28, /* 11d beacon support for changing cc */ HAL_CAP_PCIE_PS = 29, HAL_CAP_HT = 30, /* hardware can support HT */ HAL_CAP_GTXTO = 31, /* hardware supports global tx timeout */ HAL_CAP_FAST_CC = 32, /* hardware supports fast channel change */ HAL_CAP_TX_CHAINMASK = 33, /* mask of TX chains supported */ HAL_CAP_RX_CHAINMASK = 34, /* mask of RX chains supported */ HAL_CAP_NUM_GPIO_PINS = 36, /* number of GPIO pins */ HAL_CAP_CST = 38, /* hardware supports carrier sense timeout */ HAL_CAP_RIFS_RX = 39, HAL_CAP_RIFS_TX = 40, HAL_CAP_FORCE_PPM = 41, HAL_CAP_RTS_AGGR_LIMIT = 42, /* aggregation limit with RTS */ HAL_CAP_4ADDR_AGGR = 43, /* hardware is capable of 4addr aggregation */ HAL_CAP_DFS_DMN = 44, /* current DFS domain */ HAL_CAP_EXT_CHAN_DFS = 45, /* DFS support for extension channel */ HAL_CAP_COMBINED_RADAR_RSSI = 46, /* Is combined RSSI for radar accurate */ HAL_CAP_AUTO_SLEEP = 48, /* hardware can go to
network sleep automatically after waking up to receive TIM */ HAL_CAP_MBSSID_AGGR_SUPPORT = 49, /* Support for mBSSID Aggregation */ HAL_CAP_SPLIT_4KB_TRANS = 50, /* hardware supports descriptors straddling a 4k page boundary */ HAL_CAP_REG_FLAG = 51, /* Regulatory domain flags */ HAL_CAP_BB_RIFS_HANG = 52, HAL_CAP_RIFS_RX_ENABLED = 53, HAL_CAP_BB_DFS_HANG = 54, HAL_CAP_RX_STBC = 58, HAL_CAP_TX_STBC = 59, HAL_CAP_BT_COEX = 60, /* hardware is capable of bluetooth coexistence */ HAL_CAP_DYNAMIC_SMPS = 61, /* Dynamic MIMO Power Save hardware support */ HAL_CAP_DS = 67, /* 2 stream */ HAL_CAP_BB_RX_CLEAR_STUCK_HANG = 68, HAL_CAP_MAC_HANG = 69, /* can MAC hang */ HAL_CAP_MFP = 70, /* Management Frame Protection in hardware */ HAL_CAP_TS = 72, /* 3 stream */ HAL_CAP_ENHANCED_DMA_SUPPORT = 75, /* DMA FIFO support */ HAL_CAP_NUM_TXMAPS = 76, /* Number of buffers in a transmit descriptor */ HAL_CAP_TXDESCLEN = 77, /* Length of transmit descriptor */ HAL_CAP_TXSTATUSLEN = 78, /* Length of transmit status descriptor */ HAL_CAP_RXSTATUSLEN = 79, /* Length of receive status descriptor */ HAL_CAP_RXFIFODEPTH = 80, /* Receive hardware FIFO depth */ HAL_CAP_RXBUFSIZE = 81, /* Receive Buffer Length */ HAL_CAP_NUM_MR_RETRIES = 82, /* limit on multirate retries */ HAL_CAP_OL_PWRCTRL = 84, /* Open loop TX power control */ HAL_CAP_SPECTRAL_SCAN = 90, /* Hardware supports spectral scan */ HAL_CAP_BB_PANIC_WATCHDOG = 92, HAL_CAP_HT20_SGI = 96, /* hardware supports HT20 short GI */ HAL_CAP_LDPC = 99, HAL_CAP_RXTSTAMP_PREC = 100, /* rx desc tstamp precision (bits) */ HAL_CAP_ANT_DIV_COMB = 105, /* Enable antenna diversity/combining */ HAL_CAP_PHYRESTART_CLR_WAR = 106, /* in some cases, clear phy restart to fix bb hang */ HAL_CAP_ENTERPRISE_MODE = 107, /* Enterprise mode features */ HAL_CAP_LDPCWAR = 108, HAL_CAP_CHANNEL_SWITCH_TIME_USEC = 109, /* Channel change time, usec */ HAL_CAP_ENABLE_APM = 110, /* APM enabled */ HAL_CAP_PCIE_LCR_EXTSYNC_EN = 111, HAL_CAP_PCIE_LCR_OFFSET = 112, HAL_CAP_ENHANCED_DFS_SUPPORT = 117, /* hardware supports enhanced DFS */ HAL_CAP_MCI = 118, HAL_CAP_SMARTANTENNA = 119, HAL_CAP_TRAFFIC_FAST_RECOVER = 120, HAL_CAP_TX_DIVERSITY = 121, HAL_CAP_CRDC = 122, /* The following are private to the FreeBSD HAL (224 onward) */ HAL_CAP_INTMIT = 229, /* interference mitigation */ HAL_CAP_RXORN_FATAL = 230, /* HAL_INT_RXORN treated as fatal */ HAL_CAP_BB_HANG = 235, /* can baseband hang */ HAL_CAP_INTRMASK = 237, /* bitmask of supported interrupts */ HAL_CAP_BSSIDMATCH = 238, /* hardware can disable bssid match */ HAL_CAP_STREAMS = 239, /* how many 802.11n spatial streams are available */ HAL_CAP_RXDESC_SELFLINK = 242, /* support a self-linked tail RX descriptor */ HAL_CAP_BB_READ_WAR = 244, /* baseband read WAR */ HAL_CAP_SERIALISE_WAR = 245, /* serialise register access on PCI */ HAL_CAP_ENFORCE_TXOP = 246, /* Enforce TXOP if supported */ HAL_CAP_RX_LNA_MIXING = 247, /* RX hardware uses LNA mixing */ HAL_CAP_DO_MYBEACON = 248, /* Supports HAL_RX_FILTER_MYBEACON */ HAL_CAP_TOA_LOCATIONING = 249, /* time of flight / arrival locationing */ HAL_CAP_TXTSTAMP_PREC = 250, /* tx desc tstamp precision (bits) */ } HAL_CAPABILITY_TYPE; /* * "States" for setting the LED. These correspond to * the possible 802.11 operational states and there may * be a many-to-one mapping between these states and the * actual hardware state for the LEDs (i.e. the hardware * may have fewer states).
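 *
 * A driver would typically collapse its 802.11 state machine onto
 * these, e.g. (an illustrative mapping only): scanning -> HAL_LED_SCAN,
 * authenticating -> HAL_LED_AUTH, associating -> HAL_LED_ASSOC and
 * operating/running -> HAL_LED_RUN, applied via the ah_setLedState
 * method.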
*/ typedef enum { HAL_LED_INIT = 0, HAL_LED_SCAN = 1, HAL_LED_AUTH = 2, HAL_LED_ASSOC = 3, HAL_LED_RUN = 4 } HAL_LED_STATE; /* * Transmit queue types/numbers. These are used to tag * each transmit queue in the hardware and to identify a set * of transmit queues for operations such as start/stop dma. */ typedef enum { HAL_TX_QUEUE_INACTIVE = 0, /* queue is inactive/unused */ HAL_TX_QUEUE_DATA = 1, /* data xmit q's */ HAL_TX_QUEUE_BEACON = 2, /* beacon xmit q */ HAL_TX_QUEUE_CAB = 3, /* "crap after beacon" xmit q */ HAL_TX_QUEUE_UAPSD = 4, /* u-apsd power save xmit q */ HAL_TX_QUEUE_PSPOLL = 5, /* power save poll xmit q */ HAL_TX_QUEUE_CFEND = 6, HAL_TX_QUEUE_PAPRD = 7, } HAL_TX_QUEUE; #define HAL_NUM_TX_QUEUES 10 /* max possible # of queues */ /* * Receive queue types. These are used to tag * each receive queue in the hardware and to identify a set * of receive queues for operations such as start/stop dma. */ typedef enum { HAL_RX_QUEUE_HP = 0, /* high priority recv queue */ HAL_RX_QUEUE_LP = 1, /* low priority recv queue */ } HAL_RX_QUEUE; #define HAL_NUM_RX_QUEUES 2 /* max possible # of queues */ #define HAL_TXFIFO_DEPTH 8 /* transmit fifo depth */ /* * Transmit queue subtype. These map directly to * WME Access Categories (except for UPSD). Refer * to Table 5 of the WME spec. */ typedef enum { HAL_WME_AC_BK = 0, /* background access category */ HAL_WME_AC_BE = 1, /* best effort access category */ HAL_WME_AC_VI = 2, /* video access category */ HAL_WME_AC_VO = 3, /* voice access category */ HAL_WME_UPSD = 4, /* uplink power save */ } HAL_TX_QUEUE_SUBTYPE; /* * Transmit queue flags that control various * operational parameters. */ typedef enum { /* * Per queue interrupt enables. When set the associated * interrupt may be delivered for packets sent through * the queue. Without these enabled no interrupts will * be delivered for transmits through the queue. */ HAL_TXQ_TXOKINT_ENABLE = 0x0001, /* enable TXOK interrupt */ HAL_TXQ_TXERRINT_ENABLE = 0x0001, /* enable TXERR interrupt */ HAL_TXQ_TXDESCINT_ENABLE = 0x0002, /* enable TXDESC interrupt */ HAL_TXQ_TXEOLINT_ENABLE = 0x0004, /* enable TXEOL interrupt */ HAL_TXQ_TXURNINT_ENABLE = 0x0008, /* enable TXURN interrupt */ /* * Enable hardware compression for packets sent through * the queue. The compression buffer must be set up and * packets must have a key entry marked in the tx descriptor. */ HAL_TXQ_COMPRESSION_ENABLE = 0x0010, /* enable h/w compression */ /* * Disable queue when veol is hit or ready time expires. * By default the queue is disabled only on reaching the * physical end of queue (i.e. a null link ptr in the * descriptor chain). */ HAL_TXQ_RDYTIME_EXP_POLICY_ENABLE = 0x0020, /* * Schedule frames on delivery of a DBA (DMA Beacon Alert) * event. Frames will be transmitted only when this timer * fires, e.g. to transmit a beacon in ap or adhoc modes. */ HAL_TXQ_DBA_GATED = 0x0040, /* schedule based on DBA */ /* * Each transmit queue has a counter that is incremented * each time the queue is enabled and decremented when * the list of frames to transmit is traversed (or when * the ready time for the queue expires). This counter * must be non-zero for frames to be scheduled for * transmission. The following controls disable bumping * this counter under certain conditions. Typically this * is used to gate frames based on the contents of another * queue (e.g. CAB traffic may only follow a beacon frame). * These are meaningful only when frames are scheduled * with a non-ASAP policy (e.g. DBA-gated).
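 *
 * As an illustrative sketch (not the driver's literal code), a CAB
 * queue gated on DBA events and on the beacon queue becoming empty
 * might be configured as:
 *
 *   HAL_TXQ_INFO qi;
 *   OS_MEMZERO(&qi, sizeof(qi));
 *   qi.tqi_aifs = HAL_TXQ_USEDEFAULT;
 *   qi.tqi_cwmin = HAL_TXQ_USEDEFAULT;
 *   qi.tqi_cwmax = HAL_TXQ_USEDEFAULT;
 *   qi.tqi_qflags = HAL_TXQ_DBA_GATED | HAL_TXQ_CBR_DIS_BEMPTY;
 *   int qnum = ah->ah_setupTxQueue(ah, HAL_TX_QUEUE_CAB, &qi);
 *
 * (HAL_TXQ_INFO and HAL_TXQ_USEDEFAULT are defined below.)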
*/ HAL_TXQ_CBR_DIS_QEMPTY = 0x0080, /* disable on this q empty */ HAL_TXQ_CBR_DIS_BEMPTY = 0x0100, /* disable on beacon q empty */ /* * Fragment burst backoff policy. Normally no backoff * is done after a successful transmission; the next fragment * is sent at SIFS. If this flag is set, backoff is done * after each fragment, regardless of whether it was ack'd or * not; after the backoff count reaches zero a normal channel * access procedure is done before the next transmit (i.e. * wait AIFS instead of SIFS). */ HAL_TXQ_FRAG_BURST_BACKOFF_ENABLE = 0x00800000, /* * Disable post-tx backoff following each frame. */ HAL_TXQ_BACKOFF_DISABLE = 0x00010000, /* disable post backoff */ /* * DCU arbiter lockout control. This controls how * lower priority tx queues are handled with respect * to a specific queue when multiple queues have frames * to send. No lockout means lower priority queues arbitrate * concurrently with this queue. Intra-frame lockout * means lower priority queues are locked out until the * current frame transmits (e.g. including backoffs and bursting). * Global lockout means nothing lower can arbitrate so * long as there is traffic activity on this queue (frames, * backoff, etc). */ HAL_TXQ_ARB_LOCKOUT_INTRA = 0x00020000, /* intra-frame lockout */ HAL_TXQ_ARB_LOCKOUT_GLOBAL = 0x00040000, /* full lockout */ HAL_TXQ_IGNORE_VIRTCOL = 0x00080000, /* ignore virt collisions */ HAL_TXQ_SEQNUM_INC_DIS = 0x00100000, /* disable seqnum increment */ } HAL_TX_QUEUE_FLAGS; typedef struct { uint32_t tqi_ver; /* hal TXQ version */ HAL_TX_QUEUE_SUBTYPE tqi_subtype; /* subtype if applicable */ HAL_TX_QUEUE_FLAGS tqi_qflags; /* flags (see above) */ uint32_t tqi_priority; /* (not used) */ uint32_t tqi_aifs; /* aifs */ uint32_t tqi_cwmin; /* cwMin */ uint32_t tqi_cwmax; /* cwMax */ uint16_t tqi_shretry; /* rts retry limit */ uint16_t tqi_lgretry; /* long retry limit (not used)*/ uint32_t tqi_cbrPeriod; /* CBR period (us) */ uint32_t tqi_cbrOverflowLimit; /* threshold for CBROVF int */ uint32_t tqi_burstTime; /* max burst duration (us) */ uint32_t tqi_readyTime; /* frame schedule time (us) */ uint32_t tqi_compBuf; /* comp buffer phys addr */ } HAL_TXQ_INFO; #define HAL_TQI_NONVAL 0xffff /* token to use for aifs, cwmin, cwmax */ #define HAL_TXQ_USEDEFAULT ((uint32_t) -1) /* compression definitions */ #define HAL_COMP_BUF_MAX_SIZE 9216 /* 9K */ #define HAL_COMP_BUF_ALIGN_SIZE 512 /* * Transmit packet types. This belongs in ah_desc.h, but * is here so we can give a proper type to various parameters * (and not require everyone to include the file). * * NB: These values are intentionally assigned for * direct use when setting up h/w descriptors. */ typedef enum { HAL_PKT_TYPE_NORMAL = 0, HAL_PKT_TYPE_ATIM = 1, HAL_PKT_TYPE_PSPOLL = 2, HAL_PKT_TYPE_BEACON = 3, HAL_PKT_TYPE_PROBE_RESP = 4, HAL_PKT_TYPE_CHIRP = 5, HAL_PKT_TYPE_GRP_POLL = 6, HAL_PKT_TYPE_AMPDU = 7, } HAL_PKT_TYPE; /* Rx Filter Frame Types */ typedef enum { /* * These bits correspond to AR_RX_FILTER for all chips. * Not all bits are supported by all chips.
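 *
 * As a sketch, a simple station-mode configuration (illustrative
 * only; real drivers add further bits conditionally) might be:
 *
 *   uint32_t rfilt = HAL_RX_FILTER_UCAST | HAL_RX_FILTER_MCAST |
 *       HAL_RX_FILTER_BCAST | HAL_RX_FILTER_BEACON;
 *   ah->ah_setRxFilter(ah, rfilt);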
*/ HAL_RX_FILTER_UCAST = 0x00000001, /* Allow unicast frames */ HAL_RX_FILTER_MCAST = 0x00000002, /* Allow multicast frames */ HAL_RX_FILTER_BCAST = 0x00000004, /* Allow broadcast frames */ HAL_RX_FILTER_CONTROL = 0x00000008, /* Allow control frames */ HAL_RX_FILTER_BEACON = 0x00000010, /* Allow beacon frames */ HAL_RX_FILTER_PROM = 0x00000020, /* Promiscuous mode */ HAL_RX_FILTER_PROBEREQ = 0x00000080, /* Allow probe request frames */ HAL_RX_FILTER_PHYERR = 0x00000100, /* Allow phy errors */ HAL_RX_FILTER_MYBEACON = 0x00000200, /* Filter beacons other than mine */ HAL_RX_FILTER_COMPBAR = 0x00000400, /* Allow compressed BAR */ HAL_RX_FILTER_COMP_BA = 0x00000800, /* Allow compressed blockack */ HAL_RX_FILTER_PHYRADAR = 0x00002000, /* Allow phy radar errors */ HAL_RX_FILTER_PSPOLL = 0x00004000, /* Allow PS-POLL frames */ HAL_RX_FILTER_MCAST_BCAST_ALL = 0x00008000, /* Allow all mcast/bcast frames */ /* * Magic RX filter flags that aren't targeting hardware bits * but instead the HAL sets individual bits - eg PHYERR will result * in OFDM/CCK timing error frames being received. */ HAL_RX_FILTER_BSSID = 0x40000000, /* Disable BSSID match */ } HAL_RX_FILTER; typedef enum { HAL_PM_AWAKE = 0, HAL_PM_FULL_SLEEP = 1, HAL_PM_NETWORK_SLEEP = 2, HAL_PM_UNDEFINED = 3 } HAL_POWER_MODE; /* * Enterprise mode flags */ #define AH_ENT_DUAL_BAND_DISABLE 0x00000001 #define AH_ENT_CHAIN2_DISABLE 0x00000002 #define AH_ENT_5MHZ_DISABLE 0x00000004 #define AH_ENT_10MHZ_DISABLE 0x00000008 #define AH_ENT_49GHZ_DISABLE 0x00000010 #define AH_ENT_LOOPBACK_DISABLE 0x00000020 #define AH_ENT_TPC_PERF_DISABLE 0x00000040 #define AH_ENT_MIN_PKT_SIZE_DISABLE 0x00000080 #define AH_ENT_SPECTRAL_PRECISION 0x00000300 #define AH_ENT_SPECTRAL_PRECISION_S 8 #define AH_ENT_RTSCTS_DELIM_WAR 0x00010000 #define AH_FIRST_DESC_NDELIMS 60 /* * NOTE WELL: * These are mapped to take advantage of the common locations for many of * the bits on all of the currently supported MAC chips. This is to make * the ISR as efficient as possible, while still abstracting HW differences. * When new hardware breaks this commonality this enumerated type, as well * as the HAL functions using it, must be modified. All values are directly * mapped unless commented otherwise. 
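 *
 * A hedged sketch of the resulting ISR dispatch pattern:
 *
 *   HAL_INT status;
 *   if (ah->ah_getPendingInterrupts(ah, &status)) {
 *       if (status & HAL_INT_FATAL)
 *           ... full chip reset ...
 *       if (status & HAL_INT_RX)
 *           ... schedule receive processing ...
 *       if (status & HAL_INT_TX)
 *           ... schedule transmit completion ...
 *   }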
*/ typedef enum { HAL_INT_RX = 0x00000001, /* Non-common mapping */ HAL_INT_RXDESC = 0x00000002, /* Legacy mapping */ HAL_INT_RXERR = 0x00000004, HAL_INT_RXHP = 0x00000001, /* EDMA */ HAL_INT_RXLP = 0x00000002, /* EDMA */ HAL_INT_RXNOFRM = 0x00000008, HAL_INT_RXEOL = 0x00000010, HAL_INT_RXORN = 0x00000020, HAL_INT_TX = 0x00000040, /* Non-common mapping */ HAL_INT_TXDESC = 0x00000080, HAL_INT_TIM_TIMER= 0x00000100, HAL_INT_MCI = 0x00000200, HAL_INT_BBPANIC = 0x00000400, HAL_INT_TXURN = 0x00000800, HAL_INT_MIB = 0x00001000, HAL_INT_RXPHY = 0x00004000, HAL_INT_RXKCM = 0x00008000, HAL_INT_SWBA = 0x00010000, HAL_INT_BRSSI = 0x00020000, HAL_INT_BMISS = 0x00040000, HAL_INT_BNR = 0x00100000, HAL_INT_TIM = 0x00200000, /* Non-common mapping */ HAL_INT_DTIM = 0x00400000, /* Non-common mapping */ HAL_INT_DTIMSYNC= 0x00800000, /* Non-common mapping */ HAL_INT_GPIO = 0x01000000, HAL_INT_CABEND = 0x02000000, /* Non-common mapping */ HAL_INT_TSFOOR = 0x04000000, /* Non-common mapping */ HAL_INT_TBTT = 0x08000000, /* Non-common mapping */ /* Atheros ref driver has a generic timer interrupt now..*/ HAL_INT_GENTIMER = 0x08000000, /* Non-common mapping */ HAL_INT_CST = 0x10000000, /* Non-common mapping */ HAL_INT_GTT = 0x20000000, /* Non-common mapping */ HAL_INT_FATAL = 0x40000000, /* Non-common mapping */ #define HAL_INT_GLOBAL 0x80000000 /* Set/clear IER */ HAL_INT_BMISC = HAL_INT_TIM | HAL_INT_DTIM | HAL_INT_DTIMSYNC | HAL_INT_CABEND | HAL_INT_TBTT, /* Interrupt bits that map directly to ISR/IMR bits */ HAL_INT_COMMON = HAL_INT_RXNOFRM | HAL_INT_RXDESC | HAL_INT_RXEOL | HAL_INT_RXORN | HAL_INT_TXDESC | HAL_INT_TXURN | HAL_INT_MIB | HAL_INT_RXPHY | HAL_INT_RXKCM | HAL_INT_SWBA | HAL_INT_BMISS | HAL_INT_BRSSI | HAL_INT_BNR | HAL_INT_GPIO, } HAL_INT; /* * MSI vector assignments */ typedef enum { HAL_MSIVEC_MISC = 0, HAL_MSIVEC_TX = 1, HAL_MSIVEC_RXLP = 2, HAL_MSIVEC_RXHP = 3, } HAL_MSIVEC; typedef enum { HAL_INT_LINE = 0, HAL_INT_MSI = 1, } HAL_INT_TYPE; /* For interrupt mitigation registers */ typedef enum { HAL_INT_RX_FIRSTPKT=0, HAL_INT_RX_LASTPKT, HAL_INT_TX_FIRSTPKT, HAL_INT_TX_LASTPKT, HAL_INT_THRESHOLD } HAL_INT_MITIGATION; /* XXX this is duplicate information! 
*/ typedef struct { u_int32_t cyclecnt_diff; /* delta cycle count */ u_int32_t rxclr_cnt; /* rx clear count */ u_int32_t extrxclr_cnt; /* ext chan rx clear count */ u_int32_t txframecnt_diff; /* delta tx frame count */ u_int32_t rxframecnt_diff; /* delta rx frame count */ u_int32_t listen_time; /* listen time in msec - time for which ch is free */ u_int32_t ofdmphyerr_cnt; /* OFDM err count since last reset */ u_int32_t cckphyerr_cnt; /* CCK err count since last reset */ u_int32_t ofdmphyerrcnt_diff; /* delta OFDM Phy Error Count */ HAL_BOOL valid; /* if the stats are valid*/ } HAL_ANISTATS; typedef struct { u_int8_t txctl_offset; u_int8_t txctl_numwords; u_int8_t txstatus_offset; u_int8_t txstatus_numwords; u_int8_t rxctl_offset; u_int8_t rxctl_numwords; u_int8_t rxstatus_offset; u_int8_t rxstatus_numwords; u_int8_t macRevision; } HAL_DESC_INFO; typedef enum { HAL_GPIO_OUTPUT_MUX_AS_OUTPUT = 0, HAL_GPIO_OUTPUT_MUX_PCIE_ATTENTION_LED = 1, HAL_GPIO_OUTPUT_MUX_PCIE_POWER_LED = 2, HAL_GPIO_OUTPUT_MUX_MAC_NETWORK_LED = 3, HAL_GPIO_OUTPUT_MUX_MAC_POWER_LED = 4, HAL_GPIO_OUTPUT_MUX_AS_WLAN_ACTIVE = 5, HAL_GPIO_OUTPUT_MUX_AS_TX_FRAME = 6, HAL_GPIO_OUTPUT_MUX_AS_MCI_WLAN_DATA, HAL_GPIO_OUTPUT_MUX_AS_MCI_WLAN_CLK, HAL_GPIO_OUTPUT_MUX_AS_MCI_BT_DATA, HAL_GPIO_OUTPUT_MUX_AS_MCI_BT_CLK, HAL_GPIO_OUTPUT_MUX_AS_WL_IN_TX, HAL_GPIO_OUTPUT_MUX_AS_WL_IN_RX, HAL_GPIO_OUTPUT_MUX_AS_BT_IN_TX, HAL_GPIO_OUTPUT_MUX_AS_BT_IN_RX, HAL_GPIO_OUTPUT_MUX_AS_RUCKUS_STROBE, HAL_GPIO_OUTPUT_MUX_AS_RUCKUS_DATA, HAL_GPIO_OUTPUT_MUX_AS_SMARTANT_CTRL0, HAL_GPIO_OUTPUT_MUX_AS_SMARTANT_CTRL1, HAL_GPIO_OUTPUT_MUX_AS_SMARTANT_CTRL2, HAL_GPIO_OUTPUT_MUX_NUM_ENTRIES } HAL_GPIO_MUX_TYPE; typedef enum { HAL_GPIO_INTR_LOW = 0, HAL_GPIO_INTR_HIGH = 1, HAL_GPIO_INTR_DISABLE = 2 } HAL_GPIO_INTR_TYPE; typedef struct halCounters { u_int32_t tx_frame_count; u_int32_t rx_frame_count; u_int32_t rx_clear_count; u_int32_t cycle_count; u_int8_t is_rx_active; // true (1) or false (0) u_int8_t is_tx_active; // true (1) or false (0) } HAL_COUNTERS; typedef enum { HAL_RFGAIN_INACTIVE = 0, HAL_RFGAIN_READ_REQUESTED = 1, HAL_RFGAIN_NEED_CHANGE = 2 } HAL_RFGAIN; typedef uint16_t HAL_CTRY_CODE; /* country code */ typedef uint16_t HAL_REG_DOMAIN; /* regulatory domain code */ #define HAL_ANTENNA_MIN_MODE 0 #define HAL_ANTENNA_FIXED_A 1 #define HAL_ANTENNA_FIXED_B 2 #define HAL_ANTENNA_MAX_MODE 3 typedef struct { uint32_t ackrcv_bad; uint32_t rts_bad; uint32_t rts_good; uint32_t fcs_bad; uint32_t beacons; } HAL_MIB_STATS; /* * These bits represent what's in ah_currentRDext. 
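 *
 * Each enumerator is a bit position, so a (hypothetical) check for
 * HT40 DFS support under an FCC SKU would look like:
 *
 *   if (AH_PRIVATE(ah)->ah_currentRDext & (1 << REG_EXT_FCC_DFS_HT40))
 *       ... HT40 operation permitted on DFS channels ...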
*/ typedef enum { REG_EXT_FCC_MIDBAND = 0, REG_EXT_JAPAN_MIDBAND = 1, REG_EXT_FCC_DFS_HT40 = 2, REG_EXT_JAPAN_NONDFS_HT40 = 3, REG_EXT_JAPAN_DFS_HT40 = 4 } REG_EXT_BITMAP; enum { HAL_MODE_11A = 0x001, /* 11a channels */ HAL_MODE_TURBO = 0x002, /* 11a turbo-only channels */ HAL_MODE_11B = 0x004, /* 11b channels */ HAL_MODE_PUREG = 0x008, /* 11g channels (OFDM only) */ #ifdef notdef HAL_MODE_11G = 0x010, /* 11g channels (OFDM/CCK) */ #else HAL_MODE_11G = 0x008, /* XXX historical */ #endif HAL_MODE_108G = 0x020, /* 11g+Turbo channels */ HAL_MODE_108A = 0x040, /* 11a+Turbo channels */ HAL_MODE_11A_HALF_RATE = 0x200, /* 11a half width channels */ HAL_MODE_11A_QUARTER_RATE = 0x400, /* 11a quarter width channels */ HAL_MODE_11G_HALF_RATE = 0x800, /* 11g half width channels */ HAL_MODE_11G_QUARTER_RATE = 0x1000, /* 11g quarter width channels */ HAL_MODE_11NG_HT20 = 0x008000, HAL_MODE_11NA_HT20 = 0x010000, HAL_MODE_11NG_HT40PLUS = 0x020000, HAL_MODE_11NG_HT40MINUS = 0x040000, HAL_MODE_11NA_HT40PLUS = 0x080000, HAL_MODE_11NA_HT40MINUS = 0x100000, HAL_MODE_ALL = 0xffffff }; typedef struct { int rateCount; /* NB: for proper padding */ uint8_t rateCodeToIndex[256]; /* back mapping */ struct { uint8_t valid; /* valid for rate control use */ uint8_t phy; /* CCK/OFDM/XR */ uint32_t rateKbps; /* transfer rate in kbs */ uint8_t rateCode; /* rate for h/w descriptors */ uint8_t shortPreamble; /* mask for enabling short * preamble in CCK rate code */ uint8_t dot11Rate; /* value for supported rates * info element of MLME */ uint8_t controlRate; /* index of next lower basic * rate; used for dur. calcs */ uint16_t lpAckDuration; /* long preamble ACK duration */ uint16_t spAckDuration; /* short preamble ACK duration*/ } info[64]; } HAL_RATE_TABLE; typedef struct { u_int rs_count; /* number of valid entries */ uint8_t rs_rates[64]; /* rates */ } HAL_RATE_SET; /* * 802.11n specific structures and enums */ typedef enum { HAL_CHAINTYPE_TX = 1, /* Tx chain type */ HAL_CHAINTYPE_RX = 2, /* RX chain type */ } HAL_CHAIN_TYPE; typedef struct { u_int Tries; u_int Rate; /* hardware rate code */ u_int RateIndex; /* rate series table index */ u_int PktDuration; u_int ChSel; u_int RateFlags; #define HAL_RATESERIES_RTS_CTS 0x0001 /* use rts/cts w/this series */ #define HAL_RATESERIES_2040 0x0002 /* use ext channel for series */ #define HAL_RATESERIES_HALFGI 0x0004 /* use half-gi for series */ #define HAL_RATESERIES_STBC 0x0008 /* use STBC for series */ u_int tx_power_cap; /* in 1/2 dBm units XXX TODO */ } HAL_11N_RATE_SERIES; typedef enum { HAL_HT_MACMODE_20 = 0, /* 20 MHz operation */ HAL_HT_MACMODE_2040 = 1, /* 20/40 MHz operation */ } HAL_HT_MACMODE; typedef enum { HAL_HT_PHYMODE_20 = 0, /* 20 MHz operation */ HAL_HT_PHYMODE_2040 = 1, /* 20/40 MHz operation */ } HAL_HT_PHYMODE; typedef enum { HAL_HT_EXTPROTSPACING_20 = 0, /* 20 MHz spacing */ HAL_HT_EXTPROTSPACING_25 = 1, /* 25 MHz spacing */ } HAL_HT_EXTPROTSPACING; typedef enum { HAL_RX_CLEAR_CTL_LOW = 0x1, /* force control channel to appear busy */ HAL_RX_CLEAR_EXT_LOW = 0x2, /* force extension channel to appear busy */ } HAL_HT_RXCLEAR; typedef enum { HAL_FREQ_BAND_5GHZ = 0, HAL_FREQ_BAND_2GHZ = 1, } HAL_FREQ_BAND; /* * Antenna switch control. By default antenna selection * enables multiple (2) antenna use. To force use of the * A or B antenna only specify a fixed setting. Fixing * the antenna will also disable any diversity support. 
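 *
 * e.g. forcing antenna A (and implicitly disabling diversity) is
 * simply:
 *
 *   ah->ah_setAntennaSwitch(ah, HAL_ANT_FIXED_A);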
*/ typedef enum { HAL_ANT_VARIABLE = 0, /* variable by programming */ HAL_ANT_FIXED_A = 1, /* fixed antenna A */ HAL_ANT_FIXED_B = 2, /* fixed antenna B */ } HAL_ANT_SETTING; typedef enum { HAL_M_STA = 1, /* infrastructure station */ HAL_M_IBSS = 0, /* IBSS (adhoc) station */ HAL_M_HOSTAP = 6, /* Software Access Point */ HAL_M_MONITOR = 8 /* Monitor mode */ } HAL_OPMODE; typedef enum { HAL_RESET_NORMAL = 0, /* Do normal reset */ HAL_RESET_BBPANIC = 1, /* Reset because of BB panic */ HAL_RESET_FORCE_COLD = 2, /* Force full reset */ } HAL_RESET_TYPE; typedef struct { uint8_t kv_type; /* one of HAL_CIPHER */ uint8_t kv_apsd; /* Mask for APSD enabled ACs */ uint16_t kv_len; /* length in bits */ uint8_t kv_val[16]; /* enough for 128-bit keys */ uint8_t kv_mic[8]; /* TKIP MIC key */ uint8_t kv_txmic[8]; /* TKIP TX MIC key (optional) */ } HAL_KEYVAL; /* * This is the TX descriptor field which marks the key padding requirement. * The naming is unfortunately unclear. */ #define AH_KEYTYPE_MASK 0x0F typedef enum { HAL_KEY_TYPE_CLEAR, HAL_KEY_TYPE_WEP, HAL_KEY_TYPE_AES, HAL_KEY_TYPE_TKIP, } HAL_KEY_TYPE; typedef enum { HAL_CIPHER_WEP = 0, HAL_CIPHER_AES_OCB = 1, HAL_CIPHER_AES_CCM = 2, HAL_CIPHER_CKIP = 3, HAL_CIPHER_TKIP = 4, HAL_CIPHER_CLR = 5, /* no encryption */ HAL_CIPHER_MIC = 127 /* TKIP-MIC, not a cipher */ } HAL_CIPHER; enum { HAL_SLOT_TIME_6 = 6, /* NB: for turbo mode */ HAL_SLOT_TIME_9 = 9, HAL_SLOT_TIME_20 = 20, }; /* * Per-station beacon timer state. Note that the specified * beacon interval (given in TU's) can also include flags * to force a TSF reset and to enable the beacon xmit logic. * If bs_cfpmaxduration is non-zero the hardware is setup to * coexist with a PCF-capable AP. */ typedef struct { uint32_t bs_nexttbtt; /* next beacon in TU */ uint32_t bs_nextdtim; /* next DTIM in TU */ uint32_t bs_intval; /* beacon interval+flags */ /* * HAL_BEACON_PERIOD, HAL_BEACON_ENA and HAL_BEACON_RESET_TSF * are all 1:1 correspondences with the pre-11n chip AR_BEACON * register. */ #define HAL_BEACON_PERIOD 0x0000ffff /* beacon interval period */ #define HAL_BEACON_PERIOD_TU8 0x0007ffff /* beacon interval, tu/8 */ #define HAL_BEACON_ENA 0x00800000 /* beacon xmit enable */ #define HAL_BEACON_RESET_TSF 0x01000000 /* clear TSF */ #define HAL_TSFOOR_THRESHOLD 0x00004240 /* TSF OOR thresh (16k uS) */ uint32_t bs_dtimperiod; uint16_t bs_cfpperiod; /* CFP period in TU */ uint16_t bs_cfpmaxduration; /* max CFP duration in TU */ uint32_t bs_cfpnext; /* next CFP in TU */ uint16_t bs_timoffset; /* byte offset to TIM bitmap */ uint16_t bs_bmissthreshold; /* beacon miss threshold */ uint32_t bs_sleepduration; /* max sleep duration */ uint32_t bs_tsfoor_threshold; /* TSF out of range threshold */ } HAL_BEACON_STATE; /* * Like HAL_BEACON_STATE but for non-station mode setup. * NB: see above flag definitions for bt_intval. */ typedef struct { uint32_t bt_intval; /* beacon interval+flags */ uint32_t bt_nexttbtt; /* next beacon in TU */ uint32_t bt_nextatim; /* next ATIM in TU */ uint32_t bt_nextdba; /* next DBA in 1/8th TU */ uint32_t bt_nextswba; /* next SWBA in 1/8th TU */ uint32_t bt_flags; /* timer enables */ #define HAL_BEACON_TBTT_EN 0x00000001 #define HAL_BEACON_DBA_EN 0x00000002 #define HAL_BEACON_SWBA_EN 0x00000004 } HAL_BEACON_TIMERS; /* * Per-node statistics maintained by the driver for use in * optimizing signal quality and other operational aspects.
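 *
 * The averages are expected to be kept scaled by
 * HAL_RSSI_EP_MULTIPLIER (defined below) so the exponential filter
 * avoids multiplies and divides; a consumer would then recover an
 * approximate RSSI with something like (a sketch, rounding omitted):
 *
 *   rssi = ns->ns_avgbrssi / HAL_RSSI_EP_MULTIPLIER;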
*/ typedef struct { uint32_t ns_avgbrssi; /* average beacon rssi */ uint32_t ns_avgrssi; /* average data rssi */ uint32_t ns_avgtxrssi; /* average tx rssi */ } HAL_NODE_STATS; #define HAL_RSSI_EP_MULTIPLIER (1<<7) /* pow2 to optimize out * and / */ /* * This is the ANI state and MIB stats. * * It's used by the HAL modules to keep state /and/ by the debug ioctl * to fetch ANI information. */ typedef struct { uint32_t ast_ani_niup; /* ANI increased noise immunity */ uint32_t ast_ani_nidown; /* ANI decreased noise immunity */ uint32_t ast_ani_spurup; /* ANI increased spur immunity */ uint32_t ast_ani_spurdown;/* ANI decreased spur immunity */ uint32_t ast_ani_ofdmon; /* ANI OFDM weak signal detect on */ uint32_t ast_ani_ofdmoff;/* ANI OFDM weak signal detect off */ uint32_t ast_ani_cckhigh;/* ANI CCK weak signal threshold high */ uint32_t ast_ani_ccklow; /* ANI CCK weak signal threshold low */ uint32_t ast_ani_stepup; /* ANI increased first step level */ uint32_t ast_ani_stepdown;/* ANI decreased first step level */ uint32_t ast_ani_ofdmerrs;/* ANI cumulative ofdm phy err count */ uint32_t ast_ani_cckerrs;/* ANI cumulative cck phy err count */ uint32_t ast_ani_reset; /* ANI parameters zero'd for non-STA */ uint32_t ast_ani_lzero; /* ANI listen time forced to zero */ uint32_t ast_ani_lneg; /* ANI listen time calculated < 0 */ HAL_MIB_STATS ast_mibstats; /* MIB counter stats */ HAL_NODE_STATS ast_nodestats; /* Latest rssi stats from driver */ } HAL_ANI_STATS; typedef struct { uint8_t noiseImmunityLevel; uint8_t spurImmunityLevel; uint8_t firstepLevel; uint8_t ofdmWeakSigDetectOff; uint8_t cckWeakSigThreshold; uint32_t listenTime; /* NB: intentionally ordered so data exported to user space is first */ uint32_t txFrameCount; /* Last txFrameCount */ uint32_t rxFrameCount; /* Last rx Frame count */ uint32_t cycleCount; /* Last cycleCount (to detect wrap-around) */ uint32_t ofdmPhyErrCount;/* OFDM err count since last reset */ uint32_t cckPhyErrCount; /* CCK err count since last reset */ } HAL_ANI_STATE; struct ath_desc; struct ath_tx_status; struct ath_rx_status; struct ieee80211_channel; /* * This is a channel survey sample entry. * * The AR5212 ANI routines fill these samples. The ANI code then uses them * when calculating listen time; they are also exported via a diagnostic * API. */ typedef struct { uint32_t seq_num; uint32_t tx_busy; uint32_t rx_busy; uint32_t chan_busy; uint32_t ext_chan_busy; uint32_t cycle_count; /* XXX TODO */ uint32_t ofdm_phyerr_count; uint32_t cck_phyerr_count; } HAL_SURVEY_SAMPLE; /* * This provides 3.2 seconds of sample space given an * ANI time of 1/10th of a second. This may not be enough! */ #define CHANNEL_SURVEY_SAMPLE_COUNT 32 typedef struct { HAL_SURVEY_SAMPLE samples[CHANNEL_SURVEY_SAMPLE_COUNT]; uint32_t cur_sample; /* current sample in sequence */ uint32_t cur_seq; /* current sequence number */ } HAL_CHANNEL_SURVEY; /* * ANI commands. * * These are used both internally and externally via the diagnostic * API. * * Note that these are NOT the ANI commands used via the INTMIT * capability - that has a different mapping for some reason.
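 *
 * For example (a sketch, not the driver's literal code), a driver
 * toggling interference mitigation goes through the capability
 * interface rather than issuing HAL_ANI_CMD values directly:
 *
 *   ah->ah_setCapability(ah, HAL_CAP_INTMIT,
 *       HAL_CAP_INTMIT_ENABLE, 1, AH_NULL);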
*/ typedef enum { HAL_ANI_PRESENT = 0, /* is ANI support present */ HAL_ANI_NOISE_IMMUNITY_LEVEL = 1, /* set level */ HAL_ANI_OFDM_WEAK_SIGNAL_DETECTION = 2, /* enable/disable */ HAL_ANI_CCK_WEAK_SIGNAL_THR = 3, /* enable/disable */ HAL_ANI_FIRSTEP_LEVEL = 4, /* set level */ HAL_ANI_SPUR_IMMUNITY_LEVEL = 5, /* set level */ HAL_ANI_MODE = 6, /* 0 => manual, 1 => auto (XXX do not change) */ HAL_ANI_PHYERR_RESET = 7, /* reset phy error stats */ HAL_ANI_MRC_CCK = 8, } HAL_ANI_CMD; #define HAL_ANI_ALL 0xffffffff /* * This is the layout of the ANI INTMIT capability. * * Notice that the command values differ from HAL_ANI_CMD. */ typedef enum { HAL_CAP_INTMIT_PRESENT = 0, HAL_CAP_INTMIT_ENABLE = 1, HAL_CAP_INTMIT_NOISE_IMMUNITY_LEVEL = 2, HAL_CAP_INTMIT_OFDM_WEAK_SIGNAL_LEVEL = 3, HAL_CAP_INTMIT_CCK_WEAK_SIGNAL_THR = 4, HAL_CAP_INTMIT_FIRSTEP_LEVEL = 5, HAL_CAP_INTMIT_SPUR_IMMUNITY_LEVEL = 6 } HAL_CAP_INTMIT_CMD; typedef struct { int32_t pe_firpwr; /* FIR pwr out threshold */ int32_t pe_rrssi; /* Radar rssi thresh */ int32_t pe_height; /* Pulse height thresh */ int32_t pe_prssi; /* Pulse rssi thresh */ int32_t pe_inband; /* Inband thresh */ /* The following params are only for AR5413 and later */ u_int32_t pe_relpwr; /* Relative power threshold in 0.5dB steps */ u_int32_t pe_relstep; /* Pulse Relative step threshold in 0.5dB steps */ u_int32_t pe_maxlen; /* Max length of radar signal in 0.8us units */ int32_t pe_usefir128; /* Use the average in-band power measured over 128 cycles */ int32_t pe_blockradar; /* * Enable to block radar check if pkt detect is done via OFDM * weak signal detect or pkt is detected immediately after tx * to rx transition */ int32_t pe_enmaxrssi; /* * Enable to use the max rssi instead of the last rssi during * fine gain changes for radar detection */ int32_t pe_extchannel; /* Enable DFS on ext channel */ int32_t pe_enabled; /* Whether radar detection is enabled */ int32_t pe_enrelpwr; int32_t pe_en_relstep_check; } HAL_PHYERR_PARAM; #define HAL_PHYERR_PARAM_NOVAL 65535 typedef struct { u_int16_t ss_fft_period; /* Skip interval for FFT reports */ u_int16_t ss_period; /* Spectral scan period */ u_int16_t ss_count; /* # of reports to return from ss_active */ u_int16_t ss_short_report;/* Set to report only 1 set of FFT results */ u_int8_t radar_bin_thresh_sel; /* strong signal radar FFT threshold configuration */ u_int16_t ss_spectral_pri; /* are we doing a noise power cal ? */ int8_t ss_nf_cal[AH_MAX_CHAINS*2]; /* nf calibrated values for ctl+ext from eeprom */ int8_t ss_nf_pwr[AH_MAX_CHAINS*2]; /* nf pwr values for ctl+ext from eeprom */ int32_t ss_nf_temp_data; /* temperature data taken during nf scan */ int ss_enabled; int ss_active; } HAL_SPECTRAL_PARAM; #define HAL_SPECTRAL_PARAM_NOVAL 0xFFFF #define HAL_SPECTRAL_PARAM_ENABLE 0x8000 /* Enable/Disable if applicable */ /* * DFS operating mode flags. */ typedef enum { HAL_DFS_UNINIT_DOMAIN = 0, /* Uninitialized dfs domain */ HAL_DFS_FCC_DOMAIN = 1, /* FCC3 dfs domain */ HAL_DFS_ETSI_DOMAIN = 2, /* ETSI dfs domain */ HAL_DFS_MKK4_DOMAIN = 3, /* Japan dfs domain */ } HAL_DFS_DOMAIN; /* * MFP decryption options for initializing the MAC. */ typedef enum { HAL_MFP_QOSDATA = 0, /* Decrypt MFP frames like QoS data frames. All chips before Merlin. */ HAL_MFP_PASSTHRU, /* Don't decrypt MFP frames at all. Passthrough */ HAL_MFP_HW_CRYPTO /* hardware decryption enabled. Merlin can do it.
*/ } HAL_MFP_OPT_T; /* LNA config supported */ typedef enum { HAL_ANT_DIV_COMB_LNA1_MINUS_LNA2 = 0, HAL_ANT_DIV_COMB_LNA2 = 1, HAL_ANT_DIV_COMB_LNA1 = 2, HAL_ANT_DIV_COMB_LNA1_PLUS_LNA2 = 3, } HAL_ANT_DIV_COMB_LNA_CONF; typedef struct { u_int8_t main_lna_conf; u_int8_t alt_lna_conf; u_int8_t fast_div_bias; u_int8_t main_gaintb; u_int8_t alt_gaintb; u_int8_t antdiv_configgroup; int8_t lna1_lna2_delta; } HAL_ANT_COMB_CONFIG; #define DEFAULT_ANTDIV_CONFIG_GROUP 0x00 #define HAL_ANTDIV_CONFIG_GROUP_1 0x01 #define HAL_ANTDIV_CONFIG_GROUP_2 0x02 #define HAL_ANTDIV_CONFIG_GROUP_3 0x03 /* * Flag for setting QUIET period */ typedef enum { HAL_QUIET_DISABLE = 0x0, HAL_QUIET_ENABLE = 0x1, HAL_QUIET_ADD_CURRENT_TSF = 0x2, /* add current TSF to next_start offset */ HAL_QUIET_ADD_SWBA_RESP_TIME = 0x4, /* add beacon response time to next_start offset */ } HAL_QUIET_FLAG; #define HAL_DFS_EVENT_PRICH 0x0000001 #define HAL_DFS_EVENT_EXTCH 0x0000002 #define HAL_DFS_EVENT_EXTEARLY 0x0000004 #define HAL_DFS_EVENT_ISDC 0x0000008 struct hal_dfs_event { uint64_t re_full_ts; /* 64-bit full timestamp from interrupt time */ uint32_t re_ts; /* Original 15 bit recv timestamp */ uint8_t re_rssi; /* rssi of radar event */ uint8_t re_dur; /* duration of radar pulse */ uint32_t re_flags; /* Flags (see above) */ }; typedef struct hal_dfs_event HAL_DFS_EVENT; /* * Generic Timer domain */ typedef enum { HAL_GEN_TIMER_TSF = 0, HAL_GEN_TIMER_TSF2, HAL_GEN_TIMER_TSF_ANY } HAL_GEN_TIMER_DOMAIN; /* * BT Co-existence definitions */ #include "ath_hal/ah_btcoex.h" struct hal_bb_panic_info { u_int32_t status; u_int32_t tsf; u_int32_t phy_panic_wd_ctl1; u_int32_t phy_panic_wd_ctl2; u_int32_t phy_gen_ctrl; u_int32_t rxc_pcnt; u_int32_t rxf_pcnt; u_int32_t txf_pcnt; u_int32_t cycles; u_int32_t wd; u_int32_t det; u_int32_t rdar; u_int32_t r_odfm; u_int32_t r_cck; u_int32_t t_odfm; u_int32_t t_cck; u_int32_t agc; u_int32_t src; }; /* Serialize Register Access Mode */ typedef enum { SER_REG_MODE_OFF = 0, SER_REG_MODE_ON = 1, SER_REG_MODE_AUTO = 2, } SER_REG_MODE; typedef struct { int ah_debug; /* only used if AH_DEBUG is defined */ int ah_ar5416_biasadj; /* enable AR2133 radio specific bias fiddling */ /* NB: these are deprecated; they exist for now for compatibility */ int ah_dma_beacon_response_time;/* in TU's */ int ah_sw_beacon_response_time; /* in TU's */ int ah_additional_swba_backoff; /* in TU's */ int ah_force_full_reset; /* force full chip reset rather than warm reset */ int ah_serialise_reg_war; /* force serialisation of register IO */ /* XXX these don't belong here, they're just for the ar9300 HAL port effort */ int ath_hal_desc_tpc; /* Per-packet TPC */ int ath_hal_sta_update_tx_pwr_enable; /* GreenTX */ int ath_hal_sta_update_tx_pwr_enable_S1; /* GreenTX */ int ath_hal_sta_update_tx_pwr_enable_S2; /* GreenTX */ int ath_hal_sta_update_tx_pwr_enable_S3; /* GreenTX */ /* I'm not sure what the default values for these should be */ int ath_hal_pll_pwr_save; int ath_hal_pcie_power_save_enable; int ath_hal_intr_mitigation_rx; int ath_hal_intr_mitigation_tx; int ath_hal_pcie_clock_req; #define AR_PCIE_PLL_PWRSAVE_CONTROL (1<<0) #define AR_PCIE_PLL_PWRSAVE_ON_D3 (1<<1) #define AR_PCIE_PLL_PWRSAVE_ON_D0 (1<<2) int ath_hal_pcie_waen; int ath_hal_pcie_ser_des_write; /* these are important for correct AR9300 behaviour */ int ath_hal_ht_enable; /* needs to be enabled for AR9300 HT */ int ath_hal_diversity_control; int ath_hal_antenna_switch_swap; int ath_hal_ext_lna_ctl_gpio; int ath_hal_spur_mode; int ath_hal_6mb_ack; /* should set this to
1 for 11a/11na? */ int ath_hal_enable_msi; /* enable MSI interrupts (needed?) */ int ath_hal_beacon_filter_interval; /* ok to be 0 for now? */ /* For now, set this to 0 - net80211 needs to know about hardware MFP support */ int ath_hal_mfp_support; int ath_hal_enable_ani; /* should set this.. */ int ath_hal_cwm_ignore_ext_cca; int ath_hal_show_bb_panic; int ath_hal_ant_ctrl_comm2g_switch_enable; int ath_hal_ext_atten_margin_cfg; int ath_hal_min_gainidx; int ath_hal_war70c; uint32_t ath_hal_mci_config; } HAL_OPS_CONFIG; /* * Hardware Access Layer (HAL) API. * * Clients of the HAL call ath_hal_attach to obtain a reference to an * ath_hal structure for use with the device. Hardware-related operations * that follow must call back into the HAL through interface, supplying * the reference as the first parameter. Note that before using the * reference returned by ath_hal_attach the caller should verify the * ABI version number. */ struct ath_hal { uint32_t ah_magic; /* consistency check magic number */ uint16_t ah_devid; /* PCI device ID */ uint16_t ah_subvendorid; /* PCI subvendor ID */ HAL_SOFTC ah_sc; /* back pointer to driver/os state */ HAL_BUS_TAG ah_st; /* params for register r+w */ HAL_BUS_HANDLE ah_sh; HAL_CTRY_CODE ah_countryCode; uint32_t ah_macVersion; /* MAC version id */ uint16_t ah_macRev; /* MAC revision */ uint16_t ah_phyRev; /* PHY revision */ /* NB: when only one radio is present the rev is in 5Ghz */ uint16_t ah_analog5GhzRev;/* 5GHz radio revision */ uint16_t ah_analog2GhzRev;/* 2GHz radio revision */ uint16_t *ah_eepromdata; /* eeprom buffer, if needed */ uint32_t ah_intrstate[8]; /* last int state */ uint32_t ah_syncstate; /* last sync intr state */ /* Current powerstate from HAL calls */ HAL_POWER_MODE ah_powerMode; HAL_OPS_CONFIG ah_config; const HAL_RATE_TABLE *__ahdecl(*ah_getRateTable)(struct ath_hal *, u_int mode); void __ahdecl(*ah_detach)(struct ath_hal*); /* Reset functions */ HAL_BOOL __ahdecl(*ah_reset)(struct ath_hal *, HAL_OPMODE, struct ieee80211_channel *, HAL_BOOL bChannelChange, HAL_RESET_TYPE resetType, HAL_STATUS *status); HAL_BOOL __ahdecl(*ah_phyDisable)(struct ath_hal *); HAL_BOOL __ahdecl(*ah_disable)(struct ath_hal *); void __ahdecl(*ah_configPCIE)(struct ath_hal *, HAL_BOOL restore, HAL_BOOL power_off); void __ahdecl(*ah_disablePCIE)(struct ath_hal *); void __ahdecl(*ah_setPCUConfig)(struct ath_hal *); HAL_BOOL __ahdecl(*ah_perCalibration)(struct ath_hal*, struct ieee80211_channel *, HAL_BOOL *); HAL_BOOL __ahdecl(*ah_perCalibrationN)(struct ath_hal *, struct ieee80211_channel *, u_int chainMask, HAL_BOOL longCal, HAL_BOOL *isCalDone); HAL_BOOL __ahdecl(*ah_resetCalValid)(struct ath_hal *, const struct ieee80211_channel *); HAL_BOOL __ahdecl(*ah_setTxPower)(struct ath_hal *, const struct ieee80211_channel *, uint16_t *); HAL_BOOL __ahdecl(*ah_setTxPowerLimit)(struct ath_hal *, uint32_t); HAL_BOOL __ahdecl(*ah_setBoardValues)(struct ath_hal *, const struct ieee80211_channel *); /* Transmit functions */ HAL_BOOL __ahdecl(*ah_updateTxTrigLevel)(struct ath_hal*, HAL_BOOL incTrigLevel); int __ahdecl(*ah_setupTxQueue)(struct ath_hal *, HAL_TX_QUEUE, const HAL_TXQ_INFO *qInfo); HAL_BOOL __ahdecl(*ah_setTxQueueProps)(struct ath_hal *, int q, const HAL_TXQ_INFO *qInfo); HAL_BOOL __ahdecl(*ah_getTxQueueProps)(struct ath_hal *, int q, HAL_TXQ_INFO *qInfo); HAL_BOOL __ahdecl(*ah_releaseTxQueue)(struct ath_hal *ah, u_int q); HAL_BOOL __ahdecl(*ah_resetTxQueue)(struct ath_hal *ah, u_int q); uint32_t __ahdecl(*ah_getTxDP)(struct ath_hal*, u_int); HAL_BOOL 
__ahdecl(*ah_setTxDP)(struct ath_hal*, u_int, uint32_t txdp); uint32_t __ahdecl(*ah_numTxPending)(struct ath_hal *, u_int q); HAL_BOOL __ahdecl(*ah_startTxDma)(struct ath_hal*, u_int); HAL_BOOL __ahdecl(*ah_stopTxDma)(struct ath_hal*, u_int); HAL_BOOL __ahdecl(*ah_setupTxDesc)(struct ath_hal *, struct ath_desc *, u_int pktLen, u_int hdrLen, HAL_PKT_TYPE type, u_int txPower, u_int txRate0, u_int txTries0, u_int keyIx, u_int antMode, u_int flags, u_int rtsctsRate, u_int rtsctsDuration, u_int compicvLen, u_int compivLen, u_int comp); HAL_BOOL __ahdecl(*ah_setupXTxDesc)(struct ath_hal *, struct ath_desc*, u_int txRate1, u_int txTries1, u_int txRate2, u_int txTries2, u_int txRate3, u_int txTries3); HAL_BOOL __ahdecl(*ah_fillTxDesc)(struct ath_hal *, struct ath_desc *, HAL_DMA_ADDR *bufAddrList, uint32_t *segLenList, u_int descId, u_int qcuId, HAL_BOOL firstSeg, HAL_BOOL lastSeg, const struct ath_desc *); HAL_STATUS __ahdecl(*ah_procTxDesc)(struct ath_hal *, struct ath_desc *, struct ath_tx_status *); void __ahdecl(*ah_getTxIntrQueue)(struct ath_hal *, uint32_t *); void __ahdecl(*ah_reqTxIntrDesc)(struct ath_hal *, struct ath_desc*); HAL_BOOL __ahdecl(*ah_getTxCompletionRates)(struct ath_hal *, const struct ath_desc *ds, int *rates, int *tries); void __ahdecl(*ah_setTxDescLink)(struct ath_hal *ah, void *ds, uint32_t link); void __ahdecl(*ah_getTxDescLink)(struct ath_hal *ah, void *ds, uint32_t *link); void __ahdecl(*ah_getTxDescLinkPtr)(struct ath_hal *ah, void *ds, uint32_t **linkptr); void __ahdecl(*ah_setupTxStatusRing)(struct ath_hal *, void *ts_start, uint32_t ts_paddr_start, uint16_t size); void __ahdecl(*ah_getTxRawTxDesc)(struct ath_hal *, u_int32_t *); /* Receive Functions */ uint32_t __ahdecl(*ah_getRxDP)(struct ath_hal*, HAL_RX_QUEUE); void __ahdecl(*ah_setRxDP)(struct ath_hal*, uint32_t rxdp, HAL_RX_QUEUE); void __ahdecl(*ah_enableReceive)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_stopDmaReceive)(struct ath_hal*); void __ahdecl(*ah_startPcuReceive)(struct ath_hal*); void __ahdecl(*ah_stopPcuReceive)(struct ath_hal*); void __ahdecl(*ah_setMulticastFilter)(struct ath_hal*, uint32_t filter0, uint32_t filter1); HAL_BOOL __ahdecl(*ah_setMulticastFilterIndex)(struct ath_hal*, uint32_t index); HAL_BOOL __ahdecl(*ah_clrMulticastFilterIndex)(struct ath_hal*, uint32_t index); uint32_t __ahdecl(*ah_getRxFilter)(struct ath_hal*); void __ahdecl(*ah_setRxFilter)(struct ath_hal*, uint32_t); HAL_BOOL __ahdecl(*ah_setupRxDesc)(struct ath_hal *, struct ath_desc *, uint32_t size, u_int flags); HAL_STATUS __ahdecl(*ah_procRxDesc)(struct ath_hal *, struct ath_desc *, uint32_t phyAddr, struct ath_desc *next, uint64_t tsf, struct ath_rx_status *); void __ahdecl(*ah_rxMonitor)(struct ath_hal *, const HAL_NODE_STATS *, const struct ieee80211_channel *); void __ahdecl(*ah_aniPoll)(struct ath_hal *, const struct ieee80211_channel *); void __ahdecl(*ah_procMibEvent)(struct ath_hal *, const HAL_NODE_STATS *); /* Misc Functions */ HAL_STATUS __ahdecl(*ah_getCapability)(struct ath_hal *, HAL_CAPABILITY_TYPE, uint32_t capability, uint32_t *result); HAL_BOOL __ahdecl(*ah_setCapability)(struct ath_hal *, HAL_CAPABILITY_TYPE, uint32_t capability, uint32_t setting, HAL_STATUS *); HAL_BOOL __ahdecl(*ah_getDiagState)(struct ath_hal *, int request, const void *args, uint32_t argsize, void **result, uint32_t *resultsize); void __ahdecl(*ah_getMacAddress)(struct ath_hal *, uint8_t *); HAL_BOOL __ahdecl(*ah_setMacAddress)(struct ath_hal *, const uint8_t*); void __ahdecl(*ah_getBssIdMask)(struct ath_hal *, uint8_t *); HAL_BOOL 
__ahdecl(*ah_setBssIdMask)(struct ath_hal *, const uint8_t*); HAL_BOOL __ahdecl(*ah_setRegulatoryDomain)(struct ath_hal*, uint16_t, HAL_STATUS *); void __ahdecl(*ah_setLedState)(struct ath_hal*, HAL_LED_STATE); void __ahdecl(*ah_writeAssocid)(struct ath_hal*, const uint8_t *bssid, uint16_t assocId); HAL_BOOL __ahdecl(*ah_gpioCfgOutput)(struct ath_hal *, uint32_t gpio, HAL_GPIO_MUX_TYPE); HAL_BOOL __ahdecl(*ah_gpioCfgInput)(struct ath_hal *, uint32_t gpio); uint32_t __ahdecl(*ah_gpioGet)(struct ath_hal *, uint32_t gpio); HAL_BOOL __ahdecl(*ah_gpioSet)(struct ath_hal *, uint32_t gpio, uint32_t val); void __ahdecl(*ah_gpioSetIntr)(struct ath_hal*, u_int, uint32_t); uint32_t __ahdecl(*ah_getTsf32)(struct ath_hal*); uint64_t __ahdecl(*ah_getTsf64)(struct ath_hal*); void __ahdecl(*ah_setTsf64)(struct ath_hal *, uint64_t); void __ahdecl(*ah_resetTsf)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_detectCardPresent)(struct ath_hal*); void __ahdecl(*ah_updateMibCounters)(struct ath_hal*, HAL_MIB_STATS*); HAL_RFGAIN __ahdecl(*ah_getRfGain)(struct ath_hal*); u_int __ahdecl(*ah_getDefAntenna)(struct ath_hal*); void __ahdecl(*ah_setDefAntenna)(struct ath_hal*, u_int); HAL_ANT_SETTING __ahdecl(*ah_getAntennaSwitch)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setAntennaSwitch)(struct ath_hal*, HAL_ANT_SETTING); HAL_BOOL __ahdecl(*ah_setSifsTime)(struct ath_hal*, u_int); u_int __ahdecl(*ah_getSifsTime)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setSlotTime)(struct ath_hal*, u_int); u_int __ahdecl(*ah_getSlotTime)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setAckTimeout)(struct ath_hal*, u_int); u_int __ahdecl(*ah_getAckTimeout)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setAckCTSRate)(struct ath_hal*, u_int); u_int __ahdecl(*ah_getAckCTSRate)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setCTSTimeout)(struct ath_hal*, u_int); u_int __ahdecl(*ah_getCTSTimeout)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_setDecompMask)(struct ath_hal*, uint16_t, int); void __ahdecl(*ah_setCoverageClass)(struct ath_hal*, uint8_t, int); HAL_STATUS __ahdecl(*ah_setQuiet)(struct ath_hal *ah, uint32_t period, uint32_t duration, uint32_t nextStart, HAL_QUIET_FLAG flag); void __ahdecl(*ah_setChainMasks)(struct ath_hal *, uint32_t, uint32_t); /* DFS functions */ void __ahdecl(*ah_enableDfs)(struct ath_hal *ah, HAL_PHYERR_PARAM *pe); void __ahdecl(*ah_getDfsThresh)(struct ath_hal *ah, HAL_PHYERR_PARAM *pe); HAL_BOOL __ahdecl(*ah_getDfsDefaultThresh)(struct ath_hal *ah, HAL_PHYERR_PARAM *pe); HAL_BOOL __ahdecl(*ah_procRadarEvent)(struct ath_hal *ah, struct ath_rx_status *rxs, uint64_t fulltsf, const char *buf, HAL_DFS_EVENT *event); HAL_BOOL __ahdecl(*ah_isFastClockEnabled)(struct ath_hal *ah); /* Spectral Scan functions */ void __ahdecl(*ah_spectralConfigure)(struct ath_hal *ah, HAL_SPECTRAL_PARAM *sp); void __ahdecl(*ah_spectralGetConfig)(struct ath_hal *ah, HAL_SPECTRAL_PARAM *sp); void __ahdecl(*ah_spectralStart)(struct ath_hal *); void __ahdecl(*ah_spectralStop)(struct ath_hal *); HAL_BOOL __ahdecl(*ah_spectralIsEnabled)(struct ath_hal *); HAL_BOOL __ahdecl(*ah_spectralIsActive)(struct ath_hal *); /* XXX getNfPri() and getNfExt() */ /* Key Cache Functions */ uint32_t __ahdecl(*ah_getKeyCacheSize)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_resetKeyCacheEntry)(struct ath_hal*, uint16_t); HAL_BOOL __ahdecl(*ah_isKeyCacheEntryValid)(struct ath_hal *, uint16_t); HAL_BOOL __ahdecl(*ah_setKeyCacheEntry)(struct ath_hal*, uint16_t, const HAL_KEYVAL *, const uint8_t *, int); HAL_BOOL __ahdecl(*ah_setKeyCacheEntryMac)(struct ath_hal*, uint16_t, const uint8_t *); /* Power 
Management Functions */ HAL_BOOL __ahdecl(*ah_setPowerMode)(struct ath_hal*, HAL_POWER_MODE mode, int setChip); HAL_POWER_MODE __ahdecl(*ah_getPowerMode)(struct ath_hal*); int16_t __ahdecl(*ah_getChanNoise)(struct ath_hal *, const struct ieee80211_channel *); /* Beacon Management Functions */ void __ahdecl(*ah_setBeaconTimers)(struct ath_hal*, const HAL_BEACON_TIMERS *); /* NB: deprecated, use ah_setBeaconTimers instead */ void __ahdecl(*ah_beaconInit)(struct ath_hal *, uint32_t nexttbtt, uint32_t intval); void __ahdecl(*ah_setStationBeaconTimers)(struct ath_hal*, const HAL_BEACON_STATE *); void __ahdecl(*ah_resetStationBeaconTimers)(struct ath_hal*); uint64_t __ahdecl(*ah_getNextTBTT)(struct ath_hal *); /* 802.11n Functions */ HAL_BOOL __ahdecl(*ah_chainTxDesc)(struct ath_hal *, struct ath_desc *, HAL_DMA_ADDR *bufAddrList, uint32_t *segLenList, u_int, u_int, HAL_PKT_TYPE, u_int, HAL_CIPHER, uint8_t, HAL_BOOL, HAL_BOOL, HAL_BOOL); HAL_BOOL __ahdecl(*ah_setupFirstTxDesc)(struct ath_hal *, struct ath_desc *, u_int, u_int, u_int, u_int, u_int, u_int, u_int, u_int); HAL_BOOL __ahdecl(*ah_setupLastTxDesc)(struct ath_hal *, struct ath_desc *, const struct ath_desc *); void __ahdecl(*ah_set11nRateScenario)(struct ath_hal *, struct ath_desc *, u_int, u_int, HAL_11N_RATE_SERIES [], u_int, u_int); /* * The next 4 (set11ntxdesc -> set11naggrlast) are specific * to the EDMA HAL. Descriptors are chained together by * using filltxdesc (not ChainTxDesc) and then setting the * aggregate flags appropriately using first/middle/last. */ void __ahdecl(*ah_set11nTxDesc)(struct ath_hal *, void *, u_int, HAL_PKT_TYPE, u_int, u_int, u_int); void __ahdecl(*ah_set11nAggrFirst)(struct ath_hal *, struct ath_desc *, u_int, u_int); void __ahdecl(*ah_set11nAggrMiddle)(struct ath_hal *, struct ath_desc *, u_int); void __ahdecl(*ah_set11nAggrLast)(struct ath_hal *, struct ath_desc *); void __ahdecl(*ah_clr11nAggr)(struct ath_hal *, struct ath_desc *); void __ahdecl(*ah_set11nBurstDuration)(struct ath_hal *, struct ath_desc *, u_int); void __ahdecl(*ah_set11nVirtMoreFrag)(struct ath_hal *, struct ath_desc *, u_int); HAL_BOOL __ahdecl(*ah_getMibCycleCounts) (struct ath_hal *, HAL_SURVEY_SAMPLE *); uint32_t __ahdecl(*ah_get11nExtBusy)(struct ath_hal *); void __ahdecl(*ah_set11nMac2040)(struct ath_hal *, HAL_HT_MACMODE); HAL_HT_RXCLEAR __ahdecl(*ah_get11nRxClear)(struct ath_hal *ah); void __ahdecl(*ah_set11nRxClear)(struct ath_hal *, HAL_HT_RXCLEAR); /* Interrupt functions */ HAL_BOOL __ahdecl(*ah_isInterruptPending)(struct ath_hal*); HAL_BOOL __ahdecl(*ah_getPendingInterrupts)(struct ath_hal*, HAL_INT*); HAL_INT __ahdecl(*ah_getInterrupts)(struct ath_hal*); HAL_INT __ahdecl(*ah_setInterrupts)(struct ath_hal*, HAL_INT); /* Bluetooth Coexistence functions */ void __ahdecl(*ah_btCoexSetInfo)(struct ath_hal *, HAL_BT_COEX_INFO *); void __ahdecl(*ah_btCoexSetConfig)(struct ath_hal *, HAL_BT_COEX_CONFIG *); void __ahdecl(*ah_btCoexSetQcuThresh)(struct ath_hal *, int); void __ahdecl(*ah_btCoexSetWeights)(struct ath_hal *, uint32_t); void __ahdecl(*ah_btCoexSetBmissThresh)(struct ath_hal *, uint32_t); void __ahdecl(*ah_btCoexSetParameter)(struct ath_hal *, uint32_t, uint32_t); void __ahdecl(*ah_btCoexDisable)(struct ath_hal *); int __ahdecl(*ah_btCoexEnable)(struct ath_hal *); /* Bluetooth MCI methods */ void __ahdecl(*ah_btMciSetup)(struct ath_hal *, uint32_t, void *, uint16_t, uint32_t); HAL_BOOL __ahdecl(*ah_btMciSendMessage)(struct ath_hal *, uint8_t, uint32_t, uint32_t *, uint8_t, HAL_BOOL, HAL_BOOL); uint32_t 
__ahdecl(*ah_btMciGetInterrupt)(struct ath_hal *, uint32_t *, uint32_t *); uint32_t __ahdecl(*ah_btMciState)(struct ath_hal *, uint32_t, uint32_t *); void __ahdecl(*ah_btMciDetach)(struct ath_hal *); /* LNA diversity configuration */ void __ahdecl(*ah_divLnaConfGet)(struct ath_hal *, HAL_ANT_COMB_CONFIG *); void __ahdecl(*ah_divLnaConfSet)(struct ath_hal *, HAL_ANT_COMB_CONFIG *); }; /* * Check the PCI vendor ID and device ID against Atheros' values * and return a printable description for any Atheros hardware. * AH_NULL is returned if the ID's do not describe Atheros hardware. */ extern const char *__ahdecl ath_hal_probe(uint16_t vendorid, uint16_t devid); /* * Attach the HAL for use with the specified device. The device is * defined by the PCI device ID. The caller provides an opaque pointer * to an upper-layer data structure (HAL_SOFTC) that is stored in the * HAL state block for later use. Hardware register accesses are done * using the specified bus tag and handle. On successful return a * reference to a state block is returned that must be supplied in all * subsequent HAL calls. Storage associated with this reference is * dynamically allocated and must be freed by calling the ah_detach * method when the client is done. If the attach operation fails a * null (AH_NULL) reference will be returned and a status code will * be returned if the status parameter is non-zero. */ extern struct ath_hal * __ahdecl ath_hal_attach(uint16_t devid, HAL_SOFTC, HAL_BUS_TAG, HAL_BUS_HANDLE, uint16_t *eepromdata, HAL_OPS_CONFIG *ah_config, HAL_STATUS* status); extern const char *ath_hal_mac_name(struct ath_hal *); extern const char *ath_hal_rf_name(struct ath_hal *); /* * Regulatory interfaces. Drivers should use ath_hal_init_channels to * request a set of channels for a particular country code and/or * regulatory domain. If CTRY_DEFAULT and SKU_NONE are specified then * this list is constructed according to the contents of the EEPROM. * ath_hal_getchannels acts similarly but does not alter the operating * state; this can be used to collect information for a particular * regulatory configuration. Finally ath_hal_set_channels installs a * channel list constructed outside the driver. The HAL will adopt the * channel list and setup internal state according to the specified * regulatory configuration (e.g. conformance test limits). * * For all interfaces the channel list is returned in the supplied array. * maxchans defines the maximum size of this array. nchans contains the * actual number of channels returned. If a problem occurred then a * status code != HAL_OK is returned. */ struct ieee80211_channel; /* * Return a list of channels according to the specified regulatory. */ extern HAL_STATUS __ahdecl ath_hal_getchannels(struct ath_hal *, struct ieee80211_channel *chans, u_int maxchans, int *nchans, u_int modeSelect, HAL_CTRY_CODE cc, HAL_REG_DOMAIN regDmn, HAL_BOOL enableExtendedChannels); /* * Return a list of channels and install it as the current operating * regulatory list. */ extern HAL_STATUS __ahdecl ath_hal_init_channels(struct ath_hal *, struct ieee80211_channel *chans, u_int maxchans, int *nchans, u_int modeSelect, HAL_CTRY_CODE cc, HAL_REG_DOMAIN rd, HAL_BOOL enableExtendedChannels); /* * Install the list of channels as the current operating regulatory * and setup related state according to the country code and sku. 
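 *
 * A minimal usage sketch (hypothetical driver code; the array size and
 * the HAL_MODE_ALL mode mask are illustrative assumptions, and error
 * handling is elided):
 *
 *	struct ieee80211_channel chans[128];
 *	int nchans;
 *
 *	if (ath_hal_init_channels(ah, chans, 128, &nchans, HAL_MODE_ALL,
 *	    CTRY_DEFAULT, SKU_NONE, AH_FALSE) != HAL_OK)
 *		... fail the attach; no usable channel list ...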
*/ extern HAL_STATUS __ahdecl ath_hal_set_channels(struct ath_hal *, struct ieee80211_channel *chans, int nchans, HAL_CTRY_CODE cc, HAL_REG_DOMAIN regDmn); /* * Fetch the ctl/ext noise floor values reported by a MIMO * radio. Returns 1 for valid results, 0 for invalid channel. */ extern int __ahdecl ath_hal_get_mimo_chan_noise(struct ath_hal *ah, const struct ieee80211_channel *chan, int16_t *nf_ctl, int16_t *nf_ext); /* * Calibrate noise floor data following a channel scan or similar. * This must be called prior retrieving noise floor data. */ extern void __ahdecl ath_hal_process_noisefloor(struct ath_hal *ah); /* * Return bit mask of wireless modes supported by the hardware. */ extern u_int __ahdecl ath_hal_getwirelessmodes(struct ath_hal*); /* * Get the HAL wireless mode for the given channel. */ extern int ath_hal_get_curmode(struct ath_hal *ah, const struct ieee80211_channel *chan); /* * Calculate the packet TX time for a legacy or 11n frame */ extern uint32_t __ahdecl ath_hal_pkt_txtime(struct ath_hal *ah, const HAL_RATE_TABLE *rates, uint32_t frameLen, uint16_t rateix, HAL_BOOL isht40, HAL_BOOL shortPreamble, HAL_BOOL includeSifs); /* * Calculate the duration of an 11n frame. */ extern uint32_t __ahdecl ath_computedur_ht(uint32_t frameLen, uint16_t rate, int streams, HAL_BOOL isht40, HAL_BOOL isShortGI); /* * Calculate the transmit duration of a legacy frame. */ extern uint16_t __ahdecl ath_hal_computetxtime(struct ath_hal *, const HAL_RATE_TABLE *rates, uint32_t frameLen, uint16_t rateix, HAL_BOOL shortPreamble, HAL_BOOL includeSifs); /* * Adjust the TSF. */ extern void __ahdecl ath_hal_adjusttsf(struct ath_hal *ah, int32_t tsfdelta); /* * Enable or disable CCA. */ void __ahdecl ath_hal_setcca(struct ath_hal *ah, int ena); /* * Get CCA setting. */ int __ahdecl ath_hal_getcca(struct ath_hal *ah); /* * Read EEPROM data from ah_eepromdata */ HAL_BOOL __ahdecl ath_hal_EepromDataRead(struct ath_hal *ah, u_int off, uint16_t *data); /* * For now, simply pass through MFP frames. */ static inline u_int32_t ath_hal_get_mfp_qos(struct ath_hal *ah) { //return AH_PRIVATE(ah)->ah_mfp_qos; return HAL_MFP_QOSDATA; } +/* + * Convert between microseconds and core system clocks. + */ +extern u_int ath_hal_mac_clks(struct ath_hal *ah, u_int usecs); +extern u_int ath_hal_mac_usec(struct ath_hal *ah, u_int clks); +extern uint64_t ath_hal_mac_psec(struct ath_hal *ah, u_int clks); + #endif /* _ATH_AH_H_ */ Index: projects/clang390-import/sys/dev/ath/ath_hal/ah_internal.h =================================================================== --- projects/clang390-import/sys/dev/ath/ath_hal/ah_internal.h (revision 305686) +++ projects/clang390-import/sys/dev/ath/ath_hal/ah_internal.h (revision 305687) @@ -1,1047 +1,1041 @@ /* * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * Copyright (c) 2002-2008 Atheros Communications, Inc. * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. * * $FreeBSD$ */ #ifndef _ATH_AH_INTERAL_H_ #define _ATH_AH_INTERAL_H_ /* * Atheros Device Hardware Access Layer (HAL). * * Internal definitions. */ #define AH_NULL 0 #define AH_MIN(a,b) ((a)<(b)?(a):(b)) #define AH_MAX(a,b) ((a)>(b)?(a):(b)) #include #include "opt_ah.h" /* needed for AH_SUPPORT_AR5416 */ #ifndef AH_SUPPORT_AR5416 #define AH_SUPPORT_AR5416 1 #endif #ifndef NBBY #define NBBY 8 /* number of bits/byte */ #endif #ifndef roundup #define roundup(x, y) ((((x)+((y)-1))/(y))*(y)) /* to any y */ #endif #ifndef howmany #define howmany(x, y) (((x)+((y)-1))/(y)) #endif #ifndef offsetof #define offsetof(type, field) ((size_t)(&((type *)0)->field)) #endif typedef struct { uint32_t start; /* first register */ uint32_t end; /* ending register or zero */ } HAL_REGRANGE; typedef struct { uint32_t addr; /* register address/offset */ uint32_t value; /* value to write */ } HAL_REGWRITE; /* * Transmit power scale factor. * * NB: This is not public because we want to discourage the use of * scaling; folks should use the tx power limit interface. */ typedef enum { HAL_TP_SCALE_MAX = 0, /* no scaling (default) */ HAL_TP_SCALE_50 = 1, /* 50% of max (-3 dBm) */ HAL_TP_SCALE_25 = 2, /* 25% of max (-6 dBm) */ HAL_TP_SCALE_12 = 3, /* 12% of max (-9 dBm) */ HAL_TP_SCALE_MIN = 4, /* min, but still on */ } HAL_TP_SCALE; typedef enum { HAL_CAP_RADAR = 0, /* Radar capability */ HAL_CAP_AR = 1, /* AR capability */ } HAL_PHYDIAG_CAPS; /* * Enable/disable strong signal fast diversity */ #define HAL_CAP_STRONG_DIV 2 /* * Each chip or class of chips registers to offer support. */ struct ath_hal_chip { const char *name; const char *(*probe)(uint16_t vendorid, uint16_t devid); struct ath_hal *(*attach)(uint16_t devid, HAL_SOFTC, HAL_BUS_TAG, HAL_BUS_HANDLE, uint16_t *eepromdata, HAL_OPS_CONFIG *ah, HAL_STATUS *error); }; #ifndef AH_CHIP #define AH_CHIP(_name, _probe, _attach) \ static struct ath_hal_chip _name##_chip = { \ .name = #_name, \ .probe = _probe, \ .attach = _attach \ }; \ OS_DATA_SET(ah_chips, _name##_chip) #endif /* * Each RF backend registers to offer support; this is mostly * used by multi-chip 5212 solutions. Single-chip solutions * have a fixed idea about which RF to use. */ struct ath_hal_rf { const char *name; HAL_BOOL (*probe)(struct ath_hal *ah); HAL_BOOL (*attach)(struct ath_hal *ah, HAL_STATUS *ecode); }; #ifndef AH_RF #define AH_RF(_name, _probe, _attach) \ static struct ath_hal_rf _name##_rf = { \ .name = __STRING(_name), \ .probe = _probe, \ .attach = _attach \ }; \ OS_DATA_SET(ah_rfs, _name##_rf) #endif struct ath_hal_rf *ath_hal_rfprobe(struct ath_hal *ah, HAL_STATUS *ecode); /* * Maximum number of internal channels. Entries are per unique * frequency so this might need to be increased to handle all * usage cases; typically no more than 32 are really needed but * dynamically allocating the data structures is a bit painful * right now.
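 *
 * (Returning to the registration machinery above: a hypothetical chip
 * back-end announces itself with, e.g.,
 *
 *	AH_CHIP(ar9999, ar9999Probe, ar9999Attach);
 *
 * which drops its ath_hal_chip record into the ah_chips data set that
 * the probe/attach path walks; AH_RF registers an RF backend the same
 * way via ah_rfs.  The ar9999 names are illustrative only.)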
*/ #ifndef AH_MAXCHAN #define AH_MAXCHAN 96 #endif #define HAL_NF_CAL_HIST_LEN_FULL 5 #define HAL_NF_CAL_HIST_LEN_SMALL 1 #define HAL_NUM_NF_READINGS 6 /* 3 chains * (ctl + ext) */ #define HAL_NF_LOAD_DELAY 1000 /* * PER_CHAN doesn't work for now, as it looks like the device layer * has to pre-populate the per-channel list with nominal values. */ //#define ATH_NF_PER_CHAN 1 typedef struct { u_int8_t curr_index; int8_t invalidNFcount; /* TO DO: REMOVE THIS! */ int16_t priv_nf[HAL_NUM_NF_READINGS]; } HAL_NFCAL_BASE; typedef struct { HAL_NFCAL_BASE base; int16_t nf_cal_buffer[HAL_NF_CAL_HIST_LEN_FULL][HAL_NUM_NF_READINGS]; } HAL_NFCAL_HIST_FULL; typedef struct { HAL_NFCAL_BASE base; int16_t nf_cal_buffer[HAL_NF_CAL_HIST_LEN_SMALL][HAL_NUM_NF_READINGS]; } HAL_NFCAL_HIST_SMALL; #ifdef ATH_NF_PER_CHAN typedef HAL_NFCAL_HIST_FULL HAL_CHAN_NFCAL_HIST; #define AH_HOME_CHAN_NFCAL_HIST(ah, ichan) (ichan ? &ichan->nf_cal_hist: NULL) #else typedef HAL_NFCAL_HIST_SMALL HAL_CHAN_NFCAL_HIST; #define AH_HOME_CHAN_NFCAL_HIST(ah, ichan) (&AH_PRIVATE(ah)->nf_cal_hist) #endif /* ATH_NF_PER_CHAN */ /* * Internal per-channel state. These are found * using ic_devdata in the ieee80211_channel. */ typedef struct { uint16_t channel; /* h/w frequency, NB: may be mapped */ uint8_t privFlags; #define CHANNEL_IQVALID 0x01 /* IQ calibration valid */ #define CHANNEL_ANI_INIT 0x02 /* ANI state initialized */ #define CHANNEL_ANI_SETUP 0x04 /* ANI state setup */ #define CHANNEL_MIMO_NF_VALID 0x04 /* Mimo NF values are valid */ uint8_t calValid; /* bitmask of cal types */ int8_t iCoff; int8_t qCoff; int16_t rawNoiseFloor; int16_t noiseFloorAdjust; #ifdef AH_SUPPORT_AR5416 int16_t noiseFloorCtl[AH_MAX_CHAINS]; int16_t noiseFloorExt[AH_MAX_CHAINS]; #endif /* AH_SUPPORT_AR5416 */ uint16_t mainSpur; /* cached spur value for this channel */ /*XXX TODO: make these part of privFlags */ uint8_t paprd_done:1, /* 1: PAPRD DONE, 0: PAPRD Cal not done */ paprd_table_write_done:1; /* 1: DONE, 0: Cal data write not done */ int one_time_cals_done; HAL_CHAN_NFCAL_HIST nf_cal_hist; } HAL_CHANNEL_INTERNAL; /* channel requires noise floor check */ #define CHANNEL_NFCREQUIRED IEEE80211_CHAN_PRIV0 /* all full-width channels */ #define IEEE80211_CHAN_ALLFULL \ (IEEE80211_CHAN_ALL - (IEEE80211_CHAN_HALF | IEEE80211_CHAN_QUARTER)) #define IEEE80211_CHAN_ALLTURBOFULL \ (IEEE80211_CHAN_ALLTURBO - \ (IEEE80211_CHAN_HALF | IEEE80211_CHAN_QUARTER)) typedef struct { uint32_t halChanSpreadSupport : 1, halSleepAfterBeaconBroken : 1, halCompressSupport : 1, halBurstSupport : 1, halFastFramesSupport : 1, halChapTuningSupport : 1, halTurboGSupport : 1, halTurboPrimeSupport : 1, halMicAesCcmSupport : 1, halMicCkipSupport : 1, halMicTkipSupport : 1, halTkipMicTxRxKeySupport : 1, halCipherAesCcmSupport : 1, halCipherCkipSupport : 1, halCipherTkipSupport : 1, halPSPollBroken : 1, halVEOLSupport : 1, halBssIdMaskSupport : 1, halMcastKeySrchSupport : 1, halTsfAddSupport : 1, halChanHalfRate : 1, halChanQuarterRate : 1, halHTSupport : 1, halHTSGI20Support : 1, halRfSilentSupport : 1, halHwPhyCounterSupport : 1, halWowSupport : 1, halWowMatchPatternExact : 1, halAutoSleepSupport : 1, halFastCCSupport : 1, halBtCoexSupport : 1; uint32_t halRxStbcSupport : 1, halTxStbcSupport : 1, halGTTSupport : 1, halCSTSupport : 1, halRifsRxSupport : 1, halRifsTxSupport : 1, hal4AddrAggrSupport : 1, halExtChanDfsSupport : 1, halUseCombinedRadarRssi : 1, halForcePpmSupport : 1, halEnhancedPmSupport : 1, halEnhancedDfsSupport : 1, halMbssidAggrSupport : 1, halBssidMatchSupport : 1, 
hal4kbSplitTransSupport : 1, halHasRxSelfLinkedTail : 1, halSupportsFastClock5GHz : 1, halHasBBReadWar : 1, halSerialiseRegWar : 1, halMciSupport : 1, halRxTxAbortSupport : 1, halPaprdEnabled : 1, halHasUapsdSupport : 1, halWpsPushButtonSupport : 1, halBtCoexApsmWar : 1, halGenTimerSupport : 1, halLDPCSupport : 1, halHwBeaconProcSupport : 1, halEnhancedDmaSupport : 1; uint32_t halIsrRacSupport : 1, halApmEnable : 1, halIntrMitigation : 1, hal49GhzSupport : 1, halAntDivCombSupport : 1, halAntDivCombSupportOrg : 1, halRadioRetentionSupport : 1, halSpectralScanSupport : 1, halRxUsingLnaMixing : 1, halRxDoMyBeacon : 1, halHwUapsdTrig : 1; uint32_t halWirelessModes; uint16_t halTotalQueues; uint16_t halKeyCacheSize; uint16_t halLow5GhzChan, halHigh5GhzChan; uint16_t halLow2GhzChan, halHigh2GhzChan; int halTxTstampPrecision; int halRxTstampPrecision; int halRtsAggrLimit; uint8_t halTxChainMask; uint8_t halRxChainMask; uint8_t halNumGpioPins; uint8_t halNumAntCfg2GHz; uint8_t halNumAntCfg5GHz; uint32_t halIntrMask; uint8_t halTxStreams; uint8_t halRxStreams; HAL_MFP_OPT_T halMfpSupport; /* AR9300 HAL porting capabilities */ int hal_paprd_enabled; int hal_pcie_lcr_offset; int hal_pcie_lcr_extsync_en; int halNumTxMaps; int halTxDescLen; int halTxStatusLen; int halRxStatusLen; int halRxHpFifoDepth; int halRxLpFifoDepth; uint32_t halRegCap; /* XXX needed? */ int halNumMRRetries; int hal_ani_poll_interval; int hal_channel_switch_time_usec; } HAL_CAPABILITIES; struct regDomain; /* * Definitions for ah_flags in ath_hal_private */ #define AH_USE_EEPROM 0x1 #define AH_IS_HB63 0x2 /* * The ``private area'' follows immediately after the ``public area'' * in the data structure returned by ath_hal_attach. Private data are * used by device-independent code such as the regulatory domain support. * In general, code within the HAL should never depend on data in the * public area. Instead any public data needed internally should be * shadowed here. * * When declaring a device-specific ath_hal data structure this structure * is assumed to at the front; e.g. * * struct ath_hal_5212 { * struct ath_hal_private ah_priv; * ... * }; * * It might be better to manage the method pointers in this structure * using an indirect pointer to a read-only data structure but this would * disallow class-style method overriding. */ struct ath_hal_private { struct ath_hal h; /* public area */ /* NB: all methods go first to simplify initialization */ HAL_BOOL (*ah_getChannelEdges)(struct ath_hal*, uint16_t channelFlags, uint16_t *lowChannel, uint16_t *highChannel); u_int (*ah_getWirelessModes)(struct ath_hal*); HAL_BOOL (*ah_eepromRead)(struct ath_hal *, u_int off, uint16_t *data); HAL_BOOL (*ah_eepromWrite)(struct ath_hal *, u_int off, uint16_t data); HAL_BOOL (*ah_getChipPowerLimits)(struct ath_hal *, struct ieee80211_channel *); int16_t (*ah_getNfAdjust)(struct ath_hal *, const HAL_CHANNEL_INTERNAL*); void (*ah_getNoiseFloor)(struct ath_hal *, int16_t nfarray[]); void *ah_eeprom; /* opaque EEPROM state */ uint16_t ah_eeversion; /* EEPROM version */ void (*ah_eepromDetach)(struct ath_hal *); HAL_STATUS (*ah_eepromGet)(struct ath_hal *, int, void *); HAL_STATUS (*ah_eepromSet)(struct ath_hal *, int, int); uint16_t (*ah_getSpurChan)(struct ath_hal *, int, HAL_BOOL); HAL_BOOL (*ah_eepromDiag)(struct ath_hal *, int request, const void *args, uint32_t argsize, void **result, uint32_t *resultsize); /* * Device revision information. 
*/ uint16_t ah_devid; /* PCI device ID */ uint16_t ah_subvendorid; /* PCI subvendor ID */ uint32_t ah_macVersion; /* MAC version id */ uint16_t ah_macRev; /* MAC revision */ uint16_t ah_phyRev; /* PHY revision */ uint16_t ah_analog5GhzRev; /* 2GHz radio revision */ uint16_t ah_analog2GhzRev; /* 5GHz radio revision */ uint32_t ah_flags; /* misc flags */ uint8_t ah_ispcie; /* PCIE, special treatment */ uint8_t ah_devType; /* card type - CB, PCI, PCIe */ HAL_OPMODE ah_opmode; /* operating mode from reset */ const struct ieee80211_channel *ah_curchan;/* operating channel */ HAL_CAPABILITIES ah_caps; /* device capabilities */ uint32_t ah_diagreg; /* user-specified AR_DIAG_SW */ int16_t ah_powerLimit; /* tx power cap */ uint16_t ah_maxPowerLevel; /* calculated max tx power */ u_int ah_tpScale; /* tx power scale factor */ u_int16_t ah_extraTxPow; /* low rates extra-txpower */ uint32_t ah_11nCompat; /* 11n compat controls */ /* * State for regulatory domain handling. */ HAL_REG_DOMAIN ah_currentRD; /* EEPROM regulatory domain */ HAL_REG_DOMAIN ah_currentRDext; /* EEPROM extended regdomain flags */ HAL_DFS_DOMAIN ah_dfsDomain; /* current DFS domain */ HAL_CHANNEL_INTERNAL ah_channels[AH_MAXCHAN]; /* private chan state */ u_int ah_nchan; /* valid items in ah_channels */ const struct regDomain *ah_rd2GHz; /* reg state for 2G band */ const struct regDomain *ah_rd5GHz; /* reg state for 5G band */ uint8_t ah_coverageClass; /* coverage class */ /* * RF Silent handling; setup according to the EEPROM. */ uint16_t ah_rfsilent; /* GPIO pin + polarity */ HAL_BOOL ah_rfkillEnabled; /* enable/disable RfKill */ /* * Diagnostic support for discriminating HIUERR reports. */ uint32_t ah_fatalState[6]; /* AR_ISR+shadow regs */ int ah_rxornIsFatal; /* how to treat HAL_INT_RXORN */ /* Only used if ATH_NF_PER_CHAN is defined */ HAL_NFCAL_HIST_FULL nf_cal_hist; /* * Channel survey history - current channel only. 
*/ HAL_CHANNEL_SURVEY ah_chansurvey; /* channel survey */ }; #define AH_PRIVATE(_ah) ((struct ath_hal_private *)(_ah)) #define ath_hal_getChannelEdges(_ah, _cf, _lc, _hc) \ AH_PRIVATE(_ah)->ah_getChannelEdges(_ah, _cf, _lc, _hc) #define ath_hal_getWirelessModes(_ah) \ AH_PRIVATE(_ah)->ah_getWirelessModes(_ah) #define ath_hal_eepromRead(_ah, _off, _data) \ AH_PRIVATE(_ah)->ah_eepromRead(_ah, _off, _data) #define ath_hal_eepromWrite(_ah, _off, _data) \ AH_PRIVATE(_ah)->ah_eepromWrite(_ah, _off, _data) #define ath_hal_gpioCfgOutput(_ah, _gpio, _type) \ (_ah)->ah_gpioCfgOutput(_ah, _gpio, _type) #define ath_hal_gpioCfgInput(_ah, _gpio) \ (_ah)->ah_gpioCfgInput(_ah, _gpio) #define ath_hal_gpioGet(_ah, _gpio) \ (_ah)->ah_gpioGet(_ah, _gpio) #define ath_hal_gpioSet(_ah, _gpio, _val) \ (_ah)->ah_gpioSet(_ah, _gpio, _val) #define ath_hal_gpioSetIntr(_ah, _gpio, _ilevel) \ (_ah)->ah_gpioSetIntr(_ah, _gpio, _ilevel) #define ath_hal_getpowerlimits(_ah, _chan) \ AH_PRIVATE(_ah)->ah_getChipPowerLimits(_ah, _chan) #define ath_hal_getNfAdjust(_ah, _c) \ AH_PRIVATE(_ah)->ah_getNfAdjust(_ah, _c) #define ath_hal_getNoiseFloor(_ah, _nfArray) \ AH_PRIVATE(_ah)->ah_getNoiseFloor(_ah, _nfArray) #define ath_hal_configPCIE(_ah, _reset, _poweroff) \ (_ah)->ah_configPCIE(_ah, _reset, _poweroff) #define ath_hal_disablePCIE(_ah) \ (_ah)->ah_disablePCIE(_ah) #define ath_hal_setInterrupts(_ah, _mask) \ (_ah)->ah_setInterrupts(_ah, _mask) #define ath_hal_isrfkillenabled(_ah) \ (ath_hal_getcapability(_ah, HAL_CAP_RFSILENT, 1, AH_NULL) == HAL_OK) #define ath_hal_enable_rfkill(_ah, _v) \ ath_hal_setcapability(_ah, HAL_CAP_RFSILENT, 1, _v, AH_NULL) #define ath_hal_hasrfkill_int(_ah) \ (ath_hal_getcapability(_ah, HAL_CAP_RFSILENT, 3, AH_NULL) == HAL_OK) #define ath_hal_eepromDetach(_ah) do { \ if (AH_PRIVATE(_ah)->ah_eepromDetach != AH_NULL) \ AH_PRIVATE(_ah)->ah_eepromDetach(_ah); \ } while (0) #define ath_hal_eepromGet(_ah, _param, _val) \ AH_PRIVATE(_ah)->ah_eepromGet(_ah, _param, _val) #define ath_hal_eepromSet(_ah, _param, _val) \ AH_PRIVATE(_ah)->ah_eepromSet(_ah, _param, _val) #define ath_hal_eepromGetFlag(_ah, _param) \ (AH_PRIVATE(_ah)->ah_eepromGet(_ah, _param, AH_NULL) == HAL_OK) #define ath_hal_getSpurChan(_ah, _ix, _is2G) \ AH_PRIVATE(_ah)->ah_getSpurChan(_ah, _ix, _is2G) #define ath_hal_eepromDiag(_ah, _request, _a, _asize, _r, _rsize) \ AH_PRIVATE(_ah)->ah_eepromDiag(_ah, _request, _a, _asize, _r, _rsize) #ifndef _NET_IF_IEEE80211_H_ /* * Stuff that would naturally come from _ieee80211.h */ #define IEEE80211_ADDR_LEN 6 #define IEEE80211_WEP_IVLEN 3 /* 24bit */ #define IEEE80211_WEP_KIDLEN 1 /* 1 octet */ #define IEEE80211_WEP_CRCLEN 4 /* CRC-32 */ #define IEEE80211_CRC_LEN 4 #define IEEE80211_MAX_LEN (2300 + IEEE80211_CRC_LEN + \ (IEEE80211_WEP_IVLEN + IEEE80211_WEP_KIDLEN + IEEE80211_WEP_CRCLEN)) #endif /* _NET_IF_IEEE80211_H_ */ #define HAL_TXQ_USE_LOCKOUT_BKOFF_DIS 0x00000001 #define INIT_AIFS 2 #define INIT_CWMIN 15 #define INIT_CWMIN_11B 31 #define INIT_CWMAX 1023 #define INIT_SH_RETRY 10 #define INIT_LG_RETRY 10 #define INIT_SSH_RETRY 32 #define INIT_SLG_RETRY 32 typedef struct { uint32_t tqi_ver; /* HAL TXQ verson */ HAL_TX_QUEUE tqi_type; /* hw queue type*/ HAL_TX_QUEUE_SUBTYPE tqi_subtype; /* queue subtype, if applicable */ HAL_TX_QUEUE_FLAGS tqi_qflags; /* queue flags */ uint32_t tqi_priority; uint32_t tqi_aifs; /* aifs */ uint32_t tqi_cwmin; /* cwMin */ uint32_t tqi_cwmax; /* cwMax */ uint16_t tqi_shretry; /* frame short retry limit */ uint16_t tqi_lgretry; /* frame long retry limit */ uint32_t 
tqi_cbrPeriod; uint32_t tqi_cbrOverflowLimit; uint32_t tqi_burstTime; uint32_t tqi_readyTime; uint32_t tqi_physCompBuf; uint32_t tqi_intFlags; /* flags for internal use */ } HAL_TX_QUEUE_INFO; extern HAL_BOOL ath_hal_setTxQProps(struct ath_hal *ah, HAL_TX_QUEUE_INFO *qi, const HAL_TXQ_INFO *qInfo); extern HAL_BOOL ath_hal_getTxQProps(struct ath_hal *ah, HAL_TXQ_INFO *qInfo, const HAL_TX_QUEUE_INFO *qi); #define HAL_SPUR_VAL_MASK 0x3FFF #define HAL_SPUR_CHAN_WIDTH 87 #define HAL_BIN_WIDTH_BASE_100HZ 3125 #define HAL_BIN_WIDTH_TURBO_100HZ 6250 #define HAL_MAX_BINS_ALLOWED 28 #define IS_CHAN_5GHZ(_c) ((_c)->channel > 4900) #define IS_CHAN_2GHZ(_c) (!IS_CHAN_5GHZ(_c)) #define IS_CHAN_IN_PUBLIC_SAFETY_BAND(_c) ((_c) > 4940 && (_c) < 4990) /* * Deduce if the host cpu has big- or little-endian byte order. */ static __inline__ int isBigEndian(void) { union { int32_t i; char c[4]; } u; u.i = 1; return (u.c[0] == 0); } /* unaligned little endian access */ #define LE_READ_2(p) \ ((uint16_t) \ ((((const uint8_t *)(p))[0] ) | (((const uint8_t *)(p))[1]<< 8))) #define LE_READ_4(p) \ ((uint32_t) \ ((((const uint8_t *)(p))[0] ) | (((const uint8_t *)(p))[1]<< 8) |\ (((const uint8_t *)(p))[2]<<16) | (((const uint8_t *)(p))[3]<<24))) /* * Register manipulation macros that expect bit field defines * to follow the convention that an _S suffix is appended for * a shift count, while the field mask has no suffix. */ #define SM(_v, _f) (((_v) << _f##_S) & (_f)) #define MS(_v, _f) (((_v) & (_f)) >> _f##_S) #define OS_REG_RMW(_a, _r, _set, _clr) \ OS_REG_WRITE(_a, _r, (OS_REG_READ(_a, _r) & ~(_clr)) | (_set)) #define OS_REG_RMW_FIELD(_a, _r, _f, _v) \ OS_REG_WRITE(_a, _r, \ (OS_REG_READ(_a, _r) &~ (_f)) | (((_v) << _f##_S) & (_f))) #define OS_REG_SET_BIT(_a, _r, _f) \ OS_REG_WRITE(_a, _r, OS_REG_READ(_a, _r) | (_f)) #define OS_REG_CLR_BIT(_a, _r, _f) \ OS_REG_WRITE(_a, _r, OS_REG_READ(_a, _r) &~ (_f)) #define OS_REG_IS_BIT_SET(_a, _r, _f) \ ((OS_REG_READ(_a, _r) & (_f)) != 0) #define OS_REG_RMW_FIELD_ALT(_a, _r, _f, _v) \ OS_REG_WRITE(_a, _r, \ (OS_REG_READ(_a, _r) &~(_f<<_f##_S)) | \ (((_v) << _f##_S) & (_f<<_f##_S))) #define OS_REG_READ_FIELD(_a, _r, _f) \ (((OS_REG_READ(_a, _r) & _f) >> _f##_S)) #define OS_REG_READ_FIELD_ALT(_a, _r, _f) \ ((OS_REG_READ(_a, _r) >> (_f##_S))&(_f)) /* Analog register writes may require a delay between each one (eg Merlin?) */ #define OS_A_REG_RMW_FIELD(_a, _r, _f, _v) \ do { OS_REG_WRITE(_a, _r, (OS_REG_READ(_a, _r) &~ (_f)) | \ (((_v) << _f##_S) & (_f))) ; OS_DELAY(100); } while (0) #define OS_A_REG_WRITE(_a, _r, _v) \ do { OS_REG_WRITE(_a, _r, _v); OS_DELAY(100); } while (0) /* wait for the register contents to have the specified value */ extern HAL_BOOL ath_hal_wait(struct ath_hal *, u_int reg, uint32_t mask, uint32_t val); extern HAL_BOOL ath_hal_waitfor(struct ath_hal *, u_int reg, uint32_t mask, uint32_t val, uint32_t timeout); /* return the first n bits in val reversed */ extern uint32_t ath_hal_reverseBits(uint32_t val, uint32_t n); /* printf interfaces */ extern void ath_hal_printf(struct ath_hal *, const char*, ...)
__printflike(2,3); extern void ath_hal_vprintf(struct ath_hal *, const char*, __va_list) __printflike(2, 0); extern const char* ath_hal_ether_sprintf(const uint8_t *mac); /* allocate and free memory */ extern void *ath_hal_malloc(size_t); extern void ath_hal_free(void *); /* common debugging interfaces */ #ifdef AH_DEBUG #include "ah_debug.h" extern int ath_hal_debug; /* Global debug flags */ /* * The typecast is purely because some callers will pass in * AH_NULL directly rather than using a NULL ath_hal pointer. */ #define HALDEBUG(_ah, __m, ...) \ do { \ if ((__m) == HAL_DEBUG_UNMASKABLE || \ ath_hal_debug & (__m) || \ ((_ah) != NULL && \ ((struct ath_hal *) (_ah))->ah_config.ah_debug & (__m))) { \ DO_HALDEBUG((_ah), (__m), __VA_ARGS__); \ } \ } while(0); extern void DO_HALDEBUG(struct ath_hal *ah, u_int mask, const char* fmt, ...) __printflike(3,4); #else #define HALDEBUG(_ah, __m, ...) #endif /* AH_DEBUG */ /* * Register logging definitions shared with ardecode. */ #include "ah_decode.h" /* * Common assertion interface. Note: it is a bad idea to generate * an assertion failure for any recoverable event. Instead catch * the violation and, if possible, fix it up or recover from it; either * with an error return value or a diagnostic messages. System software * does not panic unless the situation is hopeless. */ #ifdef AH_ASSERT extern void ath_hal_assert_failed(const char* filename, int lineno, const char* msg); #define HALASSERT(_x) do { \ if (!(_x)) { \ ath_hal_assert_failed(__FILE__, __LINE__, #_x); \ } \ } while (0) #else #define HALASSERT(_x) #endif /* AH_ASSERT */ /* * Regulatory domain support. */ /* * Return the max allowed antenna gain and apply any regulatory * domain specific changes. */ u_int ath_hal_getantennareduction(struct ath_hal *ah, const struct ieee80211_channel *chan, u_int twiceGain); /* * Return the test group for the specific channel based on * the current regulatory setup. */ u_int ath_hal_getctl(struct ath_hal *, const struct ieee80211_channel *); /* * Map a public channel definition to the corresponding * internal data structure. This implicitly specifies * whether or not the specified channel is ok to use * based on the current regulatory domain constraints. */ #ifndef AH_DEBUG static OS_INLINE HAL_CHANNEL_INTERNAL * ath_hal_checkchannel(struct ath_hal *ah, const struct ieee80211_channel *c) { HAL_CHANNEL_INTERNAL *cc; HALASSERT(c->ic_devdata < AH_PRIVATE(ah)->ah_nchan); cc = &AH_PRIVATE(ah)->ah_channels[c->ic_devdata]; HALASSERT(c->ic_freq == cc->channel || IEEE80211_IS_CHAN_GSM(c)); return cc; } #else /* NB: non-inline version that checks state */ HAL_CHANNEL_INTERNAL *ath_hal_checkchannel(struct ath_hal *, const struct ieee80211_channel *); #endif /* AH_DEBUG */ /* * Return the h/w frequency for a channel. This may be * different from ic_freq if this is a GSM device that * takes 2.4GHz frequencies and down-converts them. */ static OS_INLINE uint16_t ath_hal_gethwchannel(struct ath_hal *ah, const struct ieee80211_channel *c) { return ath_hal_checkchannel(ah, c)->channel; } /* - * Convert between microseconds and core system clocks. - */ -extern u_int ath_hal_mac_clks(struct ath_hal *ah, u_int usecs); -extern u_int ath_hal_mac_usec(struct ath_hal *ah, u_int clks); - -/* * Generic get/set capability support. Each chip overrides * this routine to support chip-specific capabilities. 
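 *
 * For example, the rfkill convenience wrappers earlier in this file
 * boil down to a query of this form (HAL_CAP_RFSILENT parameter 1 is
 * the enable flag there):
 *
 *	if (ath_hal_getcapability(ah, HAL_CAP_RFSILENT, 1, AH_NULL) == HAL_OK)
 *		... rfkill is enabled ...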
*/ extern HAL_STATUS ath_hal_getcapability(struct ath_hal *ah, HAL_CAPABILITY_TYPE type, uint32_t capability, uint32_t *result); extern HAL_BOOL ath_hal_setcapability(struct ath_hal *ah, HAL_CAPABILITY_TYPE type, uint32_t capability, uint32_t setting, HAL_STATUS *status); /* The diagnostic codes used to be internally defined here -adrian */ #include "ah_diagcodes.h" /* * The AR5416 and later HALs have MAC and baseband hang checking. */ typedef struct { uint32_t hang_reg_offset; uint32_t hang_val; uint32_t hang_mask; uint32_t hang_offset; } hal_hw_hang_check_t; typedef struct { uint32_t dma_dbg_3; uint32_t dma_dbg_4; uint32_t dma_dbg_5; uint32_t dma_dbg_6; } mac_dbg_regs_t; typedef enum { dcu_chain_state = 0x1, dcu_complete_state = 0x2, qcu_state = 0x4, qcu_fsp_ok = 0x8, qcu_fsp_state = 0x10, qcu_stitch_state = 0x20, qcu_fetch_state = 0x40, qcu_complete_state = 0x80 } hal_mac_hangs_t; typedef struct { int states; uint8_t dcu_chain_state; uint8_t dcu_complete_state; uint8_t qcu_state; uint8_t qcu_fsp_ok; uint8_t qcu_fsp_state; uint8_t qcu_stitch_state; uint8_t qcu_fetch_state; uint8_t qcu_complete_state; } hal_mac_hang_check_t; enum { HAL_BB_HANG_DFS = 0x0001, HAL_BB_HANG_RIFS = 0x0002, HAL_BB_HANG_RX_CLEAR = 0x0004, HAL_BB_HANG_UNKNOWN = 0x0080, HAL_MAC_HANG_SIG1 = 0x0100, HAL_MAC_HANG_SIG2 = 0x0200, HAL_MAC_HANG_UNKNOWN = 0x8000, HAL_BB_HANGS = HAL_BB_HANG_DFS | HAL_BB_HANG_RIFS | HAL_BB_HANG_RX_CLEAR | HAL_BB_HANG_UNKNOWN, HAL_MAC_HANGS = HAL_MAC_HANG_SIG1 | HAL_MAC_HANG_SIG2 | HAL_MAC_HANG_UNKNOWN, }; /* Merge these with above */ typedef enum hal_hw_hangs { HAL_DFS_BB_HANG_WAR = 0x1, HAL_RIFS_BB_HANG_WAR = 0x2, HAL_RX_STUCK_LOW_BB_HANG_WAR = 0x4, HAL_MAC_HANG_WAR = 0x8, HAL_PHYRESTART_CLR_WAR = 0x10, HAL_MAC_HANG_DETECTED = 0x40000000, HAL_BB_HANG_DETECTED = 0x80000000 } hal_hw_hangs_t; /* * Device revision information. */ typedef struct { uint16_t ah_devid; /* PCI device ID */ uint16_t ah_subvendorid; /* PCI subvendor ID */ uint32_t ah_macVersion; /* MAC version id */ uint16_t ah_macRev; /* MAC revision */ uint16_t ah_phyRev; /* PHY revision */ uint16_t ah_analog5GhzRev; /* 2GHz radio revision */ uint16_t ah_analog2GhzRev; /* 5GHz radio revision */ } HAL_REVS; /* * Argument payload for HAL_DIAG_SETKEY. */ typedef struct { HAL_KEYVAL dk_keyval; uint16_t dk_keyix; /* key index */ uint8_t dk_mac[IEEE80211_ADDR_LEN]; int dk_xor; /* XOR key data */ } HAL_DIAG_KEYVAL; /* * Argument payload for HAL_DIAG_EEWRITE. */ typedef struct { uint16_t ee_off; /* eeprom offset */ uint16_t ee_data; /* write data */ } HAL_DIAG_EEVAL; typedef struct { u_int offset; /* reg offset */ uint32_t val; /* reg value */ } HAL_DIAG_REGVAL; /* * 11n compatibility tweaks. */ #define HAL_DIAG_11N_SERVICES 0x00000003 #define HAL_DIAG_11N_SERVICES_S 0 #define HAL_DIAG_11N_TXSTOMP 0x0000000c #define HAL_DIAG_11N_TXSTOMP_S 2 typedef struct { int maxNoiseImmunityLevel; /* [0..4] */ int totalSizeDesired[5]; int coarseHigh[5]; int coarseLow[5]; int firpwr[5]; int maxSpurImmunityLevel; /* [0..7] */ int cycPwrThr1[8]; int maxFirstepLevel; /* [0..2] */ int firstep[3]; uint32_t ofdmTrigHigh; uint32_t ofdmTrigLow; int32_t cckTrigHigh; int32_t cckTrigLow; int32_t rssiThrLow; int32_t rssiThrHigh; int period; /* update listen period */ } HAL_ANI_PARAMS; extern HAL_BOOL ath_hal_getdiagstate(struct ath_hal *ah, int request, const void *args, uint32_t argsize, void **result, uint32_t *resultsize); /* * Setup a h/w rate table for use. 
*/ extern void ath_hal_setupratetable(struct ath_hal *ah, HAL_RATE_TABLE *rt); /* * Common routine for implementing getChanNoise api. */ int16_t ath_hal_getChanNoise(struct ath_hal *, const struct ieee80211_channel *); /* * Initialization support. */ typedef struct { const uint32_t *data; int rows, cols; } HAL_INI_ARRAY; #define HAL_INI_INIT(_ia, _data, _cols) do { \ (_ia)->data = (const uint32_t *)(_data); \ (_ia)->rows = sizeof(_data) / sizeof((_data)[0]); \ (_ia)->cols = (_cols); \ } while (0) #define HAL_INI_VAL(_ia, _r, _c) \ ((_ia)->data[((_r)*(_ia)->cols) + (_c)]) /* * OS_DELAY() does a PIO READ on the PCI bus which allows * other cards' DMA reads to complete in the middle of our reset. */ #define DMA_YIELD(x) do { \ if ((++(x) % 64) == 0) \ OS_DELAY(1); \ } while (0) #define HAL_INI_WRITE_ARRAY(ah, regArray, col, regWr) do { \ int r; \ for (r = 0; r < N(regArray); r++) { \ OS_REG_WRITE(ah, (regArray)[r][0], (regArray)[r][col]); \ DMA_YIELD(regWr); \ } \ } while (0) #define HAL_INI_WRITE_BANK(ah, regArray, bankData, regWr) do { \ int r; \ for (r = 0; r < N(regArray); r++) { \ OS_REG_WRITE(ah, (regArray)[r][0], (bankData)[r]); \ DMA_YIELD(regWr); \ } \ } while (0) extern int ath_hal_ini_write(struct ath_hal *ah, const HAL_INI_ARRAY *ia, int col, int regWr); extern void ath_hal_ini_bank_setup(uint32_t data[], const HAL_INI_ARRAY *ia, int col); extern int ath_hal_ini_bank_write(struct ath_hal *ah, const HAL_INI_ARRAY *ia, const uint32_t data[], int regWr); #define CCK_SIFS_TIME 10 #define CCK_PREAMBLE_BITS 144 #define CCK_PLCP_BITS 48 #define OFDM_SIFS_TIME 16 #define OFDM_PREAMBLE_TIME 20 #define OFDM_PLCP_BITS 22 #define OFDM_SYMBOL_TIME 4 #define OFDM_HALF_SIFS_TIME 32 #define OFDM_HALF_PREAMBLE_TIME 40 #define OFDM_HALF_PLCP_BITS 22 #define OFDM_HALF_SYMBOL_TIME 8 #define OFDM_QUARTER_SIFS_TIME 64 #define OFDM_QUARTER_PREAMBLE_TIME 80 #define OFDM_QUARTER_PLCP_BITS 22 #define OFDM_QUARTER_SYMBOL_TIME 16 #define TURBO_SIFS_TIME 8 #define TURBO_PREAMBLE_TIME 14 #define TURBO_PLCP_BITS 22 #define TURBO_SYMBOL_TIME 4 #define WLAN_CTRL_FRAME_SIZE (2+2+6+4) /* ACK+FCS */ /* Generic EEPROM board value functions */ extern HAL_BOOL ath_ee_getLowerUpperIndex(uint8_t target, uint8_t *pList, uint16_t listSize, uint16_t *indexL, uint16_t *indexR); extern HAL_BOOL ath_ee_FillVpdTable(uint8_t pwrMin, uint8_t pwrMax, uint8_t *pPwrList, uint8_t *pVpdList, uint16_t numIntercepts, uint8_t *pRetVpdList); extern int16_t ath_ee_interpolate(uint16_t target, uint16_t srcLeft, uint16_t srcRight, int16_t targetLeft, int16_t targetRight); /* Whether 5ghz fast clock is needed */ /* * The chipset (Merlin, AR9300/later) should set the capability flag below; * this flag simply says that the hardware can do it, not that the EEPROM * says it can. * * Merlin 2.0/2.1 chips with an EEPROM version > 16 do 5ghz fast clock * if the relevant eeprom flag is set. * Merlin 2.0/2.1 chips with an EEPROM version <= 16 do 5ghz fast clock * by default. */ #define IS_5GHZ_FAST_CLOCK_EN(_ah, _c) \ (IEEE80211_IS_CHAN_5GHZ(_c) && \ AH_PRIVATE((_ah))->ah_caps.halSupportsFastClock5GHz && \ ath_hal_eepromGetFlag((_ah), AR_EEP_FSTCLK_5G)) /* * Fetch the maximum regulatory domain power for the given channel * in 1/2dBm steps. */ static inline int ath_hal_get_twice_max_regpower(struct ath_hal_private *ahp, const HAL_CHANNEL_INTERNAL *ichan, const struct ieee80211_channel *chan) { struct ath_hal *ah = &ahp->h; if (! 
chan) { ath_hal_printf(ah, "%s: called with chan=NULL!\n", __func__); return (0); } return (chan->ic_maxpower); } /* * Get the maximum antenna gain allowed, in 1/2dBm steps. */ static inline int ath_hal_getantennaallowed(struct ath_hal *ah, const struct ieee80211_channel *chan) { if (! chan) return (0); return (chan->ic_maxantgain); } /* * Map the given 2GHz channel to an IEEE number. */ extern int ath_hal_mhz2ieee_2ghz(struct ath_hal *, int freq); /* * Clear the channel survey data. */ extern void ath_hal_survey_clear(struct ath_hal *ah); /* * Add a sample to the channel survey data. */ extern void ath_hal_survey_add_sample(struct ath_hal *ah, HAL_SURVEY_SAMPLE *hs); #endif /* _ATH_AH_INTERAL_H_ */ Index: projects/clang390-import/sys/dev/ath/ath_hal/ar5416/ar5416_reset.c =================================================================== --- projects/clang390-import/sys/dev/ath/ath_hal/ar5416/ar5416_reset.c (revision 305686) +++ projects/clang390-import/sys/dev/ath/ath_hal/ar5416/ar5416_reset.c (revision 305687) @@ -1,2891 +1,2894 @@ /* * Copyright (c) 2002-2009 Sam Leffler, Errno Consulting * Copyright (c) 2002-2008 Atheros Communications, Inc. * * Permission to use, copy, modify, and/or distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. * * $FreeBSD$ */ #include "opt_ah.h" #include "ah.h" #include "ah_internal.h" #include "ah_devid.h" #include "ah_eeprom_v14.h" #include "ar5416/ar5416.h" #include "ar5416/ar5416reg.h" #include "ar5416/ar5416phy.h" /* Eeprom versioning macros. 
Returns true if the version is equal or newer than the ver specified */ #define EEP_MINOR(_ah) \ (AH_PRIVATE(_ah)->ah_eeversion & AR5416_EEP_VER_MINOR_MASK) #define IS_EEP_MINOR_V2(_ah) (EEP_MINOR(_ah) >= AR5416_EEP_MINOR_VER_2) #define IS_EEP_MINOR_V3(_ah) (EEP_MINOR(_ah) >= AR5416_EEP_MINOR_VER_3) /* Additional Time delay to wait after activiting the Base band */ #define BASE_ACTIVATE_DELAY 100 /* 100 usec */ #define PLL_SETTLE_DELAY 300 /* 300 usec */ #define RTC_PLL_SETTLE_DELAY 1000 /* 1 ms */ static void ar5416InitDMA(struct ath_hal *ah); static void ar5416InitBB(struct ath_hal *ah, const struct ieee80211_channel *); static void ar5416InitIMR(struct ath_hal *ah, HAL_OPMODE opmode); static void ar5416InitQoS(struct ath_hal *ah); static void ar5416InitUserSettings(struct ath_hal *ah); static void ar5416OverrideIni(struct ath_hal *ah, const struct ieee80211_channel *); #if 0 static HAL_BOOL ar5416ChannelChange(struct ath_hal *, const struct ieee80211_channel *); #endif static void ar5416SetDeltaSlope(struct ath_hal *, const struct ieee80211_channel *); static HAL_BOOL ar5416SetResetPowerOn(struct ath_hal *ah); static HAL_BOOL ar5416SetReset(struct ath_hal *ah, int type); static HAL_BOOL ar5416SetPowerPerRateTable(struct ath_hal *ah, struct ar5416eeprom *pEepData, const struct ieee80211_channel *chan, int16_t *ratesArray, uint16_t cfgCtl, uint16_t AntennaReduction, uint16_t twiceMaxRegulatoryPower, uint16_t powerLimit); static void ar5416Set11nRegs(struct ath_hal *ah, const struct ieee80211_channel *chan); static void ar5416MarkPhyInactive(struct ath_hal *ah); static void ar5416SetIFSTiming(struct ath_hal *ah, const struct ieee80211_channel *chan); /* * Places the device in and out of reset and then places sane * values in the registers based on EEPROM config, initialization * vectors (as determined by the mode), and station configuration * * bChannelChange is used to preserve DMA/PCU registers across * a HW Reset during channel change. */ HAL_BOOL ar5416Reset(struct ath_hal *ah, HAL_OPMODE opmode, struct ieee80211_channel *chan, HAL_BOOL bChannelChange, HAL_RESET_TYPE resetType, HAL_STATUS *status) { #define N(a) (sizeof (a) / sizeof (a[0])) #define FAIL(_code) do { ecode = _code; goto bad; } while (0) struct ath_hal_5212 *ahp = AH5212(ah); HAL_CHANNEL_INTERNAL *ichan; uint32_t saveDefAntenna, saveLedState; uint32_t macStaId1; uint16_t rfXpdGain[2]; HAL_STATUS ecode; uint32_t powerVal, rssiThrReg; uint32_t ackTpcPow, ctsTpcPow, chirpTpcPow; int i; uint64_t tsf = 0; OS_MARK(ah, AH_MARK_RESET, bChannelChange); /* Bring out of sleep mode */ if (!ar5416SetPowerMode(ah, HAL_PM_AWAKE, AH_TRUE)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: chip did not wakeup\n", __func__); FAIL(HAL_EIO); } /* * Map public channel to private. 
*/ ichan = ath_hal_checkchannel(ah, chan); if (ichan == AH_NULL) FAIL(HAL_EINVAL); switch (opmode) { case HAL_M_STA: case HAL_M_IBSS: case HAL_M_HOSTAP: case HAL_M_MONITOR: break; default: HALDEBUG(ah, HAL_DEBUG_ANY, "%s: invalid operating mode %u\n", __func__, opmode); FAIL(HAL_EINVAL); break; } HALASSERT(AH_PRIVATE(ah)->ah_eeversion >= AR_EEPROM_VER14_1); /* Blank the channel survey statistics */ ath_hal_survey_clear(ah); /* XXX Turn on fast channel change for 5416 */ /* * Preserve the bmiss rssi threshold and count threshold * across resets */ rssiThrReg = OS_REG_READ(ah, AR_RSSI_THR); /* If reg is zero, first time thru set to default val */ if (rssiThrReg == 0) rssiThrReg = INIT_RSSI_THR; /* * Preserve the antenna on a channel change */ saveDefAntenna = OS_REG_READ(ah, AR_DEF_ANTENNA); /* * Don't do this for the AR9285 - it breaks RX for single * antenna designs when diversity is disabled. * * I'm not sure what this was working around; it may be * something to do with the AR5416. Certainly this register * isn't supposed to be used by the MIMO chips for anything * except for defining the default antenna when an external * phase array / smart antenna is connected. * * See PR: kern/179269 . */ if ((! AR_SREV_KITE(ah)) && saveDefAntenna == 0) /* XXX magic constants */ saveDefAntenna = 1; /* Save hardware flag before chip reset clears the register */ macStaId1 = OS_REG_READ(ah, AR_STA_ID1) & (AR_STA_ID1_BASE_RATE_11B | AR_STA_ID1_USE_DEFANT); /* Save led state from pci config register */ saveLedState = OS_REG_READ(ah, AR_MAC_LED) & (AR_MAC_LED_ASSOC | AR_MAC_LED_MODE | AR_MAC_LED_BLINK_THRESH_SEL | AR_MAC_LED_BLINK_SLOW); /* For chips on which the RTC reset is done, save TSF before it gets cleared */ if (AR_SREV_HOWL(ah) || (AR_SREV_MERLIN(ah) && ath_hal_eepromGetFlag(ah, AR_EEP_OL_PWRCTRL)) || (ah->ah_config.ah_force_full_reset)) tsf = ar5416GetTsf64(ah); /* Mark PHY as inactive; marked active in ar5416InitBB() */ ar5416MarkPhyInactive(ah); if (!ar5416ChipReset(ah, chan)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: chip reset failed\n", __func__); FAIL(HAL_EIO); } /* Restore TSF */ if (tsf) ar5416SetTsf64(ah, tsf); OS_MARK(ah, AH_MARK_RESET_LINE, __LINE__); if (AR_SREV_MERLIN_10_OR_LATER(ah)) OS_REG_SET_BIT(ah, AR_GPIO_INPUT_EN_VAL, AR_GPIO_JTAG_DISABLE); AH5416(ah)->ah_writeIni(ah, chan); if(AR_SREV_KIWI_13_OR_LATER(ah) ) { /* Enable ASYNC FIFO */ OS_REG_SET_BIT(ah, AR_MAC_PCU_ASYNC_FIFO_REG3, AR_MAC_PCU_ASYNC_FIFO_REG3_DATAPATH_SEL); OS_REG_SET_BIT(ah, AR_PHY_MODE, AR_PHY_MODE_ASYNCFIFO); OS_REG_CLR_BIT(ah, AR_MAC_PCU_ASYNC_FIFO_REG3, AR_MAC_PCU_ASYNC_FIFO_REG3_SOFT_RESET); OS_REG_SET_BIT(ah, AR_MAC_PCU_ASYNC_FIFO_REG3, AR_MAC_PCU_ASYNC_FIFO_REG3_SOFT_RESET); } /* Override ini values (that can be overriden in this fashion) */ ar5416OverrideIni(ah, chan); /* Setup 11n MAC/Phy mode registers */ ar5416Set11nRegs(ah, chan); OS_MARK(ah, AH_MARK_RESET_LINE, __LINE__); /* * Some AR91xx SoC devices frequently fail to accept TSF writes * right after the chip reset. 
When that happens, write a new * value after the initvals have been applied, with an offset * based on measured time difference */ if (AR_SREV_HOWL(ah) && (ar5416GetTsf64(ah) < tsf)) { tsf += 1500; ar5416SetTsf64(ah, tsf); } HALDEBUG(ah, HAL_DEBUG_RESET, ">>>2 %s: AR_PHY_DAG_CTRLCCK=0x%x\n", __func__, OS_REG_READ(ah,AR_PHY_DAG_CTRLCCK)); HALDEBUG(ah, HAL_DEBUG_RESET, ">>>2 %s: AR_PHY_ADC_CTL=0x%x\n", __func__, OS_REG_READ(ah,AR_PHY_ADC_CTL)); /* * This routine swaps the analog chains - it should be done * before any radio register twiddling is done. */ ar5416InitChainMasks(ah); /* Setup the open-loop power calibration if required */ if (ath_hal_eepromGetFlag(ah, AR_EEP_OL_PWRCTRL)) { AH5416(ah)->ah_olcInit(ah); AH5416(ah)->ah_olcTempCompensation(ah); } /* Setup the transmit power values. */ if (!ah->ah_setTxPower(ah, chan, rfXpdGain)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: error init'ing transmit power\n", __func__); FAIL(HAL_EIO); } /* Write the analog registers */ if (!ahp->ah_rfHal->setRfRegs(ah, chan, IEEE80211_IS_CHAN_2GHZ(chan) ? 2: 1, rfXpdGain)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: ar5212SetRfRegs failed\n", __func__); FAIL(HAL_EIO); } /* Write delta slope for OFDM enabled modes (A, G, Turbo) */ if (IEEE80211_IS_CHAN_OFDM(chan)|| IEEE80211_IS_CHAN_HT(chan)) ar5416SetDeltaSlope(ah, chan); AH5416(ah)->ah_spurMitigate(ah, chan); /* Setup board specific options for EEPROM version 3 */ if (!ah->ah_setBoardValues(ah, chan)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: error setting board options\n", __func__); FAIL(HAL_EIO); } OS_MARK(ah, AH_MARK_RESET_LINE, __LINE__); OS_REG_WRITE(ah, AR_STA_ID0, LE_READ_4(ahp->ah_macaddr)); OS_REG_WRITE(ah, AR_STA_ID1, LE_READ_2(ahp->ah_macaddr + 4) | macStaId1 | AR_STA_ID1_RTS_USE_DEF | ahp->ah_staId1Defaults ); ar5212SetOperatingMode(ah, opmode); /* Set Venice BSSID mask according to current state */ OS_REG_WRITE(ah, AR_BSSMSKL, LE_READ_4(ahp->ah_bssidmask)); OS_REG_WRITE(ah, AR_BSSMSKU, LE_READ_2(ahp->ah_bssidmask + 4)); /* Restore previous led state */ if (AR_SREV_HOWL(ah)) OS_REG_WRITE(ah, AR_MAC_LED, AR_MAC_LED_ASSOC_ACTIVE | AR_CFG_SCLK_32KHZ); else OS_REG_WRITE(ah, AR_MAC_LED, OS_REG_READ(ah, AR_MAC_LED) | saveLedState); /* Start TSF2 for generic timer 8-15 */ #ifdef NOTYET if (AR_SREV_KIWI(ah)) ar5416StartTsf2(ah); #endif /* * Enable Bluetooth Coexistence if it's enabled. */ if (AH5416(ah)->ah_btCoexConfigType != HAL_BT_COEX_CFG_NONE) ar5416InitBTCoex(ah); /* Restore previous antenna */ OS_REG_WRITE(ah, AR_DEF_ANTENNA, saveDefAntenna); /* then our BSSID and associate id */ OS_REG_WRITE(ah, AR_BSS_ID0, LE_READ_4(ahp->ah_bssid)); OS_REG_WRITE(ah, AR_BSS_ID1, LE_READ_2(ahp->ah_bssid + 4) | (ahp->ah_assocId & 0x3fff) << AR_BSS_ID1_AID_S); /* Restore bmiss rssi & count thresholds */ OS_REG_WRITE(ah, AR_RSSI_THR, ahp->ah_rssiThr); OS_REG_WRITE(ah, AR_ISR, ~0); /* cleared on write */ /* Restore bmiss rssi & count thresholds */ OS_REG_WRITE(ah, AR_RSSI_THR, rssiThrReg); if (!ar5212SetChannel(ah, chan)) FAIL(HAL_EIO); OS_MARK(ah, AH_MARK_RESET_LINE, __LINE__); /* Set 1:1 QCU to DCU mapping for all queues */ for (i = 0; i < AR_NUM_DCU; i++) OS_REG_WRITE(ah, AR_DQCUMASK(i), 1 << i); ahp->ah_intrTxqs = 0; for (i = 0; i < AH_PRIVATE(ah)->ah_caps.halTotalQueues; i++) ah->ah_resetTxQueue(ah, i); ar5416InitIMR(ah, opmode); ar5416SetCoverageClass(ah, AH_PRIVATE(ah)->ah_coverageClass, 1); ar5416InitQoS(ah); /* This may override the AR_DIAG_SW register */ ar5416InitUserSettings(ah); /* XXX this won't work for AR9287! 
*/ if (IEEE80211_IS_CHAN_HALF(chan) || IEEE80211_IS_CHAN_QUARTER(chan)) { ar5416SetIFSTiming(ah, chan); #if 0 /* * AR5413? * Force window_length for 1/2 and 1/4 rate channels, * the ini file sets this to zero otherwise. */ OS_REG_RMW_FIELD(ah, AR_PHY_FRAME_CTL, AR_PHY_FRAME_CTL_WINLEN, 3); } #endif } if (AR_SREV_KIWI_13_OR_LATER(ah)) { /* * Enable ASYNC FIFO * * If Async FIFO is enabled, the following counters change * as MAC now runs at 117 Mhz instead of 88/44MHz when * async FIFO is disabled. * * Overwrite the delay/timeouts initialized in ProcessIni() * above. */ OS_REG_WRITE(ah, AR_D_GBL_IFS_SIFS, AR_D_GBL_IFS_SIFS_ASYNC_FIFO_DUR); OS_REG_WRITE(ah, AR_D_GBL_IFS_SLOT, AR_D_GBL_IFS_SLOT_ASYNC_FIFO_DUR); OS_REG_WRITE(ah, AR_D_GBL_IFS_EIFS, AR_D_GBL_IFS_EIFS_ASYNC_FIFO_DUR); OS_REG_WRITE(ah, AR_TIME_OUT, AR_TIME_OUT_ACK_CTS_ASYNC_FIFO_DUR); OS_REG_WRITE(ah, AR_USEC, AR_USEC_ASYNC_FIFO_DUR); OS_REG_SET_BIT(ah, AR_MAC_PCU_LOGIC_ANALYZER, AR_MAC_PCU_LOGIC_ANALYZER_DISBUG20768); OS_REG_RMW_FIELD(ah, AR_AHB_MODE, AR_AHB_CUSTOM_BURST_EN, AR_AHB_CUSTOM_BURST_ASYNC_FIFO_VAL); } if (AR_SREV_KIWI_13_OR_LATER(ah)) { /* Enable AGGWEP to accelerate encryption engine */ OS_REG_SET_BIT(ah, AR_PCU_MISC_MODE2, AR_PCU_MISC_MODE2_ENABLE_AGGWEP); } /* * disable seq number generation in hw */ OS_REG_WRITE(ah, AR_STA_ID1, OS_REG_READ(ah, AR_STA_ID1) | AR_STA_ID1_PRESERVE_SEQNUM); ar5416InitDMA(ah); /* * program OBS bus to see MAC interrupts */ OS_REG_WRITE(ah, AR_OBS, 8); /* * Disable the "general" TX/RX mitigation timers. */ OS_REG_WRITE(ah, AR_MIRT, 0); #ifdef AH_AR5416_INTERRUPT_MITIGATION /* * This initialises the RX interrupt mitigation timers. * * The mitigation timers begin at idle and are triggered * upon the RXOK of a single frame (or sub-frame, for A-MPDU.) * Then, the RX mitigation interrupt will fire: * * + 250uS after the last RX'ed frame, or * + 700uS after the first RX'ed frame * * Thus, the LAST field dictates the extra latency * induced by the RX mitigation method and the FIRST * field dictates how long to delay before firing an * RX mitigation interrupt. * * Please note this only seems to be for RXOK frames; * not CRC or PHY error frames. * */ OS_REG_RMW_FIELD(ah, AR_RIMT, AR_RIMT_LAST, 250); OS_REG_RMW_FIELD(ah, AR_RIMT, AR_RIMT_FIRST, 700); #endif ar5416InitBB(ah, chan); /* Setup compression registers */ ar5212SetCompRegs(ah); /* XXX not needed? */ /* * 5416 baseband will check the per rate power table * and select the lower of the two */ ackTpcPow = 63; ctsTpcPow = 63; chirpTpcPow = 63; powerVal = SM(ackTpcPow, AR_TPC_ACK) | SM(ctsTpcPow, AR_TPC_CTS) | SM(chirpTpcPow, AR_TPC_CHIRP); OS_REG_WRITE(ah, AR_TPC, powerVal); if (!ar5416InitCal(ah, chan)) FAIL(HAL_ESELFTEST); ar5416RestoreChainMask(ah); AH_PRIVATE(ah)->ah_opmode = opmode; /* record operating mode */ if (bChannelChange && !IEEE80211_IS_CHAN_DFS(chan)) chan->ic_state &= ~IEEE80211_CHANSTATE_CWINT; if (AR_SREV_HOWL(ah)) { /* * Enable the MBSSID block-ack fix for HOWL. * This feature is only supported on Howl 1.4, but it is safe to * set bit 22 of STA_ID1 on other Howl revisions (1.1, 1.2, 1.3), * since bit 22 is unused in those Howl revisions. 
*/ unsigned int reg; reg = (OS_REG_READ(ah, AR_STA_ID1) | (1<<22)); OS_REG_WRITE(ah,AR_STA_ID1, reg); ath_hal_printf(ah, "MBSSID Set bit 22 of AR_STA_ID 0x%x\n", reg); } HALDEBUG(ah, HAL_DEBUG_RESET, "%s: done\n", __func__); OS_MARK(ah, AH_MARK_RESET_DONE, 0); return AH_TRUE; bad: OS_MARK(ah, AH_MARK_RESET_DONE, ecode); if (status != AH_NULL) *status = ecode; return AH_FALSE; #undef FAIL #undef N } #if 0 /* * This channel change evaluates whether the selected hardware can * perform a synthesizer-only channel change (no reset). If the * TX is not stopped, or the RFBus cannot be granted in the given * time, the function returns false as a reset is necessary */ HAL_BOOL ar5416ChannelChange(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t ulCount; uint32_t data, synthDelay, qnum; uint16_t rfXpdGain[4]; struct ath_hal_5212 *ahp = AH5212(ah); HAL_CHANNEL_INTERNAL *ichan; /* * Map public channel to private. */ ichan = ath_hal_checkchannel(ah, chan); /* TX must be stopped or RF Bus grant will not work */ for (qnum = 0; qnum < AH_PRIVATE(ah)->ah_caps.halTotalQueues; qnum++) { if (ar5212NumTxPending(ah, qnum)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: frames pending on queue %d\n", __func__, qnum); return AH_FALSE; } } /* * Kill last Baseband Rx Frame - Request analog bus grant */ OS_REG_WRITE(ah, AR_PHY_RFBUS_REQ, AR_PHY_RFBUS_REQ_REQUEST); if (!ath_hal_wait(ah, AR_PHY_RFBUS_GNT, AR_PHY_RFBUS_GRANT_EN, AR_PHY_RFBUS_GRANT_EN)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: could not kill baseband rx\n", __func__); return AH_FALSE; } ar5416Set11nRegs(ah, chan); /* NB: setup 5416-specific regs */ /* Change the synth */ if (!ar5212SetChannel(ah, chan)) return AH_FALSE; /* Setup the transmit power values. */ if (!ah->ah_setTxPower(ah, chan, rfXpdGain)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: error init'ing transmit power\n", __func__); return AH_FALSE; } /* * Wait for the frequency synth to settle (synth goes on * via PHY_ACTIVE_EN). Read the phy active delay register. * Value is in 100ns increments. */ data = OS_REG_READ(ah, AR_PHY_RX_DELAY) & AR_PHY_RX_DELAY_DELAY; if (IS_CHAN_CCK(ichan)) { synthDelay = (4 * data) / 22; } else { synthDelay = data / 10; } OS_DELAY(synthDelay + BASE_ACTIVATE_DELAY); /* Release the RFBus Grant */ OS_REG_WRITE(ah, AR_PHY_RFBUS_REQ, 0); /* Write delta slope for OFDM enabled modes (A, G, Turbo) */ if (IEEE80211_IS_CHAN_OFDM(ichan)|| IEEE80211_IS_CHAN_HT(chan)) { HALASSERT(AH_PRIVATE(ah)->ah_eeversion >= AR_EEPROM_VER5_3); ar5212SetSpurMitigation(ah, chan); ar5416SetDeltaSlope(ah, chan); } /* XXX spur mitigation for Merlin */ if (!IEEE80211_IS_CHAN_DFS(chan)) chan->ic_state &= ~IEEE80211_CHANSTATE_CWINT; ichan->channel_time = 0; ichan->tsf_last = ar5416GetTsf64(ah); ar5212TxEnable(ah, AH_TRUE); return AH_TRUE; } #endif static void ar5416InitDMA(struct ath_hal *ah) { struct ath_hal_5212 *ahp = AH5212(ah); /* * set AHB_MODE not to do cacheline prefetches */ OS_REG_SET_BIT(ah, AR_AHB_MODE, AR_AHB_PREFETCH_RD_EN); /* * let mac dma reads be in 128 byte chunks */ OS_REG_WRITE(ah, AR_TXCFG, (OS_REG_READ(ah, AR_TXCFG) & ~AR_TXCFG_DMASZ_MASK) | AR_TXCFG_DMASZ_128B); /* * let mac dma writes be in 128 byte chunks */ /* * XXX If you change this, you must change the headroom * assigned in ah_maxTxTrigLev - see ar5416InitState().
*/ OS_REG_WRITE(ah, AR_RXCFG, (OS_REG_READ(ah, AR_RXCFG) & ~AR_RXCFG_DMASZ_MASK) | AR_RXCFG_DMASZ_128B); /* restore TX trigger level */ OS_REG_WRITE(ah, AR_TXCFG, (OS_REG_READ(ah, AR_TXCFG) &~ AR_FTRIG) | SM(ahp->ah_txTrigLev, AR_FTRIG)); /* * Setup receive FIFO threshold to hold off TX activities */ OS_REG_WRITE(ah, AR_RXFIFO_CFG, 0x200); /* * reduce the number of usable entries in PCU TXBUF to avoid * wrap around. */ if (AR_SREV_KITE(ah)) /* * For AR9285 the number of Fifos are reduced to half. * So set the usable tx buf size also to half to * avoid data/delimiter underruns */ OS_REG_WRITE(ah, AR_PCU_TXBUF_CTRL, AR_9285_PCU_TXBUF_CTRL_USABLE_SIZE); else OS_REG_WRITE(ah, AR_PCU_TXBUF_CTRL, AR_PCU_TXBUF_CTRL_USABLE_SIZE); } static void ar5416InitBB(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t synthDelay; /* * Wait for the frequency synth to settle (synth goes on * via AR_PHY_ACTIVE_EN). Read the phy active delay register. * Value is in 100ns increments. */ synthDelay = OS_REG_READ(ah, AR_PHY_RX_DELAY) & AR_PHY_RX_DELAY_DELAY; if (IEEE80211_IS_CHAN_CCK(chan)) { synthDelay = (4 * synthDelay) / 22; } else { synthDelay /= 10; } /* Turn on PLL on 5416 */ HALDEBUG(ah, HAL_DEBUG_RESET, "%s %s channel\n", __func__, IEEE80211_IS_CHAN_5GHZ(chan) ? "5GHz" : "2GHz"); /* Activate the PHY (includes baseband activate and synthesizer on) */ OS_REG_WRITE(ah, AR_PHY_ACTIVE, AR_PHY_ACTIVE_EN); /* * If the AP starts the calibration before the base band timeout * completes we could get rx_clear false triggering. Add an * extra BASE_ACTIVATE_DELAY usecs to ensure this condition * does not happen. */ if (IEEE80211_IS_CHAN_HALF(chan)) { OS_DELAY((synthDelay << 1) + BASE_ACTIVATE_DELAY); } else if (IEEE80211_IS_CHAN_QUARTER(chan)) { OS_DELAY((synthDelay << 2) + BASE_ACTIVATE_DELAY); } else { OS_DELAY(synthDelay + BASE_ACTIVATE_DELAY); } } static void ar5416InitIMR(struct ath_hal *ah, HAL_OPMODE opmode) { struct ath_hal_5212 *ahp = AH5212(ah); /* * Setup interrupt handling. Note that ar5212ResetTxQueue * manipulates the secondary IMR's as queues are enabled * and disabled. This is done with RMW ops to insure the * settings we make here are preserved. */ ahp->ah_maskReg = AR_IMR_TXERR | AR_IMR_TXURN | AR_IMR_RXERR | AR_IMR_RXORN | AR_IMR_BCNMISC; #ifdef AH_AR5416_INTERRUPT_MITIGATION ahp->ah_maskReg |= AR_IMR_RXINTM | AR_IMR_RXMINTR; #else ahp->ah_maskReg |= AR_IMR_RXOK; #endif ahp->ah_maskReg |= AR_IMR_TXOK; if (opmode == HAL_M_HOSTAP) ahp->ah_maskReg |= AR_IMR_MIB; OS_REG_WRITE(ah, AR_IMR, ahp->ah_maskReg); #ifdef ADRIAN_NOTYET /* This is straight from ath9k */ if (! 
AR_SREV_HOWL(ah)) { OS_REG_WRITE(ah, AR_INTR_SYNC_CAUSE, 0xFFFFFFFF); OS_REG_WRITE(ah, AR_INTR_SYNC_ENABLE, AR_INTR_SYNC_DEFAULT); OS_REG_WRITE(ah, AR_INTR_SYNC_MASK, 0); } #endif /* Enable bus errors that are OR'd to set the HIUERR bit */ #if 0 OS_REG_WRITE(ah, AR_IMR_S2, OS_REG_READ(ah, AR_IMR_S2) | AR_IMR_S2_GTT | AR_IMR_S2_CST); #endif } static void ar5416InitQoS(struct ath_hal *ah) { /* QoS support */ OS_REG_WRITE(ah, AR_QOS_CONTROL, 0x100aa); /* XXX magic */ OS_REG_WRITE(ah, AR_QOS_SELECT, 0x3210); /* XXX magic */ /* Turn on NOACK Support for QoS packets */ OS_REG_WRITE(ah, AR_NOACK, SM(2, AR_NOACK_2BIT_VALUE) | SM(5, AR_NOACK_BIT_OFFSET) | SM(0, AR_NOACK_BYTE_OFFSET)); /* * initialize TXOP for all TIDs */ OS_REG_WRITE(ah, AR_TXOP_X, AR_TXOP_X_VAL); OS_REG_WRITE(ah, AR_TXOP_0_3, 0xFFFFFFFF); OS_REG_WRITE(ah, AR_TXOP_4_7, 0xFFFFFFFF); OS_REG_WRITE(ah, AR_TXOP_8_11, 0xFFFFFFFF); OS_REG_WRITE(ah, AR_TXOP_12_15, 0xFFFFFFFF); } static void ar5416InitUserSettings(struct ath_hal *ah) { struct ath_hal_5212 *ahp = AH5212(ah); /* Restore user-specified settings */ if (ahp->ah_miscMode != 0) OS_REG_WRITE(ah, AR_MISC_MODE, OS_REG_READ(ah, AR_MISC_MODE) | ahp->ah_miscMode); if (ahp->ah_sifstime != (u_int) -1) ar5212SetSifsTime(ah, ahp->ah_sifstime); if (ahp->ah_slottime != (u_int) -1) ar5212SetSlotTime(ah, ahp->ah_slottime); if (ahp->ah_acktimeout != (u_int) -1) ar5212SetAckTimeout(ah, ahp->ah_acktimeout); if (ahp->ah_ctstimeout != (u_int) -1) ar5212SetCTSTimeout(ah, ahp->ah_ctstimeout); if (AH_PRIVATE(ah)->ah_diagreg != 0) OS_REG_WRITE(ah, AR_DIAG_SW, AH_PRIVATE(ah)->ah_diagreg); if (AH5416(ah)->ah_globaltxtimeout != (u_int) -1) ar5416SetGlobalTxTimeout(ah, AH5416(ah)->ah_globaltxtimeout); } static void ar5416SetRfMode(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t rfMode; if (chan == AH_NULL) return; /* treat channel B as channel G , no B mode suport in owl */ rfMode = IEEE80211_IS_CHAN_CCK(chan) ? AR_PHY_MODE_DYNAMIC : AR_PHY_MODE_OFDM; if (AR_SREV_MERLIN_20(ah) && IS_5GHZ_FAST_CLOCK_EN(ah, chan)) { /* phy mode bits for 5GHz channels require Fast Clock */ rfMode |= AR_PHY_MODE_DYNAMIC | AR_PHY_MODE_DYN_CCK_DISABLE; } else if (!AR_SREV_MERLIN_10_OR_LATER(ah)) { rfMode |= IEEE80211_IS_CHAN_5GHZ(chan) ? AR_PHY_MODE_RF5GHZ : AR_PHY_MODE_RF2GHZ; } OS_REG_WRITE(ah, AR_PHY_MODE, rfMode); } /* * Places the hardware into reset and then pulls it out of reset */ HAL_BOOL ar5416ChipReset(struct ath_hal *ah, const struct ieee80211_channel *chan) { OS_MARK(ah, AH_MARK_CHIPRESET, chan ? chan->ic_freq : 0); /* * Warm reset is optimistic for open-loop TX power control. */ if (AR_SREV_MERLIN(ah) && ath_hal_eepromGetFlag(ah, AR_EEP_OL_PWRCTRL)) { if (!ar5416SetResetReg(ah, HAL_RESET_POWER_ON)) return AH_FALSE; } else if (ah->ah_config.ah_force_full_reset) { if (!ar5416SetResetReg(ah, HAL_RESET_POWER_ON)) return AH_FALSE; } else { if (!ar5416SetResetReg(ah, HAL_RESET_WARM)) return AH_FALSE; } /* Bring out of sleep mode (AGAIN) */ if (!ar5416SetPowerMode(ah, HAL_PM_AWAKE, AH_TRUE)) return AH_FALSE; #ifdef notyet ahp->ah_chipFullSleep = AH_FALSE; #endif AH5416(ah)->ah_initPLL(ah, chan); /* * Perform warm reset before the mode/PLL/turbo registers * are changed in order to deactivate the radio. Mode changes * with an active radio can result in corrupted shifts to the * radio device. */ ar5416SetRfMode(ah, chan); return AH_TRUE; } /* * Delta slope coefficient computation. * Required for OFDM operation. 
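 *
 * A worked example of the computation below (illustrative arithmetic,
 * not taken from the source): on a 20MHz channel centered at 5180MHz,
 * coef_scaled = 0x64000000 / 5180 = 323884.  The highest set bit of
 * that value is bit 18, so coef_exp = 14 - (18 - 24) = 20; then
 * coef_man = 323884 + (1 << 3) = 323892, and the hardware is given a
 * mantissa of 323892 >> 4 = 20243 and an exponent of 20 - 16 = 4.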
*/ static void ar5416GetDeltaSlopeValues(struct ath_hal *ah, uint32_t coef_scaled, uint32_t *coef_mantissa, uint32_t *coef_exponent) { #define COEF_SCALE_S 24 uint32_t coef_exp, coef_man; /* * ALGO -> coef_exp = 14-floor(log2(coef)); * floor(log2(x)) is the highest set bit position */ for (coef_exp = 31; coef_exp > 0; coef_exp--) if ((coef_scaled >> coef_exp) & 0x1) break; /* A coef_exp of 0 is a legal bit position but an unexpected coef_exp */ HALASSERT(coef_exp); coef_exp = 14 - (coef_exp - COEF_SCALE_S); /* * ALGO -> coef_man = floor(coef* 2^coef_exp+0.5); * The coefficient is already shifted up for scaling */ coef_man = coef_scaled + (1 << (COEF_SCALE_S - coef_exp - 1)); *coef_mantissa = coef_man >> (COEF_SCALE_S - coef_exp); *coef_exponent = coef_exp - 16; #undef COEF_SCALE_S } void ar5416SetDeltaSlope(struct ath_hal *ah, const struct ieee80211_channel *chan) { #define INIT_CLOCKMHZSCALED 0x64000000 uint32_t coef_scaled, ds_coef_exp, ds_coef_man; uint32_t clockMhzScaled; CHAN_CENTERS centers; /* half and quarter rate can divide the scaled clock by 2 or 4 respectively */ /* scale for selected channel bandwidth */ clockMhzScaled = INIT_CLOCKMHZSCALED; if (IEEE80211_IS_CHAN_TURBO(chan)) clockMhzScaled <<= 1; else if (IEEE80211_IS_CHAN_HALF(chan)) clockMhzScaled >>= 1; else if (IEEE80211_IS_CHAN_QUARTER(chan)) clockMhzScaled >>= 2; /* * ALGO -> coef = 1e8/fcarrier*fclock/40; * scaled coef to provide precision for this floating calculation */ ar5416GetChannelCenters(ah, chan, &centers); coef_scaled = clockMhzScaled / centers.synth_center; ar5416GetDeltaSlopeValues(ah, coef_scaled, &ds_coef_man, &ds_coef_exp); OS_REG_RMW_FIELD(ah, AR_PHY_TIMING3, AR_PHY_TIMING3_DSC_MAN, ds_coef_man); OS_REG_RMW_FIELD(ah, AR_PHY_TIMING3, AR_PHY_TIMING3_DSC_EXP, ds_coef_exp); /* * For short GI, * the scaled coefficient is 9/10 that of the normal coefficient */ coef_scaled = (9 * coef_scaled)/10; ar5416GetDeltaSlopeValues(ah, coef_scaled, &ds_coef_man, &ds_coef_exp); /* for short gi */ OS_REG_RMW_FIELD(ah, AR_PHY_HALFGI, AR_PHY_HALFGI_DSC_MAN, ds_coef_man); OS_REG_RMW_FIELD(ah, AR_PHY_HALFGI, AR_PHY_HALFGI_DSC_EXP, ds_coef_exp); #undef INIT_CLOCKMHZSCALED } /* * Set a limit on the overall output power. Used for dynamic * transmit power control and the like. * * NB: limit is in units of 0.5 dBm. */ HAL_BOOL ar5416SetTxPowerLimit(struct ath_hal *ah, uint32_t limit) { uint16_t dummyXpdGains[2]; AH_PRIVATE(ah)->ah_powerLimit = AH_MIN(limit, MAX_RATE_POWER); return ah->ah_setTxPower(ah, AH_PRIVATE(ah)->ah_curchan, dummyXpdGains); } HAL_BOOL ar5416GetChipPowerLimits(struct ath_hal *ah, struct ieee80211_channel *chan) { struct ath_hal_5212 *ahp = AH5212(ah); int16_t minPower, maxPower; /* * Get Pier table max and min powers. */ if (ahp->ah_rfHal->getChannelMaxMinPower(ah, chan, &maxPower, &minPower)) { /* NB: rf code returns 1/4 dBm units, convert */ chan->ic_maxpower = maxPower / 2; chan->ic_minpower = minPower / 2; } else { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: no min/max power for %u/0x%x\n", __func__, chan->ic_freq, chan->ic_flags); chan->ic_maxpower = AR5416_MAX_RATE_POWER; chan->ic_minpower = 0; } HALDEBUG(ah, HAL_DEBUG_RESET, "Chan %d: MaxPow = %d MinPow = %d\n", chan->ic_freq, chan->ic_maxpower, chan->ic_minpower); return AH_TRUE; } /************************************************************** * ar5416WriteTxPowerRateRegisters * * Write the TX power rate registers from the raw values given * in ratesArray[]. * * The CCK and HT40 rate registers are only written if needed. * HT20 and 11g/11a OFDM rate registers are always written.
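 *
 * Note (derived from POW_SM below): each 32-bit rate register packs
 * four 6-bit power values in 0.5 dBm steps, so e.g. a field value of
 * 0x20 encodes 16 dBm.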
* * The values written are raw values which should be written * to the registers - so it's up to the caller to pre-adjust * them (e.g. CCK power offset value, or Merlin TX power offset, * etc.) */ void ar5416WriteTxPowerRateRegisters(struct ath_hal *ah, const struct ieee80211_channel *chan, const int16_t ratesArray[]) { #define POW_SM(_r, _s) (((_r) & 0x3f) << (_s)) /* Write the OFDM power per rate set */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE1, POW_SM(ratesArray[rate18mb], 24) | POW_SM(ratesArray[rate12mb], 16) | POW_SM(ratesArray[rate9mb], 8) | POW_SM(ratesArray[rate6mb], 0) ); OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE2, POW_SM(ratesArray[rate54mb], 24) | POW_SM(ratesArray[rate48mb], 16) | POW_SM(ratesArray[rate36mb], 8) | POW_SM(ratesArray[rate24mb], 0) ); if (IEEE80211_IS_CHAN_2GHZ(chan)) { /* Write the CCK power per rate set */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE3, POW_SM(ratesArray[rate2s], 24) | POW_SM(ratesArray[rate2l], 16) | POW_SM(ratesArray[rateXr], 8) /* XR target power */ | POW_SM(ratesArray[rate1l], 0) ); OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE4, POW_SM(ratesArray[rate11s], 24) | POW_SM(ratesArray[rate11l], 16) | POW_SM(ratesArray[rate5_5s], 8) | POW_SM(ratesArray[rate5_5l], 0) ); HALDEBUG(ah, HAL_DEBUG_RESET, "%s AR_PHY_POWER_TX_RATE3=0x%x AR_PHY_POWER_TX_RATE4=0x%x\n", __func__, OS_REG_READ(ah,AR_PHY_POWER_TX_RATE3), OS_REG_READ(ah,AR_PHY_POWER_TX_RATE4)); } /* Write the HT20 power per rate set */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE5, POW_SM(ratesArray[rateHt20_3], 24) | POW_SM(ratesArray[rateHt20_2], 16) | POW_SM(ratesArray[rateHt20_1], 8) | POW_SM(ratesArray[rateHt20_0], 0) ); OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE6, POW_SM(ratesArray[rateHt20_7], 24) | POW_SM(ratesArray[rateHt20_6], 16) | POW_SM(ratesArray[rateHt20_5], 8) | POW_SM(ratesArray[rateHt20_4], 0) ); if (IEEE80211_IS_CHAN_HT40(chan)) { /* Write the HT40 power per rate set */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE7, POW_SM(ratesArray[rateHt40_3], 24) | POW_SM(ratesArray[rateHt40_2], 16) | POW_SM(ratesArray[rateHt40_1], 8) | POW_SM(ratesArray[rateHt40_0], 0) ); OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE8, POW_SM(ratesArray[rateHt40_7], 24) | POW_SM(ratesArray[rateHt40_6], 16) | POW_SM(ratesArray[rateHt40_5], 8) | POW_SM(ratesArray[rateHt40_4], 0) ); /* Write the Dup/Ext 40 power per rate set */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE9, POW_SM(ratesArray[rateExtOfdm], 24) | POW_SM(ratesArray[rateExtCck], 16) | POW_SM(ratesArray[rateDupOfdm], 8) | POW_SM(ratesArray[rateDupCck], 0) ); } /* * Set max power to 30 dBm and, optionally, * enable TPC in tx descriptors. */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_RATE_MAX, MAX_RATE_POWER | (AH5212(ah)->ah_tpcEnabled ? AR_PHY_POWER_TX_RATE_MAX_TPC_ENABLE : 0)); #undef POW_SM } /************************************************************** * ar5416SetTransmitPower * * Set the transmit power in the baseband for the given * operating channel and mode. */ HAL_BOOL ar5416SetTransmitPower(struct ath_hal *ah, const struct ieee80211_channel *chan, uint16_t *rfXpdGain) { #define N(a) (sizeof (a) / sizeof (a[0])) #define POW_SM(_r, _s) (((_r) & 0x3f) << (_s)) MODAL_EEP_HEADER *pModal; struct ath_hal_5212 *ahp = AH5212(ah); int16_t txPowerIndexOffset = 0; int i; uint16_t cfgCtl; uint16_t powerLimit; uint16_t twiceAntennaReduction; uint16_t twiceMaxRegulatoryPower; int16_t maxPower; HAL_EEPROM_v14 *ee = AH_PRIVATE(ah)->ah_eeprom; struct ar5416eeprom *pEepData = &ee->ee_base; HALASSERT(AH_PRIVATE(ah)->ah_eeversion >= AR_EEPROM_VER14_1); /* * Default to 2; this is overridden based on the EEPROM version / value.
*/ AH5416(ah)->ah_ht40PowerIncForPdadc = 2; /* Set up info for the actual EEPROM */ OS_MEMZERO(AH5416(ah)->ah_ratesArray, sizeof(AH5416(ah)->ah_ratesArray)); cfgCtl = ath_hal_getctl(ah, chan); powerLimit = chan->ic_maxregpower * 2; twiceAntennaReduction = chan->ic_maxantgain; twiceMaxRegulatoryPower = AH_MIN(MAX_RATE_POWER, AH_PRIVATE(ah)->ah_powerLimit); pModal = &pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)]; HALDEBUG(ah, HAL_DEBUG_RESET, "%s Channel=%u CfgCtl=%u\n", __func__,chan->ic_freq, cfgCtl ); if (IS_EEP_MINOR_V2(ah)) { AH5416(ah)->ah_ht40PowerIncForPdadc = pModal->ht40PowerIncForPdadc; } if (!ar5416SetPowerPerRateTable(ah, pEepData, chan, &AH5416(ah)->ah_ratesArray[0], cfgCtl, twiceAntennaReduction, twiceMaxRegulatoryPower, powerLimit)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: unable to set tx power per rate table\n", __func__); return AH_FALSE; } if (!AH5416(ah)->ah_setPowerCalTable(ah, pEepData, chan, &txPowerIndexOffset)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: unable to set power table\n", __func__); return AH_FALSE; } maxPower = AH_MAX(AH5416(ah)->ah_ratesArray[rate6mb], AH5416(ah)->ah_ratesArray[rateHt20_0]); if (IEEE80211_IS_CHAN_2GHZ(chan)) { maxPower = AH_MAX(maxPower, AH5416(ah)->ah_ratesArray[rate1l]); } if (IEEE80211_IS_CHAN_HT40(chan)) { maxPower = AH_MAX(maxPower, AH5416(ah)->ah_ratesArray[rateHt40_0]); } ahp->ah_tx6PowerInHalfDbm = maxPower; AH_PRIVATE(ah)->ah_maxPowerLevel = maxPower; ahp->ah_txPowerIndexOffset = txPowerIndexOffset; /* * txPowerIndexOffset is set by the SetPowerTable() call - * adjust the rate table (0 offset if rates EEPROM not loaded) */ for (i = 0; i < N(AH5416(ah)->ah_ratesArray); i++) { AH5416(ah)->ah_ratesArray[i] = (int16_t)(txPowerIndexOffset + AH5416(ah)->ah_ratesArray[i]); if (AH5416(ah)->ah_ratesArray[i] > AR5416_MAX_RATE_POWER) AH5416(ah)->ah_ratesArray[i] = AR5416_MAX_RATE_POWER; } #ifdef AH_EEPROM_DUMP /* * Dump the rate array whilst it represents the intended dBm*2 * values versus what's being adjusted before being programmed * in. Keep this in mind if you code up this function and enable * this debugging; the values won't necessarily be what's being * programmed into the hardware. */ ar5416PrintPowerPerRate(ah, AH5416(ah)->ah_ratesArray); #endif /* * Merlin and later have a power offset, so subtract * pwr_table_offset * 2 from each value. The default * power offset is -5 dBm - i.e., a register value of 0 * equates to a TX power of -5 dBm. */ if (AR_SREV_MERLIN_20_OR_LATER(ah)) { int8_t pwr_table_offset; (void) ath_hal_eepromGet(ah, AR_EEP_PWR_TABLE_OFFSET, &pwr_table_offset); /* Underflow power gets clamped at raw value 0 */ /* Overflow power gets clamped at AR5416_MAX_RATE_POWER */ for (i = 0; i < N(AH5416(ah)->ah_ratesArray); i++) { /* * + pwr_table_offset is in dBm * + ratesArray is in 1/2 dBm */ AH5416(ah)->ah_ratesArray[i] -= (pwr_table_offset * 2); if (AH5416(ah)->ah_ratesArray[i] < 0) AH5416(ah)->ah_ratesArray[i] = 0; else if (AH5416(ah)->ah_ratesArray[i] > AR5416_MAX_RATE_POWER) AH5416(ah)->ah_ratesArray[i] = AR5416_MAX_RATE_POWER; } } /* * Adjust rates for OLC where needed * * The following CCK rates need adjusting when doing 2.4GHz * CCK transmission. * * + rate2s, rate2l, rate1l, rate11s, rate11l, rate5_5s, rate5_5l * + rateExtCck, rateDupCck * * They're adjusted here regardless. The hardware then gets * programmed as needed. 5GHz operation doesn't program in CCK * rates for legacy mode but they seem to be initialised for * HT40 regardless of channel type.
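 *
 * Note (derived from the code below): the adjustment is in 0.5 dBm
 * steps, so cck_ofdm_delta = 2 lowers each listed CCK rate by 1 dB,
 * clamped at 0.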
*/ if (AR_SREV_MERLIN_20_OR_LATER(ah) && ath_hal_eepromGetFlag(ah, AR_EEP_OL_PWRCTRL)) { int adj[] = { rate2s, rate2l, rate1l, rate11s, rate11l, rate5_5s, rate5_5l, rateExtCck, rateDupCck }; int cck_ofdm_delta = 2; int i; for (i = 0; i < N(adj); i++) { AH5416(ah)->ah_ratesArray[adj[i]] -= cck_ofdm_delta; if (AH5416(ah)->ah_ratesArray[adj[i]] < 0) AH5416(ah)->ah_ratesArray[adj[i]] = 0; } } /* * Adjust the HT40 power to meet the correct target TX power * for 40MHz mode, based on TX power curves that are established * for 20MHz mode. * * XXX handle overflow/too high power level? */ if (IEEE80211_IS_CHAN_HT40(chan)) { AH5416(ah)->ah_ratesArray[rateHt40_0] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_1] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_2] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_3] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_4] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_5] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_6] += AH5416(ah)->ah_ht40PowerIncForPdadc; AH5416(ah)->ah_ratesArray[rateHt40_7] += AH5416(ah)->ah_ht40PowerIncForPdadc; } /* Write the TX power rate registers */ ar5416WriteTxPowerRateRegisters(ah, chan, AH5416(ah)->ah_ratesArray); /* Write the Power subtraction for dynamic chain changing, for per-packet powertx */ OS_REG_WRITE(ah, AR_PHY_POWER_TX_SUB, POW_SM(pModal->pwrDecreaseFor3Chain, 6) | POW_SM(pModal->pwrDecreaseFor2Chain, 0) ); return AH_TRUE; #undef POW_SM #undef N } /* * Exported call to check for a recent gain reading and return * the current state of the thermal calibration gain engine. */ HAL_RFGAIN ar5416GetRfgain(struct ath_hal *ah) { return (HAL_RFGAIN_INACTIVE); } /* * Places all of the hardware into reset */ HAL_BOOL ar5416Disable(struct ath_hal *ah) { if (!ar5416SetPowerMode(ah, HAL_PM_AWAKE, AH_TRUE)) return AH_FALSE; if (! ar5416SetResetReg(ah, HAL_RESET_COLD)) return AH_FALSE; AH5416(ah)->ah_initPLL(ah, AH_NULL); return (AH_TRUE); } /* * Places the PHY and Radio chips into reset. A full reset * must be called to leave this state. The PCI/MAC/PCU are * not placed into reset as we must receive an interrupt to * re-enable the hardware. */ HAL_BOOL ar5416PhyDisable(struct ath_hal *ah) { if (! ar5416SetResetReg(ah, HAL_RESET_WARM)) return AH_FALSE; AH5416(ah)->ah_initPLL(ah, AH_NULL); return (AH_TRUE); } /* * Write the given reset bit mask into the reset register */ HAL_BOOL ar5416SetResetReg(struct ath_hal *ah, uint32_t type) { /* * Set force wake */ OS_REG_WRITE(ah, AR_RTC_FORCE_WAKE, AR_RTC_FORCE_WAKE_EN | AR_RTC_FORCE_WAKE_ON_INT); switch (type) { case HAL_RESET_POWER_ON: return ar5416SetResetPowerOn(ah); case HAL_RESET_WARM: case HAL_RESET_COLD: return ar5416SetReset(ah, type); default: HALASSERT(AH_FALSE); return AH_FALSE; } } static HAL_BOOL ar5416SetResetPowerOn(struct ath_hal *ah) { /* Power On Reset (Hard Reset) */ /* * Set force wake * * If the MAC was running, previously calling * reset will wake up the MAC but it may go back to sleep * before we can start polling. * Setting force wake stops that. * This must be called before initiating a hard reset. */ OS_REG_WRITE(ah, AR_RTC_FORCE_WAKE, AR_RTC_FORCE_WAKE_EN | AR_RTC_FORCE_WAKE_ON_INT); /* * Power-on reset can be used in open-loop power control or failure recovery. * If we do RTC reset while DMA is still running, the hardware may corrupt memory. * Therefore, we need to reset AHB first to stop DMA. */ if (!
AR_SREV_HOWL(ah)) OS_REG_WRITE(ah, AR_RC, AR_RC_AHB); /* * RTC reset and clear */ OS_REG_WRITE(ah, AR_RTC_RESET, 0); OS_DELAY(20); if (! AR_SREV_HOWL(ah)) OS_REG_WRITE(ah, AR_RC, 0); OS_REG_WRITE(ah, AR_RTC_RESET, 1); /* * Poll till RTC is ON */ if (!ath_hal_wait(ah, AR_RTC_STATUS, AR_RTC_PM_STATUS_M, AR_RTC_STATUS_ON)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: RTC not waking up\n", __func__); return AH_FALSE; } return ar5416SetReset(ah, HAL_RESET_COLD); } static HAL_BOOL ar5416SetReset(struct ath_hal *ah, int type) { uint32_t tmpReg, mask; uint32_t rst_flags; #ifdef AH_SUPPORT_AR9130 /* Because of the AR9130 specific registers */ if (AR_SREV_HOWL(ah)) { HALDEBUG(ah, HAL_DEBUG_ANY, "[ath] HOWL: Fiddling with derived clk!\n"); uint32_t val = OS_REG_READ(ah, AR_RTC_DERIVED_CLK); val &= ~AR_RTC_DERIVED_CLK_PERIOD; val |= SM(1, AR_RTC_DERIVED_CLK_PERIOD); OS_REG_WRITE(ah, AR_RTC_DERIVED_CLK, val); (void) OS_REG_READ(ah, AR_RTC_DERIVED_CLK); } #endif /* AH_SUPPORT_AR9130 */ /* * Force wake */ OS_REG_WRITE(ah, AR_RTC_FORCE_WAKE, AR_RTC_FORCE_WAKE_EN | AR_RTC_FORCE_WAKE_ON_INT); #ifdef AH_SUPPORT_AR9130 if (AR_SREV_HOWL(ah)) { rst_flags = AR_RTC_RC_MAC_WARM | AR_RTC_RC_MAC_COLD | AR_RTC_RC_COLD_RESET | AR_RTC_RC_WARM_RESET; } else { #endif /* AH_SUPPORT_AR9130 */ /* * Reset AHB * * (In case the last interrupt source was a bus timeout.) * XXX TODO: this is not the way to do it! It should be recorded * XXX by the interrupt handler and passed _into_ the * XXX reset path routine so this occurs. */ tmpReg = OS_REG_READ(ah, AR_INTR_SYNC_CAUSE); if (tmpReg & (AR_INTR_SYNC_LOCAL_TIMEOUT|AR_INTR_SYNC_RADM_CPL_TIMEOUT)) { OS_REG_WRITE(ah, AR_INTR_SYNC_ENABLE, 0); OS_REG_WRITE(ah, AR_RC, AR_RC_AHB|AR_RC_HOSTIF); } else { OS_REG_WRITE(ah, AR_RC, AR_RC_AHB); } rst_flags = AR_RTC_RC_MAC_WARM; if (type == HAL_RESET_COLD) rst_flags |= AR_RTC_RC_MAC_COLD; #ifdef AH_SUPPORT_AR9130 } #endif /* AH_SUPPORT_AR9130 */ OS_REG_WRITE(ah, AR_RTC_RC, rst_flags); if (AR_SREV_HOWL(ah)) OS_DELAY(10000); else OS_DELAY(100); /* * Clear resets and force wakeup */ OS_REG_WRITE(ah, AR_RTC_RC, 0); if (!ath_hal_wait(ah, AR_RTC_RC, AR_RTC_RC_M, 0)) { HALDEBUG(ah, HAL_DEBUG_ANY, "%s: RTC stuck in MAC reset\n", __func__); return AH_FALSE; } /* Clear AHB reset */ if (! AR_SREV_HOWL(ah)) OS_REG_WRITE(ah, AR_RC, 0); if (AR_SREV_HOWL(ah)) OS_DELAY(50); if (AR_SREV_HOWL(ah)) { uint32_t mask; mask = OS_REG_READ(ah, AR_CFG); if (mask & (AR_CFG_SWRB | AR_CFG_SWTB | AR_CFG_SWRG)) { HALDEBUG(ah, HAL_DEBUG_RESET, "CFG Byte Swap Set 0x%x\n", mask); } else { mask = INIT_CONFIG_STATUS | AR_CFG_SWRB | AR_CFG_SWTB; OS_REG_WRITE(ah, AR_CFG, mask); HALDEBUG(ah, HAL_DEBUG_RESET, "Setting CFG 0x%x\n", OS_REG_READ(ah, AR_CFG)); } } else { if (type == HAL_RESET_COLD) { if (isBigEndian()) { /* * Set CFG, little-endian for descriptor accesses. */ mask = INIT_CONFIG_STATUS | AR_CFG_SWRD; #ifndef AH_NEED_DESC_SWAP mask |= AR_CFG_SWTD; #endif HALDEBUG(ah, HAL_DEBUG_RESET, "%s Applying descriptor swap\n", __func__); OS_REG_WRITE(ah, AR_CFG, mask); } else OS_REG_WRITE(ah, AR_CFG, INIT_CONFIG_STATUS); } } return AH_TRUE; } void ar5416InitChainMasks(struct ath_hal *ah) { int rx_chainmask = AH5416(ah)->ah_rx_chainmask; /* Flip this for this chainmask regardless of chip */ if (rx_chainmask == 0x5) OS_REG_SET_BIT(ah, AR_PHY_ANALOG_SWAP, AR_PHY_SWAP_ALT_CHAIN); /* * Workaround for OWL 1.0 calibration failure; enable multi-chain; * then set true mask after calibration. 
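 *
 * Illustrative reading of the chainmasks (derived from the code, not
 * from the original comments): 0x5 selects chains 0 and 2, 0x3 selects
 * chains 0 and 1; Owl 1.0 calibrates with all three chains (0x7) and
 * the real mask is restored afterwards by ar5416RestoreChainMask().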
*/ if (IS_5416V1(ah) && (rx_chainmask == 0x5 || rx_chainmask == 0x3)) { OS_REG_WRITE(ah, AR_PHY_RX_CHAINMASK, 0x7); OS_REG_WRITE(ah, AR_PHY_CAL_CHAINMASK, 0x7); } else { OS_REG_WRITE(ah, AR_PHY_RX_CHAINMASK, AH5416(ah)->ah_rx_chainmask); OS_REG_WRITE(ah, AR_PHY_CAL_CHAINMASK, AH5416(ah)->ah_rx_chainmask); } OS_REG_WRITE(ah, AR_SELFGEN_MASK, AH5416(ah)->ah_tx_chainmask); if (AH5416(ah)->ah_tx_chainmask == 0x5) OS_REG_SET_BIT(ah, AR_PHY_ANALOG_SWAP, AR_PHY_SWAP_ALT_CHAIN); if (AR_SREV_HOWL(ah)) { OS_REG_WRITE(ah, AR_PHY_ANALOG_SWAP, OS_REG_READ(ah, AR_PHY_ANALOG_SWAP) | 0x00000001); } } /* * Work-around for Owl 1.0 calibration failure. * * ar5416InitChainMasks sets the RX chainmask to 0x7 if it's Owl 1.0 * due to init calibration failures. ar5416RestoreChainMask restores * these registers to the correct setting. */ void ar5416RestoreChainMask(struct ath_hal *ah) { int rx_chainmask = AH5416(ah)->ah_rx_chainmask; if (IS_5416V1(ah) && (rx_chainmask == 0x5 || rx_chainmask == 0x3)) { OS_REG_WRITE(ah, AR_PHY_RX_CHAINMASK, rx_chainmask); OS_REG_WRITE(ah, AR_PHY_CAL_CHAINMASK, rx_chainmask); } } void ar5416InitPLL(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t pll = AR_RTC_PLL_REFDIV_5 | AR_RTC_PLL_DIV2; if (chan != AH_NULL) { if (IEEE80211_IS_CHAN_HALF(chan)) pll |= SM(0x1, AR_RTC_PLL_CLKSEL); else if (IEEE80211_IS_CHAN_QUARTER(chan)) pll |= SM(0x2, AR_RTC_PLL_CLKSEL); if (IEEE80211_IS_CHAN_5GHZ(chan)) pll |= SM(0xa, AR_RTC_PLL_DIV); else pll |= SM(0xb, AR_RTC_PLL_DIV); } else pll |= SM(0xb, AR_RTC_PLL_DIV); OS_REG_WRITE(ah, AR_RTC_PLL_CONTROL, pll); /* TODO: * For multi-band owl, switch between bands by reiniting the PLL. */ OS_DELAY(RTC_PLL_SETTLE_DELAY); OS_REG_WRITE(ah, AR_RTC_SLEEP_CLK, AR_RTC_SLEEP_DERIVED_CLK); } static void ar5416SetDefGainValues(struct ath_hal *ah, const MODAL_EEP_HEADER *pModal, const struct ar5416eeprom *eep, uint8_t txRxAttenLocal, int regChainOffset, int i) { if (IS_EEP_MINOR_V3(ah)) { txRxAttenLocal = pModal->txRxAttenCh[i]; if (AR_SREV_MERLIN_10_OR_LATER(ah)) { OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_XATTEN1_MARGIN, pModal->bswMargin[i]); OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_XATTEN1_DB, pModal->bswAtten[i]); OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_XATTEN2_MARGIN, pModal->xatten2Margin[i]); OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_XATTEN2_DB, pModal->xatten2Db[i]); } else { OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_BSW_MARGIN, pModal->bswMargin[i]); OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_BSW_ATTEN, pModal->bswAtten[i]); } } if (AR_SREV_MERLIN_10_OR_LATER(ah)) { OS_REG_RMW_FIELD(ah, AR_PHY_RXGAIN + regChainOffset, AR9280_PHY_RXGAIN_TXRX_ATTEN, txRxAttenLocal); OS_REG_RMW_FIELD(ah, AR_PHY_RXGAIN + regChainOffset, AR9280_PHY_RXGAIN_TXRX_MARGIN, pModal->rxTxMarginCh[i]); } else { OS_REG_RMW_FIELD(ah, AR_PHY_RXGAIN + regChainOffset, AR_PHY_RXGAIN_TXRX_ATTEN, txRxAttenLocal); OS_REG_RMW_FIELD(ah, AR_PHY_GAIN_2GHZ + regChainOffset, AR_PHY_GAIN_2GHZ_RXTX_MARGIN, pModal->rxTxMarginCh[i]); } } /* * Get the register chain offset for the given chain. * * Take into account the register chain swapping with AR5416 v2.0. * * XXX make sure that the reg chain swapping is only done for * XXX AR5416 v2.0 or greater, and not later chips? 
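 *
 * Worked example (from the code below): with a chainmask of 0x5 on
 * AR5416 v2.0 or later, chain 0 maps to register offset 0x0000,
 * chain 1 to 0x2000 and chain 2 to 0x1000; in all other cases chain i
 * maps to i * 0x1000.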
*/ int ar5416GetRegChainOffset(struct ath_hal *ah, int i) { int regChainOffset; if (AR_SREV_5416_V20_OR_LATER(ah) && (AH5416(ah)->ah_rx_chainmask == 0x5 || AH5416(ah)->ah_tx_chainmask == 0x5) && (i != 0)) { /* Regs are swapped from chain 2 to 1 for 5416 2_0 with * only chains 0 and 2 populated */ regChainOffset = (i == 1) ? 0x2000 : 0x1000; } else { regChainOffset = i * 0x1000; } return regChainOffset; } /* * Read EEPROM header info and program the device for correct operation * given the channel value. */ HAL_BOOL ar5416SetBoardValues(struct ath_hal *ah, const struct ieee80211_channel *chan) { const HAL_EEPROM_v14 *ee = AH_PRIVATE(ah)->ah_eeprom; const struct ar5416eeprom *eep = &ee->ee_base; const MODAL_EEP_HEADER *pModal; int i, regChainOffset; uint8_t txRxAttenLocal; /* workaround for eeprom versions <= 14.2 */ HALASSERT(AH_PRIVATE(ah)->ah_eeversion >= AR_EEPROM_VER14_1); pModal = &eep->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)]; /* NB: workaround for eeprom versions <= 14.2 */ txRxAttenLocal = IEEE80211_IS_CHAN_2GHZ(chan) ? 23 : 44; OS_REG_WRITE(ah, AR_PHY_SWITCH_COM, pModal->antCtrlCommon); for (i = 0; i < AR5416_MAX_CHAINS; i++) { if (AR_SREV_MERLIN(ah)) { if (i >= 2) break; } regChainOffset = ar5416GetRegChainOffset(ah, i); OS_REG_WRITE(ah, AR_PHY_SWITCH_CHAIN_0 + regChainOffset, pModal->antCtrlChain[i]); OS_REG_WRITE(ah, AR_PHY_TIMING_CTRL4 + regChainOffset, (OS_REG_READ(ah, AR_PHY_TIMING_CTRL4 + regChainOffset) & ~(AR_PHY_TIMING_CTRL4_IQCORR_Q_Q_COFF | AR_PHY_TIMING_CTRL4_IQCORR_Q_I_COFF)) | SM(pModal->iqCalICh[i], AR_PHY_TIMING_CTRL4_IQCORR_Q_I_COFF) | SM(pModal->iqCalQCh[i], AR_PHY_TIMING_CTRL4_IQCORR_Q_Q_COFF)); /* * Large signal upgrade: * if the EEPROM is 14.3 or later, use * txRxAttenLocal = pModal->txRxAttenCh[i]; * otherwise txRxAttenLocal keeps the fixed value above. */ if ((i == 0) || AR_SREV_5416_V20_OR_LATER(ah)) ar5416SetDefGainValues(ah, pModal, eep, txRxAttenLocal, regChainOffset, i); } if (AR_SREV_MERLIN_10_OR_LATER(ah)) { if (IEEE80211_IS_CHAN_2GHZ(chan)) { OS_A_REG_RMW_FIELD(ah, AR_AN_RF2G1_CH0, AR_AN_RF2G1_CH0_OB, pModal->ob); OS_A_REG_RMW_FIELD(ah, AR_AN_RF2G1_CH0, AR_AN_RF2G1_CH0_DB, pModal->db); OS_A_REG_RMW_FIELD(ah, AR_AN_RF2G1_CH1, AR_AN_RF2G1_CH1_OB, pModal->ob_ch1); OS_A_REG_RMW_FIELD(ah, AR_AN_RF2G1_CH1, AR_AN_RF2G1_CH1_DB, pModal->db_ch1); } else { OS_A_REG_RMW_FIELD(ah, AR_AN_RF5G1_CH0, AR_AN_RF5G1_CH0_OB5, pModal->ob); OS_A_REG_RMW_FIELD(ah, AR_AN_RF5G1_CH0, AR_AN_RF5G1_CH0_DB5, pModal->db); OS_A_REG_RMW_FIELD(ah, AR_AN_RF5G1_CH1, AR_AN_RF5G1_CH1_OB5, pModal->ob_ch1); OS_A_REG_RMW_FIELD(ah, AR_AN_RF5G1_CH1, AR_AN_RF5G1_CH1_DB5, pModal->db_ch1); } OS_A_REG_RMW_FIELD(ah, AR_AN_TOP2, AR_AN_TOP2_XPABIAS_LVL, pModal->xpaBiasLvl); OS_A_REG_RMW_FIELD(ah, AR_AN_TOP2, AR_AN_TOP2_LOCALBIAS, !!(pModal->flagBits & AR5416_EEP_FLAG_LOCALBIAS)); OS_A_REG_RMW_FIELD(ah, AR_PHY_XPA_CFG, AR_PHY_FORCE_XPA_CFG, !!(pModal->flagBits & AR5416_EEP_FLAG_FORCEXPAON)); } OS_REG_RMW_FIELD(ah, AR_PHY_SETTLING, AR_PHY_SETTLING_SWITCH, pModal->switchSettling); OS_REG_RMW_FIELD(ah, AR_PHY_DESIRED_SZ, AR_PHY_DESIRED_SZ_ADC, pModal->adcDesiredSize); if (!
AR_SREV_MERLIN_10_OR_LATER(ah)) OS_REG_RMW_FIELD(ah, AR_PHY_DESIRED_SZ, AR_PHY_DESIRED_SZ_PGA, pModal->pgaDesiredSize); OS_REG_WRITE(ah, AR_PHY_RF_CTL4, SM(pModal->txEndToXpaOff, AR_PHY_RF_CTL4_TX_END_XPAA_OFF) | SM(pModal->txEndToXpaOff, AR_PHY_RF_CTL4_TX_END_XPAB_OFF) | SM(pModal->txFrameToXpaOn, AR_PHY_RF_CTL4_FRAME_XPAA_ON) | SM(pModal->txFrameToXpaOn, AR_PHY_RF_CTL4_FRAME_XPAB_ON)); OS_REG_RMW_FIELD(ah, AR_PHY_RF_CTL3, AR_PHY_TX_END_TO_A2_RX_ON, pModal->txEndToRxOn); if (AR_SREV_MERLIN_10_OR_LATER(ah)) { OS_REG_RMW_FIELD(ah, AR_PHY_CCA, AR9280_PHY_CCA_THRESH62, pModal->thresh62); OS_REG_RMW_FIELD(ah, AR_PHY_EXT_CCA0, AR_PHY_EXT_CCA0_THRESH62, pModal->thresh62); } else { OS_REG_RMW_FIELD(ah, AR_PHY_CCA, AR_PHY_CCA_THRESH62, pModal->thresh62); OS_REG_RMW_FIELD(ah, AR_PHY_EXT_CCA, AR_PHY_EXT_CCA_THRESH62, pModal->thresh62); } /* Minor Version Specific application */ if (IS_EEP_MINOR_V2(ah)) { OS_REG_RMW_FIELD(ah, AR_PHY_RF_CTL2, AR_PHY_TX_FRAME_TO_DATA_START, pModal->txFrameToDataStart); OS_REG_RMW_FIELD(ah, AR_PHY_RF_CTL2, AR_PHY_TX_FRAME_TO_PA_ON, pModal->txFrameToPaOn); } if (IS_EEP_MINOR_V3(ah) && IEEE80211_IS_CHAN_HT40(chan)) /* Overwrite switch settling with HT40 value */ OS_REG_RMW_FIELD(ah, AR_PHY_SETTLING, AR_PHY_SETTLING_SWITCH, pModal->swSettleHt40); if (AR_SREV_MERLIN_20_OR_LATER(ah) && EEP_MINOR(ah) >= AR5416_EEP_MINOR_VER_19) OS_REG_RMW_FIELD(ah, AR_PHY_CCK_TX_CTRL, AR_PHY_CCK_TX_CTRL_TX_DAC_SCALE_CCK, pModal->miscBits); if (AR_SREV_MERLIN_20(ah) && EEP_MINOR(ah) >= AR5416_EEP_MINOR_VER_20) { if (IEEE80211_IS_CHAN_2GHZ(chan)) OS_A_REG_RMW_FIELD(ah, AR_AN_TOP1, AR_AN_TOP1_DACIPMODE, eep->baseEepHeader.dacLpMode); else if (eep->baseEepHeader.dacHiPwrMode_5G) OS_A_REG_RMW_FIELD(ah, AR_AN_TOP1, AR_AN_TOP1_DACIPMODE, 0); else OS_A_REG_RMW_FIELD(ah, AR_AN_TOP1, AR_AN_TOP1_DACIPMODE, eep->baseEepHeader.dacLpMode); OS_DELAY(100); OS_REG_RMW_FIELD(ah, AR_PHY_FRAME_CTL, AR_PHY_FRAME_CTL_TX_CLIP, pModal->miscBits >> 2); OS_REG_RMW_FIELD(ah, AR_PHY_TX_PWRCTRL9, AR_PHY_TX_DESIRED_SCALE_CCK, eep->baseEepHeader.desiredScaleCCK); } return (AH_TRUE); } /* * Helper functions common for AP/CB/XB */ /* * Set the target power array "ratesArray" from the * given set of target powers. * * This is used by the various chipset/EEPROM TX power * setup routines. 
*/ void ar5416SetRatesArrayFromTargetPower(struct ath_hal *ah, const struct ieee80211_channel *chan, int16_t *ratesArray, const CAL_TARGET_POWER_LEG *targetPowerCck, const CAL_TARGET_POWER_LEG *targetPowerCckExt, const CAL_TARGET_POWER_LEG *targetPowerOfdm, const CAL_TARGET_POWER_LEG *targetPowerOfdmExt, const CAL_TARGET_POWER_HT *targetPowerHt20, const CAL_TARGET_POWER_HT *targetPowerHt40) { #define N(a) (sizeof(a)/sizeof(a[0])) int i; /* Blank the rates array, to be consistent */ for (i = 0; i < Ar5416RateSize; i++) ratesArray[i] = 0; /* Set rates Array from collected data */ ratesArray[rate6mb] = ratesArray[rate9mb] = ratesArray[rate12mb] = ratesArray[rate18mb] = ratesArray[rate24mb] = targetPowerOfdm->tPow2x[0]; ratesArray[rate36mb] = targetPowerOfdm->tPow2x[1]; ratesArray[rate48mb] = targetPowerOfdm->tPow2x[2]; ratesArray[rate54mb] = targetPowerOfdm->tPow2x[3]; ratesArray[rateXr] = targetPowerOfdm->tPow2x[0]; for (i = 0; i < N(targetPowerHt20->tPow2x); i++) { ratesArray[rateHt20_0 + i] = targetPowerHt20->tPow2x[i]; } if (IEEE80211_IS_CHAN_2GHZ(chan)) { ratesArray[rate1l] = targetPowerCck->tPow2x[0]; ratesArray[rate2s] = ratesArray[rate2l] = targetPowerCck->tPow2x[1]; ratesArray[rate5_5s] = ratesArray[rate5_5l] = targetPowerCck->tPow2x[2]; ratesArray[rate11s] = ratesArray[rate11l] = targetPowerCck->tPow2x[3]; } if (IEEE80211_IS_CHAN_HT40(chan)) { for (i = 0; i < N(targetPowerHt40->tPow2x); i++) { ratesArray[rateHt40_0 + i] = targetPowerHt40->tPow2x[i]; } ratesArray[rateDupOfdm] = targetPowerHt40->tPow2x[0]; ratesArray[rateDupCck] = targetPowerHt40->tPow2x[0]; ratesArray[rateExtOfdm] = targetPowerOfdmExt->tPow2x[0]; if (IEEE80211_IS_CHAN_2GHZ(chan)) { ratesArray[rateExtCck] = targetPowerCckExt->tPow2x[0]; } } #undef N } /* * ar5416SetPowerPerRateTable * * Sets the transmit power in the baseband for the given * operating channel and mode. 
*/ static HAL_BOOL ar5416SetPowerPerRateTable(struct ath_hal *ah, struct ar5416eeprom *pEepData, const struct ieee80211_channel *chan, int16_t *ratesArray, uint16_t cfgCtl, uint16_t AntennaReduction, uint16_t twiceMaxRegulatoryPower, uint16_t powerLimit) { #define N(a) (sizeof(a)/sizeof(a[0])) /* Local defines to distinguish between extension and control CTL's */ #define EXT_ADDITIVE (0x8000) #define CTL_11A_EXT (CTL_11A | EXT_ADDITIVE) #define CTL_11G_EXT (CTL_11G | EXT_ADDITIVE) #define CTL_11B_EXT (CTL_11B | EXT_ADDITIVE) uint16_t twiceMaxEdgePower = AR5416_MAX_RATE_POWER; int i; int16_t twiceLargestAntenna; CAL_CTL_DATA *rep; CAL_TARGET_POWER_LEG targetPowerOfdm, targetPowerCck = {0, {0, 0, 0, 0}}; CAL_TARGET_POWER_LEG targetPowerOfdmExt = {0, {0, 0, 0, 0}}, targetPowerCckExt = {0, {0, 0, 0, 0}}; CAL_TARGET_POWER_HT targetPowerHt20, targetPowerHt40 = {0, {0, 0, 0, 0}}; int16_t scaledPower, minCtlPower; #define SUB_NUM_CTL_MODES_AT_5G_40 2 /* excluding HT40, EXT-OFDM */ #define SUB_NUM_CTL_MODES_AT_2G_40 3 /* excluding HT40, EXT-OFDM, EXT-CCK */ static const uint16_t ctlModesFor11a[] = { CTL_11A, CTL_5GHT20, CTL_11A_EXT, CTL_5GHT40 }; static const uint16_t ctlModesFor11g[] = { CTL_11B, CTL_11G, CTL_2GHT20, CTL_11B_EXT, CTL_11G_EXT, CTL_2GHT40 }; const uint16_t *pCtlMode; uint16_t numCtlModes, ctlMode, freq; CHAN_CENTERS centers; ar5416GetChannelCenters(ah, chan, &centers); /* Compute TxPower reduction due to Antenna Gain */ twiceLargestAntenna = AH_MAX(AH_MAX( pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[0], pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[1]), pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[2]); #if 0 /* Turn it back on if we need to calculate per chain antenna gain reduction */ /* Use only if the expected gain > 6 dBi */ /* Chain 0 is always used */ twiceLargestAntenna = pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[0]; /* Look at antenna gains of Chains 1 and 2 if the TX mask is set */ if (ahp->ah_tx_chainmask & 0x2) twiceLargestAntenna = AH_MAX(twiceLargestAntenna, pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[1]); if (ahp->ah_tx_chainmask & 0x4) twiceLargestAntenna = AH_MAX(twiceLargestAntenna, pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].antennaGainCh[2]); #endif twiceLargestAntenna = (int16_t)AH_MIN((AntennaReduction) - twiceLargestAntenna, 0); /* XXX setup for 5212 use (really used?) */ ath_hal_eepromSet(ah, IEEE80211_IS_CHAN_2GHZ(chan) ? AR_EEP_ANTGAINMAX_2 : AR_EEP_ANTGAINMAX_5, twiceLargestAntenna); /* * scaledPower is the minimum of the user input power level and * the regulatory allowed power level */ scaledPower = AH_MIN(powerLimit, twiceMaxRegulatoryPower + twiceLargestAntenna); /* Reduce scaledPower by the number of active chains to get the per-chain TX power level */ /* TODO: better value than these?
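 *
 * Splitting TX power evenly across N chains would in theory call for
 * a back-off of 10*log10(N) dB (3 dB for two chains, ~4.8 dB for
 * three); the EEPROM pwrDecreaseFor{2,3}Chain values used below are
 * the calibrated equivalents, presumably in the same 0.5 dBm units
 * as scaledPower.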
*/ switch (owl_get_ntxchains(AH5416(ah)->ah_tx_chainmask)) { case 1: break; case 2: scaledPower -= pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].pwrDecreaseFor2Chain; break; case 3: scaledPower -= pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].pwrDecreaseFor3Chain; break; default: return AH_FALSE; /* Unsupported number of chains */ } scaledPower = AH_MAX(0, scaledPower); /* Get target powers from EEPROM - our baseline for TX Power */ if (IEEE80211_IS_CHAN_2GHZ(chan)) { /* Setup for CTL modes */ numCtlModes = N(ctlModesFor11g) - SUB_NUM_CTL_MODES_AT_2G_40; /* CTL_11B, CTL_11G, CTL_2GHT20 */ pCtlMode = ctlModesFor11g; ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPowerCck, AR5416_NUM_2G_CCK_TARGET_POWERS, &targetPowerCck, 4, AH_FALSE); ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPower2G, AR5416_NUM_2G_20_TARGET_POWERS, &targetPowerOfdm, 4, AH_FALSE); ar5416GetTargetPowers(ah, chan, pEepData->calTargetPower2GHT20, AR5416_NUM_2G_20_TARGET_POWERS, &targetPowerHt20, 8, AH_FALSE); if (IEEE80211_IS_CHAN_HT40(chan)) { numCtlModes = N(ctlModesFor11g); /* All 2G CTL's */ ar5416GetTargetPowers(ah, chan, pEepData->calTargetPower2GHT40, AR5416_NUM_2G_40_TARGET_POWERS, &targetPowerHt40, 8, AH_TRUE); /* Get target powers for extension channels */ ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPowerCck, AR5416_NUM_2G_CCK_TARGET_POWERS, &targetPowerCckExt, 4, AH_TRUE); ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPower2G, AR5416_NUM_2G_20_TARGET_POWERS, &targetPowerOfdmExt, 4, AH_TRUE); } } else { /* Setup for CTL modes */ numCtlModes = N(ctlModesFor11a) - SUB_NUM_CTL_MODES_AT_5G_40; /* CTL_11A, CTL_5GHT20 */ pCtlMode = ctlModesFor11a; ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPower5G, AR5416_NUM_5G_20_TARGET_POWERS, &targetPowerOfdm, 4, AH_FALSE); ar5416GetTargetPowers(ah, chan, pEepData->calTargetPower5GHT20, AR5416_NUM_5G_20_TARGET_POWERS, &targetPowerHt20, 8, AH_FALSE); if (IEEE80211_IS_CHAN_HT40(chan)) { numCtlModes = N(ctlModesFor11a); /* All 5G CTL's */ ar5416GetTargetPowers(ah, chan, pEepData->calTargetPower5GHT40, AR5416_NUM_5G_40_TARGET_POWERS, &targetPowerHt40, 8, AH_TRUE); ar5416GetTargetPowersLeg(ah, chan, pEepData->calTargetPower5G, AR5416_NUM_5G_20_TARGET_POWERS, &targetPowerOfdmExt, 4, AH_TRUE); } } /* * For MIMO, need to apply regulatory caps individually across dynamically * running modes: CCK, OFDM, HT20, HT40 * * The outer loop walks through each possible applicable runtime mode. * The inner loop walks through each ctlIndex entry in EEPROM. * The ctl value is encoded as [7:4] == test group, [3:0] == test mode. 
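 *
 * For example (illustrative encoding only, not taken from the EEPROM
 * spec): a ctlIndex of 0x21 would request test mode 1 within test
 * group 2. When the regulatory domain reports SD_NO_CTL, the loop
 * below takes the minimum over all matching CTL edge powers rather
 * than a single specific entry.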
* */ for (ctlMode = 0; ctlMode < numCtlModes; ctlMode++) { HAL_BOOL isHt40CtlMode = (pCtlMode[ctlMode] == CTL_5GHT40) || (pCtlMode[ctlMode] == CTL_2GHT40); if (isHt40CtlMode) { freq = centers.ctl_center; } else if (pCtlMode[ctlMode] & EXT_ADDITIVE) { freq = centers.ext_center; } else { freq = centers.ctl_center; } /* walk through each CTL index stored in EEPROM */ for (i = 0; (i < AR5416_NUM_CTLS) && pEepData->ctlIndex[i]; i++) { uint16_t twiceMinEdgePower; /* compare test group from regulatory channel list with test mode from pCtlMode list */ if ((((cfgCtl & ~CTL_MODE_M) | (pCtlMode[ctlMode] & CTL_MODE_M)) == pEepData->ctlIndex[i]) || (((cfgCtl & ~CTL_MODE_M) | (pCtlMode[ctlMode] & CTL_MODE_M)) == ((pEepData->ctlIndex[i] & CTL_MODE_M) | SD_NO_CTL))) { rep = &(pEepData->ctlData[i]); twiceMinEdgePower = ar5416GetMaxEdgePower(freq, rep->ctlEdges[owl_get_ntxchains(AH5416(ah)->ah_tx_chainmask) - 1], IEEE80211_IS_CHAN_2GHZ(chan)); if ((cfgCtl & ~CTL_MODE_M) == SD_NO_CTL) { /* Find the minimum of all CTL edge powers that apply to this channel */ twiceMaxEdgePower = AH_MIN(twiceMaxEdgePower, twiceMinEdgePower); } else { /* specific */ twiceMaxEdgePower = twiceMinEdgePower; break; } } } minCtlPower = (uint8_t)AH_MIN(twiceMaxEdgePower, scaledPower); /* Apply ctl mode to correct target power set */ switch(pCtlMode[ctlMode]) { case CTL_11B: for (i = 0; i < N(targetPowerCck.tPow2x); i++) { targetPowerCck.tPow2x[i] = (uint8_t)AH_MIN(targetPowerCck.tPow2x[i], minCtlPower); } break; case CTL_11A: case CTL_11G: for (i = 0; i < N(targetPowerOfdm.tPow2x); i++) { targetPowerOfdm.tPow2x[i] = (uint8_t)AH_MIN(targetPowerOfdm.tPow2x[i], minCtlPower); } break; case CTL_5GHT20: case CTL_2GHT20: for (i = 0; i < N(targetPowerHt20.tPow2x); i++) { targetPowerHt20.tPow2x[i] = (uint8_t)AH_MIN(targetPowerHt20.tPow2x[i], minCtlPower); } break; case CTL_11B_EXT: targetPowerCckExt.tPow2x[0] = (uint8_t)AH_MIN(targetPowerCckExt.tPow2x[0], minCtlPower); break; case CTL_11A_EXT: case CTL_11G_EXT: targetPowerOfdmExt.tPow2x[0] = (uint8_t)AH_MIN(targetPowerOfdmExt.tPow2x[0], minCtlPower); break; case CTL_5GHT40: case CTL_2GHT40: for (i = 0; i < N(targetPowerHt40.tPow2x); i++) { targetPowerHt40.tPow2x[i] = (uint8_t)AH_MIN(targetPowerHt40.tPow2x[i], minCtlPower); } break; default: return AH_FALSE; break; } } /* end ctl mode checking */ /* Set rates Array from collected data */ ar5416SetRatesArrayFromTargetPower(ah, chan, ratesArray, &targetPowerCck, &targetPowerCckExt, &targetPowerOfdm, &targetPowerOfdmExt, &targetPowerHt20, &targetPowerHt40); return AH_TRUE; #undef EXT_ADDITIVE #undef CTL_11A_EXT #undef CTL_11G_EXT #undef CTL_11B_EXT #undef SUB_NUM_CTL_MODES_AT_5G_40 #undef SUB_NUM_CTL_MODES_AT_2G_40 #undef N } /************************************************************************** * fbin2freq * * Get channel value from binary representation held in eeprom * RETURNS: the frequency in MHz */ static uint16_t fbin2freq(uint8_t fbin, HAL_BOOL is2GHz) { /* * Reserved value 0xFF provides an empty definition both as * an fbin and as a frequency - do not convert */ if (fbin == AR5416_BCHAN_UNUSED) { return fbin; } return (uint16_t)((is2GHz) ? 
(2300 + fbin) : (4800 + 5 * fbin)); } /* * ar5416GetMaxEdgePower * * Find the maximum conformance test limit for the given channel and CTL info */ uint16_t ar5416GetMaxEdgePower(uint16_t freq, CAL_CTL_EDGES *pRdEdgesPower, HAL_BOOL is2GHz) { uint16_t twiceMaxEdgePower = AR5416_MAX_RATE_POWER; int i; /* Get the edge power */ for (i = 0; (i < AR5416_NUM_BAND_EDGES) && (pRdEdgesPower[i].bChannel != AR5416_BCHAN_UNUSED) ; i++) { /* * If there's an exact channel match or an inband flag set * on the lower channel, use the given rdEdgePower */ if (freq == fbin2freq(pRdEdgesPower[i].bChannel, is2GHz)) { twiceMaxEdgePower = MS(pRdEdgesPower[i].tPowerFlag, CAL_CTL_EDGES_POWER); break; } else if ((i > 0) && (freq < fbin2freq(pRdEdgesPower[i].bChannel, is2GHz))) { if (fbin2freq(pRdEdgesPower[i - 1].bChannel, is2GHz) < freq && (pRdEdgesPower[i - 1].tPowerFlag & CAL_CTL_EDGES_FLAG) != 0) { twiceMaxEdgePower = MS(pRdEdgesPower[i - 1].tPowerFlag, CAL_CTL_EDGES_POWER); } /* Leave loop - no more affecting edges possible in this monotonically increasing list */ break; } } HALASSERT(twiceMaxEdgePower > 0); return twiceMaxEdgePower; } /************************************************************** * ar5416GetTargetPowers * * Return the rates of target power for the given target power table, * channel, and number of channels */ void ar5416GetTargetPowers(struct ath_hal *ah, const struct ieee80211_channel *chan, CAL_TARGET_POWER_HT *powInfo, uint16_t numChannels, CAL_TARGET_POWER_HT *pNewPower, uint16_t numRates, HAL_BOOL isHt40Target) { uint16_t clo, chi; int i; int matchIndex = -1, lowIndex = -1; uint16_t freq; CHAN_CENTERS centers; ar5416GetChannelCenters(ah, chan, &centers); freq = isHt40Target ? centers.synth_center : centers.ctl_center; /* Copy the target powers into the temp channel list */ if (freq <= fbin2freq(powInfo[0].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) { matchIndex = 0; } else { for (i = 0; (i < numChannels) && (powInfo[i].bChannel != AR5416_BCHAN_UNUSED); i++) { if (freq == fbin2freq(powInfo[i].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) { matchIndex = i; break; } else if ((freq < fbin2freq(powInfo[i].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) && (freq > fbin2freq(powInfo[i - 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)))) { lowIndex = i - 1; break; } } if ((matchIndex == -1) && (lowIndex == -1)) { HALASSERT(freq > fbin2freq(powInfo[i - 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))); matchIndex = i - 1; } } if (matchIndex != -1) { OS_MEMCPY(pNewPower, &powInfo[matchIndex], sizeof(*pNewPower)); } else { HALASSERT(lowIndex != -1); /* * Get the lower and upper channels, target powers, * and interpolate between them. */ clo = fbin2freq(powInfo[lowIndex].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)); chi = fbin2freq(powInfo[lowIndex + 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)); for (i = 0; i < numRates; i++) { pNewPower->tPow2x[i] = (uint8_t)ath_ee_interpolate(freq, clo, chi, powInfo[lowIndex].tPow2x[i], powInfo[lowIndex + 1].tPow2x[i]); } } } /************************************************************** * ar5416GetTargetPowersLeg * * Return the four rates of target power for the given target power table, * channel, and number of channels */ void ar5416GetTargetPowersLeg(struct ath_hal *ah, const struct ieee80211_channel *chan, CAL_TARGET_POWER_LEG *powInfo, uint16_t numChannels, CAL_TARGET_POWER_LEG *pNewPower, uint16_t numRates, HAL_BOOL isExtTarget) { uint16_t clo, chi; int i; int matchIndex = -1, lowIndex = -1; uint16_t freq; CHAN_CENTERS centers; ar5416GetChannelCenters(ah, chan, &centers); freq = (isExtTarget) ?
centers.ext_center : centers.ctl_center; /* Copy the target powers into the temp channel list */ if (freq <= fbin2freq(powInfo[0].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) { matchIndex = 0; } else { for (i = 0; (i < numChannels) && (powInfo[i].bChannel != AR5416_BCHAN_UNUSED); i++) { if (freq == fbin2freq(powInfo[i].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) { matchIndex = i; break; } else if ((freq < fbin2freq(powInfo[i].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))) && (freq > fbin2freq(powInfo[i - 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)))) { lowIndex = i - 1; break; } } if ((matchIndex == -1) && (lowIndex == -1)) { HALASSERT(freq > fbin2freq(powInfo[i - 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan))); matchIndex = i - 1; } } if (matchIndex != -1) { OS_MEMCPY(pNewPower, &powInfo[matchIndex], sizeof(*pNewPower)); } else { HALASSERT(lowIndex != -1); /* * Get the lower and upper channels, target powers, * and interpolate between them. */ clo = fbin2freq(powInfo[lowIndex].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)); chi = fbin2freq(powInfo[lowIndex + 1].bChannel, IEEE80211_IS_CHAN_2GHZ(chan)); for (i = 0; i < numRates; i++) { pNewPower->tPow2x[i] = (uint8_t)ath_ee_interpolate(freq, clo, chi, powInfo[lowIndex].tPow2x[i], powInfo[lowIndex + 1].tPow2x[i]); } } } /* * Set the gain boundaries for the given radio chain. * * The gain boundaries tell the hardware at what point in the * PDADC array to "switch over" from one PD gain setting * to another. There's also a gain overlap between two * PDADC array gain curves where there are valid PD values * for two gain settings. * * The hardware uses the gain overlap and gain boundaries * to determine which gain curve to use for the given * target TX power. */ void ar5416SetGainBoundariesClosedLoop(struct ath_hal *ah, int i, uint16_t pdGainOverlap_t2, uint16_t gainBoundaries[]) { int regChainOffset; regChainOffset = ar5416GetRegChainOffset(ah, i); HALDEBUG(ah, HAL_DEBUG_EEPROM, "%s: chain %d: gainOverlap_t2: %d," " gainBoundaries: %d, %d, %d, %d\n", __func__, i, pdGainOverlap_t2, gainBoundaries[0], gainBoundaries[1], gainBoundaries[2], gainBoundaries[3]); OS_REG_WRITE(ah, AR_PHY_TPCRG5 + regChainOffset, SM(pdGainOverlap_t2, AR_PHY_TPCRG5_PD_GAIN_OVERLAP) | SM(gainBoundaries[0], AR_PHY_TPCRG5_PD_GAIN_BOUNDARY_1) | SM(gainBoundaries[1], AR_PHY_TPCRG5_PD_GAIN_BOUNDARY_2) | SM(gainBoundaries[2], AR_PHY_TPCRG5_PD_GAIN_BOUNDARY_3) | SM(gainBoundaries[3], AR_PHY_TPCRG5_PD_GAIN_BOUNDARY_4)); } /* * Get the gain values and the number of gain levels given * in xpdMask. * * The EEPROM xpdMask determines which power detector gain * levels were used during calibration. Each of these mask * bits maps to a fixed gain level in hardware. */ uint16_t ar5416GetXpdGainValues(struct ath_hal *ah, uint16_t xpdMask, uint16_t xpdGainValues[]) { int i; uint16_t numXpdGain = 0; for (i = 1; i <= AR5416_PD_GAINS_IN_MASK; i++) { if ((xpdMask >> (AR5416_PD_GAINS_IN_MASK - i)) & 1) { if (numXpdGain >= AR5416_NUM_PD_GAINS) { HALASSERT(0); break; } xpdGainValues[numXpdGain] = (uint16_t)(AR5416_PD_GAINS_IN_MASK - i); numXpdGain++; } } return numXpdGain; } /* * Write the detector gain and biases. * * There are four power detector gain levels. The xpdMask in the EEPROM * determines which power detector gain levels have TX power calibration * data associated with them. This function writes the number of * PD gain levels and their values into the hardware. * * This is valid for all TX chains - the calibration data itself however * will likely differ per-chain.
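 *
 * Example (derived from ar5416GetXpdGainValues above): an EEPROM
 * xpdMask of 0x9 (binary 1001) selects gain levels {3, 0}, so
 * numXpdGain is 2 and only the first two xpdGainValues entries are
 * meaningful.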
*/ void ar5416WriteDetectorGainBiases(struct ath_hal *ah, uint16_t numXpdGain, uint16_t xpdGainValues[]) { HALDEBUG(ah, HAL_DEBUG_EEPROM, "%s: numXpdGain: %d," " xpdGainValues: %d, %d, %d\n", __func__, numXpdGain, xpdGainValues[0], xpdGainValues[1], xpdGainValues[2]); OS_REG_WRITE(ah, AR_PHY_TPCRG1, (OS_REG_READ(ah, AR_PHY_TPCRG1) & ~(AR_PHY_TPCRG1_NUM_PD_GAIN | AR_PHY_TPCRG1_PD_GAIN_1 | AR_PHY_TPCRG1_PD_GAIN_2 | AR_PHY_TPCRG1_PD_GAIN_3)) | SM(numXpdGain - 1, AR_PHY_TPCRG1_NUM_PD_GAIN) | SM(xpdGainValues[0], AR_PHY_TPCRG1_PD_GAIN_1 ) | SM(xpdGainValues[1], AR_PHY_TPCRG1_PD_GAIN_2) | SM(xpdGainValues[2], AR_PHY_TPCRG1_PD_GAIN_3)); } /* * Write the PDADC array to the given radio chain i. * * The 32 PDADC registers are written without any care about * their contents - so if various chips treat values as "special", * this routine will not care. */ void ar5416WritePdadcValues(struct ath_hal *ah, int i, uint8_t pdadcValues[]) { int regOffset, regChainOffset; int j; int reg32; regChainOffset = ar5416GetRegChainOffset(ah, i); regOffset = AR_PHY_BASE + (672 << 2) + regChainOffset; for (j = 0; j < 32; j++) { reg32 = ((pdadcValues[4*j + 0] & 0xFF) << 0) | ((pdadcValues[4*j + 1] & 0xFF) << 8) | ((pdadcValues[4*j + 2] & 0xFF) << 16) | ((pdadcValues[4*j + 3] & 0xFF) << 24) ; OS_REG_WRITE(ah, regOffset, reg32); HALDEBUG(ah, HAL_DEBUG_EEPROM, "PDADC: Chain %d |" " PDADC %3d Value %3d | PDADC %3d Value %3d | PDADC %3d" " Value %3d | PDADC %3d Value %3d |\n", i, 4*j, pdadcValues[4*j], 4*j+1, pdadcValues[4*j + 1], 4*j+2, pdadcValues[4*j + 2], 4*j+3, pdadcValues[4*j + 3]); regOffset += 4; } } /************************************************************** * ar5416SetPowerCalTable * * Pull the PDADC piers from cal data and interpolate them across the given * points as well as from the nearest pier(s) to get a power detector * linear voltage to power level table. 
*/ HAL_BOOL ar5416SetPowerCalTable(struct ath_hal *ah, struct ar5416eeprom *pEepData, const struct ieee80211_channel *chan, int16_t *pTxPowerIndexOffset) { CAL_DATA_PER_FREQ *pRawDataset; uint8_t *pCalBChans = AH_NULL; uint16_t pdGainOverlap_t2; static uint8_t pdadcValues[AR5416_NUM_PDADC_VALUES]; uint16_t gainBoundaries[AR5416_PD_GAINS_IN_MASK]; uint16_t numPiers, i; int16_t tMinCalPower; uint16_t numXpdGain, xpdMask; uint16_t xpdGainValues[AR5416_NUM_PD_GAINS]; uint32_t regChainOffset; OS_MEMZERO(xpdGainValues, sizeof(xpdGainValues)); xpdMask = pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].xpdGain; if (IS_EEP_MINOR_V2(ah)) { pdGainOverlap_t2 = pEepData->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)].pdGainOverlap; } else { pdGainOverlap_t2 = (uint16_t)(MS(OS_REG_READ(ah, AR_PHY_TPCRG5), AR_PHY_TPCRG5_PD_GAIN_OVERLAP)); } if (IEEE80211_IS_CHAN_2GHZ(chan)) { pCalBChans = pEepData->calFreqPier2G; numPiers = AR5416_NUM_2G_CAL_PIERS; } else { pCalBChans = pEepData->calFreqPier5G; numPiers = AR5416_NUM_5G_CAL_PIERS; } /* Calculate the value of xpdgains from the xpdGain Mask */ numXpdGain = ar5416GetXpdGainValues(ah, xpdMask, xpdGainValues); /* Write the detector gain biases and their number */ ar5416WriteDetectorGainBiases(ah, numXpdGain, xpdGainValues); for (i = 0; i < AR5416_MAX_CHAINS; i++) { regChainOffset = ar5416GetRegChainOffset(ah, i); if (pEepData->baseEepHeader.txMask & (1 << i)) { if (IEEE80211_IS_CHAN_2GHZ(chan)) { pRawDataset = pEepData->calPierData2G[i]; } else { pRawDataset = pEepData->calPierData5G[i]; } /* Fetch the gain boundaries and the PDADC values */ ar5416GetGainBoundariesAndPdadcs(ah, chan, pRawDataset, pCalBChans, numPiers, pdGainOverlap_t2, &tMinCalPower, gainBoundaries, pdadcValues, numXpdGain); if ((i == 0) || AR_SREV_5416_V20_OR_LATER(ah)) { ar5416SetGainBoundariesClosedLoop(ah, i, pdGainOverlap_t2, gainBoundaries); } /* Write the power values into the baseband power table */ ar5416WritePdadcValues(ah, i, pdadcValues); } } *pTxPowerIndexOffset = 0; return AH_TRUE; } /************************************************************** * ar5416GetGainBoundariesAndPdadcs * * Uses the data points read from EEPROM to reconstruct the pdadc power table * Called by ar5416SetPowerCalTable only. 
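 *
 * The calibration powers appear to be stored in 0.25 dBm steps (hence
 * the minPwrT4/maxPwrT4 naming); the boundary between gain curves i
 * and i+1 computed below, (maxPwrT4[i] + minPwrT4[i+1]) / 4, is then
 * just their midpoint expressed in 0.5 dBm units.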
*/ void ar5416GetGainBoundariesAndPdadcs(struct ath_hal *ah, const struct ieee80211_channel *chan, CAL_DATA_PER_FREQ *pRawDataSet, uint8_t * bChans, uint16_t availPiers, uint16_t tPdGainOverlap, int16_t *pMinCalPower, uint16_t * pPdGainBoundaries, uint8_t * pPDADCValues, uint16_t numXpdGains) { int i, j, k; int16_t ss; /* potentially -ve index for taking care of pdGainOverlap */ uint16_t idxL, idxR, numPiers; /* Pier indexes */ /* filled out Vpd table for all pdGains (chanL) */ static uint8_t vpdTableL[AR5416_NUM_PD_GAINS][AR5416_MAX_PWR_RANGE_IN_HALF_DB]; /* filled out Vpd table for all pdGains (chanR) */ static uint8_t vpdTableR[AR5416_NUM_PD_GAINS][AR5416_MAX_PWR_RANGE_IN_HALF_DB]; /* filled out Vpd table for all pdGains (interpolated) */ static uint8_t vpdTableI[AR5416_NUM_PD_GAINS][AR5416_MAX_PWR_RANGE_IN_HALF_DB]; uint8_t *pVpdL, *pVpdR, *pPwrL, *pPwrR; uint8_t minPwrT4[AR5416_NUM_PD_GAINS]; uint8_t maxPwrT4[AR5416_NUM_PD_GAINS]; int16_t vpdStep; int16_t tmpVal; uint16_t sizeCurrVpdTable, maxIndex, tgtIndex; HAL_BOOL match; int16_t minDelta = 0; CHAN_CENTERS centers; ar5416GetChannelCenters(ah, chan, &centers); /* Trim numPiers for the number of populated channel Piers */ for (numPiers = 0; numPiers < availPiers; numPiers++) { if (bChans[numPiers] == AR5416_BCHAN_UNUSED) { break; } } /* Find pier indexes around the current channel */ match = ath_ee_getLowerUpperIndex((uint8_t)FREQ2FBIN(centers.synth_center, IEEE80211_IS_CHAN_2GHZ(chan)), bChans, numPiers, &idxL, &idxR); if (match) { /* Directly fill both vpd tables from the matching index */ for (i = 0; i < numXpdGains; i++) { minPwrT4[i] = pRawDataSet[idxL].pwrPdg[i][0]; maxPwrT4[i] = pRawDataSet[idxL].pwrPdg[i][4]; ath_ee_FillVpdTable(minPwrT4[i], maxPwrT4[i], pRawDataSet[idxL].pwrPdg[i], pRawDataSet[idxL].vpdPdg[i], AR5416_PD_GAIN_ICEPTS, vpdTableI[i]); } } else { for (i = 0; i < numXpdGains; i++) { pVpdL = pRawDataSet[idxL].vpdPdg[i]; pPwrL = pRawDataSet[idxL].pwrPdg[i]; pVpdR = pRawDataSet[idxR].vpdPdg[i]; pPwrR = pRawDataSet[idxR].pwrPdg[i]; /* Start Vpd interpolation from the max of the minimum powers */ minPwrT4[i] = AH_MAX(pPwrL[0], pPwrR[0]); /* End Vpd interpolation from the min of the max powers */ maxPwrT4[i] = AH_MIN(pPwrL[AR5416_PD_GAIN_ICEPTS - 1], pPwrR[AR5416_PD_GAIN_ICEPTS - 1]); HALASSERT(maxPwrT4[i] > minPwrT4[i]); /* Fill pier Vpds */ ath_ee_FillVpdTable(minPwrT4[i], maxPwrT4[i], pPwrL, pVpdL, AR5416_PD_GAIN_ICEPTS, vpdTableL[i]); ath_ee_FillVpdTable(minPwrT4[i], maxPwrT4[i], pPwrR, pVpdR, AR5416_PD_GAIN_ICEPTS, vpdTableR[i]); /* Interpolate the final vpd */ for (j = 0; j <= (maxPwrT4[i] - minPwrT4[i]) / 2; j++) { vpdTableI[i][j] = (uint8_t)(ath_ee_interpolate((uint16_t)FREQ2FBIN(centers.synth_center, IEEE80211_IS_CHAN_2GHZ(chan)), bChans[idxL], bChans[idxR], vpdTableL[i][j], vpdTableR[i][j])); } } } *pMinCalPower = (int16_t)(minPwrT4[0] / 2); k = 0; /* index for the final table */ for (i = 0; i < numXpdGains; i++) { if (i == (numXpdGains - 1)) { pPdGainBoundaries[i] = (uint16_t)(maxPwrT4[i] / 2); } else { pPdGainBoundaries[i] = (uint16_t)((maxPwrT4[i] + minPwrT4[i+1]) / 4); } pPdGainBoundaries[i] = (uint16_t)AH_MIN(AR5416_MAX_RATE_POWER, pPdGainBoundaries[i]); /* NB: only applies to owl 1.0 */ if ((i == 0) && !AR_SREV_5416_V20_OR_LATER(ah) ) { /* * fix the gain delta, but get a delta that can be applied to min to * keep the upper power values accurate; max presumably does not need * to be adjusted since it should not be in that area of the table
*/ minDelta = pPdGainBoundaries[0] - 23; pPdGainBoundaries[0] = 23; } else { minDelta = 0; } /* Find starting index for this pdGain */ if (i == 0) { if (AR_SREV_MERLIN_10_OR_LATER(ah)) ss = (int16_t)(0 - (minPwrT4[i] / 2)); else ss = 0; /* for the first pdGain, start from index 0 */ } else { /* need overlap entries extrapolated below. */ ss = (int16_t)((pPdGainBoundaries[i-1] - (minPwrT4[i] / 2)) - tPdGainOverlap + 1 + minDelta); } vpdStep = (int16_t)(vpdTableI[i][1] - vpdTableI[i][0]); vpdStep = (int16_t)((vpdStep < 1) ? 1 : vpdStep); /* * A negative ss indicates the need to extrapolate data below for this pdGain */ while ((ss < 0) && (k < (AR5416_NUM_PDADC_VALUES - 1))) { tmpVal = (int16_t)(vpdTableI[i][0] + ss * vpdStep); pPDADCValues[k++] = (uint8_t)((tmpVal < 0) ? 0 : tmpVal); ss++; } sizeCurrVpdTable = (uint8_t)((maxPwrT4[i] - minPwrT4[i]) / 2 + 1); tgtIndex = (uint8_t)(pPdGainBoundaries[i] + tPdGainOverlap - (minPwrT4[i] / 2)); maxIndex = (tgtIndex < sizeCurrVpdTable) ? tgtIndex : sizeCurrVpdTable; while ((ss < maxIndex) && (k < (AR5416_NUM_PDADC_VALUES - 1))) { pPDADCValues[k++] = vpdTableI[i][ss++]; } vpdStep = (int16_t)(vpdTableI[i][sizeCurrVpdTable - 1] - vpdTableI[i][sizeCurrVpdTable - 2]); vpdStep = (int16_t)((vpdStep < 1) ? 1 : vpdStep); /* * for the last gain, pdGainBoundary == Pmax_t2, so we will * have to extrapolate */ if (tgtIndex >= maxIndex) { /* need to extrapolate above */ while ((ss <= tgtIndex) && (k < (AR5416_NUM_PDADC_VALUES - 1))) { tmpVal = (int16_t)((vpdTableI[i][sizeCurrVpdTable - 1] + (ss - maxIndex + 1) * vpdStep)); pPDADCValues[k++] = (uint8_t)((tmpVal > 255) ? 255 : tmpVal); ss++; } } /* extrapolated above */ } /* for all pdGainUsed */ /* Fill out pdGainBoundaries - only up to 2 allowed here, but hardware allows up to 4 */ while (i < AR5416_PD_GAINS_IN_MASK) { pPdGainBoundaries[i] = pPdGainBoundaries[i-1]; i++; } while (k < AR5416_NUM_PDADC_VALUES) { pPDADCValues[k] = pPDADCValues[k-1]; k++; } return; } /* * The Linux ath9k driver and (from what I've been told) the reference * Atheros driver enable the 11n PHY by default whether or not it's * configured. */ static void ar5416Set11nRegs(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t phymode; uint32_t enableDacFifo = 0; HAL_HT_MACMODE macmode; /* MAC - 20/40 mode */ if (AR_SREV_KITE_10_OR_LATER(ah)) enableDacFifo = (OS_REG_READ(ah, AR_PHY_TURBO) & AR_PHY_FC_ENABLE_DAC_FIFO); /* Enable 11n HT, 20 MHz */ phymode = AR_PHY_FC_HT_EN | AR_PHY_FC_SHORT_GI_40 | AR_PHY_FC_SINGLE_HT_LTF1 | AR_PHY_FC_WALSH | enableDacFifo; /* Configure baseband for dynamic 20/40 operation */ if (IEEE80211_IS_CHAN_HT40(chan)) { phymode |= AR_PHY_FC_DYN2040_EN; /* Configure control (primary) channel at +-10MHz */ if (IEEE80211_IS_CHAN_HT40U(chan)) phymode |= AR_PHY_FC_DYN2040_PRI_CH; #if 0 /* Configure 20/25 spacing */ if (ht->ht_extprotspacing == HAL_HT_EXTPROTSPACING_25) phymode |= AR_PHY_FC_DYN2040_EXT_CH; #endif macmode = HAL_HT_MACMODE_2040; } else macmode = HAL_HT_MACMODE_20; OS_REG_WRITE(ah, AR_PHY_TURBO, phymode); /* Configure MAC for 20/40 operation */ ar5416Set11nMac2040(ah, macmode); /* global transmit timeout (25 TUs default) */ /* XXX - put this elsewhere???
*/ OS_REG_WRITE(ah, AR_GTXTO, 25 << AR_GTXTO_TIMEOUT_LIMIT_S); /* carrier sense timeout */ OS_REG_SET_BIT(ah, AR_GTTM, AR_GTTM_CST_USEC); OS_REG_WRITE(ah, AR_CST, 0xF << AR_CST_TIMEOUT_LIMIT_S); } void ar5416GetChannelCenters(struct ath_hal *ah, const struct ieee80211_channel *chan, CHAN_CENTERS *centers) { uint16_t freq = ath_hal_gethwchannel(ah, chan); centers->ctl_center = freq; centers->synth_center = freq; /* * In 20/40 phy mode, the center frequency is * "between" the control and extension channels. */ if (IEEE80211_IS_CHAN_HT40U(chan)) { centers->synth_center += HT40_CHANNEL_CENTER_SHIFT; centers->ext_center = centers->synth_center + HT40_CHANNEL_CENTER_SHIFT; } else if (IEEE80211_IS_CHAN_HT40D(chan)) { centers->synth_center -= HT40_CHANNEL_CENTER_SHIFT; centers->ext_center = centers->synth_center - HT40_CHANNEL_CENTER_SHIFT; } else { centers->ext_center = freq; } } /* * Override the INI values being programmed. */ static void ar5416OverrideIni(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t val; /* * Set RX_ABORT and RX_DIS, and clear them only after * RXE is set for the MAC. This prevents frames with corrupted * descriptor status. */ OS_REG_SET_BIT(ah, AR_DIAG_SW, (AR_DIAG_RX_DIS | AR_DIAG_RX_ABORT)); if (AR_SREV_MERLIN_10_OR_LATER(ah)) { val = OS_REG_READ(ah, AR_PCU_MISC_MODE2); val &= (~AR_PCU_MISC_MODE2_ADHOC_MCAST_KEYID_ENABLE); if (!AR_SREV_9271(ah)) val &= ~AR_PCU_MISC_MODE2_HWWAR1; if (AR_SREV_KIWI_10_OR_LATER(ah)) val = val & (~AR_PCU_MISC_MODE2_HWWAR2); OS_REG_WRITE(ah, AR_PCU_MISC_MODE2, val); } /* * Disable RIFS search on some chips to avoid baseband * hang issues. */ if (AR_SREV_HOWL(ah) || AR_SREV_SOWL(ah)) (void) ar5416SetRifsDelay(ah, chan, AH_FALSE); if (!AR_SREV_5416_V20_OR_LATER(ah) || AR_SREV_MERLIN(ah)) return; /* * Disable BB clock gating. * Necessary to avoid issues on AR5416 2.0 */ OS_REG_WRITE(ah, 0x9800 + (651 << 2), 0x11); } struct ini { uint32_t *data; /* NB: !const */ int rows, cols; }; /* * Override XPA bias level based on operating frequency. * This is a v14 EEPROM specific thing for the AR9160. */ void ar5416EepromSetAddac(struct ath_hal *ah, const struct ieee80211_channel *chan) { #define XPA_LVL_FREQ(cnt) (pModal->xpaBiasLvlFreq[cnt]) MODAL_EEP_HEADER *pModal; HAL_EEPROM_v14 *ee = AH_PRIVATE(ah)->ah_eeprom; struct ar5416eeprom *eep = &ee->ee_base; uint8_t biaslevel; if (! AR_SREV_SOWL(ah)) return; if (EEP_MINOR(ah) < AR5416_EEP_MINOR_VER_7) return; pModal = &(eep->modalHeader[IEEE80211_IS_CHAN_2GHZ(chan)]); if (pModal->xpaBiasLvl != 0xff) biaslevel = pModal->xpaBiasLvl; else { uint16_t resetFreqBin, freqBin, freqCount = 0; CHAN_CENTERS centers; ar5416GetChannelCenters(ah, chan, &centers); resetFreqBin = FREQ2FBIN(centers.synth_center, IEEE80211_IS_CHAN_2GHZ(chan)); freqBin = XPA_LVL_FREQ(0) & 0xff; biaslevel = (uint8_t) (XPA_LVL_FREQ(0) >> 14); freqCount++; while (freqCount < 3) { if (XPA_LVL_FREQ(freqCount) == 0x0) break; freqBin = XPA_LVL_FREQ(freqCount) & 0xff; if (resetFreqBin >= freqBin) biaslevel = (uint8_t)(XPA_LVL_FREQ(freqCount) >> 14); else break; freqCount++; } } HALDEBUG(ah, HAL_DEBUG_EEPROM, "%s: overriding XPA bias level = %d\n", __func__, biaslevel); /* * This is a dirty workaround for the const initval data, * which will upset multiple AR9160s on the same board. * * The HAL should likely just have a private copy of the addac * data per instance.
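 *
 * In the writes below the two-bit bias level is spliced into the
 * shared ADDAC init table: bits 3-4 of row 7 (mask 0x18) for 2GHz
 * and bits 6-7 of row 6 (mask 0xc0) for 5GHz.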
*/ if (IEEE80211_IS_CHAN_2GHZ(chan)) HAL_INI_VAL((struct ini *) &AH5416(ah)->ah_ini_addac, 7, 1) = (HAL_INI_VAL(&AH5416(ah)->ah_ini_addac, 7, 1) & (~0x18)) | biaslevel << 3; else HAL_INI_VAL((struct ini *) &AH5416(ah)->ah_ini_addac, 6, 1) = (HAL_INI_VAL(&AH5416(ah)->ah_ini_addac, 6, 1) & (~0xc0)) | biaslevel << 6; #undef XPA_LVL_FREQ } static void ar5416MarkPhyInactive(struct ath_hal *ah) { OS_REG_WRITE(ah, AR_PHY_ACTIVE, AR_PHY_ACTIVE_DIS); } #define AR5416_IFS_SLOT_FULL_RATE_40 0x168 /* 9 us half, 40 MHz core clock (9*40) */ #define AR5416_IFS_SLOT_HALF_RATE_40 0x104 /* 13 us half, 20 MHz core clock (13*20) */ #define AR5416_IFS_SLOT_QUARTER_RATE_40 0xD2 /* 21 us quarter, 10 MHz core clock (21*10) */ #define AR5416_IFS_EIFS_FULL_RATE_40 0xE60 /* (74 + (2 * 9)) * 40MHz core clock */ #define AR5416_IFS_EIFS_HALF_RATE_40 0xDAC /* (149 + (2 * 13)) * 20MHz core clock */ #define AR5416_IFS_EIFS_QUARTER_RATE_40 0xD48 /* (298 + (2 * 21)) * 10MHz core clock */ #define AR5416_IFS_SLOT_FULL_RATE_44 0x18c /* 9 us half, 44 MHz core clock (9*44) */ #define AR5416_IFS_SLOT_HALF_RATE_44 0x11e /* 13 us half, 22 MHz core clock (13*22) */ #define AR5416_IFS_SLOT_QUARTER_RATE_44 0xe7 /* 21 us quarter, 11 MHz core clock (21*11) */ #define AR5416_IFS_EIFS_FULL_RATE_44 0xfd0 /* (74 + (2 * 9)) * 44MHz core clock */ #define AR5416_IFS_EIFS_HALF_RATE_44 0xf0a /* (149 + (2 * 13)) * 22MHz core clock */ #define AR5416_IFS_EIFS_QUARTER_RATE_44 0xe9c /* (298 + (2 * 21)) * 11MHz core clock */ #define AR5416_INIT_USEC_40 40 #define AR5416_HALF_RATE_USEC_40 19 /* ((40 / 2) - 1 ) */ #define AR5416_QUARTER_RATE_USEC_40 9 /* ((40 / 4) - 1 ) */ #define AR5416_INIT_USEC_44 44 #define AR5416_HALF_RATE_USEC_44 21 /* ((44 / 2) - 1 ) */ #define AR5416_QUARTER_RATE_USEC_44 10 /* ((44 / 4) - 1 ) */ /* XXX What should these be for 40/44MHz clocks (and half/quarter) ? */ #define AR5416_RX_NON_FULL_RATE_LATENCY 63 #define AR5416_TX_HALF_RATE_LATENCY 108 #define AR5416_TX_QUARTER_RATE_LATENCY 216 /* * Adjust various register settings based on half/quarter rate clock setting. * This includes: * * + USEC, TX/RX latency, * + IFS params: slot, eifs, misc etc. * * TODO: * * + Verify which other registers need to be tweaked; * + Verify the behaviour of this for 5GHz fast and non-fast clock mode; * + This just plain won't work for long distance links - the coverage class * code isn't aware of the slot/ifs/ACK/RTS timeout values that need to * change; * + Verify whether the 32KHz USEC value needs to be kept for the 802.11n * series chips? * + Calculate/derive values for 2GHz, 5GHz, 5GHz fast clock */ static void ar5416SetIFSTiming(struct ath_hal *ah, const struct ieee80211_channel *chan) { uint32_t txLat, rxLat, usec, slot, refClock, eifs, init_usec; int clk_44 = 0; HALASSERT(IEEE80211_IS_CHAN_HALF(chan) || IEEE80211_IS_CHAN_QUARTER(chan)); /* 2GHz and 5GHz fast clock - 44MHz; else 40MHz */ if (IEEE80211_IS_CHAN_2GHZ(chan)) clk_44 = 1; else if (IEEE80211_IS_CHAN_5GHZ(chan) && IS_5GHZ_FAST_CLOCK_EN(ah, chan)) clk_44 = 1; /* XXX does this need save/restoring for the 11n chips? */ + /* + * XXX TODO: should mask out the txlat/rxlat/usec values? + */ refClock = OS_REG_READ(ah, AR_USEC) & AR_USEC_USEC32; /* * XXX This really should calculate things, not use * hard coded values! Ew. 
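 *
 * For the record, the hard coded tables above follow
 *
 *	slot_ticks = slot_us * core_clock_mhz
 *	eifs_ticks = (eifs_us + 2 * slot_us) * core_clock_mhz
 *
 * e.g. half rate on the 44MHz reference clock runs the core at
 * 22MHz, so 13us * 22 = 0x11e and (149 + 2 * 13) * 22 = 0xf0a,
 * which match AR5416_IFS_SLOT_HALF_RATE_44 and
 * AR5416_IFS_EIFS_HALF_RATE_44.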
*/ if (IEEE80211_IS_CHAN_HALF(chan)) { if (clk_44) { slot = AR5416_IFS_SLOT_HALF_RATE_44; rxLat = AR5416_RX_NON_FULL_RATE_LATENCY << AR5416_USEC_RX_LAT_S; txLat = AR5416_TX_HALF_RATE_LATENCY << AR5416_USEC_TX_LAT_S; usec = AR5416_HALF_RATE_USEC_44; eifs = AR5416_IFS_EIFS_HALF_RATE_44; init_usec = AR5416_INIT_USEC_44 >> 1; } else { slot = AR5416_IFS_SLOT_HALF_RATE_40; rxLat = AR5416_RX_NON_FULL_RATE_LATENCY << AR5416_USEC_RX_LAT_S; txLat = AR5416_TX_HALF_RATE_LATENCY << AR5416_USEC_TX_LAT_S; usec = AR5416_HALF_RATE_USEC_40; eifs = AR5416_IFS_EIFS_HALF_RATE_40; init_usec = AR5416_INIT_USEC_40 >> 1; } } else { /* quarter rate */ if (clk_44) { slot = AR5416_IFS_SLOT_QUARTER_RATE_44; rxLat = AR5416_RX_NON_FULL_RATE_LATENCY << AR5416_USEC_RX_LAT_S; txLat = AR5416_TX_QUARTER_RATE_LATENCY << AR5416_USEC_TX_LAT_S; usec = AR5416_QUARTER_RATE_USEC_44; eifs = AR5416_IFS_EIFS_QUARTER_RATE_44; init_usec = AR5416_INIT_USEC_44 >> 2; } else { slot = AR5416_IFS_SLOT_QUARTER_RATE_40; rxLat = AR5416_RX_NON_FULL_RATE_LATENCY << AR5416_USEC_RX_LAT_S; txLat = AR5416_TX_QUARTER_RATE_LATENCY << AR5416_USEC_TX_LAT_S; usec = AR5416_QUARTER_RATE_USEC_40; eifs = AR5416_IFS_EIFS_QUARTER_RATE_40; init_usec = AR5416_INIT_USEC_40 >> 2; } } /* XXX verify these! */ OS_REG_WRITE(ah, AR_USEC, (usec | refClock | txLat | rxLat)); OS_REG_WRITE(ah, AR_D_GBL_IFS_SLOT, slot); OS_REG_WRITE(ah, AR_D_GBL_IFS_EIFS, eifs); OS_REG_RMW_FIELD(ah, AR_D_GBL_IFS_MISC, AR_D_GBL_IFS_MISC_USEC_DURATION, init_usec); } Index: projects/clang390-import/sys/dev/cxgbe/t4_main.c =================================================================== --- projects/clang390-import/sys/dev/cxgbe/t4_main.c (revision 305686) +++ projects/clang390-import/sys/dev/cxgbe/t4_main.c (revision 305687) @@ -1,9575 +1,9577 @@ /*- * Copyright (c) 2011 Chelsio Communications, Inc. * All rights reserved. * Written by: Navdeep Parhar * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_inet.h" #include "opt_inet6.h" #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef RSS #include #endif #if defined(__i386__) || defined(__amd64__) #include #include #endif #ifdef DDB #include #include #endif #include "common/common.h" #include "common/t4_msg.h" #include "common/t4_regs.h" #include "common/t4_regs_values.h" #include "t4_ioctl.h" #include "t4_l2t.h" #include "t4_mp_ring.h" #include "t4_if.h" /* T4 bus driver interface */ static int t4_probe(device_t); static int t4_attach(device_t); static int t4_detach(device_t); static int t4_ready(device_t); static int t4_read_port_device(device_t, int, device_t *); static device_method_t t4_methods[] = { DEVMETHOD(device_probe, t4_probe), DEVMETHOD(device_attach, t4_attach), DEVMETHOD(device_detach, t4_detach), DEVMETHOD(t4_is_main_ready, t4_ready), DEVMETHOD(t4_read_port_device, t4_read_port_device), DEVMETHOD_END }; static driver_t t4_driver = { "t4nex", t4_methods, sizeof(struct adapter) }; /* T4 port (cxgbe) interface */ static int cxgbe_probe(device_t); static int cxgbe_attach(device_t); static int cxgbe_detach(device_t); device_method_t cxgbe_methods[] = { DEVMETHOD(device_probe, cxgbe_probe), DEVMETHOD(device_attach, cxgbe_attach), DEVMETHOD(device_detach, cxgbe_detach), { 0, 0 } }; static driver_t cxgbe_driver = { "cxgbe", cxgbe_methods, sizeof(struct port_info) }; /* T4 VI (vcxgbe) interface */ static int vcxgbe_probe(device_t); static int vcxgbe_attach(device_t); static int vcxgbe_detach(device_t); static device_method_t vcxgbe_methods[] = { DEVMETHOD(device_probe, vcxgbe_probe), DEVMETHOD(device_attach, vcxgbe_attach), DEVMETHOD(device_detach, vcxgbe_detach), { 0, 0 } }; static driver_t vcxgbe_driver = { "vcxgbe", vcxgbe_methods, sizeof(struct vi_info) }; static d_ioctl_t t4_ioctl; static struct cdevsw t4_cdevsw = { .d_version = D_VERSION, .d_ioctl = t4_ioctl, .d_name = "t4nex", }; /* T5 bus driver interface */ static int t5_probe(device_t); static device_method_t t5_methods[] = { DEVMETHOD(device_probe, t5_probe), DEVMETHOD(device_attach, t4_attach), DEVMETHOD(device_detach, t4_detach), DEVMETHOD(t4_is_main_ready, t4_ready), DEVMETHOD(t4_read_port_device, t4_read_port_device), DEVMETHOD_END }; static driver_t t5_driver = { "t5nex", t5_methods, sizeof(struct adapter) }; /* T5 port (cxl) interface */ static driver_t cxl_driver = { "cxl", cxgbe_methods, sizeof(struct port_info) }; /* T5 VI (vcxl) interface */ static driver_t vcxl_driver = { "vcxl", vcxgbe_methods, sizeof(struct vi_info) }; /* ifnet + media interface */ static void cxgbe_init(void *); static int cxgbe_ioctl(struct ifnet *, unsigned long, caddr_t); static int cxgbe_transmit(struct ifnet *, struct mbuf *); static void cxgbe_qflush(struct ifnet *); static int cxgbe_media_change(struct ifnet *); static void cxgbe_media_status(struct ifnet *, struct ifmediareq *); MALLOC_DEFINE(M_CXGBE, "cxgbe", "Chelsio T4/T5 Ethernet driver and services"); /* * Correct lock order when you need to acquire multiple locks is t4_list_lock, * then ADAPTER_LOCK, then t4_uld_list_lock. */ static struct sx t4_list_lock; SLIST_HEAD(, adapter) t4_list; #ifdef TCP_OFFLOAD static struct sx t4_uld_list_lock; SLIST_HEAD(, uld_info) t4_uld_list; #endif /* * Tunables. See tweak_tunables() too. 
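 *
 * All of these are loader tunables, so they can be overridden from
 * /boot/loader.conf before the module is loaded, e.g.
 *
 *	hw.cxgbe.ntxq10g="8"
 *	hw.cxgbe.qsize_txq="2048"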
* * Each tunable is set to a default value here if it's known at compile-time. * Otherwise it is set to -1 as an indication to tweak_tunables() that it should * provide a reasonable default when the driver is loaded. * * Tunables applicable to both T4 and T5 are under hw.cxgbe. Those specific to * T5 are under hw.cxl. */ /* * Number of queues for tx and rx, 10G and 1G, NIC and offload. */ #define NTXQ_10G 16 int t4_ntxq10g = -1; TUNABLE_INT("hw.cxgbe.ntxq10g", &t4_ntxq10g); #define NRXQ_10G 8 int t4_nrxq10g = -1; TUNABLE_INT("hw.cxgbe.nrxq10g", &t4_nrxq10g); #define NTXQ_1G 4 int t4_ntxq1g = -1; TUNABLE_INT("hw.cxgbe.ntxq1g", &t4_ntxq1g); #define NRXQ_1G 2 int t4_nrxq1g = -1; TUNABLE_INT("hw.cxgbe.nrxq1g", &t4_nrxq1g); #define NTXQ_VI 1 static int t4_ntxq_vi = -1; TUNABLE_INT("hw.cxgbe.ntxq_vi", &t4_ntxq_vi); #define NRXQ_VI 1 static int t4_nrxq_vi = -1; TUNABLE_INT("hw.cxgbe.nrxq_vi", &t4_nrxq_vi); static int t4_rsrv_noflowq = 0; TUNABLE_INT("hw.cxgbe.rsrv_noflowq", &t4_rsrv_noflowq); #ifdef TCP_OFFLOAD #define NOFLDTXQ_10G 8 static int t4_nofldtxq10g = -1; TUNABLE_INT("hw.cxgbe.nofldtxq10g", &t4_nofldtxq10g); #define NOFLDRXQ_10G 2 static int t4_nofldrxq10g = -1; TUNABLE_INT("hw.cxgbe.nofldrxq10g", &t4_nofldrxq10g); #define NOFLDTXQ_1G 2 static int t4_nofldtxq1g = -1; TUNABLE_INT("hw.cxgbe.nofldtxq1g", &t4_nofldtxq1g); #define NOFLDRXQ_1G 1 static int t4_nofldrxq1g = -1; TUNABLE_INT("hw.cxgbe.nofldrxq1g", &t4_nofldrxq1g); #define NOFLDTXQ_VI 1 static int t4_nofldtxq_vi = -1; TUNABLE_INT("hw.cxgbe.nofldtxq_vi", &t4_nofldtxq_vi); #define NOFLDRXQ_VI 1 static int t4_nofldrxq_vi = -1; TUNABLE_INT("hw.cxgbe.nofldrxq_vi", &t4_nofldrxq_vi); #endif #ifdef DEV_NETMAP #define NNMTXQ_VI 2 static int t4_nnmtxq_vi = -1; TUNABLE_INT("hw.cxgbe.nnmtxq_vi", &t4_nnmtxq_vi); #define NNMRXQ_VI 2 static int t4_nnmrxq_vi = -1; TUNABLE_INT("hw.cxgbe.nnmrxq_vi", &t4_nnmrxq_vi); #endif /* * Holdoff parameters for 10G and 1G ports. */ #define TMR_IDX_10G 1 int t4_tmr_idx_10g = TMR_IDX_10G; TUNABLE_INT("hw.cxgbe.holdoff_timer_idx_10G", &t4_tmr_idx_10g); #define PKTC_IDX_10G (-1) int t4_pktc_idx_10g = PKTC_IDX_10G; TUNABLE_INT("hw.cxgbe.holdoff_pktc_idx_10G", &t4_pktc_idx_10g); #define TMR_IDX_1G 1 int t4_tmr_idx_1g = TMR_IDX_1G; TUNABLE_INT("hw.cxgbe.holdoff_timer_idx_1G", &t4_tmr_idx_1g); #define PKTC_IDX_1G (-1) int t4_pktc_idx_1g = PKTC_IDX_1G; TUNABLE_INT("hw.cxgbe.holdoff_pktc_idx_1G", &t4_pktc_idx_1g); /* * Size (# of entries) of each tx and rx queue. */ unsigned int t4_qsize_txq = TX_EQ_QSIZE; TUNABLE_INT("hw.cxgbe.qsize_txq", &t4_qsize_txq); unsigned int t4_qsize_rxq = RX_IQ_QSIZE; TUNABLE_INT("hw.cxgbe.qsize_rxq", &t4_qsize_rxq); /* * Interrupt types allowed (bits 0, 1, 2 = INTx, MSI, MSI-X respectively). */ int t4_intr_types = INTR_MSIX | INTR_MSI | INTR_INTX; TUNABLE_INT("hw.cxgbe.interrupt_types", &t4_intr_types); /* * Configuration file. */ #define DEFAULT_CF "default" #define FLASH_CF "flash" #define UWIRE_CF "uwire" #define FPGA_CF "fpga" static char t4_cfg_file[32] = DEFAULT_CF; TUNABLE_STR("hw.cxgbe.config_file", t4_cfg_file, sizeof(t4_cfg_file)); /* * PAUSE settings (bit 0, 1 = rx_pause, tx_pause respectively). * rx_pause = 1 to heed incoming PAUSE frames, 0 to ignore them. * tx_pause = 1 to emit PAUSE frames when the rx FIFO reaches its high water * mark or when signalled to do so, 0 to never emit PAUSE. 
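 *
 * As a quick reference, then: 0 = neither, 1 = rx_pause only,
 * 2 = tx_pause only, 3 = both (the compiled-in default below).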
*/ static int t4_pause_settings = PAUSE_TX | PAUSE_RX; TUNABLE_INT("hw.cxgbe.pause_settings", &t4_pause_settings); /* * Firmware auto-install by driver during attach (0, 1, 2 = prohibited, allowed, * encouraged respectively). */ static unsigned int t4_fw_install = 1; TUNABLE_INT("hw.cxgbe.fw_install", &t4_fw_install); /* * ASIC features that will be used. Disable the ones you don't want so that the * chip resources aren't wasted on features that will not be used. */ static int t4_nbmcaps_allowed = 0; TUNABLE_INT("hw.cxgbe.nbmcaps_allowed", &t4_nbmcaps_allowed); static int t4_linkcaps_allowed = 0; /* No DCBX, PPP, etc. by default */ TUNABLE_INT("hw.cxgbe.linkcaps_allowed", &t4_linkcaps_allowed); static int t4_switchcaps_allowed = FW_CAPS_CONFIG_SWITCH_INGRESS | FW_CAPS_CONFIG_SWITCH_EGRESS; TUNABLE_INT("hw.cxgbe.switchcaps_allowed", &t4_switchcaps_allowed); static int t4_niccaps_allowed = FW_CAPS_CONFIG_NIC; TUNABLE_INT("hw.cxgbe.niccaps_allowed", &t4_niccaps_allowed); static int t4_toecaps_allowed = -1; TUNABLE_INT("hw.cxgbe.toecaps_allowed", &t4_toecaps_allowed); static int t4_rdmacaps_allowed = -1; TUNABLE_INT("hw.cxgbe.rdmacaps_allowed", &t4_rdmacaps_allowed); static int t4_tlscaps_allowed = 0; TUNABLE_INT("hw.cxgbe.tlscaps_allowed", &t4_tlscaps_allowed); static int t4_iscsicaps_allowed = -1; TUNABLE_INT("hw.cxgbe.iscsicaps_allowed", &t4_iscsicaps_allowed); static int t4_fcoecaps_allowed = 0; TUNABLE_INT("hw.cxgbe.fcoecaps_allowed", &t4_fcoecaps_allowed); static int t5_write_combine = 0; TUNABLE_INT("hw.cxl.write_combine", &t5_write_combine); static int t4_num_vis = 1; TUNABLE_INT("hw.cxgbe.num_vis", &t4_num_vis); /* Functions used by extra VIs to obtain unique MAC addresses for each VI. */ static int vi_mac_funcs[] = { FW_VI_FUNC_OFLD, FW_VI_FUNC_IWARP, FW_VI_FUNC_OPENISCSI, FW_VI_FUNC_OPENFCOE, FW_VI_FUNC_FOISCSI, FW_VI_FUNC_FOFCOE, }; struct intrs_and_queues { uint16_t intr_type; /* INTx, MSI, or MSI-X */ uint16_t nirq; /* Total # of vectors */ uint16_t intr_flags_10g;/* Interrupt flags for each 10G port */ uint16_t intr_flags_1g; /* Interrupt flags for each 1G port */ uint16_t ntxq10g; /* # of NIC txq's for each 10G port */ uint16_t nrxq10g; /* # of NIC rxq's for each 10G port */ uint16_t ntxq1g; /* # of NIC txq's for each 1G port */ uint16_t nrxq1g; /* # of NIC rxq's for each 1G port */ uint16_t rsrv_noflowq; /* Flag whether to reserve queue 0 */ uint16_t nofldtxq10g; /* # of TOE txq's for each 10G port */ uint16_t nofldrxq10g; /* # of TOE rxq's for each 10G port */ uint16_t nofldtxq1g; /* # of TOE txq's for each 1G port */ uint16_t nofldrxq1g; /* # of TOE rxq's for each 1G port */ /* The vcxgbe/vcxl interfaces use these and not the ones above. 
*/ uint16_t ntxq_vi; /* # of NIC txq's */ uint16_t nrxq_vi; /* # of NIC rxq's */ uint16_t nofldtxq_vi; /* # of TOE txq's */ uint16_t nofldrxq_vi; /* # of TOE rxq's */ uint16_t nnmtxq_vi; /* # of netmap txq's */ uint16_t nnmrxq_vi; /* # of netmap rxq's */ }; struct filter_entry { uint32_t valid:1; /* filter allocated and valid */ uint32_t locked:1; /* filter is administratively locked */ uint32_t pending:1; /* filter action is pending firmware reply */ uint32_t smtidx:8; /* Source MAC Table index for smac */ struct l2t_entry *l2t; /* Layer Two Table entry for dmac */ struct t4_filter_specification fs; }; static void setup_memwin(struct adapter *); static void position_memwin(struct adapter *, int, uint32_t); static int rw_via_memwin(struct adapter *, int, uint32_t, uint32_t *, int, int); static inline int read_via_memwin(struct adapter *, int, uint32_t, uint32_t *, int); static inline int write_via_memwin(struct adapter *, int, uint32_t, const uint32_t *, int); static int validate_mem_range(struct adapter *, uint32_t, int); static int fwmtype_to_hwmtype(int); static int validate_mt_off_len(struct adapter *, int, uint32_t, int, uint32_t *); static int fixup_devlog_params(struct adapter *); static int cfg_itype_and_nqueues(struct adapter *, int, int, int, struct intrs_and_queues *); static int prep_firmware(struct adapter *); static int partition_resources(struct adapter *, const struct firmware *, const char *); static int get_params__pre_init(struct adapter *); static int get_params__post_init(struct adapter *); static int set_params__post_init(struct adapter *); static void t4_set_desc(struct adapter *); static void build_medialist(struct port_info *, struct ifmedia *); static int cxgbe_init_synchronized(struct vi_info *); static int cxgbe_uninit_synchronized(struct vi_info *); static void quiesce_txq(struct adapter *, struct sge_txq *); static void quiesce_wrq(struct adapter *, struct sge_wrq *); static void quiesce_iq(struct adapter *, struct sge_iq *); static void quiesce_fl(struct adapter *, struct sge_fl *); static int t4_alloc_irq(struct adapter *, struct irq *, int rid, driver_intr_t *, void *, char *); static int t4_free_irq(struct adapter *, struct irq *); static void get_regs(struct adapter *, struct t4_regdump *, uint8_t *); static void vi_refresh_stats(struct adapter *, struct vi_info *); static void cxgbe_refresh_stats(struct adapter *, struct port_info *); static void cxgbe_tick(void *); static void cxgbe_vlan_config(void *, struct ifnet *, uint16_t); static void cxgbe_sysctls(struct port_info *); static int sysctl_int_array(SYSCTL_HANDLER_ARGS); static int sysctl_bitfield(SYSCTL_HANDLER_ARGS); static int sysctl_btphy(SYSCTL_HANDLER_ARGS); static int sysctl_noflowq(SYSCTL_HANDLER_ARGS); static int sysctl_holdoff_tmr_idx(SYSCTL_HANDLER_ARGS); static int sysctl_holdoff_pktc_idx(SYSCTL_HANDLER_ARGS); static int sysctl_qsize_rxq(SYSCTL_HANDLER_ARGS); static int sysctl_qsize_txq(SYSCTL_HANDLER_ARGS); static int sysctl_pause_settings(SYSCTL_HANDLER_ARGS); static int sysctl_handle_t4_reg64(SYSCTL_HANDLER_ARGS); static int sysctl_temperature(SYSCTL_HANDLER_ARGS); #ifdef SBUF_DRAIN static int sysctl_cctrl(SYSCTL_HANDLER_ARGS); static int sysctl_cim_ibq_obq(SYSCTL_HANDLER_ARGS); static int sysctl_cim_la(SYSCTL_HANDLER_ARGS); static int sysctl_cim_la_t6(SYSCTL_HANDLER_ARGS); static int sysctl_cim_ma_la(SYSCTL_HANDLER_ARGS); static int sysctl_cim_pif_la(SYSCTL_HANDLER_ARGS); static int sysctl_cim_qcfg(SYSCTL_HANDLER_ARGS); static int sysctl_cpl_stats(SYSCTL_HANDLER_ARGS); static int 
sysctl_ddp_stats(SYSCTL_HANDLER_ARGS); static int sysctl_devlog(SYSCTL_HANDLER_ARGS); static int sysctl_fcoe_stats(SYSCTL_HANDLER_ARGS); static int sysctl_hw_sched(SYSCTL_HANDLER_ARGS); static int sysctl_lb_stats(SYSCTL_HANDLER_ARGS); static int sysctl_linkdnrc(SYSCTL_HANDLER_ARGS); static int sysctl_meminfo(SYSCTL_HANDLER_ARGS); static int sysctl_mps_tcam(SYSCTL_HANDLER_ARGS); static int sysctl_mps_tcam_t6(SYSCTL_HANDLER_ARGS); static int sysctl_path_mtus(SYSCTL_HANDLER_ARGS); static int sysctl_pm_stats(SYSCTL_HANDLER_ARGS); static int sysctl_rdma_stats(SYSCTL_HANDLER_ARGS); static int sysctl_tcp_stats(SYSCTL_HANDLER_ARGS); static int sysctl_tids(SYSCTL_HANDLER_ARGS); static int sysctl_tp_err_stats(SYSCTL_HANDLER_ARGS); static int sysctl_tp_la_mask(SYSCTL_HANDLER_ARGS); static int sysctl_tp_la(SYSCTL_HANDLER_ARGS); static int sysctl_tx_rate(SYSCTL_HANDLER_ARGS); static int sysctl_ulprx_la(SYSCTL_HANDLER_ARGS); static int sysctl_wcwr_stats(SYSCTL_HANDLER_ARGS); static int sysctl_tc_params(SYSCTL_HANDLER_ARGS); #endif #ifdef TCP_OFFLOAD static int sysctl_tp_tick(SYSCTL_HANDLER_ARGS); static int sysctl_tp_dack_timer(SYSCTL_HANDLER_ARGS); static int sysctl_tp_timer(SYSCTL_HANDLER_ARGS); #endif static uint32_t fconf_iconf_to_mode(uint32_t, uint32_t); static uint32_t mode_to_fconf(uint32_t); static uint32_t mode_to_iconf(uint32_t); static int check_fspec_against_fconf_iconf(struct adapter *, struct t4_filter_specification *); static int get_filter_mode(struct adapter *, uint32_t *); static int set_filter_mode(struct adapter *, uint32_t); static inline uint64_t get_filter_hits(struct adapter *, uint32_t); static int get_filter(struct adapter *, struct t4_filter *); static int set_filter(struct adapter *, struct t4_filter *); static int del_filter(struct adapter *, struct t4_filter *); static void clear_filter(struct filter_entry *); static int set_filter_wr(struct adapter *, int); static int del_filter_wr(struct adapter *, int); static int set_tcb_rpl(struct sge_iq *, const struct rss_header *, struct mbuf *); static int get_sge_context(struct adapter *, struct t4_sge_context *); static int load_fw(struct adapter *, struct t4_data *); static int read_card_mem(struct adapter *, int, struct t4_mem_range *); static int read_i2c(struct adapter *, struct t4_i2c_data *); #ifdef TCP_OFFLOAD static int toe_capability(struct vi_info *, int); #endif static int mod_event(module_t, int, void *); static int notify_siblings(device_t, int); struct { uint16_t device; char *desc; } t4_pciids[] = { {0xa000, "Chelsio Terminator 4 FPGA"}, {0x4400, "Chelsio T440-dbg"}, {0x4401, "Chelsio T420-CR"}, {0x4402, "Chelsio T422-CR"}, {0x4403, "Chelsio T440-CR"}, {0x4404, "Chelsio T420-BCH"}, {0x4405, "Chelsio T440-BCH"}, {0x4406, "Chelsio T440-CH"}, {0x4407, "Chelsio T420-SO"}, {0x4408, "Chelsio T420-CX"}, {0x4409, "Chelsio T420-BT"}, {0x440a, "Chelsio T404-BT"}, {0x440e, "Chelsio T440-LP-CR"}, }, t5_pciids[] = { {0xb000, "Chelsio Terminator 5 FPGA"}, {0x5400, "Chelsio T580-dbg"}, {0x5401, "Chelsio T520-CR"}, /* 2 x 10G */ {0x5402, "Chelsio T522-CR"}, /* 2 x 10G, 2 X 1G */ {0x5403, "Chelsio T540-CR"}, /* 4 x 10G */ {0x5407, "Chelsio T520-SO"}, /* 2 x 10G, nomem */ {0x5409, "Chelsio T520-BT"}, /* 2 x 10GBaseT */ {0x540a, "Chelsio T504-BT"}, /* 4 x 1G */ {0x540d, "Chelsio T580-CR"}, /* 2 x 40G */ {0x540e, "Chelsio T540-LP-CR"}, /* 4 x 10G */ {0x5410, "Chelsio T580-LP-CR"}, /* 2 x 40G */ {0x5411, "Chelsio T520-LL-CR"}, /* 2 x 10G */ {0x5412, "Chelsio T560-CR"}, /* 1 x 40G, 2 x 10G */ {0x5414, "Chelsio T580-LP-SO-CR"}, /* 2 x 
40G, nomem */ {0x5415, "Chelsio T502-BT"}, /* 2 x 1G */ #ifdef notyet {0x5404, "Chelsio T520-BCH"}, {0x5405, "Chelsio T540-BCH"}, {0x5406, "Chelsio T540-CH"}, {0x5408, "Chelsio T520-CX"}, {0x540b, "Chelsio B520-SR"}, {0x540c, "Chelsio B504-BT"}, {0x540f, "Chelsio Amsterdam"}, {0x5413, "Chelsio T580-CHR"}, #endif }; #ifdef TCP_OFFLOAD /* * service_iq() has an iq and needs the fl. Offset of fl from the iq should be * exactly the same for both rxq and ofld_rxq. */ CTASSERT(offsetof(struct sge_ofld_rxq, iq) == offsetof(struct sge_rxq, iq)); CTASSERT(offsetof(struct sge_ofld_rxq, fl) == offsetof(struct sge_rxq, fl)); #endif CTASSERT(sizeof(struct cluster_metadata) <= CL_METADATA_SIZE); static int t4_probe(device_t dev) { int i; uint16_t v = pci_get_vendor(dev); uint16_t d = pci_get_device(dev); uint8_t f = pci_get_function(dev); if (v != PCI_VENDOR_ID_CHELSIO) return (ENXIO); /* Attach only to PF0 of the FPGA */ if (d == 0xa000 && f != 0) return (ENXIO); for (i = 0; i < nitems(t4_pciids); i++) { if (d == t4_pciids[i].device) { device_set_desc(dev, t4_pciids[i].desc); return (BUS_PROBE_DEFAULT); } } return (ENXIO); } static int t5_probe(device_t dev) { int i; uint16_t v = pci_get_vendor(dev); uint16_t d = pci_get_device(dev); uint8_t f = pci_get_function(dev); if (v != PCI_VENDOR_ID_CHELSIO) return (ENXIO); /* Attach only to PF0 of the FPGA */ if (d == 0xb000 && f != 0) return (ENXIO); for (i = 0; i < nitems(t5_pciids); i++) { if (d == t5_pciids[i].device) { device_set_desc(dev, t5_pciids[i].desc); return (BUS_PROBE_DEFAULT); } } return (ENXIO); } static void t5_attribute_workaround(device_t dev) { device_t root_port; uint32_t v; /* * The T5 chips do not properly echo the No Snoop and Relaxed * Ordering attributes when replying to a TLP from a Root * Port. As a workaround, find the parent Root Port and * disable No Snoop and Relaxed Ordering. Note that this * affects all devices under this root port. 
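 *
 * (pcie_adjust_config() returns the previous value of the register
 * it rewrites, so the check below only logs the change when one of
 * the two bits was actually set beforehand.)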
*/ root_port = pci_find_pcie_root_port(dev); if (root_port == NULL) { device_printf(dev, "Unable to find parent root port\n"); return; } v = pcie_adjust_config(root_port, PCIER_DEVICE_CTL, PCIEM_CTL_RELAXED_ORD_ENABLE | PCIEM_CTL_NOSNOOP_ENABLE, 0, 2); if ((v & (PCIEM_CTL_RELAXED_ORD_ENABLE | PCIEM_CTL_NOSNOOP_ENABLE)) != 0) device_printf(dev, "Disabled No Snoop/Relaxed Ordering on %s\n", device_get_nameunit(root_port)); } static int t4_attach(device_t dev) { struct adapter *sc; int rc = 0, i, j, n10g, n1g, rqidx, tqidx; struct make_dev_args mda; struct intrs_and_queues iaq; struct sge *s; uint8_t *buf; #ifdef TCP_OFFLOAD int ofld_rqidx, ofld_tqidx; #endif #ifdef DEV_NETMAP int nm_rqidx, nm_tqidx; #endif int num_vis; sc = device_get_softc(dev); sc->dev = dev; TUNABLE_INT_FETCH("hw.cxgbe.debug_flags", &sc->debug_flags); if ((pci_get_device(dev) & 0xff00) == 0x5400) t5_attribute_workaround(dev); pci_enable_busmaster(dev); if (pci_find_cap(dev, PCIY_EXPRESS, &i) == 0) { uint32_t v; pci_set_max_read_req(dev, 4096); v = pci_read_config(dev, i + PCIER_DEVICE_CTL, 2); v |= PCIEM_CTL_RELAXED_ORD_ENABLE; pci_write_config(dev, i + PCIER_DEVICE_CTL, v, 2); sc->params.pci.mps = 128 << ((v & PCIEM_CTL_MAX_PAYLOAD) >> 5); } sc->sge_gts_reg = MYPF_REG(A_SGE_PF_GTS); sc->sge_kdoorbell_reg = MYPF_REG(A_SGE_PF_KDOORBELL); sc->traceq = -1; mtx_init(&sc->ifp_lock, sc->ifp_lockname, 0, MTX_DEF); snprintf(sc->ifp_lockname, sizeof(sc->ifp_lockname), "%s tracer", device_get_nameunit(dev)); snprintf(sc->lockname, sizeof(sc->lockname), "%s", device_get_nameunit(dev)); mtx_init(&sc->sc_lock, sc->lockname, 0, MTX_DEF); t4_add_adapter(sc); mtx_init(&sc->sfl_lock, "starving freelists", 0, MTX_DEF); TAILQ_INIT(&sc->sfl); callout_init_mtx(&sc->sfl_callout, &sc->sfl_lock, 0); mtx_init(&sc->reg_lock, "indirect register access", 0, MTX_DEF); rc = t4_map_bars_0_and_4(sc); if (rc != 0) goto done; /* error message displayed already */ /* * This is the real PF# to which we're attaching. Works from within PCI * passthrough environments too, where pci_get_function() could return a * different PF# depending on the passthrough configuration. We need to * use the real PF# in all our communication with the firmware. */ sc->pf = G_SOURCEPF(t4_read_reg(sc, A_PL_WHOAMI)); sc->mbox = sc->pf; memset(sc->chan_map, 0xff, sizeof(sc->chan_map)); /* Prepare the adapter for operation. */ buf = malloc(PAGE_SIZE, M_CXGBE, M_ZERO | M_WAITOK); rc = -t4_prep_adapter(sc, buf); free(buf, M_CXGBE); if (rc != 0) { device_printf(dev, "failed to prepare adapter: %d.\n", rc); goto done; } /* * Do this really early, with the memory windows set up even before the * character device. The userland tool's register i/o and mem read * will work even in "recovery mode". */ setup_memwin(sc); if (t4_init_devlog_params(sc, 0) == 0) fixup_devlog_params(sc); make_dev_args_init(&mda); mda.mda_devsw = &t4_cdevsw; mda.mda_uid = UID_ROOT; mda.mda_gid = GID_WHEEL; mda.mda_mode = 0600; mda.mda_si_drv1 = sc; rc = make_dev_s(&mda, &sc->cdev, "%s", device_get_nameunit(dev)); if (rc != 0) device_printf(dev, "failed to create nexus char device: %d.\n", rc); /* Go no further if recovery mode has been requested. 
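 * (Recovery mode is requested via the hw.cxgbe.sos tunable checked
 * below, e.g. hw.cxgbe.sos="1" in /boot/loader.conf.)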
*/ if (TUNABLE_INT_FETCH("hw.cxgbe.sos", &i) && i != 0) { device_printf(dev, "recovery mode.\n"); goto done; } #if defined(__i386__) if ((cpu_feature & CPUID_CX8) == 0) { device_printf(dev, "64 bit atomics not available.\n"); rc = ENOTSUP; goto done; } #endif /* Prepare the firmware for operation */ rc = prep_firmware(sc); if (rc != 0) goto done; /* error message displayed already */ rc = get_params__post_init(sc); if (rc != 0) goto done; /* error message displayed already */ rc = set_params__post_init(sc); if (rc != 0) goto done; /* error message displayed already */ rc = t4_map_bar_2(sc); if (rc != 0) goto done; /* error message displayed already */ rc = t4_create_dma_tag(sc); if (rc != 0) goto done; /* error message displayed already */ /* * Number of VIs to create per-port. The first VI is the "main" regular * VI for the port. The rest are additional virtual interfaces on the * same physical port. Note that the main VI does not have native * netmap support but the extra VIs do. * * Limit the number of VIs per port to the number of available * MAC addresses per port. */ if (t4_num_vis >= 1) num_vis = t4_num_vis; else num_vis = 1; if (num_vis > nitems(vi_mac_funcs)) { num_vis = nitems(vi_mac_funcs); device_printf(dev, "Number of VIs limited to %d\n", num_vis); } /* * First pass over all the ports - allocate VIs and initialize some * basic parameters like mac address, port type, etc. We also figure * out whether a port is 10G or 1G and use that information when * calculating how many interrupts to attempt to allocate. */ n10g = n1g = 0; for_each_port(sc, i) { struct port_info *pi; pi = malloc(sizeof(*pi), M_CXGBE, M_ZERO | M_WAITOK); sc->port[i] = pi; /* These must be set before t4_port_init */ pi->adapter = sc; pi->port_id = i; /* * XXX: vi[0] is special so we can't delay this allocation until * pi->nvi's final value is known. */ pi->vi = malloc(sizeof(struct vi_info) * num_vis, M_CXGBE, M_ZERO | M_WAITOK); /* * Allocate the "main" VI and initialize parameters * like mac addr. */ rc = -t4_port_init(sc, sc->mbox, sc->pf, 0, i); if (rc != 0) { device_printf(dev, "unable to initialize port %d: %d\n", i, rc); free(pi->vi, M_CXGBE); free(pi, M_CXGBE); sc->port[i] = NULL; goto done; } pi->link_cfg.requested_fc &= ~(PAUSE_TX | PAUSE_RX); pi->link_cfg.requested_fc |= t4_pause_settings; pi->link_cfg.fc &= ~(PAUSE_TX | PAUSE_RX); pi->link_cfg.fc |= t4_pause_settings; rc = -t4_link_l1cfg(sc, sc->mbox, pi->tx_chan, &pi->link_cfg); if (rc != 0) { device_printf(dev, "port %d l1cfg failed: %d\n", i, rc); free(pi->vi, M_CXGBE); free(pi, M_CXGBE); sc->port[i] = NULL; goto done; } snprintf(pi->lockname, sizeof(pi->lockname), "%sp%d", device_get_nameunit(dev), i); mtx_init(&pi->pi_lock, pi->lockname, 0, MTX_DEF); sc->chan_map[pi->tx_chan] = i; pi->tc = malloc(sizeof(struct tx_sched_class) * sc->chip_params->nsched_cls, M_CXGBE, M_ZERO | M_WAITOK); if (is_10G_port(pi) || is_40G_port(pi)) { n10g++; } else { n1g++; } pi->linkdnrc = -1; pi->dev = device_add_child(dev, is_t4(sc) ? "cxgbe" : "cxl", -1); if (pi->dev == NULL) { device_printf(dev, "failed to add device for port %d.\n", i); rc = ENXIO; goto done; } pi->vi[0].dev = pi->dev; device_set_softc(pi->dev, pi); } /* * Interrupt type, # of interrupts, # of rx/tx queues, etc. 
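 *
 * The totals computed below are simple sums; e.g. a dual-port 10G
 * adapter with the compile-time defaults (16 txq, 8 rxq per 10G
 * port), one VI per port, and no trimming by the interrupt
 * allocation ends up with
 *
 *	s->ntxq = 2 * 16 = 32, s->nrxq = 2 * 8 = 16,
 *
 * plus a control queue per port, one management queue, and the
 * firmware event queue on top of that.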
*/ rc = cfg_itype_and_nqueues(sc, n10g, n1g, num_vis, &iaq); if (rc != 0) goto done; /* error message displayed already */ if (iaq.nrxq_vi + iaq.nofldrxq_vi + iaq.nnmrxq_vi == 0) num_vis = 1; sc->intr_type = iaq.intr_type; sc->intr_count = iaq.nirq; s = &sc->sge; s->nrxq = n10g * iaq.nrxq10g + n1g * iaq.nrxq1g; s->ntxq = n10g * iaq.ntxq10g + n1g * iaq.ntxq1g; if (num_vis > 1) { s->nrxq += (n10g + n1g) * (num_vis - 1) * iaq.nrxq_vi; s->ntxq += (n10g + n1g) * (num_vis - 1) * iaq.ntxq_vi; } s->neq = s->ntxq + s->nrxq; /* the free list in an rxq is an eq */ s->neq += sc->params.nports + 1;/* ctrl queues: 1 per port + 1 mgmt */ s->niq = s->nrxq + 1; /* 1 extra for firmware event queue */ #ifdef TCP_OFFLOAD if (is_offload(sc)) { s->nofldrxq = n10g * iaq.nofldrxq10g + n1g * iaq.nofldrxq1g; s->nofldtxq = n10g * iaq.nofldtxq10g + n1g * iaq.nofldtxq1g; if (num_vis > 1) { s->nofldrxq += (n10g + n1g) * (num_vis - 1) * iaq.nofldrxq_vi; s->nofldtxq += (n10g + n1g) * (num_vis - 1) * iaq.nofldtxq_vi; } s->neq += s->nofldtxq + s->nofldrxq; s->niq += s->nofldrxq; s->ofld_rxq = malloc(s->nofldrxq * sizeof(struct sge_ofld_rxq), M_CXGBE, M_ZERO | M_WAITOK); s->ofld_txq = malloc(s->nofldtxq * sizeof(struct sge_wrq), M_CXGBE, M_ZERO | M_WAITOK); } #endif #ifdef DEV_NETMAP if (num_vis > 1) { s->nnmrxq = (n10g + n1g) * (num_vis - 1) * iaq.nnmrxq_vi; s->nnmtxq = (n10g + n1g) * (num_vis - 1) * iaq.nnmtxq_vi; } s->neq += s->nnmtxq + s->nnmrxq; s->niq += s->nnmrxq; s->nm_rxq = malloc(s->nnmrxq * sizeof(struct sge_nm_rxq), M_CXGBE, M_ZERO | M_WAITOK); s->nm_txq = malloc(s->nnmtxq * sizeof(struct sge_nm_txq), M_CXGBE, M_ZERO | M_WAITOK); #endif s->ctrlq = malloc(sc->params.nports * sizeof(struct sge_wrq), M_CXGBE, M_ZERO | M_WAITOK); s->rxq = malloc(s->nrxq * sizeof(struct sge_rxq), M_CXGBE, M_ZERO | M_WAITOK); s->txq = malloc(s->ntxq * sizeof(struct sge_txq), M_CXGBE, M_ZERO | M_WAITOK); s->iqmap = malloc(s->niq * sizeof(struct sge_iq *), M_CXGBE, M_ZERO | M_WAITOK); s->eqmap = malloc(s->neq * sizeof(struct sge_eq *), M_CXGBE, M_ZERO | M_WAITOK); sc->irq = malloc(sc->intr_count * sizeof(struct irq), M_CXGBE, M_ZERO | M_WAITOK); t4_init_l2t(sc, M_WAITOK); /* * Second pass over the ports. This time we know the number of rx and * tx queues that each port should get. */ rqidx = tqidx = 0; #ifdef TCP_OFFLOAD ofld_rqidx = ofld_tqidx = 0; #endif #ifdef DEV_NETMAP nm_rqidx = nm_tqidx = 0; #endif for_each_port(sc, i) { struct port_info *pi = sc->port[i]; struct vi_info *vi; if (pi == NULL) continue; pi->nvi = num_vis; for_each_vi(pi, j, vi) { vi->pi = pi; vi->qsize_rxq = t4_qsize_rxq; vi->qsize_txq = t4_qsize_txq; vi->first_rxq = rqidx; vi->first_txq = tqidx; if (is_10G_port(pi) || is_40G_port(pi)) { vi->tmr_idx = t4_tmr_idx_10g; vi->pktc_idx = t4_pktc_idx_10g; vi->flags |= iaq.intr_flags_10g & INTR_RXQ; vi->nrxq = j == 0 ? iaq.nrxq10g : iaq.nrxq_vi; vi->ntxq = j == 0 ? iaq.ntxq10g : iaq.ntxq_vi; } else { vi->tmr_idx = t4_tmr_idx_1g; vi->pktc_idx = t4_pktc_idx_1g; vi->flags |= iaq.intr_flags_1g & INTR_RXQ; vi->nrxq = j == 0 ? iaq.nrxq1g : iaq.nrxq_vi; vi->ntxq = j == 0 ? iaq.ntxq1g : iaq.ntxq_vi; } rqidx += vi->nrxq; tqidx += vi->ntxq; if (j == 0 && vi->ntxq > 1) vi->rsrv_noflowq = iaq.rsrv_noflowq ? 1 : 0; else vi->rsrv_noflowq = 0; #ifdef TCP_OFFLOAD vi->first_ofld_rxq = ofld_rqidx; vi->first_ofld_txq = ofld_tqidx; if (is_10G_port(pi) || is_40G_port(pi)) { vi->flags |= iaq.intr_flags_10g & INTR_OFLD_RXQ; vi->nofldrxq = j == 0 ? iaq.nofldrxq10g : iaq.nofldrxq_vi; vi->nofldtxq = j == 0 ? 
iaq.nofldtxq10g : iaq.nofldtxq_vi; } else { vi->flags |= iaq.intr_flags_1g & INTR_OFLD_RXQ; vi->nofldrxq = j == 0 ? iaq.nofldrxq1g : iaq.nofldrxq_vi; vi->nofldtxq = j == 0 ? iaq.nofldtxq1g : iaq.nofldtxq_vi; } ofld_rqidx += vi->nofldrxq; ofld_tqidx += vi->nofldtxq; #endif #ifdef DEV_NETMAP if (j > 0) { vi->first_nm_rxq = nm_rqidx; vi->first_nm_txq = nm_tqidx; vi->nnmrxq = iaq.nnmrxq_vi; vi->nnmtxq = iaq.nnmtxq_vi; nm_rqidx += vi->nnmrxq; nm_tqidx += vi->nnmtxq; } #endif } } rc = t4_setup_intr_handlers(sc); if (rc != 0) { device_printf(dev, "failed to setup interrupt handlers: %d\n", rc); goto done; } rc = bus_generic_attach(dev); if (rc != 0) { device_printf(dev, "failed to attach all child ports: %d\n", rc); goto done; } device_printf(dev, "PCIe gen%d x%d, %d ports, %d %s interrupt%s, %d eq, %d iq\n", sc->params.pci.speed, sc->params.pci.width, sc->params.nports, sc->intr_count, sc->intr_type == INTR_MSIX ? "MSI-X" : (sc->intr_type == INTR_MSI ? "MSI" : "INTx"), sc->intr_count > 1 ? "s" : "", sc->sge.neq, sc->sge.niq); t4_set_desc(sc); notify_siblings(dev, 0); done: if (rc != 0 && sc->cdev) { /* cdev was created and so cxgbetool works; recover that way. */ device_printf(dev, "error during attach, adapter is now in recovery mode.\n"); rc = 0; } if (rc != 0) t4_detach_common(dev); else t4_sysctls(sc); return (rc); } static int t4_ready(device_t dev) { struct adapter *sc; sc = device_get_softc(dev); if (sc->flags & FW_OK) return (0); return (ENXIO); } static int t4_read_port_device(device_t dev, int port, device_t *child) { struct adapter *sc; struct port_info *pi; sc = device_get_softc(dev); if (port < 0 || port >= MAX_NPORTS) return (EINVAL); pi = sc->port[port]; if (pi == NULL || pi->dev == NULL) return (ENXIO); *child = pi->dev; return (0); } static int notify_siblings(device_t dev, int detaching) { device_t sibling; int error, i; error = 0; for (i = 0; i < PCI_FUNCMAX; i++) { if (i == pci_get_function(dev)) continue; sibling = pci_find_dbsf(pci_get_domain(dev), pci_get_bus(dev), pci_get_slot(dev), i); if (sibling == NULL || !device_is_attached(sibling)) continue; if (detaching) error = T4_DETACH_CHILD(sibling); else (void)T4_ATTACH_CHILD(sibling); if (error) break; } return (error); } /* * Idempotent */ static int t4_detach(device_t dev) { struct adapter *sc; int rc; sc = device_get_softc(dev); rc = notify_siblings(dev, 1); if (rc) { device_printf(dev, "failed to detach sibling devices: %d\n", rc); return (rc); } return (t4_detach_common(dev)); } int t4_detach_common(device_t dev) { struct adapter *sc; struct port_info *pi; int i, rc; sc = device_get_softc(dev); if (sc->flags & FULL_INIT_DONE) { if (!(sc->flags & IS_VF)) t4_intr_disable(sc); } if (sc->cdev) { destroy_dev(sc->cdev); sc->cdev = NULL; } if (device_is_attached(dev)) { rc = bus_generic_detach(dev); if (rc) { device_printf(dev, "failed to detach child devices: %d\n", rc); return (rc); } } for (i = 0; i < sc->intr_count; i++) t4_free_irq(sc, &sc->irq[i]); for (i = 0; i < MAX_NPORTS; i++) { pi = sc->port[i]; if (pi) { t4_free_vi(sc, sc->mbox, sc->pf, 0, pi->vi[0].viid); if (pi->dev) device_delete_child(dev, pi->dev); mtx_destroy(&pi->pi_lock); free(pi->vi, M_CXGBE); free(pi->tc, M_CXGBE); free(pi, M_CXGBE); } } if (sc->flags & FULL_INIT_DONE) adapter_full_uninit(sc); if ((sc->flags & (IS_VF | FW_OK)) == FW_OK) t4_fw_bye(sc, sc->mbox); if (sc->intr_type == INTR_MSI || sc->intr_type == INTR_MSIX) pci_release_msi(dev); if (sc->regs_res) bus_release_resource(dev, SYS_RES_MEMORY, sc->regs_rid, sc->regs_res); if (sc->udbs_res) 
bus_release_resource(dev, SYS_RES_MEMORY, sc->udbs_rid, sc->udbs_res); if (sc->msix_res) bus_release_resource(dev, SYS_RES_MEMORY, sc->msix_rid, sc->msix_res); if (sc->l2t) t4_free_l2t(sc->l2t); #ifdef TCP_OFFLOAD free(sc->sge.ofld_rxq, M_CXGBE); free(sc->sge.ofld_txq, M_CXGBE); #endif #ifdef DEV_NETMAP free(sc->sge.nm_rxq, M_CXGBE); free(sc->sge.nm_txq, M_CXGBE); #endif free(sc->irq, M_CXGBE); free(sc->sge.rxq, M_CXGBE); free(sc->sge.txq, M_CXGBE); free(sc->sge.ctrlq, M_CXGBE); free(sc->sge.iqmap, M_CXGBE); free(sc->sge.eqmap, M_CXGBE); free(sc->tids.ftid_tab, M_CXGBE); t4_destroy_dma_tag(sc); if (mtx_initialized(&sc->sc_lock)) { sx_xlock(&t4_list_lock); SLIST_REMOVE(&t4_list, sc, adapter, link); sx_xunlock(&t4_list_lock); mtx_destroy(&sc->sc_lock); } callout_drain(&sc->sfl_callout); if (mtx_initialized(&sc->tids.ftid_lock)) mtx_destroy(&sc->tids.ftid_lock); if (mtx_initialized(&sc->sfl_lock)) mtx_destroy(&sc->sfl_lock); if (mtx_initialized(&sc->ifp_lock)) mtx_destroy(&sc->ifp_lock); if (mtx_initialized(&sc->reg_lock)) mtx_destroy(&sc->reg_lock); for (i = 0; i < NUM_MEMWIN; i++) { struct memwin *mw = &sc->memwin[i]; if (rw_initialized(&mw->mw_lock)) rw_destroy(&mw->mw_lock); } bzero(sc, sizeof(*sc)); return (0); } static int cxgbe_probe(device_t dev) { char buf[128]; struct port_info *pi = device_get_softc(dev); snprintf(buf, sizeof(buf), "port %d", pi->port_id); device_set_desc_copy(dev, buf); return (BUS_PROBE_DEFAULT); } #define T4_CAP (IFCAP_VLAN_HWTAGGING | IFCAP_VLAN_MTU | IFCAP_HWCSUM | \ IFCAP_VLAN_HWCSUM | IFCAP_TSO | IFCAP_JUMBO_MTU | IFCAP_LRO | \ IFCAP_VLAN_HWTSO | IFCAP_LINKSTATE | IFCAP_HWCSUM_IPV6 | IFCAP_HWSTATS) #define T4_CAP_ENABLE (T4_CAP) static int cxgbe_vi_attach(device_t dev, struct vi_info *vi) { struct ifnet *ifp; struct sbuf *sb; vi->xact_addr_filt = -1; callout_init(&vi->tick, 1); /* Allocate an ifnet and set it up */ ifp = if_alloc(IFT_ETHER); if (ifp == NULL) { device_printf(dev, "Cannot allocate ifnet\n"); return (ENOMEM); } vi->ifp = ifp; ifp->if_softc = vi; if_initname(ifp, device_get_name(dev), device_get_unit(dev)); ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST; ifp->if_init = cxgbe_init; ifp->if_ioctl = cxgbe_ioctl; ifp->if_transmit = cxgbe_transmit; ifp->if_qflush = cxgbe_qflush; ifp->if_get_counter = cxgbe_get_counter; ifp->if_capabilities = T4_CAP; #ifdef TCP_OFFLOAD if (vi->nofldrxq != 0) ifp->if_capabilities |= IFCAP_TOE; #endif ifp->if_capenable = T4_CAP_ENABLE; ifp->if_hwassist = CSUM_TCP | CSUM_UDP | CSUM_IP | CSUM_TSO | CSUM_UDP_IPV6 | CSUM_TCP_IPV6; ifp->if_hw_tsomax = 65536 - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN); ifp->if_hw_tsomaxsegcount = TX_SGL_SEGS; ifp->if_hw_tsomaxsegsize = 65536; /* Initialize ifmedia for this VI */ ifmedia_init(&vi->media, IFM_IMASK, cxgbe_media_change, cxgbe_media_status); build_medialist(vi->pi, &vi->media); vi->vlan_c = EVENTHANDLER_REGISTER(vlan_config, cxgbe_vlan_config, ifp, EVENTHANDLER_PRI_ANY); ether_ifattach(ifp, vi->hw_addr); #ifdef DEV_NETMAP if (vi->nnmrxq != 0) cxgbe_nm_attach(vi); #endif sb = sbuf_new_auto(); sbuf_printf(sb, "%d txq, %d rxq (NIC)", vi->ntxq, vi->nrxq); #ifdef TCP_OFFLOAD if (ifp->if_capabilities & IFCAP_TOE) sbuf_printf(sb, "; %d txq, %d rxq (TOE)", vi->nofldtxq, vi->nofldrxq); #endif #ifdef DEV_NETMAP if (ifp->if_capabilities & IFCAP_NETMAP) sbuf_printf(sb, "; %d txq, %d rxq (netmap)", vi->nnmtxq, vi->nnmrxq); #endif sbuf_finish(sb); device_printf(dev, "%s\n", sbuf_data(sb)); sbuf_delete(sb); vi_sysctls(vi); return (0); } static int cxgbe_attach(device_t dev) { struct 
port_info *pi = device_get_softc(dev); struct vi_info *vi; int i, rc; callout_init_mtx(&pi->tick, &pi->pi_lock, 0); rc = cxgbe_vi_attach(dev, &pi->vi[0]); if (rc) return (rc); for_each_vi(pi, i, vi) { if (i == 0) continue; vi->dev = device_add_child(dev, is_t4(pi->adapter) ? "vcxgbe" : "vcxl", -1); if (vi->dev == NULL) { device_printf(dev, "failed to add VI %d\n", i); continue; } device_set_softc(vi->dev, vi); } cxgbe_sysctls(pi); bus_generic_attach(dev); return (0); } static void cxgbe_vi_detach(struct vi_info *vi) { struct ifnet *ifp = vi->ifp; ether_ifdetach(ifp); if (vi->vlan_c) EVENTHANDLER_DEREGISTER(vlan_config, vi->vlan_c); /* Let detach proceed even if these fail. */ #ifdef DEV_NETMAP if (ifp->if_capabilities & IFCAP_NETMAP) cxgbe_nm_detach(vi); #endif cxgbe_uninit_synchronized(vi); callout_drain(&vi->tick); vi_full_uninit(vi); ifmedia_removeall(&vi->media); if_free(vi->ifp); vi->ifp = NULL; } static int cxgbe_detach(device_t dev) { struct port_info *pi = device_get_softc(dev); struct adapter *sc = pi->adapter; int rc; /* Detach the extra VIs first. */ rc = bus_generic_detach(dev); if (rc) return (rc); device_delete_children(dev); doom_vi(sc, &pi->vi[0]); if (pi->flags & HAS_TRACEQ) { sc->traceq = -1; /* cloner should not create ifnet */ t4_tracer_port_detach(sc); } cxgbe_vi_detach(&pi->vi[0]); callout_drain(&pi->tick); end_synchronized_op(sc, 0); return (0); } static void cxgbe_init(void *arg) { struct vi_info *vi = arg; struct adapter *sc = vi->pi->adapter; if (begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4init") != 0) return; cxgbe_init_synchronized(vi); end_synchronized_op(sc, 0); } static int cxgbe_ioctl(struct ifnet *ifp, unsigned long cmd, caddr_t data) { int rc = 0, mtu, flags, can_sleep; struct vi_info *vi = ifp->if_softc; struct adapter *sc = vi->pi->adapter; struct ifreq *ifr = (struct ifreq *)data; uint32_t mask; switch (cmd) { case SIOCSIFMTU: mtu = ifr->ifr_mtu; if ((mtu < ETHERMIN) || (mtu > ETHERMTU_JUMBO)) return (EINVAL); rc = begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4mtu"); if (rc) return (rc); ifp->if_mtu = mtu; if (vi->flags & VI_INIT_DONE) { t4_update_fl_bufsize(ifp); if (ifp->if_drv_flags & IFF_DRV_RUNNING) rc = update_mac_settings(ifp, XGMAC_MTU); } end_synchronized_op(sc, 0); break; case SIOCSIFFLAGS: can_sleep = 0; redo_sifflags: rc = begin_synchronized_op(sc, vi, can_sleep ? (SLEEP_OK | INTR_OK) : HOLD_LOCK, "t4flg"); if (rc) return (rc); if (ifp->if_flags & IFF_UP) { if (ifp->if_drv_flags & IFF_DRV_RUNNING) { flags = vi->if_flags; if ((ifp->if_flags ^ flags) & (IFF_PROMISC | IFF_ALLMULTI)) { if (can_sleep == 1) { end_synchronized_op(sc, 0); can_sleep = 0; goto redo_sifflags; } rc = update_mac_settings(ifp, XGMAC_PROMISC | XGMAC_ALLMULTI); } } else { if (can_sleep == 0) { end_synchronized_op(sc, LOCK_HELD); can_sleep = 1; goto redo_sifflags; } rc = cxgbe_init_synchronized(vi); } vi->if_flags = ifp->if_flags; } else if (ifp->if_drv_flags & IFF_DRV_RUNNING) { if (can_sleep == 0) { end_synchronized_op(sc, LOCK_HELD); can_sleep = 1; goto redo_sifflags; } rc = cxgbe_uninit_synchronized(vi); } end_synchronized_op(sc, can_sleep ? 
0 : LOCK_HELD); break; case SIOCADDMULTI: case SIOCDELMULTI: /* these two are called with a mutex held :-( */ rc = begin_synchronized_op(sc, vi, HOLD_LOCK, "t4multi"); if (rc) return (rc); if (ifp->if_drv_flags & IFF_DRV_RUNNING) rc = update_mac_settings(ifp, XGMAC_MCADDRS); end_synchronized_op(sc, LOCK_HELD); break; case SIOCSIFCAP: rc = begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4cap"); if (rc) return (rc); mask = ifr->ifr_reqcap ^ ifp->if_capenable; if (mask & IFCAP_TXCSUM) { ifp->if_capenable ^= IFCAP_TXCSUM; ifp->if_hwassist ^= (CSUM_TCP | CSUM_UDP | CSUM_IP); if (IFCAP_TSO4 & ifp->if_capenable && !(IFCAP_TXCSUM & ifp->if_capenable)) { ifp->if_capenable &= ~IFCAP_TSO4; if_printf(ifp, "tso4 disabled due to -txcsum.\n"); } } if (mask & IFCAP_TXCSUM_IPV6) { ifp->if_capenable ^= IFCAP_TXCSUM_IPV6; ifp->if_hwassist ^= (CSUM_UDP_IPV6 | CSUM_TCP_IPV6); if (IFCAP_TSO6 & ifp->if_capenable && !(IFCAP_TXCSUM_IPV6 & ifp->if_capenable)) { ifp->if_capenable &= ~IFCAP_TSO6; if_printf(ifp, "tso6 disabled due to -txcsum6.\n"); } } if (mask & IFCAP_RXCSUM) ifp->if_capenable ^= IFCAP_RXCSUM; if (mask & IFCAP_RXCSUM_IPV6) ifp->if_capenable ^= IFCAP_RXCSUM_IPV6; /* * Note that we leave CSUM_TSO alone (it is always set). The * kernel takes both IFCAP_TSOx and CSUM_TSO into account before * sending a TSO request our way, so it's sufficient to toggle * IFCAP_TSOx only. */ if (mask & IFCAP_TSO4) { if (!(IFCAP_TSO4 & ifp->if_capenable) && !(IFCAP_TXCSUM & ifp->if_capenable)) { if_printf(ifp, "enable txcsum first.\n"); rc = EAGAIN; goto fail; } ifp->if_capenable ^= IFCAP_TSO4; } if (mask & IFCAP_TSO6) { if (!(IFCAP_TSO6 & ifp->if_capenable) && !(IFCAP_TXCSUM_IPV6 & ifp->if_capenable)) { if_printf(ifp, "enable txcsum6 first.\n"); rc = EAGAIN; goto fail; } ifp->if_capenable ^= IFCAP_TSO6; } if (mask & IFCAP_LRO) { #if defined(INET) || defined(INET6) int i; struct sge_rxq *rxq; ifp->if_capenable ^= IFCAP_LRO; for_each_rxq(vi, i, rxq) { if (ifp->if_capenable & IFCAP_LRO) rxq->iq.flags |= IQ_LRO_ENABLED; else rxq->iq.flags &= ~IQ_LRO_ENABLED; } #endif } #ifdef TCP_OFFLOAD if (mask & IFCAP_TOE) { int enable = (ifp->if_capenable ^ mask) & IFCAP_TOE; rc = toe_capability(vi, enable); if (rc != 0) goto fail; ifp->if_capenable ^= mask; } #endif if (mask & IFCAP_VLAN_HWTAGGING) { ifp->if_capenable ^= IFCAP_VLAN_HWTAGGING; if (ifp->if_drv_flags & IFF_DRV_RUNNING) rc = update_mac_settings(ifp, XGMAC_VLANEX); } if (mask & IFCAP_VLAN_MTU) { ifp->if_capenable ^= IFCAP_VLAN_MTU; /* Need to find out how to disable auto-mtu-inflation */ } if (mask & IFCAP_VLAN_HWTSO) ifp->if_capenable ^= IFCAP_VLAN_HWTSO; if (mask & IFCAP_VLAN_HWCSUM) ifp->if_capenable ^= IFCAP_VLAN_HWCSUM; #ifdef VLAN_CAPABILITIES VLAN_CAPABILITIES(ifp); #endif fail: end_synchronized_op(sc, 0); break; case SIOCSIFMEDIA: case SIOCGIFMEDIA: ifmedia_ioctl(ifp, ifr, &vi->media, cmd); break; case SIOCGI2C: { struct ifi2creq i2c; rc = copyin(ifr->ifr_data, &i2c, sizeof(i2c)); if (rc != 0) break; if (i2c.dev_addr != 0xA0 && i2c.dev_addr != 0xA2) { rc = EPERM; break; } if (i2c.len > sizeof(i2c.data)) { rc = EINVAL; break; } rc = begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4i2c"); if (rc) return (rc); rc = -t4_i2c_rd(sc, sc->mbox, vi->pi->port_id, i2c.dev_addr, i2c.offset, i2c.len, &i2c.data[0]); end_synchronized_op(sc, 0); if (rc == 0) rc = copyout(&i2c, ifr->ifr_data, sizeof(i2c)); break; } default: rc = ether_ioctl(ifp, cmd, data); } return (rc); } static int cxgbe_transmit(struct ifnet *ifp, struct mbuf *m) { struct vi_info *vi = ifp->if_softc; struct 
port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct sge_txq *txq; void *items[1]; int rc; M_ASSERTPKTHDR(m); MPASS(m->m_nextpkt == NULL); /* not quite ready for this yet */ if (__predict_false(pi->link_cfg.link_ok == 0)) { m_freem(m); return (ENETDOWN); } rc = parse_pkt(sc, &m); if (__predict_false(rc != 0)) { MPASS(m == NULL); /* was freed already */ atomic_add_int(&pi->tx_parse_error, 1); /* rare, atomic is ok */ return (rc); } /* Select a txq. */ txq = &sc->sge.txq[vi->first_txq]; if (M_HASHTYPE_GET(m) != M_HASHTYPE_NONE) txq += ((m->m_pkthdr.flowid % (vi->ntxq - vi->rsrv_noflowq)) + vi->rsrv_noflowq); items[0] = m; rc = mp_ring_enqueue(txq->r, items, 1, 4096); if (__predict_false(rc != 0)) m_freem(m); return (rc); } static void cxgbe_qflush(struct ifnet *ifp) { struct vi_info *vi = ifp->if_softc; struct sge_txq *txq; int i; /* queues do not exist if !VI_INIT_DONE. */ if (vi->flags & VI_INIT_DONE) { for_each_txq(vi, i, txq) { TXQ_LOCK(txq); txq->eq.flags &= ~EQ_ENABLED; TXQ_UNLOCK(txq); while (!mp_ring_is_idle(txq->r)) { mp_ring_check_drainage(txq->r, 0); pause("qflush", 1); } } } if_qflush(ifp); } static uint64_t vi_get_counter(struct ifnet *ifp, ift_counter c) { struct vi_info *vi = ifp->if_softc; struct fw_vi_stats_vf *s = &vi->stats; vi_refresh_stats(vi->pi->adapter, vi); switch (c) { case IFCOUNTER_IPACKETS: return (s->rx_bcast_frames + s->rx_mcast_frames + s->rx_ucast_frames); case IFCOUNTER_IERRORS: return (s->rx_err_frames); case IFCOUNTER_OPACKETS: return (s->tx_bcast_frames + s->tx_mcast_frames + s->tx_ucast_frames + s->tx_offload_frames); case IFCOUNTER_OERRORS: return (s->tx_drop_frames); case IFCOUNTER_IBYTES: return (s->rx_bcast_bytes + s->rx_mcast_bytes + s->rx_ucast_bytes); case IFCOUNTER_OBYTES: return (s->tx_bcast_bytes + s->tx_mcast_bytes + s->tx_ucast_bytes + s->tx_offload_bytes); case IFCOUNTER_IMCASTS: return (s->rx_mcast_frames); case IFCOUNTER_OMCASTS: return (s->tx_mcast_frames); case IFCOUNTER_OQDROPS: { uint64_t drops; drops = 0; if (vi->flags & VI_INIT_DONE) { int i; struct sge_txq *txq; for_each_txq(vi, i, txq) drops += counter_u64_fetch(txq->r->drops); } return (drops); } default: return (if_get_counter_default(ifp, c)); } } uint64_t cxgbe_get_counter(struct ifnet *ifp, ift_counter c) { struct vi_info *vi = ifp->if_softc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct port_stats *s = &pi->stats; if (pi->nvi > 1 || sc->flags & IS_VF) return (vi_get_counter(ifp, c)); cxgbe_refresh_stats(sc, pi); switch (c) { case IFCOUNTER_IPACKETS: return (s->rx_frames); case IFCOUNTER_IERRORS: return (s->rx_jabber + s->rx_runt + s->rx_too_long + s->rx_fcs_err + s->rx_len_err); case IFCOUNTER_OPACKETS: return (s->tx_frames); case IFCOUNTER_OERRORS: return (s->tx_error_frames); case IFCOUNTER_IBYTES: return (s->rx_octets); case IFCOUNTER_OBYTES: return (s->tx_octets); case IFCOUNTER_IMCASTS: return (s->rx_mcast_frames); case IFCOUNTER_OMCASTS: return (s->tx_mcast_frames); case IFCOUNTER_IQDROPS: return (s->rx_ovflow0 + s->rx_ovflow1 + s->rx_ovflow2 + s->rx_ovflow3 + s->rx_trunc0 + s->rx_trunc1 + s->rx_trunc2 + s->rx_trunc3 + pi->tnl_cong_drops); case IFCOUNTER_OQDROPS: { uint64_t drops; drops = s->tx_drop; if (vi->flags & VI_INIT_DONE) { int i; struct sge_txq *txq; for_each_txq(vi, i, txq) drops += counter_u64_fetch(txq->r->drops); } return (drops); } default: return (if_get_counter_default(ifp, c)); } } static int cxgbe_media_change(struct ifnet *ifp) { struct vi_info *vi = ifp->if_softc; device_printf(vi->dev, "%s unimplemented.\n", 
__func__); return (EOPNOTSUPP); } static void cxgbe_media_status(struct ifnet *ifp, struct ifmediareq *ifmr) { struct vi_info *vi = ifp->if_softc; struct port_info *pi = vi->pi; struct ifmedia_entry *cur; int speed = pi->link_cfg.speed; cur = vi->media.ifm_cur; ifmr->ifm_status = IFM_AVALID; if (!pi->link_cfg.link_ok) return; ifmr->ifm_status |= IFM_ACTIVE; /* active and current will differ iff current media is autoselect. */ if (IFM_SUBTYPE(cur->ifm_media) != IFM_AUTO) return; ifmr->ifm_active = IFM_ETHER | IFM_FDX; if (speed == 10000) ifmr->ifm_active |= IFM_10G_T; else if (speed == 1000) ifmr->ifm_active |= IFM_1000_T; else if (speed == 100) ifmr->ifm_active |= IFM_100_TX; else if (speed == 10) ifmr->ifm_active |= IFM_10_T; else KASSERT(0, ("%s: link up but speed unknown (%u)", __func__, speed)); } static int vcxgbe_probe(device_t dev) { char buf[128]; struct vi_info *vi = device_get_softc(dev); snprintf(buf, sizeof(buf), "port %d vi %td", vi->pi->port_id, vi - vi->pi->vi); device_set_desc_copy(dev, buf); return (BUS_PROBE_DEFAULT); } static int vcxgbe_attach(device_t dev) { struct vi_info *vi; struct port_info *pi; struct adapter *sc; int func, index, rc; u32 param, val; vi = device_get_softc(dev); pi = vi->pi; sc = pi->adapter; index = vi - pi->vi; KASSERT(index < nitems(vi_mac_funcs), ("%s: VI %s doesn't have a MAC func", __func__, device_get_nameunit(dev))); func = vi_mac_funcs[index]; rc = t4_alloc_vi_func(sc, sc->mbox, pi->tx_chan, sc->pf, 0, 1, vi->hw_addr, &vi->rss_size, func, 0); if (rc < 0) { device_printf(dev, "Failed to allocate virtual interface " "for port %d: %d\n", pi->port_id, -rc); return (-rc); } vi->viid = rc; param = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_RSSINFO) | V_FW_PARAMS_PARAM_YZ(vi->viid); rc = t4_query_params(sc, sc->mbox, sc->pf, 0, 1, &param, &val); if (rc) vi->rss_base = 0xffff; else { /* MPASS((val >> 16) == rss_size); */ vi->rss_base = val & 0xffff; } rc = cxgbe_vi_attach(dev, vi); if (rc) { t4_free_vi(sc, sc->mbox, sc->pf, 0, vi->viid); return (rc); } return (0); } static int vcxgbe_detach(device_t dev) { struct vi_info *vi; struct adapter *sc; vi = device_get_softc(dev); sc = vi->pi->adapter; doom_vi(sc, vi); cxgbe_vi_detach(vi); t4_free_vi(sc, sc->mbox, sc->pf, 0, vi->viid); end_synchronized_op(sc, 0); return (0); } void t4_fatal_err(struct adapter *sc) { t4_set_reg_field(sc, A_SGE_CONTROL, F_GLOBALENABLE, 0); t4_intr_disable(sc); log(LOG_EMERG, "%s: encountered fatal error, adapter stopped.\n", device_get_nameunit(sc->dev)); } void t4_add_adapter(struct adapter *sc) { sx_xlock(&t4_list_lock); SLIST_INSERT_HEAD(&t4_list, sc, link); sx_xunlock(&t4_list_lock); } int t4_map_bars_0_and_4(struct adapter *sc) { sc->regs_rid = PCIR_BAR(0); sc->regs_res = bus_alloc_resource_any(sc->dev, SYS_RES_MEMORY, &sc->regs_rid, RF_ACTIVE); if (sc->regs_res == NULL) { device_printf(sc->dev, "cannot map registers.\n"); return (ENXIO); } sc->bt = rman_get_bustag(sc->regs_res); sc->bh = rman_get_bushandle(sc->regs_res); sc->mmio_len = rman_get_size(sc->regs_res); setbit(&sc->doorbells, DOORBELL_KDB); sc->msix_rid = PCIR_BAR(4); sc->msix_res = bus_alloc_resource_any(sc->dev, SYS_RES_MEMORY, &sc->msix_rid, RF_ACTIVE); if (sc->msix_res == NULL) { device_printf(sc->dev, "cannot map MSI-X BAR.\n"); return (ENXIO); } return (0); } int t4_map_bar_2(struct adapter *sc) { /* * T4: only iWARP driver uses the userspace doorbells. There is no need * to map it if RDMA is disabled.
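 *
 * T5 and later map it unconditionally; there the kernel can use
 * these doorbells too (see the write combining setup further down
 * in this function).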
*/ if (is_t4(sc) && sc->rdmacaps == 0) return (0); sc->udbs_rid = PCIR_BAR(2); sc->udbs_res = bus_alloc_resource_any(sc->dev, SYS_RES_MEMORY, &sc->udbs_rid, RF_ACTIVE); if (sc->udbs_res == NULL) { device_printf(sc->dev, "cannot map doorbell BAR.\n"); return (ENXIO); } sc->udbs_base = rman_get_virtual(sc->udbs_res); if (is_t5(sc)) { setbit(&sc->doorbells, DOORBELL_UDB); #if defined(__i386__) || defined(__amd64__) if (t5_write_combine) { int rc; /* * Enable write combining on BAR2. This is the * userspace doorbell BAR and is split into 128B * (UDBS_SEG_SIZE) doorbell regions, each associated * with an egress queue. The first 64B has the doorbell * and the second 64B can be used to submit a tx work * request with an implicit doorbell. */ rc = pmap_change_attr((vm_offset_t)sc->udbs_base, rman_get_size(sc->udbs_res), PAT_WRITE_COMBINING); if (rc == 0) { clrbit(&sc->doorbells, DOORBELL_UDB); setbit(&sc->doorbells, DOORBELL_WCWR); setbit(&sc->doorbells, DOORBELL_UDBWC); } else { device_printf(sc->dev, "couldn't enable write combining: %d\n", rc); } t4_write_reg(sc, A_SGE_STAT_CFG, V_STATSOURCE_T5(7) | V_STATMODE(0)); } #endif } return (0); } struct memwin_init { uint32_t base; uint32_t aperture; }; static const struct memwin_init t4_memwin[NUM_MEMWIN] = { { MEMWIN0_BASE, MEMWIN0_APERTURE }, { MEMWIN1_BASE, MEMWIN1_APERTURE }, { MEMWIN2_BASE_T4, MEMWIN2_APERTURE_T4 } }; static const struct memwin_init t5_memwin[NUM_MEMWIN] = { { MEMWIN0_BASE, MEMWIN0_APERTURE }, { MEMWIN1_BASE, MEMWIN1_APERTURE }, { MEMWIN2_BASE_T5, MEMWIN2_APERTURE_T5 }, }; static void setup_memwin(struct adapter *sc) { const struct memwin_init *mw_init; struct memwin *mw; int i; uint32_t bar0; if (is_t4(sc)) { /* * Read low 32b of bar0 indirectly via the hardware backdoor * mechanism. Works from within PCI passthrough environments * too, where rman_get_start() can return a different value. We * need to program the T4 memory window decoders with the actual * addresses that will be coming across the PCIe link. */ bar0 = t4_hw_pci_read_cfg4(sc, PCIR_BAR(0)); bar0 &= (uint32_t) PCIM_BAR_MEM_BASE; mw_init = &t4_memwin[0]; } else { /* T5+ use the relative offset inside the PCIe BAR */ bar0 = 0; mw_init = &t5_memwin[0]; } for (i = 0, mw = &sc->memwin[0]; i < NUM_MEMWIN; i++, mw_init++, mw++) { rw_init(&mw->mw_lock, "memory window access"); mw->mw_base = mw_init->base; mw->mw_aperture = mw_init->aperture; mw->mw_curpos = 0; t4_write_reg(sc, PCIE_MEM_ACCESS_REG(A_PCIE_MEM_ACCESS_BASE_WIN, i), (mw->mw_base + bar0) | V_BIR(0) | V_WINDOW(ilog2(mw->mw_aperture) - 10)); rw_wlock(&mw->mw_lock); position_memwin(sc, i, 0); rw_wunlock(&mw->mw_lock); } /* flush */ t4_read_reg(sc, PCIE_MEM_ACCESS_REG(A_PCIE_MEM_ACCESS_BASE_WIN, 2)); } /* * Positions the memory window at the given address in the card's address space. * There are some alignment requirements and the actual position may be at an * address prior to the requested address. mw->mw_curpos always has the actual * position of the window. 
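 *
 * For example, a request for address 0x12345 positions a T4 window
 * (16B alignment) at mw_curpos = 0x12340 and a T5 window (128B
 * alignment) at mw_curpos = 0x12300; accesses then go to
 * mw_base + (addr - mw_curpos) within the aperture.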
*/ static void position_memwin(struct adapter *sc, int idx, uint32_t addr) { struct memwin *mw; uint32_t pf; uint32_t reg; MPASS(idx >= 0 && idx < NUM_MEMWIN); mw = &sc->memwin[idx]; rw_assert(&mw->mw_lock, RA_WLOCKED); if (is_t4(sc)) { pf = 0; mw->mw_curpos = addr & ~0xf; /* start must be 16B aligned */ } else { pf = V_PFNUM(sc->pf); mw->mw_curpos = addr & ~0x7f; /* start must be 128B aligned */ } reg = PCIE_MEM_ACCESS_REG(A_PCIE_MEM_ACCESS_OFFSET, idx); t4_write_reg(sc, reg, mw->mw_curpos | pf); t4_read_reg(sc, reg); /* flush */ } static int rw_via_memwin(struct adapter *sc, int idx, uint32_t addr, uint32_t *val, int len, int rw) { struct memwin *mw; uint32_t mw_end, v; MPASS(idx >= 0 && idx < NUM_MEMWIN); /* Memory can only be accessed in naturally aligned 4 byte units */ if (addr & 3 || len & 3 || len <= 0) return (EINVAL); mw = &sc->memwin[idx]; while (len > 0) { rw_rlock(&mw->mw_lock); mw_end = mw->mw_curpos + mw->mw_aperture; if (addr >= mw_end || addr < mw->mw_curpos) { /* Will need to reposition the window */ if (!rw_try_upgrade(&mw->mw_lock)) { rw_runlock(&mw->mw_lock); rw_wlock(&mw->mw_lock); } rw_assert(&mw->mw_lock, RA_WLOCKED); position_memwin(sc, idx, addr); rw_downgrade(&mw->mw_lock); mw_end = mw->mw_curpos + mw->mw_aperture; } rw_assert(&mw->mw_lock, RA_RLOCKED); while (addr < mw_end && len > 0) { if (rw == 0) { v = t4_read_reg(sc, mw->mw_base + addr - mw->mw_curpos); *val++ = le32toh(v); } else { v = *val++; t4_write_reg(sc, mw->mw_base + addr - mw->mw_curpos, htole32(v)); } addr += 4; len -= 4; } rw_runlock(&mw->mw_lock); } return (0); } static inline int read_via_memwin(struct adapter *sc, int idx, uint32_t addr, uint32_t *val, int len) { return (rw_via_memwin(sc, idx, addr, val, len, 0)); } static inline int write_via_memwin(struct adapter *sc, int idx, uint32_t addr, const uint32_t *val, int len) { return (rw_via_memwin(sc, idx, addr, (void *)(uintptr_t)val, len, 1)); } static int t4_range_cmp(const void *a, const void *b) { return ((const struct t4_range *)a)->start - ((const struct t4_range *)b)->start; } /* * Verify that the memory range specified by the addr/len pair is valid within * the card's address space. 
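 * The check below collects the enabled memories (EDC0, EDC1, MC0, and MC1 on T5) from A_MA_TARGET_MEM_ENABLE, then sorts and merges adjacent ranges so a request spanning two contiguous memories is still accepted.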
*/ static int validate_mem_range(struct adapter *sc, uint32_t addr, int len) { struct t4_range mem_ranges[4], *r, *next; uint32_t em, addr_len; int i, n, remaining; /* Memory can only be accessed in naturally aligned 4 byte units */ if (addr & 3 || len & 3 || len <= 0) return (EINVAL); /* Enabled memories */ em = t4_read_reg(sc, A_MA_TARGET_MEM_ENABLE); r = &mem_ranges[0]; n = 0; bzero(r, sizeof(mem_ranges)); if (em & F_EDRAM0_ENABLE) { addr_len = t4_read_reg(sc, A_MA_EDRAM0_BAR); r->size = G_EDRAM0_SIZE(addr_len) << 20; if (r->size > 0) { r->start = G_EDRAM0_BASE(addr_len) << 20; if (addr >= r->start && addr + len <= r->start + r->size) return (0); r++; n++; } } if (em & F_EDRAM1_ENABLE) { addr_len = t4_read_reg(sc, A_MA_EDRAM1_BAR); r->size = G_EDRAM1_SIZE(addr_len) << 20; if (r->size > 0) { r->start = G_EDRAM1_BASE(addr_len) << 20; if (addr >= r->start && addr + len <= r->start + r->size) return (0); r++; n++; } } if (em & F_EXT_MEM_ENABLE) { addr_len = t4_read_reg(sc, A_MA_EXT_MEMORY_BAR); r->size = G_EXT_MEM_SIZE(addr_len) << 20; if (r->size > 0) { r->start = G_EXT_MEM_BASE(addr_len) << 20; if (addr >= r->start && addr + len <= r->start + r->size) return (0); r++; n++; } } if (is_t5(sc) && em & F_EXT_MEM1_ENABLE) { addr_len = t4_read_reg(sc, A_MA_EXT_MEMORY1_BAR); r->size = G_EXT_MEM1_SIZE(addr_len) << 20; if (r->size > 0) { r->start = G_EXT_MEM1_BASE(addr_len) << 20; if (addr >= r->start && addr + len <= r->start + r->size) return (0); r++; n++; } } MPASS(n <= nitems(mem_ranges)); if (n > 1) { /* Sort and merge the ranges. */ qsort(mem_ranges, n, sizeof(struct t4_range), t4_range_cmp); /* Start from index 0 and examine the next n - 1 entries. */ r = &mem_ranges[0]; for (remaining = n - 1; remaining > 0; remaining--, r++) { MPASS(r->size > 0); /* r is a valid entry. */ next = r + 1; MPASS(next->size > 0); /* and so is the next one. */ while (r->start + r->size >= next->start) { /* Merge the next one into the current entry. */ r->size = max(r->start + r->size, next->start + next->size) - r->start; n--; /* One fewer entry in total. */ if (--remaining == 0) goto done; /* short circuit */ next++; } if (next != r + 1) { /* * Some entries were merged into r and next * points to the first valid entry that couldn't * be merged. */ MPASS(next->size > 0); /* must be valid */ memcpy(r + 1, next, remaining * sizeof(*r)); #ifdef INVARIANTS /* * This is so that the foo->size assertion in the * next iteration of the loop does the right * thing for entries that were pulled up and are * no longer valid. */ MPASS(n < nitems(mem_ranges)); bzero(&mem_ranges[n], (nitems(mem_ranges) - n) * sizeof(struct t4_range)); #endif } } done: /* Done merging the ranges. */ MPASS(n > 0); r = &mem_ranges[0]; for (i = 0; i < n; i++, r++) { if (addr >= r->start && addr + len <= r->start + r->size) return (0); } } return (EFAULT); } static int fwmtype_to_hwmtype(int mtype) { switch (mtype) { case FW_MEMTYPE_EDC0: return (MEM_EDC0); case FW_MEMTYPE_EDC1: return (MEM_EDC1); case FW_MEMTYPE_EXTMEM: return (MEM_MC0); case FW_MEMTYPE_EXTMEM1: return (MEM_MC1); default: panic("%s: cannot translate fw mtype %d.", __func__, mtype); } } /* * Verify that the memory range specified by the memtype/offset/len pair is * valid and lies entirely within the memtype specified. The global address of * the start of the range is returned in addr.
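 * For example, FW_MEMTYPE_EDC1 with offset 0x1000 resolves to (G_EDRAM1_BASE(addr_len) << 20) + 0x1000, which is then checked with validate_mem_range().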
*/ static int validate_mt_off_len(struct adapter *sc, int mtype, uint32_t off, int len, uint32_t *addr) { uint32_t em, addr_len, maddr; /* Memory can only be accessed in naturally aligned 4 byte units */ if (off & 3 || len & 3 || len == 0) return (EINVAL); em = t4_read_reg(sc, A_MA_TARGET_MEM_ENABLE); switch (fwmtype_to_hwmtype(mtype)) { case MEM_EDC0: if (!(em & F_EDRAM0_ENABLE)) return (EINVAL); addr_len = t4_read_reg(sc, A_MA_EDRAM0_BAR); maddr = G_EDRAM0_BASE(addr_len) << 20; break; case MEM_EDC1: if (!(em & F_EDRAM1_ENABLE)) return (EINVAL); addr_len = t4_read_reg(sc, A_MA_EDRAM1_BAR); maddr = G_EDRAM1_BASE(addr_len) << 20; break; case MEM_MC: if (!(em & F_EXT_MEM_ENABLE)) return (EINVAL); addr_len = t4_read_reg(sc, A_MA_EXT_MEMORY_BAR); maddr = G_EXT_MEM_BASE(addr_len) << 20; break; case MEM_MC1: if (!is_t5(sc) || !(em & F_EXT_MEM1_ENABLE)) return (EINVAL); addr_len = t4_read_reg(sc, A_MA_EXT_MEMORY1_BAR); maddr = G_EXT_MEM1_BASE(addr_len) << 20; break; default: return (EINVAL); } *addr = maddr + off; /* global address */ return (validate_mem_range(sc, *addr, len)); } static int fixup_devlog_params(struct adapter *sc) { struct devlog_params *dparams = &sc->params.devlog; int rc; rc = validate_mt_off_len(sc, dparams->memtype, dparams->start, dparams->size, &dparams->addr); return (rc); } static int cfg_itype_and_nqueues(struct adapter *sc, int n10g, int n1g, int num_vis, struct intrs_and_queues *iaq) { int rc, itype, navail, nrxq10g, nrxq1g, n; int nofldrxq10g = 0, nofldrxq1g = 0; bzero(iaq, sizeof(*iaq)); iaq->ntxq10g = t4_ntxq10g; iaq->ntxq1g = t4_ntxq1g; iaq->ntxq_vi = t4_ntxq_vi; iaq->nrxq10g = nrxq10g = t4_nrxq10g; iaq->nrxq1g = nrxq1g = t4_nrxq1g; iaq->nrxq_vi = t4_nrxq_vi; iaq->rsrv_noflowq = t4_rsrv_noflowq; #ifdef TCP_OFFLOAD if (is_offload(sc)) { iaq->nofldtxq10g = t4_nofldtxq10g; iaq->nofldtxq1g = t4_nofldtxq1g; iaq->nofldtxq_vi = t4_nofldtxq_vi; iaq->nofldrxq10g = nofldrxq10g = t4_nofldrxq10g; iaq->nofldrxq1g = nofldrxq1g = t4_nofldrxq1g; iaq->nofldrxq_vi = t4_nofldrxq_vi; } #endif #ifdef DEV_NETMAP iaq->nnmtxq_vi = t4_nnmtxq_vi; iaq->nnmrxq_vi = t4_nnmrxq_vi; #endif for (itype = INTR_MSIX; itype; itype >>= 1) { if ((itype & t4_intr_types) == 0) continue; /* not allowed */ if (itype == INTR_MSIX) navail = pci_msix_count(sc->dev); else if (itype == INTR_MSI) navail = pci_msi_count(sc->dev); else navail = 1; restart: if (navail == 0) continue; iaq->intr_type = itype; iaq->intr_flags_10g = 0; iaq->intr_flags_1g = 0; /* * Best option: an interrupt vector for errors, one for the * firmware event queue, and one for every rxq (NIC and TOE) of * every VI. The VIs that support netmap use the same * interrupts for the NIC rx queues and the netmap rx queues * because only one set of queues is active at a time. */ iaq->nirq = T4_EXTRA_INTR; iaq->nirq += n10g * (nrxq10g + nofldrxq10g); iaq->nirq += n1g * (nrxq1g + nofldrxq1g); iaq->nirq += (n10g + n1g) * (num_vis - 1) * max(iaq->nrxq_vi, iaq->nnmrxq_vi); /* See comment above. 
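 * (The TOE rx queues of the extra VIs get their own vectors too; the NIC and netmap rxqs were already counted together above.)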
*/ iaq->nirq += (n10g + n1g) * (num_vis - 1) * iaq->nofldrxq_vi; if (iaq->nirq <= navail && (itype != INTR_MSI || powerof2(iaq->nirq))) { iaq->intr_flags_10g = INTR_ALL; iaq->intr_flags_1g = INTR_ALL; goto allocate; } /* Disable the VIs (and netmap) if there aren't enough intrs */ if (num_vis > 1) { device_printf(sc->dev, "virtual interfaces disabled " "because num_vis=%u with current settings " "(nrxq10g=%u, nrxq1g=%u, nofldrxq10g=%u, " "nofldrxq1g=%u, nrxq_vi=%u nofldrxq_vi=%u, " "nnmrxq_vi=%u) would need %u interrupts but " "only %u are available.\n", num_vis, nrxq10g, nrxq1g, nofldrxq10g, nofldrxq1g, iaq->nrxq_vi, iaq->nofldrxq_vi, iaq->nnmrxq_vi, iaq->nirq, navail); num_vis = 1; iaq->ntxq_vi = iaq->nrxq_vi = 0; iaq->nofldtxq_vi = iaq->nofldrxq_vi = 0; iaq->nnmtxq_vi = iaq->nnmrxq_vi = 0; goto restart; } /* * Second best option: a vector for errors, one for the firmware * event queue, and vectors for either all the NIC rx queues or * all the TOE rx queues. The queues that don't get vectors * will forward their interrupts to those that do. */ iaq->nirq = T4_EXTRA_INTR; if (nrxq10g >= nofldrxq10g) { iaq->intr_flags_10g = INTR_RXQ; iaq->nirq += n10g * nrxq10g; } else { iaq->intr_flags_10g = INTR_OFLD_RXQ; iaq->nirq += n10g * nofldrxq10g; } if (nrxq1g >= nofldrxq1g) { iaq->intr_flags_1g = INTR_RXQ; iaq->nirq += n1g * nrxq1g; } else { iaq->intr_flags_1g = INTR_OFLD_RXQ; iaq->nirq += n1g * nofldrxq1g; } if (iaq->nirq <= navail && (itype != INTR_MSI || powerof2(iaq->nirq))) goto allocate; /* * Next best option: an interrupt vector for errors, one for the * firmware event queue, and at least one per main-VI. At this * point we know we'll have to downsize nrxq and/or nofldrxq to * fit what's available to us. */ iaq->nirq = T4_EXTRA_INTR; iaq->nirq += n10g + n1g; if (iaq->nirq <= navail) { int leftover = navail - iaq->nirq; if (n10g > 0) { int target = max(nrxq10g, nofldrxq10g); iaq->intr_flags_10g = nrxq10g >= nofldrxq10g ? INTR_RXQ : INTR_OFLD_RXQ; n = 1; while (n < target && leftover >= n10g) { leftover -= n10g; iaq->nirq += n10g; n++; } iaq->nrxq10g = min(n, nrxq10g); #ifdef TCP_OFFLOAD iaq->nofldrxq10g = min(n, nofldrxq10g); #endif } if (n1g > 0) { int target = max(nrxq1g, nofldrxq1g); iaq->intr_flags_1g = nrxq1g >= nofldrxq1g ? INTR_RXQ : INTR_OFLD_RXQ; n = 1; while (n < target && leftover >= n1g) { leftover -= n1g; iaq->nirq += n1g; n++; } iaq->nrxq1g = min(n, nrxq1g); #ifdef TCP_OFFLOAD iaq->nofldrxq1g = min(n, nofldrxq1g); #endif } if (itype != INTR_MSI || powerof2(iaq->nirq)) goto allocate; } /* * Least desirable option: one interrupt vector for everything. */ iaq->nirq = iaq->nrxq10g = iaq->nrxq1g = 1; iaq->intr_flags_10g = iaq->intr_flags_1g = 0; #ifdef TCP_OFFLOAD if (is_offload(sc)) iaq->nofldrxq10g = iaq->nofldrxq1g = 1; #endif allocate: navail = iaq->nirq; rc = 0; if (itype == INTR_MSIX) rc = pci_alloc_msix(sc->dev, &navail); else if (itype == INTR_MSI) rc = pci_alloc_msi(sc->dev, &navail); if (rc == 0) { if (navail == iaq->nirq) return (0); /* * Didn't get the number requested. Use whatever number * the kernel is willing to allocate (it's in navail). */ device_printf(sc->dev, "fewer vectors than requested, " "type=%d, req=%d, rcvd=%d; will downshift req.\n", itype, iaq->nirq, navail); pci_release_msi(sc->dev); goto restart; } device_printf(sc->dev, "failed to allocate vectors:%d, type=%d, req=%d, rcvd=%d\n", itype, rc, iaq->nirq, navail); } device_printf(sc->dev, "failed to find a usable interrupt type. 
" "allowed=%d, msi-x=%d, msi=%d, intx=1", t4_intr_types, pci_msix_count(sc->dev), pci_msi_count(sc->dev)); return (ENXIO); } #define FW_VERSION(chip) ( \ V_FW_HDR_FW_VER_MAJOR(chip##FW_VERSION_MAJOR) | \ V_FW_HDR_FW_VER_MINOR(chip##FW_VERSION_MINOR) | \ V_FW_HDR_FW_VER_MICRO(chip##FW_VERSION_MICRO) | \ V_FW_HDR_FW_VER_BUILD(chip##FW_VERSION_BUILD)) #define FW_INTFVER(chip, intf) (chip##FW_HDR_INTFVER_##intf) struct fw_info { uint8_t chip; char *kld_name; char *fw_mod_name; struct fw_hdr fw_hdr; /* XXX: waste of space, need a sparse struct */ } fw_info[] = { { .chip = CHELSIO_T4, .kld_name = "t4fw_cfg", .fw_mod_name = "t4fw", .fw_hdr = { .chip = FW_HDR_CHIP_T4, .fw_ver = htobe32_const(FW_VERSION(T4)), .intfver_nic = FW_INTFVER(T4, NIC), .intfver_vnic = FW_INTFVER(T4, VNIC), .intfver_ofld = FW_INTFVER(T4, OFLD), .intfver_ri = FW_INTFVER(T4, RI), .intfver_iscsipdu = FW_INTFVER(T4, ISCSIPDU), .intfver_iscsi = FW_INTFVER(T4, ISCSI), .intfver_fcoepdu = FW_INTFVER(T4, FCOEPDU), .intfver_fcoe = FW_INTFVER(T4, FCOE), }, }, { .chip = CHELSIO_T5, .kld_name = "t5fw_cfg", .fw_mod_name = "t5fw", .fw_hdr = { .chip = FW_HDR_CHIP_T5, .fw_ver = htobe32_const(FW_VERSION(T5)), .intfver_nic = FW_INTFVER(T5, NIC), .intfver_vnic = FW_INTFVER(T5, VNIC), .intfver_ofld = FW_INTFVER(T5, OFLD), .intfver_ri = FW_INTFVER(T5, RI), .intfver_iscsipdu = FW_INTFVER(T5, ISCSIPDU), .intfver_iscsi = FW_INTFVER(T5, ISCSI), .intfver_fcoepdu = FW_INTFVER(T5, FCOEPDU), .intfver_fcoe = FW_INTFVER(T5, FCOE), }, } }; static struct fw_info * find_fw_info(int chip) { int i; for (i = 0; i < nitems(fw_info); i++) { if (fw_info[i].chip == chip) return (&fw_info[i]); } return (NULL); } /* * Is the given firmware API compatible with the one the driver was compiled * with? */ static int fw_compatible(const struct fw_hdr *hdr1, const struct fw_hdr *hdr2) { /* short circuit if it's the exact same firmware version */ if (hdr1->chip == hdr2->chip && hdr1->fw_ver == hdr2->fw_ver) return (1); /* * XXX: Is this too conservative? Perhaps I should limit this to the * features that are supported in the driver. */ #define SAME_INTF(x) (hdr1->intfver_##x == hdr2->intfver_##x) if (hdr1->chip == hdr2->chip && SAME_INTF(nic) && SAME_INTF(vnic) && SAME_INTF(ofld) && SAME_INTF(ri) && SAME_INTF(iscsipdu) && SAME_INTF(iscsi) && SAME_INTF(fcoepdu) && SAME_INTF(fcoe)) return (1); #undef SAME_INTF return (0); } /* * The firmware in the KLD is usable, but should it be installed? This routine * explains itself in detail if it indicates the KLD firmware should be * installed. 
*/ static int should_install_kld_fw(struct adapter *sc, int card_fw_usable, int k, int c) { const char *reason; if (!card_fw_usable) { reason = "incompatible or unusable"; goto install; } if (k > c) { reason = "older than the version bundled with this driver"; goto install; } if (t4_fw_install == 2 && k != c) { reason = "different than the version bundled with this driver"; goto install; } return (0); install: if (t4_fw_install == 0) { device_printf(sc->dev, "firmware on card (%u.%u.%u.%u) is %s, " "but the driver is prohibited from installing a different " "firmware on the card.\n", G_FW_HDR_FW_VER_MAJOR(c), G_FW_HDR_FW_VER_MINOR(c), G_FW_HDR_FW_VER_MICRO(c), G_FW_HDR_FW_VER_BUILD(c), reason); return (0); } device_printf(sc->dev, "firmware on card (%u.%u.%u.%u) is %s, " "installing firmware %u.%u.%u.%u on card.\n", G_FW_HDR_FW_VER_MAJOR(c), G_FW_HDR_FW_VER_MINOR(c), G_FW_HDR_FW_VER_MICRO(c), G_FW_HDR_FW_VER_BUILD(c), reason, G_FW_HDR_FW_VER_MAJOR(k), G_FW_HDR_FW_VER_MINOR(k), G_FW_HDR_FW_VER_MICRO(k), G_FW_HDR_FW_VER_BUILD(k)); return (1); } /* * Establish contact with the firmware and determine if we are the master driver * or not, and whether we are responsible for chip initialization. */ static int prep_firmware(struct adapter *sc) { const struct firmware *fw = NULL, *default_cfg; int rc, pf, card_fw_usable, kld_fw_usable, need_fw_reset = 1; enum dev_state state; struct fw_info *fw_info; struct fw_hdr *card_fw; /* fw on the card */ const struct fw_hdr *kld_fw; /* fw in the KLD */ const struct fw_hdr *drv_fw; /* fw header the driver was compiled against */ /* Contact firmware. */ rc = t4_fw_hello(sc, sc->mbox, sc->mbox, MASTER_MAY, &state); if (rc < 0 || state == DEV_STATE_ERR) { rc = -rc; device_printf(sc->dev, "failed to connect to the firmware: %d, %d.\n", rc, state); return (rc); } pf = rc; if (pf == sc->mbox) sc->flags |= MASTER_PF; else if (state == DEV_STATE_UNINIT) { /* * We didn't get to be the master so we definitely won't be * configuring the chip. It's a bug if someone else hasn't * configured it already. */ device_printf(sc->dev, "couldn't be master(%d), " "device not already initialized either(%d).\n", rc, state); return (EDOOFUS); } /* This is the firmware whose headers the driver was compiled against */ fw_info = find_fw_info(chip_id(sc)); if (fw_info == NULL) { device_printf(sc->dev, "unable to look up firmware information for chip %d.\n", chip_id(sc)); return (EINVAL); } drv_fw = &fw_info->fw_hdr; /* * The firmware KLD contains many modules. The KLD name is also the * name of the module that contains the default config file. */ default_cfg = firmware_get(fw_info->kld_name); /* Read the header of the firmware on the card */ card_fw = malloc(sizeof(*card_fw), M_CXGBE, M_ZERO | M_WAITOK); rc = -t4_read_flash(sc, FLASH_FW_START, sizeof (*card_fw) / sizeof (uint32_t), (uint32_t *)card_fw, 1); if (rc == 0) card_fw_usable = fw_compatible(drv_fw, (const void*)card_fw); else { device_printf(sc->dev, "Unable to read card's firmware header: %d\n", rc); card_fw_usable = 0; } /* This is the firmware in the KLD */ fw = firmware_get(fw_info->fw_mod_name); if (fw != NULL) { kld_fw = (const void *)fw->data; kld_fw_usable = fw_compatible(drv_fw, kld_fw); } else { kld_fw = NULL; kld_fw_usable = 0; } if (card_fw_usable && card_fw->fw_ver == drv_fw->fw_ver && (!kld_fw_usable || kld_fw->fw_ver == drv_fw->fw_ver)) { /* * Common case: the firmware on the card is an exact match and * the KLD is an exact match too, or the KLD is * absent/incompatible. 
Note that t4_fw_install = 2 is ignored * here -- use cxgbetool loadfw if you want to reinstall the * same firmware as the one on the card. */ } else if (kld_fw_usable && state == DEV_STATE_UNINIT && should_install_kld_fw(sc, card_fw_usable, be32toh(kld_fw->fw_ver), be32toh(card_fw->fw_ver))) { rc = -t4_fw_upgrade(sc, sc->mbox, fw->data, fw->datasize, 0); if (rc != 0) { device_printf(sc->dev, "failed to install firmware: %d\n", rc); goto done; } /* Installed successfully, update the cached header too. */ memcpy(card_fw, kld_fw, sizeof(*card_fw)); card_fw_usable = 1; need_fw_reset = 0; /* already reset as part of load_fw */ } if (!card_fw_usable) { uint32_t d, c, k; d = ntohl(drv_fw->fw_ver); c = ntohl(card_fw->fw_ver); k = kld_fw ? ntohl(kld_fw->fw_ver) : 0; device_printf(sc->dev, "Cannot find a usable firmware: " "fw_install %d, chip state %d, " "driver compiled with %d.%d.%d.%d, " "card has %d.%d.%d.%d, KLD has %d.%d.%d.%d\n", t4_fw_install, state, G_FW_HDR_FW_VER_MAJOR(d), G_FW_HDR_FW_VER_MINOR(d), G_FW_HDR_FW_VER_MICRO(d), G_FW_HDR_FW_VER_BUILD(d), G_FW_HDR_FW_VER_MAJOR(c), G_FW_HDR_FW_VER_MINOR(c), G_FW_HDR_FW_VER_MICRO(c), G_FW_HDR_FW_VER_BUILD(c), G_FW_HDR_FW_VER_MAJOR(k), G_FW_HDR_FW_VER_MINOR(k), G_FW_HDR_FW_VER_MICRO(k), G_FW_HDR_FW_VER_BUILD(k)); rc = EINVAL; goto done; } /* Reset device */ if (need_fw_reset && (rc = -t4_fw_reset(sc, sc->mbox, F_PIORSTMODE | F_PIORST)) != 0) { device_printf(sc->dev, "firmware reset failed: %d.\n", rc); if (rc != ETIMEDOUT && rc != EIO) t4_fw_bye(sc, sc->mbox); goto done; } sc->flags |= FW_OK; rc = get_params__pre_init(sc); if (rc != 0) goto done; /* error message displayed already */ /* Partition adapter resources as specified in the config file. */ if (state == DEV_STATE_UNINIT) { KASSERT(sc->flags & MASTER_PF, ("%s: trying to change chip settings when not master.", __func__)); rc = partition_resources(sc, default_cfg, fw_info->kld_name); if (rc != 0) goto done; /* error message displayed already */ t4_tweak_chip_settings(sc); /* get basic stuff going */ rc = -t4_fw_initialize(sc, sc->mbox); if (rc != 0) { device_printf(sc->dev, "fw init failed: %d.\n", rc); goto done; } } else { snprintf(sc->cfg_file, sizeof(sc->cfg_file), "pf%d", pf); sc->cfcsum = 0; } done: free(card_fw, M_CXGBE); if (fw != NULL) firmware_put(fw, FIRMWARE_UNLOAD); if (default_cfg != NULL) firmware_put(default_cfg, FIRMWARE_UNLOAD); return (rc); } #define FW_PARAM_DEV(param) \ (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) | \ V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_##param)) #define FW_PARAM_PFVF(param) \ (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_PFVF) | \ V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_PFVF_##param)) /* * Partition chip resources for use between various PFs, VFs, etc. */ static int partition_resources(struct adapter *sc, const struct firmware *default_cfg, const char *name_prefix) { const struct firmware *cfg = NULL; int rc = 0; struct fw_caps_config_cmd caps; uint32_t mtype, moff, finicsum, cfcsum; /* * Figure out what configuration file to use. Pick the default config * file for the card if the user hasn't specified one explicitly. */ snprintf(sc->cfg_file, sizeof(sc->cfg_file), "%s", t4_cfg_file); if (strncmp(t4_cfg_file, DEFAULT_CF, sizeof(t4_cfg_file)) == 0) { /* Card specific overrides go here. */ if (pci_get_device(sc->dev) == 0x440a) snprintf(sc->cfg_file, sizeof(sc->cfg_file), UWIRE_CF); if (is_fpga(sc)) snprintf(sc->cfg_file, sizeof(sc->cfg_file), FPGA_CF); } /* * We need to load another module if the profile is anything except * "default" or "flash". 
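 * e.g. with kld_name "t4fw_cfg" and a profile "uwire" the module loaded below is "t4fw_cfg_uwire".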
*/ if (strncmp(sc->cfg_file, DEFAULT_CF, sizeof(sc->cfg_file)) != 0 && strncmp(sc->cfg_file, FLASH_CF, sizeof(sc->cfg_file)) != 0) { char s[32]; snprintf(s, sizeof(s), "%s_%s", name_prefix, sc->cfg_file); cfg = firmware_get(s); if (cfg == NULL) { if (default_cfg != NULL) { device_printf(sc->dev, "unable to load module \"%s\" for " "configuration profile \"%s\", will use " "the default config file instead.\n", s, sc->cfg_file); snprintf(sc->cfg_file, sizeof(sc->cfg_file), "%s", DEFAULT_CF); } else { device_printf(sc->dev, "unable to load module \"%s\" for " "configuration profile \"%s\", will use " "the config file on the card's flash " "instead.\n", s, sc->cfg_file); snprintf(sc->cfg_file, sizeof(sc->cfg_file), "%s", FLASH_CF); } } } if (strncmp(sc->cfg_file, DEFAULT_CF, sizeof(sc->cfg_file)) == 0 && default_cfg == NULL) { device_printf(sc->dev, "default config file not available, will use the config " "file on the card's flash instead.\n"); snprintf(sc->cfg_file, sizeof(sc->cfg_file), "%s", FLASH_CF); } if (strncmp(sc->cfg_file, FLASH_CF, sizeof(sc->cfg_file)) != 0) { u_int cflen; const uint32_t *cfdata; uint32_t param, val, addr; KASSERT(cfg != NULL || default_cfg != NULL, ("%s: no config to upload", __func__)); /* * Ask the firmware where it wants us to upload the config file. */ param = FW_PARAM_DEV(CF); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 1, &param, &val); if (rc != 0) { /* No support for config file? Shouldn't happen. */ device_printf(sc->dev, "failed to query config file location: %d.\n", rc); goto done; } mtype = G_FW_PARAMS_PARAM_Y(val); moff = G_FW_PARAMS_PARAM_Z(val) << 16; /* * XXX: sheer laziness. We deliberately added 4 bytes of * useless stuffing/comments at the end of the config file so * it's ok to simply throw away the last remaining bytes when * the config file is not an exact multiple of 4. This also * helps with the validate_mt_off_len check. */ if (cfg != NULL) { cflen = cfg->datasize & ~3; cfdata = cfg->data; } else { cflen = default_cfg->datasize & ~3; cfdata = default_cfg->data; } if (cflen > FLASH_CFG_MAX_SIZE) { device_printf(sc->dev, "config file too long (%d, max allowed is %d). " "Will try to use the config on the card, if any.\n", cflen, FLASH_CFG_MAX_SIZE); goto use_config_on_flash; } rc = validate_mt_off_len(sc, mtype, moff, cflen, &addr); if (rc != 0) { device_printf(sc->dev, "%s: addr (%d/0x%x) or len %d is not valid: %d. " "Will try to use the config on the card, if any.\n", __func__, mtype, moff, cflen, rc); goto use_config_on_flash; } write_via_memwin(sc, 2, addr, cfdata, cflen); } else { use_config_on_flash: mtype = FW_MEMTYPE_FLASH; moff = t4_flash_cfg_addr(sc); } bzero(&caps, sizeof(caps)); caps.op_to_write = htobe32(V_FW_CMD_OP(FW_CAPS_CONFIG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ); caps.cfvalid_to_len16 = htobe32(F_FW_CAPS_CONFIG_CMD_CFVALID | V_FW_CAPS_CONFIG_CMD_MEMTYPE_CF(mtype) | V_FW_CAPS_CONFIG_CMD_MEMADDR64K_CF(moff >> 16) | FW_LEN16(caps)); rc = -t4_wr_mbox(sc, sc->mbox, &caps, sizeof(caps), &caps); if (rc != 0) { device_printf(sc->dev, "failed to pre-process config file: %d " "(mtype %d, moff 0x%x).\n", rc, mtype, moff); goto done; } finicsum = be32toh(caps.finicsum); cfcsum = be32toh(caps.cfcsum); if (finicsum != cfcsum) { device_printf(sc->dev, "WARNING: config file checksum mismatch: %08x %08x\n", finicsum, cfcsum); } sc->cfcsum = cfcsum; #define LIMIT_CAPS(x) do { \ caps.x &= htobe16(t4_##x##_allowed); \ } while (0) /* * Let the firmware know what features will (not) be used so it can tune * things accordingly.
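 * Each LIMIT_CAPS(x) below masks caps.x with the corresponding t4_<x>_allowed tunable before the capabilities are written back to the firmware.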
*/ LIMIT_CAPS(nbmcaps); LIMIT_CAPS(linkcaps); LIMIT_CAPS(switchcaps); LIMIT_CAPS(niccaps); LIMIT_CAPS(toecaps); LIMIT_CAPS(rdmacaps); LIMIT_CAPS(tlscaps); LIMIT_CAPS(iscsicaps); LIMIT_CAPS(fcoecaps); #undef LIMIT_CAPS caps.op_to_write = htobe32(V_FW_CMD_OP(FW_CAPS_CONFIG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE); caps.cfvalid_to_len16 = htobe32(FW_LEN16(caps)); rc = -t4_wr_mbox(sc, sc->mbox, &caps, sizeof(caps), NULL); if (rc != 0) { device_printf(sc->dev, "failed to process config file: %d.\n", rc); } done: if (cfg != NULL) firmware_put(cfg, FIRMWARE_UNLOAD); return (rc); } /* * Retrieve parameters that are needed (or nice to have) very early. */ static int get_params__pre_init(struct adapter *sc) { int rc; uint32_t param[2], val[2]; t4_get_version_info(sc); snprintf(sc->fw_version, sizeof(sc->fw_version), "%u.%u.%u.%u", G_FW_HDR_FW_VER_MAJOR(sc->params.fw_vers), G_FW_HDR_FW_VER_MINOR(sc->params.fw_vers), G_FW_HDR_FW_VER_MICRO(sc->params.fw_vers), G_FW_HDR_FW_VER_BUILD(sc->params.fw_vers)); snprintf(sc->bs_version, sizeof(sc->bs_version), "%u.%u.%u.%u", G_FW_HDR_FW_VER_MAJOR(sc->params.bs_vers), G_FW_HDR_FW_VER_MINOR(sc->params.bs_vers), G_FW_HDR_FW_VER_MICRO(sc->params.bs_vers), G_FW_HDR_FW_VER_BUILD(sc->params.bs_vers)); snprintf(sc->tp_version, sizeof(sc->tp_version), "%u.%u.%u.%u", G_FW_HDR_FW_VER_MAJOR(sc->params.tp_vers), G_FW_HDR_FW_VER_MINOR(sc->params.tp_vers), G_FW_HDR_FW_VER_MICRO(sc->params.tp_vers), G_FW_HDR_FW_VER_BUILD(sc->params.tp_vers)); snprintf(sc->er_version, sizeof(sc->er_version), "%u.%u.%u.%u", G_FW_HDR_FW_VER_MAJOR(sc->params.er_vers), G_FW_HDR_FW_VER_MINOR(sc->params.er_vers), G_FW_HDR_FW_VER_MICRO(sc->params.er_vers), G_FW_HDR_FW_VER_BUILD(sc->params.er_vers)); param[0] = FW_PARAM_DEV(PORTVEC); param[1] = FW_PARAM_DEV(CCLK); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 2, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query parameters (pre_init): %d.\n", rc); return (rc); } sc->params.portvec = val[0]; sc->params.nports = bitcount32(val[0]); sc->params.vpd.cclk = val[1]; /* Read device log parameters. */ rc = -t4_init_devlog_params(sc, 1); if (rc == 0) fixup_devlog_params(sc); else { device_printf(sc->dev, "failed to get devlog parameters: %d.\n", rc); rc = 0; /* devlog isn't critical for device operation */ } return (rc); } /* * Retrieve various parameters that are of interest to the driver. The device * has been initialized by the firmware at this point. 
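 * The queries below pull the queue and TID ranges (iq/eq start, filter and L2T regions), the negotiated capabilities, and then the ETHOFLD, TOE, RDMA, and iSCSI region layouts as each capability dictates.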
*/ static int get_params__post_init(struct adapter *sc) { int rc; uint32_t param[7], val[7]; struct fw_caps_config_cmd caps; param[0] = FW_PARAM_PFVF(IQFLINT_START); param[1] = FW_PARAM_PFVF(EQ_START); param[2] = FW_PARAM_PFVF(FILTER_START); param[3] = FW_PARAM_PFVF(FILTER_END); param[4] = FW_PARAM_PFVF(L2T_START); param[5] = FW_PARAM_PFVF(L2T_END); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 6, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query parameters (post_init): %d.\n", rc); return (rc); } sc->sge.iq_start = val[0]; sc->sge.eq_start = val[1]; sc->tids.ftid_base = val[2]; sc->tids.nftids = val[3] - val[2] + 1; sc->params.ftid_min = val[2]; sc->params.ftid_max = val[3]; sc->vres.l2t.start = val[4]; sc->vres.l2t.size = val[5] - val[4] + 1; KASSERT(sc->vres.l2t.size <= L2T_SIZE, ("%s: L2 table size (%u) larger than expected (%u)", __func__, sc->vres.l2t.size, L2T_SIZE)); /* get capabilities */ bzero(&caps, sizeof(caps)); caps.op_to_write = htobe32(V_FW_CMD_OP(FW_CAPS_CONFIG_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ); caps.cfvalid_to_len16 = htobe32(FW_LEN16(caps)); rc = -t4_wr_mbox(sc, sc->mbox, &caps, sizeof(caps), &caps); if (rc != 0) { device_printf(sc->dev, "failed to get card capabilities: %d.\n", rc); return (rc); } #define READ_CAPS(x) do { \ sc->x = htobe16(caps.x); \ } while (0) READ_CAPS(nbmcaps); READ_CAPS(linkcaps); READ_CAPS(switchcaps); READ_CAPS(niccaps); READ_CAPS(toecaps); READ_CAPS(rdmacaps); READ_CAPS(tlscaps); READ_CAPS(iscsicaps); READ_CAPS(fcoecaps); if (sc->niccaps & FW_CAPS_CONFIG_NIC_ETHOFLD) { param[0] = FW_PARAM_PFVF(ETHOFLD_START); param[1] = FW_PARAM_PFVF(ETHOFLD_END); param[2] = FW_PARAM_DEV(FLOWC_BUFFIFO_SZ); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 3, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query NIC parameters: %d.\n", rc); return (rc); } sc->tids.etid_base = val[0]; sc->params.etid_min = val[0]; sc->tids.netids = val[1] - val[0] + 1; sc->params.netids = sc->tids.netids; sc->params.eo_wr_cred = val[2]; sc->params.ethoffload = 1; } if (sc->toecaps) { /* query offload-related parameters */ param[0] = FW_PARAM_DEV(NTID); param[1] = FW_PARAM_PFVF(SERVER_START); param[2] = FW_PARAM_PFVF(SERVER_END); param[3] = FW_PARAM_PFVF(TDDP_START); param[4] = FW_PARAM_PFVF(TDDP_END); param[5] = FW_PARAM_DEV(FLOWC_BUFFIFO_SZ); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 6, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query TOE parameters: %d.\n", rc); return (rc); } sc->tids.ntids = val[0]; sc->tids.natids = min(sc->tids.ntids / 2, MAX_ATIDS); sc->tids.stid_base = val[1]; sc->tids.nstids = val[2] - val[1] + 1; sc->vres.ddp.start = val[3]; sc->vres.ddp.size = val[4] - val[3] + 1; sc->params.ofldq_wr_cred = val[5]; sc->params.offload = 1; } if (sc->rdmacaps) { param[0] = FW_PARAM_PFVF(STAG_START); param[1] = FW_PARAM_PFVF(STAG_END); param[2] = FW_PARAM_PFVF(RQ_START); param[3] = FW_PARAM_PFVF(RQ_END); param[4] = FW_PARAM_PFVF(PBL_START); param[5] = FW_PARAM_PFVF(PBL_END); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 6, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query RDMA parameters(1): %d.\n", rc); return (rc); } sc->vres.stag.start = val[0]; sc->vres.stag.size = val[1] - val[0] + 1; sc->vres.rq.start = val[2]; sc->vres.rq.size = val[3] - val[2] + 1; sc->vres.pbl.start = val[4]; sc->vres.pbl.size = val[5] - val[4] + 1; param[0] = FW_PARAM_PFVF(SQRQ_START); param[1] = FW_PARAM_PFVF(SQRQ_END); param[2] = FW_PARAM_PFVF(CQ_START); param[3] = FW_PARAM_PFVF(CQ_END); param[4] =
FW_PARAM_PFVF(OCQ_START); param[5] = FW_PARAM_PFVF(OCQ_END); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 6, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query RDMA parameters(2): %d.\n", rc); return (rc); } sc->vres.qp.start = val[0]; sc->vres.qp.size = val[1] - val[0] + 1; sc->vres.cq.start = val[2]; sc->vres.cq.size = val[3] - val[2] + 1; sc->vres.ocq.start = val[4]; sc->vres.ocq.size = val[5] - val[4] + 1; } if (sc->iscsicaps) { param[0] = FW_PARAM_PFVF(ISCSI_START); param[1] = FW_PARAM_PFVF(ISCSI_END); rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 2, param, val); if (rc != 0) { device_printf(sc->dev, "failed to query iSCSI parameters: %d.\n", rc); return (rc); } sc->vres.iscsi.start = val[0]; sc->vres.iscsi.size = val[1] - val[0] + 1; } t4_init_sge_params(sc); /* * We've got the params we wanted to query via the firmware. Now grab * some others directly from the chip. */ rc = t4_read_chip_settings(sc); return (rc); } static int set_params__post_init(struct adapter *sc) { uint32_t param, val; /* ask for encapsulated CPLs */ param = FW_PARAM_PFVF(CPLFW4MSG_ENCAP); val = 1; (void)t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &param, &val); return (0); } #undef FW_PARAM_PFVF #undef FW_PARAM_DEV static void t4_set_desc(struct adapter *sc) { char buf[128]; struct adapter_params *p = &sc->params; snprintf(buf, sizeof(buf), "Chelsio %s", p->vpd.id); device_set_desc_copy(sc->dev, buf); } static void build_medialist(struct port_info *pi, struct ifmedia *media) { int m; PORT_LOCK(pi); ifmedia_removeall(media); m = IFM_ETHER | IFM_FDX; switch(pi->port_type) { case FW_PORT_TYPE_BT_XFI: case FW_PORT_TYPE_BT_XAUI: ifmedia_add(media, m | IFM_10G_T, 0, NULL); /* fall through */ case FW_PORT_TYPE_BT_SGMII: ifmedia_add(media, m | IFM_1000_T, 0, NULL); ifmedia_add(media, m | IFM_100_TX, 0, NULL); ifmedia_add(media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(media, IFM_ETHER | IFM_AUTO); break; case FW_PORT_TYPE_CX4: ifmedia_add(media, m | IFM_10G_CX4, 0, NULL); ifmedia_set(media, m | IFM_10G_CX4); break; case FW_PORT_TYPE_QSFP_10G: case FW_PORT_TYPE_SFP: case FW_PORT_TYPE_FIBER_XFI: case FW_PORT_TYPE_FIBER_XAUI: switch (pi->mod_type) { case FW_PORT_MOD_TYPE_LR: ifmedia_add(media, m | IFM_10G_LR, 0, NULL); ifmedia_set(media, m | IFM_10G_LR); break; case FW_PORT_MOD_TYPE_SR: ifmedia_add(media, m | IFM_10G_SR, 0, NULL); ifmedia_set(media, m | IFM_10G_SR); break; case FW_PORT_MOD_TYPE_LRM: ifmedia_add(media, m | IFM_10G_LRM, 0, NULL); ifmedia_set(media, m | IFM_10G_LRM); break; case FW_PORT_MOD_TYPE_TWINAX_PASSIVE: case FW_PORT_MOD_TYPE_TWINAX_ACTIVE: ifmedia_add(media, m | IFM_10G_TWINAX, 0, NULL); ifmedia_set(media, m | IFM_10G_TWINAX); break; case FW_PORT_MOD_TYPE_NONE: m &= ~IFM_FDX; ifmedia_add(media, m | IFM_NONE, 0, NULL); ifmedia_set(media, m | IFM_NONE); break; case FW_PORT_MOD_TYPE_NA: case FW_PORT_MOD_TYPE_ER: default: device_printf(pi->dev, "unknown port_type (%d), mod_type (%d)\n", pi->port_type, pi->mod_type); ifmedia_add(media, m | IFM_UNKNOWN, 0, NULL); ifmedia_set(media, m | IFM_UNKNOWN); break; } break; case FW_PORT_TYPE_QSFP: switch (pi->mod_type) { case FW_PORT_MOD_TYPE_LR: ifmedia_add(media, m | IFM_40G_LR4, 0, NULL); ifmedia_set(media, m | IFM_40G_LR4); break; case FW_PORT_MOD_TYPE_SR: ifmedia_add(media, m | IFM_40G_SR4, 0, NULL); ifmedia_set(media, m | IFM_40G_SR4); break; case FW_PORT_MOD_TYPE_TWINAX_PASSIVE: case FW_PORT_MOD_TYPE_TWINAX_ACTIVE: ifmedia_add(media, m | IFM_40G_CR4, 0, NULL); ifmedia_set(media, m | IFM_40G_CR4); break; case FW_PORT_MOD_TYPE_NONE: m &=
~IFM_FDX; ifmedia_add(media, m | IFM_NONE, 0, NULL); ifmedia_set(media, m | IFM_NONE); break; default: device_printf(pi->dev, "unknown port_type (%d), mod_type (%d)\n", pi->port_type, pi->mod_type); ifmedia_add(media, m | IFM_UNKNOWN, 0, NULL); ifmedia_set(media, m | IFM_UNKNOWN); break; } break; default: device_printf(pi->dev, "unknown port_type (%d), mod_type (%d)\n", pi->port_type, pi->mod_type); ifmedia_add(media, m | IFM_UNKNOWN, 0, NULL); ifmedia_set(media, m | IFM_UNKNOWN); break; } PORT_UNLOCK(pi); } #define FW_MAC_EXACT_CHUNK 7 /* * Program the port's XGMAC based on parameters in ifnet. The caller also * indicates which parameters should be programmed (the rest are left alone). */ int update_mac_settings(struct ifnet *ifp, int flags) { int rc = 0; struct vi_info *vi = ifp->if_softc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; int mtu = -1, promisc = -1, allmulti = -1, vlanex = -1; ASSERT_SYNCHRONIZED_OP(sc); KASSERT(flags, ("%s: not told what to update.", __func__)); if (flags & XGMAC_MTU) mtu = ifp->if_mtu; if (flags & XGMAC_PROMISC) promisc = ifp->if_flags & IFF_PROMISC ? 1 : 0; if (flags & XGMAC_ALLMULTI) allmulti = ifp->if_flags & IFF_ALLMULTI ? 1 : 0; if (flags & XGMAC_VLANEX) vlanex = ifp->if_capenable & IFCAP_VLAN_HWTAGGING ? 1 : 0; if (flags & (XGMAC_MTU|XGMAC_PROMISC|XGMAC_ALLMULTI|XGMAC_VLANEX)) { rc = -t4_set_rxmode(sc, sc->mbox, vi->viid, mtu, promisc, allmulti, 1, vlanex, false); if (rc) { if_printf(ifp, "set_rxmode (%x) failed: %d\n", flags, rc); return (rc); } } if (flags & XGMAC_UCADDR) { uint8_t ucaddr[ETHER_ADDR_LEN]; bcopy(IF_LLADDR(ifp), ucaddr, sizeof(ucaddr)); rc = t4_change_mac(sc, sc->mbox, vi->viid, vi->xact_addr_filt, ucaddr, true, true); if (rc < 0) { rc = -rc; if_printf(ifp, "change_mac failed: %d\n", rc); return (rc); } else { vi->xact_addr_filt = rc; rc = 0; } } if (flags & XGMAC_MCADDRS) { const uint8_t *mcaddr[FW_MAC_EXACT_CHUNK]; int del = 1; uint64_t hash = 0; struct ifmultiaddr *ifma; int i = 0, j; if_maddr_rlock(ifp); TAILQ_FOREACH(ifma, &ifp->if_multiaddrs, ifma_link) { if (ifma->ifma_addr->sa_family != AF_LINK) continue; mcaddr[i] = LLADDR((struct sockaddr_dl *)ifma->ifma_addr); MPASS(ETHER_IS_MULTICAST(mcaddr[i])); i++; if (i == FW_MAC_EXACT_CHUNK) { rc = t4_alloc_mac_filt(sc, sc->mbox, vi->viid, del, i, mcaddr, NULL, &hash, 0); if (rc < 0) { rc = -rc; for (j = 0; j < i; j++) { if_printf(ifp, "failed to add mc address" " %02x:%02x:%02x:" "%02x:%02x:%02x rc=%d\n", mcaddr[j][0], mcaddr[j][1], mcaddr[j][2], mcaddr[j][3], mcaddr[j][4], mcaddr[j][5], rc); } goto mcfail; } del = 0; i = 0; } } if (i > 0) { rc = t4_alloc_mac_filt(sc, sc->mbox, vi->viid, del, i, mcaddr, NULL, &hash, 0); if (rc < 0) { rc = -rc; for (j = 0; j < i; j++) { if_printf(ifp, "failed to add mc address" " %02x:%02x:%02x:" "%02x:%02x:%02x rc=%d\n", mcaddr[j][0], mcaddr[j][1], mcaddr[j][2], mcaddr[j][3], mcaddr[j][4], mcaddr[j][5], rc); } goto mcfail; } } rc = -t4_set_addr_hash(sc, sc->mbox, vi->viid, 0, hash, 0); if (rc != 0) if_printf(ifp, "failed to set mc address hash: %d", rc); mcfail: if_maddr_runlock(ifp); } return (rc); } /* * {begin|end}_synchronized_op must be called from the same thread. */ int begin_synchronized_op(struct adapter *sc, struct vi_info *vi, int flags, char *wmesg) { int rc, pri; #ifdef WITNESS /* the caller thinks it's ok to sleep, but is it really? 
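 * WITNESS_WARN() will complain at run time if this is reached with a non-sleepable lock held.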
*/ if (flags & SLEEP_OK) WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL, "begin_synchronized_op"); #endif if (INTR_OK) pri = PCATCH; else pri = 0; ADAPTER_LOCK(sc); for (;;) { if (vi && IS_DOOMED(vi)) { rc = ENXIO; goto done; } if (!IS_BUSY(sc)) { rc = 0; break; } if (!(flags & SLEEP_OK)) { rc = EBUSY; goto done; } if (mtx_sleep(&sc->flags, &sc->sc_lock, pri, wmesg, 0)) { rc = EINTR; goto done; } } KASSERT(!IS_BUSY(sc), ("%s: controller busy.", __func__)); SET_BUSY(sc); #ifdef INVARIANTS sc->last_op = wmesg; sc->last_op_thr = curthread; sc->last_op_flags = flags; #endif done: if (!(flags & HOLD_LOCK) || rc) ADAPTER_UNLOCK(sc); return (rc); } /* * Tell if_ioctl and if_init that the VI is going away. This is * special variant of begin_synchronized_op and must be paired with a * call to end_synchronized_op. */ void doom_vi(struct adapter *sc, struct vi_info *vi) { ADAPTER_LOCK(sc); SET_DOOMED(vi); wakeup(&sc->flags); while (IS_BUSY(sc)) mtx_sleep(&sc->flags, &sc->sc_lock, 0, "t4detach", 0); SET_BUSY(sc); #ifdef INVARIANTS sc->last_op = "t4detach"; sc->last_op_thr = curthread; sc->last_op_flags = 0; #endif ADAPTER_UNLOCK(sc); } /* * {begin|end}_synchronized_op must be called from the same thread. */ void end_synchronized_op(struct adapter *sc, int flags) { if (flags & LOCK_HELD) ADAPTER_LOCK_ASSERT_OWNED(sc); else ADAPTER_LOCK(sc); KASSERT(IS_BUSY(sc), ("%s: controller not busy.", __func__)); CLR_BUSY(sc); wakeup(&sc->flags); ADAPTER_UNLOCK(sc); } static int cxgbe_init_synchronized(struct vi_info *vi) { struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct ifnet *ifp = vi->ifp; int rc = 0, i; struct sge_txq *txq; ASSERT_SYNCHRONIZED_OP(sc); if (ifp->if_drv_flags & IFF_DRV_RUNNING) return (0); /* already running */ if (!(sc->flags & FULL_INIT_DONE) && ((rc = adapter_full_init(sc)) != 0)) return (rc); /* error message displayed already */ if (!(vi->flags & VI_INIT_DONE) && ((rc = vi_full_init(vi)) != 0)) return (rc); /* error message displayed already */ rc = update_mac_settings(ifp, XGMAC_ALL); if (rc) goto done; /* error message displayed already */ rc = -t4_enable_vi(sc, sc->mbox, vi->viid, true, true); if (rc != 0) { if_printf(ifp, "enable_vi failed: %d\n", rc); goto done; } /* * Can't fail from this point onwards. Review cxgbe_uninit_synchronized * if this changes. */ for_each_txq(vi, i, txq) { TXQ_LOCK(txq); txq->eq.flags |= EQ_ENABLED; TXQ_UNLOCK(txq); } /* * The first iq of the first port to come up is used for tracing. */ if (sc->traceq < 0 && IS_MAIN_VI(vi)) { sc->traceq = sc->sge.rxq[vi->first_rxq].iq.abs_id; t4_write_reg(sc, is_t4(sc) ? A_MPS_TRC_RSS_CONTROL : A_MPS_T5_TRC_RSS_CONTROL, V_RSSCONTROL(pi->tx_chan) | V_QUEUENUMBER(sc->traceq)); pi->flags |= HAS_TRACEQ; } /* all ok */ PORT_LOCK(pi); ifp->if_drv_flags |= IFF_DRV_RUNNING; pi->up_vis++; if (pi->nvi > 1 || sc->flags & IS_VF) callout_reset(&vi->tick, hz, vi_tick, vi); else callout_reset(&pi->tick, hz, cxgbe_tick, pi); PORT_UNLOCK(pi); done: if (rc != 0) cxgbe_uninit_synchronized(vi); return (rc); } /* * Idempotent. */ static int cxgbe_uninit_synchronized(struct vi_info *vi) { struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct ifnet *ifp = vi->ifp; int rc, i; struct sge_txq *txq; ASSERT_SYNCHRONIZED_OP(sc); if (!(vi->flags & VI_INIT_DONE)) { KASSERT(!(ifp->if_drv_flags & IFF_DRV_RUNNING), ("uninited VI is running")); return (0); } /* * Disable the VI so that all its data in either direction is discarded * by the MPS. 
Leave everything else (the queues, interrupts, and 1Hz * tick) intact as the TP can deliver negative advice or data that it's * holding in its RAM (for an offloaded connection) even after the VI is * disabled. */ rc = -t4_enable_vi(sc, sc->mbox, vi->viid, false, false); if (rc) { if_printf(ifp, "disable_vi failed: %d\n", rc); return (rc); } for_each_txq(vi, i, txq) { TXQ_LOCK(txq); txq->eq.flags &= ~EQ_ENABLED; TXQ_UNLOCK(txq); } PORT_LOCK(pi); if (pi->nvi > 1 || sc->flags & IS_VF) callout_stop(&vi->tick); else callout_stop(&pi->tick); if (!(ifp->if_drv_flags & IFF_DRV_RUNNING)) { PORT_UNLOCK(pi); return (0); } ifp->if_drv_flags &= ~IFF_DRV_RUNNING; pi->up_vis--; if (pi->up_vis > 0) { PORT_UNLOCK(pi); return (0); } PORT_UNLOCK(pi); pi->link_cfg.link_ok = 0; pi->link_cfg.speed = 0; pi->linkdnrc = -1; t4_os_link_changed(sc, pi->port_id, 0, -1); return (0); } /* * It is ok for this function to fail midway and return right away. t4_detach * will walk the entire sc->irq list and clean up whatever is valid. */ int t4_setup_intr_handlers(struct adapter *sc) { int rc, rid, p, q, v; char s[8]; struct irq *irq; struct port_info *pi; struct vi_info *vi; struct sge *sge = &sc->sge; struct sge_rxq *rxq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; #endif #ifdef DEV_NETMAP struct sge_nm_rxq *nm_rxq; #endif #ifdef RSS int nbuckets = rss_getnumbuckets(); #endif /* * Setup interrupts. */ irq = &sc->irq[0]; rid = sc->intr_type == INTR_INTX ? 0 : 1; if (sc->intr_count == 1) return (t4_alloc_irq(sc, irq, rid, t4_intr_all, sc, "all")); /* Multiple interrupts. */ if (sc->flags & IS_VF) KASSERT(sc->intr_count >= T4VF_EXTRA_INTR + sc->params.nports, ("%s: too few intr.", __func__)); else KASSERT(sc->intr_count >= T4_EXTRA_INTR + sc->params.nports, ("%s: too few intr.", __func__)); /* The first one is always error intr on PFs */ if (!(sc->flags & IS_VF)) { rc = t4_alloc_irq(sc, irq, rid, t4_intr_err, sc, "err"); if (rc != 0) return (rc); irq++; rid++; } /* The second one is always the firmware event queue (first on VFs) */ rc = t4_alloc_irq(sc, irq, rid, t4_intr_evt, &sge->fwq, "evt"); if (rc != 0) return (rc); irq++; rid++; for_each_port(sc, p) { pi = sc->port[p]; for_each_vi(pi, v, vi) { vi->first_intr = rid - 1; if (vi->nnmrxq > 0) { int n = max(vi->nrxq, vi->nnmrxq); MPASS(vi->flags & INTR_RXQ); rxq = &sge->rxq[vi->first_rxq]; #ifdef DEV_NETMAP nm_rxq = &sge->nm_rxq[vi->first_nm_rxq]; #endif for (q = 0; q < n; q++) { snprintf(s, sizeof(s), "%x%c%x", p, 'a' + v, q); if (q < vi->nrxq) irq->rxq = rxq++; #ifdef DEV_NETMAP if (q < vi->nnmrxq) irq->nm_rxq = nm_rxq++; #endif rc = t4_alloc_irq(sc, irq, rid, t4_vi_intr, irq, s); if (rc != 0) return (rc); irq++; rid++; vi->nintr++; } } else if (vi->flags & INTR_RXQ) { for_each_rxq(vi, q, rxq) { snprintf(s, sizeof(s), "%x%c%x", p, 'a' + v, q); rc = t4_alloc_irq(sc, irq, rid, t4_intr, rxq, s); if (rc != 0) return (rc); #ifdef RSS bus_bind_intr(sc->dev, irq->res, rss_getcpu(q % nbuckets)); #endif irq++; rid++; vi->nintr++; } } #ifdef TCP_OFFLOAD if (vi->flags & INTR_OFLD_RXQ) { for_each_ofld_rxq(vi, q, ofld_rxq) { snprintf(s, sizeof(s), "%x%c%x", p, 'A' + v, q); rc = t4_alloc_irq(sc, irq, rid, t4_intr, ofld_rxq, s); if (rc != 0) return (rc); irq++; rid++; vi->nintr++; } } #endif } } MPASS(irq == &sc->irq[sc->intr_count]); return (0); } int adapter_full_init(struct adapter *sc) { int rc, i; ASSERT_SYNCHRONIZED_OP(sc); ADAPTER_LOCK_ASSERT_NOTOWNED(sc); KASSERT((sc->flags & FULL_INIT_DONE) == 0, ("%s: FULL_INIT_DONE already", __func__)); /* * queues that belong to the 
adapter (not any particular port). */ rc = t4_setup_adapter_queues(sc); if (rc != 0) goto done; for (i = 0; i < nitems(sc->tq); i++) { sc->tq[i] = taskqueue_create("t4 taskq", M_NOWAIT, taskqueue_thread_enqueue, &sc->tq[i]); if (sc->tq[i] == NULL) { device_printf(sc->dev, "failed to allocate task queue %d\n", i); rc = ENOMEM; goto done; } taskqueue_start_threads(&sc->tq[i], 1, PI_NET, "%s tq%d", device_get_nameunit(sc->dev), i); } if (!(sc->flags & IS_VF)) t4_intr_enable(sc); sc->flags |= FULL_INIT_DONE; done: if (rc != 0) adapter_full_uninit(sc); return (rc); } int adapter_full_uninit(struct adapter *sc) { int i; ADAPTER_LOCK_ASSERT_NOTOWNED(sc); t4_teardown_adapter_queues(sc); for (i = 0; i < nitems(sc->tq) && sc->tq[i]; i++) { taskqueue_free(sc->tq[i]); sc->tq[i] = NULL; } sc->flags &= ~FULL_INIT_DONE; return (0); } #ifdef RSS #define SUPPORTED_RSS_HASHTYPES (RSS_HASHTYPE_RSS_IPV4 | \ RSS_HASHTYPE_RSS_TCP_IPV4 | RSS_HASHTYPE_RSS_IPV6 | \ RSS_HASHTYPE_RSS_TCP_IPV6 | RSS_HASHTYPE_RSS_UDP_IPV4 | \ RSS_HASHTYPE_RSS_UDP_IPV6) /* Translates kernel hash types to hardware. */ static int hashconfig_to_hashen(int hashconfig) { int hashen = 0; if (hashconfig & RSS_HASHTYPE_RSS_IPV4) hashen |= F_FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN; if (hashconfig & RSS_HASHTYPE_RSS_IPV6) hashen |= F_FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN; if (hashconfig & RSS_HASHTYPE_RSS_UDP_IPV4) { hashen |= F_FW_RSS_VI_CONFIG_CMD_UDPEN | F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN; } if (hashconfig & RSS_HASHTYPE_RSS_UDP_IPV6) { hashen |= F_FW_RSS_VI_CONFIG_CMD_UDPEN | F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN; } if (hashconfig & RSS_HASHTYPE_RSS_TCP_IPV4) hashen |= F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN; if (hashconfig & RSS_HASHTYPE_RSS_TCP_IPV6) hashen |= F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN; return (hashen); } /* Translates hardware hash types to kernel. */ static int hashen_to_hashconfig(int hashen) { int hashconfig = 0; if (hashen & F_FW_RSS_VI_CONFIG_CMD_UDPEN) { /* * If UDP hashing was enabled it must have been enabled for * either IPv4 or IPv6 (inclusive or). Enabling UDP without * enabling any 4-tuple hash is nonsense configuration. */ MPASS(hashen & (F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN | F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN)); if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN) hashconfig |= RSS_HASHTYPE_RSS_UDP_IPV4; if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN) hashconfig |= RSS_HASHTYPE_RSS_UDP_IPV6; } if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN) hashconfig |= RSS_HASHTYPE_RSS_TCP_IPV4; if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN) hashconfig |= RSS_HASHTYPE_RSS_TCP_IPV6; if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN) hashconfig |= RSS_HASHTYPE_RSS_IPV4; if (hashen & F_FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN) hashconfig |= RSS_HASHTYPE_RSS_IPV6; return (hashconfig); } #endif int vi_full_init(struct vi_info *vi) { struct adapter *sc = vi->pi->adapter; struct ifnet *ifp = vi->ifp; uint16_t *rss; struct sge_rxq *rxq; int rc, i, j, hashen; #ifdef RSS int nbuckets = rss_getnumbuckets(); int hashconfig = rss_gethashconfig(); int extra; uint32_t raw_rss_key[RSS_KEYSIZE / sizeof(uint32_t)]; uint32_t rss_key[RSS_KEYSIZE / sizeof(uint32_t)]; #endif ASSERT_SYNCHRONIZED_OP(sc); KASSERT((vi->flags & VI_INIT_DONE) == 0, ("%s: VI_INIT_DONE already", __func__)); sysctl_ctx_init(&vi->ctx); vi->flags |= VI_SYSCTL_CTX; /* * Allocate tx/rx/fl queues for this VI. */ rc = t4_setup_vi_queues(vi); if (rc != 0) goto done; /* error message displayed already */ /* * Setup RSS for this VI. Save a copy of the RSS table for later use. 
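 * Each of the vi->rss_size slots in the indirection table is filled with the absolute id of one of this VI's rx queues: the kernel's bucket-to-queue mapping is used with options RSS, plain round-robin otherwise.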
*/ if (vi->nrxq > vi->rss_size) { if_printf(ifp, "nrxq (%d) > hw RSS table size (%d); " "some queues will never receive traffic.\n", vi->nrxq, vi->rss_size); } else if (vi->rss_size % vi->nrxq) { if_printf(ifp, "nrxq (%d), hw RSS table size (%d); " "expect uneven traffic distribution.\n", vi->nrxq, vi->rss_size); } #ifdef RSS MPASS(RSS_KEYSIZE == 40); if (vi->nrxq != nbuckets) { if_printf(ifp, "nrxq (%d) != kernel RSS buckets (%d); " "performance will be impacted.\n", vi->nrxq, nbuckets); } rss_getkey((void *)&raw_rss_key[0]); for (i = 0; i < nitems(rss_key); i++) { rss_key[i] = htobe32(raw_rss_key[nitems(rss_key) - 1 - i]); } t4_write_rss_key(sc, &rss_key[0], -1); #endif rss = malloc(vi->rss_size * sizeof (*rss), M_CXGBE, M_ZERO | M_WAITOK); for (i = 0; i < vi->rss_size;) { #ifdef RSS j = rss_get_indirection_to_bucket(i); j %= vi->nrxq; rxq = &sc->sge.rxq[vi->first_rxq + j]; rss[i++] = rxq->iq.abs_id; #else for_each_rxq(vi, j, rxq) { rss[i++] = rxq->iq.abs_id; if (i == vi->rss_size) break; } #endif } rc = -t4_config_rss_range(sc, sc->mbox, vi->viid, 0, vi->rss_size, rss, vi->rss_size); if (rc != 0) { if_printf(ifp, "rss_config failed: %d\n", rc); goto done; } #ifdef RSS hashen = hashconfig_to_hashen(hashconfig); /* * We may have had to enable some hashes even though the global config * wants them disabled. This is a potential problem that must be * reported to the user. */ extra = hashen_to_hashconfig(hashen) ^ hashconfig; /* * If we consider only the supported hash types, then the enabled hashes * are a superset of the requested hashes. In other words, there cannot * be any supported hash that was requested but not enabled, but there * can be hashes that were not requested but had to be enabled. */ extra &= SUPPORTED_RSS_HASHTYPES; MPASS((extra & hashconfig) == 0); if (extra) { if_printf(ifp, "global RSS config (0x%x) cannot be accommodated.\n", hashconfig); } if (extra & RSS_HASHTYPE_RSS_IPV4) if_printf(ifp, "IPv4 2-tuple hashing forced on.\n"); if (extra & RSS_HASHTYPE_RSS_TCP_IPV4) if_printf(ifp, "TCP/IPv4 4-tuple hashing forced on.\n"); if (extra & RSS_HASHTYPE_RSS_IPV6) if_printf(ifp, "IPv6 2-tuple hashing forced on.\n"); if (extra & RSS_HASHTYPE_RSS_TCP_IPV6) if_printf(ifp, "TCP/IPv6 4-tuple hashing forced on.\n"); if (extra & RSS_HASHTYPE_RSS_UDP_IPV4) if_printf(ifp, "UDP/IPv4 4-tuple hashing forced on.\n"); if (extra & RSS_HASHTYPE_RSS_UDP_IPV6) if_printf(ifp, "UDP/IPv6 4-tuple hashing forced on.\n"); #else hashen = F_FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN | F_FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN | F_FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN | F_FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN | F_FW_RSS_VI_CONFIG_CMD_UDPEN; #endif rc = -t4_config_vi_rss(sc, sc->mbox, vi->viid, hashen, rss[0]); if (rc != 0) { if_printf(ifp, "rss hash/defaultq config failed: %d\n", rc); goto done; } vi->rss = rss; vi->flags |= VI_INIT_DONE; done: if (rc != 0) vi_full_uninit(vi); return (rc); } /* * Idempotent. */ int vi_full_uninit(struct vi_info *vi) { struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; int i; struct sge_rxq *rxq; struct sge_txq *txq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; struct sge_wrq *ofld_txq; #endif if (vi->flags & VI_INIT_DONE) { /* Need to quiesce queues. */ /* XXX: Only for the first VI?
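 * (The control queue is per-port and is quiesced only when the main VI of the port goes away.)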
*/ if (IS_MAIN_VI(vi) && !(sc->flags & IS_VF)) quiesce_wrq(sc, &sc->sge.ctrlq[pi->port_id]); for_each_txq(vi, i, txq) { quiesce_txq(sc, txq); } #ifdef TCP_OFFLOAD for_each_ofld_txq(vi, i, ofld_txq) { quiesce_wrq(sc, ofld_txq); } #endif for_each_rxq(vi, i, rxq) { quiesce_iq(sc, &rxq->iq); quiesce_fl(sc, &rxq->fl); } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { quiesce_iq(sc, &ofld_rxq->iq); quiesce_fl(sc, &ofld_rxq->fl); } #endif free(vi->rss, M_CXGBE); free(vi->nm_rss, M_CXGBE); } t4_teardown_vi_queues(vi); vi->flags &= ~VI_INIT_DONE; return (0); } static void quiesce_txq(struct adapter *sc, struct sge_txq *txq) { struct sge_eq *eq = &txq->eq; struct sge_qstat *spg = (void *)&eq->desc[eq->sidx]; (void) sc; /* unused */ #ifdef INVARIANTS TXQ_LOCK(txq); MPASS((eq->flags & EQ_ENABLED) == 0); TXQ_UNLOCK(txq); #endif /* Wait for the mp_ring to empty. */ while (!mp_ring_is_idle(txq->r)) { mp_ring_check_drainage(txq->r, 0); pause("rquiesce", 1); } /* Then wait for the hardware to finish. */ while (spg->cidx != htobe16(eq->pidx)) pause("equiesce", 1); /* Finally, wait for the driver to reclaim all descriptors. */ while (eq->cidx != eq->pidx) pause("dquiesce", 1); } static void quiesce_wrq(struct adapter *sc, struct sge_wrq *wrq) { /* XXXTX */ } static void quiesce_iq(struct adapter *sc, struct sge_iq *iq) { (void) sc; /* unused */ /* Synchronize with the interrupt handler */ while (!atomic_cmpset_int(&iq->state, IQS_IDLE, IQS_DISABLED)) pause("iqfree", 1); } static void quiesce_fl(struct adapter *sc, struct sge_fl *fl) { mtx_lock(&sc->sfl_lock); FL_LOCK(fl); fl->flags |= FL_DOOMED; FL_UNLOCK(fl); callout_stop(&sc->sfl_callout); mtx_unlock(&sc->sfl_lock); KASSERT((fl->flags & FL_STARVING) == 0, ("%s: still starving", __func__)); } static int t4_alloc_irq(struct adapter *sc, struct irq *irq, int rid, driver_intr_t *handler, void *arg, char *name) { int rc; irq->rid = rid; irq->res = bus_alloc_resource_any(sc->dev, SYS_RES_IRQ, &irq->rid, RF_SHAREABLE | RF_ACTIVE); if (irq->res == NULL) { device_printf(sc->dev, "failed to allocate IRQ for rid %d, name %s.\n", rid, name); return (ENOMEM); } rc = bus_setup_intr(sc->dev, irq->res, INTR_MPSAFE | INTR_TYPE_NET, NULL, handler, arg, &irq->tag); if (rc != 0) { device_printf(sc->dev, "failed to setup interrupt for rid %d, name %s: %d\n", rid, name, rc); } else if (name) bus_describe_intr(sc->dev, irq->res, irq->tag, "%s", name); return (rc); } static int t4_free_irq(struct adapter *sc, struct irq *irq) { if (irq->tag) bus_teardown_intr(sc->dev, irq->res, irq->tag); if (irq->res) bus_release_resource(sc->dev, SYS_RES_IRQ, irq->rid, irq->res); bzero(irq, sizeof(*irq)); return (0); } static void get_regs(struct adapter *sc, struct t4_regdump *regs, uint8_t *buf) { regs->version = chip_id(sc) | chip_rev(sc) << 10; t4_get_regs(sc, buf, regs->len); } #define A_PL_INDIR_CMD 0x1f8 #define S_PL_AUTOINC 31 #define M_PL_AUTOINC 0x1U #define V_PL_AUTOINC(x) ((x) << S_PL_AUTOINC) #define G_PL_AUTOINC(x) (((x) >> S_PL_AUTOINC) & M_PL_AUTOINC) #define S_PL_VFID 20 #define M_PL_VFID 0xffU #define V_PL_VFID(x) ((x) << S_PL_VFID) #define G_PL_VFID(x) (((x) >> S_PL_VFID) & M_PL_VFID) #define S_PL_ADDR 0 #define M_PL_ADDR 0xfffffU #define V_PL_ADDR(x) ((x) << S_PL_ADDR) #define G_PL_ADDR(x) (((x) >> S_PL_ADDR) & M_PL_ADDR) #define A_PL_INDIR_DATA 0x1fc static uint64_t read_vf_stat(struct adapter *sc, unsigned int viid, int reg) { u32 stats[2]; mtx_assert(&sc->reg_lock, MA_OWNED); if (sc->flags & IS_VF) { stats[0] = t4_read_reg(sc, VF_MPS_REG(reg)); stats[1] = 
t4_read_reg(sc, VF_MPS_REG(reg + 4)); } else { t4_write_reg(sc, A_PL_INDIR_CMD, V_PL_AUTOINC(1) | V_PL_VFID(G_FW_VIID_VIN(viid)) | V_PL_ADDR(VF_MPS_REG(reg))); stats[0] = t4_read_reg(sc, A_PL_INDIR_DATA); stats[1] = t4_read_reg(sc, A_PL_INDIR_DATA); } return (((uint64_t)stats[1]) << 32 | stats[0]); } static void t4_get_vi_stats(struct adapter *sc, unsigned int viid, struct fw_vi_stats_vf *stats) { #define GET_STAT(name) \ read_vf_stat(sc, viid, A_MPS_VF_STAT_##name##_L) stats->tx_bcast_bytes = GET_STAT(TX_VF_BCAST_BYTES); stats->tx_bcast_frames = GET_STAT(TX_VF_BCAST_FRAMES); stats->tx_mcast_bytes = GET_STAT(TX_VF_MCAST_BYTES); stats->tx_mcast_frames = GET_STAT(TX_VF_MCAST_FRAMES); stats->tx_ucast_bytes = GET_STAT(TX_VF_UCAST_BYTES); stats->tx_ucast_frames = GET_STAT(TX_VF_UCAST_FRAMES); stats->tx_drop_frames = GET_STAT(TX_VF_DROP_FRAMES); stats->tx_offload_bytes = GET_STAT(TX_VF_OFFLOAD_BYTES); stats->tx_offload_frames = GET_STAT(TX_VF_OFFLOAD_FRAMES); stats->rx_bcast_bytes = GET_STAT(RX_VF_BCAST_BYTES); stats->rx_bcast_frames = GET_STAT(RX_VF_BCAST_FRAMES); stats->rx_mcast_bytes = GET_STAT(RX_VF_MCAST_BYTES); stats->rx_mcast_frames = GET_STAT(RX_VF_MCAST_FRAMES); stats->rx_ucast_bytes = GET_STAT(RX_VF_UCAST_BYTES); stats->rx_ucast_frames = GET_STAT(RX_VF_UCAST_FRAMES); stats->rx_err_frames = GET_STAT(RX_VF_ERR_FRAMES); #undef GET_STAT } static void t4_clr_vi_stats(struct adapter *sc, unsigned int viid) { int reg; t4_write_reg(sc, A_PL_INDIR_CMD, V_PL_AUTOINC(1) | V_PL_VFID(G_FW_VIID_VIN(viid)) | V_PL_ADDR(VF_MPS_REG(A_MPS_VF_STAT_TX_VF_BCAST_BYTES_L))); for (reg = A_MPS_VF_STAT_TX_VF_BCAST_BYTES_L; reg <= A_MPS_VF_STAT_RX_VF_ERR_FRAMES_H; reg += 4) t4_write_reg(sc, A_PL_INDIR_DATA, 0); } static void vi_refresh_stats(struct adapter *sc, struct vi_info *vi) { struct timeval tv; const struct timeval interval = {0, 250000}; /* 250ms */ if (!(vi->flags & VI_INIT_DONE)) return; getmicrotime(&tv); timevalsub(&tv, &interval); if (timevalcmp(&tv, &vi->last_refreshed, <)) return; mtx_lock(&sc->reg_lock); t4_get_vi_stats(sc, vi->viid, &vi->stats); getmicrotime(&vi->last_refreshed); mtx_unlock(&sc->reg_lock); } static void cxgbe_refresh_stats(struct adapter *sc, struct port_info *pi) { int i; u_int v, tnl_cong_drops; struct timeval tv; const struct timeval interval = {0, 250000}; /* 250ms */ getmicrotime(&tv); timevalsub(&tv, &interval); if (timevalcmp(&tv, &pi->last_refreshed, <)) return; tnl_cong_drops = 0; t4_get_port_stats(sc, pi->tx_chan, &pi->stats); for (i = 0; i < sc->chip_params->nchan; i++) { if (pi->rx_chan_map & (1 << i)) { mtx_lock(&sc->reg_lock); t4_read_indirect(sc, A_TP_MIB_INDEX, A_TP_MIB_DATA, &v, 1, A_TP_MIB_TNL_CNG_DROP_0 + i); mtx_unlock(&sc->reg_lock); tnl_cong_drops += v; } } pi->tnl_cong_drops = tnl_cong_drops; getmicrotime(&pi->last_refreshed); } static void cxgbe_tick(void *arg) { struct port_info *pi = arg; struct adapter *sc = pi->adapter; PORT_LOCK_ASSERT_OWNED(pi); cxgbe_refresh_stats(sc, pi); callout_schedule(&pi->tick, hz); } void vi_tick(void *arg) { struct vi_info *vi = arg; struct adapter *sc = vi->pi->adapter; vi_refresh_stats(sc, vi); callout_schedule(&vi->tick, hz); } static void cxgbe_vlan_config(void *arg, struct ifnet *ifp, uint16_t vid) { struct ifnet *vlan; if (arg != ifp || ifp->if_type != IFT_ETHER) return; vlan = VLAN_DEVAT(ifp, vid); VLAN_SETCOOKIE(vlan, ifp); } /* * Should match fw_caps_config_ enums in t4fw_interface.h */ static char *caps_decoder[] = { "\20\001IPMI\002NCSI", /* 0: NBM */ "\20\001PPP\002QFC\003DCBX", /* 1: link */ 
"\20\001INGRESS\002EGRESS", /* 2: switch */ "\20\001NIC\002VM\003IDS\004UM\005UM_ISGL" /* 3: NIC */ "\006HASHFILTER\007ETHOFLD", "\20\001TOE", /* 4: TOE */ "\20\001RDDP\002RDMAC", /* 5: RDMA */ "\20\001INITIATOR_PDU\002TARGET_PDU" /* 6: iSCSI */ "\003INITIATOR_CNXOFLD\004TARGET_CNXOFLD" "\005INITIATOR_SSNOFLD\006TARGET_SSNOFLD" "\007T10DIF" "\010INITIATOR_CMDOFLD\011TARGET_CMDOFLD", "\20\00KEYS", /* 7: TLS */ "\20\001INITIATOR\002TARGET\003CTRL_OFLD" /* 8: FCoE */ "\004PO_INITIATOR\005PO_TARGET", }; void t4_sysctls(struct adapter *sc) { struct sysctl_ctx_list *ctx; struct sysctl_oid *oid; struct sysctl_oid_list *children, *c0; static char *doorbells = {"\20\1UDB\2WCWR\3UDBWC\4KDB"}; ctx = device_get_sysctl_ctx(sc->dev); /* * dev.t4nex.X. */ oid = device_get_sysctl_tree(sc->dev); c0 = children = SYSCTL_CHILDREN(oid); sc->sc_do_rxcopy = 1; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "do_rx_copy", CTLFLAG_RW, &sc->sc_do_rxcopy, 1, "Do RX copy of small frames"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nports", CTLFLAG_RD, NULL, sc->params.nports, "# of ports"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "doorbells", CTLTYPE_STRING | CTLFLAG_RD, doorbells, sc->doorbells, sysctl_bitfield, "A", "available doorbells"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "core_clock", CTLFLAG_RD, NULL, sc->params.vpd.cclk, "core clock frequency (in KHz)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "holdoff_timers", CTLTYPE_STRING | CTLFLAG_RD, sc->params.sge.timer_val, sizeof(sc->params.sge.timer_val), sysctl_int_array, "A", "interrupt holdoff timer values (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "holdoff_pkt_counts", CTLTYPE_STRING | CTLFLAG_RD, sc->params.sge.counter_val, sizeof(sc->params.sge.counter_val), sysctl_int_array, "A", "interrupt holdoff packet counter values"); t4_sge_sysctls(sc, ctx, children); sc->lro_timeout = 100; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "lro_timeout", CTLFLAG_RW, &sc->lro_timeout, 0, "lro inactive-flush timeout (in us)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "debug_flags", CTLFLAG_RW, &sc->debug_flags, 0, "flags to enable runtime debugging"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "tp_version", CTLFLAG_RD, sc->tp_version, 0, "TP microcode version"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "firmware_version", CTLFLAG_RD, sc->fw_version, 0, "firmware version"); if (sc->flags & IS_VF) return; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "hw_revision", CTLFLAG_RD, NULL, chip_rev(sc), "chip hardware revision"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "sn", CTLFLAG_RD, sc->params.vpd.sn, 0, "serial number"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "pn", CTLFLAG_RD, sc->params.vpd.pn, 0, "part number"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "ec", CTLFLAG_RD, sc->params.vpd.ec, 0, "engineering change"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "na", CTLFLAG_RD, sc->params.vpd.na, 0, "network address"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "er_version", CTLFLAG_RD, sc->er_version, 0, "expansion ROM version"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "bs_version", CTLFLAG_RD, sc->bs_version, 0, "bootstrap firmware version"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "scfg_version", CTLFLAG_RD, NULL, sc->params.scfg_vers, "serial config version"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "vpd_version", CTLFLAG_RD, NULL, sc->params.vpd_vers, "VPD version"); SYSCTL_ADD_STRING(ctx, children, OID_AUTO, "cf", CTLFLAG_RD, sc->cfg_file, 0, "configuration file"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cfcsum", CTLFLAG_RD, NULL, sc->cfcsum, "config file checksum"); 
#define SYSCTL_CAP(name, n, text) \ SYSCTL_ADD_PROC(ctx, children, OID_AUTO, #name, \ CTLTYPE_STRING | CTLFLAG_RD, caps_decoder[n], sc->name, \ sysctl_bitfield, "A", "available " text " capabilities") SYSCTL_CAP(nbmcaps, 0, "NBM"); SYSCTL_CAP(linkcaps, 1, "link"); SYSCTL_CAP(switchcaps, 2, "switch"); SYSCTL_CAP(niccaps, 3, "NIC"); SYSCTL_CAP(toecaps, 4, "TCP offload"); SYSCTL_CAP(rdmacaps, 5, "RDMA"); SYSCTL_CAP(iscsicaps, 6, "iSCSI"); SYSCTL_CAP(tlscaps, 7, "TLS"); SYSCTL_CAP(fcoecaps, 8, "FCoE"); #undef SYSCTL_CAP SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nfilters", CTLFLAG_RD, NULL, sc->tids.nftids, "number of filters"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "temperature", CTLTYPE_INT | CTLFLAG_RD, sc, 0, sysctl_temperature, "I", "chip temperature (in Celsius)"); #ifdef SBUF_DRAIN /* * dev.t4nex.X.misc. Marked CTLFLAG_SKIP to avoid information overload. */ oid = SYSCTL_ADD_NODE(ctx, c0, OID_AUTO, "misc", CTLFLAG_RD | CTLFLAG_SKIP, NULL, "logs and miscellaneous information"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cctrl", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cctrl, "A", "congestion control"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_tp0", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cim_ibq_obq, "A", "CIM IBQ 0 (TP0)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_tp1", CTLTYPE_STRING | CTLFLAG_RD, sc, 1, sysctl_cim_ibq_obq, "A", "CIM IBQ 1 (TP1)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_ulp", CTLTYPE_STRING | CTLFLAG_RD, sc, 2, sysctl_cim_ibq_obq, "A", "CIM IBQ 2 (ULP)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_sge0", CTLTYPE_STRING | CTLFLAG_RD, sc, 3, sysctl_cim_ibq_obq, "A", "CIM IBQ 3 (SGE0)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_sge1", CTLTYPE_STRING | CTLFLAG_RD, sc, 4, sysctl_cim_ibq_obq, "A", "CIM IBQ 4 (SGE1)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ibq_ncsi", CTLTYPE_STRING | CTLFLAG_RD, sc, 5, sysctl_cim_ibq_obq, "A", "CIM IBQ 5 (NCSI)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_la", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, chip_id(sc) <= CHELSIO_T5 ? 
sysctl_cim_la : sysctl_cim_la_t6, "A", "CIM logic analyzer"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_ma_la", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cim_ma_la, "A", "CIM MA logic analyzer"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_ulp0", CTLTYPE_STRING | CTLFLAG_RD, sc, 0 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 0 (ULP0)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_ulp1", CTLTYPE_STRING | CTLFLAG_RD, sc, 1 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 1 (ULP1)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_ulp2", CTLTYPE_STRING | CTLFLAG_RD, sc, 2 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 2 (ULP2)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_ulp3", CTLTYPE_STRING | CTLFLAG_RD, sc, 3 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 3 (ULP3)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_sge", CTLTYPE_STRING | CTLFLAG_RD, sc, 4 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 4 (SGE)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_ncsi", CTLTYPE_STRING | CTLFLAG_RD, sc, 5 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 5 (NCSI)"); if (chip_id(sc) > CHELSIO_T4) { SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_sge0_rx", CTLTYPE_STRING | CTLFLAG_RD, sc, 6 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 6 (SGE0-RX)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_obq_sge1_rx", CTLTYPE_STRING | CTLFLAG_RD, sc, 7 + CIM_NUM_IBQ, sysctl_cim_ibq_obq, "A", "CIM OBQ 7 (SGE1-RX)"); } SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_pif_la", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cim_pif_la, "A", "CIM PIF logic analyzer"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cim_qcfg", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cim_qcfg, "A", "CIM queue configuration"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cpl_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_cpl_stats, "A", "CPL statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "ddp_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_ddp_stats, "A", "non-TCP DDP statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "devlog", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_devlog, "A", "firmware's device log"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "fcoe_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_fcoe_stats, "A", "FCoE statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "hw_sched", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_hw_sched, "A", "hardware scheduler "); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "l2t", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_l2t, "A", "hardware L2 table"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "lb_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_lb_stats, "A", "loopback statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "meminfo", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_meminfo, "A", "memory regions"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "mps_tcam", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, chip_id(sc) <= CHELSIO_T5 ? 
sysctl_mps_tcam : sysctl_mps_tcam_t6, "A", "MPS TCAM entries"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "path_mtus", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_path_mtus, "A", "path MTUs"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "pm_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_pm_stats, "A", "PM statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "rdma_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_rdma_stats, "A", "RDMA statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tcp_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tcp_stats, "A", "TCP statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tids", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tids, "A", "TID information"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tp_err_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tp_err_stats, "A", "TP error statistics"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tp_la_mask", CTLTYPE_INT | CTLFLAG_RW, sc, 0, sysctl_tp_la_mask, "I", "TP logic analyzer event capture mask"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tp_la", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tp_la, "A", "TP logic analyzer"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "tx_rate", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tx_rate, "A", "Tx rate"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "ulprx_la", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_ulprx_la, "A", "ULPRX logic analyzer"); if (is_t5(sc)) { SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "wcwr_stats", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_wcwr_stats, "A", "write combined work requests"); } #endif #ifdef TCP_OFFLOAD if (is_offload(sc)) { /* * dev.t4nex.X.toe. */ oid = SYSCTL_ADD_NODE(ctx, c0, OID_AUTO, "toe", CTLFLAG_RD, NULL, "TOE parameters"); children = SYSCTL_CHILDREN(oid); sc->tt.sndbuf = 256 * 1024; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "sndbuf", CTLFLAG_RW, &sc->tt.sndbuf, 0, "max hardware send buffer size"); sc->tt.ddp = 0; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "ddp", CTLFLAG_RW, &sc->tt.ddp, 0, "DDP allowed"); sc->tt.rx_coalesce = 1; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "rx_coalesce", CTLFLAG_RW, &sc->tt.rx_coalesce, 0, "receive coalescing"); sc->tt.tx_align = 1; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "tx_align", CTLFLAG_RW, &sc->tt.tx_align, 0, "chop and align payload"); sc->tt.tx_zcopy = 0; SYSCTL_ADD_INT(ctx, children, OID_AUTO, "tx_zcopy", CTLFLAG_RW, &sc->tt.tx_zcopy, 0, "Enable zero-copy aio_write(2)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "timer_tick", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, sysctl_tp_tick, "A", "TP timer tick (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "timestamp_tick", CTLTYPE_STRING | CTLFLAG_RD, sc, 1, sysctl_tp_tick, "A", "TCP timestamp tick (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "dack_tick", CTLTYPE_STRING | CTLFLAG_RD, sc, 2, sysctl_tp_tick, "A", "DACK tick (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "dack_timer", CTLTYPE_UINT | CTLFLAG_RD, sc, 0, sysctl_tp_dack_timer, "IU", "DACK timer (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "rexmt_min", CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_RXT_MIN, sysctl_tp_timer, "LU", "Retransmit min (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "rexmt_max", CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_RXT_MAX, sysctl_tp_timer, "LU", "Retransmit max (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "persist_min", CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_PERS_MIN, sysctl_tp_timer, "LU", "Persist timer min (us)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "persist_max", CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_PERS_MAX, sysctl_tp_timer, "LU", "Persist timer max (us)"); 
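/*
 * Like the retransmit and persist timers above, the four values below
 * are raw TP timer registers that sysctl_tp_timer converts to
 * microseconds before reporting.
 */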
        SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "keepalive_idle",
            CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_KEEP_IDLE,
            sysctl_tp_timer, "LU", "Keepalive idle timer (us)");
        SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "keepalive_intvl",
            CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_KEEP_INTVL,
            sysctl_tp_timer, "LU", "Keepalive interval (us)");
        SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "initial_srtt",
            CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_INIT_SRTT,
            sysctl_tp_timer, "LU", "Initial SRTT (us)");
        SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "finwait2_timer",
            CTLTYPE_ULONG | CTLFLAG_RD, sc, A_TP_FINWAIT2_TIMER,
            sysctl_tp_timer, "LU", "FINWAIT2 timer (us)");
    }
#endif
}

void
vi_sysctls(struct vi_info *vi)
{
    struct sysctl_ctx_list *ctx;
    struct sysctl_oid *oid;
    struct sysctl_oid_list *children;

    ctx = device_get_sysctl_ctx(vi->dev);

    /*
     * dev.v?(cxgbe|cxl).X.
     */
    oid = device_get_sysctl_tree(vi->dev);
    children = SYSCTL_CHILDREN(oid);

    SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "viid", CTLFLAG_RD, NULL,
        vi->viid, "VI identifier");
    SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nrxq", CTLFLAG_RD,
        &vi->nrxq, 0, "# of rx queues");
    SYSCTL_ADD_INT(ctx, children, OID_AUTO, "ntxq", CTLFLAG_RD,
        &vi->ntxq, 0, "# of tx queues");
    SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_rxq", CTLFLAG_RD,
        &vi->first_rxq, 0, "index of first rx queue");
    SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_txq", CTLFLAG_RD,
        &vi->first_txq, 0, "index of first tx queue");
    SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "rss_size", CTLFLAG_RD, NULL,
        vi->rss_size, "size of RSS indirection table");

    if (IS_MAIN_VI(vi)) {
        SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "rsrv_noflowq",
            CTLTYPE_INT | CTLFLAG_RW, vi, 0, sysctl_noflowq, "IU",
            "Reserve queue 0 for non-flowid packets");
    }

#ifdef TCP_OFFLOAD
    if (vi->nofldrxq != 0) {
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nofldrxq", CTLFLAG_RD,
            &vi->nofldrxq, 0,
            "# of rx queues for offloaded TCP connections");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nofldtxq", CTLFLAG_RD,
            &vi->nofldtxq, 0,
            "# of tx queues for offloaded TCP connections");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_ofld_rxq",
            CTLFLAG_RD, &vi->first_ofld_rxq, 0,
            "index of first TOE rx queue");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_ofld_txq",
            CTLFLAG_RD, &vi->first_ofld_txq, 0,
            "index of first TOE tx queue");
    }
#endif
#ifdef DEV_NETMAP
    if (vi->nnmrxq != 0) {
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nnmrxq", CTLFLAG_RD,
            &vi->nnmrxq, 0, "# of netmap rx queues");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "nnmtxq", CTLFLAG_RD,
            &vi->nnmtxq, 0, "# of netmap tx queues");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_nm_rxq",
            CTLFLAG_RD, &vi->first_nm_rxq, 0,
            "index of first netmap rx queue");
        SYSCTL_ADD_INT(ctx, children, OID_AUTO, "first_nm_txq",
            CTLFLAG_RD, &vi->first_nm_txq, 0,
            "index of first netmap tx queue");
    }
#endif

    SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "holdoff_tmr_idx",
        CTLTYPE_INT | CTLFLAG_RW, vi, 0, sysctl_holdoff_tmr_idx, "I",
        "holdoff timer index");
    SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "holdoff_pktc_idx",
        CTLTYPE_INT | CTLFLAG_RW, vi, 0, sysctl_holdoff_pktc_idx, "I",
        "holdoff packet counter index");

    SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "qsize_rxq",
        CTLTYPE_INT | CTLFLAG_RW, vi, 0, sysctl_qsize_rxq, "I",
        "rx queue size");
    SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "qsize_txq",
        CTLTYPE_INT | CTLFLAG_RW, vi, 0, sysctl_qsize_txq, "I",
        "tx queue size");
}

static void
cxgbe_sysctls(struct port_info *pi)
{
    struct sysctl_ctx_list *ctx;
    struct sysctl_oid *oid;
    struct sysctl_oid_list *children, *children2;
    struct adapter *sc = pi->adapter;
    int i;
    char name[16];

    ctx =
device_get_sysctl_ctx(pi->dev); /* * dev.cxgbe.X. */ oid = device_get_sysctl_tree(pi->dev); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "linkdnrc", CTLTYPE_STRING | CTLFLAG_RD, pi, 0, sysctl_linkdnrc, "A", "reason why link is down"); if (pi->port_type == FW_PORT_TYPE_BT_XAUI) { SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "temperature", CTLTYPE_INT | CTLFLAG_RD, pi, 0, sysctl_btphy, "I", "PHY temperature (in Celsius)"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "fw_version", CTLTYPE_INT | CTLFLAG_RD, pi, 1, sysctl_btphy, "I", "PHY firmware version"); } SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "pause_settings", CTLTYPE_STRING | CTLFLAG_RW, pi, PAUSE_TX, sysctl_pause_settings, "A", "PAUSE settings (bit 0 = rx_pause, bit 1 = tx_pause)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "max_speed", CTLFLAG_RD, NULL, port_top_speed(pi), "max speed (in Gbps)"); if (sc->flags & IS_VF) return; /* * dev.(cxgbe|cxl).X.tc. */ oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "tc", CTLFLAG_RD, NULL, "Tx scheduler traffic classes"); for (i = 0; i < sc->chip_params->nsched_cls; i++) { struct tx_sched_class *tc = &pi->tc[i]; snprintf(name, sizeof(name), "%d", i); children2 = SYSCTL_CHILDREN(SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, name, CTLFLAG_RD, NULL, "traffic class")); SYSCTL_ADD_UINT(ctx, children2, OID_AUTO, "flags", CTLFLAG_RD, &tc->flags, 0, "flags"); SYSCTL_ADD_UINT(ctx, children2, OID_AUTO, "refcount", CTLFLAG_RD, &tc->refcount, 0, "references to this class"); #ifdef SBUF_DRAIN SYSCTL_ADD_PROC(ctx, children2, OID_AUTO, "params", CTLTYPE_STRING | CTLFLAG_RD, sc, (pi->port_id << 16) | i, sysctl_tc_params, "A", "traffic class parameters"); #endif } /* * dev.cxgbe.X.stats. */ oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "stats", CTLFLAG_RD, NULL, "port statistics"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "tx_parse_error", CTLFLAG_RD, &pi->tx_parse_error, 0, "# of tx packets with invalid length or # of segments"); #define SYSCTL_ADD_T4_REG64(pi, name, desc, reg) \ SYSCTL_ADD_OID(ctx, children, OID_AUTO, name, \ CTLTYPE_U64 | CTLFLAG_RD, sc, reg, \ sysctl_handle_t4_reg64, "QU", desc) SYSCTL_ADD_T4_REG64(pi, "tx_octets", "# of octets in good frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_BYTES_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames", "total # of good frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_FRAMES_L)); SYSCTL_ADD_T4_REG64(pi, "tx_bcast_frames", "# of broadcast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_BCAST_L)); SYSCTL_ADD_T4_REG64(pi, "tx_mcast_frames", "# of multicast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_MCAST_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ucast_frames", "# of unicast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_UCAST_L)); SYSCTL_ADD_T4_REG64(pi, "tx_error_frames", "# of error frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_64", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_64B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_65_127", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_65B_127B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_128_255", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_128B_255B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_256_511", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_256B_511B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_512_1023", "# of tx frames in this range", PORT_REG(pi->tx_chan, 
A_MPS_PORT_STAT_TX_PORT_512B_1023B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_1024_1518", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_1024B_1518B_L)); SYSCTL_ADD_T4_REG64(pi, "tx_frames_1519_max", "# of tx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_1519B_MAX_L)); SYSCTL_ADD_T4_REG64(pi, "tx_drop", "# of dropped tx frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_DROP_L)); SYSCTL_ADD_T4_REG64(pi, "tx_pause", "# of pause frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PAUSE_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp0", "# of PPP prio 0 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP0_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp1", "# of PPP prio 1 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP1_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp2", "# of PPP prio 2 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP2_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp3", "# of PPP prio 3 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP3_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp4", "# of PPP prio 4 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP4_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp5", "# of PPP prio 5 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP5_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp6", "# of PPP prio 6 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP6_L)); SYSCTL_ADD_T4_REG64(pi, "tx_ppp7", "# of PPP prio 7 frames transmitted", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_TX_PORT_PPP7_L)); SYSCTL_ADD_T4_REG64(pi, "rx_octets", "# of octets in good frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_BYTES_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames", "total # of good frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_FRAMES_L)); SYSCTL_ADD_T4_REG64(pi, "rx_bcast_frames", "# of broadcast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_BCAST_L)); SYSCTL_ADD_T4_REG64(pi, "rx_mcast_frames", "# of multicast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_MCAST_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ucast_frames", "# of unicast frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_UCAST_L)); SYSCTL_ADD_T4_REG64(pi, "rx_too_long", "# of frames exceeding MTU", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_MTU_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "rx_jabber", "# of jabber frames", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_MTU_CRC_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "rx_fcs_err", "# of frames received with bad FCS", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_CRC_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "rx_len_err", "# of frames received with length error", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_LEN_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "rx_symbol_err", "symbol errors", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_SYM_ERROR_L)); SYSCTL_ADD_T4_REG64(pi, "rx_runt", "# of short frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_LESS_64B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_64", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_64B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_65_127", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_65B_127B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_128_255", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_128B_255B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_256_511", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_256B_511B_L)); SYSCTL_ADD_T4_REG64(pi, 
"rx_frames_512_1023", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_512B_1023B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_1024_1518", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_1024B_1518B_L)); SYSCTL_ADD_T4_REG64(pi, "rx_frames_1519_max", "# of rx frames in this range", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_1519B_MAX_L)); SYSCTL_ADD_T4_REG64(pi, "rx_pause", "# of pause frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PAUSE_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp0", "# of PPP prio 0 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP0_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp1", "# of PPP prio 1 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP1_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp2", "# of PPP prio 2 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP2_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp3", "# of PPP prio 3 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP3_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp4", "# of PPP prio 4 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP4_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp5", "# of PPP prio 5 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP5_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp6", "# of PPP prio 6 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP6_L)); SYSCTL_ADD_T4_REG64(pi, "rx_ppp7", "# of PPP prio 7 frames received", PORT_REG(pi->tx_chan, A_MPS_PORT_STAT_RX_PORT_PPP7_L)); #undef SYSCTL_ADD_T4_REG64 #define SYSCTL_ADD_T4_PORTSTAT(name, desc) \ SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, #name, CTLFLAG_RD, \ &pi->stats.name, desc) /* We get these from port_stats and they may be stale by up to 1s */ SYSCTL_ADD_T4_PORTSTAT(rx_ovflow0, "# drops due to buffer-group 0 overflows"); SYSCTL_ADD_T4_PORTSTAT(rx_ovflow1, "# drops due to buffer-group 1 overflows"); SYSCTL_ADD_T4_PORTSTAT(rx_ovflow2, "# drops due to buffer-group 2 overflows"); SYSCTL_ADD_T4_PORTSTAT(rx_ovflow3, "# drops due to buffer-group 3 overflows"); SYSCTL_ADD_T4_PORTSTAT(rx_trunc0, "# of buffer-group 0 truncated packets"); SYSCTL_ADD_T4_PORTSTAT(rx_trunc1, "# of buffer-group 1 truncated packets"); SYSCTL_ADD_T4_PORTSTAT(rx_trunc2, "# of buffer-group 2 truncated packets"); SYSCTL_ADD_T4_PORTSTAT(rx_trunc3, "# of buffer-group 3 truncated packets"); #undef SYSCTL_ADD_T4_PORTSTAT } static int sysctl_int_array(SYSCTL_HANDLER_ARGS) { int rc, *i, space = 0; struct sbuf sb; sbuf_new_for_sysctl(&sb, NULL, 64, req); for (i = arg1; arg2; arg2 -= sizeof(int), i++) { if (space) sbuf_printf(&sb, " "); sbuf_printf(&sb, "%d", *i); space = 1; } rc = sbuf_finish(&sb); sbuf_delete(&sb); return (rc); } static int sysctl_bitfield(SYSCTL_HANDLER_ARGS) { int rc; struct sbuf *sb; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return(rc); sb = sbuf_new_for_sysctl(NULL, NULL, 128, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "%b", (int)arg2, (char *)arg1); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_btphy(SYSCTL_HANDLER_ARGS) { struct port_info *pi = arg1; int op = arg2; struct adapter *sc = pi->adapter; u_int v; int rc; rc = begin_synchronized_op(sc, &pi->vi[0], SLEEP_OK | INTR_OK, "t4btt"); if (rc) return (rc); /* XXX: magic numbers */ rc = -t4_mdio_rd(sc, sc->mbox, pi->mdio_addr, 0x1e, op ? 
0x20 : 0xc820, &v); end_synchronized_op(sc, 0); if (rc) return (rc); if (op == 0) v /= 256; rc = sysctl_handle_int(oidp, &v, 0, req); return (rc); } static int sysctl_noflowq(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; int rc, val; val = vi->rsrv_noflowq; rc = sysctl_handle_int(oidp, &val, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if ((val >= 1) && (vi->ntxq > 1)) vi->rsrv_noflowq = 1; else vi->rsrv_noflowq = 0; return (rc); } static int sysctl_holdoff_tmr_idx(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct adapter *sc = vi->pi->adapter; int idx, rc, i; struct sge_rxq *rxq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; #endif uint8_t v; idx = vi->tmr_idx; rc = sysctl_handle_int(oidp, &idx, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if (idx < 0 || idx >= SGE_NTIMERS) return (EINVAL); rc = begin_synchronized_op(sc, vi, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4tmr"); if (rc) return (rc); v = V_QINTR_TIMER_IDX(idx) | V_QINTR_CNT_EN(vi->pktc_idx != -1); for_each_rxq(vi, i, rxq) { #ifdef atomic_store_rel_8 atomic_store_rel_8(&rxq->iq.intr_params, v); #else rxq->iq.intr_params = v; #endif } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { #ifdef atomic_store_rel_8 atomic_store_rel_8(&ofld_rxq->iq.intr_params, v); #else ofld_rxq->iq.intr_params = v; #endif } #endif vi->tmr_idx = idx; end_synchronized_op(sc, LOCK_HELD); return (0); } static int sysctl_holdoff_pktc_idx(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct adapter *sc = vi->pi->adapter; int idx, rc; idx = vi->pktc_idx; rc = sysctl_handle_int(oidp, &idx, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if (idx < -1 || idx >= SGE_NCOUNTERS) return (EINVAL); rc = begin_synchronized_op(sc, vi, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4pktc"); if (rc) return (rc); if (vi->flags & VI_INIT_DONE) rc = EBUSY; /* cannot be changed once the queues are created */ else vi->pktc_idx = idx; end_synchronized_op(sc, LOCK_HELD); return (rc); } static int sysctl_qsize_rxq(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct adapter *sc = vi->pi->adapter; int qsize, rc; qsize = vi->qsize_rxq; rc = sysctl_handle_int(oidp, &qsize, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if (qsize < 128 || (qsize & 7)) return (EINVAL); rc = begin_synchronized_op(sc, vi, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4rxqs"); if (rc) return (rc); if (vi->flags & VI_INIT_DONE) rc = EBUSY; /* cannot be changed once the queues are created */ else vi->qsize_rxq = qsize; end_synchronized_op(sc, LOCK_HELD); return (rc); } static int sysctl_qsize_txq(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct adapter *sc = vi->pi->adapter; int qsize, rc; qsize = vi->qsize_txq; rc = sysctl_handle_int(oidp, &qsize, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if (qsize < 128 || qsize > 65536) return (EINVAL); rc = begin_synchronized_op(sc, vi, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4txqs"); if (rc) return (rc); if (vi->flags & VI_INIT_DONE) rc = EBUSY; /* cannot be changed once the queues are created */ else vi->qsize_txq = qsize; end_synchronized_op(sc, LOCK_HELD); return (rc); } static int sysctl_pause_settings(SYSCTL_HANDLER_ARGS) { struct port_info *pi = arg1; struct adapter *sc = pi->adapter; struct link_config *lc = &pi->link_cfg; int rc; if (req->newptr == NULL) { struct sbuf *sb; static char *bits = "\20\1PAUSE_RX\2PAUSE_TX"; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return(rc); sb = sbuf_new_for_sysctl(NULL, NULL, 128, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "%b", lc->fc & (PAUSE_TX 
        | PAUSE_RX), bits);
        rc = sbuf_finish(sb);
        sbuf_delete(sb);
    } else {
        char s[2];
        int n;

        s[0] = '0' + (lc->requested_fc & (PAUSE_TX | PAUSE_RX));
        s[1] = 0;

        rc = sysctl_handle_string(oidp, s, sizeof(s), req);
        if (rc != 0)
            return (rc);

        if (s[1] != 0)
            return (EINVAL);
        if (s[0] < '0' || s[0] > '9')
            return (EINVAL);	/* not a number */
        n = s[0] - '0';
        if (n & ~(PAUSE_TX | PAUSE_RX))
            return (EINVAL);	/* some other bit is set too */

        rc = begin_synchronized_op(sc, &pi->vi[0], SLEEP_OK | INTR_OK,
            "t4PAUSE");
        if (rc)
            return (rc);
        if ((lc->requested_fc & (PAUSE_TX | PAUSE_RX)) != n) {
            int link_ok = lc->link_ok;

            lc->requested_fc &= ~(PAUSE_TX | PAUSE_RX);
            lc->requested_fc |= n;
            rc = -t4_link_l1cfg(sc, sc->mbox, pi->tx_chan, lc);
            lc->link_ok = link_ok;	/* restore */
        }
        end_synchronized_op(sc, 0);
    }

    return (rc);
}

static int
sysctl_handle_t4_reg64(SYSCTL_HANDLER_ARGS)
{
    struct adapter *sc = arg1;
    int reg = arg2;
    uint64_t val;

    val = t4_read_reg64(sc, reg);

    return (sysctl_handle_64(oidp, &val, 0, req));
}

static int
sysctl_temperature(SYSCTL_HANDLER_ARGS)
{
    struct adapter *sc = arg1;
    int rc, t;
    uint32_t param, val;

    rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4temp");
    if (rc)
        return (rc);
    param = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DEV) |
        V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DEV_DIAG) |
        V_FW_PARAMS_PARAM_Y(FW_PARAM_DEV_DIAG_TMP);
    rc = -t4_query_params(sc, sc->mbox, sc->pf, 0, 1, &param, &val);
    end_synchronized_op(sc, 0);
    if (rc)
        return (rc);

    /* unknown is returned as 0 but we display -1 in that case */
    t = val == 0 ? -1 : val;
    rc = sysctl_handle_int(oidp, &t, 0, req);
    return (rc);
}

#ifdef SBUF_DRAIN
static int
sysctl_cctrl(SYSCTL_HANDLER_ARGS)
{
    struct adapter *sc = arg1;
    struct sbuf *sb;
    int rc, i;
    uint16_t incr[NMTUS][NCCTRL_WIN];
    static const char *dec_fac[] = {
        "0.5", "0.5625", "0.625", "0.6875", "0.75", "0.8125", "0.875",
        "0.9375"
    };

    rc = sysctl_wire_old_buffer(req, 0);
    if (rc != 0)
        return (rc);

    sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req);
    if (sb == NULL)
        return (ENOMEM);

    t4_read_cong_tbl(sc, incr);

    for (i = 0; i < NCCTRL_WIN; ++i) {
        sbuf_printf(sb, "%2d: %4u %4u %4u %4u %4u %4u %4u %4u\n", i,
            incr[0][i], incr[1][i], incr[2][i], incr[3][i], incr[4][i],
            incr[5][i], incr[6][i], incr[7][i]);
        sbuf_printf(sb, "%8u %4u %4u %4u %4u %4u %4u %4u %5u %s\n",
            incr[8][i], incr[9][i], incr[10][i], incr[11][i],
            incr[12][i], incr[13][i], incr[14][i], incr[15][i],
            sc->params.a_wnd[i], dec_fac[sc->params.b_wnd[i]]);
    }

    rc = sbuf_finish(sb);
    sbuf_delete(sb);

    return (rc);
}

static const char *qname[CIM_NUM_IBQ + CIM_NUM_OBQ_T5] = {
    "TP0", "TP1", "ULP", "SGE0", "SGE1", "NC-SI",	/* ibq's */
    "ULP0", "ULP1", "ULP2", "ULP3", "SGE", "NC-SI",	/* obq's */
    "SGE0-RX", "SGE1-RX"	/* additional obq's (T5 onwards) */
};

static int
sysctl_cim_ibq_obq(SYSCTL_HANDLER_ARGS)
{
    struct adapter *sc = arg1;
    struct sbuf *sb;
    int rc, i, n, qid = arg2;
    uint32_t *buf, *p;
    char *qtype;
    u_int cim_num_obq = sc->chip_params->cim_num_obq;

    KASSERT(qid >= 0 && qid < CIM_NUM_IBQ + cim_num_obq,
        ("%s: bad qid %d\n", __func__, qid));

    if (qid < CIM_NUM_IBQ) {
        /* inbound queue */
        qtype = "IBQ";
        n = 4 * CIM_IBQ_SIZE;
        buf = malloc(n * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK);
        rc = t4_read_cim_ibq(sc, qid, buf, n);
    } else {
        /* outbound queue */
        qtype = "OBQ";
        qid -= CIM_NUM_IBQ;
        n = 4 * cim_num_obq * CIM_OBQ_SIZE;
        buf = malloc(n * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK);
        rc = t4_read_cim_obq(sc, qid, buf, n);
    }

    if (rc < 0) {
        rc = -rc;
        goto done;
    }
    n = rc * sizeof(uint32_t);	/* rc has # of words actually read */

    rc = sysctl_wire_old_buffer(req, 0);
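    /*
     * The userland buffer is wired before the (potentially large) queue
     * dump is generated so that draining the sbuf cannot fault.
     */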
if (rc != 0) goto done; sb = sbuf_new_for_sysctl(NULL, NULL, PAGE_SIZE, req); if (sb == NULL) { rc = ENOMEM; goto done; } sbuf_printf(sb, "%s%d %s", qtype , qid, qname[arg2]); for (i = 0, p = buf; i < n; i += 16, p += 4) sbuf_printf(sb, "\n%#06x: %08x %08x %08x %08x", i, p[0], p[1], p[2], p[3]); rc = sbuf_finish(sb); sbuf_delete(sb); done: free(buf, M_CXGBE); return (rc); } static int sysctl_cim_la(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; u_int cfg; struct sbuf *sb; uint32_t *buf, *p; int rc; MPASS(chip_id(sc) <= CHELSIO_T5); rc = -t4_cim_read(sc, A_UP_UP_DBG_LA_CFG, 1, &cfg); if (rc != 0) return (rc); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(sc->params.cim_la_size * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK); rc = -t4_cim_read_la(sc, buf, NULL); if (rc != 0) goto done; sbuf_printf(sb, "Status Data PC%s", cfg & F_UPDBGLACAPTPCONLY ? "" : " LS0Stat LS0Addr LS0Data"); for (p = buf; p <= &buf[sc->params.cim_la_size - 8]; p += 8) { if (cfg & F_UPDBGLACAPTPCONLY) { sbuf_printf(sb, "\n %02x %08x %08x", p[5] & 0xff, p[6], p[7]); sbuf_printf(sb, "\n %02x %02x%06x %02x%06x", (p[3] >> 8) & 0xff, p[3] & 0xff, p[4] >> 8, p[4] & 0xff, p[5] >> 8); sbuf_printf(sb, "\n %02x %x%07x %x%07x", (p[0] >> 4) & 0xff, p[0] & 0xf, p[1] >> 4, p[1] & 0xf, p[2] >> 4); } else { sbuf_printf(sb, "\n %02x %x%07x %x%07x %08x %08x " "%08x%08x%08x%08x", (p[0] >> 4) & 0xff, p[0] & 0xf, p[1] >> 4, p[1] & 0xf, p[2] >> 4, p[2] & 0xf, p[3], p[4], p[5], p[6], p[7]); } } rc = sbuf_finish(sb); sbuf_delete(sb); done: free(buf, M_CXGBE); return (rc); } static int sysctl_cim_la_t6(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; u_int cfg; struct sbuf *sb; uint32_t *buf, *p; int rc; MPASS(chip_id(sc) > CHELSIO_T5); rc = -t4_cim_read(sc, A_UP_UP_DBG_LA_CFG, 1, &cfg); if (rc != 0) return (rc); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(sc->params.cim_la_size * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK); rc = -t4_cim_read_la(sc, buf, NULL); if (rc != 0) goto done; sbuf_printf(sb, "Status Inst Data PC%s", cfg & F_UPDBGLACAPTPCONLY ? 
"" : " LS0Stat LS0Addr LS0Data LS1Stat LS1Addr LS1Data"); for (p = buf; p <= &buf[sc->params.cim_la_size - 10]; p += 10) { if (cfg & F_UPDBGLACAPTPCONLY) { sbuf_printf(sb, "\n %02x %08x %08x %08x", p[3] & 0xff, p[2], p[1], p[0]); sbuf_printf(sb, "\n %02x %02x%06x %02x%06x %02x%06x", (p[6] >> 8) & 0xff, p[6] & 0xff, p[5] >> 8, p[5] & 0xff, p[4] >> 8, p[4] & 0xff, p[3] >> 8); sbuf_printf(sb, "\n %02x %04x%04x %04x%04x %04x%04x", (p[9] >> 16) & 0xff, p[9] & 0xffff, p[8] >> 16, p[8] & 0xffff, p[7] >> 16, p[7] & 0xffff, p[6] >> 16); } else { sbuf_printf(sb, "\n %02x %04x%04x %04x%04x %04x%04x " "%08x %08x %08x %08x %08x %08x", (p[9] >> 16) & 0xff, p[9] & 0xffff, p[8] >> 16, p[8] & 0xffff, p[7] >> 16, p[7] & 0xffff, p[6] >> 16, p[2], p[1], p[0], p[5], p[4], p[3]); } } rc = sbuf_finish(sb); sbuf_delete(sb); done: free(buf, M_CXGBE); return (rc); } static int sysctl_cim_ma_la(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; u_int i; struct sbuf *sb; uint32_t *buf, *p; int rc; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(2 * CIM_MALA_SIZE * 5 * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK); t4_cim_read_ma_la(sc, buf, buf + 5 * CIM_MALA_SIZE); p = buf; for (i = 0; i < CIM_MALA_SIZE; i++, p += 5) { sbuf_printf(sb, "\n%02x%08x%08x%08x%08x", p[4], p[3], p[2], p[1], p[0]); } sbuf_printf(sb, "\n\nCnt ID Tag UE Data RDY VLD"); for (i = 0; i < CIM_MALA_SIZE; i++, p += 5) { sbuf_printf(sb, "\n%3u %2u %x %u %08x%08x %u %u", (p[2] >> 10) & 0xff, (p[2] >> 7) & 7, (p[2] >> 3) & 0xf, (p[2] >> 2) & 1, (p[1] >> 2) | ((p[2] & 3) << 30), (p[0] >> 2) | ((p[1] & 3) << 30), (p[0] >> 1) & 1, p[0] & 1); } rc = sbuf_finish(sb); sbuf_delete(sb); free(buf, M_CXGBE); return (rc); } static int sysctl_cim_pif_la(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; u_int i; struct sbuf *sb; uint32_t *buf, *p; int rc; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(2 * CIM_PIFLA_SIZE * 6 * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK); t4_cim_read_pif_la(sc, buf, buf + 6 * CIM_PIFLA_SIZE, NULL, NULL); p = buf; sbuf_printf(sb, "Cntl ID DataBE Addr Data"); for (i = 0; i < CIM_PIFLA_SIZE; i++, p += 6) { sbuf_printf(sb, "\n %02x %02x %04x %08x %08x%08x%08x%08x", (p[5] >> 22) & 0xff, (p[5] >> 16) & 0x3f, p[5] & 0xffff, p[4], p[3], p[2], p[1], p[0]); } sbuf_printf(sb, "\n\nCntl ID Data"); for (i = 0; i < CIM_PIFLA_SIZE; i++, p += 6) { sbuf_printf(sb, "\n %02x %02x %08x%08x%08x%08x", (p[4] >> 6) & 0xff, p[4] & 0x3f, p[3], p[2], p[1], p[0]); } rc = sbuf_finish(sb); sbuf_delete(sb); free(buf, M_CXGBE); return (rc); } static int sysctl_cim_qcfg(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i; uint16_t base[CIM_NUM_IBQ + CIM_NUM_OBQ_T5]; uint16_t size[CIM_NUM_IBQ + CIM_NUM_OBQ_T5]; uint16_t thres[CIM_NUM_IBQ]; uint32_t obq_wr[2 * CIM_NUM_OBQ_T5], *wr = obq_wr; uint32_t stat[4 * (CIM_NUM_IBQ + CIM_NUM_OBQ_T5)], *p = stat; u_int cim_num_obq, ibq_rdaddr, obq_rdaddr, nq; cim_num_obq = sc->chip_params->cim_num_obq; if (is_t4(sc)) { ibq_rdaddr = A_UP_IBQ_0_RDADDR; obq_rdaddr = A_UP_OBQ_0_REALADDR; } else { ibq_rdaddr = A_UP_IBQ_0_SHADOW_RDADDR; obq_rdaddr = A_UP_OBQ_0_SHADOW_REALADDR; } nq = CIM_NUM_IBQ + cim_num_obq; rc = -t4_cim_read(sc, ibq_rdaddr, 4 * nq, stat); if (rc == 0) rc = -t4_cim_read(sc, obq_rdaddr, 2 * cim_num_obq, obq_wr); if (rc != 0) return (rc); t4_read_cimq_cfg(sc, base, size, 
thres); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, PAGE_SIZE, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "Queue Base Size Thres RdPtr WrPtr SOP EOP Avail"); for (i = 0; i < CIM_NUM_IBQ; i++, p += 4) sbuf_printf(sb, "\n%7s %5x %5u %5u %6x %4x %4u %4u %5u", qname[i], base[i], size[i], thres[i], G_IBQRDADDR(p[0]), G_IBQWRADDR(p[1]), G_QUESOPCNT(p[3]), G_QUEEOPCNT(p[3]), G_QUEREMFLITS(p[2]) * 16); for ( ; i < nq; i++, p += 4, wr += 2) sbuf_printf(sb, "\n%7s %5x %5u %12x %4x %4u %4u %5u", qname[i], base[i], size[i], G_QUERDADDR(p[0]) & 0x3fff, wr[0] - base[i], G_QUESOPCNT(p[3]), G_QUEEOPCNT(p[3]), G_QUEREMFLITS(p[2]) * 16); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_cpl_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_cpl_stats stats; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); mtx_lock(&sc->reg_lock); t4_tp_get_cpl_stats(sc, &stats); mtx_unlock(&sc->reg_lock); if (sc->chip_params->nchan > 2) { sbuf_printf(sb, " channel 0 channel 1" " channel 2 channel 3"); sbuf_printf(sb, "\nCPL requests: %10u %10u %10u %10u", stats.req[0], stats.req[1], stats.req[2], stats.req[3]); sbuf_printf(sb, "\nCPL responses: %10u %10u %10u %10u", stats.rsp[0], stats.rsp[1], stats.rsp[2], stats.rsp[3]); } else { sbuf_printf(sb, " channel 0 channel 1"); sbuf_printf(sb, "\nCPL requests: %10u %10u", stats.req[0], stats.req[1]); sbuf_printf(sb, "\nCPL responses: %10u %10u", stats.rsp[0], stats.rsp[1]); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_ddp_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_usm_stats stats; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return(rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); t4_get_usm_stats(sc, &stats); sbuf_printf(sb, "Frames: %u\n", stats.frames); sbuf_printf(sb, "Octets: %ju\n", stats.octets); sbuf_printf(sb, "Drops: %u", stats.drops); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static const char * const devlog_level_strings[] = { [FW_DEVLOG_LEVEL_EMERG] = "EMERG", [FW_DEVLOG_LEVEL_CRIT] = "CRIT", [FW_DEVLOG_LEVEL_ERR] = "ERR", [FW_DEVLOG_LEVEL_NOTICE] = "NOTICE", [FW_DEVLOG_LEVEL_INFO] = "INFO", [FW_DEVLOG_LEVEL_DEBUG] = "DEBUG" }; static const char * const devlog_facility_strings[] = { [FW_DEVLOG_FACILITY_CORE] = "CORE", [FW_DEVLOG_FACILITY_CF] = "CF", [FW_DEVLOG_FACILITY_SCHED] = "SCHED", [FW_DEVLOG_FACILITY_TIMER] = "TIMER", [FW_DEVLOG_FACILITY_RES] = "RES", [FW_DEVLOG_FACILITY_HW] = "HW", [FW_DEVLOG_FACILITY_FLR] = "FLR", [FW_DEVLOG_FACILITY_DMAQ] = "DMAQ", [FW_DEVLOG_FACILITY_PHY] = "PHY", [FW_DEVLOG_FACILITY_MAC] = "MAC", [FW_DEVLOG_FACILITY_PORT] = "PORT", [FW_DEVLOG_FACILITY_VI] = "VI", [FW_DEVLOG_FACILITY_FILTER] = "FILTER", [FW_DEVLOG_FACILITY_ACL] = "ACL", [FW_DEVLOG_FACILITY_TM] = "TM", [FW_DEVLOG_FACILITY_QFC] = "QFC", [FW_DEVLOG_FACILITY_DCB] = "DCB", [FW_DEVLOG_FACILITY_ETH] = "ETH", [FW_DEVLOG_FACILITY_OFLD] = "OFLD", [FW_DEVLOG_FACILITY_RI] = "RI", [FW_DEVLOG_FACILITY_ISCSI] = "ISCSI", [FW_DEVLOG_FACILITY_FCOE] = "FCOE", [FW_DEVLOG_FACILITY_FOISCSI] = "FOISCSI", [FW_DEVLOG_FACILITY_FOFCOE] = "FOFCOE", [FW_DEVLOG_FACILITY_CHNET] = "CHNET", }; static int sysctl_devlog(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct devlog_params *dparams = &sc->params.devlog; struct fw_devlog_e *buf, *e; int i, j, 
rc, nentries, first = 0; struct sbuf *sb; uint64_t ftstamp = UINT64_MAX; if (dparams->addr == 0) return (ENXIO); buf = malloc(dparams->size, M_CXGBE, M_NOWAIT); if (buf == NULL) return (ENOMEM); rc = read_via_memwin(sc, 1, dparams->addr, (void *)buf, dparams->size); if (rc != 0) goto done; nentries = dparams->size / sizeof(struct fw_devlog_e); for (i = 0; i < nentries; i++) { e = &buf[i]; if (e->timestamp == 0) break; /* end */ e->timestamp = be64toh(e->timestamp); e->seqno = be32toh(e->seqno); for (j = 0; j < 8; j++) e->params[j] = be32toh(e->params[j]); if (e->timestamp < ftstamp) { ftstamp = e->timestamp; first = i; } } if (buf[first].timestamp == 0) goto done; /* nothing in the log */ rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) goto done; sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) { rc = ENOMEM; goto done; } sbuf_printf(sb, "%10s %15s %8s %8s %s\n", "Seq#", "Tstamp", "Level", "Facility", "Message"); i = first; do { e = &buf[i]; if (e->timestamp == 0) break; /* end */ sbuf_printf(sb, "%10d %15ju %8s %8s ", e->seqno, e->timestamp, (e->level < nitems(devlog_level_strings) ? devlog_level_strings[e->level] : "UNKNOWN"), (e->facility < nitems(devlog_facility_strings) ? devlog_facility_strings[e->facility] : "UNKNOWN")); sbuf_printf(sb, e->fmt, e->params[0], e->params[1], e->params[2], e->params[3], e->params[4], e->params[5], e->params[6], e->params[7]); if (++i == nentries) i = 0; } while (i != first); rc = sbuf_finish(sb); sbuf_delete(sb); done: free(buf, M_CXGBE); return (rc); } static int sysctl_fcoe_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_fcoe_stats stats[MAX_NCHAN]; int i, nchan = sc->chip_params->nchan; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); for (i = 0; i < nchan; i++) t4_get_fcoe_stats(sc, i, &stats[i]); if (nchan > 2) { sbuf_printf(sb, " channel 0 channel 1" " channel 2 channel 3"); sbuf_printf(sb, "\noctetsDDP: %16ju %16ju %16ju %16ju", stats[0].octets_ddp, stats[1].octets_ddp, stats[2].octets_ddp, stats[3].octets_ddp); sbuf_printf(sb, "\nframesDDP: %16u %16u %16u %16u", stats[0].frames_ddp, stats[1].frames_ddp, stats[2].frames_ddp, stats[3].frames_ddp); sbuf_printf(sb, "\nframesDrop: %16u %16u %16u %16u", stats[0].frames_drop, stats[1].frames_drop, stats[2].frames_drop, stats[3].frames_drop); } else { sbuf_printf(sb, " channel 0 channel 1"); sbuf_printf(sb, "\noctetsDDP: %16ju %16ju", stats[0].octets_ddp, stats[1].octets_ddp); sbuf_printf(sb, "\nframesDDP: %16u %16u", stats[0].frames_ddp, stats[1].frames_ddp); sbuf_printf(sb, "\nframesDrop: %16u %16u", stats[0].frames_drop, stats[1].frames_drop); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_hw_sched(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i; unsigned int map, kbps, ipg, mode; unsigned int pace_tab[NTX_SCHED]; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); map = t4_read_reg(sc, A_TP_TX_MOD_QUEUE_REQ_MAP); mode = G_TIMERMODE(t4_read_reg(sc, A_TP_MOD_CONFIG)); t4_read_pace_tbl(sc, pace_tab); sbuf_printf(sb, "Scheduler Mode Channel Rate (Kbps) " "Class IPG (0.1 ns) Flow IPG (us)"); for (i = 0; i < NTX_SCHED; ++i, map >>= 2) { t4_get_tx_sched(sc, i, &kbps, &ipg); sbuf_printf(sb, "\n %u %-5s %u ", i, (mode & (1 << i)) ? 
"flow" : "class", map & 3); if (kbps) sbuf_printf(sb, "%9u ", kbps); else sbuf_printf(sb, " disabled "); if (ipg) sbuf_printf(sb, "%13u ", ipg); else sbuf_printf(sb, " disabled "); if (pace_tab[i]) sbuf_printf(sb, "%10u", pace_tab[i]); else sbuf_printf(sb, " disabled"); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_lb_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i, j; uint64_t *p0, *p1; struct lb_port_stats s[2]; static const char *stat_name[] = { "OctetsOK:", "FramesOK:", "BcastFrames:", "McastFrames:", "UcastFrames:", "ErrorFrames:", "Frames64:", "Frames65To127:", "Frames128To255:", "Frames256To511:", "Frames512To1023:", "Frames1024To1518:", "Frames1519ToMax:", "FramesDropped:", "BG0FramesDropped:", "BG1FramesDropped:", "BG2FramesDropped:", "BG3FramesDropped:", "BG0FramesTrunc:", "BG1FramesTrunc:", "BG2FramesTrunc:", "BG3FramesTrunc:" }; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); memset(s, 0, sizeof(s)); for (i = 0; i < sc->chip_params->nchan; i += 2) { t4_get_lb_stats(sc, i, &s[0]); t4_get_lb_stats(sc, i + 1, &s[1]); p0 = &s[0].octets; p1 = &s[1].octets; sbuf_printf(sb, "%s Loopback %u" " Loopback %u", i == 0 ? "" : "\n", i, i + 1); for (j = 0; j < nitems(stat_name); j++) sbuf_printf(sb, "\n%-17s %20ju %20ju", stat_name[j], *p0++, *p1++); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_linkdnrc(SYSCTL_HANDLER_ARGS) { int rc = 0; struct port_info *pi = arg1; struct sbuf *sb; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return(rc); sb = sbuf_new_for_sysctl(NULL, NULL, 64, req); if (sb == NULL) return (ENOMEM); if (pi->linkdnrc < 0) sbuf_printf(sb, "n/a"); else sbuf_printf(sb, "%s", t4_link_down_rc_str(pi->linkdnrc)); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } struct mem_desc { unsigned int base; unsigned int limit; unsigned int idx; }; static int mem_desc_cmp(const void *a, const void *b) { return ((const struct mem_desc *)a)->base - ((const struct mem_desc *)b)->base; } static void mem_region_show(struct sbuf *sb, const char *name, unsigned int from, unsigned int to) { unsigned int size; if (from == to) return; size = to - from + 1; if (size == 0) return; /* XXX: need humanize_number(3) in libkern for a more readable 'size' */ sbuf_printf(sb, "%-15s %#x-%#x [%u]\n", name, from, to, size); } static int sysctl_meminfo(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i, n; uint32_t lo, hi, used, alloc; static const char *memory[] = {"EDC0:", "EDC1:", "MC:", "MC0:", "MC1:"}; static const char *region[] = { "DBQ contexts:", "IMSG contexts:", "FLM cache:", "TCBs:", "Pstructs:", "Timers:", "Rx FL:", "Tx FL:", "Pstruct FL:", "Tx payload:", "Rx payload:", "LE hash:", "iSCSI region:", "TDDP region:", "TPT region:", "STAG region:", "RQ region:", "RQUDP region:", "PBL region:", "TXPBL region:", "DBVFIFO region:", "ULPRX state:", "ULPTX state:", "On-chip queues:" }; struct mem_desc avail[4]; struct mem_desc mem[nitems(region) + 3]; /* up to 3 holes */ struct mem_desc *md = mem; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); for (i = 0; i < nitems(mem); i++) { mem[i].limit = 0; mem[i].idx = i; } /* Find and sort the populated memory ranges */ i = 0; lo = t4_read_reg(sc, A_MA_TARGET_MEM_ENABLE); if (lo & F_EDRAM0_ENABLE) { hi = t4_read_reg(sc, A_MA_EDRAM0_BAR); avail[i].base = 
G_EDRAM0_BASE(hi) << 20; avail[i].limit = avail[i].base + (G_EDRAM0_SIZE(hi) << 20); avail[i].idx = 0; i++; } if (lo & F_EDRAM1_ENABLE) { hi = t4_read_reg(sc, A_MA_EDRAM1_BAR); avail[i].base = G_EDRAM1_BASE(hi) << 20; avail[i].limit = avail[i].base + (G_EDRAM1_SIZE(hi) << 20); avail[i].idx = 1; i++; } if (lo & F_EXT_MEM_ENABLE) { hi = t4_read_reg(sc, A_MA_EXT_MEMORY_BAR); avail[i].base = G_EXT_MEM_BASE(hi) << 20; avail[i].limit = avail[i].base + (G_EXT_MEM_SIZE(hi) << 20); avail[i].idx = is_t5(sc) ? 3 : 2; /* Call it MC0 for T5 */ i++; } if (is_t5(sc) && lo & F_EXT_MEM1_ENABLE) { hi = t4_read_reg(sc, A_MA_EXT_MEMORY1_BAR); avail[i].base = G_EXT_MEM1_BASE(hi) << 20; avail[i].limit = avail[i].base + (G_EXT_MEM1_SIZE(hi) << 20); avail[i].idx = 4; i++; } if (!i) /* no memory available */ return 0; qsort(avail, i, sizeof(struct mem_desc), mem_desc_cmp); (md++)->base = t4_read_reg(sc, A_SGE_DBQ_CTXT_BADDR); (md++)->base = t4_read_reg(sc, A_SGE_IMSG_CTXT_BADDR); (md++)->base = t4_read_reg(sc, A_SGE_FLM_CACHE_BADDR); (md++)->base = t4_read_reg(sc, A_TP_CMM_TCB_BASE); (md++)->base = t4_read_reg(sc, A_TP_CMM_MM_BASE); (md++)->base = t4_read_reg(sc, A_TP_CMM_TIMER_BASE); (md++)->base = t4_read_reg(sc, A_TP_CMM_MM_RX_FLST_BASE); (md++)->base = t4_read_reg(sc, A_TP_CMM_MM_TX_FLST_BASE); (md++)->base = t4_read_reg(sc, A_TP_CMM_MM_PS_FLST_BASE); /* the next few have explicit upper bounds */ md->base = t4_read_reg(sc, A_TP_PMM_TX_BASE); md->limit = md->base - 1 + t4_read_reg(sc, A_TP_PMM_TX_PAGE_SIZE) * G_PMTXMAXPAGE(t4_read_reg(sc, A_TP_PMM_TX_MAX_PAGE)); md++; md->base = t4_read_reg(sc, A_TP_PMM_RX_BASE); md->limit = md->base - 1 + t4_read_reg(sc, A_TP_PMM_RX_PAGE_SIZE) * G_PMRXMAXPAGE(t4_read_reg(sc, A_TP_PMM_RX_MAX_PAGE)); md++; if (t4_read_reg(sc, A_LE_DB_CONFIG) & F_HASHEN) { if (chip_id(sc) <= CHELSIO_T5) md->base = t4_read_reg(sc, A_LE_DB_HASH_TID_BASE); else md->base = t4_read_reg(sc, A_LE_DB_HASH_TBL_BASE_ADDR); md->limit = 0; } else { md->base = 0; md->idx = nitems(region); /* hide it */ } md++; #define ulp_region(reg) \ md->base = t4_read_reg(sc, A_ULP_ ## reg ## _LLIMIT);\ (md++)->limit = t4_read_reg(sc, A_ULP_ ## reg ## _ULIMIT) ulp_region(RX_ISCSI); ulp_region(RX_TDDP); ulp_region(TX_TPT); ulp_region(RX_STAG); ulp_region(RX_RQ); ulp_region(RX_RQUDP); ulp_region(RX_PBL); ulp_region(TX_PBL); #undef ulp_region md->base = 0; md->idx = nitems(region); if (!is_t4(sc)) { uint32_t size = 0; uint32_t sge_ctrl = t4_read_reg(sc, A_SGE_CONTROL2); uint32_t fifo_size = t4_read_reg(sc, A_SGE_DBVFIFO_SIZE); if (is_t5(sc)) { if (sge_ctrl & F_VFIFO_ENABLE) size = G_DBVFIFO_SIZE(fifo_size); } else size = G_T6_DBVFIFO_SIZE(fifo_size); if (size) { md->base = G_BASEADDR(t4_read_reg(sc, A_SGE_DBVFIFO_BADDR)); md->limit = md->base + (size << 2) - 1; } } md++; md->base = t4_read_reg(sc, A_ULP_RX_CTX_BASE); md->limit = 0; md++; md->base = t4_read_reg(sc, A_ULP_TX_ERR_TABLE_BASE); md->limit = 0; md++; md->base = sc->vres.ocq.start; if (sc->vres.ocq.size) md->limit = md->base + sc->vres.ocq.size - 1; else md->idx = nitems(region); /* hide it */ md++; /* add any address-space holes, there can be up to 3 */ for (n = 0; n < i - 1; n++) if (avail[n].limit < avail[n + 1].base) (md++)->base = avail[n].limit; if (avail[n].limit) (md++)->base = avail[n].limit; n = md - mem; qsort(mem, n, sizeof(struct mem_desc), mem_desc_cmp); for (lo = 0; lo < i; lo++) mem_region_show(sb, memory[avail[lo].idx], avail[lo].base, avail[lo].limit - 1); sbuf_printf(sb, "\n"); for (i = 0; i < n; i++) { if (mem[i].idx >= nitems(region)) continue; 
/* skip holes */ if (!mem[i].limit) mem[i].limit = i < n - 1 ? mem[i + 1].base - 1 : ~0; mem_region_show(sb, region[mem[i].idx], mem[i].base, mem[i].limit); } sbuf_printf(sb, "\n"); lo = t4_read_reg(sc, A_CIM_SDRAM_BASE_ADDR); hi = t4_read_reg(sc, A_CIM_SDRAM_ADDR_SIZE) + lo - 1; mem_region_show(sb, "uP RAM:", lo, hi); lo = t4_read_reg(sc, A_CIM_EXTMEM2_BASE_ADDR); hi = t4_read_reg(sc, A_CIM_EXTMEM2_ADDR_SIZE) + lo - 1; mem_region_show(sb, "uP Extmem2:", lo, hi); lo = t4_read_reg(sc, A_TP_PMM_RX_MAX_PAGE); sbuf_printf(sb, "\n%u Rx pages of size %uKiB for %u channels\n", G_PMRXMAXPAGE(lo), t4_read_reg(sc, A_TP_PMM_RX_PAGE_SIZE) >> 10, (lo & F_PMRXNUMCHN) ? 2 : 1); lo = t4_read_reg(sc, A_TP_PMM_TX_MAX_PAGE); hi = t4_read_reg(sc, A_TP_PMM_TX_PAGE_SIZE); sbuf_printf(sb, "%u Tx pages of size %u%ciB for %u channels\n", G_PMTXMAXPAGE(lo), hi >= (1 << 20) ? (hi >> 20) : (hi >> 10), hi >= (1 << 20) ? 'M' : 'K', 1 << G_PMTXNUMCHN(lo)); sbuf_printf(sb, "%u p-structs\n", t4_read_reg(sc, A_TP_CMM_MM_MAX_PSTRUCT)); for (i = 0; i < 4; i++) { if (chip_id(sc) > CHELSIO_T5) lo = t4_read_reg(sc, A_MPS_RX_MAC_BG_PG_CNT0 + i * 4); else lo = t4_read_reg(sc, A_MPS_RX_PG_RSV0 + i * 4); if (is_t5(sc)) { used = G_T5_USED(lo); alloc = G_T5_ALLOC(lo); } else { used = G_USED(lo); alloc = G_ALLOC(lo); } /* For T6 these are MAC buffer groups */ sbuf_printf(sb, "\nPort %d using %u pages out of %u allocated", i, used, alloc); } for (i = 0; i < sc->chip_params->nchan; i++) { if (chip_id(sc) > CHELSIO_T5) lo = t4_read_reg(sc, A_MPS_RX_LPBK_BG_PG_CNT0 + i * 4); else lo = t4_read_reg(sc, A_MPS_RX_PG_RSV4 + i * 4); if (is_t5(sc)) { used = G_T5_USED(lo); alloc = G_T5_ALLOC(lo); } else { used = G_USED(lo); alloc = G_ALLOC(lo); } /* For T6 these are MAC buffer groups */ sbuf_printf(sb, "\nLoopback %d using %u pages out of %u allocated", i, used, alloc); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static inline void tcamxy2valmask(uint64_t x, uint64_t y, uint8_t *addr, uint64_t *mask) { *mask = x | y; y = htobe64(y); memcpy(addr, (char *)&y + 2, ETHER_ADDR_LEN); } static int sysctl_mps_tcam(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i; MPASS(chip_id(sc) <= CHELSIO_T5); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "Idx Ethernet address Mask Vld Ports PF" " VF Replication P0 P1 P2 P3 ML"); for (i = 0; i < sc->chip_params->mps_tcam_size; i++) { uint64_t tcamx, tcamy, mask; uint32_t cls_lo, cls_hi; uint8_t addr[ETHER_ADDR_LEN]; tcamy = t4_read_reg64(sc, MPS_CLS_TCAM_Y_L(i)); tcamx = t4_read_reg64(sc, MPS_CLS_TCAM_X_L(i)); if (tcamx & tcamy) continue; tcamxy2valmask(tcamx, tcamy, addr, &mask); cls_lo = t4_read_reg(sc, MPS_CLS_SRAM_L(i)); cls_hi = t4_read_reg(sc, MPS_CLS_SRAM_H(i)); sbuf_printf(sb, "\n%3u %02x:%02x:%02x:%02x:%02x:%02x %012jx" " %c %#x%4u%4d", i, addr[0], addr[1], addr[2], addr[3], addr[4], addr[5], (uintmax_t)mask, (cls_lo & F_SRAM_VLD) ? 'Y' : 'N', G_PORTMAP(cls_hi), G_PF(cls_lo), (cls_lo & F_VF_VALID) ? 
G_VF(cls_lo) : -1); if (cls_lo & F_REPLICATE) { struct fw_ldst_cmd ldst_cmd; memset(&ldst_cmd, 0, sizeof(ldst_cmd)); ldst_cmd.op_to_addrspace = htobe32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_MPS)); ldst_cmd.cycles_to_len16 = htobe32(FW_LEN16(ldst_cmd)); ldst_cmd.u.mps.rplc.fid_idx = htobe16(V_FW_LDST_CMD_FID(FW_LDST_MPS_RPLC) | V_FW_LDST_CMD_IDX(i)); rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4mps"); if (rc) break; rc = -t4_wr_mbox(sc, sc->mbox, &ldst_cmd, sizeof(ldst_cmd), &ldst_cmd); end_synchronized_op(sc, 0); if (rc != 0) { sbuf_printf(sb, "%36d", rc); rc = 0; } else { sbuf_printf(sb, " %08x %08x %08x %08x", be32toh(ldst_cmd.u.mps.rplc.rplc127_96), be32toh(ldst_cmd.u.mps.rplc.rplc95_64), be32toh(ldst_cmd.u.mps.rplc.rplc63_32), be32toh(ldst_cmd.u.mps.rplc.rplc31_0)); } } else sbuf_printf(sb, "%36s", ""); sbuf_printf(sb, "%4u%3u%3u%3u %#3x", G_SRAM_PRIO0(cls_lo), G_SRAM_PRIO1(cls_lo), G_SRAM_PRIO2(cls_lo), G_SRAM_PRIO3(cls_lo), (cls_lo >> S_MULTILISTEN0) & 0xf); } if (rc) (void) sbuf_finish(sb); else rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_mps_tcam_t6(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i; MPASS(chip_id(sc) > CHELSIO_T5); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "Idx Ethernet address Mask VNI Mask" " IVLAN Vld DIP_Hit Lookup Port Vld Ports PF VF" " Replication" " P0 P1 P2 P3 ML\n"); for (i = 0; i < sc->chip_params->mps_tcam_size; i++) { uint8_t dip_hit, vlan_vld, lookup_type, port_num; uint16_t ivlan; uint64_t tcamx, tcamy, val, mask; uint32_t cls_lo, cls_hi, ctl, data2, vnix, vniy; uint8_t addr[ETHER_ADDR_LEN]; ctl = V_CTLREQID(1) | V_CTLCMDTYPE(0) | V_CTLXYBITSEL(0); if (i < 256) ctl |= V_CTLTCAMINDEX(i) | V_CTLTCAMSEL(0); else ctl |= V_CTLTCAMINDEX(i - 256) | V_CTLTCAMSEL(1); t4_write_reg(sc, A_MPS_CLS_TCAM_DATA2_CTL, ctl); val = t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA1_REQ_ID1); tcamy = G_DMACH(val) << 32; tcamy |= t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA0_REQ_ID1); data2 = t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA2_REQ_ID1); lookup_type = G_DATALKPTYPE(data2); port_num = G_DATAPORTNUM(data2); if (lookup_type && lookup_type != M_DATALKPTYPE) { /* Inner header VNI */ vniy = ((data2 & F_DATAVIDH2) << 23) | (G_DATAVIDH1(data2) << 16) | G_VIDL(val); dip_hit = data2 & F_DATADIPHIT; vlan_vld = 0; } else { vniy = 0; dip_hit = 0; vlan_vld = data2 & F_DATAVIDH2; ivlan = G_VIDL(val); } ctl |= V_CTLXYBITSEL(1); t4_write_reg(sc, A_MPS_CLS_TCAM_DATA2_CTL, ctl); val = t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA1_REQ_ID1); tcamx = G_DMACH(val) << 32; tcamx |= t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA0_REQ_ID1); data2 = t4_read_reg(sc, A_MPS_CLS_TCAM_RDATA2_REQ_ID1); if (lookup_type && lookup_type != M_DATALKPTYPE) { /* Inner header VNI mask */ vnix = ((data2 & F_DATAVIDH2) << 23) | (G_DATAVIDH1(data2) << 16) | G_VIDL(val); } else vnix = 0; if (tcamx & tcamy) continue; tcamxy2valmask(tcamx, tcamy, addr, &mask); cls_lo = t4_read_reg(sc, MPS_CLS_SRAM_L(i)); cls_hi = t4_read_reg(sc, MPS_CLS_SRAM_H(i)); if (lookup_type && lookup_type != M_DATALKPTYPE) { sbuf_printf(sb, "\n%3u %02x:%02x:%02x:%02x:%02x:%02x " "%012jx %06x %06x - - %3c" " 'I' %4x %3c %#x%4u%4d", i, addr[0], addr[1], addr[2], addr[3], addr[4], addr[5], (uintmax_t)mask, vniy, vnix, dip_hit ? 'Y' : 'N', port_num, cls_lo & F_T6_SRAM_VLD ? 
'Y' : 'N', G_PORTMAP(cls_hi), G_T6_PF(cls_lo), cls_lo & F_T6_VF_VALID ? G_T6_VF(cls_lo) : -1); } else { sbuf_printf(sb, "\n%3u %02x:%02x:%02x:%02x:%02x:%02x " "%012jx - - ", i, addr[0], addr[1], addr[2], addr[3], addr[4], addr[5], (uintmax_t)mask); if (vlan_vld) sbuf_printf(sb, "%4u Y ", ivlan); else sbuf_printf(sb, " - N "); sbuf_printf(sb, "- %3c %4x %3c %#x%4u%4d", lookup_type ? 'I' : 'O', port_num, cls_lo & F_T6_SRAM_VLD ? 'Y' : 'N', G_PORTMAP(cls_hi), G_T6_PF(cls_lo), cls_lo & F_T6_VF_VALID ? G_T6_VF(cls_lo) : -1); } if (cls_lo & F_T6_REPLICATE) { struct fw_ldst_cmd ldst_cmd; memset(&ldst_cmd, 0, sizeof(ldst_cmd)); ldst_cmd.op_to_addrspace = htobe32(V_FW_CMD_OP(FW_LDST_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_READ | V_FW_LDST_CMD_ADDRSPACE(FW_LDST_ADDRSPC_MPS)); ldst_cmd.cycles_to_len16 = htobe32(FW_LEN16(ldst_cmd)); ldst_cmd.u.mps.rplc.fid_idx = htobe16(V_FW_LDST_CMD_FID(FW_LDST_MPS_RPLC) | V_FW_LDST_CMD_IDX(i)); rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t6mps"); if (rc) break; rc = -t4_wr_mbox(sc, sc->mbox, &ldst_cmd, sizeof(ldst_cmd), &ldst_cmd); end_synchronized_op(sc, 0); if (rc != 0) { sbuf_printf(sb, "%72d", rc); rc = 0; } else { sbuf_printf(sb, " %08x %08x %08x %08x" " %08x %08x %08x %08x", be32toh(ldst_cmd.u.mps.rplc.rplc255_224), be32toh(ldst_cmd.u.mps.rplc.rplc223_192), be32toh(ldst_cmd.u.mps.rplc.rplc191_160), be32toh(ldst_cmd.u.mps.rplc.rplc159_128), be32toh(ldst_cmd.u.mps.rplc.rplc127_96), be32toh(ldst_cmd.u.mps.rplc.rplc95_64), be32toh(ldst_cmd.u.mps.rplc.rplc63_32), be32toh(ldst_cmd.u.mps.rplc.rplc31_0)); } } else sbuf_printf(sb, "%72s", ""); sbuf_printf(sb, "%4u%3u%3u%3u %#x", G_T6_SRAM_PRIO0(cls_lo), G_T6_SRAM_PRIO1(cls_lo), G_T6_SRAM_PRIO2(cls_lo), G_T6_SRAM_PRIO3(cls_lo), (cls_lo >> S_T6_MULTILISTEN0) & 0xf); } if (rc) (void) sbuf_finish(sb); else rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_path_mtus(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; uint16_t mtus[NMTUS]; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); t4_read_mtu_tbl(sc, mtus, NULL); sbuf_printf(sb, "%u %u %u %u %u %u %u %u %u %u %u %u %u %u %u %u", mtus[0], mtus[1], mtus[2], mtus[3], mtus[4], mtus[5], mtus[6], mtus[7], mtus[8], mtus[9], mtus[10], mtus[11], mtus[12], mtus[13], mtus[14], mtus[15]); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_pm_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, i; uint32_t tx_cnt[MAX_PM_NSTATS], rx_cnt[MAX_PM_NSTATS]; uint64_t tx_cyc[MAX_PM_NSTATS], rx_cyc[MAX_PM_NSTATS]; static const char *tx_stats[MAX_PM_NSTATS] = { "Read:", "Write bypass:", "Write mem:", "Bypass + mem:", "Tx FIFO wait", NULL, "Tx latency" }; static const char *rx_stats[MAX_PM_NSTATS] = { "Read:", "Write bypass:", "Write mem:", "Flush:", " Rx FIFO wait", NULL, "Rx latency" }; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); t4_pmtx_get_stats(sc, tx_cnt, tx_cyc); t4_pmrx_get_stats(sc, rx_cnt, rx_cyc); sbuf_printf(sb, " Tx pcmds Tx bytes"); for (i = 0; i < 4; i++) { sbuf_printf(sb, "\n%-13s %10u %20ju", tx_stats[i], tx_cnt[i], tx_cyc[i]); } sbuf_printf(sb, "\n Rx pcmds Rx bytes"); for (i = 0; i < 4; i++) { sbuf_printf(sb, "\n%-13s %10u %20ju", rx_stats[i], rx_cnt[i], rx_cyc[i]); } if (chip_id(sc) > CHELSIO_T5) { sbuf_printf(sb, "\n Total wait Total occupancy"); sbuf_printf(sb, "\n%-13s %10u 
%20ju", tx_stats[i], tx_cnt[i], tx_cyc[i]); sbuf_printf(sb, "\n%-13s %10u %20ju", rx_stats[i], rx_cnt[i], rx_cyc[i]); i += 2; MPASS(i < nitems(tx_stats)); sbuf_printf(sb, "\n Reads Total wait"); sbuf_printf(sb, "\n%-13s %10u %20ju", tx_stats[i], tx_cnt[i], tx_cyc[i]); sbuf_printf(sb, "\n%-13s %10u %20ju", rx_stats[i], rx_cnt[i], rx_cyc[i]); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_rdma_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_rdma_stats stats; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); mtx_lock(&sc->reg_lock); t4_tp_get_rdma_stats(sc, &stats); mtx_unlock(&sc->reg_lock); sbuf_printf(sb, "NoRQEModDefferals: %u\n", stats.rqe_dfr_mod); sbuf_printf(sb, "NoRQEPktDefferals: %u", stats.rqe_dfr_pkt); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_tcp_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_tcp_stats v4, v6; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); mtx_lock(&sc->reg_lock); t4_tp_get_tcp_stats(sc, &v4, &v6); mtx_unlock(&sc->reg_lock); sbuf_printf(sb, " IP IPv6\n"); sbuf_printf(sb, "OutRsts: %20u %20u\n", v4.tcp_out_rsts, v6.tcp_out_rsts); sbuf_printf(sb, "InSegs: %20ju %20ju\n", v4.tcp_in_segs, v6.tcp_in_segs); sbuf_printf(sb, "OutSegs: %20ju %20ju\n", v4.tcp_out_segs, v6.tcp_out_segs); sbuf_printf(sb, "RetransSegs: %20ju %20ju", v4.tcp_retrans_segs, v6.tcp_retrans_segs); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_tids(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tid_info *t = &sc->tids; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); if (t->natids) { sbuf_printf(sb, "ATID range: 0-%u, in use: %u\n", t->natids - 1, t->atids_in_use); } if (t->ntids) { if (t4_read_reg(sc, A_LE_DB_CONFIG) & F_HASHEN) { uint32_t b = t4_read_reg(sc, A_LE_DB_SERVER_INDEX) / 4; if (b) { sbuf_printf(sb, "TID range: 0-%u, %u-%u", b - 1, t4_read_reg(sc, A_LE_DB_TID_HASHBASE) / 4, t->ntids - 1); } else { sbuf_printf(sb, "TID range: %u-%u", t4_read_reg(sc, A_LE_DB_TID_HASHBASE) / 4, t->ntids - 1); } } else sbuf_printf(sb, "TID range: 0-%u", t->ntids - 1); sbuf_printf(sb, ", in use: %u\n", atomic_load_acq_int(&t->tids_in_use)); } if (t->nstids) { sbuf_printf(sb, "STID range: %u-%u, in use: %u\n", t->stid_base, t->stid_base + t->nstids - 1, t->stids_in_use); } if (t->nftids) { sbuf_printf(sb, "FTID range: %u-%u\n", t->ftid_base, t->ftid_base + t->nftids - 1); } if (t->netids) { sbuf_printf(sb, "ETID range: %u-%u\n", t->etid_base, t->etid_base + t->netids - 1); } sbuf_printf(sb, "HW TID usage: %u IP users, %u IPv6 users", t4_read_reg(sc, A_LE_DB_ACT_CNT_IPV4), t4_read_reg(sc, A_LE_DB_ACT_CNT_IPV6)); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_tp_err_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; struct tp_err_stats stats; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); mtx_lock(&sc->reg_lock); t4_tp_get_err_stats(sc, &stats); mtx_unlock(&sc->reg_lock); if (sc->chip_params->nchan > 2) { sbuf_printf(sb, " channel 0 channel 1" " channel 2 channel 3\n"); 
sbuf_printf(sb, "macInErrs: %10u %10u %10u %10u\n", stats.mac_in_errs[0], stats.mac_in_errs[1], stats.mac_in_errs[2], stats.mac_in_errs[3]); sbuf_printf(sb, "hdrInErrs: %10u %10u %10u %10u\n", stats.hdr_in_errs[0], stats.hdr_in_errs[1], stats.hdr_in_errs[2], stats.hdr_in_errs[3]); sbuf_printf(sb, "tcpInErrs: %10u %10u %10u %10u\n", stats.tcp_in_errs[0], stats.tcp_in_errs[1], stats.tcp_in_errs[2], stats.tcp_in_errs[3]); sbuf_printf(sb, "tcp6InErrs: %10u %10u %10u %10u\n", stats.tcp6_in_errs[0], stats.tcp6_in_errs[1], stats.tcp6_in_errs[2], stats.tcp6_in_errs[3]); sbuf_printf(sb, "tnlCongDrops: %10u %10u %10u %10u\n", stats.tnl_cong_drops[0], stats.tnl_cong_drops[1], stats.tnl_cong_drops[2], stats.tnl_cong_drops[3]); sbuf_printf(sb, "tnlTxDrops: %10u %10u %10u %10u\n", stats.tnl_tx_drops[0], stats.tnl_tx_drops[1], stats.tnl_tx_drops[2], stats.tnl_tx_drops[3]); sbuf_printf(sb, "ofldVlanDrops: %10u %10u %10u %10u\n", stats.ofld_vlan_drops[0], stats.ofld_vlan_drops[1], stats.ofld_vlan_drops[2], stats.ofld_vlan_drops[3]); sbuf_printf(sb, "ofldChanDrops: %10u %10u %10u %10u\n\n", stats.ofld_chan_drops[0], stats.ofld_chan_drops[1], stats.ofld_chan_drops[2], stats.ofld_chan_drops[3]); } else { sbuf_printf(sb, " channel 0 channel 1\n"); sbuf_printf(sb, "macInErrs: %10u %10u\n", stats.mac_in_errs[0], stats.mac_in_errs[1]); sbuf_printf(sb, "hdrInErrs: %10u %10u\n", stats.hdr_in_errs[0], stats.hdr_in_errs[1]); sbuf_printf(sb, "tcpInErrs: %10u %10u\n", stats.tcp_in_errs[0], stats.tcp_in_errs[1]); sbuf_printf(sb, "tcp6InErrs: %10u %10u\n", stats.tcp6_in_errs[0], stats.tcp6_in_errs[1]); sbuf_printf(sb, "tnlCongDrops: %10u %10u\n", stats.tnl_cong_drops[0], stats.tnl_cong_drops[1]); sbuf_printf(sb, "tnlTxDrops: %10u %10u\n", stats.tnl_tx_drops[0], stats.tnl_tx_drops[1]); sbuf_printf(sb, "ofldVlanDrops: %10u %10u\n", stats.ofld_vlan_drops[0], stats.ofld_vlan_drops[1]); sbuf_printf(sb, "ofldChanDrops: %10u %10u\n\n", stats.ofld_chan_drops[0], stats.ofld_chan_drops[1]); } sbuf_printf(sb, "ofldNoNeigh: %u\nofldCongDefer: %u", stats.ofld_no_neigh, stats.ofld_cong_defer); rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_tp_la_mask(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct tp_params *tpp = &sc->params.tp; u_int mask; int rc; mask = tpp->la_mask >> 16; rc = sysctl_handle_int(oidp, &mask, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); if (mask > 0xffff) return (EINVAL); tpp->la_mask = mask << 16; t4_set_reg_field(sc, A_TP_DBG_LA_CONFIG, 0xffff0000U, tpp->la_mask); return (0); } struct field_desc { const char *name; u_int start; u_int width; }; static void field_desc_show(struct sbuf *sb, uint64_t v, const struct field_desc *f) { char buf[32]; int line_size = 0; while (f->name) { uint64_t mask = (1ULL << f->width) - 1; int len = snprintf(buf, sizeof(buf), "%s: %ju", f->name, ((uintmax_t)v >> f->start) & mask); if (line_size + len >= 79) { line_size = 8; sbuf_printf(sb, "\n "); } sbuf_printf(sb, "%s ", buf); line_size += len + 1; f++; } sbuf_printf(sb, "\n"); } static const struct field_desc tp_la0[] = { { "RcfOpCodeOut", 60, 4 }, { "State", 56, 4 }, { "WcfState", 52, 4 }, { "RcfOpcSrcOut", 50, 2 }, { "CRxError", 49, 1 }, { "ERxError", 48, 1 }, { "SanityFailed", 47, 1 }, { "SpuriousMsg", 46, 1 }, { "FlushInputMsg", 45, 1 }, { "FlushInputCpl", 44, 1 }, { "RssUpBit", 43, 1 }, { "RssFilterHit", 42, 1 }, { "Tid", 32, 10 }, { "InitTcb", 31, 1 }, { "LineNumber", 24, 7 }, { "Emsg", 23, 1 }, { "EdataOut", 22, 1 }, { "Cmsg", 21, 1 }, { "CdataOut", 20, 1 }, { "EreadPdu", 19, 1 }, 
{ "CreadPdu", 18, 1 }, { "TunnelPkt", 17, 1 }, { "RcfPeerFin", 16, 1 }, { "RcfReasonOut", 12, 4 }, { "TxCchannel", 10, 2 }, { "RcfTxChannel", 8, 2 }, { "RxEchannel", 6, 2 }, { "RcfRxChannel", 5, 1 }, { "RcfDataOutSrdy", 4, 1 }, { "RxDvld", 3, 1 }, { "RxOoDvld", 2, 1 }, { "RxCongestion", 1, 1 }, { "TxCongestion", 0, 1 }, { NULL } }; static const struct field_desc tp_la1[] = { { "CplCmdIn", 56, 8 }, { "CplCmdOut", 48, 8 }, { "ESynOut", 47, 1 }, { "EAckOut", 46, 1 }, { "EFinOut", 45, 1 }, { "ERstOut", 44, 1 }, { "SynIn", 43, 1 }, { "AckIn", 42, 1 }, { "FinIn", 41, 1 }, { "RstIn", 40, 1 }, { "DataIn", 39, 1 }, { "DataInVld", 38, 1 }, { "PadIn", 37, 1 }, { "RxBufEmpty", 36, 1 }, { "RxDdp", 35, 1 }, { "RxFbCongestion", 34, 1 }, { "TxFbCongestion", 33, 1 }, { "TxPktSumSrdy", 32, 1 }, { "RcfUlpType", 28, 4 }, { "Eread", 27, 1 }, { "Ebypass", 26, 1 }, { "Esave", 25, 1 }, { "Static0", 24, 1 }, { "Cread", 23, 1 }, { "Cbypass", 22, 1 }, { "Csave", 21, 1 }, { "CPktOut", 20, 1 }, { "RxPagePoolFull", 18, 2 }, { "RxLpbkPkt", 17, 1 }, { "TxLpbkPkt", 16, 1 }, { "RxVfValid", 15, 1 }, { "SynLearned", 14, 1 }, { "SetDelEntry", 13, 1 }, { "SetInvEntry", 12, 1 }, { "CpcmdDvld", 11, 1 }, { "CpcmdSave", 10, 1 }, { "RxPstructsFull", 8, 2 }, { "EpcmdDvld", 7, 1 }, { "EpcmdFlush", 6, 1 }, { "EpcmdTrimPrefix", 5, 1 }, { "EpcmdTrimPostfix", 4, 1 }, { "ERssIp4Pkt", 3, 1 }, { "ERssIp6Pkt", 2, 1 }, { "ERssTcpUdpPkt", 1, 1 }, { "ERssFceFipPkt", 0, 1 }, { NULL } }; static const struct field_desc tp_la2[] = { { "CplCmdIn", 56, 8 }, { "MpsVfVld", 55, 1 }, { "MpsPf", 52, 3 }, { "MpsVf", 44, 8 }, { "SynIn", 43, 1 }, { "AckIn", 42, 1 }, { "FinIn", 41, 1 }, { "RstIn", 40, 1 }, { "DataIn", 39, 1 }, { "DataInVld", 38, 1 }, { "PadIn", 37, 1 }, { "RxBufEmpty", 36, 1 }, { "RxDdp", 35, 1 }, { "RxFbCongestion", 34, 1 }, { "TxFbCongestion", 33, 1 }, { "TxPktSumSrdy", 32, 1 }, { "RcfUlpType", 28, 4 }, { "Eread", 27, 1 }, { "Ebypass", 26, 1 }, { "Esave", 25, 1 }, { "Static0", 24, 1 }, { "Cread", 23, 1 }, { "Cbypass", 22, 1 }, { "Csave", 21, 1 }, { "CPktOut", 20, 1 }, { "RxPagePoolFull", 18, 2 }, { "RxLpbkPkt", 17, 1 }, { "TxLpbkPkt", 16, 1 }, { "RxVfValid", 15, 1 }, { "SynLearned", 14, 1 }, { "SetDelEntry", 13, 1 }, { "SetInvEntry", 12, 1 }, { "CpcmdDvld", 11, 1 }, { "CpcmdSave", 10, 1 }, { "RxPstructsFull", 8, 2 }, { "EpcmdDvld", 7, 1 }, { "EpcmdFlush", 6, 1 }, { "EpcmdTrimPrefix", 5, 1 }, { "EpcmdTrimPostfix", 4, 1 }, { "ERssIp4Pkt", 3, 1 }, { "ERssIp6Pkt", 2, 1 }, { "ERssTcpUdpPkt", 1, 1 }, { "ERssFceFipPkt", 0, 1 }, { NULL } }; static void tp_la_show(struct sbuf *sb, uint64_t *p, int idx) { field_desc_show(sb, *p, tp_la0); } static void tp_la_show2(struct sbuf *sb, uint64_t *p, int idx) { if (idx) sbuf_printf(sb, "\n"); field_desc_show(sb, p[0], tp_la0); if (idx < (TPLA_SIZE / 2 - 1) || p[1] != ~0ULL) field_desc_show(sb, p[1], tp_la0); } static void tp_la_show3(struct sbuf *sb, uint64_t *p, int idx) { if (idx) sbuf_printf(sb, "\n"); field_desc_show(sb, p[0], tp_la0); if (idx < (TPLA_SIZE / 2 - 1) || p[1] != ~0ULL) field_desc_show(sb, p[1], (p[0] & (1 << 17)) ? 
tp_la2 : tp_la1); } static int sysctl_tp_la(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; uint64_t *buf, *p; int rc; u_int i, inc; void (*show_func)(struct sbuf *, uint64_t *, int); rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(TPLA_SIZE * sizeof(uint64_t), M_CXGBE, M_ZERO | M_WAITOK); t4_tp_read_la(sc, buf, NULL); p = buf; switch (G_DBGLAMODE(t4_read_reg(sc, A_TP_DBG_LA_CONFIG))) { case 2: inc = 2; show_func = tp_la_show2; break; case 3: inc = 2; show_func = tp_la_show3; break; default: inc = 1; show_func = tp_la_show; } for (i = 0; i < TPLA_SIZE / inc; i++, p += inc) (*show_func)(sb, p, i); rc = sbuf_finish(sb); sbuf_delete(sb); free(buf, M_CXGBE); return (rc); } static int sysctl_tx_rate(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc; u64 nrate[MAX_NCHAN], orate[MAX_NCHAN]; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 256, req); if (sb == NULL) return (ENOMEM); t4_get_chan_txrate(sc, nrate, orate); if (sc->chip_params->nchan > 2) { sbuf_printf(sb, " channel 0 channel 1" " channel 2 channel 3\n"); sbuf_printf(sb, "NIC B/s: %10ju %10ju %10ju %10ju\n", nrate[0], nrate[1], nrate[2], nrate[3]); sbuf_printf(sb, "Offload B/s: %10ju %10ju %10ju %10ju", orate[0], orate[1], orate[2], orate[3]); } else { sbuf_printf(sb, " channel 0 channel 1\n"); sbuf_printf(sb, "NIC B/s: %10ju %10ju\n", nrate[0], nrate[1]); sbuf_printf(sb, "Offload B/s: %10ju %10ju", orate[0], orate[1]); } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_ulprx_la(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; uint32_t *buf, *p; int rc, i; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); buf = malloc(ULPRX_LA_SIZE * 8 * sizeof(uint32_t), M_CXGBE, M_ZERO | M_WAITOK); t4_ulprx_read_la(sc, buf); p = buf; sbuf_printf(sb, " Pcmd Type Message" " Data"); for (i = 0; i < ULPRX_LA_SIZE; i++, p += 8) { sbuf_printf(sb, "\n%08x%08x %4x %08x %08x%08x%08x%08x", p[1], p[0], p[2], p[3], p[7], p[6], p[5], p[4]); } rc = sbuf_finish(sb); sbuf_delete(sb); free(buf, M_CXGBE); return (rc); } static int sysctl_wcwr_stats(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct sbuf *sb; int rc, v; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); v = t4_read_reg(sc, A_SGE_STAT_CFG); if (G_STATSOURCE_T5(v) == 7) { if (G_STATMODE(v) == 0) { sbuf_printf(sb, "total %d, incomplete %d", t4_read_reg(sc, A_SGE_STAT_TOTAL), t4_read_reg(sc, A_SGE_STAT_MATCH)); } else if (G_STATMODE(v) == 1) { sbuf_printf(sb, "total %d, data overflow %d", t4_read_reg(sc, A_SGE_STAT_TOTAL), t4_read_reg(sc, A_SGE_STAT_MATCH)); } } rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } static int sysctl_tc_params(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; struct tx_sched_class *tc; struct t4_sched_class_params p; struct sbuf *sb; int i, rc, port_id, flags, mbps, gbps; rc = sysctl_wire_old_buffer(req, 0); if (rc != 0) return (rc); sb = sbuf_new_for_sysctl(NULL, NULL, 4096, req); if (sb == NULL) return (ENOMEM); port_id = arg2 >> 16; MPASS(port_id < sc->params.nports); MPASS(sc->port[port_id] != NULL); i = arg2 & 0xffff; MPASS(i < sc->chip_params->nsched_cls); tc = &sc->port[port_id]->tc[i]; rc = begin_synchronized_op(sc, NULL, HOLD_LOCK | 
SLEEP_OK | INTR_OK, "t4tc_p"); if (rc) goto done; flags = tc->flags; p = tc->params; end_synchronized_op(sc, LOCK_HELD); if ((flags & TX_SC_OK) == 0) { sbuf_printf(sb, "none"); goto done; } if (p.level == SCHED_CLASS_LEVEL_CL_WRR) { sbuf_printf(sb, "cl-wrr weight %u", p.weight); goto done; } else if (p.level == SCHED_CLASS_LEVEL_CL_RL) sbuf_printf(sb, "cl-rl"); else if (p.level == SCHED_CLASS_LEVEL_CH_RL) sbuf_printf(sb, "ch-rl"); else { rc = ENXIO; goto done; } if (p.ratemode == SCHED_CLASS_RATEMODE_REL) { /* XXX: top speed or actual link speed? */ gbps = port_top_speed(sc->port[port_id]); sbuf_printf(sb, " %u%% of %uGbps", p.maxrate, gbps); } else if (p.ratemode == SCHED_CLASS_RATEMODE_ABS) { switch (p.rateunit) { case SCHED_CLASS_RATEUNIT_BITS: mbps = p.maxrate / 1000; gbps = p.maxrate / 1000000; if (p.maxrate == gbps * 1000000) sbuf_printf(sb, " %uGbps", gbps); else if (p.maxrate == mbps * 1000) sbuf_printf(sb, " %uMbps", mbps); else sbuf_printf(sb, " %uKbps", p.maxrate); break; case SCHED_CLASS_RATEUNIT_PKTS: sbuf_printf(sb, " %upps", p.maxrate); break; default: rc = ENXIO; goto done; } } switch (p.mode) { case SCHED_CLASS_MODE_CLASS: sbuf_printf(sb, " aggregate"); break; case SCHED_CLASS_MODE_FLOW: sbuf_printf(sb, " per-flow"); break; default: rc = ENXIO; goto done; } done: if (rc == 0) rc = sbuf_finish(sb); sbuf_delete(sb); return (rc); } #endif #ifdef TCP_OFFLOAD static void unit_conv(char *buf, size_t len, u_int val, u_int factor) { u_int rem = val % factor; if (rem == 0) snprintf(buf, len, "%u", val / factor); else { while (rem % 10 == 0) rem /= 10; snprintf(buf, len, "%u.%u", val / factor, rem); } } static int sysctl_tp_tick(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; char buf[16]; u_int res, re; u_int cclk_ps = 1000000000 / sc->params.vpd.cclk; res = t4_read_reg(sc, A_TP_TIMER_RESOLUTION); switch (arg2) { case 0: /* timer_tick */ re = G_TIMERRESOLUTION(res); break; case 1: /* TCP timestamp tick */ re = G_TIMESTAMPRESOLUTION(res); break; case 2: /* DACK tick */ re = G_DELAYEDACKRESOLUTION(res); break; default: return (EDOOFUS); } unit_conv(buf, sizeof(buf), (cclk_ps << re), 1000000); return (sysctl_handle_string(oidp, buf, sizeof(buf), req)); } static int sysctl_tp_dack_timer(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; u_int res, dack_re, v; u_int cclk_ps = 1000000000 / sc->params.vpd.cclk; res = t4_read_reg(sc, A_TP_TIMER_RESOLUTION); dack_re = G_DELAYEDACKRESOLUTION(res); v = ((cclk_ps << dack_re) / 1000000) * t4_read_reg(sc, A_TP_DACK_TIMER); return (sysctl_handle_int(oidp, &v, 0, req)); } static int sysctl_tp_timer(SYSCTL_HANDLER_ARGS) { struct adapter *sc = arg1; int reg = arg2; u_int tre; u_long tp_tick_us, v; u_int cclk_ps = 1000000000 / sc->params.vpd.cclk; MPASS(reg == A_TP_RXT_MIN || reg == A_TP_RXT_MAX || reg == A_TP_PERS_MIN || reg == A_TP_PERS_MAX || reg == A_TP_KEEP_IDLE || reg == A_TP_KEEP_INTVL || reg == A_TP_INIT_SRTT || reg == A_TP_FINWAIT2_TIMER); tre = G_TIMERRESOLUTION(t4_read_reg(sc, A_TP_TIMER_RESOLUTION)); tp_tick_us = (cclk_ps << tre) / 1000000; if (reg == A_TP_INIT_SRTT) v = tp_tick_us * G_INITSRTT(t4_read_reg(sc, reg)); else v = tp_tick_us * t4_read_reg(sc, reg); return (sysctl_handle_long(oidp, &v, 0, req)); } #endif static uint32_t fconf_iconf_to_mode(uint32_t fconf, uint32_t iconf) { uint32_t mode; mode = T4_FILTER_IPv4 | T4_FILTER_IPv6 | T4_FILTER_IP_SADDR | T4_FILTER_IP_DADDR | T4_FILTER_IP_SPORT | T4_FILTER_IP_DPORT; if (fconf & F_FRAGMENTATION) mode |= T4_FILTER_IP_FRAGMENT; if (fconf & F_MPSHITTYPE) mode |= T4_FILTER_MPS_HIT_TYPE; if
(fconf & F_MACMATCH) mode |= T4_FILTER_MAC_IDX; if (fconf & F_ETHERTYPE) mode |= T4_FILTER_ETH_TYPE; if (fconf & F_PROTOCOL) mode |= T4_FILTER_IP_PROTO; if (fconf & F_TOS) mode |= T4_FILTER_IP_TOS; if (fconf & F_VLAN) mode |= T4_FILTER_VLAN; if (fconf & F_VNIC_ID) { mode |= T4_FILTER_VNIC; if (iconf & F_VNIC) mode |= T4_FILTER_IC_VNIC; } if (fconf & F_PORT) mode |= T4_FILTER_PORT; if (fconf & F_FCOE) mode |= T4_FILTER_FCoE; return (mode); } static uint32_t mode_to_fconf(uint32_t mode) { uint32_t fconf = 0; if (mode & T4_FILTER_IP_FRAGMENT) fconf |= F_FRAGMENTATION; if (mode & T4_FILTER_MPS_HIT_TYPE) fconf |= F_MPSHITTYPE; if (mode & T4_FILTER_MAC_IDX) fconf |= F_MACMATCH; if (mode & T4_FILTER_ETH_TYPE) fconf |= F_ETHERTYPE; if (mode & T4_FILTER_IP_PROTO) fconf |= F_PROTOCOL; if (mode & T4_FILTER_IP_TOS) fconf |= F_TOS; if (mode & T4_FILTER_VLAN) fconf |= F_VLAN; if (mode & T4_FILTER_VNIC) fconf |= F_VNIC_ID; if (mode & T4_FILTER_PORT) fconf |= F_PORT; if (mode & T4_FILTER_FCoE) fconf |= F_FCOE; return (fconf); } static uint32_t mode_to_iconf(uint32_t mode) { if (mode & T4_FILTER_IC_VNIC) return (F_VNIC); return (0); } static int check_fspec_against_fconf_iconf(struct adapter *sc, struct t4_filter_specification *fs) { struct tp_params *tpp = &sc->params.tp; uint32_t fconf = 0; if (fs->val.frag || fs->mask.frag) fconf |= F_FRAGMENTATION; if (fs->val.matchtype || fs->mask.matchtype) fconf |= F_MPSHITTYPE; if (fs->val.macidx || fs->mask.macidx) fconf |= F_MACMATCH; if (fs->val.ethtype || fs->mask.ethtype) fconf |= F_ETHERTYPE; if (fs->val.proto || fs->mask.proto) fconf |= F_PROTOCOL; if (fs->val.tos || fs->mask.tos) fconf |= F_TOS; if (fs->val.vlan_vld || fs->mask.vlan_vld) fconf |= F_VLAN; if (fs->val.ovlan_vld || fs->mask.ovlan_vld) { fconf |= F_VNIC_ID; if (tpp->ingress_config & F_VNIC) return (EINVAL); } if (fs->val.pfvf_vld || fs->mask.pfvf_vld) { fconf |= F_VNIC_ID; if ((tpp->ingress_config & F_VNIC) == 0) return (EINVAL); } if (fs->val.iport || fs->mask.iport) fconf |= F_PORT; if (fs->val.fcoe || fs->mask.fcoe) fconf |= F_FCOE; if ((tpp->vlan_pri_map | fconf) != tpp->vlan_pri_map) return (E2BIG); return (0); } static int get_filter_mode(struct adapter *sc, uint32_t *mode) { struct tp_params *tpp = &sc->params.tp; /* * We trust the cached values of the relevant TP registers. This means * things work reliably only if writes to those registers are always via * t4_set_filter_mode. */ *mode = fconf_iconf_to_mode(tpp->vlan_pri_map, tpp->ingress_config); return (0); } static int set_filter_mode(struct adapter *sc, uint32_t mode) { struct tp_params *tpp = &sc->params.tp; uint32_t fconf, iconf; int rc; iconf = mode_to_iconf(mode); if ((iconf ^ tpp->ingress_config) & F_VNIC) { /* * For now we just complain if A_TP_INGRESS_CONFIG is not * already set to the correct value for the requested filter * mode. It's not clear if it's safe to write to this register * on the fly. (And we trust the cached value of the register). 
*/ return (EBUSY); } fconf = mode_to_fconf(mode); rc = begin_synchronized_op(sc, NULL, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4setfm"); if (rc) return (rc); if (sc->tids.ftids_in_use > 0) { rc = EBUSY; goto done; } #ifdef TCP_OFFLOAD if (uld_active(sc, ULD_TOM)) { rc = EBUSY; goto done; } #endif rc = -t4_set_filter_mode(sc, fconf); done: end_synchronized_op(sc, LOCK_HELD); return (rc); } static inline uint64_t get_filter_hits(struct adapter *sc, uint32_t fid) { uint32_t tcb_addr; tcb_addr = t4_read_reg(sc, A_TP_CMM_TCB_BASE) + (fid + sc->tids.ftid_base) * TCB_SIZE; if (is_t4(sc)) { uint64_t hits; read_via_memwin(sc, 0, tcb_addr + 16, (uint32_t *)&hits, 8); return (be64toh(hits)); } else { uint32_t hits; read_via_memwin(sc, 0, tcb_addr + 24, &hits, 4); return (be32toh(hits)); } } static int get_filter(struct adapter *sc, struct t4_filter *t) { int i, rc, nfilters = sc->tids.nftids; struct filter_entry *f; rc = begin_synchronized_op(sc, NULL, HOLD_LOCK | SLEEP_OK | INTR_OK, "t4getf"); if (rc) return (rc); if (sc->tids.ftids_in_use == 0 || sc->tids.ftid_tab == NULL || t->idx >= nfilters) { t->idx = 0xffffffff; goto done; } f = &sc->tids.ftid_tab[t->idx]; for (i = t->idx; i < nfilters; i++, f++) { if (f->valid) { t->idx = i; t->l2tidx = f->l2t ? f->l2t->idx : 0; t->smtidx = f->smtidx; if (f->fs.hitcnts) t->hits = get_filter_hits(sc, t->idx); else t->hits = UINT64_MAX; t->fs = f->fs; goto done; } } t->idx = 0xffffffff; done: end_synchronized_op(sc, LOCK_HELD); return (0); } static int set_filter(struct adapter *sc, struct t4_filter *t) { unsigned int nfilters, nports; struct filter_entry *f; int i, rc; rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4setf"); if (rc) return (rc); nfilters = sc->tids.nftids; nports = sc->params.nports; if (nfilters == 0) { rc = ENOTSUP; goto done; } if (t->idx >= nfilters) { rc = EINVAL; goto done; } /* Validate against the global filter mode and ingress config */ rc = check_fspec_against_fconf_iconf(sc, &t->fs); if (rc != 0) goto done; if (t->fs.action == FILTER_SWITCH && t->fs.eport >= nports) { rc = EINVAL; goto done; } if (t->fs.val.iport >= nports) { rc = EINVAL; goto done; } /* Can't specify an iq if not steering to it */ if (!t->fs.dirsteer && t->fs.iq) { rc = EINVAL; goto done; } /* IPv6 filter idx must be 4 aligned */ if (t->fs.type == 1 && ((t->idx & 0x3) || t->idx + 4 >= nfilters)) { rc = EINVAL; goto done; } if (!(sc->flags & FULL_INIT_DONE) && ((rc = adapter_full_init(sc)) != 0)) goto done; if (sc->tids.ftid_tab == NULL) { KASSERT(sc->tids.ftids_in_use == 0, ("%s: no memory allocated but filters_in_use > 0", __func__)); sc->tids.ftid_tab = malloc(sizeof (struct filter_entry) * nfilters, M_CXGBE, M_NOWAIT | M_ZERO); if (sc->tids.ftid_tab == NULL) { rc = ENOMEM; goto done; } mtx_init(&sc->tids.ftid_lock, "T4 filters", 0, MTX_DEF); } for (i = 0; i < 4; i++) { f = &sc->tids.ftid_tab[t->idx + i]; if (f->pending || f->valid) { rc = EBUSY; goto done; } if (f->locked) { rc = EPERM; goto done; } if (t->fs.type == 0) break; } f = &sc->tids.ftid_tab[t->idx]; f->fs = t->fs; rc = set_filter_wr(sc, t->idx); done: end_synchronized_op(sc, 0); if (rc == 0) { mtx_lock(&sc->tids.ftid_lock); for (;;) { if (f->pending == 0) { rc = f->valid ? 
0 : EIO; break; } if (mtx_sleep(&sc->tids.ftid_tab, &sc->tids.ftid_lock, PCATCH, "t4setfw", 0)) { rc = EINPROGRESS; break; } } mtx_unlock(&sc->tids.ftid_lock); } return (rc); } static int del_filter(struct adapter *sc, struct t4_filter *t) { unsigned int nfilters; struct filter_entry *f; int rc; rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4delf"); if (rc) return (rc); nfilters = sc->tids.nftids; if (nfilters == 0) { rc = ENOTSUP; goto done; } if (sc->tids.ftid_tab == NULL || sc->tids.ftids_in_use == 0 || t->idx >= nfilters) { rc = EINVAL; goto done; } if (!(sc->flags & FULL_INIT_DONE)) { rc = EAGAIN; goto done; } f = &sc->tids.ftid_tab[t->idx]; if (f->pending) { rc = EBUSY; goto done; } if (f->locked) { rc = EPERM; goto done; } if (f->valid) { t->fs = f->fs; /* extra info for the caller */ rc = del_filter_wr(sc, t->idx); } done: end_synchronized_op(sc, 0); if (rc == 0) { mtx_lock(&sc->tids.ftid_lock); for (;;) { if (f->pending == 0) { rc = f->valid ? EIO : 0; break; } if (mtx_sleep(&sc->tids.ftid_tab, &sc->tids.ftid_lock, PCATCH, "t4delfw", 0)) { rc = EINPROGRESS; break; } } mtx_unlock(&sc->tids.ftid_lock); } return (rc); } static void clear_filter(struct filter_entry *f) { if (f->l2t) t4_l2t_release(f->l2t); bzero(f, sizeof (*f)); } static int set_filter_wr(struct adapter *sc, int fidx) { struct filter_entry *f = &sc->tids.ftid_tab[fidx]; struct fw_filter_wr *fwr; unsigned int ftid, vnic_vld, vnic_vld_mask; struct wrq_cookie cookie; ASSERT_SYNCHRONIZED_OP(sc); if (f->fs.newdmac || f->fs.newvlan) { /* This filter needs an L2T entry; allocate one. */ f->l2t = t4_l2t_alloc_switching(sc->l2t); if (f->l2t == NULL) return (EAGAIN); if (t4_l2t_set_switching(sc, f->l2t, f->fs.vlan, f->fs.eport, f->fs.dmac)) { t4_l2t_release(f->l2t); f->l2t = NULL; return (ENOMEM); } } /* Already validated against fconf, iconf */ MPASS((f->fs.val.pfvf_vld & f->fs.val.ovlan_vld) == 0); MPASS((f->fs.mask.pfvf_vld & f->fs.mask.ovlan_vld) == 0); if (f->fs.val.pfvf_vld || f->fs.val.ovlan_vld) vnic_vld = 1; else vnic_vld = 0; if (f->fs.mask.pfvf_vld || f->fs.mask.ovlan_vld) vnic_vld_mask = 1; else vnic_vld_mask = 0; ftid = sc->tids.ftid_base + fidx; fwr = start_wrq_wr(&sc->sge.mgmtq, howmany(sizeof(*fwr), 16), &cookie); if (fwr == NULL) return (ENOMEM); bzero(fwr, sizeof(*fwr)); fwr->op_pkd = htobe32(V_FW_WR_OP(FW_FILTER_WR)); fwr->len16_pkd = htobe32(FW_LEN16(*fwr)); fwr->tid_to_iq = htobe32(V_FW_FILTER_WR_TID(ftid) | V_FW_FILTER_WR_RQTYPE(f->fs.type) | V_FW_FILTER_WR_NOREPLY(0) | V_FW_FILTER_WR_IQ(f->fs.iq)); fwr->del_filter_to_l2tix = htobe32(V_FW_FILTER_WR_RPTTID(f->fs.rpttid) | V_FW_FILTER_WR_DROP(f->fs.action == FILTER_DROP) | V_FW_FILTER_WR_DIRSTEER(f->fs.dirsteer) | V_FW_FILTER_WR_MASKHASH(f->fs.maskhash) | V_FW_FILTER_WR_DIRSTEERHASH(f->fs.dirsteerhash) | V_FW_FILTER_WR_LPBK(f->fs.action == FILTER_SWITCH) | V_FW_FILTER_WR_DMAC(f->fs.newdmac) | V_FW_FILTER_WR_SMAC(f->fs.newsmac) | V_FW_FILTER_WR_INSVLAN(f->fs.newvlan == VLAN_INSERT || f->fs.newvlan == VLAN_REWRITE) | V_FW_FILTER_WR_RMVLAN(f->fs.newvlan == VLAN_REMOVE || f->fs.newvlan == VLAN_REWRITE) | V_FW_FILTER_WR_HITCNTS(f->fs.hitcnts) | V_FW_FILTER_WR_TXCHAN(f->fs.eport) | V_FW_FILTER_WR_PRIO(f->fs.prio) | V_FW_FILTER_WR_L2TIX(f->l2t ? 
f->l2t->idx : 0)); fwr->ethtype = htobe16(f->fs.val.ethtype); fwr->ethtypem = htobe16(f->fs.mask.ethtype); fwr->frag_to_ovlan_vldm = (V_FW_FILTER_WR_FRAG(f->fs.val.frag) | V_FW_FILTER_WR_FRAGM(f->fs.mask.frag) | V_FW_FILTER_WR_IVLAN_VLD(f->fs.val.vlan_vld) | V_FW_FILTER_WR_OVLAN_VLD(vnic_vld) | V_FW_FILTER_WR_IVLAN_VLDM(f->fs.mask.vlan_vld) | V_FW_FILTER_WR_OVLAN_VLDM(vnic_vld_mask)); fwr->smac_sel = 0; fwr->rx_chan_rx_rpl_iq = htobe16(V_FW_FILTER_WR_RX_CHAN(0) | V_FW_FILTER_WR_RX_RPL_IQ(sc->sge.fwq.abs_id)); fwr->maci_to_matchtypem = htobe32(V_FW_FILTER_WR_MACI(f->fs.val.macidx) | V_FW_FILTER_WR_MACIM(f->fs.mask.macidx) | V_FW_FILTER_WR_FCOE(f->fs.val.fcoe) | V_FW_FILTER_WR_FCOEM(f->fs.mask.fcoe) | V_FW_FILTER_WR_PORT(f->fs.val.iport) | V_FW_FILTER_WR_PORTM(f->fs.mask.iport) | V_FW_FILTER_WR_MATCHTYPE(f->fs.val.matchtype) | V_FW_FILTER_WR_MATCHTYPEM(f->fs.mask.matchtype)); fwr->ptcl = f->fs.val.proto; fwr->ptclm = f->fs.mask.proto; fwr->ttyp = f->fs.val.tos; fwr->ttypm = f->fs.mask.tos; fwr->ivlan = htobe16(f->fs.val.vlan); fwr->ivlanm = htobe16(f->fs.mask.vlan); fwr->ovlan = htobe16(f->fs.val.vnic); fwr->ovlanm = htobe16(f->fs.mask.vnic); bcopy(f->fs.val.dip, fwr->lip, sizeof (fwr->lip)); bcopy(f->fs.mask.dip, fwr->lipm, sizeof (fwr->lipm)); bcopy(f->fs.val.sip, fwr->fip, sizeof (fwr->fip)); bcopy(f->fs.mask.sip, fwr->fipm, sizeof (fwr->fipm)); fwr->lp = htobe16(f->fs.val.dport); fwr->lpm = htobe16(f->fs.mask.dport); fwr->fp = htobe16(f->fs.val.sport); fwr->fpm = htobe16(f->fs.mask.sport); if (f->fs.newsmac) bcopy(f->fs.smac, fwr->sma, sizeof (fwr->sma)); f->pending = 1; sc->tids.ftids_in_use++; commit_wrq_wr(&sc->sge.mgmtq, fwr, &cookie); return (0); } static int del_filter_wr(struct adapter *sc, int fidx) { struct filter_entry *f = &sc->tids.ftid_tab[fidx]; struct fw_filter_wr *fwr; unsigned int ftid; struct wrq_cookie cookie; ftid = sc->tids.ftid_base + fidx; fwr = start_wrq_wr(&sc->sge.mgmtq, howmany(sizeof(*fwr), 16), &cookie); if (fwr == NULL) return (ENOMEM); bzero(fwr, sizeof (*fwr)); t4_mk_filtdelwr(ftid, fwr, sc->sge.fwq.abs_id); f->pending = 1; commit_wrq_wr(&sc->sge.mgmtq, fwr, &cookie); return (0); } int t4_filter_rpl(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { struct adapter *sc = iq->adapter; const struct cpl_set_tcb_rpl *rpl = (const void *)(rss + 1); unsigned int idx = GET_TID(rpl); unsigned int rc; struct filter_entry *f; KASSERT(m == NULL, ("%s: payload with opcode %02x", __func__, rss->opcode)); MPASS(iq == &sc->sge.fwq); MPASS(is_ftid(sc, idx)); idx -= sc->tids.ftid_base; f = &sc->tids.ftid_tab[idx]; rc = G_COOKIE(rpl->cookie); mtx_lock(&sc->tids.ftid_lock); if (rc == FW_FILTER_WR_FLT_ADDED) { KASSERT(f->pending, ("%s: filter[%u] isn't pending.", __func__, idx)); f->smtidx = (be64toh(rpl->oldval) >> 24) & 0xff; f->pending = 0; /* asynchronous setup completed */ f->valid = 1; } else { if (rc != FW_FILTER_WR_FLT_DELETED) { /* Add or delete failed, display an error */ log(LOG_ERR, "filter %u setup failed with error %u\n", idx, rc); } clear_filter(f); sc->tids.ftids_in_use--; } wakeup(&sc->tids.ftid_tab); mtx_unlock(&sc->tids.ftid_lock); return (0); } static int set_tcb_rpl(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { MPASS(iq->set_tcb_rpl != NULL); return (iq->set_tcb_rpl(iq, rss, m)); } static int l2t_write_rpl(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { MPASS(iq->l2t_write_rpl != NULL); return (iq->l2t_write_rpl(iq, rss, m)); } static int get_sge_context(struct adapter *sc, struct t4_sge_context 
*cntxt) { int rc; if (cntxt->cid > M_CTXTQID) return (EINVAL); if (cntxt->mem_id != CTXT_EGRESS && cntxt->mem_id != CTXT_INGRESS && cntxt->mem_id != CTXT_FLM && cntxt->mem_id != CTXT_CNM) return (EINVAL); rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4ctxt"); if (rc) return (rc); if (sc->flags & FW_OK) { rc = -t4_sge_ctxt_rd(sc, sc->mbox, cntxt->cid, cntxt->mem_id, &cntxt->data[0]); if (rc == 0) goto done; } /* * Read via firmware failed or wasn't even attempted. Read directly via * the backdoor. */ rc = -t4_sge_ctxt_rd_bd(sc, cntxt->cid, cntxt->mem_id, &cntxt->data[0]); done: end_synchronized_op(sc, 0); return (rc); } static int load_fw(struct adapter *sc, struct t4_data *fw) { int rc; uint8_t *fw_data; rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4ldfw"); if (rc) return (rc); if (sc->flags & FULL_INIT_DONE) { rc = EBUSY; goto done; } fw_data = malloc(fw->len, M_CXGBE, M_WAITOK); if (fw_data == NULL) { rc = ENOMEM; goto done; } rc = copyin(fw->data, fw_data, fw->len); if (rc == 0) rc = -t4_load_fw(sc, fw_data, fw->len); free(fw_data, M_CXGBE); done: end_synchronized_op(sc, 0); return (rc); } #define MAX_READ_BUF_SIZE (128 * 1024) static int read_card_mem(struct adapter *sc, int win, struct t4_mem_range *mr) { uint32_t addr, remaining, n; uint32_t *buf; int rc; uint8_t *dst; rc = validate_mem_range(sc, mr->addr, mr->len); if (rc != 0) return (rc); buf = malloc(min(mr->len, MAX_READ_BUF_SIZE), M_CXGBE, M_WAITOK); addr = mr->addr; remaining = mr->len; dst = (void *)mr->data; while (remaining) { n = min(remaining, MAX_READ_BUF_SIZE); read_via_memwin(sc, 2, addr, buf, n); rc = copyout(buf, dst, n); if (rc != 0) break; dst += n; remaining -= n; addr += n; } free(buf, M_CXGBE); return (rc); } #undef MAX_READ_BUF_SIZE static int read_i2c(struct adapter *sc, struct t4_i2c_data *i2cd) { int rc; if (i2cd->len == 0 || i2cd->port_id >= sc->params.nports) return (EINVAL); if (i2cd->len > sizeof(i2cd->data)) return (EFBIG); rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4i2crd"); if (rc) return (rc); rc = -t4_i2c_rd(sc, sc->mbox, i2cd->port_id, i2cd->dev_addr, i2cd->offset, i2cd->len, &i2cd->data[0]); end_synchronized_op(sc, 0); return (rc); } static int in_range(int val, int lo, int hi) { return (val < 0 || (val <= hi && val >= lo)); } static int set_sched_class_config(struct adapter *sc, int minmax) { int rc; if (minmax < 0) return (EINVAL); rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4sscc"); if (rc) return (rc); rc = -t4_sched_config(sc, FW_SCHED_TYPE_PKTSCHED, minmax, 1); end_synchronized_op(sc, 0); return (rc); } static int set_sched_class_params(struct adapter *sc, struct t4_sched_class_params *p, int sleep_ok) { int rc, top_speed, fw_level, fw_mode, fw_rateunit, fw_ratemode; struct port_info *pi; struct tx_sched_class *tc; if (p->level == SCHED_CLASS_LEVEL_CL_RL) fw_level = FW_SCHED_PARAMS_LEVEL_CL_RL; else if (p->level == SCHED_CLASS_LEVEL_CL_WRR) fw_level = FW_SCHED_PARAMS_LEVEL_CL_WRR; else if (p->level == SCHED_CLASS_LEVEL_CH_RL) fw_level = FW_SCHED_PARAMS_LEVEL_CH_RL; else return (EINVAL); if (p->mode == SCHED_CLASS_MODE_CLASS) fw_mode = FW_SCHED_PARAMS_MODE_CLASS; else if (p->mode == SCHED_CLASS_MODE_FLOW) fw_mode = FW_SCHED_PARAMS_MODE_FLOW; else return (EINVAL); if (p->rateunit == SCHED_CLASS_RATEUNIT_BITS) fw_rateunit = FW_SCHED_PARAMS_UNIT_BITRATE; else if (p->rateunit == SCHED_CLASS_RATEUNIT_PKTS) fw_rateunit = FW_SCHED_PARAMS_UNIT_PKTRATE; else return (EINVAL); if (p->ratemode == SCHED_CLASS_RATEMODE_REL) fw_ratemode = 
FW_SCHED_PARAMS_RATE_REL; else if (p->ratemode == SCHED_CLASS_RATEMODE_ABS) fw_ratemode = FW_SCHED_PARAMS_RATE_ABS; else return (EINVAL); /* Vet our parameters ... */ if (!in_range(p->channel, 0, sc->chip_params->nchan - 1)) return (ERANGE); pi = sc->port[sc->chan_map[p->channel]]; if (pi == NULL) return (ENXIO); MPASS(pi->tx_chan == p->channel); top_speed = port_top_speed(pi) * 1000000; /* Gbps -> Kbps */ if (!in_range(p->cl, 0, sc->chip_params->nsched_cls) || !in_range(p->minrate, 0, top_speed) || !in_range(p->maxrate, 0, top_speed) || !in_range(p->weight, 0, 100)) return (ERANGE); /* * Translate any unset parameters into the firmware's * nomenclature and/or fail the call if the parameters * are required ... */ if (p->rateunit < 0 || p->ratemode < 0 || p->channel < 0 || p->cl < 0) return (EINVAL); if (p->minrate < 0) p->minrate = 0; if (p->maxrate < 0) { if (p->level == SCHED_CLASS_LEVEL_CL_RL || p->level == SCHED_CLASS_LEVEL_CH_RL) return (EINVAL); else p->maxrate = 0; } if (p->weight < 0) { if (p->level == SCHED_CLASS_LEVEL_CL_WRR) return (EINVAL); else p->weight = 0; } if (p->pktsize < 0) { if (p->level == SCHED_CLASS_LEVEL_CL_RL || p->level == SCHED_CLASS_LEVEL_CH_RL) return (EINVAL); else p->pktsize = 0; } rc = begin_synchronized_op(sc, NULL, sleep_ok ? (SLEEP_OK | INTR_OK) : HOLD_LOCK, "t4sscp"); if (rc) return (rc); tc = &pi->tc[p->cl]; tc->params = *p; rc = -t4_sched_params(sc, FW_SCHED_TYPE_PKTSCHED, fw_level, fw_mode, fw_rateunit, fw_ratemode, p->channel, p->cl, p->minrate, p->maxrate, p->weight, p->pktsize, sleep_ok); if (rc == 0) tc->flags |= TX_SC_OK; else { /* * Unknown state at this point, see tc->params for what was * attempted. */ tc->flags &= ~TX_SC_OK; } end_synchronized_op(sc, sleep_ok ? 0 : LOCK_HELD); return (rc); } int t4_set_sched_class(struct adapter *sc, struct t4_sched_params *p) { if (p->type != SCHED_CLASS_TYPE_PACKET) return (EINVAL); if (p->subcmd == SCHED_CLASS_SUBCMD_CONFIG) return (set_sched_class_config(sc, p->u.config.minmax)); if (p->subcmd == SCHED_CLASS_SUBCMD_PARAMS) return (set_sched_class_params(sc, &p->u.params, 1)); return (EINVAL); } int t4_set_sched_queue(struct adapter *sc, struct t4_sched_queue *p) { struct port_info *pi = NULL; struct vi_info *vi; struct sge_txq *txq; uint32_t fw_mnem, fw_queue, fw_class; int i, rc; rc = begin_synchronized_op(sc, NULL, SLEEP_OK | INTR_OK, "t4setsq"); if (rc) return (rc); if (p->port >= sc->params.nports) { rc = EINVAL; goto done; } /* XXX: Only supported for the main VI. */ pi = sc->port[p->port]; vi = &pi->vi[0]; if (!(vi->flags & VI_INIT_DONE)) { /* tx queues not set up yet */ rc = EAGAIN; goto done; } if (!in_range(p->queue, 0, vi->ntxq - 1) || !in_range(p->cl, 0, sc->chip_params->nsched_cls - 1)) { rc = EINVAL; goto done; } /* * Create a template for the FW_PARAMS_CMD mnemonic and value (TX * Scheduling Class in this case). */ fw_mnem = (V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DMAQ) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DMAQ_EQ_SCHEDCLASS_ETH)); fw_class = p->cl < 0 ? 0xffffffff : p->cl; /* * If op.queue is non-negative, then we're only changing the scheduling * on a single specified TX queue. */ if (p->queue >= 0) { txq = &sc->sge.txq[vi->first_txq + p->queue]; fw_queue = (fw_mnem | V_FW_PARAMS_PARAM_YZ(txq->eq.cntxt_id)); rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &fw_queue, &fw_class); goto done; } /* * Change the scheduling on all the TX queues for the * interface. 
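 * Each pass of the loop below reuses the FW_PARAMS mnemonic template
 * built earlier and only swaps in that queue's egress context id:
 *
 *	fw_queue = fw_mnem | V_FW_PARAMS_PARAM_YZ(txq->eq.cntxt_id);
 *
 * exactly as in the single-queue case handled above.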
*/ for_each_txq(vi, i, txq) { fw_queue = (fw_mnem | V_FW_PARAMS_PARAM_YZ(txq->eq.cntxt_id)); rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &fw_queue, &fw_class); if (rc) goto done; } rc = 0; done: end_synchronized_op(sc, 0); return (rc); } int t4_os_find_pci_capability(struct adapter *sc, int cap) { int i; return (pci_find_cap(sc->dev, cap, &i) == 0 ? i : 0); } int t4_os_pci_save_state(struct adapter *sc) { device_t dev; struct pci_devinfo *dinfo; dev = sc->dev; dinfo = device_get_ivars(dev); pci_cfg_save(dev, dinfo, 0); return (0); } int t4_os_pci_restore_state(struct adapter *sc) { device_t dev; struct pci_devinfo *dinfo; dev = sc->dev; dinfo = device_get_ivars(dev); pci_cfg_restore(dev, dinfo); return (0); } void t4_os_portmod_changed(const struct adapter *sc, int idx) { struct port_info *pi = sc->port[idx]; struct vi_info *vi; struct ifnet *ifp; int v; static const char *mod_str[] = { NULL, "LR", "SR", "ER", "TWINAX", "active TWINAX", "LRM" }; for_each_vi(pi, v, vi) { build_medialist(pi, &vi->media); } ifp = pi->vi[0].ifp; if (pi->mod_type == FW_PORT_MOD_TYPE_NONE) if_printf(ifp, "transceiver unplugged.\n"); else if (pi->mod_type == FW_PORT_MOD_TYPE_UNKNOWN) if_printf(ifp, "unknown transceiver inserted.\n"); else if (pi->mod_type == FW_PORT_MOD_TYPE_NOTSUPPORTED) if_printf(ifp, "unsupported transceiver inserted.\n"); else if (pi->mod_type > 0 && pi->mod_type < nitems(mod_str)) { if_printf(ifp, "%s transceiver inserted.\n", mod_str[pi->mod_type]); } else { if_printf(ifp, "transceiver (type %d) inserted.\n", pi->mod_type); } } void t4_os_link_changed(struct adapter *sc, int idx, int link_stat, int reason) { struct port_info *pi = sc->port[idx]; struct vi_info *vi; struct ifnet *ifp; int v; if (link_stat) pi->linkdnrc = -1; else { if (reason >= 0) pi->linkdnrc = reason; } for_each_vi(pi, v, vi) { ifp = vi->ifp; if (ifp == NULL) continue; if (link_stat) { ifp->if_baudrate = IF_Mbps(pi->link_cfg.speed); if_link_state_change(ifp, LINK_STATE_UP); } else { if_link_state_change(ifp, LINK_STATE_DOWN); } } } void t4_iterate(void (*func)(struct adapter *, void *), void *arg) { struct adapter *sc; sx_slock(&t4_list_lock); SLIST_FOREACH(sc, &t4_list, link) { /* * func should not make any assumptions about what state sc is * in - the only guarantee is that sc->sc_lock is a valid lock. 
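 *
 * A minimal usage sketch (count_one and its counter are hypothetical,
 * shown only to illustrate the callback contract):
 *
 *	static void
 *	count_one(struct adapter *sc, void *arg)
 *	{
 *		(*(int *)arg)++;
 *	}
 *
 *	int n = 0;
 *	t4_iterate(count_one, &n);	(n is now the number of adapters)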
*/ func(sc, arg); } sx_sunlock(&t4_list_lock); } static int t4_ioctl(struct cdev *dev, unsigned long cmd, caddr_t data, int fflag, struct thread *td) { int rc; struct adapter *sc = dev->si_drv1; rc = priv_check(td, PRIV_DRIVER); if (rc != 0) return (rc); switch (cmd) { case CHELSIO_T4_GETREG: { struct t4_reg *edata = (struct t4_reg *)data; if ((edata->addr & 0x3) != 0 || edata->addr >= sc->mmio_len) return (EFAULT); if (edata->size == 4) edata->val = t4_read_reg(sc, edata->addr); else if (edata->size == 8) edata->val = t4_read_reg64(sc, edata->addr); else return (EINVAL); break; } case CHELSIO_T4_SETREG: { struct t4_reg *edata = (struct t4_reg *)data; if ((edata->addr & 0x3) != 0 || edata->addr >= sc->mmio_len) return (EFAULT); if (edata->size == 4) { if (edata->val & 0xffffffff00000000) return (EINVAL); t4_write_reg(sc, edata->addr, (uint32_t) edata->val); } else if (edata->size == 8) t4_write_reg64(sc, edata->addr, edata->val); else return (EINVAL); break; } case CHELSIO_T4_REGDUMP: { struct t4_regdump *regs = (struct t4_regdump *)data; int reglen = t4_get_regs_len(sc); uint8_t *buf; if (regs->len < reglen) { regs->len = reglen; /* hint to the caller */ return (ENOBUFS); } regs->len = reglen; buf = malloc(reglen, M_CXGBE, M_WAITOK | M_ZERO); get_regs(sc, regs, buf); rc = copyout(buf, regs->data, reglen); free(buf, M_CXGBE); break; } case CHELSIO_T4_GET_FILTER_MODE: rc = get_filter_mode(sc, (uint32_t *)data); break; case CHELSIO_T4_SET_FILTER_MODE: rc = set_filter_mode(sc, *(uint32_t *)data); break; case CHELSIO_T4_GET_FILTER: rc = get_filter(sc, (struct t4_filter *)data); break; case CHELSIO_T4_SET_FILTER: rc = set_filter(sc, (struct t4_filter *)data); break; case CHELSIO_T4_DEL_FILTER: rc = del_filter(sc, (struct t4_filter *)data); break; case CHELSIO_T4_GET_SGE_CONTEXT: rc = get_sge_context(sc, (struct t4_sge_context *)data); break; case CHELSIO_T4_LOAD_FW: rc = load_fw(sc, (struct t4_data *)data); break; case CHELSIO_T4_GET_MEM: rc = read_card_mem(sc, 2, (struct t4_mem_range *)data); break; case CHELSIO_T4_GET_I2C: rc = read_i2c(sc, (struct t4_i2c_data *)data); break; case CHELSIO_T4_CLEAR_STATS: { int i, v; u_int port_id = *(uint32_t *)data; struct port_info *pi; struct vi_info *vi; if (port_id >= sc->params.nports) return (EINVAL); pi = sc->port[port_id]; + if (pi == NULL) + return (EIO); /* MAC stats */ t4_clr_port_stats(sc, pi->tx_chan); pi->tx_parse_error = 0; mtx_lock(&sc->reg_lock); for_each_vi(pi, v, vi) { if (vi->flags & VI_INIT_DONE) t4_clr_vi_stats(sc, vi->viid); } mtx_unlock(&sc->reg_lock); /* * Since this command accepts a port, clear stats for * all VIs on this port. 
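 * Note that only VIs that have completed vi_full_init() (VI_INIT_DONE
 * set) own SGE queues, which is why each one is checked before its
 * queue counters are touched.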
*/ for_each_vi(pi, v, vi) { if (vi->flags & VI_INIT_DONE) { struct sge_rxq *rxq; struct sge_txq *txq; struct sge_wrq *wrq; for_each_rxq(vi, i, rxq) { #if defined(INET) || defined(INET6) rxq->lro.lro_queued = 0; rxq->lro.lro_flushed = 0; #endif rxq->rxcsum = 0; rxq->vlan_extraction = 0; } for_each_txq(vi, i, txq) { txq->txcsum = 0; txq->tso_wrs = 0; txq->vlan_insertion = 0; txq->imm_wrs = 0; txq->sgl_wrs = 0; txq->txpkt_wrs = 0; txq->txpkts0_wrs = 0; txq->txpkts1_wrs = 0; txq->txpkts0_pkts = 0; txq->txpkts1_pkts = 0; mp_ring_reset_stats(txq->r); } #ifdef TCP_OFFLOAD /* nothing to clear for each ofld_rxq */ for_each_ofld_txq(vi, i, wrq) { wrq->tx_wrs_direct = 0; wrq->tx_wrs_copied = 0; } #endif if (IS_MAIN_VI(vi)) { wrq = &sc->sge.ctrlq[pi->port_id]; wrq->tx_wrs_direct = 0; wrq->tx_wrs_copied = 0; } } } break; } case CHELSIO_T4_SCHED_CLASS: rc = t4_set_sched_class(sc, (struct t4_sched_params *)data); break; case CHELSIO_T4_SCHED_QUEUE: rc = t4_set_sched_queue(sc, (struct t4_sched_queue *)data); break; case CHELSIO_T4_GET_TRACER: rc = t4_get_tracer(sc, (struct t4_tracer *)data); break; case CHELSIO_T4_SET_TRACER: rc = t4_set_tracer(sc, (struct t4_tracer *)data); break; default: rc = ENOTTY; } return (rc); } void t4_db_full(struct adapter *sc) { CXGBE_UNIMPLEMENTED(__func__); } void t4_db_dropped(struct adapter *sc) { CXGBE_UNIMPLEMENTED(__func__); } #ifdef TCP_OFFLOAD static int toe_capability(struct vi_info *vi, int enable) { int rc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; ASSERT_SYNCHRONIZED_OP(sc); if (!is_offload(sc)) return (ENODEV); if (enable) { if ((vi->ifp->if_capenable & IFCAP_TOE) != 0) { /* TOE is already enabled. */ return (0); } /* * We need the port's queues around so that we're able to send * and receive CPLs to/from the TOE even if the ifnet for this * port has never been UP'd administratively. */ if (!(vi->flags & VI_INIT_DONE)) { rc = vi_full_init(vi); if (rc) return (rc); } if (!(pi->vi[0].flags & VI_INIT_DONE)) { rc = vi_full_init(&pi->vi[0]); if (rc) return (rc); } if (isset(&sc->offload_map, pi->port_id)) { /* TOE is enabled on another VI of this port. */ pi->uld_vis++; return (0); } if (!uld_active(sc, ULD_TOM)) { rc = t4_activate_uld(sc, ULD_TOM); if (rc == EAGAIN) { log(LOG_WARNING, "You must kldload t4_tom.ko before trying " "to enable TOE on a cxgbe interface.\n"); } if (rc != 0) return (rc); KASSERT(sc->tom_softc != NULL, ("%s: TOM activated but softc NULL", __func__)); KASSERT(uld_active(sc, ULD_TOM), ("%s: TOM activated but flag not set", __func__)); } /* Activate iWARP and iSCSI too, if the modules are loaded. */ if (!uld_active(sc, ULD_IWARP)) (void) t4_activate_uld(sc, ULD_IWARP); if (!uld_active(sc, ULD_ISCSI)) (void) t4_activate_uld(sc, ULD_ISCSI); pi->uld_vis++; setbit(&sc->offload_map, pi->port_id); } else { pi->uld_vis--; if (!isset(&sc->offload_map, pi->port_id) || pi->uld_vis > 0) return (0); KASSERT(uld_active(sc, ULD_TOM), ("%s: TOM never initialized?", __func__)); clrbit(&sc->offload_map, pi->port_id); } return (0); } /* * Add an upper layer driver to the global list. 
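 *
 * A hypothetical registration sketch (example_uld and its callbacks
 * are made-up names; the fields are the ones this file uses below):
 *
 *	static struct uld_info example_uld = {
 *		.uld_id = ULD_TOM,
 *		.activate = example_activate,
 *		.deactivate = example_deactivate,
 *	};
 *
 *	error = t4_register_uld(&example_uld);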
*/ int t4_register_uld(struct uld_info *ui) { int rc = 0; struct uld_info *u; sx_xlock(&t4_uld_list_lock); SLIST_FOREACH(u, &t4_uld_list, link) { if (u->uld_id == ui->uld_id) { rc = EEXIST; goto done; } } SLIST_INSERT_HEAD(&t4_uld_list, ui, link); ui->refcount = 0; done: sx_xunlock(&t4_uld_list_lock); return (rc); } int t4_unregister_uld(struct uld_info *ui) { int rc = EINVAL; struct uld_info *u; sx_xlock(&t4_uld_list_lock); SLIST_FOREACH(u, &t4_uld_list, link) { if (u == ui) { if (ui->refcount > 0) { rc = EBUSY; goto done; } SLIST_REMOVE(&t4_uld_list, ui, uld_info, link); rc = 0; goto done; } } done: sx_xunlock(&t4_uld_list_lock); return (rc); } int t4_activate_uld(struct adapter *sc, int id) { int rc; struct uld_info *ui; ASSERT_SYNCHRONIZED_OP(sc); if (id < 0 || id > ULD_MAX) return (EINVAL); rc = EAGAIN; /* kldload the module with this ULD and try again. */ sx_slock(&t4_uld_list_lock); SLIST_FOREACH(ui, &t4_uld_list, link) { if (ui->uld_id == id) { if (!(sc->flags & FULL_INIT_DONE)) { rc = adapter_full_init(sc); if (rc != 0) break; } rc = ui->activate(sc); if (rc == 0) { setbit(&sc->active_ulds, id); ui->refcount++; } break; } } sx_sunlock(&t4_uld_list_lock); return (rc); } int t4_deactivate_uld(struct adapter *sc, int id) { int rc; struct uld_info *ui; ASSERT_SYNCHRONIZED_OP(sc); if (id < 0 || id > ULD_MAX) return (EINVAL); rc = ENXIO; sx_slock(&t4_uld_list_lock); SLIST_FOREACH(ui, &t4_uld_list, link) { if (ui->uld_id == id) { rc = ui->deactivate(sc); if (rc == 0) { clrbit(&sc->active_ulds, id); ui->refcount--; } break; } } sx_sunlock(&t4_uld_list_lock); return (rc); } int uld_active(struct adapter *sc, int uld_id) { MPASS(uld_id >= 0 && uld_id <= ULD_MAX); return (isset(&sc->active_ulds, uld_id)); } #endif /* * Come up with reasonable defaults for some of the tunables, provided they're * not set by the user (in which case we'll use the values as is). */ static void tweak_tunables(void) { int nc = mp_ncpus; /* our snapshot of the number of CPUs */ if (t4_ntxq10g < 1) { #ifdef RSS t4_ntxq10g = rss_getnumbuckets(); #else t4_ntxq10g = min(nc, NTXQ_10G); #endif } if (t4_ntxq1g < 1) { #ifdef RSS /* XXX: way too many for 1GbE? */ t4_ntxq1g = rss_getnumbuckets(); #else t4_ntxq1g = min(nc, NTXQ_1G); #endif } if (t4_ntxq_vi < 1) t4_ntxq_vi = min(nc, NTXQ_VI); if (t4_nrxq10g < 1) { #ifdef RSS t4_nrxq10g = rss_getnumbuckets(); #else t4_nrxq10g = min(nc, NRXQ_10G); #endif } if (t4_nrxq1g < 1) { #ifdef RSS /* XXX: way too many for 1GbE?
*/ t4_nrxq1g = rss_getnumbuckets(); #else t4_nrxq1g = min(nc, NRXQ_1G); #endif } if (t4_nrxq_vi < 1) t4_nrxq_vi = min(nc, NRXQ_VI); #ifdef TCP_OFFLOAD if (t4_nofldtxq10g < 1) t4_nofldtxq10g = min(nc, NOFLDTXQ_10G); if (t4_nofldtxq1g < 1) t4_nofldtxq1g = min(nc, NOFLDTXQ_1G); if (t4_nofldtxq_vi < 1) t4_nofldtxq_vi = min(nc, NOFLDTXQ_VI); if (t4_nofldrxq10g < 1) t4_nofldrxq10g = min(nc, NOFLDRXQ_10G); if (t4_nofldrxq1g < 1) t4_nofldrxq1g = min(nc, NOFLDRXQ_1G); if (t4_nofldrxq_vi < 1) t4_nofldrxq_vi = min(nc, NOFLDRXQ_VI); if (t4_toecaps_allowed == -1) t4_toecaps_allowed = FW_CAPS_CONFIG_TOE; if (t4_rdmacaps_allowed == -1) { t4_rdmacaps_allowed = FW_CAPS_CONFIG_RDMA_RDDP | FW_CAPS_CONFIG_RDMA_RDMAC; } if (t4_iscsicaps_allowed == -1) { t4_iscsicaps_allowed = FW_CAPS_CONFIG_ISCSI_INITIATOR_PDU | FW_CAPS_CONFIG_ISCSI_TARGET_PDU | FW_CAPS_CONFIG_ISCSI_T10DIF; } #else if (t4_toecaps_allowed == -1) t4_toecaps_allowed = 0; if (t4_rdmacaps_allowed == -1) t4_rdmacaps_allowed = 0; if (t4_iscsicaps_allowed == -1) t4_iscsicaps_allowed = 0; #endif #ifdef DEV_NETMAP if (t4_nnmtxq_vi < 1) t4_nnmtxq_vi = min(nc, NNMTXQ_VI); if (t4_nnmrxq_vi < 1) t4_nnmrxq_vi = min(nc, NNMRXQ_VI); #endif if (t4_tmr_idx_10g < 0 || t4_tmr_idx_10g >= SGE_NTIMERS) t4_tmr_idx_10g = TMR_IDX_10G; if (t4_pktc_idx_10g < -1 || t4_pktc_idx_10g >= SGE_NCOUNTERS) t4_pktc_idx_10g = PKTC_IDX_10G; if (t4_tmr_idx_1g < 0 || t4_tmr_idx_1g >= SGE_NTIMERS) t4_tmr_idx_1g = TMR_IDX_1G; if (t4_pktc_idx_1g < -1 || t4_pktc_idx_1g >= SGE_NCOUNTERS) t4_pktc_idx_1g = PKTC_IDX_1G; if (t4_qsize_txq < 128) t4_qsize_txq = 128; if (t4_qsize_rxq < 128) t4_qsize_rxq = 128; while (t4_qsize_rxq & 7) t4_qsize_rxq++; t4_intr_types &= INTR_MSIX | INTR_MSI | INTR_INTX; } #ifdef DDB static void t4_dump_tcb(struct adapter *sc, int tid) { uint32_t base, i, j, off, pf, reg, save, tcb_addr, win_pos; reg = PCIE_MEM_ACCESS_REG(A_PCIE_MEM_ACCESS_OFFSET, 2); save = t4_read_reg(sc, reg); base = sc->memwin[2].mw_base; /* Dump TCB for the tid */ tcb_addr = t4_read_reg(sc, A_TP_CMM_TCB_BASE); tcb_addr += tid * TCB_SIZE; if (is_t4(sc)) { pf = 0; win_pos = tcb_addr & ~0xf; /* start must be 16B aligned */ } else { pf = V_PFNUM(sc->pf); win_pos = tcb_addr & ~0x7f; /* start must be 128B aligned */ } t4_write_reg(sc, reg, win_pos | pf); t4_read_reg(sc, reg); off = tcb_addr - win_pos; for (i = 0; i < 4; i++) { uint32_t buf[8]; for (j = 0; j < 8; j++, off += 4) buf[j] = htonl(t4_read_reg(sc, base + off)); db_printf("%08x %08x %08x %08x %08x %08x %08x %08x\n", buf[0], buf[1], buf[2], buf[3], buf[4], buf[5], buf[6], buf[7]); } t4_write_reg(sc, reg, save); t4_read_reg(sc, reg); } static void t4_dump_devlog(struct adapter *sc) { struct devlog_params *dparams = &sc->params.devlog; struct fw_devlog_e e; int i, first, j, m, nentries, rc; uint64_t ftstamp = UINT64_MAX; if (dparams->start == 0) { db_printf("devlog params not valid\n"); return; } nentries = dparams->size / sizeof(struct fw_devlog_e); m = fwmtype_to_hwmtype(dparams->memtype); /* Find the first entry. 
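 * The firmware devlog is a fixed-size ring with no explicit head
 * pointer, so this pass scans for the oldest entry (the one with the
 * smallest timestamp); the printing loop below then starts there and
 * wraps modulo nentries until it returns to that entry or hits an
 * unused, all-zero slot.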
*/ first = -1; for (i = 0; i < nentries && !db_pager_quit; i++) { rc = -t4_mem_read(sc, m, dparams->start + i * sizeof(e), sizeof(e), (void *)&e); if (rc != 0) break; if (e.timestamp == 0) break; e.timestamp = be64toh(e.timestamp); if (e.timestamp < ftstamp) { ftstamp = e.timestamp; first = i; } } if (first == -1) return; i = first; do { rc = -t4_mem_read(sc, m, dparams->start + i * sizeof(e), sizeof(e), (void *)&e); if (rc != 0) return; if (e.timestamp == 0) return; e.timestamp = be64toh(e.timestamp); e.seqno = be32toh(e.seqno); for (j = 0; j < 8; j++) e.params[j] = be32toh(e.params[j]); db_printf("%10d %15ju %8s %8s ", e.seqno, e.timestamp, (e.level < nitems(devlog_level_strings) ? devlog_level_strings[e.level] : "UNKNOWN"), (e.facility < nitems(devlog_facility_strings) ? devlog_facility_strings[e.facility] : "UNKNOWN")); db_printf(e.fmt, e.params[0], e.params[1], e.params[2], e.params[3], e.params[4], e.params[5], e.params[6], e.params[7]); if (++i == nentries) i = 0; } while (i != first && !db_pager_quit); } static struct command_table db_t4_table = LIST_HEAD_INITIALIZER(db_t4_table); _DB_SET(_show, t4, NULL, db_show_table, 0, &db_t4_table); DB_FUNC(devlog, db_show_devlog, db_t4_table, CS_OWN, NULL) { device_t dev; int t; bool valid; valid = false; t = db_read_token(); if (t == tIDENT) { dev = device_lookup_by_name(db_tok_string); valid = true; } db_skip_to_eol(); if (!valid) { db_printf("usage: show t4 devlog <nexus>\n"); return; } if (dev == NULL) { db_printf("device not found\n"); return; } t4_dump_devlog(device_get_softc(dev)); } DB_FUNC(tcb, db_show_t4tcb, db_t4_table, CS_OWN, NULL) { device_t dev; int radix, tid, t; bool valid; valid = false; radix = db_radix; db_radix = 10; t = db_read_token(); if (t == tIDENT) { dev = device_lookup_by_name(db_tok_string); t = db_read_token(); if (t == tNUMBER) { tid = db_tok_number; valid = true; } } db_radix = radix; db_skip_to_eol(); if (!valid) { db_printf("usage: show t4 tcb <nexus> <tid>\n"); return; } if (dev == NULL) { db_printf("device not found\n"); return; } if (tid < 0) { db_printf("invalid tid\n"); return; } t4_dump_tcb(device_get_softc(dev), tid); } #endif static struct sx mlu; /* mod load unload */ SX_SYSINIT(cxgbe_mlu, &mlu, "cxgbe mod load/unload"); static int mod_event(module_t mod, int cmd, void *arg) { int rc = 0; static int loaded = 0; switch (cmd) { case MOD_LOAD: sx_xlock(&mlu); if (loaded++ == 0) { t4_sge_modload(); t4_register_cpl_handler(CPL_SET_TCB_RPL, set_tcb_rpl); t4_register_cpl_handler(CPL_L2T_WRITE_RPL, l2t_write_rpl); t4_register_cpl_handler(CPL_TRACE_PKT, t4_trace_pkt); t4_register_cpl_handler(CPL_T5_TRACE_PKT, t5_trace_pkt); sx_init(&t4_list_lock, "T4/T5 adapters"); SLIST_INIT(&t4_list); #ifdef TCP_OFFLOAD sx_init(&t4_uld_list_lock, "T4/T5 ULDs"); SLIST_INIT(&t4_uld_list); #endif t4_tracer_modload(); tweak_tunables(); } sx_xunlock(&mlu); break; case MOD_UNLOAD: sx_xlock(&mlu); if (--loaded == 0) { int tries; sx_slock(&t4_list_lock); if (!SLIST_EMPTY(&t4_list)) { rc = EBUSY; sx_sunlock(&t4_list_lock); goto done_unload; } #ifdef TCP_OFFLOAD sx_slock(&t4_uld_list_lock); if (!SLIST_EMPTY(&t4_uld_list)) { rc = EBUSY; sx_sunlock(&t4_uld_list_lock); sx_sunlock(&t4_list_lock); goto done_unload; } #endif tries = 0; while (tries++ < 5 && t4_sge_extfree_refs() != 0) { uprintf("%ju clusters with custom free routine " "still in use.\n", t4_sge_extfree_refs()); pause("t4unload", 2 * hz); } #ifdef TCP_OFFLOAD sx_sunlock(&t4_uld_list_lock); #endif sx_sunlock(&t4_list_lock); if (t4_sge_extfree_refs() == 0) { t4_tracer_modunload(); #ifdef
TCP_OFFLOAD sx_destroy(&t4_uld_list_lock); #endif sx_destroy(&t4_list_lock); t4_sge_modunload(); loaded = 0; } else { rc = EBUSY; loaded++; /* undo earlier decrement */ } } done_unload: sx_xunlock(&mlu); break; } return (rc); } static devclass_t t4_devclass, t5_devclass; static devclass_t cxgbe_devclass, cxl_devclass; static devclass_t vcxgbe_devclass, vcxl_devclass; DRIVER_MODULE(t4nex, pci, t4_driver, t4_devclass, mod_event, 0); MODULE_VERSION(t4nex, 1); MODULE_DEPEND(t4nex, firmware, 1, 1, 1); #ifdef DEV_NETMAP MODULE_DEPEND(t4nex, netmap, 1, 1, 1); #endif /* DEV_NETMAP */ DRIVER_MODULE(t5nex, pci, t5_driver, t5_devclass, mod_event, 0); MODULE_VERSION(t5nex, 1); MODULE_DEPEND(t5nex, firmware, 1, 1, 1); #ifdef DEV_NETMAP MODULE_DEPEND(t5nex, netmap, 1, 1, 1); #endif /* DEV_NETMAP */ DRIVER_MODULE(cxgbe, t4nex, cxgbe_driver, cxgbe_devclass, 0, 0); MODULE_VERSION(cxgbe, 1); DRIVER_MODULE(cxl, t5nex, cxl_driver, cxl_devclass, 0, 0); MODULE_VERSION(cxl, 1); DRIVER_MODULE(vcxgbe, cxgbe, vcxgbe_driver, vcxgbe_devclass, 0, 0); MODULE_VERSION(vcxgbe, 1); DRIVER_MODULE(vcxl, cxl, vcxl_driver, vcxl_devclass, 0, 0); MODULE_VERSION(vcxl, 1); Index: projects/clang390-import/sys/dev/cxgbe/t4_sge.c =================================================================== --- projects/clang390-import/sys/dev/cxgbe/t4_sge.c (revision 305686) +++ projects/clang390-import/sys/dev/cxgbe/t4_sge.c (revision 305687) @@ -1,5211 +1,5210 @@ /*- * Copyright (c) 2011 Chelsio Communications, Inc. * All rights reserved. * Written by: Navdeep Parhar * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEV_NETMAP #include #include #include #include #include #endif #include "common/common.h" #include "common/t4_regs.h" #include "common/t4_regs_values.h" #include "common/t4_msg.h" #include "t4_l2t.h" #include "t4_mp_ring.h" #ifdef T4_PKT_TIMESTAMP #define RX_COPY_THRESHOLD (MINCLSIZE - 8) #else #define RX_COPY_THRESHOLD MINCLSIZE #endif /* * Ethernet frames are DMA'd at this byte offset into the freelist buffer. 
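 * A shift of 2 is typical: it offsets the 14-byte Ethernet header so
 * that the IP header that follows it lands on a 4-byte boundary (the
 * classic ETHER_ALIGN trick).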
* 0-7 are valid values. */ static int fl_pktshift = 2; TUNABLE_INT("hw.cxgbe.fl_pktshift", &fl_pktshift); /* * Pad ethernet payload up to this boundary. * -1: driver should figure out a good value. * 0: disable padding. * Any power of 2 from 32 to 4096 (both inclusive) is also a valid value. */ int fl_pad = -1; TUNABLE_INT("hw.cxgbe.fl_pad", &fl_pad); /* * Status page length. * -1: driver should figure out a good value. * 64 or 128 are the only other valid values. */ static int spg_len = -1; TUNABLE_INT("hw.cxgbe.spg_len", &spg_len); /* * Congestion drops. * -1: no congestion feedback (not recommended). * 0: backpressure the channel instead of dropping packets right away. * 1: no backpressure, drop packets for the congested queue immediately. */ static int cong_drop = 0; TUNABLE_INT("hw.cxgbe.cong_drop", &cong_drop); /* * Deliver multiple frames in the same free list buffer if they fit. * -1: let the driver decide whether to enable buffer packing or not. * 0: disable buffer packing. * 1: enable buffer packing. */ static int buffer_packing = -1; TUNABLE_INT("hw.cxgbe.buffer_packing", &buffer_packing); /* * Start next frame in a packed buffer at this boundary. * -1: driver should figure out a good value. * T4: driver will ignore this and use the same value as fl_pad above. * T5: 16, or a power of 2 from 64 to 4096 (both inclusive) is a valid value. */ static int fl_pack = -1; TUNABLE_INT("hw.cxgbe.fl_pack", &fl_pack); /* * Allow the driver to create mbuf(s) in a cluster allocated for rx. * 0: never; always allocate mbufs from the zone_mbuf UMA zone. * 1: ok to create mbuf(s) within a cluster if there is room. */ static int allow_mbufs_in_cluster = 1; TUNABLE_INT("hw.cxgbe.allow_mbufs_in_cluster", &allow_mbufs_in_cluster); /* * Largest rx cluster size that the driver is allowed to allocate. */ static int largest_rx_cluster = MJUM16BYTES; TUNABLE_INT("hw.cxgbe.largest_rx_cluster", &largest_rx_cluster); /* * Size of cluster allocation that's most likely to succeed. The driver will * fall back to this size if it fails to allocate clusters larger than this. */ static int safest_rx_cluster = PAGE_SIZE; TUNABLE_INT("hw.cxgbe.safest_rx_cluster", &safest_rx_cluster); struct txpkts { u_int wr_type; /* type 0 or type 1 */ u_int npkt; /* # of packets in this work request */ u_int plen; /* total payload (sum of all packets) */ u_int len16; /* # of 16B pieces used by this work request */ }; /* A packet's SGL. 
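 * (scatter/gather list: the sglist header plus up to TX_SGL_SEGS DMA
 * segments describing where one outbound frame's bytes live.)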
This + m_pkthdr has all info needed for tx */ struct sgl { struct sglist sg; struct sglist_seg seg[TX_SGL_SEGS]; }; static int service_iq(struct sge_iq *, int); static struct mbuf *get_fl_payload(struct adapter *, struct sge_fl *, uint32_t); static int t4_eth_rx(struct sge_iq *, const struct rss_header *, struct mbuf *); static inline void init_iq(struct sge_iq *, struct adapter *, int, int, int); static inline void init_fl(struct adapter *, struct sge_fl *, int, int, char *); static inline void init_eq(struct adapter *, struct sge_eq *, int, int, uint8_t, uint16_t, char *); static int alloc_ring(struct adapter *, size_t, bus_dma_tag_t *, bus_dmamap_t *, bus_addr_t *, void **); static int free_ring(struct adapter *, bus_dma_tag_t, bus_dmamap_t, bus_addr_t, void *); static int alloc_iq_fl(struct vi_info *, struct sge_iq *, struct sge_fl *, int, int); static int free_iq_fl(struct vi_info *, struct sge_iq *, struct sge_fl *); static void add_fl_sysctls(struct sysctl_ctx_list *, struct sysctl_oid *, struct sge_fl *); static int alloc_fwq(struct adapter *); static int free_fwq(struct adapter *); static int alloc_mgmtq(struct adapter *); static int free_mgmtq(struct adapter *); static int alloc_rxq(struct vi_info *, struct sge_rxq *, int, int, struct sysctl_oid *); static int free_rxq(struct vi_info *, struct sge_rxq *); #ifdef TCP_OFFLOAD static int alloc_ofld_rxq(struct vi_info *, struct sge_ofld_rxq *, int, int, struct sysctl_oid *); static int free_ofld_rxq(struct vi_info *, struct sge_ofld_rxq *); #endif #ifdef DEV_NETMAP static int alloc_nm_rxq(struct vi_info *, struct sge_nm_rxq *, int, int, struct sysctl_oid *); static int free_nm_rxq(struct vi_info *, struct sge_nm_rxq *); static int alloc_nm_txq(struct vi_info *, struct sge_nm_txq *, int, int, struct sysctl_oid *); static int free_nm_txq(struct vi_info *, struct sge_nm_txq *); #endif static int ctrl_eq_alloc(struct adapter *, struct sge_eq *); static int eth_eq_alloc(struct adapter *, struct vi_info *, struct sge_eq *); #ifdef TCP_OFFLOAD static int ofld_eq_alloc(struct adapter *, struct vi_info *, struct sge_eq *); #endif static int alloc_eq(struct adapter *, struct vi_info *, struct sge_eq *); static int free_eq(struct adapter *, struct sge_eq *); static int alloc_wrq(struct adapter *, struct vi_info *, struct sge_wrq *, struct sysctl_oid *); static int free_wrq(struct adapter *, struct sge_wrq *); static int alloc_txq(struct vi_info *, struct sge_txq *, int, struct sysctl_oid *); static int free_txq(struct vi_info *, struct sge_txq *); static void oneseg_dma_callback(void *, bus_dma_segment_t *, int, int); static inline void ring_fl_db(struct adapter *, struct sge_fl *); static int refill_fl(struct adapter *, struct sge_fl *, int); static void refill_sfl(void *); static int alloc_fl_sdesc(struct sge_fl *); static void free_fl_sdesc(struct adapter *, struct sge_fl *); static void find_best_refill_source(struct adapter *, struct sge_fl *, int); static void find_safe_refill_source(struct adapter *, struct sge_fl *); static void add_fl_to_sfl(struct adapter *, struct sge_fl *); static inline void get_pkt_gl(struct mbuf *, struct sglist *); static inline u_int txpkt_len16(u_int, u_int); static inline u_int txpkt_vm_len16(u_int, u_int); static inline u_int txpkts0_len16(u_int); static inline u_int txpkts1_len16(void); static u_int write_txpkt_wr(struct sge_txq *, struct fw_eth_tx_pkt_wr *, struct mbuf *, u_int); static u_int write_txpkt_vm_wr(struct sge_txq *, struct fw_eth_tx_pkt_vm_wr *, struct mbuf *, u_int); static int 
try_txpkts(struct mbuf *, struct mbuf *, struct txpkts *, u_int); static int add_to_txpkts(struct mbuf *, struct txpkts *, u_int); static u_int write_txpkts_wr(struct sge_txq *, struct fw_eth_tx_pkts_wr *, struct mbuf *, const struct txpkts *, u_int); static void write_gl_to_txd(struct sge_txq *, struct mbuf *, caddr_t *, int); static inline void copy_to_txd(struct sge_eq *, caddr_t, caddr_t *, int); static inline void ring_eq_db(struct adapter *, struct sge_eq *, u_int); static inline uint16_t read_hw_cidx(struct sge_eq *); static inline u_int reclaimable_tx_desc(struct sge_eq *); static inline u_int total_available_tx_desc(struct sge_eq *); static u_int reclaim_tx_descs(struct sge_txq *, u_int); static void tx_reclaim(void *, int); static __be64 get_flit(struct sglist_seg *, int, int); static int handle_sge_egr_update(struct sge_iq *, const struct rss_header *, struct mbuf *); static int handle_fw_msg(struct sge_iq *, const struct rss_header *, struct mbuf *); static int t4_handle_wrerr_rpl(struct adapter *, const __be64 *); static void wrq_tx_drain(void *, int); static void drain_wrq_wr_list(struct adapter *, struct sge_wrq *); static int sysctl_uint16(SYSCTL_HANDLER_ARGS); static int sysctl_bufsizes(SYSCTL_HANDLER_ARGS); static int sysctl_tc(SYSCTL_HANDLER_ARGS); static counter_u64_t extfree_refs; static counter_u64_t extfree_rels; an_handler_t t4_an_handler; fw_msg_handler_t t4_fw_msg_handler[NUM_FW6_TYPES]; cpl_handler_t t4_cpl_handler[NUM_CPL_CMDS]; static int an_not_handled(struct sge_iq *iq, const struct rsp_ctrl *ctrl) { #ifdef INVARIANTS panic("%s: async notification on iq %p (ctrl %p)", __func__, iq, ctrl); #else log(LOG_ERR, "%s: async notification on iq %p (ctrl %p)\n", __func__, iq, ctrl); #endif return (EDOOFUS); } int t4_register_an_handler(an_handler_t h) { uintptr_t *loc, new; new = h ? (uintptr_t)h : (uintptr_t)an_not_handled; loc = (uintptr_t *) &t4_an_handler; atomic_store_rel_ptr(loc, new); return (0); } static int fw_msg_not_handled(struct adapter *sc, const __be64 *rpl) { const struct cpl_fw6_msg *cpl = __containerof(rpl, struct cpl_fw6_msg, data[0]); #ifdef INVARIANTS panic("%s: fw_msg type %d", __func__, cpl->type); #else log(LOG_ERR, "%s: fw_msg type %d\n", __func__, cpl->type); #endif return (EDOOFUS); } int t4_register_fw_msg_handler(int type, fw_msg_handler_t h) { uintptr_t *loc, new; if (type >= nitems(t4_fw_msg_handler)) return (EINVAL); /* * These are dispatched by the handler for FW{4|6}_CPL_MSG using the CPL * handler dispatch table. Reject any attempt to install a handler for * this subtype. */ if (type == FW_TYPE_RSSCPL || type == FW6_TYPE_RSSCPL) return (EINVAL); new = h ? (uintptr_t)h : (uintptr_t)fw_msg_not_handled; loc = (uintptr_t *) &t4_fw_msg_handler[type]; atomic_store_rel_ptr(loc, new); return (0); } static int cpl_not_handled(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { #ifdef INVARIANTS panic("%s: opcode 0x%02x on iq %p with payload %p", __func__, rss->opcode, iq, m); #else log(LOG_ERR, "%s: opcode 0x%02x on iq %p with payload %p\n", __func__, rss->opcode, iq, m); m_freem(m); #endif return (EDOOFUS); } int t4_register_cpl_handler(int opcode, cpl_handler_t h) { uintptr_t *loc, new; if (opcode >= nitems(t4_cpl_handler)) return (EINVAL); new = h ? (uintptr_t)h : (uintptr_t)cpl_not_handled; loc = (uintptr_t *) &t4_cpl_handler[opcode]; atomic_store_rel_ptr(loc, new); return (0); } /* * Called on MOD_LOAD. Validates and calculates the SGE tunables. 
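 * It also seeds the CPL and firmware-message dispatch tables with the
 * default "not handled" handlers.  As a sketch of the registration
 * interface (my_set_tcb_rpl is a hypothetical handler):
 *
 *	t4_register_cpl_handler(CPL_SET_TCB_RPL, my_set_tcb_rpl);
 *
 * claims an opcode, and passing NULL as the handler restores the
 * default.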
*/ void t4_sge_modload(void) { int i; if (fl_pktshift < 0 || fl_pktshift > 7) { printf("Invalid hw.cxgbe.fl_pktshift value (%d)," " using 2 instead.\n", fl_pktshift); fl_pktshift = 2; } if (spg_len != 64 && spg_len != 128) { int len; #if defined(__i386__) || defined(__amd64__) len = cpu_clflush_line_size > 64 ? 128 : 64; #else len = 64; #endif if (spg_len != -1) { printf("Invalid hw.cxgbe.spg_len value (%d)," " using %d instead.\n", spg_len, len); } spg_len = len; } if (cong_drop < -1 || cong_drop > 1) { printf("Invalid hw.cxgbe.cong_drop value (%d)," " using 0 instead.\n", cong_drop); cong_drop = 0; } extfree_refs = counter_u64_alloc(M_WAITOK); extfree_rels = counter_u64_alloc(M_WAITOK); counter_u64_zero(extfree_refs); counter_u64_zero(extfree_rels); t4_an_handler = an_not_handled; for (i = 0; i < nitems(t4_fw_msg_handler); i++) t4_fw_msg_handler[i] = fw_msg_not_handled; for (i = 0; i < nitems(t4_cpl_handler); i++) t4_cpl_handler[i] = cpl_not_handled; t4_register_cpl_handler(CPL_FW4_MSG, handle_fw_msg); t4_register_cpl_handler(CPL_FW6_MSG, handle_fw_msg); t4_register_cpl_handler(CPL_SGE_EGR_UPDATE, handle_sge_egr_update); t4_register_cpl_handler(CPL_RX_PKT, t4_eth_rx); t4_register_fw_msg_handler(FW6_TYPE_CMD_RPL, t4_handle_fw_rpl); t4_register_fw_msg_handler(FW6_TYPE_WRERR_RPL, t4_handle_wrerr_rpl); } void t4_sge_modunload(void) { counter_u64_free(extfree_refs); counter_u64_free(extfree_rels); } uint64_t t4_sge_extfree_refs(void) { uint64_t refs, rels; rels = counter_u64_fetch(extfree_rels); refs = counter_u64_fetch(extfree_refs); return (refs - rels); } static inline void setup_pad_and_pack_boundaries(struct adapter *sc) { uint32_t v, m; int pad, pack; pad = fl_pad; if (fl_pad < 32 || fl_pad > 4096 || !powerof2(fl_pad)) { /* * If there is any chance that we might use buffer packing and * the chip is a T4, then pick 64 as the pad/pack boundary. Set * it to 32 in all other cases. */ pad = is_t4(sc) && buffer_packing ? 64 : 32; /* * For fl_pad = 0 we'll still write a reasonable value to the * register but all the freelists will opt out of padding. * We'll complain here only if the user tried to set it to a * value greater than 0 that was invalid. */ if (fl_pad > 0) { device_printf(sc->dev, "Invalid hw.cxgbe.fl_pad value" " (%d), using %d instead.\n", fl_pad, pad); } } m = V_INGPADBOUNDARY(M_INGPADBOUNDARY); v = V_INGPADBOUNDARY(ilog2(pad) - 5); t4_set_reg_field(sc, A_SGE_CONTROL, m, v); if (is_t4(sc)) { if (fl_pack != -1 && fl_pack != pad) { /* Complain but carry on. */ device_printf(sc->dev, "hw.cxgbe.fl_pack (%d) ignored," " using %d instead.\n", fl_pack, pad); } return; } pack = fl_pack; if (fl_pack < 16 || fl_pack == 32 || fl_pack > 4096 || !powerof2(fl_pack)) { pack = max(sc->params.pci.mps, CACHE_LINE_SIZE); MPASS(powerof2(pack)); if (pack < 16) pack = 16; if (pack == 32) pack = 64; if (pack > 4096) pack = 4096; if (fl_pack != -1) { device_printf(sc->dev, "Invalid hw.cxgbe.fl_pack value" " (%d), using %d instead.\n", fl_pack, pack); } } m = V_INGPACKBOUNDARY(M_INGPACKBOUNDARY); if (pack == 16) v = V_INGPACKBOUNDARY(0); else v = V_INGPACKBOUNDARY(ilog2(pack) - 5); MPASS(!is_t4(sc)); /* T4 doesn't have SGE_CONTROL2 */ t4_set_reg_field(sc, A_SGE_CONTROL2, m, v); } /* * adap->params.vpd.cclk must be set up before this is called. 
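 * A worked example, assuming cclk is kept in kHz as the
 * us_to_core_ticks() conversions below imply: with cclk = 200000, a
 * 100us holdoff timer is programmed as 100 * 200000 / 1000 = 20000
 * core-clock ticks, and timer_max is the largest microsecond value
 * whose tick count still fits in the TIMERVALUE0 field.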
*/ void t4_tweak_chip_settings(struct adapter *sc) { int i; uint32_t v, m; int intr_timer[SGE_NTIMERS] = {1, 5, 10, 50, 100, 200}; int timer_max = M_TIMERVALUE0 * 1000 / sc->params.vpd.cclk; int intr_pktcount[SGE_NCOUNTERS] = {1, 8, 16, 32}; /* 63 max */ uint16_t indsz = min(RX_COPY_THRESHOLD - 1, M_INDICATESIZE); static int sge_flbuf_sizes[] = { MCLBYTES, #if MJUMPAGESIZE != MCLBYTES MJUMPAGESIZE, MJUMPAGESIZE - CL_METADATA_SIZE, MJUMPAGESIZE - 2 * MSIZE - CL_METADATA_SIZE, #endif MJUM9BYTES, MJUM16BYTES, MCLBYTES - MSIZE - CL_METADATA_SIZE, MJUM9BYTES - CL_METADATA_SIZE, MJUM16BYTES - CL_METADATA_SIZE, }; KASSERT(sc->flags & MASTER_PF, ("%s: trying to change chip settings when not master.", __func__)); m = V_PKTSHIFT(M_PKTSHIFT) | F_RXPKTCPLMODE | F_EGRSTATUSPAGESIZE; v = V_PKTSHIFT(fl_pktshift) | F_RXPKTCPLMODE | V_EGRSTATUSPAGESIZE(spg_len == 128); t4_set_reg_field(sc, A_SGE_CONTROL, m, v); setup_pad_and_pack_boundaries(sc); v = V_HOSTPAGESIZEPF0(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF1(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF2(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF3(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF4(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF5(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF6(PAGE_SHIFT - 10) | V_HOSTPAGESIZEPF7(PAGE_SHIFT - 10); t4_write_reg(sc, A_SGE_HOST_PAGE_SIZE, v); KASSERT(nitems(sge_flbuf_sizes) <= SGE_FLBUF_SIZES, ("%s: hw buffer size table too big", __func__)); for (i = 0; i < min(nitems(sge_flbuf_sizes), SGE_FLBUF_SIZES); i++) { t4_write_reg(sc, A_SGE_FL_BUFFER_SIZE0 + (4 * i), sge_flbuf_sizes[i]); } v = V_THRESHOLD_0(intr_pktcount[0]) | V_THRESHOLD_1(intr_pktcount[1]) | V_THRESHOLD_2(intr_pktcount[2]) | V_THRESHOLD_3(intr_pktcount[3]); t4_write_reg(sc, A_SGE_INGRESS_RX_THRESHOLD, v); KASSERT(intr_timer[0] <= timer_max, ("%s: not a single usable timer (%d, %d)", __func__, intr_timer[0], timer_max)); for (i = 1; i < nitems(intr_timer); i++) { KASSERT(intr_timer[i] >= intr_timer[i - 1], ("%s: timers not listed in increasing order (%d)", __func__, i)); while (intr_timer[i] > timer_max) { if (i == nitems(intr_timer) - 1) { intr_timer[i] = timer_max; break; } intr_timer[i] += intr_timer[i - 1]; intr_timer[i] /= 2; } } v = V_TIMERVALUE0(us_to_core_ticks(sc, intr_timer[0])) | V_TIMERVALUE1(us_to_core_ticks(sc, intr_timer[1])); t4_write_reg(sc, A_SGE_TIMER_VALUE_0_AND_1, v); v = V_TIMERVALUE2(us_to_core_ticks(sc, intr_timer[2])) | V_TIMERVALUE3(us_to_core_ticks(sc, intr_timer[3])); t4_write_reg(sc, A_SGE_TIMER_VALUE_2_AND_3, v); v = V_TIMERVALUE4(us_to_core_ticks(sc, intr_timer[4])) | V_TIMERVALUE5(us_to_core_ticks(sc, intr_timer[5])); t4_write_reg(sc, A_SGE_TIMER_VALUE_4_AND_5, v); /* 4K, 16K, 64K, 256K DDP "page sizes" for TDDP */ v = V_HPZ0(0) | V_HPZ1(2) | V_HPZ2(4) | V_HPZ3(6); t4_write_reg(sc, A_ULP_RX_TDDP_PSZ, v); /* * 4K, 8K, 16K, 64K DDP "page sizes" for iSCSI DDP. These have been * chosen with MAXPHYS = 128K in mind. The largest DDP buffer that we * may have to deal with is MAXPHYS + 1 page. */ v = V_HPZ0(0) | V_HPZ1(1) | V_HPZ2(2) | V_HPZ3(4); t4_write_reg(sc, A_ULP_RX_ISCSI_PSZ, v); /* We use multiple DDP page sizes both in plain-TOE and ISCSI modes. */ m = v = F_TDDPTAGTCB | F_ISCSITAGTCB; t4_set_reg_field(sc, A_ULP_RX_CTL, m, v); m = V_INDICATESIZE(M_INDICATESIZE) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; v = V_INDICATESIZE(indsz) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; t4_set_reg_field(sc, A_TP_PARA_REG5, m, v); } /* * SGE wants the buffer to be at least 64B and then a multiple of 16. 
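 * e.g. 2048 qualifies, 48 fails the 64B minimum, and 72 is not a
 * multiple of 16.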
If * padding is in use, the buffer's start and end need to be aligned to the pad * boundary as well. We'll just make sure that the size is a multiple of the * boundary here, it is up to the buffer allocation code to make sure the start * of the buffer is aligned as well. */ static inline int hwsz_ok(struct adapter *sc, int hwsz) { int mask = fl_pad ? sc->params.sge.pad_boundary - 1 : 16 - 1; return (hwsz >= 64 && (hwsz & mask) == 0); } /* * XXX: driver really should be able to deal with unexpected settings. */ int t4_read_chip_settings(struct adapter *sc) { struct sge *s = &sc->sge; struct sge_params *sp = &sc->params.sge; int i, j, n, rc = 0; uint32_t m, v, r; uint16_t indsz = min(RX_COPY_THRESHOLD - 1, M_INDICATESIZE); static int sw_buf_sizes[] = { /* Sorted by size */ MCLBYTES, #if MJUMPAGESIZE != MCLBYTES MJUMPAGESIZE, #endif MJUM9BYTES, MJUM16BYTES }; struct sw_zone_info *swz, *safe_swz; struct hw_buf_info *hwb; m = F_RXPKTCPLMODE; v = F_RXPKTCPLMODE; r = sc->params.sge.sge_control; if ((r & m) != v) { device_printf(sc->dev, "invalid SGE_CONTROL(0x%x)\n", r); rc = EINVAL; } /* * If this changes then every single use of PAGE_SHIFT in the driver * needs to be carefully reviewed for PAGE_SHIFT vs sp->page_shift. */ if (sp->page_shift != PAGE_SHIFT) { device_printf(sc->dev, "invalid SGE_HOST_PAGE_SIZE(0x%x)\n", r); rc = EINVAL; } /* Filter out unusable hw buffer sizes entirely (mark with -2). */ hwb = &s->hw_buf_info[0]; for (i = 0; i < nitems(s->hw_buf_info); i++, hwb++) { r = sc->params.sge.sge_fl_buffer_size[i]; hwb->size = r; hwb->zidx = hwsz_ok(sc, r) ? -1 : -2; hwb->next = -1; } /* * Create a sorted list in decreasing order of hw buffer sizes (and so * increasing order of spare area) for each software zone. * * If padding is enabled then the start and end of the buffer must align * to the pad boundary; if packing is enabled then they must align with * the pack boundary as well. Allocations from the cluster zones are * aligned to min(size, 4K), so the buffer starts at that alignment and * ends at hwb->size alignment. If mbuf inlining is allowed the * starting alignment will be reduced to MSIZE and the driver will * exercise appropriate caution when deciding on the best buffer layout * to use. 
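 * For example, the MCLBYTES - MSIZE - CL_METADATA_SIZE hw buffer inside
 * a plain 2K cluster leaves MSIZE bytes at the front for one inline
 * mbuf and CL_METADATA_SIZE bytes at the tail for the refcount
 * metadata.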
*/ n = 0; /* no usable buffer size to begin with */ swz = &s->sw_zone_info[0]; safe_swz = NULL; for (i = 0; i < SW_ZONE_SIZES; i++, swz++) { int8_t head = -1, tail = -1; swz->size = sw_buf_sizes[i]; swz->zone = m_getzone(swz->size); swz->type = m_gettype(swz->size); if (swz->size < PAGE_SIZE) { MPASS(powerof2(swz->size)); if (fl_pad && (swz->size % sp->pad_boundary != 0)) continue; } if (swz->size == safest_rx_cluster) safe_swz = swz; hwb = &s->hw_buf_info[0]; for (j = 0; j < SGE_FLBUF_SIZES; j++, hwb++) { if (hwb->zidx != -1 || hwb->size > swz->size) continue; #ifdef INVARIANTS if (fl_pad) MPASS(hwb->size % sp->pad_boundary == 0); #endif hwb->zidx = i; if (head == -1) head = tail = j; else if (hwb->size < s->hw_buf_info[tail].size) { s->hw_buf_info[tail].next = j; tail = j; } else { int8_t *cur; struct hw_buf_info *t; for (cur = &head; *cur != -1; cur = &t->next) { t = &s->hw_buf_info[*cur]; if (hwb->size == t->size) { hwb->zidx = -2; break; } if (hwb->size > t->size) { hwb->next = *cur; *cur = j; break; } } } } swz->head_hwidx = head; swz->tail_hwidx = tail; if (tail != -1) { n++; if (swz->size - s->hw_buf_info[tail].size >= CL_METADATA_SIZE) sc->flags |= BUF_PACKING_OK; } } if (n == 0) { device_printf(sc->dev, "no usable SGE FL buffer size.\n"); rc = EINVAL; } s->safe_hwidx1 = -1; s->safe_hwidx2 = -1; if (safe_swz != NULL) { s->safe_hwidx1 = safe_swz->head_hwidx; for (i = safe_swz->head_hwidx; i != -1; i = hwb->next) { int spare; hwb = &s->hw_buf_info[i]; #ifdef INVARIANTS if (fl_pad) MPASS(hwb->size % sp->pad_boundary == 0); #endif spare = safe_swz->size - hwb->size; if (spare >= CL_METADATA_SIZE) { s->safe_hwidx2 = i; break; } } } if (sc->flags & IS_VF) return (0); v = V_HPZ0(0) | V_HPZ1(2) | V_HPZ2(4) | V_HPZ3(6); r = t4_read_reg(sc, A_ULP_RX_TDDP_PSZ); if (r != v) { device_printf(sc->dev, "invalid ULP_RX_TDDP_PSZ(0x%x)\n", r); rc = EINVAL; } m = v = F_TDDPTAGTCB; r = t4_read_reg(sc, A_ULP_RX_CTL); if ((r & m) != v) { device_printf(sc->dev, "invalid ULP_RX_CTL(0x%x)\n", r); rc = EINVAL; } m = V_INDICATESIZE(M_INDICATESIZE) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; v = V_INDICATESIZE(indsz) | F_REARMDDPOFFSET | F_RESETDDPOFFSET; r = t4_read_reg(sc, A_TP_PARA_REG5); if ((r & m) != v) { device_printf(sc->dev, "invalid TP_PARA_REG5(0x%x)\n", r); rc = EINVAL; } t4_init_tp_params(sc); t4_read_mtu_tbl(sc, sc->params.mtus, NULL); t4_load_mtus(sc, sc->params.mtus, sc->params.a_wnd, sc->params.b_wnd); return (rc); } int t4_create_dma_tag(struct adapter *sc) { int rc; rc = bus_dma_tag_create(bus_get_dma_tag(sc->dev), 1, 0, BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL, BUS_SPACE_MAXSIZE, BUS_SPACE_UNRESTRICTED, BUS_SPACE_MAXSIZE, BUS_DMA_ALLOCNOW, NULL, NULL, &sc->dmat); if (rc != 0) { device_printf(sc->dev, "failed to create main DMA tag: %d\n", rc); } return (rc); } void t4_sge_sysctls(struct adapter *sc, struct sysctl_ctx_list *ctx, struct sysctl_oid_list *children) { struct sge_params *sp = &sc->params.sge; SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "buffer_sizes", CTLTYPE_STRING | CTLFLAG_RD, &sc->sge, 0, sysctl_bufsizes, "A", "freelist buffer sizes"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pktshift", CTLFLAG_RD, NULL, sp->fl_pktshift, "payload DMA offset in rx buffer (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pad", CTLFLAG_RD, NULL, sp->pad_boundary, "payload pad boundary (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "spg_len", CTLFLAG_RD, NULL, sp->spg_len, "status page size (bytes)"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "cong_drop", CTLFLAG_RD, NULL, cong_drop, 
"congestion drop setting"); SYSCTL_ADD_INT(ctx, children, OID_AUTO, "fl_pack", CTLFLAG_RD, NULL, sp->pack_boundary, "payload pack boundary (bytes)"); } int t4_destroy_dma_tag(struct adapter *sc) { if (sc->dmat) bus_dma_tag_destroy(sc->dmat); return (0); } /* * Allocate and initialize the firmware event queue and the management queue. * * Returns errno on failure. Resources allocated up to that point may still be * allocated. Caller is responsible for cleanup in case this function fails. */ int t4_setup_adapter_queues(struct adapter *sc) { int rc; ADAPTER_LOCK_ASSERT_NOTOWNED(sc); sysctl_ctx_init(&sc->ctx); sc->flags |= ADAP_SYSCTL_CTX; /* * Firmware event queue */ rc = alloc_fwq(sc); if (rc != 0) return (rc); /* * Management queue. This is just a control queue that uses the fwq as * its associated iq. */ if (!(sc->flags & IS_VF)) rc = alloc_mgmtq(sc); return (rc); } /* * Idempotent */ int t4_teardown_adapter_queues(struct adapter *sc) { ADAPTER_LOCK_ASSERT_NOTOWNED(sc); /* Do this before freeing the queue */ if (sc->flags & ADAP_SYSCTL_CTX) { sysctl_ctx_free(&sc->ctx); sc->flags &= ~ADAP_SYSCTL_CTX; } free_mgmtq(sc); free_fwq(sc); return (0); } static inline int first_vector(struct vi_info *vi) { struct adapter *sc = vi->pi->adapter; if (sc->intr_count == 1) return (0); return (vi->first_intr); } /* * Given an arbitrary "index," come up with an iq that can be used by other * queues (of this VI) for interrupt forwarding, SGE egress updates, etc. * The iq returned is guaranteed to be something that takes direct interrupts. */ static struct sge_iq * vi_intr_iq(struct vi_info *vi, int idx) { struct adapter *sc = vi->pi->adapter; struct sge *s = &sc->sge; struct sge_iq *iq = NULL; int nintr, i; if (sc->intr_count == 1) return (&sc->sge.fwq); nintr = vi->nintr; KASSERT(nintr != 0, ("%s: vi %p has no exclusive interrupts, total interrupts = %d", __func__, vi, sc->intr_count)); i = idx % nintr; if (vi->flags & INTR_RXQ) { if (i < vi->nrxq) { iq = &s->rxq[vi->first_rxq + i].iq; goto done; } i -= vi->nrxq; } #ifdef TCP_OFFLOAD if (vi->flags & INTR_OFLD_RXQ) { if (i < vi->nofldrxq) { iq = &s->ofld_rxq[vi->first_ofld_rxq + i].iq; goto done; } i -= vi->nofldrxq; } #endif panic("%s: vi %p, intr_flags 0x%lx, idx %d, total intr %d\n", __func__, vi, vi->flags & INTR_ALL, idx, nintr); done: MPASS(iq != NULL); KASSERT(iq->flags & IQ_INTR, ("%s: iq %p (vi %p, intr_flags 0x%lx, idx %d)", __func__, iq, vi, vi->flags & INTR_ALL, idx)); return (iq); } /* Maximum payload that can be delivered with a single iq descriptor */ static inline int mtu_to_max_payload(struct adapter *sc, int mtu, const int toe) { int payload; #ifdef TCP_OFFLOAD if (toe) { payload = sc->tt.rx_coalesce ? 
G_RXCOALESCESIZE(t4_read_reg(sc, A_TP_PARA_REG2)) : mtu; } else { #endif /* large enough even when hw VLAN extraction is disabled */ payload = sc->params.sge.fl_pktshift + ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN + mtu; #ifdef TCP_OFFLOAD } #endif return (payload); } int t4_setup_vi_queues(struct vi_info *vi) { int rc = 0, i, j, intr_idx, iqid; struct sge_rxq *rxq; struct sge_txq *txq; struct sge_wrq *ctrlq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; struct sge_wrq *ofld_txq; #endif #ifdef DEV_NETMAP int saved_idx; struct sge_nm_rxq *nm_rxq; struct sge_nm_txq *nm_txq; #endif char name[16]; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct ifnet *ifp = vi->ifp; struct sysctl_oid *oid = device_get_sysctl_tree(vi->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); int maxp, mtu = ifp->if_mtu; /* Interrupt vector to start from (when using multiple vectors) */ intr_idx = first_vector(vi); #ifdef DEV_NETMAP saved_idx = intr_idx; if (ifp->if_capabilities & IFCAP_NETMAP) { /* netmap is supported with direct interrupts only. */ MPASS(vi->flags & INTR_RXQ); /* * We don't have buffers to back the netmap rx queues * right now so we create the queues in a way that * doesn't set off any congestion signal in the chip. */ oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "nm_rxq", CTLFLAG_RD, NULL, "rx queues"); for_each_nm_rxq(vi, i, nm_rxq) { rc = alloc_nm_rxq(vi, nm_rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "nm_txq", CTLFLAG_RD, NULL, "tx queues"); for_each_nm_txq(vi, i, nm_txq) { iqid = vi->first_nm_rxq + (i % vi->nnmrxq); rc = alloc_nm_txq(vi, nm_txq, iqid, i, oid); if (rc != 0) goto done; } } /* Normal rx queues and netmap rx queues share the same interrupts. */ intr_idx = saved_idx; #endif /* * First pass over all NIC and TOE rx queues: * a) initialize iq and fl * b) allocate queue iff it will take direct interrupts. */ maxp = mtu_to_max_payload(sc, mtu, 0); if (vi->flags & INTR_RXQ) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "rxq", CTLFLAG_RD, NULL, "rx queues"); } for_each_rxq(vi, i, rxq) { init_iq(&rxq->iq, sc, vi->tmr_idx, vi->pktc_idx, vi->qsize_rxq); snprintf(name, sizeof(name), "%s rxq%d-fl", device_get_nameunit(vi->dev), i); init_fl(sc, &rxq->fl, vi->qsize_rxq / 8, maxp, name); if (vi->flags & INTR_RXQ) { rxq->iq.flags |= IQ_INTR; rc = alloc_rxq(vi, rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } } #ifdef DEV_NETMAP if (ifp->if_capabilities & IFCAP_NETMAP) intr_idx = saved_idx + max(vi->nrxq, vi->nnmrxq); #endif #ifdef TCP_OFFLOAD maxp = mtu_to_max_payload(sc, mtu, 1); if (vi->flags & INTR_OFLD_RXQ) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_rxq", CTLFLAG_RD, NULL, "rx queues for offloaded TCP connections"); } for_each_ofld_rxq(vi, i, ofld_rxq) { init_iq(&ofld_rxq->iq, sc, vi->tmr_idx, vi->pktc_idx, vi->qsize_rxq); snprintf(name, sizeof(name), "%s ofld_rxq%d-fl", device_get_nameunit(vi->dev), i); init_fl(sc, &ofld_rxq->fl, vi->qsize_rxq / 8, maxp, name); if (vi->flags & INTR_OFLD_RXQ) { ofld_rxq->iq.flags |= IQ_INTR; rc = alloc_ofld_rxq(vi, ofld_rxq, intr_idx, i, oid); if (rc != 0) goto done; intr_idx++; } } #endif /* * Second pass over all NIC and TOE rx queues. The queues forwarding * their interrupts are allocated now. 
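 * Each one is pointed at the abs_id of a direct-interrupt iq chosen
 * round-robin by vi_intr_iq(); e.g. with 2 exclusive vectors,
 * forwarding queues 0, 1, 2, 3 are pointed at direct iqs 0, 1, 0, 1
 * respectively.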
*/ j = 0; if (!(vi->flags & INTR_RXQ)) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "rxq", CTLFLAG_RD, NULL, "rx queues"); for_each_rxq(vi, i, rxq) { MPASS(!(rxq->iq.flags & IQ_INTR)); intr_idx = vi_intr_iq(vi, j)->abs_id; rc = alloc_rxq(vi, rxq, intr_idx, i, oid); if (rc != 0) goto done; j++; } } #ifdef TCP_OFFLOAD if (vi->nofldrxq != 0 && !(vi->flags & INTR_OFLD_RXQ)) { oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_rxq", CTLFLAG_RD, NULL, "rx queues for offloaded TCP connections"); for_each_ofld_rxq(vi, i, ofld_rxq) { MPASS(!(ofld_rxq->iq.flags & IQ_INTR)); intr_idx = vi_intr_iq(vi, j)->abs_id; rc = alloc_ofld_rxq(vi, ofld_rxq, intr_idx, i, oid); if (rc != 0) goto done; j++; } } #endif /* * Now the tx queues. Only one pass needed. */ oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "txq", CTLFLAG_RD, NULL, "tx queues"); j = 0; for_each_txq(vi, i, txq) { iqid = vi_intr_iq(vi, j)->cntxt_id; snprintf(name, sizeof(name), "%s txq%d", device_get_nameunit(vi->dev), i); init_eq(sc, &txq->eq, EQ_ETH, vi->qsize_txq, pi->tx_chan, iqid, name); rc = alloc_txq(vi, txq, i, oid); if (rc != 0) goto done; j++; } #ifdef TCP_OFFLOAD oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ofld_txq", CTLFLAG_RD, NULL, "tx queues for offloaded TCP connections"); for_each_ofld_txq(vi, i, ofld_txq) { struct sysctl_oid *oid2; iqid = vi_intr_iq(vi, j)->cntxt_id; snprintf(name, sizeof(name), "%s ofld_txq%d", device_get_nameunit(vi->dev), i); init_eq(sc, &ofld_txq->eq, EQ_OFLD, vi->qsize_txq, pi->tx_chan, iqid, name); snprintf(name, sizeof(name), "%d", i); oid2 = SYSCTL_ADD_NODE(&vi->ctx, SYSCTL_CHILDREN(oid), OID_AUTO, name, CTLFLAG_RD, NULL, "offload tx queue"); rc = alloc_wrq(sc, vi, ofld_txq, oid2); if (rc != 0) goto done; j++; } #endif /* * Finally, the control queue. */ if (!IS_MAIN_VI(vi) || sc->flags & IS_VF) goto done; oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, "ctrlq", CTLFLAG_RD, NULL, "ctrl queue"); ctrlq = &sc->sge.ctrlq[pi->port_id]; iqid = vi_intr_iq(vi, 0)->cntxt_id; snprintf(name, sizeof(name), "%s ctrlq", device_get_nameunit(vi->dev)); init_eq(sc, &ctrlq->eq, EQ_CTRL, CTRL_EQ_QSIZE, pi->tx_chan, iqid, name); rc = alloc_wrq(sc, vi, ctrlq, oid); done: if (rc) t4_teardown_vi_queues(vi); return (rc); } /* * Idempotent */ int t4_teardown_vi_queues(struct vi_info *vi) { int i; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct sge_rxq *rxq; struct sge_txq *txq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; struct sge_wrq *ofld_txq; #endif #ifdef DEV_NETMAP struct sge_nm_rxq *nm_rxq; struct sge_nm_txq *nm_txq; #endif /* Do this before freeing the queues */ if (vi->flags & VI_SYSCTL_CTX) { sysctl_ctx_free(&vi->ctx); vi->flags &= ~VI_SYSCTL_CTX; } #ifdef DEV_NETMAP if (vi->ifp->if_capabilities & IFCAP_NETMAP) { for_each_nm_txq(vi, i, nm_txq) { free_nm_txq(vi, nm_txq); } for_each_nm_rxq(vi, i, nm_rxq) { free_nm_rxq(vi, nm_rxq); } } #endif /* * Take down all the tx queues first, as they reference the rx queues * (for egress updates, etc.). */ if (IS_MAIN_VI(vi) && !(sc->flags & IS_VF)) free_wrq(sc, &sc->sge.ctrlq[pi->port_id]); for_each_txq(vi, i, txq) { free_txq(vi, txq); } #ifdef TCP_OFFLOAD for_each_ofld_txq(vi, i, ofld_txq) { free_wrq(sc, ofld_txq); } #endif /* * Then take down the rx queues that forward their interrupts, as they * reference other rx queues. 
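 * (A forwarding iq was created with the abs_id of its target, so the
 * direct-interrupt queues must outlive it.)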
*/ for_each_rxq(vi, i, rxq) { if ((rxq->iq.flags & IQ_INTR) == 0) free_rxq(vi, rxq); } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { if ((ofld_rxq->iq.flags & IQ_INTR) == 0) free_ofld_rxq(vi, ofld_rxq); } #endif /* * Then take down the rx queues that take direct interrupts. */ for_each_rxq(vi, i, rxq) { if (rxq->iq.flags & IQ_INTR) free_rxq(vi, rxq); } #ifdef TCP_OFFLOAD for_each_ofld_rxq(vi, i, ofld_rxq) { if (ofld_rxq->iq.flags & IQ_INTR) free_ofld_rxq(vi, ofld_rxq); } #endif return (0); } /* * Deals with errors and the firmware event queue. All data rx queues forward * their interrupt to the firmware event queue. */ void t4_intr_all(void *arg) { struct adapter *sc = arg; struct sge_iq *fwq = &sc->sge.fwq; t4_intr_err(arg); if (atomic_cmpset_int(&fwq->state, IQS_IDLE, IQS_BUSY)) { service_iq(fwq, 0); atomic_cmpset_int(&fwq->state, IQS_BUSY, IQS_IDLE); } } /* Deals with error interrupts */ void t4_intr_err(void *arg) { struct adapter *sc = arg; t4_write_reg(sc, MYPF_REG(A_PCIE_PF_CLI), 0); t4_slow_intr_handler(sc); } void t4_intr_evt(void *arg) { struct sge_iq *iq = arg; if (atomic_cmpset_int(&iq->state, IQS_IDLE, IQS_BUSY)) { service_iq(iq, 0); atomic_cmpset_int(&iq->state, IQS_BUSY, IQS_IDLE); } } void t4_intr(void *arg) { struct sge_iq *iq = arg; if (atomic_cmpset_int(&iq->state, IQS_IDLE, IQS_BUSY)) { service_iq(iq, 0); atomic_cmpset_int(&iq->state, IQS_BUSY, IQS_IDLE); } } void t4_vi_intr(void *arg) { struct irq *irq = arg; #ifdef DEV_NETMAP if (atomic_cmpset_int(&irq->nm_state, NM_ON, NM_BUSY)) { t4_nm_intr(irq->nm_rxq); atomic_cmpset_int(&irq->nm_state, NM_BUSY, NM_ON); } #endif if (irq->rxq != NULL) t4_intr(irq->rxq); } /* * Deals with anything and everything on the given ingress queue. */ static int service_iq(struct sge_iq *iq, int budget) { struct sge_iq *q; struct sge_rxq *rxq = iq_to_rxq(iq); /* Use iff iq is part of rxq */ struct sge_fl *fl; /* Use iff IQ_HAS_FL */ struct adapter *sc = iq->adapter; struct iq_desc *d = &iq->desc[iq->cidx]; int ndescs = 0, limit; int rsp_type, refill; uint32_t lq; uint16_t fl_hw_cidx; struct mbuf *m0; STAILQ_HEAD(, sge_iq) iql = STAILQ_HEAD_INITIALIZER(iql); #if defined(INET) || defined(INET6) const struct timeval lro_timeout = {0, sc->lro_timeout}; #endif KASSERT(iq->state == IQS_BUSY, ("%s: iq %p not BUSY", __func__, iq)); limit = budget ? budget : iq->qsize / 16; if (iq->flags & IQ_HAS_FL) { fl = &rxq->fl; fl_hw_cidx = fl->hw_cidx; /* stable snapshot */ } else { fl = NULL; fl_hw_cidx = 0; /* to silence gcc warning */ } /* * We always come back and check the descriptor ring for new indirect * interrupts and other responses after running a single handler. */ for (;;) { while ((d->rsp.u.type_gen & F_RSPD_GEN) == iq->gen) { rmb(); refill = 0; m0 = NULL; rsp_type = G_RSPD_TYPE(d->rsp.u.type_gen); lq = be32toh(d->rsp.pldbuflen_qid); switch (rsp_type) { case X_RSPD_TYPE_FLBUF: KASSERT(iq->flags & IQ_HAS_FL, ("%s: data for an iq (%p) with no freelist", __func__, iq)); m0 = get_fl_payload(sc, fl, lq); if (__predict_false(m0 == NULL)) goto process_iql; refill = IDXDIFF(fl->hw_cidx, fl_hw_cidx, fl->sidx) > 2; #ifdef T4_PKT_TIMESTAMP /* * 60 bit timestamp for the payload is * *(uint64_t *)m0->m_pktdat. Note that it is * in the leading free-space in the mbuf. The * kernel can clobber it during a pullup, * m_copymdata, etc. You need to make sure that * the mbuf reaches you unmolested if you care * about the timestamp. 
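 * A sketch of reading it promptly on the consumer side, before any
 * pullup can happen:
 *
 *	uint64_t ts = *(const uint64_t *)m0->m_pktdat;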
*/ *(uint64_t *)m0->m_pktdat = be64toh(ctrl->u.last_flit) & 0xfffffffffffffff; #endif /* fall through */ case X_RSPD_TYPE_CPL: KASSERT(d->rss.opcode < NUM_CPL_CMDS, ("%s: bad opcode %02x.", __func__, d->rss.opcode)); t4_cpl_handler[d->rss.opcode](iq, &d->rss, m0); break; case X_RSPD_TYPE_INTR: /* * Interrupts should be forwarded only to queues * that are not forwarding their interrupts. * This means service_iq can recurse but only 1 * level deep. */ KASSERT(budget == 0, ("%s: budget %u, rsp_type %u", __func__, budget, rsp_type)); /* * There are 1K interrupt-capable queues (qids 0 * through 1023). A response type indicating a * forwarded interrupt with a qid >= 1K is an * iWARP async notification. */ if (lq >= 1024) { t4_an_handler(iq, &d->rsp); break; } q = sc->sge.iqmap[lq - sc->sge.iq_start - sc->sge.iq_base]; if (atomic_cmpset_int(&q->state, IQS_IDLE, IQS_BUSY)) { if (service_iq(q, q->qsize / 16) == 0) { atomic_cmpset_int(&q->state, IQS_BUSY, IQS_IDLE); } else { STAILQ_INSERT_TAIL(&iql, q, link); } } break; default: KASSERT(0, ("%s: illegal response type %d on iq %p", __func__, rsp_type, iq)); log(LOG_ERR, "%s: illegal response type %d on iq %p", device_get_nameunit(sc->dev), rsp_type, iq); break; } d++; if (__predict_false(++iq->cidx == iq->sidx)) { iq->cidx = 0; iq->gen ^= F_RSPD_GEN; d = &iq->desc[0]; } if (__predict_false(++ndescs == limit)) { t4_write_reg(sc, sc->sge_gts_reg, V_CIDXINC(ndescs) | V_INGRESSQID(iq->cntxt_id) | V_SEINTARM(V_QINTR_TIMER_IDX(X_TIMERREG_UPDATE_CIDX))); ndescs = 0; #if defined(INET) || defined(INET6) if (iq->flags & IQ_LRO_ENABLED && sc->lro_timeout != 0) { tcp_lro_flush_inactive(&rxq->lro, &lro_timeout); } #endif if (budget) { if (iq->flags & IQ_HAS_FL) { FL_LOCK(fl); refill_fl(sc, fl, 32); FL_UNLOCK(fl); } return (EINPROGRESS); } } if (refill) { FL_LOCK(fl); refill_fl(sc, fl, 32); FL_UNLOCK(fl); fl_hw_cidx = fl->hw_cidx; } } process_iql: if (STAILQ_EMPTY(&iql)) break; /* * Process the head only, and send it to the back of the list if * it's still not done. */ q = STAILQ_FIRST(&iql); STAILQ_REMOVE_HEAD(&iql, link); if (service_iq(q, q->qsize / 8) == 0) atomic_cmpset_int(&q->state, IQS_BUSY, IQS_IDLE); else STAILQ_INSERT_TAIL(&iql, q, link); } #if defined(INET) || defined(INET6) if (iq->flags & IQ_LRO_ENABLED) { struct lro_ctrl *lro = &rxq->lro; tcp_lro_flush_all(lro); } #endif t4_write_reg(sc, sc->sge_gts_reg, V_CIDXINC(ndescs) | V_INGRESSQID((u32)iq->cntxt_id) | V_SEINTARM(iq->intr_params)); if (iq->flags & IQ_HAS_FL) { int starved; FL_LOCK(fl); starved = refill_fl(sc, fl, 64); FL_UNLOCK(fl); if (__predict_false(starved != 0)) add_fl_to_sfl(sc, fl); } return (0); } static inline int cl_has_metadata(struct sge_fl *fl, struct cluster_layout *cll) { int rc = fl->flags & FL_BUF_PACKING || cll->region1 > 0; if (rc) MPASS(cll->region3 >= CL_METADATA_SIZE); return (rc); } static inline struct cluster_metadata * cl_metadata(struct adapter *sc, struct sge_fl *fl, struct cluster_layout *cll, caddr_t cl) { if (cl_has_metadata(fl, cll)) { struct sw_zone_info *swz = &sc->sge.sw_zone_info[cll->zidx]; return ((struct cluster_metadata *)(cl + swz->size) - 1); } return (NULL); } static void rxb_free(struct mbuf *m, void *arg1, void *arg2) { uma_zone_t zone = arg1; caddr_t cl = arg2; uma_zfree(zone, cl); counter_u64_add(extfree_rels, 1); } /* * The mbuf returned by this function could be allocated from zone_mbuf or * constructed in spare room in the cluster. 
* * The mbuf carries the payload in one of these ways * a) frame inside the mbuf (mbuf from zone_mbuf) * b) m_cljset (for clusters without metadata) zone_mbuf * c) m_extaddref (cluster with metadata) inline mbuf * d) m_extaddref (cluster with metadata) zone_mbuf */ static struct mbuf * get_scatter_segment(struct adapter *sc, struct sge_fl *fl, int fr_offset, int remaining) { struct mbuf *m; struct fl_sdesc *sd = &fl->sdesc[fl->cidx]; struct cluster_layout *cll = &sd->cll; struct sw_zone_info *swz = &sc->sge.sw_zone_info[cll->zidx]; struct hw_buf_info *hwb = &sc->sge.hw_buf_info[cll->hwidx]; struct cluster_metadata *clm = cl_metadata(sc, fl, cll, sd->cl); int len, blen; caddr_t payload; blen = hwb->size - fl->rx_offset; /* max possible in this buf */ len = min(remaining, blen); payload = sd->cl + cll->region1 + fl->rx_offset; if (fl->flags & FL_BUF_PACKING) { const u_int l = fr_offset + len; const u_int pad = roundup2(l, fl->buf_boundary) - l; if (fl->rx_offset + len + pad < hwb->size) blen = len + pad; MPASS(fl->rx_offset + blen <= hwb->size); } else { MPASS(fl->rx_offset == 0); /* not packing */ } if (sc->sc_do_rxcopy && len < RX_COPY_THRESHOLD) { /* * Copy payload into a freshly allocated mbuf. */ m = fr_offset == 0 ? m_gethdr(M_NOWAIT, MT_DATA) : m_get(M_NOWAIT, MT_DATA); if (m == NULL) return (NULL); fl->mbuf_allocated++; #ifdef T4_PKT_TIMESTAMP /* Leave room for a timestamp */ m->m_data += 8; #endif /* copy data to mbuf */ bcopy(payload, mtod(m, caddr_t), len); } else if (sd->nmbuf * MSIZE < cll->region1) { /* * There's spare room in the cluster for an mbuf. Create one * and associate it with the payload that's in the cluster. */ MPASS(clm != NULL); m = (struct mbuf *)(sd->cl + sd->nmbuf * MSIZE); /* No bzero required */ if (m_init(m, M_NOWAIT, MT_DATA, fr_offset == 0 ? M_PKTHDR | M_NOFREE : M_NOFREE)) return (NULL); fl->mbuf_inlined++; m_extaddref(m, payload, blen, &clm->refcount, rxb_free, swz->zone, sd->cl); if (sd->nmbuf++ == 0) counter_u64_add(extfree_refs, 1); } else { /* * Grab an mbuf from zone_mbuf and associate it with the * payload in the cluster. */ m = fr_offset == 0 ? 
m_gethdr(M_NOWAIT, MT_DATA) : m_get(M_NOWAIT, MT_DATA); if (m == NULL) return (NULL); fl->mbuf_allocated++; if (clm != NULL) { m_extaddref(m, payload, blen, &clm->refcount, rxb_free, swz->zone, sd->cl); if (sd->nmbuf++ == 0) counter_u64_add(extfree_refs, 1); } else { m_cljset(m, sd->cl, swz->type); sd->cl = NULL; /* consumed, not a recycle candidate */ } } if (fr_offset == 0) m->m_pkthdr.len = remaining; m->m_len = len; if (fl->flags & FL_BUF_PACKING) { fl->rx_offset += blen; MPASS(fl->rx_offset <= hwb->size); if (fl->rx_offset < hwb->size) return (m); /* without advancing the cidx */ } if (__predict_false(++fl->cidx % 8 == 0)) { uint16_t cidx = fl->cidx / 8; if (__predict_false(cidx == fl->sidx)) fl->cidx = cidx = 0; fl->hw_cidx = cidx; } fl->rx_offset = 0; return (m); } static struct mbuf * get_fl_payload(struct adapter *sc, struct sge_fl *fl, uint32_t len_newbuf) { struct mbuf *m0, *m, **pnext; u_int remaining; const u_int total = G_RSPD_LEN(len_newbuf); if (__predict_false(fl->flags & FL_BUF_RESUME)) { M_ASSERTPKTHDR(fl->m0); MPASS(fl->m0->m_pkthdr.len == total); MPASS(fl->remaining < total); m0 = fl->m0; pnext = fl->pnext; remaining = fl->remaining; fl->flags &= ~FL_BUF_RESUME; goto get_segment; } if (fl->rx_offset > 0 && len_newbuf & F_RSPD_NEWBUF) { fl->rx_offset = 0; if (__predict_false(++fl->cidx % 8 == 0)) { uint16_t cidx = fl->cidx / 8; if (__predict_false(cidx == fl->sidx)) fl->cidx = cidx = 0; fl->hw_cidx = cidx; } } /* * Payload starts at rx_offset in the current hw buffer. Its length is * 'len' and it may span multiple hw buffers. */ m0 = get_scatter_segment(sc, fl, 0, total); if (m0 == NULL) return (NULL); remaining = total - m0->m_len; pnext = &m0->m_next; while (remaining > 0) { get_segment: MPASS(fl->rx_offset == 0); m = get_scatter_segment(sc, fl, total - remaining, remaining); if (__predict_false(m == NULL)) { fl->m0 = m0; fl->pnext = pnext; fl->remaining = remaining; fl->flags |= FL_BUF_RESUME; return (NULL); } *pnext = m; pnext = &m->m_next; remaining -= m->m_len; } *pnext = NULL; M_ASSERTPKTHDR(m0); return (m0); } static int t4_eth_rx(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m0) { struct sge_rxq *rxq = iq_to_rxq(iq); struct ifnet *ifp = rxq->ifp; struct adapter *sc = iq->adapter; const struct cpl_rx_pkt *cpl = (const void *)(rss + 1); #if defined(INET) || defined(INET6) struct lro_ctrl *lro = &rxq->lro; #endif static const int sw_hashtype[4][2] = { {M_HASHTYPE_NONE, M_HASHTYPE_NONE}, {M_HASHTYPE_RSS_IPV4, M_HASHTYPE_RSS_IPV6}, {M_HASHTYPE_RSS_TCP_IPV4, M_HASHTYPE_RSS_TCP_IPV6}, {M_HASHTYPE_RSS_UDP_IPV4, M_HASHTYPE_RSS_UDP_IPV6}, }; KASSERT(m0 != NULL, ("%s: no payload with opcode %02x", __func__, rss->opcode)); m0->m_pkthdr.len -= sc->params.sge.fl_pktshift; m0->m_len -= sc->params.sge.fl_pktshift; m0->m_data += sc->params.sge.fl_pktshift; m0->m_pkthdr.rcvif = ifp; M_HASHTYPE_SET(m0, sw_hashtype[rss->hash_type][rss->ipv6]); m0->m_pkthdr.flowid = be32toh(rss->hash_val); if (cpl->csum_calc && !cpl->err_vec) { if (ifp->if_capenable & IFCAP_RXCSUM && cpl->l2info & htobe32(F_RXF_IP)) { m0->m_pkthdr.csum_flags = (CSUM_IP_CHECKED | CSUM_IP_VALID | CSUM_DATA_VALID | CSUM_PSEUDO_HDR); rxq->rxcsum++; } else if (ifp->if_capenable & IFCAP_RXCSUM_IPV6 && cpl->l2info & htobe32(F_RXF_IP6)) { m0->m_pkthdr.csum_flags = (CSUM_DATA_VALID_IPV6 | CSUM_PSEUDO_HDR); rxq->rxcsum++; } if (__predict_false(cpl->ip_frag)) m0->m_pkthdr.csum_data = be16toh(cpl->csum); else m0->m_pkthdr.csum_data = 0xffff; } if (cpl->vlan_ex) { m0->m_pkthdr.ether_vtag = be16toh(cpl->vlan); 
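		/*
		 * The 802.1Q tag arrives out of band in the CPL: ether_vtag
		 * carries the TCI and M_VLANTAG tells the stack that the tag
		 * has already been stripped from the frame.
		 */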
m0->m_flags |= M_VLANTAG; rxq->vlan_extraction++; } #if defined(INET) || defined(INET6) - if (cpl->l2info & htobe32(F_RXF_LRO) && - iq->flags & IQ_LRO_ENABLED && + if (iq->flags & IQ_LRO_ENABLED && tcp_lro_rx(lro, m0, 0) == 0) { /* queued for LRO */ } else #endif ifp->if_input(ifp, m0); return (0); } /* * Must drain the wrq or make sure that someone else will. */ static void wrq_tx_drain(void *arg, int n) { struct sge_wrq *wrq = arg; struct sge_eq *eq = &wrq->eq; EQ_LOCK(eq); if (TAILQ_EMPTY(&wrq->incomplete_wrs) && !STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(wrq->adapter, wrq); EQ_UNLOCK(eq); } static void drain_wrq_wr_list(struct adapter *sc, struct sge_wrq *wrq) { struct sge_eq *eq = &wrq->eq; u_int available, dbdiff; /* # of hardware descriptors */ u_int n; struct wrqe *wr; struct fw_eth_tx_pkt_wr *dst; /* any fw WR struct will do */ EQ_LOCK_ASSERT_OWNED(eq); MPASS(TAILQ_EMPTY(&wrq->incomplete_wrs)); wr = STAILQ_FIRST(&wrq->wr_list); MPASS(wr != NULL); /* Must be called with something useful to do */ MPASS(eq->pidx == eq->dbidx); dbdiff = 0; do { eq->cidx = read_hw_cidx(eq); if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; MPASS(wr->wrq == wrq); n = howmany(wr->wr_len, EQ_ESIZE); if (available < n) break; dst = (void *)&eq->desc[eq->pidx]; if (__predict_true(eq->sidx - eq->pidx > n)) { /* Won't wrap, won't end exactly at the status page. */ bcopy(&wr->wr[0], dst, wr->wr_len); eq->pidx += n; } else { int first_portion = (eq->sidx - eq->pidx) * EQ_ESIZE; bcopy(&wr->wr[0], dst, first_portion); if (wr->wr_len > first_portion) { bcopy(&wr->wr[first_portion], &eq->desc[0], wr->wr_len - first_portion); } eq->pidx = n - (eq->sidx - eq->pidx); } if (available < eq->sidx / 4 && atomic_cmpset_int(&eq->equiq, 0, 1)) { dst->equiq_to_len16 |= htobe32(F_FW_WR_EQUIQ | F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } else if (IDXDIFF(eq->pidx, eq->equeqidx, eq->sidx) >= 32) { dst->equiq_to_len16 |= htobe32(F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } dbdiff += n; if (dbdiff >= 16) { ring_eq_db(sc, eq, dbdiff); dbdiff = 0; } STAILQ_REMOVE_HEAD(&wrq->wr_list, link); free_wrqe(wr); MPASS(wrq->nwr_pending > 0); wrq->nwr_pending--; MPASS(wrq->ndesc_needed >= n); wrq->ndesc_needed -= n; } while ((wr = STAILQ_FIRST(&wrq->wr_list)) != NULL); if (dbdiff) ring_eq_db(sc, eq, dbdiff); } /* * Doesn't fail. Holds on to work requests it can't send right away. */ void t4_wrq_tx_locked(struct adapter *sc, struct sge_wrq *wrq, struct wrqe *wr) { #ifdef INVARIANTS struct sge_eq *eq = &wrq->eq; #endif EQ_LOCK_ASSERT_OWNED(eq); MPASS(wr != NULL); MPASS(wr->wr_len > 0 && wr->wr_len <= SGE_MAX_WR_LEN); MPASS((wr->wr_len & 0x7) == 0); STAILQ_INSERT_TAIL(&wrq->wr_list, wr, link); wrq->nwr_pending++; wrq->ndesc_needed += howmany(wr->wr_len, EQ_ESIZE); if (!TAILQ_EMPTY(&wrq->incomplete_wrs)) return; /* commit_wrq_wr will drain wr_list as well. */ drain_wrq_wr_list(sc, wrq); /* Doorbell must have caught up to the pidx. 
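 * A worked example of the ring arithmetic used above: with
 * eq->sidx = 512 and eq->pidx = 510, a 3-descriptor WR wraps and the
 * new pidx is 3 - (512 - 510) = 1.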
*/ MPASS(eq->pidx == eq->dbidx); } void t4_update_fl_bufsize(struct ifnet *ifp) { struct vi_info *vi = ifp->if_softc; struct adapter *sc = vi->pi->adapter; struct sge_rxq *rxq; #ifdef TCP_OFFLOAD struct sge_ofld_rxq *ofld_rxq; #endif struct sge_fl *fl; int i, maxp, mtu = ifp->if_mtu; maxp = mtu_to_max_payload(sc, mtu, 0); for_each_rxq(vi, i, rxq) { fl = &rxq->fl; FL_LOCK(fl); find_best_refill_source(sc, fl, maxp); FL_UNLOCK(fl); } #ifdef TCP_OFFLOAD maxp = mtu_to_max_payload(sc, mtu, 1); for_each_ofld_rxq(vi, i, ofld_rxq) { fl = &ofld_rxq->fl; FL_LOCK(fl); find_best_refill_source(sc, fl, maxp); FL_UNLOCK(fl); } #endif } static inline int mbuf_nsegs(struct mbuf *m) { M_ASSERTPKTHDR(m); KASSERT(m->m_pkthdr.l5hlen > 0, ("%s: mbuf %p missing information on # of segments.", __func__, m)); return (m->m_pkthdr.l5hlen); } static inline void set_mbuf_nsegs(struct mbuf *m, uint8_t nsegs) { M_ASSERTPKTHDR(m); m->m_pkthdr.l5hlen = nsegs; } static inline int mbuf_len16(struct mbuf *m) { int n; M_ASSERTPKTHDR(m); n = m->m_pkthdr.PH_loc.eight[0]; MPASS(n > 0 && n <= SGE_MAX_WR_LEN / 16); return (n); } static inline void set_mbuf_len16(struct mbuf *m, uint8_t len16) { M_ASSERTPKTHDR(m); m->m_pkthdr.PH_loc.eight[0] = len16; } static inline int needs_tso(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & CSUM_TSO) { KASSERT(m->m_pkthdr.tso_segsz > 0, ("%s: TSO requested in mbuf %p but MSS not provided", __func__, m)); return (1); } return (0); } static inline int needs_l3_csum(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TSO)) return (1); return (0); } static inline int needs_l4_csum(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_pkthdr.csum_flags & (CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) return (1); return (0); } static inline int needs_vlan_insertion(struct mbuf *m) { M_ASSERTPKTHDR(m); if (m->m_flags & M_VLANTAG) { KASSERT(m->m_pkthdr.ether_vtag != 0, ("%s: HWVLAN requested in mbuf %p but tag not provided", __func__, m)); return (1); } return (0); } static void * m_advance(struct mbuf **pm, int *poffset, int len) { struct mbuf *m = *pm; int offset = *poffset; uintptr_t p = 0; MPASS(len > 0); for (;;) { if (offset + len < m->m_len) { offset += len; p = mtod(m, uintptr_t) + offset; break; } len -= m->m_len - offset; m = m->m_next; offset = 0; MPASS(m != NULL); } *poffset = offset; *pm = m; return ((void *)p); } static inline int same_paddr(char *a, char *b) { if (a == b) return (1); else if (a != NULL && b != NULL) { vm_offset_t x = (vm_offset_t)a; vm_offset_t y = (vm_offset_t)b; if ((x & PAGE_MASK) == (y & PAGE_MASK) && pmap_kextract(x) == pmap_kextract(y)) return (1); } return (0); } /* * Can deal with empty mbufs in the chain that have m_len = 0, but the chain * must have at least one mbuf that's not empty. */ static inline int count_mbuf_nsegs(struct mbuf *m) { char *prev_end, *start; int len, nsegs; MPASS(m != NULL); nsegs = 0; prev_end = NULL; for (; m; m = m->m_next) { len = m->m_len; if (__predict_false(len == 0)) continue; start = mtod(m, char *); nsegs += sglist_count(start, len); if (same_paddr(prev_end, start)) nsegs--; prev_end = start + len; } MPASS(nsegs > 0); return (nsegs); } /* * Analyze the mbuf to determine its tx needs. The mbuf passed in may change: * a) caller can assume it's been freed if this function returns with an error. * b) it may get defragged up if the gather list is too long for the hardware. 
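 * e.g. a chain with more segments than the hardware limit (TX_SGL_SEGS,
 * or TX_SGL_SEGS_TSO for TSO) is m_defrag()ed into a shorter chain of
 * clusters and the caller's mbuf pointer is updated to the new head.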
*/ int parse_pkt(struct adapter *sc, struct mbuf **mp) { struct mbuf *m0 = *mp, *m; int rc, nsegs, defragged = 0, offset; struct ether_header *eh; void *l3hdr; #if defined(INET) || defined(INET6) struct tcphdr *tcp; #endif uint16_t eh_type; M_ASSERTPKTHDR(m0); if (__predict_false(m0->m_pkthdr.len < ETHER_HDR_LEN)) { rc = EINVAL; fail: m_freem(m0); *mp = NULL; return (rc); } restart: /* * First count the number of gather list segments in the payload. * Defrag the mbuf if nsegs exceeds the hardware limit. */ M_ASSERTPKTHDR(m0); MPASS(m0->m_pkthdr.len > 0); nsegs = count_mbuf_nsegs(m0); if (nsegs > (needs_tso(m0) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS)) { if (defragged++ > 0 || (m = m_defrag(m0, M_NOWAIT)) == NULL) { rc = EFBIG; goto fail; } *mp = m0 = m; /* update caller's copy after defrag */ goto restart; } if (__predict_false(nsegs > 2 && m0->m_pkthdr.len <= MHLEN)) { m0 = m_pullup(m0, m0->m_pkthdr.len); if (m0 == NULL) { /* Should have left well enough alone. */ rc = EFBIG; goto fail; } *mp = m0; /* update caller's copy after pullup */ goto restart; } set_mbuf_nsegs(m0, nsegs); if (sc->flags & IS_VF) set_mbuf_len16(m0, txpkt_vm_len16(nsegs, needs_tso(m0))); else set_mbuf_len16(m0, txpkt_len16(nsegs, needs_tso(m0))); if (!needs_tso(m0) && !(sc->flags & IS_VF && (needs_l3_csum(m0) || needs_l4_csum(m0)))) return (0); m = m0; eh = mtod(m, struct ether_header *); eh_type = ntohs(eh->ether_type); if (eh_type == ETHERTYPE_VLAN) { struct ether_vlan_header *evh = (void *)eh; eh_type = ntohs(evh->evl_proto); m0->m_pkthdr.l2hlen = sizeof(*evh); } else m0->m_pkthdr.l2hlen = sizeof(*eh); offset = 0; l3hdr = m_advance(&m, &offset, m0->m_pkthdr.l2hlen); switch (eh_type) { #ifdef INET6 case ETHERTYPE_IPV6: { struct ip6_hdr *ip6 = l3hdr; MPASS(!needs_tso(m0) || ip6->ip6_nxt == IPPROTO_TCP); m0->m_pkthdr.l3hlen = sizeof(*ip6); break; } #endif #ifdef INET case ETHERTYPE_IP: { struct ip *ip = l3hdr; m0->m_pkthdr.l3hlen = ip->ip_hl * 4; break; } #endif default: panic("%s: ethertype 0x%04x unknown. 
if_cxgbe must be compiled" " with the same INET/INET6 options as the kernel.", __func__, eh_type); } #if defined(INET) || defined(INET6) if (needs_tso(m0)) { tcp = m_advance(&m, &offset, m0->m_pkthdr.l3hlen); m0->m_pkthdr.l4hlen = tcp->th_off * 4; } #endif MPASS(m0 == *mp); return (0); } void * start_wrq_wr(struct sge_wrq *wrq, int len16, struct wrq_cookie *cookie) { struct sge_eq *eq = &wrq->eq; struct adapter *sc = wrq->adapter; int ndesc, available; struct wrqe *wr; void *w; MPASS(len16 > 0); ndesc = howmany(len16, EQ_ESIZE / 16); MPASS(ndesc > 0 && ndesc <= SGE_MAX_WR_NDESC); EQ_LOCK(eq); if (!STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(sc, wrq); if (!STAILQ_EMPTY(&wrq->wr_list)) { slowpath: EQ_UNLOCK(eq); wr = alloc_wrqe(len16 * 16, wrq); if (__predict_false(wr == NULL)) return (NULL); cookie->pidx = -1; cookie->ndesc = ndesc; return (&wr->wr); } eq->cidx = read_hw_cidx(eq); if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; if (available < ndesc) goto slowpath; cookie->pidx = eq->pidx; cookie->ndesc = ndesc; TAILQ_INSERT_TAIL(&wrq->incomplete_wrs, cookie, link); w = &eq->desc[eq->pidx]; IDXINCR(eq->pidx, ndesc, eq->sidx); if (__predict_false(eq->pidx < ndesc - 1)) { w = &wrq->ss[0]; wrq->ss_pidx = cookie->pidx; wrq->ss_len = len16 * 16; } EQ_UNLOCK(eq); return (w); } void commit_wrq_wr(struct sge_wrq *wrq, void *w, struct wrq_cookie *cookie) { struct sge_eq *eq = &wrq->eq; struct adapter *sc = wrq->adapter; int ndesc, pidx; struct wrq_cookie *prev, *next; if (cookie->pidx == -1) { struct wrqe *wr = __containerof(w, struct wrqe, wr); t4_wrq_tx(sc, wr); return; } ndesc = cookie->ndesc; /* Can be more than SGE_MAX_WR_NDESC here. */ pidx = cookie->pidx; MPASS(pidx >= 0 && pidx < eq->sidx); if (__predict_false(w == &wrq->ss[0])) { int n = (eq->sidx - wrq->ss_pidx) * EQ_ESIZE; MPASS(wrq->ss_len > n); /* WR had better wrap around. */ bcopy(&wrq->ss[0], &eq->desc[wrq->ss_pidx], n); bcopy(&wrq->ss[n], &eq->desc[0], wrq->ss_len - n); wrq->tx_wrs_ss++; } else wrq->tx_wrs_direct++; EQ_LOCK(eq); prev = TAILQ_PREV(cookie, wrq_incomplete_wrs, link); next = TAILQ_NEXT(cookie, link); if (prev == NULL) { MPASS(pidx == eq->dbidx); if (next == NULL || ndesc >= 16) ring_eq_db(wrq->adapter, eq, ndesc); else { MPASS(IDXDIFF(next->pidx, pidx, eq->sidx) == ndesc); next->pidx = pidx; next->ndesc += ndesc; } } else { MPASS(IDXDIFF(pidx, prev->pidx, eq->sidx) == prev->ndesc); prev->ndesc += ndesc; } TAILQ_REMOVE(&wrq->incomplete_wrs, cookie, link); if (TAILQ_EMPTY(&wrq->incomplete_wrs) && !STAILQ_EMPTY(&wrq->wr_list)) drain_wrq_wr_list(sc, wrq); #ifdef INVARIANTS if (TAILQ_EMPTY(&wrq->incomplete_wrs)) { /* Doorbell must have caught up to the pidx. */ MPASS(wrq->eq.pidx == wrq->eq.dbidx); } #endif EQ_UNLOCK(eq); } static u_int can_resume_eth_tx(struct mp_ring *r) { struct sge_eq *eq = r->cookie; return (total_available_tx_desc(eq) > eq->sidx / 8); } static inline int cannot_use_txpkts(struct mbuf *m) { /* maybe put a GL limit too, to avoid silliness? */ return (needs_tso(m)); } /* * r->items[cidx] to r->items[pidx], with a wraparound at r->size, are ready to * be consumed. Return the actual number consumed. 0 indicates a stall. 
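 * e.g. with r->size = 8, cidx = 6, and pidx = 2, four packets (items
 * 6, 7, 0, 1) are ready for transmission.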
*/ static u_int eth_tx(struct mp_ring *r, u_int cidx, u_int pidx) { struct sge_txq *txq = r->cookie; struct sge_eq *eq = &txq->eq; struct ifnet *ifp = txq->ifp; struct vi_info *vi = ifp->if_softc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; u_int total, remaining; /* # of packets */ u_int available, dbdiff; /* # of hardware descriptors */ u_int n, next_cidx; struct mbuf *m0, *tail; struct txpkts txp; struct fw_eth_tx_pkts_wr *wr; /* any fw WR struct will do */ remaining = IDXDIFF(pidx, cidx, r->size); MPASS(remaining > 0); /* Must not be called without work to do. */ total = 0; TXQ_LOCK(txq); if (__predict_false((eq->flags & EQ_ENABLED) == 0)) { while (cidx != pidx) { m0 = r->items[cidx]; m_freem(m0); if (++cidx == r->size) cidx = 0; } reclaim_tx_descs(txq, 2048); total = remaining; goto done; } /* How many hardware descriptors do we have readily available. */ if (eq->pidx == eq->cidx) available = eq->sidx - 1; else available = IDXDIFF(eq->cidx, eq->pidx, eq->sidx) - 1; dbdiff = IDXDIFF(eq->pidx, eq->dbidx, eq->sidx); while (remaining > 0) { m0 = r->items[cidx]; M_ASSERTPKTHDR(m0); MPASS(m0->m_nextpkt == NULL); if (available < SGE_MAX_WR_NDESC) { available += reclaim_tx_descs(txq, 64); if (available < howmany(mbuf_len16(m0), EQ_ESIZE / 16)) break; /* out of descriptors */ } next_cidx = cidx + 1; if (__predict_false(next_cidx == r->size)) next_cidx = 0; wr = (void *)&eq->desc[eq->pidx]; if (sc->flags & IS_VF) { total++; remaining--; ETHER_BPF_MTAP(ifp, m0); n = write_txpkt_vm_wr(txq, (void *)wr, m0, available); } else if (remaining > 1 && try_txpkts(m0, r->items[next_cidx], &txp, available) == 0) { /* pkts at cidx, next_cidx should both be in txp. */ MPASS(txp.npkt == 2); tail = r->items[next_cidx]; MPASS(tail->m_nextpkt == NULL); ETHER_BPF_MTAP(ifp, m0); ETHER_BPF_MTAP(ifp, tail); m0->m_nextpkt = tail; if (__predict_false(++next_cidx == r->size)) next_cidx = 0; while (next_cidx != pidx) { if (add_to_txpkts(r->items[next_cidx], &txp, available) != 0) break; tail->m_nextpkt = r->items[next_cidx]; tail = tail->m_nextpkt; ETHER_BPF_MTAP(ifp, tail); if (__predict_false(++next_cidx == r->size)) next_cidx = 0; } n = write_txpkts_wr(txq, wr, m0, &txp, available); total += txp.npkt; remaining -= txp.npkt; } else { total++; remaining--; ETHER_BPF_MTAP(ifp, m0); n = write_txpkt_wr(txq, (void *)wr, m0, available); } MPASS(n >= 1 && n <= available && n <= SGE_MAX_WR_NDESC); available -= n; dbdiff += n; IDXINCR(eq->pidx, n, eq->sidx); if (total_available_tx_desc(eq) < eq->sidx / 4 && atomic_cmpset_int(&eq->equiq, 0, 1)) { wr->equiq_to_len16 |= htobe32(F_FW_WR_EQUIQ | F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } else if (IDXDIFF(eq->pidx, eq->equeqidx, eq->sidx) >= 32) { wr->equiq_to_len16 |= htobe32(F_FW_WR_EQUEQ); eq->equeqidx = eq->pidx; } if (dbdiff >= 16 && remaining >= 4) { ring_eq_db(sc, eq, dbdiff); available += reclaim_tx_descs(txq, 4 * dbdiff); dbdiff = 0; } cidx = next_cidx; } if (dbdiff != 0) { ring_eq_db(sc, eq, dbdiff); reclaim_tx_descs(txq, 32); } done: TXQ_UNLOCK(txq); return (total); } static inline void init_iq(struct sge_iq *iq, struct adapter *sc, int tmr_idx, int pktc_idx, int qsize) { KASSERT(tmr_idx >= 0 && tmr_idx < SGE_NTIMERS, ("%s: bad tmr_idx %d", __func__, tmr_idx)); KASSERT(pktc_idx < SGE_NCOUNTERS, /* -ve is ok, means don't use */ ("%s: bad pktc_idx %d", __func__, pktc_idx)); iq->flags = 0; iq->adapter = sc; iq->intr_params = V_QINTR_TIMER_IDX(tmr_idx); iq->intr_pktc_idx = SGE_NCOUNTERS - 1; if (pktc_idx >= 0) { iq->intr_params |= F_QINTR_CNT_EN; 
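/* pktc_idx picks one of the SGE_NCOUNTERS holdoff packet-count thresholds; with F_QINTR_CNT_EN set the queue can interrupt on that count as well as on the holdoff timer selected above (a negative pktc_idx, per the KASSERT, simply leaves the counter unused). */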
iq->intr_pktc_idx = pktc_idx; } iq->qsize = roundup2(qsize, 16); /* See FW_IQ_CMD/iqsize */ iq->sidx = iq->qsize - sc->params.sge.spg_len / IQ_ESIZE; } static inline void init_fl(struct adapter *sc, struct sge_fl *fl, int qsize, int maxp, char *name) { fl->qsize = qsize; fl->sidx = qsize - sc->params.sge.spg_len / EQ_ESIZE; strlcpy(fl->lockname, name, sizeof(fl->lockname)); if (sc->flags & BUF_PACKING_OK && ((!is_t4(sc) && buffer_packing) || /* T5+: enabled unless 0 */ (is_t4(sc) && buffer_packing == 1)))/* T4: disabled unless 1 */ fl->flags |= FL_BUF_PACKING; find_best_refill_source(sc, fl, maxp); find_safe_refill_source(sc, fl); } static inline void init_eq(struct adapter *sc, struct sge_eq *eq, int eqtype, int qsize, uint8_t tx_chan, uint16_t iqid, char *name) { KASSERT(eqtype <= EQ_TYPEMASK, ("%s: bad qtype %d", __func__, eqtype)); eq->flags = eqtype & EQ_TYPEMASK; eq->tx_chan = tx_chan; eq->iqid = iqid; eq->sidx = qsize - sc->params.sge.spg_len / EQ_ESIZE; strlcpy(eq->lockname, name, sizeof(eq->lockname)); } static int alloc_ring(struct adapter *sc, size_t len, bus_dma_tag_t *tag, bus_dmamap_t *map, bus_addr_t *pa, void **va) { int rc; rc = bus_dma_tag_create(sc->dmat, 512, 0, BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL, len, 1, len, 0, NULL, NULL, tag); if (rc != 0) { device_printf(sc->dev, "cannot allocate DMA tag: %d\n", rc); goto done; } rc = bus_dmamem_alloc(*tag, va, BUS_DMA_WAITOK | BUS_DMA_COHERENT | BUS_DMA_ZERO, map); if (rc != 0) { device_printf(sc->dev, "cannot allocate DMA memory: %d\n", rc); goto done; } rc = bus_dmamap_load(*tag, *map, *va, len, oneseg_dma_callback, pa, 0); if (rc != 0) { device_printf(sc->dev, "cannot load DMA map: %d\n", rc); goto done; } done: if (rc) free_ring(sc, *tag, *map, *pa, *va); return (rc); } static int free_ring(struct adapter *sc, bus_dma_tag_t tag, bus_dmamap_t map, bus_addr_t pa, void *va) { if (pa) bus_dmamap_unload(tag, map); if (va) bus_dmamem_free(tag, va, map); if (tag) bus_dma_tag_destroy(tag); return (0); } /* * Allocates the ring for an ingress queue and an optional freelist. If the * freelist is specified it will be allocated and then associated with the * ingress queue. * * Returns errno on failure. Resources allocated up to that point may still be * allocated. Caller is responsible for cleanup in case this function fails. * * If the ingress queue will take interrupts directly (iq->flags & IQ_INTR) then * the intr_idx specifies the vector, starting from 0. Otherwise it specifies * the abs_id of the ingress queue to which its interrupts should be forwarded. 
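* (For example, a queue created with IQ_INTR and intr_idx = 2 takes vector 2 directly; without IQ_INTR, intr_idx would instead carry the abs_id of the queue, such as the firmware event queue, that services its interrupts.)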
*/ static int alloc_iq_fl(struct vi_info *vi, struct sge_iq *iq, struct sge_fl *fl, int intr_idx, int cong) { int rc, i, cntxt_id; size_t len; struct fw_iq_cmd c; struct port_info *pi = vi->pi; struct adapter *sc = iq->adapter; struct sge_params *sp = &sc->params.sge; __be32 v = 0; len = iq->qsize * IQ_ESIZE; rc = alloc_ring(sc, len, &iq->desc_tag, &iq->desc_map, &iq->ba, (void **)&iq->desc); if (rc != 0) return (rc); bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_IQ_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_IQ_CMD_PFN(sc->pf) | V_FW_IQ_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_IQ_CMD_ALLOC | F_FW_IQ_CMD_IQSTART | FW_LEN16(c)); /* Special handling for firmware event queue */ if (iq == &sc->sge.fwq) v |= F_FW_IQ_CMD_IQASYNCH; if (iq->flags & IQ_INTR) { KASSERT(intr_idx < sc->intr_count, ("%s: invalid direct intr_idx %d", __func__, intr_idx)); } else v |= F_FW_IQ_CMD_IQANDST; v |= V_FW_IQ_CMD_IQANDSTINDEX(intr_idx); c.type_to_iqandstindex = htobe32(v | V_FW_IQ_CMD_TYPE(FW_IQ_TYPE_FL_INT_CAP) | V_FW_IQ_CMD_VIID(vi->viid) | V_FW_IQ_CMD_IQANUD(X_UPDATEDELIVERY_INTERRUPT)); c.iqdroprss_to_iqesize = htobe16(V_FW_IQ_CMD_IQPCIECH(pi->tx_chan) | F_FW_IQ_CMD_IQGTSMODE | V_FW_IQ_CMD_IQINTCNTTHRESH(iq->intr_pktc_idx) | V_FW_IQ_CMD_IQESIZE(ilog2(IQ_ESIZE) - 4)); c.iqsize = htobe16(iq->qsize); c.iqaddr = htobe64(iq->ba); if (cong >= 0) c.iqns_to_fl0congen = htobe32(F_FW_IQ_CMD_IQFLINTCONGEN); if (fl) { mtx_init(&fl->fl_lock, fl->lockname, NULL, MTX_DEF); len = fl->qsize * EQ_ESIZE; rc = alloc_ring(sc, len, &fl->desc_tag, &fl->desc_map, &fl->ba, (void **)&fl->desc); if (rc) return (rc); /* Allocate space for one software descriptor per buffer. */ rc = alloc_fl_sdesc(fl); if (rc != 0) { device_printf(sc->dev, "failed to setup fl software descriptors: %d\n", rc); return (rc); } if (fl->flags & FL_BUF_PACKING) { fl->lowat = roundup2(sp->fl_starve_threshold2, 8); fl->buf_boundary = sp->pack_boundary; } else { fl->lowat = roundup2(sp->fl_starve_threshold, 8); fl->buf_boundary = 16; } if (fl_pad && fl->buf_boundary < sp->pad_boundary) fl->buf_boundary = sp->pad_boundary; c.iqns_to_fl0congen |= htobe32(V_FW_IQ_CMD_FL0HOSTFCMODE(X_HOSTFCMODE_NONE) | F_FW_IQ_CMD_FL0FETCHRO | F_FW_IQ_CMD_FL0DATARO | (fl_pad ? F_FW_IQ_CMD_FL0PADEN : 0) | (fl->flags & FL_BUF_PACKING ? 
F_FW_IQ_CMD_FL0PACKEN : 0)); if (cong >= 0) { c.iqns_to_fl0congen |= htobe32(V_FW_IQ_CMD_FL0CNGCHMAP(cong) | F_FW_IQ_CMD_FL0CONGCIF | F_FW_IQ_CMD_FL0CONGEN); } c.fl0dcaen_to_fl0cidxfthresh = htobe16(V_FW_IQ_CMD_FL0FBMIN(X_FETCHBURSTMIN_128B) | V_FW_IQ_CMD_FL0FBMAX(X_FETCHBURSTMAX_512B)); c.fl0size = htobe16(fl->qsize); c.fl0addr = htobe64(fl->ba); } rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(sc->dev, "failed to create ingress queue: %d\n", rc); return (rc); } iq->cidx = 0; iq->gen = F_RSPD_GEN; iq->intr_next = iq->intr_params; iq->cntxt_id = be16toh(c.iqid); iq->abs_id = be16toh(c.physiqid); iq->flags |= IQ_ALLOCATED; cntxt_id = iq->cntxt_id - sc->sge.iq_start; if (cntxt_id >= sc->sge.niq) { panic("%s: iq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.niq - 1); } sc->sge.iqmap[cntxt_id] = iq; if (fl) { u_int qid; iq->flags |= IQ_HAS_FL; fl->cntxt_id = be16toh(c.fl0id); fl->pidx = fl->cidx = 0; cntxt_id = fl->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) { panic("%s: fl->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); } sc->sge.eqmap[cntxt_id] = (void *)fl; qid = fl->cntxt_id; if (isset(&sc->doorbells, DOORBELL_UDB)) { uint32_t s_qpp = sc->params.sge.eq_s_qpp; uint32_t mask = (1 << s_qpp) - 1; volatile uint8_t *udb; udb = sc->udbs_base + UDBS_DB_OFFSET; udb += (qid >> s_qpp) << PAGE_SHIFT; qid &= mask; if (qid < PAGE_SIZE / UDBS_SEG_SIZE) { udb += qid << UDBS_SEG_SHIFT; qid = 0; } fl->udb = (volatile void *)udb; } fl->dbval = V_QID(qid) | sc->chip_params->sge_fl_db; FL_LOCK(fl); /* Enough to make sure the SGE doesn't think it's starved */ refill_fl(sc, fl, fl->lowat); FL_UNLOCK(fl); } if (is_t5(sc) && !(sc->flags & IS_VF) && cong >= 0) { uint32_t param, val; param = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DMAQ) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DMAQ_CONM_CTXT) | V_FW_PARAMS_PARAM_YZ(iq->cntxt_id); if (cong == 0) val = 1 << 19; else { val = 2 << 19; for (i = 0; i < 4; i++) { if (cong & (1 << i)) val |= 1 << (i << 2); } } rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &param, &val); if (rc != 0) { /* report error but carry on */ device_printf(sc->dev, "failed to set congestion manager context for " "ingress queue %d: %d\n", iq->cntxt_id, rc); } } /* Enable IQ interrupts */ atomic_store_rel_int(&iq->state, IQS_IDLE); t4_write_reg(sc, sc->sge_gts_reg, V_SEINTARM(iq->intr_params) | V_INGRESSQID(iq->cntxt_id)); return (0); } static int free_iq_fl(struct vi_info *vi, struct sge_iq *iq, struct sge_fl *fl) { int rc; struct adapter *sc = iq->adapter; device_t dev; if (sc == NULL) return (0); /* nothing to do */ dev = vi ? vi->dev : sc->dev; if (iq->flags & IQ_ALLOCATED) { rc = -t4_iq_free(sc, sc->mbox, sc->pf, 0, FW_IQ_TYPE_FL_INT_CAP, iq->cntxt_id, fl ?
fl->cntxt_id : 0xffff, 0xffff); if (rc != 0) { device_printf(dev, "failed to free queue %p: %d\n", iq, rc); return (rc); } iq->flags &= ~IQ_ALLOCATED; } free_ring(sc, iq->desc_tag, iq->desc_map, iq->ba, iq->desc); bzero(iq, sizeof(*iq)); if (fl) { free_ring(sc, fl->desc_tag, fl->desc_map, fl->ba, fl->desc); if (fl->sdesc) free_fl_sdesc(sc, fl); if (mtx_initialized(&fl->fl_lock)) mtx_destroy(&fl->fl_lock); bzero(fl, sizeof(*fl)); } return (0); } static void add_fl_sysctls(struct sysctl_ctx_list *ctx, struct sysctl_oid *oid, struct sge_fl *fl) { struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "fl", CTLFLAG_RD, NULL, "freelist"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &fl->cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the freelist"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "padding", CTLFLAG_RD, NULL, fl_pad ? 1 : 0, "padding enabled"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "packing", CTLFLAG_RD, NULL, fl->flags & FL_BUF_PACKING ? 1 : 0, "packing enabled"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cidx", CTLFLAG_RD, &fl->cidx, 0, "consumer index"); if (fl->flags & FL_BUF_PACKING) { SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "rx_offset", CTLFLAG_RD, &fl->rx_offset, 0, "packing rx offset"); } SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "pidx", CTLFLAG_RD, &fl->pidx, 0, "producer index"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "mbuf_allocated", CTLFLAG_RD, &fl->mbuf_allocated, "# of mbuf allocated"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "mbuf_inlined", CTLFLAG_RD, &fl->mbuf_inlined, "# of mbuf inlined in clusters"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_allocated", CTLFLAG_RD, &fl->cl_allocated, "# of clusters allocated"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_recycled", CTLFLAG_RD, &fl->cl_recycled, "# of clusters recycled"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "cluster_fast_recycled", CTLFLAG_RD, &fl->cl_fast_recycled, "# of clusters recycled (fast)"); } static int alloc_fwq(struct adapter *sc) { int rc, intr_idx; struct sge_iq *fwq = &sc->sge.fwq; struct sysctl_oid *oid = device_get_sysctl_tree(sc->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); init_iq(fwq, sc, 0, 0, FW_IQ_QSIZE); fwq->flags |= IQ_INTR; /* always */ if (sc->flags & IS_VF) intr_idx = 0; else { intr_idx = sc->intr_count > 1 ? 
1 : 0; fwq->set_tcb_rpl = t4_filter_rpl; fwq->l2t_write_rpl = do_l2t_write_rpl; } rc = alloc_iq_fl(&sc->port[0]->vi[0], fwq, NULL, intr_idx, -1); if (rc != 0) { device_printf(sc->dev, "failed to create firmware event queue: %d\n", rc); return (rc); } oid = SYSCTL_ADD_NODE(&sc->ctx, children, OID_AUTO, "fwq", CTLFLAG_RD, NULL, "firmware event queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &fwq->abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &fwq->cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&sc->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &fwq->cidx, 0, sysctl_uint16, "I", "consumer index"); return (0); } static int free_fwq(struct adapter *sc) { return free_iq_fl(NULL, &sc->sge.fwq, NULL); } static int alloc_mgmtq(struct adapter *sc) { int rc; struct sge_wrq *mgmtq = &sc->sge.mgmtq; char name[16]; struct sysctl_oid *oid = device_get_sysctl_tree(sc->dev); struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(&sc->ctx, children, OID_AUTO, "mgmtq", CTLFLAG_RD, NULL, "management queue"); snprintf(name, sizeof(name), "%s mgmtq", device_get_nameunit(sc->dev)); init_eq(sc, &mgmtq->eq, EQ_CTRL, CTRL_EQ_QSIZE, sc->port[0]->tx_chan, sc->sge.fwq.cntxt_id, name); rc = alloc_wrq(sc, NULL, mgmtq, oid); if (rc != 0) { device_printf(sc->dev, "failed to create management queue: %d\n", rc); return (rc); } return (0); } static int free_mgmtq(struct adapter *sc) { return free_wrq(sc, &sc->sge.mgmtq); } int tnl_cong(struct port_info *pi, int drop) { if (drop == -1) return (-1); else if (drop == 1) return (0); else return (pi->rx_chan_map); } static int alloc_rxq(struct vi_info *vi, struct sge_rxq *rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct adapter *sc = vi->pi->adapter; struct sysctl_oid_list *children; char name[16]; rc = alloc_iq_fl(vi, &rxq->iq, &rxq->fl, intr_idx, tnl_cong(vi->pi, cong_drop)); if (rc != 0) return (rc); if (idx == 0) sc->sge.iq_base = rxq->iq.abs_id - rxq->iq.cntxt_id; else KASSERT(rxq->iq.cntxt_id + sc->sge.iq_base == rxq->iq.abs_id, ("iq_base mismatch")); KASSERT(sc->sge.iq_base == 0 || sc->flags & IS_VF, ("PF with non-zero iq_base")); /* * The freelist is just barely above the starvation threshold right now, * fill it up a bit more. 
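* (refill_fl below tops it up with as many as 128 buffers, i.e. 16 hardware descriptors at 8 buffers apiece.)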
*/ FL_LOCK(&rxq->fl); refill_fl(sc, &rxq->fl, 128); FL_UNLOCK(&rxq->fl); #if defined(INET) || defined(INET6) rc = tcp_lro_init(&rxq->lro); if (rc != 0) return (rc); rxq->lro.ifp = vi->ifp; /* also indicates LRO init'ed */ if (vi->ifp->if_capenable & IFCAP_LRO) rxq->iq.flags |= IQ_LRO_ENABLED; #endif rxq->ifp = vi->ifp; children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &rxq->iq.cidx, 0, sysctl_uint16, "I", "consumer index"); #if defined(INET) || defined(INET6) SYSCTL_ADD_U64(&vi->ctx, children, OID_AUTO, "lro_queued", CTLFLAG_RD, &rxq->lro.lro_queued, 0, NULL); SYSCTL_ADD_U64(&vi->ctx, children, OID_AUTO, "lro_flushed", CTLFLAG_RD, &rxq->lro.lro_flushed, 0, NULL); #endif SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "rxcsum", CTLFLAG_RD, &rxq->rxcsum, "# of times hardware assisted with checksum"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "vlan_extraction", CTLFLAG_RD, &rxq->vlan_extraction, "# of times hardware extracted 802.1Q tag"); add_fl_sysctls(&vi->ctx, oid, &rxq->fl); return (rc); } static int free_rxq(struct vi_info *vi, struct sge_rxq *rxq) { int rc; #if defined(INET) || defined(INET6) if (rxq->lro.ifp) { tcp_lro_free(&rxq->lro); rxq->lro.ifp = NULL; } #endif rc = free_iq_fl(vi, &rxq->iq, &rxq->fl); if (rc == 0) bzero(rxq, sizeof(*rxq)); return (rc); } #ifdef TCP_OFFLOAD static int alloc_ofld_rxq(struct vi_info *vi, struct sge_ofld_rxq *ofld_rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct sysctl_oid_list *children; char name[16]; rc = alloc_iq_fl(vi, &ofld_rxq->iq, &ofld_rxq->fl, intr_idx, vi->pi->rx_chan_map); if (rc != 0) return (rc); children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &ofld_rxq->iq.cidx, 0, sysctl_uint16, "I", "consumer index"); add_fl_sysctls(&vi->ctx, oid, &ofld_rxq->fl); return (rc); } static int free_ofld_rxq(struct vi_info *vi, struct sge_ofld_rxq *ofld_rxq) { int rc; rc = free_iq_fl(vi, &ofld_rxq->iq, &ofld_rxq->fl); if (rc == 0) bzero(ofld_rxq, sizeof(*ofld_rxq)); return (rc); } #endif #ifdef DEV_NETMAP static int alloc_nm_rxq(struct vi_info *vi, struct sge_nm_rxq *nm_rxq, int intr_idx, int idx, struct sysctl_oid *oid) { int rc; struct sysctl_oid_list *children; struct sysctl_ctx_list *ctx; char name[16]; size_t len; struct adapter *sc = vi->pi->adapter; struct netmap_adapter *na = NA(vi->ifp); MPASS(na != NULL); len = vi->qsize_rxq * IQ_ESIZE; rc = alloc_ring(sc, len, &nm_rxq->iq_desc_tag, &nm_rxq->iq_desc_map, &nm_rxq->iq_ba, (void **)&nm_rxq->iq_desc); if (rc != 0) 
return (rc); len = na->num_rx_desc * EQ_ESIZE + sc->params.sge.spg_len; rc = alloc_ring(sc, len, &nm_rxq->fl_desc_tag, &nm_rxq->fl_desc_map, &nm_rxq->fl_ba, (void **)&nm_rxq->fl_desc); if (rc != 0) return (rc); nm_rxq->vi = vi; nm_rxq->nid = idx; nm_rxq->iq_cidx = 0; nm_rxq->iq_sidx = vi->qsize_rxq - sc->params.sge.spg_len / IQ_ESIZE; nm_rxq->iq_gen = F_RSPD_GEN; nm_rxq->fl_pidx = nm_rxq->fl_cidx = 0; nm_rxq->fl_sidx = na->num_rx_desc; nm_rxq->intr_idx = intr_idx; ctx = &vi->ctx; children = SYSCTL_CHILDREN(oid); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "rx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "abs_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_abs_id, 0, sysctl_uint16, "I", "absolute id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->iq_cidx, 0, sysctl_uint16, "I", "consumer index"); children = SYSCTL_CHILDREN(oid); oid = SYSCTL_ADD_NODE(ctx, children, OID_AUTO, "fl", CTLFLAG_RD, NULL, "freelist"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cntxt_id", CTLTYPE_INT | CTLFLAG_RD, &nm_rxq->fl_cntxt_id, 0, sysctl_uint16, "I", "SGE context id of the freelist"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cidx", CTLFLAG_RD, &nm_rxq->fl_cidx, 0, "consumer index"); SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "pidx", CTLFLAG_RD, &nm_rxq->fl_pidx, 0, "producer index"); return (rc); } static int free_nm_rxq(struct vi_info *vi, struct sge_nm_rxq *nm_rxq) { struct adapter *sc = vi->pi->adapter; free_ring(sc, nm_rxq->iq_desc_tag, nm_rxq->iq_desc_map, nm_rxq->iq_ba, nm_rxq->iq_desc); free_ring(sc, nm_rxq->fl_desc_tag, nm_rxq->fl_desc_map, nm_rxq->fl_ba, nm_rxq->fl_desc); return (0); } static int alloc_nm_txq(struct vi_info *vi, struct sge_nm_txq *nm_txq, int iqidx, int idx, struct sysctl_oid *oid) { int rc; size_t len; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct netmap_adapter *na = NA(vi->ifp); char name[16]; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); len = na->num_tx_desc * EQ_ESIZE + sc->params.sge.spg_len; rc = alloc_ring(sc, len, &nm_txq->desc_tag, &nm_txq->desc_map, &nm_txq->ba, (void **)&nm_txq->desc); if (rc) return (rc); nm_txq->pidx = nm_txq->cidx = 0; nm_txq->sidx = na->num_tx_desc; nm_txq->nid = idx; nm_txq->iqidx = iqidx; nm_txq->cpl_ctrl0 = htobe32(V_TXPKT_OPCODE(CPL_TX_PKT) | V_TXPKT_INTF(pi->tx_chan) | V_TXPKT_VF_VLD(1) | V_TXPKT_VF(vi->viid)); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "netmap tx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_UINT(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &nm_txq->cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &nm_txq->cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &nm_txq->pidx, 0, sysctl_uint16, "I", "producer index"); return (rc); } static int free_nm_txq(struct vi_info *vi, struct sge_nm_txq *nm_txq) { struct adapter *sc = vi->pi->adapter; free_ring(sc, nm_txq->desc_tag, nm_txq->desc_map, nm_txq->ba, nm_txq->desc); return (0); } #endif static int ctrl_eq_alloc(struct adapter *sc, struct sge_eq *eq) { int rc, cntxt_id; 
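/* The queue size reported to the firmware includes the status page that sits after the eq->sidx usable descriptors; spg_len / EQ_ESIZE is the number of descriptor-sized entries it occupies (one entry when both are 64 bytes, a common configuration). */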
struct fw_eq_ctrl_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_EQ_CTRL_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_CTRL_CMD_PFN(sc->pf) | V_FW_EQ_CTRL_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_EQ_CTRL_CMD_ALLOC | F_FW_EQ_CTRL_CMD_EQSTART | FW_LEN16(c)); c.cmpliqid_eqid = htonl(V_FW_EQ_CTRL_CMD_CMPLIQID(eq->iqid)); c.physeqid_pkd = htobe32(0); c.fetchszm_to_iqid = htobe32(V_FW_EQ_CTRL_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_CTRL_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_CTRL_CMD_FETCHRO | V_FW_EQ_CTRL_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_CTRL_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_CTRL_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_CTRL_CMD_EQSIZE(qsize)); c.eqaddr = htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(sc->dev, "failed to create control queue %d: %d\n", eq->tx_chan, rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_CTRL_CMD_EQID(be32toh(c.cmpliqid_eqid)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } static int eth_eq_alloc(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, cntxt_id; struct fw_eq_eth_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htobe32(V_FW_CMD_OP(FW_EQ_ETH_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_ETH_CMD_PFN(sc->pf) | V_FW_EQ_ETH_CMD_VFN(0)); c.alloc_to_len16 = htobe32(F_FW_EQ_ETH_CMD_ALLOC | F_FW_EQ_ETH_CMD_EQSTART | FW_LEN16(c)); c.autoequiqe_to_viid = htobe32(F_FW_EQ_ETH_CMD_AUTOEQUIQE | F_FW_EQ_ETH_CMD_AUTOEQUEQE | V_FW_EQ_ETH_CMD_VIID(vi->viid)); c.fetchszm_to_iqid = htobe32(V_FW_EQ_ETH_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_ETH_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_ETH_CMD_FETCHRO | V_FW_EQ_ETH_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_ETH_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_ETH_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_ETH_CMD_EQSIZE(qsize)); c.eqaddr = htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(vi->dev, "failed to create Ethernet egress queue: %d\n", rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_ETH_CMD_EQID(be32toh(c.eqid_pkd)); eq->abs_id = G_FW_EQ_ETH_CMD_PHYSEQID(be32toh(c.physeqid_pkd)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } #ifdef TCP_OFFLOAD static int ofld_eq_alloc(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, cntxt_id; struct fw_eq_ofld_cmd c; int qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; bzero(&c, sizeof(c)); c.op_to_vfn = htonl(V_FW_CMD_OP(FW_EQ_OFLD_CMD) | F_FW_CMD_REQUEST | F_FW_CMD_WRITE | F_FW_CMD_EXEC | V_FW_EQ_OFLD_CMD_PFN(sc->pf) | V_FW_EQ_OFLD_CMD_VFN(0)); c.alloc_to_len16 = htonl(F_FW_EQ_OFLD_CMD_ALLOC | F_FW_EQ_OFLD_CMD_EQSTART | FW_LEN16(c)); c.fetchszm_to_iqid = htonl(V_FW_EQ_OFLD_CMD_HOSTFCMODE(X_HOSTFCMODE_NONE) | V_FW_EQ_OFLD_CMD_PCIECHN(eq->tx_chan) | F_FW_EQ_OFLD_CMD_FETCHRO | V_FW_EQ_OFLD_CMD_IQID(eq->iqid)); c.dcaen_to_eqsize = htobe32(V_FW_EQ_OFLD_CMD_FBMIN(X_FETCHBURSTMIN_64B) | V_FW_EQ_OFLD_CMD_FBMAX(X_FETCHBURSTMAX_512B) | V_FW_EQ_OFLD_CMD_EQSIZE(qsize)); c.eqaddr = 
htobe64(eq->ba); rc = -t4_wr_mbox(sc, sc->mbox, &c, sizeof(c), &c); if (rc != 0) { device_printf(vi->dev, "failed to create egress queue for TCP offload: %d\n", rc); return (rc); } eq->flags |= EQ_ALLOCATED; eq->cntxt_id = G_FW_EQ_OFLD_CMD_EQID(be32toh(c.eqid_pkd)); cntxt_id = eq->cntxt_id - sc->sge.eq_start; if (cntxt_id >= sc->sge.neq) panic("%s: eq->cntxt_id (%d) more than the max (%d)", __func__, cntxt_id, sc->sge.neq - 1); sc->sge.eqmap[cntxt_id] = eq; return (rc); } #endif static int alloc_eq(struct adapter *sc, struct vi_info *vi, struct sge_eq *eq) { int rc, qsize; size_t len; mtx_init(&eq->eq_lock, eq->lockname, NULL, MTX_DEF); qsize = eq->sidx + sc->params.sge.spg_len / EQ_ESIZE; len = qsize * EQ_ESIZE; rc = alloc_ring(sc, len, &eq->desc_tag, &eq->desc_map, &eq->ba, (void **)&eq->desc); if (rc) return (rc); eq->pidx = eq->cidx = 0; eq->equeqidx = eq->dbidx = 0; eq->doorbells = sc->doorbells; switch (eq->flags & EQ_TYPEMASK) { case EQ_CTRL: rc = ctrl_eq_alloc(sc, eq); break; case EQ_ETH: rc = eth_eq_alloc(sc, vi, eq); break; #ifdef TCP_OFFLOAD case EQ_OFLD: rc = ofld_eq_alloc(sc, vi, eq); break; #endif default: panic("%s: invalid eq type %d.", __func__, eq->flags & EQ_TYPEMASK); } if (rc != 0) { device_printf(sc->dev, "failed to allocate egress queue(%d): %d\n", eq->flags & EQ_TYPEMASK, rc); } if (isset(&eq->doorbells, DOORBELL_UDB) || isset(&eq->doorbells, DOORBELL_UDBWC) || isset(&eq->doorbells, DOORBELL_WCWR)) { uint32_t s_qpp = sc->params.sge.eq_s_qpp; uint32_t mask = (1 << s_qpp) - 1; volatile uint8_t *udb; udb = sc->udbs_base + UDBS_DB_OFFSET; udb += (eq->cntxt_id >> s_qpp) << PAGE_SHIFT; /* pg offset */ eq->udb_qid = eq->cntxt_id & mask; /* id in page */ if (eq->udb_qid >= PAGE_SIZE / UDBS_SEG_SIZE) clrbit(&eq->doorbells, DOORBELL_WCWR); else { udb += eq->udb_qid << UDBS_SEG_SHIFT; /* seg offset */ eq->udb_qid = 0; } eq->udb = (volatile void *)udb; } return (rc); } static int free_eq(struct adapter *sc, struct sge_eq *eq) { int rc; if (eq->flags & EQ_ALLOCATED) { switch (eq->flags & EQ_TYPEMASK) { case EQ_CTRL: rc = -t4_ctrl_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; case EQ_ETH: rc = -t4_eth_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; #ifdef TCP_OFFLOAD case EQ_OFLD: rc = -t4_ofld_eq_free(sc, sc->mbox, sc->pf, 0, eq->cntxt_id); break; #endif default: panic("%s: invalid eq type %d.", __func__, eq->flags & EQ_TYPEMASK); } if (rc != 0) { device_printf(sc->dev, "failed to free egress queue (%d): %d\n", eq->flags & EQ_TYPEMASK, rc); return (rc); } eq->flags &= ~EQ_ALLOCATED; } free_ring(sc, eq->desc_tag, eq->desc_map, eq->ba, eq->desc); if (mtx_initialized(&eq->eq_lock)) mtx_destroy(&eq->eq_lock); bzero(eq, sizeof(*eq)); return (0); } static int alloc_wrq(struct adapter *sc, struct vi_info *vi, struct sge_wrq *wrq, struct sysctl_oid *oid) { int rc; struct sysctl_ctx_list *ctx = vi ? 
&vi->ctx : &sc->ctx; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); rc = alloc_eq(sc, vi, &wrq->eq); if (rc) return (rc); wrq->adapter = sc; TASK_INIT(&wrq->wrq_tx_task, 0, wrq_tx_drain, wrq); TAILQ_INIT(&wrq->incomplete_wrs); STAILQ_INIT(&wrq->wr_list); wrq->nwr_pending = 0; wrq->ndesc_needed = 0; SYSCTL_ADD_UINT(ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &wrq->eq.cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &wrq->eq.cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &wrq->eq.pidx, 0, sysctl_uint16, "I", "producer index"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "tx_wrs_direct", CTLFLAG_RD, &wrq->tx_wrs_direct, "# of work requests (direct)"); SYSCTL_ADD_UQUAD(ctx, children, OID_AUTO, "tx_wrs_copied", CTLFLAG_RD, &wrq->tx_wrs_copied, "# of work requests (copied)"); return (rc); } static int free_wrq(struct adapter *sc, struct sge_wrq *wrq) { int rc; rc = free_eq(sc, &wrq->eq); if (rc) return (rc); bzero(wrq, sizeof(*wrq)); return (0); } static int alloc_txq(struct vi_info *vi, struct sge_txq *txq, int idx, struct sysctl_oid *oid) { int rc; struct port_info *pi = vi->pi; struct adapter *sc = pi->adapter; struct sge_eq *eq = &txq->eq; char name[16]; struct sysctl_oid_list *children = SYSCTL_CHILDREN(oid); rc = mp_ring_alloc(&txq->r, eq->sidx, txq, eth_tx, can_resume_eth_tx, M_CXGBE, M_WAITOK); if (rc != 0) { device_printf(sc->dev, "failed to allocate mp_ring: %d\n", rc); return (rc); } rc = alloc_eq(sc, vi, eq); if (rc != 0) { mp_ring_free(txq->r); txq->r = NULL; return (rc); } /* Can't fail after this point. */ if (idx == 0) sc->sge.eq_base = eq->abs_id - eq->cntxt_id; else KASSERT(eq->cntxt_id + sc->sge.eq_base == eq->abs_id, ("eq_base mismatch")); KASSERT(sc->sge.eq_base == 0 || sc->flags & IS_VF, ("PF with non-zero eq_base")); TASK_INIT(&txq->tx_reclaim_task, 0, tx_reclaim, eq); txq->ifp = vi->ifp; txq->gl = sglist_alloc(TX_SGL_SEGS, M_WAITOK); if (sc->flags & IS_VF) txq->cpl_ctrl0 = htobe32(V_TXPKT_OPCODE(CPL_TX_PKT_XT) | V_TXPKT_INTF(pi->tx_chan)); else txq->cpl_ctrl0 = htobe32(V_TXPKT_OPCODE(CPL_TX_PKT) | V_TXPKT_INTF(pi->tx_chan) | V_TXPKT_VF_VLD(1) | V_TXPKT_VF(vi->viid)); txq->tc_idx = -1; txq->sdesc = malloc(eq->sidx * sizeof(struct tx_sdesc), M_CXGBE, M_ZERO | M_WAITOK); snprintf(name, sizeof(name), "%d", idx); oid = SYSCTL_ADD_NODE(&vi->ctx, children, OID_AUTO, name, CTLFLAG_RD, NULL, "tx queue"); children = SYSCTL_CHILDREN(oid); SYSCTL_ADD_UINT(&vi->ctx, children, OID_AUTO, "abs_id", CTLFLAG_RD, &eq->abs_id, 0, "absolute id of the queue"); SYSCTL_ADD_UINT(&vi->ctx, children, OID_AUTO, "cntxt_id", CTLFLAG_RD, &eq->cntxt_id, 0, "SGE context id of the queue"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "cidx", CTLTYPE_INT | CTLFLAG_RD, &eq->cidx, 0, sysctl_uint16, "I", "consumer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "pidx", CTLTYPE_INT | CTLFLAG_RD, &eq->pidx, 0, sysctl_uint16, "I", "producer index"); SYSCTL_ADD_PROC(&vi->ctx, children, OID_AUTO, "tc", CTLTYPE_INT | CTLFLAG_RW, vi, idx, sysctl_tc, "I", "traffic class (-1 means none)"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txcsum", CTLFLAG_RD, &txq->txcsum, "# of times hardware assisted with checksum"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "vlan_insertion", CTLFLAG_RD, &txq->vlan_insertion, "# of times hardware inserted 802.1Q tag"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "tso_wrs", CTLFLAG_RD, &txq->tso_wrs, "# of TSO 
work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "imm_wrs", CTLFLAG_RD, &txq->imm_wrs, "# of work requests with immediate data"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "sgl_wrs", CTLFLAG_RD, &txq->sgl_wrs, "# of work requests with direct SGL"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkt_wrs", CTLFLAG_RD, &txq->txpkt_wrs, "# of txpkt work requests (one pkt/WR)"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts0_wrs", CTLFLAG_RD, &txq->txpkts0_wrs, "# of txpkts (type 0) work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts1_wrs", CTLFLAG_RD, &txq->txpkts1_wrs, "# of txpkts (type 1) work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts0_pkts", CTLFLAG_RD, &txq->txpkts0_pkts, "# of frames tx'd using type0 txpkts work requests"); SYSCTL_ADD_UQUAD(&vi->ctx, children, OID_AUTO, "txpkts1_pkts", CTLFLAG_RD, &txq->txpkts1_pkts, "# of frames tx'd using type1 txpkts work requests"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_enqueues", CTLFLAG_RD, &txq->r->enqueues, "# of enqueues to the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_drops", CTLFLAG_RD, &txq->r->drops, "# of drops in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_starts", CTLFLAG_RD, &txq->r->starts, "# of normal consumer starts in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_stalls", CTLFLAG_RD, &txq->r->stalls, "# of consumer stalls in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_restarts", CTLFLAG_RD, &txq->r->restarts, "# of consumer restarts in the mp_ring for this queue"); SYSCTL_ADD_COUNTER_U64(&vi->ctx, children, OID_AUTO, "r_abdications", CTLFLAG_RD, &txq->r->abdications, "# of consumer abdications in the mp_ring for this queue"); return (0); } static int free_txq(struct vi_info *vi, struct sge_txq *txq) { int rc; struct adapter *sc = vi->pi->adapter; struct sge_eq *eq = &txq->eq; rc = free_eq(sc, eq); if (rc) return (rc); sglist_free(txq->gl); free(txq->sdesc, M_CXGBE); mp_ring_free(txq->r); bzero(txq, sizeof(*txq)); return (0); } static void oneseg_dma_callback(void *arg, bus_dma_segment_t *segs, int nseg, int error) { bus_addr_t *ba = arg; KASSERT(nseg == 1, ("%s meant for single segment mappings only.", __func__)); *ba = error ? 0 : segs->ds_addr; } static inline void ring_fl_db(struct adapter *sc, struct sge_fl *fl) { uint32_t n, v; n = IDXDIFF(fl->pidx / 8, fl->dbidx, fl->sidx); MPASS(n > 0); wmb(); v = fl->dbval | V_PIDX(n); if (fl->udb) *fl->udb = htole32(v); else t4_write_reg(sc, sc->sge_kdoorbell_reg, v); IDXINCR(fl->dbidx, n, fl->sidx); } /* * Fills up the freelist by allocating up to 'n' buffers. Buffers that are * recycled do not count towards this allocation budget. * * Returns non-zero to indicate that this freelist should be added to the list * of starving freelists. */ static int refill_fl(struct adapter *sc, struct sge_fl *fl, int n) { __be64 *d; struct fl_sdesc *sd; uintptr_t pa; caddr_t cl; struct cluster_layout *cll; struct sw_zone_info *swz; struct cluster_metadata *clm; uint16_t max_pidx; uint16_t hw_cidx = fl->hw_cidx; /* stable snapshot */ FL_LOCK_ASSERT_OWNED(fl); /* * We always stop at the beginning of the hardware descriptor that's just * before the one with the hw cidx. This is to avoid hw pidx = hw cidx, * which would mean an empty freelist to the chip. */ max_pidx = __predict_false(hw_cidx == 0) ? 
fl->sidx - 1 : hw_cidx - 1; if (fl->pidx == max_pidx * 8) return (0); d = &fl->desc[fl->pidx]; sd = &fl->sdesc[fl->pidx]; cll = &fl->cll_def; /* default layout */ swz = &sc->sge.sw_zone_info[cll->zidx]; while (n > 0) { if (sd->cl != NULL) { if (sd->nmbuf == 0) { /* * Fast recycle without involving any atomics on * the cluster's metadata (if the cluster has * metadata). This happens when all frames * received in the cluster were small enough to * fit within a single mbuf each. */ fl->cl_fast_recycled++; #ifdef INVARIANTS clm = cl_metadata(sc, fl, &sd->cll, sd->cl); if (clm != NULL) MPASS(clm->refcount == 1); #endif goto recycled_fast; } /* * Cluster is guaranteed to have metadata. Clusters * without metadata always take the fast recycle path * when they're recycled. */ clm = cl_metadata(sc, fl, &sd->cll, sd->cl); MPASS(clm != NULL); if (atomic_fetchadd_int(&clm->refcount, -1) == 1) { fl->cl_recycled++; counter_u64_add(extfree_rels, 1); goto recycled; } sd->cl = NULL; /* gave up my reference */ } MPASS(sd->cl == NULL); alloc: cl = uma_zalloc(swz->zone, M_NOWAIT); if (__predict_false(cl == NULL)) { if (cll == &fl->cll_alt || fl->cll_alt.zidx == -1 || fl->cll_def.zidx == fl->cll_alt.zidx) break; /* fall back to the safe zone */ cll = &fl->cll_alt; swz = &sc->sge.sw_zone_info[cll->zidx]; goto alloc; } fl->cl_allocated++; n--; pa = pmap_kextract((vm_offset_t)cl); pa += cll->region1; sd->cl = cl; sd->cll = *cll; *d = htobe64(pa | cll->hwidx); clm = cl_metadata(sc, fl, cll, cl); if (clm != NULL) { recycled: #ifdef INVARIANTS clm->sd = sd; #endif clm->refcount = 1; } sd->nmbuf = 0; recycled_fast: d++; sd++; if (__predict_false(++fl->pidx % 8 == 0)) { uint16_t pidx = fl->pidx / 8; if (__predict_false(pidx == fl->sidx)) { fl->pidx = 0; pidx = 0; sd = fl->sdesc; d = fl->desc; } if (pidx == max_pidx) break; if (IDXDIFF(pidx, fl->dbidx, fl->sidx) >= 4) ring_fl_db(sc, fl); } } if (fl->pidx / 8 != fl->dbidx) ring_fl_db(sc, fl); return (FL_RUNNING_LOW(fl) && !(fl->flags & FL_STARVING)); } /* * Attempt to refill all starving freelists. 
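* Each pass hands every starving freelist up to 64 more buffers, and the callout reschedules itself hz / 5 ticks later for as long as any freelist remains on the starving list.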
*/ static void refill_sfl(void *arg) { struct adapter *sc = arg; struct sge_fl *fl, *fl_temp; mtx_assert(&sc->sfl_lock, MA_OWNED); TAILQ_FOREACH_SAFE(fl, &sc->sfl, link, fl_temp) { FL_LOCK(fl); refill_fl(sc, fl, 64); if (FL_NOT_RUNNING_LOW(fl) || fl->flags & FL_DOOMED) { TAILQ_REMOVE(&sc->sfl, fl, link); fl->flags &= ~FL_STARVING; } FL_UNLOCK(fl); } if (!TAILQ_EMPTY(&sc->sfl)) callout_schedule(&sc->sfl_callout, hz / 5); } static int alloc_fl_sdesc(struct sge_fl *fl) { fl->sdesc = malloc(fl->sidx * 8 * sizeof(struct fl_sdesc), M_CXGBE, M_ZERO | M_WAITOK); return (0); } static void free_fl_sdesc(struct adapter *sc, struct sge_fl *fl) { struct fl_sdesc *sd; struct cluster_metadata *clm; struct cluster_layout *cll; int i; sd = fl->sdesc; for (i = 0; i < fl->sidx * 8; i++, sd++) { if (sd->cl == NULL) continue; cll = &sd->cll; clm = cl_metadata(sc, fl, cll, sd->cl); if (sd->nmbuf == 0) uma_zfree(sc->sge.sw_zone_info[cll->zidx].zone, sd->cl); else if (clm && atomic_fetchadd_int(&clm->refcount, -1) == 1) { uma_zfree(sc->sge.sw_zone_info[cll->zidx].zone, sd->cl); counter_u64_add(extfree_rels, 1); } sd->cl = NULL; } free(fl->sdesc, M_CXGBE); fl->sdesc = NULL; } static inline void get_pkt_gl(struct mbuf *m, struct sglist *gl) { int rc; M_ASSERTPKTHDR(m); sglist_reset(gl); rc = sglist_append_mbuf(gl, m); if (__predict_false(rc != 0)) { panic("%s: mbuf %p (%d segs) was vetted earlier but now fails " "with %d.", __func__, m, mbuf_nsegs(m), rc); } KASSERT(gl->sg_nseg == mbuf_nsegs(m), ("%s: nsegs changed for mbuf %p from %d to %d", __func__, m, mbuf_nsegs(m), gl->sg_nseg)); KASSERT(gl->sg_nseg > 0 && gl->sg_nseg <= (needs_tso(m) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS), ("%s: %d segments, should have been 1 <= nsegs <= %d", __func__, gl->sg_nseg, needs_tso(m) ? TX_SGL_SEGS_TSO : TX_SGL_SEGS)); } /* * len16 for a txpkt WR with a GL. Includes the firmware work request header. */ static inline u_int txpkt_len16(u_int nsegs, u_int tso) { u_int n; MPASS(nsegs > 0); nsegs--; /* first segment is part of ulptx_sgl */ n = sizeof(struct fw_eth_tx_pkt_wr) + sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl) + 8 * ((3 * nsegs) / 2 + (nsegs & 1)); if (tso) n += sizeof(struct cpl_tx_pkt_lso_core); return (howmany(n, 16)); } /* * len16 for a txpkt_vm WR with a GL. Includes the firmware work * request header. */ static inline u_int txpkt_vm_len16(u_int nsegs, u_int tso) { u_int n; MPASS(nsegs > 0); nsegs--; /* first segment is part of ulptx_sgl */ n = sizeof(struct fw_eth_tx_pkt_vm_wr) + sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl) + 8 * ((3 * nsegs) / 2 + (nsegs & 1)); if (tso) n += sizeof(struct cpl_tx_pkt_lso_core); return (howmany(n, 16)); } /* * len16 for a txpkts type 0 WR with a GL. Does not include the firmware work * request header. */ static inline u_int txpkts0_len16(u_int nsegs) { u_int n; MPASS(nsegs > 0); nsegs--; /* first segment is part of ulptx_sgl */ n = sizeof(struct ulp_txpkt) + sizeof(struct ulptx_idata) + sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl) + 8 * ((3 * nsegs) / 2 + (nsegs & 1)); return (howmany(n, 16)); } /* * len16 for a txpkts type 1 WR with a GL. Does not include the firmware work * request header. 
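* (Assuming the usual 16-byte cpl_tx_pkt_core and 16-byte ulptx_sgl, this works out to howmany(32, 16) = 2, i.e. len16 = 2 for every type 1 packet.)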
*/ static inline u_int txpkts1_len16(void) { u_int n; n = sizeof(struct cpl_tx_pkt_core) + sizeof(struct ulptx_sgl); return (howmany(n, 16)); } static inline u_int imm_payload(u_int ndesc) { u_int n; n = ndesc * EQ_ESIZE - sizeof(struct fw_eth_tx_pkt_wr) - sizeof(struct cpl_tx_pkt_core); return (n); } /* * Write a VM txpkt WR for this packet to the hardware descriptors, update the * software descriptor, and advance the pidx. It is guaranteed that enough * descriptors are available. * * The return value is the # of hardware descriptors used. */ static u_int write_txpkt_vm_wr(struct sge_txq *txq, struct fw_eth_tx_pkt_vm_wr *wr, struct mbuf *m0, u_int available) { struct sge_eq *eq = &txq->eq; struct tx_sdesc *txsd; struct cpl_tx_pkt_core *cpl; uint32_t ctrl; /* used in many unrelated places */ uint64_t ctrl1; int csum_type, len16, ndesc, pktlen, nsegs; caddr_t dst; TXQ_LOCK_ASSERT_OWNED(txq); M_ASSERTPKTHDR(m0); MPASS(available > 0 && available < eq->sidx); len16 = mbuf_len16(m0); nsegs = mbuf_nsegs(m0); pktlen = m0->m_pkthdr.len; ctrl = sizeof(struct cpl_tx_pkt_core); if (needs_tso(m0)) ctrl += sizeof(struct cpl_tx_pkt_lso_core); ndesc = howmany(len16, EQ_ESIZE / 16); MPASS(ndesc <= available); /* Firmware work request header */ MPASS(wr == (void *)&eq->desc[eq->pidx]); wr->op_immdlen = htobe32(V_FW_WR_OP(FW_ETH_TX_PKT_VM_WR) | V_FW_ETH_TX_PKT_WR_IMMDLEN(ctrl)); ctrl = V_FW_WR_LEN16(len16); wr->equiq_to_len16 = htobe32(ctrl); wr->r3[0] = 0; wr->r3[1] = 0; /* * Copy over ethmacdst, ethmacsrc, ethtype, and vlantci. * vlantci is ignored unless the ethtype is 0x8100, so it's * simpler to always copy it rather than making it * conditional. Also, it seems that we do not have to set * vlantci or fake the ethtype when doing VLAN tag insertion. */ m_copydata(m0, 0, sizeof(struct ether_header) + 2, wr->ethmacdst); csum_type = -1; if (needs_tso(m0)) { struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1); KASSERT(m0->m_pkthdr.l2hlen > 0 && m0->m_pkthdr.l3hlen > 0 && m0->m_pkthdr.l4hlen > 0, ("%s: mbuf %p needs TSO but missing header lengths", __func__, m0)); ctrl = V_LSO_OPCODE(CPL_TX_PKT_LSO) | F_LSO_FIRST_SLICE | F_LSO_LAST_SLICE | V_LSO_IPHDR_LEN(m0->m_pkthdr.l3hlen >> 2) | V_LSO_TCPHDR_LEN(m0->m_pkthdr.l4hlen >> 2); if (m0->m_pkthdr.l2hlen == sizeof(struct ether_vlan_header)) ctrl |= V_LSO_ETHHDR_LEN(1); if (m0->m_pkthdr.l3hlen == sizeof(struct ip6_hdr)) ctrl |= F_LSO_IPV6; lso->lso_ctrl = htobe32(ctrl); lso->ipid_ofst = htobe16(0); lso->mss = htobe16(m0->m_pkthdr.tso_segsz); lso->seqno_offset = htobe32(0); lso->len = htobe32(pktlen); if (m0->m_pkthdr.l3hlen == sizeof(struct ip6_hdr)) csum_type = TX_CSUM_TCPIP6; else csum_type = TX_CSUM_TCPIP; cpl = (void *)(lso + 1); txq->tso_wrs++; } else { if (m0->m_pkthdr.csum_flags & CSUM_IP_TCP) csum_type = TX_CSUM_TCPIP; else if (m0->m_pkthdr.csum_flags & CSUM_IP_UDP) csum_type = TX_CSUM_UDPIP; else if (m0->m_pkthdr.csum_flags & CSUM_IP6_TCP) csum_type = TX_CSUM_TCPIP6; else if (m0->m_pkthdr.csum_flags & CSUM_IP6_UDP) csum_type = TX_CSUM_UDPIP6; #if defined(INET) else if (m0->m_pkthdr.csum_flags & CSUM_IP) { /* * XXX: The firmware appears to stomp on the * fragment/flags field of the IP header when * using TX_CSUM_IP. Fall back to doing * software checksums. 
*/ u_short *sump; struct mbuf *m; int offset; m = m0; offset = 0; sump = m_advance(&m, &offset, m0->m_pkthdr.l2hlen + offsetof(struct ip, ip_sum)); *sump = in_cksum_skip(m0, m0->m_pkthdr.l2hlen + m0->m_pkthdr.l3hlen, m0->m_pkthdr.l2hlen); m0->m_pkthdr.csum_flags &= ~CSUM_IP; } #endif cpl = (void *)(wr + 1); } /* Checksum offload */ ctrl1 = 0; if (needs_l3_csum(m0) == 0) ctrl1 |= F_TXPKT_IPCSUM_DIS; if (csum_type >= 0) { KASSERT(m0->m_pkthdr.l2hlen > 0 && m0->m_pkthdr.l3hlen > 0, ("%s: mbuf %p needs checksum offload but missing header lengths", __func__, m0)); /* XXX: T6 */ ctrl1 |= V_TXPKT_ETHHDR_LEN(m0->m_pkthdr.l2hlen - ETHER_HDR_LEN); ctrl1 |= V_TXPKT_IPHDR_LEN(m0->m_pkthdr.l3hlen); ctrl1 |= V_TXPKT_CSUM_TYPE(csum_type); } else ctrl1 |= F_TXPKT_L4CSUM_DIS; if (m0->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) txq->txcsum++; /* some hardware assistance provided */ /* VLAN tag insertion */ if (needs_vlan_insertion(m0)) { ctrl1 |= F_TXPKT_VLAN_VLD | V_TXPKT_VLAN(m0->m_pkthdr.ether_vtag); txq->vlan_insertion++; } /* CPL header */ cpl->ctrl0 = txq->cpl_ctrl0; cpl->pack = 0; cpl->len = htobe16(pktlen); cpl->ctrl1 = htobe64(ctrl1); /* SGL */ dst = (void *)(cpl + 1); /* * A packet using TSO will use up an entire descriptor for the * firmware work request header, LSO CPL, and TX_PKT_XT CPL. * If this descriptor is the last descriptor in the ring, wrap * around to the front of the ring explicitly for the start of * the sgl. */ if (dst == (void *)&eq->desc[eq->sidx]) { dst = (void *)&eq->desc[0]; write_gl_to_txd(txq, m0, &dst, 0); } else write_gl_to_txd(txq, m0, &dst, eq->sidx - ndesc < eq->pidx); txq->sgl_wrs++; txq->txpkt_wrs++; txsd = &txq->sdesc[eq->pidx]; txsd->m = m0; txsd->desc_used = ndesc; return (ndesc); } /* * Write a txpkt WR for this packet to the hardware descriptors, update the * software descriptor, and advance the pidx. It is guaranteed that enough * descriptors are available. * * The return value is the # of hardware descriptors used. */ static u_int write_txpkt_wr(struct sge_txq *txq, struct fw_eth_tx_pkt_wr *wr, struct mbuf *m0, u_int available) { struct sge_eq *eq = &txq->eq; struct tx_sdesc *txsd; struct cpl_tx_pkt_core *cpl; uint32_t ctrl; /* used in many unrelated places */ uint64_t ctrl1; int len16, ndesc, pktlen, nsegs; caddr_t dst; TXQ_LOCK_ASSERT_OWNED(txq); M_ASSERTPKTHDR(m0); MPASS(available > 0 && available < eq->sidx); len16 = mbuf_len16(m0); nsegs = mbuf_nsegs(m0); pktlen = m0->m_pkthdr.len; ctrl = sizeof(struct cpl_tx_pkt_core); if (needs_tso(m0)) ctrl += sizeof(struct cpl_tx_pkt_lso_core); else if (pktlen <= imm_payload(2) && available >= 2) { /* Immediate data. Recalculate len16 and set nsegs to 0. 
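* (imm_payload(2) is whatever fits in two descriptors after the WR header and CPL; e.g. with 64-byte descriptors and a 16-byte fw_eth_tx_pkt_wr and cpl_tx_pkt_core each, that is 2 * 64 - 16 - 16 = 96 bytes of frame data.)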
*/ ctrl += pktlen; len16 = howmany(sizeof(struct fw_eth_tx_pkt_wr) + sizeof(struct cpl_tx_pkt_core) + pktlen, 16); nsegs = 0; } ndesc = howmany(len16, EQ_ESIZE / 16); MPASS(ndesc <= available); /* Firmware work request header */ MPASS(wr == (void *)&eq->desc[eq->pidx]); wr->op_immdlen = htobe32(V_FW_WR_OP(FW_ETH_TX_PKT_WR) | V_FW_ETH_TX_PKT_WR_IMMDLEN(ctrl)); ctrl = V_FW_WR_LEN16(len16); wr->equiq_to_len16 = htobe32(ctrl); wr->r3 = 0; if (needs_tso(m0)) { struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1); KASSERT(m0->m_pkthdr.l2hlen > 0 && m0->m_pkthdr.l3hlen > 0 && m0->m_pkthdr.l4hlen > 0, ("%s: mbuf %p needs TSO but missing header lengths", __func__, m0)); ctrl = V_LSO_OPCODE(CPL_TX_PKT_LSO) | F_LSO_FIRST_SLICE | F_LSO_LAST_SLICE | V_LSO_IPHDR_LEN(m0->m_pkthdr.l3hlen >> 2) | V_LSO_TCPHDR_LEN(m0->m_pkthdr.l4hlen >> 2); if (m0->m_pkthdr.l2hlen == sizeof(struct ether_vlan_header)) ctrl |= V_LSO_ETHHDR_LEN(1); if (m0->m_pkthdr.l3hlen == sizeof(struct ip6_hdr)) ctrl |= F_LSO_IPV6; lso->lso_ctrl = htobe32(ctrl); lso->ipid_ofst = htobe16(0); lso->mss = htobe16(m0->m_pkthdr.tso_segsz); lso->seqno_offset = htobe32(0); lso->len = htobe32(pktlen); cpl = (void *)(lso + 1); txq->tso_wrs++; } else cpl = (void *)(wr + 1); /* Checksum offload */ ctrl1 = 0; if (needs_l3_csum(m0) == 0) ctrl1 |= F_TXPKT_IPCSUM_DIS; if (needs_l4_csum(m0) == 0) ctrl1 |= F_TXPKT_L4CSUM_DIS; if (m0->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) txq->txcsum++; /* some hardware assistance provided */ /* VLAN tag insertion */ if (needs_vlan_insertion(m0)) { ctrl1 |= F_TXPKT_VLAN_VLD | V_TXPKT_VLAN(m0->m_pkthdr.ether_vtag); txq->vlan_insertion++; } /* CPL header */ cpl->ctrl0 = txq->cpl_ctrl0; cpl->pack = 0; cpl->len = htobe16(pktlen); cpl->ctrl1 = htobe64(ctrl1); /* SGL */ dst = (void *)(cpl + 1); if (nsegs > 0) { write_gl_to_txd(txq, m0, &dst, eq->sidx - ndesc < eq->pidx); txq->sgl_wrs++; } else { struct mbuf *m; for (m = m0; m != NULL; m = m->m_next) { copy_to_txd(eq, mtod(m, caddr_t), &dst, m->m_len); #ifdef INVARIANTS pktlen -= m->m_len; #endif } #ifdef INVARIANTS KASSERT(pktlen == 0, ("%s: %d bytes left.", __func__, pktlen)); #endif txq->imm_wrs++; } txq->txpkt_wrs++; txsd = &txq->sdesc[eq->pidx]; txsd->m = m0; txsd->desc_used = ndesc; return (ndesc); } static int try_txpkts(struct mbuf *m, struct mbuf *n, struct txpkts *txp, u_int available) { u_int needed, nsegs1, nsegs2, l1, l2; if (cannot_use_txpkts(m) || cannot_use_txpkts(n)) return (1); nsegs1 = mbuf_nsegs(m); nsegs2 = mbuf_nsegs(n); if (nsegs1 + nsegs2 == 2) { txp->wr_type = 1; l1 = l2 = txpkts1_len16(); } else { txp->wr_type = 0; l1 = txpkts0_len16(nsegs1); l2 = txpkts0_len16(nsegs2); } txp->len16 = howmany(sizeof(struct fw_eth_tx_pkts_wr), 16) + l1 + l2; needed = howmany(txp->len16, EQ_ESIZE / 16); if (needed > SGE_MAX_WR_NDESC || needed > available) return (1); txp->plen = m->m_pkthdr.len + n->m_pkthdr.len; if (txp->plen > 65535) return (1); txp->npkt = 2; set_mbuf_len16(m, l1); set_mbuf_len16(n, l2); return (0); } static int add_to_txpkts(struct mbuf *m, struct txpkts *txp, u_int available) { u_int plen, len16, needed, nsegs; MPASS(txp->wr_type == 0 || txp->wr_type == 1); nsegs = mbuf_nsegs(m); if (needs_tso(m) || (txp->wr_type == 1 && nsegs != 1)) return (1); plen = txp->plen + m->m_pkthdr.len; if (plen > 65535) return (1); if (txp->wr_type == 0) len16 = txpkts0_len16(nsegs); else len16 = txpkts1_len16(); needed = howmany(txp->len16 + len16, EQ_ESIZE / 16); if (needed > SGE_MAX_WR_NDESC || needed > available) 
return (1); txp->npkt++; txp->plen = plen; txp->len16 += len16; set_mbuf_len16(m, len16); return (0); } /* * Write a txpkts WR for the packets in txp to the hardware descriptors, update * the software descriptor, and advance the pidx. It is guaranteed that enough * descriptors are available. * * The return value is the # of hardware descriptors used. */ static u_int write_txpkts_wr(struct sge_txq *txq, struct fw_eth_tx_pkts_wr *wr, struct mbuf *m0, const struct txpkts *txp, u_int available) { struct sge_eq *eq = &txq->eq; struct tx_sdesc *txsd; struct cpl_tx_pkt_core *cpl; uint32_t ctrl; uint64_t ctrl1; int ndesc, checkwrap; struct mbuf *m; void *flitp; TXQ_LOCK_ASSERT_OWNED(txq); MPASS(txp->npkt > 0); MPASS(txp->plen < 65536); MPASS(m0 != NULL); MPASS(m0->m_nextpkt != NULL); MPASS(txp->len16 <= howmany(SGE_MAX_WR_LEN, 16)); MPASS(available > 0 && available < eq->sidx); ndesc = howmany(txp->len16, EQ_ESIZE / 16); MPASS(ndesc <= available); MPASS(wr == (void *)&eq->desc[eq->pidx]); wr->op_pkd = htobe32(V_FW_WR_OP(FW_ETH_TX_PKTS_WR)); ctrl = V_FW_WR_LEN16(txp->len16); wr->equiq_to_len16 = htobe32(ctrl); wr->plen = htobe16(txp->plen); wr->npkt = txp->npkt; wr->r3 = 0; wr->type = txp->wr_type; flitp = wr + 1; /* * At this point we are 16B into a hardware descriptor. If checkwrap is * set then we know the WR is going to wrap around somewhere. We'll * check for that at appropriate points. */ checkwrap = eq->sidx - ndesc < eq->pidx; for (m = m0; m != NULL; m = m->m_nextpkt) { if (txp->wr_type == 0) { struct ulp_txpkt *ulpmc; struct ulptx_idata *ulpsc; /* ULP master command */ ulpmc = flitp; ulpmc->cmd_dest = htobe32(V_ULPTX_CMD(ULP_TX_PKT) | V_ULP_TXPKT_DEST(0) | V_ULP_TXPKT_FID(eq->iqid)); ulpmc->len = htobe32(mbuf_len16(m)); /* ULP subcommand */ ulpsc = (void *)(ulpmc + 1); ulpsc->cmd_more = htobe32(V_ULPTX_CMD(ULP_TX_SC_IMM) | F_ULP_TX_SC_MORE); ulpsc->len = htobe32(sizeof(struct cpl_tx_pkt_core)); cpl = (void *)(ulpsc + 1); if (checkwrap && (uintptr_t)cpl == (uintptr_t)&eq->desc[eq->sidx]) cpl = (void *)&eq->desc[0]; txq->txpkts0_pkts += txp->npkt; txq->txpkts0_wrs++; } else { cpl = flitp; txq->txpkts1_pkts += txp->npkt; txq->txpkts1_wrs++; } /* Checksum offload */ ctrl1 = 0; if (needs_l3_csum(m) == 0) ctrl1 |= F_TXPKT_IPCSUM_DIS; if (needs_l4_csum(m) == 0) ctrl1 |= F_TXPKT_L4CSUM_DIS; if (m->m_pkthdr.csum_flags & (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_UDP_IPV6 | CSUM_TCP_IPV6 | CSUM_TSO)) txq->txcsum++; /* some hardware assistance provided */ /* VLAN tag insertion */ if (needs_vlan_insertion(m)) { ctrl1 |= F_TXPKT_VLAN_VLD | V_TXPKT_VLAN(m->m_pkthdr.ether_vtag); txq->vlan_insertion++; } /* CPL header */ cpl->ctrl0 = txq->cpl_ctrl0; cpl->pack = 0; cpl->len = htobe16(m->m_pkthdr.len); cpl->ctrl1 = htobe64(ctrl1); flitp = cpl + 1; if (checkwrap && (uintptr_t)flitp == (uintptr_t)&eq->desc[eq->sidx]) flitp = (void *)&eq->desc[0]; write_gl_to_txd(txq, m, (caddr_t *)(&flitp), checkwrap); } txsd = &txq->sdesc[eq->pidx]; txsd->m = m0; txsd->desc_used = ndesc; return (ndesc); } /* * If the SGL ends on an address that is not 16 byte aligned, this function will * add a 0 filled flit at the end. 
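* (The flit count computed below is 2 header flits plus ceil(3 * (nsegs - 1) / 2) for the segments past the first, since two segment lengths pack into one flit and each address takes a full flit; e.g. nsegs = 5 gives (3 * 4) / 2 + 0 + 2 = 8 flits.)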
*/ static void write_gl_to_txd(struct sge_txq *txq, struct mbuf *m, caddr_t *to, int checkwrap) { struct sge_eq *eq = &txq->eq; struct sglist *gl = txq->gl; struct sglist_seg *seg; __be64 *flitp, *wrap; struct ulptx_sgl *usgl; int i, nflits, nsegs; KASSERT(((uintptr_t)(*to) & 0xf) == 0, ("%s: SGL must start at a 16 byte boundary: %p", __func__, *to)); MPASS((uintptr_t)(*to) >= (uintptr_t)&eq->desc[0]); MPASS((uintptr_t)(*to) < (uintptr_t)&eq->desc[eq->sidx]); get_pkt_gl(m, gl); nsegs = gl->sg_nseg; MPASS(nsegs > 0); nflits = (3 * (nsegs - 1)) / 2 + ((nsegs - 1) & 1) + 2; flitp = (__be64 *)(*to); wrap = (__be64 *)(&eq->desc[eq->sidx]); seg = &gl->sg_segs[0]; usgl = (void *)flitp; /* * We start at a 16 byte boundary somewhere inside the tx descriptor * ring, so we're at least 16 bytes away from the status page. There is * no chance of a wrap around in the middle of usgl (which is 16 bytes). */ usgl->cmd_nsge = htobe32(V_ULPTX_CMD(ULP_TX_SC_DSGL) | V_ULPTX_NSGE(nsegs)); usgl->len0 = htobe32(seg->ss_len); usgl->addr0 = htobe64(seg->ss_paddr); seg++; if (checkwrap == 0 || (uintptr_t)(flitp + nflits) <= (uintptr_t)wrap) { /* Won't wrap around at all */ for (i = 0; i < nsegs - 1; i++, seg++) { usgl->sge[i / 2].len[i & 1] = htobe32(seg->ss_len); usgl->sge[i / 2].addr[i & 1] = htobe64(seg->ss_paddr); } if (i & 1) usgl->sge[i / 2].len[1] = htobe32(0); flitp += nflits; } else { /* Will wrap somewhere in the rest of the SGL */ /* 2 flits already written, write the rest flit by flit */ flitp = (void *)(usgl + 1); for (i = 0; i < nflits - 2; i++) { if (flitp == wrap) flitp = (void *)eq->desc; *flitp++ = get_flit(seg, nsegs - 1, i); } } if (nflits & 1) { MPASS(((uintptr_t)flitp) & 0xf); *flitp++ = 0; } MPASS((((uintptr_t)flitp) & 0xf) == 0); if (__predict_false(flitp == wrap)) *to = (void *)eq->desc; else *to = (void *)flitp; } static inline void copy_to_txd(struct sge_eq *eq, caddr_t from, caddr_t *to, int len) { MPASS((uintptr_t)(*to) >= (uintptr_t)&eq->desc[0]); MPASS((uintptr_t)(*to) < (uintptr_t)&eq->desc[eq->sidx]); if (__predict_true((uintptr_t)(*to) + len <= (uintptr_t)&eq->desc[eq->sidx])) { bcopy(from, *to, len); (*to) += len; } else { int portion = (uintptr_t)&eq->desc[eq->sidx] - (uintptr_t)(*to); bcopy(from, *to, portion); from += portion; portion = len - portion; /* remaining */ bcopy(from, (void *)eq->desc, portion); (*to) = (caddr_t)eq->desc + portion; } } static inline void ring_eq_db(struct adapter *sc, struct sge_eq *eq, u_int n) { u_int db; MPASS(n > 0); db = eq->doorbells; if (n > 1) clrbit(&db, DOORBELL_WCWR); wmb(); switch (ffs(db) - 1) { case DOORBELL_UDB: *eq->udb = htole32(V_QID(eq->udb_qid) | V_PIDX(n)); break; case DOORBELL_WCWR: { volatile uint64_t *dst, *src; int i; /* * Queues whose 128B doorbell segment fits in the page do not * use relative qid (udb_qid is always 0). Only queues with * doorbell segments can do WCWR. 
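* (A WCWR doorbell write-combines the entire descriptor into the doorbell segment rather than just bumping the pidx, which is why the KASSERT below insists on a single-descriptor write with n == 1 and udb_qid == 0.)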
*/ KASSERT(eq->udb_qid == 0 && n == 1, ("%s: inappropriate doorbell (0x%x, %d, %d) for eq %p", __func__, eq->doorbells, n, eq->dbidx, eq)); dst = (volatile void *)((uintptr_t)eq->udb + UDBS_WR_OFFSET - UDBS_DB_OFFSET); i = eq->dbidx; src = (void *)&eq->desc[i]; while (src != (void *)&eq->desc[i + 1]) *dst++ = *src++; wmb(); break; } case DOORBELL_UDBWC: *eq->udb = htole32(V_QID(eq->udb_qid) | V_PIDX(n)); wmb(); break; case DOORBELL_KDB: t4_write_reg(sc, sc->sge_kdoorbell_reg, V_QID(eq->cntxt_id) | V_PIDX(n)); break; } IDXINCR(eq->dbidx, n, eq->sidx); } static inline u_int reclaimable_tx_desc(struct sge_eq *eq) { uint16_t hw_cidx; hw_cidx = read_hw_cidx(eq); return (IDXDIFF(hw_cidx, eq->cidx, eq->sidx)); } static inline u_int total_available_tx_desc(struct sge_eq *eq) { uint16_t hw_cidx, pidx; hw_cidx = read_hw_cidx(eq); pidx = eq->pidx; if (pidx == hw_cidx) return (eq->sidx - 1); else return (IDXDIFF(hw_cidx, pidx, eq->sidx) - 1); } static inline uint16_t read_hw_cidx(struct sge_eq *eq) { struct sge_qstat *spg = (void *)&eq->desc[eq->sidx]; uint16_t cidx = spg->cidx; /* stable snapshot */ return (be16toh(cidx)); } /* * Reclaim 'n' descriptors approximately. */ static u_int reclaim_tx_descs(struct sge_txq *txq, u_int n) { struct tx_sdesc *txsd; struct sge_eq *eq = &txq->eq; u_int can_reclaim, reclaimed; TXQ_LOCK_ASSERT_OWNED(txq); MPASS(n > 0); reclaimed = 0; can_reclaim = reclaimable_tx_desc(eq); while (can_reclaim && reclaimed < n) { int ndesc; struct mbuf *m, *nextpkt; txsd = &txq->sdesc[eq->cidx]; ndesc = txsd->desc_used; /* Firmware doesn't return "partial" credits. */ KASSERT(can_reclaim >= ndesc, ("%s: unexpected number of credits: %d, %d", __func__, can_reclaim, ndesc)); for (m = txsd->m; m != NULL; m = nextpkt) { nextpkt = m->m_nextpkt; m->m_nextpkt = NULL; m_freem(m); } reclaimed += ndesc; can_reclaim -= ndesc; IDXINCR(eq->cidx, ndesc, eq->sidx); } return (reclaimed); } static void tx_reclaim(void *arg, int n) { struct sge_txq *txq = arg; struct sge_eq *eq = &txq->eq; do { if (TXQ_TRYLOCK(txq) == 0) break; n = reclaim_tx_descs(txq, 32); if (eq->cidx == eq->pidx) eq->equeqidx = eq->pidx; TXQ_UNLOCK(txq); } while (n > 0); } static __be64 get_flit(struct sglist_seg *segs, int nsegs, int idx) { int i = (idx / 3) * 2; switch (idx % 3) { case 0: { __be64 rc; rc = htobe32(segs[i].ss_len); if (i + 1 < nsegs) rc |= (uint64_t)htobe32(segs[i + 1].ss_len) << 32; return (rc); } case 1: return (htobe64(segs[i].ss_paddr)); case 2: return (htobe64(segs[i + 1].ss_paddr)); } return (0); } static void find_best_refill_source(struct adapter *sc, struct sge_fl *fl, int maxp) { int8_t zidx, hwidx, idx; uint16_t region1, region3; int spare, spare_needed, n; struct sw_zone_info *swz; struct hw_buf_info *hwb, *hwb_list = &sc->sge.hw_buf_info[0]; /* * Buffer Packing: Look for PAGE_SIZE or larger zone which has a bufsize * large enough for the max payload and cluster metadata. Otherwise * settle for the largest bufsize that leaves enough room in the cluster * for metadata. * * Without buffer packing: Look for the smallest zone which has a * bufsize large enough for the max payload. Settle for the largest * bufsize available if there's nothing big enough for max payload. */ spare_needed = fl->flags & FL_BUF_PACKING ? CL_METADATA_SIZE : 0; swz = &sc->sge.sw_zone_info[0]; hwidx = -1; for (zidx = 0; zidx < SW_ZONE_SIZES; zidx++, swz++) { if (swz->size > largest_rx_cluster) { if (__predict_true(hwidx != -1)) break; /* * This is a misconfiguration. 
largest_rx_cluster is * preventing us from finding a refill source. See * dev.t5nex..buffer_sizes to figure out why. */ device_printf(sc->dev, "largest_rx_cluster=%u leaves no" " refill source for fl %p (dma %u). Ignored.\n", largest_rx_cluster, fl, maxp); } for (idx = swz->head_hwidx; idx != -1; idx = hwb->next) { hwb = &hwb_list[idx]; spare = swz->size - hwb->size; if (spare < spare_needed) continue; hwidx = idx; /* best option so far */ if (hwb->size >= maxp) { if ((fl->flags & FL_BUF_PACKING) == 0) goto done; /* stop looking (not packing) */ if (swz->size >= safest_rx_cluster) goto done; /* stop looking (packing) */ } break; /* keep looking, next zone */ } } done: /* A usable hwidx has been located. */ MPASS(hwidx != -1); hwb = &hwb_list[hwidx]; zidx = hwb->zidx; swz = &sc->sge.sw_zone_info[zidx]; region1 = 0; region3 = swz->size - hwb->size; /* * Stay within this zone and see if there is a better match when mbuf * inlining is allowed. Remember that the hwidx's are sorted in * decreasing order of size (so in increasing order of spare area). */ for (idx = hwidx; idx != -1; idx = hwb->next) { hwb = &hwb_list[idx]; spare = swz->size - hwb->size; if (allow_mbufs_in_cluster == 0 || hwb->size < maxp) break; /* * Do not inline mbufs if doing so would violate the pad/pack * boundary alignment requirement. */ if (fl_pad && (MSIZE % sc->params.sge.pad_boundary) != 0) continue; if (fl->flags & FL_BUF_PACKING && (MSIZE % sc->params.sge.pack_boundary) != 0) continue; if (spare < CL_METADATA_SIZE + MSIZE) continue; n = (spare - CL_METADATA_SIZE) / MSIZE; if (n > howmany(hwb->size, maxp)) break; hwidx = idx; if (fl->flags & FL_BUF_PACKING) { region1 = n * MSIZE; region3 = spare - region1; } else { region1 = MSIZE; region3 = spare - region1; break; } } KASSERT(zidx >= 0 && zidx < SW_ZONE_SIZES, ("%s: bad zone %d for fl %p, maxp %d", __func__, zidx, fl, maxp)); KASSERT(hwidx >= 0 && hwidx <= SGE_FLBUF_SIZES, ("%s: bad hwidx %d for fl %p, maxp %d", __func__, hwidx, fl, maxp)); KASSERT(region1 + sc->sge.hw_buf_info[hwidx].size + region3 == sc->sge.sw_zone_info[zidx].size, ("%s: bad buffer layout for fl %p, maxp %d. " "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); if (fl->flags & FL_BUF_PACKING || region1 > 0) { KASSERT(region3 >= CL_METADATA_SIZE, ("%s: no room for metadata. fl %p, maxp %d; " "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); KASSERT(region1 % MSIZE == 0, ("%s: bad mbuf region for fl %p, maxp %d. 
" "cl %d; r1 %d, payload %d, r3 %d", __func__, fl, maxp, sc->sge.sw_zone_info[zidx].size, region1, sc->sge.hw_buf_info[hwidx].size, region3)); } fl->cll_def.zidx = zidx; fl->cll_def.hwidx = hwidx; fl->cll_def.region1 = region1; fl->cll_def.region3 = region3; } static void find_safe_refill_source(struct adapter *sc, struct sge_fl *fl) { struct sge *s = &sc->sge; struct hw_buf_info *hwb; struct sw_zone_info *swz; int spare; int8_t hwidx; if (fl->flags & FL_BUF_PACKING) hwidx = s->safe_hwidx2; /* with room for metadata */ else if (allow_mbufs_in_cluster && s->safe_hwidx2 != -1) { hwidx = s->safe_hwidx2; hwb = &s->hw_buf_info[hwidx]; swz = &s->sw_zone_info[hwb->zidx]; spare = swz->size - hwb->size; /* no good if there isn't room for an mbuf as well */ if (spare < CL_METADATA_SIZE + MSIZE) hwidx = s->safe_hwidx1; } else hwidx = s->safe_hwidx1; if (hwidx == -1) { /* No fallback source */ fl->cll_alt.hwidx = -1; fl->cll_alt.zidx = -1; return; } hwb = &s->hw_buf_info[hwidx]; swz = &s->sw_zone_info[hwb->zidx]; spare = swz->size - hwb->size; fl->cll_alt.hwidx = hwidx; fl->cll_alt.zidx = hwb->zidx; if (allow_mbufs_in_cluster && (fl_pad == 0 || (MSIZE % sc->params.sge.pad_boundary) == 0)) fl->cll_alt.region1 = ((spare - CL_METADATA_SIZE) / MSIZE) * MSIZE; else fl->cll_alt.region1 = 0; fl->cll_alt.region3 = spare - fl->cll_alt.region1; } static void add_fl_to_sfl(struct adapter *sc, struct sge_fl *fl) { mtx_lock(&sc->sfl_lock); FL_LOCK(fl); if ((fl->flags & FL_DOOMED) == 0) { fl->flags |= FL_STARVING; TAILQ_INSERT_TAIL(&sc->sfl, fl, link); callout_reset(&sc->sfl_callout, hz / 5, refill_sfl, sc); } FL_UNLOCK(fl); mtx_unlock(&sc->sfl_lock); } static void handle_wrq_egr_update(struct adapter *sc, struct sge_eq *eq) { struct sge_wrq *wrq = (void *)eq; atomic_readandclear_int(&eq->equiq); taskqueue_enqueue(sc->tq[eq->tx_chan], &wrq->wrq_tx_task); } static void handle_eth_egr_update(struct adapter *sc, struct sge_eq *eq) { struct sge_txq *txq = (void *)eq; MPASS((eq->flags & EQ_TYPEMASK) == EQ_ETH); atomic_readandclear_int(&eq->equiq); mp_ring_check_drainage(txq->r, 0); taskqueue_enqueue(sc->tq[eq->tx_chan], &txq->tx_reclaim_task); } static int handle_sge_egr_update(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { const struct cpl_sge_egr_update *cpl = (const void *)(rss + 1); unsigned int qid = G_EGR_QID(ntohl(cpl->opcode_qid)); struct adapter *sc = iq->adapter; struct sge *s = &sc->sge; struct sge_eq *eq; static void (*h[])(struct adapter *, struct sge_eq *) = {NULL, &handle_wrq_egr_update, &handle_eth_egr_update, &handle_wrq_egr_update}; KASSERT(m == NULL, ("%s: payload with opcode %02x", __func__, rss->opcode)); eq = s->eqmap[qid - s->eq_start - s->eq_base]; (*h[eq->flags & EQ_TYPEMASK])(sc, eq); return (0); } /* handle_fw_msg works for both fw4_msg and fw6_msg because this is valid */ CTASSERT(offsetof(struct cpl_fw4_msg, data) == \ offsetof(struct cpl_fw6_msg, data)); static int handle_fw_msg(struct sge_iq *iq, const struct rss_header *rss, struct mbuf *m) { struct adapter *sc = iq->adapter; const struct cpl_fw6_msg *cpl = (const void *)(rss + 1); KASSERT(m == NULL, ("%s: payload with opcode %02x", __func__, rss->opcode)); if (cpl->type == FW_TYPE_RSSCPL || cpl->type == FW6_TYPE_RSSCPL) { const struct rss_header *rss2; rss2 = (const struct rss_header *)&cpl->data[0]; return (t4_cpl_handler[rss2->opcode](iq, rss2, m)); } return (t4_fw_msg_handler[cpl->type](sc, &cpl->data[0])); } /** * t4_handle_wrerr_rpl - process a FW work request error message * @adap: the adapter * @rpl: start of 
the FW message */ static int t4_handle_wrerr_rpl(struct adapter *adap, const __be64 *rpl) { u8 opcode = *(const u8 *)rpl; const struct fw_error_cmd *e = (const void *)rpl; unsigned int i; if (opcode != FW_ERROR_CMD) { log(LOG_ERR, "%s: Received WRERR_RPL message with opcode %#x\n", device_get_nameunit(adap->dev), opcode); return (EINVAL); } log(LOG_ERR, "%s: FW_ERROR (%s) ", device_get_nameunit(adap->dev), G_FW_ERROR_CMD_FATAL(be32toh(e->op_to_type)) ? "fatal" : "non-fatal"); switch (G_FW_ERROR_CMD_TYPE(be32toh(e->op_to_type))) { case FW_ERROR_TYPE_EXCEPTION: log(LOG_ERR, "exception info:\n"); for (i = 0; i < nitems(e->u.exception.info); i++) log(LOG_ERR, "%s%08x", i == 0 ? "\t" : " ", be32toh(e->u.exception.info[i])); log(LOG_ERR, "\n"); break; case FW_ERROR_TYPE_HWMODULE: log(LOG_ERR, "HW module regaddr %08x regval %08x\n", be32toh(e->u.hwmodule.regaddr), be32toh(e->u.hwmodule.regval)); break; case FW_ERROR_TYPE_WR: log(LOG_ERR, "WR cidx %d PF %d VF %d eqid %d hdr:\n", be16toh(e->u.wr.cidx), G_FW_ERROR_CMD_PFN(be16toh(e->u.wr.pfn_vfn)), G_FW_ERROR_CMD_VFN(be16toh(e->u.wr.pfn_vfn)), be32toh(e->u.wr.eqid)); for (i = 0; i < nitems(e->u.wr.wrhdr); i++) log(LOG_ERR, "%s%02x", i == 0 ? "\t" : " ", e->u.wr.wrhdr[i]); log(LOG_ERR, "\n"); break; case FW_ERROR_TYPE_ACL: log(LOG_ERR, "ACL cidx %d PF %d VF %d eqid %d %s", be16toh(e->u.acl.cidx), G_FW_ERROR_CMD_PFN(be16toh(e->u.acl.pfn_vfn)), G_FW_ERROR_CMD_VFN(be16toh(e->u.acl.pfn_vfn)), be32toh(e->u.acl.eqid), G_FW_ERROR_CMD_MV(be16toh(e->u.acl.mv_pkd)) ? "vlanid" : "MAC"); for (i = 0; i < nitems(e->u.acl.val); i++) log(LOG_ERR, " %02x", e->u.acl.val[i]); log(LOG_ERR, "\n"); break; default: log(LOG_ERR, "type %#x\n", G_FW_ERROR_CMD_TYPE(be32toh(e->op_to_type))); return (EINVAL); } return (0); } static int sysctl_uint16(SYSCTL_HANDLER_ARGS) { uint16_t *id = arg1; int i = *id; return sysctl_handle_int(oidp, &i, 0, req); } static int sysctl_bufsizes(SYSCTL_HANDLER_ARGS) { struct sge *s = arg1; struct hw_buf_info *hwb = &s->hw_buf_info[0]; struct sw_zone_info *swz = &s->sw_zone_info[0]; int i, rc; struct sbuf sb; char c; sbuf_new(&sb, NULL, 32, SBUF_AUTOEXTEND); for (i = 0; i < SGE_FLBUF_SIZES; i++, hwb++) { if (hwb->zidx >= 0 && swz[hwb->zidx].size <= largest_rx_cluster) c = '*'; else c = '\0'; sbuf_printf(&sb, "%u%c ", hwb->size, c); } sbuf_trim(&sb); sbuf_finish(&sb); rc = sysctl_handle_string(oidp, sbuf_data(&sb), sbuf_len(&sb), req); sbuf_delete(&sb); return (rc); } static int sysctl_tc(SYSCTL_HANDLER_ARGS) { struct vi_info *vi = arg1; struct port_info *pi; struct adapter *sc; struct sge_txq *txq; struct tx_sched_class *tc; int qidx = arg2, rc, tc_idx; uint32_t fw_queue, fw_class; MPASS(qidx >= 0 && qidx < vi->ntxq); pi = vi->pi; sc = pi->adapter; txq = &sc->sge.txq[vi->first_txq + qidx]; tc_idx = txq->tc_idx; rc = sysctl_handle_int(oidp, &tc_idx, 0, req); if (rc != 0 || req->newptr == NULL) return (rc); /* Note that -1 is legitimate input (it means unbind). */ if (tc_idx < -1 || tc_idx >= sc->chip_params->nsched_cls) return (EINVAL); rc = begin_synchronized_op(sc, vi, SLEEP_OK | INTR_OK, "t4stc"); if (rc) return (rc); if (tc_idx == txq->tc_idx) { rc = 0; /* No change, nothing to do. */ goto done; } fw_queue = V_FW_PARAMS_MNEM(FW_PARAMS_MNEM_DMAQ) | V_FW_PARAMS_PARAM_X(FW_PARAMS_PARAM_DMAQ_EQ_SCHEDCLASS_ETH) | V_FW_PARAMS_PARAM_YZ(txq->eq.cntxt_id); if (tc_idx == -1) fw_class = 0xffffffff; /* Unbind. */ else { /* * Bind to a different class. Ethernet txq's are only allowed * to bind to cl-rl mode-class for now. XXX: too restrictive. 
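 *
 * From userland this handler is driven with sysctl(8).  Assuming the
 * OID was attached under the per-queue txq node (the exact name
 * depends on how the parent tree was created), something like
 * "sysctl dev.<port>.<unit>.txq.<q>.tc=3" binds queue q to traffic
 * class 3, and "... .tc=-1" unbinds it again.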
*/ tc = &pi->tc[tc_idx]; if (tc->flags & TX_SC_OK && tc->params.level == SCHED_CLASS_LEVEL_CL_RL && tc->params.mode == SCHED_CLASS_MODE_CLASS) { /* Ok to proceed. */ fw_class = tc_idx; } else { rc = tc->flags & TX_SC_OK ? EBUSY : ENXIO; goto done; } } rc = -t4_set_params(sc, sc->mbox, sc->pf, 0, 1, &fw_queue, &fw_class); if (rc == 0) { if (txq->tc_idx != -1) { tc = &pi->tc[txq->tc_idx]; MPASS(tc->refcount > 0); tc->refcount--; } if (tc_idx != -1) { tc = &pi->tc[tc_idx]; tc->refcount++; } txq->tc_idx = tc_idx; } done: end_synchronized_op(sc, 0); return (rc); } Index: projects/clang390-import/sys/dev/gpio/gpiobusvar.h =================================================================== --- projects/clang390-import/sys/dev/gpio/gpiobusvar.h (revision 305686) +++ projects/clang390-import/sys/dev/gpio/gpiobusvar.h (revision 305687) @@ -1,153 +1,157 @@ /*- * Copyright (c) 2009 Oleksandr Tymoshenko * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD$ * */ #ifndef __GPIOBUS_H__ #define __GPIOBUS_H__ #include "opt_platform.h" #include #include #include #ifdef FDT #include #include #endif +#ifdef INTRNG +#include +#endif + #include "gpio_if.h" #ifdef FDT #define GPIOBUS_IVAR(d) (struct gpiobus_ivar *) \ &((struct ofw_gpiobus_devinfo *)device_get_ivars(d))->opd_dinfo #else #define GPIOBUS_IVAR(d) (struct gpiobus_ivar *) device_get_ivars(d) #endif #define GPIOBUS_SOFTC(d) (struct gpiobus_softc *) device_get_softc(d) #define GPIOBUS_LOCK(_sc) mtx_lock(&(_sc)->sc_mtx) #define GPIOBUS_UNLOCK(_sc) mtx_unlock(&(_sc)->sc_mtx) #define GPIOBUS_LOCK_INIT(_sc) mtx_init(&_sc->sc_mtx, \ device_get_nameunit(_sc->sc_dev), "gpiobus", MTX_DEF) #define GPIOBUS_LOCK_DESTROY(_sc) mtx_destroy(&_sc->sc_mtx) #define GPIOBUS_ASSERT_LOCKED(_sc) mtx_assert(&_sc->sc_mtx, MA_OWNED) #define GPIOBUS_ASSERT_UNLOCKED(_sc) mtx_assert(&_sc->sc_mtx, MA_NOTOWNED) #define GPIOBUS_WAIT 1 #define GPIOBUS_DONTWAIT 2 /* Use default interrupt mode - for gpio_alloc_intr_resource */ #define GPIO_INTR_CONFORM GPIO_INTR_NONE struct gpiobus_pin_data { int mapped; /* pin is mapped/reserved. */ char *name; /* pin name. 
*/ }; #ifdef INTRNG struct intr_map_data_gpio { struct intr_map_data hdr; u_int gpio_pin_num; u_int gpio_pin_flags; u_int gpio_intr_mode; }; #endif struct gpiobus_softc { struct mtx sc_mtx; /* bus mutex */ struct rman sc_intr_rman; /* isr resources */ device_t sc_busdev; /* bus device */ device_t sc_owner; /* bus owner */ device_t sc_dev; /* driver device */ int sc_npins; /* total pins on bus */ struct gpiobus_pin_data *sc_pins; /* pin data */ }; struct gpiobus_pin { device_t dev; /* gpio device */ uint32_t flags; /* pin flags */ uint32_t pin; /* pin number */ }; typedef struct gpiobus_pin *gpio_pin_t; struct gpiobus_ivar { struct resource_list rl; /* isr resource list */ uint32_t npins; /* pins total */ uint32_t *flags; /* pins flags */ uint32_t *pins; /* pins map */ }; #ifdef FDT struct ofw_gpiobus_devinfo { struct gpiobus_ivar opd_dinfo; struct ofw_bus_devinfo opd_obdinfo; }; static __inline int gpio_map_gpios(device_t bus, phandle_t dev, phandle_t gparent, int gcells, pcell_t *gpios, uint32_t *pin, uint32_t *flags) { return (GPIO_MAP_GPIOS(bus, dev, gparent, gcells, gpios, pin, flags)); } device_t ofw_gpiobus_add_fdt_child(device_t, const char *, phandle_t); int ofw_gpiobus_parse_gpios(device_t, char *, struct gpiobus_pin **); void ofw_gpiobus_register_provider(device_t); void ofw_gpiobus_unregister_provider(device_t); /* Consumers interface. */ int gpio_pin_get_by_ofw_name(device_t consumer, phandle_t node, char *name, gpio_pin_t *gpio); int gpio_pin_get_by_ofw_idx(device_t consumer, phandle_t node, int idx, gpio_pin_t *gpio); int gpio_pin_get_by_ofw_property(device_t consumer, phandle_t node, char *name, gpio_pin_t *gpio); void gpio_pin_release(gpio_pin_t gpio); int gpio_pin_getcaps(gpio_pin_t pin, uint32_t *caps); int gpio_pin_is_active(gpio_pin_t pin, bool *active); int gpio_pin_set_active(gpio_pin_t pin, bool active); int gpio_pin_setflags(gpio_pin_t pin, uint32_t flags); #endif struct resource *gpio_alloc_intr_resource(device_t consumer_dev, int *rid, u_int alloc_flags, gpio_pin_t pin, uint32_t intr_mode); int gpio_check_flags(uint32_t, uint32_t); device_t gpiobus_attach_bus(device_t); int gpiobus_detach_bus(device_t); int gpiobus_init_softc(device_t); int gpiobus_alloc_ivars(struct gpiobus_ivar *); void gpiobus_free_ivars(struct gpiobus_ivar *); int gpiobus_acquire_pin(device_t, uint32_t); int gpiobus_release_pin(device_t, uint32_t); extern driver_t gpiobus_driver; #endif /* __GPIOBUS_H__ */ Index: projects/clang390-import/sys/fs/nullfs/null_vnops.c =================================================================== --- projects/clang390-import/sys/fs/nullfs/null_vnops.c (revision 305686) +++ projects/clang390-import/sys/fs/nullfs/null_vnops.c (revision 305687) @@ -1,935 +1,934 @@ /*- * Copyright (c) 1992, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * John Heidemann of the UCLA Ficus project. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. 
Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * @(#)null_vnops.c	8.6 (Berkeley) 5/27/95
 *
 * Ancestors:
 *	@(#)lofs_vnops.c	1.2 (Berkeley) 6/18/92
 *	...and...
 *	@(#)null_vnodeops.c 1.20 92/07/07 UCLA Ficus project
 *
 * $FreeBSD$
 */

/*
 * Null Layer
 *
 * (See mount_nullfs(8) for more information.)
 *
 * The null layer duplicates a portion of the filesystem
 * name space under a new name.  In this respect, it is
 * similar to the loopback filesystem.  It differs from
 * the loopback fs in two respects: it is implemented using
 * a stackable-layers technique, and its "null-node"s stack above
 * all lower-layer vnodes, not just over directory vnodes.
 *
 * The null layer has two purposes.  First, it serves as a demonstration
 * of layering by providing a layer which does nothing.  (It actually
 * does everything the loopback filesystem does, which is slightly
 * more than nothing.)  Second, the null layer can serve as a prototype
 * layer.  Since it provides all necessary layer framework,
 * new filesystem layers can be created very easily by starting
 * with a null layer.
 *
 * The remainder of this comment examines the null layer as a basis
 * for constructing new layers.
 *
 *
 * INSTANTIATING NEW NULL LAYERS
 *
 * New null layers are created with mount_nullfs(8).
 * Mount_nullfs(8) takes two arguments, the pathname
 * of the lower vfs (target-pn) and the pathname where the null
 * layer will appear in the namespace (alias-pn).  After
 * the null layer is put into place, the contents
 * of the target-pn subtree will be aliased under alias-pn.
 *
 *
 * OPERATION OF A NULL LAYER
 *
 * The null layer is the minimum filesystem layer,
 * simply bypassing all possible operations to the lower layer
 * for processing there.  The majority of its activity centers
 * on the bypass routine, through which nearly all vnode operations
 * pass.
 *
 * The bypass routine accepts arbitrary vnode operations for
 * handling by the lower layer.  It begins by examining vnode
 * operation arguments and replacing any null-nodes by their
 * lower-layer equivalents.  It then invokes the operation
 * on the lower layer.  Finally, it replaces the null-nodes
 * in the arguments and, if a vnode is returned by the operation,
 * stacks a null-node on top of the returned vnode.
 *
 * Although bypass handles most operations, vop_getattr, vop_lock,
 * vop_unlock, vop_inactive, vop_reclaim, and vop_print are not
 * bypassed.  Vop_getattr must change the fsid being returned.
 * Vop_lock and vop_unlock must handle any locking for the
 * current vnode as well as pass the lock request down.
 * Vop_inactive and vop_reclaim are not bypassed so that
 * they can handle freeing null-layer specific data.
 * Vop_print is not bypassed to avoid excessive debugging information.
 * Also, certain vnode operations change the locking state within
 * the operation (create, mknod, remove, link, rename, mkdir, rmdir,
 * and symlink).  Ideally these operations should not change the
 * lock state, but should be changed to let the caller of the
 * function unlock them.  Otherwise all intermediate vnode layers
 * (such as union, umapfs, etc) must catch these functions to do
 * the necessary locking at their layer.
 *
 *
 * INSTANTIATING VNODE STACKS
 *
 * Mounting associates the null layer with a lower layer,
 * in effect stacking two VFSes.  Vnode stacks are instead
 * created on demand as files are accessed.
 *
 * The initial mount creates a single vnode stack for the
 * root of the new null layer.  All other vnode stacks
 * are created as a result of vnode operations on
 * this or other null vnode stacks.
 *
 * New vnode stacks come into existence as a result of
 * an operation which returns a vnode.
 * The bypass routine stacks a null-node above the new
 * vnode before returning it to the caller.
 *
 * For example, imagine mounting a null layer with
 * "mount_nullfs /usr/include /dev/layer/null".
 * Changing directory to /dev/layer/null will assign
 * the root null-node (which was created when the null layer was mounted).
 * Now consider opening "sys".  A vop_lookup would be
 * done on the root null-node.  This operation would bypass through
 * to the lower layer which would return a vnode representing
 * the UFS "sys".  Null_bypass then builds a null-node
 * aliasing the UFS "sys" and returns this to the caller.
 * Later operations on the null-node "sys" will repeat this
 * process when constructing other vnode stacks.
 *
 *
 * CREATING OTHER FILE SYSTEM LAYERS
 *
 * One of the easiest ways to construct new filesystem layers is to make
 * a copy of the null layer, rename all files and variables, and
 * then begin modifying the copy.  Sed can be used to easily rename
 * all variables.
 *
 * The umap layer is an example of a layer descended from the
 * null layer.
 *
 *
 * INVOKING OPERATIONS ON LOWER LAYERS
 *
 * There are two techniques to invoke operations on a lower layer
 * when the operation cannot be completely bypassed.  Each method
 * is appropriate in different situations.  In both cases,
 * it is the responsibility of the aliasing layer to make
 * the operation arguments "correct" for the lower layer
 * by mapping any vnode arguments to the lower layer.
 *
 * The first approach is to call the aliasing layer's bypass routine.
 * This method is most suitable when you wish to invoke the operation
 * currently being handled on the lower layer.  It has the advantage
 * that the bypass routine already must do argument mapping.
 * An example of this is null_getattr in the null layer.
 *
 * A second approach is to directly invoke vnode operations on
 * the lower layer with the VOP_OPERATIONNAME interface.
 * The advantage of this method is that it is easy to invoke
 * arbitrary operations on the lower layer.  The disadvantage
 * is that vnode arguments must be manually mapped.
 */

#include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include

static int null_bug_bypass = 0;   /* for debugging: enables bypass printf'ing */
SYSCTL_INT(_debug, OID_AUTO, nullfs_bug_bypass, CTLFLAG_RW,
	&null_bug_bypass, 0, "");

/*
 * This is the 10-Apr-92 bypass routine.
 * This version has been optimized for speed, throwing away some
 * safety checks.  It should still always work, but it's not as
 * robust to programmer errors.
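 *
 * (A standalone illustration of the pointer-substitution trick used
 * here follows under #if 0; it is a sketch, not kernel code, and
 * struct demo_args and the addresses in it are invented for the
 * example.)
 */
#if 0
#include <stddef.h>
#include <stdio.h>

struct demo_args {
	int	a_flags;
	void	*a_vp;		/* the "vnode" slot to remap */
};

int
main(void)
{
	struct demo_args ap = { 0, (void *)0x1000 };
	size_t off = offsetof(struct demo_args, a_vp);
	void **vpp, *saved;

	/* VOPARG_OFFSETTO()-style: turn a byte offset into a pointer. */
	vpp = (void **)((char *)&ap + off);

	saved = *vpp;		/* remember the upper-layer vnode */
	*vpp = (void *)0x2000;	/* substitute the lower-layer vnode */
	/* ... invoke the lower-layer operation on 'ap' here ... */
	*vpp = saved;		/* restore: preserve call-by-value */

	printf("restored %p\n", *vpp);
	return (0);
}
#endif
/*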
* * In general, we map all vnodes going down and unmap them on the way back. * As an exception to this, vnodes can be marked "unmapped" by setting * the Nth bit in operation's vdesc_flags. * * Also, some BSD vnode operations have the side effect of vrele'ing * their arguments. With stacking, the reference counts are held * by the upper node, not the lower one, so we must handle these * side-effects here. This is not of concern in Sun-derived systems * since there are no such side-effects. * * This makes the following assumptions: * - only one returned vpp * - no INOUT vpp's (Sun's vop_open has one of these) * - the vnode operation vector of the first vnode should be used * to determine what implementation of the op should be invoked * - all mapped vnodes are of our vnode-type (NEEDSWORK: * problems on rmdir'ing mount points and renaming?) */ int null_bypass(struct vop_generic_args *ap) { struct vnode **this_vp_p; int error; struct vnode *old_vps[VDESC_MAX_VPS]; struct vnode **vps_p[VDESC_MAX_VPS]; struct vnode ***vppp; struct vnodeop_desc *descp = ap->a_desc; int reles, i; if (null_bug_bypass) printf ("null_bypass: %s\n", descp->vdesc_name); #ifdef DIAGNOSTIC /* * We require at least one vp. */ if (descp->vdesc_vp_offsets == NULL || descp->vdesc_vp_offsets[0] == VDESC_NO_OFFSET) panic ("null_bypass: no vp's in map"); #endif /* * Map the vnodes going in. * Later, we'll invoke the operation based on * the first mapped vnode's operation vector. */ reles = descp->vdesc_flags; for (i = 0; i < VDESC_MAX_VPS; reles >>= 1, i++) { if (descp->vdesc_vp_offsets[i] == VDESC_NO_OFFSET) break; /* bail out at end of list */ vps_p[i] = this_vp_p = VOPARG_OFFSETTO(struct vnode**,descp->vdesc_vp_offsets[i],ap); /* * We're not guaranteed that any but the first vnode * are of our type. Check for and don't map any * that aren't. (We must always map first vp or vclean fails.) */ if (i && (*this_vp_p == NULLVP || (*this_vp_p)->v_op != &null_vnodeops)) { old_vps[i] = NULLVP; } else { old_vps[i] = *this_vp_p; *(vps_p[i]) = NULLVPTOLOWERVP(*this_vp_p); /* * XXX - Several operations have the side effect * of vrele'ing their vp's. We must account for * that. (This should go away in the future.) */ if (reles & VDESC_VP0_WILLRELE) VREF(*this_vp_p); } } /* * Call the operation on the lower layer * with the modified argument structure. */ if (vps_p[0] && *vps_p[0]) error = VCALL(ap); else { printf("null_bypass: no map for %s\n", descp->vdesc_name); error = EINVAL; } /* * Maintain the illusion of call-by-value * by restoring vnodes in the argument structure * to their original value. */ reles = descp->vdesc_flags; for (i = 0; i < VDESC_MAX_VPS; reles >>= 1, i++) { if (descp->vdesc_vp_offsets[i] == VDESC_NO_OFFSET) break; /* bail out at end of list */ if (old_vps[i]) { *(vps_p[i]) = old_vps[i]; #if 0 if (reles & VDESC_VP0_WILLUNLOCK) VOP_UNLOCK(*(vps_p[i]), 0); #endif if (reles & VDESC_VP0_WILLRELE) vrele(*(vps_p[i])); } } /* * Map the possible out-going vpp * (Assumes that the lower layer always returns * a VREF'ed vpp unless it gets an error.) */ if (descp->vdesc_vpp_offset != VDESC_NO_OFFSET && !(descp->vdesc_flags & VDESC_NOMAP_VPP) && !error) { /* * XXX - even though some ops have vpp returned vp's, * several ops actually vrele this before returning. * We must avoid these ops. * (This should go away when these ops are regularized.) 
 */
		if (descp->vdesc_flags & VDESC_VPP_WILLRELE)
			goto out;
		vppp = VOPARG_OFFSETTO(struct vnode***,
				 descp->vdesc_vpp_offset,ap);
		if (*vppp)
			error = null_nodeget(old_vps[0]->v_mount, **vppp,
			    *vppp);
	}

out:
	return (error);
}

static int
null_add_writecount(struct vop_add_writecount_args *ap)
{
	struct vnode *lvp, *vp;
	int error;

	vp = ap->a_vp;
	lvp = NULLVPTOLOWERVP(vp);
	KASSERT(vp->v_writecount + ap->a_inc >= 0, ("wrong writecount inc"));
	if (vp->v_writecount > 0 && vp->v_writecount + ap->a_inc == 0)
		error = VOP_ADD_WRITECOUNT(lvp, -1);
	else if (vp->v_writecount == 0 && vp->v_writecount + ap->a_inc > 0)
		error = VOP_ADD_WRITECOUNT(lvp, 1);
	else
		error = 0;
	if (error == 0)
		vp->v_writecount += ap->a_inc;
	return (error);
}

/*
 * We have to carry on the locking protocol on the null layer vnodes
 * as we progress through the tree.  We also have to enforce read-only
 * if this layer is mounted read-only.
 */
static int
null_lookup(struct vop_lookup_args *ap)
{
	struct componentname *cnp = ap->a_cnp;
	struct vnode *dvp = ap->a_dvp;
	int flags = cnp->cn_flags;
	struct vnode *vp, *ldvp, *lvp;
	struct mount *mp;
	int error;

	mp = dvp->v_mount;
	if ((flags & ISLASTCN) != 0 && (mp->mnt_flag & MNT_RDONLY) != 0 &&
	    (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME))
		return (EROFS);
	/*
	 * Although it is possible to call null_bypass(), we'll do
	 * a direct call to reduce overhead.
	 */
	ldvp = NULLVPTOLOWERVP(dvp);
	vp = lvp = NULL;
	KASSERT((ldvp->v_vflag & VV_ROOT) == 0 ||
	    ((dvp->v_vflag & VV_ROOT) != 0 && (flags & ISDOTDOT) == 0),
	    ("ldvp %p fl %#x dvp %p fl %#x flags %#x", ldvp, ldvp->v_vflag,
	    dvp, dvp->v_vflag, flags));

	/*
	 * Hold ldvp.  The reference on it, owned by dvp, is lost in
	 * case of dvp reclamation, and we need ldvp to move our lock
	 * from ldvp to dvp.
	 */
	vhold(ldvp);
	error = VOP_LOOKUP(ldvp, &lvp, cnp);
	/*
	 * VOP_LOOKUP() on the lower vnode may unlock ldvp, which allows
	 * dvp to be reclaimed due to the shared v_vnlock.  Check for the
	 * doomed state and return error.
	 */
	if ((error == 0 || error == EJUSTRETURN) &&
	    (dvp->v_iflag & VI_DOOMED) != 0) {
		error = ENOENT;
		if (lvp != NULL)
			vput(lvp);

		/*
		 * If vgone() did reclaim dvp before curthread
		 * relocked ldvp, the locks of dvp and ldvp are no
		 * longer shared.  In this case, the relock of ldvp in
		 * the lower fs VOP_LOOKUP() does not restore the locking
		 * state of dvp.  Compensate for this by unlocking
		 * ldvp and locking dvp, which is also correct if the
		 * locks are still shared.
		 */
		VOP_UNLOCK(ldvp, 0);
		vn_lock(dvp, LK_EXCLUSIVE | LK_RETRY);
	}
	vdrop(ldvp);

	if (error == EJUSTRETURN && (flags & ISLASTCN) != 0 &&
	    (mp->mnt_flag & MNT_RDONLY) != 0 &&
	    (cnp->cn_nameiop == CREATE || cnp->cn_nameiop == RENAME))
		error = EROFS;

	if ((error == 0 || error == EJUSTRETURN) && lvp != NULL) {
		if (ldvp == lvp) {
			*ap->a_vpp = dvp;
			VREF(dvp);
			vrele(lvp);
		} else {
			error = null_nodeget(mp, lvp, &vp);
			if (error == 0)
				*ap->a_vpp = vp;
		}
	}
	return (error);
}

static int
null_open(struct vop_open_args *ap)
{
	int retval;
	struct vnode *vp, *ldvp;

	vp = ap->a_vp;
	ldvp = NULLVPTOLOWERVP(vp);
	retval = null_bypass(&ap->a_gen);
	if (retval == 0)
		vp->v_object = ldvp->v_object;
	return (retval);
}

/*
 * Setattr call.  Disallow write attempts if the layer is mounted read-only.
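 * A field left at the VNOVAL sentinel means "no change requested";
 * only fields that differ from VNOVAL can trigger EROFS here.
 */
#if 0
/*
 * Standalone sketch of the VNOVAL test below, never built.  'struct
 * vattr' here is a reduced stand-in, not the kernel's definition.
 */
#include <stdio.h>

#define	VNOVAL	(-1)

struct vattr {
	long	va_flags, va_uid, va_gid, va_mode;
	long	va_atime_sec, va_mtime_sec;
};

static int
requests_change(const struct vattr *vap)
{
	return (vap->va_flags != VNOVAL || vap->va_uid != VNOVAL ||
	    vap->va_gid != VNOVAL || vap->va_atime_sec != VNOVAL ||
	    vap->va_mtime_sec != VNOVAL || vap->va_mode != VNOVAL);
}

int
main(void)
{
	struct vattr va = { VNOVAL, VNOVAL, VNOVAL, VNOVAL,
	    VNOVAL, VNOVAL };

	printf("%d\n", requests_change(&va));	/* 0: nothing to do */
	va.va_mode = 0644;
	printf("%d\n", requests_change(&va));	/* 1: EROFS if read-only */
	return (0);
}
#endif
/*
 * The setattr handler: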
*/ static int null_setattr(struct vop_setattr_args *ap) { struct vnode *vp = ap->a_vp; struct vattr *vap = ap->a_vap; if ((vap->va_flags != VNOVAL || vap->va_uid != (uid_t)VNOVAL || vap->va_gid != (gid_t)VNOVAL || vap->va_atime.tv_sec != VNOVAL || vap->va_mtime.tv_sec != VNOVAL || vap->va_mode != (mode_t)VNOVAL) && (vp->v_mount->mnt_flag & MNT_RDONLY)) return (EROFS); if (vap->va_size != VNOVAL) { switch (vp->v_type) { case VDIR: return (EISDIR); case VCHR: case VBLK: case VSOCK: case VFIFO: if (vap->va_flags != VNOVAL) return (EOPNOTSUPP); return (0); case VREG: case VLNK: default: /* * Disallow write attempts if the filesystem is * mounted read-only. */ if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); } } return (null_bypass((struct vop_generic_args *)ap)); } /* * We handle getattr only to change the fsid. */ static int null_getattr(struct vop_getattr_args *ap) { int error; if ((error = null_bypass((struct vop_generic_args *)ap)) != 0) return (error); ap->a_vap->va_fsid = ap->a_vp->v_mount->mnt_stat.f_fsid.val[0]; return (0); } /* * Handle to disallow write access if mounted read-only. */ static int null_access(struct vop_access_args *ap) { struct vnode *vp = ap->a_vp; accmode_t accmode = ap->a_accmode; /* * Disallow write attempts on read-only layers; * unless the file is a socket, fifo, or a block or * character device resident on the filesystem. */ if (accmode & VWRITE) { switch (vp->v_type) { case VDIR: case VLNK: case VREG: if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); break; default: break; } } return (null_bypass((struct vop_generic_args *)ap)); } static int null_accessx(struct vop_accessx_args *ap) { struct vnode *vp = ap->a_vp; accmode_t accmode = ap->a_accmode; /* * Disallow write attempts on read-only layers; * unless the file is a socket, fifo, or a block or * character device resident on the filesystem. */ if (accmode & VWRITE) { switch (vp->v_type) { case VDIR: case VLNK: case VREG: if (vp->v_mount->mnt_flag & MNT_RDONLY) return (EROFS); break; default: break; } } return (null_bypass((struct vop_generic_args *)ap)); } /* * Increasing refcount of lower vnode is needed at least for the case * when lower FS is NFS to do sillyrename if the file is in use. * Unfortunately v_usecount is incremented in many places in * the kernel and, as such, there may be races that result in * the NFS client doing an extraneous silly rename, but that seems * preferable to not doing a silly rename when it is needed. */ static int null_remove(struct vop_remove_args *ap) { int retval, vreleit; struct vnode *lvp, *vp; vp = ap->a_vp; if (vrefcnt(vp) > 1) { lvp = NULLVPTOLOWERVP(vp); VREF(lvp); vreleit = 1; } else vreleit = 0; VTONULL(vp)->null_flags |= NULLV_DROP; retval = null_bypass(&ap->a_gen); if (vreleit != 0) vrele(lvp); return (retval); } /* * We handle this to eliminate null FS to lower FS * file moving. Don't know why we don't allow this, * possibly we should. */ static int null_rename(struct vop_rename_args *ap) { struct vnode *tdvp = ap->a_tdvp; struct vnode *fvp = ap->a_fvp; struct vnode *fdvp = ap->a_fdvp; struct vnode *tvp = ap->a_tvp; struct null_node *tnn; /* Check for cross-device rename. 
 */
	if ((fvp->v_mount != tdvp->v_mount) ||
	    (tvp && (fvp->v_mount != tvp->v_mount))) {
		if (tdvp == tvp)
			vrele(tdvp);
		else
			vput(tdvp);
		if (tvp)
			vput(tvp);
		vrele(fdvp);
		vrele(fvp);
		return (EXDEV);
	}

	if (tvp != NULL) {
		tnn = VTONULL(tvp);
		tnn->null_flags |= NULLV_DROP;
	}
	return (null_bypass((struct vop_generic_args *)ap));
}

static int
null_rmdir(struct vop_rmdir_args *ap)
{

	VTONULL(ap->a_vp)->null_flags |= NULLV_DROP;
	return (null_bypass(&ap->a_gen));
}

/*
 * We need to process our own vnode lock and then clear the
 * interlock flag as it applies only to our vnode, not the
 * vnodes below us on the stack.
 */
static int
null_lock(struct vop_lock1_args *ap)
{
	struct vnode *vp = ap->a_vp;
	int flags = ap->a_flags;
	struct null_node *nn;
	struct vnode *lvp;
	int error;

	if ((flags & LK_INTERLOCK) == 0) {
		VI_LOCK(vp);
		ap->a_flags = flags |= LK_INTERLOCK;
	}
	nn = VTONULL(vp);
	/*
	 * If we're still active we must ask the lower layer to
	 * lock as ffs has special lock considerations in its
	 * vop lock.
	 */
	if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) {
		VI_LOCK_FLAGS(lvp, MTX_DUPOK);
		VI_UNLOCK(vp);
		/*
		 * We have to hold the vnode here to solve a potential
		 * reclaim race.  If we're forcibly vgone'd while we
		 * still have refs, a thread could be sleeping inside
		 * the lowervp's vop_lock routine.  When we vgone we will
		 * drop our last ref to the lowervp, which would allow it
		 * to be reclaimed.  The lowervp could then be recycled,
		 * in which case it is not legal to be sleeping in its VOP.
		 * We prevent it from being recycled by holding the vnode
		 * here.
		 */
		vholdl(lvp);
		error = VOP_LOCK(lvp, flags);

		/*
		 * We might have slept to get the lock and someone might
		 * have cleaned our vnode already, switching the vnode
		 * lock from the one in lowervp to the v_lock in our own
		 * vnode structure.  Handle this case by reacquiring the
		 * correct lock in the requested mode.
		 */
		if (VTONULL(vp) == NULL && error == 0) {
			ap->a_flags &= ~(LK_TYPE_MASK | LK_INTERLOCK);
			switch (flags & LK_TYPE_MASK) {
			case LK_SHARED:
				ap->a_flags |= LK_SHARED;
				break;
			case LK_UPGRADE:
			case LK_EXCLUSIVE:
				ap->a_flags |= LK_EXCLUSIVE;
				break;
			default:
				panic("Unsupported lock request %d\n",
				    ap->a_flags);
			}
			VOP_UNLOCK(lvp, 0);
			error = vop_stdlock(ap);
		}
		vdrop(lvp);
	} else
		error = vop_stdlock(ap);

	return (error);
}

/*
 * We need to process our own vnode unlock and then clear the
 * interlock flag as it applies only to our vnode, not the
 * vnodes below us on the stack.
 */
static int
null_unlock(struct vop_unlock_args *ap)
{
	struct vnode *vp = ap->a_vp;
	int flags = ap->a_flags;
	int mtxlkflag = 0;
	struct null_node *nn;
	struct vnode *lvp;
	int error;

	if ((flags & LK_INTERLOCK) != 0)
		mtxlkflag = 1;
	else if (mtx_owned(VI_MTX(vp)) == 0) {
		VI_LOCK(vp);
		mtxlkflag = 2;
	}
	nn = VTONULL(vp);
	if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) {
		VI_LOCK_FLAGS(lvp, MTX_DUPOK);
		flags |= LK_INTERLOCK;
		vholdl(lvp);
		VI_UNLOCK(vp);
		error = VOP_UNLOCK(lvp, flags);
		vdrop(lvp);
		if (mtxlkflag == 0)
			VI_LOCK(vp);
	} else {
		if (mtxlkflag == 2)
			VI_UNLOCK(vp);
		error = vop_stdunlock(ap);
	}

	return (error);
}

/*
 * Do not allow the VOP_INACTIVE to be passed to the lower layer,
 * since the reference count on the lower vnode is not related to
 * ours.
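 * Instead, null_inactive() below decides locally whether to recycle
 * the nullfs vnode: it does so when nullfs caching is disabled
 * (NULLM_CACHE clear), when the node was marked NULLV_DROP (e.g. by
 * the unlink/rmdir paths above), or when the lower vnode is marked
 * VV_NOSYNC.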
*/ static int null_inactive(struct vop_inactive_args *ap __unused) { struct vnode *vp, *lvp; struct null_node *xp; struct mount *mp; struct null_mount *xmp; vp = ap->a_vp; xp = VTONULL(vp); lvp = NULLVPTOLOWERVP(vp); mp = vp->v_mount; xmp = MOUNTTONULLMOUNT(mp); if ((xmp->nullm_flags & NULLM_CACHE) == 0 || (xp->null_flags & NULLV_DROP) != 0 || (lvp->v_vflag & VV_NOSYNC) != 0) { /* * If this is the last reference and caching of the * nullfs vnodes is not enabled, or the lower vnode is * deleted, then free up the vnode so as not to tie up * the lower vnodes. */ vp->v_object = NULL; vrecycle(vp); } return (0); } /* * Now, the nullfs vnode and, due to the sharing lock, the lower * vnode, are exclusively locked, and we shall destroy the null vnode. */ static int null_reclaim(struct vop_reclaim_args *ap) { struct vnode *vp; struct null_node *xp; struct vnode *lowervp; vp = ap->a_vp; xp = VTONULL(vp); lowervp = xp->null_lowervp; KASSERT(lowervp != NULL && vp->v_vnlock != &vp->v_lock, ("Reclaiming incomplete null vnode %p", vp)); null_hashrem(xp); /* * Use the interlock to protect the clearing of v_data to * prevent faults in null_lock(). */ lockmgr(&vp->v_lock, LK_EXCLUSIVE, NULL); VI_LOCK(vp); vp->v_data = NULL; vp->v_object = NULL; vp->v_vnlock = &vp->v_lock; VI_UNLOCK(vp); /* * If we were opened for write, we leased one write reference * to the lower vnode. If this is a reclamation due to the * forced unmount, undo the reference now. */ if (vp->v_writecount > 0) VOP_ADD_WRITECOUNT(lowervp, -1); if ((xp->null_flags & NULLV_NOUNLOCK) != 0) vunref(lowervp); else vput(lowervp); free(xp, M_NULLFSNODE); return (0); } static int null_print(struct vop_print_args *ap) { struct vnode *vp = ap->a_vp; printf("\tvp=%p, lowervp=%p\n", vp, VTONULL(vp)->null_lowervp); return (0); } /* ARGSUSED */ static int null_getwritemount(struct vop_getwritemount_args *ap) { struct null_node *xp; struct vnode *lowervp; struct vnode *vp; vp = ap->a_vp; VI_LOCK(vp); xp = VTONULL(vp); if (xp && (lowervp = xp->null_lowervp)) { VI_LOCK_FLAGS(lowervp, MTX_DUPOK); VI_UNLOCK(vp); vholdl(lowervp); VI_UNLOCK(lowervp); VOP_GETWRITEMOUNT(lowervp, ap->a_mpp); vdrop(lowervp); } else { VI_UNLOCK(vp); *(ap->a_mpp) = NULL; } return (0); } static int null_vptofh(struct vop_vptofh_args *ap) { struct vnode *lvp; lvp = NULLVPTOLOWERVP(ap->a_vp); return VOP_VPTOFH(lvp, ap->a_fhp); } static int null_vptocnp(struct vop_vptocnp_args *ap) { struct vnode *vp = ap->a_vp; struct vnode **dvp = ap->a_vpp; struct vnode *lvp, *ldvp; struct ucred *cred = ap->a_cred; int error, locked; locked = VOP_ISLOCKED(vp); lvp = NULLVPTOLOWERVP(vp); vhold(lvp); VOP_UNLOCK(vp, 0); /* vp is held by vn_vptocnp_locked that called us */ ldvp = lvp; vref(lvp); error = vn_vptocnp(&ldvp, cred, ap->a_buf, ap->a_buflen); vdrop(lvp); if (error != 0) { vn_lock(vp, locked | LK_RETRY); return (ENOENT); } /* * Exclusive lock is required by insmntque1 call in * null_nodeget() */ error = vn_lock(ldvp, LK_EXCLUSIVE); if (error != 0) { vrele(ldvp); vn_lock(vp, locked | LK_RETRY); return (ENOENT); } - vref(ldvp); error = null_nodeget(vp->v_mount, ldvp, dvp); if (error == 0) { #ifdef DIAGNOSTIC NULLVPTOLOWERVP(*dvp); #endif VOP_UNLOCK(*dvp, 0); /* keep reference on *dvp */ } vn_lock(vp, locked | LK_RETRY); return (error); } /* * Global vfs data structures */ struct vop_vector null_vnodeops = { .vop_bypass = null_bypass, .vop_access = null_access, .vop_accessx = null_accessx, .vop_advlockpurge = vop_stdadvlockpurge, .vop_bmap = VOP_EOPNOTSUPP, .vop_getattr = null_getattr, 
.vop_getwritemount = null_getwritemount, .vop_inactive = null_inactive, .vop_islocked = vop_stdislocked, .vop_lock1 = null_lock, .vop_lookup = null_lookup, .vop_open = null_open, .vop_print = null_print, .vop_reclaim = null_reclaim, .vop_remove = null_remove, .vop_rename = null_rename, .vop_rmdir = null_rmdir, .vop_setattr = null_setattr, .vop_strategy = VOP_EOPNOTSUPP, .vop_unlock = null_unlock, .vop_vptocnp = null_vptocnp, .vop_vptofh = null_vptofh, .vop_add_writecount = null_add_writecount, }; Index: projects/clang390-import/sys/i386/i386/pmap.c =================================================================== --- projects/clang390-import/sys/i386/i386/pmap.c (revision 305686) +++ projects/clang390-import/sys/i386/i386/pmap.c (revision 305687) @@ -1,5622 +1,5616 @@ /*- * Copyright (c) 1991 Regents of the University of California. * All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * Copyright (c) 2005-2010 Alan L. Cox * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ /*- * Copyright (c) 2003 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Jake Burkholder, * Safeport Network Services, and Network Associates Laboratories, the * Security Research Division of Network Associates, Inc. under * DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA * CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is * also stored by the logical address mapping module, * this module may throw away valid virtual-to-physical * mappings at almost any time. However, invalidations * of virtual-to-physical mappings must be done as * requested. * * In order to cope with hardware architectures which * make virtual-to-physical map invalidates expensive, * this module may delay invalidate or reduced protection * operations until such time as they are actually * necessary. This module is given full information as * to which processors are currently using which maps, * and to when physical maps must be made correct. */ #include "opt_apic.h" #include "opt_cpu.h" #include "opt_pmap.h" #include "opt_smp.h" #include "opt_xbox.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DEV_APIC #include #include #include #endif #include #include #include #include #include #ifdef SMP #include #endif #ifdef XBOX #include #endif #if !defined(CPU_DISABLE_SSE) && defined(I686_CPU) #define CPU_ENABLE_SSE #endif #ifndef PMAP_SHPGPERPROC #define PMAP_SHPGPERPROC 200 #endif #if !defined(DIAGNOSTIC) #ifdef __GNUC_GNU_INLINE__ #define PMAP_INLINE __attribute__((__gnu_inline__)) inline #else #define PMAP_INLINE extern inline #endif #else #define PMAP_INLINE #endif #ifdef PV_STATS #define PV_STAT(x) do { x ; } while (0) #else #define PV_STAT(x) do { } while (0) #endif #define pa_index(pa) ((pa) >> PDRSHIFT) #define pa_to_pvh(pa) (&pv_table[pa_index(pa)]) /* * Get PDEs and PTEs for user/kernel address space */ #define pmap_pde(m, v) (&((m)->pm_pdir[(vm_offset_t)(v) >> PDRSHIFT])) #define pdir_pde(m, v) (m[(vm_offset_t)(v) >> PDRSHIFT]) #define pmap_pde_v(pte) ((*(int *)pte & PG_V) != 0) #define pmap_pte_w(pte) ((*(int *)pte & PG_W) != 0) #define pmap_pte_m(pte) ((*(int *)pte & PG_M) != 0) #define pmap_pte_u(pte) ((*(int *)pte & PG_A) != 0) #define pmap_pte_v(pte) ((*(int *)pte & PG_V) != 0) #define pmap_pte_set_w(pte, v) ((v) ? 
atomic_set_int((u_int *)(pte), PG_W) : \ atomic_clear_int((u_int *)(pte), PG_W)) #define pmap_pte_set_prot(pte, v) ((*(int *)pte &= ~PG_PROT), (*(int *)pte |= (v))) struct pmap kernel_pmap_store; LIST_HEAD(pmaplist, pmap); static struct pmaplist allpmaps; static struct mtx allpmaps_lock; vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ int pgeflag = 0; /* PG_G or-in */ int pseflag = 0; /* PG_PS or-in */ static int nkpt = NKPT; vm_offset_t kernel_vm_end = KERNBASE + NKPT * NBPDR; extern u_int32_t KERNend; extern u_int32_t KPTphys; #if defined(PAE) || defined(PAE_TABLES) pt_entry_t pg_nx; static uma_zone_t pdptzone; #endif static SYSCTL_NODE(_vm, OID_AUTO, pmap, CTLFLAG_RD, 0, "VM/pmap parameters"); static int pat_works = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, pat_works, CTLFLAG_RD, &pat_works, 1, "Is page attribute table fully functional?"); static int pg_ps_enabled = 1; SYSCTL_INT(_vm_pmap, OID_AUTO, pg_ps_enabled, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pg_ps_enabled, 0, "Are large page mappings enabled?"); #define PAT_INDEX_SIZE 8 static int pat_index[PAT_INDEX_SIZE]; /* cache mode to PAT index conversion */ /* * pmap_mapdev support pre initialization (i.e. console) */ #define PMAP_PREINIT_MAPPING_COUNT 8 static struct pmap_preinit_mapping { vm_paddr_t pa; vm_offset_t va; vm_size_t sz; int mode; } pmap_preinit_mapping[PMAP_PREINIT_MAPPING_COUNT]; static int pmap_initialized; static struct rwlock_padalign pvh_global_lock; /* * Data for the pv entry allocation mechanism */ static TAILQ_HEAD(pch, pv_chunk) pv_chunks = TAILQ_HEAD_INITIALIZER(pv_chunks); static int pv_entry_count = 0, pv_entry_max = 0, pv_entry_high_water = 0; static struct md_page *pv_table; static int shpgperproc = PMAP_SHPGPERPROC; struct pv_chunk *pv_chunkbase; /* KVA block for pv_chunks */ int pv_maxchunks; /* How many chunks we have KVA for */ vm_offset_t pv_vafree; /* freelist stored in the PTE */ /* * All those kernel PT submaps that BSD is so fond of */ struct sysmaps { struct mtx lock; pt_entry_t *CMAP1; pt_entry_t *CMAP2; caddr_t CADDR1; caddr_t CADDR2; }; static struct sysmaps sysmaps_pcpu[MAXCPU]; pt_entry_t *CMAP3; static pd_entry_t *KPTD; caddr_t ptvmmap = 0; caddr_t CADDR3; struct msgbuf *msgbufp = NULL; /* * Crashdump maps. 
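 * These are MAXDUMPPGS pages of KVA reserved at boot so that a crash
 * dump can map arbitrary physical pages without allocating anything.
 */
#if 0
/*
 * Illustrative only, never built: the SYSMAP()-style bootstrap
 * allocation used in pmap_bootstrap() is just a cursor marching
 * through virtual address space.  The start address and the page
 * counts here are invented for the example.
 */
#include <stdio.h>

#define	PAGE_SIZE	4096UL

static unsigned long va_cursor = 0xc1000000UL;	/* made-up start VA */

static unsigned long
carve(int npages)
{
	unsigned long va = va_cursor;

	va_cursor += npages * PAGE_SIZE;
	return (va);
}

int
main(void)
{
	printf("CADDR3 at %#lx\n", carve(1));
	printf("crashdumpmap at %#lx\n", carve(32));	/* pretend
							   MAXDUMPPGS == 32 */
	return (0);
}
#endif
/*
 * Crashdump map KVA: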
*/ static caddr_t crashdumpmap; static pt_entry_t *PMAP1 = NULL, *PMAP2; static pt_entry_t *PADDR1 = NULL, *PADDR2; #ifdef SMP static int PMAP1cpu; static int PMAP1changedcpu; SYSCTL_INT(_debug, OID_AUTO, PMAP1changedcpu, CTLFLAG_RD, &PMAP1changedcpu, 0, "Number of times pmap_pte_quick changed CPU with same PMAP1"); #endif static int PMAP1changed; SYSCTL_INT(_debug, OID_AUTO, PMAP1changed, CTLFLAG_RD, &PMAP1changed, 0, "Number of times pmap_pte_quick changed PMAP1"); static int PMAP1unchanged; SYSCTL_INT(_debug, OID_AUTO, PMAP1unchanged, CTLFLAG_RD, &PMAP1unchanged, 0, "Number of times pmap_pte_quick didn't change PMAP1"); static struct mtx PMAP2mutex; static void free_pv_chunk(struct pv_chunk *pc); static void free_pv_entry(pmap_t pmap, pv_entry_t pv); static pv_entry_t get_pv_entry(pmap_t pmap, boolean_t try); static void pmap_pv_demote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa); static boolean_t pmap_pv_insert_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa); static void pmap_pv_promote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa); static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static int pmap_pvh_wired_mappings(struct md_page *pvh, int count); static boolean_t pmap_demote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va); static boolean_t pmap_enter_pde(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot); static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte); static void pmap_flush_page(vm_page_t m); static int pmap_insert_pt_page(pmap_t pmap, vm_page_t mpte); static void pmap_fill_ptp(pt_entry_t *firstpte, pt_entry_t newpte); static boolean_t pmap_is_modified_pvh(struct md_page *pvh); static boolean_t pmap_is_referenced_pvh(struct md_page *pvh); static void pmap_kenter_attr(vm_offset_t va, vm_paddr_t pa, int mode); static void pmap_kenter_pde(vm_offset_t va, pd_entry_t newpde); static vm_page_t pmap_lookup_pt_page(pmap_t pmap, vm_offset_t va); static void pmap_pde_attr(pd_entry_t *pde, int cache_bits); static void pmap_promote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va); static boolean_t pmap_protect_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t sva, vm_prot_t prot); static void pmap_pte_attr(pt_entry_t *pte, int cache_bits); static void pmap_remove_pde(pmap_t pmap, pd_entry_t *pdq, vm_offset_t sva, struct spglist *free); static int pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t sva, struct spglist *free); static void pmap_remove_pt_page(pmap_t pmap, vm_page_t mpte); static void pmap_remove_page(struct pmap *pmap, vm_offset_t va, struct spglist *free); static void pmap_remove_entry(struct pmap *pmap, vm_page_t m, vm_offset_t va); static void pmap_insert_entry(pmap_t pmap, vm_offset_t va, vm_page_t m); static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m); static void pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde); static void pmap_update_pde_invalidate(vm_offset_t va, pd_entry_t newpde); static vm_page_t pmap_allocpte(pmap_t pmap, vm_offset_t va, u_int flags); static vm_page_t _pmap_allocpte(pmap_t pmap, u_int ptepindex, u_int flags); static void _pmap_unwire_ptp(pmap_t pmap, vm_page_t m, struct spglist *free); static pt_entry_t *pmap_pte_quick(pmap_t pmap, vm_offset_t va); static void pmap_pte_release(pt_entry_t *pte); static int pmap_unuse_pt(pmap_t, vm_offset_t, struct spglist *); #if defined(PAE) || 
defined(PAE_TABLES) static void *pmap_pdpt_allocf(uma_zone_t zone, vm_size_t bytes, uint8_t *flags, int wait); #endif static void pmap_set_pg(void); static __inline void pagezero(void *page); CTASSERT(1 << PDESHIFT == sizeof(pd_entry_t)); CTASSERT(1 << PTESHIFT == sizeof(pt_entry_t)); /* * If you get an error here, then you set KVA_PAGES wrong! See the * description of KVA_PAGES in sys/i386/include/pmap.h. It must be * multiple of 4 for a normal kernel, or a multiple of 8 for a PAE. */ CTASSERT(KERNBASE % (1 << 24) == 0); /* * Bootstrap the system enough to run with virtual memory. * * On the i386 this is called after mapping has already been enabled * and just syncs the pmap module with what has already been done. * [We can't call it easily with mapping off since the kernel is not * mapped with PA == VA, hence we would have to relocate every address * from the linked base (virtual) address "KERNBASE" to the actual * (physical) address starting relative to 0] */ void pmap_bootstrap(vm_paddr_t firstaddr) { vm_offset_t va; pt_entry_t *pte, *unused; struct sysmaps *sysmaps; int i; /* * Add a physical memory segment (vm_phys_seg) corresponding to the * preallocated kernel page table pages so that vm_page structures * representing these pages will be created. The vm_page structures * are required for promotion of the corresponding kernel virtual * addresses to superpage mappings. */ vm_phys_add_seg(KPTphys, KPTphys + ptoa(nkpt)); /* * Initialize the first available kernel virtual address. However, * using "firstaddr" may waste a few pages of the kernel virtual * address space, because locore may not have mapped every physical * page that it allocated. Preferably, locore would provide a first * unused virtual address in addition to "firstaddr". */ virtual_avail = (vm_offset_t) KERNBASE + firstaddr; virtual_end = VM_MAX_KERNEL_ADDRESS; /* * Initialize the kernel pmap (which is statically allocated). */ PMAP_LOCK_INIT(kernel_pmap); kernel_pmap->pm_pdir = (pd_entry_t *) (KERNBASE + (u_int)IdlePTD); #if defined(PAE) || defined(PAE_TABLES) kernel_pmap->pm_pdpt = (pdpt_entry_t *) (KERNBASE + (u_int)IdlePDPT); #endif CPU_FILL(&kernel_pmap->pm_active); /* don't allow deactivation */ TAILQ_INIT(&kernel_pmap->pm_pvchunk); /* * Initialize the global pv list lock. */ rw_init(&pvh_global_lock, "pmap pv global"); LIST_INIT(&allpmaps); /* * Request a spin mutex so that changes to allpmaps cannot be * preempted by smp_rendezvous_cpus(). Otherwise, * pmap_update_pde_kernel() could access allpmaps while it is * being changed. */ mtx_init(&allpmaps_lock, "allpmaps", NULL, MTX_SPIN); mtx_lock_spin(&allpmaps_lock); LIST_INSERT_HEAD(&allpmaps, kernel_pmap, pm_list); mtx_unlock_spin(&allpmaps_lock); /* * Reserve some special page table entries/VA space for temporary * mapping of pages. */ #define SYSMAP(c, p, v, n) \ v = (c)va; va += ((n)*PAGE_SIZE); p = pte; pte += (n); va = virtual_avail; pte = vtopte(va); /* * CMAP1/CMAP2 are used for zeroing and copying pages. * CMAP3 is used for the boot-time memory test. */ for (i = 0; i < MAXCPU; i++) { sysmaps = &sysmaps_pcpu[i]; mtx_init(&sysmaps->lock, "SYSMAPS", NULL, MTX_DEF); SYSMAP(caddr_t, sysmaps->CMAP1, sysmaps->CADDR1, 1) SYSMAP(caddr_t, sysmaps->CMAP2, sysmaps->CADDR2, 1) } SYSMAP(caddr_t, CMAP3, CADDR3, 1); /* * Crashdump maps. */ SYSMAP(caddr_t, unused, crashdumpmap, MAXDUMPPGS) /* * ptvmmap is used for reading arbitrary physical pages via /dev/mem. */ SYSMAP(caddr_t, unused, ptvmmap, 1) /* * msgbufp is used to map the system message buffer. 
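 * For example, a 96 KB message buffer reserves
 * atop(round_page(98304)) == 24 pages of KVA here (with 4 KB pages).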
 */
	SYSMAP(struct msgbuf *, unused, msgbufp, atop(round_page(msgbufsize)))

	/*
	 * KPTmap is used by pmap_kextract().
	 *
	 * KPTmap is first initialized by locore.  However, that initial
	 * KPTmap can only support NKPT page table pages.  Here, a larger
	 * KPTmap is created that can support KVA_PAGES page table pages.
	 */
	SYSMAP(pt_entry_t *, KPTD, KPTmap, KVA_PAGES)

	for (i = 0; i < NKPT; i++)
		KPTD[i] = (KPTphys + (i << PAGE_SHIFT)) | pgeflag | PG_RW | PG_V;

	/*
	 * Adjust the start of the KPTD and KPTmap so that the implementation
	 * of pmap_kextract() and pmap_growkernel() can be made simpler.
	 */
	KPTD -= KPTDI;
	KPTmap -= i386_btop(KPTDI << PDRSHIFT);

	/*
	 * PADDR1 and PADDR2 are used by pmap_pte_quick() and pmap_pte(),
	 * respectively.
	 */
	SYSMAP(pt_entry_t *, PMAP1, PADDR1, 1)
	SYSMAP(pt_entry_t *, PMAP2, PADDR2, 1)

	mtx_init(&PMAP2mutex, "PMAP2", NULL, MTX_DEF);

	virtual_avail = va;

	/*
	 * Leave in place an identity mapping (virt == phys) for the low 1 MB
	 * physical memory region that is used by the ACPI wakeup code.  This
	 * mapping must not have PG_G set.
	 */
#ifdef XBOX
	/* FIXME: This is gross, but needed for the XBOX.  Since we are at
	 * such an early stage, we cannot yet neatly map video memory ... :-(
	 * Better fixes are very welcome! */
	if (!arch_i386_is_xbox)
#endif
	for (i = 1; i < NKPT; i++)
		PTD[i] = 0;

	/* Initialize the PAT MSR if present. */
	pmap_init_pat();

	/* Turn on PG_G on kernel page(s) */
	pmap_set_pg();
}

static void
pmap_init_qpages(void)
{
	struct pcpu *pc;
	int i;

	CPU_FOREACH(i) {
		pc = pcpu_find(i);
		pc->pc_qmap_addr = kva_alloc(PAGE_SIZE);
		if (pc->pc_qmap_addr == 0)
			panic("pmap_init_qpages: unable to allocate KVA");
	}
}
SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, pmap_init_qpages, NULL);

/*
 * Set up the PAT MSR.
 */
void
pmap_init_pat(void)
{
	int pat_table[PAT_INDEX_SIZE];
	uint64_t pat_msr;
	u_long cr0, cr4;
	int i;

	/* Set default PAT index table. */
	for (i = 0; i < PAT_INDEX_SIZE; i++)
		pat_table[i] = -1;
	pat_table[PAT_WRITE_BACK] = 0;
	pat_table[PAT_WRITE_THROUGH] = 1;
	pat_table[PAT_UNCACHEABLE] = 3;
	pat_table[PAT_WRITE_COMBINING] = 3;
	pat_table[PAT_WRITE_PROTECTED] = 3;
	pat_table[PAT_UNCACHED] = 3;

	/* Bail if this CPU doesn't implement PAT. */
	if ((cpu_feature & CPUID_PAT) == 0) {
		for (i = 0; i < PAT_INDEX_SIZE; i++)
			pat_index[i] = pat_table[i];
		pat_works = 0;
		return;
	}

	/*
	 * Due to some Intel errata, we can only safely use the lower 4
	 * PAT entries.
	 *
	 *   Intel Pentium III Processor Specification Update
	 * Errata E.27 (Upper Four PAT Entries Not Usable With Mode B
	 * or Mode C Paging)
	 *
	 *   Intel Pentium IV Processor Specification Update
	 * Errata N46 (PAT Index MSB May Be Calculated Incorrectly)
	 */
	if (cpu_vendor_id == CPU_VENDOR_INTEL &&
	    !(CPUID_TO_FAMILY(cpu_id) == 6 && CPUID_TO_MODEL(cpu_id) >= 0xe))
		pat_works = 0;

	/* Initialize default PAT entries. */
	pat_msr = PAT_VALUE(0, PAT_WRITE_BACK) |
	    PAT_VALUE(1, PAT_WRITE_THROUGH) |
	    PAT_VALUE(2, PAT_UNCACHED) |
	    PAT_VALUE(3, PAT_UNCACHEABLE) |
	    PAT_VALUE(4, PAT_WRITE_BACK) |
	    PAT_VALUE(5, PAT_WRITE_THROUGH) |
	    PAT_VALUE(6, PAT_UNCACHED) |
	    PAT_VALUE(7, PAT_UNCACHEABLE);

	if (pat_works) {
		/*
		 * Leave the indices 0-3 at the default of WB, WT, UC-, and UC.
		 * Program 5 and 6 as WP and WC.
		 * Leave 4 and 7 as WB and UC.
		 */
		pat_msr &= ~(PAT_MASK(5) | PAT_MASK(6));
		pat_msr |= PAT_VALUE(5, PAT_WRITE_PROTECTED) |
		    PAT_VALUE(6, PAT_WRITE_COMBINING);
		pat_table[PAT_UNCACHED] = 2;
		pat_table[PAT_WRITE_PROTECTED] = 5;
		pat_table[PAT_WRITE_COMBINING] = 6;
	} else {
		/*
		 * Just replace PAT Index 2 with WC instead of UC-.
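		 * In a PTE, index 2 is the entry selected by PCD=1, PWT=0
		 * (cf. pmap_cache_bits()), so repurposing that entry makes
		 * WC reachable even when only the lower four PAT entries
		 * are usable.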
*/ pat_msr &= ~PAT_MASK(2); pat_msr |= PAT_VALUE(2, PAT_WRITE_COMBINING); pat_table[PAT_WRITE_COMBINING] = 2; } /* Disable PGE. */ cr4 = rcr4(); load_cr4(cr4 & ~CR4_PGE); /* Disable caches (CD = 1, NW = 0). */ cr0 = rcr0(); load_cr0((cr0 & ~CR0_NW) | CR0_CD); /* Flushes caches and TLBs. */ wbinvd(); invltlb(); /* Update PAT and index table. */ wrmsr(MSR_PAT, pat_msr); for (i = 0; i < PAT_INDEX_SIZE; i++) pat_index[i] = pat_table[i]; /* Flush caches and TLBs again. */ wbinvd(); invltlb(); /* Restore caches and PGE. */ load_cr0(cr0); load_cr4(cr4); } /* * Set PG_G on kernel pages. Only the BSP calls this when SMP is turned on. */ static void pmap_set_pg(void) { pt_entry_t *pte; vm_offset_t va, endva; if (pgeflag == 0) return; endva = KERNBASE + KERNend; if (pseflag) { va = KERNBASE + KERNLOAD; while (va < endva) { pdir_pde(PTD, va) |= pgeflag; invltlb(); /* Flush non-PG_G entries. */ va += NBPDR; } } else { va = (vm_offset_t)btext; while (va < endva) { pte = vtopte(va); if (*pte) *pte |= pgeflag; invltlb(); /* Flush non-PG_G entries. */ va += PAGE_SIZE; } } } /* * Initialize a vm_page's machine-dependent fields. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.pv_list); m->md.pat_mode = PAT_WRITE_BACK; } #if defined(PAE) || defined(PAE_TABLES) static void * pmap_pdpt_allocf(uma_zone_t zone, vm_size_t bytes, uint8_t *flags, int wait) { /* Inform UMA that this allocator uses kernel_map/object. */ *flags = UMA_SLAB_KERNEL; return ((void *)kmem_alloc_contig(kernel_arena, bytes, wait, 0x0ULL, 0xffffffffULL, 1, 0, VM_MEMATTR_DEFAULT)); } #endif /* * Abuse the pte nodes for unmapped kva to thread a kva freelist through. * Requirements: * - Must deal with pages in order to ensure that none of the PG_* bits * are ever set, PG_V in particular. * - Assumes we can write to ptes without pte_store() atomic ops, even * on PAE systems. This should be ok. * - Assumes nothing will ever test these addresses for 0 to indicate * no mapping instead of correctly checking PG_V. * - Assumes a vm_offset_t will fit in a pte (true for i386). * Because PG_V is never set, there can be no mappings to invalidate. */ static vm_offset_t pmap_ptelist_alloc(vm_offset_t *head) { pt_entry_t *pte; vm_offset_t va; va = *head; if (va == 0) panic("pmap_ptelist_alloc: exhausted ptelist KVA"); pte = vtopte(va); *head = *pte; if (*head & PG_V) panic("pmap_ptelist_alloc: va with PG_V set!"); *pte = 0; return (va); } static void pmap_ptelist_free(vm_offset_t *head, vm_offset_t va) { pt_entry_t *pte; if (va & PG_V) panic("pmap_ptelist_free: freeing va with PG_V set!"); pte = vtopte(va); *pte = *head; /* virtual! PG_V is 0 though */ *head = va; } static void pmap_ptelist_init(vm_offset_t *head, void *base, int npages) { int i; vm_offset_t va; *head = 0; for (i = npages - 1; i >= 0; i--) { va = (vm_offset_t)base + i * PAGE_SIZE; pmap_ptelist_free(head, va); } } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ void pmap_init(void) { struct pmap_preinit_mapping *ppim; vm_page_t mpte; vm_size_t s; int i, pv_npg; /* * Initialize the vm page array entries for the kernel pmap's * page table pages. */ for (i = 0; i < NKPT; i++) { mpte = PHYS_TO_VM_PAGE(KPTphys + (i << PAGE_SHIFT)); KASSERT(mpte >= vm_page_array && mpte < &vm_page_array[vm_page_array_size], ("pmap_init: page table page is out of range")); mpte->pindex = i + KPTDI; mpte->phys_addr = KPTphys + (i << PAGE_SHIFT); } /* * Initialize the address space (zone) for the pv entries. 
Set a * high water mark so that the system can recover from excessive * numbers of pv entries. */ TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc); pv_entry_max = shpgperproc * maxproc + vm_cnt.v_page_count; TUNABLE_INT_FETCH("vm.pmap.pv_entries", &pv_entry_max); pv_entry_max = roundup(pv_entry_max, _NPCPV); pv_entry_high_water = 9 * (pv_entry_max / 10); /* * If the kernel is running on a virtual machine, then it must assume * that MCA is enabled by the hypervisor. Moreover, the kernel must * be prepared for the hypervisor changing the vendor and family that * are reported by CPUID. Consequently, the workaround for AMD Family * 10h Erratum 383 is enabled if the processor's feature set does not * include at least one feature that is only supported by older Intel * or newer AMD processors. */ if (vm_guest != VM_GUEST_NO && (cpu_feature & CPUID_SS) == 0 && (cpu_feature2 & (CPUID2_SSSE3 | CPUID2_SSE41 | CPUID2_AESNI | CPUID2_AVX | CPUID2_XSAVE)) == 0 && (amd_feature2 & (AMDID2_XOP | AMDID2_FMA4)) == 0) workaround_erratum383 = 1; /* * Are large page mappings supported and enabled? */ TUNABLE_INT_FETCH("vm.pmap.pg_ps_enabled", &pg_ps_enabled); if (pseflag == 0) pg_ps_enabled = 0; else if (pg_ps_enabled) { KASSERT(MAXPAGESIZES > 1 && pagesizes[1] == 0, ("pmap_init: can't assign to pagesizes[1]")); pagesizes[1] = NBPDR; } /* * Calculate the size of the pv head table for superpages. * Handle the possibility that "vm_phys_segs[...].end" is zero. */ pv_npg = trunc_4mpage(vm_phys_segs[vm_phys_nsegs - 1].end - PAGE_SIZE) / NBPDR + 1; /* * Allocate memory for the pv head table for superpages. */ s = (vm_size_t)(pv_npg * sizeof(struct md_page)); s = round_page(s); pv_table = (struct md_page *)kmem_malloc(kernel_arena, s, M_WAITOK | M_ZERO); for (i = 0; i < pv_npg; i++) TAILQ_INIT(&pv_table[i].pv_list); pv_maxchunks = MAX(pv_entry_max / _NPCPV, maxproc); pv_chunkbase = (struct pv_chunk *)kva_alloc(PAGE_SIZE * pv_maxchunks); if (pv_chunkbase == NULL) panic("pmap_init: not enough kvm for pv chunks"); pmap_ptelist_init(&pv_vafree, pv_chunkbase, pv_maxchunks); #if defined(PAE) || defined(PAE_TABLES) pdptzone = uma_zcreate("PDPT", NPGPTD * sizeof(pdpt_entry_t), NULL, NULL, NULL, NULL, (NPGPTD * sizeof(pdpt_entry_t)) - 1, UMA_ZONE_VM | UMA_ZONE_NOFREE); uma_zone_set_allocf(pdptzone, pmap_pdpt_allocf); #endif pmap_initialized = 1; if (!bootverbose) return; for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == 0) continue; printf("PPIM %u: PA=%#jx, VA=%#x, size=%#x, mode=%#x\n", i, (uintmax_t)ppim->pa, ppim->va, ppim->sz, ppim->mode); } } SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_max, CTLFLAG_RD, &pv_entry_max, 0, "Max number of PV entries"); SYSCTL_INT(_vm_pmap, OID_AUTO, shpgperproc, CTLFLAG_RD, &shpgperproc, 0, "Page share factor per proc"); static SYSCTL_NODE(_vm_pmap, OID_AUTO, pde, CTLFLAG_RD, 0, "2/4MB page mapping counters"); static u_long pmap_pde_demotions; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, demotions, CTLFLAG_RD, &pmap_pde_demotions, 0, "2/4MB page demotions"); static u_long pmap_pde_mappings; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, mappings, CTLFLAG_RD, &pmap_pde_mappings, 0, "2/4MB page mappings"); static u_long pmap_pde_p_failures; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, p_failures, CTLFLAG_RD, &pmap_pde_p_failures, 0, "2/4MB page promotion failures"); static u_long pmap_pde_promotions; SYSCTL_ULONG(_vm_pmap_pde, OID_AUTO, promotions, CTLFLAG_RD, &pmap_pde_promotions, 0, "2/4MB page promotions"); /*************************************************** * Low 
level helper routines.....
 ***************************************************/

/*
 * Determine the appropriate bits to set in a PTE or PDE for a specified
 * caching mode.
 */
int
pmap_cache_bits(int mode, boolean_t is_pde)
{
	int cache_bits, pat_flag, pat_idx;

	if (mode < 0 || mode >= PAT_INDEX_SIZE || pat_index[mode] < 0)
		panic("Unknown caching mode %d\n", mode);

	/* The PAT bit is different for PTEs and PDEs. */
	pat_flag = is_pde ? PG_PDE_PAT : PG_PTE_PAT;

	/* Map the caching mode to a PAT index. */
	pat_idx = pat_index[mode];

	/* Map the 3-bit index value into the PAT, PCD, and PWT bits. */
	cache_bits = 0;
	if (pat_idx & 0x4)
		cache_bits |= pat_flag;
	if (pat_idx & 0x2)
		cache_bits |= PG_NC_PCD;
	if (pat_idx & 0x1)
		cache_bits |= PG_NC_PWT;
	return (cache_bits);
}

/*
 * The caller is responsible for maintaining TLB consistency.
 */
static void
pmap_kenter_pde(vm_offset_t va, pd_entry_t newpde)
{
	pd_entry_t *pde;
	pmap_t pmap;
	boolean_t PTD_updated;

	PTD_updated = FALSE;
	mtx_lock_spin(&allpmaps_lock);
	LIST_FOREACH(pmap, &allpmaps, pm_list) {
		if ((pmap->pm_pdir[PTDPTDI] & PG_FRAME) == (PTDpde[0] &
		    PG_FRAME))
			PTD_updated = TRUE;
		pde = pmap_pde(pmap, va);
		pde_store(pde, newpde);
	}
	mtx_unlock_spin(&allpmaps_lock);
	KASSERT(PTD_updated,
	    ("pmap_kenter_pde: current page table is not in allpmaps"));
}

/*
 * After changing the page size for the specified virtual address in the page
 * table, flush the corresponding entries from the processor's TLB.  Only the
 * calling processor's TLB is affected.
 *
 * The calling thread must be pinned to a processor.
 */
static void
pmap_update_pde_invalidate(vm_offset_t va, pd_entry_t newpde)
{
	u_long cr4;

	if ((newpde & PG_PS) == 0)
		/* Demotion: flush a specific 2- or 4MB page mapping. */
		invlpg(va);
	else if ((newpde & PG_G) == 0)
		/*
		 * Promotion: flush every 4KB page mapping from the TLB
		 * because there are too many to flush individually.
		 */
		invltlb();
	else {
		/*
		 * Promotion: flush every 4KB page mapping from the TLB,
		 * including any global (PG_G) mappings.
		 */
		cr4 = rcr4();
		load_cr4(cr4 & ~CR4_PGE);
		/*
		 * Although preemption at this point could be detrimental to
		 * performance, it would not lead to an error.  PG_G is simply
		 * ignored if CR4.PGE is clear.  Moreover, in case this block
		 * is re-entered, the load_cr4() either above or below will
		 * modify CR4.PGE flushing the TLB.
		 */
		load_cr4(cr4 | CR4_PGE);
	}
}

void
invltlb_glob(void)
{
	uint64_t cr4;

	if (pgeflag == 0) {
		invltlb();
	} else {
		cr4 = rcr4();
		load_cr4(cr4 & ~CR4_PGE);
		load_cr4(cr4 | CR4_PGE);
	}
}

#ifdef SMP
/*
 * For SMP, these functions have to use the IPI mechanism for coherence.
 *
 * N.B.: Before calling any of the following TLB invalidation functions,
 * the calling processor must ensure that all stores updating a non-
 * kernel page table are globally performed.  Otherwise, another
 * processor could cache an old, pre-update entry without being
 * invalidated.  This can happen one of two ways: (1) The pmap becomes
 * active on another processor after its pm_active field is checked by
 * one of the following functions but before a store updating the page
 * table is globally performed. (2) The pmap becomes active on another
 * processor before its pm_active field is checked but due to
 * speculative loads one of the following functions still reads the
 * pmap as inactive on the other processor.
 *
 * The kernel page table is exempt because its pm_active field is
 * immutable.  The kernel page table is always active on every
 * processor.
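 *
 * An illustrative sketch of the required ordering (not a verbatim
 * caller from this file):
 *
 *	pte_store(pte, newpte);		update the page table first,
 *	pmap_invalidate_page(pmap, va);	then request the shootdown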
*/ void pmap_invalidate_page(pmap_t pmap, vm_offset_t va) { cpuset_t *mask, other_cpus; u_int cpuid; sched_pin(); if (pmap == kernel_pmap || !CPU_CMP(&pmap->pm_active, &all_cpus)) { invlpg(va); mask = &all_cpus; } else { cpuid = PCPU_GET(cpuid); other_cpus = all_cpus; CPU_CLR(cpuid, &other_cpus); if (CPU_ISSET(cpuid, &pmap->pm_active)) invlpg(va); CPU_AND(&other_cpus, &pmap->pm_active); mask = &other_cpus; } smp_masked_invlpg(*mask, va); sched_unpin(); } /* 4k PTEs -- Chosen to exceed the total size of Broadwell L2 TLB */ #define PMAP_INVLPG_THRESHOLD (4 * 1024 * PAGE_SIZE) void pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { cpuset_t *mask, other_cpus; vm_offset_t addr; u_int cpuid; if (eva - sva >= PMAP_INVLPG_THRESHOLD) { pmap_invalidate_all(pmap); return; } sched_pin(); if (pmap == kernel_pmap || !CPU_CMP(&pmap->pm_active, &all_cpus)) { for (addr = sva; addr < eva; addr += PAGE_SIZE) invlpg(addr); mask = &all_cpus; } else { cpuid = PCPU_GET(cpuid); other_cpus = all_cpus; CPU_CLR(cpuid, &other_cpus); if (CPU_ISSET(cpuid, &pmap->pm_active)) for (addr = sva; addr < eva; addr += PAGE_SIZE) invlpg(addr); CPU_AND(&other_cpus, &pmap->pm_active); mask = &other_cpus; } smp_masked_invlpg_range(*mask, sva, eva); sched_unpin(); } void pmap_invalidate_all(pmap_t pmap) { cpuset_t *mask, other_cpus; u_int cpuid; sched_pin(); if (pmap == kernel_pmap) { invltlb_glob(); mask = &all_cpus; } else if (!CPU_CMP(&pmap->pm_active, &all_cpus)) { invltlb(); mask = &all_cpus; } else { cpuid = PCPU_GET(cpuid); other_cpus = all_cpus; CPU_CLR(cpuid, &other_cpus); if (CPU_ISSET(cpuid, &pmap->pm_active)) invltlb(); CPU_AND(&other_cpus, &pmap->pm_active); mask = &other_cpus; } smp_masked_invltlb(*mask, pmap); sched_unpin(); } void pmap_invalidate_cache(void) { sched_pin(); wbinvd(); smp_cache_flush(); sched_unpin(); } struct pde_action { cpuset_t invalidate; /* processors that invalidate their TLB */ vm_offset_t va; pd_entry_t *pde; pd_entry_t newpde; u_int store; /* processor that updates the PDE */ }; static void pmap_update_pde_kernel(void *arg) { struct pde_action *act = arg; pd_entry_t *pde; pmap_t pmap; if (act->store == PCPU_GET(cpuid)) { /* * Elsewhere, this operation requires allpmaps_lock for * synchronization. Here, it does not because it is being * performed in the context of an all_cpus rendezvous. */ LIST_FOREACH(pmap, &allpmaps, pm_list) { pde = pmap_pde(pmap, act->va); pde_store(pde, act->newpde); } } } static void pmap_update_pde_user(void *arg) { struct pde_action *act = arg; if (act->store == PCPU_GET(cpuid)) pde_store(act->pde, act->newpde); } static void pmap_update_pde_teardown(void *arg) { struct pde_action *act = arg; if (CPU_ISSET(PCPU_GET(cpuid), &act->invalidate)) pmap_update_pde_invalidate(act->va, act->newpde); } /* * Change the page size for the specified virtual address in a way that * prevents any possibility of the TLB ever having two entries that map the * same virtual address using different page sizes. This is the recommended * workaround for Erratum 383 on AMD Family 10h processors. It prevents a * machine check exception for a TLB state that is improperly diagnosed as a * hardware error. 
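 *
 * To that end, the SMP case below stages the update through an all-CPU
 * rendezvous: the affected processors rendezvous, exactly one of them
 * (act->store) writes the new PDE, and then each processor in
 * act->invalidate flushes its own TLB in the teardown step.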
*/ static void pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde) { struct pde_action act; cpuset_t active, other_cpus; u_int cpuid; sched_pin(); cpuid = PCPU_GET(cpuid); other_cpus = all_cpus; CPU_CLR(cpuid, &other_cpus); if (pmap == kernel_pmap) active = all_cpus; else active = pmap->pm_active; if (CPU_OVERLAP(&active, &other_cpus)) { act.store = cpuid; act.invalidate = active; act.va = va; act.pde = pde; act.newpde = newpde; CPU_SET(cpuid, &active); smp_rendezvous_cpus(active, smp_no_rendevous_barrier, pmap == kernel_pmap ? pmap_update_pde_kernel : pmap_update_pde_user, pmap_update_pde_teardown, &act); } else { if (pmap == kernel_pmap) pmap_kenter_pde(va, newpde); else pde_store(pde, newpde); if (CPU_ISSET(cpuid, &active)) pmap_update_pde_invalidate(va, newpde); } sched_unpin(); } #else /* !SMP */ /* * Normal, non-SMP, 486+ invalidation functions. * We inline these within pmap.c for speed. */ PMAP_INLINE void pmap_invalidate_page(pmap_t pmap, vm_offset_t va) { if (pmap == kernel_pmap || !CPU_EMPTY(&pmap->pm_active)) invlpg(va); } PMAP_INLINE void pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t addr; if (pmap == kernel_pmap || !CPU_EMPTY(&pmap->pm_active)) for (addr = sva; addr < eva; addr += PAGE_SIZE) invlpg(addr); } PMAP_INLINE void pmap_invalidate_all(pmap_t pmap) { if (pmap == kernel_pmap) invltlb_glob(); else if (!CPU_EMPTY(&pmap->pm_active)) invltlb(); } PMAP_INLINE void pmap_invalidate_cache(void) { wbinvd(); } static void pmap_update_pde(pmap_t pmap, vm_offset_t va, pd_entry_t *pde, pd_entry_t newpde) { if (pmap == kernel_pmap) pmap_kenter_pde(va, newpde); else pde_store(pde, newpde); if (pmap == kernel_pmap || !CPU_EMPTY(&pmap->pm_active)) pmap_update_pde_invalidate(va, newpde); } #endif /* !SMP */ #define PMAP_CLFLUSH_THRESHOLD (2 * 1024 * 1024) void pmap_invalidate_cache_range(vm_offset_t sva, vm_offset_t eva, boolean_t force) { if (force) { sva &= ~(vm_offset_t)cpu_clflush_line_size; } else { KASSERT((sva & PAGE_MASK) == 0, ("pmap_invalidate_cache_range: sva not page-aligned")); KASSERT((eva & PAGE_MASK) == 0, ("pmap_invalidate_cache_range: eva not page-aligned")); } if ((cpu_feature & CPUID_SS) != 0 && !force) ; /* If "Self Snoop" is supported and allowed, do nothing. */ else if ((cpu_stdext_feature & CPUID_STDEXT_CLFLUSHOPT) != 0 && eva - sva < PMAP_CLFLUSH_THRESHOLD) { #ifdef DEV_APIC /* * XXX: Some CPUs fault, hang, or trash the local APIC * registers if we use CLFLUSH on the local APIC * range. The local APIC is always uncached, so we * don't need to flush for that range anyway. */ if (pmap_kextract(sva) == lapic_paddr) return; #endif /* * Otherwise, do per-cache line flush. Use the mfence * instruction to insure that previous stores are * included in the write-back. The processor * propagates flush to other processors in the cache * coherence domain. */ mfence(); for (; sva < eva; sva += cpu_clflush_line_size) clflushopt(sva); mfence(); } else if ((cpu_feature & CPUID_CLFSH) != 0 && eva - sva < PMAP_CLFLUSH_THRESHOLD) { #ifdef DEV_APIC if (pmap_kextract(sva) == lapic_paddr) return; #endif /* * Writes are ordered by CLFLUSH on Intel CPUs. */ if (cpu_vendor_id != CPU_VENDOR_INTEL) mfence(); for (; sva < eva; sva += cpu_clflush_line_size) clflush(sva); if (cpu_vendor_id != CPU_VENDOR_INTEL) mfence(); } else { /* * No targeted cache flush methods are supported by CPU, * or the supplied range is bigger than 2MB. * Globally invalidate cache. 
*/ pmap_invalidate_cache(); } } void pmap_invalidate_cache_pages(vm_page_t *pages, int count) { int i; if (count >= PMAP_CLFLUSH_THRESHOLD / PAGE_SIZE || (cpu_feature & CPUID_CLFSH) == 0) { pmap_invalidate_cache(); } else { for (i = 0; i < count; i++) pmap_flush_page(pages[i]); } } /* * Are we current address space or kernel? */ static __inline int pmap_is_current(pmap_t pmap) { return (pmap == kernel_pmap || pmap == vmspace_pmap(curthread->td_proc->p_vmspace)); } /* * If the given pmap is not the current or kernel pmap, the returned pte must * be released by passing it to pmap_pte_release(). */ pt_entry_t * pmap_pte(pmap_t pmap, vm_offset_t va) { pd_entry_t newpf; pd_entry_t *pde; pde = pmap_pde(pmap, va); if (*pde & PG_PS) return (pde); if (*pde != 0) { /* are we current address space or kernel? */ if (pmap_is_current(pmap)) return (vtopte(va)); mtx_lock(&PMAP2mutex); newpf = *pde & PG_FRAME; if ((*PMAP2 & PG_FRAME) != newpf) { *PMAP2 = newpf | PG_RW | PG_V | PG_A | PG_M; pmap_invalidate_page(kernel_pmap, (vm_offset_t)PADDR2); } return (PADDR2 + (i386_btop(va) & (NPTEPG - 1))); } return (NULL); } /* * Releases a pte that was obtained from pmap_pte(). Be prepared for the pte * being NULL. */ static __inline void pmap_pte_release(pt_entry_t *pte) { if ((pt_entry_t *)((vm_offset_t)pte & ~PAGE_MASK) == PADDR2) mtx_unlock(&PMAP2mutex); } /* * NB: The sequence of updating a page table followed by accesses to the * corresponding pages is subject to the situation described in the "AMD64 * Architecture Programmer's Manual Volume 2: System Programming" rev. 3.23, * "7.3.1 Special Coherency Considerations". Therefore, issuing the INVLPG * right after modifying the PTE bits is crucial. */ static __inline void invlcaddr(void *caddr) { invlpg((u_int)caddr); } /* * Super fast pmap_pte routine best used when scanning * the pv lists. This eliminates many coarse-grained * invltlb calls. Note that many of the pv list * scans are across different pmaps. It is very wasteful * to do an entire invltlb for checking a single mapping. * * If the given pmap is not the current pmap, pvh_global_lock * must be held and curthread pinned to a CPU. */ static pt_entry_t * pmap_pte_quick(pmap_t pmap, vm_offset_t va) { pd_entry_t newpf; pd_entry_t *pde; pde = pmap_pde(pmap, va); if (*pde & PG_PS) return (pde); if (*pde != 0) { /* are we current address space or kernel? */ if (pmap_is_current(pmap)) return (vtopte(va)); rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT(curthread->td_pinned > 0, ("curthread not pinned")); newpf = *pde & PG_FRAME; if ((*PMAP1 & PG_FRAME) != newpf) { *PMAP1 = newpf | PG_RW | PG_V | PG_A | PG_M; #ifdef SMP PMAP1cpu = PCPU_GET(cpuid); #endif invlcaddr(PADDR1); PMAP1changed++; } else #ifdef SMP if (PMAP1cpu != PCPU_GET(cpuid)) { PMAP1cpu = PCPU_GET(cpuid); invlcaddr(PADDR1); PMAP1changedcpu++; } else #endif PMAP1unchanged++; return (PADDR1 + (i386_btop(va) & (NPTEPG - 1))); } return (0); } /* * Routine: pmap_extract * Function: * Extract the physical page address associated * with the given map/virtual_address pair. 
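 *
 * For a PG_PS (4MB) mapping the low PDRMASK bits of the virtual
 * address supply the offset into the superpage; for an ordinary 4KB
 * mapping only the low PAGE_MASK bits do.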
*/ vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va) { vm_paddr_t rtval; pt_entry_t *pte; pd_entry_t pde; rtval = 0; PMAP_LOCK(pmap); pde = pmap->pm_pdir[va >> PDRSHIFT]; if (pde != 0) { if ((pde & PG_PS) != 0) rtval = (pde & PG_PS_FRAME) | (va & PDRMASK); else { pte = pmap_pte(pmap, va); rtval = (*pte & PG_FRAME) | (va & PAGE_MASK); pmap_pte_release(pte); } } PMAP_UNLOCK(pmap); return (rtval); } /* * Routine: pmap_extract_and_hold * Function: * Atomically extract and hold the physical page * with the given pmap and virtual address pair * if that mapping permits the given protection. */ vm_page_t pmap_extract_and_hold(pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pd_entry_t pde; pt_entry_t pte, *ptep; vm_page_t m; vm_paddr_t pa; pa = 0; m = NULL; PMAP_LOCK(pmap); retry: pde = *pmap_pde(pmap, va); if (pde != 0) { if (pde & PG_PS) { if ((pde & PG_RW) || (prot & VM_PROT_WRITE) == 0) { if (vm_page_pa_tryrelock(pmap, (pde & PG_PS_FRAME) | (va & PDRMASK), &pa)) goto retry; m = PHYS_TO_VM_PAGE((pde & PG_PS_FRAME) | (va & PDRMASK)); vm_page_hold(m); } } else { ptep = pmap_pte(pmap, va); pte = *ptep; pmap_pte_release(ptep); if (pte != 0 && ((pte & PG_RW) || (prot & VM_PROT_WRITE) == 0)) { if (vm_page_pa_tryrelock(pmap, pte & PG_FRAME, &pa)) goto retry; m = PHYS_TO_VM_PAGE(pte & PG_FRAME); vm_page_hold(m); } } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } /*************************************************** * Low level mapping routines..... ***************************************************/ /* * Add a wired page to the kva. * Note: not SMP coherent. * * This function may be used before pmap_bootstrap() is called. */ PMAP_INLINE void pmap_kenter(vm_offset_t va, vm_paddr_t pa) { pt_entry_t *pte; pte = vtopte(va); pte_store(pte, pa | PG_RW | PG_V | pgeflag); } static __inline void pmap_kenter_attr(vm_offset_t va, vm_paddr_t pa, int mode) { pt_entry_t *pte; pte = vtopte(va); pte_store(pte, pa | PG_RW | PG_V | pgeflag | pmap_cache_bits(mode, 0)); } /* * Remove a page from the kernel pagetables. * Note: not SMP coherent. * * This function may be used before pmap_bootstrap() is called. */ PMAP_INLINE void pmap_kremove(vm_offset_t va) { pt_entry_t *pte; pte = vtopte(va); pte_clear(pte); } /* * Used to map a range of physical addresses into kernel * virtual address space. * * The value passed in '*virt' is a suggested virtual address for * the mapping. Architectures which can support a direct-mapped * physical to virtual region can return the appropriate address * within that region, leaving '*virt' unchanged. Other * architectures should map the pages starting at '*virt' and * update '*virt' with the first usable address after the mapped * region. */ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { vm_offset_t va, sva; vm_paddr_t superpage_offset; pd_entry_t newpde; va = *virt; /* * Does the physical address range's size and alignment permit at * least one superpage mapping to be created? */ superpage_offset = start & PDRMASK; if ((end - start) - ((NBPDR - superpage_offset) & PDRMASK) >= NBPDR) { /* * Increase the starting virtual address so that its alignment * does not preclude the use of superpage mappings. 
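		 * That is, advance "va" to the next address whose offset
		 * within a superpage, va & PDRMASK, equals superpage_offset,
		 * so the physical and virtual addresses can share superpage
		 * alignment.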
		 */
		if ((va & PDRMASK) < superpage_offset)
			va = (va & ~PDRMASK) + superpage_offset;
		else if ((va & PDRMASK) > superpage_offset)
			va = ((va + PDRMASK) & ~PDRMASK) + superpage_offset;
	}
	sva = va;
	while (start < end) {
		if ((start & PDRMASK) == 0 && end - start >= NBPDR &&
		    pseflag) {
			KASSERT((va & PDRMASK) == 0,
			    ("pmap_map: misaligned va %#x", va));
			newpde = start | PG_PS | pgeflag | PG_RW | PG_V;
			pmap_kenter_pde(va, newpde);
			va += NBPDR;
			start += NBPDR;
		} else {
			pmap_kenter(va, start);
			va += PAGE_SIZE;
			start += PAGE_SIZE;
		}
	}
	pmap_invalidate_range(kernel_pmap, sva, va);
	*virt = va;
	return (sva);
}

/*
 * Add a list of wired pages to the kva.  This routine is only used for
 * temporary kernel mappings that do not need to have page modification
 * or references recorded.  Note that old mappings are simply written
 * over.  The page *must* be wired.
 * Note: SMP coherent.  Uses a ranged shootdown IPI.
 */
void
pmap_qenter(vm_offset_t sva, vm_page_t *ma, int count)
{
	pt_entry_t *endpte, oldpte, pa, *pte;
	vm_page_t m;

	oldpte = 0;
	pte = vtopte(sva);
	endpte = pte + count;
	while (pte < endpte) {
		m = *ma++;
		pa = VM_PAGE_TO_PHYS(m) | pmap_cache_bits(m->md.pat_mode, 0);
		if ((*pte & (PG_FRAME | PG_PTE_CACHE)) != pa) {
			oldpte |= *pte;
			pte_store(pte, pa | pgeflag | PG_RW | PG_V);
		}
		pte++;
	}
	if (__predict_false((oldpte & PG_V) != 0))
		pmap_invalidate_range(kernel_pmap, sva, sva + count *
		    PAGE_SIZE);
}

/*
 * This routine tears out page mappings from the
 * kernel -- it is meant only for temporary mappings.
 * Note: SMP coherent.  Uses a ranged shootdown IPI.
 */
void
pmap_qremove(vm_offset_t sva, int count)
{
	vm_offset_t va;

	va = sva;
	while (count-- > 0) {
		pmap_kremove(va);
		va += PAGE_SIZE;
	}
	pmap_invalidate_range(kernel_pmap, sva, va);
}

/***************************************************
 * Page table page management routines.....
 ***************************************************/
static __inline void
pmap_free_zero_pages(struct spglist *free)
{
	vm_page_t m;

	while ((m = SLIST_FIRST(free)) != NULL) {
		SLIST_REMOVE_HEAD(free, plinks.s.ss);
		/* Preserve the page's PG_ZERO setting. */
		vm_page_free_toq(m);
	}
}

/*
 * Schedule the specified unused page table page to be freed.  Specifically,
 * add the page to the specified list of pages that will be released to the
 * physical memory manager after the TLB has been updated.
 */
static __inline void
pmap_add_delayed_free_list(vm_page_t m, struct spglist *free,
    boolean_t set_PG_ZERO)
{

	if (set_PG_ZERO)
		m->flags |= PG_ZERO;
	else
		m->flags &= ~PG_ZERO;
	SLIST_INSERT_HEAD(free, m, plinks.s.ss);
}

/*
 * Inserts the specified page table page into the specified pmap's collection
 * of idle page table pages.  Each of a pmap's page table pages is responsible
 * for mapping a distinct range of virtual addresses.  The pmap's collection is
 * ordered by this virtual address range.
 */
static __inline int
pmap_insert_pt_page(pmap_t pmap, vm_page_t mpte)
{

	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
	return (vm_radix_insert(&pmap->pm_root, mpte));
}

/*
 * Looks for a page table page mapping the specified virtual address in the
 * specified pmap's collection of idle page table pages.  Returns NULL if there
 * is no page table page corresponding to the specified virtual address.
 */
static __inline vm_page_t
pmap_lookup_pt_page(pmap_t pmap, vm_offset_t va)
{

	PMAP_LOCK_ASSERT(pmap, MA_OWNED);
	return (vm_radix_lookup(&pmap->pm_root, va >> PDRSHIFT));
}

/*
 * Removes the specified page table page from the specified pmap's collection
 * of idle page table pages.
The specified page table page must be a member of * the pmap's collection. */ static __inline void pmap_remove_pt_page(pmap_t pmap, vm_page_t mpte) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); vm_radix_remove(&pmap->pm_root, mpte->pindex); } /* * Decrements a page table page's wire count, which is used to record the * number of valid page table entries within the page. If the wire count * drops to zero, then the page table page is unmapped. Returns TRUE if the * page table page was unmapped and FALSE otherwise. */ static inline boolean_t pmap_unwire_ptp(pmap_t pmap, vm_page_t m, struct spglist *free) { --m->wire_count; if (m->wire_count == 0) { _pmap_unwire_ptp(pmap, m, free); return (TRUE); } else return (FALSE); } static void _pmap_unwire_ptp(pmap_t pmap, vm_page_t m, struct spglist *free) { vm_offset_t pteva; /* * unmap the page table page */ pmap->pm_pdir[m->pindex] = 0; --pmap->pm_stats.resident_count; /* * This is a release store so that the ordinary store unmapping * the page table page is globally performed before TLB shoot- * down is begun. */ atomic_subtract_rel_int(&vm_cnt.v_wire_count, 1); /* * Do an invltlb to make the invalidated mapping * take effect immediately. */ pteva = VM_MAXUSER_ADDRESS + i386_ptob(m->pindex); pmap_invalidate_page(pmap, pteva); /* * Put page on a list so that it is released after * *ALL* TLB shootdown is done */ pmap_add_delayed_free_list(m, free, TRUE); } /* * After removing a page table entry, this routine is used to * conditionally free the page, and manage the hold/wire counts. */ static int pmap_unuse_pt(pmap_t pmap, vm_offset_t va, struct spglist *free) { pd_entry_t ptepde; vm_page_t mpte; if (va >= VM_MAXUSER_ADDRESS) return (0); ptepde = *pmap_pde(pmap, va); mpte = PHYS_TO_VM_PAGE(ptepde & PG_FRAME); return (pmap_unwire_ptp(pmap, mpte, free)); } /* * Initialize the pmap for the swapper process. */ void pmap_pinit0(pmap_t pmap) { PMAP_LOCK_INIT(pmap); /* * Since the page table directory is shared with the kernel pmap, * which is already included in the list "allpmaps", this pmap does * not need to be inserted into that list. */ pmap->pm_pdir = (pd_entry_t *)(KERNBASE + (vm_offset_t)IdlePTD); #if defined(PAE) || defined(PAE_TABLES) pmap->pm_pdpt = (pdpt_entry_t *)(KERNBASE + (vm_offset_t)IdlePDPT); #endif pmap->pm_root.rt_root = 0; CPU_ZERO(&pmap->pm_active); PCPU_SET(curpmap, pmap); TAILQ_INIT(&pmap->pm_pvchunk); bzero(&pmap->pm_stats, sizeof pmap->pm_stats); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ int pmap_pinit(pmap_t pmap) { vm_page_t m, ptdpg[NPGPTD]; vm_paddr_t pa; int i; /* * No need to allocate page table space yet but we do need a valid * page directory table. 
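	 * (NBPTD bytes of KVA are allocated for it below: NPGPTD pages,
	 * i.e. one page directory page on non-PAE kernels and four when
	 * PAE page tables are in use.)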
*/ if (pmap->pm_pdir == NULL) { pmap->pm_pdir = (pd_entry_t *)kva_alloc(NBPTD); if (pmap->pm_pdir == NULL) return (0); #if defined(PAE) || defined(PAE_TABLES) pmap->pm_pdpt = uma_zalloc(pdptzone, M_WAITOK | M_ZERO); KASSERT(((vm_offset_t)pmap->pm_pdpt & ((NPGPTD * sizeof(pdpt_entry_t)) - 1)) == 0, ("pmap_pinit: pdpt misaligned")); KASSERT(pmap_kextract((vm_offset_t)pmap->pm_pdpt) < (4ULL<<30), ("pmap_pinit: pdpt above 4g")); #endif pmap->pm_root.rt_root = 0; } KASSERT(vm_radix_is_empty(&pmap->pm_root), ("pmap_pinit: pmap has reserved page table page(s)")); /* * allocate the page directory page(s) */ for (i = 0; i < NPGPTD;) { m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (m == NULL) VM_WAIT; else { ptdpg[i++] = m; } } pmap_qenter((vm_offset_t)pmap->pm_pdir, ptdpg, NPGPTD); for (i = 0; i < NPGPTD; i++) if ((ptdpg[i]->flags & PG_ZERO) == 0) pagezero(pmap->pm_pdir + (i * NPDEPG)); mtx_lock_spin(&allpmaps_lock); LIST_INSERT_HEAD(&allpmaps, pmap, pm_list); /* Copy the kernel page table directory entries. */ bcopy(PTD + KPTDI, pmap->pm_pdir + KPTDI, nkpt * sizeof(pd_entry_t)); mtx_unlock_spin(&allpmaps_lock); /* install self-referential address mapping entry(s) */ for (i = 0; i < NPGPTD; i++) { pa = VM_PAGE_TO_PHYS(ptdpg[i]); pmap->pm_pdir[PTDPTDI + i] = pa | PG_V | PG_RW | PG_A | PG_M; #if defined(PAE) || defined(PAE_TABLES) pmap->pm_pdpt[i] = pa | PG_V; #endif } CPU_ZERO(&pmap->pm_active); TAILQ_INIT(&pmap->pm_pvchunk); bzero(&pmap->pm_stats, sizeof pmap->pm_stats); return (1); } /* * this routine is called if the page table page is not * mapped correctly. */ static vm_page_t _pmap_allocpte(pmap_t pmap, u_int ptepindex, u_int flags) { vm_paddr_t ptepa; vm_page_t m; /* * Allocate a page table page. */ if ((m = vm_page_alloc(NULL, ptepindex, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) { if ((flags & PMAP_ENTER_NOSLEEP) == 0) { PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); VM_WAIT; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); } /* * Indicate the need to retry. While waiting, the page table * page may have been allocated. */ return (NULL); } if ((m->flags & PG_ZERO) == 0) pmap_zero_page(m); /* * Map the pagetable page into the process address space, if * it isn't already there. */ pmap->pm_stats.resident_count++; ptepa = VM_PAGE_TO_PHYS(m); pmap->pm_pdir[ptepindex] = (pd_entry_t) (ptepa | PG_U | PG_RW | PG_V | PG_A | PG_M); return (m); } static vm_page_t pmap_allocpte(pmap_t pmap, vm_offset_t va, u_int flags) { u_int ptepindex; pd_entry_t ptepa; vm_page_t m; /* * Calculate pagetable page index */ ptepindex = va >> PDRSHIFT; retry: /* * Get the page directory entry */ ptepa = pmap->pm_pdir[ptepindex]; /* * This supports switching from a 4MB page to a * normal 4K page. */ if (ptepa & PG_PS) { (void)pmap_demote_pde(pmap, &pmap->pm_pdir[ptepindex], va); ptepa = pmap->pm_pdir[ptepindex]; } /* * If the page table page is mapped, we just increment the * hold count, and activate it. */ if (ptepa) { m = PHYS_TO_VM_PAGE(ptepa & PG_FRAME); m->wire_count++; } else { /* * Here if the pte page isn't mapped, or if it has * been deallocated. */ m = _pmap_allocpte(pmap, ptepindex, flags); if (m == NULL && (flags & PMAP_ENTER_NOSLEEP) == 0) goto retry; } return (m); } /*************************************************** * Pmap allocation/deallocation routines. ***************************************************/ /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. 
* Should only be called if the map contains no valid mappings. */ void pmap_release(pmap_t pmap) { vm_page_t m, ptdpg[NPGPTD]; int i; KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); KASSERT(vm_radix_is_empty(&pmap->pm_root), ("pmap_release: pmap has reserved page table page(s)")); KASSERT(CPU_EMPTY(&pmap->pm_active), ("releasing active pmap %p", pmap)); mtx_lock_spin(&allpmaps_lock); LIST_REMOVE(pmap, pm_list); mtx_unlock_spin(&allpmaps_lock); for (i = 0; i < NPGPTD; i++) ptdpg[i] = PHYS_TO_VM_PAGE(pmap->pm_pdir[PTDPTDI + i] & PG_FRAME); bzero(pmap->pm_pdir + PTDPTDI, (nkpt + NPGPTD) * sizeof(*pmap->pm_pdir)); pmap_qremove((vm_offset_t)pmap->pm_pdir, NPGPTD); for (i = 0; i < NPGPTD; i++) { m = ptdpg[i]; #if defined(PAE) || defined(PAE_TABLES) KASSERT(VM_PAGE_TO_PHYS(m) == (pmap->pm_pdpt[i] & PG_FRAME), ("pmap_release: got wrong ptd page")); #endif m->wire_count--; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); } } static int kvm_size(SYSCTL_HANDLER_ARGS) { unsigned long ksize = VM_MAX_KERNEL_ADDRESS - KERNBASE; return (sysctl_handle_long(oidp, &ksize, 0, req)); } SYSCTL_PROC(_vm, OID_AUTO, kvm_size, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_size, "IU", "Size of KVM"); static int kvm_free(SYSCTL_HANDLER_ARGS) { unsigned long kfree = VM_MAX_KERNEL_ADDRESS - kernel_vm_end; return (sysctl_handle_long(oidp, &kfree, 0, req)); } SYSCTL_PROC(_vm, OID_AUTO, kvm_free, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_free, "IU", "Amount of KVM free"); /* * grow the number of kernel page table entries, if needed */ void pmap_growkernel(vm_offset_t addr) { vm_paddr_t ptppaddr; vm_page_t nkpg; pd_entry_t newpdir; mtx_assert(&kernel_map->system_mtx, MA_OWNED); addr = roundup2(addr, NBPDR); if (addr - 1 >= kernel_map->max_offset) addr = kernel_map->max_offset; while (kernel_vm_end < addr) { if (pdir_pde(PTD, kernel_vm_end)) { kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } continue; } nkpg = vm_page_alloc(NULL, kernel_vm_end >> PDRSHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); nkpt++; if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); ptppaddr = VM_PAGE_TO_PHYS(nkpg); newpdir = (pd_entry_t) (ptppaddr | PG_V | PG_RW | PG_A | PG_M); pdir_pde(KPTD, kernel_vm_end) = pgeflag | newpdir; pmap_kenter_pde(kernel_vm_end, newpdir); kernel_vm_end = (kernel_vm_end + NBPDR) & ~PDRMASK; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } } } /*************************************************** * page management routines. 
***************************************************/ CTASSERT(sizeof(struct pv_chunk) == PAGE_SIZE); CTASSERT(_NPCM == 11); CTASSERT(_NPCPV == 336); static __inline struct pv_chunk * pv_to_chunk(pv_entry_t pv) { return ((struct pv_chunk *)((uintptr_t)pv & ~(uintptr_t)PAGE_MASK)); } #define PV_PMAP(pv) (pv_to_chunk(pv)->pc_pmap) #define PC_FREE0_9 0xfffffffful /* Free values for index 0 through 9 */ #define PC_FREE10 0x0000fffful /* Free values for index 10 */ static const uint32_t pc_freemask[_NPCM] = { PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE0_9, PC_FREE10 }; SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_count, CTLFLAG_RD, &pv_entry_count, 0, "Current number of pv entries"); #ifdef PV_STATS static int pc_chunk_count, pc_chunk_allocs, pc_chunk_frees, pc_chunk_tryfail; SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_count, CTLFLAG_RD, &pc_chunk_count, 0, "Current number of pv entry chunks"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_allocs, CTLFLAG_RD, &pc_chunk_allocs, 0, "Current number of pv entry chunks allocated"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_frees, CTLFLAG_RD, &pc_chunk_frees, 0, "Current number of pv entry chunks frees"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_tryfail, CTLFLAG_RD, &pc_chunk_tryfail, 0, "Number of times tried to get a chunk page but failed."); static long pv_entry_frees, pv_entry_allocs; static int pv_entry_spare; SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_frees, CTLFLAG_RD, &pv_entry_frees, 0, "Current number of pv entry frees"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_allocs, CTLFLAG_RD, &pv_entry_allocs, 0, "Current number of pv entry allocs"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_spare, CTLFLAG_RD, &pv_entry_spare, 0, "Current number of spare pv entries"); #endif /* * We are in a serious low memory condition. Resort to * drastic measures to free some pages so we can allocate * another pv entry chunk. */ static vm_page_t pmap_pv_reclaim(pmap_t locked_pmap) { struct pch newtail; struct pv_chunk *pc; struct md_page *pvh; pd_entry_t *pde; pmap_t pmap; pt_entry_t *pte, tpte; pv_entry_t pv; vm_offset_t va; vm_page_t m, m_pc; struct spglist free; uint32_t inuse; int bit, field, freed; PMAP_LOCK_ASSERT(locked_pmap, MA_OWNED); pmap = NULL; m_pc = NULL; SLIST_INIT(&free); TAILQ_INIT(&newtail); while ((pc = TAILQ_FIRST(&pv_chunks)) != NULL && (pv_vafree == 0 || SLIST_EMPTY(&free))) { TAILQ_REMOVE(&pv_chunks, pc, pc_lru); if (pmap != pc->pc_pmap) { if (pmap != NULL) { pmap_invalidate_all(pmap); if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } pmap = pc->pc_pmap; /* Avoid deadlock and lock recursion. */ if (pmap > locked_pmap) PMAP_LOCK(pmap); else if (pmap != locked_pmap && !PMAP_TRYLOCK(pmap)) { pmap = NULL; TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); continue; } } /* * Destroy every non-wired, 4 KB page mapping in the chunk. 
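		 * A clear bit in pc_map[] marks an allocated pv entry, so
		 * "inuse = ~pc->pc_map[field] & pc_freemask[field]" visits
		 * exactly the allocated slots, lowest-numbered bit (bsfl)
		 * first.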
*/ freed = 0; for (field = 0; field < _NPCM; field++) { for (inuse = ~pc->pc_map[field] & pc_freemask[field]; inuse != 0; inuse &= ~(1UL << bit)) { bit = bsfl(inuse); pv = &pc->pc_pventry[field * 32 + bit]; va = pv->pv_va; pde = pmap_pde(pmap, va); if ((*pde & PG_PS) != 0) continue; pte = pmap_pte(pmap, va); tpte = *pte; if ((tpte & PG_W) == 0) tpte = pte_load_clear(pte); pmap_pte_release(pte); if ((tpte & PG_W) != 0) continue; KASSERT(tpte != 0, ("pmap_pv_reclaim: pmap %p va %x zero pte", pmap, va)); if ((tpte & PG_G) != 0) pmap_invalidate_page(pmap, va); m = PHYS_TO_VM_PAGE(tpte & PG_FRAME); if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if ((tpte & PG_A) != 0) vm_page_aflag_set(m, PGA_REFERENCED); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) { vm_page_aflag_clear(m, PGA_WRITEABLE); } } pc->pc_map[field] |= 1UL << bit; pmap_unuse_pt(pmap, va, &free); freed++; } } if (freed == 0) { TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); continue; } /* Every freed mapping is for a 4 KB page. */ pmap->pm_stats.resident_count -= freed; PV_STAT(pv_entry_frees += freed); PV_STAT(pv_entry_spare += freed); pv_entry_count -= freed; TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); for (field = 0; field < _NPCM; field++) if (pc->pc_map[field] != pc_freemask[field]) { TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&newtail, pc, pc_lru); /* * One freed pv entry in locked_pmap is * sufficient. */ if (pmap == locked_pmap) goto out; break; } if (field == _NPCM) { PV_STAT(pv_entry_spare -= _NPCPV); PV_STAT(pc_chunk_count--); PV_STAT(pc_chunk_frees++); /* Entire chunk is free; return it. */ m_pc = PHYS_TO_VM_PAGE(pmap_kextract((vm_offset_t)pc)); pmap_qremove((vm_offset_t)pc, 1); pmap_ptelist_free(&pv_vafree, (vm_offset_t)pc); break; } } out: TAILQ_CONCAT(&pv_chunks, &newtail, pc_lru); if (pmap != NULL) { pmap_invalidate_all(pmap); if (pmap != locked_pmap) PMAP_UNLOCK(pmap); } if (m_pc == NULL && pv_vafree != 0 && SLIST_EMPTY(&free)) { m_pc = SLIST_FIRST(&free); SLIST_REMOVE_HEAD(&free, plinks.s.ss); /* Recycle a freed page table page. */ m_pc->wire_count = 1; atomic_add_int(&vm_cnt.v_wire_count, 1); } pmap_free_zero_pages(&free); return (m_pc); } /* * free the pv_entry back to the free list */ static void free_pv_entry(pmap_t pmap, pv_entry_t pv) { struct pv_chunk *pc; int idx, field, bit; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(pv_entry_frees++); PV_STAT(pv_entry_spare++); pv_entry_count--; pc = pv_to_chunk(pv); idx = pv - &pc->pc_pventry[0]; field = idx / 32; bit = idx % 32; pc->pc_map[field] |= 1ul << bit; for (idx = 0; idx < _NPCM; idx++) if (pc->pc_map[idx] != pc_freemask[idx]) { /* * 98% of the time, pc is already at the head of the * list. If it isn't already, move it to the head. 
*/ if (__predict_false(TAILQ_FIRST(&pmap->pm_pvchunk) != pc)) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); } return; } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } static void free_pv_chunk(struct pv_chunk *pc) { vm_page_t m; TAILQ_REMOVE(&pv_chunks, pc, pc_lru); PV_STAT(pv_entry_spare -= _NPCPV); PV_STAT(pc_chunk_count--); PV_STAT(pc_chunk_frees++); /* entire chunk is free, return it */ m = PHYS_TO_VM_PAGE(pmap_kextract((vm_offset_t)pc)); pmap_qremove((vm_offset_t)pc, 1); vm_page_unwire(m, PQ_NONE); vm_page_free(m); pmap_ptelist_free(&pv_vafree, (vm_offset_t)pc); } /* * get a new pv_entry, allocating a block from the system * when needed. */ static pv_entry_t get_pv_entry(pmap_t pmap, boolean_t try) { static const struct timeval printinterval = { 60, 0 }; static struct timeval lastprint; int bit, field; pv_entry_t pv; struct pv_chunk *pc; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(pv_entry_allocs++); pv_entry_count++; if (pv_entry_count > pv_entry_high_water) if (ratecheck(&lastprint, &printinterval)) printf("Approaching the limit on PV entries, consider " "increasing either the vm.pmap.shpgperproc or the " "vm.pmap.pv_entry_max tunable.\n"); retry: pc = TAILQ_FIRST(&pmap->pm_pvchunk); if (pc != NULL) { for (field = 0; field < _NPCM; field++) { if (pc->pc_map[field]) { bit = bsfl(pc->pc_map[field]); break; } } if (field < _NPCM) { pv = &pc->pc_pventry[field * 32 + bit]; pc->pc_map[field] &= ~(1ul << bit); /* If this was the last item, move it to tail */ for (field = 0; field < _NPCM; field++) if (pc->pc_map[field] != 0) { PV_STAT(pv_entry_spare--); return (pv); /* not full, return */ } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(pv_entry_spare--); return (pv); } } /* * Access to the ptelist "pv_vafree" is synchronized by the pvh * global lock. If "pv_vafree" is currently non-empty, it will * remain non-empty until pmap_ptelist_alloc() completes. */ if (pv_vafree == 0 || (m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) { if (try) { pv_entry_count--; PV_STAT(pc_chunk_tryfail++); return (NULL); } m = pmap_pv_reclaim(pmap); if (m == NULL) goto retry; } PV_STAT(pc_chunk_count++); PV_STAT(pc_chunk_allocs++); pc = (struct pv_chunk *)pmap_ptelist_alloc(&pv_vafree); pmap_qenter((vm_offset_t)pc, &m, 1); pc->pc_pmap = pmap; pc->pc_map[0] = pc_freemask[0] & ~1ul; /* preallocated bit 0 */ for (field = 1; field < _NPCM; field++) pc->pc_map[field] = pc_freemask[field]; TAILQ_INSERT_TAIL(&pv_chunks, pc, pc_lru); pv = &pc->pc_pventry[0]; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(pv_entry_spare += _NPCPV - 1); return (pv); } static __inline pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (pmap == PV_PMAP(pv) && va == pv->pv_va) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); break; } } return (pv); } static void pmap_pv_demote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT((pa & PDRMASK) == 0, ("pmap_pv_demote_pde: pa is not 4mpage aligned")); /* * Transfer the 4mpage's pv entry for this mapping to the first * page's pv list. 
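	 * That entry is recycled for the superpage's first 4KB page; the
	 * remaining NPTEPG - 1 entries are created one at a time by
	 * pmap_insert_entry() below.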
*/ pvh = pa_to_pvh(pa); va = trunc_4mpage(va); pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pv_demote_pde: pv not found")); m = PHYS_TO_VM_PAGE(pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); /* Instantiate the remaining NPTEPG - 1 pv entries. */ va_last = va + NBPDR - PAGE_SIZE; do { m++; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_pv_demote_pde: page %p is not managed", m)); va += PAGE_SIZE; pmap_insert_entry(pmap, va, m); } while (va < va_last); } static void pmap_pv_promote_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; vm_offset_t va_last; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT((pa & PDRMASK) == 0, ("pmap_pv_promote_pde: pa is not 4mpage aligned")); /* * Transfer the first page's pv entry for this mapping to the * 4mpage's pv list. Aside from avoiding the cost of a call * to get_pv_entry(), a transfer avoids the possibility that * get_pv_entry() calls pmap_collect() and that pmap_collect() * removes one of the mappings that is being promoted. */ m = PHYS_TO_VM_PAGE(pa); va = trunc_4mpage(va); pv = pmap_pvh_remove(&m->md, pmap, va); KASSERT(pv != NULL, ("pmap_pv_promote_pde: pv not found")); pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); /* Free the remaining NPTEPG - 1 pv entries. */ va_last = va + NBPDR - PAGE_SIZE; do { m++; va += PAGE_SIZE; pmap_pvh_free(&m->md, pmap, va); } while (va < va_last); } static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pvh_free: pv not found")); free_pv_entry(pmap, pv); } static void pmap_remove_entry(pmap_t pmap, vm_page_t m, vm_offset_t va) { struct md_page *pvh; rw_assert(&pvh_global_lock, RA_WLOCKED); pmap_pvh_free(&m->md, pmap, va); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } /* * Create a pv entry for page at pa for * (pmap, va). */ static void pmap_insert_entry(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); pv = get_pv_entry(pmap, FALSE); pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); } /* * Conditionally create a pv entry. */ static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); if (pv_entry_count < pv_entry_high_water && (pv = get_pv_entry(pmap, TRUE)) != NULL) { pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); return (TRUE); } else return (FALSE); } /* * Create the pv entries for each of the pages within a superpage. */ static boolean_t pmap_pv_insert_pde(pmap_t pmap, vm_offset_t va, vm_paddr_t pa) { struct md_page *pvh; pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); if (pv_entry_count < pv_entry_high_water && (pv = get_pv_entry(pmap, TRUE)) != NULL) { pv->pv_va = va; pvh = pa_to_pvh(pa); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); return (TRUE); } else return (FALSE); } /* * Fills a page table page with mappings to consecutive physical pages. */ static void pmap_fill_ptp(pt_entry_t *firstpte, pt_entry_t newpte) { pt_entry_t *pte; for (pte = firstpte; pte < firstpte + NPTEPG; pte++) { *pte = newpte; newpte += PAGE_SIZE; } } /* * Tries to demote a 2- or 4MB page mapping. If demotion fails, the * 2- or 4MB page mapping is invalidated. 
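 * On success, the single PDE is replaced by a reference to a page
 * table page whose NPTEPG small PTEs reproduce the superpage's
 * protections and attributes (see pmap_fill_ptp()).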
*/ static boolean_t pmap_demote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va) { pd_entry_t newpde, oldpde; pt_entry_t *firstpte, newpte; vm_paddr_t mptepa; vm_page_t mpte; struct spglist free; PMAP_LOCK_ASSERT(pmap, MA_OWNED); oldpde = *pde; KASSERT((oldpde & (PG_PS | PG_V)) == (PG_PS | PG_V), ("pmap_demote_pde: oldpde is missing PG_PS and/or PG_V")); if ((oldpde & PG_A) != 0 && (mpte = pmap_lookup_pt_page(pmap, va)) != NULL) pmap_remove_pt_page(pmap, mpte); else { KASSERT((oldpde & PG_W) == 0, ("pmap_demote_pde: page table page for a wired mapping" " is missing")); /* * Invalidate the 2- or 4MB page mapping and return * "failure" if the mapping was never accessed or the * allocation of the new page table page fails. */ if ((oldpde & PG_A) == 0 || (mpte = vm_page_alloc(NULL, va >> PDRSHIFT, VM_ALLOC_NOOBJ | VM_ALLOC_NORMAL | VM_ALLOC_WIRED)) == NULL) { SLIST_INIT(&free); pmap_remove_pde(pmap, pde, trunc_4mpage(va), &free); pmap_invalidate_page(pmap, trunc_4mpage(va)); pmap_free_zero_pages(&free); CTR2(KTR_PMAP, "pmap_demote_pde: failure for va %#x" " in pmap %p", va, pmap); return (FALSE); } if (va < VM_MAXUSER_ADDRESS) pmap->pm_stats.resident_count++; } mptepa = VM_PAGE_TO_PHYS(mpte); /* * If the page mapping is in the kernel's address space, then the * KPTmap can provide access to the page table page. Otherwise, * temporarily map the page table page (mpte) into the kernel's * address space at either PADDR1 or PADDR2. */ if (va >= KERNBASE) firstpte = &KPTmap[i386_btop(trunc_4mpage(va))]; else if (curthread->td_pinned > 0 && rw_wowned(&pvh_global_lock)) { if ((*PMAP1 & PG_FRAME) != mptepa) { *PMAP1 = mptepa | PG_RW | PG_V | PG_A | PG_M; #ifdef SMP PMAP1cpu = PCPU_GET(cpuid); #endif invlcaddr(PADDR1); PMAP1changed++; } else #ifdef SMP if (PMAP1cpu != PCPU_GET(cpuid)) { PMAP1cpu = PCPU_GET(cpuid); invlcaddr(PADDR1); PMAP1changedcpu++; } else #endif PMAP1unchanged++; firstpte = PADDR1; } else { mtx_lock(&PMAP2mutex); if ((*PMAP2 & PG_FRAME) != mptepa) { *PMAP2 = mptepa | PG_RW | PG_V | PG_A | PG_M; pmap_invalidate_page(kernel_pmap, (vm_offset_t)PADDR2); } firstpte = PADDR2; } newpde = mptepa | PG_M | PG_A | (oldpde & PG_U) | PG_RW | PG_V; KASSERT((oldpde & PG_A) != 0, ("pmap_demote_pde: oldpde is missing PG_A")); KASSERT((oldpde & (PG_M | PG_RW)) != PG_RW, ("pmap_demote_pde: oldpde is missing PG_M")); newpte = oldpde & ~PG_PS; if ((newpte & PG_PDE_PAT) != 0) newpte ^= PG_PDE_PAT | PG_PTE_PAT; /* * If the page table page is new, initialize it. */ if (mpte->wire_count == 1) { mpte->wire_count = NPTEPG; pmap_fill_ptp(firstpte, newpte); } KASSERT((*firstpte & PG_FRAME) == (newpte & PG_FRAME), ("pmap_demote_pde: firstpte and newpte map different physical" " addresses")); /* * If the mapping has changed attributes, update the page table * entries. */ if ((*firstpte & PG_PTE_PROMOTE) != (newpte & PG_PTE_PROMOTE)) pmap_fill_ptp(firstpte, newpte); /* * Demote the mapping. This pmap is locked. The old PDE has * PG_A set. If the old PDE has PG_RW set, it also has PG_M * set. Thus, there is no danger of a race with another * processor changing the setting of PG_A and/or PG_M between * the read above and the store below. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, newpde); else if (pmap == kernel_pmap) pmap_kenter_pde(va, newpde); else pde_store(pde, newpde); if (firstpte == PADDR2) mtx_unlock(&PMAP2mutex); /* * Invalidate the recursive mapping of the page table page. */ pmap_invalidate_page(pmap, (vm_offset_t)vtopte(va)); /* * Demote the pv entry. 
This depends on the earlier demotion * of the mapping. Specifically, the (re)creation of a per- * page pv entry might trigger the execution of pmap_collect(), * which might reclaim a newly (re)created per-page pv entry * and destroy the associated mapping. In order to destroy * the mapping, the PDE must have already changed from mapping * the 2mpage to referencing the page table page. */ if ((oldpde & PG_MANAGED) != 0) pmap_pv_demote_pde(pmap, va, oldpde & PG_PS_FRAME); pmap_pde_demotions++; CTR2(KTR_PMAP, "pmap_demote_pde: success for va %#x" " in pmap %p", va, pmap); return (TRUE); } /* * Removes a 2- or 4MB page mapping from the kernel pmap. */ static void pmap_remove_kernel_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va) { pd_entry_t newpde; vm_paddr_t mptepa; vm_page_t mpte; PMAP_LOCK_ASSERT(pmap, MA_OWNED); mpte = pmap_lookup_pt_page(pmap, va); if (mpte == NULL) panic("pmap_remove_kernel_pde: Missing pt page."); pmap_remove_pt_page(pmap, mpte); mptepa = VM_PAGE_TO_PHYS(mpte); newpde = mptepa | PG_M | PG_A | PG_RW | PG_V; /* * Initialize the page table page. */ pagezero((void *)&KPTmap[i386_btop(trunc_4mpage(va))]); /* * Remove the mapping. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, newpde); else pmap_kenter_pde(va, newpde); /* * Invalidate the recursive mapping of the page table page. */ pmap_invalidate_page(pmap, (vm_offset_t)vtopte(va)); } /* * pmap_remove_pde: do the things to unmap a superpage in a process */ static void pmap_remove_pde(pmap_t pmap, pd_entry_t *pdq, vm_offset_t sva, struct spglist *free) { struct md_page *pvh; pd_entry_t oldpde; vm_offset_t eva, va; vm_page_t m, mpte; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PDRMASK) == 0, ("pmap_remove_pde: sva is not 4mpage aligned")); oldpde = pte_load_clear(pdq); if (oldpde & PG_W) pmap->pm_stats.wired_count -= NBPDR / PAGE_SIZE; /* * Machines that don't support invlpg, also don't support * PG_G. */ if (oldpde & PG_G) pmap_invalidate_page(kernel_pmap, sva); pmap->pm_stats.resident_count -= NBPDR / PAGE_SIZE; if (oldpde & PG_MANAGED) { pvh = pa_to_pvh(oldpde & PG_PS_FRAME); pmap_pvh_free(pvh, pmap, sva); eva = sva + NBPDR; for (va = sva, m = PHYS_TO_VM_PAGE(oldpde & PG_PS_FRAME); va < eva; va += PAGE_SIZE, m++) { if ((oldpde & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if (oldpde & PG_A) vm_page_aflag_set(m, PGA_REFERENCED); if (TAILQ_EMPTY(&m->md.pv_list) && TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } } if (pmap == kernel_pmap) { pmap_remove_kernel_pde(pmap, pdq, sva); } else { mpte = pmap_lookup_pt_page(pmap, sva); if (mpte != NULL) { pmap_remove_pt_page(pmap, mpte); pmap->pm_stats.resident_count--; KASSERT(mpte->wire_count == NPTEPG, ("pmap_remove_pde: pte page wire count error")); mpte->wire_count = 0; pmap_add_delayed_free_list(mpte, free, FALSE); atomic_subtract_int(&vm_cnt.v_wire_count, 1); } } } /* * pmap_remove_pte: do the things to unmap a page in a process */ static int pmap_remove_pte(pmap_t pmap, pt_entry_t *ptq, vm_offset_t va, struct spglist *free) { pt_entry_t oldpte; vm_page_t m; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); oldpte = pte_load_clear(ptq); KASSERT(oldpte != 0, ("pmap_remove_pte: pmap %p va %x zero pte", pmap, va)); if (oldpte & PG_W) pmap->pm_stats.wired_count -= 1; /* * Machines that don't support invlpg, also don't support * PG_G. 
*/ if (oldpte & PG_G) pmap_invalidate_page(kernel_pmap, va); pmap->pm_stats.resident_count -= 1; if (oldpte & PG_MANAGED) { m = PHYS_TO_VM_PAGE(oldpte & PG_FRAME); if ((oldpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if (oldpte & PG_A) vm_page_aflag_set(m, PGA_REFERENCED); pmap_remove_entry(pmap, m, va); } return (pmap_unuse_pt(pmap, va, free)); } /* * Remove a single page from a process address space */ static void pmap_remove_page(pmap_t pmap, vm_offset_t va, struct spglist *free) { pt_entry_t *pte; rw_assert(&pvh_global_lock, RA_WLOCKED); KASSERT(curthread->td_pinned > 0, ("curthread not pinned")); PMAP_LOCK_ASSERT(pmap, MA_OWNED); if ((pte = pmap_pte_quick(pmap, va)) == NULL || *pte == 0) return; pmap_remove_pte(pmap, pte, va, free); pmap_invalidate_page(pmap, va); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly * rounded to the page size. */ void pmap_remove(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t pdnxt; pd_entry_t ptpaddr; pt_entry_t *pte; struct spglist free; int anyvalid; /* * Perform an unsynchronized read. This is, however, safe. */ if (pmap->pm_stats.resident_count == 0) return; anyvalid = 0; SLIST_INIT(&free); rw_wlock(&pvh_global_lock); sched_pin(); PMAP_LOCK(pmap); /* * special handling of removing one page. a very * common operation and easy to short circuit some * code. */ if ((sva + PAGE_SIZE == eva) && ((pmap->pm_pdir[(sva >> PDRSHIFT)] & PG_PS) == 0)) { pmap_remove_page(pmap, sva, &free); goto out; } for (; sva < eva; sva = pdnxt) { u_int pdirindex; /* * Calculate index for next page table. */ pdnxt = (sva + NBPDR) & ~PDRMASK; if (pdnxt < sva) pdnxt = eva; if (pmap->pm_stats.resident_count == 0) break; pdirindex = sva >> PDRSHIFT; ptpaddr = pmap->pm_pdir[pdirindex]; /* * Weed out invalid mappings. Note: we assume that the page * directory table is always allocated, and in kernel virtual. */ if (ptpaddr == 0) continue; /* * Check for large page. */ if ((ptpaddr & PG_PS) != 0) { /* * Are we removing the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == pdnxt && eva >= pdnxt) { /* * The TLB entry for a PG_G mapping is * invalidated by pmap_remove_pde(). */ if ((ptpaddr & PG_G) == 0) anyvalid = 1; pmap_remove_pde(pmap, &pmap->pm_pdir[pdirindex], sva, &free); continue; } else if (!pmap_demote_pde(pmap, &pmap->pm_pdir[pdirindex], sva)) { /* The large page mapping was destroyed. */ continue; } } /* * Limit our scan to either the end of the va represented * by the current page table page, or to the end of the * range being removed. */ if (pdnxt > eva) pdnxt = eva; for (pte = pmap_pte_quick(pmap, sva); sva != pdnxt; pte++, sva += PAGE_SIZE) { if (*pte == 0) continue; /* * The TLB entry for a PG_G mapping is invalidated * by pmap_remove_pte(). */ if ((*pte & PG_G) == 0) anyvalid = 1; if (pmap_remove_pte(pmap, pte, sva, &free)) break; } } out: sched_unpin(); if (anyvalid) pmap_invalidate_all(pmap); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * Routine: pmap_remove_all * Function: * Removes this physical page from * all physical maps in which it resides. * Reflects back modify bits to the pager. * * Notes: * Original versions of this routine were very * inefficient because they iteratively called * pmap_remove (slow...) 
*/ void pmap_remove_all(vm_page_t m) { struct md_page *pvh; pv_entry_t pv; pmap_t pmap; pt_entry_t *pte, tpte; pd_entry_t *pde; vm_offset_t va; struct spglist free; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_all: page %p is not managed", m)); SLIST_INIT(&free); rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); while ((pv = TAILQ_FIRST(&pvh->pv_list)) != NULL) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, va); (void)pmap_demote_pde(pmap, pde, va); PMAP_UNLOCK(pmap); } small_mappings: while ((pv = TAILQ_FIRST(&m->md.pv_list)) != NULL) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pmap->pm_stats.resident_count--; pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_remove_all: found" " a 4mpage in page %p's pv list", m)); pte = pmap_pte_quick(pmap, pv->pv_va); tpte = pte_load_clear(pte); KASSERT(tpte != 0, ("pmap_remove_all: pmap %p va %x zero pte", pmap, pv->pv_va)); if (tpte & PG_W) pmap->pm_stats.wired_count--; if (tpte & PG_A) vm_page_aflag_set(m, PGA_REFERENCED); /* * Update the vm_page_t clean and reference bits. */ if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); pmap_unuse_pt(pmap, pv->pv_va, &free); pmap_invalidate_page(pmap, pv->pv_va); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); free_pv_entry(pmap, pv); PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); sched_unpin(); rw_wunlock(&pvh_global_lock); pmap_free_zero_pages(&free); } /* * pmap_protect_pde: do the things to protect a 4mpage in a process */ static boolean_t pmap_protect_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t sva, vm_prot_t prot) { pd_entry_t newpde, oldpde; vm_offset_t eva, va; vm_page_t m; boolean_t anychanged; PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT((sva & PDRMASK) == 0, ("pmap_protect_pde: sva is not 4mpage aligned")); anychanged = FALSE; retry: oldpde = newpde = *pde; if (oldpde & PG_MANAGED) { eva = sva + NBPDR; for (va = sva, m = PHYS_TO_VM_PAGE(oldpde & PG_PS_FRAME); va < eva; va += PAGE_SIZE, m++) if ((oldpde & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); } if ((prot & VM_PROT_WRITE) == 0) newpde &= ~(PG_RW | PG_M); #if defined(PAE) || defined(PAE_TABLES) if ((prot & VM_PROT_EXECUTE) == 0) newpde |= pg_nx; #endif if (newpde != oldpde) { if (!pde_cmpset(pde, oldpde, newpde)) goto retry; if (oldpde & PG_G) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } return (anychanged); } /* * Set the physical protection on the * specified range of this map as requested. */ void pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t pdnxt; pd_entry_t ptpaddr; pt_entry_t *pte; boolean_t anychanged, pv_lists_locked; KASSERT((prot & ~VM_PROT_ALL) == 0, ("invalid prot %x", prot)); if (prot == VM_PROT_NONE) { pmap_remove(pmap, sva, eva); return; } #if defined(PAE) || defined(PAE_TABLES) if ((prot & (VM_PROT_WRITE|VM_PROT_EXECUTE)) == (VM_PROT_WRITE|VM_PROT_EXECUTE)) return; #else if (prot & VM_PROT_WRITE) return; #endif if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } anychanged = FALSE; PMAP_LOCK(pmap); for (; sva < eva; sva = pdnxt) { pt_entry_t obits, pbits; u_int pdirindex; pdnxt = (sva + NBPDR) & ~PDRMASK; if (pdnxt < sva) pdnxt = eva; pdirindex = sva >> PDRSHIFT; ptpaddr = pmap->pm_pdir[pdirindex]; /* * Weed out invalid mappings. Note: we assume that the page * directory table is always allocated, and in kernel virtual. 
*/ if (ptpaddr == 0) continue; /* * Check for large page. */ if ((ptpaddr & PG_PS) != 0) { /* * Are we protecting the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == pdnxt && eva >= pdnxt) { /* * The TLB entry for a PG_G mapping is * invalidated by pmap_protect_pde(). */ if (pmap_protect_pde(pmap, &pmap->pm_pdir[pdirindex], sva, prot)) anychanged = TRUE; continue; } else { if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { if (anychanged) pmap_invalidate_all( pmap); PMAP_UNLOCK(pmap); goto resume; } sched_pin(); } if (!pmap_demote_pde(pmap, &pmap->pm_pdir[pdirindex], sva)) { /* * The large page mapping was * destroyed. */ continue; } } } if (pdnxt > eva) pdnxt = eva; for (pte = pmap_pte_quick(pmap, sva); sva != pdnxt; pte++, sva += PAGE_SIZE) { vm_page_t m; retry: /* * Regardless of whether a pte is 32 or 64 bits in * size, PG_RW, PG_A, and PG_M are among the least * significant 32 bits. */ obits = pbits = *pte; if ((pbits & PG_V) == 0) continue; if ((prot & VM_PROT_WRITE) == 0) { if ((pbits & (PG_MANAGED | PG_M | PG_RW)) == (PG_MANAGED | PG_M | PG_RW)) { m = PHYS_TO_VM_PAGE(pbits & PG_FRAME); vm_page_dirty(m); } pbits &= ~(PG_RW | PG_M); } #if defined(PAE) || defined(PAE_TABLES) if ((prot & VM_PROT_EXECUTE) == 0) pbits |= pg_nx; #endif if (pbits != obits) { #if defined(PAE) || defined(PAE_TABLES) if (!atomic_cmpset_64(pte, obits, pbits)) goto retry; #else if (!atomic_cmpset_int((u_int *)pte, obits, pbits)) goto retry; #endif if (obits & PG_G) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } } } if (anychanged) pmap_invalidate_all(pmap); if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * Tries to promote the 512 or 1024, contiguous 4KB page mappings that are * within a single page table page (PTP) to a single 2- or 4MB page mapping. * For promotion to occur, two conditions must be met: (1) the 4KB page * mappings must map aligned, contiguous physical memory and (2) the 4KB page * mappings must have identical characteristics. * * Managed (PG_MANAGED) mappings within the kernel address space are not * promoted. The reason is that kernel PDEs are replicated in each pmap but * pmap_clear_ptes() and pmap_ts_referenced() only read the PDE from the kernel * pmap. */ static void pmap_promote_pde(pmap_t pmap, pd_entry_t *pde, vm_offset_t va) { pd_entry_t newpde; pt_entry_t *firstpte, oldpte, pa, *pte; vm_offset_t oldpteva; vm_page_t mpte; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Examine the first PTE in the specified PTP. Abort if this PTE is * either invalid, unused, or does not map the first 4KB physical page * within a 2- or 4MB page. */ firstpte = pmap_pte_quick(pmap, trunc_4mpage(va)); setpde: newpde = *firstpte; if ((newpde & ((PG_FRAME & PDRMASK) | PG_A | PG_V)) != (PG_A | PG_V)) { pmap_pde_p_failures++; CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#x" " in pmap %p", va, pmap); return; } if ((*firstpte & PG_MANAGED) != 0 && pmap == kernel_pmap) { pmap_pde_p_failures++; CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#x" " in pmap %p", va, pmap); return; } if ((newpde & (PG_M | PG_RW)) == PG_RW) { /* * When PG_M is already clear, PG_RW can be cleared without * a TLB invalidation. */ if (!atomic_cmpset_int((u_int *)firstpte, newpde, newpde & ~PG_RW)) goto setpde; newpde &= ~PG_RW; } /* * Examine each of the other PTEs in the specified PTP. Abort if this * PTE maps an unexpected 4KB physical page or does not have identical * characteristics to the first PTE. 
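 * A worked example (illustrative numbers): with 4KB pages and 4MB superpages, NPTEPG is 1024, so the scan below starts at the last PTE, which must map the first page's physical address plus 4MB - 4KB, and walks downward expecting each PTE to map exactly one page less; any mismatch in frame or attributes aborts the promotion.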
*/ pa = (newpde & (PG_PS_FRAME | PG_A | PG_V)) + NBPDR - PAGE_SIZE; for (pte = firstpte + NPTEPG - 1; pte > firstpte; pte--) { setpte: oldpte = *pte; if ((oldpte & (PG_FRAME | PG_A | PG_V)) != pa) { pmap_pde_p_failures++; CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#x" " in pmap %p", va, pmap); return; } if ((oldpte & (PG_M | PG_RW)) == PG_RW) { /* * When PG_M is already clear, PG_RW can be cleared * without a TLB invalidation. */ if (!atomic_cmpset_int((u_int *)pte, oldpte, oldpte & ~PG_RW)) goto setpte; oldpte &= ~PG_RW; oldpteva = (oldpte & PG_FRAME & PDRMASK) | (va & ~PDRMASK); CTR2(KTR_PMAP, "pmap_promote_pde: protect for va %#x" " in pmap %p", oldpteva, pmap); } if ((oldpte & PG_PTE_PROMOTE) != (newpde & PG_PTE_PROMOTE)) { pmap_pde_p_failures++; CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#x" " in pmap %p", va, pmap); return; } pa -= PAGE_SIZE; } /* * Save the page table page in its current state until the PDE * mapping the superpage is demoted by pmap_demote_pde() or * destroyed by pmap_remove_pde(). */ mpte = PHYS_TO_VM_PAGE(*pde & PG_FRAME); KASSERT(mpte >= vm_page_array && mpte < &vm_page_array[vm_page_array_size], ("pmap_promote_pde: page table page is out of range")); KASSERT(mpte->pindex == va >> PDRSHIFT, ("pmap_promote_pde: page table page's pindex is wrong")); if (pmap_insert_pt_page(pmap, mpte)) { pmap_pde_p_failures++; CTR2(KTR_PMAP, "pmap_promote_pde: failure for va %#x in pmap %p", va, pmap); return; } /* * Promote the pv entries. */ if ((newpde & PG_MANAGED) != 0) pmap_pv_promote_pde(pmap, va, newpde & PG_PS_FRAME); /* * Propagate the PAT index to its proper position. */ if ((newpde & PG_PTE_PAT) != 0) newpde ^= PG_PDE_PAT | PG_PTE_PAT; /* * Map the superpage. */ if (workaround_erratum383) pmap_update_pde(pmap, va, pde, PG_PS | newpde); else if (pmap == kernel_pmap) pmap_kenter_pde(va, PG_PS | newpde); else pde_store(pde, PG_PS | newpde); pmap_pde_promotions++; CTR2(KTR_PMAP, "pmap_promote_pde: success for va %#x" " in pmap %p", va, pmap); } /* * Insert the given physical page (p) at * the specified virtual address (v) in the * target physical map with the protection requested. * * If specified, the page will be wired down, meaning * that the related pte can not be reclaimed. * * NB: This is the only routine which MAY NOT lazy-evaluate * or lose information. That is, this routine must actually * insert this page into the given map NOW. */ int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { pd_entry_t *pde; pt_entry_t *pte; pt_entry_t newpte, origpte; pv_entry_t pv; vm_paddr_t opa, pa; vm_page_t mpte, om; boolean_t invlva, wired; va = trunc_page(va); mpte = NULL; wired = (flags & PMAP_ENTER_WIRED) != 0; KASSERT(va <= VM_MAX_KERNEL_ADDRESS, ("pmap_enter: toobig")); KASSERT(va < UPT_MIN_ADDRESS || va >= UPT_MAX_ADDRESS, ("pmap_enter: invalid to pmap_enter page table pages (va: 0x%x)", va)); if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); sched_pin(); /* * In the case that a page table page is not * resident, we are creating it here. 
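 * (This only applies to user addresses: on i386 the kernel's page table pages are preallocated and shared by every pmap, so no page table page is allocated when va >= VM_MAXUSER_ADDRESS; hence the bound check below.)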
*/ if (va < VM_MAXUSER_ADDRESS) { mpte = pmap_allocpte(pmap, va, flags); if (mpte == NULL) { KASSERT((flags & PMAP_ENTER_NOSLEEP) != 0, ("pmap_allocpte failed with sleep allowed")); sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_RESOURCE_SHORTAGE); } } pde = pmap_pde(pmap, va); if ((*pde & PG_PS) != 0) panic("pmap_enter: attempted pmap_enter on 4MB page"); pte = pmap_pte_quick(pmap, va); /* * Page Directory table entry not valid, we need a new PT page */ if (pte == NULL) { panic("pmap_enter: invalid page directory pdir=%#jx, va=%#x", (uintmax_t)pmap->pm_pdir[PTDPTDI], va); } pa = VM_PAGE_TO_PHYS(m); om = NULL; origpte = *pte; opa = origpte & PG_FRAME; /* * Mapping has not changed, must be protection or wiring change. */ if (origpte && (opa == pa)) { /* * Wiring change, just update stats. We don't worry about * wiring PT pages as they remain resident as long as there * are valid mappings in them. Hence, if a user page is wired, * the PT page will be also. */ if (wired && ((origpte & PG_W) == 0)) pmap->pm_stats.wired_count++; else if (!wired && (origpte & PG_W)) pmap->pm_stats.wired_count--; /* * Remove extra pte reference */ if (mpte) mpte->wire_count--; if (origpte & PG_MANAGED) { om = m; pa |= PG_MANAGED; } goto validate; } pv = NULL; /* * Mapping has changed, invalidate old range and fall through to * handle validating new mapping. */ if (opa) { if (origpte & PG_W) pmap->pm_stats.wired_count--; if (origpte & PG_MANAGED) { om = PHYS_TO_VM_PAGE(opa); pv = pmap_pvh_remove(&om->md, pmap, va); } if (mpte != NULL) { mpte->wire_count--; KASSERT(mpte->wire_count > 0, ("pmap_enter: missing reference to page table page," " va: 0x%x", va)); } } else pmap->pm_stats.resident_count++; /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva, ("pmap_enter: managed mapping within the clean submap")); if (pv == NULL) pv = get_pv_entry(pmap, FALSE); pv->pv_va = va; TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); pa |= PG_MANAGED; } else if (pv != NULL) free_pv_entry(pmap, pv); /* * Increment counters */ if (wired) pmap->pm_stats.wired_count++; validate: /* * Now validate mapping with desired protection/wiring. */ newpte = (pt_entry_t)(pa | pmap_cache_bits(m->md.pat_mode, 0) | PG_V); if ((prot & VM_PROT_WRITE) != 0) { newpte |= PG_RW; if ((newpte & PG_MANAGED) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } #if defined(PAE) || defined(PAE_TABLES) if ((prot & VM_PROT_EXECUTE) == 0) newpte |= pg_nx; #endif if (wired) newpte |= PG_W; if (va < VM_MAXUSER_ADDRESS) newpte |= PG_U; if (pmap == kernel_pmap) newpte |= pgeflag; /* * if the mapping or permission bits are different, we need * to update the pte. 
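 * (PG_M and PG_A are masked out of the comparison below because the MMU may set them asynchronously; a difference in those bits alone does not require rewriting the PTE.)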
*/ if ((origpte & ~(PG_M|PG_A)) != newpte) { newpte |= PG_A; if ((flags & VM_PROT_WRITE) != 0) newpte |= PG_M; if (origpte & PG_V) { invlva = FALSE; origpte = pte_load_store(pte, newpte); if (origpte & PG_A) { if (origpte & PG_MANAGED) vm_page_aflag_set(om, PGA_REFERENCED); if (opa != VM_PAGE_TO_PHYS(m)) invlva = TRUE; #if defined(PAE) || defined(PAE_TABLES) if ((origpte & PG_NX) == 0 && (newpte & PG_NX) != 0) invlva = TRUE; #endif } if ((origpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if ((origpte & PG_MANAGED) != 0) vm_page_dirty(om); if ((prot & VM_PROT_WRITE) == 0) invlva = TRUE; } if ((origpte & PG_MANAGED) != 0 && TAILQ_EMPTY(&om->md.pv_list) && ((om->flags & PG_FICTITIOUS) != 0 || TAILQ_EMPTY(&pa_to_pvh(opa)->pv_list))) vm_page_aflag_clear(om, PGA_WRITEABLE); if (invlva) pmap_invalidate_page(pmap, va); } else pte_store(pte, newpte); } /* * If both the page table page and the reservation are fully * populated, then attempt promotion. */ if ((mpte == NULL || mpte->wire_count == NPTEPG) && pg_ps_enabled && (m->flags & PG_FICTITIOUS) == 0 && vm_reserv_level_iffullpop(m) == 0) pmap_promote_pde(pmap, pde, va); sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } /* * Tries to create a 2- or 4MB page mapping. Returns TRUE if successful and * FALSE otherwise. Fails if (1) a page table page cannot be allocated without * blocking, (2) a mapping already exists at the specified virtual address, or * (3) a pv entry cannot be allocated without reclaiming another pv entry. */ static boolean_t pmap_enter_pde(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { pd_entry_t *pde, newpde; rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); pde = pmap_pde(pmap, va); if (*pde != 0) { CTR2(KTR_PMAP, "pmap_enter_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } newpde = VM_PAGE_TO_PHYS(m) | pmap_cache_bits(m->md.pat_mode, 1) | PG_PS | PG_V; if ((m->oflags & VPO_UNMANAGED) == 0) { newpde |= PG_MANAGED; /* * Abort this mapping if its PV entry could not be created. */ if (!pmap_pv_insert_pde(pmap, va, VM_PAGE_TO_PHYS(m))) { CTR2(KTR_PMAP, "pmap_enter_pde: failure for va %#lx" " in pmap %p", va, pmap); return (FALSE); } } #if defined(PAE) || defined(PAE_TABLES) if ((prot & VM_PROT_EXECUTE) == 0) newpde |= pg_nx; #endif if (va < VM_MAXUSER_ADDRESS) newpde |= PG_U; /* * Increment counters. */ pmap->pm_stats.resident_count += NBPDR / PAGE_SIZE; /* * Map the superpage. */ pde_store(pde, newpde); pmap_pde_mappings++; CTR2(KTR_PMAP, "pmap_enter_pde: success for va %#lx" " in pmap %p", va, pmap); return (TRUE); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. 
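 * For example (hypothetical values): if m_start has pindex 10 and is mapped at start, a resident page with pindex 12 is mapped at start + 2 * PAGE_SIZE, while a non-resident pindex 11 simply leaves that virtual page unmapped.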
*/ void pmap_enter_object(pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_offset_t va; vm_page_t m, mpte; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); mpte = NULL; m = m_start; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { va = start + ptoa(diff); if ((va & PDRMASK) == 0 && va + NBPDR <= end && m->psind == 1 && pg_ps_enabled && pmap_enter_pde(pmap, va, m, prot)) m = &m[NBPDR / PAGE_SIZE - 1]; else mpte = pmap_enter_quick_locked(pmap, va, m, prot, mpte); m = TAILQ_NEXT(m, listq); } rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * this code makes some *MAJOR* assumptions: * 1. Current pmap & pmap exists. * 2. Not wired. * 3. Read access. * 4. No page table pages. * but is *MUCH* faster than pmap_enter... */ void pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); (void)pmap_enter_quick_locked(pmap, va, m, prot, NULL); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte) { pt_entry_t *pte; vm_paddr_t pa; struct spglist free; KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva || (m->oflags & VPO_UNMANAGED) != 0, ("pmap_enter_quick_locked: managed mapping within the clean submap")); rw_assert(&pvh_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * In the case that a page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { u_int ptepindex; pd_entry_t ptepa; /* * Calculate pagetable page index */ ptepindex = va >> PDRSHIFT; if (mpte && (mpte->pindex == ptepindex)) { mpte->wire_count++; } else { /* * Get the page directory entry */ ptepa = pmap->pm_pdir[ptepindex]; /* * If the page table page is mapped, we just increment * the hold count, and activate it. */ if (ptepa) { if (ptepa & PG_PS) return (NULL); mpte = PHYS_TO_VM_PAGE(ptepa & PG_FRAME); mpte->wire_count++; } else { mpte = _pmap_allocpte(pmap, ptepindex, PMAP_ENTER_NOSLEEP); if (mpte == NULL) return (mpte); } } } else { mpte = NULL; } /* * This call to vtopte makes the assumption that we are * entering the page into the current pmap. In order to support * quick entry into any pmap, one would likely use pmap_pte_quick. * But that isn't as quick as vtopte. */ pte = vtopte(va); if (*pte) { if (mpte != NULL) { mpte->wire_count--; mpte = NULL; } return (mpte); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0 && !pmap_try_insert_pv_entry(pmap, va, m)) { if (mpte != NULL) { SLIST_INIT(&free); if (pmap_unwire_ptp(pmap, mpte, &free)) { pmap_invalidate_page(pmap, va); pmap_free_zero_pages(&free); } mpte = NULL; } return (mpte); } /* * Increment counters */ pmap->pm_stats.resident_count++; pa = VM_PAGE_TO_PHYS(m) | pmap_cache_bits(m->md.pat_mode, 0); #if defined(PAE) || defined(PAE_TABLES) if ((prot & VM_PROT_EXECUTE) == 0) pa |= pg_nx; #endif /* * Now validate mapping with RO protection */ if ((m->oflags & VPO_UNMANAGED) != 0) pte_store(pte, pa | PG_V | PG_U); else pte_store(pte, pa | PG_V | PG_U | PG_MANAGED); return (mpte); } /* * Make a temporary mapping for a physical address. This is only intended * to be used for panic dumps. 
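 * (A sketch of the intended use, assuming the usual dump path: the dump code windows successive physical pages through crashdumpmap by calling pmap_kenter_temporary(pa, i) for i = 0, 1, ...; only the local TLB entry is invalidated, which is acceptable in the single-threaded panic context.)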
*/ void * pmap_kenter_temporary(vm_paddr_t pa, int i) { vm_offset_t va; va = (vm_offset_t)crashdumpmap + (i * PAGE_SIZE); pmap_kenter(va, pa); invlpg(va); return ((void *)crashdumpmap); } /* * This code maps large physical mmap regions into the * processor address space. Note that some shortcuts * are taken, but the code works. */ void pmap_object_init_pt(pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { pd_entry_t *pde; vm_paddr_t pa, ptepa; vm_page_t p; int pat_mode; VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("pmap_object_init_pt: non-device object")); if (pseflag && (addr & (NBPDR - 1)) == 0 && (size & (NBPDR - 1)) == 0) { if (!vm_object_populate(object, pindex, pindex + atop(size))) return; p = vm_page_lookup(object, pindex); KASSERT(p->valid == VM_PAGE_BITS_ALL, ("pmap_object_init_pt: invalid page %p", p)); pat_mode = p->md.pat_mode; /* * Abort the mapping if the first page is not physically * aligned to a 2/4MB page boundary. */ ptepa = VM_PAGE_TO_PHYS(p); if (ptepa & (NBPDR - 1)) return; /* * Skip the first page. Abort the mapping if the rest of * the pages are not physically contiguous or have differing * memory attributes. */ p = TAILQ_NEXT(p, listq); for (pa = ptepa + PAGE_SIZE; pa < ptepa + size; pa += PAGE_SIZE) { KASSERT(p->valid == VM_PAGE_BITS_ALL, ("pmap_object_init_pt: invalid page %p", p)); if (pa != VM_PAGE_TO_PHYS(p) || pat_mode != p->md.pat_mode) return; p = TAILQ_NEXT(p, listq); } /* * Map using 2/4MB pages. Since "ptepa" is 2/4M aligned and * "size" is a multiple of 2/4M, adding the PAT setting to * "pa" will not affect the termination of this loop. */ PMAP_LOCK(pmap); for (pa = ptepa | pmap_cache_bits(pat_mode, 1); pa < ptepa + size; pa += NBPDR) { pde = pmap_pde(pmap, addr); if (*pde == 0) { pde_store(pde, pa | PG_PS | PG_M | PG_A | PG_U | PG_RW | PG_V); pmap->pm_stats.resident_count += NBPDR / PAGE_SIZE; pmap_pde_mappings++; } /* Else continue on if the PDE is already valid. */ addr += NBPDR; } PMAP_UNLOCK(pmap); } } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range * must have the wired attribute set. In contrast, invalid mappings * cannot have the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, * so there is no need to invalidate any TLB entries. */ void pmap_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t pdnxt; pd_entry_t *pde; pt_entry_t *pte; boolean_t pv_lists_locked; if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } PMAP_LOCK(pmap); for (; sva < eva; sva = pdnxt) { pdnxt = (sva + NBPDR) & ~PDRMASK; if (pdnxt < sva) pdnxt = eva; pde = pmap_pde(pmap, sva); if ((*pde & PG_V) == 0) continue; if ((*pde & PG_PS) != 0) { if ((*pde & PG_W) == 0) panic("pmap_unwire: pde %#jx is missing PG_W", (uintmax_t)*pde); /* * Are we unwiring the entire large page? If not, * demote the mapping and fall through. */ if (sva + NBPDR == pdnxt && eva >= pdnxt) { /* * Regardless of whether a pde (or pte) is 32 * or 64 bits in size, PG_W is among the least * significant 32 bits. */ atomic_clear_int((u_int *)pde, PG_W); pmap->pm_stats.wired_count -= NBPDR / PAGE_SIZE; continue; } else { if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { PMAP_UNLOCK(pmap); /* Repeat sva. 
*/ goto resume; } sched_pin(); } if (!pmap_demote_pde(pmap, pde, sva)) panic("pmap_unwire: demotion failed"); } } if (pdnxt > eva) pdnxt = eva; for (pte = pmap_pte_quick(pmap, sva); sva != pdnxt; pte++, sva += PAGE_SIZE) { if ((*pte & PG_V) == 0) continue; if ((*pte & PG_W) == 0) panic("pmap_unwire: pte %#jx is missing PG_W", (uintmax_t)*pte); /* * PG_W must be cleared atomically. Although the pmap * lock synchronizes access to PG_W, another processor * could be setting PG_M and/or PG_A concurrently. * * PG_W is among the least significant 32 bits. */ atomic_clear_int((u_int *)pte, PG_W); pmap->pm_stats.wired_count--; } } if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. */ void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { struct spglist free; vm_offset_t addr; vm_offset_t end_addr = src_addr + len; vm_offset_t pdnxt; if (dst_addr != src_addr) return; if (!pmap_is_current(src_pmap)) return; rw_wlock(&pvh_global_lock); if (dst_pmap < src_pmap) { PMAP_LOCK(dst_pmap); PMAP_LOCK(src_pmap); } else { PMAP_LOCK(src_pmap); PMAP_LOCK(dst_pmap); } sched_pin(); for (addr = src_addr; addr < end_addr; addr = pdnxt) { pt_entry_t *src_pte, *dst_pte; vm_page_t dstmpte, srcmpte; pd_entry_t srcptepaddr; u_int ptepindex; KASSERT(addr < UPT_MIN_ADDRESS, ("pmap_copy: invalid to pmap_copy page tables")); pdnxt = (addr + NBPDR) & ~PDRMASK; if (pdnxt < addr) pdnxt = end_addr; ptepindex = addr >> PDRSHIFT; srcptepaddr = src_pmap->pm_pdir[ptepindex]; if (srcptepaddr == 0) continue; if (srcptepaddr & PG_PS) { if ((addr & PDRMASK) != 0 || addr + NBPDR > end_addr) continue; if (dst_pmap->pm_pdir[ptepindex] == 0 && ((srcptepaddr & PG_MANAGED) == 0 || pmap_pv_insert_pde(dst_pmap, addr, srcptepaddr & PG_PS_FRAME))) { dst_pmap->pm_pdir[ptepindex] = srcptepaddr & ~PG_W; dst_pmap->pm_stats.resident_count += NBPDR / PAGE_SIZE; pmap_pde_mappings++; } continue; } srcmpte = PHYS_TO_VM_PAGE(srcptepaddr & PG_FRAME); KASSERT(srcmpte->wire_count > 0, ("pmap_copy: source page table page is unused")); if (pdnxt > end_addr) pdnxt = end_addr; src_pte = vtopte(addr); while (addr < pdnxt) { pt_entry_t ptetemp; ptetemp = *src_pte; /* * we only virtual copy managed pages */ if ((ptetemp & PG_MANAGED) != 0) { dstmpte = pmap_allocpte(dst_pmap, addr, PMAP_ENTER_NOSLEEP); if (dstmpte == NULL) goto out; dst_pte = pmap_pte_quick(dst_pmap, addr); if (*dst_pte == 0 && pmap_try_insert_pv_entry(dst_pmap, addr, PHYS_TO_VM_PAGE(ptetemp & PG_FRAME))) { /* * Clear the wired, modified, and * accessed (referenced) bits * during the copy. */ *dst_pte = ptetemp & ~(PG_W | PG_M | PG_A); dst_pmap->pm_stats.resident_count++; } else { SLIST_INIT(&free); if (pmap_unwire_ptp(dst_pmap, dstmpte, &free)) { pmap_invalidate_page(dst_pmap, addr); pmap_free_zero_pages(&free); } goto out; } if (dstmpte->wire_count >= srcmpte->wire_count) break; } addr += PAGE_SIZE; src_pte++; } } out: sched_unpin(); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(src_pmap); PMAP_UNLOCK(dst_pmap); } /* * Zero 1 page of virtual memory mapped from a hardware page by the caller. 
*/ static __inline void pagezero(void *page) { #if defined(I686_CPU) if (cpu_class == CPUCLASS_686) { #if defined(CPU_ENABLE_SSE) if (cpu_feature & CPUID_SSE2) sse2_pagezero(page); else #endif i686_pagezero(page); } else #endif bzero(page, PAGE_SIZE); } /* * Zero the specified hardware page. */ void pmap_zero_page(vm_page_t m) { struct sysmaps *sysmaps; sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP2) panic("pmap_zero_page: CMAP2 busy"); sched_pin(); *sysmaps->CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m->md.pat_mode, 0); invlcaddr(sysmaps->CADDR2); pagezero(sysmaps->CADDR2); *sysmaps->CMAP2 = 0; sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * Zero an area within a single hardware page. off and size must not * cover an area beyond a single hardware page. */ void pmap_zero_page_area(vm_page_t m, int off, int size) { struct sysmaps *sysmaps; sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP2) panic("pmap_zero_page_area: CMAP2 busy"); sched_pin(); *sysmaps->CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m->md.pat_mode, 0); invlcaddr(sysmaps->CADDR2); if (off == 0 && size == PAGE_SIZE) pagezero(sysmaps->CADDR2); else bzero((char *)sysmaps->CADDR2 + off, size); *sysmaps->CMAP2 = 0; sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * Copy 1 specified hardware page to another. */ void pmap_copy_page(vm_page_t src, vm_page_t dst) { struct sysmaps *sysmaps; sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP1) panic("pmap_copy_page: CMAP1 busy"); if (*sysmaps->CMAP2) panic("pmap_copy_page: CMAP2 busy"); sched_pin(); *sysmaps->CMAP1 = PG_V | VM_PAGE_TO_PHYS(src) | PG_A | pmap_cache_bits(src->md.pat_mode, 0); invlcaddr(sysmaps->CADDR1); *sysmaps->CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(dst) | PG_A | PG_M | pmap_cache_bits(dst->md.pat_mode, 0); invlcaddr(sysmaps->CADDR2); bcopy(sysmaps->CADDR1, sysmaps->CADDR2, PAGE_SIZE); *sysmaps->CMAP1 = 0; *sysmaps->CMAP2 = 0; sched_unpin(); mtx_unlock(&sysmaps->lock); } int unmapped_buf_allowed = 1; void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[], vm_offset_t b_offset, int xfersize) { struct sysmaps *sysmaps; vm_page_t a_pg, b_pg; char *a_cp, *b_cp; vm_offset_t a_pg_offset, b_pg_offset; int cnt; sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP1 != 0) panic("pmap_copy_pages: CMAP1 busy"); if (*sysmaps->CMAP2 != 0) panic("pmap_copy_pages: CMAP2 busy"); sched_pin(); while (xfersize > 0) { a_pg = ma[a_offset >> PAGE_SHIFT]; a_pg_offset = a_offset & PAGE_MASK; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); b_pg = mb[b_offset >> PAGE_SHIFT]; b_pg_offset = b_offset & PAGE_MASK; cnt = min(cnt, PAGE_SIZE - b_pg_offset); *sysmaps->CMAP1 = PG_V | VM_PAGE_TO_PHYS(a_pg) | PG_A | pmap_cache_bits(a_pg->md.pat_mode, 0); invlcaddr(sysmaps->CADDR1); *sysmaps->CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(b_pg) | PG_A | PG_M | pmap_cache_bits(b_pg->md.pat_mode, 0); invlcaddr(sysmaps->CADDR2); a_cp = sysmaps->CADDR1 + a_pg_offset; b_cp = sysmaps->CADDR2 + b_pg_offset; bcopy(a_cp, b_cp, cnt); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } *sysmaps->CMAP1 = 0; *sysmaps->CMAP2 = 0; sched_unpin(); mtx_unlock(&sysmaps->lock); } /* * Returns true if the pmap's pv is one of the first * 16 pvs linked to from this page.
This count may * be changed upwards or downwards in the future; it * is only necessary that true be returned for a small * subset of pmaps for proper page aging. */ boolean_t pmap_page_exists_quick(pmap_t pmap, vm_page_t m) { struct md_page *pvh; pv_entry_t pv; int loops = 0; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_page_exists_quick: page %p is not managed", m)); rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } if (!rv && loops < 16 && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } } rw_wunlock(&pvh_global_lock); return (rv); } /* * pmap_page_wired_mappings: * * Return the number of managed mappings to the given physical page * that are wired. */ int pmap_page_wired_mappings(vm_page_t m) { int count; count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&pvh_global_lock); count = pmap_pvh_wired_mappings(&m->md, count); if ((m->flags & PG_FICTITIOUS) == 0) { count = pmap_pvh_wired_mappings(pa_to_pvh(VM_PAGE_TO_PHYS(m)), count); } rw_wunlock(&pvh_global_lock); return (count); } /* * pmap_pvh_wired_mappings: * * Return the updated number "count" of managed mappings that are wired. */ static int pmap_pvh_wired_mappings(struct md_page *pvh, int count) { pmap_t pmap; pt_entry_t *pte; pv_entry_t pv; rw_assert(&pvh_global_lock, RA_WLOCKED); sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte = pmap_pte_quick(pmap, pv->pv_va); if ((*pte & PG_W) != 0) count++; PMAP_UNLOCK(pmap); } sched_unpin(); return (count); } /* * Returns TRUE if the given page is mapped individually or as part of * a 4mpage. Otherwise, returns FALSE. */ boolean_t pmap_page_is_mapped(vm_page_t m) { boolean_t rv; if ((m->oflags & VPO_UNMANAGED) != 0) return (FALSE); rw_wlock(&pvh_global_lock); rv = !TAILQ_EMPTY(&m->md.pv_list) || ((m->flags & PG_FICTITIOUS) == 0 && !TAILQ_EMPTY(&pa_to_pvh(VM_PAGE_TO_PHYS(m))->pv_list)); rw_wunlock(&pvh_global_lock); return (rv); } /* * Remove all pages from specified address space * this aids process exit speeds. Also, this code * is special cased for current process only, but * can have the more generic (and slightly slower) * mode enabled. This is much faster than pmap_remove * in the case of running down an entire address space. 
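 * (It is typically reached from the process-exit path once the address space is known to be dying; note the guard below that refuses to operate on any pmap other than the current one.)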
*/ void pmap_remove_pages(pmap_t pmap) { pt_entry_t *pte, tpte; vm_page_t m, mpte, mt; pv_entry_t pv; struct md_page *pvh; struct pv_chunk *pc, *npc; struct spglist free; int field, idx; int32_t bit; uint32_t inuse, bitmask; int allfree; if (pmap != PCPU_GET(curpmap)) { printf("warning: pmap_remove_pages called with non-current pmap\n"); return; } SLIST_INIT(&free); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); sched_pin(); TAILQ_FOREACH_SAFE(pc, &pmap->pm_pvchunk, pc_list, npc) { KASSERT(pc->pc_pmap == pmap, ("Wrong pmap %p %p", pmap, pc->pc_pmap)); allfree = 1; for (field = 0; field < _NPCM; field++) { inuse = ~pc->pc_map[field] & pc_freemask[field]; while (inuse != 0) { bit = bsfl(inuse); bitmask = 1UL << bit; idx = field * 32 + bit; pv = &pc->pc_pventry[idx]; inuse &= ~bitmask; pte = pmap_pde(pmap, pv->pv_va); tpte = *pte; if ((tpte & PG_PS) == 0) { pte = vtopte(pv->pv_va); tpte = *pte & ~PG_PTE_PAT; } if (tpte == 0) { printf( "TPTE at %p IS ZERO @ VA %08x\n", pte, pv->pv_va); panic("bad pte"); } /* * We cannot remove wired pages from a process' mapping at this time */ if (tpte & PG_W) { allfree = 0; continue; } m = PHYS_TO_VM_PAGE(tpte & PG_FRAME); KASSERT(m->phys_addr == (tpte & PG_FRAME), ("vm_page_t %p phys_addr mismatch %016jx %016jx", m, (uintmax_t)m->phys_addr, (uintmax_t)tpte)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("pmap_remove_pages: bad tpte %#jx", (uintmax_t)tpte)); pte_clear(pte); /* * Update the vm_page_t clean/reference bits. */ if ((tpte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if ((tpte & PG_PS) != 0) { for (mt = m; mt < &m[NBPDR / PAGE_SIZE]; mt++) vm_page_dirty(mt); } else vm_page_dirty(m); } /* Mark free */ PV_STAT(pv_entry_frees++); PV_STAT(pv_entry_spare++); pv_entry_count--; pc->pc_map[field] |= bitmask; if ((tpte & PG_PS) != 0) { pmap->pm_stats.resident_count -= NBPDR / PAGE_SIZE; pvh = pa_to_pvh(tpte & PG_PS_FRAME); TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); if (TAILQ_EMPTY(&pvh->pv_list)) { for (mt = m; mt < &m[NBPDR / PAGE_SIZE]; mt++) if (TAILQ_EMPTY(&mt->md.pv_list)) vm_page_aflag_clear(mt, PGA_WRITEABLE); } mpte = pmap_lookup_pt_page(pmap, pv->pv_va); if (mpte != NULL) { pmap_remove_pt_page(pmap, mpte); pmap->pm_stats.resident_count--; KASSERT(mpte->wire_count == NPTEPG, ("pmap_remove_pages: pte page wire count error")); mpte->wire_count = 0; pmap_add_delayed_free_list(mpte, &free, FALSE); atomic_subtract_int(&vm_cnt.v_wire_count, 1); } } else { pmap->pm_stats.resident_count--; TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); if (TAILQ_EMPTY(&m->md.pv_list) && (m->flags & PG_FICTITIOUS) == 0) { pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); if (TAILQ_EMPTY(&pvh->pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); } pmap_unuse_pt(pmap, pv->pv_va, &free); } } } if (allfree) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } } sched_unpin(); pmap_invalidate_all(pmap); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * pmap_is_modified: * * Return whether or not the specified physical page was modified * in any physical maps. */ boolean_t pmap_is_modified(vm_page_t m) { boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_modified: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTEs can have PG_M set. 
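 * (This makes the PGA_WRITEABLE test below a cheap early exit: a page that was never mapped writeable cannot have any PG_M bits to report.)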
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (FALSE); rw_wlock(&pvh_global_lock); rv = pmap_is_modified_pvh(&m->md) || ((m->flags & PG_FICTITIOUS) == 0 && pmap_is_modified_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m)))); rw_wunlock(&pvh_global_lock); return (rv); } /* * Returns TRUE if any of the given mappings were used to modify * physical memory. Otherwise, returns FALSE. Both page and 2mpage * mappings are supported. */ static boolean_t pmap_is_modified_pvh(struct md_page *pvh) { pv_entry_t pv; pt_entry_t *pte; pmap_t pmap; boolean_t rv; rw_assert(&pvh_global_lock, RA_WLOCKED); rv = FALSE; sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte = pmap_pte_quick(pmap, pv->pv_va); rv = (*pte & (PG_M | PG_RW)) == (PG_M | PG_RW); PMAP_UNLOCK(pmap); if (rv) break; } sched_unpin(); return (rv); } /* * pmap_is_prefaultable: * * Return whether or not the specified virtual address is eligible * for prefault. */ boolean_t pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr) { pd_entry_t *pde; pt_entry_t *pte; boolean_t rv; rv = FALSE; PMAP_LOCK(pmap); pde = pmap_pde(pmap, addr); if (*pde != 0 && (*pde & PG_PS) == 0) { pte = vtopte(addr); rv = *pte == 0; } PMAP_UNLOCK(pmap); return (rv); } /* * pmap_is_referenced: * * Return whether or not the specified physical page was referenced * in any physical maps. */ boolean_t pmap_is_referenced(vm_page_t m) { boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_referenced: page %p is not managed", m)); rw_wlock(&pvh_global_lock); rv = pmap_is_referenced_pvh(&m->md) || ((m->flags & PG_FICTITIOUS) == 0 && pmap_is_referenced_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m)))); rw_wunlock(&pvh_global_lock); return (rv); } /* * Returns TRUE if any of the given mappings were referenced and FALSE * otherwise. Both page and 4mpage mappings are supported. */ static boolean_t pmap_is_referenced_pvh(struct md_page *pvh) { pv_entry_t pv; pt_entry_t *pte; pmap_t pmap; boolean_t rv; rw_assert(&pvh_global_lock, RA_WLOCKED); rv = FALSE; sched_pin(); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pte = pmap_pte_quick(pmap, pv->pv_va); rv = (*pte & (PG_A | PG_V)) == (PG_A | PG_V); PMAP_UNLOCK(pmap); if (rv) break; } sched_unpin(); return (rv); } /* * Clear the write and modified bits in each of the given page's mappings. */ void pmap_remove_write(vm_page_t m) { struct md_page *pvh; pv_entry_t next_pv, pv; pmap_t pmap; pd_entry_t *pde; pt_entry_t oldpte, *pte; vm_offset_t va; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating.
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, va); if ((*pde & PG_RW) != 0) (void)pmap_demote_pde(pmap, pde, va); PMAP_UNLOCK(pmap); } small_mappings: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_clear_write: found" " a 4mpage in page %p's pv list", m)); pte = pmap_pte_quick(pmap, pv->pv_va); retry: oldpte = *pte; if ((oldpte & PG_RW) != 0) { /* * Regardless of whether a pte is 32 or 64 bits * in size, PG_RW and PG_M are among the least * significant 32 bits. */ if (!atomic_cmpset_int((u_int *)pte, oldpte, oldpte & ~(PG_RW | PG_M))) goto retry; if ((oldpte & PG_M) != 0) vm_page_dirty(m); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); sched_unpin(); rw_wunlock(&pvh_global_lock); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * pmap_ts_referenced: * * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. - * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. * * As an optimization, update the page's dirty field if a modified bit is * found while counting reference bits. This opportunistic update can be * performed at low cost and can eliminate the need for some future calls * to pmap_is_modified(). However, since this function stops after * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some * dirty pages. Those dirty pages will only be detected by a future call * to pmap_is_modified(). */ int pmap_ts_referenced(vm_page_t m) { struct md_page *pvh; pv_entry_t pv, pvf; pmap_t pmap; pd_entry_t *pde; pt_entry_t *pte; vm_paddr_t pa; int rtval = 0; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_ts_referenced: page %p is not managed", m)); pa = VM_PAGE_TO_PHYS(m); pvh = pa_to_pvh(pa); rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0 || (pvf = TAILQ_FIRST(&pvh->pv_list)) == NULL) goto small_mappings; pv = pvf; do { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, pv->pv_va); if ((*pde & (PG_M | PG_RW)) == (PG_M | PG_RW)) { /* * Although "*pde" is mapping a 2/4MB page, because * this function is called at a 4KB page granularity, * we only update the 4KB page under test. */ vm_page_dirty(m); } if ((*pde & PG_A) != 0) { /* * Since this reference bit is shared by either 1024 * or 512 4KB pages, it should not be cleared every * time it is tested. Apply a simple "hash" function * on the physical page number, the virtual superpage * number, and the pmap address to select one 4KB page * out of the 1024 or 512 on which testing the * reference bit will result in clearing that bit. * This function is designed to avoid the selection of * the same 4KB page for every 2- or 4MB page mapping. * * On demotion, a mapping that hasn't been referenced * is simply destroyed. To avoid the possibility of a * subsequent page fault on a demoted wired mapping, * always leave its reference bit set. 
Moreover, * since the superpage is wired, the current state of * its reference bit won't affect page replacement. */ if ((((pa >> PAGE_SHIFT) ^ (pv->pv_va >> PDRSHIFT) ^ (uintptr_t)pmap) & (NPTEPG - 1)) == 0 && (*pde & PG_W) == 0) { atomic_clear_int((u_int *)pde, PG_A); pmap_invalidate_page(pmap, pv->pv_va); } rtval++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&pvh->pv_list, pv, pv_next); } if (rtval >= PMAP_TS_REFERENCED_MAX) goto out; } while ((pv = TAILQ_FIRST(&pvh->pv_list)) != pvf); small_mappings: if ((pvf = TAILQ_FIRST(&m->md.pv_list)) == NULL) goto out; pv = pvf; do { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_ts_referenced: found a 4mpage in page %p's pv list", m)); pte = pmap_pte_quick(pmap, pv->pv_va); if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) vm_page_dirty(m); if ((*pte & PG_A) != 0) { atomic_clear_int((u_int *)pte, PG_A); pmap_invalidate_page(pmap, pv->pv_va); rtval++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); } } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != pvf && rtval < PMAP_TS_REFERENCED_MAX); out: sched_unpin(); rw_wunlock(&pvh_global_lock); return (rtval); } /* * Apply the given advice to the specified range of addresses within the * given pmap. Depending on the advice, clear the referenced and/or * modified flags in each mapping and set the mapped page's dirty field. */ void pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice) { pd_entry_t oldpde, *pde; pt_entry_t *pte; vm_offset_t pdnxt; vm_page_t m; boolean_t anychanged, pv_lists_locked; if (advice != MADV_DONTNEED && advice != MADV_FREE) return; if (pmap_is_current(pmap)) pv_lists_locked = FALSE; else { pv_lists_locked = TRUE; resume: rw_wlock(&pvh_global_lock); sched_pin(); } anychanged = FALSE; PMAP_LOCK(pmap); for (; sva < eva; sva = pdnxt) { pdnxt = (sva + NBPDR) & ~PDRMASK; if (pdnxt < sva) pdnxt = eva; pde = pmap_pde(pmap, sva); oldpde = *pde; if ((oldpde & PG_V) == 0) continue; else if ((oldpde & PG_PS) != 0) { if ((oldpde & PG_MANAGED) == 0) continue; if (!pv_lists_locked) { pv_lists_locked = TRUE; if (!rw_try_wlock(&pvh_global_lock)) { if (anychanged) pmap_invalidate_all(pmap); PMAP_UNLOCK(pmap); goto resume; } sched_pin(); } if (!pmap_demote_pde(pmap, pde, sva)) { /* * The large page mapping was destroyed. */ continue; } /* * Unless the page mappings are wired, remove the * mapping to a single page so that a subsequent * access may repromote. Since the underlying page * table page is fully populated, this removal never * frees a page table page. */ if ((oldpde & PG_W) == 0) { pte = pmap_pte_quick(pmap, sva); KASSERT((*pte & PG_V) != 0, ("pmap_advise: invalid PTE")); pmap_remove_pte(pmap, pte, sva, NULL); anychanged = TRUE; } } if (pdnxt > eva) pdnxt = eva; for (pte = pmap_pte_quick(pmap, sva); sva != pdnxt; pte++, sva += PAGE_SIZE) { if ((*pte & (PG_MANAGED | PG_V)) != (PG_MANAGED | PG_V)) continue; else if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { if (advice == MADV_DONTNEED) { /* * Future calls to pmap_is_modified() * can be avoided by making the page * dirty now. 
*/ m = PHYS_TO_VM_PAGE(*pte & PG_FRAME); vm_page_dirty(m); } atomic_clear_int((u_int *)pte, PG_M | PG_A); } else if ((*pte & PG_A) != 0) atomic_clear_int((u_int *)pte, PG_A); else continue; if ((*pte & PG_G) != 0) pmap_invalidate_page(pmap, sva); else anychanged = TRUE; } } if (anychanged) pmap_invalidate_all(pmap); if (pv_lists_locked) { sched_unpin(); rw_wunlock(&pvh_global_lock); } PMAP_UNLOCK(pmap); } /* * Clear the modify bits on the specified physical page. */ void pmap_clear_modify(vm_page_t m) { struct md_page *pvh; pv_entry_t next_pv, pv; pmap_t pmap; pd_entry_t oldpde, *pde; pt_entry_t oldpte, *pte; vm_offset_t va; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("pmap_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PGA_WRITEABLE, then no PTEs can have PG_M set. * If the object containing the page is locked and the page is not * exclusive busied, then PGA_WRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); sched_pin(); if ((m->flags & PG_FICTITIOUS) != 0) goto small_mappings; pvh = pa_to_pvh(VM_PAGE_TO_PHYS(m)); TAILQ_FOREACH_SAFE(pv, &pvh->pv_list, pv_next, next_pv) { va = pv->pv_va; pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, va); oldpde = *pde; if ((oldpde & PG_RW) != 0) { if (pmap_demote_pde(pmap, pde, va)) { if ((oldpde & PG_W) == 0) { /* * Write protect the mapping to a * single page so that a subsequent * write access may repromote. */ va += VM_PAGE_TO_PHYS(m) - (oldpde & PG_PS_FRAME); pte = pmap_pte_quick(pmap, va); oldpte = *pte; if ((oldpte & PG_V) != 0) { /* * Regardless of whether a pte is 32 or 64 bits * in size, PG_RW and PG_M are among the least * significant 32 bits. */ while (!atomic_cmpset_int((u_int *)pte, oldpte, oldpte & ~(PG_M | PG_RW))) oldpte = *pte; vm_page_dirty(m); pmap_invalidate_page(pmap, va); } } } } PMAP_UNLOCK(pmap); } small_mappings: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pde = pmap_pde(pmap, pv->pv_va); KASSERT((*pde & PG_PS) == 0, ("pmap_clear_modify: found" " a 4mpage in page %p's pv list", m)); pte = pmap_pte_quick(pmap, pv->pv_va); if ((*pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) { /* * Regardless of whether a pte is 32 or 64 bits * in size, PG_M is among the least significant * 32 bits. */ atomic_clear_int((u_int *)pte, PG_M); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } sched_unpin(); rw_wunlock(&pvh_global_lock); } /* * Miscellaneous support routines follow */ /* Adjust the cache mode for a 4KB page mapped via a PTE. */ static __inline void pmap_pte_attr(pt_entry_t *pte, int cache_bits) { u_int opte, npte; /* * The cache mode bits are all in the low 32-bits of the * PTE, so we can just spin on updating the low 32-bits. */ do { opte = *(u_int *)pte; npte = opte & ~PG_PTE_CACHE; npte |= cache_bits; } while (npte != opte && !atomic_cmpset_int((u_int *)pte, opte, npte)); } /* Adjust the cache mode for a 2/4MB page mapped via a PDE. */ static __inline void pmap_pde_attr(pd_entry_t *pde, int cache_bits) { u_int opde, npde; /* * The cache mode bits are all in the low 32-bits of the * PDE, so we can just spin on updating the low 32-bits. */ do { opde = *(u_int *)pde; npde = opde & ~PG_PDE_CACHE; npde |= cache_bits; } while (npde != opde && !atomic_cmpset_int((u_int *)pde, opde, npde)); } /* * Map a set of physical memory pages into the kernel virtual * address space. 
Return a pointer to where it is mapped. This * routine is intended to be used for mapping device memory, * NOT real memory. */ void * pmap_mapdev_attr(vm_paddr_t pa, vm_size_t size, int mode) { struct pmap_preinit_mapping *ppim; vm_offset_t va, offset; vm_size_t tmpsize; int i; offset = pa & PAGE_MASK; size = round_page(offset + size); pa = pa & PG_FRAME; if (pa < KERNLOAD && pa + size <= KERNLOAD) va = KERNBASE + pa; else if (!pmap_initialized) { va = 0; for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == 0) { ppim->pa = pa; ppim->sz = size; ppim->mode = mode; ppim->va = virtual_avail; virtual_avail += size; va = ppim->va; break; } } if (va == 0) panic("%s: too many preinit mappings", __func__); } else { /* * If we have a preinit mapping, re-use it. */ for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->pa == pa && ppim->sz == size && ppim->mode == mode) return ((void *)(ppim->va + offset)); } va = kva_alloc(size); if (va == 0) panic("%s: Couldn't allocate KVA", __func__); } for (tmpsize = 0; tmpsize < size; tmpsize += PAGE_SIZE) pmap_kenter_attr(va + tmpsize, pa + tmpsize, mode); pmap_invalidate_range(kernel_pmap, va, va + tmpsize); pmap_invalidate_cache_range(va, va + size, FALSE); return ((void *)(va + offset)); } void * pmap_mapdev(vm_paddr_t pa, vm_size_t size) { return (pmap_mapdev_attr(pa, size, PAT_UNCACHEABLE)); } void * pmap_mapbios(vm_paddr_t pa, vm_size_t size) { return (pmap_mapdev_attr(pa, size, PAT_WRITE_BACK)); } void pmap_unmapdev(vm_offset_t va, vm_size_t size) { struct pmap_preinit_mapping *ppim; vm_offset_t offset; int i; if (va >= KERNBASE && va + size <= KERNBASE + KERNLOAD) return; offset = va & PAGE_MASK; size = round_page(offset + size); va = trunc_page(va); for (i = 0; i < PMAP_PREINIT_MAPPING_COUNT; i++) { ppim = pmap_preinit_mapping + i; if (ppim->va == va && ppim->sz == size) { if (pmap_initialized) return; ppim->pa = 0; ppim->va = 0; ppim->sz = 0; ppim->mode = 0; if (va + size == virtual_avail) virtual_avail = va; return; } } if (pmap_initialized) kva_free(va, size); } /* * Sets the memory attribute for the specified page. */ void pmap_page_set_memattr(vm_page_t m, vm_memattr_t ma) { m->md.pat_mode = ma; if ((m->flags & PG_FICTITIOUS) != 0) return; /* * If "m" is a normal page, flush it from the cache. * See pmap_invalidate_cache_range(). * * First, try to find an existing mapping of the page by sf * buffer. sf_buf_invalidate_cache() modifies mapping and * flushes the cache. */ if (sf_buf_invalidate_cache(m)) return; /* * If page is not mapped by sf buffer, but CPU does not * support self snoop, map the page transient and do * invalidation. In the worst case, whole cache is flushed by * pmap_invalidate_cache_range(). 
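 * (CPUID_SS is the CPU's self-snoop feature bit: a processor that reports it keeps its caches coherent across aliased mappings with different memory types, so the explicit flush below may be skipped.)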
*/ if ((cpu_feature & CPUID_SS) == 0) pmap_flush_page(m); } static void pmap_flush_page(vm_page_t m) { struct sysmaps *sysmaps; vm_offset_t sva, eva; bool useclflushopt; useclflushopt = (cpu_stdext_feature & CPUID_STDEXT_CLFLUSHOPT) != 0; if (useclflushopt || (cpu_feature & CPUID_CLFSH) != 0) { sysmaps = &sysmaps_pcpu[PCPU_GET(cpuid)]; mtx_lock(&sysmaps->lock); if (*sysmaps->CMAP2) panic("pmap_flush_page: CMAP2 busy"); sched_pin(); *sysmaps->CMAP2 = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(m->md.pat_mode, 0); invlcaddr(sysmaps->CADDR2); sva = (vm_offset_t)sysmaps->CADDR2; eva = sva + PAGE_SIZE; /* * Use mfence despite the ordering implied by * mtx_{un,}lock() because clflush on non-Intel CPUs * and clflushopt are not guaranteed to be ordered by * any other instruction. */ if (useclflushopt || cpu_vendor_id != CPU_VENDOR_INTEL) mfence(); for (; sva < eva; sva += cpu_clflush_line_size) { if (useclflushopt) clflushopt(sva); else clflush(sva); } if (useclflushopt || cpu_vendor_id != CPU_VENDOR_INTEL) mfence(); *sysmaps->CMAP2 = 0; sched_unpin(); mtx_unlock(&sysmaps->lock); } else pmap_invalidate_cache(); } /* * Changes the specified virtual address range's memory type to that given by * the parameter "mode". The specified virtual address range must be * completely contained within the kernel map. * * Returns zero if the change completed successfully, and either EINVAL or * ENOMEM if the change failed. Specifically, EINVAL is returned if some part * of the virtual address range was not mapped, and ENOMEM is returned if * there was insufficient memory available to complete the change. */ int pmap_change_attr(vm_offset_t va, vm_size_t size, int mode) { vm_offset_t base, offset, tmpva; pd_entry_t *pde; pt_entry_t *pte; int cache_bits_pte, cache_bits_pde; boolean_t changed; base = trunc_page(va); offset = va & PAGE_MASK; size = round_page(offset + size); /* * Only supported on kernel virtual addresses above the recursive map. */ if (base < VM_MIN_KERNEL_ADDRESS) return (EINVAL); cache_bits_pde = pmap_cache_bits(mode, 1); cache_bits_pte = pmap_cache_bits(mode, 0); changed = FALSE; /* * Pages that aren't mapped aren't supported. Also break down * 2/4MB pages into 4KB pages if required. */ PMAP_LOCK(kernel_pmap); for (tmpva = base; tmpva < base + size; ) { pde = pmap_pde(kernel_pmap, tmpva); if (*pde == 0) { PMAP_UNLOCK(kernel_pmap); return (EINVAL); } if (*pde & PG_PS) { /* * If the current 2/4MB page already has * the required memory type, then we need not * demote this page. Just increment tmpva to * the next 2/4MB page frame. */ if ((*pde & PG_PDE_CACHE) == cache_bits_pde) { tmpva = trunc_4mpage(tmpva) + NBPDR; continue; } /* * If the current offset aligns with a 2/4MB * page frame and there is at least 2/4MB left * within the range, then we need not break * down this page into 4KB pages. */ if ((tmpva & PDRMASK) == 0 && tmpva + PDRMASK < base + size) { tmpva += NBPDR; continue; } if (!pmap_demote_pde(kernel_pmap, pde, tmpva)) { PMAP_UNLOCK(kernel_pmap); return (ENOMEM); } } pte = vtopte(tmpva); if (*pte == 0) { PMAP_UNLOCK(kernel_pmap); return (EINVAL); } tmpva += PAGE_SIZE; } PMAP_UNLOCK(kernel_pmap); /* * Ok, all the pages exist, so run through them updating their * cache mode if required.
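 * (This is the second of two passes: the loop above verified that the entire range is mapped and demoted any 2/4MB pages that had to be split, so this pass can apply the new attributes without failing partway through.)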
*/ for (tmpva = base; tmpva < base + size; ) { pde = pmap_pde(kernel_pmap, tmpva); if (*pde & PG_PS) { if ((*pde & PG_PDE_CACHE) != cache_bits_pde) { pmap_pde_attr(pde, cache_bits_pde); changed = TRUE; } tmpva = trunc_4mpage(tmpva) + NBPDR; } else { pte = vtopte(tmpva); if ((*pte & PG_PTE_CACHE) != cache_bits_pte) { pmap_pte_attr(pte, cache_bits_pte); changed = TRUE; } tmpva += PAGE_SIZE; } } /* * Flush CPU caches to make sure any data isn't cached that * shouldn't be, etc. */ if (changed) { pmap_invalidate_range(kernel_pmap, base, tmpva); pmap_invalidate_cache_range(base, tmpva, FALSE); } return (0); } /* * perform the pmap work for mincore */ int pmap_mincore(pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { pd_entry_t *pdep; pt_entry_t *ptep, pte; vm_paddr_t pa; int val; PMAP_LOCK(pmap); retry: pdep = pmap_pde(pmap, addr); if (*pdep != 0) { if (*pdep & PG_PS) { pte = *pdep; /* Compute the physical address of the 4KB page. */ pa = ((*pdep & PG_PS_FRAME) | (addr & PDRMASK)) & PG_FRAME; val = MINCORE_SUPER; } else { ptep = pmap_pte(pmap, addr); pte = *ptep; pmap_pte_release(ptep); pa = pte & PG_FRAME; val = 0; } } else { pte = 0; pa = 0; val = 0; } if ((pte & PG_V) != 0) { val |= MINCORE_INCORE; if ((pte & (PG_M | PG_RW)) == (PG_M | PG_RW)) val |= MINCORE_MODIFIED | MINCORE_MODIFIED_OTHER; if ((pte & PG_A) != 0) val |= MINCORE_REFERENCED | MINCORE_REFERENCED_OTHER; } if ((val & (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER)) != (MINCORE_MODIFIED_OTHER | MINCORE_REFERENCED_OTHER) && (pte & (PG_MANAGED | PG_V)) == (PG_MANAGED | PG_V)) { /* Ensure that "PHYS_TO_VM_PAGE(pa)->object" doesn't change. */ if (vm_page_pa_tryrelock(pmap, pa, locked_pa)) goto retry; } else PA_UNLOCK_COND(*locked_pa); PMAP_UNLOCK(pmap); return (val); } void pmap_activate(struct thread *td) { pmap_t pmap, oldpmap; u_int cpuid; u_int32_t cr3; critical_enter(); pmap = vmspace_pmap(td->td_proc->p_vmspace); oldpmap = PCPU_GET(curpmap); cpuid = PCPU_GET(cpuid); #if defined(SMP) CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); CPU_SET_ATOMIC(cpuid, &pmap->pm_active); #else CPU_CLR(cpuid, &oldpmap->pm_active); CPU_SET(cpuid, &pmap->pm_active); #endif #if defined(PAE) || defined(PAE_TABLES) cr3 = vtophys(pmap->pm_pdpt); #else cr3 = vtophys(pmap->pm_pdir); #endif /* * pmap_activate is for the current thread on the current cpu */ td->td_pcb->pcb_cr3 = cr3; load_cr3(cr3); PCPU_SET(curpmap, pmap); critical_exit(); } void pmap_sync_icache(pmap_t pm, vm_offset_t va, vm_size_t sz) { } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more superpage mappings. 
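 *
 * [Editor's note: a worked example of the adjustment performed below,
 * assuming NBPDR = 4MB and PDRMASK = 0x3fffff (non-PAE i386). With
 * offset = 0x501000, *addr = 0x30000000 and size = 16MB:
 * superpage_offset = offset & PDRMASK = 0x101000, and since
 * (*addr & PDRMASK) = 0 is less than superpage_offset, the routine
 * sets *addr = 0x30101000. The mapping's virtual address and file
 * offset are then congruent modulo 4MB, which is what allows 2/4MB
 * page promotion.]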
*/ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { vm_offset_t superpage_offset; if (size < NBPDR) return; if (object != NULL && (object->flags & OBJ_COLORED) != 0) offset += ptoa(object->pg_color); superpage_offset = offset & PDRMASK; if (size - ((NBPDR - superpage_offset) & PDRMASK) < NBPDR || (*addr & PDRMASK) == superpage_offset) return; if ((*addr & PDRMASK) < superpage_offset) *addr = (*addr & ~PDRMASK) + superpage_offset; else *addr = ((*addr + PDRMASK) & ~PDRMASK) + superpage_offset; } vm_offset_t pmap_quick_enter_page(vm_page_t m) { vm_offset_t qaddr; pt_entry_t *pte; critical_enter(); qaddr = PCPU_GET(qmap_addr); pte = vtopte(qaddr); KASSERT(*pte == 0, ("pmap_quick_enter_page: PTE busy")); *pte = PG_V | PG_RW | VM_PAGE_TO_PHYS(m) | PG_A | PG_M | pmap_cache_bits(pmap_page_get_memattr(m), 0); invlpg(qaddr); return (qaddr); } void pmap_quick_remove_page(vm_offset_t addr) { vm_offset_t qaddr; pt_entry_t *pte; qaddr = PCPU_GET(qmap_addr); pte = vtopte(qaddr); KASSERT(*pte != 0, ("pmap_quick_remove_page: PTE not in use")); KASSERT(addr == qaddr, ("pmap_quick_remove_page: invalid address")); *pte = 0; critical_exit(); } #if defined(PMAP_DEBUG) pmap_pid_dump(int pid) { pmap_t pmap; struct proc *p; int npte = 0; int index; sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { if (p->p_pid != pid) continue; if (p->p_vmspace) { int i,j; index = 0; pmap = vmspace_pmap(p->p_vmspace); for (i = 0; i < NPDEPTD; i++) { pd_entry_t *pde; pt_entry_t *pte; vm_offset_t base = i << PDRSHIFT; pde = &pmap->pm_pdir[i]; if (pde && pmap_pde_v(pde)) { for (j = 0; j < NPTEPG; j++) { vm_offset_t va = base + (j << PAGE_SHIFT); if (va >= (vm_offset_t) VM_MIN_KERNEL_ADDRESS) { if (index) { index = 0; printf("\n"); } sx_sunlock(&allproc_lock); return (npte); } pte = pmap_pte(pmap, va); if (pte && pmap_pte_v(pte)) { pt_entry_t pa; vm_page_t m; pa = *pte; m = PHYS_TO_VM_PAGE(pa & PG_FRAME); printf("va: 0x%x, pt: 0x%x, h: %d, w: %d, f: 0x%x", va, pa, m->hold_count, m->wire_count, m->flags); npte++; index++; if (index >= 2) { index = 0; printf("\n"); } else { printf(" "); } } } } } } } sx_sunlock(&allproc_lock); return (npte); } #endif Index: projects/clang390-import/sys/kern/kern_exit.c =================================================================== --- projects/clang390-import/sys/kern/kern_exit.c (revision 305686) +++ projects/clang390-import/sys/kern/kern_exit.c (revision 305687) @@ -1,1326 +1,1326 @@ /*- * Copyright (c) 1982, 1986, 1989, 1991, 1993 * The Regents of the University of California. All rights reserved. * (c) UNIX System Laboratories, Inc. * All or some portions of this file are derived from material licensed * to the University of California by American Telephone and Telegraph * Co. or Unix System Laboratories, Inc. and are reproduced herein with * the permission of UNIX System Laboratories, Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. 
Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)kern_exit.c 8.7 (Berkeley) 2/12/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_compat.h" #include "opt_ktrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* for acct_process() function prototype */ #include #include #include #include #include #ifdef KTRACE #include #endif #include #include #include #include #include #include #include #include #include #include #ifdef KDTRACE_HOOKS #include dtrace_execexit_func_t dtrace_fasttrap_exit; #endif SDT_PROVIDER_DECLARE(proc); SDT_PROBE_DEFINE1(proc, , , exit, "int"); /* Hook for NFS teardown procedure. */ void (*nlminfo_release_p)(struct proc *p); struct proc * proc_realparent(struct proc *child) { struct proc *p, *parent; sx_assert(&proctree_lock, SX_LOCKED); if ((child->p_treeflag & P_TREE_ORPHANED) == 0) { if (child->p_oppid == 0 || child->p_pptr->p_pid == child->p_oppid) parent = child->p_pptr; else parent = initproc; return (parent); } for (p = child; (p->p_treeflag & P_TREE_FIRST_ORPHAN) == 0;) { /* Cannot use LIST_PREV(), since the list head is not known. */ p = __containerof(p->p_orphan.le_prev, struct proc, p_orphan.le_next); KASSERT((p->p_treeflag & P_TREE_ORPHANED) != 0, ("missing P_ORPHAN %p", p)); } parent = __containerof(p->p_orphan.le_prev, struct proc, p_orphans.lh_first); return (parent); } void reaper_abandon_children(struct proc *p, bool exiting) { struct proc *p1, *p2, *ptmp; sx_assert(&proctree_lock, SX_LOCKED); KASSERT(p != initproc, ("reaper_abandon_children for initproc")); if ((p->p_treeflag & P_TREE_REAPER) == 0) return; p1 = p->p_reaper; LIST_FOREACH_SAFE(p2, &p->p_reaplist, p_reapsibling, ptmp) { LIST_REMOVE(p2, p_reapsibling); p2->p_reaper = p1; p2->p_reapsubtree = p->p_reapsubtree; LIST_INSERT_HEAD(&p1->p_reaplist, p2, p_reapsibling); if (exiting && p2->p_pptr == p) { PROC_LOCK(p2); proc_reparent(p2, p1); PROC_UNLOCK(p2); } } KASSERT(LIST_EMPTY(&p->p_reaplist), ("p_reaplist not empty")); p->p_treeflag &= ~P_TREE_REAPER; } static void clear_orphan(struct proc *p) { struct proc *p1; sx_assert(&proctree_lock, SA_XLOCKED); if ((p->p_treeflag & P_TREE_ORPHANED) == 0) return; if ((p->p_treeflag & P_TREE_FIRST_ORPHAN) != 0) { p1 = LIST_NEXT(p, p_orphan); if (p1 != NULL) p1->p_treeflag |= P_TREE_FIRST_ORPHAN; p->p_treeflag &= ~P_TREE_FIRST_ORPHAN; } LIST_REMOVE(p, p_orphan); p->p_treeflag &= ~P_TREE_ORPHANED; } /* * exit -- death of process. 
*/ void sys_sys_exit(struct thread *td, struct sys_exit_args *uap) { exit1(td, uap->rval, 0); /* NOTREACHED */ } /* * Exit: deallocate address space and other resources, change proc state to * zombie, and unlink proc from allproc and parent's lists. Save exit status * and rusage for wait(). Check for child processes and orphan them. */ void exit1(struct thread *td, int rval, int signo) { struct proc *p, *nq, *q, *t; struct thread *tdt; mtx_assert(&Giant, MA_NOTOWNED); KASSERT(rval == 0 || signo == 0, ("exit1 rv %d sig %d", rval, signo)); p = td->td_proc; /* * XXX in case we're rebooting we just let init die in order to * work around an unsolved stack overflow seen very late during * shutdown on sparc64 when the gmirror worker process exits. */ if (p == initproc && rebooting == 0) { printf("init died (signal %d, exit %d)\n", signo, rval); panic("Going nowhere without my init!"); } /* * Deref SU mp, since the thread does not return to userspace. */ if (softdep_ast_cleanup != NULL) softdep_ast_cleanup(); /* * MUST abort all other threads before proceeding past here. */ PROC_LOCK(p); /* * First check if some other thread or external request got * here before us. If so, act appropriately: exit or suspend. * We must ensure that stop requests are handled before we set * P_WEXIT. */ thread_suspend_check(0); while (p->p_flag & P_HADTHREADS) { /* * Kill off the other threads. This requires * some co-operation from other parts of the kernel * so it may not be instantaneous. With this state set * any thread entering the kernel from userspace will * thread_exit() in trap(). Any thread attempting to * sleep will return immediately with EINTR or EWOULDBLOCK * which will hopefully force them to back out to userland * freeing resources as they go. Any thread attempting * to return to userland will thread_exit() from userret(). * thread_exit() will unsuspend us when the last of the * other threads exits. * If another single-threading request is already in * progress after resumption, calling thread_single will * fail; in that case, we just re-check all suspension * requests, and the thread should either be suspended * there or exit. */ if (!thread_single(p, SINGLE_EXIT)) /* * All other activity in this process is now * stopped. Threading support has been turned * off. */ break; /* * Recheck for new stop or suspend requests which * might appear while process lock was dropped in * thread_single(). */ thread_suspend_check(0); } KASSERT(p->p_numthreads == 1, ("exit1: proc %p exiting with %d threads", p, p->p_numthreads)); racct_sub(p, RACCT_NTHR, 1); /* Let the event handler change the exit status. */ p->p_xexit = rval; p->p_xsig = signo; /* * Wakeup anyone in procfs' PIOCWAIT. They should have a hold * on our vmspace, so we should block below until they have * released their reference to us. Note that if they have * requested S_EXIT stops we will block here until they ack * via PIOCCONT. */ _STOPEVENT(p, S_EXIT, 0); /* * Ignore any pending request to stop due to a stop signal. * Once P_WEXIT is set, future requests will be ignored as * well. */ p->p_flag &= ~P_STOPPED_SIG; KASSERT(!P_SHOULDSTOP(p), ("exiting process is stopped")); /* * Note that we are exiting and do another wakeup of anyone in * PIOCWAIT in case they aren't listening for S_EXIT stops or * decided to wait again after we told them we are exiting. */ p->p_flag |= P_WEXIT; wakeup(&p->p_stype); /* * Wait for any processes that have a hold on our vmspace to * release their reference.
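 *
 * [Editor's note: the hold count drained below is the one manipulated
 * by the PHOLD()/PRELE() macros; a holder does, schematically:
 *
 *	PHOLD(p);	increments p->p_lock
 *	...use p's vmspace...
 *	PRELE(p);	decrements p->p_lock, wakes the sleeper at zero
 *
 * which is why the msleep() below sleeps on &p->p_lock.]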
*/ while (p->p_lock > 0) msleep(&p->p_lock, &p->p_mtx, PWAIT, "exithold", 0); PROC_UNLOCK(p); /* Drain the limit callout while we don't have the proc locked */ callout_drain(&p->p_limco); #ifdef AUDIT /* * The Sun BSM exit token contains two components: an exit status as * passed to exit(), and a return value to indicate what sort of exit * it was. The exit status is WEXITSTATUS(rv), but it's not clear * what the return value is. */ AUDIT_ARG_EXIT(rval, 0); AUDIT_SYSCALL_EXIT(0, td); #endif /* Are we a task leader with peers? */ if (p->p_peers != NULL && p == p->p_leader) { mtx_lock(&ppeers_lock); q = p->p_peers; while (q != NULL) { PROC_LOCK(q); kern_psignal(q, SIGKILL); PROC_UNLOCK(q); q = q->p_peers; } while (p->p_peers != NULL) msleep(p, &ppeers_lock, PWAIT, "exit1", 0); mtx_unlock(&ppeers_lock); } /* * Check if any loadable modules need anything done at process exit. * E.g. SYSV IPC stuff. * Event handler could change exit status. * XXX what if one of these generates an error? */ EVENTHANDLER_INVOKE(process_exit, p); /* * If parent is waiting for us to exit or exec, * P_PPWAIT is set; we will wakeup the parent below. */ PROC_LOCK(p); stopprofclock(p); p->p_flag &= ~(P_TRACED | P_PPWAIT | P_PPTRACE); p->p_ptevents = 0; /* * Stop the real interval timer. If the handler is currently * executing, prevent it from rearming itself and let it finish. */ if (timevalisset(&p->p_realtimer.it_value) && _callout_stop_safe(&p->p_itcallout, CS_EXECUTING, NULL) == 0) { timevalclear(&p->p_realtimer.it_interval); msleep(&p->p_itcallout, &p->p_mtx, PWAIT, "ritwait", 0); KASSERT(!timevalisset(&p->p_realtimer.it_value), ("realtime timer is still armed")); } PROC_UNLOCK(p); umtx_thread_exit(td); /* * Reset any sigio structures pointing to us as a result of * F_SETOWN with our pid. */ funsetownlst(&p->p_sigiolst); /* * If this process has an nlminfo data area (for lockd), release it */ if (nlminfo_release_p != NULL && p->p_nlminfo != NULL) (*nlminfo_release_p)(p); /* * Close open files and release open-file table. * This may block! */ fdescfree(td); /* * If this thread tickled GEOM, we need to wait for the giggling to * stop before we return to userland */ if (td->td_pflags & TDP_GEOM) g_waitidle(); /* * Remove ourself from our leader's peer list and wake our leader. */ if (p->p_leader->p_peers != NULL) { mtx_lock(&ppeers_lock); if (p->p_leader->p_peers != NULL) { q = p->p_leader; while (q->p_peers != p) q = q->p_peers; q->p_peers = p->p_peers; wakeup(p->p_leader); } mtx_unlock(&ppeers_lock); } vmspace_exit(td); killjobc(); (void)acct_process(td); #ifdef KTRACE ktrprocexit(td); #endif /* * Release reference to text vnode */ if (p->p_textvp != NULL) { vrele(p->p_textvp); p->p_textvp = NULL; } /* * Release our limits structure. */ lim_free(p->p_limit); p->p_limit = NULL; tidhash_remove(td); /* * Remove proc from allproc queue and pidhash chain. * Place onto zombproc. Unlink from parent's child list. */ sx_xlock(&allproc_lock); LIST_REMOVE(p, p_list); LIST_INSERT_HEAD(&zombproc, p, p_list); LIST_REMOVE(p, p_hash); sx_xunlock(&allproc_lock); /* * Call machine-dependent code to release any * machine-dependent resources other than the address space. * The address space is released by "vmspace_exitfree(p)" in * vm_waitproc(). 
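 *
 * [Editor's note: the split exists because cpu_exit() still runs on
 * the dying thread, so resources the thread itself is using (kernel
 * stack, pcb) cannot be released here; they are reclaimed later, from
 * the waiting process' context, via thread_wait() and vm_waitproc()
 * in proc_reap().]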
*/ cpu_exit(td); WITNESS_WARN(WARN_PANIC, NULL, "process (pid %d) exiting", p->p_pid); /* * Reparent all child processes: * - traced ones to the original parent (or init if we are that parent) * - the rest to init */ sx_xlock(&proctree_lock); q = LIST_FIRST(&p->p_children); if (q != NULL) /* only need this if any child is S_ZOMB */ wakeup(q->p_reaper); for (; q != NULL; q = nq) { nq = LIST_NEXT(q, p_sibling); PROC_LOCK(q); q->p_sigparent = SIGCHLD; if (!(q->p_flag & P_TRACED)) { proc_reparent(q, q->p_reaper); } else { /* * Traced processes are killed since their existence * means someone is screwing up. */ t = proc_realparent(q); if (t == p) { proc_reparent(q, q->p_reaper); } else { PROC_LOCK(t); proc_reparent(q, t); PROC_UNLOCK(t); } /* * Since q was found on our children list, the * proc_reparent() call moved q to the orphan * list due to the present P_TRACED flag. Clear * the orphan link for q now while q is locked. */ clear_orphan(q); q->p_flag &= ~(P_TRACED | P_STOPPED_TRACE); q->p_flag2 &= ~P2_PTRACE_FSTP; q->p_ptevents = 0; FOREACH_THREAD_IN_PROC(q, tdt) { tdt->td_dbgflags &= ~(TDB_SUSPEND | TDB_XSIG | TDB_FSTP); } kern_psignal(q, SIGKILL); } PROC_UNLOCK(q); } /* * Also get rid of our orphans. */ while ((q = LIST_FIRST(&p->p_orphans)) != NULL) { PROC_LOCK(q); CTR2(KTR_PTRACE, "exit: pid %d, clearing orphan %d", p->p_pid, q->p_pid); clear_orphan(q); PROC_UNLOCK(q); } /* Save exit status. */ PROC_LOCK(p); p->p_xthread = td; /* Tell the prison that we are gone. */ prison_proc_free(p->p_ucred->cr_prison); #ifdef KDTRACE_HOOKS /* * Tell the DTrace fasttrap provider about the exit if it * has declared an interest. */ if (dtrace_fasttrap_exit) dtrace_fasttrap_exit(p); #endif /* * Notify interested parties of our demise. */ KNOTE_LOCKED(p->p_klist, NOTE_EXIT); #ifdef KDTRACE_HOOKS int reason = CLD_EXITED; if (WCOREDUMP(signo)) reason = CLD_DUMPED; else if (WIFSIGNALED(signo)) reason = CLD_KILLED; SDT_PROBE1(proc, , , exit, reason); #endif /* * If this is a process with a descriptor, we may not need to deliver * a signal to the parent. proctree_lock is held over * procdesc_exit() to serialize concurrent calls to close() and * exit(). */ if (p->p_procdesc == NULL || procdesc_exit(p)) { /* * Notify the parent that we're gone. If the parent has the * PS_NOCLDWAIT flag set, or if the handler is set to SIG_IGN, * notify process 1 instead (and hope it will handle this * situation). */ PROC_LOCK(p->p_pptr); mtx_lock(&p->p_pptr->p_sigacts->ps_mtx); if (p->p_pptr->p_sigacts->ps_flag & (PS_NOCLDWAIT | PS_CLDSIGIGN)) { struct proc *pp; mtx_unlock(&p->p_pptr->p_sigacts->ps_mtx); pp = p->p_pptr; PROC_UNLOCK(pp); proc_reparent(p, p->p_reaper); p->p_sigparent = SIGCHLD; PROC_LOCK(p->p_pptr); /* * Notify the parent, so that if it was blocked in wait(2) * or waitpid(2) on our pid, it will continue. */ wakeup(pp); } else mtx_unlock(&p->p_pptr->p_sigacts->ps_mtx); if (p->p_pptr == p->p_reaper || p->p_pptr == initproc) childproc_exited(p); else if (p->p_sigparent != 0) { if (p->p_sigparent == SIGCHLD) childproc_exited(p); else /* LINUX thread */ kern_psignal(p->p_pptr, p->p_sigparent); } } else PROC_LOCK(p->p_pptr); sx_xunlock(&proctree_lock); /* * The state PRS_ZOMBIE prevents other processes from sending * signals to the process; to avoid a memory leak, we free the * memory for the signal queue at the time when the state is set. */ sigqueue_flush(&p->p_sigqueue); sigqueue_flush(&td->td_sigqueue); /* * We have to wait until after acquiring all locks before * changing p_state.
We need to avoid all possible context * switches (including ones from blocking on a mutex) while * marked as a zombie. We also have to set the zombie state * before we release the parent process' proc lock to avoid * a lost wakeup. So, we first call wakeup, then we grab the * sched lock, update the state, and release the parent process' * proc lock. */ wakeup(p->p_pptr); cv_broadcast(&p->p_pwait); sched_exit(p->p_pptr, td); PROC_SLOCK(p); p->p_state = PRS_ZOMBIE; PROC_UNLOCK(p->p_pptr); /* * Save our children's rusage information in our exit rusage. */ PROC_STATLOCK(p); ruadd(&p->p_ru, &p->p_rux, &p->p_stats->p_cru, &p->p_crux); PROC_STATUNLOCK(p); /* * Make sure the scheduler takes this thread out of its tables etc. * This will also release this thread's reference to the ucred. * Other thread parts to release include pcb bits and such. */ thread_exit(); } #ifndef _SYS_SYSPROTO_H_ struct abort2_args { char *why; int nargs; void **args; }; #endif int sys_abort2(struct thread *td, struct abort2_args *uap) { struct proc *p = td->td_proc; struct sbuf *sb; void *uargs[16]; int error, i, sig; /* * Do it right now so we can log either a proper call of abort2() or * note that an invalid argument was passed. 512 is big enough to * handle 16 arguments' descriptions with additional comments. */ sb = sbuf_new(NULL, NULL, 512, SBUF_FIXEDLEN); sbuf_clear(sb); sbuf_printf(sb, "%s(pid %d uid %d) aborted: ", p->p_comm, p->p_pid, td->td_ucred->cr_uid); /* * Since we can't return from abort2(), send SIGKILL in cases where * abort2() was called improperly. */ sig = SIGKILL; /* Prevent DoSes from user-space. */ if (uap->nargs < 0 || uap->nargs > 16) goto out; if (uap->nargs > 0) { if (uap->args == NULL) goto out; error = copyin(uap->args, uargs, uap->nargs * sizeof(void *)); if (error != 0) goto out; } /* * Limit the size of the 'reason' string to 128. It will fit even when * the maximal number of arguments was chosen to be logged. */ if (uap->why != NULL) { error = sbuf_copyin(sb, uap->why, 128); if (error < 0) goto out; } else { sbuf_printf(sb, "(null)"); } if (uap->nargs > 0) { sbuf_printf(sb, "("); for (i = 0; i < uap->nargs; i++) sbuf_printf(sb, "%s%p", i == 0 ? "" : ", ", uargs[i]); sbuf_printf(sb, ")"); } /* * Final stage: the arguments were proper, the string was * successfully copied from userspace, and copying pointers * from user-space succeeded. */ sig = SIGABRT; out: if (sig == SIGKILL) { sbuf_trim(sb); sbuf_printf(sb, " (Reason text inaccessible)"); } sbuf_cat(sb, "\n"); sbuf_finish(sb); log(LOG_INFO, "%s", sbuf_data(sb)); sbuf_delete(sb); exit1(td, 0, sig); return (0); } #ifdef COMPAT_43 /* * The dirty work is handled by kern_wait(). */ int owait(struct thread *td, struct owait_args *uap __unused) { int error, status; error = kern_wait(td, WAIT_ANY, &status, 0, NULL); if (error == 0) td->td_retval[1] = status; return (error); } #endif /* COMPAT_43 */ /* * The dirty work is handled by kern_wait().
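 *
 * [Editor's note: the change below makes sys_wait4() skip the status
 * and rusage copyout when kern_wait() succeeds with
 * td->td_retval[0] == 0, i.e. the WNOHANG "no child ready" case, in
 * which the buffers hold nothing meaningful. A userland caller sees
 * the usual POSIX contract:
 *
 *	pid = wait4(-1, &status, WNOHANG, &ru);
 *	if (pid == 0)
 *		...no state change; status and ru are untouched...
 *	else if (pid > 0 && WIFEXITED(status))
 *		...child exited with code WEXITSTATUS(status)...
 * ]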
*/ int sys_wait4(struct thread *td, struct wait4_args *uap) { struct rusage ru, *rup; int error, status; if (uap->rusage != NULL) rup = &ru; else rup = NULL; error = kern_wait(td, uap->pid, &status, uap->options, rup); - if (uap->status != NULL && error == 0) + if (uap->status != NULL && error == 0 && td->td_retval[0] != 0) error = copyout(&status, uap->status, sizeof(status)); - if (uap->rusage != NULL && error == 0) + if (uap->rusage != NULL && error == 0 && td->td_retval[0] != 0) error = copyout(&ru, uap->rusage, sizeof(struct rusage)); return (error); } int sys_wait6(struct thread *td, struct wait6_args *uap) { struct __wrusage wru, *wrup; siginfo_t si, *sip; idtype_t idtype; id_t id; int error, status; idtype = uap->idtype; id = uap->id; if (uap->wrusage != NULL) wrup = &wru; else wrup = NULL; if (uap->info != NULL) { sip = &si; bzero(sip, sizeof(*sip)); } else sip = NULL; /* * We expect all callers of wait6() to know about WEXITED and * WTRAPPED. */ error = kern_wait6(td, idtype, id, &status, uap->options, wrup, sip); - if (uap->status != NULL && error == 0) + if (uap->status != NULL && error == 0 && td->td_retval[0] != 0) error = copyout(&status, uap->status, sizeof(status)); - if (uap->wrusage != NULL && error == 0) + if (uap->wrusage != NULL && error == 0 && td->td_retval[0] != 0) error = copyout(&wru, uap->wrusage, sizeof(wru)); if (uap->info != NULL && error == 0) error = copyout(&si, uap->info, sizeof(si)); return (error); } /* * Reap the remains of a zombie process and optionally return status and * rusage. Asserts and will release both the proctree_lock and the process * lock as part of its work. */ void proc_reap(struct thread *td, struct proc *p, int *status, int options) { struct proc *q, *t; sx_assert(&proctree_lock, SA_XLOCKED); PROC_LOCK_ASSERT(p, MA_OWNED); PROC_SLOCK_ASSERT(p, MA_OWNED); KASSERT(p->p_state == PRS_ZOMBIE, ("proc_reap: !PRS_ZOMBIE")); q = td->td_proc; PROC_SUNLOCK(p); if (status) *status = KW_EXITCODE(p->p_xexit, p->p_xsig); if (options & WNOWAIT) { /* * Only poll, returning the status. Caller does not wish to * release the proc struct just yet. */ PROC_UNLOCK(p); sx_xunlock(&proctree_lock); return; } PROC_LOCK(q); sigqueue_take(p->p_ksi); PROC_UNLOCK(q); /* * If we got the child via a ptrace 'attach', we need to give it back * to the old parent. */ if (p->p_oppid != 0 && p->p_oppid != p->p_pptr->p_pid) { PROC_UNLOCK(p); t = proc_realparent(p); PROC_LOCK(t); PROC_LOCK(p); CTR2(KTR_PTRACE, "wait: traced child %d moved back to parent %d", p->p_pid, t->p_pid); proc_reparent(p, t); p->p_oppid = 0; PROC_UNLOCK(p); pksignal(t, SIGCHLD, p->p_ksi); wakeup(t); cv_broadcast(&p->p_pwait); PROC_UNLOCK(t); sx_xunlock(&proctree_lock); return; } p->p_oppid = 0; PROC_UNLOCK(p); /* * Remove other references to this process to ensure we have an * exclusive reference. */ sx_xlock(&allproc_lock); LIST_REMOVE(p, p_list); /* off zombproc */ sx_xunlock(&allproc_lock); LIST_REMOVE(p, p_sibling); reaper_abandon_children(p, true); LIST_REMOVE(p, p_reapsibling); PROC_LOCK(p); clear_orphan(p); PROC_UNLOCK(p); leavepgrp(p); if (p->p_procdesc != NULL) procdesc_reap(p); sx_xunlock(&proctree_lock); PROC_LOCK(p); knlist_detach(p->p_klist); p->p_klist = NULL; PROC_UNLOCK(p); /* * Removal from allproc list and process group list paired with * PROC_LOCK which was executed during that time should guarantee * nothing can reach this process anymore. As such further locking * is unnecessary. */ p->p_xexit = p->p_xsig = 0; /* XXX: why? 
*/ PROC_LOCK(q); ruadd(&q->p_stats->p_cru, &q->p_crux, &p->p_ru, &p->p_rux); PROC_UNLOCK(q); /* * Decrement the count of procs running with this uid. */ (void)chgproccnt(p->p_ucred->cr_ruidinfo, -1, 0); /* * Destroy resource accounting information associated with the process. */ #ifdef RACCT if (racct_enable) { PROC_LOCK(p); racct_sub(p, RACCT_NPROC, 1); PROC_UNLOCK(p); } #endif racct_proc_exit(p); /* * Free credentials, arguments, and sigacts. */ crfree(p->p_ucred); proc_set_cred(p, NULL); pargs_drop(p->p_args); p->p_args = NULL; sigacts_free(p->p_sigacts); p->p_sigacts = NULL; /* * Do any thread-system specific cleanups. */ thread_wait(p); /* * Give the vm and machine-dependent layer a chance to free anything that * cpu_exit couldn't release while still running in process context. */ vm_waitproc(p); #ifdef MAC mac_proc_destroy(p); #endif /* * Free any domain policy that's still hiding around. */ vm_domain_policy_cleanup(&p->p_vm_dom_policy); KASSERT(FIRST_THREAD_IN_PROC(p), ("proc_reap: no residual thread!")); uma_zfree(proc_zone, p); atomic_add_int(&nprocs, -1); } static int proc_to_reap(struct thread *td, struct proc *p, idtype_t idtype, id_t id, int *status, int options, struct __wrusage *wrusage, siginfo_t *siginfo, int check_only) { struct rusage *rup; sx_assert(&proctree_lock, SA_XLOCKED); PROC_LOCK(p); switch (idtype) { case P_ALL: if (p->p_procdesc != NULL) { PROC_UNLOCK(p); return (0); } break; case P_PID: if (p->p_pid != (pid_t)id) { PROC_UNLOCK(p); return (0); } break; case P_PGID: if (p->p_pgid != (pid_t)id) { PROC_UNLOCK(p); return (0); } break; case P_SID: if (p->p_session->s_sid != (pid_t)id) { PROC_UNLOCK(p); return (0); } break; case P_UID: if (p->p_ucred->cr_uid != (uid_t)id) { PROC_UNLOCK(p); return (0); } break; case P_GID: if (p->p_ucred->cr_gid != (gid_t)id) { PROC_UNLOCK(p); return (0); } break; case P_JAILID: if (p->p_ucred->cr_prison->pr_id != (int)id) { PROC_UNLOCK(p); return (0); } break; /* * It seems that the thread structures get zeroed out * at process exit. This makes it impossible to * support P_SETID, P_CID or P_CPUID. */ default: PROC_UNLOCK(p); return (0); } if (p_canwait(td, p)) { PROC_UNLOCK(p); return (0); } if (((options & WEXITED) == 0) && (p->p_state == PRS_ZOMBIE)) { PROC_UNLOCK(p); return (0); } /* * This special case handles a kthread spawned by linux_clone * (see linux_misc.c). The linux_wait4 and linux_waitpid * functions need to be able to distinguish between waiting * on a process and waiting on a thread. It is a thread if * p_sigparent is not SIGCHLD, and the WLINUXCLONE option * signifies we want to wait for threads and not processes. */ if ((p->p_sigparent != SIGCHLD) ^ ((options & WLINUXCLONE) != 0)) { PROC_UNLOCK(p); return (0); } if (siginfo != NULL) { bzero(siginfo, sizeof(*siginfo)); siginfo->si_errno = 0; /* * SUSv4 requires that the si_signo value always be * SIGCHLD. Obey it even though the rfork(2) interface * allows requesting a different signal for child exit * notification. */ siginfo->si_signo = SIGCHLD; /* * This is still a rough estimate. We will fix the * cases TRAPPED, STOPPED, and CONTINUED later.
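 *
 * [Editor's note: concretely, for a child terminated by SIGSEGV that
 * dumped core, the code below yields si_code = CLD_DUMPED and
 * si_status = SIGSEGV; for a child that simply called exit(3), it
 * yields si_code = CLD_EXITED and si_status = p_xexit, the exit
 * status.]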
*/ if (WCOREDUMP(p->p_xsig)) { siginfo->si_code = CLD_DUMPED; siginfo->si_status = WTERMSIG(p->p_xsig); } else if (WIFSIGNALED(p->p_xsig)) { siginfo->si_code = CLD_KILLED; siginfo->si_status = WTERMSIG(p->p_xsig); } else { siginfo->si_code = CLD_EXITED; siginfo->si_status = p->p_xexit; } siginfo->si_pid = p->p_pid; siginfo->si_uid = p->p_ucred->cr_uid; /* * The si_addr field would be useful additional * detail, but apparently the PC value may be lost * when we reach this point. bzero() above sets * siginfo->si_addr to NULL. */ } /* * There should be no reason to limit resources usage info to * exited processes only. A snapshot about any resources used * by a stopped process may be exactly what is needed. */ if (wrusage != NULL) { rup = &wrusage->wru_self; *rup = p->p_ru; PROC_STATLOCK(p); calcru(p, &rup->ru_utime, &rup->ru_stime); PROC_STATUNLOCK(p); rup = &wrusage->wru_children; *rup = p->p_stats->p_cru; calccru(p, &rup->ru_utime, &rup->ru_stime); } if (p->p_state == PRS_ZOMBIE && !check_only) { PROC_SLOCK(p); proc_reap(td, p, status, options); return (-1); } PROC_UNLOCK(p); return (1); } int kern_wait(struct thread *td, pid_t pid, int *status, int options, struct rusage *rusage) { struct __wrusage wru, *wrup; idtype_t idtype; id_t id; int ret; /* * Translate the special pid values into the (idtype, pid) * pair for kern_wait6. The WAIT_MYPGRP case is handled by * kern_wait6() on its own. */ if (pid == WAIT_ANY) { idtype = P_ALL; id = 0; } else if (pid < 0) { idtype = P_PGID; id = (id_t)-pid; } else { idtype = P_PID; id = (id_t)pid; } if (rusage != NULL) wrup = &wru; else wrup = NULL; /* * For backward compatibility we implicitly add flags WEXITED * and WTRAPPED here. */ options |= WEXITED | WTRAPPED; ret = kern_wait6(td, idtype, id, status, options, wrup, NULL); if (rusage != NULL) *rusage = wru.wru_self; return (ret); } int kern_wait6(struct thread *td, idtype_t idtype, id_t id, int *status, int options, struct __wrusage *wrusage, siginfo_t *siginfo) { struct proc *p, *q; pid_t pid; int error, nfound, ret; AUDIT_ARG_VALUE((int)idtype); /* XXX - This is likely wrong! */ AUDIT_ARG_PID((pid_t)id); /* XXX - This may be wrong! */ AUDIT_ARG_VALUE(options); q = td->td_proc; if ((pid_t)id == WAIT_MYPGRP && (idtype == P_PID || idtype == P_PGID)) { PROC_LOCK(q); id = (id_t)q->p_pgid; PROC_UNLOCK(q); idtype = P_PGID; } /* If we don't know the option, just return. */ if ((options & ~(WUNTRACED | WNOHANG | WCONTINUED | WNOWAIT | WEXITED | WTRAPPED | WLINUXCLONE)) != 0) return (EINVAL); if ((options & (WEXITED | WUNTRACED | WCONTINUED | WTRAPPED)) == 0) { /* * We will be unable to find any matching processes, * because there are no known events to look for. * Prefer to return error instead of blocking * indefinitely. 
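 *
 * [Editor's note: for example, a hypothetical call
 *
 *	wait6(P_PID, pid, &status, WNOHANG, NULL, NULL);
 *
 * fails with EINVAL here because WNOHANG alone names no event; the
 * caller must include at least one of WEXITED, WUNTRACED, WCONTINUED
 * or WTRAPPED. The wait4() path never trips this check because
 * kern_wait() unconditionally ORs in WEXITED | WTRAPPED.]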
*/ return (EINVAL); } loop: if (q->p_flag & P_STATCHILD) { PROC_LOCK(q); q->p_flag &= ~P_STATCHILD; PROC_UNLOCK(q); } nfound = 0; sx_xlock(&proctree_lock); LIST_FOREACH(p, &q->p_children, p_sibling) { pid = p->p_pid; ret = proc_to_reap(td, p, idtype, id, status, options, wrusage, siginfo, 0); if (ret == 0) continue; else if (ret == 1) nfound++; else { td->td_retval[0] = pid; return (0); } PROC_LOCK(p); PROC_SLOCK(p); if ((options & WTRAPPED) != 0 && (p->p_flag & P_TRACED) != 0 && (p->p_flag & (P_STOPPED_TRACE | P_STOPPED_SIG)) != 0 && (p->p_suspcount == p->p_numthreads) && ((p->p_flag & P_WAITED) == 0)) { PROC_SUNLOCK(p); if ((options & WNOWAIT) == 0) p->p_flag |= P_WAITED; sx_xunlock(&proctree_lock); if (status != NULL) *status = W_STOPCODE(p->p_xsig); if (siginfo != NULL) { siginfo->si_status = p->p_xsig; siginfo->si_code = CLD_TRAPPED; } if ((options & WNOWAIT) == 0) { PROC_LOCK(q); sigqueue_take(p->p_ksi); PROC_UNLOCK(q); } CTR4(KTR_PTRACE, "wait: returning trapped pid %d status %#x (xstat %d) xthread %d", p->p_pid, W_STOPCODE(p->p_xsig), p->p_xsig, p->p_xthread != NULL ? p->p_xthread->td_tid : -1); PROC_UNLOCK(p); td->td_retval[0] = pid; return (0); } if ((options & WUNTRACED) != 0 && (p->p_flag & P_STOPPED_SIG) != 0 && (p->p_suspcount == p->p_numthreads) && ((p->p_flag & P_WAITED) == 0)) { PROC_SUNLOCK(p); if ((options & WNOWAIT) == 0) p->p_flag |= P_WAITED; sx_xunlock(&proctree_lock); if (status != NULL) *status = W_STOPCODE(p->p_xsig); if (siginfo != NULL) { siginfo->si_status = p->p_xsig; siginfo->si_code = CLD_STOPPED; } if ((options & WNOWAIT) == 0) { PROC_LOCK(q); sigqueue_take(p->p_ksi); PROC_UNLOCK(q); } PROC_UNLOCK(p); td->td_retval[0] = pid; return (0); } PROC_SUNLOCK(p); if ((options & WCONTINUED) != 0 && (p->p_flag & P_CONTINUED) != 0) { sx_xunlock(&proctree_lock); if ((options & WNOWAIT) == 0) { p->p_flag &= ~P_CONTINUED; PROC_LOCK(q); sigqueue_take(p->p_ksi); PROC_UNLOCK(q); } PROC_UNLOCK(p); if (status != NULL) *status = SIGCONT; if (siginfo != NULL) { siginfo->si_status = SIGCONT; siginfo->si_code = CLD_CONTINUED; } td->td_retval[0] = pid; return (0); } PROC_UNLOCK(p); } /* * Look in the orphans list too, to allow the parent to * collect its child exit status even if the child is being * debugged. * * The debugger detaches from the parent upon successful * switch-over from parent to child. At this point, due to * re-parenting, the parent loses the child to the debugger and a * wait4(2) call would report that it has no children to wait * for. By maintaining a list of orphans we allow the parent * to successfully wait until the child becomes a zombie. */ if (nfound == 0) { LIST_FOREACH(p, &q->p_orphans, p_orphan) { ret = proc_to_reap(td, p, idtype, id, NULL, options, NULL, NULL, 1); if (ret != 0) { KASSERT(ret != -1, ("reaped an orphan (pid %d)", (int)td->td_retval[0])); nfound++; break; } } } if (nfound == 0) { sx_xunlock(&proctree_lock); return (ECHILD); } if (options & WNOHANG) { sx_xunlock(&proctree_lock); td->td_retval[0] = 0; return (0); } PROC_LOCK(q); sx_xunlock(&proctree_lock); if (q->p_flag & P_STATCHILD) { q->p_flag &= ~P_STATCHILD; error = 0; } else error = msleep(q, &q->p_mtx, PWAIT | PCATCH, "wait", 0); PROC_UNLOCK(q); if (error) return (error); goto loop; } /* * Make process 'parent' the new parent of process 'child'. * Must be called with an exclusive hold of the proctree lock.
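 *
 * [Editor's note: a sketch of the locking a caller is expected to
 * hold, matching the assertions at the top of the function:
 *
 *	sx_xlock(&proctree_lock);
 *	PROC_LOCK(child);
 *	proc_reparent(child, parent);
 *	PROC_UNLOCK(child);
 *	sx_xunlock(&proctree_lock);
 * ]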
*/ void proc_reparent(struct proc *child, struct proc *parent) { sx_assert(&proctree_lock, SX_XLOCKED); PROC_LOCK_ASSERT(child, MA_OWNED); if (child->p_pptr == parent) return; PROC_LOCK(child->p_pptr); sigqueue_take(child->p_ksi); PROC_UNLOCK(child->p_pptr); LIST_REMOVE(child, p_sibling); LIST_INSERT_HEAD(&parent->p_children, child, p_sibling); clear_orphan(child); if (child->p_flag & P_TRACED) { if (LIST_EMPTY(&child->p_pptr->p_orphans)) { child->p_treeflag |= P_TREE_FIRST_ORPHAN; LIST_INSERT_HEAD(&child->p_pptr->p_orphans, child, p_orphan); } else { LIST_INSERT_AFTER(LIST_FIRST(&child->p_pptr->p_orphans), child, p_orphan); } child->p_treeflag |= P_TREE_ORPHANED; } child->p_pptr = parent; } Index: projects/clang390-import/sys/kern/kern_mutex.c =================================================================== --- projects/clang390-import/sys/kern/kern_mutex.c (revision 305686) +++ projects/clang390-import/sys/kern/kern_mutex.c (revision 305687) @@ -1,1082 +1,1119 @@ /*- * Copyright (c) 1998 Berkeley Software Design, Inc. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Berkeley Software Design Inc's name may not be used to endorse or * promote products derived from this software without specific prior * written permission. * * THIS SOFTWARE IS PROVIDED BY BERKELEY SOFTWARE DESIGN INC ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL BERKELEY SOFTWARE DESIGN INC BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from BSDI $Id: mutex_witness.c,v 1.1.2.20 2000/04/27 03:10:27 cp Exp $ * and BSDI $Id: synch_machdep.c,v 2.3.2.39 2000/04/27 03:10:25 cp Exp $ */ /* * Machine independent bits of mutex implementation. */ #include __FBSDID("$FreeBSD$"); #include "opt_adaptive_mutexes.h" #include "opt_ddb.h" #include "opt_hwpmc_hooks.h" #include "opt_sched.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(SMP) && !defined(NO_ADAPTIVE_MUTEXES) #define ADAPTIVE_MUTEXES #endif #ifdef HWPMC_HOOKS #include PMC_SOFT_DEFINE( , , lock, failed); #endif /* * Return the mutex address when the lock cookie address is provided. * This functionality assumes that struct mtx* have a member named mtx_lock. */ #define mtxlock2mtx(c) (__containerof(c, struct mtx, mtx_lock)) /* * Internal utility macros. 
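 *
 * [Editor's note: mtxlock2mtx(), defined above, uses __containerof()
 * to recover the enclosing struct mtx from a pointer to its mtx_lock
 * member, so that, schematically:
 *
 *	volatile uintptr_t *c = &m->mtx_lock;
 *	MPASS(mtxlock2mtx(c) == m);
 *
 * This is what lets the _mtx_*() functions below take a lock-word
 * cookie instead of a struct mtx pointer.]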
*/ #define mtx_unowned(m) ((m)->mtx_lock == MTX_UNOWNED) #define mtx_destroyed(m) ((m)->mtx_lock == MTX_DESTROYED) #define mtx_owner(m) ((struct thread *)((m)->mtx_lock & ~MTX_FLAGMASK)) static void assert_mtx(const struct lock_object *lock, int what); #ifdef DDB static void db_show_mtx(const struct lock_object *lock); #endif static void lock_mtx(struct lock_object *lock, uintptr_t how); static void lock_spin(struct lock_object *lock, uintptr_t how); #ifdef KDTRACE_HOOKS static int owner_mtx(const struct lock_object *lock, struct thread **owner); #endif static uintptr_t unlock_mtx(struct lock_object *lock); static uintptr_t unlock_spin(struct lock_object *lock); /* * Lock classes for sleep and spin mutexes. */ struct lock_class lock_class_mtx_sleep = { .lc_name = "sleep mutex", .lc_flags = LC_SLEEPLOCK | LC_RECURSABLE, .lc_assert = assert_mtx, #ifdef DDB .lc_ddb_show = db_show_mtx, #endif .lc_lock = lock_mtx, .lc_unlock = unlock_mtx, #ifdef KDTRACE_HOOKS .lc_owner = owner_mtx, #endif }; struct lock_class lock_class_mtx_spin = { .lc_name = "spin mutex", .lc_flags = LC_SPINLOCK | LC_RECURSABLE, .lc_assert = assert_mtx, #ifdef DDB .lc_ddb_show = db_show_mtx, #endif .lc_lock = lock_spin, .lc_unlock = unlock_spin, #ifdef KDTRACE_HOOKS .lc_owner = owner_mtx, #endif }; #ifdef ADAPTIVE_MUTEXES static SYSCTL_NODE(_debug, OID_AUTO, mtx, CTLFLAG_RD, NULL, "mtx debugging"); static struct lock_delay_config mtx_delay = { .initial = 1000, .step = 500, .min = 100, .max = 5000, }; SYSCTL_INT(_debug_mtx, OID_AUTO, delay_initial, CTLFLAG_RW, &mtx_delay.initial, 0, ""); SYSCTL_INT(_debug_mtx, OID_AUTO, delay_step, CTLFLAG_RW, &mtx_delay.step, 0, ""); SYSCTL_INT(_debug_mtx, OID_AUTO, delay_min, CTLFLAG_RW, &mtx_delay.min, 0, ""); SYSCTL_INT(_debug_mtx, OID_AUTO, delay_max, CTLFLAG_RW, &mtx_delay.max, 0, ""); static void mtx_delay_sysinit(void *dummy) { mtx_delay.initial = mp_ncpus * 25; mtx_delay.step = (mp_ncpus * 25) / 2; mtx_delay.min = mp_ncpus * 5; mtx_delay.max = mp_ncpus * 25 * 10; } LOCK_DELAY_SYSINIT(mtx_delay_sysinit); #endif +static SYSCTL_NODE(_debug, OID_AUTO, mtx_spin, CTLFLAG_RD, NULL, + "mtx spin debugging"); + +static struct lock_delay_config mtx_spin_delay = { + .initial = 1000, + .step = 500, + .min = 100, + .max = 5000, +}; + +SYSCTL_INT(_debug_mtx_spin, OID_AUTO, delay_initial, CTLFLAG_RW, + &mtx_spin_delay.initial, 0, ""); +SYSCTL_INT(_debug_mtx_spin, OID_AUTO, delay_step, CTLFLAG_RW, &mtx_spin_delay.step, + 0, ""); +SYSCTL_INT(_debug_mtx_spin, OID_AUTO, delay_min, CTLFLAG_RW, &mtx_spin_delay.min, + 0, ""); +SYSCTL_INT(_debug_mtx_spin, OID_AUTO, delay_max, CTLFLAG_RW, &mtx_spin_delay.max, + 0, ""); + +static void +mtx_spin_delay_sysinit(void *dummy) +{ + + mtx_spin_delay.initial = mp_ncpus * 25; + mtx_spin_delay.step = (mp_ncpus * 25) / 2; + mtx_spin_delay.min = mp_ncpus * 5; + mtx_spin_delay.max = mp_ncpus * 25 * 10; +} +LOCK_DELAY_SYSINIT(mtx_spin_delay_sysinit); + /* * System-wide mutexes */ struct mtx blocked_lock; struct mtx Giant; void assert_mtx(const struct lock_object *lock, int what) { mtx_assert((const struct mtx *)lock, what); } void lock_mtx(struct lock_object *lock, uintptr_t how) { mtx_lock((struct mtx *)lock); } void lock_spin(struct lock_object *lock, uintptr_t how) { panic("spin locks can only use msleep_spin"); } uintptr_t unlock_mtx(struct lock_object *lock) { struct mtx *m; m = (struct mtx *)lock; mtx_assert(m, MA_OWNED | MA_NOTRECURSED); mtx_unlock(m); return (0); } uintptr_t unlock_spin(struct lock_object *lock) { panic("spin locks can only use msleep_spin"); } 
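/*
 * [Editor's note: the back-off knobs declared above are plain
 * SYSCTL_INT values, so, assuming the standard sysctl(8) interface,
 * they can be inspected and tuned at runtime, e.g.:
 *
 *	sysctl debug.mtx.delay_initial=2000
 *	sysctl debug.mtx_spin.delay_max=10000
 *
 * Note that mtx_delay_sysinit() and mtx_spin_delay_sysinit() rescale
 * the compiled-in defaults by mp_ncpus at boot.]
 */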
#ifdef KDTRACE_HOOKS int owner_mtx(const struct lock_object *lock, struct thread **owner) { const struct mtx *m = (const struct mtx *)lock; *owner = mtx_owner(m); return (mtx_unowned(m) == 0); } #endif /* * Function versions of the inlined __mtx_* macros. These are used by * modules and can also be called from assembly language if needed. */ void __mtx_lock_flags(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; if (SCHEDULER_STOPPED()) return; m = mtxlock2mtx(c); KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread), ("mtx_lock() by idle thread %p on sleep mutex %s @ %s:%d", curthread, m->lock_object.lo_name, file, line)); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_lock() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_sleep, ("mtx_lock() of spin mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); WITNESS_CHECKORDER(&m->lock_object, (opts & ~MTX_RECURSE) | LOP_NEWORDER | LOP_EXCLUSIVE, file, line, NULL); __mtx_lock(m, curthread, opts, file, line); LOCK_LOG_LOCK("LOCK", &m->lock_object, opts, m->mtx_recurse, file, line); WITNESS_LOCK(&m->lock_object, (opts & ~MTX_RECURSE) | LOP_EXCLUSIVE, file, line); TD_LOCKS_INC(curthread); } void __mtx_unlock_flags(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; if (SCHEDULER_STOPPED()) return; m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_unlock() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_sleep, ("mtx_unlock() of spin mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); WITNESS_UNLOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); LOCK_LOG_LOCK("UNLOCK", &m->lock_object, opts, m->mtx_recurse, file, line); mtx_assert(m, MA_OWNED); __mtx_unlock(m, curthread, opts, file, line); TD_LOCKS_DEC(curthread); } void __mtx_lock_spin_flags(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; if (SCHEDULER_STOPPED()) return; m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_lock_spin() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, ("mtx_lock_spin() of sleep mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); if (mtx_owned(m)) KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0 || (opts & MTX_RECURSE) != 0, ("mtx_lock_spin: recursed on non-recursive mutex %s @ %s:%d\n", m->lock_object.lo_name, file, line)); opts &= ~MTX_RECURSE; WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE, file, line, NULL); __mtx_lock_spin(m, curthread, opts, file, line); LOCK_LOG_LOCK("LOCK", &m->lock_object, opts, m->mtx_recurse, file, line); WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); } int __mtx_trylock_spin_flags(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; if (SCHEDULER_STOPPED()) return (1); m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_trylock_spin() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, ("mtx_trylock_spin() of sleep mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); KASSERT((opts & MTX_RECURSE) == 0, ("mtx_trylock_spin: unsupp. 
opt MTX_RECURSE on mutex %s @ %s:%d\n", m->lock_object.lo_name, file, line)); if (__mtx_trylock_spin(m, curthread, opts, file, line)) { LOCK_LOG_TRY("LOCK", &m->lock_object, opts, 1, file, line); WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); return (1); } LOCK_LOG_TRY("LOCK", &m->lock_object, opts, 0, file, line); return (0); } void __mtx_unlock_spin_flags(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; if (SCHEDULER_STOPPED()) return; m = mtxlock2mtx(c); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_unlock_spin() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, ("mtx_unlock_spin() of sleep mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); WITNESS_UNLOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); LOCK_LOG_LOCK("UNLOCK", &m->lock_object, opts, m->mtx_recurse, file, line); mtx_assert(m, MA_OWNED); __mtx_unlock_spin(m); } /* * The important part of mtx_trylock{,_flags}() * Tries to acquire lock `m.' If this function is called on a mutex that * is already owned, it will recursively acquire the lock. */ int _mtx_trylock_flags_(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; #ifdef LOCK_PROFILING uint64_t waittime = 0; int contested = 0; #endif int rval; if (SCHEDULER_STOPPED()) return (1); m = mtxlock2mtx(c); KASSERT(kdb_active != 0 || !TD_IS_IDLETHREAD(curthread), ("mtx_trylock() by idle thread %p on sleep mutex %s @ %s:%d", curthread, m->lock_object.lo_name, file, line)); KASSERT(m->mtx_lock != MTX_DESTROYED, ("mtx_trylock() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_sleep, ("mtx_trylock() of spin mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); if (mtx_owned(m) && ((m->lock_object.lo_flags & LO_RECURSABLE) != 0 || (opts & MTX_RECURSE) != 0)) { m->mtx_recurse++; atomic_set_ptr(&m->mtx_lock, MTX_RECURSED); rval = 1; } else rval = _mtx_obtain_lock(m, (uintptr_t)curthread); opts &= ~MTX_RECURSE; LOCK_LOG_TRY("LOCK", &m->lock_object, opts, rval, file, line); if (rval) { WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE | LOP_TRYLOCK, file, line); TD_LOCKS_INC(curthread); if (m->mtx_recurse == 0) LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(adaptive__acquire, m, contested, waittime, file, line); } return (rval); } /* * __mtx_lock_sleep: the tougher part of acquiring an MTX_DEF lock. * * We call this if the lock is either contested (i.e. we need to go to * sleep waiting for it), or if we need to recurse on it. 
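 *
 * [Editor's note: a rough sketch of how this slow path is reached;
 * the inline fast path lives in sys/mutex.h (not part of this diff)
 * and is approximately:
 *
 *	if (!_mtx_obtain_lock(m, tid))	one cmpset from MTX_UNOWNED
 *		__mtx_lock_sleep(&m->mtx_lock, tid, opts, file, line);
 *
 * so this function only runs once that single compare-and-set has
 * already failed.]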
*/ void __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int opts, const char *file, int line) { struct mtx *m; struct turnstile *ts; uintptr_t v; #ifdef ADAPTIVE_MUTEXES volatile struct thread *owner; #endif #ifdef KTR int cont_logged = 0; #endif #ifdef LOCK_PROFILING int contested = 0; uint64_t waittime = 0; #endif #if defined(ADAPTIVE_MUTEXES) || defined(KDTRACE_HOOKS) struct lock_delay_arg lda; #endif #ifdef KDTRACE_HOOKS u_int sleep_cnt = 0; int64_t sleep_time = 0; int64_t all_time = 0; #endif if (SCHEDULER_STOPPED()) return; #if defined(ADAPTIVE_MUTEXES) lock_delay_arg_init(&lda, &mtx_delay); #elif defined(KDTRACE_HOOKS) lock_delay_arg_init(&lda, NULL); #endif m = mtxlock2mtx(c); if (mtx_owned(m)) { KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0 || (opts & MTX_RECURSE) != 0, ("_mtx_lock_sleep: recursed on non-recursive mutex %s @ %s:%d\n", m->lock_object.lo_name, file, line)); opts &= ~MTX_RECURSE; m->mtx_recurse++; atomic_set_ptr(&m->mtx_lock, MTX_RECURSED); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_lock_sleep: %p recursing", m); return; } opts &= ~MTX_RECURSE; #ifdef HWPMC_HOOKS PMC_SOFT_CALL( , , lock, failed); #endif lock_profile_obtain_lock_failed(&m->lock_object, &contested, &waittime); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR4(KTR_LOCK, "_mtx_lock_sleep: %s contested (lock=%p) at %s:%d", m->lock_object.lo_name, (void *)m->mtx_lock, file, line); #ifdef KDTRACE_HOOKS all_time -= lockstat_nsecs(&m->lock_object); #endif for (;;) { if (m->mtx_lock == MTX_UNOWNED && _mtx_obtain_lock(m, tid)) break; #ifdef KDTRACE_HOOKS lda.spin_cnt++; #endif #ifdef ADAPTIVE_MUTEXES /* * If the owner is running on another CPU, spin until the * owner stops running or the state of the lock changes. */ v = m->mtx_lock; if (v != MTX_UNOWNED) { owner = (struct thread *)(v & ~MTX_FLAGMASK); if (TD_IS_RUNNING(owner)) { if (LOCK_LOG_TEST(&m->lock_object, 0)) CTR3(KTR_LOCK, "%s: spinning on %p held by %p", __func__, m, owner); KTR_STATE1(KTR_SCHED, "thread", sched_tdname((struct thread *)tid), "spinning", "lockname:\"%s\"", m->lock_object.lo_name); while (mtx_owner(m) == owner && TD_IS_RUNNING(owner)) lock_delay(&lda); KTR_STATE0(KTR_SCHED, "thread", sched_tdname((struct thread *)tid), "running"); continue; } } #endif ts = turnstile_trywait(&m->lock_object); v = m->mtx_lock; /* * Check if the lock has been released while spinning for * the turnstile chain lock. */ if (v == MTX_UNOWNED) { turnstile_cancel(ts); continue; } #ifdef ADAPTIVE_MUTEXES /* * The current lock owner might have started executing * on another CPU (or the lock could have changed * owners) while we were waiting on the turnstile * chain lock. If so, drop the turnstile lock and try * again. */ owner = (struct thread *)(v & ~MTX_FLAGMASK); if (TD_IS_RUNNING(owner)) { turnstile_cancel(ts); continue; } #endif /* * If the mutex isn't already contested and a failure occurs * setting the contested bit, the mutex was either released * or the state of the MTX_RECURSED bit changed. */ if ((v & MTX_CONTESTED) == 0 && !atomic_cmpset_ptr(&m->mtx_lock, v, v | MTX_CONTESTED)) { turnstile_cancel(ts); continue; } /* * We definitely must sleep for this lock. */ mtx_assert(m, MA_NOTOWNED); #ifdef KTR if (!cont_logged) { CTR6(KTR_CONTENTION, "contention: %p at %s:%d wants %s, taken by %s:%d", (void *)tid, file, line, m->lock_object.lo_name, WITNESS_FILE(&m->lock_object), WITNESS_LINE(&m->lock_object)); cont_logged = 1; } #endif /* * Block on the turnstile. 
*/ #ifdef KDTRACE_HOOKS sleep_time -= lockstat_nsecs(&m->lock_object); #endif turnstile_wait(ts, mtx_owner(m), TS_EXCLUSIVE_QUEUE); #ifdef KDTRACE_HOOKS sleep_time += lockstat_nsecs(&m->lock_object); sleep_cnt++; #endif } #ifdef KDTRACE_HOOKS all_time += lockstat_nsecs(&m->lock_object); #endif #ifdef KTR if (cont_logged) { CTR4(KTR_CONTENTION, "contention end: %s acquired by %p at %s:%d", m->lock_object.lo_name, (void *)tid, file, line); } #endif LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(adaptive__acquire, m, contested, waittime, file, line); #ifdef KDTRACE_HOOKS if (sleep_time) LOCKSTAT_RECORD1(adaptive__block, m, sleep_time); /* * Only record the loops spinning and not sleeping. */ if (lda.spin_cnt > sleep_cnt) LOCKSTAT_RECORD1(adaptive__spin, m, all_time - sleep_time); #endif } static void _mtx_lock_spin_failed(struct mtx *m) { struct thread *td; td = mtx_owner(m); /* If the mutex is unlocked, try again. */ if (td == NULL) return; printf( "spin lock %p (%s) held by %p (tid %d) too long\n", m, m->lock_object.lo_name, td, td->td_tid); #ifdef WITNESS witness_display_spinlock(&m->lock_object, td, printf); #endif panic("spin lock held too long"); } #ifdef SMP /* * _mtx_lock_spin_cookie: the tougher part of acquiring an MTX_SPIN lock. * * This is only called if we need to actually spin for the lock. Recursion * is handled inline. */ void _mtx_lock_spin_cookie(volatile uintptr_t *c, uintptr_t tid, int opts, const char *file, int line) { struct mtx *m; - int i = 0; + struct lock_delay_arg lda; #ifdef LOCK_PROFILING int contested = 0; uint64_t waittime = 0; #endif #ifdef KDTRACE_HOOKS int64_t spin_time = 0; #endif if (SCHEDULER_STOPPED()) return; + lock_delay_arg_init(&lda, &mtx_spin_delay); m = mtxlock2mtx(c); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_lock_spin: %p spinning", m); KTR_STATE1(KTR_SCHED, "thread", sched_tdname((struct thread *)tid), "spinning", "lockname:\"%s\"", m->lock_object.lo_name); #ifdef HWPMC_HOOKS PMC_SOFT_CALL( , , lock, failed); #endif lock_profile_obtain_lock_failed(&m->lock_object, &contested, &waittime); #ifdef KDTRACE_HOOKS spin_time -= lockstat_nsecs(&m->lock_object); #endif for (;;) { if (m->mtx_lock == MTX_UNOWNED && _mtx_obtain_lock(m, tid)) break; /* Give interrupts a chance while we spin. 
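 *
 * [Editor's note: the diff below replaces the bare i++/cpu_spinwait()
 * counter with lock_delay(&lda), which, under the mtx_spin_delay
 * configuration added above, spins in bounded, growing bursts while
 * accumulating the iteration count in lda.spin_cnt; the 10000000 and
 * 60000000 thresholds, the DELAY(1) fallback, and the eventual
 * _mtx_lock_spin_failed() panic are preserved. The exact burst-growth
 * policy is lock_delay()'s, in kern/subr_lock.c.]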
*/ spinlock_exit(); while (m->mtx_lock != MTX_UNOWNED) { - if (i++ < 10000000) { - cpu_spinwait(); + if (lda.spin_cnt < 10000000) { + lock_delay(&lda); continue; } - if (i < 60000000 || kdb_active || panicstr != NULL) + lda.spin_cnt++; + if (lda.spin_cnt < 60000000 || kdb_active || + panicstr != NULL) DELAY(1); else _mtx_lock_spin_failed(m); cpu_spinwait(); } spinlock_enter(); } #ifdef KDTRACE_HOOKS spin_time += lockstat_nsecs(&m->lock_object); #endif if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_lock_spin: %p spin done", m); KTR_STATE0(KTR_SCHED, "thread", sched_tdname((struct thread *)tid), "running"); #ifdef KDTRACE_HOOKS LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(spin__acquire, m, contested, waittime, file, line); if (spin_time != 0) LOCKSTAT_RECORD1(spin__spin, m, spin_time); #endif } #endif /* SMP */ void thread_lock_flags_(struct thread *td, int opts, const char *file, int line) { struct mtx *m; uintptr_t tid; - int i; + struct lock_delay_arg lda; #ifdef LOCK_PROFILING int contested = 0; uint64_t waittime = 0; #endif #ifdef KDTRACE_HOOKS int64_t spin_time = 0; #endif - i = 0; tid = (uintptr_t)curthread; if (SCHEDULER_STOPPED()) { /* * Ensure that spinlock sections are balanced even when the * scheduler is stopped, since we may otherwise inadvertently * re-enable interrupts while dumping core. */ spinlock_enter(); return; } + lock_delay_arg_init(&lda, &mtx_spin_delay); + #ifdef KDTRACE_HOOKS spin_time -= lockstat_nsecs(&td->td_lock->lock_object); #endif for (;;) { retry: spinlock_enter(); m = td->td_lock; KASSERT(m->mtx_lock != MTX_DESTROYED, ("thread_lock() of destroyed mutex @ %s:%d", file, line)); KASSERT(LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin, ("thread_lock() of sleep mutex %s @ %s:%d", m->lock_object.lo_name, file, line)); if (mtx_owned(m)) KASSERT((m->lock_object.lo_flags & LO_RECURSABLE) != 0, ("thread_lock: recursed on non-recursive mutex %s @ %s:%d\n", m->lock_object.lo_name, file, line)); WITNESS_CHECKORDER(&m->lock_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE, file, line, NULL); for (;;) { if (m->mtx_lock == MTX_UNOWNED && _mtx_obtain_lock(m, tid)) break; if (m->mtx_lock == tid) { m->mtx_recurse++; break; } #ifdef HWPMC_HOOKS PMC_SOFT_CALL( , , lock, failed); #endif lock_profile_obtain_lock_failed(&m->lock_object, &contested, &waittime); /* Give interrupts a chance while we spin. 
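 *
 * [Editor's note: unlike a plain spin mutex, td->td_lock can change
 * identity while we wait (the scheduler re-points it at a run queue
 * or turnstile lock), which is why the loop below re-checks
 * m != td->td_lock and does "goto retry" to reload the current lock
 * instead of spinning forever on a stale one.]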
*/ spinlock_exit(); while (m->mtx_lock != MTX_UNOWNED) { - if (i++ < 10000000) + if (lda.spin_cnt < 10000000) { + lock_delay(&lda); + } else { + lda.spin_cnt++; + if (lda.spin_cnt < 60000000 || + kdb_active || panicstr != NULL) + DELAY(1); + else + _mtx_lock_spin_failed(m); cpu_spinwait(); - else if (i < 60000000 || - kdb_active || panicstr != NULL) - DELAY(1); - else - _mtx_lock_spin_failed(m); - cpu_spinwait(); + } if (m != td->td_lock) goto retry; } spinlock_enter(); } if (m == td->td_lock) break; __mtx_unlock_spin(m); /* does spinlock_exit() */ } #ifdef KDTRACE_HOOKS spin_time += lockstat_nsecs(&m->lock_object); #endif if (m->mtx_recurse == 0) LOCKSTAT_PROFILE_OBTAIN_LOCK_SUCCESS(spin__acquire, m, contested, waittime, file, line); LOCK_LOG_LOCK("LOCK", &m->lock_object, opts, m->mtx_recurse, file, line); WITNESS_LOCK(&m->lock_object, opts | LOP_EXCLUSIVE, file, line); #ifdef KDTRACE_HOOKS if (spin_time != 0) LOCKSTAT_RECORD1(thread__spin, m, spin_time); #endif } struct mtx * thread_lock_block(struct thread *td) { struct mtx *lock; THREAD_LOCK_ASSERT(td, MA_OWNED); lock = td->td_lock; td->td_lock = &blocked_lock; mtx_unlock_spin(lock); return (lock); } void thread_lock_unblock(struct thread *td, struct mtx *new) { mtx_assert(new, MA_OWNED); MPASS(td->td_lock == &blocked_lock); atomic_store_rel_ptr((volatile void *)&td->td_lock, (uintptr_t)new); } void thread_lock_set(struct thread *td, struct mtx *new) { struct mtx *lock; mtx_assert(new, MA_OWNED); THREAD_LOCK_ASSERT(td, MA_OWNED); lock = td->td_lock; td->td_lock = new; mtx_unlock_spin(lock); } /* * __mtx_unlock_sleep: the tougher part of releasing an MTX_DEF lock. * * We are only called here if the lock is recursed or contested (i.e. we * need to wake up a blocked thread). */ void __mtx_unlock_sleep(volatile uintptr_t *c, int opts, const char *file, int line) { struct mtx *m; struct turnstile *ts; if (SCHEDULER_STOPPED()) return; m = mtxlock2mtx(c); if (mtx_recursed(m)) { if (--(m->mtx_recurse) == 0) atomic_clear_ptr(&m->mtx_lock, MTX_RECURSED); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_unlock_sleep: %p unrecurse", m); return; } /* * We have to lock the chain before the turnstile so this turnstile * can be removed from the hash list if it is empty. */ turnstile_chain_lock(&m->lock_object); ts = turnstile_lookup(&m->lock_object); if (LOCK_LOG_TEST(&m->lock_object, opts)) CTR1(KTR_LOCK, "_mtx_unlock_sleep: %p contested", m); MPASS(ts != NULL); turnstile_broadcast(ts, TS_EXCLUSIVE_QUEUE); _mtx_release_lock_quick(m); /* * This turnstile is no longer associated with the mutex. We can * unlock the chain lock so a new turnstile may take its place. */ turnstile_unpend(ts, TS_EXCLUSIVE_LOCK); turnstile_chain_unlock(&m->lock_object); } /* * All the unlocking of MTX_SPIN locks is done inline. * See the __mtx_unlock_spin() macro for the details.
/* * All the unlocking of MTX_SPIN locks is done inline. * See the __mtx_unlock_spin() macro for the details. */ /* * The backing function for the INVARIANTS-enabled mtx_assert() */ #ifdef INVARIANT_SUPPORT void __mtx_assert(const volatile uintptr_t *c, int what, const char *file, int line) { const struct mtx *m; if (panicstr != NULL || dumping) return; m = mtxlock2mtx(c); switch (what) { case MA_OWNED: case MA_OWNED | MA_RECURSED: case MA_OWNED | MA_NOTRECURSED: if (!mtx_owned(m)) panic("mutex %s not owned at %s:%d", m->lock_object.lo_name, file, line); if (mtx_recursed(m)) { if ((what & MA_NOTRECURSED) != 0) panic("mutex %s recursed at %s:%d", m->lock_object.lo_name, file, line); } else if ((what & MA_RECURSED) != 0) { panic("mutex %s unrecursed at %s:%d", m->lock_object.lo_name, file, line); } break; case MA_NOTOWNED: if (mtx_owned(m)) panic("mutex %s owned at %s:%d", m->lock_object.lo_name, file, line); break; default: panic("unknown mtx_assert at %s:%d", file, line); } } #endif /* * General init routine used by the MTX_SYSINIT() macro. */ void mtx_sysinit(void *arg) { struct mtx_args *margs = arg; mtx_init((struct mtx *)margs->ma_mtx, margs->ma_desc, NULL, margs->ma_opts); } /* * Mutex initialization routine; initialize lock `m' with the type and * options contained in `opts' and name `name.' The optional * lock type `type' is used as a general lock category name for use with * witness. */ void _mtx_init(volatile uintptr_t *c, const char *name, const char *type, int opts) { struct mtx *m; struct lock_class *class; int flags; m = mtxlock2mtx(c); MPASS((opts & ~(MTX_SPIN | MTX_QUIET | MTX_RECURSE | MTX_NOWITNESS | MTX_DUPOK | MTX_NOPROFILE | MTX_NEW)) == 0); ASSERT_ATOMIC_LOAD_PTR(m->mtx_lock, ("%s: mtx_lock not aligned for %s: %p", __func__, name, &m->mtx_lock)); /* Determine lock class and lock flags. */ if (opts & MTX_SPIN) class = &lock_class_mtx_spin; else class = &lock_class_mtx_sleep; flags = 0; if (opts & MTX_QUIET) flags |= LO_QUIET; if (opts & MTX_RECURSE) flags |= LO_RECURSABLE; if ((opts & MTX_NOWITNESS) == 0) flags |= LO_WITNESS; if (opts & MTX_DUPOK) flags |= LO_DUPOK; if (opts & MTX_NOPROFILE) flags |= LO_NOPROFILE; if (opts & MTX_NEW) flags |= LO_NEW; /* Initialize mutex. */ lock_init(&m->lock_object, class, name, type, flags); m->mtx_lock = MTX_UNOWNED; m->mtx_recurse = 0; } /* * Remove lock `m' from all_mtx queue. We don't allow MTX_QUIET to be * passed in as a flag here because if the corresponding mtx_init() was * called with MTX_QUIET set, then it will already be set in the mutex's * flags. */ void _mtx_destroy(volatile uintptr_t *c) { struct mtx *m; m = mtxlock2mtx(c); if (!mtx_owned(m)) MPASS(mtx_unowned(m)); else { MPASS((m->mtx_lock & (MTX_RECURSED|MTX_CONTESTED)) == 0); /* Perform the non-mtx related part of mtx_unlock_spin(). */ if (LOCK_CLASS(&m->lock_object) == &lock_class_mtx_spin) spinlock_exit(); else TD_LOCKS_DEC(curthread); lock_profile_release_lock(&m->lock_object); /* Tell witness this isn't locked to make it happy. */ WITNESS_UNLOCK(&m->lock_object, LOP_EXCLUSIVE, __FILE__, __LINE__); } m->mtx_lock = MTX_DESTROYED; lock_destroy(&m->lock_object); } /* * Initialize the mutex code and system mutexes. This is called from the MD * startup code prior to mi_startup(). The per-CPU data space needs to be * set up before this is called. */ void mutex_init(void) { /* Set up turnstiles so that sleep mutexes work. */ init_turnstiles(); /* * Initialize mutexes. */ mtx_init(&Giant, "Giant", NULL, MTX_DEF | MTX_RECURSE); mtx_init(&blocked_lock, "blocked lock", NULL, MTX_SPIN); blocked_lock.mtx_lock = 0xdeadc0de; /* Always blocked.
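
For reference, the mtx_init()/mtx_assert() interfaces implemented above are consumed as in the following minimal kernel-context sketch (not part of this change; example_mtx, example_bump, and counter are made-up names):

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static struct mtx example_mtx;
/* Runs mtx_init(&example_mtx, "example mutex", NULL, MTX_DEF) at boot. */
MTX_SYSINIT(example_mtx, &example_mtx, "example mutex", MTX_DEF);

static int counter;	/* protected by example_mtx */

static void
example_bump(void)
{

	mtx_lock(&example_mtx);
	/* Expands to __mtx_assert(); a no-op unless built with INVARIANTS. */
	mtx_assert(&example_mtx, MA_OWNED | MA_NOTRECURSED);
	counter++;
	mtx_unlock(&example_mtx);
}
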
*/ mtx_init(&proc0.p_mtx, "process lock", NULL, MTX_DEF | MTX_DUPOK); mtx_init(&proc0.p_slock, "process slock", NULL, MTX_SPIN); mtx_init(&proc0.p_statmtx, "pstatl", NULL, MTX_SPIN); mtx_init(&proc0.p_itimmtx, "pitiml", NULL, MTX_SPIN); mtx_init(&proc0.p_profmtx, "pprofl", NULL, MTX_SPIN); mtx_init(&devmtx, "cdev", NULL, MTX_DEF); mtx_lock(&Giant); } #ifdef DDB void db_show_mtx(const struct lock_object *lock) { struct thread *td; const struct mtx *m; m = (const struct mtx *)lock; db_printf(" flags: {"); if (LOCK_CLASS(lock) == &lock_class_mtx_spin) db_printf("SPIN"); else db_printf("DEF"); if (m->lock_object.lo_flags & LO_RECURSABLE) db_printf(", RECURSE"); if (m->lock_object.lo_flags & LO_DUPOK) db_printf(", DUPOK"); db_printf("}\n"); db_printf(" state: {"); if (mtx_unowned(m)) db_printf("UNOWNED"); else if (mtx_destroyed(m)) db_printf("DESTROYED"); else { db_printf("OWNED"); if (m->mtx_lock & MTX_CONTESTED) db_printf(", CONTESTED"); if (m->mtx_lock & MTX_RECURSED) db_printf(", RECURSED"); } db_printf("}\n"); if (!mtx_unowned(m) && !mtx_destroyed(m)) { td = mtx_owner(m); db_printf(" owner: %p (tid %d, pid %d, \"%s\")\n", td, td->td_tid, td->td_proc->p_pid, td->td_name); if (mtx_recursed(m)) db_printf(" recursed: %d\n", m->mtx_recurse); } } #endif Index: projects/clang390-import/sys/kern/subr_witness.c =================================================================== --- projects/clang390-import/sys/kern/subr_witness.c (revision 305686) +++ projects/clang390-import/sys/kern/subr_witness.c (revision 305687) @@ -1,3017 +1,3025 @@ /*- * Copyright (c) 2008 Isilon Systems, Inc. * Copyright (c) 2008 Ilya Maykov * Copyright (c) 1998 Berkeley Software Design, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Berkeley Software Design Inc's name may not be used to endorse or * promote products derived from this software without specific prior * written permission. * * THIS SOFTWARE IS PROVIDED BY BERKELEY SOFTWARE DESIGN INC ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL BERKELEY SOFTWARE DESIGN INC BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from BSDI $Id: mutex_witness.c,v 1.1.2.20 2000/04/27 03:10:27 cp Exp $ * and BSDI $Id: synch_machdep.c,v 2.3.2.39 2000/04/27 03:10:25 cp Exp $ */ /* * Implementation of the `witness' lock verifier. Originally implemented for * mutexes in BSD/OS. Extended to handle generic lock objects and lock * classes in FreeBSD. 
*/ /* * Main Entry: witness * Pronunciation: 'wit-n&s * Function: noun * Etymology: Middle English witnesse, from Old English witnes knowledge, * testimony, witness, from 2wit * Date: before 12th century * 1 : attestation of a fact or event : TESTIMONY * 2 : one that gives evidence; specifically : one who testifies in * a cause or before a judicial tribunal * 3 : one asked to be present at a transaction so as to be able to * testify to its having taken place * 4 : one who has personal knowledge of something * 5 a : something serving as evidence or proof : SIGN * b : public affirmation by word or example of usually * religious faith or conviction * 6 capitalized : a member of the Jehovah's Witnesses */ /* * Special rules concerning Giant and lock orders: * * 1) Giant must be acquired before any other mutexes. Stated another way, * no other mutex may be held when Giant is acquired. * * 2) Giant must be released when blocking on a sleepable lock. * * This rule is less obvious, but is a result of Giant providing the same * semantics as spl(). Basically, when a thread sleeps, it must release * Giant. When a thread blocks on a sleepable lock, it sleeps. Hence rule * 2). * * 3) Giant may be acquired before or after sleepable locks. * * This rule is also not quite as obvious. Giant may be acquired after * a sleepable lock because it is a non-sleepable lock and non-sleepable * locks may always be acquired while holding a sleepable lock. The second * case, Giant before a sleepable lock, follows from rule 2) above. Suppose * you have two threads T1 and T2 and a sleepable lock X. Suppose that T1 * acquires X and blocks on Giant. Then suppose that T2 acquires Giant and * blocks on X. When T2 blocks on X, T2 will release Giant allowing T1 to * execute. Thus, acquiring Giant both before and after a sleepable lock * will not result in a lock order reversal. */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_hwpmc_hooks.h" #include "opt_stack.h" #include "opt_witness.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #include #endif #include #if !defined(DDB) && !defined(STACK) #error "DDB or STACK options are required for WITNESS" #endif /* Note that these traces do not work with KTR_ALQ. */ #if 0 #define KTR_WITNESS KTR_SUBSYS #else #define KTR_WITNESS 0 #endif #define LI_RECURSEMASK 0x0000ffff /* Recursion depth of lock instance. */ #define LI_EXCLUSIVE 0x00010000 /* Exclusive lock instance. */ #define LI_NORELEASE 0x00020000 /* Lock not allowed to be released. */ /* Define this to check for blessed mutexes */ #undef BLESSING #ifndef WITNESS_COUNT #define WITNESS_COUNT 1536 #endif #define WITNESS_HASH_SIZE 251 /* Prime, gives load factor < 2 */ #define WITNESS_PENDLIST (1024 + MAXCPU) /* Allocate 256 KB of stack data space */ #define WITNESS_LO_DATA_COUNT 2048 /* Prime, gives load factor of ~2 at full load */ #define WITNESS_LO_HASH_SIZE 1021 /* * XXX: This is somewhat bogus, as we assume here that at most 2048 threads * will hold LOCK_NCHILDREN locks. We handle failure ok, and we should * probably be safe for the most part, but it's still a SWAG. */ #define LOCK_NCHILDREN 5 #define LOCK_CHILDCOUNT 2048 #define MAX_W_NAME 64 #define FULLGRAPH_SBUF_SIZE 512 /* * These flags go in the witness relationship matrix and describe the * relationship between any two struct witness objects. */ #define WITNESS_UNRELATED 0x00 /* No lock order relation. 
*/ #define WITNESS_PARENT 0x01 /* Parent, aka direct ancestor. */ #define WITNESS_ANCESTOR 0x02 /* Direct or indirect ancestor. */ #define WITNESS_CHILD 0x04 /* Child, aka direct descendant. */ #define WITNESS_DESCENDANT 0x08 /* Direct or indirect descendant. */ #define WITNESS_ANCESTOR_MASK (WITNESS_PARENT | WITNESS_ANCESTOR) #define WITNESS_DESCENDANT_MASK (WITNESS_CHILD | WITNESS_DESCENDANT) #define WITNESS_RELATED_MASK \ (WITNESS_ANCESTOR_MASK | WITNESS_DESCENDANT_MASK) #define WITNESS_REVERSAL 0x10 /* A lock order reversal has been * observed. */ #define WITNESS_RESERVED1 0x20 /* Unused flag, reserved. */ #define WITNESS_RESERVED2 0x40 /* Unused flag, reserved. */ #define WITNESS_LOCK_ORDER_KNOWN 0x80 /* This lock order is known. */ /* Descendant to ancestor flags */ #define WITNESS_DTOA(x) (((x) & WITNESS_RELATED_MASK) >> 2) /* Ancestor to descendant flags */ #define WITNESS_ATOD(x) (((x) & WITNESS_RELATED_MASK) << 2) #define WITNESS_INDEX_ASSERT(i) \ MPASS((i) > 0 && (i) <= w_max_used_index && (i) < witness_count) static MALLOC_DEFINE(M_WITNESS, "Witness", "Witness"); /* * Lock instances. A lock instance is the data associated with a lock while * it is held by witness. For example, a lock instance will hold the * recursion count of a lock. Lock instances are held in lists. Spin locks * are held in a per-cpu list while sleep locks are held in per-thread list. */ struct lock_instance { struct lock_object *li_lock; const char *li_file; int li_line; u_int li_flags; }; /* * A simple list type used to build the list of locks held by a thread * or CPU. We can't simply embed the list in struct lock_object since a * lock may be held by more than one thread if it is a shared lock. Locks * are added to the head of the list, so we fill up each list entry from * "the back" logically. To ease some of the arithmetic, we actually fill * in each list entry the normal way (children[0] then children[1], etc.) but * when we traverse the list we read children[count-1] as the first entry * down to children[0] as the final entry. */ struct lock_list_entry { struct lock_list_entry *ll_next; struct lock_instance ll_children[LOCK_NCHILDREN]; u_int ll_count; }; /* * The main witness structure. One of these per named lock type in the system * (for example, "vnode interlock"). */ struct witness { char w_name[MAX_W_NAME]; uint32_t w_index; /* Index in the relationship matrix */ struct lock_class *w_class; STAILQ_ENTRY(witness) w_list; /* List of all witnesses. */ STAILQ_ENTRY(witness) w_typelist; /* Witnesses of a type. */ struct witness *w_hash_next; /* Linked list in hash buckets. */ const char *w_file; /* File where last acquired */ uint32_t w_line; /* Line where last acquired */ uint32_t w_refcount; uint16_t w_num_ancestors; /* direct/indirect * ancestor count */ uint16_t w_num_descendants; /* direct/indirect * descendant count */ int16_t w_ddb_level; unsigned w_displayed:1; unsigned w_reversed:1; }; STAILQ_HEAD(witness_list, witness); /* * The witness hash table. Keys are witness names (const char *), elements are * witness objects (struct witness *). */ struct witness_hash { struct witness *wh_array[WITNESS_HASH_SIZE]; uint32_t wh_size; uint32_t wh_count; }; /* * Key type for the lock order data hash table. */ struct witness_lock_order_key { uint16_t from; uint16_t to; }; struct witness_lock_order_data { struct stack wlod_stack; struct witness_lock_order_key wlod_key; struct witness_lock_order_data *wlod_next; }; /* * The witness lock order data hash table. 
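
The WITNESS_DTOA()/WITNESS_ATOD() pair works because the parent/ancestor bits sit exactly two positions below the corresponding child/descendant bits, so a two-bit shift converts one endpoint's view of a relation into the other's. A standalone check of that arithmetic (plain C; constants copied from the definitions above):

#include <assert.h>

#define	WITNESS_PARENT		0x01
#define	WITNESS_ANCESTOR	0x02
#define	WITNESS_CHILD		0x04
#define	WITNESS_DESCENDANT	0x08
#define	WITNESS_RELATED_MASK	\
	(WITNESS_PARENT | WITNESS_ANCESTOR | WITNESS_CHILD | WITNESS_DESCENDANT)
#define	WITNESS_DTOA(x)	(((x) & WITNESS_RELATED_MASK) >> 2)
#define	WITNESS_ATOD(x)	(((x) & WITNESS_RELATED_MASK) << 2)

int
main(void)
{

	/* Ancestor-side bits map onto the matching descendant-side bits... */
	assert(WITNESS_ATOD(WITNESS_PARENT) == WITNESS_CHILD);
	assert(WITNESS_ATOD(WITNESS_ANCESTOR) == WITNESS_DESCENDANT);
	/* ...and back again. */
	assert(WITNESS_DTOA(WITNESS_CHILD) == WITNESS_PARENT);
	assert(WITNESS_DTOA(WITNESS_DESCENDANT) == WITNESS_ANCESTOR);
	return (0);
}

This is why _isitmyx() (further down) can verify that w_rmatrix[i1][i2] and w_rmatrix[i2][i1] are mutually consistent by shifting one value and comparing it against the other.
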
Keys are witness index tuples * (struct witness_lock_order_key), elements are lock order data objects * (struct witness_lock_order_data). */ struct witness_lock_order_hash { struct witness_lock_order_data *wloh_array[WITNESS_LO_HASH_SIZE]; u_int wloh_size; u_int wloh_count; }; #ifdef BLESSING struct witness_blessed { const char *b_lock1; const char *b_lock2; }; #endif struct witness_pendhelp { const char *wh_type; struct lock_object *wh_lock; }; struct witness_order_list_entry { const char *w_name; struct lock_class *w_class; }; /* * Returns 0 if one of the locks is a spin lock and the other is not. * Returns 1 otherwise. */ static __inline int witness_lock_type_equal(struct witness *w1, struct witness *w2) { return ((w1->w_class->lc_flags & (LC_SLEEPLOCK | LC_SPINLOCK)) == (w2->w_class->lc_flags & (LC_SLEEPLOCK | LC_SPINLOCK))); } static __inline int witness_lock_order_key_equal(const struct witness_lock_order_key *a, const struct witness_lock_order_key *b) { return (a->from == b->from && a->to == b->to); } static int _isitmyx(struct witness *w1, struct witness *w2, int rmask, const char *fname); static void adopt(struct witness *parent, struct witness *child); #ifdef BLESSING static int blessed(struct witness *, struct witness *); #endif static void depart(struct witness *w); static struct witness *enroll(const char *description, struct lock_class *lock_class); static struct lock_instance *find_instance(struct lock_list_entry *list, const struct lock_object *lock); static int isitmychild(struct witness *parent, struct witness *child); static int isitmydescendant(struct witness *parent, struct witness *child); static void itismychild(struct witness *parent, struct witness *child); static int sysctl_debug_witness_badstacks(SYSCTL_HANDLER_ARGS); static int sysctl_debug_witness_watch(SYSCTL_HANDLER_ARGS); static int sysctl_debug_witness_fullgraph(SYSCTL_HANDLER_ARGS); static int sysctl_debug_witness_channel(SYSCTL_HANDLER_ARGS); static void witness_add_fullgraph(struct sbuf *sb, struct witness *parent); #ifdef DDB static void witness_ddb_compute_levels(void); static void witness_ddb_display(int(*)(const char *fmt, ...)); static void witness_ddb_display_descendants(int(*)(const char *fmt, ...), struct witness *, int indent); static void witness_ddb_display_list(int(*prnt)(const char *fmt, ...), struct witness_list *list); static void witness_ddb_level_descendants(struct witness *parent, int l); static void witness_ddb_list(struct thread *td); #endif static void witness_debugger(int cond, const char *msg); static void witness_free(struct witness *m); static struct witness *witness_get(void); static uint32_t witness_hash_djb2(const uint8_t *key, uint32_t size); static struct witness *witness_hash_get(const char *key); static void witness_hash_put(struct witness *w); static void witness_init_hash_tables(void); static void witness_increment_graph_generation(void); static void witness_lock_list_free(struct lock_list_entry *lle); static struct lock_list_entry *witness_lock_list_get(void); static int witness_lock_order_add(struct witness *parent, struct witness *child); static int witness_lock_order_check(struct witness *parent, struct witness *child); static struct witness_lock_order_data *witness_lock_order_get( struct witness *parent, struct witness *child); static void witness_list_lock(struct lock_instance *instance, int (*prnt)(const char *fmt, ...)); static int witness_output(const char *fmt, ...) 
__printflike(1, 2); static int witness_voutput(const char *fmt, va_list ap) __printflike(1, 0); static void witness_setflag(struct lock_object *lock, int flag, int set); static SYSCTL_NODE(_debug, OID_AUTO, witness, CTLFLAG_RW, NULL, "Witness Locking"); /* * If set to 0, lock order checking is disabled. If set to -1, * witness is completely disabled. Otherwise witness performs full * lock order checking for all locks. At runtime, lock order checking * may be toggled. However, witness cannot be reenabled once it is * completely disabled. */ static int witness_watch = 1; SYSCTL_PROC(_debug_witness, OID_AUTO, watch, CTLFLAG_RWTUN | CTLTYPE_INT, NULL, 0, sysctl_debug_witness_watch, "I", "witness is watching lock operations"); #ifdef KDB /* * When KDB is enabled and witness_kdb is 1, it will cause the system * to drop into kdebug() when: * - a lock hierarchy violation occurs * - locks are held when going to sleep. */ #ifdef WITNESS_KDB int witness_kdb = 1; #else int witness_kdb = 0; #endif SYSCTL_INT(_debug_witness, OID_AUTO, kdb, CTLFLAG_RWTUN, &witness_kdb, 0, ""); #endif /* KDB */ #if defined(DDB) || defined(KDB) /* * When DDB or KDB is enabled and witness_trace is 1, it will cause the system * to print a stack trace: * - a lock hierarchy violation occurs * - locks are held when going to sleep. */ int witness_trace = 1; SYSCTL_INT(_debug_witness, OID_AUTO, trace, CTLFLAG_RWTUN, &witness_trace, 0, ""); #endif /* DDB || KDB */ #ifdef WITNESS_SKIPSPIN int witness_skipspin = 1; #else int witness_skipspin = 0; #endif SYSCTL_INT(_debug_witness, OID_AUTO, skipspin, CTLFLAG_RDTUN, &witness_skipspin, 0, ""); int badstack_sbuf_size; int witness_count = WITNESS_COUNT; SYSCTL_INT(_debug_witness, OID_AUTO, witness_count, CTLFLAG_RDTUN, &witness_count, 0, ""); /* * Output channel for witness messages. By default we print to the console. */ enum witness_channel { WITNESS_CONSOLE, WITNESS_LOG, WITNESS_NONE, }; static enum witness_channel witness_channel = WITNESS_CONSOLE; SYSCTL_PROC(_debug_witness, OID_AUTO, output_channel, CTLTYPE_STRING | CTLFLAG_RWTUN, NULL, 0, sysctl_debug_witness_channel, "A", "Output channel for warnings"); /* * Call this to print out the relations between locks. */ SYSCTL_PROC(_debug_witness, OID_AUTO, fullgraph, CTLTYPE_STRING | CTLFLAG_RD, NULL, 0, sysctl_debug_witness_fullgraph, "A", "Show locks relation graphs"); /* * Call this to print out the witness faulty stacks. */ SYSCTL_PROC(_debug_witness, OID_AUTO, badstacks, CTLTYPE_STRING | CTLFLAG_RD, NULL, 0, sysctl_debug_witness_badstacks, "A", "Show bad witness stacks"); static struct mtx w_mtx; /* w_list */ static struct witness_list w_free = STAILQ_HEAD_INITIALIZER(w_free); static struct witness_list w_all = STAILQ_HEAD_INITIALIZER(w_all); /* w_typelist */ static struct witness_list w_spin = STAILQ_HEAD_INITIALIZER(w_spin); static struct witness_list w_sleep = STAILQ_HEAD_INITIALIZER(w_sleep); /* lock list */ static struct lock_list_entry *w_lock_list_free = NULL; static struct witness_pendhelp pending_locks[WITNESS_PENDLIST]; static u_int pending_cnt; static int w_free_cnt, w_spin_cnt, w_sleep_cnt; SYSCTL_INT(_debug_witness, OID_AUTO, free_cnt, CTLFLAG_RD, &w_free_cnt, 0, ""); SYSCTL_INT(_debug_witness, OID_AUTO, spin_cnt, CTLFLAG_RD, &w_spin_cnt, 0, ""); SYSCTL_INT(_debug_witness, OID_AUTO, sleep_cnt, CTLFLAG_RD, &w_sleep_cnt, 0, ""); static struct witness *w_data; static uint8_t **w_rmatrix; static struct lock_list_entry w_locklistdata[LOCK_CHILDCOUNT]; static struct witness_hash w_hash; /* The witness hash table. 
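
Since witness_watch is exported above as the debug.witness.watch sysctl (CTLFLAG_RWTUN, so it is also a loader tunable), its state can be inspected from userland. A minimal sketch, assuming a kernel built with options WITNESS (the node does not exist otherwise):

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
	int watch;
	size_t len = sizeof(watch);

	/* -1: witness off, 0: order checking off, 1: full order checking. */
	if (sysctlbyname("debug.witness.watch", &watch, &len, NULL, 0) == -1) {
		perror("sysctlbyname");	/* e.g. ENOENT on non-WITNESS kernels */
		return (1);
	}
	printf("debug.witness.watch = %d\n", watch);
	return (0);
}
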
*/ /* The lock order data hash */ static struct witness_lock_order_data w_lodata[WITNESS_LO_DATA_COUNT]; static struct witness_lock_order_data *w_lofree = NULL; static struct witness_lock_order_hash w_lohash; static int w_max_used_index = 0; static unsigned int w_generation = 0; static const char w_notrunning[] = "Witness not running\n"; static const char w_stillcold[] = "Witness is still cold\n"; static struct witness_order_list_entry order_lists[] = { /* * sx locks */ { "proctree", &lock_class_sx }, { "allproc", &lock_class_sx }, { "allprison", &lock_class_sx }, { NULL, NULL }, /* * Various mutexes */ { "Giant", &lock_class_mtx_sleep }, { "pipe mutex", &lock_class_mtx_sleep }, { "sigio lock", &lock_class_mtx_sleep }, { "process group", &lock_class_mtx_sleep }, { "process lock", &lock_class_mtx_sleep }, { "session", &lock_class_mtx_sleep }, { "uidinfo hash", &lock_class_rw }, #ifdef HWPMC_HOOKS { "pmc-sleep", &lock_class_mtx_sleep }, #endif { "time lock", &lock_class_mtx_sleep }, { NULL, NULL }, /* * umtx */ { "umtx lock", &lock_class_mtx_sleep }, { NULL, NULL }, /* * Sockets */ { "accept", &lock_class_mtx_sleep }, { "so_snd", &lock_class_mtx_sleep }, { "so_rcv", &lock_class_mtx_sleep }, { "sellck", &lock_class_mtx_sleep }, { NULL, NULL }, /* * Routing */ { "so_rcv", &lock_class_mtx_sleep }, { "radix node head", &lock_class_rw }, { "rtentry", &lock_class_mtx_sleep }, { "ifaddr", &lock_class_mtx_sleep }, { NULL, NULL }, /* * IPv4 multicast: * protocol locks before interface locks, after UDP locks. */ { "udpinp", &lock_class_rw }, { "in_multi_mtx", &lock_class_mtx_sleep }, { "igmp_mtx", &lock_class_mtx_sleep }, { "if_addr_lock", &lock_class_rw }, { NULL, NULL }, /* * IPv6 multicast: * protocol locks before interface locks, after UDP locks. */ { "udpinp", &lock_class_rw }, { "in6_multi_mtx", &lock_class_mtx_sleep }, { "mld_mtx", &lock_class_mtx_sleep }, { "if_addr_lock", &lock_class_rw }, { NULL, NULL }, /* * UNIX Domain Sockets */ { "unp_link_rwlock", &lock_class_rw }, { "unp_list_lock", &lock_class_mtx_sleep }, { "unp", &lock_class_mtx_sleep }, { "so_snd", &lock_class_mtx_sleep }, { NULL, NULL }, /* * UDP/IP */ { "udp", &lock_class_rw }, { "udpinp", &lock_class_rw }, { "so_snd", &lock_class_mtx_sleep }, { NULL, NULL }, /* * TCP/IP */ { "tcp", &lock_class_rw }, { "tcpinp", &lock_class_rw }, { "so_snd", &lock_class_mtx_sleep }, { NULL, NULL }, /* * BPF */ { "bpf global lock", &lock_class_mtx_sleep }, { "bpf interface lock", &lock_class_rw }, { "bpf cdev lock", &lock_class_mtx_sleep }, { NULL, NULL }, /* * NFS server */ { "nfsd_mtx", &lock_class_mtx_sleep }, { "so_snd", &lock_class_mtx_sleep }, { NULL, NULL }, /* * IEEE 802.11 */ { "802.11 com lock", &lock_class_mtx_sleep}, { NULL, NULL }, /* * Network drivers */ { "network driver", &lock_class_mtx_sleep}, { NULL, NULL }, /* * Netgraph */ { "ng_node", &lock_class_mtx_sleep }, { "ng_worklist", &lock_class_mtx_sleep }, { NULL, NULL }, /* * CDEV */ { "vm map (system)", &lock_class_mtx_sleep }, { "vm page queue", &lock_class_mtx_sleep }, { "vnode interlock", &lock_class_mtx_sleep }, { "cdev", &lock_class_mtx_sleep }, { NULL, NULL }, /* * VM */ { "vm map (user)", &lock_class_sx }, { "vm object", &lock_class_rw }, { "vm page", &lock_class_mtx_sleep }, { "vm page queue", &lock_class_mtx_sleep }, { "pmap pv global", &lock_class_rw }, { "pmap", &lock_class_mtx_sleep }, { "pmap pv list", &lock_class_rw }, { "vm page free queue", &lock_class_mtx_sleep }, { NULL, NULL }, /* * kqueue/VFS interaction */ { "kqueue", &lock_class_mtx_sleep }, { "struct mount 
mtx", &lock_class_mtx_sleep }, { "vnode interlock", &lock_class_mtx_sleep }, { NULL, NULL }, /* + * VFS namecache + */ + { "ncglobal", &lock_class_rw }, + { "ncbuc", &lock_class_rw }, + { "vnode interlock", &lock_class_mtx_sleep }, + { "ncneg", &lock_class_mtx_sleep }, + { NULL, NULL }, + /* * ZFS locking */ { "dn->dn_mtx", &lock_class_sx }, { "dr->dt.di.dr_mtx", &lock_class_sx }, { "db->db_mtx", &lock_class_sx }, { NULL, NULL }, /* * spin locks */ #ifdef SMP { "ap boot", &lock_class_mtx_spin }, #endif { "rm.mutex_mtx", &lock_class_mtx_spin }, { "sio", &lock_class_mtx_spin }, #ifdef __i386__ { "cy", &lock_class_mtx_spin }, #endif #ifdef __sparc64__ { "pcib_mtx", &lock_class_mtx_spin }, { "rtc_mtx", &lock_class_mtx_spin }, #endif { "scc_hwmtx", &lock_class_mtx_spin }, { "uart_hwmtx", &lock_class_mtx_spin }, { "fast_taskqueue", &lock_class_mtx_spin }, { "intr table", &lock_class_mtx_spin }, #ifdef HWPMC_HOOKS { "pmc-per-proc", &lock_class_mtx_spin }, #endif { "process slock", &lock_class_mtx_spin }, { "syscons video lock", &lock_class_mtx_spin }, { "sleepq chain", &lock_class_mtx_spin }, { "rm_spinlock", &lock_class_mtx_spin }, { "turnstile chain", &lock_class_mtx_spin }, { "turnstile lock", &lock_class_mtx_spin }, { "sched lock", &lock_class_mtx_spin }, { "td_contested", &lock_class_mtx_spin }, { "callout", &lock_class_mtx_spin }, { "entropy harvest mutex", &lock_class_mtx_spin }, #ifdef SMP { "smp rendezvous", &lock_class_mtx_spin }, #endif #ifdef __powerpc__ { "tlb0", &lock_class_mtx_spin }, #endif /* * leaf locks */ { "intrcnt", &lock_class_mtx_spin }, { "icu", &lock_class_mtx_spin }, #if defined(SMP) && defined(__sparc64__) { "ipi", &lock_class_mtx_spin }, #endif #ifdef __i386__ { "allpmaps", &lock_class_mtx_spin }, { "descriptor tables", &lock_class_mtx_spin }, #endif { "clk", &lock_class_mtx_spin }, { "cpuset", &lock_class_mtx_spin }, { "mprof lock", &lock_class_mtx_spin }, { "zombie lock", &lock_class_mtx_spin }, { "ALD Queue", &lock_class_mtx_spin }, #if defined(__i386__) || defined(__amd64__) { "pcicfg", &lock_class_mtx_spin }, { "NDIS thread lock", &lock_class_mtx_spin }, #endif { "tw_osl_io_lock", &lock_class_mtx_spin }, { "tw_osl_q_lock", &lock_class_mtx_spin }, { "tw_cl_io_lock", &lock_class_mtx_spin }, { "tw_cl_intr_lock", &lock_class_mtx_spin }, { "tw_cl_gen_lock", &lock_class_mtx_spin }, #ifdef HWPMC_HOOKS { "pmc-leaf", &lock_class_mtx_spin }, #endif { "blocked lock", &lock_class_mtx_spin }, { NULL, NULL }, { NULL, NULL } }; #ifdef BLESSING /* * Pairs of locks which have been blessed * Don't complain about order problems with blessed locks */ static struct witness_blessed blessed_list[] = { }; #endif /* * This global is set to 0 once it becomes safe to use the witness code. */ static int witness_cold = 1; /* * This global is set to 1 once the static lock orders have been enrolled * so that a warning can be issued for any spin locks enrolled later. */ static int witness_spin_warn = 0; /* Trim useless garbage from filenames. */ static const char * fixup_filename(const char *file) { if (file == NULL) return (NULL); while (strncmp(file, "../", 3) == 0) file += 3; return (file); } /* * The WITNESS-enabled diagnostic code. Note that the witness code does * assume that the early boot is single-threaded at least until after this * routine is completed. 
*/ static void witness_initialize(void *dummy __unused) { struct lock_object *lock; struct witness_order_list_entry *order; struct witness *w, *w1; int i; w_data = malloc(sizeof (struct witness) * witness_count, M_WITNESS, M_WAITOK | M_ZERO); w_rmatrix = malloc(sizeof(*w_rmatrix) * (witness_count + 1), M_WITNESS, M_WAITOK | M_ZERO); for (i = 0; i < witness_count + 1; i++) { w_rmatrix[i] = malloc(sizeof(*w_rmatrix[i]) * (witness_count + 1), M_WITNESS, M_WAITOK | M_ZERO); } badstack_sbuf_size = witness_count * 256; /* * We have to release Giant before initializing its witness * structure so that WITNESS doesn't get confused. */ mtx_unlock(&Giant); mtx_assert(&Giant, MA_NOTOWNED); CTR1(KTR_WITNESS, "%s: initializing witness", __func__); mtx_init(&w_mtx, "witness lock", NULL, MTX_SPIN | MTX_QUIET | MTX_NOWITNESS | MTX_NOPROFILE); for (i = witness_count - 1; i >= 0; i--) { w = &w_data[i]; memset(w, 0, sizeof(*w)); w_data[i].w_index = i; /* Witness index never changes. */ witness_free(w); } KASSERT(STAILQ_FIRST(&w_free)->w_index == 0, ("%s: Invalid list of free witness objects", __func__)); /* Witness with index 0 is not used to aid in debugging. */ STAILQ_REMOVE_HEAD(&w_free, w_list); w_free_cnt--; for (i = 0; i < witness_count; i++) { memset(w_rmatrix[i], 0, sizeof(*w_rmatrix[i]) * (witness_count + 1)); } for (i = 0; i < LOCK_CHILDCOUNT; i++) witness_lock_list_free(&w_locklistdata[i]); witness_init_hash_tables(); /* First add in all the specified order lists. */ for (order = order_lists; order->w_name != NULL; order++) { w = enroll(order->w_name, order->w_class); if (w == NULL) continue; w->w_file = "order list"; for (order++; order->w_name != NULL; order++) { w1 = enroll(order->w_name, order->w_class); if (w1 == NULL) continue; w1->w_file = "order list"; itismychild(w, w1); w = w1; } } witness_spin_warn = 1; /* Iterate through all locks and add them to witness. */ for (i = 0; pending_locks[i].wh_lock != NULL; i++) { lock = pending_locks[i].wh_lock; KASSERT(lock->lo_flags & LO_WITNESS, ("%s: lock %s is on pending list but not LO_WITNESS", __func__, lock->lo_name)); lock->lo_witness = enroll(pending_locks[i].wh_type, LOCK_CLASS(lock)); } /* Mark the witness code as being ready for use. */ witness_cold = 0; mtx_lock(&Giant); } SYSINIT(witness_init, SI_SUB_WITNESS, SI_ORDER_FIRST, witness_initialize, NULL); void witness_init(struct lock_object *lock, const char *type) { struct lock_class *class; /* Various sanity checks. */ class = LOCK_CLASS(lock); if ((lock->lo_flags & LO_RECURSABLE) != 0 && (class->lc_flags & LC_RECURSABLE) == 0) kassert_panic("%s: lock (%s) %s can not be recursable", __func__, class->lc_name, lock->lo_name); if ((lock->lo_flags & LO_SLEEPABLE) != 0 && (class->lc_flags & LC_SLEEPABLE) == 0) kassert_panic("%s: lock (%s) %s can not be sleepable", __func__, class->lc_name, lock->lo_name); if ((lock->lo_flags & LO_UPGRADABLE) != 0 && (class->lc_flags & LC_UPGRADABLE) == 0) kassert_panic("%s: lock (%s) %s can not be upgradable", __func__, class->lc_name, lock->lo_name); /* * If we shouldn't watch this lock, then just clear lo_witness. * Otherwise, if witness_cold is set, then it is too early to * enroll this lock, so defer it to witness_initialize() by adding * it to the pending_locks list. If it is not too early, then enroll * the lock now. 
*/ if (witness_watch < 1 || panicstr != NULL || (lock->lo_flags & LO_WITNESS) == 0) lock->lo_witness = NULL; else if (witness_cold) { pending_locks[pending_cnt].wh_lock = lock; pending_locks[pending_cnt++].wh_type = type; if (pending_cnt > WITNESS_PENDLIST) panic("%s: pending locks list is too small, " "increase WITNESS_PENDLIST\n", __func__); } else lock->lo_witness = enroll(type, class); } void witness_destroy(struct lock_object *lock) { struct lock_class *class; struct witness *w; class = LOCK_CLASS(lock); if (witness_cold) panic("lock (%s) %s destroyed while witness_cold", class->lc_name, lock->lo_name); /* XXX: need to verify that no one holds the lock */ if ((lock->lo_flags & LO_WITNESS) == 0 || lock->lo_witness == NULL) return; w = lock->lo_witness; mtx_lock_spin(&w_mtx); MPASS(w->w_refcount > 0); w->w_refcount--; if (w->w_refcount == 0) depart(w); mtx_unlock_spin(&w_mtx); } #ifdef DDB static void witness_ddb_compute_levels(void) { struct witness *w; /* * First clear all levels. */ STAILQ_FOREACH(w, &w_all, w_list) w->w_ddb_level = -1; /* * Look for locks with no parents and level all their descendants. */ STAILQ_FOREACH(w, &w_all, w_list) { /* If the witness has ancestors (is not a root), skip it. */ if (w->w_num_ancestors > 0) continue; witness_ddb_level_descendants(w, 0); } } static void witness_ddb_level_descendants(struct witness *w, int l) { int i; if (w->w_ddb_level >= l) return; w->w_ddb_level = l; l++; for (i = 1; i <= w_max_used_index; i++) { if (w_rmatrix[w->w_index][i] & WITNESS_PARENT) witness_ddb_level_descendants(&w_data[i], l); } } static void witness_ddb_display_descendants(int(*prnt)(const char *fmt, ...), struct witness *w, int indent) { int i; for (i = 0; i < indent; i++) prnt(" "); prnt("%s (type: %s, depth: %d, active refs: %d)", w->w_name, w->w_class->lc_name, w->w_ddb_level, w->w_refcount); if (w->w_displayed) { prnt(" -- (already displayed)\n"); return; } w->w_displayed = 1; if (w->w_file != NULL && w->w_line != 0) prnt(" -- last acquired @ %s:%d\n", fixup_filename(w->w_file), w->w_line); else prnt(" -- never acquired\n"); indent++; WITNESS_INDEX_ASSERT(w->w_index); for (i = 1; i <= w_max_used_index; i++) { if (db_pager_quit) return; if (w_rmatrix[w->w_index][i] & WITNESS_PARENT) witness_ddb_display_descendants(prnt, &w_data[i], indent); } } static void witness_ddb_display_list(int(*prnt)(const char *fmt, ...), struct witness_list *list) { struct witness *w; STAILQ_FOREACH(w, list, w_typelist) { if (w->w_file == NULL || w->w_ddb_level > 0) continue; /* This lock has no ancestors - display its descendants. */ witness_ddb_display_descendants(prnt, w, 0); if (db_pager_quit) return; } } static void witness_ddb_display(int(*prnt)(const char *fmt, ...)) { struct witness *w; KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); witness_ddb_compute_levels(); /* Clear all the displayed flags. */ STAILQ_FOREACH(w, &w_all, w_list) w->w_displayed = 0; /* * First, handle sleep locks which have been acquired at least * once. */ prnt("Sleep locks:\n"); witness_ddb_display_list(prnt, &w_sleep); if (db_pager_quit) return; /* * Now do spin locks which have been acquired at least once. */ prnt("\nSpin locks:\n"); witness_ddb_display_list(prnt, &w_spin); if (db_pager_quit) return; /* * Finally, any locks which have not been acquired yet.
*/ prnt("\nLocks which were never acquired:\n"); STAILQ_FOREACH(w, &w_all, w_list) { if (w->w_file != NULL || w->w_refcount == 0) continue; prnt("%s (type: %s, depth: %d)\n", w->w_name, w->w_class->lc_name, w->w_ddb_level); if (db_pager_quit) return; } } #endif /* DDB */ int witness_defineorder(struct lock_object *lock1, struct lock_object *lock2) { if (witness_watch == -1 || panicstr != NULL) return (0); /* Require locks that witness knows about. */ if (lock1 == NULL || lock1->lo_witness == NULL || lock2 == NULL || lock2->lo_witness == NULL) return (EINVAL); mtx_assert(&w_mtx, MA_NOTOWNED); mtx_lock_spin(&w_mtx); /* * If we already have either an explicit or implied lock order that * is the other way around, then return an error. */ if (witness_watch && isitmydescendant(lock2->lo_witness, lock1->lo_witness)) { mtx_unlock_spin(&w_mtx); return (EDOOFUS); } /* Try to add the new order. */ CTR3(KTR_WITNESS, "%s: adding %s as a child of %s", __func__, lock2->lo_witness->w_name, lock1->lo_witness->w_name); itismychild(lock1->lo_witness, lock2->lo_witness); mtx_unlock_spin(&w_mtx); return (0); } void witness_checkorder(struct lock_object *lock, int flags, const char *file, int line, struct lock_object *interlock) { struct lock_list_entry *lock_list, *lle; struct lock_instance *lock1, *lock2, *plock; struct lock_class *class, *iclass; struct witness *w, *w1; struct thread *td; int i, j; if (witness_cold || witness_watch < 1 || lock->lo_witness == NULL || panicstr != NULL) return; w = lock->lo_witness; class = LOCK_CLASS(lock); td = curthread; if (class->lc_flags & LC_SLEEPLOCK) { /* * Since spin locks include a critical section, this check * implicitly enforces a lock order of all sleep locks before * all spin locks. */ if (td->td_critnest != 0 && !kdb_active) kassert_panic("acquiring blockable sleep lock with " "spinlock or critical section held (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); /* * If this is the first lock acquired then just return as * no order checking is needed. */ lock_list = td->td_sleeplocks; if (lock_list == NULL || lock_list->ll_count == 0) return; } else { /* * If this is the first lock, just return as no order * checking is needed. Avoid problems with thread * migration pinning the thread while checking if * spinlocks are held. If at least one spinlock is held * the thread is in a safe path and it is allowed to * unpin it. */ sched_pin(); lock_list = PCPU_GET(spinlocks); if (lock_list == NULL || lock_list->ll_count == 0) { sched_unpin(); return; } sched_unpin(); } /* * Check to see if we are recursing on a lock we already own. If * so, make sure that we don't mismatch exclusive and shared lock * acquires. */ lock1 = find_instance(lock_list, lock); if (lock1 != NULL) { if ((lock1->li_flags & LI_EXCLUSIVE) != 0 && (flags & LOP_EXCLUSIVE) == 0) { witness_output("shared lock of (%s) %s @ %s:%d\n", class->lc_name, lock->lo_name, fixup_filename(file), line); witness_output("while exclusively locked from %s:%d\n", fixup_filename(lock1->li_file), lock1->li_line); kassert_panic("excl->share"); } if ((lock1->li_flags & LI_EXCLUSIVE) == 0 && (flags & LOP_EXCLUSIVE) != 0) { witness_output("exclusive lock of (%s) %s @ %s:%d\n", class->lc_name, lock->lo_name, fixup_filename(file), line); witness_output("while share locked from %s:%d\n", fixup_filename(lock1->li_file), lock1->li_line); kassert_panic("share->excl"); } return; } /* Warn if the interlock is not locked exactly once. 
*/ if (interlock != NULL) { iclass = LOCK_CLASS(interlock); lock1 = find_instance(lock_list, interlock); if (lock1 == NULL) kassert_panic("interlock (%s) %s not locked @ %s:%d", iclass->lc_name, interlock->lo_name, fixup_filename(file), line); else if ((lock1->li_flags & LI_RECURSEMASK) != 0) kassert_panic("interlock (%s) %s recursed @ %s:%d", iclass->lc_name, interlock->lo_name, fixup_filename(file), line); } /* * Find the previously acquired lock, but ignore interlocks. */ plock = &lock_list->ll_children[lock_list->ll_count - 1]; if (interlock != NULL && plock->li_lock == interlock) { if (lock_list->ll_count > 1) plock = &lock_list->ll_children[lock_list->ll_count - 2]; else { lle = lock_list->ll_next; /* * The interlock is the only lock we hold, so * simply return. */ if (lle == NULL) return; plock = &lle->ll_children[lle->ll_count - 1]; } } /* * Try to perform most checks without a lock. If this succeeds we * can skip acquiring the lock and return success. Otherwise we redo * the check with the lock held to handle races with concurrent updates. */ w1 = plock->li_lock->lo_witness; if (witness_lock_order_check(w1, w)) return; mtx_lock_spin(&w_mtx); if (witness_lock_order_check(w1, w)) { mtx_unlock_spin(&w_mtx); return; } witness_lock_order_add(w1, w); /* * Check for duplicate locks of the same type. Note that we only * have to check for this on the last lock we just acquired. Any * other cases will be caught as lock order violations. */ if (w1 == w) { i = w->w_index; if (!(lock->lo_flags & LO_DUPOK) && !(flags & LOP_DUPOK) && !(w_rmatrix[i][i] & WITNESS_REVERSAL)) { w_rmatrix[i][i] |= WITNESS_REVERSAL; w->w_reversed = 1; mtx_unlock_spin(&w_mtx); witness_output( "acquiring duplicate lock of same type: \"%s\"\n", w->w_name); witness_output(" 1st %s @ %s:%d\n", plock->li_lock->lo_name, fixup_filename(plock->li_file), plock->li_line); witness_output(" 2nd %s @ %s:%d\n", lock->lo_name, fixup_filename(file), line); witness_debugger(1, __func__); } else mtx_unlock_spin(&w_mtx); return; } mtx_assert(&w_mtx, MA_OWNED); /* * If we know that the lock we are acquiring comes after * the lock we most recently acquired in the lock order tree, * then there is no need for any further checks. */ if (isitmychild(w1, w)) goto out; for (j = 0, lle = lock_list; lle != NULL; lle = lle->ll_next) { for (i = lle->ll_count - 1; i >= 0; i--, j++) { MPASS(j < LOCK_CHILDCOUNT * LOCK_NCHILDREN); lock1 = &lle->ll_children[i]; /* * Ignore the interlock. */ if (interlock == lock1->li_lock) continue; /* * If this lock doesn't undergo witness checking, * then skip it. */ w1 = lock1->li_lock->lo_witness; if (w1 == NULL) { KASSERT((lock1->li_lock->lo_flags & LO_WITNESS) == 0, ("lock missing witness structure")); continue; } /* * If we are locking Giant and this is a sleepable * lock, then skip it. */ if ((lock1->li_lock->lo_flags & LO_SLEEPABLE) != 0 && lock == &Giant.lock_object) continue; /* * If we are locking a sleepable lock and this lock * is Giant, then skip it. */ if ((lock->lo_flags & LO_SLEEPABLE) != 0 && lock1->li_lock == &Giant.lock_object) continue; /* * If we are locking a sleepable lock and this lock * isn't sleepable, we want to treat it as a lock * order violation to enforce a general lock order of * sleepable locks before non-sleepable locks. */ if (((lock->lo_flags & LO_SLEEPABLE) != 0 && (lock1->li_lock->lo_flags & LO_SLEEPABLE) == 0)) goto reversal; /* * If we are locking Giant and this is a non-sleepable * lock, then treat it as a reversal.
*/ if ((lock1->li_lock->lo_flags & LO_SLEEPABLE) == 0 && lock == &Giant.lock_object) goto reversal; /* * Check the lock order hierarchy for a reversal. */ if (!isitmydescendant(w, w1)) continue; reversal: /* * We have a lock order violation, check to see if it * is allowed or has already been yelled about. */ #ifdef BLESSING /* * If the lock order is blessed, just bail. We don't * look for other lock order violations though, which * may be a bug. */ if (blessed(w, w1)) goto out; #endif /* Bail if this violation is known */ if (w_rmatrix[w1->w_index][w->w_index] & WITNESS_REVERSAL) goto out; /* Record this as a violation */ w_rmatrix[w1->w_index][w->w_index] |= WITNESS_REVERSAL; w_rmatrix[w->w_index][w1->w_index] |= WITNESS_REVERSAL; w->w_reversed = w1->w_reversed = 1; witness_increment_graph_generation(); mtx_unlock_spin(&w_mtx); #ifdef WITNESS_NO_VNODE /* * There are known LORs between VNODE locks. They are * not an indication of a bug. VNODE locks are flagged * as such (LO_IS_VNODE) and we don't yell if the LOR * is between 2 VNODE locks. */ if ((lock->lo_flags & LO_IS_VNODE) != 0 && (lock1->li_lock->lo_flags & LO_IS_VNODE) != 0) return; #endif /* * Ok, yell about it. */ if (((lock->lo_flags & LO_SLEEPABLE) != 0 && (lock1->li_lock->lo_flags & LO_SLEEPABLE) == 0)) witness_output( "lock order reversal: (sleepable after non-sleepable)\n"); else if ((lock1->li_lock->lo_flags & LO_SLEEPABLE) == 0 && lock == &Giant.lock_object) witness_output( "lock order reversal: (Giant after non-sleepable)\n"); else witness_output("lock order reversal:\n"); /* * Try to locate an earlier lock with * witness w in our list. */ do { lock2 = &lle->ll_children[i]; MPASS(lock2->li_lock != NULL); if (lock2->li_lock->lo_witness == w) break; if (i == 0 && lle->ll_next != NULL) { lle = lle->ll_next; i = lle->ll_count - 1; MPASS(i >= 0 && i < LOCK_NCHILDREN); } else i--; } while (i >= 0); if (i < 0) { witness_output(" 1st %p %s (%s) @ %s:%d\n", lock1->li_lock, lock1->li_lock->lo_name, w1->w_name, fixup_filename(lock1->li_file), lock1->li_line); witness_output(" 2nd %p %s (%s) @ %s:%d\n", lock, lock->lo_name, w->w_name, fixup_filename(file), line); } else { witness_output(" 1st %p %s (%s) @ %s:%d\n", lock2->li_lock, lock2->li_lock->lo_name, lock2->li_lock->lo_witness->w_name, fixup_filename(lock2->li_file), lock2->li_line); witness_output(" 2nd %p %s (%s) @ %s:%d\n", lock1->li_lock, lock1->li_lock->lo_name, w1->w_name, fixup_filename(lock1->li_file), lock1->li_line); witness_output(" 3rd %p %s (%s) @ %s:%d\n", lock, lock->lo_name, w->w_name, fixup_filename(file), line); } witness_debugger(1, __func__); return; } } /* * If requested, build a new lock order. However, don't build a new * relationship between a sleepable lock and Giant if it is in the * wrong direction. The correct lock order is that sleepable locks * always come before Giant. */ if (flags & LOP_NEWORDER && !(plock->li_lock == &Giant.lock_object && (lock->lo_flags & LO_SLEEPABLE) != 0)) { CTR3(KTR_WITNESS, "%s: adding %s as a child of %s", __func__, w->w_name, plock->li_lock->lo_witness->w_name); itismychild(plock->li_lock->lo_witness, w); } out: mtx_unlock_spin(&w_mtx); } void witness_lock(struct lock_object *lock, int flags, const char *file, int line) { struct lock_list_entry **lock_list, *lle; struct lock_instance *instance; struct witness *w; struct thread *td; if (witness_cold || witness_watch == -1 || lock->lo_witness == NULL || panicstr != NULL) return; w = lock->lo_witness; td = curthread; /* Determine lock list for this lock.
*/ if (LOCK_CLASS(lock)->lc_flags & LC_SLEEPLOCK) lock_list = &td->td_sleeplocks; else lock_list = PCPU_PTR(spinlocks); /* Check to see if we are recursing on a lock we already own. */ instance = find_instance(*lock_list, lock); if (instance != NULL) { instance->li_flags++; CTR4(KTR_WITNESS, "%s: pid %d recursed on %s r=%d", __func__, td->td_proc->p_pid, lock->lo_name, instance->li_flags & LI_RECURSEMASK); instance->li_file = file; instance->li_line = line; return; } /* Update per-witness last file and line acquire. */ w->w_file = file; w->w_line = line; /* Find the next open lock instance in the list and fill it. */ lle = *lock_list; if (lle == NULL || lle->ll_count == LOCK_NCHILDREN) { lle = witness_lock_list_get(); if (lle == NULL) return; lle->ll_next = *lock_list; CTR3(KTR_WITNESS, "%s: pid %d added lle %p", __func__, td->td_proc->p_pid, lle); *lock_list = lle; } instance = &lle->ll_children[lle->ll_count++]; instance->li_lock = lock; instance->li_line = line; instance->li_file = file; if ((flags & LOP_EXCLUSIVE) != 0) instance->li_flags = LI_EXCLUSIVE; else instance->li_flags = 0; CTR4(KTR_WITNESS, "%s: pid %d added %s as lle[%d]", __func__, td->td_proc->p_pid, lock->lo_name, lle->ll_count - 1); } void witness_upgrade(struct lock_object *lock, int flags, const char *file, int line) { struct lock_instance *instance; struct lock_class *class; KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); if (lock->lo_witness == NULL || witness_watch == -1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if (witness_watch) { if ((lock->lo_flags & LO_UPGRADABLE) == 0) kassert_panic( "upgrade of non-upgradable lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((class->lc_flags & LC_SLEEPLOCK) == 0) kassert_panic( "upgrade of non-sleep lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); } instance = find_instance(curthread->td_sleeplocks, lock); if (instance == NULL) { kassert_panic("upgrade of unlocked lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); return; } if (witness_watch) { if ((instance->li_flags & LI_EXCLUSIVE) != 0) kassert_panic( "upgrade of exclusive lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((instance->li_flags & LI_RECURSEMASK) != 0) kassert_panic( "upgrade of recursed lock (%s) %s r=%d @ %s:%d", class->lc_name, lock->lo_name, instance->li_flags & LI_RECURSEMASK, fixup_filename(file), line); } instance->li_flags |= LI_EXCLUSIVE; } void witness_downgrade(struct lock_object *lock, int flags, const char *file, int line) { struct lock_instance *instance; struct lock_class *class; KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); if (lock->lo_witness == NULL || witness_watch == -1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if (witness_watch) { if ((lock->lo_flags & LO_UPGRADABLE) == 0) kassert_panic( "downgrade of non-upgradable lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((class->lc_flags & LC_SLEEPLOCK) == 0) kassert_panic( "downgrade of non-sleep lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); } instance = find_instance(curthread->td_sleeplocks, lock); if (instance == NULL) { kassert_panic("downgrade of unlocked lock (%s) %s @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); return; } if (witness_watch) { if ((instance->li_flags & LI_EXCLUSIVE) == 0) kassert_panic( "downgrade of shared lock (%s) %s @ %s:%d", 
class->lc_name, lock->lo_name, fixup_filename(file), line); if ((instance->li_flags & LI_RECURSEMASK) != 0) kassert_panic( "downgrade of recursed lock (%s) %s r=%d @ %s:%d", class->lc_name, lock->lo_name, instance->li_flags & LI_RECURSEMASK, fixup_filename(file), line); } instance->li_flags &= ~LI_EXCLUSIVE; } void witness_unlock(struct lock_object *lock, int flags, const char *file, int line) { struct lock_list_entry **lock_list, *lle; struct lock_instance *instance; struct lock_class *class; struct thread *td; register_t s; int i, j; if (witness_cold || lock->lo_witness == NULL || panicstr != NULL) return; td = curthread; class = LOCK_CLASS(lock); /* Find lock instance associated with this lock. */ if (class->lc_flags & LC_SLEEPLOCK) lock_list = &td->td_sleeplocks; else lock_list = PCPU_PTR(spinlocks); lle = *lock_list; for (; *lock_list != NULL; lock_list = &(*lock_list)->ll_next) for (i = 0; i < (*lock_list)->ll_count; i++) { instance = &(*lock_list)->ll_children[i]; if (instance->li_lock == lock) goto found; } /* * When WITNESS gets disabled via witness_watch, we may still have * locks registered in the td_sleeplocks queue. We have to make sure * those queues get flushed, so just search for any leftover * registered locks and remove them. */ if (witness_watch > 0) { kassert_panic("lock (%s) %s not locked @ %s:%d", class->lc_name, lock->lo_name, fixup_filename(file), line); return; } else { return; } found: /* First, check for shared/exclusive mismatches. */ if ((instance->li_flags & LI_EXCLUSIVE) != 0 && witness_watch > 0 && (flags & LOP_EXCLUSIVE) == 0) { witness_output("shared unlock of (%s) %s @ %s:%d\n", class->lc_name, lock->lo_name, fixup_filename(file), line); witness_output("while exclusively locked from %s:%d\n", fixup_filename(instance->li_file), instance->li_line); kassert_panic("excl->ushare"); } if ((instance->li_flags & LI_EXCLUSIVE) == 0 && witness_watch > 0 && (flags & LOP_EXCLUSIVE) != 0) { witness_output("exclusive unlock of (%s) %s @ %s:%d\n", class->lc_name, lock->lo_name, fixup_filename(file), line); witness_output("while share locked from %s:%d\n", fixup_filename(instance->li_file), instance->li_line); kassert_panic("share->uexcl"); } /* If we are recursed, unrecurse. */ if ((instance->li_flags & LI_RECURSEMASK) > 0) { CTR4(KTR_WITNESS, "%s: pid %d unrecursed on %s r=%d", __func__, td->td_proc->p_pid, instance->li_lock->lo_name, instance->li_flags); instance->li_flags--; return; } /* The lock is now being dropped, check for NORELEASE flag */ if ((instance->li_flags & LI_NORELEASE) != 0 && witness_watch > 0) { witness_output("forbidden unlock of (%s) %s @ %s:%d\n", class->lc_name, lock->lo_name, fixup_filename(file), line); kassert_panic("lock marked norelease"); } /* Otherwise, remove this item from the list. */ s = intr_disable(); CTR4(KTR_WITNESS, "%s: pid %d removed %s from lle[%d]", __func__, td->td_proc->p_pid, instance->li_lock->lo_name, (*lock_list)->ll_count - 1); for (j = i; j < (*lock_list)->ll_count - 1; j++) (*lock_list)->ll_children[j] = (*lock_list)->ll_children[j + 1]; (*lock_list)->ll_count--; intr_restore(s); /* * In order to reduce contention on w_mtx, we always want to keep a * head object on the lists so that frequent allocation from the * free witness pool (and the subsequent locking) is avoided.
* To keep the code simple, when the head object is completely * emptied, it also means that there are no further objects in the * list, so list ownership needs to be handed over to another object * if the current head needs to be freed. */ if ((*lock_list)->ll_count == 0) { if (*lock_list == lle) { if (lle->ll_next == NULL) return; } else lle = *lock_list; *lock_list = lle->ll_next; CTR3(KTR_WITNESS, "%s: pid %d removed lle %p", __func__, td->td_proc->p_pid, lle); witness_lock_list_free(lle); } } void witness_thread_exit(struct thread *td) { struct lock_list_entry *lle; int i, n; lle = td->td_sleeplocks; if (lle == NULL || panicstr != NULL) return; if (lle->ll_count != 0) { for (n = 0; lle != NULL; lle = lle->ll_next) for (i = lle->ll_count - 1; i >= 0; i--) { if (n == 0) witness_output( "Thread %p exiting with the following locks held:\n", td); n++; witness_list_lock(&lle->ll_children[i], witness_output); } kassert_panic( "Thread %p cannot exit while holding sleeplocks\n", td); } witness_lock_list_free(lle); } /* * Warn if any locks other than 'lock' are held. Flags can be passed in to * exempt Giant and sleepable locks from the checks as well. If any * non-exempt locks are held, then a supplied message is printed to the * output channel along with a list of the offending locks. If indicated in the * flags then a failure results in a panic as well. */ int witness_warn(int flags, struct lock_object *lock, const char *fmt, ...) { struct lock_list_entry *lock_list, *lle; struct lock_instance *lock1; struct thread *td; va_list ap; int i, n; if (witness_cold || witness_watch < 1 || panicstr != NULL) return (0); n = 0; td = curthread; for (lle = td->td_sleeplocks; lle != NULL; lle = lle->ll_next) for (i = lle->ll_count - 1; i >= 0; i--) { lock1 = &lle->ll_children[i]; if (lock1->li_lock == lock) continue; if (flags & WARN_GIANTOK && lock1->li_lock == &Giant.lock_object) continue; if (flags & WARN_SLEEPOK && (lock1->li_lock->lo_flags & LO_SLEEPABLE) != 0) continue; if (n == 0) { va_start(ap, fmt); witness_voutput(fmt, ap); va_end(ap); witness_output( " with the following %slocks held:\n", (flags & WARN_SLEEPOK) != 0 ? "non-sleepable " : ""); } n++; witness_list_lock(lock1, witness_output); } /* * Pin the thread in order to avoid problems with thread migration. * Once all checks of spinlock ownership have passed, the thread is * on a safe path and can be unpinned. */ sched_pin(); lock_list = PCPU_GET(spinlocks); if (lock_list != NULL && lock_list->ll_count != 0) { sched_unpin(); /* * We should only have one spinlock and as long as * the flags cannot match for this lock's class, * check if the first spinlock is the one curthread * should hold. */ lock1 = &lock_list->ll_children[lock_list->ll_count - 1]; if (lock_list->ll_count == 1 && lock_list->ll_next == NULL && lock1->li_lock == lock && n == 0) return (0); va_start(ap, fmt); witness_voutput(fmt, ap); va_end(ap); witness_output(" with the following %slocks held:\n", (flags & WARN_SLEEPOK) != 0 ?
"non-sleepable " : ""); n += witness_list_locks(&lock_list, witness_output); } else sched_unpin(); if (flags & WARN_PANIC && n) kassert_panic("%s", __func__); else witness_debugger(n, __func__); return (n); } const char * witness_file(struct lock_object *lock) { struct witness *w; if (witness_cold || witness_watch < 1 || lock->lo_witness == NULL) return ("?"); w = lock->lo_witness; return (w->w_file); } int witness_line(struct lock_object *lock) { struct witness *w; if (witness_cold || witness_watch < 1 || lock->lo_witness == NULL) return (0); w = lock->lo_witness; return (w->w_line); } static struct witness * enroll(const char *description, struct lock_class *lock_class) { struct witness *w; struct witness_list *typelist; MPASS(description != NULL); if (witness_watch == -1 || panicstr != NULL) return (NULL); if ((lock_class->lc_flags & LC_SPINLOCK)) { if (witness_skipspin) return (NULL); else typelist = &w_spin; } else if ((lock_class->lc_flags & LC_SLEEPLOCK)) { typelist = &w_sleep; } else { kassert_panic("lock class %s is not sleep or spin", lock_class->lc_name); return (NULL); } mtx_lock_spin(&w_mtx); w = witness_hash_get(description); if (w) goto found; if ((w = witness_get()) == NULL) return (NULL); MPASS(strlen(description) < MAX_W_NAME); strcpy(w->w_name, description); w->w_class = lock_class; w->w_refcount = 1; STAILQ_INSERT_HEAD(&w_all, w, w_list); if (lock_class->lc_flags & LC_SPINLOCK) { STAILQ_INSERT_HEAD(&w_spin, w, w_typelist); w_spin_cnt++; } else if (lock_class->lc_flags & LC_SLEEPLOCK) { STAILQ_INSERT_HEAD(&w_sleep, w, w_typelist); w_sleep_cnt++; } /* Insert new witness into the hash */ witness_hash_put(w); witness_increment_graph_generation(); mtx_unlock_spin(&w_mtx); return (w); found: w->w_refcount++; mtx_unlock_spin(&w_mtx); if (lock_class != w->w_class) kassert_panic( "lock (%s) %s does not match earlier (%s) lock", description, lock_class->lc_name, w->w_class->lc_name); return (w); } static void depart(struct witness *w) { struct witness_list *list; MPASS(w->w_refcount == 0); if (w->w_class->lc_flags & LC_SLEEPLOCK) { list = &w_sleep; w_sleep_cnt--; } else { list = &w_spin; w_spin_cnt--; } /* * Set file to NULL as it may point into a loadable module. */ w->w_file = NULL; w->w_line = 0; witness_increment_graph_generation(); } static void adopt(struct witness *parent, struct witness *child) { int pi, ci, i, j; if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); /* If the relationship is already known, there's no work to be done. */ if (isitmychild(parent, child)) return; /* When the structure of the graph changes, bump up the generation. */ witness_increment_graph_generation(); /* * The hard part ... create the direct relationship, then propagate all * indirect relationships. */ pi = parent->w_index; ci = child->w_index; WITNESS_INDEX_ASSERT(pi); WITNESS_INDEX_ASSERT(ci); MPASS(pi != ci); w_rmatrix[pi][ci] |= WITNESS_PARENT; w_rmatrix[ci][pi] |= WITNESS_CHILD; /* * If parent was not already an ancestor of child, * then we increment the descendant and ancestor counters. */ if ((w_rmatrix[pi][ci] & WITNESS_ANCESTOR) == 0) { parent->w_num_descendants++; child->w_num_ancestors++; } /* * Find each ancestor of 'pi'. Note that 'pi' itself is counted as * an ancestor of 'pi' during this loop. */ for (i = 1; i <= w_max_used_index; i++) { if ((w_rmatrix[i][pi] & WITNESS_ANCESTOR_MASK) == 0 && (i != pi)) continue; /* Find each descendant of 'i' and mark it as a descendant. 
*/ for (j = 1; j <= w_max_used_index; j++) { /* * Skip children that are already marked as * descendants of 'i'. */ if (w_rmatrix[i][j] & WITNESS_ANCESTOR_MASK) continue; /* * We are only interested in descendants of 'ci'. Note * that 'ci' itself is counted as a descendant of 'ci'. */ if ((w_rmatrix[ci][j] & WITNESS_ANCESTOR_MASK) == 0 && (j != ci)) continue; w_rmatrix[i][j] |= WITNESS_ANCESTOR; w_rmatrix[j][i] |= WITNESS_DESCENDANT; w_data[i].w_num_descendants++; w_data[j].w_num_ancestors++; /* * Make sure we aren't marking a node as both an * ancestor and descendant. We should have caught * this as a lock order reversal earlier. */ if ((w_rmatrix[i][j] & WITNESS_ANCESTOR_MASK) && (w_rmatrix[i][j] & WITNESS_DESCENDANT_MASK)) { printf("witness rmatrix paradox! [%d][%d]=%d " "both ancestor and descendant\n", i, j, w_rmatrix[i][j]); kdb_backtrace(); printf("Witness disabled.\n"); witness_watch = -1; } if ((w_rmatrix[j][i] & WITNESS_ANCESTOR_MASK) && (w_rmatrix[j][i] & WITNESS_DESCENDANT_MASK)) { printf("witness rmatrix paradox! [%d][%d]=%d " "both ancestor and descendant\n", j, i, w_rmatrix[j][i]); kdb_backtrace(); printf("Witness disabled.\n"); witness_watch = -1; } } } } static void itismychild(struct witness *parent, struct witness *child) { int unlocked; MPASS(child != NULL && parent != NULL); if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); if (!witness_lock_type_equal(parent, child)) { if (witness_cold == 0) { unlocked = 1; mtx_unlock_spin(&w_mtx); } else { unlocked = 0; } kassert_panic( "%s: parent \"%s\" (%s) and child \"%s\" (%s) are not " "the same lock type", __func__, parent->w_name, parent->w_class->lc_name, child->w_name, child->w_class->lc_name); if (unlocked) mtx_lock_spin(&w_mtx); } adopt(parent, child); } /* * Generic code for the isitmy*() functions. The rmask parameter is the * expected relationship of w1 to w2. */ static int _isitmyx(struct witness *w1, struct witness *w2, int rmask, const char *fname) { unsigned char r1, r2; int i1, i2; i1 = w1->w_index; i2 = w2->w_index; WITNESS_INDEX_ASSERT(i1); WITNESS_INDEX_ASSERT(i2); r1 = w_rmatrix[i1][i2] & WITNESS_RELATED_MASK; r2 = w_rmatrix[i2][i1] & WITNESS_RELATED_MASK; /* The flags on one side must be the inverse of the flags on the other */ if (!((WITNESS_ATOD(r1) == r2 && WITNESS_DTOA(r2) == r1) || (WITNESS_DTOA(r1) == r2 && WITNESS_ATOD(r2) == r1))) { /* Don't squawk if we're potentially racing with an update. */ if (!mtx_owned(&w_mtx)) return (0); printf("%s: rmatrix mismatch between %s (index %d) and %s " "(index %d): w_rmatrix[%d][%d] == %hhx but " "w_rmatrix[%d][%d] == %hhx\n", fname, w1->w_name, i1, w2->w_name, i2, i1, i2, r1, i2, i1, r2); kdb_backtrace(); printf("Witness disabled.\n"); witness_watch = -1; } return (r1 & rmask); } /* * Checks if @child is a direct child of @parent. */ static int isitmychild(struct witness *parent, struct witness *child) { return (_isitmyx(parent, child, WITNESS_PARENT, __func__)); } /* * Checks if @descendant is a direct or indirect descendant of @ancestor.
*/ static int isitmydescendant(struct witness *ancestor, struct witness *descendant) { return (_isitmyx(ancestor, descendant, WITNESS_ANCESTOR_MASK, __func__)); } #ifdef BLESSING static int blessed(struct witness *w1, struct witness *w2) { int i; struct witness_blessed *b; for (i = 0; i < nitems(blessed_list); i++) { b = &blessed_list[i]; if (strcmp(w1->w_name, b->b_lock1) == 0) { if (strcmp(w2->w_name, b->b_lock2) == 0) return (1); continue; } if (strcmp(w1->w_name, b->b_lock2) == 0) if (strcmp(w2->w_name, b->b_lock1) == 0) return (1); } return (0); } #endif static struct witness * witness_get(void) { struct witness *w; int index; if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); if (witness_watch == -1) { mtx_unlock_spin(&w_mtx); return (NULL); } if (STAILQ_EMPTY(&w_free)) { witness_watch = -1; mtx_unlock_spin(&w_mtx); printf("WITNESS: unable to allocate a new witness object\n"); return (NULL); } w = STAILQ_FIRST(&w_free); STAILQ_REMOVE_HEAD(&w_free, w_list); w_free_cnt--; index = w->w_index; MPASS(index > 0 && index == w_max_used_index+1 && index < witness_count); bzero(w, sizeof(*w)); w->w_index = index; if (index > w_max_used_index) w_max_used_index = index; return (w); } static void witness_free(struct witness *w) { STAILQ_INSERT_HEAD(&w_free, w, w_list); w_free_cnt++; } static struct lock_list_entry * witness_lock_list_get(void) { struct lock_list_entry *lle; if (witness_watch == -1) return (NULL); mtx_lock_spin(&w_mtx); lle = w_lock_list_free; if (lle == NULL) { witness_watch = -1; mtx_unlock_spin(&w_mtx); printf("%s: witness exhausted\n", __func__); return (NULL); } w_lock_list_free = lle->ll_next; mtx_unlock_spin(&w_mtx); bzero(lle, sizeof(*lle)); return (lle); } static void witness_lock_list_free(struct lock_list_entry *lle) { mtx_lock_spin(&w_mtx); lle->ll_next = w_lock_list_free; w_lock_list_free = lle; mtx_unlock_spin(&w_mtx); } static struct lock_instance * find_instance(struct lock_list_entry *list, const struct lock_object *lock) { struct lock_list_entry *lle; struct lock_instance *instance; int i; for (lle = list; lle != NULL; lle = lle->ll_next) for (i = lle->ll_count - 1; i >= 0; i--) { instance = &lle->ll_children[i]; if (instance->li_lock == lock) return (instance); } return (NULL); } static void witness_list_lock(struct lock_instance *instance, int (*prnt)(const char *fmt, ...)) { struct lock_object *lock; lock = instance->li_lock; prnt("%s %s %s", (instance->li_flags & LI_EXCLUSIVE) != 0 ? "exclusive" : "shared", LOCK_CLASS(lock)->lc_name, lock->lo_name); if (lock->lo_witness->w_name != lock->lo_name) prnt(" (%s)", lock->lo_witness->w_name); prnt(" r = %d (%p) locked @ %s:%d\n", instance->li_flags & LI_RECURSEMASK, lock, fixup_filename(instance->li_file), instance->li_line); } static int witness_output(const char *fmt, ...) 
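/*
 * witness_output() is the front end of the usual printf/vprintf-style
 * pair: it captures its variable arguments and forwards them to
 * witness_voutput(), so callers such as witness_warn() that already
 * hold a va_list can reuse the same channel dispatch.  The pattern, as
 * a self-contained sketch with hypothetical names:
 *
 *	static int
 *	vfoo(const char *fmt, va_list ap)
 *	{
 *		return (vprintf(fmt, ap));
 *	}
 *
 *	static int
 *	foo(const char *fmt, ...)
 *	{
 *		va_list ap;
 *		int ret;
 *
 *		va_start(ap, fmt);
 *		ret = vfoo(fmt, ap);
 *		va_end(ap);
 *		return (ret);
 *	}
 */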
{ va_list ap; int ret; va_start(ap, fmt); ret = witness_voutput(fmt, ap); va_end(ap); return (ret); } static int witness_voutput(const char *fmt, va_list ap) { int ret; ret = 0; switch (witness_channel) { case WITNESS_CONSOLE: ret = vprintf(fmt, ap); break; case WITNESS_LOG: vlog(LOG_NOTICE, fmt, ap); break; case WITNESS_NONE: break; } return (ret); } #ifdef DDB static int witness_thread_has_locks(struct thread *td) { if (td->td_sleeplocks == NULL) return (0); return (td->td_sleeplocks->ll_count != 0); } static int witness_proc_has_locks(struct proc *p) { struct thread *td; FOREACH_THREAD_IN_PROC(p, td) { if (witness_thread_has_locks(td)) return (1); } return (0); } #endif int witness_list_locks(struct lock_list_entry **lock_list, int (*prnt)(const char *fmt, ...)) { struct lock_list_entry *lle; int i, nheld; nheld = 0; for (lle = *lock_list; lle != NULL; lle = lle->ll_next) for (i = lle->ll_count - 1; i >= 0; i--) { witness_list_lock(&lle->ll_children[i], prnt); nheld++; } return (nheld); } /* * This is a bit risky at best. We call this function when we have timed * out acquiring a spin lock, and we assume that the other CPU is stuck * with this lock held. So, we go groveling around in the other CPU's * per-cpu data to try to find the lock instance for this spin lock to * see when it was last acquired. */ void witness_display_spinlock(struct lock_object *lock, struct thread *owner, int (*prnt)(const char *fmt, ...)) { struct lock_instance *instance; struct pcpu *pc; if (owner->td_critnest == 0 || owner->td_oncpu == NOCPU) return; pc = pcpu_find(owner->td_oncpu); instance = find_instance(pc->pc_spinlocks, lock); if (instance != NULL) witness_list_lock(instance, prnt); } void witness_save(struct lock_object *lock, const char **filep, int *linep) { struct lock_list_entry *lock_list; struct lock_instance *instance; struct lock_class *class; /* * This function is used independently in locking code to deal with * Giant, SCHEDULER_STOPPED() check can be removed here after Giant * is gone. */ if (SCHEDULER_STOPPED()) return; KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); if (lock->lo_witness == NULL || witness_watch == -1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if (class->lc_flags & LC_SLEEPLOCK) lock_list = curthread->td_sleeplocks; else { if (witness_skipspin) return; lock_list = PCPU_GET(spinlocks); } instance = find_instance(lock_list, lock); if (instance == NULL) { kassert_panic("%s: lock (%s) %s not locked", __func__, class->lc_name, lock->lo_name); return; } *filep = instance->li_file; *linep = instance->li_line; } void witness_restore(struct lock_object *lock, const char *file, int line) { struct lock_list_entry *lock_list; struct lock_instance *instance; struct lock_class *class; /* * This function is used independently in locking code to deal with * Giant, SCHEDULER_STOPPED() check can be removed here after Giant * is gone. 
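 *
 * witness_save() and witness_restore() are used as a pair around code
 * that transiently drops and re-acquires a lock but wants to keep the
 * lock's original file/line attribution.  The usual interface is the
 * wrapper macros from sys/lock.h; a sketch with a hypothetical mutex:
 *
 *	WITNESS_SAVE_DECL(foo);
 *
 *	WITNESS_SAVE(&foo_mtx.lock_object, foo);
 *	mtx_unlock(&foo_mtx);
 *	...
 *	mtx_lock(&foo_mtx);
 *	WITNESS_RESTORE(&foo_mtx.lock_object, foo);
 *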
*/ if (SCHEDULER_STOPPED()) return; KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); if (lock->lo_witness == NULL || witness_watch == -1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if (class->lc_flags & LC_SLEEPLOCK) lock_list = curthread->td_sleeplocks; else { if (witness_skipspin) return; lock_list = PCPU_GET(spinlocks); } instance = find_instance(lock_list, lock); if (instance == NULL) kassert_panic("%s: lock (%s) %s not locked", __func__, class->lc_name, lock->lo_name); lock->lo_witness->w_file = file; lock->lo_witness->w_line = line; if (instance == NULL) return; instance->li_file = file; instance->li_line = line; } void witness_assert(const struct lock_object *lock, int flags, const char *file, int line) { #ifdef INVARIANT_SUPPORT struct lock_instance *instance; struct lock_class *class; if (lock->lo_witness == NULL || witness_watch < 1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if ((class->lc_flags & LC_SLEEPLOCK) != 0) instance = find_instance(curthread->td_sleeplocks, lock); else if ((class->lc_flags & LC_SPINLOCK) != 0) instance = find_instance(PCPU_GET(spinlocks), lock); else { kassert_panic("Lock (%s) %s is not sleep or spin!", class->lc_name, lock->lo_name); return; } switch (flags) { case LA_UNLOCKED: if (instance != NULL) kassert_panic("Lock (%s) %s locked @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); break; case LA_LOCKED: case LA_LOCKED | LA_RECURSED: case LA_LOCKED | LA_NOTRECURSED: case LA_SLOCKED: case LA_SLOCKED | LA_RECURSED: case LA_SLOCKED | LA_NOTRECURSED: case LA_XLOCKED: case LA_XLOCKED | LA_RECURSED: case LA_XLOCKED | LA_NOTRECURSED: if (instance == NULL) { kassert_panic("Lock (%s) %s not locked @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); break; } if ((flags & LA_XLOCKED) != 0 && (instance->li_flags & LI_EXCLUSIVE) == 0) kassert_panic( "Lock (%s) %s not exclusively locked @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((flags & LA_SLOCKED) != 0 && (instance->li_flags & LI_EXCLUSIVE) != 0) kassert_panic( "Lock (%s) %s exclusively locked @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((flags & LA_RECURSED) != 0 && (instance->li_flags & LI_RECURSEMASK) == 0) kassert_panic("Lock (%s) %s not recursed @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); if ((flags & LA_NOTRECURSED) != 0 && (instance->li_flags & LI_RECURSEMASK) != 0) kassert_panic("Lock (%s) %s recursed @ %s:%d.", class->lc_name, lock->lo_name, fixup_filename(file), line); break; default: kassert_panic("Invalid lock assertion at %s:%d.", fixup_filename(file), line); } #endif /* INVARIANT_SUPPORT */ } static void witness_setflag(struct lock_object *lock, int flag, int set) { struct lock_list_entry *lock_list; struct lock_instance *instance; struct lock_class *class; if (lock->lo_witness == NULL || witness_watch == -1 || panicstr != NULL) return; class = LOCK_CLASS(lock); if (class->lc_flags & LC_SLEEPLOCK) lock_list = curthread->td_sleeplocks; else { if (witness_skipspin) return; lock_list = PCPU_GET(spinlocks); } instance = find_instance(lock_list, lock); if (instance == NULL) { kassert_panic("%s: lock (%s) %s not locked", __func__, class->lc_name, lock->lo_name); return; } if (set) instance->li_flags |= flag; else instance->li_flags &= ~flag; } void witness_norelease(struct lock_object *lock) { witness_setflag(lock, LI_NORELEASE, 1); } void witness_releaseok(struct lock_object *lock) { witness_setflag(lock, LI_NORELEASE, 0); } #ifdef DDB static 
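/*
 * The DDB glue below walks the WITNESS lock lists from the kernel
 * debugger.  New "show" verbs follow the same shape; a minimal
 * (hypothetical) command would be:
 *
 *	DB_SHOW_COMMAND(mylocks, db_show_mylocks)
 *	{
 *		db_printf("example ddb command\n");
 *	}
 *
 * which is then available as "show mylocks" at the db> prompt.
 */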
void witness_ddb_list(struct thread *td) { KASSERT(witness_cold == 0, ("%s: witness_cold", __func__)); KASSERT(kdb_active, ("%s: not in the debugger", __func__)); if (witness_watch < 1) return; witness_list_locks(&td->td_sleeplocks, db_printf); /* * We only handle spinlocks if td == curthread. This is somewhat broken * if td is currently executing on some other CPU and holds spin locks, * as we won't display those locks. If we had an MI way of getting * the per-CPU data for a given CPU then we could use * td->td_oncpu to get the list of spinlocks for this thread * and "fix" this. * * That still wouldn't really fix this unless we locked the scheduler * lock or stopped the other CPU to make sure it wasn't changing the * list out from under us. It is probably best to just not try to * handle threads on other CPUs for now. */ if (td == curthread && PCPU_GET(spinlocks) != NULL) witness_list_locks(PCPU_PTR(spinlocks), db_printf); } DB_SHOW_COMMAND(locks, db_witness_list) { struct thread *td; if (have_addr) td = db_lookup_thread(addr, true); else td = kdb_thread; witness_ddb_list(td); } DB_SHOW_ALL_COMMAND(locks, db_witness_list_all) { struct thread *td; struct proc *p; /* * It would be nice to list only threads and processes that actually * held sleep locks, but that information is currently not exported * by WITNESS. */ FOREACH_PROC_IN_SYSTEM(p) { if (!witness_proc_has_locks(p)) continue; FOREACH_THREAD_IN_PROC(p, td) { if (!witness_thread_has_locks(td)) continue; db_printf("Process %d (%s) thread %p (%d)\n", p->p_pid, p->p_comm, td, td->td_tid); witness_ddb_list(td); if (db_pager_quit) return; } } } DB_SHOW_ALIAS(alllocks, db_witness_list_all) DB_SHOW_COMMAND(witness, db_witness_display) { witness_ddb_display(db_printf); } #endif static int sysctl_debug_witness_badstacks(SYSCTL_HANDLER_ARGS) { struct witness_lock_order_data *data1, *data2, *tmp_data1, *tmp_data2; struct witness *tmp_w1, *tmp_w2, *w1, *w2; struct sbuf *sb; u_int w_rmatrix1, w_rmatrix2; int error, generation, i, j; tmp_data1 = NULL; tmp_data2 = NULL; tmp_w1 = NULL; tmp_w2 = NULL; if (witness_watch < 1) { error = SYSCTL_OUT(req, w_notrunning, sizeof(w_notrunning)); return (error); } if (witness_cold) { error = SYSCTL_OUT(req, w_stillcold, sizeof(w_stillcold)); return (error); } error = 0; sb = sbuf_new(NULL, NULL, badstack_sbuf_size, SBUF_AUTOEXTEND); if (sb == NULL) return (ENOMEM); /* Allocate and init temporary storage space. */ tmp_w1 = malloc(sizeof(struct witness), M_TEMP, M_WAITOK | M_ZERO); tmp_w2 = malloc(sizeof(struct witness), M_TEMP, M_WAITOK | M_ZERO); tmp_data1 = malloc(sizeof(struct witness_lock_order_data), M_TEMP, M_WAITOK | M_ZERO); tmp_data2 = malloc(sizeof(struct witness_lock_order_data), M_TEMP, M_WAITOK | M_ZERO); stack_zero(&tmp_data1->wlod_stack); stack_zero(&tmp_data2->wlod_stack); restart: mtx_lock_spin(&w_mtx); generation = w_generation; mtx_unlock_spin(&w_mtx); sbuf_printf(sb, "Number of known direct relationships is %d\n", w_lohash.wloh_count); for (i = 1; i < w_max_used_index; i++) { mtx_lock_spin(&w_mtx); if (generation != w_generation) { mtx_unlock_spin(&w_mtx); /* The graph has changed, try again. */ req->oldidx = 0; sbuf_clear(sb); goto restart; } w1 = &w_data[i]; if (w1->w_reversed == 0) { mtx_unlock_spin(&w_mtx); continue; } /* Copy w1 locally so we can release the spin lock.
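 *
 * This is the snapshot-under-spinlock pattern used throughout this
 * handler: re-take w_mtx, check that the generation has not moved,
 * copy what is needed into preallocated storage, and drop the lock
 * before doing anything slow or sleepable (sbuf formatting, stack
 * printing).  Reduced to a skeleton with hypothetical names:
 *
 *	restart:
 *		mtx_lock_spin(&m);
 *		gen = generation;
 *		mtx_unlock_spin(&m);
 *		for (i = 0; i < n; i++) {
 *			mtx_lock_spin(&m);
 *			if (gen != generation) {
 *				mtx_unlock_spin(&m);
 *				goto restart;
 *			}
 *			local = shared[i];
 *			mtx_unlock_spin(&m);
 *			format_slowly(&local);
 *		}
 *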
*/ *tmp_w1 = *w1; mtx_unlock_spin(&w_mtx); if (tmp_w1->w_reversed == 0) continue; for (j = 1; j < w_max_used_index; j++) { if ((w_rmatrix[i][j] & WITNESS_REVERSAL) == 0 || i > j) continue; mtx_lock_spin(&w_mtx); if (generation != w_generation) { mtx_unlock_spin(&w_mtx); /* The graph has changed, try again. */ req->oldidx = 0; sbuf_clear(sb); goto restart; } w2 = &w_data[j]; data1 = witness_lock_order_get(w1, w2); data2 = witness_lock_order_get(w2, w1); /* * Copy information locally so we can release the * spin lock. */ *tmp_w2 = *w2; w_rmatrix1 = (unsigned int)w_rmatrix[i][j]; w_rmatrix2 = (unsigned int)w_rmatrix[j][i]; if (data1) { stack_zero(&tmp_data1->wlod_stack); stack_copy(&data1->wlod_stack, &tmp_data1->wlod_stack); } if (data2 && data2 != data1) { stack_zero(&tmp_data2->wlod_stack); stack_copy(&data2->wlod_stack, &tmp_data2->wlod_stack); } mtx_unlock_spin(&w_mtx); sbuf_printf(sb, "\nLock order reversal between \"%s\"(%s) and \"%s\"(%s)!\n", tmp_w1->w_name, tmp_w1->w_class->lc_name, tmp_w2->w_name, tmp_w2->w_class->lc_name); if (data1) { sbuf_printf(sb, "Lock order \"%s\"(%s) -> \"%s\"(%s) first seen at:\n", tmp_w1->w_name, tmp_w1->w_class->lc_name, tmp_w2->w_name, tmp_w2->w_class->lc_name); stack_sbuf_print(sb, &tmp_data1->wlod_stack); sbuf_printf(sb, "\n"); } if (data2 && data2 != data1) { sbuf_printf(sb, "Lock order \"%s\"(%s) -> \"%s\"(%s) first seen at:\n", tmp_w2->w_name, tmp_w2->w_class->lc_name, tmp_w1->w_name, tmp_w1->w_class->lc_name); stack_sbuf_print(sb, &tmp_data2->wlod_stack); sbuf_printf(sb, "\n"); } } } mtx_lock_spin(&w_mtx); if (generation != w_generation) { mtx_unlock_spin(&w_mtx); /* * The graph changed while we were printing stack data, * try again. */ req->oldidx = 0; sbuf_clear(sb); goto restart; } mtx_unlock_spin(&w_mtx); /* Free temporary storage space. */ free(tmp_data1, M_TEMP); free(tmp_data2, M_TEMP); free(tmp_w1, M_TEMP); free(tmp_w2, M_TEMP); sbuf_finish(sb); error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1); sbuf_delete(sb); return (error); } static int sysctl_debug_witness_channel(SYSCTL_HANDLER_ARGS) { static const struct { enum witness_channel channel; const char *name; } channels[] = { { WITNESS_CONSOLE, "console" }, { WITNESS_LOG, "log" }, { WITNESS_NONE, "none" }, }; char buf[16]; u_int i; int error; buf[0] = '\0'; for (i = 0; i < nitems(channels); i++) if (witness_channel == channels[i].channel) { snprintf(buf, sizeof(buf), "%s", channels[i].name); break; } error = sysctl_handle_string(oidp, buf, sizeof(buf), req); if (error != 0 || req->newptr == NULL) return (error); error = EINVAL; for (i = 0; i < nitems(channels); i++) if (strcmp(channels[i].name, buf) == 0) { witness_channel = channels[i].channel; error = 0; break; } return (error); } static int sysctl_debug_witness_fullgraph(SYSCTL_HANDLER_ARGS) { struct witness *w; struct sbuf *sb; int error; if (witness_watch < 1) { error = SYSCTL_OUT(req, w_notrunning, sizeof(w_notrunning)); return (error); } if (witness_cold) { error = SYSCTL_OUT(req, w_stillcold, sizeof(w_stillcold)); return (error); } error = 0; error = sysctl_wire_old_buffer(req, 0); if (error != 0) return (error); sb = sbuf_new_for_sysctl(NULL, NULL, FULLGRAPH_SBUF_SIZE, req); if (sb == NULL) return (ENOMEM); sbuf_printf(sb, "\n"); mtx_lock_spin(&w_mtx); STAILQ_FOREACH(w, &w_all, w_list) w->w_displayed = 0; STAILQ_FOREACH(w, &w_all, w_list) witness_add_fullgraph(sb, w); mtx_unlock_spin(&w_mtx); /* * Close the sbuf and return to userland. 
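 *
 * sbuf_new_for_sysctl() ties the sbuf drain to the sysctl request, so
 * sbuf_finish() both flushes any buffered bytes to userland and
 * returns the copyout status.  The same pattern suits any
 * variable-size sysctl; a minimal sketch with a hypothetical handler:
 *
 *	static int
 *	sysctl_foo(SYSCTL_HANDLER_ARGS)
 *	{
 *		struct sbuf *sb;
 *		int error;
 *
 *		error = sysctl_wire_old_buffer(req, 0);
 *		if (error != 0)
 *			return (error);
 *		sb = sbuf_new_for_sysctl(NULL, NULL, 128, req);
 *		if (sb == NULL)
 *			return (ENOMEM);
 *		sbuf_printf(sb, "value=%d\n", 42);
 *		error = sbuf_finish(sb);
 *		sbuf_delete(sb);
 *		return (error);
 *	}
 *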
*/ error = sbuf_finish(sb); sbuf_delete(sb); return (error); } static int sysctl_debug_witness_watch(SYSCTL_HANDLER_ARGS) { int error, value; value = witness_watch; error = sysctl_handle_int(oidp, &value, 0, req); if (error != 0 || req->newptr == NULL) return (error); if (value > 1 || value < -1 || (witness_watch == -1 && value != witness_watch)) return (EINVAL); witness_watch = value; return (0); } static void witness_add_fullgraph(struct sbuf *sb, struct witness *w) { int i; if (w->w_displayed != 0 || (w->w_file == NULL && w->w_line == 0)) return; w->w_displayed = 1; WITNESS_INDEX_ASSERT(w->w_index); for (i = 1; i <= w_max_used_index; i++) { if (w_rmatrix[w->w_index][i] & WITNESS_PARENT) { sbuf_printf(sb, "\"%s\",\"%s\"\n", w->w_name, w_data[i].w_name); witness_add_fullgraph(sb, &w_data[i]); } } } /* * A simple hash function. Takes a key pointer and a key size. If size == 0, * interprets the key as a string and reads until the null * terminator. Otherwise, reads the first size bytes. Returns an unsigned 32-bit * hash value computed from the key. */ static uint32_t witness_hash_djb2(const uint8_t *key, uint32_t size) { unsigned int hash = 5381; int i; /* hash = hash * 33 + key[i] */ if (size) for (i = 0; i < size; i++) hash = ((hash << 5) + hash) + (unsigned int)key[i]; else for (i = 0; key[i] != 0; i++) hash = ((hash << 5) + hash) + (unsigned int)key[i]; return (hash); } /* * Initializes the two witness hash tables. Called exactly once from * witness_initialize(). */ static void witness_init_hash_tables(void) { int i; MPASS(witness_cold); /* Initialize the hash tables. */ for (i = 0; i < WITNESS_HASH_SIZE; i++) w_hash.wh_array[i] = NULL; w_hash.wh_size = WITNESS_HASH_SIZE; w_hash.wh_count = 0; /* Initialize the lock order data hash. */ w_lofree = NULL; for (i = 0; i < WITNESS_LO_DATA_COUNT; i++) { memset(&w_lodata[i], 0, sizeof(w_lodata[i])); w_lodata[i].wlod_next = w_lofree; w_lofree = &w_lodata[i]; } w_lohash.wloh_size = WITNESS_LO_HASH_SIZE; w_lohash.wloh_count = 0; for (i = 0; i < WITNESS_LO_HASH_SIZE; i++) w_lohash.wloh_array[i] = NULL; } static struct witness * witness_hash_get(const char *key) { struct witness *w; uint32_t hash; MPASS(key != NULL); if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); hash = witness_hash_djb2(key, 0) % w_hash.wh_size; w = w_hash.wh_array[hash]; while (w != NULL) { if (strcmp(w->w_name, key) == 0) goto out; w = w->w_hash_next; } out: return (w); } static void witness_hash_put(struct witness *w) { uint32_t hash; MPASS(w != NULL); MPASS(w->w_name != NULL); if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); KASSERT(witness_hash_get(w->w_name) == NULL, ("%s: trying to add a hash entry that already exists!", __func__)); KASSERT(w->w_hash_next == NULL, ("%s: w->w_hash_next != NULL", __func__)); hash = witness_hash_djb2(w->w_name, 0) % w_hash.wh_size; w->w_hash_next = w_hash.wh_array[hash]; w_hash.wh_array[hash] = w; w_hash.wh_count++; } static struct witness_lock_order_data * witness_lock_order_get(struct witness *parent, struct witness *child) { struct witness_lock_order_data *data = NULL; struct witness_lock_order_key key; unsigned int hash; MPASS(parent != NULL && child != NULL); key.from = parent->w_index; key.to = child->w_index; WITNESS_INDEX_ASSERT(key.from); WITNESS_INDEX_ASSERT(key.to); if ((w_rmatrix[parent->w_index][child->w_index] & WITNESS_LOCK_ORDER_KNOWN) == 0) goto out; hash = witness_hash_djb2((const char*)&key, sizeof(key)) % w_lohash.wloh_size; data = w_lohash.wloh_array[hash]; while (data != NULL) { if 
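/*
 * (The lock-order table is keyed by the (from, to) witness index pair,
 * hashed with witness_hash_djb2() above: h = h * 33 + byte, seeded
 * with 5381.  Worked example on the two-byte string key "ab":
 * h = 5381 * 33 + 'a' = 177670, then h = 177670 * 33 + 'b' = 5863208.)
 */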
(witness_lock_order_key_equal(&data->wlod_key, &key)) break; data = data->wlod_next; } out: return (data); } /* * Verify that parent and child have a known relationship, are not the same, * and child is actually a child of parent. This is done without w_mtx * to avoid contention in the common case. */ static int witness_lock_order_check(struct witness *parent, struct witness *child) { if (parent != child && w_rmatrix[parent->w_index][child->w_index] & WITNESS_LOCK_ORDER_KNOWN && isitmychild(parent, child)) return (1); return (0); } static int witness_lock_order_add(struct witness *parent, struct witness *child) { struct witness_lock_order_data *data = NULL; struct witness_lock_order_key key; unsigned int hash; MPASS(parent != NULL && child != NULL); key.from = parent->w_index; key.to = child->w_index; WITNESS_INDEX_ASSERT(key.from); WITNESS_INDEX_ASSERT(key.to); if (w_rmatrix[parent->w_index][child->w_index] & WITNESS_LOCK_ORDER_KNOWN) return (1); hash = witness_hash_djb2((const char*)&key, sizeof(key)) % w_lohash.wloh_size; w_rmatrix[parent->w_index][child->w_index] |= WITNESS_LOCK_ORDER_KNOWN; data = w_lofree; if (data == NULL) return (0); w_lofree = data->wlod_next; data->wlod_next = w_lohash.wloh_array[hash]; data->wlod_key = key; w_lohash.wloh_array[hash] = data; w_lohash.wloh_count++; stack_zero(&data->wlod_stack); stack_save(&data->wlod_stack); return (1); } /* Call this whenever the structure of the witness graph changes. */ static void witness_increment_graph_generation(void) { if (witness_cold == 0) mtx_assert(&w_mtx, MA_OWNED); w_generation++; } static int witness_output_drain(void *arg __unused, const char *data, int len) { witness_output("%.*s", len, data); return (len); } static void witness_debugger(int cond, const char *msg) { char buf[32]; struct sbuf sb; struct stack st; if (!cond) return; if (witness_trace) { sbuf_new(&sb, buf, sizeof(buf), SBUF_FIXEDLEN); sbuf_set_drain(&sb, witness_output_drain, NULL); stack_zero(&st); stack_save(&st); witness_output("stack backtrace:\n"); stack_sbuf_print_ddb(&sb, &st); sbuf_finish(&sb); } #ifdef KDB if (witness_kdb) kdb_enter(KDB_WHY_WITNESS, msg); #endif } Index: projects/clang390-import/sys/kern/uipc_syscalls.c =================================================================== --- projects/clang390-import/sys/kern/uipc_syscalls.c (revision 305686) +++ projects/clang390-import/sys/kern/uipc_syscalls.c (revision 305687) @@ -1,1755 +1,1575 @@ /*- * Copyright (c) 1982, 1986, 1989, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)uipc_syscalls.c 8.4 (Berkeley) 2/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_capsicum.h" #include "opt_inet.h" #include "opt_inet6.h" #include "opt_compat.h" #include "opt_ktrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef KTRACE #include #endif #ifdef COMPAT_FREEBSD32 #include #endif #include #include #include /* * Flags for accept1() and kern_accept4(), in addition to SOCK_CLOEXEC * and SOCK_NONBLOCK. */ #define ACCEPT4_INHERIT 0x1 #define ACCEPT4_COMPAT 0x2 static int sendit(struct thread *td, int s, struct msghdr *mp, int flags); static int recvit(struct thread *td, int s, struct msghdr *mp, void *namelenp); static int accept1(struct thread *td, int s, struct sockaddr *uname, socklen_t *anamelen, int flags); static int getsockname1(struct thread *td, struct getsockname_args *uap, int compat); static int getpeername1(struct thread *td, struct getpeername_args *uap, int compat); static int sockargs(struct mbuf **, char *, socklen_t, int); /* * Convert a user file descriptor to a kernel file entry and check if required * capability rights are present. * A reference on the file entry is held upon returning. */ int getsock_cap(struct thread *td, int fd, cap_rights_t *rightsp, struct file **fpp, u_int *fflagp) { struct file *fp; int error; error = fget_unlocked(td->td_proc->p_fd, fd, rightsp, &fp, NULL); if (error != 0) return (error); if (fp->f_type != DTYPE_SOCKET) { fdrop(fp, td); return (ENOTSOCK); } if (fflagp != NULL) *fflagp = fp->f_flag; *fpp = fp; return (0); } /* * System call interface to the socket abstraction. */ #if defined(COMPAT_43) #define COMPAT_OLDSOCK #endif int -sys_socket(td, uap) - struct thread *td; - struct socket_args /* { - int domain; - int type; - int protocol; - } */ *uap; +sys_socket(struct thread *td, struct socket_args *uap) { struct socket *so; struct file *fp; int fd, error, type, oflag, fflag; AUDIT_ARG_SOCKET(uap->domain, uap->type, uap->protocol); type = uap->type; oflag = 0; fflag = 0; if ((type & SOCK_CLOEXEC) != 0) { type &= ~SOCK_CLOEXEC; oflag |= O_CLOEXEC; } if ((type & SOCK_NONBLOCK) != 0) { type &= ~SOCK_NONBLOCK; fflag |= FNONBLOCK; } #ifdef MAC error = mac_socket_check_create(td->td_ucred, uap->domain, type, uap->protocol); if (error != 0) return (error); #endif error = falloc(td, &fp, &fd, oflag); if (error != 0) return (error); /* An extra reference on `fp' has been held for us by falloc(). 
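 *
 * (The descriptor table slot holds the other reference.  The idiom,
 * used again in kern_accept4() and kern_socketpair() below, is:
 * falloc() the file and descriptor, create the underlying object,
 * finit() the file, publish the descriptor via td_retval, and fdrop()
 * this thread's transient reference.  In outline, error handling
 * trimmed:
 *
 *	error = falloc(td, &fp, &fd, oflag);
 *	error = socreate(domain, &so, type, protocol, td->td_ucred, td);
 *	finit(fp, FREAD | FWRITE, DTYPE_SOCKET, so, &socketops);
 *	td->td_retval[0] = fd;
 *	fdrop(fp, td);
 *
 * with fdclose() instead of finit() on the error path.)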
*/ error = socreate(uap->domain, &so, type, uap->protocol, td->td_ucred, td); if (error != 0) { fdclose(td, fp, fd); } else { finit(fp, FREAD | FWRITE | fflag, DTYPE_SOCKET, so, &socketops); if ((fflag & FNONBLOCK) != 0) (void) fo_ioctl(fp, FIONBIO, &fflag, td->td_ucred, td); td->td_retval[0] = fd; } fdrop(fp, td); return (error); } -/* ARGSUSED */ int -sys_bind(td, uap) - struct thread *td; - struct bind_args /* { - int s; - caddr_t name; - int namelen; - } */ *uap; +sys_bind(struct thread *td, struct bind_args *uap) { struct sockaddr *sa; int error; error = getsockaddr(&sa, uap->name, uap->namelen); if (error == 0) { error = kern_bindat(td, AT_FDCWD, uap->s, sa); free(sa, M_SONAME); } return (error); } int kern_bindat(struct thread *td, int dirfd, int fd, struct sockaddr *sa) { struct socket *so; struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(fd); AUDIT_ARG_SOCKADDR(td, dirfd, sa); error = getsock_cap(td, fd, cap_rights_init(&rights, CAP_BIND), &fp, NULL); if (error != 0) return (error); so = fp->f_data; #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(sa); #endif #ifdef MAC error = mac_socket_check_bind(td->td_ucred, so, sa); if (error == 0) { #endif if (dirfd == AT_FDCWD) error = sobind(so, sa, td); else error = sobindat(dirfd, so, sa, td); #ifdef MAC } #endif fdrop(fp, td); return (error); } -/* ARGSUSED */ int -sys_bindat(td, uap) - struct thread *td; - struct bindat_args /* { - int fd; - int s; - caddr_t name; - int namelen; - } */ *uap; +sys_bindat(struct thread *td, struct bindat_args *uap) { struct sockaddr *sa; int error; error = getsockaddr(&sa, uap->name, uap->namelen); if (error == 0) { error = kern_bindat(td, uap->fd, uap->s, sa); free(sa, M_SONAME); } return (error); } -/* ARGSUSED */ int -sys_listen(td, uap) - struct thread *td; - struct listen_args /* { - int s; - int backlog; - } */ *uap; +sys_listen(struct thread *td, struct listen_args *uap) { struct socket *so; struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->s); error = getsock_cap(td, uap->s, cap_rights_init(&rights, CAP_LISTEN), &fp, NULL); if (error == 0) { so = fp->f_data; #ifdef MAC error = mac_socket_check_listen(td->td_ucred, so); if (error == 0) #endif error = solisten(so, uap->backlog, td); fdrop(fp, td); } return(error); } /* * accept1() */ static int accept1(td, s, uname, anamelen, flags) struct thread *td; int s; struct sockaddr *uname; socklen_t *anamelen; int flags; { struct sockaddr *name; socklen_t namelen; struct file *fp; int error; if (uname == NULL) return (kern_accept4(td, s, NULL, NULL, flags, NULL)); error = copyin(anamelen, &namelen, sizeof (namelen)); if (error != 0) return (error); error = kern_accept4(td, s, &name, &namelen, flags, &fp); if (error != 0) return (error); if (error == 0 && uname != NULL) { #ifdef COMPAT_OLDSOCK if (flags & ACCEPT4_COMPAT) ((struct osockaddr *)name)->sa_family = name->sa_family; #endif error = copyout(name, uname, namelen); } if (error == 0) error = copyout(&namelen, anamelen, sizeof(namelen)); if (error != 0) fdclose(td, fp, td->td_retval[0]); fdrop(fp, td); free(name, M_SONAME); return (error); } int kern_accept(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, struct file **fp) { return (kern_accept4(td, s, name, namelen, ACCEPT4_INHERIT, fp)); } int kern_accept4(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, int flags, struct file **fp) { struct file *headfp, *nfp = NULL; struct sockaddr *sa = NULL; struct socket *head, *so; cap_rights_t rights; u_int fflag; pid_t pgid; int 
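/*
 * kern_accept4() backs both accept(2) and accept4(2); plain accept()
 * passes ACCEPT4_INHERIT so the child socket inherits the listener's
 * nonblocking/async state and signal ownership, while accept4() lets
 * userland choose, e.g. (userland sketch):
 *
 *	int fd = accept4(s, NULL, NULL, SOCK_CLOEXEC | SOCK_NONBLOCK);
 */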
error, fd, tmp; if (name != NULL) *name = NULL; AUDIT_ARG_FD(s); error = getsock_cap(td, s, cap_rights_init(&rights, CAP_ACCEPT), &headfp, &fflag); if (error != 0) return (error); head = headfp->f_data; if ((head->so_options & SO_ACCEPTCONN) == 0) { error = EINVAL; goto done; } #ifdef MAC error = mac_socket_check_accept(td->td_ucred, head); if (error != 0) goto done; #endif error = falloc(td, &nfp, &fd, (flags & SOCK_CLOEXEC) ? O_CLOEXEC : 0); if (error != 0) goto done; ACCEPT_LOCK(); if ((head->so_state & SS_NBIO) && TAILQ_EMPTY(&head->so_comp)) { ACCEPT_UNLOCK(); error = EWOULDBLOCK; goto noconnection; } while (TAILQ_EMPTY(&head->so_comp) && head->so_error == 0) { if (head->so_rcv.sb_state & SBS_CANTRCVMORE) { head->so_error = ECONNABORTED; break; } error = msleep(&head->so_timeo, &accept_mtx, PSOCK | PCATCH, "accept", 0); if (error != 0) { ACCEPT_UNLOCK(); goto noconnection; } } if (head->so_error) { error = head->so_error; head->so_error = 0; ACCEPT_UNLOCK(); goto noconnection; } so = TAILQ_FIRST(&head->so_comp); KASSERT(!(so->so_qstate & SQ_INCOMP), ("accept1: so SQ_INCOMP")); KASSERT(so->so_qstate & SQ_COMP, ("accept1: so not SQ_COMP")); /* * Before changing the flags on the socket, we have to bump the * reference count. Otherwise, if the protocol calls sofree(), * the socket will be released due to a zero refcount. */ SOCK_LOCK(so); /* soref() and so_state update */ soref(so); /* file descriptor reference */ TAILQ_REMOVE(&head->so_comp, so, so_list); head->so_qlen--; if (flags & ACCEPT4_INHERIT) so->so_state |= (head->so_state & SS_NBIO); else so->so_state |= (flags & SOCK_NONBLOCK) ? SS_NBIO : 0; so->so_qstate &= ~SQ_COMP; so->so_head = NULL; SOCK_UNLOCK(so); ACCEPT_UNLOCK(); /* An extra reference on `nfp' has been held for us by falloc(). */ td->td_retval[0] = fd; /* connection has been removed from the listen queue */ KNOTE_UNLOCKED(&head->so_rcv.sb_sel.si_note, 0); if (flags & ACCEPT4_INHERIT) { pgid = fgetown(&head->so_sigio); if (pgid != 0) fsetown(pgid, &so->so_sigio); } else { fflag &= ~(FNONBLOCK | FASYNC); if (flags & SOCK_NONBLOCK) fflag |= FNONBLOCK; } finit(nfp, fflag, DTYPE_SOCKET, so, &socketops); /* Sync socket nonblocking/async state with file flags */ tmp = fflag & FNONBLOCK; (void) fo_ioctl(nfp, FIONBIO, &tmp, td->td_ucred, td); tmp = fflag & FASYNC; (void) fo_ioctl(nfp, FIOASYNC, &tmp, td->td_ucred, td); sa = NULL; error = soaccept(so, &sa); if (error != 0) goto noconnection; if (sa == NULL) { if (name) *namelen = 0; goto done; } AUDIT_ARG_SOCKADDR(td, AT_FDCWD, sa); if (name) { /* check sa_len before it is destroyed */ if (*namelen > sa->sa_len) *namelen = sa->sa_len; #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(sa); #endif *name = sa; sa = NULL; } noconnection: free(sa, M_SONAME); /* * close the new descriptor, assuming someone hasn't ripped it * out from under us. */ if (error != 0) fdclose(td, nfp, fd); /* * Release explicitly held references before returning. We return * a reference on nfp to the caller on success if they request it. 
*/ done: if (fp != NULL) { if (error == 0) { *fp = nfp; nfp = NULL; } else *fp = NULL; } if (nfp != NULL) fdrop(nfp, td); fdrop(headfp, td); return (error); } int sys_accept(td, uap) struct thread *td; struct accept_args *uap; { return (accept1(td, uap->s, uap->name, uap->anamelen, ACCEPT4_INHERIT)); } int sys_accept4(td, uap) struct thread *td; struct accept4_args *uap; { if (uap->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK)) return (EINVAL); return (accept1(td, uap->s, uap->name, uap->anamelen, uap->flags)); } #ifdef COMPAT_OLDSOCK int oaccept(td, uap) struct thread *td; struct accept_args *uap; { return (accept1(td, uap->s, uap->name, uap->anamelen, ACCEPT4_INHERIT | ACCEPT4_COMPAT)); } #endif /* COMPAT_OLDSOCK */ -/* ARGSUSED */ int -sys_connect(td, uap) - struct thread *td; - struct connect_args /* { - int s; - caddr_t name; - int namelen; - } */ *uap; +sys_connect(struct thread *td, struct connect_args *uap) { struct sockaddr *sa; int error; error = getsockaddr(&sa, uap->name, uap->namelen); if (error == 0) { error = kern_connectat(td, AT_FDCWD, uap->s, sa); free(sa, M_SONAME); } return (error); } int kern_connectat(struct thread *td, int dirfd, int fd, struct sockaddr *sa) { struct socket *so; struct file *fp; cap_rights_t rights; int error, interrupted = 0; AUDIT_ARG_FD(fd); AUDIT_ARG_SOCKADDR(td, dirfd, sa); error = getsock_cap(td, fd, cap_rights_init(&rights, CAP_CONNECT), &fp, NULL); if (error != 0) return (error); so = fp->f_data; if (so->so_state & SS_ISCONNECTING) { error = EALREADY; goto done1; } #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(sa); #endif #ifdef MAC error = mac_socket_check_connect(td->td_ucred, so, sa); if (error != 0) goto bad; #endif if (dirfd == AT_FDCWD) error = soconnect(so, sa, td); else error = soconnectat(dirfd, so, sa, td); if (error != 0) goto bad; if ((so->so_state & SS_NBIO) && (so->so_state & SS_ISCONNECTING)) { error = EINPROGRESS; goto done1; } SOCK_LOCK(so); while ((so->so_state & SS_ISCONNECTING) && so->so_error == 0) { error = msleep(&so->so_timeo, SOCK_MTX(so), PSOCK | PCATCH, "connec", 0); if (error != 0) { if (error == EINTR || error == ERESTART) interrupted = 1; break; } } if (error == 0) { error = so->so_error; so->so_error = 0; } SOCK_UNLOCK(so); bad: if (!interrupted) so->so_state &= ~SS_ISCONNECTING; if (error == ERESTART) error = EINTR; done1: fdrop(fp, td); return (error); } -/* ARGSUSED */ int -sys_connectat(td, uap) - struct thread *td; - struct connectat_args /* { - int fd; - int s; - caddr_t name; - int namelen; - } */ *uap; +sys_connectat(struct thread *td, struct connectat_args *uap) { struct sockaddr *sa; int error; error = getsockaddr(&sa, uap->name, uap->namelen); if (error == 0) { error = kern_connectat(td, uap->fd, uap->s, sa); free(sa, M_SONAME); } return (error); } int kern_socketpair(struct thread *td, int domain, int type, int protocol, int *rsv) { struct file *fp1, *fp2; struct socket *so1, *so2; int fd, error, oflag, fflag; AUDIT_ARG_SOCKET(domain, type, protocol); oflag = 0; fflag = 0; if ((type & SOCK_CLOEXEC) != 0) { type &= ~SOCK_CLOEXEC; oflag |= O_CLOEXEC; } if ((type & SOCK_NONBLOCK) != 0) { type &= ~SOCK_NONBLOCK; fflag |= FNONBLOCK; } #ifdef MAC /* We might want to have a separate check for socket pairs. 
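 *
 * (This function backs socketpair(2).  A typical userland call, for
 * reference:
 *
 *	int sv[2];
 *
 *	if (socketpair(PF_LOCAL, SOCK_STREAM, 0, sv) == -1)
 *		err(1, "socketpair");
 *
 * Both sockets are created below, then cross-connected with
 * soconnect2().)
 *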
*/ error = mac_socket_check_create(td->td_ucred, domain, type, protocol); if (error != 0) return (error); #endif error = socreate(domain, &so1, type, protocol, td->td_ucred, td); if (error != 0) return (error); error = socreate(domain, &so2, type, protocol, td->td_ucred, td); if (error != 0) goto free1; /* On success extra reference to `fp1' and 'fp2' is set by falloc. */ error = falloc(td, &fp1, &fd, oflag); if (error != 0) goto free2; rsv[0] = fd; fp1->f_data = so1; /* so1 already has ref count */ error = falloc(td, &fp2, &fd, oflag); if (error != 0) goto free3; fp2->f_data = so2; /* so2 already has ref count */ rsv[1] = fd; error = soconnect2(so1, so2); if (error != 0) goto free4; if (type == SOCK_DGRAM) { /* * Datagram socket connection is asymmetric. */ error = soconnect2(so2, so1); if (error != 0) goto free4; } finit(fp1, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp1->f_data, &socketops); finit(fp2, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp2->f_data, &socketops); if ((fflag & FNONBLOCK) != 0) { (void) fo_ioctl(fp1, FIONBIO, &fflag, td->td_ucred, td); (void) fo_ioctl(fp2, FIONBIO, &fflag, td->td_ucred, td); } fdrop(fp1, td); fdrop(fp2, td); return (0); free4: fdclose(td, fp2, rsv[1]); fdrop(fp2, td); free3: fdclose(td, fp1, rsv[0]); fdrop(fp1, td); free2: if (so2 != NULL) (void)soclose(so2); free1: if (so1 != NULL) (void)soclose(so1); return (error); } int sys_socketpair(struct thread *td, struct socketpair_args *uap) { int error, sv[2]; error = kern_socketpair(td, uap->domain, uap->type, uap->protocol, sv); if (error != 0) return (error); error = copyout(sv, uap->rsv, 2 * sizeof(int)); if (error != 0) { (void)kern_close(td, sv[0]); (void)kern_close(td, sv[1]); } return (error); } static int -sendit(td, s, mp, flags) - struct thread *td; - int s; - struct msghdr *mp; - int flags; +sendit(struct thread *td, int s, struct msghdr *mp, int flags) { struct mbuf *control; struct sockaddr *to; int error; #ifdef CAPABILITY_MODE if (IN_CAPABILITY_MODE(td) && (mp->msg_name != NULL)) return (ECAPMODE); #endif if (mp->msg_name != NULL) { error = getsockaddr(&to, mp->msg_name, mp->msg_namelen); if (error != 0) { to = NULL; goto bad; } mp->msg_name = to; } else { to = NULL; } if (mp->msg_control) { if (mp->msg_controllen < sizeof(struct cmsghdr) #ifdef COMPAT_OLDSOCK && mp->msg_flags != MSG_COMPAT #endif ) { error = EINVAL; goto bad; } error = sockargs(&control, mp->msg_control, mp->msg_controllen, MT_CONTROL); if (error != 0) goto bad; #ifdef COMPAT_OLDSOCK if (mp->msg_flags == MSG_COMPAT) { struct cmsghdr *cm; M_PREPEND(control, sizeof(*cm), M_WAITOK); cm = mtod(control, struct cmsghdr *); cm->cmsg_len = control->m_len; cm->cmsg_level = SOL_SOCKET; cm->cmsg_type = SCM_RIGHTS; } #endif } else { control = NULL; } error = kern_sendit(td, s, mp, flags, control, UIO_USERSPACE); bad: free(to, M_SONAME); return (error); } int -kern_sendit(td, s, mp, flags, control, segflg) - struct thread *td; - int s; - struct msghdr *mp; - int flags; - struct mbuf *control; - enum uio_seg segflg; +kern_sendit(struct thread *td, int s, struct msghdr *mp, int flags, + struct mbuf *control, enum uio_seg segflg) { struct file *fp; struct uio auio; struct iovec *iov; struct socket *so; cap_rights_t rights; #ifdef KTRACE struct uio *ktruio = NULL; #endif ssize_t len; int i, error; AUDIT_ARG_FD(s); cap_rights_init(&rights, CAP_SEND); if (mp->msg_name != NULL) { AUDIT_ARG_SOCKADDR(td, AT_FDCWD, mp->msg_name); cap_rights_set(&rights, CAP_CONNECT); } error = getsock_cap(td, s, &rights, &fp, NULL); if (error != 0) return (error); so 
= (struct socket *)fp->f_data; #ifdef KTRACE if (mp->msg_name != NULL && KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(mp->msg_name); #endif #ifdef MAC if (mp->msg_name != NULL) { error = mac_socket_check_connect(td->td_ucred, so, mp->msg_name); if (error != 0) goto bad; } error = mac_socket_check_send(td->td_ucred, so); if (error != 0) goto bad; #endif auio.uio_iov = mp->msg_iov; auio.uio_iovcnt = mp->msg_iovlen; auio.uio_segflg = segflg; auio.uio_rw = UIO_WRITE; auio.uio_td = td; auio.uio_offset = 0; /* XXX */ auio.uio_resid = 0; iov = mp->msg_iov; for (i = 0; i < mp->msg_iovlen; i++, iov++) { if ((auio.uio_resid += iov->iov_len) < 0) { error = EINVAL; goto bad; } } #ifdef KTRACE if (KTRPOINT(td, KTR_GENIO)) ktruio = cloneuio(&auio); #endif len = auio.uio_resid; error = sosend(so, mp->msg_name, &auio, 0, control, flags, td); if (error != 0) { if (auio.uio_resid != len && (error == ERESTART || error == EINTR || error == EWOULDBLOCK)) error = 0; /* Generation of SIGPIPE can be controlled per socket */ if (error == EPIPE && !(so->so_options & SO_NOSIGPIPE) && !(flags & MSG_NOSIGNAL)) { PROC_LOCK(td->td_proc); tdsignal(td, SIGPIPE); PROC_UNLOCK(td->td_proc); } } if (error == 0) td->td_retval[0] = len - auio.uio_resid; #ifdef KTRACE if (ktruio != NULL) { ktruio->uio_resid = td->td_retval[0]; ktrgenio(s, UIO_WRITE, ktruio, error); } #endif bad: fdrop(fp, td); return (error); } int -sys_sendto(td, uap) - struct thread *td; - struct sendto_args /* { - int s; - caddr_t buf; - size_t len; - int flags; - caddr_t to; - int tolen; - } */ *uap; +sys_sendto(struct thread *td, struct sendto_args *uap) { struct msghdr msg; struct iovec aiov; msg.msg_name = uap->to; msg.msg_namelen = uap->tolen; msg.msg_iov = &aiov; msg.msg_iovlen = 1; msg.msg_control = 0; #ifdef COMPAT_OLDSOCK msg.msg_flags = 0; #endif aiov.iov_base = uap->buf; aiov.iov_len = uap->len; return (sendit(td, uap->s, &msg, uap->flags)); } #ifdef COMPAT_OLDSOCK int -osend(td, uap) - struct thread *td; - struct osend_args /* { - int s; - caddr_t buf; - int len; - int flags; - } */ *uap; +osend(struct thread *td, struct osend_args *uap) { struct msghdr msg; struct iovec aiov; msg.msg_name = 0; msg.msg_namelen = 0; msg.msg_iov = &aiov; msg.msg_iovlen = 1; aiov.iov_base = uap->buf; aiov.iov_len = uap->len; msg.msg_control = 0; msg.msg_flags = 0; return (sendit(td, uap->s, &msg, uap->flags)); } int -osendmsg(td, uap) - struct thread *td; - struct osendmsg_args /* { - int s; - caddr_t msg; - int flags; - } */ *uap; +osendmsg(struct thread *td, struct osendmsg_args *uap) { struct msghdr msg; struct iovec *iov; int error; error = copyin(uap->msg, &msg, sizeof (struct omsghdr)); if (error != 0) return (error); error = copyiniov(msg.msg_iov, msg.msg_iovlen, &iov, EMSGSIZE); if (error != 0) return (error); msg.msg_iov = iov; msg.msg_flags = MSG_COMPAT; error = sendit(td, uap->s, &msg, uap->flags); free(iov, M_IOV); return (error); } #endif int -sys_sendmsg(td, uap) - struct thread *td; - struct sendmsg_args /* { - int s; - caddr_t msg; - int flags; - } */ *uap; +sys_sendmsg(struct thread *td, struct sendmsg_args *uap) { struct msghdr msg; struct iovec *iov; int error; error = copyin(uap->msg, &msg, sizeof (msg)); if (error != 0) return (error); error = copyiniov(msg.msg_iov, msg.msg_iovlen, &iov, EMSGSIZE); if (error != 0) return (error); msg.msg_iov = iov; #ifdef COMPAT_OLDSOCK msg.msg_flags = 0; #endif error = sendit(td, uap->s, &msg, uap->flags); free(iov, M_IOV); return (error); } int -kern_recvit(td, s, mp, fromseg, controlp) - struct thread *td; - int s; - 
struct msghdr *mp; - enum uio_seg fromseg; - struct mbuf **controlp; +kern_recvit(struct thread *td, int s, struct msghdr *mp, enum uio_seg fromseg, + struct mbuf **controlp) { struct uio auio; struct iovec *iov; struct mbuf *m, *control = NULL; caddr_t ctlbuf; struct file *fp; struct socket *so; struct sockaddr *fromsa = NULL; cap_rights_t rights; #ifdef KTRACE struct uio *ktruio = NULL; #endif ssize_t len; int error, i; if (controlp != NULL) *controlp = NULL; AUDIT_ARG_FD(s); error = getsock_cap(td, s, cap_rights_init(&rights, CAP_RECV), &fp, NULL); if (error != 0) return (error); so = fp->f_data; #ifdef MAC error = mac_socket_check_receive(td->td_ucred, so); if (error != 0) { fdrop(fp, td); return (error); } #endif auio.uio_iov = mp->msg_iov; auio.uio_iovcnt = mp->msg_iovlen; auio.uio_segflg = UIO_USERSPACE; auio.uio_rw = UIO_READ; auio.uio_td = td; auio.uio_offset = 0; /* XXX */ auio.uio_resid = 0; iov = mp->msg_iov; for (i = 0; i < mp->msg_iovlen; i++, iov++) { if ((auio.uio_resid += iov->iov_len) < 0) { fdrop(fp, td); return (EINVAL); } } #ifdef KTRACE if (KTRPOINT(td, KTR_GENIO)) ktruio = cloneuio(&auio); #endif len = auio.uio_resid; error = soreceive(so, &fromsa, &auio, NULL, (mp->msg_control || controlp) ? &control : NULL, &mp->msg_flags); if (error != 0) { if (auio.uio_resid != len && (error == ERESTART || error == EINTR || error == EWOULDBLOCK)) error = 0; } if (fromsa != NULL) AUDIT_ARG_SOCKADDR(td, AT_FDCWD, fromsa); #ifdef KTRACE if (ktruio != NULL) { ktruio->uio_resid = len - auio.uio_resid; ktrgenio(s, UIO_READ, ktruio, error); } #endif if (error != 0) goto out; td->td_retval[0] = len - auio.uio_resid; if (mp->msg_name) { len = mp->msg_namelen; if (len <= 0 || fromsa == NULL) len = 0; else { /* save sa_len before it is destroyed by MSG_COMPAT */ len = MIN(len, fromsa->sa_len); #ifdef COMPAT_OLDSOCK if (mp->msg_flags & MSG_COMPAT) ((struct osockaddr *)fromsa)->sa_family = fromsa->sa_family; #endif if (fromseg == UIO_USERSPACE) { error = copyout(fromsa, mp->msg_name, (unsigned)len); if (error != 0) goto out; } else bcopy(fromsa, mp->msg_name, len); } mp->msg_namelen = len; } if (mp->msg_control && controlp == NULL) { #ifdef COMPAT_OLDSOCK /* * We assume that old recvmsg calls won't receive access * rights and other control info, esp. as control info * is always optional and those options didn't exist in 4.3. * If we receive rights, trim the cmsghdr; anything else * is tossed. 
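 *
 * New-style consumers instead walk the control buffer with the
 * CMSG_*() macros; e.g. receiving a descriptor passed via SCM_RIGHTS
 * (userland sketch):
 *
 *	struct cmsghdr *cm;
 *	int fd;
 *
 *	for (cm = CMSG_FIRSTHDR(&msg); cm != NULL;
 *	    cm = CMSG_NXTHDR(&msg, cm)) {
 *		if (cm->cmsg_level == SOL_SOCKET &&
 *		    cm->cmsg_type == SCM_RIGHTS)
 *			memcpy(&fd, CMSG_DATA(cm), sizeof(fd));
 *	}
 *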
*/ if (control && mp->msg_flags & MSG_COMPAT) { if (mtod(control, struct cmsghdr *)->cmsg_level != SOL_SOCKET || mtod(control, struct cmsghdr *)->cmsg_type != SCM_RIGHTS) { mp->msg_controllen = 0; goto out; } control->m_len -= sizeof (struct cmsghdr); control->m_data += sizeof (struct cmsghdr); } #endif len = mp->msg_controllen; m = control; mp->msg_controllen = 0; ctlbuf = mp->msg_control; while (m && len > 0) { unsigned int tocopy; if (len >= m->m_len) tocopy = m->m_len; else { mp->msg_flags |= MSG_CTRUNC; tocopy = len; } if ((error = copyout(mtod(m, caddr_t), ctlbuf, tocopy)) != 0) goto out; ctlbuf += tocopy; len -= tocopy; m = m->m_next; } mp->msg_controllen = ctlbuf - (caddr_t)mp->msg_control; } out: fdrop(fp, td); #ifdef KTRACE if (fromsa && KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(fromsa); #endif free(fromsa, M_SONAME); if (error == 0 && controlp != NULL) *controlp = control; else if (control) m_freem(control); return (error); } static int -recvit(td, s, mp, namelenp) - struct thread *td; - int s; - struct msghdr *mp; - void *namelenp; +recvit(struct thread *td, int s, struct msghdr *mp, void *namelenp) { int error; error = kern_recvit(td, s, mp, UIO_USERSPACE, NULL); if (error != 0) return (error); if (namelenp != NULL) { error = copyout(&mp->msg_namelen, namelenp, sizeof (socklen_t)); #ifdef COMPAT_OLDSOCK if (mp->msg_flags & MSG_COMPAT) error = 0; /* old recvfrom didn't check */ #endif } return (error); } int -sys_recvfrom(td, uap) - struct thread *td; - struct recvfrom_args /* { - int s; - caddr_t buf; - size_t len; - int flags; - struct sockaddr * __restrict from; - socklen_t * __restrict fromlenaddr; - } */ *uap; +sys_recvfrom(struct thread *td, struct recvfrom_args *uap) { struct msghdr msg; struct iovec aiov; int error; if (uap->fromlenaddr) { error = copyin(uap->fromlenaddr, &msg.msg_namelen, sizeof (msg.msg_namelen)); if (error != 0) goto done2; } else { msg.msg_namelen = 0; } msg.msg_name = uap->from; msg.msg_iov = &aiov; msg.msg_iovlen = 1; aiov.iov_base = uap->buf; aiov.iov_len = uap->len; msg.msg_control = 0; msg.msg_flags = uap->flags; error = recvit(td, uap->s, &msg, uap->fromlenaddr); done2: return (error); } #ifdef COMPAT_OLDSOCK int -orecvfrom(td, uap) - struct thread *td; - struct recvfrom_args *uap; +orecvfrom(struct thread *td, struct recvfrom_args *uap) { uap->flags |= MSG_COMPAT; return (sys_recvfrom(td, uap)); } #endif #ifdef COMPAT_OLDSOCK int -orecv(td, uap) - struct thread *td; - struct orecv_args /* { - int s; - caddr_t buf; - int len; - int flags; - } */ *uap; +orecv(struct thread *td, struct orecv_args *uap) { struct msghdr msg; struct iovec aiov; msg.msg_name = 0; msg.msg_namelen = 0; msg.msg_iov = &aiov; msg.msg_iovlen = 1; aiov.iov_base = uap->buf; aiov.iov_len = uap->len; msg.msg_control = 0; msg.msg_flags = uap->flags; return (recvit(td, uap->s, &msg, NULL)); } /* * Old recvmsg. This code takes advantage of the fact that the old msghdr * overlays the new one, missing only the flags, and with the (old) access * rights where the control fields are now. 
*/ int -orecvmsg(td, uap) - struct thread *td; - struct orecvmsg_args /* { - int s; - struct omsghdr *msg; - int flags; - } */ *uap; +orecvmsg(struct thread *td, struct orecvmsg_args *uap) { struct msghdr msg; struct iovec *iov; int error; error = copyin(uap->msg, &msg, sizeof (struct omsghdr)); if (error != 0) return (error); error = copyiniov(msg.msg_iov, msg.msg_iovlen, &iov, EMSGSIZE); if (error != 0) return (error); msg.msg_flags = uap->flags | MSG_COMPAT; msg.msg_iov = iov; error = recvit(td, uap->s, &msg, &uap->msg->msg_namelen); if (msg.msg_controllen && error == 0) error = copyout(&msg.msg_controllen, &uap->msg->msg_accrightslen, sizeof (int)); free(iov, M_IOV); return (error); } #endif int -sys_recvmsg(td, uap) - struct thread *td; - struct recvmsg_args /* { - int s; - struct msghdr *msg; - int flags; - } */ *uap; +sys_recvmsg(struct thread *td, struct recvmsg_args *uap) { struct msghdr msg; struct iovec *uiov, *iov; int error; error = copyin(uap->msg, &msg, sizeof (msg)); if (error != 0) return (error); error = copyiniov(msg.msg_iov, msg.msg_iovlen, &iov, EMSGSIZE); if (error != 0) return (error); msg.msg_flags = uap->flags; #ifdef COMPAT_OLDSOCK msg.msg_flags &= ~MSG_COMPAT; #endif uiov = msg.msg_iov; msg.msg_iov = iov; error = recvit(td, uap->s, &msg, NULL); if (error == 0) { msg.msg_iov = uiov; error = copyout(&msg, uap->msg, sizeof(msg)); } free(iov, M_IOV); return (error); } -/* ARGSUSED */ int -sys_shutdown(td, uap) - struct thread *td; - struct shutdown_args /* { - int s; - int how; - } */ *uap; +sys_shutdown(struct thread *td, struct shutdown_args *uap) { struct socket *so; struct file *fp; cap_rights_t rights; int error; AUDIT_ARG_FD(uap->s); error = getsock_cap(td, uap->s, cap_rights_init(&rights, CAP_SHUTDOWN), &fp, NULL); if (error == 0) { so = fp->f_data; error = soshutdown(so, uap->how); /* * Previous versions did not return ENOTCONN, but 0 in * case the socket was not connected. Some important * programs like syslogd up to r279016, 2015-02-19, * still depend on this behavior. 
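 *
 * A userland caller that must work on both old and new kernels can
 * simply tolerate either result (sketch):
 *
 *	if (shutdown(s, SHUT_RDWR) == -1 && errno != ENOTCONN)
 *		err(1, "shutdown");
 *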
*/ if (error == ENOTCONN && td->td_proc->p_osrel < P_OSREL_SHUTDOWN_ENOTCONN) error = 0; fdrop(fp, td); } return (error); } -/* ARGSUSED */ int -sys_setsockopt(td, uap) - struct thread *td; - struct setsockopt_args /* { - int s; - int level; - int name; - caddr_t val; - int valsize; - } */ *uap; +sys_setsockopt(struct thread *td, struct setsockopt_args *uap) { return (kern_setsockopt(td, uap->s, uap->level, uap->name, uap->val, UIO_USERSPACE, uap->valsize)); } int -kern_setsockopt(td, s, level, name, val, valseg, valsize) - struct thread *td; - int s; - int level; - int name; - void *val; - enum uio_seg valseg; - socklen_t valsize; +kern_setsockopt(struct thread *td, int s, int level, int name, void *val, + enum uio_seg valseg, socklen_t valsize) { struct socket *so; struct file *fp; struct sockopt sopt; cap_rights_t rights; int error; if (val == NULL && valsize != 0) return (EFAULT); if ((int)valsize < 0) return (EINVAL); sopt.sopt_dir = SOPT_SET; sopt.sopt_level = level; sopt.sopt_name = name; sopt.sopt_val = val; sopt.sopt_valsize = valsize; switch (valseg) { case UIO_USERSPACE: sopt.sopt_td = td; break; case UIO_SYSSPACE: sopt.sopt_td = NULL; break; default: panic("kern_setsockopt called with bad valseg"); } AUDIT_ARG_FD(s); error = getsock_cap(td, s, cap_rights_init(&rights, CAP_SETSOCKOPT), &fp, NULL); if (error == 0) { so = fp->f_data; error = sosetopt(so, &sopt); fdrop(fp, td); } return(error); } -/* ARGSUSED */ int -sys_getsockopt(td, uap) - struct thread *td; - struct getsockopt_args /* { - int s; - int level; - int name; - void * __restrict val; - socklen_t * __restrict avalsize; - } */ *uap; +sys_getsockopt(struct thread *td, struct getsockopt_args *uap) { socklen_t valsize; int error; if (uap->val) { error = copyin(uap->avalsize, &valsize, sizeof (valsize)); if (error != 0) return (error); } error = kern_getsockopt(td, uap->s, uap->level, uap->name, uap->val, UIO_USERSPACE, &valsize); if (error == 0) error = copyout(&valsize, uap->avalsize, sizeof (valsize)); return (error); } /* * Kernel version of getsockopt. * optval can be a userland or kernel pointer. optlen is always a kernel * pointer. */ int -kern_getsockopt(td, s, level, name, val, valseg, valsize) - struct thread *td; - int s; - int level; - int name; - void *val; - enum uio_seg valseg; - socklen_t *valsize; +kern_getsockopt(struct thread *td, int s, int level, int name, void *val, + enum uio_seg valseg, socklen_t *valsize) { struct socket *so; struct file *fp; struct sockopt sopt; cap_rights_t rights; int error; if (val == NULL) *valsize = 0; if ((int)*valsize < 0) return (EINVAL); sopt.sopt_dir = SOPT_GET; sopt.sopt_level = level; sopt.sopt_name = name; sopt.sopt_val = val; sopt.sopt_valsize = (size_t)*valsize; /* checked non-negative above */ switch (valseg) { case UIO_USERSPACE: sopt.sopt_td = td; break; case UIO_SYSSPACE: sopt.sopt_td = NULL; break; default: panic("kern_getsockopt called with bad valseg"); } AUDIT_ARG_FD(s); error = getsock_cap(td, s, cap_rights_init(&rights, CAP_GETSOCKOPT), &fp, NULL); if (error == 0) { so = fp->f_data; error = sogetopt(so, &sopt); *valsize = sopt.sopt_valsize; fdrop(fp, td); } return (error); } /* * getsockname1() - Get socket name.
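 * Typical userland use of getsockname(2), which lands here, for
 * reference (sketch):
 *
 *	struct sockaddr_storage ss;
 *	socklen_t len = sizeof(ss);
 *
 *	if (getsockname(s, (struct sockaddr *)&ss, &len) == -1)
 *		err(1, "getsockname");
 *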
*/ -/* ARGSUSED */ static int -getsockname1(td, uap, compat) - struct thread *td; - struct getsockname_args /* { - int fdes; - struct sockaddr * __restrict asa; - socklen_t * __restrict alen; - } */ *uap; - int compat; +getsockname1(struct thread *td, struct getsockname_args *uap, int compat) { struct sockaddr *sa; socklen_t len; int error; error = copyin(uap->alen, &len, sizeof(len)); if (error != 0) return (error); error = kern_getsockname(td, uap->fdes, &sa, &len); if (error != 0) return (error); if (len != 0) { #ifdef COMPAT_OLDSOCK if (compat) ((struct osockaddr *)sa)->sa_family = sa->sa_family; #endif error = copyout(sa, uap->asa, (u_int)len); } free(sa, M_SONAME); if (error == 0) error = copyout(&len, uap->alen, sizeof(len)); return (error); } int kern_getsockname(struct thread *td, int fd, struct sockaddr **sa, socklen_t *alen) { struct socket *so; struct file *fp; cap_rights_t rights; socklen_t len; int error; AUDIT_ARG_FD(fd); error = getsock_cap(td, fd, cap_rights_init(&rights, CAP_GETSOCKNAME), &fp, NULL); if (error != 0) return (error); so = fp->f_data; *sa = NULL; CURVNET_SET(so->so_vnet); error = (*so->so_proto->pr_usrreqs->pru_sockaddr)(so, sa); CURVNET_RESTORE(); if (error != 0) goto bad; if (*sa == NULL) len = 0; else len = MIN(*alen, (*sa)->sa_len); *alen = len; #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(*sa); #endif bad: fdrop(fp, td); if (error != 0 && *sa != NULL) { free(*sa, M_SONAME); *sa = NULL; } return (error); } int -sys_getsockname(td, uap) - struct thread *td; - struct getsockname_args *uap; +sys_getsockname(struct thread *td, struct getsockname_args *uap) { return (getsockname1(td, uap, 0)); } #ifdef COMPAT_OLDSOCK int -ogetsockname(td, uap) - struct thread *td; - struct getsockname_args *uap; +ogetsockname(struct thread *td, struct getsockname_args *uap) { return (getsockname1(td, uap, 1)); } #endif /* COMPAT_OLDSOCK */ /* * getpeername1() - Get name of peer for connected socket. 
*/ -/* ARGSUSED */ static int -getpeername1(td, uap, compat) - struct thread *td; - struct getpeername_args /* { - int fdes; - struct sockaddr * __restrict asa; - socklen_t * __restrict alen; - } */ *uap; - int compat; +getpeername1(struct thread *td, struct getpeername_args *uap, int compat) { struct sockaddr *sa; socklen_t len; int error; error = copyin(uap->alen, &len, sizeof (len)); if (error != 0) return (error); error = kern_getpeername(td, uap->fdes, &sa, &len); if (error != 0) return (error); if (len != 0) { #ifdef COMPAT_OLDSOCK if (compat) ((struct osockaddr *)sa)->sa_family = sa->sa_family; #endif error = copyout(sa, uap->asa, (u_int)len); } free(sa, M_SONAME); if (error == 0) error = copyout(&len, uap->alen, sizeof(len)); return (error); } int kern_getpeername(struct thread *td, int fd, struct sockaddr **sa, socklen_t *alen) { struct socket *so; struct file *fp; cap_rights_t rights; socklen_t len; int error; AUDIT_ARG_FD(fd); error = getsock_cap(td, fd, cap_rights_init(&rights, CAP_GETPEERNAME), &fp, NULL); if (error != 0) return (error); so = fp->f_data; if ((so->so_state & (SS_ISCONNECTED|SS_ISCONFIRMING)) == 0) { error = ENOTCONN; goto done; } *sa = NULL; CURVNET_SET(so->so_vnet); error = (*so->so_proto->pr_usrreqs->pru_peeraddr)(so, sa); CURVNET_RESTORE(); if (error != 0) goto bad; if (*sa == NULL) len = 0; else len = MIN(*alen, (*sa)->sa_len); *alen = len; #ifdef KTRACE if (KTRPOINT(td, KTR_STRUCT)) ktrsockaddr(*sa); #endif bad: if (error != 0 && *sa != NULL) { free(*sa, M_SONAME); *sa = NULL; } done: fdrop(fp, td); return (error); } int -sys_getpeername(td, uap) - struct thread *td; - struct getpeername_args *uap; +sys_getpeername(struct thread *td, struct getpeername_args *uap) { return (getpeername1(td, uap, 0)); } #ifdef COMPAT_OLDSOCK int -ogetpeername(td, uap) - struct thread *td; - struct ogetpeername_args *uap; +ogetpeername(struct thread *td, struct ogetpeername_args *uap) { /* XXX uap should have type `getpeername_args *' to begin with. */ return (getpeername1(td, (struct getpeername_args *)uap, 1)); } #endif /* COMPAT_OLDSOCK */ static int sockargs(struct mbuf **mp, char *buf, socklen_t buflen, int type) { struct sockaddr *sa; struct mbuf *m; int error; if (buflen > MLEN) { #ifdef COMPAT_OLDSOCK if (type == MT_SONAME && buflen <= 112) buflen = MLEN; /* unix domain compat. 
hack */ else #endif if (buflen > MCLBYTES) return (EINVAL); } m = m_get2(buflen, M_WAITOK, type, 0); m->m_len = buflen; error = copyin(buf, mtod(m, void *), buflen); if (error != 0) (void) m_free(m); else { *mp = m; if (type == MT_SONAME) { sa = mtod(m, struct sockaddr *); #if defined(COMPAT_OLDSOCK) && BYTE_ORDER != BIG_ENDIAN if (sa->sa_family == 0 && sa->sa_len < AF_MAX) sa->sa_family = sa->sa_len; #endif sa->sa_len = buflen; } } return (error); } int -getsockaddr(namp, uaddr, len) - struct sockaddr **namp; - caddr_t uaddr; - size_t len; +getsockaddr(struct sockaddr **namp, caddr_t uaddr, size_t len) { struct sockaddr *sa; int error; if (len > SOCK_MAXADDRLEN) return (ENAMETOOLONG); if (len < offsetof(struct sockaddr, sa_data[0])) return (EINVAL); sa = malloc(len, M_SONAME, M_WAITOK); error = copyin(uaddr, sa, len); if (error != 0) { free(sa, M_SONAME); } else { #if defined(COMPAT_OLDSOCK) && BYTE_ORDER != BIG_ENDIAN if (sa->sa_family == 0 && sa->sa_len < AF_MAX) sa->sa_family = sa->sa_len; #endif sa->sa_len = len; *namp = sa; } return (error); } Index: projects/clang390-import/sys/kern/vfs_cache.c =================================================================== --- projects/clang390-import/sys/kern/vfs_cache.c (revision 305686) +++ projects/clang390-import/sys/kern/vfs_cache.c (revision 305687) @@ -1,1591 +1,1739 @@ /*- * Copyright (c) 1989, 1993, 1995 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * Poul-Henning Kamp of the FreeBSD Project. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)vfs_cache.c 8.5 (Berkeley) 3/22/95 */ #include __FBSDID("$FreeBSD$"); #include "opt_ktrace.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include +#include #include #include #include #include #ifdef KTRACE #include #endif #include SDT_PROVIDER_DECLARE(vfs); SDT_PROBE_DEFINE3(vfs, namecache, enter, done, "struct vnode *", "char *", "struct vnode *"); SDT_PROBE_DEFINE2(vfs, namecache, enter_negative, done, "struct vnode *", "char *"); SDT_PROBE_DEFINE1(vfs, namecache, fullpath, entry, "struct vnode *"); SDT_PROBE_DEFINE3(vfs, namecache, fullpath, hit, "struct vnode *", "char *", "struct vnode *"); SDT_PROBE_DEFINE1(vfs, namecache, fullpath, miss, "struct vnode *"); SDT_PROBE_DEFINE3(vfs, namecache, fullpath, return, "int", "struct vnode *", "char *"); SDT_PROBE_DEFINE3(vfs, namecache, lookup, hit, "struct vnode *", "char *", "struct vnode *"); SDT_PROBE_DEFINE2(vfs, namecache, lookup, hit__negative, "struct vnode *", "char *"); SDT_PROBE_DEFINE2(vfs, namecache, lookup, miss, "struct vnode *", "char *"); SDT_PROBE_DEFINE1(vfs, namecache, purge, done, "struct vnode *"); SDT_PROBE_DEFINE1(vfs, namecache, purge_negative, done, "struct vnode *"); SDT_PROBE_DEFINE1(vfs, namecache, purgevfs, done, "struct mount *"); SDT_PROBE_DEFINE3(vfs, namecache, zap, done, "struct vnode *", "char *", "struct vnode *"); SDT_PROBE_DEFINE2(vfs, namecache, zap_negative, done, "struct vnode *", "char *"); /* * This structure describes the elements in the cache of recent * names looked up by namei. */ struct namecache { LIST_ENTRY(namecache) nc_hash; /* hash chain */ LIST_ENTRY(namecache) nc_src; /* source vnode list */ TAILQ_ENTRY(namecache) nc_dst; /* destination vnode list */ struct vnode *nc_dvp; /* vnode of parent of name */ struct vnode *nc_vp; /* vnode the name refers to */ u_char nc_flag; /* flag bits */ u_char nc_nlen; /* length of name */ char nc_name[0]; /* segment name + nul */ }; /* * struct namecache_ts repeats struct namecache layout up to the * nc_nlen member. * struct namecache_ts is used in place of struct namecache when time(s) need * to be stored. The nc_dotdottime field is used when a cache entry is mapping * both a non-dotdot directory name plus dotdot for the directory's * parent. */ struct namecache_ts { LIST_ENTRY(namecache) nc_hash; /* hash chain */ LIST_ENTRY(namecache) nc_src; /* source vnode list */ TAILQ_ENTRY(namecache) nc_dst; /* destination vnode list */ struct vnode *nc_dvp; /* vnode of parent of name */ struct vnode *nc_vp; /* vnode the name refers to */ u_char nc_flag; /* flag bits */ u_char nc_nlen; /* length of name */ struct timespec nc_time; /* timespec provided by fs */ struct timespec nc_dotdottime; /* dotdot timespec provided by fs */ int nc_ticks; /* ticks value when entry was added */ char nc_name[0]; /* segment name + nul */ }; /* * Flags in namecache.nc_flag */ #define NCF_WHITE 0x01 #define NCF_ISDOTDOT 0x02 #define NCF_TS 0x04 #define NCF_DTS 0x08 #define NCF_DVDROP 0x10 /* * Name caching works as follows: * * Names found by directory scans are retained in a cache * for future reference. It is managed LRU, so frequently * used names will hang around. Cache is indexed by hash value * obtained from (vp, name) where vp refers to the directory * containing name. * * If it is a "negative" entry, (i.e. for a name that is known NOT to * exist) the vnode pointer will be NULL. 
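struct namecache_ts above deliberately repeats the struct namecache layout up to nc_nlen, so the rest of the code can pass a struct namecache pointer everywhere and downcast only when NCF_TS says the entry carries timestamps. A standalone sketch of that common-prefix pattern with hypothetical types (the cast relies on the matching prefix layout, exactly as the kernel code does):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define F_TS	0x04		/* entry carries timestamp data */

struct entry {
	unsigned char flag;
	unsigned char nlen;
	char name[];		/* segment name + NUL */
};

struct entry_ts {
	unsigned char flag;	/* prefix must match struct entry */
	unsigned char nlen;
	long ticks;		/* extra data only F_TS entries have */
	char name[];		/* name sits at a different offset */
};

/* Pick the correct name offset, as nc_get_name() does above. */
static char *
entry_name(struct entry *e)
{

	if ((e->flag & F_TS) == 0)
		return (e->name);
	return (((struct entry_ts *)e)->name);
}

int
main(void)
{
	struct entry_ts *ts;

	ts = malloc(sizeof(*ts) + sizeof("etc"));
	ts->flag = F_TS;
	ts->nlen = 3;
	ts->ticks = 42;
	strcpy(ts->name, "etc");
	printf("%s\n", entry_name((struct entry *)ts));
	free(ts);
	return (0);
}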
* * Upon reaching the last segment of a path, if the reference * is for DELETE, or NOCACHE is set (rewrite), and the * name is located in the cache, it will be dropped. + * + * These locks are used (in the order in which they can be taken): + * NAME TYPE ROLE + * cache_lock rwlock global, needed for all modifications + * bucketlock rwlock for access to given hash bucket + * ncneg_mtx mtx negative entry LRU management + * + * A name -> vnode lookup can be safely performed by either locking cache_lock + * or the relevant hash bucket. + * + * ".." and vnode -> name lookups require cache_lock. + * + * Modifications require both cache_lock and relevant bucketlock taken for + * writing. + * + * Negative entry LRU management requires ncneg_mtx taken on top of either + * cache_lock or bucketlock. */ /* * Structures associated with name caching. */ #define NCHHASH(hash) \ (&nchashtbl[(hash) & nchash]) static LIST_HEAD(nchashhead, namecache) *nchashtbl; /* Hash Table */ static TAILQ_HEAD(, namecache) ncneg; /* Hash Table */ static u_long nchash; /* size of hash table */ SYSCTL_ULONG(_debug, OID_AUTO, nchash, CTLFLAG_RD, &nchash, 0, "Size of namecache hash table"); static u_long ncnegfactor = 16; /* ratio of negative entries */ SYSCTL_ULONG(_vfs, OID_AUTO, ncnegfactor, CTLFLAG_RW, &ncnegfactor, 0, "Ratio of negative namecache entries"); static u_long numneg; /* number of negative entries allocated */ SYSCTL_ULONG(_debug, OID_AUTO, numneg, CTLFLAG_RD, &numneg, 0, "Number of negative entries in namecache"); static u_long numcache; /* number of cache entries allocated */ SYSCTL_ULONG(_debug, OID_AUTO, numcache, CTLFLAG_RD, &numcache, 0, "Number of namecache entries"); static u_long numcachehv; /* number of cache entries with vnodes held */ SYSCTL_ULONG(_debug, OID_AUTO, numcachehv, CTLFLAG_RD, &numcachehv, 0, "Number of namecache entries with vnodes held"); u_int ncsizefactor = 2; SYSCTL_UINT(_vfs, OID_AUTO, ncsizefactor, CTLFLAG_RW, &ncsizefactor, 0, "Size factor for namecache"); struct nchstats nchstats; /* cache effectiveness statistics */ static struct rwlock cache_lock; -RW_SYSINIT(vfscache, &cache_lock, "Name Cache"); +RW_SYSINIT(vfscache, &cache_lock, "ncglobal"); +#define CACHE_TRY_WLOCK() rw_try_wlock(&cache_lock) #define CACHE_UPGRADE_LOCK() rw_try_upgrade(&cache_lock) #define CACHE_RLOCK() rw_rlock(&cache_lock) #define CACHE_RUNLOCK() rw_runlock(&cache_lock) #define CACHE_WLOCK() rw_wlock(&cache_lock) #define CACHE_WUNLOCK() rw_wunlock(&cache_lock) static struct mtx_padalign ncneg_mtx; -MTX_SYSINIT(vfscache_neg, &ncneg_mtx, "Name Cache neg", MTX_DEF); +MTX_SYSINIT(vfscache_neg, &ncneg_mtx, "ncneg", MTX_DEF); +static u_int numbucketlocks; +static struct rwlock_padalign *bucketlocks; +#define HASH2BUCKETLOCK(hash) \ + ((struct rwlock *)(&bucketlocks[((hash) % numbucketlocks)])) + /* * UMA zones for the VFS cache. * * The small cache is used for entries with short names, which are the * most common. The large cache is used for entries which are too big to * fit in the small cache. 
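The per-bucket rwlocks introduced here are lock striping: the name hash selects both the chain and the lock guarding it, so unrelated lookups no longer serialize on one global lock. A userspace sketch of the HASH2BUCKETLOCK() idea using pthreads; the stripe count is an arbitrary power of two chosen for the example:

#include <stdio.h>
#include <stdint.h>
#include <pthread.h>

#define NBUCKETLOCKS	64	/* assumed; must be > 0 */

static pthread_rwlock_t bucketlocks[NBUCKETLOCKS];

/* Same mapping as HASH2BUCKETLOCK(): hash modulo the stripe count. */
static pthread_rwlock_t *
hash2bucketlock(uint32_t hash)
{

	return (&bucketlocks[hash % NBUCKETLOCKS]);
}

int
main(void)
{
	uint32_t hash = 0xdeadbeefU;
	pthread_rwlock_t *bl;
	int i;

	for (i = 0; i < NBUCKETLOCKS; i++)
		pthread_rwlock_init(&bucketlocks[i], NULL);

	/* A lookup read-locks only the stripe its hash selects. */
	bl = hash2bucketlock(hash);
	pthread_rwlock_rdlock(bl);
	/* ... walk the hash chain here ... */
	pthread_rwlock_unlock(bl);

	/*
	 * Whole-table operations write-lock every stripe in index
	 * order, mirroring cache_lock_all_buckets() in this diff.
	 */
	for (i = 0; i < NBUCKETLOCKS; i++)
		pthread_rwlock_wrlock(&bucketlocks[i]);
	for (i = 0; i < NBUCKETLOCKS; i++)
		pthread_rwlock_unlock(&bucketlocks[i]);

	printf("hash %#x -> stripe %u\n", hash, hash % NBUCKETLOCKS);
	return (0);
}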
*/ static uma_zone_t cache_zone_small; static uma_zone_t cache_zone_small_ts; static uma_zone_t cache_zone_large; static uma_zone_t cache_zone_large_ts; #define CACHE_PATH_CUTOFF 35 static struct namecache * cache_alloc(int len, int ts) { if (len > CACHE_PATH_CUTOFF) { if (ts) return (uma_zalloc(cache_zone_large_ts, M_WAITOK)); else return (uma_zalloc(cache_zone_large, M_WAITOK)); } if (ts) return (uma_zalloc(cache_zone_small_ts, M_WAITOK)); else return (uma_zalloc(cache_zone_small, M_WAITOK)); } static void cache_free(struct namecache *ncp) { int ts; if (ncp == NULL) return; ts = ncp->nc_flag & NCF_TS; if ((ncp->nc_flag & NCF_DVDROP) != 0) vdrop(ncp->nc_dvp); if (ncp->nc_nlen <= CACHE_PATH_CUTOFF) { if (ts) uma_zfree(cache_zone_small_ts, ncp); else uma_zfree(cache_zone_small, ncp); } else if (ts) uma_zfree(cache_zone_large_ts, ncp); else uma_zfree(cache_zone_large, ncp); } static char * nc_get_name(struct namecache *ncp) { struct namecache_ts *ncp_ts; if ((ncp->nc_flag & NCF_TS) == 0) return (ncp->nc_name); ncp_ts = (struct namecache_ts *)ncp; return (ncp_ts->nc_name); } static void cache_out_ts(struct namecache *ncp, struct timespec *tsp, int *ticksp) { KASSERT((ncp->nc_flag & NCF_TS) != 0 || (tsp == NULL && ticksp == NULL), ("No NCF_TS")); if (tsp != NULL) *tsp = ((struct namecache_ts *)ncp)->nc_time; if (ticksp != NULL) *ticksp = ((struct namecache_ts *)ncp)->nc_ticks; } static int doingcache = 1; /* 1 => enable the cache */ SYSCTL_INT(_debug, OID_AUTO, vfscache, CTLFLAG_RW, &doingcache, 0, "VFS namecache enabled"); /* Export size information to userland */ SYSCTL_INT(_debug_sizeof, OID_AUTO, namecache, CTLFLAG_RD, SYSCTL_NULL_INT_PTR, sizeof(struct namecache), "sizeof(struct namecache)"); /* * The new name cache statistics */ static SYSCTL_NODE(_vfs, OID_AUTO, cache, CTLFLAG_RW, 0, "Name cache statistics"); #define STATNODE_ULONG(name, descr) \ SYSCTL_ULONG(_vfs_cache, OID_AUTO, name, CTLFLAG_RD, &name, 0, descr); #define STATNODE_COUNTER(name, descr) \ static counter_u64_t name; \ SYSCTL_COUNTER_U64(_vfs_cache, OID_AUTO, name, CTLFLAG_RD, &name, descr); STATNODE_ULONG(numneg, "Number of negative cache entries"); STATNODE_ULONG(numcache, "Number of cache entries"); STATNODE_COUNTER(numcalls, "Number of cache lookups"); STATNODE_COUNTER(dothits, "Number of '.' hits"); STATNODE_COUNTER(dotdothits, "Number of '..' hits"); STATNODE_COUNTER(numchecks, "Number of checks in lookup"); STATNODE_COUNTER(nummiss, "Number of cache misses"); STATNODE_COUNTER(nummisszap, "Number of cache misses we do not want to cache"); STATNODE_COUNTER(numposzaps, "Number of cache hits (positive) we do not want to cache"); STATNODE_COUNTER(numposhits, "Number of cache hits (positive)"); STATNODE_COUNTER(numnegzaps, "Number of cache hits (negative) we do not want to cache"); STATNODE_COUNTER(numneghits, "Number of cache hits (negative)"); /* These count for kern___getcwd(), too. 
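cache_alloc() picks one of four UMA zones based on name length and whether timestamps are wanted, so the common short-name case pays for CACHE_PATH_CUTOFF name bytes rather than NAME_MAX. A userspace sketch of the size-class decision, with malloc standing in for the UMA zones; the cutoff is the value from this file, the large size is an assumption:

#include <stdlib.h>
#include <string.h>

#define PATH_CUTOFF	35	/* CACHE_PATH_CUTOFF */
#define NAME_MAX_LEN	255	/* stand-in for NAME_MAX; assumed */

struct entry {
	unsigned char nlen;
	char name[];
};

/*
 * Allocate from the "small" or "large" size class by name length,
 * as cache_alloc() does with its zones.
 */
static struct entry *
entry_alloc(size_t len)
{
	size_t size;

	size = sizeof(struct entry) +
	    (len <= PATH_CUTOFF ? PATH_CUTOFF : NAME_MAX_LEN) + 1;
	return (malloc(size));
}

int
main(void)
{
	struct entry *e;

	e = entry_alloc(3);
	e->nlen = 3;
	memcpy(e->name, "usr", 4);
	free(e);
	return (0);
}

Fixed size classes keep the allocator's free lists hot; the cost is some internal fragmentation for names shorter than the cutoff.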
*/ STATNODE_COUNTER(numfullpathcalls, "Number of fullpath search calls"); STATNODE_COUNTER(numfullpathfail1, "Number of fullpath search errors (ENOTDIR)"); STATNODE_COUNTER(numfullpathfail2, "Number of fullpath search errors (VOP_VPTOCNP failures)"); STATNODE_COUNTER(numfullpathfail4, "Number of fullpath search errors (ENOMEM)"); STATNODE_COUNTER(numfullpathfound, "Number of successful fullpath calls"); static long numupgrades; STATNODE_ULONG(numupgrades, "Number of updates of the cache after lookup (write lock + retry)"); +static long zap_and_exit_bucket_fail; STATNODE_ULONG(zap_and_exit_bucket_fail, + "Number of times bucketlocked zap_and_exit case failed to writelock"); static void cache_zap(struct namecache *ncp); static int vn_vptocnp_locked(struct vnode **vp, struct ucred *cred, char *buf, u_int *buflen); static int vn_fullpath1(struct thread *td, struct vnode *vp, struct vnode *rdir, char *buf, char **retbuf, u_int buflen); static MALLOC_DEFINE(M_VFSCACHE, "vfscache", "VFS name cache entries"); static uint32_t cache_get_hash(char *name, u_char len, struct vnode *dvp) { uint32_t hash; hash = fnv_32_buf(name, len, FNV1_32_INIT); hash = fnv_32_buf(&dvp, sizeof(dvp), hash); return (hash); } +#ifdef INVARIANTS +static void +cache_assert_bucket_locked(struct namecache *ncp, int mode) +{ + struct rwlock *bucketlock; + uint32_t hash; + + hash = cache_get_hash(nc_get_name(ncp), ncp->nc_nlen, ncp->nc_dvp); + bucketlock = HASH2BUCKETLOCK(hash); + rw_assert(bucketlock, mode); +} +#else +#define cache_assert_bucket_locked(x, y) do { } while (0) +#endif + +static void +cache_lock_all_buckets(void) +{ + u_int i; + + for (i = 0; i < numbucketlocks; i++) + rw_wlock(&bucketlocks[i]); +} + +static void +cache_unlock_all_buckets(void) +{ + u_int i; + + for (i = 0; i < numbucketlocks; i++) + rw_wunlock(&bucketlocks[i]); +} + static int sysctl_nchstats(SYSCTL_HANDLER_ARGS) { struct nchstats snap; if (req->oldptr == NULL) return (SYSCTL_OUT(req, 0, sizeof(snap))); snap = nchstats; snap.ncs_goodhits = counter_u64_fetch(numposhits); snap.ncs_neghits = counter_u64_fetch(numneghits); snap.ncs_badhits = counter_u64_fetch(numposzaps) + counter_u64_fetch(numnegzaps); snap.ncs_miss = counter_u64_fetch(nummisszap) + counter_u64_fetch(nummiss); return (SYSCTL_OUT(req, &snap, sizeof(snap))); } SYSCTL_PROC(_vfs_cache, OID_AUTO, nchstats, CTLTYPE_OPAQUE | CTLFLAG_RD | CTLFLAG_MPSAFE, 0, 0, sysctl_nchstats, "LU", "VFS cache effectiveness statistics"); #ifdef DIAGNOSTIC /* * Grab an atomic snapshot of the name cache hash chain lengths */ static SYSCTL_NODE(_debug, OID_AUTO, hashstat, CTLFLAG_RW, NULL, "hash table stats"); static int sysctl_debug_hashstat_rawnchash(SYSCTL_HANDLER_ARGS) { struct nchashhead *ncpp; struct namecache *ncp; int i, error, n_nchash, *cntbuf; retry: n_nchash = nchash + 1; /* nchash is max index, not count */ if (req->oldptr == NULL) return SYSCTL_OUT(req, 0, n_nchash * sizeof(int)); cntbuf = malloc(n_nchash * sizeof(int), M_TEMP, M_ZERO | M_WAITOK); CACHE_RLOCK(); if (n_nchash != nchash + 1) { CACHE_RUNLOCK(); free(cntbuf, M_TEMP); goto retry; } /* Scan hash tables counting entries */ for (ncpp = nchashtbl, i = 0; i < n_nchash; ncpp++, i++) LIST_FOREACH(ncp, ncpp, nc_hash) cntbuf[i]++; CACHE_RUNLOCK(); for (error = 0, i = 0; i < n_nchash; i++) if ((error = SYSCTL_OUT(req, &cntbuf[i], sizeof(int))) != 0) break; free(cntbuf, M_TEMP); return (error); } SYSCTL_PROC(_debug_hashstat, OID_AUTO, rawnchash, CTLTYPE_INT|CTLFLAG_RD| CTLFLAG_MPSAFE, 0, 0, sysctl_debug_hashstat_rawnchash, "S,int", "nchash 
chain lengths"); static int sysctl_debug_hashstat_nchash(SYSCTL_HANDLER_ARGS) { int error; struct nchashhead *ncpp; struct namecache *ncp; int n_nchash; int count, maxlength, used, pct; if (!req->oldptr) return SYSCTL_OUT(req, 0, 4 * sizeof(int)); CACHE_RLOCK(); n_nchash = nchash + 1; /* nchash is max index, not count */ used = 0; maxlength = 0; /* Scan hash tables for applicable entries */ for (ncpp = nchashtbl; n_nchash > 0; n_nchash--, ncpp++) { count = 0; LIST_FOREACH(ncp, ncpp, nc_hash) { count++; } if (count) used++; if (maxlength < count) maxlength = count; } n_nchash = nchash + 1; CACHE_RUNLOCK(); pct = (used * 100) / (n_nchash / 100); error = SYSCTL_OUT(req, &n_nchash, sizeof(n_nchash)); if (error) return (error); error = SYSCTL_OUT(req, &used, sizeof(used)); if (error) return (error); error = SYSCTL_OUT(req, &maxlength, sizeof(maxlength)); if (error) return (error); error = SYSCTL_OUT(req, &pct, sizeof(pct)); if (error) return (error); return (0); } SYSCTL_PROC(_debug_hashstat, OID_AUTO, nchash, CTLTYPE_INT|CTLFLAG_RD| CTLFLAG_MPSAFE, 0, 0, sysctl_debug_hashstat_nchash, "I", "nchash statistics (number of total/used buckets, maximum chain length, usage percentage)"); #endif /* * Negative entries management */ static void -cache_negative_hit(struct namecache *ncp, int wlocked) +cache_negative_hit(struct namecache *ncp) { - if (!wlocked) { - rw_assert(&cache_lock, RA_RLOCKED); - mtx_lock(&ncneg_mtx); - } else { - rw_assert(&cache_lock, RA_WLOCKED); - } - + mtx_lock(&ncneg_mtx); TAILQ_REMOVE(&ncneg, ncp, nc_dst); TAILQ_INSERT_TAIL(&ncneg, ncp, nc_dst); - - if (!wlocked) - mtx_unlock(&ncneg_mtx); + mtx_unlock(&ncneg_mtx); } static void cache_negative_insert(struct namecache *ncp) { rw_assert(&cache_lock, RA_WLOCKED); + cache_assert_bucket_locked(ncp, RA_WLOCKED); MPASS(ncp->nc_vp == NULL); + mtx_lock(&ncneg_mtx); TAILQ_INSERT_TAIL(&ncneg, ncp, nc_dst); numneg++; + mtx_unlock(&ncneg_mtx); } static void cache_negative_remove(struct namecache *ncp) { rw_assert(&cache_lock, RA_WLOCKED); + cache_assert_bucket_locked(ncp, RA_WLOCKED); MPASS(ncp->nc_vp == NULL); + mtx_lock(&ncneg_mtx); TAILQ_REMOVE(&ncneg, ncp, nc_dst); numneg--; + mtx_unlock(&ncneg_mtx); } static struct namecache * cache_negative_zap_one(void) { struct namecache *ncp; rw_assert(&cache_lock, RA_WLOCKED); ncp = TAILQ_FIRST(&ncneg); KASSERT(ncp->nc_vp == NULL, ("ncp %p vp %p on ncneg", ncp, ncp->nc_vp)); cache_zap(ncp); return (ncp); } /* * cache_zap(): * * Removes a namecache entry from cache, whether it contains an actual * pointer to a vnode or if it is just a negative cache entry. 
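With the bucket locks in place, negative-entry LRU maintenance moves under its own mutex: cache_negative_hit() now unconditionally takes ncneg_mtx and requeues the entry at the tail, so the head of ncneg is always the coldest entry for cache_negative_zap_one() to reclaim. A simplified userspace sketch of that move-to-tail discipline using <sys/queue.h> and a hypothetical element type (the real zap path also involves the bucket and global locks):

#include <stdio.h>
#include <pthread.h>
#include <sys/queue.h>

struct negentry {
	TAILQ_ENTRY(negentry) link;
	int id;
};

static TAILQ_HEAD(, negentry) neglist = TAILQ_HEAD_INITIALIZER(neglist);
static pthread_mutex_t neg_mtx = PTHREAD_MUTEX_INITIALIZER;

/* On a hit, requeue at the tail: tail is hottest, head is coldest. */
static void
negative_hit(struct negentry *ne)
{

	pthread_mutex_lock(&neg_mtx);
	TAILQ_REMOVE(&neglist, ne, link);
	TAILQ_INSERT_TAIL(&neglist, ne, link);
	pthread_mutex_unlock(&neg_mtx);
}

/* Reclaim the coldest entry, as cache_negative_zap_one() picks it. */
static struct negentry *
negative_zap_one(void)
{
	struct negentry *ne;

	pthread_mutex_lock(&neg_mtx);
	ne = TAILQ_FIRST(&neglist);
	if (ne != NULL)
		TAILQ_REMOVE(&neglist, ne, link);
	pthread_mutex_unlock(&neg_mtx);
	return (ne);
}

int
main(void)
{
	struct negentry a = { .id = 1 }, b = { .id = 2 };

	TAILQ_INSERT_TAIL(&neglist, &a, link);
	TAILQ_INSERT_TAIL(&neglist, &b, link);
	negative_hit(&a);				/* a becomes hottest */
	printf("zapped %d\n", negative_zap_one()->id);	/* prints 2 */
	return (0);
}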
*/ static void -cache_zap(struct namecache *ncp) +cache_zap_locked(struct namecache *ncp) { rw_assert(&cache_lock, RA_WLOCKED); + cache_assert_bucket_locked(ncp, RA_WLOCKED); CTR2(KTR_VFS, "cache_zap(%p) vp %p", ncp, ncp->nc_vp); if (ncp->nc_vp != NULL) { SDT_PROBE3(vfs, namecache, zap, done, ncp->nc_dvp, nc_get_name(ncp), ncp->nc_vp); } else { SDT_PROBE2(vfs, namecache, zap_negative, done, ncp->nc_dvp, nc_get_name(ncp)); } LIST_REMOVE(ncp, nc_hash); if (ncp->nc_flag & NCF_ISDOTDOT) { if (ncp == ncp->nc_dvp->v_cache_dd) ncp->nc_dvp->v_cache_dd = NULL; } else { LIST_REMOVE(ncp, nc_src); if (LIST_EMPTY(&ncp->nc_dvp->v_cache_src)) { ncp->nc_flag |= NCF_DVDROP; numcachehv--; } } if (ncp->nc_vp) { TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst); if (ncp == ncp->nc_vp->v_cache_dd) ncp->nc_vp->v_cache_dd = NULL; } else { cache_negative_remove(ncp); } numcache--; } +static void +cache_zap(struct namecache *ncp) +{ + struct rwlock *bucketlock; + uint32_t hash; + + rw_assert(&cache_lock, RA_WLOCKED); + + hash = cache_get_hash(nc_get_name(ncp), ncp->nc_nlen, ncp->nc_dvp); + bucketlock = HASH2BUCKETLOCK(hash); + rw_wlock(bucketlock); + cache_zap_locked(ncp); + rw_wunlock(bucketlock); +} + /* * Lookup an entry in the cache * * Lookup is called with dvp pointing to the directory to search, * cnp pointing to the name of the entry being sought. If the lookup * succeeds, the vnode is returned in *vpp, and a status of -1 is * returned. If the lookup determines that the name does not exist * (negative caching), a status of ENOENT is returned. If the lookup * fails, a status of zero is returned. If the directory vnode is * recycled out from under us due to a forced unmount, a status of * ENOENT is returned. * * vpp is locked and ref'd on return. If we're looking up DOTDOT, dvp is * unlocked. If we're looking up . an extra ref is taken, but the lock is * not recursively acquired. */ +enum { UNLOCKED, WLOCKED, RLOCKED }; + +static void +cache_unlock(int cache_locked) +{ + + switch (cache_locked) { + case UNLOCKED: + break; + case WLOCKED: + CACHE_WUNLOCK(); + break; + case RLOCKED: + CACHE_RUNLOCK(); + break; + } +} + int cache_lookup(struct vnode *dvp, struct vnode **vpp, struct componentname *cnp, struct timespec *tsp, int *ticksp) { + struct rwlock *bucketlock; struct namecache *ncp; uint32_t hash; - int error, ltype, wlocked; + int error, ltype, cache_locked; if (!doingcache) { cnp->cn_flags &= ~MAKEENTRY; return (0); } retry: - wlocked = 0; - counter_u64_add(numcalls, 1); + bucketlock = NULL; + cache_locked = UNLOCKED; error = 0; + counter_u64_add(numcalls, 1); retry_wlocked: if (cnp->cn_nameptr[0] == '.') { if (cnp->cn_namelen == 1) { *vpp = dvp; CTR2(KTR_VFS, "cache_lookup(%p, %s) found via .", dvp, cnp->cn_nameptr); counter_u64_add(dothits, 1); SDT_PROBE3(vfs, namecache, lookup, hit, dvp, ".", *vpp); if (tsp != NULL) timespecclear(tsp); if (ticksp != NULL) *ticksp = ticks; VREF(*vpp); /* * When we lookup "." we still can be asked to lock it * differently... 
*/ ltype = cnp->cn_lkflags & LK_TYPE_MASK; if (ltype != VOP_ISLOCKED(*vpp)) { if (ltype == LK_EXCLUSIVE) { vn_lock(*vpp, LK_UPGRADE | LK_RETRY); if ((*vpp)->v_iflag & VI_DOOMED) { /* forced unmount */ vrele(*vpp); *vpp = NULL; return (ENOENT); } } else vn_lock(*vpp, LK_DOWNGRADE | LK_RETRY); } return (-1); } - if (!wlocked) - CACHE_RLOCK(); if (cnp->cn_namelen == 2 && cnp->cn_nameptr[1] == '.') { counter_u64_add(dotdothits, 1); + if (cache_locked == UNLOCKED) { + CACHE_RLOCK(); + cache_locked = RLOCKED; + } + if (dvp->v_cache_dd == NULL) { SDT_PROBE3(vfs, namecache, lookup, miss, dvp, "..", NULL); goto unlock; } if ((cnp->cn_flags & MAKEENTRY) == 0) { - if (!wlocked && !CACHE_UPGRADE_LOCK()) + if (cache_locked != WLOCKED && + !CACHE_UPGRADE_LOCK()) goto wlock; ncp = NULL; if (dvp->v_cache_dd->nc_flag & NCF_ISDOTDOT) { ncp = dvp->v_cache_dd; cache_zap(ncp); } dvp->v_cache_dd = NULL; CACHE_WUNLOCK(); cache_free(ncp); return (0); } ncp = dvp->v_cache_dd; if (ncp->nc_flag & NCF_ISDOTDOT) *vpp = ncp->nc_vp; else *vpp = ncp->nc_dvp; /* Return failure if negative entry was found. */ if (*vpp == NULL) goto negative_success; CTR3(KTR_VFS, "cache_lookup(%p, %s) found %p via ..", dvp, cnp->cn_nameptr, *vpp); SDT_PROBE3(vfs, namecache, lookup, hit, dvp, "..", *vpp); cache_out_ts(ncp, tsp, ticksp); if ((ncp->nc_flag & (NCF_ISDOTDOT | NCF_DTS)) == NCF_DTS && tsp != NULL) *tsp = ((struct namecache_ts *)ncp)-> nc_dotdottime; goto success; } - } else if (!wlocked) - CACHE_RLOCK(); + } hash = cache_get_hash(cnp->cn_nameptr, cnp->cn_namelen, dvp); + if (cache_locked == UNLOCKED) { + bucketlock = HASH2BUCKETLOCK(hash); + rw_rlock(bucketlock); + } + LIST_FOREACH(ncp, (NCHHASH(hash)), nc_hash) { counter_u64_add(numchecks, 1); if (ncp->nc_dvp == dvp && ncp->nc_nlen == cnp->cn_namelen && !bcmp(nc_get_name(ncp), cnp->cn_nameptr, ncp->nc_nlen)) break; } /* We failed to find an entry */ if (ncp == NULL) { SDT_PROBE3(vfs, namecache, lookup, miss, dvp, cnp->cn_nameptr, NULL); if ((cnp->cn_flags & MAKEENTRY) == 0) { counter_u64_add(nummisszap, 1); } else { counter_u64_add(nummiss, 1); } goto unlock; } /* We don't want to have an entry, so dump it */ if ((cnp->cn_flags & MAKEENTRY) == 0) { counter_u64_add(numposzaps, 1); - if (!wlocked && !CACHE_UPGRADE_LOCK()) - goto wlock; - cache_zap(ncp); - CACHE_WUNLOCK(); - cache_free(ncp); - return (0); + goto zap_and_exit; } /* We found a "positive" match, return the vnode */ if (ncp->nc_vp) { counter_u64_add(numposhits, 1); *vpp = ncp->nc_vp; CTR4(KTR_VFS, "cache_lookup(%p, %s) found %p via ncp %p", dvp, cnp->cn_nameptr, *vpp, ncp); SDT_PROBE3(vfs, namecache, lookup, hit, dvp, nc_get_name(ncp), *vpp); cache_out_ts(ncp, tsp, ticksp); goto success; } negative_success: /* We found a negative match, and want to create it, so purge */ if (cnp->cn_nameiop == CREATE) { counter_u64_add(numnegzaps, 1); - if (!wlocked && !CACHE_UPGRADE_LOCK()) - goto wlock; - cache_zap(ncp); - CACHE_WUNLOCK(); - cache_free(ncp); - return (0); + goto zap_and_exit; } counter_u64_add(numneghits, 1); - cache_negative_hit(ncp, wlocked); + cache_negative_hit(ncp); if (ncp->nc_flag & NCF_WHITE) cnp->cn_flags |= ISWHITEOUT; SDT_PROBE2(vfs, namecache, lookup, hit__negative, dvp, nc_get_name(ncp)); cache_out_ts(ncp, tsp, ticksp); - if (wlocked) - CACHE_WUNLOCK(); - else - CACHE_RUNLOCK(); + MPASS(bucketlock != NULL || cache_locked != UNLOCKED); + if (bucketlock != NULL) + rw_runlock(bucketlock); + cache_unlock(cache_locked); return (ENOENT); wlock: /* * We need to update the cache after our lookup, so upgrade to * 
a write lock and retry the operation. */ CACHE_RUNLOCK(); +wlock_unlocked: CACHE_WLOCK(); numupgrades++; - wlocked = 1; + cache_locked = WLOCKED; goto retry_wlocked; success: /* * On success we return a locked and ref'd vnode as per the lookup * protocol. */ MPASS(dvp != *vpp); ltype = 0; /* silence gcc warning */ if (cnp->cn_flags & ISDOTDOT) { ltype = VOP_ISLOCKED(dvp); VOP_UNLOCK(dvp, 0); } vhold(*vpp); - if (wlocked) - CACHE_WUNLOCK(); - else - CACHE_RUNLOCK(); + MPASS(bucketlock != NULL || cache_locked != UNLOCKED); + if (bucketlock != NULL) + rw_runlock(bucketlock); + cache_unlock(cache_locked); error = vget(*vpp, cnp->cn_lkflags | LK_VNHELD, cnp->cn_thread); if (cnp->cn_flags & ISDOTDOT) { vn_lock(dvp, ltype | LK_RETRY); if (dvp->v_iflag & VI_DOOMED) { if (error == 0) vput(*vpp); *vpp = NULL; return (ENOENT); } } if (error) { *vpp = NULL; goto retry; } if ((cnp->cn_flags & ISLASTCN) && (cnp->cn_lkflags & LK_TYPE_MASK) == LK_EXCLUSIVE) { ASSERT_VOP_ELOCKED(*vpp, "cache_lookup"); } return (-1); unlock: - if (wlocked) - CACHE_WUNLOCK(); - else - CACHE_RUNLOCK(); + MPASS(bucketlock != NULL || cache_locked != UNLOCKED); + if (bucketlock != NULL) + rw_runlock(bucketlock); + cache_unlock(cache_locked); return (0); + +zap_and_exit: + if (bucketlock != NULL) { + rw_assert(&cache_lock, RA_UNLOCKED); + if (!CACHE_TRY_WLOCK()) { + rw_runlock(bucketlock); + bucketlock = NULL; + zap_and_exit_bucket_fail++; + goto wlock_unlocked; + } + cache_locked = WLOCKED; + rw_runlock(bucketlock); + bucketlock = NULL; + } else if (cache_locked != WLOCKED && !CACHE_UPGRADE_LOCK()) + goto wlock; + cache_zap(ncp); + CACHE_WUNLOCK(); + cache_free(ncp); + return (0); } /* * Add an entry to the cache. */ void cache_enter_time(struct vnode *dvp, struct vnode *vp, struct componentname *cnp, struct timespec *tsp, struct timespec *dtsp) { + struct rwlock *bucketlock; struct namecache *ncp, *n2, *ndd, *nneg; struct namecache_ts *n3; struct nchashhead *ncpp; uint32_t hash; int flag; int len; CTR3(KTR_VFS, "cache_enter(%p, %p, %s)", dvp, vp, cnp->cn_nameptr); VNASSERT(vp == NULL || (vp->v_iflag & VI_DOOMED) == 0, vp, ("cache_enter: Adding a doomed vnode")); VNASSERT(dvp == NULL || (dvp->v_iflag & VI_DOOMED) == 0, dvp, ("cache_enter: Doomed vnode used as src")); if (!doingcache) return; /* * Avoid blowout in namecache entries. */ if (numcache >= desiredvnodes * ncsizefactor) return; ndd = nneg = NULL; flag = 0; if (cnp->cn_nameptr[0] == '.') { if (cnp->cn_namelen == 1) return; if (cnp->cn_namelen == 2 && cnp->cn_nameptr[1] == '.') { CACHE_WLOCK(); /* * If dotdot entry already exists, just retarget it * to new parent vnode, otherwise continue with new * namecache entry allocation. */ if ((ncp = dvp->v_cache_dd) != NULL && ncp->nc_flag & NCF_ISDOTDOT) { KASSERT(ncp->nc_dvp == dvp, ("wrong isdotdot parent")); if (ncp->nc_vp != NULL) { TAILQ_REMOVE(&ncp->nc_vp->v_cache_dst, ncp, nc_dst); } else { cache_negative_remove(ncp); } if (vp != NULL) { TAILQ_INSERT_HEAD(&vp->v_cache_dst, ncp, nc_dst); } else { cache_negative_insert(ncp); } ncp->nc_vp = vp; CACHE_WUNLOCK(); return; } dvp->v_cache_dd = NULL; SDT_PROBE3(vfs, namecache, enter, done, dvp, "..", vp); CACHE_WUNLOCK(); flag = NCF_ISDOTDOT; } } /* * Calculate the hash key and setup as much of the new * namecache entry as possible before acquiring the lock. 
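The zap_and_exit path added above faces a lock-order problem: the lookup holds only a bucket lock, but zapping needs the global lock, which orders before bucket locks. The code therefore try-locks the global lock and, on failure, drops the bucket lock, bumps zap_and_exit_bucket_fail, and restarts with the global lock held. A userspace sketch of that try-or-back-off ordering with pthreads (the retry is compressed to a comment):

#include <stdio.h>
#include <pthread.h>

static pthread_rwlock_t global_lock = PTHREAD_RWLOCK_INITIALIZER;
static pthread_rwlock_t bucket_lock = PTHREAD_RWLOCK_INITIALIZER;
static unsigned long bucket_fail;	/* zap_and_exit_bucket_fail */

/*
 * Lock order is global before bucket.  Holding only the bucket lock,
 * acquire the global lock without deadlock: try it, and if that
 * fails, drop the bucket lock and start over in the correct order.
 */
static void
zap_entry(void)
{

	pthread_rwlock_rdlock(&bucket_lock);
	/* ... the entry to zap was found under the bucket lock ... */
	if (pthread_rwlock_trywrlock(&global_lock) != 0) {
		pthread_rwlock_unlock(&bucket_lock);
		bucket_fail++;
		pthread_rwlock_wrlock(&global_lock);
		/* ... must re-find the entry: it may be gone by now ... */
	} else {
		pthread_rwlock_unlock(&bucket_lock);
	}
	/* ... zap with the global write lock held ... */
	pthread_rwlock_unlock(&global_lock);
}

int
main(void)
{

	zap_entry();
	printf("fallbacks: %lu\n", bucket_fail);
	return (0);
}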
*/ ncp = cache_alloc(cnp->cn_namelen, tsp != NULL); ncp->nc_vp = vp; ncp->nc_dvp = dvp; ncp->nc_flag = flag; if (tsp != NULL) { n3 = (struct namecache_ts *)ncp; n3->nc_time = *tsp; n3->nc_ticks = ticks; n3->nc_flag |= NCF_TS; if (dtsp != NULL) { n3->nc_dotdottime = *dtsp; n3->nc_flag |= NCF_DTS; } } len = ncp->nc_nlen = cnp->cn_namelen; hash = cache_get_hash(cnp->cn_nameptr, len, dvp); strlcpy(nc_get_name(ncp), cnp->cn_nameptr, len + 1); CACHE_WLOCK(); /* * See if this vnode or negative entry is already in the cache * with this name. This can happen with concurrent lookups of * the same path name. */ ncpp = NCHHASH(hash); LIST_FOREACH(n2, ncpp, nc_hash) { if (n2->nc_dvp == dvp && n2->nc_nlen == cnp->cn_namelen && !bcmp(nc_get_name(n2), cnp->cn_nameptr, n2->nc_nlen)) { if (tsp != NULL) { KASSERT((n2->nc_flag & NCF_TS) != 0, ("no NCF_TS")); n3 = (struct namecache_ts *)n2; n3->nc_time = ((struct namecache_ts *)ncp)->nc_time; n3->nc_ticks = ((struct namecache_ts *)ncp)->nc_ticks; if (dtsp != NULL) { n3->nc_dotdottime = ((struct namecache_ts *)ncp)-> nc_dotdottime; n3->nc_flag |= NCF_DTS; } } CACHE_WUNLOCK(); cache_free(ncp); return; } } if (flag == NCF_ISDOTDOT) { /* * See if we are trying to add .. entry, but some other lookup * has populated v_cache_dd pointer already. */ if (dvp->v_cache_dd != NULL) { CACHE_WUNLOCK(); cache_free(ncp); return; } KASSERT(vp == NULL || vp->v_type == VDIR, ("wrong vnode type %p", vp)); dvp->v_cache_dd = ncp; } numcache++; if (vp != NULL) { if (vp->v_type == VDIR) { if (flag != NCF_ISDOTDOT) { /* * For this case, the cache entry maps both the * directory name in it and the name ".." for the * directory's parent. */ if ((ndd = vp->v_cache_dd) != NULL) { if ((ndd->nc_flag & NCF_ISDOTDOT) != 0) cache_zap(ndd); else ndd = NULL; } vp->v_cache_dd = ncp; } } else { vp->v_cache_dd = NULL; } } - /* - * Insert the new namecache entry into the appropriate chain - * within the cache entries table. - */ - LIST_INSERT_HEAD(ncpp, ncp, nc_hash); if (flag != NCF_ISDOTDOT) { if (LIST_EMPTY(&dvp->v_cache_src)) { vhold(dvp); numcachehv++; } LIST_INSERT_HEAD(&dvp->v_cache_src, ncp, nc_src); } + bucketlock = HASH2BUCKETLOCK(hash); + rw_wlock(bucketlock); + /* + * Insert the new namecache entry into the appropriate chain + * within the cache entries table. + */ + LIST_INSERT_HEAD(ncpp, ncp, nc_hash); + + /* * If the entry is "negative", we place it into the * "negative" cache queue, otherwise, we place it into the * destination vnode's cache entries queue. 
*/ if (vp != NULL) { TAILQ_INSERT_HEAD(&vp->v_cache_dst, ncp, nc_dst); SDT_PROBE3(vfs, namecache, enter, done, dvp, nc_get_name(ncp), vp); } else { if (cnp->cn_flags & ISWHITEOUT) ncp->nc_flag |= NCF_WHITE; cache_negative_insert(ncp); SDT_PROBE2(vfs, namecache, enter_negative, done, dvp, nc_get_name(ncp)); } + rw_wunlock(bucketlock); if (numneg * ncnegfactor > numcache) nneg = cache_negative_zap_one(); CACHE_WUNLOCK(); cache_free(ndd); cache_free(nneg); } +static u_int +cache_roundup_2(u_int val) +{ + u_int res; + + for (res = 1; res <= val; res <<= 1) + continue; + + return (res); +} + /* * Name cache initialization, from vfs_init() when we are booting */ static void nchinit(void *dummy __unused) { + u_int i; TAILQ_INIT(&ncneg); cache_zone_small = uma_zcreate("S VFS Cache", sizeof(struct namecache) + CACHE_PATH_CUTOFF + 1, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_ZINIT); cache_zone_small_ts = uma_zcreate("STS VFS Cache", sizeof(struct namecache_ts) + CACHE_PATH_CUTOFF + 1, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_ZINIT); cache_zone_large = uma_zcreate("L VFS Cache", sizeof(struct namecache) + NAME_MAX + 1, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_ZINIT); cache_zone_large_ts = uma_zcreate("LTS VFS Cache", sizeof(struct namecache_ts) + NAME_MAX + 1, NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_ZINIT); nchashtbl = hashinit(desiredvnodes * 2, M_VFSCACHE, &nchash); + numbucketlocks = cache_roundup_2(mp_ncpus * 16); + if (numbucketlocks > nchash) + numbucketlocks = nchash; + bucketlocks = malloc(sizeof(*bucketlocks) * numbucketlocks, M_VFSCACHE, + M_WAITOK | M_ZERO); + for (i = 0; i < numbucketlocks; i++) + rw_init_flags(&bucketlocks[i], "ncbuc", RW_DUPOK); numcalls = counter_u64_alloc(M_WAITOK); dothits = counter_u64_alloc(M_WAITOK); dotdothits = counter_u64_alloc(M_WAITOK); numchecks = counter_u64_alloc(M_WAITOK); nummiss = counter_u64_alloc(M_WAITOK); nummisszap = counter_u64_alloc(M_WAITOK); numposzaps = counter_u64_alloc(M_WAITOK); numposhits = counter_u64_alloc(M_WAITOK); numnegzaps = counter_u64_alloc(M_WAITOK); numneghits = counter_u64_alloc(M_WAITOK); numfullpathcalls = counter_u64_alloc(M_WAITOK); numfullpathfail1 = counter_u64_alloc(M_WAITOK); numfullpathfail2 = counter_u64_alloc(M_WAITOK); numfullpathfail4 = counter_u64_alloc(M_WAITOK); numfullpathfound = counter_u64_alloc(M_WAITOK); } SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_SECOND, nchinit, NULL); void cache_changesize(int newmaxvnodes) { struct nchashhead *new_nchashtbl, *old_nchashtbl; u_long new_nchash, old_nchash; struct namecache *ncp; uint32_t hash; int i; new_nchashtbl = hashinit(newmaxvnodes * 2, M_VFSCACHE, &new_nchash); /* If same hash table size, nothing to do */ if (nchash == new_nchash) { free(new_nchashtbl, M_VFSCACHE); return; } /* * Move everything from the old hash table to the new table. * None of the namecache entries in the table can be removed * because to do so, they have to be removed from the hash table. */ CACHE_WLOCK(); + cache_lock_all_buckets(); old_nchashtbl = nchashtbl; old_nchash = nchash; nchashtbl = new_nchashtbl; nchash = new_nchash; for (i = 0; i <= old_nchash; i++) { while ((ncp = LIST_FIRST(&old_nchashtbl[i])) != NULL) { hash = cache_get_hash(nc_get_name(ncp), ncp->nc_nlen, ncp->nc_dvp); LIST_REMOVE(ncp, nc_hash); LIST_INSERT_HEAD(NCHHASH(hash), ncp, nc_hash); } } + cache_unlock_all_buckets(); CACHE_WUNLOCK(); free(old_nchashtbl, M_VFSCACHE); } /* * Invalidate all entries to a particular vnode. 
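nchinit() sizes the stripe array with cache_roundup_2(mp_ncpus * 16) and clamps it to nchash. Note that the helper returns the smallest power of two strictly greater than its argument, so an exact power of two is doubled. A standalone check of that behavior (same loop as the diff; assumes val < 1U << 31, above which the shift would wrap):

#include <stdio.h>

/* Same loop as cache_roundup_2() above. */
static unsigned int
roundup_2(unsigned int val)
{
	unsigned int res;

	for (res = 1; res <= val; res <<= 1)
		continue;
	return (res);
}

int
main(void)
{

	printf("%u -> %u\n", 100U, roundup_2(100U));	/* 128 */
	printf("%u -> %u\n", 256U, roundup_2(256U));	/* 512, not 256 */
	printf("%u -> %u\n", 0U, roundup_2(0U));	/* 1 */
	return (0);
}

Since the result is only a lock-stripe count, overshooting by one power of two is harmless here.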
*/ void cache_purge(struct vnode *vp) { TAILQ_HEAD(, namecache) ncps; struct namecache *ncp, *nnp; CTR1(KTR_VFS, "cache_purge(%p)", vp); SDT_PROBE1(vfs, namecache, purge, done, vp); TAILQ_INIT(&ncps); CACHE_WLOCK(); while (!LIST_EMPTY(&vp->v_cache_src)) { ncp = LIST_FIRST(&vp->v_cache_src); cache_zap(ncp); TAILQ_INSERT_TAIL(&ncps, ncp, nc_dst); } while (!TAILQ_EMPTY(&vp->v_cache_dst)) { ncp = TAILQ_FIRST(&vp->v_cache_dst); cache_zap(ncp); TAILQ_INSERT_TAIL(&ncps, ncp, nc_dst); } if (vp->v_cache_dd != NULL) { ncp = vp->v_cache_dd; KASSERT(ncp->nc_flag & NCF_ISDOTDOT, ("lost dotdot link")); cache_zap(ncp); TAILQ_INSERT_TAIL(&ncps, ncp, nc_dst); } KASSERT(vp->v_cache_dd == NULL, ("incomplete purge")); CACHE_WUNLOCK(); TAILQ_FOREACH_SAFE(ncp, &ncps, nc_dst, nnp) { cache_free(ncp); } } /* * Invalidate all negative entries for a particular directory vnode. */ void cache_purge_negative(struct vnode *vp) { TAILQ_HEAD(, namecache) ncps; struct namecache *ncp, *nnp; CTR1(KTR_VFS, "cache_purge_negative(%p)", vp); SDT_PROBE1(vfs, namecache, purge_negative, done, vp); TAILQ_INIT(&ncps); CACHE_WLOCK(); LIST_FOREACH_SAFE(ncp, &vp->v_cache_src, nc_src, nnp) { if (ncp->nc_vp != NULL) continue; cache_zap(ncp); TAILQ_INSERT_TAIL(&ncps, ncp, nc_dst); } CACHE_WUNLOCK(); TAILQ_FOREACH_SAFE(ncp, &ncps, nc_dst, nnp) { cache_free(ncp); } } /* * Flush all entries referencing a particular filesystem. */ void cache_purgevfs(struct mount *mp) { TAILQ_HEAD(, namecache) ncps; - struct nchashhead *ncpp; + struct rwlock *bucketlock; + struct nchashhead *bucket; struct namecache *ncp, *nnp; + u_long i, j, n_nchash; /* Scan hash tables for applicable entries */ SDT_PROBE1(vfs, namecache, purgevfs, done, mp); TAILQ_INIT(&ncps); CACHE_WLOCK(); - for (ncpp = &nchashtbl[nchash]; ncpp >= nchashtbl; ncpp--) { - LIST_FOREACH_SAFE(ncp, ncpp, nc_hash, nnp) { - if (ncp->nc_dvp->v_mount != mp) - continue; - cache_zap(ncp); - TAILQ_INSERT_TAIL(&ncps, ncp, nc_dst); + n_nchash = nchash + 1; + for (i = 0; i < numbucketlocks; i++) { + bucketlock = (struct rwlock *)&bucketlocks[i]; + rw_wlock(bucketlock); + for (j = i; j < n_nchash; j += numbucketlocks) { + bucket = &nchashtbl[j]; + LIST_FOREACH_SAFE(ncp, bucket, nc_hash, nnp) { + cache_assert_bucket_locked(ncp, RA_WLOCKED); + if (ncp->nc_dvp->v_mount != mp) + continue; + cache_zap_locked(ncp); + TAILQ_INSERT_HEAD(&ncps, ncp, nc_dst); + } } + rw_wunlock(bucketlock); } CACHE_WUNLOCK(); TAILQ_FOREACH_SAFE(ncp, &ncps, nc_dst, nnp) { cache_free(ncp); } } /* * Perform canonical checks and cache lookup and pass on to filesystem * through the vop_cachedlookup only if needed. */ int vfs_cache_lookup(struct vop_lookup_args *ap) { struct vnode *dvp; int error; struct vnode **vpp = ap->a_vpp; struct componentname *cnp = ap->a_cnp; struct ucred *cred = cnp->cn_cred; int flags = cnp->cn_flags; struct thread *td = cnp->cn_thread; *vpp = NULL; dvp = ap->a_dvp; if (dvp->v_type != VDIR) return (ENOTDIR); if ((flags & ISLASTCN) && (dvp->v_mount->mnt_flag & MNT_RDONLY) && (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME)) return (EROFS); error = VOP_ACCESS(dvp, VEXEC, cred, td); if (error) return (error); error = cache_lookup(dvp, vpp, cnp, NULL, NULL); if (error == 0) return (VOP_CACHEDLOOKUP(dvp, vpp, cnp)); if (error == -1) return (0); return (error); } /* * XXX All of these sysctls would probably be more productive dead. */ static int disablecwd; SYSCTL_INT(_debug, OID_AUTO, disablecwd, CTLFLAG_RW, &disablecwd, 0, "Disable the getcwd syscall"); /* Implementation of the getcwd syscall. 
*/ int sys___getcwd(struct thread *td, struct __getcwd_args *uap) { return (kern___getcwd(td, uap->buf, UIO_USERSPACE, uap->buflen, MAXPATHLEN)); } int kern___getcwd(struct thread *td, char *buf, enum uio_seg bufseg, u_int buflen, u_int path_max) { char *bp, *tmpbuf; struct filedesc *fdp; struct vnode *cdir, *rdir; int error; if (disablecwd) return (ENODEV); if (buflen < 2) return (EINVAL); if (buflen > path_max) buflen = path_max; tmpbuf = malloc(buflen, M_TEMP, M_WAITOK); fdp = td->td_proc->p_fd; FILEDESC_SLOCK(fdp); cdir = fdp->fd_cdir; VREF(cdir); rdir = fdp->fd_rdir; VREF(rdir); FILEDESC_SUNLOCK(fdp); error = vn_fullpath1(td, cdir, rdir, tmpbuf, &bp, buflen); vrele(rdir); vrele(cdir); if (!error) { if (bufseg == UIO_SYSSPACE) bcopy(bp, buf, strlen(bp) + 1); else error = copyout(bp, buf, strlen(bp) + 1); #ifdef KTRACE if (KTRPOINT(curthread, KTR_NAMEI)) ktrnamei(bp); #endif } free(tmpbuf, M_TEMP); return (error); } /* * Thus begins the fullpath magic. */ static int disablefullpath; SYSCTL_INT(_debug, OID_AUTO, disablefullpath, CTLFLAG_RW, &disablefullpath, 0, "Disable the vn_fullpath function"); /* * Retrieve the full filesystem path that corresponds to a vnode from the name * cache (if available) */ int vn_fullpath(struct thread *td, struct vnode *vn, char **retbuf, char **freebuf) { char *buf; struct filedesc *fdp; struct vnode *rdir; int error; if (disablefullpath) return (ENODEV); if (vn == NULL) return (EINVAL); buf = malloc(MAXPATHLEN, M_TEMP, M_WAITOK); fdp = td->td_proc->p_fd; FILEDESC_SLOCK(fdp); rdir = fdp->fd_rdir; VREF(rdir); FILEDESC_SUNLOCK(fdp); error = vn_fullpath1(td, vn, rdir, buf, retbuf, MAXPATHLEN); vrele(rdir); if (!error) *freebuf = buf; else free(buf, M_TEMP); return (error); } /* * This function is similar to vn_fullpath, but it attempts to look up the * pathname relative to the global root mount point. This is required for the * auditing sub-system, as audited pathnames must be absolute, relative to the * global root mount point.
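kern___getcwd() and vn_fullpath() both lean on vn_fullpath1(), which assembles the path backwards: it starts at the end of the buffer, prepends each component and its '/', and hands back a pointer into the middle of the buffer, which is why callers must free the original allocation (freebuf) rather than the returned pointer (retbuf). A userspace sketch of the backwards construction, with a hypothetical component list in leaf-to-root order:

#include <stdio.h>
#include <string.h>

int
main(void)
{
	/* Components as a cache walk would yield them: leaf first. */
	const char *comps[] = { "kernel", "boot" };
	char buf[64], *retbuf;
	size_t buflen, i, len;

	buflen = sizeof(buf);
	buf[--buflen] = '\0';
	for (i = 0; i < 2; i++) {
		len = strlen(comps[i]);
		if (buflen < len + 1)
			return (1);		/* ENOMEM in the kernel */
		buflen -= len;
		memcpy(buf + buflen, comps[i], len);
		buf[--buflen] = '/';		/* prepend the separator */
	}
	retbuf = buf + buflen;			/* points into buf */
	printf("%s\n", retbuf);			/* /boot/kernel */
	return (0);
}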
*/ int vn_fullpath_global(struct thread *td, struct vnode *vn, char **retbuf, char **freebuf) { char *buf; int error; if (disablefullpath) return (ENODEV); if (vn == NULL) return (EINVAL); buf = malloc(MAXPATHLEN, M_TEMP, M_WAITOK); error = vn_fullpath1(td, vn, rootvnode, buf, retbuf, MAXPATHLEN); if (!error) *freebuf = buf; else free(buf, M_TEMP); return (error); } int vn_vptocnp(struct vnode **vp, struct ucred *cred, char *buf, u_int *buflen) { int error; CACHE_RLOCK(); error = vn_vptocnp_locked(vp, cred, buf, buflen); if (error == 0) CACHE_RUNLOCK(); return (error); } static int vn_vptocnp_locked(struct vnode **vp, struct ucred *cred, char *buf, u_int *buflen) { struct vnode *dvp; struct namecache *ncp; int error; TAILQ_FOREACH(ncp, &((*vp)->v_cache_dst), nc_dst) { if ((ncp->nc_flag & NCF_ISDOTDOT) == 0) break; } if (ncp != NULL) { if (*buflen < ncp->nc_nlen) { CACHE_RUNLOCK(); vrele(*vp); counter_u64_add(numfullpathfail4, 1); error = ENOMEM; SDT_PROBE3(vfs, namecache, fullpath, return, error, vp, NULL); return (error); } *buflen -= ncp->nc_nlen; memcpy(buf + *buflen, nc_get_name(ncp), ncp->nc_nlen); SDT_PROBE3(vfs, namecache, fullpath, hit, ncp->nc_dvp, nc_get_name(ncp), vp); dvp = *vp; *vp = ncp->nc_dvp; vref(*vp); CACHE_RUNLOCK(); vrele(dvp); CACHE_RLOCK(); return (0); } SDT_PROBE1(vfs, namecache, fullpath, miss, vp); CACHE_RUNLOCK(); vn_lock(*vp, LK_SHARED | LK_RETRY); error = VOP_VPTOCNP(*vp, &dvp, cred, buf, buflen); vput(*vp); if (error) { counter_u64_add(numfullpathfail2, 1); SDT_PROBE3(vfs, namecache, fullpath, return, error, vp, NULL); return (error); } *vp = dvp; CACHE_RLOCK(); if (dvp->v_iflag & VI_DOOMED) { /* forced unmount */ CACHE_RUNLOCK(); vrele(dvp); error = ENOENT; SDT_PROBE3(vfs, namecache, fullpath, return, error, vp, NULL); return (error); } /* * *vp has its use count incremented still. */ return (0); } /* * The magic behind kern___getcwd() and vn_fullpath(). 
*/ static int vn_fullpath1(struct thread *td, struct vnode *vp, struct vnode *rdir, char *buf, char **retbuf, u_int buflen) { int error, slash_prefixed; #ifdef KDTRACE_HOOKS struct vnode *startvp = vp; #endif struct vnode *vp1; buflen--; buf[buflen] = '\0'; error = 0; slash_prefixed = 0; SDT_PROBE1(vfs, namecache, fullpath, entry, vp); counter_u64_add(numfullpathcalls, 1); vref(vp); CACHE_RLOCK(); if (vp->v_type != VDIR) { error = vn_vptocnp_locked(&vp, td->td_ucred, buf, &buflen); if (error) return (error); if (buflen == 0) { CACHE_RUNLOCK(); vrele(vp); return (ENOMEM); } buf[--buflen] = '/'; slash_prefixed = 1; } while (vp != rdir && vp != rootvnode) { if (vp->v_vflag & VV_ROOT) { if (vp->v_iflag & VI_DOOMED) { /* forced unmount */ CACHE_RUNLOCK(); vrele(vp); error = ENOENT; SDT_PROBE3(vfs, namecache, fullpath, return, error, vp, NULL); break; } vp1 = vp->v_mount->mnt_vnodecovered; vref(vp1); CACHE_RUNLOCK(); vrele(vp); vp = vp1; CACHE_RLOCK(); continue; } if (vp->v_type != VDIR) { CACHE_RUNLOCK(); vrele(vp); counter_u64_add(numfullpathfail1, 1); error = ENOTDIR; SDT_PROBE3(vfs, namecache, fullpath, return, error, vp, NULL); break; } error = vn_vptocnp_locked(&vp, td->td_ucred, buf, &buflen); if (error) break; if (buflen == 0) { CACHE_RUNLOCK(); vrele(vp); error = ENOMEM; SDT_PROBE3(vfs, namecache, fullpath, return, error, startvp, NULL); break; } buf[--buflen] = '/'; slash_prefixed = 1; } if (error) return (error); if (!slash_prefixed) { if (buflen == 0) { CACHE_RUNLOCK(); vrele(vp); counter_u64_add(numfullpathfail4, 1); SDT_PROBE3(vfs, namecache, fullpath, return, ENOMEM, startvp, NULL); return (ENOMEM); } buf[--buflen] = '/'; } counter_u64_add(numfullpathfound, 1); CACHE_RUNLOCK(); vrele(vp); SDT_PROBE3(vfs, namecache, fullpath, return, 0, startvp, buf + buflen); *retbuf = buf + buflen; return (0); } struct vnode * vn_dir_dd_ino(struct vnode *vp) { struct namecache *ncp; struct vnode *ddvp; ASSERT_VOP_LOCKED(vp, "vn_dir_dd_ino"); CACHE_RLOCK(); TAILQ_FOREACH(ncp, &(vp->v_cache_dst), nc_dst) { if ((ncp->nc_flag & NCF_ISDOTDOT) != 0) continue; ddvp = ncp->nc_dvp; vhold(ddvp); CACHE_RUNLOCK(); if (vget(ddvp, LK_SHARED | LK_NOWAIT | LK_VNHELD, curthread)) return (NULL); return (ddvp); } CACHE_RUNLOCK(); return (NULL); } int vn_commname(struct vnode *vp, char *buf, u_int buflen) { struct namecache *ncp; int l; CACHE_RLOCK(); TAILQ_FOREACH(ncp, &vp->v_cache_dst, nc_dst) if ((ncp->nc_flag & NCF_ISDOTDOT) == 0) break; if (ncp == NULL) { CACHE_RUNLOCK(); return (ENOENT); } l = min(ncp->nc_nlen, buflen - 1); memcpy(buf, nc_get_name(ncp), l); CACHE_RUNLOCK(); buf[l] = '\0'; return (0); } /* ABI compat shims for old kernel modules. */ #undef cache_enter void cache_enter(struct vnode *dvp, struct vnode *vp, struct componentname *cnp); void cache_enter(struct vnode *dvp, struct vnode *vp, struct componentname *cnp) { cache_enter_time(dvp, vp, cnp, NULL, NULL); } /* * This function updates path string to vnode's full global path * and checks the size of the new path string against the pathlen argument. * * Requires a locked, referenced vnode. * Vnode is re-locked on success or ENODEV, otherwise unlocked. * * If sysctl debug.disablefullpath is set, ENODEV is returned, * vnode is left locked and path remain untouched. * * If vp is a directory, the call to vn_fullpath_global() always succeeds * because it falls back to the ".." lookup if the namecache lookup fails. 
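The function that follows guards against a concurrent rename: after building the global path it re-resolves that string with namei() and accepts the result only if it still names the same vnode. The same check-after-rebuild idea can be shown portably, with stat(2) standing in for the vnode identity comparison (illustrative only; the kernel compares vnode pointers, not dev/inode pairs):

#include <stdio.h>
#include <sys/stat.h>

/*
 * Return 1 if both paths currently name the same file, mirroring the
 * "vp1 == vp" check after the namei() re-lookup below.
 */
static int
same_file(const char *a, const char *b)
{
	struct stat sa, sb;

	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return (0);
	return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino);
}

int
main(void)
{

	printf("%d\n", same_file("/etc", "/etc"));	/* 1 */
	return (0);
}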
*/ int vn_path_to_global_path(struct thread *td, struct vnode *vp, char *path, u_int pathlen) { struct nameidata nd; struct vnode *vp1; char *rpath, *fbuf; int error; ASSERT_VOP_ELOCKED(vp, __func__); /* Return ENODEV if sysctl debug.disablefullpath==1 */ if (disablefullpath) return (ENODEV); /* Construct global filesystem path from vp. */ VOP_UNLOCK(vp, 0); error = vn_fullpath_global(td, vp, &rpath, &fbuf); if (error != 0) { vrele(vp); return (error); } if (strlen(rpath) >= pathlen) { vrele(vp); error = ENAMETOOLONG; goto out; } /* * Re-lookup the vnode by path to detect a possible rename. * As a side effect, the vnode is relocked. * If vnode was renamed, return ENOENT. */ NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_SYSSPACE, path, td); error = namei(&nd); if (error != 0) { vrele(vp); goto out; } NDFREE(&nd, NDF_ONLY_PNBUF); vp1 = nd.ni_vp; vrele(vp); if (vp1 == vp) strcpy(path, rpath); else { vput(vp1); error = ENOENT; } out: free(fbuf, M_TEMP); return (error); } Index: projects/clang390-import/sys/mips/malta/asm_malta.S =================================================================== --- projects/clang390-import/sys/mips/malta/asm_malta.S (nonexistent) +++ projects/clang390-import/sys/mips/malta/asm_malta.S (revision 305687) @@ -0,0 +1,89 @@ +/*- + * Copyright (c) 2016 Ruslan Bukin + * All rights reserved. + * + * Portions of this software were developed by SRI International and the + * University of Cambridge Computer Laboratory under DARPA/AFRL contract + * FA8750-10-C-0237 ("CTSRD"), as part of the DARPA CRASH research programme. + * + * Portions of this software were developed by the University of Cambridge + * Computer Laboratory as part of the CTSRD Project, with support from the + * UK Higher Education Innovation Fund (HEIF). + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include + +#define VPECONF0_MVP (1 << 1) + + .set noreorder + +#ifdef SMP +/* + * This function must be implemented in assembly because it is called early + * in AP boot without a valid stack. 
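An AP enters platform_processor_id() before it has a stack, which is why the routine lives in assembly: it must not spill and clobbers only v0. All it does is read CP0 register 15 select 1 (the MIPS32r2 EBase register) and mask off the low five bits, the CPUNum field. For reference, a C-level equivalent using inline assembly, usable only once a stack exists; MIPS32r2 targets only, and the field width is taken from the architecture manual, so treat it as an assumption:

#include <stdint.h>

/* Read EBase (CP0 reg 15, sel 1) and extract CPUNum. */
static inline uint32_t
processor_id(void)
{
	uint32_t ebase;

	__asm __volatile(
	    "	.set push		\n"
	    "	.set mips32r2		\n"
	    "	mfc0 %0, $15, 1		\n"
	    "	.set pop		\n"
	    : "=r" (ebase));
	return (ebase & 0x1f);		/* CPUNum field */
}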
+ */ +LEAF(platform_processor_id) + .set push + .set mips32r2 + mfc0 v0, $15, 1 + jr ra + andi v0, 0x1f + .set pop +END(platform_processor_id) + +LEAF(enable_mvp) + .set push + .set mips32r2 + .set noat + li t2, (VPECONF0_MVP) + move $1, t2 + jr ra + .word 0x41810000 | (1 << 11) | 2 # mttc0 t2, $1, 2 + .set pop +END(enable_mvp) + +/* + * Called on APs to wait until they are told to launch. + */ +LEAF(malta_ap_wait) + jal platform_processor_id + nop + +1: + ll t0, malta_ap_boot + bne v0, t0, 1b + nop + + move t0, zero + sc t0, malta_ap_boot + + beqz t0, 1b + nop + + j mpentry + nop +END(malta_ap_wait) +#endif Property changes on: projects/clang390-import/sys/mips/malta/asm_malta.S ___________________________________________________________________ Added: svn:eol-style ## -0,0 +1 ## +native \ No newline at end of property Added: svn:keywords ## -0,0 +1 ## +FreeBSD=%H \ No newline at end of property Added: svn:mime-type ## -0,0 +1 ## +text/plain \ No newline at end of property Index: projects/clang390-import/sys/mips/malta/files.malta =================================================================== --- projects/clang390-import/sys/mips/malta/files.malta (revision 305686) +++ projects/clang390-import/sys/mips/malta/files.malta (revision 305687) @@ -1,12 +1,16 @@ # $FreeBSD$ mips/malta/gt.c standard mips/malta/gt_pci.c standard mips/malta/gt_pci_bus_space.c standard mips/malta/obio.c optional uart mips/malta/uart_cpu_maltausart.c optional uart mips/malta/uart_bus_maltausart.c optional uart dev/uart/uart_dev_ns8250.c optional uart mips/malta/malta_machdep.c standard mips/malta/yamon.c standard mips/mips/intr_machdep.c standard mips/mips/tick.c standard + +# SMP +mips/malta/asm_malta.S optional smp +mips/malta/malta_mp.c optional smp Index: projects/clang390-import/sys/mips/malta/malta_mp.c =================================================================== --- projects/clang390-import/sys/mips/malta/malta_mp.c (nonexistent) +++ projects/clang390-import/sys/mips/malta/malta_mp.c (revision 305687) @@ -0,0 +1,226 @@ +/*- + * Copyright (c) 2016 Ruslan Bukin + * All rights reserved. + * + * Portions of this software were developed by SRI International and the + * University of Cambridge Computer Laboratory under DARPA/AFRL contract + * FA8750-10-C-0237 ("CTSRD"), as part of the DARPA CRASH research programme. + * + * Portions of this software were developed by the University of Cambridge + * Computer Laboratory as part of the CTSRD Project, with support from the + * UK Higher Education Innovation Fund (HEIF). + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * $FreeBSD$ + */ + +#include +__FBSDID("$FreeBSD$"); + +#include +#include +#include +#include +#include + +#include +#include +#include +#include + +#define MALTA_MAXCPU 2 + +unsigned malta_ap_boot = ~0; + +#define C_SW0 (1 << 8) +#define C_SW1 (1 << 9) +#define C_IRQ0 (1 << 10) +#define C_IRQ1 (1 << 11) +#define C_IRQ2 (1 << 12) +#define C_IRQ3 (1 << 13) +#define C_IRQ4 (1 << 14) +#define C_IRQ5 (1 << 15) + +static inline void +ehb(void) +{ + __asm __volatile( + " .set mips32r2 \n" + " ehb \n" + " .set mips0 \n"); +} + +#define mttc0(rd, sel, val) \ +({ \ + __asm __volatile( \ + " .set push \n" \ + " .set mips32r2 \n" \ + " .set noat \n" \ + " move $1, %0 \n" \ + " .word 0x41810000 | (" #rd " << 11) | " #sel " \n" \ + " .set pop \n" \ + :: "r" (val)); \ +}) + +#define mftc0(rt, sel) \ +({ \ + unsigned long __res; \ + __asm __volatile( \ + " .set push \n" \ + " .set mips32r2 \n" \ + " .set noat \n" \ + " .word 0x41000800 | (" #rt " << 16) | " #sel " \n" \ + " move %0, $1 \n" \ + " .set pop \n" \ + : "=r" (__res)); \ + __res; \ +}) + +#define write_c0_register32(reg, sel, val) \ +({ \ + __asm __volatile( \ + " .set push \n" \ + " .set mips32 \n" \ + " mtc0 %0, $%1, %2 \n" \ + " .set pop \n" \ + :: "r" (val), "i" (reg), "i" (sel)); \ +}) + +#define read_c0_register32(reg, sel) \ +({ \ + uint32_t __retval; \ + __asm __volatile( \ + " .set push \n" \ + " .set mips32 \n" \ + " mfc0 %0, $%1, %2 \n" \ + " .set pop \n" \ + : "=r" (__retval) : "i" (reg), "i" (sel)); \ + __retval; \ +}) + +void +platform_ipi_send(int cpuid) +{ + uint32_t reg; + + /* + * Set thread context. + * Note this is not global, so we don't need lock. + */ + reg = read_c0_register32(1, 1); + reg &= ~(0xff); + reg |= cpuid; + write_c0_register32(1, 1, reg); + + ehb(); + + /* Set cause */ + reg = mftc0(13, 0); + mttc0(13, 0, (reg | C_SW1)); +} + +void +platform_ipi_clear(void) +{ + uint32_t reg; + + reg = mips_rd_cause(); + reg &= ~(C_SW1); + mips_wr_cause(reg); +} + +int +platform_ipi_hardintr_num(void) +{ + + return (-1); +} + +int +platform_ipi_softintr_num(void) +{ + + return (1); +} + +void +platform_init_ap(int cpuid) +{ + uint32_t clock_int_mask; + uint32_t ipi_intr_mask; + + /* + * Clear any pending IPIs. + */ + platform_ipi_clear(); + + /* + * Unmask the clock and ipi interrupts. 
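The set_intr_mask() call that follows composes the AP's initial interrupt mask from exactly two sources: the soft interrupt used for IPIs (number 1, i.e. C_SW1) and hard interrupt 5, the CPU clock. Given the cause-bit layout defined earlier in this file, a small sketch of the mask arithmetic; the soft_int_mask()/hard_int_mask() definitions below are modeled on the usual MIPS helpers and are an assumption, not taken from this diff:

#include <stdio.h>
#include <stdint.h>

/* Cause-register bit layout from the diff: SW0..SW1, then IRQ0..IRQ5. */
#define C_SW1	(1 << 9)
#define C_IRQ5	(1 << 15)

/* Modeled on the usual MIPS helpers; assumed here. */
#define soft_int_mask(n)	(1 << ((n) + 8))
#define hard_int_mask(n)	(1 << ((n) + 10))

int
main(void)
{
	uint32_t ipi_mask, clock_mask;

	ipi_mask = soft_int_mask(1);	/* platform_ipi_softintr_num() */
	clock_mask = hard_int_mask(5);	/* the clock interrupt */
	printf("C_SW1 ok: %d, C_IRQ5 ok: %d, mask = %#x\n",
	    ipi_mask == C_SW1, clock_mask == C_IRQ5,
	    ipi_mask | clock_mask);
	return (0);
}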
+	 */
+	ipi_intr_mask = soft_int_mask(platform_ipi_softintr_num());
+	clock_int_mask = hard_int_mask(5);
+	set_intr_mask(ipi_intr_mask | clock_int_mask);
+
+	mips_wbflush();
+}
+
+void
+platform_cpu_mask(cpuset_t *mask)
+{
+	uint32_t i, m;
+
+	CPU_ZERO(mask);
+	for (i = 0, m = 1 ; i < MALTA_MAXCPU; i++, m <<= 1)
+		CPU_SET(i, mask);
+}
+
+struct cpu_group *
+platform_smp_topo(void)
+{
+
+	return (smp_topo_none());
+}
+
+int
+platform_start_ap(int cpuid)
+{
+	int timeout;
+
+	/* Publish the target cpuid; the AP spins in malta_ap_wait. */
+	if (atomic_cmpset_32(&malta_ap_boot, ~0, cpuid) == 0)
+		return (-1);
+
+	printf("Waiting for cpu%d to start\n", cpuid);
+
+	timeout = 100;
+	do {
+		DELAY(1000);
+		/* The AP acks by storing 0; rearm to ~0 for the next AP. */
+		if (atomic_cmpset_32(&malta_ap_boot, 0, ~0) != 0) {
+			printf("CPU %d started\n", cpuid);
+			return (0);
+		}
+	} while (timeout--);
+
+	printf("CPU %d failed to start\n", cpuid);
+
+	return (-1);
+}

Property changes on: projects/clang390-import/sys/mips/malta/malta_mp.c
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property

Index: projects/clang390-import/sys/mips/malta/std.malta
===================================================================
--- projects/clang390-import/sys/mips/malta/std.malta	(revision 305686)
+++ projects/clang390-import/sys/mips/malta/std.malta	(revision 305687)
@@ -1,11 +1,11 @@
# $FreeBSD$
files	"../malta/files.malta"
-cpu	CPU_MIPS4KC
+cpu	CPU_MALTA
device	pci
device	ata
device	scbus	# SCSI bus (required for ATA/SCSI)
device	cd	# CD
device	da	# Direct Access (disks)
device	pass	# Passthrough device (direct ATA/SCSI access)

Index: projects/clang390-import/sys/mips/mips/locore.S
===================================================================
--- projects/clang390-import/sys/mips/mips/locore.S	(revision 305686)
+++ projects/clang390-import/sys/mips/mips/locore.S	(revision 305687)
@@ -1,187 +1,202 @@
/*	$OpenBSD: locore.S,v 1.18 1998/09/15 10:58:53 pefo Exp $	*/

/*-
 * Copyright (c) 1992, 1993
 *	The Regents of the University of California.  All rights reserved.
 *
 * This code is derived from software contributed to Berkeley by
 * Digital Equipment Corporation and Ralph Campbell.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * Copyright (C) 1989 Digital Equipment Corporation. * Permission to use, copy, modify, and distribute this software and * its documentation for any purpose and without fee is hereby granted, * provided that the above copyright notice appears in all copies. * Digital Equipment Corporation makes no representations about the * suitability of this software for any purpose. It is provided "as is" * without express or implied warranty. * * from: Header: /sprite/src/kernel/mach/ds3100.md/RCS/loMem.s, * v 1.1 89/07/11 17:55:04 nelson Exp SPRITE (DECWRL) * from: Header: /sprite/src/kernel/mach/ds3100.md/RCS/machAsm.s, * v 9.2 90/01/29 18:00:39 shirriff Exp SPRITE (DECWRL) * from: Header: /sprite/src/kernel/vm/ds3100.md/vmPmaxAsm.s, * v 1.1 89/07/10 14:27:41 nelson Exp SPRITE (DECWRL) * * from: @(#)locore.s 8.5 (Berkeley) 1/4/94 * JNPR: locore.S,v 1.6.2.1 2007/08/29 12:24:49 girish * $FreeBSD$ */ /* * FREEBSD_DEVELOPERS_FIXME * The start routine below was written for a multi-core CPU * with each core being hyperthreaded. This serves as an example * for a complex CPU architecture. For a different CPU complex * please make necessary changes to read CPU-ID etc. * A clean solution would be to have a different locore file for * each CPU type. */ /* * Contains code that is the first executed at boot time plus * assembly language support routines. */ #include #include #include #include #include "assym.s" .data #ifdef YAMON GLOBAL(fenvp) .space 4 # Assumes mips32? Is that OK? #endif .set noreorder .text GLOBAL(btext) ASM_ENTRY(_start) VECTOR(_locore, unknown) /* UNSAFE TO USE a0..a3, need to preserve the args from boot loader */ mtc0 zero, MIPS_COP_0_CAUSE # Clear soft interrupts #if defined(CPU_CNMIPS) /* * t1: Bits to set explicitly: * Enable FPU */ /* Set these bits */ li t1, (MIPS_SR_COP_0_BIT | MIPS_SR_PX | MIPS_SR_KX | MIPS_SR_UX | MIPS_SR_SX | MIPS_SR_BEV) /* Reset these bits */ li t0, ~(MIPS_SR_DE | MIPS_SR_SR | MIPS_SR_ERL | MIPS_SR_EXL | MIPS_SR_INT_IE | MIPS_SR_COP_2_BIT) #elif defined (CPU_RMI) || defined (CPU_NLM) /* Set these bits */ li t1, (MIPS_SR_COP_2_BIT | MIPS_SR_COP_0_BIT | MIPS_SR_KX | MIPS_SR_UX) /* Reset these bits */ li t0, ~(MIPS_SR_BEV | MIPS_SR_SR | MIPS_SR_INT_IE) #else /* * t0: Bits to preserve if set: * Soft reset * Boot exception vectors (firmware-provided) */ li t0, (MIPS_SR_BEV | MIPS_SR_SR) /* * t1: Bits to set explicitly: * Enable FPU */ li t1, MIPS_SR_COP_1_BIT #ifdef __mips_n64 or t1, MIPS_SR_KX | MIPS_SR_SX | MIPS_SR_UX #endif #endif /* * Read coprocessor 0 status register, clear bits not * preserved (namely, clearing interrupt bits), and set * bits we want to explicitly set. */ mfc0 t2, MIPS_COP_0_STATUS and t2, t0 or t2, t1 mtc0 t2, MIPS_COP_0_STATUS COP0_SYNC /* Make sure KSEG0 is cached */ li t0, MIPS_CCA_CACHED mtc0 t0, MIPS_COP_0_CONFIG COP0_SYNC /*xxximp * now that we pass a0...a3 to the platform_init routine, do we need * to stash this stuff here? 
*/ #ifdef YAMON /* Save YAMON boot environment pointer */ sw a2, _C_LABEL(fenvp) #endif #if defined(CPU_CNMIPS) && defined(SMP) .set push .set mips32r2 rdhwr t2, $0 beqz t2, 1f nop j octeon_ap_wait nop .set pop 1: #endif +#if defined(CPU_MALTA) && defined(SMP) + .set push + .set mips32r2 + jal enable_mvp + nop + jal platform_processor_id + nop + beqz v0, 1f + nop + j malta_ap_wait + nop + .set pop +1: +#endif + /* * Initialize stack and call machine startup. */ PTR_LA sp, _C_LABEL(pcpu_space) PTR_ADDU sp, (PAGE_SIZE * 2) - CALLFRAME_SIZ REG_S zero, CALLFRAME_RA(sp) # Zero out old ra for debugger REG_S zero, CALLFRAME_SP(sp) # Zero out old fp for debugger PTR_LA gp, _C_LABEL(_gp) /* Call the platform-specific startup code. */ jal _C_LABEL(platform_start) nop PTR_LA sp, _C_LABEL(thread0_st) PTR_L a0, TD_PCB(sp) REG_LI t0, ~7 and a0, a0, t0 PTR_SUBU sp, a0, CALLFRAME_SIZ jal _C_LABEL(mi_startup) # mi_startup(frame) sw zero, CALLFRAME_SIZ - 8(sp) # Zero out old fp for debugger PANIC("Startup failed!") VECTOR_END(_locore) Index: projects/clang390-import/sys/net80211/ieee80211_freebsd.c =================================================================== --- projects/clang390-import/sys/net80211/ieee80211_freebsd.c (revision 305686) +++ projects/clang390-import/sys/net80211/ieee80211_freebsd.c (revision 305687) @@ -1,936 +1,972 @@ /*- * Copyright (c) 2003-2009 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); /* * IEEE 802.11 support (FreeBSD-specific code) */ #include "opt_wlan.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include SYSCTL_NODE(_net, OID_AUTO, wlan, CTLFLAG_RD, 0, "IEEE 80211 parameters"); #ifdef IEEE80211_DEBUG static int ieee80211_debug = 0; SYSCTL_INT(_net_wlan, OID_AUTO, debug, CTLFLAG_RW, &ieee80211_debug, 0, "debugging printfs"); #endif static MALLOC_DEFINE(M_80211_COM, "80211com", "802.11 com state"); static const char wlanname[] = "wlan"; static struct if_clone *wlan_cloner; static int wlan_clone_create(struct if_clone *ifc, int unit, caddr_t params) { struct ieee80211_clone_params cp; struct ieee80211vap *vap; struct ieee80211com *ic; int error; error = copyin(params, &cp, sizeof(cp)); if (error) return error; ic = ieee80211_find_com(cp.icp_parent); if (ic == NULL) return ENXIO; if (cp.icp_opmode >= IEEE80211_OPMODE_MAX) { ic_printf(ic, "%s: invalid opmode %d\n", __func__, cp.icp_opmode); return EINVAL; } if ((ic->ic_caps & ieee80211_opcap[cp.icp_opmode]) == 0) { ic_printf(ic, "%s mode not supported\n", ieee80211_opmode_name[cp.icp_opmode]); return EOPNOTSUPP; } if ((cp.icp_flags & IEEE80211_CLONE_TDMA) && #ifdef IEEE80211_SUPPORT_TDMA (ic->ic_caps & IEEE80211_C_TDMA) == 0 #else (1) #endif ) { ic_printf(ic, "TDMA not supported\n"); return EOPNOTSUPP; } vap = ic->ic_vap_create(ic, wlanname, unit, cp.icp_opmode, cp.icp_flags, cp.icp_bssid, cp.icp_flags & IEEE80211_CLONE_MACADDR ? cp.icp_macaddr : ic->ic_macaddr); return (vap == NULL ? EIO : 0); } static void wlan_clone_destroy(struct ifnet *ifp) { struct ieee80211vap *vap = ifp->if_softc; struct ieee80211com *ic = vap->iv_ic; ic->ic_vap_delete(vap); } void ieee80211_vap_destroy(struct ieee80211vap *vap) { CURVNET_SET(vap->iv_ifp->if_vnet); if_clone_destroyif(wlan_cloner, vap->iv_ifp); CURVNET_RESTORE(); } int ieee80211_sysctl_msecs_ticks(SYSCTL_HANDLER_ARGS) { int msecs = ticks_to_msecs(*(int *)arg1); int error, t; error = sysctl_handle_int(oidp, &msecs, 0, req); if (error || !req->newptr) return error; t = msecs_to_ticks(msecs); *(int *)arg1 = (t < 1) ? 
1 : t; return 0; } static int ieee80211_sysctl_inact(SYSCTL_HANDLER_ARGS) { int inact = (*(int *)arg1) * IEEE80211_INACT_WAIT; int error; error = sysctl_handle_int(oidp, &inact, 0, req); if (error || !req->newptr) return error; *(int *)arg1 = inact / IEEE80211_INACT_WAIT; return 0; } static int ieee80211_sysctl_parent(SYSCTL_HANDLER_ARGS) { struct ieee80211com *ic = arg1; return SYSCTL_OUT_STR(req, ic->ic_name); } static int ieee80211_sysctl_radar(SYSCTL_HANDLER_ARGS) { struct ieee80211com *ic = arg1; int t = 0, error; error = sysctl_handle_int(oidp, &t, 0, req); if (error || !req->newptr) return error; IEEE80211_LOCK(ic); ieee80211_dfs_notify_radar(ic, ic->ic_curchan); IEEE80211_UNLOCK(ic); return 0; } void ieee80211_sysctl_attach(struct ieee80211com *ic) { } void ieee80211_sysctl_detach(struct ieee80211com *ic) { } void ieee80211_sysctl_vattach(struct ieee80211vap *vap) { struct ifnet *ifp = vap->iv_ifp; struct sysctl_ctx_list *ctx; struct sysctl_oid *oid; char num[14]; /* sufficient for 32 bits */ ctx = (struct sysctl_ctx_list *) IEEE80211_MALLOC(sizeof(struct sysctl_ctx_list), M_DEVBUF, IEEE80211_M_NOWAIT | IEEE80211_M_ZERO); if (ctx == NULL) { if_printf(ifp, "%s: cannot allocate sysctl context!\n", __func__); return; } sysctl_ctx_init(ctx); snprintf(num, sizeof(num), "%u", ifp->if_dunit); oid = SYSCTL_ADD_NODE(ctx, &SYSCTL_NODE_CHILDREN(_net, wlan), OID_AUTO, num, CTLFLAG_RD, NULL, ""); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "%parent", CTLTYPE_STRING | CTLFLAG_RD, vap->iv_ic, 0, ieee80211_sysctl_parent, "A", "parent device"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "driver_caps", CTLFLAG_RW, &vap->iv_caps, 0, "driver capabilities"); #ifdef IEEE80211_DEBUG vap->iv_debug = ieee80211_debug; SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "debug", CTLFLAG_RW, &vap->iv_debug, 0, "control debugging printfs"); #endif SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "bmiss_max", CTLFLAG_RW, &vap->iv_bmiss_max, 0, "consecutive beacon misses before scanning"); /* XXX inherit from tunables */ SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "inact_run", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_run, 0, ieee80211_sysctl_inact, "I", "station inactivity timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "inact_probe", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_probe, 0, ieee80211_sysctl_inact, "I", "station inactivity probe timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "inact_auth", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_auth, 0, ieee80211_sysctl_inact, "I", "station authentication timeout (sec)"); SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "inact_init", CTLTYPE_INT | CTLFLAG_RW, &vap->iv_inact_init, 0, ieee80211_sysctl_inact, "I", "station initial state timeout (sec)"); if (vap->iv_htcaps & IEEE80211_HTC_HT) { SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_bk", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_BK], 0, "BK traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_be", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_BE], 0, "BE traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_vo", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_VO], 0, "VO traffic tx aggr threshold (pps)"); SYSCTL_ADD_UINT(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "ampdu_mintraffic_vi", CTLFLAG_RW, &vap->iv_ampdu_mintraffic[WME_AC_VI], 0, "VI traffic tx aggr threshold (pps)"); } if (vap->iv_caps & IEEE80211_C_DFS) { 
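		/* Writing any value fakes a radar event on the current channel. */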
SYSCTL_ADD_PROC(ctx, SYSCTL_CHILDREN(oid), OID_AUTO, "radar", CTLTYPE_INT | CTLFLAG_RW, vap->iv_ic, 0, ieee80211_sysctl_radar, "I", "simulate radar event"); } vap->iv_sysctl = ctx; vap->iv_oid = oid; } void ieee80211_sysctl_vdetach(struct ieee80211vap *vap) { if (vap->iv_sysctl != NULL) { sysctl_ctx_free(vap->iv_sysctl); IEEE80211_FREE(vap->iv_sysctl, M_DEVBUF); vap->iv_sysctl = NULL; } } int ieee80211_node_dectestref(struct ieee80211_node *ni) { /* XXX need equivalent of atomic_dec_and_test */ atomic_subtract_int(&ni->ni_refcnt, 1); return atomic_cmpset_int(&ni->ni_refcnt, 0, 1); } void ieee80211_drain_ifq(struct ifqueue *ifq) { struct ieee80211_node *ni; struct mbuf *m; for (;;) { IF_DEQUEUE(ifq, m); if (m == NULL) break; ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; KASSERT(ni != NULL, ("frame w/o node")); ieee80211_free_node(ni); m->m_pkthdr.rcvif = NULL; m_freem(m); } } void ieee80211_flush_ifq(struct ifqueue *ifq, struct ieee80211vap *vap) { struct ieee80211_node *ni; struct mbuf *m, **mprev; IF_LOCK(ifq); mprev = &ifq->ifq_head; while ((m = *mprev) != NULL) { ni = (struct ieee80211_node *)m->m_pkthdr.rcvif; if (ni != NULL && ni->ni_vap == vap) { *mprev = m->m_nextpkt; /* remove from list */ ifq->ifq_len--; m_freem(m); ieee80211_free_node(ni); /* reclaim ref */ } else mprev = &m->m_nextpkt; } /* recalculate tail ptr */ m = ifq->ifq_head; for (; m != NULL && m->m_nextpkt != NULL; m = m->m_nextpkt) ; ifq->ifq_tail = m; IF_UNLOCK(ifq); } /* * As above, for mbufs allocated with m_gethdr/MGETHDR * or initialized by M_COPY_PKTHDR. */ #define MC_ALIGN(m, len) \ do { \ (m)->m_data += rounddown2(MCLBYTES - (len), sizeof(long)); \ } while (/* CONSTCOND */ 0) /* * Allocate and setup a management frame of the specified * size. We return the mbuf and a pointer to the start * of the contiguous data area that's been reserved based * on the packet length. The data area is forced to 32-bit * alignment and the buffer length to a multiple of 4 bytes. * This is done mainly so beacon frames (that require this) * can use this interface too. */ struct mbuf * ieee80211_getmgtframe(uint8_t **frm, int headroom, int pktlen) { struct mbuf *m; u_int len; /* * NB: we know the mbuf routines will align the data area * so we don't need to do anything special. */ len = roundup2(headroom + pktlen, 4); KASSERT(len <= MCLBYTES, ("802.11 mgt frame too large: %u", len)); if (len < MINCLSIZE) { m = m_gethdr(M_NOWAIT, MT_DATA); /* * Align the data in case additional headers are added. * This should only happen when a WEP header is added * which only happens for shared key authentication mgt * frames which all fit in MHLEN. */ if (m != NULL) M_ALIGN(m, len); } else { m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR); if (m != NULL) MC_ALIGN(m, len); } if (m != NULL) { m->m_data += headroom; *frm = m->m_data; } return m; } #ifndef __NO_STRICT_ALIGNMENT /* * Re-align the payload in the mbuf. This is mainly used (right now) * to handle IP header alignment requirements on certain architectures. */ struct mbuf * ieee80211_realign(struct ieee80211vap *vap, struct mbuf *m, size_t align) { int pktlen, space; struct mbuf *n; pktlen = m->m_pkthdr.len; space = pktlen + align; if (space < MINCLSIZE) n = m_gethdr(M_NOWAIT, MT_DATA); else { n = m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, space <= MCLBYTES ? MCLBYTES : #if MJUMPAGESIZE != MCLBYTES space <= MJUMPAGESIZE ? MJUMPAGESIZE : #endif space <= MJUM9BYTES ? 
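		    /* choose the smallest cluster that fits: 2k, page-size, 9k, else 16k */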
MJUM9BYTES : MJUM16BYTES); } if (__predict_true(n != NULL)) { m_move_pkthdr(n, m); n->m_data = (caddr_t)(ALIGN(n->m_data + align) - align); m_copydata(m, 0, pktlen, mtod(n, caddr_t)); n->m_len = pktlen; } else { IEEE80211_DISCARD(vap, IEEE80211_MSG_ANY, mtod(m, const struct ieee80211_frame *), NULL, "%s", "no mbuf to realign"); vap->iv_stats.is_rx_badalign++; } m_freem(m); return n; } #endif /* !__NO_STRICT_ALIGNMENT */ int ieee80211_add_callback(struct mbuf *m, void (*func)(struct ieee80211_node *, void *, int), void *arg) { struct m_tag *mtag; struct ieee80211_cb *cb; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_CALLBACK, sizeof(struct ieee80211_cb), M_NOWAIT); if (mtag == NULL) return 0; cb = (struct ieee80211_cb *)(mtag+1); cb->func = func; cb->arg = arg; m_tag_prepend(m, mtag); m->m_flags |= M_TXCB; return 1; } int ieee80211_add_xmit_params(struct mbuf *m, const struct ieee80211_bpf_params *params) { struct m_tag *mtag; struct ieee80211_tx_params *tx; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_XMIT_PARAMS, sizeof(struct ieee80211_tx_params), M_NOWAIT); if (mtag == NULL) return (0); tx = (struct ieee80211_tx_params *)(mtag+1); memcpy(&tx->params, params, sizeof(struct ieee80211_bpf_params)); m_tag_prepend(m, mtag); return (1); } int ieee80211_get_xmit_params(struct mbuf *m, struct ieee80211_bpf_params *params) { struct m_tag *mtag; struct ieee80211_tx_params *tx; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_XMIT_PARAMS, NULL); if (mtag == NULL) return (-1); tx = (struct ieee80211_tx_params *)(mtag + 1); memcpy(params, &tx->params, sizeof(struct ieee80211_bpf_params)); return (0); } void ieee80211_process_callback(struct ieee80211_node *ni, struct mbuf *m, int status) { struct m_tag *mtag; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_CALLBACK, NULL); if (mtag != NULL) { struct ieee80211_cb *cb = (struct ieee80211_cb *)(mtag+1); cb->func(ni, cb->arg, status); } } /* * Add RX parameters to the given mbuf. * * Returns 1 if OK, 0 on error. */ int ieee80211_add_rx_params(struct mbuf *m, const struct ieee80211_rx_stats *rxs) { struct m_tag *mtag; struct ieee80211_rx_params *rx; mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_RECV_PARAMS, sizeof(struct ieee80211_rx_stats), M_NOWAIT); if (mtag == NULL) return (0); rx = (struct ieee80211_rx_params *)(mtag + 1); memcpy(&rx->params, rxs, sizeof(*rxs)); m_tag_prepend(m, mtag); return (1); } int ieee80211_get_rx_params(struct mbuf *m, struct ieee80211_rx_stats *rxs) { struct m_tag *mtag; struct ieee80211_rx_params *rx; mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_RECV_PARAMS, NULL); if (mtag == NULL) return (-1); rx = (struct ieee80211_rx_params *)(mtag + 1); memcpy(rxs, &rx->params, sizeof(*rxs)); return (0); } /* + * Add TOA parameters to the given mbuf. 
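+ *
+ * Like the xmit/recv parameters above, the data rides on the mbuf as
+ * an m_tag so it survives queueing; both helpers return 1 on success
+ * and 0 otherwise.  A hypothetical caller (local names illustrative
+ * only) would do:
+ *
+ *	struct ieee80211_toa_params tp = { .request_id = my_id };
+ *	if (ieee80211_add_toa_params(m, &tp) == 0)
+ *		m_freem(m);	-- no memory for the tag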
+ */
+int
+ieee80211_add_toa_params(struct mbuf *m, const struct ieee80211_toa_params *p)
+{
+	struct m_tag *mtag;
+	struct ieee80211_toa_params *rp;
+
+	mtag = m_tag_alloc(MTAG_ABI_NET80211, NET80211_TAG_TOA_PARAMS,
+	    sizeof(struct ieee80211_toa_params), M_NOWAIT);
+	if (mtag == NULL)
+		return (0);
+
+	rp = (struct ieee80211_toa_params *)(mtag + 1);
+	memcpy(rp, p, sizeof(*rp));
+	m_tag_prepend(m, mtag);
+	return (1);
+}
+
+int
+ieee80211_get_toa_params(struct mbuf *m, struct ieee80211_toa_params *p)
+{
+	struct m_tag *mtag;
+	struct ieee80211_toa_params *rp;
+
+	mtag = m_tag_locate(m, MTAG_ABI_NET80211, NET80211_TAG_TOA_PARAMS,
+	    NULL);
+	if (mtag == NULL)
+		return (0);
+	rp = (struct ieee80211_toa_params *)(mtag + 1);
+	if (p != NULL)
+		memcpy(p, rp, sizeof(*p));
+	return (1);
+}
+
+/*
 * Transmit a frame to the parent interface.
 */
int
ieee80211_parent_xmitpkt(struct ieee80211com *ic, struct mbuf *m)
{
	int error;

	/*
	 * Assert the IC TX lock is held - this enforces the
	 * processing -> queuing order is maintained
	 */
	IEEE80211_TX_LOCK_ASSERT(ic);
	error = ic->ic_transmit(ic, m);
	if (error) {
		struct ieee80211_node *ni;

		ni = (struct ieee80211_node *)m->m_pkthdr.rcvif;

		/* XXX number of fragments */
		if_inc_counter(ni->ni_vap->iv_ifp, IFCOUNTER_OERRORS, 1);
		ieee80211_free_node(ni);
		ieee80211_free_mbuf(m);
	}
	return (error);
}

/*
 * Transmit a frame to the VAP interface.
 */
int
ieee80211_vap_xmitpkt(struct ieee80211vap *vap, struct mbuf *m)
{
	struct ifnet *ifp = vap->iv_ifp;

	/*
	 * When transmitting via the VAP, we shouldn't hold
	 * any IC TX lock as the VAP TX path will acquire it.
	 */
	IEEE80211_TX_UNLOCK_ASSERT(vap->iv_ic);

	return (ifp->if_transmit(ifp, m));
}

#include

void
get_random_bytes(void *p, size_t n)
{
	uint8_t *dp = p;

	while (n > 0) {
		uint32_t v = arc4random();
		size_t nb = n > sizeof(uint32_t) ? sizeof(uint32_t) : n;

		bcopy(&v, dp, nb);
		dp += sizeof(uint32_t), n -= nb;
	}
}

/*
 * Helper function for events that pass just a single mac address.
 */
static void
notify_macaddr(struct ifnet *ifp, int op, const uint8_t mac[IEEE80211_ADDR_LEN])
{
	struct ieee80211_join_event iev;

	CURVNET_SET(ifp->if_vnet);
	memset(&iev, 0, sizeof(iev));
	IEEE80211_ADDR_COPY(iev.iev_addr, mac);
	rt_ieee80211msg(ifp, op, &iev, sizeof(iev));
	CURVNET_RESTORE();
}

void
ieee80211_notify_node_join(struct ieee80211_node *ni, int newassoc)
{
	struct ieee80211vap *vap = ni->ni_vap;
	struct ifnet *ifp = vap->iv_ifp;

	CURVNET_SET_QUIET(ifp->if_vnet);
	IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%snode join",
	    (ni == vap->iv_bss) ? "bss " : "");
	if (ni == vap->iv_bss) {
		notify_macaddr(ifp, newassoc ?
		    RTM_IEEE80211_ASSOC : RTM_IEEE80211_REASSOC, ni->ni_bssid);
		if_link_state_change(ifp, LINK_STATE_UP);
	} else {
		notify_macaddr(ifp, newassoc ?
		    RTM_IEEE80211_JOIN : RTM_IEEE80211_REJOIN, ni->ni_macaddr);
	}
	CURVNET_RESTORE();
}

void
ieee80211_notify_node_leave(struct ieee80211_node *ni)
{
	struct ieee80211vap *vap = ni->ni_vap;
	struct ifnet *ifp = vap->iv_ifp;

	CURVNET_SET_QUIET(ifp->if_vnet);
	IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%snode leave",
	    (ni == vap->iv_bss) ? "bss " : "");
	if (ni == vap->iv_bss) {
		rt_ieee80211msg(ifp, RTM_IEEE80211_DISASSOC, NULL, 0);
		if_link_state_change(ifp, LINK_STATE_DOWN);
	} else {
		/* fire off wireless event station leaving */
		notify_macaddr(ifp, RTM_IEEE80211_LEAVE, ni->ni_macaddr);
	}
	CURVNET_RESTORE();
}

void
ieee80211_notify_scan_done(struct ieee80211vap *vap)
{
	struct ifnet *ifp = vap->iv_ifp;

	IEEE80211_DPRINTF(vap, IEEE80211_MSG_SCAN, "%s\n", "notify scan done");

	/* dispatch wireless event indicating scan completed */
	CURVNET_SET(ifp->if_vnet);
	rt_ieee80211msg(ifp, RTM_IEEE80211_SCAN, NULL, 0);
	CURVNET_RESTORE();
}

void
ieee80211_notify_replay_failure(struct ieee80211vap *vap,
	const struct ieee80211_frame *wh, const struct ieee80211_key *k,
	u_int64_t rsc, int tid)
{
	struct ifnet *ifp = vap->iv_ifp;

	IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_CRYPTO, wh->i_addr2,
	    "%s replay detected tid %d <rsc %ju, csc %ju, keyix %u rxkeyix %u>",
	    k->wk_cipher->ic_name, tid,
	    (intmax_t) rsc,
	    (intmax_t) k->wk_keyrsc[tid],
	    k->wk_keyix, k->wk_rxkeyix);

	if (ifp != NULL) {		/* NB: for cipher test modules */
		struct ieee80211_replay_event iev;

		IEEE80211_ADDR_COPY(iev.iev_dst, wh->i_addr1);
		IEEE80211_ADDR_COPY(iev.iev_src, wh->i_addr2);
		iev.iev_cipher = k->wk_cipher->ic_cipher;
		if (k->wk_rxkeyix != IEEE80211_KEYIX_NONE)
			iev.iev_keyix = k->wk_rxkeyix;
		else
			iev.iev_keyix = k->wk_keyix;
		iev.iev_keyrsc = k->wk_keyrsc[tid];
		iev.iev_rsc = rsc;
		CURVNET_SET(ifp->if_vnet);
		rt_ieee80211msg(ifp, RTM_IEEE80211_REPLAY, &iev, sizeof(iev));
		CURVNET_RESTORE();
	}
}

void
ieee80211_notify_michael_failure(struct ieee80211vap *vap,
	const struct ieee80211_frame *wh, u_int keyix)
{
	struct ifnet *ifp = vap->iv_ifp;

	IEEE80211_NOTE_MAC(vap, IEEE80211_MSG_CRYPTO, wh->i_addr2,
	    "michael MIC verification failed <keyix %u>", keyix);
	vap->iv_stats.is_rx_tkipmic++;

	if (ifp != NULL) {		/* NB: for cipher test modules */
		struct ieee80211_michael_event iev;

		IEEE80211_ADDR_COPY(iev.iev_dst, wh->i_addr1);
		IEEE80211_ADDR_COPY(iev.iev_src, wh->i_addr2);
		iev.iev_cipher = IEEE80211_CIPHER_TKIP;
		iev.iev_keyix = keyix;
		CURVNET_SET(ifp->if_vnet);
		rt_ieee80211msg(ifp, RTM_IEEE80211_MICHAEL, &iev, sizeof(iev));
		CURVNET_RESTORE();
	}
}

void
ieee80211_notify_wds_discover(struct ieee80211_node *ni)
{
	struct ieee80211vap *vap = ni->ni_vap;
	struct ifnet *ifp = vap->iv_ifp;

	notify_macaddr(ifp, RTM_IEEE80211_WDS, ni->ni_macaddr);
}

void
ieee80211_notify_csa(struct ieee80211com *ic,
	const struct ieee80211_channel *c, int mode, int count)
{
	struct ieee80211_csa_event iev;
	struct ieee80211vap *vap;
	struct ifnet *ifp;

	memset(&iev, 0, sizeof(iev));
	iev.iev_flags = c->ic_flags;
	iev.iev_freq = c->ic_freq;
	iev.iev_ieee = c->ic_ieee;
	iev.iev_mode = mode;
	iev.iev_count = count;
	TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) {
		ifp = vap->iv_ifp;
		CURVNET_SET(ifp->if_vnet);
		rt_ieee80211msg(ifp, RTM_IEEE80211_CSA, &iev, sizeof(iev));
		CURVNET_RESTORE();
	}
}

void
ieee80211_notify_radar(struct ieee80211com *ic,
	const struct ieee80211_channel *c)
{
	struct ieee80211_radar_event iev;
	struct ieee80211vap *vap;
	struct ifnet *ifp;

	memset(&iev, 0, sizeof(iev));
	iev.iev_flags = c->ic_flags;
	iev.iev_freq = c->ic_freq;
	iev.iev_ieee = c->ic_ieee;
	TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) {
		ifp = vap->iv_ifp;
		CURVNET_SET(ifp->if_vnet);
		rt_ieee80211msg(ifp, RTM_IEEE80211_RADAR, &iev, sizeof(iev));
		CURVNET_RESTORE();
	}
}

void
ieee80211_notify_cac(struct ieee80211com *ic,
	const struct ieee80211_channel *c, enum ieee80211_notify_cac_event type)
{
	struct ieee80211_cac_event iev;
	struct ieee80211vap *vap;
	struct ifnet *ifp;

	memset(&iev, 0, sizeof(iev));
	iev.iev_flags = c->ic_flags;
	iev.iev_freq
= c->ic_freq; iev.iev_ieee = c->ic_ieee; iev.iev_type = type; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_CAC, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_notify_node_deauth(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%s", "node deauth"); notify_macaddr(ifp, RTM_IEEE80211_DEAUTH, ni->ni_macaddr); } void ieee80211_notify_node_auth(struct ieee80211_node *ni) { struct ieee80211vap *vap = ni->ni_vap; struct ifnet *ifp = vap->iv_ifp; IEEE80211_NOTE(vap, IEEE80211_MSG_NODE, ni, "%s", "node auth"); notify_macaddr(ifp, RTM_IEEE80211_AUTH, ni->ni_macaddr); } void ieee80211_notify_country(struct ieee80211vap *vap, const uint8_t bssid[IEEE80211_ADDR_LEN], const uint8_t cc[2]) { struct ifnet *ifp = vap->iv_ifp; struct ieee80211_country_event iev; memset(&iev, 0, sizeof(iev)); IEEE80211_ADDR_COPY(iev.iev_addr, bssid); iev.iev_cc[0] = cc[0]; iev.iev_cc[1] = cc[1]; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_COUNTRY, &iev, sizeof(iev)); CURVNET_RESTORE(); } void ieee80211_notify_radio(struct ieee80211com *ic, int state) { struct ieee80211_radio_event iev; struct ieee80211vap *vap; struct ifnet *ifp; memset(&iev, 0, sizeof(iev)); iev.iev_state = state; TAILQ_FOREACH(vap, &ic->ic_vaps, iv_next) { ifp = vap->iv_ifp; CURVNET_SET(ifp->if_vnet); rt_ieee80211msg(ifp, RTM_IEEE80211_RADIO, &iev, sizeof(iev)); CURVNET_RESTORE(); } } void ieee80211_load_module(const char *modname) { #ifdef notyet (void)kern_kldload(curthread, modname, NULL); #else printf("%s: load the %s module by hand for now.\n", __func__, modname); #endif } static eventhandler_tag wlan_bpfevent; static eventhandler_tag wlan_ifllevent; static void bpf_track(void *arg, struct ifnet *ifp, int dlt, int attach) { /* NB: identify vap's by if_init */ if (dlt == DLT_IEEE802_11_RADIO && ifp->if_init == ieee80211_init) { struct ieee80211vap *vap = ifp->if_softc; /* * Track bpf radiotap listener state. We mark the vap * to indicate if any listener is present and the com * to indicate if any listener exists on any associated * vap. This flag is used by drivers to prepare radiotap * state only when needed. */ if (attach) { ieee80211_syncflag_ext(vap, IEEE80211_FEXT_BPF); if (vap->iv_opmode == IEEE80211_M_MONITOR) atomic_add_int(&vap->iv_ic->ic_montaps, 1); } else if (!bpf_peers_present(vap->iv_rawbpf)) { ieee80211_syncflag_ext(vap, -IEEE80211_FEXT_BPF); if (vap->iv_opmode == IEEE80211_M_MONITOR) atomic_subtract_int(&vap->iv_ic->ic_montaps, 1); } } } /* * Change MAC address on the vap (if was not started). */ static void wlan_iflladdr(void *arg __unused, struct ifnet *ifp) { /* NB: identify vap's by if_init */ if (ifp->if_init == ieee80211_init && (ifp->if_flags & IFF_UP) == 0) { struct ieee80211vap *vap = ifp->if_softc; IEEE80211_ADDR_COPY(vap->iv_myaddr, IF_LLADDR(ifp)); } } /* * Module glue. * * NB: the module name is "wlan" for compatibility with NetBSD. 
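 *
 * The cloner registered below is what backs the usual administrative
 * flow, e.g.:
 *
 *	ifconfig wlan0 create wlandev ath0 wlanmode sta
 *
 * ifconfig(8) packs the parent name, opmode and flags into an
 * ieee80211_clone_params that wlan_clone_create() above validates
 * against the parent device's capabilities.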
*/ static int wlan_modevent(module_t mod, int type, void *unused) { switch (type) { case MOD_LOAD: if (bootverbose) printf("wlan: <802.11 Link Layer>\n"); wlan_bpfevent = EVENTHANDLER_REGISTER(bpf_track, bpf_track, 0, EVENTHANDLER_PRI_ANY); wlan_ifllevent = EVENTHANDLER_REGISTER(iflladdr_event, wlan_iflladdr, NULL, EVENTHANDLER_PRI_ANY); wlan_cloner = if_clone_simple(wlanname, wlan_clone_create, wlan_clone_destroy, 0); return 0; case MOD_UNLOAD: if_clone_detach(wlan_cloner); EVENTHANDLER_DEREGISTER(bpf_track, wlan_bpfevent); EVENTHANDLER_DEREGISTER(iflladdr_event, wlan_ifllevent); return 0; } return EINVAL; } static moduledata_t wlan_mod = { wlanname, wlan_modevent, 0 }; DECLARE_MODULE(wlan, wlan_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST); MODULE_VERSION(wlan, 1); MODULE_DEPEND(wlan, ether, 1, 1, 1); #ifdef IEEE80211_ALQ MODULE_DEPEND(wlan, alq, 1, 1, 1); #endif /* IEEE80211_ALQ */ Index: projects/clang390-import/sys/net80211/ieee80211_freebsd.h =================================================================== --- projects/clang390-import/sys/net80211/ieee80211_freebsd.h (revision 305686) +++ projects/clang390-import/sys/net80211/ieee80211_freebsd.h (revision 305687) @@ -1,675 +1,685 @@ /*- * Copyright (c) 2003-2008 Sam Leffler, Errno Consulting * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * $FreeBSD$ */ #ifndef _NET80211_IEEE80211_FREEBSD_H_ #define _NET80211_IEEE80211_FREEBSD_H_ #ifdef _KERNEL #include #include #include #include #include #include #include #include /* * Common state locking definitions. */ typedef struct { char name[16]; /* e.g. "ath0_com_lock" */ struct mtx mtx; } ieee80211_com_lock_t; #define IEEE80211_LOCK_INIT(_ic, _name) do { \ ieee80211_com_lock_t *cl = &(_ic)->ic_comlock; \ snprintf(cl->name, sizeof(cl->name), "%s_com_lock", _name); \ mtx_init(&cl->mtx, cl->name, NULL, MTX_DEF | MTX_RECURSE); \ } while (0) #define IEEE80211_LOCK_OBJ(_ic) (&(_ic)->ic_comlock.mtx) #define IEEE80211_LOCK_DESTROY(_ic) mtx_destroy(IEEE80211_LOCK_OBJ(_ic)) #define IEEE80211_LOCK(_ic) mtx_lock(IEEE80211_LOCK_OBJ(_ic)) #define IEEE80211_UNLOCK(_ic) mtx_unlock(IEEE80211_LOCK_OBJ(_ic)) #define IEEE80211_LOCK_ASSERT(_ic) \ mtx_assert(IEEE80211_LOCK_OBJ(_ic), MA_OWNED) #define IEEE80211_UNLOCK_ASSERT(_ic) \ mtx_assert(IEEE80211_LOCK_OBJ(_ic), MA_NOTOWNED) /* * Transmit lock. 
* * This is a (mostly) temporary lock designed to serialise all of the * transmission operations throughout the stack. */ typedef struct { char name[16]; /* e.g. "ath0_tx_lock" */ struct mtx mtx; } ieee80211_tx_lock_t; #define IEEE80211_TX_LOCK_INIT(_ic, _name) do { \ ieee80211_tx_lock_t *cl = &(_ic)->ic_txlock; \ snprintf(cl->name, sizeof(cl->name), "%s_tx_lock", _name); \ mtx_init(&cl->mtx, cl->name, NULL, MTX_DEF); \ } while (0) #define IEEE80211_TX_LOCK_OBJ(_ic) (&(_ic)->ic_txlock.mtx) #define IEEE80211_TX_LOCK_DESTROY(_ic) mtx_destroy(IEEE80211_TX_LOCK_OBJ(_ic)) #define IEEE80211_TX_LOCK(_ic) mtx_lock(IEEE80211_TX_LOCK_OBJ(_ic)) #define IEEE80211_TX_UNLOCK(_ic) mtx_unlock(IEEE80211_TX_LOCK_OBJ(_ic)) #define IEEE80211_TX_LOCK_ASSERT(_ic) \ mtx_assert(IEEE80211_TX_LOCK_OBJ(_ic), MA_OWNED) #define IEEE80211_TX_UNLOCK_ASSERT(_ic) \ mtx_assert(IEEE80211_TX_LOCK_OBJ(_ic), MA_NOTOWNED) /* * Stageq / ni_tx_superg lock */ typedef struct { char name[16]; /* e.g. "ath0_ff_lock" */ struct mtx mtx; } ieee80211_ff_lock_t; #define IEEE80211_FF_LOCK_INIT(_ic, _name) do { \ ieee80211_ff_lock_t *fl = &(_ic)->ic_fflock; \ snprintf(fl->name, sizeof(fl->name), "%s_ff_lock", _name); \ mtx_init(&fl->mtx, fl->name, NULL, MTX_DEF); \ } while (0) #define IEEE80211_FF_LOCK_OBJ(_ic) (&(_ic)->ic_fflock.mtx) #define IEEE80211_FF_LOCK_DESTROY(_ic) mtx_destroy(IEEE80211_FF_LOCK_OBJ(_ic)) #define IEEE80211_FF_LOCK(_ic) mtx_lock(IEEE80211_FF_LOCK_OBJ(_ic)) #define IEEE80211_FF_UNLOCK(_ic) mtx_unlock(IEEE80211_FF_LOCK_OBJ(_ic)) #define IEEE80211_FF_LOCK_ASSERT(_ic) \ mtx_assert(IEEE80211_FF_LOCK_OBJ(_ic), MA_OWNED) /* * Node locking definitions. */ typedef struct { char name[16]; /* e.g. "ath0_node_lock" */ struct mtx mtx; } ieee80211_node_lock_t; #define IEEE80211_NODE_LOCK_INIT(_nt, _name) do { \ ieee80211_node_lock_t *nl = &(_nt)->nt_nodelock; \ snprintf(nl->name, sizeof(nl->name), "%s_node_lock", _name); \ mtx_init(&nl->mtx, nl->name, NULL, MTX_DEF | MTX_RECURSE); \ } while (0) #define IEEE80211_NODE_LOCK_OBJ(_nt) (&(_nt)->nt_nodelock.mtx) #define IEEE80211_NODE_LOCK_DESTROY(_nt) \ mtx_destroy(IEEE80211_NODE_LOCK_OBJ(_nt)) #define IEEE80211_NODE_LOCK(_nt) \ mtx_lock(IEEE80211_NODE_LOCK_OBJ(_nt)) #define IEEE80211_NODE_IS_LOCKED(_nt) \ mtx_owned(IEEE80211_NODE_LOCK_OBJ(_nt)) #define IEEE80211_NODE_UNLOCK(_nt) \ mtx_unlock(IEEE80211_NODE_LOCK_OBJ(_nt)) #define IEEE80211_NODE_LOCK_ASSERT(_nt) \ mtx_assert(IEEE80211_NODE_LOCK_OBJ(_nt), MA_OWNED) /* * Power-save queue definitions. */ typedef struct mtx ieee80211_psq_lock_t; #define IEEE80211_PSQ_INIT(_psq, _name) \ mtx_init(&(_psq)->psq_lock, _name, "802.11 ps q", MTX_DEF) #define IEEE80211_PSQ_DESTROY(_psq) mtx_destroy(&(_psq)->psq_lock) #define IEEE80211_PSQ_LOCK(_psq) mtx_lock(&(_psq)->psq_lock) #define IEEE80211_PSQ_UNLOCK(_psq) mtx_unlock(&(_psq)->psq_lock) #ifndef IF_PREPEND_LIST #define _IF_PREPEND_LIST(ifq, mhead, mtail, mcount) do { \ (mtail)->m_nextpkt = (ifq)->ifq_head; \ if ((ifq)->ifq_tail == NULL) \ (ifq)->ifq_tail = (mtail); \ (ifq)->ifq_head = (mhead); \ (ifq)->ifq_len += (mcount); \ } while (0) #define IF_PREPEND_LIST(ifq, mhead, mtail, mcount) do { \ IF_LOCK(ifq); \ _IF_PREPEND_LIST(ifq, mhead, mtail, mcount); \ IF_UNLOCK(ifq); \ } while (0) #endif /* IF_PREPEND_LIST */ /* * Age queue definitions. 
*/ typedef struct mtx ieee80211_ageq_lock_t; #define IEEE80211_AGEQ_INIT(_aq, _name) \ mtx_init(&(_aq)->aq_lock, _name, "802.11 age q", MTX_DEF) #define IEEE80211_AGEQ_DESTROY(_aq) mtx_destroy(&(_aq)->aq_lock) #define IEEE80211_AGEQ_LOCK(_aq) mtx_lock(&(_aq)->aq_lock) #define IEEE80211_AGEQ_UNLOCK(_aq) mtx_unlock(&(_aq)->aq_lock) /* * 802.1x MAC ACL database locking definitions. */ typedef struct mtx acl_lock_t; #define ACL_LOCK_INIT(_as, _name) \ mtx_init(&(_as)->as_lock, _name, "802.11 ACL", MTX_DEF) #define ACL_LOCK_DESTROY(_as) mtx_destroy(&(_as)->as_lock) #define ACL_LOCK(_as) mtx_lock(&(_as)->as_lock) #define ACL_UNLOCK(_as) mtx_unlock(&(_as)->as_lock) #define ACL_LOCK_ASSERT(_as) \ mtx_assert((&(_as)->as_lock), MA_OWNED) /* * Scan table definitions. */ typedef struct mtx ieee80211_scan_table_lock_t; #define IEEE80211_SCAN_TABLE_LOCK_INIT(_st, _name) \ mtx_init(&(_st)->st_lock, _name, "802.11 scan table", MTX_DEF) #define IEEE80211_SCAN_TABLE_LOCK_DESTROY(_st) mtx_destroy(&(_st)->st_lock) #define IEEE80211_SCAN_TABLE_LOCK(_st) mtx_lock(&(_st)->st_lock) #define IEEE80211_SCAN_TABLE_UNLOCK(_st) mtx_unlock(&(_st)->st_lock) typedef struct mtx ieee80211_scan_iter_lock_t; #define IEEE80211_SCAN_ITER_LOCK_INIT(_st, _name) \ mtx_init(&(_st)->st_scanlock, _name, "802.11 scangen", MTX_DEF) #define IEEE80211_SCAN_ITER_LOCK_DESTROY(_st) mtx_destroy(&(_st)->st_scanlock) #define IEEE80211_SCAN_ITER_LOCK(_st) mtx_lock(&(_st)->st_scanlock) #define IEEE80211_SCAN_ITER_UNLOCK(_st) mtx_unlock(&(_st)->st_scanlock) /* * Mesh node/routing definitions. */ typedef struct mtx ieee80211_rte_lock_t; #define MESH_RT_ENTRY_LOCK_INIT(_rt, _name) \ mtx_init(&(rt)->rt_lock, _name, "802.11s route entry", MTX_DEF) #define MESH_RT_ENTRY_LOCK_DESTROY(_rt) \ mtx_destroy(&(_rt)->rt_lock) #define MESH_RT_ENTRY_LOCK(rt) mtx_lock(&(rt)->rt_lock) #define MESH_RT_ENTRY_LOCK_ASSERT(rt) mtx_assert(&(rt)->rt_lock, MA_OWNED) #define MESH_RT_ENTRY_UNLOCK(rt) mtx_unlock(&(rt)->rt_lock) typedef struct mtx ieee80211_rt_lock_t; #define MESH_RT_LOCK(ms) mtx_lock(&(ms)->ms_rt_lock) #define MESH_RT_LOCK_ASSERT(ms) mtx_assert(&(ms)->ms_rt_lock, MA_OWNED) #define MESH_RT_UNLOCK(ms) mtx_unlock(&(ms)->ms_rt_lock) #define MESH_RT_LOCK_INIT(ms, name) \ mtx_init(&(ms)->ms_rt_lock, name, "802.11s routing table", MTX_DEF) #define MESH_RT_LOCK_DESTROY(ms) \ mtx_destroy(&(ms)->ms_rt_lock) /* * Node reference counting definitions. 
* * ieee80211_node_initref initialize the reference count to 1 * ieee80211_node_incref add a reference * ieee80211_node_decref remove a reference * ieee80211_node_dectestref remove a reference and return 1 if this * is the last reference, otherwise 0 * ieee80211_node_refcnt reference count for printing (only) */ #include #define ieee80211_node_initref(_ni) \ do { ((_ni)->ni_refcnt = 1); } while (0) #define ieee80211_node_incref(_ni) \ atomic_add_int(&(_ni)->ni_refcnt, 1) #define ieee80211_node_decref(_ni) \ atomic_subtract_int(&(_ni)->ni_refcnt, 1) struct ieee80211_node; int ieee80211_node_dectestref(struct ieee80211_node *ni); #define ieee80211_node_refcnt(_ni) (_ni)->ni_refcnt struct ifqueue; struct ieee80211vap; void ieee80211_drain_ifq(struct ifqueue *); void ieee80211_flush_ifq(struct ifqueue *, struct ieee80211vap *); void ieee80211_vap_destroy(struct ieee80211vap *); #define IFNET_IS_UP_RUNNING(_ifp) \ (((_ifp)->if_flags & IFF_UP) && \ ((_ifp)->if_drv_flags & IFF_DRV_RUNNING)) /* XXX TODO: cap these at 1, as hz may not be 1000 */ #define msecs_to_ticks(ms) (((ms)*hz)/1000) #define ticks_to_msecs(t) (1000*(t) / hz) #define ticks_to_secs(t) ((t) / hz) #define ieee80211_time_after(a,b) ((long)(b) - (long)(a) < 0) #define ieee80211_time_before(a,b) ieee80211_time_after(b,a) #define ieee80211_time_after_eq(a,b) ((long)(a) - (long)(b) >= 0) #define ieee80211_time_before_eq(a,b) ieee80211_time_after_eq(b,a) struct mbuf *ieee80211_getmgtframe(uint8_t **frm, int headroom, int pktlen); /* tx path usage */ #define M_ENCAP M_PROTO1 /* 802.11 encap done */ #define M_EAPOL M_PROTO3 /* PAE/EAPOL frame */ #define M_PWR_SAV M_PROTO4 /* bypass PS handling */ #define M_MORE_DATA M_PROTO5 /* more data frames to follow */ #define M_FF M_PROTO6 /* fast frame / A-MSDU */ #define M_TXCB M_PROTO7 /* do tx complete callback */ #define M_AMPDU_MPDU M_PROTO8 /* ok for A-MPDU aggregation */ #define M_FRAG M_PROTO9 /* frame fragmentation */ #define M_FIRSTFRAG M_PROTO10 /* first frame fragment */ #define M_LASTFRAG M_PROTO11 /* last frame fragment */ #define M_80211_TX \ (M_ENCAP|M_EAPOL|M_PWR_SAV|M_MORE_DATA|M_FF|M_TXCB| \ M_AMPDU_MPDU|M_FRAG|M_FIRSTFRAG|M_LASTFRAG) /* rx path usage */ #define M_AMPDU M_PROTO1 /* A-MPDU subframe */ #define M_WEP M_PROTO2 /* WEP done by hardware */ #if 0 #define M_AMPDU_MPDU M_PROTO8 /* A-MPDU re-order done */ #endif #define M_80211_RX (M_AMPDU|M_WEP|M_AMPDU_MPDU) #define IEEE80211_MBUF_TX_FLAG_BITS \ M_FLAG_BITS \ "\15M_ENCAP\17M_EAPOL\20M_PWR_SAV\21M_MORE_DATA\22M_FF\23M_TXCB" \ "\24M_AMPDU_MPDU\25M_FRAG\26M_FIRSTFRAG\27M_LASTFRAG" #define IEEE80211_MBUF_RX_FLAG_BITS \ M_FLAG_BITS \ "\15M_AMPDU\16M_WEP\24M_AMPDU_MPDU" /* * Store WME access control bits in the vlan tag. * This is safe since it's done after the packet is classified * (where we use any previous tag) and because it's passed * directly in to the driver and there's no chance someone * else will clobber them on us. */ #define M_WME_SETAC(m, ac) \ ((m)->m_pkthdr.ether_vtag = (ac)) #define M_WME_GETAC(m) ((m)->m_pkthdr.ether_vtag) /* * Mbufs on the power save queue are tagged with an age and * timed out. We reuse the hardware checksum field in the * mbuf packet header to store this data. */ #define M_AGE_SET(m,v) (m->m_pkthdr.csum_data = v) #define M_AGE_GET(m) (m->m_pkthdr.csum_data) #define M_AGE_SUB(m,adj) (m->m_pkthdr.csum_data -= adj) /* * Store the sequence number. 
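 *
 * Like the M_AGE_* macros above (which borrow csum_data), this reuses
 * an otherwise idle pkthdr field -- tso_segsz here -- rather than
 * growing the mbuf, on the assumption that these frames never carry
 * TSO state.  Illustrative round trip:
 *
 *	M_SEQNO_SET(m, seqno);
 *	...
 *	seqno = M_SEQNO_GET(m);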
*/ #define M_SEQNO_SET(m, seqno) \ ((m)->m_pkthdr.tso_segsz = (seqno)) #define M_SEQNO_GET(m) ((m)->m_pkthdr.tso_segsz) #define MTAG_ABI_NET80211 1132948340 /* net80211 ABI */ struct ieee80211_cb { void (*func)(struct ieee80211_node *, void *, int status); void *arg; }; #define NET80211_TAG_CALLBACK 0 /* xmit complete callback */ int ieee80211_add_callback(struct mbuf *m, void (*func)(struct ieee80211_node *, void *, int), void *arg); void ieee80211_process_callback(struct ieee80211_node *, struct mbuf *, int); #define NET80211_TAG_XMIT_PARAMS 1 /* See below; this is after the bpf_params definition */ #define NET80211_TAG_RECV_PARAMS 2 +#define NET80211_TAG_TOA_PARAMS 3 + struct ieee80211com; int ieee80211_parent_xmitpkt(struct ieee80211com *, struct mbuf *); int ieee80211_vap_xmitpkt(struct ieee80211vap *, struct mbuf *); void get_random_bytes(void *, size_t); void ieee80211_sysctl_attach(struct ieee80211com *); void ieee80211_sysctl_detach(struct ieee80211com *); void ieee80211_sysctl_vattach(struct ieee80211vap *); void ieee80211_sysctl_vdetach(struct ieee80211vap *); SYSCTL_DECL(_net_wlan); int ieee80211_sysctl_msecs_ticks(SYSCTL_HANDLER_ARGS); void ieee80211_load_module(const char *); /* * A "policy module" is an adjunct module to net80211 that provides * functionality that typically includes policy decisions. This * modularity enables extensibility and vendor-supplied functionality. */ #define _IEEE80211_POLICY_MODULE(policy, name, version) \ typedef void (*policy##_setup)(int); \ SET_DECLARE(policy##_set, policy##_setup); \ static int \ wlan_##name##_modevent(module_t mod, int type, void *unused) \ { \ policy##_setup * const *iter, f; \ switch (type) { \ case MOD_LOAD: \ SET_FOREACH(iter, policy##_set) { \ f = (void*) *iter; \ f(type); \ } \ return 0; \ case MOD_UNLOAD: \ case MOD_QUIESCE: \ if (nrefs) { \ printf("wlan_" #name ": still in use " \ "(%u dynamic refs)\n", nrefs); \ return EBUSY; \ } \ if (type == MOD_UNLOAD) { \ SET_FOREACH(iter, policy##_set) { \ f = (void*) *iter; \ f(type); \ } \ } \ return 0; \ } \ return EINVAL; \ } \ static moduledata_t name##_mod = { \ "wlan_" #name, \ wlan_##name##_modevent, \ 0 \ }; \ DECLARE_MODULE(wlan_##name, name##_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST);\ MODULE_VERSION(wlan_##name, version); \ MODULE_DEPEND(wlan_##name, wlan, 1, 1, 1) /* * Crypto modules implement cipher support. */ #define IEEE80211_CRYPTO_MODULE(name, version) \ _IEEE80211_POLICY_MODULE(crypto, name, version); \ static void \ name##_modevent(int type) \ { \ if (type == MOD_LOAD) \ ieee80211_crypto_register(&name); \ else \ ieee80211_crypto_unregister(&name); \ } \ TEXT_SET(crypto##_set, name##_modevent) /* * Scanner modules provide scanning policy. */ #define IEEE80211_SCANNER_MODULE(name, version) \ _IEEE80211_POLICY_MODULE(scanner, name, version) #define IEEE80211_SCANNER_ALG(name, alg, v) \ static void \ name##_modevent(int type) \ { \ if (type == MOD_LOAD) \ ieee80211_scanner_register(alg, &v); \ else \ ieee80211_scanner_unregister(alg, &v); \ } \ TEXT_SET(scanner_set, name##_modevent); \ /* * ACL modules implement acl policy. */ #define IEEE80211_ACL_MODULE(name, alg, version) \ _IEEE80211_POLICY_MODULE(acl, name, version); \ static void \ alg##_modevent(int type) \ { \ if (type == MOD_LOAD) \ ieee80211_aclator_register(&alg); \ else \ ieee80211_aclator_unregister(&alg); \ } \ TEXT_SET(acl_set, alg##_modevent); \ /* * Authenticator modules handle 802.1x/WPA authentication. 
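 *
 * A minimal consumer of the macros below (module and algorithm names
 * hypothetical):
 *
 *	static const struct ieee80211_authenticator xauth = { ... };
 *	IEEE80211_AUTH_MODULE(xauth, 1);
 *	IEEE80211_AUTH_ALG(xauth, IEEE80211_AUTH_8021X, xauth);
 *
 * MOD_LOAD then runs the generated xauth_modevent(), registering the
 * authenticator; unload reverses it.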
 */
#define	IEEE80211_AUTH_MODULE(name, version) \
	_IEEE80211_POLICY_MODULE(auth, name, version)

#define	IEEE80211_AUTH_ALG(name, alg, v) \
static void \
name##_modevent(int type) \
{ \
	if (type == MOD_LOAD) \
		ieee80211_authenticator_register(alg, &v); \
	else \
		ieee80211_authenticator_unregister(alg); \
} \
TEXT_SET(auth_set, name##_modevent)

/*
 * Rate control modules provide tx rate control support.
 */
#define	IEEE80211_RATECTL_MODULE(alg, version) \
	_IEEE80211_POLICY_MODULE(ratectl, alg, version); \

#define	IEEE80211_RATECTL_ALG(name, alg, v) \
static void \
alg##_modevent(int type) \
{ \
	if (type == MOD_LOAD) \
		ieee80211_ratectl_register(alg, &v); \
	else \
		ieee80211_ratectl_unregister(alg); \
} \
TEXT_SET(ratectl##_set, alg##_modevent)

struct ieee80211req;
typedef int ieee80211_ioctl_getfunc(struct ieee80211vap *,
    struct ieee80211req *);
SET_DECLARE(ieee80211_ioctl_getset, ieee80211_ioctl_getfunc);
#define	IEEE80211_IOCTL_GET(_name, _get) TEXT_SET(ieee80211_ioctl_getset, _get)

typedef int ieee80211_ioctl_setfunc(struct ieee80211vap *,
    struct ieee80211req *);
SET_DECLARE(ieee80211_ioctl_setset, ieee80211_ioctl_setfunc);
#define	IEEE80211_IOCTL_SET(_name, _set) TEXT_SET(ieee80211_ioctl_setset, _set)

#endif /* _KERNEL */

/* XXX this stuff belongs elsewhere */
/*
 * Message formats for messages from the net80211 layer to user
 * applications via the routing socket.  These messages are appended
 * to an if_announcemsghdr structure.
 */
struct ieee80211_join_event {
	uint8_t		iev_addr[6];
};

struct ieee80211_leave_event {
	uint8_t		iev_addr[6];
};

struct ieee80211_replay_event {
	uint8_t		iev_src[6];	/* src MAC */
	uint8_t		iev_dst[6];	/* dst MAC */
	uint8_t		iev_cipher;	/* cipher type */
	uint8_t		iev_keyix;	/* key id/index */
	uint64_t	iev_keyrsc;	/* RSC from key */
	uint64_t	iev_rsc;	/* RSC from frame */
};

struct ieee80211_michael_event {
	uint8_t		iev_src[6];	/* src MAC */
	uint8_t		iev_dst[6];	/* dst MAC */
	uint8_t		iev_cipher;	/* cipher type */
	uint8_t		iev_keyix;	/* key id/index */
};

struct ieee80211_wds_event {
	uint8_t		iev_addr[6];
};

struct ieee80211_csa_event {
	uint32_t	iev_flags;	/* channel flags */
	uint16_t	iev_freq;	/* setting in MHz */
	uint8_t		iev_ieee;	/* IEEE channel number */
	uint8_t		iev_mode;	/* CSA mode */
	uint8_t		iev_count;	/* CSA count */
};

struct ieee80211_cac_event {
	uint32_t	iev_flags;	/* channel flags */
	uint16_t	iev_freq;	/* setting in MHz */
	uint8_t		iev_ieee;	/* IEEE channel number */
	/* XXX timestamp? */
	uint8_t		iev_type;	/* IEEE80211_NOTIFY_CAC_* */
};

struct ieee80211_radar_event {
	uint32_t	iev_flags;	/* channel flags */
	uint16_t	iev_freq;	/* setting in MHz */
	uint8_t		iev_ieee;	/* IEEE channel number */
	/* XXX timestamp? */
};

struct ieee80211_auth_event {
	uint8_t		iev_addr[6];
};

struct ieee80211_deauth_event {
	uint8_t		iev_addr[6];
};

struct ieee80211_country_event {
	uint8_t		iev_addr[6];
	uint8_t		iev_cc[2];	/* ISO country code */
};

struct ieee80211_radio_event {
	uint8_t		iev_state;	/* 1 on, 0 off */
};

#define	RTM_IEEE80211_ASSOC	100	/* station associate (bss mode) */
#define	RTM_IEEE80211_REASSOC	101	/* station re-associate (bss mode) */
#define	RTM_IEEE80211_DISASSOC	102	/* station disassociate (bss mode) */
#define	RTM_IEEE80211_JOIN	103	/* station join (ap mode) */
#define	RTM_IEEE80211_LEAVE	104	/* station leave (ap mode) */
#define	RTM_IEEE80211_SCAN	105	/* scan complete, results available */
#define	RTM_IEEE80211_REPLAY	106	/* sequence counter replay detected */
#define	RTM_IEEE80211_MICHAEL	107	/* Michael MIC failure detected */
#define	RTM_IEEE80211_REJOIN	108	/* station re-associate (ap mode) */
#define	RTM_IEEE80211_WDS	109	/* WDS discovery (ap mode) */
#define	RTM_IEEE80211_CSA	110	/* Channel Switch Announcement event */
#define	RTM_IEEE80211_RADAR	111	/* radar event */
#define	RTM_IEEE80211_CAC	112	/* Channel Availability Check event */
#define	RTM_IEEE80211_DEAUTH	113	/* station deauthenticate */
#define	RTM_IEEE80211_AUTH	114	/* station authenticate (ap mode) */
#define	RTM_IEEE80211_COUNTRY	115	/* discovered country code (sta mode) */
#define	RTM_IEEE80211_RADIO	116	/* RF kill switch state change */

/*
 * Structure prepended to raw packets sent through the bpf
 * interface when set to DLT_IEEE802_11_RADIO.  This allows
 * user applications to specify pretty much everything in
 * an Atheros tx descriptor.  XXX need to generalize.
 *
 * XXX cannot be more than 14 bytes as it is copied to a sockaddr's
 * XXX sa_data area.
 */
struct ieee80211_bpf_params {
	uint8_t		ibp_vers;	/* version */
#define	IEEE80211_BPF_VERSION	0
	uint8_t		ibp_len;	/* header length in bytes */
	uint8_t		ibp_flags;
#define	IEEE80211_BPF_SHORTPRE	0x01	/* tx with short preamble */
#define	IEEE80211_BPF_NOACK	0x02	/* tx with no ack */
#define	IEEE80211_BPF_CRYPTO	0x04	/* tx with h/w encryption */
#define	IEEE80211_BPF_FCS	0x10	/* frame includes FCS */
#define	IEEE80211_BPF_DATAPAD	0x20	/* frame includes data padding */
#define	IEEE80211_BPF_RTS	0x40	/* tx with RTS/CTS */
#define	IEEE80211_BPF_CTS	0x80	/* tx with CTS only */
	uint8_t		ibp_pri;	/* WME/WMM AC+tx antenna */
	uint8_t		ibp_try0;	/* series 1 try count */
	uint8_t		ibp_rate0;	/* series 1 IEEE tx rate */
	uint8_t		ibp_power;	/* tx power (device units) */
	uint8_t		ibp_ctsrate;	/* IEEE tx rate for CTS */
	uint8_t		ibp_try1;	/* series 2 try count */
	uint8_t		ibp_rate1;	/* series 2 IEEE tx rate */
	uint8_t		ibp_try2;	/* series 3 try count */
	uint8_t		ibp_rate2;	/* series 3 IEEE tx rate */
	uint8_t		ibp_try3;	/* series 4 try count */
	uint8_t		ibp_rate3;	/* series 4 IEEE tx rate */
};

#ifdef _KERNEL
struct ieee80211_tx_params {
	struct ieee80211_bpf_params params;
};
int	ieee80211_add_xmit_params(struct mbuf *m,
	    const struct ieee80211_bpf_params *);
int	ieee80211_get_xmit_params(struct mbuf *m,
	    struct ieee80211_bpf_params *);

#define	IEEE80211_MAX_CHAINS		3
#define	IEEE80211_MAX_EVM_PILOTS	6

#define	IEEE80211_R_NF		0x0000001	/* global NF value valid */
#define	IEEE80211_R_RSSI	0x0000002	/* global RSSI value valid */
#define	IEEE80211_R_C_CHAIN	0x0000004	/* RX chain count valid */
#define	IEEE80211_R_C_NF	0x0000008	/* per-chain NF value valid */
#define	IEEE80211_R_C_RSSI	0x0000010	/* per-chain RSSI value valid */
#define	IEEE80211_R_C_EVM	0x0000020	/* per-chain EVM valid */
#define	IEEE80211_R_C_HT40	0x0000040	/* RX'ed packet is 40 MHz, pilots 4,5 valid */
#define	IEEE80211_R_FREQ	0x0000080	/* Freq value populated, MHz */
#define	IEEE80211_R_IEEE	0x0000100	/* IEEE value populated */
#define	IEEE80211_R_BAND	0x0000200	/* Frequency band populated */

struct ieee80211_rx_stats {
	uint32_t r_flags;		/* IEEE80211_R_* flags */
	uint8_t c_chain;		/* number of RX chains involved */
	int16_t	c_nf_ctl[IEEE80211_MAX_CHAINS];	/* per-chain NF */
	int16_t	c_nf_ext[IEEE80211_MAX_CHAINS];	/* per-chain NF */
	int16_t	c_rssi_ctl[IEEE80211_MAX_CHAINS];	/* per-chain RSSI */
	int16_t	c_rssi_ext[IEEE80211_MAX_CHAINS];	/* per-chain RSSI */
	uint8_t nf;			/* global NF */
	uint8_t rssi;			/* global RSSI */
	uint8_t evm[IEEE80211_MAX_CHAINS][IEEE80211_MAX_EVM_PILOTS];
					/* per-chain, per-pilot EVM values */
	uint16_t c_freq;
	uint8_t c_ieee;
};

struct ieee80211_rx_params {
	struct ieee80211_rx_stats params;
};
int	ieee80211_add_rx_params(struct mbuf *m,
	    const struct ieee80211_rx_stats *rxs);
int	ieee80211_get_rx_params(struct mbuf *m,
	    struct ieee80211_rx_stats *rxs);
+
+struct ieee80211_toa_params {
+	int request_id;
+};
+int	ieee80211_add_toa_params(struct mbuf *m,
+	    const struct ieee80211_toa_params *p);
+int	ieee80211_get_toa_params(struct mbuf *m,
+	    struct ieee80211_toa_params *p);
#endif /* _KERNEL */

/*
 * Malloc API.  Other BSD operating systems have slightly
 * different malloc/free namings (eg DragonflyBSD.)
 */
#define	IEEE80211_MALLOC	malloc
#define	IEEE80211_FREE		free

/* XXX TODO: get rid of WAITOK, fix all the users of it? */
#define	IEEE80211_M_NOWAIT	M_NOWAIT
#define	IEEE80211_M_WAITOK	M_WAITOK
#define	IEEE80211_M_ZERO	M_ZERO

/* XXX TODO: the type fields */

#endif /* _NET80211_IEEE80211_FREEBSD_H_ */

Index: projects/clang390-import/sys/powerpc/booke/pmap.c
===================================================================
--- projects/clang390-import/sys/powerpc/booke/pmap.c	(revision 305686)
+++ projects/clang390-import/sys/powerpc/booke/pmap.c	(revision 305687)
@@ -1,3601 +1,3607 @@
/*-
 * Copyright (C) 2007-2009 Semihalf, Rafal Jaworowski
 * Copyright (C) 2006 Semihalf, Marian Balakowicz
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN
 * NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
 * TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
 * PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * Some hw specific parts of this pmap were derived or influenced
 * by NetBSD's ibm4xx pmap module.  More generic code is shared with
 * a few other pmap modules from the FreeBSD tree.
*/ /* * VM layout notes: * * Kernel and user threads run within one common virtual address space * defined by AS=0. * * Virtual address space layout: * ----------------------------- * 0x0000_0000 - 0xafff_ffff : user process * 0xb000_0000 - 0xbfff_ffff : pmap_mapdev()-ed area (PCI/PCIE etc.) * 0xc000_0000 - 0xc0ff_ffff : kernel reserved * 0xc000_0000 - data_end : kernel code+data, env, metadata etc. * 0xc100_0000 - 0xfeef_ffff : KVA * 0xc100_0000 - 0xc100_3fff : reserved for page zero/copy * 0xc100_4000 - 0xc200_3fff : reserved for ptbl bufs * 0xc200_4000 - 0xc200_8fff : guard page + kstack0 * 0xc200_9000 - 0xfeef_ffff : actual free KVA space * 0xfef0_0000 - 0xffff_ffff : I/O devices region */ #include __FBSDID("$FreeBSD$"); #include "opt_kstack_pages.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mmu_if.h" #define SPARSE_MAPDEV #ifdef DEBUG #define debugf(fmt, args...) printf(fmt, ##args) #else #define debugf(fmt, args...) #endif #define TODO panic("%s: not implemented", __func__); extern unsigned char _etext[]; extern unsigned char _end[]; extern uint32_t *bootinfo; vm_paddr_t kernload; vm_offset_t kernstart; vm_size_t kernsize; /* Message buffer and tables. */ static vm_offset_t data_start; static vm_size_t data_end; /* Phys/avail memory regions. */ static struct mem_region *availmem_regions; static int availmem_regions_sz; static struct mem_region *physmem_regions; static int physmem_regions_sz; /* Reserved KVA space and mutex for mmu_booke_zero_page. */ static vm_offset_t zero_page_va; static struct mtx zero_page_mutex; static struct mtx tlbivax_mutex; /* Reserved KVA space and mutex for mmu_booke_copy_page. */ static vm_offset_t copy_page_src_va; static vm_offset_t copy_page_dst_va; static struct mtx copy_page_mutex; /**************************************************************************/ /* PMAP */ /**************************************************************************/ static int mmu_booke_enter_locked(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); unsigned int kptbl_min; /* Index of the first kernel ptbl. */ unsigned int kernel_ptbls; /* Number of KVA ptbls. */ /* * If user pmap is processed with mmu_booke_remove and the resident count * drops to 0, there are no more pages to remove, so we need not continue. */ #define PMAP_REMOVE_DONE(pmap) \ ((pmap) != kernel_pmap && (pmap)->pm_stats.resident_count == 0) extern int elf32_nxstack; /**************************************************************************/ /* TLB and TID handling */ /**************************************************************************/ /* Translation ID busy table */ static volatile pmap_t tidbusy[MAXCPU][TID_MAX + 1]; /* * TLB0 capabilities (entry, way numbers etc.). These can vary between e500 * core revisions and should be read from h/w registers during early config. 
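 *
 * A minimal sketch of how the values below are derived (it mirrors
 * tlb0_get_tlbconf() further down; the example numbers assume a
 * typical e500v2 part and are not guaranteed for every core):
 *
 *	uint32_t cfg = mfspr(SPR_TLB0CFG);
 *	tlb0_entries = cfg & TLBCFG_NENTRY_MASK;		e.g. 512
 *	tlb0_ways = (cfg & TLBCFG_ASSOC_MASK) >> TLBCFG_ASSOC_SHIFT;
 *								e.g. 4
 *	tlb0_entries_per_way = tlb0_entries / tlb0_ways;	-> 128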
 */
uint32_t tlb0_entries;
uint32_t tlb0_ways;
uint32_t tlb0_entries_per_way;
uint32_t tlb1_entries;

#define TLB0_ENTRIES		(tlb0_entries)
#define TLB0_WAYS		(tlb0_ways)
#define TLB0_ENTRIES_PER_WAY	(tlb0_entries_per_way)

#define TLB1_ENTRIES		(tlb1_entries)
#define TLB1_MAXENTRIES		64

static vm_offset_t tlb1_map_base = VM_MAXUSER_ADDRESS + PAGE_SIZE;

static tlbtid_t tid_alloc(struct pmap *);
static void tid_flush(tlbtid_t tid);

static void tlb_print_entry(int, uint32_t, uint32_t, uint32_t, uint32_t);

static void tlb1_read_entry(tlb_entry_t *, unsigned int);
static void tlb1_write_entry(tlb_entry_t *, unsigned int);
static int tlb1_iomapped(int, vm_paddr_t, vm_size_t, vm_offset_t *);
static vm_size_t tlb1_mapin_region(vm_offset_t, vm_paddr_t, vm_size_t);

static vm_size_t tsize2size(unsigned int);
static unsigned int size2tsize(vm_size_t);
static unsigned int ilog2(unsigned int);

static void set_mas4_defaults(void);

static inline void tlb0_flush_entry(vm_offset_t);
static inline unsigned int tlb0_tableidx(vm_offset_t, unsigned int);

/**************************************************************************/
/* Page table management */
/**************************************************************************/

static struct rwlock_padalign pvh_global_lock;

/* Data for the pv entry allocation mechanism */
static uma_zone_t pvzone;
static int pv_entry_count = 0, pv_entry_max = 0, pv_entry_high_water = 0;

#define PV_ENTRY_ZONE_MIN	2048	/* min pv entries in uma zone */

#ifndef PMAP_SHPGPERPROC
#define PMAP_SHPGPERPROC	200
#endif

static void ptbl_init(void);
static struct ptbl_buf *ptbl_buf_alloc(void);
static void ptbl_buf_free(struct ptbl_buf *);
static void ptbl_free_pmap_ptbl(pmap_t, pte_t *);

static pte_t *ptbl_alloc(mmu_t, pmap_t, unsigned int, boolean_t);
static void ptbl_free(mmu_t, pmap_t, unsigned int);
static void ptbl_hold(mmu_t, pmap_t, unsigned int);
static int ptbl_unhold(mmu_t, pmap_t, unsigned int);

static vm_paddr_t pte_vatopa(mmu_t, pmap_t, vm_offset_t);
static pte_t *pte_find(mmu_t, pmap_t, vm_offset_t);
static int pte_enter(mmu_t, pmap_t, vm_page_t, vm_offset_t, uint32_t, boolean_t);
static int pte_remove(mmu_t, pmap_t, vm_offset_t, uint8_t);
static void kernel_pte_alloc(vm_offset_t data_end, vm_offset_t addr,
    vm_offset_t pdir);

static pv_entry_t pv_alloc(void);
static void pv_free(pv_entry_t);
static void pv_insert(pmap_t, vm_offset_t, vm_page_t);
static void pv_remove(pmap_t, vm_offset_t, vm_page_t);

static void booke_pmap_init_qpages(void);

/* Number of kva ptbl buffers, each covering one ptbl (PTBL_PAGES). */
#define PTBL_BUFS		(128 * 16)

struct ptbl_buf {
	TAILQ_ENTRY(ptbl_buf) link;	/* list link */
	vm_offset_t kva;		/* va of mapping */
};

/* ptbl free list and a lock used for access synchronization. */
static TAILQ_HEAD(, ptbl_buf) ptbl_buf_freelist;
static struct mtx ptbl_buf_freelist_lock;

/* Base address of kva space allocated for ptbl bufs. */
static vm_offset_t ptbl_buf_pool_vabase;

/* Pointer to ptbl_buf structures.
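 * The array is sized at bootstrap time (see mmu_booke_bootstrap()) and
 * backs the free list above.  As a rough illustration, with PTBL_BUFS
 * as defined above (128 * 16 = 2048 buffers) the pool's KVA window is
 * PTBL_BUFS * PTBL_PAGES * PAGE_SIZE bytes starting at
 * ptbl_buf_pool_vabase; e.g. with 4 KB pages and PTBL_PAGES == 2 (an
 * assumed value for illustration only) that comes to 16 MB of KVA.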
*/ static struct ptbl_buf *ptbl_bufs; #ifdef SMP extern tlb_entry_t __boot_tlb1[]; void pmap_bootstrap_ap(volatile uint32_t *); #endif /* * Kernel MMU interface */ static void mmu_booke_clear_modify(mmu_t, vm_page_t); static void mmu_booke_copy(mmu_t, pmap_t, pmap_t, vm_offset_t, vm_size_t, vm_offset_t); static void mmu_booke_copy_page(mmu_t, vm_page_t, vm_page_t); static void mmu_booke_copy_pages(mmu_t, vm_page_t *, vm_offset_t, vm_page_t *, vm_offset_t, int); static int mmu_booke_enter(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t, u_int flags, int8_t psind); static void mmu_booke_enter_object(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_page_t, vm_prot_t); static void mmu_booke_enter_quick(mmu_t, pmap_t, vm_offset_t, vm_page_t, vm_prot_t); static vm_paddr_t mmu_booke_extract(mmu_t, pmap_t, vm_offset_t); static vm_page_t mmu_booke_extract_and_hold(mmu_t, pmap_t, vm_offset_t, vm_prot_t); static void mmu_booke_init(mmu_t); static boolean_t mmu_booke_is_modified(mmu_t, vm_page_t); static boolean_t mmu_booke_is_prefaultable(mmu_t, pmap_t, vm_offset_t); static boolean_t mmu_booke_is_referenced(mmu_t, vm_page_t); static int mmu_booke_ts_referenced(mmu_t, vm_page_t); static vm_offset_t mmu_booke_map(mmu_t, vm_offset_t *, vm_paddr_t, vm_paddr_t, int); static int mmu_booke_mincore(mmu_t, pmap_t, vm_offset_t, vm_paddr_t *); static void mmu_booke_object_init_pt(mmu_t, pmap_t, vm_offset_t, vm_object_t, vm_pindex_t, vm_size_t); static boolean_t mmu_booke_page_exists_quick(mmu_t, pmap_t, vm_page_t); static void mmu_booke_page_init(mmu_t, vm_page_t); static int mmu_booke_page_wired_mappings(mmu_t, vm_page_t); static void mmu_booke_pinit(mmu_t, pmap_t); static void mmu_booke_pinit0(mmu_t, pmap_t); static void mmu_booke_protect(mmu_t, pmap_t, vm_offset_t, vm_offset_t, vm_prot_t); static void mmu_booke_qenter(mmu_t, vm_offset_t, vm_page_t *, int); static void mmu_booke_qremove(mmu_t, vm_offset_t, int); static void mmu_booke_release(mmu_t, pmap_t); static void mmu_booke_remove(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_remove_all(mmu_t, vm_page_t); static void mmu_booke_remove_write(mmu_t, vm_page_t); static void mmu_booke_unwire(mmu_t, pmap_t, vm_offset_t, vm_offset_t); static void mmu_booke_zero_page(mmu_t, vm_page_t); static void mmu_booke_zero_page_area(mmu_t, vm_page_t, int, int); static void mmu_booke_activate(mmu_t, struct thread *); static void mmu_booke_deactivate(mmu_t, struct thread *); static void mmu_booke_bootstrap(mmu_t, vm_offset_t, vm_offset_t); static void *mmu_booke_mapdev(mmu_t, vm_paddr_t, vm_size_t); static void *mmu_booke_mapdev_attr(mmu_t, vm_paddr_t, vm_size_t, vm_memattr_t); static void mmu_booke_unmapdev(mmu_t, vm_offset_t, vm_size_t); static vm_paddr_t mmu_booke_kextract(mmu_t, vm_offset_t); static void mmu_booke_kenter(mmu_t, vm_offset_t, vm_paddr_t); static void mmu_booke_kenter_attr(mmu_t, vm_offset_t, vm_paddr_t, vm_memattr_t); static void mmu_booke_kremove(mmu_t, vm_offset_t); static boolean_t mmu_booke_dev_direct_mapped(mmu_t, vm_paddr_t, vm_size_t); static void mmu_booke_sync_icache(mmu_t, pmap_t, vm_offset_t, vm_size_t); static void mmu_booke_dumpsys_map(mmu_t, vm_paddr_t pa, size_t, void **); static void mmu_booke_dumpsys_unmap(mmu_t, vm_paddr_t pa, size_t, void *); static void mmu_booke_scan_init(mmu_t); static vm_offset_t mmu_booke_quick_enter_page(mmu_t mmu, vm_page_t m); static void mmu_booke_quick_remove_page(mmu_t mmu, vm_offset_t addr); static int mmu_booke_change_attr(mmu_t mmu, vm_offset_t addr, vm_size_t sz, vm_memattr_t mode); static 
mmu_method_t mmu_booke_methods[] = { /* pmap dispatcher interface */ MMUMETHOD(mmu_clear_modify, mmu_booke_clear_modify), MMUMETHOD(mmu_copy, mmu_booke_copy), MMUMETHOD(mmu_copy_page, mmu_booke_copy_page), MMUMETHOD(mmu_copy_pages, mmu_booke_copy_pages), MMUMETHOD(mmu_enter, mmu_booke_enter), MMUMETHOD(mmu_enter_object, mmu_booke_enter_object), MMUMETHOD(mmu_enter_quick, mmu_booke_enter_quick), MMUMETHOD(mmu_extract, mmu_booke_extract), MMUMETHOD(mmu_extract_and_hold, mmu_booke_extract_and_hold), MMUMETHOD(mmu_init, mmu_booke_init), MMUMETHOD(mmu_is_modified, mmu_booke_is_modified), MMUMETHOD(mmu_is_prefaultable, mmu_booke_is_prefaultable), MMUMETHOD(mmu_is_referenced, mmu_booke_is_referenced), MMUMETHOD(mmu_ts_referenced, mmu_booke_ts_referenced), MMUMETHOD(mmu_map, mmu_booke_map), MMUMETHOD(mmu_mincore, mmu_booke_mincore), MMUMETHOD(mmu_object_init_pt, mmu_booke_object_init_pt), MMUMETHOD(mmu_page_exists_quick,mmu_booke_page_exists_quick), MMUMETHOD(mmu_page_init, mmu_booke_page_init), MMUMETHOD(mmu_page_wired_mappings, mmu_booke_page_wired_mappings), MMUMETHOD(mmu_pinit, mmu_booke_pinit), MMUMETHOD(mmu_pinit0, mmu_booke_pinit0), MMUMETHOD(mmu_protect, mmu_booke_protect), MMUMETHOD(mmu_qenter, mmu_booke_qenter), MMUMETHOD(mmu_qremove, mmu_booke_qremove), MMUMETHOD(mmu_release, mmu_booke_release), MMUMETHOD(mmu_remove, mmu_booke_remove), MMUMETHOD(mmu_remove_all, mmu_booke_remove_all), MMUMETHOD(mmu_remove_write, mmu_booke_remove_write), MMUMETHOD(mmu_sync_icache, mmu_booke_sync_icache), MMUMETHOD(mmu_unwire, mmu_booke_unwire), MMUMETHOD(mmu_zero_page, mmu_booke_zero_page), MMUMETHOD(mmu_zero_page_area, mmu_booke_zero_page_area), MMUMETHOD(mmu_activate, mmu_booke_activate), MMUMETHOD(mmu_deactivate, mmu_booke_deactivate), MMUMETHOD(mmu_quick_enter_page, mmu_booke_quick_enter_page), MMUMETHOD(mmu_quick_remove_page, mmu_booke_quick_remove_page), /* Internal interfaces */ MMUMETHOD(mmu_bootstrap, mmu_booke_bootstrap), MMUMETHOD(mmu_dev_direct_mapped,mmu_booke_dev_direct_mapped), MMUMETHOD(mmu_mapdev, mmu_booke_mapdev), MMUMETHOD(mmu_mapdev_attr, mmu_booke_mapdev_attr), MMUMETHOD(mmu_kenter, mmu_booke_kenter), MMUMETHOD(mmu_kenter_attr, mmu_booke_kenter_attr), MMUMETHOD(mmu_kextract, mmu_booke_kextract), MMUMETHOD(mmu_kremove, mmu_booke_kremove), MMUMETHOD(mmu_unmapdev, mmu_booke_unmapdev), MMUMETHOD(mmu_change_attr, mmu_booke_change_attr), /* dumpsys() support */ MMUMETHOD(mmu_dumpsys_map, mmu_booke_dumpsys_map), MMUMETHOD(mmu_dumpsys_unmap, mmu_booke_dumpsys_unmap), MMUMETHOD(mmu_scan_init, mmu_booke_scan_init), { 0, 0 } }; MMU_DEF(booke_mmu, MMU_TYPE_BOOKE, mmu_booke_methods, 0); static __inline uint32_t tlb_calc_wimg(vm_paddr_t pa, vm_memattr_t ma) { uint32_t attrib; int i; if (ma != VM_MEMATTR_DEFAULT) { switch (ma) { case VM_MEMATTR_UNCACHEABLE: return (MAS2_I | MAS2_G); case VM_MEMATTR_WRITE_COMBINING: case VM_MEMATTR_WRITE_BACK: case VM_MEMATTR_PREFETCHABLE: return (MAS2_I); case VM_MEMATTR_WRITE_THROUGH: return (MAS2_W | MAS2_M); case VM_MEMATTR_CACHEABLE: return (MAS2_M); } } /* * Assume the page is cache inhibited and access is guarded unless * it's in our available memory array. 
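 *
 * For example (illustrative only): a DRAM page listed in
 * physmem_regions matches the loop below and gets _TLB_ENTRY_MEM
 * (cacheable), while a device page such as a memory-mapped PCI BAR
 * falls through and keeps _TLB_ENTRY_IO, i.e. the same cache-inhibited
 * plus guarded semantics returned for VM_MEMATTR_UNCACHEABLE above.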
*/ attrib = _TLB_ENTRY_IO; for (i = 0; i < physmem_regions_sz; i++) { if ((pa >= physmem_regions[i].mr_start) && (pa < (physmem_regions[i].mr_start + physmem_regions[i].mr_size))) { attrib = _TLB_ENTRY_MEM; break; } } return (attrib); } static inline void tlb_miss_lock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR3(KTR_PMAP, "%s: tlb miss LOCK of CPU=%d, " "tlb_lock=%p", __func__, pc->pc_cpuid, pc->pc_booke_tlb_lock); KASSERT((pc->pc_cpuid != PCPU_GET(cpuid)), ("tlb_miss_lock: tried to lock self")); tlb_lock(pc->pc_booke_tlb_lock); CTR1(KTR_PMAP, "%s: locked", __func__); } } #endif } static inline void tlb_miss_unlock(void) { #ifdef SMP struct pcpu *pc; if (!smp_started) return; STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) { if (pc != pcpup) { CTR2(KTR_PMAP, "%s: tlb miss UNLOCK of CPU=%d", __func__, pc->pc_cpuid); tlb_unlock(pc->pc_booke_tlb_lock); CTR1(KTR_PMAP, "%s: unlocked", __func__); } } #endif } /* Return number of entries in TLB0. */ static __inline void tlb0_get_tlbconf(void) { uint32_t tlb0_cfg; tlb0_cfg = mfspr(SPR_TLB0CFG); tlb0_entries = tlb0_cfg & TLBCFG_NENTRY_MASK; tlb0_ways = (tlb0_cfg & TLBCFG_ASSOC_MASK) >> TLBCFG_ASSOC_SHIFT; tlb0_entries_per_way = tlb0_entries / tlb0_ways; } /* Return number of entries in TLB1. */ static __inline void tlb1_get_tlbconf(void) { uint32_t tlb1_cfg; tlb1_cfg = mfspr(SPR_TLB1CFG); tlb1_entries = tlb1_cfg & TLBCFG_NENTRY_MASK; } /**************************************************************************/ /* Page table related */ /**************************************************************************/ /* Initialize pool of kva ptbl buffers. */ static void ptbl_init(void) { int i; CTR3(KTR_PMAP, "%s: s (ptbl_bufs = 0x%08x size 0x%08x)", __func__, (uint32_t)ptbl_bufs, sizeof(struct ptbl_buf) * PTBL_BUFS); CTR3(KTR_PMAP, "%s: s (ptbl_buf_pool_vabase = 0x%08x size = 0x%08x)", __func__, ptbl_buf_pool_vabase, PTBL_BUFS * PTBL_PAGES * PAGE_SIZE); mtx_init(&ptbl_buf_freelist_lock, "ptbl bufs lock", NULL, MTX_DEF); TAILQ_INIT(&ptbl_buf_freelist); for (i = 0; i < PTBL_BUFS; i++) { ptbl_bufs[i].kva = ptbl_buf_pool_vabase + i * PTBL_PAGES * PAGE_SIZE; TAILQ_INSERT_TAIL(&ptbl_buf_freelist, &ptbl_bufs[i], link); } } /* Get a ptbl_buf from the freelist. */ static struct ptbl_buf * ptbl_buf_alloc(void) { struct ptbl_buf *buf; mtx_lock(&ptbl_buf_freelist_lock); buf = TAILQ_FIRST(&ptbl_buf_freelist); if (buf != NULL) TAILQ_REMOVE(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); return (buf); } /* Return ptbl buff to free pool. */ static void ptbl_buf_free(struct ptbl_buf *buf) { CTR2(KTR_PMAP, "%s: buf = %p", __func__, buf); mtx_lock(&ptbl_buf_freelist_lock); TAILQ_INSERT_TAIL(&ptbl_buf_freelist, buf, link); mtx_unlock(&ptbl_buf_freelist_lock); } /* * Search the list of allocated ptbl bufs and find on list of allocated ptbls */ static void ptbl_free_pmap_ptbl(pmap_t pmap, pte_t *ptbl) { struct ptbl_buf *pbuf; CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl); PMAP_LOCK_ASSERT(pmap, MA_OWNED); TAILQ_FOREACH(pbuf, &pmap->pm_ptbl_list, link) if (pbuf->kva == (vm_offset_t)ptbl) { /* Remove from pmap ptbl buf list. */ TAILQ_REMOVE(&pmap->pm_ptbl_list, pbuf, link); /* Free corresponding ptbl buf. */ ptbl_buf_free(pbuf); break; } } /* Allocate page table. 
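 * A sketch of the expected call pattern, with 'va' a hypothetical user
 * virtual address (pte_enter() below is the real caller):
 *
 *	ptbl = ptbl_alloc(mmu, pmap, PDIR_IDX(va), nosleep);
 *	if (ptbl == NULL)
 *		handle ENOMEM (only possible when nosleep is TRUE)
 *
 * Note that in the sleeping case the pmap lock and the global pv lock
 * are dropped and re-taken around VM_WAIT, so callers must be prepared
 * for the pmap to have changed underneath them.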
 */
static pte_t *
ptbl_alloc(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx, boolean_t nosleep)
{
	vm_page_t mtbl[PTBL_PAGES];
	vm_page_t m;
	struct ptbl_buf *pbuf;
	unsigned int pidx;
	pte_t *ptbl;
	int i, j;

	CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap,
	    (pmap == kernel_pmap), pdir_idx);

	KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)),
	    ("ptbl_alloc: invalid pdir_idx"));
	KASSERT((pmap->pm_pdir[pdir_idx] == NULL),
	    ("pte_alloc: valid ptbl entry exists!"));

	pbuf = ptbl_buf_alloc();
	if (pbuf == NULL)
		panic("pte_alloc: couldn't alloc kernel virtual memory");

	ptbl = (pte_t *)pbuf->kva;

	CTR2(KTR_PMAP, "%s: ptbl kva = %p", __func__, ptbl);

	/* Allocate ptbl pages, this will sleep! */
	for (i = 0; i < PTBL_PAGES; i++) {
		pidx = (PTBL_PAGES * pdir_idx) + i;
		while ((m = vm_page_alloc(NULL, pidx,
		    VM_ALLOC_NOOBJ | VM_ALLOC_WIRED)) == NULL) {
			PMAP_UNLOCK(pmap);
			rw_wunlock(&pvh_global_lock);
			if (nosleep) {
				ptbl_free_pmap_ptbl(pmap, ptbl);
				for (j = 0; j < i; j++)
					vm_page_free(mtbl[j]);
				atomic_subtract_int(&vm_cnt.v_wire_count, i);
				return (NULL);
			}
			VM_WAIT;
			rw_wlock(&pvh_global_lock);
			PMAP_LOCK(pmap);
		}
		mtbl[i] = m;
	}

	/* Map allocated pages into kernel_pmap. */
	mmu_booke_qenter(mmu, (vm_offset_t)ptbl, mtbl, PTBL_PAGES);

	/* Zero whole ptbl. */
	bzero((caddr_t)ptbl, PTBL_PAGES * PAGE_SIZE);

	/* Add pbuf to the pmap ptbl bufs list. */
	TAILQ_INSERT_TAIL(&pmap->pm_ptbl_list, pbuf, link);

	return (ptbl);
}

/* Free ptbl pages and invalidate pdir entry. */
static void
ptbl_free(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx)
{
	pte_t *ptbl;
	vm_paddr_t pa;
	vm_offset_t va;
	vm_page_t m;
	int i;

	CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap,
	    (pmap == kernel_pmap), pdir_idx);

	KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)),
	    ("ptbl_free: invalid pdir_idx"));

	ptbl = pmap->pm_pdir[pdir_idx];

	CTR2(KTR_PMAP, "%s: ptbl = %p", __func__, ptbl);

	KASSERT((ptbl != NULL), ("ptbl_free: null ptbl"));

	/*
	 * Invalidate the pdir entry as soon as possible, so that other CPUs
	 * don't attempt to look up the page tables we are releasing.
	 */
	mtx_lock_spin(&tlbivax_mutex);
	tlb_miss_lock();

	pmap->pm_pdir[pdir_idx] = NULL;

	tlb_miss_unlock();
	mtx_unlock_spin(&tlbivax_mutex);

	for (i = 0; i < PTBL_PAGES; i++) {
		va = ((vm_offset_t)ptbl + (i * PAGE_SIZE));
		pa = pte_vatopa(mmu, kernel_pmap, va);
		m = PHYS_TO_VM_PAGE(pa);
		vm_page_free_zero(m);
		atomic_subtract_int(&vm_cnt.v_wire_count, 1);
		mmu_booke_kremove(mmu, va);
	}

	ptbl_free_pmap_ptbl(pmap, ptbl);
}

/*
 * Decrement ptbl pages hold count and attempt to free ptbl pages.
 * Called when removing pte entry from ptbl.
 *
 * Return 1 if ptbl pages were freed.
 */
static int
ptbl_unhold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx)
{
	pte_t *ptbl;
	vm_paddr_t pa;
	vm_page_t m;
	int i;

	CTR4(KTR_PMAP, "%s: pmap = %p su = %d pdir_idx = %d", __func__, pmap,
	    (pmap == kernel_pmap), pdir_idx);

	KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)),
	    ("ptbl_unhold: invalid pdir_idx"));
	KASSERT((pmap != kernel_pmap),
	    ("ptbl_unhold: unholding kernel ptbl!"));

	ptbl = pmap->pm_pdir[pdir_idx];

	//debugf("ptbl_unhold: ptbl = 0x%08x\n", (u_int32_t)ptbl);
	KASSERT(((vm_offset_t)ptbl >= VM_MIN_KERNEL_ADDRESS),
	    ("ptbl_unhold: non kva ptbl"));

	/* decrement hold count */
	for (i = 0; i < PTBL_PAGES; i++) {
		pa = pte_vatopa(mmu, kernel_pmap,
		    (vm_offset_t)ptbl + (i * PAGE_SIZE));
		m = PHYS_TO_VM_PAGE(pa);
		m->wire_count--;
	}

	/*
	 * Free ptbl pages if there are no pte entries in this ptbl.
	 * wire_count has the same value for all ptbl pages, so check the last
	 * page.
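 *
 * Illustrative example: a ptbl holding three valid PTEs has
 * wire_count == 3 on each of its pages (one from the initial wired
 * allocation plus one ptbl_hold() per additional PTE); three matching
 * pte_remove() calls bring the count to 0, at which point ptbl_free()
 * above runs and the pdir slot is cleared.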
*/ if (m->wire_count == 0) { ptbl_free(mmu, pmap, pdir_idx); //debugf("ptbl_unhold: e (freed ptbl)\n"); return (1); } return (0); } /* * Increment hold count for ptbl pages. This routine is used when a new pte * entry is being inserted into the ptbl. */ static void ptbl_hold(mmu_t mmu, pmap_t pmap, unsigned int pdir_idx) { vm_paddr_t pa; pte_t *ptbl; vm_page_t m; int i; CTR3(KTR_PMAP, "%s: pmap = %p pdir_idx = %d", __func__, pmap, pdir_idx); KASSERT((pdir_idx <= (VM_MAXUSER_ADDRESS / PDIR_SIZE)), ("ptbl_hold: invalid pdir_idx")); KASSERT((pmap != kernel_pmap), ("ptbl_hold: holding kernel ptbl!")); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT((ptbl != NULL), ("ptbl_hold: null ptbl")); for (i = 0; i < PTBL_PAGES; i++) { pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)ptbl + (i * PAGE_SIZE)); m = PHYS_TO_VM_PAGE(pa); m->wire_count++; } } /* Allocate pv_entry structure. */ pv_entry_t pv_alloc(void) { pv_entry_t pv; pv_entry_count++; if (pv_entry_count > pv_entry_high_water) pagedaemon_wakeup(); pv = uma_zalloc(pvzone, M_NOWAIT); return (pv); } /* Free pv_entry structure. */ static __inline void pv_free(pv_entry_t pve) { pv_entry_count--; uma_zfree(pvzone, pve); } /* Allocate and initialize pv_entry structure. */ static void pv_insert(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_insert: s (su = %d pmap = 0x%08x va = 0x%08x m = 0x%08x)\n", su, // (u_int32_t)pmap, va, (u_int32_t)m); pve = pv_alloc(); if (pve == NULL) panic("pv_insert: no pv entries!"); pve->pv_pmap = pmap; pve->pv_va = va; /* add to pv_list */ PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); TAILQ_INSERT_TAIL(&m->md.pv_list, pve, pv_link); //debugf("pv_insert: e\n"); } /* Destroy pv entry. */ static void pv_remove(pmap_t pmap, vm_offset_t va, vm_page_t m) { pv_entry_t pve; //int su = (pmap == kernel_pmap); //debugf("pv_remove: s (su = %d pmap = 0x%08x va = 0x%08x)\n", su, (u_int32_t)pmap, va); PMAP_LOCK_ASSERT(pmap, MA_OWNED); rw_assert(&pvh_global_lock, RA_WLOCKED); /* find pv entry */ TAILQ_FOREACH(pve, &m->md.pv_list, pv_link) { if ((pmap == pve->pv_pmap) && (va == pve->pv_va)) { /* remove from pv_list */ TAILQ_REMOVE(&m->md.pv_list, pve, pv_link); if (TAILQ_EMPTY(&m->md.pv_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); /* free pv entry struct */ pv_free(pve); break; } } //debugf("pv_remove: e\n"); } /* * Clean pte entry, try to free page table page if requested. * * Return 1 if ptbl pages were freed, otherwise return 0. */ static int pte_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, uint8_t flags) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); vm_page_t m; pte_t *ptbl; pte_t *pte; //int su = (pmap == kernel_pmap); //debugf("pte_remove: s (su = %d pmap = 0x%08x va = 0x%08x flags = %d)\n", // su, (u_int32_t)pmap, va, flags); ptbl = pmap->pm_pdir[pdir_idx]; KASSERT(ptbl, ("pte_remove: null ptbl")); pte = &ptbl[ptbl_idx]; if (pte == NULL || !PTE_ISVALID(pte)) return (0); if (PTE_ISWIRED(pte)) pmap->pm_stats.wired_count--; /* Handle managed entry. */ if (PTE_ISMANAGED(pte)) { /* Get vm_page_t for mapped pte. 
*/ m = PHYS_TO_VM_PAGE(PTE_PA(pte)); if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); if (PTE_ISREFERENCED(pte)) vm_page_aflag_set(m, PGA_REFERENCED); pv_remove(pmap, va, m); } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); pmap->pm_stats.resident_count--; if (flags & PTBL_UNHOLD) { //debugf("pte_remove: e (unhold)\n"); return (ptbl_unhold(mmu, pmap, pdir_idx)); } //debugf("pte_remove: e\n"); return (0); } /* * Insert PTE for a given page and virtual address. */ static int pte_enter(mmu_t mmu, pmap_t pmap, vm_page_t m, vm_offset_t va, uint32_t flags, boolean_t nosleep) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); pte_t *ptbl, *pte; CTR4(KTR_PMAP, "%s: su = %d pmap = %p va = %p", __func__, pmap == kernel_pmap, pmap, va); /* Get the page table pointer. */ ptbl = pmap->pm_pdir[pdir_idx]; if (ptbl == NULL) { /* Allocate page table pages. */ ptbl = ptbl_alloc(mmu, pmap, pdir_idx, nosleep); if (ptbl == NULL) { KASSERT(nosleep, ("nosleep and NULL ptbl")); return (ENOMEM); } } else { /* * Check if there is valid mapping for requested * va, if there is, remove it. */ pte = &pmap->pm_pdir[pdir_idx][ptbl_idx]; if (PTE_ISVALID(pte)) { pte_remove(mmu, pmap, va, PTBL_HOLD); } else { /* * pte is not used, increment hold count * for ptbl pages. */ if (pmap != kernel_pmap) ptbl_hold(mmu, pmap, pdir_idx); } } /* * Insert pv_entry into pv_list for mapped page if part of managed * memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { flags |= PTE_MANAGED; /* Create and insert pv entry. */ pv_insert(pmap, va, m); } pmap->pm_stats.resident_count++; mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); if (pmap->pm_pdir[pdir_idx] == NULL) { /* * If we just allocated a new page table, hook it in * the pdir. */ pmap->pm_pdir[pdir_idx] = ptbl; } pte = &(pmap->pm_pdir[pdir_idx][ptbl_idx]); *pte = PTE_RPN_FROM_PA(VM_PAGE_TO_PHYS(m)); *pte |= (PTE_VALID | flags | PTE_PS_4KB); /* 4KB pages only */ tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (0); } /* Return the pa for the given pmap/va. */ static vm_paddr_t pte_vatopa(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa = 0; pte_t *pte; pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pa = (PTE_PA(pte) | (va & PTE_PA_MASK)); return (pa); } /* Get a pointer to a PTE in a page table. */ static pte_t * pte_find(mmu_t mmu, pmap_t pmap, vm_offset_t va) { unsigned int pdir_idx = PDIR_IDX(va); unsigned int ptbl_idx = PTBL_IDX(va); KASSERT((pmap != NULL), ("pte_find: invalid pmap")); if (pmap->pm_pdir[pdir_idx]) return (&(pmap->pm_pdir[pdir_idx][ptbl_idx])); return (NULL); } /* Set up kernel page tables. */ static void kernel_pte_alloc(vm_offset_t data_end, vm_offset_t addr, vm_offset_t pdir) { int i; vm_offset_t va; pte_t *pte; /* Initialize kernel pdir */ for (i = 0; i < kernel_ptbls; i++) kernel_pmap->pm_pdir[kptbl_min + i] = (pte_t *)(pdir + (i * PAGE_SIZE * PTBL_PAGES)); /* * Fill in PTEs covering kernel code and data. They are not required * for address translation, as this area is covered by static TLB1 * entries, but for pte_vatopa() to work correctly with kernel area * addresses. 
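 *
 * For example (hypothetical lookup), once these PTEs are filled in,
 *
 *	pa = pte_vatopa(mmu, kernel_pmap, (vm_offset_t)_etext);
 *
 * resolves to kernload + ((vm_offset_t)_etext - kernstart), even
 * though the translation itself is served by a static TLB1 entry.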
*/ for (va = addr; va < data_end; va += PAGE_SIZE) { pte = &(kernel_pmap->pm_pdir[PDIR_IDX(va)][PTBL_IDX(va)]); *pte = PTE_RPN_FROM_PA(kernload + (va - kernstart)); *pte |= PTE_M | PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID | PTE_PS_4KB; } } /**************************************************************************/ /* PMAP related */ /**************************************************************************/ /* * This is called during booke_init, before the system is really initialized. */ static void mmu_booke_bootstrap(mmu_t mmu, vm_offset_t start, vm_offset_t kernelend) { vm_paddr_t phys_kernelend; struct mem_region *mp, *mp1; int cnt, i, j; vm_paddr_t s, e, sz; vm_paddr_t physsz, hwphyssz; u_int phys_avail_count; vm_size_t kstack0_sz; vm_offset_t kernel_pdir, kstack0; vm_paddr_t kstack0_phys; void *dpcpu; debugf("mmu_booke_bootstrap: entered\n"); /* Set interesting system properties */ hw_direct_map = 0; elf32_nxstack = 1; /* Initialize invalidation mutex */ mtx_init(&tlbivax_mutex, "tlbivax", NULL, MTX_SPIN); /* Read TLB0 size and associativity. */ tlb0_get_tlbconf(); /* * Align kernel start and end address (kernel image). * Note that kernel end does not necessarily relate to kernsize. * kernsize is the size of the kernel that is actually mapped. */ kernstart = trunc_page(start); data_start = round_page(kernelend); data_end = data_start; /* * Addresses of preloaded modules (like file systems) use * physical addresses. Make sure we relocate those into * virtual addresses. */ preload_addr_relocate = kernstart - kernload; /* Allocate the dynamic per-cpu area. */ dpcpu = (void *)data_end; data_end += DPCPU_SIZE; /* Allocate space for the message buffer. */ msgbufp = (struct msgbuf *)data_end; data_end += msgbufsize; debugf(" msgbufp at 0x%08x end = 0x%08x\n", (uint32_t)msgbufp, data_end); data_end = round_page(data_end); /* Allocate space for ptbl_bufs. */ ptbl_bufs = (struct ptbl_buf *)data_end; data_end += sizeof(struct ptbl_buf) * PTBL_BUFS; debugf(" ptbl_bufs at 0x%08x end = 0x%08x\n", (uint32_t)ptbl_bufs, data_end); data_end = round_page(data_end); /* Allocate PTE tables for kernel KVA. */ kernel_pdir = data_end; kernel_ptbls = howmany(VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS, PDIR_SIZE); data_end += kernel_ptbls * PTBL_PAGES * PAGE_SIZE; debugf(" kernel ptbls: %d\n", kernel_ptbls); debugf(" kernel pdir at 0x%08x end = 0x%08x\n", kernel_pdir, data_end); debugf(" data_end: 0x%08x\n", data_end); if (data_end - kernstart > kernsize) { kernsize += tlb1_mapin_region(kernstart + kernsize, kernload + kernsize, (data_end - kernstart) - kernsize); } data_end = kernstart + kernsize; debugf(" updated data_end: 0x%08x\n", data_end); /* * Clear the structures - note we can only do it safely after the * possible additional TLB1 translations are in place (above) so that * all range up to the currently calculated 'data_end' is covered. */ dpcpu_init(dpcpu, 0); memset((void *)ptbl_bufs, 0, sizeof(struct ptbl_buf) * PTBL_SIZE); memset((void *)kernel_pdir, 0, kernel_ptbls * PTBL_PAGES * PAGE_SIZE); /*******************************************************/ /* Set the start and end of kva. */ /*******************************************************/ virtual_avail = round_page(data_end); virtual_end = VM_MAX_KERNEL_ADDRESS; /* Allocate KVA space for page zero/copy operations. 
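 * Three single-page KVA windows are reserved back to back:
 * zero_page_va for mmu_booke_zero_page(), then copy_page_src_va and
 * copy_page_dst_va for mmu_booke_copy_page().  For example, if
 * virtual_avail came out of the bootstrap above as 0xc2009000 (a
 * made-up value), the windows would land at 0xc2009000, 0xc200a000
 * and 0xc200b000 with 4 KB pages.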
 */
	zero_page_va = virtual_avail;
	virtual_avail += PAGE_SIZE;
	copy_page_src_va = virtual_avail;
	virtual_avail += PAGE_SIZE;
	copy_page_dst_va = virtual_avail;
	virtual_avail += PAGE_SIZE;
	debugf("zero_page_va = 0x%08x\n", zero_page_va);
	debugf("copy_page_src_va = 0x%08x\n", copy_page_src_va);
	debugf("copy_page_dst_va = 0x%08x\n", copy_page_dst_va);

	/* Initialize page zero/copy mutexes. */
	mtx_init(&zero_page_mutex, "mmu_booke_zero_page", NULL, MTX_DEF);
	mtx_init(&copy_page_mutex, "mmu_booke_copy_page", NULL, MTX_DEF);

	/* Allocate KVA space for ptbl bufs. */
	ptbl_buf_pool_vabase = virtual_avail;
	virtual_avail += PTBL_BUFS * PTBL_PAGES * PAGE_SIZE;
	debugf("ptbl_buf_pool_vabase = 0x%08x end = 0x%08x\n",
	    ptbl_buf_pool_vabase, virtual_avail);

	/* Calculate corresponding physical addresses for the kernel region. */
	phys_kernelend = kernload + kernsize;
	debugf("kernel image and allocated data:\n");
	debugf(" kernload = 0x%09llx\n", (uint64_t)kernload);
	debugf(" kernstart = 0x%08x\n", kernstart);
	debugf(" kernsize = 0x%08x\n", kernsize);

	if (sizeof(phys_avail) / sizeof(phys_avail[0]) < availmem_regions_sz)
		panic("mmu_booke_bootstrap: phys_avail too small");

	/*
	 * Remove kernel physical address range from avail regions list. Page
	 * align all regions. Non-page aligned memory isn't very interesting
	 * to us. Also, sort the entries for ascending addresses.
	 */

	/* Retrieve phys/avail mem regions */
	mem_regions(&physmem_regions, &physmem_regions_sz,
	    &availmem_regions, &availmem_regions_sz);
	sz = 0;
	cnt = availmem_regions_sz;
	debugf("processing avail regions:\n");
	for (mp = availmem_regions; mp->mr_size; mp++) {
		s = mp->mr_start;
		e = mp->mr_start + mp->mr_size;
		debugf(" %09jx-%09jx -> ", (uintmax_t)s, (uintmax_t)e);
		/* Check whether this region holds all of the kernel. */
		if (s < kernload && e > phys_kernelend) {
			availmem_regions[cnt].mr_start = phys_kernelend;
			availmem_regions[cnt++].mr_size = e - phys_kernelend;
			e = kernload;
		}
		/* Look whether this region starts within the kernel. */
		if (s >= kernload && s < phys_kernelend) {
			if (e <= phys_kernelend)
				goto empty;
			s = phys_kernelend;
		}
		/* Now look whether this region ends within the kernel. */
		if (e > kernload && e <= phys_kernelend) {
			if (s >= kernload)
				goto empty;
			e = kernload;
		}
		/* Now page align the start and size of the region. */
		s = round_page(s);
		e = trunc_page(e);
		if (e < s)
			e = s;
		sz = e - s;
		debugf("%09jx-%09jx = %jx\n",
		    (uintmax_t)s, (uintmax_t)e, (uintmax_t)sz);

		/* Check whether some memory is left here. */
		if (sz == 0) {
		empty:
			memmove(mp, mp + 1,
			    (cnt - (mp - availmem_regions)) * sizeof(*mp));
			cnt--;
			mp--;
			continue;
		}

		/* Do an insertion sort.
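 *
 * Illustrative example with made-up addresses: if the list already
 * holds regions starting at 0x00000000 and 0x20000000 and the region
 * just trimmed starts at 0x10000000, the loop below stops at the
 * 0x20000000 entry, memmove() shifts that tail up one slot, and the
 * new region is written into the gap, keeping ascending order.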
*/ for (mp1 = availmem_regions; mp1 < mp; mp1++) if (s < mp1->mr_start) break; if (mp1 < mp) { memmove(mp1 + 1, mp1, (char *)mp - (char *)mp1); mp1->mr_start = s; mp1->mr_size = sz; } else { mp->mr_start = s; mp->mr_size = sz; } } availmem_regions_sz = cnt; /*******************************************************/ /* Steal physical memory for kernel stack from the end */ /* of the first avail region */ /*******************************************************/ kstack0_sz = kstack_pages * PAGE_SIZE; kstack0_phys = availmem_regions[0].mr_start + availmem_regions[0].mr_size; kstack0_phys -= kstack0_sz; availmem_regions[0].mr_size -= kstack0_sz; /*******************************************************/ /* Fill in phys_avail table, based on availmem_regions */ /*******************************************************/ phys_avail_count = 0; physsz = 0; hwphyssz = 0; TUNABLE_ULONG_FETCH("hw.physmem", (u_long *) &hwphyssz); debugf("fill in phys_avail:\n"); for (i = 0, j = 0; i < availmem_regions_sz; i++, j += 2) { debugf(" region: 0x%jx - 0x%jx (0x%jx)\n", (uintmax_t)availmem_regions[i].mr_start, (uintmax_t)availmem_regions[i].mr_start + availmem_regions[i].mr_size, (uintmax_t)availmem_regions[i].mr_size); if (hwphyssz != 0 && (physsz + availmem_regions[i].mr_size) >= hwphyssz) { debugf(" hw.physmem adjust\n"); if (physsz < hwphyssz) { phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + hwphyssz - physsz; physsz = hwphyssz; phys_avail_count++; } break; } phys_avail[j] = availmem_regions[i].mr_start; phys_avail[j + 1] = availmem_regions[i].mr_start + availmem_regions[i].mr_size; phys_avail_count++; physsz += availmem_regions[i].mr_size; } physmem = btoc(physsz); /* Calculate the last available physical address. */ for (i = 0; phys_avail[i + 2] != 0; i += 2) ; Maxmem = powerpc_btop(phys_avail[i + 1]); debugf("Maxmem = 0x%08lx\n", Maxmem); debugf("phys_avail_count = %d\n", phys_avail_count); debugf("physsz = 0x%09jx physmem = %jd (0x%09jx)\n", (uintmax_t)physsz, (uintmax_t)physmem, (uintmax_t)physmem); /*******************************************************/ /* Initialize (statically allocated) kernel pmap. */ /*******************************************************/ PMAP_LOCK_INIT(kernel_pmap); kptbl_min = VM_MIN_KERNEL_ADDRESS / PDIR_SIZE; debugf("kernel_pmap = 0x%08x\n", (uint32_t)kernel_pmap); debugf("kptbl_min = %d, kernel_ptbls = %d\n", kptbl_min, kernel_ptbls); debugf("kernel pdir range: 0x%08x - 0x%08x\n", kptbl_min * PDIR_SIZE, (kptbl_min + kernel_ptbls) * PDIR_SIZE - 1); kernel_pte_alloc(data_end, kernstart, kernel_pdir); for (i = 0; i < MAXCPU; i++) { kernel_pmap->pm_tid[i] = TID_KERNEL; /* Initialize each CPU's tidbusy entry 0 with kernel_pmap */ tidbusy[i][TID_KERNEL] = kernel_pmap; } /* Mark kernel_pmap active on all CPUs */ CPU_FILL(&kernel_pmap->pm_active); /* * Initialize the global pv list lock. 
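 * Every walk or modification of a page's pv list in this file happens
 * under this lock; the canonical pattern used throughout is:
 *
 *	rw_wlock(&pvh_global_lock);
 *	TAILQ_FOREACH(pv, &m->md.pv_list, pv_link)
 *		... inspect or tear down the mapping ...
 *	rw_wunlock(&pvh_global_lock);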
*/ rw_init(&pvh_global_lock, "pmap pv global"); /*******************************************************/ /* Final setup */ /*******************************************************/ /* Enter kstack0 into kernel map, provide guard page */ kstack0 = virtual_avail + KSTACK_GUARD_PAGES * PAGE_SIZE; thread0.td_kstack = kstack0; thread0.td_kstack_pages = kstack_pages; debugf("kstack_sz = 0x%08x\n", kstack0_sz); debugf("kstack0_phys at 0x%09llx - 0x%09llx\n", kstack0_phys, kstack0_phys + kstack0_sz); debugf("kstack0 at 0x%08x - 0x%08x\n", kstack0, kstack0 + kstack0_sz); virtual_avail += KSTACK_GUARD_PAGES * PAGE_SIZE + kstack0_sz; for (i = 0; i < kstack_pages; i++) { mmu_booke_kenter(mmu, kstack0, kstack0_phys); kstack0 += PAGE_SIZE; kstack0_phys += PAGE_SIZE; } pmap_bootstrapped = 1; debugf("virtual_avail = %08x\n", virtual_avail); debugf("virtual_end = %08x\n", virtual_end); debugf("mmu_booke_bootstrap: exit\n"); } #ifdef SMP void tlb1_ap_prep(void) { tlb_entry_t *e, tmp; unsigned int i; /* Prepare TLB1 image for AP processors */ e = __boot_tlb1; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&tmp, i); if ((tmp.mas1 & MAS1_VALID) && (tmp.mas2 & _TLB_ENTRY_SHARED)) memcpy(e++, &tmp, sizeof(tmp)); } } void pmap_bootstrap_ap(volatile uint32_t *trcp __unused) { int i; /* * Finish TLB1 configuration: the BSP already set up its TLB1 and we * have the snapshot of its contents in the s/w __boot_tlb1[] table * created by tlb1_ap_prep(), so use these values directly to * (re)program AP's TLB1 hardware. * * Start at index 1 because index 0 has the kernel map. */ for (i = 1; i < TLB1_ENTRIES; i++) { if (__boot_tlb1[i].mas1 & MAS1_VALID) tlb1_write_entry(&__boot_tlb1[i], i); } set_mas4_defaults(); } #endif static void booke_pmap_init_qpages(void) { struct pcpu *pc; int i; CPU_FOREACH(i) { pc = pcpu_find(i); pc->pc_qmap_addr = kva_alloc(PAGE_SIZE); if (pc->pc_qmap_addr == 0) panic("pmap_init_qpages: unable to allocate KVA"); } } SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, booke_pmap_init_qpages, NULL); /* * Get the physical page address for the given pmap/virtual address. */ static vm_paddr_t mmu_booke_extract(mmu_t mmu, pmap_t pmap, vm_offset_t va) { vm_paddr_t pa; PMAP_LOCK(pmap); pa = pte_vatopa(mmu, pmap, va); PMAP_UNLOCK(pmap); return (pa); } /* * Extract the physical page address associated with the given * kernel virtual address. */ static vm_paddr_t mmu_booke_kextract(mmu_t mmu, vm_offset_t va) { tlb_entry_t e; int i; /* Check TLB1 mappings */ for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (va >= e.virt && va < e.virt + e.size) return (e.phys + (va - e.virt)); } return (pte_vatopa(mmu, kernel_pmap, va)); } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ static void mmu_booke_init(mmu_t mmu) { int shpgperproc = PMAP_SHPGPERPROC; /* * Initialize the address space (zone) for the pv entries. Set a * high water mark so that the system can recover from excessive * numbers of pv entries. */ pvzone = uma_zcreate("PV ENTRY", sizeof(struct pv_entry), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM | UMA_ZONE_NOFREE); TUNABLE_INT_FETCH("vm.pmap.shpgperproc", &shpgperproc); pv_entry_max = shpgperproc * maxproc + vm_cnt.v_page_count; TUNABLE_INT_FETCH("vm.pmap.pv_entries", &pv_entry_max); pv_entry_high_water = 9 * (pv_entry_max / 10); uma_zone_reserve_kva(pvzone, pv_entry_max); /* Pre-fill pvzone with initial number of pv entries. 
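 * PV_ENTRY_ZONE_MIN entries are preallocated up front so that early
 * pv_alloc() calls are unlikely to fail before the VM is fully up.
 * pv_alloc() above also wakes the page daemon once pv_entry_count
 * crosses pv_entry_high_water (9/10 of pv_entry_max), giving the
 * system a chance to reclaim before the zone runs dry.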
*/ uma_prealloc(pvzone, PV_ENTRY_ZONE_MIN); /* Initialize ptbl allocation. */ ptbl_init(); } /* * Map a list of wired pages into kernel virtual address space. This is * intended for temporary mappings which do not need page modification or * references recorded. Existing mappings in the region are overwritten. */ static void mmu_booke_qenter(mmu_t mmu, vm_offset_t sva, vm_page_t *m, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(*m)); va += PAGE_SIZE; m++; } } /* * Remove page mappings from kernel virtual address space. Intended for * temporary mappings entered by mmu_booke_qenter. */ static void mmu_booke_qremove(mmu_t mmu, vm_offset_t sva, int count) { vm_offset_t va; va = sva; while (count-- > 0) { mmu_booke_kremove(mmu, va); va += PAGE_SIZE; } } /* * Map a wired page into kernel virtual address space. */ static void mmu_booke_kenter(mmu_t mmu, vm_offset_t va, vm_paddr_t pa) { mmu_booke_kenter_attr(mmu, va, pa, VM_MEMATTR_DEFAULT); } static void mmu_booke_kenter_attr(mmu_t mmu, vm_offset_t va, vm_paddr_t pa, vm_memattr_t ma) { uint32_t flags; pte_t *pte; KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kenter: invalid va")); flags = PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID; flags |= tlb_calc_wimg(pa, ma) << PTE_MAS2_SHIFT; flags |= PTE_PS_4KB; pte = pte_find(mmu, kernel_pmap, va); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: replacing entry!", __func__); /* Flush entry from TLB0 */ tlb0_flush_entry(va); } *pte = PTE_RPN_FROM_PA(pa) | flags; //debugf("mmu_booke_kenter: pdir_idx = %d ptbl_idx = %d va=0x%08x " // "pa=0x%08x rpn=0x%08x flags=0x%08x\n", // pdir_idx, ptbl_idx, va, pa, pte->rpn, pte->flags); /* Flush the real memory from the instruction cache. */ if ((flags & (PTE_I | PTE_G)) == 0) __syncicache((void *)va, PAGE_SIZE); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Remove a page from kernel page table. */ static void mmu_booke_kremove(mmu_t mmu, vm_offset_t va) { pte_t *pte; CTR2(KTR_PMAP,"%s: s (va = 0x%08x)\n", __func__, va); KASSERT(((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_kremove: invalid va")); pte = pte_find(mmu, kernel_pmap, va); if (!PTE_ISVALID(pte)) { CTR1(KTR_PMAP, "%s: invalid pte", __func__); return; } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Invalidate entry in TLB0, update PTE. */ tlb0_flush_entry(va); *pte = 0; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } /* * Initialize pmap associated with process 0. */ static void mmu_booke_pinit0(mmu_t mmu, pmap_t pmap) { PMAP_LOCK_INIT(pmap); mmu_booke_pinit(mmu, pmap); PCPU_SET(curpmap, pmap); } /* * Initialize a preallocated and zeroed pmap structure, * such as one in a vmspace structure. */ static void mmu_booke_pinit(mmu_t mmu, pmap_t pmap) { int i; CTR4(KTR_PMAP, "%s: pmap = %p, proc %d '%s'", __func__, pmap, curthread->td_proc->p_pid, curthread->td_proc->p_comm); KASSERT((pmap != kernel_pmap), ("pmap_pinit: initializing kernel_pmap")); for (i = 0; i < MAXCPU; i++) pmap->pm_tid[i] = TID_NONE; CPU_ZERO(&kernel_pmap->pm_active); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); bzero(&pmap->pm_pdir, sizeof(pte_t *) * PDIR_NENTRIES); TAILQ_INIT(&pmap->pm_ptbl_list); } /* * Release any resources held by the given physical map. * Called when a pmap initialized by mmu_booke_pinit is being released. * Should only be called if the map contains no valid mappings. 
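 *
 * A sketch of the expected lifecycle (vmspace teardown is the real
 * caller; the calls here are illustrative):
 *
 *	mmu_booke_pinit(mmu, pmap);
 *	... mmu_booke_enter() / mmu_booke_remove() activity ...
 *	mmu_booke_remove(mmu, pmap, start, end);   all mappings gone
 *	mmu_booke_release(mmu, pmap);              resident_count == 0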
*/ static void mmu_booke_release(mmu_t mmu, pmap_t pmap) { KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); } /* * Insert the given physical page at the specified virtual address in the * target physical map with the protection requested. If specified the page * will be wired down. */ static int mmu_booke_enter(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { int error; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); error = mmu_booke_enter_locked(mmu, pmap, va, m, prot, flags, psind); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (error); } static int mmu_booke_enter_locked(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int pmap_flags, int8_t psind __unused) { pte_t *pte; vm_paddr_t pa; uint32_t flags; int error, su, sync; pa = VM_PAGE_TO_PHYS(m); su = (pmap == kernel_pmap); sync = 0; //debugf("mmu_booke_enter_locked: s (pmap=0x%08x su=%d tid=%d m=0x%08x va=0x%08x " // "pa=0x%08x prot=0x%08x flags=%#x)\n", // (u_int32_t)pmap, su, pmap->pm_tid, // (u_int32_t)m, va, pa, prot, flags); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_enter_locked: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_enter_locked: user pmap, non user va")); } if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * If there is an existing mapping, and the physical address has not * changed, must be protection or wiring change. */ if (((pte = pte_find(mmu, pmap, va)) != NULL) && (PTE_ISVALID(pte)) && (PTE_PA(pte) == pa)) { /* * Before actually updating pte->flags we calculate and * prepare its new value in a helper var. */ flags = *pte; flags &= ~(PTE_UW | PTE_UX | PTE_SW | PTE_SX | PTE_MODIFIED); /* Wiring change, just update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) { if (!PTE_ISWIRED(pte)) { flags |= PTE_WIRED; pmap->pm_stats.wired_count++; } } else { if (PTE_ISWIRED(pte)) { flags &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } if (prot & VM_PROT_WRITE) { /* Add write permissions. */ flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((flags & PTE_MANAGED) != 0) vm_page_aflag_set(m, PGA_WRITEABLE); } else { /* Handle modified pages, sense modify status. */ /* * The PTE_MODIFIED flag could be set by underlying * TLB misses since we last read it (above), possibly * other CPUs could update it so we check in the PTE * directly rather than rely on that saved local flags * copy. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; /* * Check existing flags for execute permissions: if we * are turning execute permissions on, icache should * be flushed. */ if ((*pte & (PTE_UX | PTE_SX)) == 0) sync++; } flags &= ~PTE_REFERENCED; /* * The new flags value is all calculated -- only now actually * update the PTE. */ mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(va); *pte &= ~PTE_FLAGS_MASK; *pte |= flags; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } else { /* * If there is an existing mapping, but it's for a different * physical address, pte_enter() will delete the old mapping. */ //if ((pte != NULL) && PTE_ISVALID(pte)) // debugf("mmu_booke_enter_locked: replace\n"); //else // debugf("mmu_booke_enter_locked: new\n"); /* Now set up the flags and install the new mapping. 
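 * For instance (illustrative), a writable, non-executable user-space
 * mapping ends up with
 *
 *	flags = PTE_SR | PTE_UR | PTE_SW | PTE_UW | PTE_M | PTE_VALID;
 *
 * where the PTE_U* bits are the user-mode counterparts of the
 * supervisor PTE_S* bits.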
*/ flags = (PTE_SR | PTE_VALID); flags |= PTE_M; if (!su) flags |= PTE_UR; if (prot & VM_PROT_WRITE) { flags |= PTE_SW; if (!su) flags |= PTE_UW; if ((m->oflags & VPO_UNMANAGED) == 0) vm_page_aflag_set(m, PGA_WRITEABLE); } if (prot & VM_PROT_EXECUTE) { flags |= PTE_SX; if (!su) flags |= PTE_UX; } /* If its wired update stats. */ if ((pmap_flags & PMAP_ENTER_WIRED) != 0) flags |= PTE_WIRED; error = pte_enter(mmu, pmap, m, va, flags, (pmap_flags & PMAP_ENTER_NOSLEEP) != 0); if (error != 0) return (KERN_RESOURCE_SHORTAGE); if ((flags & PMAP_ENTER_WIRED) != 0) pmap->pm_stats.wired_count++; /* Flush the real memory from the instruction cache. */ if (prot & VM_PROT_EXECUTE) sync++; } if (sync && (su || pmap == PCPU_GET(curpmap))) { __syncicache((void *)va, PAGE_SIZE); sync = 0; } return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ static void mmu_booke_enter_object(mmu_t mmu, pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_page_t m; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); m = m_start; rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { mmu_booke_enter_locked(mmu, pmap, start + ptoa(diff), m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP, 0); m = TAILQ_NEXT(m, listq); } rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } static void mmu_booke_enter_quick(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); mmu_booke_enter_locked(mmu, pmap, va, m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), PMAP_ENTER_NOSLEEP, 0); rw_wunlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly rounded to the page size. */ static void mmu_booke_remove(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_offset_t endva) { pte_t *pte; uint8_t hold_flag; int su = (pmap == kernel_pmap); //debugf("mmu_booke_remove: s (su = %d pmap=0x%08x tid=%d va=0x%08x endva=0x%08x)\n", // su, (u_int32_t)pmap, pmap->pm_tid, va, endva); if (su) { KASSERT(((va >= virtual_avail) && (va <= VM_MAX_KERNEL_ADDRESS)), ("mmu_booke_remove: kernel pmap, non kernel va")); } else { KASSERT((va <= VM_MAXUSER_ADDRESS), ("mmu_booke_remove: user pmap, non user va")); } if (PMAP_REMOVE_DONE(pmap)) { //debugf("mmu_booke_remove: e (empty)\n"); return; } hold_flag = PTBL_HOLD_FLAG(pmap); //debugf("mmu_booke_remove: hold_flag = %d\n", hold_flag); rw_wlock(&pvh_global_lock); PMAP_LOCK(pmap); for (; va < endva; va += PAGE_SIZE) { pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) pte_remove(mmu, pmap, va, hold_flag); } PMAP_UNLOCK(pmap); rw_wunlock(&pvh_global_lock); //debugf("mmu_booke_remove: e\n"); } /* * Remove physical page from all pmaps in which it resides. 
 */
static void
mmu_booke_remove_all(mmu_t mmu, vm_page_t m)
{
	pv_entry_t pv, pvn;
	uint8_t hold_flag;

	rw_wlock(&pvh_global_lock);
	for (pv = TAILQ_FIRST(&m->md.pv_list); pv != NULL; pv = pvn) {
		pvn = TAILQ_NEXT(pv, pv_link);
		PMAP_LOCK(pv->pv_pmap);
		hold_flag = PTBL_HOLD_FLAG(pv->pv_pmap);
		pte_remove(mmu, pv->pv_pmap, pv->pv_va, hold_flag);
		PMAP_UNLOCK(pv->pv_pmap);
	}
	vm_page_aflag_clear(m, PGA_WRITEABLE);
	rw_wunlock(&pvh_global_lock);
}

/*
 * Map a range of physical addresses into kernel virtual address space.
 */
static vm_offset_t
mmu_booke_map(mmu_t mmu, vm_offset_t *virt, vm_paddr_t pa_start,
    vm_paddr_t pa_end, int prot)
{
	vm_offset_t sva = *virt;
	vm_offset_t va = sva;

	//debugf("mmu_booke_map: s (sva = 0x%08x pa_start = 0x%08x pa_end = 0x%08x)\n",
	//    sva, pa_start, pa_end);

	while (pa_start < pa_end) {
		mmu_booke_kenter(mmu, va, pa_start);
		va += PAGE_SIZE;
		pa_start += PAGE_SIZE;
	}
	*virt = va;

	//debugf("mmu_booke_map: e (va = 0x%08x)\n", va);
	return (sva);
}

/*
 * The pmap must be activated before its address space can be accessed in any
 * way.
 */
static void
mmu_booke_activate(mmu_t mmu, struct thread *td)
{
	pmap_t pmap;
	u_int cpuid;

	pmap = &td->td_proc->p_vmspace->vm_pmap;

	CTR5(KTR_PMAP, "%s: s (td = %p, proc = '%s', id = %d, pmap = 0x%08x)",
	    __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap);

	KASSERT((pmap != kernel_pmap), ("mmu_booke_activate: kernel_pmap!"));

	sched_pin();

	cpuid = PCPU_GET(cpuid);
	CPU_SET_ATOMIC(cpuid, &pmap->pm_active);
	PCPU_SET(curpmap, pmap);

	if (pmap->pm_tid[cpuid] == TID_NONE)
		tid_alloc(pmap);

	/* Load PID0 register with pmap tid value. */
	mtspr(SPR_PID0, pmap->pm_tid[cpuid]);
	__asm __volatile("isync");

	mtspr(SPR_DBCR0, td->td_pcb->pcb_cpu.booke.dbcr0);

	sched_unpin();

	CTR3(KTR_PMAP, "%s: e (tid = %d for '%s')", __func__,
	    pmap->pm_tid[PCPU_GET(cpuid)], td->td_proc->p_comm);
}

/*
 * Deactivate the specified process's address space.
 */
static void
mmu_booke_deactivate(mmu_t mmu, struct thread *td)
{
	pmap_t pmap;

	pmap = &td->td_proc->p_vmspace->vm_pmap;

	CTR5(KTR_PMAP, "%s: td=%p, proc = '%s', id = %d, pmap = 0x%08x",
	    __func__, td, td->td_proc->p_comm, td->td_proc->p_pid, pmap);

	td->td_pcb->pcb_cpu.booke.dbcr0 = mfspr(SPR_DBCR0);

	CPU_CLR_ATOMIC(PCPU_GET(cpuid), &pmap->pm_active);
	PCPU_SET(curpmap, NULL);
}

/*
 * Copy the range specified by src_addr/len
 * from the source map to the range dst_addr/len
 * in the destination map.
 *
 * This routine is only advisory and need not do anything.
 */
static void
mmu_booke_copy(mmu_t mmu, pmap_t dst_pmap, pmap_t src_pmap,
    vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr)
{

}

/*
 * Set the physical protection on the specified range of this map as requested.
 */
static void
mmu_booke_protect(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva,
    vm_prot_t prot)
{
	vm_offset_t va;
	vm_page_t m;
	pte_t *pte;

	if ((prot & VM_PROT_READ) == VM_PROT_NONE) {
		mmu_booke_remove(mmu, pmap, sva, eva);
		return;
	}

	if (prot & VM_PROT_WRITE)
		return;

	PMAP_LOCK(pmap);
	for (va = sva; va < eva; va += PAGE_SIZE) {
		if ((pte = pte_find(mmu, pmap, va)) != NULL) {
			if (PTE_ISVALID(pte)) {
				m = PHYS_TO_VM_PAGE(PTE_PA(pte));

				mtx_lock_spin(&tlbivax_mutex);
				tlb_miss_lock();

				/* Handle modified pages. */
				if (PTE_ISMODIFIED(pte) && PTE_ISMANAGED(pte))
					vm_page_dirty(m);

				tlb0_flush_entry(va);
				*pte &= ~(PTE_UW | PTE_SW | PTE_MODIFIED);

				tlb_miss_unlock();
				mtx_unlock_spin(&tlbivax_mutex);
			}
		}
	}
	PMAP_UNLOCK(pmap);
}

/*
 * Clear the write and modified bits in each of the given page's mappings.
*/ static void mmu_booke_remove_write(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) { if (PTE_ISVALID(pte)) { m = PHYS_TO_VM_PAGE(PTE_PA(pte)); mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); /* Handle modified pages. */ if (PTE_ISMODIFIED(pte)) vm_page_dirty(m); /* Flush mapping from TLB0. */ *pte &= ~(PTE_UW | PTE_SW | PTE_MODIFIED); tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } } PMAP_UNLOCK(pv->pv_pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); } static void mmu_booke_sync_icache(mmu_t mmu, pmap_t pm, vm_offset_t va, vm_size_t sz) { pte_t *pte; pmap_t pmap; vm_page_t m; vm_offset_t addr; vm_paddr_t pa = 0; int active, valid; va = trunc_page(va); sz = round_page(sz); rw_wlock(&pvh_global_lock); pmap = PCPU_GET(curpmap); active = (pm == kernel_pmap || pm == pmap) ? 1 : 0; while (sz > 0) { PMAP_LOCK(pm); pte = pte_find(mmu, pm, va); valid = (pte != NULL && PTE_ISVALID(pte)) ? 1 : 0; if (valid) pa = PTE_PA(pte); PMAP_UNLOCK(pm); if (valid) { if (!active) { /* Create a mapping in the active pmap. */ addr = 0; m = PHYS_TO_VM_PAGE(pa); PMAP_LOCK(pmap); pte_enter(mmu, pmap, m, addr, PTE_SR | PTE_VALID | PTE_UR, FALSE); __syncicache((void *)addr, PAGE_SIZE); pte_remove(mmu, pmap, addr, PTBL_UNHOLD); PMAP_UNLOCK(pmap); } else __syncicache((void *)va, PAGE_SIZE); } va += PAGE_SIZE; sz -= PAGE_SIZE; } rw_wunlock(&pvh_global_lock); } /* * Atomically extract and hold the physical page with the given * pmap and virtual address pair if that mapping permits the given * protection. */ static vm_page_t mmu_booke_extract_and_hold(mmu_t mmu, pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pte_t *pte; vm_page_t m; uint32_t pte_wbit; vm_paddr_t pa; m = NULL; pa = 0; PMAP_LOCK(pmap); retry: pte = pte_find(mmu, pmap, va); if ((pte != NULL) && PTE_ISVALID(pte)) { if (pmap == kernel_pmap) pte_wbit = PTE_SW; else pte_wbit = PTE_UW; if ((*pte & pte_wbit) || ((prot & VM_PROT_WRITE) == 0)) { if (vm_page_pa_tryrelock(pmap, PTE_PA(pte), &pa)) goto retry; m = PHYS_TO_VM_PAGE(PTE_PA(pte)); vm_page_hold(m); } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } /* * Initialize a vm_page's machine-dependent fields. */ static void mmu_booke_page_init(mmu_t mmu, vm_page_t m) { TAILQ_INIT(&m->md.pv_list); } /* * mmu_booke_zero_page_area zeros the specified hardware page by * mapping it into virtual memory and using bzero to clear * its contents. * * off and size must reside within a single page. */ static void mmu_booke_zero_page_area(mmu_t mmu, vm_page_t m, int off, int size) { vm_offset_t va; /* XXX KASSERT off and size are within a single page? */ mtx_lock(&zero_page_mutex); va = zero_page_va; mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m)); bzero((caddr_t)va + off, size); mmu_booke_kremove(mmu, va); mtx_unlock(&zero_page_mutex); } /* * mmu_booke_zero_page zeros the specified hardware page. 
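 * The loop below zeroes the page through a temporary KVA window one
 * cache line at a time with dcbz instead of calling bzero(); note
 * that dcbz is only usable here because mmu_booke_kenter() installs a
 * cacheable mapping (issuing dcbz against a cache-inhibited mapping
 * would trap).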
 */
static void
mmu_booke_zero_page(mmu_t mmu, vm_page_t m)
{
	vm_offset_t off, va;

	mtx_lock(&zero_page_mutex);
	va = zero_page_va;

	mmu_booke_kenter(mmu, va, VM_PAGE_TO_PHYS(m));
	for (off = 0; off < PAGE_SIZE; off += cacheline_size)
		__asm __volatile("dcbz 0,%0" :: "r"(va + off));
	mmu_booke_kremove(mmu, va);

	mtx_unlock(&zero_page_mutex);
}

/*
 * mmu_booke_copy_page copies the specified (machine independent) page by
 * mapping the page into virtual memory and using memcopy to copy the page,
 * one machine dependent page at a time.
 */
static void
mmu_booke_copy_page(mmu_t mmu, vm_page_t sm, vm_page_t dm)
{
	vm_offset_t sva, dva;

	sva = copy_page_src_va;
	dva = copy_page_dst_va;

	mtx_lock(&copy_page_mutex);
	mmu_booke_kenter(mmu, sva, VM_PAGE_TO_PHYS(sm));
	mmu_booke_kenter(mmu, dva, VM_PAGE_TO_PHYS(dm));
	memcpy((caddr_t)dva, (caddr_t)sva, PAGE_SIZE);
	mmu_booke_kremove(mmu, dva);
	mmu_booke_kremove(mmu, sva);
	mtx_unlock(&copy_page_mutex);
}

static inline void
mmu_booke_copy_pages(mmu_t mmu, vm_page_t *ma, vm_offset_t a_offset,
    vm_page_t *mb, vm_offset_t b_offset, int xfersize)
{
	void *a_cp, *b_cp;
	vm_offset_t a_pg_offset, b_pg_offset;
	int cnt;

	mtx_lock(&copy_page_mutex);
	while (xfersize > 0) {
		a_pg_offset = a_offset & PAGE_MASK;
		cnt = min(xfersize, PAGE_SIZE - a_pg_offset);
		mmu_booke_kenter(mmu, copy_page_src_va,
		    VM_PAGE_TO_PHYS(ma[a_offset >> PAGE_SHIFT]));
		a_cp = (char *)copy_page_src_va + a_pg_offset;
		b_pg_offset = b_offset & PAGE_MASK;
		cnt = min(cnt, PAGE_SIZE - b_pg_offset);
		mmu_booke_kenter(mmu, copy_page_dst_va,
		    VM_PAGE_TO_PHYS(mb[b_offset >> PAGE_SHIFT]));
		b_cp = (char *)copy_page_dst_va + b_pg_offset;
		bcopy(a_cp, b_cp, cnt);
		mmu_booke_kremove(mmu, copy_page_dst_va);
		mmu_booke_kremove(mmu, copy_page_src_va);
		a_offset += cnt;
		b_offset += cnt;
		xfersize -= cnt;
	}
	mtx_unlock(&copy_page_mutex);
}

static vm_offset_t
mmu_booke_quick_enter_page(mmu_t mmu, vm_page_t m)
{
	vm_paddr_t paddr;
	vm_offset_t qaddr;
	uint32_t flags;
	pte_t *pte;

	paddr = VM_PAGE_TO_PHYS(m);

	flags = PTE_SR | PTE_SW | PTE_SX | PTE_WIRED | PTE_VALID;
	flags |= tlb_calc_wimg(paddr, pmap_page_get_memattr(m)) << PTE_MAS2_SHIFT;
	flags |= PTE_PS_4KB;

	critical_enter();
	qaddr = PCPU_GET(qmap_addr);

	pte = pte_find(mmu, kernel_pmap, qaddr);

	KASSERT(*pte == 0, ("mmu_booke_quick_enter_page: PTE busy"));

	/*
	 * XXX: tlbivax is broadcast to other cores, but qaddr should
	 * not be present in other TLBs.  Is there a better instruction
	 * sequence to use? Or just forget it & use mmu_booke_kenter()...
	 */
	__asm __volatile("tlbivax 0, %0" :: "r"(qaddr & MAS2_EPN_MASK));
	__asm __volatile("isync; msync");

	*pte = PTE_RPN_FROM_PA(paddr) | flags;

	/* Flush the real memory from the instruction cache. */
	if ((flags & (PTE_I | PTE_G)) == 0)
		__syncicache((void *)qaddr, PAGE_SIZE);

	return (qaddr);
}

static void
mmu_booke_quick_remove_page(mmu_t mmu, vm_offset_t addr)
{
	pte_t *pte;

	pte = pte_find(mmu, kernel_pmap, addr);

	KASSERT(PCPU_GET(qmap_addr) == addr,
	    ("mmu_booke_quick_remove_page: invalid address"));
	KASSERT(*pte != 0,
	    ("mmu_booke_quick_remove_page: PTE not in use"));

	*pte = 0;
	critical_exit();
}

/*
 * Return whether or not the specified physical page was modified
 * in any of physical maps.
 */
static boolean_t
mmu_booke_is_modified(mmu_t mmu, vm_page_t m)
{
	pte_t *pte;
	pv_entry_t pv;
	boolean_t rv;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("mmu_booke_is_modified: page %p is not managed", m));
	rv = FALSE;

	/*
	 * If the page is not exclusive busied, then PGA_WRITEABLE cannot be
	 * concurrently set while the object is locked.  Thus, if PGA_WRITEABLE
	 * is clear, no PTEs can be modified.
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (rv); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISMODIFIED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return whether or not the specified virtual address is eligible * for prefault. */ static boolean_t mmu_booke_is_prefaultable(mmu_t mmu, pmap_t pmap, vm_offset_t addr) { return (FALSE); } /* * Return whether or not the specified physical page was referenced * in any physical maps. */ static boolean_t mmu_booke_is_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_is_referenced: page %p is not managed", m)); rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { if (PTE_ISREFERENCED(pte)) rv = TRUE; } PMAP_UNLOCK(pv->pv_pmap); if (rv) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Clear the modify bits on the specified physical page. */ static void mmu_booke_clear_modify(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("mmu_booke_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PG_AWRITEABLE, then no PTEs can be modified. * If the object containing the page is locked and the page is not * exclusive busied, then PG_AWRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); if (*pte & (PTE_SW | PTE_UW | PTE_MODIFIED)) { tlb0_flush_entry(pv->pv_va); *pte &= ~(PTE_SW | PTE_UW | PTE_MODIFIED | PTE_REFERENCED); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); } /* * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. + * As an optimization, update the page's dirty field if a modified bit is + * found while counting reference bits. This opportunistic update can be + * performed at low cost and can eliminate the need for some future calls + * to pmap_is_modified(). However, since this function stops after + * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some + * dirty pages. Those dirty pages will only be detected by a future call + * to pmap_is_modified(). 
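* (The scan is bounded by PMAP_TS_REFERENCED_MAX: the loop below stops counting once that many reference bits have been found and cleared.)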
*/ static int mmu_booke_ts_referenced(mmu_t mmu, vm_page_t m) { pte_t *pte; pv_entry_t pv; int count; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_ts_referenced: page %p is not managed", m)); count = 0; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL && PTE_ISVALID(pte)) { + if (PTE_ISMODIFIED(pte)) + vm_page_dirty(m); if (PTE_ISREFERENCED(pte)) { mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); tlb0_flush_entry(pv->pv_va); *pte &= ~PTE_REFERENCED; tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); - if (++count > 4) { + if (++count >= PMAP_TS_REFERENCED_MAX) { PMAP_UNLOCK(pv->pv_pmap); break; } } } PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range must * have the wired attribute set. In contrast, invalid mappings cannot have * the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, so * there is no need to invalidate any TLB entries. */ static void mmu_booke_unwire(mmu_t mmu, pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va; pte_t *pte; PMAP_LOCK(pmap); for (va = sva; va < eva; va += PAGE_SIZE) { if ((pte = pte_find(mmu, pmap, va)) != NULL && PTE_ISVALID(pte)) { if (!PTE_ISWIRED(pte)) panic("mmu_booke_unwire: pte %p isn't wired", pte); *pte &= ~PTE_WIRED; pmap->pm_stats.wired_count--; } } PMAP_UNLOCK(pmap); } /* * Return true if the pmap's pv is one of the first 16 pvs linked to from this * page. This count may be changed upwards or downwards in the future; it is * only necessary that true be returned for a small subset of pmaps for proper * page aging. */ static boolean_t mmu_booke_page_exists_quick(mmu_t mmu, pmap_t pmap, vm_page_t m) { pv_entry_t pv; int loops; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("mmu_booke_page_exists_quick: page %p is not managed", m)); loops = 0; rv = FALSE; rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { if (pv->pv_pmap == pmap) { rv = TRUE; break; } if (++loops >= 16) break; } rw_wunlock(&pvh_global_lock); return (rv); } /* * Return the number of managed mappings to the given physical page that are * wired. */ static int mmu_booke_page_wired_mappings(mmu_t mmu, vm_page_t m) { pv_entry_t pv; pte_t *pte; int count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&pvh_global_lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_link) { PMAP_LOCK(pv->pv_pmap); if ((pte = pte_find(mmu, pv->pv_pmap, pv->pv_va)) != NULL) if (PTE_ISVALID(pte) && PTE_ISWIRED(pte)) count++; PMAP_UNLOCK(pv->pv_pmap); } rw_wunlock(&pvh_global_lock); return (count); } static int mmu_booke_dev_direct_mapped(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { int i; vm_offset_t va; /* * This currently does not work for entries that * overlap TLB1 entries. */ for (i = 0; i < TLB1_ENTRIES; i ++) { if (tlb1_iomapped(i, pa, size, &va) == 0) return (0); } return (EFAULT); } void mmu_booke_dumpsys_map(mmu_t mmu, vm_paddr_t pa, size_t sz, void **va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; /* Minidumps are based on virtual memory addresses. */ if (do_minidump) { *va = (void *)(vm_offset_t)pa; return; } /* Raw physical memory dumps don't have a virtual address. */ /* We always map a 256MB page at 256M. 
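* For example, pa = 0x12345678 gives ppa = rounddown2(pa, gran) = 0x10000000 and ofs = 0x02345678; a second 256MB entry is added only when sz extends past gran - ofs.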
*/ gran = 256 * 1024 * 1024; ppa = rounddown2(pa, gran); ofs = pa - ppa; *va = (void *)gran; tlb1_set_entry((vm_offset_t)va, ppa, gran, _TLB_ENTRY_IO); if (sz > (gran - ofs)) tlb1_set_entry((vm_offset_t)(va + gran), ppa + gran, gran, _TLB_ENTRY_IO); } void mmu_booke_dumpsys_unmap(mmu_t mmu, vm_paddr_t pa, size_t sz, void *va) { vm_paddr_t ppa; vm_offset_t ofs; vm_size_t gran; tlb_entry_t e; int i; /* Minidumps are based on virtual memory addresses. */ /* Nothing to do... */ if (do_minidump) return; for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) break; } /* Raw physical memory dumps don't have a virtual address. */ i--; e.mas1 = 0; e.mas2 = 0; e.mas3 = 0; tlb1_write_entry(&e, i); gran = 256 * 1024 * 1024; ppa = rounddown2(pa, gran); ofs = pa - ppa; if (sz > (gran - ofs)) { i--; e.mas1 = 0; e.mas2 = 0; e.mas3 = 0; tlb1_write_entry(&e, i); } } extern struct dump_pa dump_map[PHYS_AVAIL_SZ + 1]; void mmu_booke_scan_init(mmu_t mmu) { vm_offset_t va; pte_t *pte; int i; if (!do_minidump) { /* Initialize phys. segments for dumpsys(). */ memset(&dump_map, 0, sizeof(dump_map)); mem_regions(&physmem_regions, &physmem_regions_sz, &availmem_regions, &availmem_regions_sz); for (i = 0; i < physmem_regions_sz; i++) { dump_map[i].pa_start = physmem_regions[i].mr_start; dump_map[i].pa_size = physmem_regions[i].mr_size; } return; } /* Virtual segments for minidumps: */ memset(&dump_map, 0, sizeof(dump_map)); /* 1st: kernel .data and .bss. */ dump_map[0].pa_start = trunc_page((uintptr_t)_etext); dump_map[0].pa_size = round_page((uintptr_t)_end) - dump_map[0].pa_start; /* 2nd: msgbuf and tables (see pmap_bootstrap()). */ dump_map[1].pa_start = data_start; dump_map[1].pa_size = data_end - data_start; /* 3rd: kernel VM. */ va = dump_map[1].pa_start + dump_map[1].pa_size; /* Find start of next chunk (from va). */ while (va < virtual_end) { /* Don't dump the buffer cache. */ if (va >= kmi.buffer_sva && va < kmi.buffer_eva) { va = kmi.buffer_eva; continue; } pte = pte_find(mmu, kernel_pmap, va); if (pte != NULL && PTE_ISVALID(pte)) break; va += PAGE_SIZE; } if (va < virtual_end) { dump_map[2].pa_start = va; va += PAGE_SIZE; /* Find last page in chunk. */ while (va < virtual_end) { /* Don't run into the buffer cache. */ if (va == kmi.buffer_sva) break; pte = pte_find(mmu, kernel_pmap, va); if (pte == NULL || !PTE_ISVALID(pte)) break; va += PAGE_SIZE; } dump_map[2].pa_size = va - dump_map[2].pa_start; } } /* * Map a set of physical memory pages into the kernel virtual address space. * Return a pointer to where it is mapped. This routine is intended to be used * for mapping device memory, NOT real memory. */ static void * mmu_booke_mapdev(mmu_t mmu, vm_paddr_t pa, vm_size_t size) { return (mmu_booke_mapdev_attr(mmu, pa, size, VM_MEMATTR_DEFAULT)); } static void * mmu_booke_mapdev_attr(mmu_t mmu, vm_paddr_t pa, vm_size_t size, vm_memattr_t ma) { tlb_entry_t e; void *res; uintptr_t va, tmpva; vm_size_t sz; int i; /* * Check if this is premapped in TLB1. Note: this should probably also * check whether a sequence of TLB1 entries exist that match the * requirement, but now only checks the easy case. */ if (ma == VM_MEMATTR_DEFAULT) { for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (pa >= e.phys && (pa + size) <= (e.phys + e.size)) return (void *)(e.virt + (vm_offset_t)(pa - e.phys)); } } size = roundup(size, PAGE_SIZE); /* * The device mapping area is between VM_MAXUSER_ADDRESS and * VM_MIN_KERNEL_ADDRESS. 
This gives 1GB of device addressing. */ #ifdef SPARSE_MAPDEV /* * With a sparse mapdev, align to the largest starting region. This * could feasibly be optimized for a 'best-fit' alignment, but that * calculation could be very costly. */ do { tmpva = tlb1_map_base; va = roundup(tlb1_map_base, 1 << flsl(size)); } while (!atomic_cmpset_int(&tlb1_map_base, tmpva, va + size)); #else va = atomic_fetchadd_int(&tlb1_map_base, size); #endif res = (void *)va; do { sz = 1 << (ilog2(size) & ~1); if (va % sz != 0) { do { sz >>= 2; } while (va % sz != 0); } if (bootverbose) printf("Wiring VA=%x to PA=%jx (size=%x)\n", va, (uintmax_t)pa, sz); tlb1_set_entry(va, pa, sz, _TLB_ENTRY_SHARED | tlb_calc_wimg(pa, ma)); size -= sz; pa += sz; va += sz; } while (size > 0); return (res); } /* * 'Unmap' a range mapped by mmu_booke_mapdev(). */ static void mmu_booke_unmapdev(mmu_t mmu, vm_offset_t va, vm_size_t size) { #ifdef SUPPORTS_SHRINKING_TLB1 vm_offset_t base, offset; /* * Unmap only if this is inside kernel virtual space. */ if ((va >= VM_MIN_KERNEL_ADDRESS) && (va <= VM_MAX_KERNEL_ADDRESS)) { base = trunc_page(va); offset = va & PAGE_MASK; size = roundup(offset + size, PAGE_SIZE); kva_free(base, size); } #endif } /* * mmu_booke_object_init_pt preloads the ptes for a given object into the * specified pmap. This eliminates the blast of soft faults on process startup * and immediately after an mmap. */ static void mmu_booke_object_init_pt(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("mmu_booke_object_init_pt: non-device object")); } /* * Perform the pmap work for mincore. */ static int mmu_booke_mincore(mmu_t mmu, pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { /* XXX: this should be implemented at some point */ return (0); } static int mmu_booke_change_attr(mmu_t mmu, vm_offset_t addr, vm_size_t sz, vm_memattr_t mode) { vm_offset_t va; pte_t *pte; int i, j; tlb_entry_t e; /* Check TLB1 mappings */ for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (addr >= e.virt && addr < e.virt + e.size) break; } if (i < TLB1_ENTRIES) { /* Only allow full mappings to be modified for now. */ /* Validate the range. */ for (j = i, va = addr; va < addr + sz; va += e.size, j++) { tlb1_read_entry(&e, j); if (va != e.virt || (sz - (va - addr) < e.size)) return (EINVAL); } for (va = addr; va < addr + sz; va += e.size, i++) { tlb1_read_entry(&e, i); e.mas2 &= ~MAS2_WIMGE_MASK; e.mas2 |= tlb_calc_wimg(e.phys, mode); /* * Write it out to the TLB. Should really re-sync with other * cores. */ tlb1_write_entry(&e, i); } return (0); } /* Not in TLB1, try through pmap */ /* First validate the range. */ for (va = addr; va < addr + sz; va += PAGE_SIZE) { pte = pte_find(mmu, kernel_pmap, va); if (pte == NULL || !PTE_ISVALID(pte)) return (EINVAL); } mtx_lock_spin(&tlbivax_mutex); tlb_miss_lock(); for (va = addr; va < addr + sz; va += PAGE_SIZE) { pte = pte_find(mmu, kernel_pmap, va); *pte &= ~(PTE_MAS2_MASK << PTE_MAS2_SHIFT); *pte |= tlb_calc_wimg(PTE_PA(pte), mode << PTE_MAS2_SHIFT); tlb0_flush_entry(va); } tlb_miss_unlock(); mtx_unlock_spin(&tlbivax_mutex); return (pte_vatopa(mmu, kernel_pmap, va)); } /**************************************************************************/ /* TID handling */ /**************************************************************************/ /* * Allocate a TID. 
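* (The TID is the translation ID in MAS1 that tags this pmap's TLB0 entries; each CPU manages its own TID space through its tid_next counter.)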
If necessary, steal one from someone else. * The new TID is flushed from the TLB before returning. */ static tlbtid_t tid_alloc(pmap_t pmap) { tlbtid_t tid; int thiscpu; KASSERT((pmap != kernel_pmap), ("tid_alloc: kernel pmap")); CTR2(KTR_PMAP, "%s: s (pmap = %p)", __func__, pmap); thiscpu = PCPU_GET(cpuid); tid = PCPU_GET(tid_next); if (tid > TID_MAX) tid = TID_MIN; PCPU_SET(tid_next, tid + 1); /* If we are stealing TID then clear the relevant pmap's field */ if (tidbusy[thiscpu][tid] != NULL) { CTR2(KTR_PMAP, "%s: warning: stealing tid %d", __func__, tid); tidbusy[thiscpu][tid]->pm_tid[thiscpu] = TID_NONE; /* Flush all entries from TLB0 matching this TID. */ tid_flush(tid); } tidbusy[thiscpu][tid] = pmap; pmap->pm_tid[thiscpu] = tid; __asm __volatile("msync; isync"); CTR3(KTR_PMAP, "%s: e (%02d next = %02d)", __func__, tid, PCPU_GET(tid_next)); return (tid); } /**************************************************************************/ /* TLB0 handling */ /**************************************************************************/ static void tlb_print_entry(int i, uint32_t mas1, uint32_t mas2, uint32_t mas3, uint32_t mas7) { int as; char desc[3]; tlbtid_t tid; vm_size_t size; unsigned int tsize; desc[2] = '\0'; if (mas1 & MAS1_VALID) desc[0] = 'V'; else desc[0] = ' '; if (mas1 & MAS1_IPROT) desc[1] = 'P'; else desc[1] = ' '; as = (mas1 & MAS1_TS_MASK) ? 1 : 0; tid = MAS1_GETTID(mas1); tsize = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; size = 0; if (tsize) size = tsize2size(tsize); debugf("%3d: (%s) [AS=%d] " "sz = 0x%08x tsz = %d tid = %d mas1 = 0x%08x " "mas2(va) = 0x%08x mas3(pa) = 0x%08x mas7 = 0x%08x\n", i, desc, as, size, tsize, tid, mas1, mas2, mas3, mas7); } /* Convert TLB0 va and way number to tlb0[] table index. */ static inline unsigned int tlb0_tableidx(vm_offset_t va, unsigned int way) { unsigned int idx; idx = (way * TLB0_ENTRIES_PER_WAY); idx += (va & MAS2_TLB0_ENTRY_IDX_MASK) >> MAS2_TLB0_ENTRY_IDX_SHIFT; return (idx); } /* * Invalidate TLB0 entry. */ static inline void tlb0_flush_entry(vm_offset_t va) { CTR2(KTR_PMAP, "%s: s va=0x%08x", __func__, va); mtx_assert(&tlbivax_mutex, MA_OWNED); __asm __volatile("tlbivax 0, %0" :: "r"(va & MAS2_EPN_MASK)); __asm __volatile("isync; msync"); __asm __volatile("tlbsync; msync"); CTR1(KTR_PMAP, "%s: e", __func__); } /* Print out contents of the MAS registers for each TLB0 entry */ void tlb0_print_tlbentries(void) { uint32_t mas0, mas1, mas2, mas3, mas7; int entryidx, way, idx; debugf("TLB0 entries:\n"); for (way = 0; way < TLB0_WAYS; way ++) for (entryidx = 0; entryidx < TLB0_ENTRIES_PER_WAY; entryidx++) { mas0 = MAS0_TLBSEL(0) | MAS0_ESEL(way); mtspr(SPR_MAS0, mas0); __asm __volatile("isync"); mas2 = entryidx << MAS2_TLB0_ENTRY_IDX_SHIFT; mtspr(SPR_MAS2, mas2); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); idx = tlb0_tableidx(mas2, way); tlb_print_entry(idx, mas1, mas2, mas3, mas7); } } /**************************************************************************/ /* TLB1 handling */ /**************************************************************************/ /* * TLB1 mapping notes: * * TLB1[0] Kernel text and data. * TLB1[1-15] Additional kernel text and data mappings (if required), PCI * windows, other devices mappings. */ /* * Read an entry from given TLB1 slot. 
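* The slot is selected through MAS0[TLBSEL, ESEL]; tlbre then fills the remaining MAS registers, which are copied into *entry along with the decoded virt, phys and size fields.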
*/ void tlb1_read_entry(tlb_entry_t *entry, unsigned int slot) { uint32_t mas0; KASSERT((entry != NULL), ("%s(): Entry is NULL!", __func__)); mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(slot); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); entry->mas1 = mfspr(SPR_MAS1); entry->mas2 = mfspr(SPR_MAS2); entry->mas3 = mfspr(SPR_MAS3); switch ((mfpvr() >> 16) & 0xFFFF) { case FSL_E500v2: case FSL_E500mc: case FSL_E5500: case FSL_E6500: entry->mas7 = mfspr(SPR_MAS7); break; default: entry->mas7 = 0; break; } entry->virt = entry->mas2 & MAS2_EPN_MASK; entry->phys = ((vm_paddr_t)(entry->mas7 & MAS7_RPN) << 32) | (entry->mas3 & MAS3_RPN); entry->size = tsize2size((entry->mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT); } /* * Write given entry to TLB1 hardware. * Use 32 bit pa, clear 4 high-order bits of RPN (mas7). */ static void tlb1_write_entry(tlb_entry_t *e, unsigned int idx) { uint32_t mas0; //debugf("tlb1_write_entry: s\n"); /* Select entry */ mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(idx); //debugf("tlb1_write_entry: mas0 = 0x%08x\n", mas0); mtspr(SPR_MAS0, mas0); __asm __volatile("isync"); mtspr(SPR_MAS1, e->mas1); __asm __volatile("isync"); mtspr(SPR_MAS2, e->mas2); __asm __volatile("isync"); mtspr(SPR_MAS3, e->mas3); __asm __volatile("isync"); switch ((mfpvr() >> 16) & 0xFFFF) { case FSL_E500mc: case FSL_E5500: case FSL_E6500: mtspr(SPR_MAS8, 0); __asm __volatile("isync"); /* FALLTHROUGH */ case FSL_E500v2: mtspr(SPR_MAS7, e->mas7); __asm __volatile("isync"); break; default: break; } __asm __volatile("tlbwe; isync; msync"); //debugf("tlb1_write_entry: e\n"); } /* * Return the largest uint value log such that 2^log <= num. */ static unsigned int ilog2(unsigned int num) { int lz; __asm ("cntlzw %0, %1" : "=r" (lz) : "r" (num)); return (31 - lz); } /* * Convert TLB TSIZE value to mapped region size. */ static vm_size_t tsize2size(unsigned int tsize) { /* * size = 4^tsize KB * size = 4^tsize * 2^10 = 2^(2 * tsize - 10) */ return ((1 << (2 * tsize)) * 1024); } /* * Convert region size (must be power of 4) to TLB TSIZE value. */ static unsigned int size2tsize(vm_size_t size) { return (ilog2(size) / 2 - 5); } /* * Register permanent kernel mapping in TLB1. * * Entries are created starting from index 0 (current free entry is * kept in tlb1_idx) and are not supposed to be invalidated. */ int tlb1_set_entry(vm_offset_t va, vm_paddr_t pa, vm_size_t size, uint32_t flags) { tlb_entry_t e; uint32_t ts, tid; int tsize, index; for (index = 0; index < TLB1_ENTRIES; index++) { tlb1_read_entry(&e, index); if ((e.mas1 & MAS1_VALID) == 0) break; /* Check if we're just updating the flags, and update them. 
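* (An existing entry that already matches pa, va and size only needs its MAS2 flag bits refreshed.)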
*/ if (e.phys == pa && e.virt == va && e.size == size) { e.mas2 = (va & MAS2_EPN_MASK) | flags; tlb1_write_entry(&e, index); return (0); } } if (index >= TLB1_ENTRIES) { printf("tlb1_set_entry: TLB1 full!\n"); return (-1); } /* Convert size to TSIZE */ tsize = size2tsize(size); tid = (TID_KERNEL << MAS1_TID_SHIFT) & MAS1_TID_MASK; /* XXX TS is hard coded to 0 for now as we only use a single address space */ ts = (0 << MAS1_TS_SHIFT) & MAS1_TS_MASK; e.phys = pa; e.virt = va; e.size = size; e.mas1 = MAS1_VALID | MAS1_IPROT | ts | tid; e.mas1 |= ((tsize << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK); e.mas2 = (va & MAS2_EPN_MASK) | flags; /* Set supervisor RWX permission bits */ e.mas3 = (pa & MAS3_RPN) | MAS3_SR | MAS3_SW | MAS3_SX; e.mas7 = (pa >> 32) & MAS7_RPN; tlb1_write_entry(&e, index); /* * XXX in general TLB1 updates should be propagated between CPUs, since * the current design assumes the same TLB1 set-up on all cores. */ return (0); } /* * Map a contiguous RAM region into the TLB1 using at most * KERNEL_REGION_MAX_TLB_ENTRIES entries. * * If necessary, round up the last entry size and return the total size * used by all allocated entries. */ vm_size_t tlb1_mapin_region(vm_offset_t va, vm_paddr_t pa, vm_size_t size) { vm_size_t pgs[KERNEL_REGION_MAX_TLB_ENTRIES]; vm_size_t mapped, pgsz, base, mask; int idx, nents; /* Round up to the next 1M */ size = roundup2(size, 1 << 20); mapped = 0; idx = 0; base = va; pgsz = 64*1024*1024; while (mapped < size) { while (mapped < size && idx < KERNEL_REGION_MAX_TLB_ENTRIES) { while (pgsz > (size - mapped)) pgsz >>= 2; pgs[idx++] = pgsz; mapped += pgsz; } /* We under-map. Correct for this. */ if (mapped < size) { while (pgs[idx - 1] == pgsz) { idx--; mapped -= pgsz; } /* XXX We may increase beyond our starting point. */ pgsz <<= 2; pgs[idx++] = pgsz; mapped += pgsz; } } nents = idx; mask = pgs[0] - 1; /* Align address to the boundary */ if (va & mask) { va = (va + mask) & ~mask; pa = (pa + mask) & ~mask; } for (idx = 0; idx < nents; idx++) { pgsz = pgs[idx]; debugf("%u: %llx -> %x, size=%x\n", idx, pa, va, pgsz); tlb1_set_entry(va, pa, pgsz, _TLB_ENTRY_SHARED | _TLB_ENTRY_MEM); pa += pgsz; va += pgsz; } mapped = (va - base); #ifdef __powerpc64__ printf("mapped size 0x%016lx (wasted space 0x%16lx)\n", #else printf("mapped size 0x%08x (wasted space 0x%08x)\n", #endif mapped, mapped - size); return (mapped); } /* * TLB1 initialization routine, to be called after the very first * assembler level setup done in locore.S. */ void tlb1_init() { uint32_t mas0, mas1, mas2, mas3, mas7; uint32_t tsz; tlb1_get_tlbconf(); mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(0); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); kernload = ((vm_paddr_t)(mas7 & MAS7_RPN) << 32) | (mas3 & MAS3_RPN); tsz = (mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; kernsize += (tsz > 0) ? tsize2size(tsz) : 0; /* Setup TLB miss defaults */ set_mas4_defaults(); } /* * pmap_early_io_unmap() should be used shortly after pmap_early_io_map(), * as in the following snippet: * * x = pmap_early_io_map(...); * * pmap_early_io_unmap(x, size); * * with no further TLB1 allocations in between.
*/ void pmap_early_io_unmap(vm_offset_t va, vm_size_t size) { int i; tlb_entry_t e; vm_size_t isize; size = roundup(size, PAGE_SIZE); isize = size; for (i = 0; i < TLB1_ENTRIES && size > 0; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (va <= e.virt && (va + isize) >= (e.virt + e.size)) { size -= e.size; e.mas1 &= ~MAS1_VALID; tlb1_write_entry(&e, i); } } if (tlb1_map_base == va + isize) tlb1_map_base -= isize; } vm_offset_t pmap_early_io_map(vm_paddr_t pa, vm_size_t size) { vm_paddr_t pa_base; vm_offset_t va, sz; int i; tlb_entry_t e; KASSERT(!pmap_bootstrapped, ("Do not use after PMAP is up!")); for (i = 0; i < TLB1_ENTRIES; i++) { tlb1_read_entry(&e, i); if (!(e.mas1 & MAS1_VALID)) continue; if (pa >= e.phys && (pa + size) <= (e.phys + e.size)) return (e.virt + (pa - e.phys)); } pa_base = rounddown(pa, PAGE_SIZE); size = roundup(size + (pa - pa_base), PAGE_SIZE); tlb1_map_base = roundup2(tlb1_map_base, 1 << (ilog2(size) & ~1)); va = tlb1_map_base + (pa - pa_base); do { sz = 1 << (ilog2(size) & ~1); tlb1_set_entry(tlb1_map_base, pa_base, sz, _TLB_ENTRY_SHARED | _TLB_ENTRY_IO); size -= sz; pa_base += sz; tlb1_map_base += sz; } while (size > 0); return (va); } /* * Setup MAS4 defaults. * These values are loaded to MAS0-2 on a TLB miss. */ static void set_mas4_defaults(void) { uint32_t mas4; /* Defaults: TLB0, PID0, TSIZED=4K */ mas4 = MAS4_TLBSELD0; mas4 |= (TLB_SIZE_4K << MAS4_TSIZED_SHIFT) & MAS4_TSIZED_MASK; #ifdef SMP mas4 |= MAS4_MD; #endif mtspr(SPR_MAS4, mas4); __asm __volatile("isync"); } /* * Print out contents of the MAS registers for each TLB1 entry */ void tlb1_print_tlbentries(void) { uint32_t mas0, mas1, mas2, mas3, mas7; int i; debugf("TLB1 entries:\n"); for (i = 0; i < TLB1_ENTRIES; i++) { mas0 = MAS0_TLBSEL(1) | MAS0_ESEL(i); mtspr(SPR_MAS0, mas0); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); mas2 = mfspr(SPR_MAS2); mas3 = mfspr(SPR_MAS3); mas7 = mfspr(SPR_MAS7); tlb_print_entry(i, mas1, mas2, mas3, mas7); } } /* * Return 0 if the physical IO range is encompassed by one of the * TLB1 entries, otherwise return the related error code. */ static int tlb1_iomapped(int i, vm_paddr_t pa, vm_size_t size, vm_offset_t *va) { uint32_t prot; vm_paddr_t pa_start; vm_paddr_t pa_end; unsigned int entry_tsize; vm_size_t entry_size; tlb_entry_t e; *va = (vm_offset_t)NULL; tlb1_read_entry(&e, i); /* Skip invalid entries */ if (!(e.mas1 & MAS1_VALID)) return (EINVAL); /* * The entry must be cache-inhibited, guarded, and r/w * so it can function as an i/o page */ prot = e.mas2 & (MAS2_I | MAS2_G); if (prot != (MAS2_I | MAS2_G)) return (EPERM); prot = e.mas3 & (MAS3_SR | MAS3_SW); if (prot != (MAS3_SR | MAS3_SW)) return (EPERM); /* The address should be within the entry range. */ entry_tsize = (e.mas1 & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT; KASSERT((entry_tsize), ("tlb1_iomapped: invalid entry tsize")); entry_size = tsize2size(entry_tsize); pa_start = (((vm_paddr_t)e.mas7 & MAS7_RPN) << 32) | (e.mas3 & MAS3_RPN); pa_end = pa_start + entry_size; if ((pa < pa_start) || ((pa + size) > pa_end)) return (ERANGE); /* Return virtual address of this mapping. */ *va = (e.mas2 & MAS2_EPN_MASK) + (pa - pa_start); return (0); } /* * Invalidate all TLB0 entries which match the given TID. Note this is * dedicated for cases when invalidations should NOT be propagated to other * CPUs.
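* Interrupts are disabled (wrteei 0) for the duration of the walk over every TLB0 way and entry, so the MAS scratch registers cannot be clobbered part-way through.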
*/ static void tid_flush(tlbtid_t tid) { register_t msr; uint32_t mas0, mas1, mas2; int entry, way; /* Don't evict kernel translations */ if (tid == TID_KERNEL) return; msr = mfmsr(); __asm __volatile("wrteei 0"); for (way = 0; way < TLB0_WAYS; way++) for (entry = 0; entry < TLB0_ENTRIES_PER_WAY; entry++) { mas0 = MAS0_TLBSEL(0) | MAS0_ESEL(way); mtspr(SPR_MAS0, mas0); __asm __volatile("isync"); mas2 = entry << MAS2_TLB0_ENTRY_IDX_SHIFT; mtspr(SPR_MAS2, mas2); __asm __volatile("isync; tlbre"); mas1 = mfspr(SPR_MAS1); if (!(mas1 & MAS1_VALID)) continue; if (((mas1 & MAS1_TID_MASK) >> MAS1_TID_SHIFT) != tid) continue; mas1 &= ~MAS1_VALID; mtspr(SPR_MAS1, mas1); __asm __volatile("isync; tlbwe; isync; msync"); } mtmsr(msr); } Index: projects/clang390-import/sys/powerpc/conf/MPC85XX =================================================================== --- projects/clang390-import/sys/powerpc/conf/MPC85XX (revision 305686) +++ projects/clang390-import/sys/powerpc/conf/MPC85XX (revision 305687) @@ -1,94 +1,95 @@ # # Custom kernel for Freescale MPC85XX development boards like the CDS etc. # # $FreeBSD$ # cpu BOOKE cpu BOOKE_E500 ident MPC85XX machine powerpc powerpc include "dpaa/config.dpaa" makeoptions DEBUG="-Wa,-me500 -g" makeoptions WERROR="-Werror -Wno-format -Wno-redundant-decls" makeoptions NO_MODULES=yes options FPU_EMU options _KPOSIX_PRIORITY_SCHEDULING options ALT_BREAK_TO_DEBUGGER options BREAK_TO_DEBUGGER options BOOTP options BOOTP_NFSROOT #options BOOTP_NFSV3 options CD9660 options COMPAT_43 options DDB #options DEADLKRES options DEVICE_POLLING #options DIAGNOSTIC options FDT #makeoptions FDT_DTS_FILE=mpc8555cds.dts options FFS options GDB options GEOM_PART_GPT options INET options INET6 options INVARIANTS options INVARIANT_SUPPORT options KDB options KTRACE options MD_ROOT options MPC85XX options MSDOSFS options NFS_ROOT options NFSCL options NFSLOCKD options PROCFS options PSEUDOFS options SCHED_ULE options CAPABILITIES options CAPABILITY_MODE options SMP options SYSVMSG options SYSVSEM options SYSVSHM options WITNESS options WITNESS_SKIPSPIN device ata device bpf device cfi device crypto device cryptodev device da device ds1553 device em device alc device ether device fxp device gpio device iic device iicbus #device isa device loop device md device miibus device pass device pci device quicc device random #device rl device scbus device scc device sec device tsec device tun device uart options USB_DEBUG # enable debug msgs #device uhci +device ehci device umass device usb device vlan Index: projects/clang390-import/sys/riscv/riscv/pmap.c =================================================================== --- projects/clang390-import/sys/riscv/riscv/pmap.c (revision 305686) +++ projects/clang390-import/sys/riscv/riscv/pmap.c (revision 305687) @@ -1,3279 +1,3284 @@ /*- * Copyright (c) 1991 Regents of the University of California. * All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * Copyright (c) 2003 Peter Wemm * All rights reserved. * Copyright (c) 2005-2010 Alan L. Cox * All rights reserved. * Copyright (c) 2014 Andrew Turner * All rights reserved. * Copyright (c) 2014 The FreeBSD Foundation * All rights reserved. * Copyright (c) 2015-2016 Ruslan Bukin * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. 
* * Portions of this software were developed by Andrew Turner under * sponsorship from The FreeBSD Foundation. * * Portions of this software were developed by SRI International and the * University of Cambridge Computer Laboratory under DARPA/AFRL contract * FA8750-10-C-0237 ("CTSRD"), as part of the DARPA CRASH research programme. * * Portions of this software were developed by the University of Cambridge * Computer Laboratory as part of the CTSRD Project, with support from the * UK Higher Education Innovation Fund (HEIF). * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by the University of * California, Berkeley and its contributors. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ /*- * Copyright (c) 2003 Networks Associates Technology, Inc. * All rights reserved. * * This software was developed for the FreeBSD Project by Jake Burkholder, * Safeport Network Services, and Network Associates Laboratories, the * Security Research Division of Network Associates, Inc. under * DARPA/SPAWAR contract N66001-01-C-8035 ("CBOSS"), as part of the DARPA * CHATS research program. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is * also stored by the logical address mapping module, * this module may throw away valid virtual-to-physical * mappings at almost any time. However, invalidations * of virtual-to-physical mappings must be done as * requested. * * In order to cope with hardware architectures which * make virtual-to-physical map invalidates expensive, * this module may delay invalidate or reduced protection * operations until such time as they are actually * necessary. This module is given full information as * to which processors are currently using which maps, * and to when physical maps must be made correct. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define NPDEPG (PAGE_SIZE/(sizeof (pd_entry_t))) #define NUPDE (NPDEPG * NPDEPG) #define NUSERPGTBLS (NUPDE + NPDEPG) #if !defined(DIAGNOSTIC) #ifdef __GNUC_GNU_INLINE__ #define PMAP_INLINE __attribute__((__gnu_inline__)) inline #else #define PMAP_INLINE extern inline #endif #else #define PMAP_INLINE #endif #ifdef PV_STATS #define PV_STAT(x) do { x ; } while (0) #else #define PV_STAT(x) do { } while (0) #endif #define pmap_l2_pindex(v) ((v) >> L2_SHIFT) #define NPV_LIST_LOCKS MAXCPU #define PHYS_TO_PV_LIST_LOCK(pa) \ (&pv_list_locks[pa_index(pa) % NPV_LIST_LOCKS]) #define CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, pa) do { \ struct rwlock **_lockp = (lockp); \ struct rwlock *_new_lock; \ \ _new_lock = PHYS_TO_PV_LIST_LOCK(pa); \ if (_new_lock != *_lockp) { \ if (*_lockp != NULL) \ rw_wunlock(*_lockp); \ *_lockp = _new_lock; \ rw_wlock(*_lockp); \ } \ } while (0) #define CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m) \ CHANGE_PV_LIST_LOCK_TO_PHYS(lockp, VM_PAGE_TO_PHYS(m)) #define RELEASE_PV_LIST_LOCK(lockp) do { \ struct rwlock **_lockp = (lockp); \ \ if (*_lockp != NULL) { \ rw_wunlock(*_lockp); \ *_lockp = NULL; \ } \ } while (0) #define VM_PAGE_TO_PV_LIST_LOCK(m) \ PHYS_TO_PV_LIST_LOCK(VM_PAGE_TO_PHYS(m)) /* The list of all the user pmaps */ LIST_HEAD(pmaplist, pmap); static struct pmaplist allpmaps; static MALLOC_DEFINE(M_VMPMAP, "pmap", "PMAP L1"); struct pmap kernel_pmap_store; vm_offset_t virtual_avail; /* VA of first avail page (after kernel bss) */ vm_offset_t virtual_end; /* VA of last avail page (end of kernel AS) */ vm_offset_t kernel_vm_end = 0; struct msgbuf *msgbufp = NULL; vm_paddr_t dmap_phys_base; /* The start of the dmap region */ vm_paddr_t dmap_phys_max; /* The limit of the dmap region */ vm_offset_t dmap_max_addr; /* The virtual address limit of the dmap */ /* This code assumes all L1 DMAP entries will be used */ CTASSERT((DMAP_MIN_ADDRESS & ~L1_OFFSET) == DMAP_MIN_ADDRESS); CTASSERT((DMAP_MAX_ADDRESS & ~L1_OFFSET) == DMAP_MAX_ADDRESS); static struct rwlock_padalign 
pvh_global_lock; /* * Data for the pv entry allocation mechanism */ static TAILQ_HEAD(pch, pv_chunk) pv_chunks = TAILQ_HEAD_INITIALIZER(pv_chunks); static struct mtx pv_chunks_mutex; static struct rwlock pv_list_locks[NPV_LIST_LOCKS]; static void free_pv_chunk(struct pv_chunk *pc); static void free_pv_entry(pmap_t pmap, pv_entry_t pv); static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp); static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp); static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va); static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp); static int pmap_remove_l3(pmap_t pmap, pt_entry_t *l3, vm_offset_t sva, pd_entry_t ptepde, struct spglist *free, struct rwlock **lockp); static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp); static vm_page_t _pmap_alloc_l3(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock **lockp); static void _pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free); static int pmap_unuse_l3(pmap_t, vm_offset_t, pd_entry_t, struct spglist *); /* * These load the old table data and store the new value. * They need to be atomic as the System MMU may write to the table at * the same time as the CPU. */ #define pmap_load_store(table, entry) atomic_swap_64(table, entry) #define pmap_set(table, mask) atomic_set_64(table, mask) #define pmap_load_clear(table) atomic_swap_64(table, 0) #define pmap_load(table) (*table) /********************/ /* Inline functions */ /********************/ static __inline void pagecopy(void *s, void *d) { memcpy(d, s, PAGE_SIZE); } static __inline void pagezero(void *p) { bzero(p, PAGE_SIZE); } #define pmap_l1_index(va) (((va) >> L1_SHIFT) & Ln_ADDR_MASK) #define pmap_l2_index(va) (((va) >> L2_SHIFT) & Ln_ADDR_MASK) #define pmap_l3_index(va) (((va) >> L3_SHIFT) & Ln_ADDR_MASK) #define PTE_TO_PHYS(pte) ((pte >> PTE_PPN0_S) * PAGE_SIZE) static __inline pd_entry_t * pmap_l1(pmap_t pmap, vm_offset_t va) { return (&pmap->pm_l1[pmap_l1_index(va)]); } static __inline pd_entry_t * pmap_l1_to_l2(pd_entry_t *l1, vm_offset_t va) { vm_paddr_t phys; pd_entry_t *l2; phys = PTE_TO_PHYS(pmap_load(l1)); l2 = (pd_entry_t *)PHYS_TO_DMAP(phys); return (&l2[pmap_l2_index(va)]); } static __inline pd_entry_t * pmap_l2(pmap_t pmap, vm_offset_t va) { pd_entry_t *l1; l1 = pmap_l1(pmap, va); if (l1 == NULL) return (NULL); if ((pmap_load(l1) & PTE_V) == 0) return (NULL); if ((pmap_load(l1) & PTE_RX) != 0) return (NULL); return (pmap_l1_to_l2(l1, va)); } static __inline pt_entry_t * pmap_l2_to_l3(pd_entry_t *l2, vm_offset_t va) { vm_paddr_t phys; pt_entry_t *l3; phys = PTE_TO_PHYS(pmap_load(l2)); l3 = (pd_entry_t *)PHYS_TO_DMAP(phys); return (&l3[pmap_l3_index(va)]); } static __inline pt_entry_t * pmap_l3(pmap_t pmap, vm_offset_t va) { pd_entry_t *l2; l2 = pmap_l2(pmap, va); if (l2 == NULL) return (NULL); if ((pmap_load(l2) & PTE_V) == 0) return (NULL); if ((pmap_load(l2) & PTE_RX) != 0) return (NULL); return (pmap_l2_to_l3(l2, va)); } static __inline int pmap_is_write(pt_entry_t entry) { return (entry & PTE_W); } static __inline int pmap_is_current(pmap_t pmap) { return ((pmap == pmap_kernel()) || (pmap == curthread->td_proc->p_vmspace->vm_map.pmap)); } static __inline int pmap_l3_valid(pt_entry_t l3) { return (l3 & PTE_V); } static __inline int 
pmap_l3_valid_cacheable(pt_entry_t l3) { /* TODO */ return (0); } #define PTE_SYNC(pte) cpu_dcache_wb_range((vm_offset_t)pte, sizeof(*pte)) /* Checks if the page is dirty. */ static inline int pmap_page_dirty(pt_entry_t pte) { return (pte & PTE_D); } static __inline void pmap_resident_count_inc(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); pmap->pm_stats.resident_count += count; } static __inline void pmap_resident_count_dec(pmap_t pmap, int count) { PMAP_LOCK_ASSERT(pmap, MA_OWNED); KASSERT(pmap->pm_stats.resident_count >= count, ("pmap %p resident count underflow %ld %d", pmap, pmap->pm_stats.resident_count, count)); pmap->pm_stats.resident_count -= count; } static void pmap_distribute_l1(struct pmap *pmap, vm_pindex_t l1index, pt_entry_t entry) { struct pmap *user_pmap; pd_entry_t *l1; /* Distribute new kernel L1 entry to all the user pmaps */ if (pmap != kernel_pmap) return; LIST_FOREACH(user_pmap, &allpmaps, pm_list) { l1 = &user_pmap->pm_l1[l1index]; if (entry) pmap_load_store(l1, entry); else pmap_load_clear(l1); } } static pt_entry_t * pmap_early_page_idx(vm_offset_t l1pt, vm_offset_t va, u_int *l1_slot, u_int *l2_slot) { pt_entry_t *l2; pd_entry_t *l1; l1 = (pd_entry_t *)l1pt; *l1_slot = (va >> L1_SHIFT) & Ln_ADDR_MASK; /* Check locore has used a table L1 map */ KASSERT((l1[*l1_slot] & PTE_RX) == 0, ("Invalid bootstrap L1 table")); /* Find the address of the L2 table */ l2 = (pt_entry_t *)init_pt_va; *l2_slot = pmap_l2_index(va); return (l2); } static vm_paddr_t pmap_early_vtophys(vm_offset_t l1pt, vm_offset_t va) { u_int l1_slot, l2_slot; pt_entry_t *l2; u_int ret; l2 = pmap_early_page_idx(l1pt, va, &l1_slot, &l2_slot); /* Check locore has used L2 superpages */ KASSERT((l2[l2_slot] & PTE_RX) != 0, ("Invalid bootstrap L2 table")); /* L2 is superpages */ ret = (l2[l2_slot] >> PTE_PPN1_S) << L2_SHIFT; ret += (va & L2_OFFSET); return (ret); } static void pmap_bootstrap_dmap(vm_offset_t kern_l1, vm_paddr_t min_pa, vm_paddr_t max_pa) { vm_offset_t va; vm_paddr_t pa; pd_entry_t *l1; u_int l1_slot; pt_entry_t entry; pn_t pn; pa = dmap_phys_base = min_pa & ~L1_OFFSET; va = DMAP_MIN_ADDRESS; l1 = (pd_entry_t *)kern_l1; l1_slot = pmap_l1_index(DMAP_MIN_ADDRESS); for (; va < DMAP_MAX_ADDRESS && pa < max_pa; pa += L1_SIZE, va += L1_SIZE, l1_slot++) { KASSERT(l1_slot < Ln_ENTRIES, ("Invalid L1 index")); /* superpages */ pn = (pa / PAGE_SIZE); entry = (PTE_V | PTE_RWX); entry |= (pn << PTE_PPN0_S); pmap_load_store(&l1[l1_slot], entry); } /* Set the upper limit of the DMAP region */ dmap_phys_max = pa; dmap_max_addr = va; cpu_dcache_wb_range((vm_offset_t)l1, PAGE_SIZE); cpu_tlb_flushID(); } static vm_offset_t pmap_bootstrap_l3(vm_offset_t l1pt, vm_offset_t va, vm_offset_t l3_start) { vm_offset_t l2pt, l3pt; pt_entry_t entry; pd_entry_t *l2; vm_paddr_t pa; u_int l2_slot; pn_t pn; KASSERT((va & L2_OFFSET) == 0, ("Invalid virtual address")); l2 = pmap_l2(kernel_pmap, va); l2 = (pd_entry_t *)((uintptr_t)l2 & ~(PAGE_SIZE - 1)); l2pt = (vm_offset_t)l2; l2_slot = pmap_l2_index(va); l3pt = l3_start; for (; va < VM_MAX_KERNEL_ADDRESS; l2_slot++, va += L2_SIZE) { KASSERT(l2_slot < Ln_ENTRIES, ("Invalid L2 index")); pa = pmap_early_vtophys(l1pt, l3pt); pn = (pa / PAGE_SIZE); entry = (PTE_V); entry |= (pn << PTE_PPN0_S); pmap_load_store(&l2[l2_slot], entry); l3pt += PAGE_SIZE; } /* Clean the L2 page table */ memset((void *)l3_start, 0, l3pt - l3_start); cpu_dcache_wb_range(l3_start, l3pt - l3_start); cpu_dcache_wb_range((vm_offset_t)l2, PAGE_SIZE); return (l3pt); } /* * Bootstrap the system 
enough to run with virtual memory. */ void pmap_bootstrap(vm_offset_t l1pt, vm_paddr_t kernstart, vm_size_t kernlen) { u_int l1_slot, l2_slot, avail_slot, map_slot, used_map_slot; uint64_t kern_delta; pt_entry_t *l2; vm_offset_t va, freemempos; vm_offset_t dpcpu, msgbufpv; vm_paddr_t pa, min_pa, max_pa; int i; kern_delta = KERNBASE - kernstart; physmem = 0; printf("pmap_bootstrap %lx %lx %lx\n", l1pt, kernstart, kernlen); printf("%lx\n", l1pt); printf("%lx\n", (KERNBASE >> L1_SHIFT) & Ln_ADDR_MASK); /* Set this early so we can use the pagetable walking functions */ kernel_pmap_store.pm_l1 = (pd_entry_t *)l1pt; PMAP_LOCK_INIT(kernel_pmap); /* * Initialize the global pv list lock. */ rw_init(&pvh_global_lock, "pmap pv global"); LIST_INIT(&allpmaps); /* Assume the address we were loaded to is a valid physical address */ min_pa = max_pa = KERNBASE - kern_delta; /* * Find the minimum physical address. physmap is sorted, * but may contain empty ranges. */ for (i = 0; i < (physmap_idx * 2); i += 2) { if (physmap[i] == physmap[i + 1]) continue; if (physmap[i] <= min_pa) min_pa = physmap[i]; if (physmap[i + 1] > max_pa) max_pa = physmap[i + 1]; break; } /* Create a direct map region early so we can use it for pa -> va */ pmap_bootstrap_dmap(l1pt, min_pa, max_pa); va = KERNBASE; pa = KERNBASE - kern_delta; /* * Start to initialize phys_avail by copying from physmap * up to the physical address KERNBASE points at. */ map_slot = avail_slot = 0; for (; map_slot < (physmap_idx * 2); map_slot += 2) { if (physmap[map_slot] == physmap[map_slot + 1]) continue; if (physmap[map_slot] <= pa && physmap[map_slot + 1] > pa) break; phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } /* Add the memory before the kernel */ if (physmap[avail_slot] < pa) { phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = pa; physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } used_map_slot = map_slot; /* * Read the page table to find out what is already mapped. * This assumes we have mapped a block of memory from KERNBASE * using a single L1 entry. */ l2 = pmap_early_page_idx(l1pt, KERNBASE, &l1_slot, &l2_slot); /* Sanity check the index, KERNBASE should be the first VA */ KASSERT(l2_slot == 0, ("The L2 index is non-zero")); /* Find how many pages we have mapped */ for (; l2_slot < Ln_ENTRIES; l2_slot++) { if ((l2[l2_slot] & PTE_V) == 0) break; /* Check locore used L2 superpages */ KASSERT((l2[l2_slot] & PTE_RX) != 0, ("Invalid bootstrap L2 table")); va += L2_SIZE; pa += L2_SIZE; } va = roundup2(va, L2_SIZE); freemempos = KERNBASE + kernlen; freemempos = roundup2(freemempos, PAGE_SIZE); /* Create the l3 tables for the early devmap */ freemempos = pmap_bootstrap_l3(l1pt, VM_MAX_KERNEL_ADDRESS - L2_SIZE, freemempos); cpu_tlb_flushID(); #define alloc_pages(var, np) \ (var) = freemempos; \ freemempos += (np * PAGE_SIZE); \ memset((char *)(var), 0, ((np) * PAGE_SIZE)); /* Allocate dynamic per-cpu area. */ alloc_pages(dpcpu, DPCPU_SIZE / PAGE_SIZE); dpcpu_init((void *)dpcpu, 0); /* Allocate memory for the msgbuf, e.g. 
for /sbin/dmesg */ alloc_pages(msgbufpv, round_page(msgbufsize) / PAGE_SIZE); msgbufp = (void *)msgbufpv; virtual_avail = roundup2(freemempos, L2_SIZE); virtual_end = VM_MAX_KERNEL_ADDRESS - L2_SIZE; kernel_vm_end = virtual_avail; pa = pmap_early_vtophys(l1pt, freemempos); /* Finish initialising physmap */ map_slot = used_map_slot; for (; avail_slot < (PHYS_AVAIL_SIZE - 2) && map_slot < (physmap_idx * 2); map_slot += 2) { if (physmap[map_slot] == physmap[map_slot + 1]) { continue; } /* Have we used the current range? */ if (physmap[map_slot + 1] <= pa) { continue; } /* Do we need to split the entry? */ if (physmap[map_slot] < pa) { phys_avail[avail_slot] = pa; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; } else { phys_avail[avail_slot] = physmap[map_slot]; phys_avail[avail_slot + 1] = physmap[map_slot + 1]; } physmem += (phys_avail[avail_slot + 1] - phys_avail[avail_slot]) >> PAGE_SHIFT; avail_slot += 2; } phys_avail[avail_slot] = 0; phys_avail[avail_slot + 1] = 0; /* * Maxmem isn't the "maximum memory", it's one larger than the * highest page of the physical address space. It should be * called something like "Maxphyspage". */ Maxmem = atop(phys_avail[avail_slot - 1]); cpu_tlb_flushID(); } /* * Initialize a vm_page's machine-dependent fields. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.pv_list); m->md.pv_memattr = VM_MEMATTR_WRITE_BACK; } /* * Initialize the pmap module. * Called by vm_init, to initialize any structures that the pmap * system needs to map virtual memory. */ void pmap_init(void) { int i; /* * Initialize the pv chunk list mutex. */ mtx_init(&pv_chunks_mutex, "pmap pv chunk list", NULL, MTX_DEF); /* * Initialize the pool of pv list locks. */ for (i = 0; i < NPV_LIST_LOCKS; i++) rw_init(&pv_list_locks[i], "pmap pv list"); } /* * Normal, non-SMP, invalidation functions. * We inline these within pmap.c for speed. */ PMAP_INLINE void pmap_invalidate_page(pmap_t pmap, vm_offset_t va) { /* TODO */ sched_pin(); __asm __volatile("sfence.vm"); sched_unpin(); } PMAP_INLINE void pmap_invalidate_range(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { /* TODO */ sched_pin(); __asm __volatile("sfence.vm"); sched_unpin(); } PMAP_INLINE void pmap_invalidate_all(pmap_t pmap) { /* TODO */ sched_pin(); __asm __volatile("sfence.vm"); sched_unpin(); } /* * Routine: pmap_extract * Function: * Extract the physical page address associated * with the given map/virtual_address pair. */ vm_paddr_t pmap_extract(pmap_t pmap, vm_offset_t va) { pd_entry_t *l2p, l2; pt_entry_t *l3p, l3; vm_paddr_t pa; pa = 0; PMAP_LOCK(pmap); /* * Start with the l2 table. We are unable to allocate * pages in the l1 table. */ l2p = pmap_l2(pmap, va); if (l2p != NULL) { l2 = pmap_load(l2p); if ((l2 & PTE_RX) == 0) { l3p = pmap_l2_to_l3(l2p, va); if (l3p != NULL) { l3 = pmap_load(l3p); pa = PTE_TO_PHYS(l3); pa |= (va & L3_OFFSET); } } else { /* L2 is superpages */ pa = (l2 >> PTE_PPN1_S) << L2_SHIFT; pa |= (va & L2_OFFSET); } } PMAP_UNLOCK(pmap); return (pa); } /* * Routine: pmap_extract_and_hold * Function: * Atomically extract and hold the physical page * with the given pmap and virtual address pair * if that mapping permits the given protection.
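* If vm_page_pa_tryrelock() must drop the pmap lock to acquire the page lock, the lookup is retried from the top.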
*/ vm_page_t pmap_extract_and_hold(pmap_t pmap, vm_offset_t va, vm_prot_t prot) { pt_entry_t *l3p, l3; vm_paddr_t phys; vm_paddr_t pa; vm_page_t m; pa = 0; m = NULL; PMAP_LOCK(pmap); retry: l3p = pmap_l3(pmap, va); if (l3p != NULL && (l3 = pmap_load(l3p)) != 0) { if ((pmap_is_write(l3)) || ((prot & VM_PROT_WRITE) == 0)) { phys = PTE_TO_PHYS(l3); if (vm_page_pa_tryrelock(pmap, phys, &pa)) goto retry; m = PHYS_TO_VM_PAGE(phys); vm_page_hold(m); } } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pmap); return (m); } vm_paddr_t pmap_kextract(vm_offset_t va) { pd_entry_t *l2; pt_entry_t *l3; vm_paddr_t pa; if (va >= DMAP_MIN_ADDRESS && va < DMAP_MAX_ADDRESS) { pa = DMAP_TO_PHYS(va); } else { l2 = pmap_l2(kernel_pmap, va); if (l2 == NULL) panic("pmap_kextract: No l2"); if ((pmap_load(l2) & PTE_RX) != 0) { /* superpages */ pa = (pmap_load(l2) >> PTE_PPN1_S) << L2_SHIFT; pa |= (va & L2_OFFSET); return (pa); } l3 = pmap_l2_to_l3(l2, va); if (l3 == NULL) panic("pmap_kextract: No l3..."); pa = PTE_TO_PHYS(pmap_load(l3)); pa |= (va & PAGE_MASK); } return (pa); } /*************************************************** * Low level mapping routines..... ***************************************************/ void pmap_kenter_device(vm_offset_t sva, vm_size_t size, vm_paddr_t pa) { pt_entry_t entry; pt_entry_t *l3; vm_offset_t va; pn_t pn; KASSERT((pa & L3_OFFSET) == 0, ("pmap_kenter_device: Invalid physical address")); KASSERT((sva & L3_OFFSET) == 0, ("pmap_kenter_device: Invalid virtual address")); KASSERT((size & PAGE_MASK) == 0, ("pmap_kenter_device: Mapping is not page-sized")); va = sva; while (size != 0) { l3 = pmap_l3(kernel_pmap, va); KASSERT(l3 != NULL, ("Invalid page table, va: 0x%lx", va)); pn = (pa / PAGE_SIZE); entry = (PTE_V | PTE_RWX); entry |= (pn << PTE_PPN0_S); pmap_load_store(l3, entry); PTE_SYNC(l3); va += PAGE_SIZE; pa += PAGE_SIZE; size -= PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /* * Remove a page from the kernel pagetables. * Note: not SMP coherent. */ PMAP_INLINE void pmap_kremove(vm_offset_t va) { pt_entry_t *l3; l3 = pmap_l3(kernel_pmap, va); KASSERT(l3 != NULL, ("pmap_kremove: Invalid address")); if (pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(va, L3_SIZE); pmap_load_clear(l3); PTE_SYNC(l3); pmap_invalidate_page(kernel_pmap, va); } void pmap_kremove_device(vm_offset_t sva, vm_size_t size) { pt_entry_t *l3; vm_offset_t va; KASSERT((sva & L3_OFFSET) == 0, ("pmap_kremove_device: Invalid virtual address")); KASSERT((size & PAGE_MASK) == 0, ("pmap_kremove_device: Mapping is not page-sized")); va = sva; while (size != 0) { l3 = pmap_l3(kernel_pmap, va); KASSERT(l3 != NULL, ("Invalid page table, va: 0x%lx", va)); pmap_load_clear(l3); PTE_SYNC(l3); va += PAGE_SIZE; size -= PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /* * Used to map a range of physical addresses into kernel * virtual address space. * * The value passed in '*virt' is a suggested virtual address for * the mapping. Architectures which can support a direct-mapped * physical to virtual region can return the appropriate address * within that region, leaving '*virt' unchanged. Other * architectures should map the pages starting at '*virt' and * update '*virt' with the first usable address after the mapped * region. 
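* On RISC-V the whole range is covered by the direct map, so the DMAP address of 'start' is returned and '*virt' is left unchanged.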
*/ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { return PHYS_TO_DMAP(start); } /* * Add a list of wired pages to the kva * this routine is only used for temporary * kernel mappings that do not need to have * page modification or references recorded. * Note that old mappings are simply written * over. The page *must* be wired. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qenter(vm_offset_t sva, vm_page_t *ma, int count) { pt_entry_t *l3, pa; vm_offset_t va; vm_page_t m; pt_entry_t entry; pn_t pn; int i; va = sva; for (i = 0; i < count; i++) { m = ma[i]; pa = VM_PAGE_TO_PHYS(m); pn = (pa / PAGE_SIZE); l3 = pmap_l3(kernel_pmap, va); entry = (PTE_V | PTE_RWX); entry |= (pn << PTE_PPN0_S); pmap_load_store(l3, entry); PTE_SYNC(l3); va += L3_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /* * This routine tears out page mappings from the * kernel -- it is meant only for temporary mappings. * Note: SMP coherent. Uses a ranged shootdown IPI. */ void pmap_qremove(vm_offset_t sva, int count) { pt_entry_t *l3; vm_offset_t va; KASSERT(sva >= VM_MIN_KERNEL_ADDRESS, ("usermode va %lx", sva)); va = sva; while (count-- > 0) { l3 = pmap_l3(kernel_pmap, va); KASSERT(l3 != NULL, ("pmap_kremove: Invalid address")); if (pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(va, L3_SIZE); pmap_load_clear(l3); PTE_SYNC(l3); va += PAGE_SIZE; } pmap_invalidate_range(kernel_pmap, sva, va); } /*************************************************** * Page table page management routines..... ***************************************************/ static __inline void pmap_free_zero_pages(struct spglist *free) { vm_page_t m; while ((m = SLIST_FIRST(free)) != NULL) { SLIST_REMOVE_HEAD(free, plinks.s.ss); /* Preserve the page's PG_ZERO setting. */ vm_page_free_toq(m); } } /* * Schedule the specified unused page table page to be freed. Specifically, * add the page to the specified list of pages that will be released to the * physical memory manager after the TLB has been updated. */ static __inline void pmap_add_delayed_free_list(vm_page_t m, struct spglist *free, boolean_t set_PG_ZERO) { if (set_PG_ZERO) m->flags |= PG_ZERO; else m->flags &= ~PG_ZERO; SLIST_INSERT_HEAD(free, m, plinks.s.ss); } /* * Decrements a page table page's wire count, which is used to record the * number of valid page table entries within the page. If the wire count * drops to zero, then the page table page is unmapped. Returns TRUE if the * page table page was unmapped and FALSE otherwise. 
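* (The wire count mirrors the number of valid entries within the page, so the final unwire also unhooks the page from its parent table; see _pmap_unwire_l3().)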
*/ static inline boolean_t pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { --m->wire_count; if (m->wire_count == 0) { _pmap_unwire_l3(pmap, va, m, free); return (TRUE); } else { return (FALSE); } } static void _pmap_unwire_l3(pmap_t pmap, vm_offset_t va, vm_page_t m, struct spglist *free) { vm_paddr_t phys; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * unmap the page table page */ if (m->pindex >= NUPDE) { /* PD page */ pd_entry_t *l1; l1 = pmap_l1(pmap, va); pmap_load_clear(l1); pmap_distribute_l1(pmap, pmap_l1_index(va), 0); PTE_SYNC(l1); } else { /* PTE page */ pd_entry_t *l2; l2 = pmap_l2(pmap, va); pmap_load_clear(l2); PTE_SYNC(l2); } pmap_resident_count_dec(pmap, 1); if (m->pindex < NUPDE) { pd_entry_t *l1; /* We just released a PT, unhold the matching PD */ vm_page_t pdpg; l1 = pmap_l1(pmap, va); phys = PTE_TO_PHYS(pmap_load(l1)); pdpg = PHYS_TO_VM_PAGE(phys); pmap_unwire_l3(pmap, va, pdpg, free); } pmap_invalidate_page(pmap, va); /* * This is a release store so that the ordinary store unmapping * the page table page is globally performed before TLB shoot- * down is begun. */ atomic_subtract_rel_int(&vm_cnt.v_wire_count, 1); /* * Put page on a list so that it is released after * *ALL* TLB shootdown is done */ pmap_add_delayed_free_list(m, free, TRUE); } /* * After removing an l3 entry, this routine is used to * conditionally free the page, and manage the hold/wire counts. */ static int pmap_unuse_l3(pmap_t pmap, vm_offset_t va, pd_entry_t ptepde, struct spglist *free) { vm_paddr_t phys; vm_page_t mpte; if (va >= VM_MAXUSER_ADDRESS) return (0); KASSERT(ptepde != 0, ("pmap_unuse_pt: ptepde != 0")); phys = PTE_TO_PHYS(ptepde); mpte = PHYS_TO_VM_PAGE(phys); return (pmap_unwire_l3(pmap, va, mpte, free)); } void pmap_pinit0(pmap_t pmap) { PMAP_LOCK_INIT(pmap); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); pmap->pm_l1 = kernel_pmap->pm_l1; } int pmap_pinit(pmap_t pmap) { vm_paddr_t l1phys; vm_page_t l1pt; /* * allocate the l1 page */ while ((l1pt = vm_page_alloc(NULL, 0xdeadbeef, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) VM_WAIT; l1phys = VM_PAGE_TO_PHYS(l1pt); pmap->pm_l1 = (pd_entry_t *)PHYS_TO_DMAP(l1phys); if ((l1pt->flags & PG_ZERO) == 0) pagezero(pmap->pm_l1); bzero(&pmap->pm_stats, sizeof(pmap->pm_stats)); /* Install kernel pagetables */ memcpy(pmap->pm_l1, kernel_pmap->pm_l1, PAGE_SIZE); /* Add to the list of all user pmaps */ LIST_INSERT_HEAD(&allpmaps, pmap, pm_list); return (1); } /* * This routine is called if the desired page table page does not exist. * * If page table page allocation fails, this routine may sleep before * returning NULL. It sleeps only if a lock pointer was given. * * Note: If a page allocation fails at page table level two or three, * one or two pages may be held during the wait, only to be released * afterwards. This conservative approach is easily argued to avoid * race conditions. */ static vm_page_t _pmap_alloc_l3(pmap_t pmap, vm_pindex_t ptepindex, struct rwlock **lockp) { vm_page_t m, /*pdppg, */pdpg; pt_entry_t entry; vm_paddr_t phys; pn_t pn; PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* * Allocate a page table page. */ if ((m = vm_page_alloc(NULL, ptepindex, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO)) == NULL) { if (lockp != NULL) { RELEASE_PV_LIST_LOCK(lockp); PMAP_UNLOCK(pmap); rw_runlock(&pvh_global_lock); VM_WAIT; rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); } /* * Indicate the need to retry. While waiting, the page table * page may have been allocated. 
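* (by another thread, in which case the caller's retry will find it and simply bump its wire count.)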
*/ return (NULL); } if ((m->flags & PG_ZERO) == 0) pmap_zero_page(m); /* * Map the pagetable page into the process address space, if * it isn't already there. */ if (ptepindex >= NUPDE) { pd_entry_t *l1; vm_pindex_t l1index; l1index = ptepindex - NUPDE; l1 = &pmap->pm_l1[l1index]; pn = (VM_PAGE_TO_PHYS(m) / PAGE_SIZE); entry = (PTE_V); entry |= (pn << PTE_PPN0_S); pmap_load_store(l1, entry); pmap_distribute_l1(pmap, l1index, entry); PTE_SYNC(l1); } else { vm_pindex_t l1index; pd_entry_t *l1, *l2; l1index = ptepindex >> (L1_SHIFT - L2_SHIFT); l1 = &pmap->pm_l1[l1index]; if (pmap_load(l1) == 0) { /* recurse for allocating page dir */ if (_pmap_alloc_l3(pmap, NUPDE + l1index, lockp) == NULL) { --m->wire_count; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); return (NULL); } } else { phys = PTE_TO_PHYS(pmap_load(l1)); pdpg = PHYS_TO_VM_PAGE(phys); pdpg->wire_count++; } phys = PTE_TO_PHYS(pmap_load(l1)); l2 = (pd_entry_t *)PHYS_TO_DMAP(phys); l2 = &l2[ptepindex & Ln_ADDR_MASK]; pn = (VM_PAGE_TO_PHYS(m) / PAGE_SIZE); entry = (PTE_V); entry |= (pn << PTE_PPN0_S); pmap_load_store(l2, entry); PTE_SYNC(l2); } pmap_resident_count_inc(pmap, 1); return (m); } static vm_page_t pmap_alloc_l3(pmap_t pmap, vm_offset_t va, struct rwlock **lockp) { vm_pindex_t ptepindex; pd_entry_t *l2; vm_paddr_t phys; vm_page_t m; /* * Calculate pagetable page index */ ptepindex = pmap_l2_pindex(va); retry: /* * Get the page directory entry */ l2 = pmap_l2(pmap, va); /* * If the page table page is mapped, we just increment the * hold count, and activate it. */ if (l2 != NULL && pmap_load(l2) != 0) { phys = PTE_TO_PHYS(pmap_load(l2)); m = PHYS_TO_VM_PAGE(phys); m->wire_count++; } else { /* * Here if the pte page isn't mapped, or if it has been * deallocated. */ m = _pmap_alloc_l3(pmap, ptepindex, lockp); if (m == NULL && lockp != NULL) goto retry; } return (m); } /*************************************************** * Pmap allocation/deallocation routines. ***************************************************/ /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. * Should only be called if the map contains no valid mappings. 
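 *
 * For illustration (editor's sketch; the guard macro and helper name are
 * hypothetical): the expected teardown order is to destroy the managed
 * mappings first, which drops resident_count to zero, and only then
 * release the pmap itself.
 */
#ifdef PMAP_USAGE_SKETCH
static void
example_pmap_teardown(pmap_t pmap)
{

	/* Remove all managed, non-wired user mappings. */
	pmap_remove_pages(pmap);
	/* Now the KASSERT on resident_count in pmap_release() holds. */
	pmap_release(pmap);
}
#endif
/*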
*/ void pmap_release(pmap_t pmap) { vm_page_t m; KASSERT(pmap->pm_stats.resident_count == 0, ("pmap_release: pmap resident count %ld != 0", pmap->pm_stats.resident_count)); m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pmap->pm_l1)); m->wire_count--; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); /* Remove pmap from the allpmaps list */ LIST_REMOVE(pmap, pm_list); /* Remove kernel pagetables */ bzero(pmap->pm_l1, PAGE_SIZE); } #if 0 static int kvm_size(SYSCTL_HANDLER_ARGS) { unsigned long ksize = VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS; return sysctl_handle_long(oidp, &ksize, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_size, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_size, "LU", "Size of KVM"); static int kvm_free(SYSCTL_HANDLER_ARGS) { unsigned long kfree = VM_MAX_KERNEL_ADDRESS - kernel_vm_end; return sysctl_handle_long(oidp, &kfree, 0, req); } SYSCTL_PROC(_vm, OID_AUTO, kvm_free, CTLTYPE_LONG|CTLFLAG_RD, 0, 0, kvm_free, "LU", "Amount of KVM free"); #endif /* 0 */ /* * grow the number of kernel page table entries, if needed */ void pmap_growkernel(vm_offset_t addr) { vm_paddr_t paddr; vm_page_t nkpg; pd_entry_t *l1, *l2; pt_entry_t entry; pn_t pn; mtx_assert(&kernel_map->system_mtx, MA_OWNED); addr = roundup2(addr, L2_SIZE); if (addr - 1 >= kernel_map->max_offset) addr = kernel_map->max_offset; while (kernel_vm_end < addr) { l1 = pmap_l1(kernel_pmap, kernel_vm_end); if (pmap_load(l1) == 0) { /* We need a new PDP entry */ nkpg = vm_page_alloc(NULL, kernel_vm_end >> L1_SHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) pmap_zero_page(nkpg); paddr = VM_PAGE_TO_PHYS(nkpg); pn = (paddr / PAGE_SIZE); entry = (PTE_V); entry |= (pn << PTE_PPN0_S); pmap_load_store(l1, entry); pmap_distribute_l1(kernel_pmap, pmap_l1_index(kernel_vm_end), entry); PTE_SYNC(l1); continue; /* try again */ } l2 = pmap_l1_to_l2(l1, kernel_vm_end); if ((pmap_load(l2) & PTE_A) != 0) { kernel_vm_end = (kernel_vm_end + L2_SIZE) & ~L2_OFFSET; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } continue; } nkpg = vm_page_alloc(NULL, kernel_vm_end >> L2_SHIFT, VM_ALLOC_INTERRUPT | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (nkpg == NULL) panic("pmap_growkernel: no memory to grow kernel"); if ((nkpg->flags & PG_ZERO) == 0) { pmap_zero_page(nkpg); } paddr = VM_PAGE_TO_PHYS(nkpg); pn = (paddr / PAGE_SIZE); entry = (PTE_V); entry |= (pn << PTE_PPN0_S); pmap_load_store(l2, entry); PTE_SYNC(l2); pmap_invalidate_page(kernel_pmap, kernel_vm_end); kernel_vm_end = (kernel_vm_end + L2_SIZE) & ~L2_OFFSET; if (kernel_vm_end - 1 >= kernel_map->max_offset) { kernel_vm_end = kernel_map->max_offset; break; } } } /*************************************************** * page management routines. 
***************************************************/ CTASSERT(sizeof(struct pv_chunk) == PAGE_SIZE); CTASSERT(_NPCM == 3); CTASSERT(_NPCPV == 168); static __inline struct pv_chunk * pv_to_chunk(pv_entry_t pv) { return ((struct pv_chunk *)((uintptr_t)pv & ~(uintptr_t)PAGE_MASK)); } #define PV_PMAP(pv) (pv_to_chunk(pv)->pc_pmap) #define PC_FREE0 0xfffffffffffffffful #define PC_FREE1 0xfffffffffffffffful #define PC_FREE2 0x000000fffffffffful static const uint64_t pc_freemask[_NPCM] = { PC_FREE0, PC_FREE1, PC_FREE2 }; #if 0 #ifdef PV_STATS static int pc_chunk_count, pc_chunk_allocs, pc_chunk_frees, pc_chunk_tryfail; SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_count, CTLFLAG_RD, &pc_chunk_count, 0, "Current number of pv entry chunks"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_allocs, CTLFLAG_RD, &pc_chunk_allocs, 0, "Current number of pv entry chunks allocated"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_frees, CTLFLAG_RD, &pc_chunk_frees, 0, "Current number of pv entry chunks frees"); SYSCTL_INT(_vm_pmap, OID_AUTO, pc_chunk_tryfail, CTLFLAG_RD, &pc_chunk_tryfail, 0, "Number of times tried to get a chunk page but failed."); static long pv_entry_frees, pv_entry_allocs, pv_entry_count; static int pv_entry_spare; SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_frees, CTLFLAG_RD, &pv_entry_frees, 0, "Current number of pv entry frees"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_allocs, CTLFLAG_RD, &pv_entry_allocs, 0, "Current number of pv entry allocs"); SYSCTL_LONG(_vm_pmap, OID_AUTO, pv_entry_count, CTLFLAG_RD, &pv_entry_count, 0, "Current number of pv entries"); SYSCTL_INT(_vm_pmap, OID_AUTO, pv_entry_spare, CTLFLAG_RD, &pv_entry_spare, 0, "Current number of spare pv entries"); #endif #endif /* 0 */ /* * We are in a serious low memory condition. Resort to * drastic measures to free some pages so we can allocate * another pv entry chunk. * * Returns NULL if PV entries were reclaimed from the specified pmap. * * We do not, however, unmap 2mpages because subsequent accesses will * allocate per-page pv entries until repromotion occurs, thereby * exacerbating the shortage of free pv entries. */ static vm_page_t reclaim_pv_chunk(pmap_t locked_pmap, struct rwlock **lockp) { panic("RISCVTODO: reclaim_pv_chunk"); } /* * free the pv_entry back to the free list */ static void free_pv_entry(pmap_t pmap, pv_entry_t pv) { struct pv_chunk *pc; int idx, field, bit; rw_assert(&pvh_global_lock, RA_LOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_frees, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, 1)); PV_STAT(atomic_subtract_long(&pv_entry_count, 1)); pc = pv_to_chunk(pv); idx = pv - &pc->pc_pventry[0]; field = idx / 64; bit = idx % 64; pc->pc_map[field] |= 1ul << bit; if (pc->pc_map[0] != PC_FREE0 || pc->pc_map[1] != PC_FREE1 || pc->pc_map[2] != PC_FREE2) { /* 98% of the time, pc is already at the head of the list. 
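 * Keeping partially-full chunks at the head means the next
 * get_pv_entry() call finds a free slot immediately.  For example, pv
 * entry index 100 within a chunk maps to field 100 / 64 = 1,
 * bit 100 % 64 = 36 of pc_map[] above.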
*/ if (__predict_false(pc != TAILQ_FIRST(&pmap->pm_pvchunk))) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); } return; } TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } static void free_pv_chunk(struct pv_chunk *pc) { vm_page_t m; mtx_lock(&pv_chunks_mutex); TAILQ_REMOVE(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); PV_STAT(atomic_subtract_int(&pv_entry_spare, _NPCPV)); PV_STAT(atomic_subtract_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_frees, 1)); /* entire chunk is free, return it */ m = PHYS_TO_VM_PAGE(DMAP_TO_PHYS((vm_offset_t)pc)); #if 0 /* TODO: For minidump */ dump_drop_page(m->phys_addr); #endif vm_page_unwire(m, PQ_INACTIVE); vm_page_free(m); } /* * Returns a new PV entry, allocating a new PV chunk from the system when * needed. If this PV chunk allocation fails and a PV list lock pointer was * given, a PV chunk is reclaimed from an arbitrary pmap. Otherwise, NULL is * returned. * * The given PV list lock may be released. */ static pv_entry_t get_pv_entry(pmap_t pmap, struct rwlock **lockp) { int bit, field; pv_entry_t pv; struct pv_chunk *pc; vm_page_t m; rw_assert(&pvh_global_lock, RA_LOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); PV_STAT(atomic_add_long(&pv_entry_allocs, 1)); retry: pc = TAILQ_FIRST(&pmap->pm_pvchunk); if (pc != NULL) { for (field = 0; field < _NPCM; field++) { if (pc->pc_map[field]) { bit = ffsl(pc->pc_map[field]) - 1; break; } } if (field < _NPCM) { pv = &pc->pc_pventry[field * 64 + bit]; pc->pc_map[field] &= ~(1ul << bit); /* If this was the last item, move it to tail */ if (pc->pc_map[0] == 0 && pc->pc_map[1] == 0 && pc->pc_map[2] == 0) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); TAILQ_INSERT_TAIL(&pmap->pm_pvchunk, pc, pc_list); } PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_subtract_int(&pv_entry_spare, 1)); return (pv); } } /* No free items, allocate another chunk */ m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED); if (m == NULL) { if (lockp == NULL) { PV_STAT(pc_chunk_tryfail++); return (NULL); } m = reclaim_pv_chunk(pmap, lockp); if (m == NULL) goto retry; } PV_STAT(atomic_add_int(&pc_chunk_count, 1)); PV_STAT(atomic_add_int(&pc_chunk_allocs, 1)); #if 0 /* TODO: This is for minidump */ dump_add_page(m->phys_addr); #endif pc = (void *)PHYS_TO_DMAP(m->phys_addr); pc->pc_pmap = pmap; pc->pc_map[0] = PC_FREE0 & ~1ul; /* preallocated bit 0 */ pc->pc_map[1] = PC_FREE1; pc->pc_map[2] = PC_FREE2; mtx_lock(&pv_chunks_mutex); TAILQ_INSERT_TAIL(&pv_chunks, pc, pc_lru); mtx_unlock(&pv_chunks_mutex); pv = &pc->pc_pventry[0]; TAILQ_INSERT_HEAD(&pmap->pm_pvchunk, pc, pc_list); PV_STAT(atomic_add_long(&pv_entry_count, 1)); PV_STAT(atomic_add_int(&pv_entry_spare, _NPCPV - 1)); return (pv); } /* * First find and then remove the pv entry for the specified pmap and virtual * address from the specified pv list. Returns the pv entry if found and NULL * otherwise. This operation can be performed on pv lists for either 4KB or * 2MB page mappings. */ static __inline pv_entry_t pmap_pvh_remove(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_LOCKED); TAILQ_FOREACH(pv, &pvh->pv_list, pv_next) { if (pmap == PV_PMAP(pv) && va == pv->pv_va) { TAILQ_REMOVE(&pvh->pv_list, pv, pv_next); pvh->pv_gen++; break; } } return (pv); } /* * First find and then destroy the pv entry for the specified pmap and virtual * address. This operation can be performed on pv lists for either 4KB or 2MB * page mappings. 
*/ static void pmap_pvh_free(struct md_page *pvh, pmap_t pmap, vm_offset_t va) { pv_entry_t pv; pv = pmap_pvh_remove(pvh, pmap, va); KASSERT(pv != NULL, ("pmap_pvh_free: pv not found")); free_pv_entry(pmap, pv); } /* * Conditionally create the PV entry for a 4KB page mapping if the required * memory can be allocated without resorting to reclamation. */ static boolean_t pmap_try_insert_pv_entry(pmap_t pmap, vm_offset_t va, vm_page_t m, struct rwlock **lockp) { pv_entry_t pv; rw_assert(&pvh_global_lock, RA_LOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); /* Pass NULL instead of the lock pointer to disable reclamation. */ if ((pv = get_pv_entry(pmap, NULL)) != NULL) { pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; return (TRUE); } else return (FALSE); } /* * pmap_remove_l3: do the things to unmap a page in a process */ static int pmap_remove_l3(pmap_t pmap, pt_entry_t *l3, vm_offset_t va, pd_entry_t l2e, struct spglist *free, struct rwlock **lockp) { pt_entry_t old_l3; vm_paddr_t phys; vm_page_t m; PMAP_LOCK_ASSERT(pmap, MA_OWNED); if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(va, L3_SIZE); old_l3 = pmap_load_clear(l3); PTE_SYNC(l3); pmap_invalidate_page(pmap, va); if (old_l3 & PTE_SW_WIRED) pmap->pm_stats.wired_count -= 1; pmap_resident_count_dec(pmap, 1); if (old_l3 & PTE_SW_MANAGED) { phys = PTE_TO_PHYS(old_l3); m = PHYS_TO_VM_PAGE(phys); if (pmap_page_dirty(old_l3)) vm_page_dirty(m); if (old_l3 & PTE_A) vm_page_aflag_set(m, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_VM_PAGE(lockp, m); pmap_pvh_free(&m->md, pmap, va); } return (pmap_unuse_l3(pmap, va, l2e, free)); } /* * Remove the given range of addresses from the specified map. * * It is assumed that the start and end are properly * rounded to the page size. */ void pmap_remove(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { struct rwlock *lock; vm_offset_t va, va_next; pd_entry_t *l1, *l2; pt_entry_t l3_pte, *l3; struct spglist free; int anyvalid; /* * Perform an unsynchronized read. This is, however, safe. */ if (pmap->pm_stats.resident_count == 0) return; anyvalid = 0; SLIST_INIT(&free); rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); lock = NULL; for (; sva < eva; sva = va_next) { if (pmap->pm_stats.resident_count == 0) break; l1 = pmap_l1(pmap, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } /* * Calculate index for next page table. */ va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (l2 == NULL) continue; l3_pte = pmap_load(l2); /* * Weed out invalid mappings. */ if (l3_pte == 0) continue; if ((pmap_load(l2) & PTE_RX) != 0) continue; /* * Limit our scan to either the end of the va represented * by the current page table page, or to the end of the * range being removed. 
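 * In other words, va_next is clamped to eva below before the inner l3
 * walk, and "va" tracks the start of the pending run of removed entries
 * so that one ranged invalidation can cover the whole run.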
*/ if (va_next > eva) va_next = eva; va = va_next; for (l3 = pmap_l2_to_l3(l2, sva); sva != va_next; l3++, sva += L3_SIZE) { if (l3 == NULL) panic("l3 == NULL"); if (pmap_load(l3) == 0) { if (va != va_next) { pmap_invalidate_range(pmap, va, sva); va = va_next; } continue; } if (va == va_next) va = sva; if (pmap_remove_l3(pmap, l3, sva, l3_pte, &free, &lock)) { sva += L3_SIZE; break; } } if (va != va_next) pmap_invalidate_range(pmap, va, sva); } if (lock != NULL) rw_wunlock(lock); if (anyvalid) pmap_invalidate_all(pmap); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * Routine: pmap_remove_all * Function: * Removes this physical page from * all physical maps in which it resides. * Reflects back modify bits to the pager. * * Notes: * Original versions of this routine were very * inefficient because they iteratively called * pmap_remove (slow...) */ void pmap_remove_all(vm_page_t m) { pv_entry_t pv; pmap_t pmap; pt_entry_t *l3, tl3; pd_entry_t *l2, tl2; struct spglist free; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_all: page %p is not managed", m)); SLIST_INIT(&free); rw_wlock(&pvh_global_lock); while ((pv = TAILQ_FIRST(&m->md.pv_list)) != NULL) { pmap = PV_PMAP(pv); PMAP_LOCK(pmap); pmap_resident_count_dec(pmap, 1); l2 = pmap_l2(pmap, pv->pv_va); KASSERT(l2 != NULL, ("pmap_remove_all: no l2 table found")); tl2 = pmap_load(l2); KASSERT((tl2 & PTE_RX) == 0, ("pmap_remove_all: found a table when expecting " "a block in %p's pv list", m)); l3 = pmap_l2_to_l3(l2, pv->pv_va); if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(pv->pv_va, L3_SIZE); tl3 = pmap_load_clear(l3); PTE_SYNC(l3); pmap_invalidate_page(pmap, pv->pv_va); if (tl3 & PTE_SW_WIRED) pmap->pm_stats.wired_count--; if ((tl3 & PTE_A) != 0) vm_page_aflag_set(m, PGA_REFERENCED); /* * Update the vm_page_t clean and reference bits. */ if (pmap_page_dirty(tl3)) vm_page_dirty(m); pmap_unuse_l3(pmap, pv->pv_va, pmap_load(l2), &free); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; free_pv_entry(pmap, pv); PMAP_UNLOCK(pmap); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&pvh_global_lock); pmap_free_zero_pages(&free); } /* * Set the physical protection on the * specified range of this map as requested. */ void pmap_protect(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va, va_next; pd_entry_t *l1, *l2; pt_entry_t *l3p, l3; pt_entry_t entry; if ((prot & VM_PROT_READ) == VM_PROT_NONE) { pmap_remove(pmap, sva, eva); return; } if ((prot & VM_PROT_WRITE) == VM_PROT_WRITE) return; PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { l1 = pmap_l1(pmap, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (l2 == NULL) continue; if ((pmap_load(l2) & PTE_RX) != 0) continue; if (va_next > eva) va_next = eva; va = va_next; for (l3p = pmap_l2_to_l3(l2, sva); sva != va_next; l3p++, sva += L3_SIZE) { l3 = pmap_load(l3p); if (pmap_l3_valid(l3)) { entry = pmap_load(l3p); entry &= ~(PTE_W); pmap_load_store(l3p, entry); PTE_SYNC(l3p); /* XXX: Use pmap_invalidate_range */ pmap_invalidate_page(pmap, va); } } } PMAP_UNLOCK(pmap); /* TODO: Only invalidate entries we are touching */ pmap_invalidate_all(pmap); } /* * Insert the given physical page (p) at * the specified virtual address (v) in the * target physical map with the protection requested. 
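 * (For example, vm_fault() enters the faulted page this way, passing
 * PMAP_ENTER_WIRED in "flags" when the mapping is wired.)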
* * If specified, the page will be wired down, meaning * that the related pte can not be reclaimed. * * NB: This is the only routine which MAY NOT lazy-evaluate * or lose information. That is, this routine must actually * insert this page into the given map NOW. */ int pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind __unused) { struct rwlock *lock; pd_entry_t *l1, *l2; pt_entry_t new_l3, orig_l3; pt_entry_t *l3; pv_entry_t pv; vm_paddr_t opa, pa, l2_pa, l3_pa; vm_page_t mpte, om, l2_m, l3_m; boolean_t nosleep; pt_entry_t entry; pn_t l2_pn; pn_t l3_pn; pn_t pn; va = trunc_page(va); if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); pa = VM_PAGE_TO_PHYS(m); pn = (pa / PAGE_SIZE); new_l3 = PTE_V | PTE_R | PTE_X; if (prot & VM_PROT_WRITE) new_l3 |= PTE_W; if ((va >> 63) == 0) new_l3 |= PTE_U; new_l3 |= (pn << PTE_PPN0_S); if ((flags & PMAP_ENTER_WIRED) != 0) new_l3 |= PTE_SW_WIRED; CTR2(KTR_PMAP, "pmap_enter: %.16lx -> %.16lx", va, pa); mpte = NULL; lock = NULL; rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); if (va < VM_MAXUSER_ADDRESS) { nosleep = (flags & PMAP_ENTER_NOSLEEP) != 0; mpte = pmap_alloc_l3(pmap, va, nosleep ? NULL : &lock); if (mpte == NULL && nosleep) { CTR0(KTR_PMAP, "pmap_enter: mpte == NULL"); if (lock != NULL) rw_wunlock(lock); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_RESOURCE_SHORTAGE); } l3 = pmap_l3(pmap, va); } else { l3 = pmap_l3(pmap, va); /* TODO: This is not optimal, but should mostly work */ if (l3 == NULL) { l2 = pmap_l2(pmap, va); if (l2 == NULL) { l2_m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (l2_m == NULL) panic("pmap_enter: l2 pte_m == NULL"); if ((l2_m->flags & PG_ZERO) == 0) pmap_zero_page(l2_m); l2_pa = VM_PAGE_TO_PHYS(l2_m); l2_pn = (l2_pa / PAGE_SIZE); l1 = pmap_l1(pmap, va); entry = (PTE_V); entry |= (l2_pn << PTE_PPN0_S); pmap_load_store(l1, entry); pmap_distribute_l1(pmap, pmap_l1_index(va), entry); PTE_SYNC(l1); l2 = pmap_l1_to_l2(l1, va); } KASSERT(l2 != NULL, ("No l2 table after allocating one")); l3_m = vm_page_alloc(NULL, 0, VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ | VM_ALLOC_WIRED | VM_ALLOC_ZERO); if (l3_m == NULL) panic("pmap_enter: l3 pte_m == NULL"); if ((l3_m->flags & PG_ZERO) == 0) pmap_zero_page(l3_m); l3_pa = VM_PAGE_TO_PHYS(l3_m); l3_pn = (l3_pa / PAGE_SIZE); entry = (PTE_V); entry |= (l3_pn << PTE_PPN0_S); pmap_load_store(l2, entry); PTE_SYNC(l2); l3 = pmap_l2_to_l3(l2, va); } pmap_invalidate_page(pmap, va); } om = NULL; orig_l3 = pmap_load(l3); opa = PTE_TO_PHYS(orig_l3); /* * Is the specified virtual address already mapped? */ if (pmap_l3_valid(orig_l3)) { /* * Wiring change, just update stats. We don't worry about * wiring PT pages as they remain resident as long as there * are valid mappings in them. Hence, if a user page is wired, * the PT page will be also. */ if ((flags & PMAP_ENTER_WIRED) != 0 && (orig_l3 & PTE_SW_WIRED) == 0) pmap->pm_stats.wired_count++; else if ((flags & PMAP_ENTER_WIRED) == 0 && (orig_l3 & PTE_SW_WIRED) != 0) pmap->pm_stats.wired_count--; /* * Remove the extra PT page reference. */ if (mpte != NULL) { mpte->wire_count--; KASSERT(mpte->wire_count > 0, ("pmap_enter: missing reference to page table page," " va: 0x%lx", va)); } /* * Has the physical page changed? */ if (opa == pa) { /* * No, might be a protection or wiring change. 
*/ if ((orig_l3 & PTE_SW_MANAGED) != 0) { new_l3 |= PTE_SW_MANAGED; if (pmap_is_write(new_l3)) vm_page_aflag_set(m, PGA_WRITEABLE); } goto validate; } /* Flush the cache, there might be uncommitted data in it */ if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(orig_l3)) cpu_dcache_wb_range(va, L3_SIZE); } else { /* * Increment the counters. */ if ((new_l3 & PTE_SW_WIRED) != 0) pmap->pm_stats.wired_count++; pmap_resident_count_inc(pmap, 1); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0) { new_l3 |= PTE_SW_MANAGED; pv = get_pv_entry(pmap, &lock); pv->pv_va = va; CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, pa); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; if (pmap_is_write(new_l3)) vm_page_aflag_set(m, PGA_WRITEABLE); } /* * Update the L3 entry. */ if (orig_l3 != 0) { validate: orig_l3 = pmap_load_store(l3, new_l3); PTE_SYNC(l3); opa = PTE_TO_PHYS(orig_l3); if (opa != pa) { if ((orig_l3 & PTE_SW_MANAGED) != 0) { om = PHYS_TO_VM_PAGE(opa); if (pmap_page_dirty(orig_l3)) vm_page_dirty(om); if ((orig_l3 & PTE_A) != 0) vm_page_aflag_set(om, PGA_REFERENCED); CHANGE_PV_LIST_LOCK_TO_PHYS(&lock, opa); pmap_pvh_free(&om->md, pmap, va); } } else if (pmap_page_dirty(orig_l3)) { if ((orig_l3 & PTE_SW_MANAGED) != 0) vm_page_dirty(m); } } else { pmap_load_store(l3, new_l3); PTE_SYNC(l3); } pmap_invalidate_page(pmap, va); if ((pmap != pmap_kernel()) && (pmap == &curproc->p_vmspace->vm_pmap)) cpu_icache_sync_range(va, PAGE_SIZE); if (lock != NULL) rw_wunlock(lock); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ void pmap_enter_object(pmap_t pmap, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { struct rwlock *lock; vm_offset_t va; vm_page_t m, mpte; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); mpte = NULL; m = m_start; lock = NULL; rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { va = start + ptoa(diff); mpte = pmap_enter_quick_locked(pmap, va, m, prot, mpte, &lock); m = TAILQ_NEXT(m, listq); } if (lock != NULL) rw_wunlock(lock); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * this code makes some *MAJOR* assumptions: * 1. Current pmap & pmap exists. * 2. Not wired. * 3. Read access. * 4. No page table pages. * but is *MUCH* faster than pmap_enter... 
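 *
 * A minimal sketch of the intended use (editor's illustration; the guard
 * macro and helper name are hypothetical): speculative read-only
 * prefaulting, where failure to map is acceptable and silently ignored.
 */
#ifdef PMAP_USAGE_SKETCH
static void
example_prefault_page(pmap_t pmap, vm_offset_t va, vm_page_t m)
{

	/* Best effort: not wired, read access, no sleeping allocations. */
	pmap_enter_quick(pmap, va, m, VM_PROT_READ);
}
#endif
/*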
*/ void pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot) { struct rwlock *lock; lock = NULL; rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); (void)pmap_enter_quick_locked(pmap, va, m, prot, NULL, &lock); if (lock != NULL) rw_wunlock(lock); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } static vm_page_t pmap_enter_quick_locked(pmap_t pmap, vm_offset_t va, vm_page_t m, vm_prot_t prot, vm_page_t mpte, struct rwlock **lockp) { struct spglist free; vm_paddr_t phys; pd_entry_t *l2; pt_entry_t *l3; vm_paddr_t pa; pt_entry_t entry; pn_t pn; KASSERT(va < kmi.clean_sva || va >= kmi.clean_eva || (m->oflags & VPO_UNMANAGED) != 0, ("pmap_enter_quick_locked: managed mapping within the clean submap")); rw_assert(&pvh_global_lock, RA_LOCKED); PMAP_LOCK_ASSERT(pmap, MA_OWNED); CTR2(KTR_PMAP, "pmap_enter_quick_locked: %p %lx", pmap, va); /* * In the case that a page table page is not * resident, we are creating it here. */ if (va < VM_MAXUSER_ADDRESS) { vm_pindex_t l2pindex; /* * Calculate pagetable page index */ l2pindex = pmap_l2_pindex(va); if (mpte && (mpte->pindex == l2pindex)) { mpte->wire_count++; } else { /* * Get the l2 entry */ l2 = pmap_l2(pmap, va); /* * If the page table page is mapped, we just increment * the hold count, and activate it. Otherwise, we * attempt to allocate a page table page. If this * attempt fails, we don't retry. Instead, we give up. */ if (l2 != NULL && pmap_load(l2) != 0) { phys = PTE_TO_PHYS(pmap_load(l2)); mpte = PHYS_TO_VM_PAGE(phys); mpte->wire_count++; } else { /* * Pass NULL instead of the PV list lock * pointer, because we don't intend to sleep. */ mpte = _pmap_alloc_l3(pmap, l2pindex, NULL); if (mpte == NULL) return (mpte); } } l3 = (pt_entry_t *)PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mpte)); l3 = &l3[pmap_l3_index(va)]; } else { mpte = NULL; l3 = pmap_l3(kernel_pmap, va); } if (l3 == NULL) panic("pmap_enter_quick_locked: No l3"); if (pmap_load(l3) != 0) { if (mpte != NULL) { mpte->wire_count--; mpte = NULL; } return (mpte); } /* * Enter on the PV list if part of our managed memory. */ if ((m->oflags & VPO_UNMANAGED) == 0 && !pmap_try_insert_pv_entry(pmap, va, m, lockp)) { if (mpte != NULL) { SLIST_INIT(&free); if (pmap_unwire_l3(pmap, va, mpte, &free)) { pmap_invalidate_page(pmap, va); pmap_free_zero_pages(&free); } mpte = NULL; } return (mpte); } /* * Increment counters */ pmap_resident_count_inc(pmap, 1); pa = VM_PAGE_TO_PHYS(m); pn = (pa / PAGE_SIZE); /* RISCVTODO: check permissions */ entry = (PTE_V | PTE_RWX); entry |= (pn << PTE_PPN0_S); /* * Now validate mapping with RO protection */ if ((m->oflags & VPO_UNMANAGED) == 0) entry |= PTE_SW_MANAGED; pmap_load_store(l3, entry); PTE_SYNC(l3); pmap_invalidate_page(pmap, va); return (mpte); } /* * This code maps large physical mmap regions into the * processor address space. Note that some shortcuts * are taken, but the code works. */ void pmap_object_init_pt(pmap_t pmap, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("pmap_object_init_pt: non-device object")); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range * must have the wired attribute set. In contrast, invalid mappings * cannot have the wired attribute set, so they are ignored. * * The wired attribute of the page table entry is not a hardware feature, * so there is no need to invalidate any TLB entries. 
*/ void pmap_unwire(pmap_t pmap, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va_next; pd_entry_t *l1, *l2; pt_entry_t *l3; boolean_t pv_lists_locked; pv_lists_locked = FALSE; PMAP_LOCK(pmap); for (; sva < eva; sva = va_next) { l1 = pmap_l1(pmap, sva); if (pmap_load(l1) == 0) { va_next = (sva + L1_SIZE) & ~L1_OFFSET; if (va_next < sva) va_next = eva; continue; } va_next = (sva + L2_SIZE) & ~L2_OFFSET; if (va_next < sva) va_next = eva; l2 = pmap_l1_to_l2(l1, sva); if (pmap_load(l2) == 0) continue; if (va_next > eva) va_next = eva; for (l3 = pmap_l2_to_l3(l2, sva); sva != va_next; l3++, sva += L3_SIZE) { if (pmap_load(l3) == 0) continue; if ((pmap_load(l3) & PTE_SW_WIRED) == 0) panic("pmap_unwire: l3 %#jx is missing " "PTE_SW_WIRED", (uintmax_t)*l3); /* * PG_W must be cleared atomically. Although the pmap * lock synchronizes access to PG_W, another processor * could be setting PG_M and/or PG_A concurrently. */ atomic_clear_long(l3, PTE_SW_WIRED); pmap->pm_stats.wired_count--; } } if (pv_lists_locked) rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); } /* * Copy the range specified by src_addr/len * from the source map to the range dst_addr/len * in the destination map. * * This routine is only advisory and need not do anything. */ void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { } /* * pmap_zero_page zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. */ void pmap_zero_page(vm_page_t m) { vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); pagezero((void *)va); } /* * pmap_zero_page_area zeros the specified hardware page by mapping * the page into KVM and using bzero to clear its contents. * * off and size may not cover an area beyond a single hardware page. */ void pmap_zero_page_area(vm_page_t m, int off, int size) { vm_offset_t va = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m)); if (off == 0 && size == PAGE_SIZE) pagezero((void *)va); else bzero((char *)va + off, size); } /* * pmap_copy_page copies the specified (machine independent) * page by mapping the page into virtual memory and using * bcopy to copy the page, one machine dependent page at a * time. */ void pmap_copy_page(vm_page_t msrc, vm_page_t mdst) { vm_offset_t src = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(msrc)); vm_offset_t dst = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(mdst)); pagecopy((void *)src, (void *)dst); } int unmapped_buf_allowed = 1; void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[], vm_offset_t b_offset, int xfersize) { void *a_cp, *b_cp; vm_page_t m_a, m_b; vm_paddr_t p_a, p_b; vm_offset_t a_pg_offset, b_pg_offset; int cnt; while (xfersize > 0) { a_pg_offset = a_offset & PAGE_MASK; m_a = ma[a_offset >> PAGE_SHIFT]; p_a = m_a->phys_addr; b_pg_offset = b_offset & PAGE_MASK; m_b = mb[b_offset >> PAGE_SHIFT]; p_b = m_b->phys_addr; cnt = min(xfersize, PAGE_SIZE - a_pg_offset); cnt = min(cnt, PAGE_SIZE - b_pg_offset); if (__predict_false(!PHYS_IN_DMAP(p_a))) { panic("!DMAP a %lx", p_a); } else { a_cp = (char *)PHYS_TO_DMAP(p_a) + a_pg_offset; } if (__predict_false(!PHYS_IN_DMAP(p_b))) { panic("!DMAP b %lx", p_b); } else { b_cp = (char *)PHYS_TO_DMAP(p_b) + b_pg_offset; } bcopy(a_cp, b_cp, cnt); a_offset += cnt; b_offset += cnt; xfersize -= cnt; } } vm_offset_t pmap_quick_enter_page(vm_page_t m) { return (PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m))); } void pmap_quick_remove_page(vm_offset_t addr) { } /* * Returns true if the pmap's pv is one of the first * 16 pvs linked to from this page. 
This count may * be changed upwards or downwards in the future; it * is only necessary that true be returned for a small * subset of pmaps for proper page aging. */ boolean_t pmap_page_exists_quick(pmap_t pmap, vm_page_t m) { struct rwlock *lock; pv_entry_t pv; int loops = 0; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_page_exists_quick: page %p is not managed", m)); rv = FALSE; rw_rlock(&pvh_global_lock); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { if (PV_PMAP(pv) == pmap) { rv = TRUE; break; } loops++; if (loops >= 16) break; } rw_runlock(lock); rw_runlock(&pvh_global_lock); return (rv); } /* * pmap_page_wired_mappings: * * Return the number of managed mappings to the given physical page * that are wired. */ int pmap_page_wired_mappings(vm_page_t m) { struct rwlock *lock; pmap_t pmap; pt_entry_t *l3; pv_entry_t pv; int count, md_gen; if ((m->oflags & VPO_UNMANAGED) != 0) return (0); rw_rlock(&pvh_global_lock); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: count = 0; TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } l3 = pmap_l3(pmap, pv->pv_va); if (l3 != NULL && (pmap_load(l3) & PTE_SW_WIRED) != 0) count++; PMAP_UNLOCK(pmap); } rw_runlock(lock); rw_runlock(&pvh_global_lock); return (count); } /* * Destroy all managed, non-wired mappings in the given user-space * pmap. This pmap cannot be active on any processor besides the * caller. * * This function cannot be applied to the kernel pmap. Moreover, it * is not intended for general use. It is only to be used during * process termination. Consequently, it can be implemented in ways * that make it faster than pmap_remove(). First, it can more quickly * destroy mappings by iterating over the pmap's collection of PV * entries, rather than searching the page table. Second, it doesn't * have to test and clear the page table entries atomically, because * no processor is currently accessing the user address space. In * particular, a page table entry's dirty bit won't change state once * this function starts. 
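 *
 * Sketch of the intended call site (editor's illustration; the guard
 * macro and helper name are hypothetical): process exit, once no other
 * CPU can be running in this address space.
 */
#ifdef PMAP_USAGE_SKETCH
static void
example_exit_mappings(struct vmspace *vm)
{

	pmap_remove_pages(vmspace_pmap(vm));
}
#endif
/*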
*/ void pmap_remove_pages(pmap_t pmap) { pd_entry_t ptepde, *l2; pt_entry_t *l3, tl3; struct spglist free; vm_page_t m; pv_entry_t pv; struct pv_chunk *pc, *npc; struct rwlock *lock; int64_t bit; uint64_t inuse, bitmask; int allfree, field, freed, idx; vm_paddr_t pa; lock = NULL; SLIST_INIT(&free); rw_rlock(&pvh_global_lock); PMAP_LOCK(pmap); TAILQ_FOREACH_SAFE(pc, &pmap->pm_pvchunk, pc_list, npc) { allfree = 1; freed = 0; for (field = 0; field < _NPCM; field++) { inuse = ~pc->pc_map[field] & pc_freemask[field]; while (inuse != 0) { bit = ffsl(inuse) - 1; bitmask = 1UL << bit; idx = field * 64 + bit; pv = &pc->pc_pventry[idx]; inuse &= ~bitmask; l2 = pmap_l2(pmap, pv->pv_va); ptepde = pmap_load(l2); l3 = pmap_l2_to_l3(l2, pv->pv_va); tl3 = pmap_load(l3); /* * We cannot remove wired pages from a process' mapping at this time */ if (tl3 & PTE_SW_WIRED) { allfree = 0; continue; } pa = PTE_TO_PHYS(tl3); m = PHYS_TO_VM_PAGE(pa); KASSERT(m->phys_addr == pa, ("vm_page_t %p phys_addr mismatch %016jx %016jx", m, (uintmax_t)m->phys_addr, (uintmax_t)tl3)); KASSERT((m->flags & PG_FICTITIOUS) != 0 || m < &vm_page_array[vm_page_array_size], ("pmap_remove_pages: bad l3 %#jx", (uintmax_t)tl3)); if (pmap_is_current(pmap) && pmap_l3_valid_cacheable(pmap_load(l3))) cpu_dcache_wb_range(pv->pv_va, L3_SIZE); pmap_load_clear(l3); PTE_SYNC(l3); pmap_invalidate_page(pmap, pv->pv_va); /* * Update the vm_page_t clean/reference bits. */ if (pmap_page_dirty(tl3)) vm_page_dirty(m); CHANGE_PV_LIST_LOCK_TO_VM_PAGE(&lock, m); /* Mark free */ pc->pc_map[field] |= bitmask; pmap_resident_count_dec(pmap, 1); TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; pmap_unuse_l3(pmap, pv->pv_va, ptepde, &free); freed++; } } PV_STAT(atomic_add_long(&pv_entry_frees, freed)); PV_STAT(atomic_add_int(&pv_entry_spare, freed)); PV_STAT(atomic_subtract_long(&pv_entry_count, freed)); if (allfree) { TAILQ_REMOVE(&pmap->pm_pvchunk, pc, pc_list); free_pv_chunk(pc); } } pmap_invalidate_all(pmap); if (lock != NULL) rw_wunlock(lock); rw_runlock(&pvh_global_lock); PMAP_UNLOCK(pmap); pmap_free_zero_pages(&free); } /* * This is used to check if a page has been accessed or modified. As we * don't have a bit to see if it has been modified we have to assume it * has been if the page is read/write. */ static boolean_t pmap_page_test_mappings(vm_page_t m, boolean_t accessed, boolean_t modified) { struct rwlock *lock; pv_entry_t pv; pt_entry_t *l3, mask, value; pmap_t pmap; int md_gen; boolean_t rv; rv = FALSE; rw_rlock(&pvh_global_lock); lock = VM_PAGE_TO_PV_LIST_LOCK(m); rw_rlock(lock); restart: TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_runlock(lock); PMAP_LOCK(pmap); rw_rlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto restart; } } l3 = pmap_l3(pmap, pv->pv_va); mask = 0; value = 0; if (modified) { mask |= PTE_D; value |= PTE_D; } if (accessed) { mask |= PTE_A; value |= PTE_A; } #if 0 if (modified) { mask |= ATTR_AP_RW_BIT; value |= ATTR_AP(ATTR_AP_RW); } if (accessed) { mask |= ATTR_AF | ATTR_DESCR_MASK; value |= ATTR_AF | L3_PAGE; } #endif rv = (pmap_load(l3) & mask) == value; PMAP_UNLOCK(pmap); if (rv) goto out; } out: rw_runlock(lock); rw_runlock(&pvh_global_lock); return (rv); } /* * pmap_is_modified: * * Return whether or not the specified physical page was modified * in any physical maps. 
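 * On this port the check reduces to testing PTE_D in each mapping's l3
 * entry via pmap_page_test_mappings(m, FALSE, TRUE) above; the accessed
 * test (pmap_is_referenced()) uses PTE_A the same way.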
*/ boolean_t pmap_is_modified(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_modified: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * concurrently set while the object is locked. Thus, if PGA_WRITEABLE * is clear, no PTEs can have PG_M set. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return (FALSE); return (pmap_page_test_mappings(m, FALSE, TRUE)); } /* * pmap_is_prefaultable: * * Return whether or not the specified virtual address is eligible * for prefault. */ boolean_t pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr) { pt_entry_t *l3; boolean_t rv; rv = FALSE; PMAP_LOCK(pmap); l3 = pmap_l3(pmap, addr); if (l3 != NULL && pmap_load(l3) != 0) { rv = TRUE; } PMAP_UNLOCK(pmap); return (rv); } /* * pmap_is_referenced: * * Return whether or not the specified physical page was referenced * in any physical maps. */ boolean_t pmap_is_referenced(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_is_referenced: page %p is not managed", m)); return (pmap_page_test_mappings(m, TRUE, FALSE)); } /* * Clear the write and modified bits in each of the given page's mappings. */ void pmap_remove_write(vm_page_t m) { pmap_t pmap; struct rwlock *lock; pv_entry_t pv; pt_entry_t *l3, oldl3; pt_entry_t newl3; int md_gen; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_write: page %p is not managed", m)); /* * If the page is not exclusive busied, then PGA_WRITEABLE cannot be * set by another thread while the object is locked. Thus, * if PGA_WRITEABLE is clear, no page table entries need updating. */ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_rlock(&pvh_global_lock); lock = VM_PAGE_TO_PV_LIST_LOCK(m); retry_pv_loop: rw_wlock(lock); TAILQ_FOREACH(pv, &m->md.pv_list, pv_next) { pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); rw_wunlock(lock); goto retry_pv_loop; } } l3 = pmap_l3(pmap, pv->pv_va); retry: oldl3 = pmap_load(l3); if (pmap_is_write(oldl3)) { newl3 = oldl3 & ~(PTE_W); if (!atomic_cmpset_long(l3, oldl3, newl3)) goto retry; /* TODO: use pmap_page_dirty(oldl3) ? */ if ((oldl3 & PTE_A) != 0) vm_page_dirty(m); pmap_invalidate_page(pmap, pv->pv_va); } PMAP_UNLOCK(pmap); } rw_wunlock(lock); vm_page_aflag_clear(m, PGA_WRITEABLE); rw_runlock(&pvh_global_lock); } static __inline boolean_t safe_to_clear_referenced(pmap_t pmap, pt_entry_t pte) { return (FALSE); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * pmap_ts_referenced: * * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. + * As an optimization, update the page's dirty field if a modified bit is + * found while counting reference bits. This opportunistic update can be + * performed at low cost and can eliminate the need for some future calls + * to pmap_is_modified(). However, since this function stops after + * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some + * dirty pages. Those dirty pages will only be detected by a future call + * to pmap_is_modified(). 
*/ int pmap_ts_referenced(vm_page_t m) { pv_entry_t pv, pvf; pmap_t pmap; struct rwlock *lock; pd_entry_t *l2; - pt_entry_t *l3; + pt_entry_t *l3, old_l3; vm_paddr_t pa; int cleared, md_gen, not_cleared; struct spglist free; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_ts_referenced: page %p is not managed", m)); SLIST_INIT(&free); cleared = 0; pa = VM_PAGE_TO_PHYS(m); lock = PHYS_TO_PV_LIST_LOCK(pa); rw_rlock(&pvh_global_lock); rw_wlock(lock); retry: not_cleared = 0; if ((pvf = TAILQ_FIRST(&m->md.pv_list)) == NULL) goto out; pv = pvf; do { if (pvf == NULL) pvf = pv; pmap = PV_PMAP(pv); if (!PMAP_TRYLOCK(pmap)) { md_gen = m->md.pv_gen; rw_wunlock(lock); PMAP_LOCK(pmap); rw_wlock(lock); if (md_gen != m->md.pv_gen) { PMAP_UNLOCK(pmap); goto retry; } } l2 = pmap_l2(pmap, pv->pv_va); KASSERT((pmap_load(l2) & PTE_RX) == 0, ("pmap_ts_referenced: found an invalid l2 table")); l3 = pmap_l2_to_l3(l2, pv->pv_va); - if ((pmap_load(l3) & PTE_A) != 0) { - if (safe_to_clear_referenced(pmap, pmap_load(l3))) { + old_l3 = pmap_load(l3); + if (pmap_page_dirty(old_l3)) + vm_page_dirty(m); + if ((old_l3 & PTE_A) != 0) { + if (safe_to_clear_referenced(pmap, old_l3)) { /* * TODO: We don't handle the access flag * at all. We need to be able to set it in * the exception handler. */ panic("RISCVTODO: safe_to_clear_referenced\n"); - } else if ((pmap_load(l3) & PTE_SW_WIRED) == 0) { + } else if ((old_l3 & PTE_SW_WIRED) == 0) { /* * Wired pages cannot be paged out so * doing accessed bit emulation for * them is wasted effort. We do the * hard work for unwired pages only. */ pmap_remove_l3(pmap, l3, pv->pv_va, pmap_load(l2), &free, &lock); pmap_invalidate_page(pmap, pv->pv_va); cleared++; if (pvf == pv) pvf = NULL; pv = NULL; KASSERT(lock == VM_PAGE_TO_PV_LIST_LOCK(m), ("inconsistent pv lock %p %p for page %p", lock, VM_PAGE_TO_PV_LIST_LOCK(m), m)); } else not_cleared++; } PMAP_UNLOCK(pmap); /* Rotate the PV list if it has more than one entry. */ if (pv != NULL && TAILQ_NEXT(pv, pv_next) != NULL) { TAILQ_REMOVE(&m->md.pv_list, pv, pv_next); TAILQ_INSERT_TAIL(&m->md.pv_list, pv, pv_next); m->md.pv_gen++; } } while ((pv = TAILQ_FIRST(&m->md.pv_list)) != pvf && cleared + not_cleared < PMAP_TS_REFERENCED_MAX); out: rw_wunlock(lock); rw_runlock(&pvh_global_lock); pmap_free_zero_pages(&free); return (cleared + not_cleared); } /* * Apply the given advice to the specified range of addresses within the * given pmap. Depending on the advice, clear the referenced and/or * modified flags in each mapping and set the mapped page's dirty field. */ void pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice) { } /* * Clear the modify bits on the specified physical page. */ void pmap_clear_modify(vm_page_t m) { KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_clear_modify: page %p is not managed", m)); VM_OBJECT_ASSERT_WLOCKED(m->object); KASSERT(!vm_page_xbusied(m), ("pmap_clear_modify: page %p is exclusive busied", m)); /* * If the page is not PGA_WRITEABLE, then no PTEs can have PG_M set. * If the object containing the page is locked and the page is not * exclusive busied, then PGA_WRITEABLE cannot be concurrently set. */ if ((m->aflags & PGA_WRITEABLE) == 0) return; /* RISCVTODO: We lack support for tracking if a page is modified */ } void * pmap_mapbios(vm_paddr_t pa, vm_size_t size) { return ((void *)PHYS_TO_DMAP(pa)); } void pmap_unmapbios(vm_paddr_t pa, vm_size_t size) { } /* * Sets the memory attribute for the specified page. 
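 * Note: on this port the update of the direct mapping is unimplemented;
 * the body below records the attribute and then panics ("RISCVTODO")
 * for any non-fictitious page that lies within the DMAP.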
*/ void pmap_page_set_memattr(vm_page_t m, vm_memattr_t ma) { m->md.pv_memattr = ma; /* * RISCVTODO: Implement the below (from the amd64 pmap) * If "m" is a normal page, update its direct mapping. This update * can be relied upon to perform any cache operations that are * required for data coherence. */ if ((m->flags & PG_FICTITIOUS) == 0 && PHYS_IN_DMAP(VM_PAGE_TO_PHYS(m))) panic("RISCVTODO: pmap_page_set_memattr"); } /* * perform the pmap work for mincore */ int pmap_mincore(pmap_t pmap, vm_offset_t addr, vm_paddr_t *locked_pa) { panic("RISCVTODO: pmap_mincore"); } void pmap_activate(struct thread *td) { pmap_t pmap; critical_enter(); pmap = vmspace_pmap(td->td_proc->p_vmspace); td->td_pcb->pcb_l1addr = vtophys(pmap->pm_l1); __asm __volatile("csrw sptbr, %0" :: "r"(td->td_pcb->pcb_l1addr >> PAGE_SHIFT)); pmap_invalidate_all(pmap); critical_exit(); } void pmap_sync_icache(pmap_t pm, vm_offset_t va, vm_size_t sz) { panic("RISCVTODO: pmap_sync_icache"); } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more superpage mappings. */ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { } /** * Get the kernel virtual address of a set of physical pages. If there are * physical addresses not covered by the DMAP perform a transient mapping * that will be removed when calling pmap_unmap_io_transient. * * \param page The pages the caller wishes to obtain the virtual * address on the kernel memory map. * \param vaddr On return contains the kernel virtual memory address * of the pages passed in the page parameter. * \param count Number of pages passed in. * \param can_fault TRUE if the thread using the mapped pages can take * page faults, FALSE otherwise. * * \returns TRUE if the caller must call pmap_unmap_io_transient when * finished or FALSE otherwise. * */ boolean_t pmap_map_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; boolean_t needs_mapping; int error, i; /* * Allocate any KVA space that we need, this is done in a separate * loop to prevent calling vmem_alloc while pinned. */ needs_mapping = FALSE; for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (__predict_false(paddr >= DMAP_MAX_PHYSADDR)) { error = vmem_alloc(kernel_arena, PAGE_SIZE, M_BESTFIT | M_WAITOK, &vaddr[i]); KASSERT(error == 0, ("vmem_alloc failed: %d", error)); needs_mapping = TRUE; } else { vaddr[i] = PHYS_TO_DMAP(paddr); } } /* Exit early if everything is covered by the DMAP */ if (!needs_mapping) return (FALSE); if (!can_fault) sched_pin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (paddr >= DMAP_MAX_PHYSADDR) { panic( "pmap_map_io_transient: TODO: Map out of DMAP data"); } } return (needs_mapping); } void pmap_unmap_io_transient(vm_page_t page[], vm_offset_t vaddr[], int count, boolean_t can_fault) { vm_paddr_t paddr; int i; if (!can_fault) sched_unpin(); for (i = 0; i < count; i++) { paddr = VM_PAGE_TO_PHYS(page[i]); if (paddr >= DMAP_MAX_PHYSADDR) { panic("RISCVTODO: pmap_unmap_io_transient: Unmap data"); } } } Index: projects/clang390-import/sys/security/audit/audit_syscalls.c =================================================================== --- projects/clang390-import/sys/security/audit/audit_syscalls.c (revision 305686) +++ projects/clang390-import/sys/security/audit/audit_syscalls.c (revision 305687) @@ -1,873 +1,873 @@ /*- * Copyright (c) 1999-2009 Apple Inc. * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of Apple Inc. ("Apple") nor the names of * its contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY APPLE AND ITS CONTRIBUTORS "AS IS" AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL APPLE OR ITS CONTRIBUTORS BE LIABLE FOR * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef AUDIT /* * System call to allow a user space application to submit a BSM audit record * to the kernel for inclusion in the audit log. This function does little * verification on the audit record that is submitted. * * XXXAUDIT: Audit preselection for user records does not currently work, * since we pre-select only based on the AUE_audit event type, not the event * type submitted as part of the user audit data. */ /* ARGSUSED */ int sys_audit(struct thread *td, struct audit_args *uap) { int error; void * rec; struct kaudit_record *ar; if (jailed(td->td_ucred)) return (ENOSYS); error = priv_check(td, PRIV_AUDIT_SUBMIT); if (error) return (error); if ((uap->length <= 0) || (uap->length > audit_qctrl.aq_bufsz)) return (EINVAL); ar = currecord(); /* * If there's no current audit record (audit() itself not audited) * commit the user audit record. */ if (ar == NULL) { /* * This is not very efficient; we're required to allocate a * complete kernel audit record just so the user record can * tag along. * * XXXAUDIT: Maybe AUE_AUDIT in the system call context and * special pre-select handling? */ td->td_ar = audit_new(AUE_NULL, td); if (td->td_ar == NULL) return (ENOTSUP); td->td_pflags |= TDP_AUDITREC; ar = td->td_ar; } if (uap->length > MAX_AUDIT_RECORD_SIZE) return (EINVAL); rec = malloc(uap->length, M_AUDITDATA, M_WAITOK); error = copyin(uap->record, rec, uap->length); if (error) goto free_out; /* Verify the record. */ if (bsm_rec_verify(rec) == 0) { error = EINVAL; goto free_out; } #ifdef MAC error = mac_system_check_audit(td->td_ucred, rec, uap->length); if (error) goto free_out; #endif /* * Attach the user audit record to the kernel audit record. Because * this system call is an auditable event, we will write the user * record along with the record for this audit event. * * XXXAUDIT: KASSERT appropriate starting values of k_udata, k_ulen, * k_ar_commit & AR_COMMIT_USER? 
*/ ar->k_udata = rec; ar->k_ulen = uap->length; ar->k_ar_commit |= AR_COMMIT_USER; /* * Currently we assume that all preselection has been performed in * userspace. We unconditionally set these masks so that the records * get committed both to the trail and pipe. In the future we will * want to setup kernel based preselection. */ ar->k_ar_commit |= (AR_PRESELECT_USER_TRAIL | AR_PRESELECT_USER_PIPE); return (0); free_out: /* * audit_syscall_exit() will free the audit record on the thread even * if we allocated it above. */ free(rec, M_AUDITDATA); return (error); } /* * System call to manipulate auditing. */ /* ARGSUSED */ int sys_auditon(struct thread *td, struct auditon_args *uap) { struct ucred *cred, *newcred, *oldcred; int error; union auditon_udata udata; struct proc *tp; if (jailed(td->td_ucred)) return (ENOSYS); AUDIT_ARG_CMD(uap->cmd); #ifdef MAC error = mac_system_check_auditon(td->td_ucred, uap->cmd); if (error) return (error); #endif error = priv_check(td, PRIV_AUDIT_CONTROL); if (error) return (error); if ((uap->length <= 0) || (uap->length > sizeof(union auditon_udata))) return (EINVAL); memset((void *)&udata, 0, sizeof(udata)); /* * Some of the GET commands use the arguments too. */ switch (uap->cmd) { case A_SETPOLICY: case A_OLDSETPOLICY: case A_SETKMASK: case A_SETQCTRL: case A_OLDSETQCTRL: case A_SETSTAT: case A_SETUMASK: case A_SETSMASK: case A_SETCOND: case A_OLDSETCOND: case A_SETCLASS: case A_SETPMASK: case A_SETFSIZE: case A_SETKAUDIT: case A_GETCLASS: case A_GETPINFO: case A_GETPINFO_ADDR: case A_SENDTRIGGER: error = copyin(uap->data, (void *)&udata, uap->length); if (error) return (error); AUDIT_ARG_AUDITON(&udata); break; } /* * XXXAUDIT: Locking? */ switch (uap->cmd) { case A_OLDGETPOLICY: case A_GETPOLICY: if (uap->length == sizeof(udata.au_policy64)) { if (!audit_fail_stop) udata.au_policy64 |= AUDIT_CNT; if (audit_panic_on_write_fail) udata.au_policy64 |= AUDIT_AHLT; if (audit_argv) udata.au_policy64 |= AUDIT_ARGV; if (audit_arge) udata.au_policy64 |= AUDIT_ARGE; break; } if (uap->length != sizeof(udata.au_policy)) return (EINVAL); if (!audit_fail_stop) udata.au_policy |= AUDIT_CNT; if (audit_panic_on_write_fail) udata.au_policy |= AUDIT_AHLT; if (audit_argv) udata.au_policy |= AUDIT_ARGV; if (audit_arge) udata.au_policy |= AUDIT_ARGE; break; case A_OLDSETPOLICY: case A_SETPOLICY: if (uap->length == sizeof(udata.au_policy64)) { if (udata.au_policy & (~AUDIT_CNT|AUDIT_AHLT| AUDIT_ARGV|AUDIT_ARGE)) return (EINVAL); audit_fail_stop = ((udata.au_policy64 & AUDIT_CNT) == 0); audit_panic_on_write_fail = (udata.au_policy64 & AUDIT_AHLT); audit_argv = (udata.au_policy64 & AUDIT_ARGV); audit_arge = (udata.au_policy64 & AUDIT_ARGE); break; } if (uap->length != sizeof(udata.au_policy)) return (EINVAL); if (udata.au_policy & ~(AUDIT_CNT|AUDIT_AHLT|AUDIT_ARGV| AUDIT_ARGE)) return (EINVAL); /* * XXX - Need to wake up waiters if the policy relaxes? 
 */
		audit_fail_stop = ((udata.au_policy & AUDIT_CNT) == 0);
		audit_panic_on_write_fail = (udata.au_policy & AUDIT_AHLT);
		audit_argv = (udata.au_policy & AUDIT_ARGV);
		audit_arge = (udata.au_policy & AUDIT_ARGE);
		break;

	case A_GETKMASK:
		if (uap->length != sizeof(udata.au_mask))
			return (EINVAL);
		udata.au_mask = audit_nae_mask;
		break;

	case A_SETKMASK:
		if (uap->length != sizeof(udata.au_mask))
			return (EINVAL);
		audit_nae_mask = udata.au_mask;
		break;

	case A_OLDGETQCTRL:
	case A_GETQCTRL:
		if (uap->length == sizeof(udata.au_qctrl64)) {
			udata.au_qctrl64.aq64_hiwater =
			    (u_int64_t)audit_qctrl.aq_hiwater;
			udata.au_qctrl64.aq64_lowater =
			    (u_int64_t)audit_qctrl.aq_lowater;
			udata.au_qctrl64.aq64_bufsz =
			    (u_int64_t)audit_qctrl.aq_bufsz;
			udata.au_qctrl64.aq64_minfree =
			    (u_int64_t)audit_qctrl.aq_minfree;
			break;
		}
		if (uap->length != sizeof(udata.au_qctrl))
			return (EINVAL);
		udata.au_qctrl = audit_qctrl;
		break;

	case A_OLDSETQCTRL:
	case A_SETQCTRL:
		if (uap->length == sizeof(udata.au_qctrl64)) {
+			/* NB: aq64_minfree is unsigned unlike aq_minfree. */
			if ((udata.au_qctrl64.aq64_hiwater > AQ_MAXHIGH) ||
			    (udata.au_qctrl64.aq64_lowater >=
			    udata.au_qctrl64.aq64_hiwater) ||
			    (udata.au_qctrl64.aq64_bufsz > AQ_MAXBUFSZ) ||
-			    (udata.au_qctrl64.aq64_minfree > 100) ||
-			    (udata.au_qctrl64.aq64_minfree < 0))
+			    (udata.au_qctrl64.aq64_minfree > 100))
				return (EINVAL);
			audit_qctrl.aq_hiwater =
			    (int)udata.au_qctrl64.aq64_hiwater;
			audit_qctrl.aq_lowater =
			    (int)udata.au_qctrl64.aq64_lowater;
			audit_qctrl.aq_bufsz =
			    (int)udata.au_qctrl64.aq64_bufsz;
			audit_qctrl.aq_minfree =
			    (int)udata.au_qctrl64.aq64_minfree;
			audit_qctrl.aq_delay = -1;	/* Not used. */
			break;
		}
		if (uap->length != sizeof(udata.au_qctrl))
			return (EINVAL);
		if ((udata.au_qctrl.aq_hiwater > AQ_MAXHIGH) ||
		    (udata.au_qctrl.aq_lowater >= udata.au_qctrl.aq_hiwater) ||
		    (udata.au_qctrl.aq_bufsz > AQ_MAXBUFSZ) ||
		    (udata.au_qctrl.aq_minfree < 0) ||
		    (udata.au_qctrl.aq_minfree > 100))
			return (EINVAL);
		audit_qctrl = udata.au_qctrl;
		/*
		 * XXX The queue delay value isn't used with the kernel.
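		 *
		 * A userland sketch of driving this command (editor's
		 * illustration with hypothetical values; as noted above,
		 * aq_delay is ignored):
		 *
		 *	struct au_qctrl q;
		 *
		 *	q.aq_hiwater = 100;
		 *	q.aq_lowater = 10;
		 *	q.aq_bufsz = 32768;
		 *	q.aq_minfree = 20;
		 *	q.aq_delay = -1;
		 *	if (auditon(A_SETQCTRL, &q, sizeof(q)) < 0)
		 *		err(1, "auditon");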
*/ audit_qctrl.aq_delay = -1; break; case A_GETCWD: return (ENOSYS); break; case A_GETCAR: return (ENOSYS); break; case A_GETSTAT: return (ENOSYS); break; case A_SETSTAT: return (ENOSYS); break; case A_SETUMASK: return (ENOSYS); break; case A_SETSMASK: return (ENOSYS); break; case A_OLDGETCOND: case A_GETCOND: if (uap->length == sizeof(udata.au_cond64)) { if (audit_enabled && !audit_suspended) udata.au_cond64 = AUC_AUDITING; else udata.au_cond64 = AUC_NOAUDIT; break; } if (uap->length != sizeof(udata.au_cond)) return (EINVAL); if (audit_enabled && !audit_suspended) udata.au_cond = AUC_AUDITING; else udata.au_cond = AUC_NOAUDIT; break; case A_OLDSETCOND: case A_SETCOND: if (uap->length == sizeof(udata.au_cond64)) { if (udata.au_cond64 == AUC_NOAUDIT) audit_suspended = 1; if (udata.au_cond64 == AUC_AUDITING) audit_suspended = 0; if (udata.au_cond64 == AUC_DISABLED) { audit_suspended = 1; audit_shutdown(NULL, 0); } break; } if (uap->length != sizeof(udata.au_cond)) return (EINVAL); if (udata.au_cond == AUC_NOAUDIT) audit_suspended = 1; if (udata.au_cond == AUC_AUDITING) audit_suspended = 0; if (udata.au_cond == AUC_DISABLED) { audit_suspended = 1; audit_shutdown(NULL, 0); } break; case A_GETCLASS: if (uap->length != sizeof(udata.au_evclass)) return (EINVAL); udata.au_evclass.ec_class = au_event_class( udata.au_evclass.ec_number); break; case A_SETCLASS: if (uap->length != sizeof(udata.au_evclass)) return (EINVAL); au_evclassmap_insert(udata.au_evclass.ec_number, udata.au_evclass.ec_class); break; case A_GETPINFO: if (uap->length != sizeof(udata.au_aupinfo)) return (EINVAL); if (udata.au_aupinfo.ap_pid < 1) return (ESRCH); if ((tp = pfind(udata.au_aupinfo.ap_pid)) == NULL) return (ESRCH); if ((error = p_cansee(td, tp)) != 0) { PROC_UNLOCK(tp); return (error); } cred = tp->p_ucred; if (cred->cr_audit.ai_termid.at_type == AU_IPv6) { PROC_UNLOCK(tp); return (EINVAL); } udata.au_aupinfo.ap_auid = cred->cr_audit.ai_auid; udata.au_aupinfo.ap_mask.am_success = cred->cr_audit.ai_mask.am_success; udata.au_aupinfo.ap_mask.am_failure = cred->cr_audit.ai_mask.am_failure; udata.au_aupinfo.ap_termid.machine = cred->cr_audit.ai_termid.at_addr[0]; udata.au_aupinfo.ap_termid.port = (dev_t)cred->cr_audit.ai_termid.at_port; udata.au_aupinfo.ap_asid = cred->cr_audit.ai_asid; PROC_UNLOCK(tp); break; case A_SETPMASK: if (uap->length != sizeof(udata.au_aupinfo)) return (EINVAL); if (udata.au_aupinfo.ap_pid < 1) return (ESRCH); newcred = crget(); if ((tp = pfind(udata.au_aupinfo.ap_pid)) == NULL) { crfree(newcred); return (ESRCH); } if ((error = p_cansee(td, tp)) != 0) { PROC_UNLOCK(tp); crfree(newcred); return (error); } oldcred = tp->p_ucred; crcopy(newcred, oldcred); newcred->cr_audit.ai_mask.am_success = udata.au_aupinfo.ap_mask.am_success; newcred->cr_audit.ai_mask.am_failure = udata.au_aupinfo.ap_mask.am_failure; proc_set_cred(tp, newcred); PROC_UNLOCK(tp); crfree(oldcred); break; case A_SETFSIZE: if (uap->length != sizeof(udata.au_fstat)) return (EINVAL); if ((udata.au_fstat.af_filesz != 0) && (udata.au_fstat.af_filesz < MIN_AUDIT_FILE_SIZE)) return (EINVAL); audit_fstat.af_filesz = udata.au_fstat.af_filesz; break; case A_GETFSIZE: if (uap->length != sizeof(udata.au_fstat)) return (EINVAL); udata.au_fstat.af_filesz = audit_fstat.af_filesz; udata.au_fstat.af_currsz = audit_fstat.af_currsz; break; case A_GETPINFO_ADDR: if (uap->length != sizeof(udata.au_aupinfo_addr)) return (EINVAL); if (udata.au_aupinfo_addr.ap_pid < 1) return (ESRCH); if ((tp = pfind(udata.au_aupinfo_addr.ap_pid)) == NULL) return (ESRCH); 
cred = tp->p_ucred; udata.au_aupinfo_addr.ap_auid = cred->cr_audit.ai_auid; udata.au_aupinfo_addr.ap_mask.am_success = cred->cr_audit.ai_mask.am_success; udata.au_aupinfo_addr.ap_mask.am_failure = cred->cr_audit.ai_mask.am_failure; udata.au_aupinfo_addr.ap_termid = cred->cr_audit.ai_termid; udata.au_aupinfo_addr.ap_asid = cred->cr_audit.ai_asid; PROC_UNLOCK(tp); break; case A_GETKAUDIT: if (uap->length != sizeof(udata.au_kau_info)) return (EINVAL); audit_get_kinfo(&udata.au_kau_info); break; case A_SETKAUDIT: if (uap->length != sizeof(udata.au_kau_info)) return (EINVAL); if (udata.au_kau_info.ai_termid.at_type != AU_IPv4 && udata.au_kau_info.ai_termid.at_type != AU_IPv6) return (EINVAL); audit_set_kinfo(&udata.au_kau_info); break; case A_SENDTRIGGER: if (uap->length != sizeof(udata.au_trigger)) return (EINVAL); if ((udata.au_trigger < AUDIT_TRIGGER_MIN) || (udata.au_trigger > AUDIT_TRIGGER_MAX)) return (EINVAL); return (audit_send_trigger(udata.au_trigger)); default: return (EINVAL); } /* * Copy data back to userspace for the GET commands. */ switch (uap->cmd) { case A_GETPOLICY: case A_OLDGETPOLICY: case A_GETKMASK: case A_GETQCTRL: case A_OLDGETQCTRL: case A_GETCWD: case A_GETCAR: case A_GETSTAT: case A_GETCOND: case A_OLDGETCOND: case A_GETCLASS: case A_GETPINFO: case A_GETFSIZE: case A_GETPINFO_ADDR: case A_GETKAUDIT: error = copyout((void *)&udata, uap->data, uap->length); if (error) return (error); break; } return (0); } /* * System calls to manage the user audit information. */ /* ARGSUSED */ int sys_getauid(struct thread *td, struct getauid_args *uap) { int error; if (jailed(td->td_ucred)) return (ENOSYS); error = priv_check(td, PRIV_AUDIT_GETAUDIT); if (error) return (error); return (copyout(&td->td_ucred->cr_audit.ai_auid, uap->auid, sizeof(td->td_ucred->cr_audit.ai_auid))); } /* ARGSUSED */ int sys_setauid(struct thread *td, struct setauid_args *uap) { struct ucred *newcred, *oldcred; au_id_t id; int error; if (jailed(td->td_ucred)) return (ENOSYS); error = copyin(uap->auid, &id, sizeof(id)); if (error) return (error); audit_arg_auid(id); newcred = crget(); PROC_LOCK(td->td_proc); oldcred = td->td_proc->p_ucred; crcopy(newcred, oldcred); #ifdef MAC error = mac_cred_check_setauid(oldcred, id); if (error) goto fail; #endif error = priv_check_cred(oldcred, PRIV_AUDIT_SETAUDIT, 0); if (error) goto fail; newcred->cr_audit.ai_auid = id; proc_set_cred(td->td_proc, newcred); PROC_UNLOCK(td->td_proc); crfree(oldcred); return (0); fail: PROC_UNLOCK(td->td_proc); crfree(newcred); return (error); } /* * System calls to get and set process audit information.
*/ /* ARGSUSED */ int sys_getaudit(struct thread *td, struct getaudit_args *uap) { struct auditinfo ai; struct ucred *cred; int error; cred = td->td_ucred; if (jailed(cred)) return (ENOSYS); error = priv_check(td, PRIV_AUDIT_GETAUDIT); if (error) return (error); if (cred->cr_audit.ai_termid.at_type == AU_IPv6) return (E2BIG); bzero(&ai, sizeof(ai)); ai.ai_auid = cred->cr_audit.ai_auid; ai.ai_mask = cred->cr_audit.ai_mask; ai.ai_asid = cred->cr_audit.ai_asid; ai.ai_termid.machine = cred->cr_audit.ai_termid.at_addr[0]; ai.ai_termid.port = cred->cr_audit.ai_termid.at_port; return (copyout(&ai, uap->auditinfo, sizeof(ai))); } /* ARGSUSED */ int sys_setaudit(struct thread *td, struct setaudit_args *uap) { struct ucred *newcred, *oldcred; struct auditinfo ai; int error; if (jailed(td->td_ucred)) return (ENOSYS); error = copyin(uap->auditinfo, &ai, sizeof(ai)); if (error) return (error); audit_arg_auditinfo(&ai); newcred = crget(); PROC_LOCK(td->td_proc); oldcred = td->td_proc->p_ucred; crcopy(newcred, oldcred); #ifdef MAC error = mac_cred_check_setaudit(oldcred, &ai); if (error) goto fail; #endif error = priv_check_cred(oldcred, PRIV_AUDIT_SETAUDIT, 0); if (error) goto fail; bzero(&newcred->cr_audit, sizeof(newcred->cr_audit)); newcred->cr_audit.ai_auid = ai.ai_auid; newcred->cr_audit.ai_mask = ai.ai_mask; newcred->cr_audit.ai_asid = ai.ai_asid; newcred->cr_audit.ai_termid.at_addr[0] = ai.ai_termid.machine; newcred->cr_audit.ai_termid.at_port = ai.ai_termid.port; newcred->cr_audit.ai_termid.at_type = AU_IPv4; proc_set_cred(td->td_proc, newcred); PROC_UNLOCK(td->td_proc); crfree(oldcred); return (0); fail: PROC_UNLOCK(td->td_proc); crfree(newcred); return (error); } /* ARGSUSED */ int sys_getaudit_addr(struct thread *td, struct getaudit_addr_args *uap) { int error; if (jailed(td->td_ucred)) return (ENOSYS); if (uap->length < sizeof(*uap->auditinfo_addr)) return (EOVERFLOW); error = priv_check(td, PRIV_AUDIT_GETAUDIT); if (error) return (error); return (copyout(&td->td_ucred->cr_audit, uap->auditinfo_addr, sizeof(*uap->auditinfo_addr))); } /* ARGSUSED */ int sys_setaudit_addr(struct thread *td, struct setaudit_addr_args *uap) { struct ucred *newcred, *oldcred; struct auditinfo_addr aia; int error; if (jailed(td->td_ucred)) return (ENOSYS); error = copyin(uap->auditinfo_addr, &aia, sizeof(aia)); if (error) return (error); audit_arg_auditinfo_addr(&aia); if (aia.ai_termid.at_type != AU_IPv6 && aia.ai_termid.at_type != AU_IPv4) return (EINVAL); newcred = crget(); PROC_LOCK(td->td_proc); oldcred = td->td_proc->p_ucred; crcopy(newcred, oldcred); #ifdef MAC error = mac_cred_check_setaudit_addr(oldcred, &aia); if (error) goto fail; #endif error = priv_check_cred(oldcred, PRIV_AUDIT_SETAUDIT, 0); if (error) goto fail; newcred->cr_audit = aia; proc_set_cred(td->td_proc, newcred); PROC_UNLOCK(td->td_proc); crfree(oldcred); return (0); fail: PROC_UNLOCK(td->td_proc); crfree(newcred); return (error); } /* * Syscall to manage audit files. */ /* ARGSUSED */ int sys_auditctl(struct thread *td, struct auditctl_args *uap) { struct nameidata nd; struct ucred *cred; struct vnode *vp; int error = 0; int flags; if (jailed(td->td_ucred)) return (ENOSYS); error = priv_check(td, PRIV_AUDIT_CONTROL); if (error) return (error); vp = NULL; cred = NULL; /* * If a path is specified, open the replacement vnode, perform * validity checks, and grab another reference to the current * credential. * * On Darwin, a NULL path argument is also used to disable audit. 
*/ if (uap->path == NULL) return (EINVAL); NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE, uap->path, td); flags = AUDIT_OPEN_FLAGS; error = vn_open(&nd, &flags, 0, NULL); if (error) return (error); vp = nd.ni_vp; #ifdef MAC error = mac_system_check_auditctl(td->td_ucred, vp); VOP_UNLOCK(vp, 0); if (error) { vn_close(vp, AUDIT_CLOSE_FLAGS, td->td_ucred, td); return (error); } #else VOP_UNLOCK(vp, 0); #endif NDFREE(&nd, NDF_ONLY_PNBUF); if (vp->v_type != VREG) { vn_close(vp, AUDIT_CLOSE_FLAGS, td->td_ucred, td); return (EINVAL); } cred = td->td_ucred; crhold(cred); /* * XXXAUDIT: Should audit_suspended actually be cleared by * audit_worker? */ audit_suspended = 0; audit_rotate_vnode(cred, vp); return (error); } #else /* !AUDIT */ int sys_audit(struct thread *td, struct audit_args *uap) { return (ENOSYS); } int sys_auditon(struct thread *td, struct auditon_args *uap) { return (ENOSYS); } int sys_getauid(struct thread *td, struct getauid_args *uap) { return (ENOSYS); } int sys_setauid(struct thread *td, struct setauid_args *uap) { return (ENOSYS); } int sys_getaudit(struct thread *td, struct getaudit_args *uap) { return (ENOSYS); } int sys_setaudit(struct thread *td, struct setaudit_args *uap) { return (ENOSYS); } int sys_getaudit_addr(struct thread *td, struct getaudit_addr_args *uap) { return (ENOSYS); } int sys_setaudit_addr(struct thread *td, struct setaudit_addr_args *uap) { return (ENOSYS); } int sys_auditctl(struct thread *td, struct auditctl_args *uap) { return (ENOSYS); } #endif /* AUDIT */ Index: projects/clang390-import/sys/sparc64/sparc64/pmap.c =================================================================== --- projects/clang390-import/sys/sparc64/sparc64/pmap.c (revision 305686) +++ projects/clang390-import/sys/sparc64/sparc64/pmap.c (revision 305687) @@ -1,2334 +1,2328 @@ /*- * Copyright (c) 1991 Regents of the University of California. * All rights reserved. * Copyright (c) 1994 John S. Dyson * All rights reserved. * Copyright (c) 1994 David Greenman * All rights reserved. * * This code is derived from software contributed to Berkeley by * the Systems Programming Group of the University of Utah Computer * Science Department and William Jolitz of UUNET Technologies Inc. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.c 7.7 (Berkeley) 5/12/91 */ #include __FBSDID("$FreeBSD$"); /* * Manages physical address maps. * * Since the information managed by this module is also stored by the * logical address mapping module, this module may throw away valid virtual * to physical mappings at almost any time. However, invalidations of * mappings must be done as requested. * * In order to cope with hardware architectures which make virtual to * physical map invalidates expensive, this module may delay invalidate * reduced protection operations until such time as they are actually * necessary. This module is given full information as to which processors * are currently using which maps, and to when physical maps must be made * correct. */ #include "opt_kstack_pages.h" #include "opt_pmap.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Virtual address of message buffer */ struct msgbuf *msgbufp; /* * Map of physical memory regions */ vm_paddr_t phys_avail[128]; static struct ofw_mem_region mra[128]; struct ofw_mem_region sparc64_memreg[128]; int sparc64_nmemreg; static struct ofw_map translations[128]; static int translations_size; static vm_offset_t pmap_idle_map; static vm_offset_t pmap_temp_map_1; static vm_offset_t pmap_temp_map_2; /* * First and last available kernel virtual addresses */ vm_offset_t virtual_avail; vm_offset_t virtual_end; vm_offset_t kernel_vm_end; vm_offset_t vm_max_kernel_address; /* * Kernel pmap */ struct pmap kernel_pmap_store; struct rwlock_padalign tte_list_global_lock; /* * Allocate physical memory for use in pmap_bootstrap. */ static vm_paddr_t pmap_bootstrap_alloc(vm_size_t size, uint32_t colors); static void pmap_bootstrap_set_tte(struct tte *tp, u_long vpn, u_long data); static void pmap_cache_remove(vm_page_t m, vm_offset_t va); static int pmap_protect_tte(struct pmap *pm1, struct pmap *pm2, struct tte *tp, vm_offset_t va); static int pmap_unwire_tte(pmap_t pm, pmap_t pm2, struct tte *tp, vm_offset_t va); static void pmap_init_qpages(void); /* * Map the given physical page at the specified virtual address in the * target pmap with the protection requested. If specified the page * will be wired down. * * The page queues and pmap must be locked.
*/ static int pmap_enter_locked(pmap_t pm, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind); extern int tl1_dmmu_miss_direct_patch_tsb_phys_1[]; extern int tl1_dmmu_miss_direct_patch_tsb_phys_end_1[]; extern int tl1_dmmu_miss_patch_asi_1[]; extern int tl1_dmmu_miss_patch_quad_ldd_1[]; extern int tl1_dmmu_miss_patch_tsb_1[]; extern int tl1_dmmu_miss_patch_tsb_2[]; extern int tl1_dmmu_miss_patch_tsb_mask_1[]; extern int tl1_dmmu_miss_patch_tsb_mask_2[]; extern int tl1_dmmu_prot_patch_asi_1[]; extern int tl1_dmmu_prot_patch_quad_ldd_1[]; extern int tl1_dmmu_prot_patch_tsb_1[]; extern int tl1_dmmu_prot_patch_tsb_2[]; extern int tl1_dmmu_prot_patch_tsb_mask_1[]; extern int tl1_dmmu_prot_patch_tsb_mask_2[]; extern int tl1_immu_miss_patch_asi_1[]; extern int tl1_immu_miss_patch_quad_ldd_1[]; extern int tl1_immu_miss_patch_tsb_1[]; extern int tl1_immu_miss_patch_tsb_2[]; extern int tl1_immu_miss_patch_tsb_mask_1[]; extern int tl1_immu_miss_patch_tsb_mask_2[]; /* * If a user pmap is processed with pmap_remove and the resident count * drops to 0, there are no more pages to remove, so we need not * continue. */ #define PMAP_REMOVE_DONE(pm) \ ((pm) != kernel_pmap && (pm)->pm_stats.resident_count == 0) /* * The threshold (in bytes) above which tsb_foreach() is used in pmap_remove() * and pmap_protect() instead of trying each virtual address. */ #define PMAP_TSB_THRESH ((TSB_SIZE / 2) * PAGE_SIZE) SYSCTL_NODE(_debug, OID_AUTO, pmap_stats, CTLFLAG_RD, 0, ""); PMAP_STATS_VAR(pmap_nenter); PMAP_STATS_VAR(pmap_nenter_update); PMAP_STATS_VAR(pmap_nenter_replace); PMAP_STATS_VAR(pmap_nenter_new); PMAP_STATS_VAR(pmap_nkenter); PMAP_STATS_VAR(pmap_nkenter_oc); PMAP_STATS_VAR(pmap_nkenter_stupid); PMAP_STATS_VAR(pmap_nkremove); PMAP_STATS_VAR(pmap_nqenter); PMAP_STATS_VAR(pmap_nqremove); PMAP_STATS_VAR(pmap_ncache_enter); PMAP_STATS_VAR(pmap_ncache_enter_c); PMAP_STATS_VAR(pmap_ncache_enter_oc); PMAP_STATS_VAR(pmap_ncache_enter_cc); PMAP_STATS_VAR(pmap_ncache_enter_coc); PMAP_STATS_VAR(pmap_ncache_enter_nc); PMAP_STATS_VAR(pmap_ncache_enter_cnc); PMAP_STATS_VAR(pmap_ncache_remove); PMAP_STATS_VAR(pmap_ncache_remove_c); PMAP_STATS_VAR(pmap_ncache_remove_oc); PMAP_STATS_VAR(pmap_ncache_remove_cc); PMAP_STATS_VAR(pmap_ncache_remove_coc); PMAP_STATS_VAR(pmap_ncache_remove_nc); PMAP_STATS_VAR(pmap_nzero_page); PMAP_STATS_VAR(pmap_nzero_page_c); PMAP_STATS_VAR(pmap_nzero_page_oc); PMAP_STATS_VAR(pmap_nzero_page_nc); PMAP_STATS_VAR(pmap_nzero_page_area); PMAP_STATS_VAR(pmap_nzero_page_area_c); PMAP_STATS_VAR(pmap_nzero_page_area_oc); PMAP_STATS_VAR(pmap_nzero_page_area_nc); PMAP_STATS_VAR(pmap_ncopy_page); PMAP_STATS_VAR(pmap_ncopy_page_c); PMAP_STATS_VAR(pmap_ncopy_page_oc); PMAP_STATS_VAR(pmap_ncopy_page_nc); PMAP_STATS_VAR(pmap_ncopy_page_dc); PMAP_STATS_VAR(pmap_ncopy_page_doc); PMAP_STATS_VAR(pmap_ncopy_page_sc); PMAP_STATS_VAR(pmap_ncopy_page_soc); PMAP_STATS_VAR(pmap_nnew_thread); PMAP_STATS_VAR(pmap_nnew_thread_oc); static inline u_long dtlb_get_data(u_int tlb, u_int slot); /* * Quick sort callouts for comparing memory regions */ static int mr_cmp(const void *a, const void *b); static int om_cmp(const void *a, const void *b); static int mr_cmp(const void *a, const void *b) { const struct ofw_mem_region *mra; const struct ofw_mem_region *mrb; mra = a; mrb = b; if (mra->mr_start < mrb->mr_start) return (-1); else if (mra->mr_start > mrb->mr_start) return (1); else return (0); } static int om_cmp(const void *a, const void *b) { const struct ofw_map *oma; const struct
ofw_map *omb; oma = a; omb = b; if (oma->om_start < omb->om_start) return (-1); else if (oma->om_start > omb->om_start) return (1); else return (0); } static inline u_long dtlb_get_data(u_int tlb, u_int slot) { u_long data; register_t s; slot = TLB_DAR_SLOT(tlb, slot); /* * We read ASI_DTLB_DATA_ACCESS_REG twice back-to-back in order to * work around errata of USIII and beyond. */ s = intr_disable(); (void)ldxa(slot, ASI_DTLB_DATA_ACCESS_REG); data = ldxa(slot, ASI_DTLB_DATA_ACCESS_REG); intr_restore(s); return (data); } /* * Bootstrap the system enough to run with virtual memory. */ void pmap_bootstrap(u_int cpu_impl) { struct pmap *pm; struct tte *tp; vm_offset_t off; vm_offset_t va; vm_paddr_t pa; vm_size_t physsz; vm_size_t virtsz; u_long data; u_long vpn; phandle_t pmem; phandle_t vmem; u_int dtlb_slots_avail; int i; int j; int sz; uint32_t asi; uint32_t colors; uint32_t ldd; /* * Set the kernel context. */ pmap_set_kctx(); colors = dcache_color_ignore != 0 ? 1 : DCACHE_COLORS; /* * Find out what physical memory is available from the PROM and * initialize the phys_avail array. This must be done before * pmap_bootstrap_alloc is called. */ if ((pmem = OF_finddevice("/memory")) == -1) OF_panic("%s: finddevice /memory", __func__); if ((sz = OF_getproplen(pmem, "available")) == -1) OF_panic("%s: getproplen /memory/available", __func__); if (sizeof(phys_avail) < sz) OF_panic("%s: phys_avail too small", __func__); if (sizeof(mra) < sz) OF_panic("%s: mra too small", __func__); bzero(mra, sz); if (OF_getprop(pmem, "available", mra, sz) == -1) OF_panic("%s: getprop /memory/available", __func__); sz /= sizeof(*mra); #ifdef DIAGNOSTIC OF_printf("pmap_bootstrap: physical memory\n"); #endif qsort(mra, sz, sizeof (*mra), mr_cmp); physsz = 0; getenv_quad("hw.physmem", &physmem); physmem = btoc(physmem); for (i = 0, j = 0; i < sz; i++, j += 2) { #ifdef DIAGNOSTIC OF_printf("start=%#lx size=%#lx\n", mra[i].mr_start, mra[i].mr_size); #endif if (physmem != 0 && btoc(physsz + mra[i].mr_size) >= physmem) { if (btoc(physsz) < physmem) { phys_avail[j] = mra[i].mr_start; phys_avail[j + 1] = mra[i].mr_start + (ctob(physmem) - physsz); physsz = ctob(physmem); } break; } phys_avail[j] = mra[i].mr_start; phys_avail[j + 1] = mra[i].mr_start + mra[i].mr_size; physsz += mra[i].mr_size; } physmem = btoc(physsz); /* * Calculate the size of kernel virtual memory, and the size and mask * for the kernel TSB based on the physical memory size but limited * by the number of dTLB slots available for locked entries if we have * to lock the TSB in the TLB (given that for spitfire-class CPUs all * of the dt64 slots can hold locked entries but there is no large * dTLB for unlocked ones, we don't use more than half of it for the * TSB). * Note that for reasons unknown OpenSolaris doesn't take advantage of * ASI_ATOMIC_QUAD_LDD_PHYS on UltraSPARC-III. However, given that no * public documentation is available for these, the latter just might * not support it, yet. */ if (cpu_impl == CPU_IMPL_SPARC64V || cpu_impl >= CPU_IMPL_ULTRASPARCIIIp) { tsb_kernel_ldd_phys = 1; virtsz = roundup(physsz * 5 / 3, PAGE_SIZE_4M << (PAGE_SHIFT - TTE_SHIFT)); } else { dtlb_slots_avail = 0; for (i = 0; i < dtlb_slots; i++) { data = dtlb_get_data(cpu_impl == CPU_IMPL_ULTRASPARCIII ?
TLB_DAR_T16 : TLB_DAR_T32, i); if ((data & (TD_V | TD_L)) != (TD_V | TD_L)) dtlb_slots_avail++; } #ifdef SMP dtlb_slots_avail -= PCPU_PAGES; #endif if (cpu_impl >= CPU_IMPL_ULTRASPARCI && cpu_impl < CPU_IMPL_ULTRASPARCIII) dtlb_slots_avail /= 2; virtsz = roundup(physsz, PAGE_SIZE_4M << (PAGE_SHIFT - TTE_SHIFT)); virtsz = MIN(virtsz, (dtlb_slots_avail * PAGE_SIZE_4M) << (PAGE_SHIFT - TTE_SHIFT)); } vm_max_kernel_address = VM_MIN_KERNEL_ADDRESS + virtsz; tsb_kernel_size = virtsz >> (PAGE_SHIFT - TTE_SHIFT); tsb_kernel_mask = (tsb_kernel_size >> TTE_SHIFT) - 1; /* * Allocate the kernel TSB and lock it in the TLB if necessary. */ pa = pmap_bootstrap_alloc(tsb_kernel_size, colors); if (pa & PAGE_MASK_4M) OF_panic("%s: TSB unaligned", __func__); tsb_kernel_phys = pa; if (tsb_kernel_ldd_phys == 0) { tsb_kernel = (struct tte *)(VM_MIN_KERNEL_ADDRESS - tsb_kernel_size); pmap_map_tsb(); bzero(tsb_kernel, tsb_kernel_size); } else { tsb_kernel = (struct tte *)TLB_PHYS_TO_DIRECT(tsb_kernel_phys); aszero(ASI_PHYS_USE_EC, tsb_kernel_phys, tsb_kernel_size); } /* * Allocate and map the dynamic per-CPU area for the BSP. */ pa = pmap_bootstrap_alloc(DPCPU_SIZE, colors); dpcpu0 = (void *)TLB_PHYS_TO_DIRECT(pa); /* * Allocate and map the message buffer. */ pa = pmap_bootstrap_alloc(msgbufsize, colors); msgbufp = (struct msgbuf *)TLB_PHYS_TO_DIRECT(pa); /* * Patch the TSB addresses and mask as well as the ASIs used to load * it into the trap table. */ #define LDDA_R_I_R(rd, imm_asi, rs1, rs2) \ (EIF_OP(IOP_LDST) | EIF_F3_RD(rd) | EIF_F3_OP3(INS3_LDDA) | \ EIF_F3_RS1(rs1) | EIF_F3_I(0) | EIF_F3_IMM_ASI(imm_asi) | \ EIF_F3_RS2(rs2)) #define OR_R_I_R(rd, imm13, rs1) \ (EIF_OP(IOP_MISC) | EIF_F3_RD(rd) | EIF_F3_OP3(INS2_OR) | \ EIF_F3_RS1(rs1) | EIF_F3_I(1) | EIF_IMM(imm13, 13)) #define SETHI(rd, imm22) \ (EIF_OP(IOP_FORM2) | EIF_F2_RD(rd) | EIF_F2_OP2(INS0_SETHI) | \ EIF_IMM((imm22) >> 10, 22)) #define WR_R_I(rd, imm13, rs1) \ (EIF_OP(IOP_MISC) | EIF_F3_RD(rd) | EIF_F3_OP3(INS2_WR) | \ EIF_F3_RS1(rs1) | EIF_F3_I(1) | EIF_IMM(imm13, 13)) #define PATCH_ASI(addr, asi) do { \ if (addr[0] != WR_R_I(IF_F3_RD(addr[0]), 0x0, \ IF_F3_RS1(addr[0]))) \ OF_panic("%s: patched instructions have changed", \ __func__); \ addr[0] |= EIF_IMM((asi), 13); \ flush(addr); \ } while (0) #define PATCH_LDD(addr, asi) do { \ if (addr[0] != LDDA_R_I_R(IF_F3_RD(addr[0]), 0x0, \ IF_F3_RS1(addr[0]), IF_F3_RS2(addr[0]))) \ OF_panic("%s: patched instructions have changed", \ __func__); \ addr[0] |= EIF_F3_IMM_ASI(asi); \ flush(addr); \ } while (0) #define PATCH_TSB(addr, val) do { \ if (addr[0] != SETHI(IF_F2_RD(addr[0]), 0x0) || \ addr[1] != OR_R_I_R(IF_F3_RD(addr[1]), 0x0, \ IF_F3_RS1(addr[1])) || \ addr[3] != SETHI(IF_F2_RD(addr[3]), 0x0)) \ OF_panic("%s: patched instructions have changed", \ __func__); \ addr[0] |= EIF_IMM((val) >> 42, 22); \ addr[1] |= EIF_IMM((val) >> 32, 10); \ addr[3] |= EIF_IMM((val) >> 10, 22); \ flush(addr); \ flush(addr + 1); \ flush(addr + 3); \ } while (0) #define PATCH_TSB_MASK(addr, val) do { \ if (addr[0] != SETHI(IF_F2_RD(addr[0]), 0x0) || \ addr[1] != OR_R_I_R(IF_F3_RD(addr[1]), 0x0, \ IF_F3_RS1(addr[1]))) \ OF_panic("%s: patched instructions have changed", \ __func__); \ addr[0] |= EIF_IMM((val) >> 10, 22); \ addr[1] |= EIF_IMM((val), 10); \ flush(addr); \ flush(addr + 1); \ } while (0) if (tsb_kernel_ldd_phys == 0) { asi = ASI_N; ldd = ASI_NUCLEUS_QUAD_LDD; off = (vm_offset_t)tsb_kernel; } else { asi = ASI_PHYS_USE_EC; ldd = ASI_ATOMIC_QUAD_LDD_PHYS; off = (vm_offset_t)tsb_kernel_phys; } 
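/* * Patch the instruction and data MMU miss and protection handlers in * the trap table with the ASI, quad-load variant and TSB address/mask * selected above; each PATCH_* macro flush()es the instruction it * rewrites so the I-cache stays coherent with the modified text. */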
PATCH_TSB(tl1_dmmu_miss_direct_patch_tsb_phys_1, tsb_kernel_phys); PATCH_TSB(tl1_dmmu_miss_direct_patch_tsb_phys_end_1, tsb_kernel_phys + tsb_kernel_size - 1); PATCH_ASI(tl1_dmmu_miss_patch_asi_1, asi); PATCH_LDD(tl1_dmmu_miss_patch_quad_ldd_1, ldd); PATCH_TSB(tl1_dmmu_miss_patch_tsb_1, off); PATCH_TSB(tl1_dmmu_miss_patch_tsb_2, off); PATCH_TSB_MASK(tl1_dmmu_miss_patch_tsb_mask_1, tsb_kernel_mask); PATCH_TSB_MASK(tl1_dmmu_miss_patch_tsb_mask_2, tsb_kernel_mask); PATCH_ASI(tl1_dmmu_prot_patch_asi_1, asi); PATCH_LDD(tl1_dmmu_prot_patch_quad_ldd_1, ldd); PATCH_TSB(tl1_dmmu_prot_patch_tsb_1, off); PATCH_TSB(tl1_dmmu_prot_patch_tsb_2, off); PATCH_TSB_MASK(tl1_dmmu_prot_patch_tsb_mask_1, tsb_kernel_mask); PATCH_TSB_MASK(tl1_dmmu_prot_patch_tsb_mask_2, tsb_kernel_mask); PATCH_ASI(tl1_immu_miss_patch_asi_1, asi); PATCH_LDD(tl1_immu_miss_patch_quad_ldd_1, ldd); PATCH_TSB(tl1_immu_miss_patch_tsb_1, off); PATCH_TSB(tl1_immu_miss_patch_tsb_2, off); PATCH_TSB_MASK(tl1_immu_miss_patch_tsb_mask_1, tsb_kernel_mask); PATCH_TSB_MASK(tl1_immu_miss_patch_tsb_mask_2, tsb_kernel_mask); /* * Enter fake 8k pages for the 4MB kernel pages, so that * pmap_kextract() will work for them. */ for (i = 0; i < kernel_tlb_slots; i++) { pa = kernel_tlbs[i].te_pa; va = kernel_tlbs[i].te_va; for (off = 0; off < PAGE_SIZE_4M; off += PAGE_SIZE) { tp = tsb_kvtotte(va + off); vpn = TV_VPN(va + off, TS_8K); data = TD_V | TD_8K | TD_PA(pa + off) | TD_REF | TD_SW | TD_CP | TD_CV | TD_P | TD_W; pmap_bootstrap_set_tte(tp, vpn, data); } } /* * Set the start and end of KVA. The kernel is loaded starting * at the first available 4MB super page, so we advance to the * end of the last one used for it. */ virtual_avail = KERNBASE + kernel_tlb_slots * PAGE_SIZE_4M; virtual_end = vm_max_kernel_address; kernel_vm_end = vm_max_kernel_address; /* * Allocate kva space for temporary mappings. */ pmap_idle_map = virtual_avail; virtual_avail += PAGE_SIZE * colors; pmap_temp_map_1 = virtual_avail; virtual_avail += PAGE_SIZE * colors; pmap_temp_map_2 = virtual_avail; virtual_avail += PAGE_SIZE * colors; /* * Allocate a kernel stack with guard page for thread0 and map it * into the kernel TSB. We must ensure that the virtual address is * colored properly for corresponding CPUs, since we're allocating * from phys_avail so the memory won't have an associated vm_page_t. */ pa = pmap_bootstrap_alloc(KSTACK_PAGES * PAGE_SIZE, colors); kstack0_phys = pa; virtual_avail += roundup(KSTACK_GUARD_PAGES, colors) * PAGE_SIZE; kstack0 = virtual_avail; virtual_avail += roundup(KSTACK_PAGES, colors) * PAGE_SIZE; if (dcache_color_ignore == 0) KASSERT(DCACHE_COLOR(kstack0) == DCACHE_COLOR(kstack0_phys), ("pmap_bootstrap: kstack0 miscolored")); for (i = 0; i < KSTACK_PAGES; i++) { pa = kstack0_phys + i * PAGE_SIZE; va = kstack0 + i * PAGE_SIZE; tp = tsb_kvtotte(va); vpn = TV_VPN(va, TS_8K); data = TD_V | TD_8K | TD_PA(pa) | TD_REF | TD_SW | TD_CP | TD_CV | TD_P | TD_W; pmap_bootstrap_set_tte(tp, vpn, data); } /* * Calculate the last available physical address. */ for (i = 0; phys_avail[i + 2] != 0; i += 2) ; Maxmem = sparc64_btop(phys_avail[i + 1]); /* * Add the PROM mappings to the kernel TSB. 
*/ if ((vmem = OF_finddevice("/virtual-memory")) == -1) OF_panic("%s: finddevice /virtual-memory", __func__); if ((sz = OF_getproplen(vmem, "translations")) == -1) OF_panic("%s: getproplen translations", __func__); if (sizeof(translations) < sz) OF_panic("%s: translations too small", __func__); bzero(translations, sz); if (OF_getprop(vmem, "translations", translations, sz) == -1) OF_panic("%s: getprop /virtual-memory/translations", __func__); sz /= sizeof(*translations); translations_size = sz; #ifdef DIAGNOSTIC OF_printf("pmap_bootstrap: translations\n"); #endif qsort(translations, sz, sizeof (*translations), om_cmp); for (i = 0; i < sz; i++) { #ifdef DIAGNOSTIC OF_printf("translation: start=%#lx size=%#lx tte=%#lx\n", translations[i].om_start, translations[i].om_size, translations[i].om_tte); #endif if ((translations[i].om_tte & TD_V) == 0) continue; if (translations[i].om_start < VM_MIN_PROM_ADDRESS || translations[i].om_start > VM_MAX_PROM_ADDRESS) continue; for (off = 0; off < translations[i].om_size; off += PAGE_SIZE) { va = translations[i].om_start + off; tp = tsb_kvtotte(va); vpn = TV_VPN(va, TS_8K); data = ((translations[i].om_tte & ~((TD_SOFT2_MASK << TD_SOFT2_SHIFT) | (cpu_impl >= CPU_IMPL_ULTRASPARCI && cpu_impl < CPU_IMPL_ULTRASPARCIII ? (TD_DIAG_SF_MASK << TD_DIAG_SF_SHIFT) : (TD_RSVD_CH_MASK << TD_RSVD_CH_SHIFT)) | (TD_SOFT_MASK << TD_SOFT_SHIFT))) | TD_EXEC) + off; pmap_bootstrap_set_tte(tp, vpn, data); } } /* * Get the available physical memory ranges from /memory/reg. These * are only used for kernel dumps, but it may not be wise to do PROM * calls in that situation. */ if ((sz = OF_getproplen(pmem, "reg")) == -1) OF_panic("%s: getproplen /memory/reg", __func__); if (sizeof(sparc64_memreg) < sz) OF_panic("%s: sparc64_memreg too small", __func__); if (OF_getprop(pmem, "reg", sparc64_memreg, sz) == -1) OF_panic("%s: getprop /memory/reg", __func__); sparc64_nmemreg = sz / sizeof(*sparc64_memreg); /* * Initialize the kernel pmap (which is statically allocated). */ pm = kernel_pmap; PMAP_LOCK_INIT(pm); for (i = 0; i < MAXCPU; i++) pm->pm_context[i] = TLB_CTX_KERNEL; CPU_FILL(&pm->pm_active); /* * Initialize the global tte list lock, which is more commonly * known as the pmap pv global lock. */ rw_init(&tte_list_global_lock, "pmap pv global"); /* * Flush all non-locked TLB entries possibly left over by the * firmware. */ tlb_flush_nonlocked(); } static void pmap_init_qpages(void) { struct pcpu *pc; int i; if (dcache_color_ignore != 0) return; CPU_FOREACH(i) { pc = pcpu_find(i); pc->pc_qmap_addr = kva_alloc(PAGE_SIZE * DCACHE_COLORS); if (pc->pc_qmap_addr == 0) panic("pmap_init_qpages: unable to allocate KVA"); } } SYSINIT(qpages_init, SI_SUB_CPU, SI_ORDER_ANY, pmap_init_qpages, NULL); /* * Map the 4MB kernel TSB pages. */ void pmap_map_tsb(void) { vm_offset_t va; vm_paddr_t pa; u_long data; int i; for (i = 0; i < tsb_kernel_size; i += PAGE_SIZE_4M) { va = (vm_offset_t)tsb_kernel + i; pa = tsb_kernel_phys + i; data = TD_V | TD_4M | TD_PA(pa) | TD_L | TD_CP | TD_CV | TD_P | TD_W; stxa(AA_DMMU_TAR, ASI_DMMU, TLB_TAR_VA(va) | TLB_TAR_CTX(TLB_CTX_KERNEL)); stxa_sync(0, ASI_DTLB_DATA_IN_REG, data); } } /* * Set the secondary context to be the kernel context (needed for FP block * operations in the kernel). */ void pmap_set_kctx(void) { stxa(AA_DMMU_SCXR, ASI_DMMU, (ldxa(AA_DMMU_SCXR, ASI_DMMU) & TLB_CXR_PGSZ_MASK) | TLB_CTX_KERNEL); flush(KERNBASE); } /* * Allocate a physical page of memory directly from the phys_avail map. 
* Can only be called from pmap_bootstrap before avail start and end are * calculated. */ static vm_paddr_t pmap_bootstrap_alloc(vm_size_t size, uint32_t colors) { vm_paddr_t pa; int i; size = roundup(size, PAGE_SIZE * colors); for (i = 0; phys_avail[i + 1] != 0; i += 2) { if (phys_avail[i + 1] - phys_avail[i] < size) continue; pa = phys_avail[i]; phys_avail[i] += size; return (pa); } OF_panic("%s: no suitable region found", __func__); } /* * Set a TTE. This function is intended as a helper when tsb_kernel is * direct-mapped but we haven't taken over the trap table, yet, as it's the * case when we are taking advantage of ASI_ATOMIC_QUAD_LDD_PHYS to access * the kernel TSB. */ void pmap_bootstrap_set_tte(struct tte *tp, u_long vpn, u_long data) { if (tsb_kernel_ldd_phys == 0) { tp->tte_vpn = vpn; tp->tte_data = data; } else { stxa((vm_paddr_t)tp + offsetof(struct tte, tte_vpn), ASI_PHYS_USE_EC, vpn); stxa((vm_paddr_t)tp + offsetof(struct tte, tte_data), ASI_PHYS_USE_EC, data); } } /* * Initialize a vm_page's machine-dependent fields. */ void pmap_page_init(vm_page_t m) { TAILQ_INIT(&m->md.tte_list); m->md.color = DCACHE_COLOR(VM_PAGE_TO_PHYS(m)); m->md.pmap = NULL; } /* * Initialize the pmap module. */ void pmap_init(void) { vm_offset_t addr; vm_size_t size; int result; int i; for (i = 0; i < translations_size; i++) { addr = translations[i].om_start; size = translations[i].om_size; if ((translations[i].om_tte & TD_V) == 0) continue; if (addr < VM_MIN_PROM_ADDRESS || addr > VM_MAX_PROM_ADDRESS) continue; result = vm_map_find(kernel_map, NULL, 0, &addr, size, 0, VMFS_NO_SPACE, VM_PROT_ALL, VM_PROT_ALL, MAP_NOFAULT); if (result != KERN_SUCCESS || addr != translations[i].om_start) panic("pmap_init: vm_map_find"); } } /* * Extract the physical page address associated with the given * map/virtual_address pair. */ vm_paddr_t pmap_extract(pmap_t pm, vm_offset_t va) { struct tte *tp; vm_paddr_t pa; if (pm == kernel_pmap) return (pmap_kextract(va)); PMAP_LOCK(pm); tp = tsb_tte_lookup(pm, va); if (tp == NULL) pa = 0; else pa = TTE_GET_PA(tp) | (va & TTE_GET_PAGE_MASK(tp)); PMAP_UNLOCK(pm); return (pa); } /* * Atomically extract and hold the physical page with the given * pmap and virtual address pair if that mapping permits the given * protection. */ vm_page_t pmap_extract_and_hold(pmap_t pm, vm_offset_t va, vm_prot_t prot) { struct tte *tp; vm_page_t m; vm_paddr_t pa; m = NULL; pa = 0; PMAP_LOCK(pm); retry: if (pm == kernel_pmap) { if (va >= VM_MIN_DIRECT_ADDRESS) { tp = NULL; m = PHYS_TO_VM_PAGE(TLB_DIRECT_TO_PHYS(va)); (void)vm_page_pa_tryrelock(pm, TLB_DIRECT_TO_PHYS(va), &pa); vm_page_hold(m); } else { tp = tsb_kvtotte(va); if ((tp->tte_data & TD_V) == 0) tp = NULL; } } else tp = tsb_tte_lookup(pm, va); if (tp != NULL && ((tp->tte_data & TD_SW) || (prot & VM_PROT_WRITE) == 0)) { if (vm_page_pa_tryrelock(pm, TTE_GET_PA(tp), &pa)) goto retry; m = PHYS_TO_VM_PAGE(TTE_GET_PA(tp)); vm_page_hold(m); } PA_UNLOCK_COND(pa); PMAP_UNLOCK(pm); return (m); } /* * Extract the physical page address associated with the given kernel virtual * address. 
*/ vm_paddr_t pmap_kextract(vm_offset_t va) { struct tte *tp; if (va >= VM_MIN_DIRECT_ADDRESS) return (TLB_DIRECT_TO_PHYS(va)); tp = tsb_kvtotte(va); if ((tp->tte_data & TD_V) == 0) return (0); return (TTE_GET_PA(tp) | (va & TTE_GET_PAGE_MASK(tp))); } int pmap_cache_enter(vm_page_t m, vm_offset_t va) { struct tte *tp; int color; rw_assert(&tte_list_global_lock, RA_WLOCKED); KASSERT((m->flags & PG_FICTITIOUS) == 0, ("pmap_cache_enter: fake page")); PMAP_STATS_INC(pmap_ncache_enter); if (dcache_color_ignore != 0) return (1); /* * Find the color for this virtual address and note the added mapping. */ color = DCACHE_COLOR(va); m->md.colors[color]++; /* * If all existing mappings have the same color, the mapping is * cacheable. */ if (m->md.color == color) { KASSERT(m->md.colors[DCACHE_OTHER_COLOR(color)] == 0, ("pmap_cache_enter: cacheable, mappings of other color")); if (m->md.color == DCACHE_COLOR(VM_PAGE_TO_PHYS(m))) PMAP_STATS_INC(pmap_ncache_enter_c); else PMAP_STATS_INC(pmap_ncache_enter_oc); return (1); } /* * If there are no mappings of the other color, and the page still has * the wrong color, this must be a new mapping. Change the color to * match the new mapping, which is cacheable. We must flush the page * from the cache now. */ if (m->md.colors[DCACHE_OTHER_COLOR(color)] == 0) { KASSERT(m->md.colors[color] == 1, ("pmap_cache_enter: changing color, not new mapping")); dcache_page_inval(VM_PAGE_TO_PHYS(m)); m->md.color = color; if (m->md.color == DCACHE_COLOR(VM_PAGE_TO_PHYS(m))) PMAP_STATS_INC(pmap_ncache_enter_cc); else PMAP_STATS_INC(pmap_ncache_enter_coc); return (1); } /* * If the mapping is already non-cacheable, just return. */ if (m->md.color == -1) { PMAP_STATS_INC(pmap_ncache_enter_nc); return (0); } PMAP_STATS_INC(pmap_ncache_enter_cnc); /* * Mark all mappings as uncacheable, flush any lines with the other * color out of the dcache, and set the color to none (-1). */ TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) { atomic_clear_long(&tp->tte_data, TD_CV); tlb_page_demap(TTE_GET_PMAP(tp), TTE_GET_VA(tp)); } dcache_page_inval(VM_PAGE_TO_PHYS(m)); m->md.color = -1; return (0); } static void pmap_cache_remove(vm_page_t m, vm_offset_t va) { struct tte *tp; int color; rw_assert(&tte_list_global_lock, RA_WLOCKED); CTR3(KTR_PMAP, "pmap_cache_remove: m=%p va=%#lx c=%d", m, va, m->md.colors[DCACHE_COLOR(va)]); KASSERT((m->flags & PG_FICTITIOUS) == 0, ("pmap_cache_remove: fake page")); PMAP_STATS_INC(pmap_ncache_remove); if (dcache_color_ignore != 0) return; KASSERT(m->md.colors[DCACHE_COLOR(va)] > 0, ("pmap_cache_remove: no mappings %d <= 0", m->md.colors[DCACHE_COLOR(va)])); /* * Find the color for this virtual address and note the removal of * the mapping. */ color = DCACHE_COLOR(va); m->md.colors[color]--; /* * If the page is cacheable, just return and keep the same color, even * if there are no longer any mappings. */ if (m->md.color != -1) { if (m->md.color == DCACHE_COLOR(VM_PAGE_TO_PHYS(m))) PMAP_STATS_INC(pmap_ncache_remove_c); else PMAP_STATS_INC(pmap_ncache_remove_oc); return; } KASSERT(m->md.colors[DCACHE_OTHER_COLOR(color)] != 0, ("pmap_cache_remove: uncacheable, no mappings of other color")); /* * If the page is not cacheable (color is -1), and the number of * mappings for this color is not zero, just return. There are * mappings of the other color still, so remain non-cacheable. */ if (m->md.colors[color] != 0) { PMAP_STATS_INC(pmap_ncache_remove_nc); return; } /* * The number of mappings for this color is now zero. 
Recache the * other colored mappings, and change the page color to the other * color. There should be no lines in the data cache for this page, * so flushing should not be needed. */ TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) { atomic_set_long(&tp->tte_data, TD_CV); tlb_page_demap(TTE_GET_PMAP(tp), TTE_GET_VA(tp)); } m->md.color = DCACHE_OTHER_COLOR(color); if (m->md.color == DCACHE_COLOR(VM_PAGE_TO_PHYS(m))) PMAP_STATS_INC(pmap_ncache_remove_cc); else PMAP_STATS_INC(pmap_ncache_remove_coc); } /* * Map a wired page into kernel virtual address space. */ void pmap_kenter(vm_offset_t va, vm_page_t m) { vm_offset_t ova; struct tte *tp; vm_page_t om; u_long data; rw_assert(&tte_list_global_lock, RA_WLOCKED); PMAP_STATS_INC(pmap_nkenter); tp = tsb_kvtotte(va); CTR4(KTR_PMAP, "pmap_kenter: va=%#lx pa=%#lx tp=%p data=%#lx", va, VM_PAGE_TO_PHYS(m), tp, tp->tte_data); if (DCACHE_COLOR(VM_PAGE_TO_PHYS(m)) != DCACHE_COLOR(va)) { CTR5(KTR_SPARE2, "pmap_kenter: off color va=%#lx pa=%#lx o=%p ot=%d pi=%#lx", va, VM_PAGE_TO_PHYS(m), m->object, m->object ? m->object->type : -1, m->pindex); PMAP_STATS_INC(pmap_nkenter_oc); } if ((tp->tte_data & TD_V) != 0) { om = PHYS_TO_VM_PAGE(TTE_GET_PA(tp)); ova = TTE_GET_VA(tp); if (m == om && va == ova) { PMAP_STATS_INC(pmap_nkenter_stupid); return; } TAILQ_REMOVE(&om->md.tte_list, tp, tte_link); pmap_cache_remove(om, ova); if (va != ova) tlb_page_demap(kernel_pmap, ova); } data = TD_V | TD_8K | VM_PAGE_TO_PHYS(m) | TD_REF | TD_SW | TD_CP | TD_P | TD_W; if (pmap_cache_enter(m, va) != 0) data |= TD_CV; tp->tte_vpn = TV_VPN(va, TS_8K); tp->tte_data = data; TAILQ_INSERT_TAIL(&m->md.tte_list, tp, tte_link); } /* * Map a wired page into kernel virtual address space. This additionally * takes a flag argument which is or'ed to the TTE data. This is used by * sparc64_bus_mem_map(). * NOTE: if the mapping is non-cacheable, it's the caller's responsibility * to flush entries that might still be in the cache, if applicable. */ void pmap_kenter_flags(vm_offset_t va, vm_paddr_t pa, u_long flags) { struct tte *tp; tp = tsb_kvtotte(va); CTR4(KTR_PMAP, "pmap_kenter_flags: va=%#lx pa=%#lx tp=%p data=%#lx", va, pa, tp, tp->tte_data); tp->tte_vpn = TV_VPN(va, TS_8K); tp->tte_data = TD_V | TD_8K | TD_PA(pa) | TD_REF | TD_P | flags; } /* * Remove a wired page from kernel virtual address space. */ void pmap_kremove(vm_offset_t va) { struct tte *tp; vm_page_t m; rw_assert(&tte_list_global_lock, RA_WLOCKED); PMAP_STATS_INC(pmap_nkremove); tp = tsb_kvtotte(va); CTR3(KTR_PMAP, "pmap_kremove: va=%#lx tp=%p data=%#lx", va, tp, tp->tte_data); if ((tp->tte_data & TD_V) == 0) return; m = PHYS_TO_VM_PAGE(TTE_GET_PA(tp)); TAILQ_REMOVE(&m->md.tte_list, tp, tte_link); pmap_cache_remove(m, va); TTE_ZERO(tp); } /* * Inverse of pmap_kenter_flags, used by bus_space_unmap(). */ void pmap_kremove_flags(vm_offset_t va) { struct tte *tp; tp = tsb_kvtotte(va); CTR3(KTR_PMAP, "pmap_kremove_flags: va=%#lx tp=%p data=%#lx", va, tp, tp->tte_data); TTE_ZERO(tp); } /* * Map a range of physical addresses into kernel virtual address space. * * The value passed in *virt is a suggested virtual address for the mapping. * Architectures which can support a direct-mapped physical to virtual region * can return the appropriate address within that region, leaving '*virt' * unchanged. */ vm_offset_t pmap_map(vm_offset_t *virt, vm_paddr_t start, vm_paddr_t end, int prot) { return (TLB_PHYS_TO_DIRECT(start)); } /* * Map a list of wired pages into kernel virtual address space. 
This is * intended for temporary mappings which do not need page modification or * references recorded. Existing mappings in the region are overwritten. */ void pmap_qenter(vm_offset_t sva, vm_page_t *m, int count) { vm_offset_t va; PMAP_STATS_INC(pmap_nqenter); va = sva; rw_wlock(&tte_list_global_lock); while (count-- > 0) { pmap_kenter(va, *m); va += PAGE_SIZE; m++; } rw_wunlock(&tte_list_global_lock); tlb_range_demap(kernel_pmap, sva, va); } /* * Remove page mappings from kernel virtual address space. Intended for * temporary mappings entered by pmap_qenter. */ void pmap_qremove(vm_offset_t sva, int count) { vm_offset_t va; PMAP_STATS_INC(pmap_nqremove); va = sva; rw_wlock(&tte_list_global_lock); while (count-- > 0) { pmap_kremove(va); va += PAGE_SIZE; } rw_wunlock(&tte_list_global_lock); tlb_range_demap(kernel_pmap, sva, va); } /* * Initialize the pmap associated with process 0. */ void pmap_pinit0(pmap_t pm) { int i; PMAP_LOCK_INIT(pm); for (i = 0; i < MAXCPU; i++) pm->pm_context[i] = TLB_CTX_KERNEL; CPU_ZERO(&pm->pm_active); pm->pm_tsb = NULL; pm->pm_tsb_obj = NULL; bzero(&pm->pm_stats, sizeof(pm->pm_stats)); } /* * Initialize a preallocated and zeroed pmap structure, such as one in a * vmspace structure. */ int pmap_pinit(pmap_t pm) { vm_page_t ma[TSB_PAGES]; vm_page_t m; int i; /* * Allocate KVA space for the TSB. */ if (pm->pm_tsb == NULL) { pm->pm_tsb = (struct tte *)kva_alloc(TSB_BSIZE); if (pm->pm_tsb == NULL) return (0); } /* * Allocate an object for it. */ if (pm->pm_tsb_obj == NULL) pm->pm_tsb_obj = vm_object_allocate(OBJT_PHYS, TSB_PAGES); for (i = 0; i < MAXCPU; i++) pm->pm_context[i] = -1; CPU_ZERO(&pm->pm_active); VM_OBJECT_WLOCK(pm->pm_tsb_obj); for (i = 0; i < TSB_PAGES; i++) { m = vm_page_grab(pm->pm_tsb_obj, i, VM_ALLOC_NOBUSY | VM_ALLOC_WIRED | VM_ALLOC_ZERO); m->valid = VM_PAGE_BITS_ALL; m->md.pmap = pm; ma[i] = m; } VM_OBJECT_WUNLOCK(pm->pm_tsb_obj); pmap_qenter((vm_offset_t)pm->pm_tsb, ma, TSB_PAGES); bzero(&pm->pm_stats, sizeof(pm->pm_stats)); return (1); } /* * Release any resources held by the given physical map. * Called when a pmap initialized by pmap_pinit is being released. * Should only be called if the map contains no valid mappings. */ void pmap_release(pmap_t pm) { vm_object_t obj; vm_page_t m; #ifdef SMP struct pcpu *pc; #endif CTR2(KTR_PMAP, "pmap_release: ctx=%#x tsb=%p", pm->pm_context[curcpu], pm->pm_tsb); KASSERT(pmap_resident_count(pm) == 0, ("pmap_release: resident pages %ld != 0", pmap_resident_count(pm))); /* * After the pmap was freed, it might be reallocated to a new process. * When switching, this might lead us to wrongly assume that we need * not switch contexts because old and new pmap pointer are equal. * Therefore, make sure that this pmap is not referenced by any PCPU * pointer any more. This could happen in two cases: * - A process that referenced the pmap is currently exiting on a CPU. * However, it is guaranteed to not switch in any more after setting * its state to PRS_ZOMBIE. * - A process that referenced this pmap ran on a CPU, but we switched * to a kernel thread, leaving the pmap pointer unchanged. 
*/ #ifdef SMP sched_pin(); STAILQ_FOREACH(pc, &cpuhead, pc_allcpu) atomic_cmpset_rel_ptr((uintptr_t *)&pc->pc_pmap, (uintptr_t)pm, (uintptr_t)NULL); sched_unpin(); #else critical_enter(); if (PCPU_GET(pmap) == pm) PCPU_SET(pmap, NULL); critical_exit(); #endif pmap_qremove((vm_offset_t)pm->pm_tsb, TSB_PAGES); obj = pm->pm_tsb_obj; VM_OBJECT_WLOCK(obj); KASSERT(obj->ref_count == 1, ("pmap_release: tsbobj ref count != 1")); while (!TAILQ_EMPTY(&obj->memq)) { m = TAILQ_FIRST(&obj->memq); m->md.pmap = NULL; m->wire_count--; atomic_subtract_int(&vm_cnt.v_wire_count, 1); vm_page_free_zero(m); } VM_OBJECT_WUNLOCK(obj); } /* * Grow the number of kernel page table entries. Unneeded. */ void pmap_growkernel(vm_offset_t addr) { panic("pmap_growkernel: can't grow kernel"); } int pmap_remove_tte(struct pmap *pm, struct pmap *pm2, struct tte *tp, vm_offset_t va) { vm_page_t m; u_long data; rw_assert(&tte_list_global_lock, RA_WLOCKED); data = atomic_readandclear_long(&tp->tte_data); if ((data & TD_FAKE) == 0) { m = PHYS_TO_VM_PAGE(TD_PA(data)); TAILQ_REMOVE(&m->md.tte_list, tp, tte_link); if ((data & TD_WIRED) != 0) pm->pm_stats.wired_count--; if ((data & TD_PV) != 0) { if ((data & TD_W) != 0) vm_page_dirty(m); if ((data & TD_REF) != 0) vm_page_aflag_set(m, PGA_REFERENCED); if (TAILQ_EMPTY(&m->md.tte_list)) vm_page_aflag_clear(m, PGA_WRITEABLE); pm->pm_stats.resident_count--; } pmap_cache_remove(m, va); } TTE_ZERO(tp); if (PMAP_REMOVE_DONE(pm)) return (0); return (1); } /* * Remove the given range of addresses from the specified map. */ void pmap_remove(pmap_t pm, vm_offset_t start, vm_offset_t end) { struct tte *tp; vm_offset_t va; CTR3(KTR_PMAP, "pmap_remove: ctx=%#lx start=%#lx end=%#lx", pm->pm_context[curcpu], start, end); if (PMAP_REMOVE_DONE(pm)) return; rw_wlock(&tte_list_global_lock); PMAP_LOCK(pm); if (end - start > PMAP_TSB_THRESH) { tsb_foreach(pm, NULL, start, end, pmap_remove_tte); tlb_context_demap(pm); } else { for (va = start; va < end; va += PAGE_SIZE) if ((tp = tsb_tte_lookup(pm, va)) != NULL && !pmap_remove_tte(pm, NULL, tp, va)) break; tlb_range_demap(pm, start, end - 1); } PMAP_UNLOCK(pm); rw_wunlock(&tte_list_global_lock); } void pmap_remove_all(vm_page_t m) { struct pmap *pm; struct tte *tpn; struct tte *tp; vm_offset_t va; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_remove_all: page %p is not managed", m)); rw_wlock(&tte_list_global_lock); for (tp = TAILQ_FIRST(&m->md.tte_list); tp != NULL; tp = tpn) { tpn = TAILQ_NEXT(tp, tte_link); if ((tp->tte_data & TD_PV) == 0) continue; pm = TTE_GET_PMAP(tp); va = TTE_GET_VA(tp); PMAP_LOCK(pm); if ((tp->tte_data & TD_WIRED) != 0) pm->pm_stats.wired_count--; if ((tp->tte_data & TD_REF) != 0) vm_page_aflag_set(m, PGA_REFERENCED); if ((tp->tte_data & TD_W) != 0) vm_page_dirty(m); tp->tte_data &= ~TD_V; tlb_page_demap(pm, va); TAILQ_REMOVE(&m->md.tte_list, tp, tte_link); pm->pm_stats.resident_count--; pmap_cache_remove(m, va); TTE_ZERO(tp); PMAP_UNLOCK(pm); } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&tte_list_global_lock); } static int pmap_protect_tte(struct pmap *pm, struct pmap *pm2, struct tte *tp, vm_offset_t va) { u_long data; vm_page_t m; PMAP_LOCK_ASSERT(pm, MA_OWNED); data = atomic_clear_long(&tp->tte_data, TD_SW | TD_W); if ((data & (TD_PV | TD_W)) == (TD_PV | TD_W)) { m = PHYS_TO_VM_PAGE(TD_PA(data)); vm_page_dirty(m); } return (1); } /* * Set the physical protection on the specified range of this map as requested. 
*/ void pmap_protect(pmap_t pm, vm_offset_t sva, vm_offset_t eva, vm_prot_t prot) { vm_offset_t va; struct tte *tp; CTR4(KTR_PMAP, "pmap_protect: ctx=%#lx sva=%#lx eva=%#lx prot=%#lx", pm->pm_context[curcpu], sva, eva, prot); if ((prot & VM_PROT_READ) == VM_PROT_NONE) { pmap_remove(pm, sva, eva); return; } if (prot & VM_PROT_WRITE) return; PMAP_LOCK(pm); if (eva - sva > PMAP_TSB_THRESH) { tsb_foreach(pm, NULL, sva, eva, pmap_protect_tte); tlb_context_demap(pm); } else { for (va = sva; va < eva; va += PAGE_SIZE) if ((tp = tsb_tte_lookup(pm, va)) != NULL) pmap_protect_tte(pm, NULL, tp, va); tlb_range_demap(pm, sva, eva - 1); } PMAP_UNLOCK(pm); } /* * Map the given physical page at the specified virtual address in the * target pmap with the protection requested. If specified the page * will be wired down. */ int pmap_enter(pmap_t pm, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind) { int rv; rw_wlock(&tte_list_global_lock); PMAP_LOCK(pm); rv = pmap_enter_locked(pm, va, m, prot, flags, psind); rw_wunlock(&tte_list_global_lock); PMAP_UNLOCK(pm); return (rv); } /* * Map the given physical page at the specified virtual address in the * target pmap with the protection requested. If specified the page * will be wired down. * * The page queues and pmap must be locked. */ static int pmap_enter_locked(pmap_t pm, vm_offset_t va, vm_page_t m, vm_prot_t prot, u_int flags, int8_t psind __unused) { struct tte *tp; vm_paddr_t pa; vm_page_t real; u_long data; boolean_t wired; rw_assert(&tte_list_global_lock, RA_WLOCKED); PMAP_LOCK_ASSERT(pm, MA_OWNED); if ((m->oflags & VPO_UNMANAGED) == 0 && !vm_page_xbusied(m)) VM_OBJECT_ASSERT_LOCKED(m->object); PMAP_STATS_INC(pmap_nenter); pa = VM_PAGE_TO_PHYS(m); wired = (flags & PMAP_ENTER_WIRED) != 0; /* * If this is a fake page from the device_pager, but it covers actual * physical memory, convert to the real backing page. */ if ((m->flags & PG_FICTITIOUS) != 0) { real = vm_phys_paddr_to_vm_page(pa); if (real != NULL) m = real; } CTR6(KTR_PMAP, "pmap_enter_locked: ctx=%p m=%p va=%#lx pa=%#lx prot=%#x wired=%d", pm->pm_context[curcpu], m, va, pa, prot, wired); /* * If there is an existing mapping, and the physical address has not * changed, it must be a protection or wiring change. */ if ((tp = tsb_tte_lookup(pm, va)) != NULL && TTE_GET_PA(tp) == pa) { CTR0(KTR_PMAP, "pmap_enter_locked: update"); PMAP_STATS_INC(pmap_nenter_update); /* * Wiring change, just update stats. */ if (wired) { if ((tp->tte_data & TD_WIRED) == 0) { tp->tte_data |= TD_WIRED; pm->pm_stats.wired_count++; } } else { if ((tp->tte_data & TD_WIRED) != 0) { tp->tte_data &= ~TD_WIRED; pm->pm_stats.wired_count--; } } /* * Save the old bits and clear the ones we're interested in. */ data = tp->tte_data; tp->tte_data &= ~(TD_EXEC | TD_SW | TD_W); /* * If we're enabling write permissions, mark the page writable; if * we're turning them off, sense modify status. */ if ((prot & VM_PROT_WRITE) != 0) { tp->tte_data |= TD_SW; if (wired) tp->tte_data |= TD_W; if ((m->oflags & VPO_UNMANAGED) == 0) vm_page_aflag_set(m, PGA_WRITEABLE); } else if ((data & TD_W) != 0) vm_page_dirty(m); /* * If we're turning on execute permissions, flush the icache. */ if ((prot & VM_PROT_EXECUTE) != 0) { if ((data & TD_EXEC) == 0) icache_page_inval(pa); tp->tte_data |= TD_EXEC; } /* * Delete the old mapping. */ tlb_page_demap(pm, TTE_GET_VA(tp)); } else { /* * If there is an existing mapping, but it's for a different * physical address, delete the old mapping.
*/ if (tp != NULL) { CTR0(KTR_PMAP, "pmap_enter_locked: replace"); PMAP_STATS_INC(pmap_nenter_replace); pmap_remove_tte(pm, NULL, tp, va); tlb_page_demap(pm, va); } else { CTR0(KTR_PMAP, "pmap_enter_locked: new"); PMAP_STATS_INC(pmap_nenter_new); } /* * Now set up the data and install the new mapping. */ data = TD_V | TD_8K | TD_PA(pa); if (pm == kernel_pmap) data |= TD_P; if ((prot & VM_PROT_WRITE) != 0) { data |= TD_SW; if ((m->oflags & VPO_UNMANAGED) == 0) vm_page_aflag_set(m, PGA_WRITEABLE); } if (prot & VM_PROT_EXECUTE) { data |= TD_EXEC; icache_page_inval(pa); } /* * If it's wired, update stats. We also don't need reference or * modify tracking for wired mappings, so set the bits now. */ if (wired) { pm->pm_stats.wired_count++; data |= TD_REF | TD_WIRED; if ((prot & VM_PROT_WRITE) != 0) data |= TD_W; } tsb_tte_enter(pm, m, va, TS_8K, data); } return (KERN_SUCCESS); } /* * Maps a sequence of resident pages belonging to the same object. * The sequence begins with the given page m_start. This page is * mapped at the given virtual address start. Each subsequent page is * mapped at a virtual address that is offset from start by the same * amount as the page is offset from m_start within the object. The * last page in the sequence is the page with the largest offset from * m_start that can be mapped at a virtual address less than the given * virtual address end. Not every virtual page between start and end * is mapped; only those for which a resident page exists with the * corresponding offset from m_start are mapped. */ void pmap_enter_object(pmap_t pm, vm_offset_t start, vm_offset_t end, vm_page_t m_start, vm_prot_t prot) { vm_page_t m; vm_pindex_t diff, psize; VM_OBJECT_ASSERT_LOCKED(m_start->object); psize = atop(end - start); m = m_start; rw_wlock(&tte_list_global_lock); PMAP_LOCK(pm); while (m != NULL && (diff = m->pindex - m_start->pindex) < psize) { pmap_enter_locked(pm, start + ptoa(diff), m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), 0, 0); m = TAILQ_NEXT(m, listq); } rw_wunlock(&tte_list_global_lock); PMAP_UNLOCK(pm); } void pmap_enter_quick(pmap_t pm, vm_offset_t va, vm_page_t m, vm_prot_t prot) { rw_wlock(&tte_list_global_lock); PMAP_LOCK(pm); pmap_enter_locked(pm, va, m, prot & (VM_PROT_READ | VM_PROT_EXECUTE), 0, 0); rw_wunlock(&tte_list_global_lock); PMAP_UNLOCK(pm); } void pmap_object_init_pt(pmap_t pm, vm_offset_t addr, vm_object_t object, vm_pindex_t pindex, vm_size_t size) { VM_OBJECT_ASSERT_WLOCKED(object); KASSERT(object->type == OBJT_DEVICE || object->type == OBJT_SG, ("pmap_object_init_pt: non-device object")); } static int pmap_unwire_tte(pmap_t pm, pmap_t pm2, struct tte *tp, vm_offset_t va) { PMAP_LOCK_ASSERT(pm, MA_OWNED); if ((tp->tte_data & TD_WIRED) == 0) panic("pmap_unwire_tte: tp %p is missing TD_WIRED", tp); atomic_clear_long(&tp->tte_data, TD_WIRED); pm->pm_stats.wired_count--; return (1); } /* * Clear the wired attribute from the mappings for the specified range of * addresses in the given pmap. Every valid mapping within that range must * have the wired attribute set. In contrast, invalid mappings cannot have * the wired attribute set, so they are ignored. * * The wired attribute of the translation table entry is not a hardware * feature, so there is no need to invalidate any TLB entries.
*/ void pmap_unwire(pmap_t pm, vm_offset_t sva, vm_offset_t eva) { vm_offset_t va; struct tte *tp; PMAP_LOCK(pm); if (eva - sva > PMAP_TSB_THRESH) tsb_foreach(pm, NULL, sva, eva, pmap_unwire_tte); else { for (va = sva; va < eva; va += PAGE_SIZE) if ((tp = tsb_tte_lookup(pm, va)) != NULL) pmap_unwire_tte(pm, NULL, tp, va); } PMAP_UNLOCK(pm); } static int pmap_copy_tte(pmap_t src_pmap, pmap_t dst_pmap, struct tte *tp, vm_offset_t va) { vm_page_t m; u_long data; if ((tp->tte_data & TD_FAKE) != 0) return (1); if (tsb_tte_lookup(dst_pmap, va) == NULL) { data = tp->tte_data & ~(TD_PV | TD_REF | TD_SW | TD_CV | TD_W); m = PHYS_TO_VM_PAGE(TTE_GET_PA(tp)); tsb_tte_enter(dst_pmap, m, va, TS_8K, data); } return (1); } void pmap_copy(pmap_t dst_pmap, pmap_t src_pmap, vm_offset_t dst_addr, vm_size_t len, vm_offset_t src_addr) { struct tte *tp; vm_offset_t va; if (dst_addr != src_addr) return; rw_wlock(&tte_list_global_lock); if (dst_pmap < src_pmap) { PMAP_LOCK(dst_pmap); PMAP_LOCK(src_pmap); } else { PMAP_LOCK(src_pmap); PMAP_LOCK(dst_pmap); } if (len > PMAP_TSB_THRESH) { tsb_foreach(src_pmap, dst_pmap, src_addr, src_addr + len, pmap_copy_tte); tlb_context_demap(dst_pmap); } else { for (va = src_addr; va < src_addr + len; va += PAGE_SIZE) if ((tp = tsb_tte_lookup(src_pmap, va)) != NULL) pmap_copy_tte(src_pmap, dst_pmap, tp, va); tlb_range_demap(dst_pmap, src_addr, src_addr + len - 1); } rw_wunlock(&tte_list_global_lock); PMAP_UNLOCK(src_pmap); PMAP_UNLOCK(dst_pmap); } void pmap_zero_page(vm_page_t m) { struct tte *tp; vm_offset_t va; vm_paddr_t pa; KASSERT((m->flags & PG_FICTITIOUS) == 0, ("pmap_zero_page: fake page")); PMAP_STATS_INC(pmap_nzero_page); pa = VM_PAGE_TO_PHYS(m); if (dcache_color_ignore != 0 || m->md.color == DCACHE_COLOR(pa)) { PMAP_STATS_INC(pmap_nzero_page_c); va = TLB_PHYS_TO_DIRECT(pa); cpu_block_zero((void *)va, PAGE_SIZE); } else if (m->md.color == -1) { PMAP_STATS_INC(pmap_nzero_page_nc); aszero(ASI_PHYS_USE_EC, pa, PAGE_SIZE); } else { PMAP_STATS_INC(pmap_nzero_page_oc); PMAP_LOCK(kernel_pmap); va = pmap_temp_map_1 + (m->md.color * PAGE_SIZE); tp = tsb_kvtotte(va); tp->tte_data = TD_V | TD_8K | TD_PA(pa) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(va, TS_8K); cpu_block_zero((void *)va, PAGE_SIZE); tlb_page_demap(kernel_pmap, va); PMAP_UNLOCK(kernel_pmap); } } void pmap_zero_page_area(vm_page_t m, int off, int size) { struct tte *tp; vm_offset_t va; vm_paddr_t pa; KASSERT((m->flags & PG_FICTITIOUS) == 0, ("pmap_zero_page_area: fake page")); KASSERT(off + size <= PAGE_SIZE, ("pmap_zero_page_area: bad off/size")); PMAP_STATS_INC(pmap_nzero_page_area); pa = VM_PAGE_TO_PHYS(m); if (dcache_color_ignore != 0 || m->md.color == DCACHE_COLOR(pa)) { PMAP_STATS_INC(pmap_nzero_page_area_c); va = TLB_PHYS_TO_DIRECT(pa); bzero((void *)(va + off), size); } else if (m->md.color == -1) { PMAP_STATS_INC(pmap_nzero_page_area_nc); aszero(ASI_PHYS_USE_EC, pa + off, size); } else { PMAP_STATS_INC(pmap_nzero_page_area_oc); PMAP_LOCK(kernel_pmap); va = pmap_temp_map_1 + (m->md.color * PAGE_SIZE); tp = tsb_kvtotte(va); tp->tte_data = TD_V | TD_8K | TD_PA(pa) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(va, TS_8K); bzero((void *)(va + off), size); tlb_page_demap(kernel_pmap, va); PMAP_UNLOCK(kernel_pmap); } } void pmap_copy_page(vm_page_t msrc, vm_page_t mdst) { vm_offset_t vdst; vm_offset_t vsrc; vm_paddr_t pdst; vm_paddr_t psrc; struct tte *tp; KASSERT((mdst->flags & PG_FICTITIOUS) == 0, ("pmap_copy_page: fake dst page")); KASSERT((msrc->flags & PG_FICTITIOUS) == 0, ("pmap_copy_page: fake src page")); 
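/* * The cases below pick the cheapest safe copy method: the direct map * when both pages are consistently colored, physical ASI copies when a * page is uncacheable (color of -1), and otherwise temporary kernel * mappings of the matching color. */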
PMAP_STATS_INC(pmap_ncopy_page); pdst = VM_PAGE_TO_PHYS(mdst); psrc = VM_PAGE_TO_PHYS(msrc); if (dcache_color_ignore != 0 || (msrc->md.color == DCACHE_COLOR(psrc) && mdst->md.color == DCACHE_COLOR(pdst))) { PMAP_STATS_INC(pmap_ncopy_page_c); vdst = TLB_PHYS_TO_DIRECT(pdst); vsrc = TLB_PHYS_TO_DIRECT(psrc); cpu_block_copy((void *)vsrc, (void *)vdst, PAGE_SIZE); } else if (msrc->md.color == -1 && mdst->md.color == -1) { PMAP_STATS_INC(pmap_ncopy_page_nc); ascopy(ASI_PHYS_USE_EC, psrc, pdst, PAGE_SIZE); } else if (msrc->md.color == -1) { if (mdst->md.color == DCACHE_COLOR(pdst)) { PMAP_STATS_INC(pmap_ncopy_page_dc); vdst = TLB_PHYS_TO_DIRECT(pdst); ascopyfrom(ASI_PHYS_USE_EC, psrc, (void *)vdst, PAGE_SIZE); } else { PMAP_STATS_INC(pmap_ncopy_page_doc); PMAP_LOCK(kernel_pmap); vdst = pmap_temp_map_1 + (mdst->md.color * PAGE_SIZE); tp = tsb_kvtotte(vdst); tp->tte_data = TD_V | TD_8K | TD_PA(pdst) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(vdst, TS_8K); ascopyfrom(ASI_PHYS_USE_EC, psrc, (void *)vdst, PAGE_SIZE); tlb_page_demap(kernel_pmap, vdst); PMAP_UNLOCK(kernel_pmap); } } else if (mdst->md.color == -1) { if (msrc->md.color == DCACHE_COLOR(psrc)) { PMAP_STATS_INC(pmap_ncopy_page_sc); vsrc = TLB_PHYS_TO_DIRECT(psrc); ascopyto((void *)vsrc, ASI_PHYS_USE_EC, pdst, PAGE_SIZE); } else { PMAP_STATS_INC(pmap_ncopy_page_soc); PMAP_LOCK(kernel_pmap); vsrc = pmap_temp_map_1 + (msrc->md.color * PAGE_SIZE); tp = tsb_kvtotte(vsrc); tp->tte_data = TD_V | TD_8K | TD_PA(psrc) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(vsrc, TS_8K); ascopyto((void *)vsrc, ASI_PHYS_USE_EC, pdst, PAGE_SIZE); tlb_page_demap(kernel_pmap, vsrc); PMAP_UNLOCK(kernel_pmap); } } else { PMAP_STATS_INC(pmap_ncopy_page_oc); PMAP_LOCK(kernel_pmap); vdst = pmap_temp_map_1 + (mdst->md.color * PAGE_SIZE); tp = tsb_kvtotte(vdst); tp->tte_data = TD_V | TD_8K | TD_PA(pdst) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(vdst, TS_8K); vsrc = pmap_temp_map_2 + (msrc->md.color * PAGE_SIZE); tp = tsb_kvtotte(vsrc); tp->tte_data = TD_V | TD_8K | TD_PA(psrc) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(vsrc, TS_8K); cpu_block_copy((void *)vsrc, (void *)vdst, PAGE_SIZE); tlb_page_demap(kernel_pmap, vdst); tlb_page_demap(kernel_pmap, vsrc); PMAP_UNLOCK(kernel_pmap); } } vm_offset_t pmap_quick_enter_page(vm_page_t m) { vm_paddr_t pa; vm_offset_t qaddr; struct tte *tp; pa = VM_PAGE_TO_PHYS(m); if (dcache_color_ignore != 0 || m->md.color == DCACHE_COLOR(pa)) return (TLB_PHYS_TO_DIRECT(pa)); critical_enter(); qaddr = PCPU_GET(qmap_addr); qaddr += (PAGE_SIZE * ((DCACHE_COLORS + DCACHE_COLOR(pa) - DCACHE_COLOR(qaddr)) % DCACHE_COLORS)); tp = tsb_kvtotte(qaddr); KASSERT(tp->tte_data == 0, ("pmap_quick_enter_page: PTE busy")); tp->tte_data = TD_V | TD_8K | TD_PA(pa) | TD_CP | TD_CV | TD_W; tp->tte_vpn = TV_VPN(qaddr, TS_8K); return (qaddr); } void pmap_quick_remove_page(vm_offset_t addr) { vm_offset_t qaddr; struct tte *tp; if (addr >= VM_MIN_DIRECT_ADDRESS) return; tp = tsb_kvtotte(addr); qaddr = PCPU_GET(qmap_addr); KASSERT((addr >= qaddr) && (addr < (qaddr + (PAGE_SIZE * DCACHE_COLORS))), ("pmap_quick_remove_page: invalid address")); KASSERT(tp->tte_data != 0, ("pmap_quick_remove_page: PTE not in use")); stxa(TLB_DEMAP_VA(addr) | TLB_DEMAP_NUCLEUS | TLB_DEMAP_PAGE, ASI_DMMU_DEMAP, 0); stxa(TLB_DEMAP_VA(addr) | TLB_DEMAP_NUCLEUS | TLB_DEMAP_PAGE, ASI_IMMU_DEMAP, 0); flush(KERNBASE); TTE_ZERO(tp); critical_exit(); } int unmapped_buf_allowed; void pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset, vm_page_t mb[], vm_offset_t b_offset, int xfersize) { 
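/* * unmapped_buf_allowed is left at zero above, so the buffer cache only * passes mapped buffers to this pmap and this path should not be * reached. */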
panic("pmap_copy_pages: not implemented"); } /* * Returns true if the pmap's pv is one of the first * 16 pvs linked to from this page. This count may * be changed upwards or downwards in the future; it * is only necessary that true be returned for a small * subset of pmaps for proper page aging. */ boolean_t pmap_page_exists_quick(pmap_t pm, vm_page_t m) { struct tte *tp; int loops; boolean_t rv; KASSERT((m->oflags & VPO_UNMANAGED) == 0, ("pmap_page_exists_quick: page %p is not managed", m)); loops = 0; rv = FALSE; rw_wlock(&tte_list_global_lock); TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) { if ((tp->tte_data & TD_PV) == 0) continue; if (TTE_GET_PMAP(tp) == pm) { rv = TRUE; break; } if (++loops >= 16) break; } rw_wunlock(&tte_list_global_lock); return (rv); } /* * Return the number of managed mappings to the given physical page * that are wired. */ int pmap_page_wired_mappings(vm_page_t m) { struct tte *tp; int count; count = 0; if ((m->oflags & VPO_UNMANAGED) != 0) return (count); rw_wlock(&tte_list_global_lock); TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) if ((tp->tte_data & (TD_PV | TD_WIRED)) == (TD_PV | TD_WIRED)) count++; rw_wunlock(&tte_list_global_lock); return (count); } /* * Remove all pages from specified address space, this aids process exit * speeds. This is much faster than pmap_remove in the case of running down * an entire address space. Only works for the current pmap. */ void pmap_remove_pages(pmap_t pm) { } /* * Returns TRUE if the given page has a managed mapping. */ boolean_t pmap_page_is_mapped(vm_page_t m) { struct tte *tp; boolean_t rv; rv = FALSE; if ((m->oflags & VPO_UNMANAGED) != 0) return (rv); rw_wlock(&tte_list_global_lock); TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) if ((tp->tte_data & TD_PV) != 0) { rv = TRUE; break; } rw_wunlock(&tte_list_global_lock); return (rv); } -#define PMAP_TS_REFERENCED_MAX 5 - /* * Return a count of reference bits for a page, clearing those bits. * It is not necessary for every reference bit to be cleared, but it * is necessary that 0 only be returned when there are truly no * reference bits set. - * - * XXX: The exact number of bits to check and clear is a matter that - * should be tested and standardized at some point in the future for - * optimal aging of shared pages. * * As an optimization, update the page's dirty field if a modified bit is * found while counting reference bits. This opportunistic update can be * performed at low cost and can eliminate the need for some future calls * to pmap_is_modified(). However, since this function stops after * finding PMAP_TS_REFERENCED_MAX reference bits, it may not detect some * dirty pages. Those dirty pages will only be detected by a future call * to pmap_is_modified(). 
 */
int
pmap_ts_referenced(vm_page_t m)
{
	struct tte *tpf;
	struct tte *tpn;
	struct tte *tp;
	u_long data;
	int count;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_ts_referenced: page %p is not managed", m));
	count = 0;
	rw_wlock(&tte_list_global_lock);
	if ((tp = TAILQ_FIRST(&m->md.tte_list)) != NULL) {
		tpf = tp;
		do {
			tpn = TAILQ_NEXT(tp, tte_link);
			TAILQ_REMOVE(&m->md.tte_list, tp, tte_link);
			TAILQ_INSERT_TAIL(&m->md.tte_list, tp, tte_link);
			if ((tp->tte_data & TD_PV) == 0)
				continue;
			data = atomic_clear_long(&tp->tte_data, TD_REF);
			if ((data & TD_W) != 0)
				vm_page_dirty(m);
			if ((data & TD_REF) != 0 && ++count >=
			    PMAP_TS_REFERENCED_MAX)
				break;
		} while ((tp = tpn) != NULL && tp != tpf);
	}
	rw_wunlock(&tte_list_global_lock);
	return (count);
}

boolean_t
pmap_is_modified(vm_page_t m)
{
	struct tte *tp;
	boolean_t rv;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_is_modified: page %p is not managed", m));
	rv = FALSE;

	/*
	 * If the page is not exclusive busied, then PGA_WRITEABLE cannot be
	 * concurrently set while the object is locked.  Thus, if
	 * PGA_WRITEABLE is clear, no TTEs can have TD_W set.
	 */
	VM_OBJECT_ASSERT_WLOCKED(m->object);
	if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0)
		return (rv);
	rw_wlock(&tte_list_global_lock);
	TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) {
		if ((tp->tte_data & TD_PV) == 0)
			continue;
		if ((tp->tte_data & TD_W) != 0) {
			rv = TRUE;
			break;
		}
	}
	rw_wunlock(&tte_list_global_lock);
	return (rv);
}

/*
 * pmap_is_prefaultable:
 *
 *	Return whether or not the specified virtual address is eligible
 *	for prefault.
 */
boolean_t
pmap_is_prefaultable(pmap_t pmap, vm_offset_t addr)
{
	boolean_t rv;

	PMAP_LOCK(pmap);
	rv = tsb_tte_lookup(pmap, addr) == NULL;
	PMAP_UNLOCK(pmap);
	return (rv);
}

/*
 * Return whether or not the specified physical page was referenced
 * in any physical maps.
 */
boolean_t
pmap_is_referenced(vm_page_t m)
{
	struct tte *tp;
	boolean_t rv;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_is_referenced: page %p is not managed", m));
	rv = FALSE;
	rw_wlock(&tte_list_global_lock);
	TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) {
		if ((tp->tte_data & TD_PV) == 0)
			continue;
		if ((tp->tte_data & TD_REF) != 0) {
			rv = TRUE;
			break;
		}
	}
	rw_wunlock(&tte_list_global_lock);
	return (rv);
}

/*
 * This function is advisory.
 */
void
pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva, int advice)
{
}

void
pmap_clear_modify(vm_page_t m)
{
	struct tte *tp;
	u_long data;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_clear_modify: page %p is not managed", m));
	VM_OBJECT_ASSERT_WLOCKED(m->object);
	KASSERT(!vm_page_xbusied(m),
	    ("pmap_clear_modify: page %p is exclusive busied", m));

	/*
	 * If the page is not PGA_WRITEABLE, then no TTEs can have TD_W set.
	 * If the object containing the page is locked and the page is not
	 * exclusive busied, then PGA_WRITEABLE cannot be concurrently set.
	 */
	if ((m->aflags & PGA_WRITEABLE) == 0)
		return;
	rw_wlock(&tte_list_global_lock);
	TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) {
		if ((tp->tte_data & TD_PV) == 0)
			continue;
		data = atomic_clear_long(&tp->tte_data, TD_W);
		if ((data & TD_W) != 0)
			tlb_page_demap(TTE_GET_PMAP(tp), TTE_GET_VA(tp));
	}
	rw_wunlock(&tte_list_global_lock);
}

void
pmap_remove_write(vm_page_t m)
{
	struct tte *tp;
	u_long data;

	KASSERT((m->oflags & VPO_UNMANAGED) == 0,
	    ("pmap_remove_write: page %p is not managed", m));

	/*
	 * If the page is not exclusive busied, then PGA_WRITEABLE cannot be
	 * set by another thread while the object is locked.  Thus,
	 * if PGA_WRITEABLE is clear, no page table entries need updating.
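	 * Otherwise, clear both TD_SW (software write permission) and TD_W
	 * (hardware modified) on every managed mapping below, dirtying the
	 * page and demapping the TTE whenever TD_W was in fact set.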
*/ VM_OBJECT_ASSERT_WLOCKED(m->object); if (!vm_page_xbusied(m) && (m->aflags & PGA_WRITEABLE) == 0) return; rw_wlock(&tte_list_global_lock); TAILQ_FOREACH(tp, &m->md.tte_list, tte_link) { if ((tp->tte_data & TD_PV) == 0) continue; data = atomic_clear_long(&tp->tte_data, TD_SW | TD_W); if ((data & TD_W) != 0) { vm_page_dirty(m); tlb_page_demap(TTE_GET_PMAP(tp), TTE_GET_VA(tp)); } } vm_page_aflag_clear(m, PGA_WRITEABLE); rw_wunlock(&tte_list_global_lock); } int pmap_mincore(pmap_t pm, vm_offset_t addr, vm_paddr_t *locked_pa) { /* TODO; */ return (0); } /* * Activate a user pmap. The pmap must be activated before its address space * can be accessed in any way. */ void pmap_activate(struct thread *td) { struct vmspace *vm; struct pmap *pm; int context; critical_enter(); vm = td->td_proc->p_vmspace; pm = vmspace_pmap(vm); context = PCPU_GET(tlb_ctx); if (context == PCPU_GET(tlb_ctx_max)) { tlb_flush_user(); context = PCPU_GET(tlb_ctx_min); } PCPU_SET(tlb_ctx, context + 1); pm->pm_context[curcpu] = context; #ifdef SMP CPU_SET_ATOMIC(PCPU_GET(cpuid), &pm->pm_active); atomic_store_acq_ptr((uintptr_t *)PCPU_PTR(pmap), (uintptr_t)pm); #else CPU_SET(PCPU_GET(cpuid), &pm->pm_active); PCPU_SET(pmap, pm); #endif stxa(AA_DMMU_TSB, ASI_DMMU, pm->pm_tsb); stxa(AA_IMMU_TSB, ASI_IMMU, pm->pm_tsb); stxa(AA_DMMU_PCXR, ASI_DMMU, (ldxa(AA_DMMU_PCXR, ASI_DMMU) & TLB_CXR_PGSZ_MASK) | context); flush(KERNBASE); critical_exit(); } void pmap_sync_icache(pmap_t pm, vm_offset_t va, vm_size_t sz) { } /* * Increase the starting virtual address of the given mapping if a * different alignment might result in more superpage mappings. */ void pmap_align_superpage(vm_object_t object, vm_ooffset_t offset, vm_offset_t *addr, vm_size_t size) { } Index: projects/clang390-import/sys/sys/queue.h =================================================================== --- projects/clang390-import/sys/sys/queue.h (revision 305686) +++ projects/clang390-import/sys/sys/queue.h (revision 305687) @@ -1,787 +1,819 @@ /*- * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
 *
 *	@(#)queue.h	8.5 (Berkeley) 8/20/94
 * $FreeBSD$
 */

#ifndef _SYS_QUEUE_H_
#define	_SYS_QUEUE_H_

#include <sys/cdefs.h>

/*
 * This file defines four types of data structures: singly-linked lists,
 * singly-linked tail queues, lists and tail queues.
 *
 * A singly-linked list is headed by a single forward pointer. The elements
 * are singly linked for minimum space and pointer manipulation overhead at
 * the expense of O(n) removal for arbitrary elements. New elements can be
 * added to the list after an existing element or at the head of the list.
 * Elements being removed from the head of the list should use the explicit
 * macro for this purpose for optimum efficiency. A singly-linked list may
 * only be traversed in the forward direction. Singly-linked lists are ideal
 * for applications with large datasets and few or no removals or for
 * implementing a LIFO queue.
 *
 * A singly-linked tail queue is headed by a pair of pointers, one to the
 * head of the list and the other to the tail of the list. The elements are
 * singly linked for minimum space and pointer manipulation overhead at the
 * expense of O(n) removal for arbitrary elements. New elements can be added
 * to the list after an existing element, at the head of the list, or at the
 * end of the list. Elements being removed from the head of the tail queue
 * should use the explicit macro for this purpose for optimum efficiency.
 * A singly-linked tail queue may only be traversed in the forward direction.
 * Singly-linked tail queues are ideal for applications with large datasets
 * and few or no removals or for implementing a FIFO queue.
 *
 * A list is headed by a single forward pointer (or an array of forward
 * pointers for a hash table header). The elements are doubly linked
 * so that an arbitrary element can be removed without a need to
 * traverse the list. New elements can be added to the list before
 * or after an existing element or at the head of the list. A list
 * may be traversed in either direction.
 *
 * A tail queue is headed by a pair of pointers, one to the head of the
 * list and the other to the tail of the list. The elements are doubly
 * linked so that an arbitrary element can be removed without a need to
 * traverse the list. New elements can be added to the list before or
 * after an existing element, at the head of the list, or at the end of
 * the list. A tail queue may be traversed in either direction.
 *
 * For details on the use of these macros, see the queue(3) manual page.
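 *
 * As a quick illustration of the last case, a doubly-ended queue might be
 * declared and walked backwards as follows (identifiers are illustrative
 * only, not part of this API):
 *
 *	struct req {
 *		TAILQ_ENTRY(req) link;
 *	};
 *	TAILQ_HEAD(reqhead, req) q = TAILQ_HEAD_INITIALIZER(q);
 *	struct req *r;
 *
 *	TAILQ_INSERT_TAIL(&q, r, link);
 *	TAILQ_FOREACH_REVERSE(r, &q, reqhead, link)
 *		process(r);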
 *
 * Below is a summary of implemented functions where:
 *  +  means the macro is available
 *  -  means the macro is not available
 *  s  means the macro is available but is slow (runs in O(n) time)
 *
 *				SLIST	LIST	STAILQ	TAILQ
 * _HEAD			+	+	+	+
 * _CLASS_HEAD			+	+	+	+
 * _HEAD_INITIALIZER		+	+	+	+
 * _ENTRY			+	+	+	+
 * _CLASS_ENTRY			+	+	+	+
 * _INIT			+	+	+	+
 * _EMPTY			+	+	+	+
 * _FIRST			+	+	+	+
 * _NEXT			+	+	+	+
 * _PREV			-	+	-	+
 * _LAST			-	-	+	+
 * _FOREACH			+	+	+	+
 * _FOREACH_FROM		+	+	+	+
 * _FOREACH_SAFE		+	+	+	+
 * _FOREACH_FROM_SAFE		+	+	+	+
 * _FOREACH_REVERSE		-	-	-	+
 * _FOREACH_REVERSE_FROM	-	-	-	+
 * _FOREACH_REVERSE_SAFE	-	-	-	+
 * _FOREACH_REVERSE_FROM_SAFE	-	-	-	+
 * _INSERT_HEAD			+	+	+	+
 * _INSERT_BEFORE		-	+	-	+
 * _INSERT_AFTER		+	+	+	+
 * _INSERT_TAIL			-	-	+	+
 * _CONCAT			s	s	+	+
 * _REMOVE_AFTER		+	-	+	-
 * _REMOVE_HEAD			+	-	+	-
 * _REMOVE			s	+	s	+
 * _SWAP			+	+	+	+
 *
 */
#ifdef QUEUE_MACRO_DEBUG
+#warning Use QUEUE_MACRO_DEBUG_TRACE and/or QUEUE_MACRO_DEBUG_TRASH
+#define QUEUE_MACRO_DEBUG_TRACE
+#define QUEUE_MACRO_DEBUG_TRASH
+#endif
+
+#ifdef QUEUE_MACRO_DEBUG_TRACE
/* Store the last 2 places the queue element or head was altered */
struct qm_trace {
	unsigned long	 lastline;
	unsigned long	 prevline;
	const char	*lastfile;
	const char	*prevfile;
};

#define	TRACEBUF	struct qm_trace trace;
#define	TRACEBUF_INITIALIZER	{ __LINE__, 0, __FILE__, NULL } ,
-#define	TRASHIT(x)	do {(x) = (void *)-1;} while (0)
-#define	QMD_SAVELINK(name, link)	void **name = (void *)&(link)

#define	QMD_TRACE_HEAD(head) do {					\
	(head)->trace.prevline = (head)->trace.lastline;		\
	(head)->trace.prevfile = (head)->trace.lastfile;		\
	(head)->trace.lastline = __LINE__;				\
	(head)->trace.lastfile = __FILE__;				\
} while (0)

#define	QMD_TRACE_ELEM(elem) do {					\
	(elem)->trace.prevline = (elem)->trace.lastline;		\
	(elem)->trace.prevfile = (elem)->trace.lastfile;		\
	(elem)->trace.lastline = __LINE__;				\
	(elem)->trace.lastfile = __FILE__;				\
} while (0)

-#else
+#else	/* !QUEUE_MACRO_DEBUG_TRACE */
#define	QMD_TRACE_ELEM(elem)
#define	QMD_TRACE_HEAD(head)
-#define	QMD_SAVELINK(name, link)
#define	TRACEBUF
#define	TRACEBUF_INITIALIZER
+#endif	/* QUEUE_MACRO_DEBUG_TRACE */
+
+#ifdef QUEUE_MACRO_DEBUG_TRASH
+#define	TRASHIT(x)	do {(x) = (void *)-1;} while (0)
+#define	QMD_IS_TRASHED(x)	((x) == (void *)(intptr_t)-1)
+#else	/* !QUEUE_MACRO_DEBUG_TRASH */
#define	TRASHIT(x)
-#endif	/* QUEUE_MACRO_DEBUG */
+#define	QMD_IS_TRASHED(x)	0
+#endif	/* QUEUE_MACRO_DEBUG_TRASH */

+#if defined(QUEUE_MACRO_DEBUG_TRACE) || defined(QUEUE_MACRO_DEBUG_TRASH)
+#define	QMD_SAVELINK(name, link)	void **name = (void *)&(link)
+#else	/* !QUEUE_MACRO_DEBUG_TRACE && !QUEUE_MACRO_DEBUG_TRASH */
+#define	QMD_SAVELINK(name, link)
+#endif	/* QUEUE_MACRO_DEBUG_TRACE || QUEUE_MACRO_DEBUG_TRASH */
+
#ifdef __cplusplus
/*
 * In C++ there can be structure lists and class lists:
 */
#define	QUEUE_TYPEOF(type) type
#else
#define	QUEUE_TYPEOF(type) struct type
#endif

/*
 * Singly-linked List declarations.
 */
#define	SLIST_HEAD(name, type)						\
struct name {								\
	struct type *slh_first;	/* first element */			\
}

#define	SLIST_CLASS_HEAD(name, type)					\
struct name {								\
	class type *slh_first;	/* first element */			\
}

#define	SLIST_HEAD_INITIALIZER(head)					\
	{ NULL }

#define	SLIST_ENTRY(type)						\
struct {								\
	struct type *sle_next;	/* next element */			\
}

#define	SLIST_CLASS_ENTRY(type)						\
struct {								\
	class type *sle_next;	/* next element */			\
}

/*
 * Singly-linked List functions.
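 *
 * A minimal usage sketch, with illustrative identifiers that are not part
 * of this API:
 *
 *	struct entry {
 *		int value;
 *		SLIST_ENTRY(entry) link;
 *	};
 *	SLIST_HEAD(entryhead, entry) head = SLIST_HEAD_INITIALIZER(head);
 *
 *	struct entry *np = malloc(sizeof(*np));
 *	SLIST_INSERT_HEAD(&head, np, link);
 *	SLIST_FOREACH(np, &head, link)
 *		use(np->value);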
*/ +#if (defined(_KERNEL) && defined(INVARIANTS)) +#define QMD_SLIST_CHECK_PREVPTR(prevp, elm) do { \ + if (*(prevp) != (elm)) \ + panic("Bad prevptr *(%p) == %p != %p", \ + (prevp), *(prevp), (elm)); \ +} while (0) +#else +#define QMD_SLIST_CHECK_PREVPTR(prevp, elm) +#endif + #define SLIST_CONCAT(head1, head2, type, field) do { \ QUEUE_TYPEOF(type) *curelm = SLIST_FIRST(head1); \ if (curelm == NULL) { \ if ((SLIST_FIRST(head1) = SLIST_FIRST(head2)) != NULL) \ SLIST_INIT(head2); \ } else if (SLIST_FIRST(head2) != NULL) { \ while (SLIST_NEXT(curelm, field) != NULL) \ curelm = SLIST_NEXT(curelm, field); \ SLIST_NEXT(curelm, field) = SLIST_FIRST(head2); \ SLIST_INIT(head2); \ } \ } while (0) #define SLIST_EMPTY(head) ((head)->slh_first == NULL) #define SLIST_FIRST(head) ((head)->slh_first) #define SLIST_FOREACH(var, head, field) \ for ((var) = SLIST_FIRST((head)); \ (var); \ (var) = SLIST_NEXT((var), field)) #define SLIST_FOREACH_FROM(var, head, field) \ for ((var) = ((var) ? (var) : SLIST_FIRST((head))); \ (var); \ (var) = SLIST_NEXT((var), field)) #define SLIST_FOREACH_SAFE(var, head, field, tvar) \ for ((var) = SLIST_FIRST((head)); \ (var) && ((tvar) = SLIST_NEXT((var), field), 1); \ (var) = (tvar)) #define SLIST_FOREACH_FROM_SAFE(var, head, field, tvar) \ for ((var) = ((var) ? (var) : SLIST_FIRST((head))); \ (var) && ((tvar) = SLIST_NEXT((var), field), 1); \ (var) = (tvar)) #define SLIST_FOREACH_PREVPTR(var, varp, head, field) \ for ((varp) = &SLIST_FIRST((head)); \ ((var) = *(varp)) != NULL; \ (varp) = &SLIST_NEXT((var), field)) #define SLIST_INIT(head) do { \ SLIST_FIRST((head)) = NULL; \ } while (0) #define SLIST_INSERT_AFTER(slistelm, elm, field) do { \ SLIST_NEXT((elm), field) = SLIST_NEXT((slistelm), field); \ SLIST_NEXT((slistelm), field) = (elm); \ } while (0) #define SLIST_INSERT_HEAD(head, elm, field) do { \ SLIST_NEXT((elm), field) = SLIST_FIRST((head)); \ SLIST_FIRST((head)) = (elm); \ } while (0) #define SLIST_NEXT(elm, field) ((elm)->field.sle_next) #define SLIST_REMOVE(head, elm, type, field) do { \ QMD_SAVELINK(oldnext, (elm)->field.sle_next); \ if (SLIST_FIRST((head)) == (elm)) { \ SLIST_REMOVE_HEAD((head), field); \ } \ else { \ QUEUE_TYPEOF(type) *curelm = SLIST_FIRST(head); \ while (SLIST_NEXT(curelm, field) != (elm)) \ curelm = SLIST_NEXT(curelm, field); \ SLIST_REMOVE_AFTER(curelm, field); \ } \ TRASHIT(*oldnext); \ } while (0) #define SLIST_REMOVE_AFTER(elm, field) do { \ SLIST_NEXT(elm, field) = \ SLIST_NEXT(SLIST_NEXT(elm, field), field); \ } while (0) #define SLIST_REMOVE_HEAD(head, field) do { \ SLIST_FIRST((head)) = SLIST_NEXT(SLIST_FIRST((head)), field); \ +} while (0) + +#define SLIST_REMOVE_PREVPTR(prevp, elm, field) do { \ + QMD_SLIST_CHECK_PREVPTR(prevp, elm); \ + *(prevp) = SLIST_NEXT(elm, field); \ + TRASHIT((elm)->field.sle_next); \ } while (0) #define SLIST_SWAP(head1, head2, type) do { \ QUEUE_TYPEOF(type) *swap_first = SLIST_FIRST(head1); \ SLIST_FIRST(head1) = SLIST_FIRST(head2); \ SLIST_FIRST(head2) = swap_first; \ } while (0) /* * Singly-linked Tail queue declarations. 
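 *
 * The head/tail pointer pair makes STAILQ a natural FIFO; a sketch with
 * illustrative identifiers:
 *
 *	struct job {
 *		STAILQ_ENTRY(job) link;
 *	};
 *	STAILQ_HEAD(jobq, job) q = STAILQ_HEAD_INITIALIZER(q);
 *	struct job *j;
 *
 *	STAILQ_INSERT_TAIL(&q, j, link);	(enqueue)
 *	j = STAILQ_FIRST(&q);			(peek at the oldest entry)
 *	STAILQ_REMOVE_HEAD(&q, link);		(dequeue)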
*/ #define STAILQ_HEAD(name, type) \ struct name { \ struct type *stqh_first;/* first element */ \ struct type **stqh_last;/* addr of last next element */ \ } #define STAILQ_CLASS_HEAD(name, type) \ struct name { \ class type *stqh_first; /* first element */ \ class type **stqh_last; /* addr of last next element */ \ } #define STAILQ_HEAD_INITIALIZER(head) \ { NULL, &(head).stqh_first } #define STAILQ_ENTRY(type) \ struct { \ struct type *stqe_next; /* next element */ \ } #define STAILQ_CLASS_ENTRY(type) \ struct { \ class type *stqe_next; /* next element */ \ } /* * Singly-linked Tail queue functions. */ #define STAILQ_CONCAT(head1, head2) do { \ if (!STAILQ_EMPTY((head2))) { \ *(head1)->stqh_last = (head2)->stqh_first; \ (head1)->stqh_last = (head2)->stqh_last; \ STAILQ_INIT((head2)); \ } \ } while (0) #define STAILQ_EMPTY(head) ((head)->stqh_first == NULL) #define STAILQ_FIRST(head) ((head)->stqh_first) #define STAILQ_FOREACH(var, head, field) \ for((var) = STAILQ_FIRST((head)); \ (var); \ (var) = STAILQ_NEXT((var), field)) #define STAILQ_FOREACH_FROM(var, head, field) \ for ((var) = ((var) ? (var) : STAILQ_FIRST((head))); \ (var); \ (var) = STAILQ_NEXT((var), field)) #define STAILQ_FOREACH_SAFE(var, head, field, tvar) \ for ((var) = STAILQ_FIRST((head)); \ (var) && ((tvar) = STAILQ_NEXT((var), field), 1); \ (var) = (tvar)) #define STAILQ_FOREACH_FROM_SAFE(var, head, field, tvar) \ for ((var) = ((var) ? (var) : STAILQ_FIRST((head))); \ (var) && ((tvar) = STAILQ_NEXT((var), field), 1); \ (var) = (tvar)) #define STAILQ_INIT(head) do { \ STAILQ_FIRST((head)) = NULL; \ (head)->stqh_last = &STAILQ_FIRST((head)); \ } while (0) #define STAILQ_INSERT_AFTER(head, tqelm, elm, field) do { \ if ((STAILQ_NEXT((elm), field) = STAILQ_NEXT((tqelm), field)) == NULL)\ (head)->stqh_last = &STAILQ_NEXT((elm), field); \ STAILQ_NEXT((tqelm), field) = (elm); \ } while (0) #define STAILQ_INSERT_HEAD(head, elm, field) do { \ if ((STAILQ_NEXT((elm), field) = STAILQ_FIRST((head))) == NULL) \ (head)->stqh_last = &STAILQ_NEXT((elm), field); \ STAILQ_FIRST((head)) = (elm); \ } while (0) #define STAILQ_INSERT_TAIL(head, elm, field) do { \ STAILQ_NEXT((elm), field) = NULL; \ *(head)->stqh_last = (elm); \ (head)->stqh_last = &STAILQ_NEXT((elm), field); \ } while (0) #define STAILQ_LAST(head, type, field) \ (STAILQ_EMPTY((head)) ? 
NULL : \ __containerof((head)->stqh_last, \ QUEUE_TYPEOF(type), field.stqe_next)) #define STAILQ_NEXT(elm, field) ((elm)->field.stqe_next) #define STAILQ_REMOVE(head, elm, type, field) do { \ QMD_SAVELINK(oldnext, (elm)->field.stqe_next); \ if (STAILQ_FIRST((head)) == (elm)) { \ STAILQ_REMOVE_HEAD((head), field); \ } \ else { \ QUEUE_TYPEOF(type) *curelm = STAILQ_FIRST(head); \ while (STAILQ_NEXT(curelm, field) != (elm)) \ curelm = STAILQ_NEXT(curelm, field); \ STAILQ_REMOVE_AFTER(head, curelm, field); \ } \ TRASHIT(*oldnext); \ } while (0) #define STAILQ_REMOVE_AFTER(head, elm, field) do { \ if ((STAILQ_NEXT(elm, field) = \ STAILQ_NEXT(STAILQ_NEXT(elm, field), field)) == NULL) \ (head)->stqh_last = &STAILQ_NEXT((elm), field); \ } while (0) #define STAILQ_REMOVE_HEAD(head, field) do { \ if ((STAILQ_FIRST((head)) = \ STAILQ_NEXT(STAILQ_FIRST((head)), field)) == NULL) \ (head)->stqh_last = &STAILQ_FIRST((head)); \ } while (0) #define STAILQ_SWAP(head1, head2, type) do { \ QUEUE_TYPEOF(type) *swap_first = STAILQ_FIRST(head1); \ QUEUE_TYPEOF(type) **swap_last = (head1)->stqh_last; \ STAILQ_FIRST(head1) = STAILQ_FIRST(head2); \ (head1)->stqh_last = (head2)->stqh_last; \ STAILQ_FIRST(head2) = swap_first; \ (head2)->stqh_last = swap_last; \ if (STAILQ_EMPTY(head1)) \ (head1)->stqh_last = &STAILQ_FIRST(head1); \ if (STAILQ_EMPTY(head2)) \ (head2)->stqh_last = &STAILQ_FIRST(head2); \ } while (0) /* * List declarations. */ #define LIST_HEAD(name, type) \ struct name { \ struct type *lh_first; /* first element */ \ } #define LIST_CLASS_HEAD(name, type) \ struct name { \ class type *lh_first; /* first element */ \ } #define LIST_HEAD_INITIALIZER(head) \ { NULL } #define LIST_ENTRY(type) \ struct { \ struct type *le_next; /* next element */ \ struct type **le_prev; /* address of previous next element */ \ } #define LIST_CLASS_ENTRY(type) \ struct { \ class type *le_next; /* next element */ \ class type **le_prev; /* address of previous next element */ \ } /* * List functions. 
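 *
 * Compared with SLIST, the back pointer lets an element unlink itself
 * without a list walk; an illustrative sketch:
 *
 *	struct item {
 *		LIST_ENTRY(item) link;
 *	};
 *	LIST_HEAD(itemhead, item) head = LIST_HEAD_INITIALIZER(head);
 *	struct item *ip;
 *
 *	LIST_INSERT_HEAD(&head, ip, link);
 *	LIST_REMOVE(ip, link);		(needs neither the head nor a search)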
*/ #if (defined(_KERNEL) && defined(INVARIANTS)) #define QMD_LIST_CHECK_HEAD(head, field) do { \ if (LIST_FIRST((head)) != NULL && \ LIST_FIRST((head))->field.le_prev != \ &LIST_FIRST((head))) \ panic("Bad list head %p first->prev != head", (head)); \ } while (0) #define QMD_LIST_CHECK_NEXT(elm, field) do { \ if (LIST_NEXT((elm), field) != NULL && \ LIST_NEXT((elm), field)->field.le_prev != \ &((elm)->field.le_next)) \ panic("Bad link elm %p next->prev != elm", (elm)); \ } while (0) #define QMD_LIST_CHECK_PREV(elm, field) do { \ if (*(elm)->field.le_prev != (elm)) \ panic("Bad link elm %p prev->next != elm", (elm)); \ } while (0) #else #define QMD_LIST_CHECK_HEAD(head, field) #define QMD_LIST_CHECK_NEXT(elm, field) #define QMD_LIST_CHECK_PREV(elm, field) #endif /* (_KERNEL && INVARIANTS) */ #define LIST_CONCAT(head1, head2, type, field) do { \ QUEUE_TYPEOF(type) *curelm = LIST_FIRST(head1); \ if (curelm == NULL) { \ if ((LIST_FIRST(head1) = LIST_FIRST(head2)) != NULL) { \ LIST_FIRST(head2)->field.le_prev = \ &LIST_FIRST((head1)); \ LIST_INIT(head2); \ } \ } else if (LIST_FIRST(head2) != NULL) { \ while (LIST_NEXT(curelm, field) != NULL) \ curelm = LIST_NEXT(curelm, field); \ LIST_NEXT(curelm, field) = LIST_FIRST(head2); \ LIST_FIRST(head2)->field.le_prev = &LIST_NEXT(curelm, field); \ LIST_INIT(head2); \ } \ } while (0) #define LIST_EMPTY(head) ((head)->lh_first == NULL) #define LIST_FIRST(head) ((head)->lh_first) #define LIST_FOREACH(var, head, field) \ for ((var) = LIST_FIRST((head)); \ (var); \ (var) = LIST_NEXT((var), field)) #define LIST_FOREACH_FROM(var, head, field) \ for ((var) = ((var) ? (var) : LIST_FIRST((head))); \ (var); \ (var) = LIST_NEXT((var), field)) #define LIST_FOREACH_SAFE(var, head, field, tvar) \ for ((var) = LIST_FIRST((head)); \ (var) && ((tvar) = LIST_NEXT((var), field), 1); \ (var) = (tvar)) #define LIST_FOREACH_FROM_SAFE(var, head, field, tvar) \ for ((var) = ((var) ? (var) : LIST_FIRST((head))); \ (var) && ((tvar) = LIST_NEXT((var), field), 1); \ (var) = (tvar)) #define LIST_INIT(head) do { \ LIST_FIRST((head)) = NULL; \ } while (0) #define LIST_INSERT_AFTER(listelm, elm, field) do { \ QMD_LIST_CHECK_NEXT(listelm, field); \ if ((LIST_NEXT((elm), field) = LIST_NEXT((listelm), field)) != NULL)\ LIST_NEXT((listelm), field)->field.le_prev = \ &LIST_NEXT((elm), field); \ LIST_NEXT((listelm), field) = (elm); \ (elm)->field.le_prev = &LIST_NEXT((listelm), field); \ } while (0) #define LIST_INSERT_BEFORE(listelm, elm, field) do { \ QMD_LIST_CHECK_PREV(listelm, field); \ (elm)->field.le_prev = (listelm)->field.le_prev; \ LIST_NEXT((elm), field) = (listelm); \ *(listelm)->field.le_prev = (elm); \ (listelm)->field.le_prev = &LIST_NEXT((elm), field); \ } while (0) #define LIST_INSERT_HEAD(head, elm, field) do { \ QMD_LIST_CHECK_HEAD((head), field); \ if ((LIST_NEXT((elm), field) = LIST_FIRST((head))) != NULL) \ LIST_FIRST((head))->field.le_prev = &LIST_NEXT((elm), field);\ LIST_FIRST((head)) = (elm); \ (elm)->field.le_prev = &LIST_FIRST((head)); \ } while (0) #define LIST_NEXT(elm, field) ((elm)->field.le_next) #define LIST_PREV(elm, head, type, field) \ ((elm)->field.le_prev == &LIST_FIRST((head)) ? 
NULL : \ __containerof((elm)->field.le_prev, \ QUEUE_TYPEOF(type), field.le_next)) #define LIST_REMOVE(elm, field) do { \ QMD_SAVELINK(oldnext, (elm)->field.le_next); \ QMD_SAVELINK(oldprev, (elm)->field.le_prev); \ QMD_LIST_CHECK_NEXT(elm, field); \ QMD_LIST_CHECK_PREV(elm, field); \ if (LIST_NEXT((elm), field) != NULL) \ LIST_NEXT((elm), field)->field.le_prev = \ (elm)->field.le_prev; \ *(elm)->field.le_prev = LIST_NEXT((elm), field); \ TRASHIT(*oldnext); \ TRASHIT(*oldprev); \ } while (0) #define LIST_SWAP(head1, head2, type, field) do { \ QUEUE_TYPEOF(type) *swap_tmp = LIST_FIRST(head1); \ LIST_FIRST((head1)) = LIST_FIRST((head2)); \ LIST_FIRST((head2)) = swap_tmp; \ if ((swap_tmp = LIST_FIRST((head1))) != NULL) \ swap_tmp->field.le_prev = &LIST_FIRST((head1)); \ if ((swap_tmp = LIST_FIRST((head2))) != NULL) \ swap_tmp->field.le_prev = &LIST_FIRST((head2)); \ } while (0) /* * Tail queue declarations. */ #define TAILQ_HEAD(name, type) \ struct name { \ struct type *tqh_first; /* first element */ \ struct type **tqh_last; /* addr of last next element */ \ TRACEBUF \ } #define TAILQ_CLASS_HEAD(name, type) \ struct name { \ class type *tqh_first; /* first element */ \ class type **tqh_last; /* addr of last next element */ \ TRACEBUF \ } #define TAILQ_HEAD_INITIALIZER(head) \ { NULL, &(head).tqh_first, TRACEBUF_INITIALIZER } #define TAILQ_ENTRY(type) \ struct { \ struct type *tqe_next; /* next element */ \ struct type **tqe_prev; /* address of previous next element */ \ TRACEBUF \ } #define TAILQ_CLASS_ENTRY(type) \ struct { \ class type *tqe_next; /* next element */ \ class type **tqe_prev; /* address of previous next element */ \ TRACEBUF \ } /* * Tail queue functions. */ #if (defined(_KERNEL) && defined(INVARIANTS)) #define QMD_TAILQ_CHECK_HEAD(head, field) do { \ if (!TAILQ_EMPTY(head) && \ TAILQ_FIRST((head))->field.tqe_prev != \ &TAILQ_FIRST((head))) \ panic("Bad tailq head %p first->prev != head", (head)); \ } while (0) #define QMD_TAILQ_CHECK_TAIL(head, field) do { \ if (*(head)->tqh_last != NULL) \ panic("Bad tailq NEXT(%p->tqh_last) != NULL", (head)); \ } while (0) #define QMD_TAILQ_CHECK_NEXT(elm, field) do { \ if (TAILQ_NEXT((elm), field) != NULL && \ TAILQ_NEXT((elm), field)->field.tqe_prev != \ &((elm)->field.tqe_next)) \ panic("Bad link elm %p next->prev != elm", (elm)); \ } while (0) #define QMD_TAILQ_CHECK_PREV(elm, field) do { \ if (*(elm)->field.tqe_prev != (elm)) \ panic("Bad link elm %p prev->next != elm", (elm)); \ } while (0) #else #define QMD_TAILQ_CHECK_HEAD(head, field) #define QMD_TAILQ_CHECK_TAIL(head, headname) #define QMD_TAILQ_CHECK_NEXT(elm, field) #define QMD_TAILQ_CHECK_PREV(elm, field) #endif /* (_KERNEL && INVARIANTS) */ #define TAILQ_CONCAT(head1, head2, field) do { \ if (!TAILQ_EMPTY(head2)) { \ *(head1)->tqh_last = (head2)->tqh_first; \ (head2)->tqh_first->field.tqe_prev = (head1)->tqh_last; \ (head1)->tqh_last = (head2)->tqh_last; \ TAILQ_INIT((head2)); \ QMD_TRACE_HEAD(head1); \ QMD_TRACE_HEAD(head2); \ } \ } while (0) #define TAILQ_EMPTY(head) ((head)->tqh_first == NULL) #define TAILQ_FIRST(head) ((head)->tqh_first) #define TAILQ_FOREACH(var, head, field) \ for ((var) = TAILQ_FIRST((head)); \ (var); \ (var) = TAILQ_NEXT((var), field)) #define TAILQ_FOREACH_FROM(var, head, field) \ for ((var) = ((var) ? 
(var) : TAILQ_FIRST((head))); \ (var); \ (var) = TAILQ_NEXT((var), field)) #define TAILQ_FOREACH_SAFE(var, head, field, tvar) \ for ((var) = TAILQ_FIRST((head)); \ (var) && ((tvar) = TAILQ_NEXT((var), field), 1); \ (var) = (tvar)) #define TAILQ_FOREACH_FROM_SAFE(var, head, field, tvar) \ for ((var) = ((var) ? (var) : TAILQ_FIRST((head))); \ (var) && ((tvar) = TAILQ_NEXT((var), field), 1); \ (var) = (tvar)) #define TAILQ_FOREACH_REVERSE(var, head, headname, field) \ for ((var) = TAILQ_LAST((head), headname); \ (var); \ (var) = TAILQ_PREV((var), headname, field)) #define TAILQ_FOREACH_REVERSE_FROM(var, head, headname, field) \ for ((var) = ((var) ? (var) : TAILQ_LAST((head), headname)); \ (var); \ (var) = TAILQ_PREV((var), headname, field)) #define TAILQ_FOREACH_REVERSE_SAFE(var, head, headname, field, tvar) \ for ((var) = TAILQ_LAST((head), headname); \ (var) && ((tvar) = TAILQ_PREV((var), headname, field), 1); \ (var) = (tvar)) #define TAILQ_FOREACH_REVERSE_FROM_SAFE(var, head, headname, field, tvar) \ for ((var) = ((var) ? (var) : TAILQ_LAST((head), headname)); \ (var) && ((tvar) = TAILQ_PREV((var), headname, field), 1); \ (var) = (tvar)) #define TAILQ_INIT(head) do { \ TAILQ_FIRST((head)) = NULL; \ (head)->tqh_last = &TAILQ_FIRST((head)); \ QMD_TRACE_HEAD(head); \ } while (0) #define TAILQ_INSERT_AFTER(head, listelm, elm, field) do { \ QMD_TAILQ_CHECK_NEXT(listelm, field); \ if ((TAILQ_NEXT((elm), field) = TAILQ_NEXT((listelm), field)) != NULL)\ TAILQ_NEXT((elm), field)->field.tqe_prev = \ &TAILQ_NEXT((elm), field); \ else { \ (head)->tqh_last = &TAILQ_NEXT((elm), field); \ QMD_TRACE_HEAD(head); \ } \ TAILQ_NEXT((listelm), field) = (elm); \ (elm)->field.tqe_prev = &TAILQ_NEXT((listelm), field); \ QMD_TRACE_ELEM(&(elm)->field); \ QMD_TRACE_ELEM(&(listelm)->field); \ } while (0) #define TAILQ_INSERT_BEFORE(listelm, elm, field) do { \ QMD_TAILQ_CHECK_PREV(listelm, field); \ (elm)->field.tqe_prev = (listelm)->field.tqe_prev; \ TAILQ_NEXT((elm), field) = (listelm); \ *(listelm)->field.tqe_prev = (elm); \ (listelm)->field.tqe_prev = &TAILQ_NEXT((elm), field); \ QMD_TRACE_ELEM(&(elm)->field); \ QMD_TRACE_ELEM(&(listelm)->field); \ } while (0) #define TAILQ_INSERT_HEAD(head, elm, field) do { \ QMD_TAILQ_CHECK_HEAD(head, field); \ if ((TAILQ_NEXT((elm), field) = TAILQ_FIRST((head))) != NULL) \ TAILQ_FIRST((head))->field.tqe_prev = \ &TAILQ_NEXT((elm), field); \ else \ (head)->tqh_last = &TAILQ_NEXT((elm), field); \ TAILQ_FIRST((head)) = (elm); \ (elm)->field.tqe_prev = &TAILQ_FIRST((head)); \ QMD_TRACE_HEAD(head); \ QMD_TRACE_ELEM(&(elm)->field); \ } while (0) #define TAILQ_INSERT_TAIL(head, elm, field) do { \ QMD_TAILQ_CHECK_TAIL(head, field); \ TAILQ_NEXT((elm), field) = NULL; \ (elm)->field.tqe_prev = (head)->tqh_last; \ *(head)->tqh_last = (elm); \ (head)->tqh_last = &TAILQ_NEXT((elm), field); \ QMD_TRACE_HEAD(head); \ QMD_TRACE_ELEM(&(elm)->field); \ } while (0) #define TAILQ_LAST(head, headname) \ (*(((struct headname *)((head)->tqh_last))->tqh_last)) #define TAILQ_NEXT(elm, field) ((elm)->field.tqe_next) #define TAILQ_PREV(elm, headname, field) \ (*(((struct headname *)((elm)->field.tqe_prev))->tqh_last)) #define TAILQ_REMOVE(head, elm, field) do { \ QMD_SAVELINK(oldnext, (elm)->field.tqe_next); \ QMD_SAVELINK(oldprev, (elm)->field.tqe_prev); \ QMD_TAILQ_CHECK_NEXT(elm, field); \ QMD_TAILQ_CHECK_PREV(elm, field); \ if ((TAILQ_NEXT((elm), field)) != NULL) \ TAILQ_NEXT((elm), field)->field.tqe_prev = \ (elm)->field.tqe_prev; \ else { \ (head)->tqh_last = (elm)->field.tqe_prev; \ 
QMD_TRACE_HEAD(head); \ } \ *(elm)->field.tqe_prev = TAILQ_NEXT((elm), field); \ TRASHIT(*oldnext); \ TRASHIT(*oldprev); \ QMD_TRACE_ELEM(&(elm)->field); \ } while (0) #define TAILQ_SWAP(head1, head2, type, field) do { \ QUEUE_TYPEOF(type) *swap_first = (head1)->tqh_first; \ QUEUE_TYPEOF(type) **swap_last = (head1)->tqh_last; \ (head1)->tqh_first = (head2)->tqh_first; \ (head1)->tqh_last = (head2)->tqh_last; \ (head2)->tqh_first = swap_first; \ (head2)->tqh_last = swap_last; \ if ((swap_first = (head1)->tqh_first) != NULL) \ swap_first->field.tqe_prev = &(head1)->tqh_first; \ else \ (head1)->tqh_last = &(head1)->tqh_first; \ if ((swap_first = (head2)->tqh_first) != NULL) \ swap_first->field.tqe_prev = &(head2)->tqh_first; \ else \ (head2)->tqh_last = &(head2)->tqh_first; \ } while (0) #endif /* !_SYS_QUEUE_H_ */ Index: projects/clang390-import/sys/vm/pmap.h =================================================================== --- projects/clang390-import/sys/vm/pmap.h (revision 305686) +++ projects/clang390-import/sys/vm/pmap.h (revision 305687) @@ -1,161 +1,171 @@ /*- * Copyright (c) 1991, 1993 * The Regents of the University of California. All rights reserved. * * This code is derived from software contributed to Berkeley by * The Mach Operating System project at Carnegie-Mellon University. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * from: @(#)pmap.h 8.1 (Berkeley) 6/11/93 * * * Copyright (c) 1987, 1990 Carnegie-Mellon University. * All rights reserved. * * Author: Avadis Tevanian, Jr. * * Permission to use, copy, modify and distribute this software and * its documentation is hereby granted, provided that both the copyright * notice and this permission notice appear in all copies of the * software, derivative works or modified versions, and any portions * thereof, and that both notices appear in supporting documentation. * * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" * CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND * FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. 
 *
 * Carnegie Mellon requests users of this software to return to
 *
 *  Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
 *  School of Computer Science
 *  Carnegie Mellon University
 *  Pittsburgh PA 15213-3890
 *
 * any improvements or extensions that they make and grant Carnegie the
 * rights to redistribute these changes.
 *
 * $FreeBSD$
 */

/*
 * Machine address mapping definitions -- machine-independent
 * section.  [For machine-dependent section, see "machine/pmap.h".]
 */

#ifndef	_PMAP_VM_
#define	_PMAP_VM_

/*
 * Each machine-dependent implementation is expected to keep certain
 * statistics.  They may do this any way they so choose, but are expected
 * to return the statistics in the following structure.
 */
struct pmap_statistics {
	long resident_count;	/* # of pages mapped (total) */
	long wired_count;	/* # of pages wired */
};
typedef struct pmap_statistics *pmap_statistics_t;

/*
 * Each machine-dependent implementation is required to provide:
 *
 * vm_memattr_t	pmap_page_get_memattr(vm_page_t);
 * boolean_t	pmap_page_is_mapped(vm_page_t);
 * boolean_t	pmap_page_is_write_mapped(vm_page_t);
 * void		pmap_page_set_memattr(vm_page_t, vm_memattr_t);
 */
#include <machine/pmap.h>

#ifdef _KERNEL
struct thread;
/*
 * Updates to kernel_vm_end are synchronized by the kernel_map's system mutex.
 */
extern vm_offset_t kernel_vm_end;

/*
 * Flags for pmap_enter().  The bits in the low-order byte are reserved
 * for the protection code (vm_prot_t) that describes the fault type.
 */
#define	PMAP_ENTER_NOSLEEP	0x0100
#define	PMAP_ENTER_WIRED	0x0200

+/*
+ * Define the maximum number of machine-dependent reference bits that are
+ * cleared by a call to pmap_ts_referenced().  This limit serves two purposes.
+ * First, it bounds the cost of reference bit maintenance on widely shared
+ * pages.  Second, it prevents numeric overflow during maintenance of a
+ * widely shared page's "act_count" field.  An overflow could result in the
+ * premature deactivation of the page.
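+ *
+ * (For scale: "act_count" is a single-byte counter in struct vm_page, so
+ * an unbounded count returned for a page with many thousands of mappings
+ * could wrap it before the usual clamping applies; a small cap such as 5
+ * keeps each update far below that limit.)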
+ */
+#define	PMAP_TS_REFERENCED_MAX	5
+
void		 pmap_activate(struct thread *td);
void		 pmap_advise(pmap_t pmap, vm_offset_t sva, vm_offset_t eva,
		    int advice);
void		 pmap_align_superpage(vm_object_t, vm_ooffset_t, vm_offset_t *,
		    vm_size_t);
void		 pmap_clear_modify(vm_page_t m);
void		 pmap_copy(pmap_t, pmap_t, vm_offset_t, vm_size_t, vm_offset_t);
void		 pmap_copy_page(vm_page_t, vm_page_t);
void		 pmap_copy_pages(vm_page_t ma[], vm_offset_t a_offset,
		    vm_page_t mb[], vm_offset_t b_offset, int xfersize);
int		 pmap_enter(pmap_t pmap, vm_offset_t va, vm_page_t m,
		    vm_prot_t prot, u_int flags, int8_t psind);
void		 pmap_enter_object(pmap_t pmap, vm_offset_t start,
		    vm_offset_t end, vm_page_t m_start, vm_prot_t prot);
void		 pmap_enter_quick(pmap_t pmap, vm_offset_t va, vm_page_t m,
		    vm_prot_t prot);
vm_paddr_t	 pmap_extract(pmap_t pmap, vm_offset_t va);
vm_page_t	 pmap_extract_and_hold(pmap_t pmap, vm_offset_t va,
		    vm_prot_t prot);
void		 pmap_growkernel(vm_offset_t);
void		 pmap_init(void);
boolean_t	 pmap_is_modified(vm_page_t m);
boolean_t	 pmap_is_prefaultable(pmap_t pmap, vm_offset_t va);
boolean_t	 pmap_is_referenced(vm_page_t m);
vm_offset_t	 pmap_map(vm_offset_t *, vm_paddr_t, vm_paddr_t, int);
int		 pmap_mincore(pmap_t pmap, vm_offset_t addr,
		    vm_paddr_t *locked_pa);
void		 pmap_object_init_pt(pmap_t pmap, vm_offset_t addr,
		    vm_object_t object, vm_pindex_t pindex, vm_size_t size);
boolean_t	 pmap_page_exists_quick(pmap_t pmap, vm_page_t m);
void		 pmap_page_init(vm_page_t m);
int		 pmap_page_wired_mappings(vm_page_t m);
int		 pmap_pinit(pmap_t);
void		 pmap_pinit0(pmap_t);
void		 pmap_protect(pmap_t, vm_offset_t, vm_offset_t, vm_prot_t);
void		 pmap_qenter(vm_offset_t, vm_page_t *, int);
void		 pmap_qremove(vm_offset_t, int);
vm_offset_t	 pmap_quick_enter_page(vm_page_t);
void		 pmap_quick_remove_page(vm_offset_t);
void		 pmap_release(pmap_t);
void		 pmap_remove(pmap_t, vm_offset_t, vm_offset_t);
void		 pmap_remove_all(vm_page_t m);
void		 pmap_remove_pages(pmap_t);
void		 pmap_remove_write(vm_page_t m);
void		 pmap_sync_icache(pmap_t, vm_offset_t, vm_size_t);
int		 pmap_ts_referenced(vm_page_t m);
void		 pmap_unwire(pmap_t pmap, vm_offset_t start, vm_offset_t end);
void		 pmap_zero_page(vm_page_t);
void		 pmap_zero_page_area(vm_page_t, int off, int size);

#define	pmap_resident_count(pm)	((pm)->pm_stats.resident_count)
#define	pmap_wired_count(pm)	((pm)->pm_stats.wired_count)

#endif /* _KERNEL */

#endif /* _PMAP_VM_ */
Index: projects/clang390-import/tests/sys/kern/Makefile
===================================================================
--- projects/clang390-import/tests/sys/kern/Makefile	(revision 305686)
+++ projects/clang390-import/tests/sys/kern/Makefile	(revision 305687)
@@ -1,39 +1,40 @@
# $FreeBSD$

TESTSRC=	${SRCTOP}/contrib/netbsd-tests/kernel
.PATH: ${SRCTOP}/sys/kern

TESTSDIR=	${TESTSBASE}/sys/kern

ATF_TESTS_C+=	kern_copyin
ATF_TESTS_C+=	kern_descrip_test
ATF_TESTS_C+=	ptrace_test
PLAIN_TESTS_C+=	subr_unit_test
ATF_TESTS_C+=	unix_seqpacket_test
ATF_TESTS_C+=	unix_passfd_test
TEST_METADATA.unix_seqpacket_test+=	timeout="15"
+ATF_TESTS_C+=	waitpid_nohang
LIBADD.ptrace_test+=	pthread
LIBADD.unix_seqpacket_test+=	pthread
NETBSD_ATF_TESTS_C+=	lockf_test
NETBSD_ATF_TESTS_C+=	mqueue_test

CFLAGS.mqueue_test+=	-I${SRCTOP}/tests
LIBADD.mqueue_test+=	rt

# subr_unit.c contains functions whose prototypes lie in headers that cannot be
# included in userland.  But as far as subr_unit_test goes, they're effectively
# static.  So it's ok to disable -Wmissing-prototypes for this program.
CFLAGS.subr_unit.c+=	-Wno-missing-prototypes
SRCS.subr_unit_test+=	subr_unit.c

WARNS?=	5

TESTS_SUBDIRS+=	acct
TESTS_SUBDIRS+=	execve
TESTS_SUBDIRS+=	pipe

.include <bsd.own.mk>
.include <bsd.test.mk>
Index: projects/clang390-import/tests/sys/kern/waitpid_nohang.c
===================================================================
--- projects/clang390-import/tests/sys/kern/waitpid_nohang.c	(nonexistent)
+++ projects/clang390-import/tests/sys/kern/waitpid_nohang.c	(revision 305687)
@@ -0,0 +1,70 @@
+/*-
+ * Copyright (c) 2016 Jilles Tjoelker
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/wait.h>
+
+#include <atf-c.h>
+#include <signal.h>
+#include <unistd.h>
+
+ATF_TC_WITHOUT_HEAD(waitpid_nohang);
+ATF_TC_BODY(waitpid_nohang, tc)
+{
+	pid_t child, pid;
+	int status, r;
+
+	child = fork();
+	ATF_REQUIRE(child != -1);
+	if (child == 0) {
+		sleep(10);
+		_exit(1);
+	}
+
+	status = 42;
+	pid = waitpid(child, &status, WNOHANG);
+	ATF_REQUIRE(pid == 0);
+	ATF_CHECK(status == 42);
+
+	r = kill(child, SIGTERM);
+	ATF_REQUIRE(r == 0);
+	r = waitid(P_PID, child, NULL, WEXITED | WNOWAIT);
+	ATF_REQUIRE(r == 0);
+
+	status = -1;
+	pid = waitpid(child, &status, WNOHANG);
+	ATF_REQUIRE(pid == child);
+	ATF_CHECK(WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM);
+}
+
+ATF_TP_ADD_TCS(tp)
+{
+
+	ATF_TP_ADD_TC(tp, waitpid_nohang);
+	return (atf_no_error());
+}
Property changes on: projects/clang390-import/tests/sys/kern/waitpid_nohang.c
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: projects/clang390-import/tools/regression/capsicum/libcasper/dns.c
===================================================================
--- projects/clang390-import/tools/regression/capsicum/libcasper/dns.c	(revision 305686)
+++ projects/clang390-import/tools/regression/capsicum/libcasper/dns.c	(nonexistent)
@@ -1,702 +0,0 @@
-/*-
- * Copyright (c) 2013 The FreeBSD Foundation
- * All rights reserved.
- *
- * This software was developed by Pawel Jakub Dawidek under sponsorship from
- * the FreeBSD Foundation.
- * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - */ - -#include -__FBSDID("$FreeBSD$"); - -#include - -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include - -static int ntest = 1; - -#define CHECK(expr) do { \ - if ((expr)) \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - else \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - ntest++; \ -} while (0) -#define CHECKX(expr) do { \ - if ((expr)) { \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - } else { \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - exit(1); \ - } \ - ntest++; \ -} while (0) - -#define GETHOSTBYNAME 0x01 -#define GETHOSTBYNAME2_AF_INET 0x02 -#define GETHOSTBYNAME2_AF_INET6 0x04 -#define GETHOSTBYADDR_AF_INET 0x08 -#define GETHOSTBYADDR_AF_INET6 0x10 -#define GETADDRINFO_AF_UNSPEC 0x20 -#define GETADDRINFO_AF_INET 0x40 -#define GETADDRINFO_AF_INET6 0x80 - -static bool -addrinfo_compare(struct addrinfo *ai0, struct addrinfo *ai1) -{ - struct addrinfo *at0, *at1; - - if (ai0 == NULL && ai1 == NULL) - return (true); - if (ai0 == NULL || ai1 == NULL) - return (false); - - at0 = ai0; - at1 = ai1; - while (true) { - if ((at0->ai_flags == at1->ai_flags) && - (at0->ai_family == at1->ai_family) && - (at0->ai_socktype == at1->ai_socktype) && - (at0->ai_protocol == at1->ai_protocol) && - (at0->ai_addrlen == at1->ai_addrlen) && - (memcmp(at0->ai_addr, at1->ai_addr, - at0->ai_addrlen) == 0)) { - if (at0->ai_canonname != NULL && - at1->ai_canonname != NULL) { - if (strcmp(at0->ai_canonname, - at1->ai_canonname) != 0) { - return (false); - } - } - - if (at0->ai_canonname == NULL && - at1->ai_canonname != NULL) { - return (false); - } - if (at0->ai_canonname != NULL && - at1->ai_canonname == NULL) { - return (false); - } - - if (at0->ai_next == NULL && at1->ai_next == NULL) - return (true); - if (at0->ai_next == NULL || at1->ai_next == NULL) - return (false); - - at0 = at0->ai_next; - at1 = at1->ai_next; - } else { - return (false); - } - } - - /* NOTREACHED */ - fprintf(stderr, "Dead code reached in addrinfo_compare()\n"); - exit(1); -} - -static bool -hostent_aliases_compare(char **aliases0, char **aliases1) -{ - int i0, i1; - - if (aliases0 == NULL && aliases1 == NULL) - return (true); - 
if (aliases0 == NULL || aliases1 == NULL) - return (false); - - for (i0 = 0; aliases0[i0] != NULL; i0++) { - for (i1 = 0; aliases1[i1] != NULL; i1++) { - if (strcmp(aliases0[i0], aliases1[i1]) == 0) - break; - } - if (aliases1[i1] == NULL) - return (false); - } - - return (true); -} - -static bool -hostent_addr_list_compare(char **addr_list0, char **addr_list1, int length) -{ - int i0, i1; - - if (addr_list0 == NULL && addr_list1 == NULL) - return (true); - if (addr_list0 == NULL || addr_list1 == NULL) - return (false); - - for (i0 = 0; addr_list0[i0] != NULL; i0++) { - for (i1 = 0; addr_list1[i1] != NULL; i1++) { - if (memcmp(addr_list0[i0], addr_list1[i1], length) == 0) - break; - } - if (addr_list1[i1] == NULL) - return (false); - } - - return (true); -} - -static bool -hostent_compare(const struct hostent *hp0, const struct hostent *hp1) -{ - - if (hp0 == NULL && hp1 != NULL) - return (true); - - if (hp0 == NULL || hp1 == NULL) - return (false); - - if (hp0->h_name != NULL || hp1->h_name != NULL) { - if (hp0->h_name == NULL || hp1->h_name == NULL) - return (false); - if (strcmp(hp0->h_name, hp1->h_name) != 0) - return (false); - } - - if (!hostent_aliases_compare(hp0->h_aliases, hp1->h_aliases)) - return (false); - if (!hostent_aliases_compare(hp1->h_aliases, hp0->h_aliases)) - return (false); - - if (hp0->h_addrtype != hp1->h_addrtype) - return (false); - - if (hp0->h_length != hp1->h_length) - return (false); - - if (!hostent_addr_list_compare(hp0->h_addr_list, hp1->h_addr_list, - hp0->h_length)) { - return (false); - } - if (!hostent_addr_list_compare(hp1->h_addr_list, hp0->h_addr_list, - hp0->h_length)) { - return (false); - } - - return (true); -} - -static unsigned int -runtest(cap_channel_t *capdns) -{ - unsigned int result; - struct addrinfo *ais, *aic, hints, *hintsp; - struct hostent *hps, *hpc; - struct in_addr ip4; - struct in6_addr ip6; - - result = 0; - - hps = gethostbyname("example.com"); - if (hps == NULL) - fprintf(stderr, "Unable to resolve %s IPv4.\n", "example.com"); - hpc = cap_gethostbyname(capdns, "example.com"); - if (hostent_compare(hps, hpc)) - result |= GETHOSTBYNAME; - - hps = gethostbyname2("example.com", AF_INET); - if (hps == NULL) - fprintf(stderr, "Unable to resolve %s IPv4.\n", "example.com"); - hpc = cap_gethostbyname2(capdns, "example.com", AF_INET); - if (hostent_compare(hps, hpc)) - result |= GETHOSTBYNAME2_AF_INET; - - hps = gethostbyname2("example.com", AF_INET6); - if (hps == NULL) - fprintf(stderr, "Unable to resolve %s IPv6.\n", "example.com"); - hpc = cap_gethostbyname2(capdns, "example.com", AF_INET6); - if (hostent_compare(hps, hpc)) - result |= GETHOSTBYNAME2_AF_INET6; - - hints.ai_flags = 0; - hints.ai_family = AF_UNSPEC; - hints.ai_socktype = 0; - hints.ai_protocol = 0; - hints.ai_addrlen = 0; - hints.ai_addr = NULL; - hints.ai_canonname = NULL; - hints.ai_next = NULL; - - hintsp = &hints; - - if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { - fprintf(stderr, - "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", - gai_strerror(errno)); - } - if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { - if (addrinfo_compare(ais, aic)) - result |= GETADDRINFO_AF_UNSPEC; - freeaddrinfo(ais); - freeaddrinfo(aic); - } - - hints.ai_family = AF_INET; - if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { - fprintf(stderr, - "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", - gai_strerror(errno)); - } - if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { - if 
(addrinfo_compare(ais, aic)) - result |= GETADDRINFO_AF_INET; - freeaddrinfo(ais); - freeaddrinfo(aic); - } - - hints.ai_family = AF_INET6; - if (getaddrinfo("freebsd.org", "25", hintsp, &ais) != 0) { - fprintf(stderr, - "Unable to issue [system] getaddrinfo() for AF_UNSPEC: %s\n", - gai_strerror(errno)); - } - if (cap_getaddrinfo(capdns, "freebsd.org", "25", hintsp, &aic) == 0) { - if (addrinfo_compare(ais, aic)) - result |= GETADDRINFO_AF_INET6; - freeaddrinfo(ais); - freeaddrinfo(aic); - } - - /* - * 8.8.178.135 is IPv4 address of freefall.freebsd.org - * as of 27 October 2013. - */ - inet_pton(AF_INET, "8.8.178.135", &ip4); - hps = gethostbyaddr(&ip4, sizeof(ip4), AF_INET); - if (hps == NULL) - fprintf(stderr, "Unable to resolve %s.\n", "8.8.178.135"); - hpc = cap_gethostbyaddr(capdns, &ip4, sizeof(ip4), AF_INET); - if (hostent_compare(hps, hpc)) - result |= GETHOSTBYADDR_AF_INET; - - /* - * 2001:1900:2254:206c::16:87 is IPv6 address of freefall.freebsd.org - * as of 27 October 2013. - */ - inet_pton(AF_INET6, "2001:1900:2254:206c::16:87", &ip6); - hps = gethostbyaddr(&ip6, sizeof(ip6), AF_INET6); - if (hps == NULL) { - fprintf(stderr, "Unable to resolve %s.\n", - "2001:1900:2254:206c::16:87"); - } - hpc = cap_gethostbyaddr(capdns, &ip6, sizeof(ip6), AF_INET6); - if (hostent_compare(hps, hpc)) - result |= GETHOSTBYADDR_AF_INET6; - - return (result); -} - -int -main(void) -{ - cap_channel_t *capcas, *capdns, *origcapdns; - const char *types[2]; - int families[2]; - - printf("1..91\n"); - - capcas = cap_init(); - CHECKX(capcas != NULL); - - origcapdns = capdns = cap_service_open(capcas, "system.dns"); - CHECKX(capdns != NULL); - - cap_close(capcas); - - /* No limits set. */ - - CHECK(runtest(capdns) == - (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6 | - GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_UNSPEC | GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); - - /* - * Allow: - * type: NAME, ADDR - * family: AF_INET, AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - - CHECK(runtest(capdns) == - (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6 | - GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); - - cap_close(capdns); - - /* - * Allow: - * type: NAME - * family: AF_INET, AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - - CHECK(runtest(capdns) == - (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYNAME2_AF_INET6)); - - cap_close(capdns); - - /* - * Allow: - * type: ADDR - * family: AF_INET, AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - 
families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - - CHECK(runtest(capdns) == - (GETHOSTBYADDR_AF_INET | GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_INET | GETADDRINFO_AF_INET6)); - - cap_close(capdns); - - /* - * Allow: - * type: NAME, ADDR - * family: AF_INET - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == - (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET | GETHOSTBYADDR_AF_INET | - GETADDRINFO_AF_INET)); - - cap_close(capdns); - - /* - * Allow: - * type: NAME, ADDR - * family: AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == - (GETHOSTBYNAME2_AF_INET6 | GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_INET6)); - - cap_close(capdns); - - /* Below we also test further limiting capability. */ - - /* - * Allow: - * type: NAME - * family: AF_INET - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET)); - - cap_close(capdns); - - /* - * Allow: - * type: NAME - * family: AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == 
GETHOSTBYNAME2_AF_INET6); - - cap_close(capdns); - - /* - * Allow: - * type: ADDR - * family: AF_INET - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET | GETADDRINFO_AF_INET)); - - cap_close(capdns); - - /* - * Allow: - * type: ADDR - * family: AF_INET6 - */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == 0); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == 0); - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - types[1] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - families[1] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_INET6)); - - cap_close(capdns); - - /* Trying to raise the limits. */ - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(cap_dns_type_limit(capdns, NULL, 0) == -1 && - errno == ENOTCAPABLE); - CHECK(cap_dns_family_limit(capdns, NULL, 0) == -1 && - errno == ENOTCAPABLE); - - /* Do the limits still hold? 
*/ - CHECK(runtest(capdns) == (GETHOSTBYNAME | GETHOSTBYNAME2_AF_INET)); - - cap_close(capdns); - - capdns = cap_clone(origcapdns); - CHECK(capdns != NULL); - - types[0] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 1) == 0); - families[0] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 1) == 0); - - types[0] = "NAME"; - types[1] = "ADDR"; - CHECK(cap_dns_type_limit(capdns, types, 2) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - families[1] = AF_INET6; - CHECK(cap_dns_family_limit(capdns, families, 2) == -1 && - errno == ENOTCAPABLE); - - types[0] = "NAME"; - CHECK(cap_dns_type_limit(capdns, types, 1) == -1 && - errno == ENOTCAPABLE); - families[0] = AF_INET; - CHECK(cap_dns_family_limit(capdns, families, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(cap_dns_type_limit(capdns, NULL, 0) == -1 && - errno == ENOTCAPABLE); - CHECK(cap_dns_family_limit(capdns, NULL, 0) == -1 && - errno == ENOTCAPABLE); - - /* Do the limits still hold? */ - CHECK(runtest(capdns) == (GETHOSTBYADDR_AF_INET6 | - GETADDRINFO_AF_INET6)); - - cap_close(capdns); - - cap_close(origcapdns); - - exit(0); -} Property changes on: projects/clang390-import/tools/regression/capsicum/libcasper/dns.c ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: projects/clang390-import/tools/regression/capsicum/libcasper/grp.c =================================================================== --- projects/clang390-import/tools/regression/capsicum/libcasper/grp.c (revision 305686) +++ projects/clang390-import/tools/regression/capsicum/libcasper/grp.c (nonexistent) @@ -1,1550 +0,0 @@ -/*- - * Copyright (c) 2013 The FreeBSD Foundation - * All rights reserved. - * - * This software was developed by Pawel Jakub Dawidek under sponsorship from - * the FreeBSD Foundation. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. 
- */ - -#include -__FBSDID("$FreeBSD$"); - -#include - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include - -static int ntest = 1; - -#define CHECK(expr) do { \ - if ((expr)) \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - else \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - ntest++; \ -} while (0) -#define CHECKX(expr) do { \ - if ((expr)) { \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - } else { \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - exit(1); \ - } \ - ntest++; \ -} while (0) - -#define GID_WHEEL 0 -#define GID_OPERATOR 5 - -#define GETGRENT0 0x0001 -#define GETGRENT1 0x0002 -#define GETGRENT2 0x0004 -#define GETGRENT (GETGRENT0 | GETGRENT1 | GETGRENT2) -#define GETGRENT_R0 0x0008 -#define GETGRENT_R1 0x0010 -#define GETGRENT_R2 0x0020 -#define GETGRENT_R (GETGRENT_R0 | GETGRENT_R1 | GETGRENT_R2) -#define GETGRNAM 0x0040 -#define GETGRNAM_R 0x0080 -#define GETGRGID 0x0100 -#define GETGRGID_R 0x0200 -#define SETGRENT 0x0400 - -static bool -group_mem_compare(char **mem0, char **mem1) -{ - int i0, i1; - - if (mem0 == NULL && mem1 == NULL) - return (true); - if (mem0 == NULL || mem1 == NULL) - return (false); - - for (i0 = 0; mem0[i0] != NULL; i0++) { - for (i1 = 0; mem1[i1] != NULL; i1++) { - if (strcmp(mem0[i0], mem1[i1]) == 0) - break; - } - if (mem1[i1] == NULL) - return (false); - } - - return (true); -} - -static bool -group_compare(const struct group *grp0, const struct group *grp1) -{ - - if (grp0 == NULL && grp1 == NULL) - return (true); - if (grp0 == NULL || grp1 == NULL) - return (false); - - if (strcmp(grp0->gr_name, grp1->gr_name) != 0) - return (false); - - if (grp0->gr_passwd != NULL || grp1->gr_passwd != NULL) { - if (grp0->gr_passwd == NULL || grp1->gr_passwd == NULL) - return (false); - if (strcmp(grp0->gr_passwd, grp1->gr_passwd) != 0) - return (false); - } - - if (grp0->gr_gid != grp1->gr_gid) - return (false); - - if (!group_mem_compare(grp0->gr_mem, grp1->gr_mem)) - return (false); - - return (true); -} - -static unsigned int -runtest_cmds(cap_channel_t *capgrp) -{ - char bufs[1024], bufc[1024]; - unsigned int result; - struct group *grps, *grpc; - struct group sts, stc; - - result = 0; - - (void)setgrent(); - if (cap_setgrent(capgrp) == 1) - result |= SETGRENT; - - grps = getgrent(); - grpc = cap_getgrent(capgrp); - if (group_compare(grps, grpc)) { - result |= GETGRENT0; - grps = getgrent(); - grpc = cap_getgrent(capgrp); - if (group_compare(grps, grpc)) - result |= GETGRENT1; - } - - getgrent_r(&sts, bufs, sizeof(bufs), &grps); - cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); - if (group_compare(grps, grpc)) { - result |= GETGRENT_R0; - getgrent_r(&sts, bufs, sizeof(bufs), &grps); - cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); - if (group_compare(grps, grpc)) - result |= GETGRENT_R1; - } - - (void)setgrent(); - if (cap_setgrent(capgrp) == 1) - result |= SETGRENT; - - getgrent_r(&sts, bufs, sizeof(bufs), &grps); - cap_getgrent_r(capgrp, &stc, bufc, sizeof(bufc), &grpc); - if (group_compare(grps, grpc)) - result |= GETGRENT_R2; - - grps = getgrent(); - grpc = cap_getgrent(capgrp); - if (group_compare(grps, grpc)) - result |= GETGRENT2; - - grps = getgrnam("wheel"); - grpc = cap_getgrnam(capgrp, "wheel"); - if (group_compare(grps, grpc)) { - grps = getgrnam("operator"); - grpc = cap_getgrnam(capgrp, "operator"); - if (group_compare(grps, grpc)) - result |= GETGRNAM; - } - - getgrnam_r("wheel", &sts, bufs, sizeof(bufs), &grps); - 
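/*
 * The reentrant variants are exercised the same way: the query just made
 * through libc is repeated below through the casper channel, and the two
 * results must agree field by field (see group_compare()).
 */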
cap_getgrnam_r(capgrp, "wheel", &stc, bufc, sizeof(bufc), &grpc); - if (group_compare(grps, grpc)) { - getgrnam_r("operator", &sts, bufs, sizeof(bufs), &grps); - cap_getgrnam_r(capgrp, "operator", &stc, bufc, sizeof(bufc), - &grpc); - if (group_compare(grps, grpc)) - result |= GETGRNAM_R; - } - - grps = getgrgid(GID_WHEEL); - grpc = cap_getgrgid(capgrp, GID_WHEEL); - if (group_compare(grps, grpc)) { - grps = getgrgid(GID_OPERATOR); - grpc = cap_getgrgid(capgrp, GID_OPERATOR); - if (group_compare(grps, grpc)) - result |= GETGRGID; - } - - getgrgid_r(GID_WHEEL, &sts, bufs, sizeof(bufs), &grps); - cap_getgrgid_r(capgrp, GID_WHEEL, &stc, bufc, sizeof(bufc), &grpc); - if (group_compare(grps, grpc)) { - getgrgid_r(GID_OPERATOR, &sts, bufs, sizeof(bufs), &grps); - cap_getgrgid_r(capgrp, GID_OPERATOR, &stc, bufc, sizeof(bufc), - &grpc); - if (group_compare(grps, grpc)) - result |= GETGRGID_R; - } - - return (result); -} - -static void -test_cmds(cap_channel_t *origcapgrp) -{ - cap_channel_t *capgrp; - const char *cmds[7], *fields[4], *names[5]; - gid_t gids[5]; - - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - - names[0] = "wheel"; - names[1] = "daemon"; - names[2] = "kmem"; - names[3] = "sys"; - names[4] = "operator"; - - gids[0] = 0; - gids[1] = 1; - gids[2] = 2; - gids[3] = 3; - gids[4] = 5; - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == 0); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == 0); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: setgrent - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "getgrent"; - cmds[1] = "getgrent_r"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - 
cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "setgrent"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (GETGRENT0 | GETGRENT1 | GETGRENT_R0 | - GETGRENT_R1 | GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: setgrent - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "getgrent"; - cmds[1] = "getgrent_r"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "setgrent"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (GETGRENT0 | GETGRENT1 | GETGRENT_R0 | - GETGRENT_R1 | GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrent - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent_r"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrent"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT_R2 | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrent - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent_r"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrent"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - 
CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT_R2 | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrent_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrent_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT0 | GETGRENT1 | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrnam, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrent_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrnam"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrent_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT0 | GETGRENT1 | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrnam - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrnam"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - 
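/*
 * For reference, a minimal consumer-side sketch of the pattern these cases
 * verify: open the system.grp service, narrow it, enter capability mode,
 * and resolve a group through the channel.  This is illustrative only; the
 * header paths (<libcasper.h>, <casper/cap_grp.h>) are assumed to match
 * the libcasper layout these tests build against, and cap_enter(2) is
 * assumed to be available.
 */

#include <sys/capsicum.h>

#include <err.h>
#include <grp.h>
#include <stdio.h>

#include <libcasper.h>

#include <casper/cap_grp.h>

int
main(void)
{
	cap_channel_t *capcas, *capgrp;
	const char *cmds[] = { "getgrnam" };
	const char *fields[] = { "gr_name", "gr_gid" };
	struct group *grp;

	/* Open casper and the system.grp service, then drop casper itself. */
	capcas = cap_init();
	if (capcas == NULL)
		err(1, "cap_init");
	capgrp = cap_service_open(capcas, "system.grp");
	if (capgrp == NULL)
		err(1, "cap_service_open");
	cap_close(capcas);

	/* Narrow the channel before sandboxing; widening later would fail. */
	if (cap_grp_limit_cmds(capgrp, cmds, 1) < 0)
		err(1, "cap_grp_limit_cmds");
	if (cap_grp_limit_fields(capgrp, fields, 2) < 0)
		err(1, "cap_grp_limit_fields");

	if (cap_enter() < 0)
		err(1, "cap_enter");

	/* Only getgrnam-style lookups, and only gr_name/gr_gid, remain. */
	grp = cap_getgrnam(capgrp, "wheel");
	if (grp == NULL)
		errx(1, "cap_getgrnam(\"wheel\") failed");
	printf("wheel has gid %u\n", (unsigned)grp->gr_gid);

	cap_close(capgrp);
	return (0);
}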
/* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam_r, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrnam - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam_r"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrnam"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM_R | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrnam_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrnam_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, - * getgrgid, getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrnam_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrgid"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrnam_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRGID | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrgid - * fields: - * groups: - */ - capgrp = 
cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrgid"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid_r - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrgid - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrgid"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID_R)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: wheel, daemon, kmem, sys, operator - * gids: - * Disallow: - * cmds: getgrgid_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) == 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID)); - - cap_close(capgrp); - - /* - * Allow: - * cmds: setgrent, getgrent, getgrent_r, getgrnam, getgrnam_r, - * getgrgid - * fields: gr_name, gr_passwd, gr_gid, gr_mem - * groups: - * names: - * gids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getgrgid_r - * fields: - * groups: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 6) 
== 0); - cmds[0] = "setgrent"; - cmds[1] = "getgrent"; - cmds[2] = "getgrent_r"; - cmds[3] = "getgrnam"; - cmds[4] = "getgrnam_r"; - cmds[5] = "getgrgid"; - cmds[6] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getgrgid_r"; - CHECK(cap_grp_limit_cmds(capgrp, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 5) == 0); - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID)); - - cap_close(capgrp); -} - -#define GR_NAME 0x01 -#define GR_PASSWD 0x02 -#define GR_GID 0x04 -#define GR_MEM 0x08 - -static unsigned int -group_fields(const struct group *grp) -{ - unsigned int result; - - result = 0; - - if (grp->gr_name != NULL && grp->gr_name[0] != '\0') - result |= GR_NAME; - - if (grp->gr_passwd != NULL && grp->gr_passwd[0] != '\0') - result |= GR_PASSWD; - - if (grp->gr_gid != (gid_t)-1) - result |= GR_GID; - - if (grp->gr_mem != NULL && grp->gr_mem[0] != NULL) - result |= GR_MEM; - - return (result); -} - -static bool -runtest_fields(cap_channel_t *capgrp, unsigned int expected) -{ - char buf[1024]; - struct group *grp; - struct group st; - - (void)cap_setgrent(capgrp); - grp = cap_getgrent(capgrp); - if (group_fields(grp) != expected) - return (false); - - (void)cap_setgrent(capgrp); - cap_getgrent_r(capgrp, &st, buf, sizeof(buf), &grp); - if (group_fields(grp) != expected) - return (false); - - grp = cap_getgrnam(capgrp, "wheel"); - if (group_fields(grp) != expected) - return (false); - - cap_getgrnam_r(capgrp, "wheel", &st, buf, sizeof(buf), &grp); - if (group_fields(grp) != expected) - return (false); - - grp = cap_getgrgid(capgrp, GID_WHEEL); - if (group_fields(grp) != expected) - return (false); - - cap_getgrgid_r(capgrp, GID_WHEEL, &st, buf, sizeof(buf), &grp); - if (group_fields(grp) != expected) - return (false); - - return (true); -} - -static void -test_fields(cap_channel_t *origcapgrp) -{ - cap_channel_t *capgrp; - const char *fields[4]; - - /* No limits. 
*/ - - CHECK(runtest_fields(origcapgrp, GR_NAME | GR_PASSWD | GR_GID | GR_MEM)); - - /* - * Allow: - * fields: gr_name, gr_passwd, gr_gid, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == 0); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_GID | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_passwd, gr_gid, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_passwd"; - fields[1] = "gr_gid"; - fields[2] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_PASSWD | GR_GID | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_gid, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_gid"; - fields[2] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_passwd"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_GID | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_passwd, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_passwd, gr_gid - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 3) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD | GR_GID)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_passwd - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_PASSWD)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_gid - */ - capgrp = cap_clone(origcapgrp); - 
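/*
 * cap_clone() returns an independent channel to the same service, so each
 * case below can apply its own field limits from a clean slate instead of
 * inheriting the narrowing done by an earlier case.
 */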
CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_GID)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_name, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_name"; - fields[1] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_passwd"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_NAME | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_passwd, gr_gid - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_passwd"; - fields[1] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_PASSWD | GR_GID)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_passwd, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_passwd"; - fields[1] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_gid"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_PASSWD | GR_MEM)); - - cap_close(capgrp); - - /* - * Allow: - * fields: gr_gid, gr_mem - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - fields[0] = "gr_gid"; - fields[1] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 2) == 0); - fields[0] = "gr_name"; - fields[1] = "gr_passwd"; - fields[2] = "gr_gid"; - fields[3] = "gr_mem"; - CHECK(cap_grp_limit_fields(capgrp, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "gr_passwd"; - CHECK(cap_grp_limit_fields(capgrp, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(capgrp, GR_GID | GR_MEM)); - - cap_close(capgrp); -} - -static bool -runtest_groups(cap_channel_t *capgrp, const char **names, const gid_t *gids, - size_t ngroups) -{ - char buf[1024]; - struct group *grp; - struct group st; - unsigned int i, got; - - (void)cap_setgrent(capgrp); - got = 0; - for (;;) { - grp = cap_getgrent(capgrp); - if (grp == NULL) - break; - got++; - for (i = 0; i < ngroups; i++) { - if (strcmp(names[i], grp->gr_name) == 0 && - gids[i] == grp->gr_gid) { - break; - } - } - if (i == ngroups) - return (false); - } - if (got != ngroups) - return (false); - - (void)cap_setgrent(capgrp); - got = 0; - for (;;) { - cap_getgrent_r(capgrp, &st, buf, sizeof(buf), &grp); - if (grp == NULL) - break; - got++; - for (i = 0; i < ngroups; i++) { - if 
(strcmp(names[i], grp->gr_name) == 0 && - gids[i] == grp->gr_gid) { - break; - } - } - if (i == ngroups) - return (false); - } - if (got != ngroups) - return (false); - - for (i = 0; i < ngroups; i++) { - grp = cap_getgrnam(capgrp, names[i]); - if (grp == NULL) - return (false); - } - - for (i = 0; i < ngroups; i++) { - cap_getgrnam_r(capgrp, names[i], &st, buf, sizeof(buf), &grp); - if (grp == NULL) - return (false); - } - - for (i = 0; i < ngroups; i++) { - grp = cap_getgrgid(capgrp, gids[i]); - if (grp == NULL) - return (false); - } - - for (i = 0; i < ngroups; i++) { - cap_getgrgid_r(capgrp, gids[i], &st, buf, sizeof(buf), &grp); - if (grp == NULL) - return (false); - } - - return (true); -} - -static void -test_groups(cap_channel_t *origcapgrp) -{ - cap_channel_t *capgrp; - const char *names[5]; - gid_t gids[5]; - - /* - * Allow: - * groups: - * names: wheel, daemon, kmem, sys, tty - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "wheel"; - names[1] = "daemon"; - names[2] = "kmem"; - names[3] = "sys"; - names[4] = "tty"; - CHECK(cap_grp_limit_groups(capgrp, names, 5, NULL, 0) == 0); - gids[0] = 0; - gids[1] = 1; - gids[2] = 2; - gids[3] = 3; - gids[4] = 4; - - CHECK(runtest_groups(capgrp, names, gids, 5)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: kmem, sys, tty - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "kmem"; - names[1] = "sys"; - names[2] = "tty"; - CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == 0); - names[3] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 4, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "kmem"; - gids[0] = 2; - gids[1] = 3; - gids[2] = 4; - - CHECK(runtest_groups(capgrp, names, gids, 3)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: wheel, kmem, tty - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "wheel"; - names[1] = "kmem"; - names[2] = "tty"; - CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == 0); - names[3] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 4, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "wheel"; - gids[0] = 0; - gids[1] = 2; - gids[2] = 4; - - CHECK(runtest_groups(capgrp, names, gids, 3)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: - * gids: 2, 3, 4 - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "kmem"; - names[1] = "sys"; - names[2] = "tty"; - gids[0] = 2; - gids[1] = 3; - gids[2] = 4; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == 0); - gids[3] = 0; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 4) == -1 && - errno == ENOTCAPABLE); - gids[0] = 0; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && - errno == ENOTCAPABLE); - gids[0] = 2; - - CHECK(runtest_groups(capgrp, names, gids, 3)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: - * gids: 0, 2, 4 - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "wheel"; - names[1] = "kmem"; - names[2] = "tty"; - gids[0] = 0; - gids[1] = 2; - gids[2] = 4; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == 0); - gids[3] = 1; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 4) == -1 && - errno == ENOTCAPABLE); - gids[0] = 1; - 
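/*
 * gid 1 lies outside the allowed {0, 2, 4} set, so even a single-entry
 * limit request for it must be rejected with ENOTCAPABLE.
 */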
CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && - errno == ENOTCAPABLE); - gids[0] = 0; - - CHECK(runtest_groups(capgrp, names, gids, 3)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: kmem - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "kmem"; - CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == 0); - names[1] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 2, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "kmem"; - gids[0] = 2; - - CHECK(runtest_groups(capgrp, names, gids, 1)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: wheel, tty - * gids: - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "wheel"; - names[1] = "tty"; - CHECK(cap_grp_limit_groups(capgrp, names, 2, NULL, 0) == 0); - names[2] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 3, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - CHECK(cap_grp_limit_groups(capgrp, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "wheel"; - gids[0] = 0; - gids[1] = 4; - - CHECK(runtest_groups(capgrp, names, gids, 2)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: - * gids: 2 - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "kmem"; - gids[0] = 2; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == 0); - gids[1] = 1; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 2) == -1 && - errno == ENOTCAPABLE); - gids[0] = 1; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && - errno == ENOTCAPABLE); - gids[0] = 2; - - CHECK(runtest_groups(capgrp, names, gids, 1)); - - cap_close(capgrp); - - /* - * Allow: - * groups: - * names: - * gids: 0, 4 - */ - capgrp = cap_clone(origcapgrp); - CHECK(capgrp != NULL); - - names[0] = "wheel"; - names[1] = "tty"; - gids[0] = 0; - gids[1] = 4; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 2) == 0); - gids[2] = 1; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 3) == -1 && - errno == ENOTCAPABLE); - gids[0] = 1; - CHECK(cap_grp_limit_groups(capgrp, NULL, 0, gids, 1) == -1 && - errno == ENOTCAPABLE); - gids[0] = 0; - - CHECK(runtest_groups(capgrp, names, gids, 2)); - - cap_close(capgrp); -} - -int -main(void) -{ - cap_channel_t *capcas, *capgrp; - - printf("1..199\n"); - - capcas = cap_init(); - CHECKX(capcas != NULL); - - capgrp = cap_service_open(capcas, "system.grp"); - CHECKX(capgrp != NULL); - - cap_close(capcas); - - /* No limits. 
*/ - - CHECK(runtest_cmds(capgrp) == (SETGRENT | GETGRENT | GETGRENT_R | - GETGRNAM | GETGRNAM_R | GETGRGID | GETGRGID_R)); - - test_cmds(capgrp); - - test_fields(capgrp); - - test_groups(capgrp); - - cap_close(capgrp); - - exit(0); -} Property changes on: projects/clang390-import/tools/regression/capsicum/libcasper/grp.c ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: projects/clang390-import/tools/regression/capsicum/libcasper/pwd.c =================================================================== --- projects/clang390-import/tools/regression/capsicum/libcasper/pwd.c (revision 305686) +++ projects/clang390-import/tools/regression/capsicum/libcasper/pwd.c (nonexistent) @@ -1,1536 +0,0 @@ -/*- - * Copyright (c) 2013 The FreeBSD Foundation - * All rights reserved. - * - * This software was developed by Pawel Jakub Dawidek under sponsorship from - * the FreeBSD Foundation. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. - * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. 
- */ - -#include -__FBSDID("$FreeBSD$"); - -#include - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include - -static int ntest = 1; - -#define CHECK(expr) do { \ - if ((expr)) \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - else \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__);\ - ntest++; \ -} while (0) -#define CHECKX(expr) do { \ - if ((expr)) { \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - } else { \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__);\ - exit(1); \ - } \ - ntest++; \ -} while (0) - -#define UID_ROOT 0 -#define UID_OPERATOR 2 - -#define GETPWENT0 0x0001 -#define GETPWENT1 0x0002 -#define GETPWENT2 0x0004 -#define GETPWENT (GETPWENT0 | GETPWENT1 | GETPWENT2) -#define GETPWENT_R0 0x0008 -#define GETPWENT_R1 0x0010 -#define GETPWENT_R2 0x0020 -#define GETPWENT_R (GETPWENT_R0 | GETPWENT_R1 | GETPWENT_R2) -#define GETPWNAM 0x0040 -#define GETPWNAM_R 0x0080 -#define GETPWUID 0x0100 -#define GETPWUID_R 0x0200 - -static bool -passwd_compare(const struct passwd *pwd0, const struct passwd *pwd1) -{ - - if (pwd0 == NULL && pwd1 == NULL) - return (true); - if (pwd0 == NULL || pwd1 == NULL) - return (false); - - if (strcmp(pwd0->pw_name, pwd1->pw_name) != 0) - return (false); - - if (pwd0->pw_passwd != NULL || pwd1->pw_passwd != NULL) { - if (pwd0->pw_passwd == NULL || pwd1->pw_passwd == NULL) - return (false); - if (strcmp(pwd0->pw_passwd, pwd1->pw_passwd) != 0) - return (false); - } - - if (pwd0->pw_uid != pwd1->pw_uid) - return (false); - - if (pwd0->pw_gid != pwd1->pw_gid) - return (false); - - if (pwd0->pw_change != pwd1->pw_change) - return (false); - - if (pwd0->pw_class != NULL || pwd1->pw_class != NULL) { - if (pwd0->pw_class == NULL || pwd1->pw_class == NULL) - return (false); - if (strcmp(pwd0->pw_class, pwd1->pw_class) != 0) - return (false); - } - - if (pwd0->pw_gecos != NULL || pwd1->pw_gecos != NULL) { - if (pwd0->pw_gecos == NULL || pwd1->pw_gecos == NULL) - return (false); - if (strcmp(pwd0->pw_gecos, pwd1->pw_gecos) != 0) - return (false); - } - - if (pwd0->pw_dir != NULL || pwd1->pw_dir != NULL) { - if (pwd0->pw_dir == NULL || pwd1->pw_dir == NULL) - return (false); - if (strcmp(pwd0->pw_dir, pwd1->pw_dir) != 0) - return (false); - } - - if (pwd0->pw_shell != NULL || pwd1->pw_shell != NULL) { - if (pwd0->pw_shell == NULL || pwd1->pw_shell == NULL) - return (false); - if (strcmp(pwd0->pw_shell, pwd1->pw_shell) != 0) - return (false); - } - - if (pwd0->pw_expire != pwd1->pw_expire) - return (false); - - if (pwd0->pw_fields != pwd1->pw_fields) - return (false); - - return (true); -} - -static unsigned int -runtest_cmds(cap_channel_t *cappwd) -{ - char bufs[1024], bufc[1024]; - unsigned int result; - struct passwd *pwds, *pwdc; - struct passwd sts, stc; - - result = 0; - - setpwent(); - cap_setpwent(cappwd); - - pwds = getpwent(); - pwdc = cap_getpwent(cappwd); - if (passwd_compare(pwds, pwdc)) { - result |= GETPWENT0; - pwds = getpwent(); - pwdc = cap_getpwent(cappwd); - if (passwd_compare(pwds, pwdc)) - result |= GETPWENT1; - } - - getpwent_r(&sts, bufs, sizeof(bufs), &pwds); - cap_getpwent_r(cappwd, &stc, bufc, sizeof(bufc), &pwdc); - if (passwd_compare(pwds, pwdc)) { - result |= GETPWENT_R0; - getpwent_r(&sts, bufs, sizeof(bufs), &pwds); - cap_getpwent_r(cappwd, &stc, bufc, sizeof(bufc), &pwdc); - if (passwd_compare(pwds, pwdc)) - result |= GETPWENT_R1; - } - - setpwent(); - cap_setpwent(cappwd); - - getpwent_r(&sts, bufs, sizeof(bufs), &pwds); - cap_getpwent_r(cappwd, 
&stc, bufc, sizeof(bufc), &pwdc); - if (passwd_compare(pwds, pwdc)) - result |= GETPWENT_R2; - - pwds = getpwent(); - pwdc = cap_getpwent(cappwd); - if (passwd_compare(pwds, pwdc)) - result |= GETPWENT2; - - pwds = getpwnam("root"); - pwdc = cap_getpwnam(cappwd, "root"); - if (passwd_compare(pwds, pwdc)) { - pwds = getpwnam("operator"); - pwdc = cap_getpwnam(cappwd, "operator"); - if (passwd_compare(pwds, pwdc)) - result |= GETPWNAM; - } - - getpwnam_r("root", &sts, bufs, sizeof(bufs), &pwds); - cap_getpwnam_r(cappwd, "root", &stc, bufc, sizeof(bufc), &pwdc); - if (passwd_compare(pwds, pwdc)) { - getpwnam_r("operator", &sts, bufs, sizeof(bufs), &pwds); - cap_getpwnam_r(cappwd, "operator", &stc, bufc, sizeof(bufc), - &pwdc); - if (passwd_compare(pwds, pwdc)) - result |= GETPWNAM_R; - } - - pwds = getpwuid(UID_ROOT); - pwdc = cap_getpwuid(cappwd, UID_ROOT); - if (passwd_compare(pwds, pwdc)) { - pwds = getpwuid(UID_OPERATOR); - pwdc = cap_getpwuid(cappwd, UID_OPERATOR); - if (passwd_compare(pwds, pwdc)) - result |= GETPWUID; - } - - getpwuid_r(UID_ROOT, &sts, bufs, sizeof(bufs), &pwds); - cap_getpwuid_r(cappwd, UID_ROOT, &stc, bufc, sizeof(bufc), &pwdc); - if (passwd_compare(pwds, pwdc)) { - getpwuid_r(UID_OPERATOR, &sts, bufs, sizeof(bufs), &pwds); - cap_getpwuid_r(cappwd, UID_OPERATOR, &stc, bufc, sizeof(bufc), - &pwdc); - if (passwd_compare(pwds, pwdc)) - result |= GETPWUID_R; - } - - return (result); -} - -static void -test_cmds(cap_channel_t *origcappwd) -{ - cap_channel_t *cappwd; - const char *cmds[7], *fields[10], *names[6]; - uid_t uids[5]; - - fields[0] = "pw_name"; - fields[1] = "pw_passwd"; - fields[2] = "pw_uid"; - fields[3] = "pw_gid"; - fields[4] = "pw_change"; - fields[5] = "pw_class"; - fields[6] = "pw_gecos"; - fields[7] = "pw_dir"; - fields[8] = "pw_shell"; - fields[9] = "pw_expire"; - - names[0] = "root"; - names[1] = "toor"; - names[2] = "daemon"; - names[3] = "operator"; - names[4] = "bin"; - names[5] = "kmem"; - - uids[0] = 0; - uids[1] = 1; - uids[2] = 2; - uids[3] = 3; - uids[4] = 5; - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == 0); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == 0); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - 
* users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: setpwent - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cap_setpwent(cappwd); - - cmds[0] = "getpwent"; - cmds[1] = "getpwent_r"; - cmds[2] = "getpwnam"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "setpwent"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | GETPWENT_R0 | - GETPWENT_R1 | GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: setpwent - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cap_setpwent(cappwd); - - cmds[0] = "getpwent"; - cmds[1] = "getpwent_r"; - cmds[2] = "getpwnam"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "setpwent"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | GETPWENT_R0 | - GETPWENT_R1 | GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwent - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent_r"; - cmds[2] = "getpwnam"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwent"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT_R2 | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwent - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent_r"; - cmds[2] = "getpwnam"; - cmds[3] = 
"getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwent"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT_R2 | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwent_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwnam"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwent_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwnam, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwent_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwnam"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwent_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT0 | GETPWENT1 | - GETPWNAM | GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwnam - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno 
== ENOTCAPABLE); - cmds[0] = "getpwnam"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam_r, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwnam - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam_r"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwnam"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM_R | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, - * getpwuid, getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwnam_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwnam_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, - * getpwuid, getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwnam_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwuid"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwnam_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWUID | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, 
getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid_r - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwuid - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwuid"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid_r - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwuid - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwuid"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID_R)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid - * users: - * names: root, toor, daemon, operator, bin, kmem - * uids: - * Disallow: - * cmds: getpwuid_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID)); - - cap_close(cappwd); - - /* - * Allow: - * cmds: setpwent, getpwent, getpwent_r, getpwnam, getpwnam_r, - * getpwuid - * users: - * names: - * uids: 0, 1, 2, 3, 5 - * Disallow: - * cmds: getpwuid_r - * users: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - 
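The cases above all exercise the same contract: cap_pwd limits only ever narrow, and an attempt to re-add a previously dropped command fails with ENOTCAPABLE. For reference, a minimal stand-alone sketch of how a consumer hits that contract; this is hypothetical illustration code, not part of the deleted test, and it assumes the system.pwd service plus linking with -lcasper -lcap_pwd:

#include <sys/types.h>

#include <err.h>
#include <errno.h>
#include <stdlib.h>

#include <libcasper.h>

#include <casper/cap_pwd.h>

int
main(void)
{
	cap_channel_t *capcas, *cappwd;
	const char *narrow[] = { "getpwnam", "getpwnam_r" };
	const char *wider[] = { "getpwnam", "getpwnam_r", "setpwent" };

	capcas = cap_init();
	if (capcas == NULL)
		err(1, "unable to contact casper");
	cappwd = cap_service_open(capcas, "system.pwd");
	cap_close(capcas);
	if (cappwd == NULL)
		err(1, "unable to open system.pwd service");

	/* Narrowing to a subset of commands is permitted. */
	if (cap_pwd_limit_cmds(cappwd, narrow, 2) == -1)
		err(1, "cap_pwd_limit_cmds");
	/* Widening the set again is refused by the service. */
	if (cap_pwd_limit_cmds(cappwd, wider, 3) == 0 || errno != ENOTCAPABLE)
		errx(1, "limits were unexpectedly widened");

	cap_close(cappwd);
	exit(0);
}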
CHECK(cap_pwd_limit_cmds(cappwd, cmds, 6) == 0); - cmds[0] = "setpwent"; - cmds[1] = "getpwent"; - cmds[2] = "getpwent_r"; - cmds[3] = "getpwnam"; - cmds[4] = "getpwnam_r"; - cmds[5] = "getpwuid"; - cmds[6] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 7) == -1 && errno == ENOTCAPABLE); - cmds[0] = "getpwuid_r"; - CHECK(cap_pwd_limit_cmds(cappwd, cmds, 1) == -1 && errno == ENOTCAPABLE); - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 5) == 0); - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | - GETPWNAM | GETPWNAM_R | GETPWUID)); - - cap_close(cappwd); -} - -#define PW_NAME _PWF_NAME -#define PW_PASSWD _PWF_PASSWD -#define PW_UID _PWF_UID -#define PW_GID _PWF_GID -#define PW_CHANGE _PWF_CHANGE -#define PW_CLASS _PWF_CLASS -#define PW_GECOS _PWF_GECOS -#define PW_DIR _PWF_DIR -#define PW_SHELL _PWF_SHELL -#define PW_EXPIRE _PWF_EXPIRE - -static unsigned int -passwd_fields(const struct passwd *pwd) -{ - unsigned int result; - - result = 0; - - if (pwd->pw_name != NULL && pwd->pw_name[0] != '\0') - result |= PW_NAME; -// else -// printf("No pw_name\n"); - - if (pwd->pw_passwd != NULL && pwd->pw_passwd[0] != '\0') - result |= PW_PASSWD; - else if ((pwd->pw_fields & _PWF_PASSWD) != 0) - result |= PW_PASSWD; -// else -// printf("No pw_passwd\n"); - - if (pwd->pw_uid != (uid_t)-1) - result |= PW_UID; -// else -// printf("No pw_uid\n"); - - if (pwd->pw_gid != (gid_t)-1) - result |= PW_GID; -// else -// printf("No pw_gid\n"); - - if (pwd->pw_change != 0 || (pwd->pw_fields & _PWF_CHANGE) != 0) - result |= PW_CHANGE; -// else -// printf("No pw_change\n"); - - if (pwd->pw_class != NULL && pwd->pw_class[0] != '\0') - result |= PW_CLASS; - else if ((pwd->pw_fields & _PWF_CLASS) != 0) - result |= PW_CLASS; -// else -// printf("No pw_class\n"); - - if (pwd->pw_gecos != NULL && pwd->pw_gecos[0] != '\0') - result |= PW_GECOS; - else if ((pwd->pw_fields & _PWF_GECOS) != 0) - result |= PW_GECOS; -// else -// printf("No pw_gecos\n"); - - if (pwd->pw_dir != NULL && pwd->pw_dir[0] != '\0') - result |= PW_DIR; - else if ((pwd->pw_fields & _PWF_DIR) != 0) - result |= PW_DIR; -// else -// printf("No pw_dir\n"); - - if (pwd->pw_shell != NULL && pwd->pw_shell[0] != '\0') - result |= PW_SHELL; - else if ((pwd->pw_fields & _PWF_SHELL) != 0) - result |= PW_SHELL; -// else -// printf("No pw_shell\n"); - - if (pwd->pw_expire != 0 || (pwd->pw_fields & _PWF_EXPIRE) != 0) - result |= PW_EXPIRE; -// else -// printf("No pw_expire\n"); - -if (false && pwd->pw_fields != (int)result) { -printf("fields=0x%x != result=0x%x\n", (const unsigned int)pwd->pw_fields, result); -printf(" fields result\n"); -printf("PW_NAME %d %d\n", (pwd->pw_fields & PW_NAME) != 0, (result & PW_NAME) != 0); -printf("PW_PASSWD %d %d\n", (pwd->pw_fields & PW_PASSWD) != 0, (result & PW_PASSWD) != 0); -printf("PW_UID %d %d\n", (pwd->pw_fields & PW_UID) != 0, (result & PW_UID) != 0); -printf("PW_GID %d %d\n", (pwd->pw_fields & PW_GID) != 0, (result & PW_GID) != 0); -printf("PW_CHANGE %d %d\n", (pwd->pw_fields & PW_CHANGE) != 0, (result & PW_CHANGE) != 0); -printf("PW_CLASS %d %d\n", (pwd->pw_fields & PW_CLASS) != 0, (result & PW_CLASS) != 0); -printf("PW_GECOS %d %d\n", (pwd->pw_fields & PW_GECOS) != 0, (result & PW_GECOS) != 0); -printf("PW_DIR %d %d\n", (pwd->pw_fields & PW_DIR) != 0, (result & PW_DIR) != 0); -printf("PW_SHELL %d %d\n", (pwd->pw_fields & PW_SHELL) != 0, (result & PW_SHELL) != 0); -printf("PW_EXPIRE %d %d\n", (pwd->pw_fields & PW_EXPIRE) != 0, (result & PW_EXPIRE) 
!= 0); -} - -//printf("result=0x%x\n", result); - return (result); -} - -static bool -runtest_fields(cap_channel_t *cappwd, unsigned int expected) -{ - char buf[1024]; - struct passwd *pwd; - struct passwd st; - -//printf("expected=0x%x\n", expected); - cap_setpwent(cappwd); - pwd = cap_getpwent(cappwd); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - cap_setpwent(cappwd); - cap_getpwent_r(cappwd, &st, buf, sizeof(buf), &pwd); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - pwd = cap_getpwnam(cappwd, "root"); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - cap_getpwnam_r(cappwd, "root", &st, buf, sizeof(buf), &pwd); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - pwd = cap_getpwuid(cappwd, UID_ROOT); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - cap_getpwuid_r(cappwd, UID_ROOT, &st, buf, sizeof(buf), &pwd); - if ((passwd_fields(pwd) & ~expected) != 0) - return (false); - - return (true); -} - -static void -test_fields(cap_channel_t *origcappwd) -{ - cap_channel_t *cappwd; - const char *fields[10]; - - /* No limits. */ - - CHECK(runtest_fields(origcappwd, PW_NAME | PW_PASSWD | PW_UID | - PW_GID | PW_CHANGE | PW_CLASS | PW_GECOS | PW_DIR | PW_SHELL | - PW_EXPIRE)); - - /* - * Allow: - * fields: pw_name, pw_passwd, pw_uid, pw_gid, pw_change, pw_class, - * pw_gecos, pw_dir, pw_shell, pw_expire - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_name"; - fields[1] = "pw_passwd"; - fields[2] = "pw_uid"; - fields[3] = "pw_gid"; - fields[4] = "pw_change"; - fields[5] = "pw_class"; - fields[6] = "pw_gecos"; - fields[7] = "pw_dir"; - fields[8] = "pw_shell"; - fields[9] = "pw_expire"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 10) == 0); - - CHECK(runtest_fields(origcappwd, PW_NAME | PW_PASSWD | PW_UID | - PW_GID | PW_CHANGE | PW_CLASS | PW_GECOS | PW_DIR | PW_SHELL | - PW_EXPIRE)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_name, pw_passwd, pw_uid, pw_gid, pw_change - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_name"; - fields[1] = "pw_passwd"; - fields[2] = "pw_uid"; - fields[3] = "pw_gid"; - fields[4] = "pw_change"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); - fields[5] = "pw_class"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_class"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_NAME | PW_PASSWD | PW_UID | - PW_GID | PW_CHANGE)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_class, pw_gecos, pw_dir, pw_shell, pw_expire - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_class"; - fields[1] = "pw_gecos"; - fields[2] = "pw_dir"; - fields[3] = "pw_shell"; - fields[4] = "pw_expire"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); - fields[5] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_CLASS | PW_GECOS | PW_DIR | - PW_SHELL | PW_EXPIRE)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_name, pw_uid, pw_change, pw_gecos, pw_shell - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_name"; - fields[1] = "pw_uid"; - fields[2] = "pw_change"; - fields[3] = "pw_gecos"; - fields[4] = "pw_shell"; - 
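test_fields() drives cap_pwd_limit_fields() through many allowed-set combinations; in ordinary consumer code the idiom is much smaller. A hypothetical helper sketch (the helper name is illustrative, and the channel is assumed to be opened and cloned the same way main() at the end of this file does it), showing that fields outside the allowed set come back zeroed or empty rather than failing the lookup:

#include <sys/types.h>

#include <err.h>
#include <pwd.h>
#include <stdint.h>
#include <stdio.h>

#include <libcasper.h>

#include <casper/cap_pwd.h>

static void
print_shell(cap_channel_t *cappwd, const char *user)
{
	const char *fields[] = { "pw_name", "pw_uid", "pw_shell" };
	struct passwd *pwd;

	if (cap_pwd_limit_fields(cappwd, fields, 3) == -1)
		err(1, "cap_pwd_limit_fields");
	pwd = cap_getpwnam(cappwd, user);
	if (pwd == NULL)
		errx(1, "user %s not found", user);
	/* pw_passwd, pw_gecos, etc. are outside the allowed set. */
	printf("%s (uid %ju) uses %s\n", pwd->pw_name,
	    (uintmax_t)pwd->pw_uid, pwd->pw_shell);
}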
CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); - fields[5] = "pw_class"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_class"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_NAME | PW_UID | PW_CHANGE | - PW_GECOS | PW_SHELL)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_passwd, pw_gid, pw_class, pw_dir, pw_expire - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_passwd"; - fields[1] = "pw_gid"; - fields[2] = "pw_class"; - fields[3] = "pw_dir"; - fields[4] = "pw_expire"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 5) == 0); - fields[5] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 6) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_PASSWD | PW_GID | PW_CLASS | - PW_DIR | PW_EXPIRE)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_uid, pw_class, pw_shell - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_uid"; - fields[1] = "pw_class"; - fields[2] = "pw_shell"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 3) == 0); - fields[3] = "pw_change"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 4) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_change"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_UID | PW_CLASS | PW_SHELL)); - - cap_close(cappwd); - - /* - * Allow: - * fields: pw_change - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - fields[0] = "pw_change"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == 0); - fields[1] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 2) == -1 && - errno == ENOTCAPABLE); - fields[0] = "pw_uid"; - CHECK(cap_pwd_limit_fields(cappwd, fields, 1) == -1 && - errno == ENOTCAPABLE); - - CHECK(runtest_fields(cappwd, PW_CHANGE)); - - cap_close(cappwd); -} - -static bool -runtest_users(cap_channel_t *cappwd, const char **names, const uid_t *uids, - size_t nusers) -{ - char buf[1024]; - struct passwd *pwd; - struct passwd st; - unsigned int i, got; - - cap_setpwent(cappwd); - got = 0; - for (;;) { - pwd = cap_getpwent(cappwd); - if (pwd == NULL) - break; - got++; - for (i = 0; i < nusers; i++) { - if (strcmp(names[i], pwd->pw_name) == 0 && - uids[i] == pwd->pw_uid) { - break; - } - } - if (i == nusers) - return (false); - } - if (got != nusers) - return (false); - - cap_setpwent(cappwd); - got = 0; - for (;;) { - cap_getpwent_r(cappwd, &st, buf, sizeof(buf), &pwd); - if (pwd == NULL) - break; - got++; - for (i = 0; i < nusers; i++) { - if (strcmp(names[i], pwd->pw_name) == 0 && - uids[i] == pwd->pw_uid) { - break; - } - } - if (i == nusers) - return (false); - } - if (got != nusers) - return (false); - - for (i = 0; i < nusers; i++) { - pwd = cap_getpwnam(cappwd, names[i]); - if (pwd == NULL) - return (false); - } - - for (i = 0; i < nusers; i++) { - cap_getpwnam_r(cappwd, names[i], &st, buf, sizeof(buf), &pwd); - if (pwd == NULL) - return (false); - } - - for (i = 0; i < nusers; i++) { - pwd = cap_getpwuid(cappwd, uids[i]); - if (pwd == NULL) - return (false); - } - - for (i = 0; i < nusers; i++) { - cap_getpwuid_r(cappwd, uids[i], &st, buf, sizeof(buf), &pwd); - if (pwd == NULL) - return (false); - } - - return (true); -} - -static void -test_users(cap_channel_t *origcappwd) -{ - cap_channel_t *cappwd; - const char 
*names[6]; - uid_t uids[6]; - - /* - * Allow: - * users: - * names: root, toor, daemon, operator, bin, tty - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "root"; - names[1] = "toor"; - names[2] = "daemon"; - names[3] = "operator"; - names[4] = "bin"; - names[5] = "tty"; - CHECK(cap_pwd_limit_users(cappwd, names, 6, NULL, 0) == 0); - uids[0] = 0; - uids[1] = 0; - uids[2] = 1; - uids[3] = 2; - uids[4] = 3; - uids[5] = 4; - - CHECK(runtest_users(cappwd, names, uids, 6)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: daemon, operator, bin - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "operator"; - names[2] = "bin"; - CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == 0); - names[3] = "tty"; - CHECK(cap_pwd_limit_users(cappwd, names, 4, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "tty"; - CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - uids[0] = 1; - uids[1] = 2; - uids[2] = 3; - - CHECK(runtest_users(cappwd, names, uids, 3)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: daemon, bin, tty - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "bin"; - names[2] = "tty"; - CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == 0); - names[3] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 4, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - uids[0] = 1; - uids[1] = 3; - uids[2] = 4; - - CHECK(runtest_users(cappwd, names, uids, 3)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: - * uids: 1, 2, 3 - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "operator"; - names[2] = "bin"; - uids[0] = 1; - uids[1] = 2; - uids[2] = 3; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == 0); - uids[3] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 4) == -1 && - errno == ENOTCAPABLE); - uids[0] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && - errno == ENOTCAPABLE); - uids[0] = 1; - - CHECK(runtest_users(cappwd, names, uids, 3)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: - * uids: 1, 3, 4 - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "bin"; - names[2] = "tty"; - uids[0] = 1; - uids[1] = 3; - uids[2] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == 0); - uids[3] = 5; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 4) == -1 && - errno == ENOTCAPABLE); - uids[0] = 5; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && - errno == ENOTCAPABLE); - uids[0] = 1; - - CHECK(runtest_users(cappwd, names, uids, 3)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: bin - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "bin"; - CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == 0); - names[1] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 2, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "bin"; - uids[0] = 3; - - CHECK(runtest_users(cappwd, names, uids, 1)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: 
daemon, tty - * uids: - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "tty"; - CHECK(cap_pwd_limit_users(cappwd, names, 2, NULL, 0) == 0); - names[2] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 3, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "operator"; - CHECK(cap_pwd_limit_users(cappwd, names, 1, NULL, 0) == -1 && - errno == ENOTCAPABLE); - names[0] = "daemon"; - uids[0] = 1; - uids[1] = 4; - - CHECK(runtest_users(cappwd, names, uids, 2)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: - * uids: 3 - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "bin"; - uids[0] = 3; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == 0); - uids[1] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 2) == -1 && - errno == ENOTCAPABLE); - uids[0] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && - errno == ENOTCAPABLE); - uids[0] = 3; - - CHECK(runtest_users(cappwd, names, uids, 1)); - - cap_close(cappwd); - - /* - * Allow: - * users: - * names: - * uids: 1, 4 - */ - cappwd = cap_clone(origcappwd); - CHECK(cappwd != NULL); - - names[0] = "daemon"; - names[1] = "tty"; - uids[0] = 1; - uids[1] = 4; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 2) == 0); - uids[2] = 3; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 3) == -1 && - errno == ENOTCAPABLE); - uids[0] = 3; - CHECK(cap_pwd_limit_users(cappwd, NULL, 0, uids, 1) == -1 && - errno == ENOTCAPABLE); - uids[0] = 1; - - CHECK(runtest_users(cappwd, names, uids, 2)); - - cap_close(cappwd); -} - -int -main(void) -{ - cap_channel_t *capcas, *cappwd; - - printf("1..188\n"); - - capcas = cap_init(); - CHECKX(capcas != NULL); - - cappwd = cap_service_open(capcas, "system.pwd"); - CHECKX(cappwd != NULL); - - cap_close(capcas); - - /* No limits. */ - - CHECK(runtest_cmds(cappwd) == (GETPWENT | GETPWENT_R | GETPWNAM | - GETPWNAM_R | GETPWUID | GETPWUID_R)); - - test_cmds(cappwd); - - test_fields(cappwd); - - test_users(cappwd); - - cap_close(cappwd); - - exit(0); -} Property changes on: projects/clang390-import/tools/regression/capsicum/libcasper/pwd.c ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: projects/clang390-import/tools/regression/capsicum/libcasper/sysctl.c =================================================================== --- projects/clang390-import/tools/regression/capsicum/libcasper/sysctl.c (revision 305686) +++ projects/clang390-import/tools/regression/capsicum/libcasper/sysctl.c (nonexistent) @@ -1,1510 +0,0 @@ -/*- - * Copyright (c) 2013 The FreeBSD Foundation - * All rights reserved. - * - * This software was developed by Pawel Jakub Dawidek under sponsorship from - * the FreeBSD Foundation. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions - * are met: - * 1. Redistributions of source code must retain the above copyright - * notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - * notice, this list of conditions and the following disclaimer in the - * documentation and/or other materials provided with the distribution. 
- * - * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND - * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE - * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL - * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS - * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) - * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY - * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF - * SUCH DAMAGE. - */ - -#include -__FBSDID("$FreeBSD$"); - -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include - -/* - * We need some sysctls to perform the tests on. - * We remember their values and restore them after the test is done. - */ -#define SYSCTL0_PARENT "kern" -#define SYSCTL0_NAME "kern.sync_on_panic" -#define SYSCTL1_PARENT "debug" -#define SYSCTL1_NAME "debug.minidump" - -static int ntest = 1; - -#define CHECK(expr) do { \ - if ((expr)) \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - else \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - ntest++; \ -} while (0) -#define CHECKX(expr) do { \ - if ((expr)) { \ - printf("ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - } else { \ - printf("not ok %d %s:%u\n", ntest, __FILE__, __LINE__); \ - exit(1); \ - } \ - ntest++; \ -} while (0) - -#define SYSCTL0_READ0 0x0001 -#define SYSCTL0_READ1 0x0002 -#define SYSCTL0_READ2 0x0004 -#define SYSCTL0_WRITE 0x0008 -#define SYSCTL0_READ_WRITE 0x0010 -#define SYSCTL1_READ0 0x0020 -#define SYSCTL1_READ1 0x0040 -#define SYSCTL1_READ2 0x0080 -#define SYSCTL1_WRITE 0x0100 -#define SYSCTL1_READ_WRITE 0x0200 - -static unsigned int -runtest(cap_channel_t *capsysctl) -{ - unsigned int result; - int oldvalue, newvalue; - size_t oldsize; - - result = 0; - - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, &oldsize, - NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue)) - result |= SYSCTL0_READ0; - } - - newvalue = 123; - if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, NULL, NULL, &newvalue, - sizeof(newvalue)) == 0) { - result |= SYSCTL0_WRITE; - } - - if ((result & SYSCTL0_WRITE) != 0) { - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, - &oldsize, NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 123) - result |= SYSCTL0_READ1; - } - } - - oldsize = sizeof(oldvalue); - newvalue = 4567; - if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, &oldsize, - &newvalue, sizeof(newvalue)) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 123) - result |= SYSCTL0_READ_WRITE; - } - - if ((result & SYSCTL0_READ_WRITE) != 0) { - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL0_NAME, &oldvalue, - &oldsize, NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 4567) - result |= SYSCTL0_READ2; - } - } - - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, &oldsize, - NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue)) - result |= SYSCTL1_READ0; - } - - newvalue = 506; - if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, NULL, NULL, &newvalue, - sizeof(newvalue)) == 0) { - result |= 
SYSCTL1_WRITE; - } - - if ((result & SYSCTL1_WRITE) != 0) { - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, - &oldsize, NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 506) - result |= SYSCTL1_READ1; - } - } - - oldsize = sizeof(oldvalue); - newvalue = 7008; - if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, &oldsize, - &newvalue, sizeof(newvalue)) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 506) - result |= SYSCTL1_READ_WRITE; - } - - if ((result & SYSCTL1_READ_WRITE) != 0) { - oldsize = sizeof(oldvalue); - if (cap_sysctlbyname(capsysctl, SYSCTL1_NAME, &oldvalue, - &oldsize, NULL, 0) == 0) { - if (oldsize == sizeof(oldvalue) && oldvalue == 7008) - result |= SYSCTL1_READ2; - } - } - - return (result); -} - -static void -test_operation(cap_channel_t *origcapsysctl) -{ - cap_channel_t *capsysctl; - nvlist_t *limits; - - /* - * Allow: - * SYSCTL0_PARENT/RDWR/RECURSIVE - * SYSCTL1_PARENT/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, "foo.bar", - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, "foo.bar", - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == SYSCTL0_READ0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/RDWR/RECURSIVE - * SYSCTL1_NAME/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, 
SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/RDWR - * SYSCTL1_PARENT/RDWR - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/RDWR - * SYSCTL1_NAME/RDWR - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - 
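For the system.sysctl service the limit descriptor is an nvlist: keys are sysctl node names and values are CAP_SYSCTL_* flags. As the usage in this file suggests, cap_limit_set() consumes the nvlist whether it succeeds or fails, which is why a fresh one is created before every call. A hypothetical read-only helper built from the same calls (a sketch only; channel setup as in main()):

#include <sys/types.h>
#include <sys/nv.h>

#include <err.h>

#include <libcasper.h>

#include <casper/cap_sysctl.h>

static int
read_only_sysctl(cap_channel_t *capsysctl, const char *name)
{
	nvlist_t *limits;
	size_t size;
	int value;

	/* Allow read access to exactly one node, then query it. */
	limits = nvlist_create(0);
	nvlist_add_number(limits, name, CAP_SYSCTL_READ);
	if (cap_limit_set(capsysctl, limits) == -1)
		err(1, "cap_limit_set");
	size = sizeof(value);
	if (cap_sysctlbyname(capsysctl, name, &value, &size, NULL, 0) == -1)
		err(1, "cap_sysctlbyname(%s)", name);
	return (value);
}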
CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/RDWR - * SYSCTL1_PARENT/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | - SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/RDWR - * SYSCTL1_NAME/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ/RECURSIVE - * SYSCTL1_PARENT/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = 
nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ/RECURSIVE - * SYSCTL1_NAME/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - 
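The distinction the surrounding cases keep probing is CAP_SYSCTL_RECURSIVE: a limit on a parent node such as "kern" covers descendants like "kern.sync_on_panic" only when that flag is set, while without it only an exact name match passes (which is why the plain PARENT/READ case expects runtest() to return 0). A sketch of that distinction as a hypothetical probe; each probe clones the unrestricted channel, since limits can never be widened afterwards:

#include <sys/types.h>
#include <sys/nv.h>

#include <err.h>
#include <stdbool.h>
#include <stdint.h>

#include <libcasper.h>

#include <casper/cap_sysctl.h>

static bool
child_readable(cap_channel_t *origcapsysctl, uint64_t flags)
{
	cap_channel_t *capsysctl;
	nvlist_t *limits;
	size_t size;
	int value;
	bool ok;

	capsysctl = cap_clone(origcapsysctl);
	if (capsysctl == NULL)
		err(1, "cap_clone");
	/* Limit the clone at the parent node with the given flags. */
	limits = nvlist_create(0);
	nvlist_add_number(limits, "kern", flags);
	if (cap_limit_set(capsysctl, limits) == -1)
		err(1, "cap_limit_set");
	size = sizeof(value);
	ok = cap_sysctlbyname(capsysctl, "kern.sync_on_panic", &value,
	    &size, NULL, 0) == 0;
	cap_close(capsysctl);
	return (ok);
}

Under the behavior tested here, child_readable(chan, CAP_SYSCTL_READ) would return false, and child_readable(chan, CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE) would return true.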
CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ - * SYSCTL1_PARENT/READ - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ - * SYSCTL1_NAME/READ - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | 
CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ - * SYSCTL1_PARENT/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_READ0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ - * SYSCTL1_NAME/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_READ0)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/WRITE/RECURSIVE - * SYSCTL1_PARENT/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | 
CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/WRITE/RECURSIVE - * SYSCTL1_NAME/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | 
CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/WRITE - * SYSCTL1_PARENT/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/WRITE - * SYSCTL1_NAME/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); 
- nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/WRITE - * SYSCTL1_PARENT/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_WRITE); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/WRITE - * SYSCTL1_NAME/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_WRITE | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - 
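The *_READ_WRITE bits in runtest() come from a single cap_sysctlbyname() call that passes both an old-value buffer and a new value, and such a call needs CAP_SYSCTL_READ and CAP_SYSCTL_WRITE together on the node. That is why the mixed read-only/write-only cases that follow expect only SYSCTL0_READ0 | SYSCTL1_WRITE. A reduced sketch of the combined call (hypothetical helper name):

#include <sys/types.h>

#include <err.h>

#include <libcasper.h>

#include <casper/cap_sysctl.h>

static int
swap_sysctl(cap_channel_t *capsysctl, const char *name, int newvalue)
{
	int oldvalue;
	size_t oldsize;

	oldsize = sizeof(oldvalue);
	/* Fetch the previous value and store the new one in one call. */
	if (cap_sysctlbyname(capsysctl, name, &oldvalue, &oldsize,
	    &newvalue, sizeof(newvalue)) == -1)
		err(1, "cap_sysctlbyname(%s)", name);
	return (oldvalue);
}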
/* - * Allow: - * SYSCTL0_PARENT/READ/RECURSIVE - * SYSCTL1_PARENT/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ/RECURSIVE - * SYSCTL1_NAME/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ - * SYSCTL1_PARENT/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ - * SYSCTL1_NAME/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ - * SYSCTL1_PARENT/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == SYSCTL1_WRITE); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_NAME/READ - * SYSCTL1_NAME/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL1_WRITE)); - - cap_close(capsysctl); -} - -static void -test_names(cap_channel_t *origcapsysctl) -{ - cap_channel_t *capsysctl; - nvlist_t *limits; - - /* - * Allow: - * SYSCTL0_PARENT/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_READ | 
CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL0_READ0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/READ/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_READ | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_READ0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL0_WRITE); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/WRITE/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == 
ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_WRITE | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_WRITE); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/RDWR/RECURSIVE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - nvlist_add_number(limits, SYSCTL1_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, - CAP_SYSCTL_RDWR | CAP_SYSCTL_RECURSIVE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | - SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/READ - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, 
SYSCTL0_PARENT, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/READ - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_READ); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_READ0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/WRITE - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_WRITE); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == SYSCTL1_WRITE); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL0_PARENT/RDWR - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_PARENT, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_PARENT, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == 0); - - cap_close(capsysctl); - - /* - * Allow: - * SYSCTL1_NAME/RDWR - */ - - capsysctl = cap_clone(origcapsysctl); - CHECK(capsysctl != NULL); - - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == 0); - limits = nvlist_create(0); - 
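-	/*
-	 * Attempting to re-extend the limit back to SYSCTL0_NAME must
-	 * fail with ENOTCAPABLE: an existing limit may only be narrowed.
-	 */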
nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - nvlist_add_number(limits, SYSCTL1_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - limits = nvlist_create(0); - nvlist_add_number(limits, SYSCTL0_NAME, CAP_SYSCTL_RDWR); - CHECK(cap_limit_set(capsysctl, limits) == -1 && errno == ENOTCAPABLE); - - CHECK(runtest(capsysctl) == (SYSCTL1_READ0 | SYSCTL1_READ1 | - SYSCTL1_READ2 | SYSCTL1_WRITE | SYSCTL1_READ_WRITE)); - - cap_close(capsysctl); -} - -int -main(void) -{ - cap_channel_t *capcas, *capsysctl; - int scvalue0, scvalue1; - size_t scsize; - - printf("1..256\n"); - - scsize = sizeof(scvalue0); - CHECKX(sysctlbyname(SYSCTL0_NAME, &scvalue0, &scsize, NULL, 0) == 0); - CHECKX(scsize == sizeof(scvalue0)); - scsize = sizeof(scvalue1); - CHECKX(sysctlbyname(SYSCTL1_NAME, &scvalue1, &scsize, NULL, 0) == 0); - CHECKX(scsize == sizeof(scvalue1)); - - capcas = cap_init(); - CHECKX(capcas != NULL); - - capsysctl = cap_service_open(capcas, "system.sysctl"); - CHECKX(capsysctl != NULL); - - cap_close(capcas); - - /* No limits set. */ - - CHECK(runtest(capsysctl) == (SYSCTL0_READ0 | SYSCTL0_READ1 | - SYSCTL0_READ2 | SYSCTL0_WRITE | SYSCTL0_READ_WRITE | - SYSCTL1_READ0 | SYSCTL1_READ1 | SYSCTL1_READ2 | SYSCTL1_WRITE | - SYSCTL1_READ_WRITE)); - - test_operation(capsysctl); - - test_names(capsysctl); - - cap_close(capsysctl); - - CHECK(sysctlbyname(SYSCTL0_NAME, NULL, NULL, &scvalue0, - sizeof(scvalue0)) == 0); - CHECK(sysctlbyname(SYSCTL1_NAME, NULL, NULL, &scvalue1, - sizeof(scvalue1)) == 0); - - exit(0); -} Property changes on: projects/clang390-import/tools/regression/capsicum/libcasper/sysctl.c ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: projects/clang390-import/tools/regression/capsicum/libcasper/Makefile =================================================================== --- projects/clang390-import/tools/regression/capsicum/libcasper/Makefile (revision 305686) +++ projects/clang390-import/tools/regression/capsicum/libcasper/Makefile (nonexistent) @@ -1,32 +0,0 @@ -# $FreeBSD$ - -SERVICES= dns -SERVICES+= grp -SERVICES+= pwd -SERVICES+= sysctl - -CFLAGS= -O2 -pipe -std=gnu99 -fstack-protector -CFLAGS+= -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -CFLAGS+= -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -CFLAGS+= -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -CFLAGS+= -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -CFLAGS+= -Wold-style-definition -Wno-pointer-sign - -CFLAGS+= -ggdb - -SERVTEST= ${SERVICES:=.t} - -all: ${SERVTEST} - -.for SERVICE in ${SERVICES} - -${SERVICE}.t: ${SERVICE}.c - ${CC} ${CFLAGS} ${@:.t=.c} -o $@ -lnv -lcasper -lcap_${@:.t=} - -.endfor - -test: all - @prove -r ${.CURDIR} - -clean: - rm -f ${SERVTEST} Property changes on: projects/clang390-import/tools/regression/capsicum/libcasper/Makefile ___________________________________________________________________ Deleted: svn:eol-style ## -1 +0,0 ## -native \ No newline at end of property Deleted: svn:keywords ## -1 +0,0 ## -FreeBSD=%H \ No newline at end of property Deleted: svn:mime-type ## -1 +0,0 ## -text/plain \ No newline at end of property Index: projects/clang390-import/usr.bin/bmake/Makefile 
=================================================================== --- projects/clang390-import/usr.bin/bmake/Makefile (revision 305686) +++ projects/clang390-import/usr.bin/bmake/Makefile (revision 305687) @@ -1,177 +1,177 @@ # This is a generated file, do NOT edit! # See contrib/bmake/bsd.after-import.mk # # $FreeBSD$ .sinclude "Makefile.inc" SRCTOP?= ${.CURDIR:H:H} # look here first for config.h CFLAGS+= -I${.CURDIR} # for after-import CLEANDIRS+= FreeBSD CLEANFILES+= bootstrap -# $Id: Makefile,v 1.67 2016/06/07 00:46:12 sjg Exp $ +# $Id: Makefile,v 1.72 2016/08/18 23:02:26 sjg Exp $ # Base version on src date -_MAKE_VERSION= 20160606 +_MAKE_VERSION= 20160818 PROG?= ${.CURDIR:T} SRCS= \ arch.c \ buf.c \ compat.c \ cond.c \ dir.c \ for.c \ hash.c \ job.c \ main.c \ make.c \ make_malloc.c \ meta.c \ metachar.c \ parse.c \ str.c \ strlist.c \ suff.c \ targ.c \ trace.c \ util.c \ var.c # from lst.lib/ SRCS+= \ lstAppend.c \ lstAtEnd.c \ lstAtFront.c \ lstClose.c \ lstConcat.c \ lstDatum.c \ lstDeQueue.c \ lstDestroy.c \ lstDupl.c \ lstEnQueue.c \ lstFind.c \ lstFindFrom.c \ lstFirst.c \ lstForEach.c \ lstForEachFrom.c \ lstInit.c \ lstInsert.c \ lstIsAtEnd.c \ lstIsEmpty.c \ lstLast.c \ lstMember.c \ lstNext.c \ lstOpen.c \ lstPrev.c \ lstRemove.c \ lstReplace.c \ lstSucc.c # this file gets generated by configure .sinclude "Makefile.config" .if !empty(LIBOBJS) SRCS+= ${LIBOBJS:T:.o=.c} .endif # just in case prefix?= /usr srcdir?= ${.CURDIR} DEFAULT_SYS_PATH?= ${prefix}/share/mk CPPFLAGS+= -DUSE_META CFLAGS+= ${CPPFLAGS} CFLAGS+= -D_PATH_DEFSYSPATH=\"${DEFAULT_SYS_PATH}\" CFLAGS+= -I. -I${srcdir} ${XDEFS} -DMAKE_NATIVE CFLAGS+= ${COPTS.${.ALLSRC:M*.c:T:u}} COPTS.main.c+= "-DMAKE_VERSION=\"${_MAKE_VERSION}\"" # meta mode can be useful even without filemon FILEMON_H ?= /usr/include/dev/filemon/filemon.h .if exists(${FILEMON_H}) && ${FILEMON_H:T} == "filemon.h" COPTS.meta.c += -DHAVE_FILEMON_H -I${FILEMON_H:H} .endif .PATH: ${srcdir} .PATH: ${srcdir}/lst.lib .if make(obj) || make(clean) SUBDIR+= unit-tests .endif MAN= ${PROG}.1 MAN1= ${MAN} .if (${PROG} != "make") CLEANFILES+= my.history .if make(${MAN}) || !exists(${srcdir}/${MAN}) my.history: ${MAKEFILE} @(echo ".Nm"; \ echo "is derived from NetBSD"; \ echo ".Xr make 1 ."; \ echo "It uses autoconf to facilitate portability to other platforms."; \ echo ".Pp") > $@ .NOPATH: ${MAN} ${MAN}: make.1 my.history @echo making $@ @sed -e 's/^.Nx/NetBSD/' -e '/^.Nm/s/make/${PROG}/' \ -e '/^.Sh HISTORY/rmy.history' \ -e '/^.Sh HISTORY/,$$s,^.Nm,make,' ${srcdir}/make.1 > $@ all beforeinstall: ${MAN} _mfromdir=. .endif .endif MANTARGET?= cat MANDEST?= ${MANDIR}/${MANTARGET}1 .if ${MANTARGET} == "cat" _mfromdir=${srcdir} .endif .include CPPFLAGS+= -DMAKE_NATIVE -DHAVE_CONFIG_H COPTS.var.c += -Wno-cast-qual COPTS.job.c += -Wno-format-nonliteral COPTS.parse.c += -Wno-format-nonliteral COPTS.var.c += -Wno-format-nonliteral # Force these SHAREDIR= ${SHAREDIR.bmake:U${prefix}/share} BINDIR= ${BINDIR.bmake:U${prefix}/bin} MANDIR= ${MANDIR.bmake:U${SHAREDIR}/man} .if !exists(.depend) ${OBJS}: config.h .endif # make sure that MAKE_VERSION gets updated. 
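# (MAKE_VERSION reaches the binary through COPTS.main.c above, so the
# dependency below forces main.o to rebuild whenever the sources or
# this Makefile change.)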
main.o: ${SRCS} ${MAKEFILE}

# A simple unit-test driver to help catch regressions
accept test:
	cd ${.CURDIR}/unit-tests && MAKEFLAGS= ${.MAKE} -r -m / TEST_MAKE=${TEST_MAKE:U${.OBJDIR}/${PROG:T}} ${.TARGET}

# override some simple things
BINDIR= /usr/bin
MANDIR= /usr/share/man/man

# make sure we get this
CFLAGS+= ${COPTS.${.IMPSRC:T}}

after-import: ${SRCTOP}/contrib/bmake/bsd.after-import.mk
	cd ${.CURDIR} && ${.MAKE} -f ${SRCTOP}/contrib/bmake/bsd.after-import.mk

Index: projects/clang390-import/usr.sbin/newsyslog/newsyslog.c
===================================================================
--- projects/clang390-import/usr.sbin/newsyslog/newsyslog.c	(revision 305686)
+++ projects/clang390-import/usr.sbin/newsyslog/newsyslog.c	(revision 305687)
@@ -1,2681 +1,2681 @@
/*-
 * ------+---------+---------+-------- + --------+---------+---------+---------*
 * This file includes significant modifications done by:
 * Copyright (c) 2003, 2004  - Garance Alistair Drosehn.
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * ------+---------+---------+-------- + --------+---------+---------+---------*
 */

/*
 * This file contains changes from the Open Software Foundation.
 */

/*
 * Copyright 1988, 1989 by the Massachusetts Institute of Technology
 *
 * Permission to use, copy, modify, and distribute this software and its
 * documentation for any purpose and without fee is hereby granted, provided
 * that the above copyright notice appear in all copies and that both that
 * copyright notice and this permission notice appear in supporting
 * documentation, and that the names of M.I.T. and the M.I.T. S.I.P.B. not be
 * used in advertising or publicity pertaining to distribution of the
 * software without specific, written prior permission.  M.I.T. and the M.I.T.
 * S.I.P.B. make no representations about the suitability of this software
 * for any purpose.  It is provided "as is" without express or implied
 * warranty.
 *
 */

/*
 * newsyslog - roll over selected logs at the appropriate time, keeping a
 * specified number of backup files around.
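 * The rotation rules are read from newsyslog.conf(5); see parse_file()
 * below for the exact field layout.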
*/ #include __FBSDID("$FreeBSD$"); #define OSF #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pathnames.h" #include "extern.h" /* * Compression suffixes */ #ifndef COMPRESS_SUFFIX_GZ #define COMPRESS_SUFFIX_GZ ".gz" #endif #ifndef COMPRESS_SUFFIX_BZ2 #define COMPRESS_SUFFIX_BZ2 ".bz2" #endif #ifndef COMPRESS_SUFFIX_XZ #define COMPRESS_SUFFIX_XZ ".xz" #endif #define COMPRESS_SUFFIX_MAXLEN MAX(MAX(sizeof(COMPRESS_SUFFIX_GZ),sizeof(COMPRESS_SUFFIX_BZ2)),sizeof(COMPRESS_SUFFIX_XZ)) /* * Compression types */ #define COMPRESS_TYPES 4 /* Number of supported compression types */ #define COMPRESS_NONE 0 #define COMPRESS_GZIP 1 #define COMPRESS_BZIP2 2 #define COMPRESS_XZ 3 /* * Bit-values for the 'flags' parsed from a config-file entry. */ #define CE_BINARY 0x0008 /* Logfile is in binary, do not add status */ /* messages to logfile(s) when rotating. */ #define CE_NOSIGNAL 0x0010 /* There is no process to signal when */ /* trimming this file. */ #define CE_TRIMAT 0x0020 /* trim file at a specific time. */ #define CE_GLOB 0x0040 /* name of the log is file name pattern. */ #define CE_SIGNALGROUP 0x0080 /* Signal a process-group instead of a single */ /* process when trimming this file. */ #define CE_CREATE 0x0100 /* Create the log file if it does not exist. */ #define CE_NODUMP 0x0200 /* Set 'nodump' on newly created log file. */ #define CE_PID2CMD 0x0400 /* Replace PID file with a shell command.*/ #define MIN_PID 5 /* Don't touch pids lower than this */ #define MAX_PID 99999 /* was lower, see /usr/include/sys/proc.h */ #define kbytes(size) (((size) + 1023) >> 10) #define DEFAULT_MARKER "" #define DEBUG_MARKER "" #define INCLUDE_MARKER "" #define DEFAULT_TIMEFNAME_FMT "%Y%m%dT%H%M%S" #define MAX_OLDLOGS 65536 /* Default maximum number of old logfiles */ struct compress_types { const char *flag; /* Flag in configuration file */ const char *suffix; /* Compression suffix */ const char *path; /* Path to compression program */ }; static const struct compress_types compress_type[COMPRESS_TYPES] = { { "", "", "" }, /* no compression */ { "Z", COMPRESS_SUFFIX_GZ, _PATH_GZIP }, /* gzip compression */ { "J", COMPRESS_SUFFIX_BZ2, _PATH_BZIP2 }, /* bzip2 compression */ { "X", COMPRESS_SUFFIX_XZ, _PATH_XZ } /* xz compression */ }; struct conf_entry { STAILQ_ENTRY(conf_entry) cf_nextp; char *log; /* Name of the log */ char *pid_cmd_file; /* PID or command file */ char *r_reason; /* The reason this file is being rotated */ int firstcreate; /* Creating log for the first time (-C). 
*/ int rotate; /* Non-zero if this file should be rotated */ int fsize; /* size found for the log file */ uid_t uid; /* Owner of log */ gid_t gid; /* Group of log */ int numlogs; /* Number of logs to keep */ int trsize; /* Size cutoff to trigger trimming the log */ int hours; /* Hours between log trimming */ struct ptime_data *trim_at; /* Specific time to do trimming */ unsigned int permissions; /* File permissions on the log */ int flags; /* CE_BINARY */ int compress; /* Compression */ int sig; /* Signal to send */ int def_cfg; /* Using the rule for this file */ }; struct sigwork_entry { SLIST_ENTRY(sigwork_entry) sw_nextp; int sw_signum; /* the signal to send */ int sw_pidok; /* true if pid value is valid */ pid_t sw_pid; /* the process id from the PID file */ const char *sw_pidtype; /* "daemon" or "process group" */ int sw_runcmd; /* run command or send PID to signal */ char sw_fname[1]; /* file the PID was read from or shell cmd */ }; struct zipwork_entry { SLIST_ENTRY(zipwork_entry) zw_nextp; const struct conf_entry *zw_conf; /* for chown/perm/flag info */ const struct sigwork_entry *zw_swork; /* to know success of signal */ int zw_fsize; /* size of the file to compress */ char zw_fname[1]; /* the file to compress */ }; struct include_entry { STAILQ_ENTRY(include_entry) inc_nextp; const char *file; /* Name of file to process */ }; struct oldlog_entry { char *fname; /* Filename of the log file */ time_t t; /* Parsed timestamp of the logfile */ }; typedef enum { FREE_ENT, KEEP_ENT } fk_entry; STAILQ_HEAD(cflist, conf_entry); static SLIST_HEAD(swlisthead, sigwork_entry) swhead = SLIST_HEAD_INITIALIZER(swhead); static SLIST_HEAD(zwlisthead, zipwork_entry) zwhead = SLIST_HEAD_INITIALIZER(zwhead); STAILQ_HEAD(ilist, include_entry); int dbg_at_times; /* -D Show details of 'trim_at' code */ static int archtodir = 0; /* Archive old logfiles to other directory */ static int createlogs; /* Create (non-GLOB) logfiles which do not */ /* already exist. 1=='for entries with */ /* C flag', 2=='for all entries'. */ int verbose = 0; /* Print out what's going on */ static int needroot = 1; /* Root privs are necessary */ int noaction = 0; /* Don't do anything, just show it */ static int norotate = 0; /* Don't rotate */ static int nosignal; /* Do not send any signals */ static int enforcepid = 0; /* If PID file does not exist or empty, do nothing */ static int force = 0; /* Force the trim no matter what */ static int rotatereq = 0; /* -R = Always rotate the file(s) as given */ /* on the command (this also requires */ /* that a list of files *are* given on */ /* the run command). */ static char *requestor; /* The name given on a -R request */ static char *timefnamefmt = NULL;/* Use time based filenames instead of .0 */ static char *archdirname; /* Directory path to old logfiles archive */ static char *destdir = NULL; /* Directory to treat at root for logs */ static const char *conf; /* Configuration file to use */ struct ptime_data *dbg_timenow; /* A "timenow" value set via -D option */ static struct ptime_data *timenow; /* The time to use for checking at-fields */ #define DAYTIME_LEN 16 static char daytime[DAYTIME_LEN];/* The current time in human readable form, * used for rotation-tracking messages. 
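 * (parse_args fills it from ptimeget_ctime() + 4, skipping the
 * day-of-week prefix)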
*/ static char hostname[MAXHOSTNAMELEN]; /* hostname */ static const char *path_syslogpid = _PATH_SYSLOGPID; static struct cflist *get_worklist(char **files); static void parse_file(FILE *cf, struct cflist *work_p, struct cflist *glob_p, struct conf_entry *defconf_p, struct ilist *inclist); static void add_to_queue(const char *fname, struct ilist *inclist); static char *sob(char *p); static char *son(char *p); static int isnumberstr(const char *); static int isglobstr(const char *); static char *missing_field(char *p, char *errline); static void change_attrs(const char *, const struct conf_entry *); static const char *get_logfile_suffix(const char *logfile); static fk_entry do_entry(struct conf_entry *); static fk_entry do_rotate(const struct conf_entry *); static void do_sigwork(struct sigwork_entry *); static void do_zipwork(struct zipwork_entry *); static struct sigwork_entry * save_sigwork(const struct conf_entry *); static struct zipwork_entry * save_zipwork(const struct conf_entry *, const struct sigwork_entry *, int, const char *); static void set_swpid(struct sigwork_entry *, const struct conf_entry *); static int sizefile(const char *); static void expand_globs(struct cflist *work_p, struct cflist *glob_p); static void free_clist(struct cflist *list); static void free_entry(struct conf_entry *ent); static struct conf_entry *init_entry(const char *fname, struct conf_entry *src_entry); static void parse_args(int argc, char **argv); static int parse_doption(const char *doption); static void usage(void); static int log_trim(const char *logname, const struct conf_entry *log_ent); static int age_old_log(const char *file); static void savelog(char *from, char *to); static void createdir(const struct conf_entry *ent, char *dirpart); static void createlog(const struct conf_entry *ent); static int parse_signal(const char *str); /* * All the following take a parameter of 'int', but expect values in the * range of unsigned char. Define wrappers which take values of type 'char', * whether signed or unsigned, and ensure they end up in the right range. */ #define isdigitch(Anychar) isdigit((u_char)(Anychar)) #define isprintch(Anychar) isprint((u_char)(Anychar)) #define isspacech(Anychar) isspace((u_char)(Anychar)) #define tolowerch(Anychar) tolower((u_char)(Anychar)) int main(int argc, char **argv) { struct cflist *worklist; struct conf_entry *p; struct sigwork_entry *stmp; struct zipwork_entry *ztmp; SLIST_INIT(&swhead); SLIST_INIT(&zwhead); parse_args(argc, argv); argc -= optind; argv += optind; if (needroot && getuid() && geteuid()) errx(1, "must have root privs"); worklist = get_worklist(argv); /* * Rotate all the files which need to be rotated. Note that * some users have *hundreds* of entries in newsyslog.conf! */ while (!STAILQ_EMPTY(worklist)) { p = STAILQ_FIRST(worklist); STAILQ_REMOVE_HEAD(worklist, cf_nextp); if (do_entry(p) == FREE_ENT) free_entry(p); } /* * Send signals to any processes which need a signal to tell * them to close and re-open the log file(s) we have rotated. * Note that zipwork_entries include pointers to these * sigwork_entry's, so we can not free the entries here. 
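 * (They are freed near the bottom of main(), once all zipwork has
 * been processed.)
 */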
*/ if (!SLIST_EMPTY(&swhead)) { if (noaction || verbose) printf("Signal all daemon process(es)...\n"); SLIST_FOREACH(stmp, &swhead, sw_nextp) do_sigwork(stmp); if (!(rotatereq && nosignal)) { if (noaction) printf("\tsleep 10\n"); else { if (verbose) printf("Pause 10 seconds to allow " "daemon(s) to close log file(s)\n"); sleep(10); } } } /* * Compress all files that we're expected to compress, now * that all processes should have closed the files which * have been rotated. */ if (!SLIST_EMPTY(&zwhead)) { if (noaction || verbose) printf("Compress all rotated log file(s)...\n"); while (!SLIST_EMPTY(&zwhead)) { ztmp = SLIST_FIRST(&zwhead); do_zipwork(ztmp); SLIST_REMOVE_HEAD(&zwhead, zw_nextp); free(ztmp); } } /* Now free all the sigwork entries. */ while (!SLIST_EMPTY(&swhead)) { stmp = SLIST_FIRST(&swhead); SLIST_REMOVE_HEAD(&swhead, sw_nextp); free(stmp); } while (wait(NULL) > 0 || errno == EINTR) ; return (0); } static struct conf_entry * init_entry(const char *fname, struct conf_entry *src_entry) { struct conf_entry *tempwork; if (verbose > 4) printf("\t--> [creating entry for %s]\n", fname); tempwork = malloc(sizeof(struct conf_entry)); if (tempwork == NULL) err(1, "malloc of conf_entry for %s", fname); if (destdir == NULL || fname[0] != '/') tempwork->log = strdup(fname); else asprintf(&tempwork->log, "%s%s", destdir, fname); if (tempwork->log == NULL) err(1, "strdup for %s", fname); if (src_entry != NULL) { tempwork->pid_cmd_file = NULL; if (src_entry->pid_cmd_file) tempwork->pid_cmd_file = strdup(src_entry->pid_cmd_file); tempwork->r_reason = NULL; tempwork->firstcreate = 0; tempwork->rotate = 0; tempwork->fsize = -1; tempwork->uid = src_entry->uid; tempwork->gid = src_entry->gid; tempwork->numlogs = src_entry->numlogs; tempwork->trsize = src_entry->trsize; tempwork->hours = src_entry->hours; tempwork->trim_at = NULL; if (src_entry->trim_at != NULL) tempwork->trim_at = ptime_init(src_entry->trim_at); tempwork->permissions = src_entry->permissions; tempwork->flags = src_entry->flags; tempwork->compress = src_entry->compress; tempwork->sig = src_entry->sig; tempwork->def_cfg = src_entry->def_cfg; } else { /* Initialize as a "do-nothing" entry */ tempwork->pid_cmd_file = NULL; tempwork->r_reason = NULL; tempwork->firstcreate = 0; tempwork->rotate = 0; tempwork->fsize = -1; tempwork->uid = (uid_t)-1; tempwork->gid = (gid_t)-1; tempwork->numlogs = 1; tempwork->trsize = -1; tempwork->hours = -1; tempwork->trim_at = NULL; tempwork->permissions = 0; tempwork->flags = 0; tempwork->compress = COMPRESS_NONE; tempwork->sig = SIGHUP; tempwork->def_cfg = 0; } return (tempwork); } static void free_entry(struct conf_entry *ent) { if (ent == NULL) return; if (ent->log != NULL) { if (verbose > 4) printf("\t--> [freeing entry for %s]\n", ent->log); free(ent->log); ent->log = NULL; } if (ent->pid_cmd_file != NULL) { free(ent->pid_cmd_file); ent->pid_cmd_file = NULL; } if (ent->r_reason != NULL) { free(ent->r_reason); ent->r_reason = NULL; } if (ent->trim_at != NULL) { ptime_free(ent->trim_at); ent->trim_at = NULL; } free(ent); } static void free_clist(struct cflist *list) { struct conf_entry *ent; while (!STAILQ_EMPTY(list)) { ent = STAILQ_FIRST(list); STAILQ_REMOVE_HEAD(list, cf_nextp); free_entry(ent); } free(list); list = NULL; } static fk_entry do_entry(struct conf_entry * ent) { #define REASON_MAX 80 int modtime; fk_entry free_or_keep; double diffsecs; char temp_reason[REASON_MAX]; int oversized; free_or_keep = FREE_ENT; if (verbose) printf("%s <%d%s>: ", ent->log, ent->numlogs, 
compress_type[ent->compress].flag); ent->fsize = sizefile(ent->log); oversized = ((ent->trsize > 0) && (ent->fsize >= ent->trsize)); modtime = age_old_log(ent->log); ent->rotate = 0; ent->firstcreate = 0; if (ent->fsize < 0) { /* * If either the C flag or the -C option was specified, * and if we won't be creating the file, then have the * verbose message include a hint as to why the file * will not be created. */ temp_reason[0] = '\0'; if (createlogs > 1) ent->firstcreate = 1; else if ((ent->flags & CE_CREATE) && createlogs) ent->firstcreate = 1; else if (ent->flags & CE_CREATE) strlcpy(temp_reason, " (no -C option)", REASON_MAX); else if (createlogs) strlcpy(temp_reason, " (no C flag)", REASON_MAX); if (ent->firstcreate) { if (verbose) printf("does not exist -> will create.\n"); createlog(ent); } else if (verbose) { printf("does not exist, skipped%s.\n", temp_reason); } } else { if (ent->flags & CE_TRIMAT && !force && !rotatereq && !oversized) { diffsecs = ptimeget_diff(timenow, ent->trim_at); if (diffsecs < 0.0) { /* trim_at is some time in the future. */ if (verbose) { ptime_adjust4dst(ent->trim_at, timenow); printf("--> will trim at %s", ptimeget_ctime(ent->trim_at)); } return (free_or_keep); } else if (diffsecs >= 3600.0) { /* * trim_at is more than an hour in the past, * so find the next valid trim_at time, and * tell the user what that will be. */ if (verbose && dbg_at_times) printf("\n\t--> prev trim at %s\t", ptimeget_ctime(ent->trim_at)); if (verbose) { ptimeset_nxtime(ent->trim_at); printf("--> will trim at %s", ptimeget_ctime(ent->trim_at)); } return (free_or_keep); } else if (verbose && noaction && dbg_at_times) { /* * If we are just debugging at-times, then * a detailed message is helpful. Also * skip "doing" any commands, since they * would all be turned off by no-action. */ printf("\n\t--> timematch at %s", ptimeget_ctime(ent->trim_at)); return (free_or_keep); } else if (verbose && ent->hours <= 0) { printf("--> time is up\n"); } } if (verbose && (ent->trsize > 0)) printf("size (Kb): %d [%d] ", ent->fsize, ent->trsize); if (verbose && (ent->hours > 0)) printf(" age (hr): %d [%d] ", modtime, ent->hours); /* * Figure out if this logfile needs to be rotated. */ temp_reason[0] = '\0'; if (rotatereq) { ent->rotate = 1; snprintf(temp_reason, REASON_MAX, " due to -R from %s", requestor); } else if (force) { ent->rotate = 1; snprintf(temp_reason, REASON_MAX, " due to -F request"); } else if (oversized) { ent->rotate = 1; snprintf(temp_reason, REASON_MAX, " due to size>%dK", ent->trsize); } else if (ent->hours <= 0 && (ent->flags & CE_TRIMAT)) { ent->rotate = 1; } else if ((ent->hours > 0) && ((modtime >= ent->hours) || (modtime < 0))) { ent->rotate = 1; } /* * If the file needs to be rotated, then rotate it. */ if (ent->rotate && !norotate) { if (temp_reason[0] != '\0') ent->r_reason = strdup(temp_reason); if (verbose) printf("--> trimming log....\n"); if (noaction && !verbose) printf("%s <%d%s>: trimming\n", ent->log, ent->numlogs, compress_type[ent->compress].flag); free_or_keep = do_rotate(ent); } else { if (verbose) printf("--> skipping\n"); } } return (free_or_keep); #undef REASON_MAX } static void parse_args(int argc, char **argv) { int ch; char *p; timenow = ptime_init(NULL); ptimeset_time(timenow, time(NULL)); strlcpy(daytime, ptimeget_ctime(timenow) + 4, DAYTIME_LEN); /* Let's get our hostname */ (void)gethostname(hostname, sizeof(hostname)); /* Truncate domain */ if ((p = strchr(hostname, '.')) != NULL) *p = '\0'; /* Parse command line options. 
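 * Note that -n (no action) deliberately falls through to -r, so a
 * no-action run does not require root; -F (force) and -N (no rotate)
 * are rejected as mutually exclusive once the loop ends.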
*/ while ((ch = getopt(argc, argv, "a:d:f:nrst:vCD:FNPR:S:")) != -1) switch (ch) { case 'a': archtodir++; archdirname = optarg; break; case 'd': destdir = optarg; break; case 'f': conf = optarg; break; case 'n': noaction++; /* FALLTHROUGH */ case 'r': needroot = 0; break; case 's': nosignal = 1; break; case 't': if (optarg[0] == '\0' || strcmp(optarg, "DEFAULT") == 0) timefnamefmt = strdup(DEFAULT_TIMEFNAME_FMT); else timefnamefmt = strdup(optarg); break; case 'v': verbose++; break; case 'C': /* Useful for things like rc.diskless... */ createlogs++; break; case 'D': /* * Set some debugging option. The specific option * depends on the value of optarg. These options * may come and go without notice or documentation. */ if (parse_doption(optarg)) break; usage(); /* NOTREACHED */ case 'F': force++; break; case 'N': norotate++; break; case 'P': enforcepid++; break; case 'R': rotatereq++; requestor = strdup(optarg); break; case 'S': path_syslogpid = optarg; break; case 'm': /* Used by OpenBSD for "monitor mode" */ default: usage(); /* NOTREACHED */ } if (force && norotate) { warnx("Only one of -F and -N may be specified."); usage(); /* NOTREACHED */ } if (rotatereq) { if (optind == argc) { warnx("At least one filename must be given when -R is specified."); usage(); /* NOTREACHED */ } /* Make sure "requestor" value is safe for a syslog message. */ for (p = requestor; *p != '\0'; p++) { if (!isprintch(*p) && (*p != '\t')) *p = '.'; } } if (dbg_timenow) { /* * Note that the 'daytime' variable is not changed. * That is only used in messages that track when a * logfile is rotated, and if a file *is* rotated, * then it will still rotated at the "real now" time. */ ptime_free(timenow); timenow = dbg_timenow; fprintf(stderr, "Debug: Running as if TimeNow is %s", ptimeget_ctime(dbg_timenow)); } } /* * These debugging options are mainly meant for developer use, such * as writing regression-tests. They would not be needed by users * during normal operation of newsyslog... */ static int parse_doption(const char *doption) { const char TN[] = "TN="; int res; if (strncmp(doption, TN, sizeof(TN) - 1) == 0) { /* * The "TimeNow" debugging option. This might be off * by an hour when crossing a timezone change. */ dbg_timenow = ptime_init(NULL); res = ptime_relparse(dbg_timenow, PTM_PARSE_ISO8601, time(NULL), doption + sizeof(TN) - 1); if (res == -2) { warnx("Non-existent time specified on -D %s", doption); return (0); /* failure */ } else if (res < 0) { warnx("Malformed time given on -D %s", doption); return (0); /* failure */ } return (1); /* successfully parsed */ } if (strcmp(doption, "ats") == 0) { dbg_at_times++; return (1); /* successfully parsed */ } /* XXX - This check could probably be dropped. */ if ((strcmp(doption, "neworder") == 0) || (strcmp(doption, "oldorder") == 0)) { warnx("NOTE: newsyslog always uses 'neworder'."); return (1); /* successfully parsed */ } warnx("Unknown -D (debug) option: '%s'", doption); return (0); /* failure */ } static void usage(void) { fprintf(stderr, "usage: newsyslog [-CFNPnrsv] [-a directory] [-d directory] [-f config_file]\n" " [-S pidfile] [-t timefmt] [[-R tagname] file ...]\n"); exit(1); } /* * Parse a configuration file and return a linked list of all the logs * which should be processed. 
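 * With no filenames on the command line this is simply the expanded
 * config-file list; otherwise a per-file worklist is built, falling
 * back to a built-in default entry for files not listed anywhere.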
*/ static struct cflist * get_worklist(char **files) { FILE *f; char **given; struct cflist *cmdlist, *filelist, *globlist; struct conf_entry *defconf, *dupent, *ent; struct ilist inclist; struct include_entry *inc; int gmatch, fnres; defconf = NULL; STAILQ_INIT(&inclist); filelist = malloc(sizeof(struct cflist)); if (filelist == NULL) err(1, "malloc of filelist"); STAILQ_INIT(filelist); globlist = malloc(sizeof(struct cflist)); if (globlist == NULL) err(1, "malloc of globlist"); STAILQ_INIT(globlist); inc = malloc(sizeof(struct include_entry)); if (inc == NULL) err(1, "malloc of inc"); inc->file = conf; if (inc->file == NULL) inc->file = _PATH_CONF; STAILQ_INSERT_TAIL(&inclist, inc, inc_nextp); STAILQ_FOREACH(inc, &inclist, inc_nextp) { if (strcmp(inc->file, "-") != 0) f = fopen(inc->file, "r"); else { f = stdin; inc->file = ""; } if (!f) err(1, "%s", inc->file); if (verbose) printf("Processing %s\n", inc->file); parse_file(f, filelist, globlist, defconf, &inclist); (void) fclose(f); } /* * All config-file information has been read in and turned into * a filelist and a globlist. If there were no specific files * given on the run command, then the only thing left to do is to * call a routine which finds all files matched by the globlist * and adds them to the filelist. Then return the worklist. */ if (*files == NULL) { expand_globs(filelist, globlist); free_clist(globlist); if (defconf != NULL) free_entry(defconf); return (filelist); /* NOTREACHED */ } /* * If newsyslog was given a specific list of files to process, * it may be that some of those files were not listed in any * config file. Those unlisted files should get the default * rotation action. First, create the default-rotation action * if none was found in a system config file. */ if (defconf == NULL) { defconf = init_entry(DEFAULT_MARKER, NULL); defconf->numlogs = 3; defconf->trsize = 50; defconf->permissions = S_IRUSR|S_IWUSR; } /* * If newsyslog was run with a list of specific filenames, * then create a new worklist which has only those files in * it, picking up the rotation-rules for those files from * the original filelist. * * XXX - Note that this will copy multiple rules for a single * logfile, if multiple entries are an exact match for * that file. That matches the historic behavior, but do * we want to continue to allow it? If so, it should * probably be handled more intelligently. */ cmdlist = malloc(sizeof(struct cflist)); if (cmdlist == NULL) err(1, "malloc of cmdlist"); STAILQ_INIT(cmdlist); for (given = files; *given; ++given) { /* * First try to find exact-matches for this given file. */ gmatch = 0; STAILQ_FOREACH(ent, filelist, cf_nextp) { if (strcmp(ent->log, *given) == 0) { gmatch++; dupent = init_entry(*given, ent); STAILQ_INSERT_TAIL(cmdlist, dupent, cf_nextp); } } if (gmatch) { if (verbose > 2) printf("\t+ Matched entry %s\n", *given); continue; } /* * There was no exact-match for this given file, so look * for a "glob" entry which does match. */ gmatch = 0; if (verbose > 2 && globlist != NULL) printf("\t+ Checking globs for %s\n", *given); STAILQ_FOREACH(ent, globlist, cf_nextp) { fnres = fnmatch(ent->log, *given, FNM_PATHNAME); if (verbose > 2) printf("\t+ = %d for pattern %s\n", fnres, ent->log); if (fnres == 0) { gmatch++; dupent = init_entry(*given, ent); /* This new entry is not a glob! 
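 * (it now names one specific file, so CE_GLOB is cleared before the
 * entry is queued)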
*/ dupent->flags &= ~CE_GLOB; STAILQ_INSERT_TAIL(cmdlist, dupent, cf_nextp); /* Only allow a match to one glob-entry */ break; } } if (gmatch) { if (verbose > 2) printf("\t+ Matched %s via %s\n", *given, ent->log); continue; } /* * This given file was not found in any config file, so * add a worklist item based on the default entry. */ if (verbose > 2) printf("\t+ No entry matched %s (will use %s)\n", *given, DEFAULT_MARKER); dupent = init_entry(*given, defconf); /* Mark that it was *not* found in a config file */ dupent->def_cfg = 1; STAILQ_INSERT_TAIL(cmdlist, dupent, cf_nextp); } /* * Free all the entries in the original work list, the list of * glob entries, and the default entry. */ free_clist(filelist); free_clist(globlist); free_entry(defconf); /* And finally, return a worklist which matches the given files. */ return (cmdlist); } /* * Expand the list of entries with filename patterns, and add all files * which match those glob-entries onto the worklist. */ static void expand_globs(struct cflist *work_p, struct cflist *glob_p) { int gmatch, gres; size_t i; char *mfname; struct conf_entry *dupent, *ent, *globent; glob_t pglob; struct stat st_fm; /* * The worklist contains all fully-specified (non-GLOB) names. * * Now expand the list of filename-pattern (GLOB) entries into * a second list, which (by definition) will only match files * that already exist. Do not add a glob-related entry for any * file which already exists in the fully-specified list. */ STAILQ_FOREACH(globent, glob_p, cf_nextp) { gres = glob(globent->log, GLOB_NOCHECK, NULL, &pglob); if (gres != 0) { warn("cannot expand pattern (%d): %s", gres, globent->log); continue; } if (verbose > 2) printf("\t+ Expanding pattern %s\n", globent->log); for (i = 0; i < pglob.gl_matchc; i++) { mfname = pglob.gl_pathv[i]; /* See if this file already has a specific entry. */ gmatch = 0; STAILQ_FOREACH(ent, work_p, cf_nextp) { if (strcmp(mfname, ent->log) == 0) { gmatch++; break; } } if (gmatch) continue; /* Make sure the named matched is a file. */ gres = lstat(mfname, &st_fm); if (gres != 0) { /* Error on a file that glob() matched?!? */ warn("Skipping %s - lstat() error", mfname); continue; } if (!S_ISREG(st_fm.st_mode)) { /* We only rotate files! */ if (verbose > 2) printf("\t+ . skipping %s (!file)\n", mfname); continue; } if (verbose > 2) printf("\t+ . add file %s\n", mfname); dupent = init_entry(mfname, globent); /* This new entry is not a glob! */ dupent->flags &= ~CE_GLOB; /* Add to the worklist. */ STAILQ_INSERT_TAIL(work_p, dupent, cf_nextp); } globfree(&pglob); if (verbose > 2) printf("\t+ Done with pattern %s\n", globent->log); } } /* * Parse a configuration file and update a linked list of all the logs to * process. 
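 * Each entry supplies, in order: the logfile name, an optional
 * owner:group pair, the mode, the count of old logs to keep, the
 * size cutoff, the interval and/or an "@"/"$" at-time, and then
 * optional flags (e.g. Z/J/X select gzip/bzip2/xz), an optional
 * pid/cmd file, and an optional signal.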
*/ static void parse_file(FILE *cf, struct cflist *work_p, struct cflist *glob_p, struct conf_entry *defconf_p, struct ilist *inclist) { char line[BUFSIZ], *parse, *q; char *cp, *errline, *group; struct conf_entry *working; struct passwd *pwd; struct group *grp; glob_t pglob; int eol, ptm_opts, res, special; size_t i; errline = NULL; while (fgets(line, BUFSIZ, cf)) { if ((line[0] == '\n') || (line[0] == '#') || (strlen(line) == 0)) continue; if (errline != NULL) free(errline); errline = strdup(line); for (cp = line + 1; *cp != '\0'; cp++) { if (*cp != '#') continue; if (*(cp - 1) == '\\') { strcpy(cp - 1, cp); cp--; continue; } *cp = '\0'; break; } q = parse = missing_field(sob(line), errline); parse = son(line); if (!*parse) errx(1, "malformed line (missing fields):\n%s", errline); *parse = '\0'; /* * Allow people to set debug options via the config file. * (NOTE: debug options are undocumented, and may disappear * at any time, etc). */ if (strcasecmp(DEBUG_MARKER, q) == 0) { q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) warnx("debug line specifies no option:\n%s", errline); else { *parse = '\0'; parse_doption(q); } continue; } else if (strcasecmp(INCLUDE_MARKER, q) == 0) { if (verbose) printf("Found: %s", errline); q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) { warnx("include line missing argument:\n%s", errline); continue; } *parse = '\0'; if (isglobstr(q)) { res = glob(q, GLOB_NOCHECK, NULL, &pglob); if (res != 0) { warn("cannot expand pattern (%d): %s", res, q); continue; } if (verbose > 2) printf("\t+ Expanding pattern %s\n", q); for (i = 0; i < pglob.gl_matchc; i++) add_to_queue(pglob.gl_pathv[i], inclist); globfree(&pglob); } else add_to_queue(q, inclist); continue; } special = 0; working = init_entry(q, NULL); if (strcasecmp(DEFAULT_MARKER, q) == 0) { special = 1; if (defconf_p != NULL) { warnx("Ignoring duplicate entry for %s!", q); free_entry(working); continue; } defconf_p = working; } q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) errx(1, "malformed line (missing fields):\n%s", errline); *parse = '\0'; if ((group = strchr(q, ':')) != NULL || (group = strrchr(q, '.')) != NULL) { *group++ = '\0'; if (*q) { if (!(isnumberstr(q))) { if ((pwd = getpwnam(q)) == NULL) errx(1, "error in config file; unknown user:\n%s", errline); working->uid = pwd->pw_uid; } else working->uid = atoi(q); } else working->uid = (uid_t)-1; q = group; if (*q) { if (!(isnumberstr(q))) { if ((grp = getgrnam(q)) == NULL) errx(1, "error in config file; unknown group:\n%s", errline); working->gid = grp->gr_gid; } else working->gid = atoi(q); } else working->gid = (gid_t)-1; q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) errx(1, "malformed line (missing fields):\n%s", errline); *parse = '\0'; } else { working->uid = (uid_t)-1; working->gid = (gid_t)-1; } if (!sscanf(q, "%o", &working->permissions)) errx(1, "error in config file; bad permissions:\n%s", errline); q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) errx(1, "malformed line (missing fields):\n%s", errline); *parse = '\0'; if (!sscanf(q, "%d", &working->numlogs) || working->numlogs < 0) errx(1, "error in config file; bad value for count of logs to save:\n%s", errline); q = parse = missing_field(sob(parse + 1), errline); parse = son(parse); if (!*parse) errx(1, "malformed line (missing fields):\n%s", errline); *parse = '\0'; if (isdigitch(*q)) working->trsize = 
atoi(q);
		else if (strcmp(q, "*") == 0)
			working->trsize = -1;
		else {
			warnx("Invalid value of '%s' for 'size' in line:\n%s",
			    q, errline);
			working->trsize = -1;
		}

		working->flags = 0;
		working->compress = COMPRESS_NONE;

		q = parse = missing_field(sob(parse + 1), errline);
		parse = son(parse);
		eol = !*parse;
		*parse = '\0';
		{
			char *ep;
			u_long ul;

			ul = strtoul(q, &ep, 10);
			if (ep == q)
				working->hours = 0;
			else if (*ep == '*')
				working->hours = -1;
			else if (ul > INT_MAX)
				errx(1, "interval is too large:\n%s", errline);
			else
				working->hours = ul;

			if (*ep == '\0' || strcmp(ep, "*") == 0)
				goto no_trimat;
			if (*ep != '@' && *ep != '$')
				errx(1, "malformed interval/at:\n%s", errline);

			working->flags |= CE_TRIMAT;
			working->trim_at = ptime_init(NULL);
			ptm_opts = PTM_PARSE_ISO8601;
			if (*ep == '$')
				ptm_opts = PTM_PARSE_DWM;
			ptm_opts |= PTM_PARSE_MATCHDOM;
			res = ptime_relparse(working->trim_at, ptm_opts,
			    ptimeget_secs(timenow), ep + 1);
			if (res == -2)
				errx(1, "nonexistent time for 'at' value:\n%s",
				    errline);
			else if (res < 0)
				errx(1, "malformed 'at' value:\n%s", errline);
		}
no_trimat:

		if (eol)
			q = NULL;
		else {
			q = parse = sob(parse + 1);	/* Optional field */
			parse = son(parse);
			if (!*parse)
				eol = 1;
			*parse = '\0';
		}

		for (; q && *q && !isspacech(*q); q++) {
			switch (tolowerch(*q)) {
			case 'b':
				working->flags |= CE_BINARY;
				break;
			case 'c':
				working->flags |= CE_CREATE;
				break;
			case 'd':
				working->flags |= CE_NODUMP;
				break;
			case 'g':
				working->flags |= CE_GLOB;
				break;
			case 'j':
				working->compress = COMPRESS_BZIP2;
				break;
			case 'n':
				working->flags |= CE_NOSIGNAL;
				break;
			case 'r':
				working->flags |= CE_PID2CMD;
				break;
			case 'u':
				working->flags |= CE_SIGNALGROUP;
				break;
			case 'w':
				/* Deprecated flag - keep for compatibility purposes */
				break;
			case 'x':
				working->compress = COMPRESS_XZ;
				break;
			case 'z':
				working->compress = COMPRESS_GZIP;
				break;
			case '-':
				break;
			case 'f':	/* Used by OpenBSD for "CE_FOLLOW" */
			case 'm':	/* Used by OpenBSD for "CE_MONITOR" */
			case 'p':	/* Used by NetBSD for "CE_PLAIN0" */
			default:
				errx(1, "illegal flag in config file -- %c",
				    *q);
			}
		}

		if (eol)
			q = NULL;
		else {
			q = parse = sob(parse + 1);	/* Optional field */
			parse = son(parse);
			if (!*parse)
				eol = 1;
			*parse = '\0';
		}

		working->pid_cmd_file = NULL;
		if (q && *q) {
			if (*q == '/')
				working->pid_cmd_file = strdup(q);
			else if (isalnum(*q))
				goto got_sig;
			else {
				errx(1,
				    "illegal pid file or signal in config file:\n%s",
				    errline);
			}
		}
		if (eol)
			q = NULL;
		else {
			q = parse = sob(parse + 1);	/* Optional field */
			*(parse = son(parse)) = '\0';
		}

		working->sig = SIGHUP;
		if (q && *q) {
got_sig:
			working->sig = parse_signal(q);
			if (working->sig < 1 || working->sig >= sys_nsig) {
				errx(1,
				    "illegal signal in config file:\n%s",
				    errline);
			}
		}

		/*
		 * Finish figuring out what pid-file to use (if any) in
		 * later processing if this logfile needs to be rotated.
		 */
		if ((working->flags & CE_NOSIGNAL) == CE_NOSIGNAL) {
			/*
			 * This config-entry specified 'n' for nosignal,
			 * see if it also specified an explicit pid_cmd_file.
			 * This would be a pretty pointless combination.
			 */
			if (working->pid_cmd_file != NULL) {
				warnx("Ignoring '%s' because flag 'n' was specified in line:\n%s",
				    working->pid_cmd_file, errline);
				free(working->pid_cmd_file);
				working->pid_cmd_file = NULL;
			}
		} else if (working->pid_cmd_file == NULL) {
			/*
			 * This entry did not specify the 'n' flag, which
			 * means it should signal syslogd unless it had
			 * specified some other pid-file (and obviously the
			 * syslog pid-file will not be for a process-group).
			 * Also, we should only try to notify syslog if we
			 * are root.
			 */
			if (working->flags & CE_SIGNALGROUP) {
				warnx("Ignoring flag 'U' in line:\n%s",
				    errline);
				working->flags &= ~CE_SIGNALGROUP;
			}
			if (needroot)
				working->pid_cmd_file = strdup(path_syslogpid);
		}

		/*
		 * Add this entry to the appropriate list of entries, unless
		 * it was some kind of special entry (eg: <default>).
		 */
		if (special) {
			;	/* Do not add to any list */
		} else if (working->flags & CE_GLOB) {
			STAILQ_INSERT_TAIL(glob_p, working, cf_nextp);
		} else {
			STAILQ_INSERT_TAIL(work_p, working, cf_nextp);
		}
	}
	if (errline != NULL)
		free(errline);
}

static char *
missing_field(char *p, char *errline)
{

	if (!p || !*p)
		errx(1, "missing field in config file:\n%s", errline);
	return (p);
}

/*
 * In our sort we return it in the reverse of what qsort normally
 * would do, as we want the newest files first.  If we have two
 * entries with the same time we don't really care about order.
 *
 * Support function for qsort() in delete_oldest_timelog().
 */
static int
oldlog_entry_compare(const void *a, const void *b)
{
	const struct oldlog_entry *ola = a, *olb = b;

	if (ola->t > olb->t)
		return (-1);
	else if (ola->t < olb->t)
		return (1);
	else
		return (0);
}

/*
 * Check whether the file corresponding to dp is an archive of the logfile
 * logfname, based on the timefnamefmt format string.  Return true and fill
 * out tm if this is the case; otherwise return false.
 */
static int
validate_old_timelog(int fd, const struct dirent *dp, const char *logfname,
    struct tm *tm)
{
	struct stat sb;
	size_t logfname_len;
	char *s;
	int c;

	logfname_len = strlen(logfname);

	if (dp->d_type != DT_REG) {
		/*
		 * Some filesystems (e.g. NFS) don't fill out the d_type field
		 * and leave it set to DT_UNKNOWN; in this case we must obtain
		 * the file type ourselves.
		 */
		if (dp->d_type != DT_UNKNOWN ||
		    fstatat(fd, dp->d_name, &sb, AT_SYMLINK_NOFOLLOW) != 0 ||
		    !S_ISREG(sb.st_mode))
			return (0);
	}
	/* Ignore everything but files with our logfile prefix. */
	if (strncmp(dp->d_name, logfname, logfname_len) != 0)
		return (0);
	/* Ignore the actual non-rotated logfile. */
	if (dp->d_namlen == logfname_len)
		return (0);
	/*
	 * Make sure we have found a logfile, so the
	 * postfix is valid, i.e. format is: '.
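For reference, a hypothetical newsyslog.conf fragment tying the parsed
fields together (paths, sizes, and times here are invented for
illustration; the flag letters match the switch in parse_file() above):

	# logfilename         [owner:group]  mode count size when  flags [/pid_file]
	/var/log/myapp.log    root:wheel     640  7     100  *     JC    /var/run/myapp.pid
	/var/log/daily.log                   644  31    *    @T00  Z

The first entry rotates once the log exceeds 100 kB, keeps 7
bzip2-compressed old logs, creates the file if it is missing, and sends
the default SIGHUP to the pid in /var/run/myapp.pid; the second rotates
every midnight regardless of size and gzips the result.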