Index: user/ngie/more-tests/UPDATING
===================================================================
--- user/ngie/more-tests/UPDATING	(revision 281584)
+++ user/ngie/more-tests/UPDATING	(revision 281585)
@@ -1,1173 +1,1177 @@
 Updating Information for FreeBSD current users.
 
 This file is maintained and copyrighted by M. Warner Losh <imp@freebsd.org>.
 See end of file for further details.  For commonly done items, please see the
 COMMON ITEMS: section later in the file.  These instructions assume that you
 basically know what you are doing.  If not, then please consult the FreeBSD
 handbook:
 
     http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
 
 Items affecting the ports and packages system can be found in
 /usr/ports/UPDATING.  Please read that file before running portupgrade.
 
 NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping
 from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to
 the tip of head, and then rebuild without this option. The bootstrap process from
 older versions of current across the gcc/clang cutover is a bit fragile.
 
 NOTE TO PEOPLE WHO THINK THAT FreeBSD 11.x IS SLOW:
 	FreeBSD 11.x has many debugging features turned on, in both the kernel
 	and userland.  These features attempt to detect incorrect use of
 	system primitives, and encourage loud failure through extra sanity
 	checking and fail stop semantics.  They also substantially impact
 	system performance.  If you want to do performance measurement,
 	benchmarking, and optimization, you'll want to turn them off.  This
 	includes various WITNESS-related kernel options, INVARIANTS, malloc
 	debugging flags in userland, and various verbose features in the
 	kernel.  Many developers choose to disable these features on build
 	machines to maximize performance.  (To completely disable malloc
 	debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely
 	disable the most expensive debugging functionality run
 	"ln -s 'abort:false,junk:false' /etc/malloc.conf".)
 
+20150415:
+	The const qualifier has been removed from iconv(3) to comply with
+	POSIX.  The ports tree is aware of this from r384038 onwards.
+
 20150324:
 	Support for SATA controllers handled by the more functional ahci(4),
 	siis(4) and mvs(4) drivers was removed from the legacy ata(4) driver.
 	Kernel modules ataahci and ataadaptec were removed completely,
 	replaced by ahci and mvs modules respectively.
 
 20150315:
 	Clang, llvm and lldb have been upgraded to 3.6.0 release.  Please see
 	the 20141231 entry below for information about prerequisites and
 	upgrading, if you are not already using 3.5.0 or higher.
 
 20150307:
 	The 32-bit PowerPC kernel has been changed to a position-independent
 	executable. This can only be booted with a version of loader(8)
 	newer than January 31, 2015, so make sure to update both world and
 	kernel before rebooting.
 
 20150217:
 	If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014),
 	but before r278950, the RNG was not seeded properly.  Immediately
 	upgrade the kernel to r278950 or later and regenerate any keys (e.g.
 	ssh keys or openssl keys) that were generated with a kernel from that
 	range.  This does not affect programs that directly used /dev/random
 	or /dev/urandom.  All userland uses of arc4random(3) are affected.
 
 20150210:
 	The autofs(4) ABI was changed in order to restore binary compatibility
 	with 10.1-RELEASE.  The automountd(8) daemon needs to be rebuilt to work
 	with the new kernel.
 
 20150131:
 	The powerpc64 kernel has been changed to a position-independent
 	executable. This can only be booted with a new version of loader(8),
 	so make sure to update both world and kernel before rebooting.
 
 20150118:
 	Clang and llvm have been upgraded to 3.5.1 release.  This is a
 	bugfix-only release; no new features have been added.  Please see the
 	20141231 entry below for information about prerequisites and
 	upgrading, if you are not already using 3.5.0.
 
 20150107:
 	ELF tools addr2line, elfcopy (strip), nm, size, and strings are now
 	taken from the ELF Tool Chain project rather than GNU binutils. They
 	should be drop-in replacements, with the addition of arm64 support.
 	The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the
 	binutils tools, if necessary.
 
 20150105:
 	The default Unbound configuration now enables remote control
 	using a local socket.  Users who have already enabled the
 	local_unbound service should regenerate their configuration
 	by running "service local_unbound setup" as root.
 	
 20150102:
 	The GNU texinfo and GNU info pages have been removed.
 	To be able to view GNU info pages please install texinfo from ports.
 
 20141231:
 	Clang, llvm and lldb have been upgraded to 3.5.0 release.
 
 	As of this release, a prerequisite for building clang, llvm and lldb is
 	a C++11 capable compiler and C++11 standard library.  This means that to
 	be able to successfully build the cross-tools stage of buildworld, with
 	clang as the bootstrap compiler, your system compiler or cross compiler
 	should either be clang 3.3 or later, or gcc 4.8 or later, and your
 	system C++ library should be libc++, or libstdc++ from gcc 4.8 or
 	later.
 
 	On any standard FreeBSD 10.x or 11.x installation, where clang and
 	libc++ are on by default (that is, on x86 or arm), this should work out
 	of the box.
 
 	On 9.x installations where clang is enabled by default, e.g. on x86 and
 	powerpc, libc++ will not be enabled by default, so libc++ should be
 	built (with clang) and installed first.  If both clang and libc++ are
 	missing, build clang first, then use it to build libc++.
 
 	On 8.x and earlier installations, upgrade to 9.x first, and then follow
 	the instructions for 9.x above.
 
 	Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by
 	default, and do not build clang.
 
 	Many embedded systems are resource constrained, and will not be able to
 	build clang in a reasonable time, or in some cases at all.  In those
 	cases, cross building bootable systems on amd64 is a workaround.
 
 	This new version of clang introduces a number of new warnings, of which
 	the following are most likely to appear:
 
 	-Wabsolute-value
 
 	This warns in two cases, for both C and C++:
 	* When the code is trying to take the absolute value of an unsigned
 	  quantity, which is effectively a no-op, and almost never what was
 	  intended.  The code should be fixed, if at all possible.  If you are
 	  sure that the unsigned quantity can be safely cast to signed, without
 	  loss of information or undefined behavior, you can add an explicit
 	  cast, or disable the warning.
 
 	* When the code is trying to take an absolute value, but the called
 	  abs() variant is for the wrong type, which can lead to truncation.
 	  If you want to disable the warning instead of fixing the code, please
 	  make sure that truncation will not occur, or it might lead to unwanted
 	  side-effects.
 
 	-Wtautological-undefined-compare and
 	-Wundefined-bool-conversion
 
 	These warn when C++ code is trying to compare 'this' against NULL, while
 	'this' should never be NULL in well-defined C++ code.  However, there is
 	some legacy (pre C++11) code out there, which actively abuses this
 	feature, which was less strictly defined in previous C++ versions.
 
 	Squid and openjdk do this, for example.  The warning can be turned off
 	for C++98 and earlier, but compiling the code in C++11 mode might result
 	in unexpected behavior; for example, the parts of the program that are
 	unreachable could be optimized away.
 
 20141222:
 	The old NFS client and server (kernel options NFSCLIENT, NFSSERVER)
 	kernel sources have been removed. The .h files remain, since some
 	utilities include them. This will need to be fixed later.
 	If "mount -t oldnfs ..." is attempted, it will fail.
 	If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used,
 	the utilities will report errors.
 
 20141121:
 	The handling of LOCAL_LIB_DIRS has been altered to skip addition of
 	directories to top level SUBDIR variable when their parent
 	directory is included in LOCAL_DIRS.  Users with build systems with
 	such hierarchies and without SUBDIR entries in the parent
 	directory Makefiles should add them or add the directories to
 	LOCAL_DIRS.
 
 20141109:
 	faith(4) and faithd(8) have been removed from the base system. Faith
 	has been obsolete for a very long time.
 
 20141104:
 	vt(4), the new console driver, is enabled by default. It brings
 	support for Unicode and double-width characters, as well as
 	support for UEFI and integration with the KMS kernel video
 	drivers.
 
 	You may need to update your console settings in /etc/rc.conf,
 	most probably the keymap. During boot, /etc/rc.d/syscons will
 	indicate what you need to do.
 
 	vt(4) still has issues and lacks some features compared to
 	syscons(4). See the wiki for up-to-date information:
 	  https://wiki.freebsd.org/Newcons
 
 	If you want to keep using syscons(4), you can do so by adding
 	the following line to /boot/loader.conf:
 	  kern.vty=sc
 
 20141102:
 	pjdfstest has been integrated into kyua as an opt-in test suite.
 	Please see share/doc/pjdfstest/README for more details on how to
 	execute it.
 
 20141009:
 	gperf has been removed from the base system for architectures
 	that use clang. Ports that require gperf will obtain it from the
 	devel/gperf port.
 
 20140923:
 	pjdfstest has been moved from tools/regression/pjdfstest to
 	contrib/pjdfstest .
 
 20140922:
 	At svn r271982, the default linux compat kernel ABI has been adjusted
 	to 2.6.18 in support of the linux-c6 compat ports infrastructure
 	update.  If you wish to continue using the linux-f10 compat ports,
 	add compat.linux.osrelease=2.6.16 to your local sysctl.conf.  Users are
 	encouraged to update their linux-compat packages to linux-c6 during
 	their next update cycle.
 
 20140729:
 	The ofwfb driver, used to provide a graphics console on PowerPC when
 	using vt(4), no longer allows mmap() of all physical memory. This
 	will prevent Xorg on PowerPC with some ATI graphics cards from
 	initializing properly unless x11-servers/xorg-server is updated to
 	1.12.4_8 or newer.
 
 20140723:
 	The xdev targets have been converted to using TARGET and
 	TARGET_ARCH instead of XDEV and XDEV_ARCH.
 
 20140719:
 	The default unbound configuration has been modified to address
 	issues with reverse lookups on networks that use private
 	address ranges.  If you use the local_unbound service, run
 	"service local_unbound setup" as root to regenerate your
 	configuration, then "service local_unbound reload" to load the
 	new configuration.
 
 20140709:
 	The GNU texinfo and GNU info pages are no longer built and installed.
 	The WITH_INFO knob has been added to allow building and installing
 	them again.
 	UPDATE: see 20150102 entry on texinfo's removal
 
 20140708:
 	The GNU readline library is now an INTERNALLIB - that is, it is
 	statically linked into consumers (GDB and variants) in the base
 	system, and the shared library is no longer installed.  The
 	devel/readline port is available for third party software that
 	requires readline.
 
 20140702:
 	The Itanium architecture (ia64) has been removed from the list of
 	known architectures. This is the first step in the removal of the
 	architecture.
 
 20140701:
 	Commit r268115 has added NFSv4.1 server support, merged from
 	projects/nfsv4.1-server.  Since this includes changes to the
 	internal interfaces between the NFS related modules, a full
 	build of the kernel and modules will be necessary.
 	__FreeBSD_version has been bumped.
 
 20140629:
 	The WITHOUT_VT_SUPPORT kernel config knob has been renamed
 	WITHOUT_VT.  (The other _SUPPORT knobs have a consistent meaning
 	which differs from the behaviour controlled by this knob.)
 
 20140619:
 	The maximum length of the serial number in CTL was increased from 16
 	to 64 chars, which breaks the ABI.  All CTL-related tools, such as
 	ctladm and ctld, need to be rebuilt to work with a new kernel.
 
 20140606:
 	The libatf-c and libatf-c++ major versions were downgraded to 0 and
 	1 respectively to match the upstream numbers.  They were out of
 	sync because, when they were originally added to FreeBSD, the
 	upstream versions were not respected.  These libraries are private
 	and not yet built by default, so renumbering them should be a
 	non-issue.  However, unclean source trees will yield broken test
 	programs once the operator executes "make delete-old-libs" after a
 	"make installworld".
 
 	Additionally, the atf-sh binary was made private by moving it into
 	/usr/libexec/.  Already-built shell test programs will keep the
 	path to the old binary so they will break after "make delete-old"
 	is run.
 
 	If you are using WITH_TESTS=yes (not the default), wipe the object
 	tree and rebuild from scratch to prevent spurious test failures.
 	This is only needed once: the misnumbered libraries and misplaced
 	binaries have been added to OptionalObsoleteFiles.inc so they will
 	be removed during a clean upgrade.
 
 20140512:
 	Clang and llvm have been upgraded to 3.4.1 release.
 
 20140508:
 	We bogusly installed src.opts.mk in /usr/share/mk. This file should
 	be removed to avoid issues in the future (and has been added to
 	ObsoleteFiles.inc).
 
 20140505:
 	/etc/src.conf now affects only builds of the FreeBSD src tree. In the
 	past, it affected all builds that used the bsd.*.mk files. The old
 	behavior was a bug, but people may have relied upon it. To get this
 	behavior back, you can .include /etc/src.conf from /etc/make.conf
 	(which is still global and isn't changed). This also changes the
 	behavior of incremental builds inside the tree of individual
 	directories. Set MAKESYSPATH to ".../share/mk" to do that.
 	Although this has survived make universe and some upgrade scenarios,
 	other upgrade scenarios may have broken. At least one form of
 	temporary breakage was fixed with MAKESYSPATH settings for buildworld
 	as well... In cases where MAKESYSPATH isn't working with this
 	setting, you'll need to set it to the full path to your tree.
 
 	One side effect of all this cleaning up is that bsd.compiler.mk
 	is no longer implicitly included by bsd.own.mk. If you wish to
 	use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk
 	as well.
 
 20140430:
 	The lindev device has been removed since /dev/full has been made a
 	standard device.  __FreeBSD_version has been bumped.
 
 20140424:
 	The knob WITHOUT_VI was added to the base system, which controls
 	building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1)
 	in order to reorder files share/termcap and didn't build ex(1) as a
 	build tool, so building/installing with WITH_VI is highly advised for
 	build hosts for older releases.
 
 	This issue has been fixed in stable/9 and stable/10 in r277022 and
 	r276991, respectively.
 
 20140418:
 	The YES_HESIOD knob has been removed. It has been obsolete for
 	a decade. Please move to using WITH_HESIOD instead or your builds
 	will silently lack HESIOD.
 
 20140405:
 	The uart(4) driver has been changed with respect to its handling
 	of the low-level console. Previously the uart(4) driver prevented
 	any process from changing the baudrate or the CLOCAL and HUPCL
 	control flags. By removing the restrictions, operators can make
 	changes to the serial console port without having to reboot.
 	However, when getty(8) is started on the serial device that is
 	associated with the low-level console, a misconfigured terminal
 	line in /etc/ttys will now have a real impact.
 	Before upgrading the kernel, make sure that /etc/ttys has the
 	serial console device configured as 3wire without baudrate to
 	preserve the previous behaviour. E.g:
 	    ttyu0  "/usr/libexec/getty 3wire"  vt100  on  secure
 
 20140306:
 	Support for libwrap (TCP wrappers) in rpcbind was disabled by default
 	to improve performance.  To re-enable it, if needed, run rpcbind
 	with command line option -W.
 
 20140226:
 	Switched back to the GPL dtc compiler due to updates in the upstream
 	dts files not being supported by the BSDL dtc compiler. You will need
 	to rebuild your kernel toolchain to pick up the new compiler. Core dumps
 	may result while building dtb files during a kernel build if you fail
 	to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler.
 
 20140216:
 	Clang and llvm have been upgraded to 3.4 release.
 
 20140216:
 	The nve(4) driver has been removed.  Please use the nfe(4) driver
 	for NVIDIA nForce MCP Ethernet adapters instead.
 
 20140212:
 	An ABI incompatibility crept into the libc++ 3.4 import in r261283.
 	This could cause certain C++ applications using shared libraries built
 	against the previous version of libc++ to crash.  The incompatibility
 	has now been fixed, but any C++ applications or shared libraries built
 	between r261283 and r261801 should be recompiled.
 
 20140204:
 	OpenSSH will now ignore errors caused by the kernel lacking Capsicum
 	capability mode support.  Please note that enabling the feature in
 	the kernel is still highly recommended.
 
 20140131:
 	OpenSSH is now built with sandbox support, and will use sandbox as
 	the default privilege separation method.  This requires Capsicum
 	capability mode support in kernel.
 
 20140128:
 	The libelf and libdwarf libraries have been updated to newer
 	versions from upstream. Shared library version numbers for
 	these two libraries were bumped. Any ports or binaries
 	requiring these two libraries should be recompiled.
 	__FreeBSD_version is bumped to 1100006.
 
 20140110:
 	If a Makefile in a tests/ directory was auto-generating a Kyuafile
 	instead of providing an explicit one, this would prevent such
 	Makefile from providing its own Kyuafile in the future during
 	NO_CLEAN builds.  This has been fixed in the Makefiles but manual
 	intervention is needed to clean an objdir if you use NO_CLEAN:
 	  # find /usr/obj -name Kyuafile | xargs rm -f
 
 20131213:
 	The behavior of gss_pseudo_random() for the krb5 mechanism
 	has changed, for applications requesting a longer random string
 	than produced by the underlying enctype's pseudo-random() function.
 	In particular, the random string produced from a session key of
 	enctype aes128-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will
 	be different at the 17th octet and later, after this change.
 	The counter used in the PRF+ construction is now encoded as a
 	big-endian integer in accordance with RFC 4402.
 	__FreeBSD_version is bumped to 1100004.
 
 20131108:
 	The WITHOUT_ATF build knob has been removed and its functionality
 	has been subsumed into the more generic WITHOUT_TESTS.  If you were
 	using the former to disable the build of the ATF libraries, you
 	should change your settings to use the latter.
 
 20131025:
 	The default version of mtree is nmtree which is obtained from
 	NetBSD.  The output is generally the same, but may vary
 	slightly.  If you find you need identical output, adding
 	"-F freebsd9" to the command line should do the trick.  For the
 	time being, the old mtree is available as fmtree.
 
 20131014:
 	libbsdyml has been renamed to libyaml and moved to /usr/lib/private.
 	This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg
 	1.1.4_8 and verify that bsdyml is not linked in, before running "make
 	delete-old-libs":
 	  # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean
 	  or
 	  # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml
 
 20131010:
 	The rc.d/jail script has been updated to support the jail(8)
 	configuration file.  The "jail_<jname>_*" rc.conf(5) variables
 	for per-jail configuration are automatically converted to
 	/var/run/jail.<jname>.conf before the jail(8) utility is invoked.
 	This is transparently backward compatible.  See below about some
 	incompatibilities and the rc.conf(5) manual page for more details.
 
 	These variables are now deprecated in favor of the jail(8) configuration
 	file.  One can use the "rc.d/jail config <jname>" command to generate
 	a jail(8) configuration file in /var/run/jail.<jname>.conf without
 	running the jail(8) utility.   The default pathname of the
 	configuration file is /etc/jail.conf and can be specified by
 	using $jail_conf or $jail_<jname>_conf variables.
 
 	Please note that jail_devfs_ruleset accepts only an integer at
 	this time.  Please consider replacing the ruleset name
 	with an integer.
 
 20130930:
 	BIND has been removed from the base system.  If all you need
 	is a local resolver, simply enable and start the local_unbound
 	service instead.  Otherwise, several versions of BIND are
 	available in the ports tree.   The dns/bind99 port is one example.
 
 	With this change, nslookup(1) and dig(1) are no longer in the base
 	system.  Users should instead use host(1) and drill(1) which are
 	in the base system.  Alternatively, nslookup and dig can
 	be obtained by installing the dns/bind-tools port.
 
 20130916:
 	With the addition of unbound(8), a new unbound user is now
 	required during installworld.  "mergemaster -p" can be used to
 	add the user prior to installworld, as documented in the handbook.
 
 20130911:
 	OpenSSH is now built with DNSSEC support, and will by default
 	silently trust signed SSHFP records.  This can be controlled with
 	the VerifyHostKeyDNS client configuration setting.  DNSSEC support
 	can be disabled entirely with the WITHOUT_LDNS option in src.conf.
 
 20130906:
 	The GNU Compiler Collection and C++ standard library (libstdc++)
 	are no longer built by default on platforms where clang is the system
 	compiler.  You can enable them with the WITH_GCC and WITH_GNUCXX
 	options in src.conf.
 
 20130905:
 	The PROCDESC kernel option is now part of the GENERIC kernel
 	configuration and is required for rwhod(8) to work.
 	If you are using a custom kernel configuration, you should include
 	'options PROCDESC'.
 
 20130905:
 	The API and ABI related to the Capsicum framework were modified
 	in a backward-incompatible way. The userland libraries and programs
 	have to be recompiled to work with the new kernel. This includes the
 	following libraries and programs, but the whole buildworld is
 	advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl,
 	kdump, procstat, rwho, rwhod, uniq.
 
 20130903:
 	AES-NI intrinsic support has been added to gcc.  The AES-NI module
 	has been updated to use this support.  A new gcc is required to build
 	the aesni module on both i386 and amd64.
 
 20130821:
 	The PADLOCK_RNG and RDRAND_RNG kernel options are now devices.
 	Thus "device padlock_rng" and "device rdrand_rng" should be
 	used instead of "options PADLOCK_RNG" & "options RDRAND_RNG".
 
 20130813:
 	WITH_ICONV has been split into two feature sets.  WITH_ICONV now
 	enables just the iconv* functionality and is now on by default.
 	WITH_LIBICONV_COMPAT enables the libiconv api and link time
 	compatibility.  Set WITHOUT_ICONV to build the old way.
 	If you have been using WITH_ICONV before, you will very likely
 	need to turn on WITH_LIBICONV_COMPAT.
 
 20130806:
 	INVARIANTS option now enables DEBUG for code with OpenSolaris and
 	Illumos origin, including ZFS.  If you have INVARIANTS in your
 	kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG
 	explicitly.
 	DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS)
 	locks if WITNESS option was set.  Because that generated a lot of
 	witness(9) reports and all of them were believed to be false
 	positives, this is no longer done.  New option OPENSOLARIS_WITNESS
 	can be used to achieve the previous behavior.
 
 20130806:
 	Timer values in IPv6 data structures now use time_uptime instead
 	of time_second.  Although this is not a user-visible functional
 	change, userland utilities which directly use them---ndp(8),
 	rtadvd(8), and rtsold(8) in the base system---need to be updated
 	to r253970 or later.
 
 20130802:
 	find -delete can now delete the pathnames given as arguments,
 	instead of only files found below them or if the pathname did
 	not contain any slashes. Formerly, the following error message
 	would result:
 
 	find: -delete: <path>: relative path potentially not safe
 
 	Deleting the pathnames given as arguments can be prevented
 	without error messages using -mindepth 1 or by changing
 	directory and passing "." as argument to find. This works in the
 	old as well as the new version of find.
 
 20130726:
 	Behavior of devfs rules path matching has been changed.
 	Pattern is now always matched against fully qualified devfs
 	path and slash characters must be explicitly matched by
 	slashes in pattern (FNM_PATHNAME). Rulesets involving devfs
 	subdirectories must be reviewed.
 
 20130716:
 	The default ARM ABI has changed to the ARM EABI. The old ABI is
 	incompatible with the ARM EABI and all programs and modules will
 	need to be rebuilt to work with a new kernel.
 
 	To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set.
 
 	NOTE: Support for the old ABI will be removed in the future and
 	users are advised to upgrade.
 
 20130709:
 	pkg_install has been disconnected from the build.  If you really need
 	it, add WITH_PKGTOOLS to your src.conf(5).
 
 20130709:
 	Most of the network statistics structures were changed to be able
 	to keep 64-bit counters. Thus all tools that work with networking
 	statistics must be rebuilt (netstat(1), bsnmpd(1), etc.)
 
 20130629:
 	Fix targets that run multiple make's to use && rather than ;
 	so that subsequent steps depend on success of previous.
 
 	NOTE: if building 'universe' with -j* on stable/8 or stable/9
 	it would be better to start the build using bmake, to avoid
 	overloading the machine.
 
 20130618:
 	Fix a bug that allowed a tracing process (e.g. gdb) to write
 	to a memory-mapped file in the traced process's address space
 	even if neither the traced process nor the tracing process had
 	write access to that file.
 
 20130615:
 	CVS has been removed from the base system.  An exact copy
 	of the code is available from the devel/cvs port.
 
 20130613:
 	Some people report the following error after the switch to bmake:
 
 		make: illegal option -- J
 		usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable]
 			...
 		*** [buildworld] Error code 2
 
 	This is likely due to an old instance of make in
 	${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}),
 	which src/Makefile will use blindly if it exists.  If
 	you see the above error:
 
 		rm -rf `make -V MAKEPATH`
 
 	should resolve it.
 
 20130516:
 	Use bmake by default.
 	Whereas before one could choose to build with bmake via
 	-DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old
 	make. The goal is to remove these knobs for 10-RELEASE.
 
 	It is worth noting that bmake (like gmake) treats the command
 	line as the unit of failure, rather than statements within the
 	command line.  Thus '(cd some/where && dosomething)' is safer
 	than 'cd some/where; dosomething'. The '()' allows consistent
 	behavior in parallel build.
 
 20130429:
 	Fix a bug that allows NFS clients to issue READDIR on files.
 
 20130426:
 	The WITHOUT_IDEA option has been removed because
 	the IDEA patent expired.
 
 20130426:
 	The sysctl which controls TRIM support under ZFS has been renamed
 	from vfs.zfs.trim_disable to vfs.zfs.trim.enabled and has been
 	enabled by default.
 
 20130425:
 	The mergemaster command now uses the default MAKEOBJDIRPREFIX
 	rather than creating its own in the temporary directory in
 	order to allow access to bootstrapped versions of tools such as
 	install and mtree.  When upgrading from a version of FreeBSD where
 	the install command does not support -l, you will need to
 	install a new mergemaster command if mergemaster -p is required.
 	This can be accomplished with the command (cd src/usr.sbin/mergemaster
 	&& make install).
 
 20130404:
 	The legacy ATA stack, disabled and replaced by the new CAM-based one
 	since FreeBSD 9.0, has been completely removed from the sources.
 	Kernel modules atadisk and atapi*, and user-level tools atacontrol and
 	burncd, are removed.  Kernel option `options ATA_CAM` is now
 	permanently enabled and removed.
 
 20130319:
 	SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2)
 	and socketpair(2). Software, in particular Kerberos, may
 	automatically detect and use these during building. The resulting
 	binaries will not work on older kernels.
 
 20130308:
 	CTL_DISABLE has also been added to the sparc64 GENERIC (for further
 	information, see the respective 20130304 entry).
 
 20130304:
 	Recent commits to callout(9) changed the size of struct callout,
 	so the KBI is probably heavily disturbed. Also, some functions
 	in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced
 	by macros. Kernel modules using these KPIs will not load, so a rebuild
 	is required.
 
 	The ctl device has been re-enabled in GENERIC for i386 and amd64,
 	but does not initialize by default (because of the new CTL_DISABLE
 	option) to save memory.  To re-enable it, remove the CTL_DISABLE
 	option from the kernel config file or set kern.cam.ctl.disable=0
 	in /boot/loader.conf.
 
 20130301:
 	The ctl device has been disabled in GENERIC for i386 and amd64.
 	This was done due to the extra memory being allocated at system
 	initialisation time by the ctl driver which was only used if
 	a CAM target device was created.  This makes a FreeBSD system
 	unusable on 128MB or less of RAM.
 
 20130208:
 	A new compression method (lz4) has been merged to -HEAD.  Please
 	refer to zpool-features(7) for more information.
 
 	Please refer to the "ZFS notes" section of this file for information
 	on upgrading boot ZFS pools.
 
 20130129:
 	A BSD-licensed patch(1) variant has been added and is installed
 	as bsdpatch, with the GNU version remaining the default patch.
 	To invert the logic and use the BSD-licensed one as default,
 	while having the GNU version installed as gnupatch, rebuild
 	and install world with the WITH_BSD_PATCH knob set.
 
 20130121:
 	Due to the use of the new -l option to install(1) during build
 	and install, you must take care not to directly set the INSTALL
 	make variable in your /etc/make.conf, /etc/src.conf, or on the
 	command line.  If you wish to use the -C flag for all installs
 	you may be able to add INSTALL+=-C to /etc/make.conf or
 	/etc/src.conf.
 
 20130118:
 	The install(1) option -M has changed meaning and now takes an
 	argument that is a file or path to append logs to.  In the
 	unlikely event that -M was the last option on the command line
 	and the command line contained at least two files and a target
 	directory the first file will have logs appended to it.  The -M
 	option served little practical purpose in the last decade so its
 	use is expected to be extremely rare.
 
 20121223:
 	After switching to Clang as the default compiler some users of ZFS
 	on i386 systems started to experience stack overflow kernel panics.
 	Please consider using 'options KSTACK_PAGES=4' in such configurations.
 
 20121222:
 	GEOM_LABEL now mangles label names read from file system metadata.
 	Mangling affects labels containing spaces, non-printable characters,
 	'%' or '"'. Device names in /etc/fstab and other places may need to
 	be updated.
 
 20121217:
 	By default, only the 10 most recent kernel dumps will be saved.  To
 	restore the previous behaviour (no limit on the number of kernel dumps
 	stored in the dump directory) add the following line to /etc/rc.conf:
 
 		savecore_flags=""
 
 20121201:
 	With the addition of auditdistd(8), a new auditdistd user is now
 	required during installworld.  "mergemaster -p" can be used to
 	add the user prior to installworld, as documented in the handbook.
 
 20121117:
 	The sin6_scope_id member variable in struct sockaddr_in6 is now
 	filled by the kernel before passing the structure to the userland via
 	sysctl or routing socket.  This means the KAME-specific embedded scope
 	id in sin6_addr.s6_addr[2] is always cleared in userland application.
 	This behavior can be controlled by net.inet6.ip6.deembed_scopeid.
 	__FreeBSD_version is bumped to 1000025.
 
 20121105:
 	On i386 and amd64 systems WITH_CLANG_IS_CC is now the default.
 	This means that the world and kernel will be compiled with clang
 	and that clang will be installed as /usr/bin/cc, /usr/bin/c++,
 	and /usr/bin/cpp.  To disable this behavior and revert to building
 	with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions
 	of current may need to bootstrap WITHOUT_CLANG first if the clang
 	build fails (its compatibility window doesn't extend to the 9 stable
 	branch point).
 
 20121102:
 	The IPFIREWALL_FORWARD kernel option has been removed. Its
 	functionality is now turned on by default.
 
 20121023:
 	The ZERO_COPY_SOCKET kernel option has been removed and
 	split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP.
 	NB: SOCKET_SEND_COW uses the VM page based copy-on-write
 	mechanism which is not safe and may result in kernel crashes.
 	NB: The SOCKET_RECV_PFLIP mechanism is useless as no current
 	driver supports disposable external page sized mbuf storage.
 	Proper replacements for both zero-copy mechanisms are under
 	consideration and will eventually lead to complete removal
 	of the two kernel options.
 
 20121023:
 	The IPv4 network stack has been converted to network byte
 	order. The following modules need to be recompiled together
 	with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4),
 	pf(4), ipfw(4), ng_ipfw(4), stf(4).
 
 20121022:
 	Support for non-MPSAFE filesystems was removed from VFS. The
 	VFS_VERSION was bumped, so all filesystem modules must be
 	recompiled.
 
 20121018:
 	All the non-MPSAFE filesystems have been disconnected from
 	the build. The full list includes: codafs, hpfs, ntfs, nwfs,
 	portalfs, smbfs, xfs.
 
 20121016:
 	The interface cloning API and ABI has changed. The following
 	modules need to be recompiled together with kernel:
 	ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4),
 	vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4),
 	faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4).
 
 20121015:
 	The sdhci driver was split in two parts: sdhci (generic SD Host
 	Controller logic) and sdhci_pci (actual hardware driver).
 	No kernel config modifications are required, but if you
 	load sdhci as a module you must switch to sdhci_pci instead.
 
 20121014:
 	FUSE kernel and userland support has been imported into the base
 	system.
 
 20121013:
 	The GNU sort(1) program has been removed since the BSD-licensed
 	sort(1) has been the default for quite some time and no serious
 	problems have been reported.  The corresponding WITH_GNU_SORT
 	knob has also gone.
 
 20121006:
 	The pfil(9) API/ABI for the AF_INET family has been changed. The
 	packet filtering modules pf(4), ipfw(4) and ipfilter(4) need to be
 	recompiled with the new kernel.
 
 20121001:
 	The net80211(4) ABI has been changed to allow for improved driver
 	PS-POLL and power-save support.  All wireless drivers need to be
 	recompiled to work with the new kernel.
 
 20120913:
 	The random(4) support for the VIA hardware random number
 	generator (`PADLOCK') is no longer enabled unconditionally.
 	Add the padlock_rng device in the custom kernel config if
 	needed.  The GENERIC kernels on i386 and amd64 do include the
 	device, so the change only affects the custom kernel
 	configurations.
 
 20120908:
 	The pf(4) packet filter ABI has been changed. pfctl(8) and the
 	snmp_pf module need to be recompiled to work with the new kernel.
 
 20120828:
 	A new ZFS feature flag "com.delphix:empty_bpobj" has been merged
 	to -HEAD. Pools that have empty_bpobj in active state can not be
 	imported read-write with ZFS implementations that do not support
 	this feature. For more information read the zpool-features(5)
 	manual page.
 
 20120727:
 	The sparc64 ZFS loader has been changed to no longer try to auto-
 	detect ZFS providers based on diskN aliases but now requires these
 	to be explicitly listed in the OFW boot-device environment variable.
 
 20120712:
 	OpenSSL has been upgraded to 1.0.1c.  Any binaries requiring
 	libcrypto.so.6 or libssl.so.6 must be recompiled.  Also, there are
 	configuration changes.  Make sure to merge /etc/ssl/openssl.cnf.
 
 20120712:
 	The following sysctls and tunables have been renamed for consistency
 	with other variables:
 	  kern.cam.da.da_send_ordered   -> kern.cam.da.send_ordered
 	  kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered
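 
 	Existing settings in files such as /boot/loader.conf or
 	/etc/sysctl.conf may still use the old names.  The following is a
 	minimal sketch of a filter that rewrites them; the function name
 	rename_cam_sysctls is hypothetical and the filter only prints the
 	converted text, so review the output before replacing any file.

```shell
# rename_cam_sysctls: read configuration lines on stdin and write them to
# stdout with the two old CAM names replaced by the new ones.
# (Hypothetical helper, not part of the base system.)
rename_cam_sysctls() {
	sed -e 's/kern\.cam\.da\.da_send_ordered/kern.cam.da.send_ordered/g' \
	    -e 's/kern\.cam\.ada\.ada_send_ordered/kern.cam.ada.send_ordered/g'
}

# Example: preview the change for a loader.conf-style line.
echo 'kern.cam.ada.ada_send_ordered="0"' | rename_cam_sysctls
```

 	To preview a real file, feed it on stdin, for example
 	"rename_cam_sysctls < /boot/loader.conf".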
 
 20120628:
 	The sort utility has been replaced with BSD sort.  For now, GNU sort
 	is also available as "gnusort" or the default can be set back to
 	GNU sort by setting WITH_GNU_SORT.  In this case, BSD sort will be
 	installed as "bsdsort".
 
 20120611:
 	A new version of ZFS (pool version 5000) has been merged to -HEAD.
 	Starting with this version the old system of ZFS pool versioning
 	is superseded by "feature flags". This concept enables forward
 	compatibility against certain future changes in functionality of ZFS
 	pools. The first read-only compatible "feature flag" for ZFS pools
 	is named "com.delphix:async_destroy". For more information
 	read the new zpool-features(5) manual page.
 	Please refer to the "ZFS notes" section of this file for information
 	on upgrading boot ZFS pools.
 
 20120417:
 	The malloc(3) implementation embedded in libc now uses sources imported
 	as contrib/jemalloc.  The most disruptive API change is to
 	/etc/malloc.conf.  If your system has an old-style /etc/malloc.conf,
 	delete it prior to installworld, and optionally re-create it using the
 	new format after rebooting.  See malloc.conf(5) for details
 	(specifically the TUNING section and the "opt.*" entries in the MALLCTL
 	NAMESPACE section).
 
 20120328:
 	Big-endian MIPS TARGET_ARCH values no longer end in "eb".  mips64eb
 	is now spelled mips64.  mipsn32eb is now spelled mipsn32.  mipseb is
 	now spelled mips.  This is to aid compatibility with third-party
 	software that expects this naming scheme in uname(3).  Little-endian
 	settings are unchanged. If you are updating a big-endian mips64 machine
 	from before this change, you may need to set MACHINE_ARCH=mips64 in
 	your environment before the new build system will recognize your machine.
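 
 	Build scripts that still pass the old spellings can translate
 	them first.  The sketch below covers only the three renames listed
 	above; the function name new_mips_arch is hypothetical, and any
 	other value is passed through unchanged.

```shell
# new_mips_arch OLD: print the post-20120328 spelling of a MIPS TARGET_ARCH.
# (Hypothetical helper, not part of the build system.)
new_mips_arch() {
	case "$1" in
	mips64eb)  echo mips64 ;;
	mipsn32eb) echo mipsn32 ;;
	mipseb)    echo mips ;;
	*)         echo "$1" ;;	# little-endian and other names are unchanged
	esac
}

# Example: translate an old big-endian name.
new_mips_arch mips64eb
```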
 
 20120306:
 	Disable by default the option VFS_ALLOW_NONMPSAFE for all supported
 	platforms.
 
 20120229:
 	Unix domain sockets now behave "as expected" on nullfs(5). Previously
 	nullfs(5) did not pass through all behaviours to the underlying layer;
 	as a result, if we bound to a socket on the lower layer we could connect
 	only to the lower path, and if we bound to the upper layer we could
 	connect only to the upper path. With the new behavior one can connect
 	to both the lower and the upper paths regardless of which layer path
 	one binds to.
 
 20120211:
 	The getifaddrs upgrade path broken with 20111215 has been restored.
 	If you have upgraded in between 20111215 and 20120209 you need to
 	recompile libc again with your kernel.  You still need to recompile
 	world to be able to configure CARP but this restriction already
 	comes from 20111215.
 
 20120114:
 	The set_rcvar() function has been removed from /etc/rc.subr.  All
 	base and ports rc.d scripts have been updated, so if you have a
 	port installed with a script in /usr/local/etc/rc.d you can either
 	hand-edit the rcvar= line, or reinstall the port.
 
 	An easy way to handle the mass-update of /etc/rc.d:
 	rm /etc/rc.d/* && mergemaster -i
 
 20120109:
 	panic(9) now stops other CPUs on SMP systems, disables interrupts
 	on the current CPU and prevents other threads from running.
 	This behavior can be reverted using the kern.stop_scheduler_on_panic
 	tunable/sysctl.
 	The new behavior can be incompatible with kern.sync_on_panic.
 
 20111215:
 	The carp(4) facility has been changed significantly. Configuration
 	of the CARP protocol via ifconfig(8) has changed, as has the format
 	of CARP events submitted to devd(8). See manual pages
 	for more information. The arpbalance feature of carp(4) is currently
 	not supported.

 	The sizes of struct in_aliasreq and struct in6_aliasreq have changed.
 	User utilities using SIOCAIFADDR, SIOCAIFADDR_IN6, e.g. ifconfig(8),
 	need to be recompiled.
 
 20111122:
 	The acpi_wmi(4) status device /dev/wmistat has been renamed to
 	/dev/wmistat0.
 
 20111108:
 	The VFS_ALLOW_NONMPSAFE option has been added in order to
 	explicitly support non-MPSAFE filesystems.
 	It is on by default for all supported platforms at the present
 	time.
 
 20111101:
 	The broken amd(4) driver has been replaced with esp(4) in the amd64,
 	i386 and pc98 GENERIC kernel configuration files.
 
 20110930:
 	sysinstall has been removed.
 
 20110923:
 	The stable/9 branch was created in subversion.  This corresponds to the
 	RELENG_9 branch in CVS.
 
 COMMON ITEMS:
 
 	General Notes
 	-------------
 	Avoid using make -j when upgrading.  While generally safe, there are
 	sometimes problems using -j to upgrade.  If your upgrade fails with
 	-j, please try again without -j.  From time to time in the past there
 	have been problems using -j with buildworld and/or installworld.  This
 	is especially true when upgrading between "distant" versions (e.g. ones
 	that cross a major release boundary or several minor releases, or when
 	several months have passed on the -current branch).
 
 	Sometimes, obscure build problems are the result of environment
 	poisoning.  This can happen because the make utility reads its
 	environment when searching for values for global variables.  To run
 	your build attempts in an "environmental clean room", prefix all make
 	commands with 'env -i '.  See the env(1) manual page for more details.
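 
 	The effect of env -i can be seen with a small demonstration.  The
 	sketch below exports a deliberately bad CFLAGS value (a made-up
 	example of poisoning) and compares what a child process sees with
 	and without env -i; /bin/sh stands in for make so that the example
 	runs anywhere.

```shell
# Hypothetical poisoned variable, used only for this demonstration.
CFLAGS="-O9 -funsafe"
export CFLAGS

# Without env -i the child process inherits the value...
inherited=$(/bin/sh -c 'echo "${CFLAGS:-unset}"')
# ...with env -i it starts from an empty environment.
cleaned=$(env -i /bin/sh -c 'echo "${CFLAGS:-unset}"')

echo "inherited: ${inherited}"
echo "cleaned:   ${cleaned}"
```

 	In a real upgrade the same prefix would be applied to the make
 	commands themselves, e.g. "env -i make buildworld".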
 
 	When upgrading from one major version to another it is generally best
 	to upgrade to the latest code in the currently installed branch first,
 	then do an upgrade to the new branch. This is the best-tested upgrade
 	path, and has the highest probability of being successful.  Please try
 	this approach before reporting problems with a major version upgrade.
 
 	When upgrading a live system, having a root shell around before
 	installing anything can help undo problems. Not having a root shell
 	around can lead to problems if pam has changed too much from your
 	starting point to allow continued authentication after the upgrade.
 
 	ZFS notes
 	---------
 	When upgrading the boot ZFS pool to a new version, always follow
 	these two steps:
 
 	1.) recompile and reinstall the ZFS boot loader and boot block
 	(this is part of "make buildworld" and "make installworld")
 
 	2.) update the ZFS boot block on your boot drive
 
 	The following example updates the ZFS boot block on the first
 	partition (freebsd-boot) of a GPT partitioned drive ada0:
 	"gpart bootcode -p /boot/gptzfsboot -i 1 ada0"
 
 	Non-boot pools do not need these updates.
 
 	To build a kernel
 	-----------------
 	If you are updating from a prior version of FreeBSD (even one just
 	a few days old), you should follow this procedure.  It is the most
 	failsafe as it uses a /usr/obj tree with a fresh mini-buildworld:
 
 	make kernel-toolchain
 	make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE
 	make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE
 
 	To test a kernel once
 	---------------------
 	If you just want to boot a kernel once (because you are not sure
 	if it works, or if you want to boot a known bad kernel to provide
 	debugging information) run
 	make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel
 	nextboot -k testkernel
 
 	To just build a kernel when you know that it won't mess you up
 	--------------------------------------------------------------
 	This assumes you are already running a CURRENT system.  Replace
 	${arch} with the architecture of your machine (e.g. "i386",
 	"arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc).
 
 	cd src/sys/${arch}/conf
 	config KERNEL_NAME_HERE
 	cd ../compile/KERNEL_NAME_HERE
 	make depend
 	make
 	make install
 
 	If this fails, go to the "To build a kernel" section.
 
 	To rebuild everything and install it on the current system.
 	-----------------------------------------------------------
 	# Note: sometimes if you are running current you gotta do more than
 	# is listed here if you are upgrading from a really old current.
 
 	<make sure you have good level 0 dumps>
 	make buildworld
 	make kernel KERNCONF=YOUR_KERNEL_HERE
 							[1]
 	<reboot in single user>				[3]
 	mergemaster -Fp					[5]
 	make installworld
 	mergemaster -Fi					[4]
 	make delete-old					[6]
 	<reboot>
 
 	To cross-install current onto a separate partition
 	--------------------------------------------------
 	# In this approach we use a separate partition to hold
 	# current's root, 'usr', and 'var' directories.   A partition
 	# holding "/", "/usr" and "/var" should be about 2GB in
 	# size.
 
 	<make sure you have good level 0 dumps>
 	<boot into -stable>
 	make buildworld
 	make buildkernel KERNCONF=YOUR_KERNEL_HERE
 	<maybe newfs current's root partition>
 	<mount current's root partition on directory ${CURRENT_ROOT}>
 	make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC
 	make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd
 	make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT}
 	cp /etc/fstab ${CURRENT_ROOT}/etc/fstab 		   # if newfs'd
 	<edit ${CURRENT_ROOT}/etc/fstab to mount "/" from the correct partition>
 	<reboot into current>
 	<do a "native" rebuild/install as described in the previous section>
 	<maybe install compatibility libraries from ports/misc/compat*>
 	<reboot>
 
 
 	To upgrade in-place from stable to current
 	----------------------------------------------
 	<make sure you have good level 0 dumps>
 	make buildworld					[9]
 	make kernel KERNCONF=YOUR_KERNEL_HERE		[8]
 							[1]
 	<reboot in single user>				[3]
 	mergemaster -Fp					[5]
 	make installworld
 	mergemaster -Fi					[4]
 	make delete-old					[6]
 	<reboot>
 
 	Make sure that you've read the UPDATING file to understand the
 	tweaks to various things you need.  At this point in the life
 	cycle of current, things change often and you are on your own
 	to cope.  The defaults can also change, so please read ALL of
 	the UPDATING entries.
 
 	Also, if you are tracking -current, you must be subscribed to
 	freebsd-current@freebsd.org.  Make sure that before you update
 	your sources that you have read and understood all the recent
 	messages there.  If in doubt, please track -stable which has
 	far fewer pitfalls.
 
 	[1] If you have third party modules, such as vmware, you
 	should disable them at this point so they don't crash your
 	system on reboot.
 
 	[3] From the bootblocks, boot -s, and then do
 		fsck -p
 		mount -u /
 		mount -a
 		cd src
 		adjkerntz -i		# if CMOS is wall time
 	Also, when doing a major release upgrade, it is required that
 	you boot into single user mode to do the installworld.
 
 	[4] Note: This step is non-optional.  Failure to do this step
 	can result in a significant reduction in the functionality of the
 	system.  Attempting to do it by hand is not recommended and those
 	that pursue this avenue should read this file carefully, as well
 	as the archives of freebsd-current and freebsd-hackers mailing lists
 	for potential gotchas.  The -U option is also useful to consider.
 	See mergemaster(8) for more information.
 
 	[5] Usually this step is a noop.  However, from time to time
 	you may need to do this if you get unknown user in the following
 	step.  It never hurts to do it all the time.  You may need to
 	install a new mergemaster (cd src/usr.sbin/mergemaster && make
 	install) after the buildworld before this step if you last updated
 	from current before 20130425 or from -stable before 20130430.
 
 	[6] This only deletes old files and directories. Old libraries
 	can be deleted by "make delete-old-libs", but you have to make
 	sure that no program is using those libraries anymore.
 
 	[8] In order to have a kernel that can run the 4.x binaries needed to
 	do an installworld, you must include the COMPAT_FREEBSD4 option in
 	your kernel.  Failure to do so may leave you with a system that is
 	hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is
 	required to run the 5.x binaries on more recent kernels.  And so on
 	for COMPAT_FREEBSD6 and COMPAT_FREEBSD7.
 
 	Make sure that you merge any new devices from GENERIC since the
 	last time you updated your kernel config file.
 
 	[9] When checking out sources, you must include the -P flag to have
 	cvs prune empty directories.
 
 	If CPUTYPE is defined in your /etc/make.conf, make sure to use the
 	"?=" instead of the "=" assignment operator, so that buildworld can
 	override the CPUTYPE if it needs to.
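 
 	A quick way to see which form a make.conf uses is sketched below.
 	The function name check_cputype and its three result words are
 	hypothetical; it only inspects the file and changes nothing.

```shell
# check_cputype FILE: classify how FILE assigns CPUTYPE.
#   hard        - plain "=" assignment (buildworld cannot override it)
#   overridable - "?=" assignment, as recommended above
#   none        - no "=" or "?=" assignment of CPUTYPE found
# (Hypothetical helper, not part of the base system.)
check_cputype() {
	if grep -Eq '^[[:space:]]*CPUTYPE[[:space:]]*=' "$1"; then
		echo "hard"
	elif grep -Eq '^[[:space:]]*CPUTYPE[[:space:]]*\?=' "$1"; then
		echo "overridable"
	else
		echo "none"
	fi
}
```

 	For example, "check_cputype /etc/make.conf" printing "hard" would
 	suggest changing the assignment to "?=".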
 
 	MAKEOBJDIRPREFIX must be defined in an environment variable, and
 	not on the command line, or in /etc/make.conf.  buildworld will
 	warn if it is improperly defined.
 FORMAT:
 
 This file contains a list, in reverse chronological order, of major
 breakages in tracking -current.  It is not guaranteed to be a complete
 list of such breakages, and only contains entries since October 10, 2007.
 If you need to see UPDATING entries from before that date, you will need
 to fetch an UPDATING file from an older FreeBSD release.
 
 Copyright information:
 
 Copyright 1998-2009 M. Warner Losh.  All Rights Reserved.
 
 Redistribution, publication, translation and use, with or without
 modification, in full or in part, in any form or format of this
 document are permitted without further permission from the author.
 
 THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED.  IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT,
 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE.
 
 Contact Warner Losh if you have any questions about your use of
 this document.
 
 $FreeBSD$
Index: user/ngie/more-tests/bin/csh/config.h
===================================================================
--- user/ngie/more-tests/bin/csh/config.h	(revision 281584)
+++ user/ngie/more-tests/bin/csh/config.h	(revision 281585)
@@ -1,277 +1,277 @@
 /* $FreeBSD$ */
 /* config.h.  Generated from config.h.in by configure.  */
 /* config.h.in.  Generated from configure.in by autoheader.  */
 
 /* Define to the type of elements in the array set by `getgroups'. Usually
    this is either `int' or `gid_t'. */
 #define GETGROUPS_T gid_t
 
 /* Define to 1 if the `getpgrp' function requires zero arguments. */
 #define GETPGRP_VOID 1
 
 /* Define to 1 if you have the <auth.h> header file. */
 /* #undef HAVE_AUTH_H */
 
 /* Define to 1 if you have the <crypt.h> header file. */
 /* #undef HAVE_CRYPT_H */
 
 /* Define to 1 if you have the declaration of `crypt', and to 0 if you don't.
    */
 #define HAVE_DECL_CRYPT 1
 
 /* Define to 1 if you have the declaration of `environ', and to 0 if you
    don't. */
 #define HAVE_DECL_ENVIRON 0
 
 /* Define to 1 if you have the declaration of `gethostname', and to 0 if you
    don't. */
 #define HAVE_DECL_GETHOSTNAME 1
 
 /* Define to 1 if you have the declaration of `getpgrp', and to 0 if you
    don't. */
 #define HAVE_DECL_GETPGRP 1
 
 /* Define to 1 if you have the <dirent.h> header file, and it defines `DIR'.
    */
 #define HAVE_DIRENT_H 1
 
 /* Define to 1 if you have the `dup2' function. */
 #define HAVE_DUP2 1
 
 /* Define to 1 if you have the <features.h> header file. */
 /* #undef HAVE_FEATURES_H */
 
 /* Define to 1 if you have the `getauthid' function. */
 /* #undef HAVE_GETAUTHID */
 
 /* Define to 1 if you have the `getcwd' function. */
 #define HAVE_GETCWD 1
 
 /* Define to 1 if you have the `gethostname' function. */
 #define HAVE_GETHOSTNAME 1
 
 /* Define to 1 if you have the `getpwent' function. */
 #define HAVE_GETPWENT 1
 
 /* Define to 1 if you have the `getutent' function. */
 /* #undef HAVE_GETUTENT */
 
 /* Define to 1 if you have the `getutxent' function. */
 #define HAVE_GETUTXENT 1
 
 /* Define if you have the iconv() function and it works. */
 /* #undef HAVE_ICONV */
 
 /* Define to 1 if you have the <inttypes.h> header file. */
 #define HAVE_INTTYPES_H 1
 
 /* Define to 1 if the system has the type `long long'. */
 #define HAVE_LONG_LONG 1
 
 /* Define to 1 if you have the `mallinfo' function. */
 /* #undef HAVE_MALLINFO */
 
 /* Define to 1 if mbrtowc and mbstate_t are properly declared. */
 #define HAVE_MBRTOWC 1
 
 /* Define to 1 if you have the `memmove' function. */
 #define HAVE_MEMMOVE 1
 
 /* Define to 1 if you have the <memory.h> header file. */
 #define HAVE_MEMORY_H 1
 
 /* Define to 1 if you have the `memset' function. */
 #define HAVE_MEMSET 1
 
 /* Define to 1 if you have the `mkstemp' function. */
 #define HAVE_MKSTEMP 1
 
 /* Define to 1 if you have the <ndir.h> header file, and it defines `DIR'. */
 /* #undef HAVE_NDIR_H */
 
 /* Define to 1 if you have the `nice' function. */
 #define HAVE_NICE 1
 
 /* Define to 1 if you have the `nl_langinfo' function. */
 #define HAVE_NL_LANGINFO 1
 
 /* Define to 1 if you have the <paths.h> header file. */
 #define HAVE_PATHS_H 1
 
 /* Define to 1 if you have the `sbrk' function. */
 #define HAVE_SBRK 1
 
 /* Define to 1 if you have the `setpgid' function. */
 #define HAVE_SETPGID 1
 
 /* Define to 1 if you have the `setpriority' function. */
 #define HAVE_SETPRIORITY 1
 
 /* Define to 1 if you have the <shadow.h> header file. */
 /* #undef HAVE_SHADOW_H */
 
 /* Define to 1 if you have the <stdint.h> header file. */
 #define HAVE_STDINT_H 1
 
 /* Define to 1 if you have the <stdlib.h> header file. */
 #define HAVE_STDLIB_H 1
 
 /* Define to 1 if you have the `strcoll' function and it is properly defined.
    */
 #define HAVE_STRCOLL 1
 
 /* Define to 1 if you have the `strerror' function. */
 #define HAVE_STRERROR 1
 
 /* Define to 1 if you have the <strings.h> header file. */
 #define HAVE_STRINGS_H 1
 
 /* Define to 1 if you have the <string.h> header file. */
 #define HAVE_STRING_H 1
 
 /* Define to 1 if you have the `strstr' function. */
 #define HAVE_STRSTR 1
 
 /* Define to 1 if `d_ino' is a member of `struct dirent'. */
 #define HAVE_STRUCT_DIRENT_D_INO 1
 
 /* Define to 1 if `ss_family' is a member of `struct sockaddr_storage'. */
 #define HAVE_STRUCT_SOCKADDR_STORAGE_SS_FAMILY 1
 
 /* Define to 1 if `ut_host' is a member of `struct utmpx'. */
 #define HAVE_STRUCT_UTMPX_UT_HOST 1
 
 /* Define to 1 if `ut_tv' is a member of `struct utmpx'. */
 #define HAVE_STRUCT_UTMPX_UT_TV 1
 
 /* Define to 1 if `ut_user' is a member of `struct utmpx'. */
 #define HAVE_STRUCT_UTMPX_UT_USER 1
 
 /* Define to 1 if `ut_xtime' is a member of `struct utmpx'. */
 /* #undef HAVE_STRUCT_UTMPX_UT_XTIME */
 
 /* Define to 1 if `ut_host' is a member of `struct utmp'. */
 #define HAVE_STRUCT_UTMP_UT_HOST 1
 
 /* Define to 1 if `ut_tv' is a member of `struct utmp'. */
 #define HAVE_STRUCT_UTMP_UT_TV 1
 
 /* Define to 1 if `ut_user' is a member of `struct utmp'. */
 #define HAVE_STRUCT_UTMP_UT_USER 1
 
 /* Define to 1 if `ut_xtime' is a member of `struct utmp'. */
 /* #undef HAVE_STRUCT_UTMP_UT_XTIME */
 
 /* Define to 1 if you have the `sysconf' function. */
 #define HAVE_SYSCONF 1
 
 /* Define to 1 if you have the <sys/dir.h> header file, and it defines `DIR'.
    */
 /* #undef HAVE_SYS_DIR_H */
 
 /* Define to 1 if you have the <sys/ndir.h> header file, and it defines `DIR'.
    */
 /* #undef HAVE_SYS_NDIR_H */
 
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #define HAVE_SYS_STAT_H 1
 
 /* Define to 1 if you have the <sys/types.h> header file. */
 #define HAVE_SYS_TYPES_H 1
 
 /* Define to 1 if you have the <unistd.h> header file. */
 #define HAVE_UNISTD_H 1
 
 /* Define to 1 if you have the <utmpx.h> header file. */
 #define HAVE_UTMPX_H 1
 
 /* Define to 1 if you have the <utmp.h> header file. */
 /* #undef HAVE_UTMP_H */
 
 /* Define to 1 if you have the <wchar.h> header file. */
 #define HAVE_WCHAR_H 1
 
 /* Define to 1 if you have the <wctype.h> header file. */
 #define HAVE_WCTYPE_H 1
 
 /* Define to 1 if you have the `wcwidth' function. */
 #define HAVE_WCWIDTH 1
 
 /* Define as const if the declaration of iconv() needs const. */
-#define ICONV_CONST const
+#define ICONV_CONST
 
 /* Support NLS. */
 #define NLS 1
 
 /* Support NLS catalogs. */
 #define NLS_CATALOGS 1
 
 /* Define to the address where bug reports for this package should be sent. */
 #define PACKAGE_BUGREPORT "http://bugs.gw.com/"
 
 /* Define to the full name of this package. */
 #define PACKAGE_NAME "tcsh"
 
 /* Define to the full name and version of this package. */
 #define PACKAGE_STRING "tcsh 6.18.01"
 
 /* Define to the one symbol short name of this package. */
 #define PACKAGE_TARNAME "tcsh"
 
 /* Define to the home page for this package. */
 #define PACKAGE_URL ""
 
 /* Define to the version of this package. */
 #define PACKAGE_VERSION "6.18.01"
 
 /* Define to 1 if the `setpgrp' function takes no argument. */
 /* #undef SETPGRP_VOID */
 
 /* The size of `wchar_t', as computed by sizeof. */
 #define SIZEOF_WCHAR_T 4
 
 /* Define to 1 if the `S_IS*' macros in <sys/stat.h> do not work properly. */
 /* #undef STAT_MACROS_BROKEN */
 
 /* Define to 1 if you have the ANSI C header files. */
 #define STDC_HEADERS 1
 
 /* Define for Solaris 2.5.1 so the uint32_t typedef from <sys/synch.h>,
    <pthread.h>, or <semaphore.h> is not used. If the typedef were allowed, the
    #define below would cause a syntax error. */
 /* #undef _UINT32_T */
 
 /* Define to empty if `const' does not conform to ANSI C. */
 /* #undef const */
 
 /* Define to `int' if <sys/types.h> doesn't define. */
 /* #undef gid_t */
 
 /* Define to `int' if <sys/types.h> does not define. */
 /* #undef mode_t */
 
 /* Define to `unsigned int' if <sys/types.h> does not define. */
 /* #undef size_t */
 
 /* Define to `int' if neither <sys/types.h> nor <sys/socket.h> define. */
 /* #undef socklen_t */
 
 /* Define to `int' not defined in <sys/types.h>. */
 /* #undef ssize_t */
 
 /* Define to `int' if <sys/types.h> doesn't define. */
 /* #undef uid_t */
 
 /* Define to the type of an unsigned integer type of width exactly 32 bits if
    such a type exists and the standard includes do not define it. */
 /* #undef uint32_t */
 
 /* Define to empty if the keyword `volatile' does not work. Warning: valid
    code using `volatile' can become incorrect without. Disable with care. */
 /* #undef volatile */
 
 #include "config_p.h"
 #include "config_f.h"
 
 /* Work around a vendor issue where config_f.h is #undef'ing this setting */
 #define SYSMALLOC
Index: user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t
===================================================================
--- user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t	(revision 281584)
+++ user/ngie/more-tests/contrib/pjdfstest/tests/open/20.t	(revision 281585)
@@ -1,20 +1,24 @@
 #!/bin/sh
 # $FreeBSD: head/tools/regression/pjdfstest/tests/open/20.t 211352 2010-08-15 21:24:17Z pjd $
 
 desc="open returns ETXTBSY when the file is a pure procedure (shared text) file that is being executed and the open() system call requests write access"
 
 dir=`dirname $0`
 . ${dir}/../misc.sh
 
 [ "${os}:${fs}" = "FreeBSD:UFS" ] || quick_exit
 
 echo "1..4"
 
 n0=`namegen`
 
 cp -pf `which sleep` ${n0}
 ./${n0} 3 &
+while ! pkill -0 -f ./${n0}; do
+	sleep 0.1
+done
 expect ETXTBSY open ${n0} O_WRONLY
 expect ETXTBSY open ${n0} O_RDWR
 expect ETXTBSY open ${n0} O_RDONLY,O_TRUNC
+pkill -9 -f ./${n0}
 expect 0 unlink ${n0}
Index: user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t
===================================================================
--- user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t	(revision 281584)
+++ user/ngie/more-tests/contrib/pjdfstest/tests/truncate/11.t	(revision 281585)
@@ -1,18 +1,22 @@
 #!/bin/sh
 # $FreeBSD: head/tools/regression/pjdfstest/tests/truncate/11.t 211352 2010-08-15 21:24:17Z pjd $
 
 desc="truncate returns ETXTBSY the file is a pure procedure (shared text) file that is being executed"
 
 dir=`dirname $0`
 . ${dir}/../misc.sh
 
 [ "${os}" = "FreeBSD" ] || quick_exit
 
 echo "1..2"
 
 n0=`namegen`
 
 cp -pf `which sleep` ${n0}
 ./${n0} 3 &
+while ! pkill -0 -f ./${n0}; do
+	sleep 0.1
+done
 expect ETXTBSY truncate ${n0} 123
+pkill -9 -f ./${n0}
 expect 0 unlink ${n0}
Index: user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h
===================================================================
--- user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h	(revision 281584)
+++ user/ngie/more-tests/contrib/smbfs/include/netsmb/smb_lib.h	(revision 281585)
@@ -1,258 +1,258 @@
 /*
  * Copyright (c) 2000-2001 Boris Popov
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *    This product includes software developed by Boris Popov.
  * 4. Neither the name of the author nor the names of any co-contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $Id: smb_lib.h,v 1.24 2001/12/20 15:19:43 bp Exp $
  * $FreeBSD$
  */
 #ifndef _NETSMB_SMB_LIB_H_
 #define _NETSMB_SMB_LIB_H_
 
 #include <netsmb/smb.h>
 #include <netsmb/smb_dev.h>
 
 #ifndef SMB_CFG_FILE
 #define	SMB_CFG_FILE	"/usr/local/etc/nsmb.conf"
 #endif
 
 #define	STDPARAM_ARGS	'A':case 'B':case 'C':case 'E':case 'I': \
 		   case 'L':case 'M': \
 		   case 'N':case 'U':case 'R':case 'S':case 'T': \
 		   case 'W':case 'O':case 'P'
 
 #define STDPARAM_OPT	"A:BCE:I:L:M:NO:P:U:R:S:T:W:"
 
 /*
  * bits to indicate the source of error
  */
 #define	SMB_ERRTYPE_MASK	0xf0000
 #define	SMB_SYS_ERROR		0x00000
 #define SMB_RAP_ERROR		0x10000
 #define SMB_NB_ERROR		0x20000
 
 #ifndef min
 #define	min(a,b)	(((a)<(b)) ? (a) : (b))
 #endif
 
 #define getb(buf,ofs) 		(((const u_int8_t *)(buf))[ofs])
 #define setb(buf,ofs,val)	(((u_int8_t*)(buf))[ofs])=val
 #define getbw(buf,ofs)		((u_int16_t)(getb(buf,ofs)))
 #define getw(buf,ofs)		(*((u_int16_t*)(&((u_int8_t*)(buf))[ofs])))
 #define getdw(buf,ofs)		(*((u_int32_t*)(&((u_int8_t*)(buf))[ofs])))
 
 #if (BYTE_ORDER == LITTLE_ENDIAN)
 
 #define getwle(buf,ofs)	(*((u_int16_t*)(&((u_int8_t*)(buf))[ofs])))
 #define getdle(buf,ofs)	(*((u_int32_t*)(&((u_int8_t*)(buf))[ofs])))
 #define getwbe(buf,ofs)	(ntohs(getwle(buf,ofs)))
 #define getdbe(buf,ofs)	(ntohl(getdle(buf,ofs)))
 
 #define setwle(buf,ofs,val) getwle(buf,ofs)=val
 #define setwbe(buf,ofs,val) getwle(buf,ofs)=htons(val)
 #define setdle(buf,ofs,val) getdle(buf,ofs)=val
 #define setdbe(buf,ofs,val) getdle(buf,ofs)=htonl(val)
 
 #else	/* (BYTE_ORDER == LITTLE_ENDIAN) */
 
 #define getwbe(buf,ofs)	(*((u_int16_t*)(&((u_int8_t*)(buf))[ofs])))
 #define getdbe(buf,ofs)	(*((u_int32_t*)(&((u_int8_t*)(buf))[ofs])))
 #define getwle(buf,ofs)	(bswap16(getwbe(buf,ofs)))
 #define getdle(buf,ofs)	(bswap32(getdbe(buf,ofs)))
 
 #define setwbe(buf,ofs,val) getwbe(buf,ofs)=val
 #define setwle(buf,ofs,val) getwbe(buf,ofs)=bswap16(val)
 #define setdbe(buf,ofs,val) getdbe(buf,ofs)=val
 #define setdle(buf,ofs,val) getdbe(buf,ofs)=bswap32(val)
 
 #endif	/* (BYTE_ORDER == LITTLE_ENDIAN) */
 
 /*
  * SMB work context. Used to store all values which is necessary
  * to establish connection to an SMB server.
  */
 struct smb_ctx {
 	int		ct_flags;	/* SMBCF_ */
 	int		ct_fd;		/* handle of connection */
 	int		ct_parsedlevel;
 	int		ct_minlevel;
 	int		ct_maxlevel;
 	char *		ct_srvaddr;	/* hostname or IP address of server */
 	char		ct_locname[SMB_MAXUSERNAMELEN + 1];
 	const char *	ct_uncnext;
 	struct nb_ctx *	ct_nb;
 	struct smbioc_ossn	ct_ssn;
 	struct smbioc_oshare	ct_sh;
 	long		ct_smbtcpport;
 };
 
 #define	SMBCF_NOPWD		0x0001	/* don't ask for a password */
 #define	SMBCF_SRIGHTS		0x0002	/* share access rights was supplied */
 #define	SMBCF_LOCALE		0x0004	/* use current locale */
 #define	SMBCF_RESOLVED		0x8000	/* structure has been verified */
 
 /*
  * request handling structures
  */
 struct mbuf {
 	int		m_len;
 	int		m_maxlen;
 	char *		m_data;
 	struct mbuf *	m_next;
 };
 
 struct mbdata {
 	struct mbuf *	mb_top;
 	struct mbuf * 	mb_cur;
 	char *		mb_pos;
 	int		mb_count;
 };
 
 #define	M_ALIGNFACTOR	(sizeof(long))
 #define M_ALIGN(len)	(((len) + M_ALIGNFACTOR - 1) & ~(M_ALIGNFACTOR - 1))
 #define	M_BASESIZE	(sizeof(struct mbuf))
 #define	M_MINSIZE	(256 - M_BASESIZE)
 #define M_TOP(m)	((char*)(m) + M_BASESIZE)
 #define mtod(m,t)	((t)(m)->m_data)
 #define M_TRAILINGSPACE(m) ((m)->m_maxlen - (m)->m_len)
 
 struct smb_rq {
 	u_char		rq_cmd;
 	struct mbdata	rq_rq;
 	struct mbdata	rq_rp;
 	struct smb_ctx *rq_ctx;
 	int		rq_wcount;
 	int		rq_bcount;
 };
 
 struct smb_bitname {
 	u_int	bn_bit;
 	char	*bn_name;
 };
 
 extern struct rcfile *smb_rc;
 
 __BEGIN_DECLS
 
 struct sockaddr;
 
 int  smb_lib_init(void);
 int  smb_open_rcfile(void);
 void smb_error(const char *, int,...);
 char *smb_printb(char *, int, const struct smb_bitname *);
 void *smb_dumptree(void);
 
 /*
  * Context management
  */
 int  smb_ctx_init(struct smb_ctx *, int, char *[], int, int, int);
 void smb_ctx_done(struct smb_ctx *);
 int  smb_ctx_parseunc(struct smb_ctx *, const char *, int, const char **);
 int  smb_ctx_setcharset(struct smb_ctx *, const char *);
 int  smb_ctx_setserver(struct smb_ctx *, const char *);
 int  smb_ctx_setnbport(struct smb_ctx *, int);
 int  smb_ctx_setsmbport(struct smb_ctx *, int);
 int  smb_ctx_setuser(struct smb_ctx *, const char *);
 int  smb_ctx_setshare(struct smb_ctx *, const char *, int);
 int  smb_ctx_setscope(struct smb_ctx *, const char *);
 int  smb_ctx_setworkgroup(struct smb_ctx *, const char *);
 int  smb_ctx_setpassword(struct smb_ctx *, const char *);
 int  smb_ctx_setsrvaddr(struct smb_ctx *, const char *);
 int  smb_ctx_opt(struct smb_ctx *, int, const char *);
 int  smb_ctx_lookup(struct smb_ctx *, int, int);
 int  smb_ctx_login(struct smb_ctx *);
 int  smb_ctx_readrc(struct smb_ctx *);
 int  smb_ctx_resolve(struct smb_ctx *);
 int  smb_ctx_setflags(struct smb_ctx *, int, int, int);
 
-int  smb_smb_open_print_file(struct smb_ctx *, int, int, const char *, smbfh*);
+int  smb_smb_open_print_file(struct smb_ctx *, int, int, char *, smbfh*);
 int  smb_smb_close_print_file(struct smb_ctx *, smbfh);
 
 int  smb_read(struct smb_ctx *, smbfh, off_t, size_t, char *);
 int  smb_write(struct smb_ctx *, smbfh, off_t, size_t, const char *);
 
 #define smb_rq_getrequest(rqp)	(&(rqp)->rq_rq)
 #define smb_rq_getreply(rqp)	(&(rqp)->rq_rp)
 
 int  smb_rq_init(struct smb_ctx *, u_char, size_t, struct smb_rq **);
 void smb_rq_done(struct smb_rq *);
 void smb_rq_wend(struct smb_rq *);
 int  smb_rq_simple(struct smb_rq *);
-int  smb_rq_dmem(struct mbdata *, const char *, size_t);
-int  smb_rq_dstring(struct mbdata *, const char *);
+int  smb_rq_dmem(struct mbdata *, char *, size_t);
+int  smb_rq_dstring(struct mbdata *, char *);
 
 int  smb_t2_request(struct smb_ctx *, int, int, const char *,
 	int, void *, int, void *, int *, void *, int *, void *);
 
 char* smb_simplecrypt(char *dst, const char *src);
 int  smb_simpledecrypt(char *dst, const char *src);
 
 int  m_getm(struct mbuf *, size_t, struct mbuf **);
 int  m_lineup(struct mbuf *, struct mbuf **);
 int  mb_init(struct mbdata *, size_t);
 int  mb_initm(struct mbdata *, struct mbuf *);
 int  mb_done(struct mbdata *);
 int  mb_fit(struct mbdata *mbp, size_t size, char **pp);
 int  mb_put_uint8(struct mbdata *, u_int8_t);
 int  mb_put_uint16be(struct mbdata *, u_int16_t);
 int  mb_put_uint16le(struct mbdata *, u_int16_t);
 int  mb_put_uint32be(struct mbdata *, u_int32_t);
 int  mb_put_uint32le(struct mbdata *, u_int32_t);
 int  mb_put_int64be(struct mbdata *, int64_t);
 int  mb_put_int64le(struct mbdata *, int64_t);
 int  mb_put_mem(struct mbdata *, const char *, size_t);
 int  mb_put_pstring(struct mbdata *mbp, const char *s);
 int  mb_put_mbuf(struct mbdata *, struct mbuf *);
 
 int  mb_get_uint8(struct mbdata *, u_int8_t *);
 int  mb_get_uint16(struct mbdata *, u_int16_t *);
 int  mb_get_uint16le(struct mbdata *, u_int16_t *);
 int  mb_get_uint16be(struct mbdata *, u_int16_t *);
 int  mb_get_uint32(struct mbdata *, u_int32_t *);
 int  mb_get_uint32be(struct mbdata *, u_int32_t *);
 int  mb_get_uint32le(struct mbdata *, u_int32_t *);
 int  mb_get_int64(struct mbdata *, int64_t *);
 int  mb_get_int64be(struct mbdata *, int64_t *);
 int  mb_get_int64le(struct mbdata *, int64_t *);
 int  mb_get_mem(struct mbdata *, char *, size_t);
 
 extern u_char nls_lower[256], nls_upper[256];
 
 int   nls_setrecode(const char *, const char *);
 int   nls_setlocale(const char *);
-char* nls_str_toext(char *, const char *);
-char* nls_str_toloc(char *, const char *);
-void* nls_mem_toext(void *, const void *, int);
-void* nls_mem_toloc(void *, const void *, int);
+char* nls_str_toext(char *, char *);
+char* nls_str_toloc(char *, char *);
+void* nls_mem_toext(void *, void *, int);
+void* nls_mem_toloc(void *, void *, int);
 char* nls_str_upper(char *, const char *);
 char* nls_str_lower(char *, const char *);
 
 __END_DECLS
 
 #endif /* _NETSMB_SMB_LIB_H_ */
Index: user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c
===================================================================
--- user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c	(revision 281584)
+++ user/ngie/more-tests/contrib/smbfs/lib/smb/nls.c	(revision 281585)
@@ -1,223 +1,223 @@
 /*
  * Copyright (c) 2000-2001, Boris Popov
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *    This product includes software developed by Boris Popov.
  * 4. Neither the name of the author nor the names of any co-contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $Id: nls.c,v 1.10 2002/07/22 08:33:59 bp Exp $
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <ctype.h>
 #include <errno.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 #include <locale.h>
 #include <err.h>
 #include <netsmb/smb_lib.h>
 
 #ifdef HAVE_ICONV
 #include <iconv.h>
 #endif
 
 u_char nls_lower[256];
 u_char nls_upper[256];
 
 #ifdef HAVE_ICONV
 static iconv_t nls_toext, nls_toloc;
 #endif
 
 int
 nls_setlocale(const char *name)
 {
 	int i;
 
 	if (setlocale(LC_CTYPE, name) == NULL) {
 		warnx("can't set locale '%s'\n", name);
 		return EINVAL;
 	}
 	for (i = 0; i < 256; i++) {
 		nls_lower[i] = tolower(i);
 		nls_upper[i] = toupper(i);
 	}
 	return 0;
 }
 
 int
 nls_setrecode(const char *local, const char *external)
 {
 #ifdef HAVE_ICONV
 	iconv_t icd;
 
 	if (nls_toext)
 		iconv_close(nls_toext);
 	if (nls_toloc)
 		iconv_close(nls_toloc);
 	nls_toext = nls_toloc = (iconv_t)0;
 	icd = iconv_open(external, local);
 	if (icd == (iconv_t)-1)
 		return errno;
 	nls_toext = icd;
 	icd = iconv_open(local, external);
 	if (icd == (iconv_t)-1) {
 		iconv_close(nls_toext);
 		nls_toext = (iconv_t)0;
 		return errno;
 	}
 	nls_toloc = icd;
 	return 0;
 #else
 	return ENOENT;
 #endif
 }
 
 char *
-nls_str_toloc(char *dst, const char *src)
+nls_str_toloc(char *dst, char *src)
 {
 #ifdef HAVE_ICONV
 	char *p = dst;
 	size_t inlen, outlen;
 
 	if (nls_toloc == (iconv_t)0)
 		return strcpy(dst, src);
 	inlen = outlen = strlen(src);
 	iconv(nls_toloc, NULL, NULL, &p, &outlen);
 	while (iconv(nls_toloc, &src, &inlen, &p, &outlen) == -1) {
 		*p++ = *src++;
 		inlen--;
 		outlen--;
 	}
 	*p = 0;
 	return dst;
 #else
 	return strcpy(dst, src);
 #endif
 }
 
 char *
-nls_str_toext(char *dst, const char *src)
+nls_str_toext(char *dst, char *src)
 {
 #ifdef HAVE_ICONV
 	char *p = dst;
 	size_t inlen, outlen;
 
 	if (nls_toext == (iconv_t)0)
 		return strcpy(dst, src);
 	inlen = outlen = strlen(src);
 	iconv(nls_toext, NULL, NULL, &p, &outlen);
 	while (iconv(nls_toext, &src, &inlen, &p, &outlen) == -1) {
 		*p++ = *src++;
 		inlen--;
 		outlen--;
 	}
 	*p = 0;
 	return dst;
 #else
 	return strcpy(dst, src);
 #endif
 }
 
 void *
-nls_mem_toloc(void *dst, const void *src, int size)
+nls_mem_toloc(void *dst, void *src, int size)
 {
 #ifdef HAVE_ICONV
 	char *p = dst;
-	const char *s = src;
+	char *s = src;
 	size_t inlen, outlen;
 
 	if (size == 0)
 		return NULL;
 
 	if (nls_toloc == (iconv_t)0)
 		return memcpy(dst, src, size);
 	inlen = outlen = size;
 	iconv(nls_toloc, NULL, NULL, &p, &outlen);
 	while (iconv(nls_toloc, &s, &inlen, &p, &outlen) == -1) {
 		*p++ = *s++;
 		inlen--;
 		outlen--;
 	}
 	return dst;
 #else
 	return memcpy(dst, src, size);
 #endif
 }
 
 void *
-nls_mem_toext(void *dst, const void *src, int size)
+nls_mem_toext(void *dst, void *src, int size)
 {
 #ifdef HAVE_ICONV
 	char *p = dst;
-	const char *s = src;
+	char *s = src;
 	size_t inlen, outlen;
 
 	if (size == 0)
 		return NULL;
 
 	if (nls_toext == (iconv_t)0)
 		return memcpy(dst, src, size);
 
 	inlen = outlen = size;
 	iconv(nls_toext, NULL, NULL, &p, &outlen);
 	while (iconv(nls_toext, &s, &inlen, &p, &outlen) == -1) {
 		*p++ = *s++;
 		inlen--;
 		outlen--;
 	}
 	return dst;
 #else
 	return memcpy(dst, src, size);
 #endif
 }
 
 char *
 nls_str_upper(char *dst, const char *src)
 {
 	char *p = dst;
 
 	while (*src)
 		*dst++ = toupper(*src++);
 	*dst = 0;
 	return p;
 }
 
 char *
 nls_str_lower(char *dst, const char *src)
 {
 	char *p = dst;
 
 	while (*src)
 		*dst++ = tolower(*src++);
 	*dst = 0;
 	return p;
 }
Index: user/ngie/more-tests/contrib/smbfs/lib/smb/print.c
===================================================================
--- user/ngie/more-tests/contrib/smbfs/lib/smb/print.c	(revision 281584)
+++ user/ngie/more-tests/contrib/smbfs/lib/smb/print.c	(revision 281585)
@@ -1,97 +1,97 @@
 /*
  * Copyright (c) 2000, Boris Popov
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *    This product includes software developed by Boris Popov.
  * 4. Neither the name of the author nor the names of any co-contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $Id: print.c,v 1.4 2001/04/16 04:33:01 bp Exp $
  */
 #include <sys/param.h>
 #include <sys/sysctl.h>
 #include <sys/ioctl.h>
 #include <sys/time.h>
 #include <sys/mount.h>
 #include <fcntl.h>
 #include <ctype.h>
 #include <errno.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 #include <pwd.h>
 #include <grp.h>
 #include <unistd.h>
 
 /*#include <netnb/netbios.h>*/
 
 #include <netsmb/smb_lib.h>
 #include <netsmb/smb_conn.h>
 #include <cflib.h>
 
 int
 smb_smb_open_print_file(struct smb_ctx *ctx, int setuplen, int mode,
-	const char *ident, smbfh *fhp)
+	char *ident, smbfh *fhp)
 {
 	struct smb_rq *rqp;
 	struct mbdata *mbp;
 	int error;
 
 	error = smb_rq_init(ctx, SMB_COM_OPEN_PRINT_FILE, 2, &rqp);
 	if (error)
 		return error;
 	mbp = smb_rq_getrequest(rqp);
 	mb_put_uint16le(mbp, setuplen);
 	mb_put_uint16le(mbp, mode);
 	smb_rq_wend(rqp);
 	mb_put_uint8(mbp, SMB_DT_ASCII);
 	smb_rq_dstring(mbp, ident);
 	error = smb_rq_simple(rqp);
 	if (!error) {
 		mbp = smb_rq_getreply(rqp);
 		mb_get_uint16(mbp, fhp);
 	}
 	smb_rq_done(rqp);
 	return error;
 }
 
 int
 smb_smb_close_print_file(struct smb_ctx *ctx, smbfh fh)
 {
 	struct smb_rq *rqp;
 	struct mbdata *mbp;
 	int error;
 
 	error = smb_rq_init(ctx, SMB_COM_CLOSE_PRINT_FILE, 0, &rqp);
 	if (error)
 		return error;
 	mbp = smb_rq_getrequest(rqp);
 	mb_put_mem(mbp, (char*)&fh, 2);
 	smb_rq_wend(rqp);
 	error = smb_rq_simple(rqp);
 	smb_rq_done(rqp);
 	return error;
 }
Index: user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c
===================================================================
--- user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c	(revision 281584)
+++ user/ngie/more-tests/contrib/smbfs/lib/smb/rq.c	(revision 281585)
@@ -1,180 +1,180 @@
 /*
  * Copyright (c) 2000, Boris Popov
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *    This product includes software developed by Boris Popov.
  * 4. Neither the name of the author nor the names of any co-contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $Id: rq.c,v 1.7 2001/04/16 04:33:01 bp Exp $
  * $FreeBSD$
  */
 #include <sys/param.h>
 #include <sys/ioctl.h>
 #include <sys/errno.h>
 #include <sys/stat.h>
 #include <ctype.h>
 #include <err.h>
 #include <stdio.h>
 #include <unistd.h>
 #include <string.h>
 #include <stdlib.h>
 #include <sysexits.h>
 
 #include <sys/mchain.h>
 
 #include <netsmb/smb_lib.h>
 #include <netsmb/smb_conn.h>
 #include <netsmb/smb_rap.h>
 
 
 int
 smb_rq_init(struct smb_ctx *ctx, u_char cmd, size_t rpbufsz, struct smb_rq **rqpp)
 {
 	struct smb_rq *rqp;
 
 	rqp = malloc(sizeof(*rqp));
 	if (rqp == NULL)
 		return ENOMEM;
 	bzero(rqp, sizeof(*rqp));
 	rqp->rq_cmd = cmd;
 	rqp->rq_ctx = ctx;
 	mb_init(&rqp->rq_rq, M_MINSIZE);
 	mb_init(&rqp->rq_rp, rpbufsz);
 	*rqpp = rqp;
 	return 0;
 }
 
 void
 smb_rq_done(struct smb_rq *rqp)
 {
 	mb_done(&rqp->rq_rp);
 	mb_done(&rqp->rq_rq);
 	free(rqp);
 }
 
 void
 smb_rq_wend(struct smb_rq *rqp)
 {
 	if (rqp->rq_rq.mb_count & 1)
 		smb_error("smbrq_wend: odd word count\n", 0);
 	rqp->rq_wcount = rqp->rq_rq.mb_count / 2;
 	rqp->rq_rq.mb_count = 0;
 }
 
 int
-smb_rq_dmem(struct mbdata *mbp, const char *src, size_t size)
+smb_rq_dmem(struct mbdata *mbp, char *src, size_t size)
 {
 	struct mbuf *m;
 	char * dst;
 	int cplen, error;
 
 	if (size == 0)
 		return 0;
 	m = mbp->mb_cur;
 	if ((error = m_getm(m, size, &m)) != 0)
 		return error;
 	while (size > 0) {
 		cplen = M_TRAILINGSPACE(m);
 		if (cplen == 0) {
 			m = m->m_next;
 			continue;
 		}
 		if (cplen > (int)size)
 			cplen = size;
 		dst = mtod(m, char *) + m->m_len;
 		nls_mem_toext(dst, src, cplen);
 		size -= cplen;
 		src += cplen;
 		m->m_len += cplen;
 		mbp->mb_count += cplen;
 	}
 	mbp->mb_pos = mtod(m, char *) + m->m_len;
 	mbp->mb_cur = m;
 	return 0;
 }
 
 int
-smb_rq_dstring(struct mbdata *mbp, const char *s)
+smb_rq_dstring(struct mbdata *mbp, char *s)
 {
 	return smb_rq_dmem(mbp, s, strlen(s) + 1);
 }
 
 int
 smb_rq_simple(struct smb_rq *rqp)
 {
 	struct smbioc_rq krq;
 	struct mbdata *mbp;
 	char *data;
 
 	mbp = smb_rq_getrequest(rqp);
 	m_lineup(mbp->mb_top, &mbp->mb_top);
 	data = mtod(mbp->mb_top, char*);
 	bzero(&krq, sizeof(krq));
 	krq.ioc_cmd = rqp->rq_cmd;
 	krq.ioc_twc = rqp->rq_wcount;
 	krq.ioc_twords = data;
 	krq.ioc_tbc = mbp->mb_count;
 	krq.ioc_tbytes = data + rqp->rq_wcount * 2;
 	mbp = smb_rq_getreply(rqp);
 	krq.ioc_rpbufsz = mbp->mb_top->m_maxlen;
 	krq.ioc_rpbuf = mtod(mbp->mb_top, char *);
 	if (ioctl(rqp->rq_ctx->ct_fd, SMBIOC_REQUEST, &krq) == -1)
 		return errno;
 	mbp->mb_top->m_len = krq.ioc_rwc * 2 + krq.ioc_rbc;
 	rqp->rq_wcount = krq.ioc_rwc;
 	rqp->rq_bcount = krq.ioc_rbc;
 	return 0;
 }
 
 int
 smb_t2_request(struct smb_ctx *ctx, int setup, int setupcount,
 	const char *name,
 	int tparamcnt, void *tparam,
 	int tdatacnt, void *tdata,
 	int *rparamcnt, void *rparam,
 	int *rdatacnt, void *rdata)
 {
 	struct smbioc_t2rq krq;
 
 	bzero(&krq, sizeof(krq));
 	krq.ioc_setup[0] = setup;
 	krq.ioc_setupcnt = setupcount;
 	krq.ioc_name = (char *)name;
 	krq.ioc_tparamcnt = tparamcnt;
 	krq.ioc_tparam = tparam;
 	krq.ioc_tdatacnt = tdatacnt;
 	krq.ioc_tdata = tdata;
 	krq.ioc_rparamcnt = *rparamcnt;
 	krq.ioc_rparam = rparam;
 	krq.ioc_rdatacnt = *rdatacnt;
 	krq.ioc_rdata = rdata;
 	if (ioctl(ctx->ct_fd, SMBIOC_T2RQ, &krq) == -1)
 		return errno;
 	*rparamcnt = krq.ioc_rparamcnt;
 	*rdatacnt = krq.ioc_rdatacnt;
 	return 0;
 }
Index: user/ngie/more-tests/etc/login.conf
===================================================================
--- user/ngie/more-tests/etc/login.conf	(revision 281584)
+++ user/ngie/more-tests/etc/login.conf	(revision 281585)
@@ -1,321 +1,321 @@
 # login.conf - login class capabilities database.
 #
 # Remember to rebuild the database after each change to this file:
 #
 #	cap_mkdb /etc/login.conf
 #
 # This file controls resource limits, accounting limits and
 # default user environment settings.
 #
 # $FreeBSD$
 #
 
 # Default settings effectively disable resource limits, see the
 # examples below for a starting point to enable them.
 
 # defaults
 # These settings are used by login(1) by default for classless users
 # Note that entries like "cputime" set both "cputime-cur" and "cputime-max"
 #
 # Note that since a colon ':' is used to separate capability entries,
 # a \c escape sequence must be used to embed a literal colon in the
 # value or name of a capability (see the ``CGETNUM AND CGETSTR SYNTAX
 # AND SEMANTICS'' section of getcap(3) for more escape sequences).
 
 default:\
 	:passwd_format=sha512:\
 	:copyright=/etc/COPYRIGHT:\
 	:welcome=/etc/motd:\
-	:setenv=MAIL=/var/mail/$,BLOCKSIZE=K:LC_COLLATE=C:\
+	:setenv=MAIL=/var/mail/$,BLOCKSIZE=K,LC_COLLATE=C:\
 	:path=/sbin /bin /usr/sbin /usr/bin /usr/local/sbin /usr/local/bin ~/bin:\
 	:nologin=/var/run/nologin:\
 	:cputime=unlimited:\
 	:datasize=unlimited:\
 	:stacksize=unlimited:\
 	:memorylocked=64K:\
 	:memoryuse=unlimited:\
 	:filesize=unlimited:\
 	:coredumpsize=unlimited:\
 	:openfiles=unlimited:\
 	:maxproc=unlimited:\
 	:sbsize=unlimited:\
 	:vmemoryuse=unlimited:\
 	:swapuse=unlimited:\
 	:pseudoterminals=unlimited:\
 	:kqueues=unlimited:\
 	:priority=0:\
 	:ignoretime@:\
 	:umask=022:
 
 
 #
 # A collection of common class names - forward them all to 'default'
 # (login would normally do this anyway, but having a class name
 #  here suppresses the diagnostic)
 #
 standard:\
 	:tc=default:
 xuser:\
 	:tc=default:
 staff:\
 	:tc=default:
 daemon:\
 	:memorylocked=128M:\
 	:tc=default:
 news:\
 	:tc=default:
 dialer:\
 	:tc=default:
 
 #
 # Root can always login
 #
 # N.B.  login_getpwclass(3) will use this entry for the root account,
 #       in preference to 'default'.
 root:\
 	:ignorenologin:\
 	:memorylocked=unlimited:\
 	:tc=default:
 
 #
 # Russian Users Accounts. Setup proper environment variables.
 #
 russian|Russian Users Accounts:\
 	:charset=UTF-8:\
 	:lang=ru_RU.UTF-8:\
 	:tc=default:
 
 
 ######################################################################
 ######################################################################
 ##
 ## Example entries
 ##
 ######################################################################
 ######################################################################
 
 ## Example defaults
 ## These settings are used by login(1) by default for classless users
 ## Note that entries like "cputime" set both "cputime-cur" and "cputime-max"
 #
 #default:\
 #	:cputime=infinity:\
 #	:datasize-cur=22M:\
 #	:stacksize-cur=8M:\
 #	:memorylocked-cur=10M:\
 #	:memoryuse-cur=30M:\
 #	:filesize=infinity:\
 #	:coredumpsize=infinity:\
 #	:maxproc-cur=64:\
 #	:openfiles-cur=64:\
 #	:priority=0:\
 #	:requirehome@:\
 #	:umask=022:\
 #	:tc=auth-defaults:
 #
 #
 ##
 ## standard - standard user defaults
 ##
 #standard:\
 #	:copyright=/etc/COPYRIGHT:\
 #	:welcome=/etc/motd:\
 #	:setenv=MAIL=/var/mail/$,BLOCKSIZE=K:\
 #	:path=~/bin /bin /usr/bin /usr/local/bin:\
 #	:manpath=/usr/share/man /usr/local/man:\
 #	:nologin=/var/run/nologin:\
 #	:cputime=1h30m:\
 #	:datasize=8M:\
 #	:vmemoryuse=100M:\
 #	:stacksize=2M:\
 #	:memorylocked=4M:\
 #	:memoryuse=8M:\
 #	:filesize=8M:\
 #	:coredumpsize=8M:\
 #	:openfiles=24:\
 #	:maxproc=32:\
 #	:priority=0:\
 #	:requirehome:\
 #	:passwordtime=90d:\
 #	:umask=002:\
 #	:ignoretime@:\
 #	:tc=default:
 #
 #
 ##
 ## users of X (needs more resources!)
 ##
 #xuser:\
 #	:manpath=/usr/share/man /usr/local/man:\
 #	:cputime=4h:\
 #	:datasize=12M:\
 #	:vmemoryuse=infinity:\
 #	:stacksize=4M:\
 #	:filesize=8M:\
 #	:memoryuse=16M:\
 #	:openfiles=32:\
 #	:maxproc=48:\
 #	:tc=standard:
 #
 #
 ##
 ## Staff users - few restrictions and allow login anytime
 ##
 #staff:\
 #	:ignorenologin:\
 #	:ignoretime:\
 #	:requirehome@:\
 #	:accounted@:\
 #	:path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
 #	:umask=022:\
 #	:tc=standard:
 #
 #
 ##
 ## root - fallback for root logins
 ##
 #root:\
 #	:path=~/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
 #	:cputime=infinity:\
 #	:datasize=infinity:\
 #	:stacksize=infinity:\
 #	:memorylocked=infinity:\
 #	:memoryuse=infinity:\
 #	:filesize=infinity:\
 #	:coredumpsize=infinity:\
 #	:openfiles=infinity:\
 #	:maxproc=infinity:\
 #	:memoryuse-cur=32M:\
 #	:maxproc-cur=64:\
 #	:openfiles-cur=1024:\
 #	:priority=0:\
 #	:requirehome@:\
 #	:umask=022:\
 #	:tc=auth-root-defaults:
 #
 #
 ##
 ## Settings used by /etc/rc
 ##
 #daemon:\
 #	:coredumpsize@:\
 #	:coredumpsize-cur=0:\
 #	:datasize=infinity:\
 #	:datasize-cur@:\
 #	:maxproc=512:\
 #	:maxproc-cur@:\
 #	:memoryuse-cur=64M:\
 #	:memorylocked-cur=64M:\
 #	:openfiles=1024:\
 #	:openfiles-cur@:\
 #	:stacksize=16M:\
 #	:stacksize-cur@:\
 #	:tc=default:
 #
 #
 ##
 ## Settings used by news subsystem
 ##
 #news:\
 #	:path=/usr/local/news/bin /bin /sbin /usr/bin /usr/sbin /usr/local/bin /usr/local/sbin:\
 #	:cputime=infinity:\
 #	:filesize=128M:\
 #	:datasize-cur=64M:\
 #	:stacksize-cur=32M:\
 #	:coredumpsize-cur=0:\
 #	:maxmemorysize-cur=128M:\
 #	:memorylocked=32M:\
 #	:maxproc=128:\
 #	:openfiles=256:\
 #	:tc=default:
 #
 #
 ##
 ## The dialer class should be used for a dialup PPP account
 ## Welcome messages/news suppressed
 ##
 #dialer:\
 #	:hushlogin:\
 #	:requirehome@:\
 #	:cputime=unlimited:\
 #	:filesize=2M:\
 #	:datasize=2M:\
 #	:stacksize=4M:\
 #	:coredumpsize=0:\
 #	:memoryuse=4M:\
 #	:memorylocked=1M:\
 #	:maxproc=16:\
 #	:openfiles=32:\
 #	:tc=standard:
 #
 #
 ##
 ## Site full-time 24/7 PPP connection
 ## - no time accounting, restricted to access via dialin lines
 ##
 #site:\
 #	:ignoretime:\
 #	:passwordtime@:\
 #	:refreshtime@:\
 #	:refreshperiod@:\
 #	:sessionlimit@:\
 #	:autodelete@:\
 #	:expireperiod@:\
 #	:graceexpire@:\
 #	:gracetime@:\
 #	:warnexpire@:\
 #	:warnpassword@:\
 #	:idletime@:\
 #	:sessiontime@:\
 #	:daytime@:\
 #	:weektime@:\
 #	:monthtime@:\
 #	:warntime@:\
 #	:accounted@:\
 #	:tc=dialer:\
 #	:tc=staff:
 #
 #
 ##
 ## Example standard accounting entries for subscriber levels
 ##
 #
 #subscriber|Subscribers:\
 #	:accounted:\
 #	:refreshtime=180d:\
 #	:refreshperiod@:\
 #	:sessionlimit@:\
 #	:autodelete=30d:\
 #	:expireperiod=180d:\
 #	:graceexpire=7d:\
 #	:gracetime=10m:\
 #	:warnexpire=7d:\
 #	:warnpassword=7d:\
 #	:idletime=30m:\
 #	:sessiontime=4h:\
 #	:daytime=6h:\
 #	:weektime=40h:\
 #	:monthtime=120h:\
 #	:warntime=4h:\
 #	:tc=standard:
 #
 #
 ##
 ## Subscriber accounts. These accounts have their login times
 ## accounted and have access limits applied.
 ##
 #subppp|PPP Subscriber Accounts:\
 #	:tc=dialer:\
 #	:tc=subscriber:
 #
 #
 #subshell|Shell Subscriber Accounts:\
 #	:tc=subscriber:
 #
 ##
 ## If you want some of the accounts to use traditional UNIX DES based
 ## password hashes.
 ##
 #des_users:\
 #	:passwd_format=des:\
 #	:tc=default:
Index: user/ngie/more-tests/etc/rc.d/hostid_save
===================================================================
--- user/ngie/more-tests/etc/rc.d/hostid_save	(revision 281584)
+++ user/ngie/more-tests/etc/rc.d/hostid_save	(revision 281585)
@@ -1,28 +1,35 @@
 #!/bin/sh
 #
 # $FreeBSD$
 #
 
 # PROVIDE: hostid_save
 # REQUIRE: root
 # KEYWORD: nojail
 
 . /etc/rc.subr
 
 name="hostid_save"
 start_cmd="hostid_save"
 stop_cmd=":"
 rcvar="hostid_enable"
 
 hostid_save()
 {
-	if [ ! -r ${hostid_file} ]; then
-		$SYSCTL_N kern.hostuuid > ${hostid_file}
-		if [ $? -ne 0 ]; then
-			warn "could not store hostuuid in ${hostid_file}."
+	current_hostid=`$SYSCTL_N kern.hostuuid`
+
+	if [ -r ${hostid_file} ]; then
+		read saved_hostid < ${hostid_file}
+		if [ ${saved_hostid} = ${current_hostid} ]; then
+			exit 0
 		fi
+	fi
+
+	echo ${current_hostid} > ${hostid_file}
+	if [ $? -ne 0 ]; then
+		warn "could not store hostuuid in ${hostid_file}."
 	fi
 }
 
 load_rc_config $name
 run_rc_command "$1"
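Note on the hostid_save change above: the new logic reads the current UUID first, returns early if the saved copy already matches, and otherwise rewrites the file. A minimal stand-alone sketch of that flow follows. It makes several assumptions that are not in the commit: the function name save_hostid, passing the UUID and file as arguments instead of using sysctl and ${hostid_file}, quoting every expansion, and using return instead of exit.

```sh
#!/bin/sh
# Hypothetical stand-alone version of the compare-then-write logic.
# $1 = current UUID (the rc script obtains it from sysctl kern.hostuuid)
# $2 = file in which to store it
save_hostid()
{
	current=$1
	file=$2

	if [ -r "$file" ]; then
		saved=
		# read fails on an empty file; treat that as "no saved value"
		read -r saved < "$file" || :
		if [ "$saved" = "$current" ]; then
			return 0	# already up to date, leave the file alone
		fi
	fi

	if ! echo "$current" > "$file"; then
		echo "could not store hostuuid in $file." >&2
		return 1
	fi
}

tmp=$(mktemp)
save_hostid "11111111-2222-3333-4444-555555555555" "$tmp"	# empty file: writes
save_hostid "11111111-2222-3333-4444-555555555555" "$tmp"	# same value: no write
save_hostid "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" "$tmp"	# differs: overwrites
cat "$tmp"
rm -f "$tmp"
```

Quoting the operands of the comparison keeps the test well-formed when the saved file is empty, which the unquoted form in the hunk does not guarantee.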
Index: user/ngie/more-tests/etc
===================================================================
--- user/ngie/more-tests/etc	(revision 281584)
+++ user/ngie/more-tests/etc	(revision 281585)

Property changes on: user/ngie/more-tests/etc
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/etc:r281414-281584
Index: user/ngie/more-tests/include/iconv.h
===================================================================
--- user/ngie/more-tests/include/iconv.h	(revision 281584)
+++ user/ngie/more-tests/include/iconv.h	(revision 281585)
@@ -1,133 +1,133 @@
 /*	$FreeBSD$	*/
 /*	$NetBSD: iconv.h,v 1.6 2005/02/03 04:39:32 perry Exp $	*/
 
 /*-
  * Copyright (c) 2003 Citrus Project,
  * Copyright (c) 2009, 2010 Gabor Kovesdan <gabor@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #ifndef _ICONV_H_
 #define _ICONV_H_
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <wchar.h>
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #ifdef __cplusplus
 typedef	bool	__iconv_bool;
 #elif __STDC_VERSION__ >= 199901L
 typedef	_Bool	__iconv_bool;
 #else
 typedef	int	__iconv_bool;
 #endif
 
 struct __tag_iconv_t;
 typedef	struct __tag_iconv_t	*iconv_t;
 
 __BEGIN_DECLS
 iconv_t	iconv_open(const char *, const char *);
-size_t	iconv(iconv_t, const char ** __restrict,
+size_t	iconv(iconv_t, char ** __restrict,
 	      size_t * __restrict, char ** __restrict,
 	      size_t * __restrict);
 int	iconv_close(iconv_t);
 /*
  * non-portable interfaces for iconv
  */
 int	__iconv_get_list(char ***, size_t *, __iconv_bool);
 void	__iconv_free_list(char **, size_t);
-size_t	__iconv(iconv_t, const char **, size_t *, char **,
+size_t	__iconv(iconv_t, char **, size_t *, char **,
 		     size_t *, __uint32_t, size_t *);
 #define __ICONV_F_HIDE_INVALID	0x0001
 
 /*
  * GNU interfaces for iconv
  */
 typedef struct {
 	void	*spaceholder[64];
 } iconv_allocation_t;
 
 int	 iconv_open_into(const char *, const char *, iconv_allocation_t *);
 void	 iconv_set_relocation_prefix(const char *, const char *);
 
 /*
  * iconvctl() request macros
  */
 #define ICONV_TRIVIALP		0
 #define	ICONV_GET_TRANSLITERATE	1
 #define	ICONV_SET_TRANSLITERATE	2
 #define ICONV_GET_DISCARD_ILSEQ	3
 #define ICONV_SET_DISCARD_ILSEQ	4
 #define ICONV_SET_HOOKS		5
 #define ICONV_SET_FALLBACKS	6
 #define ICONV_GET_ILSEQ_INVALID	128
 #define ICONV_SET_ILSEQ_INVALID	129
 
 typedef void (*iconv_unicode_char_hook) (unsigned int mbr, void *data);
 typedef void (*iconv_wide_char_hook) (wchar_t wc, void *data);
 
 struct iconv_hooks {
 	iconv_unicode_char_hook		 uc_hook;
 	iconv_wide_char_hook		 wc_hook;
 	void				*data;
 };
 
 /*
  * Fallbacks aren't supported but type definitions are provided for
  * source compatibility.
  */
 typedef void (*iconv_unicode_mb_to_uc_fallback) (const char*,
 		size_t, void (*write_replacement) (const unsigned int *,
 		size_t, void*),	void*, void*);
 typedef void (*iconv_unicode_uc_to_mb_fallback) (unsigned int,
 		void (*write_replacement) (const char *, size_t, void*),
 		void*, void*);
 typedef void (*iconv_wchar_mb_to_wc_fallback) (const char*, size_t,
 		void (*write_replacement) (const wchar_t *, size_t, void*),
 		void*, void*);
 typedef void (*iconv_wchar_wc_to_mb_fallback) (wchar_t,
 		void (*write_replacement) (const char *, size_t, void*),
 		void*, void*);
 
 struct iconv_fallbacks {
 	iconv_unicode_mb_to_uc_fallback	 mb_to_uc_fallback;
 	iconv_unicode_uc_to_mb_fallback  uc_to_mb_fallback;
 	iconv_wchar_mb_to_wc_fallback	 mb_to_wc_fallback;
 	iconv_wchar_wc_to_mb_fallback	 wc_to_mb_fallback;
 	void				*data;
 };
 
 
 void		 iconvlist(int (*do_one) (unsigned int, const char * const *,
 		    void *), void *);
 const char	*iconv_canonicalize(const char *);
 int		 iconvctl(iconv_t, int, void *);
 __END_DECLS
 
 #endif /* !_ICONV_H_ */
Index: user/ngie/more-tests/include
===================================================================
--- user/ngie/more-tests/include	(revision 281584)
+++ user/ngie/more-tests/include	(revision 281585)

Property changes on: user/ngie/more-tests/include
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/include:r281414-281584
Index: user/ngie/more-tests/lib/libarchive/Makefile
===================================================================
--- user/ngie/more-tests/lib/libarchive/Makefile	(revision 281584)
+++ user/ngie/more-tests/lib/libarchive/Makefile	(revision 281585)
@@ -1,414 +1,414 @@
 # $FreeBSD$
 .include <src.opts.mk>
 
 LIBARCHIVEDIR=	${.CURDIR}/../../contrib/libarchive
 
 LIB=	archive
 
 LIBADD=	z bz2 lzma bsdxml
 CFLAGS+= -DHAVE_BZLIB_H=1 -DHAVE_LIBLZMA=1 -DHAVE_LZMA_H=1
 
 # FreeBSD SHLIB_MAJOR value is managed as part of the FreeBSD system.
 # It has no real relation to the libarchive version number.
 SHLIB_MAJOR= 6
 
 CFLAGS+=	-DPLATFORM_CONFIG_H=\"${.CURDIR}/config_freebsd.h\"
 CFLAGS+=	-I${.OBJDIR}
 
 .if ${MK_OPENSSL} != "no"
 CFLAGS+=	-DWITH_OPENSSL
 LIBADD+=	crypto
 .else
 LIBADD+=	md
 .endif
 
 .if ${MK_ICONV} != "no"
 # TODO: This can be changed back to CFLAGS once iconv works correctly
 # with statically linked binaries.
-SHARED_CFLAGS+=	-DHAVE_ICONV=1 -DHAVE_ICONV_H=1 -DICONV_CONST=const
+SHARED_CFLAGS+=	-DHAVE_ICONV=1 -DHAVE_ICONV_H=1 -DICONV_CONST=
 .endif
 
 .if ${MACHINE_ARCH:Marm*} != "" || ${MACHINE_ARCH:Mmips*} != "" || \
 	${MACHINE_ARCH:Msparc64*} != ""
 NO_WCAST_ALIGN=	yes
 .if ${MACHINE_ARCH:M*64*} == ""
 CFLAGS+=	-DPPMD_32BIT
 .endif
 .endif
 NO_WCAST_ALIGN.clang=
 
 .ifndef COMPAT_32BIT
 beforeinstall:
 	${INSTALL} -C -o ${LIBOWN} -g ${LIBGRP} -m ${LIBMODE} \
 		${.CURDIR}/libarchive.pc ${DESTDIR}${LIBDATADIR}/pkgconfig
 .endif
 
 .PATH: ${LIBARCHIVEDIR}/libarchive
 
 # Headers to be installed in /usr/include
 INCS=	archive.h archive_entry.h
 
 # Sources to be compiled.
 SRCS=	archive_acl.c					\
 	archive_check_magic.c				\
 	archive_cmdline.c				\
 	archive_crypto.c				\
 	archive_entry.c					\
 	archive_entry_copy_stat.c			\
 	archive_entry_link_resolver.c			\
 	archive_entry_sparse.c				\
 	archive_entry_stat.c				\
 	archive_entry_strmode.c				\
 	archive_entry_xattr.c				\
 	archive_getdate.c				\
 	archive_match.c					\
 	archive_options.c				\
 	archive_pathmatch.c				\
 	archive_ppmd7.c					\
 	archive_rb.c					\
 	archive_read.c					\
 	archive_read_append_filter.c			\
 	archive_read_data_into_fd.c			\
 	archive_read_disk_entry_from_file.c		\
 	archive_read_disk_posix.c			\
 	archive_read_disk_set_standard_lookup.c		\
 	archive_read_extract.c				\
 	archive_read_open_fd.c				\
 	archive_read_open_file.c			\
 	archive_read_open_filename.c			\
 	archive_read_open_memory.c			\
 	archive_read_set_format.c			\
 	archive_read_set_options.c			\
 	archive_read_support_filter_all.c		\
 	archive_read_support_filter_bzip2.c		\
 	archive_read_support_filter_compress.c		\
 	archive_read_support_filter_gzip.c		\
 	archive_read_support_filter_grzip.c		\
 	archive_read_support_filter_lrzip.c		\
 	archive_read_support_filter_lzop.c		\
 	archive_read_support_filter_none.c		\
 	archive_read_support_filter_program.c		\
 	archive_read_support_filter_rpm.c		\
 	archive_read_support_filter_uu.c		\
 	archive_read_support_filter_xz.c		\
 	archive_read_support_format_7zip.c		\
 	archive_read_support_format_all.c		\
 	archive_read_support_format_ar.c		\
 	archive_read_support_format_by_code.c		\
 	archive_read_support_format_cab.c		\
 	archive_read_support_format_cpio.c		\
 	archive_read_support_format_empty.c		\
 	archive_read_support_format_iso9660.c		\
 	archive_read_support_format_lha.c		\
 	archive_read_support_format_mtree.c		\
 	archive_read_support_format_rar.c		\
 	archive_read_support_format_raw.c		\
 	archive_read_support_format_tar.c		\
 	archive_read_support_format_xar.c		\
 	archive_read_support_format_zip.c		\
 	archive_string.c				\
 	archive_string_sprintf.c			\
 	archive_util.c					\
 	archive_virtual.c				\
 	archive_write.c					\
 	archive_write_add_filter.c			\
 	archive_write_disk_acl.c			\
 	archive_write_disk_set_standard_lookup.c	\
 	archive_write_disk_posix.c			\
 	archive_write_open_fd.c				\
 	archive_write_open_file.c			\
 	archive_write_open_filename.c			\
 	archive_write_open_memory.c			\
 	archive_write_add_filter_b64encode.c		\
 	archive_write_add_filter_by_name.c		\
 	archive_write_add_filter_bzip2.c		\
 	archive_write_add_filter_compress.c		\
 	archive_write_add_filter_grzip.c		\
 	archive_write_add_filter_gzip.c			\
 	archive_write_add_filter_lrzip.c		\
 	archive_write_add_filter_lzop.c			\
 	archive_write_add_filter_none.c			\
 	archive_write_add_filter_program.c		\
 	archive_write_add_filter_uuencode.c		\
 	archive_write_add_filter_xz.c			\
 	archive_write_set_format.c			\
 	archive_write_set_format_7zip.c			\
 	archive_write_set_format_ar.c			\
 	archive_write_set_format_by_name.c		\
 	archive_write_set_format_cpio.c			\
 	archive_write_set_format_cpio_newc.c		\
 	archive_write_set_format_gnutar.c		\
 	archive_write_set_format_iso9660.c		\
 	archive_write_set_format_mtree.c		\
 	archive_write_set_format_pax.c			\
 	archive_write_set_format_shar.c			\
 	archive_write_set_format_ustar.c		\
 	archive_write_set_format_v7tar.c		\
 	archive_write_set_format_xar.c			\
 	archive_write_set_format_zip.c			\
 	archive_write_set_options.c			\
 	filter_fork_posix.c
 
 # Man pages to be installed.
 MAN=	archive_entry.3					\
 	archive_entry_acl.3				\
 	archive_entry_linkify.3				\
 	archive_entry_paths.3				\
 	archive_entry_perms.3				\
 	archive_entry_stat.3				\
 	archive_entry_time.3				\
 	archive_read.3					\
 	archive_read_data.3				\
 	archive_read_disk.3				\
 	archive_read_extract.3				\
 	archive_read_filter.3				\
 	archive_read_format.3				\
 	archive_read_free.3				\
 	archive_read_header.3				\
 	archive_read_new.3				\
 	archive_read_open.3				\
 	archive_read_set_options.3			\
 	archive_util.3					\
 	archive_write.3					\
 	archive_write_blocksize.3			\
 	archive_write_data.3				\
 	archive_write_disk.3				\
 	archive_write_filter.3				\
 	archive_write_finish_entry.3			\
 	archive_write_format.3				\
 	archive_write_free.3				\
 	archive_write_header.3				\
 	archive_write_new.3				\
 	archive_write_open.3				\
 	archive_write_set_options.3			\
 	cpio.5						\
 	libarchive.3					\
 	libarchive_changes.3				\
 	libarchive_internals.3				\
 	libarchive-formats.5				\
 	tar.5
 
 # Symlink the man pages under each function name.
 MLINKS+=	archive_entry.3 archive_entry_clear.3
 MLINKS+=	archive_entry.3 archive_entry_clone.3
 MLINKS+=	archive_entry.3 archive_entry_free.3
 MLINKS+=	archive_entry.3 archive_entry_new.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_add_entry.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_add_entry_w.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_clear.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_count.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_next.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_next_w.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_reset.3
 MLINKS+=	archive_entry_acl.3 archive_entry_acl_text_w.3
 MLINKS+=	archive_entry_linkify.3 archive_entry_linkresolver.3
 MLINKS+=	archive_entry_linkify.3 archive_entry_linkresolver_new.3
 MLINKS+=	archive_entry_linkify.3 archive_entry_linkresolver_set_strategy.3
 MLINKS+=	archive_entry_linkify.3 archive_entry_linkresolver_free.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_hardlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_hardlink_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_link.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_link_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_pathname.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_pathname_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_sourcepath.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_symlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_copy_symlink_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_hardlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_hardlink_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_pathname.3
 MLINKS+=	archive_entry_paths.3 archive_entry_pathname_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_set_hardlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_set_link.3
 MLINKS+=	archive_entry_paths.3 archive_entry_set_pathname.3
 MLINKS+=	archive_entry_paths.3 archive_entry_set_symlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_symlink.3
 MLINKS+=	archive_entry_paths.3 archive_entry_symlink_w.3
 MLINKS+=	archive_entry_paths.3 archive_entry_update_symlink_utf8.3
 MLINKS+=	archive_entry_paths.3 archive_entry_update_hardlink_utf8.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_fflags_text.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_fflags_text_w.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_gname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_gname_w.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_uname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_copy_uname_w.3
 MLINKS+=	archive_entry_perms.3 archive_entry_fflags.3
 MLINKS+=	archive_entry_perms.3 archive_entry_fflags_text.3
 MLINKS+=	archive_entry_perms.3 archive_entry_gid.3
 MLINKS+=	archive_entry_perms.3 archive_entry_gname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_gname_w.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_fflags.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_gid.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_gname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_perm.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_perm.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_uid.3
 MLINKS+=	archive_entry_perms.3 archive_entry_set_uname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_strmode.3
 MLINKS+=	archive_entry_perms.3 archive_entry_uid.3
 MLINKS+=	archive_entry_perms.3 archive_entry_uname.3
 MLINKS+=	archive_entry_perms.3 archive_entry_uname_w.3
 MLINKS+=	archive_entry_perms.3 archive_entry_update_gname_utf8.3
 MLINKS+=	archive_entry_perms.3 archive_entry_update_uname_utf8.3
 MLINKS+=	archive_entry_stat.3 archive_entry_copy_stat.3
 MLINKS+=	archive_entry_stat.3 archive_entry_dev.3
 MLINKS+=	archive_entry_stat.3 archive_entry_dev_is_set.3
 MLINKS+=	archive_entry_stat.3 archive_entry_devmajor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_devminor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_filetype.3
 MLINKS+=	archive_entry_stat.3 archive_entry_ino.3
 MLINKS+=	archive_entry_stat.3 archive_entry_ino64.3
 MLINKS+=	archive_entry_stat.3 archive_entry_ino_is_set.3
 MLINKS+=	archive_entry_stat.3 archive_entry_mode.3
 MLINKS+=	archive_entry_stat.3 archive_entry_nlink.3
 MLINKS+=	archive_entry_stat.3 archive_entry_rdev.3
 MLINKS+=	archive_entry_stat.3 archive_entry_rdevmajor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_rdevminor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_dev.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_devmajor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_devminor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_filetype.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_ino.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_ino64.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_mode.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_nlink.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_rdev.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_rdevmajor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_rdevminor.3
 MLINKS+=	archive_entry_stat.3 archive_entry_set_size.3
 MLINKS+=	archive_entry_stat.3 archive_entry_size.3
 MLINKS+=	archive_entry_stat.3 archive_entry_size_is_set.3
 MLINKS+=	archive_entry_stat.3 archive_entry_unset_size.3
 MLINKS+=	archive_entry_time.3 archive_entry_atime.3
 MLINKS+=	archive_entry_time.3 archive_entry_atime_is_set.3
 MLINKS+=	archive_entry_time.3 archive_entry_atime_nsec.3
 MLINKS+=	archive_entry_time.3 archive_entry_birthtime.3
 MLINKS+=	archive_entry_time.3 archive_entry_birthtime_is_set.3
 MLINKS+=	archive_entry_time.3 archive_entry_birthtime_nsec.3
 MLINKS+=	archive_entry_time.3 archive_entry_ctime.3
 MLINKS+=	archive_entry_time.3 archive_entry_ctime_is_set.3
 MLINKS+=	archive_entry_time.3 archive_entry_ctime_nsec.3
 MLINKS+=	archive_entry_time.3 archive_entry_mtime.3
 MLINKS+=	archive_entry_time.3 archive_entry_mtime_is_set.3
 MLINKS+=	archive_entry_time.3 archive_entry_mtime_nsec.3
 MLINKS+=	archive_entry_time.3 archive_entry_set_atime.3
 MLINKS+=	archive_entry_time.3 archive_entry_set_birthtime.3
 MLINKS+=	archive_entry_time.3 archive_entry_set_ctime.3
 MLINKS+=	archive_entry_time.3 archive_entry_set_mtime.3
 MLINKS+=	archive_entry_time.3 archive_entry_unset_atime.3
 MLINKS+=	archive_entry_time.3 archive_entry_unset_birthtime.3
 MLINKS+=	archive_entry_time.3 archive_entry_unset_ctime.3
 MLINKS+=	archive_entry_time.3 archive_entry_unset_mtime.3
 MLINKS+=	archive_read_data.3 archive_read_data_block.3
 MLINKS+=	archive_read_data.3 archive_read_data_into_fd.3
 MLINKS+=	archive_read_data.3 archive_read_data_skip.3
 MLINKS+=	archive_read_header.3 archive_read_next_header.3
 MLINKS+=	archive_read_header.3 archive_read_next_header2.3
 MLINKS+=	archive_read_extract.3 archive_read_extract2.3
 MLINKS+=	archive_read_extract.3 archive_read_extract_set_progress_callback.3
 MLINKS+=	archive_read_extract.3 archive_read_extract_set_skip_file.3
 MLINKS+=	archive_read_open.3 archive_read_open2.3
 MLINKS+=	archive_read_open.3 archive_read_open_FILE.3
 MLINKS+=	archive_read_open.3 archive_read_open_fd.3
 MLINKS+=	archive_read_open.3 archive_read_open_file.3
 MLINKS+=	archive_read_open.3 archive_read_open_filename.3
 MLINKS+=	archive_read_open.3 archive_read_open_memory.3
 MLINKS+=	archive_read_free.3 archive_read_close.3
 MLINKS+=	archive_read_free.3 archive_read_finish.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_all.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_bzip2.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_compress.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_gzip.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_lzma.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_none.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_xz.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_program.3
 MLINKS+=	archive_read_filter.3 archive_read_support_filter_program_signature.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_7zip.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_all.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_ar.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_by_code.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_cab.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_cpio.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_empty.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_iso9660.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_lha.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_mtree.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_rar.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_raw.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_tar.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_xar.3
 MLINKS+=	archive_read_format.3 archive_read_support_format_zip.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_entry_from_file.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_gname.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_new.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_gname_lookup.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_standard_lookup.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_symlink_hybrid.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_symlink_logical.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_symlink_physical.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_set_uname_lookup.3
 MLINKS+=	archive_read_disk.3 archive_read_disk_uname.3
 MLINKS+=	archive_read_set_options.3 archive_read_set_filter_option.3
 MLINKS+=	archive_read_set_options.3 archive_read_set_format_option.3
 MLINKS+=	archive_read_set_options.3 archive_read_set_option.3
 MLINKS+=	archive_util.3 archive_clear_error.3
 MLINKS+=	archive_util.3 archive_compression.3
 MLINKS+=	archive_util.3 archive_compression_name.3
 MLINKS+=	archive_util.3 archive_copy_error.3
 MLINKS+=	archive_util.3 archive_errno.3
 MLINKS+=	archive_util.3 archive_error_string.3
 MLINKS+=	archive_util.3 archive_file_count.3
 MLINKS+=	archive_util.3 archive_filter_code.3
 MLINKS+=	archive_util.3 archive_filter_count.3
 MLINKS+=	archive_util.3 archive_filter_name.3
 MLINKS+=	archive_util.3 archive_format.3
 MLINKS+=	archive_util.3 archive_format_name.3
 MLINKS+=	archive_util.3 archive_position.3
 MLINKS+=	archive_util.3 archive_set_error.3
 MLINKS+=	archive_write_blocksize.3 archive_write_get_bytes_in_last_block.3
 MLINKS+=	archive_write_blocksize.3 archive_write_get_bytes_per_block.3
 MLINKS+=	archive_write_blocksize.3 archive_write_set_bytes_in_last_block.3
 MLINKS+=	archive_write_blocksize.3 archive_write_set_bytes_per_block.3
 MLINKS+=	archive_write_disk.3 archive_write_data_block.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_new.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_set_group_lookup.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_set_options.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_set_skip_file.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_set_standard_lookup.3
 MLINKS+=	archive_write_disk.3 archive_write_disk_set_user_lookup.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_bzip2.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_compress.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_gzip.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_lzip.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_lzma.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_none.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_program.3
 MLINKS+=	archive_write_filter.3 archive_write_add_filter_xz.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_cpio.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_pax.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_pax_restricted.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_shar.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_shar_dump.3
 MLINKS+=	archive_write_format.3 archive_write_set_format_ustar.3
 MLINKS+=	archive_write_free.3 archive_write_close.3
 MLINKS+=	archive_write_free.3 archive_write_fail.3
 MLINKS+=	archive_write_free.3 archive_write_finish.3
 MLINKS+=	archive_write_open.3 archive_write_open_FILE.3
 MLINKS+=	archive_write_open.3 archive_write_open_fd.3
 MLINKS+=	archive_write_open.3 archive_write_open_file.3
 MLINKS+=	archive_write_open.3 archive_write_open_filename.3
 MLINKS+=	archive_write_open.3 archive_write_open_memory.3
 MLINKS+=	archive_write_set_options.3 archive_write_set_filter_option.3
 MLINKS+=	archive_write_set_options.3 archive_write_set_format_option.3
 MLINKS+=	archive_write_set_options.3 archive_write_set_option.3
 MLINKS+=	libarchive.3 archive.3
 
 .PHONY: check test clean-test
 check test:
 	cd ${.CURDIR}/test && make obj && make test
 
 clean-test:
 	cd ${.CURDIR}/test && make clean
 
 .include <bsd.lib.mk>
Index: user/ngie/more-tests/lib/libc/iconv/__iconv.c
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/__iconv.c	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/__iconv.c	(revision 281585)
@@ -1,38 +1,38 @@
 /*-
  * Copyright (c) 2013 Peter Wemm
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <sys/types.h>
 #include <iconv.h>
 #include "iconv-internal.h"
 
 size_t
-__iconv(iconv_t a, const char **b, size_t *c, char **d,
+__iconv(iconv_t a, char **b, size_t *c, char **d,
      size_t *e, __uint32_t f, size_t *g)
 {
 	return __bsd___iconv(a, b, c, d, e, f, g);
 }
Index: user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/bsd_iconv.c	(revision 281585)
@@ -1,319 +1,319 @@
 /* $FreeBSD$ */
 /* $NetBSD: iconv.c,v 1.11 2009/03/03 16:22:33 explorer Exp $ */
 
 /*-
  * Copyright (c) 2003 Citrus Project,
  * Copyright (c) 2009, 2010 Gabor Kovesdan <gabor@FreeBSD.org>,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/queue.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <iconv.h>
 #include <limits.h>
 #include <paths.h>
 #include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_esdb.h"
 #include "citrus_hash.h"
 #include "citrus_iconv.h"
 
 #include "iconv-internal.h"
 
 #define ISBADF(_h_)	(!(_h_) || (_h_) == (iconv_t)-1)
 
 static iconv_t
 __bsd___iconv_open(const char *out, const char *in, struct _citrus_iconv *handle)
 {
 	const char *out_slashes;
 	char *out_noslashes;
 	int ret;
 
 	/*
 	 * Remove anything following a //, as these are options (like
 	 * //ignore, //translate, etc) and we just don't handle them.
 	 * This is for compatibility with software that uses these
 	 * blindly.
 	 */
 	out_slashes = strstr(out, "//");
 	if (out_slashes != NULL) {
 		out_noslashes = strndup(out, out_slashes - out);
 		if (out_noslashes == NULL) {
 			errno = ENOMEM;
 			return ((iconv_t)-1);
 		}
 		ret = _citrus_iconv_open(&handle, in, out_noslashes);
 		free(out_noslashes);
 	} else {
 		ret = _citrus_iconv_open(&handle, in, out);
 	}
 
 	if (ret) {
 		errno = ret == ENOENT ? EINVAL : ret;
 		return ((iconv_t)-1);
 	}
 
 	handle->cv_shared->ci_discard_ilseq = strcasestr(out, "//IGNORE");
 	handle->cv_shared->ci_ilseq_invalid = false;
 	handle->cv_shared->ci_hooks = NULL;
 
 	return ((iconv_t)(void *)handle);
 }
 
 iconv_t
 __bsd_iconv_open(const char *out, const char *in)
 {
 
 	return (__bsd___iconv_open(out, in, NULL));
 }
 
 int
 __bsd_iconv_open_into(const char *out, const char *in, iconv_allocation_t *ptr)
 {
 	struct _citrus_iconv *handle;
 
 	handle = (struct _citrus_iconv *)ptr;
 	return ((__bsd___iconv_open(out, in, handle) == (iconv_t)-1) ? -1 : 0);
 }
 
 int
 __bsd_iconv_close(iconv_t handle)
 {
 
 	if (ISBADF(handle)) {
 		errno = EBADF;
 		return (-1);
 	}
 
 	_citrus_iconv_close((struct _citrus_iconv *)(void *)handle);
 
 	return (0);
 }
 
 size_t
-__bsd_iconv(iconv_t handle, const char **in, size_t *szin, char **out, size_t *szout)
+__bsd_iconv(iconv_t handle, char **in, size_t *szin, char **out, size_t *szout)
 {
 	size_t ret;
 	int err;
 
 	if (ISBADF(handle)) {
 		errno = EBADF;
 		return ((size_t)-1);
 	}
 
 	err = _citrus_iconv_convert((struct _citrus_iconv *)(void *)handle,
 	    in, szin, out, szout, 0, &ret);
 	if (err) {
 		errno = err;
 		ret = (size_t)-1;
 	}
 
 	return (ret);
 }
 
 size_t
-__bsd___iconv(iconv_t handle, const char **in, size_t *szin, char **out,
+__bsd___iconv(iconv_t handle, char **in, size_t *szin, char **out,
     size_t *szout, uint32_t flags, size_t *invalids)
 {
 	size_t ret;
 	int err;
 
 	if (ISBADF(handle)) {
 		errno = EBADF;
 		return ((size_t)-1);
 	}
 
 	err = _citrus_iconv_convert((struct _citrus_iconv *)(void *)handle,
 	    in, szin, out, szout, flags, &ret);
 	if (invalids)
 		*invalids = ret;
 	if (err) {
 		errno = err;
 		ret = (size_t)-1;
 	}
 
 	return (ret);
 }
 
 int
 __bsd___iconv_get_list(char ***rlist, size_t *rsz, bool sorted)
 {
 	int ret;
 
 	ret = _citrus_esdb_get_list(rlist, rsz, sorted);
 	if (ret) {
 		errno = ret;
 		return (-1);
 	}
 
 	return (0);
 }
 
 void
 __bsd___iconv_free_list(char **list, size_t sz)
 {
 
 	_citrus_esdb_free_list(list, sz);
 }
 
 /*
  * GNU-compatibile non-standard interfaces.
  */
 static int
 qsort_helper(const void *first, const void *second)
 {
 	const char * const *s1;
 	const char * const *s2;
 
 	s1 = first;
 	s2 = second;
 	return (strcmp(*s1, *s2));
 }
 
 void
 __bsd_iconvlist(int (*do_one) (unsigned int, const char * const *,
     void *), void *data)
 {
 	char **list, **names;
 	const char * const *np;
 	char *curitem, *curkey, *slashpos;
 	size_t sz;
 	unsigned int i, j;
 
 	i = 0;
 
 	if (__bsd___iconv_get_list(&list, &sz, true))
 		list = NULL;
 	qsort((void *)list, sz, sizeof(char *), qsort_helper);
 	while (i < sz) {
 		j = 0;
 		slashpos = strchr(list[i], '/');
 		curkey = (char *)malloc(slashpos - list[i] + 2);
 		names = (char **)malloc(sz * sizeof(char *));
 		if ((curkey == NULL) || (names == NULL)) {
 			__bsd___iconv_free_list(list, sz);
 			return;
 		}
 		strlcpy(curkey, list[i], slashpos - list[i] + 1);
 		names[j++] = curkey;
 		for (; (i < sz) && (memcmp(curkey, list[i], strlen(curkey)) == 0); i++) {
 			slashpos = strchr(list[i], '/');
 			curitem = (char *)malloc(strlen(slashpos) + 1);
 			if (curitem == NULL) {
 				__bsd___iconv_free_list(list, sz);
 				return;
 			}
 			strlcpy(curitem, &slashpos[1], strlen(slashpos) + 1);
 			if (strcmp(curkey, curitem) == 0) {
 				continue;
 			}
 			names[j++] = curitem;
 		}
 		np = (const char * const *)names;
 		do_one(j, np, data);
 		free(names);
 	}
 
 	__bsd___iconv_free_list(list, sz);
 }
 
 __inline const char *
 __bsd_iconv_canonicalize(const char *name)
 {
 
 	return (_citrus_iconv_canonicalize(name));
 }
 
 int
 __bsd_iconvctl(iconv_t cd, int request, void *argument)
 {
 	struct _citrus_iconv *cv;
 	struct iconv_hooks *hooks;
 	const char *convname;
 	char src[PATH_MAX], *dst;
 	int *i;
 
 	cv = (struct _citrus_iconv *)(void *)cd;
 	hooks = (struct iconv_hooks *)argument;
 	i = (int *)argument;
 
 	if (ISBADF(cd)) {
 		errno = EBADF;
 		return (-1);
 	}
 
 	switch (request) {
 	case ICONV_TRIVIALP:
 		convname = cv->cv_shared->ci_convname;
 		dst = strchr(convname, '/');
 
 		strlcpy(src, convname, dst - convname + 1);
 		dst++;
 		if ((convname == NULL) || (src == NULL) || (dst == NULL))
 			return (-1);
 		*i = strcmp(src, dst) == 0 ? 1 : 0;
 		return (0);
 	case ICONV_GET_TRANSLITERATE:
 		*i = 1;
 		return (0);
 	case ICONV_SET_TRANSLITERATE:
 		return  ((*i == 1) ? 0 : -1);
 	case ICONV_GET_DISCARD_ILSEQ:
 		*i = cv->cv_shared->ci_discard_ilseq ? 1 : 0;
 		return (0);
 	case ICONV_SET_DISCARD_ILSEQ:
 		cv->cv_shared->ci_discard_ilseq = *i;
 		return (0);
 	case ICONV_SET_HOOKS:
 		cv->cv_shared->ci_hooks = hooks;
 		return (0);
 	case ICONV_SET_FALLBACKS:
 		errno = EOPNOTSUPP;
 		return (-1);
 	case ICONV_GET_ILSEQ_INVALID:
 		*i = cv->cv_shared->ci_ilseq_invalid ? 1 : 0;
 		return (0);
 	case ICONV_SET_ILSEQ_INVALID:
 		cv->cv_shared->ci_ilseq_invalid = *i;
 		return (0);
 	default:
 		errno = EINVAL;
 		return (-1);
 	}
 }
 
 void
 __bsd_iconv_set_relocation_prefix(const char *orig_prefix __unused,
     const char *curr_prefix __unused)
 {
 
 }
Index: user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_iconv.h	(revision 281585)
@@ -1,64 +1,64 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_iconv.h,v 1.5 2008/02/09 14:56:20 junyoung Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #ifndef _CITRUS_ICONV_H_
 #define _CITRUS_ICONV_H_
 
 struct _citrus_iconv_shared;
 struct _citrus_iconv_ops;
 struct _citrus_iconv;
 
 __BEGIN_DECLS
 int		 _citrus_iconv_open(struct _citrus_iconv * __restrict * __restrict,
 		    const char * __restrict, const char * __restrict);
 void		 _citrus_iconv_close(struct _citrus_iconv *);
 const char	*_citrus_iconv_canonicalize(const char *);
 __END_DECLS
 
 
 #include "citrus_iconv_local.h"
 
 #define _CITRUS_ICONV_F_HIDE_INVALID	0x0001
 
 /*
  * _citrus_iconv_convert:
  *	convert a string.
  */
 static __inline int
 _citrus_iconv_convert(struct _citrus_iconv * __restrict cv,
-    const char * __restrict * __restrict in, size_t * __restrict inbytes,
+    char * __restrict * __restrict in, size_t * __restrict inbytes,
     char * __restrict * __restrict out, size_t * __restrict outbytes,
     uint32_t flags, size_t * __restrict nresults)
 {
 
 	return (*cv->cv_shared->ci_ops->io_convert)(cv, in, inbytes, out,
 	    outbytes, flags, nresults);
 }
 
 #endif
Index: user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_iconv_local.h	(revision 281585)
@@ -1,110 +1,110 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_iconv_local.h,v 1.3 2008/02/09 14:56:20 junyoung Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #ifndef _CITRUS_ICONV_LOCAL_H_
 #define _CITRUS_ICONV_LOCAL_H_
 
 #include <iconv.h>
 #include <stdbool.h>
 
 #define _CITRUS_ICONV_GETOPS_FUNC_BASE(_n_)				\
     int _n_(struct _citrus_iconv_ops *)
 #define _CITRUS_ICONV_GETOPS_FUNC(_n_)					\
     _CITRUS_ICONV_GETOPS_FUNC_BASE(_citrus_##_n_##_iconv_getops)
 
 #define _CITRUS_ICONV_DECLS(_m_)					\
 static int	 _citrus_##_m_##_iconv_init_shared			\
 		    (struct _citrus_iconv_shared * __restrict,		\
 	 	    const char * __restrict, const char * __restrict);	\
 static void	 _citrus_##_m_##_iconv_uninit_shared			\
 		    (struct _citrus_iconv_shared *);			\
 static int	 _citrus_##_m_##_iconv_convert				\
 		    (struct _citrus_iconv * __restrict,			\
-		    const char * __restrict * __restrict,		\
+		    char * __restrict * __restrict,			\
 		    size_t * __restrict,				\
 		    char * __restrict * __restrict,			\
 		    size_t * __restrict outbytes,			\
 	 	    uint32_t, size_t * __restrict);			\
 static int	 _citrus_##_m_##_iconv_init_context			\
 		    (struct _citrus_iconv *);				\
 static void	 _citrus_##_m_##_iconv_uninit_context			\
 		    (struct _citrus_iconv *)
 
 
 #define _CITRUS_ICONV_DEF_OPS(_m_)					\
 extern struct _citrus_iconv_ops _citrus_##_m_##_iconv_ops;		\
 struct _citrus_iconv_ops _citrus_##_m_##_iconv_ops = {			\
 	/* io_init_shared */	&_citrus_##_m_##_iconv_init_shared,	\
 	/* io_uninit_shared */	&_citrus_##_m_##_iconv_uninit_shared,	\
 	/* io_init_context */	&_citrus_##_m_##_iconv_init_context,	\
 	/* io_uninit_context */	&_citrus_##_m_##_iconv_uninit_context,	\
 	/* io_convert */	&_citrus_##_m_##_iconv_convert		\
 }
 
 typedef _CITRUS_ICONV_GETOPS_FUNC_BASE((*_citrus_iconv_getops_t));
 typedef	int (*_citrus_iconv_init_shared_t)
     (struct _citrus_iconv_shared * __restrict,
     const char * __restrict, const char * __restrict);
 typedef void (*_citrus_iconv_uninit_shared_t)
     (struct _citrus_iconv_shared *);
 typedef int (*_citrus_iconv_convert_t)
     (struct _citrus_iconv * __restrict,
-    const char *__restrict* __restrict, size_t * __restrict,
+    char *__restrict* __restrict, size_t * __restrict,
     char * __restrict * __restrict, size_t * __restrict, uint32_t,
     size_t * __restrict);
 typedef int (*_citrus_iconv_init_context_t)(struct _citrus_iconv *);
 typedef void (*_citrus_iconv_uninit_context_t)(struct _citrus_iconv *);
 
 struct _citrus_iconv_ops {
 	_citrus_iconv_init_shared_t	io_init_shared;
 	_citrus_iconv_uninit_shared_t	io_uninit_shared;
 	_citrus_iconv_init_context_t	io_init_context;
 	_citrus_iconv_uninit_context_t	io_uninit_context;
 	_citrus_iconv_convert_t		io_convert;
 };
 
 struct _citrus_iconv_shared {
 	struct _citrus_iconv_ops			*ci_ops;
 	void						*ci_closure;
 	_CITRUS_HASH_ENTRY(_citrus_iconv_shared)	 ci_hash_entry;
 	TAILQ_ENTRY(_citrus_iconv_shared)		 ci_tailq_entry;
 	_citrus_module_t				 ci_module;
 	unsigned int					 ci_used_count;
 	char						*ci_convname;
 	bool						 ci_discard_ilseq;
 	struct iconv_hooks				*ci_hooks;
 	bool						 ci_ilseq_invalid;
 };
 
 struct _citrus_iconv {
 	struct _citrus_iconv_shared			*cv_shared;
 	void						*cv_closure;
 };
 
 #endif
Index: user/ngie/more-tests/lib/libc/iconv/citrus_none.c
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_none.c	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_none.c	(revision 281585)
@@ -1,237 +1,237 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_none.c,v 1.18 2008/06/14 16:01:07 tnozaki Exp $ */
 
 /*-
  * Copyright (c) 2002 Citrus Project,
  * Copyright (c) 2010 Gabor Kovesdan <gabor@FreeBSD.org>,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <iconv.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_none.h"
 #include "citrus_stdenc.h"
 
 _CITRUS_STDENC_DECLS(NONE);
 _CITRUS_STDENC_DEF_OPS(NONE);
 struct _citrus_stdenc_traits _citrus_NONE_stdenc_traits = {
 	0,	/* et_state_size */
 	1,	/* mb_cur_max */
 };
 
 static int
 _citrus_NONE_stdenc_init(struct _citrus_stdenc * __restrict ce,
     const void *var __unused, size_t lenvar __unused,
     struct _citrus_stdenc_traits * __restrict et)
 {
 
 	et->et_state_size = 0;
 	et->et_mb_cur_max = 1;
 
 	ce->ce_closure = NULL;
 
 	return (0);
 }
 
 static void
 _citrus_NONE_stdenc_uninit(struct _citrus_stdenc *ce __unused)
 {
 
 }
 
 static int
 _citrus_NONE_stdenc_init_state(struct _citrus_stdenc * __restrict ce __unused,
     void * __restrict ps __unused)
 {
 
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_mbtocs(struct _citrus_stdenc * __restrict ce __unused,
-    _csid_t *csid, _index_t *idx, const char **s, size_t n,
+    _csid_t *csid, _index_t *idx, char **s, size_t n,
     void *ps __unused, size_t *nresult, struct iconv_hooks *hooks)
 {
 
 	if (n < 1) {
 		*nresult = (size_t)-2;
 		return (0);
 	}
 
 	*csid = 0;
 	*idx = (_index_t)(unsigned char)*(*s)++;
 	*nresult = *idx == 0 ? 0 : 1;
 
 	if ((hooks != NULL) && (hooks->uc_hook != NULL))
 		hooks->uc_hook((unsigned int)*idx, hooks->data);
 
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_cstomb(struct _citrus_stdenc * __restrict ce __unused,
     char *s, size_t n, _csid_t csid, _index_t idx, void *ps __unused,
     size_t *nresult, struct iconv_hooks *hooks __unused)
 {
 
 	if (csid == _CITRUS_CSID_INVALID) {
 		*nresult = 0;
 		return (0);
 	}
 	if (csid != 0)
 		return (EILSEQ);
 
 	if ((idx & 0x000000FF) == idx) {
 		if (n < 1) {
 			*nresult = (size_t)-1;
 			return (E2BIG);
 		}
 		*s = (char)idx;
 		*nresult = 1;
 	} else if ((idx & 0x0000FFFF) == idx) {
 		if (n < 2) {
 			*nresult = (size_t)-1;
 			return (E2BIG);
 		}
 		s[0] = (char)idx;
 		/* XXX: might be endian dependent */
 		s[1] = (char)(idx >> 8);
 		*nresult = 2;
 	} else if ((idx & 0x00FFFFFF) == idx) {
 		if (n < 3) {
 			*nresult = (size_t)-1;
 			return (E2BIG);
 		}
 		s[0] = (char)idx;
 		/* XXX: might be endian dependent */
 		s[1] = (char)(idx >> 8);
 		s[2] = (char)(idx >> 16);
 		*nresult = 3;
 	} else {
 		if (n < 4) {
 			*nresult = (size_t)-1;
 			return (E2BIG);
 		}
 		s[0] = (char)idx;
 		/* XXX: might be endian dependent */
 		s[1] = (char)(idx >> 8);
 		s[2] = (char)(idx >> 16);
 		s[3] = (char)(idx >> 24);
 		*nresult = 4;
 	}
 		
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_mbtowc(struct _citrus_stdenc * __restrict ce __unused,
-    _wc_t * __restrict pwc, const char ** __restrict s, size_t n,
+    _wc_t * __restrict pwc, char ** __restrict s, size_t n,
     void * __restrict pspriv __unused, size_t * __restrict nresult,
     struct iconv_hooks *hooks)
 {
 
 	if (s == NULL) {
 		*nresult = 0;
 		return (0);
 	}
 	if (n == 0) {
 		*nresult = (size_t)-2;
 		return (0);
 	}
 
 	if (pwc != NULL)
 		*pwc = (_wc_t)(unsigned char) **s;
 
 	*nresult = **s == '\0' ? 0 : 1;
 
 	if ((hooks != NULL) && (hooks->wc_hook != NULL))
 		hooks->wc_hook(*pwc, hooks->data);
 
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_wctomb(struct _citrus_stdenc * __restrict ce __unused,
     char * __restrict s, size_t n, _wc_t wc,
     void * __restrict pspriv __unused, size_t * __restrict nresult,
     struct iconv_hooks *hooks __unused)
 {
 
 	if ((wc & ~0xFFU) != 0) {
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	if (n == 0) {
 		*nresult = (size_t)-1;
 		return (E2BIG);
 	}
 
 	*nresult = 1;
 	if (s != NULL && n > 0)
 		*s = (char)wc;
 
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_put_state_reset(struct _citrus_stdenc * __restrict ce __unused,
     char * __restrict s __unused, size_t n __unused,
     void * __restrict pspriv __unused, size_t * __restrict nresult)
 {
 
 	*nresult = 0;
 
 	return (0);
 }
 
 static int
 _citrus_NONE_stdenc_get_state_desc(struct _stdenc * __restrict ce __unused,
     void * __restrict ps __unused, int id,
     struct _stdenc_state_desc * __restrict d)
 {
 	int ret = 0;
 
 	switch (id) {
 	case _STDENC_SDID_GENERIC:
 		d->u.generic.state = _STDENC_SDGEN_INITIAL;
 		break;
 	default:
 		ret = EOPNOTSUPP;
 	}
 
 	return (ret);
 }
Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc.h	(revision 281585)
@@ -1,124 +1,124 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_stdenc.h,v 1.4 2005/10/29 18:02:04 tshiozak Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #ifndef _CITRUS_STDENC_H_
 #define _CITRUS_STDENC_H_
 
 struct _citrus_stdenc;
 struct _citrus_stdenc_ops;
 struct _citrus_stdenc_traits;
 
 #define _CITRUS_STDENC_SDID_GENERIC		0
 struct _citrus_stdenc_state_desc
 {
 	union {
 		struct {
 			int	state;
 #define _CITRUS_STDENC_SDGEN_UNKNOWN		0
 #define _CITRUS_STDENC_SDGEN_INITIAL		1
 #define _CITRUS_STDENC_SDGEN_STABLE		2
 #define _CITRUS_STDENC_SDGEN_INCOMPLETE_CHAR	3
 #define _CITRUS_STDENC_SDGEN_INCOMPLETE_SHIFT	4
 		} generic;
 	} u;
 };
 
 #include "citrus_stdenc_local.h"
 
 __BEGIN_DECLS
 int	 _citrus_stdenc_open(struct _citrus_stdenc * __restrict * __restrict,
 	    char const * __restrict, const void * __restrict, size_t);
 void	 _citrus_stdenc_close(struct _citrus_stdenc *);
 __END_DECLS
 
 static __inline int
 _citrus_stdenc_init_state(struct _citrus_stdenc * __restrict ce,
     void * __restrict ps)
 {
 
 	return ((*ce->ce_ops->eo_init_state)(ce, ps));
 }
 
 static __inline int
 _citrus_stdenc_mbtocs(struct _citrus_stdenc * __restrict ce,
     _citrus_csid_t * __restrict csid, _citrus_index_t * __restrict idx,
-    const char ** __restrict s, size_t n, void * __restrict ps,
+    char ** __restrict s, size_t n, void * __restrict ps,
     size_t * __restrict nresult, struct iconv_hooks *hooks)
 {
 
 	return ((*ce->ce_ops->eo_mbtocs)(ce, csid, idx, s, n, ps, nresult,
 	    hooks));
 }
 
 static __inline int
 _citrus_stdenc_cstomb(struct _citrus_stdenc * __restrict ce,
     char * __restrict s, size_t n, _citrus_csid_t csid, _citrus_index_t idx,
     void * __restrict ps, size_t * __restrict nresult,
     struct iconv_hooks *hooks)
 {
 
 	return ((*ce->ce_ops->eo_cstomb)(ce, s, n, csid, idx, ps, nresult,
 	    hooks));
 }
 
 static __inline int
 _citrus_stdenc_wctomb(struct _citrus_stdenc * __restrict ce,
     char * __restrict s, size_t n, _citrus_wc_t wc, void * __restrict ps,
     size_t * __restrict nresult, struct iconv_hooks *hooks)
 {
 
 	return ((*ce->ce_ops->eo_wctomb)(ce, s, n, wc, ps, nresult, hooks));
 }
 
 static __inline int
 _citrus_stdenc_put_state_reset(struct _citrus_stdenc * __restrict ce,
     char * __restrict s, size_t n, void * __restrict ps,
     size_t * __restrict nresult)
 {
 
 	return ((*ce->ce_ops->eo_put_state_reset)(ce, s, n, ps, nresult));
 }
 
 static __inline size_t
 _citrus_stdenc_get_state_size(struct _citrus_stdenc *ce)
 {
 
 	return (ce->ce_traits->et_state_size);
 }
 
 static __inline int
 _citrus_stdenc_get_state_desc(struct _citrus_stdenc * __restrict ce,
     void * __restrict ps, int id,
     struct _citrus_stdenc_state_desc * __restrict d)
 {
 
 	return ((*ce->ce_ops->eo_get_state_desc)(ce, ps, id, d));
 }
 #endif
Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_local.h	(revision 281585)
@@ -1,162 +1,162 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_stdenc_local.h,v 1.4 2008/02/09 14:56:20 junyoung Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #ifndef _CITRUS_STDENC_LOCAL_H_
 #define _CITRUS_STDENC_LOCAL_H_
 
 #include <iconv.h>
 
 #include "citrus_module.h"
 
 #define _CITRUS_STDENC_GETOPS_FUNC_BASE(n)			\
    int n(struct _citrus_stdenc_ops *, size_t)
 #define _CITRUS_STDENC_GETOPS_FUNC(_e_)					\
    _CITRUS_STDENC_GETOPS_FUNC_BASE(_citrus_##_e_##_stdenc_getops)
 typedef _CITRUS_STDENC_GETOPS_FUNC_BASE((*_citrus_stdenc_getops_t));
 
 
 #define _CITRUS_STDENC_DECLS(_e_)					\
 static int	 _citrus_##_e_##_stdenc_init				\
 		    (struct _citrus_stdenc * __restrict,		\
 		    const void * __restrict, size_t,			\
 		    struct _citrus_stdenc_traits * __restrict);		\
 static void	 _citrus_##_e_##_stdenc_uninit(struct _citrus_stdenc *);\
 static int	 _citrus_##_e_##_stdenc_init_state			\
 		    (struct _citrus_stdenc * __restrict,		\
 		    void * __restrict);					\
 static int	 _citrus_##_e_##_stdenc_mbtocs				\
 		    (struct _citrus_stdenc * __restrict,		\
 		    _citrus_csid_t * __restrict,			\
 		    _citrus_index_t * __restrict,			\
-		    const char ** __restrict, size_t,			\
+		    char ** __restrict, size_t,				\
 		    void * __restrict, size_t * __restrict,		\
 		    struct iconv_hooks *);				\
 static int	 _citrus_##_e_##_stdenc_cstomb				\
 		    (struct _citrus_stdenc * __restrict,		\
 		    char * __restrict, size_t, _citrus_csid_t,		\
 		    _citrus_index_t, void * __restrict,			\
 		    size_t * __restrict, struct iconv_hooks *);		\
 static int	 _citrus_##_e_##_stdenc_mbtowc				\
 		    (struct _citrus_stdenc * __restrict,		\
 		    _citrus_wc_t * __restrict,				\
-		    const char ** __restrict, size_t,				\
+		    char ** __restrict, size_t,				\
 		    void * __restrict, size_t * __restrict,		\
 		    struct iconv_hooks *);				\
 static int	 _citrus_##_e_##_stdenc_wctomb				\
 		    (struct _citrus_stdenc * __restrict,		\
 		    char * __restrict, size_t, _citrus_wc_t,		\
 		    void * __restrict, size_t * __restrict,		\
 		    struct iconv_hooks *);				\
 static int	 _citrus_##_e_##_stdenc_put_state_reset			\
 		    (struct _citrus_stdenc * __restrict,		\
 		    char * __restrict, size_t, void * __restrict,	\
 		    size_t * __restrict);				\
 static int	 _citrus_##_e_##_stdenc_get_state_desc			\
 		    (struct _citrus_stdenc * __restrict,		\
 		    void * __restrict, int,				\
 		    struct _citrus_stdenc_state_desc * __restrict)
 
 #define _CITRUS_STDENC_DEF_OPS(_e_)					\
 extern struct _citrus_stdenc_ops _citrus_##_e_##_stdenc_ops;		\
 struct _citrus_stdenc_ops _citrus_##_e_##_stdenc_ops = {		\
 	/* eo_init */		&_citrus_##_e_##_stdenc_init,		\
 	/* eo_uninit */		&_citrus_##_e_##_stdenc_uninit,		\
 	/* eo_init_state */	&_citrus_##_e_##_stdenc_init_state,	\
 	/* eo_mbtocs */		&_citrus_##_e_##_stdenc_mbtocs,		\
 	/* eo_cstomb */		&_citrus_##_e_##_stdenc_cstomb,		\
 	/* eo_mbtowc */		&_citrus_##_e_##_stdenc_mbtowc,		\
 	/* eo_wctomb */		&_citrus_##_e_##_stdenc_wctomb,		\
 	/* eo_put_state_reset */&_citrus_##_e_##_stdenc_put_state_reset,\
 	/* eo_get_state_desc */	&_citrus_##_e_##_stdenc_get_state_desc	\
 }
 
 typedef int (*_citrus_stdenc_init_t)
     (struct _citrus_stdenc * __restrict, const void * __restrict, size_t,
     struct _citrus_stdenc_traits * __restrict);
 typedef void (*_citrus_stdenc_uninit_t)(struct _citrus_stdenc * __restrict);
 typedef int (*_citrus_stdenc_init_state_t)
     (struct _citrus_stdenc * __restrict, void * __restrict);
 typedef int (*_citrus_stdenc_mbtocs_t)
     (struct _citrus_stdenc * __restrict,
     _citrus_csid_t * __restrict, _citrus_index_t * __restrict,
-    const char ** __restrict, size_t,
+    char ** __restrict, size_t,
     void * __restrict, size_t * __restrict,
     struct iconv_hooks *);
 typedef int (*_citrus_stdenc_cstomb_t)
     (struct _citrus_stdenc *__restrict, char * __restrict, size_t,
     _citrus_csid_t, _citrus_index_t, void * __restrict,
     size_t * __restrict, struct iconv_hooks *);
 typedef int (*_citrus_stdenc_mbtowc_t)
     (struct _citrus_stdenc * __restrict,
     _citrus_wc_t * __restrict,
-    const char ** __restrict, size_t,
+    char ** __restrict, size_t,
     void * __restrict, size_t * __restrict,
     struct iconv_hooks *);
 typedef int (*_citrus_stdenc_wctomb_t)
     (struct _citrus_stdenc *__restrict, char * __restrict, size_t,
     _citrus_wc_t, void * __restrict, size_t * __restrict,
     struct iconv_hooks *);
 typedef int (*_citrus_stdenc_put_state_reset_t)
     (struct _citrus_stdenc *__restrict, char * __restrict, size_t,
     void * __restrict, size_t * __restrict);
 typedef int (*_citrus_stdenc_get_state_desc_t)
     (struct _citrus_stdenc * __restrict, void * __restrict, int,
     struct _citrus_stdenc_state_desc * __restrict);
 
 struct _citrus_stdenc_ops {
 	_citrus_stdenc_init_t		eo_init;
 	_citrus_stdenc_uninit_t		eo_uninit;
 	_citrus_stdenc_init_state_t	eo_init_state;
 	_citrus_stdenc_mbtocs_t		eo_mbtocs;
 	_citrus_stdenc_cstomb_t		eo_cstomb;
 	_citrus_stdenc_mbtowc_t		eo_mbtowc;
 	_citrus_stdenc_wctomb_t		eo_wctomb;
 	_citrus_stdenc_put_state_reset_t eo_put_state_reset;
 	/* version 0x00000002 */
 	_citrus_stdenc_get_state_desc_t	eo_get_state_desc;
 };
 
 struct _citrus_stdenc_traits {
 	/* version 0x00000001 */
 	size_t				 et_state_size;
 	size_t				 et_mb_cur_max;
 };
 
 struct _citrus_stdenc {
 	/* version 0x00000001 */
 	struct _citrus_stdenc_ops	*ce_ops;
 	void				*ce_closure;
 	_citrus_module_t		 ce_module;
 	struct _citrus_stdenc_traits	*ce_traits;
 };
 
 #define _CITRUS_DEFAULT_STDENC_NAME		"NONE"
 
 #endif
Index: user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/citrus_stdenc_template.h	(revision 281585)
@@ -1,211 +1,211 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_stdenc_template.h,v 1.4 2008/02/09 14:56:20 junyoung Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <iconv.h>
 
 /*
  * CAUTION: THIS IS NOT STANDALONE FILE
  *
  * function templates of iconv standard encoding handler for each encodings.
  *
  */
 
 /*
  * macros
  */
 
 #undef _TO_EI
 #undef _CE_TO_EI
 #undef _TO_STATE
 #define _TO_EI(_cl_)	((_ENCODING_INFO*)(_cl_))
 #define _CE_TO_EI(_ce_)	(_TO_EI((_ce_)->ce_closure))
 #define _TO_STATE(_ps_)	((_ENCODING_STATE*)(_ps_))
 
 /* ----------------------------------------------------------------------
  * templates for public functions
  */
 
 int
 _FUNCNAME(stdenc_getops)(struct _citrus_stdenc_ops *ops,
     size_t lenops __unused)
 {
 
 	memcpy(ops, &_FUNCNAME(stdenc_ops), sizeof(_FUNCNAME(stdenc_ops)));
 
 	return (0);
 }
 
 static int
 _FUNCNAME(stdenc_init)(struct _citrus_stdenc * __restrict ce,
     const void * __restrict var, size_t lenvar,
     struct _citrus_stdenc_traits * __restrict et)
 {
 	_ENCODING_INFO *ei;
 	int ret;
 
 	ei = NULL;
 	if (sizeof(_ENCODING_INFO) > 0) {
 		ei = calloc(1, sizeof(_ENCODING_INFO));
 		if (ei == NULL)
 			return (errno);
 	}
 
 	ret = _FUNCNAME(encoding_module_init)(ei, var, lenvar);
 	if (ret) {
 		free((void *)ei);
 		return (ret);
 	}
 
 	ce->ce_closure = ei;
 	et->et_state_size = sizeof(_ENCODING_STATE);
 	et->et_mb_cur_max = _ENCODING_MB_CUR_MAX(_CE_TO_EI(ce));
 
 	return (0);
 }
 
 static void
 _FUNCNAME(stdenc_uninit)(struct _citrus_stdenc * __restrict ce)
 {
 
 	if (ce) {
 		_FUNCNAME(encoding_module_uninit)(_CE_TO_EI(ce));
 		free(ce->ce_closure);
 	}
 }
 
 static int
 _FUNCNAME(stdenc_init_state)(struct _citrus_stdenc * __restrict ce,
     void * __restrict ps)
 {
 
 	_FUNCNAME(init_state)(_CE_TO_EI(ce), _TO_STATE(ps));
 
 	return (0);
 }
 
 static int
 _FUNCNAME(stdenc_mbtocs)(struct _citrus_stdenc * __restrict ce,
     _citrus_csid_t * __restrict csid, _citrus_index_t * __restrict idx,
-    const char ** __restrict s, size_t n, void * __restrict ps,
+    char ** __restrict s, size_t n, void * __restrict ps,
     size_t * __restrict nresult, struct iconv_hooks *hooks)
 {
 	wchar_t wc;
 	int ret;
 
 	ret = _FUNCNAME(mbrtowc_priv)(_CE_TO_EI(ce), &wc, s, n,
 	    _TO_STATE(ps), nresult);
 
 	if ((ret == 0) && *nresult != (size_t)-2)
 		ret = _FUNCNAME(stdenc_wctocs)(_CE_TO_EI(ce), csid, idx, wc);
 
 	if ((ret == 0) && (hooks != NULL) && (hooks->uc_hook != NULL))
 		hooks->uc_hook((unsigned int)*idx, hooks->data);
 	return (ret);
 }
 
 static int
 _FUNCNAME(stdenc_cstomb)(struct _citrus_stdenc * __restrict ce,
     char * __restrict s, size_t n, _citrus_csid_t csid, _citrus_index_t idx,
     void * __restrict ps, size_t * __restrict nresult,
     struct iconv_hooks *hooks __unused)
 {
 	wchar_t wc;
 	int ret;
 
 	wc = ret = 0;
 
 	if (csid != _CITRUS_CSID_INVALID)
 		ret = _FUNCNAME(stdenc_cstowc)(_CE_TO_EI(ce), &wc, csid, idx);
 
 	if (ret == 0)
 		ret = _FUNCNAME(wcrtomb_priv)(_CE_TO_EI(ce), s, n, wc,
 		    _TO_STATE(ps), nresult);
 	return (ret);
 }
 
 static int
 _FUNCNAME(stdenc_mbtowc)(struct _citrus_stdenc * __restrict ce,
-    _citrus_wc_t * __restrict wc, const char ** __restrict s, size_t n,
+    _citrus_wc_t * __restrict wc, char ** __restrict s, size_t n,
     void * __restrict ps, size_t * __restrict nresult,
     struct iconv_hooks *hooks)
 {
 	int ret;
 
 	ret = _FUNCNAME(mbrtowc_priv)(_CE_TO_EI(ce), wc, s, n,
 	    _TO_STATE(ps), nresult);
 	if ((ret == 0) && (hooks != NULL) && (hooks->wc_hook != NULL))
 		hooks->wc_hook(*wc, hooks->data);
 	return (ret);
 }
 
 static int
 _FUNCNAME(stdenc_wctomb)(struct _citrus_stdenc * __restrict ce,
     char * __restrict s, size_t n, _citrus_wc_t wc, void * __restrict ps,
     size_t * __restrict nresult, struct iconv_hooks *hooks __unused)
 {
 	int ret;
 
 	ret = _FUNCNAME(wcrtomb_priv)(_CE_TO_EI(ce), s, n, wc, _TO_STATE(ps),
 	    nresult);
 	return (ret);
 }
 
 static int
 _FUNCNAME(stdenc_put_state_reset)(struct _citrus_stdenc * __restrict ce __unused,
     char * __restrict s __unused, size_t n __unused,
     void * __restrict ps __unused, size_t * __restrict nresult)
 {
 
 #if _ENCODING_IS_STATE_DEPENDENT
 	return ((_FUNCNAME(put_state_reset)(_CE_TO_EI(ce), s, n, _TO_STATE(ps),
 	    nresult)));
 #else
 	*nresult = 0;
 	return (0);
 #endif
 }
 
 static int
 _FUNCNAME(stdenc_get_state_desc)(struct _citrus_stdenc * __restrict ce,
     void * __restrict ps, int id,
     struct _citrus_stdenc_state_desc * __restrict d)
 {
 	int ret;
 
 	switch (id) {
 	case _STDENC_SDID_GENERIC:
 		ret = _FUNCNAME(stdenc_get_state_desc_generic)(
 		    _CE_TO_EI(ce), _TO_STATE(ps), &d->u.generic.state);
 		break;
 	default:
 		ret = EOPNOTSUPP;
 	}
 
 	return (ret);
 }
Index: user/ngie/more-tests/lib/libc/iconv/iconv-internal.h
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/iconv-internal.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/iconv-internal.h	(revision 281585)
@@ -1,45 +1,45 @@
 /*-
  * Copyright (c) 2013 Peter Wemm
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 /*
  * Internal prototypes for our back-end functions.
  */
-size_t	__bsd___iconv(iconv_t, const char **, size_t *, char **,
+size_t	__bsd___iconv(iconv_t, char **, size_t *, char **,
 		size_t *, __uint32_t, size_t *);
 void	__bsd___iconv_free_list(char **, size_t);
 int	__bsd___iconv_get_list(char ***, size_t *, __iconv_bool);
-size_t	__bsd_iconv(iconv_t, const char ** __restrict,
+size_t	__bsd_iconv(iconv_t, char ** __restrict,
 		    size_t * __restrict, char ** __restrict,
 		    size_t * __restrict);
 const char *__bsd_iconv_canonicalize(const char *);
 int	__bsd_iconv_close(iconv_t);
 iconv_t	__bsd_iconv_open(const char *, const char *);
 int	__bsd_iconv_open_into(const char *, const char *, iconv_allocation_t *);
 void	__bsd_iconv_set_relocation_prefix(const char *, const char *);
 int	__bsd_iconvctl(iconv_t, int, void *);
 void	__bsd_iconvlist(int (*) (unsigned int, const char * const *, void *), void *);
Index: user/ngie/more-tests/lib/libc/iconv/iconv.3
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/iconv.3	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/iconv.3	(revision 281585)
@@ -1,309 +1,309 @@
 .\" $FreeBSD$
 .\" $NetBSD: iconv.3,v 1.12 2004/08/02 13:38:21 tshiozak Exp $
 .\"
 .\" Copyright (c) 2003 Citrus Project,
 .\" Copyright (c) 2009, 2010 Gabor Kovesdan <gabor@FreeBSD.org>,
 .\" All rights reserved.
 .\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
 .\" are met:
 .\" 1. Redistributions of source code must retain the above copyright
 .\"    notice, this list of conditions and the following disclaimer.
 .\" 2. Redistributions in binary form must reproduce the above copyright
 .\"    notice, this list of conditions and the following disclaimer in the
 .\"    documentation and/or other materials provided with the distribution.
 .\"
 .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 .\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 .\" SUCH DAMAGE.
 .\"
 .Dd August 4, 2014
 .Dt ICONV 3
 .Os
 .Sh NAME
 .Nm iconv_open ,
 .Nm iconv_open_into ,
 .Nm iconv_close ,
 .Nm iconv
 .Nd codeset conversion functions
 .Sh LIBRARY
 .Lb libc
 .Sh SYNOPSIS
 .In iconv.h
 .Ft iconv_t
 .Fn iconv_open "const char *dstname" "const char *srcname"
 .Ft int
 .Fn iconv_open_into "const char *dstname" "const char *srcname" "iconv_allocation_t *ptr"
 .Ft int
 .Fn iconv_close "iconv_t cd"
 .Ft size_t
 .Fn iconv "iconv_t cd" "char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft"
 .Ft size_t
-.Fn __iconv "iconv_t cd" "const char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft" "uint32_t flags" "size_t * invalids"
+.Fn __iconv "iconv_t cd" "char ** restrict src" "size_t * restrict srcleft" "char ** restrict dst" "size_t * restrict dstleft" "uint32_t flags" "size_t * invalids"
 .Sh DESCRIPTION
 The
 .Fn iconv_open
 function opens a converter from the codeset
 .Fa srcname
 to the codeset
 .Fa dstname
 and returns its descriptor.
 The arguments
 .Fa srcname
 and
 .Fa dstname
 accept "" and "char", which refer to the current locale encoding.
 .Pp
 The
 .Fn iconv_open_into
 creates a conversion descriptor on a preallocated space.
 The
 .Ft iconv_allocation_t
 is used as a spaceholder type when allocating such space.
 The
 .Fa dstname
 and
 .Fa srcname
 arguments are the same as in the case of
 .Fn iconv_open .
 The
 .Fa ptr
 argument is a pointer of
 .Ft iconv_allocation_t
 to the preallocated space.
 .Pp
 The
 .Fn iconv_close
 function closes the specified converter
 .Fa cd .
 .Pp
 The
 .Fn iconv
 function converts the string in the buffer
 .Fa *src
 of length
 .Fa *srcleft
 bytes and stores the converted string in the buffer
 .Fa *dst
 of size
 .Fa *dstleft
 bytes.
 After calling
 .Fn iconv ,
 the values pointed to by
 .Fa src ,
 .Fa srcleft ,
 .Fa dst ,
 and
 .Fa dstleft
 are updated as follows:
 .Bl -tag -width 01234567
 .It *src
 Pointer to the byte just after the last character fetched.
 .It *srcleft
 Number of remaining bytes in the source buffer.
 .It *dst
 Pointer to the byte just after the last character stored.
 .It *dstleft
 Number of remaining bytes in the destination buffer.
 .El
 .Pp
 If the string pointed to by
 .Fa *src
 contains a byte sequence which is not a valid character in the source
 codeset, the conversion stops just after the last successful conversion.
 If the output buffer is too small to store the converted
 character, the conversion also stops in the same way.
 In these cases, the values pointed to by
 .Fa src ,
 .Fa srcleft ,
 .Fa dst ,
 and
 .Fa dstleft
 are updated to the state just after the last successful conversion.
 .Pp
 If the string pointed to by
 .Fa *src
 contains a character which is valid under the source codeset but
 can not be converted to the destination codeset,
 the character is replaced by an
 .Dq invalid character
 which depends on the destination codeset, e.g.,
 .Sq \&? ,
 and the conversion is continued.
 .Fn iconv
 returns the number of such
 .Dq invalid conversions .
 .Pp
 There are two special cases of
 .Fn iconv :
 .Bl -tag -width 0123
 .It "src == NULL || *src == NULL"
 If the source and/or destination codesets are stateful,
 .Fn iconv
 places these into their initial state.
 .Pp
 If both
 .Fa dst
 and
 .Fa *dst
 are
 .No non- Ns Dv NULL ,
 .Fn iconv
 stores the shift sequence for the destination switching to the initial state
 in the buffer pointed to by
 .Fa *dst .
 The buffer size is specified by the value pointed to by
 .Fa dstleft
 as above.
 .Fn iconv
 will fail if the buffer is too small to store the shift sequence.
 .Pp
 On the other hand,
 .Fa dst
 or
 .Fa *dst
 may be
 .Dv NULL .
 In this case, the shift sequence for the destination switching
 to the initial state is discarded.
 .El
 .Pp
 The
 .Fn __iconv
 function works just like
 .Fn iconv
 but if
 .Fn iconv
 fails, the invalid character count is lost there.
 This is not a bug but rather a limitation of
 .St -p1003.1-2008 ,
 so
 .Fn __iconv
 is provided as an alternative but non-standard interface.
 It also has a flags argument, where currently the following
 flags can be passed:
 .Bl -tag -width 0123
 .It __ICONV_F_HIDE_INVALID
 Skip invalid characters, instead of returning with an error.
 .El
 .Sh RETURN VALUES
 Upon successful completion of
 .Fn iconv_open ,
 it returns a conversion descriptor.
 Otherwise,
 .Fn iconv_open
 returns (iconv_t)\-1 and sets errno to indicate the error.
 .Pp
 Upon successful completion of
 .Fn iconv_open_into ,
 it returns 0.
 Otherwise,
 .Fn iconv_open_into
 returns \-1, and sets errno to indicate the error.
 .Pp
 Upon successful completion of
 .Fn iconv_close ,
 it returns 0.
 Otherwise,
 .Fn iconv_close
 returns \-1 and sets errno to indicate the error.
 .Pp
 Upon successful completion of
 .Fn iconv ,
 it returns the number of
 .Dq invalid
 conversions.
 Otherwise,
 .Fn iconv
 returns (size_t)\-1 and sets errno to indicate the error.
 .Sh ERRORS
 The
 .Fn iconv_open
 function may cause an error in the following cases:
 .Bl -tag -width Er
 .It Bq Er ENOMEM
 Memory is exhausted.
 .It Bq Er EINVAL
 There is no converter specified by
 .Fa srcname
 and
 .Fa dstname .
 .El
 The
 .Fn iconv_open_into
 function may cause an error in the following cases:
 .Bl -tag -width Er
 .It Bq Er EINVAL
 There is no converter specified by
 .Fa srcname
 and
 .Fa dstname .
 .El
 .Pp
 The
 .Fn iconv_close
 function may cause an error in the following case:
 .Bl -tag -width Er
 .It Bq Er EBADF
 The conversion descriptor specified by
 .Fa cd
 is invalid.
 .El
 .Pp
 The
 .Fn iconv
 function may cause an error in the following cases:
 .Bl -tag -width Er
 .It Bq Er EBADF
 The conversion descriptor specified by
 .Fa cd
 is invalid.
 .It Bq Er EILSEQ
 The string pointed to by
 .Fa *src
 contains a byte sequence which does not describe a valid character of
 the source codeset.
 .It Bq Er E2BIG
 The output buffer pointed to by
 .Fa *dst
 is too small to store the result string.
 .It Bq Er EINVAL
 The string pointed to by
 .Fa *src
 terminates with an incomplete character or shift sequence.
 .El
 .Sh SEE ALSO
 .Xr iconv 1 ,
 .Xr mkcsmapper 1 ,
 .Xr mkesdb 1 ,
 .Xr __iconv_get_list 3 ,
 .Xr iconv_canonicalize 3 ,
 .Xr iconvctl 3 ,
 .Xr iconvlist 3
 .Sh STANDARDS
 The
 .Fn iconv_open ,
 .Fn iconv_close ,
 and
 .Fn iconv
 functions conform to
 .St -p1003.1-2008 .
 .Pp
 The
 .Fn iconv_open_into
 function is a GNU-specific extension and it is not part of any standard,
 thus its use may break portability.
 The
 .Fn __iconv
 function is an own extension and it is not part of any standard,
 thus its use may break portability.
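Note on the iconv.3 text above: a minimal caller-side sketch of the documented iconv() contract, assuming the non-const "char **" source argument this revision switches to. The helper name convert_buffer and its parameters are hypothetical and are not part of libc or of this change. It shows the src/srcleft/dst/dstleft updates described in DESCRIPTION and the "src == NULL" reset call.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): convert inlen bytes at "in"
 * from codeset "from" to codeset "to", storing at most outcap bytes in
 * "out".  Returns 0 and sets *outlen on success, or an errno value.
 */
static int
convert_buffer(const char *to, const char *from, char *in, size_t inlen,
    char *out, size_t outcap, size_t *outlen)
{
	iconv_t cd;
	char *src, *dst;
	size_t srcleft, dstleft;
	int error;

	assert(in != NULL && out != NULL && outlen != NULL);

	/* Note the argument order: destination codeset first. */
	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return (errno);

	src = in;
	srcleft = inlen;
	dst = out;
	dstleft = outcap;
	error = 0;

	/* iconv() advances src and dst and decrements both counters. */
	if (iconv(cd, &src, &srcleft, &dst, &dstleft) == (size_t)-1)
		error = errno;
	/* src == NULL: store any shift sequence back to the initial state. */
	else if (iconv(cd, NULL, NULL, &dst, &dstleft) == (size_t)-1)
		error = errno;

	iconv_close(cd);
	if (error == 0)
		*outlen = outcap - dstleft;
	return (error);
}
```

A caller holding a const input buffer would need a cast or a copy under this prototype; the sketch sidesteps that by taking a writable "char *in". The helper makes a single pass and reports E2BIG rather than growing the output buffer.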
Index: user/ngie/more-tests/lib/libc/iconv/iconv.c
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/iconv.c	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/iconv.c	(revision 281585)
@@ -1,39 +1,39 @@
 /*-
  * Copyright (c) 2013 Peter Wemm
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <sys/types.h>
 #include <iconv.h>
 #include "iconv-internal.h"
 
 size_t
-iconv(iconv_t a, const char ** __restrict b,
+iconv(iconv_t a, char ** __restrict b,
       size_t * __restrict c, char ** __restrict d,
       size_t * __restrict e)
 {
 	return __bsd_iconv(a, b, c, d, e);
 }
Index: user/ngie/more-tests/lib/libc/iconv/iconv_compat.c
===================================================================
--- user/ngie/more-tests/lib/libc/iconv/iconv_compat.c	(revision 281584)
+++ user/ngie/more-tests/lib/libc/iconv/iconv_compat.c	(revision 281585)
@@ -1,121 +1,121 @@
 /*-
  * Copyright (c) 2013 Peter Wemm
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 /*
  * These are ABI implementations for when the raw iconv_* symbol
  * space was exposed via libc.so.7 in its early life.  This is
  * a transition aid; these wrappers will not normally ever be
  * executed except via __sym_compat() references.
  */
 #include <sys/types.h>
 #include <iconv.h>
 #include "iconv-internal.h"
 
 size_t
-__iconv_compat(iconv_t a, const char ** b, size_t * c, char ** d,
+__iconv_compat(iconv_t a, char ** b, size_t * c, char ** d,
      size_t * e, __uint32_t f, size_t *g)
 {
 	return __bsd___iconv(a, b, c, d, e, f, g);
 }
 
 void
 __iconv_free_list_compat(char ** a, size_t b)
 {
 	__bsd___iconv_free_list(a, b);
 }
 
 int
 __iconv_get_list_compat(char ***a, size_t *b, __iconv_bool c)
 {
 	return __bsd___iconv_get_list(a, b, c);
 }
 
 size_t
-iconv_compat(iconv_t a, const char ** __restrict b,
+iconv_compat(iconv_t a, char ** __restrict b,
       size_t * __restrict c, char ** __restrict d,
       size_t * __restrict e)
 {
 	return __bsd_iconv(a, b, c, d, e);
 }
 
 const char *
 iconv_canonicalize_compat(const char *a)
 {
 	return __bsd_iconv_canonicalize(a);
 }
 
 int
 iconv_close_compat(iconv_t a)
 {
 	return __bsd_iconv_close(a);
 }
 
 iconv_t
 iconv_open_compat(const char *a, const char *b)
 {
 	return __bsd_iconv_open(a, b);
 }
 
 int
 iconv_open_into_compat(const char *a, const char *b, iconv_allocation_t *c)
 {
 	return __bsd_iconv_open_into(a, b, c);
 }
 
 void
 iconv_set_relocation_prefix_compat(const char *a, const char *b)
 {
 	return __bsd_iconv_set_relocation_prefix(a, b);
 }
 
 int
 iconvctl_compat(iconv_t a, int b, void *c)
 {
 	return __bsd_iconvctl(a, b, c);
 }
 
 void
 iconvlist_compat(int (*a) (unsigned int, const char * const *, void *), void *b)
 {
 	return __bsd_iconvlist(a, b);
 }
 
 int _iconv_version_compat = 0x0108;	/* Magic - not used */
 
 __sym_compat(__iconv, __iconv_compat, FBSD_1.2);
 __sym_compat(__iconv_free_list, __iconv_free_list_compat, FBSD_1.2);
 __sym_compat(__iconv_get_list, __iconv_get_list_compat, FBSD_1.2);
 __sym_compat(_iconv_version, _iconv_version_compat, FBSD_1.3);
 __sym_compat(iconv, iconv_compat, FBSD_1.3);
 __sym_compat(iconv_canonicalize, iconv_canonicalize_compat, FBSD_1.2);
 __sym_compat(iconv_close, iconv_close_compat, FBSD_1.3);
 __sym_compat(iconv_open, iconv_open_compat, FBSD_1.3);
 __sym_compat(iconv_open_into, iconv_open_into_compat, FBSD_1.3);
 __sym_compat(iconv_set_relocation_prefix, iconv_set_relocation_prefix_compat, FBSD_1.3);
 __sym_compat(iconvctl, iconvctl_compat, FBSD_1.3);
 __sym_compat(iconvlist, iconvlist_compat, FBSD_1.3);
Index: user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h
===================================================================
--- user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/locale/cXXrtomb_iconv.h	(revision 281585)
@@ -1,116 +1,115 @@
 /*-
  * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/queue.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <langinfo.h>
 #include <uchar.h>
 
 #include "../iconv/citrus_hash.h"
 #include "../iconv/citrus_module.h"
 #include "../iconv/citrus_iconv.h"
 #include "xlocale_private.h"
 
 typedef struct {
 	bool			initialized;
 	struct _citrus_iconv	iconv;
 	union {
 		charXX_t	widechar[SRCBUF_LEN];
 		char		bytes[sizeof(charXX_t) * SRCBUF_LEN];
 	} srcbuf;
 	size_t			srcbuf_len;
 } _ConversionState;
 _Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t),
     "Size of _ConversionState must not exceed mbstate_t's size.");
 
 size_t
 cXXrtomb_l(char * __restrict s, charXX_t c, mbstate_t * __restrict ps,
     locale_t locale)
 {
 	_ConversionState *cs;
 	struct _citrus_iconv *handle;
-	const char *src;
-	char *dst;
+	char *src, *dst;
 	size_t srcleft, dstleft, invlen;
 	int err;
 
 	FIX_LOCALE(locale);
 	if (ps == NULL)
 		ps = &locale->cXXrtomb;
 	cs = (_ConversionState *)ps;
 	handle = &cs->iconv;
 
 	/* Reinitialize mbstate_t. */
 	if (s == NULL || !cs->initialized) {
 		if (_citrus_iconv_open(&handle, UTF_XX_INTERNAL,
 		    nl_langinfo_l(CODESET, locale)) != 0) {
 			cs->initialized = false;
 			errno = EINVAL;
 			return (-1);
 		}
 		handle->cv_shared->ci_discard_ilseq = true;
 		handle->cv_shared->ci_hooks = NULL;
 		cs->srcbuf_len = 0;
 		cs->initialized = true;
 		if (s == NULL)
 			return (1);
 	}
 
 	assert(cs->srcbuf_len < sizeof(cs->srcbuf.widechar) / sizeof(charXX_t));
 	cs->srcbuf.widechar[cs->srcbuf_len++] = c;
 
 	/* Perform conversion. */
 	src = cs->srcbuf.bytes;
 	srcleft = cs->srcbuf_len * sizeof(charXX_t);
 	dst = s;
 	dstleft = MB_CUR_MAX_L(locale);
 	err = _citrus_iconv_convert(handle, &src, &srcleft, &dst, &dstleft,
 	    0, &invlen);
 
 	/* Character is part of a surrogate pair. We need more input. */
 	if (err == EINVAL)
 		return (0);
 	cs->srcbuf_len = 0;
 	
 	/* Illegal sequence. */
 	if (dst == s) {
 		errno = EILSEQ;
 		return ((size_t)-1);
 	}
 	return (dst - s);
 }
 
 size_t
 cXXrtomb(char * __restrict s, charXX_t c, mbstate_t * __restrict ps)
 {
 
 	return (cXXrtomb_l(s, c, ps, __get_locale()));
 }
Index: user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h
===================================================================
--- user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h	(revision 281584)
+++ user/ngie/more-tests/lib/libc/locale/mbrtocXX_iconv.h	(revision 281585)
@@ -1,159 +1,158 @@
 /*-
  * Copyright (c) 2013 Ed Schouten <ed@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/queue.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <langinfo.h>
 #include <limits.h>
 #include <string.h>
 #include <uchar.h>
 
 #include "../iconv/citrus_hash.h"
 #include "../iconv/citrus_module.h"
 #include "../iconv/citrus_iconv.h"
 #include "xlocale_private.h"
 
 typedef struct {
 	bool			initialized;
 	struct _citrus_iconv	iconv;
 	char			srcbuf[MB_LEN_MAX];
 	size_t			srcbuf_len;
 	union {
 		charXX_t	widechar[DSTBUF_LEN];
 		char		bytes[sizeof(charXX_t) * DSTBUF_LEN];
 	} dstbuf;
 	size_t			dstbuf_len;
 } _ConversionState;
 _Static_assert(sizeof(_ConversionState) <= sizeof(mbstate_t),
     "Size of _ConversionState must not exceed mbstate_t's size.");
 
 size_t
 mbrtocXX_l(charXX_t * __restrict pc, const char * __restrict s, size_t n,
     mbstate_t * __restrict ps, locale_t locale)
 {
 	_ConversionState *cs;
 	struct _citrus_iconv *handle;
 	size_t i, retval;
 	charXX_t retchar;
 
 	FIX_LOCALE(locale);
 	if (ps == NULL)
 		ps = &locale->mbrtocXX;
 	cs = (_ConversionState *)ps;
 	handle = &cs->iconv;
 
 	/* Reinitialize mbstate_t. */
 	if (s == NULL || !cs->initialized) {
 		if (_citrus_iconv_open(&handle,
 		    nl_langinfo_l(CODESET, locale), UTF_XX_INTERNAL) != 0) {
 			cs->initialized = false;
 			errno = EINVAL;
 			return (-1);
 		}
 		handle->cv_shared->ci_discard_ilseq = true;
 		handle->cv_shared->ci_hooks = NULL;
 		cs->srcbuf_len = cs->dstbuf_len = 0;
 		cs->initialized = true;
 		if (s == NULL)
 			return (0);
 	}
 
 	/* See if we still have characters left from the previous invocation. */
 	if (cs->dstbuf_len > 0) {
 		retval = (size_t)-3;
 		goto return_char;
 	}
 
 	/* Fill up the read buffer as far as possible. */
 	if (n > sizeof(cs->srcbuf) - cs->srcbuf_len)
 		n = sizeof(cs->srcbuf) - cs->srcbuf_len;
 	memcpy(cs->srcbuf + cs->srcbuf_len, s, n);
 
 	/* Convert as few characters to the dst buffer as possible. */
 	for (i = 0; ; i++) {
-		const char *src;
-		char *dst;
+		char *src, *dst;
 		size_t srcleft, dstleft, invlen;
 		int err;
 
 		src = cs->srcbuf;
 		srcleft = cs->srcbuf_len + n;
 		dst = cs->dstbuf.bytes;
 		dstleft = i * sizeof(charXX_t);
 		assert(srcleft <= sizeof(cs->srcbuf) &&
 		    dstleft <= sizeof(cs->dstbuf.bytes));
 		err = _citrus_iconv_convert(handle, &src, &srcleft,
 		    &dst, &dstleft, 0, &invlen);
 		cs->dstbuf_len = (dst - cs->dstbuf.bytes) / sizeof(charXX_t);
 
 		/* Got new character(s). Return the first. */
 		if (cs->dstbuf_len > 0) {
 			assert(src - cs->srcbuf > cs->srcbuf_len);
 			retval = src - cs->srcbuf - cs->srcbuf_len;
 			cs->srcbuf_len = 0;
 			goto return_char;
 		}
 
 		/* Increase dst buffer size, to obtain the surrogate pair. */
 		if (err == E2BIG)
 			continue;
 
 		/* Illegal sequence. */
 		if (invlen > 0) {
 			cs->srcbuf_len = 0;
 			errno = EILSEQ;
 			return ((size_t)-1);
 		}
 
 		/* Save unprocessed remainder for the next invocation. */
 		memmove(cs->srcbuf, src, srcleft);
 		cs->srcbuf_len = srcleft;
 		return ((size_t)-2);
 	}
 
 return_char:
 	retchar = cs->dstbuf.widechar[0];
 	memmove(&cs->dstbuf.widechar[0], &cs->dstbuf.widechar[1],
 	    --cs->dstbuf_len * sizeof(charXX_t));
 	if (pc != NULL)
 		*pc = retchar;
 	if (retchar == 0)
 		return (0);
 	return (retval);
 }
 
 size_t
 mbrtocXX(charXX_t * __restrict pc, const char * __restrict s, size_t n,
     mbstate_t * __restrict ps)
 {
 
 	return (mbrtocXX_l(pc, s, n, ps, __get_locale()));
 }
Index: user/ngie/more-tests/lib/libc
===================================================================
--- user/ngie/more-tests/lib/libc	(revision 281584)
+++ user/ngie/more-tests/lib/libc	(revision 281585)

Property changes on: user/ngie/more-tests/lib/libc
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/lib/libc:r281477-281584
Index: user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/BIG5/citrus_big5.c	(revision 281585)
@@ -1,458 +1,458 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_big5.c,v 1.13 2011/05/23 14:53:46 joerg Exp $	*/
 
 /*-
  * Copyright (c)2002, 2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*-
  * Copyright (c) 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Paul Borman at Krystal Technologies.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/queue.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_prop.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_big5.h"
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	int	 chlen;
 	char	 ch[2];
 } _BIG5State;
 
 typedef struct _BIG5Exclude {
 	TAILQ_ENTRY(_BIG5Exclude)	 entry;
 	wint_t				 start;
 	wint_t				 end;
 } _BIG5Exclude;
 
 typedef TAILQ_HEAD(_BIG5ExcludeList, _BIG5Exclude) _BIG5ExcludeList;
 
 typedef struct {
 	_BIG5ExcludeList	 excludes;
 	int			 cell[0x100];
 } _BIG5EncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_BIG5_##m
 #define _ENCODING_INFO			_BIG5EncodingInfo
 #define _ENCODING_STATE			_BIG5State
 #define _ENCODING_MB_CUR_MAX(_ei_)	2
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 
 static __inline void
 /*ARGSUSED*/
 _citrus_BIG5_init_state(_BIG5EncodingInfo * __restrict ei __unused,
     _BIG5State * __restrict s)
 {
 
 	memset(s, 0, sizeof(*s));
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_BIG5_pack_state(_BIG5EncodingInfo * __restrict ei __unused,
     void * __restrict pspriv,
     const _BIG5State * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_BIG5_unpack_state(_BIG5EncodingInfo * __restrict ei __unused,
     _BIG5State * __restrict s,
     const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static __inline int
 _citrus_BIG5_check(_BIG5EncodingInfo *ei, unsigned int c)
 {
 
 	return ((ei->cell[c & 0xFF] & 0x1) ? 2 : 1);
 }
 
 static __inline int
 _citrus_BIG5_check2(_BIG5EncodingInfo *ei, unsigned int c)
 {
 
 	return ((ei->cell[c & 0xFF] & 0x2) ? 1 : 0);
 }
 
 static __inline int
 _citrus_BIG5_check_excludes(_BIG5EncodingInfo *ei, wint_t c)
 {
 	_BIG5Exclude *exclude;
 
 	TAILQ_FOREACH(exclude, &ei->excludes, entry) {
 		if (c >= exclude->start && c <= exclude->end)
 			return (EILSEQ);
 	}
 	return (0);
 }
 
 static int
 _citrus_BIG5_fill_rowcol(void * __restrict ctx, const char * __restrict s,
     uint64_t start, uint64_t end)
 {
 	_BIG5EncodingInfo *ei;
 	uint64_t n;
 	int i;
 
 	if (start > 0xFF || end > 0xFF)
 		return (EINVAL);
 	ei = (_BIG5EncodingInfo *)ctx;
 	i = strcmp("row", s) ? 1 : 0;
 	i = 1 << i;
 	for (n = start; n <= end; ++n)
 		ei->cell[n & 0xFF] |= i;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_BIG5_fill_excludes(void * __restrict ctx,
     const char * __restrict s __unused, uint64_t start, uint64_t end)
 {
 	_BIG5EncodingInfo *ei;
 	_BIG5Exclude *exclude;
 
 	if (start > 0xFFFF || end > 0xFFFF)
 		return (EINVAL);
 	ei = (_BIG5EncodingInfo *)ctx;
 	exclude = TAILQ_LAST(&ei->excludes, _BIG5ExcludeList);
 	if (exclude != NULL && (wint_t)start <= exclude->end)
 		return (EINVAL);
 	exclude = (void *)malloc(sizeof(*exclude));
 	if (exclude == NULL)
 		return (ENOMEM);
 	exclude->start = (wint_t)start;
 	exclude->end = (wint_t)end;
 	TAILQ_INSERT_TAIL(&ei->excludes, exclude, entry);
 
 	return (0);
 }
 
 static const _citrus_prop_hint_t root_hints[] = {
     _CITRUS_PROP_HINT_NUM("row", &_citrus_BIG5_fill_rowcol),
     _CITRUS_PROP_HINT_NUM("col", &_citrus_BIG5_fill_rowcol),
     _CITRUS_PROP_HINT_NUM("excludes", &_citrus_BIG5_fill_excludes),
     _CITRUS_PROP_HINT_END
 };
 
 static void
 /*ARGSUSED*/
 _citrus_BIG5_encoding_module_uninit(_BIG5EncodingInfo *ei)
 {
 	_BIG5Exclude *exclude;
 
 	while ((exclude = TAILQ_FIRST(&ei->excludes)) != NULL) {
 		TAILQ_REMOVE(&ei->excludes, exclude, entry);
 		free(exclude);
 	}
 }
 
 static int
 /*ARGSUSED*/
 _citrus_BIG5_encoding_module_init(_BIG5EncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	const char *s;
 	int err;
 
 	memset((void *)ei, 0, sizeof(*ei));
 	TAILQ_INIT(&ei->excludes);
 
 	if (lenvar > 0 && var != NULL) {
 		s = _bcs_skip_ws_len((const char *)var, &lenvar);
 		if (lenvar > 0 && *s != '\0') {
 			err = _citrus_prop_parse_variable(
 			    root_hints, (void *)ei, s, lenvar);
 			if (err == 0)
 				return (0);
 
 			_citrus_BIG5_encoding_module_uninit(ei);
 			memset((void *)ei, 0, sizeof(*ei));
 			TAILQ_INIT(&ei->excludes);
 		}
 	}
 
 	/* fallback Big5-1984, for backward compatibility. */
 	_citrus_BIG5_fill_rowcol(ei, "row", 0xA1, 0xFE);
 	_citrus_BIG5_fill_rowcol(ei, "col", 0x40, 0x7E);
 	_citrus_BIG5_fill_rowcol(ei, "col", 0xA1, 0xFE);
 
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_BIG5_mbrtowc_priv(_BIG5EncodingInfo * __restrict ei,
     wchar_t * __restrict pwc,
-    const char ** __restrict s, size_t n,
+    char ** __restrict s, size_t n,
     _BIG5State * __restrict psenc,
     size_t * __restrict nresult)
 {
 	wchar_t wchar;
-	const char *s0;
+	char *s0;
 	int c, chlenbak;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_BIG5_init_state(ei, psenc);
 		*nresult = 0;
 		return (0);
 	}
 
 	chlenbak = psenc->chlen;
 
 	/* make sure we have the first byte in the buffer */
 	switch (psenc->chlen) {
 	case 0:
 		if (n < 1)
 			goto restart;
 		psenc->ch[0] = *s0++;
 		psenc->chlen = 1;
 		n--;
 		break;
 	case 1:
 		break;
 	default:
 		/* illegal state */
 		goto ilseq;
 	}
 
 	c = _citrus_BIG5_check(ei, psenc->ch[0] & 0xff);
 	if (c == 0)
 		goto ilseq;
 	while (psenc->chlen < c) {
 		if (n < 1) {
 			goto restart;
 		}
 		psenc->ch[psenc->chlen] = *s0++;
 		psenc->chlen++;
 		n--;
 	}
 
 	switch (c) {
 	case 1:
 		wchar = psenc->ch[0] & 0xff;
 		break;
 	case 2:
 		if (!_citrus_BIG5_check2(ei, psenc->ch[1] & 0xff))
 			goto ilseq;
 		wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff);
 		break;
 	default:
 		/* illegal state */
 		goto ilseq;
 	}
 
 	if (_citrus_BIG5_check_excludes(ei, (wint_t)wchar) != 0)
 		goto ilseq;
 
 	*s = s0;
 	psenc->chlen = 0;
 	if (pwc)
 		*pwc = wchar;
 	*nresult = wchar ? c - chlenbak : 0;
 
 	return (0);
 
 ilseq:
 	psenc->chlen = 0;
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 restart:
 	*s = s0;
 	*nresult = (size_t)-2;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_BIG5_wcrtomb_priv(_BIG5EncodingInfo * __restrict ei,
     char * __restrict s,
     size_t n, wchar_t wc, _BIG5State * __restrict psenc __unused,
     size_t * __restrict nresult)
 {
 	size_t l;
 	int ret;
 
 	/* check invalid sequence */
 	if (wc & ~0xffff ||
 	    _citrus_BIG5_check_excludes(ei, (wint_t)wc) != 0) {
 		ret = EILSEQ;
 		goto err;
 	}
 
 	if (wc & 0x8000) {
 		if (_citrus_BIG5_check(ei, (wc >> 8) & 0xff) != 2 ||
 		    !_citrus_BIG5_check2(ei, wc & 0xff)) {
 			ret = EILSEQ;
 			goto err;
 		}
 		l = 2;
 	} else {
 		if (wc & ~0xff || !_citrus_BIG5_check(ei, wc & 0xff)) {
 			ret = EILSEQ;
 			goto err;
 		}
 		l = 1;
 	}
 
 	if (n < l) {
 		/* bound check failure */
 		ret = E2BIG;
 		goto err;
 	}
 
 	if (l == 2) {
 		s[0] = (wc >> 8) & 0xff;
 		s[1] = wc & 0xff;
 	} else
 		s[0] = wc & 0xff;
 
 	*nresult = l;
 
 	return (0);
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_BIG5_stdenc_wctocs(_BIG5EncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid,
     _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = (wc < 0x100) ? 0 : 1;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_BIG5_stdenc_cstowc(_BIG5EncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc,
     _csid_t csid, _index_t idx)
 {
 
 	switch (csid) {
 	case 0:
 	case 1:
 		*wc = (wchar_t)idx;
 		break;
 	default:
 		return (EILSEQ);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_BIG5_stdenc_get_state_desc_generic(_BIG5EncodingInfo * __restrict ei __unused,
     _BIG5State * __restrict psenc,
     int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(BIG5);
 _CITRUS_STDENC_DEF_OPS(BIG5);
 
 #include "citrus_stdenc_template.h"
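Reviewer note on citrus_big5.c (not part of the patch): the module classifies bytes through a 256-entry cell table, where bit 0x1 marks a lead ("row") byte and bit 0x2 marks a trail ("col") byte. The sketch below reproduces that lookup with the Big5-1984 fallback ranges from the init routine. The names big5_table, big5_fill, big5_init_1984, big5_char_len and big5_is_trail are hypothetical stand-ins, not identifiers from the module, and the -1 error return is an assumption made for brevity (the module returns EINVAL).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the cell[] member of _BIG5EncodingInfo. */
struct big5_table {
	int cell[0x100];
};

enum { BIG5_ROW = 0x1, BIG5_COL = 0x2 };

/* Mark every byte value in [start, end] with the given flag. */
static int
big5_fill(struct big5_table *t, int flag, unsigned start, unsigned end)
{
	unsigned n;

	if (start > 0xFF || end > 0xFF)
		return (-1);	/* assumption: the module returns EINVAL here */
	for (n = start; n <= end; ++n)
		t->cell[n] |= flag;
	return (0);
}

/* Same ranges as the "fallback Big5-1984" branch of the init routine. */
static void
big5_init_1984(struct big5_table *t)
{
	memset(t, 0, sizeof(*t));
	big5_fill(t, BIG5_ROW, 0xA1, 0xFE);
	big5_fill(t, BIG5_COL, 0x40, 0x7E);
	big5_fill(t, BIG5_COL, 0xA1, 0xFE);
}

/* A lead byte starts a 2-byte character; anything else is 1 byte. */
static int
big5_char_len(const struct big5_table *t, unsigned c)
{
	return ((t->cell[c & 0xFF] & BIG5_ROW) ? 2 : 1);
}

/* Nonzero when c is acceptable as the second byte of a pair. */
static int
big5_is_trail(const struct big5_table *t, unsigned c)
{
	return ((t->cell[c & 0xFF] & BIG5_COL) ? 1 : 0);
}
```

A caller would run big5_init_1984() once, then call big5_char_len() on the first byte and, when it reports 2, validate the second byte with big5_is_trail(). Keeping both flags in one table lets a byte such as 0xA1 be both a valid lead and a valid trail.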
Index: user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/DECHanyu/citrus_dechanyu.c	(revision 281585)
@@ -1,394 +1,394 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_dechanyu.c,v 1.4 2011/11/19 18:20:13 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2007 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_dechanyu.h"
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	size_t	 chlen;
 	char	 ch[4];
 } _DECHanyuState;
 
 typedef struct {
 	int	 dummy;
 } _DECHanyuEncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.__CONCAT(s_,_func_)
 
 #define _FUNCNAME(m)			__CONCAT(_citrus_DECHanyu_,m)
 #define _ENCODING_INFO			_DECHanyuEncodingInfo
 #define _ENCODING_STATE			_DECHanyuState
 #define _ENCODING_MB_CUR_MAX(_ei_)		4
 #define _ENCODING_IS_STATE_DEPENDENT		0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline void
 /*ARGSUSED*/
 _citrus_DECHanyu_init_state(_DECHanyuEncodingInfo * __restrict ei __unused,
     _DECHanyuState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_DECHanyu_pack_state(_DECHanyuEncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _DECHanyuState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_DECHanyu_unpack_state(_DECHanyuEncodingInfo * __restrict ei __unused,
     _DECHanyuState * __restrict psenc,
     const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static void
 /*ARGSUSED*/
 _citrus_DECHanyu_encoding_module_uninit(_DECHanyuEncodingInfo *ei __unused)
 {
 
 	/* ei may be null */
 }
 
 static int
 /*ARGSUSED*/
 _citrus_DECHanyu_encoding_module_init(_DECHanyuEncodingInfo * __restrict ei __unused,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 
 	/* ei may be null */
 	return (0);
 }
 
 static __inline bool
 is_singlebyte(int c)
 {
 
 	return (c <= 0x7F);
 }
 
 static __inline bool
 is_leadbyte(int c)
 {
 
 	return (c >= 0xA1 && c <= 0xFE);
 }
 
 static __inline bool
 is_trailbyte(int c)
 {
 
 	c &= ~0x80;
 	return (c >= 0x21 && c <= 0x7E);
 }
 
 static __inline bool
 is_hanyu1(int c)
 {
 
 	return (c == 0xC2);
 }
 
 static __inline bool
 is_hanyu2(int c)
 {
 
 	return (c == 0xCB);
 }
 
 #define HANYUBIT	0xC2CB0000
 
 static __inline bool
 is_94charset(int c)
 {
 
 	return (c >= 0x21 && c <= 0x7E);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_DECHanyu_mbrtowc_priv(_DECHanyuEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _DECHanyuState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	wchar_t wc;
 	int ch;
 
 	if (*s == NULL) {
 		_citrus_DECHanyu_init_state(ei, psenc);
 		*nresult = _ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	s0 = *s;
 
 	wc = (wchar_t)0;
 	switch (psenc->chlen) {
 	case 0:
 		if (n-- < 1)
 			goto restart;
 		ch = *s0++ & 0xFF;
 		if (is_singlebyte(ch)) {
 			if (pwc != NULL)
 				*pwc = (wchar_t)ch;
 			*nresult = (size_t)((ch == 0) ? 0 : 1);
 			*s = s0;
 			return (0);
 		}
 		if (!is_leadbyte(ch))
 			goto ilseq;
 		psenc->ch[psenc->chlen++] = ch;
 		break;
 	case 1:
 		ch = psenc->ch[0] & 0xFF;
 		if (!is_leadbyte(ch))
 			return (EINVAL);
 		break;
 	case 2: case 3:
 		ch = psenc->ch[0] & 0xFF;
 		if (is_hanyu1(ch)) {
 			ch = psenc->ch[1] & 0xFF;
 			if (is_hanyu2(ch)) {
 				wc |= (wchar_t)HANYUBIT;
 				break;
 			}
 		}
 	/*FALLTHROUGH*/
 	default:
 		return (EINVAL);
 	}
 
 	switch (psenc->chlen) {
 	case 1:
 		if (is_hanyu1(ch)) {
 			if (n-- < 1)
 				goto restart;
 			ch = *s0++ & 0xFF;
 			if (!is_hanyu2(ch))
 				goto ilseq;
 			psenc->ch[psenc->chlen++] = ch;
 			wc |= (wchar_t)HANYUBIT;
 			if (n-- < 1)
 				goto restart;
 			ch = *s0++ & 0xFF;
 			if (!is_leadbyte(ch))
 				goto ilseq;
 			psenc->ch[psenc->chlen++] = ch;
 		}
 		break;
 	case 2:
 		if (n-- < 1)
 			goto restart;
 		ch = *s0++ & 0xFF;
 		if (!is_leadbyte(ch))
 			goto ilseq;
 		psenc->ch[psenc->chlen++] = ch;
 		break;
 	case 3:
 		ch = psenc->ch[2] & 0xFF;
 		if (!is_leadbyte(ch))
 			return (EINVAL);
 	}
 	if (n-- < 1)
 		goto restart;
 	wc |= (wchar_t)(ch << 8);
 	ch = *s0++ & 0xFF;
 	if (!is_trailbyte(ch))
 		goto ilseq;
 	wc |= (wchar_t)ch;
 	if (pwc != NULL)
 		*pwc = wc;
 	*nresult = (size_t)(s0 - *s);
 	*s = s0;
 	psenc->chlen = 0;
 
 	return (0);
 
 restart:
 	*nresult = (size_t)-2;
 	*s = s0;
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_DECHanyu_wcrtomb_priv(_DECHanyuEncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, wchar_t wc,
     _DECHanyuState * __restrict psenc, size_t * __restrict nresult)
 {
 	int ch;
 
 	if (psenc->chlen != 0)
 		return (EINVAL);
 
 	/* XXX: assume wchar_t as int */
 	if ((uint32_t)wc <= 0x7F) {
 		ch = wc & 0xFF;
 	} else {
 		if ((uint32_t)wc > 0xFFFF) {
 			if ((wc & ~0xFFFF) != (wchar_t)HANYUBIT)
 				goto ilseq;
 			psenc->ch[psenc->chlen++] = (wc >> 24) & 0xFF;
 			psenc->ch[psenc->chlen++] = (wc >> 16) & 0xFF;
 			wc &= 0xFFFF;
 		}
 		ch = (wc >> 8) & 0xFF;
 		if (!is_leadbyte(ch))
 			goto ilseq;
 		psenc->ch[psenc->chlen++] = ch;
 		ch = wc & 0xFF;
 		if (!is_trailbyte(ch))
 			goto ilseq;
 	}
 	psenc->ch[psenc->chlen++] = ch;
 	if (n < psenc->chlen) {
 		*nresult = (size_t)-1;
 		return (E2BIG);
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	psenc->chlen = 0;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_DECHanyu_stdenc_wctocs(_DECHanyuEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	wchar_t mask;
 	int plane;
 
 	plane = 0;
 	mask = 0x7F;
 	/* XXX: assume wchar_t as int */
 	if ((uint32_t)wc > 0x7F) {
 		if ((uint32_t)wc > 0xFFFF) {
 			if ((wc & ~0xFFFF) != (wchar_t)HANYUBIT)
 				return (EILSEQ);
 			plane += 2;
 		}
 		if (!is_leadbyte((wc >> 8) & 0xFF) ||
 		    !is_trailbyte(wc & 0xFF))
 			return (EILSEQ);
 		plane += (wc & 0x80) ? 1 : 2;
 		mask |= 0x7F00;
 	}
 	*csid = plane;
 	*idx = (_index_t)(wc & mask);
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_DECHanyu_stdenc_cstowc(_DECHanyuEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid == 0) {
 		if (idx > 0x7F)
 			return (EILSEQ);
 	} else if (csid <= 4) {
 		if (!is_94charset(idx >> 8))
 			return (EILSEQ);
 		if (!is_94charset(idx & 0xFF))
 			return (EILSEQ);
 		if (csid % 2)
 			idx |= 0x80;
 		idx |= 0x8000;
 		if (csid > 2)
 			idx |= HANYUBIT;
 	} else
 		return (EILSEQ);
 	*wc = (wchar_t)idx;
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_DECHanyu_stdenc_get_state_desc_generic(
     _DECHanyuEncodingInfo * __restrict ei __unused,
     _DECHanyuState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0)
 	    ? _STDENC_SDGEN_INITIAL
 	    : _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(DECHanyu);
 _CITRUS_STDENC_DEF_OPS(DECHanyu);
 
 #include "citrus_stdenc_template.h"
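Reviewer note on citrus_dechanyu.c (not part of the patch): the wcrtomb path emits one byte for values up to 0x7F, two bytes (lead, trail) for values up to 0xFFFF, and four bytes when the upper half equals HANYUBIT, in which case the 0xC2 0xCB prefix precedes the lead and trail bytes. The sketch below shows that layout in isolation. dechanyu_encode, is_lead and is_trail are hypothetical names, and returning 0 for both invalid input and a short buffer is a simplification; the module distinguishes EILSEQ from E2BIG.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HANYUBIT	0xC2CB0000u

static int
is_lead(unsigned c)
{
	return (c >= 0xA1 && c <= 0xFE);
}

static int
is_trail(unsigned c)
{
	c &= ~0x80u;
	return (c >= 0x21 && c <= 0x7E);
}

/*
 * Encode wc into out (capacity n).  Returns the number of bytes written,
 * or 0 when wc is not representable or the buffer is too small.
 */
static size_t
dechanyu_encode(uint32_t wc, unsigned char *out, size_t n)
{
	unsigned char buf[4];
	size_t i, len = 0;

	if (wc <= 0x7F) {
		buf[len++] = (unsigned char)wc;
	} else {
		if (wc > 0xFFFF) {
			/* Only the HANYUBIT prefix is valid above 16 bits. */
			if ((wc & ~0xFFFFu) != HANYUBIT)
				return (0);
			buf[len++] = 0xC2;
			buf[len++] = 0xCB;
			wc &= 0xFFFF;
		}
		if (!is_lead((wc >> 8) & 0xFF) || !is_trail(wc & 0xFF))
			return (0);
		buf[len++] = (unsigned char)((wc >> 8) & 0xFF);
		buf[len++] = (unsigned char)(wc & 0xFF);
	}
	if (n < len)
		return (0);
	for (i = 0; i < len; i++)
		out[i] = buf[i];
	return (len);
}
```

Staging the bytes in a local buffer before copying means the caller's buffer is never partially written when the value turns out to be invalid, which mirrors how the module fills psenc->ch before its memcpy.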
Index: user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/EUC/citrus_euc.c	(revision 281585)
@@ -1,387 +1,387 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_euc.c,v 1.14 2009/01/11 02:46:24 christos Exp $	*/
 
 /*-
  * Copyright (c)2002 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*-
  * Copyright (c) 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Paul Borman at Krystal Technologies.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_bcs.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_euc.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	int	 chlen;
 	char	 ch[3];
 } _EUCState;
 
 typedef struct {
 	wchar_t		 bits[4];
 	wchar_t		 mask;
 	unsigned	 count[4];
 	unsigned	 mb_cur_max;
 } _EUCEncodingInfo;
 
 #define	_SS2	0x008e
 #define	_SS3	0x008f
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_EUC_##m
 #define _ENCODING_INFO			_EUCEncodingInfo
 #define _ENCODING_STATE			_EUCState
 #define _ENCODING_MB_CUR_MAX(_ei_)	(_ei_)->mb_cur_max
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 
 static __inline int
 _citrus_EUC_cs(unsigned int c)
 {
 
 	c &= 0xff;
 
 	return ((c & 0x80) ? c == _SS3 ? 3 : c == _SS2 ? 2 : 1 : 0);
 }
 
 static __inline int
 _citrus_EUC_parse_variable(_EUCEncodingInfo *ei, const void *var,
     size_t lenvar __unused)
 {
 	char *e;
 	const char *v;
 	int x;
 
 	/* parse variable string */
 	if (!var)
 		return (EFTYPE);
 
 	v = (const char *)var;
 
 	while (*v == ' ' || *v == '\t')
 		++v;
 
 	ei->mb_cur_max = 1;
 	for (x = 0; x < 4; ++x) {
 		ei->count[x] = (int)_bcs_strtol(v, (char **)&e, 0);
 		if (v == e || !(v = e) || ei->count[x] < 1 || ei->count[x] > 4) {
 			return (EFTYPE);
 		}
 		if (ei->mb_cur_max < ei->count[x])
 			ei->mb_cur_max = ei->count[x];
 		while (*v == ' ' || *v == '\t')
 			++v;
 		ei->bits[x] = (int)_bcs_strtol(v, (char **)&e, 0);
 		if (v == e || !(v = e)) {
 			return (EFTYPE);
 		}
 		while (*v == ' ' || *v == '\t')
 			++v;
 	}
 	ei->mask = (int)_bcs_strtol(v, (char **)&e, 0);
 	if (v == e || !(v = e)) {
 		return (EFTYPE);
 	}
 
 	return (0);
 }
 
 
 static __inline void
 /*ARGSUSED*/
 _citrus_EUC_init_state(_EUCEncodingInfo *ei __unused, _EUCState *s)
 {
 
 	memset(s, 0, sizeof(*s));
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_EUC_pack_state(_EUCEncodingInfo *ei __unused, void *pspriv,
     const _EUCState *s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_EUC_unpack_state(_EUCEncodingInfo *ei __unused, _EUCState *s,
     const void *pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static int
-_citrus_EUC_mbrtowc_priv(_EUCEncodingInfo *ei, wchar_t *pwc, const char **s,
+_citrus_EUC_mbrtowc_priv(_EUCEncodingInfo *ei, wchar_t *pwc, char **s,
     size_t n, _EUCState *psenc, size_t *nresult)
 {
 	wchar_t wchar;
 	int c, chlenbak, cs, len;
-	const char *s0, *s1 = NULL;
+	char *s0, *s1 = NULL;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_EUC_init_state(ei, psenc);
 		*nresult = 0; /* state independent */
 		return (0);
 	}
 
 	chlenbak = psenc->chlen;
 
 	/* make sure we have the first byte in the buffer */
 	switch (psenc->chlen) {
 	case 0:
 		if (n < 1)
 			goto restart;
 		psenc->ch[0] = *s0++;
 		psenc->chlen = 1;
 		n--;
 		break;
 	case 1:
 	case 2:
 		break;
 	default:
 		/* illgeal state */
 		goto encoding_error;
 	}
 
 	c = ei->count[cs = _citrus_EUC_cs(psenc->ch[0] & 0xff)];
 	if (c == 0)
 		goto encoding_error;
 	while (psenc->chlen < c) {
 		if (n < 1)
 			goto restart;
 		psenc->ch[psenc->chlen] = *s0++;
 		psenc->chlen++;
 		n--;
 	}
 	*s = s0;
 
 	switch (cs) {
 	case 3:
 	case 2:
 		/* skip SS2/SS3 */
 		len = c - 1;
 		s1 = &psenc->ch[1];
 		break;
 	case 1:
 	case 0:
 		len = c;
 		s1 = &psenc->ch[0];
 		break;
 	default:
 		goto encoding_error;
 	}
 	wchar = 0;
 	while (len-- > 0)
 		wchar = (wchar << 8) | (*s1++ & 0xff);
 	wchar = (wchar & ~ei->mask) | ei->bits[cs];
 
 	psenc->chlen = 0;
 	if (pwc)
 		*pwc = wchar;
 	*nresult = wchar ? (size_t)(c - chlenbak) : 0;
 	return (0);
 
 encoding_error:
 	psenc->chlen = 0;
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 restart:
 	*nresult = (size_t)-2;
 	*s = s0;
 	return (0);
 }
 
 static int
 _citrus_EUC_wcrtomb_priv(_EUCEncodingInfo *ei, char *s, size_t n, wchar_t wc,
     _EUCState *psenc __unused, size_t *nresult)
 {
 	wchar_t m, nm;
 	unsigned int cs;
 	int ret;
 	short i;
 
 	m = wc & ei->mask;
 	nm = wc & ~m;
 
 	for (cs = 0; cs < sizeof(ei->count) / sizeof(ei->count[0]); cs++)
 		if (m == ei->bits[cs])
 			break;
 	/* fallback case - not sure if it is necessary */
 	if (cs == sizeof(ei->count) / sizeof(ei->count[0]))
 		cs = 1;
 
 	i = ei->count[cs];
 	if (n < (unsigned)i) {
 		ret = E2BIG;
 		goto err;
 	}
 	m = (cs) ? 0x80 : 0x00;
 	switch (cs) {
 	case 2:
 		*s++ = _SS2;
 		i--;
 		break;
 	case 3:
 		*s++ = _SS3;
 		i--;
 		break;
 	}
 
 	while (i-- > 0)
 		*s++ = ((nm >> (i << 3)) & 0xff) | m;
 
 	*nresult = (size_t)ei->count[cs];
 	return (0);
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUC_stdenc_wctocs(_EUCEncodingInfo * __restrict ei,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	wchar_t m, nm;
 
 	m = wc & ei->mask;
 	nm = wc & ~m;
 
 	*csid = (_citrus_csid_t)m;
 	*idx  = (_citrus_index_t)nm;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUC_stdenc_cstowc(_EUCEncodingInfo * __restrict ei,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if ((csid & ~ei->mask) != 0 || (idx & ei->mask) != 0)
 		return (EINVAL);
 
 	*wc = (wchar_t)csid | (wchar_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUC_stdenc_get_state_desc_generic(_EUCEncodingInfo * __restrict ei __unused,
     _EUCState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_EUC_encoding_module_init(_EUCEncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 
 	return (_citrus_EUC_parse_variable(ei, var, lenvar));
 }
 
 static void
 /*ARGSUSED*/
 _citrus_EUC_encoding_module_uninit(_EUCEncodingInfo * __restrict ei __unused)
 {
 
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(EUC);
 _CITRUS_STDENC_DEF_OPS(EUC);
 
 #include "citrus_stdenc_template.h"
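Reviewer note on citrus_euc.c (not part of the patch): _citrus_EUC_cs() picks the codeset from the first byte with a nested conditional expression. The sketch below spells out the same decision as a chain of tests, which may be easier to check by eye. euc_codeset, EUC_SS2 and EUC_SS3 are hypothetical names chosen for this note; the byte values 0x8e and 0x8f come from the _SS2 and _SS3 definitions in the file.

```c
#include <assert.h>

#define EUC_SS2	0x8e	/* single shift 2: selects codeset 2 */
#define EUC_SS3	0x8f	/* single shift 3: selects codeset 3 */

/* Return the EUC codeset (0..3) implied by the first byte c. */
static int
euc_codeset(unsigned int c)
{
	c &= 0xff;
	if ((c & 0x80) == 0)
		return (0);	/* 7-bit byte: codeset 0 */
	if (c == EUC_SS3)
		return (3);
	if (c == EUC_SS2)
		return (2);
	return (1);		/* any other 8-bit byte: codeset 1 */
}
```

The result is what the module uses to index ei->count[] and ei->bits[], so the returned value always falls within the four-entry arrays.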
Index: user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/EUCTW/citrus_euctw.c	(revision 281585)
@@ -1,379 +1,379 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_euctw.c,v 1.11 2008/06/14 16:01:07 tnozaki Exp $	*/
 
 /*-
  * Copyright (c)2002 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*-
  * Copyright (c)1999 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	$Citrus: xpg4dl/FreeBSD/lib/libc/locale/euctw.c,v 1.13 2001/06/21 01:51:44 yamt Exp $
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_euctw.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	int	 chlen;
 	char	 ch[4];
 } _EUCTWState;
 
 typedef struct {
 	int	 dummy;
 } _EUCTWEncodingInfo;
 
 #define	_SS2	0x008e
 #define	_SS3	0x008f
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_EUCTW_##m
 #define _ENCODING_INFO			_EUCTWEncodingInfo
 #define _ENCODING_STATE			_EUCTWState
 #define _ENCODING_MB_CUR_MAX(_ei_)	4
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline int
 _citrus_EUCTW_cs(unsigned int c)
 {
 
 	c &= 0xff;
 
 	return ((c & 0x80) ? (c == _SS2 ? 2 : 1) : 0);
 }
 
 static __inline int
 _citrus_EUCTW_count(int cs)
 {
 
 	switch (cs) {
 	case 0:
 		/*FALLTHROUGH*/
 	case 1:
 		/*FALLTHROUGH*/
 	case 2:
 		return (1 << cs);
 	case 3:
 		abort();
 		/*NOTREACHED*/
 	}
 	return (0);
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_EUCTW_init_state(_EUCTWEncodingInfo * __restrict ei __unused,
     _EUCTWState * __restrict s)
 {
 
 	memset(s, 0, sizeof(*s));
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_EUCTW_pack_state(_EUCTWEncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _EUCTWState * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_EUCTW_unpack_state(_EUCTWEncodingInfo * __restrict ei __unused,
     _EUCTWState * __restrict s, const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static int
 /*ARGSUSED*/
 _citrus_EUCTW_encoding_module_init(_EUCTWEncodingInfo * __restrict ei,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 
 	memset((void *)ei, 0, sizeof(*ei));
 
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_EUCTW_encoding_module_uninit(_EUCTWEncodingInfo *ei __unused)
 {
 
 }
 
 static int
 _citrus_EUCTW_mbrtowc_priv(_EUCTWEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s,
+    wchar_t * __restrict pwc, char ** __restrict s,
     size_t n, _EUCTWState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	wchar_t wchar;
 	int c, chlenbak, cs;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_EUCTW_init_state(ei, psenc);
 		*nresult = 0; /* state independent */
 		return (0);
 	}
 
 	chlenbak = psenc->chlen;
 
 	/* make sure we have the first byte in the buffer */
 	switch (psenc->chlen) {
 	case 0:
 		if (n < 1)
 			goto restart;
 		psenc->ch[0] = *s0++;
 		psenc->chlen = 1;
 		n--;
 		break;
 	case 1:
 	case 2:
 		break;
 	default:
 		/* illegal state */
 		goto ilseq;
 	}
 
 	c = _citrus_EUCTW_count(cs = _citrus_EUCTW_cs(psenc->ch[0] & 0xff));
 	if (c == 0)
 		goto ilseq;
 	while (psenc->chlen < c) {
 		if (n < 1)
 			goto ilseq;
 		psenc->ch[psenc->chlen] = *s0++;
 		psenc->chlen++;
 		n--;
 	}
 
 	wchar = 0;
 	switch (cs) {
 	case 0:
 		if (psenc->ch[0] & 0x80)
 			goto ilseq;
 		wchar = psenc->ch[0] & 0xff;
 		break;
 	case 1:
 		if (!(psenc->ch[0] & 0x80) || !(psenc->ch[1] & 0x80))
 			goto ilseq;
 		wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff);
 		wchar |= 'G' << 24;
 		break;
 	case 2:
 		if ((unsigned char)psenc->ch[1] < 0xa1 ||
 		    0xa7 < (unsigned char)psenc->ch[1])
 			goto ilseq;
 		if (!(psenc->ch[2] & 0x80) || !(psenc->ch[3] & 0x80))
 			goto ilseq;
 		wchar = ((psenc->ch[2] & 0xff) << 8) | (psenc->ch[3] & 0xff);
 		wchar |= ('G' + psenc->ch[1] - 0xa1) << 24;
 		break;
 	default:
 		goto ilseq;
 	}
 
 	*s = s0;
 	psenc->chlen = 0;
 
 	if (pwc)
 		*pwc = wchar;
 	*nresult = wchar ? c - chlenbak : 0;
 	return (0);
 
 ilseq:
 	psenc->chlen = 0;
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 restart:
 	*s = s0;
 	*nresult = (size_t)-1;
 	return (0);
 }
 
 static int
 _citrus_EUCTW_wcrtomb_priv(_EUCTWEncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, wchar_t wc,
     _EUCTWState * __restrict psenc __unused, size_t * __restrict nresult)
 {
 	wchar_t cs, v;
 	int clen, i, ret;
 	size_t len;
 
 	cs = wc & 0x7f000080;
 	clen = 1;
 	if (wc & 0x00007f00)
 		clen = 2;
 	if ((wc & 0x007f0000) && !(wc & 0x00800000))
 		clen = 3;
 
 	if (clen == 1 && cs == 0x00000000) {
 		/* ASCII */
 		len = 1;
 		if (n < len) {
 			ret = E2BIG;
 			goto err;
 		}
 		v = wc & 0x0000007f;
 	} else if (clen == 2 && cs == ('G' << 24)) {
 		/* CNS-11643-1 */
 		len = 2;
 		if (n < len) {
 			ret = E2BIG;
 			goto err;
 		}
 		v = wc & 0x00007f7f;
 		v |= 0x00008080;
 	} else if (clen == 2 && 'H' <= (cs >> 24) && (cs >> 24) <= 'M') {
 		/* CNS-11643-[2-7] */
 		len = 4;
 		if (n < len) {
 			ret = E2BIG;
 			goto err;
 		}
 		*s++ = _SS2;
 		*s++ = (cs >> 24) - 'H' + 0xa2;
 		v = wc & 0x00007f7f;
 		v |= 0x00008080;
 	} else {
 		ret = EILSEQ;
 		goto err;
 	}
 
 	i = clen;
 	while (i-- > 0)
 		*s++ = (v >> (i << 3)) & 0xff;
 
 	*nresult = len;
 	return (0);
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUCTW_stdenc_wctocs(_EUCTWEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = (_csid_t)(wc >> 24) & 0xFF;
 	*idx  = (_index_t)(wc & 0x7F7F);
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUCTW_stdenc_cstowc(_EUCTWEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid == 0) {
 		if ((idx & ~0x7F) != 0)
 			return (EINVAL);
 		*wc = (wchar_t)idx;
 	} else {
 		if (csid < 'G' || csid > 'M' || (idx & ~0x7F7F) != 0)
 			return (EINVAL);
 		*wc = (wchar_t)idx | ((wchar_t)csid << 24);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_EUCTW_stdenc_get_state_desc_generic(_EUCTWEncodingInfo * __restrict ei __unused,
     _EUCTWState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(EUCTW);
 _CITRUS_STDENC_DEF_OPS(EUCTW);
 
 #include "citrus_stdenc_template.h"
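
The code-set dispatch in _citrus_EUCTW_mbrtowc_priv above can be read as a small one-shot decoder. The following is a minimal sketch only, not the module's code: the name euctw_decode_sketch is hypothetical, it takes a whole buffer instead of carrying the incremental _EUCTWState, and it uses uint32_t in place of wchar_t. It follows the same packing as the function above: 'G' plus the plane offset in the top byte, and the two data bytes stored with their high bits intact.

```c
#include <stddef.h>
#include <stdint.h>

#define SS2	0x8e	/* single shift 2, same value as _SS2 above */

/*
 * Hypothetical helper: decode one EUC-TW character from p[0..n-1].
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or invalid.
 */
static size_t
euctw_decode_sketch(const unsigned char *p, size_t n, uint32_t *out)
{

	if (n < 1)
		return (0);
	if (!(p[0] & 0x80)) {
		/* code set 0: ASCII, one byte */
		*out = p[0];
		return (1);
	}
	if (p[0] != SS2) {
		/* code set 1: CNS 11643 plane 1, two bytes */
		if (n < 2 || !(p[1] & 0x80))
			return (0);
		*out = ((uint32_t)'G' << 24) | ((uint32_t)p[0] << 8) | p[1];
		return (2);
	}
	/* code set 2: SS2, plane byte 0xa1..0xa7, then two data bytes */
	if (n < 4 || p[1] < 0xa1 || p[1] > 0xa7 ||
	    !(p[2] & 0x80) || !(p[3] & 0x80))
		return (0);
	*out = ((uint32_t)('G' + p[1] - 0xa1) << 24) |
	    ((uint32_t)p[2] << 8) | p[3];
	return (4);
}
```

The byte counts 1, 2 and 4 correspond to the `1 << cs` result of _citrus_EUCTW_count. Unlike the real function, the sketch cannot resume a character that was split across two calls.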
Index: user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/GBK2K/citrus_gbk2k.c	(revision 281585)
@@ -1,419 +1,419 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_gbk2k.c,v 1.7 2008/06/14 16:01:07 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_gbk2k.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct _GBK2KState {
 	int	 chlen;
 	char	 ch[4];
 } _GBK2KState;
 
 typedef struct {
 	int	 mb_cur_max;
 } _GBK2KEncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_GBK2K_##m
 #define _ENCODING_INFO			_GBK2KEncodingInfo
 #define _ENCODING_STATE			_GBK2KState
 #define _ENCODING_MB_CUR_MAX(_ei_)	(_ei_)->mb_cur_max
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline void
 /*ARGSUSED*/
 _citrus_GBK2K_init_state(_GBK2KEncodingInfo * __restrict ei __unused,
     _GBK2KState * __restrict s)
 {
 
 	memset(s, 0, sizeof(*s));
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_GBK2K_pack_state(_GBK2KEncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _GBK2KState * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_GBK2K_unpack_state(_GBK2KEncodingInfo * __restrict ei __unused,
     _GBK2KState * __restrict s, const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static  __inline bool
 _mb_singlebyte(int c)
 {
 
 	return ((c & 0xff) <= 0x7f);
 }
 
 static __inline bool
 _mb_leadbyte(int c)
 {
 
 	c &= 0xff;
 	return (0x81 <= c && c <= 0xfe);
 }
 
 static __inline bool
 _mb_trailbyte(int c)
 {
 
 	c &= 0xff;
 	return ((0x40 <= c && c <= 0x7e) || (0x80 <= c && c <= 0xfe));
 }
 
 static __inline bool
 _mb_surrogate(int c)
 {
 
 	c &= 0xff;
 	return (0x30 <= c && c <= 0x39);
 }
 
 static __inline int
 _mb_count(wchar_t v)
 {
 	uint32_t c;
 
 	c = (uint32_t)v; /* XXX */
 	if (!(c & 0xffffff00))
 		return (1);
 	if (!(c & 0xffff0000))
 		return (2);
 	return (4);
 }
 
 #define	_PSENC		(psenc->ch[psenc->chlen - 1])
 #define	_PUSH_PSENC(c)	(psenc->ch[psenc->chlen++] = (c))
 
 static int
 _citrus_GBK2K_mbrtowc_priv(_GBK2KEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _GBK2KState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0, *s1;
+	char *s0, *s1;
 	wchar_t wc;
 	int chlenbak, len;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		/* _citrus_GBK2K_init_state(ei, psenc); */
 		psenc->chlen = 0;
 		*nresult = 0;
 		return (0);
 	}
 
 	chlenbak = psenc->chlen;
 
 	switch (psenc->chlen) {
 	case 3:
 		if (!_mb_leadbyte (_PSENC))
 			goto invalid;
 	/* FALLTHROUGH */
 	case 2:
 		if (!_mb_surrogate(_PSENC) || _mb_trailbyte(_PSENC))
 			goto invalid;
 	/* FALLTHROUGH */
 	case 1:
 		if (!_mb_leadbyte (_PSENC))
 			goto invalid;
 	/* FALLTHROUGH */
 	case 0:
 		break;
 	default:
 		goto invalid;
 	}
 
 	for (;;) {
 		if (n-- < 1)
 			goto restart;
 
 		_PUSH_PSENC(*s0++);
 
 		switch (psenc->chlen) {
 		case 1:
 			if (_mb_singlebyte(_PSENC))
 				goto convert;
 			if (_mb_leadbyte  (_PSENC))
 				continue;
 			goto ilseq;
 		case 2:
 			if (_mb_trailbyte (_PSENC))
 				goto convert;
 			if (ei->mb_cur_max == 4 &&
 			    _mb_surrogate (_PSENC))
 				continue;
 			goto ilseq;
 		case 3:
 			if (_mb_leadbyte  (_PSENC))
 				continue;
 			goto ilseq;
 		case 4:
 			if (_mb_surrogate (_PSENC))
 				goto convert;
 			goto ilseq;
 		}
 	}
 
 convert:
 	len = psenc->chlen;
 	s1  = &psenc->ch[0];
 	wc  = 0;
 	while (len-- > 0)
 		wc = (wc << 8) | (*s1++ & 0xff);
 
 	if (pwc != NULL)
 		*pwc = wc;
 	*s = s0;
 	*nresult = (wc == 0) ? 0 : psenc->chlen - chlenbak;
 	/* _citrus_GBK2K_init_state(ei, psenc); */
 	psenc->chlen = 0;
 
 	return (0);
 
 restart:
 	*s = s0;
 	*nresult = (size_t)-2;
 
 	return (0);
 
 invalid:
 	return (EINVAL);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static int
 _citrus_GBK2K_wcrtomb_priv(_GBK2KEncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wc, _GBK2KState * __restrict psenc,
     size_t * __restrict nresult)
 {
 	size_t len;
 	int ret;
 
 	if (psenc->chlen != 0) {
 		ret = EINVAL;
 		goto err;
 	}
 
 	len = _mb_count(wc);
 	if (n < len) {
 		ret = E2BIG;
 		goto err;
 	}
 
 	switch (len) {
 	case 1:
 		if (!_mb_singlebyte(_PUSH_PSENC(wc     ))) {
 			ret = EILSEQ;
 			goto err;
 		}
 		break;
 	case 2:
 		if (!_mb_leadbyte  (_PUSH_PSENC(wc >> 8)) ||
 		    !_mb_trailbyte (_PUSH_PSENC(wc))) {
 			ret = EILSEQ;
 			goto err;
 		}
 		break;
 	case 4:
 		if (ei->mb_cur_max != 4 ||
 		    !_mb_leadbyte  (_PUSH_PSENC(wc >> 24)) ||
 		    !_mb_surrogate (_PUSH_PSENC(wc >> 16)) ||
 		    !_mb_leadbyte  (_PUSH_PSENC(wc >>  8)) ||
 		    !_mb_surrogate (_PUSH_PSENC(wc))) {
 			ret = EILSEQ;
 			goto err;
 		}
 		break;
 	}
 
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	/* _citrus_GBK2K_init_state(ei, psenc); */
 	psenc->chlen = 0;
 
 	return (0);
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_GBK2K_stdenc_wctocs(_GBK2KEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	uint8_t ch, cl;
 
 	if ((uint32_t)wc < 0x80) {
 		/* ISO646 */
 		*csid = 0;
 		*idx = (_index_t)wc;
 	} else if ((uint32_t)wc >= 0x10000) {
 		/* GBKUCS : XXX */
 		*csid = 3;
 		*idx = (_index_t)wc;
 	} else {
 		ch = (uint8_t)(wc >> 8);
 		cl = (uint8_t)wc;
 		if (ch >= 0xA1 && cl >= 0xA1) {
 			/* EUC G1 */
 			*csid = 1;
 			*idx = (_index_t)wc & 0x7F7FU;
 		} else {
 			/* extended area (0x8140-) */
 			*csid = 2;
 			*idx = (_index_t)wc;
 		}
 	}
 		
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_GBK2K_stdenc_cstowc(_GBK2KEncodingInfo * __restrict ei,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	switch (csid) {
 	case 0:
 		/* ISO646 */
 		*wc = (wchar_t)idx;
 		break;
 	case 1:
 		/* EUC G1 */
 		*wc = (wchar_t)idx | 0x8080U;
 		break;
 	case 2:
 		/* extended area */
 		*wc = (wchar_t)idx;
 		break;
 	case 3:
 		/* GBKUCS : XXX */
 		if (ei->mb_cur_max != 4)
 			return (EINVAL);
 		*wc = (wchar_t)idx;
 		break;
 	default:
 		return (EILSEQ);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_GBK2K_stdenc_get_state_desc_generic(_GBK2KEncodingInfo * __restrict ei __unused,
     _GBK2KState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_GBK2K_encoding_module_init(_GBK2KEncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	const char *p;
 
 	p = var;
 	memset((void *)ei, 0, sizeof(*ei));
 	ei->mb_cur_max = 4;
 	while (lenvar > 0) {
 		switch (_bcs_tolower(*p)) {
 		case '2':
 			MATCH("2byte", ei->mb_cur_max = 2);
 			break;
 		}
 		p++;
 		lenvar--;
 	}
 
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_GBK2K_encoding_module_uninit(_GBK2KEncodingInfo *ei __unused)
 {
 
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(GBK2K);
 _CITRUS_STDENC_DEF_OPS(GBK2K);
 
 #include "citrus_stdenc_template.h"
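
The byte classification that drives the loop in _citrus_GBK2K_mbrtowc_priv above can be summarized as a length check over a complete buffer. This is a sketch under assumptions, not the module's implementation: gbk2k_seq_len_sketch and the is_* helpers are hypothetical names, and the function inspects a flat buffer instead of pushing bytes into _GBK2KState. The byte ranges are copied from _mb_singlebyte, _mb_leadbyte, _mb_trailbyte and _mb_surrogate.

```c
#include <stdbool.h>
#include <stddef.h>

static bool is_single(unsigned c) { return (c <= 0x7f); }
static bool is_lead(unsigned c) { return (0x81 <= c && c <= 0xfe); }
static bool is_digit_byte(unsigned c) { return (0x30 <= c && c <= 0x39); }

static bool
is_trail(unsigned c)
{

	return ((0x40 <= c && c <= 0x7e) || (0x80 <= c && c <= 0xfe));
}

/*
 * Hypothetical helper: classify the sequence starting at p.
 * Returns 1, 2 or 4 for a complete character, 0 for an invalid
 * sequence, and -1 if more input is needed.
 */
static int
gbk2k_seq_len_sketch(const unsigned char *p, size_t n, int mb_cur_max)
{

	if (n < 1)
		return (-1);
	if (is_single(p[0]))
		return (1);
	if (!is_lead(p[0]))
		return (0);
	if (n < 2)
		return (-1);
	if (is_trail(p[1]))
		return (2);
	/* four-byte form: lead, digit, lead, digit; only if enabled */
	if (mb_cur_max != 4 || !is_digit_byte(p[1]))
		return (0);
	if (n < 3)
		return (-1);
	if (!is_lead(p[2]))
		return (0);
	if (n < 4)
		return (-1);
	return (is_digit_byte(p[3]) ? 4 : 0);
}
```

Passing mb_cur_max as 2 mirrors the "2byte" variable handled in _citrus_GBK2K_encoding_module_init, which turns off the four-byte form.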
Index: user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/HZ/citrus_hz.c	(revision 281585)
@@ -1,648 +1,648 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_hz.c,v 1.2 2008/06/14 16:01:07 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2004, 2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 #include <sys/queue.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 
 #include "citrus_hz.h"
 #include "citrus_prop.h"
 
 /*
  * wchar_t mapping:
  *
  * CTRL/ASCII	00000000 00000000 00000000 gxxxxxxx
  * GB2312	00000000 00000000 0xxxxxxx gxxxxxxx
  * 94/96*n (~M)	0mmmmmmm 0xxxxxxx 0xxxxxxx gxxxxxxx
  */
 
 #define ESCAPE_CHAR	'~'
 
 typedef enum {
 	CTRL = 0, ASCII = 1, GB2312 = 2, CS94 = 3, CS96 = 4
 } charset_t;
 
 typedef struct {
 	int	 start;
 	int	 end;
 	int	 width;
 } range_t;
 
 static const range_t ranges[] = {
 #define RANGE(start, end) { start, end, (end - start) + 1 }
 /* CTRL   */ RANGE(0x00, 0x1F),
 /* ASCII  */ RANGE(0x20, 0x7F),
 /* GB2312 */ RANGE(0x21, 0x7E),
 /* CS94   */ RANGE(0x21, 0x7E),
 /* CS96   */ RANGE(0x20, 0x7F),
 #undef RANGE
 };
 
 typedef struct escape_t escape_t;
 typedef struct {
 	charset_t	 charset;
 	escape_t	*escape;
 	ssize_t		 length;
 #define ROWCOL_MAX	3
 } graphic_t;
 
 typedef TAILQ_HEAD(escape_list, escape_t) escape_list;
 struct escape_t {
 	TAILQ_ENTRY(escape_t)	 entry;
 	escape_list		*set;
 	graphic_t		*left;
 	graphic_t		*right;
 	int			 ch;
 };
 
 #define GL(escape)	((escape)->left)
 #define GR(escape)	((escape)->right)
 #define SET(escape)	((escape)->set)
 #define ESC(escape)	((escape)->ch)
 #define INIT(escape)	(TAILQ_FIRST(SET(escape)))
 
 static __inline escape_t *
 find_escape(escape_list *set, int ch)
 {
 	escape_t *escape;
 
 	TAILQ_FOREACH(escape, set, entry) {
 		if (ESC(escape) == ch)
 			break;
 	}
 
 	return (escape);
 }
 
 typedef struct {
 	escape_list	 e0;
 	escape_list	 e1;
 	graphic_t	*ascii;
 	graphic_t	*gb2312;
 } _HZEncodingInfo;
 
 #define E0SET(ei)	(&(ei)->e0)
 #define E1SET(ei)	(&(ei)->e1)
 #define INIT0(ei)	(TAILQ_FIRST(E0SET(ei)))
 #define INIT1(ei)	(TAILQ_FIRST(E1SET(ei)))
 
 typedef struct {
 	escape_t	*inuse;
 	int		 chlen;
 	char		 ch[ROWCOL_MAX];
 } _HZState;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_HZ_##m
 #define _ENCODING_INFO			_HZEncodingInfo
 #define _ENCODING_STATE			_HZState
 #define _ENCODING_MB_CUR_MAX(_ei_)	MB_LEN_MAX
 #define _ENCODING_IS_STATE_DEPENDENT		1
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	((_ps_)->inuse == NULL)
 
 static __inline void
 _citrus_HZ_init_state(_HZEncodingInfo * __restrict ei,
     _HZState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 	psenc->inuse = INIT0(ei);
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_HZ_pack_state(_HZEncodingInfo * __restrict ei __unused,
     void *__restrict pspriv, const _HZState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_HZ_unpack_state(_HZEncodingInfo * __restrict ei __unused,
     _HZState * __restrict psenc, const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static int
 _citrus_HZ_mbrtowc_priv(_HZEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _HZState * __restrict psenc, size_t * __restrict nresult)
 {
 	escape_t *candidate, *init;
 	graphic_t *graphic;
 	const range_t *range;
-	const char *s0;
+	char *s0;
 	wchar_t wc;
 	int bit, ch, head, len, tail;
 
 	if (*s == NULL) {
 		_citrus_HZ_init_state(ei, psenc);
 		*nresult = 1;
 		return (0);
 	}
 	s0 = *s;
 	if (psenc->chlen < 0 || psenc->inuse == NULL)
 		return (EINVAL);
 
 	wc = (wchar_t)0;
 	bit = head = tail = 0;
 	graphic = NULL;
 	for (len = 0; len <= MB_LEN_MAX;) {
 		if (psenc->chlen == tail) {
 			if (n-- < 1) {
 				*s = s0;
 				*nresult = (size_t)-2;
 				return (0);
 			}
 			psenc->ch[psenc->chlen++] = *s0++;
 			++len;
 		}
 		ch = (unsigned char)psenc->ch[tail++];
 		if (tail == 1) {
 			if ((ch & ~0x80) <= 0x1F) {
 				if (psenc->inuse != INIT0(ei))
 					break;
 				wc = (wchar_t)ch;
 				goto done;
 			}
 			if (ch & 0x80) {
 				graphic = GR(psenc->inuse);
 				bit = 0x80;
 				ch &= ~0x80;
 			} else {
 				graphic = GL(psenc->inuse);
 				if (ch == ESCAPE_CHAR)
 					continue;
 				bit = 0x0;
 			}
 			if (graphic == NULL)
 				break;
 		} else if (tail == 2 && psenc->ch[0] == ESCAPE_CHAR) {
 			if (tail < psenc->chlen)
 				return (EINVAL);
 			if (ch == ESCAPE_CHAR) {
 				++head;
 			} else if (ch == '\n') {
 				if (psenc->inuse != INIT0(ei))
 					break;
 				tail = psenc->chlen = 0;
 				continue;
 			} else {
 				candidate = NULL;
 				init = INIT0(ei);
 				if (psenc->inuse == init) {
 					init = INIT1(ei);
 				} else if (INIT(psenc->inuse) == init) {
 					if (ESC(init) != ch)
 						break;
 					candidate = init;
 				}
 				if (candidate == NULL) {
 					candidate = find_escape(
 					    SET(psenc->inuse), ch);
 					if (candidate == NULL) {
 						if (init == NULL ||
 						    ESC(init) != ch)
 							break;
 						candidate = init;
 					}
 				}
 				psenc->inuse = candidate;
 				tail = psenc->chlen = 0;
 				continue;
 			}
 		} else if (ch & 0x80) {
 			if (graphic != GR(psenc->inuse))
 				break;
 			ch &= ~0x80;
 		} else {
 			if (graphic != GL(psenc->inuse))
 				break;
 		}
 		range = &ranges[(size_t)graphic->charset];
 		if (range->start > ch || range->end < ch)
 			break;
 		wc <<= 8;
 		wc |= ch;
 		if (graphic->length == (tail - head)) {
 			if (graphic->charset > GB2312)
 				bit |= ESC(psenc->inuse) << 24;
 			wc |= bit;
 			goto done;
 		}
 	}
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 done:
 	if (tail < psenc->chlen)
 		return (EINVAL);
 	*s = s0;
 	if (pwc != NULL)
 		*pwc = wc;
 	psenc->chlen = 0;
 	*nresult = (wc == 0) ? 0 : len;
 
 	return (0);
 }
 
 static int
 _citrus_HZ_wcrtomb_priv(_HZEncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wc,
     _HZState * __restrict psenc, size_t * __restrict nresult)
 {
 	escape_t *candidate, *init;
 	graphic_t *graphic;
 	const range_t *range;
 	size_t len;
 	int bit, ch;
 
 	if (psenc->chlen != 0 || psenc->inuse == NULL)
 		return (EINVAL);
 	if (wc & 0x80) {
 		bit = 0x80;
 		wc &= ~0x80;
 	} else {
 		bit = 0x0;
 	}
 	if ((uint32_t)wc <= 0x1F) {
 		candidate = INIT0(ei);
 		graphic = (bit == 0) ? candidate->left : candidate->right;
 		if (graphic == NULL)
 			goto ilseq;
 		range = &ranges[(size_t)CTRL];
 		len = 1;
 	} else if ((uint32_t)wc <= 0x7F) {
 		graphic = ei->ascii;
 		if (graphic == NULL)
 			goto ilseq;
 		candidate = graphic->escape;
 		range = &ranges[(size_t)graphic->charset];
 		len = graphic->length;
 	} else if ((uint32_t)wc <= 0x7F7F) {
 		graphic = ei->gb2312;
 		if (graphic == NULL)
 			goto ilseq;
 		candidate = graphic->escape;
 		range = &ranges[(size_t)graphic->charset];
 		len = graphic->length;
 	} else {
 		ch = (wc >> 24) & 0xFF;
 		candidate = find_escape(E0SET(ei), ch);
 		if (candidate == NULL) {
 			candidate = find_escape(E1SET(ei), ch);
 			if (candidate == NULL)
 				goto ilseq;
 		}
 		wc &= ~0xFF000000;
 		graphic = (bit == 0) ? candidate->left : candidate->right;
 		if (graphic == NULL)
 			goto ilseq;
 		range = &ranges[(size_t)graphic->charset];
 		len = graphic->length;
 	}
 	if (psenc->inuse != candidate) {
 		init = INIT0(ei);
 		if (SET(psenc->inuse) == SET(candidate)) {
 			if (INIT(psenc->inuse) != init ||
 			    psenc->inuse == init || candidate == init)
 				init = NULL;
 		} else if (candidate == (init = INIT(candidate))) {
 			init = NULL;
 		}
 		if (init != NULL) {
 			if (n < 2)
 				return (E2BIG);
 			n -= 2;
 			psenc->ch[psenc->chlen++] = ESCAPE_CHAR;
 			psenc->ch[psenc->chlen++] = ESC(init);
 		}
 		if (n < 2)
 			return (E2BIG);
 		n -= 2;
 		psenc->ch[psenc->chlen++] = ESCAPE_CHAR;
 		psenc->ch[psenc->chlen++] = ESC(candidate);
 		psenc->inuse = candidate;
 	}
 	if (n < len)
 		return (E2BIG);
 	while (len-- > 0) {
 		ch = (wc >> (len * 8)) & 0xFF;
 		if (range->start > ch || range->end < ch)
 			goto ilseq;
 		psenc->ch[psenc->chlen++] = ch | bit;
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	psenc->chlen = 0;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static __inline int
 _citrus_HZ_put_state_reset(_HZEncodingInfo * __restrict ei,
     char * __restrict s, size_t n, _HZState * __restrict psenc,
     size_t * __restrict nresult)
 {
 	escape_t *candidate;
 
 	if (psenc->chlen != 0 || psenc->inuse == NULL)
 		return (EINVAL);
 	candidate = INIT0(ei);
 	if (psenc->inuse != candidate) {
 		if (n < 2)
 			return (E2BIG);
 		n -= 2;
 		psenc->ch[psenc->chlen++] = ESCAPE_CHAR;
 		psenc->ch[psenc->chlen++] = ESC(candidate);
 	}
 	if (n < 1)
 		return (E2BIG);
 	if (psenc->chlen > 0)
 		memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	_citrus_HZ_init_state(ei, psenc);
 
 	return (0);
 }
 
 static __inline int
 _citrus_HZ_stdenc_get_state_desc_generic(_HZEncodingInfo * __restrict ei,
     _HZState * __restrict psenc, int * __restrict rstate)
 {
 
 	if (psenc->chlen < 0 || psenc->inuse == NULL)
 		return (EINVAL);
 	*rstate = (psenc->chlen == 0)
 	    ? ((psenc->inuse == INIT0(ei))
 	        ? _STDENC_SDGEN_INITIAL
 	        : _STDENC_SDGEN_STABLE)
 	    : ((psenc->ch[0] == ESCAPE_CHAR)
 	        ? _STDENC_SDGEN_INCOMPLETE_SHIFT
 	        : _STDENC_SDGEN_INCOMPLETE_CHAR);
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_HZ_stdenc_wctocs(_HZEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	int bit;
 
 	if (wc & 0x80) {
 		bit = 0x80;
 		wc &= ~0x80;
 	} else
 		bit = 0x0;
 	if ((uint32_t)wc <= 0x7F) {
 		*csid = (_csid_t)bit;
 		*idx = (_index_t)wc;
 	} else if ((uint32_t)wc <= 0x7F7F) {
 		*csid = (_csid_t)(bit | 0x8000);
 		*idx = (_index_t)wc;
 	} else {
 		*csid = (_index_t)(wc & ~0x00FFFF7F);
 		*idx = (_csid_t)(wc & 0x00FFFF7F);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_HZ_stdenc_cstowc(_HZEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	*wc = (wchar_t)idx;
 	switch (csid) {
 	case 0x80:
 	case 0x8080:
 		*wc |= (wchar_t)0x80;
 		/*FALLTHROUGH*/
 	case 0x0:
 	case 0x8000:
 		break;
 	default:
 		*wc |= (wchar_t)csid;
 	}
 
 	return (0);
 }
 
 static void
 _citrus_HZ_encoding_module_uninit(_HZEncodingInfo *ei)
 {
 	escape_t *escape;
 
 	while ((escape = TAILQ_FIRST(E0SET(ei))) != NULL) {
 		TAILQ_REMOVE(E0SET(ei), escape, entry);
 		free(GL(escape));
 		free(GR(escape));
 		free(escape);
 	}
 	while ((escape = TAILQ_FIRST(E1SET(ei))) != NULL) {
 		TAILQ_REMOVE(E1SET(ei), escape, entry);
 		free(GL(escape));
 		free(GR(escape));
 		free(escape);
 	}
 }
 
 static int
 _citrus_HZ_parse_char(void *context, const char *name __unused, const char *s)
 {
 	escape_t *escape;
 	void **p;
 
 	p = (void **)context;
 	escape = (escape_t *)p[0];
 	if (escape->ch != '\0')
 		return (EINVAL);
 	escape->ch = *s++;
 	if (escape->ch == ESCAPE_CHAR || *s != '\0')
 		return (EINVAL);
 
 	return (0);
 }
 
 static int
 _citrus_HZ_parse_graphic(void *context, const char *name, const char *s)
 {
 	_HZEncodingInfo *ei;
 	escape_t *escape;
 	graphic_t *graphic;
 	void **p;
 
 	p = (void **)context;
 	escape = (escape_t *)p[0];
 	ei = (_HZEncodingInfo *)p[1];
 	graphic = calloc(1, sizeof(*graphic));
 	if (graphic == NULL)
 		return (ENOMEM);
 	if (strcmp("GL", name) == 0) {
 		if (GL(escape) != NULL)
 			goto release;
 		GL(escape) = graphic;
 	} else if (strcmp("GR", name) == 0) {
 		if (GR(escape) != NULL)
 			goto release;
 		GR(escape) = graphic;
 	} else {
 release:
 		free(graphic);
 		return (EINVAL);
 	}
 	graphic->escape = escape;
 	if (_bcs_strncasecmp("ASCII", s, 5) == 0) {
 		if (s[5] != '\0')
 			return (EINVAL);
 		graphic->charset = ASCII;
 		graphic->length = 1;
 		ei->ascii = graphic;
 		return (0);
 	} else if (_bcs_strncasecmp("GB2312", s, 6) == 0) {
 		if (s[6] != '\0')
 			return (EINVAL);
 		graphic->charset = GB2312;
 		graphic->length = 2;
 		ei->gb2312 = graphic;
 		return (0);
 	} else if (strncmp("94*", s, 3) == 0)
 		graphic->charset = CS94;
 	else if (strncmp("96*", s, 3) == 0)
 		graphic->charset = CS96;
 	else
 		return (EINVAL);
 	s += 3;
 	switch(*s) {
 	case '1': case '2': case '3':
 		graphic->length = (size_t)(*s - '0');
 		if (*++s == '\0')
 			break;
 	/*FALLTHROUGH*/
 	default:
 		return (EINVAL);
 	}
 	return (0);
 }
 
 static const _citrus_prop_hint_t escape_hints[] = {
 _CITRUS_PROP_HINT_STR("CH", &_citrus_HZ_parse_char),
 _CITRUS_PROP_HINT_STR("GL", &_citrus_HZ_parse_graphic),
 _CITRUS_PROP_HINT_STR("GR", &_citrus_HZ_parse_graphic),
 _CITRUS_PROP_HINT_END
 };
 
 static int
 _citrus_HZ_parse_escape(void *context, const char *name, const char *s)
 {
 	_HZEncodingInfo *ei;
 	escape_t *escape;
 	void *p[2];
 
 	ei = (_HZEncodingInfo *)context;
 	escape = calloc(1, sizeof(*escape));
 	if (escape == NULL)
 		return (EINVAL);
 	if (strcmp("0", name) == 0) {
 		escape->set = E0SET(ei);
 		TAILQ_INSERT_TAIL(E0SET(ei), escape, entry);
 	} else if (strcmp("1", name) == 0) {
 		escape->set = E1SET(ei);
 		TAILQ_INSERT_TAIL(E1SET(ei), escape, entry);
 	} else {
 		free(escape);
 		return (EINVAL);
 	}
 	p[0] = (void *)escape;
 	p[1] = (void *)ei;
 	return (_citrus_prop_parse_variable(
 	    escape_hints, (void *)&p[0], s, strlen(s)));
 }
 
 static const _citrus_prop_hint_t root_hints[] = {
 _CITRUS_PROP_HINT_STR("0", &_citrus_HZ_parse_escape),
 _CITRUS_PROP_HINT_STR("1", &_citrus_HZ_parse_escape),
 _CITRUS_PROP_HINT_END
 };
 
 static int
 _citrus_HZ_encoding_module_init(_HZEncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	int errnum;
 
 	memset(ei, 0, sizeof(*ei));
 	TAILQ_INIT(E0SET(ei));
 	TAILQ_INIT(E1SET(ei));
 	errnum = _citrus_prop_parse_variable(
 	    root_hints, (void *)ei, var, lenvar);
 	if (errnum != 0)
 		_citrus_HZ_encoding_module_uninit(ei);
 	return (errnum);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(HZ);
 _CITRUS_STDENC_DEF_OPS(HZ);
 
 #include "citrus_stdenc_template.h"
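
The escape handling in _citrus_HZ_mbrtowc_priv above is table driven: the escape characters and their GL/GR sets come from the module variable parsed in _citrus_HZ_encoding_module_init. As a rough illustration of the simplest configuration, the sketch below hard-codes the conventional HZ escapes from RFC 1843 ("~{" enters GB2312, "~}" returns to ASCII, "~~" is a literal tilde, "~" followed by a newline is a line continuation). That fixed table is an assumption of the sketch, as are the name hz_decode_sketch and the use of uint32_t in place of wchar_t. It is not a substitute for the state machine above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a NUL-terminated HZ string into out[].
 * Returns the number of values written, or (size_t)-1 on malformed
 * input or when out[] is full.
 */
static size_t
hz_decode_sketch(const char *in, uint32_t *out, size_t cap)
{
	size_t n = 0;
	int gb = 0;		/* 0: ASCII mode, 1: GB2312 mode */

	while (*in != '\0') {
		unsigned char c = (unsigned char)*in++;
		uint32_t wc;

		if (c == '~') {
			unsigned char e = (unsigned char)*in;

			if (e == '\0')
				return ((size_t)-1);
			in++;
			if (gb) {
				if (e != '}')
					return ((size_t)-1);
				gb = 0;
				continue;
			}
			if (e == '{') {
				gb = 1;
				continue;
			}
			if (e == '\n')
				continue;	/* line continuation */
			if (e != '~')
				return ((size_t)-1);
			wc = '~';
		} else if (gb) {
			unsigned char d = (unsigned char)*in;

			/* both bytes must lie in the GB2312 range 0x21..0x7e */
			if (c < 0x21 || c > 0x7e || d < 0x21 || d > 0x7e)
				return ((size_t)-1);
			in++;
			wc = ((uint32_t)c << 8) | d;
		} else {
			if (c & 0x80)
				return ((size_t)-1);
			wc = c;
		}
		if (n == cap)
			return ((size_t)-1);
		out[n++] = wc;
	}
	return (n);
}
```

The two-byte values follow the "GB2312" row of the wchar_t mapping comment at the top of citrus_hz.c, with the g bit clear because the sketch only handles the GL (7-bit) form.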
Index: user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/ISO2022/citrus_iso2022.c	(revision 281585)
@@ -1,1291 +1,1291 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_iso2022.c,v 1.20 2010/12/07 22:01:45 joerg Exp $	*/
 
 /*-
  * Copyright (c)1999, 2002 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	$Citrus: xpg4dl/FreeBSD/lib/libc/locale/iso2022.c,v 1.23 2001/06/21 01:51:44 yamt Exp $
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_iso2022.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 
 /*
  * wchar_t mappings:
  * ASCII (ESC ( B)		00000000 00000000 00000000 0xxxxxxx
  * iso-8859-1 (ESC , A)		00000000 00000000 00000000 1xxxxxxx
  * 94 charset (ESC ( F)		0fffffff 00000000 00000000 0xxxxxxx
  * 94 charset (ESC ( M F)	0fffffff 1mmmmmmm 00000000 0xxxxxxx
  * 96 charset (ESC , F)		0fffffff 00000000 00000000 1xxxxxxx
  * 96 charset (ESC , M F)	0fffffff 1mmmmmmm 00000000 1xxxxxxx
  * 94x94 charset (ESC $ ( F)	0fffffff 00000000 0xxxxxxx 0xxxxxxx
  * 96x96 charset (ESC $ , F)	0fffffff 00000000 0xxxxxxx 1xxxxxxx
  * 94x94 charset (ESC & V ESC $ ( F)
  *				0fffffff 1vvvvvvv 0xxxxxxx 0xxxxxxx
  * 94x94x94 charset (ESC $ ( F)	0fffffff 0xxxxxxx 0xxxxxxx 0xxxxxxx
  * 96x96x96 charset (ESC $ , F)	0fffffff 0xxxxxxx 0xxxxxxx 1xxxxxxx
  * reserved for UCS4 co-existence (UCS4 is 31bit encoding thanks to mohta bit)
  *				1xxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
  */
 
 #define CS94		(0U)
 #define CS96		(1U)
 #define CS94MULTI	(2U)
 #define CS96MULTI	(3U)
 
 typedef struct {
 	unsigned char	 type;
 	unsigned char	 final;
 	unsigned char	 interm;
 	unsigned char	 vers;
 } _ISO2022Charset;
 
 static const _ISO2022Charset ascii    = { CS94, 'B', '\0', '\0' };
 static const _ISO2022Charset iso88591 = { CS96, 'A', '\0', '\0' };
 
 typedef struct {
 	_ISO2022Charset	g[4];
 	/* need 3 bits to hold -1, 0, ..., 3 */
 	int	gl:3,
 		gr:3,
 		singlegl:3,
 		singlegr:3;
 	char ch[7];	/* longest escape sequence (ESC & V ESC $ ( F) */
 	size_t chlen;
 	int flags;
 #define _ISO2022STATE_FLAG_INITIALIZED	1
 } _ISO2022State;
 
 typedef struct {
 	_ISO2022Charset	*recommend[4];
 	size_t	recommendsize[4];
 	_ISO2022Charset	initg[4];
 	int	maxcharset;
 	int	flags;
 #define	F_8BIT	0x0001
 #define	F_NOOLD	0x0002
 #define	F_SI	0x0010	/*0F*/
 #define	F_SO	0x0020	/*0E*/
 #define	F_LS0	0x0010	/*0F*/
 #define	F_LS1	0x0020	/*0E*/
 #define	F_LS2	0x0040	/*ESC n*/
 #define	F_LS3	0x0080	/*ESC o*/
 #define	F_LS1R	0x0100	/*ESC ~*/
 #define	F_LS2R	0x0200	/*ESC }*/
 #define	F_LS3R	0x0400	/*ESC |*/
 #define	F_SS2	0x0800	/*ESC N*/
 #define	F_SS3	0x1000	/*ESC O*/
 #define	F_SS2R	0x2000	/*8E*/
 #define	F_SS3R	0x4000	/*8F*/
 } _ISO2022EncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_ISO2022_##m
 #define _ENCODING_INFO			_ISO2022EncodingInfo
 #define _ENCODING_STATE			_ISO2022State
 #define _ENCODING_MB_CUR_MAX(_ei_)	MB_LEN_MAX
 #define _ENCODING_IS_STATE_DEPENDENT	1
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	\
     (!((_ps_)->flags & _ISO2022STATE_FLAG_INITIALIZED))
 
 
 #define _ISO2022INVALID (wchar_t)-1
 
 static __inline bool isc0(__uint8_t x)
 {
 
 	return ((x & 0x1f) == x);
 }
 
 static __inline bool isc1(__uint8_t x)
 {
 
 	return (0x80 <= x && x <= 0x9f);
 }
 
 static __inline bool iscntl(__uint8_t x)
 {
 
 	return (isc0(x) || isc1(x) || x == 0x7f);
 }
 
 static __inline bool is94(__uint8_t x)
 {
 
 	return (0x21 <= x && x <= 0x7e);
 }
 
 static __inline bool is96(__uint8_t x)
 {
 
 	return (0x20 <= x && x <= 0x7f);
 }
 
 static __inline bool isecma(__uint8_t x)
 {
 
 	return (0x30 <= x && x <= 0x7f);
 }
 
 static __inline bool isinterm(__uint8_t x)
 {
 
 	return (0x20 <= x && x <= 0x2f);
 }
 
 static __inline bool isthree(__uint8_t x)
 {
 
 	return (0x60 <= x && x <= 0x6f);
 }
 
 static __inline int
 getcs(const char * __restrict p, _ISO2022Charset * __restrict cs)
 {
 
 	if (!strncmp(p, "94$", 3) && p[3] && !p[4]) {
 		cs->final = (unsigned char)(p[3] & 0xff);
 		cs->interm = '\0';
 		cs->vers = '\0';
 		cs->type = CS94MULTI;
 	} else if (!strncmp(p, "96$", 3) && p[3] && !p[4]) {
 		cs->final = (unsigned char)(p[3] & 0xff);
 		cs->interm = '\0';
 		cs->vers = '\0';
 		cs->type = CS96MULTI;
 	} else if (!strncmp(p, "94", 2) && p[2] && !p[3]) {
 		cs->final = (unsigned char)(p[2] & 0xff);
 		cs->interm = '\0';
 		cs->vers = '\0';
 		cs->type = CS94;
 	} else if (!strncmp(p, "96", 2) && p[2] && !p[3]) {
 		cs->final = (unsigned char )(p[2] & 0xff);
 		cs->interm = '\0';
 		cs->vers = '\0';
 		cs->type = CS96;
 	} else
 		return (1);
 
 	return (0);
 }
 
 
 #define _NOTMATCH	0
 #define _MATCH		1
 #define _PARSEFAIL	2
 
 static __inline int
 get_recommend(_ISO2022EncodingInfo * __restrict ei,
     const char * __restrict token)
 {
 	_ISO2022Charset cs, *p;
 	int i;
 
 	if (!strchr("0123", token[0]) || token[1] != '=')
 		return (_NOTMATCH);
 
 	if (getcs(&token[2], &cs) == 0)
 		;
 	else if (!strcmp(&token[2], "94")) {
 		cs.final = (unsigned char)(token[4]);
 		cs.interm = '\0';
 		cs.vers = '\0';
 		cs.type = CS94;
 	} else if (!strcmp(&token[2], "96")) {
 		cs.final = (unsigned char)(token[4]);
 		cs.interm = '\0';
 		cs.vers = '\0';
 		cs.type = CS96;
 	} else if (!strcmp(&token[2], "94$")) {
 		cs.final = (unsigned char)(token[5]);
 		cs.interm = '\0';
 		cs.vers = '\0';
 		cs.type = CS94MULTI;
 	} else if (!strcmp(&token[2], "96$")) {
 		cs.final = (unsigned char)(token[5]);
 		cs.interm = '\0';
 		cs.vers = '\0';
 		cs.type = CS96MULTI;
 	} else
 		return (_PARSEFAIL);
 
 	i = token[0] - '0';
 	if (!ei->recommend[i])
 		ei->recommend[i] = malloc(sizeof(_ISO2022Charset));
 	else {
 		p = realloc(ei->recommend[i],
 		    sizeof(_ISO2022Charset) * (ei->recommendsize[i] + 1));
 		if (!p)
 			return (_PARSEFAIL);
 		ei->recommend[i] = p;
 	}
 	if (!ei->recommend[i])
 		return (_PARSEFAIL);
 	ei->recommendsize[i]++;
 
 	(ei->recommend[i] + (ei->recommendsize[i] - 1))->final = cs.final;
 	(ei->recommend[i] + (ei->recommendsize[i] - 1))->interm = cs.interm;
 	(ei->recommend[i] + (ei->recommendsize[i] - 1))->vers = cs.vers;
 	(ei->recommend[i] + (ei->recommendsize[i] - 1))->type = cs.type;
 
 	return (_MATCH);
 }
 
 static __inline int
 get_initg(_ISO2022EncodingInfo * __restrict ei,
     const char * __restrict token)
 {
 	_ISO2022Charset cs;
 
 	if (strncmp("INIT", &token[0], 4) ||
 	    !strchr("0123", token[4]) ||
 	    token[5] != '=')
 		return (_NOTMATCH);
 
 	if (getcs(&token[6], &cs) != 0)
 		return (_PARSEFAIL);
 
 	ei->initg[token[4] - '0'].type = cs.type;
 	ei->initg[token[4] - '0'].final = cs.final;
 	ei->initg[token[4] - '0'].interm = cs.interm;
 	ei->initg[token[4] - '0'].vers = cs.vers;
 
 	return (_MATCH);
 }
 
 static __inline int
 get_max(_ISO2022EncodingInfo * __restrict ei,
     const char * __restrict token)
 {
 	if (!strcmp(token, "MAX1"))
 		ei->maxcharset = 1;
 	else if (!strcmp(token, "MAX2"))
 		ei->maxcharset = 2;
 	else if (!strcmp(token, "MAX3"))
 		ei->maxcharset = 3;
 	else
 		return (_NOTMATCH);
 
 	return (_MATCH);
 }
 
 
 static __inline int
 get_flags(_ISO2022EncodingInfo * __restrict ei,
     const char * __restrict token)
 {
 	static struct {
 		const char	*tag;
 		int		flag;
 	} const tags[] = {
 		{ "DUMMY",	0	},
 		{ "8BIT",	F_8BIT	},
 		{ "NOOLD",	F_NOOLD	},
 		{ "SI",		F_SI	},
 		{ "SO",		F_SO	},
 		{ "LS0",	F_LS0	},
 		{ "LS1",	F_LS1	},
 		{ "LS2",	F_LS2	},
 		{ "LS3",	F_LS3	},
 		{ "LS1R",	F_LS1R	},
 		{ "LS2R",	F_LS2R	},
 		{ "LS3R",	F_LS3R	},
 		{ "SS2",	F_SS2	},
 		{ "SS3",	F_SS3	},
 		{ "SS2R",	F_SS2R	},
 		{ "SS3R",	F_SS3R	},
 		{ NULL,		0 }
 	};
 	int i;
 
 	for (i = 0; tags[i].tag; i++)
 		if (!strcmp(token, tags[i].tag)) {
 			ei->flags |= tags[i].flag;
 			return (_MATCH);
 		}
 
 	return (_NOTMATCH);
 }
 
 
 static __inline int
 _citrus_ISO2022_parse_variable(_ISO2022EncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar __unused)
 {
 	char const *e, *v;
 	char buf[20];
 	size_t len;
 	int i, ret;
 
 	/*
 	 * parse VARIABLE section.
 	 */
 
 	if (!var)
 		return (EFTYPE);
 
 	v = (const char *) var;
 
 	/* initialize structure */
 	ei->maxcharset = 0;
 	for (i = 0; i < 4; i++) {
 		ei->recommend[i] = NULL;
 		ei->recommendsize[i] = 0;
 	}
 	ei->flags = 0;
 
 	while (*v) {
 		while (*v == ' ' || *v == '\t')
 			++v;
 
 		/* find the token */
 		e = v;
 		while (*e && *e != ' ' && *e != '\t')
 			++e;
 
 		len = e - v;
 		if (len == 0)
 			break;
 		if (len >= sizeof(buf))
 			goto parsefail;
 		snprintf(buf, sizeof(buf), "%.*s", (int)len, v);
 
 		if ((ret = get_recommend(ei, buf)) != _NOTMATCH)
 			;
 		else if ((ret = get_initg(ei, buf)) != _NOTMATCH)
 			;
 		else if ((ret = get_max(ei, buf)) != _NOTMATCH)
 			;
 		else if ((ret = get_flags(ei, buf)) != _NOTMATCH)
 			;
 		else
 			ret = _PARSEFAIL;
 		if (ret == _PARSEFAIL)
 			goto parsefail;
 		v = e;
 
 	}
 
 	return (0);
 
 parsefail:
 	free(ei->recommend[0]);
 	free(ei->recommend[1]);
 	free(ei->recommend[2]);
 	free(ei->recommend[3]);
 
 	return (EFTYPE);
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_ISO2022_init_state(_ISO2022EncodingInfo * __restrict ei,
     _ISO2022State * __restrict s)
 {
 	int i;
 
 	memset(s, 0, sizeof(*s));
 	s->gl = 0;
 	s->gr = (ei->flags & F_8BIT) ? 1 : -1;
 
 	for (i = 0; i < 4; i++)
 		if (ei->initg[i].final) {
 			s->g[i].type = ei->initg[i].type;
 			s->g[i].final = ei->initg[i].final;
 			s->g[i].interm = ei->initg[i].interm;
 		}
 	s->singlegl = s->singlegr = -1;
 	s->flags |= _ISO2022STATE_FLAG_INITIALIZED;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_ISO2022_pack_state(_ISO2022EncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _ISO2022State * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_ISO2022_unpack_state(_ISO2022EncodingInfo * __restrict ei __unused,
     _ISO2022State * __restrict s, const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static int
 /*ARGSUSED*/
 _citrus_ISO2022_encoding_module_init(_ISO2022EncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 
 	return (_citrus_ISO2022_parse_variable(ei, var, lenvar));
 }
 
 static void
 /*ARGSUSED*/
 _citrus_ISO2022_encoding_module_uninit(_ISO2022EncodingInfo *ei __unused)
 {
 
 }
 
 #define	ESC	'\033'
 #define	ECMA	-1
 #define	INTERM	-2
 #define	OECMA	-3
 static const struct seqtable {
 	int type;
 	int csoff;
 	int finaloff;
 	int intermoff;
 	int versoff;
 	int len;
 	int chars[10];
 } seqtable[] = {
 	/* G0 94MULTI special */
 	{ CS94MULTI, -1, 2, -1, -1,	3, { ESC, '$', OECMA }, },
 	/* G0 94MULTI special with version identification */
 	{ CS94MULTI, -1, 5, -1, 2,	6, { ESC, '&', ECMA, ESC, '$', OECMA }, },
 	/* G? 94 */
 	{ CS94, 1, 2, -1, -1,		3, { ESC, CS94, ECMA, }, },
 	/* G? 94 with 2nd intermediate char */
 	{ CS94, 1, 3, 2, -1,		4, { ESC, CS94, INTERM, ECMA, }, },
 	/* G? 96 */
 	{ CS96, 1, 2, -1, -1,		3, { ESC, CS96, ECMA, }, },
 	/* G? 96 with 2nd intermediate char */
 	{ CS96, 1, 3, 2, -1,		4, { ESC, CS96, INTERM, ECMA, }, },
 	/* G? 94MULTI */
 	{ CS94MULTI, 2, 3, -1, -1,	4, { ESC, '$', CS94, ECMA, }, },
 	/* G? 96MULTI */
 	{ CS96MULTI, 2, 3, -1, -1,	4, { ESC, '$', CS96, ECMA, }, },
 	/* G? 94MULTI with version specification */
 	{ CS94MULTI, 5, 6, -1, 2,	7, { ESC, '&', ECMA, ESC, '$', CS94, ECMA, }, },
 	/* LS2/3 */
 	{ -1, -1, -1, -1, -1,		2, { ESC, 'n', }, },
 	{ -1, -1, -1, -1, -1,		2, { ESC, 'o', }, },
 	/* LS1/2/3R */
 	{ -1, -1, -1, -1, -1,		2, { ESC, '~', }, },
 	{ -1, -1, -1, -1, -1,		2, { ESC, /*{*/ '}', }, },
 	{ -1, -1, -1, -1, -1,		2, { ESC, '|', }, },
 	/* SS2/3 */
 	{ -1, -1, -1, -1, -1,		2, { ESC, 'N', }, },
 	{ -1, -1, -1, -1, -1,		2, { ESC, 'O', }, },
 	/* end of records */
 //	{ 0, }
 	{  0,  0,  0,  0,  0,		0, { ESC,  0, }, }
 };
 
 static int
 seqmatch(const char * __restrict s, size_t n,
     const struct seqtable * __restrict sp)
 {
 	const int *p;
 
 	p = sp->chars;
 	while ((size_t)(p - sp->chars) < n && p - sp->chars < sp->len) {
 		switch (*p) {
 		case ECMA:
 			if (!isecma(*s))
 				goto terminate;
 			break;
 		case OECMA:
 			if (*s && strchr("@AB", *s))
 				break;
 			else
 				goto terminate;
 		case INTERM:
 			if (!isinterm(*s))
 				goto terminate;
 			break;
 		case CS94:
 			if (*s && strchr("()*+", *s))
 				break;
 			else
 				goto terminate;
 		case CS96:
 			if (*s && strchr(",-./", *s))
 				break;
 			else
 				goto terminate;
 		default:
 			if (*s != *p)
 				goto terminate;
 			break;
 		}
 
 		p++;
 		s++;
 	}
 
 terminate:
 	return (p - sp->chars);
 }
 
 static wchar_t
 _ISO2022_sgetwchar(_ISO2022EncodingInfo * __restrict ei __unused,
-    const char * __restrict string, size_t n, const char ** __restrict result,
+    char * __restrict string, size_t n, char ** __restrict result,
     _ISO2022State * __restrict psenc)
 {
 	const struct seqtable *sp;
 	wchar_t wchar = 0;
 	int i, cur, nmatch;
 
 	while (1) {
 		/* SI/SO */
 		if (1 <= n && string[0] == '\017') {
 			psenc->gl = 0;
 			string++;
 			n--;
 			continue;
 		}
 		if (1 <= n && string[0] == '\016') {
 			psenc->gl = 1;
 			string++;
 			n--;
 			continue;
 		}
 
 		/* SS2/3R */
 		if (1 <= n && string[0] && strchr("\217\216", string[0])) {
 			psenc->singlegl = psenc->singlegr =
 			    (string[0] - '\216') + 2;
 			string++;
 			n--;
 			continue;
 		}
 
 		/* eat the letter if this is not ESC */
 		if (1 <= n && string[0] != '\033')
 			break;
 
 		/* look for a perfect match from escape sequences */
 		for (sp = &seqtable[0]; sp->len; sp++) {
 			nmatch = seqmatch(string, n, sp);
 			if (sp->len == nmatch && n >= (size_t)(sp->len))
 				break;
 		}
 
 		if (!sp->len)
 			goto notseq;
 
 		if (sp->type != -1) {
 			if (sp->csoff == -1)
 				i = 0;
 			else {
 				switch (sp->type) {
 				case CS94:
 				case CS94MULTI:
 					i = string[sp->csoff] - '(';
 					break;
 				case CS96:
 				case CS96MULTI:
 					i = string[sp->csoff] - ',';
 					break;
 				default:
 					return (_ISO2022INVALID);
 				}
 			}
 			psenc->g[i].type = sp->type;
 			psenc->g[i].final = '\0';
 			psenc->g[i].interm = '\0';
 			psenc->g[i].vers = '\0';
 			/* sp->finaloff must not be -1 */
 			if (sp->finaloff != -1)
 				psenc->g[i].final = string[sp->finaloff];
 			if (sp->intermoff != -1)
 				psenc->g[i].interm = string[sp->intermoff];
 			if (sp->versoff != -1)
 				psenc->g[i].vers = string[sp->versoff];
 
 			string += sp->len;
 			n -= sp->len;
 			continue;
 		}
 
 		/* LS2/3 */
 		if (2 <= n && string[0] == '\033' &&
 		    string[1] && strchr("no", string[1])) {
 			psenc->gl = string[1] - 'n' + 2;
 			string += 2;
 			n -= 2;
 			continue;
 		}
 
 		/* LS1/2/3R */
 			/* XXX: { for vi showmatch */
 		if (2 <= n && string[0] == '\033' &&
 		    string[1] && strchr("~}|", string[1])) {
 			psenc->gr = 3 - (string[1] - '|');
 			string += 2;
 			n -= 2;
 			continue;
 		}
 
 		/* SS2/3 */
 		if (2 <= n && string[0] == '\033' && string[1] &&
 		    strchr("NO", string[1])) {
 			psenc->singlegl = (string[1] - 'N') + 2;
 			string += 2;
 			n -= 2;
 			continue;
 		}
 
 	notseq:
 		/*
 		 * if we've got an unknown escape sequence, eat the ESC at the
 		 * head.  otherwise, wait till full escape sequence comes.
 		 */
 		for (sp = &seqtable[0]; sp->len; sp++) {
 			nmatch = seqmatch(string, n, sp);
 			if (!nmatch)
 				continue;
 
 			/*
 			 * if we are in the middle of escape sequence,
 			 * we still need to wait for more characters to come
 			 */
 			if (n < (size_t)(sp->len)) {
 				if ((size_t)(nmatch) == n) {
 					if (result)
 						*result = string;
 					return (_ISO2022INVALID);
 				}
 			} else {
 				if (nmatch == sp->len) {
 					/* this case should not happen */
 					goto eat;
 				}
 			}
 		}
 
 		break;
 	}
 
 eat:
 	/* no letter to eat */
 	if (n < 1) {
 		if (result)
 			*result = string;
 		return (_ISO2022INVALID);
 	}
 
 	/* normal chars.  always eat C0/C1 as is. */
 	if (iscntl(*string & 0xff))
 		cur = -1;
 	else if (*string & 0x80)
 		cur = (psenc->singlegr == -1) ? psenc->gr : psenc->singlegr;
 	else
 		cur = (psenc->singlegl == -1) ? psenc->gl : psenc->singlegl;
 
 	if (cur == -1) {
 asis:
 		wchar = *string++ & 0xff;
 		if (result)
 			*result = string;
 		/* reset single shift state */
 		psenc->singlegr = psenc->singlegl = -1;
 		return (wchar);
 	}
 
 	/* length error check */
 	switch (psenc->g[cur].type) {
 	case CS94MULTI:
 	case CS96MULTI:
 		if (!isthree(psenc->g[cur].final)) {
 			if (2 <= n &&
 			    (string[0] & 0x80) == (string[1] & 0x80))
 				break;
 		} else {
 			if (3 <= n &&
 			    (string[0] & 0x80) == (string[1] & 0x80) &&
 			    (string[0] & 0x80) == (string[2] & 0x80))
 				break;
 		}
 
 		/* we still need to wait for more characters to come */
 		if (result)
 			*result = string;
 		return (_ISO2022INVALID);
 
 	case CS94:
 	case CS96:
 		if (1 <= n)
 			break;
 
 		/* we still need to wait for more characters to come */
 		if (result)
 			*result = string;
 		return (_ISO2022INVALID);
 	}
 
 	/* range check */
 	switch (psenc->g[cur].type) {
 	case CS94:
 		if (!(is94(string[0] & 0x7f)))
 			goto asis;
 	case CS96:
 		if (!(is96(string[0] & 0x7f)))
 			goto asis;
 		break;
 	case CS94MULTI:
 		if (!(is94(string[0] & 0x7f) && is94(string[1] & 0x7f)))
 			goto asis;
 		break;
 	case CS96MULTI:
 		if (!(is96(string[0] & 0x7f) && is96(string[1] & 0x7f)))
 			goto asis;
 		break;
 	}
 
 	/* extract the character. */
 	switch (psenc->g[cur].type) {
 	case CS94:
 		/* special case for ASCII. */
 		if (psenc->g[cur].final == 'B' && !psenc->g[cur].interm) {
 			wchar = *string++;
 			wchar &= 0x7f;
 			break;
 		}
 		wchar = psenc->g[cur].final;
 		wchar = (wchar << 8);
 		wchar |= (psenc->g[cur].interm ? (0x80 | psenc->g[cur].interm) : 0);
 		wchar = (wchar << 8);
 		wchar = (wchar << 8) | (*string++ & 0x7f);
 		break;
 	case CS96:
 		/* special case for ISO-8859-1. */
 		if (psenc->g[cur].final == 'A' && !psenc->g[cur].interm) {
 			wchar = *string++;
 			wchar &= 0x7f;
 			wchar |= 0x80;
 			break;
 		}
 		wchar = psenc->g[cur].final;
 		wchar = (wchar << 8);
 		wchar |= (psenc->g[cur].interm ? (0x80 | psenc->g[cur].interm) : 0);
 		wchar = (wchar << 8);
 		wchar = (wchar << 8) | (*string++ & 0x7f);
 		wchar |= 0x80;
 		break;
 	case CS94MULTI:
 	case CS96MULTI:
 		wchar = psenc->g[cur].final;
 		wchar = (wchar << 8);
 		if (isthree(psenc->g[cur].final))
 			wchar |= (*string++ & 0x7f);
 		wchar = (wchar << 8) | (*string++ & 0x7f);
 		wchar = (wchar << 8) | (*string++ & 0x7f);
 		if (psenc->g[cur].type == CS96MULTI)
 			wchar |= 0x80;
 		break;
 	}
 
 	if (result)
 		*result = string;
 	/* reset single shift state */
 	psenc->singlegr = psenc->singlegl = -1;
 	return (wchar);
 }
 
 
 
 static int
 _citrus_ISO2022_mbrtowc_priv(_ISO2022EncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s,
+    wchar_t * __restrict pwc, char ** __restrict s,
     size_t n, _ISO2022State * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *p, *result, *s0;
+	char *p, *result, *s0;
 	wchar_t wchar;
 	int c, chlenbak;
 
 	if (*s == NULL) {
 		_citrus_ISO2022_init_state(ei, psenc);
 		*nresult = _ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	s0 = *s;
 	c = 0;
 	chlenbak = psenc->chlen;
 
 	/*
 	 * if we have something in buffer, use that.
 	 * otherwise, skip here
 	 */
 	if (psenc->chlen > sizeof(psenc->ch)) {
 		/* illegal state */
 		_citrus_ISO2022_init_state(ei, psenc);
 		goto encoding_error;
 	}
 	if (psenc->chlen == 0)
 		goto emptybuf;
 
 	/* buffer is not empty */
 	p = psenc->ch;
 	while (psenc->chlen < sizeof(psenc->ch)) {
 		if (n > 0) {
 			psenc->ch[psenc->chlen++] = *s0++;
 			n--;
 		}
 
 		wchar = _ISO2022_sgetwchar(ei, p, psenc->chlen - (p-psenc->ch),
 		    &result, psenc);
 		c += result - p;
 		if (wchar != _ISO2022INVALID) {
 			if (psenc->chlen > (size_t)c)
 				memmove(psenc->ch, result, psenc->chlen - c);
 			if (psenc->chlen < (size_t)c)
 				psenc->chlen = 0;
 			else
 				psenc->chlen -= c;
 			goto output;
 		}
 
 		if (n == 0) {
 			if ((size_t)(result - p) == psenc->chlen)
 				/* complete shift sequence. */
 				psenc->chlen = 0;
 			goto restart;
 		}
 
 		p = result;
 	}
 
 	/* escape sequence too long? */
 	goto encoding_error;
 
 emptybuf:
 	wchar = _ISO2022_sgetwchar(ei, s0, n, &result, psenc);
 	if (wchar != _ISO2022INVALID) {
 		c += result - s0;
 		psenc->chlen = 0;
 		s0 = result;
 		goto output;
 	}
 	if (result > s0) {
 		c += (result - s0);
 		n -= (result - s0);
 		s0 = result;
 		if (n > 0)
 			goto emptybuf;
 		/* complete shift sequence. */
 		goto restart;
 	}
 	n += c;
 	if (n < sizeof(psenc->ch)) {
 		memcpy(psenc->ch, s0 - c, n);
 		psenc->chlen = n;
 		s0 = result;
 		goto restart;
 	}
 
 	/* escape sequence too long? */
 
 encoding_error:
 	psenc->chlen = 0;
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 output:
 	*s = s0;
 	if (pwc)
 		*pwc = wchar;
 	*nresult = wchar ? c - chlenbak : 0;
 	return (0);
 
 restart:
 	*s = s0;
 	*nresult = (size_t)-2;
 
 	return (0);
 }
 
 static int
 recommendation(_ISO2022EncodingInfo * __restrict ei,
     _ISO2022Charset * __restrict cs)
 {
 	_ISO2022Charset *recommend;
 	size_t j;
 	int i;
 
 	/* first, try an exact match. */
 	for (i = 0; i < 4; i++) {
 		recommend = ei->recommend[i];
 		for (j = 0; j < ei->recommendsize[i]; j++) {
 			if (cs->type != recommend[j].type)
 				continue;
 			if (cs->final != recommend[j].final)
 				continue;
 			if (cs->interm != recommend[j].interm)
 				continue;
 
 			return (i);
 		}
 	}
 
 	/* then, try a wildcard match over final char. */
 	for (i = 0; i < 4; i++) {
 		recommend = ei->recommend[i];
 		for (j = 0; j < ei->recommendsize[i]; j++) {
 			if (cs->type != recommend[j].type)
 				continue;
 			if (cs->final && (cs->final != recommend[j].final))
 				continue;
 			if (cs->interm && (cs->interm != recommend[j].interm))
 				continue;
 
 			return (i);
 		}
 	}
 
 	/* there's no recommendation. make a guess. */
 	if (ei->maxcharset == 0) {
 		return (0);
 	} else {
 		switch (cs->type) {
 		case CS94:
 		case CS94MULTI:
 			return (0);
 		case CS96:
 		case CS96MULTI:
 			return (1);
 		}
 	}
 	return (0);
 }
 
 static int
 _ISO2022_sputwchar(_ISO2022EncodingInfo * __restrict ei, wchar_t wc,
     char * __restrict string, size_t n, char ** __restrict result,
     _ISO2022State * __restrict psenc, size_t * __restrict nresult)
 {
 	_ISO2022Charset cs;
 	char *p;
 	char tmp[MB_LEN_MAX];
 	size_t len;
 	int bit8, i = 0, target;
 	unsigned char mask;
 
 	if (isc0(wc & 0xff)) {
 		/* go back to INIT0 or ASCII on control chars */
 		cs = ei->initg[0].final ? ei->initg[0] : ascii;
 	} else if (isc1(wc & 0xff)) {
 		/* go back to INIT1 or ISO-8859-1 on control chars */
 		cs = ei->initg[1].final ? ei->initg[1] : iso88591;
 	} else if (!(wc & ~0xff)) {
 		if (wc & 0x80) {
 			/* special treatment for ISO-8859-1 */
 			cs = iso88591;
 		} else {
 			/* special treatment for ASCII */
 			cs = ascii;
 		}
 	} else {
 		cs.final = (wc >> 24) & 0x7f;
 		if ((wc >> 16) & 0x80)
 			cs.interm = (wc >> 16) & 0x7f;
 		else
 			cs.interm = '\0';
 		if (wc & 0x80)
 			cs.type = (wc & 0x00007f00) ? CS96MULTI : CS96;
 		else
 			cs.type = (wc & 0x00007f00) ? CS94MULTI : CS94;
 	}
 	target = recommendation(ei, &cs);
 	p = tmp;
 	bit8 = ei->flags & F_8BIT;
 
 	/* designate the charset onto the target plane(G0/1/2/3). */
 	if (psenc->g[target].type == cs.type &&
 	    psenc->g[target].final == cs.final &&
 	    psenc->g[target].interm == cs.interm)
 		goto planeok;
 
 	*p++ = '\033';
 	if (cs.type == CS94MULTI || cs.type == CS96MULTI)
 		*p++ = '$';
 	if (target == 0 && cs.type == CS94MULTI && strchr("@AB", cs.final) &&
 	    !cs.interm && !(ei->flags & F_NOOLD))
 		;
 	else if (cs.type == CS94 || cs.type == CS94MULTI)
 		*p++ = "()*+"[target];
 	else
 		*p++ = ",-./"[target];
 	if (cs.interm)
 		*p++ = cs.interm;
 	*p++ = cs.final;
 
 	psenc->g[target].type = cs.type;
 	psenc->g[target].final = cs.final;
 	psenc->g[target].interm = cs.interm;
 
 planeok:
 	/* invoke the plane onto GL or GR. */
 	if (psenc->gl == target)
 		goto sideok;
 	if (bit8 && psenc->gr == target)
 		goto sideok;
 
 	if (target == 0 && (ei->flags & F_LS0)) {
 		*p++ = '\017';
 		psenc->gl = 0;
 	} else if (target == 1 && (ei->flags & F_LS1)) {
 		*p++ = '\016';
 		psenc->gl = 1;
 	} else if (target == 2 && (ei->flags & F_LS2)) {
 		*p++ = '\033';
 		*p++ = 'n';
 		psenc->gl = 2;
 	} else if (target == 3 && (ei->flags & F_LS3)) {
 		*p++ = '\033';
 		*p++ = 'o';
 		psenc->gl = 3;
 	} else if (bit8 && target == 1 && (ei->flags & F_LS1R)) {
 		*p++ = '\033';
 		*p++ = '~';
 		psenc->gr = 1;
 	} else if (bit8 && target == 2 && (ei->flags & F_LS2R)) {
 		*p++ = '\033';
 		/*{*/
 		*p++ = '}';
 		psenc->gr = 2;
 	} else if (bit8 && target == 3 && (ei->flags & F_LS3R)) {
 		*p++ = '\033';
 		*p++ = '|';
 		psenc->gr = 3;
 	} else if (target == 2 && (ei->flags & F_SS2)) {
 		*p++ = '\033';
 		*p++ = 'N';
 		psenc->singlegl = 2;
 	} else if (target == 3 && (ei->flags & F_SS3)) {
 		*p++ = '\033';
 		*p++ = 'O';
 		psenc->singlegl = 3;
 	} else if (bit8 && target == 2 && (ei->flags & F_SS2R)) {
 		*p++ = '\216';
 		*p++ = 'N';
 		psenc->singlegl = psenc->singlegr = 2;
 	} else if (bit8 && target == 3 && (ei->flags & F_SS3R)) {
 		*p++ = '\217';
 		*p++ = 'O';
 		psenc->singlegl = psenc->singlegr = 3;
 	} else
 		goto ilseq;
 
 sideok:
 	if (psenc->singlegl == target)
 		mask = 0x00;
 	else if (psenc->singlegr == target)
 		mask = 0x80;
 	else if (psenc->gl == target)
 		mask = 0x00;
 	else if ((ei->flags & F_8BIT) && psenc->gr == target)
 		mask = 0x80;
 	else
 		goto ilseq;
 
 	switch (cs.type) {
 	case CS94:
 	case CS96:
 		i = 1;
 		break;
 	case CS94MULTI:
 	case CS96MULTI:
 		i = !iscntl(wc & 0xff) ?
 		    (isthree(cs.final) ? 3 : 2) : 1;
 		break;
 	}
 	while (i-- > 0)
 		*p++ = ((wc >> (i << 3)) & 0x7f) | mask;
 
 	/* reset single shift state */
 	psenc->singlegl = psenc->singlegr = -1;
 
 	len = (size_t)(p - tmp);
 	if (n < len) {
 		if (result)
 			*result = (char *)0;
 		*nresult = (size_t)-1;
 		return (E2BIG);
 	}
 	if (result)
 		*result = string + len;
 	memcpy(string, tmp, len);
 	*nresult = len;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static int
 _citrus_ISO2022_put_state_reset(_ISO2022EncodingInfo * __restrict ei,
     char * __restrict s, size_t n, _ISO2022State * __restrict psenc,
     size_t * __restrict nresult)
 {
 	char *result;
 	char buf[MB_LEN_MAX];
 	size_t len;
 	int ret;
 
 	/* XXX state will be modified after this operation... */
 	ret = _ISO2022_sputwchar(ei, L'\0', buf, sizeof(buf), &result, psenc,
 	    &len);
 	if (ret) {
 		*nresult = len;
 		return (ret);
 	}
 
 	if (sizeof(buf) < len || n < len-1) {
 		/* XXX should recover state? */
 		*nresult = (size_t)-1;
 		return (E2BIG);
 	}
 
 	memcpy(s, buf, len - 1);
 	*nresult = len - 1;
 	return (0);
 }
 
 static int
 _citrus_ISO2022_wcrtomb_priv(_ISO2022EncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wc,
     _ISO2022State * __restrict psenc, size_t * __restrict nresult)
 {
 	char *result;
 	char buf[MB_LEN_MAX];
 	size_t len;
 	int ret;
 
 	/* XXX state will be modified after this operation... */
 	ret = _ISO2022_sputwchar(ei, wc, buf, sizeof(buf), &result, psenc,
 	    &len);
 	if (ret) {
 		*nresult = len;
 		return (ret);
 	}
 
 	if (sizeof(buf) < len || n < len) {
 		/* XXX should recover state? */
 		*nresult = (size_t)-1;
 		return (E2BIG);
 	}
 
 	memcpy(s, buf, len);
 	*nresult = len;
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ISO2022_stdenc_wctocs(_ISO2022EncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	wchar_t m, nm;
 
 	m = wc & 0x7FFF8080;
 	nm = wc & 0x007F7F7F;
 	if (m & 0x00800000)
 		nm &= 0x00007F7F;
 	else
 		m &= 0x7F008080;
 	if (nm & 0x007F0000) {
 		/* ^3 mark */
 		m |= 0x007F0000;
 	} else if (nm & 0x00007F00) {
 		/* ^2 mark */
 		m |= 0x00007F00;
 	}
 	*csid = (_csid_t)m;
 	*idx  = (_index_t)nm;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ISO2022_stdenc_cstowc(_ISO2022EncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	*wc = (wchar_t)(csid & 0x7F808080) | (wchar_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ISO2022_stdenc_get_state_desc_generic(_ISO2022EncodingInfo * __restrict ei __unused,
     _ISO2022State * __restrict psenc, int * __restrict rstate)
 {
 
 	if (psenc->chlen == 0) {
 		/* XXX: it should distinguish initial and stable. */
 		*rstate = _STDENC_SDGEN_STABLE;
 	} else
 		*rstate = (psenc->ch[0] == '\033') ?
 		    _STDENC_SDGEN_INCOMPLETE_SHIFT :
 		    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(ISO2022);
 _CITRUS_STDENC_DEF_OPS(ISO2022);
 
 #include "citrus_stdenc_template.h"
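
Note on the wchar_t layout used by citrus_iso2022.c above: the "wchar_t mappings" comment and the extraction code in _ISO2022_sgetwchar() place the designation's final byte in bits 24-30, an optional intermediate byte (with 0x80 set) in bits 16-23, and the 7-bit code bytes in the low bytes. The following is a minimal sketch of that packing, not code from the module. The names pack_94(), pack_94x94() and classify() are hypothetical and exist only for illustration, and uint32_t stands in for wchar_t. The classification mirrors the final else-branch of _ISO2022_sputwchar(), which is only reached for values above 0xff, since ASCII and ISO-8859-1 are special-cased before it.

```c
#include <stdint.h>

/* Hypothetical illustration only; these names do not appear in citrus_iso2022.c. */
enum cs_type { CS94 = 0, CS96 = 1, CS94MULTI = 2, CS96MULTI = 3 };

/*
 * Single-byte 94 charset (ESC ( F, or ESC ( M F with an intermediate):
 * 0fffffff 1mmmmmmm 00000000 0xxxxxxx
 */
static uint32_t
pack_94(unsigned char final, unsigned char interm, unsigned char b)
{
	uint32_t wc;

	wc = (uint32_t)(final & 0x7f) << 24;
	if (interm != 0)
		wc |= (uint32_t)(0x80 | interm) << 16;
	return (wc | (uint32_t)(b & 0x7f));
}

/*
 * Two-byte 94x94 charset (ESC $ ( F):
 * 0fffffff 00000000 0xxxxxxx 0xxxxxxx
 */
static uint32_t
pack_94x94(unsigned char final, unsigned char b1, unsigned char b2)
{

	return (((uint32_t)(final & 0x7f) << 24) |
	    ((uint32_t)(b1 & 0x7f) << 8) | (uint32_t)(b2 & 0x7f));
}

/*
 * Recover the charset type from a packed value: bit 7 of the low byte
 * selects 94 vs. 96, and a non-zero second byte marks a multibyte set.
 */
static enum cs_type
classify(uint32_t wc)
{

	if (wc & 0x80)
		return ((wc & 0x00007f00) ? CS96MULTI : CS96);
	return ((wc & 0x00007f00) ? CS94MULTI : CS94);
}
```

For example, under these assumptions a two-byte character 0x30 0x21 in a set designated with final byte 'B' packs to 0x42003021 and classifies as CS94MULTI.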
Index: user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/JOHAB/citrus_johab.c	(revision 281585)
@@ -1,335 +1,335 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_johab.c,v 1.4 2008/06/14 16:01:07 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_johab.h"
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	int	 chlen;
 	char	 ch[2];
 } _JOHABState;
 
 typedef struct {
 	int	 dummy;
 } _JOHABEncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_JOHAB_##m
 #define _ENCODING_INFO			_JOHABEncodingInfo
 #define _ENCODING_STATE			_JOHABState
 #define _ENCODING_MB_CUR_MAX(_ei_)		2
 #define _ENCODING_IS_STATE_DEPENDENT		0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 
 static __inline void
 /*ARGSUSED*/
 _citrus_JOHAB_init_state(_JOHABEncodingInfo * __restrict ei __unused,
     _JOHABState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_JOHAB_pack_state(_JOHABEncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _JOHABState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_JOHAB_unpack_state(_JOHABEncodingInfo * __restrict ei __unused,
     _JOHABState * __restrict psenc, const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static void
 /*ARGSUSED*/
 _citrus_JOHAB_encoding_module_uninit(_JOHABEncodingInfo *ei __unused)
 {
 
 	/* ei may be null */
 }
 
 static int
 /*ARGSUSED*/
 _citrus_JOHAB_encoding_module_init(_JOHABEncodingInfo * __restrict ei __unused,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 
 	/* ei may be null */
 	return (0);
 }
 
 static __inline bool
 ishangul(int l, int t)
 {
 
 	return ((l >= 0x84 && l <= 0xD3) &&
 	    ((t >= 0x41 && t <= 0x7E) || (t >= 0x81 && t <= 0xFE)));
 }
 
 static __inline bool
 isuda(int l, int t)
 {
 
 	return ((l == 0xD8) &&
 	    ((t >= 0x31 && t <= 0x7E) || (t >= 0x91 && t <= 0xFE)));
 }
 
 static __inline bool
 ishanja(int l, int t)
 {
 
 	return (((l >= 0xD9 && l <= 0xDE) || (l >= 0xE0 && l <= 0xF9)) &&
 	    ((t >= 0x31 && t <= 0x7E) || (t >= 0x91 && t <= 0xFE)));
 }
 
 static int
 /*ARGSUSED*/
 _citrus_JOHAB_mbrtowc_priv(_JOHABEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _JOHABState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	int l, t;
 
 	if (*s == NULL) {
 		_citrus_JOHAB_init_state(ei, psenc);
 		*nresult = _ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	s0 = *s;
 
 	switch (psenc->chlen) {
 	case 0:
 		if (n-- < 1)
 			goto restart;
 		l = *s0++ & 0xFF;
 		if (l <= 0x7F) {
 			if (pwc != NULL)
 				*pwc = (wchar_t)l;
 			*nresult = (l == 0) ? 0 : 1;
 			*s = s0;
 			return (0);
 		}
 		psenc->ch[psenc->chlen++] = l;
 		break;
 	case 1:
 		l = psenc->ch[0] & 0xFF;
 		break;
 	default:
 		return (EINVAL);
 	}
 	if (n-- < 1) {
 restart:
 		*nresult = (size_t)-2;
 		*s = s0;
 		return (0);
 	}
 	t = *s0++ & 0xFF;
 	if (!ishangul(l, t) && !isuda(l, t) && !ishanja(l, t)) {
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	if (pwc != NULL)
 		*pwc = (wchar_t)(l << 8 | t);
 	*nresult = s0 - *s;
 	*s = s0;
 	psenc->chlen = 0;
 
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_JOHAB_wcrtomb_priv(_JOHABEncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, wchar_t wc,
     _JOHABState * __restrict psenc, size_t * __restrict nresult)
 {
 	int l, t;
 
 	if (psenc->chlen != 0)
 		return (EINVAL);
 
 	/* XXX assume wchar_t as int */
 	if ((uint32_t)wc <= 0x7F) {
 		if (n < 1)
 			goto e2big;
 		*s = wc & 0xFF;
 		*nresult = 1;
 	} else if ((uint32_t)wc <= 0xFFFF) {
 		if (n < 2) {
 e2big:
 			*nresult = (size_t)-1;
 			return (E2BIG);
 		}
 		l = (wc >> 8) & 0xFF;
 		t = wc & 0xFF;
 		if (!ishangul(l, t) && !isuda(l, t) && !ishanja(l, t))
 			goto ilseq;
 		*s++ = l;
 		*s = t;
 		*nresult = 2;
 	} else {
 ilseq:
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	return (0);
 
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_JOHAB_stdenc_wctocs(_JOHABEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	int m, l, linear, t;
 
 	/* XXX assume wchar_t as int */
 	if ((uint32_t)wc <= 0x7F) {
 		*idx = (_index_t)wc;
 		*csid = 0;
 	} else if ((uint32_t)wc <= 0xFFFF) {
 		l = (wc >> 8) & 0xFF;
 		t = wc & 0xFF;
 		if (ishangul(l, t) || isuda(l, t)) {
 			*idx = (_index_t)wc;
 			*csid = 1;
 		} else {
 			if (l >= 0xD9 && l <= 0xDE) {
 				linear = l - 0xD9;
 				m = 0x21;
 			} else if (l >= 0xE0 && l <= 0xF9) {
 				linear = l - 0xE0;
 				m = 0x4A;
 			} else
 				return (EILSEQ);
 			linear *= 188;
 			if (t >= 0x31 && t <= 0x7E)
 				linear += t - 0x31;
 			else if (t >= 0x91 && t <= 0xFE)
 				linear += t - 0x43;
 			else
 				return (EILSEQ);
 			l = (linear / 94) + m;
 			t = (linear % 94) + 0x21;
 			*idx = (_index_t)((l << 8) | t);
 			*csid = 2;
 		}
 	} else
 		return (EILSEQ);
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_JOHAB_stdenc_cstowc(_JOHABEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 	int m, n, l, linear, t;
 
 	switch (csid) {
 	case 0:
 	case 1:
 		*wc = (wchar_t)idx;
 		break;
 	case 2:
 		if (idx >= 0x2121 && idx <= 0x2C71) {
 			m = 0xD9;
 			n = 0x21;
 		} else if (idx >= 0x4A21 && idx <= 0x7D7E) {
 			m = 0xE0;
 			n = 0x4A;
 		} else
 			return (EILSEQ);
 		l = ((idx >> 8) & 0xFF) - n;
 		t = (idx & 0xFF) - 0x21;
 		linear = (l * 94) + t;
 		l = (linear / 188) + m;
 		t = linear % 188;
 		t += (t <= 0x4D) ? 0x31 : 0x43;
 		break;
 	default:
 		return (EILSEQ);
 	}
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_JOHAB_stdenc_get_state_desc_generic(_JOHABEncodingInfo * __restrict ei __unused,
     _JOHABState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(JOHAB);
 _CITRUS_STDENC_DEF_OPS(JOHAB);
 
 #include "citrus_stdenc_template.h"
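Note on the JOHAB hunk above: the only change is dropping const from the input pointer of _citrus_JOHAB_mbrtowc_priv; the decoding logic is untouched. As a reading aid, the following is a minimal stateless sketch of that logic, assuming the same lead/trail byte ranges as ishangul(), isuda() and ishanja(). The names trail_31_fe, johab_pair_ok and johab_decode are hypothetical and are not part of the module, and the sketch keeps const on its input for simplicity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Trail-byte ranges shared by the user-defined area and the hanja rows. */
static bool
trail_31_fe(int t)
{
	return ((t >= 0x31 && t <= 0x7E) || (t >= 0x91 && t <= 0xFE));
}

/* Hypothetical helper: true if (l, t) is an accepted two-byte pair. */
static bool
johab_pair_ok(int l, int t)
{
	if (l >= 0x84 && l <= 0xD3)	/* hangul */
		return ((t >= 0x41 && t <= 0x7E) || (t >= 0x81 && t <= 0xFE));
	if (l == 0xD8)			/* user-defined area */
		return (trail_31_fe(t));
	if ((l >= 0xD9 && l <= 0xDE) || (l >= 0xE0 && l <= 0xF9)) /* hanja */
		return (trail_31_fe(t));
	return (false);
}

/* Hypothetical decoder: returns bytes consumed (1 or 2), */
/* 0 if more input is needed, or -1 for an invalid pair. */
static int
johab_decode(const unsigned char *buf, size_t n, uint32_t *out)
{
	if (n < 1)
		return (0);
	if (buf[0] <= 0x7F) {		/* single-byte ASCII range */
		*out = buf[0];
		return (1);
	}
	if (n < 2)
		return (0);		/* lead byte seen, trail byte missing */
	if (!johab_pair_ok(buf[0], buf[1]))
		return (-1);
	*out = ((uint32_t)buf[0] << 8) | buf[1];
	return (2);
}
```

Unlike the module, this sketch keeps no state between calls: where it returns 0, the real function stores the lead byte in psenc->ch and reports (size_t)-2.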
Index: user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/MSKanji/citrus_mskanji.c	(revision 281585)
@@ -1,475 +1,475 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_mskanji.c,v 1.13 2008/06/14 16:01:08 tnozaki Exp $	*/
 
 /*-
  * Copyright (c)2002 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*
  *    ja_JP.SJIS locale table for BSD4.4/rune
  *    version 1.0
  *    (C) Sin'ichiro MIYATANI / Phase One, Inc
  *    May 12, 1995
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *      This product includes software developed by Phase One, Inc.
  * 4. The name of Phase One, Inc. may be used to endorse or promote products
  *    derived from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_mskanji.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct _MSKanjiState {
 	int	 chlen;
 	char	 ch[2];
 } _MSKanjiState;
 
 typedef struct {
 	int	 mode;
 #define MODE_JIS2004	1
 } _MSKanjiEncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_MSKanji_##m
 #define _ENCODING_INFO			_MSKanjiEncodingInfo
 #define _ENCODING_STATE			_MSKanjiState
 #define _ENCODING_MB_CUR_MAX(_ei_)	2
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 
 static bool
 _mskanji1(int c)
 {
 
 	return ((c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc));
 }
 
 static bool
 _mskanji2(int c)
 {
 
 	return ((c >= 0x40 && c <= 0x7e) || (c >= 0x80 && c <= 0xfc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_MSKanji_init_state(_MSKanjiEncodingInfo * __restrict ei __unused,
     _MSKanjiState * __restrict s)
 {
 
 	s->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_MSKanji_pack_state(_MSKanjiEncodingInfo * __restrict ei __unused,
     void * __restrict pspriv, const _MSKanjiState * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_MSKanji_unpack_state(_MSKanjiEncodingInfo * __restrict ei __unused,
     _MSKanjiState * __restrict s, const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static int
 /*ARGSUSED*/
 _citrus_MSKanji_mbrtowc_priv(_MSKanjiEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _MSKanjiState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	wchar_t wchar;
 	int chlenbak, len;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_MSKanji_init_state(ei, psenc);
 		*nresult = 0; /* state independent */
 		return (0);
 	}
 
 	chlenbak = psenc->chlen;
 
 	/* make sure we have the first byte in the buffer */
 	switch (psenc->chlen) {
 	case 0:
 		if (n < 1)
 			goto restart;
 		psenc->ch[0] = *s0++;
 		psenc->chlen = 1;
 		n--;
 		break;
 	case 1:
 		break;
 	default:
 		/* illegal state */
 		goto encoding_error;
 	}
 
 	len = _mskanji1(psenc->ch[0] & 0xff) ? 2 : 1;
 	while (psenc->chlen < len) {
 		if (n < 1)
 			goto restart;
 		psenc->ch[psenc->chlen] = *s0++;
 		psenc->chlen++;
 		n--;
 	}
 
 	*s = s0;
 
 	switch (len) {
 	case 1:
 		wchar = psenc->ch[0] & 0xff;
 		break;
 	case 2:
 		if (!_mskanji2(psenc->ch[1] & 0xff))
 			goto encoding_error;
 		wchar = ((psenc->ch[0] & 0xff) << 8) | (psenc->ch[1] & 0xff);
 		break;
 	default:
 		/* illegal state */
 		goto encoding_error;
 	}
 
 	psenc->chlen = 0;
 
 	if (pwc)
 		*pwc = wchar;
 	*nresult = wchar ? len - chlenbak : 0;
 	return (0);
 
 encoding_error:
 	psenc->chlen = 0;
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 restart:
 	*nresult = (size_t)-2;
 	*s = s0;
 	return (0);
 }
 
 
 static int
 _citrus_MSKanji_wcrtomb_priv(_MSKanjiEncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, wchar_t wc,
     _MSKanjiState * __restrict psenc __unused, size_t * __restrict nresult)
 {
 	int ret;
 
 	/* check invalid sequence */
 	if (wc & ~0xffff) {
 		ret = EILSEQ;
 		goto err;
 	}
 
 	if (wc & 0xff00) {
 		if (n < 2) {
 			ret = E2BIG;
 			goto err;
 		}
 
 		s[0] = (wc >> 8) & 0xff;
 		s[1] = wc & 0xff;
 		if (!_mskanji1(s[0] & 0xff) || !_mskanji2(s[1] & 0xff)) {
 			ret = EILSEQ;
 			goto err;
 		}
 
 		*nresult = 2;
 		return (0);
 	} else {
 		if (n < 1) {
 			ret = E2BIG;
 			goto err;
 		}
 
 		s[0] = wc & 0xff;
 		if (_mskanji1(s[0] & 0xff)) {
 			ret = EILSEQ;
 			goto err;
 		}
 
 		*nresult = 1;
 		return (0);
 	}
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 
 static __inline int
 /*ARGSUSED*/
 _citrus_MSKanji_stdenc_wctocs(_MSKanjiEncodingInfo * __restrict ei,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 	_index_t col, row;
 	int offset;
 
 	if ((_wc_t)wc < 0x80) {
 		/* ISO-646 */
 		*csid = 0;
 		*idx = (_index_t)wc;
 	} else if ((_wc_t)wc < 0x100) {
 		/* KANA */
 		*csid = 1;
 		*idx = (_index_t)wc & 0x7F;
 	} else {
 		/* Kanji (containing Gaiji zone) */
 		/*
 		 * 94^2 zone (contains a part of Gaiji (0xED40 - 0xEEFC)):
 		 * 0x8140 - 0x817E -> 0x2121 - 0x215F
 		 * 0x8180 - 0x819E -> 0x2160 - 0x217E
 		 * 0x819F - 0x81FC -> 0x2221 - 0x227E
 		 *
 		 * 0x8240 - 0x827E -> 0x2321 - 0x235F
 		 *  ...
 		 * 0x9F9F - 0x9FFc -> 0x5E21 - 0x5E7E
 		 *
 		 * 0xE040 - 0xE07E -> 0x5F21 - 0x5F5F
 		 *  ...
 		 * 0xEF9F - 0xEFFC -> 0x7E21 - 0x7E7E
 		 *
 		 * extended Gaiji zone:
 		 * 0xF040 - 0xFCFC
 		 *
 		 * JIS X0213-plane2:
 		 * 0xF040 - 0xF09E -> 0x2121 - 0x217E
 		 * 0xF140 - 0xF19E -> 0x2321 - 0x237E
 		 * ...
 		 * 0xF240 - 0xF29E -> 0x2521 - 0x257E
 		 *
 		 * 0xF09F - 0xF0FC -> 0x2821 - 0x287E
 		 * 0xF29F - 0xF2FC -> 0x2C21 - 0x2C7E
 		 * ...
 		 * 0xF44F - 0xF49E -> 0x2F21 - 0x2F7E
 		 *
 		 * 0xF49F - 0xF4FC -> 0x6E21 - 0x6E7E
 		 * ...
 		 * 0xFC9F - 0xFCFC -> 0x7E21 - 0x7E7E
 		 */
 		row = ((_wc_t)wc >> 8) & 0xFF;
 		col = (_wc_t)wc & 0xFF;
 		if (!_mskanji1(row) || !_mskanji2(col))
 			return (EILSEQ);
 		if ((ei->mode & MODE_JIS2004) == 0 || row < 0xF0) {
 			*csid = 2;
 			offset = 0x81;
 		} else {
 			*csid = 3;
 			if ((_wc_t)wc <= 0xF49E) {
 				offset = (_wc_t)wc >= 0xF29F ||
 				    ((_wc_t)wc >= 0xF09F &&
 				    (_wc_t)wc <= 0xF0FC) ? 0xED : 0xF0;
 			} else
 				offset = 0xCE;
 		}
 		row -= offset;
 		if (row >= 0x5F)
 			row -= 0x40;
 		row = row * 2 + 0x21;
 		col -= 0x1F;
 		if (col >= 0x61)
 			col -= 1;
 		if (col > 0x7E) {
 			row += 1;
 			col -= 0x5E;
 		}
 		*idx = ((_index_t)row << 8) | col;
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_MSKanji_stdenc_cstowc(_MSKanjiEncodingInfo * __restrict ei,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 	uint32_t col, row;
 	int offset;
 
 	switch (csid) {
 	case 0:
 		/* ISO-646 */
 		if (idx >= 0x80)
 			return (EILSEQ);
 		*wc = (wchar_t)idx;
 		break;
 	case 1:
 		/* kana */
 		if (idx >= 0x80)
 			return (EILSEQ);
 		*wc = (wchar_t)idx + 0x80;
 		break;
 	case 3:
 		if ((ei->mode & MODE_JIS2004) == 0)
 			return (EILSEQ);
 	/*FALLTHROUGH*/
 	case 2:
 		/* kanji */
 		row = (idx >> 8);
 		if (row < 0x21)
 			return (EILSEQ);
 		if (csid == 3) {
 			if (row <= 0x2F)
 				offset = (row == 0x22 || row >= 0x26) ?
 				    0xED : 0xF0;
 			else if (row >= 0x4D && row <= 0x7E)
 				offset = 0xCE;
 			else
 				return (EILSEQ);
 		} else {
 			if (row > 0x97)
 				return (EILSEQ);
 			offset = (row < 0x5F) ? 0x81 : 0xC1;
 		}
 		col = idx & 0xFF;
 		if (col < 0x21 || col > 0x7E)
 			return (EILSEQ);
 		row -= 0x21; col -= 0x21;
 		if ((row & 1) == 0) {
 			col += 0x40;
 			if (col >= 0x7F)
 				col += 1;
 		} else
 			col += 0x9F;
 		row = row / 2 + offset;
 		*wc = ((wchar_t)row << 8) | col;
 		break;
 	default:
 		return (EILSEQ);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_MSKanji_stdenc_get_state_desc_generic(_MSKanjiEncodingInfo * __restrict ei __unused,
     _MSKanjiState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_MSKanji_encoding_module_init(_MSKanjiEncodingInfo *  __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	const char *p;
 
 	p = var;
 	memset((void *)ei, 0, sizeof(*ei));
 	while (lenvar > 0) {
 		switch (_bcs_toupper(*p)) {
 		case 'J':
 			MATCH(JIS2004, ei->mode |= MODE_JIS2004);
 			break;
 		}
 		++p;
 		--lenvar;
 	}
 
 	return (0);
 }
 
 static void
 _citrus_MSKanji_encoding_module_uninit(_MSKanjiEncodingInfo *ei __unused)
 {
 
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(MSKanji);
 _CITRUS_STDENC_DEF_OPS(MSKanji);
 
 #include "citrus_stdenc_template.h"
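Note on the MSKanji hunk above: _citrus_MSKanji_mbrtowc_priv decides the character length from the first byte alone, then validates the second byte. A minimal sketch of that rule follows, assuming the same ranges as _mskanji1() and _mskanji2(). The names sjis_is_lead, sjis_is_trail and sjis_char_len are hypothetical and do not exist in the module.

```c
#include <stdbool.h>

/* Lead bytes of two-byte characters (same ranges as _mskanji1). */
static bool
sjis_is_lead(int c)
{
	return ((c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc));
}

/* Valid second bytes (same ranges as _mskanji2); 0x7f is excluded. */
static bool
sjis_is_trail(int c)
{
	return ((c >= 0x40 && c <= 0x7e) || (c >= 0x80 && c <= 0xfc));
}

/* Hypothetical helper: length in bytes implied by the first byte. */
static int
sjis_char_len(int first)
{
	return (sjis_is_lead(first & 0xff) ? 2 : 1);
}
```

Bytes 0xa1 through 0xdf fall outside the lead ranges, so they decode as single-byte characters; _citrus_MSKanji_stdenc_wctocs labels that single-byte range KANA.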
Index: user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/UES/citrus_ues.c	(revision 281585)
@@ -1,414 +1,414 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_ues.c,v 1.3 2012/02/12 13:51:29 wiz Exp $ */
 
 /*-
  * Copyright (c)2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_ues.h"
 
 typedef struct {
 	size_t	 mb_cur_max;
 	int	 mode;
 #define MODE_C99	1
 } _UESEncodingInfo;
 
 typedef struct {
 	int	 chlen;
 	char	 ch[12];
 } _UESState;
 
 #define _CEI_TO_EI(_cei_)               (&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)    (_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_UES_##m
 #define _ENCODING_INFO			_UESEncodingInfo
 #define _ENCODING_STATE			_UESState
 #define _ENCODING_MB_CUR_MAX(_ei_)	(_ei_)->mb_cur_max
 #define _ENCODING_IS_STATE_DEPENDENT		0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UES_init_state(_UESEncodingInfo * __restrict ei __unused,
     _UESState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_UES_pack_state(_UESEncodingInfo * __restrict ei __unused,
     void *__restrict pspriv, const _UESState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UES_unpack_state(_UESEncodingInfo * __restrict ei __unused,
     _UESState * __restrict psenc, const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static __inline int
 to_int(int ch)
 {
 
 	if (ch >= '0' && ch <= '9')
 		return (ch - '0');
 	else if (ch >= 'A' && ch <= 'F')
 		return ((ch - 'A') + 10);
 	else if (ch >= 'a' && ch <= 'f')
 		return ((ch - 'a') + 10);
 	return (-1);
 }
 
 #define ESCAPE		'\\'
 #define UCS2_ESC	'u'
 #define UCS4_ESC	'U'
 
 #define UCS2_BIT	16
 #define UCS4_BIT	32
 #define BMP_MAX		UINT32_C(0xFFFF)
 #define UCS2_MAX	UINT32_C(0x10FFFF)
 #define UCS4_MAX	UINT32_C(0x7FFFFFFF)
 
 static const char *xdig = "0123456789abcdef";
 
 static __inline int
 to_str(char *s, wchar_t wc, int bit)
 {
 	char *p;
 
 	p = s;
 	*p++ = ESCAPE;
 	switch (bit) {
 	case UCS2_BIT:
 		*p++ = UCS2_ESC;
 		break;
 	case UCS4_BIT:
 		*p++ = UCS4_ESC;
 		break;
 	default:
 		abort();
 	}
 	do {
 		*p++ = xdig[(wc >> (bit -= 4)) & 0xF];
 	} while (bit > 0);
 	return (p - s);
 }
 
 static __inline bool
 is_hi_surrogate(wchar_t wc)
 {
 
 	return (wc >= 0xD800 && wc <= 0xDBFF);
 }
 
 static __inline bool
 is_lo_surrogate(wchar_t wc)
 {
 
 	return (wc >= 0xDC00 && wc <= 0xDFFF);
 }
 
 static __inline wchar_t
 surrogate_to_ucs(wchar_t hi, wchar_t lo)
 {
 
 	hi -= 0xD800;
 	lo -= 0xDC00;
 	return ((hi << 10 | lo) + 0x10000);
 }
 
 static __inline void
 ucs_to_surrogate(wchar_t wc, wchar_t * __restrict hi, wchar_t * __restrict lo)
 {
 
 	wc -= 0x10000;
 	*hi = (wc >> 10) + 0xD800;
 	*lo = (wc & 0x3FF) + 0xDC00;
 }
 
 static __inline bool
 is_basic(wchar_t wc)
 {
 
 	return ((uint32_t)wc <= 0x9F && wc != 0x24 && wc != 0x40 &&
 	    wc != 0x60);
 }
 
 static int
 _citrus_UES_mbrtowc_priv(_UESEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _UESState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	int ch, head, num, tail;
 	wchar_t hi, wc;
 
 	if (*s == NULL) {
 		_citrus_UES_init_state(ei, psenc);
 		*nresult = 0;
 		return (0);
 	}
 	s0 = *s;
 
 	hi = (wchar_t)0;
 	tail = 0;
 
 surrogate:
 	wc = (wchar_t)0;
 	head = tail;
 	if (psenc->chlen == head) {
 		if (n-- < 1)
 			goto restart;
 		psenc->ch[psenc->chlen++] = *s0++;
 	}
 	ch = (unsigned char)psenc->ch[head++];
 	if (ch == ESCAPE) {
 		if (psenc->chlen == head) {
 			if (n-- < 1)
 				goto restart;
 			psenc->ch[psenc->chlen++] = *s0++;
 		}
 		switch (psenc->ch[head]) {
 		case UCS2_ESC:
 			tail += 6;
 			break;
 		case UCS4_ESC:
 			if (ei->mode & MODE_C99) {
 				tail = 10;
 				break;
 			}
 		/*FALLTHROUGH*/
 		default:
 			tail = 0;
 		}
 		++head;
 	}
 	for (; head < tail; ++head) {
 		if (psenc->chlen == head) {
 			if (n-- < 1) {
 restart:
 				*s = s0;
 				*nresult = (size_t)-2;
 				return (0);
 			}
 			psenc->ch[psenc->chlen++] = *s0++;
 		}
 		num = to_int((int)(unsigned char)psenc->ch[head]);
 		if (num < 0) {
 			tail = 0;
 			break;
 		}
 		wc = (wc << 4) | num;
 	}
 	head = 0;
 	switch (tail) {
 	case 0:
 		break;
 	case 6:
 		if (hi != (wchar_t)0)
 			break;
 		if ((ei->mode & MODE_C99) == 0) {
 			if (is_hi_surrogate(wc) != 0) {
 				hi = wc;
 				goto surrogate;
 			}
 			if ((uint32_t)wc <= 0x7F /* XXX */ ||
 			    is_lo_surrogate(wc) != 0)
 				break;
 			goto done;
 		}
 	/*FALLTHROUGH*/
 	case 10:
 		if (is_basic(wc) == 0 && (uint32_t)wc <= UCS4_MAX &&
 		    is_hi_surrogate(wc) == 0 && is_lo_surrogate(wc) == 0)
 			goto done;
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	case 12:
 		if (is_lo_surrogate(wc) == 0)
 			break;
 		wc = surrogate_to_ucs(hi, wc);
 		goto done;
 	}
 	ch = (unsigned char)psenc->ch[0];
 	head = psenc->chlen;
 	if (--head > 0)
 		memmove(&psenc->ch[0], &psenc->ch[1], head);
 	wc = (wchar_t)ch;
 done:
 	psenc->chlen = head;
 	if (pwc != NULL)
 		*pwc = wc;
 	*nresult = (size_t)((wc == 0) ? 0 : (s0 - *s));
 	*s = s0;
 
 	return (0);
 }
 
 static int
 _citrus_UES_wcrtomb_priv(_UESEncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wc,
     _UESState * __restrict psenc, size_t * __restrict nresult)
 {
 	wchar_t hi, lo;
 
 	if (psenc->chlen != 0)
 		return (EINVAL);
 
 	if ((ei->mode & MODE_C99) ? is_basic(wc) : (uint32_t)wc <= 0x7F) {
 		if (n-- < 1)
 			goto e2big;
 		psenc->ch[psenc->chlen++] = (char)wc;
 	} else if ((uint32_t)wc <= BMP_MAX) {
 		if (n < 6)
 			goto e2big;
 		psenc->chlen = to_str(&psenc->ch[0], wc, UCS2_BIT);
 	} else if ((ei->mode & MODE_C99) == 0 && (uint32_t)wc <= UCS2_MAX) {
 		if (n < 12)
 			goto e2big;
 		ucs_to_surrogate(wc, &hi, &lo);
 		psenc->chlen += to_str(&psenc->ch[0], hi, UCS2_BIT);
 		psenc->chlen += to_str(&psenc->ch[6], lo, UCS2_BIT);
 	} else if ((ei->mode & MODE_C99) && (uint32_t)wc <= UCS4_MAX) {
 		if (n < 10)
 			goto e2big;
 		psenc->chlen = to_str(&psenc->ch[0], wc, UCS4_BIT);
 	} else {
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	psenc->chlen = 0;
 
 	return (0);
 
 e2big:
 	*nresult = (size_t)-1;
 	return (E2BIG);
 }
 
 /*ARGSUSED*/
 static int
 _citrus_UES_stdenc_wctocs(_UESEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = 0;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UES_stdenc_cstowc(_UESEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid != 0)
 		return (EILSEQ);
 	*wc = (wchar_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UES_stdenc_get_state_desc_generic(_UESEncodingInfo * __restrict ei __unused,
     _UESState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_UES_encoding_module_uninit(_UESEncodingInfo *ei __unused)
 {
 
 	/* ei seems to be unused */
 }
 
 static int
 /*ARGSUSED*/
 _citrus_UES_encoding_module_init(_UESEncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	const char *p;
 
 	p = var;
 	memset((void *)ei, 0, sizeof(*ei));
 	while (lenvar > 0) {
 		switch (_bcs_toupper(*p)) {
 		case 'C':
 			MATCH(C99, ei->mode |= MODE_C99);
 			break;
 		}
 		++p;
 		--lenvar;
 	}
 	ei->mb_cur_max = (ei->mode & MODE_C99) ? 10 : 12;
 
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(UES);
 _CITRUS_STDENC_DEF_OPS(UES);
 
 #include "citrus_stdenc_template.h"
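Note on the UES hunk above: outside C99 mode, code points above 0xFFFF are written as two \uXXXX escapes forming a surrogate pair, using the arithmetic in surrogate_to_ucs() and ucs_to_surrogate(). A minimal sketch of that arithmetic follows, using uint32_t instead of wchar_t. The names pair_to_ucs and ucs_to_pair are hypothetical and are not part of the module.

```c
#include <stdint.h>

/* Combine a high and a low surrogate into one code point. */
static uint32_t
pair_to_ucs(uint32_t hi, uint32_t lo)
{
	hi -= 0xD800;
	lo -= 0xDC00;
	return (((hi << 10) | lo) + 0x10000);
}

/* Split a code point in 0x10000..0x10FFFF into a surrogate pair. */
static void
ucs_to_pair(uint32_t cp, uint32_t *hi, uint32_t *lo)
{
	cp -= 0x10000;
	*hi = (cp >> 10) + 0xD800;	/* upper 10 bits */
	*lo = (cp & 0x3FF) + 0xDC00;	/* lower 10 bits */
}
```

For example, U+1F600 splits into 0xD83D and 0xDE00, which the non-C99 mode would emit as the twelve bytes \ud83d\ude00 (to_str() uses lowercase hex digits).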
Index: user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/UTF1632/citrus_utf1632.c	(revision 281585)
@@ -1,453 +1,453 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_utf1632.c,v 1.9 2008/06/14 16:01:08 tnozaki Exp $	*/
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/endian.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_bcs.h"
 
 #include "citrus_utf1632.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 typedef struct {
 	int		 chlen;
 	int		 current_endian;
 	uint8_t		 ch[4];
 } _UTF1632State;
 
 #define _ENDIAN_UNKNOWN		0
 #define _ENDIAN_BIG		1
 #define _ENDIAN_LITTLE		2
 #if BYTE_ORDER == BIG_ENDIAN
 #define _ENDIAN_INTERNAL	_ENDIAN_BIG
 #define _ENDIAN_SWAPPED		_ENDIAN_LITTLE
 #else
 #define _ENDIAN_INTERNAL	_ENDIAN_LITTLE
 #define _ENDIAN_SWAPPED	_ENDIAN_BIG
 #endif
 #define _MODE_UTF32		0x00000001U
 #define _MODE_FORCE_ENDIAN	0x00000002U
 
 typedef struct {
 	int		 preffered_endian;
 	unsigned int	 cur_max;
 	uint32_t	 mode;
 } _UTF1632EncodingInfo;
 
 #define _FUNCNAME(m)			_citrus_UTF1632_##m
 #define _ENCODING_INFO			_UTF1632EncodingInfo
 #define _ENCODING_STATE			_UTF1632State
 #define _ENCODING_MB_CUR_MAX(_ei_)	((_ei_)->cur_max)
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF1632_init_state(_UTF1632EncodingInfo *ei __unused,
     _UTF1632State *s)
 {
 
 	memset(s, 0, sizeof(*s));
 }
 
 static int
 _citrus_UTF1632_mbrtowc_priv(_UTF1632EncodingInfo *ei, wchar_t *pwc,
-    const char **s, size_t n, _UTF1632State *psenc, size_t *nresult)
+    char **s, size_t n, _UTF1632State *psenc, size_t *nresult)
 {
-	const char *s0;
+	char *s0;
 	size_t result;
 	wchar_t wc = L'\0';
 	int chlenbak, endian, needlen;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_UTF1632_init_state(ei, psenc);
 		*nresult = 0; /* state independent */
 		return (0);
 	}
 
 	result = 0;
 	chlenbak = psenc->chlen;
 
 refetch:
 	needlen = ((ei->mode & _MODE_UTF32) != 0 || chlenbak >= 2) ? 4 : 2;
 
 	while (chlenbak < needlen) {
 		if (n == 0)
 			goto restart;
 		psenc->ch[chlenbak++] = *s0++;
 		n--;
 		result++;
 	}
 
 	/* judge endian marker */
 	if ((ei->mode & _MODE_UTF32) == 0) {
 		/* UTF16 */
 		if (psenc->ch[0] == 0xFE && psenc->ch[1] == 0xFF) {
 			psenc->current_endian = _ENDIAN_BIG;
 			chlenbak = 0;
 			goto refetch;
 		} else if (psenc->ch[0] == 0xFF && psenc->ch[1] == 0xFE) {
 			psenc->current_endian = _ENDIAN_LITTLE;
 			chlenbak = 0;
 			goto refetch;
 		}
 	} else {
 		/* UTF32 */
 		if (psenc->ch[0] == 0x00 && psenc->ch[1] == 0x00 &&
 		    psenc->ch[2] == 0xFE && psenc->ch[3] == 0xFF) {
 			psenc->current_endian = _ENDIAN_BIG;
 			chlenbak = 0;
 			goto refetch;
 		} else if (psenc->ch[0] == 0xFF && psenc->ch[1] == 0xFE &&
 			   psenc->ch[2] == 0x00 && psenc->ch[3] == 0x00) {
 			psenc->current_endian = _ENDIAN_LITTLE;
 			chlenbak = 0;
 			goto refetch;
 		}
 	}
 	endian = ((ei->mode & _MODE_FORCE_ENDIAN) != 0 ||
 	    psenc->current_endian == _ENDIAN_UNKNOWN) ? ei->preffered_endian :
 	    psenc->current_endian;
 
 	/* get wc */
 	if ((ei->mode & _MODE_UTF32) == 0) {
 		/* UTF16 */
 		if (needlen == 2) {
 			switch (endian) {
 			case _ENDIAN_LITTLE:
 				wc = (psenc->ch[0] |
 				    ((wchar_t)psenc->ch[1] << 8));
 				break;
 			case _ENDIAN_BIG:
 				wc = (psenc->ch[1] |
 				    ((wchar_t)psenc->ch[0] << 8));
 				break;
 			default:
 				goto ilseq;
 			}
 			if (wc >= 0xD800 && wc <= 0xDBFF) {
 				/* surrogate high */
 				needlen = 4;
 				goto refetch;
 			}
 		} else {
 			/* surrogate low */
 			wc -= 0xD800; /* wc : surrogate high (see above) */
 			wc <<= 10;
 			switch (endian) {
 			case _ENDIAN_LITTLE:
 				if (psenc->ch[3] < 0xDC || psenc->ch[3] > 0xDF)
 					goto ilseq;
 				wc |= psenc->ch[2];
 				wc |= (wchar_t)(psenc->ch[3] & 3) << 8;
 				break;
 			case _ENDIAN_BIG:
 				if (psenc->ch[2]<0xDC || psenc->ch[2]>0xDF)
 					goto ilseq;
 				wc |= psenc->ch[3];
 				wc |= (wchar_t)(psenc->ch[2] & 3) << 8;
 				break;
 			default:
 				goto ilseq;
 			}
 			wc += 0x10000;
 		}
 	} else {
 		/* UTF32 */
 		switch (endian) {
 		case _ENDIAN_LITTLE:
 			wc = (psenc->ch[0] |
 			    ((wchar_t)psenc->ch[1] << 8) |
 			    ((wchar_t)psenc->ch[2] << 16) |
 			    ((wchar_t)psenc->ch[3] << 24));
 			break;
 		case _ENDIAN_BIG:
 			wc = (psenc->ch[3] |
 			    ((wchar_t)psenc->ch[2] << 8) |
 			    ((wchar_t)psenc->ch[1] << 16) |
 			    ((wchar_t)psenc->ch[0] << 24));
 			break;
 		default:
 			goto ilseq;
 		}
 		if (wc >= 0xD800 && wc <= 0xDFFF)
 			goto ilseq;
 	}
 
 
 	*pwc = wc;
 	psenc->chlen = 0;
 	*nresult = result;
 	*s = s0;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	psenc->chlen = 0;
 	return (EILSEQ);
 
 restart:
 	*nresult = (size_t)-2;
 	psenc->chlen = chlenbak;
 	*s = s0;
 	return (0);
 }
 
 static int
 _citrus_UTF1632_wcrtomb_priv(_UTF1632EncodingInfo *ei, char *s, size_t n,
     wchar_t wc, _UTF1632State *psenc, size_t *nresult)
 {
 	wchar_t wc2;
 	static const char _bom[4] = {
 	    0x00, 0x00, 0xFE, 0xFF,
 	};
 	const char *bom = &_bom[0];
 	size_t cnt;
 
 	cnt = (size_t)0;
 	if (psenc->current_endian == _ENDIAN_UNKNOWN) {
 		if ((ei->mode & _MODE_FORCE_ENDIAN) == 0) {
 			if (ei->mode & _MODE_UTF32)
 				cnt = 4;
 			else {
 				cnt = 2;
 				bom += 2;
 			}
 			if (n < cnt)
 				goto e2big;
 			memcpy(s, bom, cnt);
 			s += cnt, n -= cnt;
 		}
 		psenc->current_endian = ei->preffered_endian;
 	}
 
 	wc2 = 0;
 	if ((ei->mode & _MODE_UTF32)==0) {
 		/* UTF16 */
 		if (wc > 0xFFFF) {
 			/* surrogate */
 			if (wc > 0x10FFFF)
 				goto ilseq;
 			if (n < 4)
 				goto e2big;
 			cnt += 4;
 			wc -= 0x10000;
 			wc2 = (wc & 0x3FF) | 0xDC00;
 			wc = (wc>>10) | 0xD800;
 		} else {
 			if (n < 2)
 				goto e2big;
 			cnt += 2;
 		}
 
 surrogate:
 		switch (psenc->current_endian) {
 		case _ENDIAN_BIG:
 			s[1] = wc;
 			s[0] = (wc >>= 8);
 			break;
 		case _ENDIAN_LITTLE:
 			s[0] = wc;
 			s[1] = (wc >>= 8);
 			break;
 		}
 		if (wc2 != 0) {
 			wc = wc2;
 			wc2 = 0;
 			s += 2;
 			goto surrogate;
 		}
 	} else {
 		/* UTF32 */
 		if (wc >= 0xD800 && wc <= 0xDFFF)
 			goto ilseq;
 		if (n < 4)
 			goto e2big;
 		cnt += 4;
 		switch (psenc->current_endian) {
 		case _ENDIAN_BIG:
 			s[3] = wc;
 			s[2] = (wc >>= 8);
 			s[1] = (wc >>= 8);
 			s[0] = (wc >>= 8);
 			break;
 		case _ENDIAN_LITTLE:
 			s[0] = wc;
 			s[1] = (wc >>= 8);
 			s[2] = (wc >>= 8);
 			s[3] = (wc >>= 8);
 			break;
 		}
 	}
 	*nresult = cnt;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 e2big:
 	*nresult = (size_t)-1;
 	return (E2BIG);
 }
 
 static void
 parse_variable(_UTF1632EncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 	const char *p;
 
 	p = var;
 	while (lenvar > 0) {
 		switch (*p) {
 		case 'B':
 		case 'b':
 			MATCH(big, ei->preffered_endian = _ENDIAN_BIG);
 			break;
 		case 'L':
 		case 'l':
 			MATCH(little, ei->preffered_endian = _ENDIAN_LITTLE);
 			break;
 		case 'i':
 		case 'I':
 			MATCH(internal, ei->preffered_endian = _ENDIAN_INTERNAL);
 			break;
 		case 's':
 		case 'S':
 			MATCH(swapped, ei->preffered_endian = _ENDIAN_SWAPPED);
 			break;
 		case 'F':
 		case 'f':
 			MATCH(force, ei->mode |= _MODE_FORCE_ENDIAN);
 			break;
 		case 'U':
 		case 'u':
 			MATCH(utf32, ei->mode |= _MODE_UTF32);
 			break;
 		}
 		p++;
 		lenvar--;
 	}
 }
 
 static int
 /*ARGSUSED*/
 _citrus_UTF1632_encoding_module_init(_UTF1632EncodingInfo * __restrict ei,
     const void * __restrict var, size_t lenvar)
 {
 
 	memset((void *)ei, 0, sizeof(*ei));
 
 	parse_variable(ei, var, lenvar);
 
 	ei->cur_max = ((ei->mode&_MODE_UTF32) == 0) ? 6 : 8;
 	/* 6: endian + surrogate */
 	/* 8: endian + normal */
 
 	if (ei->preffered_endian == _ENDIAN_UNKNOWN) {
 		ei->preffered_endian = _ENDIAN_BIG;
 	}
 
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_UTF1632_encoding_module_uninit(_UTF1632EncodingInfo *ei __unused)
 {
 
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF1632_stdenc_wctocs(_UTF1632EncodingInfo * __restrict ei __unused,
      _csid_t * __restrict csid, _index_t * __restrict idx, _wc_t wc)
 {
 
 	*csid = 0;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF1632_stdenc_cstowc(_UTF1632EncodingInfo * __restrict ei __unused,
     _wc_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid != 0)
 		return (EILSEQ);
 
 	*wc = (_wc_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF1632_stdenc_get_state_desc_generic(_UTF1632EncodingInfo * __restrict ei __unused,
     _UTF1632State * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(UTF1632);
 _CITRUS_STDENC_DEF_OPS(UTF1632);
 
 #include "citrus_stdenc_template.h"
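Note on the UTF-16 path in citrus_utf1632.c above: the decoder subtracts 0xD800 from the high surrogate, shifts it left by 10, ORs in the low 10 bits of the low surrogate, and adds 0x10000; the encoder does the reverse. The following is a minimal standalone sketch of that arithmetic only. The names sketch_utf16_combine and sketch_utf16_split are hypothetical and do not exist in libiconv_modules; the real code works byte by byte on psenc->ch and also handles endianness and BOMs, which this sketch leaves out.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of citrus_utf1632.c. */

/* Combine a high/low surrogate pair into one code point. */
static int
sketch_utf16_combine(uint16_t hi, uint16_t lo, uint32_t *cp)
{

	if (hi < 0xD800 || hi > 0xDBFF)
		return (-1);	/* not a high surrogate */
	if (lo < 0xDC00 || lo > 0xDFFF)
		return (-1);	/* not a low surrogate */
	*cp = (((uint32_t)(hi - 0xD800) << 10) |
	    (uint32_t)(lo - 0xDC00)) + 0x10000;
	return (0);
}

/* Split a supplementary-plane code point into a surrogate pair. */
static int
sketch_utf16_split(uint32_t cp, uint16_t *hi, uint16_t *lo)
{

	if (cp < 0x10000 || cp > 0x10FFFF)
		return (-1);	/* fits in one unit, or out of range */
	cp -= 0x10000;
	*hi = (uint16_t)(0xD800 | (cp >> 10));
	*lo = (uint16_t)(0xDC00 | (cp & 0x3FF));
	return (0);
}
```

For example, sketch_utf16_combine(0xD83D, 0xDE00, &cp) stores 0x1F600 in cp, and sketch_utf16_split(0x1F600, &hi, &lo) gives back 0xD83D and 0xDE00.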
Index: user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/UTF7/citrus_utf7.c	(revision 281585)
@@ -1,502 +1,502 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_utf7.c,v 1.5 2006/08/23 12:57:24 tnozaki Exp $	*/
 
 /*-
  * Copyright (c)2004, 2005 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_utf7.h"
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 #define EI_MASK		UINT16_C(0xff)
 #define EI_DIRECT	UINT16_C(0x100)
 #define EI_OPTION	UINT16_C(0x200)
 #define EI_SPACE	UINT16_C(0x400)
 
 typedef struct {
 	uint16_t	 cell[0x80];
 } _UTF7EncodingInfo;
 
 typedef struct {
 	unsigned int
 		mode: 1,	/* whether base64 mode */
 		bits: 4,	/* need to hold 0 - 15 */
 		cache: 22,	/* 22 = BASE64_BIT + UTF16_BIT */
 		surrogate: 1;	/* whether surrogate pair or not */
 	int chlen;
 	char ch[4]; /* BASE64_IN, 3 * 6 = 18, most closed to UTF16_BIT */
 } _UTF7State;
 
 #define	_CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define	_CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define	_FUNCNAME(m)			_citrus_UTF7_##m
 #define	_ENCODING_INFO			_UTF7EncodingInfo
 #define	_ENCODING_STATE			_UTF7State
 #define	_ENCODING_MB_CUR_MAX(_ei_)		4
 #define	_ENCODING_IS_STATE_DEPENDENT		1
 #define	_STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF7_init_state(_UTF7EncodingInfo * __restrict ei __unused,
     _UTF7State * __restrict s)
 {
 
 	memset((void *)s, 0, sizeof(*s));
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF7_pack_state(_UTF7EncodingInfo * __restrict ei __unused,
     void *__restrict pspriv, const _UTF7State * __restrict s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF7_unpack_state(_UTF7EncodingInfo * __restrict ei __unused,
     _UTF7State * __restrict s, const void * __restrict pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static const char base64[] =
 	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
 	"abcdefghijklmnopqrstuvwxyz"
 	"0123456789+/";
 
 static const char direct[] =
 	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
 	"abcdefghijklmnopqrstuvwxyz"
 	"0123456789'(),-./:?";
 
 static const char option[] = "!\"#$%&*;<=>@[]^_`{|}";
 static const char spaces[] = " \t\r\n";
 
 #define	BASE64_BIT	6
 #define	UTF16_BIT	16
 
 #define	BASE64_MAX	0x3f
 #define	UTF16_MAX	UINT16_C(0xffff)
 #define	UTF32_MAX	UINT32_C(0x10ffff)
 
 #define	BASE64_IN	'+'
 #define	BASE64_OUT	'-'
 
 #define	SHIFT7BIT(c)	((c) >> 7)
 #define	ISSPECIAL(c)	((c) == '\0' || (c) == BASE64_IN)
 
 #define	FINDLEN(ei, c) \
 	(SHIFT7BIT((c)) ? -1 : (((ei)->cell[(c)] & EI_MASK) - 1))
 
 #define	ISDIRECT(ei, c)	(!SHIFT7BIT((c)) && (ISSPECIAL((c)) || \
 	ei->cell[(c)] & (EI_DIRECT | EI_OPTION | EI_SPACE)))
 
 #define	ISSAFE(ei, c)	(!SHIFT7BIT((c)) && (ISSPECIAL((c)) || \
 	(c < 0x80 && ei->cell[(c)] & (EI_DIRECT | EI_SPACE))))
 
 /* surrogate pair */
 #define	SRG_BASE	UINT32_C(0x10000)
 #define	HISRG_MIN	UINT16_C(0xd800)
 #define	HISRG_MAX	UINT16_C(0xdbff)
 #define	LOSRG_MIN	UINT16_C(0xdc00)
 #define	LOSRG_MAX	UINT16_C(0xdfff)
 
 static int
 _citrus_UTF7_mbtoutf16(_UTF7EncodingInfo * __restrict ei,
-    uint16_t * __restrict u16, const char ** __restrict s, size_t n,
+    uint16_t * __restrict u16, char ** __restrict s, size_t n,
     _UTF7State * __restrict psenc, size_t * __restrict nresult)
 {
 	_UTF7State sv;
-	const char *s0;
+	char *s0;
 	int done, i, len;
 
 	s0 = *s;
 	sv = *psenc;
 
 	for (i = 0, done = 0; done == 0; i++) {
 		if (i == psenc->chlen) {
 			if (n-- < 1) {
 				*nresult = (size_t)-2;
 				*s = s0;
 				sv.chlen = psenc->chlen;
 				memcpy(sv.ch, psenc->ch, sizeof(sv.ch));
 				*psenc = sv;
 				return (0);
 			}
 			psenc->ch[psenc->chlen++] = *s0++;
 		}
 		if (SHIFT7BIT((int)psenc->ch[i]))
 			goto ilseq;
 		if (!psenc->mode) {
 			if (psenc->bits > 0 || psenc->cache > 0)
 				return (EINVAL);
 			if (psenc->ch[i] == BASE64_IN)
 				psenc->mode = 1;
 			else {
 				if (!ISDIRECT(ei, (int)psenc->ch[i]))
 					goto ilseq;
 				*u16 = (uint16_t)psenc->ch[i];
 				done = 1;
 				continue;
 			}
 		} else {
 			if (psenc->ch[i] == BASE64_OUT && psenc->cache == 0) {
 				psenc->mode = 0;
 				*u16 = (uint16_t)BASE64_IN;
 				done = 1;
 				continue;
 			}
 			len = FINDLEN(ei, (int)psenc->ch[i]);
 			if (len < 0) {
 				if (psenc->bits >= BASE64_BIT)
 					return (EINVAL);
 				psenc->mode = 0;
 				psenc->bits = psenc->cache = 0;
 				if (psenc->ch[i] != BASE64_OUT) {
 					if (!ISDIRECT(ei, (int)psenc->ch[i]))
 						goto ilseq;
 					*u16 = (uint16_t)psenc->ch[i];
 					done = 1;
 				} else {
 					psenc->chlen--;
 					i--;
 				}
 			} else {
 				psenc->cache =
 				    (psenc->cache << BASE64_BIT) | len;
 				switch (psenc->bits) {
 				case 0: case 2: case 4: case 6: case 8:
 					psenc->bits += BASE64_BIT;
 					break;
 				case 10: case 12: case 14:
 					psenc->bits -= (UTF16_BIT - BASE64_BIT);
 					*u16 = (psenc->cache >> psenc->bits) &
 					    UTF16_MAX;
 					done = 1;
 					break;
 				default:
 					return (EINVAL);
 				}
 			}
 		}
 	}
 
 	if (psenc->chlen > i)
 		return (EINVAL);
 	psenc->chlen = 0;
 	*nresult = (size_t)((*u16 == 0) ? 0 : s0 - *s);
 	*s = s0;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 }
 
 static int
 _citrus_UTF7_mbrtowc_priv(_UTF7EncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _UTF7State * __restrict psenc, size_t * __restrict nresult)
 {
 	uint32_t u32;
 	uint16_t hi, lo;
 	size_t nr, siz;
 	int err;
 
 	if (*s == NULL) {
 		_citrus_UTF7_init_state(ei, psenc);
 		*nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	if (psenc->surrogate) {
 		hi = (psenc->cache >> psenc->bits) & UTF16_MAX;
 		if (hi < HISRG_MIN || hi > HISRG_MAX)
 			return (EINVAL);
 		siz = 0;
 	} else {
 		err = _citrus_UTF7_mbtoutf16(ei, &hi, s, n, psenc, &nr);
 		if (nr == (size_t)-1 || nr == (size_t)-2) {
 			*nresult = nr;
 			return (err);
 		}
 		if (err != 0)
 			return (err);
 		n -= nr;
 		siz = nr;
 		if (hi < HISRG_MIN || hi > HISRG_MAX) {
 			u32 = (uint32_t)hi;
 			goto done;
 		}
 		psenc->surrogate = 1;
 	}
 	err = _citrus_UTF7_mbtoutf16(ei, &lo, s, n, psenc, &nr);
 	if (nr == (size_t)-1 || nr == (size_t)-2) {
 		*nresult = nr;
 		return (err);
 	}
 	if (err != 0)
 		return (err);
 	hi -= HISRG_MIN;
 	lo -= LOSRG_MIN;
 	u32 = (hi << 10 | lo) + SRG_BASE;
 	siz += nr;
 done:
 	if (pwc != NULL)
 		*pwc = (wchar_t)u32;
 	if (u32 == (uint32_t)0) {
 		*nresult = (size_t)0;
 		_citrus_UTF7_init_state(ei, psenc);
 	} else {
 		*nresult = siz;
 		psenc->surrogate = 0;
 	}
 	return (err);
 }
 
 static int
 _citrus_UTF7_utf16tomb(_UTF7EncodingInfo * __restrict ei,
     char * __restrict s, size_t n __unused, uint16_t u16,
     _UTF7State * __restrict psenc, size_t * __restrict nresult)
 {
 	int bits, i;
 
 	if (psenc->chlen != 0 || psenc->bits > BASE64_BIT)
 		return (EINVAL);
 
 	if (ISSAFE(ei, u16)) {
 		if (psenc->mode) {
 			if (psenc->bits > 0) {
 				bits = BASE64_BIT - psenc->bits;
 				i = (psenc->cache << bits) & BASE64_MAX;
 				psenc->ch[psenc->chlen++] = base64[i];
 				psenc->bits = psenc->cache = 0;
 			}
 			if (u16 == BASE64_OUT || FINDLEN(ei, u16) >= 0)
 				psenc->ch[psenc->chlen++] = BASE64_OUT;
 			psenc->mode = 0;
 		}
 		if (psenc->bits != 0)
 			return (EINVAL);
 		psenc->ch[psenc->chlen++] = (char)u16;
 		if (u16 == BASE64_IN)
 			psenc->ch[psenc->chlen++] = BASE64_OUT;
 	} else {
 		if (!psenc->mode) {
 			if (psenc->bits > 0)
 				return (EINVAL);
 			psenc->ch[psenc->chlen++] = BASE64_IN;
 			psenc->mode = 1;
 		}
 		psenc->cache = (psenc->cache << UTF16_BIT) | u16;
 		bits = UTF16_BIT + psenc->bits;
 		psenc->bits = bits % BASE64_BIT;
 		while ((bits -= BASE64_BIT) >= 0) {
 			i = (psenc->cache >> bits) & BASE64_MAX;
 			psenc->ch[psenc->chlen++] = base64[i];
 		}
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	psenc->chlen = 0;
 
 	return (0);
 }
 
 static int
 _citrus_UTF7_wcrtomb_priv(_UTF7EncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wchar,
     _UTF7State * __restrict psenc, size_t * __restrict nresult)
 {
 	uint32_t u32;
 	uint16_t u16[2];
 	int err, i, len;
 	size_t nr, siz;
 
 	u32 = (uint32_t)wchar;
 	if (u32 <= UTF16_MAX) {
 		u16[0] = (uint16_t)u32;
 		len = 1;
 	} else if (u32 <= UTF32_MAX) {
 		u32 -= SRG_BASE;
 		u16[0] = (u32 >> 10) + HISRG_MIN;
 		u16[1] = ((uint16_t)(u32 & UINT32_C(0x3ff))) + LOSRG_MIN;
 		len = 2;
 	} else {
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	siz = 0;
 	for (i = 0; i < len; ++i) {
 		err = _citrus_UTF7_utf16tomb(ei, s, n, u16[i], psenc, &nr);
 		if (err != 0)
 			return (err); /* XXX: state has been modified */
 		s += nr;
 		n -= nr;
 		siz += nr;
 	}
 	*nresult = siz;
 
 	return (0);
 }
 
 static int
 /* ARGSUSED */
 _citrus_UTF7_put_state_reset(_UTF7EncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, _UTF7State * __restrict psenc,
     size_t * __restrict nresult)
 {
 	int bits, pos;
 
 	if (psenc->chlen != 0 || psenc->bits > BASE64_BIT || psenc->surrogate)
 		return (EINVAL);
 
 	if (psenc->mode) {
 		if (psenc->bits > 0) {
 			if (n-- < 1)
 				return (E2BIG);
 			bits = BASE64_BIT - psenc->bits;
 			pos = (psenc->cache << bits) & BASE64_MAX;
 			psenc->ch[psenc->chlen++] = base64[pos];
 			psenc->ch[psenc->chlen++] = BASE64_OUT;
 			psenc->bits = psenc->cache = 0;
 		}
 		psenc->mode = 0;
 	}
 	if (psenc->bits != 0)
 		return (EINVAL);
 	if (n-- < 1)
 		return (E2BIG);
 
 	*nresult = (size_t)psenc->chlen;
 	if (psenc->chlen > 0) {
 		memcpy(s, psenc->ch, psenc->chlen);
 		psenc->chlen = 0;
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF7_stdenc_wctocs(_UTF7EncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = 0;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF7_stdenc_cstowc(_UTF7EncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid != 0)
 		return (EILSEQ);
 	*wc = (wchar_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF7_stdenc_get_state_desc_generic(_UTF7EncodingInfo * __restrict ei __unused,
     _UTF7State * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_UTF7_encoding_module_uninit(_UTF7EncodingInfo *ei __unused)
 {
 
 	/* ei seems to be unused */
 }
 
 static int
 /*ARGSUSED*/
 _citrus_UTF7_encoding_module_init(_UTF7EncodingInfo * __restrict ei,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 	const char *s;
 
 	memset(ei, 0, sizeof(*ei));
 
 #define FILL(str, flag)				\
 do {						\
 	for (s = str; *s != '\0'; s++)		\
 		ei->cell[*s & 0x7f] |= flag;	\
 } while (/*CONSTCOND*/0)
 
 	FILL(base64, (s - base64) + 1);
 	FILL(direct, EI_DIRECT);
 	FILL(option, EI_OPTION);
 	FILL(spaces, EI_SPACE);
 
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(UTF7);
 _CITRUS_STDENC_DEF_OPS(UTF7);
 
 #include "citrus_stdenc_template.h"
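Note on the base64 branch of _citrus_UTF7_mbtoutf16 above: each base64 character contributes 6 bits to a cache, and a UTF-16 unit is emitted once 16 or more bits have accumulated. The following is a simplified sketch of that accumulation, assuming the whole base64 run is already in memory. The names sketch_b64val and sketch_utf7_run are hypothetical; the real function is restartable, tracks the shift mode in _UTF7State, and looks values up through ei->cell rather than strchr.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of citrus_utf7.c. */

static const char sketch_b64[] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"abcdefghijklmnopqrstuvwxyz"
	"0123456789+/";

/* Value of one base64 character, or -1 if it is not in the alphabet. */
static int
sketch_b64val(char c)
{
	const char *p;

	if (c == '\0')
		return (-1);	/* strchr would match the terminator */
	p = strchr(sketch_b64, c);
	return (p == NULL ? -1 : (int)(p - sketch_b64));
}

/*
 * Decode the characters between '+' and '-' of a UTF-7 shifted run
 * into UTF-16 units.  Returns the number of units written to out.
 */
static size_t
sketch_utf7_run(const char *run, uint16_t *out, size_t outmax)
{
	uint32_t cache = 0;
	size_t k = 0;
	int bits = 0, v;

	for (; *run != '\0'; run++) {
		v = sketch_b64val(*run);
		if (v < 0)
			break;		/* end of the base64 run */
		cache = (cache << 6) | (uint32_t)v;
		bits += 6;
		if (bits >= 16) {
			if (k == outmax)
				break;	/* output buffer full */
			bits -= 16;
			out[k++] = (uint16_t)((cache >> bits) & 0xFFFF);
		}
	}
	return (k);
}
```

For example, the run "Jjo" decodes to the single unit 0x263A, and "AGEAYg" decodes to 0x0061 followed by 0x0062.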
Index: user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/UTF8/citrus_utf8.c	(revision 281585)
@@ -1,351 +1,351 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_utf8.c,v 1.17 2008/06/14 16:01:08 tnozaki Exp $	*/
 
 /*-
  * Copyright (c)2002 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*-
  * Copyright (c) 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Paul Borman at Krystal Technologies.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_utf8.h"
 
 
 /* ----------------------------------------------------------------------
  * private stuffs used by templates
  */
 
 static uint8_t _UTF8_count_array[256] = {
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 00 - 0F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 10 - 1F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 20 - 2F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 30 - 3F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 40 - 4F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 50 - 5F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 60 - 6F */
 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,	/* 70 - 7F */
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,	/* 80 - 8F */
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,	/* 90 - 9F */
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,	/* A0 - AF */
 	0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,	/* B0 - BF */
 	2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,	/* C0 - CF */
 	2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,	/* D0 - DF */
 	3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,	/* E0 - EF */
 	4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0	/* F0 - FF */
 };
 
 static uint8_t const *_UTF8_count = _UTF8_count_array;
 
 static const uint32_t _UTF8_range[] = {
 	0,	/*dummy*/
 	0x00000000, 0x00000080, 0x00000800, 0x00010000,
 	0x00200000, 0x04000000, 0x80000000,
 };
 
 typedef struct {
 	int	 chlen;
 	char	 ch[6];
 } _UTF8State;
 
 typedef void *_UTF8EncodingInfo;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_ei_, _func_)	(_ei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_UTF8_##m
 #define _ENCODING_INFO			_UTF8EncodingInfo
 #define _ENCODING_STATE			_UTF8State
 #define _ENCODING_MB_CUR_MAX(_ei_)	6
 #define _ENCODING_IS_STATE_DEPENDENT	0
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static size_t
 _UTF8_findlen(wchar_t v)
 {
 	size_t i;
 	uint32_t c;
 
 	c = (uint32_t)v;	/*XXX*/
 	for (i = 1; i < sizeof(_UTF8_range) / sizeof(_UTF8_range[0]) - 1; i++)
 		if (c >= _UTF8_range[i] && c < _UTF8_range[i + 1])
 			return (i);
 
 	return (-1);	/*out of range*/
 }
 
 static __inline bool
 _UTF8_surrogate(wchar_t wc)
 {
 
 	return (wc >= 0xd800 && wc <= 0xdfff);
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF8_init_state(_UTF8EncodingInfo *ei __unused, _UTF8State *s)
 {
 
 	s->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF8_pack_state(_UTF8EncodingInfo *ei __unused, void *pspriv,
     const _UTF8State *s)
 {
 
 	memcpy(pspriv, (const void *)s, sizeof(*s));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_UTF8_unpack_state(_UTF8EncodingInfo *ei __unused, _UTF8State *s,
     const void *pspriv)
 {
 
 	memcpy((void *)s, pspriv, sizeof(*s));
 }
 #endif
 
 static int
-_citrus_UTF8_mbrtowc_priv(_UTF8EncodingInfo *ei, wchar_t *pwc, const char **s,
+_citrus_UTF8_mbrtowc_priv(_UTF8EncodingInfo *ei, wchar_t *pwc, char **s,
     size_t n, _UTF8State *psenc, size_t *nresult)
 {
-	const char *s0;
+	char *s0;
 	wchar_t wchar;
 	int i;
 	uint8_t c;
 
 	s0 = *s;
 
 	if (s0 == NULL) {
 		_citrus_UTF8_init_state(ei, psenc);
 		*nresult = 0; /* state independent */
 		return (0);
 	}
 
 	/* make sure we have the first byte in the buffer */
 	if (psenc->chlen == 0) {
 		if (n-- < 1)
 			goto restart;
 		psenc->ch[psenc->chlen++] = *s0++;
 	}
 
 	c = _UTF8_count[psenc->ch[0] & 0xff];
 	if (c < 1 || c < psenc->chlen)
 		goto ilseq;
 
 	if (c == 1)
 		wchar = psenc->ch[0] & 0xff;
 	else {
 		while (psenc->chlen < c) {
 			if (n-- < 1)
 				goto restart;
 			psenc->ch[psenc->chlen++] = *s0++;
 		}
 		wchar = psenc->ch[0] & (0x7f >> c);
 		for (i = 1; i < c; i++) {
 			if ((psenc->ch[i] & 0xc0) != 0x80)
 				goto ilseq;
 			wchar <<= 6;
 			wchar |= (psenc->ch[i] & 0x3f);
 		}
 		if (_UTF8_surrogate(wchar) || _UTF8_findlen(wchar) != c)
 			goto ilseq;
 	}
 	if (pwc != NULL)
 		*pwc = wchar;
 	*nresult = (wchar == 0) ? 0 : s0 - *s;
 	*s = s0;
 	psenc->chlen = 0;
 
 	return (0);
 
 ilseq:
 	*nresult = (size_t)-1;
 	return (EILSEQ);
 
 restart:
 	*s = s0;
 	*nresult = (size_t)-2;
 	return (0);
 }
 
 static int
 _citrus_UTF8_wcrtomb_priv(_UTF8EncodingInfo *ei __unused, char *s, size_t n,
     wchar_t wc, _UTF8State *psenc __unused, size_t *nresult)
 {
 	wchar_t c;
 	size_t cnt;
 	int i, ret;
 
 	if (_UTF8_surrogate(wc)) {
 		ret = EILSEQ;
 		goto err;
 	}
 	cnt = _UTF8_findlen(wc);
 	if (cnt <= 0 || cnt > 6) {
 		/* invalid UCS4 value */
 		ret = EILSEQ;
 		goto err;
 	}
 	if (n < cnt) {
 		/* bound check failure */
 		ret = E2BIG;
 		goto err;
 	}
 
 	c = wc;
 	if (s) {
 		for (i = cnt - 1; i > 0; i--) {
 			s[i] = 0x80 | (c & 0x3f);
 			c >>= 6;
 		}
 		s[0] = c;
 		if (cnt == 1)
 			s[0] &= 0x7f;
 		else {
 			s[0] &= (0x7f >> cnt);
 			s[0] |= ((0xff00 >> cnt) & 0xff);
 		}
 	}
 
 	*nresult = (size_t)cnt;
 	return (0);
 
 err:
 	*nresult = (size_t)-1;
 	return (ret);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF8_stdenc_wctocs(_UTF8EncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx,
     wchar_t wc)
 {
 
 	*csid = 0;
 	*idx = (_citrus_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF8_stdenc_cstowc(_UTF8EncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	if (csid != 0)
 		return (EILSEQ);
 
 	*wc = (wchar_t)idx;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_UTF8_stdenc_get_state_desc_generic(_UTF8EncodingInfo * __restrict ei __unused,
     _UTF8State * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ? _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_UTF8_encoding_module_init(_UTF8EncodingInfo * __restrict ei __unused,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_UTF8_encoding_module_uninit(_UTF8EncodingInfo *ei __unused)
 {
 
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(UTF8);
 _CITRUS_STDENC_DEF_OPS(UTF8);
 
 #include "citrus_stdenc_template.h"
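Note on _citrus_UTF8_mbrtowc_priv above: the lead byte selects the sequence length (the _UTF8_count table), each continuation byte must match 10xxxxxx, and the result is rejected if it is a surrogate or if a shorter encoding exists (the _UTF8_findlen comparison). The following is a minimal sketch of those checks over a complete buffer. All sketch_ names are hypothetical; the real function is restartable through _UTF8State, and like it this sketch accepts the historical 5- and 6-byte forms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of citrus_utf8.c. */

/* Sequence length implied by a lead byte; 0 if it cannot start one. */
static int
sketch_utf8_seqlen(uint8_t lead)
{

	if (lead < 0x80) return (1);
	if (lead < 0xC0) return (0);	/* continuation byte */
	if (lead < 0xE0) return (2);
	if (lead < 0xF0) return (3);
	if (lead < 0xF8) return (4);
	if (lead < 0xFC) return (5);
	if (lead < 0xFE) return (6);
	return (0);			/* 0xFE and 0xFF are never valid */
}

/* Shortest encoding length for a code point; -1 if out of range. */
static int
sketch_utf8_minlen(uint32_t cp)
{

	if (cp < 0x80) return (1);
	if (cp < 0x800) return (2);
	if (cp < 0x10000) return (3);
	if (cp < 0x200000) return (4);
	if (cp < 0x4000000) return (5);
	if (cp < 0x80000000) return (6);
	return (-1);
}

/*
 * Decode one character from s[0..n).  Returns the number of bytes
 * consumed, -1 for an illegal sequence, -2 for an incomplete one.
 */
static int
sketch_utf8_decode(const uint8_t *s, size_t n, uint32_t *out)
{
	uint32_t cp;
	int i, len;

	if (n == 0)
		return (-2);
	len = sketch_utf8_seqlen(s[0]);
	if (len == 0)
		return (-1);
	if (n < (size_t)len)
		return (-2);
	cp = (len == 1) ? s[0] : (uint32_t)(s[0] & (0x7f >> len));
	for (i = 1; i < len; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return (-1);	/* not a continuation byte */
		cp = (cp << 6) | (uint32_t)(s[i] & 0x3f);
	}
	/* Reject surrogates and overlong (non-shortest) forms. */
	if ((cp >= 0xd800 && cp <= 0xdfff) || sketch_utf8_minlen(cp) != len)
		return (-1);
	*out = cp;
	return (len);
}
```

For example, the bytes E2 82 AC decode to 0x20AC with a return value of 3, while the overlong pair C0 80 and the encoded surrogate ED A0 80 both return -1.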
Index: user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/VIQR/citrus_viqr.c	(revision 281585)
@@ -1,498 +1,498 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_viqr.c,v 1.5 2011/11/19 18:20:13 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 #include <sys/queue.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_bcs.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_viqr.h"
 
 #define ESCAPE	'\\'
 
 /*
  * this table generated from RFC 1456.
  */
 static const char *mnemonic_rfc1456[0x100] = {
   NULL , NULL , "A(?", NULL , NULL , "A(~", "A^~", NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , "Y?" , NULL , NULL , NULL ,
   NULL , "Y~" , NULL , NULL , NULL , NULL , "Y." , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   NULL , NULL , NULL , NULL , NULL , NULL , NULL , NULL ,
   "A." , "A('", "A(`", "A(.", "A^'", "A^`", "A^?", "A^.",
   "E~" , "E." , "E^'", "E^`", "E^?", "E^~", "E^.", "O^'",
   "O^`", "O^?", "O^~", "O^.", "O+.", "O+'", "O+`", "O+?",
   "I." , "O?" , "O." , "I?" , "U?" , "U~" , "U." , "Y`" ,
   "O~" , "a('", "a(`", "a(.", "a^'", "a^`", "a^?", "a^.",
   "e~" , "e." , "e^'", "e^`", "e^?", "e^~", "e^.", "o^'",
   "o^`", "o^?", "o^~", "O+~", "O+" , "o^.", "o+`", "o+?",
   "i." , "U+.", "U+'", "U+`", "U+?", "o+" , "o+'", "U+" ,
   "A`" , "A'" , "A^" , "A~" , "A?" , "A(" , "a(?", "a(~",
   "E`" , "E'" , "E^" , "E?" , "I`" , "I'" , "I~" , "y`" ,
   "DD" , "u+'", "O`" , "O'" , "O^" , "a." , "y?" , "u+`",
   "u+?", "U`" , "U'" , "y~" , "y." , "Y'" , "o+~", "u+" ,
   "a`" , "a'" , "a^" , "a~" , "a?" , "a(" , "u+~", "a^~",
   "e`" , "e'" , "e^" , "e?" , "i`" , "i'" , "i~" , "i?" ,
   "dd" , "u+.", "o`" , "o'" , "o^" , "o~" , "o?" , "o." ,
   "u." , "u`" , "u'" , "u~" , "u?" , "y'" , "o+.", "U+~",
 };
 
 typedef struct {
 	const char	*name;
 	wchar_t		 value;
 } mnemonic_def_t;
 
 static const mnemonic_def_t mnemonic_ext[] = {
 /* add extra mnemonic here (should be sorted by wchar_t order). */
 };
 static const size_t mnemonic_ext_size =
 	sizeof(mnemonic_ext) / sizeof(mnemonic_def_t);
 
 static const char *
 mnemonic_ext_find(wchar_t wc, const mnemonic_def_t *head, size_t n)
 {
 	const mnemonic_def_t *mid;
 
 	for (; n > 0; n >>= 1) {
 		mid = head + (n >> 1);
 		if (mid->value == wc)
 			return (mid->name);
 		else if (mid->value < wc) {
 			head = mid + 1;
 			--n;
 		}
 	}
 	return (NULL);
 }
 
 struct mnemonic_t;
 typedef TAILQ_HEAD(mnemonic_list_t, mnemonic_t) mnemonic_list_t;
 typedef struct mnemonic_t {
 	TAILQ_ENTRY(mnemonic_t)	 entry;
 	struct mnemonic_t	*parent;
 	mnemonic_list_t		 child;
 	wchar_t			 value;
 	int			 ascii;
 } mnemonic_t;
 
 static mnemonic_t *
 mnemonic_list_find(mnemonic_list_t *ml, int ch)
 {
 	mnemonic_t *m;
 
 	TAILQ_FOREACH(m, ml, entry) {
 		if (m->ascii == ch)
 			return (m);
 	}
 
 	return (NULL);
 }
 
 static mnemonic_t *
 mnemonic_create(mnemonic_t *parent, int ascii, wchar_t value)
 {
 	mnemonic_t *m;
 
 	m = malloc(sizeof(*m));
 	if (m != NULL) {
 		m->parent = parent;
 		m->ascii = ascii;
 		m->value = value;
 		TAILQ_INIT(&m->child);
 	}
 
 	return (m);
 }
 
 static int
 mnemonic_append_child(mnemonic_t *m, const char *s,
     wchar_t value, wchar_t invalid)
 {
 	mnemonic_t *m0;
 	int ch;
 
 	ch = (unsigned char)*s++;
 	if (ch == '\0')
 		return (EINVAL);
 	m0 = mnemonic_list_find(&m->child, ch);
 	if (m0 == NULL) {
 		m0 = mnemonic_create(m, ch, (wchar_t)ch);
 		if (m0 == NULL)
 			return (ENOMEM);
 		TAILQ_INSERT_TAIL(&m->child, m0, entry);
 	}
 	m = m0;
 	for (m0 = NULL; (ch = (unsigned char)*s) != '\0'; ++s) {
 		m0 = mnemonic_list_find(&m->child, ch);
 		if (m0 == NULL) {
 			m0 = mnemonic_create(m, ch, invalid);
 			if (m0 == NULL)
 				return (ENOMEM);
 			TAILQ_INSERT_TAIL(&m->child, m0, entry);
 		}
 		m = m0;
 	}
 	if (m0 == NULL)
 		return (EINVAL);
 	m0->value = value;
 
 	return (0);
 }
 
 static void
 mnemonic_destroy(mnemonic_t *m)
 {
-	mnemonic_t *m0;
+	mnemonic_t *m0, *m1;
 
-	TAILQ_FOREACH(m0, &m->child, entry)
+	TAILQ_FOREACH_SAFE(m0, &m->child, entry, m1)
 		mnemonic_destroy(m0);
 	free(m);
 }
 
 typedef struct {
 	mnemonic_t	*mroot;
 	wchar_t		 invalid;
 	size_t		 mb_cur_max;
 } _VIQREncodingInfo;
 
 typedef struct {
 	int	 chlen;
 	char	 ch[MB_LEN_MAX];
 } _VIQRState;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_VIQR_##m
 #define _ENCODING_INFO			_VIQREncodingInfo
 #define _ENCODING_STATE			_VIQRState
 #define _ENCODING_MB_CUR_MAX(_ei_)	(_ei_)->mb_cur_max
 #define _ENCODING_IS_STATE_DEPENDENT		1
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	0
 
 static __inline void
 /*ARGSUSED*/
 _citrus_VIQR_init_state(_VIQREncodingInfo * __restrict ei __unused,
     _VIQRState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_VIQR_pack_state(_VIQREncodingInfo * __restrict ei __unused,
     void *__restrict pspriv, const _VIQRState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_VIQR_unpack_state(_VIQREncodingInfo * __restrict ei __unused,
     _VIQRState * __restrict psenc, const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static int
 _citrus_VIQR_mbrtowc_priv(_VIQREncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char ** __restrict s, size_t n,
+    wchar_t * __restrict pwc, char ** __restrict s, size_t n,
     _VIQRState * __restrict psenc, size_t * __restrict nresult)
 {
 	mnemonic_t *m, *m0;
-	const char *s0;
+	char *s0;
 	wchar_t wc;
 	ssize_t i;
 	int ch, escape;
 
 	if (*s == NULL) {
 		_citrus_VIQR_init_state(ei, psenc);
 		*nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	s0 = *s;
 
 	i = 0;
 	m = ei->mroot;
 	for (escape = 0;;) {
 		if (psenc->chlen == i) {
 			if (n-- < 1) {
 				*s = s0;
 				*nresult = (size_t)-2;
 				return (0);
 			}
 			psenc->ch[psenc->chlen++] = *s0++;
 		}
 		ch = (unsigned char)psenc->ch[i++];
 		if (ch == ESCAPE) {
 			if (m != ei->mroot)
 				break;
 			escape = 1;
 			continue;
 		}
 		if (escape != 0)
 			break;
 		m0 = mnemonic_list_find(&m->child, ch);
 		if (m0 == NULL)
 			break;
 		m = m0;
 	}
 	while (m != ei->mroot) {
 		--i;
 		if (m->value != ei->invalid)
 			break;
 		m = m->parent;
 	}
 	if (ch == ESCAPE && m != ei->mroot)
 		++i;
 	psenc->chlen -= i;
 	memmove(&psenc->ch[0], &psenc->ch[i], psenc->chlen);
 	wc = (m == ei->mroot) ? (wchar_t)ch : m->value;
 	if (pwc != NULL)
 		*pwc = wc;
 	*nresult = (size_t)(wc == 0 ? 0 : s0 - *s);
 	*s = s0;
 
 	return (0);
 }
 
 static int
 _citrus_VIQR_wcrtomb_priv(_VIQREncodingInfo * __restrict ei,
     char * __restrict s, size_t n, wchar_t wc,
     _VIQRState * __restrict psenc, size_t * __restrict nresult)
 {
 	mnemonic_t *m;
 	const char *p;
 	int ch = 0;
 
 	switch (psenc->chlen) {
 	case 0: case 1:
 		break;
 	default:
 		return (EINVAL);
 	}
 	m = NULL;
 	if ((uint32_t)wc <= 0xFF) {
 		p = mnemonic_rfc1456[wc & 0xFF];
 		if (p != NULL)
 			goto mnemonic_found;
 		if (n-- < 1)
 			goto e2big;
 		ch = (unsigned int)wc;
 		m = ei->mroot;
 		if (psenc->chlen > 0) {
 			m = mnemonic_list_find(&m->child, psenc->ch[0]);
 			if (m == NULL)
 				return (EINVAL);
 			psenc->ch[0] = ESCAPE;
 		}
 		if (mnemonic_list_find(&m->child, ch) == NULL) {
 			psenc->chlen = 0;
 			m = NULL;
 		}
 		psenc->ch[psenc->chlen++] = ch;
 	} else {
 		p = mnemonic_ext_find(wc, &mnemonic_ext[0], mnemonic_ext_size);
 		if (p == NULL) {
 			*nresult = (size_t)-1;
 			return (EILSEQ);
 		} else {
 mnemonic_found:
 			psenc->chlen = 0;
 			while (*p != '\0') {
 				if (n-- < 1)
 					goto e2big;
 				psenc->ch[psenc->chlen++] = *p++;
 			}
 		}
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	if (m == ei->mroot) {
 		psenc->ch[0] = ch;
 		psenc->chlen = 1;
 	} else
 		psenc->chlen = 0;
 
 	return (0);
 
 e2big:
 	*nresult = (size_t)-1;
 	return (E2BIG);
 }
 
 static int
 /* ARGSUSED */
 _citrus_VIQR_put_state_reset(_VIQREncodingInfo * __restrict ei __unused,
     char * __restrict s __unused, size_t n __unused,
     _VIQRState * __restrict psenc, size_t * __restrict nresult)
 {
 
 	switch (psenc->chlen) {
 	case 0: case 1:
 		break;
 	default:
 		return (EINVAL);
 	}
 	*nresult = 0;
 	psenc->chlen = 0;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_VIQR_stdenc_wctocs(_VIQREncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = 0;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_VIQR_stdenc_cstowc(_VIQREncodingInfo * __restrict ei __unused,
     wchar_t * __restrict pwc, _csid_t csid, _index_t idx)
 {
 
 	if (csid != 0)
 		return (EILSEQ);
 	*pwc = (wchar_t)idx;
 
 	return (0);
 }
 
 static void
 _citrus_VIQR_encoding_module_uninit(_VIQREncodingInfo *ei)
 {
 
 	mnemonic_destroy(ei->mroot);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_VIQR_encoding_module_init(_VIQREncodingInfo * __restrict ei,
     const void * __restrict var __unused, size_t lenvar __unused)
 {
 	const char *s;
 	size_t i, n;
 	int errnum;
 
 	ei->mb_cur_max = 1;
 	ei->invalid = (wchar_t)-1;
 	ei->mroot = mnemonic_create(NULL, '\0', ei->invalid);
 	if (ei->mroot == NULL)
 		return (ENOMEM);
 	for (i = 0; i < sizeof(mnemonic_rfc1456) / sizeof(const char *); ++i) {
 		s = mnemonic_rfc1456[i];
 		if (s == NULL)
 			continue;
 		n = strlen(s);
 		if (ei->mb_cur_max < n)
 			ei->mb_cur_max = n;
 		errnum = mnemonic_append_child(ei->mroot,
 		    s, (wchar_t)i, ei->invalid);
 		if (errnum != 0) {
 			_citrus_VIQR_encoding_module_uninit(ei);
 			return (errnum);
 		}
 	}
 	/* a + 1 < b + 1 here to silence gcc warning about unsigned < 0. */
 	for (i = 0; i + 1 < mnemonic_ext_size + 1; ++i) {
 		const mnemonic_def_t *p;
 
 		p = &mnemonic_ext[i];
 		n = strlen(p->name);
 		if (ei->mb_cur_max < n)
 			ei->mb_cur_max = n;
 		errnum = mnemonic_append_child(ei->mroot,
 		    p->name, p->value, ei->invalid);
 		if (errnum != 0) {
 			_citrus_VIQR_encoding_module_uninit(ei);
 			return (errnum);
 		}
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_VIQR_stdenc_get_state_desc_generic(_VIQREncodingInfo * __restrict ei __unused,
     _VIQRState * __restrict psenc, int * __restrict rstate)
 {
 
 	*rstate = (psenc->chlen == 0) ?
 	    _STDENC_SDGEN_INITIAL :
 	    _STDENC_SDGEN_INCOMPLETE_CHAR;
 
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(VIQR);
 _CITRUS_STDENC_DEF_OPS(VIQR);
 
 #include "citrus_stdenc_template.h"
Index: user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/ZW/citrus_zw.c	(revision 281585)
@@ -1,457 +1,457 @@
 /* $FreeBSD$ */
 /* $NetBSD: citrus_zw.c,v 1.4 2008/06/14 16:01:08 tnozaki Exp $ */
 
 /*-
  * Copyright (c)2004, 2006 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 #include <sys/types.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stddef.h>
 #include <stdio.h>
 #include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wchar.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_stdenc.h"
 #include "citrus_zw.h"
 
 /* ----------------------------------------------------------------------
- * private stuffs used by templates
+ * private stuff used by templates
  */
 
 typedef struct {
 	int	 dummy;
 } _ZWEncodingInfo;
 
 typedef enum {
 	NONE, AMBIGIOUS, ASCII, GB2312
 } _ZWCharset;
 
 typedef struct {
 	_ZWCharset	 charset;
 	int		 chlen;
 	char		 ch[4];
 } _ZWState;
 
 #define _CEI_TO_EI(_cei_)		(&(_cei_)->ei)
 #define _CEI_TO_STATE(_cei_, _func_)	(_cei_)->states.s_##_func_
 
 #define _FUNCNAME(m)			_citrus_ZW_##m
 #define _ENCODING_INFO			_ZWEncodingInfo
 #define _ENCODING_STATE			_ZWState
 #define _ENCODING_MB_CUR_MAX(_ei_)	MB_LEN_MAX
 #define _ENCODING_IS_STATE_DEPENDENT		1
 #define _STATE_NEEDS_EXPLICIT_INIT(_ps_)	((_ps_)->charset != NONE)
 
 static __inline void
 /*ARGSUSED*/
 _citrus_ZW_init_state(_ZWEncodingInfo * __restrict ei __unused,
     _ZWState * __restrict psenc)
 {
 
 	psenc->chlen = 0;
 	psenc->charset = NONE;
 }
 
 #if 0
 static __inline void
 /*ARGSUSED*/
 _citrus_ZW_pack_state(_ZWEncodingInfo * __restrict ei __unused,
     void *__restrict pspriv, const _ZWState * __restrict psenc)
 {
 
 	memcpy(pspriv, (const void *)psenc, sizeof(*psenc));
 }
 
 static __inline void
 /*ARGSUSED*/
 _citrus_ZW_unpack_state(_ZWEncodingInfo * __restrict ei __unused,
     _ZWState * __restrict psenc, const void * __restrict pspriv)
 {
 
 	memcpy((void *)psenc, pspriv, sizeof(*psenc));
 }
 #endif
 
 static int
 _citrus_ZW_mbrtowc_priv(_ZWEncodingInfo * __restrict ei,
-    wchar_t * __restrict pwc, const char **__restrict s, size_t n,
+    wchar_t * __restrict pwc, char **__restrict s, size_t n,
     _ZWState * __restrict psenc, size_t * __restrict nresult)
 {
-	const char *s0;
+	char *s0;
 	wchar_t  wc;
 	int ch, len;
 
 	if (*s == NULL) {
 		_citrus_ZW_init_state(ei, psenc);
 		*nresult = (size_t)_ENCODING_IS_STATE_DEPENDENT;
 		return (0);
 	}
 	s0 = *s;
 	len = 0;
 
 #define	STORE				\
 do {					\
 	if (n-- < 1) {			\
 		*nresult = (size_t)-2;	\
 		*s = s0;		\
 		return (0);		\
 	}				\
 	ch = (unsigned char)*s0++;	\
 	if (len++ > MB_LEN_MAX || ch > 0x7F)\
 		goto ilseq;		\
 	psenc->ch[psenc->chlen++] = ch;	\
 } while (/*CONSTCOND*/0)
 
 loop:
 	switch (psenc->charset) {
 	case ASCII:
 		switch (psenc->chlen) {
 		case 0:
 			STORE;
 			switch (psenc->ch[0]) {
 			case '\0': case '\n':
 				psenc->charset = NONE;
 			}
 		/*FALLTHROUGH*/
 		case 1:
 			break;
 		default:
 			return (EINVAL);
 		}
 		ch = (unsigned char)psenc->ch[0];
 		if (ch > 0x7F)
 			goto ilseq;
 		wc = (wchar_t)ch;
 		psenc->chlen = 0;
 		break;
 	case NONE:
 		if (psenc->chlen != 0)
 			return (EINVAL);
 		STORE;
 		ch = (unsigned char)psenc->ch[0];
 		if (ch != 'z') {
 			if (ch != '\n' && ch != '\0')
 				psenc->charset = ASCII;
 			wc = (wchar_t)ch;
 			psenc->chlen = 0;
 			break;
 		}
 		psenc->charset = AMBIGIOUS;
 		psenc->chlen = 0;
 	/* FALLTHROUGH */
 	case AMBIGIOUS:
 		if (psenc->chlen != 0)
 			return (EINVAL);
 		STORE;
 		if (psenc->ch[0] != 'W') {
 			psenc->charset = ASCII;
 			wc = L'z';
 			break;
 		}
 		psenc->charset = GB2312;
 		psenc->chlen = 0;
 	/* FALLTHROUGH */
 	case GB2312:
 		switch (psenc->chlen) {
 		case 0:
 			STORE;
 			ch = (unsigned char)psenc->ch[0];
 			if (ch == '\0') {
 				psenc->charset = NONE;
 				wc = (wchar_t)ch;
 				psenc->chlen = 0;
 				break;
 			} else if (ch == '\n') {
 				psenc->charset = NONE;
 				psenc->chlen = 0;
 				goto loop;
 			}
 		/*FALLTHROUGH*/
 		case 1:
 			STORE;
 			if (psenc->ch[0] == ' ') {
 				ch = (unsigned char)psenc->ch[1];
 				wc = (wchar_t)ch;
 				psenc->chlen = 0;
 				break;
 			} else if (psenc->ch[0] == '#') {
 				ch = (unsigned char)psenc->ch[1];
 				if (ch == '\n') {
 					psenc->charset = NONE;
 					wc = (wchar_t)ch;
 					psenc->chlen = 0;
 					break;
 				} else if (ch == ' ') {
 					wc = (wchar_t)ch;
 					psenc->chlen = 0;
 					break;
 				}
 			}
 			ch = (unsigned char)psenc->ch[0];
 			if (ch < 0x21 || ch > 0x7E)
 				goto ilseq;
 			wc = (wchar_t)(ch << 8);
 			ch = (unsigned char)psenc->ch[1];
 			if (ch < 0x21 || ch > 0x7E) {
 ilseq:
 				*nresult = (size_t)-1;
 				return (EILSEQ);
 			}
 			wc |= (wchar_t)ch;
 			psenc->chlen = 0;
 			break;
 		default:
 			return (EINVAL);
 		}
 		break;
 	default:
 		return (EINVAL);
 	}
 	if (pwc != NULL)
 		*pwc = wc;
 
 	*nresult = (size_t)(wc == 0 ? 0 : len);
 	*s = s0;
 
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_ZW_wcrtomb_priv(_ZWEncodingInfo * __restrict ei __unused,
     char *__restrict s, size_t n, wchar_t wc,
     _ZWState * __restrict psenc, size_t * __restrict nresult)
 {
 	int ch;
 
 	if (psenc->chlen != 0)
 		return (EINVAL);
 	if ((uint32_t)wc <= 0x7F) {
 		ch = (unsigned char)wc;
 		switch (psenc->charset) {
 		case NONE:
 			if (ch == '\0' || ch == '\n')
 				psenc->ch[psenc->chlen++] = ch;
 			else {
 				if (n < 4)
 					return (E2BIG);
 				n -= 4;
 				psenc->ch[psenc->chlen++] = 'z';
 				psenc->ch[psenc->chlen++] = 'W';
 				psenc->ch[psenc->chlen++] = ' ';
 				psenc->ch[psenc->chlen++] = ch;
 				psenc->charset = GB2312;
 			}
 			break;
 		case GB2312:
 			if (n < 2)
 				return (E2BIG);
 			n -= 2;
 			if (ch == '\0') {
 				psenc->ch[psenc->chlen++] = '\n';
 				psenc->ch[psenc->chlen++] = '\0';
 				psenc->charset = NONE;
 			} else if (ch == '\n') {
 				psenc->ch[psenc->chlen++] = '#';
 				psenc->ch[psenc->chlen++] = '\n';
 				psenc->charset = NONE;
 			} else {
 				psenc->ch[psenc->chlen++] = ' ';
 				psenc->ch[psenc->chlen++] = ch;
 			}
 			break;
 		default:
 			return (EINVAL);
 		}
 	} else if ((uint32_t)wc <= 0x7E7E) {
 		switch (psenc->charset) {
 		case NONE:
 			if (n < 2)
 				return (E2BIG);
 			n -= 2;
 			psenc->ch[psenc->chlen++] = 'z';
 			psenc->ch[psenc->chlen++] = 'W';
 			psenc->charset = GB2312;
 		/* FALLTHROUGH*/
 		case GB2312:
 			if (n < 2)
 				return (E2BIG);
 			n -= 2;
 			ch = (wc >> 8) & 0xFF;
 			if (ch < 0x21 || ch > 0x7E)
 				goto ilseq;
 			psenc->ch[psenc->chlen++] = ch;
 			ch = wc & 0xFF;
 			if (ch < 0x21 || ch > 0x7E)
 				goto ilseq;
 			psenc->ch[psenc->chlen++] = ch;
 			break;
 		default:
 			return (EINVAL);
 		}
 	} else {
 ilseq:
 		*nresult = (size_t)-1;
 		return (EILSEQ);
 	}
 	memcpy(s, psenc->ch, psenc->chlen);
 	*nresult = psenc->chlen;
 	psenc->chlen = 0;
 
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_ZW_put_state_reset(_ZWEncodingInfo * __restrict ei __unused,
     char * __restrict s, size_t n, _ZWState * __restrict psenc,
     size_t * __restrict nresult)
 {
 
 	if (psenc->chlen != 0)
 		return (EINVAL);
 	switch (psenc->charset) {
 	case GB2312:
 		if (n-- < 1)
 			return (E2BIG);
 		psenc->ch[psenc->chlen++] = '\n';
 		psenc->charset = NONE;
 	/*FALLTHROUGH*/
 	case NONE:
 		*nresult = psenc->chlen;
 		if (psenc->chlen > 0) {
 			memcpy(s, psenc->ch, psenc->chlen);
 			psenc->chlen = 0;
 		}
 		break;
 	default:
 		return (EINVAL);
 	}
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ZW_stdenc_get_state_desc_generic(_ZWEncodingInfo * __restrict ei __unused,
     _ZWState * __restrict psenc, int * __restrict rstate)
 {
 
 	switch (psenc->charset) {
 	case NONE:
 		if (psenc->chlen != 0)
 			return (EINVAL);
 		*rstate = _STDENC_SDGEN_INITIAL;
 		break;
 	case AMBIGIOUS:
 		if (psenc->chlen != 0)
 			return (EINVAL);
 		*rstate = _STDENC_SDGEN_INCOMPLETE_SHIFT;
 		break;
 	case ASCII:
 	case GB2312:
 		switch (psenc->chlen) {
 		case 0:
 			*rstate = _STDENC_SDGEN_STABLE;
 			break;
 		case 1:
 			*rstate = (psenc->ch[0] == '#') ?
 			    _STDENC_SDGEN_INCOMPLETE_SHIFT :
 			    _STDENC_SDGEN_INCOMPLETE_CHAR;
 			break;
 		default:
 			return (EINVAL);
 		}
 		break;
 	default:
 		return (EINVAL);
 	}
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ZW_stdenc_wctocs(_ZWEncodingInfo * __restrict ei __unused,
     _csid_t * __restrict csid, _index_t * __restrict idx, wchar_t wc)
 {
 
 	*csid = (_csid_t)(wc <= (wchar_t)0x7FU) ? 0 : 1;
 	*idx = (_index_t)wc;
 
 	return (0);
 }
 
 static __inline int
 /*ARGSUSED*/
 _citrus_ZW_stdenc_cstowc(_ZWEncodingInfo * __restrict ei __unused,
     wchar_t * __restrict wc, _csid_t csid, _index_t idx)
 {
 
 	switch (csid) {
 	case 0: case 1:
 		break;
 	default:
 		return (EINVAL);
 	}
 	*wc = (wchar_t)idx;
 
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_ZW_encoding_module_uninit(_ZWEncodingInfo *ei __unused)
 {
 
 }
 
 static int
 /*ARGSUSED*/
 _citrus_ZW_encoding_module_init(_ZWEncodingInfo * __restrict ei __unused,
     const void *__restrict var __unused, size_t lenvar __unused)
 {
 
 	return (0);
 }
 
 /* ----------------------------------------------------------------------
  * public interface for stdenc
  */
 
 _CITRUS_STDENC_DECLS(ZW);
 _CITRUS_STDENC_DEF_OPS(ZW);
 
 #include "citrus_stdenc_template.h"
Index: user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/iconv_none/citrus_iconv_none.c	(revision 281585)
@@ -1,127 +1,127 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_iconv_none.c,v 1.3 2011/05/23 14:45:44 joerg Exp $	*/
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/queue.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_hash.h"
 #include "citrus_iconv.h"
 #include "citrus_iconv_none.h"
 
 /* ---------------------------------------------------------------------- */
 
 _CITRUS_ICONV_DECLS(iconv_none);
 _CITRUS_ICONV_DEF_OPS(iconv_none);
 
 
 /* ---------------------------------------------------------------------- */
 
 int
 _citrus_iconv_none_iconv_getops(struct _citrus_iconv_ops *ops)
 {
 
 	memcpy(ops, &_citrus_iconv_none_iconv_ops,
 	       sizeof(_citrus_iconv_none_iconv_ops));
 
 	return (0);
 }
 
 static int
 /*ARGSUSED*/
 _citrus_iconv_none_iconv_init_shared(
     struct _citrus_iconv_shared * __restrict ci,
     const char * __restrict in __unused, const char * __restrict out __unused)
 {
 
 	ci->ci_closure = NULL;
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_iconv_none_iconv_uninit_shared(struct _citrus_iconv_shared *ci __unused)
 {
 
 }
 
 static int
 /*ARGSUSED*/
 _citrus_iconv_none_iconv_init_context(struct _citrus_iconv *cv)
 {
 
 	cv->cv_closure = NULL;
 	return (0);
 }
 
 static void
 /*ARGSUSED*/
 _citrus_iconv_none_iconv_uninit_context(struct _citrus_iconv *cv __unused)
 {
 
 }
 
 static int
 /*ARGSUSED*/
 _citrus_iconv_none_iconv_convert(struct _citrus_iconv * __restrict ci __unused,
-    const char * __restrict * __restrict in, size_t * __restrict inbytes,
+    char * __restrict * __restrict in, size_t * __restrict inbytes,
     char * __restrict * __restrict out, size_t * __restrict outbytes,
     uint32_t flags __unused, size_t * __restrict invalids)
 {
 	size_t len;
 	int e2big;
 
 	if ((in == NULL) || (out == NULL) || (inbytes == NULL))
 		return (0);
 	if ((*in == NULL) || (*out == NULL) || (*inbytes == 0) || (*outbytes == 0))
 		return (0);
 	len = *inbytes;
 	e2big = 0;
 	if (*outbytes<len) {
 		e2big = 1;
 		len = *outbytes;
 	}
 	memcpy(*out, *in, len);
-	in += len;
+	*in += len;
 	*inbytes -= len;
-	out += len;
+	*out += len;
 	*outbytes -= len;
 	*invalids = 0;
 	if (e2big)
 		return (E2BIG);
 
 	return (0);
 }
Index: user/ngie/more-tests/lib/libiconv_modules/iconv_std/citrus_iconv_std.c
===================================================================
--- user/ngie/more-tests/lib/libiconv_modules/iconv_std/citrus_iconv_std.c	(revision 281584)
+++ user/ngie/more-tests/lib/libiconv_modules/iconv_std/citrus_iconv_std.c	(revision 281585)
@@ -1,593 +1,593 @@
 /* $FreeBSD$ */
 /*	$NetBSD: citrus_iconv_std.c,v 1.16 2012/02/12 13:51:29 wiz Exp $	*/
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 #include <sys/endian.h>
 #include <sys/queue.h>
 
 #include <assert.h>
 #include <errno.h>
 #include <limits.h>
 #include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 
 #include "citrus_namespace.h"
 #include "citrus_types.h"
 #include "citrus_module.h"
 #include "citrus_region.h"
 #include "citrus_mmap.h"
 #include "citrus_hash.h"
 #include "citrus_iconv.h"
 #include "citrus_stdenc.h"
 #include "citrus_mapper.h"
 #include "citrus_csmapper.h"
 #include "citrus_memstream.h"
 #include "citrus_iconv_std.h"
 #include "citrus_esdb.h"
 
 /* ---------------------------------------------------------------------- */
 
 _CITRUS_ICONV_DECLS(iconv_std);
 _CITRUS_ICONV_DEF_OPS(iconv_std);
 
 
 /* ---------------------------------------------------------------------- */
 
 int
 _citrus_iconv_std_iconv_getops(struct _citrus_iconv_ops *ops)
 {
 
 	memcpy(ops, &_citrus_iconv_std_iconv_ops,
 	    sizeof(_citrus_iconv_std_iconv_ops));
 
 	return (0);
 }
 
 /* ---------------------------------------------------------------------- */
 
 /*
  * convenience routines for stdenc.
  */
 static __inline void
 save_encoding_state(struct _citrus_iconv_std_encoding *se)
 {
 
 	if (se->se_ps)
 		memcpy(se->se_pssaved, se->se_ps,
 		    _stdenc_get_state_size(se->se_handle));
 }
 
 static __inline void
 restore_encoding_state(struct _citrus_iconv_std_encoding *se)
 {
 
 	if (se->se_ps)
 		memcpy(se->se_ps, se->se_pssaved,
 		    _stdenc_get_state_size(se->se_handle));
 }
 
 static __inline void
 init_encoding_state(struct _citrus_iconv_std_encoding *se)
 {
 
 	if (se->se_ps)
 		_stdenc_init_state(se->se_handle, se->se_ps);
 }
 
 static __inline int
 mbtocsx(struct _citrus_iconv_std_encoding *se,
-    _csid_t *csid, _index_t *idx, const char **s, size_t n, size_t *nresult,
+    _csid_t *csid, _index_t *idx, char **s, size_t n, size_t *nresult,
     struct iconv_hooks *hooks)
 {
 
 	return (_stdenc_mbtocs(se->se_handle, csid, idx, s, n, se->se_ps,
 			      nresult, hooks));
 }
 
 static __inline int
 cstombx(struct _citrus_iconv_std_encoding *se,
     char *s, size_t n, _csid_t csid, _index_t idx, size_t *nresult,
     struct iconv_hooks *hooks)
 {
 
 	return (_stdenc_cstomb(se->se_handle, s, n, csid, idx, se->se_ps,
 			      nresult, hooks));
 }
 
 static __inline int
 wctombx(struct _citrus_iconv_std_encoding *se,
     char *s, size_t n, _wc_t wc, size_t *nresult,
     struct iconv_hooks *hooks)
 {
 
 	return (_stdenc_wctomb(se->se_handle, s, n, wc, se->se_ps, nresult,
 			     hooks));
 }
 
 static __inline int
 put_state_resetx(struct _citrus_iconv_std_encoding *se, char *s, size_t n,
     size_t *nresult)
 {
 
 	return (_stdenc_put_state_reset(se->se_handle, s, n, se->se_ps, nresult));
 }
 
 static __inline int
 get_state_desc_gen(struct _citrus_iconv_std_encoding *se, int *rstate)
 {
 	struct _stdenc_state_desc ssd;
 	int ret;
 
 	ret = _stdenc_get_state_desc(se->se_handle, se->se_ps,
 	    _STDENC_SDID_GENERIC, &ssd);
 	if (!ret)
 		*rstate = ssd.u.generic.state;
 
 	return (ret);
 }
 
 /*
  * init encoding context
  */
 static int
 init_encoding(struct _citrus_iconv_std_encoding *se, struct _stdenc *cs,
     void *ps1, void *ps2)
 {
 	int ret = -1;
 
 	se->se_handle = cs;
 	se->se_ps = ps1;
 	se->se_pssaved = ps2;
 
 	if (se->se_ps)
 		ret = _stdenc_init_state(cs, se->se_ps);
 	if (!ret && se->se_pssaved)
 		ret = _stdenc_init_state(cs, se->se_pssaved);
 
 	return (ret);
 }
 
 static int
 open_csmapper(struct _csmapper **rcm, const char *src, const char *dst,
     unsigned long *rnorm)
 {
 	struct _csmapper *cm;
 	int ret;
 
 	ret = _csmapper_open(&cm, src, dst, 0, rnorm);
 	if (ret)
 		return (ret);
 	if (_csmapper_get_src_max(cm) != 1 || _csmapper_get_dst_max(cm) != 1 ||
 	    _csmapper_get_state_size(cm) != 0) {
 		_csmapper_close(cm);
 		return (EINVAL);
 	}
 
 	*rcm = cm;
 
 	return (0);
 }
 
 static void
 close_dsts(struct _citrus_iconv_std_dst_list *dl)
 {
 	struct _citrus_iconv_std_dst *sd;
 
 	while ((sd = TAILQ_FIRST(dl)) != NULL) {
 		TAILQ_REMOVE(dl, sd, sd_entry);
 		_csmapper_close(sd->sd_mapper);
 		free(sd);
 	}
 }
 
 static int
 open_dsts(struct _citrus_iconv_std_dst_list *dl,
     const struct _esdb_charset *ec, const struct _esdb *dbdst)
 {
 	struct _citrus_iconv_std_dst *sd, *sdtmp;
 	unsigned long norm;
 	int i, ret;
 
 	sd = malloc(sizeof(*sd));
 	if (sd == NULL)
 		return (errno);
 
 	for (i = 0; i < dbdst->db_num_charsets; i++) {
 		ret = open_csmapper(&sd->sd_mapper, ec->ec_csname,
 		    dbdst->db_charsets[i].ec_csname, &norm);
 		if (ret == 0) {
 			sd->sd_csid = dbdst->db_charsets[i].ec_csid;
 			sd->sd_norm = norm;
 			/* insert this mapper by sorted order. */
 			TAILQ_FOREACH(sdtmp, dl, sd_entry) {
 				if (sdtmp->sd_norm > norm) {
 					TAILQ_INSERT_BEFORE(sdtmp, sd,
 					    sd_entry);
 					sd = NULL;
 					break;
 				}
 			}
 			if (sd)
 				TAILQ_INSERT_TAIL(dl, sd, sd_entry);
 			sd = malloc(sizeof(*sd));
 			if (sd == NULL) {
 				ret = errno;
 				close_dsts(dl);
 				return (ret);
 			}
 		} else if (ret != ENOENT) {
 			close_dsts(dl);
 			free(sd);
 			return (ret);
 		}
 	}
 	free(sd);
 	return (0);
 }
 
 static void
 close_srcs(struct _citrus_iconv_std_src_list *sl)
 {
 	struct _citrus_iconv_std_src *ss;
 
 	while ((ss = TAILQ_FIRST(sl)) != NULL) {
 		TAILQ_REMOVE(sl, ss, ss_entry);
 		close_dsts(&ss->ss_dsts);
 		free(ss);
 	}
 }
 
 static int
 open_srcs(struct _citrus_iconv_std_src_list *sl,
     const struct _esdb *dbsrc, const struct _esdb *dbdst)
 {
 	struct _citrus_iconv_std_src *ss;
 	int count = 0, i, ret;
 
 	ss = malloc(sizeof(*ss));
 	if (ss == NULL)
 		return (errno);
 
 	TAILQ_INIT(&ss->ss_dsts);
 
 	for (i = 0; i < dbsrc->db_num_charsets; i++) {
 		ret = open_dsts(&ss->ss_dsts, &dbsrc->db_charsets[i], dbdst);
 		if (ret)
 			goto err;
 		if (!TAILQ_EMPTY(&ss->ss_dsts)) {
 			ss->ss_csid = dbsrc->db_charsets[i].ec_csid;
 			TAILQ_INSERT_TAIL(sl, ss, ss_entry);
 			ss = malloc(sizeof(*ss));
 			if (ss == NULL) {
 				ret = errno;
 				goto err;
 			}
 			count++;
 			TAILQ_INIT(&ss->ss_dsts);
 		}
 	}
 	free(ss);
 
 	return (count ? 0 : ENOENT);
 
 err:
 	free(ss);
 	close_srcs(sl);
 	return (ret);
 }
 
 /* do convert a character */
 #define E_NO_CORRESPONDING_CHAR ENOENT /* XXX */
 static int
 /*ARGSUSED*/
 do_conv(const struct _citrus_iconv_std_shared *is,
 	_csid_t *csid, _index_t *idx)
 {
 	struct _citrus_iconv_std_dst *sd;
 	struct _citrus_iconv_std_src *ss;
 	_index_t tmpidx;
 	int ret;
 
 	TAILQ_FOREACH(ss, &is->is_srcs, ss_entry) {
 		if (ss->ss_csid == *csid) {
 			TAILQ_FOREACH(sd, &ss->ss_dsts, sd_entry) {
 				ret = _csmapper_convert(sd->sd_mapper,
 				    &tmpidx, *idx, NULL);
 				switch (ret) {
 				case _MAPPER_CONVERT_SUCCESS:
 					*csid = sd->sd_csid;
 					*idx = tmpidx;
 					return (0);
 				case _MAPPER_CONVERT_NONIDENTICAL:
 					break;
 				case _MAPPER_CONVERT_SRC_MORE:
 					/*FALLTHROUGH*/
 				case _MAPPER_CONVERT_DST_MORE:
 					/*FALLTHROUGH*/
 				case _MAPPER_CONVERT_ILSEQ:
 					return (EILSEQ);
 				case _MAPPER_CONVERT_FATAL:
 					return (EINVAL);
 				}
 			}
 			break;
 		}
 	}
 
 	return (E_NO_CORRESPONDING_CHAR);
 }
 /* ---------------------------------------------------------------------- */
 
 static int
 /*ARGSUSED*/
 _citrus_iconv_std_iconv_init_shared(struct _citrus_iconv_shared *ci,
     const char * __restrict src, const char * __restrict dst)
 {
 	struct _citrus_esdb esdbdst, esdbsrc;
 	struct _citrus_iconv_std_shared *is;
 	int ret;
 
 	is = malloc(sizeof(*is));
 	if (is == NULL) {
 		ret = errno;
 		goto err0;
 	}
 	ret = _citrus_esdb_open(&esdbsrc, src);
 	if (ret)
 		goto err1;
 	ret = _citrus_esdb_open(&esdbdst, dst);
 	if (ret)
 		goto err2;
 	ret = _stdenc_open(&is->is_src_encoding, esdbsrc.db_encname,
 	    esdbsrc.db_variable, esdbsrc.db_len_variable);
 	if (ret)
 		goto err3;
 	ret = _stdenc_open(&is->is_dst_encoding, esdbdst.db_encname,
 	    esdbdst.db_variable, esdbdst.db_len_variable);
 	if (ret)
 		goto err4;
 	is->is_use_invalid = esdbdst.db_use_invalid;
 	is->is_invalid = esdbdst.db_invalid;
 
 	TAILQ_INIT(&is->is_srcs);
 	ret = open_srcs(&is->is_srcs, &esdbsrc, &esdbdst);
 	if (ret)
 		goto err5;
 
 	_esdb_close(&esdbsrc);
 	_esdb_close(&esdbdst);
 	ci->ci_closure = is;
 
 	return (0);
 
 err5:
 	_stdenc_close(is->is_dst_encoding);
 err4:
 	_stdenc_close(is->is_src_encoding);
 err3:
 	_esdb_close(&esdbdst);
 err2:
 	_esdb_close(&esdbsrc);
 err1:
 	free(is);
 err0:
 	return (ret);
 }
 
 static void
 _citrus_iconv_std_iconv_uninit_shared(struct _citrus_iconv_shared *ci)
 {
 	struct _citrus_iconv_std_shared *is = ci->ci_closure;
 
 	if (is == NULL)
 		return;
 
 	_stdenc_close(is->is_src_encoding);
 	_stdenc_close(is->is_dst_encoding);
 	close_srcs(&is->is_srcs);
 	free(is);
 }
 
 static int
 _citrus_iconv_std_iconv_init_context(struct _citrus_iconv *cv)
 {
 	const struct _citrus_iconv_std_shared *is = cv->cv_shared->ci_closure;
 	struct _citrus_iconv_std_context *sc;
 	char *ptr;
 	size_t sz, szpsdst, szpssrc;
 
 	szpssrc = _stdenc_get_state_size(is->is_src_encoding);
 	szpsdst = _stdenc_get_state_size(is->is_dst_encoding);
 
 	sz = (szpssrc + szpsdst)*2 + sizeof(struct _citrus_iconv_std_context);
 	sc = malloc(sz);
 	if (sc == NULL)
 		return (errno);
 
 	ptr = (char *)&sc[1];
 	if (szpssrc > 0)
 		init_encoding(&sc->sc_src_encoding, is->is_src_encoding,
 		    ptr, ptr+szpssrc);
 	else
 		init_encoding(&sc->sc_src_encoding, is->is_src_encoding,
 		    NULL, NULL);
 	ptr += szpssrc*2;
 	if (szpsdst > 0)
 		init_encoding(&sc->sc_dst_encoding, is->is_dst_encoding,
 		    ptr, ptr+szpsdst);
 	else
 		init_encoding(&sc->sc_dst_encoding, is->is_dst_encoding,
 		    NULL, NULL);
 
 	cv->cv_closure = (void *)sc;
 
 	return (0);
 }
 
 static void
 _citrus_iconv_std_iconv_uninit_context(struct _citrus_iconv *cv)
 {
 
 	free(cv->cv_closure);
 }
 
 static int
 _citrus_iconv_std_iconv_convert(struct _citrus_iconv * __restrict cv,
-    const char * __restrict * __restrict in, size_t * __restrict inbytes,
+    char * __restrict * __restrict in, size_t * __restrict inbytes,
     char * __restrict * __restrict out, size_t * __restrict outbytes,
     uint32_t flags, size_t * __restrict invalids)
 {
 	const struct _citrus_iconv_std_shared *is = cv->cv_shared->ci_closure;
 	struct _citrus_iconv_std_context *sc = cv->cv_closure;
 	_csid_t csid;
 	_index_t idx;
-	const char *tmpin;
+	char *tmpin;
 	size_t inval, szrin, szrout;
 	int ret, state = 0;
 
 	inval = 0;
 	if (in == NULL || *in == NULL) {
 		/* special cases */
 		if (out != NULL && *out != NULL) {
 			/* init output state and store the shift sequence */
 			save_encoding_state(&sc->sc_src_encoding);
 			save_encoding_state(&sc->sc_dst_encoding);
 			szrout = 0;
 
 			ret = put_state_resetx(&sc->sc_dst_encoding,
 			    *out, *outbytes, &szrout);
 			if (ret)
 				goto err;
 
 			if (szrout == (size_t)-2) {
 				/* too small to store the character */
 				ret = EINVAL;
 				goto err;
 			}
 			*out += szrout;
 			*outbytes -= szrout;
 		} else
 			/* otherwise, discard the shift sequence */
 			init_encoding_state(&sc->sc_dst_encoding);
 		init_encoding_state(&sc->sc_src_encoding);
 		*invalids = 0;
 		return (0);
 	}
 
 	/* normal case */
 	for (;;) {
 		if (*inbytes == 0) {
 			ret = get_state_desc_gen(&sc->sc_src_encoding, &state);
 			if (state == _STDENC_SDGEN_INITIAL ||
 			    state == _STDENC_SDGEN_STABLE)
 				break;
 		}
 
 		/* save the encoding states for the error recovery */
 		save_encoding_state(&sc->sc_src_encoding);
 		save_encoding_state(&sc->sc_dst_encoding);
 
 		/* mb -> csid/index */
 		tmpin = *in;
 		szrin = szrout = 0;
 		ret = mbtocsx(&sc->sc_src_encoding, &csid, &idx, &tmpin,
 		    *inbytes, &szrin, cv->cv_shared->ci_hooks);
 		if (ret)
 			goto err;
 
 		if (szrin == (size_t)-2) {
 			/* incompleted character */
 			ret = get_state_desc_gen(&sc->sc_src_encoding, &state);
 			if (ret) {
 				ret = EINVAL;
 				goto err;
 			}
 			switch (state) {
 			case _STDENC_SDGEN_INITIAL:
 			case _STDENC_SDGEN_STABLE:
 				/* fetch shift sequences only. */
 				goto next;
 			}
 			ret = EINVAL;
 			goto err;
 		}
 		/* convert the character */
 		ret = do_conv(is, &csid, &idx);
 		if (ret) {
 			if (ret == E_NO_CORRESPONDING_CHAR) {
 				/*
 				 * GNU iconv returns EILSEQ when no
 				 * corresponding character in the output.
 				 * Some software depends on this behavior
 				 * though this is against POSIX specification.
 				 */
 				if (cv->cv_shared->ci_ilseq_invalid != 0) {
 					ret = EILSEQ;
 					goto err;
 				}
 				inval++;
 				szrout = 0;
 				if ((((flags & _CITRUS_ICONV_F_HIDE_INVALID) == 0) &&
 				    !cv->cv_shared->ci_discard_ilseq) &&
 				    is->is_use_invalid) {
 					ret = wctombx(&sc->sc_dst_encoding,
 					    *out, *outbytes, is->is_invalid,
 					    &szrout, cv->cv_shared->ci_hooks);
 					if (ret)
 						goto err;
 				}
 				goto next;
 			} else
 				goto err;
 		}
 		/* csid/index -> mb */
 		ret = cstombx(&sc->sc_dst_encoding,
 		    *out, *outbytes, csid, idx, &szrout,
 		    cv->cv_shared->ci_hooks);
 		if (ret)
 			goto err;
 next:
 		*inbytes -= tmpin-*in; /* szrin is insufficient on \0. */
 		*in = tmpin;
 		*outbytes -= szrout;
 		*out += szrout;
 	}
 	*invalids = inval;
 
 	return (0);
 
 err:
 	restore_encoding_state(&sc->sc_src_encoding);
 	restore_encoding_state(&sc->sc_dst_encoding);
 	*invalids = inval;
 
 	return (ret);
 }
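Reviewer note on the hunks above: they drop the const qualifier from the input pointer so that the internal convert routine matches the POSIX iconv(3) prototype, which takes `char **` for the input buffer. A minimal sketch of what that means for a caller holding a `const char *` follows. The helper name `convert_string` and the "ASCII" and "UTF-8" encoding names are assumptions chosen for illustration and are not part of this change.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): convert the NUL-terminated
 * string src from encoding `from` to encoding `to`, writing into dst.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 on any failure.
 */
static size_t
convert_string(const char *to, const char *from, const char *src,
    char *dst, size_t dstlen)
{
	iconv_t cd;
	char *in, *out;
	size_t inleft, outleft, ret;

	assert(dstlen > 0);

	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return ((size_t)-1);

	/*
	 * With the POSIX prototype the input is a char **, so a caller
	 * that only has a const char * needs a cast. iconv() advances
	 * the pointer but does not write through it.
	 */
	in = (char *)src;
	inleft = strlen(src);
	out = dst;
	outleft = dstlen - 1;	/* reserve room for the NUL */

	ret = iconv(cd, &in, &inleft, &out, &outleft);
	iconv_close(cd);
	if (ret == (size_t)-1)
		return ((size_t)-1);

	*out = '\0';
	return ((size_t)(out - dst));
}
```

A caller would invoke it as `convert_string("UTF-8", "ASCII", "abc", buf, sizeof(buf))`. When the output buffer is too small, iconv() fails with E2BIG and the helper reports (size_t)-1.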
Index: user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c
===================================================================
--- user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c	(revision 281584)
+++ user/ngie/more-tests/lib/libkiconv/xlat16_iconv.c	(revision 281585)
@@ -1,465 +1,464 @@
 /*-
  * Copyright (c) 2003, 2005 Ryuichiro Imura
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 /*
  * kiconv(3) requires shared linked, and reduce module size
  * when statically linked.
  */
 
 #ifdef PIC
 
 #include <sys/types.h>
 #include <sys/iconv.h>
 #include <sys/sysctl.h>
 
 #include <ctype.h>
 #include <dlfcn.h>
 #include <err.h>
 #include <errno.h>
 #include <locale.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <wctype.h>
 
 #include "quirks.h"
 
 struct xlat16_table {
 	uint32_t *	idx[0x200];
 	void *		data;
 	size_t		size;
 };
 
 static struct xlat16_table kiconv_xlat16_open(const char *, const char *, int);
 static int chklocale(int, const char *);
 
 #ifdef ICONV_DLOPEN
 typedef void *iconv_t;
 static int my_iconv_init(void);
 static iconv_t (*my_iconv_open)(const char *, const char *);
-static size_t (*my_iconv)(iconv_t, const char **, size_t *, char **, size_t *);
+static size_t (*my_iconv)(iconv_t, char **, size_t *, char **, size_t *);
 static int (*my_iconv_close)(iconv_t);
 #else
 #include <iconv.h>
 #define my_iconv_init() 0
 #define my_iconv_open iconv_open
 #define my_iconv iconv
 #define my_iconv_close iconv_close
 #endif
-static size_t my_iconv_char(iconv_t, const u_char **, size_t *, u_char **, size_t *);
+static size_t my_iconv_char(iconv_t, u_char **, size_t *, u_char **, size_t *);
 
 int
 kiconv_add_xlat16_cspair(const char *tocode, const char *fromcode, int flag)
 {
 	int error;
 	size_t idxsize;
 	struct xlat16_table xt;
 	void *data;
 	char *p;
 	const char unicode[] = ENCODING_UNICODE;
 
 	if ((flag & KICONV_WCTYPE) == 0 &&
 	    strcmp(unicode, tocode) != 0 &&
 	    strcmp(unicode, fromcode) != 0 &&
 	    kiconv_lookupconv(unicode) == 0) {
 		error = kiconv_add_xlat16_cspair(unicode, fromcode, flag);
 		if (error)
 			return (-1);
 		error = kiconv_add_xlat16_cspair(tocode, unicode, flag);
 		return (error);
 	}
 
 	if (kiconv_lookupcs(tocode, fromcode) == 0)
 		return (0);
 
 	if (flag & KICONV_WCTYPE)
 		xt = kiconv_xlat16_open(fromcode, fromcode, flag);
 	else
 		xt = kiconv_xlat16_open(tocode, fromcode, flag);
 	if (xt.size == 0)
 		return (-1);
 
 	idxsize = sizeof(xt.idx);
 
 	if ((idxsize + xt.size) > ICONV_CSMAXDATALEN) {
 		errno = E2BIG;
 		return (-1);
 	}
 
 	if ((data = malloc(idxsize + xt.size)) != NULL) {
 		p = data;
 		memcpy(p, xt.idx, idxsize);
 		p += idxsize;
 		memcpy(p, xt.data, xt.size);
 		error = kiconv_add_xlat16_table(tocode, fromcode, data,
 		    (int)(idxsize + xt.size));
 		return (error);
 	}
 
 	return (-1);
 }
 
 int
 kiconv_add_xlat16_cspairs(const char *foreigncode, const char *localcode)
 {
 	int error, locale;
 
 	error = kiconv_add_xlat16_cspair(foreigncode, localcode,
 	    KICONV_FROM_LOWER | KICONV_FROM_UPPER);
 	if (error)
 		return (error);
 	error = kiconv_add_xlat16_cspair(localcode, foreigncode,
 	    KICONV_LOWER | KICONV_UPPER);
 	if (error)
 		return (error);
 	locale = chklocale(LC_CTYPE, localcode);
 	if (locale == 0) {
 		error = kiconv_add_xlat16_cspair(KICONV_WCTYPE_NAME, localcode,
 		    KICONV_WCTYPE);
 		if (error)
 			return (error);
 	}
 
 	return (0);
 }
 
 static struct xlat16_table
 kiconv_xlat16_open(const char *tocode, const char *fromcode, int lcase)
 {
 	u_char src[3], dst[4], *srcp, *dstp, ud, ld;
 	int us, ls, ret;
 	uint16_t c;
 	uint32_t table[0x80];
 	size_t inbytesleft, outbytesleft, pre_q_size, post_q_size;
 	struct xlat16_table xt;
 	struct quirk_replace_list *pre_q_list, *post_q_list;
 	iconv_t cd;
 	char *p;
 
 	xt.data = NULL;
 	xt.size = 0;
 
 	src[2] = '\0';
 	dst[3] = '\0';
 
 	ret = my_iconv_init();
 	if (ret)
 		return (xt);
 
 	cd = my_iconv_open(search_quirk(tocode, fromcode, &pre_q_list, &pre_q_size),
 	    search_quirk(fromcode, tocode, &post_q_list, &post_q_size));
 	if (cd == (iconv_t) (-1))
 		return (xt);
 
 	if ((xt.data = malloc(0x200 * 0x80 * sizeof(uint32_t))) == NULL)
 		return (xt);
 
 	p = xt.data;
 
 	for (ls = 0 ; ls < 0x200 ; ls++) {
 		xt.idx[ls] = NULL;
 		for (us = 0 ; us < 0x80 ; us++) {
 			srcp = src;
 			dstp = dst;
 
 			inbytesleft = 2;
 			outbytesleft = 3;
 			bzero(dst, outbytesleft);
 
 			c = ((ls & 0x100 ? us | 0x80 : us) << 8) | (u_char)ls;
 
 			if (lcase & KICONV_WCTYPE) {
 				if ((c & 0xff) == 0)
 					c >>= 8;
 				if (iswupper(c)) {
 					c = towlower(c);
 					if ((c & 0xff00) == 0)
 						c <<= 8;
 					table[us] = c | XLAT16_HAS_LOWER_CASE;
 				} else if (iswlower(c)) {
 					c = towupper(c);
 					if ((c & 0xff00) == 0)
 						c <<= 8;
 					table[us] = c | XLAT16_HAS_UPPER_CASE;
 				} else
 					table[us] = 0;
 				/*
 				 * store not NULL
 				 */
 				if (table[us])
 					xt.idx[ls] = table;
 
 				continue;
 			}
 
 			c = quirk_vendor2unix(c, pre_q_list, pre_q_size);
 			src[0] = (u_char)(c >> 8);
 			src[1] = (u_char)c;
 
-			ret = my_iconv_char(cd, (const u_char **)&srcp,
-			    &inbytesleft, &dstp, &outbytesleft);
+			ret = my_iconv_char(cd, &srcp, &inbytesleft,
+				&dstp, &outbytesleft);
 			if (ret == -1) {
 				table[us] = 0;
 				continue;
 			}
 
 			ud = (u_char)dst[0];
 			ld = (u_char)dst[1];
 
 			switch(outbytesleft) {
 			case 0:
 #ifdef XLAT16_ACCEPT_3BYTE_CHR
 				table[us] = (ud << 8) | ld;
 				table[us] |= (u_char)dst[2] << 16;
 				table[us] |= XLAT16_IS_3BYTE_CHR;
 #else
 				table[us] = 0;
 				continue;
 #endif
 				break;
 			case 1:
 				table[us] = quirk_unix2vendor((ud << 8) | ld,
 				    post_q_list, post_q_size);
 				if ((table[us] >> 8) == 0)
 					table[us] |= XLAT16_ACCEPT_NULL_OUT;
 				break;
 			case 2:
 				table[us] = ud;
 				if (lcase & KICONV_LOWER && ud != tolower(ud)) {
 					table[us] |= (u_char)tolower(ud) << 16;
 					table[us] |= XLAT16_HAS_LOWER_CASE;
 				}
 				if (lcase & KICONV_UPPER && ud != toupper(ud)) {
 					table[us] |= (u_char)toupper(ud) << 16;
 					table[us] |= XLAT16_HAS_UPPER_CASE;
 				}
 				break;
 			}
 
 			switch(inbytesleft) {
 			case 0:
 				if ((ls & 0xff) == 0)
 					table[us] |= XLAT16_ACCEPT_NULL_IN;
 				break;
 			case 1:
 				c = ls > 0xff ? us | 0x80 : us;
 				if (lcase & KICONV_FROM_LOWER && c != tolower(c)) {
 					table[us] |= (u_char)tolower(c) << 16;
 					table[us] |= XLAT16_HAS_FROM_LOWER_CASE;
 				}
 				if (lcase & KICONV_FROM_UPPER && c != toupper(c)) {
 					table[us] |= (u_char)toupper(c) << 16;
 					table[us] |= XLAT16_HAS_FROM_UPPER_CASE;
 				}
 				break;
 			}
 
 			if (table[us] == 0)
 				continue;
 
 			/*
 			 * store not NULL
 			 */
 			xt.idx[ls] = table;
 		}
 		if (xt.idx[ls]) {
 			memcpy(p, table, sizeof(table));
 			p += sizeof(table);
 		}
 	}
 	my_iconv_close(cd);
 
 	xt.size = p - (char *)xt.data;
 	xt.data = realloc(xt.data, xt.size);
 	return (xt);
 }
 
 static int
 chklocale(int category, const char *code)
 {
 	char *p;
 	int error = -1;
 
 	p = strchr(setlocale(category, NULL), '.');
 	if (p++) {
 		error = strcasecmp(code, p);
 		if (error) {
 			/* XXX - can't avoid calling quirk here... */
 			error = strcasecmp(code, kiconv_quirkcs(p,
 			    KICONV_VENDOR_MICSFT));
 		}
 	}
 	return (error);
 }
 
 #ifdef ICONV_DLOPEN
 static int
 my_iconv_init(void)
 {
 	void *iconv_lib;
 
 	iconv_lib = dlopen("libiconv.so", RTLD_LAZY | RTLD_GLOBAL);
 	if (iconv_lib == NULL) {
 		warn("Unable to load iconv library: %s\n", dlerror());
 		errno = ENOENT;
 		return (-1);
 	}
 	my_iconv_open = dlsym(iconv_lib, "iconv_open");
 	my_iconv = dlsym(iconv_lib, "iconv");
 	my_iconv_close = dlsym(iconv_lib, "iconv_close");
 
 	return (0);
 }
 #endif
 
 static size_t
-my_iconv_char(iconv_t cd, const u_char **ibuf, size_t * ilen, u_char **obuf,
+my_iconv_char(iconv_t cd, u_char **ibuf, size_t * ilen, u_char **obuf,
 	size_t * olen)
 {
-	const u_char *sp;
-	u_char *dp, ilocal[3], olocal[3];
+	u_char *sp, *dp, ilocal[3], olocal[3];
 	u_char c1, c2;
 	int ret;
 	size_t ir, or;
 
 	sp = *ibuf;
 	dp = *obuf;
 	ir = *ilen;
 
 	bzero(*obuf, *olen);
-	ret = my_iconv(cd, (const char **)&sp, ilen, (char **)&dp, olen);
+	ret = my_iconv(cd, (char **)&sp, ilen, (char **)&dp, olen);
 	c1 = (*obuf)[0];
 	c2 = (*obuf)[1];
 
 	if (ret == -1) {
 		if (*ilen == ir - 1 && (*ibuf)[1] == '\0' && (c1 || c2))
 			return (0);
 		else
 			return (-1);
 	}
 
 	/*
 	 * We must judge if inbuf is a single byte char or double byte char.
 	 * Here, to judge, try first byte(*sp) conversion and compare.
 	 */
 	ir = 1;
 	or = 3;
 
 	bzero(olocal, or);
 	memcpy(ilocal, *ibuf, sizeof(ilocal));
 	sp = ilocal;
 	dp = olocal;
 
-	if ((my_iconv(cd,(const char **)&sp, &ir, (char **)&dp, &or)) != -1) {
+	if ((my_iconv(cd,(char **)&sp, &ir, (char **)&dp, &or)) != -1) {
 		if (olocal[0] != c1)
 			return (ret);
 
 		if (olocal[1] == c2 && (*ibuf)[1] == '\0') {
 			/*
 			 * inbuf is a single byte char
 			 */
 			*ilen = 1;
 			*olen = or;
 			return (ret);
 		}
 
 		switch(or) {
 		case 0:
 		case 1:
 			if (olocal[1] == c2) {
 				/*
 				 * inbuf is a single byte char,
 				 * so return false here.
 				 */
 				return (-1);
 			} else {
 				/*
 				 * inbuf is a double byte char
 				 */
 				return (ret);
 			}
 			break;
 		case 2:
 			/*
 			 * should compare second byte of inbuf
 			 */
 			break;
 		}
 	} else {
 		/*
 		 * inbuf clould not be splitted, so inbuf is
 		 * a double byte char.
 		 */
 		return (ret);
 	}
 
 	/*
 	 * try second byte(*(sp+1)) conversion, and compare
 	 */
 	ir = 1;
 	or = 3;
 
 	bzero(olocal, or);
 
 	sp = ilocal + 1;
 	dp = olocal;
 
-	if ((my_iconv(cd,(const char **)&sp, &ir, (char **)&dp, &or)) != -1) {
+	if ((my_iconv(cd,(char **)&sp, &ir, (char **)&dp, &or)) != -1) {
 		if (olocal[0] == c2)
 			/*
 			 * inbuf is a single byte char
 			 */
 			return (-1);
 	}
 
 	return (ret);
 }
 
 #else /* statically linked */
 
 #include <sys/types.h>
 #include <sys/iconv.h>
 #include <errno.h>
 
 int
 kiconv_add_xlat16_cspair(const char *tocode __unused, const char *fromcode __unused,
     int flag __unused)
 {
 
 	errno = EINVAL;
 	return (-1);
 }
 
 int
 kiconv_add_xlat16_cspairs(const char *tocode __unused, const char *fromcode __unused)
 {
 	errno = EINVAL;
 	return (-1);
 }
 
 #endif /* PIC */
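Reviewer note on kiconv_xlat16_open() above: the table is walked as 0x200 rows (`ls`) of 0x80 entries (`us`), and the 16-bit character code is rebuilt from the pair with `c = ((ls & 0x100 ? us | 0x80 : us) << 8) | (u_char)ls`. The sketch below restates that packing and its inverse so the index layout is easier to follow. The function names `xlat16_compose` and `xlat16_split` are hypothetical and do not exist in libkiconv.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild the 16-bit code from a row index
 * ls (0..0x1ff) and a column index us (0..0x7f), using the same
 * expression as the loop in kiconv_xlat16_open().
 */
static uint16_t
xlat16_compose(unsigned ls, unsigned us)
{
	unsigned high;

	/* Bit 8 of ls supplies the top bit of the high byte. */
	high = (ls & 0x100) ? (us | 0x80) : us;
	return ((uint16_t)((high << 8) | (ls & 0xff)));
}

/*
 * Hypothetical inverse: recover the row and column indices from a
 * 16-bit code.
 */
static void
xlat16_split(uint16_t c, unsigned *ls, unsigned *us)
{
	unsigned high, low;

	high = (unsigned)c >> 8;
	low = (unsigned)c & 0xff;

	/* The low byte selects the row; the high bit picks the upper half. */
	*ls = low | ((high & 0x80) ? 0x100 : 0);
	/* The remaining seven bits of the high byte select the column. */
	*us = high & 0x7f;
}
```

Because 9 bits of `ls` and 7 bits of `us` together cover all 16 bits of `c`, every code maps to exactly one (row, column) slot, which is why a row's 0x80-entry `table` can be copied out as a unit once any entry in it is non-zero.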
Index: user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c
===================================================================
--- user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c	(revision 281584)
+++ user/ngie/more-tests/libexec/rtld-elf/aarch64/reloc.c	(revision 281585)
@@ -1,414 +1,414 @@
 /*-
  * Copyright (c) 2014-2015 The FreeBSD Foundation
  * All rights reserved.
  *
  * Portions of this software were developed by Andrew Turner
  * under sponsorship from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 
 #include <stdlib.h>
 
 #include "debug.h"
 #include "rtld.h"
 #include "rtld_printf.h"
 
 /*
  * It is possible for the compiler to emit relocations for unaligned data.
  * We handle this situation with these inlines.
  */
 #define	RELOC_ALIGNED_P(x) \
 	(((uintptr_t)(x) & (sizeof(void *) - 1)) == 0)
 
 /*
  * This is not the correct prototype, but we only need it for
  * a function pointer to a simple asm function.
  */
 void *_rtld_tlsdesc(void *);
 void *_rtld_tlsdesc_dynamic(void *);
 
 void _exit(int);
 
 void
 init_pltgot(Obj_Entry *obj)
 {
 
 	if (obj->pltgot != NULL) {
 		obj->pltgot[1] = (Elf_Addr) obj;
 		obj->pltgot[2] = (Elf_Addr) &_rtld_bind_start;
 	}
 }
 
 int
 do_copy_relocations(Obj_Entry *dstobj)
 {
 	const Obj_Entry *srcobj, *defobj;
 	const Elf_Rela *relalim;
 	const Elf_Rela *rela;
 	const Elf_Sym *srcsym;
 	const Elf_Sym *dstsym;
 	const void *srcaddr;
 	const char *name;
 	void *dstaddr;
 	SymLook req;
 	size_t size;
 	int res;
 
 	/*
 	 * COPY relocs are invalid outside of the main program
 	 */
 	assert(dstobj->mainprog);
 
 	relalim = (const Elf_Rela *)((char *)dstobj->rela +
 	    dstobj->relasize);
 	for (rela = dstobj->rela; rela < relalim; rela++) {
 		if (ELF_R_TYPE(rela->r_info) != R_AARCH64_COPY)
 			continue;
 
 		dstaddr = (void *)(dstobj->relocbase + rela->r_offset);
 		dstsym = dstobj->symtab + ELF_R_SYM(rela->r_info);
 		name = dstobj->strtab + dstsym->st_name;
 		size = dstsym->st_size;
 
 		symlook_init(&req, name);
 		req.ventry = fetch_ventry(dstobj, ELF_R_SYM(rela->r_info));
 		req.flags = SYMLOOK_EARLY;
 
 		for (srcobj = dstobj->next; srcobj != NULL;
 		     srcobj = srcobj->next) {
 			res = symlook_obj(&req, srcobj);
 			if (res == 0) {
 				srcsym = req.sym_out;
 				defobj = req.defobj_out;
 				break;
 			}
 		}
 		if (srcobj == NULL) {
 			_rtld_error(
 "Undefined symbol \"%s\" referenced from COPY relocation in %s",
 			    name, dstobj->path);
 			return (-1);
 		}
 
 		srcaddr = (const void *)(defobj->relocbase + srcsym->st_value);
 		memcpy(dstaddr, srcaddr, size);
 	}
 
 	return (0);
 }
 
 struct tls_data {
 	int64_t index;
 	Obj_Entry *obj;
 	const Elf_Rela *rela;
 };
 
 static struct tls_data *
 reloc_tlsdesc_alloc(Obj_Entry *obj, const Elf_Rela *rela)
 {
 	struct tls_data *tlsdesc;
 
 	tlsdesc = xmalloc(sizeof(struct tls_data));
 	tlsdesc->index = -1;
 	tlsdesc->obj = obj;
 	tlsdesc->rela = rela;
 
 	return (tlsdesc);
 }
 
 /*
  * Look up the symbol to find its tls index
  */
 static int64_t
 rtld_tlsdesc_handle_locked(struct tls_data *tlsdesc, int flags,
     RtldLockState *lockstate)
 {
 	const Elf_Rela *rela;
 	const Elf_Sym *def;
 	const Obj_Entry *defobj;
 	Obj_Entry *obj;
 
 	rela = tlsdesc->rela;
 	obj = tlsdesc->obj;
 
 	def = find_symdef(ELF_R_SYM(rela->r_info), obj, &defobj, flags, NULL,
 	    lockstate);
 	if (def == NULL)
 		rtld_die();
 
-	tlsdesc->index = defobj->tlsindex + def->st_value + rela->r_addend;
+	tlsdesc->index = defobj->tlsoffset + def->st_value + rela->r_addend;
 
 	return (tlsdesc->index);
 }
 
 int64_t
 rtld_tlsdesc_handle(struct tls_data *tlsdesc, int flags)
 {
 	RtldLockState lockstate;
 
 	/* We have already found the index, return it */
 	if (tlsdesc->index >= 0)
 		return (tlsdesc->index);
 
 	wlock_acquire(rtld_bind_lock, &lockstate);
 	/* tlsdesc->index may have been set by another thread */
 	if (tlsdesc->index == -1)
 		rtld_tlsdesc_handle_locked(tlsdesc, flags, &lockstate);
 	lock_release(rtld_bind_lock, &lockstate);
 
 	return (tlsdesc->index);
 }
 
 /*
  * Process the PLT relocations.
  */
 int
 reloc_plt(Obj_Entry *obj)
 {
 	const Elf_Rela *relalim;
 	const Elf_Rela *rela;
 
 	relalim = (const Elf_Rela *)((char *)obj->pltrela + obj->pltrelasize);
 	for (rela = obj->pltrela; rela < relalim; rela++) {
 		Elf_Addr *where;
 
 		where = (Elf_Addr *)(obj->relocbase + rela->r_offset);
 
 		switch(ELF_R_TYPE(rela->r_info)) {
 		case R_AARCH64_JUMP_SLOT:
 			*where += (Elf_Addr)obj->relocbase;
 			break;
 		case R_AARCH64_TLSDESC:
 			if (ELF_R_SYM(rela->r_info) == 0) {
 				where[0] = (Elf_Addr)_rtld_tlsdesc;
-				where[1] = obj->tlsindex + rela->r_addend;
+				where[1] = obj->tlsoffset + rela->r_addend;
 			} else {
 				where[0] = (Elf_Addr)_rtld_tlsdesc_dynamic;
 				where[1] = (Elf_Addr)reloc_tlsdesc_alloc(obj,
 				    rela);
 			}
 			break;
 		default:
 			_rtld_error("Unknown relocation type %u in PLT",
 			    (unsigned int)ELF_R_TYPE(rela->r_info));
 			return (-1);
 		}
 	}
 
 	return (0);
 }
 
 /*
  * LD_BIND_NOW was set - force relocation for all jump slots
  */
 int
 reloc_jmpslots(Obj_Entry *obj, int flags, RtldLockState *lockstate)
 {
 	const Obj_Entry *defobj;
 	const Elf_Rela *relalim;
 	const Elf_Rela *rela;
 	const Elf_Sym *def;
 	struct tls_data *tlsdesc;
 
 	relalim = (const Elf_Rela *)((char *)obj->pltrela + obj->pltrelasize);
 	for (rela = obj->pltrela; rela < relalim; rela++) {
 		Elf_Addr *where;
 
 		where = (Elf_Addr *)(obj->relocbase + rela->r_offset);
 		switch(ELF_R_TYPE(rela->r_info)) {
 		case R_AARCH64_JUMP_SLOT:
 			def = find_symdef(ELF_R_SYM(rela->r_info), obj,
 			    &defobj, SYMLOOK_IN_PLT | flags, NULL, lockstate);
 			if (def == NULL) {
 				dbg("reloc_jmpslots: sym not found");
 				return (-1);
 			}
 
 			*where = (Elf_Addr)(defobj->relocbase + def->st_value);
 			break;
 		case R_AARCH64_TLSDESC:
 			if (ELF_R_SYM(rela->r_info) != 0) {
 				tlsdesc = (struct tls_data *)where[1];
 				if (tlsdesc->index == -1)
 					rtld_tlsdesc_handle_locked(tlsdesc,
 					    SYMLOOK_IN_PLT | flags, lockstate);
 			}
 			break;
 		default:
 			_rtld_error("Unknown relocation type %x in jmpslot",
 			    (unsigned int)ELF_R_TYPE(rela->r_info));
 			return (-1);
 		}
 	}
 
 	return (0);
 }
 
 int
 reloc_iresolve(Obj_Entry *obj, struct Struct_RtldLockState *lockstate)
 {
 
 	/* XXX not implemented */
 	return (0);
 }
 
 int
 reloc_gnu_ifunc(Obj_Entry *obj, int flags,
    struct Struct_RtldLockState *lockstate)
 {
 
 	/* XXX not implemented */
 	return (0);
 }
 
 Elf_Addr
 reloc_jmpslot(Elf_Addr *where, Elf_Addr target, const Obj_Entry *defobj,
     const Obj_Entry *obj, const Elf_Rel *rel)
 {
 
 	assert(ELF_R_TYPE(rel->r_info) == R_AARCH64_JUMP_SLOT);
 
 	if (*where != target)
 		*where = target;
 
 	return target;
 }
 
 /*
  * Process non-PLT relocations
  */
 int
 reloc_non_plt(Obj_Entry *obj, Obj_Entry *obj_rtld, int flags,
     RtldLockState *lockstate)
 {
 	const Obj_Entry *defobj;
 	const Elf_Rela *relalim;
 	const Elf_Rela *rela;
 	const Elf_Sym *def;
 	SymCache *cache;
 	Elf_Addr *where;
 	unsigned long symnum;
 
 	if ((flags & SYMLOOK_IFUNC) != 0)
 		/* XXX not implemented */
 		return (0);
 
 	/*
 	 * The dynamic loader may be called from a thread, we have
 	 * limited amounts of stack available so we cannot use alloca().
 	 */
 	if (obj == obj_rtld)
 		cache = NULL;
 	else
 		cache = calloc(obj->dynsymcount, sizeof(SymCache));
 		/* No need to check for NULL here */
 
 	relalim = (const Elf_Rela *)((caddr_t)obj->rela + obj->relasize);
 	for (rela = obj->rela; rela < relalim; rela++) {
 		where = (Elf_Addr *)(obj->relocbase + rela->r_offset);
 		symnum = ELF_R_SYM(rela->r_info);
 
 		switch (ELF_R_TYPE(rela->r_info)) {
 		case R_AARCH64_ABS64:
 		case R_AARCH64_GLOB_DAT:
 			def = find_symdef(symnum, obj, &defobj, flags, cache,
 			    lockstate);
 			if (def == NULL)
 				return (-1);
 
 			*where = (Elf_Addr)defobj->relocbase + def->st_value;
 			break;
 		case R_AARCH64_COPY:
 			/*
 			 * These are deferred until all other relocations have
 			 * been done. All we do here is make sure that the
 			 * COPY relocation is not in a shared library. They
 			 * are allowed only in executable files.
 			 */
 			if (!obj->mainprog) {
 				_rtld_error("%s: Unexpected R_AARCH64_COPY "
 				    "relocation in shared library", obj->path);
 				return (-1);
 			}
 			break;
 		case R_AARCH64_TLS_TPREL64:
 			def = find_symdef(symnum, obj, &defobj, flags, cache,
 			    lockstate);
 			if (def == NULL)
 				return (-1);
 
 			/*
 			 * We lazily allocate offsets for static TLS as we
 			 * see the first relocation that references the
 			 * TLS block. This allows us to support (small
 			 * amounts of) static TLS in dynamically loaded
 			 * modules. If we run out of space, we generate an
 			 * error.
 			 */
 			if (!defobj->tls_done) {
 				if (!allocate_tls_offset((Obj_Entry*) defobj)) {
 					_rtld_error(
 					    "%s: No space available for static "
 					    "Thread Local Storage", obj->path);
 					return (-1);
 				}
 			}
 
 			*where = def->st_value + rela->r_addend +
 			    defobj->tlsoffset - TLS_TCB_SIZE;
 			break;
 		case R_AARCH64_RELATIVE:
 			*where = (Elf_Addr)(obj->relocbase + rela->r_addend);
 			break;
 		default:
 			rtld_printf("%s: Unhandled relocation %lu\n",
 			    obj->path, ELF_R_TYPE(rela->r_info));
 			return (-1);
 		}
 	}
 
 	return (0);
 }
 
 void
 allocate_initial_tls(Obj_Entry *objs)
 {
 	Elf_Addr **tp;
 
 	/*
 	 * Fix the size of the static TLS block by using the maximum
 	 * offset allocated so far and adding a bit for dynamic modules to
 	 * use.
 	 */
 	tls_static_space = tls_last_offset + tls_last_size +
 	    RTLD_STATIC_TLS_EXTRA;
 
 	tp = (Elf_Addr **) allocate_tls(objs, NULL, TLS_TCB_SIZE, 16);
 
 	asm volatile("msr	tpidr_el0, %0" : : "r"(tp));
 }
Index: user/ngie/more-tests/libexec/rtld-elf/rtld.c
===================================================================
--- user/ngie/more-tests/libexec/rtld-elf/rtld.c	(revision 281584)
+++ user/ngie/more-tests/libexec/rtld-elf/rtld.c	(revision 281585)
@@ -1,5070 +1,5083 @@
 /*-
  * Copyright 1996, 1997, 1998, 1999, 2000 John D. Polstra.
  * Copyright 2003 Alexander Kabaev <kan@FreeBSD.ORG>.
  * Copyright 2009-2012 Konstantin Belousov <kib@FreeBSD.ORG>.
  * Copyright 2012 John Marino <draco@marino.st>.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 /*
  * Dynamic linker for ELF.
  *
  * John Polstra <jdp@polstra.com>.
  */
 
 #ifndef __GNUC__
 #error "GCC is needed to compile this file"
 #endif
 
 #include <sys/param.h>
 #include <sys/mount.h>
 #include <sys/mman.h>
 #include <sys/stat.h>
 #include <sys/sysctl.h>
 #include <sys/uio.h>
 #include <sys/utsname.h>
 #include <sys/ktrace.h>
 
 #include <dlfcn.h>
 #include <err.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
 #include "debug.h"
 #include "rtld.h"
 #include "libmap.h"
 #include "rtld_tls.h"
 #include "rtld_printf.h"
 #include "notes.h"
 
 #ifndef COMPAT_32BIT
 #define PATH_RTLD	"/libexec/ld-elf.so.1"
 #else
 #define PATH_RTLD	"/libexec/ld-elf32.so.1"
 #endif
 
 /* Types. */
 typedef void (*func_ptr_type)();
 typedef void * (*path_enum_proc) (const char *path, size_t len, void *arg);
 
 /*
  * Function declarations.
  */
 static const char *basename(const char *);
 static void digest_dynamic1(Obj_Entry *, int, const Elf_Dyn **,
     const Elf_Dyn **, const Elf_Dyn **);
 static void digest_dynamic2(Obj_Entry *, const Elf_Dyn *, const Elf_Dyn *,
     const Elf_Dyn *);
 static void digest_dynamic(Obj_Entry *, int);
 static Obj_Entry *digest_phdr(const Elf_Phdr *, int, caddr_t, const char *);
 static Obj_Entry *dlcheck(void *);
 static Obj_Entry *dlopen_object(const char *name, int fd, Obj_Entry *refobj,
     int lo_flags, int mode, RtldLockState *lockstate);
 static Obj_Entry *do_load_object(int, const char *, char *, struct stat *, int);
 static int do_search_info(const Obj_Entry *obj, int, struct dl_serinfo *);
 static bool donelist_check(DoneList *, const Obj_Entry *);
 static void errmsg_restore(char *);
 static char *errmsg_save(void);
 static void *fill_search_info(const char *, size_t, void *);
 static char *find_library(const char *, const Obj_Entry *, int *);
 static const char *gethints(bool);
 static void init_dag(Obj_Entry *);
 static void init_pagesizes(Elf_Auxinfo **aux_info);
 static void init_rtld(caddr_t, Elf_Auxinfo **);
 static void initlist_add_neededs(Needed_Entry *, Objlist *);
 static void initlist_add_objects(Obj_Entry *, Obj_Entry **, Objlist *);
 static void linkmap_add(Obj_Entry *);
 static void linkmap_delete(Obj_Entry *);
 static void load_filtees(Obj_Entry *, int flags, RtldLockState *);
 static void unload_filtees(Obj_Entry *);
 static int load_needed_objects(Obj_Entry *, int);
 static int load_preload_objects(void);
 static Obj_Entry *load_object(const char *, int fd, const Obj_Entry *, int);
 static void map_stacks_exec(RtldLockState *);
 static Obj_Entry *obj_from_addr(const void *);
 static void objlist_call_fini(Objlist *, Obj_Entry *, RtldLockState *);
 static void objlist_call_init(Objlist *, RtldLockState *);
 static void objlist_clear(Objlist *);
 static Objlist_Entry *objlist_find(Objlist *, const Obj_Entry *);
 static void objlist_init(Objlist *);
 static void objlist_push_head(Objlist *, Obj_Entry *);
 static void objlist_push_tail(Objlist *, Obj_Entry *);
 static void objlist_put_after(Objlist *, Obj_Entry *, Obj_Entry *);
 static void objlist_remove(Objlist *, Obj_Entry *);
 static int parse_libdir(const char *);
 static void *path_enumerate(const char *, path_enum_proc, void *);
 static int relocate_object_dag(Obj_Entry *root, bool bind_now,
     Obj_Entry *rtldobj, int flags, RtldLockState *lockstate);
 static int relocate_object(Obj_Entry *obj, bool bind_now, Obj_Entry *rtldobj,
     int flags, RtldLockState *lockstate);
 static int relocate_objects(Obj_Entry *, bool, Obj_Entry *, int,
     RtldLockState *);
 static int resolve_objects_ifunc(Obj_Entry *first, bool bind_now,
     int flags, RtldLockState *lockstate);
 static int rtld_dirname(const char *, char *);
 static int rtld_dirname_abs(const char *, char *);
 static void *rtld_dlopen(const char *name, int fd, int mode);
 static void rtld_exit(void);
 static char *search_library_path(const char *, const char *);
 static char *search_library_pathfds(const char *, const char *, int *);
 static const void **get_program_var_addr(const char *, RtldLockState *);
 static void set_program_var(const char *, const void *);
 static int symlook_default(SymLook *, const Obj_Entry *refobj);
 static int symlook_global(SymLook *, DoneList *);
 static void symlook_init_from_req(SymLook *, const SymLook *);
 static int symlook_list(SymLook *, const Objlist *, DoneList *);
 static int symlook_needed(SymLook *, const Needed_Entry *, DoneList *);
 static int symlook_obj1_sysv(SymLook *, const Obj_Entry *);
 static int symlook_obj1_gnu(SymLook *, const Obj_Entry *);
 static void trace_loaded_objects(Obj_Entry *);
 static void unlink_object(Obj_Entry *);
 static void unload_object(Obj_Entry *);
 static void unref_dag(Obj_Entry *);
 static void ref_dag(Obj_Entry *);
 static char *origin_subst_one(char *, const char *, const char *, bool);
 static char *origin_subst(char *, const char *);
 static void preinit_main(void);
 static int  rtld_verify_versions(const Objlist *);
 static int  rtld_verify_object_versions(Obj_Entry *);
 static void object_add_name(Obj_Entry *, const char *);
 static int  object_match_name(const Obj_Entry *, const char *);
 static void ld_utrace_log(int, void *, void *, size_t, int, const char *);
 static void rtld_fill_dl_phdr_info(const Obj_Entry *obj,
     struct dl_phdr_info *phdr_info);
 static uint32_t gnu_hash(const char *);
 static bool matched_symbol(SymLook *, const Obj_Entry *, Sym_Match_Result *,
     const unsigned long);
 
 void r_debug_state(struct r_debug *, struct link_map *) __noinline __exported;
 void _r_debug_postinit(struct link_map *) __noinline __exported;
 
 int __sys_openat(int, const char *, int, ...);
 
 /*
  * Data declarations.
  */
 static char *error_message;	/* Message for dlerror(), or NULL */
 struct r_debug r_debug __exported;	/* for GDB; */
 static bool libmap_disable;	/* Disable libmap */
 static bool ld_loadfltr;	/* Immediate filters processing */
 static char *libmap_override;	/* Maps to use in addition to libmap.conf */
 static bool trust;		/* False for setuid and setgid programs */
 static bool dangerous_ld_env;	/* True if environment variables have been
 				   used to affect the libraries loaded */
 static char *ld_bind_now;	/* Environment variable for immediate binding */
 static char *ld_debug;		/* Environment variable for debugging */
 static char *ld_library_path;	/* Environment variable for search path */
 static char *ld_library_dirs;	/* Environment variable for library descriptors */
 static char *ld_preload;	/* Environment variable for libraries to
 				   load first */
 static char *ld_elf_hints_path;	/* Environment variable for alternative hints path */
 static char *ld_tracing;	/* Called from ldd to print libs */
 static char *ld_utrace;		/* Use utrace() to log events. */
 static Obj_Entry *obj_list;	/* Head of linked list of shared objects */
 static Obj_Entry **obj_tail;	/* Link field of last object in list */
 static Obj_Entry *obj_main;	/* The main program shared object */
 static Obj_Entry obj_rtld;	/* The dynamic linker shared object */
 static unsigned int obj_count;	/* Number of objects in obj_list */
 static unsigned int obj_loads;	/* Number of loads of objects (gen count) */
 
 static Objlist list_global =	/* Objects dlopened with RTLD_GLOBAL */
   STAILQ_HEAD_INITIALIZER(list_global);
 static Objlist list_main =	/* Objects loaded at program startup */
   STAILQ_HEAD_INITIALIZER(list_main);
 static Objlist list_fini =	/* Objects needing fini() calls */
   STAILQ_HEAD_INITIALIZER(list_fini);
 
 Elf_Sym sym_zero;		/* For resolving undefined weak refs. */
 
 #define GDB_STATE(s,m)	r_debug.r_state = s; r_debug_state(&r_debug,m);
 
 extern Elf_Dyn _DYNAMIC;
 #pragma weak _DYNAMIC
 #ifndef RTLD_IS_DYNAMIC
 #define	RTLD_IS_DYNAMIC()	(&_DYNAMIC != NULL)
 #endif
 
 int dlclose(void *) __exported;
 char *dlerror(void) __exported;
 void *dlopen(const char *, int) __exported;
 void *fdlopen(int, int) __exported;
 void *dlsym(void *, const char *) __exported;
 dlfunc_t dlfunc(void *, const char *) __exported;
 void *dlvsym(void *, const char *, const char *) __exported;
 int dladdr(const void *, Dl_info *) __exported;
 void dllockinit(void *, void *(*)(void *), void (*)(void *), void (*)(void *),
     void (*)(void *), void (*)(void *), void (*)(void *)) __exported;
 int dlinfo(void *, int , void *) __exported;
 int dl_iterate_phdr(__dl_iterate_hdr_callback, void *) __exported;
 int _rtld_addr_phdr(const void *, struct dl_phdr_info *) __exported;
 int _rtld_get_stack_prot(void) __exported;
 int _rtld_is_dlopened(void *) __exported;
 void _rtld_error(const char *, ...) __exported;
 
 int npagesizes, osreldate;
 size_t *pagesizes;
 
 long __stack_chk_guard[8] = {0, 0, 0, 0, 0, 0, 0, 0};
 
 static int stack_prot = PROT_READ | PROT_WRITE | RTLD_DEFAULT_STACK_EXEC;
 static int max_stack_flags;
 
 /*
  * Global declarations normally provided by crt1.  The dynamic linker is
  * not built with crt1, so we have to provide them ourselves.
  */
 char *__progname;
 char **environ;
 
 /*
  * Used to pass argc, argv to init functions.
  */
 int main_argc;
 char **main_argv;
 
 /*
  * Globals to control TLS allocation.
  */
 size_t tls_last_offset;		/* Static TLS offset of last module */
 size_t tls_last_size;		/* Static TLS size of last module */
 size_t tls_static_space;	/* Static TLS space allocated */
 size_t tls_static_max_align;
 int tls_dtv_generation = 1;	/* Used to detect when dtv size changes  */
 int tls_max_index = 1;		/* Largest module index allocated */
 
 bool ld_library_path_rpath = false;
 
 /*
  * Fill in a DoneList with an allocation large enough to hold all of
  * the currently-loaded objects.  Keep this as a macro since it calls
  * alloca and we want that to occur within the scope of the caller.
  */
 #define donelist_init(dlp)					\
     ((dlp)->objs = alloca(obj_count * sizeof (dlp)->objs[0]),	\
     assert((dlp)->objs != NULL),				\
     (dlp)->num_alloc = obj_count,				\
     (dlp)->num_used = 0)
 
 #define	UTRACE_DLOPEN_START		1
 #define	UTRACE_DLOPEN_STOP		2
 #define	UTRACE_DLCLOSE_START		3
 #define	UTRACE_DLCLOSE_STOP		4
 #define	UTRACE_LOAD_OBJECT		5
 #define	UTRACE_UNLOAD_OBJECT		6
 #define	UTRACE_ADD_RUNDEP		7
 #define	UTRACE_PRELOAD_FINISHED		8
 #define	UTRACE_INIT_CALL		9
 #define	UTRACE_FINI_CALL		10
 #define	UTRACE_DLSYM_START		11
 #define	UTRACE_DLSYM_STOP		12
 
 struct utrace_rtld {
 	char sig[4];			/* 'RTLD' */
 	int event;
 	void *handle;
 	void *mapbase;			/* Used for 'parent' and 'init/fini' */
 	size_t mapsize;
 	int refcnt;			/* Used for 'mode' */
 	char name[MAXPATHLEN];
 };
 
 #define	LD_UTRACE(e, h, mb, ms, r, n) do {			\
 	if (ld_utrace != NULL)					\
 		ld_utrace_log(e, h, mb, ms, r, n);		\
 } while (0)
 
 static void
 ld_utrace_log(int event, void *handle, void *mapbase, size_t mapsize,
     int refcnt, const char *name)
 {
 	struct utrace_rtld ut;
 
 	ut.sig[0] = 'R';
 	ut.sig[1] = 'T';
 	ut.sig[2] = 'L';
 	ut.sig[3] = 'D';
 	ut.event = event;
 	ut.handle = handle;
 	ut.mapbase = mapbase;
 	ut.mapsize = mapsize;
 	ut.refcnt = refcnt;
 	bzero(ut.name, sizeof(ut.name));
 	if (name)
 		strlcpy(ut.name, name, sizeof(ut.name));
 	utrace(&ut, sizeof(ut));
 }
 
 /*
  * Main entry point for dynamic linking.  The first argument is the
  * stack pointer.  The stack is expected to be laid out as described
  * in the SVR4 ABI specification, Intel 386 Processor Supplement.
  * Specifically, the stack pointer points to a word containing
  * ARGC.  Following that in the stack is a null-terminated sequence
  * of pointers to argument strings.  Then comes a null-terminated
  * sequence of pointers to environment strings.  Finally, there is a
  * sequence of "auxiliary vector" entries.
  *
  * The second argument points to a place to store the dynamic linker's
  * exit procedure pointer and the third to a place to store the main
  * program's object.
  *
  * The return value is the main program's entry point.
  */
 func_ptr_type
 _rtld(Elf_Addr *sp, func_ptr_type *exit_proc, Obj_Entry **objp)
 {
     Elf_Auxinfo *aux_info[AT_COUNT];
     int i;
     int argc;
     char **argv;
     char **env;
     Elf_Auxinfo *aux;
     Elf_Auxinfo *auxp;
     const char *argv0;
     Objlist_Entry *entry;
     Obj_Entry *obj;
     Obj_Entry **preload_tail;
     Obj_Entry *last_interposer;
     Objlist initlist;
     RtldLockState lockstate;
     char *library_path_rpath;
     int mib[2];
     size_t len;
 
     /*
      * On entry, the dynamic linker itself has not been relocated yet.
      * Be very careful not to reference any global data until after
      * init_rtld has returned.  It is OK to reference file-scope statics
      * and string constants, and to call static and global functions.
      */
 
     /* Find the auxiliary vector on the stack. */
     argc = *sp++;
     argv = (char **) sp;
     sp += argc + 1;	/* Skip over arguments and NULL terminator */
     env = (char **) sp;
     while (*sp++ != 0)	/* Skip over environment, and NULL terminator */
 	;
     aux = (Elf_Auxinfo *) sp;
 
     /* Digest the auxiliary vector. */
     for (i = 0;  i < AT_COUNT;  i++)
 	aux_info[i] = NULL;
     for (auxp = aux;  auxp->a_type != AT_NULL;  auxp++) {
 	if (auxp->a_type < AT_COUNT)
 	    aux_info[auxp->a_type] = auxp;
     }
 
     /* Initialize and relocate ourselves. */
     assert(aux_info[AT_BASE] != NULL);
     init_rtld((caddr_t) aux_info[AT_BASE]->a_un.a_ptr, aux_info);
 
     __progname = obj_rtld.path;
     argv0 = argv[0] != NULL ? argv[0] : "(null)";
     environ = env;
     main_argc = argc;
     main_argv = argv;
 
     if (aux_info[AT_CANARY] != NULL &&
 	aux_info[AT_CANARY]->a_un.a_ptr != NULL) {
 	    i = aux_info[AT_CANARYLEN]->a_un.a_val;
 	    if (i > sizeof(__stack_chk_guard))
 		    i = sizeof(__stack_chk_guard);
 	    memcpy(__stack_chk_guard, aux_info[AT_CANARY]->a_un.a_ptr, i);
     } else {
 	mib[0] = CTL_KERN;
 	mib[1] = KERN_ARND;
 
 	len = sizeof(__stack_chk_guard);
 	if (sysctl(mib, 2, __stack_chk_guard, &len, NULL, 0) == -1 ||
 	    len != sizeof(__stack_chk_guard)) {
 		/* If sysctl was unsuccessful, use the "terminator canary". */
 		((unsigned char *)(void *)__stack_chk_guard)[0] = 0;
 		((unsigned char *)(void *)__stack_chk_guard)[1] = 0;
 		((unsigned char *)(void *)__stack_chk_guard)[2] = '\n';
 		((unsigned char *)(void *)__stack_chk_guard)[3] = 255;
 	}
     }
 
     trust = !issetugid();
 
     ld_bind_now = getenv(LD_ "BIND_NOW");
     /*
      * If the process is tainted, then we unset the dangerous environment
      * variables.  The process will be marked as tainted until setuid(2)
      * is called.  If any child process calls setuid(2) we do not want any
      * future processes to honor the potentially unsafe variables.
      */
     if (!trust) {
         if (unsetenv(LD_ "PRELOAD") || unsetenv(LD_ "LIBMAP") ||
 	    unsetenv(LD_ "LIBRARY_PATH") || unsetenv(LD_ "LIBRARY_PATH_FDS") ||
 	    unsetenv(LD_ "LIBMAP_DISABLE") ||
 	    unsetenv(LD_ "DEBUG") || unsetenv(LD_ "ELF_HINTS_PATH") ||
 	    unsetenv(LD_ "LOADFLTR") || unsetenv(LD_ "LIBRARY_PATH_RPATH")) {
 		_rtld_error("environment corrupt; aborting");
 		rtld_die();
 	}
     }
     ld_debug = getenv(LD_ "DEBUG");
     libmap_disable = getenv(LD_ "LIBMAP_DISABLE") != NULL;
     libmap_override = getenv(LD_ "LIBMAP");
     ld_library_path = getenv(LD_ "LIBRARY_PATH");
     ld_library_dirs = getenv(LD_ "LIBRARY_PATH_FDS");
     ld_preload = getenv(LD_ "PRELOAD");
     ld_elf_hints_path = getenv(LD_ "ELF_HINTS_PATH");
     ld_loadfltr = getenv(LD_ "LOADFLTR") != NULL;
     library_path_rpath = getenv(LD_ "LIBRARY_PATH_RPATH");
     if (library_path_rpath != NULL) {
 	    if (library_path_rpath[0] == 'y' ||
 		library_path_rpath[0] == 'Y' ||
 		library_path_rpath[0] == '1')
 		    ld_library_path_rpath = true;
 	    else
 		    ld_library_path_rpath = false;
     }
     dangerous_ld_env = libmap_disable || (libmap_override != NULL) ||
 	(ld_library_path != NULL) || (ld_preload != NULL) ||
 	(ld_elf_hints_path != NULL) || ld_loadfltr;
     ld_tracing = getenv(LD_ "TRACE_LOADED_OBJECTS");
     ld_utrace = getenv(LD_ "UTRACE");
 
     if ((ld_elf_hints_path == NULL) || strlen(ld_elf_hints_path) == 0)
 	ld_elf_hints_path = _PATH_ELF_HINTS;
 
     if (ld_debug != NULL && *ld_debug != '\0')
 	debug = 1;
     dbg("%s is initialized, base address = %p", __progname,
 	(caddr_t) aux_info[AT_BASE]->a_un.a_ptr);
     dbg("RTLD dynamic = %p", obj_rtld.dynamic);
     dbg("RTLD pltgot  = %p", obj_rtld.pltgot);
 
     dbg("initializing thread locks");
     lockdflt_init();
 
     /*
      * Load the main program, or process its program header if it is
      * already loaded.
      */
     if (aux_info[AT_EXECFD] != NULL) {	/* Load the main program. */
 	int fd = aux_info[AT_EXECFD]->a_un.a_val;
 	dbg("loading main program");
 	obj_main = map_object(fd, argv0, NULL);
 	close(fd);
 	if (obj_main == NULL)
 	    rtld_die();
 	max_stack_flags = obj_main->stack_flags;
     } else {				/* Main program already loaded. */
 	const Elf_Phdr *phdr;
 	int phnum;
 	caddr_t entry;
 
 	dbg("processing main program's program header");
 	assert(aux_info[AT_PHDR] != NULL);
 	phdr = (const Elf_Phdr *) aux_info[AT_PHDR]->a_un.a_ptr;
 	assert(aux_info[AT_PHNUM] != NULL);
 	phnum = aux_info[AT_PHNUM]->a_un.a_val;
 	assert(aux_info[AT_PHENT] != NULL);
 	assert(aux_info[AT_PHENT]->a_un.a_val == sizeof(Elf_Phdr));
 	assert(aux_info[AT_ENTRY] != NULL);
 	entry = (caddr_t) aux_info[AT_ENTRY]->a_un.a_ptr;
 	if ((obj_main = digest_phdr(phdr, phnum, entry, argv0)) == NULL)
 	    rtld_die();
     }
 
     if (aux_info[AT_EXECPATH] != 0) {
 	    char *kexecpath;
 	    char buf[MAXPATHLEN];
 
 	    kexecpath = aux_info[AT_EXECPATH]->a_un.a_ptr;
 	    dbg("AT_EXECPATH %p %s", kexecpath, kexecpath);
 	    if (kexecpath[0] == '/')
 		    obj_main->path = kexecpath;
 	    else if (getcwd(buf, sizeof(buf)) == NULL ||
 		     strlcat(buf, "/", sizeof(buf)) >= sizeof(buf) ||
 		     strlcat(buf, kexecpath, sizeof(buf)) >= sizeof(buf))
 		    obj_main->path = xstrdup(argv0);
 	    else
 		    obj_main->path = xstrdup(buf);
     } else {
 	    dbg("No AT_EXECPATH");
 	    obj_main->path = xstrdup(argv0);
     }
     dbg("obj_main path %s", obj_main->path);
     obj_main->mainprog = true;
 
     if (aux_info[AT_STACKPROT] != NULL &&
       aux_info[AT_STACKPROT]->a_un.a_val != 0)
 	    stack_prot = aux_info[AT_STACKPROT]->a_un.a_val;
 
 #ifndef COMPAT_32BIT
     /*
      * Get the actual dynamic linker pathname from the executable if
      * possible.  (It should always be possible.)  That ensures that
      * gdb will find the right dynamic linker even if a non-standard
      * one is being used.
      */
     if (obj_main->interp != NULL &&
       strcmp(obj_main->interp, obj_rtld.path) != 0) {
 	free(obj_rtld.path);
 	obj_rtld.path = xstrdup(obj_main->interp);
         __progname = obj_rtld.path;
     }
 #endif
 
     digest_dynamic(obj_main, 0);
     dbg("%s valid_hash_sysv %d valid_hash_gnu %d dynsymcount %d",
 	obj_main->path, obj_main->valid_hash_sysv, obj_main->valid_hash_gnu,
 	obj_main->dynsymcount);
 
     linkmap_add(obj_main);
     linkmap_add(&obj_rtld);
 
     /* Link the main program into the list of objects. */
     *obj_tail = obj_main;
     obj_tail = &obj_main->next;
     obj_count++;
     obj_loads++;
 
     /* Initialize a fake symbol for resolving undefined weak references. */
     sym_zero.st_info = ELF_ST_INFO(STB_GLOBAL, STT_NOTYPE);
     sym_zero.st_shndx = SHN_UNDEF;
     sym_zero.st_value = -(uintptr_t)obj_main->relocbase;
 
     if (!libmap_disable)
         libmap_disable = (bool)lm_init(libmap_override);
 
     dbg("loading LD_PRELOAD libraries");
     if (load_preload_objects() == -1)
 	rtld_die();
     preload_tail = obj_tail;
 
     dbg("loading needed objects");
     if (load_needed_objects(obj_main, 0) == -1)
 	rtld_die();
 
     /* Make a list of all objects loaded at startup. */
     last_interposer = obj_main;
     for (obj = obj_list;  obj != NULL;  obj = obj->next) {
 	if (obj->z_interpose && obj != obj_main) {
 	    objlist_put_after(&list_main, last_interposer, obj);
 	    last_interposer = obj;
 	} else {
 	    objlist_push_tail(&list_main, obj);
 	}
     	obj->refcount++;
     }
 
     dbg("checking for required versions");
     if (rtld_verify_versions(&list_main) == -1 && !ld_tracing)
 	rtld_die();
 
     if (ld_tracing) {		/* We're done */
 	trace_loaded_objects(obj_main);
 	exit(0);
     }
 
     if (getenv(LD_ "DUMP_REL_PRE") != NULL) {
        dump_relocations(obj_main);
        exit (0);
     }
 
     /*
      * Processing tls relocations requires having the tls offsets
      * initialized.  Prepare offsets before starting initial
      * relocation processing.
      */
     dbg("initializing initial thread local storage offsets");
     STAILQ_FOREACH(entry, &list_main, link) {
 	/*
 	 * Allocate all the initial objects out of the static TLS
 	 * block even if they didn't ask for it.
 	 */
 	allocate_tls_offset(entry->obj);
     }
 
     if (relocate_objects(obj_main,
       ld_bind_now != NULL && *ld_bind_now != '\0',
       &obj_rtld, SYMLOOK_EARLY, NULL) == -1)
 	rtld_die();
 
     dbg("doing copy relocations");
     if (do_copy_relocations(obj_main) == -1)
 	rtld_die();
 
     if (getenv(LD_ "DUMP_REL_POST") != NULL) {
        dump_relocations(obj_main);
        exit (0);
     }
 
     /*
      * Set up TLS for the main thread.  This must be done after the
      * relocations are processed, since the TLS initialization section
      * might itself be subject to relocations.
      */
     dbg("initializing initial thread local storage");
     allocate_initial_tls(obj_list);
 
     dbg("initializing key program variables");
     set_program_var("__progname", argv[0] != NULL ? basename(argv[0]) : "");
     set_program_var("environ", env);
     set_program_var("__elf_aux_vector", aux);
 
     /* Make a list of init functions to call. */
     objlist_init(&initlist);
     initlist_add_objects(obj_list, preload_tail, &initlist);
 
     r_debug_state(NULL, &obj_main->linkmap); /* say hello to gdb! */
 
     map_stacks_exec(NULL);
 
     dbg("resolving ifuncs");
     if (resolve_objects_ifunc(obj_main,
       ld_bind_now != NULL && *ld_bind_now != '\0', SYMLOOK_EARLY,
       NULL) == -1)
 	rtld_die();
 
     if (!obj_main->crt_no_init) {
 	/*
 	 * Make sure we don't call the main program's init and fini
 	 * functions for binaries linked with old crt1 which calls
 	 * _init itself.
 	 */
 	obj_main->init = obj_main->fini = (Elf_Addr)NULL;
 	obj_main->preinit_array = obj_main->init_array =
 	    obj_main->fini_array = (Elf_Addr)NULL;
     }
 
     wlock_acquire(rtld_bind_lock, &lockstate);
     if (obj_main->crt_no_init)
 	preinit_main();
     objlist_call_init(&initlist, &lockstate);
     _r_debug_postinit(&obj_main->linkmap);
     objlist_clear(&initlist);
     dbg("loading filtees");
     for (obj = obj_list->next; obj != NULL; obj = obj->next) {
 	if (ld_loadfltr || obj->z_loadfltr)
 	    load_filtees(obj, 0, &lockstate);
     }
     lock_release(rtld_bind_lock, &lockstate);
 
     dbg("transferring control to program entry point = %p", obj_main->entry);
 
     /* Return the exit procedure and the program entry point. */
     *exit_proc = rtld_exit;
     *objp = obj_main;
     return (func_ptr_type) obj_main->entry;
 }
 
 void *
 rtld_resolve_ifunc(const Obj_Entry *obj, const Elf_Sym *def)
 {
 	void *ptr;
 	Elf_Addr target;
 
 	ptr = (void *)make_function_pointer(def, obj);
 	target = ((Elf_Addr (*)(void))ptr)();
 	return ((void *)target);
 }
 
 Elf_Addr
 _rtld_bind(Obj_Entry *obj, Elf_Size reloff)
 {
     const Elf_Rel *rel;
     const Elf_Sym *def;
     const Obj_Entry *defobj;
     Elf_Addr *where;
     Elf_Addr target;
     RtldLockState lockstate;
 
     rlock_acquire(rtld_bind_lock, &lockstate);
     if (sigsetjmp(lockstate.env, 0) != 0)
 	    lock_upgrade(rtld_bind_lock, &lockstate);
     if (obj->pltrel)
 	rel = (const Elf_Rel *) ((caddr_t) obj->pltrel + reloff);
     else
 	rel = (const Elf_Rel *) ((caddr_t) obj->pltrela + reloff);
 
     where = (Elf_Addr *) (obj->relocbase + rel->r_offset);
     def = find_symdef(ELF_R_SYM(rel->r_info), obj, &defobj, true, NULL,
 	&lockstate);
     if (def == NULL)
 	rtld_die();
     if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC)
 	target = (Elf_Addr)rtld_resolve_ifunc(defobj, def);
     else
 	target = (Elf_Addr)(defobj->relocbase + def->st_value);
 
     dbg("\"%s\" in \"%s\" ==> %p in \"%s\"",
       defobj->strtab + def->st_name, basename(obj->path),
       (void *)target, basename(defobj->path));
 
     /*
      * Write the new contents for the jmpslot. Note that depending on
      * architecture, the value which we need to return back to the
      * lazy binding trampoline may or may not be the target
      * address. The value returned from reloc_jmpslot() is the value
      * that the trampoline needs.
      */
     target = reloc_jmpslot(where, target, defobj, obj, rel);
     lock_release(rtld_bind_lock, &lockstate);
     return target;
 }
 
 /*
  * Error reporting function.  Use it like printf.  It formats the message
  * into a buffer, and sets things up so that the next call to dlerror()
  * will return the message.
  */
 void
 _rtld_error(const char *fmt, ...)
 {
     static char buf[512];
     va_list ap;
 
     va_start(ap, fmt);
     rtld_vsnprintf(buf, sizeof buf, fmt, ap);
     error_message = buf;
     va_end(ap);
 }
 
 /*
  * Return a dynamically-allocated copy of the current error message, if any.
  */
 static char *
 errmsg_save(void)
 {
     return error_message == NULL ? NULL : xstrdup(error_message);
 }
 
 /*
  * Restore the current error message from a copy which was previously saved
  * by errmsg_save().  The copy is freed.
  */
 static void
 errmsg_restore(char *saved_msg)
 {
     if (saved_msg == NULL)
 	error_message = NULL;
     else {
 	_rtld_error("%s", saved_msg);
 	free(saved_msg);
     }
 }
 
 static const char *
 basename(const char *name)
 {
     const char *p = strrchr(name, '/');
     return p != NULL ? p + 1 : name;
 }
 
 static struct utsname uts;
 
 static char *
 origin_subst_one(char *real, const char *kw, const char *subst,
     bool may_free)
 {
 	char *p, *p1, *res, *resp;
 	int subst_len, kw_len, subst_count, old_len, new_len;
 
 	kw_len = strlen(kw);
 
 	/*
 	 * First, count the number of keyword occurrences, to
 	 * preallocate the final string.
 	 */
 	for (p = real, subst_count = 0;; p = p1 + kw_len, subst_count++) {
 		p1 = strstr(p, kw);
 		if (p1 == NULL)
 			break;
 	}
 
 	/*
 	 * If the keyword is not found, just return.
 	 */
 	if (subst_count == 0)
 		return (may_free ? real : xstrdup(real));
 
 	/*
 	 * There is indeed something to substitute.  Calculate the
 	 * length of the resulting string, and allocate it.
 	 */
 	subst_len = strlen(subst);
 	old_len = strlen(real);
 	new_len = old_len + (subst_len - kw_len) * subst_count;
 	res = xmalloc(new_len + 1);
 
 	/*
 	 * Now, execute the substitution loop.
 	 */
 	for (p = real, resp = res, *resp = '\0';;) {
 		p1 = strstr(p, kw);
 		if (p1 != NULL) {
 			/* Copy the prefix before keyword. */
 			memcpy(resp, p, p1 - p);
 			resp += p1 - p;
 			/* Keyword replacement. */
 			memcpy(resp, subst, subst_len);
 			resp += subst_len;
 			*resp = '\0';
 			p = p1 + kw_len;
 		} else
 			break;
 	}
 
 	/* Copy to the end of string and finish. */
 	strcat(resp, p);
 	if (may_free)
 		free(real);
 	return (res);
 }
 
 static char *
 origin_subst(char *real, const char *origin_path)
 {
 	char *res1, *res2, *res3, *res4;
 
 	if (uts.sysname[0] == '\0') {
 		if (uname(&uts) != 0) {
 			_rtld_error("utsname failed: %d", errno);
 			return (NULL);
 		}
 	}
 	res1 = origin_subst_one(real, "$ORIGIN", origin_path, false);
 	res2 = origin_subst_one(res1, "$OSNAME", uts.sysname, true);
 	res3 = origin_subst_one(res2, "$OSREL", uts.release, true);
 	res4 = origin_subst_one(res3, "$PLATFORM", uts.machine, true);
 	return (res4);
 }
 
 void
 rtld_die(void)
 {
     const char *msg = dlerror();
 
     if (msg == NULL)
 	msg = "Fatal error";
     rtld_fdputstr(STDERR_FILENO, msg);
     rtld_fdputchar(STDERR_FILENO, '\n');
     _exit(1);
 }
 
 /*
  * Process a shared object's DYNAMIC section, and save the important
  * information in its Obj_Entry structure.
  */
 static void
 digest_dynamic1(Obj_Entry *obj, int early, const Elf_Dyn **dyn_rpath,
     const Elf_Dyn **dyn_soname, const Elf_Dyn **dyn_runpath)
 {
     const Elf_Dyn *dynp;
     Needed_Entry **needed_tail = &obj->needed;
     Needed_Entry **needed_filtees_tail = &obj->needed_filtees;
     Needed_Entry **needed_aux_filtees_tail = &obj->needed_aux_filtees;
     const Elf_Hashelt *hashtab;
     const Elf32_Word *hashval;
     Elf32_Word bkt, nmaskwords;
     int bloom_size32;
     int plttype = DT_REL;
 
     *dyn_rpath = NULL;
     *dyn_soname = NULL;
     *dyn_runpath = NULL;
 
     obj->bind_now = false;
     for (dynp = obj->dynamic;  dynp->d_tag != DT_NULL;  dynp++) {
 	switch (dynp->d_tag) {
 
 	case DT_REL:
 	    obj->rel = (const Elf_Rel *) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_RELSZ:
 	    obj->relsize = dynp->d_un.d_val;
 	    break;
 
 	case DT_RELENT:
 	    assert(dynp->d_un.d_val == sizeof(Elf_Rel));
 	    break;
 
 	case DT_JMPREL:
 	    obj->pltrel = (const Elf_Rel *)
 	      (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_PLTRELSZ:
 	    obj->pltrelsize = dynp->d_un.d_val;
 	    break;
 
 	case DT_RELA:
 	    obj->rela = (const Elf_Rela *) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_RELASZ:
 	    obj->relasize = dynp->d_un.d_val;
 	    break;
 
 	case DT_RELAENT:
 	    assert(dynp->d_un.d_val == sizeof(Elf_Rela));
 	    break;
 
 	case DT_PLTREL:
 	    plttype = dynp->d_un.d_val;
 	    assert(dynp->d_un.d_val == DT_REL || plttype == DT_RELA);
 	    break;
 
 	case DT_SYMTAB:
 	    obj->symtab = (const Elf_Sym *)
 	      (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_SYMENT:
 	    assert(dynp->d_un.d_val == sizeof(Elf_Sym));
 	    break;
 
 	case DT_STRTAB:
 	    obj->strtab = (const char *) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_STRSZ:
 	    obj->strsize = dynp->d_un.d_val;
 	    break;
 
 	case DT_VERNEED:
 	    obj->verneed = (const Elf_Verneed *) (obj->relocbase +
 		dynp->d_un.d_val);
 	    break;
 
 	case DT_VERNEEDNUM:
 	    obj->verneednum = dynp->d_un.d_val;
 	    break;
 
 	case DT_VERDEF:
 	    obj->verdef = (const Elf_Verdef *) (obj->relocbase +
 		dynp->d_un.d_val);
 	    break;
 
 	case DT_VERDEFNUM:
 	    obj->verdefnum = dynp->d_un.d_val;
 	    break;
 
 	case DT_VERSYM:
 	    obj->versyms = (const Elf_Versym *)(obj->relocbase +
 		dynp->d_un.d_val);
 	    break;
 
 	case DT_HASH:
 	    {
 		hashtab = (const Elf_Hashelt *)(obj->relocbase +
 		    dynp->d_un.d_ptr);
 		obj->nbuckets = hashtab[0];
 		obj->nchains = hashtab[1];
 		obj->buckets = hashtab + 2;
 		obj->chains = obj->buckets + obj->nbuckets;
 		obj->valid_hash_sysv = obj->nbuckets > 0 && obj->nchains > 0 &&
 		  obj->buckets != NULL;
 	    }
 	    break;
 
 	case DT_GNU_HASH:
 	    {
 		hashtab = (const Elf_Hashelt *)(obj->relocbase +
 		    dynp->d_un.d_ptr);
 		obj->nbuckets_gnu = hashtab[0];
 		obj->symndx_gnu = hashtab[1];
 		nmaskwords = hashtab[2];
 		bloom_size32 = (__ELF_WORD_SIZE / 32) * nmaskwords;
 		obj->maskwords_bm_gnu = nmaskwords - 1;
 		obj->shift2_gnu = hashtab[3];
 		obj->bloom_gnu = (Elf_Addr *) (hashtab + 4);
 		obj->buckets_gnu = hashtab + 4 + bloom_size32;
 		obj->chain_zero_gnu = obj->buckets_gnu + obj->nbuckets_gnu -
 		  obj->symndx_gnu;
 		/* Number of bitmask words is required to be power of 2 */
 		obj->valid_hash_gnu = powerof2(nmaskwords) &&
 		    obj->nbuckets_gnu > 0 && obj->buckets_gnu != NULL;
 	    }
 	    break;
 
 	case DT_NEEDED:
 	    if (!obj->rtld) {
 		Needed_Entry *nep = NEW(Needed_Entry);
 		nep->name = dynp->d_un.d_val;
 		nep->obj = NULL;
 		nep->next = NULL;
 
 		*needed_tail = nep;
 		needed_tail = &nep->next;
 	    }
 	    break;
 
 	case DT_FILTER:
 	    if (!obj->rtld) {
 		Needed_Entry *nep = NEW(Needed_Entry);
 		nep->name = dynp->d_un.d_val;
 		nep->obj = NULL;
 		nep->next = NULL;
 
 		*needed_filtees_tail = nep;
 		needed_filtees_tail = &nep->next;
 	    }
 	    break;
 
 	case DT_AUXILIARY:
 	    if (!obj->rtld) {
 		Needed_Entry *nep = NEW(Needed_Entry);
 		nep->name = dynp->d_un.d_val;
 		nep->obj = NULL;
 		nep->next = NULL;
 
 		*needed_aux_filtees_tail = nep;
 		needed_aux_filtees_tail = &nep->next;
 	    }
 	    break;
 
 	case DT_PLTGOT:
 	    obj->pltgot = (Elf_Addr *) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_TEXTREL:
 	    obj->textrel = true;
 	    break;
 
 	case DT_SYMBOLIC:
 	    obj->symbolic = true;
 	    break;
 
 	case DT_RPATH:
 	    /*
 	     * We have to wait until later to process this, because we
 	     * might not have gotten the address of the string table yet.
 	     */
 	    *dyn_rpath = dynp;
 	    break;
 
 	case DT_SONAME:
 	    *dyn_soname = dynp;
 	    break;
 
 	case DT_RUNPATH:
 	    *dyn_runpath = dynp;
 	    break;
 
 	case DT_INIT:
 	    obj->init = (Elf_Addr) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_PREINIT_ARRAY:
 	    obj->preinit_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_PREINIT_ARRAYSZ:
 	    obj->preinit_array_num = dynp->d_un.d_val / sizeof(Elf_Addr);
 	    break;
 
 	case DT_INIT_ARRAY:
 	    obj->init_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_INIT_ARRAYSZ:
 	    obj->init_array_num = dynp->d_un.d_val / sizeof(Elf_Addr);
 	    break;
 
 	case DT_FINI:
 	    obj->fini = (Elf_Addr) (obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_FINI_ARRAY:
 	    obj->fini_array = (Elf_Addr)(obj->relocbase + dynp->d_un.d_ptr);
 	    break;
 
 	case DT_FINI_ARRAYSZ:
 	    obj->fini_array_num = dynp->d_un.d_val / sizeof(Elf_Addr);
 	    break;
 
 	/*
 	 * Don't process DT_DEBUG on MIPS as the dynamic section
 	 * is mapped read-only. DT_MIPS_RLD_MAP is used instead.
 	 */
 
 #ifndef __mips__
 	case DT_DEBUG:
 	    /* XXX - not implemented yet */
 	    if (!early)
 		dbg("Filling in DT_DEBUG entry");
 	    ((Elf_Dyn*)dynp)->d_un.d_ptr = (Elf_Addr) &r_debug;
 	    break;
 #endif
 
 	case DT_FLAGS:
 		if ((dynp->d_un.d_val & DF_ORIGIN) && trust)
 		    obj->z_origin = true;
 		if (dynp->d_un.d_val & DF_SYMBOLIC)
 		    obj->symbolic = true;
 		if (dynp->d_un.d_val & DF_TEXTREL)
 		    obj->textrel = true;
 		if (dynp->d_un.d_val & DF_BIND_NOW)
 		    obj->bind_now = true;
 		/*if (dynp->d_un.d_val & DF_STATIC_TLS)
 		    ;*/
 	    break;
 #ifdef __mips__
 	case DT_MIPS_LOCAL_GOTNO:
 		obj->local_gotno = dynp->d_un.d_val;
 	    break;
 
 	case DT_MIPS_SYMTABNO:
 		obj->symtabno = dynp->d_un.d_val;
 		break;
 
 	case DT_MIPS_GOTSYM:
 		obj->gotsym = dynp->d_un.d_val;
 		break;
 
 	case DT_MIPS_RLD_MAP:
 		*((Elf_Addr *)(dynp->d_un.d_ptr)) = (Elf_Addr) &r_debug;
 		break;
 #endif
 
 	case DT_FLAGS_1:
 		if (dynp->d_un.d_val & DF_1_NOOPEN)
 		    obj->z_noopen = true;
 		if ((dynp->d_un.d_val & DF_1_ORIGIN) && trust)
 		    obj->z_origin = true;
-		/*if (dynp->d_un.d_val & DF_1_GLOBAL)
-		    XXX ;*/
+		if (dynp->d_un.d_val & DF_1_GLOBAL)
+		    obj->z_global = true;
 		if (dynp->d_un.d_val & DF_1_BIND_NOW)
 		    obj->bind_now = true;
 		if (dynp->d_un.d_val & DF_1_NODELETE)
 		    obj->z_nodelete = true;
 		if (dynp->d_un.d_val & DF_1_LOADFLTR)
 		    obj->z_loadfltr = true;
 		if (dynp->d_un.d_val & DF_1_INTERPOSE)
 		    obj->z_interpose = true;
 		if (dynp->d_un.d_val & DF_1_NODEFLIB)
 		    obj->z_nodeflib = true;
 	    break;
 
 	default:
 	    if (!early) {
 		dbg("Ignoring d_tag %ld = %#lx", (long)dynp->d_tag,
 		    (long)dynp->d_tag);
 	    }
 	    break;
 	}
     }
 
     obj->traced = false;
 
     if (plttype == DT_RELA) {
 	obj->pltrela = (const Elf_Rela *) obj->pltrel;
 	obj->pltrel = NULL;
 	obj->pltrelasize = obj->pltrelsize;
 	obj->pltrelsize = 0;
     }
 
     /* Determine size of dynsym table (equal to nchains of sysv hash) */
     if (obj->valid_hash_sysv)
 	obj->dynsymcount = obj->nchains;
     else if (obj->valid_hash_gnu) {
 	obj->dynsymcount = 0;
 	for (bkt = 0; bkt < obj->nbuckets_gnu; bkt++) {
 	    if (obj->buckets_gnu[bkt] == 0)
 		continue;
 	    hashval = &obj->chain_zero_gnu[obj->buckets_gnu[bkt]];
 	    do
 		obj->dynsymcount++;
 	    while ((*hashval++ & 1u) == 0);
 	}
 	obj->dynsymcount += obj->symndx_gnu;
     }
 }
 
 static void
 digest_dynamic2(Obj_Entry *obj, const Elf_Dyn *dyn_rpath,
     const Elf_Dyn *dyn_soname, const Elf_Dyn *dyn_runpath)
 {
 
     if (obj->z_origin && obj->origin_path == NULL) {
 	obj->origin_path = xmalloc(PATH_MAX);
 	if (rtld_dirname_abs(obj->path, obj->origin_path) == -1)
 	    rtld_die();
     }
 
     if (dyn_runpath != NULL) {
 	obj->runpath = (char *)obj->strtab + dyn_runpath->d_un.d_val;
 	if (obj->z_origin)
 	    obj->runpath = origin_subst(obj->runpath, obj->origin_path);
     }
     else if (dyn_rpath != NULL) {
 	obj->rpath = (char *)obj->strtab + dyn_rpath->d_un.d_val;
 	if (obj->z_origin)
 	    obj->rpath = origin_subst(obj->rpath, obj->origin_path);
     }
 
     if (dyn_soname != NULL)
 	object_add_name(obj, obj->strtab + dyn_soname->d_un.d_val);
 }
 
 static void
 digest_dynamic(Obj_Entry *obj, int early)
 {
 	const Elf_Dyn *dyn_rpath;
 	const Elf_Dyn *dyn_soname;
 	const Elf_Dyn *dyn_runpath;
 
 	digest_dynamic1(obj, early, &dyn_rpath, &dyn_soname, &dyn_runpath);
 	digest_dynamic2(obj, dyn_rpath, dyn_soname, dyn_runpath);
 }
 
 /*
  * Process a shared object's program header.  This is used only for the
  * main program, when the kernel has already loaded the main program
  * into memory before calling the dynamic linker.  It creates and
  * returns an Obj_Entry structure.
  */
 static Obj_Entry *
 digest_phdr(const Elf_Phdr *phdr, int phnum, caddr_t entry, const char *path)
 {
     Obj_Entry *obj;
     const Elf_Phdr *phlimit = phdr + phnum;
     const Elf_Phdr *ph;
     Elf_Addr note_start, note_end;
     int nsegs = 0;
 
     obj = obj_new();
     for (ph = phdr;  ph < phlimit;  ph++) {
 	if (ph->p_type != PT_PHDR)
 	    continue;
 
 	obj->phdr = phdr;
 	obj->phsize = ph->p_memsz;
 	obj->relocbase = (caddr_t)phdr - ph->p_vaddr;
 	break;
     }
 
     obj->stack_flags = PF_X | PF_R | PF_W;
 
     for (ph = phdr;  ph < phlimit;  ph++) {
 	switch (ph->p_type) {
 
 	case PT_INTERP:
 	    obj->interp = (const char *)(ph->p_vaddr + obj->relocbase);
 	    break;
 
 	case PT_LOAD:
 	    if (nsegs == 0) {	/* First load segment */
 		obj->vaddrbase = trunc_page(ph->p_vaddr);
 		obj->mapbase = obj->vaddrbase + obj->relocbase;
 		obj->textsize = round_page(ph->p_vaddr + ph->p_memsz) -
 		  obj->vaddrbase;
 	    } else {		/* Last load segment */
 		obj->mapsize = round_page(ph->p_vaddr + ph->p_memsz) -
 		  obj->vaddrbase;
 	    }
 	    nsegs++;
 	    break;
 
 	case PT_DYNAMIC:
 	    obj->dynamic = (const Elf_Dyn *)(ph->p_vaddr + obj->relocbase);
 	    break;
 
 	case PT_TLS:
 	    obj->tlsindex = 1;
 	    obj->tlssize = ph->p_memsz;
 	    obj->tlsalign = ph->p_align;
 	    obj->tlsinitsize = ph->p_filesz;
 	    obj->tlsinit = (void*)(ph->p_vaddr + obj->relocbase);
 	    break;
 
 	case PT_GNU_STACK:
 	    obj->stack_flags = ph->p_flags;
 	    break;
 
 	case PT_GNU_RELRO:
 	    obj->relro_page = obj->relocbase + trunc_page(ph->p_vaddr);
 	    obj->relro_size = round_page(ph->p_memsz);
 	    break;
 
 	case PT_NOTE:
 	    note_start = (Elf_Addr)obj->relocbase + ph->p_vaddr;
 	    note_end = note_start + ph->p_filesz;
 	    digest_notes(obj, note_start, note_end);
 	    break;
 	}
     }
     if (nsegs < 1) {
 	_rtld_error("%s: too few PT_LOAD segments", path);
 	return NULL;
     }
 
     obj->entry = entry;
     return obj;
 }
 
 void
 digest_notes(Obj_Entry *obj, Elf_Addr note_start, Elf_Addr note_end)
 {
 	const Elf_Note *note;
 	const char *note_name;
 	uintptr_t p;
 
 	for (note = (const Elf_Note *)note_start; (Elf_Addr)note < note_end;
 	    note = (const Elf_Note *)((const char *)(note + 1) +
 	      roundup2(note->n_namesz, sizeof(Elf32_Addr)) +
 	      roundup2(note->n_descsz, sizeof(Elf32_Addr)))) {
 		if (note->n_namesz != sizeof(NOTE_FREEBSD_VENDOR) ||
 		    note->n_descsz != sizeof(int32_t))
 			continue;
 		if (note->n_type != ABI_NOTETYPE &&
 		    note->n_type != CRT_NOINIT_NOTETYPE)
 			continue;
 		note_name = (const char *)(note + 1);
 		if (strncmp(NOTE_FREEBSD_VENDOR, note_name,
 		    sizeof(NOTE_FREEBSD_VENDOR)) != 0)
 			continue;
 		switch (note->n_type) {
 		case ABI_NOTETYPE:
 			/* FreeBSD osrel note */
 			p = (uintptr_t)(note + 1);
 			p += roundup2(note->n_namesz, sizeof(Elf32_Addr));
 			obj->osrel = *(const int32_t *)(p);
 			dbg("note osrel %d", obj->osrel);
 			break;
 		case CRT_NOINIT_NOTETYPE:
 			/* FreeBSD 'crt does not call init' note */
 			obj->crt_no_init = true;
 			dbg("note crt_no_init");
 			break;
 		}
 	}
 }
 
 static Obj_Entry *
 dlcheck(void *handle)
 {
     Obj_Entry *obj;
 
     for (obj = obj_list;  obj != NULL;  obj = obj->next)
 	if (obj == (Obj_Entry *) handle)
 	    break;
 
     if (obj == NULL || obj->refcount == 0 || obj->dl_refcount == 0) {
 	_rtld_error("Invalid shared object handle %p", handle);
 	return NULL;
     }
     return obj;
 }
 
 /*
  * If the given object is already in the donelist, return true.  Otherwise
  * add the object to the list and return false.
  */
 static bool
 donelist_check(DoneList *dlp, const Obj_Entry *obj)
 {
     unsigned int i;
 
     for (i = 0;  i < dlp->num_used;  i++)
 	if (dlp->objs[i] == obj)
 	    return true;
     /*
      * Our donelist allocation should always be sufficient.  But if
      * our threads locking isn't working properly, more shared objects
      * could have been loaded since we allocated the list.  That should
      * never happen, but we'll handle it properly just in case it does.
      */
     if (dlp->num_used < dlp->num_alloc)
 	dlp->objs[dlp->num_used++] = obj;
     return false;
 }
 
 /*
  * Hash function for symbol table lookup.  Don't even think about changing
  * this.  It is specified by the System V ABI.
  */
 unsigned long
 elf_hash(const char *name)
 {
     const unsigned char *p = (const unsigned char *) name;
     unsigned long h = 0;
     unsigned long g;
 
     while (*p != '\0') {
 	h = (h << 4) + *p++;
 	if ((g = h & 0xf0000000) != 0)
 	    h ^= g >> 24;
 	h &= ~g;
     }
     return h;
 }
 
 /*
  * The GNU hash function is the Daniel J. Bernstein hash clipped to 32 bits
  * unsigned in case it's implemented with a wider type.
  */
 static uint32_t
 gnu_hash(const char *s)
 {
 	uint32_t h;
 	unsigned char c;
 
 	h = 5381;
 	for (c = *s; c != '\0'; c = *++s)
 		h = h * 33 + c;
 	return (h & 0xffffffff);
 }
 
 
 /*
  * Find the library with the given name, and return its full pathname.
  * The returned string is dynamically allocated.  Generates an error
  * message and returns NULL if the library cannot be found.
  *
  * If the second argument is non-NULL, then it refers to an already-
  * loaded shared object, whose library search path will be searched.
  *
  * If a library is successfully located via LD_LIBRARY_PATH_FDS, its
  * descriptor (which is close-on-exec) will be passed out via the third
  * argument.
  *
  * The search order is:
  *   DT_RPATH in the referencing file _unless_ DT_RUNPATH is present (1)
  *   DT_RPATH of the main object if DSO without defined DT_RUNPATH (1)
  *   LD_LIBRARY_PATH
  *   DT_RUNPATH in the referencing file
  *   ldconfig hints (if -z nodefaultlib, filter out default library directories
  *	 from list)
  *   /lib:/usr/lib _unless_ the referencing file is linked with -z nodefaultlib
  *
  * (1) Handled in digest_dynamic2 - rpath left NULL if runpath defined.
  */
 static char *
 find_library(const char *xname, const Obj_Entry *refobj, int *fdp)
 {
     char *pathname;
     char *name;
     bool nodeflib, objgiven;
 
     objgiven = refobj != NULL;
     if (strchr(xname, '/') != NULL) {	/* Hard coded pathname */
 	if (xname[0] != '/' && !trust) {
 	    _rtld_error("Absolute pathname required for shared object \"%s\"",
 	      xname);
 	    return NULL;
 	}
 	if (objgiven && refobj->z_origin) {
 		return (origin_subst(__DECONST(char *, xname),
 		    refobj->origin_path));
 	} else {
 		return (xstrdup(xname));
 	}
     }
 
     if (libmap_disable || !objgiven ||
 	(name = lm_find(refobj->path, xname)) == NULL)
 	name = (char *)xname;
 
     dbg(" Searching for \"%s\"", name);
 
     /*
      * If refobj->rpath != NULL, then refobj->runpath is NULL.  Fall
      * back to pre-conforming behaviour if user requested so with
      * LD_LIBRARY_PATH_RPATH environment variable and ignore -z
      * nodeflib.
      */
     if (objgiven && refobj->rpath != NULL && ld_library_path_rpath) {
 	if ((pathname = search_library_path(name, ld_library_path)) != NULL ||
 	  (refobj != NULL &&
 	  (pathname = search_library_path(name, refobj->rpath)) != NULL) ||
 	  (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL ||
           (pathname = search_library_path(name, gethints(false))) != NULL ||
 	  (pathname = search_library_path(name, STANDARD_LIBRARY_PATH)) != NULL)
 	    return (pathname);
     } else {
 	nodeflib = objgiven ? refobj->z_nodeflib : false;
 	if ((objgiven &&
 	  (pathname = search_library_path(name, refobj->rpath)) != NULL) ||
 	  (objgiven && refobj->runpath == NULL && refobj != obj_main &&
 	  (pathname = search_library_path(name, obj_main->rpath)) != NULL) ||
 	  (pathname = search_library_path(name, ld_library_path)) != NULL ||
 	  (objgiven &&
 	  (pathname = search_library_path(name, refobj->runpath)) != NULL) ||
 	  (pathname = search_library_pathfds(name, ld_library_dirs, fdp)) != NULL ||
 	  (pathname = search_library_path(name, gethints(nodeflib))) != NULL ||
 	  (objgiven && !nodeflib &&
 	  (pathname = search_library_path(name, STANDARD_LIBRARY_PATH)) != NULL))
 	    return (pathname);
     }
 
     if (objgiven && refobj->path != NULL) {
 	_rtld_error("Shared object \"%s\" not found, required by \"%s\"",
 	  name, basename(refobj->path));
     } else {
 	_rtld_error("Shared object \"%s\" not found", name);
     }
     return NULL;
 }
 
 /*
  * Given a symbol number in a referencing object, find the corresponding
  * definition of the symbol.  Returns a pointer to the symbol, or NULL if
  * no definition was found.  Returns a pointer to the Obj_Entry of the
  * defining object via the reference parameter DEFOBJ_OUT.
  */
 const Elf_Sym *
 find_symdef(unsigned long symnum, const Obj_Entry *refobj,
     const Obj_Entry **defobj_out, int flags, SymCache *cache,
     RtldLockState *lockstate)
 {
     const Elf_Sym *ref;
     const Elf_Sym *def;
     const Obj_Entry *defobj;
     SymLook req;
     const char *name;
     int res;
 
     /*
      * If we have already found this symbol, get the information from
      * the cache.
      */
     if (symnum >= refobj->dynsymcount)
 	return NULL;	/* Bad object */
     if (cache != NULL && cache[symnum].sym != NULL) {
 	*defobj_out = cache[symnum].obj;
 	return cache[symnum].sym;
     }
 
     ref = refobj->symtab + symnum;
     name = refobj->strtab + ref->st_name;
     def = NULL;
     defobj = NULL;
 
     /*
      * We don't have to do a full scale lookup if the symbol is local.
      * We know it will bind to the instance in this load module; to
      * which we already have a pointer (ie ref). By not doing a lookup,
      * we not only improve performance, but it also avoids unresolvable
      * symbols when local symbols are not in the hash table. This has
      * been seen with the ia64 toolchain.
      */
     if (ELF_ST_BIND(ref->st_info) != STB_LOCAL) {
 	if (ELF_ST_TYPE(ref->st_info) == STT_SECTION) {
 	    _rtld_error("%s: Bogus symbol table entry %lu", refobj->path,
 		symnum);
 	}
 	symlook_init(&req, name);
 	req.flags = flags;
 	req.ventry = fetch_ventry(refobj, symnum);
 	req.lockstate = lockstate;
 	res = symlook_default(&req, refobj);
 	if (res == 0) {
 	    def = req.sym_out;
 	    defobj = req.defobj_out;
 	}
     } else {
 	def = ref;
 	defobj = refobj;
     }
 
     /*
      * If we found no definition and the reference is weak, treat the
      * symbol as having the value zero.
      */
     if (def == NULL && ELF_ST_BIND(ref->st_info) == STB_WEAK) {
 	def = &sym_zero;
 	defobj = obj_main;
     }
 
     if (def != NULL) {
 	*defobj_out = defobj;
 	/* Record the information in the cache to avoid subsequent lookups. */
 	if (cache != NULL) {
 	    cache[symnum].sym = def;
 	    cache[symnum].obj = defobj;
 	}
     } else {
 	if (refobj != &obj_rtld)
 	    _rtld_error("%s: Undefined symbol \"%s\"", refobj->path, name);
     }
     return def;
 }
 
 /*
  * Return the search path from the ldconfig hints file, reading it if
  * necessary.  If nostdlib is true, then the default search paths are
  * not added to result.
  *
  * Returns NULL if there are problems with the hints file,
  * or if the search path there is empty.
  */
 static const char *
 gethints(bool nostdlib)
 {
 	static char *hints, *filtered_path;
 	static struct elfhints_hdr hdr;
 	struct fill_search_info_args sargs, hargs;
 	struct dl_serinfo smeta, hmeta, *SLPinfo, *hintinfo;
 	struct dl_serpath *SLPpath, *hintpath;
 	char *p;
 	unsigned int SLPndx, hintndx, fndx, fcount;
 	int fd;
 	size_t flen;
 	bool skip;
 
 	/* First call, read the hints file */
 	if (hints == NULL) {
 		/* Keep from trying again in case the hints file is bad. */
 		hints = "";
 
 		if ((fd = open(ld_elf_hints_path, O_RDONLY | O_CLOEXEC)) == -1)
 			return (NULL);
 		if (read(fd, &hdr, sizeof hdr) != sizeof hdr ||
 		    hdr.magic != ELFHINTS_MAGIC ||
 		    hdr.version != 1) {
 			close(fd);
 			return (NULL);
 		}
 		p = xmalloc(hdr.dirlistlen + 1);
 		if (lseek(fd, hdr.strtab + hdr.dirlist, SEEK_SET) == -1 ||
 		    read(fd, p, hdr.dirlistlen + 1) !=
 		    (ssize_t)hdr.dirlistlen + 1) {
 			free(p);
 			close(fd);
 			return (NULL);
 		}
 		hints = p;
 		close(fd);
 	}
 
 	/*
 	 * If the caller agreed to receive a list that includes the
 	 * default paths, we are done.  Otherwise, if we have not yet
 	 * calculated the filtered result, do it now.
 	 */
 	if (!nostdlib)
 		return (hints[0] != '\0' ? hints : NULL);
 	if (filtered_path != NULL)
 		goto filt_ret;
 
 	/*
 	 * Obtain the list of all configured search paths, and the
 	 * list of the default paths.
 	 *
 	 * First estimate the size of the results.
 	 */
 	smeta.dls_size = __offsetof(struct dl_serinfo, dls_serpath);
 	smeta.dls_cnt = 0;
 	hmeta.dls_size = __offsetof(struct dl_serinfo, dls_serpath);
 	hmeta.dls_cnt = 0;
 
 	sargs.request = RTLD_DI_SERINFOSIZE;
 	sargs.serinfo = &smeta;
 	hargs.request = RTLD_DI_SERINFOSIZE;
 	hargs.serinfo = &hmeta;
 
 	path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &sargs);
 	path_enumerate(hints, fill_search_info, &hargs);
 
 	SLPinfo = xmalloc(smeta.dls_size);
 	hintinfo = xmalloc(hmeta.dls_size);
 
 	/*
 	 * Next fetch both sets of paths.
 	 */
 	sargs.request = RTLD_DI_SERINFO;
 	sargs.serinfo = SLPinfo;
 	sargs.serpath = &SLPinfo->dls_serpath[0];
 	sargs.strspace = (char *)&SLPinfo->dls_serpath[smeta.dls_cnt];
 
 	hargs.request = RTLD_DI_SERINFO;
 	hargs.serinfo = hintinfo;
 	hargs.serpath = &hintinfo->dls_serpath[0];
 	hargs.strspace = (char *)&hintinfo->dls_serpath[hmeta.dls_cnt];
 
 	path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &sargs);
 	path_enumerate(hints, fill_search_info, &hargs);
 
 	/*
 	 * Now calculate the difference between two sets, by excluding
 	 * standard paths from the full set.
 	 */
 	fndx = 0;
 	fcount = 0;
 	filtered_path = xmalloc(hdr.dirlistlen + 1);
 	hintpath = &hintinfo->dls_serpath[0];
 	for (hintndx = 0; hintndx < hmeta.dls_cnt; hintndx++, hintpath++) {
 		skip = false;
 		SLPpath = &SLPinfo->dls_serpath[0];
 		/*
 		 * Check each standard path against current.
 		 */
 		for (SLPndx = 0; SLPndx < smeta.dls_cnt; SLPndx++, SLPpath++) {
 			/* matched, skip the path */
 			if (!strcmp(hintpath->dls_name, SLPpath->dls_name)) {
 				skip = true;
 				break;
 			}
 		}
 		if (skip)
 			continue;
 		/*
 		 * Not matched against any standard path, add the path
 		 * to result. Separate consecutive paths with ':'.
 		 */
 		if (fcount > 0) {
 			filtered_path[fndx] = ':';
 			fndx++;
 		}
 		fcount++;
 		flen = strlen(hintpath->dls_name);
 		strncpy((filtered_path + fndx),	hintpath->dls_name, flen);
 		fndx += flen;
 	}
 	filtered_path[fndx] = '\0';
 
 	free(SLPinfo);
 	free(hintinfo);
 
 filt_ret:
 	return (filtered_path[0] != '\0' ? filtered_path : NULL);
 }
 
 static void
 init_dag(Obj_Entry *root)
 {
     const Needed_Entry *needed;
     const Objlist_Entry *elm;
     DoneList donelist;
 
     if (root->dag_inited)
 	return;
     donelist_init(&donelist);
 
     /* Root object belongs to own DAG. */
     objlist_push_tail(&root->dldags, root);
     objlist_push_tail(&root->dagmembers, root);
     donelist_check(&donelist, root);
 
     /*
      * Add dependencies of root object to DAG in breadth order
      * by exploiting the fact that each new object gets added
      * to the tail of the dagmembers list.
      */
     STAILQ_FOREACH(elm, &root->dagmembers, link) {
 	for (needed = elm->obj->needed; needed != NULL; needed = needed->next) {
 	    if (needed->obj == NULL || donelist_check(&donelist, needed->obj))
 		continue;
 	    objlist_push_tail(&needed->obj->dldags, root);
 	    objlist_push_tail(&root->dagmembers, needed->obj);
 	}
     }
     root->dag_inited = true;
 }
 
 static void
-process_nodelete(Obj_Entry *root)
+process_z(Obj_Entry *root)
 {
 	const Objlist_Entry *elm;
+	Obj_Entry *obj;
 
 	/*
-	 * Walk over object DAG and process every dependent object that
-	 * is marked as DF_1_NODELETE. They need to grow their own DAG,
-	 * which then should have its reference upped separately.
+	 * Walk over object DAG and process every dependent object
+	 * that is marked as DF_1_NODELETE or DF_1_GLOBAL. They need
+	 * to grow their own DAG.
+	 *
+	 * For DF_1_GLOBAL, DAG is required for symbol lookups in
+	 * symlook_global() to work.
+	 *
+	 * For DF_1_NODELETE, the DAG should have its reference upped.
 	 */
 	STAILQ_FOREACH(elm, &root->dagmembers, link) {
-		if (elm->obj != NULL && elm->obj->z_nodelete &&
-		    !elm->obj->ref_nodel) {
-			dbg("obj %s nodelete", elm->obj->path);
-			init_dag(elm->obj);
-			ref_dag(elm->obj);
-			elm->obj->ref_nodel = true;
+		obj = elm->obj;
+		if (obj == NULL)
+			continue;
+		if (obj->z_nodelete && !obj->ref_nodel) {
+			dbg("obj %s -z nodelete", obj->path);
+			init_dag(obj);
+			ref_dag(obj);
+			obj->ref_nodel = true;
 		}
+		if (obj->z_global && objlist_find(&list_global, obj) == NULL) {
+			dbg("obj %s -z global", obj->path);
+			objlist_push_tail(&list_global, obj);
+			init_dag(obj);
+		}
 	}
 }
 /*
  * Initialize the dynamic linker.  The argument is the address at which
  * the dynamic linker has been mapped into memory.  The primary task of
  * this function is to relocate the dynamic linker.
  */
 static void
 init_rtld(caddr_t mapbase, Elf_Auxinfo **aux_info)
 {
     Obj_Entry objtmp;	/* Temporary rtld object */
     const Elf_Dyn *dyn_rpath;
     const Elf_Dyn *dyn_soname;
     const Elf_Dyn *dyn_runpath;
 
 #ifdef RTLD_INIT_PAGESIZES_EARLY
     /* The page size is required by the dynamic memory allocator. */
     init_pagesizes(aux_info);
 #endif
 
     /*
      * Conjure up an Obj_Entry structure for the dynamic linker.
      *
      * The "path" member can't be initialized yet because string constants
      * cannot yet be accessed. Below we will set it correctly.
      */
     memset(&objtmp, 0, sizeof(objtmp));
     objtmp.path = NULL;
     objtmp.rtld = true;
     objtmp.mapbase = mapbase;
 #ifdef PIC
     objtmp.relocbase = mapbase;
 #endif
     if (RTLD_IS_DYNAMIC()) {
 	objtmp.dynamic = rtld_dynamic(&objtmp);
 	digest_dynamic1(&objtmp, 1, &dyn_rpath, &dyn_soname, &dyn_runpath);
 	assert(objtmp.needed == NULL);
 #if !defined(__mips__)
 	/* MIPS has a bogus DT_TEXTREL. */
 	assert(!objtmp.textrel);
 #endif
 
 	/*
 	 * Temporarily put the dynamic linker entry into the object list, so
 	 * that symbols can be found.
 	 */
 
 	relocate_objects(&objtmp, true, &objtmp, 0, NULL);
     }
 
     /* Initialize the object list. */
     obj_tail = &obj_list;
 
     /* Now that non-local variables can be accessed, copy out obj_rtld. */
     memcpy(&obj_rtld, &objtmp, sizeof(obj_rtld));
 
 #ifndef RTLD_INIT_PAGESIZES_EARLY
     /* The page size is required by the dynamic memory allocator. */
     init_pagesizes(aux_info);
 #endif
 
     if (aux_info[AT_OSRELDATE] != NULL)
 	    osreldate = aux_info[AT_OSRELDATE]->a_un.a_val;
 
     digest_dynamic2(&obj_rtld, dyn_rpath, dyn_soname, dyn_runpath);
 
     /* Replace the path with a dynamically allocated copy. */
     obj_rtld.path = xstrdup(PATH_RTLD);
 
     r_debug.r_brk = r_debug_state;
     r_debug.r_state = RT_CONSISTENT;
 }
 
 /*
  * Retrieve the array of supported page sizes.  The kernel provides the page
  * sizes in increasing order.
  */
 static void
 init_pagesizes(Elf_Auxinfo **aux_info)
 {
 	static size_t psa[MAXPAGESIZES];
 	int mib[2];
 	size_t len, size;
 
 	if (aux_info[AT_PAGESIZES] != NULL && aux_info[AT_PAGESIZESLEN] !=
 	    NULL) {
 		size = aux_info[AT_PAGESIZESLEN]->a_un.a_val;
 		pagesizes = aux_info[AT_PAGESIZES]->a_un.a_ptr;
 	} else {
 		len = 2;
 		if (sysctlnametomib("hw.pagesizes", mib, &len) == 0)
 			size = sizeof(psa);
 		else {
 			/* As a fallback, retrieve the base page size. */
 			size = sizeof(psa[0]);
 			if (aux_info[AT_PAGESZ] != NULL) {
 				psa[0] = aux_info[AT_PAGESZ]->a_un.a_val;
 				goto psa_filled;
 			} else {
 				mib[0] = CTL_HW;
 				mib[1] = HW_PAGESIZE;
 				len = 2;
 			}
 		}
 		if (sysctl(mib, len, psa, &size, NULL, 0) == -1) {
 			_rtld_error("sysctl for hw.pagesize(s) failed");
 			rtld_die();
 		}
 psa_filled:
 		pagesizes = psa;
 	}
 	npagesizes = size / sizeof(pagesizes[0]);
 	/* Discard any invalid entries at the end of the array. */
 	while (npagesizes > 0 && pagesizes[npagesizes - 1] == 0)
 		npagesizes--;
 }
 
 /*
  * Add the init functions from a needed object list (and its recursive
  * needed objects) to "list".  This is not used directly; it is a helper
  * function for initlist_add_objects().  The write lock must be held
  * when this function is called.
  */
 static void
 initlist_add_neededs(Needed_Entry *needed, Objlist *list)
 {
     /* Recursively process the successor needed objects. */
     if (needed->next != NULL)
 	initlist_add_neededs(needed->next, list);
 
     /* Process the current needed object. */
     if (needed->obj != NULL)
 	initlist_add_objects(needed->obj, &needed->obj->next, list);
 }
 
 /*
  * Scan all of the DAGs rooted in the range of objects from "obj" to
  * "tail" and add their init functions to "list".  This recurses over
  * the DAGs and ensures the proper init ordering such that each object's
  * needed libraries are initialized before the object itself.  At the
  * same time, this function adds the objects to the global finalization
  * list "list_fini" in the opposite order.  The write lock must be
  * held when this function is called.
  */
 static void
 initlist_add_objects(Obj_Entry *obj, Obj_Entry **tail, Objlist *list)
 {
 
     if (obj->init_scanned || obj->init_done)
 	return;
     obj->init_scanned = true;
 
     /* Recursively process the successor objects. */
     if (&obj->next != tail)
 	initlist_add_objects(obj->next, tail, list);
 
     /* Recursively process the needed objects. */
     if (obj->needed != NULL)
 	initlist_add_neededs(obj->needed, list);
     if (obj->needed_filtees != NULL)
 	initlist_add_neededs(obj->needed_filtees, list);
     if (obj->needed_aux_filtees != NULL)
 	initlist_add_neededs(obj->needed_aux_filtees, list);
 
     /* Add the object to the init list. */
     if (obj->preinit_array != (Elf_Addr)NULL || obj->init != (Elf_Addr)NULL ||
       obj->init_array != (Elf_Addr)NULL)
 	objlist_push_tail(list, obj);
 
     /* Add the object to the global fini list in the reverse order. */
     if ((obj->fini != (Elf_Addr)NULL || obj->fini_array != (Elf_Addr)NULL)
       && !obj->on_fini_list) {
 	objlist_push_head(&list_fini, obj);
 	obj->on_fini_list = true;
     }
 }
 
 #ifndef FPTR_TARGET
 #define FPTR_TARGET(f)	((Elf_Addr) (f))
 #endif
 
 static void
 free_needed_filtees(Needed_Entry *n)
 {
     Needed_Entry *needed, *needed1;
 
     for (needed = n; needed != NULL; needed = needed->next) {
 	if (needed->obj != NULL) {
 	    dlclose(needed->obj);
 	    needed->obj = NULL;
 	}
     }
     for (needed = n; needed != NULL; needed = needed1) {
 	needed1 = needed->next;
 	free(needed);
     }
 }
 
 static void
 unload_filtees(Obj_Entry *obj)
 {
 
     free_needed_filtees(obj->needed_filtees);
     obj->needed_filtees = NULL;
     free_needed_filtees(obj->needed_aux_filtees);
     obj->needed_aux_filtees = NULL;
     obj->filtees_loaded = false;
 }
 
 static void
 load_filtee1(Obj_Entry *obj, Needed_Entry *needed, int flags,
     RtldLockState *lockstate)
 {
 
     for (; needed != NULL; needed = needed->next) {
 	needed->obj = dlopen_object(obj->strtab + needed->name, -1, obj,
 	  flags, ((ld_loadfltr || obj->z_loadfltr) ? RTLD_NOW : RTLD_LAZY) |
 	  RTLD_LOCAL, lockstate);
     }
 }
 
 static void
 load_filtees(Obj_Entry *obj, int flags, RtldLockState *lockstate)
 {
 
     lock_restart_for_upgrade(lockstate);
     if (!obj->filtees_loaded) {
 	load_filtee1(obj, obj->needed_filtees, flags, lockstate);
 	load_filtee1(obj, obj->needed_aux_filtees, flags, lockstate);
 	obj->filtees_loaded = true;
     }
 }
 
 static int
 process_needed(Obj_Entry *obj, Needed_Entry *needed, int flags)
 {
     Obj_Entry *obj1;
 
     for (; needed != NULL; needed = needed->next) {
 	obj1 = needed->obj = load_object(obj->strtab + needed->name, -1, obj,
 	  flags & ~RTLD_LO_NOLOAD);
 	if (obj1 == NULL && !ld_tracing && (flags & RTLD_LO_FILTEES) == 0)
 	    return (-1);
     }
     return (0);
 }
 
 /*
  * Given a shared object, traverse its list of needed objects, and load
  * each of them.  Returns 0 on success.  Generates an error message and
  * returns -1 on failure.
  */
 static int
 load_needed_objects(Obj_Entry *first, int flags)
 {
     Obj_Entry *obj;
 
     for (obj = first;  obj != NULL;  obj = obj->next) {
 	if (process_needed(obj, obj->needed, flags) == -1)
 	    return (-1);
     }
     return (0);
 }
 
 static int
 load_preload_objects(void)
 {
     char *p = ld_preload;
     Obj_Entry *obj;
     static const char delim[] = " \t:;";
 
     if (p == NULL)
 	return 0;
 
     p += strspn(p, delim);
     while (*p != '\0') {
 	size_t len = strcspn(p, delim);
 	char savech;
 
 	savech = p[len];
 	p[len] = '\0';
 	obj = load_object(p, -1, NULL, 0);
 	if (obj == NULL)
 	    return -1;	/* XXX - cleanup */
 	obj->z_interpose = true;
 	p[len] = savech;
 	p += len;
 	p += strspn(p, delim);
     }
     LD_UTRACE(UTRACE_PRELOAD_FINISHED, NULL, NULL, 0, 0, NULL);
     return 0;
 }
 
 static const char *
 printable_path(const char *path)
 {
 
 	return (path == NULL ? "<unknown>" : path);
 }
 
 /*
  * Load a shared object into memory, if it is not already loaded.  The
  * object may be specified by name or by user-supplied file descriptor
  * fd_u. In the latter case, the fd_u descriptor is not closed, but its
  * duplicate is.
  *
  * Returns a pointer to the Obj_Entry for the object.  Returns NULL
  * on failure.
  */
 static Obj_Entry *
 load_object(const char *name, int fd_u, const Obj_Entry *refobj, int flags)
 {
     Obj_Entry *obj;
     int fd;
     struct stat sb;
     char *path;
 
     fd = -1;
     if (name != NULL) {
 	for (obj = obj_list->next;  obj != NULL;  obj = obj->next) {
 	    if (object_match_name(obj, name))
 		return (obj);
 	}
 
 	path = find_library(name, refobj, &fd);
 	if (path == NULL)
 	    return (NULL);
     } else
 	path = NULL;
 
     if (fd >= 0) {
 	/*
 	 * search_library_pathfds() opens a fresh file descriptor for the
 	 * library, so there is no need to dup().
 	 */
     } else if (fd_u == -1) {
 	/*
 	 * If we didn't find a match by pathname, or the name is not
 	 * supplied, open the file and check again by device and inode.
 	 * This avoids false mismatches caused by multiple links or ".."
 	 * in pathnames.
 	 *
 	 * To avoid a race, we open the file and use fstat() rather than
 	 * using stat().
 	 */
 	if ((fd = open(path, O_RDONLY | O_CLOEXEC)) == -1) {
 	    _rtld_error("Cannot open \"%s\"", path);
 	    free(path);
 	    return (NULL);
 	}
     } else {
 	fd = fcntl(fd_u, F_DUPFD_CLOEXEC, 0);
 	if (fd == -1) {
 	    _rtld_error("Cannot dup fd");
 	    free(path);
 	    return (NULL);
 	}
     }
     if (fstat(fd, &sb) == -1) {
 	_rtld_error("Cannot fstat \"%s\"", printable_path(path));
 	close(fd);
 	free(path);
 	return NULL;
     }
     for (obj = obj_list->next;  obj != NULL;  obj = obj->next)
 	if (obj->ino == sb.st_ino && obj->dev == sb.st_dev)
 	    break;
     if (obj != NULL && name != NULL) {
 	object_add_name(obj, name);
 	free(path);
 	close(fd);
 	return obj;
     }
     if (flags & RTLD_LO_NOLOAD) {
 	free(path);
 	close(fd);
 	return (NULL);
     }
 
     /* First use of this object, so we must map it in */
     obj = do_load_object(fd, name, path, &sb, flags);
     if (obj == NULL)
 	free(path);
     close(fd);
 
     return obj;
 }
 
 static Obj_Entry *
 do_load_object(int fd, const char *name, char *path, struct stat *sbp,
   int flags)
 {
     Obj_Entry *obj;
     struct statfs fs;
 
     /*
      * but first, make sure that environment variables haven't been
      * used to circumvent the noexec flag on a filesystem.
      */
     if (dangerous_ld_env) {
 	if (fstatfs(fd, &fs) != 0) {
 	    _rtld_error("Cannot fstatfs \"%s\"", printable_path(path));
 	    return NULL;
 	}
 	if (fs.f_flags & MNT_NOEXEC) {
 	    _rtld_error("Cannot execute objects on %s\n", fs.f_mntonname);
 	    return NULL;
 	}
     }
     dbg("loading \"%s\"", printable_path(path));
     obj = map_object(fd, printable_path(path), sbp);
     if (obj == NULL)
         return NULL;
 
     /*
      * If DT_SONAME is present in the object, digest_dynamic2 already
      * added it to the object names.
      */
     if (name != NULL)
 	object_add_name(obj, name);
     obj->path = path;
     digest_dynamic(obj, 0);
     dbg("%s valid_hash_sysv %d valid_hash_gnu %d dynsymcount %d", obj->path,
 	obj->valid_hash_sysv, obj->valid_hash_gnu, obj->dynsymcount);
     if (obj->z_noopen && (flags & (RTLD_LO_DLOPEN | RTLD_LO_TRACE)) ==
       RTLD_LO_DLOPEN) {
 	dbg("refusing to load non-loadable \"%s\"", obj->path);
 	_rtld_error("Cannot dlopen non-loadable %s", obj->path);
 	munmap(obj->mapbase, obj->mapsize);
 	obj_free(obj);
 	return (NULL);
     }
 
     obj->dlopened = (flags & RTLD_LO_DLOPEN) != 0;
     *obj_tail = obj;
     obj_tail = &obj->next;
     obj_count++;
     obj_loads++;
     linkmap_add(obj);	/* for GDB & dlinfo() */
     max_stack_flags |= obj->stack_flags;
 
     dbg("  %p .. %p: %s", obj->mapbase,
          obj->mapbase + obj->mapsize - 1, obj->path);
     if (obj->textrel)
 	dbg("  WARNING: %s has impure text", obj->path);
     LD_UTRACE(UTRACE_LOAD_OBJECT, obj, obj->mapbase, obj->mapsize, 0,
 	obj->path);    
 
     return obj;
 }
 
 static Obj_Entry *
 obj_from_addr(const void *addr)
 {
     Obj_Entry *obj;
 
     for (obj = obj_list;  obj != NULL;  obj = obj->next) {
 	if (addr < (void *) obj->mapbase)
 	    continue;
 	if (addr < (void *) (obj->mapbase + obj->mapsize))
 	    return obj;
     }
     return NULL;
 }
 
 static void
 preinit_main(void)
 {
     Elf_Addr *preinit_addr;
     int index;
 
     preinit_addr = (Elf_Addr *)obj_main->preinit_array;
     if (preinit_addr == NULL)
 	return;
 
     for (index = 0; index < obj_main->preinit_array_num; index++) {
 	if (preinit_addr[index] != 0 && preinit_addr[index] != 1) {
 	    dbg("calling preinit function for %s at %p", obj_main->path,
 	      (void *)preinit_addr[index]);
 	    LD_UTRACE(UTRACE_INIT_CALL, obj_main, (void *)preinit_addr[index],
 	      0, 0, obj_main->path);
 	    call_init_pointer(obj_main, preinit_addr[index]);
 	}
     }
 }
 
 /*
  * Call the finalization functions for each of the objects in "list"
  * belonging to the DAG of "root" and referenced once. If NULL "root"
  * is specified, every finalization function will be called regardless
  * of the reference count and the list elements won't be freed. All of
  * the objects are expected to have non-NULL fini functions.
  */
 static void
 objlist_call_fini(Objlist *list, Obj_Entry *root, RtldLockState *lockstate)
 {
     Objlist_Entry *elm;
     char *saved_msg;
     Elf_Addr *fini_addr;
     int index;
 
     assert(root == NULL || root->refcount == 1);
 
     /*
      * Preserve the current error message since a fini function might
      * call into the dynamic linker and overwrite it.
      */
     saved_msg = errmsg_save();
     do {
 	STAILQ_FOREACH(elm, list, link) {
 	    if (root != NULL && (elm->obj->refcount != 1 ||
 	      objlist_find(&root->dagmembers, elm->obj) == NULL))
 		continue;
 	    /* Remove object from fini list to prevent recursive invocation. */
 	    STAILQ_REMOVE(list, elm, Struct_Objlist_Entry, link);
 	    /*
 	     * XXX: If a dlopen() call references an object while the
 	     * fini function is in progress, we might end up trying to
 	     * unload the referenced object in dlclose() or the object
 	     * won't be unloaded although its fini function has been
 	     * called.
 	     */
 	    lock_release(rtld_bind_lock, lockstate);
 
 	    /*
 	     * It is legal to have both DT_FINI and DT_FINI_ARRAY defined.
 	     * When this happens, DT_FINI_ARRAY is processed first.
 	     */
 	    fini_addr = (Elf_Addr *)elm->obj->fini_array;
 	    if (fini_addr != NULL && elm->obj->fini_array_num > 0) {
 		for (index = elm->obj->fini_array_num - 1; index >= 0;
 		  index--) {
 		    if (fini_addr[index] != 0 && fini_addr[index] != 1) {
 			dbg("calling fini function for %s at %p",
 			    elm->obj->path, (void *)fini_addr[index]);
 			LD_UTRACE(UTRACE_FINI_CALL, elm->obj,
 			    (void *)fini_addr[index], 0, 0, elm->obj->path);
 			call_initfini_pointer(elm->obj, fini_addr[index]);
 		    }
 		}
 	    }
 	    if (elm->obj->fini != (Elf_Addr)NULL) {
 		dbg("calling fini function for %s at %p", elm->obj->path,
 		    (void *)elm->obj->fini);
 		LD_UTRACE(UTRACE_FINI_CALL, elm->obj, (void *)elm->obj->fini,
 		    0, 0, elm->obj->path);
 		call_initfini_pointer(elm->obj, elm->obj->fini);
 	    }
 	    wlock_acquire(rtld_bind_lock, lockstate);
 	    /* No need to free anything if process is going down. */
 	    if (root != NULL)
 	    	free(elm);
 	    /*
 	     * We must restart the list traversal after every fini call
 	     * because a dlclose() call from the fini function or from
 	     * another thread might have modified the reference counts.
 	     */
 	    break;
 	}
     } while (elm != NULL);
     errmsg_restore(saved_msg);
 }
 
 /*
  * Call the initialization functions for each of the objects in
  * "list".  All of the objects are expected to have non-NULL init
  * functions.
  */
 static void
 objlist_call_init(Objlist *list, RtldLockState *lockstate)
 {
     Objlist_Entry *elm;
     Obj_Entry *obj;
     char *saved_msg;
     Elf_Addr *init_addr;
     int index;
 
     /*
      * Clear the init_scanned flag so that objects can be rechecked and
      * possibly initialized earlier if any of the vectors called below
      * cause a change by using dlopen.
      */
     for (obj = obj_list;  obj != NULL;  obj = obj->next)
 	obj->init_scanned = false;
 
     /*
      * Preserve the current error message since an init function might
      * call into the dynamic linker and overwrite it.
      */
     saved_msg = errmsg_save();
     STAILQ_FOREACH(elm, list, link) {
 	if (elm->obj->init_done) /* Initialized early. */
 	    continue;
 	/*
 	 * Race: another thread might try to use this object before the
 	 * current one completes the initialization. Not much can be done
 	 * here without better locking.
 	 */
 	elm->obj->init_done = true;
 	lock_release(rtld_bind_lock, lockstate);
 
         /*
          * It is legal to have both DT_INIT and DT_INIT_ARRAY defined.
          * When this happens, DT_INIT is processed first.
          */
 	if (elm->obj->init != (Elf_Addr)NULL) {
 	    dbg("calling init function for %s at %p", elm->obj->path,
 	        (void *)elm->obj->init);
 	    LD_UTRACE(UTRACE_INIT_CALL, elm->obj, (void *)elm->obj->init,
 	        0, 0, elm->obj->path);
 	    call_initfini_pointer(elm->obj, elm->obj->init);
 	}
 	init_addr = (Elf_Addr *)elm->obj->init_array;
 	if (init_addr != NULL) {
 	    for (index = 0; index < elm->obj->init_array_num; index++) {
 		if (init_addr[index] != 0 && init_addr[index] != 1) {
 		    dbg("calling init function for %s at %p", elm->obj->path,
 			(void *)init_addr[index]);
 		    LD_UTRACE(UTRACE_INIT_CALL, elm->obj,
 			(void *)init_addr[index], 0, 0, elm->obj->path);
 		    call_init_pointer(elm->obj, init_addr[index]);
 		}
 	    }
 	}
 	wlock_acquire(rtld_bind_lock, lockstate);
     }
     errmsg_restore(saved_msg);
 }
 
 static void
 objlist_clear(Objlist *list)
 {
     Objlist_Entry *elm;
 
     while (!STAILQ_EMPTY(list)) {
 	elm = STAILQ_FIRST(list);
 	STAILQ_REMOVE_HEAD(list, link);
 	free(elm);
     }
 }
 
 static Objlist_Entry *
 objlist_find(Objlist *list, const Obj_Entry *obj)
 {
     Objlist_Entry *elm;
 
     STAILQ_FOREACH(elm, list, link)
 	if (elm->obj == obj)
 	    return elm;
     return NULL;
 }
 
 static void
 objlist_init(Objlist *list)
 {
     STAILQ_INIT(list);
 }
 
 static void
 objlist_push_head(Objlist *list, Obj_Entry *obj)
 {
     Objlist_Entry *elm;
 
     elm = NEW(Objlist_Entry);
     elm->obj = obj;
     STAILQ_INSERT_HEAD(list, elm, link);
 }
 
 static void
 objlist_push_tail(Objlist *list, Obj_Entry *obj)
 {
     Objlist_Entry *elm;
 
     elm = NEW(Objlist_Entry);
     elm->obj = obj;
     STAILQ_INSERT_TAIL(list, elm, link);
 }
 
 static void
 objlist_put_after(Objlist *list, Obj_Entry *listobj, Obj_Entry *obj)
 {
 	Objlist_Entry *elm, *listelm;
 
 	STAILQ_FOREACH(listelm, list, link) {
 		if (listelm->obj == listobj)
 			break;
 	}
 	elm = NEW(Objlist_Entry);
 	elm->obj = obj;
 	if (listelm != NULL)
 		STAILQ_INSERT_AFTER(list, listelm, elm, link);
 	else
 		STAILQ_INSERT_TAIL(list, elm, link);
 }
 
 static void
 objlist_remove(Objlist *list, Obj_Entry *obj)
 {
     Objlist_Entry *elm;
 
     if ((elm = objlist_find(list, obj)) != NULL) {
 	STAILQ_REMOVE(list, elm, Struct_Objlist_Entry, link);
 	free(elm);
     }
 }
 
 /*
  * Relocate dag rooted in the specified object.
  * Returns 0 on success, or -1 on failure.
  */
 
 static int
 relocate_object_dag(Obj_Entry *root, bool bind_now, Obj_Entry *rtldobj,
     int flags, RtldLockState *lockstate)
 {
 	Objlist_Entry *elm;
 	int error;
 
 	error = 0;
 	STAILQ_FOREACH(elm, &root->dagmembers, link) {
 		error = relocate_object(elm->obj, bind_now, rtldobj, flags,
 		    lockstate);
 		if (error == -1)
 			break;
 	}
 	return (error);
 }
 
 /*
  * Relocate single object.
  * Returns 0 on success, or -1 on failure.
  */
 static int
 relocate_object(Obj_Entry *obj, bool bind_now, Obj_Entry *rtldobj,
     int flags, RtldLockState *lockstate)
 {
 
 	if (obj->relocated)
 		return (0);
 	obj->relocated = true;
 	if (obj != rtldobj)
 		dbg("relocating \"%s\"", obj->path);
 
 	if (obj->symtab == NULL || obj->strtab == NULL ||
 	    !(obj->valid_hash_sysv || obj->valid_hash_gnu)) {
 		_rtld_error("%s: Shared object has no run-time symbol table",
 			    obj->path);
 		return (-1);
 	}
 
 	if (obj->textrel) {
 		/* There are relocations to the write-protected text segment. */
 		if (mprotect(obj->mapbase, obj->textsize,
 		    PROT_READ|PROT_WRITE|PROT_EXEC) == -1) {
 			_rtld_error("%s: Cannot write-enable text segment: %s",
 			    obj->path, rtld_strerror(errno));
 			return (-1);
 		}
 	}
 
 	/* Process the non-PLT non-IFUNC relocations. */
 	if (reloc_non_plt(obj, rtldobj, flags, lockstate))
 		return (-1);
 
 	if (obj->textrel) {	/* Re-protect the text segment. */
 		if (mprotect(obj->mapbase, obj->textsize,
 		    PROT_READ|PROT_EXEC) == -1) {
 			_rtld_error("%s: Cannot write-protect text segment: %s",
 			    obj->path, rtld_strerror(errno));
 			return (-1);
 		}
 	}
 
 	/* Set the special PLT or GOT entries. */
 	init_pltgot(obj);
 
 	/* Process the PLT relocations. */
 	if (reloc_plt(obj) == -1)
 		return (-1);
 	/* Relocate the jump slots if we are doing immediate binding. */
 	if (obj->bind_now || bind_now)
 		if (reloc_jmpslots(obj, flags, lockstate) == -1)
 			return (-1);
 
 	/*
 	 * Process the non-PLT IFUNC relocations.  The relocations are
 	 * processed in two phases, because IFUNC resolvers may
 	 * reference other symbols, which must already be processed
 	 * before resolvers are called.
 	 */
 	if (obj->non_plt_gnu_ifunc &&
 	    reloc_non_plt(obj, rtldobj, flags | SYMLOOK_IFUNC, lockstate))
 		return (-1);
 
 	if (obj->relro_size > 0) {
 		if (mprotect(obj->relro_page, obj->relro_size,
 		    PROT_READ) == -1) {
 			_rtld_error("%s: Cannot enforce relro protection: %s",
 			    obj->path, rtld_strerror(errno));
 			return (-1);
 		}
 	}
 
 	/*
 	 * Set up the magic number and version in the Obj_Entry.  These
 	 * were checked in the crt1.o from the original ElfKit, so we
 	 * set them for backward compatibility.
 	 */
 	obj->magic = RTLD_MAGIC;
 	obj->version = RTLD_VERSION;
 
 	return (0);
 }
 
 /*
  * Relocate newly-loaded shared objects.  The argument is a pointer to
  * the Obj_Entry for the first such object.  All objects from the first
  * to the end of the list of objects are relocated.  Returns 0 on success,
  * or -1 on failure.
  */
 static int
 relocate_objects(Obj_Entry *first, bool bind_now, Obj_Entry *rtldobj,
     int flags, RtldLockState *lockstate)
 {
 	Obj_Entry *obj;
 	int error;
 
 	for (error = 0, obj = first;  obj != NULL;  obj = obj->next) {
 		error = relocate_object(obj, bind_now, rtldobj, flags,
 		    lockstate);
 		if (error == -1)
 			break;
 	}
 	return (error);
 }
 
 /*
  * The handling of R_MACHINE_IRELATIVE relocations and jumpslots
  * referencing STT_GNU_IFUNC symbols is postponed till the other
  * relocations are done.  The indirect functions specified as
  * ifunc are allowed to call other symbols, so we need to have
  * objects relocated before asking for resolution from indirects.
  *
  * The R_MACHINE_IRELATIVE slots are resolved in greedy fashion,
  * instead of the usual lazy handling of PLT slots.  It is
  * consistent with how GNU does it.
  */
 static int
 resolve_object_ifunc(Obj_Entry *obj, bool bind_now, int flags,
     RtldLockState *lockstate)
 {
 	if (obj->irelative && reloc_iresolve(obj, lockstate) == -1)
 		return (-1);
 	if ((obj->bind_now || bind_now) && obj->gnu_ifunc &&
 	    reloc_gnu_ifunc(obj, flags, lockstate) == -1)
 		return (-1);
 	return (0);
 }
 
 static int
 resolve_objects_ifunc(Obj_Entry *first, bool bind_now, int flags,
     RtldLockState *lockstate)
 {
 	Obj_Entry *obj;
 
 	for (obj = first;  obj != NULL;  obj = obj->next) {
 		if (resolve_object_ifunc(obj, bind_now, flags, lockstate) == -1)
 			return (-1);
 	}
 	return (0);
 }
 
 static int
 initlist_objects_ifunc(Objlist *list, bool bind_now, int flags,
     RtldLockState *lockstate)
 {
 	Objlist_Entry *elm;
 
 	STAILQ_FOREACH(elm, list, link) {
 		if (resolve_object_ifunc(elm->obj, bind_now, flags,
 		    lockstate) == -1)
 			return (-1);
 	}
 	return (0);
 }
 
 /*
  * Cleanup procedure.  It will be called (by the atexit mechanism) just
  * before the process exits.
  */
 static void
 rtld_exit(void)
 {
     RtldLockState lockstate;
 
     wlock_acquire(rtld_bind_lock, &lockstate);
     dbg("rtld_exit()");
     objlist_call_fini(&list_fini, NULL, &lockstate);
     /* No need to remove the items from the list, since we are exiting. */
     if (!libmap_disable)
         lm_fini();
     lock_release(rtld_bind_lock, &lockstate);
 }
 
 /*
  * Iterate over a search path, translate each element, and invoke the
  * callback on the result.
  */
 static void *
 path_enumerate(const char *path, path_enum_proc callback, void *arg)
 {
     const char *trans;
     if (path == NULL)
 	return (NULL);
 
     path += strspn(path, ":;");
     while (*path != '\0') {
 	size_t len;
 	char  *res;
 
 	len = strcspn(path, ":;");
 	trans = lm_findn(NULL, path, len);
 	if (trans)
 	    res = callback(trans, strlen(trans), arg);
 	else
 	    res = callback(path, len, arg);
 
 	if (res != NULL)
 	    return (res);
 
 	path += len;
 	path += strspn(path, ":;");
     }
 
     return (NULL);
 }
 
 struct try_library_args {
     const char	*name;
     size_t	 namelen;
     char	*buffer;
     size_t	 buflen;
 };
 
 static void *
 try_library_path(const char *dir, size_t dirlen, void *param)
 {
     struct try_library_args *arg;
 
     arg = param;
     if (*dir == '/' || trust) {
 	char *pathname;
 
 	if (dirlen + 1 + arg->namelen + 1 > arg->buflen)
 		return (NULL);
 
 	pathname = arg->buffer;
 	strncpy(pathname, dir, dirlen);
 	pathname[dirlen] = '/';
 	strcpy(pathname + dirlen + 1, arg->name);
 
 	dbg("  Trying \"%s\"", pathname);
 	if (access(pathname, F_OK) == 0) {		/* We found it */
 	    pathname = xmalloc(dirlen + 1 + arg->namelen + 1);
 	    strcpy(pathname, arg->buffer);
 	    return (pathname);
 	}
     }
     return (NULL);
 }
 
 static char *
 search_library_path(const char *name, const char *path)
 {
     char *p;
     struct try_library_args arg;
 
     if (path == NULL)
 	return NULL;
 
     arg.name = name;
     arg.namelen = strlen(name);
     arg.buffer = xmalloc(PATH_MAX);
     arg.buflen = PATH_MAX;
 
     p = path_enumerate(path, try_library_path, &arg);
 
     free(arg.buffer);
 
     return (p);
 }
 
 
 /*
  * Finds the library with the given name using the directory descriptors
  * listed in the LD_LIBRARY_PATH_FDS environment variable.
  *
  * Returns a path string for the library and stores a freshly-opened
  * close-on-exec descriptor in *fdp, or returns NULL if it is not found.
  */
 static char *
 search_library_pathfds(const char *name, const char *path, int *fdp)
 {
 	char *envcopy, *fdstr, *found, *last_token;
 	size_t len;
 	int dirfd, fd;
 
 	dbg("%s('%s', '%s', fdp)", __func__, name, path);
 
 	/* Don't load from user-specified libdirs into setuid binaries. */
 	if (!trust)
 		return (NULL);
 
 	/* We can't do anything if LD_LIBRARY_PATH_FDS isn't set. */
 	if (path == NULL)
 		return (NULL);
 
 	/* LD_LIBRARY_PATH_FDS only works with relative paths. */
 	if (name[0] == '/') {
 		dbg("Absolute path (%s) passed to %s", name, __func__);
 		return (NULL);
 	}
 
 	/*
 	 * Use strtok_r() to walk the FD:FD:FD list.  This requires a local
 	 * copy of the path, as strtok_r rewrites separator tokens
 	 * with '\0'.
 	 */
 	found = NULL;
 	envcopy = xstrdup(path);
 	for (fdstr = strtok_r(envcopy, ":", &last_token); fdstr != NULL;
 	    fdstr = strtok_r(NULL, ":", &last_token)) {
 		dirfd = parse_libdir(fdstr);
 		if (dirfd < 0)
 			break;
 		fd = __sys_openat(dirfd, name, O_RDONLY | O_CLOEXEC);
 		if (fd >= 0) {
 			*fdp = fd;
 			len = strlen(fdstr) + strlen(name) + 3;
 			found = xmalloc(len);
 			if (rtld_snprintf(found, len, "#%d/%s", dirfd, name) < 0) {
 				_rtld_error("error generating '%d/%s'",
 				    dirfd, name);
 				rtld_die();
 			}
 			dbg("open('%s') => %d", found, fd);
 			break;
 		}
 	}
 	free(envcopy);
 
 	return (found);
 }
 
 
 int
 dlclose(void *handle)
 {
     Obj_Entry *root;
     RtldLockState lockstate;
 
     wlock_acquire(rtld_bind_lock, &lockstate);
     root = dlcheck(handle);
     if (root == NULL) {
 	lock_release(rtld_bind_lock, &lockstate);
 	return -1;
     }
     LD_UTRACE(UTRACE_DLCLOSE_START, handle, NULL, 0, root->dl_refcount,
 	root->path);
 
     /* Unreference the object and its dependencies. */
     root->dl_refcount--;
 
     if (root->refcount == 1) {
 	/*
 	 * The object will no longer be referenced, so we must unload it.
 	 * First, call the fini functions.
 	 */
 	objlist_call_fini(&list_fini, root, &lockstate);
 
 	unref_dag(root);
 
 	/* Finish cleaning up the newly-unreferenced objects. */
 	GDB_STATE(RT_DELETE,&root->linkmap);
 	unload_object(root);
 	GDB_STATE(RT_CONSISTENT,NULL);
     } else
 	unref_dag(root);
 
     LD_UTRACE(UTRACE_DLCLOSE_STOP, handle, NULL, 0, 0, NULL);
     lock_release(rtld_bind_lock, &lockstate);
     return 0;
 }
 
 char *
 dlerror(void)
 {
     char *msg = error_message;
     error_message = NULL;
     return msg;
 }
 
 /*
  * This function is deprecated and has no effect.
  */
 void
 dllockinit(void *context,
 	   void *(*lock_create)(void *context),
            void (*rlock_acquire)(void *lock),
            void (*wlock_acquire)(void *lock),
            void (*lock_release)(void *lock),
            void (*lock_destroy)(void *lock),
 	   void (*context_destroy)(void *context))
 {
     static void *cur_context;
     static void (*cur_context_destroy)(void *);
 
     /* Just destroy the context from the previous call, if necessary. */
     if (cur_context_destroy != NULL)
 	cur_context_destroy(cur_context);
     cur_context = context;
     cur_context_destroy = context_destroy;
 }
 
 void *
 dlopen(const char *name, int mode)
 {
 
 	return (rtld_dlopen(name, -1, mode));
 }
 
 void *
 fdlopen(int fd, int mode)
 {
 
 	return (rtld_dlopen(NULL, fd, mode));
 }
 
 static void *
 rtld_dlopen(const char *name, int fd, int mode)
 {
     RtldLockState lockstate;
     int lo_flags;
 
     LD_UTRACE(UTRACE_DLOPEN_START, NULL, NULL, 0, mode, name);
     ld_tracing = (mode & RTLD_TRACE) == 0 ? NULL : "1";
     if (ld_tracing != NULL) {
 	rlock_acquire(rtld_bind_lock, &lockstate);
 	if (sigsetjmp(lockstate.env, 0) != 0)
 	    lock_upgrade(rtld_bind_lock, &lockstate);
 	environ = (char **)*get_program_var_addr("environ", &lockstate);
 	lock_release(rtld_bind_lock, &lockstate);
     }
     lo_flags = RTLD_LO_DLOPEN;
     if (mode & RTLD_NODELETE)
 	    lo_flags |= RTLD_LO_NODELETE;
     if (mode & RTLD_NOLOAD)
 	    lo_flags |= RTLD_LO_NOLOAD;
     if (ld_tracing != NULL)
 	    lo_flags |= RTLD_LO_TRACE;
 
     return (dlopen_object(name, fd, obj_main, lo_flags,
       mode & (RTLD_MODEMASK | RTLD_GLOBAL), NULL));
 }
 
 static void
 dlopen_cleanup(Obj_Entry *obj)
 {
 
 	obj->dl_refcount--;
 	unref_dag(obj);
 	if (obj->refcount == 0)
 		unload_object(obj);
 }
 
 static Obj_Entry *
 dlopen_object(const char *name, int fd, Obj_Entry *refobj, int lo_flags,
     int mode, RtldLockState *lockstate)
 {
     Obj_Entry **old_obj_tail;
     Obj_Entry *obj;
     Objlist initlist;
     RtldLockState mlockstate;
     int result;
 
     objlist_init(&initlist);
 
     if (lockstate == NULL && !(lo_flags & RTLD_LO_EARLY)) {
 	wlock_acquire(rtld_bind_lock, &mlockstate);
 	lockstate = &mlockstate;
     }
     GDB_STATE(RT_ADD,NULL);
 
     old_obj_tail = obj_tail;
     obj = NULL;
     if (name == NULL && fd == -1) {
 	obj = obj_main;
 	obj->refcount++;
     } else {
 	obj = load_object(name, fd, refobj, lo_flags);
     }
 
     if (obj) {
 	obj->dl_refcount++;
 	if (mode & RTLD_GLOBAL && objlist_find(&list_global, obj) == NULL)
 	    objlist_push_tail(&list_global, obj);
 	if (*old_obj_tail != NULL) {		/* We loaded something new. */
 	    assert(*old_obj_tail == obj);
 	    result = load_needed_objects(obj,
 		lo_flags & (RTLD_LO_DLOPEN | RTLD_LO_EARLY));
 	    init_dag(obj);
 	    ref_dag(obj);
 	    if (result != -1)
 		result = rtld_verify_versions(&obj->dagmembers);
 	    if (result != -1 && ld_tracing)
 		goto trace;
 	    if (result == -1 || relocate_object_dag(obj,
 	      (mode & RTLD_MODEMASK) == RTLD_NOW, &obj_rtld,
 	      (lo_flags & RTLD_LO_EARLY) ? SYMLOOK_EARLY : 0,
 	      lockstate) == -1) {
 		dlopen_cleanup(obj);
 		obj = NULL;
 	    } else if (lo_flags & RTLD_LO_EARLY) {
 		/*
 		 * Do not call the init functions for early loaded
 		 * filtees.  The image is still not initialized enough
 		 * for them to work.
 		 *
 		 * Our object is found by the global object list and
 		 * will be ordered among all init calls done right
 		 * before transferring control to main.
 		 */
 	    } else {
 		/* Make list of init functions to call. */
 		initlist_add_objects(obj, &obj->next, &initlist);
 	    }
 	    /*
-	     * Process all no_delete objects here, given them own
-	     * DAGs to prevent their dependencies from being unloaded.
-	     * This has to be done after we have loaded all of the
-	     * dependencies, so that we do not miss any.
+	     * Process all no_delete or global objects here, giving
+	     * them their own DAGs to prevent their dependencies from
+	     * being unloaded.  This has to be done after we have loaded
+	     * all of the dependencies, so that we do not miss any.
 	     */
 	    if (obj != NULL)
-		process_nodelete(obj);
+		process_z(obj);
 	} else {
 	    /*
 	     * Bump the reference counts for objects on this DAG.  If
 	     * this is the first dlopen() call for the object that was
 	     * already loaded as a dependency, initialize the dag
 	     * starting at it.
 	     */
 	    init_dag(obj);
 	    ref_dag(obj);
 
 	    if ((lo_flags & RTLD_LO_TRACE) != 0)
 		goto trace;
 	}
 	if (obj != NULL && ((lo_flags & RTLD_LO_NODELETE) != 0 ||
 	  obj->z_nodelete) && !obj->ref_nodel) {
 	    dbg("obj %s nodelete", obj->path);
 	    ref_dag(obj);
 	    obj->z_nodelete = obj->ref_nodel = true;
 	}
     }
 
     LD_UTRACE(UTRACE_DLOPEN_STOP, obj, NULL, 0, obj ? obj->dl_refcount : 0,
 	name);
     GDB_STATE(RT_CONSISTENT,obj ? &obj->linkmap : NULL);
 
     if (!(lo_flags & RTLD_LO_EARLY)) {
 	map_stacks_exec(lockstate);
     }
 
     if (initlist_objects_ifunc(&initlist, (mode & RTLD_MODEMASK) == RTLD_NOW,
       (lo_flags & RTLD_LO_EARLY) ? SYMLOOK_EARLY : 0,
       lockstate) == -1) {
 	objlist_clear(&initlist);
 	dlopen_cleanup(obj);
 	if (lockstate == &mlockstate)
 	    lock_release(rtld_bind_lock, lockstate);
 	return (NULL);
     }
 
     if (!(lo_flags & RTLD_LO_EARLY)) {
 	/* Call the init functions. */
 	objlist_call_init(&initlist, lockstate);
     }
     objlist_clear(&initlist);
     if (lockstate == &mlockstate)
 	lock_release(rtld_bind_lock, lockstate);
     return obj;
 trace:
     trace_loaded_objects(obj);
     if (lockstate == &mlockstate)
 	lock_release(rtld_bind_lock, lockstate);
     exit(0);
 }
 
 static void *
 do_dlsym(void *handle, const char *name, void *retaddr, const Ver_Entry *ve,
     int flags)
 {
     DoneList donelist;
     const Obj_Entry *obj, *defobj;
     const Elf_Sym *def;
     SymLook req;
     RtldLockState lockstate;
     tls_index ti;
     void *sym;
     int res;
 
     def = NULL;
     defobj = NULL;
     symlook_init(&req, name);
     req.ventry = ve;
     req.flags = flags | SYMLOOK_IN_PLT;
     req.lockstate = &lockstate;
 
     LD_UTRACE(UTRACE_DLSYM_START, handle, NULL, 0, 0, name);
     rlock_acquire(rtld_bind_lock, &lockstate);
     if (sigsetjmp(lockstate.env, 0) != 0)
 	    lock_upgrade(rtld_bind_lock, &lockstate);
     if (handle == NULL || handle == RTLD_NEXT ||
 	handle == RTLD_DEFAULT || handle == RTLD_SELF) {
 
 	if ((obj = obj_from_addr(retaddr)) == NULL) {
 	    _rtld_error("Cannot determine caller's shared object");
 	    lock_release(rtld_bind_lock, &lockstate);
 	    LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name);
 	    return NULL;
 	}
 	if (handle == NULL) {	/* Just the caller's shared object. */
 	    res = symlook_obj(&req, obj);
 	    if (res == 0) {
 		def = req.sym_out;
 		defobj = req.defobj_out;
 	    }
 	} else if (handle == RTLD_NEXT || /* Objects after caller's */
 		   handle == RTLD_SELF) { /* ... caller included */
 	    if (handle == RTLD_NEXT)
 		obj = obj->next;
 	    for (; obj != NULL; obj = obj->next) {
 		res = symlook_obj(&req, obj);
 		if (res == 0) {
 		    if (def == NULL ||
 		      ELF_ST_BIND(req.sym_out->st_info) != STB_WEAK) {
 			def = req.sym_out;
 			defobj = req.defobj_out;
 			if (ELF_ST_BIND(def->st_info) != STB_WEAK)
 			    break;
 		    }
 		}
 	    }
 	    /*
 	     * Search the dynamic linker itself, and possibly resolve the
 	     * symbol from there.  This is how the application links to
 	     * dynamic linker services such as dlopen.
 	     */
 	    if (def == NULL || ELF_ST_BIND(def->st_info) == STB_WEAK) {
 		res = symlook_obj(&req, &obj_rtld);
 		if (res == 0) {
 		    def = req.sym_out;
 		    defobj = req.defobj_out;
 		}
 	    }
 	} else {
 	    assert(handle == RTLD_DEFAULT);
 	    res = symlook_default(&req, obj);
 	    if (res == 0) {
 		defobj = req.defobj_out;
 		def = req.sym_out;
 	    }
 	}
     } else {
 	if ((obj = dlcheck(handle)) == NULL) {
 	    lock_release(rtld_bind_lock, &lockstate);
 	    LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name);
 	    return NULL;
 	}
 
 	donelist_init(&donelist);
 	if (obj->mainprog) {
             /* Handle obtained by dlopen(NULL, ...) implies global scope. */
 	    res = symlook_global(&req, &donelist);
 	    if (res == 0) {
 		def = req.sym_out;
 		defobj = req.defobj_out;
 	    }
 	    /*
 	     * Search the dynamic linker itself, and possibly resolve the
 	     * symbol from there.  This is how the application links to
 	     * dynamic linker services such as dlopen.
 	     */
 	    if (def == NULL || ELF_ST_BIND(def->st_info) == STB_WEAK) {
 		res = symlook_obj(&req, &obj_rtld);
 		if (res == 0) {
 		    def = req.sym_out;
 		    defobj = req.defobj_out;
 		}
 	    }
 	}
 	else {
 	    /* Search the whole DAG rooted at the given object. */
 	    res = symlook_list(&req, &obj->dagmembers, &donelist);
 	    if (res == 0) {
 		def = req.sym_out;
 		defobj = req.defobj_out;
 	    }
 	}
     }
 
     if (def != NULL) {
 	lock_release(rtld_bind_lock, &lockstate);
 
 	/*
 	 * The value required by the caller is derived from the value
 	 * of the symbol.  This is simply the relocated value of the
 	 * symbol.
 	 */
 	if (ELF_ST_TYPE(def->st_info) == STT_FUNC)
 	    sym = make_function_pointer(def, defobj);
 	else if (ELF_ST_TYPE(def->st_info) == STT_GNU_IFUNC)
 	    sym = rtld_resolve_ifunc(defobj, def);
 	else if (ELF_ST_TYPE(def->st_info) == STT_TLS) {
 	    ti.ti_module = defobj->tlsindex;
 	    ti.ti_offset = def->st_value;
 	    sym = __tls_get_addr(&ti);
 	} else
 	    sym = defobj->relocbase + def->st_value;
 	LD_UTRACE(UTRACE_DLSYM_STOP, handle, sym, 0, 0, name);
 	return (sym);
     }
 
     _rtld_error("Undefined symbol \"%s\"", name);
     lock_release(rtld_bind_lock, &lockstate);
     LD_UTRACE(UTRACE_DLSYM_STOP, handle, NULL, 0, 0, name);
     return NULL;
 }
 
 void *
 dlsym(void *handle, const char *name)
 {
 	return do_dlsym(handle, name, __builtin_return_address(0), NULL,
 	    SYMLOOK_DLSYM);
 }
 
 dlfunc_t
 dlfunc(void *handle, const char *name)
 {
 	union {
 		void *d;
 		dlfunc_t f;
 	} rv;
 
 	rv.d = do_dlsym(handle, name, __builtin_return_address(0), NULL,
 	    SYMLOOK_DLSYM);
 	return (rv.f);
 }
 
 void *
 dlvsym(void *handle, const char *name, const char *version)
 {
 	Ver_Entry ventry;
 
 	ventry.name = version;
 	ventry.file = NULL;
 	ventry.hash = elf_hash(version);
 	ventry.flags = 0;
 	return do_dlsym(handle, name, __builtin_return_address(0), &ventry,
 	    SYMLOOK_DLSYM);
 }
 
 int
 _rtld_addr_phdr(const void *addr, struct dl_phdr_info *phdr_info)
 {
     const Obj_Entry *obj;
     RtldLockState lockstate;
 
     rlock_acquire(rtld_bind_lock, &lockstate);
     obj = obj_from_addr(addr);
     if (obj == NULL) {
         _rtld_error("No shared object contains address");
 	lock_release(rtld_bind_lock, &lockstate);
         return (0);
     }
     rtld_fill_dl_phdr_info(obj, phdr_info);
     lock_release(rtld_bind_lock, &lockstate);
     return (1);
 }
 
 int
 dladdr(const void *addr, Dl_info *info)
 {
     const Obj_Entry *obj;
     const Elf_Sym *def;
     void *symbol_addr;
     unsigned long symoffset;
     RtldLockState lockstate;
 
     rlock_acquire(rtld_bind_lock, &lockstate);
     obj = obj_from_addr(addr);
     if (obj == NULL) {
         _rtld_error("No shared object contains address");
 	lock_release(rtld_bind_lock, &lockstate);
         return 0;
     }
     info->dli_fname = obj->path;
     info->dli_fbase = obj->mapbase;
     info->dli_saddr = (void *)0;
     info->dli_sname = NULL;
 
     /*
      * Walk the symbol list looking for the symbol whose address is
      * closest to the address sent in.
      */
     for (symoffset = 0; symoffset < obj->dynsymcount; symoffset++) {
         def = obj->symtab + symoffset;
 
         /*
          * Skip the symbol if st_shndx is either SHN_UNDEF or
          * SHN_COMMON.
          */
         if (def->st_shndx == SHN_UNDEF || def->st_shndx == SHN_COMMON)
             continue;
 
         /*
          * If the symbol is greater than the specified address, or if it
          * is further away from addr than the current nearest symbol,
          * then reject it.
          */
         symbol_addr = obj->relocbase + def->st_value;
         if (symbol_addr > addr || symbol_addr < info->dli_saddr)
             continue;
 
         /* Update our idea of the nearest symbol. */
         info->dli_sname = obj->strtab + def->st_name;
         info->dli_saddr = symbol_addr;
 
         /* Exact match? */
         if (info->dli_saddr == addr)
             break;
     }
     lock_release(rtld_bind_lock, &lockstate);
     return 1;
 }
 
 int
 dlinfo(void *handle, int request, void *p)
 {
     const Obj_Entry *obj;
     RtldLockState lockstate;
     int error;
 
     rlock_acquire(rtld_bind_lock, &lockstate);
 
     if (handle == NULL || handle == RTLD_SELF) {
 	void *retaddr;
 
 	retaddr = __builtin_return_address(0);	/* __GNUC__ only */
 	if ((obj = obj_from_addr(retaddr)) == NULL)
 	    _rtld_error("Cannot determine caller's shared object");
     } else
 	obj = dlcheck(handle);
 
     if (obj == NULL) {
 	lock_release(rtld_bind_lock, &lockstate);
 	return (-1);
     }
 
     error = 0;
     switch (request) {
     case RTLD_DI_LINKMAP:
 	*((struct link_map const **)p) = &obj->linkmap;
 	break;
     case RTLD_DI_ORIGIN:
 	error = rtld_dirname(obj->path, p);
 	break;
 
     case RTLD_DI_SERINFOSIZE:
     case RTLD_DI_SERINFO:
 	error = do_search_info(obj, request, (struct dl_serinfo *)p);
 	break;
 
     default:
 	_rtld_error("Invalid request %d passed to dlinfo()", request);
 	error = -1;
     }
 
     lock_release(rtld_bind_lock, &lockstate);
 
     return (error);
 }
 
 static void
 rtld_fill_dl_phdr_info(const Obj_Entry *obj, struct dl_phdr_info *phdr_info)
 {
 
 	phdr_info->dlpi_addr = (Elf_Addr)obj->relocbase;
 	phdr_info->dlpi_name = obj->path;
 	phdr_info->dlpi_phdr = obj->phdr;
 	phdr_info->dlpi_phnum = obj->phsize / sizeof(obj->phdr[0]);
 	phdr_info->dlpi_tls_modid = obj->tlsindex;
 	phdr_info->dlpi_tls_data = obj->tlsinit;
 	phdr_info->dlpi_adds = obj_loads;
 	phdr_info->dlpi_subs = obj_loads - obj_count;
 }
 
 int
 dl_iterate_phdr(__dl_iterate_hdr_callback callback, void *param)
 {
     struct dl_phdr_info phdr_info;
     const Obj_Entry *obj;
     RtldLockState bind_lockstate, phdr_lockstate;
     int error;
 
     wlock_acquire(rtld_phdr_lock, &phdr_lockstate);
     rlock_acquire(rtld_bind_lock, &bind_lockstate);
 
     error = 0;
 
     for (obj = obj_list;  obj != NULL;  obj = obj->next) {
 	rtld_fill_dl_phdr_info(obj, &phdr_info);
 	if ((error = callback(&phdr_info, sizeof(phdr_info), param)) != 0)
 		break;
 
     }
     if (error == 0) {
 	rtld_fill_dl_phdr_info(&obj_rtld, &phdr_info);
 	error = callback(&phdr_info, sizeof(phdr_info), param);
     }
 
     lock_release(rtld_bind_lock, &bind_lockstate);
     lock_release(rtld_phdr_lock, &phdr_lockstate);
 
     return (error);
 }
 
 static void *
 fill_search_info(const char *dir, size_t dirlen, void *param)
 {
     struct fill_search_info_args *arg;
 
     arg = param;
 
     if (arg->request == RTLD_DI_SERINFOSIZE) {
 	arg->serinfo->dls_cnt++;
 	arg->serinfo->dls_size += sizeof(struct dl_serpath) + dirlen + 1;
     } else {
 	struct dl_serpath *s_entry;
 
 	s_entry = arg->serpath;
 	s_entry->dls_name  = arg->strspace;
 	s_entry->dls_flags = arg->flags;
 
 	strncpy(arg->strspace, dir, dirlen);
 	arg->strspace[dirlen] = '\0';
 
 	arg->strspace += dirlen + 1;
 	arg->serpath++;
     }
 
     return (NULL);
 }
 
 static int
 do_search_info(const Obj_Entry *obj, int request, struct dl_serinfo *info)
 {
     struct dl_serinfo _info;
     struct fill_search_info_args args;
 
     args.request = RTLD_DI_SERINFOSIZE;
     args.serinfo = &_info;
 
     _info.dls_size = __offsetof(struct dl_serinfo, dls_serpath);
     _info.dls_cnt  = 0;
 
     path_enumerate(obj->rpath, fill_search_info, &args);
     path_enumerate(ld_library_path, fill_search_info, &args);
     path_enumerate(obj->runpath, fill_search_info, &args);
     path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args);
     if (!obj->z_nodeflib)
       path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &args);
 
 
     if (request == RTLD_DI_SERINFOSIZE) {
 	info->dls_size = _info.dls_size;
 	info->dls_cnt = _info.dls_cnt;
 	return (0);
     }
 
     if (info->dls_cnt != _info.dls_cnt || info->dls_size != _info.dls_size) {
 	_rtld_error("Uninitialized Dl_serinfo struct passed to dlinfo()");
 	return (-1);
     }
 
     args.request  = RTLD_DI_SERINFO;
     args.serinfo  = info;
     args.serpath  = &info->dls_serpath[0];
     args.strspace = (char *)&info->dls_serpath[_info.dls_cnt];
 
     args.flags = LA_SER_RUNPATH;
     if (path_enumerate(obj->rpath, fill_search_info, &args) != NULL)
 	return (-1);
 
     args.flags = LA_SER_LIBPATH;
     if (path_enumerate(ld_library_path, fill_search_info, &args) != NULL)
 	return (-1);
 
     args.flags = LA_SER_RUNPATH;
     if (path_enumerate(obj->runpath, fill_search_info, &args) != NULL)
 	return (-1);
 
     args.flags = LA_SER_CONFIG;
     if (path_enumerate(gethints(obj->z_nodeflib), fill_search_info, &args)
       != NULL)
 	return (-1);
 
     args.flags = LA_SER_DEFAULT;
     if (!obj->z_nodeflib &&
       path_enumerate(STANDARD_LIBRARY_PATH, fill_search_info, &args) != NULL)
 	return (-1);
     return (0);
 }
 
 static int
 rtld_dirname(const char *path, char *bname)
 {
     const char *endp;
 
     /* Empty or NULL string gets treated as "." */
     if (path == NULL || *path == '\0') {
 	bname[0] = '.';
 	bname[1] = '\0';
 	return (0);
     }
 
     /* Strip trailing slashes */
     endp = path + strlen(path) - 1;
     while (endp > path && *endp == '/')
 	endp--;
 
     /* Find the start of the dir */
     while (endp > path && *endp != '/')
 	endp--;
 
     /* Either the dir is "/" or there are no slashes */
     if (endp == path) {
 	bname[0] = *endp == '/' ? '/' : '.';
 	bname[1] = '\0';
 	return (0);
     } else {
 	do {
 	    endp--;
 	} while (endp > path && *endp == '/');
     }
 
     if (endp - path + 2 > PATH_MAX) {
 	_rtld_error("Filename is too long: %s", path);
 	return (-1);
     }
 
     strncpy(bname, path, endp - path + 1);
     bname[endp - path + 1] = '\0';
     return (0);
 }
 
 static int
 rtld_dirname_abs(const char *path, char *base)
 {
 	char *last;
 
 	if (realpath(path, base) == NULL)
 		return (-1);
 	dbg("%s -> %s", path, base);
 	last = strrchr(base, '/');
 	if (last == NULL)
 		return (-1);
 	if (last != base)
 		*last = '\0';
 	return (0);
 }
 
 static void
 linkmap_add(Obj_Entry *obj)
 {
     struct link_map *l = &obj->linkmap;
     struct link_map *prev;
 
     obj->linkmap.l_name = obj->path;
     obj->linkmap.l_addr = obj->mapbase;
     obj->linkmap.l_ld = obj->dynamic;
 #ifdef __mips__
     /* GDB needs load offset on MIPS to use the symbols */
     obj->linkmap.l_offs = obj->relocbase;
 #endif
 
     if (r_debug.r_map == NULL) {
 	r_debug.r_map = l;
 	return;
     }
 
     /*
      * Scan to the end of the list, but not past the entry for the
      * dynamic linker, which we want to keep at the very end.
      */
     for (prev = r_debug.r_map;
       prev->l_next != NULL && prev->l_next != &obj_rtld.linkmap;
       prev = prev->l_next)
 	;
 
     /* Link in the new entry. */
     l->l_prev = prev;
     l->l_next = prev->l_next;
     if (l->l_next != NULL)
 	l->l_next->l_prev = l;
     prev->l_next = l;
 }
 
 static void
 linkmap_delete(Obj_Entry *obj)
 {
     struct link_map *l = &obj->linkmap;
 
     if (l->l_prev == NULL) {
 	if ((r_debug.r_map = l->l_next) != NULL)
 	    l->l_next->l_prev = NULL;
 	return;
     }
 
     if ((l->l_prev->l_next = l->l_next) != NULL)
 	l->l_next->l_prev = l->l_prev;
 }
 
 /*
  * Function for the debugger to set a breakpoint on to gain control.
  *
  * The two parameters allow the debugger to easily find and determine
  * what the runtime loader is doing and to whom it is doing it.
  *
  * When the loadhook trap is hit (r_debug_state, set at program
  * initialization), the arguments can be found on the stack:
  *
  *  +8   struct link_map *m
  *  +4   struct r_debug  *rd
  *  +0   RetAddr
  */
 void
 r_debug_state(struct r_debug* rd, struct link_map *m)
 {
     /*
      * The following is a hack to force the compiler to emit calls to
      * this function, even when optimizing.  If the function is empty,
      * the compiler is not obliged to emit any code for calls to it,
      * even when marked __noinline.  However, gdb depends on those
      * calls being made.
      */
     __compiler_membar();
 }
 
 /*
  * A function called after init routines have completed. This can be used to
  * break before a program's entry routine is called, and can be used when
  * main is not available in the symbol table.
  */
 void
 _r_debug_postinit(struct link_map *m)
 {
 
 	/* See r_debug_state(). */
 	__compiler_membar();
 }
 
 /*
  * Get address of the pointer variable in the main program.
  * Prefer non-weak symbol over the weak one.
  */
 static const void **
 get_program_var_addr(const char *name, RtldLockState *lockstate)
 {
     SymLook req;
     DoneList donelist;
 
     symlook_init(&req, name);
     req.lockstate = lockstate;
     donelist_init(&donelist);
     if (symlook_global(&req, &donelist) != 0)
 	return (NULL);
     if (ELF_ST_TYPE(req.sym_out->st_info) == STT_FUNC)
 	return ((const void **)make_function_pointer(req.sym_out,
 	  req.defobj_out));
     else if (ELF_ST_TYPE(req.sym_out->st_info) == STT_GNU_IFUNC)
 	return ((const void **)rtld_resolve_ifunc(req.defobj_out, req.sym_out));
     else
 	return ((const void **)(req.defobj_out->relocbase +
 	  req.sym_out->st_value));
 }
 
 /*
  * Set a pointer variable in the main program to the given value.  This
  * is used to set key variables such as "environ" before any of the
  * init functions are called.
  */
 static void
 set_program_var(const char *name, const void *value)
 {
     const void **addr;
 
     if ((addr = get_program_var_addr(name, NULL)) != NULL) {
 	dbg("\"%s\": *%p <-- %p", name, addr, value);
 	*addr = value;
     }
 }
 
 /*
  * Search the global objects, including dependencies and main object,
  * for the given symbol.
  */
 static int
 symlook_global(SymLook *req, DoneList *donelist)
 {
     SymLook req1;
     const Objlist_Entry *elm;
     int res;
 
     symlook_init_from_req(&req1, req);
 
     /* Search all objects loaded at program start up. */
     if (req->defobj_out == NULL ||
       ELF_ST_BIND(req->sym_out->st_info) == STB_WEAK) {
 	res = symlook_list(&req1, &list_main, donelist);
 	if (res == 0 && (req->defobj_out == NULL ||
 	  ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) {
 	    req->sym_out = req1.sym_out;
 	    req->defobj_out = req1.defobj_out;
 	    assert(req->defobj_out != NULL);
 	}
     }
 
     /* Search all DAGs whose roots are RTLD_GLOBAL objects. */
     STAILQ_FOREACH(elm, &list_global, link) {
 	if (req->defobj_out != NULL &&
 	  ELF_ST_BIND(req->sym_out->st_info) != STB_WEAK)
 	    break;
 	res = symlook_list(&req1, &elm->obj->dagmembers, donelist);
 	if (res == 0 && (req->defobj_out == NULL ||
 	  ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) {
 	    req->sym_out = req1.sym_out;
 	    req->defobj_out = req1.defobj_out;
 	    assert(req->defobj_out != NULL);
 	}
     }
 
     return (req->sym_out != NULL ? 0 : ESRCH);
 }
 
 /*
  * Given a symbol name in a referencing object, find the corresponding
  * definition of the symbol.  Returns 0 if a definition was found, or
  * ESRCH otherwise.  On success, the symbol and the Obj_Entry of the
  * defining object are returned via req->sym_out and req->defobj_out.
  */
 static int
 symlook_default(SymLook *req, const Obj_Entry *refobj)
 {
     DoneList donelist;
     const Objlist_Entry *elm;
     SymLook req1;
     int res;
 
     donelist_init(&donelist);
     symlook_init_from_req(&req1, req);
 
     /* Look first in the referencing object if linked symbolically. */
     if (refobj->symbolic && !donelist_check(&donelist, refobj)) {
 	res = symlook_obj(&req1, refobj);
 	if (res == 0) {
 	    req->sym_out = req1.sym_out;
 	    req->defobj_out = req1.defobj_out;
 	    assert(req->defobj_out != NULL);
 	}
     }
 
     symlook_global(req, &donelist);
 
     /* Search all dlopened DAGs containing the referencing object. */
     STAILQ_FOREACH(elm, &refobj->dldags, link) {
 	if (req->sym_out != NULL &&
 	  ELF_ST_BIND(req->sym_out->st_info) != STB_WEAK)
 	    break;
 	res = symlook_list(&req1, &elm->obj->dagmembers, &donelist);
 	if (res == 0 && (req->sym_out == NULL ||
 	  ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK)) {
 	    req->sym_out = req1.sym_out;
 	    req->defobj_out = req1.defobj_out;
 	    assert(req->defobj_out != NULL);
 	}
     }
 
     /*
      * Search the dynamic linker itself, and possibly resolve the
      * symbol from there.  This is how the application links to
      * dynamic linker services such as dlopen.
      */
     if (req->sym_out == NULL ||
       ELF_ST_BIND(req->sym_out->st_info) == STB_WEAK) {
 	res = symlook_obj(&req1, &obj_rtld);
 	if (res == 0) {
 	    req->sym_out = req1.sym_out;
 	    req->defobj_out = req1.defobj_out;
 	    assert(req->defobj_out != NULL);
 	}
     }
 
     return (req->sym_out != NULL ? 0 : ESRCH);
 }
 
 static int
 symlook_list(SymLook *req, const Objlist *objlist, DoneList *dlp)
 {
     const Elf_Sym *def;
     const Obj_Entry *defobj;
     const Objlist_Entry *elm;
     SymLook req1;
     int res;
 
     def = NULL;
     defobj = NULL;
     STAILQ_FOREACH(elm, objlist, link) {
 	if (donelist_check(dlp, elm->obj))
 	    continue;
 	symlook_init_from_req(&req1, req);
 	if ((res = symlook_obj(&req1, elm->obj)) == 0) {
 	    if (def == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK) {
 		def = req1.sym_out;
 		defobj = req1.defobj_out;
 		if (ELF_ST_BIND(def->st_info) != STB_WEAK)
 		    break;
 	    }
 	}
     }
     if (def != NULL) {
 	req->sym_out = def;
 	req->defobj_out = defobj;
 	return (0);
     }
     return (ESRCH);
 }
 
 /*
  * Search the chain of DAGs pointed to by the given Needed_Entry
  * for a symbol of the given name.  Each DAG is scanned completely
  * before advancing to the next one.  Returns 0 and stores the symbol
  * in req->sym_out if a definition was found, or ESRCH otherwise.
  */
 static int
 symlook_needed(SymLook *req, const Needed_Entry *needed, DoneList *dlp)
 {
     const Elf_Sym *def;
     const Needed_Entry *n;
     const Obj_Entry *defobj;
     SymLook req1;
     int res;
 
     def = NULL;
     defobj = NULL;
     symlook_init_from_req(&req1, req);
     for (n = needed; n != NULL; n = n->next) {
 	if (n->obj == NULL ||
 	    (res = symlook_list(&req1, &n->obj->dagmembers, dlp)) != 0)
 	    continue;
 	if (def == NULL || ELF_ST_BIND(req1.sym_out->st_info) != STB_WEAK) {
 	    def = req1.sym_out;
 	    defobj = req1.defobj_out;
 	    if (ELF_ST_BIND(def->st_info) != STB_WEAK)
 		break;
 	}
     }
     if (def != NULL) {
 	req->sym_out = def;
 	req->defobj_out = defobj;
 	return (0);
     }
     return (ESRCH);
 }
 
 /*
  * Search the symbol table of a single shared object for a symbol of
  * the given name and version, if requested.  Returns 0 and stores the
  * symbol in req->sym_out if a definition was found, or an errno value
  * otherwise.  If the object is a filter, return the filtered symbol
  * from the filtee.
  *
  * The symbol's hash value is passed in for efficiency reasons; that
  * eliminates many recomputations of the hash value.
  */
 int
 symlook_obj(SymLook *req, const Obj_Entry *obj)
 {
     DoneList donelist;
     SymLook req1;
     int flags, res, mres;
 
     /*
      * If there is at least one valid hash at this point, we prefer to
      * use the faster GNU version if available.
      */
     if (obj->valid_hash_gnu)
 	mres = symlook_obj1_gnu(req, obj);
     else if (obj->valid_hash_sysv)
 	mres = symlook_obj1_sysv(req, obj);
     else
 	return (EINVAL);
 
     if (mres == 0) {
 	if (obj->needed_filtees != NULL) {
 	    flags = (req->flags & SYMLOOK_EARLY) ? RTLD_LO_EARLY : 0;
 	    load_filtees(__DECONST(Obj_Entry *, obj), flags, req->lockstate);
 	    donelist_init(&donelist);
 	    symlook_init_from_req(&req1, req);
 	    res = symlook_needed(&req1, obj->needed_filtees, &donelist);
 	    if (res == 0) {
 		req->sym_out = req1.sym_out;
 		req->defobj_out = req1.defobj_out;
 	    }
 	    return (res);
 	}
 	if (obj->needed_aux_filtees != NULL) {
 	    flags = (req->flags & SYMLOOK_EARLY) ? RTLD_LO_EARLY : 0;
 	    load_filtees(__DECONST(Obj_Entry *, obj), flags, req->lockstate);
 	    donelist_init(&donelist);
 	    symlook_init_from_req(&req1, req);
 	    res = symlook_needed(&req1, obj->needed_aux_filtees, &donelist);
 	    if (res == 0) {
 		req->sym_out = req1.sym_out;
 		req->defobj_out = req1.defobj_out;
 		return (res);
 	    }
 	}
     }
     return (mres);
 }
 
 /* Symbol match routine common to both hash functions */
 static bool
 matched_symbol(SymLook *req, const Obj_Entry *obj, Sym_Match_Result *result,
     const unsigned long symnum)
 {
 	Elf_Versym verndx;
 	const Elf_Sym *symp;
 	const char *strp;
 
 	symp = obj->symtab + symnum;
 	strp = obj->strtab + symp->st_name;
 
 	switch (ELF_ST_TYPE(symp->st_info)) {
 	case STT_FUNC:
 	case STT_NOTYPE:
 	case STT_OBJECT:
 	case STT_COMMON:
 	case STT_GNU_IFUNC:
 		if (symp->st_value == 0)
 			return (false);
 		/* fallthrough */
 	case STT_TLS:
 		if (symp->st_shndx != SHN_UNDEF)
 			break;
 #ifndef __mips__
 		else if (((req->flags & SYMLOOK_IN_PLT) == 0) &&
 		    (ELF_ST_TYPE(symp->st_info) == STT_FUNC))
 			break;
 		/* fallthrough */
 #endif
 	default:
 		return (false);
 	}
 	if (req->name[0] != strp[0] || strcmp(req->name, strp) != 0)
 		return (false);
 
 	if (req->ventry == NULL) {
 		if (obj->versyms != NULL) {
 			verndx = VER_NDX(obj->versyms[symnum]);
 			if (verndx > obj->vernum) {
 				_rtld_error(
 				    "%s: symbol %s references wrong version %d",
 				    obj->path, strp, verndx);
 				return (false);
 			}
 			/*
 			 * If we are not called from dlsym (i.e. this
 			 * is a normal relocation from an unversioned
 			 * binary), accept the symbol immediately if
 			 * it happens to have the first version after
 			 * this shared object became versioned.
 			 * Otherwise, if the symbol is versioned and
 			 * not hidden, remember it.  If it is the only
 			 * symbol with this name exported by the shared
 			 * object, it will be returned as a match by
 			 * the calling function.  If the symbol is
 			 * global (verndx < 2), accept it
 			 * unconditionally.
 			 */
 			if ((req->flags & SYMLOOK_DLSYM) == 0 &&
 			    verndx == VER_NDX_GIVEN) {
 				result->sym_out = symp;
 				return (true);
 			}
 			else if (verndx >= VER_NDX_GIVEN) {
 				if ((obj->versyms[symnum] & VER_NDX_HIDDEN)
 				    == 0) {
 					if (result->vsymp == NULL)
 						result->vsymp = symp;
 					result->vcount++;
 				}
 				return (false);
 			}
 		}
 		result->sym_out = symp;
 		return (true);
 	}
 	if (obj->versyms == NULL) {
 		if (object_match_name(obj, req->ventry->name)) {
 			_rtld_error("%s: object %s should provide version %s "
 			    "for symbol %s", obj_rtld.path, obj->path,
 			    req->ventry->name, strp);
 			return (false);
 		}
 	} else {
 		verndx = VER_NDX(obj->versyms[symnum]);
 		if (verndx > obj->vernum) {
 			_rtld_error("%s: symbol %s references wrong version %d",
 			    obj->path, strp, verndx);
 			return (false);
 		}
 		if (obj->vertab[verndx].hash != req->ventry->hash ||
 		    strcmp(obj->vertab[verndx].name, req->ventry->name)) {
 			/*
 			 * Version does not match.  Check whether
 			 * this is a global symbol that is not
 			 * hidden.  If a global symbol (verndx < 2)
 			 * is available, use it.  Do not return the
 			 * symbol if we are called by dlvsym, because
 			 * dlvsym looks for a specific version and
 			 * the default one is not what dlvsym wants.
 			 */
 			if ((req->flags & SYMLOOK_DLSYM) ||
 			    (verndx >= VER_NDX_GIVEN) ||
 			    (obj->versyms[symnum] & VER_NDX_HIDDEN))
 				return (false);
 		}
 	}
 	result->sym_out = symp;
 	return (true);
 }
 
 /*
  * Search for symbol using SysV hash function.
  * obj->buckets is known not to be NULL at this point; the test for this was
  * performed with the obj->valid_hash_sysv assignment.
  */
 static int
 symlook_obj1_sysv(SymLook *req, const Obj_Entry *obj)
 {
 	unsigned long symnum;
 	Sym_Match_Result matchres;
 
 	matchres.sym_out = NULL;
 	matchres.vsymp = NULL;
 	matchres.vcount = 0;
 
 	for (symnum = obj->buckets[req->hash % obj->nbuckets];
 	    symnum != STN_UNDEF; symnum = obj->chains[symnum]) {
 		if (symnum >= obj->nchains)
 			return (ESRCH);	/* Bad object */
 
 		if (matched_symbol(req, obj, &matchres, symnum)) {
 			req->sym_out = matchres.sym_out;
 			req->defobj_out = obj;
 			return (0);
 		}
 	}
 	if (matchres.vcount == 1) {
 		req->sym_out = matchres.vsymp;
 		req->defobj_out = obj;
 		return (0);
 	}
 	return (ESRCH);
 }
 
 /* Search for symbol using GNU hash function */
 static int
 symlook_obj1_gnu(SymLook *req, const Obj_Entry *obj)
 {
 	Elf_Addr bloom_word;
 	const Elf32_Word *hashval;
 	Elf32_Word bucket;
 	Sym_Match_Result matchres;
 	unsigned int h1, h2;
 	unsigned long symnum;
 
 	matchres.sym_out = NULL;
 	matchres.vsymp = NULL;
 	matchres.vcount = 0;
 
 	/* Pick right bitmask word from Bloom filter array */
 	bloom_word = obj->bloom_gnu[(req->hash_gnu / __ELF_WORD_SIZE) &
 	    obj->maskwords_bm_gnu];
 
 	/* Take the GNU hash and its derivative modulo the word size */
 	h1 = req->hash_gnu & (__ELF_WORD_SIZE - 1);
 	h2 = ((req->hash_gnu >> obj->shift2_gnu) & (__ELF_WORD_SIZE - 1));
 
 	/* Filter out the "definitely not in set" queries */
 	if (((bloom_word >> h1) & (bloom_word >> h2) & 1) == 0)
 		return (ESRCH);
 
 	/* Locate hash chain and corresponding value element */
 	bucket = obj->buckets_gnu[req->hash_gnu % obj->nbuckets_gnu];
 	if (bucket == 0)
 		return (ESRCH);
 	hashval = &obj->chain_zero_gnu[bucket];
 	do {
 		if (((*hashval ^ req->hash_gnu) >> 1) == 0) {
 			symnum = hashval - obj->chain_zero_gnu;
 			if (matched_symbol(req, obj, &matchres, symnum)) {
 				req->sym_out = matchres.sym_out;
 				req->defobj_out = obj;
 				return (0);
 			}
 		}
 	} while ((*hashval++ & 1) == 0);
 	if (matchres.vcount == 1) {
 		req->sym_out = matchres.vsymp;
 		req->defobj_out = obj;
 		return (0);
 	}
 	return (ESRCH);
 }
 
 static void
 trace_loaded_objects(Obj_Entry *obj)
 {
     char	*fmt1, *fmt2, *fmt, *main_local, *list_containers;
     int		c;
 
     if ((main_local = getenv(LD_ "TRACE_LOADED_OBJECTS_PROGNAME")) == NULL)
 	main_local = "";
 
     if ((fmt1 = getenv(LD_ "TRACE_LOADED_OBJECTS_FMT1")) == NULL)
 	fmt1 = "\t%o => %p (%x)\n";
 
     if ((fmt2 = getenv(LD_ "TRACE_LOADED_OBJECTS_FMT2")) == NULL)
 	fmt2 = "\t%o (%x)\n";
 
     list_containers = getenv(LD_ "TRACE_LOADED_OBJECTS_ALL");
 
     for (; obj; obj = obj->next) {
 	Needed_Entry		*needed;
 	char			*name, *path;
 	bool			is_lib;
 
 	if (list_containers && obj->needed != NULL)
 	    rtld_printf("%s:\n", obj->path);
 	for (needed = obj->needed; needed; needed = needed->next) {
 	    if (needed->obj != NULL) {
 		if (needed->obj->traced && !list_containers)
 		    continue;
 		needed->obj->traced = true;
 		path = needed->obj->path;
 	    } else
 		path = "not found";
 
 	    name = (char *)obj->strtab + needed->name;
 	    is_lib = strncmp(name, "lib", 3) == 0;	/* XXX - bogus */
 
 	    fmt = is_lib ? fmt1 : fmt2;
 	    while ((c = *fmt++) != '\0') {
 		switch (c) {
 		default:
 		    rtld_putchar(c);
 		    continue;
 		case '\\':
 		    switch (c = *fmt) {
 		    case '\0':
 			continue;
 		    case 'n':
 			rtld_putchar('\n');
 			break;
 		    case 't':
 			rtld_putchar('\t');
 			break;
 		    }
 		    break;
 		case '%':
 		    switch (c = *fmt) {
 		    case '\0':
 			continue;
 		    case '%':
 		    default:
 			rtld_putchar(c);
 			break;
 		    case 'A':
 			rtld_putstr(main_local);
 			break;
 		    case 'a':
 			rtld_putstr(obj_main->path);
 			break;
 		    case 'o':
 			rtld_putstr(name);
 			break;
 #if 0
 		    case 'm':
 			rtld_printf("%d", sodp->sod_major);
 			break;
 		    case 'n':
 			rtld_printf("%d", sodp->sod_minor);
 			break;
 #endif
 		    case 'p':
 			rtld_putstr(path);
 			break;
 		    case 'x':
 			rtld_printf("%p", needed->obj ? needed->obj->mapbase :
 			  0);
 			break;
 		    }
 		    break;
 		}
 		++fmt;
 	    }
 	}
     }
 }
 
 /*
  * Unload a dlopened object and its dependencies from memory and from
  * our data structures.  It is assumed that the DAG rooted in the
  * object has already been unreferenced, and that the object has a
  * reference count of 0.
  */
 static void
 unload_object(Obj_Entry *root)
 {
     Obj_Entry *obj;
     Obj_Entry **linkp;
 
     assert(root->refcount == 0);
 
     /*
      * Pass over the DAG removing unreferenced objects from
      * appropriate lists.
      */
     unlink_object(root);
 
     /* Unmap all objects that are no longer referenced. */
     linkp = &obj_list->next;
     while ((obj = *linkp) != NULL) {
 	if (obj->refcount == 0) {
 	    LD_UTRACE(UTRACE_UNLOAD_OBJECT, obj, obj->mapbase, obj->mapsize, 0,
 		obj->path);
 	    dbg("unloading \"%s\"", obj->path);
 	    unload_filtees(root);
 	    munmap(obj->mapbase, obj->mapsize);
 	    linkmap_delete(obj);
 	    *linkp = obj->next;
 	    obj_count--;
 	    obj_free(obj);
 	} else
 	    linkp = &obj->next;
     }
     obj_tail = linkp;
 }
 
 static void
 unlink_object(Obj_Entry *root)
 {
     Objlist_Entry *elm;
 
     if (root->refcount == 0) {
 	/* Remove the object from the RTLD_GLOBAL list. */
 	objlist_remove(&list_global, root);
 
     	/* Remove the object from all objects' DAG lists. */
     	STAILQ_FOREACH(elm, &root->dagmembers, link) {
 	    objlist_remove(&elm->obj->dldags, root);
 	    if (elm->obj != root)
 		unlink_object(elm->obj);
 	}
     }
 }
 
 static void
 ref_dag(Obj_Entry *root)
 {
     Objlist_Entry *elm;
 
     assert(root->dag_inited);
     STAILQ_FOREACH(elm, &root->dagmembers, link)
 	elm->obj->refcount++;
 }
 
 static void
 unref_dag(Obj_Entry *root)
 {
     Objlist_Entry *elm;
 
     assert(root->dag_inited);
     STAILQ_FOREACH(elm, &root->dagmembers, link)
 	elm->obj->refcount--;
 }
 
 /*
  * Common code for MD __tls_get_addr().
  */
 static void *tls_get_addr_slow(Elf_Addr **, int, size_t) __noinline;
 static void *
 tls_get_addr_slow(Elf_Addr **dtvp, int index, size_t offset)
 {
     Elf_Addr *newdtv, *dtv;
     RtldLockState lockstate;
     int to_copy;
 
     dtv = *dtvp;
     /* Check dtv generation in case new modules have arrived */
     if (dtv[0] != tls_dtv_generation) {
 	wlock_acquire(rtld_bind_lock, &lockstate);
 	newdtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr));
 	to_copy = dtv[1];
 	if (to_copy > tls_max_index)
 	    to_copy = tls_max_index;
 	memcpy(&newdtv[2], &dtv[2], to_copy * sizeof(Elf_Addr));
 	newdtv[0] = tls_dtv_generation;
 	newdtv[1] = tls_max_index;
 	free(dtv);
 	lock_release(rtld_bind_lock, &lockstate);
 	dtv = *dtvp = newdtv;
     }
 
     /* Dynamically allocate module TLS if necessary */
     if (dtv[index + 1] == 0) {
 	/* Signal safe, wlock will block out signals. */
 	wlock_acquire(rtld_bind_lock, &lockstate);
 	if (!dtv[index + 1])
 	    dtv[index + 1] = (Elf_Addr)allocate_module_tls(index);
 	lock_release(rtld_bind_lock, &lockstate);
     }
     return ((void *)(dtv[index + 1] + offset));
 }
 
 void *
 tls_get_addr_common(Elf_Addr **dtvp, int index, size_t offset)
 {
 	Elf_Addr *dtv;
 
 	dtv = *dtvp;
 	/* Check dtv generation in case new modules have arrived */
 	if (__predict_true(dtv[0] == tls_dtv_generation &&
 	    dtv[index + 1] != 0))
 		return ((void *)(dtv[index + 1] + offset));
 	return (tls_get_addr_slow(dtvp, index, offset));
 }
 
 #if defined(__aarch64__) || defined(__arm__) || defined(__mips__) || \
     defined(__powerpc__)
 
 /*
  * Allocate Static TLS using the Variant I method.
  */
 void *
 allocate_tls(Obj_Entry *objs, void *oldtcb, size_t tcbsize, size_t tcbalign)
 {
     Obj_Entry *obj;
     char *tcb;
     Elf_Addr **tls;
     Elf_Addr *dtv;
     Elf_Addr addr;
     int i;
 
     if (oldtcb != NULL && tcbsize == TLS_TCB_SIZE)
 	return (oldtcb);
 
     assert(tcbsize >= TLS_TCB_SIZE);
     tcb = xcalloc(1, tls_static_space - TLS_TCB_SIZE + tcbsize);
     tls = (Elf_Addr **)(tcb + tcbsize - TLS_TCB_SIZE);
 
     if (oldtcb != NULL) {
 	memcpy(tls, oldtcb, tls_static_space);
 	free(oldtcb);
 
 	/* Adjust the DTV. */
 	dtv = tls[0];
 	for (i = 0; i < dtv[1]; i++) {
 	    if (dtv[i+2] >= (Elf_Addr)oldtcb &&
 		dtv[i+2] < (Elf_Addr)oldtcb + tls_static_space) {
 		dtv[i+2] = dtv[i+2] - (Elf_Addr)oldtcb + (Elf_Addr)tls;
 	    }
 	}
     } else {
 	dtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr));
 	tls[0] = dtv;
 	dtv[0] = tls_dtv_generation;
 	dtv[1] = tls_max_index;
 
 	for (obj = objs; obj; obj = obj->next) {
 	    if (obj->tlsoffset > 0) {
 		addr = (Elf_Addr)tls + obj->tlsoffset;
 		if (obj->tlsinitsize > 0)
 		    memcpy((void*) addr, obj->tlsinit, obj->tlsinitsize);
 		if (obj->tlssize > obj->tlsinitsize)
 		    memset((void*) (addr + obj->tlsinitsize), 0,
 			   obj->tlssize - obj->tlsinitsize);
 		dtv[obj->tlsindex + 1] = addr;
 	    }
 	}
     }
 
     return (tcb);
 }
 
 void
 free_tls(void *tcb, size_t tcbsize, size_t tcbalign)
 {
     Elf_Addr *dtv;
     Elf_Addr tlsstart, tlsend;
     int dtvsize, i;
 
     assert(tcbsize >= TLS_TCB_SIZE);
 
     tlsstart = (Elf_Addr)tcb + tcbsize - TLS_TCB_SIZE;
     tlsend = tlsstart + tls_static_space;
 
     dtv = *(Elf_Addr **)tlsstart;
     dtvsize = dtv[1];
     for (i = 0; i < dtvsize; i++) {
 	if (dtv[i+2] && (dtv[i+2] < tlsstart || dtv[i+2] >= tlsend)) {
 	    free((void*)dtv[i+2]);
 	}
     }
     free(dtv);
     free(tcb);
 }
 
 #endif
 
 #if defined(__i386__) || defined(__amd64__) || defined(__sparc64__)
 
 /*
  * Allocate Static TLS using the Variant II method.
  */
 void *
 allocate_tls(Obj_Entry *objs, void *oldtls, size_t tcbsize, size_t tcbalign)
 {
     Obj_Entry *obj;
     size_t size, ralign;
     char *tls;
     Elf_Addr *dtv, *olddtv;
     Elf_Addr segbase, oldsegbase, addr;
     int i;
 
     ralign = tcbalign;
     if (tls_static_max_align > ralign)
 	    ralign = tls_static_max_align;
     size = round(tls_static_space, ralign) + round(tcbsize, ralign);
 
     assert(tcbsize >= 2*sizeof(Elf_Addr));
     tls = malloc_aligned(size, ralign);
     dtv = xcalloc(tls_max_index + 2, sizeof(Elf_Addr));
 
     segbase = (Elf_Addr)(tls + round(tls_static_space, ralign));
     ((Elf_Addr*)segbase)[0] = segbase;
     ((Elf_Addr*)segbase)[1] = (Elf_Addr) dtv;
 
     dtv[0] = tls_dtv_generation;
     dtv[1] = tls_max_index;
 
     if (oldtls) {
 	/*
 	 * Copy the static TLS block over whole.
 	 */
 	oldsegbase = (Elf_Addr) oldtls;
 	memcpy((void *)(segbase - tls_static_space),
 	       (const void *)(oldsegbase - tls_static_space),
 	       tls_static_space);
 
 	/*
 	 * If any dynamic TLS blocks have been created tls_get_addr(),
 	 * move them over.
 	 */
 	olddtv = ((Elf_Addr**)oldsegbase)[1];
 	for (i = 0; i < olddtv[1]; i++) {
 	    if (olddtv[i+2] < oldsegbase - size || olddtv[i+2] > oldsegbase) {
 		dtv[i+2] = olddtv[i+2];
 		olddtv[i+2] = 0;
 	    }
 	}
 
 	/*
 	 * We assume that this block was the one we created with
 	 * allocate_initial_tls().
 	 */
 	free_tls(oldtls, 2*sizeof(Elf_Addr), sizeof(Elf_Addr));
     } else {
 	for (obj = objs; obj; obj = obj->next) {
 	    if (obj->tlsoffset) {
 		addr = segbase - obj->tlsoffset;
 		memset((void*) (addr + obj->tlsinitsize),
 		       0, obj->tlssize - obj->tlsinitsize);
 		if (obj->tlsinit)
 		    memcpy((void*) addr, obj->tlsinit, obj->tlsinitsize);
 		dtv[obj->tlsindex + 1] = addr;
 	    }
 	}
     }
 
     return (void*) segbase;
 }
 
 void
 free_tls(void *tls, size_t tcbsize, size_t tcbalign)
 {
     Elf_Addr* dtv;
     size_t size, ralign;
     int dtvsize, i;
     Elf_Addr tlsstart, tlsend;
 
     /*
      * Figure out the size of the initial TLS block so that we can
      * find stuff which ___tls_get_addr() allocated dynamically.
      */
     ralign = tcbalign;
     if (tls_static_max_align > ralign)
 	    ralign = tls_static_max_align;
     size = round(tls_static_space, ralign);
 
     dtv = ((Elf_Addr**)tls)[1];
     dtvsize = dtv[1];
     tlsend = (Elf_Addr) tls;
     tlsstart = tlsend - size;
     for (i = 0; i < dtvsize; i++) {
 	if (dtv[i + 2] != 0 && (dtv[i + 2] < tlsstart || dtv[i + 2] > tlsend)) {
 		free_aligned((void *)dtv[i + 2]);
 	}
     }
 
     free_aligned((void *)tlsstart);
     free((void*) dtv);
 }
 
 #endif
 
 /*
  * Allocate TLS block for module with given index.
  */
 void *
 allocate_module_tls(int index)
 {
     Obj_Entry* obj;
     char* p;
 
     for (obj = obj_list; obj; obj = obj->next) {
 	if (obj->tlsindex == index)
 	    break;
     }
     if (!obj) {
 	_rtld_error("Can't find module with TLS index %d", index);
 	rtld_die();
     }
 
     p = malloc_aligned(obj->tlssize, obj->tlsalign);
     memcpy(p, obj->tlsinit, obj->tlsinitsize);
     memset(p + obj->tlsinitsize, 0, obj->tlssize - obj->tlsinitsize);
 
     return p;
 }
 
 bool
 allocate_tls_offset(Obj_Entry *obj)
 {
     size_t off;
 
     if (obj->tls_done)
 	return true;
 
     if (obj->tlssize == 0) {
 	obj->tls_done = true;
 	return true;
     }
 
     if (obj->tlsindex == 1)
 	off = calculate_first_tls_offset(obj->tlssize, obj->tlsalign);
     else
 	off = calculate_tls_offset(tls_last_offset, tls_last_size,
 				   obj->tlssize, obj->tlsalign);
 
     /*
      * If we have already fixed the size of the static TLS block, we
      * must stay within that size. When allocating the static TLS, we
      * leave a small amount of space spare to be used for dynamically
      * loading modules which use static TLS.
      */
     if (tls_static_space != 0) {
 	if (calculate_tls_end(off, obj->tlssize) > tls_static_space)
 	    return false;
     } else if (obj->tlsalign > tls_static_max_align) {
 	    tls_static_max_align = obj->tlsalign;
     }
 
     tls_last_offset = obj->tlsoffset = off;
     tls_last_size = obj->tlssize;
     obj->tls_done = true;
 
     return true;
 }
 
 void
 free_tls_offset(Obj_Entry *obj)
 {
 
     /*
      * If we were the last thing to allocate out of the static TLS
      * block, we give our space back to the 'allocator'. This is a
      * simplistic workaround to allow libGL.so.1 to be loaded and
      * unloaded multiple times.
      */
     if (calculate_tls_end(obj->tlsoffset, obj->tlssize)
 	== calculate_tls_end(tls_last_offset, tls_last_size)) {
 	tls_last_offset -= obj->tlssize;
 	tls_last_size = 0;
     }
 }
 
 void *
 _rtld_allocate_tls(void *oldtls, size_t tcbsize, size_t tcbalign)
 {
     void *ret;
     RtldLockState lockstate;
 
     wlock_acquire(rtld_bind_lock, &lockstate);
     ret = allocate_tls(obj_list, oldtls, tcbsize, tcbalign);
     lock_release(rtld_bind_lock, &lockstate);
     return (ret);
 }
 
 void
 _rtld_free_tls(void *tcb, size_t tcbsize, size_t tcbalign)
 {
     RtldLockState lockstate;
 
     wlock_acquire(rtld_bind_lock, &lockstate);
     free_tls(tcb, tcbsize, tcbalign);
     lock_release(rtld_bind_lock, &lockstate);
 }
 
 static void
 object_add_name(Obj_Entry *obj, const char *name)
 {
     Name_Entry *entry;
     size_t len;
 
     len = strlen(name);
     entry = malloc(sizeof(Name_Entry) + len);
 
     if (entry != NULL) {
 	strcpy(entry->name, name);
 	STAILQ_INSERT_TAIL(&obj->names, entry, link);
     }
 }
 
 static int
 object_match_name(const Obj_Entry *obj, const char *name)
 {
     Name_Entry *entry;
 
     STAILQ_FOREACH(entry, &obj->names, link) {
 	if (strcmp(name, entry->name) == 0)
 	    return (1);
     }
     return (0);
 }
 
 static Obj_Entry *
 locate_dependency(const Obj_Entry *obj, const char *name)
 {
     const Objlist_Entry *entry;
     const Needed_Entry *needed;
 
     STAILQ_FOREACH(entry, &list_main, link) {
 	if (object_match_name(entry->obj, name))
 	    return entry->obj;
     }
 
     for (needed = obj->needed;  needed != NULL;  needed = needed->next) {
 	if (strcmp(obj->strtab + needed->name, name) == 0 ||
 	  (needed->obj != NULL && object_match_name(needed->obj, name))) {
 	    /*
 	     * If there is DT_NEEDED for the name we are looking for,
 	     * we are all set.  Note that object might not be found if
 	     * dependency was not loaded yet, so the function can
 	     * return NULL here.  This is expected and handled
 	     * properly by the caller.
 	     */
 	    return (needed->obj);
 	}
     }
     _rtld_error("%s: Unexpected inconsistency: dependency %s not found",
 	obj->path, name);
     rtld_die();
 }
 
 static int
 check_object_provided_version(Obj_Entry *refobj, const Obj_Entry *depobj,
     const Elf_Vernaux *vna)
 {
     const Elf_Verdef *vd;
     const char *vername;
 
     vername = refobj->strtab + vna->vna_name;
     vd = depobj->verdef;
     if (vd == NULL) {
 	_rtld_error("%s: version %s required by %s not defined",
 	    depobj->path, vername, refobj->path);
 	return (-1);
     }
     for (;;) {
 	if (vd->vd_version != VER_DEF_CURRENT) {
 	    _rtld_error("%s: Unsupported version %d of Elf_Verdef entry",
 		depobj->path, vd->vd_version);
 	    return (-1);
 	}
 	if (vna->vna_hash == vd->vd_hash) {
 	    const Elf_Verdaux *aux = (const Elf_Verdaux *)
 		((char *)vd + vd->vd_aux);
 	    if (strcmp(vername, depobj->strtab + aux->vda_name) == 0)
 		return (0);
 	}
 	if (vd->vd_next == 0)
 	    break;
 	vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next);
     }
     if (vna->vna_flags & VER_FLG_WEAK)
 	return (0);
     _rtld_error("%s: version %s required by %s not found",
 	depobj->path, vername, refobj->path);
     return (-1);
 }
 
 static int
 rtld_verify_object_versions(Obj_Entry *obj)
 {
     const Elf_Verneed *vn;
     const Elf_Verdef  *vd;
     const Elf_Verdaux *vda;
     const Elf_Vernaux *vna;
     const Obj_Entry *depobj;
     int maxvernum, vernum;
 
     if (obj->ver_checked)
 	return (0);
     obj->ver_checked = true;
 
     maxvernum = 0;
     /*
      * Walk over defined and required version records and figure out
      * max index used by any of them. Do very basic sanity checking
      * while there.
      */
     vn = obj->verneed;
     while (vn != NULL) {
 	if (vn->vn_version != VER_NEED_CURRENT) {
 	    _rtld_error("%s: Unsupported version %d of Elf_Verneed entry",
 		obj->path, vn->vn_version);
 	    return (-1);
 	}
 	vna = (const Elf_Vernaux *) ((char *)vn + vn->vn_aux);
 	for (;;) {
 	    vernum = VER_NEED_IDX(vna->vna_other);
 	    if (vernum > maxvernum)
 		maxvernum = vernum;
 	    if (vna->vna_next == 0)
 		 break;
 	    vna = (const Elf_Vernaux *) ((char *)vna + vna->vna_next);
 	}
 	if (vn->vn_next == 0)
 	    break;
 	vn = (const Elf_Verneed *) ((char *)vn + vn->vn_next);
     }
 
     vd = obj->verdef;
     while (vd != NULL) {
 	if (vd->vd_version != VER_DEF_CURRENT) {
 	    _rtld_error("%s: Unsupported version %d of Elf_Verdef entry",
 		obj->path, vd->vd_version);
 	    return (-1);
 	}
 	vernum = VER_DEF_IDX(vd->vd_ndx);
 	if (vernum > maxvernum)
 		maxvernum = vernum;
 	if (vd->vd_next == 0)
 	    break;
 	vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next);
     }
 
     if (maxvernum == 0)
 	return (0);
 
     /*
      * Store version information in array indexable by version index.
      * Verify that object version requirements are satisfied along the
      * way.
      */
     obj->vernum = maxvernum + 1;
     obj->vertab = xcalloc(obj->vernum, sizeof(Ver_Entry));
 
     vd = obj->verdef;
     while (vd != NULL) {
 	if ((vd->vd_flags & VER_FLG_BASE) == 0) {
 	    vernum = VER_DEF_IDX(vd->vd_ndx);
 	    assert(vernum <= maxvernum);
 	    vda = (const Elf_Verdaux *)((char *)vd + vd->vd_aux);
 	    obj->vertab[vernum].hash = vd->vd_hash;
 	    obj->vertab[vernum].name = obj->strtab + vda->vda_name;
 	    obj->vertab[vernum].file = NULL;
 	    obj->vertab[vernum].flags = 0;
 	}
 	if (vd->vd_next == 0)
 	    break;
 	vd = (const Elf_Verdef *) ((char *)vd + vd->vd_next);
     }
 
     vn = obj->verneed;
     while (vn != NULL) {
 	depobj = locate_dependency(obj, obj->strtab + vn->vn_file);
 	if (depobj == NULL)
 	    return (-1);
 	vna = (const Elf_Vernaux *) ((char *)vn + vn->vn_aux);
 	for (;;) {
 	    if (check_object_provided_version(obj, depobj, vna))
 		return (-1);
 	    vernum = VER_NEED_IDX(vna->vna_other);
 	    assert(vernum <= maxvernum);
 	    obj->vertab[vernum].hash = vna->vna_hash;
 	    obj->vertab[vernum].name = obj->strtab + vna->vna_name;
 	    obj->vertab[vernum].file = obj->strtab + vn->vn_file;
 	    obj->vertab[vernum].flags = (vna->vna_other & VER_NEED_HIDDEN) ?
 		VER_INFO_HIDDEN : 0;
 	    if (vna->vna_next == 0)
 		 break;
 	    vna = (const Elf_Vernaux *) ((char *)vna + vna->vna_next);
 	}
 	if (vn->vn_next == 0)
 	    break;
 	vn = (const Elf_Verneed *) ((char *)vn + vn->vn_next);
     }
     return 0;
 }
 
 static int
 rtld_verify_versions(const Objlist *objlist)
 {
     Objlist_Entry *entry;
     int rc;
 
     rc = 0;
     STAILQ_FOREACH(entry, objlist, link) {
 	/*
 	 * Skip dummy objects or objects that have their version requirements
 	 * already checked.
 	 */
 	if (entry->obj->strtab == NULL || entry->obj->vertab != NULL)
 	    continue;
 	if (rtld_verify_object_versions(entry->obj) == -1) {
 	    rc = -1;
 	    if (ld_tracing == NULL)
 		break;
 	}
     }
     if (rc == 0 || ld_tracing != NULL)
     	rc = rtld_verify_object_versions(&obj_rtld);
     return rc;
 }
 
 const Ver_Entry *
 fetch_ventry(const Obj_Entry *obj, unsigned long symnum)
 {
     Elf_Versym vernum;
 
     if (obj->vertab) {
 	vernum = VER_NDX(obj->versyms[symnum]);
 	if (vernum >= obj->vernum) {
 	    _rtld_error("%s: symbol %s has wrong verneed value %d",
 		obj->path, obj->strtab + symnum, vernum);
 	} else if (obj->vertab[vernum].hash != 0) {
 	    return &obj->vertab[vernum];
 	}
     }
     return NULL;
 }
 
 int
 _rtld_get_stack_prot(void)
 {
 
 	return (stack_prot);
 }
 
 int
 _rtld_is_dlopened(void *arg)
 {
 	Obj_Entry *obj;
 	RtldLockState lockstate;
 	int res;
 
 	rlock_acquire(rtld_bind_lock, &lockstate);
 	obj = dlcheck(arg);
 	if (obj == NULL)
 		obj = obj_from_addr(arg);
 	if (obj == NULL) {
 		_rtld_error("No shared object contains address");
 		lock_release(rtld_bind_lock, &lockstate);
 		return (-1);
 	}
 	res = obj->dlopened ? 1 : 0;
 	lock_release(rtld_bind_lock, &lockstate);
 	return (res);
 }
 
 static void
 map_stacks_exec(RtldLockState *lockstate)
 {
 	void (*thr_map_stacks_exec)(void);
 
 	if ((max_stack_flags & PF_X) == 0 || (stack_prot & PROT_EXEC) != 0)
 		return;
 	thr_map_stacks_exec = (void (*)(void))(uintptr_t)
 	    get_program_var_addr("__pthread_map_stacks_exec", lockstate);
 	if (thr_map_stacks_exec != NULL) {
 		stack_prot |= PROT_EXEC;
 		thr_map_stacks_exec();
 	}
 }
 
 void
 symlook_init(SymLook *dst, const char *name)
 {
 
 	bzero(dst, sizeof(*dst));
 	dst->name = name;
 	dst->hash = elf_hash(name);
 	dst->hash_gnu = gnu_hash(name);
 }
 
 static void
 symlook_init_from_req(SymLook *dst, const SymLook *src)
 {
 
 	dst->name = src->name;
 	dst->hash = src->hash;
 	dst->hash_gnu = src->hash_gnu;
 	dst->ventry = src->ventry;
 	dst->flags = src->flags;
 	dst->defobj_out = NULL;
 	dst->sym_out = NULL;
 	dst->lockstate = src->lockstate;
 }
 
 
 /*
  * Parse a file descriptor number without pulling in more of libc (e.g. atoi).
  */
 static int
 parse_libdir(const char *str)
 {
 	static const int RADIX = 10;  /* XXXJA: possibly support hex? */
 	const char *orig;
 	int fd;
 	char c;
 
 	orig = str;
 	fd = 0;
 	for (c = *str; c != '\0'; c = *++str) {
 		if (c < '0' || c > '9')
 			return (-1);
 
 		fd *= RADIX;
 		fd += c - '0';
 	}
 
 	/* Make sure we actually parsed something. */
 	if (str == orig) {
 		_rtld_error("failed to parse directory FD from '%s'", str);
 		return (-1);
 	}
 	return (fd);
 }
 
 /*
  * Overrides for libc_pic-provided functions.
  */
 
 int
 __getosreldate(void)
 {
 	size_t len;
 	int oid[2];
 	int error, osrel;
 
 	if (osreldate != 0)
 		return (osreldate);
 
 	oid[0] = CTL_KERN;
 	oid[1] = KERN_OSRELDATE;
 	osrel = 0;
 	len = sizeof(osrel);
 	error = sysctl(oid, 2, &osrel, &len, NULL, 0);
 	if (error == 0 && osrel > 0 && len == sizeof(osrel))
 		osreldate = osrel;
 	return (osreldate);
 }
 
 void
 exit(int status)
 {
 
 	_exit(status);
 }
 
 void (*__cleanup)(void);
 int __isthreaded = 0;
 int _thread_autoinit_dummy_decl = 1;
 
 /*
  * No unresolved symbols for rtld.
  */
 void
 __pthread_cxa_finalize(struct dl_phdr_info *a)
 {
 }
 
 void
 __stack_chk_fail(void)
 {
 
 	_rtld_error("stack overflow detected; terminated");
 	rtld_die();
 }
 __weak_reference(__stack_chk_fail, __stack_chk_fail_local);
 
 void
 __chk_fail(void)
 {
 
 	_rtld_error("buffer overflow detected; terminated");
 	rtld_die();
 }
 
 const char *
 rtld_strerror(int errnum)
 {
 
 	if (errnum < 0 || errnum >= sys_nerr)
 		return ("Unknown error");
 	return (sys_errlist[errnum]);
 }
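Annotation on the rtld.c hunk above: tls_get_addr_common() and its slow path index a per-thread DTV in which slot 0 holds the generation, slot 1 holds the maximum module index, and slot index + 1 holds the address of that module's TLS block. The following is a minimal user-space sketch of that layout and of the two slow-path steps (regrow when the generation is stale, then allocate the module block lazily). Every sketch_* name is hypothetical, the locking done by rtld is omitted, and calloc() stands in for rtld's own allocators; it is not the rtld implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for tls_dtv_generation and tls_max_index. */
static uintptr_t sketch_generation = 1;
static size_t sketch_max_index = 2;

/* Build an empty DTV: [0] generation, [1] max index, [2..] module blocks. */
static uintptr_t *
sketch_dtv_new(void)
{
	uintptr_t *dtv = calloc(sketch_max_index + 2, sizeof(*dtv));

	assert(dtv != NULL);
	dtv[0] = sketch_generation;
	dtv[1] = sketch_max_index;
	return (dtv);
}

/*
 * Return the address of `offset` inside the TLS block of module `index`
 * (1-based, as in the hunk).  `blocksize` is an assumption of this sketch;
 * rtld takes the size from the object instead.
 */
static void *
sketch_tls_get_addr(uintptr_t **dtvp, size_t index, size_t offset,
    size_t blocksize)
{
	uintptr_t *dtv = *dtvp;

	/* Stale generation: allocate a larger DTV and carry old slots over. */
	if (dtv[0] != sketch_generation) {
		uintptr_t *newdtv = calloc(sketch_max_index + 2,
		    sizeof(*newdtv));
		size_t ncopy = dtv[1] < sketch_max_index ?
		    (size_t)dtv[1] : sketch_max_index;

		assert(newdtv != NULL);
		memcpy(&newdtv[2], &dtv[2], ncopy * sizeof(*dtv));
		newdtv[0] = sketch_generation;
		newdtv[1] = sketch_max_index;
		free(dtv);
		dtv = *dtvp = newdtv;
	}

	/* Empty slot: allocate the module's block on first use. */
	if (dtv[index + 1] == 0) {
		void *block = calloc(1, blocksize);

		assert(block != NULL);
		dtv[index + 1] = (uintptr_t)block;
	}
	return ((void *)(dtv[index + 1] + offset));
}
```

A caller would create one DTV per thread with sketch_dtv_new() and pass its address on every lookup. Repeated lookups for the same module return addresses inside the same block, including after the generation is bumped and the DTV is regrown.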
Index: user/ngie/more-tests/libexec/rtld-elf/rtld.h
===================================================================
--- user/ngie/more-tests/libexec/rtld-elf/rtld.h	(revision 281584)
+++ user/ngie/more-tests/libexec/rtld-elf/rtld.h	(revision 281585)
@@ -1,408 +1,409 @@
 /*-
  * Copyright 1996, 1997, 1998, 1999, 2000 John D. Polstra.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef RTLD_H /* { */
 #define RTLD_H 1
 
 #include <machine/elf.h>
 #include <sys/types.h>
 #include <sys/queue.h>
 
 #include <elf-hints.h>
 #include <link.h>
 #include <stdarg.h>
 #include <setjmp.h>
 #include <stddef.h>
 
 #include "rtld_lock.h"
 #include "rtld_machdep.h"
 
 #ifdef COMPAT_32BIT
 #undef STANDARD_LIBRARY_PATH
 #undef _PATH_ELF_HINTS
 #define	_PATH_ELF_HINTS		"/var/run/ld-elf32.so.hints"
 /* For running 32 bit binaries  */
 #define	STANDARD_LIBRARY_PATH	"/lib32:/usr/lib32"
 #define LD_ "LD_32_"
 #endif
 
 #ifndef STANDARD_LIBRARY_PATH
 #define STANDARD_LIBRARY_PATH	"/lib:/usr/lib"
 #endif
 #ifndef LD_
 #define LD_ "LD_"
 #endif
 
 #define NEW(type)	((type *) xmalloc(sizeof(type)))
 #define CNEW(type)	((type *) xcalloc(1, sizeof(type)))
 
 /* We might as well do booleans like C++. */
 typedef unsigned char bool;
 #define false	0
 #define true	1
 
 extern size_t tls_last_offset;
 extern size_t tls_last_size;
 extern size_t tls_static_space;
 extern int tls_dtv_generation;
 extern int tls_max_index;
 
 extern int npagesizes;
 extern size_t *pagesizes;
 
 extern int main_argc;
 extern char **main_argv;
 extern char **environ;
 
 struct stat;
 struct Struct_Obj_Entry;
 
 /* Lists of shared objects */
 typedef struct Struct_Objlist_Entry {
     STAILQ_ENTRY(Struct_Objlist_Entry) link;
     struct Struct_Obj_Entry *obj;
 } Objlist_Entry;
 
 typedef STAILQ_HEAD(Struct_Objlist, Struct_Objlist_Entry) Objlist;
 
 /* Types of init and fini functions */
 typedef void (*InitFunc)(void);
 typedef void (*InitArrFunc)(int, char **, char **);
 
 /* Lists of shared object dependencies */
 typedef struct Struct_Needed_Entry {
     struct Struct_Needed_Entry *next;
     struct Struct_Obj_Entry *obj;
     unsigned long name;		/* Offset of name in string table */
 } Needed_Entry;
 
 typedef struct Struct_Name_Entry {
     STAILQ_ENTRY(Struct_Name_Entry) link;
     char   name[1];
 } Name_Entry;
 
 /* Lock object */
 typedef struct Struct_LockInfo {
     void *context;		/* Client context for creating locks */
     void *thelock;		/* The one big lock */
     /* Debugging aids. */
     volatile int rcount;	/* Number of readers holding lock */
     volatile int wcount;	/* Number of writers holding lock */
     /* Methods */
     void *(*lock_create)(void *context);
     void (*rlock_acquire)(void *lock);
     void (*wlock_acquire)(void *lock);
     void (*rlock_release)(void *lock);
     void (*wlock_release)(void *lock);
     void (*lock_destroy)(void *lock);
     void (*context_destroy)(void *context);
 } LockInfo;
 
 typedef struct Struct_Ver_Entry {
 	Elf_Word     hash;
 	unsigned int flags;
 	const char  *name;
 	const char  *file;
 } Ver_Entry;
 
 typedef struct Struct_Sym_Match_Result {
     const Elf_Sym *sym_out;
     const Elf_Sym *vsymp;
     int vcount;
 } Sym_Match_Result;
 
 #define VER_INFO_HIDDEN	0x01
 
 /*
  * Shared object descriptor.
  *
  * Items marked with "(%)" are dynamically allocated, and must be freed
  * when the structure is destroyed.
  *
  * CAUTION: It appears that the JDK port peeks into these structures.
  * It looks at "next" and "mapbase" at least.  Don't add new members
  * near the front, until this can be straightened out.
  */
 typedef struct Struct_Obj_Entry {
     /*
      * These two items have to be set right for compatibility with the
      * original ElfKit crt1.o.
      */
     Elf_Size magic;		/* Magic number (sanity check) */
     Elf_Size version;		/* Version number of struct format */
 
     struct Struct_Obj_Entry *next;
     char *path;			/* Pathname of underlying file (%) */
     char *origin_path;		/* Directory path of origin file */
     int refcount;
     int dl_refcount;		/* Number of times loaded by dlopen */
 
     /* These items are computed by map_object() or by digest_phdr(). */
     caddr_t mapbase;		/* Base address of mapped region */
     size_t mapsize;		/* Size of mapped region in bytes */
     size_t textsize;		/* Size of text segment in bytes */
     Elf_Addr vaddrbase;		/* Base address in shared object file */
     caddr_t relocbase;		/* Relocation constant = mapbase - vaddrbase */
     const Elf_Dyn *dynamic;	/* Dynamic section */
     caddr_t entry;		/* Entry point */
     const Elf_Phdr *phdr;	/* Program header if it is mapped, else NULL */
     size_t phsize;		/* Size of program header in bytes */
     const char *interp;		/* Pathname of the interpreter, if any */
     Elf_Word stack_flags;
 
     /* TLS information */
     int tlsindex;		/* Index in DTV for this module */
     void *tlsinit;		/* Base address of TLS init block */
     size_t tlsinitsize;		/* Size of TLS init block for this module */
     size_t tlssize;		/* Size of TLS block for this module */
     size_t tlsoffset;		/* Offset of static TLS block for this module */
     size_t tlsalign;		/* Alignment of static TLS block */
 
     caddr_t relro_page;
     size_t relro_size;
 
     /* Items from the dynamic section. */
     Elf_Addr *pltgot;		/* PLT or GOT, depending on architecture */
     const Elf_Rel *rel;		/* Relocation entries */
     unsigned long relsize;	/* Size in bytes of relocation info */
     const Elf_Rela *rela;	/* Relocation entries with addend */
     unsigned long relasize;	/* Size in bytes of addend relocation info */
     const Elf_Rel *pltrel;	/* PLT relocation entries */
     unsigned long pltrelsize;	/* Size in bytes of PLT relocation info */
     const Elf_Rela *pltrela;	/* PLT relocation entries with addend */
     unsigned long pltrelasize;	/* Size in bytes of PLT addend reloc info */
     const Elf_Sym *symtab;	/* Symbol table */
     const char *strtab;		/* String table */
     unsigned long strsize;	/* Size in bytes of string table */
 #ifdef __mips__
     Elf_Word local_gotno;	/* Number of local GOT entries */
     Elf_Word symtabno;		/* Number of dynamic symbols */
     Elf_Word gotsym;		/* First dynamic symbol in GOT */
 #endif
 
     const Elf_Verneed *verneed; /* Required versions. */
     Elf_Word verneednum;	/* Number of entries in verneed table */
     const Elf_Verdef  *verdef;	/* Provided versions. */
     Elf_Word verdefnum;		/* Number of entries in verdef table */
     const Elf_Versym *versyms;  /* Symbol versions table */
 
     const Elf_Hashelt *buckets;	/* Hash table buckets array */
     unsigned long nbuckets;	/* Number of buckets */
     const Elf_Hashelt *chains;	/* Hash table chain array */
     unsigned long nchains;	/* Number of entries in chain array */
 
     Elf32_Word nbuckets_gnu;		/* Number of GNU hash buckets*/
     Elf32_Word symndx_gnu;		/* 1st accessible symbol on dynsym table */
     Elf32_Word maskwords_bm_gnu;  	/* Bloom filter words - 1 (bitmask) */
     Elf32_Word shift2_gnu;		/* Bloom filter shift count */
     Elf32_Word dynsymcount;		/* Total entries in dynsym table */
     Elf_Addr *bloom_gnu;		/* Bloom filter used by GNU hash func */
     const Elf_Hashelt *buckets_gnu;	/* GNU hash table bucket array */
     const Elf_Hashelt *chain_zero_gnu;	/* GNU hash table value array (Zeroed) */
 
     char *rpath;		/* Search path specified in object */
     char *runpath;		/* Search path with different priority */
     Needed_Entry *needed;	/* Shared objects needed by this one (%) */
     Needed_Entry *needed_filtees;
     Needed_Entry *needed_aux_filtees;
 
     STAILQ_HEAD(, Struct_Name_Entry) names; /* List of names for this object we
 					       know about. */
     Ver_Entry *vertab;		/* Versions required /defined by this object */
     int vernum;			/* Number of entries in vertab */
 
     Elf_Addr init;		/* Initialization function to call */
     Elf_Addr fini;		/* Termination function to call */
     Elf_Addr preinit_array;	/* Pre-initialization array of functions */
     Elf_Addr init_array;	/* Initialization array of functions */
     Elf_Addr fini_array;	/* Termination array of functions */
     int preinit_array_num;	/* Number of entries in preinit_array */
     int init_array_num; 	/* Number of entries in init_array */
     int fini_array_num; 	/* Number of entries in fini_array */
 
     int32_t osrel;		/* OSREL note value */
 
     bool mainprog : 1;		/* True if this is the main program */
     bool rtld : 1;		/* True if this is the dynamic linker */
     bool relocated : 1;		/* True if processed by relocate_objects() */
     bool ver_checked : 1;	/* True if processed by rtld_verify_object_versions */
     bool textrel : 1;		/* True if there are relocations to text seg */
     bool symbolic : 1;		/* True if generated with "-Bsymbolic" */
     bool bind_now : 1;		/* True if all relocations should be made first */
     bool traced : 1;		/* Already printed in ldd trace output */
     bool jmpslots_done : 1;	/* Already have relocated the jump slots */
     bool init_done : 1;		/* Already have added object to init list */
     bool tls_done : 1;		/* Already allocated offset for static TLS */
     bool phdr_alloc : 1;	/* Phdr is allocated and needs to be freed. */
     bool z_origin : 1;		/* Process rpath and soname tokens */
     bool z_nodelete : 1;	/* Do not unload the object and dependencies */
     bool z_noopen : 1;		/* Do not load on dlopen */
     bool z_loadfltr : 1;	/* Immediately load filtees */
     bool z_interpose : 1;	/* Interpose all objects but main */
     bool z_nodeflib : 1;	/* Don't search default library path */
+    bool z_global : 1;		/* Make the object global */
     bool ref_nodel : 1;		/* Refcount increased to prevent dlclose */
     bool init_scanned: 1;	/* Object is already on init list. */
     bool on_fini_list: 1;	/* Object is already on fini list. */
     bool dag_inited : 1;	/* Object has its DAG initialized. */
     bool filtees_loaded : 1;	/* Filtees loaded */
     bool irelative : 1;		/* Object has R_MACHDEP_IRELATIVE relocs */
     bool gnu_ifunc : 1;		/* Object has references to STT_GNU_IFUNC */
     bool non_plt_gnu_ifunc : 1;	/* Object has non-plt IFUNC references */
     bool crt_no_init : 1;	/* Object' crt does not call _init/_fini */
     bool valid_hash_sysv : 1;	/* A valid System V hash hash tag is available */
     bool valid_hash_gnu : 1;	/* A valid GNU hash tag is available */
     bool dlopened : 1;		/* dlopen()-ed (vs. load statically) */
 
     struct link_map linkmap;	/* For GDB and dlinfo() */
     Objlist dldags;		/* Object belongs to these dlopened DAGs (%) */
     Objlist dagmembers;		/* DAG has these members (%) */
     dev_t dev;			/* Object's filesystem's device */
     ino_t ino;			/* Object's inode number */
     void *priv;			/* Platform-dependent */
 } Obj_Entry;
 
 #define RTLD_MAGIC	0xd550b87a
 #define RTLD_VERSION	1
 
 #define RTLD_STATIC_TLS_EXTRA	128
 
 /* Flags to be passed into symlook_ family of functions. */
 #define SYMLOOK_IN_PLT	0x01	/* Lookup for PLT symbol */
 #define SYMLOOK_DLSYM	0x02	/* Return newest versioned symbol. Used by
 				   dlsym. */
 #define	SYMLOOK_EARLY	0x04	/* Symlook is done during initialization. */
 #define	SYMLOOK_IFUNC	0x08	/* Allow IFUNC processing in
 				   reloc_non_plt(). */
 
 /* Flags for load_object(). */
 #define	RTLD_LO_NOLOAD	0x01	/* dlopen() specified RTLD_NOLOAD. */
 #define	RTLD_LO_DLOPEN	0x02	/* Load_object() called from dlopen(). */
 #define	RTLD_LO_TRACE	0x04	/* Only tracing. */
 #define	RTLD_LO_NODELETE 0x08	/* Loaded object cannot be closed. */
 #define	RTLD_LO_FILTEES 0x10	/* Loading filtee. */
 #define	RTLD_LO_EARLY	0x20	/* Do not call ctors, postpone it to the
 				   initialization during the image start. */
 
 /*
  * Symbol cache entry used during relocation to avoid multiple lookups
  * of the same symbol.
  */
 typedef struct Struct_SymCache {
     const Elf_Sym *sym;		/* Symbol table entry */
     const Obj_Entry *obj;	/* Shared object which defines it */
 } SymCache;
 
 /*
  * This structure provides a reentrant way to keep a list of objects and
  * check which ones have already been processed in some way.
  */
 typedef struct Struct_DoneList {
     const Obj_Entry **objs;		/* Array of object pointers */
     unsigned int num_alloc;		/* Allocated size of the array */
     unsigned int num_used;		/* Number of array slots used */
 } DoneList;
 
 struct Struct_RtldLockState {
 	int lockstate;
 	sigjmp_buf env;
 };
 
 struct fill_search_info_args {
 	int request;
 	unsigned int flags;
 	struct dl_serinfo *serinfo;
 	struct dl_serpath *serpath;
 	char *strspace;
 };
 
 /*
  * The pack of arguments and results for the symbol lookup functions.
  */
 typedef struct Struct_SymLook {
     const char *name;
     unsigned long hash;
     uint32_t hash_gnu;
     const Ver_Entry *ventry;
     int flags;
     const Obj_Entry *defobj_out;
     const Elf_Sym *sym_out;
     struct Struct_RtldLockState *lockstate;
 } SymLook;
 
 void _rtld_error(const char *, ...) __printflike(1, 2) __exported;
 void rtld_die(void) __dead2;
 const char *rtld_strerror(int);
 Obj_Entry *map_object(int, const char *, const struct stat *);
 void *xcalloc(size_t, size_t);
 void *xmalloc(size_t);
 char *xstrdup(const char *);
 void *malloc_aligned(size_t size, size_t align);
 void free_aligned(void *ptr);
 extern Elf_Addr _GLOBAL_OFFSET_TABLE_[];
 extern Elf_Sym sym_zero;	/* For resolving undefined weak refs. */
 
 void dump_relocations(Obj_Entry *);
 void dump_obj_relocations(Obj_Entry *);
 void dump_Elf_Rel(Obj_Entry *, const Elf_Rel *, u_long);
 void dump_Elf_Rela(Obj_Entry *, const Elf_Rela *, u_long);
 
 /*
  * Function declarations.
  */
 unsigned long elf_hash(const char *);
 const Elf_Sym *find_symdef(unsigned long, const Obj_Entry *,
   const Obj_Entry **, int, SymCache *, struct Struct_RtldLockState *);
 void init_pltgot(Obj_Entry *);
 void lockdflt_init(void);
 void digest_notes(Obj_Entry *, Elf_Addr, Elf_Addr);
 void obj_free(Obj_Entry *);
 Obj_Entry *obj_new(void);
 void _rtld_bind_start(void);
 void *rtld_resolve_ifunc(const Obj_Entry *obj, const Elf_Sym *def);
 void symlook_init(SymLook *, const char *);
 int symlook_obj(SymLook *, const Obj_Entry *);
 void *tls_get_addr_common(Elf_Addr** dtvp, int index, size_t offset);
 void *allocate_tls(Obj_Entry *, void *, size_t, size_t);
 void free_tls(void *, size_t, size_t);
 void *allocate_module_tls(int index);
 bool allocate_tls_offset(Obj_Entry *obj);
 void free_tls_offset(Obj_Entry *obj);
 const Ver_Entry *fetch_ventry(const Obj_Entry *obj, unsigned long);
 
 /*
  * MD function declarations.
  */
 int do_copy_relocations(Obj_Entry *);
 int reloc_non_plt(Obj_Entry *, Obj_Entry *, int flags,
     struct Struct_RtldLockState *);
 int reloc_plt(Obj_Entry *);
 int reloc_jmpslots(Obj_Entry *, int flags, struct Struct_RtldLockState *);
 int reloc_iresolve(Obj_Entry *, struct Struct_RtldLockState *);
 int reloc_gnu_ifunc(Obj_Entry *, int flags, struct Struct_RtldLockState *);
 void allocate_initial_tls(Obj_Entry *);
 
 #endif /* } */
Index: user/ngie/more-tests/sys/arm64/arm64/vfp.c
===================================================================
--- user/ngie/more-tests/sys/arm64/arm64/vfp.c	(revision 281584)
+++ user/ngie/more-tests/sys/arm64/arm64/vfp.c	(revision 281585)
@@ -1,194 +1,195 @@
 /*-
  * Copyright (c) 2015 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Andrew Turner under
  * sponsorship from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #ifdef VFP
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/kernel.h>
 #include <sys/pcpu.h>
 #include <sys/proc.h>
 
 #include <machine/armreg.h>
 #include <machine/pcb.h>
 #include <machine/vfp.h>
 
 /* Sanity check we can store all the VFP registers */
 CTASSERT(sizeof(((struct pcb *)0)->pcb_vfp) == 16 * 32);
 
 static void
 vfp_enable(void)
 {
 	uint32_t cpacr;
 
 	cpacr = READ_SPECIALREG(cpacr_el1);
 	cpacr = (cpacr & ~CPACR_FPEN_MASK) | CPACR_FPEN_TRAP_NONE;
 	WRITE_SPECIALREG(cpacr_el1, cpacr);
 	isb();
 }
 
 static void
 vfp_disable(void)
 {
 	uint32_t cpacr;
 
 	cpacr = READ_SPECIALREG(cpacr_el1);
 	cpacr = (cpacr & ~CPACR_FPEN_MASK) | CPACR_FPEN_TRAP_ALL1;
 	WRITE_SPECIALREG(cpacr_el1, cpacr);
 	isb();
 }
 
 /*
  * Called when the thread is dying. If the thread was the last to use the
  * VFP unit mark it as unused to tell the kernel the fp state is unowned.
  * Ensure the VFP unit is off so we get an exception on the next access.
  */
 void
 vfp_discard(struct thread *td)
 {
 
 	if (PCPU_GET(fpcurthread) == td)
 		PCPU_SET(fpcurthread, NULL);
 
 	vfp_disable();
 }
 
 void
 vfp_save_state(struct thread *td)
 {
 	__int128_t *vfp_state;
 	uint64_t fpcr, fpsr;
 	uint32_t cpacr;
 
+	critical_enter();
 	/*
 	 * Only store the registers if the VFP is enabled,
 	 * i.e. return if we are trapping on FP access.
 	 */
 	cpacr = READ_SPECIALREG(cpacr_el1);
-	if ((cpacr & CPACR_FPEN_MASK) != CPACR_FPEN_TRAP_NONE)
-		return;
+	if ((cpacr & CPACR_FPEN_MASK) == CPACR_FPEN_TRAP_NONE) {
+		vfp_state = td->td_pcb->pcb_vfp;
+		__asm __volatile(
+		    "mrs	%0, fpcr		\n"
+		    "mrs	%1, fpsr		\n"
+		    "stp	q0,  q1,  [%2, #16 *  0]\n"
+		    "stp	q2,  q3,  [%2, #16 *  2]\n"
+		    "stp	q4,  q5,  [%2, #16 *  4]\n"
+		    "stp	q6,  q7,  [%2, #16 *  6]\n"
+		    "stp	q8,  q9,  [%2, #16 *  8]\n"
+		    "stp	q10, q11, [%2, #16 * 10]\n"
+		    "stp	q12, q13, [%2, #16 * 12]\n"
+		    "stp	q14, q15, [%2, #16 * 14]\n"
+		    "stp	q16, q17, [%2, #16 * 16]\n"
+		    "stp	q18, q19, [%2, #16 * 18]\n"
+		    "stp	q20, q21, [%2, #16 * 20]\n"
+		    "stp	q22, q23, [%2, #16 * 22]\n"
+		    "stp	q24, q25, [%2, #16 * 24]\n"
+		    "stp	q26, q27, [%2, #16 * 26]\n"
+		    "stp	q28, q29, [%2, #16 * 28]\n"
+		    "stp	q30, q31, [%2, #16 * 30]\n"
+		    : "=&r"(fpcr), "=&r"(fpsr) : "r"(vfp_state));
 
-	vfp_state = td->td_pcb->pcb_vfp;
-	__asm __volatile(
-	    "mrs	%0, fpcr		\n"
-	    "mrs	%1, fpsr		\n"
-	    "stp	q0,  q1,  [%2, #16 *  0]\n"
-	    "stp	q2,  q3,  [%2, #16 *  2]\n"
-	    "stp	q4,  q5,  [%2, #16 *  4]\n"
-	    "stp	q6,  q7,  [%2, #16 *  6]\n"
-	    "stp	q8,  q9,  [%2, #16 *  8]\n"
-	    "stp	q10, q11, [%2, #16 * 10]\n"
-	    "stp	q12, q13, [%2, #16 * 12]\n"
-	    "stp	q14, q15, [%2, #16 * 14]\n"
-	    "stp	q16, q17, [%2, #16 * 16]\n"
-	    "stp	q18, q19, [%2, #16 * 18]\n"
-	    "stp	q20, q21, [%2, #16 * 20]\n"
-	    "stp	q22, q23, [%2, #16 * 22]\n"
-	    "stp	q24, q25, [%2, #16 * 24]\n"
-	    "stp	q26, q27, [%2, #16 * 26]\n"
-	    "stp	q28, q29, [%2, #16 * 28]\n"
-	    "stp	q30, q31, [%2, #16 * 30]\n"
-	    : "=&r"(fpcr), "=&r"(fpsr) : "r"(vfp_state));
+		td->td_pcb->pcb_fpcr = fpcr;
+		td->td_pcb->pcb_fpsr = fpsr;
 
-	td->td_pcb->pcb_fpcr = fpcr;
-	td->td_pcb->pcb_fpsr = fpsr;
-
-	dsb();
-	vfp_disable();
+		dsb();
+		vfp_disable();
+	}
+	critical_exit();
 }
 
 void
 vfp_restore_state(void)
 {
 	__int128_t *vfp_state;
 	uint64_t fpcr, fpsr;
 	struct pcb *curpcb;
 	u_int cpu;
 
 	critical_enter();
 
 	cpu = PCPU_GET(cpuid);
 	curpcb = curthread->td_pcb;
 	curpcb->pcb_fpflags |= PCB_FP_STARTED;
 
 	vfp_enable();
 
 	if (PCPU_GET(fpcurthread) != curthread && cpu != curpcb->pcb_vfpcpu) {
 
 		vfp_state = curthread->td_pcb->pcb_vfp;
 		fpcr = curthread->td_pcb->pcb_fpcr;
 		fpsr = curthread->td_pcb->pcb_fpsr;
 
 		__asm __volatile(
 		    "ldp	q0,  q1,  [%2, #16 *  0]\n"
 		    "ldp	q2,  q3,  [%2, #16 *  2]\n"
 		    "ldp	q4,  q5,  [%2, #16 *  4]\n"
 		    "ldp	q6,  q7,  [%2, #16 *  6]\n"
 		    "ldp	q8,  q9,  [%2, #16 *  8]\n"
 		    "ldp	q10, q11, [%2, #16 * 10]\n"
 		    "ldp	q12, q13, [%2, #16 * 12]\n"
 		    "ldp	q14, q15, [%2, #16 * 14]\n"
 		    "ldp	q16, q17, [%2, #16 * 16]\n"
 		    "ldp	q18, q19, [%2, #16 * 18]\n"
 		    "ldp	q20, q21, [%2, #16 * 20]\n"
 		    "ldp	q22, q23, [%2, #16 * 22]\n"
 		    "ldp	q24, q25, [%2, #16 * 24]\n"
 		    "ldp	q26, q27, [%2, #16 * 26]\n"
 		    "ldp	q28, q29, [%2, #16 * 28]\n"
 		    "ldp	q30, q31, [%2, #16 * 30]\n"
 		    "msr	fpcr, %0		\n"
 		    "msr	fpsr, %1		\n"
 		    : : "r"(fpcr), "r"(fpsr), "r"(vfp_state));
 
 		PCPU_SET(fpcurthread, curthread);
 		curpcb->pcb_vfpcpu = cpu;
 	}
 
 	critical_exit();
 }
 
 void
 vfp_init(void)
 {
 	uint64_t pfr;
 
 	/* Check if there is a vfp unit present */
 	pfr = READ_SPECIALREG(id_aa64pfr0_el1);
 	if ((pfr & ID_AA64PFR0_FP_MASK) == ID_AA64PFR0_FP_NONE)
 		return;
 
 	/* Disable to be enabled when it's used */
 	vfp_disable();
 }
 
 SYSINIT(vfp, SI_SUB_CPU, SI_ORDER_ANY, vfp_init, NULL);
 
 #endif
Index: user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c
===================================================================
--- user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c	(revision 281584)
+++ user/ngie/more-tests/sys/arm64/arm64/vm_machdep.c	(revision 281585)
@@ -1,248 +1,265 @@
 /*-
  * Copyright (c) 2014 Andrew Turner
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/limits.h>
 #include <sys/proc.h>
 #include <sys/sf_buf.h>
 #include <sys/signal.h>
 #include <sys/unistd.h>
 
 #include <vm/vm.h>
 #include <vm/vm_page.h>
 #include <vm/vm_map.h>
 #include <vm/uma.h>
 #include <vm/uma_int.h>
 
 #include <machine/armreg.h>
 #include <machine/cpu.h>
 #include <machine/pcb.h>
 #include <machine/frame.h>
 
+#ifdef VFP
+#include <machine/vfp.h>
+#endif
+
 /*
  * Finish a fork operation, with process p2 nearly set up.
  * Copy and update the pcb, set up the stack so that the child
  * ready to run and return to user mode.
  */
 void
 cpu_fork(struct thread *td1, struct proc *p2, struct thread *td2, int flags)
 {
 	struct pcb *pcb2;
 	struct trapframe *tf;
 
 	if ((flags & RFPROC) == 0)
 		return;
+
+	if (td1 == curthread) {
+		/*
+		 * Save tpidr_el0 and the VFP state.  This normally happens
+		 * in cpu_switch(), but if userland changes them and then
+		 * forks, it may not have happened yet.
+		 */
+		td1->td_pcb->pcb_tpidr_el0 = READ_SPECIALREG(tpidr_el0);
+#ifdef VFP
+		if ((td1->td_pcb->pcb_fpflags & PCB_FP_STARTED) != 0)
+			vfp_save_state(td1);
+#endif
+	}
 
 	pcb2 = (struct pcb *)(td2->td_kstack +
 	    td2->td_kstack_pages * PAGE_SIZE) - 1;
 
 	td2->td_pcb = pcb2;
 	bcopy(td1->td_pcb, pcb2, sizeof(*pcb2));
 
 	td2->td_pcb->pcb_l1addr =
 	    vtophys(vmspace_pmap(td2->td_proc->p_vmspace)->pm_l1);
 
 	tf = (struct trapframe *)STACKALIGN((struct trapframe *)pcb2 - 1);
 	bcopy(td1->td_frame, tf, sizeof(*tf));
 	tf->tf_x[0] = 0;
 	tf->tf_x[1] = 0;
 	tf->tf_spsr = 0;
 
 	td2->td_frame = tf;
 
 	/* Set the return value registers for fork() */
 	td2->td_pcb->pcb_x[8] = (uintptr_t)fork_return;
 	td2->td_pcb->pcb_x[9] = (uintptr_t)td2;
 	td2->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline;
 	td2->td_pcb->pcb_sp = (uintptr_t)td2->td_frame;
 	td2->td_pcb->pcb_vfpcpu = UINT_MAX;
 
 	/* Setup to release spin count in fork_exit(). */
 	td2->td_md.md_spinlock_count = 1;
 	td2->td_md.md_saved_daif = 0;
 }
 
 void
 cpu_reset(void)
 {
 
 	printf("cpu_reset");
 	while(1)
 		__asm volatile("wfi" ::: "memory");
 }
 
 void
 cpu_thread_swapin(struct thread *td)
 {
 }
 
 void
 cpu_thread_swapout(struct thread *td)
 {
 }
 
 void
 cpu_set_syscall_retval(struct thread *td, int error)
 {
 	struct trapframe *frame;
 
 	frame = td->td_frame;
 
 	switch (error) {
 	case 0:
 		frame->tf_x[0] = td->td_retval[0];
 		frame->tf_x[1] = td->td_retval[1];
 		frame->tf_spsr &= ~PSR_C;	/* carry bit */
 		break;
 	case ERESTART:
 		frame->tf_elr -= 4;
 		break;
 	case EJUSTRETURN:
 		break;
 	default:
 		frame->tf_spsr |= PSR_C;	/* carry bit */
 		frame->tf_x[0] = error;
 		break;
 	}
 }
 
 /*
  * Initialize machine state (pcb and trap frame) for a new thread about to
  * upcall. Put enough state in the new thread's PCB to get it to go back
  * userret(), where we can intercept it again to set the return (upcall)
  * Address and stack, along with those from upcals that are from other sources
  * such as those generated in thread_userret() itself.
  */
 void
 cpu_set_upcall(struct thread *td, struct thread *td0)
 {
 	bcopy(td0->td_frame, td->td_frame, sizeof(struct trapframe));
 	bcopy(td0->td_pcb, td->td_pcb, sizeof(struct pcb));
 
 	td->td_pcb->pcb_x[8] = (uintptr_t)fork_return;
 	td->td_pcb->pcb_x[9] = (uintptr_t)td;
 	td->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline;
 	td->td_pcb->pcb_sp = (uintptr_t)td->td_frame;
 	td->td_pcb->pcb_vfpcpu = UINT_MAX;
 
 	/* Setup to release spin count in fork_exit(). */
 	td->td_md.md_spinlock_count = 1;
 	td->td_md.md_saved_daif = 0;
 }
 
 /*
  * Set that machine state for performing an upcall that has to
  * be done in thread_userret() so that those upcalls generated
  * in thread_userret() itself can be done as well.
  */
 void
 cpu_set_upcall_kse(struct thread *td, void (*entry)(void *), void *arg,
 	stack_t *stack)
 {
 
 	panic("cpu_set_upcall_kse");
 }
 
 int
 cpu_set_user_tls(struct thread *td, void *tls_base)
 {
 
 	panic("cpu_set_user_tls");
 }
 
 void
 cpu_thread_exit(struct thread *td)
 {
 }
 
 void
 cpu_thread_alloc(struct thread *td)
 {
 
 	td->td_pcb = (struct pcb *)(td->td_kstack +
 	    td->td_kstack_pages * PAGE_SIZE) - 1;
 	td->td_frame = (struct trapframe *)STACKALIGN(
 	    td->td_pcb - 1);
 }
 
 void
 cpu_thread_free(struct thread *td)
 {
 }
 
 void
 cpu_thread_clean(struct thread *td)
 {
 }
 
 /*
  * Intercept the return address from a freshly forked process that has NOT
  * been scheduled yet.
  *
  * This is needed to make kernel threads stay in kernel mode.
  */
 void
 cpu_set_fork_handler(struct thread *td, void (*func)(void *), void *arg)
 {
 
 	td->td_pcb->pcb_x[8] = (uintptr_t)func;
 	td->td_pcb->pcb_x[9] = (uintptr_t)arg;
 	td->td_pcb->pcb_x[PCB_LR] = (uintptr_t)fork_trampoline;
 	td->td_pcb->pcb_sp = (uintptr_t)td->td_frame;
 	td->td_pcb->pcb_vfpcpu = UINT_MAX;
 }
 
 void
 cpu_exit(struct thread *td)
 {
 }
 
 void
 swi_vm(void *v)
 {
 
 	/* Nothing to do here - busdma bounce buffers are not implemented. */
 }
 
 void *
 uma_small_alloc(uma_zone_t zone, vm_size_t bytes, u_int8_t *flags, int wait)
 {
 
 	panic("uma_small_alloc");
 }
 
 void
 uma_small_free(void *mem, vm_size_t size, u_int8_t flags)
 {
 
 	panic("uma_small_free");
 }
 
Index: user/ngie/more-tests/sys/arm64/include/psl.h
===================================================================
--- user/ngie/more-tests/sys/arm64/include/psl.h	(nonexistent)
+++ user/ngie/more-tests/sys/arm64/include/psl.h	(revision 281585)
@@ -0,0 +1 @@
+/* $FreeBSD$ */

Property changes on: user/ngie/more-tests/sys/arm64/include/psl.h
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/Makefile.arm
===================================================================
--- user/ngie/more-tests/sys/boot/Makefile.arm	(revision 281584)
+++ user/ngie/more-tests/sys/boot/Makefile.arm	(revision 281585)
@@ -1,7 +1,7 @@
 # $FreeBSD$
 
 .if ${MK_FDT} != "no"
 SUBDIR+=		fdt
 .endif
 
-SUBDIR+=		uboot
+SUBDIR+=		efi uboot
Index: user/ngie/more-tests/sys/boot/Makefile.arm64
===================================================================
--- user/ngie/more-tests/sys/boot/Makefile.arm64	(nonexistent)
+++ user/ngie/more-tests/sys/boot/Makefile.arm64	(revision 281585)
@@ -0,0 +1,7 @@
+# $FreeBSD$
+
+.if ${MK_FDT} != "no"
+SUBDIR+=		fdt
+.endif
+
+SUBDIR+=		efi

Property changes on: user/ngie/more-tests/sys/boot/Makefile.arm64
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/arm64/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/arm64/Makefile	(nonexistent)
+++ user/ngie/more-tests/sys/boot/arm64/Makefile	(revision 281585)
@@ -0,0 +1,3 @@
+# $FreeBSD$
+
+.include <bsd.subdir.mk>

Property changes on: user/ngie/more-tests/sys/boot/arm64/Makefile
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c
===================================================================
--- user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c	(nonexistent)
+++ user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c	(revision 281585)
@@ -0,0 +1,95 @@
+/*-
+ * Copyright (c) 2014 The FreeBSD Foundation
+ * All rights reserved.
+ *
+ * This software was developed by Semihalf under
+ * the sponsorship of the FreeBSD Foundation.
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/param.h>
+
+#include <machine/armreg.h>
+
+#include <stand.h>
+#include <efi.h>
+
+#include "cache.h"
+
+static unsigned int
+get_dcache_line_size(void)
+{
+	uint64_t ctr;
+	unsigned int dcl_size;
+
+	/* Accessible from all security levels */
+	ctr = READ_SPECIALREG(ctr_el0);
+
+	/*
+	 * CTR_EL0.DminLine, bits [19:16], is log2 of the number
+	 * of words in the smallest data cache line
+	 */
+	dcl_size = CTR_DLINE_SIZE(ctr);
+
+	/* Line size in bytes: word size shifted by log2(words per line) */
+	return (sizeof(int) << dcl_size);
+}
+
+void
+cpu_flush_dcache(const void *ptr, size_t len)
+{
+
+	uint64_t cl_size;
+	vm_offset_t addr, end;
+
+	cl_size = get_dcache_line_size();
+
+	/* Calculate end address to clean */
+	end = (vm_offset_t)ptr + len;
+	/* Align start address to cache line */
+	addr = (vm_offset_t)ptr;
+	addr = rounddown2(addr, cl_size);
+
+	for (; addr < end; addr += cl_size)
+		__asm __volatile("dc	civac, %0" : : "r" (addr) : "memory");
+	/* Full system DSB */
+	__asm __volatile("dsb	sy" : : : "memory");
+}
+
+void
+cpu_inval_icache(const void *ptr, size_t len)
+{
+
+	/* NULL ptr or 0 len means all */
+	if (ptr == NULL || len == 0) {
+		__asm __volatile(
+		    "ic		ialluis	\n"
+		    "dsb	ish	\n"
+		    : : : "memory");
+		return;
+	}
+
+	/* TODO: Other cache ranges if necessary */
+}

Property changes on: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.c
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h
===================================================================
--- user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h	(nonexistent)
+++ user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h	(revision 281585)
@@ -0,0 +1,38 @@
+/*-
+ * Copyright (c) 2014 The FreeBSD Foundation
+ * All rights reserved.
+ *
+ * This software was developed by Semihalf under
+ * the sponsorship of the FreeBSD Foundation.
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef _CACHE_H_
+#define	_CACHE_H_
+
+/* cache.c */
+void cpu_flush_dcache(const void *, size_t);
+void cpu_inval_icache(const void *, size_t);
+
+#endif /* _CACHE_H_ */

Property changes on: user/ngie/more-tests/sys/boot/arm64/libarm64/cache.h
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/common/Makefile.inc
===================================================================
--- user/ngie/more-tests/sys/boot/common/Makefile.inc	(revision 281584)
+++ user/ngie/more-tests/sys/boot/common/Makefile.inc	(revision 281585)
@@ -1,73 +1,75 @@
 # $FreeBSD$
 
 SRCS+=	boot.c commands.c console.c devopen.c interp.c 
 SRCS+=	interp_backslash.c interp_parse.c ls.c misc.c 
 SRCS+=	module.c panic.c
 
 .if ${MACHINE} == "i386" || ${MACHINE_CPUARCH} == "amd64"
 SRCS+=	load_elf32.c load_elf32_obj.c reloc_elf32.c
 SRCS+=	load_elf64.c load_elf64_obj.c reloc_elf64.c
 .elif ${MACHINE} == "pc98"
 SRCS+=	load_elf32.c load_elf32_obj.c reloc_elf32.c
+.elif ${MACHINE_CPUARCH} == "aarch64"
+SRCS+=	load_elf64.c reloc_elf64.c
 .elif ${MACHINE_CPUARCH} == "arm"
 SRCS+=	load_elf32.c reloc_elf32.c
 .elif ${MACHINE_CPUARCH} == "powerpc"
 SRCS+=	load_elf32.c reloc_elf32.c
 SRCS+=	load_elf64.c reloc_elf64.c
 .elif ${MACHINE_CPUARCH} == "sparc64"
 SRCS+=	load_elf64.c reloc_elf64.c
 .elif ${MACHINE_ARCH} == "mips64" || ${MACHINE_ARCH} == "mips64el"
 SRCS+= load_elf64.c reloc_elf64.c
 .endif
 
 .if defined(LOADER_NET_SUPPORT)
 SRCS+=	dev_net.c
 .endif
 
 .if !defined(LOADER_NO_DISK_SUPPORT)
 SRCS+=	disk.c part.c
 CFLAGS+= -DLOADER_DISK_SUPPORT
 .if !defined(LOADER_NO_GPT_SUPPORT)
 SRCS+=	crc32.c
 CFLAGS+= -DLOADER_GPT_SUPPORT
 .endif
 .if !defined(LOADER_NO_MBR_SUPPORT)
 CFLAGS+= -DLOADER_MBR_SUPPORT
 .endif
 .endif
 
 .if defined(HAVE_BCACHE)
 SRCS+=  bcache.c
 .endif
 
 .if defined(MD_IMAGE_SIZE)
 CFLAGS+= -DMD_IMAGE_SIZE=${MD_IMAGE_SIZE}
 SRCS+=	md.c
 .endif
 
 # Machine-independant ISA PnP
 .if defined(HAVE_ISABUS)
 SRCS+=	isapnp.c
 .endif
 .if defined(HAVE_PNP)
 SRCS+=	pnp.c
 .endif
 
 # Forth interpreter
 .if defined(BOOT_FORTH)
 SRCS+=	interp_forth.c
 .endif
 
 .if defined(BOOT_PROMPT_123)
 CFLAGS+=	-DBOOT_PROMPT_123
 .endif
 
 .if defined(LOADER_INSTALL_SUPPORT)
 SRCS+=	install.c
 CFLAGS+=-I${.CURDIR}/../../../../lib/libstand
 .endif
 
 MAN+=	loader.8
 .if ${MK_ZFS} != "no"
 MAN+=	zfsloader.8
 .endif
Index: user/ngie/more-tests/sys/boot/efi/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/efi/Makefile	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/Makefile	(revision 281585)
@@ -1,17 +1,19 @@
 # $FreeBSD$
 
 .include <src.opts.mk>
 
 SUBDIR=		libefi
 
 .if ${MACHINE_CPUARCH} == "aarch64" || ${MACHINE_CPUARCH} == "arm"
 .if ${MK_FDT} != "no"
 SUBDIR+=	fdt
 .endif
 .endif
 
-.if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "arm"
+.if ${MACHINE_CPUARCH} == "aarch64" || \
+    ${MACHINE_CPUARCH} == "amd64" || \
+    ${MACHINE_CPUARCH} == "arm"
 SUBDIR+=	loader boot1
 .endif
 
 .include <bsd.subdir.mk>
Index: user/ngie/more-tests/sys/boot/efi/boot1/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/efi/boot1/Makefile	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/boot1/Makefile	(revision 281585)
@@ -1,106 +1,106 @@
 # $FreeBSD$
 
 MAN=
 
 .include <bsd.own.mk>
 
 # In-tree GCC does not support __attribute__((ms_abi)).
 .if ${COMPILER_TYPE} != "gcc"
 
 MK_SSP=		no
 
 PROG=		loader.sym
 INTERNALPROG=
 
 # architecture-specific loader code
 SRCS=	boot1.c reloc.c start.S
 
 CFLAGS+=	-I.
 CFLAGS+=	-I${.CURDIR}/../include
-CFLAGS+=	-I${.CURDIR}/../include/${MACHINE_CPUARCH}
+CFLAGS+=	-I${.CURDIR}/../include/${MACHINE}
 CFLAGS+=	-I${.CURDIR}/../../../contrib/dev/acpica/include
 CFLAGS+=	-I${.CURDIR}/../../..
 
 # Always add MI sources and REGULAR efi loader bits
-.PATH:		${.CURDIR}/../loader/arch/${MACHINE_CPUARCH}
+.PATH:		${.CURDIR}/../loader/arch/${MACHINE}
 .PATH:		${.CURDIR}/../loader
 .PATH:		${.CURDIR}/../../common
 CFLAGS+=	-I${.CURDIR}/../../common
 
 FILES=	boot1.efi boot1.efifat
 FILESMODE_boot1.efi=	${BINMODE}
 
-LDSCRIPT=	${.CURDIR}/../loader/arch/${MACHINE_CPUARCH}/ldscript.${MACHINE_CPUARCH}
+LDSCRIPT=	${.CURDIR}/../loader/arch/${MACHINE}/ldscript.${MACHINE}
 LDFLAGS=	-Wl,-T${LDSCRIPT} -Wl,-Bsymbolic -shared
 
 .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
 CFLAGS+=	-fPIC
 LDFLAGS+=	-Wl,-znocombreloc
 .endif
 
 .if ${MACHINE_CPUARCH} == "arm" || ${MACHINE_CPUARCH} == "i386"
 #
 # Add libstand for the runtime functions used by the compiler - for example
 # __aeabi_* (arm) or __divdi3 (i386).
 #
 DPADD+=		${LIBSTAND}
 LDADD+=		-lstand
 .endif
 
 ${PROG}:	${LDSCRIPT}
 
 OBJCOPY?=	objcopy
 OBJDUMP?=	objdump
 
 .if ${MACHINE_CPUARCH} == "amd64"
 EFI_TARGET=	efi-app-x86_64
 .elif ${MACHINE_CPUARCH} == "i386"
 EFI_TARGET=	efi-app-ia32
 .else
 EFI_TARGET=	binary
 .endif
 
 boot1.efi: loader.sym
 	if [ `${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*' | wc -l` != 0 ]; then \
 		${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*'; \
 		exit 1; \
 	fi
 	${OBJCOPY} -j .peheader -j .text -j .sdata -j .data \
 		-j .dynamic -j .dynsym -j .rel.dyn \
 		-j .rela.dyn -j .reloc -j .eh_frame -j set_Xcommand_set \
 		--output-target=${EFI_TARGET} ${.ALLSRC} ${.TARGET}
 
 boot1.o: ${.CURDIR}/../../common/ufsread.c
 
 # The following inserts out objects into a template FAT file system
 # created by generate-fat.sh
 
 .include "${.CURDIR}/Makefile.fat"
 
 boot1.efifat: boot1.efi
 	echo ${.OBJDIR}
-	uudecode ${.CURDIR}/fat-${MACHINE_CPUARCH}.tmpl.bz2.uu
-	mv fat-${MACHINE_CPUARCH}.tmpl.bz2 ${.TARGET}.bz2
+	uudecode ${.CURDIR}/fat-${MACHINE}.tmpl.bz2.uu
+	mv fat-${MACHINE}.tmpl.bz2 ${.TARGET}.bz2
 	bzip2 -f -d ${.TARGET}.bz2
 	dd if=boot1.efi of=${.TARGET} seek=${BOOT1_OFFSET} conv=notrunc
 
 CLEANFILES= boot1.efi boot1.efifat
 
 .endif # ${COMPILER_TYPE} != "gcc"
 
 .include <bsd.prog.mk>
 
 beforedepend ${OBJS}: machine
 
 CLEANFILES+=   machine
 
 machine:
 	ln -sf ${.CURDIR}/../../../${MACHINE}/include machine
 
 .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
 beforedepend ${OBJS}: x86
 CLEANFILES+=   x86
 
 x86:
 	ln -sf ${.CURDIR}/../../../x86/include x86
 .endif
Index: user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu
===================================================================
--- user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu	(revision 281585)
@@ -0,0 +1,26 @@
+FAT template boot filesystem created by generate-fat.sh
+DO NOT EDIT
+$FreeBSD$
+begin 644 fat-arm64.tmpl.bz2
+M0EIH.3%!62936:2BH:(`&T#_____ZZKJ[_^N_ZO_J_Z[OJ_NJ^JK^KZNKNNJ
+MZNKNZOJ^P`+\```"``:`9,@T&F3$,@!B`,AIHP$#0-`T``,09--&31IH9#)D
+M,(#$T&)B!A``-`,F0:#3)B&0`Q`&0TT8"!H&@:``&(,FFC)HTT,ADR&$!B:#
+M$Q`P@`&@&3(-!IDQ#(`8@#(::,!`T#0-``#$&331DT::&0R9#"`Q-!B8@8"J
+M*0GY$I&GH"-&AZ@T```T`:`!HT```&@&@-,C0`-`#U,(-`&)ZF(PGIJ>IO;U
+M^=&QL3`-\E@Q+$(RTHB$7I"(B(-W:73$0@A#;S##$3$`A#FL\LAF,;&8;[CE
+M&D=@ON\:9IWHO):QK7LL=LFN;1M6Y:%F>-1G^&O-A*(@0AQ,\#H*KRCJF>9Q
+MFF_,RWU4X-,R6K5EJU:M6L"JMB5555555JVJU*U-JU:M6MUB*I555555;XHJ
+ME555555YM='L;(N7+ERY<N4****************.2B55555555%%%%%%%%"B
+MBBBBBC#S/DW+ERY<N7+E&NE55555550HHHHHHHS_(SQEG'X7.OTSKFY=#*]*
+M9?C*,7MM&JO=0EIDQW:7C-0U;WGRM>_!^S^F[=KA#3YJ.S-)`!>O]K)`#-ZO
+MU=9T,X(@!H',-+,1Q'-6'#ZNQORGURS=]_O%.6SF5G,PC`G#X_@7W$RC>2Q)
+M9MW3P&G:AJFA?`^AKWXOV?R_QDL9F^`=5H>$UWWT9K&Q/HS.!KXB)U)9$6,)
+M*/!EJ7>+W2L65_C\&LP69G$?'M$18.(LL.G:AZ;%>TUKYF.V+9/W9AMF[<VT
+MS/X;%DL?1SW5Y]4L]R=+RLNCTMB[6_5C8C2+?DPWF/F.G]WN[5+:G4UOWC%E
+M*3V7BO??&^=COL?:V3]&T9#<-XWKX3\,BQ#[KW01%D(X'O<#8?*B.`D]1J&O
+M>&^]HFB?J_AM6V;ANF2RG7,61I)?])L35<Y8C.RWDHV3\GZOW;-TC:MLW#<O
+MG;QDO2<6_S5;,T$1["/3W^^]&4;3*_F\:?7"90R&E;=N']O@;ENF[?ZR6Q93
+M_F6SAO9,N65+9R:#69>2^EX5C12BSAX@$(3`!EWEZ6P2R$9#_Q=R13A0D*2B
+"H:(`
+`
+end

Property changes on: user/ngie/more-tests/sys/boot/efi/boot1/fat-arm64.tmpl.bz2.uu
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh
===================================================================
--- user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/boot1/generate-fat.sh	(revision 281585)
@@ -1,69 +1,69 @@
 #!/bin/sh
 
 # This script generates the dummy FAT filesystem used for the EFI boot
 # blocks. It uses newfs_msdos to generate a template filesystem with the
 # relevant interesting files. These are then found by grep, and the offsets
 # written to a Makefile snippet.
 #
 # Because it requires root, and because it is overkill, we do not
 # do this as part of the normal build. If makefs(8) grows workable FAT
 # support, this should be revisited.
 
 # $FreeBSD$
 
 FAT_SIZE=1600 			#Size in 512-byte blocks of the produced image
 
 BOOT1_SIZE=128k
 
 #
 # Known filenames
 # amd64:   BOOTx64.efi
-# aarch64: BOOTaa64.efi
+# arm64:   BOOTaa64.efi
 # arm:     BOOTarm.efi
 # i386:    BOOTia32.efi
 #
 if [ -z "$2" ]; then
 	echo "Usage: $0 arch boot-filename"
 	exit 1
 fi
 
 ARCH=$1
 FILENAME=$2
 
 # Generate 800K FAT image
 OUTPUT_FILE=fat-${ARCH}.tmpl
 
 dd if=/dev/zero of=$OUTPUT_FILE bs=512 count=$FAT_SIZE
 DEVICE=`mdconfig -a -f $OUTPUT_FILE`
 newfs_msdos -F 12 -L EFI $DEVICE
 mkdir stub
 mount -t msdosfs /dev/$DEVICE stub
 
 # Create and bless a directory for the boot loader
 mkdir -p stub/efi/boot
 
 # Make a dummy file for boot1
 echo 'Boot1 START' | dd of=stub/efi/boot/$FILENAME cbs=$BOOT1_SIZE count=1 conv=block
 
 umount stub
 mdconfig -d -u $DEVICE
 rmdir stub
 
 # Locate the offset of the fake file
 BOOT1_OFFSET=$(hd $OUTPUT_FILE | grep 'Boot1 START' | cut -f 1 -d ' ')
 
 # Convert to number of blocks
 BOOT1_OFFSET=$(echo 0x$BOOT1_OFFSET | awk '{printf("%x\n",$1/512);}')
 
 echo '# This file autogenerated by generate-fat.sh - DO NOT EDIT' > Makefile.fat
 echo '# $FreeBSD$' >> Makefile.fat
 echo "BOOT1_OFFSET=0x$BOOT1_OFFSET" >> Makefile.fat
 
 bzip2 $OUTPUT_FILE
 echo 'FAT template boot filesystem created by generate-fat.sh' > $OUTPUT_FILE.bz2.uu
 echo 'DO NOT EDIT' >> $OUTPUT_FILE.bz2.uu
 echo '$FreeBSD$' >> $OUTPUT_FILE.bz2.uu
 
 uuencode $OUTPUT_FILE.bz2 $OUTPUT_FILE.bz2 >> $OUTPUT_FILE.bz2.uu
 rm $OUTPUT_FILE.bz2
 
Index: user/ngie/more-tests/sys/boot/efi/fdt/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/efi/fdt/Makefile	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/fdt/Makefile	(revision 281585)
@@ -1,36 +1,36 @@
 # $FreeBSD$
 
 .include <src.opts.mk>
 
 .PATH: ${.CURDIR}/../../common
 
 LIB=		efi_fdt
 INTERNALLIB=
 
 SRCS=		efi_fdt.c
 
 CFLAGS+=	-ffreestanding -msoft-float
-.if ${MACHINE_CPUARCH} == "arm64"
+.if ${MACHINE_CPUARCH} == "aarch64"
 CFLAGS+=	-mgeneral-regs-only
 .endif
 
 CFLAGS+=	-I${.CURDIR}/../../../../lib/libstand/
 
 # EFI library headers
 CFLAGS+=	-I${.CURDIR}/../include
-CFLAGS+=	-I${.CURDIR}/../include/${MACHINE_CPUARCH}
+CFLAGS+=	-I${.CURDIR}/../include/${MACHINE}
 
 # libfdt headers
 CFLAGS+=	-I${.CURDIR}/../../fdt
 
 # Pick up the bootstrap header for some interface items
 CFLAGS+=	-I${.CURDIR}/../../common -I${.CURDIR}/../../.. -I.
 
 machine:
-	ln -sf ${.CURDIR}/../../../${MACHINE_CPUARCH}/include machine
+	ln -sf ${.CURDIR}/../../../${MACHINE}/include machine
 
 CLEANFILES+=	machine
 
 .include <bsd.lib.mk>
 
 beforedepend ${OBJS}: machine
Index: user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h
===================================================================
--- user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h	(revision 281585)
@@ -0,0 +1,219 @@
+/* $FreeBSD$ */
+/*++
+
+Copyright (c)  1999 - 2003 Intel Corporation. All rights reserved
+This software and associated documentation (if any) is furnished
+under a license and may only be used or copied in accordance
+with the terms of the license. Except as permitted by such
+license, no part of this software or documentation may be
+reproduced, stored in a retrieval system, or transmitted in any
+form or by any means without the express written consent of
+Intel Corporation.
+
+Module Name:
+
+    efibind.h
+
+Abstract:
+
+    EFI to compile bindings
+
+
+
+
+Revision History
+
+--*/
+
+#pragma pack()
+
+
+#ifdef __FreeBSD__
+#include <sys/stdint.h>
+#else
+//
+// Basic int types of various widths
+//
+
+#if (__STDC_VERSION__ < 199901L )
+
+    // No ANSI C 1999/2000 stdint.h integer width declarations 
+
+    #if _MSC_EXTENSIONS
+
+        // Use Microsoft C compiler integer width declarations 
+
+        typedef unsigned __int64    uint64_t;
+        typedef __int64             int64_t;
+        typedef unsigned __int32    uint32_t;
+        typedef __int32             int32_t;
+        typedef unsigned __int16    uint16_t;
+        typedef __int16             int16_t;
+        typedef unsigned __int8     uint8_t;
+        typedef __int8              int8_t;
+    #else             
+        #ifdef UNIX_LP64
+
+            // Use LP64 programming model from C_FLAGS for integer width declarations 
+
+            typedef unsigned long       uint64_t;
+            typedef long                int64_t;
+            typedef unsigned int        uint32_t;
+            typedef int                 int32_t;
+            typedef unsigned short      uint16_t;
+            typedef short               int16_t;
+            typedef unsigned char       uint8_t;
+            typedef char                int8_t;
+        #else
+
+            // Assume P64 programming model from C_FLAGS for integer width declarations 
+
+            typedef unsigned long long  uint64_t;
+            typedef long long           int64_t;
+            typedef unsigned int        uint32_t;
+            typedef int                 int32_t;
+            typedef unsigned short      uint16_t;
+            typedef short               int16_t;
+            typedef unsigned char       uint8_t;
+            typedef char                int8_t;
+        #endif
+    #endif
+#endif
+#endif	/* __FreeBSD__ */
+
+//
+// Basic EFI types of various widths
+//
+
+
+typedef uint64_t   UINT64;
+typedef int64_t    INT64;
+typedef uint32_t   UINT32;
+typedef int32_t    INT32;
+typedef uint16_t   UINT16;
+typedef int16_t    INT16;
+typedef uint8_t    UINT8;
+typedef int8_t     INT8;
+
+
+#undef VOID
+#define VOID    void
+
+
+typedef int64_t    INTN;
+typedef uint64_t   UINTN;
+
+//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+// BugBug: Code to debug
+//
+#define BIT63   0x8000000000000000
+
+#define PLATFORM_IOBASE_ADDRESS   (0xffffc000000 | BIT63)                                               
+#define PORT_TO_MEMD(_Port) (PLATFORM_IOBASE_ADDRESS | ( ( ( (_Port) & 0xfffc) << 10 ) | ( (_Port) & 0x0fff) ) )
+                                                                           
+//                                                                  
+// Macros with casts make this much easier to use and read.
+//
+#define PORT_TO_MEM8D(_Port)  (*(UINT8  *)(PORT_TO_MEMD(_Port)))
+#define POST_CODE(_Data)  (PORT_TO_MEM8D(0x80) = (_Data))
+//
+// BugBug: End Debug Code!!!
+//+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
+#define EFIERR(a)           (0x8000000000000000 | a)
+#define EFI_ERROR_MASK      0x8000000000000000
+#define EFIERR_OEM(a)       (0xc000000000000000 | a)      
+
+#define BAD_POINTER         0xFBFBFBFBFBFBFBFB
+#define MAX_ADDRESS         0xFFFFFFFFFFFFFFFF
+
+#pragma intrinsic (__break)  
+#define BREAKPOINT()  __break(0)
+
+//
+// Pointers must be aligned to this boundary to function;
+//  you will get an alignment fault if this value is less than 8
+//
+#define MIN_ALIGNMENT_SIZE  8
+
+#define ALIGN_VARIABLE(Value , Adjustment) \
+            (UINTN) Adjustment = 0; \
+            if((UINTN)Value % MIN_ALIGNMENT_SIZE) \
+                (UINTN)Adjustment = MIN_ALIGNMENT_SIZE - ((UINTN)Value % MIN_ALIGNMENT_SIZE); \
+            Value = (UINTN)Value + (UINTN)Adjustment
+
+//
+// Define macros to create data structure signatures.
+//
+
+#define EFI_SIGNATURE_16(A,B)             ((A) | (B<<8))
+#define EFI_SIGNATURE_32(A,B,C,D)         (EFI_SIGNATURE_16(A,B)     | (EFI_SIGNATURE_16(C,D)     << 16))
+#define EFI_SIGNATURE_64(A,B,C,D,E,F,G,H) (EFI_SIGNATURE_32(A,B,C,D) | ((UINT64)(EFI_SIGNATURE_32(E,F,G,H)) << 32))
+
+//
+// EFIAPI - prototype calling convention for EFI function pointers
+// BOOTSERVICE - prototype for implementation of a boot service interface
+// RUNTIMESERVICE - prototype for implementation of a runtime service interface
+// RUNTIMEFUNCTION - prototype for implementation of a runtime function that is not a service
+// RUNTIME_CODE - pragma macro for declaring runtime code    
+//
+
+#ifndef EFIAPI                  // Forces EFI calling conventions regardless of compiler options
+    #if _MSC_EXTENSIONS
+        #define EFIAPI __cdecl  // Force C calling convention for Microsoft C compiler 
+    #else
+        #define EFIAPI          // Substitute expression to force C calling convention
+    #endif
+#endif
+
+#define BOOTSERVICE
+#define RUNTIMESERVICE
+#define RUNTIMEFUNCTION
+
+#define RUNTIME_CODE(a)         alloc_text("rtcode", a)
+#define BEGIN_RUNTIME_DATA()    data_seg("rtdata")
+#define END_RUNTIME_DATA()      data_seg()
+
+#define VOLATILE    volatile
+
+//
+// BugBug: Need to find out if this is portable across compilers.
+//
+void __mfa (void);                       
+#pragma intrinsic (__mfa)  
+#define MEMORY_FENCE()    __mfa()
+
+#ifdef EFI_NO_INTERFACE_DECL
+  #define EFI_FORWARD_DECLARATION(x)
+  #define EFI_INTERFACE_DECL(x)
+#else
+  #define EFI_FORWARD_DECLARATION(x) typedef struct _##x x
+  #define EFI_INTERFACE_DECL(x) typedef struct x
+#endif
+
+//
+// When built similarly to FW, link everything together as
+// one big module.
+//
+
+#define EFI_DRIVER_ENTRY_POINT(InitFunction)
+
+#define LOAD_INTERNAL_DRIVER(_if, type, name, entry)    \
+            (_if)->LoadInternal(type, name, entry)
+//        entry(NULL, ST)
+
+#ifdef __FreeBSD__
+#define INTERFACE_DECL(x) struct x
+#else
+//
+// Some compilers don't support the forward reference construct:
+//  typedef struct XXXXX
+//
+// The following macro provides a workaround for such cases.
+//
+#ifdef NO_INTERFACE_DECL
+#define INTERFACE_DECL(x)
+#else
+#define INTERFACE_DECL(x) typedef struct x
+#endif
+#endif

Property changes on: user/ngie/more-tests/sys/boot/efi/include/arm64/efibind.h
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
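Reviewer note on the signature macros added in arm64/efibind.h: the standalone C sketch below shows how `EFI_SIGNATURE_16/32/64` pack their arguments, with the first argument landing in the least significant byte. It restates the macros with `<stdint.h>` types (an assumption; the header uses the EFI `UINT64` typedef). `align_adjustment` and `check_signatures` are hypothetical helpers, not part of the header. The first computes the padding that `ALIGN_VARIABLE` aims for, written as a plain function.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the header's macros, restated with stdint types. */
#define EFI_SIGNATURE_16(A,B)             ((A) | ((B) << 8))
#define EFI_SIGNATURE_32(A,B,C,D) \
    (EFI_SIGNATURE_16(A,B) | (EFI_SIGNATURE_16(C,D) << 16))
#define EFI_SIGNATURE_64(A,B,C,D,E,F,G,H) \
    ((uint64_t)EFI_SIGNATURE_32(A,B,C,D) | \
    ((uint64_t)EFI_SIGNATURE_32(E,F,G,H) << 32))

#define MIN_ALIGNMENT_SIZE 8

/* Hypothetical helper: bytes needed to round value up to an 8-byte boundary. */
static inline uint64_t
align_adjustment(uint64_t value)
{
	uint64_t rem = value % MIN_ALIGNMENT_SIZE;

	return (rem == 0 ? 0 : MIN_ALIGNMENT_SIZE - rem);
}

/* Hypothetical self-check: the first argument is the lowest byte. */
static inline void
check_signatures(void)
{
	assert(EFI_SIGNATURE_16('M', 'Z') == 0x5a4d);
	assert(EFI_SIGNATURE_32('A', 'B', 'C', 'D') == 0x44434241);
	assert(align_adjustment(13) == 3);
}
```

The sketch only holds for 7-bit ASCII arguments; a character value of 0x80 or above in the fourth position would shift into the sign bit of `int`.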
Index: user/ngie/more-tests/sys/boot/efi/libefi/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/efi/libefi/Makefile	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/libefi/Makefile	(revision 281585)
@@ -1,22 +1,22 @@
 # $FreeBSD$
 
 LIB=	efi
 INTERNALLIB=
 
 SRCS=	delay.c efi_console.c efinet.c efipart.c errno.c handles.c \
 	libefi.c time.c
 
 .if ${MACHINE_ARCH} == "amd64"
 CFLAGS+= -fPIC -mno-red-zone
 .endif
 CFLAGS+= -I${.CURDIR}/../include
-CFLAGS+= -I${.CURDIR}/../include/${MACHINE_CPUARCH}
+CFLAGS+= -I${.CURDIR}/../include/${MACHINE}
 CFLAGS+= -I${.CURDIR}/../../../../lib/libstand
 
 # Pick up the bootstrap header for some interface items
 CFLAGS+= -I${.CURDIR}/../../common
 
 # Handle FreeBSD specific %b and %D printf format specifiers
 CFLAGS+= ${FORMAT_EXTENSIONS}
 
 .include <bsd.lib.mk>
Index: user/ngie/more-tests/sys/boot/efi/loader/Makefile
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/Makefile	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/loader/Makefile	(revision 281585)
@@ -1,127 +1,127 @@
 # $FreeBSD$
 
 MAN=
 
 .include <src.opts.mk>
 
 # In-tree GCC does not support __attribute__((ms_abi)).
 .if ${COMPILER_TYPE} != "gcc"
 
 MK_SSP=		no
 
 PROG=		loader.sym
 INTERNALPROG=
 
 .PATH: ${.CURDIR}/../../efi/loader
 # architecture-specific loader code
 SRCS=	autoload.c \
 	bootinfo.c \
 	conf.c \
 	copy.c \
 	devicename.c \
 	main.c \
 	reloc.c \
 	smbios.c \
 	vers.c
 
-.PATH: ${.CURDIR}/arch/${MACHINE_CPUARCH}
+.PATH: ${.CURDIR}/arch/${MACHINE}
 # For smbios.c
 .PATH: ${.CURDIR}/../../i386/libi386
-.include "${.CURDIR}/arch/${MACHINE_CPUARCH}/Makefile.inc"
+.include "${.CURDIR}/arch/${MACHINE}/Makefile.inc"
 
 CFLAGS+=	-I${.CURDIR}
-CFLAGS+=	-I${.CURDIR}/arch/${MACHINE_CPUARCH}
+CFLAGS+=	-I${.CURDIR}/arch/${MACHINE}
 CFLAGS+=	-I${.CURDIR}/../include
-CFLAGS+=	-I${.CURDIR}/../include/${MACHINE_CPUARCH}
+CFLAGS+=	-I${.CURDIR}/../include/${MACHINE}
 CFLAGS+=	-I${.CURDIR}/../../../contrib/dev/acpica/include
 CFLAGS+=	-I${.CURDIR}/../../..
 CFLAGS+=	-I${.CURDIR}/../../i386/libi386
 CFLAGS+=	-DNO_PCI -DEFI
 
 .if ${MK_FORTH} != "no"
 BOOT_FORTH=	yes
 CFLAGS+=	-DBOOT_FORTH
 CFLAGS+=	-I${.CURDIR}/../../ficl
 CFLAGS+=	-I${.CURDIR}/../../ficl/${MACHINE_CPUARCH}
 LIBFICL=	${.OBJDIR}/../../ficl/libficl.a
 .endif
 
 LOADER_FDT_SUPPORT?=	no
 .if ${MK_FDT} != "no" && ${LOADER_FDT_SUPPORT} != "no"
 CFLAGS+=	-I${.CURDIR}/../../fdt
 CFLAGS+=	-I${.OBJDIR}/../../fdt
 CFLAGS+=	-DLOADER_FDT_SUPPORT
 LIBEFI_FDT=	${.OBJDIR}/../../efi/fdt/libefi_fdt.a
 LIBFDT=		${.OBJDIR}/../../fdt/libfdt.a
 .endif
 
 # Include bcache code.
 HAVE_BCACHE=    yes
 
 .if defined(EFI_STAGING_SIZE)
 CFLAGS+=	-DEFI_STAGING_SIZE=${EFI_STAGING_SIZE}
 .endif
 
 # Always add MI sources 
 .PATH:		${.CURDIR}/../../common
 .include	"${.CURDIR}/../../common/Makefile.inc"
 CFLAGS+=	-I${.CURDIR}/../../common
 
 FILES=	loader.efi
 FILESMODE_loader.efi=	${BINMODE}
 
-LDSCRIPT=	${.CURDIR}/arch/${MACHINE_CPUARCH}/ldscript.${MACHINE_CPUARCH}
+LDSCRIPT=	${.CURDIR}/arch/${MACHINE}/ldscript.${MACHINE}
 LDFLAGS+=	-Wl,-T${LDSCRIPT} -Wl,-Bsymbolic -shared
 
 CLEANFILES=	vers.c loader.efi
 
-NEWVERSWHAT=	"EFI loader" ${MACHINE_CPUARCH}
+NEWVERSWHAT=	"EFI loader" ${MACHINE}
 
 vers.c:	${.CURDIR}/../../common/newvers.sh ${.CURDIR}/../../efi/loader/version
 	sh ${.CURDIR}/../../common/newvers.sh ${.CURDIR}/version ${NEWVERSWHAT}
 
 OBJCOPY?=	objcopy
 OBJDUMP?=	objdump
 
 .if ${MACHINE_CPUARCH} == "amd64"
 EFI_TARGET=	efi-app-x86_64
 .elif ${MACHINE_CPUARCH} == "i386"
 EFI_TARGET=	efi-app-ia32
 .else
 EFI_TARGET=	binary
 .endif
 
 loader.efi: loader.sym
 	if [ `${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*' | wc -l` != 0 ]; then \
 		${OBJDUMP} -t ${.ALLSRC} | fgrep '*UND*'; \
 		exit 1; \
 	fi
 	${OBJCOPY} -j .peheader -j .text -j .sdata -j .data \
 		-j .dynamic -j .dynsym -j .rel.dyn \
 		-j .rela.dyn -j .reloc -j .eh_frame -j set_Xcommand_set \
 		--output-target=${EFI_TARGET} ${.ALLSRC} ${.TARGET}
 
 LIBEFI=		${.OBJDIR}/../libefi/libefi.a
 
 DPADD=		${LIBFICL} ${LIBEFI} ${LIBFDT} ${LIBEFI_FDT} ${LIBSTAND} \
 		${LDSCRIPT}
 LDADD=		${LIBFICL} ${LIBEFI} ${LIBFDT} ${LIBEFI_FDT} ${LIBSTAND}
 
 .endif # ${COMPILER_TYPE} != "gcc"
 
 .include <bsd.prog.mk>
 
 beforedepend ${OBJS}: machine
 
 CLEANFILES+=   machine
 
 machine:
 	ln -sf ${.CURDIR}/../../../${MACHINE}/include machine
 
 .if ${MACHINE_CPUARCH} == "amd64" || ${MACHINE_CPUARCH} == "i386"
 beforedepend ${OBJS}: x86
 CLEANFILES+=   x86
 
 x86:
 	ln -sf ${.CURDIR}/../../../x86/include x86
 .endif
Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm/start.S	(revision 281585)
@@ -1,190 +1,189 @@
 /*-
  * Copyright (c) 2014, 2015 Andrew Turner
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <machine/asm.h>
 
 /*
  * We need to be a PE32 file for EFI. On some architectures we can use
  * objcopy to create the correct file, however on arm we need to do
  * it ourselves.
  */
 
 #define	IMAGE_FILE_MACHINE_ARM		0x01c2
 
 #define	IMAGE_SCN_CNT_CODE		0x00000020
 #define	IMAGE_SCN_CNT_INITIALIZED_DATA	0x00000040
 #define	IMAGE_SCN_MEM_DISCARDABLE	0x02000000
 #define	IMAGE_SCN_MEM_EXECUTE		0x20000000
 #define	IMAGE_SCN_MEM_READ		0x40000000
 
 	.section .peheader
 efi_start:
 	/* The MS-DOS Stub, only used to get the offset of the COFF header */
 	.ascii	"MZ"
 	.short	0
 	.space	0x38
 	.long	pe_sig - efi_start
 
 	/* The PE32 Signature. Needs to be 8-byte aligned */
 	.align	3
 pe_sig:
 	.ascii	"PE"
 	.short	0
 coff_head:
 	.short	IMAGE_FILE_MACHINE_ARM		/* ARM file */
 	.short	2				/* 2 Sections */
 	.long	0				/* Timestamp */
 	.long	0				/* No symbol table */
 	.long	0				/* No symbols */
 	.short	section_table - optional_header	/* Optional header size */
 	.short	0	/* Characteristics TODO: Fill in */
 
 optional_header:
 	.short	0x010b				/* PE32 (32-bit addressing) */
 	.byte	0				/* Major linker version */
 	.byte	0				/* Minor linker version */
 	.long	_edata - _end_header		/* Code size */
 	.long	0				/* No initialized data */
 	.long	0				/* No uninitialized data */
 	.long	_start - efi_start		/* Entry point */
 	.long	_end_header - efi_start		/* Start of code */
 	.long	0				/* Start of data */
 
 optional_windows_header:
 	.long	0				/* Image base */
 	.long	32				/* Section Alignment */
 	.long	8				/* File alignment */
 	.short	0				/* Major OS version */
 	.short	0				/* Minor OS version */
 	.short	0				/* Major image version */
 	.short	0				/* Minor image version */
 	.short	0				/* Major subsystem version */
 	.short	0				/* Minor subsystem version */
 	.long	0				/* Win32 version */
 	.long	_edata - efi_start		/* Image size */
 	.long	_end_header - efi_start		/* Header size */
 	.long	0				/* Checksum */
 	.short	0xa				/* Subsystem (EFI app) */
 	.short	0				/* DLL Characteristics */
 	.long	0				/* Stack reserve */
 	.long	0				/* Stack commit */
 	.long	0				/* Heap reserve */
 	.long	0				/* Heap commit */
 	.long	0				/* Loader flags */
 	.long	6				/* Number of RVAs */
 
 	/* RVAs: */
 	.quad	0
 	.quad	0
 	.quad	0
 	.quad	0
 	.quad	0
 	.quad	0
 
 section_table:
 	/* We need a .reloc section for EFI */
 	.ascii	".reloc"
 	.byte	0
 	.byte	0				/* Pad to 8 bytes */
 	.long	0				/* Virtual size */
 	.long	0				/* Virtual address */
 	.long	0				/* Size of raw data */
 	.long	0				/* Pointer to raw data */
 	.long	0				/* Pointer to relocations */
 	.long	0				/* Pointer to line numbers */
 	.short	0				/* Number of relocations */
 	.short	0				/* Number of line numbers */
 	.long	(IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | \
 		 IMAGE_SCN_MEM_DISCARDABLE)	/* Characteristics */
 
 	/* The contents of the loader */
 	.ascii	".text"
 	.byte	0
 	.byte	0
 	.byte	0				/* Pad to 8 bytes */
 	.long	_edata - _end_header		/* Virtual size */
 	.long	_end_header - efi_start		/* Virtual address */
 	.long	_edata - _end_header		/* Size of raw data */
 	.long	_end_header - efi_start		/* Pointer to raw data */
 	.long	0				/* Pointer to relocations */
 	.long	0				/* Pointer to line numbers */
 	.short	0				/* Number of relocations */
 	.short	0				/* Number of line numbers */
 	.long	(IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | \
 		 IMAGE_SCN_MEM_READ)		/* Characteristics */
 _end_header:
 
 	.text
 _start:
 	/* Save the boot params to the stack */
 	push	{r0, r1}
 
 	adr	r0, .Lbase
 	ldr	r1, [r0]
 	sub	r5, r0, r1
 
 	ldr	r0, .Limagebase
 	add	r0, r0, r5
 	ldr	r1, .Ldynamic
 	add	r1, r1, r5
 
 	bl	_C_LABEL(_reloc)
 
 	/* Zero the BSS, _reloc fixed the values for us */
 	ldr	r0, .Lbss
 	ldr	r1, .Lbssend
 	mov	r2, #0
 
 1:	cmp	r0, r1
 	bgt	2f
 	str	r2, [r0], #4
 	b	1b
 2:
 
 	pop	{r0, r1}
 	bl	_C_LABEL(efi_main)
 
-1:	WFI
-	b	1b
+1:	b	1b
 
 .Lbase:
 	.word	.
 .Limagebase:
 	.word	ImageBase
 .Ldynamic:
 	.word	_DYNAMIC
 .Lbss:
 	.word	__bss_start
 .Lbssend:
 	.word	__bss_end
 
 .align	3
 stack:
 	.space 512
 stack_end:
 
Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc	(revision 281585)
@@ -0,0 +1,9 @@
+# $FreeBSD$
+
+LOADER_FDT_SUPPORT=yes
+SRCS+=	exec.c \
+	start.S
+
+.PATH:	${.CURDIR}/../../arm64/libarm64
+CFLAGS+=-I${.CURDIR}/../../arm64/libarm64
+SRCS+=	cache.c

Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/Makefile.inc
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c	(revision 281585)
@@ -0,0 +1,109 @@
+/*-
+ * Copyright (c) 2006 Marcel Moolenaar
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+ * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+ * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <stand.h>
+#include <string.h>
+
+#include <sys/param.h>
+#include <sys/linker.h>
+#include <machine/elf.h>
+
+#include <bootstrap.h>
+
+#include <efi.h>
+#include <efilib.h>
+
+#include "loader_efi.h"
+#include "cache.h"
+
+static int elf64_exec(struct preloaded_file *amp);
+static int elf64_obj_exec(struct preloaded_file *amp);
+
+int bi_load(char *args, vm_offset_t *modulep, vm_offset_t *kernendp);
+
+static struct file_format arm64_elf = {
+	elf64_loadfile,
+	elf64_exec
+};
+
+struct file_format *file_formats[] = {
+	&arm64_elf,
+	NULL
+};
+
+static int
+elf64_exec(struct preloaded_file *fp)
+{
+	vm_offset_t modulep, kernendp;
+	vm_offset_t clean_addr;
+	size_t clean_size;
+	struct file_metadata *md;
+	EFI_STATUS status;
+	EFI_PHYSICAL_ADDRESS addr;
+	Elf_Ehdr *ehdr;
+	int err;
+	void (*entry)(vm_offset_t);
+
+	if ((md = file_findmetadata(fp, MODINFOMD_ELFHDR)) == NULL)
+		return (EFTYPE);
+
+	ehdr = (Elf_Ehdr *)&(md->md_data);
+	entry = efi_translate(ehdr->e_entry);
+
+	err = bi_load(fp->f_args, &modulep, &kernendp);
+	if (err != 0)
+		return (err);
+
+	status = BS->ExitBootServices(IH, efi_mapkey);
+	if (EFI_ERROR(status)) {
+		printf("%s: ExitBootServices() returned 0x%lx\n", __func__,
+		    (long)status);
+		return (EINVAL);
+	}
+
+	/* Clean D-cache under kernel area and invalidate whole I-cache */
+	clean_addr = efi_translate(fp->f_addr);
+	clean_size = efi_translate(kernendp) - clean_addr;
+
+	cpu_flush_dcache((void *)clean_addr, clean_size);
+	cpu_inval_icache(NULL, 0);
+
+	(*entry)(modulep);
+	panic("exec returned");
+}
+
+static int
+elf64_obj_exec(struct preloaded_file *fp)
+{
+
+	printf("%s called for preloaded file %p (=%s):\n", __func__, fp,
+	    fp->f_name);
+	return (ENOSYS);
+}
+

Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/exec.c
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64	(revision 281585)
@@ -0,0 +1,80 @@
+/* $FreeBSD$ */
+/*
+OUTPUT_FORMAT("elf64-aarch64-freebsd", "elf64-aarch64-freebsd", "elf64-aarch64-freebsd")
+*/
+OUTPUT_ARCH(aarch64)
+ENTRY(_start)
+SECTIONS
+{
+  /* Read-only sections, merged into text segment: */
+  . = 0;
+  ImageBase = .;
+  .text		: {
+    *(.peheader)
+    *(.text .stub .text.* .gnu.linkonce.t.*)
+    /* .gnu.warning sections are handled specially by elf32.em. */
+    *(.gnu.warning)
+    *(.plt)
+  } =0x00300000010070000002000001000400
+  . = ALIGN(16);
+  .data		: {
+    *(.rodata .rodata.* .gnu.linkonce.r.*)
+    *(.rodata1)
+    *(.sdata2 .sdata2.* .gnu.linkonce.s2.*)
+    *(.sbss2 .sbss2.* .gnu.linkonce.sb2.*)
+    *(.opd)
+    *(.data .data.* .gnu.linkonce.d.*)
+    *(.data1)
+    *(.plabel)
+
+    . = ALIGN(16);
+    __bss_start = .;
+    *(.sbss .sbss.* .gnu.linkonce.sb.*)
+    *(.scommon)
+    *(.dynbss)
+    *(.bss *.bss.*)
+    *(COMMON)
+    . = ALIGN(16);
+    __bss_end = .;
+  }
+  . = ALIGN(16);
+  set_Xcommand_set	: {
+    __start_set_Xcommand_set = .;
+    *(set_Xcommand_set)
+    __stop_set_Xcommand_set = .;
+  }
+  . = ALIGN(16);
+  __gp = .;
+  .sdata	: {
+    *(.got.plt .got)
+    *(.sdata .sdata.* .gnu.linkonce.s.*)
+    *(dynsbss)
+    *(.scommon)
+  }
+  . = ALIGN(16);
+  .dynamic	: { *(.dynamic) }
+  . = ALIGN(16);
+  .rela.dyn	: {
+    *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
+    *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
+    *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
+    *(.rela.got)
+    *(.rela.sdata .rela.sdata.* .rela.gnu.linkonce.s.*)
+    *(.rela.sbss .rela.sbss.* .rela.gnu.linkonce.sb.*)
+    *(.rela.sdata2 .rela.sdata2.* .rela.gnu.linkonce.s2.*)
+    *(.rela.sbss2 .rela.sbss2.* .rela.gnu.linkonce.sb2.*)
+    *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
+    *(.rela.plt)
+    *(.relset_*)
+    *(.rela.dyn .rela.dyn.*)
+  }
+  . = ALIGN(16);
+  .reloc	: { *(.reloc) }
+  . = ALIGN(16);
+  .dynsym	: { *(.dynsym) }
+  _edata = .;
+
+  /* Unused sections */
+  .dynstr	: { *(.dynstr) }
+  .hash		: { *(.hash) }
+}

Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/ldscript.arm64
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Index: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S	(nonexistent)
+++ user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S	(revision 281585)
@@ -0,0 +1,165 @@
+/*-
+ * Copyright (c) 2014 Andrew Turner
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+/*
+ * We need to be a PE32+ file for EFI. On some architectures we can use
+ * objcopy to create the correct file, however on arm64 we need to do
+ * it ourselves.
+ */
+
+#define	IMAGE_FILE_MACHINE_ARM64	0xaa64
+
+#define	IMAGE_SCN_CNT_CODE		0x00000020
+#define	IMAGE_SCN_CNT_INITIALIZED_DATA	0x00000040
+#define	IMAGE_SCN_MEM_DISCARDABLE	0x02000000
+#define	IMAGE_SCN_MEM_EXECUTE		0x20000000
+#define	IMAGE_SCN_MEM_READ		0x40000000
+
+	.section .peheader
+efi_start:
+	/* The MS-DOS Stub, only used to get the offset of the COFF header */
+	.ascii	"MZ"
+	.short	0
+	.space	0x38
+	.long	pe_sig - efi_start
+
+	/* The PE32 Signature. Needs to be 8-byte aligned */
+	.align	3
+pe_sig:
+	.ascii	"PE"
+	.short	0
+coff_head:
+	.short	IMAGE_FILE_MACHINE_ARM64	/* AArch64 file */
+	.short	2				/* 2 Sections */
+	.long	0				/* Timestamp */
+	.long	0				/* No symbol table */
+	.long	0				/* No symbols */
+	.short	section_table - optional_header	/* Optional header size */
+	.short	0	/* Characteristics TODO: Fill in */
+
+optional_header:
+	.short	0x020b				/* PE32+ (64-bit addressing) */
+	.byte	0				/* Major linker version */
+	.byte	0				/* Minor linker version */
+	.long	_edata - _end_header		/* Code size */
+	.long	0				/* No initialized data */
+	.long	0				/* No uninitialized data */
+	.long	_start - efi_start		/* Entry point */
+	.long	_end_header - efi_start		/* Start of code */
+
+optional_windows_header:
+	.quad	0				/* Image base */
+	.long	32				/* Section Alignment */
+	.long	8				/* File alignment */
+	.short	0				/* Major OS version */
+	.short	0				/* Minor OS version */
+	.short	0				/* Major image version */
+	.short	0				/* Minor image version */
+	.short	0				/* Major subsystem version */
+	.short	0				/* Minor subsystem version */
+	.long	0				/* Win32 version */
+	.long	_edata - efi_start		/* Image size */
+	.long	_end_header - efi_start		/* Header size */
+	.long	0				/* Checksum */
+	.short	0xa				/* Subsystem (EFI app) */
+	.short	0				/* DLL Characteristics */
+	.quad	0				/* Stack reserve */
+	.quad	0				/* Stack commit */
+	.quad	0				/* Heap reserve */
+	.quad	0				/* Heap commit */
+	.long	0				/* Loader flags */
+	.long	6				/* Number of RVAs */
+
+	/* RVAs: */
+	.quad	0
+	.quad	0
+	.quad	0
+	.quad	0
+	.quad	0
+	.quad	0
+
+section_table:
+	/* We need a .reloc section for EFI */
+	.ascii	".reloc"
+	.byte	0
+	.byte	0				/* Pad to 8 bytes */
+	.long	0				/* Virtual size */
+	.long	0				/* Virtual address */
+	.long	0				/* Size of raw data */
+	.long	0				/* Pointer to raw data */
+	.long	0				/* Pointer to relocations */
+	.long	0				/* Pointer to line numbers */
+	.short	0				/* Number of relocations */
+	.short	0				/* Number of line numbers */
+	.long	(IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ | \
+		 IMAGE_SCN_MEM_DISCARDABLE)	/* Characteristics */
+
+	/* The contents of the loader */
+	.ascii	".text"
+	.byte	0
+	.byte	0
+	.byte	0				/* Pad to 8 bytes */
+	.long	_edata - _end_header		/* Virtual size */
+	.long	_end_header - efi_start		/* Virtual address */
+	.long	_edata - _end_header		/* Size of raw data */
+	.long	_end_header - efi_start		/* Pointer to raw data */
+	.long	0				/* Pointer to relocations */
+	.long	0				/* Pointer to line numbers */
+	.short	0				/* Number of relocations */
+	.short	0				/* Number of line numbers */
+	.long	(IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE | \
+		 IMAGE_SCN_MEM_READ)		/* Characteristics */
+_end_header:
+
+	.text
+	.globl	_start
+_start:
+	/* Save the boot params to the stack */
+	stp	x0, x1, [sp, #-16]!
+
+	adr	x0, __bss_start
+	adr	x1, __bss_end
+
+	b	2f
+
+1:
+	stp	xzr, xzr, [x0], #16
+2:
+	cmp	x0, x1
+	b.lo	1b
+
+	adr	x0, ImageBase
+	adr	x1, _DYNAMIC
+
+	bl	_reloc
+
+	ldp	x0, x1, [sp], #16
+
+	bl	efi_main
+
+1:	b	1b

Property changes on: user/ngie/more-tests/sys/boot/efi/loader/arch/arm64/start.S
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+FreeBSD=%H
\ No newline at end of property
Added: svn:mime-type
## -0,0 +1 ##
+text/plain
\ No newline at end of property
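Reviewer note on the section table in start.S above: each entry is a 40-byte
PE/COFF section header, with an 8-byte zero-padded name followed by sizes,
offsets and a characteristics word. The sketch below restates that layout in
C so the padding and flag values can be checked. The struct, the `sk_`
prefixed names and the helper are hypothetical and are not part of the
commit. The flag values are the standard PE/COFF ones.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standard PE/COFF section characteristic bits referenced by start.S. */
#define	SK_SCN_CNT_CODE			0x00000020u
#define	SK_SCN_CNT_INITIALIZED_DATA	0x00000040u
#define	SK_SCN_MEM_DISCARDABLE		0x02000000u
#define	SK_SCN_MEM_EXECUTE		0x20000000u
#define	SK_SCN_MEM_READ			0x40000000u

/* Hypothetical C view of one section table entry (40 bytes, no padding). */
struct sk_section {
	uint8_t		name[8];		/* ".text" plus NUL padding */
	uint32_t	virtual_size;
	uint32_t	virtual_address;
	uint32_t	raw_size;
	uint32_t	raw_pointer;
	uint32_t	reloc_pointer;
	uint32_t	lineno_pointer;
	uint16_t	reloc_count;
	uint16_t	lineno_count;
	uint32_t	characteristics;
};

/*
 * Fill an entry the way the .text entry is laid out in the assembly:
 * the virtual and raw views share the same size and offset, and all
 * other fields stay zero.
 */
static void
sk_section_fill(struct sk_section *s, const char *name, uint32_t addr,
    uint32_t size, uint32_t flags)
{
	size_t n = strlen(name);

	assert(n <= sizeof(s->name));
	memset(s, 0, sizeof(*s));
	memcpy(s->name, name, n);
	s->virtual_size = s->raw_size = size;
	s->virtual_address = s->raw_pointer = addr;
	s->characteristics = flags;
}
```

With these definitions, the `.reloc` entry in the assembly carries the flag
combination `SK_SCN_CNT_INITIALIZED_DATA | SK_SCN_MEM_READ |
SK_SCN_MEM_DISCARDABLE`, and the `.text` entry carries `SK_SCN_CNT_CODE |
SK_SCN_MEM_EXECUTE | SK_SCN_MEM_READ`.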
Index: user/ngie/more-tests/sys/boot/efi/loader/copy.c
===================================================================
--- user/ngie/more-tests/sys/boot/efi/loader/copy.c	(revision 281584)
+++ user/ngie/more-tests/sys/boot/efi/loader/copy.c	(revision 281585)
@@ -1,133 +1,139 @@
 /*-
  * Copyright (c) 2013 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Benno Rice under sponsorship from
  * the FreeBSD Foundation.
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 
 #include <stand.h>
 #include <bootstrap.h>
 
 #include <efi.h>
 #include <efilib.h>
 
 #ifndef EFI_STAGING_SIZE
 #define	EFI_STAGING_SIZE	32
 #endif
 
 #define	STAGE_PAGES	((EFI_STAGING_SIZE) * 1024 * 1024 / 4096)
 
 EFI_PHYSICAL_ADDRESS	staging, staging_end;
 int			stage_offset_set = 0;
 ssize_t			stage_offset;
 
 int
 efi_copy_init(void)
 {
 	EFI_STATUS	status;
 
 	status = BS->AllocatePages(AllocateAnyPages, EfiLoaderData,
 	    STAGE_PAGES, &staging);
 	if (EFI_ERROR(status)) {
 		printf("failed to allocate staging area: %lu\n",
 		    (unsigned long)(status & EFI_ERROR_MASK));
 		return (status);
 	}
 	staging_end = staging + STAGE_PAGES * 4096;
 
-#ifdef __arm__
-	/* Round the kernel load address to a 2MiB value */
+#if defined(__aarch64__) || defined(__arm__)
+	/*
+	 * Round the kernel load address up to a 2MiB boundary. This is
+	 * needed because the kernel builds its page table based on where
+	 * it has been loaded in the physical address space. As the kernel
+	 * will use either a 1MiB or a 2MiB page for this, we need to make
+	 * sure it is correctly aligned for both cases.
+	 */
 	staging = roundup2(staging, 2 * 1024 * 1024);
 #endif
 
 	return (0);
 }
 
 void *
 efi_translate(vm_offset_t ptr)
 {
 
 	return ((void *)(ptr + stage_offset));
 }
 
 ssize_t
 efi_copyin(const void *src, vm_offset_t dest, const size_t len)
 {
 
 	if (!stage_offset_set) {
 		stage_offset = (vm_offset_t)staging - dest;
 		stage_offset_set = 1;
 	}
 
 	/* XXX: Callers do not check for failure. */
 	if (dest + stage_offset + len > staging_end) {
 		errno = ENOMEM;
 		return (-1);
 	}
 	bcopy(src, (void *)(dest + stage_offset), len);
 	return (len);
 }
 
 ssize_t
 efi_copyout(const vm_offset_t src, void *dest, const size_t len)
 {
 
 	/* XXX: Callers do not check for failure. */
 	if (src + stage_offset + len > staging_end) {
 		errno = ENOMEM;
 		return (-1);
 	}
 	bcopy((void *)(src + stage_offset), dest, len);
 	return (len);
 }
 
 
 ssize_t
 efi_readin(const int fd, vm_offset_t dest, const size_t len)
 {
 
 	if (dest + stage_offset + len > staging_end) {
 		errno = ENOMEM;
 		return (-1);
 	}
 	return (read(fd, (void *)(dest + stage_offset), len));
 }
 
 void
 efi_copy_finish(void)
 {
 	uint64_t	*src, *dst, *last;
 
 	src = (uint64_t *)staging;
 	dst = (uint64_t *)(staging - stage_offset);
 	last = (uint64_t *)(staging + STAGE_PAGES * EFI_PAGE_SIZE);
 
 	while (src < last)
 		*dst++ = *src++;
 }
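Reviewer note on copy.c above: the staging logic combines two pieces of
arithmetic, rounding the allocation up to a 2MiB boundary and translating
kernel destination addresses into the staging area through an offset fixed by
the first copy. The following is a minimal user-space sketch of that
arithmetic, assuming plain 64-bit integers in place of EFI_PHYSICAL_ADDRESS
and vm_offset_t. All `sk_` names are hypothetical, and returning 0 on
overflow is a simplification of the -1/ENOMEM path in efi_copyin().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for roundup2(); align must be a power of two. */
static uint64_t
sk_roundup2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((x + align - 1) & ~(align - 1));
}

/* Hypothetical model of the globals kept in copy.c. */
struct sk_stage {
	uint64_t	staging;	/* aligned start of the staging area */
	uint64_t	staging_end;	/* end of the original allocation */
	uint64_t	offset;		/* staging - first dest, modulo 2^64 */
	int		offset_set;
};

/* Mirrors efi_copy_init(): the end is computed before rounding. */
static void
sk_stage_init(struct sk_stage *s, uint64_t base, uint64_t size)
{
	s->staging_end = base + size;
	s->staging = sk_roundup2(base, 2 * 1024 * 1024);
	s->offset = 0;
	s->offset_set = 0;
}

/*
 * Mirrors the address check in efi_copyin(): the first call fixes the
 * offset, and later calls are rejected if they run past the end.
 * Returns the staging address, or 0 if the range does not fit.
 */
static uint64_t
sk_stage_translate(struct sk_stage *s, uint64_t dest, uint64_t len)
{
	if (!s->offset_set) {
		s->offset = s->staging - dest;
		s->offset_set = 1;
	}
	if (dest + s->offset + len > s->staging_end)
		return (0);
	return (dest + s->offset);
}
```

One consequence visible in the sketch is that rounding the start up while
leaving the end unchanged shrinks the usable staging area by up to 2MiB minus
one page.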
Index: user/ngie/more-tests/sys/boot
===================================================================
--- user/ngie/more-tests/sys/boot	(revision 281584)
+++ user/ngie/more-tests/sys/boot	(revision 281585)

Property changes on: user/ngie/more-tests/sys/boot
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/sys/boot:r281504-281584
Index: user/ngie/more-tests/sys/cam/cam_xpt.c
===================================================================
--- user/ngie/more-tests/sys/cam/cam_xpt.c	(revision 281584)
+++ user/ngie/more-tests/sys/cam/cam_xpt.c	(revision 281585)
@@ -1,5316 +1,5318 @@
 /*-
  * Implementation of the Common Access Method Transport (XPT) layer.
  *
  * Copyright (c) 1997, 1998, 1999 Justin T. Gibbs.
  * Copyright (c) 1997, 1998, 1999 Kenneth D. Merry.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions, and the following disclaimer,
  *    without modification, immediately at the beginning of the file.
  * 2. The name of the author may not be used to endorse or promote products
  *    derived from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
  * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/bus.h>
 #include <sys/systm.h>
 #include <sys/types.h>
 #include <sys/malloc.h>
 #include <sys/kernel.h>
 #include <sys/time.h>
 #include <sys/conf.h>
 #include <sys/fcntl.h>
 #include <sys/interrupt.h>
 #include <sys/proc.h>
 #include <sys/sbuf.h>
 #include <sys/smp.h>
 #include <sys/taskqueue.h>
 
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <sys/sysctl.h>
 #include <sys/kthread.h>
 
 #include <cam/cam.h>
 #include <cam/cam_ccb.h>
 #include <cam/cam_periph.h>
 #include <cam/cam_queue.h>
 #include <cam/cam_sim.h>
 #include <cam/cam_xpt.h>
 #include <cam/cam_xpt_sim.h>
 #include <cam/cam_xpt_periph.h>
 #include <cam/cam_xpt_internal.h>
 #include <cam/cam_debug.h>
 #include <cam/cam_compat.h>
 
 #include <cam/scsi/scsi_all.h>
 #include <cam/scsi/scsi_message.h>
 #include <cam/scsi/scsi_pass.h>
 
 #include <machine/md_var.h>	/* geometry translation */
 #include <machine/stdarg.h>	/* for xpt_print below */
 
 #include "opt_cam.h"
 
 /*
  * This is the maximum number of high powered commands (e.g. start unit)
  * that can be outstanding at a particular time.
  */
 #ifndef CAM_MAX_HIGHPOWER
 #define CAM_MAX_HIGHPOWER  4
 #endif
 
 /* Datastructures internal to the xpt layer */
 MALLOC_DEFINE(M_CAMXPT, "CAM XPT", "CAM XPT buffers");
 MALLOC_DEFINE(M_CAMDEV, "CAM DEV", "CAM devices");
 MALLOC_DEFINE(M_CAMCCB, "CAM CCB", "CAM CCBs");
 MALLOC_DEFINE(M_CAMPATH, "CAM path", "CAM paths");
 
 /* Object for defering XPT actions to a taskqueue */
 struct xpt_task {
 	struct task	task;
 	void		*data1;
 	uintptr_t	data2;
 };
 
 struct xpt_softc {
 	uint32_t		xpt_generation;
 
 	/* number of high powered commands that can go through right now */
 	struct mtx		xpt_highpower_lock;
 	STAILQ_HEAD(highpowerlist, cam_ed)	highpowerq;
 	int			num_highpower;
 
 	/* queue for handling async rescan requests. */
 	TAILQ_HEAD(, ccb_hdr) ccb_scanq;
 	int buses_to_config;
 	int buses_config_done;
 
 	/* Registered busses */
 	TAILQ_HEAD(,cam_eb)	xpt_busses;
 	u_int			bus_generation;
 
 	struct intr_config_hook	*xpt_config_hook;
 
 	int			boot_delay;
 	struct callout 		boot_callout;
 
 	struct mtx		xpt_topo_lock;
 	struct mtx		xpt_lock;
 	struct taskqueue	*xpt_taskq;
 };
 
 typedef enum {
 	DM_RET_COPY		= 0x01,
 	DM_RET_FLAG_MASK	= 0x0f,
 	DM_RET_NONE		= 0x00,
 	DM_RET_STOP		= 0x10,
 	DM_RET_DESCEND		= 0x20,
 	DM_RET_ERROR		= 0x30,
 	DM_RET_ACTION_MASK	= 0xf0
 } dev_match_ret;
 
 typedef enum {
 	XPT_DEPTH_BUS,
 	XPT_DEPTH_TARGET,
 	XPT_DEPTH_DEVICE,
 	XPT_DEPTH_PERIPH
 } xpt_traverse_depth;
 
 struct xpt_traverse_config {
 	xpt_traverse_depth	depth;
 	void			*tr_func;
 	void			*tr_arg;
 };
 
 typedef	int	xpt_busfunc_t (struct cam_eb *bus, void *arg);
 typedef	int	xpt_targetfunc_t (struct cam_et *target, void *arg);
 typedef	int	xpt_devicefunc_t (struct cam_ed *device, void *arg);
 typedef	int	xpt_periphfunc_t (struct cam_periph *periph, void *arg);
 typedef int	xpt_pdrvfunc_t (struct periph_driver **pdrv, void *arg);
 
 /* Transport layer configuration information */
 static struct xpt_softc xsoftc;
 
 MTX_SYSINIT(xpt_topo_init, &xsoftc.xpt_topo_lock, "XPT topology lock", MTX_DEF);
 
 SYSCTL_INT(_kern_cam, OID_AUTO, boot_delay, CTLFLAG_RDTUN,
            &xsoftc.boot_delay, 0, "Bus registration wait time");
 SYSCTL_UINT(_kern_cam, OID_AUTO, xpt_generation, CTLFLAG_RD,
 	    &xsoftc.xpt_generation, 0, "CAM peripheral generation count");
 
 struct cam_doneq {
 	struct mtx_padalign	cam_doneq_mtx;
 	STAILQ_HEAD(, ccb_hdr)	cam_doneq;
 	int			cam_doneq_sleep;
 };
 
 static struct cam_doneq cam_doneqs[MAXCPU];
 static int cam_num_doneqs;
 static struct proc *cam_proc;
 
 SYSCTL_INT(_kern_cam, OID_AUTO, num_doneqs, CTLFLAG_RDTUN,
            &cam_num_doneqs, 0, "Number of completion queues/threads");
 
 struct cam_periph *xpt_periph;
 
 static periph_init_t xpt_periph_init;
 
 static struct periph_driver xpt_driver =
 {
 	xpt_periph_init, "xpt",
 	TAILQ_HEAD_INITIALIZER(xpt_driver.units), /* generation */ 0,
 	CAM_PERIPH_DRV_EARLY
 };
 
 PERIPHDRIVER_DECLARE(xpt, xpt_driver);
 
 static d_open_t xptopen;
 static d_close_t xptclose;
 static d_ioctl_t xptioctl;
 static d_ioctl_t xptdoioctl;
 
 static struct cdevsw xpt_cdevsw = {
 	.d_version =	D_VERSION,
 	.d_flags =	0,
 	.d_open =	xptopen,
 	.d_close =	xptclose,
 	.d_ioctl =	xptioctl,
 	.d_name =	"xpt",
 };
 
 /* Storage for debugging datastructures */
 struct cam_path *cam_dpath;
 u_int32_t cam_dflags = CAM_DEBUG_FLAGS;
 SYSCTL_UINT(_kern_cam, OID_AUTO, dflags, CTLFLAG_RWTUN,
 	&cam_dflags, 0, "Enabled debug flags");
 u_int32_t cam_debug_delay = CAM_DEBUG_DELAY;
 SYSCTL_UINT(_kern_cam, OID_AUTO, debug_delay, CTLFLAG_RWTUN,
 	&cam_debug_delay, 0, "Delay in us after each debug message");
 
 /* Our boot-time initialization hook */
 static int cam_module_event_handler(module_t, int /*modeventtype_t*/, void *);
 
 static moduledata_t cam_moduledata = {
 	"cam",
 	cam_module_event_handler,
 	NULL
 };
 
 static int	xpt_init(void *);
 
 DECLARE_MODULE(cam, cam_moduledata, SI_SUB_CONFIGURE, SI_ORDER_SECOND);
 MODULE_VERSION(cam, 1);
 
 
 static void		xpt_async_bcast(struct async_list *async_head,
 					u_int32_t async_code,
 					struct cam_path *path,
 					void *async_arg);
 static path_id_t xptnextfreepathid(void);
 static path_id_t xptpathid(const char *sim_name, int sim_unit, int sim_bus);
 static union ccb *xpt_get_ccb(struct cam_periph *periph);
 static union ccb *xpt_get_ccb_nowait(struct cam_periph *periph);
 static void	 xpt_run_allocq(struct cam_periph *periph, int sleep);
 static void	 xpt_run_allocq_task(void *context, int pending);
 static void	 xpt_run_devq(struct cam_devq *devq);
 static timeout_t xpt_release_devq_timeout;
 static void	 xpt_release_simq_timeout(void *arg) __unused;
 static void	 xpt_acquire_bus(struct cam_eb *bus);
 static void	 xpt_release_bus(struct cam_eb *bus);
 static uint32_t	 xpt_freeze_devq_device(struct cam_ed *dev, u_int count);
 static int	 xpt_release_devq_device(struct cam_ed *dev, u_int count,
 		    int run_queue);
 static struct cam_et*
 		 xpt_alloc_target(struct cam_eb *bus, target_id_t target_id);
 static void	 xpt_acquire_target(struct cam_et *target);
 static void	 xpt_release_target(struct cam_et *target);
 static struct cam_eb*
 		 xpt_find_bus(path_id_t path_id);
 static struct cam_et*
 		 xpt_find_target(struct cam_eb *bus, target_id_t target_id);
 static struct cam_ed*
 		 xpt_find_device(struct cam_et *target, lun_id_t lun_id);
 static void	 xpt_config(void *arg);
 static int	 xpt_schedule_dev(struct camq *queue, cam_pinfo *dev_pinfo,
 				 u_int32_t new_priority);
 static xpt_devicefunc_t xptpassannouncefunc;
 static void	 xptaction(struct cam_sim *sim, union ccb *work_ccb);
 static void	 xptpoll(struct cam_sim *sim);
 static void	 camisr_runqueue(void);
 static void	 xpt_done_process(struct ccb_hdr *ccb_h);
 static void	 xpt_done_td(void *);
 static dev_match_ret	xptbusmatch(struct dev_match_pattern *patterns,
 				    u_int num_patterns, struct cam_eb *bus);
 static dev_match_ret	xptdevicematch(struct dev_match_pattern *patterns,
 				       u_int num_patterns,
 				       struct cam_ed *device);
 static dev_match_ret	xptperiphmatch(struct dev_match_pattern *patterns,
 				       u_int num_patterns,
 				       struct cam_periph *periph);
 static xpt_busfunc_t	xptedtbusfunc;
 static xpt_targetfunc_t	xptedttargetfunc;
 static xpt_devicefunc_t	xptedtdevicefunc;
 static xpt_periphfunc_t	xptedtperiphfunc;
 static xpt_pdrvfunc_t	xptplistpdrvfunc;
 static xpt_periphfunc_t	xptplistperiphfunc;
 static int		xptedtmatch(struct ccb_dev_match *cdm);
 static int		xptperiphlistmatch(struct ccb_dev_match *cdm);
 static int		xptbustraverse(struct cam_eb *start_bus,
 				       xpt_busfunc_t *tr_func, void *arg);
 static int		xpttargettraverse(struct cam_eb *bus,
 					  struct cam_et *start_target,
 					  xpt_targetfunc_t *tr_func, void *arg);
 static int		xptdevicetraverse(struct cam_et *target,
 					  struct cam_ed *start_device,
 					  xpt_devicefunc_t *tr_func, void *arg);
 static int		xptperiphtraverse(struct cam_ed *device,
 					  struct cam_periph *start_periph,
 					  xpt_periphfunc_t *tr_func, void *arg);
 static int		xptpdrvtraverse(struct periph_driver **start_pdrv,
 					xpt_pdrvfunc_t *tr_func, void *arg);
 static int		xptpdperiphtraverse(struct periph_driver **pdrv,
 					    struct cam_periph *start_periph,
 					    xpt_periphfunc_t *tr_func,
 					    void *arg);
 static xpt_busfunc_t	xptdefbusfunc;
 static xpt_targetfunc_t	xptdeftargetfunc;
 static xpt_devicefunc_t	xptdefdevicefunc;
 static xpt_periphfunc_t	xptdefperiphfunc;
 static void		xpt_finishconfig_task(void *context, int pending);
 static void		xpt_dev_async_default(u_int32_t async_code,
 					      struct cam_eb *bus,
 					      struct cam_et *target,
 					      struct cam_ed *device,
 					      void *async_arg);
 static struct cam_ed *	xpt_alloc_device_default(struct cam_eb *bus,
 						 struct cam_et *target,
 						 lun_id_t lun_id);
 static xpt_devicefunc_t	xptsetasyncfunc;
 static xpt_busfunc_t	xptsetasyncbusfunc;
 static cam_status	xptregister(struct cam_periph *periph,
 				    void *arg);
 static __inline int device_is_queued(struct cam_ed *device);
 
 static __inline int
 xpt_schedule_devq(struct cam_devq *devq, struct cam_ed *dev)
 {
 	int	retval;
 
 	mtx_assert(&devq->send_mtx, MA_OWNED);
 	if ((dev->ccbq.queue.entries > 0) &&
 	    (dev->ccbq.dev_openings > 0) &&
 	    (dev->ccbq.queue.qfrozen_cnt == 0)) {
 		/*
 		 * The priority of a device waiting for controller
 		 * resources is that of the highest priority CCB
 		 * enqueued.
 		 */
 		retval =
 		    xpt_schedule_dev(&devq->send_queue,
 				     &dev->devq_entry,
 				     CAMQ_GET_PRIO(&dev->ccbq.queue));
 	} else {
 		retval = 0;
 	}
 	return (retval);
 }
 
 static __inline int
 device_is_queued(struct cam_ed *device)
 {
 	return (device->devq_entry.index != CAM_UNQUEUED_INDEX);
 }
 
 static void
 xpt_periph_init()
 {
 	make_dev(&xpt_cdevsw, 0, UID_ROOT, GID_OPERATOR, 0600, "xpt0");
 }
 
 static int
 xptopen(struct cdev *dev, int flags, int fmt, struct thread *td)
 {
 
 	/*
 	 * Only allow read-write access.
 	 */
 	if (((flags & FWRITE) == 0) || ((flags & FREAD) == 0))
 		return(EPERM);
 
 	/*
 	 * We don't allow nonblocking access.
 	 */
 	if ((flags & O_NONBLOCK) != 0) {
 		printf("%s: can't do nonblocking access\n", devtoname(dev));
 		return(ENODEV);
 	}
 
 	return(0);
 }
 
 static int
 xptclose(struct cdev *dev, int flag, int fmt, struct thread *td)
 {
 
 	return(0);
 }
 
 /*
  * Don't automatically grab the xpt softc lock here even though this is going
  * through the xpt device.  The xpt device is really just a back door for
  * accessing other devices and SIMs, so the right thing to do is to grab
  * the appropriate SIM lock once the bus/SIM is located.
  */
 static int
 xptioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td)
 {
 	int error;
 
 	if ((error = xptdoioctl(dev, cmd, addr, flag, td)) == ENOTTY) {
 		error = cam_compat_ioctl(dev, cmd, addr, flag, td, xptdoioctl);
 	}
 	return (error);
 }
 	
 static int
 xptdoioctl(struct cdev *dev, u_long cmd, caddr_t addr, int flag, struct thread *td)
 {
 	int error;
 
 	error = 0;
 
 	switch(cmd) {
 	/*
 	 * For the transport layer CAMIOCOMMAND ioctl, we really only want
 	 * to accept CCB types that don't quite make sense to send through a
 	 * passthrough driver. XPT_PATH_INQ is an exception to this, as stated
 	 * in the CAM spec.
 	 */
 	case CAMIOCOMMAND: {
 		union ccb *ccb;
 		union ccb *inccb;
 		struct cam_eb *bus;
 
 		inccb = (union ccb *)addr;
 
 		bus = xpt_find_bus(inccb->ccb_h.path_id);
 		if (bus == NULL)
 			return (EINVAL);
 
 		switch (inccb->ccb_h.func_code) {
 		case XPT_SCAN_BUS:
 		case XPT_RESET_BUS:
 			if (inccb->ccb_h.target_id != CAM_TARGET_WILDCARD ||
 			    inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) {
 				xpt_release_bus(bus);
 				return (EINVAL);
 			}
 			break;
 		case XPT_SCAN_TGT:
 			if (inccb->ccb_h.target_id == CAM_TARGET_WILDCARD ||
 			    inccb->ccb_h.target_lun != CAM_LUN_WILDCARD) {
 				xpt_release_bus(bus);
 				return (EINVAL);
 			}
 			break;
 		default:
 			break;
 		}
 
 		switch(inccb->ccb_h.func_code) {
 		case XPT_SCAN_BUS:
 		case XPT_RESET_BUS:
 		case XPT_PATH_INQ:
 		case XPT_ENG_INQ:
 		case XPT_SCAN_LUN:
 		case XPT_SCAN_TGT:
 
 			ccb = xpt_alloc_ccb();
 
 			/*
 			 * Create a path using the bus, target, and lun the
 			 * user passed in.
 			 */
 			if (xpt_create_path(&ccb->ccb_h.path, NULL,
 					    inccb->ccb_h.path_id,
 					    inccb->ccb_h.target_id,
 					    inccb->ccb_h.target_lun) !=
 					    CAM_REQ_CMP){
 				error = EINVAL;
 				xpt_free_ccb(ccb);
 				break;
 			}
 			/* Ensure all of our fields are correct */
 			xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path,
 				      inccb->ccb_h.pinfo.priority);
 			xpt_merge_ccb(ccb, inccb);
 			xpt_path_lock(ccb->ccb_h.path);
 			cam_periph_runccb(ccb, NULL, 0, 0, NULL);
 			xpt_path_unlock(ccb->ccb_h.path);
 			bcopy(ccb, inccb, sizeof(union ccb));
 			xpt_free_path(ccb->ccb_h.path);
 			xpt_free_ccb(ccb);
 			break;
 
 		case XPT_DEBUG: {
 			union ccb ccb;
 
 			/*
 			 * This is an immediate CCB, so it's okay to
 			 * allocate it on the stack.
 			 */
 
 			/*
 			 * Create a path using the bus, target, and lun the
 			 * user passed in.
 			 */
 			if (xpt_create_path(&ccb.ccb_h.path, NULL,
 					    inccb->ccb_h.path_id,
 					    inccb->ccb_h.target_id,
 					    inccb->ccb_h.target_lun) !=
 					    CAM_REQ_CMP){
 				error = EINVAL;
 				break;
 			}
 			/* Ensure all of our fields are correct */
 			xpt_setup_ccb(&ccb.ccb_h, ccb.ccb_h.path,
 				      inccb->ccb_h.pinfo.priority);
 			xpt_merge_ccb(&ccb, inccb);
 			xpt_action(&ccb);
 			bcopy(&ccb, inccb, sizeof(union ccb));
 			xpt_free_path(ccb.ccb_h.path);
 			break;
 
 		}
 		case XPT_DEV_MATCH: {
 			struct cam_periph_map_info mapinfo;
 			struct cam_path *old_path;
 
 			/*
 			 * We can't deal with physical addresses for this
 			 * type of transaction.
 			 */
 			if ((inccb->ccb_h.flags & CAM_DATA_MASK) !=
 			    CAM_DATA_VADDR) {
 				error = EINVAL;
 				break;
 			}
 
 			/*
 			 * Save this in case the caller had it set to
 			 * something in particular.
 			 */
 			old_path = inccb->ccb_h.path;
 
 			/*
 			 * We really don't need a path for the matching
 			 * code.  The path is needed because of the
 			 * debugging statements in xpt_action().  They
 			 * assume that the CCB has a valid path.
 			 */
 			inccb->ccb_h.path = xpt_periph->path;
 
 			bzero(&mapinfo, sizeof(mapinfo));
 
 			/*
 			 * Map the pattern and match buffers into kernel
 			 * virtual address space.
 			 */
 			error = cam_periph_mapmem(inccb, &mapinfo);
 
 			if (error) {
 				inccb->ccb_h.path = old_path;
 				break;
 			}
 
 			/*
 			 * This is an immediate CCB, we can send it on directly.
 			 */
 			xpt_action(inccb);
 
 			/*
 			 * Map the buffers back into user space.
 			 */
 			cam_periph_unmapmem(inccb, &mapinfo);
 
 			inccb->ccb_h.path = old_path;
 
 			error = 0;
 			break;
 		}
 		default:
 			error = ENOTSUP;
 			break;
 		}
 		xpt_release_bus(bus);
 		break;
 	}
 	/*
 	 * This is the getpassthru ioctl. It takes a XPT_GDEVLIST ccb as input,
 	 * with the periphal driver name and unit name filled in.  The other
 	 * fields don't really matter as input.  The passthrough driver name
 	 * ("pass"), and unit number are passed back in the ccb.  The current
 	 * device generation number, and the index into the device peripheral
 	 * driver list, and the status are also passed back.  Note that
 	 * since we do everything in one pass, unlike the XPT_GDEVLIST ccb,
 	 * we never return a status of CAM_GDEVLIST_LIST_CHANGED.  It is
 	 * (or rather should be) impossible for the device peripheral driver
 	 * list to change since we look at the whole thing in one pass, and
 	 * we do it with lock protection.
 	 *
 	 */
 	case CAMGETPASSTHRU: {
 		union ccb *ccb;
 		struct cam_periph *periph;
 		struct periph_driver **p_drv;
 		char   *name;
 		u_int unit;
 		int base_periph_found;
 
 		ccb = (union ccb *)addr;
 		unit = ccb->cgdl.unit_number;
 		name = ccb->cgdl.periph_name;
 		base_periph_found = 0;
 
 		/*
 		 * Sanity check -- make sure we don't get a null peripheral
 		 * driver name.
 		 */
 		if (*ccb->cgdl.periph_name == '\0') {
 			error = EINVAL;
 			break;
 		}
 
 		/* Keep the list from changing while we traverse it */
 		xpt_lock_buses();
 
 		/* first find our driver in the list of drivers */
 		for (p_drv = periph_drivers; *p_drv != NULL; p_drv++)
 			if (strcmp((*p_drv)->driver_name, name) == 0)
 				break;
 
 		if (*p_drv == NULL) {
 			xpt_unlock_buses();
 			ccb->ccb_h.status = CAM_REQ_CMP_ERR;
 			ccb->cgdl.status = CAM_GDEVLIST_ERROR;
 			*ccb->cgdl.periph_name = '\0';
 			ccb->cgdl.unit_number = 0;
 			error = ENOENT;
 			break;
 		}
 
 		/*
 		 * Run through every peripheral instance of this driver
 		 * and check to see whether it matches the unit passed
 		 * in by the user.  If it does, get out of the loops and
 		 * find the passthrough driver associated with that
 		 * peripheral driver.
 		 */
 		for (periph = TAILQ_FIRST(&(*p_drv)->units); periph != NULL;
 		     periph = TAILQ_NEXT(periph, unit_links)) {
 
 			if (periph->unit_number == unit)
 				break;
 		}
 		/*
 		 * If we found the peripheral driver that the user passed
 		 * in, go through all of the peripheral drivers for that
 		 * particular device and look for a passthrough driver.
 		 */
 		if (periph != NULL) {
 			struct cam_ed *device;
 			int i;
 
 			base_periph_found = 1;
 			device = periph->path->device;
 			for (i = 0, periph = SLIST_FIRST(&device->periphs);
 			     periph != NULL;
 			     periph = SLIST_NEXT(periph, periph_links), i++) {
 				/*
 				 * Check to see whether we have a
 				 * passthrough device or not.
 				 */
 				if (strcmp(periph->periph_name, "pass") == 0) {
 					/*
 					 * Fill in the getdevlist fields.
 					 */
 					strcpy(ccb->cgdl.periph_name,
 					       periph->periph_name);
 					ccb->cgdl.unit_number =
 						periph->unit_number;
 					if (SLIST_NEXT(periph, periph_links))
 						ccb->cgdl.status =
 							CAM_GDEVLIST_MORE_DEVS;
 					else
 						ccb->cgdl.status =
 						       CAM_GDEVLIST_LAST_DEVICE;
 					ccb->cgdl.generation =
 						device->generation;
 					ccb->cgdl.index = i;
 					/*
 					 * Fill in some CCB header fields
 					 * that the user may want.
 					 */
 					ccb->ccb_h.path_id =
 						periph->path->bus->path_id;
 					ccb->ccb_h.target_id =
 						periph->path->target->target_id;
 					ccb->ccb_h.target_lun =
 						periph->path->device->lun_id;
 					ccb->ccb_h.status = CAM_REQ_CMP;
 					break;
 				}
 			}
 		}
 
 		/*
 		 * If the periph is null here, one of two things has
 		 * happened.  The first possibility is that we couldn't
 		 * find the unit number of the particular peripheral driver
 		 * that the user is asking about.  e.g. the user asks for
 		 * the passthrough driver for "da11".  We find the list of
 		 * "da" peripherals all right, but there is no unit 11.
 		 * The other possibility is that we went through the list
 		 * of peripheral drivers attached to the device structure,
 		 * but didn't find one with the name "pass".  Either way,
 		 * we return ENOENT, since we couldn't find something.
 		 */
 		if (periph == NULL) {
 			ccb->ccb_h.status = CAM_REQ_CMP_ERR;
 			ccb->cgdl.status = CAM_GDEVLIST_ERROR;
 			*ccb->cgdl.periph_name = '\0';
 			ccb->cgdl.unit_number = 0;
 			error = ENOENT;
 			/*
 			 * It is unfortunate that this is even necessary,
 			 * but there are many, many clueless users out there.
 			 * If this is true, the user is looking for the
 			 * passthrough driver, but doesn't have one in his
 			 * kernel.
 			 */
 			if (base_periph_found == 1) {
 				printf("xptioctl: pass driver is not in the "
 				       "kernel\n");
 				printf("xptioctl: put \"device pass\" in "
 				       "your kernel config file\n");
 			}
 		}
 		xpt_unlock_buses();
 		break;
 		}
 	default:
 		error = ENOTTY;
 		break;
 	}
 
 	return(error);
 }
 
 static int
 cam_module_event_handler(module_t mod, int what, void *arg)
 {
 	int error;
 
 	switch (what) {
 	case MOD_LOAD:
 		if ((error = xpt_init(NULL)) != 0)
 			return (error);
 		break;
 	case MOD_UNLOAD:
 		return EBUSY;
 	default:
 		return EOPNOTSUPP;
 	}
 
 	return 0;
 }
 
 static void
 xpt_rescan_done(struct cam_periph *periph, union ccb *done_ccb)
 {
 
 	if (done_ccb->ccb_h.ppriv_ptr1 == NULL) {
 		xpt_free_path(done_ccb->ccb_h.path);
 		xpt_free_ccb(done_ccb);
 	} else {
 		done_ccb->ccb_h.cbfcnp = done_ccb->ccb_h.ppriv_ptr1;
 		(*done_ccb->ccb_h.cbfcnp)(periph, done_ccb);
 	}
 	xpt_release_boot();
 }
 
 /* thread to handle bus rescans */
 static void
 xpt_scanner_thread(void *dummy)
 {
 	union ccb	*ccb;
 	struct cam_path	 path;
 
 	xpt_lock_buses();
 	for (;;) {
 		if (TAILQ_EMPTY(&xsoftc.ccb_scanq))
 			msleep(&xsoftc.ccb_scanq, &xsoftc.xpt_topo_lock, PRIBIO,
 			       "-", 0);
 		if ((ccb = (union ccb *)TAILQ_FIRST(&xsoftc.ccb_scanq)) != NULL) {
 			TAILQ_REMOVE(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe);
 			xpt_unlock_buses();
 
 			/*
 			 * Since lock can be dropped inside and path freed
 			 * by completion callback even before return here,
 			 * take our own path copy for reference.
 			 */
 			xpt_copy_path(&path, ccb->ccb_h.path);
 			xpt_path_lock(&path);
 			xpt_action(ccb);
 			xpt_path_unlock(&path);
 			xpt_release_path(&path);
 
 			xpt_lock_buses();
 		}
 	}
 }
 
 void
 xpt_rescan(union ccb *ccb)
 {
 	struct ccb_hdr *hdr;
 
 	/* Prepare request */
 	if (ccb->ccb_h.path->target->target_id == CAM_TARGET_WILDCARD &&
 	    ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD)
 		ccb->ccb_h.func_code = XPT_SCAN_BUS;
 	else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD &&
 	    ccb->ccb_h.path->device->lun_id == CAM_LUN_WILDCARD)
 		ccb->ccb_h.func_code = XPT_SCAN_TGT;
 	else if (ccb->ccb_h.path->target->target_id != CAM_TARGET_WILDCARD &&
 	    ccb->ccb_h.path->device->lun_id != CAM_LUN_WILDCARD)
 		ccb->ccb_h.func_code = XPT_SCAN_LUN;
 	else {
 		xpt_print(ccb->ccb_h.path, "illegal scan path\n");
 		xpt_free_path(ccb->ccb_h.path);
 		xpt_free_ccb(ccb);
 		return;
 	}
 	ccb->ccb_h.ppriv_ptr1 = ccb->ccb_h.cbfcnp;
 	ccb->ccb_h.cbfcnp = xpt_rescan_done;
 	xpt_setup_ccb(&ccb->ccb_h, ccb->ccb_h.path, CAM_PRIORITY_XPT);
 	/* Don't make duplicate entries for the same paths. */
 	xpt_lock_buses();
 	if (ccb->ccb_h.ppriv_ptr1 == NULL) {
 		TAILQ_FOREACH(hdr, &xsoftc.ccb_scanq, sim_links.tqe) {
 			if (xpt_path_comp(hdr->path, ccb->ccb_h.path) == 0) {
 				wakeup(&xsoftc.ccb_scanq);
 				xpt_unlock_buses();
 				xpt_print(ccb->ccb_h.path, "rescan already queued\n");
 				xpt_free_path(ccb->ccb_h.path);
 				xpt_free_ccb(ccb);
 				return;
 			}
 		}
 	}
 	TAILQ_INSERT_TAIL(&xsoftc.ccb_scanq, &ccb->ccb_h, sim_links.tqe);
 	xsoftc.buses_to_config++;
 	wakeup(&xsoftc.ccb_scanq);
 	xpt_unlock_buses();
 }
 
 /* Functions accessed by the peripheral drivers */
 static int
 xpt_init(void *dummy)
 {
 	struct cam_sim *xpt_sim;
 	struct cam_path *path;
 	struct cam_devq *devq;
 	cam_status status;
 	int error, i;
 
 	TAILQ_INIT(&xsoftc.xpt_busses);
 	TAILQ_INIT(&xsoftc.ccb_scanq);
 	STAILQ_INIT(&xsoftc.highpowerq);
 	xsoftc.num_highpower = CAM_MAX_HIGHPOWER;
 
 	mtx_init(&xsoftc.xpt_lock, "XPT lock", NULL, MTX_DEF);
 	mtx_init(&xsoftc.xpt_highpower_lock, "XPT highpower lock", NULL, MTX_DEF);
 	xsoftc.xpt_taskq = taskqueue_create("CAM XPT task", M_WAITOK,
 	    taskqueue_thread_enqueue, /*context*/&xsoftc.xpt_taskq);
 
 #ifdef CAM_BOOT_DELAY
 	/*
 	 * Override this value at compile time to assist our users
 	 * who don't use loader to boot a kernel.
 	 */
 	xsoftc.boot_delay = CAM_BOOT_DELAY;
 #endif
 	/*
 	 * The xpt layer is, itself, the equivelent of a SIM.
 	 * Allow 16 ccbs in the ccb pool for it.  This should
 	 * give decent parallelism when we probe busses and
 	 * perform other XPT functions.
 	 */
 	devq = cam_simq_alloc(16);
 	xpt_sim = cam_sim_alloc(xptaction,
 				xptpoll,
 				"xpt",
 				/*softc*/NULL,
 				/*unit*/0,
 				/*mtx*/&xsoftc.xpt_lock,
 				/*max_dev_transactions*/0,
 				/*max_tagged_dev_transactions*/0,
 				devq);
 	if (xpt_sim == NULL)
 		return (ENOMEM);
 
 	mtx_lock(&xsoftc.xpt_lock);
 	if ((status = xpt_bus_register(xpt_sim, NULL, 0)) != CAM_SUCCESS) {
 		mtx_unlock(&xsoftc.xpt_lock);
 		printf("xpt_init: xpt_bus_register failed with status %#x,"
 		       " failing attach\n", status);
 		return (EINVAL);
 	}
 	mtx_unlock(&xsoftc.xpt_lock);
 
 	/*
 	 * Looking at the XPT from the SIM layer, the XPT is
 	 * the equivelent of a peripheral driver.  Allocate
 	 * a peripheral driver entry for us.
 	 */
 	if ((status = xpt_create_path(&path, NULL, CAM_XPT_PATH_ID,
 				      CAM_TARGET_WILDCARD,
 				      CAM_LUN_WILDCARD)) != CAM_REQ_CMP) {
 		printf("xpt_init: xpt_create_path failed with status %#x,"
 		       " failing attach\n", status);
 		return (EINVAL);
 	}
 	xpt_path_lock(path);
 	cam_periph_alloc(xptregister, NULL, NULL, NULL, "xpt", CAM_PERIPH_BIO,
 			 path, NULL, 0, xpt_sim);
 	xpt_path_unlock(path);
 	xpt_free_path(path);
 
 	if (cam_num_doneqs < 1)
 		cam_num_doneqs = 1 + mp_ncpus / 6;
 	else if (cam_num_doneqs > MAXCPU)
 		cam_num_doneqs = MAXCPU;
 	for (i = 0; i < cam_num_doneqs; i++) {
 		mtx_init(&cam_doneqs[i].cam_doneq_mtx, "CAM doneq", NULL,
 		    MTX_DEF);
 		STAILQ_INIT(&cam_doneqs[i].cam_doneq);
 		error = kproc_kthread_add(xpt_done_td, &cam_doneqs[i],
 		    &cam_proc, NULL, 0, 0, "cam", "doneq%d", i);
 		if (error != 0) {
 			cam_num_doneqs = i;
 			break;
 		}
 	}
 	if (cam_num_doneqs < 1) {
 		printf("xpt_init: Cannot init completion queues "
 		       "- failing attach\n");
 		return (ENOMEM);
 	}
 	/*
 	 * Register a callback for when interrupts are enabled.
 	 */
 	xsoftc.xpt_config_hook =
 	    (struct intr_config_hook *)malloc(sizeof(struct intr_config_hook),
 					      M_CAMXPT, M_NOWAIT | M_ZERO);
 	if (xsoftc.xpt_config_hook == NULL) {
 		printf("xpt_init: Cannot malloc config hook "
 		       "- failing attach\n");
 		return (ENOMEM);
 	}
 	xsoftc.xpt_config_hook->ich_func = xpt_config;
 	if (config_intrhook_establish(xsoftc.xpt_config_hook) != 0) {
 		free (xsoftc.xpt_config_hook, M_CAMXPT);
 		printf("xpt_init: config_intrhook_establish failed "
 		       "- failing attach\n");
 	}
 
 	return (0);
 }
 
 static cam_status
 xptregister(struct cam_periph *periph, void *arg)
 {
 	struct cam_sim *xpt_sim;
 
 	if (periph == NULL) {
 		printf("xptregister: periph was NULL!!\n");
 		return(CAM_REQ_CMP_ERR);
 	}
 
 	xpt_sim = (struct cam_sim *)arg;
 	xpt_sim->softc = periph;
 	xpt_periph = periph;
 	periph->softc = NULL;
 
 	return(CAM_REQ_CMP);
 }
 
 int32_t
 xpt_add_periph(struct cam_periph *periph)
 {
 	struct cam_ed *device;
 	int32_t	 status;
 
 	TASK_INIT(&periph->periph_run_task, 0, xpt_run_allocq_task, periph);
 	device = periph->path->device;
 	status = CAM_REQ_CMP;
 	if (device != NULL) {
 		mtx_lock(&device->target->bus->eb_mtx);
 		device->generation++;
 		SLIST_INSERT_HEAD(&device->periphs, periph, periph_links);
 		mtx_unlock(&device->target->bus->eb_mtx);
 		atomic_add_32(&xsoftc.xpt_generation, 1);
 	}
 
 	return (status);
 }
 
 void
 xpt_remove_periph(struct cam_periph *periph)
 {
 	struct cam_ed *device;
 
 	device = periph->path->device;
 	if (device != NULL) {
 		mtx_lock(&device->target->bus->eb_mtx);
 		device->generation++;
 		SLIST_REMOVE(&device->periphs, periph, cam_periph, periph_links);
 		mtx_unlock(&device->target->bus->eb_mtx);
 		atomic_add_32(&xsoftc.xpt_generation, 1);
 	}
 }
 
 
 void
 xpt_announce_periph(struct cam_periph *periph, char *announce_string)
 {
 	struct	cam_path *path = periph->path;
 
 	cam_periph_assert(periph, MA_OWNED);
 	periph->flags |= CAM_PERIPH_ANNOUNCED;
 
 	printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n",
 	       periph->periph_name, periph->unit_number,
 	       path->bus->sim->sim_name,
 	       path->bus->sim->unit_number,
 	       path->bus->sim->bus_id,
 	       path->bus->path_id,
 	       path->target->target_id,
 	       (uintmax_t)path->device->lun_id);
 	printf("%s%d: ", periph->periph_name, periph->unit_number);
 	if (path->device->protocol == PROTO_SCSI)
 		scsi_print_inquiry(&path->device->inq_data);
 	else if (path->device->protocol == PROTO_ATA ||
 	    path->device->protocol == PROTO_SATAPM)
 		ata_print_ident(&path->device->ident_data);
 	else if (path->device->protocol == PROTO_SEMB)
 		semb_print_ident(
 		    (struct sep_identify_data *)&path->device->ident_data);
 	else
 		printf("Unknown protocol device\n");
 	if (path->device->serial_num_len > 0) {
 		/* Don't wrap the screen  - print only the first 60 chars */
 		printf("%s%d: Serial Number %.60s\n", periph->periph_name,
 		       periph->unit_number, path->device->serial_num);
 	}
 	/* Announce transport details. */
 	(*(path->bus->xport->announce))(periph);
 	/* Announce command queueing. */
 	if (path->device->inq_flags & SID_CmdQue
 	 || path->device->flags & CAM_DEV_TAG_AFTER_COUNT) {
 		printf("%s%d: Command Queueing enabled\n",
 		       periph->periph_name, periph->unit_number);
 	}
 	/* Announce the caller's details if they were passed in. */
 	if (announce_string != NULL)
 		printf("%s%d: %s\n", periph->periph_name,
 		       periph->unit_number, announce_string);
 }
 
 void
 xpt_announce_quirks(struct cam_periph *periph, int quirks, char *bit_string)
 {
 	if (quirks != 0) {
 		printf("%s%d: quirks=0x%b\n", periph->periph_name,
 		    periph->unit_number, quirks, bit_string);
 	}
 }
 
 void
 xpt_denounce_periph(struct cam_periph *periph)
 {
 	struct	cam_path *path = periph->path;
 
 	cam_periph_assert(periph, MA_OWNED);
 	printf("%s%d at %s%d bus %d scbus%d target %d lun %jx\n",
 	       periph->periph_name, periph->unit_number,
 	       path->bus->sim->sim_name,
 	       path->bus->sim->unit_number,
 	       path->bus->sim->bus_id,
 	       path->bus->path_id,
 	       path->target->target_id,
 	       (uintmax_t)path->device->lun_id);
 	printf("%s%d: ", periph->periph_name, periph->unit_number);
 	if (path->device->protocol == PROTO_SCSI)
 		scsi_print_inquiry_short(&path->device->inq_data);
 	else if (path->device->protocol == PROTO_ATA ||
 	    path->device->protocol == PROTO_SATAPM)
 		ata_print_ident_short(&path->device->ident_data);
 	else if (path->device->protocol == PROTO_SEMB)
 		semb_print_ident_short(
 		    (struct sep_identify_data *)&path->device->ident_data);
 	else
 		printf("Unknown protocol device");
 	if (path->device->serial_num_len > 0)
 		printf(" s/n %.60s", path->device->serial_num);
 	printf(" detached\n");
 }
 
 
 int
 xpt_getattr(char *buf, size_t len, const char *attr, struct cam_path *path)
 {
 	int ret = -1, l;
 	struct ccb_dev_advinfo cdai;
 	struct scsi_vpd_id_descriptor *idd;
 
 	xpt_path_assert(path, MA_OWNED);
 
 	memset(&cdai, 0, sizeof(cdai));
 	xpt_setup_ccb(&cdai.ccb_h, path, CAM_PRIORITY_NORMAL);
 	cdai.ccb_h.func_code = XPT_DEV_ADVINFO;
 	cdai.bufsiz = len;
 
 	if (!strcmp(attr, "GEOM::ident"))
 		cdai.buftype = CDAI_TYPE_SERIAL_NUM;
 	else if (!strcmp(attr, "GEOM::physpath"))
 		cdai.buftype = CDAI_TYPE_PHYS_PATH;
 	else if (strcmp(attr, "GEOM::lunid") == 0 ||
 		 strcmp(attr, "GEOM::lunname") == 0) {
 		cdai.buftype = CDAI_TYPE_SCSI_DEVID;
 		cdai.bufsiz = CAM_SCSI_DEVID_MAXLEN;
 	} else
 		goto out;
 
 	cdai.buf = malloc(cdai.bufsiz, M_CAMXPT, M_NOWAIT|M_ZERO);
 	if (cdai.buf == NULL) {
 		ret = ENOMEM;
 		goto out;
 	}
 	xpt_action((union ccb *)&cdai); /* can only be synchronous */
 	if ((cdai.ccb_h.status & CAM_DEV_QFRZN) != 0)
 		cam_release_devq(cdai.ccb_h.path, 0, 0, 0, FALSE);
 	if (cdai.provsiz == 0)
 		goto out;
 	if (cdai.buftype == CDAI_TYPE_SCSI_DEVID) {
 		if (strcmp(attr, "GEOM::lunid") == 0) {
 			idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf,
 			    cdai.provsiz, scsi_devid_is_lun_naa);
 			if (idd == NULL)
 				idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf,
 				    cdai.provsiz, scsi_devid_is_lun_eui64);
 		} else
 			idd = NULL;
 		if (idd == NULL)
 			idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf,
 			    cdai.provsiz, scsi_devid_is_lun_t10);
 		if (idd == NULL)
 			idd = scsi_get_devid((struct scsi_vpd_device_id *)cdai.buf,
 			    cdai.provsiz, scsi_devid_is_lun_name);
 		if (idd == NULL)
 			goto out;
 		ret = 0;
 		if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_ASCII) {
 			if (idd->length < len) {
 				for (l = 0; l < idd->length; l++)
 					buf[l] = idd->identifier[l] ?
 					    idd->identifier[l] : ' ';
 				buf[l] = 0;
 			} else
 				ret = EFAULT;
 		} else if ((idd->proto_codeset & SVPD_ID_CODESET_MASK) == SVPD_ID_CODESET_UTF8) {
 			l = strnlen(idd->identifier, idd->length);
 			if (l < len) {
 				bcopy(idd->identifier, buf, l);
 				buf[l] = 0;
 			} else
 				ret = EFAULT;
 		} else {
 			if (idd->length * 2 < len) {
 				for (l = 0; l < idd->length; l++)
 					sprintf(buf + l * 2, "%02x",
 					    idd->identifier[l]);
 			} else
 				ret = EFAULT;
 		}
 	} else {
 		ret = 0;
 		if (strlcpy(buf, cdai.buf, len) >= len)
 			ret = EFAULT;
 	}
 
 out:
 	if (cdai.buf != NULL)
 		free(cdai.buf, M_CAMXPT);
 	return ret;
 }
 
 static dev_match_ret
 xptbusmatch(struct dev_match_pattern *patterns, u_int num_patterns,
 	    struct cam_eb *bus)
 {
 	dev_match_ret retval;
 	int i;
 
 	retval = DM_RET_NONE;
 
 	/*
 	 * If we aren't given something to match against, that's an error.
 	 */
 	if (bus == NULL)
 		return(DM_RET_ERROR);
 
 	/*
 	 * If there are no match entries, then this bus matches no
 	 * matter what.
 	 */
 	if ((patterns == NULL) || (num_patterns == 0))
 		return(DM_RET_DESCEND | DM_RET_COPY);
 
 	for (i = 0; i < num_patterns; i++) {
 		struct bus_match_pattern *cur_pattern;
 
 		/*
 		 * If the pattern in question isn't for a bus node, we
 		 * aren't interested.  However, we do indicate to the
 		 * calling routine that we should continue descending the
 		 * tree, since the user wants to match against lower-level
 		 * EDT elements.
 		 */
 		if (patterns[i].type != DEV_MATCH_BUS) {
 			if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)
 				retval |= DM_RET_DESCEND;
 			continue;
 		}
 
 		cur_pattern = &patterns[i].pattern.bus_pattern;
 
 		/*
 		 * If they want to match any bus node, we give them any
 		 * bus node.
 		 */
 		if (cur_pattern->flags == BUS_MATCH_ANY) {
 			/* set the copy flag */
 			retval |= DM_RET_COPY;
 
 			/*
 			 * If we've already decided on an action, go ahead
 			 * and return.
 			 */
 			if ((retval & DM_RET_ACTION_MASK) != DM_RET_NONE)
 				return(retval);
 		}
 
 		/*
 		 * Not sure why someone would do this...
 		 */
 		if (cur_pattern->flags == BUS_MATCH_NONE)
 			continue;
 
 		if (((cur_pattern->flags & BUS_MATCH_PATH) != 0)
 		 && (cur_pattern->path_id != bus->path_id))
 			continue;
 
 		if (((cur_pattern->flags & BUS_MATCH_BUS_ID) != 0)
 		 && (cur_pattern->bus_id != bus->sim->bus_id))
 			continue;
 
 		if (((cur_pattern->flags & BUS_MATCH_UNIT) != 0)
 		 && (cur_pattern->unit_number != bus->sim->unit_number))
 			continue;
 
 		if (((cur_pattern->flags & BUS_MATCH_NAME) != 0)
 		 && (strncmp(cur_pattern->dev_name, bus->sim->sim_name,
 			     DEV_IDLEN) != 0))
 			continue;
 
 		/*
 		 * If we get to this point, the user definitely wants
 		 * information on this bus.  So tell the caller to copy the
 		 * data out.
 		 */
 		retval |= DM_RET_COPY;
 
 		/*
 		 * If the return action has been set to descend, then we
 		 * know that we've already seen a non-bus matching
 		 * expression, therefore we need to further descend the tree.
 		 * This won't change by continuing around the loop, so we
 		 * go ahead and return.  If we haven't seen a non-bus
 		 * matching expression, we keep going around the loop until
 		 * we exhaust the matching expressions.  We'll set the stop
 		 * flag once we fall out of the loop.
 		 */
 		if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND)
 			return(retval);
 	}
 
 	/*
 	 * If the return action hasn't been set to descend yet, that means
 	 * we haven't seen anything other than bus matching patterns.  So
 	 * tell the caller to stop descending the tree -- the user doesn't
 	 * want to match against lower level tree elements.
 	 */
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)
 		retval |= DM_RET_STOP;
 
 	return(retval);
 }
 
 static dev_match_ret
 xptdevicematch(struct dev_match_pattern *patterns, u_int num_patterns,
 	       struct cam_ed *device)
 {
 	dev_match_ret retval;
 	int i;
 
 	retval = DM_RET_NONE;
 
 	/*
 	 * If we aren't given something to match against, that's an error.
 	 */
 	if (device == NULL)
 		return(DM_RET_ERROR);
 
 	/*
 	 * If there are no match entries, then this device matches no
 	 * matter what.
 	 */
 	if ((patterns == NULL) || (num_patterns == 0))
 		return(DM_RET_DESCEND | DM_RET_COPY);
 
 	for (i = 0; i < num_patterns; i++) {
 		struct device_match_pattern *cur_pattern;
 		struct scsi_vpd_device_id *device_id_page;
 
 		/*
 		 * If the pattern in question isn't for a device node, we
 		 * aren't interested.
 		 */
 		if (patterns[i].type != DEV_MATCH_DEVICE) {
 			if ((patterns[i].type == DEV_MATCH_PERIPH)
 			 && ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE))
 				retval |= DM_RET_DESCEND;
 			continue;
 		}
 
 		cur_pattern = &patterns[i].pattern.device_pattern;
 
 		/* Error out if mutually exclusive options are specified. */ 
 		if ((cur_pattern->flags & (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID))
 		 == (DEV_MATCH_INQUIRY|DEV_MATCH_DEVID))
 			return(DM_RET_ERROR);
 
 		/*
 		 * If they want to match any device node, we give them any
 		 * device node.
 		 */
 		if (cur_pattern->flags == DEV_MATCH_ANY)
 			goto copy_dev_node;
 
 		/*
 		 * Not sure why someone would do this...
 		 */
 		if (cur_pattern->flags == DEV_MATCH_NONE)
 			continue;
 
 		if (((cur_pattern->flags & DEV_MATCH_PATH) != 0)
 		 && (cur_pattern->path_id != device->target->bus->path_id))
 			continue;
 
 		if (((cur_pattern->flags & DEV_MATCH_TARGET) != 0)
 		 && (cur_pattern->target_id != device->target->target_id))
 			continue;
 
 		if (((cur_pattern->flags & DEV_MATCH_LUN) != 0)
 		 && (cur_pattern->target_lun != device->lun_id))
 			continue;
 
 		if (((cur_pattern->flags & DEV_MATCH_INQUIRY) != 0)
 		 && (cam_quirkmatch((caddr_t)&device->inq_data,
 				    (caddr_t)&cur_pattern->data.inq_pat,
 				    1, sizeof(cur_pattern->data.inq_pat),
 				    scsi_static_inquiry_match) == NULL))
 			continue;
 
 		device_id_page = (struct scsi_vpd_device_id *)device->device_id;
 		if (((cur_pattern->flags & DEV_MATCH_DEVID) != 0)
 		 && (device->device_id_len < SVPD_DEVICE_ID_HDR_LEN
 		  || scsi_devid_match((uint8_t *)device_id_page->desc_list,
 				      device->device_id_len
 				    - SVPD_DEVICE_ID_HDR_LEN,
 				      cur_pattern->data.devid_pat.id,
 				      cur_pattern->data.devid_pat.id_len) != 0))
 			continue;
 
 copy_dev_node:
 		/*
 		 * If we get to this point, the user definitely wants
 		 * information on this device.  So tell the caller to copy
 		 * the data out.
 		 */
 		retval |= DM_RET_COPY;
 
 		/*
 		 * If the return action has been set to descend, then we
 		 * know that we've already seen a peripheral matching
 		 * expression, therefore we need to further descend the tree.
 		 * This won't change by continuing around the loop, so we
 		 * go ahead and return.  If we haven't seen a peripheral
 		 * matching expression, we keep going around the loop until
 		 * we exhaust the matching expressions.  We'll set the stop
 		 * flag once we fall out of the loop.
 		 */
 		if ((retval & DM_RET_ACTION_MASK) == DM_RET_DESCEND)
 			return(retval);
 	}
 
 	/*
 	 * If the return action hasn't been set to descend yet, that means
 	 * we haven't seen any peripheral matching patterns.  So tell the
 	 * caller to stop descending the tree -- the user doesn't want to
 	 * match against lower level tree elements.
 	 */
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_NONE)
 		retval |= DM_RET_STOP;
 
 	return(retval);
 }
 
 /*
  * Match a single peripheral against any number of match patterns.
  */
 static dev_match_ret
 xptperiphmatch(struct dev_match_pattern *patterns, u_int num_patterns,
 	       struct cam_periph *periph)
 {
 	dev_match_ret retval;
 	int i;
 
 	/*
 	 * If we aren't given something to match against, that's an error.
 	 */
 	if (periph == NULL)
 		return(DM_RET_ERROR);
 
 	/*
 	 * If there are no match entries, then this peripheral matches no
 	 * matter what.
 	 */
 	if ((patterns == NULL) || (num_patterns == 0))
 		return(DM_RET_STOP | DM_RET_COPY);
 
 	/*
 	 * There aren't any nodes below a peripheral node, so there's no
 	 * reason to descend the tree any further.
 	 */
 	retval = DM_RET_STOP;
 
 	for (i = 0; i < num_patterns; i++) {
 		struct periph_match_pattern *cur_pattern;
 
 		/*
 		 * If the pattern in question isn't for a peripheral, we
 		 * aren't interested.
 		 */
 		if (patterns[i].type != DEV_MATCH_PERIPH)
 			continue;
 
 		cur_pattern = &patterns[i].pattern.periph_pattern;
 
 		/*
 		 * If they want to match on anything, then we will do so.
 		 */
 		if (cur_pattern->flags == PERIPH_MATCH_ANY) {
 			/* set the copy flag */
 			retval |= DM_RET_COPY;
 
 			/*
 			 * We've already set the return action to stop,
 			 * since there are no nodes below peripherals in
 			 * the tree.
 			 */
 			return(retval);
 		}
 
 		/*
 		 * Not sure why someone would do this...
 		 */
 		if (cur_pattern->flags == PERIPH_MATCH_NONE)
 			continue;
 
 		if (((cur_pattern->flags & PERIPH_MATCH_PATH) != 0)
 		 && (cur_pattern->path_id != periph->path->bus->path_id))
 			continue;
 
 		/*
 		 * For the target and lun IDs, we have to make sure the
 		 * target and lun pointers aren't NULL.  The xpt peripheral
 		 * has a wildcard target and device.
 		 */
 		if (((cur_pattern->flags & PERIPH_MATCH_TARGET) != 0)
 		 && ((periph->path->target == NULL)
 		 ||(cur_pattern->target_id != periph->path->target->target_id)))
 			continue;
 
 		if (((cur_pattern->flags & PERIPH_MATCH_LUN) != 0)
 		 && ((periph->path->device == NULL)
 		 || (cur_pattern->target_lun != periph->path->device->lun_id)))
 			continue;
 
 		if (((cur_pattern->flags & PERIPH_MATCH_UNIT) != 0)
 		 && (cur_pattern->unit_number != periph->unit_number))
 			continue;
 
 		if (((cur_pattern->flags & PERIPH_MATCH_NAME) != 0)
 		 && (strncmp(cur_pattern->periph_name, periph->periph_name,
 			     DEV_IDLEN) != 0))
 			continue;
 
 		/*
 		 * If we get to this point, the user definitely wants
 		 * information on this peripheral.  So tell the caller to
 		 * copy the data out.
 		 */
 		retval |= DM_RET_COPY;
 
 		/*
 		 * The return action has already been set to stop, since
 		 * peripherals don't have any nodes below them in the EDT.
 		 */
 		return(retval);
 	}
 
 	/*
 	 * If we get to this point, the peripheral that was passed in
 	 * doesn't match any of the patterns.
 	 */
 	return(retval);
 }
 
 static int
 xptedtbusfunc(struct cam_eb *bus, void *arg)
 {
 	struct ccb_dev_match *cdm;
 	struct cam_et *target;
 	dev_match_ret retval;
 
 	cdm = (struct ccb_dev_match *)arg;
 
 	/*
 	 * If our position is for something deeper in the tree, that means
 	 * that we've already seen this node.  So, we keep going down.
 	 */
 	if ((cdm->pos.position_type & CAM_DEV_POS_BUS)
 	 && (cdm->pos.cookie.bus == bus)
 	 && (cdm->pos.position_type & CAM_DEV_POS_TARGET)
 	 && (cdm->pos.cookie.target != NULL))
 		retval = DM_RET_DESCEND;
 	else
 		retval = xptbusmatch(cdm->patterns, cdm->num_patterns, bus);
 
 	/*
 	 * If we got an error, bail out of the search.
 	 */
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) {
 		cdm->status = CAM_DEV_MATCH_ERROR;
 		return(0);
 	}
 
 	/*
 	 * If the copy flag is set, copy this bus out.
 	 */
 	if (retval & DM_RET_COPY) {
 		int spaceleft, j;
 
 		spaceleft = cdm->match_buf_len - (cdm->num_matches *
 			sizeof(struct dev_match_result));
 
 		/*
 		 * If we don't have enough space to put in another
 		 * match result, save our position and tell the
 		 * user there are more devices to check.
 		 */
 		if (spaceleft < sizeof(struct dev_match_result)) {
 			bzero(&cdm->pos, sizeof(cdm->pos));
 			cdm->pos.position_type =
 				CAM_DEV_POS_EDT | CAM_DEV_POS_BUS;
 
 			cdm->pos.cookie.bus = bus;
 			cdm->pos.generations[CAM_BUS_GENERATION]=
 				xsoftc.bus_generation;
 			cdm->status = CAM_DEV_MATCH_MORE;
 			return(0);
 		}
 		j = cdm->num_matches;
 		cdm->num_matches++;
 		cdm->matches[j].type = DEV_MATCH_BUS;
 		cdm->matches[j].result.bus_result.path_id = bus->path_id;
 		cdm->matches[j].result.bus_result.bus_id = bus->sim->bus_id;
 		cdm->matches[j].result.bus_result.unit_number =
 			bus->sim->unit_number;
 		strncpy(cdm->matches[j].result.bus_result.dev_name,
 			bus->sim->sim_name, DEV_IDLEN);
 	}
 
 	/*
 	 * If the user is only interested in busses, there's no
 	 * reason to descend to the next level in the tree.
 	 */
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP)
 		return(1);
 
 	/*
 	 * If there is a target generation recorded, check it to
 	 * make sure the target list hasn't changed.
 	 */
 	mtx_lock(&bus->eb_mtx);
 	if ((cdm->pos.position_type & CAM_DEV_POS_BUS)
 	 && (cdm->pos.cookie.bus == bus)
 	 && (cdm->pos.position_type & CAM_DEV_POS_TARGET)
 	 && (cdm->pos.cookie.target != NULL)) {
 		if ((cdm->pos.generations[CAM_TARGET_GENERATION] !=
 		    bus->generation)) {
 			mtx_unlock(&bus->eb_mtx);
 			cdm->status = CAM_DEV_MATCH_LIST_CHANGED;
 			return (0);
 		}
 		target = (struct cam_et *)cdm->pos.cookie.target;
 		target->refcount++;
 	} else
 		target = NULL;
 	mtx_unlock(&bus->eb_mtx);
 
 	return (xpttargettraverse(bus, target, xptedttargetfunc, arg));
 }
 
 static int
 xptedttargetfunc(struct cam_et *target, void *arg)
 {
 	struct ccb_dev_match *cdm;
 	struct cam_eb *bus;
 	struct cam_ed *device;
 
 	cdm = (struct ccb_dev_match *)arg;
 	bus = target->bus;
 
 	/*
 	 * If there is a device list generation recorded, check it to
 	 * make sure the device list hasn't changed.
 	 */
 	mtx_lock(&bus->eb_mtx);
 	if ((cdm->pos.position_type & CAM_DEV_POS_BUS)
 	 && (cdm->pos.cookie.bus == bus)
 	 && (cdm->pos.position_type & CAM_DEV_POS_TARGET)
 	 && (cdm->pos.cookie.target == target)
 	 && (cdm->pos.position_type & CAM_DEV_POS_DEVICE)
 	 && (cdm->pos.cookie.device != NULL)) {
 		if (cdm->pos.generations[CAM_DEV_GENERATION] !=
 		    target->generation) {
 			mtx_unlock(&bus->eb_mtx);
 			cdm->status = CAM_DEV_MATCH_LIST_CHANGED;
 			return(0);
 		}
 		device = (struct cam_ed *)cdm->pos.cookie.device;
 		device->refcount++;
 	} else
 		device = NULL;
 	mtx_unlock(&bus->eb_mtx);
 
 	return (xptdevicetraverse(target, device, xptedtdevicefunc, arg));
 }
 
 static int
 xptedtdevicefunc(struct cam_ed *device, void *arg)
 {
 	struct cam_eb *bus;
 	struct cam_periph *periph;
 	struct ccb_dev_match *cdm;
 	dev_match_ret retval;
 
 	cdm = (struct ccb_dev_match *)arg;
 	bus = device->target->bus;
 
 	/*
 	 * If our position is for something deeper in the tree, that means
 	 * that we've already seen this node.  So, we keep going down.
 	 */
 	if ((cdm->pos.position_type & CAM_DEV_POS_DEVICE)
 	 && (cdm->pos.cookie.device == device)
 	 && (cdm->pos.position_type & CAM_DEV_POS_PERIPH)
 	 && (cdm->pos.cookie.periph != NULL))
 		retval = DM_RET_DESCEND;
 	else
 		retval = xptdevicematch(cdm->patterns, cdm->num_patterns,
 					device);
 
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) {
 		cdm->status = CAM_DEV_MATCH_ERROR;
 		return(0);
 	}
 
 	/*
 	 * If the copy flag is set, copy this device out.
 	 */
 	if (retval & DM_RET_COPY) {
 		int spaceleft, j;
 
 		spaceleft = cdm->match_buf_len - (cdm->num_matches *
 			sizeof(struct dev_match_result));
 
 		/*
 		 * If we don't have enough space to put in another
 		 * match result, save our position and tell the
 		 * user there are more devices to check.
 		 */
 		if (spaceleft < sizeof(struct dev_match_result)) {
 			bzero(&cdm->pos, sizeof(cdm->pos));
 			cdm->pos.position_type =
 				CAM_DEV_POS_EDT | CAM_DEV_POS_BUS |
 				CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE;
 
 			cdm->pos.cookie.bus = device->target->bus;
 			cdm->pos.generations[CAM_BUS_GENERATION]=
 				xsoftc.bus_generation;
 			cdm->pos.cookie.target = device->target;
 			cdm->pos.generations[CAM_TARGET_GENERATION] =
 				device->target->bus->generation;
 			cdm->pos.cookie.device = device;
 			cdm->pos.generations[CAM_DEV_GENERATION] =
 				device->target->generation;
 			cdm->status = CAM_DEV_MATCH_MORE;
 			return(0);
 		}
 		j = cdm->num_matches;
 		cdm->num_matches++;
 		cdm->matches[j].type = DEV_MATCH_DEVICE;
 		cdm->matches[j].result.device_result.path_id =
 			device->target->bus->path_id;
 		cdm->matches[j].result.device_result.target_id =
 			device->target->target_id;
 		cdm->matches[j].result.device_result.target_lun =
 			device->lun_id;
 		cdm->matches[j].result.device_result.protocol =
 			device->protocol;
 		bcopy(&device->inq_data,
 		      &cdm->matches[j].result.device_result.inq_data,
 		      sizeof(struct scsi_inquiry_data));
 		bcopy(&device->ident_data,
 		      &cdm->matches[j].result.device_result.ident_data,
 		      sizeof(struct ata_params));
 
 		/* Let the user know whether this device is unconfigured */
 		if (device->flags & CAM_DEV_UNCONFIGURED)
 			cdm->matches[j].result.device_result.flags =
 				DEV_RESULT_UNCONFIGURED;
 		else
 			cdm->matches[j].result.device_result.flags =
 				DEV_RESULT_NOFLAG;
 	}
 
 	/*
 	 * If the user isn't interested in peripherals, don't descend
 	 * the tree any further.
 	 */
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_STOP)
 		return(1);
 
 	/*
 	 * If there is a peripheral list generation recorded, make sure
 	 * it hasn't changed.
 	 */
 	xpt_lock_buses();
 	mtx_lock(&bus->eb_mtx);
 	if ((cdm->pos.position_type & CAM_DEV_POS_BUS)
 	 && (cdm->pos.cookie.bus == bus)
 	 && (cdm->pos.position_type & CAM_DEV_POS_TARGET)
 	 && (cdm->pos.cookie.target == device->target)
 	 && (cdm->pos.position_type & CAM_DEV_POS_DEVICE)
 	 && (cdm->pos.cookie.device == device)
 	 && (cdm->pos.position_type & CAM_DEV_POS_PERIPH)
 	 && (cdm->pos.cookie.periph != NULL)) {
 		if (cdm->pos.generations[CAM_PERIPH_GENERATION] !=
 		    device->generation) {
 			mtx_unlock(&bus->eb_mtx);
 			xpt_unlock_buses();
 			cdm->status = CAM_DEV_MATCH_LIST_CHANGED;
 			return(0);
 		}
 		periph = (struct cam_periph *)cdm->pos.cookie.periph;
 		periph->refcount++;
 	} else
 		periph = NULL;
 	mtx_unlock(&bus->eb_mtx);
 	xpt_unlock_buses();
 
 	return (xptperiphtraverse(device, periph, xptedtperiphfunc, arg));
 }
 
 static int
 xptedtperiphfunc(struct cam_periph *periph, void *arg)
 {
 	struct ccb_dev_match *cdm;
 	dev_match_ret retval;
 
 	cdm = (struct ccb_dev_match *)arg;
 
 	retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph);
 
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) {
 		cdm->status = CAM_DEV_MATCH_ERROR;
 		return(0);
 	}
 
 	/*
 	 * If the copy flag is set, copy this peripheral out.
 	 */
 	if (retval & DM_RET_COPY) {
 		int spaceleft, j;
 
 		spaceleft = cdm->match_buf_len - (cdm->num_matches *
 			sizeof(struct dev_match_result));
 
 		/*
 		 * If we don't have enough space to put in another
 		 * match result, save our position and tell the
 		 * user there are more devices to check.
 		 */
 		if (spaceleft < sizeof(struct dev_match_result)) {
 			bzero(&cdm->pos, sizeof(cdm->pos));
 			cdm->pos.position_type =
 				CAM_DEV_POS_EDT | CAM_DEV_POS_BUS |
 				CAM_DEV_POS_TARGET | CAM_DEV_POS_DEVICE |
 				CAM_DEV_POS_PERIPH;
 
 			cdm->pos.cookie.bus = periph->path->bus;
 			cdm->pos.generations[CAM_BUS_GENERATION]=
 				xsoftc.bus_generation;
 			cdm->pos.cookie.target = periph->path->target;
 			cdm->pos.generations[CAM_TARGET_GENERATION] =
 				periph->path->bus->generation;
 			cdm->pos.cookie.device = periph->path->device;
 			cdm->pos.generations[CAM_DEV_GENERATION] =
 				periph->path->target->generation;
 			cdm->pos.cookie.periph = periph;
 			cdm->pos.generations[CAM_PERIPH_GENERATION] =
 				periph->path->device->generation;
 			cdm->status = CAM_DEV_MATCH_MORE;
 			return(0);
 		}
 
 		j = cdm->num_matches;
 		cdm->num_matches++;
 		cdm->matches[j].type = DEV_MATCH_PERIPH;
 		cdm->matches[j].result.periph_result.path_id =
 			periph->path->bus->path_id;
 		cdm->matches[j].result.periph_result.target_id =
 			periph->path->target->target_id;
 		cdm->matches[j].result.periph_result.target_lun =
 			periph->path->device->lun_id;
 		cdm->matches[j].result.periph_result.unit_number =
 			periph->unit_number;
 		strncpy(cdm->matches[j].result.periph_result.periph_name,
 			periph->periph_name, DEV_IDLEN);
 	}
 
 	return(1);
 }
 
 static int
 xptedtmatch(struct ccb_dev_match *cdm)
 {
 	struct cam_eb *bus;
 	int ret;
 
 	cdm->num_matches = 0;
 
 	/*
 	 * Check the bus list generation.  If it has changed, the user
 	 * needs to reset everything and start over.
 	 */
 	xpt_lock_buses();
 	if ((cdm->pos.position_type & CAM_DEV_POS_BUS)
 	 && (cdm->pos.cookie.bus != NULL)) {
 		if (cdm->pos.generations[CAM_BUS_GENERATION] !=
 		    xsoftc.bus_generation) {
 			xpt_unlock_buses();
 			cdm->status = CAM_DEV_MATCH_LIST_CHANGED;
 			return(0);
 		}
 		bus = (struct cam_eb *)cdm->pos.cookie.bus;
 		bus->refcount++;
 	} else
 		bus = NULL;
 	xpt_unlock_buses();
 
 	ret = xptbustraverse(bus, xptedtbusfunc, cdm);
 
 	/*
 	 * If we get back 0, that means that we had to stop before fully
 	 * traversing the EDT.  It also means that one of the subroutines
 	 * has set the status field to the proper value.  If we get back 1,
 	 * we've fully traversed the EDT and copied out any matching entries.
 	 */
 	if (ret == 1)
 		cdm->status = CAM_DEV_MATCH_LAST;
 
 	return(ret);
 }
 
 static int
 xptplistpdrvfunc(struct periph_driver **pdrv, void *arg)
 {
 	struct cam_periph *periph;
 	struct ccb_dev_match *cdm;
 
 	cdm = (struct ccb_dev_match *)arg;
 
 	xpt_lock_buses();
 	if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR)
 	 && (cdm->pos.cookie.pdrv == pdrv)
 	 && (cdm->pos.position_type & CAM_DEV_POS_PERIPH)
 	 && (cdm->pos.cookie.periph != NULL)) {
 		if (cdm->pos.generations[CAM_PERIPH_GENERATION] !=
 		    (*pdrv)->generation) {
 			xpt_unlock_buses();
 			cdm->status = CAM_DEV_MATCH_LIST_CHANGED;
 			return(0);
 		}
 		periph = (struct cam_periph *)cdm->pos.cookie.periph;
 		periph->refcount++;
 	} else
 		periph = NULL;
 	xpt_unlock_buses();
 
 	return (xptpdperiphtraverse(pdrv, periph, xptplistperiphfunc, arg));
 }
 
 static int
 xptplistperiphfunc(struct cam_periph *periph, void *arg)
 {
 	struct ccb_dev_match *cdm;
 	dev_match_ret retval;
 
 	cdm = (struct ccb_dev_match *)arg;
 
 	retval = xptperiphmatch(cdm->patterns, cdm->num_patterns, periph);
 
 	if ((retval & DM_RET_ACTION_MASK) == DM_RET_ERROR) {
 		cdm->status = CAM_DEV_MATCH_ERROR;
 		return(0);
 	}
 
 	/*
 	 * If the copy flag is set, copy this peripheral out.
 	 */
 	if (retval & DM_RET_COPY) {
 		int spaceleft, j;
 
 		spaceleft = cdm->match_buf_len - (cdm->num_matches *
 			sizeof(struct dev_match_result));
 
 		/*
 		 * If we don't have enough space to put in another
 		 * match result, save our position and tell the
 		 * user there are more devices to check.
 		 */
 		if (spaceleft < sizeof(struct dev_match_result)) {
 			struct periph_driver **pdrv;
 
 			pdrv = NULL;
 			bzero(&cdm->pos, sizeof(cdm->pos));
 			cdm->pos.position_type =
 				CAM_DEV_POS_PDRV | CAM_DEV_POS_PDPTR |
 				CAM_DEV_POS_PERIPH;
 
 			/*
 			 * This may look a bit nonsensical, but it is
 			 * actually quite logical.  There are very few
 			 * peripheral drivers, and bloating every peripheral
 			 * structure with a pointer back to its parent
 			 * peripheral driver linker set entry would cost
 			 * more in the long run than doing this quick lookup.
 			 */
 			for (pdrv = periph_drivers; *pdrv != NULL; pdrv++) {
 				if (strcmp((*pdrv)->driver_name,
 				    periph->periph_name) == 0)
 					break;
 			}
 
 			if (*pdrv == NULL) {
 				cdm->status = CAM_DEV_MATCH_ERROR;
 				return(0);
 			}
 
 			cdm->pos.cookie.pdrv = pdrv;
 			/*
 			 * The periph generation slot does double duty, as
 			 * does the periph pointer slot.  They are used for
 			 * both edt and pdrv lookups and positioning.
 			 */
 			cdm->pos.cookie.periph = periph;
 			cdm->pos.generations[CAM_PERIPH_GENERATION] =
 				(*pdrv)->generation;
 			cdm->status = CAM_DEV_MATCH_MORE;
 			return(0);
 		}
 
 		j = cdm->num_matches;
 		cdm->num_matches++;
 		cdm->matches[j].type = DEV_MATCH_PERIPH;
 		cdm->matches[j].result.periph_result.path_id =
 			periph->path->bus->path_id;
 
 		/*
 		 * The transport layer peripheral doesn't have a target or
 		 * lun.
 		 */
 		if (periph->path->target)
 			cdm->matches[j].result.periph_result.target_id =
 				periph->path->target->target_id;
 		else
 			cdm->matches[j].result.periph_result.target_id =
 				CAM_TARGET_WILDCARD;
 
 		if (periph->path->device)
 			cdm->matches[j].result.periph_result.target_lun =
 				periph->path->device->lun_id;
 		else
 			cdm->matches[j].result.periph_result.target_lun =
 				CAM_LUN_WILDCARD;
 
 		cdm->matches[j].result.periph_result.unit_number =
 			periph->unit_number;
 		strncpy(cdm->matches[j].result.periph_result.periph_name,
 			periph->periph_name, DEV_IDLEN);
 	}
 
 	return(1);
 }
 
 static int
 xptperiphlistmatch(struct ccb_dev_match *cdm)
 {
 	int ret;
 
 	cdm->num_matches = 0;
 
 	/*
 	 * At this point in the edt traversal function, we check the bus
 	 * list generation to make sure that no busses have been added or
 	 * removed since the user last sent an XPT_DEV_MATCH ccb through.
 	 * For the peripheral driver list traversal function, however, we
 	 * don't have to worry about new peripheral driver types coming or
 	 * going; they're in a linker set, and therefore can't change
 	 * without a recompile.
 	 */
 
 	if ((cdm->pos.position_type & CAM_DEV_POS_PDPTR)
 	 && (cdm->pos.cookie.pdrv != NULL))
 		ret = xptpdrvtraverse(
 				(struct periph_driver **)cdm->pos.cookie.pdrv,
 				xptplistpdrvfunc, cdm);
 	else
 		ret = xptpdrvtraverse(NULL, xptplistpdrvfunc, cdm);
 
 	/*
 	 * If we get back 0, that means that we had to stop before fully
 	 * traversing the peripheral driver tree.  It also means that one of
 	 * the subroutines has set the status field to the proper value.  If
 	 * we get back 1, we've fully traversed the peripheral driver lists
 	 * and copied out any matching entries.
 	 */
 	if (ret == 1)
 		cdm->status = CAM_DEV_MATCH_LAST;
 
 	return(ret);
 }
 
 static int
 xptbustraverse(struct cam_eb *start_bus, xpt_busfunc_t *tr_func, void *arg)
 {
 	struct cam_eb *bus, *next_bus;
 	int retval;
 
 	retval = 1;
 	if (start_bus)
 		bus = start_bus;
 	else {
 		xpt_lock_buses();
 		bus = TAILQ_FIRST(&xsoftc.xpt_busses);
 		if (bus == NULL) {
 			xpt_unlock_buses();
 			return (retval);
 		}
 		bus->refcount++;
 		xpt_unlock_buses();
 	}
 	for (; bus != NULL; bus = next_bus) {
 		retval = tr_func(bus, arg);
 		if (retval == 0) {
 			xpt_release_bus(bus);
 			break;
 		}
 		xpt_lock_buses();
 		next_bus = TAILQ_NEXT(bus, links);
 		if (next_bus)
 			next_bus->refcount++;
 		xpt_unlock_buses();
 		xpt_release_bus(bus);
 	}
 	return(retval);
 }
 
 static int
 xpttargettraverse(struct cam_eb *bus, struct cam_et *start_target,
 		  xpt_targetfunc_t *tr_func, void *arg)
 {
 	struct cam_et *target, *next_target;
 	int retval;
 
 	retval = 1;
 	if (start_target)
 		target = start_target;
 	else {
 		mtx_lock(&bus->eb_mtx);
 		target = TAILQ_FIRST(&bus->et_entries);
 		if (target == NULL) {
 			mtx_unlock(&bus->eb_mtx);
 			return (retval);
 		}
 		target->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 	}
 	for (; target != NULL; target = next_target) {
 		retval = tr_func(target, arg);
 		if (retval == 0) {
 			xpt_release_target(target);
 			break;
 		}
 		mtx_lock(&bus->eb_mtx);
 		next_target = TAILQ_NEXT(target, links);
 		if (next_target)
 			next_target->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 		xpt_release_target(target);
 	}
 	return(retval);
 }
 
 static int
 xptdevicetraverse(struct cam_et *target, struct cam_ed *start_device,
 		  xpt_devicefunc_t *tr_func, void *arg)
 {
 	struct cam_eb *bus;
 	struct cam_ed *device, *next_device;
 	int retval;
 
 	retval = 1;
 	bus = target->bus;
 	if (start_device)
 		device = start_device;
 	else {
 		mtx_lock(&bus->eb_mtx);
 		device = TAILQ_FIRST(&target->ed_entries);
 		if (device == NULL) {
 			mtx_unlock(&bus->eb_mtx);
 			return (retval);
 		}
 		device->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 	}
 	for (; device != NULL; device = next_device) {
 		mtx_lock(&device->device_mtx);
 		retval = tr_func(device, arg);
 		mtx_unlock(&device->device_mtx);
 		if (retval == 0) {
 			xpt_release_device(device);
 			break;
 		}
 		mtx_lock(&bus->eb_mtx);
 		next_device = TAILQ_NEXT(device, links);
 		if (next_device)
 			next_device->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 		xpt_release_device(device);
 	}
 	return(retval);
 }
 
 static int
 xptperiphtraverse(struct cam_ed *device, struct cam_periph *start_periph,
 		  xpt_periphfunc_t *tr_func, void *arg)
 {
 	struct cam_eb *bus;
 	struct cam_periph *periph, *next_periph;
 	int retval;
 
 	retval = 1;
 
 	bus = device->target->bus;
 	if (start_periph)
 		periph = start_periph;
 	else {
 		xpt_lock_buses();
 		mtx_lock(&bus->eb_mtx);
 		periph = SLIST_FIRST(&device->periphs);
 		while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0)
 			periph = SLIST_NEXT(periph, periph_links);
 		if (periph == NULL) {
 			mtx_unlock(&bus->eb_mtx);
 			xpt_unlock_buses();
 			return (retval);
 		}
 		periph->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 		xpt_unlock_buses();
 	}
 	for (; periph != NULL; periph = next_periph) {
 		retval = tr_func(periph, arg);
 		if (retval == 0) {
 			cam_periph_release_locked(periph);
 			break;
 		}
 		xpt_lock_buses();
 		mtx_lock(&bus->eb_mtx);
 		next_periph = SLIST_NEXT(periph, periph_links);
 		while (next_periph != NULL &&
 		    (next_periph->flags & CAM_PERIPH_FREE) != 0)
 			next_periph = SLIST_NEXT(next_periph, periph_links);
 		if (next_periph)
 			next_periph->refcount++;
 		mtx_unlock(&bus->eb_mtx);
 		xpt_unlock_buses();
 		cam_periph_release_locked(periph);
 	}
 	return(retval);
 }
 
 static int
 xptpdrvtraverse(struct periph_driver **start_pdrv,
 		xpt_pdrvfunc_t *tr_func, void *arg)
 {
 	struct periph_driver **pdrv;
 	int retval;
 
 	retval = 1;
 
 	/*
 	 * We don't traverse the peripheral driver list like we do the
 	 * other lists, because it is a linker set, and therefore cannot be
 	 * changed during runtime.  If the peripheral driver list is ever
 	 * re-done to be something other than a linker set (i.e. it can
 	 * change while the system is running), the list traversal should
 	 * be modified to work like the other traversal functions.
 	 */
 	for (pdrv = (start_pdrv ? start_pdrv : periph_drivers);
 	     *pdrv != NULL; pdrv++) {
 		retval = tr_func(pdrv, arg);
 
 		if (retval == 0)
 			return(retval);
 	}
 
 	return(retval);
 }
 
 static int
 xptpdperiphtraverse(struct periph_driver **pdrv,
 		    struct cam_periph *start_periph,
 		    xpt_periphfunc_t *tr_func, void *arg)
 {
 	struct cam_periph *periph, *next_periph;
 	int retval;
 
 	retval = 1;
 
 	if (start_periph)
 		periph = start_periph;
 	else {
 		xpt_lock_buses();
 		periph = TAILQ_FIRST(&(*pdrv)->units);
 		while (periph != NULL && (periph->flags & CAM_PERIPH_FREE) != 0)
 			periph = TAILQ_NEXT(periph, unit_links);
 		if (periph == NULL) {
 			xpt_unlock_buses();
 			return (retval);
 		}
 		periph->refcount++;
 		xpt_unlock_buses();
 	}
 	for (; periph != NULL; periph = next_periph) {
 		cam_periph_lock(periph);
 		retval = tr_func(periph, arg);
 		cam_periph_unlock(periph);
 		if (retval == 0) {
 			cam_periph_release(periph);
 			break;
 		}
 		xpt_lock_buses();
 		next_periph = TAILQ_NEXT(periph, unit_links);
 		while (next_periph != NULL &&
 		    (next_periph->flags & CAM_PERIPH_FREE) != 0)
 			next_periph = TAILQ_NEXT(next_periph, unit_links);
 		if (next_periph)
 			next_periph->refcount++;
 		xpt_unlock_buses();
 		cam_periph_release(periph);
 	}
 	return(retval);
 }
 
 static int
 xptdefbusfunc(struct cam_eb *bus, void *arg)
 {
 	struct xpt_traverse_config *tr_config;
 
 	tr_config = (struct xpt_traverse_config *)arg;
 
 	if (tr_config->depth == XPT_DEPTH_BUS) {
 		xpt_busfunc_t *tr_func;
 
 		tr_func = (xpt_busfunc_t *)tr_config->tr_func;
 
 		return(tr_func(bus, tr_config->tr_arg));
 	} else
 		return(xpttargettraverse(bus, NULL, xptdeftargetfunc, arg));
 }
 
 static int
 xptdeftargetfunc(struct cam_et *target, void *arg)
 {
 	struct xpt_traverse_config *tr_config;
 
 	tr_config = (struct xpt_traverse_config *)arg;
 
 	if (tr_config->depth == XPT_DEPTH_TARGET) {
 		xpt_targetfunc_t *tr_func;
 
 		tr_func = (xpt_targetfunc_t *)tr_config->tr_func;
 
 		return(tr_func(target, tr_config->tr_arg));
 	} else
 		return(xptdevicetraverse(target, NULL, xptdefdevicefunc, arg));
 }
 
 static int
 xptdefdevicefunc(struct cam_ed *device, void *arg)
 {
 	struct xpt_traverse_config *tr_config;
 
 	tr_config = (struct xpt_traverse_config *)arg;
 
 	if (tr_config->depth == XPT_DEPTH_DEVICE) {
 		xpt_devicefunc_t *tr_func;
 
 		tr_func = (xpt_devicefunc_t *)tr_config->tr_func;
 
 		return(tr_func(device, tr_config->tr_arg));
 	} else
 		return(xptperiphtraverse(device, NULL, xptdefperiphfunc, arg));
 }
 
 static int
 xptdefperiphfunc(struct cam_periph *periph, void *arg)
 {
 	struct xpt_traverse_config *tr_config;
 	xpt_periphfunc_t *tr_func;
 
 	tr_config = (struct xpt_traverse_config *)arg;
 
 	tr_func = (xpt_periphfunc_t *)tr_config->tr_func;
 
 	/*
 	 * Unlike the other default functions, we don't check for depth
 	 * here.  The peripheral driver level is the last level in the EDT,
 	 * so if we're here, we should execute the function in question.
 	 */
 	return(tr_func(periph, tr_config->tr_arg));
 }
 
 /*
  * Execute the given function for every bus in the EDT.
  */
 static int
 xpt_for_all_busses(xpt_busfunc_t *tr_func, void *arg)
 {
 	struct xpt_traverse_config tr_config;
 
 	tr_config.depth = XPT_DEPTH_BUS;
 	tr_config.tr_func = tr_func;
 	tr_config.tr_arg = arg;
 
 	return(xptbustraverse(NULL, xptdefbusfunc, &tr_config));
 }
 
 /*
  * Execute the given function for every device in the EDT.
  */
 static int
 xpt_for_all_devices(xpt_devicefunc_t *tr_func, void *arg)
 {
 	struct xpt_traverse_config tr_config;
 
 	tr_config.depth = XPT_DEPTH_DEVICE;
 	tr_config.tr_func = tr_func;
 	tr_config.tr_arg = arg;
 
 	return(xptbustraverse(NULL, xptdefbusfunc, &tr_config));
 }
 
 static int
 xptsetasyncfunc(struct cam_ed *device, void *arg)
 {
 	struct cam_path path;
 	struct ccb_getdev cgd;
 	struct ccb_setasync *csa = (struct ccb_setasync *)arg;
 
 	/*
 	 * Don't report unconfigured devices (Wildcard devs,
 	 * devices only for target mode, device instances
 	 * that have been invalidated but are waiting for
 	 * their last reference count to be released).
 	 */
 	if ((device->flags & CAM_DEV_UNCONFIGURED) != 0)
 		return (1);
 
 	xpt_compile_path(&path,
 			 NULL,
 			 device->target->bus->path_id,
 			 device->target->target_id,
 			 device->lun_id);
 	xpt_setup_ccb(&cgd.ccb_h, &path, CAM_PRIORITY_NORMAL);
 	cgd.ccb_h.func_code = XPT_GDEV_TYPE;
 	xpt_action((union ccb *)&cgd);
 	csa->callback(csa->callback_arg,
 			    AC_FOUND_DEVICE,
 			    &path, &cgd);
 	xpt_release_path(&path);
 
 	return(1);
 }
 
 static int
 xptsetasyncbusfunc(struct cam_eb *bus, void *arg)
 {
 	struct cam_path path;
 	struct ccb_pathinq cpi;
 	struct ccb_setasync *csa = (struct ccb_setasync *)arg;
 
 	xpt_compile_path(&path, /*periph*/NULL,
 			 bus->path_id,
 			 CAM_TARGET_WILDCARD,
 			 CAM_LUN_WILDCARD);
 	xpt_path_lock(&path);
 	xpt_setup_ccb(&cpi.ccb_h, &path, CAM_PRIORITY_NORMAL);
 	cpi.ccb_h.func_code = XPT_PATH_INQ;
 	xpt_action((union ccb *)&cpi);
 	csa->callback(csa->callback_arg,
 			    AC_PATH_REGISTERED,
 			    &path, &cpi);
 	xpt_path_unlock(&path);
 	xpt_release_path(&path);
 
 	return(1);
 }
 
 void
 xpt_action(union ccb *start_ccb)
 {
 
 	CAM_DEBUG(start_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_action\n"));
 
 	start_ccb->ccb_h.status = CAM_REQ_INPROG;
 	(*(start_ccb->ccb_h.path->bus->xport->action))(start_ccb);
 }
 
 void
 xpt_action_default(union ccb *start_ccb)
 {
 	struct cam_path *path;
 	struct cam_sim *sim;
 	int lock;
 
 	path = start_ccb->ccb_h.path;
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_action_default\n"));
 
 	switch (start_ccb->ccb_h.func_code) {
 	case XPT_SCSI_IO:
 	{
 		struct cam_ed *device;
 
 		/*
 		 * For the sake of compatibility with SCSI-1
 		 * devices that may not understand the identify
 		 * message, we include lun information in the
 		 * second byte of all commands.  SCSI-1 specifies
 		 * that luns are a 3 bit value and reserves only 3
 		 * bits for lun information in the CDB.  Later
 		 * revisions of the SCSI spec allow for more than 8
 		 * luns, but have deprecated lun information in the
 		 * CDB.  So, if the lun won't fit, we must omit.
 		 *
 		 * Also be aware that during initial probing for devices,
 		 * the inquiry information is unknown but initialized to 0.
 		 * This means that this code will be exercised while probing
 		 * devices with an ANSI revision greater than 2.
 		 */
 		device = path->device;
 		if (device->protocol_version <= SCSI_REV_2
 		 && start_ccb->ccb_h.target_lun < 8
 		 && (start_ccb->ccb_h.flags & CAM_CDB_POINTER) == 0) {
 
 			start_ccb->csio.cdb_io.cdb_bytes[1] |=
 			    start_ccb->ccb_h.target_lun << 5;
 		}
 		start_ccb->csio.scsi_status = SCSI_STATUS_OK;
 	}
 	/* FALLTHROUGH */
 	case XPT_TARGET_IO:
 	case XPT_CONT_TARGET_IO:
 		start_ccb->csio.sense_resid = 0;
 		start_ccb->csio.resid = 0;
 		/* FALLTHROUGH */
 	case XPT_ATA_IO:
 		if (start_ccb->ccb_h.func_code == XPT_ATA_IO)
 			start_ccb->ataio.resid = 0;
 		/* FALLTHROUGH */
 	case XPT_RESET_DEV:
 	case XPT_ENG_EXEC:
 	case XPT_SMP_IO:
 	{
 		struct cam_devq *devq;
 
 		devq = path->bus->sim->devq;
 		mtx_lock(&devq->send_mtx);
 		cam_ccbq_insert_ccb(&path->device->ccbq, start_ccb);
 		if (xpt_schedule_devq(devq, path->device) != 0)
 			xpt_run_devq(devq);
 		mtx_unlock(&devq->send_mtx);
 		break;
 	}
 	case XPT_CALC_GEOMETRY:
 		/* Filter out garbage */
 		if (start_ccb->ccg.block_size == 0
 		 || start_ccb->ccg.volume_size == 0) {
 			start_ccb->ccg.cylinders = 0;
 			start_ccb->ccg.heads = 0;
 			start_ccb->ccg.secs_per_track = 0;
 			start_ccb->ccb_h.status = CAM_REQ_CMP;
 			break;
 		}
 #if defined(PC98) || defined(__sparc64__)
 		/*
 		 * In a PC-98 system, geometry translation depends on
 		 * the "real" device geometry obtained from mode page 4.
 		 * SCSI geometry translation is performed in the
 		 * initialization routine of the SCSI BIOS and the result
 		 * stored in host memory.  If the translation is available
 		 * in host memory, use it.  If not, rely on the default
 		 * translation the device driver performs.
 		 * For sparc64, we may need to adjust the geometry of large
 		 * disks in order to fit the limitations of the 16-bit
 		 * fields of the VTOC8 disk label.
 		 */
 		if (scsi_da_bios_params(&start_ccb->ccg) != 0) {
 			start_ccb->ccb_h.status = CAM_REQ_CMP;
 			break;
 		}
 #endif
 		goto call_sim;
 	case XPT_ABORT:
 	{
 		union ccb* abort_ccb;
 
 		abort_ccb = start_ccb->cab.abort_ccb;
 		if (XPT_FC_IS_DEV_QUEUED(abort_ccb)) {
 
 			if (abort_ccb->ccb_h.pinfo.index >= 0) {
 				struct cam_ccbq *ccbq;
 				struct cam_ed *device;
 
 				device = abort_ccb->ccb_h.path->device;
 				ccbq = &device->ccbq;
 				cam_ccbq_remove_ccb(ccbq, abort_ccb);
 				abort_ccb->ccb_h.status =
 				    CAM_REQ_ABORTED|CAM_DEV_QFRZN;
 				xpt_freeze_devq(abort_ccb->ccb_h.path, 1);
 				xpt_done(abort_ccb);
 				start_ccb->ccb_h.status = CAM_REQ_CMP;
 				break;
 			}
 			if (abort_ccb->ccb_h.pinfo.index == CAM_UNQUEUED_INDEX
 			 && (abort_ccb->ccb_h.status & CAM_SIM_QUEUED) == 0) {
 				/*
 				 * We've caught this ccb en route to
 				 * the SIM.  Flag it for abort and the
 				 * SIM will do so just before starting
 				 * real work on the CCB.
 				 */
 				abort_ccb->ccb_h.status =
 				    CAM_REQ_ABORTED|CAM_DEV_QFRZN;
 				xpt_freeze_devq(abort_ccb->ccb_h.path, 1);
 				start_ccb->ccb_h.status = CAM_REQ_CMP;
 				break;
 			}
 		}
 		if (XPT_FC_IS_QUEUED(abort_ccb)
 		 && (abort_ccb->ccb_h.pinfo.index == CAM_DONEQ_INDEX)) {
 			/*
 			 * It's already completed but waiting
 			 * for our SWI to get to it.
 			 */
 			start_ccb->ccb_h.status = CAM_UA_ABORT;
 			break;
 		}
 		/*
 		 * If we weren't able to take care of the abort request
 		 * in the XPT, pass the request down to the SIM for processing.
 		 */
 	}
 	/* FALLTHROUGH */
 	case XPT_ACCEPT_TARGET_IO:
 	case XPT_EN_LUN:
 	case XPT_IMMED_NOTIFY:
 	case XPT_NOTIFY_ACK:
 	case XPT_RESET_BUS:
 	case XPT_IMMEDIATE_NOTIFY:
 	case XPT_NOTIFY_ACKNOWLEDGE:
 	case XPT_GET_SIM_KNOB:
 	case XPT_SET_SIM_KNOB:
 	case XPT_GET_TRAN_SETTINGS:
 	case XPT_SET_TRAN_SETTINGS:
 	case XPT_PATH_INQ:
 call_sim:
 		sim = path->bus->sim;
 		lock = (mtx_owned(sim->mtx) == 0);
 		if (lock)
 			CAM_SIM_LOCK(sim);
 		(*(sim->sim_action))(sim, start_ccb);
 		if (lock)
 			CAM_SIM_UNLOCK(sim);
 		break;
 	case XPT_PATH_STATS:
 		start_ccb->cpis.last_reset = path->bus->last_reset;
 		start_ccb->ccb_h.status = CAM_REQ_CMP;
 		break;
 	case XPT_GDEV_TYPE:
 	{
 		struct cam_ed *dev;
 
 		dev = path->device;
 		if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) {
 			start_ccb->ccb_h.status = CAM_DEV_NOT_THERE;
 		} else {
 			struct ccb_getdev *cgd;
 
 			cgd = &start_ccb->cgd;
 			cgd->protocol = dev->protocol;
 			cgd->inq_data = dev->inq_data;
 			cgd->ident_data = dev->ident_data;
 			cgd->inq_flags = dev->inq_flags;
 			cgd->ccb_h.status = CAM_REQ_CMP;
 			cgd->serial_num_len = dev->serial_num_len;
 			if ((dev->serial_num_len > 0)
 			 && (dev->serial_num != NULL))
 				bcopy(dev->serial_num, cgd->serial_num,
 				      dev->serial_num_len);
 		}
 		break;
 	}
 	case XPT_GDEV_STATS:
 	{
 		struct cam_ed *dev;
 
 		dev = path->device;
 		if ((dev->flags & CAM_DEV_UNCONFIGURED) != 0) {
 			start_ccb->ccb_h.status = CAM_DEV_NOT_THERE;
 		} else {
 			struct ccb_getdevstats *cgds;
 			struct cam_eb *bus;
 			struct cam_et *tar;
 			struct cam_devq *devq;
 
 			cgds = &start_ccb->cgds;
 			bus = path->bus;
 			tar = path->target;
 			devq = bus->sim->devq;
 			mtx_lock(&devq->send_mtx);
 			cgds->dev_openings = dev->ccbq.dev_openings;
 			cgds->dev_active = dev->ccbq.dev_active;
 			cgds->allocated = dev->ccbq.allocated;
 			cgds->queued = cam_ccbq_pending_ccb_count(&dev->ccbq);
 			cgds->held = cgds->allocated - cgds->dev_active -
 			    cgds->queued;
 			cgds->last_reset = tar->last_reset;
 			cgds->maxtags = dev->maxtags;
 			cgds->mintags = dev->mintags;
 			if (timevalcmp(&tar->last_reset, &bus->last_reset, <))
 				cgds->last_reset = bus->last_reset;
 			mtx_unlock(&devq->send_mtx);
 			cgds->ccb_h.status = CAM_REQ_CMP;
 		}
 		break;
 	}
 	case XPT_GDEVLIST:
 	{
 		struct cam_periph	*nperiph;
 		struct periph_list	*periph_head;
 		struct ccb_getdevlist	*cgdl;
 		u_int			i;
 		struct cam_ed		*device;
 		int			found;
 
 
 		found = 0;
 
 		/*
 		 * Don't want anyone mucking with our data.
 		 */
 		device = path->device;
 		periph_head = &device->periphs;
 		cgdl = &start_ccb->cgdl;
 
 		/*
 		 * Check and see if the list has changed since the user
 		 * last requested a list member.  If so, tell them that the
 		 * list has changed, and therefore they need to start over
 		 * from the beginning.
 		 */
 		if ((cgdl->index != 0) &&
 		    (cgdl->generation != device->generation)) {
 			cgdl->status = CAM_GDEVLIST_LIST_CHANGED;
 			break;
 		}
 
 		/*
 		 * Traverse the list of peripherals and attempt to find
 		 * the requested peripheral.
 		 */
 		for (nperiph = SLIST_FIRST(periph_head), i = 0;
 		     (nperiph != NULL) && (i <= cgdl->index);
 		     nperiph = SLIST_NEXT(nperiph, periph_links), i++) {
 			if (i == cgdl->index) {
 				strncpy(cgdl->periph_name,
 					nperiph->periph_name,
 					DEV_IDLEN);
 				cgdl->unit_number = nperiph->unit_number;
 				found = 1;
 			}
 		}
 		if (found == 0) {
 			cgdl->status = CAM_GDEVLIST_ERROR;
 			break;
 		}
 
 		if (nperiph == NULL)
 			cgdl->status = CAM_GDEVLIST_LAST_DEVICE;
 		else
 			cgdl->status = CAM_GDEVLIST_MORE_DEVS;
 
 		cgdl->index++;
 		cgdl->generation = device->generation;
 
 		cgdl->ccb_h.status = CAM_REQ_CMP;
 		break;
 	}
 	case XPT_DEV_MATCH:
 	{
 		dev_pos_type position_type;
 		struct ccb_dev_match *cdm;
 
 		cdm = &start_ccb->cdm;
 
 		/*
 		 * There are two ways of getting at information in the EDT.
 		 * The first way is via the primary EDT tree.  It starts
 		 * with a list of busses, then a list of targets on a bus,
 		 * then devices/luns on a target, and then peripherals on a
 		 * device/lun.  The "other" way is by the peripheral driver
 		 * lists.  The peripheral driver lists are organized by
 		 * peripheral driver.  (obviously)  So it makes sense to
 		 * use the peripheral driver list if the user is looking
 		 * for something like "da1", or all "da" devices.  If the
 		 * user is looking for something on a particular bus/target
 		 * or lun, it's generally better to go through the EDT tree.
 		 */
 
 		if (cdm->pos.position_type != CAM_DEV_POS_NONE)
 			position_type = cdm->pos.position_type;
 		else {
 			u_int i;
 
 			position_type = CAM_DEV_POS_NONE;
 
 			for (i = 0; i < cdm->num_patterns; i++) {
 				if ((cdm->patterns[i].type == DEV_MATCH_BUS)
 				 ||(cdm->patterns[i].type == DEV_MATCH_DEVICE)){
 					position_type = CAM_DEV_POS_EDT;
 					break;
 				}
 			}
 
 			if (cdm->num_patterns == 0)
 				position_type = CAM_DEV_POS_EDT;
 			else if (position_type == CAM_DEV_POS_NONE)
 				position_type = CAM_DEV_POS_PDRV;
 		}
 
 		switch(position_type & CAM_DEV_POS_TYPEMASK) {
 		case CAM_DEV_POS_EDT:
 			xptedtmatch(cdm);
 			break;
 		case CAM_DEV_POS_PDRV:
 			xptperiphlistmatch(cdm);
 			break;
 		default:
 			cdm->status = CAM_DEV_MATCH_ERROR;
 			break;
 		}
 
 		if (cdm->status == CAM_DEV_MATCH_ERROR)
 			start_ccb->ccb_h.status = CAM_REQ_CMP_ERR;
 		else
 			start_ccb->ccb_h.status = CAM_REQ_CMP;
 
 		break;
 	}
 	case XPT_SASYNC_CB:
 	{
 		struct ccb_setasync *csa;
 		struct async_node *cur_entry;
 		struct async_list *async_head;
 		u_int32_t added;
 
 		csa = &start_ccb->csa;
 		added = csa->event_enable;
 		async_head = &path->device->asyncs;
 
 		/*
 		 * If there is already an entry for us, simply
 		 * update it.
 		 */
 		cur_entry = SLIST_FIRST(async_head);
 		while (cur_entry != NULL) {
 			if ((cur_entry->callback_arg == csa->callback_arg)
 			 && (cur_entry->callback == csa->callback))
 				break;
 			cur_entry = SLIST_NEXT(cur_entry, links);
 		}
 
 		if (cur_entry != NULL) {
 		 	/*
 			 * If the request has no flags set,
 			 * remove the entry.
 			 */
 			added &= ~cur_entry->event_enable;
 			if (csa->event_enable == 0) {
 				SLIST_REMOVE(async_head, cur_entry,
 					     async_node, links);
 				xpt_release_device(path->device);
 				free(cur_entry, M_CAMXPT);
 			} else {
 				cur_entry->event_enable = csa->event_enable;
 			}
 			csa->event_enable = added;
 		} else {
 			cur_entry = malloc(sizeof(*cur_entry), M_CAMXPT,
 					   M_NOWAIT);
 			if (cur_entry == NULL) {
 				csa->ccb_h.status = CAM_RESRC_UNAVAIL;
 				break;
 			}
 			cur_entry->event_enable = csa->event_enable;
 			cur_entry->event_lock =
 			    mtx_owned(path->bus->sim->mtx) ? 1 : 0;
 			cur_entry->callback_arg = csa->callback_arg;
 			cur_entry->callback = csa->callback;
 			SLIST_INSERT_HEAD(async_head, cur_entry, links);
 			xpt_acquire_device(path->device);
 		}
 		start_ccb->ccb_h.status = CAM_REQ_CMP;
 		break;
 	}
 	case XPT_REL_SIMQ:
 	{
 		struct ccb_relsim *crs;
 		struct cam_ed *dev;
 
 		crs = &start_ccb->crs;
 		dev = path->device;
 		if (dev == NULL) {
 
 			crs->ccb_h.status = CAM_DEV_NOT_THERE;
 			break;
 		}
 
 		if ((crs->release_flags & RELSIM_ADJUST_OPENINGS) != 0) {
 
 			/* Don't ever go below one opening */
 			if (crs->openings > 0) {
 				xpt_dev_ccbq_resize(path, crs->openings);
 				if (bootverbose) {
 					xpt_print(path,
 					    "number of openings is now %d\n",
 					    crs->openings);
 				}
 			}
 		}
 
 		mtx_lock(&dev->sim->devq->send_mtx);
 		if ((crs->release_flags & RELSIM_RELEASE_AFTER_TIMEOUT) != 0) {
 
 			if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) {
 
 				/*
 				 * Just extend the old timeout and decrement
 				 * the freeze count so that a single timeout
 				 * is sufficient for releasing the queue.
 				 */
 				start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE;
 				callout_stop(&dev->callout);
 			} else {
 
 				start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE;
 			}
 
 			callout_reset_sbt(&dev->callout,
 			    SBT_1MS * crs->release_timeout, 0,
 			    xpt_release_devq_timeout, dev, 0);
 
 			dev->flags |= CAM_DEV_REL_TIMEOUT_PENDING;
 
 		}
 
 		if ((crs->release_flags & RELSIM_RELEASE_AFTER_CMDCMPLT) != 0) {
 
 			if ((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0) {
 				/*
 				 * Decrement the freeze count so that a single
 				 * completion is still sufficient to unfreeze
 				 * the queue.
 				 */
 				start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE;
 			} else {
 
 				dev->flags |= CAM_DEV_REL_ON_COMPLETE;
 				start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE;
 			}
 		}
 
 		if ((crs->release_flags & RELSIM_RELEASE_AFTER_QEMPTY) != 0) {
 
 			if ((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0
 			 || (dev->ccbq.dev_active == 0)) {
 
 				start_ccb->ccb_h.flags &= ~CAM_DEV_QFREEZE;
 			} else {
 
 				dev->flags |= CAM_DEV_REL_ON_QUEUE_EMPTY;
 				start_ccb->ccb_h.flags |= CAM_DEV_QFREEZE;
 			}
 		}
 		mtx_unlock(&dev->sim->devq->send_mtx);
 
 		if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) == 0)
 			xpt_release_devq(path, /*count*/1, /*run_queue*/TRUE);
 		start_ccb->crs.qfrozen_cnt = dev->ccbq.queue.qfrozen_cnt;
 		start_ccb->ccb_h.status = CAM_REQ_CMP;
 		break;
 	}
 	case XPT_DEBUG: {
 		struct cam_path *oldpath;
 
 		/* Check that all request bits are supported. */
 		if (start_ccb->cdbg.flags & ~(CAM_DEBUG_COMPILE)) {
 			start_ccb->ccb_h.status = CAM_FUNC_NOTAVAIL;
 			break;
 		}
 
 		cam_dflags = CAM_DEBUG_NONE;
 		if (cam_dpath != NULL) {
 			oldpath = cam_dpath;
 			cam_dpath = NULL;
 			xpt_free_path(oldpath);
 		}
 		if (start_ccb->cdbg.flags != CAM_DEBUG_NONE) {
 			if (xpt_create_path(&cam_dpath, NULL,
 					    start_ccb->ccb_h.path_id,
 					    start_ccb->ccb_h.target_id,
 					    start_ccb->ccb_h.target_lun) !=
 					    CAM_REQ_CMP) {
 				start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
 			} else {
 				cam_dflags = start_ccb->cdbg.flags;
 				start_ccb->ccb_h.status = CAM_REQ_CMP;
 				xpt_print(cam_dpath, "debugging flags now %x\n",
 				    cam_dflags);
 			}
 		} else
 			start_ccb->ccb_h.status = CAM_REQ_CMP;
 		break;
 	}
 	case XPT_NOOP:
 		if ((start_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0)
 			xpt_freeze_devq(path, 1);
 		start_ccb->ccb_h.status = CAM_REQ_CMP;
 		break;
 	default:
 	case XPT_SDEV_TYPE:
 	case XPT_TERM_IO:
 	case XPT_ENG_INQ:
 		/* XXX Implement */
 		printf("%s: CCB type %#x not supported\n", __func__,
 		       start_ccb->ccb_h.func_code);
 		start_ccb->ccb_h.status = CAM_PROVIDE_FAIL;
 		if (start_ccb->ccb_h.func_code & XPT_FC_DEV_QUEUED) {
 			xpt_done(start_ccb);
 		}
 		break;
 	}
 }
 
 void
 xpt_polled_action(union ccb *start_ccb)
 {
 	u_int32_t timeout;
 	struct	  cam_sim *sim;
 	struct	  cam_devq *devq;
 	struct	  cam_ed *dev;
 
 	timeout = start_ccb->ccb_h.timeout * 10;
 	sim = start_ccb->ccb_h.path->bus->sim;
 	devq = sim->devq;
 	dev = start_ccb->ccb_h.path->device;
 
 	mtx_unlock(&dev->device_mtx);
 
 	/*
 	 * Steal an opening so that no other queued requests
 	 * can get it before us while we simulate interrupts.
 	 */
 	mtx_lock(&devq->send_mtx);
 	dev->ccbq.dev_openings--;
 	while((devq->send_openings <= 0 || dev->ccbq.dev_openings < 0) &&
 	    (--timeout > 0)) {
 		mtx_unlock(&devq->send_mtx);
 		DELAY(100);
 		CAM_SIM_LOCK(sim);
 		(*(sim->sim_poll))(sim);
 		CAM_SIM_UNLOCK(sim);
 		camisr_runqueue();
 		mtx_lock(&devq->send_mtx);
 	}
 	dev->ccbq.dev_openings++;
 	mtx_unlock(&devq->send_mtx);
 
 	if (timeout != 0) {
 		xpt_action(start_ccb);
 		while(--timeout > 0) {
 			CAM_SIM_LOCK(sim);
 			(*(sim->sim_poll))(sim);
 			CAM_SIM_UNLOCK(sim);
 			camisr_runqueue();
 			if ((start_ccb->ccb_h.status  & CAM_STATUS_MASK)
 			    != CAM_REQ_INPROG)
 				break;
 			DELAY(100);
 		}
 		if (timeout == 0) {
 			/*
 			 * XXX Is it worth adding a sim_timeout entry
 			 * point so we can attempt recovery?  If
 			 * this is only used for dumps, I don't think
 			 * it is.
 			 */
 			start_ccb->ccb_h.status = CAM_CMD_TIMEOUT;
 		}
 	} else {
 		start_ccb->ccb_h.status = CAM_RESRC_UNAVAIL;
 	}
 
 	mtx_lock(&dev->device_mtx);
 }
 
 /*
  * Schedule a peripheral driver to receive a ccb when its
  * target device has space for more transactions.
  */
 void
 xpt_schedule(struct cam_periph *periph, u_int32_t new_priority)
 {
 
 	CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("xpt_schedule\n"));
 	cam_periph_assert(periph, MA_OWNED);
 	if (new_priority < periph->scheduled_priority) {
 		periph->scheduled_priority = new_priority;
 		xpt_run_allocq(periph, 0);
 	}
 }
 
 
 /*
  * Schedule a device to run on a given queue.
  * If the device was inserted as a new entry on the queue,
  * return 1 meaning the device queue should be run. If we
  * were already queued, implying someone else has already
  * started the queue, return 0 so the caller doesn't attempt
  * to run the queue.
  */
 static int
 xpt_schedule_dev(struct camq *queue, cam_pinfo *pinfo,
 		 u_int32_t new_priority)
 {
 	int retval;
 	u_int32_t old_priority;
 
 	CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_schedule_dev\n"));
 
 	old_priority = pinfo->priority;
 
 	/*
 	 * Are we already queued?
 	 */
 	if (pinfo->index != CAM_UNQUEUED_INDEX) {
 		/* Simply reorder based on new priority */
 		if (new_priority < old_priority) {
 			camq_change_priority(queue, pinfo->index,
 					     new_priority);
 			CAM_DEBUG_PRINT(CAM_DEBUG_XPT,
 					("changed priority to %d\n",
 					 new_priority));
 			retval = 1;
 		} else
 			retval = 0;
 	} else {
 		/* New entry on the queue */
 		if (new_priority < old_priority)
 			pinfo->priority = new_priority;
 
 		CAM_DEBUG_PRINT(CAM_DEBUG_XPT,
 				("Inserting onto queue\n"));
 		pinfo->generation = ++queue->generation;
 		camq_insert(queue, pinfo);
 		retval = 1;
 	}
 	return (retval);
 }
 
 static void
 xpt_run_allocq_task(void *context, int pending)
 {
 	struct cam_periph *periph = context;
 
 	cam_periph_lock(periph);
 	periph->flags &= ~CAM_PERIPH_RUN_TASK;
 	xpt_run_allocq(periph, 1);
 	cam_periph_unlock(periph);
 	cam_periph_release(periph);
 }
 
 static void
 xpt_run_allocq(struct cam_periph *periph, int sleep)
 {
 	struct cam_ed	*device;
 	union ccb	*ccb;
 	uint32_t	 prio;
 
 	cam_periph_assert(periph, MA_OWNED);
 	if (periph->periph_allocating)
 		return;
 	periph->periph_allocating = 1;
 	CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_allocq(%p)\n", periph));
 	device = periph->path->device;
 	ccb = NULL;
 restart:
 	while ((prio = min(periph->scheduled_priority,
 	    periph->immediate_priority)) != CAM_PRIORITY_NONE &&
 	    (periph->periph_allocated - (ccb != NULL ? 1 : 0) <
 	     device->ccbq.total_openings || prio <= CAM_PRIORITY_OOB)) {
 
 		if (ccb == NULL &&
 		    (ccb = xpt_get_ccb_nowait(periph)) == NULL) {
 			if (sleep) {
 				ccb = xpt_get_ccb(periph);
 				goto restart;
 			}
 			if (periph->flags & CAM_PERIPH_RUN_TASK)
 				break;
 			cam_periph_doacquire(periph);
 			periph->flags |= CAM_PERIPH_RUN_TASK;
 			taskqueue_enqueue(xsoftc.xpt_taskq,
 			    &periph->periph_run_task);
 			break;
 		}
 		xpt_setup_ccb(&ccb->ccb_h, periph->path, prio);
 		if (prio == periph->immediate_priority) {
 			periph->immediate_priority = CAM_PRIORITY_NONE;
 			CAM_DEBUG_PRINT(CAM_DEBUG_XPT,
 					("waking cam_periph_getccb()\n"));
 			SLIST_INSERT_HEAD(&periph->ccb_list, &ccb->ccb_h,
 					  periph_links.sle);
 			wakeup(&periph->ccb_list);
 		} else {
 			periph->scheduled_priority = CAM_PRIORITY_NONE;
 			CAM_DEBUG_PRINT(CAM_DEBUG_XPT,
 					("calling periph_start()\n"));
 			periph->periph_start(periph, ccb);
 		}
 		ccb = NULL;
 	}
 	if (ccb != NULL)
 		xpt_release_ccb(ccb);
 	periph->periph_allocating = 0;
 }
 
 static void
 xpt_run_devq(struct cam_devq *devq)
 {
 	char cdb_str[(SCSI_MAX_CDBLEN * 3) + 1];
 	int lock;
 
 	CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_run_devq\n"));
 
 	devq->send_queue.qfrozen_cnt++;
 	while ((devq->send_queue.entries > 0)
 	    && (devq->send_openings > 0)
 	    && (devq->send_queue.qfrozen_cnt <= 1)) {
 		struct	cam_ed *device;
 		union ccb *work_ccb;
 		struct	cam_sim *sim;
 
 		device = (struct cam_ed *)camq_remove(&devq->send_queue,
 							   CAMQ_HEAD);
 		CAM_DEBUG_PRINT(CAM_DEBUG_XPT,
 				("running device %p\n", device));
 
 		work_ccb = cam_ccbq_peek_ccb(&device->ccbq, CAMQ_HEAD);
 		if (work_ccb == NULL) {
 			printf("device on run queue with no ccbs???\n");
 			continue;
 		}
 
 		if ((work_ccb->ccb_h.flags & CAM_HIGH_POWER) != 0) {
 
 			mtx_lock(&xsoftc.xpt_highpower_lock);
 		 	if (xsoftc.num_highpower <= 0) {
 				/*
 				 * We got a high power command, but we
 				 * don't have any available slots.  Freeze
 				 * the device queue until we have a slot
 				 * available.
 				 */
 				xpt_freeze_devq_device(device, 1);
 				STAILQ_INSERT_TAIL(&xsoftc.highpowerq, device,
 						   highpowerq_entry);
 
 				mtx_unlock(&xsoftc.xpt_highpower_lock);
 				continue;
 			} else {
 				/*
 				 * Consume a high power slot while
 				 * this ccb runs.
 				 */
 				xsoftc.num_highpower--;
 			}
 			mtx_unlock(&xsoftc.xpt_highpower_lock);
 		}
 		cam_ccbq_remove_ccb(&device->ccbq, work_ccb);
 		cam_ccbq_send_ccb(&device->ccbq, work_ccb);
 		devq->send_openings--;
 		devq->send_active++;
 		xpt_schedule_devq(devq, device);
 		mtx_unlock(&devq->send_mtx);
 
 		if ((work_ccb->ccb_h.flags & CAM_DEV_QFREEZE) != 0) {
 			/*
 			 * The client wants to freeze the queue
 			 * after this CCB is sent.
 			 */
 			xpt_freeze_devq(work_ccb->ccb_h.path, 1);
 		}
 
 		/* In Target mode, the peripheral driver knows best... */
 		if (work_ccb->ccb_h.func_code == XPT_SCSI_IO) {
 			if ((device->inq_flags & SID_CmdQue) != 0
 			 && work_ccb->csio.tag_action != CAM_TAG_ACTION_NONE)
 				work_ccb->ccb_h.flags |= CAM_TAG_ACTION_VALID;
 			else
 				/*
 				 * Clear this in case of a retried CCB that
 				 * failed due to a rejected tag.
 				 */
 				work_ccb->ccb_h.flags &= ~CAM_TAG_ACTION_VALID;
 		}
 
 		switch (work_ccb->ccb_h.func_code) {
 		case XPT_SCSI_IO:
 			CAM_DEBUG(work_ccb->ccb_h.path,
 			    CAM_DEBUG_CDB,("%s. CDB: %s\n",
 			     scsi_op_desc(work_ccb->csio.cdb_io.cdb_bytes[0],
 					  &device->inq_data),
 			     scsi_cdb_string(work_ccb->csio.cdb_io.cdb_bytes,
 					     cdb_str, sizeof(cdb_str))));
 			break;
 		case XPT_ATA_IO:
 			CAM_DEBUG(work_ccb->ccb_h.path,
 			    CAM_DEBUG_CDB,("%s. ACB: %s\n",
 			     ata_op_string(&work_ccb->ataio.cmd),
 			     ata_cmd_string(&work_ccb->ataio.cmd,
 					    cdb_str, sizeof(cdb_str))));
 			break;
 		default:
 			break;
 		}
 
 		/*
 		 * Device queues can be shared among multiple SIM instances
 		 * that reside on different busses.  Use the SIM from the
 		 * queued device, rather than the one from the calling bus.
 		 */
 		sim = device->sim;
 		lock = (mtx_owned(sim->mtx) == 0);
 		if (lock)
 			CAM_SIM_LOCK(sim);
 		(*(sim->sim_action))(sim, work_ccb);
 		if (lock)
 			CAM_SIM_UNLOCK(sim);
 		mtx_lock(&devq->send_mtx);
 	}
 	devq->send_queue.qfrozen_cnt--;
 }
 
 /*
  * This function merges stuff from the slave ccb into the master ccb, while
  * keeping important fields in the master ccb constant.
  */
 void
 xpt_merge_ccb(union ccb *master_ccb, union ccb *slave_ccb)
 {
 
 	/*
 	 * Pull fields that are valid for peripheral drivers to set
 	 * into the master CCB along with the CCB "payload".
 	 */
 	master_ccb->ccb_h.retry_count = slave_ccb->ccb_h.retry_count;
 	master_ccb->ccb_h.func_code = slave_ccb->ccb_h.func_code;
 	master_ccb->ccb_h.timeout = slave_ccb->ccb_h.timeout;
 	master_ccb->ccb_h.flags = slave_ccb->ccb_h.flags;
 	bcopy(&(&slave_ccb->ccb_h)[1], &(&master_ccb->ccb_h)[1],
 	      sizeof(union ccb) - sizeof(struct ccb_hdr));
 }
 
 void
 xpt_setup_ccb(struct ccb_hdr *ccb_h, struct cam_path *path, u_int32_t priority)
 {
 
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_setup_ccb\n"));
 	ccb_h->pinfo.priority = priority;
 	ccb_h->path = path;
 	ccb_h->path_id = path->bus->path_id;
 	if (path->target)
 		ccb_h->target_id = path->target->target_id;
 	else
 		ccb_h->target_id = CAM_TARGET_WILDCARD;
 	if (path->device) {
 		ccb_h->target_lun = path->device->lun_id;
 		ccb_h->pinfo.generation = ++path->device->ccbq.queue.generation;
 	} else {
 		ccb_h->target_lun = CAM_TARGET_WILDCARD;
 	}
 	ccb_h->pinfo.index = CAM_UNQUEUED_INDEX;
 	ccb_h->flags = 0;
 	ccb_h->xflags = 0;
 }
 
 /* Path manipulation functions */
 cam_status
 xpt_create_path(struct cam_path **new_path_ptr, struct cam_periph *perph,
 		path_id_t path_id, target_id_t target_id, lun_id_t lun_id)
 {
 	struct	   cam_path *path;
 	cam_status status;
 
 	path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT);
 
 	if (path == NULL) {
 		status = CAM_RESRC_UNAVAIL;
 		return(status);
 	}
 	status = xpt_compile_path(path, perph, path_id, target_id, lun_id);
 	if (status != CAM_REQ_CMP) {
 		free(path, M_CAMPATH);
 		path = NULL;
 	}
 	*new_path_ptr = path;
 	return (status);
 }
 
 cam_status
 xpt_create_path_unlocked(struct cam_path **new_path_ptr,
 			 struct cam_periph *periph, path_id_t path_id,
 			 target_id_t target_id, lun_id_t lun_id)
 {
 
 	return (xpt_create_path(new_path_ptr, periph, path_id, target_id,
 	    lun_id));
 }
 
 cam_status
 xpt_compile_path(struct cam_path *new_path, struct cam_periph *perph,
 		 path_id_t path_id, target_id_t target_id, lun_id_t lun_id)
 {
 	struct	     cam_eb *bus;
 	struct	     cam_et *target;
 	struct	     cam_ed *device;
 	cam_status   status;
 
 	status = CAM_REQ_CMP;	/* Completed without error */
 	target = NULL;		/* Wildcarded */
 	device = NULL;		/* Wildcarded */
 
 	/*
 	 * We will potentially modify the EDT, so block interrupts
 	 * that may attempt to create cam paths.
 	 */
 	bus = xpt_find_bus(path_id);
 	if (bus == NULL) {
 		status = CAM_PATH_INVALID;
 	} else {
 		xpt_lock_buses();
 		mtx_lock(&bus->eb_mtx);
 		target = xpt_find_target(bus, target_id);
 		if (target == NULL) {
 			/* Create one */
 			struct cam_et *new_target;
 
 			new_target = xpt_alloc_target(bus, target_id);
 			if (new_target == NULL) {
 				status = CAM_RESRC_UNAVAIL;
 			} else {
 				target = new_target;
 			}
 		}
 		xpt_unlock_buses();
 		if (target != NULL) {
 			device = xpt_find_device(target, lun_id);
 			if (device == NULL) {
 				/* Create one */
 				struct cam_ed *new_device;
 
 				new_device =
 				    (*(bus->xport->alloc_device))(bus,
 								      target,
 								      lun_id);
 				if (new_device == NULL) {
 					status = CAM_RESRC_UNAVAIL;
 				} else {
 					device = new_device;
 				}
 			}
 		}
 		mtx_unlock(&bus->eb_mtx);
 	}
 
 	/*
 	 * Only touch the user's data if we are successful.
 	 */
 	if (status == CAM_REQ_CMP) {
 		new_path->periph = perph;
 		new_path->bus = bus;
 		new_path->target = target;
 		new_path->device = device;
 		CAM_DEBUG(new_path, CAM_DEBUG_TRACE, ("xpt_compile_path\n"));
 	} else {
 		if (device != NULL)
 			xpt_release_device(device);
 		if (target != NULL)
 			xpt_release_target(target);
 		if (bus != NULL)
 			xpt_release_bus(bus);
 	}
 	return (status);
 }
 
 cam_status
 xpt_clone_path(struct cam_path **new_path_ptr, struct cam_path *path)
 {
 	struct	   cam_path *new_path;
 
 	new_path = (struct cam_path *)malloc(sizeof(*path), M_CAMPATH, M_NOWAIT);
 	if (new_path == NULL)
 		return(CAM_RESRC_UNAVAIL);
 	xpt_copy_path(new_path, path);
 	*new_path_ptr = new_path;
 	return (CAM_REQ_CMP);
 }
 
 void
 xpt_copy_path(struct cam_path *new_path, struct cam_path *path)
 {
 
 	*new_path = *path;
 	if (path->bus != NULL)
 		xpt_acquire_bus(path->bus);
 	if (path->target != NULL)
 		xpt_acquire_target(path->target);
 	if (path->device != NULL)
 		xpt_acquire_device(path->device);
 }
 
 void
 xpt_release_path(struct cam_path *path)
 {
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_path\n"));
 	if (path->device != NULL) {
 		xpt_release_device(path->device);
 		path->device = NULL;
 	}
 	if (path->target != NULL) {
 		xpt_release_target(path->target);
 		path->target = NULL;
 	}
 	if (path->bus != NULL) {
 		xpt_release_bus(path->bus);
 		path->bus = NULL;
 	}
 }
 
 void
 xpt_free_path(struct cam_path *path)
 {
 
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_free_path\n"));
 	xpt_release_path(path);
 	free(path, M_CAMPATH);
 }
 
 void
 xpt_path_counts(struct cam_path *path, uint32_t *bus_ref,
     uint32_t *periph_ref, uint32_t *target_ref, uint32_t *device_ref)
 {
 
 	xpt_lock_buses();
 	if (bus_ref) {
 		if (path->bus)
 			*bus_ref = path->bus->refcount;
 		else
 			*bus_ref = 0;
 	}
 	if (periph_ref) {
 		if (path->periph)
 			*periph_ref = path->periph->refcount;
 		else
 			*periph_ref = 0;
 	}
 	xpt_unlock_buses();
 	if (target_ref) {
 		if (path->target)
 			*target_ref = path->target->refcount;
 		else
 			*target_ref = 0;
 	}
 	if (device_ref) {
 		if (path->device)
 			*device_ref = path->device->refcount;
 		else
 			*device_ref = 0;
 	}
 }
 
 /*
  * Return -1 for failure, 0 for exact match, 1 for match with wildcards
  * in path1, 2 for match with wildcards in path2.
  */
 int
 xpt_path_comp(struct cam_path *path1, struct cam_path *path2)
 {
 	int retval = 0;
 
 	if (path1->bus != path2->bus) {
 		if (path1->bus->path_id == CAM_BUS_WILDCARD)
 			retval = 1;
 		else if (path2->bus->path_id == CAM_BUS_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	if (path1->target != path2->target) {
 		if (path1->target->target_id == CAM_TARGET_WILDCARD) {
 			if (retval == 0)
 				retval = 1;
 		} else if (path2->target->target_id == CAM_TARGET_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	if (path1->device != path2->device) {
 		if (path1->device->lun_id == CAM_LUN_WILDCARD) {
 			if (retval == 0)
 				retval = 1;
 		} else if (path2->device->lun_id == CAM_LUN_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	return (retval);
 }
 
 int
 xpt_path_comp_dev(struct cam_path *path, struct cam_ed *dev)
 {
 	int retval = 0;
 
 	if (path->bus != dev->target->bus) {
 		if (path->bus->path_id == CAM_BUS_WILDCARD)
 			retval = 1;
 		else if (dev->target->bus->path_id == CAM_BUS_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	if (path->target != dev->target) {
 		if (path->target->target_id == CAM_TARGET_WILDCARD) {
 			if (retval == 0)
 				retval = 1;
 		} else if (dev->target->target_id == CAM_TARGET_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	if (path->device != dev) {
 		if (path->device->lun_id == CAM_LUN_WILDCARD) {
 			if (retval == 0)
 				retval = 1;
 		} else if (dev->lun_id == CAM_LUN_WILDCARD)
 			retval = 2;
 		else
 			return (-1);
 	}
 	return (retval);
 }
 
 void
 xpt_print_path(struct cam_path *path)
 {
 
 	if (path == NULL)
 		printf("(nopath): ");
 	else {
 		if (path->periph != NULL)
 			printf("(%s%d:", path->periph->periph_name,
 			       path->periph->unit_number);
 		else
 			printf("(noperiph:");
 
 		if (path->bus != NULL)
 			printf("%s%d:%d:", path->bus->sim->sim_name,
 			       path->bus->sim->unit_number,
 			       path->bus->sim->bus_id);
 		else
 			printf("nobus:");
 
 		if (path->target != NULL)
 			printf("%d:", path->target->target_id);
 		else
 			printf("X:");
 
 		if (path->device != NULL)
 			printf("%jx): ", (uintmax_t)path->device->lun_id);
 		else
 			printf("X): ");
 	}
 }
 
 void
 xpt_print_device(struct cam_ed *device)
 {
 
 	if (device == NULL)
 		printf("(nopath): ");
 	else {
 		printf("(noperiph:%s%d:%d:%d:%jx): ", device->sim->sim_name,
 		       device->sim->unit_number,
 		       device->sim->bus_id,
 		       device->target->target_id,
 		       (uintmax_t)device->lun_id);
 	}
 }
 
 void
 xpt_print(struct cam_path *path, const char *fmt, ...)
 {
 	va_list ap;
 	xpt_print_path(path);
 	va_start(ap, fmt);
 	vprintf(fmt, ap);
 	va_end(ap);
 }
 
 int
 xpt_path_string(struct cam_path *path, char *str, size_t str_len)
 {
 	struct sbuf sb;
 
 	sbuf_new(&sb, str, str_len, 0);
 
 	if (path == NULL)
 		sbuf_printf(&sb, "(nopath): ");
 	else {
 		if (path->periph != NULL)
 			sbuf_printf(&sb, "(%s%d:", path->periph->periph_name,
 				    path->periph->unit_number);
 		else
 			sbuf_printf(&sb, "(noperiph:");
 
 		if (path->bus != NULL)
 			sbuf_printf(&sb, "%s%d:%d:", path->bus->sim->sim_name,
 				    path->bus->sim->unit_number,
 				    path->bus->sim->bus_id);
 		else
 			sbuf_printf(&sb, "nobus:");
 
 		if (path->target != NULL)
 			sbuf_printf(&sb, "%d:", path->target->target_id);
 		else
 			sbuf_printf(&sb, "X:");
 
 		if (path->device != NULL)
 			sbuf_printf(&sb, "%jx): ",
 			    (uintmax_t)path->device->lun_id);
 		else
 			sbuf_printf(&sb, "X): ");
 	}
 	sbuf_finish(&sb);
 
 	return(sbuf_len(&sb));
 }
 
 path_id_t
 xpt_path_path_id(struct cam_path *path)
 {
 	return(path->bus->path_id);
 }
 
 target_id_t
 xpt_path_target_id(struct cam_path *path)
 {
 	if (path->target != NULL)
 		return (path->target->target_id);
 	else
 		return (CAM_TARGET_WILDCARD);
 }
 
 lun_id_t
 xpt_path_lun_id(struct cam_path *path)
 {
 	if (path->device != NULL)
 		return (path->device->lun_id);
 	else
 		return (CAM_LUN_WILDCARD);
 }
 
 struct cam_sim *
 xpt_path_sim(struct cam_path *path)
 {
 
 	return (path->bus->sim);
 }
 
 struct cam_periph*
 xpt_path_periph(struct cam_path *path)
 {
 
 	return (path->periph);
 }
 
 int
 xpt_path_legacy_ata_id(struct cam_path *path)
 {
 	struct cam_eb *bus;
 	int bus_id;
 
 	if ((strcmp(path->bus->sim->sim_name, "ata") != 0) &&
 	    strcmp(path->bus->sim->sim_name, "ahcich") != 0 &&
 	    strcmp(path->bus->sim->sim_name, "mvsch") != 0 &&
 	    strcmp(path->bus->sim->sim_name, "siisch") != 0)
 		return (-1);
 
 	if (strcmp(path->bus->sim->sim_name, "ata") == 0 &&
 	    path->bus->sim->unit_number < 2) {
 		bus_id = path->bus->sim->unit_number;
 	} else {
 		bus_id = 2;
 		xpt_lock_buses();
 		TAILQ_FOREACH(bus, &xsoftc.xpt_busses, links) {
 			if (bus == path->bus)
 				break;
 			if ((strcmp(bus->sim->sim_name, "ata") == 0 &&
 			     bus->sim->unit_number >= 2) ||
 			    strcmp(bus->sim->sim_name, "ahcich") == 0 ||
 			    strcmp(bus->sim->sim_name, "mvsch") == 0 ||
 			    strcmp(bus->sim->sim_name, "siisch") == 0)
 				bus_id++;
 		}
 		xpt_unlock_buses();
 	}
 	if (path->target != NULL) {
 		if (path->target->target_id < 2)
 			return (bus_id * 2 + path->target->target_id);
 		else
 			return (-1);
 	} else
 		return (bus_id * 2);
 }
 
 /*
  * Release a CAM control block for the caller.  Remit the cost of the structure
  * to the device referenced by the path.  If the this device had no 'credits'
  * and peripheral drivers have registered async callbacks for this notification
  * call them now.
  */
 void
 xpt_release_ccb(union ccb *free_ccb)
 {
 	struct	 cam_ed *device;
 	struct	 cam_periph *periph;
 
 	CAM_DEBUG_PRINT(CAM_DEBUG_XPT, ("xpt_release_ccb\n"));
 	xpt_path_assert(free_ccb->ccb_h.path, MA_OWNED);
 	device = free_ccb->ccb_h.path->device;
 	periph = free_ccb->ccb_h.path->periph;
 
 	xpt_free_ccb(free_ccb);
 	periph->periph_allocated--;
 	cam_ccbq_release_opening(&device->ccbq);
 	xpt_run_allocq(periph, 0);
 }
 
 /* Functions accessed by SIM drivers */
 
 static struct xpt_xport xport_default = {
 	.alloc_device = xpt_alloc_device_default,
 	.action = xpt_action_default,
 	.async = xpt_dev_async_default,
 };
 
 /*
  * A sim structure, listing the SIM entry points and instance
  * identification info is passed to xpt_bus_register to hook the SIM
  * into the CAM framework.  xpt_bus_register creates a cam_eb entry
  * for this new bus and places it in the array of busses and assigns
  * it a path_id.  The path_id may be influenced by "hard wiring"
  * information specified by the user.  Once interrupt services are
  * available, the bus will be probed.
  */
 int32_t
 xpt_bus_register(struct cam_sim *sim, device_t parent, u_int32_t bus)
 {
 	struct cam_eb *new_bus;
 	struct cam_eb *old_bus;
 	struct ccb_pathinq cpi;
 	struct cam_path *path;
 	cam_status status;
 
 	mtx_assert(sim->mtx, MA_OWNED);
 
 	sim->bus_id = bus;
 	new_bus = (struct cam_eb *)malloc(sizeof(*new_bus),
 					  M_CAMXPT, M_NOWAIT|M_ZERO);
 	if (new_bus == NULL) {
 		/* Couldn't satisfy request */
 		return (CAM_RESRC_UNAVAIL);
 	}
 
 	mtx_init(&new_bus->eb_mtx, "CAM bus lock", NULL, MTX_DEF);
 	TAILQ_INIT(&new_bus->et_entries);
 	cam_sim_hold(sim);
 	new_bus->sim = sim;
 	timevalclear(&new_bus->last_reset);
 	new_bus->flags = 0;
 	new_bus->refcount = 1;	/* Held until a bus_deregister event */
 	new_bus->generation = 0;
 
 	xpt_lock_buses();
 	sim->path_id = new_bus->path_id =
 	    xptpathid(sim->sim_name, sim->unit_number, sim->bus_id);
 	old_bus = TAILQ_FIRST(&xsoftc.xpt_busses);
 	while (old_bus != NULL
 	    && old_bus->path_id < new_bus->path_id)
 		old_bus = TAILQ_NEXT(old_bus, links);
 	if (old_bus != NULL)
 		TAILQ_INSERT_BEFORE(old_bus, new_bus, links);
 	else
 		TAILQ_INSERT_TAIL(&xsoftc.xpt_busses, new_bus, links);
 	xsoftc.bus_generation++;
 	xpt_unlock_buses();
 
 	/*
 	 * Set a default transport so that a PATH_INQ can be issued to
 	 * the SIM.  This will then allow for probing and attaching of
 	 * a more appropriate transport.
 	 */
 	new_bus->xport = &xport_default;
 
 	status = xpt_create_path(&path, /*periph*/NULL, sim->path_id,
 				  CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD);
 	if (status != CAM_REQ_CMP) {
 		xpt_release_bus(new_bus);
 		free(path, M_CAMXPT);
 		return (CAM_RESRC_UNAVAIL);
 	}
 
 	xpt_setup_ccb(&cpi.ccb_h, path, CAM_PRIORITY_NORMAL);
 	cpi.ccb_h.func_code = XPT_PATH_INQ;
 	xpt_action((union ccb *)&cpi);
 
 	if (cpi.ccb_h.status == CAM_REQ_CMP) {
 		switch (cpi.transport) {
 		case XPORT_SPI:
 		case XPORT_SAS:
 		case XPORT_FC:
 		case XPORT_USB:
 		case XPORT_ISCSI:
 		case XPORT_SRP:
 		case XPORT_PPB:
 			new_bus->xport = scsi_get_xport();
 			break;
 		case XPORT_ATA:
 		case XPORT_SATA:
 			new_bus->xport = ata_get_xport();
 			break;
 		default:
 			new_bus->xport = &xport_default;
 			break;
 		}
 	}
 
 	/* Notify interested parties */
 	if (sim->path_id != CAM_XPT_PATH_ID) {
 
 		xpt_async(AC_PATH_REGISTERED, path, &cpi);
 		if ((cpi.hba_misc & PIM_NOSCAN) == 0) {
 			union	ccb *scan_ccb;
 
 			/* Initiate bus rescan. */
 			scan_ccb = xpt_alloc_ccb_nowait();
 			if (scan_ccb != NULL) {
 				scan_ccb->ccb_h.path = path;
 				scan_ccb->ccb_h.func_code = XPT_SCAN_BUS;
 				scan_ccb->crcn.flags = 0;
 				xpt_rescan(scan_ccb);
 			} else {
 				xpt_print(path,
 					  "Can't allocate CCB to scan bus\n");
 				xpt_free_path(path);
 			}
 		} else
 			xpt_free_path(path);
 	} else
 		xpt_free_path(path);
 	return (CAM_SUCCESS);
 }
 
 int32_t
 xpt_bus_deregister(path_id_t pathid)
 {
 	struct cam_path bus_path;
 	cam_status status;
 
 	status = xpt_compile_path(&bus_path, NULL, pathid,
 				  CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD);
 	if (status != CAM_REQ_CMP)
 		return (status);
 
 	xpt_async(AC_LOST_DEVICE, &bus_path, NULL);
 	xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL);
 
 	/* Release the reference count held while registered. */
 	xpt_release_bus(bus_path.bus);
 	xpt_release_path(&bus_path);
 
 	return (CAM_REQ_CMP);
 }
 
 static path_id_t
 xptnextfreepathid(void)
 {
 	struct cam_eb *bus;
 	path_id_t pathid;
 	const char *strval;
 
 	mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED);
 	pathid = 0;
 	bus = TAILQ_FIRST(&xsoftc.xpt_busses);
 retry:
 	/* Find an unoccupied pathid */
 	while (bus != NULL && bus->path_id <= pathid) {
 		if (bus->path_id == pathid)
 			pathid++;
 		bus = TAILQ_NEXT(bus, links);
 	}
 
 	/*
 	 * Ensure that this pathid is not reserved for
 	 * a bus that may be registered in the future.
 	 */
 	if (resource_string_value("scbus", pathid, "at", &strval) == 0) {
 		++pathid;
 		/* Start the search over */
 		goto retry;
 	}
 	return (pathid);
 }
 
 static path_id_t
 xptpathid(const char *sim_name, int sim_unit, int sim_bus)
 {
 	path_id_t pathid;
 	int i, dunit, val;
 	char buf[32];
 	const char *dname;
 
 	pathid = CAM_XPT_PATH_ID;
 	snprintf(buf, sizeof(buf), "%s%d", sim_name, sim_unit);
 	if (strcmp(buf, "xpt0") == 0 && sim_bus == 0)
 		return (pathid);
 	i = 0;
 	while ((resource_find_match(&i, &dname, &dunit, "at", buf)) == 0) {
 		if (strcmp(dname, "scbus")) {
 			/* Avoid a bit of foot shooting. */
 			continue;
 		}
 		if (dunit < 0)		/* unwired?! */
 			continue;
 		if (resource_int_value("scbus", dunit, "bus", &val) == 0) {
 			if (sim_bus == val) {
 				pathid = dunit;
 				break;
 			}
 		} else if (sim_bus == 0) {
 			/* Unspecified matches bus 0 */
 			pathid = dunit;
 			break;
 		} else {
 			printf("Ambiguous scbus configuration for %s%d "
 			       "bus %d, cannot wire down.  The kernel "
 			       "config entry for scbus%d should "
 			       "specify a controller bus.\n"
 			       "Scbus will be assigned dynamically.\n",
 			       sim_name, sim_unit, sim_bus, dunit);
 			break;
 		}
 	}
 
 	if (pathid == CAM_XPT_PATH_ID)
 		pathid = xptnextfreepathid();
 	return (pathid);
 }
 
 static const char *
 xpt_async_string(u_int32_t async_code)
 {
 
 	switch (async_code) {
 	case AC_BUS_RESET: return ("AC_BUS_RESET");
 	case AC_UNSOL_RESEL: return ("AC_UNSOL_RESEL");
 	case AC_SCSI_AEN: return ("AC_SCSI_AEN");
 	case AC_SENT_BDR: return ("AC_SENT_BDR");
 	case AC_PATH_REGISTERED: return ("AC_PATH_REGISTERED");
 	case AC_PATH_DEREGISTERED: return ("AC_PATH_DEREGISTERED");
 	case AC_FOUND_DEVICE: return ("AC_FOUND_DEVICE");
 	case AC_LOST_DEVICE: return ("AC_LOST_DEVICE");
 	case AC_TRANSFER_NEG: return ("AC_TRANSFER_NEG");
 	case AC_INQ_CHANGED: return ("AC_INQ_CHANGED");
 	case AC_GETDEV_CHANGED: return ("AC_GETDEV_CHANGED");
 	case AC_CONTRACT: return ("AC_CONTRACT");
 	case AC_ADVINFO_CHANGED: return ("AC_ADVINFO_CHANGED");
 	case AC_UNIT_ATTENTION: return ("AC_UNIT_ATTENTION");
 	}
 	return ("AC_UNKNOWN");
 }
 
 static int
 xpt_async_size(u_int32_t async_code)
 {
 
 	switch (async_code) {
 	case AC_BUS_RESET: return (0);
 	case AC_UNSOL_RESEL: return (0);
 	case AC_SCSI_AEN: return (0);
 	case AC_SENT_BDR: return (0);
 	case AC_PATH_REGISTERED: return (sizeof(struct ccb_pathinq));
 	case AC_PATH_DEREGISTERED: return (0);
 	case AC_FOUND_DEVICE: return (sizeof(struct ccb_getdev));
 	case AC_LOST_DEVICE: return (0);
 	case AC_TRANSFER_NEG: return (sizeof(struct ccb_trans_settings));
 	case AC_INQ_CHANGED: return (0);
 	case AC_GETDEV_CHANGED: return (0);
 	case AC_CONTRACT: return (sizeof(struct ac_contract));
 	case AC_ADVINFO_CHANGED: return (-1);
 	case AC_UNIT_ATTENTION: return (sizeof(struct ccb_scsiio));
 	}
 	return (0);
 }
 
 static int
 xpt_async_process_dev(struct cam_ed *device, void *arg)
 {
 	union ccb *ccb = arg;
 	struct cam_path *path = ccb->ccb_h.path;
 	void *async_arg = ccb->casync.async_arg_ptr;
 	u_int32_t async_code = ccb->casync.async_code;
 	int relock;
 
 	if (path->device != device
 	 && path->device->lun_id != CAM_LUN_WILDCARD
 	 && device->lun_id != CAM_LUN_WILDCARD)
 		return (1);
 
 	/*
 	 * The async callback could free the device.
 	 * If it is a broadcast async, it doesn't hold
 	 * device reference, so take our own reference.
 	 */
 	xpt_acquire_device(device);
 
 	/*
 	 * If async for specific device is to be delivered to
 	 * the wildcard client, take the specific device lock.
 	 * XXX: We may need a way for client to specify it.
 	 */
 	if ((device->lun_id == CAM_LUN_WILDCARD &&
 	     path->device->lun_id != CAM_LUN_WILDCARD) ||
 	    (device->target->target_id == CAM_TARGET_WILDCARD &&
 	     path->target->target_id != CAM_TARGET_WILDCARD) ||
 	    (device->target->bus->path_id == CAM_BUS_WILDCARD &&
 	     path->target->bus->path_id != CAM_BUS_WILDCARD)) {
 		mtx_unlock(&device->device_mtx);
 		xpt_path_lock(path);
 		relock = 1;
 	} else
 		relock = 0;
 
 	(*(device->target->bus->xport->async))(async_code,
 	    device->target->bus, device->target, device, async_arg);
 	xpt_async_bcast(&device->asyncs, async_code, path, async_arg);
 
 	if (relock) {
 		xpt_path_unlock(path);
 		mtx_lock(&device->device_mtx);
 	}
 	xpt_release_device(device);
 	return (1);
 }
 
 static int
 xpt_async_process_tgt(struct cam_et *target, void *arg)
 {
 	union ccb *ccb = arg;
 	struct cam_path *path = ccb->ccb_h.path;
 
 	if (path->target != target
 	 && path->target->target_id != CAM_TARGET_WILDCARD
 	 && target->target_id != CAM_TARGET_WILDCARD)
 		return (1);
 
 	if (ccb->casync.async_code == AC_SENT_BDR) {
 		/* Update our notion of when the last reset occurred */
 		microtime(&target->last_reset);
 	}
 
 	return (xptdevicetraverse(target, NULL, xpt_async_process_dev, ccb));
 }
 
 static void
 xpt_async_process(struct cam_periph *periph, union ccb *ccb)
 {
 	struct cam_eb *bus;
 	struct cam_path *path;
 	void *async_arg;
 	u_int32_t async_code;
 
 	path = ccb->ccb_h.path;
 	async_code = ccb->casync.async_code;
 	async_arg = ccb->casync.async_arg_ptr;
 	CAM_DEBUG(path, CAM_DEBUG_TRACE | CAM_DEBUG_INFO,
 	    ("xpt_async(%s)\n", xpt_async_string(async_code)));
 	bus = path->bus;
 
 	if (async_code == AC_BUS_RESET) {
 		/* Update our notion of when the last reset occurred */
 		microtime(&bus->last_reset);
 	}
 
 	xpttargettraverse(bus, NULL, xpt_async_process_tgt, ccb);
 
 	/*
 	 * If this wasn't a fully wildcarded async, tell all
 	 * clients that want all async events.
 	 */
 	if (bus != xpt_periph->path->bus) {
 		xpt_path_lock(xpt_periph->path);
 		xpt_async_process_dev(xpt_periph->path->device, ccb);
 		xpt_path_unlock(xpt_periph->path);
 	}
 
 	if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD)
 		xpt_release_devq(path, 1, TRUE);
 	else
 		xpt_release_simq(path->bus->sim, TRUE);
 	if (ccb->casync.async_arg_size > 0)
 		free(async_arg, M_CAMXPT);
 	xpt_free_path(path);
 	xpt_free_ccb(ccb);
 }
 
 static void
 xpt_async_bcast(struct async_list *async_head,
 		u_int32_t async_code,
 		struct cam_path *path, void *async_arg)
 {
 	struct async_node *cur_entry;
 	int lock;
 
 	cur_entry = SLIST_FIRST(async_head);
 	while (cur_entry != NULL) {
 		struct async_node *next_entry;
 		/*
 		 * Grab the next list entry before we call the current
 		 * entry's callback.  This is because the callback function
 		 * can delete its async callback entry.
 		 */
 		next_entry = SLIST_NEXT(cur_entry, links);
 		if ((cur_entry->event_enable & async_code) != 0) {
 			lock = cur_entry->event_lock;
 			if (lock)
 				CAM_SIM_LOCK(path->device->sim);
 			cur_entry->callback(cur_entry->callback_arg,
 					    async_code, path,
 					    async_arg);
 			if (lock)
 				CAM_SIM_UNLOCK(path->device->sim);
 		}
 		cur_entry = next_entry;
 	}
 }
 
 void
 xpt_async(u_int32_t async_code, struct cam_path *path, void *async_arg)
 {
 	union ccb *ccb;
 	int size;
 
 	ccb = xpt_alloc_ccb_nowait();
 	if (ccb == NULL) {
 		xpt_print(path, "Can't allocate CCB to send %s\n",
 		    xpt_async_string(async_code));
 		return;
 	}
 
 	if (xpt_clone_path(&ccb->ccb_h.path, path) != CAM_REQ_CMP) {
 		xpt_print(path, "Can't allocate path to send %s\n",
 		    xpt_async_string(async_code));
 		xpt_free_ccb(ccb);
 		return;
 	}
 	ccb->ccb_h.path->periph = NULL;
 	ccb->ccb_h.func_code = XPT_ASYNC;
 	ccb->ccb_h.cbfcnp = xpt_async_process;
 	ccb->ccb_h.flags |= CAM_UNLOCKED;
 	ccb->casync.async_code = async_code;
 	ccb->casync.async_arg_size = 0;
 	size = xpt_async_size(async_code);
 	if (size > 0 && async_arg != NULL) {
 		ccb->casync.async_arg_ptr = malloc(size, M_CAMXPT, M_NOWAIT);
 		if (ccb->casync.async_arg_ptr == NULL) {
 			xpt_print(path, "Can't allocate argument to send %s\n",
 			    xpt_async_string(async_code));
 			xpt_free_path(ccb->ccb_h.path);
 			xpt_free_ccb(ccb);
 			return;
 		}
 		memcpy(ccb->casync.async_arg_ptr, async_arg, size);
 		ccb->casync.async_arg_size = size;
-	} else if (size < 0)
+	} else if (size < 0) {
+		ccb->casync.async_arg_ptr = async_arg;
 		ccb->casync.async_arg_size = size;
+	}
 	if (path->device != NULL && path->device->lun_id != CAM_LUN_WILDCARD)
 		xpt_freeze_devq(path, 1);
 	else
 		xpt_freeze_simq(path->bus->sim, 1);
 	xpt_done(ccb);
 }
 
 static void
 xpt_dev_async_default(u_int32_t async_code, struct cam_eb *bus,
 		      struct cam_et *target, struct cam_ed *device,
 		      void *async_arg)
 {
 
 	/*
 	 * We only need to handle events for real devices.
 	 */
 	if (target->target_id == CAM_TARGET_WILDCARD
 	 || device->lun_id == CAM_LUN_WILDCARD)
 		return;
 
 	printf("%s called\n", __func__);
 }
 
 static uint32_t
 xpt_freeze_devq_device(struct cam_ed *dev, u_int count)
 {
 	struct cam_devq	*devq;
 	uint32_t freeze;
 
 	devq = dev->sim->devq;
 	mtx_assert(&devq->send_mtx, MA_OWNED);
 	CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE,
 	    ("xpt_freeze_devq_device(%d) %u->%u\n", count,
 	    dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt + count));
 	freeze = (dev->ccbq.queue.qfrozen_cnt += count);
 	/* Remove frozen device from sendq. */
 	if (device_is_queued(dev))
 		camq_remove(&devq->send_queue, dev->devq_entry.index);
 	return (freeze);
 }
 
 u_int32_t
 xpt_freeze_devq(struct cam_path *path, u_int count)
 {
 	struct cam_ed	*dev = path->device;
 	struct cam_devq	*devq;
 	uint32_t	 freeze;
 
 	devq = dev->sim->devq;
 	mtx_lock(&devq->send_mtx);
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_freeze_devq(%d)\n", count));
 	freeze = xpt_freeze_devq_device(dev, count);
 	mtx_unlock(&devq->send_mtx);
 	return (freeze);
 }
 
 u_int32_t
 xpt_freeze_simq(struct cam_sim *sim, u_int count)
 {
 	struct cam_devq	*devq;
 	uint32_t	 freeze;
 
 	devq = sim->devq;
 	mtx_lock(&devq->send_mtx);
 	freeze = (devq->send_queue.qfrozen_cnt += count);
 	mtx_unlock(&devq->send_mtx);
 	return (freeze);
 }
 
 static void
 xpt_release_devq_timeout(void *arg)
 {
 	struct cam_ed *dev;
 	struct cam_devq *devq;
 
 	dev = (struct cam_ed *)arg;
 	CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE, ("xpt_release_devq_timeout\n"));
 	devq = dev->sim->devq;
 	mtx_assert(&devq->send_mtx, MA_OWNED);
 	if (xpt_release_devq_device(dev, /*count*/1, /*run_queue*/TRUE))
 		xpt_run_devq(devq);
 }
 
 void
 xpt_release_devq(struct cam_path *path, u_int count, int run_queue)
 {
 	struct cam_ed *dev;
 	struct cam_devq *devq;
 
 	CAM_DEBUG(path, CAM_DEBUG_TRACE, ("xpt_release_devq(%d, %d)\n",
 	    count, run_queue));
 	dev = path->device;
 	devq = dev->sim->devq;
 	mtx_lock(&devq->send_mtx);
 	if (xpt_release_devq_device(dev, count, run_queue))
 		xpt_run_devq(dev->sim->devq);
 	mtx_unlock(&devq->send_mtx);
 }
 
 static int
 xpt_release_devq_device(struct cam_ed *dev, u_int count, int run_queue)
 {
 
 	mtx_assert(&dev->sim->devq->send_mtx, MA_OWNED);
 	CAM_DEBUG_DEV(dev, CAM_DEBUG_TRACE,
 	    ("xpt_release_devq_device(%d, %d) %u->%u\n", count, run_queue,
 	    dev->ccbq.queue.qfrozen_cnt, dev->ccbq.queue.qfrozen_cnt - count));
 	if (count > dev->ccbq.queue.qfrozen_cnt) {
 #ifdef INVARIANTS
 		printf("xpt_release_devq(): requested %u > present %u\n",
 		    count, dev->ccbq.queue.qfrozen_cnt);
 #endif
 		count = dev->ccbq.queue.qfrozen_cnt;
 	}
 	dev->ccbq.queue.qfrozen_cnt -= count;
 	if (dev->ccbq.queue.qfrozen_cnt == 0) {
 		/*
 		 * No longer need to wait for a successful
 		 * command completion.
 		 */
 		dev->flags &= ~CAM_DEV_REL_ON_COMPLETE;
 		/*
 		 * Remove any timeouts that might be scheduled
 		 * to release this queue.
 		 */
 		if ((dev->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0) {
 			callout_stop(&dev->callout);
 			dev->flags &= ~CAM_DEV_REL_TIMEOUT_PENDING;
 		}
 		/*
 		 * Now that we are unfrozen schedule the
 		 * device so any pending transactions are
 		 * run.
 		 */
 		xpt_schedule_devq(dev->sim->devq, dev);
 	} else
 		run_queue = 0;
 	return (run_queue);
 }
 
 void
 xpt_release_simq(struct cam_sim *sim, int run_queue)
 {
 	struct cam_devq	*devq;
 
 	devq = sim->devq;
 	mtx_lock(&devq->send_mtx);
 	if (devq->send_queue.qfrozen_cnt <= 0) {
 #ifdef INVARIANTS
 		printf("xpt_release_simq: requested 1 > present %u\n",
 		    devq->send_queue.qfrozen_cnt);
 #endif
 	} else
 		devq->send_queue.qfrozen_cnt--;
 	if (devq->send_queue.qfrozen_cnt == 0) {
 		/*
 		 * If there is a timeout scheduled to release this
 		 * sim queue, remove it.  The queue frozen count is
 		 * already at 0.
 		 */
 		if ((sim->flags & CAM_SIM_REL_TIMEOUT_PENDING) != 0){
 			callout_stop(&sim->callout);
 			sim->flags &= ~CAM_SIM_REL_TIMEOUT_PENDING;
 		}
 		if (run_queue) {
 			/*
 			 * Now that we are unfrozen run the send queue.
 			 */
 			xpt_run_devq(sim->devq);
 		}
 	}
 	mtx_unlock(&devq->send_mtx);
 }
 
 /*
  * XXX Appears to be unused.
  */
 static void
 xpt_release_simq_timeout(void *arg)
 {
 	struct cam_sim *sim;
 
 	sim = (struct cam_sim *)arg;
 	xpt_release_simq(sim, /* run_queue */ TRUE);
 }
 
 void
 xpt_done(union ccb *done_ccb)
 {
 	struct cam_doneq *queue;
 	int	run, hash;
 
 	CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done\n"));
 	if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0)
 		return;
 
 	hash = (done_ccb->ccb_h.path_id + done_ccb->ccb_h.target_id +
 	    done_ccb->ccb_h.target_lun) % cam_num_doneqs;
 	queue = &cam_doneqs[hash];
 	mtx_lock(&queue->cam_doneq_mtx);
 	run = (queue->cam_doneq_sleep && STAILQ_EMPTY(&queue->cam_doneq));
 	STAILQ_INSERT_TAIL(&queue->cam_doneq, &done_ccb->ccb_h, sim_links.stqe);
 	done_ccb->ccb_h.pinfo.index = CAM_DONEQ_INDEX;
 	mtx_unlock(&queue->cam_doneq_mtx);
 	if (run)
 		wakeup(&queue->cam_doneq);
 }
 
 void
 xpt_done_direct(union ccb *done_ccb)
 {
 
 	CAM_DEBUG(done_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xpt_done_direct\n"));
 	if ((done_ccb->ccb_h.func_code & XPT_FC_QUEUED) == 0)
 		return;
 
 	xpt_done_process(&done_ccb->ccb_h);
 }
 
 union ccb *
 xpt_alloc_ccb()
 {
 	union ccb *new_ccb;
 
 	new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK);
 	return (new_ccb);
 }
 
 union ccb *
 xpt_alloc_ccb_nowait()
 {
 	union ccb *new_ccb;
 
 	new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT);
 	return (new_ccb);
 }
 
 void
 xpt_free_ccb(union ccb *free_ccb)
 {
 	free(free_ccb, M_CAMCCB);
 }
 
 
 
 /* Private XPT functions */
 
 /*
  * Get a CAM control block for the caller. Charge the structure to the device
  * referenced by the path.  If we don't have sufficient resources to allocate
  * more ccbs, we return NULL.
  */
 static union ccb *
 xpt_get_ccb_nowait(struct cam_periph *periph)
 {
 	union ccb *new_ccb;
 
 	new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_NOWAIT);
 	if (new_ccb == NULL)
 		return (NULL);
 	periph->periph_allocated++;
 	cam_ccbq_take_opening(&periph->path->device->ccbq);
 	return (new_ccb);
 }
 
 static union ccb *
 xpt_get_ccb(struct cam_periph *periph)
 {
 	union ccb *new_ccb;
 
 	cam_periph_unlock(periph);
 	new_ccb = malloc(sizeof(*new_ccb), M_CAMCCB, M_ZERO|M_WAITOK);
 	cam_periph_lock(periph);
 	periph->periph_allocated++;
 	cam_ccbq_take_opening(&periph->path->device->ccbq);
 	return (new_ccb);
 }
 
 union ccb *
 cam_periph_getccb(struct cam_periph *periph, u_int32_t priority)
 {
 	struct ccb_hdr *ccb_h;
 
 	CAM_DEBUG(periph->path, CAM_DEBUG_TRACE, ("cam_periph_getccb\n"));
 	cam_periph_assert(periph, MA_OWNED);
 	while ((ccb_h = SLIST_FIRST(&periph->ccb_list)) == NULL ||
 	    ccb_h->pinfo.priority != priority) {
 		if (priority < periph->immediate_priority) {
 			periph->immediate_priority = priority;
 			xpt_run_allocq(periph, 0);
 		} else
 			cam_periph_sleep(periph, &periph->ccb_list, PRIBIO,
 			    "cgticb", 0);
 	}
 	SLIST_REMOVE_HEAD(&periph->ccb_list, periph_links.sle);
 	return ((union ccb *)ccb_h);
 }
 
 static void
 xpt_acquire_bus(struct cam_eb *bus)
 {
 
 	xpt_lock_buses();
 	bus->refcount++;
 	xpt_unlock_buses();
 }
 
 static void
 xpt_release_bus(struct cam_eb *bus)
 {
 
 	xpt_lock_buses();
 	KASSERT(bus->refcount >= 1, ("bus->refcount >= 1"));
 	if (--bus->refcount > 0) {
 		xpt_unlock_buses();
 		return;
 	}
 	TAILQ_REMOVE(&xsoftc.xpt_busses, bus, links);
 	xsoftc.bus_generation++;
 	xpt_unlock_buses();
 	KASSERT(TAILQ_EMPTY(&bus->et_entries),
 	    ("destroying bus, but target list is not empty"));
 	cam_sim_release(bus->sim);
 	mtx_destroy(&bus->eb_mtx);
 	free(bus, M_CAMXPT);
 }
 
 static struct cam_et *
 xpt_alloc_target(struct cam_eb *bus, target_id_t target_id)
 {
 	struct cam_et *cur_target, *target;
 
 	mtx_assert(&xsoftc.xpt_topo_lock, MA_OWNED);
 	mtx_assert(&bus->eb_mtx, MA_OWNED);
 	target = (struct cam_et *)malloc(sizeof(*target), M_CAMXPT,
 					 M_NOWAIT|M_ZERO);
 	if (target == NULL)
 		return (NULL);
 
 	TAILQ_INIT(&target->ed_entries);
 	target->bus = bus;
 	target->target_id = target_id;
 	target->refcount = 1;
 	target->generation = 0;
 	target->luns = NULL;
 	mtx_init(&target->luns_mtx, "CAM LUNs lock", NULL, MTX_DEF);
 	timevalclear(&target->last_reset);
 	/*
 	 * Hold a reference to our parent bus so it
 	 * will not go away before we do.
 	 */
 	bus->refcount++;
 
 	/* Insertion sort into our bus's target list */
 	cur_target = TAILQ_FIRST(&bus->et_entries);
 	while (cur_target != NULL && cur_target->target_id < target_id)
 		cur_target = TAILQ_NEXT(cur_target, links);
 	if (cur_target != NULL) {
 		TAILQ_INSERT_BEFORE(cur_target, target, links);
 	} else {
 		TAILQ_INSERT_TAIL(&bus->et_entries, target, links);
 	}
 	bus->generation++;
 	return (target);
 }
 
 static void
 xpt_acquire_target(struct cam_et *target)
 {
 	struct cam_eb *bus = target->bus;
 
 	mtx_lock(&bus->eb_mtx);
 	target->refcount++;
 	mtx_unlock(&bus->eb_mtx);
 }
 
 static void
 xpt_release_target(struct cam_et *target)
 {
 	struct cam_eb *bus = target->bus;
 
 	mtx_lock(&bus->eb_mtx);
 	if (--target->refcount > 0) {
 		mtx_unlock(&bus->eb_mtx);
 		return;
 	}
 	TAILQ_REMOVE(&bus->et_entries, target, links);
 	bus->generation++;
 	mtx_unlock(&bus->eb_mtx);
 	KASSERT(TAILQ_EMPTY(&target->ed_entries),
 	    ("destroying target, but device list is not empty"));
 	xpt_release_bus(bus);
 	mtx_destroy(&target->luns_mtx);
 	if (target->luns)
 		free(target->luns, M_CAMXPT);
 	free(target, M_CAMXPT);
 }
 
 static struct cam_ed *
 xpt_alloc_device_default(struct cam_eb *bus, struct cam_et *target,
 			 lun_id_t lun_id)
 {
 	struct cam_ed *device;
 
 	device = xpt_alloc_device(bus, target, lun_id);
 	if (device == NULL)
 		return (NULL);
 
 	device->mintags = 1;
 	device->maxtags = 1;
 	return (device);
 }
 
 static void
 xpt_destroy_device(void *context, int pending)
 {
 	struct cam_ed	*device = context;
 
 	mtx_lock(&device->device_mtx);
 	mtx_destroy(&device->device_mtx);
 	free(device, M_CAMDEV);
 }
 
 struct cam_ed *
 xpt_alloc_device(struct cam_eb *bus, struct cam_et *target, lun_id_t lun_id)
 {
 	struct cam_ed	*cur_device, *device;
 	struct cam_devq	*devq;
 	cam_status status;
 
 	mtx_assert(&bus->eb_mtx, MA_OWNED);
 	/* Make space for us in the device queue on our bus */
 	devq = bus->sim->devq;
 	mtx_lock(&devq->send_mtx);
 	status = cam_devq_resize(devq, devq->send_queue.array_size + 1);
 	mtx_unlock(&devq->send_mtx);
 	if (status != CAM_REQ_CMP)
 		return (NULL);
 
 	device = (struct cam_ed *)malloc(sizeof(*device),
 					 M_CAMDEV, M_NOWAIT|M_ZERO);
 	if (device == NULL)
 		return (NULL);
 
 	cam_init_pinfo(&device->devq_entry);
 	device->target = target;
 	device->lun_id = lun_id;
 	device->sim = bus->sim;
 	if (cam_ccbq_init(&device->ccbq,
 			  bus->sim->max_dev_openings) != 0) {
 		free(device, M_CAMDEV);
 		return (NULL);
 	}
 	SLIST_INIT(&device->asyncs);
 	SLIST_INIT(&device->periphs);
 	device->generation = 0;
 	device->flags = CAM_DEV_UNCONFIGURED;
 	device->tag_delay_count = 0;
 	device->tag_saved_openings = 0;
 	device->refcount = 1;
 	mtx_init(&device->device_mtx, "CAM device lock", NULL, MTX_DEF);
 	callout_init_mtx(&device->callout, &devq->send_mtx, 0);
 	TASK_INIT(&device->device_destroy_task, 0, xpt_destroy_device, device);
 	/*
 	 * Hold a reference to our parent target so it
 	 * will not go away before we do.
 	 */
 	target->refcount++;
 
 	cur_device = TAILQ_FIRST(&target->ed_entries);
 	while (cur_device != NULL && cur_device->lun_id < lun_id)
 		cur_device = TAILQ_NEXT(cur_device, links);
 	if (cur_device != NULL)
 		TAILQ_INSERT_BEFORE(cur_device, device, links);
 	else
 		TAILQ_INSERT_TAIL(&target->ed_entries, device, links);
 	target->generation++;
 	return (device);
 }
 
 void
 xpt_acquire_device(struct cam_ed *device)
 {
 	struct cam_eb *bus = device->target->bus;
 
 	mtx_lock(&bus->eb_mtx);
 	device->refcount++;
 	mtx_unlock(&bus->eb_mtx);
 }
 
 void
 xpt_release_device(struct cam_ed *device)
 {
 	struct cam_eb *bus = device->target->bus;
 	struct cam_devq *devq;
 
 	mtx_lock(&bus->eb_mtx);
 	if (--device->refcount > 0) {
 		mtx_unlock(&bus->eb_mtx);
 		return;
 	}
 
 	TAILQ_REMOVE(&device->target->ed_entries, device,links);
 	device->target->generation++;
 	mtx_unlock(&bus->eb_mtx);
 
 	/* Release our slot in the devq */
 	devq = bus->sim->devq;
 	mtx_lock(&devq->send_mtx);
 	cam_devq_resize(devq, devq->send_queue.array_size - 1);
 	mtx_unlock(&devq->send_mtx);
 
 	KASSERT(SLIST_EMPTY(&device->periphs),
 	    ("destroying device, but periphs list is not empty"));
 	KASSERT(device->devq_entry.index == CAM_UNQUEUED_INDEX,
 	    ("destroying device while still queued for ccbs"));
 
 	if ((device->flags & CAM_DEV_REL_TIMEOUT_PENDING) != 0)
 		callout_stop(&device->callout);
 
 	xpt_release_target(device->target);
 
 	cam_ccbq_fini(&device->ccbq);
 	/*
 	 * Free allocated memory.  free(9) does nothing if the
 	 * supplied pointer is NULL, so it is safe to call without
 	 * checking.
 	 */
 	free(device->supported_vpds, M_CAMXPT);
 	free(device->device_id, M_CAMXPT);
 	free(device->ext_inq, M_CAMXPT);
 	free(device->physpath, M_CAMXPT);
 	free(device->rcap_buf, M_CAMXPT);
 	free(device->serial_num, M_CAMXPT);
 	taskqueue_enqueue(xsoftc.xpt_taskq, &device->device_destroy_task);
 }
 
 u_int32_t
 xpt_dev_ccbq_resize(struct cam_path *path, int newopenings)
 {
 	int	result;
 	struct	cam_ed *dev;
 
 	dev = path->device;
 	mtx_lock(&dev->sim->devq->send_mtx);
 	result = cam_ccbq_resize(&dev->ccbq, newopenings);
 	mtx_unlock(&dev->sim->devq->send_mtx);
 	if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0
 	 || (dev->inq_flags & SID_CmdQue) != 0)
 		dev->tag_saved_openings = newopenings;
 	return (result);
 }
 
 static struct cam_eb *
 xpt_find_bus(path_id_t path_id)
 {
 	struct cam_eb *bus;
 
 	xpt_lock_buses();
 	for (bus = TAILQ_FIRST(&xsoftc.xpt_busses);
 	     bus != NULL;
 	     bus = TAILQ_NEXT(bus, links)) {
 		if (bus->path_id == path_id) {
 			bus->refcount++;
 			break;
 		}
 	}
 	xpt_unlock_buses();
 	return (bus);
 }
 
 static struct cam_et *
 xpt_find_target(struct cam_eb *bus, target_id_t	target_id)
 {
 	struct cam_et *target;
 
 	mtx_assert(&bus->eb_mtx, MA_OWNED);
 	for (target = TAILQ_FIRST(&bus->et_entries);
 	     target != NULL;
 	     target = TAILQ_NEXT(target, links)) {
 		if (target->target_id == target_id) {
 			target->refcount++;
 			break;
 		}
 	}
 	return (target);
 }
 
 static struct cam_ed *
 xpt_find_device(struct cam_et *target, lun_id_t lun_id)
 {
 	struct cam_ed *device;
 
 	mtx_assert(&target->bus->eb_mtx, MA_OWNED);
 	for (device = TAILQ_FIRST(&target->ed_entries);
 	     device != NULL;
 	     device = TAILQ_NEXT(device, links)) {
 		if (device->lun_id == lun_id) {
 			device->refcount++;
 			break;
 		}
 	}
 	return (device);
 }
 
 void
 xpt_start_tags(struct cam_path *path)
 {
 	struct ccb_relsim crs;
 	struct cam_ed *device;
 	struct cam_sim *sim;
 	int    newopenings;
 
 	device = path->device;
 	sim = path->bus->sim;
 	device->flags &= ~CAM_DEV_TAG_AFTER_COUNT;
 	xpt_freeze_devq(path, /*count*/1);
 	device->inq_flags |= SID_CmdQue;
 	if (device->tag_saved_openings != 0)
 		newopenings = device->tag_saved_openings;
 	else
 		newopenings = min(device->maxtags,
 				  sim->max_tagged_dev_openings);
 	xpt_dev_ccbq_resize(path, newopenings);
 	xpt_async(AC_GETDEV_CHANGED, path, NULL);
 	xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL);
 	crs.ccb_h.func_code = XPT_REL_SIMQ;
 	crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY;
 	crs.openings
 	    = crs.release_timeout
 	    = crs.qfrozen_cnt
 	    = 0;
 	xpt_action((union ccb *)&crs);
 }
 
 void
 xpt_stop_tags(struct cam_path *path)
 {
 	struct ccb_relsim crs;
 	struct cam_ed *device;
 	struct cam_sim *sim;
 
 	device = path->device;
 	sim = path->bus->sim;
 	device->flags &= ~CAM_DEV_TAG_AFTER_COUNT;
 	device->tag_delay_count = 0;
 	xpt_freeze_devq(path, /*count*/1);
 	device->inq_flags &= ~SID_CmdQue;
 	xpt_dev_ccbq_resize(path, sim->max_dev_openings);
 	xpt_async(AC_GETDEV_CHANGED, path, NULL);
 	xpt_setup_ccb(&crs.ccb_h, path, CAM_PRIORITY_NORMAL);
 	crs.ccb_h.func_code = XPT_REL_SIMQ;
 	crs.release_flags = RELSIM_RELEASE_AFTER_QEMPTY;
 	crs.openings
 	    = crs.release_timeout
 	    = crs.qfrozen_cnt
 	    = 0;
 	xpt_action((union ccb *)&crs);
 }
 
 static void
 xpt_boot_delay(void *arg)
 {
 
 	xpt_release_boot();
 }
 
 static void
 xpt_config(void *arg)
 {
 	/*
 	 * Now that interrupts are enabled, go find our devices
 	 */
 	if (taskqueue_start_threads(&xsoftc.xpt_taskq, 1, PRIBIO, "CAM taskq"))
 		printf("xpt_config: failed to create taskqueue thread.\n");
 
 	/* Setup debugging path */
 	if (cam_dflags != CAM_DEBUG_NONE) {
 		if (xpt_create_path(&cam_dpath, NULL,
 				    CAM_DEBUG_BUS, CAM_DEBUG_TARGET,
 				    CAM_DEBUG_LUN) != CAM_REQ_CMP) {
 			printf("xpt_config: xpt_create_path() failed for debug"
 			       " target %d:%d:%d, debugging disabled\n",
 			       CAM_DEBUG_BUS, CAM_DEBUG_TARGET, CAM_DEBUG_LUN);
 			cam_dflags = CAM_DEBUG_NONE;
 		}
 	} else
 		cam_dpath = NULL;
 
 	periphdriver_init(1);
 	xpt_hold_boot();
 	callout_init(&xsoftc.boot_callout, 1);
 	callout_reset_sbt(&xsoftc.boot_callout, SBT_1MS * xsoftc.boot_delay, 0,
 	    xpt_boot_delay, NULL, 0);
 	/* Fire up rescan thread. */
 	if (kproc_kthread_add(xpt_scanner_thread, NULL, &cam_proc, NULL, 0, 0,
 	    "cam", "scanner")) {
 		printf("xpt_config: failed to create rescan thread.\n");
 	}
 }
 
 void
 xpt_hold_boot(void)
 {
 	xpt_lock_buses();
 	xsoftc.buses_to_config++;
 	xpt_unlock_buses();
 }
 
 void
 xpt_release_boot(void)
 {
 	xpt_lock_buses();
 	xsoftc.buses_to_config--;
 	if (xsoftc.buses_to_config == 0 && xsoftc.buses_config_done == 0) {
 		struct	xpt_task *task;
 
 		xsoftc.buses_config_done = 1;
 		xpt_unlock_buses();
 		/* Call manually because we don't have any busses */
 		task = malloc(sizeof(struct xpt_task), M_CAMXPT, M_NOWAIT);
 		if (task != NULL) {
 			TASK_INIT(&task->task, 0, xpt_finishconfig_task, task);
 			taskqueue_enqueue(taskqueue_thread, &task->task);
 		}
 	} else
 		xpt_unlock_buses();
 }
 
 /*
  * If the given device only has one peripheral attached to it, and if that
  * peripheral is the passthrough driver, announce it.  This ensures that the
  * user sees some sort of announcement for every peripheral in their system.
  */
 static int
 xptpassannouncefunc(struct cam_ed *device, void *arg)
 {
 	struct cam_periph *periph;
 	int i;
 
 	for (periph = SLIST_FIRST(&device->periphs), i = 0; periph != NULL;
 	     periph = SLIST_NEXT(periph, periph_links), i++);
 
 	periph = SLIST_FIRST(&device->periphs);
 	if ((i == 1)
 	 && (strncmp(periph->periph_name, "pass", 4) == 0))
 		xpt_announce_periph(periph, NULL);
 
 	return(1);
 }
 
 static void
 xpt_finishconfig_task(void *context, int pending)
 {
 
 	periphdriver_init(2);
 	/*
 	 * Check for devices with no "standard" peripheral driver
 	 * attached.  For any devices like that, announce the
 	 * passthrough driver so the user will see something.
 	 */
 	if (!bootverbose)
 		xpt_for_all_devices(xptpassannouncefunc, NULL);
 
 	/* Release our hook so that the boot can continue. */
 	config_intrhook_disestablish(xsoftc.xpt_config_hook);
 	free(xsoftc.xpt_config_hook, M_CAMXPT);
 	xsoftc.xpt_config_hook = NULL;
 
 	free(context, M_CAMXPT);
 }
 
 cam_status
 xpt_register_async(int event, ac_callback_t *cbfunc, void *cbarg,
 		   struct cam_path *path)
 {
 	struct ccb_setasync csa;
 	cam_status status;
 	int xptpath = 0;
 
 	if (path == NULL) {
 		status = xpt_create_path(&path, /*periph*/NULL, CAM_XPT_PATH_ID,
 					 CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD);
 		if (status != CAM_REQ_CMP)
 			return (status);
 		xpt_path_lock(path);
 		xptpath = 1;
 	}
 
 	xpt_setup_ccb(&csa.ccb_h, path, CAM_PRIORITY_NORMAL);
 	csa.ccb_h.func_code = XPT_SASYNC_CB;
 	csa.event_enable = event;
 	csa.callback = cbfunc;
 	csa.callback_arg = cbarg;
 	xpt_action((union ccb *)&csa);
 	status = csa.ccb_h.status;
 
 	if (xptpath) {
 		xpt_path_unlock(path);
 		xpt_free_path(path);
 	}
 
 	if ((status == CAM_REQ_CMP) &&
 	    (csa.event_enable & AC_FOUND_DEVICE)) {
 		/*
 		 * Get this peripheral up to date with all
 		 * the currently existing devices.
 		 */
 		xpt_for_all_devices(xptsetasyncfunc, &csa);
 	}
 	if ((status == CAM_REQ_CMP) &&
 	    (csa.event_enable & AC_PATH_REGISTERED)) {
 		/*
 		 * Get this peripheral up to date with all
 		 * the currently existing busses.
 		 */
 		xpt_for_all_busses(xptsetasyncbusfunc, &csa);
 	}
 
 	return (status);
 }
 
 static void
 xptaction(struct cam_sim *sim, union ccb *work_ccb)
 {
 	CAM_DEBUG(work_ccb->ccb_h.path, CAM_DEBUG_TRACE, ("xptaction\n"));
 
 	switch (work_ccb->ccb_h.func_code) {
 	/* Common cases first */
 	case XPT_PATH_INQ:		/* Path routing inquiry */
 	{
 		struct ccb_pathinq *cpi;
 
 		cpi = &work_ccb->cpi;
 		cpi->version_num = 1; /* XXX??? */
 		cpi->hba_inquiry = 0;
 		cpi->target_sprt = 0;
 		cpi->hba_misc = 0;
 		cpi->hba_eng_cnt = 0;
 		cpi->max_target = 0;
 		cpi->max_lun = 0;
 		cpi->initiator_id = 0;
 		strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN);
 		strncpy(cpi->hba_vid, "", HBA_IDLEN);
 		strncpy(cpi->dev_name, sim->sim_name, DEV_IDLEN);
 		cpi->unit_number = sim->unit_number;
 		cpi->bus_id = sim->bus_id;
 		cpi->base_transfer_speed = 0;
 		cpi->protocol = PROTO_UNSPECIFIED;
 		cpi->protocol_version = PROTO_VERSION_UNSPECIFIED;
 		cpi->transport = XPORT_UNSPECIFIED;
 		cpi->transport_version = XPORT_VERSION_UNSPECIFIED;
 		cpi->ccb_h.status = CAM_REQ_CMP;
 		xpt_done(work_ccb);
 		break;
 	}
 	default:
 		work_ccb->ccb_h.status = CAM_REQ_INVALID;
 		xpt_done(work_ccb);
 		break;
 	}
 }
 
 /*
  * The xpt as a "controller" has no interrupt sources, so polling
  * is a no-op.
  */
 static void
 xptpoll(struct cam_sim *sim)
 {
 }
 
 void
 xpt_lock_buses(void)
 {
 	mtx_lock(&xsoftc.xpt_topo_lock);
 }
 
 void
 xpt_unlock_buses(void)
 {
 	mtx_unlock(&xsoftc.xpt_topo_lock);
 }
 
 struct mtx *
 xpt_path_mtx(struct cam_path *path)
 {
 
 	return (&path->device->device_mtx);
 }
 
 static void
 xpt_done_process(struct ccb_hdr *ccb_h)
 {
 	struct cam_sim *sim;
 	struct cam_devq *devq;
 	struct mtx *mtx = NULL;
 
 	if (ccb_h->flags & CAM_HIGH_POWER) {
 		struct highpowerlist	*hphead;
 		struct cam_ed		*device;
 
 		mtx_lock(&xsoftc.xpt_highpower_lock);
 		hphead = &xsoftc.highpowerq;
 
 		device = STAILQ_FIRST(hphead);
 
 		/*
 		 * Increment the count since this command is done.
 		 */
 		xsoftc.num_highpower++;
 
 		/*
 		 * Any high powered commands queued up?
 		 */
 		if (device != NULL) {
 
 			STAILQ_REMOVE_HEAD(hphead, highpowerq_entry);
 			mtx_unlock(&xsoftc.xpt_highpower_lock);
 
 			mtx_lock(&device->sim->devq->send_mtx);
 			xpt_release_devq_device(device,
 					 /*count*/1, /*runqueue*/TRUE);
 			mtx_unlock(&device->sim->devq->send_mtx);
 		} else
 			mtx_unlock(&xsoftc.xpt_highpower_lock);
 	}
 
 	sim = ccb_h->path->bus->sim;
 
 	if (ccb_h->status & CAM_RELEASE_SIMQ) {
 		xpt_release_simq(sim, /*run_queue*/FALSE);
 		ccb_h->status &= ~CAM_RELEASE_SIMQ;
 	}
 
 	if ((ccb_h->flags & CAM_DEV_QFRZDIS)
 	 && (ccb_h->status & CAM_DEV_QFRZN)) {
 		xpt_release_devq(ccb_h->path, /*count*/1, /*run_queue*/TRUE);
 		ccb_h->status &= ~CAM_DEV_QFRZN;
 	}
 
 	devq = sim->devq;
 	if ((ccb_h->func_code & XPT_FC_USER_CCB) == 0) {
 		struct cam_ed *dev = ccb_h->path->device;
 
 		mtx_lock(&devq->send_mtx);
 		devq->send_active--;
 		devq->send_openings++;
 		cam_ccbq_ccb_done(&dev->ccbq, (union ccb *)ccb_h);
 
 		if (((dev->flags & CAM_DEV_REL_ON_QUEUE_EMPTY) != 0
 		  && (dev->ccbq.dev_active == 0))) {
 			dev->flags &= ~CAM_DEV_REL_ON_QUEUE_EMPTY;
 			xpt_release_devq_device(dev, /*count*/1,
 					 /*run_queue*/FALSE);
 		}
 
 		if (((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0
 		  && (ccb_h->status&CAM_STATUS_MASK) != CAM_REQUEUE_REQ)) {
 			dev->flags &= ~CAM_DEV_REL_ON_COMPLETE;
 			xpt_release_devq_device(dev, /*count*/1,
 					 /*run_queue*/FALSE);
 		}
 
 		if (!device_is_queued(dev))
 			(void)xpt_schedule_devq(devq, dev);
 		xpt_run_devq(devq);
 		mtx_unlock(&devq->send_mtx);
 
 		if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0) {
 			mtx = xpt_path_mtx(ccb_h->path);
 			mtx_lock(mtx);
 
 			if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0
 			 && (--dev->tag_delay_count == 0))
 				xpt_start_tags(ccb_h->path);
 		}
 	}
 
 	if ((ccb_h->flags & CAM_UNLOCKED) == 0) {
 		if (mtx == NULL) {
 			mtx = xpt_path_mtx(ccb_h->path);
 			mtx_lock(mtx);
 		}
 	} else {
 		if (mtx != NULL) {
 			mtx_unlock(mtx);
 			mtx = NULL;
 		}
 	}
 
 	/* Call the peripheral driver's callback */
 	ccb_h->pinfo.index = CAM_UNQUEUED_INDEX;
 	(*ccb_h->cbfcnp)(ccb_h->path->periph, (union ccb *)ccb_h);
 	if (mtx != NULL)
 		mtx_unlock(mtx);
 }
 
 void
 xpt_done_td(void *arg)
 {
 	struct cam_doneq *queue = arg;
 	struct ccb_hdr *ccb_h;
 	STAILQ_HEAD(, ccb_hdr)	doneq;
 
 	STAILQ_INIT(&doneq);
 	mtx_lock(&queue->cam_doneq_mtx);
 	while (1) {
 		while (STAILQ_EMPTY(&queue->cam_doneq)) {
 			queue->cam_doneq_sleep = 1;
 			msleep(&queue->cam_doneq, &queue->cam_doneq_mtx,
 			    PRIBIO, "-", 0);
 			queue->cam_doneq_sleep = 0;
 		}
 		STAILQ_CONCAT(&doneq, &queue->cam_doneq);
 		mtx_unlock(&queue->cam_doneq_mtx);
 
 		THREAD_NO_SLEEPING();
 		while ((ccb_h = STAILQ_FIRST(&doneq)) != NULL) {
 			STAILQ_REMOVE_HEAD(&doneq, sim_links.stqe);
 			xpt_done_process(ccb_h);
 		}
 		THREAD_SLEEPING_OK();
 
 		mtx_lock(&queue->cam_doneq_mtx);
 	}
 }
 
 static void
 camisr_runqueue(void)
 {
 	struct	ccb_hdr *ccb_h;
 	struct cam_doneq *queue;
 	int i;
 
 	/* Process global queues. */
 	for (i = 0; i < cam_num_doneqs; i++) {
 		queue = &cam_doneqs[i];
 		mtx_lock(&queue->cam_doneq_mtx);
 		while ((ccb_h = STAILQ_FIRST(&queue->cam_doneq)) != NULL) {
 			STAILQ_REMOVE_HEAD(&queue->cam_doneq, sim_links.stqe);
 			mtx_unlock(&queue->cam_doneq_mtx);
 			xpt_done_process(ccb_h);
 			mtx_lock(&queue->cam_doneq_mtx);
 		}
 		mtx_unlock(&queue->cam_doneq_mtx);
 	}
 }
Index: user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c
===================================================================
--- user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c	(revision 281584)
+++ user/ngie/more-tests/sys/compat/freebsd32/freebsd32_misc.c	(revision 281585)
@@ -1,3124 +1,3125 @@
 /*-
  * Copyright (c) 2002 Doug Rabson
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_compat.h"
 #include "opt_inet.h"
 #include "opt_inet6.h"
 
 #define __ELF_WORD_SIZE 32
 
 #include <sys/param.h>
 #include <sys/bus.h>
 #include <sys/capsicum.h>
 #include <sys/clock.h>
 #include <sys/exec.h>
 #include <sys/fcntl.h>
 #include <sys/filedesc.h>
 #include <sys/imgact.h>
 #include <sys/jail.h>
 #include <sys/kernel.h>
 #include <sys/limits.h>
 #include <sys/linker.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/file.h>		/* Must come after sys/malloc.h */
 #include <sys/imgact.h>
 #include <sys/mbuf.h>
 #include <sys/mman.h>
 #include <sys/module.h>
 #include <sys/mount.h>
 #include <sys/mutex.h>
 #include <sys/namei.h>
 #include <sys/proc.h>
 #include <sys/procctl.h>
 #include <sys/reboot.h>
 #include <sys/resource.h>
 #include <sys/resourcevar.h>
 #include <sys/selinfo.h>
 #include <sys/eventvar.h>	/* Must come after sys/selinfo.h */
 #include <sys/pipe.h>		/* Must come after sys/selinfo.h */
 #include <sys/signal.h>
 #include <sys/signalvar.h>
 #include <sys/socket.h>
 #include <sys/socketvar.h>
 #include <sys/stat.h>
 #include <sys/syscall.h>
 #include <sys/syscallsubr.h>
 #include <sys/sysctl.h>
 #include <sys/sysent.h>
 #include <sys/sysproto.h>
 #include <sys/systm.h>
 #include <sys/thr.h>
 #include <sys/unistd.h>
 #include <sys/ucontext.h>
 #include <sys/vnode.h>
 #include <sys/wait.h>
 #include <sys/ipc.h>
 #include <sys/msg.h>
 #include <sys/sem.h>
 #include <sys/shm.h>
 
 #ifdef INET
 #include <netinet/in.h>
 #endif
 
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 #include <vm/vm_object.h>
 #include <vm/vm_extern.h>
 
 #include <machine/cpu.h>
 #include <machine/elf.h>
 
 #include <security/audit/audit.h>
 
 #include <compat/freebsd32/freebsd32_util.h>
 #include <compat/freebsd32/freebsd32.h>
 #include <compat/freebsd32/freebsd32_ipc.h>
 #include <compat/freebsd32/freebsd32_misc.h>
 #include <compat/freebsd32/freebsd32_signal.h>
 #include <compat/freebsd32/freebsd32_proto.h>
 
 FEATURE(compat_freebsd_32bit, "Compatible with 32-bit FreeBSD");
 
 #ifndef __mips__
 CTASSERT(sizeof(struct timeval32) == 8);
 CTASSERT(sizeof(struct timespec32) == 8);
 CTASSERT(sizeof(struct itimerval32) == 16);
 #endif
 CTASSERT(sizeof(struct statfs32) == 256);
 #ifndef __mips__
 CTASSERT(sizeof(struct rusage32) == 72);
 #endif
 CTASSERT(sizeof(struct sigaltstack32) == 12);
 CTASSERT(sizeof(struct kevent32) == 20);
 CTASSERT(sizeof(struct iovec32) == 8);
 CTASSERT(sizeof(struct msghdr32) == 28);
 #ifndef __mips__
 CTASSERT(sizeof(struct stat32) == 96);
 #endif
 CTASSERT(sizeof(struct sigaction32) == 24);
 
 static int freebsd32_kevent_copyout(void *arg, struct kevent *kevp, int count);
 static int freebsd32_kevent_copyin(void *arg, struct kevent *kevp, int count);
 
 void
 freebsd32_rusage_out(const struct rusage *s, struct rusage32 *s32)
 {
 
 	TV_CP(*s, *s32, ru_utime);
 	TV_CP(*s, *s32, ru_stime);
 	CP(*s, *s32, ru_maxrss);
 	CP(*s, *s32, ru_ixrss);
 	CP(*s, *s32, ru_idrss);
 	CP(*s, *s32, ru_isrss);
 	CP(*s, *s32, ru_minflt);
 	CP(*s, *s32, ru_majflt);
 	CP(*s, *s32, ru_nswap);
 	CP(*s, *s32, ru_inblock);
 	CP(*s, *s32, ru_oublock);
 	CP(*s, *s32, ru_msgsnd);
 	CP(*s, *s32, ru_msgrcv);
 	CP(*s, *s32, ru_nsignals);
 	CP(*s, *s32, ru_nvcsw);
 	CP(*s, *s32, ru_nivcsw);
 }
 
 int
 freebsd32_wait4(struct thread *td, struct freebsd32_wait4_args *uap)
 {
 	int error, status;
 	struct rusage32 ru32;
 	struct rusage ru, *rup;
 
 	if (uap->rusage != NULL)
 		rup = &ru;
 	else
 		rup = NULL;
 	error = kern_wait(td, uap->pid, &status, uap->options, rup);
 	if (error)
 		return (error);
 	if (uap->status != NULL)
 		error = copyout(&status, uap->status, sizeof(status));
 	if (uap->rusage != NULL && error == 0) {
 		freebsd32_rusage_out(&ru, &ru32);
 		error = copyout(&ru32, uap->rusage, sizeof(ru32));
 	}
 	return (error);
 }
 
 int
 freebsd32_wait6(struct thread *td, struct freebsd32_wait6_args *uap)
 {
 	struct wrusage32 wru32;
 	struct __wrusage wru, *wrup;
 	struct siginfo32 si32;
 	struct __siginfo si, *sip;
 	int error, status;
 
 	if (uap->wrusage != NULL)
 		wrup = &wru;
 	else
 		wrup = NULL;
 	if (uap->info != NULL) {
 		sip = &si;
 		bzero(sip, sizeof(*sip));
 	} else
 		sip = NULL;
 	error = kern_wait6(td, uap->idtype, PAIR32TO64(id_t, uap->id),
 	    &status, uap->options, wrup, sip);
 	if (error != 0)
 		return (error);
 	if (uap->status != NULL)
 		error = copyout(&status, uap->status, sizeof(status));
 	if (uap->wrusage != NULL && error == 0) {
 		freebsd32_rusage_out(&wru.wru_self, &wru32.wru_self);
 		freebsd32_rusage_out(&wru.wru_children, &wru32.wru_children);
 		error = copyout(&wru32, uap->wrusage, sizeof(wru32));
 	}
 	if (uap->info != NULL && error == 0) {
 		siginfo_to_siginfo32 (&si, &si32);
 		error = copyout(&si32, uap->info, sizeof(si32));
 	}
 	return (error);
 }
 
 #ifdef COMPAT_FREEBSD4
 static void
 copy_statfs(struct statfs *in, struct statfs32 *out)
 {
 
 	statfs_scale_blocks(in, INT32_MAX);
 	bzero(out, sizeof(*out));
 	CP(*in, *out, f_bsize);
 	out->f_iosize = MIN(in->f_iosize, INT32_MAX);
 	CP(*in, *out, f_blocks);
 	CP(*in, *out, f_bfree);
 	CP(*in, *out, f_bavail);
 	out->f_files = MIN(in->f_files, INT32_MAX);
 	out->f_ffree = MIN(in->f_ffree, INT32_MAX);
 	CP(*in, *out, f_fsid);
 	CP(*in, *out, f_owner);
 	CP(*in, *out, f_type);
 	CP(*in, *out, f_flags);
 	out->f_syncwrites = MIN(in->f_syncwrites, INT32_MAX);
 	out->f_asyncwrites = MIN(in->f_asyncwrites, INT32_MAX);
 	strlcpy(out->f_fstypename,
 	      in->f_fstypename, MFSNAMELEN);
 	strlcpy(out->f_mntonname,
 	      in->f_mntonname, min(MNAMELEN, FREEBSD4_MNAMELEN));
 	out->f_syncreads = MIN(in->f_syncreads, INT32_MAX);
 	out->f_asyncreads = MIN(in->f_asyncreads, INT32_MAX);
 	strlcpy(out->f_mntfromname,
 	      in->f_mntfromname, min(MNAMELEN, FREEBSD4_MNAMELEN));
 }
 #endif
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_getfsstat(struct thread *td, struct freebsd4_freebsd32_getfsstat_args *uap)
 {
 	struct statfs *buf, *sp;
 	struct statfs32 stat32;
 	size_t count, size;
 	int error;
 
 	count = uap->bufsize / sizeof(struct statfs32);
 	size = count * sizeof(struct statfs);
-	error = kern_getfsstat(td, &buf, size, UIO_SYSSPACE, uap->flags);
+	error = kern_getfsstat(td, &buf, size, &count, UIO_SYSSPACE, uap->flags);
 	if (size > 0) {
-		count = td->td_retval[0];
 		sp = buf;
 		while (count > 0 && error == 0) {
 			copy_statfs(sp, &stat32);
 			error = copyout(&stat32, uap->buf, sizeof(stat32));
 			sp++;
 			uap->buf++;
 			count--;
 		}
 		free(buf, M_TEMP);
 	}
+	if (error == 0)
+		td->td_retval[0] = count;
 	return (error);
 }
 #endif
 
 int
 freebsd32_sigaltstack(struct thread *td,
 		      struct freebsd32_sigaltstack_args *uap)
 {
 	struct sigaltstack32 s32;
 	struct sigaltstack ss, oss, *ssp;
 	int error;
 
 	if (uap->ss != NULL) {
 		error = copyin(uap->ss, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		PTRIN_CP(s32, ss, ss_sp);
 		CP(s32, ss, ss_size);
 		CP(s32, ss, ss_flags);
 		ssp = &ss;
 	} else
 		ssp = NULL;
 	error = kern_sigaltstack(td, ssp, &oss);
 	if (error == 0 && uap->oss != NULL) {
 		PTROUT_CP(oss, s32, ss_sp);
 		CP(oss, s32, ss_size);
 		CP(oss, s32, ss_flags);
 		error = copyout(&s32, uap->oss, sizeof(s32));
 	}
 	return (error);
 }
 
 /*
  * Custom version of exec_copyin_args() so that we can translate
  * the pointers.
  */
 int
 freebsd32_exec_copyin_args(struct image_args *args, char *fname,
     enum uio_seg segflg, u_int32_t *argv, u_int32_t *envv)
 {
 	char *argp, *envp;
 	u_int32_t *p32, arg;
 	size_t length;
 	int error;
 
 	bzero(args, sizeof(*args));
 	if (argv == NULL)
 		return (EFAULT);
 
 	/*
 	 * Allocate demand-paged memory for the file name, argument, and
 	 * environment strings.
 	 */
 	error = exec_alloc_args(args);
 	if (error != 0)
 		return (error);
 
 	/*
 	 * Copy the file name.
 	 */
 	if (fname != NULL) {
 		args->fname = args->buf;
 		error = (segflg == UIO_SYSSPACE) ?
 		    copystr(fname, args->fname, PATH_MAX, &length) :
 		    copyinstr(fname, args->fname, PATH_MAX, &length);
 		if (error != 0)
 			goto err_exit;
 	} else
 		length = 0;
 
 	args->begin_argv = args->buf + length;
 	args->endp = args->begin_argv;
 	args->stringspace = ARG_MAX;
 
 	/*
 	 * extract arguments first
 	 */
 	p32 = argv;
 	for (;;) {
 		error = copyin(p32++, &arg, sizeof(arg));
 		if (error)
 			goto err_exit;
 		if (arg == 0)
 			break;
 		argp = PTRIN(arg);
 		error = copyinstr(argp, args->endp, args->stringspace, &length);
 		if (error) {
 			if (error == ENAMETOOLONG)
 				error = E2BIG;
 			goto err_exit;
 		}
 		args->stringspace -= length;
 		args->endp += length;
 		args->argc++;
 	}
 			
 	args->begin_envv = args->endp;
 
 	/*
 	 * extract environment strings
 	 */
 	if (envv) {
 		p32 = envv;
 		for (;;) {
 			error = copyin(p32++, &arg, sizeof(arg));
 			if (error)
 				goto err_exit;
 			if (arg == 0)
 				break;
 			envp = PTRIN(arg);
 			error = copyinstr(envp, args->endp, args->stringspace,
 			    &length);
 			if (error) {
 				if (error == ENAMETOOLONG)
 					error = E2BIG;
 				goto err_exit;
 			}
 			args->stringspace -= length;
 			args->endp += length;
 			args->envc++;
 		}
 	}
 
 	return (0);
 
 err_exit:
 	exec_free_args(args);
 	return (error);
 }
 
 int
 freebsd32_execve(struct thread *td, struct freebsd32_execve_args *uap)
 {
 	struct image_args eargs;
 	int error;
 
 	error = freebsd32_exec_copyin_args(&eargs, uap->fname, UIO_USERSPACE,
 	    uap->argv, uap->envv);
 	if (error == 0)
 		error = kern_execve(td, &eargs, NULL);
 	return (error);
 }
 
 int
 freebsd32_fexecve(struct thread *td, struct freebsd32_fexecve_args *uap)
 {
 	struct image_args eargs;
 	int error;
 
 	error = freebsd32_exec_copyin_args(&eargs, NULL, UIO_SYSSPACE,
 	    uap->argv, uap->envv);
 	if (error == 0) {
 		eargs.fd = uap->fd;
 		error = kern_execve(td, &eargs, NULL);
 	}
 	return (error);
 }
 
 int
 freebsd32_mprotect(struct thread *td, struct freebsd32_mprotect_args *uap)
 {
 	struct mprotect_args ap;
 
 	ap.addr = PTRIN(uap->addr);
 	ap.len = uap->len;
 	ap.prot = uap->prot;
 #if defined(__amd64__)
 	if (i386_read_exec && (ap.prot & PROT_READ) != 0)
 		ap.prot |= PROT_EXEC;
 #endif
 	return (sys_mprotect(td, &ap));
 }
 
 int
 freebsd32_mmap(struct thread *td, struct freebsd32_mmap_args *uap)
 {
 	struct mmap_args ap;
 	vm_offset_t addr = (vm_offset_t) uap->addr;
 	vm_size_t len	 = uap->len;
 	int prot	 = uap->prot;
 	int flags	 = uap->flags;
 	int fd		 = uap->fd;
 	off_t pos	 = PAIR32TO64(off_t,uap->pos);
 
 #if defined(__amd64__)
 	if (i386_read_exec && (prot & PROT_READ))
 		prot |= PROT_EXEC;
 #endif
 
 	ap.addr = (void *) addr;
 	ap.len = len;
 	ap.prot = prot;
 	ap.flags = flags;
 	ap.fd = fd;
 	ap.pos = pos;
 
 	return (sys_mmap(td, &ap));
 }
 
 #ifdef COMPAT_FREEBSD6
 int
 freebsd6_freebsd32_mmap(struct thread *td, struct freebsd6_freebsd32_mmap_args *uap)
 {
 	struct freebsd32_mmap_args ap;
 
 	ap.addr = uap->addr;
 	ap.len = uap->len;
 	ap.prot = uap->prot;
 	ap.flags = uap->flags;
 	ap.fd = uap->fd;
 	ap.pos1 = uap->pos1;
 	ap.pos2 = uap->pos2;
 
 	return (freebsd32_mmap(td, &ap));
 }
 #endif
 
 int
 freebsd32_setitimer(struct thread *td, struct freebsd32_setitimer_args *uap)
 {
 	struct itimerval itv, oitv, *itvp;	
 	struct itimerval32 i32;
 	int error;
 
 	if (uap->itv != NULL) {
 		error = copyin(uap->itv, &i32, sizeof(i32));
 		if (error)
 			return (error);
 		TV_CP(i32, itv, it_interval);
 		TV_CP(i32, itv, it_value);
 		itvp = &itv;
 	} else
 		itvp = NULL;
 	error = kern_setitimer(td, uap->which, itvp, &oitv);
 	if (error || uap->oitv == NULL)
 		return (error);
 	TV_CP(oitv, i32, it_interval);
 	TV_CP(oitv, i32, it_value);
 	return (copyout(&i32, uap->oitv, sizeof(i32)));
 }
 
 int
 freebsd32_getitimer(struct thread *td, struct freebsd32_getitimer_args *uap)
 {
 	struct itimerval itv;
 	struct itimerval32 i32;
 	int error;
 
 	error = kern_getitimer(td, uap->which, &itv);
 	if (error || uap->itv == NULL)
 		return (error);
 	TV_CP(itv, i32, it_interval);
 	TV_CP(itv, i32, it_value);
 	return (copyout(&i32, uap->itv, sizeof(i32)));
 }
 
 int
 freebsd32_select(struct thread *td, struct freebsd32_select_args *uap)
 {
 	struct timeval32 tv32;
 	struct timeval tv, *tvp;
 	int error;
 
 	if (uap->tv != NULL) {
 		error = copyin(uap->tv, &tv32, sizeof(tv32));
 		if (error)
 			return (error);
 		CP(tv32, tv, tv_sec);
 		CP(tv32, tv, tv_usec);
 		tvp = &tv;
 	} else
 		tvp = NULL;
 	/*
 	 * XXX Do pointers need PTRIN()?
 	 */
 	return (kern_select(td, uap->nd, uap->in, uap->ou, uap->ex, tvp,
 	    sizeof(int32_t) * 8));
 }
 
 int
 freebsd32_pselect(struct thread *td, struct freebsd32_pselect_args *uap)
 {
 	struct timespec32 ts32;
 	struct timespec ts;
 	struct timeval tv, *tvp;
 	sigset_t set, *uset;
 	int error;
 
 	if (uap->ts != NULL) {
 		error = copyin(uap->ts, &ts32, sizeof(ts32));
 		if (error != 0)
 			return (error);
 		CP(ts32, ts, tv_sec);
 		CP(ts32, ts, tv_nsec);
 		TIMESPEC_TO_TIMEVAL(&tv, &ts);
 		tvp = &tv;
 	} else
 		tvp = NULL;
 	if (uap->sm != NULL) {
 		error = copyin(uap->sm, &set, sizeof(set));
 		if (error != 0)
 			return (error);
 		uset = &set;
 	} else
 		uset = NULL;
 	/*
 	 * XXX Do pointers need PTRIN()?
 	 */
 	error = kern_pselect(td, uap->nd, uap->in, uap->ou, uap->ex, tvp,
 	    uset, sizeof(int32_t) * 8);
 	return (error);
 }
 
 /*
  * Copy 'count' items into the destination list pointed to by uap->eventlist.
  */
 static int
 freebsd32_kevent_copyout(void *arg, struct kevent *kevp, int count)
 {
 	struct freebsd32_kevent_args *uap;
 	struct kevent32	ks32[KQ_NEVENTS];
 	int i, error = 0;
 
 	KASSERT(count <= KQ_NEVENTS, ("count (%d) > KQ_NEVENTS", count));
 	uap = (struct freebsd32_kevent_args *)arg;
 
 	for (i = 0; i < count; i++) {
 		CP(kevp[i], ks32[i], ident);
 		CP(kevp[i], ks32[i], filter);
 		CP(kevp[i], ks32[i], flags);
 		CP(kevp[i], ks32[i], fflags);
 		CP(kevp[i], ks32[i], data);
 		PTROUT_CP(kevp[i], ks32[i], udata);
 	}
 	error = copyout(ks32, uap->eventlist, count * sizeof *ks32);
 	if (error == 0)
 		uap->eventlist += count;
 	return (error);
 }
 
 /*
  * Copy 'count' items from the list pointed to by uap->changelist.
  */
 static int
 freebsd32_kevent_copyin(void *arg, struct kevent *kevp, int count)
 {
 	struct freebsd32_kevent_args *uap;
 	struct kevent32	ks32[KQ_NEVENTS];
 	int i, error = 0;
 
 	KASSERT(count <= KQ_NEVENTS, ("count (%d) > KQ_NEVENTS", count));
 	uap = (struct freebsd32_kevent_args *)arg;
 
 	error = copyin(uap->changelist, ks32, count * sizeof *ks32);
 	if (error)
 		goto done;
 	uap->changelist += count;
 
 	for (i = 0; i < count; i++) {
 		CP(ks32[i], kevp[i], ident);
 		CP(ks32[i], kevp[i], filter);
 		CP(ks32[i], kevp[i], flags);
 		CP(ks32[i], kevp[i], fflags);
 		CP(ks32[i], kevp[i], data);
 		PTRIN_CP(ks32[i], kevp[i], udata);
 	}
 done:
 	return (error);
 }
 
 int
 freebsd32_kevent(struct thread *td, struct freebsd32_kevent_args *uap)
 {
 	struct timespec32 ts32;
 	struct timespec ts, *tsp;
 	struct kevent_copyops k_ops = { uap,
 					freebsd32_kevent_copyout,
 					freebsd32_kevent_copyin};
 	int error;
 
 
 	if (uap->timeout) {
 		error = copyin(uap->timeout, &ts32, sizeof(ts32));
 		if (error)
 			return (error);
 		CP(ts32, ts, tv_sec);
 		CP(ts32, ts, tv_nsec);
 		tsp = &ts;
 	} else
 		tsp = NULL;
 	error = kern_kevent(td, uap->fd, uap->nchanges, uap->nevents,
 	    &k_ops, tsp);
 	return (error);
 }
 
 int
 freebsd32_gettimeofday(struct thread *td,
 		       struct freebsd32_gettimeofday_args *uap)
 {
 	struct timeval atv;
 	struct timeval32 atv32;
 	struct timezone rtz;
 	int error = 0;
 
 	if (uap->tp) {
 		microtime(&atv);
 		CP(atv, atv32, tv_sec);
 		CP(atv, atv32, tv_usec);
 		error = copyout(&atv32, uap->tp, sizeof (atv32));
 	}
 	if (error == 0 && uap->tzp != NULL) {
 		rtz.tz_minuteswest = tz_minuteswest;
 		rtz.tz_dsttime = tz_dsttime;
 		error = copyout(&rtz, uap->tzp, sizeof (rtz));
 	}
 	return (error);
 }
 
 int
 freebsd32_getrusage(struct thread *td, struct freebsd32_getrusage_args *uap)
 {
 	struct rusage32 s32;
 	struct rusage s;
 	int error;
 
 	error = kern_getrusage(td, uap->who, &s);
 	if (error)
 		return (error);
 	if (uap->rusage != NULL) {
 		freebsd32_rusage_out(&s, &s32);
 		error = copyout(&s32, uap->rusage, sizeof(s32));
 	}
 	return (error);
 }
 
 static int
 freebsd32_copyinuio(struct iovec32 *iovp, u_int iovcnt, struct uio **uiop)
 {
 	struct iovec32 iov32;
 	struct iovec *iov;
 	struct uio *uio;
 	u_int iovlen;
 	int error, i;
 
 	*uiop = NULL;
 	if (iovcnt > UIO_MAXIOV)
 		return (EINVAL);
 	iovlen = iovcnt * sizeof(struct iovec);
 	uio = malloc(iovlen + sizeof *uio, M_IOV, M_WAITOK);
 	iov = (struct iovec *)(uio + 1);
 	for (i = 0; i < iovcnt; i++) {
 		error = copyin(&iovp[i], &iov32, sizeof(struct iovec32));
 		if (error) {
 			free(uio, M_IOV);
 			return (error);
 		}
 		iov[i].iov_base = PTRIN(iov32.iov_base);
 		iov[i].iov_len = iov32.iov_len;
 	}
 	uio->uio_iov = iov;
 	uio->uio_iovcnt = iovcnt;
 	uio->uio_segflg = UIO_USERSPACE;
 	uio->uio_offset = -1;
 	uio->uio_resid = 0;
 	for (i = 0; i < iovcnt; i++) {
 		if (iov->iov_len > INT_MAX - uio->uio_resid) {
 			free(uio, M_IOV);
 			return (EINVAL);
 		}
 		uio->uio_resid += iov->iov_len;
 		iov++;
 	}
 	*uiop = uio;
 	return (0);
 }
 
 int
 freebsd32_readv(struct thread *td, struct freebsd32_readv_args *uap)
 {
 	struct uio *auio;
 	int error;
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_readv(td, uap->fd, auio);
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_writev(struct thread *td, struct freebsd32_writev_args *uap)
 {
 	struct uio *auio;
 	int error;
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_writev(td, uap->fd, auio);
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_preadv(struct thread *td, struct freebsd32_preadv_args *uap)
 {
 	struct uio *auio;
 	int error;
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_preadv(td, uap->fd, auio, PAIR32TO64(off_t,uap->offset));
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_pwritev(struct thread *td, struct freebsd32_pwritev_args *uap)
 {
 	struct uio *auio;
 	int error;
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_pwritev(td, uap->fd, auio, PAIR32TO64(off_t,uap->offset));
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_copyiniov(struct iovec32 *iovp32, u_int iovcnt, struct iovec **iovp,
     int error)
 {
 	struct iovec32 iov32;
 	struct iovec *iov;
 	u_int iovlen;
 	int i;
 
 	*iovp = NULL;
 	if (iovcnt > UIO_MAXIOV)
 		return (error);
 	iovlen = iovcnt * sizeof(struct iovec);
 	iov = malloc(iovlen, M_IOV, M_WAITOK);
 	for (i = 0; i < iovcnt; i++) {
 		error = copyin(&iovp32[i], &iov32, sizeof(struct iovec32));
 		if (error) {
 			free(iov, M_IOV);
 			return (error);
 		}
 		iov[i].iov_base = PTRIN(iov32.iov_base);
 		iov[i].iov_len = iov32.iov_len;
 	}
 	*iovp = iov;
 	return (0);
 }
 
 static int
 freebsd32_copyinmsghdr(struct msghdr32 *msg32, struct msghdr *msg)
 {
 	struct msghdr32 m32;
 	int error;
 
 	error = copyin(msg32, &m32, sizeof(m32));
 	if (error)
 		return (error);
 	msg->msg_name = PTRIN(m32.msg_name);
 	msg->msg_namelen = m32.msg_namelen;
 	msg->msg_iov = PTRIN(m32.msg_iov);
 	msg->msg_iovlen = m32.msg_iovlen;
 	msg->msg_control = PTRIN(m32.msg_control);
 	msg->msg_controllen = m32.msg_controllen;
 	msg->msg_flags = m32.msg_flags;
 	return (0);
 }
 
 static int
 freebsd32_copyoutmsghdr(struct msghdr *msg, struct msghdr32 *msg32)
 {
 	struct msghdr32 m32;
 	int error;
 
 	m32.msg_name = PTROUT(msg->msg_name);
 	m32.msg_namelen = msg->msg_namelen;
 	m32.msg_iov = PTROUT(msg->msg_iov);
 	m32.msg_iovlen = msg->msg_iovlen;
 	m32.msg_control = PTROUT(msg->msg_control);
 	m32.msg_controllen = msg->msg_controllen;
 	m32.msg_flags = msg->msg_flags;
 	error = copyout(&m32, msg32, sizeof(m32));
 	return (error);
 }
 
 #ifndef __mips__
 #define FREEBSD32_ALIGNBYTES	(sizeof(int) - 1)
 #else
 #define FREEBSD32_ALIGNBYTES	(sizeof(long) - 1)
 #endif
 #define FREEBSD32_ALIGN(p)	\
 	(((u_long)(p) + FREEBSD32_ALIGNBYTES) & ~FREEBSD32_ALIGNBYTES)
 #define	FREEBSD32_CMSG_SPACE(l)	\
 	(FREEBSD32_ALIGN(sizeof(struct cmsghdr)) + FREEBSD32_ALIGN(l))
 
 #define	FREEBSD32_CMSG_DATA(cmsg)	((unsigned char *)(cmsg) + \
 				 FREEBSD32_ALIGN(sizeof(struct cmsghdr)))
 static int
 freebsd32_copy_msg_out(struct msghdr *msg, struct mbuf *control)
 {
 	struct cmsghdr *cm;
 	void *data;
 	socklen_t clen, datalen;
 	int error;
 	caddr_t ctlbuf;
 	int len, maxlen, copylen;
 	struct mbuf *m;
 	error = 0;
 
 	len    = msg->msg_controllen;
 	maxlen = msg->msg_controllen;
 	msg->msg_controllen = 0;
 
 	m = control;
 	ctlbuf = msg->msg_control;
       
 	while (m && len > 0) {
 		cm = mtod(m, struct cmsghdr *);
 		clen = m->m_len;
 
 		while (cm != NULL) {
 
 			if (sizeof(struct cmsghdr) > clen ||
 			    cm->cmsg_len > clen) {
 				error = EINVAL;
 				break;
 			}	
 
 			data   = CMSG_DATA(cm);
 			datalen = (caddr_t)cm + cm->cmsg_len - (caddr_t)data;
 
 			/* Adjust message length */
 			cm->cmsg_len = FREEBSD32_ALIGN(sizeof(struct cmsghdr)) +
 			    datalen;
 
 
 			/* Copy cmsghdr */
 			copylen = sizeof(struct cmsghdr);
 			if (len < copylen) {
 				msg->msg_flags |= MSG_CTRUNC;
 				copylen = len;
 			}
 
 			error = copyout(cm,ctlbuf,copylen);
 			if (error)
 				goto exit;
 
 			ctlbuf += FREEBSD32_ALIGN(copylen);
 			len    -= FREEBSD32_ALIGN(copylen);
 
 			if (len <= 0)
 				break;
 
 			/* Copy data */
 			copylen = datalen;
 			if (len < copylen) {
 				msg->msg_flags |= MSG_CTRUNC;
 				copylen = len;
 			}
 
 			error = copyout(data,ctlbuf,copylen);
 			if (error)
 				goto exit;
 
 			ctlbuf += FREEBSD32_ALIGN(copylen);
 			len    -= FREEBSD32_ALIGN(copylen);
 
 			if (CMSG_SPACE(datalen) < clen) {
 				clen -= CMSG_SPACE(datalen);
 				cm = (struct cmsghdr *)
 					((caddr_t)cm + CMSG_SPACE(datalen));
 			} else {
 				clen = 0;
 				cm = NULL;
 			}
 		}	
 		m = m->m_next;
 	}
 
 	msg->msg_controllen = (len <= 0) ? maxlen :  ctlbuf - (caddr_t)msg->msg_control;
 	
 exit:
 	return (error);
 
 }
 
 int
 freebsd32_recvmsg(td, uap)
 	struct thread *td;
 	struct freebsd32_recvmsg_args /* {
 		int	s;
 		struct	msghdr32 *msg;
 		int	flags;
 	} */ *uap;
 {
 	struct msghdr msg;
 	struct msghdr32 m32;
 	struct iovec *uiov, *iov;
 	struct mbuf *control = NULL;
 	struct mbuf **controlp;
 
 	int error;
 	error = copyin(uap->msg, &m32, sizeof(m32));
 	if (error)
 		return (error);
 	error = freebsd32_copyinmsghdr(uap->msg, &msg);
 	if (error)
 		return (error);
 	error = freebsd32_copyiniov(PTRIN(m32.msg_iov), m32.msg_iovlen, &iov,
 	    EMSGSIZE);
 	if (error)
 		return (error);
 	msg.msg_flags = uap->flags;
 	uiov = msg.msg_iov;
 	msg.msg_iov = iov;
 
 	controlp = (msg.msg_control != NULL) ?  &control : NULL;
 	error = kern_recvit(td, uap->s, &msg, UIO_USERSPACE, controlp);
 	if (error == 0) {
 		msg.msg_iov = uiov;
 		
 		if (control != NULL)
 			error = freebsd32_copy_msg_out(&msg, control);
 		else
 			msg.msg_controllen = 0;
 		
 		if (error == 0)
 			error = freebsd32_copyoutmsghdr(&msg, uap->msg);
 	}
 	free(iov, M_IOV);
 
 	if (control != NULL)
 		m_freem(control);
 
 	return (error);
 }
 
 /*
  * Copy-in the array of control messages constructed using alignment
  * and padding suitable for a 32-bit environment and construct an
  * mbuf using alignment and padding suitable for a 64-bit kernel.
  * The alignment and padding are defined indirectly by CMSG_DATA(),
  * CMSG_SPACE() and CMSG_LEN().
  */
 static int
 freebsd32_copyin_control(struct mbuf **mp, caddr_t buf, u_int buflen)
 {
 	struct mbuf *m;
 	void *md;
 	u_int idx, len, msglen;
 	int error;
 
 	buflen = FREEBSD32_ALIGN(buflen);
 
 	if (buflen > MCLBYTES)
 		return (EINVAL);
 
 	/*
 	 * Iterate over the buffer and get the length of each message
 	 * in there. This has 32-bit alignment and padding. Use it to
 	 * determine the length of these messages when using 64-bit
 	 * alignment and padding.
 	 */
 	idx = 0;
 	len = 0;
 	while (idx < buflen) {
 		error = copyin(buf + idx, &msglen, sizeof(msglen));
 		if (error)
 			return (error);
 		if (msglen < sizeof(struct cmsghdr))
 			return (EINVAL);
 		msglen = FREEBSD32_ALIGN(msglen);
 		if (idx + msglen > buflen)
 			return (EINVAL);
 		idx += msglen;
 		msglen += CMSG_ALIGN(sizeof(struct cmsghdr)) -
 		    FREEBSD32_ALIGN(sizeof(struct cmsghdr));
 		len += CMSG_ALIGN(msglen);
 	}
 
 	if (len > MCLBYTES)
 		return (EINVAL);
 
 	m = m_get(M_WAITOK, MT_CONTROL);
 	if (len > MLEN)
 		MCLGET(m, M_WAITOK);
 	m->m_len = len;
 
 	md = mtod(m, void *);
 	while (buflen > 0) {
 		error = copyin(buf, md, sizeof(struct cmsghdr));
 		if (error)
 			break;
 		msglen = *(u_int *)md;
 		msglen = FREEBSD32_ALIGN(msglen);
 
 		/* Modify the message length to account for alignment. */
 		*(u_int *)md = msglen + CMSG_ALIGN(sizeof(struct cmsghdr)) -
 		    FREEBSD32_ALIGN(sizeof(struct cmsghdr));
 
 		md = (char *)md + CMSG_ALIGN(sizeof(struct cmsghdr));
 		buf += FREEBSD32_ALIGN(sizeof(struct cmsghdr));
 		buflen -= FREEBSD32_ALIGN(sizeof(struct cmsghdr));
 
 		msglen -= FREEBSD32_ALIGN(sizeof(struct cmsghdr));
 		if (msglen > 0) {
 			error = copyin(buf, md, msglen);
 			if (error)
 				break;
 			md = (char *)md + CMSG_ALIGN(msglen);
 			buf += msglen;
 			buflen -= msglen;
 		}
 	}
 
 	if (error)
 		m_free(m);
 	else
 		*mp = m;
 	return (error);
 }
 
 int
 freebsd32_sendmsg(struct thread *td,
 		  struct freebsd32_sendmsg_args *uap)
 {
 	struct msghdr msg;
 	struct msghdr32 m32;
 	struct iovec *iov;
 	struct mbuf *control = NULL;
 	struct sockaddr *to = NULL;
 	int error;
 
 	error = copyin(uap->msg, &m32, sizeof(m32));
 	if (error)
 		return (error);
 	error = freebsd32_copyinmsghdr(uap->msg, &msg);
 	if (error)
 		return (error);
 	error = freebsd32_copyiniov(PTRIN(m32.msg_iov), m32.msg_iovlen, &iov,
 	    EMSGSIZE);
 	if (error)
 		return (error);
 	msg.msg_iov = iov;
 	if (msg.msg_name != NULL) {
 		error = getsockaddr(&to, msg.msg_name, msg.msg_namelen);
 		if (error) {
 			to = NULL;
 			goto out;
 		}
 		msg.msg_name = to;
 	}
 
 	if (msg.msg_control) {
 		if (msg.msg_controllen < sizeof(struct cmsghdr)) {
 			error = EINVAL;
 			goto out;
 		}
 
 		error = freebsd32_copyin_control(&control, msg.msg_control,
 		    msg.msg_controllen);
 		if (error)
 			goto out;
 
 		msg.msg_control = NULL;
 		msg.msg_controllen = 0;
 	}
 
 	error = kern_sendit(td, uap->s, &msg, uap->flags, control,
 	    UIO_USERSPACE);
 
 out:
 	free(iov, M_IOV);
 	if (to)
 		free(to, M_SONAME);
 	return (error);
 }
 
 int
 freebsd32_recvfrom(struct thread *td,
 		   struct freebsd32_recvfrom_args *uap)
 {
 	struct msghdr msg;
 	struct iovec aiov;
 	int error;
 
 	if (uap->fromlenaddr) {
 		error = copyin(PTRIN(uap->fromlenaddr), &msg.msg_namelen,
 		    sizeof(msg.msg_namelen));
 		if (error)
 			return (error);
 	} else {
 		msg.msg_namelen = 0;
 	}
 
 	msg.msg_name = PTRIN(uap->from);
 	msg.msg_iov = &aiov;
 	msg.msg_iovlen = 1;
 	aiov.iov_base = PTRIN(uap->buf);
 	aiov.iov_len = uap->len;
 	msg.msg_control = NULL;
 	msg.msg_flags = uap->flags;
 	error = kern_recvit(td, uap->s, &msg, UIO_USERSPACE, NULL);
 	if (error == 0 && uap->fromlenaddr)
 		error = copyout(&msg.msg_namelen, PTRIN(uap->fromlenaddr),
 		    sizeof (msg.msg_namelen));
 	return (error);
 }
 
 int
 freebsd32_settimeofday(struct thread *td,
 		       struct freebsd32_settimeofday_args *uap)
 {
 	struct timeval32 tv32;
 	struct timeval tv, *tvp;
 	struct timezone tz, *tzp;
 	int error;
 
 	if (uap->tv) {
 		error = copyin(uap->tv, &tv32, sizeof(tv32));
 		if (error)
 			return (error);
 		CP(tv32, tv, tv_sec);
 		CP(tv32, tv, tv_usec);
 		tvp = &tv;
 	} else
 		tvp = NULL;
 	if (uap->tzp) {
 		error = copyin(uap->tzp, &tz, sizeof(tz));
 		if (error)
 			return (error);
 		tzp = &tz;
 	} else
 		tzp = NULL;
 	return (kern_settimeofday(td, tvp, tzp));
 }
 
 int
 freebsd32_utimes(struct thread *td, struct freebsd32_utimes_args *uap)
 {
 	struct timeval32 s32[2];
 	struct timeval s[2], *sp;
 	int error;
 
 	if (uap->tptr != NULL) {
 		error = copyin(uap->tptr, s32, sizeof(s32));
 		if (error)
 			return (error);
 		CP(s32[0], s[0], tv_sec);
 		CP(s32[0], s[0], tv_usec);
 		CP(s32[1], s[1], tv_sec);
 		CP(s32[1], s[1], tv_usec);
 		sp = s;
 	} else
 		sp = NULL;
 	return (kern_utimesat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    sp, UIO_SYSSPACE));
 }
 
 int
 freebsd32_lutimes(struct thread *td, struct freebsd32_lutimes_args *uap)
 {
 	struct timeval32 s32[2];
 	struct timeval s[2], *sp;
 	int error;
 
 	if (uap->tptr != NULL) {
 		error = copyin(uap->tptr, s32, sizeof(s32));
 		if (error)
 			return (error);
 		CP(s32[0], s[0], tv_sec);
 		CP(s32[0], s[0], tv_usec);
 		CP(s32[1], s[1], tv_sec);
 		CP(s32[1], s[1], tv_usec);
 		sp = s;
 	} else
 		sp = NULL;
 	return (kern_lutimes(td, uap->path, UIO_USERSPACE, sp, UIO_SYSSPACE));
 }
 
 int
 freebsd32_futimes(struct thread *td, struct freebsd32_futimes_args *uap)
 {
 	struct timeval32 s32[2];
 	struct timeval s[2], *sp;
 	int error;
 
 	if (uap->tptr != NULL) {
 		error = copyin(uap->tptr, s32, sizeof(s32));
 		if (error)
 			return (error);
 		CP(s32[0], s[0], tv_sec);
 		CP(s32[0], s[0], tv_usec);
 		CP(s32[1], s[1], tv_sec);
 		CP(s32[1], s[1], tv_usec);
 		sp = s;
 	} else
 		sp = NULL;
 	return (kern_futimes(td, uap->fd, sp, UIO_SYSSPACE));
 }
 
 int
 freebsd32_futimesat(struct thread *td, struct freebsd32_futimesat_args *uap)
 {
 	struct timeval32 s32[2];
 	struct timeval s[2], *sp;
 	int error;
 
 	if (uap->times != NULL) {
 		error = copyin(uap->times, s32, sizeof(s32));
 		if (error)
 			return (error);
 		CP(s32[0], s[0], tv_sec);
 		CP(s32[0], s[0], tv_usec);
 		CP(s32[1], s[1], tv_sec);
 		CP(s32[1], s[1], tv_usec);
 		sp = s;
 	} else
 		sp = NULL;
 	return (kern_utimesat(td, uap->fd, uap->path, UIO_USERSPACE,
 		sp, UIO_SYSSPACE));
 }
 
 int
 freebsd32_futimens(struct thread *td, struct freebsd32_futimens_args *uap)
 {
 	struct timespec32 ts32[2];
 	struct timespec ts[2], *tsp;
 	int error;
 
 	if (uap->times != NULL) {
 		error = copyin(uap->times, ts32, sizeof(ts32));
 		if (error)
 			return (error);
 		CP(ts32[0], ts[0], tv_sec);
 		CP(ts32[0], ts[0], tv_nsec);
 		CP(ts32[1], ts[1], tv_sec);
 		CP(ts32[1], ts[1], tv_nsec);
 		tsp = ts;
 	} else
 		tsp = NULL;
 	return (kern_futimens(td, uap->fd, tsp, UIO_SYSSPACE));
 }
 
 int
 freebsd32_utimensat(struct thread *td, struct freebsd32_utimensat_args *uap)
 {
 	struct timespec32 ts32[2];
 	struct timespec ts[2], *tsp;
 	int error;
 
 	if (uap->times != NULL) {
 		error = copyin(uap->times, ts32, sizeof(ts32));
 		if (error)
 			return (error);
 		CP(ts32[0], ts[0], tv_sec);
 		CP(ts32[0], ts[0], tv_nsec);
 		CP(ts32[1], ts[1], tv_sec);
 		CP(ts32[1], ts[1], tv_nsec);
 		tsp = ts;
 	} else
 		tsp = NULL;
 	return (kern_utimensat(td, uap->fd, uap->path, UIO_USERSPACE,
 	    tsp, UIO_SYSSPACE, uap->flag));
 }
 
 int
 freebsd32_adjtime(struct thread *td, struct freebsd32_adjtime_args *uap)
 {
 	struct timeval32 tv32;
 	struct timeval delta, olddelta, *deltap;
 	int error;
 
 	if (uap->delta) {
 		error = copyin(uap->delta, &tv32, sizeof(tv32));
 		if (error)
 			return (error);
 		CP(tv32, delta, tv_sec);
 		CP(tv32, delta, tv_usec);
 		deltap = &delta;
 	} else
 		deltap = NULL;
 	error = kern_adjtime(td, deltap, &olddelta);
 	if (uap->olddelta && error == 0) {
 		CP(olddelta, tv32, tv_sec);
 		CP(olddelta, tv32, tv_usec);
 		error = copyout(&tv32, uap->olddelta, sizeof(tv32));
 	}
 	return (error);
 }
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_statfs(struct thread *td, struct freebsd4_freebsd32_statfs_args *uap)
 {
 	struct statfs32 s32;
 	struct statfs s;
 	int error;
 
 	error = kern_statfs(td, uap->path, UIO_USERSPACE, &s);
 	if (error)
 		return (error);
 	copy_statfs(&s, &s32);
 	return (copyout(&s32, uap->buf, sizeof(s32)));
 }
 #endif
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_fstatfs(struct thread *td, struct freebsd4_freebsd32_fstatfs_args *uap)
 {
 	struct statfs32 s32;
 	struct statfs s;
 	int error;
 
 	error = kern_fstatfs(td, uap->fd, &s);
 	if (error)
 		return (error);
 	copy_statfs(&s, &s32);
 	return (copyout(&s32, uap->buf, sizeof(s32)));
 }
 #endif
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_fhstatfs(struct thread *td, struct freebsd4_freebsd32_fhstatfs_args *uap)
 {
 	struct statfs32 s32;
 	struct statfs s;
 	fhandle_t fh;
 	int error;
 
 	if ((error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t))) != 0)
 		return (error);
 	error = kern_fhstatfs(td, fh, &s);
 	if (error)
 		return (error);
 	copy_statfs(&s, &s32);
 	return (copyout(&s32, uap->buf, sizeof(s32)));
 }
 #endif
 
 int
 freebsd32_pread(struct thread *td, struct freebsd32_pread_args *uap)
 {
 	struct pread_args ap;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.nbyte = uap->nbyte;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	return (sys_pread(td, &ap));
 }
 
 int
 freebsd32_pwrite(struct thread *td, struct freebsd32_pwrite_args *uap)
 {
 	struct pwrite_args ap;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.nbyte = uap->nbyte;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	return (sys_pwrite(td, &ap));
 }
 
 #ifdef COMPAT_43
 int
 ofreebsd32_lseek(struct thread *td, struct ofreebsd32_lseek_args *uap)
 {
 	struct lseek_args nuap;
 
 	nuap.fd = uap->fd;
 	nuap.offset = uap->offset;
 	nuap.whence = uap->whence;
 	return (sys_lseek(td, &nuap));
 }
 #endif
 
 int
 freebsd32_lseek(struct thread *td, struct freebsd32_lseek_args *uap)
 {
 	int error;
 	struct lseek_args ap;
 	off_t pos;
 
 	ap.fd = uap->fd;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	ap.whence = uap->whence;
 	error = sys_lseek(td, &ap);
 	/* Expand the quad return into two parts for eax and edx */
 	pos = td->td_uretoff.tdu_off;
 	td->td_retval[RETVAL_LO] = pos & 0xffffffff;	/* %eax */
 	td->td_retval[RETVAL_HI] = pos >> 32;		/* %edx */
 	return error;
 }
 
 int
 freebsd32_truncate(struct thread *td, struct freebsd32_truncate_args *uap)
 {
 	struct truncate_args ap;
 
 	ap.path = uap->path;
 	ap.length = PAIR32TO64(off_t,uap->length);
 	return (sys_truncate(td, &ap));
 }
 
 int
 freebsd32_ftruncate(struct thread *td, struct freebsd32_ftruncate_args *uap)
 {
 	struct ftruncate_args ap;
 
 	ap.fd = uap->fd;
 	ap.length = PAIR32TO64(off_t,uap->length);
 	return (sys_ftruncate(td, &ap));
 }
 
 #ifdef COMPAT_43
 int
 ofreebsd32_getdirentries(struct thread *td,
     struct ofreebsd32_getdirentries_args *uap)
 {
 	struct ogetdirentries_args ap;
 	int error;
 	long loff;
 	int32_t loff_cut;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.count = uap->count;
 	ap.basep = NULL;
 	error = kern_ogetdirentries(td, &ap, &loff);
 	if (error == 0) {
 		loff_cut = loff;
 		error = copyout(&loff_cut, uap->basep, sizeof(int32_t));
 	}
 	return (error);
 }
 #endif
 
 int
 freebsd32_getdirentries(struct thread *td,
     struct freebsd32_getdirentries_args *uap)
 {
 	long base;
 	int32_t base32;
 	int error;
 
 	error = kern_getdirentries(td, uap->fd, uap->buf, uap->count, &base,
 	    NULL, UIO_USERSPACE);
 	if (error)
 		return (error);
 	if (uap->basep != NULL) {
 		base32 = base;
 		error = copyout(&base32, uap->basep, sizeof(int32_t));
 	}
 	return (error);
 }
 
 #ifdef COMPAT_FREEBSD6
 /* versions with the 'int pad' argument */
 int
 freebsd6_freebsd32_pread(struct thread *td, struct freebsd6_freebsd32_pread_args *uap)
 {
 	struct pread_args ap;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.nbyte = uap->nbyte;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	return (sys_pread(td, &ap));
 }
 
 int
 freebsd6_freebsd32_pwrite(struct thread *td, struct freebsd6_freebsd32_pwrite_args *uap)
 {
 	struct pwrite_args ap;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.nbyte = uap->nbyte;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	return (sys_pwrite(td, &ap));
 }
 
 int
 freebsd6_freebsd32_lseek(struct thread *td, struct freebsd6_freebsd32_lseek_args *uap)
 {
 	int error;
 	struct lseek_args ap;
 	off_t pos;
 
 	ap.fd = uap->fd;
 	ap.offset = PAIR32TO64(off_t,uap->offset);
 	ap.whence = uap->whence;
 	error = sys_lseek(td, &ap);
 	/* Expand the quad return into two parts for eax and edx */
 	pos = *(off_t *)(td->td_retval);
 	td->td_retval[RETVAL_LO] = pos & 0xffffffff;	/* %eax */
 	td->td_retval[RETVAL_HI] = pos >> 32;		/* %edx */
 	return error;
 }
 
 int
 freebsd6_freebsd32_truncate(struct thread *td, struct freebsd6_freebsd32_truncate_args *uap)
 {
 	struct truncate_args ap;
 
 	ap.path = uap->path;
 	ap.length = PAIR32TO64(off_t,uap->length);
 	return (sys_truncate(td, &ap));
 }
 
 int
 freebsd6_freebsd32_ftruncate(struct thread *td, struct freebsd6_freebsd32_ftruncate_args *uap)
 {
 	struct ftruncate_args ap;
 
 	ap.fd = uap->fd;
 	ap.length = PAIR32TO64(off_t,uap->length);
 	return (sys_ftruncate(td, &ap));
 }
 #endif /* COMPAT_FREEBSD6 */
 
 struct sf_hdtr32 {
 	uint32_t headers;
 	int hdr_cnt;
 	uint32_t trailers;
 	int trl_cnt;
 };
 
 static int
 freebsd32_do_sendfile(struct thread *td,
     struct freebsd32_sendfile_args *uap, int compat)
 {
 	struct sf_hdtr32 hdtr32;
 	struct sf_hdtr hdtr;
 	struct uio *hdr_uio, *trl_uio;
 	struct file *fp;
 	cap_rights_t rights;
 	struct iovec32 *iov32;
 	off_t offset, sbytes;
 	int error;
 
 	offset = PAIR32TO64(off_t, uap->offset);
 	if (offset < 0)
 		return (EINVAL);
 
 	hdr_uio = trl_uio = NULL;
 
 	if (uap->hdtr != NULL) {
 		error = copyin(uap->hdtr, &hdtr32, sizeof(hdtr32));
 		if (error)
 			goto out;
 		PTRIN_CP(hdtr32, hdtr, headers);
 		CP(hdtr32, hdtr, hdr_cnt);
 		PTRIN_CP(hdtr32, hdtr, trailers);
 		CP(hdtr32, hdtr, trl_cnt);
 
 		if (hdtr.headers != NULL) {
 			iov32 = PTRIN(hdtr32.headers);
 			error = freebsd32_copyinuio(iov32,
 			    hdtr32.hdr_cnt, &hdr_uio);
 			if (error)
 				goto out;
 		}
 		if (hdtr.trailers != NULL) {
 			iov32 = PTRIN(hdtr32.trailers);
 			error = freebsd32_copyinuio(iov32,
 			    hdtr32.trl_cnt, &trl_uio);
 			if (error)
 				goto out;
 		}
 	}
 
 	AUDIT_ARG_FD(uap->fd);
 
 	if ((error = fget_read(td, uap->fd,
 	    cap_rights_init(&rights, CAP_PREAD), &fp)) != 0)
 		goto out;
 
 	error = fo_sendfile(fp, uap->s, hdr_uio, trl_uio, offset,
 	    uap->nbytes, &sbytes, uap->flags, compat ? SFK_COMPAT : 0, td);
 	fdrop(fp, td);
 
 	if (uap->sbytes != NULL)
 		copyout(&sbytes, uap->sbytes, sizeof(off_t));
 
 out:
 	if (hdr_uio)
 		free(hdr_uio, M_IOV);
 	if (trl_uio)
 		free(trl_uio, M_IOV);
 	return (error);
 }
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_sendfile(struct thread *td,
     struct freebsd4_freebsd32_sendfile_args *uap)
 {
 	return (freebsd32_do_sendfile(td,
 	    (struct freebsd32_sendfile_args *)uap, 1));
 }
 #endif
 
 int
 freebsd32_sendfile(struct thread *td, struct freebsd32_sendfile_args *uap)
 {
 
 	return (freebsd32_do_sendfile(td, uap, 0));
 }
 
 static void
 copy_stat(struct stat *in, struct stat32 *out)
 {
 
 	CP(*in, *out, st_dev);
 	CP(*in, *out, st_ino);
 	CP(*in, *out, st_mode);
 	CP(*in, *out, st_nlink);
 	CP(*in, *out, st_uid);
 	CP(*in, *out, st_gid);
 	CP(*in, *out, st_rdev);
 	TS_CP(*in, *out, st_atim);
 	TS_CP(*in, *out, st_mtim);
 	TS_CP(*in, *out, st_ctim);
 	CP(*in, *out, st_size);
 	CP(*in, *out, st_blocks);
 	CP(*in, *out, st_blksize);
 	CP(*in, *out, st_flags);
 	CP(*in, *out, st_gen);
 	TS_CP(*in, *out, st_birthtim);
 }
 
 #ifdef COMPAT_43
 static void
 copy_ostat(struct stat *in, struct ostat32 *out)
 {
 
 	CP(*in, *out, st_dev);
 	CP(*in, *out, st_ino);
 	CP(*in, *out, st_mode);
 	CP(*in, *out, st_nlink);
 	CP(*in, *out, st_uid);
 	CP(*in, *out, st_gid);
 	CP(*in, *out, st_rdev);
 	CP(*in, *out, st_size);
 	TS_CP(*in, *out, st_atim);
 	TS_CP(*in, *out, st_mtim);
 	TS_CP(*in, *out, st_ctim);
 	CP(*in, *out, st_blksize);
 	CP(*in, *out, st_blocks);
 	CP(*in, *out, st_flags);
 	CP(*in, *out, st_gen);
 }
 #endif
 
 int
 freebsd32_stat(struct thread *td, struct freebsd32_stat_args *uap)
 {
 	struct stat sb;
 	struct stat32 sb32;
 	int error;
 
 	error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    &sb, NULL);
 	if (error)
 		return (error);
 	copy_stat(&sb, &sb32);
 	error = copyout(&sb32, uap->ub, sizeof (sb32));
 	return (error);
 }
 
 #ifdef COMPAT_43
 int
 ofreebsd32_stat(struct thread *td, struct ofreebsd32_stat_args *uap)
 {
 	struct stat sb;
 	struct ostat32 sb32;
 	int error;
 
 	error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    &sb, NULL);
 	if (error)
 		return (error);
 	copy_ostat(&sb, &sb32);
 	error = copyout(&sb32, uap->ub, sizeof (sb32));
 	return (error);
 }
 #endif
 
 int
 freebsd32_fstat(struct thread *td, struct freebsd32_fstat_args *uap)
 {
 	struct stat ub;
 	struct stat32 ub32;
 	int error;
 
 	error = kern_fstat(td, uap->fd, &ub);
 	if (error)
 		return (error);
 	copy_stat(&ub, &ub32);
 	error = copyout(&ub32, uap->ub, sizeof(ub32));
 	return (error);
 }
 
 #ifdef COMPAT_43
 int
 ofreebsd32_fstat(struct thread *td, struct ofreebsd32_fstat_args *uap)
 {
 	struct stat ub;
 	struct ostat32 ub32;
 	int error;
 
 	error = kern_fstat(td, uap->fd, &ub);
 	if (error)
 		return (error);
 	copy_ostat(&ub, &ub32);
 	error = copyout(&ub32, uap->ub, sizeof(ub32));
 	return (error);
 }
 #endif
 
 int
 freebsd32_fstatat(struct thread *td, struct freebsd32_fstatat_args *uap)
 {
 	struct stat ub;
 	struct stat32 ub32;
 	int error;
 
 	error = kern_statat(td, uap->flag, uap->fd, uap->path, UIO_USERSPACE,
 	    &ub, NULL);
 	if (error)
 		return (error);
 	copy_stat(&ub, &ub32);
 	error = copyout(&ub32, uap->buf, sizeof(ub32));
 	return (error);
 }
 
 int
 freebsd32_lstat(struct thread *td, struct freebsd32_lstat_args *uap)
 {
 	struct stat sb;
 	struct stat32 sb32;
 	int error;
 
 	error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error)
 		return (error);
 	copy_stat(&sb, &sb32);
 	error = copyout(&sb32, uap->ub, sizeof (sb32));
 	return (error);
 }
 
 #ifdef COMPAT_43
 int
 ofreebsd32_lstat(struct thread *td, struct ofreebsd32_lstat_args *uap)
 {
 	struct stat sb;
 	struct ostat32 sb32;
 	int error;
 
 	error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error)
 		return (error);
 	copy_ostat(&sb, &sb32);
 	error = copyout(&sb32, uap->ub, sizeof (sb32));
 	return (error);
 }
 #endif
 
 int
 freebsd32_sysctl(struct thread *td, struct freebsd32_sysctl_args *uap)
 {
 	int error, name[CTL_MAXNAME];
 	size_t j, oldlen;
 	uint32_t tmp;
 
 	if (uap->namelen > CTL_MAXNAME || uap->namelen < 2)
 		return (EINVAL);
  	error = copyin(uap->name, name, uap->namelen * sizeof(int));
  	if (error)
 		return (error);
 	if (uap->oldlenp) {
 		error = fueword32(uap->oldlenp, &tmp);
 		oldlen = tmp;
 	} else {
 		oldlen = 0;
 	}
 	if (error != 0)
 		return (EFAULT);
 	error = userland_sysctl(td, name, uap->namelen,
 		uap->old, &oldlen, 1,
 		uap->new, uap->newlen, &j, SCTL_MASK32);
 	if (error && error != ENOMEM)
 		return (error);
 	if (uap->oldlenp)
 		suword32(uap->oldlenp, j);
 	return (0);
 }
 
 int
 freebsd32_jail(struct thread *td, struct freebsd32_jail_args *uap)
 {
 	uint32_t version;
 	int error;
 	struct jail j;
 
 	error = copyin(uap->jail, &version, sizeof(uint32_t));
 	if (error)
 		return (error);
 
 	switch (version) {
 	case 0:
 	{
 		/* FreeBSD single IPv4 jails. */
 		struct jail32_v0 j32_v0;
 
 		bzero(&j, sizeof(struct jail));
 		error = copyin(uap->jail, &j32_v0, sizeof(struct jail32_v0));
 		if (error)
 			return (error);
 		CP(j32_v0, j, version);
 		PTRIN_CP(j32_v0, j, path);
 		PTRIN_CP(j32_v0, j, hostname);
 		j.ip4s = htonl(j32_v0.ip_number);	/* jail_v0 is host order */
 		break;
 	}
 
 	case 1:
 		/*
 		 * Version 1 was used by multi-IPv4 jail implementations
 		 * that never made it into the official kernel.
 		 */
 		return (EINVAL);
 
 	case 2:	/* JAIL_API_VERSION */
 	{
 		/* FreeBSD multi-IPv4/IPv6,noIP jails. */
 		struct jail32 j32;
 
 		error = copyin(uap->jail, &j32, sizeof(struct jail32));
 		if (error)
 			return (error);
 		CP(j32, j, version);
 		PTRIN_CP(j32, j, path);
 		PTRIN_CP(j32, j, hostname);
 		PTRIN_CP(j32, j, jailname);
 		CP(j32, j, ip4s);
 		CP(j32, j, ip6s);
 		PTRIN_CP(j32, j, ip4);
 		PTRIN_CP(j32, j, ip6);
 		break;
 	}
 
 	default:
 		/* Sci-Fi jails are not supported, sorry. */
 		return (EINVAL);
 	}
 	return (kern_jail(td, &j));
 }
 
 int
 freebsd32_jail_set(struct thread *td, struct freebsd32_jail_set_args *uap)
 {
 	struct uio *auio;
 	int error;
 
 	/* Check that we have an even number of iovecs. */
 	if (uap->iovcnt & 1)
 		return (EINVAL);
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_jail_set(td, auio, uap->flags);
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_jail_get(struct thread *td, struct freebsd32_jail_get_args *uap)
 {
 	struct iovec32 iov32;
 	struct uio *auio;
 	int error, i;
 
 	/* Check that we have an even number of iovecs. */
 	if (uap->iovcnt & 1)
 		return (EINVAL);
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = kern_jail_get(td, auio, uap->flags);
 	if (error == 0)
 		for (i = 0; i < uap->iovcnt; i++) {
 			PTROUT_CP(auio->uio_iov[i], iov32, iov_base);
 			CP(auio->uio_iov[i], iov32, iov_len);
 			error = copyout(&iov32, uap->iovp + i, sizeof(iov32));
 			if (error != 0)
 				break;
 		}
 	free(auio, M_IOV);
 	return (error);
 }
 
 int
 freebsd32_sigaction(struct thread *td, struct freebsd32_sigaction_args *uap)
 {
 	struct sigaction32 s32;
 	struct sigaction sa, osa, *sap;
 	int error;
 
 	if (uap->act) {
 		error = copyin(uap->act, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		sa.sa_handler = PTRIN(s32.sa_u);
 		CP(s32, sa, sa_flags);
 		CP(s32, sa, sa_mask);
 		sap = &sa;
 	} else
 		sap = NULL;
 	error = kern_sigaction(td, uap->sig, sap, &osa, 0);
 	if (error == 0 && uap->oact != NULL) {
 		s32.sa_u = PTROUT(osa.sa_handler);
 		CP(osa, s32, sa_flags);
 		CP(osa, s32, sa_mask);
 		error = copyout(&s32, uap->oact, sizeof(s32));
 	}
 	return (error);
 }
 
 #ifdef COMPAT_FREEBSD4
 int
 freebsd4_freebsd32_sigaction(struct thread *td,
 			     struct freebsd4_freebsd32_sigaction_args *uap)
 {
 	struct sigaction32 s32;
 	struct sigaction sa, osa, *sap;
 	int error;
 
 	if (uap->act) {
 		error = copyin(uap->act, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		sa.sa_handler = PTRIN(s32.sa_u);
 		CP(s32, sa, sa_flags);
 		CP(s32, sa, sa_mask);
 		sap = &sa;
 	} else
 		sap = NULL;
 	error = kern_sigaction(td, uap->sig, sap, &osa, KSA_FREEBSD4);
 	if (error == 0 && uap->oact != NULL) {
 		s32.sa_u = PTROUT(osa.sa_handler);
 		CP(osa, s32, sa_flags);
 		CP(osa, s32, sa_mask);
 		error = copyout(&s32, uap->oact, sizeof(s32));
 	}
 	return (error);
 }
 #endif
 
 #ifdef COMPAT_43
 struct osigaction32 {
 	u_int32_t	sa_u;
 	osigset_t	sa_mask;
 	int		sa_flags;
 };
 
 #define	ONSIG	32
 
 int
 ofreebsd32_sigaction(struct thread *td,
 			     struct ofreebsd32_sigaction_args *uap)
 {
 	struct osigaction32 s32;
 	struct sigaction sa, osa, *sap;
 	int error;
 
 	if (uap->signum <= 0 || uap->signum >= ONSIG)
 		return (EINVAL);
 
 	if (uap->nsa) {
 		error = copyin(uap->nsa, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		sa.sa_handler = PTRIN(s32.sa_u);
 		CP(s32, sa, sa_flags);
 		OSIG2SIG(s32.sa_mask, sa.sa_mask);
 		sap = &sa;
 	} else
 		sap = NULL;
 	error = kern_sigaction(td, uap->signum, sap, &osa, KSA_OSIGSET);
 	if (error == 0 && uap->osa != NULL) {
 		s32.sa_u = PTROUT(osa.sa_handler);
 		CP(osa, s32, sa_flags);
 		SIG2OSIG(osa.sa_mask, s32.sa_mask);
 		error = copyout(&s32, uap->osa, sizeof(s32));
 	}
 	return (error);
 }
 
 int
 ofreebsd32_sigprocmask(struct thread *td,
 			       struct ofreebsd32_sigprocmask_args *uap)
 {
 	sigset_t set, oset;
 	int error;
 
 	OSIG2SIG(uap->mask, set);
 	error = kern_sigprocmask(td, uap->how, &set, &oset, SIGPROCMASK_OLD);
 	SIG2OSIG(oset, td->td_retval[0]);
 	return (error);
 }
 
 int
 ofreebsd32_sigpending(struct thread *td,
 			      struct ofreebsd32_sigpending_args *uap)
 {
 	struct proc *p = td->td_proc;
 	sigset_t siglist;
 
 	PROC_LOCK(p);
 	siglist = p->p_siglist;
 	SIGSETOR(siglist, td->td_siglist);
 	PROC_UNLOCK(p);
 	SIG2OSIG(siglist, td->td_retval[0]);
 	return (0);
 }
 
 struct sigvec32 {
 	u_int32_t	sv_handler;
 	int		sv_mask;
 	int		sv_flags;
 };
 
 int
 ofreebsd32_sigvec(struct thread *td,
 			  struct ofreebsd32_sigvec_args *uap)
 {
 	struct sigvec32 vec;
 	struct sigaction sa, osa, *sap;
 	int error;
 
 	if (uap->signum <= 0 || uap->signum >= ONSIG)
 		return (EINVAL);
 
 	if (uap->nsv) {
 		error = copyin(uap->nsv, &vec, sizeof(vec));
 		if (error)
 			return (error);
 		sa.sa_handler = PTRIN(vec.sv_handler);
 		OSIG2SIG(vec.sv_mask, sa.sa_mask);
 		sa.sa_flags = vec.sv_flags;
 		sa.sa_flags ^= SA_RESTART;
 		sap = &sa;
 	} else
 		sap = NULL;
 	error = kern_sigaction(td, uap->signum, sap, &osa, KSA_OSIGSET);
 	if (error == 0 && uap->osv != NULL) {
 		vec.sv_handler = PTROUT(osa.sa_handler);
 		SIG2OSIG(osa.sa_mask, vec.sv_mask);
 		vec.sv_flags = osa.sa_flags;
 		vec.sv_flags &= ~SA_NOCLDWAIT;
 		vec.sv_flags ^= SA_RESTART;
 		error = copyout(&vec, uap->osv, sizeof(vec));
 	}
 	return (error);
 }
 
 int
 ofreebsd32_sigblock(struct thread *td,
 			    struct ofreebsd32_sigblock_args *uap)
 {
 	sigset_t set, oset;
 
 	OSIG2SIG(uap->mask, set);
 	kern_sigprocmask(td, SIG_BLOCK, &set, &oset, 0);
 	SIG2OSIG(oset, td->td_retval[0]);
 	return (0);
 }
 
 int
 ofreebsd32_sigsetmask(struct thread *td,
 			      struct ofreebsd32_sigsetmask_args *uap)
 {
 	sigset_t set, oset;
 
 	OSIG2SIG(uap->mask, set);
 	kern_sigprocmask(td, SIG_SETMASK, &set, &oset, 0);
 	SIG2OSIG(oset, td->td_retval[0]);
 	return (0);
 }
 
 int
 ofreebsd32_sigsuspend(struct thread *td,
 			      struct ofreebsd32_sigsuspend_args *uap)
 {
 	sigset_t mask;
 
 	OSIG2SIG(uap->mask, mask);
 	return (kern_sigsuspend(td, mask));
 }
 
 struct sigstack32 {
 	u_int32_t	ss_sp;
 	int		ss_onstack;
 };
 
 int
 ofreebsd32_sigstack(struct thread *td,
 			    struct ofreebsd32_sigstack_args *uap)
 {
 	struct sigstack32 s32;
 	struct sigstack nss, oss;
 	int error = 0, unss;
 
 	if (uap->nss != NULL) {
 		error = copyin(uap->nss, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		nss.ss_sp = PTRIN(s32.ss_sp);
 		CP(s32, nss, ss_onstack);
 		unss = 1;
 	} else {
 		unss = 0;
 	}
 	oss.ss_sp = td->td_sigstk.ss_sp;
 	oss.ss_onstack = sigonstack(cpu_getstack(td));
 	if (unss) {
 		td->td_sigstk.ss_sp = nss.ss_sp;
 		td->td_sigstk.ss_size = 0;
 		td->td_sigstk.ss_flags |= (nss.ss_onstack & SS_ONSTACK);
 		td->td_pflags |= TDP_ALTSTACK;
 	}
 	if (uap->oss != NULL) {
 		s32.ss_sp = PTROUT(oss.ss_sp);
 		CP(oss, s32, ss_onstack);
 		error = copyout(&s32, uap->oss, sizeof(s32));
 	}
 	return (error);
 }
 #endif
 
 int
 freebsd32_nanosleep(struct thread *td, struct freebsd32_nanosleep_args *uap)
 {
 	struct timespec32 rmt32, rqt32;
 	struct timespec rmt, rqt;
 	int error;
 
 	error = copyin(uap->rqtp, &rqt32, sizeof(rqt32));
 	if (error)
 		return (error);
 
 	CP(rqt32, rqt, tv_sec);
 	CP(rqt32, rqt, tv_nsec);
 
 	if (uap->rmtp &&
 	    !useracc((caddr_t)uap->rmtp, sizeof(rmt), VM_PROT_WRITE))
 		return (EFAULT);
 	error = kern_nanosleep(td, &rqt, &rmt);
 	if (error && uap->rmtp) {
 		int error2;
 
 		CP(rmt, rmt32, tv_sec);
 		CP(rmt, rmt32, tv_nsec);
 
 		error2 = copyout(&rmt32, uap->rmtp, sizeof(rmt32));
 		if (error2)
 			error = error2;
 	}
 	return (error);
 }
 
 int
 freebsd32_clock_gettime(struct thread *td,
 			struct freebsd32_clock_gettime_args *uap)
 {
 	struct timespec	ats;
 	struct timespec32 ats32;
 	int error;
 
 	error = kern_clock_gettime(td, uap->clock_id, &ats);
 	if (error == 0) {
 		CP(ats, ats32, tv_sec);
 		CP(ats, ats32, tv_nsec);
 		error = copyout(&ats32, uap->tp, sizeof(ats32));
 	}
 	return (error);
 }
 
 int
 freebsd32_clock_settime(struct thread *td,
 			struct freebsd32_clock_settime_args *uap)
 {
 	struct timespec	ats;
 	struct timespec32 ats32;
 	int error;
 
 	error = copyin(uap->tp, &ats32, sizeof(ats32));
 	if (error)
 		return (error);
 	CP(ats32, ats, tv_sec);
 	CP(ats32, ats, tv_nsec);
 
 	return (kern_clock_settime(td, uap->clock_id, &ats));
 }
 
 int
 freebsd32_clock_getres(struct thread *td,
 		       struct freebsd32_clock_getres_args *uap)
 {
 	struct timespec	ts;
 	struct timespec32 ts32;
 	int error;
 
 	if (uap->tp == NULL)
 		return (0);
 	error = kern_clock_getres(td, uap->clock_id, &ts);
 	if (error == 0) {
 		CP(ts, ts32, tv_sec);
 		CP(ts, ts32, tv_nsec);
 		error = copyout(&ts32, uap->tp, sizeof(ts32));
 	}
 	return (error);
 }
 
 int freebsd32_ktimer_create(struct thread *td,
     struct freebsd32_ktimer_create_args *uap)
 {
 	struct sigevent32 ev32;
 	struct sigevent ev, *evp;
 	int error, id;
 
 	if (uap->evp == NULL) {
 		evp = NULL;
 	} else {
 		evp = &ev;
 		error = copyin(uap->evp, &ev32, sizeof(ev32));
 		if (error != 0)
 			return (error);
 		error = convert_sigevent32(&ev32, &ev);
 		if (error != 0)
 			return (error);
 	}
 	error = kern_ktimer_create(td, uap->clock_id, evp, &id, -1);
 	if (error == 0) {
 		error = copyout(&id, uap->timerid, sizeof(int));
 		if (error != 0)
 			kern_ktimer_delete(td, id);
 	}
 	return (error);
 }
 
 int
 freebsd32_ktimer_settime(struct thread *td,
     struct freebsd32_ktimer_settime_args *uap)
 {
 	struct itimerspec32 val32, oval32;
 	struct itimerspec val, oval, *ovalp;
 	int error;
 
 	error = copyin(uap->value, &val32, sizeof(val32));
 	if (error != 0)
 		return (error);
 	ITS_CP(val32, val);
 	ovalp = uap->ovalue != NULL ? &oval : NULL;
 	error = kern_ktimer_settime(td, uap->timerid, uap->flags, &val, ovalp);
 	if (error == 0 && uap->ovalue != NULL) {
 		ITS_CP(oval, oval32);
 		error = copyout(&oval32, uap->ovalue, sizeof(oval32));
 	}
 	return (error);
 }
 
 int
 freebsd32_ktimer_gettime(struct thread *td,
     struct freebsd32_ktimer_gettime_args *uap)
 {
 	struct itimerspec32 val32;
 	struct itimerspec val;
 	int error;
 
 	error = kern_ktimer_gettime(td, uap->timerid, &val);
 	if (error == 0) {
 		ITS_CP(val, val32);
 		error = copyout(&val32, uap->value, sizeof(val32));
 	}
 	return (error);
 }
 
 int
 freebsd32_clock_getcpuclockid2(struct thread *td,
     struct freebsd32_clock_getcpuclockid2_args *uap)
 {
 	clockid_t clk_id;
 	int error;
 
 	error = kern_clock_getcpuclockid2(td, PAIR32TO64(id_t, uap->id),
 	    uap->which, &clk_id);
 	if (error == 0)
 		error = copyout(&clk_id, uap->clock_id, sizeof(clockid_t));
 	return (error);
 }
 
 int
 freebsd32_thr_new(struct thread *td,
 		  struct freebsd32_thr_new_args *uap)
 {
 	struct thr_param32 param32;
 	struct thr_param param;
 	int error;
 
 	if (uap->param_size < 0 ||
 	    uap->param_size > sizeof(struct thr_param32))
 		return (EINVAL);
 	bzero(&param, sizeof(struct thr_param));
 	bzero(&param32, sizeof(struct thr_param32));
 	error = copyin(uap->param, &param32, uap->param_size);
 	if (error != 0)
 		return (error);
 	param.start_func = PTRIN(param32.start_func);
 	param.arg = PTRIN(param32.arg);
 	param.stack_base = PTRIN(param32.stack_base);
 	param.stack_size = param32.stack_size;
 	param.tls_base = PTRIN(param32.tls_base);
 	param.tls_size = param32.tls_size;
 	param.child_tid = PTRIN(param32.child_tid);
 	param.parent_tid = PTRIN(param32.parent_tid);
 	param.flags = param32.flags;
 	param.rtp = PTRIN(param32.rtp);
 	param.spare[0] = PTRIN(param32.spare[0]);
 	param.spare[1] = PTRIN(param32.spare[1]);
 	param.spare[2] = PTRIN(param32.spare[2]);
 
 	return (kern_thr_new(td, &param));
 }
 
 int
 freebsd32_thr_suspend(struct thread *td, struct freebsd32_thr_suspend_args *uap)
 {
 	struct timespec32 ts32;
 	struct timespec ts, *tsp;
 	int error;
 
 	error = 0;
 	tsp = NULL;
 	if (uap->timeout != NULL) {
 		error = copyin((const void *)uap->timeout, (void *)&ts32,
 		    sizeof(struct timespec32));
 		if (error != 0)
 			return (error);
 		ts.tv_sec = ts32.tv_sec;
 		ts.tv_nsec = ts32.tv_nsec;
 		tsp = &ts;
 	}
 	return (kern_thr_suspend(td, tsp));
 }
 
 void
 siginfo_to_siginfo32(const siginfo_t *src, struct siginfo32 *dst)
 {
 	bzero(dst, sizeof(*dst));
 	dst->si_signo = src->si_signo;
 	dst->si_errno = src->si_errno;
 	dst->si_code = src->si_code;
 	dst->si_pid = src->si_pid;
 	dst->si_uid = src->si_uid;
 	dst->si_status = src->si_status;
 	dst->si_addr = (uintptr_t)src->si_addr;
 	dst->si_value.sival_int = src->si_value.sival_int;
 	dst->si_timerid = src->si_timerid;
 	dst->si_overrun = src->si_overrun;
 }
 
 int
 freebsd32_sigtimedwait(struct thread *td, struct freebsd32_sigtimedwait_args *uap)
 {
 	struct timespec32 ts32;
 	struct timespec ts;
 	struct timespec *timeout;
 	sigset_t set;
 	ksiginfo_t ksi;
 	struct siginfo32 si32;
 	int error;
 
 	if (uap->timeout) {
 		error = copyin(uap->timeout, &ts32, sizeof(ts32));
 		if (error)
 			return (error);
 		ts.tv_sec = ts32.tv_sec;
 		ts.tv_nsec = ts32.tv_nsec;
 		timeout = &ts;
 	} else
 		timeout = NULL;
 
 	error = copyin(uap->set, &set, sizeof(set));
 	if (error)
 		return (error);
 
 	error = kern_sigtimedwait(td, set, &ksi, timeout);
 	if (error)
 		return (error);
 
 	if (uap->info) {
 		siginfo_to_siginfo32(&ksi.ksi_info, &si32);
 		error = copyout(&si32, uap->info, sizeof(struct siginfo32));
 	}
 
 	if (error == 0)
 		td->td_retval[0] = ksi.ksi_signo;
 	return (error);
 }
 
 /*
  * MPSAFE
  */
 int
 freebsd32_sigwaitinfo(struct thread *td, struct freebsd32_sigwaitinfo_args *uap)
 {
 	ksiginfo_t ksi;
 	struct siginfo32 si32;
 	sigset_t set;
 	int error;
 
 	error = copyin(uap->set, &set, sizeof(set));
 	if (error)
 		return (error);
 
 	error = kern_sigtimedwait(td, set, &ksi, NULL);
 	if (error)
 		return (error);
 
 	if (uap->info) {
 		siginfo_to_siginfo32(&ksi.ksi_info, &si32);
 		error = copyout(&si32, uap->info, sizeof(struct siginfo32));
 	}
 	if (error == 0)
 		td->td_retval[0] = ksi.ksi_signo;
 	return (error);
 }
 
 int
 freebsd32_cpuset_setid(struct thread *td,
     struct freebsd32_cpuset_setid_args *uap)
 {
 	struct cpuset_setid_args ap;
 
 	ap.which = uap->which;
 	ap.id = PAIR32TO64(id_t,uap->id);
 	ap.setid = uap->setid;
 
 	return (sys_cpuset_setid(td, &ap));
 }
 
 int
 freebsd32_cpuset_getid(struct thread *td,
     struct freebsd32_cpuset_getid_args *uap)
 {
 	struct cpuset_getid_args ap;
 
 	ap.level = uap->level;
 	ap.which = uap->which;
 	ap.id = PAIR32TO64(id_t,uap->id);
 	ap.setid = uap->setid;
 
 	return (sys_cpuset_getid(td, &ap));
 }
 
 int
 freebsd32_cpuset_getaffinity(struct thread *td,
     struct freebsd32_cpuset_getaffinity_args *uap)
 {
 	struct cpuset_getaffinity_args ap;
 
 	ap.level = uap->level;
 	ap.which = uap->which;
 	ap.id = PAIR32TO64(id_t,uap->id);
 	ap.cpusetsize = uap->cpusetsize;
 	ap.mask = uap->mask;
 
 	return (sys_cpuset_getaffinity(td, &ap));
 }
 
 int
 freebsd32_cpuset_setaffinity(struct thread *td,
     struct freebsd32_cpuset_setaffinity_args *uap)
 {
 	struct cpuset_setaffinity_args ap;
 
 	ap.level = uap->level;
 	ap.which = uap->which;
 	ap.id = PAIR32TO64(id_t,uap->id);
 	ap.cpusetsize = uap->cpusetsize;
 	ap.mask = uap->mask;
 
 	return (sys_cpuset_setaffinity(td, &ap));
 }
 
 int
 freebsd32_nmount(struct thread *td,
     struct freebsd32_nmount_args /* {
     	struct iovec *iovp;
     	unsigned int iovcnt;
     	int flags;
     } */ *uap)
 {
 	struct uio *auio;
 	uint64_t flags;
 	int error;
 
 	/*
 	 * Mount flags are now 64 bits wide. On 32-bit architectures only
 	 * 32 bits are passed in, but from here on everything handles
 	 * 64-bit flags correctly.
 	 */
 	flags = uap->flags;
 
 	AUDIT_ARG_FFLAGS(flags);
 
 	/*
 	 * Filter out MNT_ROOTFS.  We do not want clients of nmount() in
 	 * userspace to set this flag, but we must filter it out if we want
 	 * MNT_UPDATE on the root file system to work.
 	 * MNT_ROOTFS should only be set by the kernel when mounting its
 	 * root file system.
 	 */
 	flags &= ~MNT_ROOTFS;
 
 	/*
 	 * check that we have an even number of iovec's
 	 * and that we have at least two options.
 	 */
 	if ((uap->iovcnt & 1) || (uap->iovcnt < 4))
 		return (EINVAL);
 
 	error = freebsd32_copyinuio(uap->iovp, uap->iovcnt, &auio);
 	if (error)
 		return (error);
 	error = vfs_donmount(td, flags, auio);
 
 	free(auio, M_IOV);
 	return error;
 }
 
 #if 0
 int
 freebsd32_xxx(struct thread *td, struct freebsd32_xxx_args *uap)
 {
 	struct yyy32 *p32, s32;
 	struct yyy *p = NULL, s;
 	struct xxx_arg ap;
 	int error;
 
 	if (uap->zzz) {
 		error = copyin(uap->zzz, &s32, sizeof(s32));
 		if (error)
 			return (error);
 		/* translate in */
 		p = &s;
 	}
 	error = kern_xxx(td, p);
 	if (error)
 		return (error);
 	if (uap->zzz) {
 		/* translate out */
 		error = copyout(&s32, p32, sizeof(s32));
 	}
 	return (error);
 }
 #endif
 
 int
 syscall32_register(int *offset, struct sysent *new_sysent,
     struct sysent *old_sysent, int flags)
 {
 
 	if ((flags & ~SY_THR_STATIC) != 0)
 		return (EINVAL);
 
 	if (*offset == NO_SYSCALL) {
 		int i;
 
 		for (i = 1; i < SYS_MAXSYSCALL; ++i)
 			if (freebsd32_sysent[i].sy_call ==
 			    (sy_call_t *)lkmnosys)
 				break;
 		if (i == SYS_MAXSYSCALL)
 			return (ENFILE);
 		*offset = i;
 	} else if (*offset < 0 || *offset >= SYS_MAXSYSCALL)
 		return (EINVAL);
 	else if (freebsd32_sysent[*offset].sy_call != (sy_call_t *)lkmnosys &&
 	    freebsd32_sysent[*offset].sy_call != (sy_call_t *)lkmressys)
 		return (EEXIST);
 
 	*old_sysent = freebsd32_sysent[*offset];
 	freebsd32_sysent[*offset] = *new_sysent;
 	atomic_store_rel_32(&freebsd32_sysent[*offset].sy_thrcnt, flags);
 	return (0);
 }
 
 int
 syscall32_deregister(int *offset, struct sysent *old_sysent)
 {
 
 	if (*offset == 0)
 		return (0);
 
 	freebsd32_sysent[*offset] = *old_sysent;
 	return (0);
 }
 
 int
 syscall32_module_handler(struct module *mod, int what, void *arg)
 {
 	struct syscall_module_data *data = (struct syscall_module_data*)arg;
 	modspecific_t ms;
 	int error;
 
 	switch (what) {
 	case MOD_LOAD:
 		error = syscall32_register(data->offset, data->new_sysent,
 		    &data->old_sysent, SY_THR_STATIC_KLD);
 		if (error) {
 			/* Leave a mark so we know to safely unload below. */
 			data->offset = NULL;
 			return error;
 		}
 		ms.intval = *data->offset;
 		MOD_XLOCK;
 		module_setspecific(mod, &ms);
 		MOD_XUNLOCK;
 		if (data->chainevh)
 			error = data->chainevh(mod, what, data->chainarg);
 		return (error);
 	case MOD_UNLOAD:
 		/*
 		 * MOD_LOAD failed, so just return without calling the
 		 * chained handler since we didn't pass along the MOD_LOAD
 		 * event.
 		 */
 		if (data->offset == NULL)
 			return (0);
 		if (data->chainevh) {
 			error = data->chainevh(mod, what, data->chainarg);
 			if (error)
 				return (error);
 		}
 		error = syscall32_deregister(data->offset, &data->old_sysent);
 		return (error);
 	default:
 		error = EOPNOTSUPP;
 		if (data->chainevh)
 			error = data->chainevh(mod, what, data->chainarg);
 		return (error);
 	}
 }
 
 int
 syscall32_helper_register(struct syscall_helper_data *sd, int flags)
 {
 	struct syscall_helper_data *sd1;
 	int error;
 
 	for (sd1 = sd; sd1->syscall_no != NO_SYSCALL; sd1++) {
 		error = syscall32_register(&sd1->syscall_no, &sd1->new_sysent,
 		    &sd1->old_sysent, flags);
 		if (error != 0) {
 			syscall32_helper_unregister(sd);
 			return (error);
 		}
 		sd1->registered = 1;
 	}
 	return (0);
 }
 
 int
 syscall32_helper_unregister(struct syscall_helper_data *sd)
 {
 	struct syscall_helper_data *sd1;
 
 	for (sd1 = sd; sd1->registered != 0; sd1++) {
 		syscall32_deregister(&sd1->syscall_no, &sd1->old_sysent);
 		sd1->registered = 0;
 	}
 	return (0);
 }
 
 register_t *
 freebsd32_copyout_strings(struct image_params *imgp)
 {
 	int argc, envc, i;
 	u_int32_t *vectp;
 	char *stringp;
 	uintptr_t destp;
 	u_int32_t *stack_base;
 	struct freebsd32_ps_strings *arginfo;
 	char canary[sizeof(long) * 8];
 	int32_t pagesizes32[MAXPAGESIZES];
 	size_t execpath_len;
 	int szsigcode;
 
 	/*
 	 * Calculate string base and vector table pointers.
 	 * Also deal with signal trampoline code for this exec type.
 	 */
 	if (imgp->execpath != NULL && imgp->auxargs != NULL)
 		execpath_len = strlen(imgp->execpath) + 1;
 	else
 		execpath_len = 0;
 	arginfo = (struct freebsd32_ps_strings *)curproc->p_sysent->
 	    sv_psstrings;
 	if (imgp->proc->p_sysent->sv_sigcode_base == 0)
 		szsigcode = *(imgp->proc->p_sysent->sv_szsigcode);
 	else
 		szsigcode = 0;
 	destp =	(uintptr_t)arginfo;
 
 	/*
 	 * install sigcode
 	 */
 	if (szsigcode != 0) {
 		destp -= szsigcode;
 		destp = rounddown2(destp, sizeof(uint32_t));
 		copyout(imgp->proc->p_sysent->sv_sigcode, (void *)destp,
 		    szsigcode);
 	}
 
 	/*
 	 * Copy the image path for the rtld.
 	 */
 	if (execpath_len != 0) {
 		destp -= execpath_len;
 		imgp->execpathp = destp;
 		copyout(imgp->execpath, (void *)destp, execpath_len);
 	}
 
 	/*
 	 * Prepare the canary for SSP.
 	 */
 	arc4rand(canary, sizeof(canary), 0);
 	destp -= sizeof(canary);
 	imgp->canary = destp;
 	copyout(canary, (void *)destp, sizeof(canary));
 	imgp->canarylen = sizeof(canary);
 
 	/*
 	 * Prepare the pagesizes array.
 	 */
 	for (i = 0; i < MAXPAGESIZES; i++)
 		pagesizes32[i] = (uint32_t)pagesizes[i];
 	destp -= sizeof(pagesizes32);
 	destp = rounddown2(destp, sizeof(uint32_t));
 	imgp->pagesizes = destp;
 	copyout(pagesizes32, (void *)destp, sizeof(pagesizes32));
 	imgp->pagesizeslen = sizeof(pagesizes32);
 
 	destp -= ARG_MAX - imgp->args->stringspace;
 	destp = rounddown2(destp, sizeof(uint32_t));
 
 	/*
 	 * If we have a valid auxargs ptr, prepare some room
 	 * on the stack.
 	 */
 	if (imgp->auxargs) {
 		/*
 		 * 'AT_COUNT*2' is size for the ELF Auxargs data. This is for
 		 * lower compatibility.
 		 */
 		imgp->auxarg_size = (imgp->auxarg_size) ? imgp->auxarg_size
 			: (AT_COUNT * 2);
 		/*
 		 * The '+ 2' is for the null pointers at the end of each of
 		 * the arg and env vector sets, and imgp->auxarg_size is room
 		 * for the arguments of the runtime loader.
 		 */
 		vectp = (u_int32_t *) (destp - (imgp->args->argc +
 		    imgp->args->envc + 2 + imgp->auxarg_size + execpath_len) *
 		    sizeof(u_int32_t));
 	} else {
 		/*
 		 * The '+ 2' is for the null pointers at the end of each of
 		 * the arg and env vector sets
 		 */
 		vectp = (u_int32_t *)(destp - (imgp->args->argc +
 		    imgp->args->envc + 2) * sizeof(u_int32_t));
 	}
 
 	/*
 	 * vectp also becomes our initial stack base
 	 */
 	stack_base = vectp;
 
 	stringp = imgp->args->begin_argv;
 	argc = imgp->args->argc;
 	envc = imgp->args->envc;
 	/*
 	 * Copy out strings - arguments and environment.
 	 */
 	copyout(stringp, (void *)destp, ARG_MAX - imgp->args->stringspace);
 
 	/*
 	 * Fill in "ps_strings" struct for ps, w, etc.
 	 */
 	suword32(&arginfo->ps_argvstr, (u_int32_t)(intptr_t)vectp);
 	suword32(&arginfo->ps_nargvstr, argc);
 
 	/*
 	 * Fill in argument portion of vector table.
 	 */
 	for (; argc > 0; --argc) {
 		suword32(vectp++, (u_int32_t)(intptr_t)destp);
 		while (*stringp++ != 0)
 			destp++;
 		destp++;
 	}
 
 	/* a null vector table pointer separates the argp's from the envp's */
 	suword32(vectp++, 0);
 
 	suword32(&arginfo->ps_envstr, (u_int32_t)(intptr_t)vectp);
 	suword32(&arginfo->ps_nenvstr, envc);
 
 	/*
 	 * Fill in environment portion of vector table.
 	 */
 	for (; envc > 0; --envc) {
 		suword32(vectp++, (u_int32_t)(intptr_t)destp);
 		while (*stringp++ != 0)
 			destp++;
 		destp++;
 	}
 
 	/* end of vector table is a null pointer */
 	suword32(vectp, 0);
 
 	return ((register_t *)stack_base);
 }
 
 int
 freebsd32_kldstat(struct thread *td, struct freebsd32_kldstat_args *uap)
 {
 	struct kld_file_stat stat;
 	struct kld32_file_stat stat32;
 	int error, version;
 
 	if ((error = copyin(&uap->stat->version, &version, sizeof(version)))
 	    != 0)
 		return (error);
 	if (version != sizeof(struct kld32_file_stat_1) &&
 	    version != sizeof(struct kld32_file_stat))
 		return (EINVAL);
 
 	error = kern_kldstat(td, uap->fileid, &stat);
 	if (error != 0)
 		return (error);
 
 	bcopy(&stat.name[0], &stat32.name[0], sizeof(stat.name));
 	CP(stat, stat32, refs);
 	CP(stat, stat32, id);
 	PTROUT_CP(stat, stat32, address);
 	CP(stat, stat32, size);
 	bcopy(&stat.pathname[0], &stat32.pathname[0], sizeof(stat.pathname));
 	return (copyout(&stat32, uap->stat, version));
 }
 
 int
 freebsd32_posix_fallocate(struct thread *td,
     struct freebsd32_posix_fallocate_args *uap)
 {
 
 	td->td_retval[0] = kern_posix_fallocate(td, uap->fd,
 	    PAIR32TO64(off_t, uap->offset), PAIR32TO64(off_t, uap->len));
 	return (0);
 }
 
 int
 freebsd32_posix_fadvise(struct thread *td,
     struct freebsd32_posix_fadvise_args *uap)
 {
 
 	td->td_retval[0] = kern_posix_fadvise(td, uap->fd,
 	    PAIR32TO64(off_t, uap->offset), PAIR32TO64(off_t, uap->len),
 	    uap->advice);
 	return (0);
 }
 
 int
 convert_sigevent32(struct sigevent32 *sig32, struct sigevent *sig)
 {
 
 	CP(*sig32, *sig, sigev_notify);
 	switch (sig->sigev_notify) {
 	case SIGEV_NONE:
 		break;
 	case SIGEV_THREAD_ID:
 		CP(*sig32, *sig, sigev_notify_thread_id);
 		/* FALLTHROUGH */
 	case SIGEV_SIGNAL:
 		CP(*sig32, *sig, sigev_signo);
 		PTRIN_CP(*sig32, *sig, sigev_value.sival_ptr);
 		break;
 	case SIGEV_KEVENT:
 		CP(*sig32, *sig, sigev_notify_kqueue);
 		CP(*sig32, *sig, sigev_notify_kevent_flags);
 		PTRIN_CP(*sig32, *sig, sigev_value.sival_ptr);
 		break;
 	default:
 		return (EINVAL);
 	}
 	return (0);
 }
 
 int
 freebsd32_procctl(struct thread *td, struct freebsd32_procctl_args *uap)
 {
 	void *data;
 	union {
 		struct procctl_reaper_status rs;
 		struct procctl_reaper_pids rp;
 		struct procctl_reaper_kill rk;
 	} x;
 	union {
 		struct procctl_reaper_pids32 rp;
 	} x32;
 	int error, error1, flags;
 
 	switch (uap->com) {
 	case PROC_SPROTECT:
 	case PROC_TRACE_CTL:
 		error = copyin(PTRIN(uap->data), &flags, sizeof(flags));
 		if (error != 0)
 			return (error);
 		data = &flags;
 		break;
 	case PROC_REAP_ACQUIRE:
 	case PROC_REAP_RELEASE:
 		if (uap->data != NULL)
 			return (EINVAL);
 		data = NULL;
 		break;
 	case PROC_REAP_STATUS:
 		data = &x.rs;
 		break;
 	case PROC_REAP_GETPIDS:
 		error = copyin(uap->data, &x32.rp, sizeof(x32.rp));
 		if (error != 0)
 			return (error);
 		CP(x32.rp, x.rp, rp_count);
 		PTRIN_CP(x32.rp, x.rp, rp_pids);
 		data = &x.rp;
 		break;
 	case PROC_REAP_KILL:
 		error = copyin(uap->data, &x.rk, sizeof(x.rk));
 		if (error != 0)
 			return (error);
 		data = &x.rk;
 		break;
 	case PROC_TRACE_STATUS:
 		data = &flags;
 		break;
 	default:
 		return (EINVAL);
 	}
 	error = kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id),
 	    uap->com, data);
 	switch (uap->com) {
 	case PROC_REAP_STATUS:
 		if (error == 0)
 			error = copyout(&x.rs, uap->data, sizeof(x.rs));
 		break;
 	case PROC_REAP_KILL:
 		error1 = copyout(&x.rk, uap->data, sizeof(x.rk));
 		if (error == 0)
 			error = error1;
 		break;
 	case PROC_TRACE_STATUS:
 		if (error == 0)
 			error = copyout(&flags, uap->data, sizeof(flags));
 		break;
 	}
 	return (error);
 }
 
 int
 freebsd32_fcntl(struct thread *td, struct freebsd32_fcntl_args *uap)
 {
 	long tmp;
 
 	switch (uap->cmd) {
 	/*
 	 * Do unsigned conversion for arg when operation
 	 * interprets it as flags or pointer.
 	 */
 	case F_SETLK_REMOTE:
 	case F_SETLKW:
 	case F_SETLK:
 	case F_GETLK:
 	case F_SETFD:
 	case F_SETFL:
 	case F_OGETLK:
 	case F_OSETLK:
 	case F_OSETLKW:
 		tmp = (unsigned int)(uap->arg);
 		break;
 	default:
 		tmp = uap->arg;
 		break;
 	}
 	return (kern_fcntl_freebsd(td, uap->fd, uap->cmd, tmp));
 }
 
 int
 freebsd32_ppoll(struct thread *td, struct freebsd32_ppoll_args *uap)
 {
 	struct timespec32 ts32;
 	struct timespec ts, *tsp;
 	sigset_t set, *ssp;
 	int error;
 
 	if (uap->ts != NULL) {
 		error = copyin(uap->ts, &ts32, sizeof(ts32));
 		if (error != 0)
 			return (error);
 		CP(ts32, ts, tv_sec);
 		CP(ts32, ts, tv_nsec);
 		tsp = &ts;
 	} else
 		tsp = NULL;
 	if (uap->set != NULL) {
 		error = copyin(uap->set, &set, sizeof(set));
 		if (error != 0)
 			return (error);
 		ssp = &set;
 	} else
 		ssp = NULL;
 
 	return (kern_poll(td, uap->fds, uap->nfds, tsp, ssp));
 }
Index: user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c
===================================================================
--- user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c	(revision 281584)
+++ user/ngie/more-tests/sys/compat/linprocfs/linprocfs.c	(revision 281585)
@@ -1,1467 +1,1476 @@
 /*-
  * Copyright (c) 2000 Dag-Erling Coïdan Smørgrav
  * Copyright (c) 1999 Pierre Beyssac
  * Copyright (c) 1993 Jan-Simon Pendry
  * Copyright (c) 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Jan-Simon Pendry.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *	This product includes software developed by the University of
  *	California, Berkeley and its contributors.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)procfs_status.c	8.4 (Berkeley) 6/15/94
  */
 
 #include "opt_compat.h"
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/queue.h>
 #include <sys/blist.h>
 #include <sys/conf.h>
 #include <sys/exec.h>
 #include <sys/fcntl.h>
 #include <sys/filedesc.h>
 #include <sys/jail.h>
 #include <sys/kernel.h>
+#include <sys/limits.h>
 #include <sys/linker.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
-#include <sys/mount.h>
 #include <sys/msg.h>
 #include <sys/mutex.h>
 #include <sys/namei.h>
 #include <sys/proc.h>
 #include <sys/ptrace.h>
 #include <sys/resourcevar.h>
 #include <sys/sbuf.h>
 #include <sys/sem.h>
 #include <sys/smp.h>
 #include <sys/socket.h>
+#include <sys/syscallsubr.h>
 #include <sys/sysctl.h>
 #include <sys/systm.h>
 #include <sys/time.h>
 #include <sys/tty.h>
 #include <sys/user.h>
 #include <sys/uuid.h>
 #include <sys/vmmeter.h>
 #include <sys/vnode.h>
 #include <sys/bus.h>
 
 #include <net/if.h>
 #include <net/if_var.h>
 #include <net/vnet.h>
 
 #include <vm/vm.h>
 #include <vm/vm_extern.h>
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 #include <vm/vm_param.h>
 #include <vm/vm_object.h>
 #include <vm/swap_pager.h>
 
 #include <machine/clock.h>
 
 #include <geom/geom.h>
 #include <geom/geom_int.h>
 
 #if defined(__i386__) || defined(__amd64__)
 #include <machine/cputypes.h>
 #include <machine/md_var.h>
 #endif /* __i386__ || __amd64__ */
 
 #ifdef COMPAT_FREEBSD32
 #include <compat/freebsd32/freebsd32_util.h>
 #endif
 
 #include <compat/linux/linux_ioctl.h>
 #include <compat/linux/linux_mib.h>
 #include <compat/linux/linux_misc.h>
 #include <compat/linux/linux_util.h>
 #include <fs/pseudofs/pseudofs.h>
 #include <fs/procfs/procfs.h>
 
 /*
  * Various conversion macros
  */
 #define T2J(x) ((long)(((x) * 100ULL) / (stathz ? stathz : hz)))	/* ticks to jiffies */
 #define T2CS(x) ((unsigned long)(((x) * 100ULL) / (stathz ? stathz : hz)))	/* ticks to centiseconds */
 #define T2S(x) ((x) / (stathz ? stathz : hz))		/* ticks to seconds */
 #define B2K(x) ((x) >> 10)				/* bytes to kbytes */
 #define B2P(x) ((x) >> PAGE_SHIFT)			/* bytes to pages */
 #define P2B(x) ((x) << PAGE_SHIFT)			/* pages to bytes */
 #define P2K(x) ((x) << (PAGE_SHIFT - 10))		/* pages to kbytes */
 #define TV2J(x)	((x)->tv_sec * 100UL + (x)->tv_usec / 10000)
 
 /**
  * @brief Mapping of ki_stat in struct kinfo_proc to the linux state
  *
  * The linux procfs state field displays one of the characters RSDZTW to
  * denote running, sleeping in an interruptible wait, waiting in an
  * uninterruptible disk sleep, a zombie process, process is being traced
  * or stopped, or process is paging respectively.
  *
  * Our struct kinfo_proc contains the variable ki_stat which contains a
  * value out of SIDL, SRUN, SSLEEP, SSTOP, SZOMB, SWAIT and SLOCK.
  *
  * This character array is used with ki_stat - 1 as an index and tries to
  * map our states to suitable linux states.
  */
 static char linux_state[] = "RRSTZDD";
 
 /*
  * Filler function for proc/meminfo
  */
 static int
 linprocfs_domeminfo(PFS_FILL_ARGS)
 {
 	unsigned long memtotal;		/* total memory in bytes */
 	unsigned long memused;		/* used memory in bytes */
 	unsigned long memfree;		/* free memory in bytes */
 	unsigned long memshared;	/* shared memory ??? */
 	unsigned long buffers, cached;	/* buffer / cache memory ??? */
 	unsigned long long swaptotal;	/* total swap space in bytes */
 	unsigned long long swapused;	/* used swap space in bytes */
 	unsigned long long swapfree;	/* free swap space in bytes */
 	vm_object_t object;
 	int i, j;
 
 	memtotal = physmem * PAGE_SIZE;
 	/*
 	 * The correct thing here would be:
 	 *
 	memfree = vm_cnt.v_free_count * PAGE_SIZE;
 	memused = memtotal - memfree;
 	 *
 	 * but it might mislead linux binaries into thinking there
 	 * is very little memory left, so we cheat and tell them that
 	 * all memory that isn't wired down is free.
 	 */
 	memused = vm_cnt.v_wire_count * PAGE_SIZE;
 	memfree = memtotal - memused;
 	swap_pager_status(&i, &j);
 	swaptotal = (unsigned long long)i * PAGE_SIZE;
 	swapused = (unsigned long long)j * PAGE_SIZE;
 	swapfree = swaptotal - swapused;
 	memshared = 0;
 	mtx_lock(&vm_object_list_mtx);
 	TAILQ_FOREACH(object, &vm_object_list, object_list)
 		if (object->shadow_count > 1)
 			memshared += object->resident_page_count;
 	mtx_unlock(&vm_object_list_mtx);
 	memshared *= PAGE_SIZE;
 	/*
 	 * We'd love to be able to write:
 	 *
 	buffers = bufspace;
 	 *
 	 * but bufspace is internal to vfs_bio.c and we don't feel
 	 * like unstaticizing it just for linprocfs's sake.
 	 */
 	buffers = 0;
 	cached = vm_cnt.v_cache_count * PAGE_SIZE;
 
 	sbuf_printf(sb,
 	    "	     total:    used:	free:  shared: buffers:	 cached:\n"
 	    "Mem:  %lu %lu %lu %lu %lu %lu\n"
 	    "Swap: %llu %llu %llu\n"
 	    "MemTotal: %9lu kB\n"
 	    "MemFree:  %9lu kB\n"
 	    "MemShared:%9lu kB\n"
 	    "Buffers:  %9lu kB\n"
 	    "Cached:   %9lu kB\n"
 	    "SwapTotal:%9llu kB\n"
 	    "SwapFree: %9llu kB\n",
 	    memtotal, memused, memfree, memshared, buffers, cached,
 	    swaptotal, swapused, swapfree,
 	    B2K(memtotal), B2K(memfree),
 	    B2K(memshared), B2K(buffers), B2K(cached),
 	    B2K(swaptotal), B2K(swapfree));
 
 	return (0);
 }
 
 #if defined(__i386__) || defined(__amd64__)
 /*
  * Filler function for proc/cpuinfo (i386 & amd64 version)
  */
 static int
 linprocfs_docpuinfo(PFS_FILL_ARGS)
 {
 	int hw_model[2];
 	char model[128];
 	uint64_t freq;
 	size_t size;
 	int class, fqmhz, fqkhz;
 	int i;
 
 	/*
 	 * We default the flags to include all non-conflicting flags,
 	 * and the Intel versions of conflicting flags.
 	 */
 	static char *flags[] = {
 		"fpu",	    "vme",     "de",	   "pse",      "tsc",
 		"msr",	    "pae",     "mce",	   "cx8",      "apic",
 		"sep",	    "sep",     "mtrr",	   "pge",      "mca",
 		"cmov",	    "pat",     "pse36",	   "pn",       "b19",
 		"b20",	    "b21",     "mmxext",   "mmx",      "fxsr",
 		"xmm",	    "sse2",    "b27",	   "b28",      "b29",
 		"3dnowext", "3dnow"
 	};
 
 	switch (cpu_class) {
 #ifdef __i386__
 	case CPUCLASS_286:
 		class = 2;
 		break;
 	case CPUCLASS_386:
 		class = 3;
 		break;
 	case CPUCLASS_486:
 		class = 4;
 		break;
 	case CPUCLASS_586:
 		class = 5;
 		break;
 	case CPUCLASS_686:
 		class = 6;
 		break;
 	default:
 		class = 0;
 		break;
 #else /* __amd64__ */
 	default:
 		class = 15;
 		break;
 #endif
 	}
 
 	hw_model[0] = CTL_HW;
 	hw_model[1] = HW_MODEL;
 	model[0] = '\0';
 	size = sizeof(model);
 	if (kernel_sysctl(td, hw_model, 2, &model, &size, 0, 0, 0, 0) != 0)
 		strcpy(model, "unknown");
 	for (i = 0; i < mp_ncpus; ++i) {
 		sbuf_printf(sb,
 		    "processor\t: %d\n"
 		    "vendor_id\t: %.20s\n"
 		    "cpu family\t: %u\n"
 		    "model\t\t: %u\n"
 		    "model name\t: %s\n"
 		    "stepping\t: %u\n\n",
 		    i, cpu_vendor, CPUID_TO_FAMILY(cpu_id),
 		    CPUID_TO_MODEL(cpu_id), model, cpu_id & CPUID_STEPPING);
 		/* XXX per-cpu vendor / class / model / id? */
 	}
 
 	sbuf_cat(sb, "flags\t\t:");
 
 #ifdef __i386__
 	switch (cpu_vendor_id) {
 	case CPU_VENDOR_AMD:
 		if (class < 6)
 			flags[16] = "fcmov";
 		break;
 	case CPU_VENDOR_CYRIX:
 		flags[24] = "cxmmx";
 		break;
 	}
 #endif
 
 	for (i = 0; i < 32; i++)
 		if (cpu_feature & (1 << i))
 			sbuf_printf(sb, " %s", flags[i]);
 	sbuf_cat(sb, "\n");
 	freq = atomic_load_acq_64(&tsc_freq);
 	if (freq != 0) {
 		fqmhz = (freq + 4999) / 1000000;
 		fqkhz = ((freq + 4999) / 10000) % 100;
 		sbuf_printf(sb,
 		    "cpu MHz\t\t: %d.%02d\n"
 		    "bogomips\t: %d.%02d\n",
 		    fqmhz, fqkhz, fqmhz, fqkhz);
 	}
 
 	return (0);
 }
 #endif /* __i386__ || __amd64__ */
 
 /*
  * Filler function for proc/mtab
  *
  * This file doesn't exist in Linux' procfs, but is included here so
  * users can symlink /compat/linux/etc/mtab to /proc/mtab
  */
 static int
 linprocfs_domtab(PFS_FILL_ARGS)
 {
 	struct nameidata nd;
-	struct mount *mp;
 	const char *lep;
 	char *dlep, *flep, *mntto, *mntfrom, *fstype;
 	size_t lep_len;
 	int error;
+	struct statfs *buf, *sp;
+	size_t count;
 
 	/* resolve symlinks etc. in the emulation tree prefix */
 	NDINIT(&nd, LOOKUP, FOLLOW, UIO_SYSSPACE, linux_emul_path, td);
 	flep = NULL;
 	error = namei(&nd);
 	lep = linux_emul_path;
 	if (error == 0) {
 		if (vn_fullpath(td, nd.ni_vp, &dlep, &flep) == 0)
 			lep = dlep;
 		vrele(nd.ni_vp);
 	}
 	lep_len = strlen(lep);
 
-	mtx_lock(&mountlist_mtx);
-	error = 0;
-	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
+	buf = NULL;
+	error = kern_getfsstat(td, &buf, SIZE_T_MAX, &count,
+	    UIO_SYSSPACE, MNT_WAIT);
+	if (error != 0) {
+		free(buf, M_TEMP);
+		free(flep, M_TEMP);
+		return (error);
+	}
+
+	for (sp = buf; count > 0; sp++, count--) {
 		/* determine device name */
-		mntfrom = mp->mnt_stat.f_mntfromname;
+		mntfrom = sp->f_mntfromname;
 
 		/* determine mount point */
-		mntto = mp->mnt_stat.f_mntonname;
-		if (strncmp(mntto, lep, lep_len) == 0 &&
-		    mntto[lep_len] == '/')
+		mntto = sp->f_mntonname;
+		if (strncmp(mntto, lep, lep_len) == 0 && mntto[lep_len] == '/')
 			mntto += lep_len;
 
 		/* determine fs type */
-		fstype = mp->mnt_stat.f_fstypename;
+		fstype = sp->f_fstypename;
 		if (strcmp(fstype, pn->pn_info->pi_name) == 0)
 			mntfrom = fstype = "proc";
 		else if (strcmp(fstype, "procfs") == 0)
 			continue;
 
 		if (strcmp(fstype, "linsysfs") == 0) {
 			sbuf_printf(sb, "/sys %s sysfs %s", mntto,
-			    mp->mnt_stat.f_flags & MNT_RDONLY ? "ro" : "rw");
+			    sp->f_flags & MNT_RDONLY ? "ro" : "rw");
 		} else {
 			/* For Linux msdosfs is called vfat */
 			if (strcmp(fstype, "msdosfs") == 0)
 				fstype = "vfat";
 			sbuf_printf(sb, "%s %s %s %s", mntfrom, mntto, fstype,
-			    mp->mnt_stat.f_flags & MNT_RDONLY ? "ro" : "rw");
+			    sp->f_flags & MNT_RDONLY ? "ro" : "rw");
 		}
 #define ADD_OPTION(opt, name) \
-	if (mp->mnt_stat.f_flags & (opt)) sbuf_printf(sb, "," name);
+	if (sp->f_flags & (opt)) sbuf_printf(sb, "," name);
 		ADD_OPTION(MNT_SYNCHRONOUS,	"sync");
 		ADD_OPTION(MNT_NOEXEC,		"noexec");
 		ADD_OPTION(MNT_NOSUID,		"nosuid");
 		ADD_OPTION(MNT_UNION,		"union");
 		ADD_OPTION(MNT_ASYNC,		"async");
 		ADD_OPTION(MNT_SUIDDIR,		"suiddir");
 		ADD_OPTION(MNT_NOSYMFOLLOW,	"nosymfollow");
 		ADD_OPTION(MNT_NOATIME,		"noatime");
 #undef ADD_OPTION
 		/* a real Linux mtab will also show NFS options */
 		sbuf_printf(sb, " 0 0\n");
 	}
-	mtx_unlock(&mountlist_mtx);
+
+	free(buf, M_TEMP);
 	free(flep, M_TEMP);
 	return (error);
 }
 
 /*
  * Filler function for proc/partitions
  */
 static int
 linprocfs_dopartitions(PFS_FILL_ARGS)
 {
 	struct g_class *cp;
 	struct g_geom *gp;
 	struct g_provider *pp;
 	int major, minor;
 
 	g_topology_lock();
 	sbuf_printf(sb, "major minor  #blocks  name rio rmerge rsect "
 	    "ruse wio wmerge wsect wuse running use aveq\n");
 
 	LIST_FOREACH(cp, &g_classes, class) {
 		if (strcmp(cp->name, "DISK") == 0 ||
 		    strcmp(cp->name, "PART") == 0)
 			LIST_FOREACH(gp, &cp->geom, geom) {
 				LIST_FOREACH(pp, &gp->provider, provider) {
 					if (linux_driver_get_major_minor(
 					    pp->name, &major, &minor) != 0) {
 						major = 0;
 						minor = 0;
 					}
 					sbuf_printf(sb, "%d %d %lld %s "
 					    "%d %d %d %d %d "
 					     "%d %d %d %d %d %d\n",
 					     major, minor,
 					     (long long)pp->mediasize, pp->name,
 					     0, 0, 0, 0, 0,
 					     0, 0, 0, 0, 0, 0);
 				}
 			}
 	}
 	g_topology_unlock();
 
 	return (0);
 }
 
 
 /*
  * Filler function for proc/stat
  */
 static int
 linprocfs_dostat(PFS_FILL_ARGS)
 {
 	struct pcpu *pcpu;
 	long cp_time[CPUSTATES];
 	long *cp;
 	int i;
 
 	read_cpu_time(cp_time);
 	sbuf_printf(sb, "cpu %ld %ld %ld %ld\n",
 	    T2J(cp_time[CP_USER]),
 	    T2J(cp_time[CP_NICE]),
 	    T2J(cp_time[CP_SYS] /*+ cp_time[CP_INTR]*/),
 	    T2J(cp_time[CP_IDLE]));
 	CPU_FOREACH(i) {
 		pcpu = pcpu_find(i);
 		cp = pcpu->pc_cp_time;
 		sbuf_printf(sb, "cpu%d %ld %ld %ld %ld\n", i,
 		    T2J(cp[CP_USER]),
 		    T2J(cp[CP_NICE]),
 		    T2J(cp[CP_SYS] /*+ cp[CP_INTR]*/),
 		    T2J(cp[CP_IDLE]));
 	}
 	sbuf_printf(sb,
 	    "disk 0 0 0 0\n"
 	    "page %u %u\n"
 	    "swap %u %u\n"
 	    "intr %u\n"
 	    "ctxt %u\n"
 	    "btime %lld\n",
 	    vm_cnt.v_vnodepgsin,
 	    vm_cnt.v_vnodepgsout,
 	    vm_cnt.v_swappgsin,
 	    vm_cnt.v_swappgsout,
 	    vm_cnt.v_intr,
 	    vm_cnt.v_swtch,
 	    (long long)boottime.tv_sec);
 	return (0);
 }
 
 static int
 linprocfs_doswaps(PFS_FILL_ARGS)
 {
 	struct xswdev xsw;
 	uintmax_t total, used;
 	int n;
 	char devname[SPECNAMELEN + 1];
 
 	sbuf_printf(sb, "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n");
 	mtx_lock(&Giant);
 	for (n = 0; ; n++) {
 		if (swap_dev_info(n, &xsw, devname, sizeof(devname)) != 0)
 			break;
 		total = (uintmax_t)xsw.xsw_nblks * PAGE_SIZE / 1024;
 		used  = (uintmax_t)xsw.xsw_used * PAGE_SIZE / 1024;
 
 		/*
 		 * The space and not tab after the device name is on
 		 * purpose.  Linux does so.
 		 */
 		sbuf_printf(sb, "/dev/%-34s unknown\t\t%jd\t%jd\t-1\n",
 		    devname, total, used);
 	}
 	mtx_unlock(&Giant);
 	return (0);
 }
 
 /*
  * Filler function for proc/uptime
  */
 static int
 linprocfs_douptime(PFS_FILL_ARGS)
 {
 	long cp_time[CPUSTATES];
 	struct timeval tv;
 
 	getmicrouptime(&tv);
 	read_cpu_time(cp_time);
 	sbuf_printf(sb, "%lld.%02ld %ld.%02lu\n",
 	    (long long)tv.tv_sec, tv.tv_usec / 10000,
 	    T2S(cp_time[CP_IDLE] / mp_ncpus),
 	    T2CS(cp_time[CP_IDLE] / mp_ncpus) % 100);
 	return (0);
 }
 
 /*
  * Get OS build date
  */
 static void
 linprocfs_osbuild(struct thread *td, struct sbuf *sb)
 {
 #if 0
 	char osbuild[256];
 	char *cp1, *cp2;
 
 	strncpy(osbuild, version, 256);
 	osbuild[255] = '\0';
 	cp1 = strstr(osbuild, "\n");
 	cp2 = strstr(osbuild, ":");
 	if (cp1 && cp2) {
 		*cp1 = *cp2 = '\0';
 		cp1 = strstr(osbuild, "#");
 	} else
 		cp1 = NULL;
 	if (cp1)
 		sbuf_printf(sb, "%s%s", cp1, cp2 + 1);
 	else
 #endif
 		sbuf_cat(sb, "#4 Sun Dec 18 04:30:00 CET 1977");
 }
 
 /*
  * Get OS builder
  */
 static void
 linprocfs_osbuilder(struct thread *td, struct sbuf *sb)
 {
 #if 0
 	char builder[256];
 	char *cp;
 
 	cp = strstr(version, "\n    ");
 	if (cp) {
 		strncpy(builder, cp + 5, 256);
 		builder[255] = '\0';
 		cp = strstr(builder, ":");
 		if (cp)
 			*cp = '\0';
 	}
 	if (cp)
 		sbuf_cat(sb, builder);
 	else
 #endif
 		sbuf_cat(sb, "des@freebsd.org");
 }
 
 /*
  * Filler function for proc/version
  */
 static int
 linprocfs_doversion(PFS_FILL_ARGS)
 {
 	char osname[LINUX_MAX_UTSNAME];
 	char osrelease[LINUX_MAX_UTSNAME];
 
 	linux_get_osname(td, osname);
 	linux_get_osrelease(td, osrelease);
 	sbuf_printf(sb, "%s version %s (", osname, osrelease);
 	linprocfs_osbuilder(td, sb);
 	sbuf_cat(sb, ") (gcc version " __VERSION__ ") ");
 	linprocfs_osbuild(td, sb);
 	sbuf_cat(sb, "\n");
 
 	return (0);
 }
 
 /*
  * Filler function for proc/loadavg
  */
 static int
 linprocfs_doloadavg(PFS_FILL_ARGS)
 {
 
 	sbuf_printf(sb,
 	    "%d.%02d %d.%02d %d.%02d %d/%d %d\n",
 	    (int)(averunnable.ldavg[0] / averunnable.fscale),
 	    (int)(averunnable.ldavg[0] * 100 / averunnable.fscale % 100),
 	    (int)(averunnable.ldavg[1] / averunnable.fscale),
 	    (int)(averunnable.ldavg[1] * 100 / averunnable.fscale % 100),
 	    (int)(averunnable.ldavg[2] / averunnable.fscale),
 	    (int)(averunnable.ldavg[2] * 100 / averunnable.fscale % 100),
 	    1,				/* number of running tasks */
 	    nprocs,			/* number of tasks */
 	    lastpid			/* the last pid */
 	);
 	return (0);
 }
 
 /*
  * Filler function for proc/pid/stat
  */
 static int
 linprocfs_doprocstat(PFS_FILL_ARGS)
 {
 	struct kinfo_proc kp;
 	char state;
 	static int ratelimit = 0;
 	vm_offset_t startcode, startdata;
 
 	sx_slock(&proctree_lock);
 	PROC_LOCK(p);
 	fill_kinfo_proc(p, &kp);
 	sx_sunlock(&proctree_lock);
 	if (p->p_vmspace) {
 	   startcode = (vm_offset_t)p->p_vmspace->vm_taddr;
 	   startdata = (vm_offset_t)p->p_vmspace->vm_daddr;
 	} else {
 	   startcode = 0;
 	   startdata = 0;
 	};
 	sbuf_printf(sb, "%d", p->p_pid);
 #define PS_ADD(name, fmt, arg) sbuf_printf(sb, " " fmt, arg)
 	PS_ADD("comm",		"(%s)",	p->p_comm);
 	if (kp.ki_stat >= sizeof(linux_state)) {
 		state = 'R';
 
 		if (ratelimit == 0) {
 			printf("linprocfs: don't know how to handle unknown FreeBSD state %d/%zd, mapping to R\n",
 			    kp.ki_stat, sizeof(linux_state));
 			++ratelimit;
 		}
 	} else
 		state = linux_state[kp.ki_stat - 1];
 	PS_ADD("state",		"%c",	state);
 	PS_ADD("ppid",		"%d",	p->p_pptr ? p->p_pptr->p_pid : 0);
 	PS_ADD("pgrp",		"%d",	p->p_pgid);
 	PS_ADD("session",	"%d",	p->p_session->s_sid);
 	PROC_UNLOCK(p);
 	PS_ADD("tty",		"%ju",	(uintmax_t)kp.ki_tdev);
 	PS_ADD("tpgid",		"%d",	kp.ki_tpgid);
 	PS_ADD("flags",		"%u",	0); /* XXX */
 	PS_ADD("minflt",	"%lu",	kp.ki_rusage.ru_minflt);
 	PS_ADD("cminflt",	"%lu",	kp.ki_rusage_ch.ru_minflt);
 	PS_ADD("majflt",	"%lu",	kp.ki_rusage.ru_majflt);
 	PS_ADD("cmajflt",	"%lu",	kp.ki_rusage_ch.ru_majflt);
 	PS_ADD("utime",		"%ld",	TV2J(&kp.ki_rusage.ru_utime));
 	PS_ADD("stime",		"%ld",	TV2J(&kp.ki_rusage.ru_stime));
 	PS_ADD("cutime",	"%ld",	TV2J(&kp.ki_rusage_ch.ru_utime));
 	PS_ADD("cstime",	"%ld",	TV2J(&kp.ki_rusage_ch.ru_stime));
 	PS_ADD("priority",	"%d",	kp.ki_pri.pri_user);
 	PS_ADD("nice",		"%d",	kp.ki_nice); /* 19 (nicest) to -19 */
 	PS_ADD("0",		"%d",	0); /* removed field */
 	PS_ADD("itrealvalue",	"%d",	0); /* XXX */
 	PS_ADD("starttime",	"%lu",	TV2J(&kp.ki_start) - TV2J(&boottime));
 	PS_ADD("vsize",		"%ju",	P2K((uintmax_t)kp.ki_size));
 	PS_ADD("rss",		"%ju",	(uintmax_t)kp.ki_rssize);
 	PS_ADD("rlim",		"%lu",	kp.ki_rusage.ru_maxrss);
 	PS_ADD("startcode",	"%ju",	(uintmax_t)startcode);
 	PS_ADD("endcode",	"%ju",	(uintmax_t)startdata);
 	PS_ADD("startstack",	"%u",	0); /* XXX */
 	PS_ADD("kstkesp",	"%u",	0); /* XXX */
 	PS_ADD("kstkeip",	"%u",	0); /* XXX */
 	PS_ADD("signal",	"%u",	0); /* XXX */
 	PS_ADD("blocked",	"%u",	0); /* XXX */
 	PS_ADD("sigignore",	"%u",	0); /* XXX */
 	PS_ADD("sigcatch",	"%u",	0); /* XXX */
 	PS_ADD("wchan",		"%u",	0); /* XXX */
 	PS_ADD("nswap",		"%lu",	kp.ki_rusage.ru_nswap);
 	PS_ADD("cnswap",	"%lu",	kp.ki_rusage_ch.ru_nswap);
 	PS_ADD("exitsignal",	"%d",	0); /* XXX */
 	PS_ADD("processor",	"%u",	kp.ki_lastcpu);
 	PS_ADD("rt_priority",	"%u",	0); /* XXX */ /* >= 2.5.19 */
 	PS_ADD("policy",	"%u",	kp.ki_pri.pri_class); /* >= 2.5.19 */
 #undef PS_ADD
 	sbuf_putc(sb, '\n');
 
 	return (0);
 }
 
 /*
  * Filler function for proc/pid/statm
  */
 static int
 linprocfs_doprocstatm(PFS_FILL_ARGS)
 {
 	struct kinfo_proc kp;
 	segsz_t lsize;
 
 	sx_slock(&proctree_lock);
 	PROC_LOCK(p);
 	fill_kinfo_proc(p, &kp);
 	PROC_UNLOCK(p);
 	sx_sunlock(&proctree_lock);
 
 	/*
 	 * See comments in linprocfs_doprocstatus() regarding the
 	 * computation of lsize.
 	 */
 	/* size resident share trs drs lrs dt */
 	sbuf_printf(sb, "%ju ", B2P((uintmax_t)kp.ki_size));
 	sbuf_printf(sb, "%ju ", (uintmax_t)kp.ki_rssize);
 	sbuf_printf(sb, "%ju ", (uintmax_t)0); /* XXX */
 	sbuf_printf(sb, "%ju ",	(uintmax_t)kp.ki_tsize);
 	sbuf_printf(sb, "%ju ", (uintmax_t)(kp.ki_dsize + kp.ki_ssize));
 	lsize = B2P(kp.ki_size) - kp.ki_dsize -
 	    kp.ki_ssize - kp.ki_tsize - 1;
 	sbuf_printf(sb, "%ju ", (uintmax_t)lsize);
 	sbuf_printf(sb, "%ju\n", (uintmax_t)0); /* XXX */
 
 	return (0);
 }
 
 /*
  * Filler function for proc/pid/status
  */
 static int
 linprocfs_doprocstatus(PFS_FILL_ARGS)
 {
 	struct kinfo_proc kp;
 	char *state;
 	segsz_t lsize;
 	struct thread *td2;
 	struct sigacts *ps;
 	int i;
 
 	sx_slock(&proctree_lock);
 	PROC_LOCK(p);
 	td2 = FIRST_THREAD_IN_PROC(p); /* XXXKSE pretend only one thread */
 
 	if (P_SHOULDSTOP(p)) {
 		state = "T (stopped)";
 	} else {
 		switch(p->p_state) {
 		case PRS_NEW:
 			state = "I (idle)";
 			break;
 		case PRS_NORMAL:
 			if (p->p_flag & P_WEXIT) {
 				state = "X (exiting)";
 				break;
 			}
 			switch(td2->td_state) {
 			case TDS_INHIBITED:
 				state = "S (sleeping)";
 				break;
 			case TDS_RUNQ:
 			case TDS_RUNNING:
 				state = "R (running)";
 				break;
 			default:
 				state = "? (unknown)";
 				break;
 			}
 			break;
 		case PRS_ZOMBIE:
 			state = "Z (zombie)";
 			break;
 		default:
 			state = "? (unknown)";
 			break;
 		}
 	}
 
 	fill_kinfo_proc(p, &kp);
 	sx_sunlock(&proctree_lock);
 
 	sbuf_printf(sb, "Name:\t%s\n",		p->p_comm); /* XXX escape */
 	sbuf_printf(sb, "State:\t%s\n",		state);
 
 	/*
 	 * Credentials
 	 */
 	sbuf_printf(sb, "Pid:\t%d\n",		p->p_pid);
 	sbuf_printf(sb, "PPid:\t%d\n",		p->p_pptr ?
 						p->p_pptr->p_pid : 0);
 	sbuf_printf(sb, "Uid:\t%d %d %d %d\n",	p->p_ucred->cr_ruid,
 						p->p_ucred->cr_uid,
 						p->p_ucred->cr_svuid,
 						/* FreeBSD doesn't have fsuid */
 						p->p_ucred->cr_uid);
 	sbuf_printf(sb, "Gid:\t%d %d %d %d\n",	p->p_ucred->cr_rgid,
 						p->p_ucred->cr_gid,
 						p->p_ucred->cr_svgid,
 						/* FreeBSD doesn't have fsgid */
 						p->p_ucred->cr_gid);
 	sbuf_cat(sb, "Groups:\t");
 	for (i = 0; i < p->p_ucred->cr_ngroups; i++)
 		sbuf_printf(sb, "%d ",		p->p_ucred->cr_groups[i]);
 	PROC_UNLOCK(p);
 	sbuf_putc(sb, '\n');
 
 	/*
 	 * Memory
 	 *
 	 * While our approximation of VmLib may not be accurate (I
 	 * don't know of a simple way to verify it, and I'm not sure
 	 * it has much meaning anyway), I believe it's good enough.
 	 *
 	 * The same code that could (I think) accurately compute VmLib
 	 * could also compute VmLck, but I don't really care enough to
 	 * implement it. Submissions are welcome.
 	 */
 	sbuf_printf(sb, "VmSize:\t%8ju kB\n",	B2K((uintmax_t)kp.ki_size));
 	sbuf_printf(sb, "VmLck:\t%8u kB\n",	P2K(0)); /* XXX */
 	sbuf_printf(sb, "VmRSS:\t%8ju kB\n",	P2K((uintmax_t)kp.ki_rssize));
 	sbuf_printf(sb, "VmData:\t%8ju kB\n",	P2K((uintmax_t)kp.ki_dsize));
 	sbuf_printf(sb, "VmStk:\t%8ju kB\n",	P2K((uintmax_t)kp.ki_ssize));
 	sbuf_printf(sb, "VmExe:\t%8ju kB\n",	P2K((uintmax_t)kp.ki_tsize));
 	lsize = B2P(kp.ki_size) - kp.ki_dsize -
 	    kp.ki_ssize - kp.ki_tsize - 1;
 	sbuf_printf(sb, "VmLib:\t%8ju kB\n",	P2K((uintmax_t)lsize));
 
 	/*
 	 * Signal masks
 	 *
 	 * We support up to 128 signals, while Linux supports 32,
 	 * but we only define 32 (the same 32 as Linux, to boot), so
 	 * just show the lower 32 bits of each mask. XXX hack.
 	 *
 	 * NB: on certain platforms (Sparc at least) Linux actually
 	 * supports 64 signals, but this code is a long way from
 	 * running on anything but i386, so ignore that for now.
 	 */
 	PROC_LOCK(p);
 	sbuf_printf(sb, "SigPnd:\t%08x\n",	p->p_siglist.__bits[0]);
 	/*
 	 * I can't seem to find out where the signal mask is in
 	 * relation to struct proc, so SigBlk is left unimplemented.
 	 */
 	sbuf_printf(sb, "SigBlk:\t%08x\n",	0); /* XXX */
 	ps = p->p_sigacts;
 	mtx_lock(&ps->ps_mtx);
 	sbuf_printf(sb, "SigIgn:\t%08x\n",	ps->ps_sigignore.__bits[0]);
 	sbuf_printf(sb, "SigCgt:\t%08x\n",	ps->ps_sigcatch.__bits[0]);
 	mtx_unlock(&ps->ps_mtx);
 	PROC_UNLOCK(p);
 
 	/*
 	 * Linux also prints the capability masks, but we don't have
 	 * capabilities yet, and when we do get them they're likely to
 	 * be meaningless to Linux programs, so we lie. XXX
 	 */
 	sbuf_printf(sb, "CapInh:\t%016x\n",	0);
 	sbuf_printf(sb, "CapPrm:\t%016x\n",	0);
 	sbuf_printf(sb, "CapEff:\t%016x\n",	0);
 
 	return (0);
 }
 
 
 /*
  * Filler function for proc/pid/cwd
  */
 static int
 linprocfs_doproccwd(PFS_FILL_ARGS)
 {
 	char *fullpath = "unknown";
 	char *freepath = NULL;
 
 	vn_fullpath(td, p->p_fd->fd_cdir, &fullpath, &freepath);
 	sbuf_printf(sb, "%s", fullpath);
 	if (freepath)
 		free(freepath, M_TEMP);
 	return (0);
 }
 
 /*
  * Filler function for proc/pid/root
  */
 static int
 linprocfs_doprocroot(PFS_FILL_ARGS)
 {
 	struct vnode *rvp;
 	char *fullpath = "unknown";
 	char *freepath = NULL;
 
 	rvp = jailed(p->p_ucred) ? p->p_fd->fd_jdir : p->p_fd->fd_rdir;
 	vn_fullpath(td, rvp, &fullpath, &freepath);
 	sbuf_printf(sb, "%s", fullpath);
 	if (freepath)
 		free(freepath, M_TEMP);
 	return (0);
 }
 
 /*
  * Filler function for proc/pid/cmdline
  */
 static int
 linprocfs_doproccmdline(PFS_FILL_ARGS)
 {
 	int ret;
 
 	PROC_LOCK(p);
 	if ((ret = p_cansee(td, p)) != 0) {
 		PROC_UNLOCK(p);
 		return (ret);
 	}
 
 	/*
 	 * Mimic linux behavior and pass only processes with usermode
 	 * address space as valid.  Return zero silently otherwise.
 	 */
 	if (p->p_vmspace == &vmspace0) {
 		PROC_UNLOCK(p);
 		return (0);
 	}
 	if (p->p_args != NULL) {
 		sbuf_bcpy(sb, p->p_args->ar_args, p->p_args->ar_length);
 		PROC_UNLOCK(p);
 		return (0);
 	}
 
 	if ((p->p_flag & P_SYSTEM) != 0) {
 		PROC_UNLOCK(p);
 		return (0);
 	}
 
 	PROC_UNLOCK(p);
 
 	ret = proc_getargv(td, p, sb);
 	return (ret);
 }
 
 /*
  * Filler function for proc/pid/environ
  */
 static int
 linprocfs_doprocenviron(PFS_FILL_ARGS)
 {
 	int ret;
 
 	PROC_LOCK(p);
 	if ((ret = p_candebug(td, p)) != 0) {
 		PROC_UNLOCK(p);
 		return (ret);
 	}
 
 	/*
 	 * Mimic linux behavior and pass only processes with usermode
 	 * address space as valid.  Return zero silently otherwise.
 	 */
 	if (p->p_vmspace == &vmspace0) {
 		PROC_UNLOCK(p);
 		return (0);
 	}
 
 	if ((p->p_flag & P_SYSTEM) != 0) {
 		PROC_UNLOCK(p);
 		return (0);
 	}
 
 	PROC_UNLOCK(p);
 
 	ret = proc_getenvv(td, p, sb);
 	return (ret);
 }
 
 /*
  * Filler function for proc/pid/maps
  */
 static int
 linprocfs_doprocmaps(PFS_FILL_ARGS)
 {
 	struct vmspace *vm;
 	vm_map_t map;
 	vm_map_entry_t entry, tmp_entry;
 	vm_object_t obj, tobj, lobj;
 	vm_offset_t e_start, e_end;
 	vm_ooffset_t off = 0;
 	vm_prot_t e_prot;
 	unsigned int last_timestamp;
 	char *name = "", *freename = NULL;
 	ino_t ino;
 	int ref_count, shadow_count, flags;
 	int error;
 	struct vnode *vp;
 	struct vattr vat;
 
 	PROC_LOCK(p);
 	error = p_candebug(td, p);
 	PROC_UNLOCK(p);
 	if (error)
 		return (error);
 
 	if (uio->uio_rw != UIO_READ)
 		return (EOPNOTSUPP);
 
 	error = 0;
 	vm = vmspace_acquire_ref(p);
 	if (vm == NULL)
 		return (ESRCH);
 	map = &vm->vm_map;
 	vm_map_lock_read(map);
 	for (entry = map->header.next; entry != &map->header;
 	    entry = entry->next) {
 		name = "";
 		freename = NULL;
 		if (entry->eflags & MAP_ENTRY_IS_SUB_MAP)
 			continue;
 		e_prot = entry->protection;
 		e_start = entry->start;
 		e_end = entry->end;
 		obj = entry->object.vm_object;
 		for (lobj = tobj = obj; tobj; tobj = tobj->backing_object) {
 			VM_OBJECT_RLOCK(tobj);
 			if (lobj != obj)
 				VM_OBJECT_RUNLOCK(lobj);
 			lobj = tobj;
 		}
 		last_timestamp = map->timestamp;
 		vm_map_unlock_read(map);
 		ino = 0;
 		if (lobj) {
 			off = IDX_TO_OFF(lobj->size);
 			if (lobj->type == OBJT_VNODE) {
 				vp = lobj->handle;
 				if (vp)
 					vref(vp);
 			}
 			else
 				vp = NULL;
 			if (lobj != obj)
 				VM_OBJECT_RUNLOCK(lobj);
 			flags = obj->flags;
 			ref_count = obj->ref_count;
 			shadow_count = obj->shadow_count;
 			VM_OBJECT_RUNLOCK(obj);
 			if (vp) {
 				vn_fullpath(td, vp, &name, &freename);
 				vn_lock(vp, LK_SHARED | LK_RETRY);
 				VOP_GETATTR(vp, &vat, td->td_ucred);
 				ino = vat.va_fileid;
 				vput(vp);
 			}
 		} else {
 			flags = 0;
 			ref_count = 0;
 			shadow_count = 0;
 		}
 
 		/*
 		 * format:
 		 *  start, end, access, offset, major, minor, inode, name.
 		 */
 		error = sbuf_printf(sb,
 		    "%08lx-%08lx %s%s%s%s %08lx %02x:%02x %lu%s%s\n",
 		    (u_long)e_start, (u_long)e_end,
 		    (e_prot & VM_PROT_READ)?"r":"-",
 		    (e_prot & VM_PROT_WRITE)?"w":"-",
 		    (e_prot & VM_PROT_EXECUTE)?"x":"-",
 		    "p",
 		    (u_long)off,
 		    0,
 		    0,
 		    (u_long)ino,
 		    *name ? "     " : "",
 		    name
 		    );
 		if (freename)
 			free(freename, M_TEMP);
 		vm_map_lock_read(map);
 		if (error == -1) {
 			error = 0;
 			break;
 		}
 		if (last_timestamp != map->timestamp) {
 			/*
 			 * Look again for the entry because the map was
 			 * modified while it was unlocked.  Specifically,
 			 * the entry may have been clipped, merged, or deleted.
 			 */
 			vm_map_lookup_entry(map, e_end - 1, &tmp_entry);
 			entry = tmp_entry;
 		}
 	}
 	vm_map_unlock_read(map);
 	vmspace_free(vm);
 
 	return (error);
 }
 
 /*
  * Filler function for proc/net/dev
  */
 static int
 linprocfs_donetdev(PFS_FILL_ARGS)
 {
 	char ifname[16]; /* XXX LINUX_IFNAMSIZ */
 	struct ifnet *ifp;
 
 	sbuf_printf(sb, "%6s|%58s|%s\n"
 	    "%6s|%58s|%58s\n",
 	    "Inter-", "   Receive", "  Transmit",
 	    " face",
 	    "bytes    packets errs drop fifo frame compressed multicast",
 	    "bytes    packets errs drop fifo colls carrier compressed");
 
 	CURVNET_SET(TD_TO_VNET(curthread));
 	IFNET_RLOCK();
 	TAILQ_FOREACH(ifp, &V_ifnet, if_link) {
 		linux_ifname(ifp, ifname, sizeof ifname);
 		sbuf_printf(sb, "%6.6s: ", ifname);
 		sbuf_printf(sb, "%7ju %7ju %4ju %4ju %4lu %5lu %10lu %9ju ",
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IBYTES),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IPACKETS),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IERRORS),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IQDROPS),
 							/* rx_missed_errors */
 		    0UL,				/* rx_fifo_errors */
 		    0UL,				/* rx_length_errors +
 							 * rx_over_errors +
 							 * rx_crc_errors +
 							 * rx_frame_errors */
 		    0UL,				/* rx_compressed */
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_IMCASTS));
 							/* XXX-BZ rx only? */
 		sbuf_printf(sb, "%8ju %7ju %4ju %4ju %4lu %5ju %7lu %10lu\n",
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OBYTES),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OPACKETS),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OERRORS),
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_OQDROPS),
 		    0UL,				/* tx_fifo_errors */
 		    (uintmax_t )ifp->if_get_counter(ifp, IFCOUNTER_COLLISIONS),
 		    0UL,				/* tx_carrier_errors +
 							 * tx_aborted_errors +
 							 * tx_window_errors +
 							 * tx_heartbeat_errors*/
 		    0UL);				/* tx_compressed */
 	}
 	IFNET_RUNLOCK();
 	CURVNET_RESTORE();
 
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/osrelease
  */
 static int
 linprocfs_doosrelease(PFS_FILL_ARGS)
 {
 	char osrelease[LINUX_MAX_UTSNAME];
 
 	linux_get_osrelease(td, osrelease);
 	sbuf_printf(sb, "%s\n", osrelease);
 
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/ostype
  */
 static int
 linprocfs_doostype(PFS_FILL_ARGS)
 {
 	char osname[LINUX_MAX_UTSNAME];
 
 	linux_get_osname(td, osname);
 	sbuf_printf(sb, "%s\n", osname);
 
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/version
  */
 static int
 linprocfs_doosbuild(PFS_FILL_ARGS)
 {
 
 	linprocfs_osbuild(td, sb);
 	sbuf_cat(sb, "\n");
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/msgmni
  */
 static int
 linprocfs_domsgmni(PFS_FILL_ARGS)
 {
 
 	sbuf_printf(sb, "%d\n", msginfo.msgmni);
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/pid_max
  */
 static int
 linprocfs_dopid_max(PFS_FILL_ARGS)
 {
 
 	sbuf_printf(sb, "%i\n", PID_MAX);
 	return (0);
 }
 
 /*
  * Filler function for proc/sys/kernel/sem
  */
 static int
 linprocfs_dosem(PFS_FILL_ARGS)
 {
 
 	sbuf_printf(sb, "%d %d %d %d\n", seminfo.semmsl, seminfo.semmns,
 	    seminfo.semopm, seminfo.semmni);
 	return (0);
 }
 
 /*
  * Filler function for proc/scsi/device_info
  */
 static int
 linprocfs_doscsidevinfo(PFS_FILL_ARGS)
 {
 
 	return (0);
 }
 
 /*
  * Filler function for proc/scsi/scsi
  */
 static int
 linprocfs_doscsiscsi(PFS_FILL_ARGS)
 {
 
 	return (0);
 }
 
 extern struct cdevsw *cdevsw[];
 
 /*
  * Filler function for proc/devices
  */
 static int
 linprocfs_dodevices(PFS_FILL_ARGS)
 {
 	char *char_devices;
 	sbuf_printf(sb, "Character devices:\n");
 
 	char_devices = linux_get_char_devices();
 	sbuf_printf(sb, "%s", char_devices);
 	linux_free_get_char_devices(char_devices);
 
 	sbuf_printf(sb, "\nBlock devices:\n");
 
 	return (0);
 }
 
 /*
  * Filler function for proc/cmdline
  */
 static int
 linprocfs_docmdline(PFS_FILL_ARGS)
 {
 
 	sbuf_printf(sb, "BOOT_IMAGE=%s", kernelname);
 	sbuf_printf(sb, " ro root=302\n");
 	return (0);
 }
 
 /*
  * Filler function for proc/filesystems
  */
 static int
 linprocfs_dofilesystems(PFS_FILL_ARGS)
 {
 	struct vfsconf *vfsp;
 
 	mtx_lock(&Giant);
 	TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) {
 		if (vfsp->vfc_flags & VFCF_SYNTHETIC)
 			sbuf_printf(sb, "nodev");
 		sbuf_printf(sb, "\t%s\n", vfsp->vfc_name);
 	}
 	mtx_unlock(&Giant);
 	return(0);
 }
 
 #if 0
 /*
  * Filler function for proc/modules
  */
 static int
 linprocfs_domodules(PFS_FILL_ARGS)
 {
 	struct linker_file *lf;
 
 	TAILQ_FOREACH(lf, &linker_files, link) {
 		sbuf_printf(sb, "%-20s%8lu%4d\n", lf->filename,
 		    (unsigned long)lf->size, lf->refs);
 	}
 	return (0);
 }
 #endif
 
 /*
  * Filler function for proc/pid/fd
  */
 static int
 linprocfs_dofdescfs(PFS_FILL_ARGS)
 {
 
 	if (p == curproc)
 		sbuf_printf(sb, "/dev/fd");
 	else
 		sbuf_printf(sb, "unknown");
 	return (0);
 }
 
 
 /*
  * Filler function for proc/sys/kernel/random/uuid
  */
 static int
 linprocfs_douuid(PFS_FILL_ARGS)
 {
 	struct uuid uuid;
 
 	kern_uuidgen(&uuid, 1);
 	sbuf_printf_uuid(sb, &uuid);
 	sbuf_printf(sb, "\n");
 	return(0);
 }
 
 
 /*
  * Constructor
  */
 static int
 linprocfs_init(PFS_INIT_ARGS)
 {
 	struct pfs_node *root;
 	struct pfs_node *dir;
 
 	root = pi->pi_root;
 
 	/* /proc/... */
 	pfs_create_file(root, "cmdline", &linprocfs_docmdline,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "cpuinfo", &linprocfs_docpuinfo,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "devices", &linprocfs_dodevices,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "filesystems", &linprocfs_dofilesystems,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "loadavg", &linprocfs_doloadavg,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "meminfo", &linprocfs_domeminfo,
 	    NULL, NULL, NULL, PFS_RD);
 #if 0
 	pfs_create_file(root, "modules", &linprocfs_domodules,
 	    NULL, NULL, NULL, PFS_RD);
 #endif
 	pfs_create_file(root, "mounts", &linprocfs_domtab,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "mtab", &linprocfs_domtab,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "partitions", &linprocfs_dopartitions,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_link(root, "self", &procfs_docurproc,
 	    NULL, NULL, NULL, 0);
 	pfs_create_file(root, "stat", &linprocfs_dostat,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "swaps", &linprocfs_doswaps,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "uptime", &linprocfs_douptime,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(root, "version", &linprocfs_doversion,
 	    NULL, NULL, NULL, PFS_RD);
 
 	/* /proc/net/... */
 	dir = pfs_create_dir(root, "net", NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "dev", &linprocfs_donetdev,
 	    NULL, NULL, NULL, PFS_RD);
 
 	/* /proc/<pid>/... */
 	dir = pfs_create_dir(root, "pid", NULL, NULL, NULL, PFS_PROCDEP);
 	pfs_create_file(dir, "cmdline", &linprocfs_doproccmdline,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_link(dir, "cwd", &linprocfs_doproccwd,
 	    NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "environ", &linprocfs_doprocenviron,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_link(dir, "exe", &procfs_doprocfile,
 	    NULL, &procfs_notsystem, NULL, 0);
 	pfs_create_file(dir, "maps", &linprocfs_doprocmaps,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "mem", &procfs_doprocmem,
 	    &procfs_attr, &procfs_candebug, NULL, PFS_RDWR|PFS_RAW);
 	pfs_create_link(dir, "root", &linprocfs_doprocroot,
 	    NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "stat", &linprocfs_doprocstat,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "statm", &linprocfs_doprocstatm,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "status", &linprocfs_doprocstatus,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_link(dir, "fd", &linprocfs_dofdescfs,
 	    NULL, NULL, NULL, 0);
 
 	/* /proc/scsi/... */
 	dir = pfs_create_dir(root, "scsi", NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "device_info", &linprocfs_doscsidevinfo,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "scsi", &linprocfs_doscsiscsi,
 	    NULL, NULL, NULL, PFS_RD);
 
 	/* /proc/sys/... */
 	dir = pfs_create_dir(root, "sys", NULL, NULL, NULL, 0);
 	/* /proc/sys/kernel/... */
 	dir = pfs_create_dir(dir, "kernel", NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "osrelease", &linprocfs_doosrelease,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "ostype", &linprocfs_doostype,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "version", &linprocfs_doosbuild,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "msgmni", &linprocfs_domsgmni,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "pid_max", &linprocfs_dopid_max,
 	    NULL, NULL, NULL, PFS_RD);
 	pfs_create_file(dir, "sem", &linprocfs_dosem,
 	    NULL, NULL, NULL, PFS_RD);
 
 	/* /proc/sys/kernel/random/... */
 	dir = pfs_create_dir(dir, "random", NULL, NULL, NULL, 0);
 	pfs_create_file(dir, "uuid", &linprocfs_douuid,
 	    NULL, NULL, NULL, PFS_RD);
 
 	return (0);
 }
 
 /*
  * Destructor
  */
 static int
 linprocfs_uninit(PFS_INIT_ARGS)
 {
 
 	/* nothing to do, pseudofs will GC */
 	return (0);
 }
 
 PSEUDOFS(linprocfs, 1, 0);
 MODULE_DEPEND(linprocfs, linux, 1, 1, 1);
 MODULE_DEPEND(linprocfs, procfs, 1, 1, 1);
 MODULE_DEPEND(linprocfs, sysvmsg, 1, 1, 1);
 MODULE_DEPEND(linprocfs, sysvsem, 1, 1, 1);
Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c
===================================================================
--- user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c	(revision 281584)
+++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdaa_patches.c	(revision 281585)
@@ -1,739 +1,746 @@
 /*-
  * Copyright (c) 2006 Stephane E. Potvin <sepotvin@videotron.ca>
  * Copyright (c) 2006 Ariff Abdullah <ariff@FreeBSD.org>
  * Copyright (c) 2008-2012 Alexander Motin <mav@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*
  * Intel High Definition Audio (Audio function quirks) driver for FreeBSD.
  */
 
 #ifdef HAVE_KERNEL_OPTION_HEADERS
 #include "opt_snd.h"
 #endif
 
 #include <dev/sound/pcm/sound.h>
 
 #include <sys/ctype.h>
 
 #include <dev/sound/pci/hda/hdac.h>
 #include <dev/sound/pci/hda/hdaa.h>
 #include <dev/sound/pci/hda/hda_reg.h>
 
 SND_DECLARE_FILE("$FreeBSD$");
 
 static const struct {
 	uint32_t model;
 	uint32_t id;
 	uint32_t subsystemid;
 	uint32_t set, unset;
 	uint32_t gpio;
 } hdac_quirks[] = {
 	/*
 	 * XXX Force stereo quirk. Monaural recording / playback
 	 *     on a few codecs (especially ALC880) seems broken or
 	 *     perhaps unsupported.
 	 */
 	{ HDA_MATCH_ALL, HDA_MATCH_ALL, HDA_MATCH_ALL,
 	    HDAA_QUIRK_FORCESTEREO | HDAA_QUIRK_IVREF, 0,
 	    0 },
 	{ ACER_ALL_SUBVENDOR, HDA_MATCH_ALL, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_G2K_SUBVENDOR, HDA_CODEC_ALC660, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_M5200_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_A7M_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_A7T_SUBVENDOR, HDA_CODEC_ALC882, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_W2J_SUBVENDOR, HDA_CODEC_ALC882, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ ASUS_U5F_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL,
 	    HDAA_QUIRK_EAPDINV, 0,
 	    0 },
 	{ ASUS_A8X_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL,
 	    HDAA_QUIRK_EAPDINV, 0,
 	    0 },
 	{ ASUS_F3JC_SUBVENDOR, HDA_CODEC_ALC861, HDA_MATCH_ALL,
 	    HDAA_QUIRK_OVREF, 0,
 	    0 },
 	{ UNIWILL_9075_SUBVENDOR, HDA_CODEC_ALC861, HDA_MATCH_ALL,
 	    HDAA_QUIRK_OVREF, 0,
 	    0 },
 	/*{ ASUS_M2N_SUBVENDOR, HDA_CODEC_AD1988, HDA_MATCH_ALL,
 	    HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100,
 	    0 },*/
 	{ MEDION_MD95257_SUBVENDOR, HDA_CODEC_ALC880, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(1) },
 	{ LENOVO_3KN100_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL,
 	    HDAA_QUIRK_EAPDINV | HDAA_QUIRK_SENSEINV, 0,
 	    0 },
 	{ SAMSUNG_Q1_SUBVENDOR, HDA_CODEC_AD1986A, HDA_MATCH_ALL,
 	    HDAA_QUIRK_EAPDINV, 0,
 	    0 },
 	{ APPLE_MB3_SUBVENDOR, HDA_CODEC_ALC885, HDA_MATCH_ALL,
 	    HDAA_QUIRK_OVREF50, 0,
 	    HDAA_GPIO_SET(0) },
 	{ APPLE_INTEL_MAC, HDA_CODEC_STAC9221, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) | HDAA_GPIO_SET(1) },
 	{ APPLE_MACBOOKAIR31, HDA_CODEC_CS4206, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) },
 	{ APPLE_MACBOOKPRO55, HDA_CODEC_CS4206, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) },
 	{ APPLE_MACBOOKPRO71, HDA_CODEC_CS4206, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) },
 	{ HDA_INTEL_MACBOOKPRO92, HDA_CODEC_CS4206, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(1) | HDAA_GPIO_SET(3) },
 	{ DELL_D630_SUBVENDOR, HDA_CODEC_STAC9205X, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ DELL_V1400_SUBVENDOR, HDA_CODEC_STAC9228X, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(2) },
 	{ DELL_V1500_SUBVENDOR, HDA_CODEC_STAC9205X, HDA_MATCH_ALL,
 	    0, 0,
 	    HDAA_GPIO_SET(0) },
 	{ HDA_MATCH_ALL, HDA_CODEC_AD1988, HDA_MATCH_ALL,
 	    HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100,
 	    0 },
 	{ HDA_MATCH_ALL, HDA_CODEC_AD1988B, HDA_MATCH_ALL,
 	    HDAA_QUIRK_IVREF80, HDAA_QUIRK_IVREF50 | HDAA_QUIRK_IVREF100,
 	    0 },
 	{ HDA_MATCH_ALL, HDA_CODEC_CX20549, HDA_MATCH_ALL,
 	    0, HDAA_QUIRK_FORCESTEREO,
 	    0 },
 	/* Mac Pro 1,1 requires ovref for proper volume level. */
 	{ 0x00000000, HDA_CODEC_ALC885, 0x106b0c00,
 	    0, HDAA_QUIRK_OVREF,
 	    0 }
 };
 
 static void
 hdac_pin_patch(struct hdaa_widget *w)
 {
 	const char *patch = NULL;
 	uint32_t config, orig, id, subid;
 	nid_t nid = w->nid;
 
 	config = orig = w->wclass.pin.config;
 	id = hdaa_codec_id(w->devinfo);
 	subid = hdaa_card_id(w->devinfo);
 
 	/* XXX: Old patches require complete review.
 	 * They may now create more problems than they solve due to
 	 * incorrect associations.
 	 */
 	if (id == HDA_CODEC_ALC880 && subid == LG_LW20_SUBVENDOR) {
 		switch (nid) {
 		case 26:
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN;
 			break;
 		case 27:
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT;
 			break;
 		default:
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC880 &&
 	    (subid == CLEVO_D900T_SUBVENDOR ||
 	    subid == ASUS_M5200_SUBVENDOR)) {
 		/*
 		 * Super broken BIOS
 		 */
 		switch (nid) {
 		case 24:	/* MIC1 */
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN;
 			break;
 		case 25:	/* XXX MIC2 */
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN;
 			break;
 		case 26:	/* LINE1 */
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN;
 			break;
 		case 27:	/* XXX LINE2 */
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_IN;
 			break;
 		case 28:	/* CD */
 			config &= ~HDA_CONFIG_DEFAULTCONF_DEVICE_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_DEVICE_CD;
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC883 &&
 	    (subid == MSI_MS034A_SUBVENDOR ||
 	    HDA_DEV_MATCH(ACER_ALL_SUBVENDOR, subid))) {
 		switch (nid) {
 		case 25:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		case 28:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_CD |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		}
 	} else if (id == HDA_CODEC_CX20549 && subid ==
 	    HP_V3000_SUBVENDOR) {
 		switch (nid) {
 		case 18:
 			config &= ~HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE;
 			break;
 		case 20:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		case 21:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_CD |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		}
 	} else if (id == HDA_CODEC_CX20551 && subid ==
 	    HP_DV5000_SUBVENDOR) {
 		switch (nid) {
 		case 20:
 		case 21:
 			config &= ~HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK;
 			config |= HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE;
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC861 && subid ==
 	    ASUS_W6F_SUBVENDOR) {
 		switch (nid) {
 		case 11:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_LINE_OUT |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		case 12:
 		case 14:
 		case 16:
 		case 31:
 		case 32:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_FIXED);
 			break;
 		case 15:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_JACK);
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC861 && subid ==
 	    UNIWILL_9075_SUBVENDOR) {
 		switch (nid) {
 		case 15:
 			config &= ~(HDA_CONFIG_DEFAULTCONF_DEVICE_MASK |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK);
 			config |= (HDA_CONFIG_DEFAULTCONF_DEVICE_HP_OUT |
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_JACK);
 			break;
 		}
 	}
 
 	/* New patches */
 	if (id == HDA_CODEC_AD1984A &&
 	    subid == LENOVO_X300_SUBVENDOR) {
 		switch (nid) {
 		case 17: /* Headphones with redirection */
 			patch = "as=1 seq=15";
 			break;
 		case 20: /* Two mics together */
 			patch = "as=2 seq=15";
 			break;
 		}
 	} else if (id == HDA_CODEC_AD1986A &&
 	    (subid == ASUS_M2NPVMX_SUBVENDOR ||
 	    subid == ASUS_A8NVMCSM_SUBVENDOR ||
 	    subid == ASUS_P5PL2_SUBVENDOR)) {
 		switch (nid) {
 		case 26: /* Headphones with redirection */
 			patch = "as=1 seq=15";
 			break;
 		case 28: /* 5.1 out => 2.0 out + 1 input */
 			patch = "device=Line-in as=8 seq=1";
 			break;
 		case 29: /* Can't use this as input, as the only available mic
 			  * preamplifier is busy by front panel mic (nid 31).
 			  * If you want to use this rear connector as mic input,
 			  * you have to disable the front panel one. */
 			patch = "as=0";
 			break;
 		case 31: /* Lot of inputs configured with as=15 and unusable */
 			patch = "as=8 seq=3";
 			break;
 		case 32:
 			patch = "as=8 seq=4";
 			break;
 		case 34:
 			patch = "as=8 seq=5";
 			break;
 		case 36:
 			patch = "as=8 seq=6";
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC260 &&
 	    HDA_DEV_MATCH(SONY_S5_SUBVENDOR, subid)) {
 		switch (nid) {
 		case 16:
 			patch = "seq=15 device=Headphones";
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC268) {
 	    if (subid == ACER_T5320_SUBVENDOR) {
 		switch (nid) {
 		case 20: /* Headphones Jack */
 			patch = "as=1 seq=15";
 			break;
 		}
 	    }
 	} else if (id == HDA_CODEC_CX20561 &&
 	    subid == LENOVO_B450_SUBVENDOR) {
 		switch (nid) {
 		case 22:
 			patch = "as=1 seq=15";
 			break;
 		}
 	} else if (id == HDA_CODEC_CX20561 &&
 	    subid == LENOVO_T400_SUBVENDOR) {
 		switch (nid) {
 		case 22:
 			patch = "as=1 seq=15";
 			break;
 		case 26:
 			patch = "as=1 seq=0";
 			break;
 		}
 	} else if (id == HDA_CODEC_CX20590 &&
 	    (subid == LENOVO_X1_SUBVENDOR ||
 	    subid == LENOVO_X220_SUBVENDOR ||
 	    subid == LENOVO_T420_SUBVENDOR ||
 	    subid == LENOVO_T520_SUBVENDOR ||
 	    subid == LENOVO_G580_SUBVENDOR)) {
 		switch (nid) {
 		case 25:
 			patch = "as=1 seq=15";
 			break;
 		/*
 		 * Group onboard mic and headphone mic
 		 * together.  Fixes onboard mic.
 		 */
 		case 27:
 			patch = "as=2 seq=15";
 			break;
 		case 35:
 			patch = "as=2";
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC269 &&
 	    (subid == LENOVO_X1CRBN_SUBVENDOR ||
 	    subid == LENOVO_T430_SUBVENDOR ||
 	    subid == LENOVO_T430S_SUBVENDOR ||
 	    subid == LENOVO_T530_SUBVENDOR)) {
 		switch (nid) {
 		case 21:
 			patch = "as=1 seq=15";
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC269 &&
 	    subid == ASUS_UX31A_SUBVENDOR) {
 		switch (nid) {
 		case 33:
 			patch = "as=1 seq=15";
 			break;
 		}
 	} else if (id == HDA_CODEC_ALC892 &&
 	    subid == INTEL_DH87RL_SUBVENDOR) {
 		switch (nid) {
 		case 27:
 			patch = "as=1 seq=15";
 			break;
 		}
+	} else if (id == HDA_CODEC_ALC292 &&
+	    subid == LENOVO_X120BS_SUBVENDOR) {
+		switch (nid) {
+		case 21:
+			patch = "as=1 seq=15";
+			break;
+		}
 	}
 
 	if (patch != NULL)
 		config = hdaa_widget_pin_patch(config, patch);
 	HDA_BOOTVERBOSE(
 		if (config != orig)
 			device_printf(w->devinfo->dev,
 			    "Patching pin config nid=%u 0x%08x -> 0x%08x\n",
 			    nid, orig, config);
 	);
 	w->wclass.pin.config = config;
 }
 
 static void
 hdaa_widget_patch(struct hdaa_widget *w)
 {
 	struct hdaa_devinfo *devinfo = w->devinfo;
 	uint32_t orig;
 	nid_t beeper = -1;
 
 	orig = w->param.widget_cap;
 	/* On some codecs beeper is an input pin, but it is not recordable
 	   alone. Also, most BIOSes do not declare the beeper pin.
 	   Change the beeper pin node type to beeper to help the parser. */
 	switch (hdaa_codec_id(devinfo)) {
 	case HDA_CODEC_AD1882:
 	case HDA_CODEC_AD1883:
 	case HDA_CODEC_AD1984:
 	case HDA_CODEC_AD1984A:
 	case HDA_CODEC_AD1984B:
 	case HDA_CODEC_AD1987:
 	case HDA_CODEC_AD1988:
 	case HDA_CODEC_AD1988B:
 	case HDA_CODEC_AD1989B:
 		beeper = 26;
 		break;
 	case HDA_CODEC_ALC260:
 		beeper = 23;
 		break;
 	}
 	if (hda_get_vendor_id(devinfo->dev) == REALTEK_VENDORID &&
 	    hdaa_codec_id(devinfo) != HDA_CODEC_ALC260)
 		beeper = 29;
 	if (w->nid == beeper) {
 		w->param.widget_cap &= ~HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_MASK;
 		w->param.widget_cap |= HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_BEEP_WIDGET <<
 		    HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_SHIFT;
 		w->waspin = 1;
 	}
 	/*
 	 * Clear "digital" flag from digital mic input, as its signal then goes
 	 * to "analog" mixer and this separation just limits functionality.
 	 */
 	if (hdaa_codec_id(devinfo) == HDA_CODEC_AD1984A &&
 	    w->nid == 23)
 		w->param.widget_cap &= ~HDA_PARAM_AUDIO_WIDGET_CAP_DIGITAL_MASK;
 	HDA_BOOTVERBOSE(
 		if (w->param.widget_cap != orig) {
 			device_printf(w->devinfo->dev,
 			    "Patching widget caps nid=%u 0x%08x -> 0x%08x\n",
 			    w->nid, orig, w->param.widget_cap);
 		}
 	);
 
 	if (w->type == HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_PIN_COMPLEX)
 		hdac_pin_patch(w);
 }
 
 void
 hdaa_patch(struct hdaa_devinfo *devinfo)
 {
 	struct hdaa_widget *w;
 	uint32_t id, subid, subsystemid;
 	int i;
 
 	id = hdaa_codec_id(devinfo);
 	subid = hdaa_card_id(devinfo);
 	subsystemid = hda_get_subsystem_id(devinfo->dev);
 
 	/*
 	 * Quirks
 	 */
 	for (i = 0; i < nitems(hdac_quirks); i++) {
 		if (!(HDA_DEV_MATCH(hdac_quirks[i].model, subid) &&
 		    HDA_DEV_MATCH(hdac_quirks[i].id, id) &&
 		    HDA_DEV_MATCH(hdac_quirks[i].subsystemid, subsystemid)))
 			continue;
 		devinfo->quirks |= hdac_quirks[i].set;
 		devinfo->quirks &= ~(hdac_quirks[i].unset);
 		devinfo->gpio = hdac_quirks[i].gpio;
 	}
 
 	/* Apply per-widget patch. */
 	for (i = devinfo->startnode; i < devinfo->endnode; i++) {
 		w = hdaa_widget_get(devinfo, i);
 		if (w == NULL)
 			continue;
 		hdaa_widget_patch(w);
 	}
 
 	switch (id) {
 	case HDA_CODEC_AD1983:
 		/*
 		 * This CODEC has several possible usages, but none
 		 * fit the parser best. Help parser to choose better.
 		 */
 		/* Disable direct unmixed playback to get pcm volume. */
 		w = hdaa_widget_get(devinfo, 5);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		w = hdaa_widget_get(devinfo, 6);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		w = hdaa_widget_get(devinfo, 11);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		/* Disable mic and line selectors. */
 		w = hdaa_widget_get(devinfo, 12);
 		if (w != NULL)
 			w->connsenable[1] = 0;
 		w = hdaa_widget_get(devinfo, 13);
 		if (w != NULL)
 			w->connsenable[1] = 0;
 		/* Disable recording from mono playback mix. */
 		w = hdaa_widget_get(devinfo, 20);
 		if (w != NULL)
 			w->connsenable[3] = 0;
 		break;
 	case HDA_CODEC_AD1986A:
 		/*
 		 * This CODEC has overcomplicated input mixing.
 		 * Make some cleaning there.
 		 */
 		/* Disable input mono mixer. Not needed and not supported. */
 		w = hdaa_widget_get(devinfo, 43);
 		if (w != NULL)
 			w->enable = 0;
 		/* Disable any with any input mixing mesh. Use separately. */
 		w = hdaa_widget_get(devinfo, 39);
 		if (w != NULL)
 			w->enable = 0;
 		w = hdaa_widget_get(devinfo, 40);
 		if (w != NULL)
 			w->enable = 0;
 		w = hdaa_widget_get(devinfo, 41);
 		if (w != NULL)
 			w->enable = 0;
 		w = hdaa_widget_get(devinfo, 42);
 		if (w != NULL)
 			w->enable = 0;
 		/* Disable duplicate mixer node connector. */
 		w = hdaa_widget_get(devinfo, 15);
 		if (w != NULL)
 			w->connsenable[3] = 0;
 		/* There is only one mic preamplifier, use it effectively. */
 		w = hdaa_widget_get(devinfo, 31);
 		if (w != NULL) {
 			if ((w->wclass.pin.config &
 			    HDA_CONFIG_DEFAULTCONF_DEVICE_MASK) ==
 			    HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN) {
 				w = hdaa_widget_get(devinfo, 16);
 				if (w != NULL)
 				    w->connsenable[2] = 0;
 			} else {
 				w = hdaa_widget_get(devinfo, 15);
 				if (w != NULL)
 				    w->connsenable[0] = 0;
 			}
 		}
 		w = hdaa_widget_get(devinfo, 32);
 		if (w != NULL) {
 			if ((w->wclass.pin.config &
 			    HDA_CONFIG_DEFAULTCONF_DEVICE_MASK) ==
 			    HDA_CONFIG_DEFAULTCONF_DEVICE_MIC_IN) {
 				w = hdaa_widget_get(devinfo, 16);
 				if (w != NULL)
 				    w->connsenable[0] = 0;
 			} else {
 				w = hdaa_widget_get(devinfo, 15);
 				if (w != NULL)
 				    w->connsenable[1] = 0;
 			}
 		}
 
 		if (subid == ASUS_A8X_SUBVENDOR) {
 			/*
 			 * This is just plain ridiculous.. There
 			 * are several A8 series that share the same
 			 * pci id but work differently (EAPD).
 			 */
 			w = hdaa_widget_get(devinfo, 26);
 			if (w != NULL && w->type ==
 			    HDA_PARAM_AUDIO_WIDGET_CAP_TYPE_PIN_COMPLEX &&
 			    (w->wclass.pin.config &
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_MASK) !=
 			    HDA_CONFIG_DEFAULTCONF_CONNECTIVITY_NONE)
 				devinfo->quirks &=
 				    ~HDAA_QUIRK_EAPDINV;
 		}
 		break;
 	case HDA_CODEC_AD1981HD:
 		/*
 		 * This CODEC has very unusual design with several
 		 * points inappropriate for the present parser.
 		 */
 		/* Disable recording from mono playback mix. */
 		w = hdaa_widget_get(devinfo, 21);
 		if (w != NULL)
 			w->connsenable[3] = 0;
 		/* Disable rear to front mic mixer, use separately. */
 		w = hdaa_widget_get(devinfo, 31);
 		if (w != NULL)
 			w->enable = 0;
 		/* Disable direct playback, use mixer. */
 		w = hdaa_widget_get(devinfo, 5);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		w = hdaa_widget_get(devinfo, 6);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		w = hdaa_widget_get(devinfo, 9);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		w = hdaa_widget_get(devinfo, 24);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		break;
 	case HDA_CODEC_ALC269:
 		/*
 		 * ASUS EeePC 1001px has strange variant of ALC269 CODEC,
 		 * that mutes speaker if unused mixer at NID 15 is muted.
 		 * Probably CODEC incorrectly reports internal connections.
 		 * Hide that muter from the driver.  There are several CODECs
 		 * sharing this ID and I have not enough information about
 		 * them to implement more universal solution.
 		 */
 		if (subid == 0x84371043) {
 			w = hdaa_widget_get(devinfo, 15);
 			if (w != NULL)
 				w->param.inamp_cap = 0;
 		}
 		break;
 	case HDA_CODEC_CX20582:
 	case HDA_CODEC_CX20583:
 	case HDA_CODEC_CX20584:
 	case HDA_CODEC_CX20585:
 	case HDA_CODEC_CX20590:
 		/*
 		 * These codecs have extra connectivity on record side
 		 * too rich for the present parser.
 		 */
 		w = hdaa_widget_get(devinfo, 20);
 		if (w != NULL)
 			w->connsenable[1] = 0;
 		w = hdaa_widget_get(devinfo, 21);
 		if (w != NULL)
 			w->connsenable[1] = 0;
 		w = hdaa_widget_get(devinfo, 22);
 		if (w != NULL)
 			w->connsenable[0] = 0;
 		break;
 	case HDA_CODEC_VT1708S_0:
 	case HDA_CODEC_VT1708S_1:
 	case HDA_CODEC_VT1708S_2:
 	case HDA_CODEC_VT1708S_3:
 	case HDA_CODEC_VT1708S_4:
 	case HDA_CODEC_VT1708S_5:
 	case HDA_CODEC_VT1708S_6:
 	case HDA_CODEC_VT1708S_7:
 		/*
 		 * These codecs have hidden mic boost controls.
 		 */
 		w = hdaa_widget_get(devinfo, 26);
 		if (w != NULL)
 			w->param.inamp_cap =
 			    (40 << HDA_PARAM_OUTPUT_AMP_CAP_STEPSIZE_SHIFT) |
 			    (3 << HDA_PARAM_OUTPUT_AMP_CAP_NUMSTEPS_SHIFT) |
 			    (0 << HDA_PARAM_OUTPUT_AMP_CAP_OFFSET_SHIFT);
 		w = hdaa_widget_get(devinfo, 30);
 		if (w != NULL)
 			w->param.inamp_cap =
 			    (40 << HDA_PARAM_OUTPUT_AMP_CAP_STEPSIZE_SHIFT) |
 			    (3 << HDA_PARAM_OUTPUT_AMP_CAP_NUMSTEPS_SHIFT) |
 			    (0 << HDA_PARAM_OUTPUT_AMP_CAP_OFFSET_SHIFT);
 		break;
 	}
 }
 
 void
 hdaa_patch_direct(struct hdaa_devinfo *devinfo)
 {
 	device_t dev = devinfo->dev;
 	uint32_t id, subid, val;
 
 	id = hdaa_codec_id(devinfo);
 	subid = hdaa_card_id(devinfo);
 
 	switch (id) {
 	case HDA_CODEC_VT1708S_0:
 	case HDA_CODEC_VT1708S_1:
 	case HDA_CODEC_VT1708S_2:
 	case HDA_CODEC_VT1708S_3:
 	case HDA_CODEC_VT1708S_4:
 	case HDA_CODEC_VT1708S_5:
 	case HDA_CODEC_VT1708S_6:
 	case HDA_CODEC_VT1708S_7:
 		/* Enable Mic Boost Volume controls. */
 		hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid,
 		    0xf98, 0x01));
 		/* Fall through */
 	case HDA_CODEC_VT1818S:
 		/* Don't bypass mixer. */
 		hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid,
 		    0xf88, 0xc0));
 		break;
 	}
 	if (subid == APPLE_INTEL_MAC)
 		hda_command(dev, HDA_CMD_12BIT(0, devinfo->nid,
 		    0x7e7, 0));
 	if (id == HDA_CODEC_ALC269) {
 		if (subid == 0x16e31043 || subid == 0x831a1043 ||
 		    subid == 0x834a1043 || subid == 0x83981043 ||
 		    subid == 0x83ce1043) {
 			/*
 			 * The digital mics on some Asus laptops produce
 			 * differential signals instead of the expected stereo.
 			 * That results in silence when downmixed to mono.  As a
 			 * workaround, make the codec handle the signal as mono.
 			 */
 			hda_command(dev, HDA_CMD_SET_COEFF_INDEX(0, 0x20, 0x07));
 			val = hda_command(dev, HDA_CMD_GET_PROCESSING_COEFF(0, 0x20));
 			hda_command(dev, HDA_CMD_SET_COEFF_INDEX(0, 0x20, 0x07));
 			hda_command(dev, HDA_CMD_SET_PROCESSING_COEFF(0, 0x20, val|0x80));
 		}
 	}
 }
Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c
===================================================================
--- user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c	(revision 281584)
+++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.c	(revision 281585)
@@ -1,2088 +1,2090 @@
 /*-
  * Copyright (c) 2006 Stephane E. Potvin <sepotvin@videotron.ca>
  * Copyright (c) 2006 Ariff Abdullah <ariff@FreeBSD.org>
  * Copyright (c) 2008-2012 Alexander Motin <mav@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*
  * Intel High Definition Audio (Controller) driver for FreeBSD.
  */
 
 #ifdef HAVE_KERNEL_OPTION_HEADERS
 #include "opt_snd.h"
 #endif
 
 #include <dev/sound/pcm/sound.h>
 #include <dev/pci/pcireg.h>
 #include <dev/pci/pcivar.h>
 
 #include <sys/ctype.h>
 #include <sys/taskqueue.h>
 
 #include <dev/sound/pci/hda/hdac_private.h>
 #include <dev/sound/pci/hda/hdac_reg.h>
 #include <dev/sound/pci/hda/hda_reg.h>
 #include <dev/sound/pci/hda/hdac.h>
 
 #define HDA_DRV_TEST_REV	"20120126_0002"
 
 SND_DECLARE_FILE("$FreeBSD$");
 
 #define hdac_lock(sc)		snd_mtxlock((sc)->lock)
 #define hdac_unlock(sc)		snd_mtxunlock((sc)->lock)
 #define hdac_lockassert(sc)	snd_mtxassert((sc)->lock)
 #define hdac_lockowned(sc)	mtx_owned((sc)->lock)
 
 #define HDAC_QUIRK_64BIT	(1 << 0)
 #define HDAC_QUIRK_DMAPOS	(1 << 1)
 #define HDAC_QUIRK_MSI		(1 << 2)
 
 static const struct {
 	const char *key;
 	uint32_t value;
 } hdac_quirks_tab[] = {
 	{ "64bit", HDAC_QUIRK_64BIT },
 	{ "dmapos", HDAC_QUIRK_DMAPOS },
 	{ "msi", HDAC_QUIRK_MSI },
 };
 
 MALLOC_DEFINE(M_HDAC, "hdac", "HDA Controller");
 
 static const struct {
 	uint32_t	model;
 	const char	*desc;
 	char		quirks_on;
 	char		quirks_off;
 } hdac_devices[] = {
 	{ HDA_INTEL_OAK,     "Intel Oaktrail",	0, 0 },
 	{ HDA_INTEL_BAY,     "Intel BayTrail",	0, 0 },
 	{ HDA_INTEL_HSW1,    "Intel Haswell",	0, 0 },
 	{ HDA_INTEL_HSW2,    "Intel Haswell",	0, 0 },
 	{ HDA_INTEL_HSW3,    "Intel Haswell",	0, 0 },
+	{ HDA_INTEL_BDW1,    "Intel Broadwell",	0, 0 },
+	{ HDA_INTEL_BDW2,    "Intel Broadwell",	0, 0 },
 	{ HDA_INTEL_CPT,     "Intel Cougar Point",	0, 0 },
 	{ HDA_INTEL_PATSBURG,"Intel Patsburg",  0, 0 },
 	{ HDA_INTEL_PPT1,    "Intel Panther Point",	0, 0 },
 	{ HDA_INTEL_LPT1,    "Intel Lynx Point",	0, 0 },
 	{ HDA_INTEL_LPT2,    "Intel Lynx Point",	0, 0 },
 	{ HDA_INTEL_WCPT,    "Intel Wildcat Point",	0, 0 },
 	{ HDA_INTEL_WELLS1,  "Intel Wellsburg",	0, 0 },
 	{ HDA_INTEL_WELLS2,  "Intel Wellsburg",	0, 0 },
 	{ HDA_INTEL_LPTLP1,  "Intel Lynx Point-LP",	0, 0 },
 	{ HDA_INTEL_LPTLP2,  "Intel Lynx Point-LP",	0, 0 },
 	{ HDA_INTEL_82801F,  "Intel 82801F",	0, 0 },
 	{ HDA_INTEL_63XXESB, "Intel 631x/632xESB",	0, 0 },
 	{ HDA_INTEL_82801G,  "Intel 82801G",	0, 0 },
 	{ HDA_INTEL_82801H,  "Intel 82801H",	0, 0 },
 	{ HDA_INTEL_82801I,  "Intel 82801I",	0, 0 },
 	{ HDA_INTEL_82801JI, "Intel 82801JI",	0, 0 },
 	{ HDA_INTEL_82801JD, "Intel 82801JD",	0, 0 },
 	{ HDA_INTEL_PCH,     "Intel 5 Series/3400 Series",	0, 0 },
 	{ HDA_INTEL_PCH2,    "Intel 5 Series/3400 Series",	0, 0 },
 	{ HDA_INTEL_SCH,     "Intel SCH",	0, 0 },
 	{ HDA_NVIDIA_MCP51,  "NVIDIA MCP51",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_MCP55,  "NVIDIA MCP55",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_MCP61_1, "NVIDIA MCP61",	0, 0 },
 	{ HDA_NVIDIA_MCP61_2, "NVIDIA MCP61",	0, 0 },
 	{ HDA_NVIDIA_MCP65_1, "NVIDIA MCP65",	0, 0 },
 	{ HDA_NVIDIA_MCP65_2, "NVIDIA MCP65",	0, 0 },
 	{ HDA_NVIDIA_MCP67_1, "NVIDIA MCP67",	0, 0 },
 	{ HDA_NVIDIA_MCP67_2, "NVIDIA MCP67",	0, 0 },
 	{ HDA_NVIDIA_MCP73_1, "NVIDIA MCP73",	0, 0 },
 	{ HDA_NVIDIA_MCP73_2, "NVIDIA MCP73",	0, 0 },
 	{ HDA_NVIDIA_MCP78_1, "NVIDIA MCP78",	0, HDAC_QUIRK_64BIT },
 	{ HDA_NVIDIA_MCP78_2, "NVIDIA MCP78",	0, HDAC_QUIRK_64BIT },
 	{ HDA_NVIDIA_MCP78_3, "NVIDIA MCP78",	0, HDAC_QUIRK_64BIT },
 	{ HDA_NVIDIA_MCP78_4, "NVIDIA MCP78",	0, HDAC_QUIRK_64BIT },
 	{ HDA_NVIDIA_MCP79_1, "NVIDIA MCP79",	0, 0 },
 	{ HDA_NVIDIA_MCP79_2, "NVIDIA MCP79",	0, 0 },
 	{ HDA_NVIDIA_MCP79_3, "NVIDIA MCP79",	0, 0 },
 	{ HDA_NVIDIA_MCP79_4, "NVIDIA MCP79",	0, 0 },
 	{ HDA_NVIDIA_MCP89_1, "NVIDIA MCP89",	0, 0 },
 	{ HDA_NVIDIA_MCP89_2, "NVIDIA MCP89",	0, 0 },
 	{ HDA_NVIDIA_MCP89_3, "NVIDIA MCP89",	0, 0 },
 	{ HDA_NVIDIA_MCP89_4, "NVIDIA MCP89",	0, 0 },
 	{ HDA_NVIDIA_0BE2,   "NVIDIA (0x0be2)",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_0BE3,   "NVIDIA (0x0be3)",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_0BE4,   "NVIDIA (0x0be4)",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GT100,  "NVIDIA GT100",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GT104,  "NVIDIA GT104",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GT106,  "NVIDIA GT106",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GT108,  "NVIDIA GT108",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GT116,  "NVIDIA GT116",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GF119,  "NVIDIA GF119",	0, 0 },
 	{ HDA_NVIDIA_GF110_1, "NVIDIA GF110",	0, HDAC_QUIRK_MSI },
 	{ HDA_NVIDIA_GF110_2, "NVIDIA GF110",	0, HDAC_QUIRK_MSI },
 	{ HDA_ATI_SB450,     "ATI SB450",	0, 0 },
 	{ HDA_ATI_SB600,     "ATI SB600",	0, 0 },
 	{ HDA_ATI_RS600,     "ATI RS600",	0, 0 },
 	{ HDA_ATI_RS690,     "ATI RS690",	0, 0 },
 	{ HDA_ATI_RS780,     "ATI RS780",	0, 0 },
 	{ HDA_ATI_R600,      "ATI R600",	0, 0 },
 	{ HDA_ATI_RV610,     "ATI RV610",	0, 0 },
 	{ HDA_ATI_RV620,     "ATI RV620",	0, 0 },
 	{ HDA_ATI_RV630,     "ATI RV630",	0, 0 },
 	{ HDA_ATI_RV635,     "ATI RV635",	0, 0 },
 	{ HDA_ATI_RV710,     "ATI RV710",	0, 0 },
 	{ HDA_ATI_RV730,     "ATI RV730",	0, 0 },
 	{ HDA_ATI_RV740,     "ATI RV740",	0, 0 },
 	{ HDA_ATI_RV770,     "ATI RV770",	0, 0 },
 	{ HDA_ATI_RV810,     "ATI RV810",	0, 0 },
 	{ HDA_ATI_RV830,     "ATI RV830",	0, 0 },
 	{ HDA_ATI_RV840,     "ATI RV840",	0, 0 },
 	{ HDA_ATI_RV870,     "ATI RV870",	0, 0 },
 	{ HDA_ATI_RV910,     "ATI RV910",	0, 0 },
 	{ HDA_ATI_RV930,     "ATI RV930",	0, 0 },
 	{ HDA_ATI_RV940,     "ATI RV940",	0, 0 },
 	{ HDA_ATI_RV970,     "ATI RV970",	0, 0 },
 	{ HDA_ATI_R1000,     "ATI R1000",	0, 0 },
 	{ HDA_RDC_M3010,     "RDC M3010",	0, 0 },
 	{ HDA_VIA_VT82XX,    "VIA VT8251/8237A",0, 0 },
 	{ HDA_SIS_966,       "SiS 966",		0, 0 },
 	{ HDA_ULI_M5461,     "ULI M5461",	0, 0 },
 	/* Unknown */
 	{ HDA_INTEL_ALL,  "Intel",		0, 0 },
 	{ HDA_NVIDIA_ALL, "NVIDIA",		0, 0 },
 	{ HDA_ATI_ALL,    "ATI",		0, 0 },
 	{ HDA_VIA_ALL,    "VIA",		0, 0 },
 	{ HDA_SIS_ALL,    "SiS",		0, 0 },
 	{ HDA_ULI_ALL,    "ULI",		0, 0 },
 };
 
 static const struct {
 	uint16_t vendor;
 	uint8_t reg;
 	uint8_t mask;
 	uint8_t enable;
 } hdac_pcie_snoop[] = {
 	{  INTEL_VENDORID, 0x00, 0x00, 0x00 },
 	{    ATI_VENDORID, 0x42, 0xf8, 0x02 },
 	{ NVIDIA_VENDORID, 0x4e, 0xf0, 0x0f },
 };
 
 /****************************************************************************
  * Function prototypes
  ****************************************************************************/
 static void	hdac_intr_handler(void *);
 static int	hdac_reset(struct hdac_softc *, int);
 static int	hdac_get_capabilities(struct hdac_softc *);
 static void	hdac_dma_cb(void *, bus_dma_segment_t *, int, int);
 static int	hdac_dma_alloc(struct hdac_softc *,
 					struct hdac_dma *, bus_size_t);
 static void	hdac_dma_free(struct hdac_softc *, struct hdac_dma *);
 static int	hdac_mem_alloc(struct hdac_softc *);
 static void	hdac_mem_free(struct hdac_softc *);
 static int	hdac_irq_alloc(struct hdac_softc *);
 static void	hdac_irq_free(struct hdac_softc *);
 static void	hdac_corb_init(struct hdac_softc *);
 static void	hdac_rirb_init(struct hdac_softc *);
 static void	hdac_corb_start(struct hdac_softc *);
 static void	hdac_rirb_start(struct hdac_softc *);
 
 static void	hdac_attach2(void *);
 
 static uint32_t	hdac_send_command(struct hdac_softc *, nid_t, uint32_t);
 
 static int	hdac_probe(device_t);
 static int	hdac_attach(device_t);
 static int	hdac_detach(device_t);
 static int	hdac_suspend(device_t);
 static int	hdac_resume(device_t);
 
 static int	hdac_rirb_flush(struct hdac_softc *sc);
 static int	hdac_unsolq_flush(struct hdac_softc *sc);
 
 #define hdac_command(a1, a2, a3)	\
 		hdac_send_command(a1, a3, a2)
 
 /* This function will surely make its way into an upper level someday. */
 static void
 hdac_config_fetch(struct hdac_softc *sc, uint32_t *on, uint32_t *off)
 {
 	const char *res = NULL;
 	int i = 0, j, k, len, inv;
 
 	if (resource_string_value(device_get_name(sc->dev),
 	    device_get_unit(sc->dev), "config", &res) != 0)
 		return;
 	if (!(res != NULL && strlen(res) > 0))
 		return;
 	HDA_BOOTVERBOSE(
 		device_printf(sc->dev, "Config options:");
 	);
 	for (;;) {
 		while (res[i] != '\0' &&
 		    (res[i] == ',' || isspace(res[i]) != 0))
 			i++;
 		if (res[i] == '\0') {
 			HDA_BOOTVERBOSE(
 				printf("\n");
 			);
 			return;
 		}
 		j = i;
 		while (res[j] != '\0' &&
 		    !(res[j] == ',' || isspace(res[j]) != 0))
 			j++;
 		len = j - i;
 		if (len > 2 && strncmp(res + i, "no", 2) == 0)
 			inv = 2;
 		else
 			inv = 0;
 		for (k = 0; len > inv && k < nitems(hdac_quirks_tab); k++) {
 			if (strncmp(res + i + inv,
 			    hdac_quirks_tab[k].key, len - inv) != 0)
 				continue;
 			if (len - inv != strlen(hdac_quirks_tab[k].key))
 				continue;
 			HDA_BOOTVERBOSE(
 				printf(" %s%s", (inv != 0) ? "no" : "",
 				    hdac_quirks_tab[k].key);
 			);
 			if (inv == 0) {
 				*on |= hdac_quirks_tab[k].value;
 				*off &= ~hdac_quirks_tab[k].value;
 			} else if (inv != 0) {
 				*off |= hdac_quirks_tab[k].value;
 				*on &= ~hdac_quirks_tab[k].value;
 			}
 			break;
 		}
 		i = j;
 	}
 }
 
 /****************************************************************************
  * void hdac_intr_handler(void *)
  *
  * Interrupt handler. Processes interrupts received from the hdac.
  ****************************************************************************/
 static void
 hdac_intr_handler(void *context)
 {
 	struct hdac_softc *sc;
 	device_t dev;
 	uint32_t intsts;
 	uint8_t rirbsts;
 	int i;
 
 	sc = (struct hdac_softc *)context;
 	hdac_lock(sc);
 
 	/* Do we have anything to do? */
 	intsts = HDAC_READ_4(&sc->mem, HDAC_INTSTS);
 	if ((intsts & HDAC_INTSTS_GIS) == 0) {
 		hdac_unlock(sc);
 		return;
 	}
 
 	/* Was this a controller interrupt? */
 	if (intsts & HDAC_INTSTS_CIS) {
 		rirbsts = HDAC_READ_1(&sc->mem, HDAC_RIRBSTS);
 		/* Get as many responses as we can */
 		while (rirbsts & HDAC_RIRBSTS_RINTFL) {
 			HDAC_WRITE_1(&sc->mem,
 			    HDAC_RIRBSTS, HDAC_RIRBSTS_RINTFL);
 			hdac_rirb_flush(sc);
 			rirbsts = HDAC_READ_1(&sc->mem, HDAC_RIRBSTS);
 		}
 		if (sc->unsolq_rp != sc->unsolq_wp)
 			taskqueue_enqueue(taskqueue_thread, &sc->unsolq_task);
 	}
 
 	if (intsts & HDAC_INTSTS_SIS_MASK) {
 		for (i = 0; i < sc->num_ss; i++) {
 			if ((intsts & (1 << i)) == 0)
 				continue;
 			HDAC_WRITE_1(&sc->mem, (i << 5) + HDAC_SDSTS,
 			    HDAC_SDSTS_DESE | HDAC_SDSTS_FIFOE | HDAC_SDSTS_BCIS );
 			if ((dev = sc->streams[i].dev) != NULL) {
 				HDAC_STREAM_INTR(dev,
 				    sc->streams[i].dir, sc->streams[i].stream);
 			}
 		}
 	}
 
 	HDAC_WRITE_4(&sc->mem, HDAC_INTSTS, intsts);
 	hdac_unlock(sc);
 }
 
 static void
 hdac_poll_callback(void *arg)
 {
 	struct hdac_softc *sc = arg;
 
 	if (sc == NULL)
 		return;
 
 	hdac_lock(sc);
 	if (sc->polling == 0) {
 		hdac_unlock(sc);
 		return;
 	}
 	callout_reset(&sc->poll_callout, sc->poll_ival,
 	    hdac_poll_callback, sc);
 	hdac_unlock(sc);
 
 	hdac_intr_handler(sc);
 }
 
 /****************************************************************************
  * int hdac_reset(hdac_softc *, int)
  *
  * Reset the hdac to a quiescent and known state.
  ****************************************************************************/
 static int
 hdac_reset(struct hdac_softc *sc, int wakeup)
 {
 	uint32_t gctl;
 	int count, i;
 
 	/*
 	 * Stop all stream DMA engines.
 	 */
 	for (i = 0; i < sc->num_iss; i++)
 		HDAC_WRITE_4(&sc->mem, HDAC_ISDCTL(sc, i), 0x0);
 	for (i = 0; i < sc->num_oss; i++)
 		HDAC_WRITE_4(&sc->mem, HDAC_OSDCTL(sc, i), 0x0);
 	for (i = 0; i < sc->num_bss; i++)
 		HDAC_WRITE_4(&sc->mem, HDAC_BSDCTL(sc, i), 0x0);
 
 	/*
 	 * Stop Control DMA engines.
 	 */
 	HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, 0x0);
 	HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, 0x0);
 
 	/*
 	 * Reset DMA position buffer.
 	 */
 	HDAC_WRITE_4(&sc->mem, HDAC_DPIBLBASE, 0x0);
 	HDAC_WRITE_4(&sc->mem, HDAC_DPIBUBASE, 0x0);
 
 	/*
 	 * Reset the controller. The reset must remain asserted for
 	 * a minimum of 100us.
 	 */
 	gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL);
 	HDAC_WRITE_4(&sc->mem, HDAC_GCTL, gctl & ~HDAC_GCTL_CRST);
 	count = 10000;
 	do {
 		gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL);
 		if (!(gctl & HDAC_GCTL_CRST))
 			break;
 		DELAY(10);
 	} while	(--count);
 	if (gctl & HDAC_GCTL_CRST) {
 		device_printf(sc->dev, "Unable to put hdac in reset\n");
 		return (ENXIO);
 	}
 
 	/* If wakeup is not requested - leave the controller in reset state. */
 	if (!wakeup)
 		return (0);
 
 	DELAY(100);
 	gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL);
 	HDAC_WRITE_4(&sc->mem, HDAC_GCTL, gctl | HDAC_GCTL_CRST);
 	count = 10000;
 	do {
 		gctl = HDAC_READ_4(&sc->mem, HDAC_GCTL);
 		if (gctl & HDAC_GCTL_CRST)
 			break;
 		DELAY(10);
 	} while (--count);
 	if (!(gctl & HDAC_GCTL_CRST)) {
 		device_printf(sc->dev, "Device stuck in reset\n");
 		return (ENXIO);
 	}
 
 	/*
 	 * Wait for codecs to finish their own reset sequence. The delay here
 	 * should be 250us, but for some reason that is not enough on my
 	 * computer. Use 1000us, four times the nominal delay, to make sure
 	 * that the codecs are reset properly.
 	 */
 	DELAY(1000);
 
 	return (0);
 }
 
 
 /****************************************************************************
  * int hdac_get_capabilities(struct hdac_softc *);
  *
  * Retrieve the general capabilities of the hdac:
  *	Number of Input Streams
  *	Number of Output Streams
  *	Number of bidirectional Streams
  *	64bit ready
  *	CORB and RIRB sizes
  ****************************************************************************/
 static int
 hdac_get_capabilities(struct hdac_softc *sc)
 {
 	uint16_t gcap;
 	uint8_t corbsize, rirbsize;
 
 	gcap = HDAC_READ_2(&sc->mem, HDAC_GCAP);
 	sc->num_iss = HDAC_GCAP_ISS(gcap);
 	sc->num_oss = HDAC_GCAP_OSS(gcap);
 	sc->num_bss = HDAC_GCAP_BSS(gcap);
 	sc->num_ss = sc->num_iss + sc->num_oss + sc->num_bss;
 	sc->num_sdo = HDAC_GCAP_NSDO(gcap);
 	sc->support_64bit = (gcap & HDAC_GCAP_64OK) != 0;
 	if (sc->quirks_on & HDAC_QUIRK_64BIT)
 		sc->support_64bit = 1;
 	else if (sc->quirks_off & HDAC_QUIRK_64BIT)
 		sc->support_64bit = 0;
 
 	corbsize = HDAC_READ_1(&sc->mem, HDAC_CORBSIZE);
 	if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_256) ==
 	    HDAC_CORBSIZE_CORBSZCAP_256)
 		sc->corb_size = 256;
 	else if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_16) ==
 	    HDAC_CORBSIZE_CORBSZCAP_16)
 		sc->corb_size = 16;
 	else if ((corbsize & HDAC_CORBSIZE_CORBSZCAP_2) ==
 	    HDAC_CORBSIZE_CORBSZCAP_2)
 		sc->corb_size = 2;
 	else {
 		device_printf(sc->dev, "%s: Invalid corb size (%x)\n",
 		    __func__, corbsize);
 		return (ENXIO);
 	}
 
 	rirbsize = HDAC_READ_1(&sc->mem, HDAC_RIRBSIZE);
 	if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_256) ==
 	    HDAC_RIRBSIZE_RIRBSZCAP_256)
 		sc->rirb_size = 256;
 	else if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_16) ==
 	    HDAC_RIRBSIZE_RIRBSZCAP_16)
 		sc->rirb_size = 16;
 	else if ((rirbsize & HDAC_RIRBSIZE_RIRBSZCAP_2) ==
 	    HDAC_RIRBSIZE_RIRBSZCAP_2)
 		sc->rirb_size = 2;
 	else {
 		device_printf(sc->dev, "%s: Invalid rirb size (%x)\n",
 		    __func__, rirbsize);
 		return (ENXIO);
 	}
 
 	HDA_BOOTVERBOSE(
 		device_printf(sc->dev, "Caps: OSS %d, ISS %d, BSS %d, "
 		    "NSDO %d%s, CORB %d, RIRB %d\n",
 		    sc->num_oss, sc->num_iss, sc->num_bss, 1 << sc->num_sdo,
 		    sc->support_64bit ? ", 64bit" : "",
 		    sc->corb_size, sc->rirb_size);
 	);
 
 	return (0);
 }
 
 
 /****************************************************************************
  * void hdac_dma_cb
  *
  * This function is called by bus_dmamap_load when the mapping has been
  * established. We just record the physical address of the mapping into
  * the struct hdac_dma passed in.
  ****************************************************************************/
 static void
 hdac_dma_cb(void *callback_arg, bus_dma_segment_t *segs, int nseg, int error)
 {
 	struct hdac_dma *dma;
 
 	if (error == 0) {
 		dma = (struct hdac_dma *)callback_arg;
 		dma->dma_paddr = segs[0].ds_addr;
 	}
 }
 
 
 /****************************************************************************
  * int hdac_dma_alloc
  *
  * This function allocates and sets up a DMA region (struct hdac_dma).
  * It must be freed by a corresponding hdac_dma_free.
  ****************************************************************************/
 static int
 hdac_dma_alloc(struct hdac_softc *sc, struct hdac_dma *dma, bus_size_t size)
 {
 	bus_size_t roundsz;
 	int result;
 
 	roundsz = roundup2(size, HDA_DMA_ALIGNMENT);
 	bzero(dma, sizeof(*dma));
 
 	/*
 	 * Create a DMA tag
 	 */
 	result = bus_dma_tag_create(
 	    bus_get_dma_tag(sc->dev),		/* parent */
 	    HDA_DMA_ALIGNMENT,			/* alignment */
 	    0,					/* boundary */
 	    (sc->support_64bit) ? BUS_SPACE_MAXADDR :
 		BUS_SPACE_MAXADDR_32BIT,	/* lowaddr */
 	    BUS_SPACE_MAXADDR,			/* highaddr */
 	    NULL,				/* filtfunc */
 	    NULL,				/* filtfuncarg */
 	    roundsz, 				/* maxsize */
 	    1,					/* nsegments */
 	    roundsz, 				/* maxsegsz */
 	    0,					/* flags */
 	    NULL,				/* lockfunc */
 	    NULL,				/* lockfuncarg */
 	    &dma->dma_tag);			/* dmat */
 	if (result != 0) {
 		device_printf(sc->dev, "%s: bus_dma_tag_create failed (%x)\n",
 		    __func__, result);
 		goto hdac_dma_alloc_fail;
 	}
 
 	/*
 	 * Allocate DMA memory
 	 */
 	result = bus_dmamem_alloc(dma->dma_tag, (void **)&dma->dma_vaddr,
 	    BUS_DMA_NOWAIT | BUS_DMA_ZERO |
 	    ((sc->flags & HDAC_F_DMA_NOCACHE) ? BUS_DMA_NOCACHE : 0),
 	    &dma->dma_map);
 	if (result != 0) {
 		device_printf(sc->dev, "%s: bus_dmamem_alloc failed (%x)\n",
 		    __func__, result);
 		goto hdac_dma_alloc_fail;
 	}
 
 	dma->dma_size = roundsz;
 
 	/*
 	 * Map the memory
 	 */
 	result = bus_dmamap_load(dma->dma_tag, dma->dma_map,
 	    (void *)dma->dma_vaddr, roundsz, hdac_dma_cb, (void *)dma, 0);
 	if (result != 0 || dma->dma_paddr == 0) {
 		if (result == 0)
 			result = ENOMEM;
 		device_printf(sc->dev, "%s: bus_dmamap_load failed (%x)\n",
 		    __func__, result);
 		goto hdac_dma_alloc_fail;
 	}
 
 	HDA_BOOTHVERBOSE(
 		device_printf(sc->dev, "%s: size=%ju -> roundsz=%ju\n",
 		    __func__, (uintmax_t)size, (uintmax_t)roundsz);
 	);
 
 	return (0);
 
 hdac_dma_alloc_fail:
 	hdac_dma_free(sc, dma);
 
 	return (result);
 }
 
 
 /****************************************************************************
  * void hdac_dma_free(struct hdac_softc *, struct hdac_dma *)
  *
  * Free a struct hdac_dma that has been previously allocated via the
  * hdac_dma_alloc function.
  ****************************************************************************/
 static void
 hdac_dma_free(struct hdac_softc *sc, struct hdac_dma *dma)
 {
 	if (dma->dma_paddr != 0) {
 #if 0
 		/* Flush caches */
 		bus_dmamap_sync(dma->dma_tag, dma->dma_map,
 		    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
 #endif
 		bus_dmamap_unload(dma->dma_tag, dma->dma_map);
 		dma->dma_paddr = 0;
 	}
 	if (dma->dma_vaddr != NULL) {
 		bus_dmamem_free(dma->dma_tag, dma->dma_vaddr, dma->dma_map);
 		dma->dma_vaddr = NULL;
 	}
 	if (dma->dma_tag != NULL) {
 		bus_dma_tag_destroy(dma->dma_tag);
 		dma->dma_tag = NULL;
 	}
 	dma->dma_size = 0;
 }
 
 /****************************************************************************
  * int hdac_mem_alloc(struct hdac_softc *)
  *
  * Allocate all the bus resources necessary to speak with the physical
  * controller.
  ****************************************************************************/
 static int
 hdac_mem_alloc(struct hdac_softc *sc)
 {
 	struct hdac_mem *mem;
 
 	mem = &sc->mem;
 	mem->mem_rid = PCIR_BAR(0);
 	mem->mem_res = bus_alloc_resource_any(sc->dev, SYS_RES_MEMORY,
 	    &mem->mem_rid, RF_ACTIVE);
 	if (mem->mem_res == NULL) {
 		device_printf(sc->dev,
 		    "%s: Unable to allocate memory resource\n", __func__);
 		return (ENOMEM);
 	}
 	mem->mem_tag = rman_get_bustag(mem->mem_res);
 	mem->mem_handle = rman_get_bushandle(mem->mem_res);
 
 	return (0);
 }
 
 /****************************************************************************
  * void hdac_mem_free(struct hdac_softc *)
  *
  * Free up resources previously allocated by hdac_mem_alloc.
  ****************************************************************************/
 static void
 hdac_mem_free(struct hdac_softc *sc)
 {
 	struct hdac_mem *mem;
 
 	mem = &sc->mem;
 	if (mem->mem_res != NULL)
 		bus_release_resource(sc->dev, SYS_RES_MEMORY, mem->mem_rid,
 		    mem->mem_res);
 	mem->mem_res = NULL;
 }
 
 /****************************************************************************
  * int hdac_irq_alloc(struct hdac_softc *)
  *
  * Allocate and setup the resources necessary for interrupt handling.
  ****************************************************************************/
 static int
 hdac_irq_alloc(struct hdac_softc *sc)
 {
 	struct hdac_irq *irq;
 	int result;
 
 	irq = &sc->irq;
 	irq->irq_rid = 0x0;
 
 	if ((sc->quirks_off & HDAC_QUIRK_MSI) == 0 &&
 	    (result = pci_msi_count(sc->dev)) == 1 &&
 	    pci_alloc_msi(sc->dev, &result) == 0)
 		irq->irq_rid = 0x1;
 
 	irq->irq_res = bus_alloc_resource_any(sc->dev, SYS_RES_IRQ,
 	    &irq->irq_rid, RF_SHAREABLE | RF_ACTIVE);
 	if (irq->irq_res == NULL) {
 		device_printf(sc->dev, "%s: Unable to allocate irq\n",
 		    __func__);
 		goto hdac_irq_alloc_fail;
 	}
 	result = bus_setup_intr(sc->dev, irq->irq_res, INTR_MPSAFE | INTR_TYPE_AV,
 	    NULL, hdac_intr_handler, sc, &irq->irq_handle);
 	if (result != 0) {
 		device_printf(sc->dev,
 		    "%s: Unable to setup interrupt handler (%x)\n",
 		    __func__, result);
 		goto hdac_irq_alloc_fail;
 	}
 
 	return (0);
 
 hdac_irq_alloc_fail:
 	hdac_irq_free(sc);
 
 	return (ENXIO);
 }
 
 /****************************************************************************
  * void hdac_irq_free(struct hdac_softc *)
  *
  * Free up resources previously allocated by hdac_irq_alloc.
  ****************************************************************************/
 static void
 hdac_irq_free(struct hdac_softc *sc)
 {
 	struct hdac_irq *irq;
 
 	irq = &sc->irq;
 	if (irq->irq_res != NULL && irq->irq_handle != NULL)
 		bus_teardown_intr(sc->dev, irq->irq_res, irq->irq_handle);
 	if (irq->irq_res != NULL)
 		bus_release_resource(sc->dev, SYS_RES_IRQ, irq->irq_rid,
 		    irq->irq_res);
 	if (irq->irq_rid == 0x1)
 		pci_release_msi(sc->dev);
 	irq->irq_handle = NULL;
 	irq->irq_res = NULL;
 	irq->irq_rid = 0x0;
 }
 
 /****************************************************************************
  * void hdac_corb_init(struct hdac_softc *)
  *
  * Initialize the corb registers for operations but do not start it up yet.
  * The CORB engine must not be running when this function is called.
  ****************************************************************************/
 static void
 hdac_corb_init(struct hdac_softc *sc)
 {
 	uint8_t corbsize;
 	uint64_t corbpaddr;
 
 	/* Setup the CORB size. */
 	switch (sc->corb_size) {
 	case 256:
 		corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_256);
 		break;
 	case 16:
 		corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_16);
 		break;
 	case 2:
 		corbsize = HDAC_CORBSIZE_CORBSIZE(HDAC_CORBSIZE_CORBSIZE_2);
 		break;
 	default:
 		panic("%s: Invalid CORB size (%x)\n", __func__, sc->corb_size);
 	}
 	HDAC_WRITE_1(&sc->mem, HDAC_CORBSIZE, corbsize);
 
 	/* Setup the CORB Address in the hdac */
 	corbpaddr = (uint64_t)sc->corb_dma.dma_paddr;
 	HDAC_WRITE_4(&sc->mem, HDAC_CORBLBASE, (uint32_t)corbpaddr);
 	HDAC_WRITE_4(&sc->mem, HDAC_CORBUBASE, (uint32_t)(corbpaddr >> 32));
 
 	/* Set the WP and RP */
 	sc->corb_wp = 0;
 	HDAC_WRITE_2(&sc->mem, HDAC_CORBWP, sc->corb_wp);
 	HDAC_WRITE_2(&sc->mem, HDAC_CORBRP, HDAC_CORBRP_CORBRPRST);
 	/*
 	 * The HDA specification indicates that the CORBRPRST bit will always
 	 * read as zero. Unfortunately, it seems that at least the 82801G
 	 * doesn't reset the bit to zero, which stalls the corb engine.
 	 * Manually reset the bit to zero before continuing.
 	 */
 	HDAC_WRITE_2(&sc->mem, HDAC_CORBRP, 0x0);
 
 	/* Enable CORB error reporting */
 #if 0
 	HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, HDAC_CORBCTL_CMEIE);
 #endif
 }
 
 /****************************************************************************
  * void hdac_rirb_init(struct hdac_softc *)
  *
  * Initialize the rirb registers for operations but do not start it up yet.
  * The RIRB engine must not be running when this function is called.
  ****************************************************************************/
 static void
 hdac_rirb_init(struct hdac_softc *sc)
 {
 	uint8_t rirbsize;
 	uint64_t rirbpaddr;
 
 	/* Setup the RIRB size. */
 	switch (sc->rirb_size) {
 	case 256:
 		rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_256);
 		break;
 	case 16:
 		rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_16);
 		break;
 	case 2:
 		rirbsize = HDAC_RIRBSIZE_RIRBSIZE(HDAC_RIRBSIZE_RIRBSIZE_2);
 		break;
 	default:
 		panic("%s: Invalid RIRB size (%x)\n", __func__, sc->rirb_size);
 	}
 	HDAC_WRITE_1(&sc->mem, HDAC_RIRBSIZE, rirbsize);
 
 	/* Setup the RIRB Address in the hdac */
 	rirbpaddr = (uint64_t)sc->rirb_dma.dma_paddr;
 	HDAC_WRITE_4(&sc->mem, HDAC_RIRBLBASE, (uint32_t)rirbpaddr);
 	HDAC_WRITE_4(&sc->mem, HDAC_RIRBUBASE, (uint32_t)(rirbpaddr >> 32));
 
 	/* Setup the WP and RP */
 	sc->rirb_rp = 0;
 	HDAC_WRITE_2(&sc->mem, HDAC_RIRBWP, HDAC_RIRBWP_RIRBWPRST);
 
 	/* Setup the interrupt threshold */
 	HDAC_WRITE_2(&sc->mem, HDAC_RINTCNT, sc->rirb_size / 2);
 
 	/* Enable Overrun and response received reporting */
 #if 0
 	HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL,
 	    HDAC_RIRBCTL_RIRBOIC | HDAC_RIRBCTL_RINTCTL);
 #else
 	HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, HDAC_RIRBCTL_RINTCTL);
 #endif
 
 #if 0
 	/*
 	 * Make sure that the Host CPU cache doesn't contain any dirty
 	 * cache lines that fall in the rirb. If I understood correctly, it
 	 * should be sufficient to do this only once as the rirb is purely
 	 * read-only from now on.
 	 */
 	bus_dmamap_sync(sc->rirb_dma.dma_tag, sc->rirb_dma.dma_map,
 	    BUS_DMASYNC_PREREAD);
 #endif
 }
 
 /****************************************************************************
  * void hdac_corb_start(hdac_softc *)
  *
  * Startup the corb DMA engine
  ****************************************************************************/
 static void
 hdac_corb_start(struct hdac_softc *sc)
 {
 	uint32_t corbctl;
 
 	corbctl = HDAC_READ_1(&sc->mem, HDAC_CORBCTL);
 	corbctl |= HDAC_CORBCTL_CORBRUN;
 	HDAC_WRITE_1(&sc->mem, HDAC_CORBCTL, corbctl);
 }
 
 /****************************************************************************
  * void hdac_rirb_start(hdac_softc *)
  *
  * Startup the rirb DMA engine
  ****************************************************************************/
 static void
 hdac_rirb_start(struct hdac_softc *sc)
 {
 	uint32_t rirbctl;
 
 	rirbctl = HDAC_READ_1(&sc->mem, HDAC_RIRBCTL);
 	rirbctl |= HDAC_RIRBCTL_RIRBDMAEN;
 	HDAC_WRITE_1(&sc->mem, HDAC_RIRBCTL, rirbctl);
 }
 
 static int
 hdac_rirb_flush(struct hdac_softc *sc)
 {
 	struct hdac_rirb *rirb_base, *rirb;
 	nid_t cad;
 	uint32_t resp;
 	uint8_t rirbwp;
 	int ret;
 
 	rirb_base = (struct hdac_rirb *)sc->rirb_dma.dma_vaddr;
 	rirbwp = HDAC_READ_1(&sc->mem, HDAC_RIRBWP);
 #if 0
 	bus_dmamap_sync(sc->rirb_dma.dma_tag, sc->rirb_dma.dma_map,
 	    BUS_DMASYNC_POSTREAD);
 #endif
 
 	ret = 0;
 	while (sc->rirb_rp != rirbwp) {
 		sc->rirb_rp++;
 		sc->rirb_rp %= sc->rirb_size;
 		rirb = &rirb_base[sc->rirb_rp];
 		cad = HDAC_RIRB_RESPONSE_EX_SDATA_IN(rirb->response_ex);
 		resp = rirb->response;
 		if (rirb->response_ex & HDAC_RIRB_RESPONSE_EX_UNSOLICITED) {
 			sc->unsolq[sc->unsolq_wp++] = resp;
 			sc->unsolq_wp %= HDAC_UNSOLQ_MAX;
 			sc->unsolq[sc->unsolq_wp++] = cad;
 			sc->unsolq_wp %= HDAC_UNSOLQ_MAX;
 		} else if (sc->codecs[cad].pending <= 0) {
 			device_printf(sc->dev, "Unexpected unsolicited "
 			    "response from address %d: %08x\n", cad, resp);
 		} else {
 			sc->codecs[cad].response = resp;
 			sc->codecs[cad].pending--;
 		}
 		ret++;
 	}
 	return (ret);
 }
 
 static int
 hdac_unsolq_flush(struct hdac_softc *sc)
 {
 	device_t child;
 	nid_t cad;
 	uint32_t resp;
 	int ret = 0;
 
 	if (sc->unsolq_st == HDAC_UNSOLQ_READY) {
 		sc->unsolq_st = HDAC_UNSOLQ_BUSY;
 		while (sc->unsolq_rp != sc->unsolq_wp) {
 			resp = sc->unsolq[sc->unsolq_rp++];
 			sc->unsolq_rp %= HDAC_UNSOLQ_MAX;
 			cad = sc->unsolq[sc->unsolq_rp++];
 			sc->unsolq_rp %= HDAC_UNSOLQ_MAX;
 			if ((child = sc->codecs[cad].dev) != NULL)
 				HDAC_UNSOL_INTR(child, resp);
 			ret++;
 		}
 		sc->unsolq_st = HDAC_UNSOLQ_READY;
 	}
 
 	return (ret);
 }
 
 /****************************************************************************
  * uint32_t hdac_send_command(struct hdac_softc *, nid_t, uint32_t)
  *
  * Wrapper function that sends only one command to a given codec
  ****************************************************************************/
 static uint32_t
 hdac_send_command(struct hdac_softc *sc, nid_t cad, uint32_t verb)
 {
 	int timeout;
 	uint32_t *corb;
 
 	if (!hdac_lockowned(sc))
 		device_printf(sc->dev, "WARNING!!!! mtx not owned!!!!\n");
 	verb &= ~HDA_CMD_CAD_MASK;
 	verb |= ((uint32_t)cad) << HDA_CMD_CAD_SHIFT;
 	sc->codecs[cad].response = HDA_INVALID;
 
 	sc->codecs[cad].pending++;
 	sc->corb_wp++;
 	sc->corb_wp %= sc->corb_size;
 	corb = (uint32_t *)sc->corb_dma.dma_vaddr;
 #if 0
 	bus_dmamap_sync(sc->corb_dma.dma_tag,
 	    sc->corb_dma.dma_map, BUS_DMASYNC_PREWRITE);
 #endif
 	corb[sc->corb_wp] = verb;
 #if 0
 	bus_dmamap_sync(sc->corb_dma.dma_tag,
 	    sc->corb_dma.dma_map, BUS_DMASYNC_POSTWRITE);
 #endif
 	HDAC_WRITE_2(&sc->mem, HDAC_CORBWP, sc->corb_wp);
 
 	timeout = 10000;
 	do {
 		if (hdac_rirb_flush(sc) == 0)
 			DELAY(10);
 	} while (sc->codecs[cad].pending != 0 && --timeout);
 
 	if (sc->codecs[cad].pending != 0) {
 		device_printf(sc->dev, "Command timeout on address %d\n", cad);
 		sc->codecs[cad].pending = 0;
 	}
 
 	if (sc->unsolq_rp != sc->unsolq_wp)
 		taskqueue_enqueue(taskqueue_thread, &sc->unsolq_task);
 	return (sc->codecs[cad].response);
 }
 
 /****************************************************************************
  * Device Methods
  ****************************************************************************/
 
 /****************************************************************************
  * int hdac_probe(device_t)
  *
  * Probe for the presence of an hdac. If none is found, check for a generic
  * match using the subclass of the device.
  ****************************************************************************/
 static int
 hdac_probe(device_t dev)
 {
 	int i, result;
 	uint32_t model;
 	uint16_t class, subclass;
 	char desc[64];
 
 	model = (uint32_t)pci_get_device(dev) << 16;
 	model |= (uint32_t)pci_get_vendor(dev) & 0x0000ffff;
 	class = pci_get_class(dev);
 	subclass = pci_get_subclass(dev);
 
 	bzero(desc, sizeof(desc));
 	result = ENXIO;
 	for (i = 0; i < nitems(hdac_devices); i++) {
 		if (hdac_devices[i].model == model) {
 			strlcpy(desc, hdac_devices[i].desc, sizeof(desc));
 			result = BUS_PROBE_DEFAULT;
 			break;
 		}
 		if (HDA_DEV_MATCH(hdac_devices[i].model, model) &&
 		    class == PCIC_MULTIMEDIA &&
 		    subclass == PCIS_MULTIMEDIA_HDA) {
 			snprintf(desc, sizeof(desc),
 			    "%s (0x%04x)",
 			    hdac_devices[i].desc, pci_get_device(dev));
 			result = BUS_PROBE_GENERIC;
 			break;
 		}
 	}
 	if (result == ENXIO && class == PCIC_MULTIMEDIA &&
 	    subclass == PCIS_MULTIMEDIA_HDA) {
 		snprintf(desc, sizeof(desc), "Generic (0x%08x)", model);
 		result = BUS_PROBE_GENERIC;
 	}
 	if (result != ENXIO) {
 		strlcat(desc, " HDA Controller", sizeof(desc));
 		device_set_desc_copy(dev, desc);
 	}
 
 	return (result);
 }
 
 static void
 hdac_unsolq_task(void *context, int pending)
 {
 	struct hdac_softc *sc;
 
 	sc = (struct hdac_softc *)context;
 
 	hdac_lock(sc);
 	hdac_unsolq_flush(sc);
 	hdac_unlock(sc);
 }
 
 /****************************************************************************
  * int hdac_attach(device_t)
  *
  * Attach the device into the kernel. Interrupts usually won't be enabled
  * when this function is called. Setup everything that doesn't require
  * interrupts and defer probing of codecs until interrupts are enabled.
  ****************************************************************************/
 static int
 hdac_attach(device_t dev)
 {
 	struct hdac_softc *sc;
 	int result;
 	int i, devid = -1;
 	uint32_t model;
 	uint16_t class, subclass;
 	uint16_t vendor;
 	uint8_t v;
 
 	sc = device_get_softc(dev);
 	HDA_BOOTVERBOSE(
 		device_printf(dev, "PCI card vendor: 0x%04x, device: 0x%04x\n",
 		    pci_get_subvendor(dev), pci_get_subdevice(dev));
 		device_printf(dev, "HDA Driver Revision: %s\n",
 		    HDA_DRV_TEST_REV);
 	);
 
 	model = (uint32_t)pci_get_device(dev) << 16;
 	model |= (uint32_t)pci_get_vendor(dev) & 0x0000ffff;
 	class = pci_get_class(dev);
 	subclass = pci_get_subclass(dev);
 
 	for (i = 0; i < nitems(hdac_devices); i++) {
 		if (hdac_devices[i].model == model) {
 			devid = i;
 			break;
 		}
 		if (HDA_DEV_MATCH(hdac_devices[i].model, model) &&
 		    class == PCIC_MULTIMEDIA &&
 		    subclass == PCIS_MULTIMEDIA_HDA) {
 			devid = i;
 			break;
 		}
 	}
 
 	sc->lock = snd_mtxcreate(device_get_nameunit(dev), "HDA driver mutex");
 	sc->dev = dev;
 	TASK_INIT(&sc->unsolq_task, 0, hdac_unsolq_task, sc);
 	callout_init(&sc->poll_callout, CALLOUT_MPSAFE);
 	for (i = 0; i < HDAC_CODEC_MAX; i++)
 		sc->codecs[i].dev = NULL;
 	if (devid >= 0) {
 		sc->quirks_on = hdac_devices[devid].quirks_on;
 		sc->quirks_off = hdac_devices[devid].quirks_off;
 	} else {
 		sc->quirks_on = 0;
 		sc->quirks_off = 0;
 	}
 	if (resource_int_value(device_get_name(dev),
 	    device_get_unit(dev), "msi", &i) == 0) {
 		if (i == 0)
 			sc->quirks_off |= HDAC_QUIRK_MSI;
 		else {
 			sc->quirks_on |= HDAC_QUIRK_MSI;
 			sc->quirks_off |= ~HDAC_QUIRK_MSI;
 		}
 	}
 	hdac_config_fetch(sc, &sc->quirks_on, &sc->quirks_off);
 	HDA_BOOTVERBOSE(
 		device_printf(sc->dev,
 		    "Config options: on=0x%08x off=0x%08x\n",
 		    sc->quirks_on, sc->quirks_off);
 	);
 	sc->poll_ival = hz;
 	if (resource_int_value(device_get_name(dev),
 	    device_get_unit(dev), "polling", &i) == 0 && i != 0)
 		sc->polling = 1;
 	else
 		sc->polling = 0;
 
 	pci_enable_busmaster(dev);
 
 	vendor = pci_get_vendor(dev);
 	if (vendor == INTEL_VENDORID) {
 		/* TCSEL -> TC0 */
 		v = pci_read_config(dev, 0x44, 1);
 		pci_write_config(dev, 0x44, v & 0xf8, 1);
 		HDA_BOOTHVERBOSE(
 			device_printf(dev, "TCSEL: 0x%02d -> 0x%02d\n", v,
 			    pci_read_config(dev, 0x44, 1));
 		);
 	}
 
 #if defined(__i386__) || defined(__amd64__)
 	sc->flags |= HDAC_F_DMA_NOCACHE;
 
 	if (resource_int_value(device_get_name(dev),
 	    device_get_unit(dev), "snoop", &i) == 0 && i != 0) {
 #else
 	sc->flags &= ~HDAC_F_DMA_NOCACHE;
 #endif
 		/*
 		 * Try to enable PCIe snoop to avoid messing around with
 		 * uncacheable DMA attribute. Since PCIe snoop register
 		 * config is pretty much vendor specific, there are no
 		 * general solutions on how to enable it, forcing us (even
 		 * Microsoft) to enable uncacheable or write combined DMA
 		 * by default.
 		 *
 		 * http://msdn2.microsoft.com/en-us/library/ms790324.aspx
 		 */
 		for (i = 0; i < nitems(hdac_pcie_snoop); i++) {
 			if (hdac_pcie_snoop[i].vendor != vendor)
 				continue;
 			sc->flags &= ~HDAC_F_DMA_NOCACHE;
 			if (hdac_pcie_snoop[i].reg == 0x00)
 				break;
 			v = pci_read_config(dev, hdac_pcie_snoop[i].reg, 1);
 			if ((v & hdac_pcie_snoop[i].enable) ==
 			    hdac_pcie_snoop[i].enable)
 				break;
 			v &= hdac_pcie_snoop[i].mask;
 			v |= hdac_pcie_snoop[i].enable;
 			pci_write_config(dev, hdac_pcie_snoop[i].reg, v, 1);
 			v = pci_read_config(dev, hdac_pcie_snoop[i].reg, 1);
 			if ((v & hdac_pcie_snoop[i].enable) !=
 			    hdac_pcie_snoop[i].enable) {
 				HDA_BOOTVERBOSE(
 					device_printf(dev,
 					    "WARNING: Failed to enable PCIe "
 					    "snoop!\n");
 				);
 #if defined(__i386__) || defined(__amd64__)
 				sc->flags |= HDAC_F_DMA_NOCACHE;
 #endif
 			}
 			break;
 		}
 #if defined(__i386__) || defined(__amd64__)
 	}
 #endif
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "DMA Coherency: %s / vendor=0x%04x\n",
 		    (sc->flags & HDAC_F_DMA_NOCACHE) ?
 		    "Uncacheable" : "PCIe snoop", vendor);
 	);
 
 	/* Allocate resources */
 	result = hdac_mem_alloc(sc);
 	if (result != 0)
 		goto hdac_attach_fail;
 	result = hdac_irq_alloc(sc);
 	if (result != 0)
 		goto hdac_attach_fail;
 
 	/* Get Capabilities */
 	result = hdac_get_capabilities(sc);
 	if (result != 0)
 		goto hdac_attach_fail;
 
 	/* Allocate CORB, RIRB, POS and BDLs dma memory */
 	result = hdac_dma_alloc(sc, &sc->corb_dma,
 	    sc->corb_size * sizeof(uint32_t));
 	if (result != 0)
 		goto hdac_attach_fail;
 	result = hdac_dma_alloc(sc, &sc->rirb_dma,
 	    sc->rirb_size * sizeof(struct hdac_rirb));
 	if (result != 0)
 		goto hdac_attach_fail;
 	sc->streams = malloc(sizeof(struct hdac_stream) * sc->num_ss,
 	    M_HDAC, M_ZERO | M_WAITOK);
 	for (i = 0; i < sc->num_ss; i++) {
 		result = hdac_dma_alloc(sc, &sc->streams[i].bdl,
 		    sizeof(struct hdac_bdle) * HDA_BDL_MAX);
 		if (result != 0)
 			goto hdac_attach_fail;
 	}
 	if (sc->quirks_on & HDAC_QUIRK_DMAPOS) {
 		if (hdac_dma_alloc(sc, &sc->pos_dma, (sc->num_ss) * 8) != 0) {
 			HDA_BOOTVERBOSE(
 				device_printf(dev, "Failed to "
 				    "allocate DMA pos buffer "
 				    "(non-fatal)\n");
 			);
 		} else {
 			uint64_t addr = sc->pos_dma.dma_paddr;
 
 			HDAC_WRITE_4(&sc->mem, HDAC_DPIBUBASE, addr >> 32);
 			HDAC_WRITE_4(&sc->mem, HDAC_DPIBLBASE,
 			    (addr & HDAC_DPLBASE_DPLBASE_MASK) |
 			    HDAC_DPLBASE_DPLBASE_DMAPBE);
 		}
 	}
 
 	result = bus_dma_tag_create(
 	    bus_get_dma_tag(sc->dev),		/* parent */
 	    HDA_DMA_ALIGNMENT,			/* alignment */
 	    0,					/* boundary */
 	    (sc->support_64bit) ? BUS_SPACE_MAXADDR :
 		BUS_SPACE_MAXADDR_32BIT,	/* lowaddr */
 	    BUS_SPACE_MAXADDR,			/* highaddr */
 	    NULL,				/* filtfunc */
 	    NULL,				/* fistfuncarg */
 	    HDA_BUFSZ_MAX, 			/* maxsize */
 	    1,					/* nsegments */
 	    HDA_BUFSZ_MAX, 			/* maxsegsz */
 	    0,					/* flags */
 	    NULL,				/* lockfunc */
 	    NULL,				/* lockfuncarg */
 	    &sc->chan_dmat);			/* dmat */
 	if (result != 0) {
 		device_printf(dev, "%s: bus_dma_tag_create failed (%x)\n",
 		     __func__, result);
 		goto hdac_attach_fail;
 	}
 
 	/* Quiesce everything */
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Reset controller...\n");
 	);
 	hdac_reset(sc, 1);
 
 	/* Initialize the CORB and RIRB */
 	hdac_corb_init(sc);
 	hdac_rirb_init(sc);
 
 	/* Defer remaining of initialization until interrupts are enabled */
 	sc->intrhook.ich_func = hdac_attach2;
 	sc->intrhook.ich_arg = (void *)sc;
 	if (cold == 0 || config_intrhook_establish(&sc->intrhook) != 0) {
 		sc->intrhook.ich_func = NULL;
 		hdac_attach2((void *)sc);
 	}
 
 	return (0);
 
 hdac_attach_fail:
 	hdac_irq_free(sc);
 	for (i = 0; i < sc->num_ss; i++)
 		hdac_dma_free(sc, &sc->streams[i].bdl);
 	free(sc->streams, M_HDAC);
 	hdac_dma_free(sc, &sc->rirb_dma);
 	hdac_dma_free(sc, &sc->corb_dma);
 	hdac_mem_free(sc);
 	snd_mtxfree(sc->lock);
 
 	return (ENXIO);
 }
 
 static int
 sysctl_hdac_pindump(SYSCTL_HANDLER_ARGS)
 {
 	struct hdac_softc *sc;
 	device_t *devlist;
 	device_t dev;
 	int devcount, i, err, val;
 
 	dev = oidp->oid_arg1;
 	sc = device_get_softc(dev);
 	if (sc == NULL)
 		return (EINVAL);
 	val = 0;
 	err = sysctl_handle_int(oidp, &val, 0, req);
 	if (err != 0 || req->newptr == NULL || val == 0)
 		return (err);
 
 	/* XXX: Temporary. For debugging. */
 	if (val == 100) {
 		hdac_suspend(dev);
 		return (0);
 	} else if (val == 101) {
 		hdac_resume(dev);
 		return (0);
 	}
 
 	if ((err = device_get_children(dev, &devlist, &devcount)) != 0)
 		return (err);
 	hdac_lock(sc);
 	for (i = 0; i < devcount; i++)
 		HDAC_PINDUMP(devlist[i]);
 	hdac_unlock(sc);
 	free(devlist, M_TEMP);
 	return (0);
 }
 
 static int
 hdac_mdata_rate(uint16_t fmt)
 {
 	static const int mbits[8] = { 8, 16, 32, 32, 32, 32, 32, 32 };
 	int rate, bits;
 
 	if (fmt & (1 << 14))
 		rate = 44100;
 	else
 		rate = 48000;
 	rate *= ((fmt >> 11) & 0x07) + 1;
 	rate /= ((fmt >> 8) & 0x07) + 1;
 	bits = mbits[(fmt >> 4) & 0x03];
 	bits *= (fmt & 0x0f) + 1;
 	return (rate * bits);
 }
 
 static int
 hdac_bdata_rate(uint16_t fmt, int output)
 {
 	static const int bbits[8] = { 8, 16, 20, 24, 32, 32, 32, 32 };
 	int rate, bits;
 
 	rate = 48000;
 	rate *= ((fmt >> 11) & 0x07) + 1;
 	bits = bbits[(fmt >> 4) & 0x03];
 	bits *= (fmt & 0x0f) + 1;
 	if (!output)
 		bits = ((bits + 7) & ~0x07) + 10;
 	return (rate * bits);
 }
 
 static void
 hdac_poll_reinit(struct hdac_softc *sc)
 {
 	int i, pollticks, min = 1000000;
 	struct hdac_stream *s;
 
 	if (sc->polling == 0)
 		return;
 	if (sc->unsol_registered > 0)
 		min = hz / 2;
 	for (i = 0; i < sc->num_ss; i++) {
 		s = &sc->streams[i];
 		if (s->running == 0)
 			continue;
 		pollticks = ((uint64_t)hz * s->blksz) /
 		    (hdac_mdata_rate(s->format) / 8);
 		pollticks >>= 1;
 		if (pollticks > hz)
 			pollticks = hz;
 		if (pollticks < 1) {
 			HDA_BOOTVERBOSE(
 				device_printf(sc->dev,
 				    "poll interval < 1 tick !\n");
 			);
 			pollticks = 1;
 		}
 		if (min > pollticks)
 			min = pollticks;
 	}
 	HDA_BOOTVERBOSE(
 		device_printf(sc->dev,
 		    "poll interval %d -> %d ticks\n",
 		    sc->poll_ival, min);
 	);
 	sc->poll_ival = min;
 	if (min == 1000000)
 		callout_stop(&sc->poll_callout);
 	else
 		callout_reset(&sc->poll_callout, 1, hdac_poll_callback, sc);
 }
 
 static int
 sysctl_hdac_polling(SYSCTL_HANDLER_ARGS)
 {
 	struct hdac_softc *sc;
 	device_t dev;
 	uint32_t ctl;
 	int err, val;
 
 	dev = oidp->oid_arg1;
 	sc = device_get_softc(dev);
 	if (sc == NULL)
 		return (EINVAL);
 	hdac_lock(sc);
 	val = sc->polling;
 	hdac_unlock(sc);
 	err = sysctl_handle_int(oidp, &val, 0, req);
 
 	if (err != 0 || req->newptr == NULL)
 		return (err);
 	if (val < 0 || val > 1)
 		return (EINVAL);
 
 	hdac_lock(sc);
 	if (val != sc->polling) {
 		if (val == 0) {
 			callout_stop(&sc->poll_callout);
 			hdac_unlock(sc);
 			callout_drain(&sc->poll_callout);
 			hdac_lock(sc);
 			sc->polling = 0;
 			ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL);
 			ctl |= HDAC_INTCTL_GIE;
 			HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl);
 		} else {
 			ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL);
 			ctl &= ~HDAC_INTCTL_GIE;
 			HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl);
 			sc->polling = 1;
 			hdac_poll_reinit(sc);
 		}
 	}
 	hdac_unlock(sc);
 
 	return (err);
 }
 
 static void
 hdac_attach2(void *arg)
 {
 	struct hdac_softc *sc;
 	device_t child;
 	uint32_t vendorid, revisionid;
 	int i;
 	uint16_t statests;
 
 	sc = (struct hdac_softc *)arg;
 
 	hdac_lock(sc);
 
 	/* Remove ourselves from the config hooks */
 	if (sc->intrhook.ich_func != NULL) {
 		config_intrhook_disestablish(&sc->intrhook);
 		sc->intrhook.ich_func = NULL;
 	}
 
 	HDA_BOOTHVERBOSE(
 		device_printf(sc->dev, "Starting CORB Engine...\n");
 	);
 	hdac_corb_start(sc);
 	HDA_BOOTHVERBOSE(
 		device_printf(sc->dev, "Starting RIRB Engine...\n");
 	);
 	hdac_rirb_start(sc);
 	HDA_BOOTHVERBOSE(
 		device_printf(sc->dev,
 		    "Enabling controller interrupt...\n");
 	);
 	HDAC_WRITE_4(&sc->mem, HDAC_GCTL, HDAC_READ_4(&sc->mem, HDAC_GCTL) |
 	    HDAC_GCTL_UNSOL);
 	if (sc->polling == 0) {
 		HDAC_WRITE_4(&sc->mem, HDAC_INTCTL,
 		    HDAC_INTCTL_CIE | HDAC_INTCTL_GIE);
 	}
 	DELAY(1000);
 
 	HDA_BOOTHVERBOSE(
 		device_printf(sc->dev, "Scanning HDA codecs ...\n");
 	);
 	statests = HDAC_READ_2(&sc->mem, HDAC_STATESTS);
 	hdac_unlock(sc);
 	for (i = 0; i < HDAC_CODEC_MAX; i++) {
 		if (HDAC_STATESTS_SDIWAKE(statests, i)) {
 			HDA_BOOTHVERBOSE(
 				device_printf(sc->dev,
 				    "Found CODEC at address %d\n", i);
 			);
 			hdac_lock(sc);
 			vendorid = hdac_send_command(sc, i,
 			    HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_VENDOR_ID));
 			revisionid = hdac_send_command(sc, i,
 			    HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_REVISION_ID));
 			hdac_unlock(sc);
 			if (vendorid == HDA_INVALID &&
 			    revisionid == HDA_INVALID) {
 				device_printf(sc->dev,
 				    "CODEC is not responding!\n");
 				continue;
 			}
 			sc->codecs[i].vendor_id =
 			    HDA_PARAM_VENDOR_ID_VENDOR_ID(vendorid);
 			sc->codecs[i].device_id =
 			    HDA_PARAM_VENDOR_ID_DEVICE_ID(vendorid);
 			sc->codecs[i].revision_id =
 			    HDA_PARAM_REVISION_ID_REVISION_ID(revisionid);
 			sc->codecs[i].stepping_id =
 			    HDA_PARAM_REVISION_ID_STEPPING_ID(revisionid);
 			child = device_add_child(sc->dev, "hdacc", -1);
 			if (child == NULL) {
 				device_printf(sc->dev,
 				    "Failed to add CODEC device\n");
 				continue;
 			}
 			device_set_ivars(child, (void *)(intptr_t)i);
 			sc->codecs[i].dev = child;
 		}
 	}
 	bus_generic_attach(sc->dev);
 
 	SYSCTL_ADD_PROC(device_get_sysctl_ctx(sc->dev),
 	    SYSCTL_CHILDREN(device_get_sysctl_tree(sc->dev)), OID_AUTO,
 	    "pindump", CTLTYPE_INT | CTLFLAG_RW, sc->dev, sizeof(sc->dev),
 	    sysctl_hdac_pindump, "I", "Dump pin states/data");
 	SYSCTL_ADD_PROC(device_get_sysctl_ctx(sc->dev),
 	    SYSCTL_CHILDREN(device_get_sysctl_tree(sc->dev)), OID_AUTO,
 	    "polling", CTLTYPE_INT | CTLFLAG_RW, sc->dev, sizeof(sc->dev),
 	    sysctl_hdac_polling, "I", "Enable polling mode");
 }
 
 /****************************************************************************
  * int hdac_suspend(device_t)
  *
  * Suspend and power down HDA bus and codecs.
  ****************************************************************************/
 static int
 hdac_suspend(device_t dev)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Suspend...\n");
 	);
 	bus_generic_suspend(dev);
 
 	hdac_lock(sc);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Reset controller...\n");
 	);
 	callout_stop(&sc->poll_callout);
 	hdac_reset(sc, 0);
 	hdac_unlock(sc);
 	callout_drain(&sc->poll_callout);
 	taskqueue_drain(taskqueue_thread, &sc->unsolq_task);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Suspend done\n");
 	);
 	return (0);
 }
 
 /****************************************************************************
  * int hdac_resume(device_t)
  *
  * Powerup and restore HDA bus and codecs state.
  ****************************************************************************/
 static int
 hdac_resume(device_t dev)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	int error;
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Resume...\n");
 	);
 	hdac_lock(sc);
 
 	/* Quiesce everything */
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Reset controller...\n");
 	);
 	hdac_reset(sc, 1);
 
 	/* Initialize the CORB and RIRB */
 	hdac_corb_init(sc);
 	hdac_rirb_init(sc);
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Starting CORB Engine...\n");
 	);
 	hdac_corb_start(sc);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Starting RIRB Engine...\n");
 	);
 	hdac_rirb_start(sc);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Enabling controller interrupt...\n");
 	);
 	HDAC_WRITE_4(&sc->mem, HDAC_GCTL, HDAC_READ_4(&sc->mem, HDAC_GCTL) |
 	    HDAC_GCTL_UNSOL);
 	HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, HDAC_INTCTL_CIE | HDAC_INTCTL_GIE);
 	DELAY(1000);
 	hdac_poll_reinit(sc);
 	hdac_unlock(sc);
 
 	error = bus_generic_resume(dev);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Resume done\n");
 	);
 	return (error);
 }
 
 /****************************************************************************
  * int hdac_detach(device_t)
  *
  * Detach and free up resources utilized by the hdac device.
  ****************************************************************************/
 static int
 hdac_detach(device_t dev)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	device_t *devlist;
 	int cad, i, devcount, error;
 
 	if ((error = device_get_children(dev, &devlist, &devcount)) != 0)
 		return (error);
 	for (i = 0; i < devcount; i++) {
 		cad = (intptr_t)device_get_ivars(devlist[i]);
 		if ((error = device_delete_child(dev, devlist[i])) != 0) {
 			free(devlist, M_TEMP);
 			return (error);
 		}
 		sc->codecs[cad].dev = NULL;
 	}
 	free(devlist, M_TEMP);
 
 	hdac_lock(sc);
 	hdac_reset(sc, 0);
 	hdac_unlock(sc);
 	taskqueue_drain(taskqueue_thread, &sc->unsolq_task);
 	hdac_irq_free(sc);
 
 	for (i = 0; i < sc->num_ss; i++)
 		hdac_dma_free(sc, &sc->streams[i].bdl);
 	free(sc->streams, M_HDAC);
 	hdac_dma_free(sc, &sc->pos_dma);
 	hdac_dma_free(sc, &sc->rirb_dma);
 	hdac_dma_free(sc, &sc->corb_dma);
 	if (sc->chan_dmat != NULL) {
 		bus_dma_tag_destroy(sc->chan_dmat);
 		sc->chan_dmat = NULL;
 	}
 	hdac_mem_free(sc);
 	snd_mtxfree(sc->lock);
 	return (0);
 }
 
 static bus_dma_tag_t
 hdac_get_dma_tag(device_t dev, device_t child)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 
 	return (sc->chan_dmat);
 }
 
 static int
 hdac_print_child(device_t dev, device_t child)
 {
 	int retval;
 
 	retval = bus_print_child_header(dev, child);
 	retval += printf(" at cad %d",
 	    (int)(intptr_t)device_get_ivars(child));
 	retval += bus_print_child_footer(dev, child);
 
 	return (retval);
 }
 
 static int
 hdac_child_location_str(device_t dev, device_t child, char *buf,
     size_t buflen)
 {
 
 	snprintf(buf, buflen, "cad=%d",
 	    (int)(intptr_t)device_get_ivars(child));
 	return (0);
 }
 
 static int
 hdac_child_pnpinfo_str_method(device_t dev, device_t child, char *buf,
     size_t buflen)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	nid_t cad = (uintptr_t)device_get_ivars(child);
 
 	snprintf(buf, buflen, "vendor=0x%04x device=0x%04x revision=0x%02x "
 	    "stepping=0x%02x",
 	    sc->codecs[cad].vendor_id, sc->codecs[cad].device_id,
 	    sc->codecs[cad].revision_id, sc->codecs[cad].stepping_id);
 	return (0);
 }
 
 static int
 hdac_read_ivar(device_t dev, device_t child, int which, uintptr_t *result)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	nid_t cad = (uintptr_t)device_get_ivars(child);
 
 	switch (which) {
 	case HDA_IVAR_CODEC_ID:
 		*result = cad;
 		break;
 	case HDA_IVAR_VENDOR_ID:
 		*result = sc->codecs[cad].vendor_id;
 		break;
 	case HDA_IVAR_DEVICE_ID:
 		*result = sc->codecs[cad].device_id;
 		break;
 	case HDA_IVAR_REVISION_ID:
 		*result = sc->codecs[cad].revision_id;
 		break;
 	case HDA_IVAR_STEPPING_ID:
 		*result = sc->codecs[cad].stepping_id;
 		break;
 	case HDA_IVAR_SUBVENDOR_ID:
 		*result = pci_get_subvendor(dev);
 		break;
 	case HDA_IVAR_SUBDEVICE_ID:
 		*result = pci_get_subdevice(dev);
 		break;
 	case HDA_IVAR_DMA_NOCACHE:
 		*result = (sc->flags & HDAC_F_DMA_NOCACHE) != 0;
 		break;
 	default:
 		return (ENOENT);
 	}
 	return (0);
 }
 
 static struct mtx *
 hdac_get_mtx(device_t dev, device_t child)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 
 	return (sc->lock);
 }
 
 static uint32_t
 hdac_codec_command(device_t dev, device_t child, uint32_t verb)
 {
 
 	return (hdac_send_command(device_get_softc(dev),
 	    (intptr_t)device_get_ivars(child), verb));
 }
 
 static int
 hdac_find_stream(struct hdac_softc *sc, int dir, int stream)
 {
 	int i, ss;
 
 	ss = -1;
 	/* Allocate ISS/BSS first. */
 	if (dir == 0) {
 		for (i = 0; i < sc->num_iss; i++) {
 			if (sc->streams[i].stream == stream) {
 				ss = i;
 				break;
 			}
 		}
 	} else {
 		for (i = 0; i < sc->num_oss; i++) {
 			if (sc->streams[i + sc->num_iss].stream == stream) {
 				ss = i + sc->num_iss;
 				break;
 			}
 		}
 	}
 	/* Fallback to BSS. */
 	if (ss == -1) {
 		for (i = 0; i < sc->num_bss; i++) {
 			if (sc->streams[i + sc->num_iss + sc->num_oss].stream
 			    == stream) {
 				ss = i + sc->num_iss + sc->num_oss;
 				break;
 			}
 		}
 	}
 	return (ss);
 }
 
 static int
 hdac_stream_alloc(device_t dev, device_t child, int dir, int format, int stripe,
     uint32_t **dmapos)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	nid_t cad = (uintptr_t)device_get_ivars(child);
 	int stream, ss, bw, maxbw, prevbw;
 
 	/* Look for empty stream. */
 	ss = hdac_find_stream(sc, dir, 0);
 
 	/* Return if found nothing. */
 	if (ss < 0)
 		return (0);
 
 	/* Check bus bandwidth. */
 	bw = hdac_bdata_rate(format, dir);
 	if (dir == 1) {
 		bw *= 1 << (sc->num_sdo - stripe);
 		prevbw = sc->sdo_bw_used;
 		maxbw = 48000 * 960 * (1 << sc->num_sdo);
 	} else {
 		prevbw = sc->codecs[cad].sdi_bw_used;
 		maxbw = 48000 * 464;
 	}
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "%dKbps of %dKbps bandwidth used%s\n",
 		    (bw + prevbw) / 1000, maxbw / 1000,
 		    bw + prevbw > maxbw ? " -- OVERFLOW!" : "");
 	);
 	if (bw + prevbw > maxbw)
 		return (0);
 	if (dir == 1)
 		sc->sdo_bw_used += bw;
 	else
 		sc->codecs[cad].sdi_bw_used += bw;
 
 	/* Allocate stream number */
 	if (ss >= sc->num_iss + sc->num_oss)
 		stream = 15 - (ss - sc->num_iss + sc->num_oss);
 	else if (ss >= sc->num_iss)
 		stream = ss - sc->num_iss + 1;
 	else
 		stream = ss + 1;
 
 	sc->streams[ss].dev = child;
 	sc->streams[ss].dir = dir;
 	sc->streams[ss].stream = stream;
 	sc->streams[ss].bw = bw;
 	sc->streams[ss].format = format;
 	sc->streams[ss].stripe = stripe;
 	if (dmapos != NULL) {
 		if (sc->pos_dma.dma_vaddr != NULL)
 			*dmapos = (uint32_t *)(sc->pos_dma.dma_vaddr + ss * 8);
 		else
 			*dmapos = NULL;
 	}
 	return (stream);
 }
 
 static void
 hdac_stream_free(device_t dev, device_t child, int dir, int stream)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	nid_t cad = (uintptr_t)device_get_ivars(child);
 	int ss;
 
 	ss = hdac_find_stream(sc, dir, stream);
 	KASSERT(ss >= 0,
 	    ("Free for not allocated stream (%d/%d)\n", dir, stream));
 	if (dir == 1)
 		sc->sdo_bw_used -= sc->streams[ss].bw;
 	else
 		sc->codecs[cad].sdi_bw_used -= sc->streams[ss].bw;
 	sc->streams[ss].stream = 0;
 	sc->streams[ss].dev = NULL;
 }
 
 static int
 hdac_stream_start(device_t dev, device_t child,
     int dir, int stream, bus_addr_t buf, int blksz, int blkcnt)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	struct hdac_bdle *bdle;
 	uint64_t addr;
 	int i, ss, off;
 	uint32_t ctl;
 
 	ss = hdac_find_stream(sc, dir, stream);
 	KASSERT(ss >= 0,
 	    ("Start for not allocated stream (%d/%d)\n", dir, stream));
 
 	addr = (uint64_t)buf;
 	bdle = (struct hdac_bdle *)sc->streams[ss].bdl.dma_vaddr;
 	for (i = 0; i < blkcnt; i++, bdle++) {
 		bdle->addrl = (uint32_t)addr;
 		bdle->addrh = (uint32_t)(addr >> 32);
 		bdle->len = blksz;
 		bdle->ioc = 1;
 		addr += blksz;
 	}
 
 	off = ss << 5;
 	HDAC_WRITE_4(&sc->mem, off + HDAC_SDCBL, blksz * blkcnt);
 	HDAC_WRITE_2(&sc->mem, off + HDAC_SDLVI, blkcnt - 1);
 	addr = sc->streams[ss].bdl.dma_paddr;
 	HDAC_WRITE_4(&sc->mem, off + HDAC_SDBDPL, (uint32_t)addr);
 	HDAC_WRITE_4(&sc->mem, off + HDAC_SDBDPU, (uint32_t)(addr >> 32));
 
 	ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL2);
 	if (dir)
 		ctl |= HDAC_SDCTL2_DIR;
 	else
 		ctl &= ~HDAC_SDCTL2_DIR;
 	ctl &= ~HDAC_SDCTL2_STRM_MASK;
 	ctl |= stream << HDAC_SDCTL2_STRM_SHIFT;
 	ctl &= ~HDAC_SDCTL2_STRIPE_MASK;
 	ctl |= sc->streams[ss].stripe << HDAC_SDCTL2_STRIPE_SHIFT;
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL2, ctl);
 
 	HDAC_WRITE_2(&sc->mem, off + HDAC_SDFMT, sc->streams[ss].format);
 
 	ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL);
 	ctl |= 1 << ss;
 	HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl);
 
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDSTS,
 	    HDAC_SDSTS_DESE | HDAC_SDSTS_FIFOE | HDAC_SDSTS_BCIS);
 	ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0);
 	ctl |= HDAC_SDCTL_IOCE | HDAC_SDCTL_FEIE | HDAC_SDCTL_DEIE |
 	    HDAC_SDCTL_RUN;
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl);
 
 	sc->streams[ss].blksz = blksz;
 	sc->streams[ss].running = 1;
 	hdac_poll_reinit(sc);
 	return (0);
 }
 
 static void
 hdac_stream_stop(device_t dev, device_t child, int dir, int stream)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	int ss, off;
 	uint32_t ctl;
 
 	ss = hdac_find_stream(sc, dir, stream);
 	KASSERT(ss >= 0,
 	    ("Stop for not allocated stream (%d/%d)\n", dir, stream));
 
 	off = ss << 5;
 	ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0);
 	ctl &= ~(HDAC_SDCTL_IOCE | HDAC_SDCTL_FEIE | HDAC_SDCTL_DEIE |
 	    HDAC_SDCTL_RUN);
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl);
 
 	ctl = HDAC_READ_4(&sc->mem, HDAC_INTCTL);
 	ctl &= ~(1 << ss);
 	HDAC_WRITE_4(&sc->mem, HDAC_INTCTL, ctl);
 
 	sc->streams[ss].running = 0;
 	hdac_poll_reinit(sc);
 }
 
 static void
 hdac_stream_reset(device_t dev, device_t child, int dir, int stream)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	int timeout = 1000;
 	int to = timeout;
 	int ss, off;
 	uint32_t ctl;
 
 	ss = hdac_find_stream(sc, dir, stream);
 	KASSERT(ss >= 0,
 	    ("Reset for not allocated stream (%d/%d)\n", dir, stream));
 
 	off = ss << 5;
 	ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0);
 	ctl |= HDAC_SDCTL_SRST;
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl);
 	do {
 		ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0);
 		if (ctl & HDAC_SDCTL_SRST)
 			break;
 		DELAY(10);
 	} while (--to);
 	if (!(ctl & HDAC_SDCTL_SRST))
 		device_printf(dev, "Reset setting timeout\n");
 	ctl &= ~HDAC_SDCTL_SRST;
 	HDAC_WRITE_1(&sc->mem, off + HDAC_SDCTL0, ctl);
 	to = timeout;
 	do {
 		ctl = HDAC_READ_1(&sc->mem, off + HDAC_SDCTL0);
 		if (!(ctl & HDAC_SDCTL_SRST))
 			break;
 		DELAY(10);
 	} while (--to);
 	if (ctl & HDAC_SDCTL_SRST)
 		device_printf(dev, "Reset timeout!\n");
 }
 
 static uint32_t
 hdac_stream_getptr(device_t dev, device_t child, int dir, int stream)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 	int ss, off;
 
 	ss = hdac_find_stream(sc, dir, stream);
 	KASSERT(ss >= 0,
 	    ("Reset for not allocated stream (%d/%d)\n", dir, stream));
 
 	off = ss << 5;
 	return (HDAC_READ_4(&sc->mem, off + HDAC_SDLPIB));
 }
 
 static int
 hdac_unsol_alloc(device_t dev, device_t child, int tag)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 
 	sc->unsol_registered++;
 	hdac_poll_reinit(sc);
 	return (tag);
 }
 
 static void
 hdac_unsol_free(device_t dev, device_t child, int tag)
 {
 	struct hdac_softc *sc = device_get_softc(dev);
 
 	sc->unsol_registered--;
 	hdac_poll_reinit(sc);
 }
 
 static device_method_t hdac_methods[] = {
 	/* device interface */
 	DEVMETHOD(device_probe,		hdac_probe),
 	DEVMETHOD(device_attach,	hdac_attach),
 	DEVMETHOD(device_detach,	hdac_detach),
 	DEVMETHOD(device_suspend,	hdac_suspend),
 	DEVMETHOD(device_resume,	hdac_resume),
 	/* Bus interface */
 	DEVMETHOD(bus_get_dma_tag,	hdac_get_dma_tag),
 	DEVMETHOD(bus_print_child,	hdac_print_child),
 	DEVMETHOD(bus_child_location_str, hdac_child_location_str),
 	DEVMETHOD(bus_child_pnpinfo_str, hdac_child_pnpinfo_str_method),
 	DEVMETHOD(bus_read_ivar,	hdac_read_ivar),
 	DEVMETHOD(hdac_get_mtx,		hdac_get_mtx),
 	DEVMETHOD(hdac_codec_command,	hdac_codec_command),
 	DEVMETHOD(hdac_stream_alloc,	hdac_stream_alloc),
 	DEVMETHOD(hdac_stream_free,	hdac_stream_free),
 	DEVMETHOD(hdac_stream_start,	hdac_stream_start),
 	DEVMETHOD(hdac_stream_stop,	hdac_stream_stop),
 	DEVMETHOD(hdac_stream_reset,	hdac_stream_reset),
 	DEVMETHOD(hdac_stream_getptr,	hdac_stream_getptr),
 	DEVMETHOD(hdac_unsol_alloc,	hdac_unsol_alloc),
 	DEVMETHOD(hdac_unsol_free,	hdac_unsol_free),
 	DEVMETHOD_END
 };
 
 static driver_t hdac_driver = {
 	"hdac",
 	hdac_methods,
 	sizeof(struct hdac_softc),
 };
 
 static devclass_t hdac_devclass;
 
 DRIVER_MODULE(snd_hda, pci, hdac_driver, hdac_devclass, NULL, NULL);
Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h
===================================================================
--- user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h	(revision 281584)
+++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdac.h	(revision 281585)
@@ -1,708 +1,713 @@
 /*-
  * Copyright (c) 2006 Stephane E. Potvin <sepotvin@videotron.ca>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef _HDAC_H_
 #define _HDAC_H_
 
 #include "hdac_if.h"
 
 /****************************************************************************
  * Miscellanious defines
  ****************************************************************************/
 
 /* Controller models */
 #define HDA_MODEL_CONSTRUCT(vendor, model)	\
 		(((uint32_t)(model) << 16) | ((vendor##_VENDORID) & 0xffff))
 
 /* Intel */
 #define INTEL_VENDORID		0x8086
 #define HDA_INTEL_OAK		HDA_MODEL_CONSTRUCT(INTEL, 0x080a)
 #define HDA_INTEL_BAY		HDA_MODEL_CONSTRUCT(INTEL, 0x0f04)
 #define HDA_INTEL_HSW1		HDA_MODEL_CONSTRUCT(INTEL, 0x0a0c)
 #define HDA_INTEL_HSW2		HDA_MODEL_CONSTRUCT(INTEL, 0x0c0c)
 #define HDA_INTEL_HSW3		HDA_MODEL_CONSTRUCT(INTEL, 0x0d0c)
+#define HDA_INTEL_BDW1		HDA_MODEL_CONSTRUCT(INTEL, 0x160c)
 #define HDA_INTEL_CPT		HDA_MODEL_CONSTRUCT(INTEL, 0x1c20)
 #define HDA_INTEL_PATSBURG	HDA_MODEL_CONSTRUCT(INTEL, 0x1d20)
 #define HDA_INTEL_PPT1		HDA_MODEL_CONSTRUCT(INTEL, 0x1e20)
 #define HDA_INTEL_82801F	HDA_MODEL_CONSTRUCT(INTEL, 0x2668)
 #define HDA_INTEL_63XXESB	HDA_MODEL_CONSTRUCT(INTEL, 0x269a)
 #define HDA_INTEL_82801G	HDA_MODEL_CONSTRUCT(INTEL, 0x27d8)
 #define HDA_INTEL_82801H	HDA_MODEL_CONSTRUCT(INTEL, 0x284b)
 #define HDA_INTEL_82801I	HDA_MODEL_CONSTRUCT(INTEL, 0x293e)
 #define HDA_INTEL_82801JI	HDA_MODEL_CONSTRUCT(INTEL, 0x3a3e)
 #define HDA_INTEL_82801JD	HDA_MODEL_CONSTRUCT(INTEL, 0x3a6e)
 #define HDA_INTEL_PCH		HDA_MODEL_CONSTRUCT(INTEL, 0x3b56)
 #define HDA_INTEL_PCH2		HDA_MODEL_CONSTRUCT(INTEL, 0x3b57)
 #define HDA_INTEL_MACBOOKPRO92	HDA_MODEL_CONSTRUCT(INTEL, 0x7270)
 #define HDA_INTEL_SCH		HDA_MODEL_CONSTRUCT(INTEL, 0x811b)
 #define HDA_INTEL_LPT1		HDA_MODEL_CONSTRUCT(INTEL, 0x8c20)
 #define HDA_INTEL_LPT2		HDA_MODEL_CONSTRUCT(INTEL, 0x8c21)
 #define HDA_INTEL_WCPT		HDA_MODEL_CONSTRUCT(INTEL, 0x8ca0)
 #define HDA_INTEL_WELLS1	HDA_MODEL_CONSTRUCT(INTEL, 0x8d20)
 #define HDA_INTEL_WELLS2	HDA_MODEL_CONSTRUCT(INTEL, 0x8d21)
 #define HDA_INTEL_LPTLP1	HDA_MODEL_CONSTRUCT(INTEL, 0x9c20)
 #define HDA_INTEL_LPTLP2	HDA_MODEL_CONSTRUCT(INTEL, 0x9c21)
+#define HDA_INTEL_BDW2		HDA_MODEL_CONSTRUCT(INTEL, 0x9ca0)
 #define HDA_INTEL_ALL		HDA_MODEL_CONSTRUCT(INTEL, 0xffff)
 
 /* Nvidia */
 #define NVIDIA_VENDORID		0x10de
 #define HDA_NVIDIA_MCP51	HDA_MODEL_CONSTRUCT(NVIDIA, 0x026c)
 #define HDA_NVIDIA_MCP55	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0371)
 #define HDA_NVIDIA_MCP61_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x03e4)
 #define HDA_NVIDIA_MCP61_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x03f0)
 #define HDA_NVIDIA_MCP65_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x044a)
 #define HDA_NVIDIA_MCP65_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x044b)
 #define HDA_NVIDIA_MCP67_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x055c)
 #define HDA_NVIDIA_MCP67_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x055d)
 #define HDA_NVIDIA_MCP78_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0774)
 #define HDA_NVIDIA_MCP78_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0775)
 #define HDA_NVIDIA_MCP78_3	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0776)
 #define HDA_NVIDIA_MCP78_4	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0777)
 #define HDA_NVIDIA_MCP73_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x07fc)
 #define HDA_NVIDIA_MCP73_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x07fd)
 #define HDA_NVIDIA_MCP79_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac0)
 #define HDA_NVIDIA_MCP79_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac1)
 #define HDA_NVIDIA_MCP79_3	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac2)
 #define HDA_NVIDIA_MCP79_4	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0ac3)
 #define HDA_NVIDIA_0BE2		HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be2)
 #define HDA_NVIDIA_0BE3		HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be3)
 #define HDA_NVIDIA_0BE4		HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be4)
 #define HDA_NVIDIA_GT100	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be5)
 #define HDA_NVIDIA_GT106	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0be9)
 #define HDA_NVIDIA_GT108	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0bea)
 #define HDA_NVIDIA_GT104	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0beb)
 #define HDA_NVIDIA_GT116	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0bee)
 #define HDA_NVIDIA_MCP89_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d94)
 #define HDA_NVIDIA_MCP89_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d95)
 #define HDA_NVIDIA_MCP89_3	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d96)
 #define HDA_NVIDIA_MCP89_4	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0d97)
 #define HDA_NVIDIA_GF119	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e08)
 #define HDA_NVIDIA_GF110_1	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e09)
 #define HDA_NVIDIA_GF110_2	HDA_MODEL_CONSTRUCT(NVIDIA, 0x0e0c)
 #define HDA_NVIDIA_ALL		HDA_MODEL_CONSTRUCT(NVIDIA, 0xffff)
 
 /* ATI */
 #define ATI_VENDORID		0x1002
 #define HDA_ATI_SB450		HDA_MODEL_CONSTRUCT(ATI, 0x437b)
 #define HDA_ATI_SB600		HDA_MODEL_CONSTRUCT(ATI, 0x4383)
 #define HDA_ATI_RS600		HDA_MODEL_CONSTRUCT(ATI, 0x793b)
 #define HDA_ATI_RS690		HDA_MODEL_CONSTRUCT(ATI, 0x7919)
 #define HDA_ATI_RS780		HDA_MODEL_CONSTRUCT(ATI, 0x960f)
 #define HDA_ATI_R600		HDA_MODEL_CONSTRUCT(ATI, 0xaa00)
 #define HDA_ATI_RV630		HDA_MODEL_CONSTRUCT(ATI, 0xaa08)
 #define HDA_ATI_RV610		HDA_MODEL_CONSTRUCT(ATI, 0xaa10)
 #define HDA_ATI_RV670		HDA_MODEL_CONSTRUCT(ATI, 0xaa18)
 #define HDA_ATI_RV635		HDA_MODEL_CONSTRUCT(ATI, 0xaa20)
 #define HDA_ATI_RV620		HDA_MODEL_CONSTRUCT(ATI, 0xaa28)
 #define HDA_ATI_RV770		HDA_MODEL_CONSTRUCT(ATI, 0xaa30)
 #define HDA_ATI_RV730		HDA_MODEL_CONSTRUCT(ATI, 0xaa38)
 #define HDA_ATI_RV710		HDA_MODEL_CONSTRUCT(ATI, 0xaa40)
 #define HDA_ATI_RV740		HDA_MODEL_CONSTRUCT(ATI, 0xaa48)
 #define HDA_ATI_RV870		HDA_MODEL_CONSTRUCT(ATI, 0xaa50)
 #define HDA_ATI_RV840		HDA_MODEL_CONSTRUCT(ATI, 0xaa58)
 #define HDA_ATI_RV830		HDA_MODEL_CONSTRUCT(ATI, 0xaa60)
 #define HDA_ATI_RV810		HDA_MODEL_CONSTRUCT(ATI, 0xaa68)
 #define HDA_ATI_RV970		HDA_MODEL_CONSTRUCT(ATI, 0xaa80)
 #define HDA_ATI_RV940		HDA_MODEL_CONSTRUCT(ATI, 0xaa88)
 #define HDA_ATI_RV930		HDA_MODEL_CONSTRUCT(ATI, 0xaa90)
 #define HDA_ATI_RV910		HDA_MODEL_CONSTRUCT(ATI, 0xaa98)
 #define HDA_ATI_R1000		HDA_MODEL_CONSTRUCT(ATI, 0xaaa0)
 #define HDA_ATI_ALL		HDA_MODEL_CONSTRUCT(ATI, 0xffff)
 
 /* RDC */
 #define RDC_VENDORID		0x17f3
 #define HDA_RDC_M3010		HDA_MODEL_CONSTRUCT(RDC, 0x3010)
 
 /* VIA */
 #define VIA_VENDORID		0x1106
 #define HDA_VIA_VT82XX		HDA_MODEL_CONSTRUCT(VIA, 0x3288)
 #define HDA_VIA_ALL		HDA_MODEL_CONSTRUCT(VIA, 0xffff)
 
 /* SiS */
 #define SIS_VENDORID		0x1039
 #define HDA_SIS_966		HDA_MODEL_CONSTRUCT(SIS, 0x7502)
 #define HDA_SIS_ALL		HDA_MODEL_CONSTRUCT(SIS, 0xffff)
 
 /* ULI */
 #define ULI_VENDORID		0x10b9
 #define HDA_ULI_M5461		HDA_MODEL_CONSTRUCT(ULI, 0x5461)
 #define HDA_ULI_ALL		HDA_MODEL_CONSTRUCT(ULI, 0xffff)
 
 /* OEM/subvendors */
 
 /* Intel */
 #define	INTEL_DH87RL_SUBVENDOR	HDA_MODEL_CONSTRUCT(INTEL, 0x204a)
 #define INTEL_D101GGC_SUBVENDOR	HDA_MODEL_CONSTRUCT(INTEL, 0xd600)
 
 /* HP/Compaq */
 #define HP_VENDORID		0x103c
 #define HP_V3000_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x30b5)
 #define HP_NX7400_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x30a2)
 #define HP_NX6310_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x30aa)
 #define HP_NX6325_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x30b0)
 #define HP_XW4300_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x3013)
 #define HP_3010_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x3010)
 #define HP_DV5000_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x30a5)
 #define HP_DC7700S_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x2801)
 #define HP_DC7700_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0x2802)
 #define HP_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(HP, 0xffff)
 /* What is wrong with XN 2563 anyway? (Got the picture ?) */
 #define HP_NX6325_SUBVENDORX	0x103c30b0
 
 /* Dell */
 #define DELL_VENDORID		0x1028
 #define DELL_D630_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x01f9)
 #define DELL_D820_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x01cc)
 #define DELL_V1400_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x0227)
 #define DELL_V1500_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x0228)
 #define DELL_I1300_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x01c9)
 #define DELL_XPSM1210_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x01d7)
 #define DELL_OPLX745_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0x01da)
 #define DELL_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(DELL, 0xffff)
 
 /* Clevo */
 #define CLEVO_VENDORID		0x1558
 #define CLEVO_D900T_SUBVENDOR	HDA_MODEL_CONSTRUCT(CLEVO, 0x0900)
 #define CLEVO_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(CLEVO, 0xffff)
 
 /* Acer */
 #define ACER_VENDORID		0x1025
 #define ACER_A5050_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x010f)
 #define ACER_A4520_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x0127)
 #define ACER_A4710_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x012f)
 #define ACER_A4715_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x0133)
 #define ACER_3681WXM_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x0110)
 #define ACER_T6292_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x011b)
 #define ACER_T5320_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0x011f)
 #define ACER_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(ACER, 0xffff)
 
 /* Asus */
 #define ASUS_VENDORID		0x1043
 #define ASUS_A8X_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1153)
 #define ASUS_U5F_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1263)
 #define ASUS_W6F_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1263)
 #define ASUS_A7M_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1323)
 #define ASUS_F3JC_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1338)
 #define ASUS_G2K_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1339)
 #define ASUS_A7T_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x13c2)
 #define ASUS_UX31A_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1517)
 #define ASUS_W2J_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1971)
 #define ASUS_M5200_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x1993)
 #define ASUS_P5PL2_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x817f)
 #define ASUS_P1AH2_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x81cb)
 #define ASUS_M2NPVMX_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x81cb)
 #define ASUS_M2V_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x81e7)
 #define ASUS_P5BWD_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x81ec)
 #define ASUS_M2N_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0x8234)
 #define ASUS_A8NVMCSM_SUBVENDOR	HDA_MODEL_CONSTRUCT(NVIDIA, 0xcb84)
 #define ASUS_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(ASUS, 0xffff)
 
 /* IBM / Lenovo */
 #define IBM_VENDORID		0x1014
 #define IBM_M52_SUBVENDOR	HDA_MODEL_CONSTRUCT(IBM, 0x02f6)
 #define IBM_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(IBM, 0xffff)
 
 /* Lenovo */
 #define LENOVO_VENDORID		0x17aa
 #define LENOVO_3KN100_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x2066)
 #define LENOVO_3KN200_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x384e)
 #define LENOVO_B450_SUBVENDOR HDA_MODEL_CONSTRUCT(LENOVO, 0x3a0d)
 #define LENOVO_TCA55_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x1015)
 #define	LENOVO_X1_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21e8)
 #define	LENOVO_X1CRBN_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21f9)
+#define	LENOVO_X120BS_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x2227)
 #define LENOVO_X220_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21da)
 #define LENOVO_X300_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x20ac)
 #define	LENOVO_T400_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x20f2)
 #define	LENOVO_T420_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21ce)
 #define	LENOVO_T430_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21f3)
 #define	LENOVO_T430S_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21fb)
 #define	LENOVO_T520_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21cf)
 #define	LENOVO_T530_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x21f6)
 #define	LENOVO_G580_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0x3977)
 #define LENOVO_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(LENOVO, 0xffff)
 
 /* Samsung */
 #define SAMSUNG_VENDORID	0x144d
 #define SAMSUNG_Q1_SUBVENDOR	HDA_MODEL_CONSTRUCT(SAMSUNG, 0xc027)
 #define SAMSUNG_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(SAMSUNG, 0xffff)
 
 /* Medion ? */
 #define MEDION_VENDORID			0x161f
 #define MEDION_MD95257_SUBVENDOR	HDA_MODEL_CONSTRUCT(MEDION, 0x203d)
 #define MEDION_ALL_SUBVENDOR		HDA_MODEL_CONSTRUCT(MEDION, 0xffff)
 
 /* Apple Computer Inc. */
 #define APPLE_VENDORID		0x106b
 #define APPLE_MB3_SUBVENDOR	HDA_MODEL_CONSTRUCT(APPLE, 0x00a1)
 
 /* Sony */
 #define SONY_VENDORID		0x104d
 #define SONY_S5_SUBVENDOR	HDA_MODEL_CONSTRUCT(SONY, 0x81cc)
 #define SONY_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(SONY, 0xffff)
 
 /*
  * Apple Intel MacXXXX seems using Sigmatel codec/vendor id
  * instead of their own, which is beyond my comprehension
  * (see HDA_CODEC_STAC9221 below).
  */
 #define APPLE_INTEL_MAC		0x76808384
 #define APPLE_MACBOOKAIR31	0x0d9410de
 #define APPLE_MACBOOKPRO55	0xcb7910de
 #define APPLE_MACBOOKPRO71	0xcb8910de
 
 /* LG Electronics */
 #define LG_VENDORID		0x1854
 #define LG_LW20_SUBVENDOR	HDA_MODEL_CONSTRUCT(LG, 0x0018)
 #define LG_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(LG, 0xffff)
 
 /* Fujitsu Siemens */
 #define FS_VENDORID		0x1734
 #define FS_PA1510_SUBVENDOR	HDA_MODEL_CONSTRUCT(FS, 0x10b8)
 #define FS_SI1848_SUBVENDOR	HDA_MODEL_CONSTRUCT(FS, 0x10cd)
 #define FS_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(FS, 0xffff)
 
 /* Fujitsu Limited */
 #define FL_VENDORID		0x10cf
 #define FL_S7020D_SUBVENDOR	HDA_MODEL_CONSTRUCT(FL, 0x1326)
 #define FL_U1010_SUBVENDOR	HDA_MODEL_CONSTRUCT(FL, 0x142d)
 #define FL_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(FL, 0xffff)
 
 /* Toshiba */
 #define TOSHIBA_VENDORID	0x1179
 #define TOSHIBA_U200_SUBVENDOR	HDA_MODEL_CONSTRUCT(TOSHIBA, 0x0001)
 #define TOSHIBA_A135_SUBVENDOR	HDA_MODEL_CONSTRUCT(TOSHIBA, 0xff01)
 #define TOSHIBA_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(TOSHIBA, 0xffff)
 
 /* Micro-Star International (MSI) */
 #define MSI_VENDORID		0x1462
 #define MSI_MS1034_SUBVENDOR	HDA_MODEL_CONSTRUCT(MSI, 0x0349)
 #define MSI_MS034A_SUBVENDOR	HDA_MODEL_CONSTRUCT(MSI, 0x034a)
 #define MSI_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(MSI, 0xffff)
 
 /* Giga-Byte Technology */
 #define GB_VENDORID		0x1458
 #define GB_G33S2H_SUBVENDOR	HDA_MODEL_CONSTRUCT(GB, 0xa022)
 #define GP_ALL_SUBVENDOR	HDA_MODEL_CONSTRUCT(GB, 0xffff)
 
 /* Uniwill ? */
 #define UNIWILL_VENDORID	0x1584
 #define UNIWILL_9075_SUBVENDOR	HDA_MODEL_CONSTRUCT(UNIWILL, 0x9075)
 #define UNIWILL_9080_SUBVENDOR	HDA_MODEL_CONSTRUCT(UNIWILL, 0x9080)
 
 /* All codecs you can eat... */
 #define HDA_CODEC_CONSTRUCT(vendor, id) \
 		(((uint32_t)(vendor##_VENDORID) << 16) | ((id) & 0xffff))
 
 /* Cirrus Logic */
 #define CIRRUSLOGIC_VENDORID	0x1013
 #define HDA_CODEC_CS4206	HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4206)
 #define HDA_CODEC_CS4207	HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4207)
 #define HDA_CODEC_CS4210	HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4210)
 #define HDA_CODEC_CSXXXX	HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0xffff)
 
 /* Realtek */
 #define REALTEK_VENDORID	0x10ec
 #define HDA_CODEC_ALC221	HDA_CODEC_CONSTRUCT(REALTEK, 0x0221)
 #define HDA_CODEC_ALC260	HDA_CODEC_CONSTRUCT(REALTEK, 0x0260)
 #define HDA_CODEC_ALC262	HDA_CODEC_CONSTRUCT(REALTEK, 0x0262)
 #define HDA_CODEC_ALC267	HDA_CODEC_CONSTRUCT(REALTEK, 0x0267)
 #define HDA_CODEC_ALC268	HDA_CODEC_CONSTRUCT(REALTEK, 0x0268)
 #define HDA_CODEC_ALC269	HDA_CODEC_CONSTRUCT(REALTEK, 0x0269)
 #define HDA_CODEC_ALC270	HDA_CODEC_CONSTRUCT(REALTEK, 0x0270)
 #define HDA_CODEC_ALC272	HDA_CODEC_CONSTRUCT(REALTEK, 0x0272)
 #define HDA_CODEC_ALC273	HDA_CODEC_CONSTRUCT(REALTEK, 0x0273)
 #define HDA_CODEC_ALC275	HDA_CODEC_CONSTRUCT(REALTEK, 0x0275)
 #define HDA_CODEC_ALC276	HDA_CODEC_CONSTRUCT(REALTEK, 0x0276)
+#define HDA_CODEC_ALC292	HDA_CODEC_CONSTRUCT(REALTEK, 0x0292)
 #define HDA_CODEC_ALC660	HDA_CODEC_CONSTRUCT(REALTEK, 0x0660)
 #define HDA_CODEC_ALC662	HDA_CODEC_CONSTRUCT(REALTEK, 0x0662)
 #define HDA_CODEC_ALC663	HDA_CODEC_CONSTRUCT(REALTEK, 0x0663)
 #define HDA_CODEC_ALC665	HDA_CODEC_CONSTRUCT(REALTEK, 0x0665)
 #define HDA_CODEC_ALC670	HDA_CODEC_CONSTRUCT(REALTEK, 0x0670)
 #define HDA_CODEC_ALC680	HDA_CODEC_CONSTRUCT(REALTEK, 0x0680)
 #define HDA_CODEC_ALC861	HDA_CODEC_CONSTRUCT(REALTEK, 0x0861)
 #define HDA_CODEC_ALC861VD	HDA_CODEC_CONSTRUCT(REALTEK, 0x0862)
 #define HDA_CODEC_ALC880	HDA_CODEC_CONSTRUCT(REALTEK, 0x0880)
 #define HDA_CODEC_ALC882	HDA_CODEC_CONSTRUCT(REALTEK, 0x0882)
 #define HDA_CODEC_ALC883	HDA_CODEC_CONSTRUCT(REALTEK, 0x0883)
 #define HDA_CODEC_ALC885	HDA_CODEC_CONSTRUCT(REALTEK, 0x0885)
 #define HDA_CODEC_ALC887	HDA_CODEC_CONSTRUCT(REALTEK, 0x0887)
 #define HDA_CODEC_ALC888	HDA_CODEC_CONSTRUCT(REALTEK, 0x0888)
 #define HDA_CODEC_ALC889	HDA_CODEC_CONSTRUCT(REALTEK, 0x0889)
 #define HDA_CODEC_ALC892	HDA_CODEC_CONSTRUCT(REALTEK, 0x0892)
 #define HDA_CODEC_ALC899	HDA_CODEC_CONSTRUCT(REALTEK, 0x0899)
 #define HDA_CODEC_ALCXXXX	HDA_CODEC_CONSTRUCT(REALTEK, 0xffff)
 
 /* Motorola */
 #define MOTO_VENDORID		0x1057
 #define HDA_CODEC_MOTOXXXX	HDA_CODEC_CONSTRUCT(MOTO, 0xffff)
 
 /* Creative */
 #define CREATIVE_VENDORID	0x1102
 #define HDA_CODEC_CA0110	HDA_CODEC_CONSTRUCT(CREATIVE, 0x000a)
 #define HDA_CODEC_CA0110_2	HDA_CODEC_CONSTRUCT(CREATIVE, 0x000b)
 #define HDA_CODEC_SB0880	HDA_CODEC_CONSTRUCT(CREATIVE, 0x000d)
 #define HDA_CODEC_CA0132	HDA_CODEC_CONSTRUCT(CREATIVE, 0x0011)
 #define HDA_CODEC_CAXXXX	HDA_CODEC_CONSTRUCT(CREATIVE, 0xffff)
 
 /* Analog Devices */
 #define ANALOGDEVICES_VENDORID	0x11d4
 #define HDA_CODEC_AD1884A	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x184a)
 #define HDA_CODEC_AD1882	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1882)
 #define HDA_CODEC_AD1883	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1883)
 #define HDA_CODEC_AD1884	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1884)
 #define HDA_CODEC_AD1984A	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x194a)
 #define HDA_CODEC_AD1984B	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x194b)
 #define HDA_CODEC_AD1981HD	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1981)
 #define HDA_CODEC_AD1983	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1983)
 #define HDA_CODEC_AD1984	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1984)
 #define HDA_CODEC_AD1986A	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1986)
 #define HDA_CODEC_AD1987	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1987)
 #define HDA_CODEC_AD1988	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x1988)
 #define HDA_CODEC_AD1988B	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x198b)
 #define HDA_CODEC_AD1882A	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x882a)
 #define HDA_CODEC_AD1989A	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x989a)
 #define HDA_CODEC_AD1989B	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0x989b)
 #define HDA_CODEC_ADXXXX	HDA_CODEC_CONSTRUCT(ANALOGDEVICES, 0xffff)
 
 /* CMedia */
 #define CMEDIA_VENDORID		0x13f6
 #define HDA_CODEC_CMI9880	HDA_CODEC_CONSTRUCT(CMEDIA, 0x9880)
 #define HDA_CODEC_CMIXXXX	HDA_CODEC_CONSTRUCT(CMEDIA, 0xffff)
 
 #define CMEDIA2_VENDORID	0x434d
 #define HDA_CODEC_CMI98802	HDA_CODEC_CONSTRUCT(CMEDIA2, 0x4980)
 #define HDA_CODEC_CMIXXXX2	HDA_CODEC_CONSTRUCT(CMEDIA2, 0xffff)
 
 /* Sigmatel */
 #define SIGMATEL_VENDORID	0x8384
 #define HDA_CODEC_STAC9230X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7612)
 #define HDA_CODEC_STAC9230D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7613)
 #define HDA_CODEC_STAC9229X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7614)
 #define HDA_CODEC_STAC9229D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7615)
 #define HDA_CODEC_STAC9228X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7616)
 #define HDA_CODEC_STAC9228D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7617)
 #define HDA_CODEC_STAC9227X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7618)
 #define HDA_CODEC_STAC9227D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7619)
 #define HDA_CODEC_STAC9274	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7620)
 #define HDA_CODEC_STAC9274D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7621)
 #define HDA_CODEC_STAC9273X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7622)
 #define HDA_CODEC_STAC9273D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7623)
 #define HDA_CODEC_STAC9272X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7624)
 #define HDA_CODEC_STAC9272D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7625)
 #define HDA_CODEC_STAC9271X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7626)
 #define HDA_CODEC_STAC9271D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7627)
 #define HDA_CODEC_STAC9274X5NH	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7628)
 #define HDA_CODEC_STAC9274D5NH	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7629)
 #define HDA_CODEC_STAC9250	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7634)
 #define HDA_CODEC_STAC9251	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7636)
 #define HDA_CODEC_IDT92HD700X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7638)
 #define HDA_CODEC_IDT92HD700D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7639)
 #define HDA_CODEC_IDT92HD206X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7645)
 #define HDA_CODEC_IDT92HD206D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7646)
 #define HDA_CODEC_CXD9872RDK	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7661)
 #define HDA_CODEC_STAC9872AK	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7662)
 #define HDA_CODEC_CXD9872AKD	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7664)
 #define HDA_CODEC_STAC9221	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7680)
 #define HDA_CODEC_STAC922XD	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7681)
 #define HDA_CODEC_STAC9221_A2	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7682)
 #define HDA_CODEC_STAC9221D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7683)
 #define HDA_CODEC_STAC9220	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7690)
 #define HDA_CODEC_STAC9200D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7691)
 #define HDA_CODEC_IDT92HD005	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7698)
 #define HDA_CODEC_IDT92HD005D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7699)
 #define HDA_CODEC_STAC9205X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a0)
 #define HDA_CODEC_STAC9205D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a1)
 #define HDA_CODEC_STAC9204X	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a2)
 #define HDA_CODEC_STAC9204D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a3)
 #define HDA_CODEC_STAC9255	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a4)
 #define HDA_CODEC_STAC9255D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a5)
 #define HDA_CODEC_STAC9254	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a6)
 #define HDA_CODEC_STAC9254D	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x76a7)
 #define HDA_CODEC_STAC9220_A2	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7880)
 #define HDA_CODEC_STAC9220_A1	HDA_CODEC_CONSTRUCT(SIGMATEL, 0x7882)
 #define HDA_CODEC_STACXXXX	HDA_CODEC_CONSTRUCT(SIGMATEL, 0xffff)
 
 /* IDT */
 #define IDT_VENDORID		0x111d
 #define HDA_CODEC_IDT92HD75BX	HDA_CODEC_CONSTRUCT(IDT, 0x7603)
 #define HDA_CODEC_IDT92HD83C1X	HDA_CODEC_CONSTRUCT(IDT, 0x7604)
 #define HDA_CODEC_IDT92HD81B1X	HDA_CODEC_CONSTRUCT(IDT, 0x7605)
 #define HDA_CODEC_IDT92HD75B3	HDA_CODEC_CONSTRUCT(IDT, 0x7608)
 #define HDA_CODEC_IDT92HD73D1	HDA_CODEC_CONSTRUCT(IDT, 0x7674)
 #define HDA_CODEC_IDT92HD73C1	HDA_CODEC_CONSTRUCT(IDT, 0x7675)
 #define HDA_CODEC_IDT92HD73E1	HDA_CODEC_CONSTRUCT(IDT, 0x7676)
 #define HDA_CODEC_IDT92HD71B8	HDA_CODEC_CONSTRUCT(IDT, 0x76b0)
 #define HDA_CODEC_IDT92HD71B8_2	HDA_CODEC_CONSTRUCT(IDT, 0x76b1)
 #define HDA_CODEC_IDT92HD71B7	HDA_CODEC_CONSTRUCT(IDT, 0x76b2)
 #define HDA_CODEC_IDT92HD71B7_2	HDA_CODEC_CONSTRUCT(IDT, 0x76b3)
 #define HDA_CODEC_IDT92HD71B6	HDA_CODEC_CONSTRUCT(IDT, 0x76b4)
 #define HDA_CODEC_IDT92HD71B6_2	HDA_CODEC_CONSTRUCT(IDT, 0x76b5)
 #define HDA_CODEC_IDT92HD71B5	HDA_CODEC_CONSTRUCT(IDT, 0x76b6)
 #define HDA_CODEC_IDT92HD71B5_2	HDA_CODEC_CONSTRUCT(IDT, 0x76b7)
 #define HDA_CODEC_IDT92HD89C3	HDA_CODEC_CONSTRUCT(IDT, 0x76c0)
 #define HDA_CODEC_IDT92HD89C2	HDA_CODEC_CONSTRUCT(IDT, 0x76c1)
 #define HDA_CODEC_IDT92HD89C1	HDA_CODEC_CONSTRUCT(IDT, 0x76c2)
 #define HDA_CODEC_IDT92HD89B3	HDA_CODEC_CONSTRUCT(IDT, 0x76c3)
 #define HDA_CODEC_IDT92HD89B2	HDA_CODEC_CONSTRUCT(IDT, 0x76c4)
 #define HDA_CODEC_IDT92HD89B1	HDA_CODEC_CONSTRUCT(IDT, 0x76c5)
 #define HDA_CODEC_IDT92HD89E3	HDA_CODEC_CONSTRUCT(IDT, 0x76c6)
 #define HDA_CODEC_IDT92HD89E2	HDA_CODEC_CONSTRUCT(IDT, 0x76c7)
 #define HDA_CODEC_IDT92HD89E1	HDA_CODEC_CONSTRUCT(IDT, 0x76c8)
 #define HDA_CODEC_IDT92HD89D3	HDA_CODEC_CONSTRUCT(IDT, 0x76c9)
 #define HDA_CODEC_IDT92HD89D2	HDA_CODEC_CONSTRUCT(IDT, 0x76ca)
 #define HDA_CODEC_IDT92HD89D1	HDA_CODEC_CONSTRUCT(IDT, 0x76cb)
 #define HDA_CODEC_IDT92HD89F3	HDA_CODEC_CONSTRUCT(IDT, 0x76cc)
 #define HDA_CODEC_IDT92HD89F2	HDA_CODEC_CONSTRUCT(IDT, 0x76cd)
 #define HDA_CODEC_IDT92HD89F1	HDA_CODEC_CONSTRUCT(IDT, 0x76ce)
 #define HDA_CODEC_IDT92HD87B1_3	HDA_CODEC_CONSTRUCT(IDT, 0x76d1)
 #define HDA_CODEC_IDT92HD83C1C	HDA_CODEC_CONSTRUCT(IDT, 0x76d4)
 #define HDA_CODEC_IDT92HD81B1C	HDA_CODEC_CONSTRUCT(IDT, 0x76d5)
 #define HDA_CODEC_IDT92HD87B2_4	HDA_CODEC_CONSTRUCT(IDT, 0x76d9)
 #define HDA_CODEC_IDT92HD93BXX	HDA_CODEC_CONSTRUCT(IDT, 0x76df)
 #define HDA_CODEC_IDT92HD91BXX	HDA_CODEC_CONSTRUCT(IDT, 0x76e0)
 #define HDA_CODEC_IDT92HD98BXX	HDA_CODEC_CONSTRUCT(IDT, 0x76e3)
 #define HDA_CODEC_IDT92HD99BXX	HDA_CODEC_CONSTRUCT(IDT, 0x76e5)
 #define HDA_CODEC_IDT92HD90BXX	HDA_CODEC_CONSTRUCT(IDT, 0x76e7)
 #define HDA_CODEC_IDT92HD66B1X5	HDA_CODEC_CONSTRUCT(IDT, 0x76e8)
 #define HDA_CODEC_IDT92HD66B2X5	HDA_CODEC_CONSTRUCT(IDT, 0x76e9)
 #define HDA_CODEC_IDT92HD66B3X5	HDA_CODEC_CONSTRUCT(IDT, 0x76ea)
 #define HDA_CODEC_IDT92HD66C1X5	HDA_CODEC_CONSTRUCT(IDT, 0x76eb)
 #define HDA_CODEC_IDT92HD66C2X5	HDA_CODEC_CONSTRUCT(IDT, 0x76ec)
 #define HDA_CODEC_IDT92HD66C3X5	HDA_CODEC_CONSTRUCT(IDT, 0x76ed)
 #define HDA_CODEC_IDT92HD66B1X3	HDA_CODEC_CONSTRUCT(IDT, 0x76ee)
 #define HDA_CODEC_IDT92HD66B2X3	HDA_CODEC_CONSTRUCT(IDT, 0x76ef)
 #define HDA_CODEC_IDT92HD66B3X3	HDA_CODEC_CONSTRUCT(IDT, 0x76f0)
 #define HDA_CODEC_IDT92HD66C1X3	HDA_CODEC_CONSTRUCT(IDT, 0x76f1)
 #define HDA_CODEC_IDT92HD66C2X3	HDA_CODEC_CONSTRUCT(IDT, 0x76f2)
 #define HDA_CODEC_IDT92HD66C3_65	HDA_CODEC_CONSTRUCT(IDT, 0x76f3)
 #define HDA_CODEC_IDTXXXX	HDA_CODEC_CONSTRUCT(IDT, 0xffff)
 
 /* Silicon Image */
 #define SII_VENDORID	0x1095
 #define HDA_CODEC_SII1390	HDA_CODEC_CONSTRUCT(SII, 0x1390)
 #define HDA_CODEC_SII1392	HDA_CODEC_CONSTRUCT(SII, 0x1392)
 #define HDA_CODEC_SIIXXXX	HDA_CODEC_CONSTRUCT(SII, 0xffff)
 
 /* Lucent/Agere */
 #define AGERE_VENDORID	0x11c1
 #define HDA_CODEC_AGEREXXXX	HDA_CODEC_CONSTRUCT(AGERE, 0xffff)
 
 /* Conexant */
 #define CONEXANT_VENDORID	0x14f1
 #define HDA_CODEC_CX20549	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5045)
 #define HDA_CODEC_CX20551	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5047)
 #define HDA_CODEC_CX20561	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5051)
 #define HDA_CODEC_CX20582	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5066)
 #define HDA_CODEC_CX20583	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5067)
 #define HDA_CODEC_CX20584	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5068)
 #define HDA_CODEC_CX20585	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5069)
 #define HDA_CODEC_CX20588	HDA_CODEC_CONSTRUCT(CONEXANT, 0x506c)
 #define HDA_CODEC_CX20590	HDA_CODEC_CONSTRUCT(CONEXANT, 0x506e)
 #define HDA_CODEC_CX20631	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5097)
 #define HDA_CODEC_CX20632	HDA_CODEC_CONSTRUCT(CONEXANT, 0x5098)
 #define HDA_CODEC_CX20641	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50a1)
 #define HDA_CODEC_CX20642	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50a2)
 #define HDA_CODEC_CX20651	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50ab)
 #define HDA_CODEC_CX20652	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50ac)
 #define HDA_CODEC_CX20664	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50b8)
 #define HDA_CODEC_CX20665	HDA_CODEC_CONSTRUCT(CONEXANT, 0x50b9)
 #define HDA_CODEC_CXXXXX	HDA_CODEC_CONSTRUCT(CONEXANT, 0xffff)
 
 /* VIA */
 #define HDA_CODEC_VT1708_8	HDA_CODEC_CONSTRUCT(VIA, 0x1708)
 #define HDA_CODEC_VT1708_9	HDA_CODEC_CONSTRUCT(VIA, 0x1709)
 #define HDA_CODEC_VT1708_A	HDA_CODEC_CONSTRUCT(VIA, 0x170a)
 #define HDA_CODEC_VT1708_B	HDA_CODEC_CONSTRUCT(VIA, 0x170b)
 #define HDA_CODEC_VT1709_0	HDA_CODEC_CONSTRUCT(VIA, 0xe710)
 #define HDA_CODEC_VT1709_1	HDA_CODEC_CONSTRUCT(VIA, 0xe711)
 #define HDA_CODEC_VT1709_2	HDA_CODEC_CONSTRUCT(VIA, 0xe712)
 #define HDA_CODEC_VT1709_3	HDA_CODEC_CONSTRUCT(VIA, 0xe713)
 #define HDA_CODEC_VT1709_4	HDA_CODEC_CONSTRUCT(VIA, 0xe714)
 #define HDA_CODEC_VT1709_5	HDA_CODEC_CONSTRUCT(VIA, 0xe715)
 #define HDA_CODEC_VT1709_6	HDA_CODEC_CONSTRUCT(VIA, 0xe716)
 #define HDA_CODEC_VT1709_7	HDA_CODEC_CONSTRUCT(VIA, 0xe717)
 #define HDA_CODEC_VT1708B_0	HDA_CODEC_CONSTRUCT(VIA, 0xe720)
 #define HDA_CODEC_VT1708B_1	HDA_CODEC_CONSTRUCT(VIA, 0xe721)
 #define HDA_CODEC_VT1708B_2	HDA_CODEC_CONSTRUCT(VIA, 0xe722)
 #define HDA_CODEC_VT1708B_3	HDA_CODEC_CONSTRUCT(VIA, 0xe723)
 #define HDA_CODEC_VT1708B_4	HDA_CODEC_CONSTRUCT(VIA, 0xe724)
 #define HDA_CODEC_VT1708B_5	HDA_CODEC_CONSTRUCT(VIA, 0xe725)
 #define HDA_CODEC_VT1708B_6	HDA_CODEC_CONSTRUCT(VIA, 0xe726)
 #define HDA_CODEC_VT1708B_7	HDA_CODEC_CONSTRUCT(VIA, 0xe727)
 #define HDA_CODEC_VT1708S_0	HDA_CODEC_CONSTRUCT(VIA, 0x0397)
 #define HDA_CODEC_VT1708S_1	HDA_CODEC_CONSTRUCT(VIA, 0x1397)
 #define HDA_CODEC_VT1708S_2	HDA_CODEC_CONSTRUCT(VIA, 0x2397)
 #define HDA_CODEC_VT1708S_3	HDA_CODEC_CONSTRUCT(VIA, 0x3397)
 #define HDA_CODEC_VT1708S_4	HDA_CODEC_CONSTRUCT(VIA, 0x4397)
 #define HDA_CODEC_VT1708S_5	HDA_CODEC_CONSTRUCT(VIA, 0x5397)
 #define HDA_CODEC_VT1708S_6	HDA_CODEC_CONSTRUCT(VIA, 0x6397)
 #define HDA_CODEC_VT1708S_7	HDA_CODEC_CONSTRUCT(VIA, 0x7397)
 #define HDA_CODEC_VT1702_0	HDA_CODEC_CONSTRUCT(VIA, 0x0398)
 #define HDA_CODEC_VT1702_1	HDA_CODEC_CONSTRUCT(VIA, 0x1398)
 #define HDA_CODEC_VT1702_2	HDA_CODEC_CONSTRUCT(VIA, 0x2398)
 #define HDA_CODEC_VT1702_3	HDA_CODEC_CONSTRUCT(VIA, 0x3398)
 #define HDA_CODEC_VT1702_4	HDA_CODEC_CONSTRUCT(VIA, 0x4398)
 #define HDA_CODEC_VT1702_5	HDA_CODEC_CONSTRUCT(VIA, 0x5398)
 #define HDA_CODEC_VT1702_6	HDA_CODEC_CONSTRUCT(VIA, 0x6398)
 #define HDA_CODEC_VT1702_7	HDA_CODEC_CONSTRUCT(VIA, 0x7398)
 #define HDA_CODEC_VT1716S_0	HDA_CODEC_CONSTRUCT(VIA, 0x0433)
 #define HDA_CODEC_VT1716S_1	HDA_CODEC_CONSTRUCT(VIA, 0xa721)
 #define HDA_CODEC_VT1718S_0	HDA_CODEC_CONSTRUCT(VIA, 0x0428)
 #define HDA_CODEC_VT1718S_1	HDA_CODEC_CONSTRUCT(VIA, 0x4428)
 #define HDA_CODEC_VT1802_0	HDA_CODEC_CONSTRUCT(VIA, 0x0446)
 #define HDA_CODEC_VT1802_1	HDA_CODEC_CONSTRUCT(VIA, 0x8446)
 #define HDA_CODEC_VT1812	HDA_CODEC_CONSTRUCT(VIA, 0x0448)
 #define HDA_CODEC_VT1818S	HDA_CODEC_CONSTRUCT(VIA, 0x0440)
 #define HDA_CODEC_VT1828S	HDA_CODEC_CONSTRUCT(VIA, 0x4441)
 #define HDA_CODEC_VT2002P_0	HDA_CODEC_CONSTRUCT(VIA, 0x0438)
 #define HDA_CODEC_VT2002P_1	HDA_CODEC_CONSTRUCT(VIA, 0x4438)
 #define HDA_CODEC_VT2020	HDA_CODEC_CONSTRUCT(VIA, 0x0441)
 #define HDA_CODEC_VTXXXX	HDA_CODEC_CONSTRUCT(VIA, 0xffff)
 
 /* ATI */
 #define HDA_CODEC_ATIRS600_1	HDA_CODEC_CONSTRUCT(ATI, 0x793c)
 #define HDA_CODEC_ATIRS600_2	HDA_CODEC_CONSTRUCT(ATI, 0x7919)
 #define HDA_CODEC_ATIRS690	HDA_CODEC_CONSTRUCT(ATI, 0x791a)
 #define HDA_CODEC_ATIR6XX	HDA_CODEC_CONSTRUCT(ATI, 0xaa01)
 #define HDA_CODEC_ATIXXXX	HDA_CODEC_CONSTRUCT(ATI, 0xffff)
 
 /* NVIDIA */
 #define HDA_CODEC_NVIDIAMCP78	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0002)
 #define HDA_CODEC_NVIDIAMCP78_2	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0003)
 #define HDA_CODEC_NVIDIAMCP78_3	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0005)
 #define HDA_CODEC_NVIDIAMCP78_4	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0006)
 #define HDA_CODEC_NVIDIAMCP7A	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0007)
 #define HDA_CODEC_NVIDIAGT220	HDA_CODEC_CONSTRUCT(NVIDIA, 0x000a)
 #define HDA_CODEC_NVIDIAGT21X	HDA_CODEC_CONSTRUCT(NVIDIA, 0x000b)
 #define HDA_CODEC_NVIDIAMCP89	HDA_CODEC_CONSTRUCT(NVIDIA, 0x000c)
 #define HDA_CODEC_NVIDIAGT240	HDA_CODEC_CONSTRUCT(NVIDIA, 0x000d)
 #define HDA_CODEC_NVIDIAGTS450	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0011)
 #define HDA_CODEC_NVIDIAGT440	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0014)
 #define HDA_CODEC_NVIDIAGTX550	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0015)
 #define HDA_CODEC_NVIDIAGTX570	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0018)
 #define HDA_CODEC_NVIDIAMCP67	HDA_CODEC_CONSTRUCT(NVIDIA, 0x0067)
 #define HDA_CODEC_NVIDIAMCP73	HDA_CODEC_CONSTRUCT(NVIDIA, 0x8001)
 #define HDA_CODEC_NVIDIAXXXX	HDA_CODEC_CONSTRUCT(NVIDIA, 0xffff)
 
 /* Chrontel */
 #define CHRONTEL_VENDORID	0x17e8
 #define HDA_CODEC_CHXXXX	HDA_CODEC_CONSTRUCT(CHRONTEL, 0xffff)
 
 /* INTEL */
 #define HDA_CODEC_INTELIP	HDA_CODEC_CONSTRUCT(INTEL, 0x0054)
 #define HDA_CODEC_INTELBL	HDA_CODEC_CONSTRUCT(INTEL, 0x2801)
 #define HDA_CODEC_INTELCA	HDA_CODEC_CONSTRUCT(INTEL, 0x2802)
 #define HDA_CODEC_INTELEL	HDA_CODEC_CONSTRUCT(INTEL, 0x2803)
 #define HDA_CODEC_INTELIP2	HDA_CODEC_CONSTRUCT(INTEL, 0x2804)
 #define HDA_CODEC_INTELCPT	HDA_CODEC_CONSTRUCT(INTEL, 0x2805)
 #define HDA_CODEC_INTELPPT	HDA_CODEC_CONSTRUCT(INTEL, 0x2806)
 #define HDA_CODEC_INTELHSW	HDA_CODEC_CONSTRUCT(INTEL, 0x2807)
+#define HDA_CODEC_INTELBDW	HDA_CODEC_CONSTRUCT(INTEL, 0x2808)
 #define HDA_CODEC_INTELCL	HDA_CODEC_CONSTRUCT(INTEL, 0x29fb)
 #define HDA_CODEC_INTELXXXX	HDA_CODEC_CONSTRUCT(INTEL, 0xffff)
 
 /****************************************************************************
  * Helper Macros
  ****************************************************************************/
 
 #define HDA_DMA_ALIGNMENT	128
 
 #define HDA_BDL_MIN		2
 #define HDA_BDL_MAX		256
 #define HDA_BDL_DEFAULT		HDA_BDL_MIN
 
 #define HDA_BLK_MIN		HDA_DMA_ALIGNMENT
 #define HDA_BLK_ALIGN		(~(HDA_BLK_MIN - 1))
 
 #define HDA_BUFSZ_MIN		(HDA_BDL_MIN * HDA_BLK_MIN)
 #define HDA_BUFSZ_MAX		262144
 #define HDA_BUFSZ_DEFAULT	65536
 
 #define HDA_GPIO_MAX		8
 
 #define HDA_DEV_MATCH(fl, v)	((fl) == (v) || \
 				(fl) == 0xffffffff || \
 				(((fl) & 0xffff0000) == 0xffff0000 && \
 				((fl) & 0x0000ffff) == ((v) & 0x0000ffff)) || \
 				(((fl) & 0x0000ffff) == 0x0000ffff && \
 				((fl) & 0xffff0000) == ((v) & 0xffff0000)))
 #define HDA_MATCH_ALL		0xffffffff
 #define HDA_INVALID		0xffffffff
 
 #define HDA_BOOTVERBOSE(stmt)	do {			\
 	if (bootverbose != 0 || snd_verbose > 3) {	\
 		stmt					\
 	}						\
 } while (0)
 
 #define HDA_BOOTHVERBOSE(stmt)	do {			\
 	if (snd_verbose > 3) {				\
 		stmt					\
 	}						\
 } while (0)
 
 #define hda_command(dev, verb)					\
     HDAC_CODEC_COMMAND(device_get_parent(dev), (dev), (verb))
 
 typedef int nid_t;
 
 /****************************************************************************
  * Simplified Accessors for HDA devices
  ****************************************************************************/
 
 enum hdac_device_ivars {
     HDA_IVAR_CODEC_ID,
     HDA_IVAR_NODE_ID,
     HDA_IVAR_VENDOR_ID,
     HDA_IVAR_DEVICE_ID,
     HDA_IVAR_REVISION_ID,
     HDA_IVAR_STEPPING_ID,
     HDA_IVAR_SUBVENDOR_ID,
     HDA_IVAR_SUBDEVICE_ID,
     HDA_IVAR_SUBSYSTEM_ID,
     HDA_IVAR_NODE_TYPE,
     HDA_IVAR_DMA_NOCACHE,
 };
 
 #define HDA_ACCESSOR(var, ivar, type)					\
     __BUS_ACCESSOR(hda, var, HDA, ivar, type)
 
 HDA_ACCESSOR(codec_id,		CODEC_ID,	uint8_t);
 HDA_ACCESSOR(node_id,		NODE_ID,	uint8_t);
 HDA_ACCESSOR(vendor_id,		VENDOR_ID,	uint16_t);
 HDA_ACCESSOR(device_id,		DEVICE_ID,	uint16_t);
 HDA_ACCESSOR(revision_id,	REVISION_ID,	uint8_t);
 HDA_ACCESSOR(stepping_id,	STEPPING_ID,	uint8_t);
 HDA_ACCESSOR(subvendor_id,	SUBVENDOR_ID,	uint16_t);
 HDA_ACCESSOR(subdevice_id,	SUBDEVICE_ID,	uint16_t);
 HDA_ACCESSOR(subsystem_id,	SUBSYSTEM_ID,	uint32_t);
 HDA_ACCESSOR(node_type,		NODE_TYPE,	uint8_t);
 HDA_ACCESSOR(dma_nocache,	DMA_NOCACHE,	uint8_t);
 
 #define PCIS_MULTIMEDIA_HDA    0x03
 
 #endif
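Reviewer note (not part of the patch): the ID tables above only make sense together with the wildcard rules in HDA_DEV_MATCH, so here is a minimal userland sketch of how they interact. It copies HDA_CODEC_CONSTRUCT, HDA_DEV_MATCH, HDA_MATCH_ALL and a few codec IDs from the hdac.h context shown in this diff. The helper hda_match_selfcheck() is hypothetical and exists only for illustration; the real driver uses these macros inside the kernel, not in a standalone program.

```c
#include <assert.h>
#include <stdint.h>

/* Vendor IDs and constructor copied from the hdac.h context above. */
#define CIRRUSLOGIC_VENDORID	0x1013
#define REALTEK_VENDORID	0x10ec

/* Vendor goes in the high 16 bits, codec ID in the low 16 bits. */
#define HDA_CODEC_CONSTRUCT(vendor, id) \
		(((uint32_t)(vendor##_VENDORID) << 16) | ((id) & 0xffff))

#define HDA_CODEC_CS4206	HDA_CODEC_CONSTRUCT(CIRRUSLOGIC, 0x4206)
#define HDA_CODEC_ALC276	HDA_CODEC_CONSTRUCT(REALTEK, 0x0276)
#define HDA_CODEC_ALC292	HDA_CODEC_CONSTRUCT(REALTEK, 0x0292)
#define HDA_CODEC_ALCXXXX	HDA_CODEC_CONSTRUCT(REALTEK, 0xffff)

/*
 * A filter (fl) matches a value (v) when they are equal, when the filter
 * is all ones, or when one 16-bit half of the filter is 0xffff and the
 * other half equals the same half of the value.
 */
#define HDA_DEV_MATCH(fl, v)	((fl) == (v) || \
				(fl) == 0xffffffff || \
				(((fl) & 0xffff0000) == 0xffff0000 && \
				((fl) & 0x0000ffff) == ((v) & 0x0000ffff)) || \
				(((fl) & 0x0000ffff) == 0x0000ffff && \
				((fl) & 0xffff0000) == ((v) & 0xffff0000)))
#define HDA_MATCH_ALL		0xffffffff

/* Hypothetical helper, for illustration only. */
static void
hda_match_selfcheck(void)
{
	/* The newly added ALC292 entry packs to vendor:id. */
	assert(HDA_CODEC_ALC292 == 0x10ec0292u);

	/* Exact match and the per-vendor wildcard both accept ALC292. */
	assert(HDA_DEV_MATCH(HDA_CODEC_ALC292, HDA_CODEC_ALC292));
	assert(HDA_DEV_MATCH(HDA_CODEC_ALCXXXX, HDA_CODEC_ALC292));

	/* A Realtek wildcard does not accept a Cirrus Logic codec. */
	assert(!HDA_DEV_MATCH(HDA_CODEC_ALCXXXX, HDA_CODEC_CS4206));

	/* Two distinct exact IDs never match each other. */
	assert(!HDA_DEV_MATCH(HDA_CODEC_ALC292, HDA_CODEC_ALC276));

	/* HDA_MATCH_ALL accepts anything. */
	assert(HDA_DEV_MATCH(HDA_MATCH_ALL, HDA_CODEC_CS4206));
}
```

Calling hda_match_selfcheck() from a small test program should pass silently. It suggests that a specific entry such as the new HDA_CODEC_ALC292 can coexist with the catch-all HDA_CODEC_ALCXXXX entry, provided the table lookup reaches the specific entry first; the lookup order itself is not shown in this part of the diff.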
Index: user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c
===================================================================
--- user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c	(revision 281584)
+++ user/ngie/more-tests/sys/dev/sound/pci/hda/hdacc.c	(revision 281585)
@@ -1,726 +1,728 @@
 /*-
  * Copyright (c) 2006 Stephane E. Potvin <sepotvin@videotron.ca>
  * Copyright (c) 2006 Ariff Abdullah <ariff@FreeBSD.org>
  * Copyright (c) 2008-2012 Alexander Motin <mav@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 /*
  * Intel High Definition Audio (CODEC) driver for FreeBSD.
  */
 
 #ifdef HAVE_KERNEL_OPTION_HEADERS
 #include "opt_snd.h"
 #endif
 
 #include <dev/sound/pcm/sound.h>
 
 #include <sys/ctype.h>
 
 #include <dev/sound/pci/hda/hda_reg.h>
 #include <dev/sound/pci/hda/hdac.h>
 
 SND_DECLARE_FILE("$FreeBSD$");
 
 struct hdacc_fg {
 	device_t	dev;
 	nid_t		nid;
 	uint8_t		type;
 	uint32_t	subsystem_id;
 };
 
 struct hdacc_softc {
 	device_t	dev;
 	struct mtx	*lock;
 	nid_t		cad;
 	device_t	streams[2][16];
 	device_t	tags[64];
 	int		fgcnt;
 	struct hdacc_fg	*fgs;
 };
 
 #define hdacc_lock(codec)	snd_mtxlock((codec)->lock)
 #define hdacc_unlock(codec)	snd_mtxunlock((codec)->lock)
 #define hdacc_lockassert(codec)	snd_mtxassert((codec)->lock)
 #define hdacc_lockowned(codec)	mtx_owned((codec)->lock)
 
 MALLOC_DEFINE(M_HDACC, "hdacc", "HDA CODEC");
 
 /* CODECs */
 static const struct {
 	uint32_t id;
 	uint16_t revid;
 	const char *name;
 } hdacc_codecs[] = {
 	{ HDA_CODEC_CS4206, 0,		"Cirrus Logic CS4206" },
 	{ HDA_CODEC_CS4207, 0,		"Cirrus Logic CS4207" },
 	{ HDA_CODEC_CS4210, 0,		"Cirrus Logic CS4210" },
 	{ HDA_CODEC_ALC221, 0,		"Realtek ALC221" },
 	{ HDA_CODEC_ALC260, 0,		"Realtek ALC260" },
 	{ HDA_CODEC_ALC262, 0,		"Realtek ALC262" },
 	{ HDA_CODEC_ALC267, 0,		"Realtek ALC267" },
 	{ HDA_CODEC_ALC268, 0,		"Realtek ALC268" },
 	{ HDA_CODEC_ALC269, 0,		"Realtek ALC269" },
 	{ HDA_CODEC_ALC270, 0,		"Realtek ALC270" },
 	{ HDA_CODEC_ALC272, 0,		"Realtek ALC272" },
 	{ HDA_CODEC_ALC273, 0,		"Realtek ALC273" },
 	{ HDA_CODEC_ALC275, 0,		"Realtek ALC275" },
 	{ HDA_CODEC_ALC276, 0,		"Realtek ALC276" },
+	{ HDA_CODEC_ALC292, 0,		"Realtek ALC292" },
 	{ HDA_CODEC_ALC660, 0,		"Realtek ALC660-VD" },
 	{ HDA_CODEC_ALC662, 0x0002,	"Realtek ALC662 rev2" },
 	{ HDA_CODEC_ALC662, 0,		"Realtek ALC662" },
 	{ HDA_CODEC_ALC663, 0,		"Realtek ALC663" },
 	{ HDA_CODEC_ALC665, 0,		"Realtek ALC665" },
 	{ HDA_CODEC_ALC670, 0,		"Realtek ALC670" },
 	{ HDA_CODEC_ALC680, 0,		"Realtek ALC680" },
 	{ HDA_CODEC_ALC861, 0x0340,	"Realtek ALC660" },
 	{ HDA_CODEC_ALC861, 0,		"Realtek ALC861" },
 	{ HDA_CODEC_ALC861VD, 0,	"Realtek ALC861-VD" },
 	{ HDA_CODEC_ALC880, 0,		"Realtek ALC880" },
 	{ HDA_CODEC_ALC882, 0,		"Realtek ALC882" },
 	{ HDA_CODEC_ALC883, 0,		"Realtek ALC883" },
 	{ HDA_CODEC_ALC885, 0x0101,	"Realtek ALC889A" },
 	{ HDA_CODEC_ALC885, 0x0103,	"Realtek ALC889A" },
 	{ HDA_CODEC_ALC885, 0,		"Realtek ALC885" },
 	{ HDA_CODEC_ALC887, 0,		"Realtek ALC887" },
 	{ HDA_CODEC_ALC888, 0x0101,	"Realtek ALC1200" },
 	{ HDA_CODEC_ALC888, 0,		"Realtek ALC888" },
 	{ HDA_CODEC_ALC889, 0,		"Realtek ALC889" },
 	{ HDA_CODEC_ALC892, 0,		"Realtek ALC892" },
 	{ HDA_CODEC_ALC899, 0,		"Realtek ALC899" },
 	{ HDA_CODEC_AD1882, 0,		"Analog Devices AD1882" },
 	{ HDA_CODEC_AD1882A, 0,		"Analog Devices AD1882A" },
 	{ HDA_CODEC_AD1883, 0,		"Analog Devices AD1883" },
 	{ HDA_CODEC_AD1884, 0,		"Analog Devices AD1884" },
 	{ HDA_CODEC_AD1884A, 0,		"Analog Devices AD1884A" },
 	{ HDA_CODEC_AD1981HD, 0,	"Analog Devices AD1981HD" },
 	{ HDA_CODEC_AD1983, 0,		"Analog Devices AD1983" },
 	{ HDA_CODEC_AD1984, 0,		"Analog Devices AD1984" },
 	{ HDA_CODEC_AD1984A, 0,		"Analog Devices AD1984A" },
 	{ HDA_CODEC_AD1984B, 0,		"Analog Devices AD1984B" },
 	{ HDA_CODEC_AD1986A, 0,		"Analog Devices AD1986A" },
 	{ HDA_CODEC_AD1987, 0,		"Analog Devices AD1987" },
 	{ HDA_CODEC_AD1988, 0,		"Analog Devices AD1988A" },
 	{ HDA_CODEC_AD1988B, 0,		"Analog Devices AD1988B" },
 	{ HDA_CODEC_AD1989A, 0,		"Analog Devices AD1989A" },
 	{ HDA_CODEC_AD1989B, 0,		"Analog Devices AD1989B" },
 	{ HDA_CODEC_CA0110, 0,		"Creative CA0110-IBG" },
 	{ HDA_CODEC_CA0110_2, 0,	"Creative CA0110-IBG" },
 	{ HDA_CODEC_CA0132, 0,		"Creative CA0132" },
 	{ HDA_CODEC_SB0880, 0,		"Creative SB0880 X-Fi" },
 	{ HDA_CODEC_CMI9880, 0,		"CMedia CMI9880" },
 	{ HDA_CODEC_CMI98802, 0,	"CMedia CMI9880" },
 	{ HDA_CODEC_CXD9872RDK, 0,	"Sigmatel CXD9872RD/K" },
 	{ HDA_CODEC_CXD9872AKD, 0,	"Sigmatel CXD9872AKD" },
 	{ HDA_CODEC_STAC9200D, 0,	"Sigmatel STAC9200D" },
 	{ HDA_CODEC_STAC9204X, 0,	"Sigmatel STAC9204X" },
 	{ HDA_CODEC_STAC9204D, 0,	"Sigmatel STAC9204D" },
 	{ HDA_CODEC_STAC9205X, 0,	"Sigmatel STAC9205X" },
 	{ HDA_CODEC_STAC9205D, 0,	"Sigmatel STAC9205D" },
 	{ HDA_CODEC_STAC9220, 0,	"Sigmatel STAC9220" },
 	{ HDA_CODEC_STAC9220_A1, 0,	"Sigmatel STAC9220_A1" },
 	{ HDA_CODEC_STAC9220_A2, 0,	"Sigmatel STAC9220_A2" },
 	{ HDA_CODEC_STAC9221, 0,	"Sigmatel STAC9221" },
 	{ HDA_CODEC_STAC9221_A2, 0,	"Sigmatel STAC9221_A2" },
 	{ HDA_CODEC_STAC9221D, 0,	"Sigmatel STAC9221D" },
 	{ HDA_CODEC_STAC922XD, 0,	"Sigmatel STAC9220D/9223D" },
 	{ HDA_CODEC_STAC9227X, 0,	"Sigmatel STAC9227X" },
 	{ HDA_CODEC_STAC9227D, 0,	"Sigmatel STAC9227D" },
 	{ HDA_CODEC_STAC9228X, 0,	"Sigmatel STAC9228X" },
 	{ HDA_CODEC_STAC9228D, 0,	"Sigmatel STAC9228D" },
 	{ HDA_CODEC_STAC9229X, 0,	"Sigmatel STAC9229X" },
 	{ HDA_CODEC_STAC9229D, 0,	"Sigmatel STAC9229D" },
 	{ HDA_CODEC_STAC9230X, 0,	"Sigmatel STAC9230X" },
 	{ HDA_CODEC_STAC9230D, 0,	"Sigmatel STAC9230D" },
 	{ HDA_CODEC_STAC9250, 0, 	"Sigmatel STAC9250" },
 	{ HDA_CODEC_STAC9251, 0, 	"Sigmatel STAC9251" },
 	{ HDA_CODEC_STAC9255, 0, 	"Sigmatel STAC9255" },
 	{ HDA_CODEC_STAC9255D, 0, 	"Sigmatel STAC9255D" },
 	{ HDA_CODEC_STAC9254, 0, 	"Sigmatel STAC9254" },
 	{ HDA_CODEC_STAC9254D, 0, 	"Sigmatel STAC9254D" },
 	{ HDA_CODEC_STAC9271X, 0,	"Sigmatel STAC9271X" },
 	{ HDA_CODEC_STAC9271D, 0,	"Sigmatel STAC9271D" },
 	{ HDA_CODEC_STAC9272X, 0,	"Sigmatel STAC9272X" },
 	{ HDA_CODEC_STAC9272D, 0,	"Sigmatel STAC9272D" },
 	{ HDA_CODEC_STAC9273X, 0,	"Sigmatel STAC9273X" },
 	{ HDA_CODEC_STAC9273D, 0,	"Sigmatel STAC9273D" },
 	{ HDA_CODEC_STAC9274, 0, 	"Sigmatel STAC9274" },
 	{ HDA_CODEC_STAC9274D, 0,	"Sigmatel STAC9274D" },
 	{ HDA_CODEC_STAC9274X5NH, 0,	"Sigmatel STAC9274X5NH" },
 	{ HDA_CODEC_STAC9274D5NH, 0,	"Sigmatel STAC9274D5NH" },
 	{ HDA_CODEC_STAC9872AK, 0,	"Sigmatel STAC9872AK" },
 	{ HDA_CODEC_IDT92HD005, 0,	"IDT 92HD005" },
 	{ HDA_CODEC_IDT92HD005D, 0,	"IDT 92HD005D" },
 	{ HDA_CODEC_IDT92HD206X, 0,	"IDT 92HD206X" },
 	{ HDA_CODEC_IDT92HD206D, 0,	"IDT 92HD206D" },
 	{ HDA_CODEC_IDT92HD66B1X5, 0,	"IDT 92HD66B1X5" },
 	{ HDA_CODEC_IDT92HD66B2X5, 0,	"IDT 92HD66B2X5" },
 	{ HDA_CODEC_IDT92HD66B3X5, 0,	"IDT 92HD66B3X5" },
 	{ HDA_CODEC_IDT92HD66C1X5, 0,	"IDT 92HD66C1X5" },
 	{ HDA_CODEC_IDT92HD66C2X5, 0,	"IDT 92HD66C2X5" },
 	{ HDA_CODEC_IDT92HD66C3X5, 0,	"IDT 92HD66C3X5" },
 	{ HDA_CODEC_IDT92HD66B1X3, 0,	"IDT 92HD66B1X3" },
 	{ HDA_CODEC_IDT92HD66B2X3, 0,	"IDT 92HD66B2X3" },
 	{ HDA_CODEC_IDT92HD66B3X3, 0,	"IDT 92HD66B3X3" },
 	{ HDA_CODEC_IDT92HD66C1X3, 0,	"IDT 92HD66C1X3" },
 	{ HDA_CODEC_IDT92HD66C2X3, 0,	"IDT 92HD66C2X3" },
 	{ HDA_CODEC_IDT92HD66C3_65, 0,	"IDT 92HD66C3_65" },
 	{ HDA_CODEC_IDT92HD700X, 0,	"IDT 92HD700X" },
 	{ HDA_CODEC_IDT92HD700D, 0,	"IDT 92HD700D" },
 	{ HDA_CODEC_IDT92HD71B5, 0,	"IDT 92HD71B5" },
 	{ HDA_CODEC_IDT92HD71B5_2, 0,	"IDT 92HD71B5" },
 	{ HDA_CODEC_IDT92HD71B6, 0,	"IDT 92HD71B6" },
 	{ HDA_CODEC_IDT92HD71B6_2, 0,	"IDT 92HD71B6" },
 	{ HDA_CODEC_IDT92HD71B7, 0,	"IDT 92HD71B7" },
 	{ HDA_CODEC_IDT92HD71B7_2, 0,	"IDT 92HD71B7" },
 	{ HDA_CODEC_IDT92HD71B8, 0,	"IDT 92HD71B8" },
 	{ HDA_CODEC_IDT92HD71B8_2, 0,	"IDT 92HD71B8" },
 	{ HDA_CODEC_IDT92HD73C1, 0,	"IDT 92HD73C1" },
 	{ HDA_CODEC_IDT92HD73D1, 0,	"IDT 92HD73D1" },
 	{ HDA_CODEC_IDT92HD73E1, 0,	"IDT 92HD73E1" },
 	{ HDA_CODEC_IDT92HD75B3, 0,	"IDT 92HD75B3" },
 	{ HDA_CODEC_IDT92HD75BX, 0,	"IDT 92HD75BX" },
 	{ HDA_CODEC_IDT92HD81B1C, 0,	"IDT 92HD81B1C" },
 	{ HDA_CODEC_IDT92HD81B1X, 0,	"IDT 92HD81B1X" },
 	{ HDA_CODEC_IDT92HD83C1C, 0,	"IDT 92HD83C1C" },
 	{ HDA_CODEC_IDT92HD83C1X, 0,	"IDT 92HD83C1X" },
 	{ HDA_CODEC_IDT92HD87B1_3, 0,	"IDT 92HD87B1/3" },
 	{ HDA_CODEC_IDT92HD87B2_4, 0,	"IDT 92HD87B2/4" },
 	{ HDA_CODEC_IDT92HD89C3, 0,	"IDT 92HD89C3" },
 	{ HDA_CODEC_IDT92HD89C2, 0,	"IDT 92HD89C2" },
 	{ HDA_CODEC_IDT92HD89C1, 0,	"IDT 92HD89C1" },
 	{ HDA_CODEC_IDT92HD89B3, 0,	"IDT 92HD89B3" },
 	{ HDA_CODEC_IDT92HD89B2, 0,	"IDT 92HD89B2" },
 	{ HDA_CODEC_IDT92HD89B1, 0,	"IDT 92HD89B1" },
 	{ HDA_CODEC_IDT92HD89E3, 0,	"IDT 92HD89E3" },
 	{ HDA_CODEC_IDT92HD89E2, 0,	"IDT 92HD89E2" },
 	{ HDA_CODEC_IDT92HD89E1, 0,	"IDT 92HD89E1" },
 	{ HDA_CODEC_IDT92HD89D3, 0,	"IDT 92HD89D3" },
 	{ HDA_CODEC_IDT92HD89D2, 0,	"IDT 92HD89D2" },
 	{ HDA_CODEC_IDT92HD89D1, 0,	"IDT 92HD89D1" },
 	{ HDA_CODEC_IDT92HD89F3, 0,	"IDT 92HD89F3" },
 	{ HDA_CODEC_IDT92HD89F2, 0,	"IDT 92HD89F2" },
 	{ HDA_CODEC_IDT92HD89F1, 0,	"IDT 92HD89F1" },
 	{ HDA_CODEC_IDT92HD90BXX, 0,	"IDT 92HD90BXX" },
 	{ HDA_CODEC_IDT92HD91BXX, 0,	"IDT 92HD91BXX" },
 	{ HDA_CODEC_IDT92HD93BXX, 0,	"IDT 92HD93BXX" },
 	{ HDA_CODEC_IDT92HD98BXX, 0,	"IDT 92HD98BXX" },
 	{ HDA_CODEC_IDT92HD99BXX, 0,	"IDT 92HD99BXX" },
 	{ HDA_CODEC_CX20549, 0,		"Conexant CX20549 (Venice)" },
 	{ HDA_CODEC_CX20551, 0,		"Conexant CX20551 (Waikiki)" },
 	{ HDA_CODEC_CX20561, 0,		"Conexant CX20561 (Hermosa)" },
 	{ HDA_CODEC_CX20582, 0,		"Conexant CX20582 (Pebble)" },
 	{ HDA_CODEC_CX20583, 0,		"Conexant CX20583 (Pebble HSF)" },
 	{ HDA_CODEC_CX20584, 0,		"Conexant CX20584" },
 	{ HDA_CODEC_CX20585, 0,		"Conexant CX20585" },
 	{ HDA_CODEC_CX20588, 0,		"Conexant CX20588" },
 	{ HDA_CODEC_CX20590, 0,		"Conexant CX20590" },
 	{ HDA_CODEC_CX20631, 0,		"Conexant CX20631" },
 	{ HDA_CODEC_CX20632, 0,		"Conexant CX20632" },
 	{ HDA_CODEC_CX20641, 0,		"Conexant CX20641" },
 	{ HDA_CODEC_CX20642, 0,		"Conexant CX20642" },
 	{ HDA_CODEC_CX20651, 0,		"Conexant CX20651" },
 	{ HDA_CODEC_CX20652, 0,		"Conexant CX20652" },
 	{ HDA_CODEC_CX20664, 0,		"Conexant CX20664" },
 	{ HDA_CODEC_CX20665, 0,		"Conexant CX20665" },
 	{ HDA_CODEC_VT1708_8, 0,	"VIA VT1708_8" },
 	{ HDA_CODEC_VT1708_9, 0,	"VIA VT1708_9" },
 	{ HDA_CODEC_VT1708_A, 0,	"VIA VT1708_A" },
 	{ HDA_CODEC_VT1708_B, 0,	"VIA VT1708_B" },
 	{ HDA_CODEC_VT1709_0, 0,	"VIA VT1709_0" },
 	{ HDA_CODEC_VT1709_1, 0,	"VIA VT1709_1" },
 	{ HDA_CODEC_VT1709_2, 0,	"VIA VT1709_2" },
 	{ HDA_CODEC_VT1709_3, 0,	"VIA VT1709_3" },
 	{ HDA_CODEC_VT1709_4, 0,	"VIA VT1709_4" },
 	{ HDA_CODEC_VT1709_5, 0,	"VIA VT1709_5" },
 	{ HDA_CODEC_VT1709_6, 0,	"VIA VT1709_6" },
 	{ HDA_CODEC_VT1709_7, 0,	"VIA VT1709_7" },
 	{ HDA_CODEC_VT1708B_0, 0,	"VIA VT1708B_0" },
 	{ HDA_CODEC_VT1708B_1, 0,	"VIA VT1708B_1" },
 	{ HDA_CODEC_VT1708B_2, 0,	"VIA VT1708B_2" },
 	{ HDA_CODEC_VT1708B_3, 0,	"VIA VT1708B_3" },
 	{ HDA_CODEC_VT1708B_4, 0,	"VIA VT1708B_4" },
 	{ HDA_CODEC_VT1708B_5, 0,	"VIA VT1708B_5" },
 	{ HDA_CODEC_VT1708B_6, 0,	"VIA VT1708B_6" },
 	{ HDA_CODEC_VT1708B_7, 0,	"VIA VT1708B_7" },
 	{ HDA_CODEC_VT1708S_0, 0,	"VIA VT1708S_0" },
 	{ HDA_CODEC_VT1708S_1, 0,	"VIA VT1708S_1" },
 	{ HDA_CODEC_VT1708S_2, 0,	"VIA VT1708S_2" },
 	{ HDA_CODEC_VT1708S_3, 0,	"VIA VT1708S_3" },
 	{ HDA_CODEC_VT1708S_4, 0,	"VIA VT1708S_4" },
 	{ HDA_CODEC_VT1708S_5, 0,	"VIA VT1708S_5" },
 	{ HDA_CODEC_VT1708S_6, 0,	"VIA VT1708S_6" },
 	{ HDA_CODEC_VT1708S_7, 0,	"VIA VT1708S_7" },
 	{ HDA_CODEC_VT1702_0, 0,	"VIA VT1702_0" },
 	{ HDA_CODEC_VT1702_1, 0,	"VIA VT1702_1" },
 	{ HDA_CODEC_VT1702_2, 0,	"VIA VT1702_2" },
 	{ HDA_CODEC_VT1702_3, 0,	"VIA VT1702_3" },
 	{ HDA_CODEC_VT1702_4, 0,	"VIA VT1702_4" },
 	{ HDA_CODEC_VT1702_5, 0,	"VIA VT1702_5" },
 	{ HDA_CODEC_VT1702_6, 0,	"VIA VT1702_6" },
 	{ HDA_CODEC_VT1702_7, 0,	"VIA VT1702_7" },
 	{ HDA_CODEC_VT1716S_0, 0,	"VIA VT1716S_0" },
 	{ HDA_CODEC_VT1716S_1, 0,	"VIA VT1716S_1" },
 	{ HDA_CODEC_VT1718S_0, 0,	"VIA VT1718S_0" },
 	{ HDA_CODEC_VT1718S_1, 0,	"VIA VT1718S_1" },
 	{ HDA_CODEC_VT1802_0, 0,	"VIA VT1802_0" },
 	{ HDA_CODEC_VT1802_1, 0,	"VIA VT1802_1" },
 	{ HDA_CODEC_VT1812, 0,		"VIA VT1812" },
 	{ HDA_CODEC_VT1818S, 0,		"VIA VT1818S" },
 	{ HDA_CODEC_VT1828S, 0,		"VIA VT1828S" },
 	{ HDA_CODEC_VT2002P_0, 0,	"VIA VT2002P_0" },
 	{ HDA_CODEC_VT2002P_1, 0,	"VIA VT2002P_1" },
 	{ HDA_CODEC_VT2020, 0,		"VIA VT2020" },
 	{ HDA_CODEC_ATIRS600_1, 0,	"ATI RS600" },
 	{ HDA_CODEC_ATIRS600_2, 0,	"ATI RS600" },
 	{ HDA_CODEC_ATIRS690, 0,	"ATI RS690/780" },
 	{ HDA_CODEC_ATIR6XX, 0,		"ATI R6xx" },
 	{ HDA_CODEC_NVIDIAMCP67, 0,	"NVIDIA MCP67" },
 	{ HDA_CODEC_NVIDIAMCP73, 0,	"NVIDIA MCP73" },
 	{ HDA_CODEC_NVIDIAMCP78, 0,	"NVIDIA MCP78" },
 	{ HDA_CODEC_NVIDIAMCP78_2, 0,	"NVIDIA MCP78" },
 	{ HDA_CODEC_NVIDIAMCP78_3, 0,	"NVIDIA MCP78" },
 	{ HDA_CODEC_NVIDIAMCP78_4, 0,	"NVIDIA MCP78" },
 	{ HDA_CODEC_NVIDIAMCP7A, 0,	"NVIDIA MCP7A" },
 	{ HDA_CODEC_NVIDIAGT220, 0,	"NVIDIA GT220" },
 	{ HDA_CODEC_NVIDIAGT21X, 0,	"NVIDIA GT21x" },
 	{ HDA_CODEC_NVIDIAMCP89, 0,	"NVIDIA MCP89" },
 	{ HDA_CODEC_NVIDIAGT240, 0,	"NVIDIA GT240" },
 	{ HDA_CODEC_NVIDIAGTS450, 0,	"NVIDIA GTS450" },
 	{ HDA_CODEC_NVIDIAGT440, 0,	"NVIDIA GT440" },
 	{ HDA_CODEC_NVIDIAGTX550, 0,	"NVIDIA GTX550" },
 	{ HDA_CODEC_NVIDIAGTX570, 0,	"NVIDIA GTX570" },
 	{ HDA_CODEC_INTELIP, 0,		"Intel Ibex Peak" },
 	{ HDA_CODEC_INTELBL, 0,		"Intel Bearlake" },
 	{ HDA_CODEC_INTELCA, 0,		"Intel Cantiga" },
 	{ HDA_CODEC_INTELEL, 0,		"Intel Eaglelake" },
 	{ HDA_CODEC_INTELIP2, 0,	"Intel Ibex Peak" },
 	{ HDA_CODEC_INTELCPT, 0,	"Intel Cougar Point" },
 	{ HDA_CODEC_INTELPPT, 0,	"Intel Panther Point" },
 	{ HDA_CODEC_INTELHSW, 0,	"Intel Haswell" },
+	{ HDA_CODEC_INTELBDW, 0,	"Intel Broadwell" },
 	{ HDA_CODEC_INTELCL, 0,		"Intel Crestline" },
 	{ HDA_CODEC_SII1390, 0,		"Silicon Image SiI1390" },
 	{ HDA_CODEC_SII1392, 0,		"Silicon Image SiI1392" },
 	/* Unknown CODECs */
 	{ HDA_CODEC_ADXXXX, 0,		"Analog Devices" },
 	{ HDA_CODEC_AGEREXXXX, 0,	"Lucent/Agere Systems" },
 	{ HDA_CODEC_ALCXXXX, 0,		"Realtek" },
 	{ HDA_CODEC_ATIXXXX, 0,		"ATI" },
 	{ HDA_CODEC_CAXXXX, 0,		"Creative" },
 	{ HDA_CODEC_CMIXXXX, 0,		"CMedia" },
 	{ HDA_CODEC_CMIXXXX2, 0,	"CMedia" },
 	{ HDA_CODEC_CSXXXX, 0,		"Cirrus Logic" },
 	{ HDA_CODEC_CXXXXX, 0,		"Conexant" },
 	{ HDA_CODEC_CHXXXX, 0,		"Chrontel" },
 	{ HDA_CODEC_IDTXXXX, 0,		"IDT" },
 	{ HDA_CODEC_INTELXXXX, 0,	"Intel" },
 	{ HDA_CODEC_MOTOXXXX, 0,	"Motorola" },
 	{ HDA_CODEC_NVIDIAXXXX, 0,	"NVIDIA" },
 	{ HDA_CODEC_SIIXXXX, 0,		"Silicon Image" },
 	{ HDA_CODEC_STACXXXX, 0,	"Sigmatel" },
 	{ HDA_CODEC_VTXXXX, 0,		"VIA" },
 };
 
 static int
 hdacc_suspend(device_t dev)
 {
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Suspend...\n");
 	);
 	bus_generic_suspend(dev);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Suspend done\n");
 	);
 	return (0);
 }
 
 static int
 hdacc_resume(device_t dev)
 {
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Resume...\n");
 	);
 	bus_generic_resume(dev);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Resume done\n");
 	);
 	return (0);
 }
 
 static int
 hdacc_probe(device_t dev)
 {
 	uint32_t id, revid;
 	char buf[128];
 	int i;
 
 	id = ((uint32_t)hda_get_vendor_id(dev) << 16) + hda_get_device_id(dev);
 	revid = ((uint32_t)hda_get_revision_id(dev) << 8) + hda_get_stepping_id(dev);
 
 	for (i = 0; i < nitems(hdacc_codecs); i++) {
 		if (!HDA_DEV_MATCH(hdacc_codecs[i].id, id))
 			continue;
 		if (hdacc_codecs[i].revid != 0 &&
 		    hdacc_codecs[i].revid != revid)
 			continue;
 		break;
 	}
 	if (i < nitems(hdacc_codecs)) {
 		if ((hdacc_codecs[i].id & 0xffff) != 0xffff)
 			strlcpy(buf, hdacc_codecs[i].name, sizeof(buf));
 		else
 			snprintf(buf, sizeof(buf), "%s (0x%04x)",
 			    hdacc_codecs[i].name, hda_get_device_id(dev));
 	} else
 		snprintf(buf, sizeof(buf), "Generic (0x%04x)", id);
 	strlcat(buf, " HDA CODEC", sizeof(buf));
 	device_set_desc_copy(dev, buf);
 	return (BUS_PROBE_DEFAULT);
 }
 
 static int
 hdacc_attach(device_t dev)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	device_t child;
 	int cad = (intptr_t)device_get_ivars(dev);
 	uint32_t subnode;
 	int startnode;
 	int endnode;
 	int i, n;
 
 	codec->lock = HDAC_GET_MTX(device_get_parent(dev), dev);
 	codec->dev = dev;
 	codec->cad = cad;
 
 	hdacc_lock(codec);
 	subnode = hda_command(dev,
 	    HDA_CMD_GET_PARAMETER(0, 0x0, HDA_PARAM_SUB_NODE_COUNT));
 	hdacc_unlock(codec);
 	if (subnode == HDA_INVALID)
 		return (EIO);
 	codec->fgcnt = HDA_PARAM_SUB_NODE_COUNT_TOTAL(subnode);
 	startnode = HDA_PARAM_SUB_NODE_COUNT_START(subnode);
 	endnode = startnode + codec->fgcnt;
 
 	HDA_BOOTHVERBOSE(
 		device_printf(dev,
 		    "Root Node at nid=0: %d subnodes %d-%d\n",
 		    HDA_PARAM_SUB_NODE_COUNT_TOTAL(subnode),
 		    startnode, endnode - 1);
 	);
 
 	codec->fgs = malloc(sizeof(struct hdacc_fg) * codec->fgcnt,
 	    M_HDACC, M_ZERO | M_WAITOK);
 	for (i = startnode, n = 0; i < endnode; i++, n++) {
 		codec->fgs[n].nid = i;
 		hdacc_lock(codec);
 		codec->fgs[n].type =
 		    HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE(hda_command(dev,
 		    HDA_CMD_GET_PARAMETER(0, i, HDA_PARAM_FCT_GRP_TYPE)));
 		codec->fgs[n].subsystem_id = hda_command(dev,
 		    HDA_CMD_GET_SUBSYSTEM_ID(0, i));
 		hdacc_unlock(codec);
 		codec->fgs[n].dev = child = device_add_child(dev, NULL, -1);
 		if (child == NULL) {
 			device_printf(dev, "Failed to add function device\n");
 			continue;
 		}
 		device_set_ivars(child, &codec->fgs[n]);
 	}
 
 	bus_generic_attach(dev);
 
 	return (0);
 }
 
 static int
 hdacc_detach(device_t dev)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	int error;
 
 	error = device_delete_children(dev);
 	free(codec->fgs, M_HDACC);
 	return (error);
 }
 
 static int
 hdacc_child_location_str(device_t dev, device_t child, char *buf,
     size_t buflen)
 {
 	struct hdacc_fg *fg = device_get_ivars(child);
 
 	snprintf(buf, buflen, "nid=%d", fg->nid);
 	return (0);
 }
 
 static int
 hdacc_child_pnpinfo_str_method(device_t dev, device_t child, char *buf,
     size_t buflen)
 {
 	struct hdacc_fg *fg = device_get_ivars(child);
 
 	snprintf(buf, buflen, "type=0x%02x subsystem=0x%08x",
 	    fg->type, fg->subsystem_id);
 	return (0);
 }
 
 static int
 hdacc_print_child(device_t dev, device_t child)
 {
 	struct hdacc_fg *fg = device_get_ivars(child);
 	int retval;
 
 	retval = bus_print_child_header(dev, child);
 	retval += printf(" at nid %d", fg->nid);
 	retval += bus_print_child_footer(dev, child);
 
 	return (retval);
 }
 
 static void
 hdacc_probe_nomatch(device_t dev, device_t child)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	struct hdacc_fg *fg = device_get_ivars(child);
 
 	device_printf(child, "<%s %s Function Group> at nid %d on %s "
 	    "(no driver attached)\n",
 	    device_get_desc(dev),
 	    fg->type == HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE_AUDIO ? "Audio" :
 	    (fg->type == HDA_PARAM_FCT_GRP_TYPE_NODE_TYPE_MODEM ? "Modem" :
 	    "Unknown"), fg->nid, device_get_nameunit(dev));
 	HDA_BOOTVERBOSE(
 		device_printf(dev, "Subsystem ID: 0x%08x\n",
 		    hda_get_subsystem_id(dev));
 	);
 	HDA_BOOTHVERBOSE(
 		device_printf(dev, "Power down FG nid=%d to the D3 state...\n",
 		    fg->nid);
 	);
 	hdacc_lock(codec);
 	hda_command(dev, HDA_CMD_SET_POWER_STATE(0,
 	    fg->nid, HDA_CMD_POWER_STATE_D3));
 	hdacc_unlock(codec);
 }
 
 static int
 hdacc_read_ivar(device_t dev, device_t child, int which, uintptr_t *result)
 {
 	struct hdacc_fg *fg = device_get_ivars(child);
 
 	switch (which) {
 	case HDA_IVAR_NODE_ID:
 		*result = fg->nid;
 		break;
 	case HDA_IVAR_NODE_TYPE:
 		*result = fg->type;
 		break;
 	case HDA_IVAR_SUBSYSTEM_ID:
 		*result = fg->subsystem_id;
 		break;
 	default:
 		return(BUS_READ_IVAR(device_get_parent(dev), dev,
 		    which, result));
 	}
 	return (0);
 }
 
 static struct mtx *
 hdacc_get_mtx(device_t dev, device_t child)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 
 	return (codec->lock);
 }
 
 static uint32_t
 hdacc_codec_command(device_t dev, device_t child, uint32_t verb)
 {
 
 	return (HDAC_CODEC_COMMAND(device_get_parent(dev), dev, verb));
 }
 
 static int
 hdacc_stream_alloc(device_t dev, device_t child, int dir, int format,
     int stripe, uint32_t **dmapos)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	int stream;
 
 	stream = HDAC_STREAM_ALLOC(device_get_parent(dev), dev,
 	    dir, format, stripe, dmapos);
 	if (stream > 0)
 		codec->streams[dir][stream] = child;
 	return (stream);
 }
 
 static void
 hdacc_stream_free(device_t dev, device_t child, int dir, int stream)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 
 	codec->streams[dir][stream] = NULL;
 	HDAC_STREAM_FREE(device_get_parent(dev), dev, dir, stream);
 }
 
 static int
 hdacc_stream_start(device_t dev, device_t child,
     int dir, int stream, bus_addr_t buf, int blksz, int blkcnt)
 {
 
 	return (HDAC_STREAM_START(device_get_parent(dev), dev,
 	    dir, stream, buf, blksz, blkcnt));
 }
 
 static void
 hdacc_stream_stop(device_t dev, device_t child, int dir, int stream)
 {
 
 	HDAC_STREAM_STOP(device_get_parent(dev), dev, dir, stream);
 }
 
 static void
 hdacc_stream_reset(device_t dev, device_t child, int dir, int stream)
 {
 
 	HDAC_STREAM_RESET(device_get_parent(dev), dev, dir, stream);
 }
 
 static uint32_t
 hdacc_stream_getptr(device_t dev, device_t child, int dir, int stream)
 {
 
 	return (HDAC_STREAM_GETPTR(device_get_parent(dev), dev, dir, stream));
 }
 
 static void
 hdacc_stream_intr(device_t dev, int dir, int stream)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	device_t child;
 
 	if ((child = codec->streams[dir][stream]) != NULL)
 		HDAC_STREAM_INTR(child, dir, stream);
 }
 
 static int
 hdacc_unsol_alloc(device_t dev, device_t child, int wanted)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	int tag;
 
 	wanted &= 0x3f;
 	tag = wanted;
 	do {
 		if (codec->tags[tag] == NULL) {
 			codec->tags[tag] = child;
 			HDAC_UNSOL_ALLOC(device_get_parent(dev), dev, tag);
 			return (tag);
 		}
 		tag++;
 		tag &= 0x3f;
 	} while (tag != wanted);
 	return (-1);
 }
 
 static void
 hdacc_unsol_free(device_t dev, device_t child, int tag)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 
 	KASSERT(tag >= 0 && tag <= 0x3f, ("Wrong tag value %d\n", tag));
 	codec->tags[tag] = NULL;
 	HDAC_UNSOL_FREE(device_get_parent(dev), dev, tag);
 }
 
 static void
 hdacc_unsol_intr(device_t dev, uint32_t resp)
 {
 	struct hdacc_softc *codec = device_get_softc(dev);
 	device_t child;
 	int tag;
 
 	tag = resp >> 26;
 	if ((child = codec->tags[tag]) != NULL)
 		HDAC_UNSOL_INTR(child, resp);
 	else
 		device_printf(codec->dev, "Unexpected unsolicited "
 		    "response with tag %d: %08x\n", tag, resp);
 }
 
 static void
 hdacc_pindump(device_t dev)
 {
 	device_t *devlist;
 	int devcount, i;
 
 	if (device_get_children(dev, &devlist, &devcount) != 0)
 		return;
 	for (i = 0; i < devcount; i++)
 		HDAC_PINDUMP(devlist[i]);
 	free(devlist, M_TEMP);
 }
 
 static device_method_t hdacc_methods[] = {
 	/* device interface */
 	DEVMETHOD(device_probe,		hdacc_probe),
 	DEVMETHOD(device_attach,	hdacc_attach),
 	DEVMETHOD(device_detach,	hdacc_detach),
 	DEVMETHOD(device_suspend,	hdacc_suspend),
 	DEVMETHOD(device_resume,	hdacc_resume),
 	/* Bus interface */
 	DEVMETHOD(bus_child_location_str, hdacc_child_location_str),
 	DEVMETHOD(bus_child_pnpinfo_str, hdacc_child_pnpinfo_str_method),
 	DEVMETHOD(bus_print_child,	hdacc_print_child),
 	DEVMETHOD(bus_probe_nomatch,	hdacc_probe_nomatch),
 	DEVMETHOD(bus_read_ivar,	hdacc_read_ivar),
 	DEVMETHOD(hdac_get_mtx,		hdacc_get_mtx),
 	DEVMETHOD(hdac_codec_command,	hdacc_codec_command),
 	DEVMETHOD(hdac_stream_alloc,	hdacc_stream_alloc),
 	DEVMETHOD(hdac_stream_free,	hdacc_stream_free),
 	DEVMETHOD(hdac_stream_start,	hdacc_stream_start),
 	DEVMETHOD(hdac_stream_stop,	hdacc_stream_stop),
 	DEVMETHOD(hdac_stream_reset,	hdacc_stream_reset),
 	DEVMETHOD(hdac_stream_getptr,	hdacc_stream_getptr),
 	DEVMETHOD(hdac_stream_intr,	hdacc_stream_intr),
 	DEVMETHOD(hdac_unsol_alloc,	hdacc_unsol_alloc),
 	DEVMETHOD(hdac_unsol_free,	hdacc_unsol_free),
 	DEVMETHOD(hdac_unsol_intr,	hdacc_unsol_intr),
 	DEVMETHOD(hdac_pindump,		hdacc_pindump),
 	DEVMETHOD_END
 };
 
 static driver_t hdacc_driver = {
 	"hdacc",
 	hdacc_methods,
 	sizeof(struct hdacc_softc),
 };
 
 static devclass_t hdacc_devclass;
 
 DRIVER_MODULE(snd_hda, hdac, hdacc_driver, hdacc_devclass, NULL, NULL);
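Note on the hdacc.c change above: the two added table rows (ALC292 and Broadwell) rely on the first-match lookup in hdacc_probe(). A revid of 0 acts as "any revision", so revision-specific rows such as "ALC662 rev2" must come before the generic row for the same ID, and the vendor-only "Unknown CODECs" rows must come last. The following is a minimal standalone sketch of that lookup, not the driver code. HDA_DEV_MATCH is not shown in this diff, so id_matches() below is an assumed simplification (exact match, or a device field of 0xffff meaning "any device from this vendor"). The struct, function names, and numeric IDs are hypothetical and for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one row of the codec table. */
struct codec_entry {
	uint32_t id;		/* (vendor << 16) | device */
	uint16_t revid;		/* (revision << 8) | stepping; 0 = any */
	const char *name;
};

/*
 * Assumed matching rule (a simplification, not the real HDA_DEV_MATCH):
 * exact ID match, or a pattern whose device field is 0xffff and whose
 * vendor field equals the vendor field of the probed ID.
 */
static int
id_matches(uint32_t pattern, uint32_t id)
{

	if (pattern == id)
		return (1);
	return ((pattern & 0x0000ffff) == 0x0000ffff &&
	    (pattern & 0xffff0000) == (id & 0xffff0000));
}

/* Return the index of the first matching row, or -1 if none matches. */
static int
codec_lookup(const struct codec_entry *tbl, size_t n, uint32_t id,
    uint16_t revid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!id_matches(tbl[i].id, id))
			continue;
		/* A nonzero revid in the table must match exactly. */
		if (tbl[i].revid != 0 && tbl[i].revid != revid)
			continue;
		return ((int)i);
	}
	return (-1);
}

/* Illustrative rows only; ordered specific -> generic -> vendor wildcard. */
static const struct codec_entry demo_codecs[] = {
	{ 0x10ec0662, 0x0002,	"Realtek ALC662 rev2" },
	{ 0x10ec0662, 0,	"Realtek ALC662" },
	{ 0x10ecffff, 0,	"Realtek" },
};
```

With this ordering, a device that has no row of its own still gets a vendor-level description from the wildcard row, which is what happened to the ALC292 before its row was added.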
Index: user/ngie/more-tests/sys/dev/vt/vt_font.c
===================================================================
--- user/ngie/more-tests/sys/dev/vt/vt_font.c	(revision 281584)
+++ user/ngie/more-tests/sys/dev/vt/vt_font.c	(revision 281585)
@@ -1,224 +1,224 @@
 /*-
  * Copyright (c) 2009 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Ed Schouten under sponsorship from the
  * FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/refcount.h>
 #include <sys/systm.h>
 
 #include <dev/vt/vt.h>
 
 static MALLOC_DEFINE(M_VTFONT, "vtfont", "vt font");
 
 /* Some limits to prevent abnormal fonts from being loaded. */
-#define	VTFONT_MAXMAPPINGS	8192
-#define	VTFONT_MAXGLYPHSIZE	1048576
+#define	VTFONT_MAXMAPPINGS	65536
+#define	VTFONT_MAXGLYPHSIZE	2097152
 #define	VTFONT_MAXDIMENSION	128
 
 static uint16_t
 vtfont_bisearch(const struct vt_font_map *map, unsigned int len, uint32_t src)
 {
 	int min, mid, max;
 
 	min = 0;
 	max = len - 1;
 
 	/* Empty font map. */
 	if (len == 0)
 		return (0);
 	/* Character below minimal entry. */
 	if (src < map[0].vfm_src)
 		return (0);
 	/* Optimization: ASCII characters occur very often. */
 	if (src <= map[0].vfm_src + map[0].vfm_len)
 		return (src - map[0].vfm_src + map[0].vfm_dst);
 	/* Character above maximum entry. */
 	if (src > map[max].vfm_src + map[max].vfm_len)
 		return (0);
 
 	/* Binary search. */
 	while (max >= min) {
 		mid = (min + max) / 2;
 		if (src < map[mid].vfm_src)
 			max = mid - 1;
 		else if (src > map[mid].vfm_src + map[mid].vfm_len)
 			min = mid + 1;
 		else
 			return (src - map[mid].vfm_src + map[mid].vfm_dst);
 	}
 
 	return (0);
 }
 
 const uint8_t *
 vtfont_lookup(const struct vt_font *vf, term_char_t c)
 {
 	uint32_t src;
 	uint16_t dst;
 	size_t stride;
 	unsigned int normal_map;
 	unsigned int bold_map;
 
 	/*
 	 * No support for printing right hand sides for CJK fullwidth
 	 * characters. Simply print a space and assume that the left
 	 * hand side describes the entire character.
 	 */
 	src = TCHAR_CHARACTER(c);
 	if (TCHAR_FORMAT(c) & TF_CJK_RIGHT) {
 		normal_map = VFNT_MAP_NORMAL_RIGHT;
 		bold_map = VFNT_MAP_BOLD_RIGHT;
 	} else {
 		normal_map = VFNT_MAP_NORMAL;
 		bold_map = VFNT_MAP_BOLD;
 	}
 
 	if (TCHAR_FORMAT(c) & TF_BOLD) {
 		dst = vtfont_bisearch(vf->vf_map[bold_map],
 		    vf->vf_map_count[bold_map], src);
 		if (dst != 0)
 			goto found;
 	}
 	dst = vtfont_bisearch(vf->vf_map[normal_map],
 	    vf->vf_map_count[normal_map], src);
 
 found:
 	stride = howmany(vf->vf_width, 8) * vf->vf_height;
 	return (&vf->vf_bytes[dst * stride]);
 }
 
 struct vt_font *
 vtfont_ref(struct vt_font *vf)
 {
 
 	refcount_acquire(&vf->vf_refcount);
 	return (vf);
 }
 
 void
 vtfont_unref(struct vt_font *vf)
 {
 	unsigned int i;
 
 	if (refcount_release(&vf->vf_refcount)) {
 		for (i = 0; i < VFNT_MAPS; i++)
 			free(vf->vf_map[i], M_VTFONT);
 		free(vf->vf_bytes, M_VTFONT);
 		free(vf, M_VTFONT);
 	}
 }
 
 static int
 vtfont_validate_map(struct vt_font_map *vfm, unsigned int length,
     unsigned int glyph_count)
 {
 	unsigned int i, last = 0;
 
 	for (i = 0; i < length; i++) {
 		/* Not ordered. */
 		if (i > 0 && vfm[i].vfm_src <= last)
 			return (EINVAL);
 		/*
 		 * Destination extends amount of glyphs.
 		 */
 		if (vfm[i].vfm_dst >= glyph_count ||
 		    vfm[i].vfm_dst + vfm[i].vfm_len >= glyph_count)
 			return (EINVAL);
 		last = vfm[i].vfm_src + vfm[i].vfm_len;
 	}
 
 	return (0);
 }
 
 int
 vtfont_load(vfnt_t *f, struct vt_font **ret)
 {
 	size_t glyphsize, mapsize;
 	struct vt_font *vf;
 	int error;
 	unsigned int i;
 
 	/* Make sure the dimensions are valid. */
 	if (f->width < 1 || f->height < 1)
 		return (EINVAL);
 	if (f->width > VTFONT_MAXDIMENSION || f->height > VTFONT_MAXDIMENSION)
 		return (E2BIG);
 
 	/* Not too many mappings. */
 	for (i = 0; i < VFNT_MAPS; i++)
 		if (f->map_count[i] > VTFONT_MAXMAPPINGS)
 			return (E2BIG);
 
 	/* Character 0 must always be present. */
 	if (f->glyph_count < 1)
 		return (EINVAL);
 
 	glyphsize = howmany(f->width, 8) * f->height * f->glyph_count;
 	if (glyphsize > VTFONT_MAXGLYPHSIZE)
 		return (E2BIG);
 
 	/* Allocate new font structure. */
 	vf = malloc(sizeof *vf, M_VTFONT, M_WAITOK | M_ZERO);
 	vf->vf_bytes = malloc(glyphsize, M_VTFONT, M_WAITOK);
 	vf->vf_height = f->height;
 	vf->vf_width = f->width;
 	vf->vf_refcount = 1;
 
 	/* Allocate, copy in, and validate mappings. */
 	for (i = 0; i < VFNT_MAPS; i++) {
 		vf->vf_map_count[i] = f->map_count[i];
 		if (f->map_count[i] == 0)
 			continue;
 		mapsize = f->map_count[i] * sizeof(struct vt_font_map);
 		vf->vf_map[i] = malloc(mapsize, M_VTFONT, M_WAITOK);
 		error = copyin(f->map[i], vf->vf_map[i], mapsize);
 		if (error)
 			goto bad;
 		error = vtfont_validate_map(vf->vf_map[i], vf->vf_map_count[i],
 		    f->glyph_count);
 		if (error)
 			goto bad;
 	}
 
 	/* Copy in glyph data. */
 	error = copyin(f->glyphs, vf->vf_bytes, glyphsize);
 	if (error)
 		goto bad;
 
 	/* Success. */
 	*ret = vf;
 	return (0);
 
 bad:	vtfont_unref(vf);
 	return (error);
 }
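Note on the vt_font.c change above: vtfont_load() rejects a font with E2BIG when howmany(width, 8) * height * glyph_count exceeds VTFONT_MAXGLYPHSIZE, so raising the limit from 1048576 to 2097152 doubles the number of glyphs a font of a given size can carry. The sketch below reproduces only that arithmetic in standalone form. The helper names glyph_bytes() and glyph_fits() are hypothetical, and HOWMANY is written out here on the assumption that it matches the usual howmany() round-up division from sys/param.h.

```c
#include <stddef.h>

/* Round-up division, assumed equivalent to howmany() in sys/param.h. */
#define	HOWMANY(x, y)		(((x) + ((y) - 1)) / (y))

/* New limit taken from the diff above. */
#define	DEMO_MAXGLYPHSIZE	2097152

/* Bytes needed for glyph_count bitmaps of width x height pixels. */
static size_t
glyph_bytes(unsigned int width, unsigned int height, unsigned int glyph_count)
{

	/* Each glyph row is padded to a whole number of bytes. */
	return ((size_t)HOWMANY(width, 8) * height * glyph_count);
}

/* Nonzero if the glyph data would pass the size check (size <= limit). */
static int
glyph_fits(unsigned int width, unsigned int height, unsigned int glyph_count)
{

	return (glyph_bytes(width, height, glyph_count) <= DEMO_MAXGLYPHSIZE);
}
```

For example, a 16x32 font uses 2 * 32 = 64 bytes per glyph, so it fits up to 32768 glyphs under the new limit, where the old limit allowed 16384.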
Index: user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/ext2fs/ext2_vfsops.c	(revision 281585)
@@ -1,1114 +1,1115 @@
 /*-
  *  modified for EXT2FS support in Lites 1.1
  *
  *  Aug 1995, Godmar Back (gback@cs.utah.edu)
  *  University of Utah, Department of Computer Science
  */
 /*-
  * Copyright (c) 1989, 1991, 1993, 1994
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)ffs_vfsops.c	8.8 (Berkeley) 4/18/94
  * $FreeBSD$
  */
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/namei.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/kernel.h>
 #include <sys/vnode.h>
 #include <sys/mount.h>
 #include <sys/bio.h>
 #include <sys/buf.h>
 #include <sys/conf.h>
 #include <sys/endian.h>
 #include <sys/fcntl.h>
 #include <sys/malloc.h>
 #include <sys/stat.h>
 #include <sys/mutex.h>
 
 #include <geom/geom.h>
 #include <geom/geom_vfs.h>
 
 #include <fs/ext2fs/ext2_mount.h>
 #include <fs/ext2fs/inode.h>
 
 #include <fs/ext2fs/fs.h>
 #include <fs/ext2fs/ext2fs.h>
 #include <fs/ext2fs/ext2_dinode.h>
 #include <fs/ext2fs/ext2_extern.h>
 
 static int	ext2_flushfiles(struct mount *mp, int flags, struct thread *td);
 static int	ext2_mountfs(struct vnode *, struct mount *);
 static int	ext2_reload(struct mount *mp, struct thread *td);
 static int	ext2_sbupdate(struct ext2mount *, int);
 static int	ext2_cgupdate(struct ext2mount *, int);
 static vfs_unmount_t		ext2_unmount;
 static vfs_root_t		ext2_root;
 static vfs_statfs_t		ext2_statfs;
 static vfs_sync_t		ext2_sync;
 static vfs_vget_t		ext2_vget;
 static vfs_fhtovp_t		ext2_fhtovp;
 static vfs_mount_t		ext2_mount;
 
 MALLOC_DEFINE(M_EXT2NODE, "ext2_node", "EXT2 vnode private part");
 static MALLOC_DEFINE(M_EXT2MNT, "ext2_mount", "EXT2 mount structure");
 
 static struct vfsops ext2fs_vfsops = {
 	.vfs_fhtovp =		ext2_fhtovp,
 	.vfs_mount =		ext2_mount,
 	.vfs_root =		ext2_root,	/* root inode via vget */
 	.vfs_statfs =		ext2_statfs,
 	.vfs_sync =		ext2_sync,
 	.vfs_unmount =		ext2_unmount,
 	.vfs_vget =		ext2_vget,
 };
 
 VFS_SET(ext2fs_vfsops, ext2fs, 0);
 
 static int	ext2_check_sb_compat(struct ext2fs *es, struct cdev *dev,
 		    int ronly);
 static int	compute_sb_data(struct vnode * devvp,
 		    struct ext2fs * es, struct m_ext2fs * fs);
 
 static const char *ext2_opts[] = { "acls", "async", "noatime", "noclusterr", 
     "noclusterw", "noexec", "export", "force", "from", "multilabel",
     "suiddir", "nosymfollow", "sync", "union", NULL };
 
 /*
  * VFS Operations.
  *
  * mount system call
  */
 static int
 ext2_mount(struct mount *mp)
 {
 	struct vfsoptlist *opts;
 	struct vnode *devvp;
 	struct thread *td;
 	struct ext2mount *ump = NULL;
 	struct m_ext2fs *fs;
 	struct nameidata nd, *ndp = &nd;
 	accmode_t accmode;
 	char *path, *fspec;
 	int error, flags, len;
 
 	td = curthread;
 	opts = mp->mnt_optnew;
 
 	if (vfs_filteropt(opts, ext2_opts))
 		return (EINVAL);
 
 	vfs_getopt(opts, "fspath", (void **)&path, NULL);
 	/* Double-check the length of path.. */
 	if (strlen(path) >= MAXMNTLEN)
 		return (ENAMETOOLONG);
 
 	fspec = NULL;
 	error = vfs_getopt(opts, "from", (void **)&fspec, &len);
 	if (!error && fspec[len - 1] != '\0')
 		return (EINVAL);
 
 	/*
 	 * If updating, check whether changing from read-only to
 	 * read/write; if there is no device name, that's all we do.
 	 */
 	if (mp->mnt_flag & MNT_UPDATE) {
 		ump = VFSTOEXT2(mp);
 		fs = ump->um_e2fs; 
 		error = 0;
 		if (fs->e2fs_ronly == 0 &&
 		    vfs_flagopt(opts, "ro", NULL, 0)) {
 			error = VFS_SYNC(mp, MNT_WAIT);
 			if (error)
 				return (error);
 			flags = WRITECLOSE;
 			if (mp->mnt_flag & MNT_FORCE)
 				flags |= FORCECLOSE;
 			error = ext2_flushfiles(mp, flags, td);
 			if ( error == 0 && fs->e2fs_wasvalid && ext2_cgupdate(ump, MNT_WAIT) == 0) {
 				fs->e2fs->e2fs_state |= E2FS_ISCLEAN;
 				ext2_sbupdate(ump, MNT_WAIT);
 			}
 			fs->e2fs_ronly = 1;
 			vfs_flagopt(opts, "ro", &mp->mnt_flag, MNT_RDONLY);
 			DROP_GIANT();
 			g_topology_lock();
 			g_access(ump->um_cp, 0, -1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 		}
 		if (!error && (mp->mnt_flag & MNT_RELOAD))
 			error = ext2_reload(mp, td);
 		if (error)
 			return (error);
 		devvp = ump->um_devvp;
 		if (fs->e2fs_ronly && !vfs_flagopt(opts, "ro", NULL, 0)) {
 			if (ext2_check_sb_compat(fs->e2fs, devvp->v_rdev, 0))
 				return (EPERM);
 
 			/*
 			 * If upgrade to read-write by non-root, then verify
 			 * that user has necessary permissions on the device.
 			 */
 			vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 			error = VOP_ACCESS(devvp, VREAD | VWRITE,
 			    td->td_ucred, td);
 			if (error)
 				error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 			if (error) {
 				VOP_UNLOCK(devvp, 0);
 				return (error);
 			}
 			VOP_UNLOCK(devvp, 0);
 			DROP_GIANT();
 			g_topology_lock();
 			error = g_access(ump->um_cp, 0, 1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error)
 				return (error);
 
 			if ((fs->e2fs->e2fs_state & E2FS_ISCLEAN) == 0 ||
 			    (fs->e2fs->e2fs_state & E2FS_ERRORS)) {
 				if (mp->mnt_flag & MNT_FORCE) {
 					printf(
 "WARNING: %s was not properly dismounted\n", fs->e2fs_fsmnt);
 				} else {
 					printf(
 "WARNING: R/W mount of %s denied.  Filesystem is not clean - run fsck\n",
 					    fs->e2fs_fsmnt);
 					return (EPERM);
 				}
 			}
 			fs->e2fs->e2fs_state &= ~E2FS_ISCLEAN;
 			(void)ext2_cgupdate(ump, MNT_WAIT);
 			fs->e2fs_ronly = 0;
 			MNT_ILOCK(mp);
 			mp->mnt_flag &= ~MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 		}
 		if (vfs_flagopt(opts, "export", NULL, 0)) {
 			/* Process export requests in vfs_mount.c. */
 			return (error);
 		}
 	}
 
 	/*
 	 * Not an update, or updating the name: look up the name
 	 * and verify that it refers to a sensible disk device.
 	 */
 	if (fspec == NULL)
 		return (EINVAL);
 	NDINIT(ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspec, td);
 	if ((error = namei(ndp)) != 0)
 		return (error);
 	NDFREE(ndp, NDF_ONLY_PNBUF);
 	devvp = ndp->ni_vp;
 
 	if (!vn_isdisk(devvp, &error)) {
 		vput(devvp);
 		return (error);
 	}
 
 	/*
 	 * If mount by non-root, then verify that user has necessary
 	 * permissions on the device.
 	 *
 	 * XXXRW: VOP_ACCESS() enough?
 	 */
 	accmode = VREAD;
 	if ((mp->mnt_flag & MNT_RDONLY) == 0)
 		accmode |= VWRITE;
 	error = VOP_ACCESS(devvp, accmode, td->td_ucred, td);
 	if (error)
 		error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 	if (error) {
 		vput(devvp);
 		return (error);
 	}
 
 	if ((mp->mnt_flag & MNT_UPDATE) == 0) {
 		error = ext2_mountfs(devvp, mp);
 	} else {
 		if (devvp != ump->um_devvp) {
 			vput(devvp);
 			return (EINVAL);	/* needs translation */
 		} else
 			vput(devvp);
 	}
 	if (error) {
 		vrele(devvp);
 		return (error);
 	}
 	ump = VFSTOEXT2(mp);
 	fs = ump->um_e2fs;
 
 	/*
 	 * Note that this strncpy() is ok because of a check at the start
 	 * of ext2_mount().
 	 */
 	strncpy(fs->e2fs_fsmnt, path, MAXMNTLEN);
 	fs->e2fs_fsmnt[MAXMNTLEN - 1] = '\0';
 	vfs_mountedfrom(mp, fspec);
 	return (0);
 }
 
 static int
 ext2_check_sb_compat(struct ext2fs *es, struct cdev *dev, int ronly)
 {
 
 	if (es->e2fs_magic != E2FS_MAGIC) {
 		printf("ext2fs: %s: wrong magic number %#x (expected %#x)\n",
 		    devtoname(dev), es->e2fs_magic, E2FS_MAGIC);
 		return (1);
 	}
 	if (es->e2fs_rev > E2FS_REV0) {
 		if (es->e2fs_features_incompat & ~(EXT2F_INCOMPAT_SUPP |
 						   EXT4F_RO_INCOMPAT_SUPP)) {
 			printf(
 "WARNING: mount of %s denied due to unsupported optional features\n",
 			    devtoname(dev));
 			return (1);
 		}
 		if (!ronly &&
 		    (es->e2fs_features_rocompat & ~EXT2F_ROCOMPAT_SUPP)) {
 			printf("WARNING: R/W mount of %s denied due to "
 			    "unsupported optional features\n", devtoname(dev));
 			return (1);
 		}
 	}
 	return (0);
 }
 
 /*
  * This computes the fields of the  ext2_sb_info structure from the
  * data in the ext2_super_block structure read in.
  */
 static int
 compute_sb_data(struct vnode *devvp, struct ext2fs *es,
     struct m_ext2fs *fs)
 {
 	int db_count, error;
 	int i;
 	int logic_sb_block = 1;	/* XXX for now */
 	struct buf *bp;
 	uint32_t e2fs_descpb;
 
 	fs->e2fs_bshift = EXT2_MIN_BLOCK_LOG_SIZE + es->e2fs_log_bsize;
 	fs->e2fs_bsize = 1U << fs->e2fs_bshift;
 	fs->e2fs_fsbtodb = es->e2fs_log_bsize + 1;
 	fs->e2fs_qbmask = fs->e2fs_bsize - 1;
 	fs->e2fs_fsize = EXT2_MIN_FRAG_SIZE << es->e2fs_log_fsize;
 	if (fs->e2fs_fsize)
 		fs->e2fs_fpb = fs->e2fs_bsize / fs->e2fs_fsize;
 	fs->e2fs_bpg = es->e2fs_bpg;
 	fs->e2fs_fpg = es->e2fs_fpg;
 	fs->e2fs_ipg = es->e2fs_ipg;
 	if (es->e2fs_rev == E2FS_REV0) {
 		fs->e2fs_isize = E2FS_REV0_INODE_SIZE ;
 	} else {
 		fs->e2fs_isize = es->e2fs_inode_size;
 
 		/*
 		 * Simple sanity check for superblock inode size value.
 		 */
 		if (EXT2_INODE_SIZE(fs) < E2FS_REV0_INODE_SIZE ||
 		    EXT2_INODE_SIZE(fs) > fs->e2fs_bsize ||
 		    (fs->e2fs_isize & (fs->e2fs_isize - 1)) != 0) {
 			printf("ext2fs: invalid inode size %d\n",
 			    fs->e2fs_isize);
 			return (EIO);
 		}
 	}
 	/* Check for extra isize in big inodes. */
 	if (EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_EXTRA_ISIZE) &&
 	    EXT2_INODE_SIZE(fs) < sizeof(struct ext2fs_dinode)) {
 		printf("ext2fs: no space for extra inode timestamps\n");
 		return (EINVAL);
 	}
 
 	fs->e2fs_ipb = fs->e2fs_bsize / EXT2_INODE_SIZE(fs);
 	fs->e2fs_itpg = fs->e2fs_ipg / fs->e2fs_ipb;
 	/* s_resuid / s_resgid ? */
 	fs->e2fs_gcount = (es->e2fs_bcount - es->e2fs_first_dblock +
 	    EXT2_BLOCKS_PER_GROUP(fs) - 1) / EXT2_BLOCKS_PER_GROUP(fs);
 	e2fs_descpb = fs->e2fs_bsize / sizeof(struct ext2_gd);
 	db_count = (fs->e2fs_gcount + e2fs_descpb - 1) / e2fs_descpb;
 	fs->e2fs_gdbcount = db_count;
 	fs->e2fs_gd = malloc(db_count * fs->e2fs_bsize,
 	    M_EXT2MNT, M_WAITOK);
 	fs->e2fs_contigdirs = malloc(fs->e2fs_gcount *
 	    sizeof(*fs->e2fs_contigdirs), M_EXT2MNT, M_WAITOK | M_ZERO);
 
 	/*
 	 * Adjust logic_sb_block.
 	 * Godmar thinks: if the blocksize is greater than 1024, then
 	 * the superblock is logically part of block zero.
 	 */
 	if(fs->e2fs_bsize > SBSIZE)
 		logic_sb_block = 0;
 	for (i = 0; i < db_count; i++) {
 		error = bread(devvp ,
 			 fsbtodb(fs, logic_sb_block + i + 1 ),
 			fs->e2fs_bsize, NOCRED, &bp);
 		if (error) {
 			free(fs->e2fs_contigdirs, M_EXT2MNT);
 			free(fs->e2fs_gd, M_EXT2MNT);
 			brelse(bp);
 			return (error);
 		}
 		e2fs_cgload((struct ext2_gd *)bp->b_data,
 		    &fs->e2fs_gd[
 			i * fs->e2fs_bsize / sizeof(struct ext2_gd)],
 		    fs->e2fs_bsize);
 		brelse(bp);
 		bp = NULL;
 	}
 	/* Initialization for the ext2 Orlov allocator variant. */
 	fs->e2fs_total_dir = 0;
 	for (i = 0; i < fs->e2fs_gcount; i++)
 		fs->e2fs_total_dir += fs->e2fs_gd[i].ext2bgd_ndirs;
 
 	if (es->e2fs_rev == E2FS_REV0 ||
 	    !EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_LARGEFILE))
 		fs->e2fs_maxfilesize = 0x7fffffff;
 	else {
 		fs->e2fs_maxfilesize = 0xffffffffffff;
 		if (EXT2_HAS_RO_COMPAT_FEATURE(fs, EXT2F_ROCOMPAT_HUGE_FILE))
 			fs->e2fs_maxfilesize = 0x7fffffffffffffff;
 	}
 	if (es->e4fs_flags & E2FS_UNSIGNED_HASH) {
 		fs->e2fs_uhash = 3;
 	} else if ((es->e4fs_flags & E2FS_SIGNED_HASH) == 0) {
 #ifdef __CHAR_UNSIGNED__
 		es->e4fs_flags |= E2FS_UNSIGNED_HASH;
 		fs->e2fs_uhash = 3;
 #else
 		es->e4fs_flags |= E2FS_SIGNED_HASH;
 #endif
 	}
 
 	return (0);
 }
 
 /*
  * Reload all incore data for a filesystem (used after running fsck on
  * the root filesystem and finding things to fix). The filesystem must
  * be mounted read-only.
  *
  * Things to do to update the mount:
  *	1) invalidate all cached meta-data.
  *	2) re-read superblock from disk.
  *	3) invalidate all cluster summary information.
  *	4) invalidate all inactive vnodes.
  *	5) invalidate all cached file data.
  *	6) re-read inode data for all active vnodes.
  * XXX we are missing some steps, in particular # 3, this has to be reviewed.
  */
 static int
 ext2_reload(struct mount *mp, struct thread *td)
 {
 	struct vnode *vp, *mvp, *devvp;
 	struct inode *ip;
 	struct buf *bp;
 	struct ext2fs *es;
 	struct m_ext2fs *fs;
 	struct csum *sump;
 	int error, i;
 	int32_t *lp;
 
 	if ((mp->mnt_flag & MNT_RDONLY) == 0)
 		return (EINVAL);
 	/*
 	 * Step 1: invalidate all cached meta-data.
 	 */
 	devvp = VFSTOEXT2(mp)->um_devvp;
 	vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 	if (vinvalbuf(devvp, 0, 0, 0) != 0)
 		panic("ext2_reload: dirty1");
 	VOP_UNLOCK(devvp, 0);
 
 	/*
 	 * Step 2: re-read superblock from disk.
 	 * constants have been adjusted for ext2
 	 */
 	if ((error = bread(devvp, SBLOCK, SBSIZE, NOCRED, &bp)) != 0)
 		return (error);
 	es = (struct ext2fs *)bp->b_data;
 	if (ext2_check_sb_compat(es, devvp->v_rdev, 0) != 0) {
 		brelse(bp);
 		return (EIO);		/* XXX needs translation */
 	}
 	fs = VFSTOEXT2(mp)->um_e2fs;
 	bcopy(bp->b_data, fs->e2fs, sizeof(struct ext2fs));
 
 	if((error = compute_sb_data(devvp, es, fs)) != 0) {
 		brelse(bp);
 		return (error);
 	}
 #ifdef UNKLAR
 	if (fs->fs_sbsize < SBSIZE)
 		bp->b_flags |= B_INVAL;
 #endif
 	brelse(bp);
 
 	/*
 	 * Step 3: invalidate all cluster summary information.
 	 */
 	if (fs->e2fs_contigsumsize > 0) {
 		lp = fs->e2fs_maxcluster;
 		sump = fs->e2fs_clustersum;
 		for (i = 0; i < fs->e2fs_gcount; i++, sump++) {
 			*lp++ = fs->e2fs_contigsumsize;
 			sump->cs_init = 0;
 			bzero(sump->cs_sum, fs->e2fs_contigsumsize + 1);
 		}
 	}
 
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		/*
 		 * Step 4: invalidate all cached file data.
 		 */
 		if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) {
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			goto loop;
 		}
 		if (vinvalbuf(vp, 0, 0, 0))
 			panic("ext2_reload: dirty2");
 
 		/*
 		 * Step 5: re-read inode data for all active vnodes.
 		 */
 		ip = VTOI(vp);
 		error = bread(devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)),
 		    (int)fs->e2fs_bsize, NOCRED, &bp);
 		if (error) {
 			VOP_UNLOCK(vp, 0);
 			vrele(vp);
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			return (error);
 		}
 		ext2_ei2i((struct ext2fs_dinode *) ((char *)bp->b_data +
 		    EXT2_INODE_SIZE(fs) * ino_to_fsbo(fs, ip->i_number)), ip);
 		brelse(bp);
 		VOP_UNLOCK(vp, 0);
 		vrele(vp);
 	}
 	return (0);
 }
 
 /*
  * Common code for mount and mountroot.
  */
 static int
 ext2_mountfs(struct vnode *devvp, struct mount *mp)
 {
 	struct ext2mount *ump;
 	struct buf *bp;
 	struct m_ext2fs *fs;
 	struct ext2fs *es;
 	struct cdev *dev = devvp->v_rdev;
 	struct g_consumer *cp;
 	struct bufobj *bo;
 	struct csum *sump;
 	int error;
 	int ronly;
 	int i, size;
 	int32_t *lp;
 	int32_t e2fs_maxcontig;
 
 	ronly = vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0);
 	/* XXX: use VOP_ACESS to check FS perms */
 	DROP_GIANT();
 	g_topology_lock();
 	error = g_vfs_open(devvp, &cp, "ext2fs", ronly ? 0 : 1);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	VOP_UNLOCK(devvp, 0);
 	if (error)
 		return (error);
 
 	/* XXX: should we check for some sectorsize or 512 instead? */
 	if (((SBSIZE % cp->provider->sectorsize) != 0) ||
 	    (SBSIZE < cp->provider->sectorsize)) {
 		DROP_GIANT();
 		g_topology_lock();
 		g_vfs_close(cp);
 		g_topology_unlock();
 		PICKUP_GIANT();
 		return (EINVAL);
 	}
 
 	bo = &devvp->v_bufobj;
 	bo->bo_private = cp;
 	bo->bo_ops = g_vfs_bufops;
 	if (devvp->v_rdev->si_iosize_max != 0)
 		mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max;
 	if (mp->mnt_iosize_max > MAXPHYS)
 		mp->mnt_iosize_max = MAXPHYS;
 
 	bp = NULL;
 	ump = NULL;
 	if ((error = bread(devvp, SBLOCK, SBSIZE, NOCRED, &bp)) != 0)
 		goto out;
 	es = (struct ext2fs *)bp->b_data;
 	if (ext2_check_sb_compat(es, dev, ronly) != 0) {
 		error = EINVAL;		/* XXX needs translation */
 		goto out;
 	}
 	if ((es->e2fs_state & E2FS_ISCLEAN) == 0 ||
 	    (es->e2fs_state & E2FS_ERRORS)) {
 		if (ronly || (mp->mnt_flag & MNT_FORCE)) {
 			printf(
 "WARNING: Filesystem was not properly dismounted\n");
 		} else {
 			printf(
 "WARNING: R/W mount denied.  Filesystem is not clean - run fsck\n");
 			error = EPERM;
 			goto out;
 		}
 	}
 	ump = malloc(sizeof(*ump), M_EXT2MNT, M_WAITOK | M_ZERO);
 
 	/*
 	 * I don't know whether this is the right strategy. Note that
 	 * we dynamically allocate both an ext2_sb_info and an ext2_super_block
 	 * while Linux keeps the super block in a locked buffer.
 	 */
 	ump->um_e2fs = malloc(sizeof(struct m_ext2fs),
 		M_EXT2MNT, M_WAITOK);
 	ump->um_e2fs->e2fs = malloc(sizeof(struct ext2fs),
 		M_EXT2MNT, M_WAITOK);
 	mtx_init(EXT2_MTX(ump), "EXT2FS", "EXT2FS Lock", MTX_DEF);
 	bcopy(es, ump->um_e2fs->e2fs, (u_int)sizeof(struct ext2fs));
 	if ((error = compute_sb_data(devvp, ump->um_e2fs->e2fs, ump->um_e2fs)))
 		goto out;
 
 	/*
 	 * Calculate the maximum contiguous blocks and size of cluster summary
 	 * array.  In FFS this is done by newfs; however, the superblock 
 	 * in ext2fs doesn't have these variables, so we can calculate 
 	 * them here.
 	 */
 	e2fs_maxcontig = MAX(1, MAXPHYS / ump->um_e2fs->e2fs_bsize);
 	ump->um_e2fs->e2fs_contigsumsize = MIN(e2fs_maxcontig, EXT2_MAXCONTIG);
 	if (ump->um_e2fs->e2fs_contigsumsize > 0) {
 		size = ump->um_e2fs->e2fs_gcount * sizeof(int32_t);
 		ump->um_e2fs->e2fs_maxcluster = malloc(size, M_EXT2MNT, M_WAITOK);
 		size = ump->um_e2fs->e2fs_gcount * sizeof(struct csum);
 		ump->um_e2fs->e2fs_clustersum = malloc(size, M_EXT2MNT, M_WAITOK);
 		lp = ump->um_e2fs->e2fs_maxcluster;
 		sump = ump->um_e2fs->e2fs_clustersum;
 		for (i = 0; i < ump->um_e2fs->e2fs_gcount; i++, sump++) {
 			*lp++ = ump->um_e2fs->e2fs_contigsumsize;
 			sump->cs_init = 0;
 			sump->cs_sum = malloc((ump->um_e2fs->e2fs_contigsumsize + 1) *
 			    sizeof(int32_t), M_EXT2MNT, M_WAITOK | M_ZERO);
 		}
 	}
 
 	brelse(bp);
 	bp = NULL;
 	fs = ump->um_e2fs;
 	fs->e2fs_ronly = ronly;	/* ronly is set according to mnt_flags */
 
 	/*
 	 * If the fs is not mounted read-only, make sure the super block is
 	 * always written back on a sync().
 	 */
 	fs->e2fs_wasvalid = fs->e2fs->e2fs_state & E2FS_ISCLEAN ? 1 : 0;
 	if (ronly == 0) {
 		fs->e2fs_fmod = 1;		/* mark it modified */
 		fs->e2fs->e2fs_state &= ~E2FS_ISCLEAN;	/* set fs invalid */
 	}
 	mp->mnt_data = ump;
 	mp->mnt_stat.f_fsid.val[0] = dev2udev(dev);
 	mp->mnt_stat.f_fsid.val[1] = mp->mnt_vfc->vfc_typenum;
 	mp->mnt_maxsymlinklen = EXT2_MAXSYMLINKLEN;
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 	ump->um_mountp = mp;
 	ump->um_dev = dev;
 	ump->um_devvp = devvp;
 	ump->um_bo = &devvp->v_bufobj;
 	ump->um_cp = cp;
 
 	/*
 	 * Setting those two parameters allowed us to use
 	 * ufs_bmap w/o changse!
 	 */
 	ump->um_nindir = EXT2_ADDR_PER_BLOCK(fs);
 	ump->um_bptrtodb = fs->e2fs->e2fs_log_bsize + 1;
 	ump->um_seqinc = EXT2_FRAGS_PER_BLOCK(fs);
 	if (ronly == 0)
 		ext2_sbupdate(ump, MNT_WAIT);
 	/*
 	 * Initialize filesystem stat information in mount struct.
 	 */
 	MNT_ILOCK(mp);
-	mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED;
+	mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED |
+	    MNTK_USES_BCACHE;
 	MNT_IUNLOCK(mp);
 	return (0);
 out:
 	if (bp)
 		brelse(bp);
 	if (cp != NULL) {
 		DROP_GIANT();
 		g_topology_lock();
 		g_vfs_close(cp);
 		g_topology_unlock();
 		PICKUP_GIANT();
 	}
 	if (ump) {
 		mtx_destroy(EXT2_MTX(ump));
 		free(ump->um_e2fs->e2fs_gd, M_EXT2MNT);
 		free(ump->um_e2fs->e2fs_contigdirs, M_EXT2MNT);
 		free(ump->um_e2fs->e2fs, M_EXT2MNT);
 		free(ump->um_e2fs, M_EXT2MNT);
 		free(ump, M_EXT2MNT);
 		mp->mnt_data = NULL;
 	}
 	return (error);
 }
 
 /*
  * Unmount system call.
  */
 static int
 ext2_unmount(struct mount *mp, int mntflags)
 {
 	struct ext2mount *ump;
 	struct m_ext2fs *fs;
 	struct csum *sump;
 	int error, flags, i, ronly;
 
 	flags = 0;
 	if (mntflags & MNT_FORCE) {
 		if (mp->mnt_flag & MNT_ROOTFS)
 			return (EINVAL);
 		flags |= FORCECLOSE;
 	}
 	if ((error = ext2_flushfiles(mp, flags, curthread)) != 0)
 		return (error);
 	ump = VFSTOEXT2(mp);
 	fs = ump->um_e2fs;
 	ronly = fs->e2fs_ronly;
 	if (ronly == 0 && ext2_cgupdate(ump, MNT_WAIT) == 0) {
 		if (fs->e2fs_wasvalid)
 			fs->e2fs->e2fs_state |= E2FS_ISCLEAN;
 		ext2_sbupdate(ump, MNT_WAIT);
 	}
 
 	DROP_GIANT();
 	g_topology_lock();
 	g_vfs_close(ump->um_cp);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	vrele(ump->um_devvp);
 	sump = fs->e2fs_clustersum;
 	for (i = 0; i < fs->e2fs_gcount; i++, sump++)
 		free(sump->cs_sum, M_EXT2MNT);
 	free(fs->e2fs_clustersum, M_EXT2MNT);
 	free(fs->e2fs_maxcluster, M_EXT2MNT);
 	free(fs->e2fs_gd, M_EXT2MNT);
 	free(fs->e2fs_contigdirs, M_EXT2MNT);
 	free(fs->e2fs, M_EXT2MNT);
 	free(fs, M_EXT2MNT);
 	free(ump, M_EXT2MNT);
 	mp->mnt_data = NULL;
 	MNT_ILOCK(mp);
 	mp->mnt_flag &= ~MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 	return (error);
 }
 
 /*
  * Flush out all the files in a filesystem.
  */
 static int
 ext2_flushfiles(struct mount *mp, int flags, struct thread *td)
 {
 	int error;
 
 	error = vflush(mp, 0, flags, td);
 	return (error);
 }
 /*
  * Get filesystem statistics.
  */
 int
 ext2_statfs(struct mount *mp, struct statfs *sbp)
 {
 	struct ext2mount *ump;
 	struct m_ext2fs *fs;
 	uint32_t overhead, overhead_per_group, ngdb;
 	int i, ngroups;
 
 	ump = VFSTOEXT2(mp);
 	fs = ump->um_e2fs;
 	if (fs->e2fs->e2fs_magic != E2FS_MAGIC)
 		panic("ext2_statfs");
 
 	/*
 	 * Compute the overhead (FS structures)
 	 */
 	overhead_per_group =
 	    1 /* block bitmap */ +
 	    1 /* inode bitmap */ +
 	    fs->e2fs_itpg;
 	overhead = fs->e2fs->e2fs_first_dblock +
 	    fs->e2fs_gcount * overhead_per_group;
 	if (fs->e2fs->e2fs_rev > E2FS_REV0 &&
 	    fs->e2fs->e2fs_features_rocompat & EXT2F_ROCOMPAT_SPARSESUPER) {
 		for (i = 0, ngroups = 0; i < fs->e2fs_gcount; i++) {
 			if (cg_has_sb(i))
 				ngroups++;
 		}
 	} else {
 		ngroups = fs->e2fs_gcount;
 	}
 	ngdb = fs->e2fs_gdbcount;
 	if (fs->e2fs->e2fs_rev > E2FS_REV0 &&
 	    fs->e2fs->e2fs_features_compat & EXT2F_COMPAT_RESIZE)
 		ngdb += fs->e2fs->e2fs_reserved_ngdb;
 	overhead += ngroups * (1 /* superblock */ + ngdb);
 
 	sbp->f_bsize = EXT2_FRAG_SIZE(fs);
 	sbp->f_iosize = EXT2_BLOCK_SIZE(fs);
 	sbp->f_blocks = fs->e2fs->e2fs_bcount - overhead;
 	sbp->f_bfree = fs->e2fs->e2fs_fbcount;
 	sbp->f_bavail = sbp->f_bfree - fs->e2fs->e2fs_rbcount;
 	sbp->f_files = fs->e2fs->e2fs_icount;
 	sbp->f_ffree = fs->e2fs->e2fs_ficount;
 	return (0);
 }
 
 /*
  * Go through the disk queues to initiate sandbagged IO;
  * go through the inodes to write those that have been modified;
  * initiate the writing of the super block if it has been modified.
  *
  * Note: we are always called with the filesystem marked `MPBUSY'.
  */
 static int
 ext2_sync(struct mount *mp, int waitfor)
 {
 	struct vnode *mvp, *vp;
 	struct thread *td;
 	struct inode *ip;
 	struct ext2mount *ump = VFSTOEXT2(mp);
 	struct m_ext2fs *fs;
 	int error, allerror = 0;
 
 	td = curthread;
 	fs = ump->um_e2fs;
 	if (fs->e2fs_fmod != 0 && fs->e2fs_ronly != 0) {		/* XXX */
 		printf("fs = %s\n", fs->e2fs_fsmnt);
 		panic("ext2_sync: rofs mod");
 	}
 
 	/*
 	 * Write back each (modified) inode.
 	 */
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		if (vp->v_type == VNON) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		ip = VTOI(vp);
 		if ((ip->i_flag &
 		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
 		    (vp->v_bufobj.bo_dirty.bv_cnt == 0 ||
 		    waitfor == MNT_LAZY)) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, td);
 		if (error) {
 			if (error == ENOENT) {
 				MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 				goto loop;
 			}
 			continue;
 		}
 		if ((error = VOP_FSYNC(vp, waitfor, td)) != 0)
 			allerror = error;
 		VOP_UNLOCK(vp, 0);
 		vrele(vp);
 	}
 
 	/*
 	 * Force stale filesystem control information to be flushed.
 	 */
 	if (waitfor != MNT_LAZY) {
 		vn_lock(ump->um_devvp, LK_EXCLUSIVE | LK_RETRY);
 		if ((error = VOP_FSYNC(ump->um_devvp, waitfor, td)) != 0)
 			allerror = error;
 		VOP_UNLOCK(ump->um_devvp, 0);
 	}
 
 	/*
 	 * Write back modified superblock.
 	 */
 	if (fs->e2fs_fmod != 0) {
 		fs->e2fs_fmod = 0;
 		fs->e2fs->e2fs_wtime = time_second;
 		if ((error = ext2_cgupdate(ump, waitfor)) != 0)
 			allerror = error;
 	}
 	return (allerror);
 }
 
 /*
  * Look up an EXT2FS dinode number to find its incore vnode, otherwise read it
  * in from disk.  If it is in core, wait for the lock bit to clear, then
  * return the inode locked.  Detection and handling of mount points must be
  * done by the calling routine.
  */
 static int
 ext2_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp)
 {
 	struct m_ext2fs *fs;
 	struct inode *ip;
 	struct ext2mount *ump;
 	struct buf *bp;
 	struct vnode *vp;
 	struct thread *td;
 	int i, error;
 	int used_blocks;
 
 	td = curthread;
 	error = vfs_hash_get(mp, ino, flags, td, vpp, NULL, NULL);
 	if (error || *vpp != NULL)
 		return (error);
 
 	ump = VFSTOEXT2(mp);
 	ip = malloc(sizeof(struct inode), M_EXT2NODE, M_WAITOK | M_ZERO);
 
 	/* Allocate a new vnode/inode. */
 	if ((error = getnewvnode("ext2fs", mp, &ext2_vnodeops, &vp)) != 0) {
 		*vpp = NULL;
 		free(ip, M_EXT2NODE);
 		return (error);
 	}
 	vp->v_data = ip;
 	ip->i_vnode = vp;
 	ip->i_e2fs = fs = ump->um_e2fs;
 	ip->i_ump  = ump;
 	ip->i_number = ino;
 
 	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
 	error = insmntque(vp, mp);
 	if (error != 0) {
 		free(ip, M_EXT2NODE);
 		*vpp = NULL;
 		return (error);
 	}
 	error = vfs_hash_insert(vp, ino, flags, td, vpp, NULL, NULL);
 	if (error || *vpp != NULL)
 		return (error);
 
 	/* Read in the disk contents for the inode, copy into the inode. */
 	if ((error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)),
 	    (int)fs->e2fs_bsize, NOCRED, &bp)) != 0) {
 		/*
 		 * The inode does not contain anything useful, so it would
 		 * be misleading to leave it on its hash chain. With mode
 		 * still zero, it will be unlinked and returned to the free
 		 * list by vput().
 		 */
 		brelse(bp);
 		vput(vp);
 		*vpp = NULL;
 		return (error);
 	}
 	/* convert ext2 inode to dinode */
 	ext2_ei2i((struct ext2fs_dinode *) ((char *)bp->b_data + EXT2_INODE_SIZE(fs) *
 			ino_to_fsbo(fs, ino)), ip);
 	ip->i_block_group = ino_to_cg(fs, ino);
 	ip->i_next_alloc_block = 0;
 	ip->i_next_alloc_goal = 0;
 
 	/*
 	 * Now we want to make sure that block pointers for unused
 	 * blocks are zeroed out - ext2_balloc depends on this
 	 * although for regular files and directories only
 	 *
 	 * If IN_E4EXTENTS is enabled, unused blocks are not zeroed
 	 * out because we could corrupt the extent tree.
 	 */
 	if (!(ip->i_flag & IN_E4EXTENTS) &&
 	    (S_ISDIR(ip->i_mode) || S_ISREG(ip->i_mode))) {
 		used_blocks = (ip->i_size+fs->e2fs_bsize-1) / fs->e2fs_bsize;
 		for (i = used_blocks; i < EXT2_NDIR_BLOCKS; i++)
 			ip->i_db[i] = 0;
 	}
 #ifdef EXT2FS_DEBUG
 	ext2_print_inode(ip);
 #endif
 	bqrelse(bp);
 
 	/*
 	 * Initialize the vnode from the inode, check for aliases.
 	 * Note that the underlying vnode may have changed.
 	 */
 	if ((error = ext2_vinit(mp, &ext2_fifoops, &vp)) != 0) {
 		vput(vp);
 		*vpp = NULL;
 		return (error);
 	}
 
 	/*
 	 * Finish inode initialization.
 	 */
 
 	/*
 	 * Set up a generation number for this inode if it does not
 	 * already have one. This should only happen on old filesystems.
 	 */
 	if (ip->i_gen == 0) {
 		ip->i_gen = random() + 1;
 		if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0)
 			ip->i_flag |= IN_MODIFIED;
 	}
 	*vpp = vp;
 	return (0);
 }
 
 /*
  * File handle to vnode
  *
  * Have to be really careful about stale file handles:
  * - check that the inode number is valid
  * - call ext2_vget() to get the locked inode
  * - check for an unallocated inode (i_mode == 0)
  * - check that the given client host has export rights and return
  *   those rights via. exflagsp and credanonp
  */
 static int
 ext2_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp)
 {
 	struct inode *ip;
 	struct ufid *ufhp;
 	struct vnode *nvp;
 	struct m_ext2fs *fs;
 	int error;
 
 	ufhp = (struct ufid *)fhp;
 	fs = VFSTOEXT2(mp)->um_e2fs;
 	if (ufhp->ufid_ino < EXT2_ROOTINO ||
 	    ufhp->ufid_ino > fs->e2fs_gcount * fs->e2fs->e2fs_ipg)
 		return (ESTALE);
 
 	error = VFS_VGET(mp, ufhp->ufid_ino, LK_EXCLUSIVE, &nvp);
 	if (error) {
 		*vpp = NULLVP;
 		return (error);
 	}
 	ip = VTOI(nvp);
 	if (ip->i_mode == 0 ||
 	    ip->i_gen != ufhp->ufid_gen || ip->i_nlink <= 0) {
 		vput(nvp);
 		*vpp = NULLVP;
 		return (ESTALE);
 	}
 	*vpp = nvp;
 	vnode_create_vobject(*vpp, 0, curthread);
 	return (0);
 }
 
 /*
  * Write a superblock and associated information back to disk.
  */
 static int
 ext2_sbupdate(struct ext2mount *mp, int waitfor)
 {
 	struct m_ext2fs *fs = mp->um_e2fs;
 	struct ext2fs *es = fs->e2fs;
 	struct buf *bp;
 	int error = 0;
 
 	bp = getblk(mp->um_devvp, SBLOCK, SBSIZE, 0, 0, 0);
 	bcopy((caddr_t)es, bp->b_data, (u_int)sizeof(struct ext2fs));
 	if (waitfor == MNT_WAIT)
 		error = bwrite(bp);
 	else
 		bawrite(bp);
 
 	/*
 	 * The buffers for group descriptors, inode bitmaps and block bitmaps
 	 * are not busy at this point and are (hopefully) written by the
 	 * usual sync mechanism. No need to write them here.
 	 */
 	return (error);
 }
 int
 ext2_cgupdate(struct ext2mount *mp, int waitfor)
 {
 	struct m_ext2fs *fs = mp->um_e2fs;
 	struct buf *bp;
 	int i, error = 0, allerror = 0;
 
 	allerror = ext2_sbupdate(mp, waitfor);
 	for (i = 0; i < fs->e2fs_gdbcount; i++) {
 		bp = getblk(mp->um_devvp, fsbtodb(fs,
 		    fs->e2fs->e2fs_first_dblock +
 		    1 /* superblock */ + i), fs->e2fs_bsize, 0, 0, 0);
 		e2fs_cgsave(&fs->e2fs_gd[
 		    i * fs->e2fs_bsize / sizeof(struct ext2_gd)],
 		    (struct ext2_gd *)bp->b_data, fs->e2fs_bsize);
 		if (waitfor == MNT_WAIT)
 			error = bwrite(bp);
 		else
 			bawrite(bp);
 	}
 
 	if (!allerror && error)
 		allerror = error;
 	return (allerror);
 }
 /*
  * Return the root of a filesystem.
  */
 static int
 ext2_root(struct mount *mp, int flags, struct vnode **vpp)
 {
 	struct vnode *nvp;
 	int error;
 
 	error = VFS_VGET(mp, EXT2_ROOTINO, LK_EXCLUSIVE, &nvp);
 	if (error)
 		return (error);
 	*vpp = nvp;
 	return (0);
 }
Index: user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/fuse/fuse_vfsops.c	(revision 281585)
@@ -1,533 +1,534 @@
 /*
  * Copyright (c) 2007-2009 Google Inc. and Amit Singh
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
  * met:
  *
  * * Redistributions of source code must retain the above copyright
  *   notice, this list of conditions and the following disclaimer.
  * * Redistributions in binary form must reproduce the above
  *   copyright notice, this list of conditions and the following disclaimer
  *   in the documentation and/or other materials provided with the
  *   distribution.
  * * Neither the name of Google Inc. nor the names of its
  *   contributors may be used to endorse or promote products derived from
  *   this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
  * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
  * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * Copyright (C) 2005 Csaba Henk.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/module.h>
 #include <sys/systm.h>
 #include <sys/errno.h>
 #include <sys/param.h>
 #include <sys/kernel.h>
 #include <sys/capsicum.h>
 #include <sys/conf.h>
 #include <sys/filedesc.h>
 #include <sys/uio.h>
 #include <sys/malloc.h>
 #include <sys/queue.h>
 #include <sys/lock.h>
 #include <sys/sx.h>
 #include <sys/mutex.h>
 #include <sys/proc.h>
 #include <sys/vnode.h>
 #include <sys/namei.h>
 #include <sys/mount.h>
 #include <sys/sysctl.h>
 #include <sys/fcntl.h>
 
 #include "fuse.h"
 #include "fuse_param.h"
 #include "fuse_node.h"
 #include "fuse_ipc.h"
 #include "fuse_internal.h"
 
 #include <sys/priv.h>
 #include <security/mac/mac_framework.h>
 
 #define FUSE_DEBUG_MODULE VFSOPS
 #include "fuse_debug.h"
 
 /* This will do for privilege types for now */
 #ifndef PRIV_VFS_FUSE_ALLOWOTHER
 #define PRIV_VFS_FUSE_ALLOWOTHER PRIV_VFS_MOUNT_NONUSER
 #endif
 #ifndef PRIV_VFS_FUSE_MOUNT_NONUSER
 #define PRIV_VFS_FUSE_MOUNT_NONUSER PRIV_VFS_MOUNT_NONUSER
 #endif
 #ifndef PRIV_VFS_FUSE_SYNC_UNMOUNT
 #define PRIV_VFS_FUSE_SYNC_UNMOUNT PRIV_VFS_MOUNT_NONUSER
 #endif
 
 static vfs_mount_t fuse_vfsop_mount;
 static vfs_unmount_t fuse_vfsop_unmount;
 static vfs_root_t fuse_vfsop_root;
 static vfs_statfs_t fuse_vfsop_statfs;
 
 struct vfsops fuse_vfsops = {
 	.vfs_mount = fuse_vfsop_mount,
 	.vfs_unmount = fuse_vfsop_unmount,
 	.vfs_root = fuse_vfsop_root,
 	.vfs_statfs = fuse_vfsop_statfs,
 };
 
 SYSCTL_INT(_vfs_fuse, OID_AUTO, init_backgrounded, CTLFLAG_RD,
     SYSCTL_NULL_INT_PTR, 1, "indicate async handshake");
 static int fuse_enforce_dev_perms = 0;
 
 SYSCTL_INT(_vfs_fuse, OID_AUTO, enforce_dev_perms, CTLFLAG_RW,
     &fuse_enforce_dev_perms, 0,
     "enforce fuse device permissions for secondary mounts");
 static unsigned sync_unmount = 1;
 
 SYSCTL_UINT(_vfs_fuse, OID_AUTO, sync_unmount, CTLFLAG_RW,
     &sync_unmount, 0, "specify when to use synchronous unmount");
 
 MALLOC_DEFINE(M_FUSEVFS, "fuse_filesystem", "buffer for fuse vfs layer");
 
 static int
 fuse_getdevice(const char *fspec, struct thread *td, struct cdev **fdevp)
 {
 	struct nameidata nd, *ndp = &nd;
 	struct vnode *devvp;
 	struct cdev *fdev;
 	int err;
 
 	/*
          * Not an update, or updating the name: look up the name
          * and verify that it refers to a sensible disk device.
          */
 
 	NDINIT(ndp, LOOKUP, FOLLOW, UIO_SYSSPACE, fspec, td);
 	if ((err = namei(ndp)) != 0)
 		return err;
 	NDFREE(ndp, NDF_ONLY_PNBUF);
 	devvp = ndp->ni_vp;
 
 	if (devvp->v_type != VCHR) {
 		vrele(devvp);
 		return ENXIO;
 	}
 	fdev = devvp->v_rdev;
 	dev_ref(fdev);
 
 	if (fuse_enforce_dev_perms) {
 		/*
 	         * Check if mounter can open the fuse device.
 	         *
 	         * This has significance only if we are doing a secondary mount
 	         * which doesn't involve actually opening fuse devices, but we
 	         * still want to enforce the permissions of the device (in
 	         * order to keep control over the circle of fuse users).
 	         *
 	         * (In case of primary mounts, we are either the superuser so
 	         * we can do anything anyway, or we can mount only if the
 	         * device is already opened by us, ie. we are permitted to open
 	         * the device.)
 	         */
 #if 0
 #ifdef MAC
 		err = mac_check_vnode_open(td->td_ucred, devvp, VREAD | VWRITE);
 		if (!err)
 #endif
 #endif /* 0 */
 			err = VOP_ACCESS(devvp, VREAD | VWRITE, td->td_ucred, td);
 		if (err) {
 			vrele(devvp);
 			dev_rel(fdev);
 			return err;
 		}
 	}
 	/*
          * according to coda code, no extra lock is needed --
          * although in sys/vnode.h this field is marked "v"
          */
 	vrele(devvp);
 
 	if (!fdev->si_devsw ||
 	    strcmp("fuse", fdev->si_devsw->d_name)) {
 		dev_rel(fdev);
 		return ENXIO;
 	}
 	*fdevp = fdev;
 
 	return 0;
 }
 
 #define FUSE_FLAGOPT(fnam, fval) do {				\
     vfs_flagopt(opts, #fnam, &mntopts, fval);		\
     vfs_flagopt(opts, "__" #fnam, &__mntopts, fval);	\
 } while (0)
 
 static int
 fuse_vfsop_mount(struct mount *mp)
 {
 	int err;
 
 	uint64_t mntopts, __mntopts;
 	int max_read_set;
 	uint32_t max_read;
 	int daemon_timeout;
 	int fd;
 
 	size_t len;
 
 	struct cdev *fdev;
 	struct fuse_data *data;
 	struct thread *td;
 	struct file *fp, *fptmp;
 	char *fspec, *subtype;
 	struct vfsoptlist *opts;
 	cap_rights_t rights;
 
 	subtype = NULL;
 	max_read_set = 0;
 	max_read = ~0;
 	err = 0;
 	mntopts = 0;
 	__mntopts = 0;
 	td = curthread;
 
 	fuse_trace_printf_vfsop();
 
 	if (mp->mnt_flag & MNT_UPDATE)
 		return EOPNOTSUPP;
 
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= MNT_SYNCHRONOUS;
 	mp->mnt_data = NULL;
 	MNT_IUNLOCK(mp);
 	/* Get the new options passed to mount */
 	opts = mp->mnt_optnew;
 
 	if (!opts)
 		return EINVAL;
 
 	/* `fspath' contains the mount point (eg. /mnt/fuse/sshfs); REQUIRED */
 	if (!vfs_getopts(opts, "fspath", &err))
 		return err;
 
 	/* `from' contains the device name (eg. /dev/fuse0); REQUIRED */
 	fspec = vfs_getopts(opts, "from", &err);
 	if (!fspec)
 		return err;
 
 	/* `fd' contains the filedescriptor for this session; REQUIRED */
 	if (vfs_scanopt(opts, "fd", "%d", &fd) != 1)
 		return EINVAL;
 
 	err = fuse_getdevice(fspec, td, &fdev);
 	if (err != 0)
 		return err;
 
 	/*
          * With the help of underscored options the mount program
          * can inform us from the flags it sets by default
          */
 	FUSE_FLAGOPT(allow_other, FSESS_DAEMON_CAN_SPY);
 	FUSE_FLAGOPT(push_symlinks_in, FSESS_PUSH_SYMLINKS_IN);
 	FUSE_FLAGOPT(default_permissions, FSESS_DEFAULT_PERMISSIONS);
 	FUSE_FLAGOPT(no_attrcache, FSESS_NO_ATTRCACHE);
 	FUSE_FLAGOPT(no_readahed, FSESS_NO_READAHEAD);
 	FUSE_FLAGOPT(no_datacache, FSESS_NO_DATACACHE);
 	FUSE_FLAGOPT(no_namecache, FSESS_NO_NAMECACHE);
 	FUSE_FLAGOPT(no_mmap, FSESS_NO_MMAP);
 	FUSE_FLAGOPT(brokenio, FSESS_BROKENIO);
 
 	if (vfs_scanopt(opts, "max_read=", "%u", &max_read) == 1)
 		max_read_set = 1;
 	if (vfs_scanopt(opts, "timeout=", "%u", &daemon_timeout) == 1) {
 		if (daemon_timeout < FUSE_MIN_DAEMON_TIMEOUT)
 			daemon_timeout = FUSE_MIN_DAEMON_TIMEOUT;
 		else if (daemon_timeout > FUSE_MAX_DAEMON_TIMEOUT)
 			daemon_timeout = FUSE_MAX_DAEMON_TIMEOUT;
 	} else {
 		daemon_timeout = FUSE_DEFAULT_DAEMON_TIMEOUT;
 	}
 	subtype = vfs_getopts(opts, "subtype=", &err);
 
 	FS_DEBUG2G("mntopts 0x%jx\n", (uintmax_t)mntopts);
 
 	err = fget(td, fd, cap_rights_init(&rights, CAP_READ), &fp);
 	if (err != 0) {
 		FS_DEBUG("invalid or not opened device: data=%p\n", data);
 		goto out;
 	}
 	fptmp = td->td_fpop;
 	td->td_fpop = fp;
         err = devfs_get_cdevpriv((void **)&data);
 	td->td_fpop = fptmp;
 	fdrop(fp, td);
 	FUSE_LOCK();
 	if (err != 0 || data == NULL || data->mp != NULL) {
 		FS_DEBUG("invalid or not opened device: data=%p data.mp=%p\n",
 		    data, data != NULL ? data->mp : NULL);
 		err = ENXIO;
 		FUSE_UNLOCK();
 		goto out;
 	}
 	if (fdata_get_dead(data)) {
 		FS_DEBUG("device is dead during mount: data=%p\n", data);
 		err = ENOTCONN;
 		FUSE_UNLOCK();
 		goto out;
 	}
 	/* Sanity + permission checks */
 	if (!data->daemoncred)
 		panic("fuse daemon found, but identity unknown");
 	if (mntopts & FSESS_DAEMON_CAN_SPY)
 		err = priv_check(td, PRIV_VFS_FUSE_ALLOWOTHER);
 	if (err == 0 && td->td_ucred->cr_uid != data->daemoncred->cr_uid)
 		/* are we allowed to do the first mount? */
 		err = priv_check(td, PRIV_VFS_FUSE_MOUNT_NONUSER);
 	if (err) {
 		FUSE_UNLOCK();
 		goto out;
 	}
 	data->ref++;
 	data->mp = mp;
 	data->dataflags |= mntopts;
 	data->max_read = max_read;
 	data->daemon_timeout = daemon_timeout;
 	FUSE_UNLOCK();
 
 	vfs_getnewfsid(mp);
 	MNT_ILOCK(mp);
 	mp->mnt_data = data;
 	mp->mnt_flag |= MNT_LOCAL;
+	mp->mnt_kern_flag |= MNTK_USES_BCACHE;
 	MNT_IUNLOCK(mp);
 	/* We need this here as this slot is used by getnewvnode() */
 	mp->mnt_stat.f_iosize = PAGE_SIZE;
 	if (subtype) {
 		strlcat(mp->mnt_stat.f_fstypename, ".", MFSNAMELEN);
 		strlcat(mp->mnt_stat.f_fstypename, subtype, MFSNAMELEN);
 	}
 	copystr(fspec, mp->mnt_stat.f_mntfromname, MNAMELEN - 1, &len);
 	bzero(mp->mnt_stat.f_mntfromname + len, MNAMELEN - len);
 	FS_DEBUG2G("mp %p: %s\n", mp, mp->mnt_stat.f_mntfromname);
 
 	/* Now handshaking with daemon */
 	fuse_internal_send_init(data, td);
 
 out:
 	if (err) {
 		FUSE_LOCK();
 		if (data->mp == mp) {
 			/*
 			 * Destroy device only if we acquired reference to
 			 * it
 			 */
 			FS_DEBUG("mount failed, destroy device: data=%p mp=%p"
 			      " err=%d\n",
 			    data, mp, err);
 			data->mp = NULL;
 			fdata_trydestroy(data);
 		}
 		FUSE_UNLOCK();
 		dev_rel(fdev);
 	}
 	return err;
 }
 
 static int
 fuse_vfsop_unmount(struct mount *mp, int mntflags)
 {
 	int err = 0;
 	int flags = 0;
 
 	struct cdev *fdev;
 	struct fuse_data *data;
 	struct fuse_dispatcher fdi;
 	struct thread *td = curthread;
 
 	fuse_trace_printf_vfsop();
 
 	if (mntflags & MNT_FORCE) {
 		flags |= FORCECLOSE;
 	}
 	data = fuse_get_mpdata(mp);
 	if (!data) {
 		panic("no private data for mount point?");
 	}
 	/* There is 1 extra root vnode reference (mp->mnt_data). */
 	FUSE_LOCK();
 	if (data->vroot != NULL) {
 		struct vnode *vroot = data->vroot;
 
 		data->vroot = NULL;
 		FUSE_UNLOCK();
 		vrele(vroot);
 	} else
 		FUSE_UNLOCK();
 	err = vflush(mp, 0, flags, td);
 	if (err) {
 		debug_printf("vflush failed");
 		return err;
 	}
 	if (fdata_get_dead(data)) {
 		goto alreadydead;
 	}
 	fdisp_init(&fdi, 0);
 	fdisp_make(&fdi, FUSE_DESTROY, mp, 0, td, NULL);
 
 	err = fdisp_wait_answ(&fdi);
 	fdisp_destroy(&fdi);
 
 	fdata_set_dead(data);
 
 alreadydead:
 	FUSE_LOCK();
 	data->mp = NULL;
 	fdev = data->fdev;
 	fdata_trydestroy(data);
 	FUSE_UNLOCK();
 
 	MNT_ILOCK(mp);
 	mp->mnt_data = NULL;
 	mp->mnt_flag &= ~MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 
 	dev_rel(fdev);
 
 	return 0;
 }
 
 static int
 fuse_vfsop_root(struct mount *mp, int lkflags, struct vnode **vpp)
 {
 	struct fuse_data *data = fuse_get_mpdata(mp);
 	int err = 0;
 
 	if (data->vroot != NULL) {
 		err = vget(data->vroot, lkflags, curthread);
 		if (err == 0)
 			*vpp = data->vroot;
 	} else {
 		err = fuse_vnode_get(mp, FUSE_ROOT_ID, NULL, vpp, NULL, VDIR);
 		if (err == 0) {
 			FUSE_LOCK();
 			MPASS(data->vroot == NULL || data->vroot == *vpp);
 			if (data->vroot == NULL) {
 				FS_DEBUG("new root vnode\n");
 				data->vroot = *vpp;
 				FUSE_UNLOCK();
 				vref(*vpp);
 			} else if (data->vroot != *vpp) {
 				FS_DEBUG("root vnode race\n");
 				FUSE_UNLOCK();
 				VOP_UNLOCK(*vpp, 0);
 				vrele(*vpp);
 				vrecycle(*vpp);
 				*vpp = data->vroot;
 			} else
 				FUSE_UNLOCK();
 		}
 	}
 	return err;
 }
 
 static int
 fuse_vfsop_statfs(struct mount *mp, struct statfs *sbp)
 {
 	struct fuse_dispatcher fdi;
 	int err = 0;
 
 	struct fuse_statfs_out *fsfo;
 	struct fuse_data *data;
 
 	FS_DEBUG2G("mp %p: %s\n", mp, mp->mnt_stat.f_mntfromname);
 	data = fuse_get_mpdata(mp);
 
 	if (!(data->dataflags & FSESS_INITED))
 		goto fake;
 
 	fdisp_init(&fdi, 0);
 	fdisp_make(&fdi, FUSE_STATFS, mp, FUSE_ROOT_ID, NULL, NULL);
 	err = fdisp_wait_answ(&fdi);
 	if (err) {
 		fdisp_destroy(&fdi);
 		if (err == ENOTCONN) {
 			/*
 	                 * We want to seem a legitimate fs even if the daemon
 	                 * is stiff dead... (so that, eg., we can still do path
 	                 * based unmounting after the daemon dies).
 	                 */
 			goto fake;
 		}
 		return err;
 	}
 	fsfo = fdi.answ;
 
 	sbp->f_blocks = fsfo->st.blocks;
 	sbp->f_bfree = fsfo->st.bfree;
 	sbp->f_bavail = fsfo->st.bavail;
 	sbp->f_files = fsfo->st.files;
 	sbp->f_ffree = fsfo->st.ffree;	/* cast from uint64_t to int64_t */
 	sbp->f_namemax = fsfo->st.namelen;
 	sbp->f_bsize = fsfo->st.frsize;	/* cast from uint32_t to uint64_t */
 
 	FS_DEBUG("fuse_statfs_out -- blocks: %llu, bfree: %llu, bavail: %llu, "
 	      "fil	es: %llu, ffree: %llu, bsize: %i, namelen: %i\n",
 	      (unsigned long long)fsfo->st.blocks, 
 	      (unsigned long long)fsfo->st.bfree,
 	      (unsigned long long)fsfo->st.bavail, 
 	      (unsigned long long)fsfo->st.files,
 	      (unsigned long long)fsfo->st.ffree, fsfo->st.bsize, 
 	      fsfo->st.namelen);
 
 	fdisp_destroy(&fdi);
 	return 0;
 
 fake:
 	sbp->f_blocks = 0;
 	sbp->f_bfree = 0;
 	sbp->f_bavail = 0;
 	sbp->f_files = 0;
 	sbp->f_ffree = 0;
 	sbp->f_namemax = 0;
 	sbp->f_bsize = FUSE_DEFAULT_BLOCKSIZE;
 
 	return 0;
 }
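Note on the fuse_vfsops.c hunk above: besides the added MNTK_USES_BCACHE
line, the context shows fuse_vfsop_mount() selecting the daemon timeout.
It falls back to a default when the "timeout=" option is absent and
otherwise clamps the requested value into a fixed range.  The following is
a minimal standalone sketch of that selection, not the kernel code itself.
The SKETCH_* bounds and the function name are hypothetical placeholders
and are not taken from fuse_param.h.

```c
/* Placeholder bounds for illustration only; not the fuse_param.h values. */
enum {
	SKETCH_MIN_TIMEOUT = 1,
	SKETCH_DEFAULT_TIMEOUT = 60,
	SKETCH_MAX_TIMEOUT = 600
};

/*
 * Return the timeout to use: the default when the option was not
 * supplied, otherwise the requested value clamped into
 * [SKETCH_MIN_TIMEOUT, SKETCH_MAX_TIMEOUT].
 */
int
sketch_daemon_timeout(int option_present, int requested)
{
	if (!option_present)
		return (SKETCH_DEFAULT_TIMEOUT);
	if (requested < SKETCH_MIN_TIMEOUT)
		return (SKETCH_MIN_TIMEOUT);
	if (requested > SKETCH_MAX_TIMEOUT)
		return (SKETCH_MAX_TIMEOUT);
	return (requested);
}
```

With these placeholder bounds, sketch_daemon_timeout(1, 100000) yields the
upper bound and sketch_daemon_timeout(0, 5) yields the default, mirroring
the if/else structure in the hunk.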
Index: user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/msdosfs/msdosfs_vfsops.c	(revision 281585)
@@ -1,1035 +1,1036 @@
 /* $FreeBSD$ */
 /*	$NetBSD: msdosfs_vfsops.c,v 1.51 1997/11/17 15:36:58 ws Exp $	*/
 
 /*-
  * Copyright (C) 1994, 1995, 1997 Wolfgang Solfrank.
  * Copyright (C) 1994, 1995, 1997 TooLs GmbH.
  * All rights reserved.
  * Original code by Paul Popelka (paulp@uts.amdahl.com) (see below).
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. All advertising materials mentioning features or use of this software
  *    must display the following acknowledgement:
  *	This product includes software developed by TooLs GmbH.
  * 4. The name of TooLs GmbH may not be used to endorse or promote products
  *    derived from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY TOOLS GMBH ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL TOOLS GMBH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
  * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
  * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
  * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 /*-
  * Written by Paul Popelka (paulp@uts.amdahl.com)
  *
  * You can do anything you want with this software, just don't say you wrote
  * it, and don't remove this notice.
  *
  * This software is provided "as is".
  *
  * The author supplies this software to be publicly redistributed on the
  * understanding that the author is not responsible for the correct
  * functioning of this software in any circumstances and is not liable for
  * any damages caused by this software.
  *
  * October 1992
  */
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/buf.h>
 #include <sys/conf.h>
 #include <sys/fcntl.h>
 #include <sys/iconv.h>
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/mutex.h>
 #include <sys/namei.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/stat.h>
 #include <sys/vnode.h>
 
 #include <geom/geom.h>
 #include <geom/geom_vfs.h>
 
 #include <fs/msdosfs/bootsect.h>
 #include <fs/msdosfs/bpb.h>
 #include <fs/msdosfs/direntry.h>
 #include <fs/msdosfs/denode.h>
 #include <fs/msdosfs/fat.h>
 #include <fs/msdosfs/msdosfsmount.h>
 
 static const char msdosfs_lock_msg[] = "fatlk";
 
 /* Mount options that we support. */
 static const char *msdosfs_opts[] = {
 	"async", "noatime", "noclusterr", "noclusterw",
 	"export", "force", "from", "sync",
 	"cs_dos", "cs_local", "cs_win", "dirmask",
 	"gid", "kiconv", "large", "longname",
 	"longnames", "mask", "shortname", "shortnames",
 	"uid", "win95", "nowin95",
 	NULL
 };
 
 #if 1 /*def PC98*/
 /*
  * XXX - The boot signature formatted by NEC PC-98 DOS looks like a
  *       garbage or a random value :-{
  *       If you want to use that broken-signatured media, define the
  *       following symbol even though PC/AT.
  *       (ex. mount PC-98 DOS formatted FD on PC/AT)
  */
 #define	MSDOSFS_NOCHECKSIG
 #endif
 
 MALLOC_DEFINE(M_MSDOSFSMNT, "msdosfs_mount", "MSDOSFS mount structure");
 static MALLOC_DEFINE(M_MSDOSFSFAT, "msdosfs_fat", "MSDOSFS file allocation table");
 
 struct iconv_functions *msdosfs_iconv;
 
 static int	update_mp(struct mount *mp, struct thread *td);
 static int	mountmsdosfs(struct vnode *devvp, struct mount *mp);
 static vfs_fhtovp_t	msdosfs_fhtovp;
 static vfs_mount_t	msdosfs_mount;
 static vfs_root_t	msdosfs_root;
 static vfs_statfs_t	msdosfs_statfs;
 static vfs_sync_t	msdosfs_sync;
 static vfs_unmount_t	msdosfs_unmount;
 
 /* Maximum length of a character set name (arbitrary). */
 #define	MAXCSLEN	64
 
 static int
 update_mp(struct mount *mp, struct thread *td)
 {
 	struct msdosfsmount *pmp = VFSTOMSDOSFS(mp);
 	void *dos, *win, *local;
 	int error, v;
 
 	if (!vfs_getopt(mp->mnt_optnew, "kiconv", NULL, NULL)) {
 		if (msdosfs_iconv != NULL) {
 			error = vfs_getopt(mp->mnt_optnew,
 			    "cs_win", &win, NULL);
 			if (!error)
 				error = vfs_getopt(mp->mnt_optnew,
 				    "cs_local", &local, NULL);
 			if (!error)
 				error = vfs_getopt(mp->mnt_optnew,
 				    "cs_dos", &dos, NULL);
 			if (!error) {
 				msdosfs_iconv->open(win, local, &pmp->pm_u2w);
 				msdosfs_iconv->open(local, win, &pmp->pm_w2u);
 				msdosfs_iconv->open(dos, local, &pmp->pm_u2d);
 				msdosfs_iconv->open(local, dos, &pmp->pm_d2u);
 			}
 			if (error != 0)
 				return (error);
 		} else {
 			pmp->pm_w2u = NULL;
 			pmp->pm_u2w = NULL;
 			pmp->pm_d2u = NULL;
 			pmp->pm_u2d = NULL;
 		}
 	}
 
 	if (vfs_scanopt(mp->mnt_optnew, "gid", "%d", &v) == 1)
 		pmp->pm_gid = v;
 	if (vfs_scanopt(mp->mnt_optnew, "uid", "%d", &v) == 1)
 		pmp->pm_uid = v;
 	if (vfs_scanopt(mp->mnt_optnew, "mask", "%d", &v) == 1)
 		pmp->pm_mask = v & ALLPERMS;
 	if (vfs_scanopt(mp->mnt_optnew, "dirmask", "%d", &v) == 1)
 		pmp->pm_dirmask = v & ALLPERMS;
 	vfs_flagopt(mp->mnt_optnew, "shortname",
 	    &pmp->pm_flags, MSDOSFSMNT_SHORTNAME);
 	vfs_flagopt(mp->mnt_optnew, "shortnames",
 	    &pmp->pm_flags, MSDOSFSMNT_SHORTNAME);
 	vfs_flagopt(mp->mnt_optnew, "longname",
 	    &pmp->pm_flags, MSDOSFSMNT_LONGNAME);
 	vfs_flagopt(mp->mnt_optnew, "longnames",
 	    &pmp->pm_flags, MSDOSFSMNT_LONGNAME);
 	vfs_flagopt(mp->mnt_optnew, "kiconv",
 	    &pmp->pm_flags, MSDOSFSMNT_KICONV);
 
 	if (vfs_getopt(mp->mnt_optnew, "nowin95", NULL, NULL) == 0)
 		pmp->pm_flags |= MSDOSFSMNT_NOWIN95;
 	else
 		pmp->pm_flags &= ~MSDOSFSMNT_NOWIN95;
 
 	if (pmp->pm_flags & MSDOSFSMNT_NOWIN95)
 		pmp->pm_flags |= MSDOSFSMNT_SHORTNAME;
 	else if (!(pmp->pm_flags &
 	    (MSDOSFSMNT_SHORTNAME | MSDOSFSMNT_LONGNAME))) {
 		struct vnode *rootvp;
 
 		/*
 		 * Try to divine whether to support Win'95 long filenames
 		 */
 		if (FAT32(pmp))
 			pmp->pm_flags |= MSDOSFSMNT_LONGNAME;
 		else {
 			if ((error =
 			    msdosfs_root(mp, LK_EXCLUSIVE, &rootvp)) != 0)
 				return error;
 			pmp->pm_flags |= findwin95(VTODE(rootvp)) ?
 			    MSDOSFSMNT_LONGNAME : MSDOSFSMNT_SHORTNAME;
 			vput(rootvp);
 		}
 	}
 	return 0;
 }
 
 static int
 msdosfs_cmount(struct mntarg *ma, void *data, uint64_t flags)
 {
 	struct msdosfs_args args;
 	struct export_args exp;
 	int error;
 
 	if (data == NULL)
 		return (EINVAL);
 	error = copyin(data, &args, sizeof args);
 	if (error)
 		return (error);
 	vfs_oexport_conv(&args.export, &exp);
 
 	ma = mount_argsu(ma, "from", args.fspec, MAXPATHLEN);
 	ma = mount_arg(ma, "export", &exp, sizeof(exp));
 	ma = mount_argf(ma, "uid", "%d", args.uid);
 	ma = mount_argf(ma, "gid", "%d", args.gid);
 	ma = mount_argf(ma, "mask", "%d", args.mask);
 	ma = mount_argf(ma, "dirmask", "%d", args.dirmask);
 
 	ma = mount_argb(ma, args.flags & MSDOSFSMNT_SHORTNAME, "noshortname");
 	ma = mount_argb(ma, args.flags & MSDOSFSMNT_LONGNAME, "nolongname");
 	ma = mount_argb(ma, !(args.flags & MSDOSFSMNT_NOWIN95), "nowin95");
 	ma = mount_argb(ma, args.flags & MSDOSFSMNT_KICONV, "nokiconv");
 
 	ma = mount_argsu(ma, "cs_win", args.cs_win, MAXCSLEN);
 	ma = mount_argsu(ma, "cs_dos", args.cs_dos, MAXCSLEN);
 	ma = mount_argsu(ma, "cs_local", args.cs_local, MAXCSLEN);
 
 	error = kernel_mount(ma, flags);
 
 	return (error);
 }
 
 /*
  * mp - path - addr in user space of mount point (ie /usr or whatever)
  * data - addr in user space of mount params including the name of the block
  * special file to treat as a filesystem.
  */
 static int
 msdosfs_mount(struct mount *mp)
 {
 	struct vnode *devvp;	  /* vnode for blk device to mount */
 	struct thread *td;
 	/* msdosfs specific mount control block */
 	struct msdosfsmount *pmp = NULL;
 	struct nameidata ndp;
 	int error, flags;
 	accmode_t accmode;
 	char *from;
 
 	td = curthread;
 	if (vfs_filteropt(mp->mnt_optnew, msdosfs_opts))
 		return (EINVAL);
 
 	/*
 	 * If updating, check whether changing from read-only to
 	 * read/write; if there is no device name, that's all we do.
 	 */
 	if (mp->mnt_flag & MNT_UPDATE) {
 		pmp = VFSTOMSDOSFS(mp);
 		if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0)) {
 			/*
 			 * Forbid export requests if filesystem has
 			 * MSDOSFS_LARGEFS flag set.
 			 */
 			if ((pmp->pm_flags & MSDOSFS_LARGEFS) != 0) {
 				vfs_mount_error(mp,
 				    "MSDOSFS_LARGEFS flag set, cannot export");
 				return (EOPNOTSUPP);
 			}
 		}
 		if (!(pmp->pm_flags & MSDOSFSMNT_RONLY) &&
 		    vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) {
 			error = VFS_SYNC(mp, MNT_WAIT);
 			if (error)
 				return (error);
 			flags = WRITECLOSE;
 			if (mp->mnt_flag & MNT_FORCE)
 				flags |= FORCECLOSE;
 			error = vflush(mp, 0, flags, td);
 			if (error)
 				return (error);
 
 			/*
 			 * Now the volume is clean.  Mark it so while the
 			 * device is still rw.
 			 */
 			error = markvoldirty(pmp, 0);
 			if (error) {
 				(void)markvoldirty(pmp, 1);
 				return (error);
 			}
 
 			/* Downgrade the device from rw to ro. */
 			DROP_GIANT();
 			g_topology_lock();
 			error = g_access(pmp->pm_cp, 0, -1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error) {
 				(void)markvoldirty(pmp, 1);
 				return (error);
 			}
 
 			/*
 			 * Backing out after an error was painful in the
 			 * above.  Now we are committed to succeeding.
 			 */
 			pmp->pm_fmod = 0;
 			pmp->pm_flags |= MSDOSFSMNT_RONLY;
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 		} else if ((pmp->pm_flags & MSDOSFSMNT_RONLY) &&
 		    !vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) {
 			/*
 			 * If upgrade to read-write by non-root, then verify
 			 * that user has necessary permissions on the device.
 			 */
 			devvp = pmp->pm_devvp;
 			vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 			error = VOP_ACCESS(devvp, VREAD | VWRITE,
 			    td->td_ucred, td);
 			if (error)
 				error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 			if (error) {
 				VOP_UNLOCK(devvp, 0);
 				return (error);
 			}
 			VOP_UNLOCK(devvp, 0);
 			DROP_GIANT();
 			g_topology_lock();
 			error = g_access(pmp->pm_cp, 0, 1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error)
 				return (error);
 
 			pmp->pm_fmod = 1;
 			pmp->pm_flags &= ~MSDOSFSMNT_RONLY;
 			MNT_ILOCK(mp);
 			mp->mnt_flag &= ~MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 
 			/* Now that the volume is modifiable, mark it dirty. */
 			error = markvoldirty(pmp, 1);
 			if (error)
 				return (error); 
 		}
 	}
 	/*
 	 * Not an update, or updating the name: look up the name
 	 * and verify that it refers to a sensible disk device.
 	 */
 	if (vfs_getopt(mp->mnt_optnew, "from", (void **)&from, NULL))
 		return (EINVAL);
 	NDINIT(&ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, from, td);
 	error = namei(&ndp);
 	if (error)
 		return (error);
 	devvp = ndp.ni_vp;
 	NDFREE(&ndp, NDF_ONLY_PNBUF);
 
 	if (!vn_isdisk(devvp, &error)) {
 		vput(devvp);
 		return (error);
 	}
 	/*
 	 * If mount by non-root, then verify that user has necessary
 	 * permissions on the device.
 	 */
 	accmode = VREAD;
 	if ((mp->mnt_flag & MNT_RDONLY) == 0)
 		accmode |= VWRITE;
 	error = VOP_ACCESS(devvp, accmode, td->td_ucred, td);
 	if (error)
 		error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 	if (error) {
 		vput(devvp);
 		return (error);
 	}
 	if ((mp->mnt_flag & MNT_UPDATE) == 0) {
 		error = mountmsdosfs(devvp, mp);
 #ifdef MSDOSFS_DEBUG		/* only needed for the printf below */
 		pmp = VFSTOMSDOSFS(mp);
 #endif
 	} else {
 		vput(devvp);
 		if (devvp != pmp->pm_devvp)
 			return (EINVAL);	/* XXX needs translation */
 	}
 	if (error) {
 		vrele(devvp);
 		return (error);
 	}
 
 	error = update_mp(mp, td);
 	if (error) {
 		if ((mp->mnt_flag & MNT_UPDATE) == 0)
 			msdosfs_unmount(mp, MNT_FORCE);
 		return error;
 	}
 
 	if (devvp->v_type == VCHR && devvp->v_rdev != NULL)
 		devvp->v_rdev->si_mountpt = mp;
 	vfs_mountedfrom(mp, from);
 #ifdef MSDOSFS_DEBUG
 	printf("msdosfs_mount(): mp %p, pmp %p, inusemap %p\n", mp, pmp, pmp->pm_inusemap);
 #endif
 	return (0);
 }
 
 static int
 mountmsdosfs(struct vnode *devvp, struct mount *mp)
 {
 	struct msdosfsmount *pmp;
 	struct buf *bp;
 	struct cdev *dev;
 	union bootsector *bsp;
 	struct byte_bpb33 *b33;
 	struct byte_bpb50 *b50;
 	struct byte_bpb710 *b710;
 	u_int8_t SecPerClust;
 	u_long clusters;
 	int ronly, error;
 	struct g_consumer *cp;
 	struct bufobj *bo;
 
 	bp = NULL;		/* This and pmp both used in error_exit. */
 	pmp = NULL;
 	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 
 	dev = devvp->v_rdev;
 	dev_ref(dev);
 	DROP_GIANT();
 	g_topology_lock();
 	error = g_vfs_open(devvp, &cp, "msdosfs", ronly ? 0 : 1);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	VOP_UNLOCK(devvp, 0);
 	if (error)
 		goto error_exit;
 
 	bo = &devvp->v_bufobj;
 
 	/*
 	 * Read the boot sector of the filesystem, and then check the
 	 * boot signature.  If not a dos boot sector then error out.
 	 *
 	 * NOTE: 8192 is a magic size that works for ffs.
 	 */
 	error = bread(devvp, 0, 8192, NOCRED, &bp);
 	if (error)
 		goto error_exit;
 	bp->b_flags |= B_AGE;
 	bsp = (union bootsector *)bp->b_data;
 	b33 = (struct byte_bpb33 *)bsp->bs33.bsBPB;
 	b50 = (struct byte_bpb50 *)bsp->bs50.bsBPB;
 	b710 = (struct byte_bpb710 *)bsp->bs710.bsBPB;
 
 #ifndef MSDOSFS_NOCHECKSIG
 	if (bsp->bs50.bsBootSectSig0 != BOOTSIG0
 	    || bsp->bs50.bsBootSectSig1 != BOOTSIG1) {
 		error = EINVAL;
 		goto error_exit;
 	}
 #endif
 
 	pmp = malloc(sizeof *pmp, M_MSDOSFSMNT, M_WAITOK | M_ZERO);
 	pmp->pm_mountp = mp;
 	pmp->pm_cp = cp;
 	pmp->pm_bo = bo;
 
 	lockinit(&pmp->pm_fatlock, 0, msdosfs_lock_msg, 0, 0);
 
 	/*
 	 * Initialize ownerships and permissions, since nothing else will
 	 * initialize them iff we are mounting root.
 	 */
 	pmp->pm_uid = UID_ROOT;
 	pmp->pm_gid = GID_WHEEL;
 	pmp->pm_mask = pmp->pm_dirmask = S_IXUSR | S_IXGRP | S_IXOTH |
 	    S_IRUSR | S_IRGRP | S_IROTH | S_IWUSR;
 
 	/*
 	 * Experimental support for large MS-DOS filesystems.
 	 * WARNING: This uses at least 32 bytes of kernel memory (which is not
 	 * reclaimed until the FS is unmounted) for each file on disk to map
 	 * between the 32-bit inode numbers used by VFS and the 64-bit
 	 * pseudo-inode numbers used internally by msdosfs. This is only
 	 * safe to use in certain controlled situations (e.g. read-only FS
 	 * with less than 1 million files).
 	 * Since the mappings do not persist across unmounts (or reboots), these
 	 * filesystems are not suitable for exporting through NFS, or any other
 	 * application that requires fixed inode numbers.
 	 */
 	vfs_flagopt(mp->mnt_optnew, "large", &pmp->pm_flags, MSDOSFS_LARGEFS);
 
 	/*
 	 * Compute several useful quantities from the bpb in the
 	 * bootsector.  Copy in the dos 5 variant of the bpb then fix up
 	 * the fields that are different between dos 5 and dos 3.3.
 	 */
 	SecPerClust = b50->bpbSecPerClust;
 	pmp->pm_BytesPerSec = getushort(b50->bpbBytesPerSec);
 	if (pmp->pm_BytesPerSec < DEV_BSIZE) {
 		error = EINVAL;
 		goto error_exit;
 	}
 	pmp->pm_ResSectors = getushort(b50->bpbResSectors);
 	pmp->pm_FATs = b50->bpbFATs;
 	pmp->pm_RootDirEnts = getushort(b50->bpbRootDirEnts);
 	pmp->pm_Sectors = getushort(b50->bpbSectors);
 	pmp->pm_FATsecs = getushort(b50->bpbFATsecs);
 	pmp->pm_SecPerTrack = getushort(b50->bpbSecPerTrack);
 	pmp->pm_Heads = getushort(b50->bpbHeads);
 	pmp->pm_Media = b50->bpbMedia;
 
 	/* calculate the ratio of sector size to DEV_BSIZE */
 	pmp->pm_BlkPerSec = pmp->pm_BytesPerSec / DEV_BSIZE;
 
 	/*
 	 * We don't check pm_Heads nor pm_SecPerTrack, because
 	 * these may not be set for EFI file systems. We don't
 	 * use these anyway, so we're unaffected if they are
 	 * invalid.
 	 */
 	if (!pmp->pm_BytesPerSec || !SecPerClust) {
 		error = EINVAL;
 		goto error_exit;
 	}
 
 	if (pmp->pm_Sectors == 0) {
 		pmp->pm_HiddenSects = getulong(b50->bpbHiddenSecs);
 		pmp->pm_HugeSectors = getulong(b50->bpbHugeSectors);
 	} else {
 		pmp->pm_HiddenSects = getushort(b33->bpbHiddenSecs);
 		pmp->pm_HugeSectors = pmp->pm_Sectors;
 	}
 	if (!(pmp->pm_flags & MSDOSFS_LARGEFS)) {
 		if (pmp->pm_HugeSectors > 0xffffffff /
 		    (pmp->pm_BytesPerSec / sizeof(struct direntry)) + 1) {
 			/*
 			 * We cannot deal currently with this size of disk
 			 * due to fileid limitations (see msdosfs_getattr and
 			 * msdosfs_readdir)
 			 */
 			error = EINVAL;
 			vfs_mount_error(mp,
 			    "Disk too big, try '-o large' mount option");
 			goto error_exit;
 		}
 	}
 
 	if (pmp->pm_RootDirEnts == 0) {
 		if (pmp->pm_FATsecs
 		    || getushort(b710->bpbFSVers)) {
 			error = EINVAL;
 #ifdef MSDOSFS_DEBUG
 			printf("mountmsdosfs(): bad FAT32 filesystem\n");
 #endif
 			goto error_exit;
 		}
 		pmp->pm_fatmask = FAT32_MASK;
 		pmp->pm_fatmult = 4;
 		pmp->pm_fatdiv = 1;
 		pmp->pm_FATsecs = getulong(b710->bpbBigFATsecs);
 		if (getushort(b710->bpbExtFlags) & FATMIRROR)
 			pmp->pm_curfat = getushort(b710->bpbExtFlags) & FATNUM;
 		else
 			pmp->pm_flags |= MSDOSFS_FATMIRROR;
 	} else
 		pmp->pm_flags |= MSDOSFS_FATMIRROR;
 
 	/*
 	 * Check a few values (could do some more):
 	 * - logical sector size: power of 2, >= block size
 	 * - sectors per cluster: power of 2, >= 1
 	 * - number of sectors:   >= 1, <= size of partition
 	 * - number of FAT sectors: >= 1
 	 */
 	if ( (SecPerClust == 0)
 	  || (SecPerClust & (SecPerClust - 1))
 	  || (pmp->pm_BytesPerSec < DEV_BSIZE)
 	  || (pmp->pm_BytesPerSec & (pmp->pm_BytesPerSec - 1))
 	  || (pmp->pm_HugeSectors == 0)
 	  || (pmp->pm_FATsecs == 0)
 	  || (SecPerClust * pmp->pm_BlkPerSec > MAXBSIZE / DEV_BSIZE)
 	) {
 		error = EINVAL;
 		goto error_exit;
 	}
 
 	pmp->pm_HugeSectors *= pmp->pm_BlkPerSec;
 	pmp->pm_HiddenSects *= pmp->pm_BlkPerSec;	/* XXX not used? */
 	pmp->pm_FATsecs     *= pmp->pm_BlkPerSec;
 	SecPerClust         *= pmp->pm_BlkPerSec;
 
 	pmp->pm_fatblk = pmp->pm_ResSectors * pmp->pm_BlkPerSec;
 
 	if (FAT32(pmp)) {
 		pmp->pm_rootdirblk = getulong(b710->bpbRootClust);
 		pmp->pm_firstcluster = pmp->pm_fatblk
 			+ (pmp->pm_FATs * pmp->pm_FATsecs);
 		pmp->pm_fsinfo = getushort(b710->bpbFSInfo) * pmp->pm_BlkPerSec;
 	} else {
 		pmp->pm_rootdirblk = pmp->pm_fatblk +
 			(pmp->pm_FATs * pmp->pm_FATsecs);
 		pmp->pm_rootdirsize = (pmp->pm_RootDirEnts * sizeof(struct direntry)
 				       + DEV_BSIZE - 1)
 			/ DEV_BSIZE; /* in blocks */
 		pmp->pm_firstcluster = pmp->pm_rootdirblk + pmp->pm_rootdirsize;
 	}
 
 	pmp->pm_maxcluster = (pmp->pm_HugeSectors - pmp->pm_firstcluster) /
 	    SecPerClust + 1;
 	pmp->pm_fatsize = pmp->pm_FATsecs * DEV_BSIZE;	/* XXX not used? */
 
 	if (pmp->pm_fatmask == 0) {
 		if (pmp->pm_maxcluster
 		    <= ((CLUST_RSRVD - CLUST_FIRST) & FAT12_MASK)) {
 			/*
 			 * This will usually be a floppy disk. This size makes
 			 * sure that one fat entry will not be split across
 			 * multiple blocks.
 			 */
 			pmp->pm_fatmask = FAT12_MASK;
 			pmp->pm_fatmult = 3;
 			pmp->pm_fatdiv = 2;
 		} else {
 			pmp->pm_fatmask = FAT16_MASK;
 			pmp->pm_fatmult = 2;
 			pmp->pm_fatdiv = 1;
 		}
 	}
 
 	clusters = (pmp->pm_fatsize / pmp->pm_fatmult) * pmp->pm_fatdiv;
 	if (pmp->pm_maxcluster >= clusters) {
 #ifdef MSDOSFS_DEBUG
 		printf("Warning: number of clusters (%ld) exceeds FAT "
 		    "capacity (%ld)\n", pmp->pm_maxcluster + 1, clusters);
 #endif
 		pmp->pm_maxcluster = clusters - 1;
 	}
 
 	if (FAT12(pmp))
 		pmp->pm_fatblocksize = 3 * 512;
 	else
 		pmp->pm_fatblocksize = PAGE_SIZE;
 	pmp->pm_fatblocksize = roundup(pmp->pm_fatblocksize,
 	    pmp->pm_BytesPerSec);
 	pmp->pm_fatblocksec = pmp->pm_fatblocksize / DEV_BSIZE;
 	pmp->pm_bnshift = ffs(DEV_BSIZE) - 1;
 
 	/*
 	 * Compute mask and shift value for isolating cluster relative byte
 	 * offsets and cluster numbers from a file offset.
 	 */
 	pmp->pm_bpcluster = SecPerClust * DEV_BSIZE;
 	pmp->pm_crbomask = pmp->pm_bpcluster - 1;
 	pmp->pm_cnshift = ffs(pmp->pm_bpcluster) - 1;
 
 	/*
 	 * Check for valid cluster size
 	 * must be a power of 2
 	 */
 	if (pmp->pm_bpcluster ^ (1 << pmp->pm_cnshift)) {
 		error = EINVAL;
 		goto error_exit;
 	}
 
 	/*
 	 * Release the bootsector buffer.
 	 */
 	brelse(bp);
 	bp = NULL;
 
 	/*
 	 * Check the fsinfo sector if we have one.  Silently fix up our
 	 * in-core copy of fp->fsinxtfree if it is unknown (0xffffffff)
 	 * or too large.  Ignore fp->fsinfree for now, since we need to
 	 * read the entire FAT anyway to fill the inuse map.
 	 */
 	if (pmp->pm_fsinfo) {
 		struct fsinfo *fp;
 
 		if ((error = bread(devvp, pmp->pm_fsinfo, pmp->pm_BytesPerSec,
 		    NOCRED, &bp)) != 0)
 			goto error_exit;
 		fp = (struct fsinfo *)bp->b_data;
 		if (!bcmp(fp->fsisig1, "RRaA", 4)
 		    && !bcmp(fp->fsisig2, "rrAa", 4)
 		    && !bcmp(fp->fsisig3, "\0\0\125\252", 4)) {
 			pmp->pm_nxtfree = getulong(fp->fsinxtfree);
 			if (pmp->pm_nxtfree > pmp->pm_maxcluster)
 				pmp->pm_nxtfree = CLUST_FIRST;
 		} else
 			pmp->pm_fsinfo = 0;
 		brelse(bp);
 		bp = NULL;
 	}
 
 	/*
 	 * Finish initializing pmp->pm_nxtfree (just in case the first few
 	 * sectors aren't properly reserved in the FAT).  This completes
 	 * the fixup for fp->fsinxtfree, and fixes up the zero-initialized
 	 * value if there is no fsinfo.  We will use pmp->pm_nxtfree
 	 * internally even if there is no fsinfo.
 	 */
 	if (pmp->pm_nxtfree < CLUST_FIRST)
 		pmp->pm_nxtfree = CLUST_FIRST;
 
 	/*
 	 * Allocate memory for the bitmap of allocated clusters, and then
 	 * fill it in.
 	 */
 	pmp->pm_inusemap = malloc(howmany(pmp->pm_maxcluster + 1, N_INUSEBITS)
 				  * sizeof(*pmp->pm_inusemap),
 				  M_MSDOSFSFAT, M_WAITOK);
 
 	/*
 	 * fillinusemap() needs pm_devvp.
 	 */
 	pmp->pm_devvp = devvp;
 	pmp->pm_dev = dev;
 
 	/*
 	 * Have the inuse map filled in.
 	 */
 	MSDOSFS_LOCK_MP(pmp);
 	error = fillinusemap(pmp);
 	MSDOSFS_UNLOCK_MP(pmp);
 	if (error != 0)
 		goto error_exit;
 
 	/*
 	 * If they want fat updates to be synchronous then let them suffer
 	 * the performance degradation in exchange for the on disk copy of
 	 * the fat being correct just about all the time.  I suppose this
 	 * would be a good thing to turn on if the kernel is still flakey.
 	 */
 	if (mp->mnt_flag & MNT_SYNCHRONOUS)
 		pmp->pm_flags |= MSDOSFSMNT_WAITONFAT;
 
 	/*
 	 * Finish up.
 	 */
 	if (ronly)
 		pmp->pm_flags |= MSDOSFSMNT_RONLY;
 	else {
 		if ((error = markvoldirty(pmp, 1)) != 0) {
 			(void)markvoldirty(pmp, 0);
 			goto error_exit;
 		}
 		pmp->pm_fmod = 1;
 	}
 	mp->mnt_data =  pmp;
 	mp->mnt_stat.f_fsid.val[0] = dev2udev(dev);
 	mp->mnt_stat.f_fsid.val[1] = mp->mnt_vfc->vfc_typenum;
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= MNT_LOCAL;
+	mp->mnt_kern_flag |= MNTK_USES_BCACHE;
 	MNT_IUNLOCK(mp);
 
 	if (pmp->pm_flags & MSDOSFS_LARGEFS)
 		msdosfs_fileno_init(mp);
 
 	return 0;
 
 error_exit:
 	if (bp)
 		brelse(bp);
 	if (cp != NULL) {
 		DROP_GIANT();
 		g_topology_lock();
 		g_vfs_close(cp);
 		g_topology_unlock();
 		PICKUP_GIANT();
 	}
 	if (pmp) {
 		lockdestroy(&pmp->pm_fatlock);
 		if (pmp->pm_inusemap)
 			free(pmp->pm_inusemap, M_MSDOSFSFAT);
 		free(pmp, M_MSDOSFSMNT);
 		mp->mnt_data = NULL;
 	}
 	dev_rel(dev);
 	return (error);
 }
 
 /*
  * Unmount the filesystem described by mp.
  */
 static int
 msdosfs_unmount(struct mount *mp, int mntflags)
 {
 	struct msdosfsmount *pmp;
 	int error, flags;
 
 	error = flags = 0;
 	pmp = VFSTOMSDOSFS(mp);
 	if ((pmp->pm_flags & MSDOSFSMNT_RONLY) == 0)
 		error = msdosfs_sync(mp, MNT_WAIT);
 	if ((mntflags & MNT_FORCE) != 0)
 		flags |= FORCECLOSE;
 	else if (error != 0)
 		return (error);
 	error = vflush(mp, 0, flags, curthread);
 	if (error != 0 && error != ENXIO)
 		return (error);
 	if ((pmp->pm_flags & MSDOSFSMNT_RONLY) == 0) {
 		error = markvoldirty(pmp, 0);
 		if (error && error != ENXIO) {
 			(void)markvoldirty(pmp, 1);
 			return (error);
 		}
 	}
 	if (pmp->pm_flags & MSDOSFSMNT_KICONV && msdosfs_iconv) {
 		if (pmp->pm_w2u)
 			msdosfs_iconv->close(pmp->pm_w2u);
 		if (pmp->pm_u2w)
 			msdosfs_iconv->close(pmp->pm_u2w);
 		if (pmp->pm_d2u)
 			msdosfs_iconv->close(pmp->pm_d2u);
 		if (pmp->pm_u2d)
 			msdosfs_iconv->close(pmp->pm_u2d);
 	}
 
 #ifdef MSDOSFS_DEBUG
 	{
 		struct vnode *vp = pmp->pm_devvp;
 		struct bufobj *bo;
 
 		bo = &vp->v_bufobj;
 		BO_LOCK(bo);
 		VI_LOCK(vp);
 		vn_printf(vp,
 		    "msdosfs_umount(): just before calling VOP_CLOSE()\n");
 		printf("freef %p, freeb %p, mount %p\n",
 		    TAILQ_NEXT(vp, v_actfreelist), vp->v_actfreelist.tqe_prev,
 		    vp->v_mount);
 		printf("cleanblkhd %p, dirtyblkhd %p, numoutput %ld, type %d\n",
 		    TAILQ_FIRST(&vp->v_bufobj.bo_clean.bv_hd),
 		    TAILQ_FIRST(&vp->v_bufobj.bo_dirty.bv_hd),
 		    vp->v_bufobj.bo_numoutput, vp->v_type);
 		VI_UNLOCK(vp);
 		BO_UNLOCK(bo);
 	}
 #endif
 	DROP_GIANT();
 	if (pmp->pm_devvp->v_type == VCHR && pmp->pm_devvp->v_rdev != NULL)
 		pmp->pm_devvp->v_rdev->si_mountpt = NULL;
 	g_topology_lock();
 	g_vfs_close(pmp->pm_cp);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	vrele(pmp->pm_devvp);
 	dev_rel(pmp->pm_dev);
 	free(pmp->pm_inusemap, M_MSDOSFSFAT);
 	if (pmp->pm_flags & MSDOSFS_LARGEFS)
 		msdosfs_fileno_free(mp);
 	lockdestroy(&pmp->pm_fatlock);
 	free(pmp, M_MSDOSFSMNT);
 	mp->mnt_data = NULL;
 	MNT_ILOCK(mp);
 	mp->mnt_flag &= ~MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 	return (error);
 }
 
 static int
 msdosfs_root(struct mount *mp, int flags, struct vnode **vpp)
 {
 	struct msdosfsmount *pmp = VFSTOMSDOSFS(mp);
 	struct denode *ndep;
 	int error;
 
 #ifdef MSDOSFS_DEBUG
 	printf("msdosfs_root(); mp %p, pmp %p\n", mp, pmp);
 #endif
 	error = deget(pmp, MSDOSFSROOT, MSDOSFSROOT_OFS, &ndep);
 	if (error)
 		return (error);
 	*vpp = DETOV(ndep);
 	return (0);
 }
 
 static int
 msdosfs_statfs(struct mount *mp, struct statfs *sbp)
 {
 	struct msdosfsmount *pmp;
 
 	pmp = VFSTOMSDOSFS(mp);
 	sbp->f_bsize = pmp->pm_bpcluster;
 	sbp->f_iosize = pmp->pm_bpcluster;
 	sbp->f_blocks = pmp->pm_maxcluster + 1;
 	sbp->f_bfree = pmp->pm_freeclustercount;
 	sbp->f_bavail = pmp->pm_freeclustercount;
 	sbp->f_files = pmp->pm_RootDirEnts;	/* XXX */
 	sbp->f_ffree = 0;	/* what to put in here? */
 	return (0);
 }
 
 /*
  * If we have an FSInfo block, update it.
  */
 static int
 msdosfs_fsiflush(struct msdosfsmount *pmp, int waitfor)
 {
 	struct fsinfo *fp;
 	struct buf *bp;
 	int error;
 
 	MSDOSFS_LOCK_MP(pmp);
 	if (pmp->pm_fsinfo == 0 || (pmp->pm_flags & MSDOSFS_FSIMOD) == 0) {
 		error = 0;
 		goto unlock;
 	}
 	error = bread(pmp->pm_devvp, pmp->pm_fsinfo, pmp->pm_BytesPerSec,
 	    NOCRED, &bp);
 	if (error != 0) {
 		brelse(bp);
 		goto unlock;
 	}
 	fp = (struct fsinfo *)bp->b_data;
 	putulong(fp->fsinfree, pmp->pm_freeclustercount);
 	putulong(fp->fsinxtfree, pmp->pm_nxtfree);
 	pmp->pm_flags &= ~MSDOSFS_FSIMOD;
 	if (waitfor == MNT_WAIT)
 		error = bwrite(bp);
 	else
 		bawrite(bp);
 unlock:
 	MSDOSFS_UNLOCK_MP(pmp);
 	return (error);
 }
 
 static int
 msdosfs_sync(struct mount *mp, int waitfor)
 {
 	struct vnode *vp, *nvp;
 	struct thread *td;
 	struct denode *dep;
 	struct msdosfsmount *pmp = VFSTOMSDOSFS(mp);
 	int error, allerror = 0;
 
 	td = curthread;
 
 	/*
 	 * If we ever switch to not updating all of the fats all the time,
 	 * this would be the place to update them from the first one.
 	 */
 	if (pmp->pm_fmod != 0) {
 		if (pmp->pm_flags & MSDOSFSMNT_RONLY)
 			panic("msdosfs_sync: rofs mod");
 		else {
 			/* update fats here */
 		}
 	}
 	/*
 	 * Write back each (modified) denode.
 	 */
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, nvp) {
 		if (vp->v_type == VNON) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		dep = VTODE(vp);
 		if ((dep->de_flag &
 		    (DE_ACCESS | DE_CREATE | DE_UPDATE | DE_MODIFIED)) == 0 &&
 		    (vp->v_bufobj.bo_dirty.bv_cnt == 0 ||
 		    waitfor == MNT_LAZY)) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, td);
 		if (error) {
 			if (error == ENOENT)
 				goto loop;
 			continue;
 		}
 		error = VOP_FSYNC(vp, waitfor, td);
 		if (error)
 			allerror = error;
 		VOP_UNLOCK(vp, 0);
 		vrele(vp);
 	}
 
 	/*
 	 * Flush filesystem control info.
 	 */
 	if (waitfor != MNT_LAZY) {
 		vn_lock(pmp->pm_devvp, LK_EXCLUSIVE | LK_RETRY);
 		error = VOP_FSYNC(pmp->pm_devvp, waitfor, td);
 		if (error)
 			allerror = error;
 		VOP_UNLOCK(pmp->pm_devvp, 0);
 	}
 
 	error = msdosfs_fsiflush(pmp, waitfor);
 	if (error != 0)
 		allerror = error;
 	return (allerror);
 }
 
 static int
 msdosfs_fhtovp(struct mount *mp, struct fid *fhp, int flags, struct vnode **vpp)
 {
 	struct msdosfsmount *pmp = VFSTOMSDOSFS(mp);
 	struct defid *defhp = (struct defid *) fhp;
 	struct denode *dep;
 	int error;
 
 	error = deget(pmp, defhp->defid_dirclust, defhp->defid_dirofs, &dep);
 	if (error) {
 		*vpp = NULLVP;
 		return (error);
 	}
 	*vpp = DETOV(dep);
 	vnode_create_vobject(*vpp, dep->de_FileSize, curthread);
 	return (0);
 }
 
 static struct vfsops msdosfs_vfsops = {
 	.vfs_fhtovp =		msdosfs_fhtovp,
 	.vfs_mount =		msdosfs_mount,
 	.vfs_cmount =		msdosfs_cmount,
 	.vfs_root =		msdosfs_root,
 	.vfs_statfs =		msdosfs_statfs,
 	.vfs_sync =		msdosfs_sync,
 	.vfs_unmount =		msdosfs_unmount,
 };
 
 VFS_SET(msdosfs_vfsops, msdosfs, 0);
 MODULE_VERSION(msdosfs, 1);
Index: user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/nandfs/nandfs_vfsops.c	(revision 281585)
@@ -1,1597 +1,1598 @@
 /*-
  * Copyright (c) 2010-2012 Semihalf
  * Copyright (c) 2008, 2009 Reinoud Zandijk
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * From: NetBSD: nilfs_vfsops.c,v 1.1 2009/07/18 16:31:42 reinoud Exp
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/fcntl.h>
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/namei.h>
 #include <sys/proc.h>
 #include <sys/priv.h>
 #include <sys/vnode.h>
 #include <sys/buf.h>
 #include <sys/sysctl.h>
 #include <sys/libkern.h>
 
 #include <geom/geom.h>
 #include <geom/geom_vfs.h>
 
 #include <machine/_inttypes.h>
 
 #include <fs/nandfs/nandfs_mount.h>
 #include <fs/nandfs/nandfs.h>
 #include <fs/nandfs/nandfs_subr.h>
 
 static MALLOC_DEFINE(M_NANDFSMNT, "nandfs_mount", "NANDFS mount structure");
 
 #define	NANDFS_SET_SYSTEMFILE(vp) {	\
 	(vp)->v_vflag |= VV_SYSTEM;	\
 	vref(vp);			\
 	vput(vp); }
 
 #define	NANDFS_UNSET_SYSTEMFILE(vp) {	\
 	VOP_LOCK(vp, LK_EXCLUSIVE);	\
 	MPASS(vp->v_bufobj.bo_dirty.bv_cnt == 0); \
 	(vp)->v_vflag &= ~VV_SYSTEM;	\
 	vgone(vp);			\
 	vput(vp); }
 
 /* Globals */
 struct _nandfs_devices nandfs_devices;
 
 /* Parameters */
 int nandfs_verbose = 0;
 
 static void
 nandfs_tunable_init(void *arg)
 {
 
 	TUNABLE_INT_FETCH("vfs.nandfs.verbose", &nandfs_verbose);
 }
 SYSINIT(nandfs_tunables, SI_SUB_VFS, SI_ORDER_ANY, nandfs_tunable_init, NULL);
 
 static SYSCTL_NODE(_vfs, OID_AUTO, nandfs, CTLFLAG_RD, 0, "NAND filesystem");
 static SYSCTL_NODE(_vfs_nandfs, OID_AUTO, mount, CTLFLAG_RD, 0,
     "NANDFS mountpoints");
 SYSCTL_INT(_vfs_nandfs, OID_AUTO, verbose, CTLFLAG_RW, &nandfs_verbose, 0, "");
 
 #define NANDFS_CONSTR_INTERVAL	5
 int nandfs_sync_interval = NANDFS_CONSTR_INTERVAL; /* sync every 5 seconds */
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, sync_interval, CTLFLAG_RW,
     &nandfs_sync_interval, 0, "");
 
 #define NANDFS_MAX_DIRTY_SEGS	5
 int nandfs_max_dirty_segs = NANDFS_MAX_DIRTY_SEGS; /* sync when 5 dirty segs */
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, max_dirty_segs, CTLFLAG_RW,
     &nandfs_max_dirty_segs, 0, "");
 
 #define NANDFS_CPS_BETWEEN_SBLOCKS 5
 int nandfs_cps_between_sblocks = NANDFS_CPS_BETWEEN_SBLOCKS; /* write superblock every 5 checkpoints */
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cps_between_sblocks, CTLFLAG_RW,
     &nandfs_cps_between_sblocks, 0, "");
 
 #define NANDFS_CLEANER_ENABLE 1
 int nandfs_cleaner_enable = NANDFS_CLEANER_ENABLE;
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_enable, CTLFLAG_RW,
     &nandfs_cleaner_enable, 0, "");
 
 #define NANDFS_CLEANER_INTERVAL 5
 int nandfs_cleaner_interval = NANDFS_CLEANER_INTERVAL;
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_interval, CTLFLAG_RW,
     &nandfs_cleaner_interval, 0, "");
 
 #define NANDFS_CLEANER_SEGMENTS 5
 int nandfs_cleaner_segments = NANDFS_CLEANER_SEGMENTS;
 SYSCTL_UINT(_vfs_nandfs, OID_AUTO, cleaner_segments, CTLFLAG_RW,
     &nandfs_cleaner_segments, 0, "");
 
 static int nandfs_mountfs(struct vnode *devvp, struct mount *mp);
 static vfs_mount_t	nandfs_mount;
 static vfs_root_t	nandfs_root;
 static vfs_statfs_t	nandfs_statfs;
 static vfs_unmount_t	nandfs_unmount;
 static vfs_vget_t	nandfs_vget;
 static vfs_sync_t	nandfs_sync;
 static const char *nandfs_opts[] = {
 	"snap", "from", "noatime", NULL
 };
 
 /* System nodes */
 static int
 nandfs_create_system_nodes(struct nandfs_device *nandfsdev)
 {
 	int error;
 
 	error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_DAT_INO,
 	    &nandfsdev->nd_super_root.sr_dat, &nandfsdev->nd_dat_node);
 	if (error)
 		goto errorout;
 
 	error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_CPFILE_INO,
 	    &nandfsdev->nd_super_root.sr_cpfile, &nandfsdev->nd_cp_node);
 	if (error)
 		goto errorout;
 
 	error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_SUFILE_INO,
 	    &nandfsdev->nd_super_root.sr_sufile, &nandfsdev->nd_su_node);
 	if (error)
 		goto errorout;
 
 	error = nandfs_get_node_raw(nandfsdev, NULL, NANDFS_GC_INO,
 	    NULL, &nandfsdev->nd_gc_node);
 	if (error)
 		goto errorout;
 
 	NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_dat_node));
 	NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_cp_node));
 	NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_su_node));
 	NANDFS_SET_SYSTEMFILE(NTOV(nandfsdev->nd_gc_node));
 
 	DPRINTF(VOLUMES, ("System vnodes: dat: %p cp: %p su: %p\n",
 	    NTOV(nandfsdev->nd_dat_node), NTOV(nandfsdev->nd_cp_node),
 	    NTOV(nandfsdev->nd_su_node)));
 	return (0);
 
 errorout:
 	nandfs_dispose_node(&nandfsdev->nd_gc_node);
 	nandfs_dispose_node(&nandfsdev->nd_dat_node);
 	nandfs_dispose_node(&nandfsdev->nd_cp_node);
 	nandfs_dispose_node(&nandfsdev->nd_su_node);
 
 	return (error);
 }
 
 static void
 nandfs_release_system_nodes(struct nandfs_device *nandfsdev)
 {
 
 	if (!nandfsdev)
 		return;
 	if (nandfsdev->nd_refcnt > 0)
 		return;
 
 	if (nandfsdev->nd_gc_node)
 		NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_gc_node));
 	if (nandfsdev->nd_dat_node)
 		NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_dat_node));
 	if (nandfsdev->nd_cp_node)
 		NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_cp_node));
 	if (nandfsdev->nd_su_node)
 		NANDFS_UNSET_SYSTEMFILE(NTOV(nandfsdev->nd_su_node));
 }
 
 static int
 nandfs_check_fsdata_crc(struct nandfs_fsdata *fsdata)
 {
 	uint32_t fsdata_crc, comp_crc;
 
 	if (fsdata->f_magic != NANDFS_FSDATA_MAGIC)
 		return (0);
 
 	/* Preserve CRC */
 	fsdata_crc = fsdata->f_sum;
 
 	/* Calculate */
 	fsdata->f_sum = (0);
 	comp_crc = crc32((uint8_t *)fsdata, fsdata->f_bytes);
 
 	/* Restore */
 	fsdata->f_sum = fsdata_crc;
 
 	/* Check CRC */
 	return (fsdata_crc == comp_crc);
 }
 
 static int
 nandfs_check_superblock_crc(struct nandfs_fsdata *fsdata,
     struct nandfs_super_block *super)
 {
 	uint32_t super_crc, comp_crc;
 
 	/* Check super block magic */
 	if (super->s_magic != NANDFS_SUPER_MAGIC)
 		return (0);
 
 	/* Preserve CRC */
 	super_crc = super->s_sum;
 
 	/* Calculate */
 	super->s_sum = (0);
 	comp_crc = crc32((uint8_t *)super, fsdata->f_sbbytes);
 
 	/* Restore */
 	super->s_sum = super_crc;
 
 	/* Check CRC */
 	return (super_crc == comp_crc);
 }
 
 static void
 nandfs_calc_superblock_crc(struct nandfs_fsdata *fsdata,
     struct nandfs_super_block *super)
 {
 	uint32_t comp_crc;
 
 	/* Calculate */
 	super->s_sum = 0;
 	comp_crc = crc32((uint8_t *)super, fsdata->f_sbbytes);
 
 	/* Restore */
 	super->s_sum = comp_crc;
 }
 
 static int
 nandfs_is_empty(u_char *area, int size)
 {
 	int i;
 
 	for (i = 0; i < size; i++)
 		if (area[i] != 0xff)
 			return (0);
 
 	return (1);
 }
 
 static __inline int
 nandfs_sblocks_in_esize(struct nandfs_device *fsdev)
 {
 
 	return ((fsdev->nd_erasesize - NANDFS_SBLOCK_OFFSET_BYTES) /
 	    sizeof(struct nandfs_super_block));
 }
 
 static __inline int
 nandfs_max_sblocks(struct nandfs_device *fsdev)
 {
 
 	return (NANDFS_NFSAREAS * nandfs_sblocks_in_esize(fsdev));
 }
 
 static __inline int
 nandfs_sblocks_in_block(struct nandfs_device *fsdev)
 {
 
 	return (fsdev->nd_devblocksize / sizeof(struct nandfs_super_block));
 }
 
 #if 0
 static __inline int
 nandfs_sblocks_in_first_block(struct nandfs_device *fsdev)
 {
 	int n;
 
 	n = nandfs_sblocks_in_block(fsdev) -
 	    NANDFS_SBLOCK_OFFSET_BYTES / sizeof(struct nandfs_super_block);
 	if (n < 0)
 		n = 0;
 
 	return (n);
 }
 #endif
 
 static int
 nandfs_write_superblock_at(struct nandfs_device *fsdev,
     struct nandfs_fsarea *fstp)
 {
 	struct nandfs_super_block *super, *supert;
 	struct buf *bp;
 	int sb_per_sector, sbs_in_fsd, read_block;
 	int index, pos, error;
 	off_t offset;
 
 	DPRINTF(SYNC, ("%s: last_used %d nandfs_sblocks_in_esize %d\n",
 	    __func__, fstp->last_used, nandfs_sblocks_in_esize(fsdev)));
 	if (fstp->last_used == nandfs_sblocks_in_esize(fsdev) - 1)
 		index = 0;
 	else
 		index = fstp->last_used + 1;
 
 	super = &fsdev->nd_super;
 	supert = NULL;
 
 	sb_per_sector = nandfs_sblocks_in_block(fsdev);
 	sbs_in_fsd = sizeof(struct nandfs_fsdata) /
 	    sizeof(struct nandfs_super_block);
 	index += sbs_in_fsd;
 	offset = fstp->offset;
 
 	DPRINTF(SYNC, ("%s: offset %#jx s_last_pseg %#jx s_last_cno %#jx "
 	    "s_last_seq %#jx wtime %jd index %d\n", __func__, offset,
 	    super->s_last_pseg, super->s_last_cno, super->s_last_seq,
 	    super->s_wtime, index));
 
 	read_block = btodb(offset + ((index / sb_per_sector) * sb_per_sector)
 	    * sizeof(struct nandfs_super_block));
 
 	DPRINTF(SYNC, ("%s: read_block %#x\n", __func__, read_block));
 
 	if (index == sbs_in_fsd) {
 		error = nandfs_erase(fsdev, offset, fsdev->nd_erasesize);
 		if (error)
 			return (error);
 
 		error = bread(fsdev->nd_devvp, btodb(offset),
 		    fsdev->nd_devblocksize, NOCRED, &bp);
 		if (error) {
 			printf("NANDFS: couldn't read initial data: %d\n",
 			    error);
 			brelse(bp);
 			return (error);
 		}
 		memcpy(bp->b_data, &fsdev->nd_fsdata, sizeof(fsdev->nd_fsdata));
 		/*
 		 * 0xff-out the rest. This bp could be cached, so potentially
 		 * b_data contains stale super blocks.
 		 *
 		 * We don't mind cached bp since most of the time we just add
 		 * super blocks to already 0xff-out b_data and don't need to
 		 * perform actual read.
 		 */
 		if (fsdev->nd_devblocksize > sizeof(fsdev->nd_fsdata))
 			memset(bp->b_data + sizeof(fsdev->nd_fsdata), 0xff,
 			    fsdev->nd_devblocksize - sizeof(fsdev->nd_fsdata));
 		error = bwrite(bp);
 		if (error) {
 			printf("NANDFS: cannot rewrite initial data at %jx\n",
 			    offset);
 			return (error);
 		}
 	}
 
 	error = bread(fsdev->nd_devvp, read_block, fsdev->nd_devblocksize,
 	    NOCRED, &bp);
 	if (error) {
 		brelse(bp);
 		return (error);
 	}
 
 	supert = (struct nandfs_super_block *)(bp->b_data);
 	pos = index % sb_per_sector;
 
 	DPRINTF(SYNC, ("%s: storing at %d\n", __func__, pos));
 	memcpy(&supert[pos], super, sizeof(struct nandfs_super_block));
 
 	/*
 	 * See comment above in code that performs erase.
 	 */
 	if (pos == 0)
 		memset(&supert[1], 0xff,
 		    (sb_per_sector - 1) * sizeof(struct nandfs_super_block));
 
 	error = bwrite(bp);
 	if (error) {
 		printf("NANDFS: cannot update superblock at %jx\n", offset);
 		return (error);
 	}
 
 	DPRINTF(SYNC, ("%s: fstp->last_used %d -> %d\n", __func__,
 	    fstp->last_used, index - sbs_in_fsd));
 	fstp->last_used = index - sbs_in_fsd;
 
 	return (0);
 }
 
 int
 nandfs_write_superblock(struct nandfs_device *fsdev)
 {
 	struct nandfs_super_block *super;
 	struct timespec ts;
 	int error;
 	int i, j;
 
 	vfs_timestamp(&ts);
 
 	super = &fsdev->nd_super;
 
 	super->s_last_pseg = fsdev->nd_last_pseg;
 	super->s_last_cno = fsdev->nd_last_cno;
 	super->s_last_seq = fsdev->nd_seg_sequence;
 	super->s_wtime = ts.tv_sec;
 
 	nandfs_calc_superblock_crc(&fsdev->nd_fsdata, super);
 
 	error = 0;
 	for (i = 0, j = fsdev->nd_last_fsarea; i < NANDFS_NFSAREAS;
 	    i++, j = (j + 1) % NANDFS_NFSAREAS) {
 		if (fsdev->nd_fsarea[j].flags & NANDFS_FSSTOR_FAILED) {
 			DPRINTF(SYNC, ("%s: skipping %d\n", __func__, j));
 			continue;
 		}
 		error = nandfs_write_superblock_at(fsdev, &fsdev->nd_fsarea[j]);
 		if (error) {
 			printf("NANDFS: writing superblock at offset %d failed:"
 			    "%d\n", j * fsdev->nd_erasesize, error);
 			fsdev->nd_fsarea[j].flags |= NANDFS_FSSTOR_FAILED;
 		} else
 			break;
 	}
 
 	if (i == NANDFS_NFSAREAS) {
 		printf("NANDFS: superblock was not written\n");
 		/*
 		 * TODO: switch to read-only?
 		 */
 		return (error);
 	} else
 		fsdev->nd_last_fsarea = (j + 1) % NANDFS_NFSAREAS;
 
 	return (0);
 }
 
 static int
 nandfs_select_fsdata(struct nandfs_device *fsdev,
     struct nandfs_fsdata *fsdatat, struct nandfs_fsdata **fsdata, int nfsds)
 {
 	int i;
 
 	*fsdata = NULL;
 	for (i = 0; i < nfsds; i++) {
 		DPRINTF(VOLUMES, ("%s: i %d f_magic %x f_crc %x\n", __func__,
 		    i, fsdatat[i].f_magic, fsdatat[i].f_sum));
 		if (!nandfs_check_fsdata_crc(&fsdatat[i]))
 			continue;
 		*fsdata = &fsdatat[i];
 		break;
 	}
 
 	return (*fsdata != NULL ? 0 : EINVAL);
 }
 
 static int
 nandfs_select_sb(struct nandfs_device *fsdev,
     struct nandfs_super_block *supert, struct nandfs_super_block **super,
     int nsbs)
 {
 	int i;
 
 	*super = NULL;
 	for (i = 0; i < nsbs; i++) {
 		if (!nandfs_check_superblock_crc(&fsdev->nd_fsdata, &supert[i]))
 			continue;
 		DPRINTF(SYNC, ("%s: i %d s_last_cno %jx s_magic %x "
 		    "s_wtime %jd\n", __func__, i, supert[i].s_last_cno,
 		    supert[i].s_magic, supert[i].s_wtime));
 		if (*super == NULL || supert[i].s_last_cno >
 		    (*super)->s_last_cno)
 			*super = &supert[i];
 	}
 
 	return (*super != NULL ? 0 : EINVAL);
 }
 
 static int
 nandfs_read_structures_at(struct nandfs_device *fsdev,
     struct nandfs_fsarea *fstp, struct nandfs_fsdata *fsdata,
     struct nandfs_super_block *super)
 {
 	struct nandfs_super_block *tsuper, *tsuperd;
 	struct buf *bp;
 	int error, read_size;
 	int i;
 	int offset;
 
 	offset = fstp->offset;
 
 	if (fsdev->nd_erasesize > MAXBSIZE)
 		read_size = MAXBSIZE;
 	else
 		read_size = fsdev->nd_erasesize;
 
 	error = bread(fsdev->nd_devvp, btodb(offset), read_size, NOCRED, &bp);
 	if (error) {
 		printf("couldn't read: %d\n", error);
 		brelse(bp);
 		fstp->flags |= NANDFS_FSSTOR_FAILED;
 		return (error);
 	}
 
 	tsuper = super;
 
 	memcpy(fsdata, bp->b_data, sizeof(struct nandfs_fsdata));
 	memcpy(tsuper, (bp->b_data + sizeof(struct nandfs_fsdata)),
 	    read_size - sizeof(struct nandfs_fsdata));
 	brelse(bp);
 
 	tsuper += (read_size - sizeof(struct nandfs_fsdata)) /
 	    sizeof(struct nandfs_super_block);
 
 	for (i = 1; i < fsdev->nd_erasesize / read_size; i++) {
 		error = bread(fsdev->nd_devvp, btodb(offset + i * read_size),
 		    read_size, NOCRED, &bp);
 		if (error) {
 			printf("couldn't read: %d\n", error);
 			brelse(bp);
 			fstp->flags |= NANDFS_FSSTOR_FAILED;
 			return (error);
 		}
 		memcpy(tsuper, bp->b_data, read_size);
 		tsuper += read_size / sizeof(struct nandfs_super_block);
 		brelse(bp);
 	}
 
 	tsuper -= 1;
 	fstp->last_used = nandfs_sblocks_in_esize(fsdev) - 1;
 	for (tsuperd = super - 1; (tsuper != tsuperd); tsuper -= 1) {
 		if (nandfs_is_empty((u_char *)tsuper, sizeof(*tsuper)))
 			fstp->last_used--;
 		else
 			break;
 	}
 
 	DPRINTF(VOLUMES, ("%s: last_used %d\n", __func__, fstp->last_used));
 
 	return (0);
 }
 
 static int
 nandfs_read_structures(struct nandfs_device *fsdev)
 {
 	struct nandfs_fsdata *fsdata, *fsdatat;
 	struct nandfs_super_block *sblocks, *ssblock;
 	int nsbs, nfsds, i;
 	int error = 0;
 	int nrsbs;
 
 	nfsds = NANDFS_NFSAREAS;
 	nsbs = nandfs_max_sblocks(fsdev);
 
 	fsdatat = malloc(sizeof(struct nandfs_fsdata) * nfsds, M_NANDFSTEMP,
 	    M_WAITOK | M_ZERO);
 	sblocks = malloc(sizeof(struct nandfs_super_block) * nsbs, M_NANDFSTEMP,
 	    M_WAITOK | M_ZERO);
 
 	nrsbs = 0;
 	for (i = 0; i < NANDFS_NFSAREAS; i++) {
 		fsdev->nd_fsarea[i].offset = i * fsdev->nd_erasesize;
 		error = nandfs_read_structures_at(fsdev, &fsdev->nd_fsarea[i],
 		    &fsdatat[i], sblocks + nrsbs);
 		if (error)
 			continue;
 		nrsbs += (fsdev->nd_fsarea[i].last_used + 1);
 		if (fsdev->nd_fsarea[fsdev->nd_last_fsarea].last_used >
 		    fsdev->nd_fsarea[i].last_used)
 			fsdev->nd_last_fsarea = i;
 	}
 
 	if (nrsbs == 0) {
 		printf("nandfs: no valid superblocks found\n");
 		error = EINVAL;
 		goto out;
 	}
 
 	error = nandfs_select_fsdata(fsdev, fsdatat, &fsdata, nfsds);
 	if (error)
 		goto out;
 	memcpy(&fsdev->nd_fsdata, fsdata, sizeof(struct nandfs_fsdata));
 
 	error = nandfs_select_sb(fsdev, sblocks, &ssblock, nsbs);
 	if (error)
 		goto out;
 
 	memcpy(&fsdev->nd_super, ssblock, sizeof(struct nandfs_super_block));
 out:
 	free(fsdatat, M_NANDFSTEMP);
 	free(sblocks, M_NANDFSTEMP);
 
 	if (error == 0)
 		DPRINTF(VOLUMES, ("%s: selected sb with w_time %jd "
 		    "last_pseg %#jx\n", __func__, fsdev->nd_super.s_wtime,
 		    fsdev->nd_super.s_last_pseg));
 
 	return (error);
 }
 
 static void
 nandfs_unmount_base(struct nandfs_device *nandfsdev)
 {
 	int error;
 
 	if (!nandfsdev)
 		return;
 
 	/* Remove all our information */
 	error = vinvalbuf(nandfsdev->nd_devvp, V_SAVE, 0, 0);
 	if (error) {
 		/*
 		 * Flushing buffers failed when fs was umounting, can't do
 		 * much now, just printf error and continue with umount.
 		 */
 		nandfs_error("%s(): error:%d when umounting FS\n",
 		    __func__, error);
 	}
 
 	/* Release the device's system nodes */
 	nandfs_release_system_nodes(nandfsdev);
 }
 
 static void
 nandfs_get_ncleanseg(struct nandfs_device *nandfsdev)
 {
 	struct nandfs_seg_stat nss;
 
 	nandfs_get_seg_stat(nandfsdev, &nss);
 	nandfsdev->nd_clean_segs = nss.nss_ncleansegs;
 	DPRINTF(VOLUMES, ("nandfs_mount: clean segs: %jx\n",
 	    (uintmax_t)nandfsdev->nd_clean_segs));
 }
 
 
 static int
 nandfs_mount_base(struct nandfs_device *nandfsdev, struct mount *mp,
     struct nandfs_args *args)
 {
 	uint32_t log_blocksize;
 	int error;
 
 	/* Flush out any old buffers remaining from a previous use. */
 	if ((error = vinvalbuf(nandfsdev->nd_devvp, V_SAVE, 0, 0)))
 		return (error);
 
 	error = nandfs_read_structures(nandfsdev);
 	if (error) {
 		printf("nandfs: could not get valid filesystem structures\n");
 		return (error);
 	}
 
 	if (nandfsdev->nd_fsdata.f_rev_level != NANDFS_CURRENT_REV) {
 		printf("nandfs: unsupported file system revision: %d "
 		    "(supported is %d).\n", nandfsdev->nd_fsdata.f_rev_level,
 		    NANDFS_CURRENT_REV);
 		return (EINVAL);
 	}
 
 	if (nandfsdev->nd_fsdata.f_erasesize != nandfsdev->nd_erasesize) {
 		printf("nandfs: erasesize mismatch (device %#x, fs %#x)\n",
 		    nandfsdev->nd_erasesize, nandfsdev->nd_fsdata.f_erasesize);
 		return (EINVAL);
 	}
 
 	/* Get our blocksize */
 	log_blocksize = nandfsdev->nd_fsdata.f_log_block_size;
 	nandfsdev->nd_blocksize = (uint64_t) 1 << (log_blocksize + 10);
 	DPRINTF(VOLUMES, ("%s: blocksize:%x\n", __func__,
 	    nandfsdev->nd_blocksize));
 
 	DPRINTF(VOLUMES, ("%s: accepted super block with cp %#jx\n", __func__,
 	    (uintmax_t)nandfsdev->nd_super.s_last_cno));
 
 	/* Calculate dat structure parameters */
 	nandfs_calc_mdt_consts(nandfsdev, &nandfsdev->nd_dat_mdt,
 	    nandfsdev->nd_fsdata.f_dat_entry_size);
 	nandfs_calc_mdt_consts(nandfsdev, &nandfsdev->nd_ifile_mdt,
 	    nandfsdev->nd_fsdata.f_inode_size);
 
 	/* Search for the super root and roll forward when needed */
 	if (nandfs_search_super_root(nandfsdev)) {
 		printf("Cannot find valid SuperRoot\n");
 		return (EINVAL);
 	}
 
 	nandfsdev->nd_mount_state = nandfsdev->nd_super.s_state;
 	if (nandfsdev->nd_mount_state != NANDFS_VALID_FS) {
 		printf("FS is seriously damaged, needs repairing\n");
 		printf("aborting mount\n");
 		return (EINVAL);
 	}
 
 	/*
 	 * FS should be ok now. The superblock and the last segsum could be
 	 * updated from the repair so extract running values again.
 	 */
 	nandfsdev->nd_last_pseg = nandfsdev->nd_super.s_last_pseg;
 	nandfsdev->nd_seg_sequence = nandfsdev->nd_super.s_last_seq;
 	nandfsdev->nd_seg_num = nandfs_get_segnum_of_block(nandfsdev,
 	    nandfsdev->nd_last_pseg);
 	nandfsdev->nd_next_seg_num = nandfs_get_segnum_of_block(nandfsdev,
 	    nandfsdev->nd_last_segsum.ss_next);
 	nandfsdev->nd_ts.tv_sec = nandfsdev->nd_last_segsum.ss_create;
 	nandfsdev->nd_last_cno = nandfsdev->nd_super.s_last_cno;
 	nandfsdev->nd_fakevblk = 1;
 	/*
 	 * FIXME: bogus calculation. Should use actual number of usable segments
 	 * instead of total amount.
 	 */
 	nandfsdev->nd_segs_reserved =
 	    nandfsdev->nd_fsdata.f_nsegments *
 	    nandfsdev->nd_fsdata.f_r_segments_percentage / 100;
 	nandfsdev->nd_last_ino  = NANDFS_USER_INO;
 	DPRINTF(VOLUMES, ("%s: last_pseg %#jx last_cno %#jx last_seq %#jx\n"
 	    "fsdev: last_seg: seq %#jx num %#jx, next_seg_num %#jx "
 	    "segs_reserved %#jx\n",
 	    __func__, (uintmax_t)nandfsdev->nd_last_pseg,
 	    (uintmax_t)nandfsdev->nd_last_cno,
 	    (uintmax_t)nandfsdev->nd_seg_sequence,
 	    (uintmax_t)nandfsdev->nd_seg_sequence,
 	    (uintmax_t)nandfsdev->nd_seg_num,
 	    (uintmax_t)nandfsdev->nd_next_seg_num,
 	    (uintmax_t)nandfsdev->nd_segs_reserved));
 
 	DPRINTF(VOLUMES, ("nandfs_mount: accepted super root\n"));
 
 	/* Create system vnodes for DAT, CP and SEGSUM */
 	error = nandfs_create_system_nodes(nandfsdev);
 	if (error)
 		nandfs_unmount_base(nandfsdev);
 
 	nandfs_get_ncleanseg(nandfsdev);
 
 	return (error);
 }
 
 static void
 nandfs_unmount_device(struct nandfs_device *nandfsdev)
 {
 
 	/* Is there anything? */
 	if (nandfsdev == NULL)
 		return;
 
 	/* Remove the device only if we're the last reference */
 	nandfsdev->nd_refcnt--;
 	if (nandfsdev->nd_refcnt >= 1)
 		return;
 
 	MPASS(nandfsdev->nd_syncer == NULL);
 	MPASS(nandfsdev->nd_cleaner == NULL);
 	MPASS(nandfsdev->nd_free_base == NULL);
 
 	/* Unmount our base */
 	nandfs_unmount_base(nandfsdev);
 
 	/* Remove from our device list */
 	SLIST_REMOVE(&nandfs_devices, nandfsdev, nandfs_device, nd_next_device);
 
 	DROP_GIANT();
 	g_topology_lock();
 	g_vfs_close(nandfsdev->nd_gconsumer);
 	g_topology_unlock();
 	PICKUP_GIANT();
 
 	DPRINTF(VOLUMES, ("closing device\n"));
 
 	/* Clear our mount reference and release device node */
 	vrele(nandfsdev->nd_devvp);
 
 	dev_rel(nandfsdev->nd_devvp->v_rdev);
 
 	/* Free our device info */
 	cv_destroy(&nandfsdev->nd_sync_cv);
 	mtx_destroy(&nandfsdev->nd_sync_mtx);
 	cv_destroy(&nandfsdev->nd_clean_cv);
 	mtx_destroy(&nandfsdev->nd_clean_mtx);
 	mtx_destroy(&nandfsdev->nd_mutex);
 	lockdestroy(&nandfsdev->nd_seg_const);
 	free(nandfsdev, M_NANDFSMNT);
 }
 
 static int
 nandfs_check_mounts(struct nandfs_device *nandfsdev, struct mount *mp,
     struct nandfs_args *args)
 {
 	struct nandfsmount *nmp;
 	uint64_t last_cno;
 
 	/* no double-mounting of the same checkpoint */
 	STAILQ_FOREACH(nmp, &nandfsdev->nd_mounts, nm_next_mount) {
 		if (nmp->nm_mount_args.cpno == args->cpno)
 			return (EBUSY);
 	}
 
 	/* Allow readonly mounts without questioning here */
 	if (mp->mnt_flag & MNT_RDONLY)
 		return (0);
 
 	/* Read/write mount */
 	STAILQ_FOREACH(nmp, &nandfsdev->nd_mounts, nm_next_mount) {
 		/* Only one RW mount on this device! */
 		if ((nmp->nm_vfs_mountp->mnt_flag & MNT_RDONLY)==0)
 			return (EROFS);
 		/* RDONLY on last mountpoint is device busy */
 		last_cno = nmp->nm_nandfsdev->nd_super.s_last_cno;
 		if (nmp->nm_mount_args.cpno == last_cno)
 			return (EBUSY);
 	}
 
 	/* OK for now */
 	return (0);
 }
 
 static int
 nandfs_mount_device(struct vnode *devvp, struct mount *mp,
     struct nandfs_args *args, struct nandfs_device **nandfsdev_p)
 {
 	struct nandfs_device *nandfsdev;
 	struct g_provider *pp;
 	struct g_consumer *cp;
 	struct cdev *dev;
 	uint32_t erasesize;
 	int error, size;
 	int ronly;
 
 	DPRINTF(VOLUMES, ("Mounting NANDFS device\n"));
 
 	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 
 	/* Look up device in our nandfs_mountpoints */
 	*nandfsdev_p = NULL;
 	SLIST_FOREACH(nandfsdev, &nandfs_devices, nd_next_device)
 		if (nandfsdev->nd_devvp == devvp)
 			break;
 
 	if (nandfsdev) {
 		DPRINTF(VOLUMES, ("device already mounted\n"));
 		error = nandfs_check_mounts(nandfsdev, mp, args);
 		if (error)
 			return error;
 		nandfsdev->nd_refcnt++;
 		*nandfsdev_p = nandfsdev;
 
 		if (!ronly) {
 			DROP_GIANT();
 			g_topology_lock();
 			error = g_access(nandfsdev->nd_gconsumer, 0, 1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 		}
 		return (error);
 	}
 
 	vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 	dev = devvp->v_rdev;
 	dev_ref(dev);
 	DROP_GIANT();
 	g_topology_lock();
 	error = g_vfs_open(devvp, &cp, "nandfs", ronly ? 0 : 1);
 	pp = g_dev_getprovider(dev);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	VOP_UNLOCK(devvp, 0);
 	if (error) {
 		dev_rel(dev);
 		return (error);
 	}
 
 	nandfsdev = malloc(sizeof(struct nandfs_device), M_NANDFSMNT, M_WAITOK | M_ZERO);
 
 	/* Initialise */
 	nandfsdev->nd_refcnt = 1;
 	nandfsdev->nd_devvp = devvp;
 	nandfsdev->nd_syncing = 0;
 	nandfsdev->nd_cleaning = 0;
 	nandfsdev->nd_gconsumer = cp;
 	cv_init(&nandfsdev->nd_sync_cv, "nandfssync");
 	mtx_init(&nandfsdev->nd_sync_mtx, "nffssyncmtx", NULL, MTX_DEF);
 	cv_init(&nandfsdev->nd_clean_cv, "nandfsclean");
 	mtx_init(&nandfsdev->nd_clean_mtx, "nffscleanmtx", NULL, MTX_DEF);
 	mtx_init(&nandfsdev->nd_mutex, "nandfsdev lock", NULL, MTX_DEF);
 	lockinit(&nandfsdev->nd_seg_const, PVFS, "nffssegcon", VLKTIMEOUT,
 	    LK_CANRECURSE);
 	STAILQ_INIT(&nandfsdev->nd_mounts);
 
 	nandfsdev->nd_devsize = pp->mediasize;
 	nandfsdev->nd_devblocksize = pp->sectorsize;
 
 	size = sizeof(erasesize);
 	error = g_io_getattr("NAND::blocksize", nandfsdev->nd_gconsumer, &size,
 	    &erasesize);
 	if (error) {
 		DPRINTF(VOLUMES, ("couldn't get erasesize: %d\n", error));
 
 		if (error == ENOIOCTL || error == EOPNOTSUPP) {
 			/*
 			 * We conclude that this is not NAND storage
 			 */
 			erasesize = NANDFS_DEF_ERASESIZE;
 		} else {
 			DROP_GIANT();
 			g_topology_lock();
 			g_vfs_close(nandfsdev->nd_gconsumer);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			dev_rel(dev);
 			free(nandfsdev, M_NANDFSMNT);
 			return (error);
 		}
 	}
 	nandfsdev->nd_erasesize = erasesize;
 
 	DPRINTF(VOLUMES, ("%s: erasesize %x\n", __func__,
 	    nandfsdev->nd_erasesize));
 
 	/* Register nandfs_device in list */
 	SLIST_INSERT_HEAD(&nandfs_devices, nandfsdev, nd_next_device);
 
 	error = nandfs_mount_base(nandfsdev, mp, args);
 	if (error) {
 		/* Remove all our information */
 		nandfs_unmount_device(nandfsdev);
 		return (EINVAL);
 	}
 
 	nandfsdev->nd_maxfilesize = nandfs_get_maxfilesize(nandfsdev);
 
 	*nandfsdev_p = nandfsdev;
 	DPRINTF(VOLUMES, ("NANDFS device mounted ok\n"));
 
 	return (0);
 }
 
 static int
 nandfs_mount_checkpoint(struct nandfsmount *nmp)
 {
 	struct nandfs_cpfile_header *cphdr;
 	struct nandfs_checkpoint *cp;
 	struct nandfs_inode ifile_inode;
 	struct nandfs_node *cp_node;
 	struct buf *bp;
 	uint64_t ncp, nsn, cpno, fcpno, blocknr, last_cno;
 	uint32_t off, dlen;
 	int cp_per_block, error;
 
 	cpno = nmp->nm_mount_args.cpno;
 	if (cpno == 0)
 		cpno = nmp->nm_nandfsdev->nd_super.s_last_cno;
 
 	DPRINTF(VOLUMES, ("%s: trying to mount checkpoint number %"PRIu64"\n",
 	    __func__, cpno));
 
 	cp_node = nmp->nm_nandfsdev->nd_cp_node;
 
 	VOP_LOCK(NTOV(cp_node), LK_SHARED);
 	/* Get cpfile header from 1st block of cp file */
 	error = nandfs_bread(cp_node, 0, NOCRED, 0, &bp);
 	if (error) {
 		brelse(bp);
 		VOP_UNLOCK(NTOV(cp_node), 0);
 		return (error);
 	}
 
 	cphdr = (struct nandfs_cpfile_header *) bp->b_data;
 	ncp = cphdr->ch_ncheckpoints;
 	nsn = cphdr->ch_nsnapshots;
 
 	brelse(bp);
 
 	DPRINTF(VOLUMES, ("mount_nandfs: checkpoint header read in\n"));
 	DPRINTF(VOLUMES, ("\tNumber of checkpoints %"PRIu64"\n", ncp));
 	DPRINTF(VOLUMES, ("\tNumber of snapshots %"PRIu64"\n", nsn));
 
 	/* Read in our specified checkpoint */
 	dlen = nmp->nm_nandfsdev->nd_fsdata.f_checkpoint_size;
 	cp_per_block = nmp->nm_nandfsdev->nd_blocksize / dlen;
 
 	fcpno = cpno + NANDFS_CPFILE_FIRST_CHECKPOINT_OFFSET - 1;
 	blocknr = fcpno / cp_per_block;
 	off = (fcpno % cp_per_block) * dlen;
 	error = nandfs_bread(cp_node, blocknr, NOCRED, 0, &bp);
 	if (error) {
 		brelse(bp);
 		VOP_UNLOCK(NTOV(cp_node), 0);
 		printf("mount_nandfs: couldn't read cp block %"PRIu64"\n",
 		    fcpno);
 		return (EINVAL);
 	}
 
 	/* Needs to be a valid checkpoint */
 	cp = (struct nandfs_checkpoint *) ((uint8_t *) bp->b_data + off);
 	if (cp->cp_flags & NANDFS_CHECKPOINT_INVALID) {
 		printf("mount_nandfs: checkpoint marked invalid\n");
 		brelse(bp);
 		VOP_UNLOCK(NTOV(cp_node), 0);
 		return (EINVAL);
 	}
 
 	/* Is this really the checkpoint we want? */
 	if (cp->cp_cno != cpno) {
 		printf("mount_nandfs: checkpoint file corrupt? "
 		    "expected cpno %"PRIu64", found cpno %"PRIu64"\n",
 		    cpno, cp->cp_cno);
 		brelse(bp);
 		VOP_UNLOCK(NTOV(cp_node), 0);
 		return (EINVAL);
 	}
 
 	/* Check if it's a snapshot ! */
 	last_cno = nmp->nm_nandfsdev->nd_super.s_last_cno;
 	if (cpno != last_cno) {
 		/* Only allow snapshots if not mounting on the last cp */
 		if ((cp->cp_flags & NANDFS_CHECKPOINT_SNAPSHOT) == 0) {
 			printf( "mount_nandfs: checkpoint %"PRIu64" is not a "
 			    "snapshot\n", cpno);
 			brelse(bp);
 			VOP_UNLOCK(NTOV(cp_node), 0);
 			return (EINVAL);
 		}
 	}
 
 	ifile_inode = cp->cp_ifile_inode;
 	brelse(bp);
 
 	/* Get ifile inode */
 	error = nandfs_get_node_raw(nmp->nm_nandfsdev, NULL, NANDFS_IFILE_INO,
 	    &ifile_inode, &nmp->nm_ifile_node);
 	if (error) {
 		printf("mount_nandfs: can't read ifile node\n");
 		VOP_UNLOCK(NTOV(cp_node), 0);
 		return (EINVAL);
 	}
 
 	NANDFS_SET_SYSTEMFILE(NTOV(nmp->nm_ifile_node));
 	VOP_UNLOCK(NTOV(cp_node), 0);
 	/* Get root node? */
 
 	return (0);
 }
 
 static void
 free_nandfs_mountinfo(struct mount *mp)
 {
 	struct nandfsmount *nmp = VFSTONANDFS(mp);
 
 	if (nmp == NULL)
 		return;
 
 	free(nmp, M_NANDFSMNT);
 }
 
 void
 nandfs_wakeup_wait_sync(struct nandfs_device *nffsdev, int reason)
 {
 	char *reasons[] = {
 	    "umount",
 	    "vfssync",
 	    "bdflush",
 	    "fforce",
 	    "fsync",
 	    "ro_upd"
 	};
 
 	DPRINTF(SYNC, ("%s: %s\n", __func__, reasons[reason]));
 	mtx_lock(&nffsdev->nd_sync_mtx);
 	if (nffsdev->nd_syncing)
 		cv_wait(&nffsdev->nd_sync_cv, &nffsdev->nd_sync_mtx);
 	if (reason == SYNCER_UMOUNT)
 		nffsdev->nd_syncer_exit = 1;
 	nffsdev->nd_syncing = 1;
 	wakeup(&nffsdev->nd_syncing);
 	cv_wait(&nffsdev->nd_sync_cv, &nffsdev->nd_sync_mtx);
 
 	mtx_unlock(&nffsdev->nd_sync_mtx);
 }
 
 static void
 nandfs_gc_finished(struct nandfs_device *nffsdev, int exit)
 {
 	int error;
 
 	mtx_lock(&nffsdev->nd_sync_mtx);
 	nffsdev->nd_syncing = 0;
 	DPRINTF(SYNC, ("%s: cleaner finish\n", __func__));
 	cv_broadcast(&nffsdev->nd_sync_cv);
 	mtx_unlock(&nffsdev->nd_sync_mtx);
 	if (!exit) {
 		error = tsleep(&nffsdev->nd_syncing, PRIBIO, "-",
 		    hz * nandfs_sync_interval);
 		DPRINTF(SYNC, ("%s: cleaner waked up: %d\n",
 		    __func__, error));
 	}
 }
 
 static void
 nandfs_syncer(struct nandfsmount *nmp)
 {
 	struct nandfs_device *nffsdev;
 	struct mount *mp;
 	int flags, error;
 
 	mp = nmp->nm_vfs_mountp;
 	nffsdev = nmp->nm_nandfsdev;
 	tsleep(&nffsdev->nd_syncing, PRIBIO, "-", hz * nandfs_sync_interval);
 
 	while (!nffsdev->nd_syncer_exit) {
 		DPRINTF(SYNC, ("%s: syncer run\n", __func__));
 		nffsdev->nd_syncing = 1;
 
 		flags = (nmp->nm_flags & (NANDFS_FORCE_SYNCER | NANDFS_UMOUNT));
 
 		error = nandfs_segment_constructor(nmp, flags);
 		if (error)
 			nandfs_error("%s: error:%d when creating segments\n",
 			    __func__, error);
 
 		nmp->nm_flags &= ~flags;
 
 		nandfs_gc_finished(nffsdev, 0);
 	}
 
 	MPASS(nffsdev->nd_cleaner == NULL);
 	error = nandfs_segment_constructor(nmp,
 	    NANDFS_FORCE_SYNCER | NANDFS_UMOUNT);
 	if (error)
 		nandfs_error("%s: error:%d when creating segments\n",
 		    __func__, error);
 	nandfs_gc_finished(nffsdev, 1);
 	nffsdev->nd_syncer = NULL;
 	MPASS(nffsdev->nd_free_base == NULL);
 
 	DPRINTF(SYNC, ("%s: exiting\n", __func__));
 	kthread_exit();
 }
 
 static int
 start_syncer(struct nandfsmount *nmp)
 {
 	int error;
 
 	MPASS(nmp->nm_nandfsdev->nd_syncer == NULL);
 
 	DPRINTF(SYNC, ("%s: start syncer\n", __func__));
 
 	nmp->nm_nandfsdev->nd_syncer_exit = 0;
 
 	error = kthread_add((void(*)(void *))nandfs_syncer, nmp, NULL,
 	    &nmp->nm_nandfsdev->nd_syncer, 0, 0, "nandfs_syncer");
 
 	if (error)
 		printf("nandfs: could not start syncer: %d\n", error);
 
 	return (error);
 }
 
 static int
 stop_syncer(struct nandfsmount *nmp)
 {
 
 	MPASS(nmp->nm_nandfsdev->nd_syncer != NULL);
 
 	nandfs_wakeup_wait_sync(nmp->nm_nandfsdev, SYNCER_UMOUNT);
 
 	DPRINTF(SYNC, ("%s: stop syncer\n", __func__));
 	return (0);
 }
 
 /*
  * Mount null layer
  */
 static int
 nandfs_mount(struct mount *mp)
 {
 	struct nandfsmount *nmp;
 	struct vnode *devvp;
 	struct nameidata nd;
 	struct vfsoptlist *opts;
 	struct thread *td;
 	char *from;
 	int error = 0, flags;
 
 	DPRINTF(VOLUMES, ("%s: mp = %p\n", __func__, (void *)mp));
 
 	td = curthread;
 	opts = mp->mnt_optnew;
 
 	if (vfs_filteropt(opts, nandfs_opts))
 		return (EINVAL);
 
 	/*
 	 * Update is a no-op
 	 */
 	if (mp->mnt_flag & MNT_UPDATE) {
 		nmp = VFSTONANDFS(mp);
 		if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0)) {
 			return (error);
 		}
 		if (!(nmp->nm_ronly) && vfs_flagopt(opts, "ro", NULL, 0)) {
 			vn_start_write(NULL, &mp, V_WAIT);
 			error = VFS_SYNC(mp, MNT_WAIT);
 			if (error)
 				return (error);
 			vn_finished_write(mp);
 
 			flags = WRITECLOSE;
 			if (mp->mnt_flag & MNT_FORCE)
 				flags |= FORCECLOSE;
 
 			nandfs_wakeup_wait_sync(nmp->nm_nandfsdev,
 			    SYNCER_ROUPD);
 			error = vflush(mp, 0, flags, td);
 			if (error)
 				return (error);
 
 			nandfs_stop_cleaner(nmp->nm_nandfsdev);
 			stop_syncer(nmp);
 			DROP_GIANT();
 			g_topology_lock();
 			g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, -1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 			nmp->nm_ronly = 1;
 
 		} else if ((nmp->nm_ronly) &&
 		    !vfs_flagopt(opts, "ro", NULL, 0)) {
 			/*
 			 * Don't allow read-write snapshots.
 			 */
 			if (nmp->nm_mount_args.cpno != 0)
 				return (EROFS);
 			/*
 			 * If upgrade to read-write by non-root, then verify
 			 * that user has necessary permissions on the device.
 			 */
 			devvp = nmp->nm_nandfsdev->nd_devvp;
 			vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 			error = VOP_ACCESS(devvp, VREAD | VWRITE,
 			    td->td_ucred, td);
 			if (error) {
 				error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 				if (error) {
 					VOP_UNLOCK(devvp, 0);
 					return (error);
 				}
 			}
 
 			VOP_UNLOCK(devvp, 0);
 			DROP_GIANT();
 			g_topology_lock();
 			error = g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, 1,
 			    0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error)
 				return (error);
 
 			MNT_ILOCK(mp);
 			mp->mnt_flag &= ~MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 			error = start_syncer(nmp);
 			if (error == 0)
 				error = nandfs_start_cleaner(nmp->nm_nandfsdev);
 			if (error) {
 				DROP_GIANT();
 				g_topology_lock();
 				g_access(nmp->nm_nandfsdev->nd_gconsumer, 0, -1,
 				    0);
 				g_topology_unlock();
 				PICKUP_GIANT();
 				return (error);
 			}
 
 			nmp->nm_ronly = 0;
 		}
 		return (0);
 	}
 
 	from = vfs_getopts(opts, "from", &error);
 	if (error)
 		return (error);
 
 	/*
 	 * Find device node
 	 */
 	NDINIT(&nd, LOOKUP, FOLLOW|LOCKLEAF, UIO_SYSSPACE, from, curthread);
 	error = namei(&nd);
 	if (error)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 
 	devvp = nd.ni_vp;
 
 	if (!vn_isdisk(devvp, &error)) {
 		vput(devvp);
 		return (error);
 	}
 
 	/* Check the access rights on the mount device */
 	error = VOP_ACCESS(devvp, VREAD, curthread->td_ucred, curthread);
 	if (error)
 		error = priv_check(curthread, PRIV_VFS_MOUNT_PERM);
 	if (error) {
 		vput(devvp);
 		return (error);
 	}
 
 	vfs_getnewfsid(mp);
 
 	error = nandfs_mountfs(devvp, mp);
 	if (error)
 		return (error);
 	vfs_mountedfrom(mp, from);
 
 	return (0);
 }
 
 static int
 nandfs_mountfs(struct vnode *devvp, struct mount *mp)
 {
 	struct nandfsmount *nmp = NULL;
 	struct nandfs_args *args = NULL;
 	struct nandfs_device *nandfsdev;
 	char *from;
 	int error, ronly;
 	char *cpno;
 
 	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 
 	if (devvp->v_rdev->si_iosize_max != 0)
 		mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max;
 	VOP_UNLOCK(devvp, 0);
 
 	if (mp->mnt_iosize_max > MAXPHYS)
 		mp->mnt_iosize_max = MAXPHYS;
 
 	from = vfs_getopts(mp->mnt_optnew, "from", &error);
 	if (error)
 		goto error;
 
 	error = vfs_getopt(mp->mnt_optnew, "snap", (void **)&cpno, NULL);
 	if (error == ENOENT)
 		cpno = NULL;
 	else if (error)
 		goto error;
 
 	args = (struct nandfs_args *)malloc(sizeof(struct nandfs_args),
 	    M_NANDFSMNT, M_WAITOK | M_ZERO);
 
 	if (cpno != NULL)
 		args->cpno = strtoul(cpno, (char **)NULL, 10);
 	else
 		args->cpno = 0;
 	args->fspec = from;
 
 	if (args->cpno != 0 && !ronly) {
 		error = EROFS;
 		goto error;
 	}
 
 	printf("WARNING: NANDFS is considered to be a highly experimental "
 	    "feature in FreeBSD.\n");
 
 	error = nandfs_mount_device(devvp, mp, args, &nandfsdev);
 	if (error)
 		goto error;
 
 	nmp = (struct nandfsmount *) malloc(sizeof(struct nandfsmount),
 	    M_NANDFSMNT, M_WAITOK | M_ZERO);
 
 	mp->mnt_data = nmp;
 	nmp->nm_vfs_mountp = mp;
 	nmp->nm_ronly = ronly;
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= MNT_LOCAL;
+	mp->mnt_kern_flag |= MNTK_USES_BCACHE;
 	MNT_IUNLOCK(mp);
 	nmp->nm_nandfsdev = nandfsdev;
 	/* Add our mountpoint */
 	STAILQ_INSERT_TAIL(&nandfsdev->nd_mounts, nmp, nm_next_mount);
 
 	if (args->cpno > nandfsdev->nd_last_cno) {
 		printf("WARNING: supplied checkpoint number (%jd) is greater "
 		    "than last known checkpoint on filesystem (%jd). Mounting"
 		    " checkpoint %jd\n", (uintmax_t)args->cpno,
 		    (uintmax_t)nandfsdev->nd_last_cno,
 		    (uintmax_t)nandfsdev->nd_last_cno);
 		args->cpno = nandfsdev->nd_last_cno;
 	}
 
 	/* Setting up other parameters */
 	nmp->nm_mount_args = *args;
 	free(args, M_NANDFSMNT);
 	error = nandfs_mount_checkpoint(nmp);
 	if (error) {
 		nandfs_unmount(mp, MNT_FORCE);
 		goto unmounted;
 	}
 
 	if (!ronly) {
 		error = start_syncer(nmp);
 		if (error == 0)
 			error = nandfs_start_cleaner(nmp->nm_nandfsdev);
 		if (error)
 			nandfs_unmount(mp, MNT_FORCE);
 	}
 
 	return (0);
 
 error:
 	if (args != NULL)
 		free(args, M_NANDFSMNT);
 
 	if (nmp != NULL) {
 		free(nmp, M_NANDFSMNT);
 		mp->mnt_data = NULL;
 	}
 unmounted:
 	return (error);
 }
 
 static int
 nandfs_unmount(struct mount *mp, int mntflags)
 {
 	struct nandfs_device *nandfsdev;
 	struct nandfsmount *nmp;
 	int error;
 	int flags = 0;
 
 	DPRINTF(VOLUMES, ("%s: mp = %p\n", __func__, (void *)mp));
 
 	if (mntflags & MNT_FORCE)
 		flags |= FORCECLOSE;
 
 	nmp = mp->mnt_data;
 	nandfsdev = nmp->nm_nandfsdev;
 
 	error = vflush(mp, 0, flags | SKIPSYSTEM, curthread);
 	if (error)
 		return (error);
 
 	if (!(nmp->nm_ronly)) {
 		nandfs_stop_cleaner(nandfsdev);
 		stop_syncer(nmp);
 	}
 
 	if (nmp->nm_ifile_node)
 		NANDFS_UNSET_SYSTEMFILE(NTOV(nmp->nm_ifile_node));
 
 	/* Remove our mount point */
 	STAILQ_REMOVE(&nandfsdev->nd_mounts, nmp, nandfsmount, nm_next_mount);
 
 	/* Unmount the device itself when we're the last one */
 	nandfs_unmount_device(nandfsdev);
 
 	free_nandfs_mountinfo(mp);
 
 	/*
 	 * Finally, throw away the null_mount structure
 	 */
 	mp->mnt_data = 0;
 	MNT_ILOCK(mp);
 	mp->mnt_flag &= ~MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 
 	return (0);
 }
 
 static int
 nandfs_statfs(struct mount *mp, struct statfs *sbp)
 {
 	struct nandfsmount *nmp;
 	struct nandfs_device *nandfsdev;
 	struct nandfs_fsdata *fsdata;
 	struct nandfs_super_block *sb;
 	struct nandfs_block_group_desc *groups;
 	struct nandfs_node *ifile;
 	struct nandfs_mdt *mdt;
 	struct buf *bp;
 	int i, error;
 	uint32_t entries_per_group;
 	uint64_t files = 0;
 
 	nmp = mp->mnt_data;
 	nandfsdev = nmp->nm_nandfsdev;
 	fsdata = &nandfsdev->nd_fsdata;
 	sb = &nandfsdev->nd_super;
 	ifile = nmp->nm_ifile_node;
 	mdt = &nandfsdev->nd_ifile_mdt;
 	entries_per_group = mdt->entries_per_group;
 
 	VOP_LOCK(NTOV(ifile), LK_SHARED);
 	error = nandfs_bread(ifile, 0, NOCRED, 0, &bp);
 	if (error) {
 		brelse(bp);
 		VOP_UNLOCK(NTOV(ifile), 0);
 		return (error);
 	}
 
 	groups = (struct nandfs_block_group_desc *)bp->b_data;
 
 	for (i = 0; i < mdt->groups_per_desc_block; i++)
 		files += (entries_per_group - groups[i].bg_nfrees);
 
 	brelse(bp);
 	VOP_UNLOCK(NTOV(ifile), 0);
 
 	sbp->f_bsize = nandfsdev->nd_blocksize;
 	sbp->f_iosize = sbp->f_bsize;
 	sbp->f_blocks = fsdata->f_blocks_per_segment * fsdata->f_nsegments;
 	sbp->f_bfree = sb->s_free_blocks_count;
 	sbp->f_bavail = sbp->f_bfree;
 	sbp->f_files = files;
 	sbp->f_ffree = 0;
 	return (0);
 }
 
 static int
 nandfs_root(struct mount *mp, int flags, struct vnode **vpp)
 {
 	struct nandfsmount *nmp = VFSTONANDFS(mp);
 	struct nandfs_node *node;
 	int error;
 
 	error = nandfs_get_node(nmp, NANDFS_ROOT_INO, &node);
 	if (error)
 		return (error);
 
 	KASSERT(NTOV(node)->v_vflag & VV_ROOT,
 	    ("root_vp->v_vflag & VV_ROOT"));
 
 	*vpp = NTOV(node);
 
 	return (error);
 }
 
 static int
 nandfs_vget(struct mount *mp, ino_t ino, int flags, struct vnode **vpp)
 {
 	struct nandfsmount *nmp = VFSTONANDFS(mp);
 	struct nandfs_node *node;
 	int error;
 
 	error = nandfs_get_node(nmp, ino, &node);
 	if (node)
 		*vpp = NTOV(node);
 
 	return (error);
 }
 
 static int
 nandfs_sync(struct mount *mp, int waitfor)
 {
 	struct nandfsmount *nmp = VFSTONANDFS(mp);
 
 	DPRINTF(SYNC, ("%s: mp %p waitfor %d\n", __func__, mp, waitfor));
 
 	/*
 	 * XXX: A hack to be removed soon
 	 */
 	if (waitfor == MNT_LAZY)
 		return (0);
 	if (waitfor == MNT_SUSPEND)
 		return (0);
 	nandfs_wakeup_wait_sync(nmp->nm_nandfsdev, SYNCER_VFS_SYNC);
 	return (0);
 }
 
 static struct vfsops nandfs_vfsops = {
 	.vfs_init =		nandfs_init,
 	.vfs_mount =		nandfs_mount,
 	.vfs_root =		nandfs_root,
 	.vfs_statfs =		nandfs_statfs,
 	.vfs_uninit =		nandfs_uninit,
 	.vfs_unmount =		nandfs_unmount,
 	.vfs_vget =		nandfs_vget,
 	.vfs_sync =		nandfs_sync,
 };
 
 VFS_SET(nandfs_vfsops, nandfs, VFCF_LOOPBACK);
Index: user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/nfsclient/nfs_clvfsops.c	(revision 281585)
@@ -1,1876 +1,1877 @@
 /*-
  * Copyright (c) 1989, 1993, 1995
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Rick Macklem at The University of Guelph.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	from nfs_vfsops.c	8.12 (Berkeley) 5/20/95
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 
 #include "opt_bootp.h"
 #include "opt_nfsroot.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/kernel.h>
 #include <sys/bio.h>
 #include <sys/buf.h>
 #include <sys/clock.h>
 #include <sys/jail.h>
 #include <sys/limits.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mbuf.h>
 #include <sys/module.h>
 #include <sys/mount.h>
 #include <sys/proc.h>
 #include <sys/socket.h>
 #include <sys/socketvar.h>
 #include <sys/sockio.h>
 #include <sys/sysctl.h>
 #include <sys/vnode.h>
 #include <sys/signalvar.h>
 
 #include <vm/vm.h>
 #include <vm/vm_extern.h>
 #include <vm/uma.h>
 
 #include <net/if.h>
 #include <net/route.h>
 #include <netinet/in.h>
 
 #include <fs/nfs/nfsport.h>
 #include <fs/nfsclient/nfsnode.h>
 #include <fs/nfsclient/nfsmount.h>
 #include <fs/nfsclient/nfs.h>
 #include <nfs/nfsdiskless.h>
 
 FEATURE(nfscl, "NFSv4 client");
 
 extern int nfscl_ticks;
 extern struct timeval nfsboottime;
 extern struct nfsstats	newnfsstats;
 extern int nfsrv_useacl;
 extern int nfscl_debuglevel;
 extern enum nfsiod_state ncl_iodwant[NFS_MAXASYNCDAEMON];
 extern struct nfsmount *ncl_iodmount[NFS_MAXASYNCDAEMON];
 extern struct mtx ncl_iod_mutex;
 NFSCLSTATEMUTEX;
 
 MALLOC_DEFINE(M_NEWNFSREQ, "newnfsclient_req", "New NFS request header");
 MALLOC_DEFINE(M_NEWNFSMNT, "newnfsmnt", "New NFS mount struct");
 
 SYSCTL_DECL(_vfs_nfs);
 static int nfs_ip_paranoia = 1;
 SYSCTL_INT(_vfs_nfs, OID_AUTO, nfs_ip_paranoia, CTLFLAG_RW,
     &nfs_ip_paranoia, 0, "");
 static int nfs_tprintf_initial_delay = NFS_TPRINTF_INITIAL_DELAY;
 SYSCTL_INT(_vfs_nfs, NFS_TPRINTF_INITIAL_DELAY,
         downdelayinitial, CTLFLAG_RW, &nfs_tprintf_initial_delay, 0, "");
 /* how long between console messages "nfs server foo not responding" */
 static int nfs_tprintf_delay = NFS_TPRINTF_DELAY;
 SYSCTL_INT(_vfs_nfs, NFS_TPRINTF_DELAY,
         downdelayinterval, CTLFLAG_RW, &nfs_tprintf_delay, 0, "");
 #ifdef NFS_DEBUG
 int nfs_debug;
 SYSCTL_INT(_vfs_nfs, OID_AUTO, debug, CTLFLAG_RW, &nfs_debug, 0,
     "Toggle debug flag");
 #endif
 
 static int	nfs_mountroot(struct mount *);
 static void	nfs_sec_name(char *, int *);
 static void	nfs_decode_args(struct mount *mp, struct nfsmount *nmp,
 		    struct nfs_args *argp, const char *, struct ucred *,
 		    struct thread *);
 static int	mountnfs(struct nfs_args *, struct mount *,
 		    struct sockaddr *, char *, u_char *, int, u_char *, int,
 		    u_char *, int, struct vnode **, struct ucred *,
 		    struct thread *, int, int, int);
 static void	nfs_getnlminfo(struct vnode *, uint8_t *, size_t *,
 		    struct sockaddr_storage *, int *, off_t *,
 		    struct timeval *);
 static vfs_mount_t nfs_mount;
 static vfs_cmount_t nfs_cmount;
 static vfs_unmount_t nfs_unmount;
 static vfs_root_t nfs_root;
 static vfs_statfs_t nfs_statfs;
 static vfs_sync_t nfs_sync;
 static vfs_sysctl_t nfs_sysctl;
 static vfs_purge_t nfs_purge;
 
 /*
  * nfs vfs operations.
  */
 static struct vfsops nfs_vfsops = {
 	.vfs_init =		ncl_init,
 	.vfs_mount =		nfs_mount,
 	.vfs_cmount =		nfs_cmount,
 	.vfs_root =		nfs_root,
 	.vfs_statfs =		nfs_statfs,
 	.vfs_sync =		nfs_sync,
 	.vfs_uninit =		ncl_uninit,
 	.vfs_unmount =		nfs_unmount,
 	.vfs_sysctl =		nfs_sysctl,
 	.vfs_purge =		nfs_purge,
 };
 VFS_SET(nfs_vfsops, nfs, VFCF_NETWORK | VFCF_SBDRY);
 
 /* So that loader and kldload(2) can find us, wherever we are.. */
 MODULE_VERSION(nfs, 1);
 MODULE_DEPEND(nfs, nfscommon, 1, 1, 1);
 MODULE_DEPEND(nfs, krpc, 1, 1, 1);
 MODULE_DEPEND(nfs, nfssvc, 1, 1, 1);
 MODULE_DEPEND(nfs, nfslock, 1, 1, 1);
 
 /*
  * This structure is now defined in sys/nfs/nfs_diskless.c so that it
  * can be shared by both NFS clients. It is declared here so that it
  * will be defined for kernels built without NFS_ROOT, although it
  * isn't used in that case.
  */
 #if !defined(NFS_ROOT)
 struct nfs_diskless	nfs_diskless = { { { 0 } } };
 struct nfsv3_diskless	nfsv3_diskless = { { { 0 } } };
 int			nfs_diskless_valid = 0;
 #endif
 
 SYSCTL_INT(_vfs_nfs, OID_AUTO, diskless_valid, CTLFLAG_RD,
     &nfs_diskless_valid, 0,
     "Has the diskless struct been filled correctly");
 
 SYSCTL_STRING(_vfs_nfs, OID_AUTO, diskless_rootpath, CTLFLAG_RD,
     nfsv3_diskless.root_hostnam, 0, "Path to nfs root");
 
 SYSCTL_OPAQUE(_vfs_nfs, OID_AUTO, diskless_rootaddr, CTLFLAG_RD,
     &nfsv3_diskless.root_saddr, sizeof(nfsv3_diskless.root_saddr),
     "%Ssockaddr_in", "Diskless root nfs address");
 
 
 void		newnfsargs_ntoh(struct nfs_args *);
 static int	nfs_mountdiskless(char *,
 		    struct sockaddr_in *, struct nfs_args *,
 		    struct thread *, struct vnode **, struct mount *);
 static void	nfs_convert_diskless(void);
 static void	nfs_convert_oargs(struct nfs_args *args,
 		    struct onfs_args *oargs);
 
 int
 newnfs_iosize(struct nfsmount *nmp)
 {
 	int iosize, maxio;
 
 	/* First, set the upper limit for iosize */
 	if (nmp->nm_flag & NFSMNT_NFSV4) {
 		maxio = NFS_MAXBSIZE;
 	} else if (nmp->nm_flag & NFSMNT_NFSV3) {
 		if (nmp->nm_sotype == SOCK_DGRAM)
 			maxio = NFS_MAXDGRAMDATA;
 		else
 			maxio = NFS_MAXBSIZE;
 	} else {
 		maxio = NFS_V2MAXDATA;
 	}
 	if (nmp->nm_rsize > maxio || nmp->nm_rsize == 0)
 		nmp->nm_rsize = maxio;
 	if (nmp->nm_rsize > MAXBSIZE)
 		nmp->nm_rsize = MAXBSIZE;
 	if (nmp->nm_readdirsize > maxio || nmp->nm_readdirsize == 0)
 		nmp->nm_readdirsize = maxio;
 	if (nmp->nm_readdirsize > nmp->nm_rsize)
 		nmp->nm_readdirsize = nmp->nm_rsize;
 	if (nmp->nm_wsize > maxio || nmp->nm_wsize == 0)
 		nmp->nm_wsize = maxio;
 	if (nmp->nm_wsize > MAXBSIZE)
 		nmp->nm_wsize = MAXBSIZE;
 
 	/*
 	 * Calculate the size used for io buffers.  Use the larger
 	 * of the two sizes to minimise nfs requests but make sure
 	 * that it is at least one VM page to avoid wasting buffer
 	 * space.
 	 */
 	iosize = imax(nmp->nm_rsize, nmp->nm_wsize);
 	iosize = imax(iosize, PAGE_SIZE);
 	nmp->nm_mountp->mnt_stat.f_iosize = iosize;
 	return (iosize);
 }
 
 static void
 nfs_convert_oargs(struct nfs_args *args, struct onfs_args *oargs)
 {
 
 	args->version = NFS_ARGSVERSION;
 	args->addr = oargs->addr;
 	args->addrlen = oargs->addrlen;
 	args->sotype = oargs->sotype;
 	args->proto = oargs->proto;
 	args->fh = oargs->fh;
 	args->fhsize = oargs->fhsize;
 	args->flags = oargs->flags;
 	args->wsize = oargs->wsize;
 	args->rsize = oargs->rsize;
 	args->readdirsize = oargs->readdirsize;
 	args->timeo = oargs->timeo;
 	args->retrans = oargs->retrans;
 	args->readahead = oargs->readahead;
 	args->hostname = oargs->hostname;
 }
 
 static void
 nfs_convert_diskless(void)
 {
 
 	bcopy(&nfs_diskless.myif, &nfsv3_diskless.myif,
 		sizeof(struct ifaliasreq));
 	bcopy(&nfs_diskless.mygateway, &nfsv3_diskless.mygateway,
 		sizeof(struct sockaddr_in));
 	nfs_convert_oargs(&nfsv3_diskless.root_args,&nfs_diskless.root_args);
 	if (nfsv3_diskless.root_args.flags & NFSMNT_NFSV3) {
 		nfsv3_diskless.root_fhsize = NFSX_MYFH;
 		bcopy(nfs_diskless.root_fh, nfsv3_diskless.root_fh, NFSX_MYFH);
 	} else {
 		nfsv3_diskless.root_fhsize = NFSX_V2FH;
 		bcopy(nfs_diskless.root_fh, nfsv3_diskless.root_fh, NFSX_V2FH);
 	}
 	bcopy(&nfs_diskless.root_saddr,&nfsv3_diskless.root_saddr,
 		sizeof(struct sockaddr_in));
 	bcopy(nfs_diskless.root_hostnam, nfsv3_diskless.root_hostnam, MNAMELEN);
 	nfsv3_diskless.root_time = nfs_diskless.root_time;
 	bcopy(nfs_diskless.my_hostnam, nfsv3_diskless.my_hostnam,
 		MAXHOSTNAMELEN);
 	nfs_diskless_valid = 3;
 }
 
 /*
  * nfs statfs call
  */
 static int
 nfs_statfs(struct mount *mp, struct statfs *sbp)
 {
 	struct vnode *vp;
 	struct thread *td;
 	struct nfsmount *nmp = VFSTONFS(mp);
 	struct nfsvattr nfsva;
 	struct nfsfsinfo fs;
 	struct nfsstatfs sb;
 	int error = 0, attrflag, gotfsinfo = 0, ret;
 	struct nfsnode *np;
 
 	td = curthread;
 
 	error = vfs_busy(mp, MBF_NOWAIT);
 	if (error)
 		return (error);
 	error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, LK_EXCLUSIVE);
 	if (error) {
 		vfs_unbusy(mp);
 		return (error);
 	}
 	vp = NFSTOV(np);
 	mtx_lock(&nmp->nm_mtx);
 	if (NFSHASNFSV3(nmp) && !NFSHASGOTFSINFO(nmp)) {
 		mtx_unlock(&nmp->nm_mtx);
 		error = nfsrpc_fsinfo(vp, &fs, td->td_ucred, td, &nfsva,
 		    &attrflag, NULL);
 		if (!error)
 			gotfsinfo = 1;
 	} else
 		mtx_unlock(&nmp->nm_mtx);
 	if (!error)
 		error = nfsrpc_statfs(vp, &sb, &fs, td->td_ucred, td, &nfsva,
 		    &attrflag, NULL);
 	if (error != 0)
 		NFSCL_DEBUG(2, "statfs=%d\n", error);
 	if (attrflag == 0) {
 		ret = nfsrpc_getattrnovp(nmp, nmp->nm_fh, nmp->nm_fhsize, 1,
 		    td->td_ucred, td, &nfsva, NULL, NULL);
 		if (ret) {
 			/*
 			 * Just set default values to get things going.
 			 */
 			NFSBZERO((caddr_t)&nfsva, sizeof (struct nfsvattr));
 			nfsva.na_vattr.va_type = VDIR;
 			nfsva.na_vattr.va_mode = 0777;
 			nfsva.na_vattr.va_nlink = 100;
 			nfsva.na_vattr.va_uid = (uid_t)0;
 			nfsva.na_vattr.va_gid = (gid_t)0;
 			nfsva.na_vattr.va_fileid = 2;
 			nfsva.na_vattr.va_gen = 1;
 			nfsva.na_vattr.va_blocksize = NFS_FABLKSIZE;
 			nfsva.na_vattr.va_size = 512 * 1024;
 		}
 	}
 	(void) nfscl_loadattrcache(&vp, &nfsva, NULL, NULL, 0, 1);
 	if (!error) {
 	    mtx_lock(&nmp->nm_mtx);
 	    if (gotfsinfo || (nmp->nm_flag & NFSMNT_NFSV4))
 		nfscl_loadfsinfo(nmp, &fs);
 	    nfscl_loadsbinfo(nmp, &sb, sbp);
 	    sbp->f_iosize = newnfs_iosize(nmp);
 	    mtx_unlock(&nmp->nm_mtx);
 	    if (sbp != &mp->mnt_stat) {
 		bcopy(mp->mnt_stat.f_mntonname, sbp->f_mntonname, MNAMELEN);
 		bcopy(mp->mnt_stat.f_mntfromname, sbp->f_mntfromname, MNAMELEN);
 	    }
 	    strncpy(&sbp->f_fstypename[0], mp->mnt_vfc->vfc_name, MFSNAMELEN);
 	} else if (NFS_ISV4(vp)) {
 		error = nfscl_maperr(td, error, (uid_t)0, (gid_t)0);
 	}
 	vput(vp);
 	vfs_unbusy(mp);
 	return (error);
 }
 
 /*
  * nfs version 3 fsinfo rpc call
  */
 int
 ncl_fsinfo(struct nfsmount *nmp, struct vnode *vp, struct ucred *cred,
     struct thread *td)
 {
 	struct nfsfsinfo fs;
 	struct nfsvattr nfsva;
 	int error, attrflag;
 	
 	error = nfsrpc_fsinfo(vp, &fs, cred, td, &nfsva, &attrflag, NULL);
 	if (!error) {
 		if (attrflag)
 			(void) nfscl_loadattrcache(&vp, &nfsva, NULL, NULL, 0,
 			    1);
 		mtx_lock(&nmp->nm_mtx);
 		nfscl_loadfsinfo(nmp, &fs);
 		mtx_unlock(&nmp->nm_mtx);
 	}
 	return (error);
 }
 
 /*
  * Mount a remote root fs via NFS. This depends on the info in the
  * nfs_diskless structure that has been filled in properly by some primary
  * bootstrap.
  * It goes something like this:
  * - do enough of "ifconfig" by calling ifioctl() so that the system
  *   can talk to the server
  * - If nfs_diskless.mygateway is filled in, use that address as
  *   a default gateway.
  * - build the rootfs mount point and call mountnfs() to do the rest.
  *
  * It is assumed to be safe to read, modify, and write the nfsv3_diskless
  * structure, as well as other global NFS client variables here, as
  * nfs_mountroot() will be called once in the boot before any other NFS
  * client activity occurs.
  */
 static int
 nfs_mountroot(struct mount *mp)
 {
 	struct thread *td = curthread;
 	struct nfsv3_diskless *nd = &nfsv3_diskless;
 	struct socket *so;
 	struct vnode *vp;
 	struct ifreq ir;
 	int error;
 	u_long l;
 	char buf[128];
 	char *cp;
 
 #if defined(BOOTP_NFSROOT) && defined(BOOTP)
 	bootpc_init();		/* use bootp to get nfs_diskless filled in */
 #elif defined(NFS_ROOT)
 	nfs_setup_diskless();
 #endif
 
 	if (nfs_diskless_valid == 0)
 		return (-1);
 	if (nfs_diskless_valid == 1)
 		nfs_convert_diskless();
 
 	/*
 	 * XXX splnet, so networks will receive...
 	 */
 	splnet();
 
 	/*
 	 * Do enough of ifconfig(8) so that the critical net interface can
 	 * talk to the server.
 	 */
 	error = socreate(nd->myif.ifra_addr.sa_family, &so, nd->root_args.sotype, 0,
 	    td->td_ucred, td);
 	if (error)
 		panic("nfs_mountroot: socreate(%04x): %d",
 			nd->myif.ifra_addr.sa_family, error);
 
 #if 0 /* XXX Bad idea */
 	/*
 	 * We might not have been told the right interface, so we pass
 	 * over the first ten interfaces of the same kind, until we get
 	 * one of them configured.
 	 */
 
 	for (i = strlen(nd->myif.ifra_name) - 1;
 		nd->myif.ifra_name[i] >= '0' &&
 		nd->myif.ifra_name[i] <= '9';
 		nd->myif.ifra_name[i] ++) {
 		error = ifioctl(so, SIOCAIFADDR, (caddr_t)&nd->myif, td);
 		if(!error)
 			break;
 	}
 #endif
 	error = ifioctl(so, SIOCAIFADDR, (caddr_t)&nd->myif, td);
 	if (error)
 		panic("nfs_mountroot: SIOCAIFADDR: %d", error);
 	if ((cp = kern_getenv("boot.netif.mtu")) != NULL) {
 		ir.ifr_mtu = strtol(cp, NULL, 10);
 		bcopy(nd->myif.ifra_name, ir.ifr_name, IFNAMSIZ);
 		freeenv(cp);
 		error = ifioctl(so, SIOCSIFMTU, (caddr_t)&ir, td);
 		if (error)
 			printf("nfs_mountroot: SIOCSIFMTU: %d\n", error);
 	}
 	soclose(so);
 
 	/*
 	 * If the gateway field is filled in, set it as the default route.
 	 * Note that pxeboot will set a default route of 0 if the route
 	 * is not set by the DHCP server.  Check also for a value of 0
 	 * to avoid panicking inappropriately in that situation.
 	 */
 	if (nd->mygateway.sin_len != 0 &&
 	    nd->mygateway.sin_addr.s_addr != 0) {
 		struct sockaddr_in mask, sin;
 
 		bzero((caddr_t)&mask, sizeof(mask));
 		sin = mask;
 		sin.sin_family = AF_INET;
 		sin.sin_len = sizeof(sin);
                 /* XXX MRT use table 0 for this sort of thing */
 		CURVNET_SET(TD_TO_VNET(td));
 		error = rtrequest_fib(RTM_ADD, (struct sockaddr *)&sin,
 		    (struct sockaddr *)&nd->mygateway,
 		    (struct sockaddr *)&mask,
 		    RTF_UP | RTF_GATEWAY, NULL, RT_DEFAULT_FIB);
 		CURVNET_RESTORE();
 		if (error)
 			panic("nfs_mountroot: RTM_ADD: %d", error);
 	}
 
 	/*
 	 * Create the rootfs mount point.
 	 */
 	nd->root_args.fh = nd->root_fh;
 	nd->root_args.fhsize = nd->root_fhsize;
 	l = ntohl(nd->root_saddr.sin_addr.s_addr);
 	snprintf(buf, sizeof(buf), "%ld.%ld.%ld.%ld:%s",
 		(l >> 24) & 0xff, (l >> 16) & 0xff,
 		(l >>  8) & 0xff, (l >>  0) & 0xff, nd->root_hostnam);
 	printf("NFS ROOT: %s\n", buf);
 	nd->root_args.hostname = buf;
 	if ((error = nfs_mountdiskless(buf,
 	    &nd->root_saddr, &nd->root_args, td, &vp, mp)) != 0) {
 		return (error);
 	}
 
 	/*
 	 * This is not really an nfs issue, but it is much easier to
 	 * set hostname here and then let the "/etc/rc.xxx" files
 	 * mount the right /var based upon its preset value.
 	 */
 	mtx_lock(&prison0.pr_mtx);
 	strlcpy(prison0.pr_hostname, nd->my_hostnam,
 	    sizeof(prison0.pr_hostname));
 	mtx_unlock(&prison0.pr_mtx);
 	inittodr(ntohl(nd->root_time));
 	return (0);
 }
 
 /*
  * Internal version of mount system call for diskless setup.
  */
 static int
 nfs_mountdiskless(char *path,
     struct sockaddr_in *sin, struct nfs_args *args, struct thread *td,
     struct vnode **vpp, struct mount *mp)
 {
 	struct sockaddr *nam;
 	int dirlen, error;
 	char *dirpath;
 
 	/*
 	 * Find the directory path in "path", which also has the server's
 	 * name/ip address in it.
 	 */
 	dirpath = strchr(path, ':');
 	if (dirpath != NULL)
 		dirlen = strlen(++dirpath);
 	else
 		dirlen = 0;
 	nam = sodupsockaddr((struct sockaddr *)sin, M_WAITOK);
 	if ((error = mountnfs(args, mp, nam, path, NULL, 0, dirpath, dirlen,
 	    NULL, 0, vpp, td->td_ucred, td, NFS_DEFAULT_NAMETIMEO, 
 	    NFS_DEFAULT_NEGNAMETIMEO, 0)) != 0) {
 		printf("nfs_mountroot: mount %s on /: %d\n", path, error);
 		return (error);
 	}
 	return (0);
 }
 
 static void
 nfs_sec_name(char *sec, int *flagsp)
 {
 	if (!strcmp(sec, "krb5"))
 		*flagsp |= NFSMNT_KERB;
 	else if (!strcmp(sec, "krb5i"))
 		*flagsp |= (NFSMNT_KERB | NFSMNT_INTEGRITY);
 	else if (!strcmp(sec, "krb5p"))
 		*flagsp |= (NFSMNT_KERB | NFSMNT_PRIVACY);
 }
 
 static void
 nfs_decode_args(struct mount *mp, struct nfsmount *nmp, struct nfs_args *argp,
     const char *hostname, struct ucred *cred, struct thread *td)
 {
 	int s;
 	int adjsock;
 	char *p;
 
 	s = splnet();
 
 	/*
 	 * Set read-only flag if requested; otherwise, clear it if this is
 	 * an update.  If this is not an update, then either the read-only
 	 * flag is already clear, or this is a root mount and it was set
 	 * intentionally at some previous point.
 	 */
 	if (vfs_getopt(mp->mnt_optnew, "ro", NULL, NULL) == 0) {
 		MNT_ILOCK(mp);
 		mp->mnt_flag |= MNT_RDONLY;
 		MNT_IUNLOCK(mp);
 	} else if (mp->mnt_flag & MNT_UPDATE) {
 		MNT_ILOCK(mp);
 		mp->mnt_flag &= ~MNT_RDONLY;
 		MNT_IUNLOCK(mp);
 	}
 
 	/*
 	 * Silently clear NFSMNT_NOCONN if it's a TCP mount, it makes
 	 * no sense in that context.  Also, set up appropriate retransmit
 	 * and soft timeout behavior.
 	 */
 	if (argp->sotype == SOCK_STREAM) {
 		nmp->nm_flag &= ~NFSMNT_NOCONN;
 		nmp->nm_timeo = NFS_MAXTIMEO;
 		if ((argp->flags & NFSMNT_NFSV4) != 0)
 			nmp->nm_retry = INT_MAX;
 		else
 			nmp->nm_retry = NFS_RETRANS_TCP;
 	}
 
 	/* Also clear RDIRPLUS if NFSv2, it crashes some servers */
 	if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0) {
 		argp->flags &= ~NFSMNT_RDIRPLUS;
 		nmp->nm_flag &= ~NFSMNT_RDIRPLUS;
 	}
 
 	/* Re-bind if a reserved port was requested and we weren't on one */
 	adjsock = !(nmp->nm_flag & NFSMNT_RESVPORT)
 		  && (argp->flags & NFSMNT_RESVPORT);
 	/* Also re-bind if we're switching to/from a connected UDP socket */
 	adjsock |= ((nmp->nm_flag & NFSMNT_NOCONN) !=
 		    (argp->flags & NFSMNT_NOCONN));
 
 	/* Update flags atomically.  Don't change the lock bits. */
 	nmp->nm_flag = argp->flags | nmp->nm_flag;
 	splx(s);
 
 	if ((argp->flags & NFSMNT_TIMEO) && argp->timeo > 0) {
 		nmp->nm_timeo = (argp->timeo * NFS_HZ + 5) / 10;
 		if (nmp->nm_timeo < NFS_MINTIMEO)
 			nmp->nm_timeo = NFS_MINTIMEO;
 		else if (nmp->nm_timeo > NFS_MAXTIMEO)
 			nmp->nm_timeo = NFS_MAXTIMEO;
 	}
 
 	if ((argp->flags & NFSMNT_RETRANS) && argp->retrans > 1) {
 		nmp->nm_retry = argp->retrans;
 		if (nmp->nm_retry > NFS_MAXREXMIT)
 			nmp->nm_retry = NFS_MAXREXMIT;
 	}
 
 	if ((argp->flags & NFSMNT_WSIZE) && argp->wsize > 0) {
 		nmp->nm_wsize = argp->wsize;
 		/*
 		 * Clip at the power of 2 below the size. There is an
 		 * issue (not isolated) that causes intermittent page
 		 * faults if this is not done.
 		 */
 		if (nmp->nm_wsize > NFS_FABLKSIZE)
 			nmp->nm_wsize = 1 << (fls(nmp->nm_wsize) - 1);
 		else
 			nmp->nm_wsize = NFS_FABLKSIZE;
 	}
 
 	if ((argp->flags & NFSMNT_RSIZE) && argp->rsize > 0) {
 		nmp->nm_rsize = argp->rsize;
 		/*
 		 * Clip at the power of 2 below the size. There is an
 		 * issue (not isolated) that causes intermittent page
 		 * faults if this is not done.
 		 */
 		if (nmp->nm_rsize > NFS_FABLKSIZE)
 			nmp->nm_rsize = 1 << (fls(nmp->nm_rsize) - 1);
 		else
 			nmp->nm_rsize = NFS_FABLKSIZE;
 	}
 
 	if ((argp->flags & NFSMNT_READDIRSIZE) && argp->readdirsize > 0) {
 		nmp->nm_readdirsize = argp->readdirsize;
 	}
 
 	if ((argp->flags & NFSMNT_ACREGMIN) && argp->acregmin >= 0)
 		nmp->nm_acregmin = argp->acregmin;
 	else
 		nmp->nm_acregmin = NFS_MINATTRTIMO;
 	if ((argp->flags & NFSMNT_ACREGMAX) && argp->acregmax >= 0)
 		nmp->nm_acregmax = argp->acregmax;
 	else
 		nmp->nm_acregmax = NFS_MAXATTRTIMO;
 	if ((argp->flags & NFSMNT_ACDIRMIN) && argp->acdirmin >= 0)
 		nmp->nm_acdirmin = argp->acdirmin;
 	else
 		nmp->nm_acdirmin = NFS_MINDIRATTRTIMO;
 	if ((argp->flags & NFSMNT_ACDIRMAX) && argp->acdirmax >= 0)
 		nmp->nm_acdirmax = argp->acdirmax;
 	else
 		nmp->nm_acdirmax = NFS_MAXDIRATTRTIMO;
 	if (nmp->nm_acdirmin > nmp->nm_acdirmax)
 		nmp->nm_acdirmin = nmp->nm_acdirmax;
 	if (nmp->nm_acregmin > nmp->nm_acregmax)
 		nmp->nm_acregmin = nmp->nm_acregmax;
 
 	if ((argp->flags & NFSMNT_READAHEAD) && argp->readahead >= 0) {
 		if (argp->readahead <= NFS_MAXRAHEAD)
 			nmp->nm_readahead = argp->readahead;
 		else
 			nmp->nm_readahead = NFS_MAXRAHEAD;
 	}
 	if ((argp->flags & NFSMNT_WCOMMITSIZE) && argp->wcommitsize >= 0) {
 		if (argp->wcommitsize < nmp->nm_wsize)
 			nmp->nm_wcommitsize = nmp->nm_wsize;
 		else
 			nmp->nm_wcommitsize = argp->wcommitsize;
 	}
 
 	adjsock |= ((nmp->nm_sotype != argp->sotype) ||
 		    (nmp->nm_soproto != argp->proto));
 
 	if (nmp->nm_client != NULL && adjsock) {
 		int haslock = 0, error = 0;
 
 		if (nmp->nm_sotype == SOCK_STREAM) {
 			error = newnfs_sndlock(&nmp->nm_sockreq.nr_lock);
 			if (!error)
 				haslock = 1;
 		}
 		if (!error) {
 		    newnfs_disconnect(&nmp->nm_sockreq);
 		    if (haslock)
 			newnfs_sndunlock(&nmp->nm_sockreq.nr_lock);
 		    nmp->nm_sotype = argp->sotype;
 		    nmp->nm_soproto = argp->proto;
 		    if (nmp->nm_sotype == SOCK_DGRAM)
 			while (newnfs_connect(nmp, &nmp->nm_sockreq,
 			    cred, td, 0)) {
 				printf("newnfs_args: retrying connect\n");
 				(void) nfs_catnap(PSOCK, 0, "nfscon");
 			}
 		}
 	} else {
 		nmp->nm_sotype = argp->sotype;
 		nmp->nm_soproto = argp->proto;
 	}
 
 	if (hostname != NULL) {
 		strlcpy(nmp->nm_hostname, hostname,
 		    sizeof(nmp->nm_hostname));
 		p = strchr(nmp->nm_hostname, ':');
 		if (p != NULL)
 			*p = '\0';
 	}
 }
 
 static const char *nfs_opts[] = { "from", "nfs_args",
     "noac", "noatime", "noexec", "suiddir", "nosuid", "nosymfollow", "union",
     "noclusterr", "noclusterw", "multilabel", "acls", "force", "update",
     "async", "noconn", "nolockd", "conn", "lockd", "intr", "rdirplus",
     "readdirsize", "soft", "hard", "mntudp", "tcp", "udp", "wsize", "rsize",
     "retrans", "actimeo", "acregmin", "acregmax", "acdirmin", "acdirmax",
     "resvport", "readahead", "hostname", "timeo", "timeout", "addr", "fh",
     "nfsv3", "sec", "principal", "nfsv4", "gssname", "allgssname", "dirpath",
     "minorversion", "nametimeo", "negnametimeo", "nocto", "noncontigwr",
     "pnfs", "wcommitsize",
     NULL };
 
 /*
  * VFS Operations.
  *
  * mount system call
  * It seems a bit dumb to copyinstr() the host and path here and then
  * bcopy() them in mountnfs(), but I wanted to detect errors before
  * doing the sockargs() call because sockargs() allocates an mbuf and
  * an error after that means that I have to release the mbuf.
  */
 /* ARGSUSED */
 static int
 nfs_mount(struct mount *mp)
 {
 	struct nfs_args args = {
 	    .version = NFS_ARGSVERSION,
 	    .addr = NULL,
 	    .addrlen = sizeof (struct sockaddr_in),
 	    .sotype = SOCK_STREAM,
 	    .proto = 0,
 	    .fh = NULL,
 	    .fhsize = 0,
 	    .flags = NFSMNT_RESVPORT,
 	    .wsize = NFS_WSIZE,
 	    .rsize = NFS_RSIZE,
 	    .readdirsize = NFS_READDIRSIZE,
 	    .timeo = 10,
 	    .retrans = NFS_RETRANS,
 	    .readahead = NFS_DEFRAHEAD,
 	    .wcommitsize = 0,			/* was: NQ_DEFLEASE */
 	    .hostname = NULL,
 	    .acregmin = NFS_MINATTRTIMO,
 	    .acregmax = NFS_MAXATTRTIMO,
 	    .acdirmin = NFS_MINDIRATTRTIMO,
 	    .acdirmax = NFS_MAXDIRATTRTIMO,
 	};
 	int error = 0, ret, len;
 	struct sockaddr *nam = NULL;
 	struct vnode *vp;
 	struct thread *td;
 	char hst[MNAMELEN];
 	u_char nfh[NFSX_FHMAX], krbname[100], dirpath[100], srvkrbname[100];
 	char *opt, *name, *secname;
 	int nametimeo = NFS_DEFAULT_NAMETIMEO;
 	int negnametimeo = NFS_DEFAULT_NEGNAMETIMEO;
 	int minvers = 0;
 	int dirlen, has_nfs_args_opt, krbnamelen, srvkrbnamelen;
 	size_t hstlen;
 
 	has_nfs_args_opt = 0;
 	if (vfs_filteropt(mp->mnt_optnew, nfs_opts)) {
 		error = EINVAL;
 		goto out;
 	}
 
 	td = curthread;
 	if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) {
 		error = nfs_mountroot(mp);
 		goto out;
 	}
 
 	nfscl_init();
 
 	/*
 	 * The old mount_nfs program passed the struct nfs_args
 	 * from userspace to kernel.  The new mount_nfs program
 	 * passes string options via nmount() from userspace to kernel
 	 * and we populate the struct nfs_args in the kernel.
 	 */
 	if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) {
 		error = vfs_copyopt(mp->mnt_optnew, "nfs_args", &args,
 		    sizeof(args));
 		if (error != 0)
 			goto out;
 
 		if (args.version != NFS_ARGSVERSION) {
 			error = EPROGMISMATCH;
 			goto out;
 		}
 		has_nfs_args_opt = 1;
 	}
 
 	/* Handle the new style options. */
 	if (vfs_getopt(mp->mnt_optnew, "noac", NULL, NULL) == 0) {
 		args.acdirmin = args.acdirmax =
 		    args.acregmin = args.acregmax = 0;
 		args.flags |= NFSMNT_ACDIRMIN | NFSMNT_ACDIRMAX |
 		    NFSMNT_ACREGMIN | NFSMNT_ACREGMAX;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "noconn", NULL, NULL) == 0)
 		args.flags |= NFSMNT_NOCONN;
 	if (vfs_getopt(mp->mnt_optnew, "conn", NULL, NULL) == 0)
 		args.flags &= ~NFSMNT_NOCONN;
 	if (vfs_getopt(mp->mnt_optnew, "nolockd", NULL, NULL) == 0)
 		args.flags |= NFSMNT_NOLOCKD;
 	if (vfs_getopt(mp->mnt_optnew, "lockd", NULL, NULL) == 0)
 		args.flags &= ~NFSMNT_NOLOCKD;
 	if (vfs_getopt(mp->mnt_optnew, "intr", NULL, NULL) == 0)
 		args.flags |= NFSMNT_INT;
 	if (vfs_getopt(mp->mnt_optnew, "rdirplus", NULL, NULL) == 0)
 		args.flags |= NFSMNT_RDIRPLUS;
 	if (vfs_getopt(mp->mnt_optnew, "resvport", NULL, NULL) == 0)
 		args.flags |= NFSMNT_RESVPORT;
 	if (vfs_getopt(mp->mnt_optnew, "noresvport", NULL, NULL) == 0)
 		args.flags &= ~NFSMNT_RESVPORT;
 	if (vfs_getopt(mp->mnt_optnew, "soft", NULL, NULL) == 0)
 		args.flags |= NFSMNT_SOFT;
 	if (vfs_getopt(mp->mnt_optnew, "hard", NULL, NULL) == 0)
 		args.flags &= ~NFSMNT_SOFT;
 	if (vfs_getopt(mp->mnt_optnew, "mntudp", NULL, NULL) == 0)
 		args.sotype = SOCK_DGRAM;
 	if (vfs_getopt(mp->mnt_optnew, "udp", NULL, NULL) == 0)
 		args.sotype = SOCK_DGRAM;
 	if (vfs_getopt(mp->mnt_optnew, "tcp", NULL, NULL) == 0)
 		args.sotype = SOCK_STREAM;
 	if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0)
 		args.flags |= NFSMNT_NFSV3;
 	if (vfs_getopt(mp->mnt_optnew, "nfsv4", NULL, NULL) == 0) {
 		args.flags |= NFSMNT_NFSV4;
 		args.sotype = SOCK_STREAM;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "allgssname", NULL, NULL) == 0)
 		args.flags |= NFSMNT_ALLGSSNAME;
 	if (vfs_getopt(mp->mnt_optnew, "nocto", NULL, NULL) == 0)
 		args.flags |= NFSMNT_NOCTO;
 	if (vfs_getopt(mp->mnt_optnew, "noncontigwr", NULL, NULL) == 0)
 		args.flags |= NFSMNT_NONCONTIGWR;
 	if (vfs_getopt(mp->mnt_optnew, "pnfs", NULL, NULL) == 0)
 		args.flags |= NFSMNT_PNFS;
 	if (vfs_getopt(mp->mnt_optnew, "readdirsize", (void **)&opt, NULL) == 0) {
 		if (opt == NULL) { 
 			vfs_mount_error(mp, "illegal readdirsize");
 			error = EINVAL;
 			goto out;
 		}
 		ret = sscanf(opt, "%d", &args.readdirsize);
 		if (ret != 1 || args.readdirsize <= 0) {
 			vfs_mount_error(mp, "illegal readdirsize: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_READDIRSIZE;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "readahead", (void **)&opt, NULL) == 0) {
 		if (opt == NULL) { 
 			vfs_mount_error(mp, "illegal readahead");
 			error = EINVAL;
 			goto out;
 		}
 		ret = sscanf(opt, "%d", &args.readahead);
 		if (ret != 1 || args.readahead <= 0) {
 			vfs_mount_error(mp, "illegal readahead: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_READAHEAD;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "wsize", (void **)&opt, NULL) == 0) {
 		if (opt == NULL) { 
 			vfs_mount_error(mp, "illegal wsize");
 			error = EINVAL;
 			goto out;
 		}
 		ret = sscanf(opt, "%d", &args.wsize);
 		if (ret != 1 || args.wsize <= 0) {
 			vfs_mount_error(mp, "illegal wsize: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_WSIZE;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "rsize", (void **)&opt, NULL) == 0) {
 		if (opt == NULL) { 
 			vfs_mount_error(mp, "illegal rsize");
 			error = EINVAL;
 			goto out;
 		}
 		ret = sscanf(opt, "%d", &args.rsize);
 		if (ret != 1 || args.rsize <= 0) {
 			vfs_mount_error(mp, "illegal rsize: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_RSIZE;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "retrans", (void **)&opt, NULL) == 0) {
 		if (opt == NULL) { 
 			vfs_mount_error(mp, "illegal retrans");
 			error = EINVAL;
 			goto out;
 		}
 		ret = sscanf(opt, "%d", &args.retrans);
 		if (ret != 1 || args.retrans <= 0) {
 			vfs_mount_error(mp, "illegal retrans: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_RETRANS;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "actimeo", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.acregmin);
 		if (ret != 1 || args.acregmin < 0) {
 			vfs_mount_error(mp, "illegal actimeo: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.acdirmin = args.acdirmax = args.acregmax = args.acregmin;
 		args.flags |= NFSMNT_ACDIRMIN | NFSMNT_ACDIRMAX |
 		    NFSMNT_ACREGMIN | NFSMNT_ACREGMAX;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "acregmin", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.acregmin);
 		if (ret != 1 || args.acregmin < 0) {
 			vfs_mount_error(mp, "illegal acregmin: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_ACREGMIN;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "acregmax", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.acregmax);
 		if (ret != 1 || args.acregmax < 0) {
 			vfs_mount_error(mp, "illegal acregmax: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_ACREGMAX;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "acdirmin", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.acdirmin);
 		if (ret != 1 || args.acdirmin < 0) {
 			vfs_mount_error(mp, "illegal acdirmin: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_ACDIRMIN;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "acdirmax", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.acdirmax);
 		if (ret != 1 || args.acdirmax < 0) {
 			vfs_mount_error(mp, "illegal acdirmax: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_ACDIRMAX;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "wcommitsize", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.wcommitsize);
 		if (ret != 1 || args.wcommitsize < 0) {
 			vfs_mount_error(mp, "illegal wcommitsize: %s", opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_WCOMMITSIZE;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "timeo", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.timeo);
 		if (ret != 1 || args.timeo <= 0) {
 			vfs_mount_error(mp, "illegal timeo: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_TIMEO;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "timeout", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &args.timeo);
 		if (ret != 1 || args.timeo <= 0) {
 			vfs_mount_error(mp, "illegal timeout: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 		args.flags |= NFSMNT_TIMEO;
 	}
 	if (vfs_getopt(mp->mnt_optnew, "nametimeo", (void **)&opt, NULL) == 0) {
 		ret = sscanf(opt, "%d", &nametimeo);
 		if (ret != 1 || nametimeo < 0) {
 			vfs_mount_error(mp, "illegal nametimeo: %s", opt);
 			error = EINVAL;
 			goto out;
 		}
 	}
 	if (vfs_getopt(mp->mnt_optnew, "negnametimeo", (void **)&opt, NULL)
 	    == 0) {
 		ret = sscanf(opt, "%d", &negnametimeo);
 		if (ret != 1 || negnametimeo < 0) {
 			vfs_mount_error(mp, "illegal negnametimeo: %s",
 			    opt);
 			error = EINVAL;
 			goto out;
 		}
 	}
 	if (vfs_getopt(mp->mnt_optnew, "minorversion", (void **)&opt, NULL) ==
 	    0) {
 		ret = sscanf(opt, "%d", &minvers);
 		if (ret != 1 || minvers < 0 || minvers > 1 ||
 		    (args.flags & NFSMNT_NFSV4) == 0) {
 			vfs_mount_error(mp, "illegal minorversion: %s", opt);
 			error = EINVAL;
 			goto out;
 		}
 	}
 	if (vfs_getopt(mp->mnt_optnew, "sec",
 		(void **) &secname, NULL) == 0)
 		nfs_sec_name(secname, &args.flags);
 
 	if (mp->mnt_flag & MNT_UPDATE) {
 		struct nfsmount *nmp = VFSTONFS(mp);
 
 		if (nmp == NULL) {
 			error = EIO;
 			goto out;
 		}
 
 		/*
 		 * If a change from TCP->UDP is done and there are thread(s)
 		 * that have I/O RPC(s) in progress with a transfer size
 		 * greater than NFS_MAXDGRAMDATA, those thread(s) will be
 		 * hung, retrying the RPC(s) forever. Usually these threads
 		 * will be seen doing an uninterruptible sleep on wait channel
 		 * "nfsreq".
 		 */
 		if (args.sotype == SOCK_DGRAM && nmp->nm_sotype == SOCK_STREAM)
 			tprintf(td->td_proc, LOG_WARNING,
 	"Warning: mount -u that changes TCP->UDP can result in hung threads\n");
 
 		/*
 		 * When doing an update, we can't change version,
 		 * security, switch lockd strategies or change cookie
 		 * translation
 		 */
 		args.flags = (args.flags &
 		    ~(NFSMNT_NFSV3 |
 		      NFSMNT_NFSV4 |
 		      NFSMNT_KERB |
 		      NFSMNT_INTEGRITY |
 		      NFSMNT_PRIVACY |
 		      NFSMNT_NOLOCKD /*|NFSMNT_XLATECOOKIE*/)) |
 		    (nmp->nm_flag &
 			(NFSMNT_NFSV3 |
 			 NFSMNT_NFSV4 |
 			 NFSMNT_KERB |
 			 NFSMNT_INTEGRITY |
 			 NFSMNT_PRIVACY |
 			 NFSMNT_NOLOCKD /*|NFSMNT_XLATECOOKIE*/));
 		nfs_decode_args(mp, nmp, &args, NULL, td->td_ucred, td);
 		goto out;
 	}
 
 	/*
 	 * Make the nfs_ip_paranoia sysctl serve as the default connection
 	 * or no-connection mode for those protocols that support 
 	 * no-connection mode (the flag will be cleared later for protocols
 	 * that do not support no-connection mode).  This will allow a client
 	 * to receive replies from a different IP than the request was
 	 * sent to.  Note: default value for nfs_ip_paranoia is 1 (paranoid),
 	 * not 0.
 	 */
 	if (nfs_ip_paranoia == 0)
 		args.flags |= NFSMNT_NOCONN;
 
 	if (has_nfs_args_opt != 0) {
 		/*
 		 * In the 'nfs_args' case, the pointers in the args
 		 * structure are in userland - we copy them in here.
 		 */
 		if (args.fhsize < 0 || args.fhsize > NFSX_V3FHMAX) {
 			vfs_mount_error(mp, "Bad file handle");
 			error = EINVAL;
 			goto out;
 		}
 		error = copyin((caddr_t)args.fh, (caddr_t)nfh,
 		    args.fhsize);
 		if (error != 0)
 			goto out;
 		error = copyinstr(args.hostname, hst, MNAMELEN - 1, &hstlen);
 		if (error != 0)
 			goto out;
 		bzero(&hst[hstlen], MNAMELEN - hstlen);
 		args.hostname = hst;
 		/* getsockaddr() call must be after above copyin() calls */
 		error = getsockaddr(&nam, (caddr_t)args.addr,
 		    args.addrlen);
 		if (error != 0)
 			goto out;
 	} else {
 		if (vfs_getopt(mp->mnt_optnew, "fh", (void **)&args.fh,
 		    &args.fhsize) == 0) {
 			if (args.fhsize < 0 || args.fhsize > NFSX_FHMAX) {
 				vfs_mount_error(mp, "Bad file handle");
 				error = EINVAL;
 				goto out;
 			}
 			bcopy(args.fh, nfh, args.fhsize);
 		} else {
 			args.fhsize = 0;
 		}
 		(void) vfs_getopt(mp->mnt_optnew, "hostname",
 		    (void **)&args.hostname, &len);
 		if (args.hostname == NULL) {
 			vfs_mount_error(mp, "Invalid hostname");
 			error = EINVAL;
 			goto out;
 		}
 		bcopy(args.hostname, hst, MNAMELEN);
 		hst[MNAMELEN - 1] = '\0';
 	}
 
 	if (vfs_getopt(mp->mnt_optnew, "principal", (void **)&name, NULL) == 0)
 		strlcpy(srvkrbname, name, sizeof (srvkrbname));
 	else
 		snprintf(srvkrbname, sizeof (srvkrbname), "nfs@%s", hst);
 	srvkrbnamelen = strlen(srvkrbname);
 
 	if (vfs_getopt(mp->mnt_optnew, "gssname", (void **)&name, NULL) == 0)
 		strlcpy(krbname, name, sizeof (krbname));
 	else
 		krbname[0] = '\0';
 	krbnamelen = strlen(krbname);
 
 	if (vfs_getopt(mp->mnt_optnew, "dirpath", (void **)&name, NULL) == 0)
 		strlcpy(dirpath, name, sizeof (dirpath));
 	else
 		dirpath[0] = '\0';
 	dirlen = strlen(dirpath);
 
 	if (has_nfs_args_opt == 0) {
 		if (vfs_getopt(mp->mnt_optnew, "addr",
 		    (void **)&args.addr, &args.addrlen) == 0) {
 			if (args.addrlen > SOCK_MAXADDRLEN) {
 				error = ENAMETOOLONG;
 				goto out;
 			}
 			nam = malloc(args.addrlen, M_SONAME, M_WAITOK);
 			bcopy(args.addr, nam, args.addrlen);
 			nam->sa_len = args.addrlen;
 		} else {
 			vfs_mount_error(mp, "No server address");
 			error = EINVAL;
 			goto out;
 		}
 	}
 
 	args.fh = nfh;
 	error = mountnfs(&args, mp, nam, hst, krbname, krbnamelen, dirpath,
 	    dirlen, srvkrbname, srvkrbnamelen, &vp, td->td_ucred, td,
 	    nametimeo, negnametimeo, minvers);
 out:
 	if (!error) {
 		MNT_ILOCK(mp);
-		mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_NO_IOPF;
+		mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_NO_IOPF |
+		    MNTK_USES_BCACHE;
 		MNT_IUNLOCK(mp);
 	}
 	return (error);
 }
 
 
 /*
  * VFS Operations.
  *
  * mount system call
  * It seems a bit dumb to copyinstr() the host and path here and then
  * bcopy() them in mountnfs(), but I wanted to detect errors before
  * doing the sockargs() call because sockargs() allocates an mbuf and
  * an error after that means that I have to release the mbuf.
  */
 /* ARGSUSED */
 static int
 nfs_cmount(struct mntarg *ma, void *data, uint64_t flags)
 {
 	int error;
 	struct nfs_args args;
 
 	error = copyin(data, &args, sizeof (struct nfs_args));
 	if (error)
 		return (error);
 
 	ma = mount_arg(ma, "nfs_args", &args, sizeof args);
 
 	error = kernel_mount(ma, flags);
 	return (error);
 }
 
 /*
  * Common code for mount and mountroot
  */
 static int
 mountnfs(struct nfs_args *argp, struct mount *mp, struct sockaddr *nam,
     char *hst, u_char *krbname, int krbnamelen, u_char *dirpath, int dirlen,
     u_char *srvkrbname, int srvkrbnamelen, struct vnode **vpp,
     struct ucred *cred, struct thread *td, int nametimeo, int negnametimeo,
     int minvers)
 {
 	struct nfsmount *nmp;
 	struct nfsnode *np;
 	int error, trycnt, ret;
 	struct nfsvattr nfsva;
 	struct nfsclclient *clp;
 	struct nfsclds *dsp, *tdsp;
 	uint32_t lease;
 	static u_int64_t clval = 0;
 
 	NFSCL_DEBUG(3, "in mnt\n");
 	clp = NULL;
 	if (mp->mnt_flag & MNT_UPDATE) {
 		nmp = VFSTONFS(mp);
 		printf("%s: MNT_UPDATE is no longer handled here\n", __func__);
 		FREE(nam, M_SONAME);
 		return (0);
 	} else {
 		MALLOC(nmp, struct nfsmount *, sizeof (struct nfsmount) +
 		    krbnamelen + dirlen + srvkrbnamelen + 2,
 		    M_NEWNFSMNT, M_WAITOK | M_ZERO);
 		TAILQ_INIT(&nmp->nm_bufq);
 		if (clval == 0)
 			clval = (u_int64_t)nfsboottime.tv_sec;
 		nmp->nm_clval = clval++;
 		nmp->nm_krbnamelen = krbnamelen;
 		nmp->nm_dirpathlen = dirlen;
 		nmp->nm_srvkrbnamelen = srvkrbnamelen;
 		if (td->td_ucred->cr_uid != (uid_t)0) {
 			/*
 			 * nm_uid is used to get KerberosV credentials for
 			 * the nfsv4 state handling operations if there is
 			 * no host based principal set. Use the uid of
 			 * this user if not root, since they are doing the
 			 * mount. I don't think setting this for root will
 			 * work, since root normally does not have user
 			 * credentials in a credentials cache.
 			 */
 			nmp->nm_uid = td->td_ucred->cr_uid;
 		} else {
 			/*
 			 * Just set to -1, so it won't be used.
 			 */
 			nmp->nm_uid = (uid_t)-1;
 		}
 
 		/* Copy and null terminate all the names */
 		if (nmp->nm_krbnamelen > 0) {
 			bcopy(krbname, nmp->nm_krbname, nmp->nm_krbnamelen);
 			nmp->nm_name[nmp->nm_krbnamelen] = '\0';
 		}
 		if (nmp->nm_dirpathlen > 0) {
 			bcopy(dirpath, NFSMNT_DIRPATH(nmp),
 			    nmp->nm_dirpathlen);
 			nmp->nm_name[nmp->nm_krbnamelen + nmp->nm_dirpathlen
 			    + 1] = '\0';
 		}
 		if (nmp->nm_srvkrbnamelen > 0) {
 			bcopy(srvkrbname, NFSMNT_SRVKRBNAME(nmp),
 			    nmp->nm_srvkrbnamelen);
 			nmp->nm_name[nmp->nm_krbnamelen + nmp->nm_dirpathlen
 			    + nmp->nm_srvkrbnamelen + 2] = '\0';
 		}
 		nmp->nm_sockreq.nr_cred = crhold(cred);
 		mtx_init(&nmp->nm_sockreq.nr_mtx, "nfssock", NULL, MTX_DEF);
 		mp->mnt_data = nmp;
 		nmp->nm_getinfo = nfs_getnlminfo;
 		nmp->nm_vinvalbuf = ncl_vinvalbuf;
 	}
 	vfs_getnewfsid(mp);
 	nmp->nm_mountp = mp;
 	mtx_init(&nmp->nm_mtx, "NFSmount lock", NULL, MTX_DEF | MTX_DUPOK);
 
 	/*
 	 * Since nfs_decode_args() might optionally set them, these
 	 * need to be set to defaults before the call, so that the
 	 * optional settings aren't overwritten.
 	 */
 	nmp->nm_nametimeo = nametimeo;
 	nmp->nm_negnametimeo = negnametimeo;
 	nmp->nm_timeo = NFS_TIMEO;
 	nmp->nm_retry = NFS_RETRANS;
 	nmp->nm_readahead = NFS_DEFRAHEAD;
 	if (desiredvnodes >= 11000)
 		nmp->nm_wcommitsize = hibufspace / (desiredvnodes / 1000);
 	else
 		nmp->nm_wcommitsize = hibufspace / 10;
 	if ((argp->flags & NFSMNT_NFSV4) != 0)
 		nmp->nm_minorvers = minvers;
 	else
 		nmp->nm_minorvers = 0;
 
 	nfs_decode_args(mp, nmp, argp, hst, cred, td);
 
 	/*
 	 * V2 can only handle 32 bit filesizes.  A 4GB-1 limit may be too
 	 * high, depending on whether we end up with negative offsets in
 	 * the client or server somewhere.  2GB-1 may be safer.
 	 *
 	 * For V3, ncl_fsinfo will adjust this as necessary.  Assume maximum
 	 * that we can handle until we find out otherwise.
 	 */
 	if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0)
 		nmp->nm_maxfilesize = 0xffffffffLL;
 	else
 		nmp->nm_maxfilesize = OFF_MAX;
 
 	if ((argp->flags & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0) {
 		nmp->nm_wsize = NFS_WSIZE;
 		nmp->nm_rsize = NFS_RSIZE;
 		nmp->nm_readdirsize = NFS_READDIRSIZE;
 	}
 	nmp->nm_numgrps = NFS_MAXGRPS;
 	nmp->nm_tprintf_delay = nfs_tprintf_delay;
 	if (nmp->nm_tprintf_delay < 0)
 		nmp->nm_tprintf_delay = 0;
 	nmp->nm_tprintf_initial_delay = nfs_tprintf_initial_delay;
 	if (nmp->nm_tprintf_initial_delay < 0)
 		nmp->nm_tprintf_initial_delay = 0;
 	nmp->nm_fhsize = argp->fhsize;
 	if (nmp->nm_fhsize > 0)
 		bcopy((caddr_t)argp->fh, (caddr_t)nmp->nm_fh, argp->fhsize);
 	bcopy(hst, mp->mnt_stat.f_mntfromname, MNAMELEN);
 	nmp->nm_nam = nam;
 	/* Set up the sockets and per-host congestion */
 	nmp->nm_sotype = argp->sotype;
 	nmp->nm_soproto = argp->proto;
 	nmp->nm_sockreq.nr_prog = NFS_PROG;
 	if ((argp->flags & NFSMNT_NFSV4))
 		nmp->nm_sockreq.nr_vers = NFS_VER4;
 	else if ((argp->flags & NFSMNT_NFSV3))
 		nmp->nm_sockreq.nr_vers = NFS_VER3;
 	else
 		nmp->nm_sockreq.nr_vers = NFS_VER2;
 
 
 	if ((error = newnfs_connect(nmp, &nmp->nm_sockreq, cred, td, 0)))
 		goto bad;
 	/* For NFSv4.1, get the clientid now. */
 	if (nmp->nm_minorvers > 0) {
 		NFSCL_DEBUG(3, "at getcl\n");
 		error = nfscl_getcl(mp, cred, td, 0, &clp);
 		NFSCL_DEBUG(3, "aft getcl=%d\n", error);
 		if (error != 0)
 			goto bad;
 	}
 
 	if (nmp->nm_fhsize == 0 && (nmp->nm_flag & NFSMNT_NFSV4) &&
 	    nmp->nm_dirpathlen > 0) {
 		NFSCL_DEBUG(3, "in dirp\n");
 		/*
 		 * If the fhsize on the mount point == 0 for V4, the mount
 		 * path needs to be looked up.
 		 */
 		trycnt = 3;
 		do {
 			error = nfsrpc_getdirpath(nmp, NFSMNT_DIRPATH(nmp),
 			    cred, td);
 			NFSCL_DEBUG(3, "aft dirp=%d\n", error);
 			if (error)
 				(void) nfs_catnap(PZERO, error, "nfsgetdirp");
 		} while (error && --trycnt > 0);
 		if (error) {
 			error = nfscl_maperr(td, error, (uid_t)0, (gid_t)0);
 			goto bad;
 		}
 	}
 
 	/*
 	 * A reference count is needed on the nfsnode representing the
 	 * remote root.  If this object is not persistent, then backward
 	 * traversals of the mount point (i.e. "..") will not work if
 	 * the nfsnode gets flushed out of the cache. Ufs does not have
 	 * this problem, because one can identify root inodes by their
 	 * number == ROOTINO (2).
 	 */
 	if (nmp->nm_fhsize > 0) {
 		/*
 		 * Set f_iosize to NFS_DIRBLKSIZ so that bo_bsize gets set
 		 * non-zero for the root vnode. f_iosize will be set correctly
 		 * by nfs_statfs() before any I/O occurs.
 		 */
 		mp->mnt_stat.f_iosize = NFS_DIRBLKSIZ;
 		error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np,
 		    LK_EXCLUSIVE);
 		if (error)
 			goto bad;
 		*vpp = NFSTOV(np);
 	
 		/*
 		 * Get file attributes and transfer parameters for the
 		 * mountpoint.  This has the side effect of filling in
 		 * (*vpp)->v_type with the correct value.
 		 */
 		ret = nfsrpc_getattrnovp(nmp, nmp->nm_fh, nmp->nm_fhsize, 1,
 		    cred, td, &nfsva, NULL, &lease);
 		if (ret) {
 			/*
 			 * Just set default values to get things going.
 			 */
 			NFSBZERO((caddr_t)&nfsva, sizeof (struct nfsvattr));
 			nfsva.na_vattr.va_type = VDIR;
 			nfsva.na_vattr.va_mode = 0777;
 			nfsva.na_vattr.va_nlink = 100;
 			nfsva.na_vattr.va_uid = (uid_t)0;
 			nfsva.na_vattr.va_gid = (gid_t)0;
 			nfsva.na_vattr.va_fileid = 2;
 			nfsva.na_vattr.va_gen = 1;
 			nfsva.na_vattr.va_blocksize = NFS_FABLKSIZE;
 			nfsva.na_vattr.va_size = 512 * 1024;
 			lease = 60;
 		}
 		(void) nfscl_loadattrcache(vpp, &nfsva, NULL, NULL, 0, 1);
 		if (nmp->nm_minorvers > 0) {
 			NFSCL_DEBUG(3, "lease=%d\n", (int)lease);
 			NFSLOCKCLSTATE();
 			clp->nfsc_renew = NFSCL_RENEW(lease);
 			clp->nfsc_expire = NFSD_MONOSEC + clp->nfsc_renew;
 			clp->nfsc_clientidrev++;
 			if (clp->nfsc_clientidrev == 0)
 				clp->nfsc_clientidrev++;
 			NFSUNLOCKCLSTATE();
 			/*
 			 * Mount will succeed, so the renew thread can be
 			 * started now.
 			 */
 			nfscl_start_renewthread(clp);
 			nfscl_clientrelease(clp);
 		}
 		if (argp->flags & NFSMNT_NFSV3)
 			ncl_fsinfo(nmp, *vpp, cred, td);
 	
 		/* Mark if the mount point supports NFSv4 ACLs. */
 		if ((argp->flags & NFSMNT_NFSV4) != 0 && nfsrv_useacl != 0 &&
 		    ret == 0 &&
 		    NFSISSET_ATTRBIT(&nfsva.na_suppattr, NFSATTRBIT_ACL)) {
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_NFS4ACLS;
 			MNT_IUNLOCK(mp);
 		}
 	
 		/*
 		 * Lose the lock but keep the ref.
 		 */
 		NFSVOPUNLOCK(*vpp, 0);
 		return (0);
 	}
 	error = EIO;
 
 bad:
 	if (clp != NULL)
 		nfscl_clientrelease(clp);
 	newnfs_disconnect(&nmp->nm_sockreq);
 	crfree(nmp->nm_sockreq.nr_cred);
 	if (nmp->nm_sockreq.nr_auth != NULL)
 		AUTH_DESTROY(nmp->nm_sockreq.nr_auth);
 	mtx_destroy(&nmp->nm_sockreq.nr_mtx);
 	mtx_destroy(&nmp->nm_mtx);
 	if (nmp->nm_clp != NULL) {
 		NFSLOCKCLSTATE();
 		LIST_REMOVE(nmp->nm_clp, nfsc_list);
 		NFSUNLOCKCLSTATE();
 		free(nmp->nm_clp, M_NFSCLCLIENT);
 	}
 	TAILQ_FOREACH_SAFE(dsp, &nmp->nm_sess, nfsclds_list, tdsp)
 		nfscl_freenfsclds(dsp);
 	FREE(nmp, M_NEWNFSMNT);
 	FREE(nam, M_SONAME);
 	return (error);
 }
 
 /*
  * unmount system call
  */
 static int
 nfs_unmount(struct mount *mp, int mntflags)
 {
 	struct thread *td;
 	struct nfsmount *nmp;
 	int error, flags = 0, i, trycnt = 0;
 	struct nfsclds *dsp, *tdsp;
 
 	td = curthread;
 
 	if (mntflags & MNT_FORCE)
 		flags |= FORCECLOSE;
 	nmp = VFSTONFS(mp);
 	/*
 	 * Goes something like this..
 	 * - Call vflush() to clear out vnodes for this filesystem
 	 * - Close the socket
 	 * - Free up the data structures
 	 */
 	/* In the forced case, cancel any outstanding requests. */
 	if (mntflags & MNT_FORCE) {
 		error = newnfs_nmcancelreqs(nmp);
 		if (error)
 			goto out;
 		/* For a forced close, get rid of the renew thread now */
 		nfscl_umount(nmp, td);
 	}
 	/* We hold 1 extra ref on the root vnode; see comment in mountnfs(). */
 	do {
 		error = vflush(mp, 1, flags, td);
 		if ((mntflags & MNT_FORCE) && error != 0 && ++trycnt < 30)
 			(void) nfs_catnap(PSOCK, error, "newndm");
 	} while ((mntflags & MNT_FORCE) && error != 0 && trycnt < 30);
 	if (error)
 		goto out;
 
 	/*
 	 * We are now committed to the unmount.
 	 */
 	if ((mntflags & MNT_FORCE) == 0)
 		nfscl_umount(nmp, td);
 	/* Make sure no nfsiods are assigned to this mount. */
 	mtx_lock(&ncl_iod_mutex);
 	for (i = 0; i < NFS_MAXASYNCDAEMON; i++)
 		if (ncl_iodmount[i] == nmp) {
 			ncl_iodwant[i] = NFSIOD_AVAILABLE;
 			ncl_iodmount[i] = NULL;
 		}
 	mtx_unlock(&ncl_iod_mutex);
 	newnfs_disconnect(&nmp->nm_sockreq);
 	crfree(nmp->nm_sockreq.nr_cred);
 	FREE(nmp->nm_nam, M_SONAME);
 	if (nmp->nm_sockreq.nr_auth != NULL)
 		AUTH_DESTROY(nmp->nm_sockreq.nr_auth);
 	mtx_destroy(&nmp->nm_sockreq.nr_mtx);
 	mtx_destroy(&nmp->nm_mtx);
 	TAILQ_FOREACH_SAFE(dsp, &nmp->nm_sess, nfsclds_list, tdsp)
 		nfscl_freenfsclds(dsp);
 	FREE(nmp, M_NEWNFSMNT);
 out:
 	return (error);
 }
 
 /*
  * Return root of a filesystem
  */
 static int
 nfs_root(struct mount *mp, int flags, struct vnode **vpp)
 {
 	struct vnode *vp;
 	struct nfsmount *nmp;
 	struct nfsnode *np;
 	int error;
 
 	nmp = VFSTONFS(mp);
 	error = ncl_nget(mp, nmp->nm_fh, nmp->nm_fhsize, &np, flags);
 	if (error)
 		return error;
 	vp = NFSTOV(np);
 	/*
 	 * Get transfer parameters and attributes for root vnode once.
 	 */
 	mtx_lock(&nmp->nm_mtx);
 	if (NFSHASNFSV3(nmp) && !NFSHASGOTFSINFO(nmp)) {
 		mtx_unlock(&nmp->nm_mtx);
 		ncl_fsinfo(nmp, vp, curthread->td_ucred, curthread);
 	} else 
 		mtx_unlock(&nmp->nm_mtx);
 	if (vp->v_type == VNON)
 	    vp->v_type = VDIR;
 	vp->v_vflag |= VV_ROOT;
 	*vpp = vp;
 	return (0);
 }
 
 /*
  * Flush out the buffer cache
  */
 /* ARGSUSED */
 static int
 nfs_sync(struct mount *mp, int waitfor)
 {
 	struct vnode *vp, *mvp;
 	struct thread *td;
 	int error, allerror = 0;
 
 	td = curthread;
 
 	MNT_ILOCK(mp);
 	/*
 	 * If a forced dismount is in progress, return from here so that
 	 * the umount(2) syscall doesn't get stuck in VFS_SYNC() before
 	 * calling VFS_UNMOUNT().
 	 */
 	if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0) {
 		MNT_IUNLOCK(mp);
 		return (EBADF);
 	}
 	MNT_IUNLOCK(mp);
 
 	/*
 	 * Force stale buffer cache information to be flushed.
 	 */
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		/* XXX Racy bv_cnt check. */
 		if (NFSVOPISLOCKED(vp) || vp->v_bufobj.bo_dirty.bv_cnt == 0 ||
 		    waitfor == MNT_LAZY) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) {
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			goto loop;
 		}
 		error = VOP_FSYNC(vp, waitfor, td);
 		if (error)
 			allerror = error;
 		NFSVOPUNLOCK(vp, 0);
 		vrele(vp);
 	}
 	return (allerror);
 }
 
 static int
 nfs_sysctl(struct mount *mp, fsctlop_t op, struct sysctl_req *req)
 {
 	struct nfsmount *nmp = VFSTONFS(mp);
 	struct vfsquery vq;
 	int error;
 
 	bzero(&vq, sizeof(vq));
 	switch (op) {
 #if 0
 	case VFS_CTL_NOLOCKS:
 		val = (nmp->nm_flag & NFSMNT_NOLOCKS) ? 1 : 0;
  		if (req->oldptr != NULL) {
  			error = SYSCTL_OUT(req, &val, sizeof(val));
  			if (error)
  				return (error);
  		}
  		if (req->newptr != NULL) {
  			error = SYSCTL_IN(req, &val, sizeof(val));
  			if (error)
  				return (error);
 			if (val)
 				nmp->nm_flag |= NFSMNT_NOLOCKS;
 			else
 				nmp->nm_flag &= ~NFSMNT_NOLOCKS;
  		}
 		break;
 #endif
 	case VFS_CTL_QUERY:
 		mtx_lock(&nmp->nm_mtx);
 		if (nmp->nm_state & NFSSTA_TIMEO)
 			vq.vq_flags |= VQ_NOTRESP;
 		mtx_unlock(&nmp->nm_mtx);
 #if 0
 		if (!(nmp->nm_flag & NFSMNT_NOLOCKS) &&
 		    (nmp->nm_state & NFSSTA_LOCKTIMEO))
 			vq.vq_flags |= VQ_NOTRESPLOCK;
 #endif
 		error = SYSCTL_OUT(req, &vq, sizeof(vq));
 		break;
  	case VFS_CTL_TIMEO:
  		if (req->oldptr != NULL) {
  			error = SYSCTL_OUT(req, &nmp->nm_tprintf_initial_delay,
  			    sizeof(nmp->nm_tprintf_initial_delay));
  			if (error)
  				return (error);
  		}
  		if (req->newptr != NULL) {
 			error = vfs_suser(mp, req->td);
 			if (error)
 				return (error);
  			error = SYSCTL_IN(req, &nmp->nm_tprintf_initial_delay,
  			    sizeof(nmp->nm_tprintf_initial_delay));
  			if (error)
  				return (error);
  			if (nmp->nm_tprintf_initial_delay < 0)
  				nmp->nm_tprintf_initial_delay = 0;
  		}
 		break;
 	default:
 		return (ENOTSUP);
 	}
 	return (0);
 }
 
 /*
  * Purge any RPCs in progress, so that they will all return errors.
  * This allows dounmount() to continue as far as VFS_UNMOUNT() for a
  * forced dismount.
  */
 static void
 nfs_purge(struct mount *mp)
 {
 	struct nfsmount *nmp = VFSTONFS(mp);
 
 	newnfs_nmcancelreqs(nmp);
 }
 
 /*
  * Extract the information needed by the nlm from the nfs vnode.
  */
 static void
 nfs_getnlminfo(struct vnode *vp, uint8_t *fhp, size_t *fhlenp,
     struct sockaddr_storage *sp, int *is_v3p, off_t *sizep,
     struct timeval *timeop)
 {
 	struct nfsmount *nmp;
 	struct nfsnode *np = VTONFS(vp);
 
 	nmp = VFSTONFS(vp->v_mount);
 	if (fhlenp != NULL)
 		*fhlenp = (size_t)np->n_fhp->nfh_len;
 	if (fhp != NULL)
 		bcopy(np->n_fhp->nfh_fh, fhp, np->n_fhp->nfh_len);
 	if (sp != NULL)
 		bcopy(nmp->nm_nam, sp, min(nmp->nm_nam->sa_len, sizeof(*sp)));
 	if (is_v3p != NULL)
 		*is_v3p = NFS_ISV3(vp);
 	if (sizep != NULL)
 		*sizep = np->n_size;
 	if (timeop != NULL) {
 		timeop->tv_sec = nmp->nm_timeo / NFS_HZ;
 		timeop->tv_usec = (nmp->nm_timeo % NFS_HZ) * (1000000 / NFS_HZ);
 	}
 }
 
 /*
  * This function prints out an option name, based on the conditional
  * argument.
  */
 static __inline void nfscl_printopt(struct nfsmount *nmp, int testval,
     char *opt, char **buf, size_t *blen)
 {
 	int len;
 
 	if (testval != 0 && *blen > strlen(opt)) {
 		len = snprintf(*buf, *blen, "%s", opt);
 		if (len != strlen(opt))
 			printf("EEK!!\n");
 		*buf += len;
 		*blen -= len;
 	}
 }
 
 /*
  * This function prints out an option's integer value.
  */
 static __inline void nfscl_printoptval(struct nfsmount *nmp, int optval,
     char *opt, char **buf, size_t *blen)
 {
 	int len;
 
 	if (*blen > strlen(opt) + 1) {
 		/* Could result in truncated output string. */
 		len = snprintf(*buf, *blen, "%s=%d", opt, optval);
 		if (len < *blen) {
 			*buf += len;
 			*blen -= len;
 		}
 	}
 }
 
 /*
  * Load the option flags and values into the buffer.
  */
 void nfscl_retopts(struct nfsmount *nmp, char *buffer, size_t buflen)
 {
 	char *buf;
 	size_t blen;
 
 	buf = buffer;
 	blen = buflen;
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NFSV4) != 0, "nfsv4", &buf,
 	    &blen);
 	if ((nmp->nm_flag & NFSMNT_NFSV4) != 0) {
 		nfscl_printoptval(nmp, nmp->nm_minorvers, ",minorversion", &buf,
 		    &blen);
 		nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_PNFS) != 0, ",pnfs",
 		    &buf, &blen);
 	}
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NFSV3) != 0, "nfsv3", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NFSV3 | NFSMNT_NFSV4)) == 0,
 	    "nfsv2", &buf, &blen);
 	nfscl_printopt(nmp, nmp->nm_sotype == SOCK_STREAM, ",tcp", &buf, &blen);
 	nfscl_printopt(nmp, nmp->nm_sotype != SOCK_STREAM, ",udp", &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_RESVPORT) != 0, ",resvport",
 	    &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCONN) != 0, ",noconn",
 	    &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_SOFT) == 0, ",hard", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_SOFT) != 0, ",soft", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_INT) != 0, ",intr", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCTO) == 0, ",cto", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NOCTO) != 0, ",nocto", &buf,
 	    &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_NONCONTIGWR) != 0,
 	    ",noncontigwr", &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NOLOCKD | NFSMNT_NFSV4)) ==
 	    0, ",lockd", &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_NOLOCKD | NFSMNT_NFSV4)) ==
 	    NFSMNT_NOLOCKD, ",nolockd", &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_RDIRPLUS) != 0, ",rdirplus",
 	    &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & NFSMNT_KERB) == 0, ",sec=sys",
 	    &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY |
 	    NFSMNT_PRIVACY)) == NFSMNT_KERB, ",sec=krb5", &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY |
 	    NFSMNT_PRIVACY)) == (NFSMNT_KERB | NFSMNT_INTEGRITY), ",sec=krb5i",
 	    &buf, &blen);
 	nfscl_printopt(nmp, (nmp->nm_flag & (NFSMNT_KERB | NFSMNT_INTEGRITY |
 	    NFSMNT_PRIVACY)) == (NFSMNT_KERB | NFSMNT_PRIVACY), ",sec=krb5p",
 	    &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_acdirmin, ",acdirmin", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_acdirmax, ",acdirmax", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_acregmin, ",acregmin", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_acregmax, ",acregmax", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_nametimeo, ",nametimeo", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_negnametimeo, ",negnametimeo", &buf,
 	    &blen);
 	nfscl_printoptval(nmp, nmp->nm_rsize, ",rsize", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_wsize, ",wsize", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_readdirsize, ",readdirsize", &buf,
 	    &blen);
 	nfscl_printoptval(nmp, nmp->nm_readahead, ",readahead", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_wcommitsize, ",wcommitsize", &buf,
 	    &blen);
 	nfscl_printoptval(nmp, nmp->nm_timeo, ",timeout", &buf, &blen);
 	nfscl_printoptval(nmp, nmp->nm_retry, ",retrans", &buf, &blen);
 }
 
Index: user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c
===================================================================
--- user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/nfsserver/nfs_nfsdport.c	(revision 281585)
@@ -1,3408 +1,3411 @@
 /*-
  * Copyright (c) 1989, 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software contributed to Berkeley by
  * Rick Macklem at The University of Guelph.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/capsicum.h>
 
 /*
  * Functions that perform the vfs operations required by the routines in
  * nfsd_serv.c. It is hoped that this change will make the server more
  * portable.
  */
 
 #include <fs/nfs/nfsport.h>
 #include <sys/hash.h>
 #include <sys/sysctl.h>
 #include <nlm/nlm_prot.h>
 #include <nlm/nlm.h>
 
 FEATURE(nfsd, "NFSv4 server");
 
 extern u_int32_t newnfs_true, newnfs_false, newnfs_xdrneg1;
 extern int nfsrv_useacl;
 extern int newnfs_numnfsd;
 extern struct mount nfsv4root_mnt;
 extern struct nfsrv_stablefirst nfsrv_stablefirst;
 extern void (*nfsd_call_servertimer)(void);
 extern SVCPOOL	*nfsrvd_pool;
 extern struct nfsv4lock nfsd_suspend_lock;
 extern struct nfssessionhash nfssessionhash[NFSSESSIONHASHSIZE];
 struct vfsoptlist nfsv4root_opt, nfsv4root_newopt;
 NFSDLOCKMUTEX;
 struct nfsrchash_bucket nfsrchash_table[NFSRVCACHE_HASHSIZE];
 struct nfsrchash_bucket nfsrcahash_table[NFSRVCACHE_HASHSIZE];
 struct mtx nfsrc_udpmtx;
 struct mtx nfs_v4root_mutex;
 struct nfsrvfh nfs_rootfh, nfs_pubfh;
 int nfs_pubfhset = 0, nfs_rootfhset = 0;
 struct proc *nfsd_master_proc = NULL;
 int nfsd_debuglevel = 0;
 static pid_t nfsd_master_pid = (pid_t)-1;
 static char nfsd_master_comm[MAXCOMLEN + 1];
 static struct timeval nfsd_master_start;
 static uint32_t nfsv4_sysid = 0;
 
 static int nfssvc_srvcall(struct thread *, struct nfssvc_args *,
     struct ucred *);
 
 int nfsrv_enable_crossmntpt = 1;
 static int nfs_commit_blks;
 static int nfs_commit_miss;
 extern int nfsrv_issuedelegs;
 extern int nfsrv_dolocallocks;
 extern int nfsd_enable_stringtouid;
 
 SYSCTL_NODE(_vfs, OID_AUTO, nfsd, CTLFLAG_RW, 0, "New NFS server");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, mirrormnt, CTLFLAG_RW,
     &nfsrv_enable_crossmntpt, 0, "Enable nfsd to cross mount points");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, commit_blks, CTLFLAG_RW, &nfs_commit_blks,
     0, "");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, commit_miss, CTLFLAG_RW, &nfs_commit_miss,
     0, "");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, issue_delegations, CTLFLAG_RW,
     &nfsrv_issuedelegs, 0, "Enable nfsd to issue delegations");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_locallocks, CTLFLAG_RW,
     &nfsrv_dolocallocks, 0, "Enable nfsd to acquire local locks on files");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, debuglevel, CTLFLAG_RW, &nfsd_debuglevel,
     0, "Debug level for new nfs server");
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_stringtouid, CTLFLAG_RW,
     &nfsd_enable_stringtouid, 0, "Enable nfsd to accept numeric owner_names");
 
 #define	MAX_REORDERED_RPC	16
 #define	NUM_HEURISTIC		1031
 #define	NHUSE_INIT		64
 #define	NHUSE_INC		16
 #define	NHUSE_MAX		2048
 
 static struct nfsheur {
 	struct vnode *nh_vp;	/* vp to match (unreferenced pointer) */
 	off_t nh_nextoff;	/* next offset for sequential detection */
 	int nh_use;		/* use count for selection */
 	int nh_seqcount;	/* heuristic */
 } nfsheur[NUM_HEURISTIC];
 
 
 /*
  * Heuristic to detect sequential operation.
  */
 static struct nfsheur *
 nfsrv_sequential_heuristic(struct uio *uio, struct vnode *vp)
 {
 	struct nfsheur *nh;
 	int hi, try;
 
 	/* Locate best candidate. */
 	try = 32;
 	hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC;
 	nh = &nfsheur[hi];
 	while (try--) {
 		if (nfsheur[hi].nh_vp == vp) {
 			nh = &nfsheur[hi];
 			break;
 		}
 		if (nfsheur[hi].nh_use > 0)
 			--nfsheur[hi].nh_use;
 		hi = (hi + 1) % NUM_HEURISTIC;
 		if (nfsheur[hi].nh_use < nh->nh_use)
 			nh = &nfsheur[hi];
 	}
 
 	/* Initialize hint if this is a new file. */
 	if (nh->nh_vp != vp) {
 		nh->nh_vp = vp;
 		nh->nh_nextoff = uio->uio_offset;
 		nh->nh_use = NHUSE_INIT;
 		if (uio->uio_offset == 0)
 			nh->nh_seqcount = 4;
 		else
 			nh->nh_seqcount = 1;
 	}
 
 	/* Calculate heuristic. */
 	if ((uio->uio_offset == 0 && nh->nh_seqcount > 0) ||
 	    uio->uio_offset == nh->nh_nextoff) {
 		/* See comments in vfs_vnops.c:sequential_heuristic(). */
 		nh->nh_seqcount += howmany(uio->uio_resid, 16384);
 		if (nh->nh_seqcount > IO_SEQMAX)
 			nh->nh_seqcount = IO_SEQMAX;
 	} else if (qabs(uio->uio_offset - nh->nh_nextoff) <= MAX_REORDERED_RPC *
 	    imax(vp->v_mount->mnt_stat.f_iosize, uio->uio_resid)) {
 		/* Probably a reordered RPC, leave seqcount alone. */
 	} else if (nh->nh_seqcount > 1) {
 		nh->nh_seqcount /= 2;
 	} else {
 		nh->nh_seqcount = 0;
 	}
 	nh->nh_use += NHUSE_INC;
 	if (nh->nh_use > NHUSE_MAX)
 		nh->nh_use = NHUSE_MAX;
 	return (nh);
 }
 
 /*
  * Get attributes into nfsvattr structure.
  */
 int
 nfsvno_getattr(struct vnode *vp, struct nfsvattr *nvap, struct ucred *cred,
     struct thread *p, int vpislocked)
 {
 	int error, lockedit = 0;
 
 	if (vpislocked == 0) {
 		/*
 		 * When vpislocked == 0, the vnode is either exclusively
 		 * locked by this thread or not locked by this thread.
 		 * As such, shared lock it, if not exclusively locked.
 		 */
 		if (NFSVOPISLOCKED(vp) != LK_EXCLUSIVE) {
 			lockedit = 1;
 			NFSVOPLOCK(vp, LK_SHARED | LK_RETRY);
 		}
 	}
 	error = VOP_GETATTR(vp, &nvap->na_vattr, cred);
 	if (lockedit != 0)
 		NFSVOPUNLOCK(vp, 0);
 
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Get a file handle for a vnode.
  */
 int
 nfsvno_getfh(struct vnode *vp, fhandle_t *fhp, struct thread *p)
 {
 	int error;
 
 	NFSBZERO((caddr_t)fhp, sizeof(fhandle_t));
 	fhp->fh_fsid = vp->v_mount->mnt_stat.f_fsid;
 	error = VOP_VPTOFH(vp, &fhp->fh_fid);
 
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Perform access checking for vnodes obtained from file handles that would
  * refer to files already opened by a Unix client. You cannot just use
  * vn_writechk() and VOP_ACCESSX() for two reasons.
  * 1 - You must check for exported rdonly as well as MNT_RDONLY for the write
  *     case.
  * 2 - The owner is to be given access irrespective of mode bits for some
  *     operations, so that processes that chmod after opening a file don't
  *     break.
  */
 int
 nfsvno_accchk(struct vnode *vp, accmode_t accmode, struct ucred *cred,
     struct nfsexstuff *exp, struct thread *p, int override, int vpislocked,
     u_int32_t *supportedtypep)
 {
 	struct vattr vattr;
 	int error = 0, getret = 0;
 
 	if (vpislocked == 0) {
 		if (NFSVOPLOCK(vp, LK_SHARED) != 0) {
 			error = EPERM;
 			goto out;
 		}
 	}
 	if (accmode & VWRITE) {
 		/* Just vn_writechk() changed to check rdonly */
 		/*
 		 * Disallow write attempts on read-only file systems;
 		 * unless the file is a socket or a block or character
 		 * device resident on the file system.
 		 */
 		if (NFSVNO_EXRDONLY(exp) ||
 		    (vp->v_mount->mnt_flag & MNT_RDONLY)) {
 			switch (vp->v_type) {
 			case VREG:
 			case VDIR:
 			case VLNK:
 				error = EROFS;
 			default:
 				break;
 			}
 		}
 		/*
 		 * If there's shared text associated with
 		 * the inode, try to free it up once.  If
 		 * we fail, we can't allow writing.
 		 */
 		if (VOP_IS_TEXT(vp) && error == 0)
 			error = ETXTBSY;
 	}
 	if (error != 0) {
 		if (vpislocked == 0)
 			NFSVOPUNLOCK(vp, 0);
 		goto out;
 	}
 
 	/*
 	 * Should the override still be applied when ACLs are enabled?
 	 */
 	error = VOP_ACCESSX(vp, accmode, cred, p);
 	if (error != 0 && (accmode & (VDELETE | VDELETE_CHILD))) {
 		/*
 		 * Try again with VEXPLICIT_DENY, to see if the test for
 		 * deletion is supported.
 		 */
 		error = VOP_ACCESSX(vp, accmode | VEXPLICIT_DENY, cred, p);
 		if (error == 0) {
 			if (vp->v_type == VDIR) {
 				accmode &= ~(VDELETE | VDELETE_CHILD);
 				accmode |= VWRITE;
 				error = VOP_ACCESSX(vp, accmode, cred, p);
 			} else if (supportedtypep != NULL) {
 				*supportedtypep &= ~NFSACCESS_DELETE;
 			}
 		}
 	}
 
 	/*
 	 * Allow certain operations for the owner (reads and writes
 	 * on files that are already open).
 	 */
 	if (override != NFSACCCHK_NOOVERRIDE &&
 	    (error == EPERM || error == EACCES)) {
 		if (cred->cr_uid == 0 && (override & NFSACCCHK_ALLOWROOT))
 			error = 0;
 		else if (override & NFSACCCHK_ALLOWOWNER) {
 			getret = VOP_GETATTR(vp, &vattr, cred);
 			if (getret == 0 && cred->cr_uid == vattr.va_uid)
 				error = 0;
 		}
 	}
 	if (vpislocked == 0)
 		NFSVOPUNLOCK(vp, 0);
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Set attribute(s) vnop.
  */
 int
 nfsvno_setattr(struct vnode *vp, struct nfsvattr *nvap, struct ucred *cred,
     struct thread *p, struct nfsexstuff *exp)
 {
 	int error;
 
 	error = VOP_SETATTR(vp, &nvap->na_vattr, cred);
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Set up nameidata for a lookup() call and do it.
  */
 int
 nfsvno_namei(struct nfsrv_descript *nd, struct nameidata *ndp,
     struct vnode *dp, int islocked, struct nfsexstuff *exp, struct thread *p,
     struct vnode **retdirp)
 {
 	struct componentname *cnp = &ndp->ni_cnd;
 	int i;
 	struct iovec aiov;
 	struct uio auio;
 	int lockleaf = (cnp->cn_flags & LOCKLEAF) != 0, linklen;
 	int error = 0, crossmnt;
 	char *cp;
 
 	*retdirp = NULL;
 	cnp->cn_nameptr = cnp->cn_pnbuf;
 	ndp->ni_strictrelative = 0;
 	/*
 	 * Extract and set starting directory.
 	 */
 	if (dp->v_type != VDIR) {
 		if (islocked)
 			vput(dp);
 		else
 			vrele(dp);
 		nfsvno_relpathbuf(ndp);
 		error = ENOTDIR;
 		goto out1;
 	}
 	if (islocked)
 		NFSVOPUNLOCK(dp, 0);
 	VREF(dp);
 	*retdirp = dp;
 	if (NFSVNO_EXRDONLY(exp))
 		cnp->cn_flags |= RDONLY;
 	ndp->ni_segflg = UIO_SYSSPACE;
 	crossmnt = 1;
 
 	if (nd->nd_flag & ND_PUBLOOKUP) {
 		ndp->ni_loopcnt = 0;
 		if (cnp->cn_pnbuf[0] == '/') {
 			vrele(dp);
 			/*
 			 * Check for degenerate pathnames here, since lookup()
 			 * panics on them.
 			 */
 			for (i = 1; i < ndp->ni_pathlen; i++)
 				if (cnp->cn_pnbuf[i] != '/')
 					break;
 			if (i == ndp->ni_pathlen) {
 				error = NFSERR_ACCES;
 				goto out;
 			}
 			dp = rootvnode;
 			VREF(dp);
 		}
 	} else if ((nfsrv_enable_crossmntpt == 0 && NFSVNO_EXPORTED(exp)) ||
 	    (nd->nd_flag & ND_NFSV4) == 0) {
 		/*
 		 * Only cross mount points for NFSv4 when doing a
 		 * mount while traversing the file system above
 		 * the mount point, unless nfsrv_enable_crossmntpt is set.
 		 */
 		cnp->cn_flags |= NOCROSSMOUNT;
 		crossmnt = 0;
 	}
 
 	/*
 	 * Initialize for scan, set ni_startdir and bump ref on dp again
 	 * because lookup() will dereference ni_startdir.
 	 */
 
 	cnp->cn_thread = p;
 	ndp->ni_startdir = dp;
 	ndp->ni_rootdir = rootvnode;
 	ndp->ni_topdir = NULL;
 
 	if (!lockleaf)
 		cnp->cn_flags |= LOCKLEAF;
 	for (;;) {
 		cnp->cn_nameptr = cnp->cn_pnbuf;
 		/*
 		 * Call lookup() to do the real work.  If an error occurs,
 		 * ndp->ni_vp and ni_dvp are left uninitialized or NULL and
 		 * we do not have to dereference anything before returning.
 		 * In either case ni_startdir will be dereferenced and NULLed
 		 * out.
 		 */
 		error = lookup(ndp);
 		if (error)
 			break;
 
 		/*
 		 * Check for encountering a symbolic link.  Trivial
 		 * termination occurs if no symlink encountered.
 		 */
 		if ((cnp->cn_flags & ISSYMLINK) == 0) {
 			if ((cnp->cn_flags & (SAVENAME | SAVESTART)) == 0)
 				nfsvno_relpathbuf(ndp);
 			if (ndp->ni_vp && !lockleaf)
 				NFSVOPUNLOCK(ndp->ni_vp, 0);
 			break;
 		}
 
 		/*
 		 * Validate symlink
 		 */
 		if ((cnp->cn_flags & LOCKPARENT) && ndp->ni_pathlen == 1)
 			NFSVOPUNLOCK(ndp->ni_dvp, 0);
 		if (!(nd->nd_flag & ND_PUBLOOKUP)) {
 			error = EINVAL;
 			goto badlink2;
 		}
 
 		if (ndp->ni_loopcnt++ >= MAXSYMLINKS) {
 			error = ELOOP;
 			goto badlink2;
 		}
 		if (ndp->ni_pathlen > 1)
 			cp = uma_zalloc(namei_zone, M_WAITOK);
 		else
 			cp = cnp->cn_pnbuf;
 		aiov.iov_base = cp;
 		aiov.iov_len = MAXPATHLEN;
 		auio.uio_iov = &aiov;
 		auio.uio_iovcnt = 1;
 		auio.uio_offset = 0;
 		auio.uio_rw = UIO_READ;
 		auio.uio_segflg = UIO_SYSSPACE;
 		auio.uio_td = NULL;
 		auio.uio_resid = MAXPATHLEN;
 		error = VOP_READLINK(ndp->ni_vp, &auio, cnp->cn_cred);
 		if (error) {
 		badlink1:
 			if (ndp->ni_pathlen > 1)
 				uma_zfree(namei_zone, cp);
 		badlink2:
 			vrele(ndp->ni_dvp);
 			vput(ndp->ni_vp);
 			break;
 		}
 		linklen = MAXPATHLEN - auio.uio_resid;
 		if (linklen == 0) {
 			error = ENOENT;
 			goto badlink1;
 		}
 		if (linklen + ndp->ni_pathlen >= MAXPATHLEN) {
 			error = ENAMETOOLONG;
 			goto badlink1;
 		}
 
 		/*
 		 * Adjust or replace path
 		 */
 		if (ndp->ni_pathlen > 1) {
 			NFSBCOPY(ndp->ni_next, cp + linklen, ndp->ni_pathlen);
 			uma_zfree(namei_zone, cnp->cn_pnbuf);
 			cnp->cn_pnbuf = cp;
 		} else
 			cnp->cn_pnbuf[linklen] = '\0';
 		ndp->ni_pathlen += linklen;
 
 		/*
 		 * Cleanup refs for next loop and check if root directory
 		 * should replace current directory.  Normally ni_dvp
 		 * becomes the new base directory and is cleaned up when
 		 * we loop.  Explicitly null pointers after invalidation
 		 * to clarify operation.
 		 */
 		vput(ndp->ni_vp);
 		ndp->ni_vp = NULL;
 
 		if (cnp->cn_pnbuf[0] == '/') {
 			vrele(ndp->ni_dvp);
 			ndp->ni_dvp = ndp->ni_rootdir;
 			VREF(ndp->ni_dvp);
 		}
 		ndp->ni_startdir = ndp->ni_dvp;
 		ndp->ni_dvp = NULL;
 	}
 	if (!lockleaf)
 		cnp->cn_flags &= ~LOCKLEAF;
 
 out:
 	if (error) {
 		nfsvno_relpathbuf(ndp);
 		ndp->ni_vp = NULL;
 		ndp->ni_dvp = NULL;
 		ndp->ni_startdir = NULL;
 	} else if ((ndp->ni_cnd.cn_flags & (WANTPARENT|LOCKPARENT)) == 0) {
 		ndp->ni_dvp = NULL;
 	}
 
 out1:
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Set up a pathname buffer and return a pointer to it and, optionally
  * set a hash pointer.
  */
 void
 nfsvno_setpathbuf(struct nameidata *ndp, char **bufpp, u_long **hashpp)
 {
 	struct componentname *cnp = &ndp->ni_cnd;
 
 	cnp->cn_flags |= (NOMACCHECK | HASBUF);
 	cnp->cn_pnbuf = uma_zalloc(namei_zone, M_WAITOK);
 	if (hashpp != NULL)
 		*hashpp = NULL;
 	*bufpp = cnp->cn_pnbuf;
 }
 
 /*
  * Release the above path buffer, if not released by nfsvno_namei().
  */
 void
 nfsvno_relpathbuf(struct nameidata *ndp)
 {
 
 	if ((ndp->ni_cnd.cn_flags & HASBUF) == 0)
 		panic("nfsrelpath");
 	uma_zfree(namei_zone, ndp->ni_cnd.cn_pnbuf);
 	ndp->ni_cnd.cn_flags &= ~HASBUF;
 }
 
 /*
  * Readlink vnode op into an mbuf list.
  */
 int
 nfsvno_readlink(struct vnode *vp, struct ucred *cred, struct thread *p,
     struct mbuf **mpp, struct mbuf **mpendp, int *lenp)
 {
 	struct iovec iv[(NFS_MAXPATHLEN+MLEN-1)/MLEN];
 	struct iovec *ivp = iv;
 	struct uio io, *uiop = &io;
 	struct mbuf *mp, *mp2 = NULL, *mp3 = NULL;
 	int i, len, tlen, error = 0;
 
 	len = 0;
 	i = 0;
 	while (len < NFS_MAXPATHLEN) {
 		NFSMGET(mp);
 		MCLGET(mp, M_WAITOK);
 		mp->m_len = M_SIZE(mp);
 		if (len == 0) {
 			mp3 = mp2 = mp;
 		} else {
 			mp2->m_next = mp;
 			mp2 = mp;
 		}
 		if ((len + mp->m_len) > NFS_MAXPATHLEN) {
 			mp->m_len = NFS_MAXPATHLEN - len;
 			len = NFS_MAXPATHLEN;
 		} else {
 			len += mp->m_len;
 		}
 		ivp->iov_base = mtod(mp, caddr_t);
 		ivp->iov_len = mp->m_len;
 		i++;
 		ivp++;
 	}
 	uiop->uio_iov = iv;
 	uiop->uio_iovcnt = i;
 	uiop->uio_offset = 0;
 	uiop->uio_resid = len;
 	uiop->uio_rw = UIO_READ;
 	uiop->uio_segflg = UIO_SYSSPACE;
 	uiop->uio_td = NULL;
 	error = VOP_READLINK(vp, uiop, cred);
 	if (error) {
 		m_freem(mp3);
 		*lenp = 0;
 		goto out;
 	}
 	if (uiop->uio_resid > 0) {
 		len -= uiop->uio_resid;
 		tlen = NFSM_RNDUP(len);
 		nfsrv_adj(mp3, NFS_MAXPATHLEN - tlen, tlen - len);
 	}
 	*lenp = len;
 	*mpp = mp3;
 	*mpendp = mp;
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Read vnode op call into mbuf list.
  */
 int
 nfsvno_read(struct vnode *vp, off_t off, int cnt, struct ucred *cred,
     struct thread *p, struct mbuf **mpp, struct mbuf **mpendp)
 {
 	struct mbuf *m;
 	int i;
 	struct iovec *iv;
 	struct iovec *iv2;
 	int error = 0, len, left, siz, tlen, ioflag = 0;
 	struct mbuf *m2 = NULL, *m3;
 	struct uio io, *uiop = &io;
 	struct nfsheur *nh;
 
 	len = left = NFSM_RNDUP(cnt);
 	m3 = NULL;
 	/*
 	 * Generate the mbuf list with the uio_iov ref. to it.
 	 */
 	i = 0;
 	while (left > 0) {
 		NFSMGET(m);
 		MCLGET(m, M_WAITOK);
 		m->m_len = 0;
 		siz = min(M_TRAILINGSPACE(m), left);
 		left -= siz;
 		i++;
 		if (m3)
 			m2->m_next = m;
 		else
 			m3 = m;
 		m2 = m;
 	}
 	MALLOC(iv, struct iovec *, i * sizeof (struct iovec),
 	    M_TEMP, M_WAITOK);
 	uiop->uio_iov = iv2 = iv;
 	m = m3;
 	left = len;
 	i = 0;
 	while (left > 0) {
 		if (m == NULL)
 			panic("nfsvno_read iov");
 		siz = min(M_TRAILINGSPACE(m), left);
 		if (siz > 0) {
 			iv->iov_base = mtod(m, caddr_t) + m->m_len;
 			iv->iov_len = siz;
 			m->m_len += siz;
 			left -= siz;
 			iv++;
 			i++;
 		}
 		m = m->m_next;
 	}
 	uiop->uio_iovcnt = i;
 	uiop->uio_offset = off;
 	uiop->uio_resid = len;
 	uiop->uio_rw = UIO_READ;
 	uiop->uio_segflg = UIO_SYSSPACE;
 	uiop->uio_td = NULL;
 	nh = nfsrv_sequential_heuristic(uiop, vp);
 	ioflag |= nh->nh_seqcount << IO_SEQSHIFT;
 	error = VOP_READ(vp, uiop, IO_NODELOCKED | ioflag, cred);
 	FREE((caddr_t)iv2, M_TEMP);
 	if (error) {
 		m_freem(m3);
 		*mpp = NULL;
 		goto out;
 	}
 	nh->nh_nextoff = uiop->uio_offset;
 	tlen = len - uiop->uio_resid;
 	cnt = cnt < tlen ? cnt : tlen;
 	tlen = NFSM_RNDUP(cnt);
 	if (tlen == 0) {
 		m_freem(m3);
 		m3 = NULL;
 	} else if (len != tlen || tlen != cnt)
 		nfsrv_adj(m3, len - tlen, tlen - cnt);
 	*mpp = m3;
 	*mpendp = m2;
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Write vnode op from an mbuf list.
  */
 int
 nfsvno_write(struct vnode *vp, off_t off, int retlen, int cnt, int stable,
     struct mbuf *mp, char *cp, struct ucred *cred, struct thread *p)
 {
 	struct iovec *ivp;
 	int i, len;
 	struct iovec *iv;
 	int ioflags, error;
 	struct uio io, *uiop = &io;
 	struct nfsheur *nh;
 
 	MALLOC(ivp, struct iovec *, cnt * sizeof (struct iovec), M_TEMP,
 	    M_WAITOK);
 	uiop->uio_iov = iv = ivp;
 	uiop->uio_iovcnt = cnt;
 	i = mtod(mp, caddr_t) + mp->m_len - cp;
 	len = retlen;
 	while (len > 0) {
 		if (mp == NULL)
 			panic("nfsvno_write");
 		if (i > 0) {
 			i = min(i, len);
 			ivp->iov_base = cp;
 			ivp->iov_len = i;
 			ivp++;
 			len -= i;
 		}
 		mp = mp->m_next;
 		if (mp) {
 			i = mp->m_len;
 			cp = mtod(mp, caddr_t);
 		}
 	}
 
 	if (stable == NFSWRITE_UNSTABLE)
 		ioflags = IO_NODELOCKED;
 	else
 		ioflags = (IO_SYNC | IO_NODELOCKED);
 	uiop->uio_resid = retlen;
 	uiop->uio_rw = UIO_WRITE;
 	uiop->uio_segflg = UIO_SYSSPACE;
 	NFSUIOPROC(uiop, p);
 	uiop->uio_offset = off;
 	nh = nfsrv_sequential_heuristic(uiop, vp);
 	ioflags |= nh->nh_seqcount << IO_SEQSHIFT;
 	error = VOP_WRITE(vp, uiop, ioflags, cred);
 	if (error == 0)
 		nh->nh_nextoff = uiop->uio_offset;
 	FREE((caddr_t)iv, M_TEMP);
 
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Common code for creating a regular file (plus special files for V2).
  */
 int
 nfsvno_createsub(struct nfsrv_descript *nd, struct nameidata *ndp,
     struct vnode **vpp, struct nfsvattr *nvap, int *exclusive_flagp,
     int32_t *cverf, NFSDEV_T rdev, struct thread *p, struct nfsexstuff *exp)
 {
 	u_quad_t tempsize;
 	int error;
 
 	error = nd->nd_repstat;
 	if (!error && ndp->ni_vp == NULL) {
 		if (nvap->na_type == VREG || nvap->na_type == VSOCK) {
 			vrele(ndp->ni_startdir);
 			error = VOP_CREATE(ndp->ni_dvp,
 			    &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr);
 			vput(ndp->ni_dvp);
 			nfsvno_relpathbuf(ndp);
 			if (!error) {
 				if (*exclusive_flagp) {
 					*exclusive_flagp = 0;
 					NFSVNO_ATTRINIT(nvap);
 					nvap->na_atime.tv_sec = cverf[0];
 					nvap->na_atime.tv_nsec = cverf[1];
 					error = VOP_SETATTR(ndp->ni_vp,
 					    &nvap->na_vattr, nd->nd_cred);
 				}
 			}
 		/*
 		 * NFS V2 Only. nfsrvd_mknod() does this for V3.
 		 * (This implies, just get out on an error.)
 		 */
 		} else if (nvap->na_type == VCHR || nvap->na_type == VBLK ||
 			nvap->na_type == VFIFO) {
 			if (nvap->na_type == VCHR && rdev == 0xffffffff)
 				nvap->na_type = VFIFO;
 			if (nvap->na_type != VFIFO &&
 			    (error = priv_check_cred(nd->nd_cred,
 			     PRIV_VFS_MKNOD_DEV, 0))) {
 				vrele(ndp->ni_startdir);
 				nfsvno_relpathbuf(ndp);
 				vput(ndp->ni_dvp);
 				goto out;
 			}
 			nvap->na_rdev = rdev;
 			error = VOP_MKNOD(ndp->ni_dvp, &ndp->ni_vp,
 			    &ndp->ni_cnd, &nvap->na_vattr);
 			vput(ndp->ni_dvp);
 			nfsvno_relpathbuf(ndp);
 			vrele(ndp->ni_startdir);
 			if (error)
 				goto out;
 		} else {
 			vrele(ndp->ni_startdir);
 			nfsvno_relpathbuf(ndp);
 			vput(ndp->ni_dvp);
 			error = ENXIO;
 			goto out;
 		}
 		*vpp = ndp->ni_vp;
 	} else {
 		/*
 		 * Handle cases where error is already set and/or
 		 * the file exists.
 		 * 1 - clean up the lookup
 		 * 2 - iff !error and na_size set, truncate it
 		 */
 		vrele(ndp->ni_startdir);
 		nfsvno_relpathbuf(ndp);
 		*vpp = ndp->ni_vp;
 		if (ndp->ni_dvp == *vpp)
 			vrele(ndp->ni_dvp);
 		else
 			vput(ndp->ni_dvp);
 		if (!error && nvap->na_size != VNOVAL) {
 			error = nfsvno_accchk(*vpp, VWRITE,
 			    nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE,
 			    NFSACCCHK_VPISLOCKED, NULL);
 			if (!error) {
 				tempsize = nvap->na_size;
 				NFSVNO_ATTRINIT(nvap);
 				nvap->na_size = tempsize;
 				error = VOP_SETATTR(*vpp,
 				    &nvap->na_vattr, nd->nd_cred);
 			}
 		}
 		if (error)
 			vput(*vpp);
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Do a mknod vnode op.
  */
 int
 nfsvno_mknod(struct nameidata *ndp, struct nfsvattr *nvap, struct ucred *cred,
     struct thread *p)
 {
 	int error = 0;
 	enum vtype vtyp;
 
 	vtyp = nvap->na_type;
 	/*
 	 * Iff it doesn't exist, create it.
 	 */
 	if (ndp->ni_vp) {
 		vrele(ndp->ni_startdir);
 		nfsvno_relpathbuf(ndp);
 		vput(ndp->ni_dvp);
 		vrele(ndp->ni_vp);
 		error = EEXIST;
 		goto out;
 	}
 	if (vtyp != VCHR && vtyp != VBLK && vtyp != VSOCK && vtyp != VFIFO) {
 		vrele(ndp->ni_startdir);
 		nfsvno_relpathbuf(ndp);
 		vput(ndp->ni_dvp);
 		error = NFSERR_BADTYPE;
 		goto out;
 	}
 	if (vtyp == VSOCK) {
 		vrele(ndp->ni_startdir);
 		error = VOP_CREATE(ndp->ni_dvp, &ndp->ni_vp,
 		    &ndp->ni_cnd, &nvap->na_vattr);
 		vput(ndp->ni_dvp);
 		nfsvno_relpathbuf(ndp);
 	} else {
 		if (nvap->na_type != VFIFO &&
 		    (error = priv_check_cred(cred, PRIV_VFS_MKNOD_DEV, 0))) {
 			vrele(ndp->ni_startdir);
 			nfsvno_relpathbuf(ndp);
 			vput(ndp->ni_dvp);
 			goto out;
 		}
 		error = VOP_MKNOD(ndp->ni_dvp, &ndp->ni_vp,
 		    &ndp->ni_cnd, &nvap->na_vattr);
 		vput(ndp->ni_dvp);
 		nfsvno_relpathbuf(ndp);
 		vrele(ndp->ni_startdir);
 		/*
 		 * Since VOP_MKNOD returns the ni_vp, I can't
 		 * see any reason to do the lookup.
 		 */
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Mkdir vnode op.
  */
 int
 nfsvno_mkdir(struct nameidata *ndp, struct nfsvattr *nvap, uid_t saved_uid,
     struct ucred *cred, struct thread *p, struct nfsexstuff *exp)
 {
 	int error = 0;
 
 	if (ndp->ni_vp != NULL) {
 		if (ndp->ni_dvp == ndp->ni_vp)
 			vrele(ndp->ni_dvp);
 		else
 			vput(ndp->ni_dvp);
 		vrele(ndp->ni_vp);
 		nfsvno_relpathbuf(ndp);
 		error = EEXIST;
 		goto out;
 	}
 	error = VOP_MKDIR(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd,
 	    &nvap->na_vattr);
 	vput(ndp->ni_dvp);
 	nfsvno_relpathbuf(ndp);
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * symlink vnode op.
  */
 int
 nfsvno_symlink(struct nameidata *ndp, struct nfsvattr *nvap, char *pathcp,
     int pathlen, int not_v2, uid_t saved_uid, struct ucred *cred, struct thread *p,
     struct nfsexstuff *exp)
 {
 	int error = 0;
 
 	if (ndp->ni_vp) {
 		vrele(ndp->ni_startdir);
 		nfsvno_relpathbuf(ndp);
 		if (ndp->ni_dvp == ndp->ni_vp)
 			vrele(ndp->ni_dvp);
 		else
 			vput(ndp->ni_dvp);
 		vrele(ndp->ni_vp);
 		error = EEXIST;
 		goto out;
 	}
 
 	error = VOP_SYMLINK(ndp->ni_dvp, &ndp->ni_vp, &ndp->ni_cnd,
 	    &nvap->na_vattr, pathcp);
 	vput(ndp->ni_dvp);
 	vrele(ndp->ni_startdir);
 	nfsvno_relpathbuf(ndp);
 	/*
 	 * Although FreeBSD still had the lookup code in
 	 * it for 7/current, there doesn't seem to be any
 	 * point, since VOP_SYMLINK() returns the ni_vp.
 	 * Just vput it for v2.
 	 */
 	if (!not_v2 && !error)
 		vput(ndp->ni_vp);
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Parse symbolic link arguments.
  * This function has an ugly side effect. It will MALLOC() an area for
  * the symlink and set *pathcpp to point to it, only if it succeeds.
  * So, if it returns with *pathcpp != NULL, that must
  * be FREE'd later.
  */
 int
 nfsvno_getsymlink(struct nfsrv_descript *nd, struct nfsvattr *nvap,
     struct thread *p, char **pathcpp, int *lenp)
 {
 	u_int32_t *tl;
 	char *pathcp = NULL;
 	int error = 0, len;
 	struct nfsv2_sattr *sp;
 
 	*pathcpp = NULL;
 	*lenp = 0;
 	if ((nd->nd_flag & ND_NFSV3) &&
 	    (error = nfsrv_sattr(nd, NULL, nvap, NULL, NULL, p)))
 		goto nfsmout;
 	NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 	len = fxdr_unsigned(int, *tl);
 	if (len > NFS_MAXPATHLEN || len <= 0) {
 		error = EBADRPC;
 		goto nfsmout;
 	}
 	MALLOC(pathcp, caddr_t, len + 1, M_TEMP, M_WAITOK);
 	error = nfsrv_mtostr(nd, pathcp, len);
 	if (error)
 		goto nfsmout;
 	if (nd->nd_flag & ND_NFSV2) {
 		NFSM_DISSECT(sp, struct nfsv2_sattr *, NFSX_V2SATTR);
 		nvap->na_mode = fxdr_unsigned(u_int16_t, sp->sa_mode);
 	}
 	*pathcpp = pathcp;
 	*lenp = len;
 	NFSEXITCODE2(0, nd);
 	return (0);
 nfsmout:
 	if (pathcp)
 		free(pathcp, M_TEMP);
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Remove a non-directory object.
  */
 int
 nfsvno_removesub(struct nameidata *ndp, int is_v4, struct ucred *cred,
     struct thread *p, struct nfsexstuff *exp)
 {
 	struct vnode *vp;
 	int error = 0;
 
 	vp = ndp->ni_vp;
 	if (vp->v_type == VDIR)
 		error = NFSERR_ISDIR;
 	else if (is_v4)
 		error = nfsrv_checkremove(vp, 1, p);
 	if (!error)
 		error = VOP_REMOVE(ndp->ni_dvp, vp, &ndp->ni_cnd);
 	if (ndp->ni_dvp == vp)
 		vrele(ndp->ni_dvp);
 	else
 		vput(ndp->ni_dvp);
 	vput(vp);
 	if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0)
 		nfsvno_relpathbuf(ndp);
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Remove a directory.
  */
 int
 nfsvno_rmdirsub(struct nameidata *ndp, int is_v4, struct ucred *cred,
     struct thread *p, struct nfsexstuff *exp)
 {
 	struct vnode *vp;
 	int error = 0;
 
 	vp = ndp->ni_vp;
 	if (vp->v_type != VDIR) {
 		error = ENOTDIR;
 		goto out;
 	}
 	/*
 	 * No rmdir "." please.
 	 */
 	if (ndp->ni_dvp == vp) {
 		error = EINVAL;
 		goto out;
 	}
 	/*
 	 * The root of a mounted filesystem cannot be deleted.
 	 */
 	if (vp->v_vflag & VV_ROOT)
 		error = EBUSY;
 out:
 	if (!error)
 		error = VOP_RMDIR(ndp->ni_dvp, vp, &ndp->ni_cnd);
 	if (ndp->ni_dvp == vp)
 		vrele(ndp->ni_dvp);
 	else
 		vput(ndp->ni_dvp);
 	vput(vp);
 	if ((ndp->ni_cnd.cn_flags & SAVENAME) != 0)
 		nfsvno_relpathbuf(ndp);
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Rename vnode op.
  */
 int
 nfsvno_rename(struct nameidata *fromndp, struct nameidata *tondp,
     u_int32_t ndstat, u_int32_t ndflag, struct ucred *cred, struct thread *p)
 {
 	struct vnode *fvp, *tvp, *tdvp;
 	int error = 0;
 
 	fvp = fromndp->ni_vp;
 	if (ndstat) {
 		vrele(fromndp->ni_dvp);
 		vrele(fvp);
 		error = ndstat;
 		goto out1;
 	}
 	tdvp = tondp->ni_dvp;
 	tvp = tondp->ni_vp;
 	if (tvp != NULL) {
 		if (fvp->v_type == VDIR && tvp->v_type != VDIR) {
 			error = (ndflag & ND_NFSV2) ? EISDIR : EEXIST;
 			goto out;
 		} else if (fvp->v_type != VDIR && tvp->v_type == VDIR) {
 			error = (ndflag & ND_NFSV2) ? ENOTDIR : EEXIST;
 			goto out;
 		}
 		if (tvp->v_type == VDIR && tvp->v_mountedhere) {
 			error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV;
 			goto out;
 		}
 
 		/*
 		 * A rename to '.' or '..' results in a prematurely
 		 * unlocked vnode on FreeBSD5, so I'm just going to fail that
 		 * here.
 		 */
 		if ((tondp->ni_cnd.cn_namelen == 1 &&
 		     tondp->ni_cnd.cn_nameptr[0] == '.') ||
 		    (tondp->ni_cnd.cn_namelen == 2 &&
 		     tondp->ni_cnd.cn_nameptr[0] == '.' &&
 		     tondp->ni_cnd.cn_nameptr[1] == '.')) {
 			error = EINVAL;
 			goto out;
 		}
 	}
 	if (fvp->v_type == VDIR && fvp->v_mountedhere) {
 		error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV;
 		goto out;
 	}
 	if (fvp->v_mount != tdvp->v_mount) {
 		error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EXDEV;
 		goto out;
 	}
 	if (fvp == tdvp) {
 		error = (ndflag & ND_NFSV2) ? ENOTEMPTY : EINVAL;
 		goto out;
 	}
 	if (fvp == tvp) {
 		/*
 		 * If source and destination are the same, there is nothing to
 		 * do. Set error to -1 to indicate this.
 		 */
 		error = -1;
 		goto out;
 	}
 	if (ndflag & ND_NFSV4) {
 		if (NFSVOPLOCK(fvp, LK_EXCLUSIVE) == 0) {
 			error = nfsrv_checkremove(fvp, 0, p);
 			NFSVOPUNLOCK(fvp, 0);
 		} else
 			error = EPERM;
 		if (tvp && !error)
 			error = nfsrv_checkremove(tvp, 1, p);
 	} else {
 		/*
 		 * For NFSv2 and NFSv3, try to get rid of the delegation, so
 		 * that the NFSv4 client won't be confused by the rename.
 		 * Since nfsd_recalldelegation() can only be called on an
 		 * unlocked vnode at this point and fvp is the file that will
 		 * still exist after the rename, just do fvp.
 		 */
 		nfsd_recalldelegation(fvp, p);
 	}
 out:
 	if (!error) {
 		error = VOP_RENAME(fromndp->ni_dvp, fromndp->ni_vp,
 		    &fromndp->ni_cnd, tondp->ni_dvp, tondp->ni_vp,
 		    &tondp->ni_cnd);
 	} else {
 		if (tdvp == tvp)
 			vrele(tdvp);
 		else
 			vput(tdvp);
 		if (tvp)
 			vput(tvp);
 		vrele(fromndp->ni_dvp);
 		vrele(fvp);
 		if (error == -1)
 			error = 0;
 	}
 	vrele(tondp->ni_startdir);
 	nfsvno_relpathbuf(tondp);
 out1:
 	vrele(fromndp->ni_startdir);
 	nfsvno_relpathbuf(fromndp);
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Link vnode op.
  */
 int
 nfsvno_link(struct nameidata *ndp, struct vnode *vp, struct ucred *cred,
     struct thread *p, struct nfsexstuff *exp)
 {
 	struct vnode *xp;
 	int error = 0;
 
 	xp = ndp->ni_vp;
 	if (xp != NULL) {
 		error = EEXIST;
 	} else {
 		xp = ndp->ni_dvp;
 		if (vp->v_mount != xp->v_mount)
 			error = EXDEV;
 	}
 	if (!error) {
 		NFSVOPLOCK(vp, LK_EXCLUSIVE | LK_RETRY);
 		if ((vp->v_iflag & VI_DOOMED) == 0)
 			error = VOP_LINK(ndp->ni_dvp, vp, &ndp->ni_cnd);
 		else
 			error = EPERM;
 		if (ndp->ni_dvp == vp)
 			vrele(ndp->ni_dvp);
 		else
 			vput(ndp->ni_dvp);
 		NFSVOPUNLOCK(vp, 0);
 	} else {
 		if (ndp->ni_dvp == ndp->ni_vp)
 			vrele(ndp->ni_dvp);
 		else
 			vput(ndp->ni_dvp);
 		if (ndp->ni_vp)
 			vrele(ndp->ni_vp);
 	}
 	nfsvno_relpathbuf(ndp);
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Do the fsync() appropriate for the commit.
  */
 int
 nfsvno_fsync(struct vnode *vp, u_int64_t off, int cnt, struct ucred *cred,
     struct thread *td)
 {
 	int error = 0;
 
 	/*
 	 * RFC 1813 3.3.21: if count is 0, a flush from offset to the end of
 	 * file is done.  At this time VOP_FSYNC does not accept offset and
 	 * byte count parameters so call VOP_FSYNC on the whole file for now.
 	 * The same is true for NFSv4: RFC 3530 Sec. 14.2.3.
+	 * File systems that do not use the buffer cache (as indicated
+	 * by MNTK_USES_BCACHE not being set) must use VOP_FSYNC().
 	 */
-	if (cnt == 0 || cnt > MAX_COMMIT_COUNT) {
+	if (cnt == 0 || cnt > MAX_COMMIT_COUNT ||
+	    (vp->v_mount->mnt_kern_flag & MNTK_USES_BCACHE) == 0) {
 		/*
 		 * Give up and do the whole thing
 		 */
 		if (vp->v_object &&
 		   (vp->v_object->flags & OBJ_MIGHTBEDIRTY)) {
 			VM_OBJECT_WLOCK(vp->v_object);
 			vm_object_page_clean(vp->v_object, 0, 0, OBJPC_SYNC);
 			VM_OBJECT_WUNLOCK(vp->v_object);
 		}
 		error = VOP_FSYNC(vp, MNT_WAIT, td);
 	} else {
 		/*
 		 * Locate and synchronously write any buffers that fall
 		 * into the requested range.  Note:  we are assuming that
 		 * f_iosize is a power of 2.
 		 */
 		int iosize = vp->v_mount->mnt_stat.f_iosize;
 		int iomask = iosize - 1;
 		struct bufobj *bo;
 		daddr_t lblkno;
 
 		/*
 		 * Align to iosize boundary, super-align to page boundary.
 		 */
 		if (off & iomask) {
 			cnt += off & iomask;
 			off &= ~(u_quad_t)iomask;
 		}
 		if (off & PAGE_MASK) {
 			cnt += off & PAGE_MASK;
 			off &= ~(u_quad_t)PAGE_MASK;
 		}
 		lblkno = off / iosize;
 
 		if (vp->v_object &&
 		   (vp->v_object->flags & OBJ_MIGHTBEDIRTY)) {
 			VM_OBJECT_WLOCK(vp->v_object);
 			vm_object_page_clean(vp->v_object, off, off + cnt,
 			    OBJPC_SYNC);
 			VM_OBJECT_WUNLOCK(vp->v_object);
 		}
 
 		bo = &vp->v_bufobj;
 		BO_LOCK(bo);
 		while (cnt > 0) {
 			struct buf *bp;
 
 			/*
 			 * If we have a buffer and it is marked B_DELWRI we
 			 * have to lock and write it.  Otherwise the prior
 			 * write is assumed to have already been committed.
 			 *
 			 * gbincore() can return invalid buffers now so we
 			 * have to check that bit as well (though B_DELWRI
 			 * should not be set if B_INVAL is set there could be
 			 * a race here since we haven't locked the buffer).
 			 */
 			if ((bp = gbincore(&vp->v_bufobj, lblkno)) != NULL) {
 				if (BUF_LOCK(bp, LK_EXCLUSIVE | LK_SLEEPFAIL |
 				    LK_INTERLOCK, BO_LOCKPTR(bo)) == ENOLCK) {
 					BO_LOCK(bo);
 					continue; /* retry */
 				}
 				if ((bp->b_flags & (B_DELWRI|B_INVAL)) ==
 				    B_DELWRI) {
 					bremfree(bp);
 					bp->b_flags &= ~B_ASYNC;
 					bwrite(bp);
 					++nfs_commit_miss;
 				} else
 					BUF_UNLOCK(bp);
 				BO_LOCK(bo);
 			}
 			++nfs_commit_blks;
 			if (cnt < iosize)
 				break;
 			cnt -= iosize;
 			++lblkno;
 		}
 		BO_UNLOCK(bo);
 	}
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Statfs vnode op.
  */
 int
 nfsvno_statfs(struct vnode *vp, struct statfs *sf)
 {
 	int error;
 
 	error = VFS_STATFS(vp->v_mount, sf);
 	if (error == 0) {
 		/*
 		 * Since NFS handles these values as unsigned on the
 		 * wire, there is no way to represent negative values,
 		 * so set them to 0. Without this, they will appear
 		 * to be very large positive values for clients like
 		 * Solaris10.
 		 */
 		if (sf->f_bavail < 0)
 			sf->f_bavail = 0;
 		if (sf->f_ffree < 0)
 			sf->f_ffree = 0;
 	}
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Do the vnode op stuff for Open. Similar to nfsvno_createsub(), but
  * must handle nfsrv_opencheck() calls after any other access checks.
  */
 void
 nfsvno_open(struct nfsrv_descript *nd, struct nameidata *ndp,
     nfsquad_t clientid, nfsv4stateid_t *stateidp, struct nfsstate *stp,
     int *exclusive_flagp, struct nfsvattr *nvap, int32_t *cverf, int create,
     NFSACL_T *aclp, nfsattrbit_t *attrbitp, struct ucred *cred, struct thread *p,
     struct nfsexstuff *exp, struct vnode **vpp)
 {
 	struct vnode *vp = NULL;
 	u_quad_t tempsize;
 	struct nfsexstuff nes;
 
 	if (ndp->ni_vp == NULL)
 		nd->nd_repstat = nfsrv_opencheck(clientid,
 		    stateidp, stp, NULL, nd, p, nd->nd_repstat);
 	if (!nd->nd_repstat) {
 		if (ndp->ni_vp == NULL) {
 			vrele(ndp->ni_startdir);
 			nd->nd_repstat = VOP_CREATE(ndp->ni_dvp,
 			    &ndp->ni_vp, &ndp->ni_cnd, &nvap->na_vattr);
 			vput(ndp->ni_dvp);
 			nfsvno_relpathbuf(ndp);
 			if (!nd->nd_repstat) {
 				if (*exclusive_flagp) {
 					*exclusive_flagp = 0;
 					NFSVNO_ATTRINIT(nvap);
 					nvap->na_atime.tv_sec = cverf[0];
 					nvap->na_atime.tv_nsec = cverf[1];
 					nd->nd_repstat = VOP_SETATTR(ndp->ni_vp,
 					    &nvap->na_vattr, cred);
 				} else {
 					nfsrv_fixattr(nd, ndp->ni_vp, nvap,
 					    aclp, p, attrbitp, exp);
 				}
 			}
 			vp = ndp->ni_vp;
 		} else {
 			if (ndp->ni_startdir)
 				vrele(ndp->ni_startdir);
 			nfsvno_relpathbuf(ndp);
 			vp = ndp->ni_vp;
 			if (create == NFSV4OPEN_CREATE) {
 				if (ndp->ni_dvp == vp)
 					vrele(ndp->ni_dvp);
 				else
 					vput(ndp->ni_dvp);
 			}
 			if (NFSVNO_ISSETSIZE(nvap) && vp->v_type == VREG) {
 				if (ndp->ni_cnd.cn_flags & RDONLY)
 					NFSVNO_SETEXRDONLY(&nes);
 				else
 					NFSVNO_EXINIT(&nes);
 				nd->nd_repstat = nfsvno_accchk(vp,
 				    VWRITE, cred, &nes, p,
 				    NFSACCCHK_NOOVERRIDE,
 				    NFSACCCHK_VPISLOCKED, NULL);
 				nd->nd_repstat = nfsrv_opencheck(clientid,
 				    stateidp, stp, vp, nd, p, nd->nd_repstat);
 				if (!nd->nd_repstat) {
 					tempsize = nvap->na_size;
 					NFSVNO_ATTRINIT(nvap);
 					nvap->na_size = tempsize;
 					nd->nd_repstat = VOP_SETATTR(vp,
 					    &nvap->na_vattr, cred);
 				}
 			} else if (vp->v_type == VREG) {
 				nd->nd_repstat = nfsrv_opencheck(clientid,
 				    stateidp, stp, vp, nd, p, nd->nd_repstat);
 			}
 		}
 	} else {
 		if (ndp->ni_cnd.cn_flags & HASBUF)
 			nfsvno_relpathbuf(ndp);
 		if (ndp->ni_startdir && create == NFSV4OPEN_CREATE) {
 			vrele(ndp->ni_startdir);
 			if (ndp->ni_dvp == ndp->ni_vp)
 				vrele(ndp->ni_dvp);
 			else
 				vput(ndp->ni_dvp);
 			if (ndp->ni_vp)
 				vput(ndp->ni_vp);
 		}
 	}
 	*vpp = vp;
 
 	NFSEXITCODE2(0, nd);
 }
 
 /*
  * Updates the file rev and sets the mtime and ctime
  * to the current clock time, returning the va_filerev and va_Xtime
  * values.
  * Return ESTALE to indicate the vnode is VI_DOOMED.
  */
 int
 nfsvno_updfilerev(struct vnode *vp, struct nfsvattr *nvap,
     struct ucred *cred, struct thread *p)
 {
 	struct vattr va;
 
 	VATTR_NULL(&va);
 	vfs_timestamp(&va.va_mtime);
 	if (NFSVOPISLOCKED(vp) != LK_EXCLUSIVE) {
 		NFSVOPLOCK(vp, LK_UPGRADE | LK_RETRY);
 		if ((vp->v_iflag & VI_DOOMED) != 0)
 			return (ESTALE);
 	}
 	(void) VOP_SETATTR(vp, &va, cred);
 	(void) nfsvno_getattr(vp, nvap, cred, p, 1);
 	return (0);
 }
 
 /*
  * Glue routine to nfsv4_fillattr().
  */
 int
 nfsvno_fillattr(struct nfsrv_descript *nd, struct mount *mp, struct vnode *vp,
     struct nfsvattr *nvap, fhandle_t *fhp, int rderror, nfsattrbit_t *attrbitp,
     struct ucred *cred, struct thread *p, int isdgram, int reterr,
     int supports_nfsv4acls, int at_root, uint64_t mounted_on_fileno)
 {
 	int error;
 
 	error = nfsv4_fillattr(nd, mp, vp, NULL, &nvap->na_vattr, fhp, rderror,
 	    attrbitp, cred, p, isdgram, reterr, supports_nfsv4acls, at_root,
 	    mounted_on_fileno);
 	NFSEXITCODE2(0, nd);
 	return (error);
 }
 
 /* Since the Readdir vnode ops vary, put the entire functions in here. */
 /*
  * nfs readdir service
  * - mallocs what it thinks is enough to read
  *	count rounded up to a multiple of DIRBLKSIZ <= NFS_MAXREADDIR
  * - calls VOP_READDIR()
  * - loops around building the reply
  *	if the output generated exceeds count break out of loop
  *	The NFSM_CLGET macro is used here so that the reply will be packed
  *	tightly in mbuf clusters.
  * - it trims out records with d_fileno == 0
  *	these don't matter for Unix clients, but they might confuse clients
  *	for other operating systems.
  * - it trims out records with d_type == DT_WHT
  *	these cannot be seen through NFS (unless we extend the protocol)
  *     The alternate call nfsrvd_readdirplus() does lookups as well.
  * PS: The NFS protocol spec. does not clarify what the "count" byte
  *	argument is a count of.. just name strings and file id's or the
  *	entire reply rpc or ...
  *	I tried just file name and id sizes and it confused the Sun client,
  *	so I am using the full rpc size now. The "paranoia.." comment refers
  *	to including the status longwords that are not a part of the dir.
  *	"entry" structures, but are in the rpc.
  */
 int
 nfsrvd_readdir(struct nfsrv_descript *nd, int isdgram,
     struct vnode *vp, struct thread *p, struct nfsexstuff *exp)
 {
 	struct dirent *dp;
 	u_int32_t *tl;
 	int dirlen;
 	char *cpos, *cend, *rbuf;
 	struct nfsvattr at;
 	int nlen, error = 0, getret = 1;
 	int siz, cnt, fullsiz, eofflag, ncookies;
 	u_int64_t off, toff, verf;
 	u_long *cookies = NULL, *cookiep;
 	struct uio io;
 	struct iovec iv;
 	int is_ufs;
 
 	if (nd->nd_repstat) {
 		nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	if (nd->nd_flag & ND_NFSV2) {
 		NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 		off = fxdr_unsigned(u_quad_t, *tl++);
 	} else {
 		NFSM_DISSECT(tl, u_int32_t *, 5 * NFSX_UNSIGNED);
 		off = fxdr_hyper(tl);
 		tl += 2;
 		verf = fxdr_hyper(tl);
 		tl += 2;
 	}
 	toff = off;
 	cnt = fxdr_unsigned(int, *tl);
 	if (cnt > NFS_SRVMAXDATA(nd) || cnt < 0)
 		cnt = NFS_SRVMAXDATA(nd);
 	siz = ((cnt + DIRBLKSIZ - 1) & ~(DIRBLKSIZ - 1));
 	fullsiz = siz;
 	if (nd->nd_flag & ND_NFSV3) {
 		nd->nd_repstat = getret = nfsvno_getattr(vp, &at, nd->nd_cred,
 		    p, 1);
 #if 0
 		/*
 		 * va_filerev is not sufficient as a cookie verifier,
 		 * since it is not supposed to change when entries are
 		 * removed/added unless the offset cookies returned to
 		 * the client are no longer valid.
 		 */
 		if (!nd->nd_repstat && toff && verf != at.na_filerev)
 			nd->nd_repstat = NFSERR_BAD_COOKIE;
 #endif
 	}
 	if (!nd->nd_repstat && vp->v_type != VDIR)
 		nd->nd_repstat = NFSERR_NOTDIR;
 	if (nd->nd_repstat == 0 && cnt == 0) {
 		if (nd->nd_flag & ND_NFSV2)
 			/* NFSv2 does not have NFSERR_TOOSMALL */
 			nd->nd_repstat = EPERM;
 		else
 			nd->nd_repstat = NFSERR_TOOSMALL;
 	}
 	if (!nd->nd_repstat)
 		nd->nd_repstat = nfsvno_accchk(vp, VEXEC,
 		    nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE,
 		    NFSACCCHK_VPISLOCKED, NULL);
 	if (nd->nd_repstat) {
 		vput(vp);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	is_ufs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "ufs") == 0;
 	MALLOC(rbuf, caddr_t, siz, M_TEMP, M_WAITOK);
 again:
 	eofflag = 0;
 	if (cookies) {
 		free((caddr_t)cookies, M_TEMP);
 		cookies = NULL;
 	}
 
 	iv.iov_base = rbuf;
 	iv.iov_len = siz;
 	io.uio_iov = &iv;
 	io.uio_iovcnt = 1;
 	io.uio_offset = (off_t)off;
 	io.uio_resid = siz;
 	io.uio_segflg = UIO_SYSSPACE;
 	io.uio_rw = UIO_READ;
 	io.uio_td = NULL;
 	nd->nd_repstat = VOP_READDIR(vp, &io, nd->nd_cred, &eofflag, &ncookies,
 	    &cookies);
 	off = (u_int64_t)io.uio_offset;
 	if (io.uio_resid)
 		siz -= io.uio_resid;
 
 	if (!cookies && !nd->nd_repstat)
 		nd->nd_repstat = NFSERR_PERM;
 	if (nd->nd_flag & ND_NFSV3) {
 		getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1);
 		if (!nd->nd_repstat)
 			nd->nd_repstat = getret;
 	}
 
 	/*
 	 * Handles the failed cases. nd->nd_repstat == 0 past here.
 	 */
 	if (nd->nd_repstat) {
 		vput(vp);
 		free((caddr_t)rbuf, M_TEMP);
 		if (cookies)
 			free((caddr_t)cookies, M_TEMP);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	/*
 	 * If nothing read, return eof
 	 * rpc reply
 	 */
 	if (siz == 0) {
 		vput(vp);
 		if (nd->nd_flag & ND_NFSV2) {
 			NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 		} else {
 			nfsrv_postopattr(nd, getret, &at);
 			NFSM_BUILD(tl, u_int32_t *, 4 * NFSX_UNSIGNED);
 			txdr_hyper(at.na_filerev, tl);
 			tl += 2;
 		}
 		*tl++ = newnfs_false;
 		*tl = newnfs_true;
 		FREE((caddr_t)rbuf, M_TEMP);
 		FREE((caddr_t)cookies, M_TEMP);
 		goto out;
 	}
 
 	/*
 	 * Check for degenerate cases of nothing useful read.
 	 * If so go try again
 	 */
 	cpos = rbuf;
 	cend = rbuf + siz;
 	dp = (struct dirent *)cpos;
 	cookiep = cookies;
 
 	/*
 	 * For some reason FreeBSD's ufs_readdir() chooses to back the
 	 * directory offset up to a block boundary, so it is necessary to
 	 * skip over the records that precede the requested offset. This
 	 * requires the assumption that file offset cookies monotonically
 	 * increase.
 	 */
 	while (cpos < cend && ncookies > 0 &&
 	    (dp->d_fileno == 0 || dp->d_type == DT_WHT ||
 	     (is_ufs == 1 && ((u_quad_t)(*cookiep)) <= toff))) {
 		cpos += dp->d_reclen;
 		dp = (struct dirent *)cpos;
 		cookiep++;
 		ncookies--;
 	}
 	if (cpos >= cend || ncookies == 0) {
 		siz = fullsiz;
 		toff = off;
 		goto again;
 	}
 	vput(vp);
 
 	/*
 	 * dirlen is the size of the reply, including all XDR and must
 	 * not exceed cnt. For NFSv2, RFC1094 didn't clearly indicate
 	 * if the XDR should be included in "count", but to be safe, we do.
 	 * (Include the two booleans at the end of the reply in dirlen now.)
 	 */
 	if (nd->nd_flag & ND_NFSV3) {
 		nfsrv_postopattr(nd, getret, &at);
 		NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 		txdr_hyper(at.na_filerev, tl);
 		dirlen = NFSX_V3POSTOPATTR + NFSX_VERF + 2 * NFSX_UNSIGNED;
 	} else {
 		dirlen = 2 * NFSX_UNSIGNED;
 	}
 
 	/* Loop through the records and build reply */
 	while (cpos < cend && ncookies > 0) {
 		nlen = dp->d_namlen;
 		if (dp->d_fileno != 0 && dp->d_type != DT_WHT &&
 			nlen <= NFS_MAXNAMLEN) {
 			if (nd->nd_flag & ND_NFSV3)
 				dirlen += (6*NFSX_UNSIGNED + NFSM_RNDUP(nlen));
 			else
 				dirlen += (4*NFSX_UNSIGNED + NFSM_RNDUP(nlen));
 			if (dirlen > cnt) {
 				eofflag = 0;
 				break;
 			}
 
 			/*
 			 * Build the directory record xdr from
 			 * the dirent entry.
 			 */
 			if (nd->nd_flag & ND_NFSV3) {
 				NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
 				*tl++ = newnfs_true;
 				*tl++ = 0;
 			} else {
 				NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 				*tl++ = newnfs_true;
 			}
 			*tl = txdr_unsigned(dp->d_fileno);
 			(void) nfsm_strtom(nd, dp->d_name, nlen);
 			if (nd->nd_flag & ND_NFSV3) {
 				NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 				*tl++ = 0;
 			} else
 				NFSM_BUILD(tl, u_int32_t *, NFSX_UNSIGNED);
 			*tl = txdr_unsigned(*cookiep);
 		}
 		cpos += dp->d_reclen;
 		dp = (struct dirent *)cpos;
 		cookiep++;
 		ncookies--;
 	}
 	if (cpos < cend)
 		eofflag = 0;
 	NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 	*tl++ = newnfs_false;
 	if (eofflag)
 		*tl = newnfs_true;
 	else
 		*tl = newnfs_false;
 	FREE((caddr_t)rbuf, M_TEMP);
 	FREE((caddr_t)cookies, M_TEMP);
 
 out:
 	NFSEXITCODE2(0, nd);
 	return (0);
 nfsmout:
 	vput(vp);
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Readdirplus for V3 and Readdir for V4.
  */
 int
 nfsrvd_readdirplus(struct nfsrv_descript *nd, int isdgram,
     struct vnode *vp, struct thread *p, struct nfsexstuff *exp)
 {
 	struct dirent *dp;
 	u_int32_t *tl;
 	int dirlen;
 	char *cpos, *cend, *rbuf;
 	struct vnode *nvp;
 	fhandle_t nfh;
 	struct nfsvattr nva, at, *nvap = &nva;
 	struct mbuf *mb0, *mb1;
 	struct nfsreferral *refp;
 	int nlen, r, error = 0, getret = 1, usevget = 1;
 	int siz, cnt, fullsiz, eofflag, ncookies, entrycnt;
 	caddr_t bpos0, bpos1;
 	u_int64_t off, toff, verf;
 	u_long *cookies = NULL, *cookiep;
 	nfsattrbit_t attrbits, rderrbits, savbits;
 	struct uio io;
 	struct iovec iv;
 	struct componentname cn;
 	int at_root, is_ufs, is_zfs, needs_unbusy, supports_nfsv4acls;
 	struct mount *mp, *new_mp;
 	uint64_t mounted_on_fileno;
 
 	if (nd->nd_repstat) {
 		nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	NFSM_DISSECT(tl, u_int32_t *, 6 * NFSX_UNSIGNED);
 	off = fxdr_hyper(tl);
 	toff = off;
 	tl += 2;
 	verf = fxdr_hyper(tl);
 	tl += 2;
 	siz = fxdr_unsigned(int, *tl++);
 	cnt = fxdr_unsigned(int, *tl);
 
 	/*
 	 * Use the server's maximum data transfer size as the upper bound
 	 * on reply datalen.
 	 */
 	if (cnt > NFS_SRVMAXDATA(nd) || cnt < 0)
 		cnt = NFS_SRVMAXDATA(nd);
 
 	/*
 	 * siz is a "hint" of how much directory information (name, fileid,
 	 * cookie) should be in the reply. At least one client "hints" 0,
 	 * so I set it to cnt for that case. I also round it up to the
 	 * next multiple of DIRBLKSIZ.
 	 */
 	if (siz <= 0)
 		siz = cnt;
 	siz = ((siz + DIRBLKSIZ - 1) & ~(DIRBLKSIZ - 1));
 
 	if (nd->nd_flag & ND_NFSV4) {
 		error = nfsrv_getattrbits(nd, &attrbits, NULL, NULL);
 		if (error)
 			goto nfsmout;
 		NFSSET_ATTRBIT(&savbits, &attrbits);
 		NFSCLRNOTFILLABLE_ATTRBIT(&attrbits);
 		NFSZERO_ATTRBIT(&rderrbits);
 		NFSSETBIT_ATTRBIT(&rderrbits, NFSATTRBIT_RDATTRERROR);
 	} else {
 		NFSZERO_ATTRBIT(&attrbits);
 	}
 	fullsiz = siz;
 	nd->nd_repstat = getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1);
 	if (!nd->nd_repstat) {
 	    if (off && verf != at.na_filerev) {
 		/*
 		 * va_filerev is not sufficient as a cookie verifier,
 		 * since it is not supposed to change when entries are
 		 * removed/added unless the offset cookies returned to
 		 * the client are no longer valid.
 		 */
 #if 0
 		if (nd->nd_flag & ND_NFSV4) {
 			nd->nd_repstat = NFSERR_NOTSAME;
 		} else {
 			nd->nd_repstat = NFSERR_BAD_COOKIE;
 		}
 #endif
 	    } else if ((nd->nd_flag & ND_NFSV4) && off == 0 && verf != 0) {
 		nd->nd_repstat = NFSERR_BAD_COOKIE;
 	    }
 	}
 	if (!nd->nd_repstat && vp->v_type != VDIR)
 		nd->nd_repstat = NFSERR_NOTDIR;
 	if (!nd->nd_repstat && cnt == 0)
 		nd->nd_repstat = NFSERR_TOOSMALL;
 	if (!nd->nd_repstat)
 		nd->nd_repstat = nfsvno_accchk(vp, VEXEC,
 		    nd->nd_cred, exp, p, NFSACCCHK_NOOVERRIDE,
 		    NFSACCCHK_VPISLOCKED, NULL);
 	if (nd->nd_repstat) {
 		vput(vp);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	is_ufs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "ufs") == 0;
 	is_zfs = strcmp(vp->v_mount->mnt_vfc->vfc_name, "zfs") == 0;
 
 	MALLOC(rbuf, caddr_t, siz, M_TEMP, M_WAITOK);
 again:
 	eofflag = 0;
 	if (cookies) {
 		free((caddr_t)cookies, M_TEMP);
 		cookies = NULL;
 	}
 
 	iv.iov_base = rbuf;
 	iv.iov_len = siz;
 	io.uio_iov = &iv;
 	io.uio_iovcnt = 1;
 	io.uio_offset = (off_t)off;
 	io.uio_resid = siz;
 	io.uio_segflg = UIO_SYSSPACE;
 	io.uio_rw = UIO_READ;
 	io.uio_td = NULL;
 	nd->nd_repstat = VOP_READDIR(vp, &io, nd->nd_cred, &eofflag, &ncookies,
 	    &cookies);
 	off = (u_int64_t)io.uio_offset;
 	if (io.uio_resid)
 		siz -= io.uio_resid;
 
 	getret = nfsvno_getattr(vp, &at, nd->nd_cred, p, 1);
 
 	if (!cookies && !nd->nd_repstat)
 		nd->nd_repstat = NFSERR_PERM;
 	if (!nd->nd_repstat)
 		nd->nd_repstat = getret;
 	if (nd->nd_repstat) {
 		vput(vp);
 		if (cookies)
 			free((caddr_t)cookies, M_TEMP);
 		free((caddr_t)rbuf, M_TEMP);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 	/*
 	 * If nothing was read, return an EOF
 	 * RPC reply.
 	 */
 	if (siz == 0) {
 		vput(vp);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		NFSM_BUILD(tl, u_int32_t *, 4 * NFSX_UNSIGNED);
 		txdr_hyper(at.na_filerev, tl);
 		tl += 2;
 		*tl++ = newnfs_false;
 		*tl = newnfs_true;
 		free((caddr_t)cookies, M_TEMP);
 		free((caddr_t)rbuf, M_TEMP);
 		goto out;
 	}
 
 	/*
 	 * Check for the degenerate case where nothing useful was read.
 	 * If so, go try again.
 	 */
 	cpos = rbuf;
 	cend = rbuf + siz;
 	dp = (struct dirent *)cpos;
 	cookiep = cookies;
 
 	/*
 	 * For some reason FreeBSD's ufs_readdir() chooses to back the
 	 * directory offset up to a block boundary, so it is necessary to
 	 * skip over the records that precede the requested offset. This
 	 * requires the assumption that file offset cookies monotonically
 	 * increase.
 	 */
 	while (cpos < cend && ncookies > 0 &&
 	  (dp->d_fileno == 0 || dp->d_type == DT_WHT ||
 	   (is_ufs == 1 && ((u_quad_t)(*cookiep)) <= toff) ||
 	   ((nd->nd_flag & ND_NFSV4) &&
 	    ((dp->d_namlen == 1 && dp->d_name[0] == '.') ||
 	     (dp->d_namlen==2 && dp->d_name[0]=='.' && dp->d_name[1]=='.'))))) {
 		cpos += dp->d_reclen;
 		dp = (struct dirent *)cpos;
 		cookiep++;
 		ncookies--;
 	}
 	if (cpos >= cend || ncookies == 0) {
 		siz = fullsiz;
 		toff = off;
 		goto again;
 	}
 
 	/*
 	 * Busy the file system so that the mount point won't go away
 	 * and, as such, VFS_VGET() can be used safely.
 	 */
 	mp = vp->v_mount;
 	vfs_ref(mp);
 	NFSVOPUNLOCK(vp, 0);
 	nd->nd_repstat = vfs_busy(mp, 0);
 	vfs_rel(mp);
 	if (nd->nd_repstat != 0) {
 		vrele(vp);
 		free(cookies, M_TEMP);
 		free(rbuf, M_TEMP);
 		if (nd->nd_flag & ND_NFSV3)
 			nfsrv_postopattr(nd, getret, &at);
 		goto out;
 	}
 
 	/*
 	 * Check to see if entries in this directory can be safely acquired
 	 * via VFS_VGET() or if a switch to VOP_LOOKUP() is required.
 	 * ZFS snapshot directories need VOP_LOOKUP(), so that any
 	 * automount of the snapshot directory that is required will
 	 * be done.
 	 * This needs to be done here for NFSv4, since NFSv4 never does
 	 * a VFS_VGET() for "." or "..".
 	 */
 	if (is_zfs == 1) {
 		r = VFS_VGET(mp, at.na_fileid, LK_SHARED, &nvp);
 		if (r == EOPNOTSUPP) {
 			usevget = 0;
 			cn.cn_nameiop = LOOKUP;
 			cn.cn_lkflags = LK_SHARED | LK_RETRY;
 			cn.cn_cred = nd->nd_cred;
 			cn.cn_thread = p;
 		} else if (r == 0)
 			vput(nvp);
 	}
 
 	/*
 	 * Save this position, in case there is an error before one entry
 	 * is created.
 	 */
 	mb0 = nd->nd_mb;
 	bpos0 = nd->nd_bpos;
 
 	/*
 	 * Fill in the first part of the reply.
 	 * dirlen is the reply length in bytes and cannot exceed cnt.
 	 * (Include the two booleans at the end of the reply in dirlen now,
 	 *  so we recognize when we have exceeded cnt.)
 	 */
 	if (nd->nd_flag & ND_NFSV3) {
 		dirlen = NFSX_V3POSTOPATTR + NFSX_VERF + 2 * NFSX_UNSIGNED;
 		nfsrv_postopattr(nd, getret, &at);
 	} else {
 		dirlen = NFSX_VERF + 2 * NFSX_UNSIGNED;
 	}
 	NFSM_BUILD(tl, u_int32_t *, NFSX_VERF);
 	txdr_hyper(at.na_filerev, tl);
 
 	/*
 	 * Save this position, in case there is an empty reply needed.
 	 */
 	mb1 = nd->nd_mb;
 	bpos1 = nd->nd_bpos;
 
 	/* Loop through the records and build reply */
 	entrycnt = 0;
 	while (cpos < cend && ncookies > 0 && dirlen < cnt) {
 		nlen = dp->d_namlen;
 		if (dp->d_fileno != 0 && dp->d_type != DT_WHT &&
 		    nlen <= NFS_MAXNAMLEN &&
 		    ((nd->nd_flag & ND_NFSV3) || nlen > 2 ||
 		     (nlen==2 && (dp->d_name[0]!='.' || dp->d_name[1]!='.'))
 		      || (nlen == 1 && dp->d_name[0] != '.'))) {
 			/*
 			 * Save the current position in the reply, in case
 			 * this entry exceeds cnt.
 			 */
 			mb1 = nd->nd_mb;
 			bpos1 = nd->nd_bpos;
 	
 			/*
 			 * For readdir_and_lookup get the vnode using
 			 * the file number.
 			 */
 			nvp = NULL;
 			refp = NULL;
 			r = 0;
 			at_root = 0;
 			needs_unbusy = 0;
 			new_mp = mp;
 			mounted_on_fileno = (uint64_t)dp->d_fileno;
 			if ((nd->nd_flag & ND_NFSV3) ||
 			    NFSNONZERO_ATTRBIT(&savbits)) {
 				if (nd->nd_flag & ND_NFSV4)
 					refp = nfsv4root_getreferral(NULL,
 					    vp, dp->d_fileno);
 				if (refp == NULL) {
 					if (usevget)
 						r = VFS_VGET(mp, dp->d_fileno,
 						    LK_SHARED, &nvp);
 					else
 						r = EOPNOTSUPP;
 					if (r == EOPNOTSUPP) {
 						if (usevget) {
 							usevget = 0;
 							cn.cn_nameiop = LOOKUP;
 							cn.cn_lkflags =
 							    LK_SHARED |
 							    LK_RETRY;
 							cn.cn_cred =
 							    nd->nd_cred;
 							cn.cn_thread = p;
 						}
 						cn.cn_nameptr = dp->d_name;
 						cn.cn_namelen = nlen;
 						cn.cn_flags = ISLASTCN |
 						    NOFOLLOW | LOCKLEAF;
 						if (nlen == 2 &&
 						    dp->d_name[0] == '.' &&
 						    dp->d_name[1] == '.')
 							cn.cn_flags |=
 							    ISDOTDOT;
 						if (NFSVOPLOCK(vp, LK_SHARED)
 						    != 0) {
 							nd->nd_repstat = EPERM;
 							break;
 						}
 						if ((vp->v_vflag & VV_ROOT) != 0
 						    && (cn.cn_flags & ISDOTDOT)
 						    != 0) {
 							vref(vp);
 							nvp = vp;
 							r = 0;
 						} else {
 							r = VOP_LOOKUP(vp, &nvp,
 							    &cn);
 							if (vp != nvp)
 								NFSVOPUNLOCK(vp,
 								    0);
 						}
 					}
 
 					/*
 					 * For NFSv4, check to see if nvp is
 					 * a mount point and get the mount
 					 * point vnode, as required.
 					 */
 					if (r == 0 &&
 					    nfsrv_enable_crossmntpt != 0 &&
 					    (nd->nd_flag & ND_NFSV4) != 0 &&
 					    nvp->v_type == VDIR &&
 					    nvp->v_mountedhere != NULL) {
 						new_mp = nvp->v_mountedhere;
 						r = vfs_busy(new_mp, 0);
 						vput(nvp);
 						nvp = NULL;
 						if (r == 0) {
 							r = VFS_ROOT(new_mp,
 							    LK_SHARED, &nvp);
 							needs_unbusy = 1;
 							if (r == 0)
 								at_root = 1;
 						}
 					}
 				}
 				if (!r) {
 				    if (refp == NULL &&
 					((nd->nd_flag & ND_NFSV3) ||
 					 NFSNONZERO_ATTRBIT(&attrbits))) {
 					r = nfsvno_getfh(nvp, &nfh, p);
 					if (!r)
 					    r = nfsvno_getattr(nvp, nvap,
 						nd->nd_cred, p, 1);
 					if (r == 0 && is_zfs == 1 &&
 					    nfsrv_enable_crossmntpt != 0 &&
 					    (nd->nd_flag & ND_NFSV4) != 0 &&
 					    nvp->v_type == VDIR &&
 					    vp->v_mount != nvp->v_mount) {
 					    /*
 					     * For a ZFS snapshot, there is a
 					     * pseudo mount that does not set
 					     * v_mountedhere, so it needs to
 					     * be detected via a different
 					     * mount structure.
 					     */
 					    at_root = 1;
 					    if (new_mp == mp)
 						new_mp = nvp->v_mount;
 					}
 				    }
 				} else {
 				    nvp = NULL;
 				}
 				if (r) {
 					if (!NFSISSET_ATTRBIT(&attrbits,
 					    NFSATTRBIT_RDATTRERROR)) {
 						if (nvp != NULL)
 							vput(nvp);
 						if (needs_unbusy != 0)
 							vfs_unbusy(new_mp);
 						nd->nd_repstat = r;
 						break;
 					}
 				}
 			}
 
 			/*
 			 * Build the directory record xdr
 			 */
 			if (nd->nd_flag & ND_NFSV3) {
 				NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
 				*tl++ = newnfs_true;
 				*tl++ = 0;
 				*tl = txdr_unsigned(dp->d_fileno);
 				dirlen += nfsm_strtom(nd, dp->d_name, nlen);
 				NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 				*tl++ = 0;
 				*tl = txdr_unsigned(*cookiep);
 				nfsrv_postopattr(nd, 0, nvap);
 				dirlen += nfsm_fhtom(nd,(u_int8_t *)&nfh,0,1);
 				dirlen += (5*NFSX_UNSIGNED+NFSX_V3POSTOPATTR);
 				if (nvp != NULL)
 					vput(nvp);
 			} else {
 				NFSM_BUILD(tl, u_int32_t *, 3 * NFSX_UNSIGNED);
 				*tl++ = newnfs_true;
 				*tl++ = 0;
 				*tl = txdr_unsigned(*cookiep);
 				dirlen += nfsm_strtom(nd, dp->d_name, nlen);
 				if (nvp != NULL) {
 					supports_nfsv4acls =
 					    nfs_supportsnfsv4acls(nvp);
 					NFSVOPUNLOCK(nvp, 0);
 				} else
 					supports_nfsv4acls = 0;
 				if (refp != NULL) {
 					dirlen += nfsrv_putreferralattr(nd,
 					    &savbits, refp, 0,
 					    &nd->nd_repstat);
 					if (nd->nd_repstat) {
 						if (nvp != NULL)
 							vrele(nvp);
 						if (needs_unbusy != 0)
 							vfs_unbusy(new_mp);
 						break;
 					}
 				} else if (r) {
 					dirlen += nfsvno_fillattr(nd, new_mp,
 					    nvp, nvap, &nfh, r, &rderrbits,
 					    nd->nd_cred, p, isdgram, 0,
 					    supports_nfsv4acls, at_root,
 					    mounted_on_fileno);
 				} else {
 					dirlen += nfsvno_fillattr(nd, new_mp,
 					    nvp, nvap, &nfh, r, &attrbits,
 					    nd->nd_cred, p, isdgram, 0,
 					    supports_nfsv4acls, at_root,
 					    mounted_on_fileno);
 				}
 				if (nvp != NULL)
 					vrele(nvp);
 				dirlen += (3 * NFSX_UNSIGNED);
 			}
 			if (needs_unbusy != 0)
 				vfs_unbusy(new_mp);
 			if (dirlen <= cnt)
 				entrycnt++;
 		}
 		cpos += dp->d_reclen;
 		dp = (struct dirent *)cpos;
 		cookiep++;
 		ncookies--;
 	}
 	vrele(vp);
 	vfs_unbusy(mp);
 
 	/*
 	 * If dirlen > cnt, we must strip off the last entry. If that
 	 * results in an empty reply, report NFSERR_TOOSMALL.
 	 */
 	if (dirlen > cnt || nd->nd_repstat) {
 		if (!nd->nd_repstat && entrycnt == 0)
 			nd->nd_repstat = NFSERR_TOOSMALL;
 		if (nd->nd_repstat) {
 			newnfs_trimtrailing(nd, mb0, bpos0);
 			if (nd->nd_flag & ND_NFSV3)
 				nfsrv_postopattr(nd, getret, &at);
 		} else
 			newnfs_trimtrailing(nd, mb1, bpos1);
 		eofflag = 0;
 	} else if (cpos < cend)
 		eofflag = 0;
 	if (!nd->nd_repstat) {
 		NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 		*tl++ = newnfs_false;
 		if (eofflag)
 			*tl = newnfs_true;
 		else
 			*tl = newnfs_false;
 	}
 	FREE((caddr_t)cookies, M_TEMP);
 	FREE((caddr_t)rbuf, M_TEMP);
 
 out:
 	NFSEXITCODE2(0, nd);
 	return (0);
 nfsmout:
 	vput(vp);
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Get the settable attributes out of the mbuf list.
  * (Return 0 or EBADRPC)
  */
 int
 nfsrv_sattr(struct nfsrv_descript *nd, vnode_t vp, struct nfsvattr *nvap,
     nfsattrbit_t *attrbitp, NFSACL_T *aclp, struct thread *p)
 {
 	u_int32_t *tl;
 	struct nfsv2_sattr *sp;
 	int error = 0, toclient = 0;
 
 	switch (nd->nd_flag & (ND_NFSV2 | ND_NFSV3 | ND_NFSV4)) {
 	case ND_NFSV2:
 		NFSM_DISSECT(sp, struct nfsv2_sattr *, NFSX_V2SATTR);
 		/*
 		 * Some old clients didn't fill in the high order 16 bits.
 		 * --> check the low order 2 bytes for 0xffff
 		 */
 		if ((fxdr_unsigned(int, sp->sa_mode) & 0xffff) != 0xffff)
 			nvap->na_mode = nfstov_mode(sp->sa_mode);
 		if (sp->sa_uid != newnfs_xdrneg1)
 			nvap->na_uid = fxdr_unsigned(uid_t, sp->sa_uid);
 		if (sp->sa_gid != newnfs_xdrneg1)
 			nvap->na_gid = fxdr_unsigned(gid_t, sp->sa_gid);
 		if (sp->sa_size != newnfs_xdrneg1)
 			nvap->na_size = fxdr_unsigned(u_quad_t, sp->sa_size);
 		if (sp->sa_atime.nfsv2_sec != newnfs_xdrneg1) {
 #ifdef notyet
 			fxdr_nfsv2time(&sp->sa_atime, &nvap->na_atime);
 #else
 			nvap->na_atime.tv_sec =
 				fxdr_unsigned(u_int32_t,sp->sa_atime.nfsv2_sec);
 			nvap->na_atime.tv_nsec = 0;
 #endif
 		}
 		if (sp->sa_mtime.nfsv2_sec != newnfs_xdrneg1)
 			fxdr_nfsv2time(&sp->sa_mtime, &nvap->na_mtime);
 		break;
 	case ND_NFSV3:
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		if (*tl == newnfs_true) {
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			nvap->na_mode = nfstov_mode(*tl);
 		}
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		if (*tl == newnfs_true) {
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			nvap->na_uid = fxdr_unsigned(uid_t, *tl);
 		}
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		if (*tl == newnfs_true) {
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			nvap->na_gid = fxdr_unsigned(gid_t, *tl);
 		}
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		if (*tl == newnfs_true) {
 			NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 			nvap->na_size = fxdr_hyper(tl);
 		}
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		switch (fxdr_unsigned(int, *tl)) {
 		case NFSV3SATTRTIME_TOCLIENT:
 			NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 			fxdr_nfsv3time(tl, &nvap->na_atime);
 			toclient = 1;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
 			vfs_timestamp(&nvap->na_atime);
 			nvap->na_vaflags |= VA_UTIMES_NULL;
 			break;
 		}
 		NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 		switch (fxdr_unsigned(int, *tl)) {
 		case NFSV3SATTRTIME_TOCLIENT:
 			NFSM_DISSECT(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
 			fxdr_nfsv3time(tl, &nvap->na_mtime);
 			nvap->na_vaflags &= ~VA_UTIMES_NULL;
 			break;
 		case NFSV3SATTRTIME_TOSERVER:
 			vfs_timestamp(&nvap->na_mtime);
 			if (!toclient)
 				nvap->na_vaflags |= VA_UTIMES_NULL;
 			break;
 		}
 		break;
 	case ND_NFSV4:
 		error = nfsv4_sattr(nd, vp, nvap, attrbitp, aclp, p);
 	}
 nfsmout:
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Handle the settable attributes for V4.
  * Returns NFSERR_BADXDR if it can't be parsed, 0 otherwise.
  */
 int
 nfsv4_sattr(struct nfsrv_descript *nd, vnode_t vp, struct nfsvattr *nvap,
     nfsattrbit_t *attrbitp, NFSACL_T *aclp, struct thread *p)
 {
 	u_int32_t *tl;
 	int attrsum = 0;
 	int i, j;
 	int error, attrsize, bitpos, aclsize, aceerr, retnotsup = 0;
 	int toclient = 0;
 	u_char *cp, namestr[NFSV4_SMALLSTR + 1];
 	uid_t uid;
 	gid_t gid;
 
 	error = nfsrv_getattrbits(nd, attrbitp, NULL, &retnotsup);
 	if (error)
 		goto nfsmout;
 	NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 	attrsize = fxdr_unsigned(int, *tl);
 
 	/*
 	 * Loop around getting the settable attributes. If an unsupported
 	 * one is found, set nd_repstat to NFSERR_ATTRNOTSUPP and return.
 	 */
 	if (retnotsup) {
 		nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 		bitpos = NFSATTRBIT_MAX;
 	} else {
 		bitpos = 0;
 	}
 	for (; bitpos < NFSATTRBIT_MAX; bitpos++) {
 	    if (attrsum > attrsize) {
 		error = NFSERR_BADXDR;
 		goto nfsmout;
 	    }
 	    if (NFSISSET_ATTRBIT(attrbitp, bitpos))
 		switch (bitpos) {
 		case NFSATTRBIT_SIZE:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_HYPER);
 			if (vp != NULL && vp->v_type != VREG) {
 				error = (vp->v_type == VDIR) ? NFSERR_ISDIR :
 				    NFSERR_INVAL;
 				goto nfsmout;
 			}
 			nvap->na_size = fxdr_hyper(tl);
 			attrsum += NFSX_HYPER;
 			break;
 		case NFSATTRBIT_ACL:
 			error = nfsrv_dissectacl(nd, aclp, &aceerr, &aclsize,
 			    p);
 			if (error)
 				goto nfsmout;
 			if (aceerr && !nd->nd_repstat)
 				nd->nd_repstat = aceerr;
 			attrsum += aclsize;
 			break;
 		case NFSATTRBIT_ARCHIVE:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += NFSX_UNSIGNED;
 			break;
 		case NFSATTRBIT_HIDDEN:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += NFSX_UNSIGNED;
 			break;
 		case NFSATTRBIT_MIMETYPE:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			i = fxdr_unsigned(int, *tl);
 			error = nfsm_advance(nd, NFSM_RNDUP(i), -1);
 			if (error)
 				goto nfsmout;
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(i));
 			break;
 		case NFSATTRBIT_MODE:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			nvap->na_mode = nfstov_mode(*tl);
 			attrsum += NFSX_UNSIGNED;
 			break;
 		case NFSATTRBIT_OWNER:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			j = fxdr_unsigned(int, *tl);
 			if (j < 0) {
 				error = NFSERR_BADXDR;
 				goto nfsmout;
 			}
 			if (j > NFSV4_SMALLSTR)
 				cp = malloc(j + 1, M_NFSSTRING, M_WAITOK);
 			else
 				cp = namestr;
 			error = nfsrv_mtostr(nd, cp, j);
 			if (error) {
 				if (j > NFSV4_SMALLSTR)
 					free(cp, M_NFSSTRING);
 				goto nfsmout;
 			}
 			if (!nd->nd_repstat) {
 				nd->nd_repstat = nfsv4_strtouid(nd, cp, j, &uid,
 				    p);
 				if (!nd->nd_repstat)
 					nvap->na_uid = uid;
 			}
 			if (j > NFSV4_SMALLSTR)
 				free(cp, M_NFSSTRING);
 			attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(j));
 			break;
 		case NFSATTRBIT_OWNERGROUP:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			j = fxdr_unsigned(int, *tl);
 			if (j < 0) {
 				error = NFSERR_BADXDR;
 				goto nfsmout;
 			}
 			if (j > NFSV4_SMALLSTR)
 				cp = malloc(j + 1, M_NFSSTRING, M_WAITOK);
 			else
 				cp = namestr;
 			error = nfsrv_mtostr(nd, cp, j);
 			if (error) {
 				if (j > NFSV4_SMALLSTR)
 					free(cp, M_NFSSTRING);
 				goto nfsmout;
 			}
 			if (!nd->nd_repstat) {
 				nd->nd_repstat = nfsv4_strtogid(nd, cp, j, &gid,
 				    p);
 				if (!nd->nd_repstat)
 					nvap->na_gid = gid;
 			}
 			if (j > NFSV4_SMALLSTR)
 				free(cp, M_NFSSTRING);
 			attrsum += (NFSX_UNSIGNED + NFSM_RNDUP(j));
 			break;
 		case NFSATTRBIT_SYSTEM:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += NFSX_UNSIGNED;
 			break;
 		case NFSATTRBIT_TIMEACCESSSET:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			attrsum += NFSX_UNSIGNED;
 			if (fxdr_unsigned(int, *tl)==NFSV4SATTRTIME_TOCLIENT) {
 			    NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME);
 			    fxdr_nfsv4time(tl, &nvap->na_atime);
 			    toclient = 1;
 			    attrsum += NFSX_V4TIME;
 			} else {
 			    vfs_timestamp(&nvap->na_atime);
 			    nvap->na_vaflags |= VA_UTIMES_NULL;
 			}
 			break;
 		case NFSATTRBIT_TIMEBACKUP:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME);
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += NFSX_V4TIME;
 			break;
 		case NFSATTRBIT_TIMECREATE:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME);
 			if (!nd->nd_repstat)
 				nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			attrsum += NFSX_V4TIME;
 			break;
 		case NFSATTRBIT_TIMEMODIFYSET:
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			attrsum += NFSX_UNSIGNED;
 			if (fxdr_unsigned(int, *tl)==NFSV4SATTRTIME_TOCLIENT) {
 			    NFSM_DISSECT(tl, u_int32_t *, NFSX_V4TIME);
 			    fxdr_nfsv4time(tl, &nvap->na_mtime);
 			    nvap->na_vaflags &= ~VA_UTIMES_NULL;
 			    attrsum += NFSX_V4TIME;
 			} else {
 			    vfs_timestamp(&nvap->na_mtime);
 			    if (!toclient)
 				nvap->na_vaflags |= VA_UTIMES_NULL;
 			}
 			break;
 		default:
 			nd->nd_repstat = NFSERR_ATTRNOTSUPP;
 			/*
 			 * set bitpos so we drop out of the loop.
 			 */
 			bitpos = NFSATTRBIT_MAX;
 			break;
 		}
 	}
 
 	/*
 	 * Some clients pad the attrlist, so we need to skip over the
 	 * padding.
 	 */
 	if (attrsum > attrsize) {
 		error = NFSERR_BADXDR;
 	} else {
 		attrsize = NFSM_RNDUP(attrsize);
 		if (attrsum < attrsize)
 			error = nfsm_advance(nd, attrsize - attrsum, -1);
 	}
 nfsmout:
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Check/setup export credentials.
  */
 int
 nfsd_excred(struct nfsrv_descript *nd, struct nfsexstuff *exp,
     struct ucred *credanon)
 {
 	int error = 0;
 
 	/*
 	 * Check/setup credentials.
 	 */
 	if (nd->nd_flag & ND_GSS)
 		exp->nes_exflag &= ~MNT_EXPORTANON;
 
 	/*
 	 * Check to see if the operation is allowed for this security flavor.
 	 * RFC2623 suggests that the NFSv3 Fsinfo RPC be allowed to
 	 * AUTH_NONE or AUTH_SYS for file systems requiring RPCSEC_GSS.
 	 * Also, allow Secinfo, so that it can acquire the correct flavor(s).
 	 */
 	if (nfsvno_testexp(nd, exp) &&
 	    nd->nd_procnum != NFSV4OP_SECINFO &&
 	    nd->nd_procnum != NFSPROC_FSINFO) {
 		if (nd->nd_flag & ND_NFSV4)
 			error = NFSERR_WRONGSEC;
 		else
 			error = (NFSERR_AUTHERR | AUTH_TOOWEAK);
 		goto out;
 	}
 
 	/*
 	 * Check to see if the file system is exported V4 only.
 	 */
 	if (NFSVNO_EXV4ONLY(exp) && !(nd->nd_flag & ND_NFSV4)) {
 		error = NFSERR_PROGNOTV4;
 		goto out;
 	}
 
 	/*
 	 * Now, map the user credentials.
 	 * (Note that ND_AUTHNONE will only be set for an NFSv3
 	 *  Fsinfo RPC. If set for anything else, this code might need
 	 *  to change.)
 	 */
 	if (NFSVNO_EXPORTED(exp) &&
 	    ((!(nd->nd_flag & ND_GSS) && nd->nd_cred->cr_uid == 0) ||
 	     NFSVNO_EXPORTANON(exp) ||
 	     (nd->nd_flag & ND_AUTHNONE))) {
 		nd->nd_cred->cr_uid = credanon->cr_uid;
 		nd->nd_cred->cr_gid = credanon->cr_gid;
 		crsetgroups(nd->nd_cred, credanon->cr_ngroups,
 		    credanon->cr_groups);
 	}
 
 out:
 	NFSEXITCODE2(error, nd);
 	return (error);
 }
 
 /*
  * Check exports.
  */
 int
 nfsvno_checkexp(struct mount *mp, struct sockaddr *nam, struct nfsexstuff *exp,
     struct ucred **credp)
 {
 	int i, error, *secflavors;
 
 	error = VFS_CHECKEXP(mp, nam, &exp->nes_exflag, credp,
 	    &exp->nes_numsecflavor, &secflavors);
 	if (error) {
 		if (nfs_rootfhset) {
 			exp->nes_exflag = 0;
 			exp->nes_numsecflavor = 0;
 			error = 0;
 		}
 	} else {
 		/* Copy the security flavors. */
 		for (i = 0; i < exp->nes_numsecflavor; i++)
 			exp->nes_secflavors[i] = secflavors[i];
 	}
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Get a vnode for a file handle and export stuff.
  */
 int
 nfsvno_fhtovp(struct mount *mp, fhandle_t *fhp, struct sockaddr *nam,
     int lktype, struct vnode **vpp, struct nfsexstuff *exp,
     struct ucred **credp)
 {
 	int i, error, *secflavors;
 
 	*credp = NULL;
 	exp->nes_numsecflavor = 0;
 	error = VFS_FHTOVP(mp, &fhp->fh_fid, lktype, vpp);
 	if (error != 0)
 		/* Make sure the server replies ESTALE to the client. */
 		error = ESTALE;
 	if (nam && !error) {
 		error = VFS_CHECKEXP(mp, nam, &exp->nes_exflag, credp,
 		    &exp->nes_numsecflavor, &secflavors);
 		if (error) {
 			if (nfs_rootfhset) {
 				exp->nes_exflag = 0;
 				exp->nes_numsecflavor = 0;
 				error = 0;
 			} else {
 				vput(*vpp);
 			}
 		} else {
 			/* Copy the security flavors. */
 			for (i = 0; i < exp->nes_numsecflavor; i++)
 				exp->nes_secflavors[i] = secflavors[i];
 		}
 	}
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * nfsd_fhtovp() - convert a fh to a vnode ptr
  * 	- look up fsid in mount list (if not found ret error)
  *	- get vp and export rights by calling nfsvno_fhtovp()
  *	- if cred->cr_uid == 0 or MNT_EXPORTANON set it to credanon
  *	  for AUTH_SYS
  *	- if mpp != NULL, return the mount point so that it can
  *	  be used for vn_finished_write() by the caller
  */
 void
 nfsd_fhtovp(struct nfsrv_descript *nd, struct nfsrvfh *nfp, int lktype,
     struct vnode **vpp, struct nfsexstuff *exp,
     struct mount **mpp, int startwrite, struct thread *p)
 {
 	struct mount *mp;
 	struct ucred *credanon;
 	fhandle_t *fhp;
 
 	fhp = (fhandle_t *)nfp->nfsrvfh_data;
 	/*
 	 * Check for the special case of the nfsv4root_fh.
 	 */
 	mp = vfs_busyfs(&fhp->fh_fsid);
 	if (mpp != NULL)
 		*mpp = mp;
 	if (mp == NULL) {
 		*vpp = NULL;
 		nd->nd_repstat = ESTALE;
 		goto out;
 	}
 
 	if (startwrite) {
 		vn_start_write(NULL, mpp, V_WAIT);
 		if (lktype == LK_SHARED && !(MNT_SHARED_WRITES(mp)))
 			lktype = LK_EXCLUSIVE;
 	}
 	nd->nd_repstat = nfsvno_fhtovp(mp, fhp, nd->nd_nam, lktype, vpp, exp,
 	    &credanon);
 	vfs_unbusy(mp);
 
 	/*
 	 * For NFSv4 without a pseudo root fs, unexported file handles
 	 * can be returned, so that Lookup works everywhere.
 	 */
 	if (!nd->nd_repstat && exp->nes_exflag == 0 &&
 	    !(nd->nd_flag & ND_NFSV4)) {
 		vput(*vpp);
 		nd->nd_repstat = EACCES;
 	}
 
 	/*
 	 * Personally, I've never seen any point in requiring a
 	 * reserved port#, since only in the rare case where the
 	 * clients are all boxes with secure system privileges,
 	 * does it provide any enhanced security, but... some people
 	 * believe it to be useful and keep putting this code back in.
 	 * (There is also some "security checker" out there that
 	 *  complains if the nfs server doesn't enforce this.)
 	 * However, note the following:
 	 * RFC3530 (NFSv4) specifies that a reserved port# not be
 	 *	required.
 	 * RFC2623 recommends that, if a reserved port# is checked for,
 	 *	there be a way to turn that off --> ifdef'd.
 	 */
 #ifdef NFS_REQRSVPORT
 	if (!nd->nd_repstat) {
 		struct sockaddr_in *saddr;
 		struct sockaddr_in6 *saddr6;
 
 		saddr = NFSSOCKADDR(nd->nd_nam, struct sockaddr_in *);
 		saddr6 = NFSSOCKADDR(nd->nd_nam, struct sockaddr_in6 *);
 		if (!(nd->nd_flag & ND_NFSV4) &&
 		    ((saddr->sin_family == AF_INET &&
 		      ntohs(saddr->sin_port) >= IPPORT_RESERVED) ||
 		     (saddr6->sin6_family == AF_INET6 &&
 		      ntohs(saddr6->sin6_port) >= IPPORT_RESERVED))) {
 			vput(*vpp);
 			nd->nd_repstat = (NFSERR_AUTHERR | AUTH_TOOWEAK);
 		}
 	}
 #endif	/* NFS_REQRSVPORT */
 
 	/*
 	 * Check/setup credentials.
 	 */
 	if (!nd->nd_repstat) {
 		nd->nd_saveduid = nd->nd_cred->cr_uid;
 		nd->nd_repstat = nfsd_excred(nd, exp, credanon);
 		if (nd->nd_repstat)
 			vput(*vpp);
 	}
 	if (credanon != NULL)
 		crfree(credanon);
 	if (nd->nd_repstat) {
 		if (startwrite)
 			vn_finished_write(mp);
 		*vpp = NULL;
 		if (mpp != NULL)
 			*mpp = NULL;
 	}
 
 out:
 	NFSEXITCODE2(0, nd);
 }
 
 /*
  * glue for fp.
  */
 static int
 fp_getfvp(struct thread *p, int fd, struct file **fpp, struct vnode **vpp)
 {
 	struct filedesc *fdp;
 	struct file *fp;
 	int error = 0;
 
 	fdp = p->td_proc->p_fd;
 	if (fd < 0 || fd >= fdp->fd_nfiles ||
 	    (fp = fdp->fd_ofiles[fd].fde_file) == NULL) {
 		error = EBADF;
 		goto out;
 	}
 	*fpp = fp;
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Called from nfssvc() to update the exports list. Just call
  * vfs_export(). This has to be done, since the v4 root fake fs isn't
  * in the mount list.
  */
 int
 nfsrv_v4rootexport(void *argp, struct ucred *cred, struct thread *p)
 {
 	struct nfsex_args *nfsexargp = (struct nfsex_args *)argp;
 	int error = 0;
 	struct nameidata nd;
 	fhandle_t fh;
 
 	error = vfs_export(&nfsv4root_mnt, &nfsexargp->export);
 	if ((nfsexargp->export.ex_flags & MNT_DELEXPORT) != 0)
 		nfs_rootfhset = 0;
 	else if (error == 0) {
 		if (nfsexargp->fspec == NULL) {
 			error = EPERM;
 			goto out;
 		}
 		/*
 		 * If fspec != NULL, this is the v4root path.
 		 */
 		NDINIT(&nd, LOOKUP, FOLLOW, UIO_USERSPACE,
 		    nfsexargp->fspec, p);
 		if ((error = namei(&nd)) != 0)
 			goto out;
 		error = nfsvno_getfh(nd.ni_vp, &fh, p);
 		vrele(nd.ni_vp);
 		if (!error) {
 			nfs_rootfh.nfsrvfh_len = NFSX_MYFH;
 			NFSBCOPY((caddr_t)&fh,
 			    nfs_rootfh.nfsrvfh_data,
 			    sizeof (fhandle_t));
 			nfs_rootfhset = 1;
 		}
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * This function needs to test to see if the system is near its limit
  * for memory allocation via malloc() or mget() and return True iff
  * either of these resources is near its limit.
  * XXX (For now, this is just a stub.)
  */
 int nfsrv_testmalloclimit = 0;
 int
 nfsrv_mallocmget_limit(void)
 {
 	static int printmesg = 0;
 	static int testval = 1;
 
 	if (nfsrv_testmalloclimit && (testval++ % 1000) == 0) {
 		if ((printmesg++ % 100) == 0)
 			printf("nfsd: malloc/mget near limit\n");
 		return (1);
 	}
 	return (0);
 }
 
 /*
  * BSD specific initialization of a mount point.
  */
 void
 nfsd_mntinit(void)
 {
 	static int inited = 0;
 
 	if (inited)
 		return;
 	inited = 1;
 	nfsv4root_mnt.mnt_flag = (MNT_RDONLY | MNT_EXPORTED);
 	TAILQ_INIT(&nfsv4root_mnt.mnt_nvnodelist);
 	TAILQ_INIT(&nfsv4root_mnt.mnt_activevnodelist);
 	nfsv4root_mnt.mnt_export = NULL;
 	TAILQ_INIT(&nfsv4root_opt);
 	TAILQ_INIT(&nfsv4root_newopt);
 	nfsv4root_mnt.mnt_opt = &nfsv4root_opt;
 	nfsv4root_mnt.mnt_optnew = &nfsv4root_newopt;
 	nfsv4root_mnt.mnt_nvnodelistsize = 0;
 	nfsv4root_mnt.mnt_activevnodelistsize = 0;
 }
 
 /*
  * Get a vnode for a file handle, without checking exports, etc.
  */
 struct vnode *
 nfsvno_getvp(fhandle_t *fhp)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	int error;
 
 	mp = vfs_busyfs(&fhp->fh_fsid);
 	if (mp == NULL)
 		return (NULL);
 	error = VFS_FHTOVP(mp, &fhp->fh_fid, LK_EXCLUSIVE, &vp);
 	vfs_unbusy(mp);
 	if (error)
 		return (NULL);
 	return (vp);
 }
 
 /*
  * Do a local VOP_ADVLOCK().
  */
 int
 nfsvno_advlock(struct vnode *vp, int ftype, u_int64_t first,
     u_int64_t end, struct thread *td)
 {
 	int error = 0;
 	struct flock fl;
 	u_int64_t tlen;
 
 	if (nfsrv_dolocallocks == 0)
 		goto out;
 	ASSERT_VOP_UNLOCKED(vp, "nfsvno_advlock: vp locked");
 
 	fl.l_whence = SEEK_SET;
 	fl.l_type = ftype;
 	fl.l_start = (off_t)first;
 	if (end == NFS64BITSSET) {
 		fl.l_len = 0;
 	} else {
 		tlen = end - first;
 		fl.l_len = (off_t)tlen;
 	}
 	/*
 	 * For FreeBSD8, the l_pid and l_sysid must be set to the same
 	 * values for all calls, so that all locks will be held by the
 	 * nfsd server. (The nfsd server handles conflicts between the
 	 * various clients.)
 	 * An NFSv4 lockowner is a ClientID plus an array of up to 1024
 	 * bytes, so it can't be put in l_sysid.
 	 */
 	if (nfsv4_sysid == 0)
 		nfsv4_sysid = nlm_acquire_next_sysid();
 	fl.l_pid = (pid_t)0;
 	fl.l_sysid = (int)nfsv4_sysid;
 
 	if (ftype == F_UNLCK)
 		error = VOP_ADVLOCK(vp, (caddr_t)td->td_proc, F_UNLCK, &fl,
 		    (F_POSIX | F_REMOTE));
 	else
 		error = VOP_ADVLOCK(vp, (caddr_t)td->td_proc, F_SETLK, &fl,
 		    (F_POSIX | F_REMOTE));
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Check the nfsv4 root exports.
  */
 int
 nfsvno_v4rootexport(struct nfsrv_descript *nd)
 {
 	struct ucred *credanon;
 	int exflags, error = 0, numsecflavor, *secflavors, i;
 
 	error = vfs_stdcheckexp(&nfsv4root_mnt, nd->nd_nam, &exflags,
 	    &credanon, &numsecflavor, &secflavors);
 	if (error) {
 		error = NFSERR_PROGUNAVAIL;
 		goto out;
 	}
 	if (credanon != NULL)
 		crfree(credanon);
 	for (i = 0; i < numsecflavor; i++) {
 		if (secflavors[i] == AUTH_SYS)
 			nd->nd_flag |= ND_EXAUTHSYS;
 		else if (secflavors[i] == RPCSEC_GSS_KRB5)
 			nd->nd_flag |= ND_EXGSS;
 		else if (secflavors[i] == RPCSEC_GSS_KRB5I)
 			nd->nd_flag |= ND_EXGSSINTEGRITY;
 		else if (secflavors[i] == RPCSEC_GSS_KRB5P)
 			nd->nd_flag |= ND_EXGSSPRIVACY;
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Nfs server pseudo system call for the nfsd's
  */
 /*
  * MPSAFE
  */
 static int
 nfssvc_nfsd(struct thread *td, struct nfssvc_args *uap)
 {
 	struct file *fp;
 	struct nfsd_addsock_args sockarg;
 	struct nfsd_nfsd_args nfsdarg;
 	cap_rights_t rights;
 	int error;
 
 	if (uap->flag & NFSSVC_NFSDADDSOCK) {
 		error = copyin(uap->argp, (caddr_t)&sockarg, sizeof (sockarg));
 		if (error)
 			goto out;
 		/*
 		 * Since we don't know what rights might be required,
 		 * pretend that we need them all. It is better to be too
 		 * careful than too reckless.
 		 */
 		error = fget(td, sockarg.sock,
 		    cap_rights_init(&rights, CAP_SOCK_SERVER), &fp);
 		if (error != 0)
 			goto out;
 		if (fp->f_type != DTYPE_SOCKET) {
 			fdrop(fp, td);
 			error = EPERM;
 			goto out;
 		}
 		error = nfsrvd_addsock(fp);
 		fdrop(fp, td);
 	} else if (uap->flag & NFSSVC_NFSDNFSD) {
 		if (uap->argp == NULL) {
 			error = EINVAL;
 			goto out;
 		}
 		error = copyin(uap->argp, (caddr_t)&nfsdarg,
 		    sizeof (nfsdarg));
 		if (error)
 			goto out;
 		error = nfsrvd_nfsd(td, &nfsdarg);
 	} else {
 		error = nfssvc_srvcall(td, uap, td->td_ucred);
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 
 static int
 nfssvc_srvcall(struct thread *p, struct nfssvc_args *uap, struct ucred *cred)
 {
 	struct nfsex_args export;
 	struct file *fp = NULL;
 	int stablefd, len;
 	struct nfsd_clid adminrevoke;
 	struct nfsd_dumplist dumplist;
 	struct nfsd_dumpclients *dumpclients;
 	struct nfsd_dumplocklist dumplocklist;
 	struct nfsd_dumplocks *dumplocks;
 	struct nameidata nd;
 	vnode_t vp;
 	int error = EINVAL, igotlock;
 	struct proc *procp;
 	static int suspend_nfsd = 0;
 
 	if (uap->flag & NFSSVC_PUBLICFH) {
 		NFSBZERO((caddr_t)&nfs_pubfh.nfsrvfh_data,
 		    sizeof (fhandle_t));
 		error = copyin(uap->argp,
 		    &nfs_pubfh.nfsrvfh_data, sizeof (fhandle_t));
 		if (!error)
 			nfs_pubfhset = 1;
 	} else if (uap->flag & NFSSVC_V4ROOTEXPORT) {
 		error = copyin(uap->argp,(caddr_t)&export,
 		    sizeof (struct nfsex_args));
 		if (!error)
 			error = nfsrv_v4rootexport(&export, cred, p);
 	} else if (uap->flag & NFSSVC_NOPUBLICFH) {
 		nfs_pubfhset = 0;
 		error = 0;
 	} else if (uap->flag & NFSSVC_STABLERESTART) {
 		error = copyin(uap->argp, (caddr_t)&stablefd,
 		    sizeof (int));
 		if (!error)
 			error = fp_getfvp(p, stablefd, &fp, &vp);
 		if (!error && (NFSFPFLAG(fp) & (FREAD | FWRITE)) != (FREAD | FWRITE))
 			error = EBADF;
 		if (!error && newnfs_numnfsd != 0)
 			error = EPERM;
 		if (!error) {
 			nfsrv_stablefirst.nsf_fp = fp;
 			nfsrv_setupstable(p);
 		}
 	} else if (uap->flag & NFSSVC_ADMINREVOKE) {
 		error = copyin(uap->argp, (caddr_t)&adminrevoke,
 		    sizeof (struct nfsd_clid));
 		if (!error)
 			error = nfsrv_adminrevoke(&adminrevoke, p);
 	} else if (uap->flag & NFSSVC_DUMPCLIENTS) {
 		error = copyin(uap->argp, (caddr_t)&dumplist,
 		    sizeof (struct nfsd_dumplist));
 		if (!error && (dumplist.ndl_size < 1 ||
 			dumplist.ndl_size > NFSRV_MAXDUMPLIST))
 			error = EPERM;
 		if (!error) {
 		    len = sizeof (struct nfsd_dumpclients) * dumplist.ndl_size;
 		    dumpclients = (struct nfsd_dumpclients *)malloc(len,
 			M_TEMP, M_WAITOK);
 		    nfsrv_dumpclients(dumpclients, dumplist.ndl_size);
 		    error = copyout(dumpclients,
 			CAST_USER_ADDR_T(dumplist.ndl_list), len);
 		    free((caddr_t)dumpclients, M_TEMP);
 		}
 	} else if (uap->flag & NFSSVC_DUMPLOCKS) {
 		error = copyin(uap->argp, (caddr_t)&dumplocklist,
 		    sizeof (struct nfsd_dumplocklist));
 		if (!error && (dumplocklist.ndllck_size < 1 ||
 			dumplocklist.ndllck_size > NFSRV_MAXDUMPLIST))
 			error = EPERM;
 		if (!error)
 			error = nfsrv_lookupfilename(&nd,
 				dumplocklist.ndllck_fname, p);
 		if (!error) {
 			len = sizeof (struct nfsd_dumplocks) *
 				dumplocklist.ndllck_size;
 			dumplocks = (struct nfsd_dumplocks *)malloc(len,
 				M_TEMP, M_WAITOK);
 			nfsrv_dumplocks(nd.ni_vp, dumplocks,
 			    dumplocklist.ndllck_size, p);
 			vput(nd.ni_vp);
 			error = copyout(dumplocks,
 			    CAST_USER_ADDR_T(dumplocklist.ndllck_list), len);
 			free((caddr_t)dumplocks, M_TEMP);
 		}
 	} else if (uap->flag & NFSSVC_BACKUPSTABLE) {
 		procp = p->td_proc;
 		PROC_LOCK(procp);
 		nfsd_master_pid = procp->p_pid;
 		bcopy(procp->p_comm, nfsd_master_comm, MAXCOMLEN + 1);
 		nfsd_master_start = procp->p_stats->p_start;
 		nfsd_master_proc = procp;
 		PROC_UNLOCK(procp);
 	} else if ((uap->flag & NFSSVC_SUSPENDNFSD) != 0) {
 		NFSLOCKV4ROOTMUTEX();
 		if (suspend_nfsd == 0) {
 			/* Lock out all nfsd threads */
 			do {
 				igotlock = nfsv4_lock(&nfsd_suspend_lock, 1,
 				    NULL, NFSV4ROOTLOCKMUTEXPTR, NULL);
 			} while (igotlock == 0 && suspend_nfsd == 0);
 			suspend_nfsd = 1;
 		}
 		NFSUNLOCKV4ROOTMUTEX();
 		error = 0;
 	} else if ((uap->flag & NFSSVC_RESUMENFSD) != 0) {
 		NFSLOCKV4ROOTMUTEX();
 		if (suspend_nfsd != 0) {
 			nfsv4_unlock(&nfsd_suspend_lock, 0);
 			suspend_nfsd = 0;
 		}
 		NFSUNLOCKV4ROOTMUTEX();
 		error = 0;
 	}
 
 	NFSEXITCODE(error);
 	return (error);
 }
 
 /*
  * Check exports.
  * Returns 0 if ok, 1 otherwise.
  */
 int
 nfsvno_testexp(struct nfsrv_descript *nd, struct nfsexstuff *exp)
 {
 	int i;
 
 	/*
 	 * This seems odd, but allow the case where the security flavor
 	 * list is empty. This happens when NFSv4 is traversing non-exported
 	 * file systems. Exported file systems should always have a non-empty
 	 * security flavor list.
 	 */
 	if (exp->nes_numsecflavor == 0)
 		return (0);
 
 	for (i = 0; i < exp->nes_numsecflavor; i++) {
 		/*
 		 * The tests for privacy and integrity must be first,
 		 * since ND_GSS is set for everything but AUTH_SYS.
 		 */
 		if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5P &&
 		    (nd->nd_flag & ND_GSSPRIVACY))
 			return (0);
 		if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5I &&
 		    (nd->nd_flag & ND_GSSINTEGRITY))
 			return (0);
 		if (exp->nes_secflavors[i] == RPCSEC_GSS_KRB5 &&
 		    (nd->nd_flag & ND_GSS))
 			return (0);
 		if (exp->nes_secflavors[i] == AUTH_SYS &&
 		    (nd->nd_flag & ND_GSS) == 0)
 			return (0);
 	}
 	return (1);
 }
 
 /*
  * Calculate a hash value for the fid in a file handle.
  */
 uint32_t
 nfsrv_hashfh(fhandle_t *fhp)
 {
 	uint32_t hashval;
 
 	hashval = hash32_buf(&fhp->fh_fid, sizeof(struct fid), 0);
 	return (hashval);
 }
 
 /*
  * Calculate a hash value for the sessionid.
  */
 uint32_t
 nfsrv_hashsessionid(uint8_t *sessionid)
 {
 	uint32_t hashval;
 
 	hashval = hash32_buf(sessionid, NFSX_V4SESSIONID, 0);
 	return (hashval);
 }
 
 /*
  * Signal the userland master nfsd to backup the stable restart file.
  */
 void
 nfsrv_backupstable(void)
 {
 	struct proc *procp;
 
 	if (nfsd_master_proc != NULL) {
 		procp = pfind(nfsd_master_pid);
 		/* Try to make sure it is the correct process. */
 		if (procp == nfsd_master_proc &&
 		    procp->p_stats->p_start.tv_sec ==
 		    nfsd_master_start.tv_sec &&
 		    procp->p_stats->p_start.tv_usec ==
 		    nfsd_master_start.tv_usec &&
 		    strcmp(procp->p_comm, nfsd_master_comm) == 0)
 			kern_psignal(procp, SIGUSR2);
 		else
 			nfsd_master_proc = NULL;
 
 		if (procp != NULL)
 			PROC_UNLOCK(procp);
 	}
 }
 
 extern int (*nfsd_call_nfsd)(struct thread *, struct nfssvc_args *);
 
 /*
  * Called once to initialize data structures...
  */
 static int
 nfsd_modevent(module_t mod, int type, void *data)
 {
 	int error = 0, i;
 	static int loaded = 0;
 
 	switch (type) {
 	case MOD_LOAD:
 		if (loaded)
 			goto out;
 		newnfs_portinit();
 		for (i = 0; i < NFSRVCACHE_HASHSIZE; i++) {
 			mtx_init(&nfsrchash_table[i].mtx, "nfsrtc", NULL,
 			    MTX_DEF);
 			mtx_init(&nfsrcahash_table[i].mtx, "nfsrtca", NULL,
 			    MTX_DEF);
 		}
 		mtx_init(&nfsrc_udpmtx, "nfsuc", NULL, MTX_DEF);
 		mtx_init(&nfs_v4root_mutex, "nfs4rt", NULL, MTX_DEF);
 		mtx_init(&nfsv4root_mnt.mnt_mtx, "nfs4mnt", NULL, MTX_DEF);
 		for (i = 0; i < NFSSESSIONHASHSIZE; i++)
 			mtx_init(&nfssessionhash[i].mtx, "nfssm",
 			    NULL, MTX_DEF);
 		lockinit(&nfsv4root_mnt.mnt_explock, PVFS, "explock", 0, 0);
 		nfsrvd_initcache();
 		nfsd_init();
 		NFSD_LOCK();
 		nfsrvd_init(0);
 		NFSD_UNLOCK();
 		nfsd_mntinit();
 #ifdef VV_DISABLEDELEG
 		vn_deleg_ops.vndeleg_recall = nfsd_recalldelegation;
 		vn_deleg_ops.vndeleg_disable = nfsd_disabledelegation;
 #endif
 		nfsd_call_servertimer = nfsrv_servertimer;
 		nfsd_call_nfsd = nfssvc_nfsd;
 		loaded = 1;
 		break;
 
 	case MOD_UNLOAD:
 		if (newnfs_numnfsd != 0) {
 			error = EBUSY;
 			break;
 		}
 
 #ifdef VV_DISABLEDELEG
 		vn_deleg_ops.vndeleg_recall = NULL;
 		vn_deleg_ops.vndeleg_disable = NULL;
 #endif
 		nfsd_call_servertimer = NULL;
 		nfsd_call_nfsd = NULL;
 
 		/* Clean out all NFSv4 state. */
 		nfsrv_throwawayallstate(curthread);
 
 		/* Clean the NFS server reply cache */
 		nfsrvd_cleancache();
 
 		/* Free up the krpc server pool. */
 		if (nfsrvd_pool != NULL)
 			svcpool_destroy(nfsrvd_pool);
 
 		/* and get rid of the locks */
 		for (i = 0; i < NFSRVCACHE_HASHSIZE; i++) {
 			mtx_destroy(&nfsrchash_table[i].mtx);
 			mtx_destroy(&nfsrcahash_table[i].mtx);
 		}
 		mtx_destroy(&nfsrc_udpmtx);
 		mtx_destroy(&nfs_v4root_mutex);
 		mtx_destroy(&nfsv4root_mnt.mnt_mtx);
 		for (i = 0; i < NFSSESSIONHASHSIZE; i++)
 			mtx_destroy(&nfssessionhash[i].mtx);
 		lockdestroy(&nfsv4root_mnt.mnt_explock);
 		loaded = 0;
 		break;
 	default:
 		error = EOPNOTSUPP;
 		break;
 	}
 
 out:
 	NFSEXITCODE(error);
 	return (error);
 }
 static moduledata_t nfsd_mod = {
 	"nfsd",
 	nfsd_modevent,
 	NULL,
 };
 DECLARE_MODULE(nfsd, nfsd_mod, SI_SUB_VFS, SI_ORDER_ANY);
 
 /* So that loader and kldload(2) can find us, wherever we are.. */
 MODULE_VERSION(nfsd, 1);
 MODULE_DEPEND(nfsd, nfscommon, 1, 1, 1);
 MODULE_DEPEND(nfsd, nfslock, 1, 1, 1);
 MODULE_DEPEND(nfsd, nfslockd, 1, 1, 1);
 MODULE_DEPEND(nfsd, krpc, 1, 1, 1);
 MODULE_DEPEND(nfsd, nfssvc, 1, 1, 1);
 
Index: user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/fs/nullfs/null_vfsops.c	(revision 281585)
@@ -1,456 +1,456 @@
 /*-
  * Copyright (c) 1992, 1993, 1995
  *	The Regents of the University of California.  All rights reserved.
  *
  * This code is derived from software donated to Berkeley by
  * Jan-Simon Pendry.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)null_vfsops.c	8.2 (Berkeley) 1/21/94
  *
  * @(#)lofs_vfsops.c	1.2 (Berkeley) 6/18/92
  * $FreeBSD$
  */
 
 /*
  * Null Layer
  * (See null_vnops.c for a description of what this does.)
  */
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/fcntl.h>
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/namei.h>
 #include <sys/proc.h>
 #include <sys/vnode.h>
 #include <sys/jail.h>
 
 #include <fs/nullfs/null.h>
 
 static MALLOC_DEFINE(M_NULLFSMNT, "nullfs_mount", "NULLFS mount structure");
 
 static vfs_fhtovp_t	nullfs_fhtovp;
 static vfs_mount_t	nullfs_mount;
 static vfs_quotactl_t	nullfs_quotactl;
 static vfs_root_t	nullfs_root;
 static vfs_sync_t	nullfs_sync;
 static vfs_statfs_t	nullfs_statfs;
 static vfs_unmount_t	nullfs_unmount;
 static vfs_vget_t	nullfs_vget;
 static vfs_extattrctl_t	nullfs_extattrctl;
 
 /*
  * Mount null layer
  */
 static int
 nullfs_mount(struct mount *mp)
 {
 	int error = 0;
 	struct vnode *lowerrootvp, *vp;
 	struct vnode *nullm_rootvp;
 	struct null_mount *xmp;
 	struct thread *td = curthread;
 	char *target;
 	int isvnunlocked = 0, len;
 	struct nameidata nd, *ndp = &nd;
 
 	NULLFSDEBUG("nullfs_mount(mp = %p)\n", (void *)mp);
 
 	if (!prison_allow(td->td_ucred, PR_ALLOW_MOUNT_NULLFS))
 		return (EPERM);
 	if (mp->mnt_flag & MNT_ROOTFS)
 		return (EOPNOTSUPP);
 
 	/*
 	 * Update is a no-op
 	 */
 	if (mp->mnt_flag & MNT_UPDATE) {
 		/*
 		 * Only support update mounts for NFS export.
 		 */
 		if (vfs_flagopt(mp->mnt_optnew, "export", NULL, 0))
 			return (0);
 		else
 			return (EOPNOTSUPP);
 	}
 
 	/*
 	 * Get argument
 	 */
 	error = vfs_getopt(mp->mnt_optnew, "target", (void **)&target, &len);
 	if (error || target[len - 1] != '\0')
 		return (EINVAL);
 
 	/*
 	 * Unlock lower node to avoid possible deadlock.
 	 */
 	if ((mp->mnt_vnodecovered->v_op == &null_vnodeops) &&
 	    VOP_ISLOCKED(mp->mnt_vnodecovered) == LK_EXCLUSIVE) {
 		VOP_UNLOCK(mp->mnt_vnodecovered, 0);
 		isvnunlocked = 1;
 	}
 	/*
 	 * Find lower node
 	 */
 	NDINIT(ndp, LOOKUP, FOLLOW|LOCKLEAF, UIO_SYSSPACE, target, curthread);
 	error = namei(ndp);
 
 	/*
 	 * Re-lock vnode.
 	 * XXXKIB This is deadlock-prone as well.
 	 */
 	if (isvnunlocked)
 		vn_lock(mp->mnt_vnodecovered, LK_EXCLUSIVE | LK_RETRY);
 
 	if (error)
 		return (error);
 	NDFREE(ndp, NDF_ONLY_PNBUF);
 
 	/*
 	 * Sanity check on lower vnode
 	 */
 	lowerrootvp = ndp->ni_vp;
 
 	/*
 	 * Check multi null mount to avoid `lock against myself' panic.
 	 */
 	if (lowerrootvp == VTONULL(mp->mnt_vnodecovered)->null_lowervp) {
 		NULLFSDEBUG("nullfs_mount: multi null mount?\n");
 		vput(lowerrootvp);
 		return (EDEADLK);
 	}
 
 	xmp = (struct null_mount *) malloc(sizeof(struct null_mount),
 	    M_NULLFSMNT, M_WAITOK | M_ZERO);
 
 	/*
 	 * Save reference to underlying FS
 	 */
 	xmp->nullm_vfs = lowerrootvp->v_mount;
 
 	/*
 	 * Save reference.  Each mount also holds
 	 * a reference on the root vnode.
 	 */
 	error = null_nodeget(mp, lowerrootvp, &vp);
 	/*
 	 * Make sure the node alias worked
 	 */
 	if (error) {
 		free(xmp, M_NULLFSMNT);
 		return (error);
 	}
 
 	/*
 	 * Keep a held reference to the root vnode.
 	 * It is vrele'd in nullfs_unmount.
 	 */
 	nullm_rootvp = vp;
 	nullm_rootvp->v_vflag |= VV_ROOT;
 	xmp->nullm_rootvp = nullm_rootvp;
 
 	/*
 	 * Unlock the node (either the lower or the alias)
 	 */
 	VOP_UNLOCK(vp, 0);
 
 	if (NULLVPTOLOWERVP(nullm_rootvp)->v_mount->mnt_flag & MNT_LOCAL) {
 		MNT_ILOCK(mp);
 		mp->mnt_flag |= MNT_LOCAL;
 		MNT_IUNLOCK(mp);
 	}
 
 	xmp->nullm_flags |= NULLM_CACHE;
 	if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0)
 		xmp->nullm_flags &= ~NULLM_CACHE;
 
 	MNT_ILOCK(mp);
 	if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
 		mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
 		    (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED |
 		    MNTK_EXTENDED_SHARED);
 	}
 	mp->mnt_kern_flag |= MNTK_LOOKUP_EXCL_DOTDOT;
 	mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
-	    MNTK_SUSPENDABLE;
+	    (MNTK_SUSPENDABLE | MNTK_USES_BCACHE);
 	MNT_IUNLOCK(mp);
 	mp->mnt_data = xmp;
 	vfs_getnewfsid(mp);
 	if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
 		MNT_ILOCK(xmp->nullm_vfs);
 		TAILQ_INSERT_TAIL(&xmp->nullm_vfs->mnt_uppers, mp,
 		    mnt_upper_link);
 		MNT_IUNLOCK(xmp->nullm_vfs);
 	}
 
 	vfs_mountedfrom(mp, target);
 
 	NULLFSDEBUG("nullfs_mount: lower %s, alias at %s\n",
 		mp->mnt_stat.f_mntfromname, mp->mnt_stat.f_mntonname);
 	return (0);
 }
 
 /*
  * Free reference to null layer
  */
 static int
 nullfs_unmount(mp, mntflags)
 	struct mount *mp;
 	int mntflags;
 {
 	struct null_mount *mntdata;
 	struct mount *ump;
 	int error, flags;
 
 	NULLFSDEBUG("nullfs_unmount: mp = %p\n", (void *)mp);
 
 	if (mntflags & MNT_FORCE)
 		flags = FORCECLOSE;
 	else
 		flags = 0;
 
 	/* There is 1 extra root vnode reference (nullm_rootvp). */
 	error = vflush(mp, 1, flags, curthread);
 	if (error)
 		return (error);
 
 	/*
 	 * Finally, throw away the null_mount structure
 	 */
 	mntdata = mp->mnt_data;
 	ump = mntdata->nullm_vfs;
 	if ((mntdata->nullm_flags & NULLM_CACHE) != 0) {
 		MNT_ILOCK(ump);
 		while ((ump->mnt_kern_flag & MNTK_VGONE_UPPER) != 0) {
 			ump->mnt_kern_flag |= MNTK_VGONE_WAITER;
 			msleep(&ump->mnt_uppers, &ump->mnt_mtx, 0, "vgnupw", 0);
 		}
 		TAILQ_REMOVE(&ump->mnt_uppers, mp, mnt_upper_link);
 		MNT_IUNLOCK(ump);
 	}
 	mp->mnt_data = NULL;
 	free(mntdata, M_NULLFSMNT);
 	return (0);
 }
 
 static int
 nullfs_root(mp, flags, vpp)
 	struct mount *mp;
 	int flags;
 	struct vnode **vpp;
 {
 	struct vnode *vp;
 
 	NULLFSDEBUG("nullfs_root(mp = %p, vp = %p->%p)\n", (void *)mp,
 	    (void *)MOUNTTONULLMOUNT(mp)->nullm_rootvp,
 	    (void *)NULLVPTOLOWERVP(MOUNTTONULLMOUNT(mp)->nullm_rootvp));
 
 	/*
 	 * Return locked reference to root.
 	 */
 	vp = MOUNTTONULLMOUNT(mp)->nullm_rootvp;
 	VREF(vp);
 
 	ASSERT_VOP_UNLOCKED(vp, "root vnode is locked");
 	vn_lock(vp, flags | LK_RETRY);
 	*vpp = vp;
 	return 0;
 }
 
 static int
 nullfs_quotactl(mp, cmd, uid, arg)
 	struct mount *mp;
 	int cmd;
 	uid_t uid;
 	void *arg;
 {
 	return VFS_QUOTACTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd, uid, arg);
 }
 
 static int
 nullfs_statfs(mp, sbp)
 	struct mount *mp;
 	struct statfs *sbp;
 {
 	int error;
 	struct statfs mstat;
 
 	NULLFSDEBUG("nullfs_statfs(mp = %p, vp = %p->%p)\n", (void *)mp,
 	    (void *)MOUNTTONULLMOUNT(mp)->nullm_rootvp,
 	    (void *)NULLVPTOLOWERVP(MOUNTTONULLMOUNT(mp)->nullm_rootvp));
 
 	bzero(&mstat, sizeof(mstat));
 
 	error = VFS_STATFS(MOUNTTONULLMOUNT(mp)->nullm_vfs, &mstat);
 	if (error)
 		return (error);
 
 	/* now copy across the "interesting" information and fake the rest */
 	sbp->f_type = mstat.f_type;
 	sbp->f_flags = (sbp->f_flags & (MNT_RDONLY | MNT_NOEXEC | MNT_NOSUID |
 	    MNT_UNION | MNT_NOSYMFOLLOW)) | (mstat.f_flags & ~MNT_ROOTFS);
 	sbp->f_bsize = mstat.f_bsize;
 	sbp->f_iosize = mstat.f_iosize;
 	sbp->f_blocks = mstat.f_blocks;
 	sbp->f_bfree = mstat.f_bfree;
 	sbp->f_bavail = mstat.f_bavail;
 	sbp->f_files = mstat.f_files;
 	sbp->f_ffree = mstat.f_ffree;
 	return (0);
 }
 
 static int
 nullfs_sync(mp, waitfor)
 	struct mount *mp;
 	int waitfor;
 {
 	/*
 	 * XXX - Assumes no data cached at null layer.
 	 */
 	return (0);
 }
 
 static int
 nullfs_vget(mp, ino, flags, vpp)
 	struct mount *mp;
 	ino_t ino;
 	int flags;
 	struct vnode **vpp;
 {
 	int error;
 
 	KASSERT((flags & LK_TYPE_MASK) != 0,
 	    ("nullfs_vget: no lock requested"));
 
 	error = VFS_VGET(MOUNTTONULLMOUNT(mp)->nullm_vfs, ino, flags, vpp);
 	if (error != 0)
 		return (error);
 	return (null_nodeget(mp, *vpp, vpp));
 }
 
 static int
 nullfs_fhtovp(mp, fidp, flags, vpp)
 	struct mount *mp;
 	struct fid *fidp;
 	int flags;
 	struct vnode **vpp;
 {
 	int error;
 
 	error = VFS_FHTOVP(MOUNTTONULLMOUNT(mp)->nullm_vfs, fidp, flags,
 	    vpp);
 	if (error != 0)
 		return (error);
 	return (null_nodeget(mp, *vpp, vpp));
 }
 
 static int                        
 nullfs_extattrctl(mp, cmd, filename_vp, namespace, attrname)
 	struct mount *mp;
 	int cmd;
 	struct vnode *filename_vp;
 	int namespace;
 	const char *attrname;
 {
 
 	return (VFS_EXTATTRCTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd,
 	    filename_vp, namespace, attrname));
 }
 
 static void
 nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp)
 {
 	struct vnode *vp;
 
 	vp = null_hashget(mp, lowervp);
 	if (vp == NULL)
 		return;
 	VTONULL(vp)->null_flags |= NULLV_NOUNLOCK;
 	vgone(vp);
 	vput(vp);
 }
 
 static void
 nullfs_unlink_lowervp(struct mount *mp, struct vnode *lowervp)
 {
 	struct vnode *vp;
 	struct null_node *xp;
 
 	vp = null_hashget(mp, lowervp);
 	if (vp == NULL)
 		return;
 	xp = VTONULL(vp);
 	xp->null_flags |= NULLV_DROP | NULLV_NOUNLOCK;
 	vhold(vp);
 	vunref(vp);
 
 	if (vp->v_usecount == 0) {
 		/*
 		 * If vunref() dropped the last use reference on the
 		 * nullfs vnode, it must be reclaimed, and its lock
 		 * was split from the lower vnode lock.  Need to do
 		 * extra unlock before allowing the final vdrop() to
 		 * free the vnode.
 		 */
 		KASSERT((vp->v_iflag & VI_DOOMED) != 0,
 		    ("not reclaimed nullfs vnode %p", vp));
 		VOP_UNLOCK(vp, 0);
 	} else {
 		/*
 		 * Otherwise, the nullfs vnode still shares the lock
 		 * with the lower vnode, and must not be unlocked.
 		 * Also clear the NULLV_NOUNLOCK, the flag is not
 		 * relevant for future reclamations.
 		 */
 		ASSERT_VOP_ELOCKED(vp, "unlink_lowervp");
 		KASSERT((vp->v_iflag & VI_DOOMED) == 0,
 		    ("reclaimed nullfs vnode %p", vp));
 		xp->null_flags &= ~NULLV_NOUNLOCK;
 	}
 	vdrop(vp);
 }
 
 static struct vfsops null_vfsops = {
 	.vfs_extattrctl =	nullfs_extattrctl,
 	.vfs_fhtovp =		nullfs_fhtovp,
 	.vfs_init =		nullfs_init,
 	.vfs_mount =		nullfs_mount,
 	.vfs_quotactl =		nullfs_quotactl,
 	.vfs_root =		nullfs_root,
 	.vfs_statfs =		nullfs_statfs,
 	.vfs_sync =		nullfs_sync,
 	.vfs_uninit =		nullfs_uninit,
 	.vfs_unmount =		nullfs_unmount,
 	.vfs_vget =		nullfs_vget,
 	.vfs_reclaim_lowervp =	nullfs_reclaim_lowervp,
 	.vfs_unlink_lowervp =	nullfs_unlink_lowervp,
 };
 
 VFS_SET(null_vfsops, nullfs, VFCF_LOOPBACK | VFCF_JAIL);
Index: user/ngie/more-tests/sys/kern/imgact_elf.c
===================================================================
--- user/ngie/more-tests/sys/kern/imgact_elf.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/imgact_elf.c	(revision 281585)
@@ -1,2157 +1,2158 @@
 /*-
  * Copyright (c) 2000 David O'Brien
  * Copyright (c) 1995-1996 Søren Schmidt
  * Copyright (c) 1996 Peter Wemm
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer
  *    in this position and unchanged.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. The name of the author may not be used to endorse or promote products
  *    derived from this software without specific prior written permission
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_capsicum.h"
 #include "opt_compat.h"
 #include "opt_gzio.h"
 
 #include <sys/param.h>
 #include <sys/capsicum.h>
 #include <sys/exec.h>
 #include <sys/fcntl.h>
 #include <sys/gzio.h>
 #include <sys/imgact.h>
 #include <sys/imgact_elf.h>
 #include <sys/jail.h>
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/mman.h>
 #include <sys/namei.h>
 #include <sys/pioctl.h>
 #include <sys/proc.h>
 #include <sys/procfs.h>
 #include <sys/racct.h>
 #include <sys/resourcevar.h>
 #include <sys/rwlock.h>
 #include <sys/sbuf.h>
 #include <sys/sf_buf.h>
 #include <sys/smp.h>
 #include <sys/systm.h>
 #include <sys/signalvar.h>
 #include <sys/stat.h>
 #include <sys/sx.h>
 #include <sys/syscall.h>
 #include <sys/sysctl.h>
 #include <sys/sysent.h>
 #include <sys/vnode.h>
 #include <sys/syslog.h>
 #include <sys/eventhandler.h>
 #include <sys/user.h>
 
 #include <vm/vm.h>
 #include <vm/vm_kern.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 #include <vm/vm_object.h>
 #include <vm/vm_extern.h>
 
 #include <machine/elf.h>
 #include <machine/md_var.h>
 
 #define ELF_NOTE_ROUNDSIZE	4
 #define OLD_EI_BRAND	8
 
 static int __elfN(check_header)(const Elf_Ehdr *hdr);
 static Elf_Brandinfo *__elfN(get_brandinfo)(struct image_params *imgp,
     const char *interp, int interp_name_len, int32_t *osrel);
 static int __elfN(load_file)(struct proc *p, const char *file, u_long *addr,
     u_long *entry, size_t pagesize);
 static int __elfN(load_section)(struct image_params *imgp, vm_offset_t offset,
     caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot,
     size_t pagesize);
 static int __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp);
 static boolean_t __elfN(freebsd_trans_osrel)(const Elf_Note *note,
     int32_t *osrel);
 static boolean_t kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel);
 static boolean_t __elfN(check_note)(struct image_params *imgp,
     Elf_Brandnote *checknote, int32_t *osrel);
 static vm_prot_t __elfN(trans_prot)(Elf_Word);
 static Elf_Word __elfN(untrans_prot)(vm_prot_t);
 
 SYSCTL_NODE(_kern, OID_AUTO, __CONCAT(elf, __ELF_WORD_SIZE), CTLFLAG_RW, 0,
     "");
 
 #define	CORE_BUF_SIZE	(16 * 1024)
 
 int __elfN(fallback_brand) = -1;
 SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO,
     fallback_brand, CTLFLAG_RWTUN, &__elfN(fallback_brand), 0,
     __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) " brand of last resort");
 
 static int elf_legacy_coredump = 0;
 SYSCTL_INT(_debug, OID_AUTO, __elfN(legacy_coredump), CTLFLAG_RW, 
     &elf_legacy_coredump, 0, "");
 
 int __elfN(nxstack) =
 #if defined(__amd64__) || defined(__powerpc64__) /* both 64 and 32 bit */
 	1;
 #else
 	0;
 #endif
 SYSCTL_INT(__CONCAT(_kern_elf, __ELF_WORD_SIZE), OID_AUTO,
     nxstack, CTLFLAG_RW, &__elfN(nxstack), 0,
     __XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE)) ": enable non-executable stack");
 
 #if __ELF_WORD_SIZE == 32
 #if defined(__amd64__)
 int i386_read_exec = 0;
 SYSCTL_INT(_kern_elf32, OID_AUTO, read_exec, CTLFLAG_RW, &i386_read_exec, 0,
     "enable execution from readable segments");
 #endif
 #endif
 
 static Elf_Brandinfo *elf_brand_list[MAX_BRANDS];
 
 #define	trunc_page_ps(va, ps)	((va) & ~(ps - 1))
 #define	round_page_ps(va, ps)	(((va) + (ps - 1)) & ~(ps - 1))
 #define	aligned(a, t)	(trunc_page_ps((u_long)(a), sizeof(t)) == (u_long)(a))
 
 static const char FREEBSD_ABI_VENDOR[] = "FreeBSD";
 
 Elf_Brandnote __elfN(freebsd_brandnote) = {
 	.hdr.n_namesz	= sizeof(FREEBSD_ABI_VENDOR),
 	.hdr.n_descsz	= sizeof(int32_t),
 	.hdr.n_type	= 1,
 	.vendor		= FREEBSD_ABI_VENDOR,
 	.flags		= BN_TRANSLATE_OSREL,
 	.trans_osrel	= __elfN(freebsd_trans_osrel)
 };
 
 static boolean_t
 __elfN(freebsd_trans_osrel)(const Elf_Note *note, int32_t *osrel)
 {
 	uintptr_t p;
 
 	p = (uintptr_t)(note + 1);
 	p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE);
 	*osrel = *(const int32_t *)(p);
 
 	return (TRUE);
 }
 
 static const char GNU_ABI_VENDOR[] = "GNU";
 static int GNU_KFREEBSD_ABI_DESC = 3;
 
 Elf_Brandnote __elfN(kfreebsd_brandnote) = {
 	.hdr.n_namesz	= sizeof(GNU_ABI_VENDOR),
 	.hdr.n_descsz	= 16,	/* XXX at least 16 */
 	.hdr.n_type	= 1,
 	.vendor		= GNU_ABI_VENDOR,
 	.flags		= BN_TRANSLATE_OSREL,
 	.trans_osrel	= kfreebsd_trans_osrel
 };
 
 static boolean_t
 kfreebsd_trans_osrel(const Elf_Note *note, int32_t *osrel)
 {
 	const Elf32_Word *desc;
 	uintptr_t p;
 
 	p = (uintptr_t)(note + 1);
 	p += roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE);
 
 	desc = (const Elf32_Word *)p;
 	if (desc[0] != GNU_KFREEBSD_ABI_DESC)
 		return (FALSE);
 
 	/*
 	 * Debian GNU/kFreeBSD embeds the earliest compatible kernel version
 	 * (__FreeBSD_version: <major><two digit minor>Rxx) in the LSB way.
 	 */
 	*osrel = desc[1] * 100000 + desc[2] * 1000 + desc[3];
 
 	return (TRUE);
 }
 
 int
 __elfN(insert_brand_entry)(Elf_Brandinfo *entry)
 {
 	int i;
 
 	for (i = 0; i < MAX_BRANDS; i++) {
 		if (elf_brand_list[i] == NULL) {
 			elf_brand_list[i] = entry;
 			break;
 		}
 	}
 	if (i == MAX_BRANDS) {
 		printf("WARNING: %s: could not insert brandinfo entry: %p\n",
 			__func__, entry);
 		return (-1);
 	}
 	return (0);
 }
 
 int
 __elfN(remove_brand_entry)(Elf_Brandinfo *entry)
 {
 	int i;
 
 	for (i = 0; i < MAX_BRANDS; i++) {
 		if (elf_brand_list[i] == entry) {
 			elf_brand_list[i] = NULL;
 			break;
 		}
 	}
 	if (i == MAX_BRANDS)
 		return (-1);
 	return (0);
 }
 
 int
 __elfN(brand_inuse)(Elf_Brandinfo *entry)
 {
 	struct proc *p;
 	int rval = FALSE;
 
 	sx_slock(&allproc_lock);
 	FOREACH_PROC_IN_SYSTEM(p) {
 		if (p->p_sysent == entry->sysvec) {
 			rval = TRUE;
 			break;
 		}
 	}
 	sx_sunlock(&allproc_lock);
 
 	return (rval);
 }
 
 static Elf_Brandinfo *
 __elfN(get_brandinfo)(struct image_params *imgp, const char *interp,
     int interp_name_len, int32_t *osrel)
 {
 	const Elf_Ehdr *hdr = (const Elf_Ehdr *)imgp->image_header;
 	Elf_Brandinfo *bi;
 	boolean_t ret;
 	int i;
 
 	/*
 	 * We support four types of branding -- (1) the ELF EI_OSABI field
 	 * that SCO added to the ELF spec, (2) FreeBSD 3.x's traditional string
 	 * branding w/in the ELF header, (3) path of the `interp_path'
 	 * field, and (4) the ".note.ABI-tag" ELF section.
 	 */
 
 	/* Look for an ".note.ABI-tag" ELF section */
 	for (i = 0; i < MAX_BRANDS; i++) {
 		bi = elf_brand_list[i];
 		if (bi == NULL)
 			continue;
 		if (hdr->e_machine == bi->machine && (bi->flags &
 		    (BI_BRAND_NOTE|BI_BRAND_NOTE_MANDATORY)) != 0) {
 			ret = __elfN(check_note)(imgp, bi->brand_note, osrel);
 			if (ret)
 				return (bi);
 		}
 	}
 
 	/* If the executable has a brand, search for it in the brand list. */
 	for (i = 0; i < MAX_BRANDS; i++) {
 		bi = elf_brand_list[i];
 		if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY)
 			continue;
 		if (hdr->e_machine == bi->machine &&
 		    (hdr->e_ident[EI_OSABI] == bi->brand ||
 		    strncmp((const char *)&hdr->e_ident[OLD_EI_BRAND],
 		    bi->compat_3_brand, strlen(bi->compat_3_brand)) == 0))
 			return (bi);
 	}
 
 	/* No known brand, see if the header is recognized by any brand */
 	for (i = 0; i < MAX_BRANDS; i++) {
 		bi = elf_brand_list[i];
 		if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY ||
 		    bi->header_supported == NULL)
 			continue;
 		if (hdr->e_machine == bi->machine) {
 			ret = bi->header_supported(imgp);
 			if (ret)
 				return (bi);
 		}
 	}
 
 	/* Lacking a known brand, search for a recognized interpreter. */
 	if (interp != NULL) {
 		for (i = 0; i < MAX_BRANDS; i++) {
 			bi = elf_brand_list[i];
 			if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY)
 				continue;
 			if (hdr->e_machine == bi->machine &&
 			    /* ELF image p_filesz includes terminating zero */
 			    strlen(bi->interp_path) + 1 == interp_name_len &&
 			    strncmp(interp, bi->interp_path, interp_name_len)
 			    == 0)
 				return (bi);
 		}
 	}
 
 	/* Lacking a recognized interpreter, try the default brand */
 	for (i = 0; i < MAX_BRANDS; i++) {
 		bi = elf_brand_list[i];
 		if (bi == NULL || bi->flags & BI_BRAND_NOTE_MANDATORY)
 			continue;
 		if (hdr->e_machine == bi->machine &&
 		    __elfN(fallback_brand) == bi->brand)
 			return (bi);
 	}
 	return (NULL);
 }
 
 static int
 __elfN(check_header)(const Elf_Ehdr *hdr)
 {
 	Elf_Brandinfo *bi;
 	int i;
 
 	if (!IS_ELF(*hdr) ||
 	    hdr->e_ident[EI_CLASS] != ELF_TARG_CLASS ||
 	    hdr->e_ident[EI_DATA] != ELF_TARG_DATA ||
 	    hdr->e_ident[EI_VERSION] != EV_CURRENT ||
 	    hdr->e_phentsize != sizeof(Elf_Phdr) ||
 	    hdr->e_version != ELF_TARG_VER)
 		return (ENOEXEC);
 
 	/*
 	 * Make sure we have at least one brand for this machine.
 	 */
 
 	for (i = 0; i < MAX_BRANDS; i++) {
 		bi = elf_brand_list[i];
 		if (bi != NULL && bi->machine == hdr->e_machine)
 			break;
 	}
 	if (i == MAX_BRANDS)
 		return (ENOEXEC);
 
 	return (0);
 }
 
 static int
 __elfN(map_partial)(vm_map_t map, vm_object_t object, vm_ooffset_t offset,
     vm_offset_t start, vm_offset_t end, vm_prot_t prot)
 {
 	struct sf_buf *sf;
 	int error;
 	vm_offset_t off;
 
 	/*
 	 * Create the page if it doesn't exist yet. Ignore errors.
 	 */
 	vm_map_lock(map);
 	vm_map_insert(map, NULL, 0, trunc_page(start), round_page(end),
 	    VM_PROT_ALL, VM_PROT_ALL, 0);
 	vm_map_unlock(map);
 
 	/*
 	 * Find the page from the underlying object.
 	 */
 	if (object) {
 		sf = vm_imgact_map_page(object, offset);
 		if (sf == NULL)
 			return (KERN_FAILURE);
 		off = offset - trunc_page(offset);
 		error = copyout((caddr_t)sf_buf_kva(sf) + off, (caddr_t)start,
 		    end - start);
 		vm_imgact_unmap_page(sf);
 		if (error) {
 			return (KERN_FAILURE);
 		}
 	}
 
 	return (KERN_SUCCESS);
 }
 
 static int
 __elfN(map_insert)(vm_map_t map, vm_object_t object, vm_ooffset_t offset,
     vm_offset_t start, vm_offset_t end, vm_prot_t prot, int cow)
 {
 	struct sf_buf *sf;
 	vm_offset_t off;
 	vm_size_t sz;
 	int error, rv;
 
 	if (start != trunc_page(start)) {
 		rv = __elfN(map_partial)(map, object, offset, start,
 		    round_page(start), prot);
 		if (rv)
 			return (rv);
 		offset += round_page(start) - start;
 		start = round_page(start);
 	}
 	if (end != round_page(end)) {
 		rv = __elfN(map_partial)(map, object, offset +
 		    trunc_page(end) - start, trunc_page(end), end, prot);
 		if (rv)
 			return (rv);
 		end = trunc_page(end);
 	}
 	if (end > start) {
 		if (offset & PAGE_MASK) {
 			/*
 			 * The mapping is not page aligned. This means we have
 			 * to copy the data. Sigh.
 			 */
 			rv = vm_map_find(map, NULL, 0, &start, end - start, 0,
 			    VMFS_NO_SPACE, prot | VM_PROT_WRITE, VM_PROT_ALL,
 			    0);
 			if (rv)
 				return (rv);
 			if (object == NULL)
 				return (KERN_SUCCESS);
 			for (; start < end; start += sz) {
 				sf = vm_imgact_map_page(object, offset);
 				if (sf == NULL)
 					return (KERN_FAILURE);
 				off = offset - trunc_page(offset);
 				sz = end - start;
 				if (sz > PAGE_SIZE - off)
 					sz = PAGE_SIZE - off;
 				error = copyout((caddr_t)sf_buf_kva(sf) + off,
 				    (caddr_t)start, sz);
 				vm_imgact_unmap_page(sf);
 				if (error) {
 					return (KERN_FAILURE);
 				}
 				offset += sz;
 			}
 			rv = KERN_SUCCESS;
 		} else {
 			vm_object_reference(object);
 			vm_map_lock(map);
 			rv = vm_map_insert(map, object, offset, start, end,
 			    prot, VM_PROT_ALL, cow);
 			vm_map_unlock(map);
 			if (rv != KERN_SUCCESS)
 				vm_object_deallocate(object);
 		}
 		return (rv);
 	} else {
 		return (KERN_SUCCESS);
 	}
 }
 
 static int
 __elfN(load_section)(struct image_params *imgp, vm_offset_t offset,
     caddr_t vmaddr, size_t memsz, size_t filsz, vm_prot_t prot,
     size_t pagesize)
 {
 	struct sf_buf *sf;
 	size_t map_len;
 	vm_map_t map;
 	vm_object_t object;
 	vm_offset_t map_addr;
 	int error, rv, cow;
 	size_t copy_len;
 	vm_offset_t file_addr;
 
 	/*
 	 * It's necessary to fail if the filsz + offset taken from the
 	 * header is greater than the actual file pager object's size.
 	 * If we were to allow this, then the vm_map_find() below would
 	 * walk right off the end of the file object and into the ether.
 	 *
 	 * While I'm here, might as well check for something else that
 	 * is invalid: filsz cannot be greater than memsz.
 	 */
 	if ((off_t)filsz + offset > imgp->attr->va_size || filsz > memsz) {
 		uprintf("elf_load_section: truncated ELF file\n");
 		return (ENOEXEC);
 	}
 
 	object = imgp->object;
 	map = &imgp->proc->p_vmspace->vm_map;
 	map_addr = trunc_page_ps((vm_offset_t)vmaddr, pagesize);
 	file_addr = trunc_page_ps(offset, pagesize);
 
 	/*
 	 * We have two choices.  We can either clear the data in the last page
 	 * of an oversized mapping, or we can start the anon mapping a page
 	 * early and copy the initialized data into that first page.  We
 	 * choose the second.
 	 */
 	if (memsz > filsz)
 		map_len = trunc_page_ps(offset + filsz, pagesize) - file_addr;
 	else
 		map_len = round_page_ps(offset + filsz, pagesize) - file_addr;
 
 	if (map_len != 0) {
 		/* cow flags: don't dump readonly sections in core */
 		cow = MAP_COPY_ON_WRITE | MAP_PREFAULT |
 		    (prot & VM_PROT_WRITE ? 0 : MAP_DISABLE_COREDUMP);
 
 		rv = __elfN(map_insert)(map,
 				      object,
 				      file_addr,	/* file offset */
 				      map_addr,		/* virtual start */
 				      map_addr + map_len,/* virtual end */
 				      prot,
 				      cow);
 		if (rv != KERN_SUCCESS)
 			return (EINVAL);
 
 		/* we can stop now if we've covered it all */
 		if (memsz == filsz) {
 			return (0);
 		}
 	}
 
 
 	/*
 	 * We have to get the remaining bit of the file into the first part
 	 * of the oversized map segment.  This is normally because the .data
 	 * segment in the file is extended to provide bss.  It's a neat idea
 	 * to try and save a page, but it's a pain in the behind to implement.
 	 */
 	copy_len = (offset + filsz) - trunc_page_ps(offset + filsz, pagesize);
 	map_addr = trunc_page_ps((vm_offset_t)vmaddr + filsz, pagesize);
 	map_len = round_page_ps((vm_offset_t)vmaddr + memsz, pagesize) -
 	    map_addr;
 
 	/* This had damn well better be true! */
 	if (map_len != 0) {
 		rv = __elfN(map_insert)(map, NULL, 0, map_addr, map_addr +
 		    map_len, VM_PROT_ALL, 0);
 		if (rv != KERN_SUCCESS) {
 			return (EINVAL);
 		}
 	}
 
 	if (copy_len != 0) {
 		vm_offset_t off;
 
 		sf = vm_imgact_map_page(object, offset + filsz);
 		if (sf == NULL)
 			return (EIO);
 
 		/* send the page fragment to user space */
 		off = trunc_page_ps(offset + filsz, pagesize) -
 		    trunc_page(offset + filsz);
 		error = copyout((caddr_t)sf_buf_kva(sf) + off,
 		    (caddr_t)map_addr, copy_len);
 		vm_imgact_unmap_page(sf);
 		if (error) {
 			return (error);
 		}
 	}
 
 	/*
 	 * set it to the specified protection.
 	 * XXX had better undo the damage from pasting over the cracks here!
 	 */
 	vm_map_protect(map, trunc_page(map_addr), round_page(map_addr +
 	    map_len), prot, FALSE);
 
 	return (0);
 }
 
 /*
  * Load the file "file" into memory.  It may be either a shared object
  * or an executable.
  *
  * The "addr" reference parameter is in/out.  On entry, it specifies
  * the address where a shared object should be loaded.  If the file is
  * an executable, this value is ignored.  On exit, "addr" specifies
  * where the file was actually loaded.
  *
  * The "entry" reference parameter is out only.  On exit, it specifies
  * the entry point for the loaded file.
  */
 static int
 __elfN(load_file)(struct proc *p, const char *file, u_long *addr,
 	u_long *entry, size_t pagesize)
 {
 	struct {
 		struct nameidata nd;
 		struct vattr attr;
 		struct image_params image_params;
 	} *tempdata;
 	const Elf_Ehdr *hdr = NULL;
 	const Elf_Phdr *phdr = NULL;
 	struct nameidata *nd;
 	struct vattr *attr;
 	struct image_params *imgp;
 	vm_prot_t prot;
 	u_long rbase;
 	u_long base_addr = 0;
 	int error, i, numsegs;
 
 #ifdef CAPABILITY_MODE
 	/*
 	 * XXXJA: This check can go away once we are sufficiently confident
 	 * that the checks in namei() are correct.
 	 */
 	if (IN_CAPABILITY_MODE(curthread))
 		return (ECAPMODE);
 #endif
 
 	tempdata = malloc(sizeof(*tempdata), M_TEMP, M_WAITOK);
 	nd = &tempdata->nd;
 	attr = &tempdata->attr;
 	imgp = &tempdata->image_params;
 
 	/*
 	 * Initialize part of the common data
 	 */
 	imgp->proc = p;
 	imgp->attr = attr;
 	imgp->firstpage = NULL;
 	imgp->image_header = NULL;
 	imgp->object = NULL;
 	imgp->execlabel = NULL;
 
 	NDINIT(nd, LOOKUP, LOCKLEAF | FOLLOW, UIO_SYSSPACE, file, curthread);
 	if ((error = namei(nd)) != 0) {
 		nd->ni_vp = NULL;
 		goto fail;
 	}
 	NDFREE(nd, NDF_ONLY_PNBUF);
 	imgp->vp = nd->ni_vp;
 
 	/*
 	 * Check permissions, modes, uid, etc on the file, and "open" it.
 	 */
 	error = exec_check_permissions(imgp);
 	if (error)
 		goto fail;
 
 	error = exec_map_first_page(imgp);
 	if (error)
 		goto fail;
 
 	/*
 	 * Also make certain that the interpreter stays the same, so set
 	 * its VV_TEXT flag, too.
 	 */
 	VOP_SET_TEXT(nd->ni_vp);
 
 	imgp->object = nd->ni_vp->v_object;
 
 	hdr = (const Elf_Ehdr *)imgp->image_header;
 	if ((error = __elfN(check_header)(hdr)) != 0)
 		goto fail;
 	if (hdr->e_type == ET_DYN)
 		rbase = *addr;
 	else if (hdr->e_type == ET_EXEC)
 		rbase = 0;
 	else {
 		error = ENOEXEC;
 		goto fail;
 	}
 
 	/* Only support headers that fit within the first page for now */
 	if ((hdr->e_phoff > PAGE_SIZE) ||
 	    (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) {
 		error = ENOEXEC;
 		goto fail;
 	}
 
 	phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff);
 	if (!aligned(phdr, Elf_Addr)) {
 		error = ENOEXEC;
 		goto fail;
 	}
 
 	for (i = 0, numsegs = 0; i < hdr->e_phnum; i++) {
 		if (phdr[i].p_type == PT_LOAD && phdr[i].p_memsz != 0) {
 			/* Loadable segment */
 			prot = __elfN(trans_prot)(phdr[i].p_flags);
 			error = __elfN(load_section)(imgp, phdr[i].p_offset,
 			    (caddr_t)(uintptr_t)phdr[i].p_vaddr + rbase,
 			    phdr[i].p_memsz, phdr[i].p_filesz, prot, pagesize);
 			if (error != 0)
 				goto fail;
 			/*
 			 * Establish the base address if this is the
 			 * first segment.
 			 */
 			if (numsegs == 0)
 				base_addr = trunc_page(phdr[i].p_vaddr +
 				    rbase);
 			numsegs++;
 		}
 	}
 	*addr = base_addr;
 	*entry = (unsigned long)hdr->e_entry + rbase;
 
 fail:
 	if (imgp->firstpage)
 		exec_unmap_first_page(imgp);
 
 	if (nd->ni_vp)
 		vput(nd->ni_vp);
 
 	free(tempdata, M_TEMP);
 
 	return (error);
 }
 
 static int
 __CONCAT(exec_, __elfN(imgact))(struct image_params *imgp)
 {
 	const Elf_Ehdr *hdr = (const Elf_Ehdr *)imgp->image_header;
 	const Elf_Phdr *phdr;
 	Elf_Auxargs *elf_auxargs;
 	struct vmspace *vmspace;
 	vm_prot_t prot;
 	u_long text_size = 0, data_size = 0, total_size = 0;
 	u_long text_addr = 0, data_addr = 0;
 	u_long seg_size, seg_addr;
 	u_long addr, baddr, et_dyn_addr, entry = 0, proghdr = 0;
 	int32_t osrel = 0;
 	int error = 0, i, n, interp_name_len = 0;
 	const char *interp = NULL, *newinterp = NULL;
 	Elf_Brandinfo *brand_info;
 	char *path;
 	struct sysentvec *sv;
 
 	/*
 	 * Do we have a valid ELF header?
 	 *
 	 * Only allow ET_EXEC & ET_DYN here, reject ET_DYN later
 	 * if particular brand doesn't support it.
 	 */
 	if (__elfN(check_header)(hdr) != 0 ||
 	    (hdr->e_type != ET_EXEC && hdr->e_type != ET_DYN))
 		return (-1);
 
 	/*
 	 * From here on down, we return an errno, not -1, as we've
 	 * detected an ELF file.
 	 */
 
 	if ((hdr->e_phoff > PAGE_SIZE) ||
 	    (u_int)hdr->e_phentsize * hdr->e_phnum > PAGE_SIZE - hdr->e_phoff) {
 		/* Only support headers in first page for now */
 		return (ENOEXEC);
 	}
 	phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff);
 	if (!aligned(phdr, Elf_Addr))
 		return (ENOEXEC);
 	n = 0;
 	baddr = 0;
 	for (i = 0; i < hdr->e_phnum; i++) {
 		switch (phdr[i].p_type) {
 		case PT_LOAD:
 			if (n == 0)
 				baddr = phdr[i].p_vaddr;
 			n++;
 			break;
 		case PT_INTERP:
 			/* Path to interpreter */
 			if (phdr[i].p_filesz > MAXPATHLEN ||
 			    phdr[i].p_offset > PAGE_SIZE ||
 			    phdr[i].p_filesz > PAGE_SIZE - phdr[i].p_offset)
 				return (ENOEXEC);
 			interp = imgp->image_header + phdr[i].p_offset;
 			interp_name_len = phdr[i].p_filesz;
 			break;
 		case PT_GNU_STACK:
 			if (__elfN(nxstack))
 				imgp->stack_prot =
 				    __elfN(trans_prot)(phdr[i].p_flags);
+			imgp->stack_sz = phdr[i].p_memsz;
 			break;
 		}
 	}
 
 	brand_info = __elfN(get_brandinfo)(imgp, interp, interp_name_len,
 	    &osrel);
 	if (brand_info == NULL) {
 		uprintf("ELF binary type \"%u\" not known.\n",
 		    hdr->e_ident[EI_OSABI]);
 		return (ENOEXEC);
 	}
 	if (hdr->e_type == ET_DYN) {
 		if ((brand_info->flags & BI_CAN_EXEC_DYN) == 0)
 			return (ENOEXEC);
 		/*
 		 * Honour the base load address from the dso if it is
 		 * non-zero for some reason.
 		 */
 		if (baddr == 0)
 			et_dyn_addr = ET_DYN_LOAD_ADDR;
 		else
 			et_dyn_addr = 0;
 	} else
 		et_dyn_addr = 0;
 	sv = brand_info->sysvec;
 	if (interp != NULL && brand_info->interp_newpath != NULL)
 		newinterp = brand_info->interp_newpath;
 
 	/*
 	 * Avoid a possible deadlock if the current address space is destroyed
 	 * and that address space maps the locked vnode.  In the common case,
 	 * the locked vnode's v_usecount is decremented but remains greater
 	 * than zero.  Consequently, the vnode lock is not needed by vrele().
 	 * However, in cases where the vnode lock is external, such as nullfs,
 	 * v_usecount may become zero.
 	 *
 	 * The VV_TEXT flag prevents modifications to the executable while
 	 * the vnode is unlocked.
 	 */
 	VOP_UNLOCK(imgp->vp, 0);
 
 	error = exec_new_vmspace(imgp, sv);
 	imgp->proc->p_sysent = sv;
 
 	vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY);
 	if (error)
 		return (error);
 
 	for (i = 0; i < hdr->e_phnum; i++) {
 		switch (phdr[i].p_type) {
 		case PT_LOAD:	/* Loadable segment */
 			if (phdr[i].p_memsz == 0)
 				break;
 			prot = __elfN(trans_prot)(phdr[i].p_flags);
 			error = __elfN(load_section)(imgp, phdr[i].p_offset,
 			    (caddr_t)(uintptr_t)phdr[i].p_vaddr + et_dyn_addr,
 			    phdr[i].p_memsz, phdr[i].p_filesz, prot,
 			    sv->sv_pagesize);
 			if (error != 0)
 				return (error);
 
 			/*
 			 * If this segment contains the program headers,
 			 * remember their virtual address for the AT_PHDR
 			 * aux entry. Static binaries don't usually include
 			 * a PT_PHDR entry.
 			 */
 			if (phdr[i].p_offset == 0 &&
 			    hdr->e_phoff + hdr->e_phnum * hdr->e_phentsize
 				<= phdr[i].p_filesz)
 				proghdr = phdr[i].p_vaddr + hdr->e_phoff +
 				    et_dyn_addr;
 
 			seg_addr = trunc_page(phdr[i].p_vaddr + et_dyn_addr);
 			seg_size = round_page(phdr[i].p_memsz +
 			    phdr[i].p_vaddr + et_dyn_addr - seg_addr);
 
 			/*
 			 * Make the largest executable segment the official
 			 * text segment and all others data.
 			 *
 			 * Note that obreak() assumes that data_addr +
 			 * data_size == end of data load area, and the ELF
 			 * file format expects segments to be sorted by
 			 * address.  If multiple data segments exist, the
 			 * last one will be used.
 			 */
 
 			if (phdr[i].p_flags & PF_X && text_size < seg_size) {
 				text_size = seg_size;
 				text_addr = seg_addr;
 			} else {
 				data_size = seg_size;
 				data_addr = seg_addr;
 			}
 			total_size += seg_size;
 			break;
 		case PT_PHDR: 	/* Program header table info */
 			proghdr = phdr[i].p_vaddr + et_dyn_addr;
 			break;
 		default:
 			break;
 		}
 	}
 	
 	if (data_addr == 0 && data_size == 0) {
 		data_addr = text_addr;
 		data_size = text_size;
 	}
 
 	entry = (u_long)hdr->e_entry + et_dyn_addr;
 
 	/*
 	 * Check limits.  It should be safe to check the
 	 * limits after loading the segments since we do
 	 * not actually fault in all the segments pages.
 	 */
 	PROC_LOCK(imgp->proc);
 	if (data_size > lim_cur(imgp->proc, RLIMIT_DATA) ||
 	    text_size > maxtsiz ||
 	    total_size > lim_cur(imgp->proc, RLIMIT_VMEM) ||
 	    racct_set(imgp->proc, RACCT_DATA, data_size) != 0 ||
 	    racct_set(imgp->proc, RACCT_VMEM, total_size) != 0) {
 		PROC_UNLOCK(imgp->proc);
 		return (ENOMEM);
 	}
 
 	vmspace = imgp->proc->p_vmspace;
 	vmspace->vm_tsize = text_size >> PAGE_SHIFT;
 	vmspace->vm_taddr = (caddr_t)(uintptr_t)text_addr;
 	vmspace->vm_dsize = data_size >> PAGE_SHIFT;
 	vmspace->vm_daddr = (caddr_t)(uintptr_t)data_addr;
 
 	/*
 	 * We load the dynamic linker where a userland call
 	 * to mmap(0, ...) would put it.  The rationale behind this
 	 * calculation is that it leaves room for the heap to grow to
 	 * its maximum allowed size.
 	 */
 	addr = round_page((vm_offset_t)vmspace->vm_daddr + lim_max(imgp->proc,
 	    RLIMIT_DATA));
 	PROC_UNLOCK(imgp->proc);
 
 	imgp->entry_addr = entry;
 
 	if (interp != NULL) {
 		int have_interp = FALSE;
 		VOP_UNLOCK(imgp->vp, 0);
 		if (brand_info->emul_path != NULL &&
 		    brand_info->emul_path[0] != '\0') {
 			path = malloc(MAXPATHLEN, M_TEMP, M_WAITOK);
 			snprintf(path, MAXPATHLEN, "%s%s",
 			    brand_info->emul_path, interp);
 			error = __elfN(load_file)(imgp->proc, path, &addr,
 			    &imgp->entry_addr, sv->sv_pagesize);
 			free(path, M_TEMP);
 			if (error == 0)
 				have_interp = TRUE;
 		}
 		if (!have_interp && newinterp != NULL) {
 			error = __elfN(load_file)(imgp->proc, newinterp, &addr,
 			    &imgp->entry_addr, sv->sv_pagesize);
 			if (error == 0)
 				have_interp = TRUE;
 		}
 		if (!have_interp) {
 			error = __elfN(load_file)(imgp->proc, interp, &addr,
 			    &imgp->entry_addr, sv->sv_pagesize);
 		}
 		vn_lock(imgp->vp, LK_EXCLUSIVE | LK_RETRY);
 		if (error != 0) {
 			uprintf("ELF interpreter %s not found\n", interp);
 			return (error);
 		}
 	} else
 		addr = et_dyn_addr;
 
 	/*
 	 * Construct auxargs table (used by the fixup routine)
 	 */
 	elf_auxargs = malloc(sizeof(Elf_Auxargs), M_TEMP, M_WAITOK);
 	elf_auxargs->execfd = -1;
 	elf_auxargs->phdr = proghdr;
 	elf_auxargs->phent = hdr->e_phentsize;
 	elf_auxargs->phnum = hdr->e_phnum;
 	elf_auxargs->pagesz = PAGE_SIZE;
 	elf_auxargs->base = addr;
 	elf_auxargs->flags = 0;
 	elf_auxargs->entry = entry;
 
 	imgp->auxargs = elf_auxargs;
 	imgp->interpreted = 0;
 	imgp->reloc_base = addr;
 	imgp->proc->p_osrel = osrel;
 
 	return (error);
 }
 
 #define	suword __CONCAT(suword, __ELF_WORD_SIZE)
 
 int
 __elfN(freebsd_fixup)(register_t **stack_base, struct image_params *imgp)
 {
 	Elf_Auxargs *args = (Elf_Auxargs *)imgp->auxargs;
 	Elf_Addr *base;
 	Elf_Addr *pos;
 
 	base = (Elf_Addr *)*stack_base;
 	pos = base + (imgp->args->argc + imgp->args->envc + 2);
 
 	if (args->execfd != -1)
 		AUXARGS_ENTRY(pos, AT_EXECFD, args->execfd);
 	AUXARGS_ENTRY(pos, AT_PHDR, args->phdr);
 	AUXARGS_ENTRY(pos, AT_PHENT, args->phent);
 	AUXARGS_ENTRY(pos, AT_PHNUM, args->phnum);
 	AUXARGS_ENTRY(pos, AT_PAGESZ, args->pagesz);
 	AUXARGS_ENTRY(pos, AT_FLAGS, args->flags);
 	AUXARGS_ENTRY(pos, AT_ENTRY, args->entry);
 	AUXARGS_ENTRY(pos, AT_BASE, args->base);
 	if (imgp->execpathp != 0)
 		AUXARGS_ENTRY(pos, AT_EXECPATH, imgp->execpathp);
 	AUXARGS_ENTRY(pos, AT_OSRELDATE,
 	    imgp->proc->p_ucred->cr_prison->pr_osreldate);
 	if (imgp->canary != 0) {
 		AUXARGS_ENTRY(pos, AT_CANARY, imgp->canary);
 		AUXARGS_ENTRY(pos, AT_CANARYLEN, imgp->canarylen);
 	}
 	AUXARGS_ENTRY(pos, AT_NCPUS, mp_ncpus);
 	if (imgp->pagesizes != 0) {
 		AUXARGS_ENTRY(pos, AT_PAGESIZES, imgp->pagesizes);
 		AUXARGS_ENTRY(pos, AT_PAGESIZESLEN, imgp->pagesizeslen);
 	}
 	if (imgp->sysent->sv_timekeep_base != 0) {
 		AUXARGS_ENTRY(pos, AT_TIMEKEEP,
 		    imgp->sysent->sv_timekeep_base);
 	}
 	AUXARGS_ENTRY(pos, AT_STACKPROT, imgp->sysent->sv_shared_page_obj
 	    != NULL && imgp->stack_prot != 0 ? imgp->stack_prot :
 	    imgp->sysent->sv_stackprot);
 	AUXARGS_ENTRY(pos, AT_NULL, 0);
 
 	free(imgp->auxargs, M_TEMP);
 	imgp->auxargs = NULL;
 
 	base--;
 	suword(base, (long)imgp->args->argc);
 	*stack_base = (register_t *)base;
 	return (0);
 }
 
 /*
  * Code for generating ELF core dumps.
  */
 
 typedef void (*segment_callback)(vm_map_entry_t, void *);
 
 /* Closure for cb_put_phdr(). */
 struct phdr_closure {
 	Elf_Phdr *phdr;		/* Program header to fill in */
 	Elf_Off offset;		/* Offset of segment in core file */
 };
 
 /* Closure for cb_size_segment(). */
 struct sseg_closure {
 	int count;		/* Count of writable segments. */
 	size_t size;		/* Total size of all writable segments. */
 };
 
 typedef void (*outfunc_t)(void *, struct sbuf *, size_t *);
 
 struct note_info {
 	int		type;		/* Note type. */
 	outfunc_t 	outfunc; 	/* Output function. */
 	void		*outarg;	/* Argument for the output function. */
 	size_t		outsize;	/* Output size. */
 	TAILQ_ENTRY(note_info) link;	/* Link to the next note info. */
 };
 
 TAILQ_HEAD(note_info_list, note_info);
 
 /* Coredump output parameters. */
 struct coredump_params {
 	off_t		offset;
 	struct ucred	*active_cred;
 	struct ucred	*file_cred;
 	struct thread	*td;
 	struct vnode	*vp;
 	struct gzio_stream *gzs;
 };
 
 static void cb_put_phdr(vm_map_entry_t, void *);
 static void cb_size_segment(vm_map_entry_t, void *);
 static int core_write(struct coredump_params *, void *, size_t, off_t,
     enum uio_seg);
 static void each_writable_segment(struct thread *, segment_callback, void *);
 static int __elfN(corehdr)(struct coredump_params *, int, void *, size_t,
     struct note_info_list *, size_t);
 static void __elfN(prepare_notes)(struct thread *, struct note_info_list *,
     size_t *);
 static void __elfN(puthdr)(struct thread *, void *, size_t, int, size_t);
 static void __elfN(putnote)(struct note_info *, struct sbuf *);
 static size_t register_note(struct note_info_list *, int, outfunc_t, void *);
 static int sbuf_drain_core_output(void *, const char *, int);
 static int sbuf_drain_count(void *arg, const char *data, int len);
 
 static void __elfN(note_fpregset)(void *, struct sbuf *, size_t *);
 static void __elfN(note_prpsinfo)(void *, struct sbuf *, size_t *);
 static void __elfN(note_prstatus)(void *, struct sbuf *, size_t *);
 static void __elfN(note_threadmd)(void *, struct sbuf *, size_t *);
 static void __elfN(note_thrmisc)(void *, struct sbuf *, size_t *);
 static void __elfN(note_procstat_auxv)(void *, struct sbuf *, size_t *);
 static void __elfN(note_procstat_proc)(void *, struct sbuf *, size_t *);
 static void __elfN(note_procstat_psstrings)(void *, struct sbuf *, size_t *);
 static void note_procstat_files(void *, struct sbuf *, size_t *);
 static void note_procstat_groups(void *, struct sbuf *, size_t *);
 static void note_procstat_osrel(void *, struct sbuf *, size_t *);
 static void note_procstat_rlimit(void *, struct sbuf *, size_t *);
 static void note_procstat_umask(void *, struct sbuf *, size_t *);
 static void note_procstat_vmmap(void *, struct sbuf *, size_t *);
 
 #ifdef GZIO
 extern int compress_user_cores_gzlevel;
 
 /*
  * Write out a core segment to the compression stream.
  */
 static int
 compress_chunk(struct coredump_params *p, char *base, char *buf, u_int len)
 {
 	u_int chunk_len;
 	int error;
 
 	while (len > 0) {
 		chunk_len = MIN(len, CORE_BUF_SIZE);
 		copyin(base, buf, chunk_len);
 		error = gzio_write(p->gzs, buf, chunk_len);
 		if (error != 0)
 			break;
 		base += chunk_len;
 		len -= chunk_len;
 	}
 	return (error);
 }
 
 static int
 core_gz_write(void *base, size_t len, off_t offset, void *arg)
 {
 
 	return (core_write((struct coredump_params *)arg, base, len, offset,
 	    UIO_SYSSPACE));
 }
 #endif /* GZIO */
 
 static int
 core_write(struct coredump_params *p, void *base, size_t len, off_t offset,
     enum uio_seg seg)
 {
 
 	return (vn_rdwr_inchunks(UIO_WRITE, p->vp, base, len, offset,
 	    seg, IO_UNIT | IO_DIRECT | IO_RANGELOCKED,
 	    p->active_cred, p->file_cred, NULL, p->td));
 }
 
 static int
 core_output(void *base, size_t len, off_t offset, struct coredump_params *p,
     void *tmpbuf)
 {
 
 #ifdef GZIO
 	if (p->gzs != NULL)
 		return (compress_chunk(p, base, tmpbuf, len));
 #endif
 	return (core_write(p, base, len, offset, UIO_USERSPACE));
 }
 
 /*
  * Drain into a core file.
  */
 static int
 sbuf_drain_core_output(void *arg, const char *data, int len)
 {
 	struct coredump_params *p;
 	int error, locked;
 
 	p = (struct coredump_params *)arg;
 
 	/*
 	 * Some kern_proc out routines that print to this sbuf may
 	 * call us with the process lock held. Draining with the
 	 * non-sleepable lock held is unsafe. The lock is needed for
 	 * those routines when dumping a live process. In our case we
 	 * can safely release the lock before draining and acquire
 	 * again after.
 	 */
 	locked = PROC_LOCKED(p->td->td_proc);
 	if (locked)
 		PROC_UNLOCK(p->td->td_proc);
 #ifdef GZIO
 	if (p->gzs != NULL)
 		error = gzio_write(p->gzs, __DECONST(char *, data), len);
 	else
 #endif
 		error = core_write(p, __DECONST(void *, data), len, p->offset,
 		    UIO_SYSSPACE);
 	if (locked)
 		PROC_LOCK(p->td->td_proc);
 	if (error != 0)
 		return (-error);
 	p->offset += len;
 	return (len);
 }
 
 /*
  * Drain into a counter.
  */
 static int
 sbuf_drain_count(void *arg, const char *data __unused, int len)
 {
 	size_t *sizep;
 
 	sizep = (size_t *)arg;
 	*sizep += len;
 	return (len);
 }
 
 int
 __elfN(coredump)(struct thread *td, struct vnode *vp, off_t limit, int flags)
 {
 	struct ucred *cred = td->td_ucred;
 	int error = 0;
 	struct sseg_closure seginfo;
 	struct note_info_list notelst;
 	struct coredump_params params;
 	struct note_info *ninfo;
 	void *hdr, *tmpbuf;
 	size_t hdrsize, notesz, coresize;
 	boolean_t compress;
 
 	compress = (flags & IMGACT_CORE_COMPRESS) != 0;
 	hdr = NULL;
 	TAILQ_INIT(&notelst);
 
 	/* Size the program segments. */
 	seginfo.count = 0;
 	seginfo.size = 0;
 	each_writable_segment(td, cb_size_segment, &seginfo);
 
 	/*
 	 * Collect info about the core file header area.
 	 */
 	hdrsize = sizeof(Elf_Ehdr) + sizeof(Elf_Phdr) * (1 + seginfo.count);
 	__elfN(prepare_notes)(td, &notelst, &notesz);
 	coresize = round_page(hdrsize + notesz) + seginfo.size;
 
 #ifdef RACCT
 	PROC_LOCK(td->td_proc);
 	error = racct_add(td->td_proc, RACCT_CORE, coresize);
 	PROC_UNLOCK(td->td_proc);
 	if (error != 0) {
 		error = EFAULT;
 		goto done;
 	}
 #endif
 	if (coresize >= limit) {
 		error = EFAULT;
 		goto done;
 	}
 
 	/* Set up core dump parameters. */
 	params.offset = 0;
 	params.active_cred = cred;
 	params.file_cred = NOCRED;
 	params.td = td;
 	params.vp = vp;
 	params.gzs = NULL;
 
 	tmpbuf = NULL;
 #ifdef GZIO
 	/* Create a compression stream if necessary. */
 	if (compress) {
 		params.gzs = gzio_init(core_gz_write, GZIO_DEFLATE,
 		    CORE_BUF_SIZE, compress_user_cores_gzlevel, &params);
 		if (params.gzs == NULL) {
 			error = EFAULT;
 			goto done;
 		}
 		tmpbuf = malloc(CORE_BUF_SIZE, M_TEMP, M_WAITOK | M_ZERO);
 	}
 #endif
 
 	/*
 	 * Allocate memory for building the header, fill it up,
 	 * and write it out following the notes.
 	 */
 	hdr = malloc(hdrsize, M_TEMP, M_WAITOK);
 	if (hdr == NULL) {
 		error = EINVAL;
 		goto done;
 	}
 	error = __elfN(corehdr)(&params, seginfo.count, hdr, hdrsize, &notelst,
 	    notesz);
 
 	/* Write the contents of all of the writable segments. */
 	if (error == 0) {
 		Elf_Phdr *php;
 		off_t offset;
 		int i;
 
 		php = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr)) + 1;
 		offset = round_page(hdrsize + notesz);
 		for (i = 0; i < seginfo.count; i++) {
 			error = core_output((caddr_t)(uintptr_t)php->p_vaddr,
 			    php->p_filesz, offset, &params, tmpbuf);
 			if (error != 0)
 				break;
 			offset += php->p_filesz;
 			php++;
 		}
 #ifdef GZIO
 		if (error == 0 && compress)
 			error = gzio_flush(params.gzs);
 #endif
 	}
 	if (error) {
 		log(LOG_WARNING,
 		    "Failed to write core file for process %s (error %d)\n",
 		    curproc->p_comm, error);
 	}
 
 done:
 #ifdef GZIO
 	if (compress) {
 		free(tmpbuf, M_TEMP);
 		gzio_fini(params.gzs);
 	}
 #endif
 	while ((ninfo = TAILQ_FIRST(&notelst)) != NULL) {
 		TAILQ_REMOVE(&notelst, ninfo, link);
 		free(ninfo, M_TEMP);
 	}
 	if (hdr != NULL)
 		free(hdr, M_TEMP);
 
 	return (error);
 }
 
 /*
  * A callback for each_writable_segment() to write out the segment's
  * program header entry.
  */
 static void
 cb_put_phdr(entry, closure)
 	vm_map_entry_t entry;
 	void *closure;
 {
 	struct phdr_closure *phc = (struct phdr_closure *)closure;
 	Elf_Phdr *phdr = phc->phdr;
 
 	phc->offset = round_page(phc->offset);
 
 	phdr->p_type = PT_LOAD;
 	phdr->p_offset = phc->offset;
 	phdr->p_vaddr = entry->start;
 	phdr->p_paddr = 0;
 	phdr->p_filesz = phdr->p_memsz = entry->end - entry->start;
 	phdr->p_align = PAGE_SIZE;
 	phdr->p_flags = __elfN(untrans_prot)(entry->protection);
 
 	phc->offset += phdr->p_filesz;
 	phc->phdr++;
 }
 
 /*
  * A callback for each_writable_segment() to gather information about
  * the number of segments and their total size.
  */
 static void
 cb_size_segment(entry, closure)
 	vm_map_entry_t entry;
 	void *closure;
 {
 	struct sseg_closure *ssc = (struct sseg_closure *)closure;
 
 	ssc->count++;
 	ssc->size += entry->end - entry->start;
 }
 
 /*
  * For each writable segment in the process's memory map, call the given
  * function with a pointer to the map entry and some arbitrary
  * caller-supplied data.
  */
 static void
 each_writable_segment(td, func, closure)
 	struct thread *td;
 	segment_callback func;
 	void *closure;
 {
 	struct proc *p = td->td_proc;
 	vm_map_t map = &p->p_vmspace->vm_map;
 	vm_map_entry_t entry;
 	vm_object_t backing_object, object;
 	boolean_t ignore_entry;
 
 	vm_map_lock_read(map);
 	for (entry = map->header.next; entry != &map->header;
 	    entry = entry->next) {
 		/*
 		 * Don't dump inaccessible mappings, deal with legacy
 		 * coredump mode.
 		 *
 		 * Note that read-only segments related to the elf binary
 		 * are marked MAP_ENTRY_NOCOREDUMP now so we no longer
 		 * need to arbitrarily ignore such segments.
 		 */
 		if (elf_legacy_coredump) {
 			if ((entry->protection & VM_PROT_RW) != VM_PROT_RW)
 				continue;
 		} else {
 			if ((entry->protection & VM_PROT_ALL) == 0)
 				continue;
 		}
 
 		/*
 		 * Don't include a memory segment in the coredump if
 		 * MAP_NOCORE is set in mmap(2) or MADV_NOCORE in
 		 * madvise(2).  Do not dump submaps (i.e. parts of the
 		 * kernel map).
 		 */
 		if (entry->eflags & (MAP_ENTRY_NOCOREDUMP|MAP_ENTRY_IS_SUB_MAP))
 			continue;
 
 		if ((object = entry->object.vm_object) == NULL)
 			continue;
 
 		/* Ignore memory-mapped devices and such things. */
 		VM_OBJECT_RLOCK(object);
 		while ((backing_object = object->backing_object) != NULL) {
 			VM_OBJECT_RLOCK(backing_object);
 			VM_OBJECT_RUNLOCK(object);
 			object = backing_object;
 		}
 		ignore_entry = object->type != OBJT_DEFAULT &&
 		    object->type != OBJT_SWAP && object->type != OBJT_VNODE &&
 		    object->type != OBJT_PHYS;
 		VM_OBJECT_RUNLOCK(object);
 		if (ignore_entry)
 			continue;
 
 		(*func)(entry, closure);
 	}
 	vm_map_unlock_read(map);
 }
 
 /*
  * Write the core file header to the file, including padding up to
  * the page boundary.
  */
 static int
 __elfN(corehdr)(struct coredump_params *p, int numsegs, void *hdr,
     size_t hdrsize, struct note_info_list *notelst, size_t notesz)
 {
 	struct note_info *ninfo;
 	struct sbuf *sb;
 	int error;
 
 	/* Fill in the header. */
 	bzero(hdr, hdrsize);
 	__elfN(puthdr)(p->td, hdr, hdrsize, numsegs, notesz);
 
 	sb = sbuf_new(NULL, NULL, CORE_BUF_SIZE, SBUF_FIXEDLEN);
 	sbuf_set_drain(sb, sbuf_drain_core_output, p);
 	sbuf_start_section(sb, NULL);
 	sbuf_bcat(sb, hdr, hdrsize);
 	TAILQ_FOREACH(ninfo, notelst, link)
 	    __elfN(putnote)(ninfo, sb);
 	/* Align up to a page boundary for the program segments. */
 	sbuf_end_section(sb, -1, PAGE_SIZE, 0);
 	error = sbuf_finish(sb);
 	sbuf_delete(sb);
 
 	return (error);
 }
 
 static void
 __elfN(prepare_notes)(struct thread *td, struct note_info_list *list,
     size_t *sizep)
 {
 	struct proc *p;
 	struct thread *thr;
 	size_t size;
 
 	p = td->td_proc;
 	size = 0;
 
 	size += register_note(list, NT_PRPSINFO, __elfN(note_prpsinfo), p);
 
 	/*
 	 * To have the debugger select the right thread (LWP) as the initial
 	 * thread, we dump the state of the thread passed to us in td first.
 	 * This is the thread that caused the core dump and is thus likely
 	 * to be the thread one wants to have selected in the debugger.
 	 */
 	thr = td;
 	while (thr != NULL) {
 		size += register_note(list, NT_PRSTATUS,
 		    __elfN(note_prstatus), thr);
 		size += register_note(list, NT_FPREGSET,
 		    __elfN(note_fpregset), thr);
 		size += register_note(list, NT_THRMISC,
 		    __elfN(note_thrmisc), thr);
 		size += register_note(list, -1,
 		    __elfN(note_threadmd), thr);
 
 		thr = (thr == td) ? TAILQ_FIRST(&p->p_threads) :
 		    TAILQ_NEXT(thr, td_plist);
 		if (thr == td)
 			thr = TAILQ_NEXT(thr, td_plist);
 	}
 
 	size += register_note(list, NT_PROCSTAT_PROC,
 	    __elfN(note_procstat_proc), p);
 	size += register_note(list, NT_PROCSTAT_FILES,
 	    note_procstat_files, p);
 	size += register_note(list, NT_PROCSTAT_VMMAP,
 	    note_procstat_vmmap, p);
 	size += register_note(list, NT_PROCSTAT_GROUPS,
 	    note_procstat_groups, p);
 	size += register_note(list, NT_PROCSTAT_UMASK,
 	    note_procstat_umask, p);
 	size += register_note(list, NT_PROCSTAT_RLIMIT,
 	    note_procstat_rlimit, p);
 	size += register_note(list, NT_PROCSTAT_OSREL,
 	    note_procstat_osrel, p);
 	size += register_note(list, NT_PROCSTAT_PSSTRINGS,
 	    __elfN(note_procstat_psstrings), p);
 	size += register_note(list, NT_PROCSTAT_AUXV,
 	    __elfN(note_procstat_auxv), p);
 
 	*sizep = size;
 }
 
 static void
 __elfN(puthdr)(struct thread *td, void *hdr, size_t hdrsize, int numsegs,
     size_t notesz)
 {
 	Elf_Ehdr *ehdr;
 	Elf_Phdr *phdr;
 	struct phdr_closure phc;
 
 	ehdr = (Elf_Ehdr *)hdr;
 	phdr = (Elf_Phdr *)((char *)hdr + sizeof(Elf_Ehdr));
 
 	ehdr->e_ident[EI_MAG0] = ELFMAG0;
 	ehdr->e_ident[EI_MAG1] = ELFMAG1;
 	ehdr->e_ident[EI_MAG2] = ELFMAG2;
 	ehdr->e_ident[EI_MAG3] = ELFMAG3;
 	ehdr->e_ident[EI_CLASS] = ELF_CLASS;
 	ehdr->e_ident[EI_DATA] = ELF_DATA;
 	ehdr->e_ident[EI_VERSION] = EV_CURRENT;
 	ehdr->e_ident[EI_OSABI] = ELFOSABI_FREEBSD;
 	ehdr->e_ident[EI_ABIVERSION] = 0;
 	ehdr->e_ident[EI_PAD] = 0;
 	ehdr->e_type = ET_CORE;
 #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32
 	ehdr->e_machine = ELF_ARCH32;
 #else
 	ehdr->e_machine = ELF_ARCH;
 #endif
 	ehdr->e_version = EV_CURRENT;
 	ehdr->e_entry = 0;
 	ehdr->e_phoff = sizeof(Elf_Ehdr);
 	ehdr->e_flags = 0;
 	ehdr->e_ehsize = sizeof(Elf_Ehdr);
 	ehdr->e_phentsize = sizeof(Elf_Phdr);
 	ehdr->e_phnum = numsegs + 1;
 	ehdr->e_shentsize = sizeof(Elf_Shdr);
 	ehdr->e_shnum = 0;
 	ehdr->e_shstrndx = SHN_UNDEF;
 
 	/*
 	 * Fill in the program header entries.
 	 */
 
 	/* The note segment. */
 	phdr->p_type = PT_NOTE;
 	phdr->p_offset = hdrsize;
 	phdr->p_vaddr = 0;
 	phdr->p_paddr = 0;
 	phdr->p_filesz = notesz;
 	phdr->p_memsz = 0;
 	phdr->p_flags = PF_R;
 	phdr->p_align = ELF_NOTE_ROUNDSIZE;
 	phdr++;
 
 	/* All the writable segments from the program. */
 	phc.phdr = phdr;
 	phc.offset = round_page(hdrsize + notesz);
 	each_writable_segment(td, cb_put_phdr, &phc);
 }
 
 static size_t
 register_note(struct note_info_list *list, int type, outfunc_t out, void *arg)
 {
 	struct note_info *ninfo;
 	size_t size, notesize;
 
 	size = 0;
 	out(arg, NULL, &size);
 	ninfo = malloc(sizeof(*ninfo), M_TEMP, M_ZERO | M_WAITOK);
 	ninfo->type = type;
 	ninfo->outfunc = out;
 	ninfo->outarg = arg;
 	ninfo->outsize = size;
 	TAILQ_INSERT_TAIL(list, ninfo, link);
 
 	if (type == -1)
 		return (size);
 
 	notesize = sizeof(Elf_Note) +		/* note header */
 	    roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) +
 						/* note name */
 	    roundup2(size, ELF_NOTE_ROUNDSIZE);	/* note description */
 
 	return (notesize);
 }
 
 static size_t
 append_note_data(const void *src, void *dst, size_t len)
 {
 	size_t padded_len;
 
 	padded_len = roundup2(len, ELF_NOTE_ROUNDSIZE);
 	if (dst != NULL) {
 		bcopy(src, dst, len);
 		bzero((char *)dst + len, padded_len - len);
 	}
 	return (padded_len);
 }
 
 size_t
 __elfN(populate_note)(int type, void *src, void *dst, size_t size, void **descp)
 {
 	Elf_Note *note;
 	char *buf;
 	size_t notesize;
 
 	buf = dst;
 	if (buf != NULL) {
 		note = (Elf_Note *)buf;
 		note->n_namesz = sizeof(FREEBSD_ABI_VENDOR);
 		note->n_descsz = size;
 		note->n_type = type;
 		buf += sizeof(*note);
 		buf += append_note_data(FREEBSD_ABI_VENDOR, buf,
 		    sizeof(FREEBSD_ABI_VENDOR));
 		append_note_data(src, buf, size);
 		if (descp != NULL)
 			*descp = buf;
 	}
 
 	notesize = sizeof(Elf_Note) +		/* note header */
 	    roundup2(sizeof(FREEBSD_ABI_VENDOR), ELF_NOTE_ROUNDSIZE) +
 						/* note name */
 	    roundup2(size, ELF_NOTE_ROUNDSIZE);	/* note description */
 
 	return (notesize);
 }
 
 static void
 __elfN(putnote)(struct note_info *ninfo, struct sbuf *sb)
 {
 	Elf_Note note;
 	ssize_t old_len;
 
 	if (ninfo->type == -1) {
 		ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize);
 		return;
 	}
 
 	note.n_namesz = sizeof(FREEBSD_ABI_VENDOR);
 	note.n_descsz = ninfo->outsize;
 	note.n_type = ninfo->type;
 
 	sbuf_bcat(sb, &note, sizeof(note));
 	sbuf_start_section(sb, &old_len);
 	sbuf_bcat(sb, FREEBSD_ABI_VENDOR, sizeof(FREEBSD_ABI_VENDOR));
 	sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0);
 	if (note.n_descsz == 0)
 		return;
 	sbuf_start_section(sb, &old_len);
 	ninfo->outfunc(ninfo->outarg, sb, &ninfo->outsize);
 	sbuf_end_section(sb, old_len, ELF_NOTE_ROUNDSIZE, 0);
 }
 
 /*
  * Miscellaneous note out functions.
  */
 
 #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32
 #include <compat/freebsd32/freebsd32.h>
 
 typedef struct prstatus32 elf_prstatus_t;
 typedef struct prpsinfo32 elf_prpsinfo_t;
 typedef struct fpreg32 elf_prfpregset_t;
 typedef struct fpreg32 elf_fpregset_t;
 typedef struct reg32 elf_gregset_t;
 typedef struct thrmisc32 elf_thrmisc_t;
 #define ELF_KERN_PROC_MASK	KERN_PROC_MASK32
 typedef struct kinfo_proc32 elf_kinfo_proc_t;
 typedef uint32_t elf_ps_strings_t;
 #else
 typedef prstatus_t elf_prstatus_t;
 typedef prpsinfo_t elf_prpsinfo_t;
 typedef prfpregset_t elf_prfpregset_t;
 typedef prfpregset_t elf_fpregset_t;
 typedef gregset_t elf_gregset_t;
 typedef thrmisc_t elf_thrmisc_t;
 #define ELF_KERN_PROC_MASK	0
 typedef struct kinfo_proc elf_kinfo_proc_t;
 typedef vm_offset_t elf_ps_strings_t;
 #endif
 
 static void
 __elfN(note_prpsinfo)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	elf_prpsinfo_t *psinfo;
 
 	p = (struct proc *)arg;
 	if (sb != NULL) {
 		KASSERT(*sizep == sizeof(*psinfo), ("invalid size"));
 		psinfo = malloc(sizeof(*psinfo), M_TEMP, M_ZERO | M_WAITOK);
 		psinfo->pr_version = PRPSINFO_VERSION;
 		psinfo->pr_psinfosz = sizeof(elf_prpsinfo_t);
 		strlcpy(psinfo->pr_fname, p->p_comm, sizeof(psinfo->pr_fname));
 		/*
 		 * XXX - We don't fill in the command line arguments properly
 		 * yet.
 		 */
 		strlcpy(psinfo->pr_psargs, p->p_comm,
 		    sizeof(psinfo->pr_psargs));
 
 		sbuf_bcat(sb, psinfo, sizeof(*psinfo));
 		free(psinfo, M_TEMP);
 	}
 	*sizep = sizeof(*psinfo);
 }
 
 static void
 __elfN(note_prstatus)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct thread *td;
 	elf_prstatus_t *status;
 
 	td = (struct thread *)arg;
 	if (sb != NULL) {
 		KASSERT(*sizep == sizeof(*status), ("invalid size"));
 		status = malloc(sizeof(*status), M_TEMP, M_ZERO | M_WAITOK);
 		status->pr_version = PRSTATUS_VERSION;
 		status->pr_statussz = sizeof(elf_prstatus_t);
 		status->pr_gregsetsz = sizeof(elf_gregset_t);
 		status->pr_fpregsetsz = sizeof(elf_fpregset_t);
 		status->pr_osreldate = osreldate;
 		status->pr_cursig = td->td_proc->p_sig;
 		status->pr_pid = td->td_tid;
 #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32
 		fill_regs32(td, &status->pr_reg);
 #else
 		fill_regs(td, &status->pr_reg);
 #endif
 		sbuf_bcat(sb, status, sizeof(*status));
 		free(status, M_TEMP);
 	}
 	*sizep = sizeof(*status);
 }
 
 static void
 __elfN(note_fpregset)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct thread *td;
 	elf_prfpregset_t *fpregset;
 
 	td = (struct thread *)arg;
 	if (sb != NULL) {
 		KASSERT(*sizep == sizeof(*fpregset), ("invalid size"));
 		fpregset = malloc(sizeof(*fpregset), M_TEMP, M_ZERO | M_WAITOK);
 #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32
 		fill_fpregs32(td, fpregset);
 #else
 		fill_fpregs(td, fpregset);
 #endif
 		sbuf_bcat(sb, fpregset, sizeof(*fpregset));
 		free(fpregset, M_TEMP);
 	}
 	*sizep = sizeof(*fpregset);
 }
 
 static void
 __elfN(note_thrmisc)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct thread *td;
 	elf_thrmisc_t thrmisc;
 
 	td = (struct thread *)arg;
 	if (sb != NULL) {
 		KASSERT(*sizep == sizeof(thrmisc), ("invalid size"));
 		bzero(&thrmisc._pad, sizeof(thrmisc._pad));
 		strcpy(thrmisc.pr_tname, td->td_name);
 		sbuf_bcat(sb, &thrmisc, sizeof(thrmisc));
 	}
 	*sizep = sizeof(thrmisc);
 }
 
 /*
  * Allow for MD specific notes, as well as any MD
  * specific preparations for writing MI notes.
  */
 static void
 __elfN(note_threadmd)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct thread *td;
 	void *buf;
 	size_t size;
 
 	td = (struct thread *)arg;
 	size = *sizep;
 	if (size != 0 && sb != NULL)
 		buf = malloc(size, M_TEMP, M_ZERO | M_WAITOK);
 	else
 		buf = NULL;
 	size = 0;
 	__elfN(dump_thread)(td, buf, &size);
 	KASSERT(sb == NULL || *sizep == size, ("invalid size"));
 	if (size != 0 && sb != NULL)
 		sbuf_bcat(sb, buf, size);
 	free(buf, M_TEMP);
 	*sizep = size;
 }
 
 #ifdef KINFO_PROC_SIZE
 CTASSERT(sizeof(struct kinfo_proc) == KINFO_PROC_SIZE);
 #endif
 
 static void
 __elfN(note_procstat_proc)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + p->p_numthreads *
 	    sizeof(elf_kinfo_proc_t);
 
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(elf_kinfo_proc_t);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		sx_slock(&proctree_lock);
 		PROC_LOCK(p);
 		kern_proc_out(p, sb, ELF_KERN_PROC_MASK);
 		sx_sunlock(&proctree_lock);
 	}
 	*sizep = size;
 }
 
 #ifdef KINFO_FILE_SIZE
 CTASSERT(sizeof(struct kinfo_file) == KINFO_FILE_SIZE);
 #endif
 
 static void
 note_procstat_files(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	if (sb == NULL) {
 		size = 0;
 		sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN);
 		sbuf_set_drain(sb, sbuf_drain_count, &size);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PROC_LOCK(p);
 		kern_proc_filedesc_out(p, sb, -1);
 		sbuf_finish(sb);
 		sbuf_delete(sb);
 		*sizep = size;
 	} else {
 		structsize = sizeof(struct kinfo_file);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PROC_LOCK(p);
 		kern_proc_filedesc_out(p, sb, -1);
 	}
 }
 
 #ifdef KINFO_VMENTRY_SIZE
 CTASSERT(sizeof(struct kinfo_vmentry) == KINFO_VMENTRY_SIZE);
 #endif
 
 static void
 note_procstat_vmmap(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	if (sb == NULL) {
 		size = 0;
 		sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN);
 		sbuf_set_drain(sb, sbuf_drain_count, &size);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PROC_LOCK(p);
 		kern_proc_vmmap_out(p, sb);
 		sbuf_finish(sb);
 		sbuf_delete(sb);
 		*sizep = size;
 	} else {
 		structsize = sizeof(struct kinfo_vmentry);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PROC_LOCK(p);
 		kern_proc_vmmap_out(p, sb);
 	}
 }
 
 static void
 note_procstat_groups(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + p->p_ucred->cr_ngroups * sizeof(gid_t);
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(gid_t);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		sbuf_bcat(sb, p->p_ucred->cr_groups, p->p_ucred->cr_ngroups *
 		    sizeof(gid_t));
 	}
 	*sizep = size;
 }
 
 static void
 note_procstat_umask(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + sizeof(p->p_fd->fd_cmask);
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(p->p_fd->fd_cmask);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		sbuf_bcat(sb, &p->p_fd->fd_cmask, sizeof(p->p_fd->fd_cmask));
 	}
 	*sizep = size;
 }
 
 static void
 note_procstat_rlimit(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	struct rlimit rlim[RLIM_NLIMITS];
 	size_t size;
 	int structsize, i;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + sizeof(rlim);
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(rlim);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PROC_LOCK(p);
 		for (i = 0; i < RLIM_NLIMITS; i++)
 			lim_rlimit(p, i, &rlim[i]);
 		PROC_UNLOCK(p);
 		sbuf_bcat(sb, rlim, sizeof(rlim));
 	}
 	*sizep = size;
 }
 
 static void
 note_procstat_osrel(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + sizeof(p->p_osrel);
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(p->p_osrel);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		sbuf_bcat(sb, &p->p_osrel, sizeof(p->p_osrel));
 	}
 	*sizep = size;
 }
 
 static void
 __elfN(note_procstat_psstrings)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	elf_ps_strings_t ps_strings;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	size = sizeof(structsize) + sizeof(ps_strings);
 	if (sb != NULL) {
 		KASSERT(*sizep == size, ("invalid size"));
 		structsize = sizeof(ps_strings);
 #if defined(COMPAT_FREEBSD32) && __ELF_WORD_SIZE == 32
 		ps_strings = PTROUT(p->p_sysent->sv_psstrings);
 #else
 		ps_strings = p->p_sysent->sv_psstrings;
 #endif
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		sbuf_bcat(sb, &ps_strings, sizeof(ps_strings));
 	}
 	*sizep = size;
 }
 
 static void
 __elfN(note_procstat_auxv)(void *arg, struct sbuf *sb, size_t *sizep)
 {
 	struct proc *p;
 	size_t size;
 	int structsize;
 
 	p = (struct proc *)arg;
 	if (sb == NULL) {
 		size = 0;
 		sb = sbuf_new(NULL, NULL, 128, SBUF_FIXEDLEN);
 		sbuf_set_drain(sb, sbuf_drain_count, &size);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PHOLD(p);
 		proc_getauxv(curthread, p, sb);
 		PRELE(p);
 		sbuf_finish(sb);
 		sbuf_delete(sb);
 		*sizep = size;
 	} else {
 		structsize = sizeof(Elf_Auxinfo);
 		sbuf_bcat(sb, &structsize, sizeof(structsize));
 		PHOLD(p);
 		proc_getauxv(curthread, p, sb);
 		PRELE(p);
 	}
 }
 
 static boolean_t
 __elfN(parse_notes)(struct image_params *imgp, Elf_Brandnote *checknote,
     int32_t *osrel, const Elf_Phdr *pnote)
 {
 	const Elf_Note *note, *note0, *note_end;
 	const char *note_name;
 	int i;
 
 	if (pnote == NULL || pnote->p_offset > PAGE_SIZE ||
 	    pnote->p_filesz > PAGE_SIZE - pnote->p_offset)
 		return (FALSE);
 
 	note = note0 = (const Elf_Note *)(imgp->image_header + pnote->p_offset);
 	note_end = (const Elf_Note *)(imgp->image_header +
 	    pnote->p_offset + pnote->p_filesz);
 	for (i = 0; i < 100 && note >= note0 && note < note_end; i++) {
 		if (!aligned(note, Elf32_Addr) || (const char *)note_end -
 		    (const char *)note < sizeof(Elf_Note))
 			return (FALSE);
 		if (note->n_namesz != checknote->hdr.n_namesz ||
 		    note->n_descsz != checknote->hdr.n_descsz ||
 		    note->n_type != checknote->hdr.n_type)
 			goto nextnote;
 		note_name = (const char *)(note + 1);
 		if (note_name + checknote->hdr.n_namesz >=
 		    (const char *)note_end || strncmp(checknote->vendor,
 		    note_name, checknote->hdr.n_namesz) != 0)
 			goto nextnote;
 
 		/*
 		 * Fetch the osreldate for the binary
 		 * from the ELF OSABI-note if necessary.
 		 */
 		if ((checknote->flags & BN_TRANSLATE_OSREL) != 0 &&
 		    checknote->trans_osrel != NULL)
 			return (checknote->trans_osrel(note, osrel));
 		return (TRUE);
 
 nextnote:
 		note = (const Elf_Note *)((const char *)(note + 1) +
 		    roundup2(note->n_namesz, ELF_NOTE_ROUNDSIZE) +
 		    roundup2(note->n_descsz, ELF_NOTE_ROUNDSIZE));
 	}
 
 	return (FALSE);
 }
 
 /*
  * Try to find the appropriate ABI-note section for checknote and
  * fetch the osreldate for the binary from the ELF OSABI-note.  Only the
  * first page of the image is searched, the same as for headers.
  */
 static boolean_t
 __elfN(check_note)(struct image_params *imgp, Elf_Brandnote *checknote,
     int32_t *osrel)
 {
 	const Elf_Phdr *phdr;
 	const Elf_Ehdr *hdr;
 	int i;
 
 	hdr = (const Elf_Ehdr *)imgp->image_header;
 	phdr = (const Elf_Phdr *)(imgp->image_header + hdr->e_phoff);
 
 	for (i = 0; i < hdr->e_phnum; i++) {
 		if (phdr[i].p_type == PT_NOTE &&
 		    __elfN(parse_notes)(imgp, checknote, osrel, &phdr[i]))
 			return (TRUE);
 	}
 	return (FALSE);
 
 }
 
 /*
  * Tell kern_execve.c about it, with a little help from the linker.
  */
 static struct execsw __elfN(execsw) = {
 	__CONCAT(exec_, __elfN(imgact)),
 	__XSTRING(__CONCAT(ELF, __ELF_WORD_SIZE))
 };
 EXEC_SET(__CONCAT(elf, __ELF_WORD_SIZE), __elfN(execsw));
 
 static vm_prot_t
 __elfN(trans_prot)(Elf_Word flags)
 {
 	vm_prot_t prot;
 
 	prot = 0;
 	if (flags & PF_X)
 		prot |= VM_PROT_EXECUTE;
 	if (flags & PF_W)
 		prot |= VM_PROT_WRITE;
 	if (flags & PF_R)
 		prot |= VM_PROT_READ;
 #if __ELF_WORD_SIZE == 32
 #if defined(__amd64__)
 	if (i386_read_exec && (flags & PF_R))
 		prot |= VM_PROT_EXECUTE;
 #endif
 #endif
 	return (prot);
 }
 
 static Elf_Word
 __elfN(untrans_prot)(vm_prot_t prot)
 {
 	Elf_Word flags;
 
 	flags = 0;
 	if (prot & VM_PROT_EXECUTE)
 		flags |= PF_X;
 	if (prot & VM_PROT_READ)
 		flags |= PF_R;
 	if (prot & VM_PROT_WRITE)
 		flags |= PF_W;
 	return (flags);
 }
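
Note on the size arithmetic used by register_note() and
__elfN(populate_note)() above: each note is accounted as a fixed header,
plus the vendor name rounded up to the note alignment, plus the
description rounded up the same way.  The following is a minimal
userland sketch of that calculation, not the kernel code itself.  The
names note_hdr, NOTE_VENDOR, NOTE_ROUNDSIZE, round_up_pow2() and
note_size() are hypothetical stand-ins, and the values 4 and "FreeBSD"
are assumptions about ELF_NOTE_ROUNDSIZE and FREEBSD_ABI_VENDOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins for ELF_NOTE_ROUNDSIZE and FREEBSD_ABI_VENDOR. */
#define NOTE_ROUNDSIZE	4
#define NOTE_VENDOR	"FreeBSD"

/* Hypothetical mirror of the three 32-bit words of an ELF note header. */
struct note_hdr {
	uint32_t n_namesz;	/* length of the name, including the NUL */
	uint32_t n_descsz;	/* length of the description */
	uint32_t n_type;	/* note type */
};

/* Round x up to a multiple of align; align must be a power of two. */
static size_t
round_up_pow2(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((x + align - 1) & ~(align - 1));
}

/* Bytes one note occupies in the file for a description of descsz bytes. */
static size_t
note_size(size_t descsz)
{
	return (sizeof(struct note_hdr) +			/* note header */
	    round_up_pow2(sizeof(NOTE_VENDOR), NOTE_ROUNDSIZE) +	/* note name */
	    round_up_pow2(descsz, NOTE_ROUNDSIZE));		/* note description */
}
```

Under these assumptions the header is 12 bytes and the name is 8 bytes
(seven characters plus the terminating NUL, already 4-byte aligned), so
a note with an empty description takes 20 bytes and a 5-byte description
is padded to 8, for 28 bytes in total.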
Index: user/ngie/more-tests/sys/kern/kern_exec.c
===================================================================
--- user/ngie/more-tests/sys/kern/kern_exec.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/kern_exec.c	(revision 281585)
@@ -1,1498 +1,1511 @@
 /*-
  * Copyright (c) 1993, David Greenman
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_capsicum.h"
 #include "opt_hwpmc_hooks.h"
 #include "opt_ktrace.h"
 #include "opt_vm.h"
 
 #include <sys/param.h>
 #include <sys/capsicum.h>
 #include <sys/systm.h>
 #include <sys/eventhandler.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <sys/sysproto.h>
 #include <sys/signalvar.h>
 #include <sys/kernel.h>
 #include <sys/mount.h>
 #include <sys/filedesc.h>
 #include <sys/fcntl.h>
 #include <sys/acct.h>
 #include <sys/exec.h>
 #include <sys/imgact.h>
 #include <sys/imgact_elf.h>
 #include <sys/wait.h>
 #include <sys/malloc.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/pioctl.h>
 #include <sys/namei.h>
 #include <sys/resourcevar.h>
 #include <sys/rwlock.h>
 #include <sys/sched.h>
 #include <sys/sdt.h>
 #include <sys/sf_buf.h>
 #include <sys/syscallsubr.h>
 #include <sys/sysent.h>
 #include <sys/shm.h>
 #include <sys/sysctl.h>
 #include <sys/vnode.h>
 #include <sys/stat.h>
 #ifdef KTRACE
 #include <sys/ktrace.h>
 #endif
 
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 #include <vm/vm_page.h>
 #include <vm/vm_map.h>
 #include <vm/vm_kern.h>
 #include <vm/vm_extern.h>
 #include <vm/vm_object.h>
 #include <vm/vm_pager.h>
 
 #ifdef	HWPMC_HOOKS
 #include <sys/pmckern.h>
 #endif
 
 #include <machine/reg.h>
 
 #include <security/audit/audit.h>
 #include <security/mac/mac_framework.h>
 
 #ifdef KDTRACE_HOOKS
 #include <sys/dtrace_bsd.h>
 dtrace_execexit_func_t	dtrace_fasttrap_exec;
 #endif
 
 SDT_PROVIDER_DECLARE(proc);
 SDT_PROBE_DEFINE1(proc, kernel, , exec, "char *");
 SDT_PROBE_DEFINE1(proc, kernel, , exec__failure, "int");
 SDT_PROBE_DEFINE1(proc, kernel, , exec__success, "char *");
 
 MALLOC_DEFINE(M_PARGS, "proc-args", "Process arguments");
 
 static int sysctl_kern_ps_strings(SYSCTL_HANDLER_ARGS);
 static int sysctl_kern_usrstack(SYSCTL_HANDLER_ARGS);
 static int sysctl_kern_stackprot(SYSCTL_HANDLER_ARGS);
 static int do_execve(struct thread *td, struct image_args *args,
     struct mac *mac_p);
 
 /* XXX This should be vm_size_t. */
 SYSCTL_PROC(_kern, KERN_PS_STRINGS, ps_strings, CTLTYPE_ULONG|CTLFLAG_RD,
     NULL, 0, sysctl_kern_ps_strings, "LU", "");
 
 /* XXX This should be vm_size_t. */
 SYSCTL_PROC(_kern, KERN_USRSTACK, usrstack, CTLTYPE_ULONG|CTLFLAG_RD|
     CTLFLAG_CAPRD, NULL, 0, sysctl_kern_usrstack, "LU", "");
 
 SYSCTL_PROC(_kern, OID_AUTO, stackprot, CTLTYPE_INT|CTLFLAG_RD,
     NULL, 0, sysctl_kern_stackprot, "I", "");
 
 u_long ps_arg_cache_limit = PAGE_SIZE / 16;
 SYSCTL_ULONG(_kern, OID_AUTO, ps_arg_cache_limit, CTLFLAG_RW, 
     &ps_arg_cache_limit, 0, "");
 
 static int disallow_high_osrel;
 SYSCTL_INT(_kern, OID_AUTO, disallow_high_osrel, CTLFLAG_RW,
     &disallow_high_osrel, 0,
     "Disallow execution of binaries built for higher version of the world");
 
 static int map_at_zero = 0;
 SYSCTL_INT(_security_bsd, OID_AUTO, map_at_zero, CTLFLAG_RWTUN, &map_at_zero, 0,
     "Permit processes to map an object at virtual address 0.");
 
 static int
 sysctl_kern_ps_strings(SYSCTL_HANDLER_ARGS)
 {
 	struct proc *p;
 	int error;
 
 	p = curproc;
 #ifdef SCTL_MASK32
 	if (req->flags & SCTL_MASK32) {
 		unsigned int val;
 		val = (unsigned int)p->p_sysent->sv_psstrings;
 		error = SYSCTL_OUT(req, &val, sizeof(val));
 	} else
 #endif
 		error = SYSCTL_OUT(req, &p->p_sysent->sv_psstrings,
 		   sizeof(p->p_sysent->sv_psstrings));
 	return error;
 }
 
 static int
 sysctl_kern_usrstack(SYSCTL_HANDLER_ARGS)
 {
 	struct proc *p;
 	int error;
 
 	p = curproc;
 #ifdef SCTL_MASK32
 	if (req->flags & SCTL_MASK32) {
 		unsigned int val;
 		val = (unsigned int)p->p_sysent->sv_usrstack;
 		error = SYSCTL_OUT(req, &val, sizeof(val));
 	} else
 #endif
 		error = SYSCTL_OUT(req, &p->p_sysent->sv_usrstack,
 		    sizeof(p->p_sysent->sv_usrstack));
 	return error;
 }
 
 static int
 sysctl_kern_stackprot(SYSCTL_HANDLER_ARGS)
 {
 	struct proc *p;
 
 	p = curproc;
 	return (SYSCTL_OUT(req, &p->p_sysent->sv_stackprot,
 	    sizeof(p->p_sysent->sv_stackprot)));
 }
 
 /*
  * Each of the items is a pointer to a `const struct execsw', hence the
  * double pointer here.
  */
 static const struct execsw **execsw;
 
 #ifndef _SYS_SYSPROTO_H_
 struct execve_args {
 	char    *fname; 
 	char    **argv;
 	char    **envv; 
 };
 #endif
 
 int
 sys_execve(td, uap)
 	struct thread *td;
 	struct execve_args /* {
 		char *fname;
 		char **argv;
 		char **envv;
 	} */ *uap;
 {
 	int error;
 	struct image_args args;
 
 	error = exec_copyin_args(&args, uap->fname, UIO_USERSPACE,
 	    uap->argv, uap->envv);
 	if (error == 0)
 		error = kern_execve(td, &args, NULL);
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct fexecve_args {
 	int	fd;
 	char	**argv;
 	char	**envv;
 };
 #endif
 int
 sys_fexecve(struct thread *td, struct fexecve_args *uap)
 {
 	int error;
 	struct image_args args;
 
 	error = exec_copyin_args(&args, NULL, UIO_SYSSPACE,
 	    uap->argv, uap->envv);
 	if (error == 0) {
 		args.fd = uap->fd;
 		error = kern_execve(td, &args, NULL);
 	}
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct __mac_execve_args {
 	char	*fname;
 	char	**argv;
 	char	**envv;
 	struct mac	*mac_p;
 };
 #endif
 
 int
 sys___mac_execve(td, uap)
 	struct thread *td;
 	struct __mac_execve_args /* {
 		char *fname;
 		char **argv;
 		char **envv;
 		struct mac *mac_p;
 	} */ *uap;
 {
 #ifdef MAC
 	int error;
 	struct image_args args;
 
 	error = exec_copyin_args(&args, uap->fname, UIO_USERSPACE,
 	    uap->argv, uap->envv);
 	if (error == 0)
 		error = kern_execve(td, &args, uap->mac_p);
 	return (error);
 #else
 	return (ENOSYS);
 #endif
 }
 
 /*
  * XXX: kern_execve has the astonishing property of not always returning to
  * the caller.  If sufficiently bad things happen during the call to
  * do_execve(), it can end up calling exit1(); as a result, callers must
  * avoid doing anything which they might need to undo (e.g., allocating
  * memory).
  */
 int
 kern_execve(td, args, mac_p)
 	struct thread *td;
 	struct image_args *args;
 	struct mac *mac_p;
 {
 	struct proc *p = td->td_proc;
 	struct vmspace *oldvmspace;
 	int error;
 
 	AUDIT_ARG_ARGV(args->begin_argv, args->argc,
 	    args->begin_envv - args->begin_argv);
 	AUDIT_ARG_ENVV(args->begin_envv, args->envc,
 	    args->endp - args->begin_envv);
 	if (p->p_flag & P_HADTHREADS) {
 		PROC_LOCK(p);
 		if (thread_single(p, SINGLE_BOUNDARY)) {
 			PROC_UNLOCK(p);
 			exec_free_args(args);
 			return (ERESTART);	/* Try again later. */
 		}
 		PROC_UNLOCK(p);
 	}
 
 	KASSERT((td->td_pflags & TDP_EXECVMSPC) == 0, ("nested execve"));
 	oldvmspace = td->td_proc->p_vmspace;
 	error = do_execve(td, args, mac_p);
 
 	if (p->p_flag & P_HADTHREADS) {
 		PROC_LOCK(p);
 		/*
 		 * On success, we upgrade to SINGLE_EXIT state to
 		 * force other threads to suicide.
 		 */
 		if (error == 0)
 			thread_single(p, SINGLE_EXIT);
 		else
 			thread_single_end(p, SINGLE_BOUNDARY);
 		PROC_UNLOCK(p);
 	}
 	if ((td->td_pflags & TDP_EXECVMSPC) != 0) {
 		KASSERT(td->td_proc->p_vmspace != oldvmspace,
 		    ("oldvmspace still used"));
 		vmspace_free(oldvmspace);
 		td->td_pflags &= ~TDP_EXECVMSPC;
 	}
 
 	return (error);
 }
 
 /*
  * In-kernel implementation of execve().  All arguments are assumed to be
  * userspace pointers from the passed thread.
  */
 static int
 do_execve(td, args, mac_p)
 	struct thread *td;
 	struct image_args *args;
 	struct mac *mac_p;
 {
 	struct proc *p = td->td_proc;
 	struct nameidata nd;
 	struct ucred *newcred = NULL, *oldcred;
 	struct uidinfo *euip = NULL;
 	register_t *stack_base;
 	int error, i;
 	struct image_params image_params, *imgp;
 	struct vattr attr;
 	int (*img_first)(struct image_params *);
 	struct pargs *oldargs = NULL, *newargs = NULL;
 	struct sigacts *oldsigacts, *newsigacts;
 #ifdef KTRACE
 	struct vnode *tracevp = NULL;
 	struct ucred *tracecred = NULL;
 #endif
 	struct vnode *textvp = NULL, *binvp;
 	cap_rights_t rights;
 	int credential_changing;
 	int textset;
 #ifdef MAC
 	struct label *interpvplabel = NULL;
 	int will_transition;
 #endif
 #ifdef HWPMC_HOOKS
 	struct pmckern_procexec pe;
 #endif
 	static const char fexecv_proc_title[] = "(fexecv)";
 
 	imgp = &image_params;
 
 	/*
 	 * Lock the process and set the P_INEXEC flag to indicate that
 	 * it should be left alone until we're done here.  This is
 	 * necessary to avoid race conditions - e.g. in ptrace() -
 	 * that might allow a local user to illicitly obtain elevated
 	 * privileges.
 	 */
 	PROC_LOCK(p);
 	KASSERT((p->p_flag & P_INEXEC) == 0,
 	    ("%s(): process already has P_INEXEC flag", __func__));
 	p->p_flag |= P_INEXEC;
 	PROC_UNLOCK(p);
 
 	/*
 	 * Initialize part of the common data
 	 */
 	bzero(imgp, sizeof(*imgp));
 	imgp->proc = p;
 	imgp->attr = &attr;
 	imgp->args = args;
 
 #ifdef MAC
 	error = mac_execve_enter(imgp, mac_p);
 	if (error)
 		goto exec_fail;
 #endif
 
 	/*
 	 * Translate the file name. namei() returns a vnode pointer
 	 *	in ni_vp among other things.
 	 *
 	 * XXXAUDIT: It would be desirable to also audit the name of the
 	 * interpreter if this is an interpreted binary.
 	 */
 	if (args->fname != NULL) {
 		NDINIT(&nd, LOOKUP, ISOPEN | LOCKLEAF | FOLLOW | SAVENAME
 		    | AUDITVNODE1, UIO_SYSSPACE, args->fname, td);
 	}
 
 	SDT_PROBE(proc, kernel, , exec, args->fname, 0, 0, 0, 0 );
 
 interpret:
 	if (args->fname != NULL) {
 #ifdef CAPABILITY_MODE
 		/*
 		 * While capability mode can't reach this point via direct
 		 * path arguments to execve(), we also don't allow
 		 * interpreters to be used in capability mode (for now).
 		 * Catch indirect lookups and return a permissions error.
 		 */
 		if (IN_CAPABILITY_MODE(td)) {
 			error = ECAPMODE;
 			goto exec_fail;
 		}
 #endif
 		error = namei(&nd);
 		if (error)
 			goto exec_fail;
 
 		binvp = nd.ni_vp;
 		imgp->vp = binvp;
 	} else {
 		AUDIT_ARG_FD(args->fd);
 		/*
 		 * Descriptors opened only with O_EXEC or O_RDONLY are allowed.
 		 */
 		error = fgetvp_exec(td, args->fd,
 		    cap_rights_init(&rights, CAP_FEXECVE), &binvp);
 		if (error)
 			goto exec_fail;
 		vn_lock(binvp, LK_EXCLUSIVE | LK_RETRY);
 		AUDIT_ARG_VNODE1(binvp);
 		imgp->vp = binvp;
 	}
 
 	/*
 	 * Check file permissions (also 'opens' file)
 	 */
 	error = exec_check_permissions(imgp);
 	if (error)
 		goto exec_fail_dealloc;
 
 	imgp->object = imgp->vp->v_object;
 	if (imgp->object != NULL)
 		vm_object_reference(imgp->object);
 
 	/*
 	 * Set VV_TEXT now so no one can write to the executable while we're
 	 * activating it.
 	 *
 	 * Remember if this was set before and unset it in case this is not
 	 * actually an executable image.
 	 */
 	textset = VOP_IS_TEXT(imgp->vp);
 	VOP_SET_TEXT(imgp->vp);
 
 	error = exec_map_first_page(imgp);
 	if (error)
 		goto exec_fail_dealloc;
 
 	imgp->proc->p_osrel = 0;
 	/*
 	 *	If the current process has a special image activator it
 	 *	wants to try first, call it.   For example, emulating shell
 	 *	scripts differently.
 	 */
 	error = -1;
 	if ((img_first = imgp->proc->p_sysent->sv_imgact_try) != NULL)
 		error = img_first(imgp);
 
 	/*
 	 *	Loop through the list of image activators, calling each one.
 	 *	An activator returns -1 if there is no match, 0 on success,
 	 *	and an error otherwise.
 	 */
 	for (i = 0; error == -1 && execsw[i]; ++i) {
 		if (execsw[i]->ex_imgact == NULL ||
 		    execsw[i]->ex_imgact == img_first) {
 			continue;
 		}
 		error = (*execsw[i]->ex_imgact)(imgp);
 	}
 
 	if (error) {
 		if (error == -1) {
 			if (textset == 0)
 				VOP_UNSET_TEXT(imgp->vp);
 			error = ENOEXEC;
 		}
 		goto exec_fail_dealloc;
 	}
 
 	/*
 	 * Special interpreter operation, cleanup and loop up to try to
 	 * activate the interpreter.
 	 */
 	if (imgp->interpreted) {
 		exec_unmap_first_page(imgp);
 		/*
 		 * VV_TEXT needs to be unset for scripts.  There is a short
 		 * period before we determine that something is a script where
 		 * VV_TEXT will be set. The vnode lock is held over this
 		 * entire period so nothing should illegitimately be blocked.
 		 */
 		VOP_UNSET_TEXT(imgp->vp);
 		/* free name buffer and old vnode */
 		if (args->fname != NULL)
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 #ifdef MAC
 		mac_execve_interpreter_enter(binvp, &interpvplabel);
 #endif
 		if (imgp->opened) {
 			VOP_CLOSE(binvp, FREAD, td->td_ucred, td);
 			imgp->opened = 0;
 		}
 		vput(binvp);
 		vm_object_deallocate(imgp->object);
 		imgp->object = NULL;
 		/* set new name to that of the interpreter */
 		NDINIT(&nd, LOOKUP, LOCKLEAF | FOLLOW | SAVENAME,
 		    UIO_SYSSPACE, imgp->interpreter_name, td);
 		args->fname = imgp->interpreter_name;
 		goto interpret;
 	}
 
 	/*
 	 * NB: We unlock the vnode here because it is believed that none
 	 * of the sv_copyout_strings/sv_fixup operations require the vnode.
 	 */
 	VOP_UNLOCK(imgp->vp, 0);
 
 	/*
 	 * Do the best to calculate the full path to the image file.
 	 */
 	if (imgp->auxargs != NULL &&
 	    ((args->fname != NULL && args->fname[0] == '/') ||
 	     vn_fullpath(td, imgp->vp, &imgp->execpath, &imgp->freepath) != 0))
 		imgp->execpath = args->fname;
 
 	if (disallow_high_osrel &&
 	    P_OSREL_MAJOR(p->p_osrel) > P_OSREL_MAJOR(__FreeBSD_version)) {
 		error = ENOEXEC;
 		uprintf("Osrel %d for image %s too high\n", p->p_osrel,
 		    imgp->execpath != NULL ? imgp->execpath : "<unresolved>");
 		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 		goto exec_fail_dealloc;
 	}
 
 	/*
 	 * Copy out strings (args and env) and initialize stack base
 	 */
 	if (p->p_sysent->sv_copyout_strings)
 		stack_base = (*p->p_sysent->sv_copyout_strings)(imgp);
 	else
 		stack_base = exec_copyout_strings(imgp);
 
 	/*
 	 * If custom stack fixup routine present for this process
 	 * let it do the stack setup.
 	 * Else stuff argument count as first item on stack
 	 */
 	if (p->p_sysent->sv_fixup != NULL)
 		(*p->p_sysent->sv_fixup)(&stack_base, imgp);
 	else
 		suword(--stack_base, imgp->args->argc);
 
 	/*
 	 * For security and other reasons, the file descriptor table cannot
 	 * be shared after an exec.
 	 */
 	fdunshare(td);
 	/* close files on exec */
 	fdcloseexec(td);
 
 	/*
 	 * Malloc things before we need locks.
 	 */
 	i = imgp->args->begin_envv - imgp->args->begin_argv;
 	/* Cache arguments if they fit inside our allowance */
 	if (ps_arg_cache_limit >= i + sizeof(struct pargs)) {
 		newargs = pargs_alloc(i);
 		bcopy(imgp->args->begin_argv, newargs->ar_args, i);
 	}
 
 	vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 
 	/* Get a reference to the vnode prior to locking the proc */
 	VREF(binvp);
 
 	/*
 	 * For security and other reasons, signal handlers cannot
 	 * be shared after an exec. The new process gets a copy of the old
 	 * handlers. In execsigs(), the new process will have its signals
 	 * reset.
 	 */
 	if (sigacts_shared(p->p_sigacts)) {
 		oldsigacts = p->p_sigacts;
 		newsigacts = sigacts_alloc();
 		sigacts_copy(newsigacts, oldsigacts);
 	} else {
 		oldsigacts = NULL;
 		newsigacts = NULL; /* satisfy gcc */
 	}
 
 	PROC_LOCK(p);
 	if (oldsigacts)
 		p->p_sigacts = newsigacts;
 	oldcred = p->p_ucred;
 	/* Stop profiling */
 	stopprofclock(p);
 
 	/* reset caught signals */
 	execsigs(p);
 
 	/* name this process - nameiexec(p, ndp) */
 	bzero(p->p_comm, sizeof(p->p_comm));
 	if (args->fname)
 		bcopy(nd.ni_cnd.cn_nameptr, p->p_comm,
 		    min(nd.ni_cnd.cn_namelen, MAXCOMLEN));
 	else if (vn_commname(binvp, p->p_comm, sizeof(p->p_comm)) != 0)
 		bcopy(fexecv_proc_title, p->p_comm, sizeof(fexecv_proc_title));
 	bcopy(p->p_comm, td->td_name, sizeof(td->td_name));
 #ifdef KTR
 	sched_clear_tdname(td);
 #endif
 
 	/*
 	 * mark as execed, wakeup the process that vforked (if any) and tell
 	 * it that it now has its own resources back
 	 */
 	p->p_flag |= P_EXEC;
 	if ((p->p_flag2 & P2_NOTRACE_EXEC) == 0)
 		p->p_flag2 &= ~P2_NOTRACE;
 	if (p->p_flag & P_PPWAIT) {
 		p->p_flag &= ~(P_PPWAIT | P_PPTRACE);
 		cv_broadcast(&p->p_pwait);
 	}
 
 	/*
 	 * Implement image setuid/setgid.
 	 *
 	 * Don't honor setuid/setgid if the filesystem prohibits it or if
 	 * the process is being traced.
 	 *
 	 * We disable setuid/setgid/etc in capability mode on the basis
 	 * that most setugid applications are not written with that
 	 * environment in mind, and will therefore almost certainly operate
 	 * incorrectly. In principle there's no reason that setugid
 	 * applications might not be useful in capability mode, so we may want
 	 * to reconsider this conservative design choice in the future.
 	 *
 	 * XXXMAC: For the time being, use NOSUID to also prohibit
 	 * transitions on the file system.
 	 */
 	credential_changing = 0;
 	credential_changing |= (attr.va_mode & S_ISUID) && oldcred->cr_uid !=
 	    attr.va_uid;
 	credential_changing |= (attr.va_mode & S_ISGID) && oldcred->cr_gid !=
 	    attr.va_gid;
 #ifdef MAC
 	will_transition = mac_vnode_execve_will_transition(oldcred, imgp->vp,
 	    interpvplabel, imgp);
 	credential_changing |= will_transition;
 #endif
 
 	if (credential_changing &&
 #ifdef CAPABILITY_MODE
 	    ((oldcred->cr_flags & CRED_FLAG_CAPMODE) == 0) &&
 #endif
 	    (imgp->vp->v_mount->mnt_flag & MNT_NOSUID) == 0 &&
 	    (p->p_flag & P_TRACED) == 0) {
 		/*
 		 * Turn off syscall tracing for set-id programs, except for
 		 * root.  Record any set-id flags first to make sure that
 		 * we do not regain any tracing during a possible block.
 		 */
 		setsugid(p);
 
 #ifdef KTRACE
 		if (p->p_tracecred != NULL &&
 		    priv_check_cred(p->p_tracecred, PRIV_DEBUG_DIFFCRED, 0))
 			ktrprocexec(p, &tracecred, &tracevp);
 #endif
 		/*
 		 * Close any file descriptors 0..2 that reference procfs,
 		 * then make sure file descriptors 0..2 are in use.
 		 *
 		 * Both fdsetugidsafety() and fdcheckstd() may call functions
 		 * taking sleepable locks, so temporarily drop our locks.
 		 */
 		PROC_UNLOCK(p);
 		VOP_UNLOCK(imgp->vp, 0);
 		fdsetugidsafety(td);
 		error = fdcheckstd(td);
 		if (error != 0)
 			goto done1;
 		newcred = crdup(oldcred);
 		euip = uifind(attr.va_uid);
 		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 		PROC_LOCK(p);
 		/*
 		 * Set the new credentials.
 		 */
 		if (attr.va_mode & S_ISUID)
 			change_euid(newcred, euip);
 		if (attr.va_mode & S_ISGID)
 			change_egid(newcred, attr.va_gid);
 #ifdef MAC
 		if (will_transition) {
 			mac_vnode_execve_transition(oldcred, newcred, imgp->vp,
 			    interpvplabel, imgp);
 		}
 #endif
 		/*
 		 * Implement correct POSIX saved-id behavior.
 		 *
 		 * XXXMAC: Note that the current logic will save the
 		 * uid and gid if a MAC domain transition occurs, even
 		 * though maybe it shouldn't.
 		 */
 		change_svuid(newcred, newcred->cr_uid);
 		change_svgid(newcred, newcred->cr_gid);
 		proc_set_cred(p, newcred);
 	} else {
 		if (oldcred->cr_uid == oldcred->cr_ruid &&
 		    oldcred->cr_gid == oldcred->cr_rgid)
 			p->p_flag &= ~P_SUGID;
 		/*
 		 * Implement correct POSIX saved-id behavior.
 		 *
 		 * XXX: It's not clear that the existing behavior is
 		 * POSIX-compliant.  A number of sources indicate that the
 		 * saved uid/gid should only be updated if the new ruid is
 		 * not equal to the old ruid, or the new euid is not equal
 		 * to the old euid and the new euid is not equal to the old
 		 * ruid.  The FreeBSD code always updates the saved uid/gid.
 		 * Also, this code uses the new (replaced) euid and egid as
 		 * the source, which may or may not be the right ones to use.
 		 */
 		if (oldcred->cr_svuid != oldcred->cr_uid ||
 		    oldcred->cr_svgid != oldcred->cr_gid) {
 			PROC_UNLOCK(p);
 			VOP_UNLOCK(imgp->vp, 0);
 			newcred = crdup(oldcred);
 			vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 			PROC_LOCK(p);
 			change_svuid(newcred, newcred->cr_uid);
 			change_svgid(newcred, newcred->cr_gid);
 			proc_set_cred(p, newcred);
 		}
 	}
 
 	/*
 	 * Store the vp for use in procfs.  This vnode was referenced prior
 	 * to locking the proc lock.
 	 */
 	textvp = p->p_textvp;
 	p->p_textvp = binvp;
 
 #ifdef KDTRACE_HOOKS
 	/*
 	 * Tell the DTrace fasttrap provider about the exec if it
 	 * has declared an interest.
 	 */
 	if (dtrace_fasttrap_exec)
 		dtrace_fasttrap_exec(p);
 #endif
 
 	/*
 	 * Notify others that we exec'd, and clear the P_INEXEC flag
 	 * as we're now a bona fide freshly-execed process.
 	 */
 	KNOTE_LOCKED(&p->p_klist, NOTE_EXEC);
 	p->p_flag &= ~P_INEXEC;
 
 	/* clear "fork but no exec" flag, as we _are_ execing */
 	p->p_acflag &= ~AFORK;
 
 	/*
 	 * Free any previous argument cache and replace it with
 	 * the new argument cache, if any.
 	 */
 	oldargs = p->p_args;
 	p->p_args = newargs;
 	newargs = NULL;
 
 #ifdef	HWPMC_HOOKS
 	/*
 	 * Check if system-wide sampling is in effect or if the
 	 * current process is using PMCs.  If so, do exec() time
 	 * processing.  This processing needs to happen AFTER the
 	 * P_INEXEC flag is cleared.
 	 *
 	 * The proc lock needs to be released before taking the PMC
 	 * SX.
 	 */
 	if (PMC_SYSTEM_SAMPLING_ACTIVE() || PMC_PROC_IS_USING_PMCS(p)) {
 		PROC_UNLOCK(p);
 		VOP_UNLOCK(imgp->vp, 0);
 		pe.pm_credentialschanged = credential_changing;
 		pe.pm_entryaddr = imgp->entry_addr;
 
 		PMC_CALL_HOOK_X(td, PMC_FN_PROCESS_EXEC, (void *) &pe);
 		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 	} else
 		PROC_UNLOCK(p);
 #else  /* !HWPMC_HOOKS */
 	PROC_UNLOCK(p);
 #endif
 
 	/* Set values passed into the program in registers. */
 	if (p->p_sysent->sv_setregs)
 		(*p->p_sysent->sv_setregs)(td, imgp, 
 		    (u_long)(uintptr_t)stack_base);
 	else
 		exec_setregs(td, imgp, (u_long)(uintptr_t)stack_base);
 
 	vfs_mark_atime(imgp->vp, td->td_ucred);
 
 	SDT_PROBE(proc, kernel, , exec__success, args->fname, 0, 0, 0, 0);
 
 	VOP_UNLOCK(imgp->vp, 0);
 done1:
 	/*
 	 * Free any resources malloc'd earlier that we didn't use.
 	 */
 	if (euip != NULL)
 		uifree(euip);
 	if (newcred != NULL)
 		crfree(oldcred);
 
 	/*
 	 * Handle deferred decrement of ref counts.
 	 */
 	if (textvp != NULL)
 		vrele(textvp);
 	if (error != 0)
 		vrele(binvp);
 #ifdef KTRACE
 	if (tracevp != NULL)
 		vrele(tracevp);
 	if (tracecred != NULL)
 		crfree(tracecred);
 #endif
 	vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
 	pargs_drop(oldargs);
 	pargs_drop(newargs);
 	if (oldsigacts != NULL)
 		sigacts_free(oldsigacts);
 
 exec_fail_dealloc:
 
 	/*
 	 * free various allocated resources
 	 */
 	if (imgp->firstpage != NULL)
 		exec_unmap_first_page(imgp);
 
 	if (imgp->vp != NULL) {
 		if (args->fname)
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 		if (imgp->opened)
 			VOP_CLOSE(imgp->vp, FREAD, td->td_ucred, td);
 		vput(imgp->vp);
 	}
 
 	if (imgp->object != NULL)
 		vm_object_deallocate(imgp->object);
 
 	free(imgp->freepath, M_TEMP);
 
 	if (error == 0) {
 		PROC_LOCK(p);
 		td->td_dbgflags |= TDB_EXEC;
 		PROC_UNLOCK(p);
 
 		/*
 		 * Stop the process here if its stop event mask has
 		 * the S_EXEC bit set.
 		 */
 		STOPEVENT(p, S_EXEC, 0);
 		goto done2;
 	}
 
 exec_fail:
 	/* we're done here, clear P_INEXEC */
 	PROC_LOCK(p);
 	p->p_flag &= ~P_INEXEC;
 	PROC_UNLOCK(p);
 
 	SDT_PROBE(proc, kernel, , exec__failure, error, 0, 0, 0, 0);
 
 done2:
 #ifdef MAC
 	mac_execve_exit(imgp);
 	mac_execve_interpreter_exit(interpvplabel);
 #endif
 	exec_free_args(args);
 
 	if (error && imgp->vmspace_destroyed) {
 		/* sorry, no more process anymore. exit gracefully */
 		exit1(td, W_EXITCODE(0, SIGABRT));
 		/* NOT REACHED */
 	}
 
 #ifdef KTRACE
 	if (error == 0)
 		ktrprocctor(p);
 #endif
 
 	return (error);
 }
 
 int
 exec_map_first_page(imgp)
 	struct image_params *imgp;
 {
 	int rv, i;
 	int initial_pagein;
 	vm_page_t ma[VM_INITIAL_PAGEIN];
 	vm_object_t object;
 
 	if (imgp->firstpage != NULL)
 		exec_unmap_first_page(imgp);
 
 	object = imgp->vp->v_object;
 	if (object == NULL)
 		return (EACCES);
 	VM_OBJECT_WLOCK(object);
 #if VM_NRESERVLEVEL > 0
 	vm_object_color(object, 0);
 #endif
 	ma[0] = vm_page_grab(object, 0, VM_ALLOC_NORMAL);
 	if (ma[0]->valid != VM_PAGE_BITS_ALL) {
 		initial_pagein = VM_INITIAL_PAGEIN;
 		if (initial_pagein > object->size)
 			initial_pagein = object->size;
 		for (i = 1; i < initial_pagein; i++) {
 			if ((ma[i] = vm_page_next(ma[i - 1])) != NULL) {
 				if (ma[i]->valid)
 					break;
 				if (vm_page_tryxbusy(ma[i]))
 					break;
 			} else {
 				ma[i] = vm_page_alloc(object, i,
 				    VM_ALLOC_NORMAL | VM_ALLOC_IFNOTCACHED);
 				if (ma[i] == NULL)
 					break;
 			}
 		}
 		initial_pagein = i;
 		rv = vm_pager_get_pages(object, ma, initial_pagein, 0);
 		ma[0] = vm_page_lookup(object, 0);
 		if ((rv != VM_PAGER_OK) || (ma[0] == NULL)) {
 			if (ma[0] != NULL) {
 				vm_page_lock(ma[0]);
 				vm_page_free(ma[0]);
 				vm_page_unlock(ma[0]);
 			}
 			VM_OBJECT_WUNLOCK(object);
 			return (EIO);
 		}
 	}
 	vm_page_xunbusy(ma[0]);
 	vm_page_lock(ma[0]);
 	vm_page_hold(ma[0]);
 	vm_page_activate(ma[0]);
 	vm_page_unlock(ma[0]);
 	VM_OBJECT_WUNLOCK(object);
 
 	imgp->firstpage = sf_buf_alloc(ma[0], 0);
 	imgp->image_header = (char *)sf_buf_kva(imgp->firstpage);
 
 	return (0);
 }
 
 void
 exec_unmap_first_page(imgp)
 	struct image_params *imgp;
 {
 	vm_page_t m;
 
 	if (imgp->firstpage != NULL) {
 		m = sf_buf_page(imgp->firstpage);
 		sf_buf_free(imgp->firstpage);
 		imgp->firstpage = NULL;
 		vm_page_lock(m);
 		vm_page_unhold(m);
 		vm_page_unlock(m);
 	}
 }
 
 /*
  * Destroy old address space, and allocate a new stack
  *	The new stack is only SGROWSIZ large because it is grown
  *	automatically in trap.c.
  */
 int
 exec_new_vmspace(imgp, sv)
 	struct image_params *imgp;
 	struct sysentvec *sv;
 {
 	int error;
 	struct proc *p = imgp->proc;
 	struct vmspace *vmspace = p->p_vmspace;
 	vm_object_t obj;
+	struct rlimit rlim_stack;
 	vm_offset_t sv_minuser, stack_addr;
 	vm_map_t map;
 	u_long ssiz;
 
 	imgp->vmspace_destroyed = 1;
 	imgp->sysent = sv;
 
 	/* May be called with Giant held */
 	EVENTHANDLER_INVOKE(process_exec, p, imgp);
 
 	/*
 	 * Blow away entire process VM, if address space not shared,
 	 * otherwise, create a new VM space so that other threads are
 	 * not disrupted
 	 */
 	map = &vmspace->vm_map;
 	if (map_at_zero)
 		sv_minuser = sv->sv_minuser;
 	else
 		sv_minuser = MAX(sv->sv_minuser, PAGE_SIZE);
 	if (vmspace->vm_refcnt == 1 && vm_map_min(map) == sv_minuser &&
 	    vm_map_max(map) == sv->sv_maxuser) {
 		shmexit(vmspace);
 		pmap_remove_pages(vmspace_pmap(vmspace));
 		vm_map_remove(map, vm_map_min(map), vm_map_max(map));
 	} else {
 		error = vmspace_exec(p, sv_minuser, sv->sv_maxuser);
 		if (error)
 			return (error);
 		vmspace = p->p_vmspace;
 		map = &vmspace->vm_map;
 	}
 
 	/* Map a shared page */
 	obj = sv->sv_shared_page_obj;
 	if (obj != NULL) {
 		vm_object_reference(obj);
 		error = vm_map_fixed(map, obj, 0,
 		    sv->sv_shared_page_base, sv->sv_shared_page_len,
 		    VM_PROT_READ | VM_PROT_EXECUTE,
 		    VM_PROT_READ | VM_PROT_EXECUTE,
 		    MAP_INHERIT_SHARE | MAP_ACC_NO_CHARGE);
 		if (error) {
 			vm_object_deallocate(obj);
 			return (error);
 		}
 	}
 
 	/* Allocate a new stack */
-	if (sv->sv_maxssiz != NULL)
+	if (imgp->stack_sz != 0) {
+		ssiz = imgp->stack_sz;
+		PROC_LOCK(p);
+		lim_rlimit(p, RLIMIT_STACK, &rlim_stack);
+		PROC_UNLOCK(p);
+		if (ssiz > rlim_stack.rlim_max)
+			ssiz = rlim_stack.rlim_max;
+		if (ssiz > rlim_stack.rlim_cur) {
+			rlim_stack.rlim_cur = ssiz;
+			kern_setrlimit(curthread, RLIMIT_STACK, &rlim_stack);
+		}
+	} else if (sv->sv_maxssiz != NULL) {
 		ssiz = *sv->sv_maxssiz;
-	else
+	} else {
 		ssiz = maxssiz;
+	}
 	stack_addr = sv->sv_usrstack - ssiz;
 	error = vm_map_stack(map, stack_addr, (vm_size_t)ssiz,
 	    obj != NULL && imgp->stack_prot != 0 ? imgp->stack_prot :
 		sv->sv_stackprot,
 	    VM_PROT_ALL, MAP_STACK_GROWS_DOWN);
 	if (error)
 		return (error);
 
 	/*
 	 * vm_ssize and vm_maxsaddr are somewhat antiquated concepts, but they
 	 * are still used to enforce the stack rlimit on the process stack.
 	 */
 	vmspace->vm_ssize = sgrowsiz >> PAGE_SHIFT;
 	vmspace->vm_maxsaddr = (char *)sv->sv_usrstack - ssiz;
 
 	return (0);
 }
 
 /*
  * Copy out argument and environment strings from the old process address
  * space into the temporary string buffer.
  */
 int
 exec_copyin_args(struct image_args *args, char *fname,
     enum uio_seg segflg, char **argv, char **envv)
 {
 	u_long argp, envp;
 	int error;
 	size_t length;
 
 	bzero(args, sizeof(*args));
 	if (argv == NULL)
 		return (EFAULT);
 
 	/*
 	 * Allocate demand-paged memory for the file name, argument, and
 	 * environment strings.
 	 */
 	error = exec_alloc_args(args);
 	if (error != 0)
 		return (error);
 
 	/*
 	 * Copy the file name.
 	 */
 	if (fname != NULL) {
 		args->fname = args->buf;
 		error = (segflg == UIO_SYSSPACE) ?
 		    copystr(fname, args->fname, PATH_MAX, &length) :
 		    copyinstr(fname, args->fname, PATH_MAX, &length);
 		if (error != 0)
 			goto err_exit;
 	} else
 		length = 0;
 
 	args->begin_argv = args->buf + length;
 	args->endp = args->begin_argv;
 	args->stringspace = ARG_MAX;
 
 	/*
 	 * extract arguments first
 	 */
 	for (;;) {
 		error = fueword(argv++, &argp);
 		if (error == -1) {
 			error = EFAULT;
 			goto err_exit;
 		}
 		if (argp == 0)
 			break;
 		error = copyinstr((void *)(uintptr_t)argp, args->endp,
 		    args->stringspace, &length);
 		if (error != 0) {
 			if (error == ENAMETOOLONG) 
 				error = E2BIG;
 			goto err_exit;
 		}
 		args->stringspace -= length;
 		args->endp += length;
 		args->argc++;
 	}
 
 	args->begin_envv = args->endp;
 
 	/*
 	 * extract environment strings
 	 */
 	if (envv) {
 		for (;;) {
 			error = fueword(envv++, &envp);
 			if (error == -1) {
 				error = EFAULT;
 				goto err_exit;
 			}
 			if (envp == 0)
 				break;
 			error = copyinstr((void *)(uintptr_t)envp,
 			    args->endp, args->stringspace, &length);
 			if (error != 0) {
 				if (error == ENAMETOOLONG)
 					error = E2BIG;
 				goto err_exit;
 			}
 			args->stringspace -= length;
 			args->endp += length;
 			args->envc++;
 		}
 	}
 
 	return (0);
 
 err_exit:
 	exec_free_args(args);
 	return (error);
 }
 
 /*
  * Allocate temporary demand-paged, zero-filled memory for the file name,
  * argument, and environment strings.  Returns zero if the allocation succeeds
  * and ENOMEM otherwise.
  */
 int
 exec_alloc_args(struct image_args *args)
 {
 
 	args->buf = (char *)kmap_alloc_wait(exec_map, PATH_MAX + ARG_MAX);
 	return (args->buf != NULL ? 0 : ENOMEM);
 }
 
 void
 exec_free_args(struct image_args *args)
 {
 
 	if (args->buf != NULL) {
 		kmap_free_wakeup(exec_map, (vm_offset_t)args->buf,
 		    PATH_MAX + ARG_MAX);
 		args->buf = NULL;
 	}
 	if (args->fname_buf != NULL) {
 		free(args->fname_buf, M_TEMP);
 		args->fname_buf = NULL;
 	}
 }
 
 /*
  * Copy strings out to the new process address space, constructing new arg
  * and env vector tables. Return a pointer to the base so that it can be used
  * as the initial stack pointer.
  */
 register_t *
 exec_copyout_strings(imgp)
 	struct image_params *imgp;
 {
 	int argc, envc;
 	char **vectp;
 	char *stringp;
 	uintptr_t destp;
 	register_t *stack_base;
 	struct ps_strings *arginfo;
 	struct proc *p;
 	size_t execpath_len;
 	int szsigcode, szps;
 	char canary[sizeof(long) * 8];
 
 	szps = sizeof(pagesizes[0]) * MAXPAGESIZES;
 	/*
 	 * Calculate string base and vector table pointers.
 	 * Also deal with signal trampoline code for this exec type.
 	 */
 	if (imgp->execpath != NULL && imgp->auxargs != NULL)
 		execpath_len = strlen(imgp->execpath) + 1;
 	else
 		execpath_len = 0;
 	p = imgp->proc;
 	szsigcode = 0;
 	arginfo = (struct ps_strings *)p->p_sysent->sv_psstrings;
 	if (p->p_sysent->sv_sigcode_base == 0) {
 		if (p->p_sysent->sv_szsigcode != NULL)
 			szsigcode = *(p->p_sysent->sv_szsigcode);
 	}
 	destp =	(uintptr_t)arginfo;
 
 	/*
 	 * install sigcode
 	 */
 	if (szsigcode != 0) {
 		destp -= szsigcode;
 		destp = rounddown2(destp, sizeof(void *));
 		copyout(p->p_sysent->sv_sigcode, (void *)destp, szsigcode);
 	}
 
 	/*
 	 * Copy the image path for the rtld.
 	 */
 	if (execpath_len != 0) {
 		destp -= execpath_len;
 		imgp->execpathp = destp;
 		copyout(imgp->execpath, (void *)destp, execpath_len);
 	}
 
 	/*
 	 * Prepare the canary for SSP.
 	 */
 	arc4rand(canary, sizeof(canary), 0);
 	destp -= sizeof(canary);
 	imgp->canary = destp;
 	copyout(canary, (void *)destp, sizeof(canary));
 	imgp->canarylen = sizeof(canary);
 
 	/*
 	 * Prepare the pagesizes array.
 	 */
 	destp -= szps;
 	destp = rounddown2(destp, sizeof(void *));
 	imgp->pagesizes = destp;
 	copyout(pagesizes, (void *)destp, szps);
 	imgp->pagesizeslen = szps;
 
 	destp -= ARG_MAX - imgp->args->stringspace;
 	destp = rounddown2(destp, sizeof(void *));
 
 	/*
 	 * If we have a valid auxargs ptr, prepare some room
 	 * on the stack.
 	 */
 	if (imgp->auxargs) {
 		/*
 		 * 'AT_COUNT*2' is size for the ELF Auxargs data. This is for
 		 * lower compatibility.
 		 */
 		imgp->auxarg_size = (imgp->auxarg_size) ? imgp->auxarg_size :
 		    (AT_COUNT * 2);
 		/*
 		 * The '+ 2' is for the null pointers at the end of each of
 		 * the arg and env vector sets, and imgp->auxarg_size is room
 		 * for the arguments of the runtime loader.
 		 */
 		vectp = (char **)(destp - (imgp->args->argc +
 		    imgp->args->envc + 2 + imgp->auxarg_size)
 		    * sizeof(char *));
 	} else {
 		/*
 		 * The '+ 2' is for the null pointers at the end of each of
 		 * the arg and env vector sets
 		 */
 		vectp = (char **)(destp - (imgp->args->argc + imgp->args->envc
 		    + 2) * sizeof(char *));
 	}
 
 	/*
 	 * vectp also becomes our initial stack base
 	 */
 	stack_base = (register_t *)vectp;
 
 	stringp = imgp->args->begin_argv;
 	argc = imgp->args->argc;
 	envc = imgp->args->envc;
 
 	/*
 	 * Copy out strings - arguments and environment.
 	 */
 	copyout(stringp, (void *)destp, ARG_MAX - imgp->args->stringspace);
 
 	/*
 	 * Fill in "ps_strings" struct for ps, w, etc.
 	 */
 	suword(&arginfo->ps_argvstr, (long)(intptr_t)vectp);
 	suword32(&arginfo->ps_nargvstr, argc);
 
 	/*
 	 * Fill in argument portion of vector table.
 	 */
 	for (; argc > 0; --argc) {
 		suword(vectp++, (long)(intptr_t)destp);
 		while (*stringp++ != 0)
 			destp++;
 		destp++;
 	}
 
 	/* a null vector table pointer separates the argp's from the envp's */
 	suword(vectp++, 0);
 
 	suword(&arginfo->ps_envstr, (long)(intptr_t)vectp);
 	suword32(&arginfo->ps_nenvstr, envc);
 
 	/*
 	 * Fill in environment portion of vector table.
 	 */
 	for (; envc > 0; --envc) {
 		suword(vectp++, (long)(intptr_t)destp);
 		while (*stringp++ != 0)
 			destp++;
 		destp++;
 	}
 
 	/* end of vector table is a null pointer */
 	suword(vectp, 0);
 
 	return (stack_base);
 }
 
 /*
  * Check permissions of file to execute.
  *	Called with imgp->vp locked.
  *	Return 0 for success or error code on failure.
  */
 int
 exec_check_permissions(imgp)
 	struct image_params *imgp;
 {
 	struct vnode *vp = imgp->vp;
 	struct vattr *attr = imgp->attr;
 	struct thread *td;
 	int error, writecount;
 
 	td = curthread;
 
 	/* Get file attributes */
 	error = VOP_GETATTR(vp, attr, td->td_ucred);
 	if (error)
 		return (error);
 
 #ifdef MAC
 	error = mac_vnode_check_exec(td->td_ucred, imgp->vp, imgp);
 	if (error)
 		return (error);
 #endif
 
 	/*
 	 * 1) Check if file execution is disabled for the filesystem that
 	 *    this file resides on.
 	 * 2) Ensure that at least one execute bit is on. Otherwise, a
 	 *    privileged user will always succeed, and we don't want this
 	 *    to happen unless the file really is executable.
 	 * 3) Ensure that the file is a regular file.
 	 */
 	if ((vp->v_mount->mnt_flag & MNT_NOEXEC) ||
 	    (attr->va_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) == 0 ||
 	    (attr->va_type != VREG))
 		return (EACCES);
 
 	/*
 	 * Zero length files can't be exec'd
 	 */
 	if (attr->va_size == 0)
 		return (ENOEXEC);
 
 	/*
 	 *  Check for execute permission to file based on current credentials.
 	 */
 	error = VOP_ACCESS(vp, VEXEC, td->td_ucred, td);
 	if (error)
 		return (error);
 
 	/*
 	 * Check number of open-for-writes on the file and deny execution
 	 * if there are any.
 	 */
 	error = VOP_GET_WRITECOUNT(vp, &writecount);
 	if (error != 0)
 		return (error);
 	if (writecount != 0)
 		return (ETXTBSY);
 
 	/*
 	 * Call filesystem specific open routine (which does nothing in the
 	 * general case).
 	 */
 	error = VOP_OPEN(vp, FREAD, td->td_ucred, td, NULL);
 	if (error == 0)
 		imgp->opened = 1;
 	return (error);
 }
 
 /*
  * Exec handler registration
  */
 int
 exec_register(execsw_arg)
 	const struct execsw *execsw_arg;
 {
 	const struct execsw **es, **xs, **newexecsw;
 	int count = 2;	/* New slot and trailing NULL */
 
 	if (execsw)
 		for (es = execsw; *es; es++)
 			count++;
 	newexecsw = malloc(count * sizeof(*es), M_TEMP, M_WAITOK);
 	if (newexecsw == NULL)
 		return (ENOMEM);
 	xs = newexecsw;
 	if (execsw)
 		for (es = execsw; *es; es++)
 			*xs++ = *es;
 	*xs++ = execsw_arg;
 	*xs = NULL;
 	if (execsw)
 		free(execsw, M_TEMP);
 	execsw = newexecsw;
 	return (0);
 }
 
 int
 exec_unregister(execsw_arg)
 	const struct execsw *execsw_arg;
 {
 	const struct execsw **es, **xs, **newexecsw;
 	int count = 1;
 
 	if (execsw == NULL)
 		panic("unregister with no handlers left?\n");
 
 	for (es = execsw; *es; es++) {
 		if (*es == execsw_arg)
 			break;
 	}
 	if (*es == NULL)
 		return (ENOENT);
 	for (es = execsw; *es; es++)
 		if (*es != execsw_arg)
 			count++;
 	newexecsw = malloc(count * sizeof(*es), M_TEMP, M_WAITOK);
 	if (newexecsw == NULL)
 		return (ENOMEM);
 	xs = newexecsw;
 	for (es = execsw; *es; es++)
 		if (*es != execsw_arg)
 			*xs++ = *es;
 	*xs = NULL;
 	if (execsw)
 		free(execsw, M_TEMP);
 	execsw = newexecsw;
 	return (0);
 }
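Reviewer note on the exec_new_vmspace() change in kern_exec.c above: the new
branch takes a per-image stack size request, clamps it to the hard
RLIMIT_STACK value, and raises the soft limit when the request exceeds it.
The sketch below restates that decision as a standalone userland function so
the three cases are easy to check in isolation. It is only an illustration:
struct stack_limit and choose_stack_size() are hypothetical names that do not
exist in the tree, the single "fallback" argument stands in for the
*sv->sv_maxssiz / maxssiz choice, and no locking or kern_setrlimit() call is
modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct rlimit (not the real type). */
struct stack_limit {
	uint64_t cur;	/* soft limit, plays the role of rlim_cur */
	uint64_t max;	/* hard limit, plays the role of rlim_max */
};

/*
 * Choose the initial stack size, mirroring the branch order in the diff:
 *  - requested == 0: no per-image request, use the fallback size;
 *  - otherwise clamp the request to the hard limit, and raise the soft
 *    limit if the (clamped) request is larger than it.
 * The soft limit is never lowered.
 */
static uint64_t
choose_stack_size(uint64_t requested, struct stack_limit *lim,
    uint64_t fallback)
{
	uint64_t ssiz;

	/* A soft limit above the hard limit would be an invalid input. */
	assert(lim->cur <= lim->max);

	if (requested == 0)
		return (fallback);

	ssiz = requested;
	if (ssiz > lim->max)
		ssiz = lim->max;
	if (ssiz > lim->cur)
		lim->cur = ssiz;
	return (ssiz);
}
```

With a limit of { cur = 8, max = 64 }, a request of 16 is granted and lifts
the soft limit to 16, a request of 128 is cut to 64, and a request of 0
leaves the limit untouched and returns the fallback.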
Index: user/ngie/more-tests/sys/kern/kern_poll.c
===================================================================
--- user/ngie/more-tests/sys/kern/kern_poll.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/kern_poll.c	(revision 281585)
@@ -1,568 +1,574 @@
 /*-
  * Copyright (c) 2001-2002 Luigi Rizzo
  *
  * Supported by: the Xorp Project (www.xorp.org)
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_device_polling.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/kernel.h>
 #include <sys/kthread.h>
 #include <sys/proc.h>
 #include <sys/eventhandler.h>
 #include <sys/resourcevar.h>
 #include <sys/socket.h>			/* needed by net/if.h		*/
 #include <sys/sockio.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
 
 #include <net/if.h>
 #include <net/if_var.h>
 #include <net/netisr.h>			/* for NETISR_POLL		*/
 #include <net/vnet.h>
 
 void hardclock_device_poll(void);	/* hook from hardclock		*/
 
 static struct mtx	poll_mtx;
 
 /*
  * Polling support for [network] device drivers.
  *
  * Drivers which support this feature can register with the
  * polling code.
  *
  * If registration is successful, the driver must disable interrupts,
  * and further I/O is performed through the handler, which is invoked
  * (at least once per clock tick) with 3 arguments: the "arg" passed at
  * register time (a struct ifnet pointer), a command, and a "count" limit.
  *
  * The command can be one of the following:
  *  POLL_ONLY: quick move of "count" packets from input/output queues.
  *  POLL_AND_CHECK_STATUS: as above, plus check status registers or do
  *	other more expensive operations. This command is issued periodically
  *	but less frequently than POLL_ONLY.
  *
  * The count limit specifies how much work the handler can do during the
  * call -- typically this is the number of packets to be received, or
  * transmitted, etc. (drivers are free to interpret this number, as long
  * as the max time spent in the function grows roughly linearly with the
  * count).
  *
  * Polling is enabled and disabled via setting IFCAP_POLLING flag on
  * the interface. The driver ioctl handler should register interface
  * with polling and disable interrupts, if registration was successful.
  *
  * A second variable controls the sharing of CPU between polling/kernel
  * network processing, and other activities (typically userlevel tasks):
  * kern.polling.user_frac (between 0 and 100, default 50) sets the share
  * of CPU allocated to user tasks. CPU is allocated proportionally to the
  * shares, by dynamically adjusting the "count" (poll_burst).
  *
  * Other parameters should be left to their default values.
  * The following constraints hold
  *
  *	1 <= poll_each_burst <= poll_burst <= poll_burst_max
  *	MIN_POLL_BURST_MAX <= poll_burst_max <= MAX_POLL_BURST_MAX
  */
 
 #define MIN_POLL_BURST_MAX	10
 #define MAX_POLL_BURST_MAX	20000
 
 static uint32_t poll_burst = 5;
 static uint32_t poll_burst_max = 150;	/* good for 100Mbit net and HZ=1000 */
 static uint32_t poll_each_burst = 5;
 
 static SYSCTL_NODE(_kern, OID_AUTO, polling, CTLFLAG_RW, 0,
 	"Device polling parameters");
 
 SYSCTL_UINT(_kern_polling, OID_AUTO, burst, CTLFLAG_RD,
 	&poll_burst, 0, "Current polling burst size");
 
 static int	netisr_poll_scheduled;
 static int	netisr_pollmore_scheduled;
 static int	poll_shutting_down;
 
 static int poll_burst_max_sysctl(SYSCTL_HANDLER_ARGS)
 {
 	uint32_t val = poll_burst_max;
 	int error;
 
 	error = sysctl_handle_int(oidp, &val, 0, req);
 	if (error || !req->newptr )
 		return (error);
 	if (val < MIN_POLL_BURST_MAX || val > MAX_POLL_BURST_MAX)
 		return (EINVAL);
 
 	mtx_lock(&poll_mtx);
 	poll_burst_max = val;
 	if (poll_burst > poll_burst_max)
 		poll_burst = poll_burst_max;
 	if (poll_each_burst > poll_burst_max)
 		poll_each_burst = MIN_POLL_BURST_MAX;
 	mtx_unlock(&poll_mtx);
 
 	return (0);
 }
 SYSCTL_PROC(_kern_polling, OID_AUTO, burst_max, CTLTYPE_UINT | CTLFLAG_RW,
 	0, sizeof(uint32_t), poll_burst_max_sysctl, "I", "Max Polling burst size");
 
 static int poll_each_burst_sysctl(SYSCTL_HANDLER_ARGS)
 {
 	uint32_t val = poll_each_burst;
 	int error;
 
 	error = sysctl_handle_int(oidp, &val, 0, req);
 	if (error || !req->newptr )
 		return (error);
 	if (val < 1)
 		return (EINVAL);
 
 	mtx_lock(&poll_mtx);
 	if (val > poll_burst_max) {
 		mtx_unlock(&poll_mtx);
 		return (EINVAL);
 	}
 	poll_each_burst = val;
 	mtx_unlock(&poll_mtx);
 
 	return (0);
 }
 SYSCTL_PROC(_kern_polling, OID_AUTO, each_burst, CTLTYPE_UINT | CTLFLAG_RW,
 	0, sizeof(uint32_t), poll_each_burst_sysctl, "I",
 	"Max size of each burst");
 
 static uint32_t poll_in_idle_loop=0;	/* do we poll in idle loop ? */
 SYSCTL_UINT(_kern_polling, OID_AUTO, idle_poll, CTLFLAG_RW,
 	&poll_in_idle_loop, 0, "Enable device polling in idle loop");
 
 static uint32_t user_frac = 50;
 static int user_frac_sysctl(SYSCTL_HANDLER_ARGS)
 {
 	uint32_t val = user_frac;
 	int error;
 
 	error = sysctl_handle_int(oidp, &val, 0, req);
 	if (error || !req->newptr )
 		return (error);
 	if (val > 99)
 		return (EINVAL);
 
 	mtx_lock(&poll_mtx);
 	user_frac = val;
 	mtx_unlock(&poll_mtx);
 
 	return (0);
 }
 SYSCTL_PROC(_kern_polling, OID_AUTO, user_frac, CTLTYPE_UINT | CTLFLAG_RW,
 	0, sizeof(uint32_t), user_frac_sysctl, "I",
 	"Desired user fraction of cpu time");
 
 static uint32_t reg_frac_count = 0;
 static uint32_t reg_frac = 20 ;
 static int reg_frac_sysctl(SYSCTL_HANDLER_ARGS)
 {
 	uint32_t val = reg_frac;
 	int error;
 
 	error = sysctl_handle_int(oidp, &val, 0, req);
 	if (error || !req->newptr )
 		return (error);
 	if (val < 1 || val > hz)
 		return (EINVAL);
 
 	mtx_lock(&poll_mtx);
 	reg_frac = val;
 	if (reg_frac_count >= reg_frac)
 		reg_frac_count = 0;
 	mtx_unlock(&poll_mtx);
 
 	return (0);
 }
 SYSCTL_PROC(_kern_polling, OID_AUTO, reg_frac, CTLTYPE_UINT | CTLFLAG_RW,
 	0, sizeof(uint32_t), reg_frac_sysctl, "I",
 	"Every this many cycles check registers");
 
 static uint32_t short_ticks;
 SYSCTL_UINT(_kern_polling, OID_AUTO, short_ticks, CTLFLAG_RD,
 	&short_ticks, 0, "Hardclock ticks shorter than they should be");
 
 static uint32_t lost_polls;
 SYSCTL_UINT(_kern_polling, OID_AUTO, lost_polls, CTLFLAG_RD,
 	&lost_polls, 0, "How many times we would have lost a poll tick");
 
 static uint32_t pending_polls;
 SYSCTL_UINT(_kern_polling, OID_AUTO, pending_polls, CTLFLAG_RD,
 	&pending_polls, 0, "Do we need to poll again");
 
 static int residual_burst = 0;
 SYSCTL_INT(_kern_polling, OID_AUTO, residual_burst, CTLFLAG_RD,
 	&residual_burst, 0, "# of residual cycles in burst");
 
 static uint32_t poll_handlers; /* next free entry in pr[]. */
 SYSCTL_UINT(_kern_polling, OID_AUTO, handlers, CTLFLAG_RD,
 	&poll_handlers, 0, "Number of registered poll handlers");
 
 static uint32_t phase;
 SYSCTL_UINT(_kern_polling, OID_AUTO, phase, CTLFLAG_RD,
 	&phase, 0, "Polling phase");
 
 static uint32_t suspect;
 SYSCTL_UINT(_kern_polling, OID_AUTO, suspect, CTLFLAG_RD,
 	&suspect, 0, "suspect event");
 
 static uint32_t stalled;
 SYSCTL_UINT(_kern_polling, OID_AUTO, stalled, CTLFLAG_RD,
 	&stalled, 0, "potential stalls");
 
 static uint32_t idlepoll_sleeping; /* idlepoll is sleeping */
 SYSCTL_UINT(_kern_polling, OID_AUTO, idlepoll_sleeping, CTLFLAG_RD,
 	&idlepoll_sleeping, 0, "idlepoll is sleeping");
 
 
 #define POLL_LIST_LEN  128
 struct pollrec {
 	poll_handler_t	*handler;
 	struct ifnet	*ifp;
 };
 
 static struct pollrec pr[POLL_LIST_LEN];
 
 static void
 poll_shutdown(void *arg, int howto)
 {
 
 	poll_shutting_down = 1;
 }
 
 static void
 init_device_poll(void)
 {
 
 	mtx_init(&poll_mtx, "polling", NULL, MTX_DEF);
 	EVENTHANDLER_REGISTER(shutdown_post_sync, poll_shutdown, NULL,
 	    SHUTDOWN_PRI_LAST);
 }
 SYSINIT(device_poll, SI_SUB_SOFTINTR, SI_ORDER_MIDDLE, init_device_poll, NULL);
 
 
 /*
  * Hook from hardclock. Tries to schedule a netisr, but keeps track
  * of lost ticks due to the previous handler taking too long.
  * Normally, this should not happen, because polling handler should
  * run for a short time. However, in some cases (e.g. when there are
  * changes in link status etc.) the drivers take a very long time
  * (even in the order of milliseconds) to reset and reconfigure the
  * device, causing apparent lost polls.
  *
  * The first part of the code is just for debugging purposes, and tries
  * to count how often hardclock ticks are shorter than they should be,
  * meaning either stray interrupts or delayed events.
  */
 void
 hardclock_device_poll(void)
 {
 	static struct timeval prev_t, t;
 	int delta;
 
 	if (poll_handlers == 0 || poll_shutting_down)
 		return;
 
 	microuptime(&t);
 	delta = (t.tv_usec - prev_t.tv_usec) +
 		(t.tv_sec - prev_t.tv_sec)*1000000;
 	if (delta * hz < 500000)
 		short_ticks++;
 	else
 		prev_t = t;
 
 	if (pending_polls > 100) {
 		/*
 		 * Too much, assume it has stalled (not always true;
 		 * see comment above).
 		 */
 		stalled++;
 		pending_polls = 0;
 		phase = 0;
 	}
 
 	if (phase <= 2) {
 		if (phase != 0)
 			suspect++;
 		phase = 1;
 		netisr_poll_scheduled = 1;
 		netisr_pollmore_scheduled = 1;
 		netisr_sched_poll();
 		phase = 2;
 	}
 	if (pending_polls++ > 0)
 		lost_polls++;
 }
 
 /*
  * ether_poll is called from the idle loop.
  */
 static void
 ether_poll(int count)
 {
 	int i;
 
 	mtx_lock(&poll_mtx);
 
 	if (count > poll_each_burst)
 		count = poll_each_burst;
 
 	for (i = 0 ; i < poll_handlers ; i++)
 		pr[i].handler(pr[i].ifp, POLL_ONLY, count);
 
 	mtx_unlock(&poll_mtx);
 }
 
 /*
  * netisr_pollmore is called after other netisr's, possibly scheduling
  * another NETISR_POLL call, or adapting the burst size for the next cycle.
  *
  * It is very bad to fetch large bursts of packets from a single card at once,
  * because the burst could take a long time to be completely processed, or
  * could saturate the intermediate queue (ipintrq or similar) leading to
  * losses or unfairness. To reduce the problem, and also to account better for
  * time spent in network-related processing, we split the burst in smaller
  * chunks of fixed size, giving control to the other netisr's between chunks.
  * This helps in improving the fairness, reducing livelock (because we
  * emulate more closely the "process to completion" that we have with
  * fastforwarding) and accounting for the work performed in low level
  * handling and forwarding.
  */
 
 static struct timeval poll_start_t;
 
 void
 netisr_pollmore()
 {
 	struct timeval t;
 	int kern_load;
 
+	if (poll_handlers == 0)
+		return;
+
 	mtx_lock(&poll_mtx);
 	if (!netisr_pollmore_scheduled) {
 		mtx_unlock(&poll_mtx);
 		return;
 	}
 	netisr_pollmore_scheduled = 0;
 	phase = 5;
 	if (residual_burst > 0) {
 		netisr_poll_scheduled = 1;
 		netisr_pollmore_scheduled = 1;
 		netisr_sched_poll();
 		mtx_unlock(&poll_mtx);
 		/* will run immediately on return, followed by netisrs */
 		return;
 	}
 	/* here we can account time spent in netisr's in this tick */
 	microuptime(&t);
 	kern_load = (t.tv_usec - poll_start_t.tv_usec) +
 		(t.tv_sec - poll_start_t.tv_sec)*1000000;	/* us */
 	kern_load = (kern_load * hz) / 10000;			/* 0..100 */
 	if (kern_load > (100 - user_frac)) { /* try decrease ticks */
 		if (poll_burst > 1)
 			poll_burst--;
 	} else {
 		if (poll_burst < poll_burst_max)
 			poll_burst++;
 	}
 
 	pending_polls--;
 	if (pending_polls == 0) /* we are done */
 		phase = 0;
 	else {
 		/*
 		 * Last cycle was long and caused us to miss one or more
 		 * hardclock ticks. Restart processing again, but slightly
 		 * reduce the burst size to keep this from happening again.
 		 */
 		poll_burst -= (poll_burst / 8);
 		if (poll_burst < 1)
 			poll_burst = 1;
 		netisr_poll_scheduled = 1;
 		netisr_pollmore_scheduled = 1;
 		netisr_sched_poll();
 		phase = 6;
 	}
 	mtx_unlock(&poll_mtx);
 }
 
 /*
  * netisr_poll is typically scheduled once per tick.
  */
 void
 netisr_poll(void)
 {
 	int i, cycles;
 	enum poll_cmd arg = POLL_ONLY;
+
+	if (poll_handlers == 0)
+		return;
 
 	mtx_lock(&poll_mtx);
 	if (!netisr_poll_scheduled) {
 		mtx_unlock(&poll_mtx);
 		return;
 	}
 	netisr_poll_scheduled = 0;
 	phase = 3;
 	if (residual_burst == 0) { /* first call in this tick */
 		microuptime(&poll_start_t);
 		if (++reg_frac_count == reg_frac) {
 			arg = POLL_AND_CHECK_STATUS;
 			reg_frac_count = 0;
 		}
 
 		residual_burst = poll_burst;
 	}
 	cycles = (residual_burst < poll_each_burst) ?
 		residual_burst : poll_each_burst;
 	residual_burst -= cycles;
 
 	for (i = 0 ; i < poll_handlers ; i++)
 		pr[i].handler(pr[i].ifp, arg, cycles);
 
 	phase = 4;
 	mtx_unlock(&poll_mtx);
 }
 
 /*
  * Try to register routine for polling. Returns 0 if successful
  * (and polling should be enabled), error code otherwise.
  * A device is not supposed to register itself multiple times.
  *
  * This is called from within the *_ioctl() functions.
  */
 int
 ether_poll_register(poll_handler_t *h, if_t ifp)
 {
 	int i;
 
 	KASSERT(h != NULL, ("%s: handler is NULL", __func__));
 	KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__));
 
 	mtx_lock(&poll_mtx);
 	if (poll_handlers >= POLL_LIST_LEN) {
 		/*
 		 * List full, cannot register more entries.
 		 * This should never happen; if it does, it is probably a
 		 * broken driver trying to register multiple times. Checking
 		 * this at runtime is expensive, and won't solve the problem
 		 * anyway, so just report a few times and then give up.
 		 */
 		static int verbose = 10 ;
 		if (verbose >0) {
 			log(LOG_ERR, "poll handlers list full, "
 			    "maybe a broken driver ?\n");
 			verbose--;
 		}
 		mtx_unlock(&poll_mtx);
 		return (ENOMEM); /* no polling for you */
 	}
 
 	for (i = 0 ; i < poll_handlers ; i++)
 		if (pr[i].ifp == ifp && pr[i].handler != NULL) {
 			mtx_unlock(&poll_mtx);
 			log(LOG_DEBUG, "ether_poll_register: %s: handler"
 			    " already registered\n", ifp->if_xname);
 			return (EEXIST);
 		}
 
 	pr[poll_handlers].handler = h;
 	pr[poll_handlers].ifp = ifp;
 	poll_handlers++;
 	mtx_unlock(&poll_mtx);
 	if (idlepoll_sleeping)
 		wakeup(&idlepoll_sleeping);
 	return (0);
 }
 
 /*
  * Remove interface from the polling list. Called from *_ioctl(), too.
  */
 int
 ether_poll_deregister(if_t ifp)
 {
 	int i;
 
 	KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__));
 
 	mtx_lock(&poll_mtx);
 
 	for (i = 0 ; i < poll_handlers ; i++)
 		if (pr[i].ifp == ifp) /* found it */
 			break;
 	if (i == poll_handlers) {
 		log(LOG_DEBUG, "ether_poll_deregister: %s: not found!\n",
 		    ifp->if_xname);
 		mtx_unlock(&poll_mtx);
 		return (ENOENT);
 	}
 	poll_handlers--;
 	if (i < poll_handlers) { /* Last entry replaces this one. */
 		pr[i].handler = pr[poll_handlers].handler;
 		pr[i].ifp = pr[poll_handlers].ifp;
 	}
 	mtx_unlock(&poll_mtx);
 	return (0);
 }
 
 static void
 poll_idle(void)
 {
 	struct thread *td = curthread;
 	struct rtprio rtp;
 
 	rtp.prio = RTP_PRIO_MAX;	/* lowest priority */
 	rtp.type = RTP_PRIO_IDLE;
 	PROC_SLOCK(td->td_proc);
 	rtp_to_pri(&rtp, td);
 	PROC_SUNLOCK(td->td_proc);
 
 	for (;;) {
 		if (poll_in_idle_loop && poll_handlers > 0) {
 			idlepoll_sleeping = 0;
 			ether_poll(poll_each_burst);
 			thread_lock(td);
 			mi_switch(SW_VOL, NULL);
 			thread_unlock(td);
 		} else {
 			idlepoll_sleeping = 1;
 			tsleep(&idlepoll_sleeping, 0, "pollid", hz * 3);
 		}
 	}
 }
 
 static struct proc *idlepoll;
 static struct kproc_desc idlepoll_kp = {
 	 "idlepoll",
 	 poll_idle,
 	 &idlepoll
 };
 SYSINIT(idlepoll, SI_SUB_KTHREAD_VM, SI_ORDER_ANY, kproc_start,
     &idlepoll_kp);
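
The burst adaptation performed in netisr_pollmore() above can be modelled in userland to see how poll_burst reacts to load. The following is only a sketch under stated assumptions: struct poll_model, model_kern_load() and model_adapt() are hypothetical names that do not exist in the kernel, locking and netisr scheduling are left out, and the elapsed time is passed in rather than measured with microuptime().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland model of the tunables used by netisr_pollmore().
 * The field names mirror the kernel variables; the struct itself is an
 * assumption made for illustration.
 */
struct poll_model {
	uint32_t poll_burst;		/* current burst size */
	uint32_t poll_burst_max;	/* upper bound for poll_burst */
	uint32_t user_frac;		/* share of CPU for user tasks, 0..99 */
};

/* Scale the microseconds spent in one tick to a 0..100 load figure. */
static int
model_kern_load(int elapsed_us, int hz)
{

	assert(hz > 0);
	return ((elapsed_us * hz) / 10000);
}

/*
 * One adaptation step: shrink the burst by one when the kernel used more
 * than its share, otherwise grow it by one up to poll_burst_max.  When
 * hardclock ticks were missed, additionally cut the burst by one eighth,
 * never going below 1.
 */
static void
model_adapt(struct poll_model *m, int elapsed_us, int hz, int missed_ticks)
{
	int kern_load;

	kern_load = model_kern_load(elapsed_us, hz);
	if (kern_load > (int)(100 - m->user_frac)) {
		if (m->poll_burst > 1)
			m->poll_burst--;
	} else {
		if (m->poll_burst < m->poll_burst_max)
			m->poll_burst++;
	}
	if (missed_ticks) {
		m->poll_burst -= m->poll_burst / 8;
		if (m->poll_burst < 1)
			m->poll_burst = 1;
	}
}
```

With hz = 1000 and user_frac = 50, a tick that spends 600 us in network processing yields a load of 60, which exceeds the kernel share of 50 and shrinks the burst by one; a tick that spends 300 us yields 30 and grows it by one.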
Index: user/ngie/more-tests/sys/kern/kern_resource.c
===================================================================
--- user/ngie/more-tests/sys/kern/kern_resource.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/kern_resource.c	(revision 281585)
@@ -1,1438 +1,1443 @@
 /*-
  * Copyright (c) 1982, 1986, 1991, 1993
  *	The Regents of the University of California.  All rights reserved.
  * (c) UNIX System Laboratories, Inc.
  * All or some portions of this file are derived from material licensed
  * to the University of California by American Telephone and Telegraph
  * Co. or Unix System Laboratories, Inc. and are reproduced herein with
  * the permission of UNIX System Laboratories, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)kern_resource.c	8.5 (Berkeley) 1/21/94
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_compat.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/sysproto.h>
 #include <sys/file.h>
 #include <sys/kernel.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mutex.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/refcount.h>
 #include <sys/racct.h>
 #include <sys/resourcevar.h>
 #include <sys/rwlock.h>
 #include <sys/sched.h>
 #include <sys/sx.h>
 #include <sys/syscallsubr.h>
 #include <sys/sysctl.h>
 #include <sys/sysent.h>
 #include <sys/time.h>
 #include <sys/umtx.h>
 
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 
 
 static MALLOC_DEFINE(M_PLIMIT, "plimit", "plimit structures");
 static MALLOC_DEFINE(M_UIDINFO, "uidinfo", "uidinfo structures");
 #define	UIHASH(uid)	(&uihashtbl[(uid) & uihash])
 static struct rwlock uihashtbl_lock;
 static LIST_HEAD(uihashhead, uidinfo) *uihashtbl;
 static u_long uihash;		/* size of hash table - 1 */
 
 static void	calcru1(struct proc *p, struct rusage_ext *ruxp,
 		    struct timeval *up, struct timeval *sp);
 static int	donice(struct thread *td, struct proc *chgp, int n);
 static struct uidinfo *uilookup(uid_t uid);
 static void	ruxagg_locked(struct rusage_ext *rux, struct thread *td);
 
 static __inline int	lim_shared(struct plimit *limp);
 
 /*
  * Resource controls and accounting.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct getpriority_args {
 	int	which;
 	int	who;
 };
 #endif
 int
 sys_getpriority(struct thread *td, register struct getpriority_args *uap)
 {
 	struct proc *p;
 	struct pgrp *pg;
 	int error, low;
 
 	error = 0;
 	low = PRIO_MAX + 1;
 	switch (uap->which) {
 
 	case PRIO_PROCESS:
 		if (uap->who == 0)
 			low = td->td_proc->p_nice;
 		else {
 			p = pfind(uap->who);
 			if (p == NULL)
 				break;
 			if (p_cansee(td, p) == 0)
 				low = p->p_nice;
 			PROC_UNLOCK(p);
 		}
 		break;
 
 	case PRIO_PGRP:
 		sx_slock(&proctree_lock);
 		if (uap->who == 0) {
 			pg = td->td_proc->p_pgrp;
 			PGRP_LOCK(pg);
 		} else {
 			pg = pgfind(uap->who);
 			if (pg == NULL) {
 				sx_sunlock(&proctree_lock);
 				break;
 			}
 		}
 		sx_sunlock(&proctree_lock);
 		LIST_FOREACH(p, &pg->pg_members, p_pglist) {
 			PROC_LOCK(p);
 			if (p->p_state == PRS_NORMAL &&
 			    p_cansee(td, p) == 0) {
 				if (p->p_nice < low)
 					low = p->p_nice;
 			}
 			PROC_UNLOCK(p);
 		}
 		PGRP_UNLOCK(pg);
 		break;
 
 	case PRIO_USER:
 		if (uap->who == 0)
 			uap->who = td->td_ucred->cr_uid;
 		sx_slock(&allproc_lock);
 		FOREACH_PROC_IN_SYSTEM(p) {
 			PROC_LOCK(p);
 			if (p->p_state == PRS_NORMAL &&
 			    p_cansee(td, p) == 0 &&
 			    p->p_ucred->cr_uid == uap->who) {
 				if (p->p_nice < low)
 					low = p->p_nice;
 			}
 			PROC_UNLOCK(p);
 		}
 		sx_sunlock(&allproc_lock);
 		break;
 
 	default:
 		error = EINVAL;
 		break;
 	}
 	if (low == PRIO_MAX + 1 && error == 0)
 		error = ESRCH;
 	td->td_retval[0] = low;
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct setpriority_args {
 	int	which;
 	int	who;
 	int	prio;
 };
 #endif
 int
 sys_setpriority(struct thread *td, struct setpriority_args *uap)
 {
 	struct proc *curp, *p;
 	struct pgrp *pg;
 	int found = 0, error = 0;
 
 	curp = td->td_proc;
 	switch (uap->which) {
 	case PRIO_PROCESS:
 		if (uap->who == 0) {
 			PROC_LOCK(curp);
 			error = donice(td, curp, uap->prio);
 			PROC_UNLOCK(curp);
 		} else {
 			p = pfind(uap->who);
 			if (p == NULL)
 				break;
 			error = p_cansee(td, p);
 			if (error == 0)
 				error = donice(td, p, uap->prio);
 			PROC_UNLOCK(p);
 		}
 		found++;
 		break;
 
 	case PRIO_PGRP:
 		sx_slock(&proctree_lock);
 		if (uap->who == 0) {
 			pg = curp->p_pgrp;
 			PGRP_LOCK(pg);
 		} else {
 			pg = pgfind(uap->who);
 			if (pg == NULL) {
 				sx_sunlock(&proctree_lock);
 				break;
 			}
 		}
 		sx_sunlock(&proctree_lock);
 		LIST_FOREACH(p, &pg->pg_members, p_pglist) {
 			PROC_LOCK(p);
 			if (p->p_state == PRS_NORMAL &&
 			    p_cansee(td, p) == 0) {
 				error = donice(td, p, uap->prio);
 				found++;
 			}
 			PROC_UNLOCK(p);
 		}
 		PGRP_UNLOCK(pg);
 		break;
 
 	case PRIO_USER:
 		if (uap->who == 0)
 			uap->who = td->td_ucred->cr_uid;
 		sx_slock(&allproc_lock);
 		FOREACH_PROC_IN_SYSTEM(p) {
 			PROC_LOCK(p);
 			if (p->p_state == PRS_NORMAL &&
 			    p->p_ucred->cr_uid == uap->who &&
 			    p_cansee(td, p) == 0) {
 				error = donice(td, p, uap->prio);
 				found++;
 			}
 			PROC_UNLOCK(p);
 		}
 		sx_sunlock(&allproc_lock);
 		break;
 
 	default:
 		error = EINVAL;
 		break;
 	}
 	if (found == 0 && error == 0)
 		error = ESRCH;
 	return (error);
 }
 
 /*
  * Set "nice" for a (whole) process.
  */
 static int
 donice(struct thread *td, struct proc *p, int n)
 {
 	int error;
 
 	PROC_LOCK_ASSERT(p, MA_OWNED);
 	if ((error = p_cansched(td, p)))
 		return (error);
 	if (n > PRIO_MAX)
 		n = PRIO_MAX;
 	if (n < PRIO_MIN)
 		n = PRIO_MIN;
 	if (n < p->p_nice && priv_check(td, PRIV_SCHED_SETPRIORITY) != 0)
 		return (EACCES);
 	sched_nice(p, n);
 	return (0);
 }
 
 static int unprivileged_idprio;
 SYSCTL_INT(_security_bsd, OID_AUTO, unprivileged_idprio, CTLFLAG_RW,
     &unprivileged_idprio, 0, "Allow non-root users to set an idle priority");
 
 /*
  * Set realtime priority for LWP.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct rtprio_thread_args {
 	int		function;
 	lwpid_t		lwpid;
 	struct rtprio	*rtp;
 };
 #endif
 int
 sys_rtprio_thread(struct thread *td, struct rtprio_thread_args *uap)
 {
 	struct proc *p;
 	struct rtprio rtp;
 	struct thread *td1;
 	int cierror, error;
 
 	/* Perform copyin before acquiring locks if needed. */
 	if (uap->function == RTP_SET)
 		cierror = copyin(uap->rtp, &rtp, sizeof(struct rtprio));
 	else
 		cierror = 0;
 
 	if (uap->lwpid == 0 || uap->lwpid == td->td_tid) {
 		p = td->td_proc;
 		td1 = td;
 		PROC_LOCK(p);
 	} else {
 		/* Only look up thread in current process */
 		td1 = tdfind(uap->lwpid, curproc->p_pid);
 		if (td1 == NULL)
 			return (ESRCH);
 		p = td1->td_proc;
 	}
 
 	switch (uap->function) {
 	case RTP_LOOKUP:
 		if ((error = p_cansee(td, p)))
 			break;
 		pri_to_rtp(td1, &rtp);
 		PROC_UNLOCK(p);
 		return (copyout(&rtp, uap->rtp, sizeof(struct rtprio)));
 	case RTP_SET:
 		if ((error = p_cansched(td, p)) || (error = cierror))
 			break;
 
 		/* Disallow setting rtprio in most cases if not superuser. */
 
 		/*
 		 * Realtime priority has to be restricted for reasons which
 		 * should be obvious.  However, for idleprio processes, there is
 		 * a potential for system deadlock if an idleprio process gains
 		 * a lock on a resource that other processes need (and the
 		 * idleprio process can't run due to a CPU-bound normal
 		 * process).  Fix me!  XXX
 		 *
 		 * This problem is not only related to idleprio process.
 		 * A user level program can obtain a file lock and hold it
 		 * indefinitely.  Additionally, without idleprio processes it is
 		 * still conceivable that a program with low priority will never
 		 * get to run.  In short, allowing this feature might make it
 		 * easier to lock a resource indefinitely, but it is not the
 		 * only thing that makes it possible.
 		 */
 		if (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_REALTIME ||
 		    (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_IDLE &&
 		    unprivileged_idprio == 0)) {
 			error = priv_check(td, PRIV_SCHED_RTPRIO);
 			if (error)
 				break;
 		}
 		error = rtp_to_pri(&rtp, td1);
 		break;
 	default:
 		error = EINVAL;
 		break;
 	}
 	PROC_UNLOCK(p);
 	return (error);
 }
 
 /*
  * Set realtime priority.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct rtprio_args {
 	int		function;
 	pid_t		pid;
 	struct rtprio	*rtp;
 };
 #endif
 int
 sys_rtprio(struct thread *td, register struct rtprio_args *uap)
 {
 	struct proc *p;
 	struct thread *tdp;
 	struct rtprio rtp;
 	int cierror, error;
 
 	/* Perform copyin before acquiring locks if needed. */
 	if (uap->function == RTP_SET)
 		cierror = copyin(uap->rtp, &rtp, sizeof(struct rtprio));
 	else
 		cierror = 0;
 
 	if (uap->pid == 0) {
 		p = td->td_proc;
 		PROC_LOCK(p);
 	} else {
 		p = pfind(uap->pid);
 		if (p == NULL)
 			return (ESRCH);
 	}
 
 	switch (uap->function) {
 	case RTP_LOOKUP:
 		if ((error = p_cansee(td, p)))
 			break;
 		/*
 		 * Return OUR priority if no pid specified,
 		 * or if one is, report the highest priority
 		 * in the process.  There isn't much more you can do as
 		 * there is only room to return a single priority.
 		 * Note: specifying our own pid is not the same
 		 * as leaving it zero.
 		 */
 		if (uap->pid == 0) {
 			pri_to_rtp(td, &rtp);
 		} else {
 			struct rtprio rtp2;
 
 			rtp.type = RTP_PRIO_IDLE;
 			rtp.prio = RTP_PRIO_MAX;
 			FOREACH_THREAD_IN_PROC(p, tdp) {
 				pri_to_rtp(tdp, &rtp2);
 				if (rtp2.type <  rtp.type ||
 				    (rtp2.type == rtp.type &&
 				    rtp2.prio < rtp.prio)) {
 					rtp.type = rtp2.type;
 					rtp.prio = rtp2.prio;
 				}
 			}
 		}
 		PROC_UNLOCK(p);
 		return (copyout(&rtp, uap->rtp, sizeof(struct rtprio)));
 	case RTP_SET:
 		if ((error = p_cansched(td, p)) || (error = cierror))
 			break;
 
 		/*
 		 * Disallow setting rtprio in most cases if not superuser.
 		 * See the comment in sys_rtprio_thread about idprio
 		 * threads holding a lock.
 		 */
 		if (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_REALTIME ||
 		    (RTP_PRIO_BASE(rtp.type) == RTP_PRIO_IDLE &&
 		    !unprivileged_idprio)) {
 			error = priv_check(td, PRIV_SCHED_RTPRIO);
 			if (error)
 				break;
 		}
 
 		/*
 		 * If we are setting our own priority, set just our
 		 * thread but if we are doing another process,
 		 * do all the threads on that process. If we
 		 * specify our own pid we do the latter.
 		 */
 		if (uap->pid == 0) {
 			error = rtp_to_pri(&rtp, td);
 		} else {
 			FOREACH_THREAD_IN_PROC(p, td) {
 				if ((error = rtp_to_pri(&rtp, td)) != 0)
 					break;
 			}
 		}
 		break;
 	default:
 		error = EINVAL;
 		break;
 	}
 	PROC_UNLOCK(p);
 	return (error);
 }
 
 int
 rtp_to_pri(struct rtprio *rtp, struct thread *td)
 {
 	u_char  newpri, oldclass, oldpri;
 
 	switch (RTP_PRIO_BASE(rtp->type)) {
 	case RTP_PRIO_REALTIME:
 		if (rtp->prio > RTP_PRIO_MAX)
 			return (EINVAL);
 		newpri = PRI_MIN_REALTIME + rtp->prio;
 		break;
 	case RTP_PRIO_NORMAL:
 		if (rtp->prio > (PRI_MAX_TIMESHARE - PRI_MIN_TIMESHARE))
 			return (EINVAL);
 		newpri = PRI_MIN_TIMESHARE + rtp->prio;
 		break;
 	case RTP_PRIO_IDLE:
 		if (rtp->prio > RTP_PRIO_MAX)
 			return (EINVAL);
 		newpri = PRI_MIN_IDLE + rtp->prio;
 		break;
 	default:
 		return (EINVAL);
 	}
 
 	thread_lock(td);
 	oldclass = td->td_pri_class;
 	sched_class(td, rtp->type);	/* XXX fix */
 	oldpri = td->td_user_pri;
 	sched_user_prio(td, newpri);
 	if (td->td_user_pri != oldpri && (oldclass != RTP_PRIO_NORMAL ||
 	    td->td_pri_class != RTP_PRIO_NORMAL))
 		sched_prio(td, td->td_user_pri);
 	if (TD_ON_UPILOCK(td) && oldpri != newpri) {
 		critical_enter();
 		thread_unlock(td);
 		umtx_pi_adjust(td, oldpri);
 		critical_exit();
 	} else
 		thread_unlock(td);
 	return (0);
 }
 
 void
 pri_to_rtp(struct thread *td, struct rtprio *rtp)
 {
 
 	thread_lock(td);
 	switch (PRI_BASE(td->td_pri_class)) {
 	case PRI_REALTIME:
 		rtp->prio = td->td_base_user_pri - PRI_MIN_REALTIME;
 		break;
 	case PRI_TIMESHARE:
 		rtp->prio = td->td_base_user_pri - PRI_MIN_TIMESHARE;
 		break;
 	case PRI_IDLE:
 		rtp->prio = td->td_base_user_pri - PRI_MIN_IDLE;
 		break;
 	default:
 		break;
 	}
 	rtp->type = td->td_pri_class;
 	thread_unlock(td);
 }
 
 #if defined(COMPAT_43)
 #ifndef _SYS_SYSPROTO_H_
 struct osetrlimit_args {
 	u_int	which;
 	struct	orlimit *rlp;
 };
 #endif
 int
 osetrlimit(struct thread *td, register struct osetrlimit_args *uap)
 {
 	struct orlimit olim;
 	struct rlimit lim;
 	int error;
 
 	if ((error = copyin(uap->rlp, &olim, sizeof(struct orlimit))))
 		return (error);
 	lim.rlim_cur = olim.rlim_cur;
 	lim.rlim_max = olim.rlim_max;
 	error = kern_setrlimit(td, uap->which, &lim);
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct ogetrlimit_args {
 	u_int	which;
 	struct	orlimit *rlp;
 };
 #endif
 int
 ogetrlimit(struct thread *td, register struct ogetrlimit_args *uap)
 {
 	struct orlimit olim;
 	struct rlimit rl;
 	struct proc *p;
 	int error;
 
 	if (uap->which >= RLIM_NLIMITS)
 		return (EINVAL);
 	p = td->td_proc;
 	PROC_LOCK(p);
 	lim_rlimit(p, uap->which, &rl);
 	PROC_UNLOCK(p);
 
 	/*
 	 * XXX would be more correct to convert only RLIM_INFINITY to the
 	 * old RLIM_INFINITY and fail with EOVERFLOW for other larger
 	 * values.  Most 64->32 and 32->16 conversions, including not
 	 * unimportant ones of uids are even more broken than what we
 	 * do here (they blindly truncate).  We don't do this correctly
 	 * here since we have little experience with EOVERFLOW yet.
 	 * Elsewhere, getuid() can't fail...
 	 */
 	olim.rlim_cur = rl.rlim_cur > 0x7fffffff ? 0x7fffffff : rl.rlim_cur;
 	olim.rlim_max = rl.rlim_max > 0x7fffffff ? 0x7fffffff : rl.rlim_max;
 	error = copyout(&olim, uap->rlp, sizeof(olim));
 	return (error);
 }
 #endif /* COMPAT_43 */
 
 #ifndef _SYS_SYSPROTO_H_
 struct __setrlimit_args {
 	u_int	which;
 	struct	rlimit *rlp;
 };
 #endif
 int
 sys_setrlimit(struct thread *td, register struct __setrlimit_args *uap)
 {
 	struct rlimit alim;
 	int error;
 
 	if ((error = copyin(uap->rlp, &alim, sizeof(struct rlimit))))
 		return (error);
 	error = kern_setrlimit(td, uap->which, &alim);
 	return (error);
 }
 
 static void
 lim_cb(void *arg)
 {
 	struct rlimit rlim;
 	struct thread *td;
 	struct proc *p;
 
 	p = arg;
 	PROC_LOCK_ASSERT(p, MA_OWNED);
 	/*
 	 * Check if the process exceeds its cpu resource allocation.  If
 	 * it reaches the max, arrange to kill the process in ast().
 	 */
 	if (p->p_cpulimit == RLIM_INFINITY)
 		return;
 	PROC_STATLOCK(p);
 	FOREACH_THREAD_IN_PROC(p, td) {
 		ruxagg(p, td);
 	}
 	PROC_STATUNLOCK(p);
 	if (p->p_rux.rux_runtime > p->p_cpulimit * cpu_tickrate()) {
 		lim_rlimit(p, RLIMIT_CPU, &rlim);
 		if (p->p_rux.rux_runtime >= rlim.rlim_max * cpu_tickrate()) {
 			killproc(p, "exceeded maximum CPU limit");
 		} else {
 			if (p->p_cpulimit < rlim.rlim_max)
 				p->p_cpulimit += 5;
 			kern_psignal(p, SIGXCPU);
 		}
 	}
 	if ((p->p_flag & P_WEXIT) == 0)
 		callout_reset_sbt(&p->p_limco, SBT_1S, 0,
 		    lim_cb, p, C_PREL(1));
 }
 
 int
 kern_setrlimit(struct thread *td, u_int which, struct rlimit *limp)
 {
 
 	return (kern_proc_setrlimit(td, td->td_proc, which, limp));
 }
 
 int
 kern_proc_setrlimit(struct thread *td, struct proc *p, u_int which,
     struct rlimit *limp)
 {
 	struct plimit *newlim, *oldlim;
 	register struct rlimit *alimp;
 	struct rlimit oldssiz;
 	int error;
 
 	if (which >= RLIM_NLIMITS)
 		return (EINVAL);
 
 	/*
 	 * Preserve historical bugs by treating negative limits as unsigned.
 	 */
 	if (limp->rlim_cur < 0)
 		limp->rlim_cur = RLIM_INFINITY;
 	if (limp->rlim_max < 0)
 		limp->rlim_max = RLIM_INFINITY;
 
 	oldssiz.rlim_cur = 0;
 	newlim = NULL;
 	PROC_LOCK(p);
 	if (lim_shared(p->p_limit)) {
 		PROC_UNLOCK(p);
 		newlim = lim_alloc();
 		PROC_LOCK(p);
 	}
 	oldlim = p->p_limit;
 	alimp = &oldlim->pl_rlimit[which];
 	if (limp->rlim_cur > alimp->rlim_max ||
 	    limp->rlim_max > alimp->rlim_max)
 		if ((error = priv_check(td, PRIV_PROC_SETRLIMIT))) {
 			PROC_UNLOCK(p);
 			if (newlim != NULL)
 				lim_free(newlim);
 			return (error);
 		}
 	if (limp->rlim_cur > limp->rlim_max)
 		limp->rlim_cur = limp->rlim_max;
 	if (newlim != NULL) {
 		lim_copy(newlim, oldlim);
 		alimp = &newlim->pl_rlimit[which];
 	}
 
 	switch (which) {
 
 	case RLIMIT_CPU:
 		if (limp->rlim_cur != RLIM_INFINITY &&
 		    p->p_cpulimit == RLIM_INFINITY)
 			callout_reset_sbt(&p->p_limco, SBT_1S, 0,
 			    lim_cb, p, C_PREL(1));
 		p->p_cpulimit = limp->rlim_cur;
 		break;
 	case RLIMIT_DATA:
 		if (limp->rlim_cur > maxdsiz)
 			limp->rlim_cur = maxdsiz;
 		if (limp->rlim_max > maxdsiz)
 			limp->rlim_max = maxdsiz;
 		break;
 
 	case RLIMIT_STACK:
 		if (limp->rlim_cur > maxssiz)
 			limp->rlim_cur = maxssiz;
 		if (limp->rlim_max > maxssiz)
 			limp->rlim_max = maxssiz;
 		oldssiz = *alimp;
 		if (p->p_sysent->sv_fixlimit != NULL)
 			p->p_sysent->sv_fixlimit(&oldssiz,
 			    RLIMIT_STACK);
 		break;
 
 	case RLIMIT_NOFILE:
 		if (limp->rlim_cur > maxfilesperproc)
 			limp->rlim_cur = maxfilesperproc;
 		if (limp->rlim_max > maxfilesperproc)
 			limp->rlim_max = maxfilesperproc;
 		break;
 
 	case RLIMIT_NPROC:
 		if (limp->rlim_cur > maxprocperuid)
 			limp->rlim_cur = maxprocperuid;
 		if (limp->rlim_max > maxprocperuid)
 			limp->rlim_max = maxprocperuid;
 		if (limp->rlim_cur < 1)
 			limp->rlim_cur = 1;
 		if (limp->rlim_max < 1)
 			limp->rlim_max = 1;
 		break;
 	}
 	if (p->p_sysent->sv_fixlimit != NULL)
 		p->p_sysent->sv_fixlimit(limp, which);
 	*alimp = *limp;
 	if (newlim != NULL)
 		p->p_limit = newlim;
 	PROC_UNLOCK(p);
 	if (newlim != NULL)
 		lim_free(oldlim);
 
-	if (which == RLIMIT_STACK) {
+	if (which == RLIMIT_STACK &&
+	    /*
+	     * Skip calls from exec_new_vmspace(), done when stack is
+	     * not mapped yet.
+	     */
+	    (td != curthread || (p->p_flag & P_INEXEC) == 0)) {
 		/*
 		 * Stack is allocated to the max at exec time with only
 		 * "rlim_cur" bytes accessible.  If stack limit is going
 		 * up make more accessible, if going down make inaccessible.
 		 */
 		if (limp->rlim_cur != oldssiz.rlim_cur) {
 			vm_offset_t addr;
 			vm_size_t size;
 			vm_prot_t prot;
 
 			if (limp->rlim_cur > oldssiz.rlim_cur) {
 				prot = p->p_sysent->sv_stackprot;
 				size = limp->rlim_cur - oldssiz.rlim_cur;
 				addr = p->p_sysent->sv_usrstack -
 				    limp->rlim_cur;
 			} else {
 				prot = VM_PROT_NONE;
 				size = oldssiz.rlim_cur - limp->rlim_cur;
 				addr = p->p_sysent->sv_usrstack -
 				    oldssiz.rlim_cur;
 			}
 			addr = trunc_page(addr);
 			size = round_page(size);
 			(void)vm_map_protect(&p->p_vmspace->vm_map,
 			    addr, addr + size, prot, FALSE);
 		}
 	}
 
 	return (0);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct __getrlimit_args {
 	u_int	which;
 	struct	rlimit *rlp;
 };
 #endif
 /* ARGSUSED */
 int
 sys_getrlimit(struct thread *td, register struct __getrlimit_args *uap)
 {
 	struct rlimit rlim;
 	struct proc *p;
 	int error;
 
 	if (uap->which >= RLIM_NLIMITS)
 		return (EINVAL);
 	p = td->td_proc;
 	PROC_LOCK(p);
 	lim_rlimit(p, uap->which, &rlim);
 	PROC_UNLOCK(p);
 	error = copyout(&rlim, uap->rlp, sizeof(struct rlimit));
 	return (error);
 }
 
 /*
  * Transform the running time and tick information for children of proc p
  * into user and system time usage.
  */
 void
 calccru(struct proc *p, struct timeval *up, struct timeval *sp)
 {
 
 	PROC_LOCK_ASSERT(p, MA_OWNED);
 	calcru1(p, &p->p_crux, up, sp);
 }
 
 /*
  * Transform the running time and tick information in proc p into user
  * and system time usage.  If appropriate, include the current time slice
  * on this CPU.
  */
 void
 calcru(struct proc *p, struct timeval *up, struct timeval *sp)
 {
 	struct thread *td;
 	uint64_t runtime, u;
 
 	PROC_LOCK_ASSERT(p, MA_OWNED);
 	PROC_STATLOCK_ASSERT(p, MA_OWNED);
 	/*
 	 * If we are getting stats for the current process, then add in the
 	 * stats that this thread has accumulated in its current time slice.
 	 * We reset the thread and CPU state as if we had performed a context
 	 * switch right here.
 	 */
 	td = curthread;
 	if (td->td_proc == p) {
 		u = cpu_ticks();
 		runtime = u - PCPU_GET(switchtime);
 		td->td_runtime += runtime;
 		td->td_incruntime += runtime;
 		PCPU_SET(switchtime, u);
 	}
 	/* Make sure the per-thread stats are current. */
 	FOREACH_THREAD_IN_PROC(p, td) {
 		if (td->td_incruntime == 0)
 			continue;
 		ruxagg(p, td);
 	}
 	calcru1(p, &p->p_rux, up, sp);
 }
 
 /* Collect resource usage for a single thread. */
 void
 rufetchtd(struct thread *td, struct rusage *ru)
 {
 	struct proc *p;
 	uint64_t runtime, u;
 
 	p = td->td_proc;
 	PROC_STATLOCK_ASSERT(p, MA_OWNED);
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
 	/*
 	 * If we are getting stats for the current thread, then add in the
 	 * stats that this thread has accumulated in its current time slice.
 	 * We reset the thread and CPU state as if we had performed a context
 	 * switch right here.
 	 */
 	if (td == curthread) {
 		u = cpu_ticks();
 		runtime = u - PCPU_GET(switchtime);
 		td->td_runtime += runtime;
 		td->td_incruntime += runtime;
 		PCPU_SET(switchtime, u);
 	}
 	ruxagg(p, td);
 	*ru = td->td_ru;
 	calcru1(p, &td->td_rux, &ru->ru_utime, &ru->ru_stime);
 }
 
 static void
 calcru1(struct proc *p, struct rusage_ext *ruxp, struct timeval *up,
     struct timeval *sp)
 {
 	/* {user, system, interrupt, total} {ticks, usec}: */
 	uint64_t ut, uu, st, su, it, tt, tu;
 
 	ut = ruxp->rux_uticks;
 	st = ruxp->rux_sticks;
 	it = ruxp->rux_iticks;
 	tt = ut + st + it;
 	if (tt == 0) {
 		/* Avoid divide by zero */
 		st = 1;
 		tt = 1;
 	}
 	tu = cputick2usec(ruxp->rux_runtime);
 	if ((int64_t)tu < 0) {
 		/* XXX: this should be an assert /phk */
 		printf("calcru: negative runtime of %jd usec for pid %d (%s)\n",
 		    (intmax_t)tu, p->p_pid, p->p_comm);
 		tu = ruxp->rux_tu;
 	}
 
 	if (tu >= ruxp->rux_tu) {
 		/*
 		 * The normal case, time increased.
 		 * Enforce monotonicity of bucketed numbers.
 		 */
 		uu = (tu * ut) / tt;
 		if (uu < ruxp->rux_uu)
 			uu = ruxp->rux_uu;
 		su = (tu * st) / tt;
 		if (su < ruxp->rux_su)
 			su = ruxp->rux_su;
 	} else if (tu + 3 > ruxp->rux_tu || 101 * tu > 100 * ruxp->rux_tu) {
 		/*
 		 * When we calibrate the cputicker, it is not uncommon to
 		 * see the presumably fixed frequency increase slightly over
 		 * time as a result of thermal stabilization and NTP
 		 * discipline (of the reference clock).  We therefore ignore
 		 * a bit of backwards slop because we expect to catch up
 		 * shortly.  We use a 3 microsecond limit to catch low
 		 * counts and a 1% limit for high counts.
 		 */
 		uu = ruxp->rux_uu;
 		su = ruxp->rux_su;
 		tu = ruxp->rux_tu;
 	} else { /* tu < ruxp->rux_tu */
 		/*
 		 * What happened here was likely that a laptop, which ran at
 		 * a reduced clock frequency at boot, kicked into high gear.
 		 * The wisdom of spamming this message in that case is
 		 * dubious, but it might also be indicative of something
 		 * serious, so let's keep it and hope laptops can be made
 		 * more truthful about their CPU speed via ACPI.
 		 */
 		printf("calcru: runtime went backwards from %ju usec "
 		    "to %ju usec for pid %d (%s)\n",
 		    (uintmax_t)ruxp->rux_tu, (uintmax_t)tu,
 		    p->p_pid, p->p_comm);
 		uu = (tu * ut) / tt;
 		su = (tu * st) / tt;
 	}
 
 	ruxp->rux_uu = uu;
 	ruxp->rux_su = su;
 	ruxp->rux_tu = tu;
 
 	up->tv_sec = uu / 1000000;
 	up->tv_usec = uu % 1000000;
 	sp->tv_sec = su / 1000000;
 	sp->tv_usec = su % 1000000;
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct getrusage_args {
 	int	who;
 	struct	rusage *rusage;
 };
 #endif
 int
 sys_getrusage(register struct thread *td, register struct getrusage_args *uap)
 {
 	struct rusage ru;
 	int error;
 
 	error = kern_getrusage(td, uap->who, &ru);
 	if (error == 0)
 		error = copyout(&ru, uap->rusage, sizeof(struct rusage));
 	return (error);
 }
 
 int
 kern_getrusage(struct thread *td, int who, struct rusage *rup)
 {
 	struct proc *p;
 	int error;
 
 	error = 0;
 	p = td->td_proc;
 	PROC_LOCK(p);
 	switch (who) {
 	case RUSAGE_SELF:
 		rufetchcalc(p, rup, &rup->ru_utime,
 		    &rup->ru_stime);
 		break;
 
 	case RUSAGE_CHILDREN:
 		*rup = p->p_stats->p_cru;
 		calccru(p, &rup->ru_utime, &rup->ru_stime);
 		break;
 
 	case RUSAGE_THREAD:
 		PROC_STATLOCK(p);
 		thread_lock(td);
 		rufetchtd(td, rup);
 		thread_unlock(td);
 		PROC_STATUNLOCK(p);
 		break;
 
 	default:
 		error = EINVAL;
 	}
 	PROC_UNLOCK(p);
 	return (error);
 }
 
 void
 rucollect(struct rusage *ru, struct rusage *ru2)
 {
 	long *ip, *ip2;
 	int i;
 
 	if (ru->ru_maxrss < ru2->ru_maxrss)
 		ru->ru_maxrss = ru2->ru_maxrss;
 	ip = &ru->ru_first;
 	ip2 = &ru2->ru_first;
 	for (i = &ru->ru_last - &ru->ru_first; i >= 0; i--)
 		*ip++ += *ip2++;
 }
 
 void
 ruadd(struct rusage *ru, struct rusage_ext *rux, struct rusage *ru2,
     struct rusage_ext *rux2)
 {
 
 	rux->rux_runtime += rux2->rux_runtime;
 	rux->rux_uticks += rux2->rux_uticks;
 	rux->rux_sticks += rux2->rux_sticks;
 	rux->rux_iticks += rux2->rux_iticks;
 	rux->rux_uu += rux2->rux_uu;
 	rux->rux_su += rux2->rux_su;
 	rux->rux_tu += rux2->rux_tu;
 	rucollect(ru, ru2);
 }
 
 /*
  * Aggregate tick counts into the proc's rusage_ext.
  */
 static void
 ruxagg_locked(struct rusage_ext *rux, struct thread *td)
 {
 
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
 	PROC_STATLOCK_ASSERT(td->td_proc, MA_OWNED);
 	rux->rux_runtime += td->td_incruntime;
 	rux->rux_uticks += td->td_uticks;
 	rux->rux_sticks += td->td_sticks;
 	rux->rux_iticks += td->td_iticks;
 }
 
 void
 ruxagg(struct proc *p, struct thread *td)
 {
 
 	thread_lock(td);
 	ruxagg_locked(&p->p_rux, td);
 	ruxagg_locked(&td->td_rux, td);
 	td->td_incruntime = 0;
 	td->td_uticks = 0;
 	td->td_iticks = 0;
 	td->td_sticks = 0;
 	thread_unlock(td);
 }
 
 /*
  * Update the rusage_ext structure and fetch a valid aggregate rusage
  * for proc p if storage for one is supplied.
  */
 void
 rufetch(struct proc *p, struct rusage *ru)
 {
 	struct thread *td;
 
 	PROC_STATLOCK_ASSERT(p, MA_OWNED);
 
 	*ru = p->p_ru;
 	if (p->p_numthreads > 0)  {
 		FOREACH_THREAD_IN_PROC(p, td) {
 			ruxagg(p, td);
 			rucollect(ru, &td->td_ru);
 		}
 	}
 }
 
 /*
  * Atomically perform a rufetch and a calcru together.
  * Consumers can safely assume that calcru is executed only after
  * rufetch has completed.
  */
 void
 rufetchcalc(struct proc *p, struct rusage *ru, struct timeval *up,
     struct timeval *sp)
 {
 
 	PROC_STATLOCK(p);
 	rufetch(p, ru);
 	calcru(p, up, sp);
 	PROC_STATUNLOCK(p);
 }
 
 /*
  * Allocate a new resource limits structure and initialize its
  * reference count and mutex pointer.
  */
 struct plimit *
 lim_alloc()
 {
 	struct plimit *limp;
 
 	limp = malloc(sizeof(struct plimit), M_PLIMIT, M_WAITOK);
 	refcount_init(&limp->pl_refcnt, 1);
 	return (limp);
 }
 
 struct plimit *
 lim_hold(struct plimit *limp)
 {
 
 	refcount_acquire(&limp->pl_refcnt);
 	return (limp);
 }
 
 static __inline int
 lim_shared(struct plimit *limp)
 {
 
 	return (limp->pl_refcnt > 1);
 }
 
 void
 lim_fork(struct proc *p1, struct proc *p2)
 {
 
 	PROC_LOCK_ASSERT(p1, MA_OWNED);
 	PROC_LOCK_ASSERT(p2, MA_OWNED);
 
 	p2->p_limit = lim_hold(p1->p_limit);
 	callout_init_mtx(&p2->p_limco, &p2->p_mtx, 0);
 	if (p1->p_cpulimit != RLIM_INFINITY)
 		callout_reset_sbt(&p2->p_limco, SBT_1S, 0,
 		    lim_cb, p2, C_PREL(1));
 }
 
 void
 lim_free(struct plimit *limp)
 {
 
 	if (refcount_release(&limp->pl_refcnt))
 		free((void *)limp, M_PLIMIT);
 }
 
 /*
  * Make a copy of the plimit structure.
  * We share these structures copy-on-write after fork.
  */
 void
 lim_copy(struct plimit *dst, struct plimit *src)
 {
 
 	KASSERT(!lim_shared(dst), ("lim_copy to shared limit"));
 	bcopy(src->pl_rlimit, dst->pl_rlimit, sizeof(src->pl_rlimit));
 }
 
 /*
  * Return the hard limit for a particular system resource.  The
  * which parameter specifies the index into the rlimit array.
  */
 rlim_t
 lim_max(struct proc *p, int which)
 {
 	struct rlimit rl;
 
 	lim_rlimit(p, which, &rl);
 	return (rl.rlim_max);
 }
 
 /*
  * Return the current (soft) limit for a particular system resource.
  * The which parameter specifies the index into the rlimit array.
  */
 rlim_t
 lim_cur(struct proc *p, int which)
 {
 	struct rlimit rl;
 
 	lim_rlimit(p, which, &rl);
 	return (rl.rlim_cur);
 }
 
 /*
  * Return a copy of the entire rlimit structure for the system limit
  * specified by 'which' in the rlimit structure pointed to by 'rlp'.
  */
 void
 lim_rlimit(struct proc *p, int which, struct rlimit *rlp)
 {
 
 	PROC_LOCK_ASSERT(p, MA_OWNED);
 	KASSERT(which >= 0 && which < RLIM_NLIMITS,
 	    ("request for invalid resource limit"));
 	*rlp = p->p_limit->pl_rlimit[which];
 	if (p->p_sysent->sv_fixlimit != NULL)
 		p->p_sysent->sv_fixlimit(rlp, which);
 }
 
 void
 uihashinit()
 {
 
 	uihashtbl = hashinit(maxproc / 16, M_UIDINFO, &uihash);
 	rw_init(&uihashtbl_lock, "uidinfo hash");
 }
 
 /*
  * Look up a uidinfo struct for the parameter uid.
  * uihashtbl_lock must be locked.
  * Increase refcount on uidinfo struct returned.
  */
 static struct uidinfo *
 uilookup(uid_t uid)
 {
 	struct uihashhead *uipp;
 	struct uidinfo *uip;
 
 	rw_assert(&uihashtbl_lock, RA_LOCKED);
 	uipp = UIHASH(uid);
 	LIST_FOREACH(uip, uipp, ui_hash)
 		if (uip->ui_uid == uid) {
 			uihold(uip);
 			break;
 		}
 
 	return (uip);
 }
 
 /*
  * Find or allocate a struct uidinfo for a particular uid.
  * Returns with uidinfo struct referenced.
  * uifree() should be called on a struct uidinfo when released.
  */
 struct uidinfo *
 uifind(uid_t uid)
 {
 	struct uidinfo *new_uip, *uip;
 
 	rw_rlock(&uihashtbl_lock);
 	uip = uilookup(uid);
 	rw_runlock(&uihashtbl_lock);
 	if (uip != NULL)
 		return (uip);
 
 	new_uip = malloc(sizeof(*new_uip), M_UIDINFO, M_WAITOK | M_ZERO);
 	racct_create(&new_uip->ui_racct);
 	refcount_init(&new_uip->ui_ref, 1);
 	new_uip->ui_uid = uid;
 	mtx_init(&new_uip->ui_vmsize_mtx, "ui_vmsize", NULL, MTX_DEF);
 
 	rw_wlock(&uihashtbl_lock);
 	/*
 	 * There's a chance someone created our uidinfo while we
 	 * were in malloc and not holding the lock, so we have to
 	 * make sure we don't insert a duplicate uidinfo.
 	 */
 	if ((uip = uilookup(uid)) == NULL) {
 		LIST_INSERT_HEAD(UIHASH(uid), new_uip, ui_hash);
 		rw_wunlock(&uihashtbl_lock);
 		uip = new_uip;
 	} else {
 		rw_wunlock(&uihashtbl_lock);
 		racct_destroy(&new_uip->ui_racct);
 		mtx_destroy(&new_uip->ui_vmsize_mtx);
 		free(new_uip, M_UIDINFO);
 	}
 	return (uip);
 }
 
 /*
  * Place another refcount on a uidinfo struct.
  */
 void
 uihold(struct uidinfo *uip)
 {
 
 	refcount_acquire(&uip->ui_ref);
 }
 
 /*-
  * Since uidinfo structs have a long lifetime, we use an
  * opportunistic refcounting scheme to avoid locking the lookup hash
  * for each release.
  *
  * If the refcount hits 0, we need to free the structure,
  * which means we need to lock the hash.
  * Optimal case:
  *   After locking the struct and lowering the refcount, if we find
  *   that we don't need to free, simply unlock and return.
  * Suboptimal case:
  *   If refcount lowering results in need to free, bump the count
  *   back up, lose the lock and acquire the locks in the proper
  *   order to try again.
  */
 void
 uifree(struct uidinfo *uip)
 {
 	int old;
 
 	/* Prepare for optimal case. */
 	old = uip->ui_ref;
 	if (old > 1 && atomic_cmpset_int(&uip->ui_ref, old, old - 1))
 		return;
 
 	/* Prepare for suboptimal case. */
 	rw_wlock(&uihashtbl_lock);
 	if (refcount_release(&uip->ui_ref) == 0) {
 		rw_wunlock(&uihashtbl_lock);
 		return;
 	}
 
 	racct_destroy(&uip->ui_racct);
 	LIST_REMOVE(uip, ui_hash);
 	rw_wunlock(&uihashtbl_lock);
 
 	if (uip->ui_sbsize != 0)
 		printf("freeing uidinfo: uid = %d, sbsize = %ld\n",
 		    uip->ui_uid, uip->ui_sbsize);
 	if (uip->ui_proccnt != 0)
 		printf("freeing uidinfo: uid = %d, proccnt = %ld\n",
 		    uip->ui_uid, uip->ui_proccnt);
 	if (uip->ui_vmsize != 0)
 		printf("freeing uidinfo: uid = %d, swapuse = %lld\n",
 		    uip->ui_uid, (unsigned long long)uip->ui_vmsize);
 	mtx_destroy(&uip->ui_vmsize_mtx);
 	free(uip, M_UIDINFO);
 }
 
 #ifdef RACCT
 void
 ui_racct_foreach(void (*callback)(struct racct *racct,
     void *arg2, void *arg3), void *arg2, void *arg3)
 {
 	struct uidinfo *uip;
 	struct uihashhead *uih;
 
 	rw_rlock(&uihashtbl_lock);
 	for (uih = &uihashtbl[uihash]; uih >= uihashtbl; uih--) {
 		LIST_FOREACH(uip, uih, ui_hash) {
 			(callback)(uip->ui_racct, arg2, arg3);
 		}
 	}
 	rw_runlock(&uihashtbl_lock);
 }
 #endif
 
 /*
  * Change the count associated with number of processes
  * a given user is using.  When 'max' is 0, don't enforce a limit
  */
 int
 chgproccnt(struct uidinfo *uip, int diff, rlim_t max)
 {
 
 	/* Don't allow them to exceed max, but allow subtraction. */
 	if (diff > 0 && max != 0) {
 		if (atomic_fetchadd_long(&uip->ui_proccnt, (long)diff) + diff > max) {
 			atomic_subtract_long(&uip->ui_proccnt, (long)diff);
 			return (0);
 		}
 	} else {
 		atomic_add_long(&uip->ui_proccnt, (long)diff);
 		if (uip->ui_proccnt < 0)
 			printf("negative proccnt for uid = %d\n", uip->ui_uid);
 	}
 	return (1);
 }
 
 /*
  * Change the total socket buffer size a user has used.
  */
 int
 chgsbsize(struct uidinfo *uip, u_int *hiwat, u_int to, rlim_t max)
 {
 	int diff;
 
 	diff = to - *hiwat;
 	if (diff > 0) {
 		if (atomic_fetchadd_long(&uip->ui_sbsize, (long)diff) + diff > max) {
 			atomic_subtract_long(&uip->ui_sbsize, (long)diff);
 			return (0);
 		}
 	} else {
 		atomic_add_long(&uip->ui_sbsize, (long)diff);
 		if (uip->ui_sbsize < 0)
 			printf("negative sbsize for uid = %d\n", uip->ui_uid);
 	}
 	*hiwat = to;
 	return (1);
 }
 
 /*
  * Change the count associated with number of pseudo-terminals
  * a given user is using.  When 'max' is 0, don't enforce a limit
  */
 int
 chgptscnt(struct uidinfo *uip, int diff, rlim_t max)
 {
 
 	/* Don't allow them to exceed max, but allow subtraction. */
 	if (diff > 0 && max != 0) {
 		if (atomic_fetchadd_long(&uip->ui_ptscnt, (long)diff) + diff > max) {
 			atomic_subtract_long(&uip->ui_ptscnt, (long)diff);
 			return (0);
 		}
 	} else {
 		atomic_add_long(&uip->ui_ptscnt, (long)diff);
 		if (uip->ui_ptscnt < 0)
 			printf("negative ptscnt for uid = %d\n", uip->ui_uid);
 	}
 	return (1);
 }
 
 int
 chgkqcnt(struct uidinfo *uip, int diff, rlim_t max)
 {
 
 	if (diff > 0 && max != 0) {
 		if (atomic_fetchadd_long(&uip->ui_kqcnt, (long)diff) +
 		    diff > max) {
 			atomic_subtract_long(&uip->ui_kqcnt, (long)diff);
 			return (0);
 		}
 	} else {
 		atomic_add_long(&uip->ui_kqcnt, (long)diff);
 		if (uip->ui_kqcnt < 0)
 			printf("negative kqcnt for uid = %d\n", uip->ui_uid);
 	}
 	return (1);
 }
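
The chgproccnt(), chgsbsize(), chgptscnt() and chgkqcnt() routines above share
one pattern: add optimistically with an atomic fetch-and-add, then undo the
addition if the limit was exceeded. The following is a minimal userland sketch
of that pattern, assuming C11 <stdatomic.h> in place of the kernel's
atomic_fetchadd_long(9) family. The name chg_count and its signature are
hypothetical and are not part of the kernel sources.

```c
#include <stdatomic.h>

/*
 * Hypothetical userland analogue of the chg*cnt() routines.
 * Adds diff to *cnt.  When diff is positive and max is nonzero, the
 * change is refused (returns 0) if the new total would exceed max, and
 * the optimistic addition is rolled back.  Otherwise the change is
 * applied and 1 is returned.
 */
static int
chg_count(atomic_long *cnt, int diff, long max)
{

	if (diff > 0 && max != 0) {
		/* atomic_fetch_add() returns the value before the add. */
		if (atomic_fetch_add(cnt, (long)diff) + diff > max) {
			atomic_fetch_sub(cnt, (long)diff);
			return (0);
		}
	} else {
		/* Subtractions and unlimited additions always succeed. */
		atomic_fetch_add(cnt, (long)diff);
	}
	return (1);
}
```

Because the addition happens before the check, a concurrent caller may briefly
observe a total above max. The rollback restores the count without taking a
lock, which appears to be the trade-off the kernel routines make as well.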
Index: user/ngie/more-tests/sys/kern/kern_timeout.c
===================================================================
--- user/ngie/more-tests/sys/kern/kern_timeout.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/kern_timeout.c	(revision 281585)
@@ -1,1594 +1,1596 @@
 /*-
  * Copyright (c) 1982, 1986, 1991, 1993
  *	The Regents of the University of California.  All rights reserved.
  * (c) UNIX System Laboratories, Inc.
  * All or some portions of this file are derived from material licensed
  * to the University of California by American Telephone and Telegraph
  * Co. or Unix System Laboratories, Inc. and are reproduced herein with
  * the permission of UNIX System Laboratories, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	From: @(#)kern_clock.c	8.5 (Berkeley) 1/21/94
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_callout_profiling.h"
 #if defined(__arm__)
 #include "opt_timer.h"
 #endif
 #include "opt_rss.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/bus.h>
 #include <sys/callout.h>
 #include <sys/file.h>
 #include <sys/interrupt.h>
 #include <sys/kernel.h>
 #include <sys/ktr.h>
 #include <sys/lock.h>
 #include <sys/malloc.h>
 #include <sys/mutex.h>
 #include <sys/proc.h>
 #include <sys/sdt.h>
 #include <sys/sleepqueue.h>
 #include <sys/sysctl.h>
 #include <sys/smp.h>
 
 #ifdef SMP
 #include <machine/cpu.h>
 #endif
 
 #ifndef NO_EVENTTIMERS
 DPCPU_DECLARE(sbintime_t, hardclocktime);
 #endif
 
 SDT_PROVIDER_DEFINE(callout_execute);
 SDT_PROBE_DEFINE1(callout_execute, kernel, , callout__start,
     "struct callout *");
 SDT_PROBE_DEFINE1(callout_execute, kernel, , callout__end,
     "struct callout *");
 
 #ifdef CALLOUT_PROFILING
 static int avg_depth;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_depth, CTLFLAG_RD, &avg_depth, 0,
     "Average number of items examined per softclock call. Units = 1/1000");
 static int avg_gcalls;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_gcalls, CTLFLAG_RD, &avg_gcalls, 0,
     "Average number of Giant callouts made per softclock call. Units = 1/1000");
 static int avg_lockcalls;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls, CTLFLAG_RD, &avg_lockcalls, 0,
     "Average number of lock callouts made per softclock call. Units = 1/1000");
 static int avg_mpcalls;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls, CTLFLAG_RD, &avg_mpcalls, 0,
     "Average number of MP callouts made per softclock call. Units = 1/1000");
 static int avg_depth_dir;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_depth_dir, CTLFLAG_RD, &avg_depth_dir, 0,
     "Average number of direct callouts examined per callout_process call. "
     "Units = 1/1000");
 static int avg_lockcalls_dir;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_lockcalls_dir, CTLFLAG_RD,
     &avg_lockcalls_dir, 0, "Average number of lock direct callouts made per "
     "callout_process call. Units = 1/1000");
 static int avg_mpcalls_dir;
 SYSCTL_INT(_debug, OID_AUTO, to_avg_mpcalls_dir, CTLFLAG_RD, &avg_mpcalls_dir,
     0, "Average number of MP direct callouts made per callout_process call. "
     "Units = 1/1000");
 #endif
 
 static int ncallout;
 SYSCTL_INT(_kern, OID_AUTO, ncallout, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &ncallout, 0,
     "Number of entries in callwheel and size of timeout() preallocation");
 
 #ifdef	RSS
 static int pin_default_swi = 1;
 static int pin_pcpu_swi = 1;
 #else
 static int pin_default_swi = 0;
 static int pin_pcpu_swi = 0;
 #endif
 
 SYSCTL_INT(_kern, OID_AUTO, pin_default_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_default_swi,
     0, "Pin the default (non-per-cpu) swi (shared with PCPU 0 swi)");
 SYSCTL_INT(_kern, OID_AUTO, pin_pcpu_swi, CTLFLAG_RDTUN | CTLFLAG_NOFETCH, &pin_pcpu_swi,
     0, "Pin the per-CPU swis (except PCPU 0, which is also default)");
 
 /*
  * TODO:
  *	allocate more timeout table slots when table overflows.
  */
 u_int callwheelsize, callwheelmask;
 
 /*
  * The callout cpu exec entities hold the information needed to describe
  * the state of callouts currently running on the CPU and the information
  * needed to migrate callouts to a new callout cpu. In particular,
  * the first entry of the array cc_exec_entity holds information for a
  * callout running in SWI thread context, while the second one holds
  * information for a callout running directly from hardware interrupt
  * context.  The cached information is very important for deferring
  * migration when the migrating callout is already running.
  */
 struct cc_exec {
 	struct callout		*cc_curr;
 #ifdef SMP
 	void			(*ce_migration_func)(void *);
 	void			*ce_migration_arg;
 	int			ce_migration_cpu;
 	sbintime_t		ce_migration_time;
 	sbintime_t		ce_migration_prec;
 #endif
 	bool			cc_cancel;
 	bool			cc_waiting;
 };
 
 /*
  * There is one struct callout_cpu per cpu, holding all relevant
  * state for the callout processing thread on the individual CPU.
  */
 struct callout_cpu {
 	struct mtx_padalign	cc_lock;
 	struct cc_exec 		cc_exec_entity[2];
 	struct callout		*cc_next;
 	struct callout		*cc_callout;
 	struct callout_list	*cc_callwheel;
 	struct callout_tailq	cc_expireq;
 	struct callout_slist	cc_callfree;
 	sbintime_t		cc_firstevent;
 	sbintime_t		cc_lastscan;
 	void			*cc_cookie;
 	u_int			cc_bucket;
 	u_int			cc_inited;
 	char			cc_ktr_event_name[20];
 };
 
 #define	callout_migrating(c)	((c)->c_iflags & CALLOUT_DFRMIGRATION)
 
 #define	cc_exec_curr(cc, dir)		cc->cc_exec_entity[dir].cc_curr
 #define	cc_exec_next(cc)		cc->cc_next
 #define	cc_exec_cancel(cc, dir)		cc->cc_exec_entity[dir].cc_cancel
 #define	cc_exec_waiting(cc, dir)	cc->cc_exec_entity[dir].cc_waiting
 #ifdef SMP
 #define	cc_migration_func(cc, dir)	cc->cc_exec_entity[dir].ce_migration_func
 #define	cc_migration_arg(cc, dir)	cc->cc_exec_entity[dir].ce_migration_arg
 #define	cc_migration_cpu(cc, dir)	cc->cc_exec_entity[dir].ce_migration_cpu
 #define	cc_migration_time(cc, dir)	cc->cc_exec_entity[dir].ce_migration_time
 #define	cc_migration_prec(cc, dir)	cc->cc_exec_entity[dir].ce_migration_prec
 
 struct callout_cpu cc_cpu[MAXCPU];
 #define	CPUBLOCK	MAXCPU
 #define	CC_CPU(cpu)	(&cc_cpu[(cpu)])
 #define	CC_SELF()	CC_CPU(PCPU_GET(cpuid))
 #else
 struct callout_cpu cc_cpu;
 #define	CC_CPU(cpu)	&cc_cpu
 #define	CC_SELF()	&cc_cpu
 #endif
 #define	CC_LOCK(cc)	mtx_lock_spin(&(cc)->cc_lock)
 #define	CC_UNLOCK(cc)	mtx_unlock_spin(&(cc)->cc_lock)
 #define	CC_LOCK_ASSERT(cc)	mtx_assert(&(cc)->cc_lock, MA_OWNED)
 
 static int timeout_cpu;
 
 static void	callout_cpu_init(struct callout_cpu *cc, int cpu);
 static void	softclock_call_cc(struct callout *c, struct callout_cpu *cc,
 #ifdef CALLOUT_PROFILING
 		    int *mpcalls, int *lockcalls, int *gcalls,
 #endif
 		    int direct);
 
 static MALLOC_DEFINE(M_CALLOUT, "callout", "Callout datastructures");
 
 /**
  * Locked by cc_lock:
  *   cc_curr         - If a callout is in progress, it is cc_curr.
  *                     If cc_curr is non-NULL, threads waiting in
  *                     callout_drain() will be woken up as soon as the
  *                     relevant callout completes.
  *   cc_cancel       - Changing to 1 with both callout_lock and cc_lock held
  *                     guarantees that the current callout will not run.
  *                     The softclock() function sets this to 0 before it
  *                     drops callout_lock to acquire c_lock, and it calls
  *                     the handler only if cc_cancel is still 0 after
  *                     cc_lock is successfully acquired.
  *   cc_waiting      - If a thread is waiting in callout_drain(), then
  *                     cc_waiting is nonzero.  Set only when
  *                     cc_curr is non-NULL.
  */
 
 /*
  * Resets the execution entity tied to a specific callout cpu.
  */
 static void
 cc_cce_cleanup(struct callout_cpu *cc, int direct)
 {
 
 	cc_exec_curr(cc, direct) = NULL;
 	cc_exec_cancel(cc, direct) = false;
 	cc_exec_waiting(cc, direct) = false;
 #ifdef SMP
 	cc_migration_cpu(cc, direct) = CPUBLOCK;
 	cc_migration_time(cc, direct) = 0;
 	cc_migration_prec(cc, direct) = 0;
 	cc_migration_func(cc, direct) = NULL;
 	cc_migration_arg(cc, direct) = NULL;
 #endif
 }
 
 /*
  * Checks if migration is requested by a specific callout cpu.
  */
 static int
 cc_cce_migrating(struct callout_cpu *cc, int direct)
 {
 
 #ifdef SMP
 	return (cc_migration_cpu(cc, direct) != CPUBLOCK);
 #else
 	return (0);
 #endif
 }
 
 /*
  * Kernel low level callwheel initialization
  * called on cpu0 during kernel startup.
  */
 static void
 callout_callwheel_init(void *dummy)
 {
 	struct callout_cpu *cc;
 
 	/*
 	 * Calculate the size of the callout wheel and the preallocated
 	 * timeout() structures.
 	 * XXX: Clip callout to result of previous function of maxusers
 	 * maximum 384.  This is still huge, but acceptable.
 	 */
 	memset(CC_CPU(0), 0, sizeof(cc_cpu));
 	ncallout = imin(16 + maxproc + maxfiles, 18508);
 	TUNABLE_INT_FETCH("kern.ncallout", &ncallout);
 
 	/*
 	 * Calculate callout wheel size, should be next power of two higher
 	 * than 'ncallout'.
 	 */
 	callwheelsize = 1 << fls(ncallout);
 	callwheelmask = callwheelsize - 1;
 
 	/*
 	 * Fetch whether we're pinning the swi's or not.
 	 */
 	TUNABLE_INT_FETCH("kern.pin_default_swi", &pin_default_swi);
 	TUNABLE_INT_FETCH("kern.pin_pcpu_swi", &pin_pcpu_swi);
 
 	/*
 	 * Only cpu0 handles timeout(9) and receives a preallocation.
 	 *
 	 * XXX: Once all timeout(9) consumers are converted this can
 	 * be removed.
 	 */
 	timeout_cpu = PCPU_GET(cpuid);
 	cc = CC_CPU(timeout_cpu);
 	cc->cc_callout = malloc(ncallout * sizeof(struct callout),
 	    M_CALLOUT, M_WAITOK);
 	callout_cpu_init(cc, timeout_cpu);
 }
 SYSINIT(callwheel_init, SI_SUB_CPU, SI_ORDER_ANY, callout_callwheel_init, NULL);
 
 /*
  * Initialize the per-cpu callout structures.
  */
 static void
 callout_cpu_init(struct callout_cpu *cc, int cpu)
 {
 	struct callout *c;
 	int i;
 
 	mtx_init(&cc->cc_lock, "callout", NULL, MTX_SPIN | MTX_RECURSE);
 	SLIST_INIT(&cc->cc_callfree);
 	cc->cc_inited = 1;
 	cc->cc_callwheel = malloc(sizeof(struct callout_list) * callwheelsize,
 	    M_CALLOUT, M_WAITOK);
 	for (i = 0; i < callwheelsize; i++)
 		LIST_INIT(&cc->cc_callwheel[i]);
 	TAILQ_INIT(&cc->cc_expireq);
 	cc->cc_firstevent = SBT_MAX;
 	for (i = 0; i < 2; i++)
 		cc_cce_cleanup(cc, i);
 	snprintf(cc->cc_ktr_event_name, sizeof(cc->cc_ktr_event_name),
 	    "callwheel cpu %d", cpu);
 	if (cc->cc_callout == NULL)	/* Only cpu0 handles timeout(9) */
 		return;
 	for (i = 0; i < ncallout; i++) {
 		c = &cc->cc_callout[i];
 		callout_init(c, 0);
 		c->c_iflags = CALLOUT_LOCAL_ALLOC;
 		SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle);
 	}
 }
 
 #ifdef SMP
 /*
  * Switches the cpu tied to a specific callout.
  * The function expects the incoming callout cpu to be locked and returns
  * with the new callout cpu locked.
  */
 static struct callout_cpu *
 callout_cpu_switch(struct callout *c, struct callout_cpu *cc, int new_cpu)
 {
 	struct callout_cpu *new_cc;
 
 	MPASS(c != NULL && cc != NULL);
 	CC_LOCK_ASSERT(cc);
 
 	/*
 	 * Avoid interrupts and preemption firing after the callout cpu
 	 * is blocked in order to avoid deadlocks as the new thread
 	 * may be willing to acquire the callout cpu lock.
 	 */
 	c->c_cpu = CPUBLOCK;
 	spinlock_enter();
 	CC_UNLOCK(cc);
 	new_cc = CC_CPU(new_cpu);
 	CC_LOCK(new_cc);
 	spinlock_exit();
 	c->c_cpu = new_cpu;
 	return (new_cc);
 }
 #endif
 
 /*
  * Start standard softclock thread.
  */
 static void
 start_softclock(void *dummy)
 {
 	struct callout_cpu *cc;
 	char name[MAXCOMLEN];
 #ifdef SMP
 	int cpu;
 	struct intr_event *ie;
 #endif
 
 	cc = CC_CPU(timeout_cpu);
 	snprintf(name, sizeof(name), "clock (%d)", timeout_cpu);
 	if (swi_add(&clk_intr_event, name, softclock, cc, SWI_CLOCK,
 	    INTR_MPSAFE, &cc->cc_cookie))
 		panic("died while creating standard software ithreads");
 	if (pin_default_swi &&
 	    (intr_event_bind(clk_intr_event, timeout_cpu) != 0)) {
 		printf("%s: timeout clock couldn't be pinned to cpu %d\n",
 		    __func__,
 		    timeout_cpu);
 	}
 
 #ifdef SMP
 	CPU_FOREACH(cpu) {
 		if (cpu == timeout_cpu)
 			continue;
 		cc = CC_CPU(cpu);
 		cc->cc_callout = NULL;	/* Only cpu0 handles timeout(9). */
 		callout_cpu_init(cc, cpu);
 		snprintf(name, sizeof(name), "clock (%d)", cpu);
 		ie = NULL;
 		if (swi_add(&ie, name, softclock, cc, SWI_CLOCK,
 		    INTR_MPSAFE, &cc->cc_cookie))
 			panic("died while creating standard software ithreads");
 		if (pin_pcpu_swi && (intr_event_bind(ie, cpu) != 0)) {
 			printf("%s: per-cpu clock couldn't be pinned to "
 			    "cpu %d\n",
 			    __func__,
 			    cpu);
 		}
 	}
 #endif
 }
 SYSINIT(start_softclock, SI_SUB_SOFTINTR, SI_ORDER_FIRST, start_softclock, NULL);
 
 #define	CC_HASH_SHIFT	8
 
 static inline u_int
 callout_hash(sbintime_t sbt)
 {
 
 	return (sbt >> (32 - CC_HASH_SHIFT));
 }
 
 static inline u_int
 callout_get_bucket(sbintime_t sbt)
 {
 
 	return (callout_hash(sbt) & callwheelmask);
 }
 
 void
 callout_process(sbintime_t now)
 {
 	struct callout *tmp, *tmpn;
 	struct callout_cpu *cc;
 	struct callout_list *sc;
 	sbintime_t first, last, max, tmp_max;
 	uint32_t lookahead;
 	u_int firstb, lastb, nowb;
 #ifdef CALLOUT_PROFILING
 	int depth_dir = 0, mpcalls_dir = 0, lockcalls_dir = 0;
 #endif
 
 	cc = CC_SELF();
 	mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET);
 
 	/* Compute the buckets of the last scan and present times. */
 	firstb = callout_hash(cc->cc_lastscan);
 	cc->cc_lastscan = now;
 	nowb = callout_hash(now);
 
 	/* Compute the last bucket and minimum time of the bucket after it. */
 	if (nowb == firstb)
 		lookahead = (SBT_1S / 16);
 	else if (nowb - firstb == 1)
 		lookahead = (SBT_1S / 8);
 	else
 		lookahead = (SBT_1S / 2);
 	first = last = now;
 	first += (lookahead / 2);
 	last += lookahead;
 	last &= (0xffffffffffffffffLLU << (32 - CC_HASH_SHIFT));
 	lastb = callout_hash(last) - 1;
 	max = last;
 
 	/*
 	 * Check if we wrapped around the entire wheel from the last scan.
 	 * In that case, we need to scan the entire wheel for pending callouts.
 	 */
 	if (lastb - firstb >= callwheelsize) {
 		lastb = firstb + callwheelsize - 1;
 		if (nowb - firstb >= callwheelsize)
 			nowb = lastb;
 	}
 
 	/* Iterate callwheel from firstb to nowb and then up to lastb. */
 	do {
 		sc = &cc->cc_callwheel[firstb & callwheelmask];
 		tmp = LIST_FIRST(sc);
 		while (tmp != NULL) {
 			/* Run the callout if its scheduled time has arrived. */
 			if (tmp->c_time <= now) {
 				/*
 				 * Consumer told us the callout may be run
 				 * directly from hardware interrupt context.
 				 */
 				if (tmp->c_iflags & CALLOUT_DIRECT) {
 #ifdef CALLOUT_PROFILING
 					++depth_dir;
 #endif
 					cc_exec_next(cc) =
 					    LIST_NEXT(tmp, c_links.le);
 					cc->cc_bucket = firstb & callwheelmask;
 					LIST_REMOVE(tmp, c_links.le);
 					softclock_call_cc(tmp, cc,
 #ifdef CALLOUT_PROFILING
 					    &mpcalls_dir, &lockcalls_dir, NULL,
 #endif
 					    1);
 					tmp = cc_exec_next(cc);
 					cc_exec_next(cc) = NULL;
 				} else {
 					tmpn = LIST_NEXT(tmp, c_links.le);
 					LIST_REMOVE(tmp, c_links.le);
 					TAILQ_INSERT_TAIL(&cc->cc_expireq,
 					    tmp, c_links.tqe);
 					tmp->c_iflags |= CALLOUT_PROCESSED;
 					tmp = tmpn;
 				}
 				continue;
 			}
 			/* Skip events from distant future. */
 			if (tmp->c_time >= max)
 				goto next;
 			/*
 			 * Event minimal time is bigger than present maximal
 			 * time, so it cannot be aggregated.
 			 */
 			if (tmp->c_time > last) {
 				lastb = nowb;
 				goto next;
 			}
 			/* Update first and last time, respecting this event. */
 			if (tmp->c_time < first)
 				first = tmp->c_time;
 			tmp_max = tmp->c_time + tmp->c_precision;
 			if (tmp_max < last)
 				last = tmp_max;
 next:
 			tmp = LIST_NEXT(tmp, c_links.le);
 		}
 		/* Proceed with the next bucket. */
 		firstb++;
 		/*
 		 * Stop if we looked after present time and found
 		 * some event we can't execute at now.
 		 * Stop if we looked far enough into the future.
 		 */
 	} while (((int)(firstb - lastb)) <= 0);
 	cc->cc_firstevent = last;
 #ifndef NO_EVENTTIMERS
 	cpu_new_callout(curcpu, last, first);
 #endif
 #ifdef CALLOUT_PROFILING
 	avg_depth_dir += (depth_dir * 1000 - avg_depth_dir) >> 8;
 	avg_mpcalls_dir += (mpcalls_dir * 1000 - avg_mpcalls_dir) >> 8;
 	avg_lockcalls_dir += (lockcalls_dir * 1000 - avg_lockcalls_dir) >> 8;
 #endif
 	mtx_unlock_spin_flags(&cc->cc_lock, MTX_QUIET);
 	/*
 	 * swi_sched acquires the thread lock, so we don't want to call it
 	 * with cc_lock held; incorrect locking order.
 	 */
 	if (!TAILQ_EMPTY(&cc->cc_expireq))
 		swi_sched(cc->cc_cookie, 0);
 }
 
 static struct callout_cpu *
 callout_lock(struct callout *c)
 {
 	struct callout_cpu *cc;
 	int cpu;
 
 	for (;;) {
 		cpu = c->c_cpu;
 #ifdef SMP
 		if (cpu == CPUBLOCK) {
 			while (c->c_cpu == CPUBLOCK)
 				cpu_spinwait();
 			continue;
 		}
 #endif
 		cc = CC_CPU(cpu);
 		CC_LOCK(cc);
 		if (cpu == c->c_cpu)
 			break;
 		CC_UNLOCK(cc);
 	}
 	return (cc);
 }
 
 static void
 callout_cc_add(struct callout *c, struct callout_cpu *cc,
     sbintime_t sbt, sbintime_t precision, void (*func)(void *),
     void *arg, int cpu, int flags)
 {
 	int bucket;
 
 	CC_LOCK_ASSERT(cc);
 	if (sbt < cc->cc_lastscan)
 		sbt = cc->cc_lastscan;
 	c->c_arg = arg;
 	c->c_iflags |= CALLOUT_PENDING;
 	c->c_iflags &= ~CALLOUT_PROCESSED;
 	c->c_flags |= CALLOUT_ACTIVE;
+	if (flags & C_DIRECT_EXEC)
+		c->c_iflags |= CALLOUT_DIRECT;
 	c->c_func = func;
 	c->c_time = sbt;
 	c->c_precision = precision;
 	bucket = callout_get_bucket(c->c_time);
 	CTR3(KTR_CALLOUT, "precision set for %p: %d.%08x",
 	    c, (int)(c->c_precision >> 32),
 	    (u_int)(c->c_precision & 0xffffffff));
 	LIST_INSERT_HEAD(&cc->cc_callwheel[bucket], c, c_links.le);
 	if (cc->cc_bucket == bucket)
 		cc_exec_next(cc) = c;
 #ifndef NO_EVENTTIMERS
 	/*
 	 * Inform the eventtimers(4) subsystem there's a new callout
 	 * that has been inserted, but only if really required.
 	 */
 	if (SBT_MAX - c->c_time < c->c_precision)
 		c->c_precision = SBT_MAX - c->c_time;
 	sbt = c->c_time + c->c_precision;
 	if (sbt < cc->cc_firstevent) {
 		cc->cc_firstevent = sbt;
 		cpu_new_callout(cpu, sbt, c->c_time);
 	}
 #endif
 }
 
 static void
 callout_cc_del(struct callout *c, struct callout_cpu *cc)
 {
 
 	if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) == 0)
 		return;
 	c->c_func = NULL;
 	SLIST_INSERT_HEAD(&cc->cc_callfree, c, c_links.sle);
 }
 
 static void
 softclock_call_cc(struct callout *c, struct callout_cpu *cc,
 #ifdef CALLOUT_PROFILING
     int *mpcalls, int *lockcalls, int *gcalls,
 #endif
     int direct)
 {
 	struct rm_priotracker tracker;
 	void (*c_func)(void *);
 	void *c_arg;
 	struct lock_class *class;
 	struct lock_object *c_lock;
 	uintptr_t lock_status;
 	int c_iflags;
 #ifdef SMP
 	struct callout_cpu *new_cc;
 	void (*new_func)(void *);
 	void *new_arg;
 	int flags, new_cpu;
 	sbintime_t new_prec, new_time;
 #endif
 #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING) 
 	sbintime_t sbt1, sbt2;
 	struct timespec ts2;
 	static sbintime_t maxdt = 2 * SBT_1MS;	/* 2 msec */
 	static timeout_t *lastfunc;
 #endif
 
 	KASSERT((c->c_iflags & CALLOUT_PENDING) == CALLOUT_PENDING,
 	    ("softclock_call_cc: pend %p %x", c, c->c_iflags));
 	KASSERT((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE,
 	    ("softclock_call_cc: act %p %x", c, c->c_flags));
 	class = (c->c_lock != NULL) ? LOCK_CLASS(c->c_lock) : NULL;
 	lock_status = 0;
 	if (c->c_flags & CALLOUT_SHAREDLOCK) {
 		if (class == &lock_class_rm)
 			lock_status = (uintptr_t)&tracker;
 		else
 			lock_status = 1;
 	}
 	c_lock = c->c_lock;
 	c_func = c->c_func;
 	c_arg = c->c_arg;
 	c_iflags = c->c_iflags;
 	if (c->c_iflags & CALLOUT_LOCAL_ALLOC)
 		c->c_iflags = CALLOUT_LOCAL_ALLOC;
 	else
 		c->c_iflags &= ~CALLOUT_PENDING;
 	
 	cc_exec_curr(cc, direct) = c;
 	cc_exec_cancel(cc, direct) = false;
 	CC_UNLOCK(cc);
 	if (c_lock != NULL) {
 		class->lc_lock(c_lock, lock_status);
 		/*
 		 * The callout may have been cancelled
 		 * while we switched locks.
 		 */
 		if (cc_exec_cancel(cc, direct)) {
 			class->lc_unlock(c_lock);
 			goto skip;
 		}
 		/* The callout cannot be stopped now. */
 		cc_exec_cancel(cc, direct) = true;
 		if (c_lock == &Giant.lock_object) {
 #ifdef CALLOUT_PROFILING
 			(*gcalls)++;
 #endif
 			CTR3(KTR_CALLOUT, "callout giant %p func %p arg %p",
 			    c, c_func, c_arg);
 		} else {
 #ifdef CALLOUT_PROFILING
 			(*lockcalls)++;
 #endif
 			CTR3(KTR_CALLOUT, "callout lock %p func %p arg %p",
 			    c, c_func, c_arg);
 		}
 	} else {
 #ifdef CALLOUT_PROFILING
 		(*mpcalls)++;
 #endif
 		CTR3(KTR_CALLOUT, "callout %p func %p arg %p",
 		    c, c_func, c_arg);
 	}
 	KTR_STATE3(KTR_SCHED, "callout", cc->cc_ktr_event_name, "running",
 	    "func:%p", c_func, "arg:%p", c_arg, "direct:%d", direct);
 #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING)
 	sbt1 = sbinuptime();
 #endif
 	THREAD_NO_SLEEPING();
 	SDT_PROBE(callout_execute, kernel, , callout__start, c, 0, 0, 0, 0);
 	c_func(c_arg);
 	SDT_PROBE(callout_execute, kernel, , callout__end, c, 0, 0, 0, 0);
 	THREAD_SLEEPING_OK();
 #if defined(DIAGNOSTIC) || defined(CALLOUT_PROFILING)
 	sbt2 = sbinuptime();
 	sbt2 -= sbt1;
 	if (sbt2 > maxdt) {
 		if (lastfunc != c_func || sbt2 > maxdt * 2) {
 			ts2 = sbttots(sbt2);
 			printf(
 		"Expensive timeout(9) function: %p(%p) %jd.%09ld s\n",
 			    c_func, c_arg, (intmax_t)ts2.tv_sec, ts2.tv_nsec);
 		}
 		maxdt = sbt2;
 		lastfunc = c_func;
 	}
 #endif
 	KTR_STATE0(KTR_SCHED, "callout", cc->cc_ktr_event_name, "idle");
 	CTR1(KTR_CALLOUT, "callout %p finished", c);
 	if ((c_iflags & CALLOUT_RETURNUNLOCKED) == 0)
 		class->lc_unlock(c_lock);
 skip:
 	CC_LOCK(cc);
 	KASSERT(cc_exec_curr(cc, direct) == c, ("mishandled cc_curr"));
 	cc_exec_curr(cc, direct) = NULL;
 	if (cc_exec_waiting(cc, direct)) {
 		/*
 		 * There is someone waiting for the
 		 * callout to complete.
 		 * If the callout was scheduled for
 		 * migration just cancel it.
 		 */
 		if (cc_cce_migrating(cc, direct)) {
 			cc_cce_cleanup(cc, direct);
 
 			/*
 			 * It should be asserted here that the callout is not
 			 * destroyed but that is not easy.
 			 */
 			c->c_iflags &= ~CALLOUT_DFRMIGRATION;
 		}
 		cc_exec_waiting(cc, direct) = false;
 		CC_UNLOCK(cc);
 		wakeup(&cc_exec_waiting(cc, direct));
 		CC_LOCK(cc);
 	} else if (cc_cce_migrating(cc, direct)) {
 		KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0,
 		    ("Migrating legacy callout %p", c));
 #ifdef SMP
 		/*
 		 * If the callout was scheduled for
 		 * migration just perform it now.
 		 */
 		new_cpu = cc_migration_cpu(cc, direct);
 		new_time = cc_migration_time(cc, direct);
 		new_prec = cc_migration_prec(cc, direct);
 		new_func = cc_migration_func(cc, direct);
 		new_arg = cc_migration_arg(cc, direct);
 		cc_cce_cleanup(cc, direct);
 
 		/*
 		 * It should be asserted here that the callout is not destroyed
 		 * but that is not easy.
 		 *
 		 * First of all, handle deferred callout stops.
 		 */
 		if (!callout_migrating(c)) {
 			CTR3(KTR_CALLOUT,
 			     "deferred cancelled %p func %p arg %p",
 			     c, new_func, new_arg);
 			callout_cc_del(c, cc);
 			return;
 		}
 		c->c_iflags &= ~CALLOUT_DFRMIGRATION;
 
 		new_cc = callout_cpu_switch(c, cc, new_cpu);
 		flags = (direct) ? C_DIRECT_EXEC : 0;
 		callout_cc_add(c, new_cc, new_time, new_prec, new_func,
 		    new_arg, new_cpu, flags);
 		CC_UNLOCK(new_cc);
 		CC_LOCK(cc);
 #else
 		panic("migration should not happen");
 #endif
 	}
 	/*
 	 * If the current callout is locally allocated (from
 	 * timeout(9)) then put it on the freelist.
 	 *
 	 * Note: we need to check the cached copy of c_iflags because
 	 * if it was not local, then it's not safe to deref the
 	 * callout pointer.
 	 */
 	KASSERT((c_iflags & CALLOUT_LOCAL_ALLOC) == 0 ||
 	    c->c_iflags == CALLOUT_LOCAL_ALLOC,
 	    ("corrupted callout"));
 	if (c_iflags & CALLOUT_LOCAL_ALLOC)
 		callout_cc_del(c, cc);
 }
 
 /*
  * The callout mechanism is based on the work of Adam M. Costello and
  * George Varghese, published in a technical report entitled "Redesigning
  * the BSD Callout and Timer Facilities" and modified slightly for inclusion
  * in FreeBSD by Justin T. Gibbs.  The original work on the data structures
  * used in this implementation was published by G. Varghese and T. Lauck in
  * the paper "Hashed and Hierarchical Timing Wheels: Data Structures for
  * the Efficient Implementation of a Timer Facility" in the Proceedings of
  * the 11th ACM Annual Symposium on Operating Systems Principles,
  * Austin, Texas Nov 1987.
  */
 
 /*
  * Software (low priority) clock interrupt.
  * Run periodic events from timeout queue.
  */
 void
 softclock(void *arg)
 {
 	struct callout_cpu *cc;
 	struct callout *c;
 #ifdef CALLOUT_PROFILING
 	int depth = 0, gcalls = 0, lockcalls = 0, mpcalls = 0;
 #endif
 
 	cc = (struct callout_cpu *)arg;
 	CC_LOCK(cc);
 	while ((c = TAILQ_FIRST(&cc->cc_expireq)) != NULL) {
 		TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe);
 		softclock_call_cc(c, cc,
 #ifdef CALLOUT_PROFILING
 		    &mpcalls, &lockcalls, &gcalls,
 #endif
 		    0);
 #ifdef CALLOUT_PROFILING
 		++depth;
 #endif
 	}
 #ifdef CALLOUT_PROFILING
 	avg_depth += (depth * 1000 - avg_depth) >> 8;
 	avg_mpcalls += (mpcalls * 1000 - avg_mpcalls) >> 8;
 	avg_lockcalls += (lockcalls * 1000 - avg_lockcalls) >> 8;
 	avg_gcalls += (gcalls * 1000 - avg_gcalls) >> 8;
 #endif
 	CC_UNLOCK(cc);
 }
 
 /*
  * timeout --
  *	Execute a function after a specified length of time.
  *
  * untimeout --
  *	Cancel previous timeout function call.
  *
  * callout_handle_init --
  *	Initialize a handle so that using it with untimeout is benign.
  *
  *	See AT&T BCI Driver Reference Manual for specification.  This
  *	implementation differs from that one in that although an
  *	identification value is returned from timeout, the original
  *	arguments to timeout as well as the identifier are used to
  *	identify entries for untimeout.
  */
 struct callout_handle
 timeout(timeout_t *ftn, void *arg, int to_ticks)
 {
 	struct callout_cpu *cc;
 	struct callout *new;
 	struct callout_handle handle;
 
 	cc = CC_CPU(timeout_cpu);
 	CC_LOCK(cc);
 	/* Fill in the next free callout structure. */
 	new = SLIST_FIRST(&cc->cc_callfree);
 	if (new == NULL)
 		/* XXX Attempt to malloc first */
 		panic("timeout table full");
 	SLIST_REMOVE_HEAD(&cc->cc_callfree, c_links.sle);
 	callout_reset(new, to_ticks, ftn, arg);
 	handle.callout = new;
 	CC_UNLOCK(cc);
 
 	return (handle);
 }
 
 void
 untimeout(timeout_t *ftn, void *arg, struct callout_handle handle)
 {
 	struct callout_cpu *cc;
 
 	/*
 	 * Check for a handle that was initialized
 	 * by callout_handle_init, but never used
 	 * for a real timeout.
 	 */
 	if (handle.callout == NULL)
 		return;
 
 	cc = callout_lock(handle.callout);
 	if (handle.callout->c_func == ftn && handle.callout->c_arg == arg)
 		callout_stop(handle.callout);
 	CC_UNLOCK(cc);
 }
 
 void
 callout_handle_init(struct callout_handle *handle)
 {
 	handle->callout = NULL;
 }
 
 /*
  * New interface; clients allocate their own callout structures.
  *
  * callout_reset() - establish or change a timeout
  * callout_stop() - disestablish a timeout
  * callout_init() - initialize a callout structure so that it can
  *	safely be passed to callout_reset() and callout_stop()
  *
  * <sys/callout.h> defines three convenience macros:
  *
  * callout_active() - returns truth if callout has not been stopped,
  *	drained, or deactivated since the last time the callout was
  *	reset.
  * callout_pending() - returns truth if callout is still waiting for timeout
  * callout_deactivate() - marks the callout as having been serviced
  */
 int
 callout_reset_sbt_on(struct callout *c, sbintime_t sbt, sbintime_t precision,
     void (*ftn)(void *), void *arg, int cpu, int flags)
 {
 	sbintime_t to_sbt, pr;
 	struct callout_cpu *cc;
 	int cancelled, direct;
 	int ignore_cpu=0;
 
 	cancelled = 0;
 	if (cpu == -1) {
 		ignore_cpu = 1;
 	} else if ((cpu >= MAXCPU) ||
 		   ((CC_CPU(cpu))->cc_inited == 0)) {
 		/* Invalid CPU spec */
 		panic("Invalid CPU in callout %d", cpu);
 	}
 	if (flags & C_ABSOLUTE) {
 		to_sbt = sbt;
 	} else {
 		if ((flags & C_HARDCLOCK) && (sbt < tick_sbt))
 			sbt = tick_sbt;
 		if ((flags & C_HARDCLOCK) ||
 #ifdef NO_EVENTTIMERS
 		    sbt >= sbt_timethreshold) {
 			to_sbt = getsbinuptime();
 
 			/* Add safety belt for the case of hz > 1000. */
 			to_sbt += tc_tick_sbt - tick_sbt;
 #else
 		    sbt >= sbt_tickthreshold) {
 			/*
 			 * Obtain the time of the last hardclock() call on
 			 * this CPU directly from the kern_clocksource.c.
 			 * This value is per-CPU, but it is equal for all
 			 * active ones.
 			 */
 #ifdef __LP64__
 			to_sbt = DPCPU_GET(hardclocktime);
 #else
 			spinlock_enter();
 			to_sbt = DPCPU_GET(hardclocktime);
 			spinlock_exit();
 #endif
 #endif
 			if ((flags & C_HARDCLOCK) == 0)
 				to_sbt += tick_sbt;
 		} else
 			to_sbt = sbinuptime();
 		if (SBT_MAX - to_sbt < sbt)
 			to_sbt = SBT_MAX;
 		else
 			to_sbt += sbt;
 		pr = ((C_PRELGET(flags) < 0) ? sbt >> tc_precexp :
 		    sbt >> C_PRELGET(flags));
 		if (pr > precision)
 			precision = pr;
 	}
 	/* 
 	 * This flag used to be added by callout_cc_add, but the
 	 * first time you call this we could end up with the
 	 * wrong direct flag if we don't do it before we add.
 	 */
 	if (flags & C_DIRECT_EXEC) {
 		direct = 1;
 	} else {
 		direct = 0;
 	}
 	KASSERT(!direct || c->c_lock == NULL,
 	    ("%s: direct callout %p has lock", __func__, c));
 	cc = callout_lock(c);
 	/*
 	 * Don't allow migration of pre-allocated callouts lest they
 	 * become unbalanced or handle the case where the user does
 	 * not care. 
 	 */
 	if ((c->c_iflags & CALLOUT_LOCAL_ALLOC) ||
 	    ignore_cpu) {
 		cpu = c->c_cpu;
 	}
 
 	if (cc_exec_curr(cc, direct) == c) {
 		/*
 		 * We're being asked to reschedule a callout which is
 		 * currently in progress.  If there is a lock then we
 		 * can cancel the callout if it has not really started.
 		 */
 		if (c->c_lock != NULL && cc_exec_cancel(cc, direct))
 			cancelled = cc_exec_cancel(cc, direct) = true;
 		if (cc_exec_waiting(cc, direct)) {
 			/*
 			 * Someone has called callout_drain to kill this
 			 * callout.  Don't reschedule.
 			 */
 			CTR4(KTR_CALLOUT, "%s %p func %p arg %p",
 			    cancelled ? "cancelled" : "failed to cancel",
 			    c, c->c_func, c->c_arg);
 			CC_UNLOCK(cc);
 			return (cancelled);
 		}
 #ifdef SMP
 		if (callout_migrating(c)) {
 			/* 
 			 * This only occurs when a second callout_reset_sbt_on
 			 * is made after a previous one moved it into
 			 * deferred migration (below). Note we do *not* change
 			 * the prev_cpu even though the previous target may
 			 * be different.
 			 */
 			cc_migration_cpu(cc, direct) = cpu;
 			cc_migration_time(cc, direct) = to_sbt;
 			cc_migration_prec(cc, direct) = precision;
 			cc_migration_func(cc, direct) = ftn;
 			cc_migration_arg(cc, direct) = arg;
 			cancelled = 1;
 			CC_UNLOCK(cc);
 			return (cancelled);
 		}
 #endif
 	}
 	if (c->c_iflags & CALLOUT_PENDING) {
 		if ((c->c_iflags & CALLOUT_PROCESSED) == 0) {
 			if (cc_exec_next(cc) == c)
 				cc_exec_next(cc) = LIST_NEXT(c, c_links.le);
 			LIST_REMOVE(c, c_links.le);
 		} else {
 			TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe);
 		}
 		cancelled = 1;
 		c->c_iflags &= ~ CALLOUT_PENDING;
 		c->c_flags &= ~ CALLOUT_ACTIVE;
 	}
 
 #ifdef SMP
 	/*
 	 * If the callout must migrate try to perform it immediately.
 	 * If the callout is currently running, just defer the migration
 	 * to a more appropriate moment.
 	 */
 	if (c->c_cpu != cpu) {
 		if (cc_exec_curr(cc, direct) == c) {
 			/* 
 			 * Pending will have been removed since we are
 			 * actually executing the callout on another
 			 * CPU. That callout should be waiting on the
 			 * lock the caller holds. If we set both
 			 * active/and/pending after we return and the
 			 * lock on the executing callout proceeds, it
 			 * will then see pending is true and return.
 			 * At the return from the actual callout execution
 			 * the migration will occur in softclock_call_cc
 			 * and this new callout will be placed on the 
 			 * new CPU via a call to callout_cpu_switch() which
 			 * will get the lock on the right CPU followed
 			 * by a call to callout_cc_add() which adds it there.
 			 * (see above in softclock_call_cc()).
 			 */
 			cc_migration_cpu(cc, direct) = cpu;
 			cc_migration_time(cc, direct) = to_sbt;
 			cc_migration_prec(cc, direct) = precision;
 			cc_migration_func(cc, direct) = ftn;
 			cc_migration_arg(cc, direct) = arg;
 			c->c_iflags |= (CALLOUT_DFRMIGRATION | CALLOUT_PENDING);
 			c->c_flags |= CALLOUT_ACTIVE;
 			CTR6(KTR_CALLOUT,
 		    "migration of %p func %p arg %p in %d.%08x to %u deferred",
 			    c, c->c_func, c->c_arg, (int)(to_sbt >> 32),
 			    (u_int)(to_sbt & 0xffffffff), cpu);
 			CC_UNLOCK(cc);
 			return (cancelled);
 		}
 		cc = callout_cpu_switch(c, cc, cpu);
 	}
 #endif
 
 	callout_cc_add(c, cc, to_sbt, precision, ftn, arg, cpu, flags);
 	CTR6(KTR_CALLOUT, "%sscheduled %p func %p arg %p in %d.%08x",
 	    cancelled ? "re" : "", c, c->c_func, c->c_arg, (int)(to_sbt >> 32),
 	    (u_int)(to_sbt & 0xffffffff));
 	CC_UNLOCK(cc);
 
 	return (cancelled);
 }
 
 /*
  * Common idioms that can be optimized in the future.
  */
 int
 callout_schedule_on(struct callout *c, int to_ticks, int cpu)
 {
 	return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, cpu);
 }
 
 int
 callout_schedule(struct callout *c, int to_ticks)
 {
 	return callout_reset_on(c, to_ticks, c->c_func, c->c_arg, c->c_cpu);
 }
 
 int
 _callout_stop_safe(struct callout *c, int safe)
 {
 	struct callout_cpu *cc, *old_cc;
 	struct lock_class *class;
 	int direct, sq_locked, use_lock;
 	int not_on_a_list;
 
 	if (safe)
 		WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, c->c_lock,
 		    "calling %s", __func__);
 
 	/*
 	 * Some old subsystems don't hold Giant while running a callout_stop(),
 	 * so just discard this check for the moment.
 	 */
 	if (!safe && c->c_lock != NULL) {
 		if (c->c_lock == &Giant.lock_object)
 			use_lock = mtx_owned(&Giant);
 		else {
 			use_lock = 1;
 			class = LOCK_CLASS(c->c_lock);
 			class->lc_assert(c->c_lock, LA_XLOCKED);
 		}
 	} else
 		use_lock = 0;
 	if (c->c_iflags & CALLOUT_DIRECT) {
 		direct = 1;
 	} else {
 		direct = 0;
 	}
 	sq_locked = 0;
 	old_cc = NULL;
 again:
 	cc = callout_lock(c);
 
 	if ((c->c_iflags & (CALLOUT_DFRMIGRATION | CALLOUT_PENDING)) ==
 	    (CALLOUT_DFRMIGRATION | CALLOUT_PENDING) &&
 	    ((c->c_flags & CALLOUT_ACTIVE) == CALLOUT_ACTIVE)) {
 		/*
 		 * Special case where this slipped in while we
 		 * were migrating *as* the callout is about to
 		 * execute. The caller probably holds the lock
 		 * the callout wants.
 		 *
 		 * Get rid of the migration first. Then set
 		 * the flag that tells this code *not* to
 		 * try to remove it from any lists (it's not
 		 * on one yet). When the callout wheel runs,
 		 * it will ignore this callout.
 		 */
 		c->c_iflags &= ~CALLOUT_PENDING;
 		c->c_flags &= ~CALLOUT_ACTIVE;
 		not_on_a_list = 1;
 	} else {
 		not_on_a_list = 0;
 	}
 
 	/*
 	 * If the callout was migrating while the callout cpu lock was
 	 * dropped,  just drop the sleepqueue lock and check the states
 	 * again.
 	 */
 	if (sq_locked != 0 && cc != old_cc) {
 #ifdef SMP
 		CC_UNLOCK(cc);
 		sleepq_release(&cc_exec_waiting(old_cc, direct));
 		sq_locked = 0;
 		old_cc = NULL;
 		goto again;
 #else
 		panic("migration should not happen");
 #endif
 	}
 
 	/*
 	 * If the callout isn't pending, it's not on the queue, so
 	 * don't attempt to remove it from the queue.  We can try to
 	 * stop it by other means however.
 	 */
 	if (!(c->c_iflags & CALLOUT_PENDING)) {
 		c->c_flags &= ~CALLOUT_ACTIVE;
 
 		/*
 		 * If it wasn't on the queue and it isn't the current
 		 * callout, then we can't stop it, so just bail.
 		 */
 		if (cc_exec_curr(cc, direct) != c) {
 			CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
 			    c, c->c_func, c->c_arg);
 			CC_UNLOCK(cc);
 			if (sq_locked)
 				sleepq_release(&cc_exec_waiting(cc, direct));
 			return (0);
 		}
 
 		if (safe) {
 			/*
 			 * The current callout is running (or just
 			 * about to run) and blocking is allowed, so
 			 * just wait for the current invocation to
 			 * finish.
 			 */
 			while (cc_exec_curr(cc, direct) == c) {
 				/*
 				 * Use direct calls to sleepqueue interface
 				 * instead of cv/msleep in order to avoid
 				 * a LOR between cc_lock and sleepqueue
 				 * chain spinlocks.  This piece of code
 				 * emulates a msleep_spin() call actually.
 				 *
 				 * If we already have the sleepqueue chain
 				 * locked, then we can safely block.  If we
 				 * don't already have it locked, however,
 				 * we have to drop the cc_lock to lock
 				 * it.  This opens several races, so we
 				 * restart at the beginning once we have
 				 * both locks.  If nothing has changed, then
 				 * we will end up back here with sq_locked
 				 * set.
 				 */
 				if (!sq_locked) {
 					CC_UNLOCK(cc);
 					sleepq_lock(
 					    &cc_exec_waiting(cc, direct));
 					sq_locked = 1;
 					old_cc = cc;
 					goto again;
 				}
 
 				/*
 				 * Migration could be cancelled here, but
 				 * as long as it is still not sure when it
 				 * will be packed up, just let softclock()
 				 * take care of it.
 				 */
 				cc_exec_waiting(cc, direct) = true;
 				DROP_GIANT();
 				CC_UNLOCK(cc);
 				sleepq_add(
 				    &cc_exec_waiting(cc, direct),
 				    &cc->cc_lock.lock_object, "codrain",
 				    SLEEPQ_SLEEP, 0);
 				sleepq_wait(
 				    &cc_exec_waiting(cc, direct),
 					     0);
 				sq_locked = 0;
 				old_cc = NULL;
 
 				/* Reacquire locks previously released. */
 				PICKUP_GIANT();
 				CC_LOCK(cc);
 			}
 		} else if (use_lock &&
 			   !cc_exec_cancel(cc, direct)) {
 			
 			/*
 			 * The current callout is waiting for its
 			 * lock which we hold.  Cancel the callout
 			 * and return.  After our caller drops the
 			 * lock, the callout will be skipped in
 			 * softclock().
 			 */
 			cc_exec_cancel(cc, direct) = true;
 			CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p",
 			    c, c->c_func, c->c_arg);
 			KASSERT(!cc_cce_migrating(cc, direct),
 			    ("callout wrongly scheduled for migration"));
 			if (callout_migrating(c)) {
 				c->c_iflags &= ~CALLOUT_DFRMIGRATION;
 #ifdef SMP
 				cc_migration_cpu(cc, direct) = CPUBLOCK;
 				cc_migration_time(cc, direct) = 0;
 				cc_migration_prec(cc, direct) = 0;
 				cc_migration_func(cc, direct) = NULL;
 				cc_migration_arg(cc, direct) = NULL;
 #endif
 			}
 			CC_UNLOCK(cc);
 			KASSERT(!sq_locked, ("sleepqueue chain locked"));
 			return (1);
 		} else if (callout_migrating(c)) {
 			/*
 			 * The callout is currently being serviced
 			 * and the "next" callout is scheduled at
 			 * its completion with a migration. We remove
 			 * the migration flag so it *won't* get rescheduled,
 			 * but we can't stop the one that's running so
 			 * we return 0.
 			 */
 			c->c_iflags &= ~CALLOUT_DFRMIGRATION;
 #ifdef SMP
 			/* 
 			 * We can't call cc_cce_cleanup here since
 			 * if we do it will remove .ce_curr and
 			 * it's still running. This will prevent a
 			 * reschedule of the callout when the 
 			 * execution completes.
 			 */
 			cc_migration_cpu(cc, direct) = CPUBLOCK;
 			cc_migration_time(cc, direct) = 0;
 			cc_migration_prec(cc, direct) = 0;
 			cc_migration_func(cc, direct) = NULL;
 			cc_migration_arg(cc, direct) = NULL;
 #endif
 			CTR3(KTR_CALLOUT, "postponing stop %p func %p arg %p",
 			    c, c->c_func, c->c_arg);
 			CC_UNLOCK(cc);
 			return (0);
 		}
 		CTR3(KTR_CALLOUT, "failed to stop %p func %p arg %p",
 		    c, c->c_func, c->c_arg);
 		CC_UNLOCK(cc);
 		KASSERT(!sq_locked, ("sleepqueue chain still locked"));
 		return (0);
 	}
 	if (sq_locked)
 		sleepq_release(&cc_exec_waiting(cc, direct));
 
 	c->c_iflags &= ~CALLOUT_PENDING;
 	c->c_flags &= ~CALLOUT_ACTIVE;
 
 	CTR3(KTR_CALLOUT, "cancelled %p func %p arg %p",
 	    c, c->c_func, c->c_arg);
 	if (not_on_a_list == 0) {
 		if ((c->c_iflags & CALLOUT_PROCESSED) == 0) {
 			if (cc_exec_next(cc) == c)
 				cc_exec_next(cc) = LIST_NEXT(c, c_links.le);
 			LIST_REMOVE(c, c_links.le);
 		} else {
 			TAILQ_REMOVE(&cc->cc_expireq, c, c_links.tqe);
 		}
 	}
 	callout_cc_del(c, cc);
 	CC_UNLOCK(cc);
 	return (1);
 }
 
 void
 callout_init(struct callout *c, int mpsafe)
 {
 	bzero(c, sizeof *c);
 	if (mpsafe) {
 		c->c_lock = NULL;
 		c->c_iflags = CALLOUT_RETURNUNLOCKED;
 	} else {
 		c->c_lock = &Giant.lock_object;
 		c->c_iflags = 0;
 	}
 	c->c_cpu = timeout_cpu;
 }
 
 void
 _callout_init_lock(struct callout *c, struct lock_object *lock, int flags)
 {
 	bzero(c, sizeof *c);
 	c->c_lock = lock;
 	KASSERT((flags & ~(CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK)) == 0,
 	    ("callout_init_lock: bad flags %d", flags));
 	KASSERT(lock != NULL || (flags & CALLOUT_RETURNUNLOCKED) == 0,
 	    ("callout_init_lock: CALLOUT_RETURNUNLOCKED with no lock"));
 	KASSERT(lock == NULL || !(LOCK_CLASS(lock)->lc_flags &
 	    (LC_SPINLOCK | LC_SLEEPABLE)), ("%s: invalid lock class",
 	    __func__));
 	c->c_iflags = flags & (CALLOUT_RETURNUNLOCKED | CALLOUT_SHAREDLOCK);
 	c->c_cpu = timeout_cpu;
 }
 
 #ifdef APM_FIXUP_CALLTODO
 /* 
  * Adjust the kernel calltodo timeout list.  This routine is used after 
  * an APM resume to recalculate the calltodo timer list values with the 
  * number of hz's we have been sleeping.  The next hardclock() will detect 
  * that there are fired timers and run softclock() to execute them.
  *
  * Please note, I have not done an exhaustive analysis of what code this
  * might break.  I am motivated to have my select()'s and alarm()'s that
  * have expired during suspend firing upon resume so that the applications
  * which set the timer can do the maintenance the timer was for as close
  * as possible to the originally intended time.  Testing this code for a 
  * week showed that resuming from a suspend resulted in 22 to 25 timers 
  * firing, which seemed independent of whether the suspend was 2 hours or
  * 2 days.  Your mileage may vary.   - Ken Key <key@cs.utk.edu>
  */
 void
 adjust_timeout_calltodo(struct timeval *time_change)
 {
 	register struct callout *p;
 	unsigned long delta_ticks;
 
 	/* 
 	 * How many ticks were we asleep?
 	 * (stolen from tvtohz()).
 	 */
 
 	/* Don't do anything */
 	if (time_change->tv_sec < 0)
 		return;
 	else if (time_change->tv_sec <= LONG_MAX / 1000000)
 		delta_ticks = (time_change->tv_sec * 1000000 +
 			       time_change->tv_usec + (tick - 1)) / tick + 1;
 	else if (time_change->tv_sec <= LONG_MAX / hz)
 		delta_ticks = time_change->tv_sec * hz +
 			      (time_change->tv_usec + (tick - 1)) / tick + 1;
 	else
 		delta_ticks = LONG_MAX;
 
 	if (delta_ticks > INT_MAX)
 		delta_ticks = INT_MAX;
 
 	/* 
 	 * Now rip through the timer calltodo list looking for timers
 	 * to expire.
 	 */
 
 	/* don't collide with softclock() */
 	CC_LOCK(cc);
 	for (p = calltodo.c_next; p != NULL; p = p->c_next) {
 		p->c_time -= delta_ticks;
 
 		/* Break if the timer had more time on it than delta_ticks */
 		if (p->c_time > 0)
 			break;
 
 		/* take back the ticks the timer didn't use (p->c_time <= 0) */
 		delta_ticks = -p->c_time;
 	}
 	CC_UNLOCK(cc);
 
 	return;
 }
 #endif /* APM_FIXUP_CALLTODO */
 
 static int
 flssbt(sbintime_t sbt)
 {
 
 	sbt += (uint64_t)sbt >> 1;
 	if (sizeof(long) >= sizeof(sbintime_t))
 		return (flsl(sbt));
 	if (sbt >= SBT_1S)
 		return (flsl(((uint64_t)sbt) >> 32) + 32);
 	return (flsl(sbt));
 }
 
 /*
  * Dump immediate statistic snapshot of the scheduled callouts.
  */
 static int
 sysctl_kern_callout_stat(SYSCTL_HANDLER_ARGS)
 {
 	struct callout *tmp;
 	struct callout_cpu *cc;
 	struct callout_list *sc;
 	sbintime_t maxpr, maxt, medpr, medt, now, spr, st, t;
 	int ct[64], cpr[64], ccpbk[32];
 	int error, val, i, count, tcum, pcum, maxc, c, medc;
 #ifdef SMP
 	int cpu;
 #endif
 
 	val = 0;
 	error = sysctl_handle_int(oidp, &val, 0, req);
 	if (error != 0 || req->newptr == NULL)
 		return (error);
 	count = maxc = 0;
 	st = spr = maxt = maxpr = 0;
 	bzero(ccpbk, sizeof(ccpbk));
 	bzero(ct, sizeof(ct));
 	bzero(cpr, sizeof(cpr));
 	now = sbinuptime();
 #ifdef SMP
 	CPU_FOREACH(cpu) {
 		cc = CC_CPU(cpu);
 #else
 		cc = CC_CPU(timeout_cpu);
 #endif
 		CC_LOCK(cc);
 		for (i = 0; i < callwheelsize; i++) {
 			sc = &cc->cc_callwheel[i];
 			c = 0;
 			LIST_FOREACH(tmp, sc, c_links.le) {
 				c++;
 				t = tmp->c_time - now;
 				if (t < 0)
 					t = 0;
 				st += t / SBT_1US;
 				spr += tmp->c_precision / SBT_1US;
 				if (t > maxt)
 					maxt = t;
 				if (tmp->c_precision > maxpr)
 					maxpr = tmp->c_precision;
 				ct[flssbt(t)]++;
 				cpr[flssbt(tmp->c_precision)]++;
 			}
 			if (c > maxc)
 				maxc = c;
 			ccpbk[fls(c + c / 2)]++;
 			count += c;
 		}
 		CC_UNLOCK(cc);
 #ifdef SMP
 	}
 #endif
 
 	for (i = 0, tcum = 0; i < 64 && tcum < count / 2; i++)
 		tcum += ct[i];
 	medt = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0;
 	for (i = 0, pcum = 0; i < 64 && pcum < count / 2; i++)
 		pcum += cpr[i];
 	medpr = (i >= 2) ? (((sbintime_t)1) << (i - 2)) : 0;
 	for (i = 0, c = 0; i < 32 && c < count / 2; i++)
 		c += ccpbk[i];
 	medc = (i >= 2) ? (1 << (i - 2)) : 0;
 
 	printf("Scheduled callouts statistic snapshot:\n");
 	printf("  Callouts: %6d  Buckets: %6d*%-3d  Bucket size: 0.%06ds\n",
 	    count, callwheelsize, mp_ncpus, 1000000 >> CC_HASH_SHIFT);
 	printf("  C/Bk: med %5d         avg %6d.%06jd  max %6d\n",
 	    medc,
 	    count / callwheelsize / mp_ncpus,
 	    (uint64_t)count * 1000000 / callwheelsize / mp_ncpus % 1000000,
 	    maxc);
 	printf("  Time: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n",
 	    medt / SBT_1S, (medt & 0xffffffff) * 1000000 >> 32,
 	    (st / count) / 1000000, (st / count) % 1000000,
 	    maxt / SBT_1S, (maxt & 0xffffffff) * 1000000 >> 32);
 	printf("  Prec: med %5jd.%06jds avg %6jd.%06jds max %6jd.%06jds\n",
 	    medpr / SBT_1S, (medpr & 0xffffffff) * 1000000 >> 32,
 	    (spr / count) / 1000000, (spr / count) % 1000000,
 	    maxpr / SBT_1S, (maxpr & 0xffffffff) * 1000000 >> 32);
 	printf("  Distribution:       \tbuckets\t   time\t   tcum\t"
 	    "   prec\t   pcum\n");
 	for (i = 0, tcum = pcum = 0; i < 64; i++) {
 		if (ct[i] == 0 && cpr[i] == 0)
 			continue;
 		t = (i != 0) ? (((sbintime_t)1) << (i - 1)) : 0;
 		tcum += ct[i];
 		pcum += cpr[i];
 		printf("  %10jd.%06jds\t 2**%d\t%7d\t%7d\t%7d\t%7d\n",
 		    t / SBT_1S, (t & 0xffffffff) * 1000000 >> 32,
 		    i - 1 - (32 - CC_HASH_SHIFT),
 		    ct[i], tcum, cpr[i], pcum);
 	}
 	return (error);
 }
 SYSCTL_PROC(_kern, OID_AUTO, callout_stat,
     CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE,
     0, 0, sysctl_kern_callout_stat, "I",
     "Dump immediate statistic snapshot of the scheduled callouts");
Index: user/ngie/more-tests/sys/kern/subr_bus.c
===================================================================
--- user/ngie/more-tests/sys/kern/subr_bus.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/subr_bus.c	(revision 281585)
@@ -1,5306 +1,5308 @@
 /*-
  * Copyright (c) 1997,1998,2003 Doug Rabson
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_bus.h"
 #include "opt_random.h"
 
 #include <sys/param.h>
 #include <sys/conf.h>
 #include <sys/filio.h>
 #include <sys/lock.h>
 #include <sys/kernel.h>
 #include <sys/kobj.h>
 #include <sys/limits.h>
 #include <sys/malloc.h>
 #include <sys/module.h>
 #include <sys/mutex.h>
 #include <sys/poll.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/condvar.h>
 #include <sys/queue.h>
 #include <machine/bus.h>
 #include <sys/random.h>
 #include <sys/rman.h>
 #include <sys/selinfo.h>
 #include <sys/signalvar.h>
 #include <sys/sysctl.h>
 #include <sys/systm.h>
 #include <sys/uio.h>
 #include <sys/bus.h>
 #include <sys/interrupt.h>
 #include <sys/cpuset.h>
 
 #include <net/vnet.h>
 
 #include <machine/cpu.h>
 #include <machine/stdarg.h>
 
 #include <vm/uma.h>
 
 SYSCTL_NODE(_hw, OID_AUTO, bus, CTLFLAG_RW, NULL, NULL);
 SYSCTL_ROOT_NODE(OID_AUTO, dev, CTLFLAG_RW, NULL, NULL);
 
 /*
  * Used to attach drivers to devclasses.
  */
 typedef struct driverlink *driverlink_t;
 struct driverlink {
 	kobj_class_t	driver;
 	TAILQ_ENTRY(driverlink) link;	/* list of drivers in devclass */
 	int		pass;
 	TAILQ_ENTRY(driverlink) passlink;
 };
 
 /*
  * Forward declarations
  */
 typedef TAILQ_HEAD(devclass_list, devclass) devclass_list_t;
 typedef TAILQ_HEAD(driver_list, driverlink) driver_list_t;
 typedef TAILQ_HEAD(device_list, device) device_list_t;
 
 struct devclass {
 	TAILQ_ENTRY(devclass) link;
 	devclass_t	parent;		/* parent in devclass hierarchy */
 	driver_list_t	drivers;     /* bus devclasses store drivers for bus */
 	char		*name;
 	device_t	*devices;	/* array of devices indexed by unit */
 	int		maxunit;	/* size of devices array */
 	int		flags;
 #define DC_HAS_CHILDREN		1
 
 	struct sysctl_ctx_list sysctl_ctx;
 	struct sysctl_oid *sysctl_tree;
 };
 
 /**
  * @brief Implementation of device.
  */
 struct device {
 	/*
 	 * A device is a kernel object. The first field must be the
 	 * current ops table for the object.
 	 */
 	KOBJ_FIELDS;
 
 	/*
 	 * Device hierarchy.
 	 */
 	TAILQ_ENTRY(device)	link;	/**< list of devices in parent */
 	TAILQ_ENTRY(device)	devlink; /**< global device list membership */
 	device_t	parent;		/**< parent of this device  */
 	device_list_t	children;	/**< list of child devices */
 
 	/*
 	 * Details of this device.
 	 */
 	driver_t	*driver;	/**< current driver */
 	devclass_t	devclass;	/**< current device class */
 	int		unit;		/**< current unit number */
 	char*		nameunit;	/**< name+unit e.g. foodev0 */
 	char*		desc;		/**< driver specific description */
 	int		busy;		/**< count of calls to device_busy() */
 	device_state_t	state;		/**< current device state  */
 	uint32_t	devflags;	/**< api level flags for device_get_flags() */
 	u_int		flags;		/**< internal device flags  */
 	u_int	order;			/**< order from device_add_child_ordered() */
 	void	*ivars;			/**< instance variables  */
 	void	*softc;			/**< current driver's variables  */
 
 	struct sysctl_ctx_list sysctl_ctx; /**< state for sysctl variables  */
 	struct sysctl_oid *sysctl_tree;	/**< state for sysctl variables */
 };
 
 static MALLOC_DEFINE(M_BUS, "bus", "Bus data structures");
 static MALLOC_DEFINE(M_BUS_SC, "bus-sc", "Bus data structures, softc");
 
 static void devctl2_init(void);
 
 #ifdef BUS_DEBUG
 
 static int bus_debug = 1;
 SYSCTL_INT(_debug, OID_AUTO, bus_debug, CTLFLAG_RWTUN, &bus_debug, 0,
     "Bus debug level");
 
 #define PDEBUG(a)	if (bus_debug) {printf("%s:%d: ", __func__, __LINE__), printf a; printf("\n");}
 #define DEVICENAME(d)	((d)? device_get_name(d): "no device")
 #define DRIVERNAME(d)	((d)? d->name : "no driver")
 #define DEVCLANAME(d)	((d)? d->name : "no devclass")
 
 /**
  * Produce the indenting, indent*2 spaces plus a '.' ahead of that to
  * prevent syslog from deleting initial spaces
  */
 #define indentprintf(p)	do { int iJ; printf("."); for (iJ=0; iJ<indent; iJ++) printf("  "); printf p ; } while (0)
 
 static void print_device_short(device_t dev, int indent);
 static void print_device(device_t dev, int indent);
 void print_device_tree_short(device_t dev, int indent);
 void print_device_tree(device_t dev, int indent);
 static void print_driver_short(driver_t *driver, int indent);
 static void print_driver(driver_t *driver, int indent);
 static void print_driver_list(driver_list_t drivers, int indent);
 static void print_devclass_short(devclass_t dc, int indent);
 static void print_devclass(devclass_t dc, int indent);
 void print_devclass_list_short(void);
 void print_devclass_list(void);
 
 #else
 /* Make the compiler ignore the function calls */
 #define PDEBUG(a)			/* nop */
 #define DEVICENAME(d)			/* nop */
 #define DRIVERNAME(d)			/* nop */
 #define DEVCLANAME(d)			/* nop */
 
 #define print_device_short(d,i)		/* nop */
 #define print_device(d,i)		/* nop */
 #define print_device_tree_short(d,i)	/* nop */
 #define print_device_tree(d,i)		/* nop */
 #define print_driver_short(d,i)		/* nop */
 #define print_driver(d,i)		/* nop */
 #define print_driver_list(d,i)		/* nop */
 #define print_devclass_short(d,i)	/* nop */
 #define print_devclass(d,i)		/* nop */
 #define print_devclass_list_short()	/* nop */
 #define print_devclass_list()		/* nop */
 #endif
 
 /*
  * dev sysctl tree
  */
 
 enum {
 	DEVCLASS_SYSCTL_PARENT,
 };
 
 static int
 devclass_sysctl_handler(SYSCTL_HANDLER_ARGS)
 {
 	devclass_t dc = (devclass_t)arg1;
 	const char *value;
 
 	switch (arg2) {
 	case DEVCLASS_SYSCTL_PARENT:
 		value = dc->parent ? dc->parent->name : "";
 		break;
 	default:
 		return (EINVAL);
 	}
 	return (SYSCTL_OUT_STR(req, value));
 }
 
 static void
 devclass_sysctl_init(devclass_t dc)
 {
 
 	if (dc->sysctl_tree != NULL)
 		return;
 	sysctl_ctx_init(&dc->sysctl_ctx);
 	dc->sysctl_tree = SYSCTL_ADD_NODE(&dc->sysctl_ctx,
 	    SYSCTL_STATIC_CHILDREN(_dev), OID_AUTO, dc->name,
 	    CTLFLAG_RD, NULL, "");
 	SYSCTL_ADD_PROC(&dc->sysctl_ctx, SYSCTL_CHILDREN(dc->sysctl_tree),
 	    OID_AUTO, "%parent", CTLTYPE_STRING | CTLFLAG_RD,
 	    dc, DEVCLASS_SYSCTL_PARENT, devclass_sysctl_handler, "A",
 	    "parent class");
 }
 
 enum {
 	DEVICE_SYSCTL_DESC,
 	DEVICE_SYSCTL_DRIVER,
 	DEVICE_SYSCTL_LOCATION,
 	DEVICE_SYSCTL_PNPINFO,
 	DEVICE_SYSCTL_PARENT,
 };
 
 static int
 device_sysctl_handler(SYSCTL_HANDLER_ARGS)
 {
 	device_t dev = (device_t)arg1;
 	const char *value;
 	char *buf;
 	int error;
 
 	buf = NULL;
 	switch (arg2) {
 	case DEVICE_SYSCTL_DESC:
 		value = dev->desc ? dev->desc : "";
 		break;
 	case DEVICE_SYSCTL_DRIVER:
 		value = dev->driver ? dev->driver->name : "";
 		break;
 	case DEVICE_SYSCTL_LOCATION:
 		value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
 		bus_child_location_str(dev, buf, 1024);
 		break;
 	case DEVICE_SYSCTL_PNPINFO:
 		value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
 		bus_child_pnpinfo_str(dev, buf, 1024);
 		break;
 	case DEVICE_SYSCTL_PARENT:
 		value = dev->parent ? dev->parent->nameunit : "";
 		break;
 	default:
 		return (EINVAL);
 	}
 	error = SYSCTL_OUT_STR(req, value);
 	if (buf != NULL)
 		free(buf, M_BUS);
 	return (error);
 }
 
 static void
 device_sysctl_init(device_t dev)
 {
 	devclass_t dc = dev->devclass;
 	int domain;
 
 	if (dev->sysctl_tree != NULL)
 		return;
 	devclass_sysctl_init(dc);
 	sysctl_ctx_init(&dev->sysctl_ctx);
 	dev->sysctl_tree = SYSCTL_ADD_NODE(&dev->sysctl_ctx,
 	    SYSCTL_CHILDREN(dc->sysctl_tree), OID_AUTO,
 	    dev->nameunit + strlen(dc->name),
 	    CTLFLAG_RD, NULL, "");
 	SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree),
 	    OID_AUTO, "%desc", CTLTYPE_STRING | CTLFLAG_RD,
 	    dev, DEVICE_SYSCTL_DESC, device_sysctl_handler, "A",
 	    "device description");
 	SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree),
 	    OID_AUTO, "%driver", CTLTYPE_STRING | CTLFLAG_RD,
 	    dev, DEVICE_SYSCTL_DRIVER, device_sysctl_handler, "A",
 	    "device driver name");
 	SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree),
 	    OID_AUTO, "%location", CTLTYPE_STRING | CTLFLAG_RD,
 	    dev, DEVICE_SYSCTL_LOCATION, device_sysctl_handler, "A",
 	    "device location relative to parent");
 	SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree),
 	    OID_AUTO, "%pnpinfo", CTLTYPE_STRING | CTLFLAG_RD,
 	    dev, DEVICE_SYSCTL_PNPINFO, device_sysctl_handler, "A",
 	    "device identification");
 	SYSCTL_ADD_PROC(&dev->sysctl_ctx, SYSCTL_CHILDREN(dev->sysctl_tree),
 	    OID_AUTO, "%parent", CTLTYPE_STRING | CTLFLAG_RD,
 	    dev, DEVICE_SYSCTL_PARENT, device_sysctl_handler, "A",
 	    "parent device");
 	if (bus_get_domain(dev, &domain) == 0)
 		SYSCTL_ADD_INT(&dev->sysctl_ctx,
 		    SYSCTL_CHILDREN(dev->sysctl_tree), OID_AUTO, "%domain",
 		    CTLFLAG_RD, NULL, domain, "NUMA domain");
 }
 
 static void
 device_sysctl_update(device_t dev)
 {
 	devclass_t dc = dev->devclass;
 
 	if (dev->sysctl_tree == NULL)
 		return;
 	sysctl_rename_oid(dev->sysctl_tree, dev->nameunit + strlen(dc->name));
 }
 
 static void
 device_sysctl_fini(device_t dev)
 {
 	if (dev->sysctl_tree == NULL)
 		return;
 	sysctl_ctx_free(&dev->sysctl_ctx);
 	dev->sysctl_tree = NULL;
 }
 
 /*
  * /dev/devctl implementation
  */
 
 /*
  * This design allows only one reader for /dev/devctl.  This is not desirable
  * in the long run, but will get a lot of hair out of this implementation.
  * Maybe we should make this device a clonable device.
  *
  * Also note: we specifically do not attach a device to the device_t tree
  * to avoid potential chicken and egg problems.  One could argue that all
  * of this belongs to the root node.  One could also further argue that the
  * sysctl interface that we have now might more properly be an ioctl
  * interface, but at this stage of the game, I'm not inclined to rock that
  * boat.
  *
  * I'm also not sure whether the SIGIO support is done correctly, as
  * I copied it from a driver that had SIGIO support that likely hasn't been
  * tested since 3.4 or 2.2.8!
  */
 
 /* Deprecated way to adjust queue length */
 static int sysctl_devctl_disable(SYSCTL_HANDLER_ARGS);
 SYSCTL_PROC(_hw_bus, OID_AUTO, devctl_disable, CTLTYPE_INT | CTLFLAG_RWTUN |
     CTLFLAG_MPSAFE, NULL, 0, sysctl_devctl_disable, "I",
     "devctl disable -- deprecated");
 
 #define DEVCTL_DEFAULT_QUEUE_LEN 1000
 static int sysctl_devctl_queue(SYSCTL_HANDLER_ARGS);
 static int devctl_queue_length = DEVCTL_DEFAULT_QUEUE_LEN;
 SYSCTL_PROC(_hw_bus, OID_AUTO, devctl_queue, CTLTYPE_INT | CTLFLAG_RWTUN |
     CTLFLAG_MPSAFE, NULL, 0, sysctl_devctl_queue, "I", "devctl queue length");
 
 static d_open_t		devopen;
 static d_close_t	devclose;
 static d_read_t		devread;
 static d_ioctl_t	devioctl;
 static d_poll_t		devpoll;
 static d_kqfilter_t	devkqfilter;
 
 static struct cdevsw dev_cdevsw = {
 	.d_version =	D_VERSION,
 	.d_open =	devopen,
 	.d_close =	devclose,
 	.d_read =	devread,
 	.d_ioctl =	devioctl,
 	.d_poll =	devpoll,
 	.d_kqfilter =	devkqfilter,
 	.d_name =	"devctl",
 };
 
 struct dev_event_info
 {
 	char *dei_data;
 	TAILQ_ENTRY(dev_event_info) dei_link;
 };
 
 TAILQ_HEAD(devq, dev_event_info);
 
 static struct dev_softc
 {
 	int	inuse;
 	int	nonblock;
 	int	queued;
 	int	async;
 	struct mtx mtx;
 	struct cv cv;
 	struct selinfo sel;
 	struct devq devq;
 	struct sigio *sigio;
 } devsoftc;
 
 static void	filt_devctl_detach(struct knote *kn);
 static int	filt_devctl_read(struct knote *kn, long hint);
 
 struct filterops devctl_rfiltops = {
 	.f_isfd = 1,
 	.f_detach = filt_devctl_detach,
 	.f_event = filt_devctl_read,
 };
 
 static struct cdev *devctl_dev;
 
 static void
 devinit(void)
 {
 	devctl_dev = make_dev_credf(MAKEDEV_ETERNAL, &dev_cdevsw, 0, NULL,
 	    UID_ROOT, GID_WHEEL, 0600, "devctl");
 	mtx_init(&devsoftc.mtx, "dev mtx", "devd", MTX_DEF);
 	cv_init(&devsoftc.cv, "dev cv");
 	TAILQ_INIT(&devsoftc.devq);
 	knlist_init_mtx(&devsoftc.sel.si_note, &devsoftc.mtx);
 	devctl2_init();
 }
 
 static int
 devopen(struct cdev *dev, int oflags, int devtype, struct thread *td)
 {
 
 	mtx_lock(&devsoftc.mtx);
 	if (devsoftc.inuse) {
 		mtx_unlock(&devsoftc.mtx);
 		return (EBUSY);
 	}
 	/* move to init */
 	devsoftc.inuse = 1;
 	mtx_unlock(&devsoftc.mtx);
 	return (0);
 }
 
 static int
 devclose(struct cdev *dev, int fflag, int devtype, struct thread *td)
 {
 
 	mtx_lock(&devsoftc.mtx);
 	devsoftc.inuse = 0;
 	devsoftc.nonblock = 0;
 	devsoftc.async = 0;
 	cv_broadcast(&devsoftc.cv);
 	funsetown(&devsoftc.sigio);
 	mtx_unlock(&devsoftc.mtx);
 	return (0);
 }
 
 /*
  * The read channel for this device is used to report changes to
  * userland in realtime.  We are required to free the data as well as
  * the n1 object because we allocate them separately.  Also note that
  * we return one record at a time.  If you try to read this device a
  * character at a time, you will lose the rest of the data.  Listening
  * programs are expected to cope.
  */
 static int
 devread(struct cdev *dev, struct uio *uio, int ioflag)
 {
 	struct dev_event_info *n1;
 	int rv;
 
 	mtx_lock(&devsoftc.mtx);
 	while (TAILQ_EMPTY(&devsoftc.devq)) {
 		if (devsoftc.nonblock) {
 			mtx_unlock(&devsoftc.mtx);
 			return (EAGAIN);
 		}
 		rv = cv_wait_sig(&devsoftc.cv, &devsoftc.mtx);
 		if (rv) {
 			/*
 			 * Need to translate ERESTART to EINTR here? -- jake
 			 */
 			mtx_unlock(&devsoftc.mtx);
 			return (rv);
 		}
 	}
 	n1 = TAILQ_FIRST(&devsoftc.devq);
 	TAILQ_REMOVE(&devsoftc.devq, n1, dei_link);
 	devsoftc.queued--;
 	mtx_unlock(&devsoftc.mtx);
 	rv = uiomove(n1->dei_data, strlen(n1->dei_data), uio);
 	free(n1->dei_data, M_BUS);
 	free(n1, M_BUS);
 	return (rv);
 }
 
 static	int
 devioctl(struct cdev *dev, u_long cmd, caddr_t data, int fflag, struct thread *td)
 {
 	switch (cmd) {
 
 	case FIONBIO:
 		if (*(int*)data)
 			devsoftc.nonblock = 1;
 		else
 			devsoftc.nonblock = 0;
 		return (0);
 	case FIOASYNC:
 		if (*(int*)data)
 			devsoftc.async = 1;
 		else
 			devsoftc.async = 0;
 		return (0);
 	case FIOSETOWN:
 		return fsetown(*(int *)data, &devsoftc.sigio);
 	case FIOGETOWN:
 		*(int *)data = fgetown(&devsoftc.sigio);
 		return (0);
 
 		/* (un)Support for other fcntl() calls. */
 	case FIOCLEX:
 	case FIONCLEX:
 	case FIONREAD:
 	default:
 		break;
 	}
 	return (ENOTTY);
 }
 
 static	int
 devpoll(struct cdev *dev, int events, struct thread *td)
 {
 	int	revents = 0;
 
 	mtx_lock(&devsoftc.mtx);
 	if (events & (POLLIN | POLLRDNORM)) {
 		if (!TAILQ_EMPTY(&devsoftc.devq))
 			revents = events & (POLLIN | POLLRDNORM);
 		else
 			selrecord(td, &devsoftc.sel);
 	}
 	mtx_unlock(&devsoftc.mtx);
 
 	return (revents);
 }
 
 static int
 devkqfilter(struct cdev *dev, struct knote *kn)
 {
 	int error;
 
 	if (kn->kn_filter == EVFILT_READ) {
 		kn->kn_fop = &devctl_rfiltops;
 		knlist_add(&devsoftc.sel.si_note, kn, 0);
 		error = 0;
 	} else
 		error = EINVAL;
 	return (error);
 }
 
 static void
 filt_devctl_detach(struct knote *kn)
 {
 
 	knlist_remove(&devsoftc.sel.si_note, kn, 0);
 }
 
 static int
 filt_devctl_read(struct knote *kn, long hint)
 {
 	kn->kn_data = devsoftc.queued;
 	return (kn->kn_data != 0);
 }
 
 /**
  * @brief Return whether the userland process is running
  */
 boolean_t
 devctl_process_running(void)
 {
 	return (devsoftc.inuse == 1);
 }
 
 /**
  * @brief Queue data to be read from the devctl device
  *
  * Generic interface to queue data to the devctl device.  It is
  * assumed that @p data is properly formatted.  It is further assumed
  * that @p data is allocated using the M_BUS malloc type.
  */
 void
 devctl_queue_data_f(char *data, int flags)
 {
 	struct dev_event_info *n1 = NULL, *n2 = NULL;
 
 	if (strlen(data) == 0)
 		goto out;
 	if (devctl_queue_length == 0)
 		goto out;
 	n1 = malloc(sizeof(*n1), M_BUS, flags);
 	if (n1 == NULL)
 		goto out;
 	n1->dei_data = data;
 	mtx_lock(&devsoftc.mtx);
 	if (devctl_queue_length == 0) {
 		mtx_unlock(&devsoftc.mtx);
 		free(n1->dei_data, M_BUS);
 		free(n1, M_BUS);
 		return;
 	}
 	/* Leave at least one spot in the queue... */
 	while (devsoftc.queued > devctl_queue_length - 1) {
 		n2 = TAILQ_FIRST(&devsoftc.devq);
 		TAILQ_REMOVE(&devsoftc.devq, n2, dei_link);
 		free(n2->dei_data, M_BUS);
 		free(n2, M_BUS);
 		devsoftc.queued--;
 	}
 	TAILQ_INSERT_TAIL(&devsoftc.devq, n1, dei_link);
 	devsoftc.queued++;
 	cv_broadcast(&devsoftc.cv);
 	KNOTE_LOCKED(&devsoftc.sel.si_note, 0);
 	mtx_unlock(&devsoftc.mtx);
 	selwakeup(&devsoftc.sel);
 	if (devsoftc.async && devsoftc.sigio != NULL)
 		pgsigio(&devsoftc.sigio, SIGIO, 0);
 	return;
 out:
 	/*
 	 * We have to free data on all error paths since the caller
 	 * assumes it will be free'd when this item is dequeued.
 	 */
 	free(data, M_BUS);
 	return;
 }
 
 void
 devctl_queue_data(char *data)
 {
 
 	devctl_queue_data_f(data, M_NOWAIT);
 }
 
 /**
  * @brief Send a 'notification' to userland, using standard ways
  */
 void
 devctl_notify_f(const char *system, const char *subsystem, const char *type,
     const char *data, int flags)
 {
 	int len = 0;
 	char *msg;
 
 	if (system == NULL)
 		return;		/* BOGUS!  Must specify system. */
 	if (subsystem == NULL)
 		return;		/* BOGUS!  Must specify subsystem. */
 	if (type == NULL)
 		return;		/* BOGUS!  Must specify type. */
 	len += strlen(" system=") + strlen(system);
 	len += strlen(" subsystem=") + strlen(subsystem);
 	len += strlen(" type=") + strlen(type);
 	/* add in the data message plus newline. */
 	if (data != NULL)
 		len += strlen(data);
 	len += 3;	/* ' ' before data, '\n', and NUL; '!' replaces the ' ' in " system=" */
 	msg = malloc(len, M_BUS, flags);
 	if (msg == NULL)
 		return;		/* Drop it on the floor */
 	if (data != NULL)
 		snprintf(msg, len, "!system=%s subsystem=%s type=%s %s\n",
 		    system, subsystem, type, data);
 	else
 		snprintf(msg, len, "!system=%s subsystem=%s type=%s\n",
 		    system, subsystem, type);
 	devctl_queue_data_f(msg, flags);
 }
 
 void
 devctl_notify(const char *system, const char *subsystem, const char *type,
     const char *data)
 {
 
 	devctl_notify_f(system, subsystem, type, data, M_NOWAIT);
 }
 
 /*
  * Common routine that tries to make sending messages as easy as possible.
  * We allocate memory for the data, copy strings into that, but do not
  * free it unless there's an error.  The dequeue part of the driver should
  * free the data.  We don't send data when the device is disabled.  We do
  * send data, even when we have no listeners, because we wish to avoid
  * races relating to startup and restart of listening applications.
  *
  * devaddq is designed to string together the type of event, with the
  * object of that event, plus the plug and play info and location info
  * for that event.  This is likely most useful for devices, but less
  * useful for other consumers of this interface.  Those should use
  * the devctl_queue_data() interface instead.
  */
 static void
 devaddq(const char *type, const char *what, device_t dev)
 {
 	char *data = NULL;
 	char *loc = NULL;
 	char *pnp = NULL;
 	const char *parstr;
 
 	if (!devctl_queue_length)/* Rare race, but lost races safely discard */
 		return;
 	data = malloc(1024, M_BUS, M_NOWAIT);
 	if (data == NULL)
 		goto bad;
 
 	/* get the bus specific location of this device */
 	loc = malloc(1024, M_BUS, M_NOWAIT);
 	if (loc == NULL)
 		goto bad;
 	*loc = '\0';
 	bus_child_location_str(dev, loc, 1024);
 
 	/* Get the bus specific pnp info of this device */
 	pnp = malloc(1024, M_BUS, M_NOWAIT);
 	if (pnp == NULL)
 		goto bad;
 	*pnp = '\0';
 	bus_child_pnpinfo_str(dev, pnp, 1024);
 
 	/* Get the parent's nameunit, or "." if the device has no parent. */
 	if (device_get_parent(dev) == NULL)
 		parstr = ".";	/* Or '/' ? */
 	else
 		parstr = device_get_nameunit(device_get_parent(dev));
 	/* String it all together. */
 	snprintf(data, 1024, "%s%s at %s %s on %s\n", type, what, loc, pnp,
 	  parstr);
 	free(loc, M_BUS);
 	free(pnp, M_BUS);
 	devctl_queue_data(data);
 	return;
 bad:
 	free(pnp, M_BUS);
 	free(loc, M_BUS);
 	free(data, M_BUS);
 	return;
 }
 
 /*
  * A device was added to the tree.  We are called just after it successfully
  * attaches (that is, probe and attach succeeded for this device).  No call
  * is made if a device is merely parented into the tree.  See devnomatch
  * if probe fails.  If attach fails, no notification is sent (but maybe
  * we should have a different message for this).
  */
 static void
 devadded(device_t dev)
 {
 	devaddq("+", device_get_nameunit(dev), dev);
 }
 
 /*
  * A device was removed from the tree.  We are called just before this
  * happens.
  */
 static void
 devremoved(device_t dev)
 {
 	devaddq("-", device_get_nameunit(dev), dev);
 }
 
 /*
  * Called when there's no match for this device.  This is only called
  * the first time that no match happens, so we don't keep getting this
  * message.  Should that prove to be undesirable, we can change it.
  * This is called when all drivers that can attach to a given bus
  * decline to accept this device.  Other errors may not be detected.
  */
 static void
 devnomatch(device_t dev)
 {
 	devaddq("?", "", dev);
 }
 
 static int
 sysctl_devctl_disable(SYSCTL_HANDLER_ARGS)
 {
 	struct dev_event_info *n1;
 	int dis, error;
 
 	dis = (devctl_queue_length == 0);
 	error = sysctl_handle_int(oidp, &dis, 0, req);
 	if (error || !req->newptr)
 		return (error);
 	if (mtx_initialized(&devsoftc.mtx))
 		mtx_lock(&devsoftc.mtx);
 	if (dis) {
 		while (!TAILQ_EMPTY(&devsoftc.devq)) {
 			n1 = TAILQ_FIRST(&devsoftc.devq);
 			TAILQ_REMOVE(&devsoftc.devq, n1, dei_link);
 			free(n1->dei_data, M_BUS);
 			free(n1, M_BUS);
 		}
 		devsoftc.queued = 0;
 		devctl_queue_length = 0;
 	} else {
 		devctl_queue_length = DEVCTL_DEFAULT_QUEUE_LEN;
 	}
 	if (mtx_initialized(&devsoftc.mtx))
 		mtx_unlock(&devsoftc.mtx);
 	return (0);
 }
 
 static int
 sysctl_devctl_queue(SYSCTL_HANDLER_ARGS)
 {
 	struct dev_event_info *n1;
 	int q, error;
 
 	q = devctl_queue_length;
 	error = sysctl_handle_int(oidp, &q, 0, req);
 	if (error || !req->newptr)
 		return (error);
 	if (q < 0)
 		return (EINVAL);
 	if (mtx_initialized(&devsoftc.mtx))
 		mtx_lock(&devsoftc.mtx);
 	devctl_queue_length = q;
 	while (devsoftc.queued > devctl_queue_length) {
 		n1 = TAILQ_FIRST(&devsoftc.devq);
 		TAILQ_REMOVE(&devsoftc.devq, n1, dei_link);
 		free(n1->dei_data, M_BUS);
 		free(n1, M_BUS);
 		devsoftc.queued--;
 	}
 	if (mtx_initialized(&devsoftc.mtx))
 		mtx_unlock(&devsoftc.mtx);
 	return (0);
 }
 
 /* End of /dev/devctl code */
 
 static TAILQ_HEAD(,device)	bus_data_devices;
 static int bus_data_generation = 1;
 
 static kobj_method_t null_methods[] = {
 	KOBJMETHOD_END
 };
 
 DEFINE_CLASS(null, null_methods, 0);
 
 /*
  * Bus pass implementation
  */
 
 static driver_list_t passes = TAILQ_HEAD_INITIALIZER(passes);
 int bus_current_pass = BUS_PASS_ROOT;
 
 /**
  * @internal
  * @brief Register the pass level of a new driver attachment
  *
  * Register a new driver attachment's pass level.  If no driver
  * attachment with the same pass level has been added, then @p new
  * will be added to the global passes list.
  *
  * @param new		the new driver attachment
  */
 static void
 driver_register_pass(struct driverlink *new)
 {
 	struct driverlink *dl;
 
 	/* We only consider pass numbers during boot. */
 	if (bus_current_pass == BUS_PASS_DEFAULT)
 		return;
 
 	/*
 	 * Walk the passes list.  If we already know about this pass
 	 * then there is nothing to do.  If we don't, then insert this
 	 * driver link into the list.
 	 */
 	TAILQ_FOREACH(dl, &passes, passlink) {
 		if (dl->pass < new->pass)
 			continue;
 		if (dl->pass == new->pass)
 			return;
 		TAILQ_INSERT_BEFORE(dl, new, passlink);
 		return;
 	}
 	TAILQ_INSERT_TAIL(&passes, new, passlink);
 }
 
 /**
  * @brief Raise the current bus pass
  *
  * Raise the current bus pass level to @p pass.  Call the BUS_NEW_PASS()
  * method on the root bus to kick off a new device tree scan for each
  * new pass level that has at least one driver.
  */
 void
 bus_set_pass(int pass)
 {
 	struct driverlink *dl;
 
 	if (bus_current_pass > pass)
 		panic("Attempt to lower bus pass level");
 
 	TAILQ_FOREACH(dl, &passes, passlink) {
 		/* Skip pass values below the current pass level. */
 		if (dl->pass <= bus_current_pass)
 			continue;
 
 		/*
 		 * Bail once we hit a driver with a pass level that is
 		 * too high.
 		 */
 		if (dl->pass > pass)
 			break;
 
 		/*
 		 * Raise the pass level to the next level and rescan
 		 * the tree.
 		 */
 		bus_current_pass = dl->pass;
 		BUS_NEW_PASS(root_bus);
 	}
 
 	/*
 	 * If there isn't a driver registered for the requested pass,
 	 * then bus_current_pass might still be less than 'pass'.  Set
 	 * it to 'pass' in that case.
 	 */
 	if (bus_current_pass < pass)
 		bus_current_pass = pass;
 	KASSERT(bus_current_pass == pass, ("Failed to update bus pass level"));
 }
 
 /*
  * Devclass implementation
  */
 
 static devclass_list_t devclasses = TAILQ_HEAD_INITIALIZER(devclasses);
 
 /**
  * @internal
  * @brief Find or create a device class
  *
  * If a device class with the name @p classname exists, return it,
  * otherwise if @p create is non-zero create and return a new device
  * class.
  *
  * If @p parentname is non-NULL, the parent of the devclass is set to
  * the devclass of that name.
  *
  * @param classname	the devclass name to find or create
  * @param parentname	the parent devclass name or @c NULL
  * @param create	non-zero to create a devclass
  */
 static devclass_t
 devclass_find_internal(const char *classname, const char *parentname,
 		       int create)
 {
 	devclass_t dc;
 
 	PDEBUG(("looking for %s", classname));
 	if (!classname)
 		return (NULL);
 
 	TAILQ_FOREACH(dc, &devclasses, link) {
 		if (!strcmp(dc->name, classname))
 			break;
 	}
 
 	if (create && !dc) {
 		PDEBUG(("creating %s", classname));
 		dc = malloc(sizeof(struct devclass) + strlen(classname) + 1,
 		    M_BUS, M_NOWAIT | M_ZERO);
 		if (!dc)
 			return (NULL);
 		dc->parent = NULL;
 		dc->name = (char*) (dc + 1);
 		strcpy(dc->name, classname);
 		TAILQ_INIT(&dc->drivers);
 		TAILQ_INSERT_TAIL(&devclasses, dc, link);
 
 		bus_data_generation_update();
 	}
 
 	/*
 	 * If a parent class is specified, then set that as our parent so
 	 * that this devclass will support drivers for the parent class as
 	 * well.  If the parent class has the same name don't do this though
 	 * as it creates a cycle that can trigger an infinite loop in
 	 * device_probe_child() if a device exists for which there is no
 	 * suitable driver.
 	 */
 	if (parentname && dc && !dc->parent &&
 	    strcmp(classname, parentname) != 0) {
 		dc->parent = devclass_find_internal(parentname, NULL, TRUE);
 		dc->parent->flags |= DC_HAS_CHILDREN;
 	}
 
 	return (dc);
 }
 
 /**
  * @brief Create a device class
  *
  * If a device class with the name @p classname exists, return it,
  * otherwise create and return a new device class.
  *
  * @param classname	the devclass name to find or create
  */
 devclass_t
 devclass_create(const char *classname)
 {
 	return (devclass_find_internal(classname, NULL, TRUE));
 }
 
 /**
  * @brief Find a device class
  *
  * If a device class with the name @p classname exists, return it,
  * otherwise return @c NULL.
  *
  * @param classname	the devclass name to find
  */
 devclass_t
 devclass_find(const char *classname)
 {
 	return (devclass_find_internal(classname, NULL, FALSE));
 }
 
 /**
  * @brief Register that a device driver has been added to a devclass
  *
  * Register that a device driver has been added to a devclass.  This
  * is called by devclass_add_driver to accomplish the recursive
  * notification of all the children classes of dc, as well as dc.
  * Each layer will have BUS_DRIVER_ADDED() called for all instances of
  * the devclass.
  *
  * We do a full search here of the devclass list at each iteration
  * level to save storing children-lists in the devclass structure.  If
  * we ever move beyond a few dozen devices doing this, we may need to
  * reevaluate...
  *
  * @param dc		the devclass to edit
  * @param driver	the driver that was just added
  */
 static void
 devclass_driver_added(devclass_t dc, driver_t *driver)
 {
 	devclass_t parent;
 	int i;
 
 	/*
 	 * Call BUS_DRIVER_ADDED for any existing busses in this class.
 	 */
 	for (i = 0; i < dc->maxunit; i++)
 		if (dc->devices[i] && device_is_attached(dc->devices[i]))
 			BUS_DRIVER_ADDED(dc->devices[i], driver);
 
 	/*
 	 * Walk through the children classes.  Since we only keep a
 	 * single parent pointer around, we walk the entire list of
 	 * devclasses looking for children.  We set the
 	 * DC_HAS_CHILDREN flag when a child devclass is created on
 	 * the parent, so we only walk the list for those devclasses
 	 * that have children.
 	 */
 	if (!(dc->flags & DC_HAS_CHILDREN))
 		return;
 	parent = dc;
 	TAILQ_FOREACH(dc, &devclasses, link) {
 		if (dc->parent == parent)
 			devclass_driver_added(dc, driver);
 	}
 }
 
 /**
  * @brief Add a device driver to a device class
  *
  * Add a device driver to a devclass. This is normally called
  * automatically by DRIVER_MODULE(). The BUS_DRIVER_ADDED() method of
  * all devices in the devclass will be called to allow them to attempt
  * to re-probe any unmatched children.
  *
  * @param dc		the devclass to edit
  * @param driver	the driver to register
  */
 int
 devclass_add_driver(devclass_t dc, driver_t *driver, int pass, devclass_t *dcp)
 {
 	driverlink_t dl;
 	const char *parentname;
 
 	PDEBUG(("%s", DRIVERNAME(driver)));
 
 	/* Don't allow invalid pass values. */
 	if (pass <= BUS_PASS_ROOT)
 		return (EINVAL);
 
 	dl = malloc(sizeof *dl, M_BUS, M_NOWAIT|M_ZERO);
 	if (!dl)
 		return (ENOMEM);
 
 	/*
 	 * Compile the driver's methods. Also increase the reference count
 	 * so that the class doesn't get freed when the last instance
 	 * goes. This means we can safely use static methods and avoids a
 	 * double-free in devclass_delete_driver.
 	 */
 	kobj_class_compile((kobj_class_t) driver);
 
 	/*
 	 * If the driver has any base classes, make the
 	 * devclass inherit from the devclass of the driver's
 	 * first base class. This will allow the system to
 	 * search for drivers in both devclasses for children
 	 * of a device using this driver.
 	 */
 	if (driver->baseclasses)
 		parentname = driver->baseclasses[0]->name;
 	else
 		parentname = NULL;
 	*dcp = devclass_find_internal(driver->name, parentname, TRUE);
 
 	dl->driver = driver;
 	TAILQ_INSERT_TAIL(&dc->drivers, dl, link);
 	driver->refs++;		/* XXX: kobj_mtx */
 	dl->pass = pass;
 	driver_register_pass(dl);
 
 	devclass_driver_added(dc, driver);
 	bus_data_generation_update();
 	return (0);
 }
 
 /**
  * @brief Register that a device driver has been deleted from a devclass
  *
  * Register that a device driver has been removed from a devclass.
  * This is called by devclass_delete_driver to accomplish the
  * recursive notification of all the children classes of busclass, as
  * well as busclass.  Each layer will attempt to detach the driver
  * from any devices that are children of the bus's devclass.  The function
  * will return an error if a device fails to detach.
  *
  * We do a full search here of the devclass list at each iteration
  * level to save storing children-lists in the devclass structure.  If
  * we ever move beyond a few dozen devices doing this, we may need to
  * reevaluate...
  *
  * @param busclass	the devclass of the parent bus
  * @param dc		the devclass of the driver being deleted
  * @param driver	the driver being deleted
  */
 static int
 devclass_driver_deleted(devclass_t busclass, devclass_t dc, driver_t *driver)
 {
 	devclass_t parent;
 	device_t dev;
 	int error, i;
 
 	/*
 	 * Disassociate from any devices.  We iterate through all the
 	 * devices in the devclass of the driver and detach any which are
 	 * using the driver and which have a parent in the devclass which
 	 * we are deleting from.
 	 *
 	 * Note that since a driver can be in multiple devclasses, we
 	 * should not detach devices which are not children of devices in
 	 * the affected devclass.
 	 */
 	for (i = 0; i < dc->maxunit; i++) {
 		if (dc->devices[i]) {
 			dev = dc->devices[i];
 			if (dev->driver == driver && dev->parent &&
 			    dev->parent->devclass == busclass) {
 				if ((error = device_detach(dev)) != 0)
 					return (error);
 				BUS_PROBE_NOMATCH(dev->parent, dev);
 				devnomatch(dev);
 				dev->flags |= DF_DONENOMATCH;
 			}
 		}
 	}
 
 	/*
 	 * Walk through the children classes.  Since we only keep a
 	 * single parent pointer around, we walk the entire list of
 	 * devclasses looking for children.  We set the
 	 * DC_HAS_CHILDREN flag when a child devclass is created on
 	 * the parent, so we only walk the list for those devclasses
 	 * that have children.
 	 */
 	if (!(busclass->flags & DC_HAS_CHILDREN))
 		return (0);
 	parent = busclass;
 	TAILQ_FOREACH(busclass, &devclasses, link) {
 		if (busclass->parent == parent) {
 			error = devclass_driver_deleted(busclass, dc, driver);
 			if (error)
 				return (error);
 		}
 	}
 	return (0);
 }
 
 /**
  * @brief Delete a device driver from a device class
  *
  * Delete a device driver from a devclass. This is normally called
  * automatically by DRIVER_MODULE().
  *
  * If the driver is currently attached to any devices,
  * devclass_delete_driver() will first attempt to detach from each
  * device. If one of the detach calls fails, the driver will not be
  * deleted.
  *
  * @param busclass	the devclass to edit
  * @param driver	the driver to unregister
  */
 int
 devclass_delete_driver(devclass_t busclass, driver_t *driver)
 {
 	devclass_t dc = devclass_find(driver->name);
 	driverlink_t dl;
 	int error;
 
 	PDEBUG(("%s from devclass %s", driver->name, DEVCLANAME(busclass)));
 
 	if (!dc)
 		return (0);
 
 	/*
 	 * Find the link structure in the bus' list of drivers.
 	 */
 	TAILQ_FOREACH(dl, &busclass->drivers, link) {
 		if (dl->driver == driver)
 			break;
 	}
 
 	if (!dl) {
 		PDEBUG(("%s not found in %s list", driver->name,
 		    busclass->name));
 		return (ENOENT);
 	}
 
 	error = devclass_driver_deleted(busclass, dc, driver);
 	if (error != 0)
 		return (error);
 
 	TAILQ_REMOVE(&busclass->drivers, dl, link);
 	free(dl, M_BUS);
 
 	/* XXX: kobj_mtx */
 	driver->refs--;
 	if (driver->refs == 0)
 		kobj_class_free((kobj_class_t) driver);
 
 	bus_data_generation_update();
 	return (0);
 }
 
 /**
  * @brief Quiesces a set of device drivers from a device class
  *
  * Quiesce a device driver from a devclass. This is normally called
  * automatically by DRIVER_MODULE().
  *
  * If the driver is currently attached to any devices,
  * devclass_quiesce_driver() will first attempt to quiesce each
  * device.
  *
  * @param busclass	the devclass of the parent bus
  * @param driver	the driver to quiesce
  */
 static int
 devclass_quiesce_driver(devclass_t busclass, driver_t *driver)
 {
 	devclass_t dc = devclass_find(driver->name);
 	driverlink_t dl;
 	device_t dev;
 	int i;
 	int error;
 
 	PDEBUG(("%s from devclass %s", driver->name, DEVCLANAME(busclass)));
 
 	if (!dc)
 		return (0);
 
 	/*
 	 * Find the link structure in the bus' list of drivers.
 	 */
 	TAILQ_FOREACH(dl, &busclass->drivers, link) {
 		if (dl->driver == driver)
 			break;
 	}
 
 	if (!dl) {
 		PDEBUG(("%s not found in %s list", driver->name,
 		    busclass->name));
 		return (ENOENT);
 	}
 
 	/*
 	 * Quiesce all devices.  We iterate through all the devices in
 	 * the devclass of the driver and quiesce any which are using
 	 * the driver and which have a parent in the devclass which we
 	 * are quiescing.
 	 *
 	 * Note that since a driver can be in multiple devclasses, we
 	 * should not quiesce devices which are not children of
 	 * devices in the affected devclass.
 	 */
 	for (i = 0; i < dc->maxunit; i++) {
 		if (dc->devices[i]) {
 			dev = dc->devices[i];
 			if (dev->driver == driver && dev->parent &&
 			    dev->parent->devclass == busclass) {
 				if ((error = device_quiesce(dev)) != 0)
 					return (error);
 			}
 		}
 	}
 
 	return (0);
 }
 
 /**
  * @internal
  */
 static driverlink_t
 devclass_find_driver_internal(devclass_t dc, const char *classname)
 {
 	driverlink_t dl;
 
 	PDEBUG(("%s in devclass %s", classname, DEVCLANAME(dc)));
 
 	TAILQ_FOREACH(dl, &dc->drivers, link) {
 		if (!strcmp(dl->driver->name, classname))
 			return (dl);
 	}
 
 	PDEBUG(("not found"));
 	return (NULL);
 }
 
 /**
  * @brief Return the name of the devclass
  */
 const char *
 devclass_get_name(devclass_t dc)
 {
 	return (dc->name);
 }
 
 /**
  * @brief Find a device given a unit number
  *
  * @param dc		the devclass to search
  * @param unit		the unit number to search for
  *
  * @returns		the device with the given unit number or @c
  *			NULL if there is no such device
  */
 device_t
 devclass_get_device(devclass_t dc, int unit)
 {
 	if (dc == NULL || unit < 0 || unit >= dc->maxunit)
 		return (NULL);
 	return (dc->devices[unit]);
 }
 
 /**
  * @brief Find the softc field of a device given a unit number
  *
  * @param dc		the devclass to search
  * @param unit		the unit number to search for
  *
  * @returns		the softc field of the device with the given
  *			unit number or @c NULL if there is no such
  *			device
  */
 void *
 devclass_get_softc(devclass_t dc, int unit)
 {
 	device_t dev;
 
 	dev = devclass_get_device(dc, unit);
 	if (!dev)
 		return (NULL);
 
 	return (device_get_softc(dev));
 }
 
 /**
  * @brief Get a list of devices in the devclass
  *
  * An array containing a list of all the devices in the given devclass
  * is allocated and returned in @p *devlistp. The number of devices
  * in the array is returned in @p *devcountp. The caller should free
  * the array using @c free(p, M_TEMP), even if @p *devcountp is 0.
  *
  * @param dc		the devclass to examine
  * @param devlistp	points at location for array pointer return
  *			value
  * @param devcountp	points at location for array size return value
  *
  * @retval 0		success
  * @retval ENOMEM	the array allocation failed
  */
 int
 devclass_get_devices(devclass_t dc, device_t **devlistp, int *devcountp)
 {
 	int count, i;
 	device_t *list;
 
 	count = devclass_get_count(dc);
 	list = malloc(count * sizeof(device_t), M_TEMP, M_NOWAIT|M_ZERO);
 	if (!list)
 		return (ENOMEM);
 
 	count = 0;
 	for (i = 0; i < dc->maxunit; i++) {
 		if (dc->devices[i]) {
 			list[count] = dc->devices[i];
 			count++;
 		}
 	}
 
 	*devlistp = list;
 	*devcountp = count;
 
 	return (0);
 }
 
 /**
  * @brief Get a list of drivers in the devclass
  *
  * An array containing a list of pointers to all the drivers in the
  * given devclass is allocated and returned in @p *listp.  The number
  * of drivers in the array is returned in @p *countp. The caller should
  * free the array using @c free(p, M_TEMP).
  *
  * @param dc		the devclass to examine
  * @param listp		gives location for array pointer return value
  * @param countp	gives location for number of array elements
  *			return value
  *
  * @retval 0		success
  * @retval ENOMEM	the array allocation failed
  */
 int
 devclass_get_drivers(devclass_t dc, driver_t ***listp, int *countp)
 {
 	driverlink_t dl;
 	driver_t **list;
 	int count;
 
 	count = 0;
 	TAILQ_FOREACH(dl, &dc->drivers, link)
 		count++;
 	list = malloc(count * sizeof(driver_t *), M_TEMP, M_NOWAIT);
 	if (list == NULL)
 		return (ENOMEM);
 
 	count = 0;
 	TAILQ_FOREACH(dl, &dc->drivers, link) {
 		list[count] = dl->driver;
 		count++;
 	}
 	*listp = list;
 	*countp = count;
 
 	return (0);
 }
 
 /**
  * @brief Get the number of devices in a devclass
  *
  * @param dc		the devclass to examine
  */
 int
 devclass_get_count(devclass_t dc)
 {
 	int count, i;
 
 	count = 0;
 	for (i = 0; i < dc->maxunit; i++)
 		if (dc->devices[i])
 			count++;
 	return (count);
 }
 
 /**
  * @brief Get the maximum unit number used in a devclass
  *
  * Note that this is one greater than the highest currently-allocated
  * unit.  If a null devclass_t is passed in, -1 is returned to indicate
  * that not even the devclass has been allocated yet.
  *
  * @param dc		the devclass to examine
  */
 int
 devclass_get_maxunit(devclass_t dc)
 {
 	if (dc == NULL)
 		return (-1);
 	return (dc->maxunit);
 }
 
 /**
  * @brief Find a free unit number in a devclass
  *
  * This function searches for the first unused unit number greater
  * than or equal to @p unit.
  *
  * @param dc		the devclass to examine
  * @param unit		the first unit number to check
  */
 int
 devclass_find_free_unit(devclass_t dc, int unit)
 {
 	if (dc == NULL)
 		return (unit);
 	while (unit < dc->maxunit && dc->devices[unit] != NULL)
 		unit++;
 	return (unit);
 }
 
 /**
  * @brief Set the parent of a devclass
  *
  * The parent class is normally initialised automatically by
  * DRIVER_MODULE().
  *
  * @param dc		the devclass to edit
  * @param pdc		the new parent devclass
  */
 void
 devclass_set_parent(devclass_t dc, devclass_t pdc)
 {
 	dc->parent = pdc;
 }
 
 /**
  * @brief Get the parent of a devclass
  *
  * @param dc		the devclass to examine
  */
 devclass_t
 devclass_get_parent(devclass_t dc)
 {
 	return (dc->parent);
 }
 
 struct sysctl_ctx_list *
 devclass_get_sysctl_ctx(devclass_t dc)
 {
 	return (&dc->sysctl_ctx);
 }
 
 struct sysctl_oid *
 devclass_get_sysctl_tree(devclass_t dc)
 {
 	return (dc->sysctl_tree);
 }
 
 /**
  * @internal
  * @brief Allocate a unit number
  *
  * On entry, @p *unitp is the desired unit number (or @c -1 if any
  * will do). The allocated unit number is returned in @p *unitp.
  *
  * @param dc		the devclass to allocate from
  * @param unitp		points at the location for the allocated unit
  *			number
  *
  * @retval 0		success
  * @retval EEXIST	the requested unit number is already allocated
  * @retval ENOMEM	memory allocation failure
  */
 static int
 devclass_alloc_unit(devclass_t dc, device_t dev, int *unitp)
 {
 	const char *s;
 	int unit = *unitp;
 
 	PDEBUG(("unit %d in devclass %s", unit, DEVCLANAME(dc)));
 
 	/* Ask the parent bus if it wants to wire this device. */
 	if (unit == -1)
 		BUS_HINT_DEVICE_UNIT(device_get_parent(dev), dev, dc->name,
 		    &unit);
 
 	/* If we were given a wired unit number, check for existing device */
 	/* XXX imp XXX */
 	if (unit != -1) {
 		if (unit >= 0 && unit < dc->maxunit &&
 		    dc->devices[unit] != NULL) {
 			if (bootverbose)
 				printf("%s: %s%d already exists; skipping it\n",
 				    dc->name, dc->name, *unitp);
 			return (EEXIST);
 		}
 	} else {
 		/* Unwired device, find the next available slot for it */
 		unit = 0;
 		for (unit = 0;; unit++) {
 			/* If there is an "at" hint for a unit then skip it. */
 			if (resource_string_value(dc->name, unit, "at", &s) ==
 			    0)
 				continue;
 
 			/* If this device slot is already in use, skip it. */
 			if (unit < dc->maxunit && dc->devices[unit] != NULL)
 				continue;
 
 			break;
 		}
 	}
 
 	/*
 	 * We've selected a unit beyond the length of the table, so let's
 	 * extend the table to make room for all units up to and including
 	 * this one.
 	 */
 	if (unit >= dc->maxunit) {
 		device_t *newlist, *oldlist;
 		int newsize;
 
 		oldlist = dc->devices;
 		newsize = roundup((unit + 1), MINALLOCSIZE / sizeof(device_t));
 		newlist = malloc(sizeof(device_t) * newsize, M_BUS, M_NOWAIT);
 		if (!newlist)
 			return (ENOMEM);
 		if (oldlist != NULL)
 			bcopy(oldlist, newlist, sizeof(device_t) * dc->maxunit);
 		bzero(newlist + dc->maxunit,
 		    sizeof(device_t) * (newsize - dc->maxunit));
 		dc->devices = newlist;
 		dc->maxunit = newsize;
 		if (oldlist != NULL)
 			free(oldlist, M_BUS);
 	}
 	PDEBUG(("now: unit %d in devclass %s", unit, DEVCLANAME(dc)));
 
 	*unitp = unit;
 	return (0);
 }
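
The unit-table handling in devclass_find_free_unit() and devclass_alloc_unit() can be illustrated outside the kernel. The following is a minimal userland sketch, not the kernel code: struct unit_table, SLOT_GRANULE, round_up(), table_find_free() and table_reserve() are hypothetical names, and calloc()/memcpy() stand in for the kernel's malloc(9), bcopy() and bzero().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for MINALLOCSIZE / sizeof(device_t). */
#define SLOT_GRANULE 8

struct unit_table {
	void **slots;	/* slots[i] != NULL means unit i is in use */
	int maxunit;	/* number of slots allocated, not the highest unit */
};

/* Round n up to the next multiple of step. */
static int
round_up(int n, int step)
{
	return (((n + step - 1) / step) * step);
}

/* Return the first free unit at or after 'unit'; may be >= maxunit. */
static int
table_find_free(const struct unit_table *t, int unit)
{
	assert(unit >= 0);
	while (unit < t->maxunit && t->slots[unit] != NULL)
		unit++;
	return (unit);
}

/* Grow the table so 'unit' is a valid index; 0 on success, -1 on failure. */
static int
table_reserve(struct unit_table *t, int unit)
{
	void **newslots;
	int newsize;

	assert(unit >= 0);
	if (unit < t->maxunit)
		return (0);
	newsize = round_up(unit + 1, SLOT_GRANULE);
	/* calloc() zeroes the new tail, as bzero() does in the kernel. */
	newslots = calloc((size_t)newsize, sizeof(*newslots));
	if (newslots == NULL)
		return (-1);
	if (t->slots != NULL)
		memcpy(newslots, t->slots,
		    sizeof(*newslots) * (size_t)t->maxunit);
	free(t->slots);
	t->slots = newslots;
	t->maxunit = newsize;
	return (0);
}
```

As in the kernel routine, maxunit here is the allocated table length rounded up to a granule, so a search can return an index at or past maxunit and the caller must grow the table before storing into that slot.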
 
 /**
  * @internal
  * @brief Add a device to a devclass
  *
  * A unit number is allocated for the device (using the device's
  * preferred unit number if any) and the device is registered in the
  * devclass. This allows the device to be looked up by its unit
  * number, e.g. by decoding a dev_t minor number.
  *
  * @param dc		the devclass to add to
  * @param dev		the device to add
  *
  * @retval 0		success
  * @retval EEXIST	the requested unit number is already allocated
  * @retval ENOMEM	memory allocation failure
  */
 static int
 devclass_add_device(devclass_t dc, device_t dev)
 {
 	int buflen, error;
 
 	PDEBUG(("%s in devclass %s", DEVICENAME(dev), DEVCLANAME(dc)));
 
 	buflen = snprintf(NULL, 0, "%s%d$", dc->name, INT_MAX);
 	if (buflen < 0)
 		return (ENOMEM);
 	dev->nameunit = malloc(buflen, M_BUS, M_NOWAIT|M_ZERO);
 	if (!dev->nameunit)
 		return (ENOMEM);
 
 	if ((error = devclass_alloc_unit(dc, dev, &dev->unit)) != 0) {
 		free(dev->nameunit, M_BUS);
 		dev->nameunit = NULL;
 		return (error);
 	}
 	dc->devices[dev->unit] = dev;
 	dev->devclass = dc;
 	snprintf(dev->nameunit, buflen, "%s%d", dc->name, dev->unit);
 
 	return (0);
 }
 
 /**
  * @internal
  * @brief Delete a device from a devclass
  *
  * The device is removed from the devclass's device list and its unit
  * number is freed.
  *
  * @param dc		the devclass to delete from
  * @param dev		the device to delete
  *
  * @retval 0		success
  */
 static int
 devclass_delete_device(devclass_t dc, device_t dev)
 {
 	if (!dc || !dev)
 		return (0);
 
 	PDEBUG(("%s in devclass %s", DEVICENAME(dev), DEVCLANAME(dc)));
 
 	if (dev->devclass != dc || dc->devices[dev->unit] != dev)
 		panic("devclass_delete_device: inconsistent device class");
 	dc->devices[dev->unit] = NULL;
 	if (dev->flags & DF_WILDCARD)
 		dev->unit = -1;
 	dev->devclass = NULL;
 	free(dev->nameunit, M_BUS);
 	dev->nameunit = NULL;
 
 	return (0);
 }
 
 /**
  * @internal
  * @brief Make a new device and add it as a child of @p parent
  *
  * @param parent	the parent of the new device
  * @param name		the devclass name of the new device or @c NULL
  *			to leave the devclass unspecified
  * @param unit		the unit number of the new device or @c -1 to
  *			leave the unit number unspecified
  *
  * @returns the new device
  */
 static device_t
 make_device(device_t parent, const char *name, int unit)
 {
 	device_t dev;
 	devclass_t dc;
 
 	PDEBUG(("%s at %s as unit %d", name, DEVICENAME(parent), unit));
 
 	if (name) {
 		dc = devclass_find_internal(name, NULL, TRUE);
 		if (!dc) {
 			printf("make_device: can't find device class %s\n",
 			    name);
 			return (NULL);
 		}
 	} else {
 		dc = NULL;
 	}
 
 	dev = malloc(sizeof(struct device), M_BUS, M_NOWAIT|M_ZERO);
 	if (!dev)
 		return (NULL);
 
 	dev->parent = parent;
 	TAILQ_INIT(&dev->children);
 	kobj_init((kobj_t) dev, &null_class);
 	dev->driver = NULL;
 	dev->devclass = NULL;
 	dev->unit = unit;
 	dev->nameunit = NULL;
 	dev->desc = NULL;
 	dev->busy = 0;
 	dev->devflags = 0;
 	dev->flags = DF_ENABLED;
 	dev->order = 0;
 	if (unit == -1)
 		dev->flags |= DF_WILDCARD;
 	if (name) {
 		dev->flags |= DF_FIXEDCLASS;
 		if (devclass_add_device(dc, dev)) {
 			kobj_delete((kobj_t) dev, M_BUS);
 			return (NULL);
 		}
 	}
 	dev->ivars = NULL;
 	dev->softc = NULL;
 
 	dev->state = DS_NOTPRESENT;
 
 	TAILQ_INSERT_TAIL(&bus_data_devices, dev, devlink);
 	bus_data_generation_update();
 
 	return (dev);
 }
 
 /**
  * @internal
  * @brief Print a description of a device.
  */
 static int
 device_print_child(device_t dev, device_t child)
 {
 	int retval = 0;
 
 	if (device_is_alive(child))
 		retval += BUS_PRINT_CHILD(dev, child);
 	else
 		retval += device_printf(child, " not found\n");
 
 	return (retval);
 }
 
 /**
  * @brief Create a new device
  *
  * This creates a new device and adds it as a child of an existing
  * parent device. The new device will be added after the last existing
  * child with order zero.
  *
  * @param dev		the device which will be the parent of the
  *			new child device
  * @param name		devclass name for new device or @c NULL if not
  *			specified
  * @param unit		unit number for new device or @c -1 if not
  *			specified
  *
  * @returns		the new device
  */
 device_t
 device_add_child(device_t dev, const char *name, int unit)
 {
 	return (device_add_child_ordered(dev, 0, name, unit));
 }
 
 /**
  * @brief Create a new device
  *
  * This creates a new device and adds it as a child of an existing
  * parent device. The new device will be added after the last existing
  * child with the same order.
  *
  * @param dev		the device which will be the parent of the
  *			new child device
  * @param order		a value which is used to partially sort the
  *			children of @p dev - devices created using
  *			lower values of @p order appear first in @p
  *			dev's list of children
  * @param name		devclass name for new device or @c NULL if not
  *			specified
  * @param unit		unit number for new device or @c -1 if not
  *			specified
  *
  * @returns		the new device
  */
 device_t
 device_add_child_ordered(device_t dev, u_int order, const char *name, int unit)
 {
 	device_t child;
 	device_t place;
 
 	PDEBUG(("%s at %s with order %u as unit %d",
 	    name, DEVICENAME(dev), order, unit));
 	KASSERT(name != NULL || unit == -1,
 	    ("child device with wildcard name and specific unit number"));
 
 	child = make_device(dev, name, unit);
 	if (child == NULL)
 		return (child);
 	child->order = order;
 
 	TAILQ_FOREACH(place, &dev->children, link) {
 		if (place->order > order)
 			break;
 	}
 
 	if (place) {
 		/*
 		 * The device 'place' is the first device whose order is
 		 * greater than the new child.
 		 */
 		TAILQ_INSERT_BEFORE(place, child, link);
 	} else {
 		/*
 		 * The new child's order is greater than or equal to the
 		 * order of any existing device. Add the child to the tail
 		 * of the list.
 		 */
 		TAILQ_INSERT_TAIL(&dev->children, child, link);
 	}
 
 	bus_data_generation_update();
 	return (child);
 }
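
The placement rule used by device_add_child_ordered() (insert before the first sibling with a strictly greater order, otherwise append) keeps equal-order children in creation order. A minimal sketch of that rule, assuming a plain singly linked list in place of the kernel's TAILQ; struct child and insert_ordered() are hypothetical names used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct child {
	unsigned order;		/* sort key, lower values come first */
	int id;			/* illustrative payload */
	struct child *next;
};

/*
 * Insert c before the first element whose order is strictly greater
 * than c->order.  If no such element exists, c lands at the tail, so
 * children with equal order keep their insertion order.
 */
static void
insert_ordered(struct child **head, struct child *c)
{
	struct child **pp;

	assert(head != NULL && c != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if ((*pp)->order > c->order)
			break;
	}
	c->next = *pp;
	*pp = c;
}
```

Because the comparison is strict, the insertion is stable: a new child never moves ahead of an existing sibling that has the same order.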
 
 /**
  * @brief Delete a device
  *
  * This function deletes a device along with all of its children. If
  * the device currently has a driver attached to it, the device is
  * detached first using device_detach().
  *
  * @param dev		the parent device
  * @param child		the device to delete
  *
  * @retval 0		success
  * @retval non-zero	an errno value describing the error
  */
 int
 device_delete_child(device_t dev, device_t child)
 {
 	int error;
 	device_t grandchild;
 
 	PDEBUG(("%s from %s", DEVICENAME(child), DEVICENAME(dev)));
 
 	/* remove children first */
 	while ((grandchild = TAILQ_FIRST(&child->children)) != NULL) {
 		error = device_delete_child(child, grandchild);
 		if (error)
 			return (error);
 	}
 
 	if ((error = device_detach(child)) != 0)
 		return (error);
 	if (child->devclass)
 		devclass_delete_device(child->devclass, child);
 	if (child->parent)
 		BUS_CHILD_DELETED(dev, child);
 	TAILQ_REMOVE(&dev->children, child, link);
 	TAILQ_REMOVE(&bus_data_devices, child, devlink);
 	kobj_delete((kobj_t) child, M_BUS);
 
 	bus_data_generation_update();
 	return (0);
 }
 
 /**
  * @brief Delete all child devices of the given device, if any.
  *
  * This function deletes all child devices of the given device, if
  * any, using the device_delete_child() function for each device it
  * finds. If a child device cannot be deleted, this function will
  * return an error code.
  *
  * @param dev		the parent device
  *
  * @retval 0		success
  * @retval non-zero	a device would not detach
  */
 int
 device_delete_children(device_t dev)
 {
 	device_t child;
 	int error;
 
 	PDEBUG(("Deleting all children of %s", DEVICENAME(dev)));
 
 	error = 0;
 
 	while ((child = TAILQ_FIRST(&dev->children)) != NULL) {
 		error = device_delete_child(dev, child);
 		if (error) {
 			PDEBUG(("Failed deleting %s", DEVICENAME(child)));
 			break;
 		}
 	}
 	return (error);
 }
 
 /**
  * @brief Find a device given a unit number
  *
  * This is similar to devclass_get_device() but only searches for
  * devices which have @p dev as a parent.
  *
  * @param dev		the parent device to search
  * @param classname	the devclass name to search in
  * @param unit		the unit number to search for.  If the unit is -1,
  *			return the first child of @p dev which has name
  *			@p classname (that is, the one with the lowest unit.)
  *
  * @returns		the device with the given unit number or @c
  *			NULL if there is no such device
  */
 device_t
 device_find_child(device_t dev, const char *classname, int unit)
 {
 	devclass_t dc;
 	device_t child;
 
 	dc = devclass_find(classname);
 	if (!dc)
 		return (NULL);
 
 	if (unit != -1) {
 		child = devclass_get_device(dc, unit);
 		if (child && child->parent == dev)
 			return (child);
 	} else {
 		for (unit = 0; unit < devclass_get_maxunit(dc); unit++) {
 			child = devclass_get_device(dc, unit);
 			if (child && child->parent == dev)
 				return (child);
 		}
 	}
 	return (NULL);
 }
 
 /**
  * @internal
  */
 static driverlink_t
 first_matching_driver(devclass_t dc, device_t dev)
 {
 	if (dev->devclass)
 		return (devclass_find_driver_internal(dc, dev->devclass->name));
 	return (TAILQ_FIRST(&dc->drivers));
 }
 
 /**
  * @internal
  */
 static driverlink_t
 next_matching_driver(devclass_t dc, device_t dev, driverlink_t last)
 {
 	if (dev->devclass) {
 		driverlink_t dl;
 		for (dl = TAILQ_NEXT(last, link); dl; dl = TAILQ_NEXT(dl, link))
 			if (!strcmp(dev->devclass->name, dl->driver->name))
 				return (dl);
 		return (NULL);
 	}
 	return (TAILQ_NEXT(last, link));
 }
 
 /**
  * @internal
  */
 int
 device_probe_child(device_t dev, device_t child)
 {
 	devclass_t dc;
 	driverlink_t best = NULL;
 	driverlink_t dl;
 	int result, pri = 0;
 	int hasclass = (child->devclass != NULL);
 
 	GIANT_REQUIRED;
 
 	dc = dev->devclass;
 	if (!dc)
 		panic("device_probe_child: parent device has no devclass");
 
 	/*
 	 * If the state is already probed, then return.  However, don't
 	 * return if we can rebid this object.
 	 */
 	if (child->state == DS_ALIVE && (child->flags & DF_REBID) == 0)
 		return (0);
 
 	for (; dc; dc = dc->parent) {
 		for (dl = first_matching_driver(dc, child);
 		     dl;
 		     dl = next_matching_driver(dc, child, dl)) {
 			/* If this driver's pass is too high, then ignore it. */
 			if (dl->pass > bus_current_pass)
 				continue;
 
 			PDEBUG(("Trying %s", DRIVERNAME(dl->driver)));
 			result = device_set_driver(child, dl->driver);
 			if (result == ENOMEM)
 				return (result);
 			else if (result != 0)
 				continue;
 			if (!hasclass) {
 				if (device_set_devclass(child,
 				    dl->driver->name) != 0) {
 					char const * devname =
 					    device_get_name(child);
 					if (devname == NULL)
 						devname = "(unknown)";
 					printf("driver bug: Unable to set "
 					    "devclass (class: %s "
 					    "devname: %s)\n",
 					    dl->driver->name,
 					    devname);
 					(void)device_set_driver(child, NULL);
 					continue;
 				}
 			}
 
 			/* Fetch any flags for the device before probing. */
 			resource_int_value(dl->driver->name, child->unit,
 			    "flags", &child->devflags);
 
 			result = DEVICE_PROBE(child);
 
 			/* Reset flags and devclass before the next probe. */
 			child->devflags = 0;
 			if (!hasclass)
 				(void)device_set_devclass(child, NULL);
 
 			/*
 			 * If the driver returns SUCCESS, there can be
 			 * no higher match for this device.
 			 */
 			if (result == 0) {
 				best = dl;
 				pri = 0;
 				break;
 			}
 
 			/*
+			 * Probes that return BUS_PROBE_NOWILDCARD or lower
+			 * only match on devices whose driver was explicitly
+			 * specified.
+			 */
+			if (result <= BUS_PROBE_NOWILDCARD &&
+			    !(child->flags & DF_FIXEDCLASS)) {
+				result = ENXIO;
+			}
+
+			/*
 			 * The driver returned an error so it
 			 * certainly doesn't match.
 			 */
 			if (result > 0) {
 				(void)device_set_driver(child, NULL);
 				continue;
 			}
 
 			/*
 			 * A priority lower than SUCCESS, remember the
 			 * best matching driver. Initialise the value
 			 * of pri for the first match.
 			 */
 			if (best == NULL || result > pri) {
-				/*
-				 * Probes that return BUS_PROBE_NOWILDCARD
-				 * or lower only match on devices whose
-				 * driver was explicitly specified.
-				 */
-				if (result <= BUS_PROBE_NOWILDCARD &&
-				    !(child->flags & DF_FIXEDCLASS))
-					continue;
 				best = dl;
 				pri = result;
 				continue;
 			}
 		}
 		/*
 		 * If we have an unambiguous match in this devclass,
 		 * don't look in the parent.
 		 */
 		if (best && pri == 0)
 			break;
 	}
 
 	/*
 	 * If we found a driver, change state and initialise the devclass.
 	 */
 	/* XXX What happens if we rebid and got no best? */
 	if (best) {
 		/*
 		 * If this device was attached, and we were asked to
 		 * rescan, and it is a different driver, then we have
 		 * to detach the old driver and reattach this new one.
 		 * Note, we don't have to check for DF_REBID here
 		 * because if the state is > DS_ALIVE, we know it must
 		 * be.
 		 *
 		 * This assumes that all DF_REBID drivers can have
 		 * their probe routine called at any time and that
 		 * they are idempotent as well as completely benign in
 		 * normal operations.
 		 *
 		 * We also have to make sure that the detach
 		 * succeeded, otherwise we fail the operation (or
 		 * maybe it should just fail silently?  I'm torn).
 		 */
 		if (child->state > DS_ALIVE && best->driver != child->driver)
 			if ((result = device_detach(dev)) != 0)
 				return (result);
 
 		/* Set the winning driver, devclass, and flags. */
 		if (!child->devclass) {
 			result = device_set_devclass(child, best->driver->name);
 			if (result != 0)
 				return (result);
 		}
 		result = device_set_driver(child, best->driver);
 		if (result != 0)
 			return (result);
 		resource_int_value(best->driver->name, child->unit,
 		    "flags", &child->devflags);
 
 		if (pri < 0) {
 			/*
 			 * A bit bogus. Call the probe method again to make
 			 * sure that we have the right description.
 			 */
 			DEVICE_PROBE(child);
 #if 0
 			child->flags |= DF_REBID;
 #endif
 		} else
 			child->flags &= ~DF_REBID;
 		child->state = DS_ALIVE;
 
 		bus_data_generation_update();
 		return (0);
 	}
 
 	return (ENXIO);
 }
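
The selection rule in the probe loop above, with the BUS_PROBE_NOWILDCARD check moved ahead of the error test, can be sketched in isolation. This is a simplified userland illustration under stated assumptions, not the kernel routine: pick_best() and SKETCH_PROBE_NOWILDCARD are hypothetical, the threshold value is chosen for illustration, and the sketch ignores the devclass parent walk, pass levels and driver bookkeeping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical threshold for illustration; the kernel defines its own. */
#define SKETCH_PROBE_NOWILDCARD (-2000000000)

/*
 * results[i] is the probe return value of candidate i:
 *   0   exact match, wins immediately
 *   <0  match with a priority, values closer to zero win
 *   >0  errno value, no match
 * fixedclass is non-zero when the device's driver was named explicitly.
 * Returns the index of the winning candidate, or -1 if none matched.
 */
static int
pick_best(const int *results, size_t n, int fixedclass)
{
	int best = -1, pri = 0;
	size_t i;

	assert(results != NULL || n == 0);
	for (i = 0; i < n; i++) {
		int result = results[i];

		if (result == 0)
			return ((int)i);
		/* Demote wildcard-only matches to an error up front. */
		if (result <= SKETCH_PROBE_NOWILDCARD && !fixedclass)
			result = ENXIO;
		if (result > 0)
			continue;
		if (best == -1 || result > pri) {
			best = (int)i;
			pri = result;
		}
	}
	return (best);
}
```

Converting the low-priority result to ENXIO before the error test means such a candidate takes the same rejection path as a failed probe, rather than being skipped only at the point where the best match is recorded.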
 
 /**
  * @brief Return the parent of a device
  */
 device_t
 device_get_parent(device_t dev)
 {
 	return (dev->parent);
 }
 
 /**
  * @brief Get a list of children of a device
  *
  * An array containing a list of all the children of the given device
  * is allocated and returned in @p *devlistp. The number of devices
  * in the array is returned in @p *devcountp. The caller should free
  * the array using @c free(p, M_TEMP).
  *
  * @param dev		the device to examine
  * @param devlistp	points at location for array pointer return
  *			value
  * @param devcountp	points at location for array size return value
  *
  * @retval 0		success
  * @retval ENOMEM	the array allocation failed
  */
 int
 device_get_children(device_t dev, device_t **devlistp, int *devcountp)
 {
 	int count;
 	device_t child;
 	device_t *list;
 
 	count = 0;
 	TAILQ_FOREACH(child, &dev->children, link) {
 		count++;
 	}
 	if (count == 0) {
 		*devlistp = NULL;
 		*devcountp = 0;
 		return (0);
 	}
 
 	list = malloc(count * sizeof(device_t), M_TEMP, M_NOWAIT|M_ZERO);
 	if (!list)
 		return (ENOMEM);
 
 	count = 0;
 	TAILQ_FOREACH(child, &dev->children, link) {
 		list[count] = child;
 		count++;
 	}
 
 	*devlistp = list;
 	*devcountp = count;
 
 	return (0);
 }
 
 /**
  * @brief Return the current driver for the device or @c NULL if there
  * is no driver currently attached
  */
 driver_t *
 device_get_driver(device_t dev)
 {
 	return (dev->driver);
 }
 
 /**
  * @brief Return the current devclass for the device or @c NULL if
  * there is none.
  */
 devclass_t
 device_get_devclass(device_t dev)
 {
 	return (dev->devclass);
 }
 
 /**
  * @brief Return the name of the device's devclass or @c NULL if there
  * is none.
  */
 const char *
 device_get_name(device_t dev)
 {
 	if (dev != NULL && dev->devclass)
 		return (devclass_get_name(dev->devclass));
 	return (NULL);
 }
 
 /**
  * @brief Return a string containing the device's devclass name
  * followed by an ascii representation of the device's unit number
  * (e.g. @c "foo2").
  */
 const char *
 device_get_nameunit(device_t dev)
 {
 	return (dev->nameunit);
 }
 
 /**
  * @brief Return the device's unit number.
  */
 int
 device_get_unit(device_t dev)
 {
 	return (dev->unit);
 }
 
 /**
  * @brief Return the device's description string
  */
 const char *
 device_get_desc(device_t dev)
 {
 	return (dev->desc);
 }
 
 /**
  * @brief Return the device's flags
  */
 uint32_t
 device_get_flags(device_t dev)
 {
 	return (dev->devflags);
 }
 
 struct sysctl_ctx_list *
 device_get_sysctl_ctx(device_t dev)
 {
 	return (&dev->sysctl_ctx);
 }
 
 struct sysctl_oid *
 device_get_sysctl_tree(device_t dev)
 {
 	return (dev->sysctl_tree);
 }
 
 /**
  * @brief Print the name of the device followed by a colon and a space
  *
  * @returns the number of characters printed
  */
 int
 device_print_prettyname(device_t dev)
 {
 	const char *name = device_get_name(dev);
 
 	if (name == NULL)
 		return (printf("unknown: "));
 	return (printf("%s%d: ", name, device_get_unit(dev)));
 }
 
 /**
  * @brief Print the name of the device followed by a colon, a space
  * and the result of calling vprintf() with the value of @p fmt and
  * the following arguments.
  *
  * @returns the number of characters printed
  */
 int
 device_printf(device_t dev, const char * fmt, ...)
 {
 	va_list ap;
 	int retval;
 
 	retval = device_print_prettyname(dev);
 	va_start(ap, fmt);
 	retval += vprintf(fmt, ap);
 	va_end(ap);
 	return (retval);
 }
 
 /**
  * @internal
  */
 static void
 device_set_desc_internal(device_t dev, const char* desc, int copy)
 {
 	if (dev->desc && (dev->flags & DF_DESCMALLOCED)) {
 		free(dev->desc, M_BUS);
 		dev->flags &= ~DF_DESCMALLOCED;
 		dev->desc = NULL;
 	}
 
 	if (copy && desc) {
 		dev->desc = malloc(strlen(desc) + 1, M_BUS, M_NOWAIT);
 		if (dev->desc) {
 			strcpy(dev->desc, desc);
 			dev->flags |= DF_DESCMALLOCED;
 		}
 	} else {
 		/* Avoid a -Wcast-qual warning */
 		dev->desc = (char *)(uintptr_t) desc;
 	}
 
 	bus_data_generation_update();
 }
 
 /**
  * @brief Set the device's description
  *
  * The value of @c desc should be a string constant that will not
  * change (at least until the description is changed in a subsequent
  * call to device_set_desc() or device_set_desc_copy()).
  */
 void
 device_set_desc(device_t dev, const char* desc)
 {
 	device_set_desc_internal(dev, desc, FALSE);
 }
 
 /**
  * @brief Set the device's description
  *
  * The string pointed to by @c desc is copied. Use this function if
  * the device description is generated (e.g. with sprintf()).
  */
 void
 device_set_desc_copy(device_t dev, const char* desc)
 {
 	device_set_desc_internal(dev, desc, TRUE);
 }
 
 /**
  * @brief Set the device's flags
  */
 void
 device_set_flags(device_t dev, uint32_t flags)
 {
 	dev->devflags = flags;
 }
 
 /**
  * @brief Return the device's softc field
  *
  * The softc is allocated and zeroed when a driver is attached, based
  * on the size field of the driver.
  */
 void *
 device_get_softc(device_t dev)
 {
 	return (dev->softc);
 }
 
 /**
  * @brief Set the device's softc field
  *
  * Most drivers do not need to use this since the softc is allocated
  * automatically when the driver is attached.
  */
 void
 device_set_softc(device_t dev, void *softc)
 {
 	if (dev->softc && !(dev->flags & DF_EXTERNALSOFTC))
 		free(dev->softc, M_BUS_SC);
 	dev->softc = softc;
 	if (dev->softc)
 		dev->flags |= DF_EXTERNALSOFTC;
 	else
 		dev->flags &= ~DF_EXTERNALSOFTC;
 }
 
 /**
  * @brief Free claimed softc
  *
  * Most drivers do not need to use this since the softc is freed
  * automatically when the driver is detached.
  */
 void
 device_free_softc(void *softc)
 {
 	free(softc, M_BUS_SC);
 }
 
 /**
  * @brief Claim softc
  *
  * This function can be used to let the driver free the automatically
  * allocated softc using "device_free_softc()". This function is
  * useful when the driver is refcounting the softc and the softc
  * cannot be freed when the "device_detach" method is called.
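  *
  * A hedged sketch of how a refcounting driver might use this; the
  * foo_softc layout, foo_rele and foo_detach are hypothetical, and
  * locking of the reference count is omitted:
  *
  * @code
  * struct foo_softc {
  *	u_int	refs;
  * };
  *
  * static void
  * foo_rele(struct foo_softc *sc)
  * {
  *	if (--sc->refs == 0)
  *		device_free_softc(sc);
  * }
  *
  * static int
  * foo_detach(device_t dev)
  * {
  *	struct foo_softc *sc = device_get_softc(dev);
  *
  *	device_claim_softc(dev);	// softc is no longer freed on detach
  *	foo_rele(sc);			// frees it now or when the last ref drops
  *	return (0);
  * }
  * @endcode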
  */
 void
 device_claim_softc(device_t dev)
 {
 	if (dev->softc)
 		dev->flags |= DF_EXTERNALSOFTC;
 	else
 		dev->flags &= ~DF_EXTERNALSOFTC;
 }
 
 /**
  * @brief Get the device's ivars field
  *
  * The ivars field is used by the parent device to store per-device
  * state (e.g. the physical location of the device or a list of
  * resources).
  */
 void *
 device_get_ivars(device_t dev)
 {
 
 	KASSERT(dev != NULL, ("device_get_ivars(NULL, ...)"));
 	return (dev->ivars);
 }
 
 /**
  * @brief Set the device's ivars field
  */
 void
 device_set_ivars(device_t dev, void * ivars)
 {
 
 	KASSERT(dev != NULL, ("device_set_ivars(NULL, ...)"));
 	dev->ivars = ivars;
 }
 
 /**
  * @brief Return the device's state
  */
 device_state_t
 device_get_state(device_t dev)
 {
 	return (dev->state);
 }
 
 /**
  * @brief Set the DF_ENABLED flag for the device
  */
 void
 device_enable(device_t dev)
 {
 	dev->flags |= DF_ENABLED;
 }
 
 /**
  * @brief Clear the DF_ENABLED flag for the device
  */
 void
 device_disable(device_t dev)
 {
 	dev->flags &= ~DF_ENABLED;
 }
 
 /**
  * @brief Increment the busy counter for the device
  */
 void
 device_busy(device_t dev)
 {
 	if (dev->state < DS_ATTACHING)
 		panic("device_busy: called for unattached device");
 	if (dev->busy == 0 && dev->parent)
 		device_busy(dev->parent);
 	dev->busy++;
 	if (dev->state == DS_ATTACHED)
 		dev->state = DS_BUSY;
 }
 
 /**
  * @brief Decrement the busy counter for the device
  */
 void
 device_unbusy(device_t dev)
 {
 	if (dev->busy != 0 && dev->state != DS_BUSY &&
 	    dev->state != DS_ATTACHING)
 		panic("device_unbusy: called for non-busy device %s",
 		    device_get_nameunit(dev));
 	dev->busy--;
 	if (dev->busy == 0) {
 		if (dev->parent)
 			device_unbusy(dev->parent);
 		if (dev->state == DS_BUSY)
 			dev->state = DS_ATTACHED;
 	}
 }
 
 /**
  * @brief Set the DF_QUIET flag for the device
  */
 void
 device_quiet(device_t dev)
 {
 	dev->flags |= DF_QUIET;
 }
 
 /**
  * @brief Clear the DF_QUIET flag for the device
  */
 void
 device_verbose(device_t dev)
 {
 	dev->flags &= ~DF_QUIET;
 }
 
 /**
  * @brief Return non-zero if the DF_QUIET flag is set on the device
  */
 int
 device_is_quiet(device_t dev)
 {
 	return ((dev->flags & DF_QUIET) != 0);
 }
 
 /**
  * @brief Return non-zero if the DF_ENABLED flag is set on the device
  */
 int
 device_is_enabled(device_t dev)
 {
 	return ((dev->flags & DF_ENABLED) != 0);
 }
 
 /**
  * @brief Return non-zero if the device was successfully probed
  */
 int
 device_is_alive(device_t dev)
 {
 	return (dev->state >= DS_ALIVE);
 }
 
 /**
  * @brief Return non-zero if the device currently has a driver
  * attached to it
  */
 int
 device_is_attached(device_t dev)
 {
 	return (dev->state >= DS_ATTACHED);
 }
 
 /**
  * @brief Return non-zero if the device is currently suspended.
  */
 int
 device_is_suspended(device_t dev)
 {
 	return ((dev->flags & DF_SUSPENDED) != 0);
 }
 
 /**
  * @brief Set the devclass of a device
  * @see devclass_add_device().
  */
 int
 device_set_devclass(device_t dev, const char *classname)
 {
 	devclass_t dc;
 	int error;
 
 	if (!classname) {
 		if (dev->devclass)
 			devclass_delete_device(dev->devclass, dev);
 		return (0);
 	}
 
 	if (dev->devclass) {
 		printf("device_set_devclass: device class already set\n");
 		return (EINVAL);
 	}
 
 	dc = devclass_find_internal(classname, NULL, TRUE);
 	if (!dc)
 		return (ENOMEM);
 
 	error = devclass_add_device(dc, dev);
 
 	bus_data_generation_update();
 	return (error);
 }
 
 /**
  * @brief Set the devclass of a device and mark the devclass fixed.
  * @see device_set_devclass()
  */
 int
 device_set_devclass_fixed(device_t dev, const char *classname)
 {
 	int error;
 
 	if (classname == NULL)
 		return (EINVAL);
 
 	error = device_set_devclass(dev, classname);
 	if (error)
 		return (error);
 	dev->flags |= DF_FIXEDCLASS;
 	return (0);
 }
 
 /**
  * @brief Set the driver of a device
  *
  * @retval 0		success
  * @retval EBUSY	the device already has a driver attached
  * @retval ENOMEM	a memory allocation failure occurred
  */
 int
 device_set_driver(device_t dev, driver_t *driver)
 {
 	if (dev->state >= DS_ATTACHED)
 		return (EBUSY);
 
 	if (dev->driver == driver)
 		return (0);
 
 	if (dev->softc && !(dev->flags & DF_EXTERNALSOFTC)) {
 		free(dev->softc, M_BUS_SC);
 		dev->softc = NULL;
 	}
 	device_set_desc(dev, NULL);
 	kobj_delete((kobj_t) dev, NULL);
 	dev->driver = driver;
 	if (driver) {
 		kobj_init((kobj_t) dev, (kobj_class_t) driver);
 		if (!(dev->flags & DF_EXTERNALSOFTC) && driver->size > 0) {
 			dev->softc = malloc(driver->size, M_BUS_SC,
 			    M_NOWAIT | M_ZERO);
 			if (!dev->softc) {
 				kobj_delete((kobj_t) dev, NULL);
 				kobj_init((kobj_t) dev, &null_class);
 				dev->driver = NULL;
 				return (ENOMEM);
 			}
 		}
 	} else {
 		kobj_init((kobj_t) dev, &null_class);
 	}
 
 	bus_data_generation_update();
 	return (0);
 }
 
 /**
  * @brief Probe a device and return the probe status.
  *
  * This function is the core of the device autoconfiguration
  * system. Its purpose is to select a suitable driver for a device and
  * then call that driver to initialise the hardware appropriately. The
  * driver is selected by calling the DEVICE_PROBE() method of a set of
  * candidate drivers and then choosing the driver which returned the
  * best value. This driver is then attached to the device using
  * device_attach().
  *
  * The set of suitable drivers is taken from the list of drivers in
  * the parent device's devclass. If the device was originally created
  * with a specific class name (see device_add_child()), only drivers
  * with that name are probed, otherwise all drivers in the devclass
  * are probed. If no drivers return successful probe values in the
  * parent devclass, the search continues in the parent of that
  * devclass (see devclass_get_parent()) if any.
  *
  * @param dev		the device to initialise
  *
  * @retval 0		success
  * @retval ENXIO	no driver was found
  * @retval ENOMEM	memory allocation failure
  * @retval non-zero	some other unix error code
  * @retval -1		Device already attached
  */
 int
 device_probe(device_t dev)
 {
 	int error;
 
 	GIANT_REQUIRED;
 
 	if (dev->state >= DS_ALIVE && (dev->flags & DF_REBID) == 0)
 		return (-1);
 
 	if (!(dev->flags & DF_ENABLED)) {
 		if (bootverbose && device_get_name(dev) != NULL) {
 			device_print_prettyname(dev);
 			printf("not probed (disabled)\n");
 		}
 		return (-1);
 	}
 	if ((error = device_probe_child(dev->parent, dev)) != 0) {
 		if (bus_current_pass == BUS_PASS_DEFAULT &&
 		    !(dev->flags & DF_DONENOMATCH)) {
 			BUS_PROBE_NOMATCH(dev->parent, dev);
 			devnomatch(dev);
 			dev->flags |= DF_DONENOMATCH;
 		}
 		return (error);
 	}
 	return (0);
 }
 
 /**
  * @brief Probe a device and attach a driver if possible
  *
  * Calls device_probe() and, if the probe succeeds, device_attach().
  */
 int
 device_probe_and_attach(device_t dev)
 {
 	int error;
 
 	GIANT_REQUIRED;
 
 	error = device_probe(dev);
 	if (error == -1)
 		return (0);
 	else if (error != 0)
 		return (error);
 
 	CURVNET_SET_QUIET(vnet0);
 	error = device_attach(dev);
 	CURVNET_RESTORE();
 	return (error);
 }
 
 /**
  * @brief Attach a device driver to a device
  *
  * This function is a wrapper around the DEVICE_ATTACH() driver
  * method. In addition to calling DEVICE_ATTACH(), it initialises the
  * device's sysctl tree, optionally prints a description of the device
  * and queues a notification event for user-based device management
  * services.
  *
  * Normally this function is only called internally from
  * device_probe_and_attach().
  *
  * @param dev		the device to initialise
  *
  * @retval 0		success
  * @retval ENXIO	no driver was found
  * @retval ENOMEM	memory allocation failure
  * @retval non-zero	some other unix error code
  */
 int
 device_attach(device_t dev)
 {
 	uint64_t attachtime;
 	int error;
 
 	if (resource_disabled(dev->driver->name, dev->unit)) {
 		device_disable(dev);
 		if (bootverbose)
 			device_printf(dev, "disabled via hints entry\n");
 		return (ENXIO);
 	}
 
 	device_sysctl_init(dev);
 	if (!device_is_quiet(dev))
 		device_print_child(dev->parent, dev);
 	attachtime = get_cyclecount();
 	dev->state = DS_ATTACHING;
 	if ((error = DEVICE_ATTACH(dev)) != 0) {
 		printf("device_attach: %s%d attach returned %d\n",
 		    dev->driver->name, dev->unit, error);
 		if (!(dev->flags & DF_FIXEDCLASS))
 			devclass_delete_device(dev->devclass, dev);
 		(void)device_set_driver(dev, NULL);
 		device_sysctl_fini(dev);
 		KASSERT(dev->busy == 0, ("attach failed but busy"));
 		dev->state = DS_NOTPRESENT;
 		return (error);
 	}
 	attachtime = get_cyclecount() - attachtime;
 	/*
 	 * 4 bits per device is a reasonable value for desktop and server
 	 * hardware with good get_cyclecount() implementations, but may
 	 * need to be adjusted on other platforms.
 	 */
 #ifdef RANDOM_DEBUG
 	printf("random: %s(): feeding %d bit(s) of entropy from %s%d\n",
 	    __func__, 4, dev->driver->name, dev->unit);
 #endif
 	random_harvest(&attachtime, sizeof(attachtime), 4, RANDOM_ATTACH);
 	device_sysctl_update(dev);
 	if (dev->busy)
 		dev->state = DS_BUSY;
 	else
 		dev->state = DS_ATTACHED;
 	dev->flags &= ~DF_DONENOMATCH;
 	devadded(dev);
 	return (0);
 }
 
 /**
  * @brief Detach a driver from a device
  *
  * This function is a wrapper around the DEVICE_DETACH() driver
  * method. If the call to DEVICE_DETACH() succeeds, it calls
  * BUS_CHILD_DETACHED() for the parent of @p dev, queues a
  * notification event for user-based device management services and
  * cleans up the device's sysctl tree.
  *
  * @param dev		the device to un-initialise
  *
  * @retval 0		success
  * @retval ENXIO	no driver was found
  * @retval ENOMEM	memory allocation failure
  * @retval non-zero	some other unix error code
  */
 int
 device_detach(device_t dev)
 {
 	int error;
 
 	GIANT_REQUIRED;
 
 	PDEBUG(("%s", DEVICENAME(dev)));
 	if (dev->state == DS_BUSY)
 		return (EBUSY);
 	if (dev->state != DS_ATTACHED)
 		return (0);
 
 	if ((error = DEVICE_DETACH(dev)) != 0)
 		return (error);
 	devremoved(dev);
 	if (!device_is_quiet(dev))
 		device_printf(dev, "detached\n");
 	if (dev->parent)
 		BUS_CHILD_DETACHED(dev->parent, dev);
 
 	if (!(dev->flags & DF_FIXEDCLASS))
 		devclass_delete_device(dev->devclass, dev);
 
 	dev->state = DS_NOTPRESENT;
 	(void)device_set_driver(dev, NULL);
 	device_sysctl_fini(dev);
 
 	return (0);
 }
 
 /**
  * @brief Tells a driver to quiesce itself.
  *
  * This function is a wrapper around the DEVICE_QUIESCE() driver
  * method. It returns EBUSY if the device is busy and does nothing if
  * the device is not attached.
  *
  * @param dev		the device to quiesce
  *
  * @retval 0		success, or the device is not attached
  * @retval EBUSY	the device is busy
  * @retval non-zero	some other unix error code returned by
  *			DEVICE_QUIESCE()
  */
 int
 device_quiesce(device_t dev)
 {
 
 	PDEBUG(("%s", DEVICENAME(dev)));
 	if (dev->state == DS_BUSY)
 		return (EBUSY);
 	if (dev->state != DS_ATTACHED)
 		return (0);
 
 	return (DEVICE_QUIESCE(dev));
 }
 
 /**
  * @brief Notify a device of system shutdown
  *
  * This function calls the DEVICE_SHUTDOWN() driver method if the
  * device currently has an attached driver.
  *
  * @returns the value returned by DEVICE_SHUTDOWN()
  */
 int
 device_shutdown(device_t dev)
 {
 	if (dev->state < DS_ATTACHED)
 		return (0);
 	return (DEVICE_SHUTDOWN(dev));
 }
 
 /**
  * @brief Set the unit number of a device
  *
  * This function can be used to override the unit number used for a
  * device (e.g. to wire a device to a pre-configured unit number).
  */
 int
 device_set_unit(device_t dev, int unit)
 {
 	devclass_t dc;
 	int err;
 
 	dc = device_get_devclass(dev);
 	if (unit < dc->maxunit && dc->devices[unit])
 		return (EBUSY);
 	err = devclass_delete_device(dc, dev);
 	if (err)
 		return (err);
 	dev->unit = unit;
 	err = devclass_add_device(dc, dev);
 	if (err)
 		return (err);
 
 	bus_data_generation_update();
 	return (0);
 }
 
 /*======================================*/
 /*
  * Some useful method implementations to make life easier for bus drivers.
  */
 
 /**
  * @brief Initialise a resource list.
  *
  * @param rl		the resource list to initialise
  */
 void
 resource_list_init(struct resource_list *rl)
 {
 	STAILQ_INIT(rl);
 }
 
 /**
  * @brief Reclaim memory used by a resource list.
  *
  * This function frees the memory for all resource entries on the list
  * (if any).
  *
  * @param rl		the resource list to free
  */
 void
 resource_list_free(struct resource_list *rl)
 {
 	struct resource_list_entry *rle;
 
 	while ((rle = STAILQ_FIRST(rl)) != NULL) {
 		if (rle->res)
 			panic("resource_list_free: resource entry is busy");
 		STAILQ_REMOVE_HEAD(rl, link);
 		free(rle, M_BUS);
 	}
 }
 
 /**
  * @brief Add a resource entry.
  *
  * This function adds a resource entry using the given @p type, @p
  * start, @p end and @p count values. A rid value is chosen by
  * searching sequentially for the first unused rid starting at zero.
  *
  * @param rl		the resource list to edit
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param start		the start address of the resource
  * @param end		the end address of the resource
  * @param count		XXX end-start+1
  */
 int
 resource_list_add_next(struct resource_list *rl, int type, u_long start,
     u_long end, u_long count)
 {
 	int rid;
 
 	rid = 0;
 	while (resource_list_find(rl, type, rid) != NULL)
 		rid++;
 	resource_list_add(rl, type, rid, start, end, count);
 	return (rid);
 }
 
 /**
  * @brief Add or modify a resource entry.
  *
  * If an entry already exists with the same type and rid, it will be
  * modified using the given values of @p start, @p end and @p
  * count. If no entry exists, a new one will be created using the
  * given values.  The resource list entry that matches is then returned.
  *
  * @param rl		the resource list to edit
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param rid		the resource identifier
  * @param start		the start address of the resource
  * @param end		the end address of the resource
  * @param count		XXX end-start+1
  */
 struct resource_list_entry *
 resource_list_add(struct resource_list *rl, int type, int rid,
     u_long start, u_long end, u_long count)
 {
 	struct resource_list_entry *rle;
 
 	rle = resource_list_find(rl, type, rid);
 	if (!rle) {
 		rle = malloc(sizeof(struct resource_list_entry), M_BUS,
 		    M_NOWAIT);
 		if (!rle)
 			panic("resource_list_add: can't record entry");
 		STAILQ_INSERT_TAIL(rl, rle, link);
 		rle->type = type;
 		rle->rid = rid;
 		rle->res = NULL;
 		rle->flags = 0;
 	}
 
 	if (rle->res)
 		panic("resource_list_add: resource entry is busy");
 
 	rle->start = start;
 	rle->end = end;
 	rle->count = count;
 	return (rle);
 }
 
 /**
  * @brief Determine if a resource entry is busy.
  *
  * Returns true if a resource entry is busy meaning that it has an
  * associated resource that is not an unallocated "reserved" resource.
  *
  * @param rl		the resource list to search
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param rid		the resource identifier
  *
  * @returns Non-zero if the entry is busy, zero otherwise.
  */
 int
 resource_list_busy(struct resource_list *rl, int type, int rid)
 {
 	struct resource_list_entry *rle;
 
 	rle = resource_list_find(rl, type, rid);
 	if (rle == NULL || rle->res == NULL)
 		return (0);
 	if ((rle->flags & (RLE_RESERVED | RLE_ALLOCATED)) == RLE_RESERVED) {
 		KASSERT(!(rman_get_flags(rle->res) & RF_ACTIVE),
 		    ("reserved resource is active"));
 		return (0);
 	}
 	return (1);
 }
 
 /**
  * @brief Determine if a resource entry is reserved.
  *
  * Returns true if a resource entry is reserved meaning that it has an
  * associated "reserved" resource.  The resource can either be
  * allocated or unallocated.
  *
  * @param rl		the resource list to search
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param rid		the resource identifier
  *
  * @returns Non-zero if the entry is reserved, zero otherwise.
  */
 int
 resource_list_reserved(struct resource_list *rl, int type, int rid)
 {
 	struct resource_list_entry *rle;
 
 	rle = resource_list_find(rl, type, rid);
 	if (rle != NULL && rle->flags & RLE_RESERVED)
 		return (1);
 	return (0);
 }
 
 /**
  * @brief Find a resource entry by type and rid.
  *
  * @param rl		the resource list to search
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param rid		the resource identifier
  *
  * @returns the resource entry pointer or NULL if there is no such
  * entry.
  */
 struct resource_list_entry *
 resource_list_find(struct resource_list *rl, int type, int rid)
 {
 	struct resource_list_entry *rle;
 
 	STAILQ_FOREACH(rle, rl, link) {
 		if (rle->type == type && rle->rid == rid)
 			return (rle);
 	}
 	return (NULL);
 }
 
 /**
  * @brief Delete a resource entry.
  *
  * @param rl		the resource list to edit
  * @param type		the resource entry type (e.g. SYS_RES_MEMORY)
  * @param rid		the resource identifier
  */
 void
 resource_list_delete(struct resource_list *rl, int type, int rid)
 {
 	struct resource_list_entry *rle = resource_list_find(rl, type, rid);
 
 	if (rle) {
 		if (rle->res != NULL)
 			panic("resource_list_delete: resource has not been released");
 		STAILQ_REMOVE(rl, rle, resource_list_entry, link);
 		free(rle, M_BUS);
 	}
 }
 
 /**
  * @brief Allocate a reserved resource
  *
  * This can be used by busses to force the allocation of resources
  * that are always active in the system even if they are not allocated
  * by a driver (e.g. PCI BARs).  This function is usually called when
  * adding a new child to the bus.  The resource is allocated from the
  * parent bus when it is reserved.  The resource list entry is marked
  * with RLE_RESERVED to note that it is a reserved resource.
  *
  * Subsequent attempts to allocate the resource with
  * resource_list_alloc() will succeed the first time and will set
  * RLE_ALLOCATED to note that it has been allocated.  When a reserved
  * resource that has been allocated is released with
  * resource_list_release() the RLE_ALLOCATED flag is cleared, but
  * the actual resource remains allocated.  The resource can be released to
  * the parent bus by calling resource_list_unreserve().
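  *
  * The life cycle described above, as a hedged sketch.  It assumes an
  * entry for (SYS_RES_MEMORY, rid 0) was already added to @p rl with
  * resource_list_add(); error handling is omitted:
  *
  * @code
  * int rid = 0;
  * struct resource *res;
  *
  * // Reserve: allocates from the parent bus and sets RLE_RESERVED.
  * res = resource_list_reserve(rl, bus, child, SYS_RES_MEMORY, &rid,
  *     0UL, ~0UL, 1, 0);
  * // First allocation by the child: sets RLE_ALLOCATED.
  * res = resource_list_alloc(rl, bus, child, SYS_RES_MEMORY, &rid,
  *     0UL, ~0UL, 1, RF_ACTIVE);
  * // Release: clears RLE_ALLOCATED; the reservation stays in place.
  * resource_list_release(rl, bus, child, SYS_RES_MEMORY, rid, res);
  * // Unreserve: hands the resource back to the parent bus.
  * resource_list_unreserve(rl, bus, child, SYS_RES_MEMORY, rid);
  * @endcode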
  *
  * @param rl		the resource list to allocate from
  * @param bus		the parent device of @p child
  * @param child		the device for which the resource is being reserved
  * @param type		the type of resource to allocate
  * @param rid		a pointer to the resource identifier
  * @param start		hint at the start of the resource range - pass
  *			@c 0UL for any start address
  * @param end		hint at the end of the resource range - pass
  *			@c ~0UL for any end address
  * @param count		hint at the size of range required - pass @c 1
  *			for any size
  * @param flags		any extra flags to control the resource
  *			allocation - see @c RF_XXX flags in
  *			<sys/rman.h> for details
  *
  * @returns		the resource which was allocated or @c NULL if no
  *			resource could be allocated
  */
 struct resource *
 resource_list_reserve(struct resource_list *rl, device_t bus, device_t child,
     int type, int *rid, u_long start, u_long end, u_long count, u_int flags)
 {
 	struct resource_list_entry *rle = NULL;
 	int passthrough = (device_get_parent(child) != bus);
 	struct resource *r;
 
 	if (passthrough)
 		panic(
     "resource_list_reserve() should only be called for direct children");
 	if (flags & RF_ACTIVE)
 		panic(
     "resource_list_reserve() should only reserve inactive resources");
 
 	r = resource_list_alloc(rl, bus, child, type, rid, start, end, count,
 	    flags);
 	if (r != NULL) {
 		rle = resource_list_find(rl, type, *rid);
 		rle->flags |= RLE_RESERVED;
 	}
 	return (r);
 }
 
 /**
  * @brief Helper function for implementing BUS_ALLOC_RESOURCE()
  *
  * Implement BUS_ALLOC_RESOURCE() by looking up a resource from the list
  * and passing the allocation up to the parent of @p bus. This assumes
  * that the first entry of @c device_get_ivars(child) is a struct
  * resource_list. This also handles 'passthrough' allocations where a
  * child is a remote descendant of bus by passing the allocation up to
  * the parent of bus.
  *
  * Typically, a bus driver would store a list of child resources
  * somewhere in the child device's ivars (see device_get_ivars()) and
  * its implementation of BUS_ALLOC_RESOURCE() would find that list and
  * then call resource_list_alloc() to perform the allocation.
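  *
  * A minimal sketch of such a BUS_ALLOC_RESOURCE() implementation; the
  * foo_ivars structure and foo_alloc_resource are hypothetical names
  * used only for illustration:
  *
  * @code
  * struct foo_ivars {
  *	struct resource_list	rl;
  * };
  *
  * static struct resource *
  * foo_alloc_resource(device_t bus, device_t child, int type, int *rid,
  *     u_long start, u_long end, u_long count, u_int flags)
  * {
  *	struct foo_ivars *iv;
  *
  *	// Only direct children carry foo_ivars; pass others upwards.
  *	if (device_get_parent(child) != bus)
  *		return (BUS_ALLOC_RESOURCE(device_get_parent(bus), child,
  *		    type, rid, start, end, count, flags));
  *	iv = device_get_ivars(child);
  *	return (resource_list_alloc(&iv->rl, bus, child, type, rid,
  *	    start, end, count, flags));
  * }
  * @endcode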
  *
  * @param rl		the resource list to allocate from
  * @param bus		the parent device of @p child
  * @param child		the device which is requesting an allocation
  * @param type		the type of resource to allocate
  * @param rid		a pointer to the resource identifier
  * @param start		hint at the start of the resource range - pass
  *			@c 0UL for any start address
  * @param end		hint at the end of the resource range - pass
  *			@c ~0UL for any end address
  * @param count		hint at the size of range required - pass @c 1
  *			for any size
  * @param flags		any extra flags to control the resource
  *			allocation - see @c RF_XXX flags in
  *			<sys/rman.h> for details
  *
  * @returns		the resource which was allocated or @c NULL if no
  *			resource could be allocated
  */
 struct resource *
 resource_list_alloc(struct resource_list *rl, device_t bus, device_t child,
     int type, int *rid, u_long start, u_long end, u_long count, u_int flags)
 {
 	struct resource_list_entry *rle = NULL;
 	int passthrough = (device_get_parent(child) != bus);
 	int isdefault = (start == 0UL && end == ~0UL);
 
 	if (passthrough) {
 		return (BUS_ALLOC_RESOURCE(device_get_parent(bus), child,
 		    type, rid, start, end, count, flags));
 	}
 
 	rle = resource_list_find(rl, type, *rid);
 
 	if (!rle)
 		return (NULL);		/* no resource of that type/rid */
 
 	if (rle->res) {
 		if (rle->flags & RLE_RESERVED) {
 			if (rle->flags & RLE_ALLOCATED)
 				return (NULL);
 			if ((flags & RF_ACTIVE) &&
 			    bus_activate_resource(child, type, *rid,
 			    rle->res) != 0)
 				return (NULL);
 			rle->flags |= RLE_ALLOCATED;
 			return (rle->res);
 		}
 		device_printf(bus,
 		    "resource entry %#x type %d for child %s is busy\n", *rid,
 		    type, device_get_nameunit(child));
 		return (NULL);
 	}
 
 	if (isdefault) {
 		start = rle->start;
 		count = ulmax(count, rle->count);
 		end = ulmax(rle->end, start + count - 1);
 	}
 
 	rle->res = BUS_ALLOC_RESOURCE(device_get_parent(bus), child,
 	    type, rid, start, end, count, flags);
 
 	/*
 	 * Record the new range.
 	 */
 	if (rle->res) {
 		rle->start = rman_get_start(rle->res);
 		rle->end = rman_get_end(rle->res);
 		rle->count = count;
 	}
 
 	return (rle->res);
 }
 
 /**
  * @brief Helper function for implementing BUS_RELEASE_RESOURCE()
  *
  * Implement BUS_RELEASE_RESOURCE() using a resource list. Normally
  * used with resource_list_alloc().
  *
  * @param rl		the resource list which was allocated from
  * @param bus		the parent device of @p child
  * @param child		the device which is requesting a release
  * @param type		the type of resource to release
  * @param rid		the resource identifier
  * @param res		the resource to release
  *
  * @retval 0		success
  * @retval non-zero	a standard unix error code indicating what
  *			error condition prevented the operation
  */
 int
 resource_list_release(struct resource_list *rl, device_t bus, device_t child,
     int type, int rid, struct resource *res)
 {
 	struct resource_list_entry *rle = NULL;
 	int passthrough = (device_get_parent(child) != bus);
 	int error;
 
 	if (passthrough) {
 		return (BUS_RELEASE_RESOURCE(device_get_parent(bus), child,
 		    type, rid, res));
 	}
 
 	rle = resource_list_find(rl, type, rid);
 
 	if (!rle)
 		panic("resource_list_release: can't find resource");
 	if (!rle->res)
 		panic("resource_list_release: resource entry is not busy");
 	if (rle->flags & RLE_RESERVED) {
 		if (rle->flags & RLE_ALLOCATED) {
 			if (rman_get_flags(res) & RF_ACTIVE) {
 				error = bus_deactivate_resource(child, type,
 				    rid, res);
 				if (error)
 					return (error);
 			}
 			rle->flags &= ~RLE_ALLOCATED;
 			return (0);
 		}
 		return (EINVAL);
 	}
 
 	error = BUS_RELEASE_RESOURCE(device_get_parent(bus), child,
 	    type, rid, res);
 	if (error)
 		return (error);
 
 	rle->res = NULL;
 	return (0);
 }
 
 /**
  * @brief Release all active resources of a given type
  *
  * Release all active resources of a specified type.  This is intended
  * to be used to clean up resources leaked by a driver after detach or
  * a failed attach.
  *
  * @param rl		the resource list which was allocated from
  * @param bus		the parent device of @p child
  * @param child		the device whose active resources are being released
  * @param type		the type of resources to release
  *
  * @retval 0		success
  * @retval EBUSY	at least one resource was active
  */
 int
 resource_list_release_active(struct resource_list *rl, device_t bus,
     device_t child, int type)
 {
 	struct resource_list_entry *rle;
 	int error, retval;
 
 	retval = 0;
 	STAILQ_FOREACH(rle, rl, link) {
 		if (rle->type != type)
 			continue;
 		if (rle->res == NULL)
 			continue;
 		if ((rle->flags & (RLE_RESERVED | RLE_ALLOCATED)) ==
 		    RLE_RESERVED)
 			continue;
 		retval = EBUSY;
 		error = resource_list_release(rl, bus, child, type,
 		    rman_get_rid(rle->res), rle->res);
 		if (error != 0)
 			device_printf(bus,
 			    "Failed to release active resource: %d\n", error);
 	}
 	return (retval);
 }
 
 
 /**
  * @brief Fully release a reserved resource
  *
  * Fully releases a resource reserved via resource_list_reserve().
  *
  * @param rl		the resource list which was allocated from
  * @param bus		the parent device of @p child
  * @param child		the device whose reserved resource is being released
  * @param type		the type of resource to release
  * @param rid		the resource identifier
  *
  * @retval 0		success
  * @retval non-zero	a standard unix error code indicating what
  *			error condition prevented the operation
  */
 int
 resource_list_unreserve(struct resource_list *rl, device_t bus, device_t child,
     int type, int rid)
 {
 	struct resource_list_entry *rle = NULL;
 	int passthrough = (device_get_parent(child) != bus);
 
 	if (passthrough)
 		panic(
     "resource_list_unreserve() should only be called for direct children");
 
 	rle = resource_list_find(rl, type, rid);
 
 	if (!rle)
 		panic("resource_list_unreserve: can't find resource");
 	if (!(rle->flags & RLE_RESERVED))
 		return (EINVAL);
 	if (rle->flags & RLE_ALLOCATED)
 		return (EBUSY);
 	rle->flags &= ~RLE_RESERVED;
 	return (resource_list_release(rl, bus, child, type, rid, rle->res));
 }
 
 /**
  * @brief Print a description of resources in a resource list
  *
  * Print all resources of a specified type, for use in BUS_PRINT_CHILD().
  * The name is printed if at least one resource of the given type is available.
  * The format is used to print resource start and end.
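  *
  * A sketch of the usual idiom inside a bus's BUS_PRINT_CHILD()
  * implementation; the "mem" and "irq" labels and the format strings are
  * illustrative choices, and rl stands for the child's resource list:
  *
  * @code
  * retval += resource_list_print_type(rl, "mem", SYS_RES_MEMORY, "%#lx");
  * retval += resource_list_print_type(rl, "irq", SYS_RES_IRQ, "%ld");
  * @endcode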
  *
  * @param rl		the resource list to print
  * @param name		the name of @p type, e.g. @c "memory"
  * @param type		the type of resource entry to print
  * @param format	printf(9) format string to print resource
  *			start and end values
  *
  * @returns		the number of characters printed
  */
 int
 resource_list_print_type(struct resource_list *rl, const char *name, int type,
     const char *format)
 {
 	struct resource_list_entry *rle;
 	int printed, retval;
 
 	printed = 0;
 	retval = 0;
 	/* Yes, this is kinda cheating */
 	STAILQ_FOREACH(rle, rl, link) {
 		if (rle->type == type) {
 			if (printed == 0)
 				retval += printf(" %s ", name);
 			else
 				retval += printf(",");
 			printed++;
 			retval += printf(format, rle->start);
 			if (rle->count > 1) {
 				retval += printf("-");
 				retval += printf(format, rle->start +
 						 rle->count - 1);
 			}
 		}
 	}
 	return (retval);
 }
 
 /**
  * @brief Releases all the resources in a list.
  *
  * @param rl		The resource list to purge.
  *
  * @returns		nothing
  */
 void
 resource_list_purge(struct resource_list *rl)
 {
 	struct resource_list_entry *rle;
 
 	while ((rle = STAILQ_FIRST(rl)) != NULL) {
 		if (rle->res)
 			bus_release_resource(rman_get_device(rle->res),
 			    rle->type, rle->rid, rle->res);
 		STAILQ_REMOVE_HEAD(rl, link);
 		free(rle, M_BUS);
 	}
 }
 
 device_t
 bus_generic_add_child(device_t dev, u_int order, const char *name, int unit)
 {
 
 	return (device_add_child_ordered(dev, order, name, unit));
 }
 
 /**
  * @brief Helper function for implementing DEVICE_PROBE()
  *
  * This function can be used to help implement the DEVICE_PROBE() for
  * a bus (i.e. a device which has other devices attached to it). It
  * calls the DEVICE_IDENTIFY() method of each driver in the device's
  * devclass.
  */
 int
 bus_generic_probe(device_t dev)
 {
 	devclass_t dc = dev->devclass;
 	driverlink_t dl;
 
 	TAILQ_FOREACH(dl, &dc->drivers, link) {
 		/*
 		 * If this driver's pass is too high, then ignore it.
 		 * For most drivers in the default pass, this will
 		 * never be true.  For early-pass drivers they will
 		 * only call the identify routines of eligible drivers
 		 * when this routine is called.  Drivers for later
 		 * passes should have their identify routines called
 		 * on early-pass busses during BUS_NEW_PASS().
 		 */
 		if (dl->pass > bus_current_pass)
 			continue;
 		DEVICE_IDENTIFY(dl->driver, dev);
 	}
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing DEVICE_ATTACH()
  *
  * This function can be used to help implement the DEVICE_ATTACH() for
  * a bus. It calls device_probe_and_attach() for each of the device's
  * children.
  */
 int
 bus_generic_attach(device_t dev)
 {
 	device_t child;
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		device_probe_and_attach(child);
 	}
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing DEVICE_DETACH()
  *
  * This function can be used to help implement the DEVICE_DETACH() for
  * a bus. It calls device_detach() for each of the device's
  * children.
  */
 int
 bus_generic_detach(device_t dev)
 {
 	device_t child;
 	int error;
 
 	if (dev->state != DS_ATTACHED)
 		return (EBUSY);
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		if ((error = device_detach(child)) != 0)
 			return (error);
 	}
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing DEVICE_SHUTDOWN()
  *
  * This function can be used to help implement the DEVICE_SHUTDOWN()
  * for a bus. It calls device_shutdown() for each of the device's
  * children.
  */
 int
 bus_generic_shutdown(device_t dev)
 {
 	device_t child;
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		device_shutdown(child);
 	}
 
 	return (0);
 }
 
 /**
  * @brief Default function for suspending a child device.
  *
  * This function is to be used by a bus's BUS_SUSPEND_CHILD().
  */
 int
 bus_generic_suspend_child(device_t dev, device_t child)
 {
 	int	error;
 
 	error = DEVICE_SUSPEND(child);
 
 	if (error == 0)
 		child->flags |= DF_SUSPENDED;
 
 	return (error);
 }
 
 /**
  * @brief Default function for resuming a child device.
  *
  * This function is to be used by a bus's BUS_RESUME_CHILD().
  */
 int
 bus_generic_resume_child(device_t dev, device_t child)
 {
 
 	DEVICE_RESUME(child);
 	child->flags &= ~DF_SUSPENDED;
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing DEVICE_SUSPEND()
  *
  * This function can be used to help implement the DEVICE_SUSPEND()
  * for a bus. It calls BUS_SUSPEND_CHILD() for each of the device's
  * children. If any call to BUS_SUSPEND_CHILD() fails, the suspend
  * operation is aborted and any devices which were suspended are
  * resumed immediately by calling BUS_RESUME_CHILD() on them.
  */
 int
 bus_generic_suspend(device_t dev)
 {
 	int		error;
 	device_t	child, child2;
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		error = BUS_SUSPEND_CHILD(dev, child);
 		if (error) {
 			for (child2 = TAILQ_FIRST(&dev->children);
 			     child2 && child2 != child;
 			     child2 = TAILQ_NEXT(child2, link))
 				BUS_RESUME_CHILD(dev, child2);
 			return (error);
 		}
 	}
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing DEVICE_RESUME()
  *
  * This function can be used to help implement the DEVICE_RESUME() for
  * a bus. It calls BUS_RESUME_CHILD() on each of the device's children.
  */
 int
 bus_generic_resume(device_t dev)
 {
 	device_t	child;
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		BUS_RESUME_CHILD(dev, child);
 		/* if resume fails, there's nothing we can usefully do... */
 	}
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing BUS_PRINT_CHILD().
  *
  * This function prints the first part of the ASCII representation of
  * @p child, including its name, unit and description (if any - see
  * device_set_desc()).
  *
  * @returns the number of characters printed
  */
 int
 bus_print_child_header(device_t dev, device_t child)
 {
 	int	retval = 0;
 
 	if (device_get_desc(child)) {
 		retval += device_printf(child, "<%s>", device_get_desc(child));
 	} else {
 		retval += printf("%s", device_get_nameunit(child));
 	}
 
 	return (retval);
 }
 
 /**
  * @brief Helper function for implementing BUS_PRINT_CHILD().
  *
  * This function prints the last part of the ASCII representation of
  * @p child, which consists of the string @c " on " followed by the
  * name and unit of the @p dev.
  *
  * @returns the number of characters printed
  */
 int
 bus_print_child_footer(device_t dev, device_t child)
 {
 	return (printf(" on %s\n", device_get_nameunit(dev)));
 }
 
 /**
  * @brief Helper function for implementing BUS_PRINT_CHILD().
  *
  * This function prints out the VM domain for the given device.
  *
  * @returns the number of characters printed
  */
 int
 bus_print_child_domain(device_t dev, device_t child)
 {
 	int domain;
 
 	/* No domain? Don't print anything */
 	if (BUS_GET_DOMAIN(dev, child, &domain) != 0)
 		return (0);
 
 	return (printf(" numa-domain %d", domain));
 }
 
 /**
  * @brief Helper function for implementing BUS_PRINT_CHILD().
  *
  * This function simply calls bus_print_child_header(),
  * bus_print_child_domain() and bus_print_child_footer() in turn.
  *
  * @returns the number of characters printed
  */
 int
 bus_generic_print_child(device_t dev, device_t child)
 {
 	int	retval = 0;
 
 	retval += bus_print_child_header(dev, child);
 	retval += bus_print_child_domain(dev, child);
 	retval += bus_print_child_footer(dev, child);
 
 	return (retval);
 }
 
 /**
  * @brief Stub function for implementing BUS_READ_IVAR().
  *
  * @returns ENOENT
  */
 int
 bus_generic_read_ivar(device_t dev, device_t child, int index,
     uintptr_t * result)
 {
 	return (ENOENT);
 }
 
 /**
  * @brief Stub function for implementing BUS_WRITE_IVAR().
  *
  * @returns ENOENT
  */
 int
 bus_generic_write_ivar(device_t dev, device_t child, int index,
     uintptr_t value)
 {
 	return (ENOENT);
 }
 
 /**
  * @brief Stub function for implementing BUS_GET_RESOURCE_LIST().
  *
  * @returns NULL
  */
 struct resource_list *
 bus_generic_get_resource_list(device_t dev, device_t child)
 {
 	return (NULL);
 }
 
 /**
  * @brief Helper function for implementing BUS_DRIVER_ADDED().
  *
  * This implementation of BUS_DRIVER_ADDED() simply calls the driver's
  * DEVICE_IDENTIFY() method to allow it to add new children to the bus
  * and then calls device_probe_and_attach() for each unattached child.
  */
 void
 bus_generic_driver_added(device_t dev, driver_t *driver)
 {
 	device_t child;
 
 	DEVICE_IDENTIFY(driver, dev);
 	TAILQ_FOREACH(child, &dev->children, link) {
 		if (child->state == DS_NOTPRESENT ||
 		    (child->flags & DF_REBID))
 			device_probe_and_attach(child);
 	}
 }
 
 /**
  * @brief Helper function for implementing BUS_NEW_PASS().
  *
  * This implementation of BUS_NEW_PASS() first calls the identify
  * routines for any drivers that probe at the current pass.  Then it
  * walks the list of devices for this bus.  If a device is already
  * attached, then it calls BUS_NEW_PASS() on that device.  If the
  * device is not already attached, it attempts to attach a driver to
  * it.
  */
 void
 bus_generic_new_pass(device_t dev)
 {
 	driverlink_t dl;
 	devclass_t dc;
 	device_t child;
 
 	dc = dev->devclass;
 	TAILQ_FOREACH(dl, &dc->drivers, link) {
 		if (dl->pass == bus_current_pass)
 			DEVICE_IDENTIFY(dl->driver, dev);
 	}
 	TAILQ_FOREACH(child, &dev->children, link) {
 		if (child->state >= DS_ATTACHED)
 			BUS_NEW_PASS(child);
 		else if (child->state == DS_NOTPRESENT)
 			device_probe_and_attach(child);
 	}
 }
 
 /**
  * @brief Helper function for implementing BUS_SETUP_INTR().
  *
  * This simple implementation of BUS_SETUP_INTR() simply calls the
  * BUS_SETUP_INTR() method of the parent of @p dev.
  */
 int
 bus_generic_setup_intr(device_t dev, device_t child, struct resource *irq,
     int flags, driver_filter_t *filter, driver_intr_t *intr, void *arg,
     void **cookiep)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_SETUP_INTR(dev->parent, child, irq, flags,
 		    filter, intr, arg, cookiep));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_TEARDOWN_INTR().
  *
  * This simple implementation of BUS_TEARDOWN_INTR() simply calls the
  * BUS_TEARDOWN_INTR() method of the parent of @p dev.
  */
 int
 bus_generic_teardown_intr(device_t dev, device_t child, struct resource *irq,
     void *cookie)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_TEARDOWN_INTR(dev->parent, child, irq, cookie));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_ADJUST_RESOURCE().
  *
  * This simple implementation of BUS_ADJUST_RESOURCE() simply calls the
  * BUS_ADJUST_RESOURCE() method of the parent of @p dev.
  */
 int
 bus_generic_adjust_resource(device_t dev, device_t child, int type,
     struct resource *r, u_long start, u_long end)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_ADJUST_RESOURCE(dev->parent, child, type, r, start,
 		    end));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_ALLOC_RESOURCE().
  *
  * This simple implementation of BUS_ALLOC_RESOURCE() simply calls the
  * BUS_ALLOC_RESOURCE() method of the parent of @p dev.
  */
 struct resource *
 bus_generic_alloc_resource(device_t dev, device_t child, int type, int *rid,
     u_long start, u_long end, u_long count, u_int flags)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_ALLOC_RESOURCE(dev->parent, child, type, rid,
 		    start, end, count, flags));
 	return (NULL);
 }
 
 /**
  * @brief Helper function for implementing BUS_RELEASE_RESOURCE().
  *
  * This simple implementation of BUS_RELEASE_RESOURCE() simply calls the
  * BUS_RELEASE_RESOURCE() method of the parent of @p dev.
  */
 int
 bus_generic_release_resource(device_t dev, device_t child, int type, int rid,
     struct resource *r)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_RELEASE_RESOURCE(dev->parent, child, type, rid,
 		    r));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_ACTIVATE_RESOURCE().
  *
  * This simple implementation of BUS_ACTIVATE_RESOURCE() simply calls the
  * BUS_ACTIVATE_RESOURCE() method of the parent of @p dev.
  */
 int
 bus_generic_activate_resource(device_t dev, device_t child, int type, int rid,
     struct resource *r)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_ACTIVATE_RESOURCE(dev->parent, child, type, rid,
 		    r));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_DEACTIVATE_RESOURCE().
  *
  * This simple implementation of BUS_DEACTIVATE_RESOURCE() simply calls the
  * BUS_DEACTIVATE_RESOURCE() method of the parent of @p dev.
  */
 int
 bus_generic_deactivate_resource(device_t dev, device_t child, int type,
     int rid, struct resource *r)
 {
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_DEACTIVATE_RESOURCE(dev->parent, child, type, rid,
 		    r));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_BIND_INTR().
  *
  * This simple implementation of BUS_BIND_INTR() simply calls the
  * BUS_BIND_INTR() method of the parent of @p dev.
  */
 int
 bus_generic_bind_intr(device_t dev, device_t child, struct resource *irq,
     int cpu)
 {
 
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_BIND_INTR(dev->parent, child, irq, cpu));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_CONFIG_INTR().
  *
  * This simple implementation of BUS_CONFIG_INTR() simply calls the
  * BUS_CONFIG_INTR() method of the parent of @p dev.
  */
 int
 bus_generic_config_intr(device_t dev, int irq, enum intr_trigger trig,
     enum intr_polarity pol)
 {
 
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_CONFIG_INTR(dev->parent, irq, trig, pol));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_DESCRIBE_INTR().
  *
  * This simple implementation of BUS_DESCRIBE_INTR() simply calls the
  * BUS_DESCRIBE_INTR() method of the parent of @p dev.
  */
 int
 bus_generic_describe_intr(device_t dev, device_t child, struct resource *irq,
     void *cookie, const char *descr)
 {
 
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent)
 		return (BUS_DESCRIBE_INTR(dev->parent, child, irq, cookie,
 		    descr));
 	return (EINVAL);
 }
 
 /**
  * @brief Helper function for implementing BUS_GET_DMA_TAG().
  *
  * This simple implementation of BUS_GET_DMA_TAG() simply calls the
  * BUS_GET_DMA_TAG() method of the parent of @p dev.
  */
 bus_dma_tag_t
 bus_generic_get_dma_tag(device_t dev, device_t child)
 {
 
 	/* Propagate up the bus hierarchy until someone handles it. */
 	if (dev->parent != NULL)
 		return (BUS_GET_DMA_TAG(dev->parent, child));
 	return (NULL);
 }
 
 /**
  * @brief Helper function for implementing BUS_GET_RESOURCE().
  *
  * This implementation of BUS_GET_RESOURCE() uses the
  * resource_list_find() function to do most of the work. It calls
  * BUS_GET_RESOURCE_LIST() to find a suitable resource list to
  * search.
  */
 int
 bus_generic_rl_get_resource(device_t dev, device_t child, int type, int rid,
     u_long *startp, u_long *countp)
 {
 	struct resource_list *		rl = NULL;
 	struct resource_list_entry *	rle = NULL;
 
 	rl = BUS_GET_RESOURCE_LIST(dev, child);
 	if (!rl)
 		return (EINVAL);
 
 	rle = resource_list_find(rl, type, rid);
 	if (!rle)
 		return (ENOENT);
 
 	if (startp)
 		*startp = rle->start;
 	if (countp)
 		*countp = rle->count;
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing BUS_SET_RESOURCE().
  *
  * This implementation of BUS_SET_RESOURCE() uses the
  * resource_list_add() function to do most of the work. It calls
  * BUS_GET_RESOURCE_LIST() to find a suitable resource list to
  * edit.
  */
 int
 bus_generic_rl_set_resource(device_t dev, device_t child, int type, int rid,
     u_long start, u_long count)
 {
 	struct resource_list *		rl = NULL;
 
 	rl = BUS_GET_RESOURCE_LIST(dev, child);
 	if (!rl)
 		return (EINVAL);
 
 	resource_list_add(rl, type, rid, start, (start + count - 1), count);
 
 	return (0);
 }
 
 /**
  * @brief Helper function for implementing BUS_DELETE_RESOURCE().
  *
  * This implementation of BUS_DELETE_RESOURCE() uses the
  * resource_list_delete() function to do most of the work. It calls
  * BUS_GET_RESOURCE_LIST() to find a suitable resource list to
  * edit.
  */
 void
 bus_generic_rl_delete_resource(device_t dev, device_t child, int type, int rid)
 {
 	struct resource_list *		rl = NULL;
 
 	rl = BUS_GET_RESOURCE_LIST(dev, child);
 	if (!rl)
 		return;
 
 	resource_list_delete(rl, type, rid);
 
 	return;
 }
 
 /**
  * @brief Helper function for implementing BUS_RELEASE_RESOURCE().
  *
  * This implementation of BUS_RELEASE_RESOURCE() uses the
  * resource_list_release() function to do most of the work. It calls
  * BUS_GET_RESOURCE_LIST() to find a suitable resource list.
  */
 int
 bus_generic_rl_release_resource(device_t dev, device_t child, int type,
     int rid, struct resource *r)
 {
 	struct resource_list *		rl = NULL;
 
 	if (device_get_parent(child) != dev)
 		return (BUS_RELEASE_RESOURCE(device_get_parent(dev), child,
 		    type, rid, r));
 
 	rl = BUS_GET_RESOURCE_LIST(dev, child);
 	if (!rl)
 		return (EINVAL);
 
 	return (resource_list_release(rl, dev, child, type, rid, r));
 }
 
 /**
  * @brief Helper function for implementing BUS_ALLOC_RESOURCE().
  *
  * This implementation of BUS_ALLOC_RESOURCE() uses the
  * resource_list_alloc() function to do most of the work. It calls
  * BUS_GET_RESOURCE_LIST() to find a suitable resource list.
  */
 struct resource *
 bus_generic_rl_alloc_resource(device_t dev, device_t child, int type,
     int *rid, u_long start, u_long end, u_long count, u_int flags)
 {
 	struct resource_list *		rl = NULL;
 
 	if (device_get_parent(child) != dev)
 		return (BUS_ALLOC_RESOURCE(device_get_parent(dev), child,
 		    type, rid, start, end, count, flags));
 
 	rl = BUS_GET_RESOURCE_LIST(dev, child);
 	if (!rl)
 		return (NULL);
 
 	return (resource_list_alloc(rl, dev, child, type, rid,
 	    start, end, count, flags));
 }
 
 /**
  * @brief Helper function for implementing BUS_CHILD_PRESENT().
  *
  * This simple implementation of BUS_CHILD_PRESENT() simply calls the
  * BUS_CHILD_PRESENT() method of the parent of @p dev.
  */
 int
 bus_generic_child_present(device_t dev, device_t child)
 {
 	return (BUS_CHILD_PRESENT(device_get_parent(dev), dev));
 }
 
 int
 bus_generic_get_domain(device_t dev, device_t child, int *domain)
 {
 
 	if (dev->parent)
 		return (BUS_GET_DOMAIN(dev->parent, dev, domain));
 
 	return (ENOENT);
 }
 
 /*
  * Some convenience functions to make it easier for drivers to use the
  * resource-management functions.  All these really do is hide the
  * indirection through the parent's method table, making for slightly
  * less-wordy code.  In the future, it might make sense for this code
  * to maintain some sort of a list of resources allocated by each device.
  */
 
 int
 bus_alloc_resources(device_t dev, struct resource_spec *rs,
     struct resource **res)
 {
 	int i;
 
 	for (i = 0; rs[i].type != -1; i++)
 		res[i] = NULL;
 	for (i = 0; rs[i].type != -1; i++) {
 		res[i] = bus_alloc_resource_any(dev,
 		    rs[i].type, &rs[i].rid, rs[i].flags);
 		if (res[i] == NULL && !(rs[i].flags & RF_OPTIONAL)) {
 			bus_release_resources(dev, rs, res);
 			return (ENXIO);
 		}
 	}
 	return (0);
 }
 
 void
 bus_release_resources(device_t dev, const struct resource_spec *rs,
     struct resource **res)
 {
 	int i;
 
 	for (i = 0; rs[i].type != -1; i++)
 		if (res[i] != NULL) {
 			bus_release_resource(
 			    dev, rs[i].type, rs[i].rid, res[i]);
 			res[i] = NULL;
 		}
 }
 
 /**
  * @brief Wrapper function for BUS_ALLOC_RESOURCE().
  *
  * This function simply calls the BUS_ALLOC_RESOURCE() method of the
  * parent of @p dev.
  */
 struct resource *
 bus_alloc_resource(device_t dev, int type, int *rid, u_long start, u_long end,
     u_long count, u_int flags)
 {
 	if (dev->parent == NULL)
 		return (NULL);
 	return (BUS_ALLOC_RESOURCE(dev->parent, dev, type, rid, start, end,
 	    count, flags));
 }
 
 /**
  * @brief Wrapper function for BUS_ADJUST_RESOURCE().
  *
  * This function simply calls the BUS_ADJUST_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_adjust_resource(device_t dev, int type, struct resource *r, u_long start,
     u_long end)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_ADJUST_RESOURCE(dev->parent, dev, type, r, start, end));
 }
 
 /**
  * @brief Wrapper function for BUS_ACTIVATE_RESOURCE().
  *
  * This function simply calls the BUS_ACTIVATE_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_activate_resource(device_t dev, int type, int rid, struct resource *r)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_ACTIVATE_RESOURCE(dev->parent, dev, type, rid, r));
 }
 
 /**
  * @brief Wrapper function for BUS_DEACTIVATE_RESOURCE().
  *
  * This function simply calls the BUS_DEACTIVATE_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_deactivate_resource(device_t dev, int type, int rid, struct resource *r)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_DEACTIVATE_RESOURCE(dev->parent, dev, type, rid, r));
 }
 
 /**
  * @brief Wrapper function for BUS_RELEASE_RESOURCE().
  *
  * This function simply calls the BUS_RELEASE_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_release_resource(device_t dev, int type, int rid, struct resource *r)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_RELEASE_RESOURCE(dev->parent, dev, type, rid, r));
 }
 
 /**
  * @brief Wrapper function for BUS_SETUP_INTR().
  *
  * This function simply calls the BUS_SETUP_INTR() method of the
  * parent of @p dev.
  */
 int
 bus_setup_intr(device_t dev, struct resource *r, int flags,
     driver_filter_t filter, driver_intr_t handler, void *arg, void **cookiep)
 {
 	int error;
 
 	if (dev->parent == NULL)
 		return (EINVAL);
 	error = BUS_SETUP_INTR(dev->parent, dev, r, flags, filter, handler,
 	    arg, cookiep);
 	if (error != 0)
 		return (error);
 	if (handler != NULL && !(flags & INTR_MPSAFE))
 		device_printf(dev, "[GIANT-LOCKED]\n");
 	return (0);
 }
 
 /**
  * @brief Wrapper function for BUS_TEARDOWN_INTR().
  *
  * This function simply calls the BUS_TEARDOWN_INTR() method of the
  * parent of @p dev.
  */
 int
 bus_teardown_intr(device_t dev, struct resource *r, void *cookie)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_TEARDOWN_INTR(dev->parent, dev, r, cookie));
 }
 
 /**
  * @brief Wrapper function for BUS_BIND_INTR().
  *
  * This function simply calls the BUS_BIND_INTR() method of the
  * parent of @p dev.
  */
 int
 bus_bind_intr(device_t dev, struct resource *r, int cpu)
 {
 	if (dev->parent == NULL)
 		return (EINVAL);
 	return (BUS_BIND_INTR(dev->parent, dev, r, cpu));
 }
 
 /**
  * @brief Wrapper function for BUS_DESCRIBE_INTR().
  *
  * This function first formats the requested description into a
  * temporary buffer and then calls the BUS_DESCRIBE_INTR() method of
  * the parent of @p dev.
  */
 int
 bus_describe_intr(device_t dev, struct resource *irq, void *cookie,
     const char *fmt, ...)
 {
 	va_list ap;
 	char descr[MAXCOMLEN + 1];
 
 	if (dev->parent == NULL)
 		return (EINVAL);
 	va_start(ap, fmt);
 	vsnprintf(descr, sizeof(descr), fmt, ap);
 	va_end(ap);
 	return (BUS_DESCRIBE_INTR(dev->parent, dev, irq, cookie, descr));
 }
 
 /**
  * @brief Wrapper function for BUS_SET_RESOURCE().
  *
  * This function simply calls the BUS_SET_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_set_resource(device_t dev, int type, int rid,
     u_long start, u_long count)
 {
 	return (BUS_SET_RESOURCE(device_get_parent(dev), dev, type, rid,
 	    start, count));
 }
 
 /**
  * @brief Wrapper function for BUS_GET_RESOURCE().
  *
  * This function simply calls the BUS_GET_RESOURCE() method of the
  * parent of @p dev.
  */
 int
 bus_get_resource(device_t dev, int type, int rid,
     u_long *startp, u_long *countp)
 {
 	return (BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid,
 	    startp, countp));
 }
 
 /**
  * @brief Wrapper function for BUS_GET_RESOURCE().
  *
  * This function simply calls the BUS_GET_RESOURCE() method of the
  * parent of @p dev and returns the start value.
  */
 u_long
 bus_get_resource_start(device_t dev, int type, int rid)
 {
 	u_long start, count;
 	int error;
 
 	error = BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid,
 	    &start, &count);
 	if (error)
 		return (0);
 	return (start);
 }
 
 /**
  * @brief Wrapper function for BUS_GET_RESOURCE().
  *
  * This function simply calls the BUS_GET_RESOURCE() method of the
  * parent of @p dev and returns the count value.
  */
 u_long
 bus_get_resource_count(device_t dev, int type, int rid)
 {
 	u_long start, count;
 	int error;
 
 	error = BUS_GET_RESOURCE(device_get_parent(dev), dev, type, rid,
 	    &start, &count);
 	if (error)
 		return (0);
 	return (count);
 }
 
 /**
  * @brief Wrapper function for BUS_DELETE_RESOURCE().
  *
  * This function simply calls the BUS_DELETE_RESOURCE() method of the
  * parent of @p dev.
  */
 void
 bus_delete_resource(device_t dev, int type, int rid)
 {
 	BUS_DELETE_RESOURCE(device_get_parent(dev), dev, type, rid);
 }
 
 /**
  * @brief Wrapper function for BUS_CHILD_PRESENT().
  *
  * This function simply calls the BUS_CHILD_PRESENT() method of the
  * parent of @p child.
  */
 int
 bus_child_present(device_t child)
 {
 	return (BUS_CHILD_PRESENT(device_get_parent(child), child));
 }
 
 /**
  * @brief Wrapper function for BUS_CHILD_PNPINFO_STR().
  *
  * This function simply calls the BUS_CHILD_PNPINFO_STR() method of the
  * parent of @p child.
  */
 int
 bus_child_pnpinfo_str(device_t child, char *buf, size_t buflen)
 {
 	device_t parent;
 
 	parent = device_get_parent(child);
 	if (parent == NULL) {
 		*buf = '\0';
 		return (0);
 	}
 	return (BUS_CHILD_PNPINFO_STR(parent, child, buf, buflen));
 }
 
 /**
  * @brief Wrapper function for BUS_CHILD_LOCATION_STR().
  *
  * This function simply calls the BUS_CHILD_LOCATION_STR() method of the
  * parent of @p child.
  */
 int
 bus_child_location_str(device_t child, char *buf, size_t buflen)
 {
 	device_t parent;
 
 	parent = device_get_parent(child);
 	if (parent == NULL) {
 		*buf = '\0';
 		return (0);
 	}
 	return (BUS_CHILD_LOCATION_STR(parent, child, buf, buflen));
 }
 
 /**
  * @brief Wrapper function for BUS_GET_DMA_TAG().
  *
  * This function simply calls the BUS_GET_DMA_TAG() method of the
  * parent of @p dev.
  */
 bus_dma_tag_t
 bus_get_dma_tag(device_t dev)
 {
 	device_t parent;
 
 	parent = device_get_parent(dev);
 	if (parent == NULL)
 		return (NULL);
 	return (BUS_GET_DMA_TAG(parent, dev));
 }
 
 /**
  * @brief Wrapper function for BUS_GET_DOMAIN().
  *
  * This function simply calls the BUS_GET_DOMAIN() method of the
  * parent of @p dev.
  */
 int
 bus_get_domain(device_t dev, int *domain)
 {
 	return (BUS_GET_DOMAIN(device_get_parent(dev), dev, domain));
 }
 
 /* Resume all devices and then notify userland that we're up again. */
 static int
 root_resume(device_t dev)
 {
 	int error;
 
 	error = bus_generic_resume(dev);
 	if (error == 0)
 		devctl_notify("kern", "power", "resume", NULL);
 	return (error);
 }
 
 static int
 root_print_child(device_t dev, device_t child)
 {
 	int	retval = 0;
 
 	retval += bus_print_child_header(dev, child);
 	retval += printf("\n");
 
 	return (retval);
 }
 
 static int
 root_setup_intr(device_t dev, device_t child, struct resource *irq, int flags,
     driver_filter_t *filter, driver_intr_t *intr, void *arg, void **cookiep)
 {
 	/*
 	 * If an interrupt mapping gets to here something bad has happened.
 	 */
 	panic("root_setup_intr");
 }
 
 /*
  * If we get here, assume that the device is permanent and really is
  * present in the system.  Removable bus drivers are expected to intercept
  * this call long before it gets here.  We return -1 so that drivers that
  * really care can check vs -1 or some ERRNO returned higher in the food
  * chain.
  */
 static int
 root_child_present(device_t dev, device_t child)
 {
 	return (-1);
 }
 
 static kobj_method_t root_methods[] = {
 	/* Device interface */
 	KOBJMETHOD(device_shutdown,	bus_generic_shutdown),
 	KOBJMETHOD(device_suspend,	bus_generic_suspend),
 	KOBJMETHOD(device_resume,	root_resume),
 
 	/* Bus interface */
 	KOBJMETHOD(bus_print_child,	root_print_child),
 	KOBJMETHOD(bus_read_ivar,	bus_generic_read_ivar),
 	KOBJMETHOD(bus_write_ivar,	bus_generic_write_ivar),
 	KOBJMETHOD(bus_setup_intr,	root_setup_intr),
 	KOBJMETHOD(bus_child_present,	root_child_present),
 
 	KOBJMETHOD_END
 };
 
 static driver_t root_driver = {
 	"root",
 	root_methods,
 	1,			/* no softc */
 };
 
 device_t	root_bus;
 devclass_t	root_devclass;
 
 static int
 root_bus_module_handler(module_t mod, int what, void* arg)
 {
 	switch (what) {
 	case MOD_LOAD:
 		TAILQ_INIT(&bus_data_devices);
 		kobj_class_compile((kobj_class_t) &root_driver);
 		root_bus = make_device(NULL, "root", 0);
 		root_bus->desc = "System root bus";
 		kobj_init((kobj_t) root_bus, (kobj_class_t) &root_driver);
 		root_bus->driver = &root_driver;
 		root_bus->state = DS_ATTACHED;
 		root_devclass = devclass_find_internal("root", NULL, FALSE);
 		devinit();
 		return (0);
 
 	case MOD_SHUTDOWN:
 		device_shutdown(root_bus);
 		return (0);
 	default:
 		return (EOPNOTSUPP);
 	}
 
 	return (0);
 }
 
 static moduledata_t root_bus_mod = {
 	"rootbus",
 	root_bus_module_handler,
 	NULL
 };
 DECLARE_MODULE(rootbus, root_bus_mod, SI_SUB_DRIVERS, SI_ORDER_FIRST);
 
 /**
  * @brief Automatically configure devices
  *
  * This function begins the autoconfiguration process by raising the
  * bus pass level to BUS_PASS_DEFAULT with bus_set_pass().
  */
 void
 root_bus_configure(void)
 {
 
 	PDEBUG(("."));
 
 	/* Eventually this will be split up, but this is sufficient for now. */
 	bus_set_pass(BUS_PASS_DEFAULT);
 }
 
 /**
  * @brief Module handler for registering device drivers
  *
  * This module handler is used to automatically register device
  * drivers when modules are loaded. If @p what is MOD_LOAD, it calls
  * devclass_add_driver() for the driver described by the
  * driver_module_data structure pointed to by @p arg.
  */
 int
 driver_module_handler(module_t mod, int what, void *arg)
 {
 	struct driver_module_data *dmd;
 	devclass_t bus_devclass;
 	kobj_class_t driver;
 	int error, pass;
 
 	dmd = (struct driver_module_data *)arg;
 	bus_devclass = devclass_find_internal(dmd->dmd_busname, NULL, TRUE);
 	error = 0;
 
 	switch (what) {
 	case MOD_LOAD:
 		if (dmd->dmd_chainevh)
 			error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg);
 
 		pass = dmd->dmd_pass;
 		driver = dmd->dmd_driver;
 		PDEBUG(("Loading module: driver %s on bus %s (pass %d)",
 		    DRIVERNAME(driver), dmd->dmd_busname, pass));
 		error = devclass_add_driver(bus_devclass, driver, pass,
 		    dmd->dmd_devclass);
 		break;
 
 	case MOD_UNLOAD:
 		PDEBUG(("Unloading module: driver %s from bus %s",
 		    DRIVERNAME(dmd->dmd_driver),
 		    dmd->dmd_busname));
 		error = devclass_delete_driver(bus_devclass,
 		    dmd->dmd_driver);
 
 		if (!error && dmd->dmd_chainevh)
 			error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg);
 		break;
 	case MOD_QUIESCE:
 		PDEBUG(("Quiesce module: driver %s from bus %s",
 		    DRIVERNAME(dmd->dmd_driver),
 		    dmd->dmd_busname));
 		error = devclass_quiesce_driver(bus_devclass,
 		    dmd->dmd_driver);
 
 		if (!error && dmd->dmd_chainevh)
 			error = dmd->dmd_chainevh(mod,what,dmd->dmd_chainarg);
 		break;
 	default:
 		error = EOPNOTSUPP;
 		break;
 	}
 
 	return (error);
 }
 
 /**
  * @brief Enumerate all hinted devices for this bus.
  *
  * Walks through the hints for this bus and calls the bus_hinted_child
  * routine for each one it finds.  It searches first for the specific
  * bus that's being probed for hinted children (e.g. isa0), and then for
  * generic children (e.g. isa).
  *
  * @param	bus	bus device to enumerate
  */
 void
 bus_enumerate_hinted_children(device_t bus)
 {
 	int i;
 	const char *dname, *busname;
 	int dunit;
 
 	/*
 	 * enumerate all devices on the specific bus
 	 */
 	busname = device_get_nameunit(bus);
 	i = 0;
 	while (resource_find_match(&i, &dname, &dunit, "at", busname) == 0)
 		BUS_HINTED_CHILD(bus, dname, dunit);
 
 	/*
 	 * and all the generic ones.
 	 */
 	busname = device_get_name(bus);
 	i = 0;
 	while (resource_find_match(&i, &dname, &dunit, "at", busname) == 0)
 		BUS_HINTED_CHILD(bus, dname, dunit);
 }
 
 #ifdef BUS_DEBUG
 
 /* the _short versions avoid iteration by not calling anything that prints
  * more than oneliners. I love oneliners.
  */
 
 static void
 print_device_short(device_t dev, int indent)
 {
 	if (!dev)
 		return;
 
 	indentprintf(("device %d: <%s> %sparent,%schildren,%s%s%s%s%s,%sivars,%ssoftc,busy=%d\n",
 	    dev->unit, dev->desc,
 	    (dev->parent? "":"no "),
 	    (TAILQ_EMPTY(&dev->children)? "no ":""),
 	    (dev->flags&DF_ENABLED? "enabled,":"disabled,"),
 	    (dev->flags&DF_FIXEDCLASS? "fixed,":""),
 	    (dev->flags&DF_WILDCARD? "wildcard,":""),
 	    (dev->flags&DF_DESCMALLOCED? "descmalloced,":""),
 	    (dev->flags&DF_REBID? "rebiddable,":""),
 	    (dev->ivars? "":"no "),
 	    (dev->softc? "":"no "),
 	    dev->busy));
 }
 
 static void
 print_device(device_t dev, int indent)
 {
 	if (!dev)
 		return;
 
 	print_device_short(dev, indent);
 
 	indentprintf(("Parent:\n"));
 	print_device_short(dev->parent, indent+1);
 	indentprintf(("Driver:\n"));
 	print_driver_short(dev->driver, indent+1);
 	indentprintf(("Devclass:\n"));
 	print_devclass_short(dev->devclass, indent+1);
 }
 
 void
 print_device_tree_short(device_t dev, int indent)
 /* print the device and all its children (indented) */
 {
 	device_t child;
 
 	if (!dev)
 		return;
 
 	print_device_short(dev, indent);
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		print_device_tree_short(child, indent+1);
 	}
 }
 
 void
 print_device_tree(device_t dev, int indent)
 /* print the device and all its children (indented) */
 {
 	device_t child;
 
 	if (!dev)
 		return;
 
 	print_device(dev, indent);
 
 	TAILQ_FOREACH(child, &dev->children, link) {
 		print_device_tree(child, indent+1);
 	}
 }
 
 static void
 print_driver_short(driver_t *driver, int indent)
 {
 	if (!driver)
 		return;
 
 	indentprintf(("driver %s: softc size = %zd\n",
 	    driver->name, driver->size));
 }
 
 static void
 print_driver(driver_t *driver, int indent)
 {
 	if (!driver)
 		return;
 
 	print_driver_short(driver, indent);
 }
 
 static void
 print_driver_list(driver_list_t drivers, int indent)
 {
 	driverlink_t driver;
 
 	TAILQ_FOREACH(driver, &drivers, link) {
 		print_driver(driver->driver, indent);
 	}
 }
 
 static void
 print_devclass_short(devclass_t dc, int indent)
 {
 	if ( !dc )
 		return;
 
 	indentprintf(("devclass %s: max units = %d\n", dc->name, dc->maxunit));
 }
 
 static void
 print_devclass(devclass_t dc, int indent)
 {
 	int i;
 
 	if ( !dc )
 		return;
 
 	print_devclass_short(dc, indent);
 	indentprintf(("Drivers:\n"));
 	print_driver_list(dc->drivers, indent+1);
 
 	indentprintf(("Devices:\n"));
 	for (i = 0; i < dc->maxunit; i++)
 		if (dc->devices[i])
 			print_device(dc->devices[i], indent+1);
 }
 
 void
 print_devclass_list_short(void)
 {
 	devclass_t dc;
 
 	printf("Short listing of devclasses, drivers & devices:\n");
 	TAILQ_FOREACH(dc, &devclasses, link) {
 		print_devclass_short(dc, 0);
 	}
 }
 
 void
 print_devclass_list(void)
 {
 	devclass_t dc;
 
 	printf("Full listing of devclasses, drivers & devices:\n");
 	TAILQ_FOREACH(dc, &devclasses, link) {
 		print_devclass(dc, 0);
 	}
 }
 
 #endif
 
 /*
  * User-space access to the device tree.
  *
  * We implement a small set of nodes:
  *
  * hw.bus			Single integer read method to obtain the
  *				current generation count.
  * hw.bus.devices		Reads the entire device tree in flat space.
  * hw.bus.rman			Resource manager interface
  *
  * We might like to add the ability to scan devclasses and/or drivers to
  * determine what else is currently loaded/available.
  */
 
 static int
 sysctl_bus(SYSCTL_HANDLER_ARGS)
 {
 	struct u_businfo	ubus;
 
 	ubus.ub_version = BUS_USER_VERSION;
 	ubus.ub_generation = bus_data_generation;
 
 	return (SYSCTL_OUT(req, &ubus, sizeof(ubus)));
 }
 SYSCTL_NODE(_hw_bus, OID_AUTO, info, CTLFLAG_RW, sysctl_bus,
     "bus-related data");
 
 static int
 sysctl_devices(SYSCTL_HANDLER_ARGS)
 {
 	int			*name = (int *)arg1;
 	u_int			namelen = arg2;
 	int			index;
 	struct device		*dev;
 	struct u_device		udev;	/* XXX this is a bit big */
 	int			error;
 
 	if (namelen != 2)
 		return (EINVAL);
 
 	if (bus_data_generation_check(name[0]))
 		return (EINVAL);
 
 	index = name[1];
 
 	/*
 	 * Scan the list of devices, looking for the requested index.
 	 */
 	TAILQ_FOREACH(dev, &bus_data_devices, devlink) {
 		if (index-- == 0)
 			break;
 	}
 	if (dev == NULL)
 		return (ENOENT);
 
 	/*
 	 * Populate the return array.
 	 */
 	bzero(&udev, sizeof(udev));
 	udev.dv_handle = (uintptr_t)dev;
 	udev.dv_parent = (uintptr_t)dev->parent;
 	if (dev->nameunit != NULL)
 		strlcpy(udev.dv_name, dev->nameunit, sizeof(udev.dv_name));
 	if (dev->desc != NULL)
 		strlcpy(udev.dv_desc, dev->desc, sizeof(udev.dv_desc));
 	if (dev->driver != NULL && dev->driver->name != NULL)
 		strlcpy(udev.dv_drivername, dev->driver->name,
 		    sizeof(udev.dv_drivername));
 	bus_child_pnpinfo_str(dev, udev.dv_pnpinfo, sizeof(udev.dv_pnpinfo));
 	bus_child_location_str(dev, udev.dv_location, sizeof(udev.dv_location));
 	udev.dv_devflags = dev->devflags;
 	udev.dv_flags = dev->flags;
 	udev.dv_state = dev->state;
 	error = SYSCTL_OUT(req, &udev, sizeof(udev));
 	return (error);
 }
 
 SYSCTL_NODE(_hw_bus, OID_AUTO, devices, CTLFLAG_RD, sysctl_devices,
     "system device tree");
 
 int
 bus_data_generation_check(int generation)
 {
 	if (generation != bus_data_generation)
 		return (1);
 
 	/* XXX generate optimised lists here? */
 	return (0);
 }
 
 void
 bus_data_generation_update(void)
 {
 	bus_data_generation++;
 }
 
 int
 bus_free_resource(device_t dev, int type, struct resource *r)
 {
 	if (r == NULL)
 		return (0);
 	return (bus_release_resource(dev, type, rman_get_rid(r), r));
 }
 
 /*
  * /dev/devctl2 implementation.  The existing /dev/devctl device has
  * implicit semantics on open, so it could not be reused for this.
  * Another option would be to call this /dev/bus?
  */
 static int
 find_device(struct devreq *req, device_t *devp)
 {
 	device_t dev;
 
 	/*
 	 * First, ensure that the name is nul terminated.
 	 */
 	if (memchr(req->dr_name, '\0', sizeof(req->dr_name)) == NULL)
 		return (EINVAL);
 
 	/*
 	 * Second, try to find an attached device whose name matches
 	 * 'name'.
 	 */
 	TAILQ_FOREACH(dev, &bus_data_devices, devlink) {
 		if (dev->nameunit != NULL &&
 		    strcmp(dev->nameunit, req->dr_name) == 0) {
 			*devp = dev;
 			return (0);
 		}
 	}
 
 	/* Finally, give device enumerators a chance. */
 	dev = NULL;
 	EVENTHANDLER_INVOKE(dev_lookup, req->dr_name, &dev);
 	if (dev == NULL)
 		return (ENOENT);
 	*devp = dev;
 	return (0);
 }
 
 static bool
 driver_exists(struct device *bus, const char *driver)
 {
 	devclass_t dc;
 
 	for (dc = bus->devclass; dc != NULL; dc = dc->parent) {
 		if (devclass_find_driver_internal(dc, driver) != NULL)
 			return (true);
 	}
 	return (false);
 }
 
 static int
 devctl2_ioctl(struct cdev *cdev, u_long cmd, caddr_t data, int fflag,
     struct thread *td)
 {
 	struct devreq *req;
 	device_t dev;
 	int error, old;
 
 	/* Locate the device to control. */
 	mtx_lock(&Giant);
 	req = (struct devreq *)data;
 	switch (cmd) {
 	case DEV_ATTACH:
 	case DEV_DETACH:
 	case DEV_ENABLE:
 	case DEV_DISABLE:
 	case DEV_SUSPEND:
 	case DEV_RESUME:
 	case DEV_SET_DRIVER:
 		error = priv_check(td, PRIV_DRIVER);
 		if (error == 0)
 			error = find_device(req, &dev);
 		break;
 	default:
 		error = ENOTTY;
 		break;
 	}
 	if (error) {
 		mtx_unlock(&Giant);
 		return (error);
 	}
 
 	/* Perform the requested operation. */
 	switch (cmd) {
 	case DEV_ATTACH:
 		if (device_is_attached(dev) && (dev->flags & DF_REBID) == 0)
 			error = EBUSY;
 		else if (!device_is_enabled(dev))
 			error = ENXIO;
 		else
 			error = device_probe_and_attach(dev);
 		break;
 	case DEV_DETACH:
 		if (!device_is_attached(dev)) {
 			error = ENXIO;
 			break;
 		}
 		if (!(req->dr_flags & DEVF_FORCE_DETACH)) {
 			error = device_quiesce(dev);
 			if (error)
 				break;
 		}
 		error = device_detach(dev);
 		break;
 	case DEV_ENABLE:
 		if (device_is_enabled(dev)) {
 			error = EBUSY;
 			break;
 		}
 
 		/*
 		 * If the device has been probed but not attached (e.g.
 		 * when it has been disabled by a loader hint), just
 		 * attach the device rather than doing a full probe.
 		 */
 		device_enable(dev);
 		if (device_is_alive(dev)) {
 			/*
 			 * If the device was disabled via a hint, clear
 			 * the hint.
 			 */
 			if (resource_disabled(dev->driver->name, dev->unit))
 				resource_unset_value(dev->driver->name,
 				    dev->unit, "disabled");
 			error = device_attach(dev);
 		} else
 			error = device_probe_and_attach(dev);
 		break;
 	case DEV_DISABLE:
 		if (!device_is_enabled(dev)) {
 			error = ENXIO;
 			break;
 		}
 
 		if (!(req->dr_flags & DEVF_FORCE_DETACH)) {
 			error = device_quiesce(dev);
 			if (error)
 				break;
 		}
 
 		/*
 		 * Force DF_FIXEDCLASS on around detach to preserve
 		 * the existing name.
 		 */
 		old = dev->flags;
 		dev->flags |= DF_FIXEDCLASS;
 		error = device_detach(dev);
 		if (!(old & DF_FIXEDCLASS))
 			dev->flags &= ~DF_FIXEDCLASS;
 		if (error == 0)
 			device_disable(dev);
 		break;
 	case DEV_SUSPEND:
 		if (device_is_suspended(dev)) {
 			error = EBUSY;
 			break;
 		}
 		if (device_get_parent(dev) == NULL) {
 			error = EINVAL;
 			break;
 		}
 		error = BUS_SUSPEND_CHILD(device_get_parent(dev), dev);
 		break;
 	case DEV_RESUME:
 		if (!device_is_suspended(dev)) {
 			error = EINVAL;
 			break;
 		}
 		if (device_get_parent(dev) == NULL) {
 			error = EINVAL;
 			break;
 		}
 		error = BUS_RESUME_CHILD(device_get_parent(dev), dev);
 		break;
 	case DEV_SET_DRIVER: {
 		devclass_t dc;
 		char driver[128];
 
 		error = copyinstr(req->dr_data, driver, sizeof(driver), NULL);
 		if (error)
 			break;
 		if (driver[0] == '\0') {
 			error = EINVAL;
 			break;
 		}
 		if (dev->devclass != NULL &&
 		    strcmp(driver, dev->devclass->name) == 0)
 			/* XXX: Could possibly force DF_FIXEDCLASS on? */
 			break;
 
 		/*
 		 * Scan drivers for this device's bus looking for at
 		 * least one matching driver.
 		 */
 		if (dev->parent == NULL) {
 			error = EINVAL;
 			break;
 		}
 		if (!driver_exists(dev->parent, driver)) {
 			error = ENOENT;
 			break;
 		}
 		dc = devclass_create(driver);
 		if (dc == NULL) {
 			error = ENOMEM;
 			break;
 		}
 
 		/* Detach device if necessary. */
 		if (device_is_attached(dev)) {
 			if (req->dr_flags & DEVF_SET_DRIVER_DETACH)
 				error = device_detach(dev);
 			else
 				error = EBUSY;
 			if (error)
 				break;
 		}
 
 		/* Clear any previously-fixed device class and unit. */
 		if (dev->flags & DF_FIXEDCLASS)
 			devclass_delete_device(dev->devclass, dev);
 		dev->flags |= DF_WILDCARD;
 		dev->unit = -1;
 
 		/* Force the new device class. */
 		error = devclass_add_device(dc, dev);
 		if (error)
 			break;
 		dev->flags |= DF_FIXEDCLASS;
 		error = device_probe_and_attach(dev);
 		break;
 	}
 	}
 	mtx_unlock(&Giant);
 	return (error);
 }
 
 static struct cdevsw devctl2_cdevsw = {
 	.d_version =	D_VERSION,
 	.d_ioctl =	devctl2_ioctl,
 	.d_name =	"devctl2",
 };
 
 static void
 devctl2_init(void)
 {
 
 	make_dev_credf(MAKEDEV_ETERNAL, &devctl2_cdevsw, 0, NULL,
 	    UID_ROOT, GID_WHEEL, 0600, "devctl2");
 }
Index: user/ngie/more-tests/sys/kern/vfs_subr.c
===================================================================
--- user/ngie/more-tests/sys/kern/vfs_subr.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/vfs_subr.c	(revision 281585)
@@ -1,4892 +1,4893 @@
 /*-
  * Copyright (c) 1989, 1993
  *	The Regents of the University of California.  All rights reserved.
  * (c) UNIX System Laboratories, Inc.
  * All or some portions of this file are derived from material licensed
  * to the University of California by American Telephone and Telegraph
  * Co. or Unix System Laboratories, Inc. and are reproduced herein with
  * the permission of UNIX System Laboratories, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)vfs_subr.c	8.31 (Berkeley) 5/26/95
  */
 
 /*
  * External virtual filesystem routines
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_compat.h"
 #include "opt_ddb.h"
 #include "opt_watchdog.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/bio.h>
 #include <sys/buf.h>
 #include <sys/condvar.h>
 #include <sys/conf.h>
 #include <sys/dirent.h>
 #include <sys/event.h>
 #include <sys/eventhandler.h>
 #include <sys/extattr.h>
 #include <sys/file.h>
 #include <sys/fcntl.h>
 #include <sys/jail.h>
 #include <sys/kdb.h>
 #include <sys/kernel.h>
 #include <sys/kthread.h>
 #include <sys/lockf.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/namei.h>
 #include <sys/pctrie.h>
 #include <sys/priv.h>
 #include <sys/reboot.h>
 #include <sys/rwlock.h>
 #include <sys/sched.h>
 #include <sys/sleepqueue.h>
 #include <sys/smp.h>
 #include <sys/stat.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
 #include <sys/vmmeter.h>
 #include <sys/vnode.h>
 #include <sys/watchdog.h>
 
 #include <machine/stdarg.h>
 
 #include <security/mac/mac_framework.h>
 
 #include <vm/vm.h>
 #include <vm/vm_object.h>
 #include <vm/vm_extern.h>
 #include <vm/pmap.h>
 #include <vm/vm_map.h>
 #include <vm/vm_page.h>
 #include <vm/vm_kern.h>
 #include <vm/uma.h>
 
 #ifdef DDB
 #include <ddb/ddb.h>
 #endif
 
 static void	delmntque(struct vnode *vp);
 static int	flushbuflist(struct bufv *bufv, int flags, struct bufobj *bo,
 		    int slpflag, int slptimeo);
 static void	syncer_shutdown(void *arg, int howto);
 static int	vtryrecycle(struct vnode *vp);
 static void	v_incr_usecount(struct vnode *);
 static void	v_decr_usecount(struct vnode *);
 static void	v_decr_useonly(struct vnode *);
 static void	v_upgrade_usecount(struct vnode *);
 static void	vnlru_free(int);
 static void	vgonel(struct vnode *);
 static void	vfs_knllock(void *arg);
 static void	vfs_knlunlock(void *arg);
 static void	vfs_knl_assert_locked(void *arg);
 static void	vfs_knl_assert_unlocked(void *arg);
 static void	destroy_vpollinfo(struct vpollinfo *vi);
 
 /*
  * Number of vnodes in existence.  Increased whenever getnewvnode()
  * allocates a new vnode, decreased in vdropl() for VI_DOOMED vnode.
  */
 static unsigned long	numvnodes;
 
 SYSCTL_ULONG(_vfs, OID_AUTO, numvnodes, CTLFLAG_RD, &numvnodes, 0,
     "Number of vnodes in existence");
 
 static u_long vnodes_created;
 SYSCTL_ULONG(_vfs, OID_AUTO, vnodes_created, CTLFLAG_RD, &vnodes_created,
     0, "Number of vnodes created by getnewvnode");
 
 /*
  * Conversion tables for conversion from vnode types to inode formats
  * and back.
  */
 enum vtype iftovt_tab[16] = {
 	VNON, VFIFO, VCHR, VNON, VDIR, VNON, VBLK, VNON,
 	VREG, VNON, VLNK, VNON, VSOCK, VNON, VNON, VBAD,
 };
 int vttoif_tab[10] = {
 	0, S_IFREG, S_IFDIR, S_IFBLK, S_IFCHR, S_IFLNK,
 	S_IFSOCK, S_IFIFO, S_IFMT, S_IFMT
 };
 
 /*
  * List of vnodes that are ready for recycling.
  */
 static TAILQ_HEAD(freelst, vnode) vnode_free_list;
 
 /*
  * Free vnode target.  Free vnodes may simply be files which have been stat'd
  * but not read.  This is somewhat common, and a small cache of such files
  * should be kept to avoid recreation costs.
  */
 static u_long wantfreevnodes;
 SYSCTL_ULONG(_vfs, OID_AUTO, wantfreevnodes, CTLFLAG_RW, &wantfreevnodes, 0, "");
 /* Number of vnodes in the free list. */
 static u_long freevnodes;
 SYSCTL_ULONG(_vfs, OID_AUTO, freevnodes, CTLFLAG_RD, &freevnodes, 0,
     "Number of vnodes in the free list");
 
 static int vlru_allow_cache_src;
 SYSCTL_INT(_vfs, OID_AUTO, vlru_allow_cache_src, CTLFLAG_RW,
     &vlru_allow_cache_src, 0, "Allow vlru to reclaim source vnode");
 
 static u_long recycles_count;
 SYSCTL_ULONG(_vfs, OID_AUTO, recycles, CTLFLAG_RD, &recycles_count, 0,
     "Number of vnodes recycled to avoid exceeding kern.maxvnodes");
 
 /*
  * Various variables used for debugging the new implementation of
  * reassignbuf().
  * XXX these are probably of (very) limited utility now.
  */
 static int reassignbufcalls;
 SYSCTL_INT(_vfs, OID_AUTO, reassignbufcalls, CTLFLAG_RW, &reassignbufcalls, 0,
     "Number of calls to reassignbuf");
 
 /*
  * Cache for the mount type id assigned to NFS.  This is used for
  * special checks in nfs/nfs_nqlease.c and vm/vnode_pager.c.
  */
 int	nfs_mount_type = -1;
 
 /* To keep more than one thread at a time from running vfs_getnewfsid */
 static struct mtx mntid_mtx;
 
 /*
  * Lock for any access to the following:
  *	vnode_free_list
  *	numvnodes
  *	freevnodes
  */
 static struct mtx vnode_free_list_mtx;
 
 /* Publicly exported FS */
 struct nfs_public nfs_pub;
 
 static uma_zone_t buf_trie_zone;
 
 /* Zone for allocation of new vnodes - used exclusively by getnewvnode() */
 static uma_zone_t vnode_zone;
 static uma_zone_t vnodepoll_zone;
 
 /*
  * The workitem queue.
  *
  * It is useful to delay writes of file data and filesystem metadata
  * for tens of seconds so that quickly created and deleted files need
  * not waste disk bandwidth being created and removed. To realize this,
  * we append vnodes to a "workitem" queue. When running with a soft
  * updates implementation, most pending metadata dependencies should
  * not wait for more than a few seconds. Thus, writes to mounted block
  * devices are delayed only about half the time that file data is delayed.
  * Similarly, directory updates are more critical, so are only delayed
  * about a third the time that file data is delayed. Thus, there are
  * SYNCER_MAXDELAY queues that are processed round-robin at a rate of
  * one each second (driven off the filesystem syncer process). The
  * syncer_delayno variable indicates the next queue that is to be processed.
  * Items that need to be processed soon are placed in this queue:
  *
  *	syncer_workitem_pending[syncer_delayno]
  *
  * A delay of fifteen seconds is done by placing the request fifteen
  * entries later in the queue:
  *
  *	syncer_workitem_pending[(syncer_delayno + 15) & syncer_mask]
  *
  */
 static int syncer_delayno;
 static long syncer_mask;
 LIST_HEAD(synclist, bufobj);
 static struct synclist *syncer_workitem_pending;
 /*
  * The sync_mtx protects:
  *	bo->bo_synclist
  *	sync_vnode_count
  *	syncer_delayno
  *	syncer_state
  *	syncer_workitem_pending
  *	syncer_worklist_len
  *	rushjob
  */
 static struct mtx sync_mtx;
 static struct cv sync_wakeup;
 
 #define SYNCER_MAXDELAY		32
 static int syncer_maxdelay = SYNCER_MAXDELAY;	/* maximum delay time */
 static int syncdelay = 30;		/* max time to delay syncing data */
 static int filedelay = 30;		/* time to delay syncing files */
 SYSCTL_INT(_kern, OID_AUTO, filedelay, CTLFLAG_RW, &filedelay, 0,
     "Time to delay syncing files (in seconds)");
 static int dirdelay = 29;		/* time to delay syncing directories */
 SYSCTL_INT(_kern, OID_AUTO, dirdelay, CTLFLAG_RW, &dirdelay, 0,
     "Time to delay syncing directories (in seconds)");
 static int metadelay = 28;		/* time to delay syncing metadata */
 SYSCTL_INT(_kern, OID_AUTO, metadelay, CTLFLAG_RW, &metadelay, 0,
     "Time to delay syncing metadata (in seconds)");
 static int rushjob;		/* number of slots to run ASAP */
 static int stat_rush_requests;	/* number of times I/O speeded up */
 SYSCTL_INT(_debug, OID_AUTO, rush_requests, CTLFLAG_RW, &stat_rush_requests, 0,
     "Number of times I/O speeded up (rush requests)");
 
 /*
  * When shutting down the syncer, run it at four times normal speed.
  */
 #define SYNCER_SHUTDOWN_SPEEDUP		4
 static int sync_vnode_count;
 static int syncer_worklist_len;
 static enum { SYNCER_RUNNING, SYNCER_SHUTTING_DOWN, SYNCER_FINAL_DELAY }
     syncer_state;
 
 /*
  * Number of vnodes we want to exist at any one time.  This is mostly used
  * to size hash tables in vnode-related code.  It is normally not used in
  * getnewvnode(), as wantfreevnodes is normally nonzero.
  *
  * XXX desiredvnodes is historical cruft and should not exist.
  */
 int desiredvnodes;
 SYSCTL_INT(_kern, KERN_MAXVNODES, maxvnodes, CTLFLAG_RW,
     &desiredvnodes, 0, "Maximum number of vnodes");
 SYSCTL_ULONG(_kern, OID_AUTO, minvnodes, CTLFLAG_RW,
     &wantfreevnodes, 0, "Minimum number of vnodes (legacy)");
 static int vnlru_nowhere;
 SYSCTL_INT(_debug, OID_AUTO, vnlru_nowhere, CTLFLAG_RW,
     &vnlru_nowhere, 0, "Number of times the vnlru process ran without success");
 
 /* Shift count for (uintptr_t)vp to initialize vp->v_hash. */
 static int vnsz2log;
 
 /*
  * Support for the bufobj clean & dirty pctrie.
  */
 static void *
 buf_trie_alloc(struct pctrie *ptree)
 {
 
 	return (uma_zalloc(buf_trie_zone, M_NOWAIT));
 }
 
 static void
 buf_trie_free(struct pctrie *ptree, void *node)
 {
 
 	uma_zfree(buf_trie_zone, node);
 }
 PCTRIE_DEFINE(BUF, buf, b_lblkno, buf_trie_alloc, buf_trie_free);
 
 /*
  * Initialize the vnode management data structures.
  *
  * Reevaluate the following cap on the number of vnodes after the physical
  * memory size exceeds 512GB.  In the limit, as the physical memory size
  * grows, the ratio of physical pages to vnodes approaches sixteen to one.
  */
 #ifndef	MAXVNODES_MAX
 #define	MAXVNODES_MAX	(512 * (1024 * 1024 * 1024 / (int)PAGE_SIZE / 16))
 #endif
 static void
 vntblinit(void *dummy __unused)
 {
 	u_int i;
 	int physvnodes, virtvnodes;
 
 	/*
 	 * Desiredvnodes is a function of the physical memory size and the
 	 * kernel's heap size.  Generally speaking, it scales with the
 	 * physical memory size.  The ratio of desiredvnodes to physical pages
 	 * is one to four until desiredvnodes exceeds 98,304.  Thereafter, the
 	 * marginal ratio of desiredvnodes to physical pages is one to
 	 * sixteen.  However, desiredvnodes is limited by the kernel's heap
 	 * size.  The memory required by desiredvnodes vnodes and vm objects
 	 * may not exceed one seventh of the kernel's heap size.
 	 */
 	physvnodes = maxproc + vm_cnt.v_page_count / 16 + 3 * min(98304 * 4,
 	    vm_cnt.v_page_count) / 16;
 	virtvnodes = vm_kmem_size / (7 * (sizeof(struct vm_object) +
 	    sizeof(struct vnode)));
 	desiredvnodes = min(physvnodes, virtvnodes);
 	if (desiredvnodes > MAXVNODES_MAX) {
 		if (bootverbose)
 			printf("Reducing kern.maxvnodes %d -> %d\n",
 			    desiredvnodes, MAXVNODES_MAX);
 		desiredvnodes = MAXVNODES_MAX;
 	}
 	wantfreevnodes = desiredvnodes / 4;
 	mtx_init(&mntid_mtx, "mntid", NULL, MTX_DEF);
 	TAILQ_INIT(&vnode_free_list);
 	mtx_init(&vnode_free_list_mtx, "vnode_free_list", NULL, MTX_DEF);
 	vnode_zone = uma_zcreate("VNODE", sizeof (struct vnode), NULL, NULL,
 	    NULL, NULL, UMA_ALIGN_PTR, 0);
 	vnodepoll_zone = uma_zcreate("VNODEPOLL", sizeof (struct vpollinfo),
 	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
 	/*
 	 * Preallocate enough nodes to support one per buf so that
 	 * an insert cannot fail.  reassignbuf() callers cannot
 	 * tolerate an insertion failure.
 	 */
 	buf_trie_zone = uma_zcreate("BUF TRIE", pctrie_node_size(),
 	    NULL, NULL, pctrie_zone_init, NULL, UMA_ALIGN_PTR,
 	    UMA_ZONE_NOFREE | UMA_ZONE_VM);
 	uma_prealloc(buf_trie_zone, nbuf);
 	/*
 	 * Initialize the filesystem syncer.
 	 */
 	syncer_workitem_pending = hashinit(syncer_maxdelay, M_VNODE,
 	    &syncer_mask);
 	syncer_maxdelay = syncer_mask + 1;
 	mtx_init(&sync_mtx, "Syncer mtx", NULL, MTX_DEF);
 	cv_init(&sync_wakeup, "syncer");
 	for (i = 1; i <= sizeof(struct vnode); i <<= 1)
 		vnsz2log++;
 	vnsz2log--;
 }
 SYSINIT(vfs, SI_SUB_VFS, SI_ORDER_FIRST, vntblinit, NULL);
 
 
 /*
  * Mark a mount point as busy. Used to synchronize access and to delay
  * unmounting. If MBF_MNTLSTLOCK is set, mountlist_mtx is released on
  * success but not on failure.
  *
  * vfs_busy() is a custom lock; it can block the caller.
  * vfs_busy() only sleeps if an unmount is active on the mount point.
  * For a mount point mp, the vfs_busy-enforced lock is ordered before the
  * lock of any vnode belonging to mp.
  *
  * Lookup uses vfs_busy() to traverse mount points.
  * root fs			var fs
  * / vnode lock		A	/ vnode lock (/var)		D
  * /var vnode lock	B	/log vnode lock(/var/log)	E
  * vfs_busy lock	C	vfs_busy lock			F
  *
  * Within each file system, the lock order is C->A->B and F->D->E.
  *
  * When traversing across mounts, the system follows that lock order:
  *
  *        C->A->B
  *              |
  *              +->F->D->E
  *
  * The lookup() process for namei("/var") illustrates the process:
  *  VOP_LOOKUP() obtains B while A is held
  *  vfs_busy() obtains a shared lock on F while A and B are held
  *  vput() releases lock on B
  *  vput() releases lock on A
  *  VFS_ROOT() obtains lock on D while shared lock on F is held
  *  vfs_unbusy() releases shared lock on F
  *  vn_lock() obtains lock on deadfs vnode vp_crossmp instead of A.
  *    Attempt to lock A (instead of vp_crossmp) while D is held would
  *    violate the global order, causing deadlocks.
  *
  * dounmount() locks B while F is drained.
  */
 int
 vfs_busy(struct mount *mp, int flags)
 {
 
 	MPASS((flags & ~MBF_MASK) == 0);
 	CTR3(KTR_VFS, "%s: mp %p with flags %d", __func__, mp, flags);
 
 	MNT_ILOCK(mp);
 	MNT_REF(mp);
 	/*
 	 * If mount point is currently being unmounted, sleep until the
 	 * mount point fate is decided.  If thread doing the unmounting fails,
 	 * it will clear MNTK_UNMOUNT flag before waking us up, indicating
 	 * that this mount point has survived the unmount attempt and vfs_busy
 	 * should retry.  Otherwise the unmounter thread will set MNTK_REFEXPIRE
 	 * flag in addition to MNTK_UNMOUNT, indicating that mount point is
 	 * about to be really destroyed.  vfs_busy needs to release its
 	 * reference on the mount point in this case and return with ENOENT,
 	 * telling the caller that the mount it tried to busy is no longer
 	 * valid.
 	 */
 	while (mp->mnt_kern_flag & MNTK_UNMOUNT) {
 		if (flags & MBF_NOWAIT || mp->mnt_kern_flag & MNTK_REFEXPIRE) {
 			MNT_REL(mp);
 			MNT_IUNLOCK(mp);
 			CTR1(KTR_VFS, "%s: failed busying before sleeping",
 			    __func__);
 			return (ENOENT);
 		}
 		if (flags & MBF_MNTLSTLOCK)
 			mtx_unlock(&mountlist_mtx);
 		mp->mnt_kern_flag |= MNTK_MWAIT;
 		msleep(mp, MNT_MTX(mp), PVFS | PDROP, "vfs_busy", 0);
 		if (flags & MBF_MNTLSTLOCK)
 			mtx_lock(&mountlist_mtx);
 		MNT_ILOCK(mp);
 	}
 	if (flags & MBF_MNTLSTLOCK)
 		mtx_unlock(&mountlist_mtx);
 	mp->mnt_lockref++;
 	MNT_IUNLOCK(mp);
 	return (0);
 }
 
 /*
  * Free a busy filesystem.
  */
 void
 vfs_unbusy(struct mount *mp)
 {
 
 	CTR2(KTR_VFS, "%s: mp %p", __func__, mp);
 	MNT_ILOCK(mp);
 	MNT_REL(mp);
 	KASSERT(mp->mnt_lockref > 0, ("negative mnt_lockref"));
 	mp->mnt_lockref--;
 	if (mp->mnt_lockref == 0 && (mp->mnt_kern_flag & MNTK_DRAINING) != 0) {
 		MPASS(mp->mnt_kern_flag & MNTK_UNMOUNT);
 		CTR1(KTR_VFS, "%s: waking up waiters", __func__);
 		mp->mnt_kern_flag &= ~MNTK_DRAINING;
 		wakeup(&mp->mnt_lockref);
 	}
 	MNT_IUNLOCK(mp);
 }
 
 /*
  * Look up a mount point by filesystem identifier.
  */
 struct mount *
 vfs_getvfs(fsid_t *fsid)
 {
 	struct mount *mp;
 
 	CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid);
 	mtx_lock(&mountlist_mtx);
 	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 		if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] &&
 		    mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) {
 			vfs_ref(mp);
 			mtx_unlock(&mountlist_mtx);
 			return (mp);
 		}
 	}
 	mtx_unlock(&mountlist_mtx);
 	CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid);
 	return ((struct mount *) 0);
 }
 
 /*
  * Look up a mount point by filesystem identifier, busying it before
  * returning.
  *
  * To avoid congestion on mountlist_mtx, implement a simple direct-mapped
  * cache for popular filesystem identifiers.  The cache is lockless, relying
  * on the fact that struct mounts are never freed.  In the worst case we may
  * get a pointer to an unmounted or even a different filesystem, so we have
  * to check what we got and take the slow path if it does not match.
  */
 struct mount *
 vfs_busyfs(fsid_t *fsid)
 {
 #define	FSID_CACHE_SIZE	256
 	typedef struct mount * volatile vmp_t;
 	static vmp_t cache[FSID_CACHE_SIZE];
 	struct mount *mp;
 	int error;
 	uint32_t hash;
 
 	CTR2(KTR_VFS, "%s: fsid %p", __func__, fsid);
 	hash = fsid->val[0] ^ fsid->val[1];
 	hash = (hash >> 16 ^ hash) & (FSID_CACHE_SIZE - 1);
 	mp = cache[hash];
 	if (mp == NULL ||
 	    mp->mnt_stat.f_fsid.val[0] != fsid->val[0] ||
 	    mp->mnt_stat.f_fsid.val[1] != fsid->val[1])
 		goto slow;
 	if (vfs_busy(mp, 0) != 0) {
 		cache[hash] = NULL;
 		goto slow;
 	}
 	if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] &&
 	    mp->mnt_stat.f_fsid.val[1] == fsid->val[1])
 		return (mp);
 	else
 		vfs_unbusy(mp);
 
 slow:
 	mtx_lock(&mountlist_mtx);
 	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 		if (mp->mnt_stat.f_fsid.val[0] == fsid->val[0] &&
 		    mp->mnt_stat.f_fsid.val[1] == fsid->val[1]) {
 			error = vfs_busy(mp, MBF_MNTLSTLOCK);
 			if (error) {
 				cache[hash] = NULL;
 				mtx_unlock(&mountlist_mtx);
 				return (NULL);
 			}
 			cache[hash] = mp;
 			return (mp);
 		}
 	}
 	CTR2(KTR_VFS, "%s: lookup failed for %p id", __func__, fsid);
 	mtx_unlock(&mountlist_mtx);
 	return ((struct mount *) 0);
 }
 
 /*
  * Check if a user can access privileged mount options.
  */
 int
 vfs_suser(struct mount *mp, struct thread *td)
 {
 	int error;
 
 	/*
 	 * If the thread is jailed, but this is not a jail-friendly file
 	 * system, deny immediately.
 	 */
 	if (!(mp->mnt_vfc->vfc_flags & VFCF_JAIL) && jailed(td->td_ucred))
 		return (EPERM);
 
 	/*
 	 * If the file system was mounted outside the jail of the calling
 	 * thread, deny immediately.
 	 */
 	if (prison_check(td->td_ucred, mp->mnt_cred) != 0)
 		return (EPERM);
 
 	/*
 	 * If file system supports delegated administration, we don't check
 	 * for the PRIV_VFS_MOUNT_OWNER privilege - it will be better verified
 	 * by the file system itself.
 	 * If this is not the user that did original mount, we check for
 	 * the PRIV_VFS_MOUNT_OWNER privilege.
 	 */
 	if (!(mp->mnt_vfc->vfc_flags & VFCF_DELEGADMIN) &&
 	    mp->mnt_cred->cr_uid != td->td_ucred->cr_uid) {
 		if ((error = priv_check(td, PRIV_VFS_MOUNT_OWNER)) != 0)
 			return (error);
 	}
 	return (0);
 }
 
 /*
  * Get a new unique fsid.  Try to make its val[0] unique, since this value
  * will be used to create fake device numbers for stat().  Also try (but
  * not so hard) to make its val[0] unique mod 2^16, since some emulators only
  * support 16-bit device numbers.  We end up with unique val[0]'s for the
  * first 2^16 calls and unique val[0]'s mod 2^16 for the first 2^8 calls.
  *
  * Keep in mind that several mounts may be running in parallel.  Starting
  * the search one past where the previous search terminated is both a
  * micro-optimization and a defense against returning the same fsid to
  * different mounts.
  */
 void
 vfs_getnewfsid(struct mount *mp)
 {
 	static uint16_t mntid_base;
 	struct mount *nmp;
 	fsid_t tfsid;
 	int mtype;
 
 	CTR2(KTR_VFS, "%s: mp %p", __func__, mp);
 	mtx_lock(&mntid_mtx);
 	mtype = mp->mnt_vfc->vfc_typenum;
 	tfsid.val[1] = mtype;
 	mtype = (mtype & 0xFF) << 24;
 	for (;;) {
 		tfsid.val[0] = makedev(255,
 		    mtype | ((mntid_base & 0xFF00) << 8) | (mntid_base & 0xFF));
 		mntid_base++;
 		if ((nmp = vfs_getvfs(&tfsid)) == NULL)
 			break;
 		vfs_rel(nmp);
 	}
 	mp->mnt_stat.f_fsid.val[0] = tfsid.val[0];
 	mp->mnt_stat.f_fsid.val[1] = tfsid.val[1];
 	mtx_unlock(&mntid_mtx);
 }
 
 /*
  * Knob to control the precision of file timestamps:
  *
  *   0 = seconds only; nanoseconds zeroed.
  *   1 = seconds and nanoseconds, accurate within 1/HZ.
  *   2 = seconds and nanoseconds, truncated to microseconds.
  * >=3 = seconds and nanoseconds, maximum precision.
  */
 enum { TSP_SEC, TSP_HZ, TSP_USEC, TSP_NSEC };
 
 static int timestamp_precision = TSP_USEC;
 SYSCTL_INT(_vfs, OID_AUTO, timestamp_precision, CTLFLAG_RW,
     &timestamp_precision, 0, "File timestamp precision (0: seconds, "
     "1: sec + ns accurate to 1/HZ, 2: sec + ns truncated to us, "
     "3+: sec + ns (max. precision))");
 
 /*
  * Get a current timestamp.
  */
 void
 vfs_timestamp(struct timespec *tsp)
 {
 	struct timeval tv;
 
 	switch (timestamp_precision) {
 	case TSP_SEC:
 		tsp->tv_sec = time_second;
 		tsp->tv_nsec = 0;
 		break;
 	case TSP_HZ:
 		getnanotime(tsp);
 		break;
 	case TSP_USEC:
 		microtime(&tv);
 		TIMEVAL_TO_TIMESPEC(&tv, tsp);
 		break;
 	case TSP_NSEC:
 	default:
 		nanotime(tsp);
 		break;
 	}
 }
 
 /*
  * Set vnode attributes to VNOVAL
  */
 void
 vattr_null(struct vattr *vap)
 {
 
 	vap->va_type = VNON;
 	vap->va_size = VNOVAL;
 	vap->va_bytes = VNOVAL;
 	vap->va_mode = VNOVAL;
 	vap->va_nlink = VNOVAL;
 	vap->va_uid = VNOVAL;
 	vap->va_gid = VNOVAL;
 	vap->va_fsid = VNOVAL;
 	vap->va_fileid = VNOVAL;
 	vap->va_blocksize = VNOVAL;
 	vap->va_rdev = VNOVAL;
 	vap->va_atime.tv_sec = VNOVAL;
 	vap->va_atime.tv_nsec = VNOVAL;
 	vap->va_mtime.tv_sec = VNOVAL;
 	vap->va_mtime.tv_nsec = VNOVAL;
 	vap->va_ctime.tv_sec = VNOVAL;
 	vap->va_ctime.tv_nsec = VNOVAL;
 	vap->va_birthtime.tv_sec = VNOVAL;
 	vap->va_birthtime.tv_nsec = VNOVAL;
 	vap->va_flags = VNOVAL;
 	vap->va_gen = VNOVAL;
 	vap->va_vaflags = 0;
 }
 
 /*
  * This routine is called when we have too many vnodes.  It attempts
  * to free <count> vnodes and will potentially free vnodes that still
  * have VM backing store (VM backing store is typically the cause
  * of a vnode blowout so we want to do this).  Therefore, this operation
  * is not considered cheap.
  *
  * A number of conditions may prevent a vnode from being reclaimed.
  * The buffer cache may have references on the vnode, a directory
  * vnode may still have references due to the namei cache representing
  * underlying files, or the vnode may be in active use.   It is not
  * desirable to reuse such vnodes.  These conditions may cause the
  * number of vnodes to reach some minimum value regardless of what
  * you set kern.maxvnodes to.  Do not set kern.maxvnodes too low.
  */
 static int
 vlrureclaim(struct mount *mp)
 {
 	struct vnode *vp;
 	int done;
 	int trigger;
 	int usevnodes;
 	int count;
 
 	/*
 	 * Calculate the trigger point, don't allow user
 	 * screwups to blow us up.   This prevents us from
 	 * recycling vnodes with lots of resident pages.  We
 	 * aren't trying to free memory, we are trying to
 	 * free vnodes.
 	 */
 	usevnodes = desiredvnodes;
 	if (usevnodes <= 0)
 		usevnodes = 1;
 	trigger = vm_cnt.v_page_count * 2 / usevnodes;
 	done = 0;
 	vn_start_write(NULL, &mp, V_WAIT);
 	MNT_ILOCK(mp);
 	count = mp->mnt_nvnodelistsize / 10 + 1;
 	while (count != 0) {
 		vp = TAILQ_FIRST(&mp->mnt_nvnodelist);
 		while (vp != NULL && vp->v_type == VMARKER)
 			vp = TAILQ_NEXT(vp, v_nmntvnodes);
 		if (vp == NULL)
 			break;
 		TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
 		TAILQ_INSERT_TAIL(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
 		--count;
 		if (!VI_TRYLOCK(vp))
 			goto next_iter;
 		/*
 		 * If it's been deconstructed already, it's still
 		 * referenced, or it exceeds the trigger, skip it.
 		 */
 		if (vp->v_usecount ||
 		    (!vlru_allow_cache_src &&
 			!LIST_EMPTY(&(vp)->v_cache_src)) ||
 		    (vp->v_iflag & VI_DOOMED) != 0 || (vp->v_object != NULL &&
 		    vp->v_object->resident_page_count > trigger)) {
 			VI_UNLOCK(vp);
 			goto next_iter;
 		}
 		MNT_IUNLOCK(mp);
 		vholdl(vp);
 		if (VOP_LOCK(vp, LK_INTERLOCK|LK_EXCLUSIVE|LK_NOWAIT)) {
 			vdrop(vp);
 			goto next_iter_mntunlocked;
 		}
 		VI_LOCK(vp);
 		/*
 		 * v_usecount may have been bumped after VOP_LOCK() dropped
 		 * the vnode interlock and before it was locked again.
 		 *
 		 * It is not necessary to recheck VI_DOOMED because it can
 		 * only be set by another thread that holds both the vnode
 		 * lock and vnode interlock.  If another thread has the
 		 * vnode lock before we get to VOP_LOCK() and obtains the
 		 * vnode interlock after VOP_LOCK() drops the vnode
 		 * interlock, the other thread will be unable to drop the
 		 * vnode lock before our VOP_LOCK() call fails.
 		 */
 		if (vp->v_usecount ||
 		    (!vlru_allow_cache_src &&
 			!LIST_EMPTY(&(vp)->v_cache_src)) ||
 		    (vp->v_object != NULL &&
 		    vp->v_object->resident_page_count > trigger)) {
 			VOP_UNLOCK(vp, LK_INTERLOCK);
 			vdrop(vp);
 			goto next_iter_mntunlocked;
 		}
 		KASSERT((vp->v_iflag & VI_DOOMED) == 0,
 		    ("VI_DOOMED unexpectedly detected in vlrureclaim()"));
 		atomic_add_long(&recycles_count, 1);
 		vgonel(vp);
 		VOP_UNLOCK(vp, 0);
 		vdropl(vp);
 		done++;
 next_iter_mntunlocked:
 		if (!should_yield())
 			goto relock_mnt;
 		goto yield;
 next_iter:
 		if (!should_yield())
 			continue;
 		MNT_IUNLOCK(mp);
 yield:
 		kern_yield(PRI_USER);
 relock_mnt:
 		MNT_ILOCK(mp);
 	}
 	MNT_IUNLOCK(mp);
 	vn_finished_write(mp);
 	return (done);
 }
 
 /*
  * Attempt to keep the free list at wantfreevnodes length.
  */
 static void
 vnlru_free(int count)
 {
 	struct vnode *vp;
 
 	mtx_assert(&vnode_free_list_mtx, MA_OWNED);
 	for (; count > 0; count--) {
 		vp = TAILQ_FIRST(&vnode_free_list);
 		/*
 		 * The list can be modified while the free_list_mtx
 		 * has been dropped and vp could be NULL here.
 		 */
 		if (!vp)
 			break;
 		VNASSERT(vp->v_op != NULL, vp,
 		    ("vnlru_free: vnode already reclaimed."));
 		KASSERT((vp->v_iflag & VI_FREE) != 0,
 		    ("Removing vnode not on freelist"));
 		KASSERT((vp->v_iflag & VI_ACTIVE) == 0,
 		    ("Mangling active vnode"));
 		TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist);
 		/*
 		 * Don't recycle if we can't get the interlock.
 		 */
 		if (!VI_TRYLOCK(vp)) {
 			TAILQ_INSERT_TAIL(&vnode_free_list, vp, v_actfreelist);
 			continue;
 		}
 		VNASSERT((vp->v_iflag & VI_FREE) != 0 && vp->v_holdcnt == 0,
 		    vp, ("vp inconsistent on freelist"));
 
 		/*
 		 * The clear of VI_FREE prevents activation of the
 		 * vnode.  There is no sense in putting the vnode on
 		 * the mount point active list, only to remove it
 		 * later during recycling.  Inline the relevant part
 		 * of vholdl(), to avoid triggering assertions or
 		 * activating.
 		 */
 		freevnodes--;
 		vp->v_iflag &= ~VI_FREE;
 		vp->v_holdcnt++;
 
 		mtx_unlock(&vnode_free_list_mtx);
 		VI_UNLOCK(vp);
 		vtryrecycle(vp);
 		/*
 		 * If the recycle succeeded, this vdrop will actually free
 		 * the vnode.  If not, it will simply place it back on
 		 * the free list.
 		 */
 		vdrop(vp);
 		mtx_lock(&vnode_free_list_mtx);
 	}
 }
 /*
  * Attempt to recycle vnodes in a context that is always safe to block.
  * Calling vlrureclaim() from the bowels of filesystem code has some
  * interesting deadlock problems.
  */
 static struct proc *vnlruproc;
 static int vnlruproc_sig;
 
 static void
 vnlru_proc(void)
 {
 	struct mount *mp, *nmp;
 	int done;
 	struct proc *p = vnlruproc;
 
 	EVENTHANDLER_REGISTER(shutdown_pre_sync, kproc_shutdown, p,
 	    SHUTDOWN_PRI_FIRST);
 
 	for (;;) {
 		kproc_suspend_check(p);
 		mtx_lock(&vnode_free_list_mtx);
 		if (freevnodes > wantfreevnodes)
 			vnlru_free(freevnodes - wantfreevnodes);
 		if (numvnodes <= desiredvnodes * 9 / 10) {
 			vnlruproc_sig = 0;
 			wakeup(&vnlruproc_sig);
 			msleep(vnlruproc, &vnode_free_list_mtx,
 			    PVFS|PDROP, "vlruwt", hz);
 			continue;
 		}
 		mtx_unlock(&vnode_free_list_mtx);
 		done = 0;
 		mtx_lock(&mountlist_mtx);
 		for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
 			if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
 				nmp = TAILQ_NEXT(mp, mnt_list);
 				continue;
 			}
 			done += vlrureclaim(mp);
 			mtx_lock(&mountlist_mtx);
 			nmp = TAILQ_NEXT(mp, mnt_list);
 			vfs_unbusy(mp);
 		}
 		mtx_unlock(&mountlist_mtx);
 		if (done == 0) {
 #if 0
 			/* These messages are temporary debugging aids */
 			if (vnlru_nowhere < 5)
 				printf("vnlru process getting nowhere..\n");
 			else if (vnlru_nowhere == 5)
 				printf("vnlru process messages stopped.\n");
 #endif
 			vnlru_nowhere++;
 			tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3);
 		} else
 			kern_yield(PRI_USER);
 	}
 }
 
 static struct kproc_desc vnlru_kp = {
 	"vnlru",
 	vnlru_proc,
 	&vnlruproc
 };
 SYSINIT(vnlru, SI_SUB_KTHREAD_UPDATE, SI_ORDER_FIRST, kproc_start,
     &vnlru_kp);
  
 /*
  * Routines having to do with the management of the vnode table.
  */
 
 /*
  * Try to recycle a freed vnode.  We abort if anyone picks up a reference
  * before we actually vgone().  This function must be called with the vnode
  * held to prevent the vnode from being returned to the free list midway
  * through vgone().
  */
 static int
 vtryrecycle(struct vnode *vp)
 {
 	struct mount *vnmp;
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	VNASSERT(vp->v_holdcnt, vp,
 	    ("vtryrecycle: Recycling vp %p without a reference.", vp));
 	/*
 	 * This vnode may be found and locked via some other list; if so, we
 	 * can't recycle it yet.
 	 */
 	if (VOP_LOCK(vp, LK_EXCLUSIVE | LK_NOWAIT) != 0) {
 		CTR2(KTR_VFS,
 		    "%s: impossible to recycle, vp %p lock is already held",
 		    __func__, vp);
 		return (EWOULDBLOCK);
 	}
 	/*
 	 * Don't recycle if its filesystem is being suspended.
 	 */
 	if (vn_start_write(vp, &vnmp, V_NOWAIT) != 0) {
 		VOP_UNLOCK(vp, 0);
 		CTR2(KTR_VFS,
 		    "%s: impossible to recycle, cannot start the write for %p",
 		    __func__, vp);
 		return (EBUSY);
 	}
 	/*
 	 * If we got this far, we need to acquire the interlock and see if
 	 * anyone picked up this vnode from another list.  If not, we will
 	 * mark it with DOOMED via vgonel() so that anyone who does find it
 	 * will skip over it.
 	 */
 	VI_LOCK(vp);
 	if (vp->v_usecount) {
 		VOP_UNLOCK(vp, LK_INTERLOCK);
 		vn_finished_write(vnmp);
 		CTR2(KTR_VFS,
 		    "%s: impossible to recycle, %p is already referenced",
 		    __func__, vp);
 		return (EBUSY);
 	}
 	if ((vp->v_iflag & VI_DOOMED) == 0) {
 		atomic_add_long(&recycles_count, 1);
 		vgonel(vp);
 	}
 	VOP_UNLOCK(vp, LK_INTERLOCK);
 	vn_finished_write(vnmp);
 	return (0);
 }
 
 /*
  * Wait for available vnodes.
  */
 static int
 getnewvnode_wait(int suspended)
 {
 
 	mtx_assert(&vnode_free_list_mtx, MA_OWNED);
 	if (numvnodes > desiredvnodes) {
 		if (suspended) {
 			/*
 			 * File system is being suspended; we cannot risk a
 			 * deadlock here, so allocate a new vnode anyway.
 			 */
 			if (freevnodes > wantfreevnodes)
 				vnlru_free(freevnodes - wantfreevnodes);
 			return (0);
 		}
 		if (vnlruproc_sig == 0) {
 			vnlruproc_sig = 1;	/* avoid unnecessary wakeups */
 			wakeup(vnlruproc);
 		}
 		msleep(&vnlruproc_sig, &vnode_free_list_mtx, PVFS,
 		    "vlruwk", hz);
 	}
 	return (numvnodes > desiredvnodes ? ENFILE : 0);
 }
 
 void
 getnewvnode_reserve(u_int count)
 {
 	struct thread *td;
 
 	td = curthread;
 	/* First try to be quick and racy. */
 	if (atomic_fetchadd_long(&numvnodes, count) + count <= desiredvnodes) {
 		td->td_vp_reserv += count;
 		return;
 	} else
 		atomic_subtract_long(&numvnodes, count);
 
 	mtx_lock(&vnode_free_list_mtx);
 	while (count > 0) {
 		if (getnewvnode_wait(0) == 0) {
 			count--;
 			td->td_vp_reserv++;
 			atomic_add_long(&numvnodes, 1);
 		}
 	}
 	mtx_unlock(&vnode_free_list_mtx);
 }
 
 void
 getnewvnode_drop_reserve(void)
 {
 	struct thread *td;
 
 	td = curthread;
 	atomic_subtract_long(&numvnodes, td->td_vp_reserv);
 	td->td_vp_reserv = 0;
 }
 
 /*
  * Return the next vnode from the free list.
  */
 int
 getnewvnode(const char *tag, struct mount *mp, struct vop_vector *vops,
     struct vnode **vpp)
 {
 	struct vnode *vp;
 	struct bufobj *bo;
 	struct thread *td;
 	int error;
 
 	CTR3(KTR_VFS, "%s: mp %p with tag %s", __func__, mp, tag);
 	vp = NULL;
 	td = curthread;
 	if (td->td_vp_reserv > 0) {
 		td->td_vp_reserv -= 1;
 		goto alloc;
 	}
 	mtx_lock(&vnode_free_list_mtx);
 	/*
 	 * Lend our context to reclaim vnodes if they've exceeded the max.
 	 */
 	if (freevnodes > wantfreevnodes)
 		vnlru_free(1);
 	error = getnewvnode_wait(mp != NULL && (mp->mnt_kern_flag &
 	    MNTK_SUSPEND));
 #if 0	/* XXX Not all VFS_VGET/ffs_vget callers check returns. */
 	if (error != 0) {
 		mtx_unlock(&vnode_free_list_mtx);
 		return (error);
 	}
 #endif
 	atomic_add_long(&numvnodes, 1);
 	mtx_unlock(&vnode_free_list_mtx);
 alloc:
 	atomic_add_long(&vnodes_created, 1);
 	vp = (struct vnode *) uma_zalloc(vnode_zone, M_WAITOK|M_ZERO);
 	/*
 	 * Set up locks.
 	 */
 	vp->v_vnlock = &vp->v_lock;
 	mtx_init(&vp->v_interlock, "vnode interlock", NULL, MTX_DEF);
 	/*
 	 * By default, don't allow shared locks unless filesystems
 	 * opt-in.
 	 */
 	lockinit(vp->v_vnlock, PVFS, tag, VLKTIMEOUT, LK_NOSHARE | LK_IS_VNODE);
 	/*
 	 * Initialize bufobj.
 	 */
 	bo = &vp->v_bufobj;
 	bo->__bo_vnode = vp;
 	rw_init(BO_LOCKPTR(bo), "bufobj interlock");
 	bo->bo_ops = &buf_ops_bio;
 	bo->bo_private = vp;
 	TAILQ_INIT(&bo->bo_clean.bv_hd);
 	TAILQ_INIT(&bo->bo_dirty.bv_hd);
 	/*
 	 * Initialize namecache.
 	 */
 	LIST_INIT(&vp->v_cache_src);
 	TAILQ_INIT(&vp->v_cache_dst);
 	/*
 	 * Finalize various vnode identity bits.
 	 */
 	vp->v_type = VNON;
 	vp->v_tag = tag;
 	vp->v_op = vops;
 	v_incr_usecount(vp);
 	vp->v_data = NULL;
 #ifdef MAC
 	mac_vnode_init(vp);
 	if (mp != NULL && (mp->mnt_flag & MNT_MULTILABEL) == 0)
 		mac_vnode_associate_singlelabel(mp, vp);
 	else if (mp == NULL && vops != &dead_vnodeops)
 		printf("NULL mp in getnewvnode()\n");
 #endif
 	if (mp != NULL) {
 		bo->bo_bsize = mp->mnt_stat.f_iosize;
 		if ((mp->mnt_kern_flag & MNTK_NOKNOTE) != 0)
 			vp->v_vflag |= VV_NOKNOTE;
 	}
 	rangelock_init(&vp->v_rl);
 
 	/*
 	 * For the filesystems which do not use vfs_hash_insert(),
 	 * still initialize v_hash so that vfs_hash_index() is useful.
 	 * E.g., nullfs uses vfs_hash_index() on the lower vnode for
 	 * its own hashing.
 	 */
 	vp->v_hash = (uintptr_t)vp >> vnsz2log;
 
 	*vpp = vp;
 	return (0);
 }
 
 /*
  * Delete from old mount point vnode list, if on one.
  */
 static void
 delmntque(struct vnode *vp)
 {
 	struct mount *mp;
 	int active;
 
 	mp = vp->v_mount;
 	if (mp == NULL)
 		return;
 	MNT_ILOCK(mp);
 	VI_LOCK(vp);
 	KASSERT(mp->mnt_activevnodelistsize <= mp->mnt_nvnodelistsize,
 	    ("Active vnode list size %d > Vnode list size %d",
 	     mp->mnt_activevnodelistsize, mp->mnt_nvnodelistsize));
 	active = vp->v_iflag & VI_ACTIVE;
 	vp->v_iflag &= ~VI_ACTIVE;
 	if (active) {
 		mtx_lock(&vnode_free_list_mtx);
 		TAILQ_REMOVE(&mp->mnt_activevnodelist, vp, v_actfreelist);
 		mp->mnt_activevnodelistsize--;
 		mtx_unlock(&vnode_free_list_mtx);
 	}
 	vp->v_mount = NULL;
 	VI_UNLOCK(vp);
 	VNASSERT(mp->mnt_nvnodelistsize > 0, vp,
 		("bad mount point vnode list size"));
 	TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
 	mp->mnt_nvnodelistsize--;
 	MNT_REL(mp);
 	MNT_IUNLOCK(mp);
 }
 
 static void
 insmntque_stddtr(struct vnode *vp, void *dtr_arg)
 {
 
 	vp->v_data = NULL;
 	vp->v_op = &dead_vnodeops;
 	vgone(vp);
 	vput(vp);
 }
 
 /*
  * Insert into list of vnodes for the new mount point, if available.
  */
 int
 insmntque1(struct vnode *vp, struct mount *mp,
 	void (*dtr)(struct vnode *, void *), void *dtr_arg)
 {
 
 	KASSERT(vp->v_mount == NULL,
 		("insmntque: vnode already on per mount vnode list"));
 	VNASSERT(mp != NULL, vp, ("Don't call insmntque(foo, NULL)"));
 	ASSERT_VOP_ELOCKED(vp, "insmntque: non-locked vp");
 
 	/*
 	 * We acquire the vnode interlock early to ensure that the
 	 * vnode cannot be recycled by another process releasing a
 	 * holdcnt on it before we get it on both the vnode list
 	 * and the active vnode list. The mount mutex protects only
 	 * manipulation of the vnode list and the vnode freelist
 	 * mutex protects only manipulation of the active vnode list.
 	 * Hence the need to hold the vnode interlock throughout.
 	 */
 	MNT_ILOCK(mp);
 	VI_LOCK(vp);
 	if (((mp->mnt_kern_flag & MNTK_NOINSMNTQ) != 0 &&
 	    ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0 ||
 	    mp->mnt_nvnodelistsize == 0)) &&
 	    (vp->v_vflag & VV_FORCEINSMQ) == 0) {
 		VI_UNLOCK(vp);
 		MNT_IUNLOCK(mp);
 		if (dtr != NULL)
 			dtr(vp, dtr_arg);
 		return (EBUSY);
 	}
 	vp->v_mount = mp;
 	MNT_REF(mp);
 	TAILQ_INSERT_TAIL(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
 	VNASSERT(mp->mnt_nvnodelistsize >= 0, vp,
 		("neg mount point vnode list size"));
 	mp->mnt_nvnodelistsize++;
 	KASSERT((vp->v_iflag & VI_ACTIVE) == 0,
 	    ("Activating already active vnode"));
 	vp->v_iflag |= VI_ACTIVE;
 	mtx_lock(&vnode_free_list_mtx);
 	TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist);
 	mp->mnt_activevnodelistsize++;
 	mtx_unlock(&vnode_free_list_mtx);
 	VI_UNLOCK(vp);
 	MNT_IUNLOCK(mp);
 	return (0);
 }
 
 int
 insmntque(struct vnode *vp, struct mount *mp)
 {
 
 	return (insmntque1(vp, mp, insmntque_stddtr, NULL));
 }
 
 /*
  * Flush out and invalidate all buffers associated with a bufobj.
  * Called with the underlying object locked.
  */
 int
 bufobj_invalbuf(struct bufobj *bo, int flags, int slpflag, int slptimeo)
 {
 	int error;
 
 	BO_LOCK(bo);
 	if (flags & V_SAVE) {
 		error = bufobj_wwait(bo, slpflag, slptimeo);
 		if (error) {
 			BO_UNLOCK(bo);
 			return (error);
 		}
 		if (bo->bo_dirty.bv_cnt > 0) {
 			BO_UNLOCK(bo);
 			if ((error = BO_SYNC(bo, MNT_WAIT)) != 0)
 				return (error);
 			/*
 			 * XXX We could save a lock/unlock if this was only
 			 * enabled under INVARIANTS
 			 */
 			BO_LOCK(bo);
 			if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0)
 				panic("vinvalbuf: dirty bufs");
 		}
 	}
 	/*
 	 * If you alter this loop please notice that interlock is dropped and
 	 * reacquired in flushbuflist.  Special care is needed to ensure that
 	 * no race conditions occur from this.
 	 */
 	do {
 		error = flushbuflist(&bo->bo_clean,
 		    flags, bo, slpflag, slptimeo);
 		if (error == 0 && !(flags & V_CLEANONLY))
 			error = flushbuflist(&bo->bo_dirty,
 			    flags, bo, slpflag, slptimeo);
 		if (error != 0 && error != EAGAIN) {
 			BO_UNLOCK(bo);
 			return (error);
 		}
 	} while (error != 0);
 
 	/*
 	 * Wait for I/O to complete.  XXX needs cleaning up.  The vnode can
 	 * have write I/O in-progress but if there is a VM object then the
 	 * VM object can also have read-I/O in-progress.
 	 */
 	do {
 		bufobj_wwait(bo, 0, 0);
 		BO_UNLOCK(bo);
 		if (bo->bo_object != NULL) {
 			VM_OBJECT_WLOCK(bo->bo_object);
 			vm_object_pip_wait(bo->bo_object, "bovlbx");
 			VM_OBJECT_WUNLOCK(bo->bo_object);
 		}
 		BO_LOCK(bo);
 	} while (bo->bo_numoutput > 0);
 	BO_UNLOCK(bo);
 
 	/*
 	 * Destroy the copy in the VM cache, too.
 	 */
 	if (bo->bo_object != NULL &&
 	    (flags & (V_ALT | V_NORMAL | V_CLEANONLY)) == 0) {
 		VM_OBJECT_WLOCK(bo->bo_object);
 		vm_object_page_remove(bo->bo_object, 0, 0, (flags & V_SAVE) ?
 		    OBJPR_CLEANONLY : 0);
 		VM_OBJECT_WUNLOCK(bo->bo_object);
 	}
 
 #ifdef INVARIANTS
 	BO_LOCK(bo);
 	if ((flags & (V_ALT | V_NORMAL | V_CLEANONLY)) == 0 &&
 	    (bo->bo_dirty.bv_cnt > 0 || bo->bo_clean.bv_cnt > 0))
 		panic("vinvalbuf: flush failed");
 	BO_UNLOCK(bo);
 #endif
 	return (0);
 }
 
 /*
  * Flush out and invalidate all buffers associated with a vnode.
  * Called with the underlying object locked.
  */
 int
 vinvalbuf(struct vnode *vp, int flags, int slpflag, int slptimeo)
 {
 
 	CTR3(KTR_VFS, "%s: vp %p with flags %d", __func__, vp, flags);
 	ASSERT_VOP_LOCKED(vp, "vinvalbuf");
 	if (vp->v_object != NULL && vp->v_object->handle != vp)
 		return (0);
 	return (bufobj_invalbuf(&vp->v_bufobj, flags, slpflag, slptimeo));
 }
 
 /*
  * Flush out buffers on the specified list.
  *
  */
 static int
 flushbuflist(struct bufv *bufv, int flags, struct bufobj *bo, int slpflag,
     int slptimeo)
 {
 	struct buf *bp, *nbp;
 	int retval, error;
 	daddr_t lblkno;
 	b_xflags_t xflags;
 
 	ASSERT_BO_WLOCKED(bo);
 
 	retval = 0;
 	TAILQ_FOREACH_SAFE(bp, &bufv->bv_hd, b_bobufs, nbp) {
 		if (((flags & V_NORMAL) && (bp->b_xflags & BX_ALTDATA)) ||
 		    ((flags & V_ALT) && (bp->b_xflags & BX_ALTDATA) == 0)) {
 			continue;
 		}
 		lblkno = 0;
 		xflags = 0;
 		if (nbp != NULL) {
 			lblkno = nbp->b_lblkno;
 			xflags = nbp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN);
 		}
 		retval = EAGAIN;
 		error = BUF_TIMELOCK(bp,
 		    LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK, BO_LOCKPTR(bo),
 		    "flushbuf", slpflag, slptimeo);
 		if (error) {
 			BO_LOCK(bo);
 			return (error != ENOLCK ? error : EAGAIN);
 		}
 		KASSERT(bp->b_bufobj == bo,
 		    ("bp %p wrong b_bufobj %p should be %p",
 		    bp, bp->b_bufobj, bo));
 		if (bp->b_bufobj != bo) {	/* XXX: necessary ? */
 			BUF_UNLOCK(bp);
 			BO_LOCK(bo);
 			return (EAGAIN);
 		}
 		/*
 		 * XXX Since there are no node locks for NFS, I
 		 * believe there is a slight chance that a delayed
 		 * write will occur while sleeping just above, so
 		 * check for it.
 		 */
 		if (((bp->b_flags & (B_DELWRI | B_INVAL)) == B_DELWRI) &&
 		    (flags & V_SAVE)) {
 			bremfree(bp);
 			bp->b_flags |= B_ASYNC;
 			bwrite(bp);
 			BO_LOCK(bo);
 			return (EAGAIN);	/* XXX: why not loop ? */
 		}
 		bremfree(bp);
 		bp->b_flags |= (B_INVAL | B_RELBUF);
 		bp->b_flags &= ~B_ASYNC;
 		brelse(bp);
 		BO_LOCK(bo);
 		if (nbp != NULL &&
 		    (nbp->b_bufobj != bo ||
 		     nbp->b_lblkno != lblkno ||
 		     (nbp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN)) != xflags))
 			break;			/* nbp invalid */
 	}
 	return (retval);
 }
 
 /*
  * Truncate a file's buffer and pages to a specified length.  This
  * is in lieu of the old vinvalbuf mechanism, which performed unneeded
  * sync activity.
  */
 int
 vtruncbuf(struct vnode *vp, struct ucred *cred, off_t length, int blksize)
 {
 	struct buf *bp, *nbp;
 	int anyfreed;
 	int trunclbn;
 	struct bufobj *bo;
 
 	CTR5(KTR_VFS, "%s: vp %p with cred %p and block %d:%ju", __func__,
 	    vp, cred, blksize, (uintmax_t)length);
 
 	/*
 	 * Round up to the *next* lbn.
 	 */
 	trunclbn = (length + blksize - 1) / blksize;
 
 	ASSERT_VOP_LOCKED(vp, "vtruncbuf");
 restart:
 	bo = &vp->v_bufobj;
 	BO_LOCK(bo);
 	anyfreed = 1;
 	for (;anyfreed;) {
 		anyfreed = 0;
 		TAILQ_FOREACH_SAFE(bp, &bo->bo_clean.bv_hd, b_bobufs, nbp) {
 			if (bp->b_lblkno < trunclbn)
 				continue;
 			if (BUF_LOCK(bp,
 			    LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK,
 			    BO_LOCKPTR(bo)) == ENOLCK)
 				goto restart;
 
 			bremfree(bp);
 			bp->b_flags |= (B_INVAL | B_RELBUF);
 			bp->b_flags &= ~B_ASYNC;
 			brelse(bp);
 			anyfreed = 1;
 
 			BO_LOCK(bo);
 			if (nbp != NULL &&
 			    (((nbp->b_xflags & BX_VNCLEAN) == 0) ||
 			    (nbp->b_vp != vp) ||
 			    (nbp->b_flags & B_DELWRI))) {
 				BO_UNLOCK(bo);
 				goto restart;
 			}
 		}
 
 		TAILQ_FOREACH_SAFE(bp, &bo->bo_dirty.bv_hd, b_bobufs, nbp) {
 			if (bp->b_lblkno < trunclbn)
 				continue;
 			if (BUF_LOCK(bp,
 			    LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK,
 			    BO_LOCKPTR(bo)) == ENOLCK)
 				goto restart;
 			bremfree(bp);
 			bp->b_flags |= (B_INVAL | B_RELBUF);
 			bp->b_flags &= ~B_ASYNC;
 			brelse(bp);
 			anyfreed = 1;
 
 			BO_LOCK(bo);
 			if (nbp != NULL &&
 			    (((nbp->b_xflags & BX_VNDIRTY) == 0) ||
 			    (nbp->b_vp != vp) ||
 			    (nbp->b_flags & B_DELWRI) == 0)) {
 				BO_UNLOCK(bo);
 				goto restart;
 			}
 		}
 	}
 
 	if (length > 0) {
 restartsync:
 		TAILQ_FOREACH_SAFE(bp, &bo->bo_dirty.bv_hd, b_bobufs, nbp) {
 			if (bp->b_lblkno > 0)
 				continue;
 			/*
 			 * Since we hold the vnode lock this should only
 			 * fail if we're racing with the buf daemon.
 			 */
 			if (BUF_LOCK(bp,
 			    LK_EXCLUSIVE | LK_SLEEPFAIL | LK_INTERLOCK,
 			    BO_LOCKPTR(bo)) == ENOLCK) {
 				goto restart;
 			}
 			VNASSERT((bp->b_flags & B_DELWRI), vp,
 			    ("buf(%p) on dirty queue without DELWRI", bp));
 
 			bremfree(bp);
 			bawrite(bp);
 			BO_LOCK(bo);
 			goto restartsync;
 		}
 	}
 
 	bufobj_wwait(bo, 0, 0);
 	BO_UNLOCK(bo);
 	vnode_pager_setsize(vp, length);
 
 	return (0);
 }
 
 static void
 buf_vlist_remove(struct buf *bp)
 {
 	struct bufv *bv;
 
 	KASSERT(bp->b_bufobj != NULL, ("No b_bufobj %p", bp));
 	ASSERT_BO_WLOCKED(bp->b_bufobj);
 	KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) !=
 	    (BX_VNDIRTY|BX_VNCLEAN),
 	    ("buf_vlist_remove: Buf %p is on two lists", bp));
 	if (bp->b_xflags & BX_VNDIRTY)
 		bv = &bp->b_bufobj->bo_dirty;
 	else
 		bv = &bp->b_bufobj->bo_clean;
 	BUF_PCTRIE_REMOVE(&bv->bv_root, bp->b_lblkno);
 	TAILQ_REMOVE(&bv->bv_hd, bp, b_bobufs);
 	bv->bv_cnt--;
 	bp->b_xflags &= ~(BX_VNDIRTY | BX_VNCLEAN);
 }
 
 /*
  * Add the buffer to the sorted clean or dirty block list.
  *
  * NOTE: xflags is passed as a constant, optimizing this inline function!
  */
 static void
 buf_vlist_add(struct buf *bp, struct bufobj *bo, b_xflags_t xflags)
 {
 	struct bufv *bv;
 	struct buf *n;
 	int error;
 
 	ASSERT_BO_WLOCKED(bo);
 	KASSERT((bo->bo_flag & BO_DEAD) == 0, ("dead bo %p", bo));
 	KASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) == 0,
 	    ("buf_vlist_add: Buf %p has existing xflags %d", bp, bp->b_xflags));
 	bp->b_xflags |= xflags;
 	if (xflags & BX_VNDIRTY)
 		bv = &bo->bo_dirty;
 	else
 		bv = &bo->bo_clean;
 
 	/*
 	 * Keep the list ordered.  Optimize empty list insertion.  Assume
 	 * we tend to grow at the tail so lookup_le should usually be cheaper
 	 * than _ge.
 	 */
 	if (bv->bv_cnt == 0 ||
 	    bp->b_lblkno > TAILQ_LAST(&bv->bv_hd, buflists)->b_lblkno)
 		TAILQ_INSERT_TAIL(&bv->bv_hd, bp, b_bobufs);
 	else if ((n = BUF_PCTRIE_LOOKUP_LE(&bv->bv_root, bp->b_lblkno)) == NULL)
 		TAILQ_INSERT_HEAD(&bv->bv_hd, bp, b_bobufs);
 	else
 		TAILQ_INSERT_AFTER(&bv->bv_hd, n, bp, b_bobufs);
 	error = BUF_PCTRIE_INSERT(&bv->bv_root, bp);
 	if (error)
 		panic("buf_vlist_add:  Preallocated nodes insufficient.");
 	bv->bv_cnt++;
 }
 
 /*
  * Look up a buffer using the per-bufobj tries.  Note that we specifically
  * avoid shadow buffers used in background bitmap writes.
  *
  * This code isn't quite as efficient as it could be because we are
  * maintaining two sorted lists and do not know which list the block
  * resides in.
  *
  * The clean trie is checked first; the dirty trie is consulted only
  * when the block is not found there.
  */
 struct buf *
 gbincore(struct bufobj *bo, daddr_t lblkno)
 {
 	struct buf *bp;
 
 	ASSERT_BO_LOCKED(bo);
 	bp = BUF_PCTRIE_LOOKUP(&bo->bo_clean.bv_root, lblkno);
 	if (bp != NULL)
 		return (bp);
 	return (BUF_PCTRIE_LOOKUP(&bo->bo_dirty.bv_root, lblkno));
 }
 
 /*
  * Associate a buffer with a vnode.
  */
 void
 bgetvp(struct vnode *vp, struct buf *bp)
 {
 	struct bufobj *bo;
 
 	bo = &vp->v_bufobj;
 	ASSERT_BO_WLOCKED(bo);
 	VNASSERT(bp->b_vp == NULL, bp->b_vp, ("bgetvp: not free"));
 
 	CTR3(KTR_BUF, "bgetvp(%p) vp %p flags %X", bp, vp, bp->b_flags);
 	VNASSERT((bp->b_xflags & (BX_VNDIRTY|BX_VNCLEAN)) == 0, vp,
 	    ("bgetvp: bp already attached! %p", bp));
 
 	vhold(vp);
 	bp->b_vp = vp;
 	bp->b_bufobj = bo;
 	/*
 	 * Insert onto list for new vnode.
 	 */
 	buf_vlist_add(bp, bo, BX_VNCLEAN);
 }
 
 /*
  * Disassociate a buffer from a vnode.
  */
 void
 brelvp(struct buf *bp)
 {
 	struct bufobj *bo;
 	struct vnode *vp;
 
 	CTR3(KTR_BUF, "brelvp(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags);
 	KASSERT(bp->b_vp != NULL, ("brelvp: NULL"));
 
 	/*
 	 * Delete from old vnode list, if on one.
 	 */
 	vp = bp->b_vp;		/* XXX */
 	bo = bp->b_bufobj;
 	BO_LOCK(bo);
 	if (bp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN))
 		buf_vlist_remove(bp);
 	else
 		panic("brelvp: Buffer %p not on queue.", bp);
 	if ((bo->bo_flag & BO_ONWORKLST) && bo->bo_dirty.bv_cnt == 0) {
 		bo->bo_flag &= ~BO_ONWORKLST;
 		mtx_lock(&sync_mtx);
 		LIST_REMOVE(bo, bo_synclist);
 		syncer_worklist_len--;
 		mtx_unlock(&sync_mtx);
 	}
 	bp->b_vp = NULL;
 	bp->b_bufobj = NULL;
 	BO_UNLOCK(bo);
 	vdrop(vp);
 }
 
 /*
  * Add an item to the syncer work queue.
  */
 static void
 vn_syncer_add_to_worklist(struct bufobj *bo, int delay)
 {
 	int slot;
 
 	ASSERT_BO_WLOCKED(bo);
 
 	mtx_lock(&sync_mtx);
 	if (bo->bo_flag & BO_ONWORKLST)
 		LIST_REMOVE(bo, bo_synclist);
 	else {
 		bo->bo_flag |= BO_ONWORKLST;
 		syncer_worklist_len++;
 	}
 
 	if (delay > syncer_maxdelay - 2)
 		delay = syncer_maxdelay - 2;
 	slot = (syncer_delayno + delay) & syncer_mask;
 
 	LIST_INSERT_HEAD(&syncer_workitem_pending[slot], bo, bo_synclist);
 	mtx_unlock(&sync_mtx);
 }
 
 static int
 sysctl_vfs_worklist_len(SYSCTL_HANDLER_ARGS)
 {
 	int error, len;
 
 	mtx_lock(&sync_mtx);
 	len = syncer_worklist_len - sync_vnode_count;
 	mtx_unlock(&sync_mtx);
 	error = SYSCTL_OUT(req, &len, sizeof(len));
 	return (error);
 }
 
 SYSCTL_PROC(_vfs, OID_AUTO, worklist_len, CTLTYPE_INT | CTLFLAG_RD, NULL, 0,
     sysctl_vfs_worklist_len, "I", "Syncer thread worklist length");
 
 static struct proc *updateproc;
 static void sched_sync(void);
 static struct kproc_desc up_kp = {
 	"syncer",
 	sched_sync,
 	&updateproc
 };
 SYSINIT(syncer, SI_SUB_KTHREAD_UPDATE, SI_ORDER_FIRST, kproc_start, &up_kp);
 
 static int
 sync_vnode(struct synclist *slp, struct bufobj **bo, struct thread *td)
 {
 	struct vnode *vp;
 	struct mount *mp;
 
 	*bo = LIST_FIRST(slp);
 	if (*bo == NULL)
 		return (0);
 	vp = (*bo)->__bo_vnode;	/* XXX */
 	if (VOP_ISLOCKED(vp) != 0 || VI_TRYLOCK(vp) == 0)
 		return (1);
 	/*
 	 * We use vhold in case the vnode does not
 	 * successfully sync.  vhold prevents the vnode from
 	 * going away when we unlock the sync_mtx so that
 	 * we can acquire the vnode interlock.
 	 */
 	vholdl(vp);
 	mtx_unlock(&sync_mtx);
 	VI_UNLOCK(vp);
 	if (vn_start_write(vp, &mp, V_NOWAIT) != 0) {
 		vdrop(vp);
 		mtx_lock(&sync_mtx);
 		return (*bo == LIST_FIRST(slp));
 	}
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	(void) VOP_FSYNC(vp, MNT_LAZY, td);
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	BO_LOCK(*bo);
 	if (((*bo)->bo_flag & BO_ONWORKLST) != 0) {
 		/*
 		 * Put us back on the worklist.  The worklist
 		 * routine will remove us from our current
 		 * position and then add us back in at a later
 		 * position.
 		 */
 		vn_syncer_add_to_worklist(*bo, syncdelay);
 	}
 	BO_UNLOCK(*bo);
 	vdrop(vp);
 	mtx_lock(&sync_mtx);
 	return (0);
 }
 
 static int first_printf = 1;
 
 /*
  * System filesystem synchronizer daemon.
  */
 static void
 sched_sync(void)
 {
 	struct synclist *next, *slp;
 	struct bufobj *bo;
 	long starttime;
 	struct thread *td = curthread;
 	int last_work_seen;
 	int net_worklist_len;
 	int syncer_final_iter;
 	int error;
 
 	last_work_seen = 0;
 	syncer_final_iter = 0;
 	syncer_state = SYNCER_RUNNING;
 	starttime = time_uptime;
 	td->td_pflags |= TDP_NORUNNINGBUF;
 
 	EVENTHANDLER_REGISTER(shutdown_pre_sync, syncer_shutdown, td->td_proc,
 	    SHUTDOWN_PRI_LAST);
 
 	mtx_lock(&sync_mtx);
 	for (;;) {
 		if (syncer_state == SYNCER_FINAL_DELAY &&
 		    syncer_final_iter == 0) {
 			mtx_unlock(&sync_mtx);
 			kproc_suspend_check(td->td_proc);
 			mtx_lock(&sync_mtx);
 		}
 		net_worklist_len = syncer_worklist_len - sync_vnode_count;
 		if (syncer_state != SYNCER_RUNNING &&
 		    starttime != time_uptime) {
 			if (first_printf) {
 				printf("\nSyncing disks, vnodes remaining...");
 				first_printf = 0;
 			}
 			printf("%d ", net_worklist_len);
 		}
 		starttime = time_uptime;
 
 		/*
 		 * Push files whose dirty time has expired.  Be careful
 		 * of interrupt race on slp queue.
 		 *
 		 * Skip over empty worklist slots when shutting down.
 		 */
 		do {
 			slp = &syncer_workitem_pending[syncer_delayno];
 			syncer_delayno += 1;
 			if (syncer_delayno == syncer_maxdelay)
 				syncer_delayno = 0;
 			next = &syncer_workitem_pending[syncer_delayno];
 			/*
 			 * If the worklist has wrapped since
 			 * it was emptied of all but syncer vnodes,
 			 * switch to the FINAL_DELAY state and run
 			 * for one more second.
 			 */
 			if (syncer_state == SYNCER_SHUTTING_DOWN &&
 			    net_worklist_len == 0 &&
 			    last_work_seen == syncer_delayno) {
 				syncer_state = SYNCER_FINAL_DELAY;
 				syncer_final_iter = SYNCER_SHUTDOWN_SPEEDUP;
 			}
 		} while (syncer_state != SYNCER_RUNNING && LIST_EMPTY(slp) &&
 		    syncer_worklist_len > 0);
 
 		/*
 		 * Keep track of the last time there was anything
 		 * on the worklist other than syncer vnodes.
 		 * Return to the SHUTTING_DOWN state if any
 		 * new work appears.
 		 */
 		if (net_worklist_len > 0 || syncer_state == SYNCER_RUNNING)
 			last_work_seen = syncer_delayno;
 		if (net_worklist_len > 0 && syncer_state == SYNCER_FINAL_DELAY)
 			syncer_state = SYNCER_SHUTTING_DOWN;
 		while (!LIST_EMPTY(slp)) {
 			error = sync_vnode(slp, &bo, td);
 			if (error == 1) {
 				LIST_REMOVE(bo, bo_synclist);
 				LIST_INSERT_HEAD(next, bo, bo_synclist);
 				continue;
 			}
 
 			if (first_printf == 0) {
 				/*
 				 * Drop the sync mutex, because some watchdog
 				 * drivers need to sleep while patting.
 				 */
 				mtx_unlock(&sync_mtx);
 				wdog_kern_pat(WD_LASTVAL);
 				mtx_lock(&sync_mtx);
 			}
 
 		}
 		if (syncer_state == SYNCER_FINAL_DELAY && syncer_final_iter > 0)
 			syncer_final_iter--;
 		/*
 		 * The variable rushjob allows the kernel to speed up the
 		 * processing of the filesystem syncer process. A rushjob
 		 * value of N tells the filesystem syncer to process the next
 		 * N seconds worth of work on its queue ASAP. Currently rushjob
 		 * is used by the soft update code to speed up the filesystem
 		 * syncer process when the incore state is getting so far
 		 * ahead of the disk that the kernel memory pool is being
 		 * threatened with exhaustion.
 		 */
 		if (rushjob > 0) {
 			rushjob -= 1;
 			continue;
 		}
 		/*
 		 * Just sleep for a short period of time between
 		 * iterations when shutting down to allow some I/O
 		 * to happen.
 		 *
 		 * If it has taken us less than a second to process the
 		 * current work, then wait. Otherwise start right over
 		 * again. We can still lose time if any single round
 		 * takes more than two seconds, but it does not really
 		 * matter as we are just trying to generally pace the
 		 * filesystem activity.
 		 */
 		if (syncer_state != SYNCER_RUNNING ||
 		    time_uptime == starttime) {
 			thread_lock(td);
 			sched_prio(td, PPAUSE);
 			thread_unlock(td);
 		}
 		if (syncer_state != SYNCER_RUNNING)
 			cv_timedwait(&sync_wakeup, &sync_mtx,
 			    hz / SYNCER_SHUTDOWN_SPEEDUP);
 		else if (time_uptime == starttime)
 			cv_timedwait(&sync_wakeup, &sync_mtx, hz);
 	}
 }
 
 /*
  * Request the syncer daemon to speed up its work.
  * We never push it to speed up more than half of its
  * normal turn time, otherwise it could take over the cpu.
  */
 int
 speedup_syncer(void)
 {
 	int ret = 0;
 
 	mtx_lock(&sync_mtx);
 	if (rushjob < syncdelay / 2) {
 		rushjob += 1;
 		stat_rush_requests += 1;
 		ret = 1;
 	}
 	mtx_unlock(&sync_mtx);
 	cv_broadcast(&sync_wakeup);
 	return (ret);
 }
 
 /*
  * Tell the syncer to speed up its work and run through its work
  * list several times, then tell it to shut down.
  */
 static void
 syncer_shutdown(void *arg, int howto)
 {
 
 	if (howto & RB_NOSYNC)
 		return;
 	mtx_lock(&sync_mtx);
 	syncer_state = SYNCER_SHUTTING_DOWN;
 	rushjob = 0;
 	mtx_unlock(&sync_mtx);
 	cv_broadcast(&sync_wakeup);
 	kproc_shutdown(arg, howto);
 }
 
 void
 syncer_suspend(void)
 {
 
 	syncer_shutdown(updateproc, 0);
 }
 
 void
 syncer_resume(void)
 {
 
 	mtx_lock(&sync_mtx);
 	first_printf = 1;
 	syncer_state = SYNCER_RUNNING;
 	mtx_unlock(&sync_mtx);
 	cv_broadcast(&sync_wakeup);
 	kproc_resume(updateproc);
 }
 
 /*
  * Reassign a buffer from one vnode to another.
  * Used to assign file specific control information
  * (indirect blocks) to the vnode to which they belong.
  */
 void
 reassignbuf(struct buf *bp)
 {
 	struct vnode *vp;
 	struct bufobj *bo;
 	int delay;
 #ifdef INVARIANTS
 	struct bufv *bv;
 #endif
 
 	vp = bp->b_vp;
 	bo = bp->b_bufobj;
 	++reassignbufcalls;
 
 	CTR3(KTR_BUF, "reassignbuf(%p) vp %p flags %X",
 	    bp, bp->b_vp, bp->b_flags);
 	/*
 	 * B_PAGING flagged buffers cannot be reassigned because their vp
 	 * is not fully linked in.
 	 */
 	if (bp->b_flags & B_PAGING)
 		panic("cannot reassign paging buffer");
 
 	/*
 	 * Delete from old vnode list, if on one.
 	 */
 	BO_LOCK(bo);
 	if (bp->b_xflags & (BX_VNDIRTY | BX_VNCLEAN))
 		buf_vlist_remove(bp);
 	else
 		panic("reassignbuf: Buffer %p not on queue.", bp);
 	/*
 	 * If dirty, put on list of dirty buffers; otherwise insert onto list
 	 * of clean buffers.
 	 */
 	if (bp->b_flags & B_DELWRI) {
 		if ((bo->bo_flag & BO_ONWORKLST) == 0) {
 			switch (vp->v_type) {
 			case VDIR:
 				delay = dirdelay;
 				break;
 			case VCHR:
 				delay = metadelay;
 				break;
 			default:
 				delay = filedelay;
 			}
 			vn_syncer_add_to_worklist(bo, delay);
 		}
 		buf_vlist_add(bp, bo, BX_VNDIRTY);
 	} else {
 		buf_vlist_add(bp, bo, BX_VNCLEAN);
 
 		if ((bo->bo_flag & BO_ONWORKLST) && bo->bo_dirty.bv_cnt == 0) {
 			mtx_lock(&sync_mtx);
 			LIST_REMOVE(bo, bo_synclist);
 			syncer_worklist_len--;
 			mtx_unlock(&sync_mtx);
 			bo->bo_flag &= ~BO_ONWORKLST;
 		}
 	}
 #ifdef INVARIANTS
 	bv = &bo->bo_clean;
 	bp = TAILQ_FIRST(&bv->bv_hd);
 	KASSERT(bp == NULL || bp->b_bufobj == bo,
 	    ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo));
 	bp = TAILQ_LAST(&bv->bv_hd, buflists);
 	KASSERT(bp == NULL || bp->b_bufobj == bo,
 	    ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo));
 	bv = &bo->bo_dirty;
 	bp = TAILQ_FIRST(&bv->bv_hd);
 	KASSERT(bp == NULL || bp->b_bufobj == bo,
 	    ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo));
 	bp = TAILQ_LAST(&bv->bv_hd, buflists);
 	KASSERT(bp == NULL || bp->b_bufobj == bo,
 	    ("bp %p wrong b_bufobj %p should be %p", bp, bp->b_bufobj, bo));
 #endif
 	BO_UNLOCK(bo);
 }
 
 /*
  * Increment the use and hold counts on the vnode, taking care to reference
  * the driver's usecount if this is a chardev.  The vholdl() will remove
  * the vnode from the free list if it is presently free.  Requires the
  * vnode interlock and returns with it held.
  */
 static void
 v_incr_usecount(struct vnode *vp)
 {
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	vholdl(vp);
 	vp->v_usecount++;
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
 		vp->v_rdev->si_usecount++;
 		dev_unlock();
 	}
 }
 
 /*
  * Turn a holdcnt into a use+holdcnt such that only one call to
  * v_decr_usecount is needed.
  */
 static void
 v_upgrade_usecount(struct vnode *vp)
 {
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	vp->v_usecount++;
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
 		vp->v_rdev->si_usecount++;
 		dev_unlock();
 	}
 }
 
 /*
  * Decrement the vnode use and hold count along with the driver's usecount
  * if this is a chardev.  The vdropl() below releases the vnode interlock
  * as it may free the vnode.
  */
 static void
 v_decr_usecount(struct vnode *vp)
 {
 
 	ASSERT_VI_LOCKED(vp, __FUNCTION__);
 	VNASSERT(vp->v_usecount > 0, vp,
 	    ("v_decr_usecount: negative usecount"));
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	vp->v_usecount--;
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
 		vp->v_rdev->si_usecount--;
 		dev_unlock();
 	}
 	vdropl(vp);
 }
 
 /*
  * Decrement only the use count and driver use count.  This is intended to
  * be paired with a follow on vdropl() to release the remaining hold count.
  * In this way we may vgone() a vnode with a 0 usecount without risk of
  * having it end up on a free list because the hold count is kept above 0.
  */
 static void
 v_decr_useonly(struct vnode *vp)
 {
 
 	ASSERT_VI_LOCKED(vp, __FUNCTION__);
 	VNASSERT(vp->v_usecount > 0, vp,
 	    ("v_decr_useonly: negative usecount"));
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	vp->v_usecount--;
 	if (vp->v_type == VCHR && vp->v_rdev != NULL) {
 		dev_lock();
 		vp->v_rdev->si_usecount--;
 		dev_unlock();
 	}
 }
 
 /*
  * Grab a particular vnode from the free list, increment its
  * reference count and lock it.  VI_DOOMED is set if the vnode
  * is being destroyed.  Only callers who specify LK_RETRY will
  * see doomed vnodes.  If inactive processing was delayed in
  * vput(), try to do it here.
  */
 int
 vget(struct vnode *vp, int flags, struct thread *td)
 {
 	int error;
 
 	error = 0;
 	VNASSERT((flags & LK_TYPE_MASK) != 0, vp,
 	    ("vget: invalid lock operation"));
 	CTR3(KTR_VFS, "%s: vp %p with flags %d", __func__, vp, flags);
 
 	if ((flags & LK_INTERLOCK) == 0)
 		VI_LOCK(vp);
 	vholdl(vp);
 	if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
 		vdrop(vp);
 		CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__,
 		    vp);
 		return (error);
 	}
 	if (vp->v_iflag & VI_DOOMED && (flags & LK_RETRY) == 0)
 		panic("vget: vn_lock failed to return ENOENT\n");
 	VI_LOCK(vp);
 	/* Upgrade our holdcnt to a usecount. */
 	v_upgrade_usecount(vp);
 	/*
 	 * We don't guarantee that any particular close will
 	 * trigger inactive processing so just make a best effort
 	 * here at preventing a reference to a removed file.  If
 	 * we don't succeed no harm is done.
 	 */
 	if (vp->v_iflag & VI_OWEINACT) {
 		if (VOP_ISLOCKED(vp) == LK_EXCLUSIVE &&
 		    (flags & LK_NOWAIT) == 0)
 			vinactive(vp, td);
 		vp->v_iflag &= ~VI_OWEINACT;
 	}
 	VI_UNLOCK(vp);
 	return (0);
 }
 
 /*
  * Increase the reference count of a vnode.
  */
 void
 vref(struct vnode *vp)
 {
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	VI_LOCK(vp);
 	v_incr_usecount(vp);
 	VI_UNLOCK(vp);
 }
 
 /*
  * Return reference count of a vnode.
  *
  * The results of this call are only guaranteed when some mechanism other
  * than the VI lock is used to stop other processes from gaining references
  * to the vnode.  This may be the case if the caller holds the only reference.
  * This is also useful when stale data is acceptable as race conditions may
  * be accounted for by some other means.
  */
 int
 vrefcnt(struct vnode *vp)
 {
 	int usecnt;
 
 	VI_LOCK(vp);
 	usecnt = vp->v_usecount;
 	VI_UNLOCK(vp);
 
 	return (usecnt);
 }
 
 #define	VPUTX_VRELE	1
 #define	VPUTX_VPUT	2
 #define	VPUTX_VUNREF	3
 
 static void
 vputx(struct vnode *vp, int func)
 {
 	int error;
 
 	KASSERT(vp != NULL, ("vputx: null vp"));
 	if (func == VPUTX_VUNREF)
 		ASSERT_VOP_LOCKED(vp, "vunref");
 	else if (func == VPUTX_VPUT)
 		ASSERT_VOP_LOCKED(vp, "vput");
 	else
 		KASSERT(func == VPUTX_VRELE, ("vputx: wrong func"));
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	VI_LOCK(vp);
 
 	/* Skip this v_writecount check if we're going to panic below. */
 	VNASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1, vp,
 	    ("vputx: missed vn_close"));
 	error = 0;
 
 	if (vp->v_usecount > 1 || ((vp->v_iflag & VI_DOINGINACT) &&
 	    vp->v_usecount == 1)) {
 		if (func == VPUTX_VPUT)
 			VOP_UNLOCK(vp, 0);
 		v_decr_usecount(vp);
 		return;
 	}
 
 	if (vp->v_usecount != 1) {
 		vprint("vputx: negative ref count", vp);
 		panic("vputx: negative ref cnt");
 	}
 	CTR2(KTR_VFS, "%s: return vnode %p to the freelist", __func__, vp);
 	/*
 	 * We want to hold the vnode until the inactive finishes to
 	 * prevent vgone() races.  We drop the use count here and the
 	 * hold count below when we're done.
 	 */
 	v_decr_useonly(vp);
 	/*
 	 * We must call VOP_INACTIVE with the node locked.  Set VI_OWEINACT
 	 * so the call stays owed if the lock attempt below fails.
 	 */
 	vp->v_iflag |= VI_OWEINACT;
 	switch (func) {
 	case VPUTX_VRELE:
 		error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
 		VI_LOCK(vp);
 		break;
 	case VPUTX_VPUT:
 		if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
 			error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |
 			    LK_NOWAIT);
 			VI_LOCK(vp);
 		}
 		break;
 	case VPUTX_VUNREF:
 		if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
 			error = VOP_LOCK(vp, LK_TRYUPGRADE | LK_INTERLOCK);
 			VI_LOCK(vp);
 		}
 		break;
 	}
 	if (vp->v_usecount > 0)
 		vp->v_iflag &= ~VI_OWEINACT;
 	if (error == 0) {
 		if (vp->v_iflag & VI_OWEINACT)
 			vinactive(vp, curthread);
 		if (func != VPUTX_VUNREF)
 			VOP_UNLOCK(vp, 0);
 	}
 	vdropl(vp);
 }
 
 /*
  * Vnode put/release.
  * If count drops to zero, call inactive routine and return to freelist.
  */
 void
 vrele(struct vnode *vp)
 {
 
 	vputx(vp, VPUTX_VRELE);
 }
 
 /*
  * Release an already locked vnode.  This gives the same effect as
  * unlock+vrele(), but takes less time and avoids releasing and
  * re-acquiring the lock (as vrele() acquires the lock internally).
  */
 void
 vput(struct vnode *vp)
 {
 
 	vputx(vp, VPUTX_VPUT);
 }
 
 /*
  * Release an exclusively locked vnode. Do not unlock the vnode lock.
  */
 void
 vunref(struct vnode *vp)
 {
 
 	vputx(vp, VPUTX_VUNREF);
 }
 
 /*
  * Somebody doesn't want the vnode recycled.
  */
 void
 vhold(struct vnode *vp)
 {
 
 	VI_LOCK(vp);
 	vholdl(vp);
 	VI_UNLOCK(vp);
 }
 
 /*
  * Increase the hold count and activate if this is the first reference.
  */
 void
 vholdl(struct vnode *vp)
 {
 	struct mount *mp;
 
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 #ifdef INVARIANTS
 	/* getnewvnode() calls v_incr_usecount() without holding interlock. */
 	if (vp->v_type != VNON || vp->v_data != NULL) {
 		ASSERT_VI_LOCKED(vp, "vholdl");
 		VNASSERT(vp->v_holdcnt > 0 || (vp->v_iflag & VI_FREE) != 0,
 		    vp, ("vholdl: free vnode is held"));
 	}
 #endif
 	vp->v_holdcnt++;
 	if ((vp->v_iflag & VI_FREE) == 0)
 		return;
 	VNASSERT(vp->v_holdcnt == 1, vp, ("vholdl: wrong hold count"));
 	VNASSERT(vp->v_op != NULL, vp, ("vholdl: vnode already reclaimed."));
 	/*
 	 * Remove a vnode from the free list, mark it as in use,
 	 * and put it on the active list.
 	 */
 	mtx_lock(&vnode_free_list_mtx);
 	TAILQ_REMOVE(&vnode_free_list, vp, v_actfreelist);
 	freevnodes--;
 	vp->v_iflag &= ~(VI_FREE|VI_AGE);
 	KASSERT((vp->v_iflag & VI_ACTIVE) == 0,
 	    ("Activating already active vnode"));
 	vp->v_iflag |= VI_ACTIVE;
 	mp = vp->v_mount;
 	TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfreelist);
 	mp->mnt_activevnodelistsize++;
 	mtx_unlock(&vnode_free_list_mtx);
 }
 
 /*
  * Note that there is one less who cares about this vnode.
  * vdrop() is the opposite of vhold().
  */
 void
 vdrop(struct vnode *vp)
 {
 
 	VI_LOCK(vp);
 	vdropl(vp);
 }
 
 /*
  * Drop the hold count of the vnode.  If this is the last reference to
  * the vnode we place it on the free list unless it has been vgone'd
  * (marked VI_DOOMED) in which case we will free it.
  */
 void
 vdropl(struct vnode *vp)
 {
 	struct bufobj *bo;
 	struct mount *mp;
 	int active;
 
 	ASSERT_VI_LOCKED(vp, "vdropl");
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	if (vp->v_holdcnt <= 0)
 		panic("vdrop: holdcnt %d", vp->v_holdcnt);
 	vp->v_holdcnt--;
 	VNASSERT(vp->v_holdcnt >= vp->v_usecount, vp,
 	    ("hold count less than use count"));
 	if (vp->v_holdcnt > 0) {
 		VI_UNLOCK(vp);
 		return;
 	}
 	if ((vp->v_iflag & VI_DOOMED) == 0) {
 		/*
 		 * Mark a vnode as free: remove it from its active list
 		 * and put it up for recycling on the freelist.
 		 */
 		VNASSERT(vp->v_op != NULL, vp,
 		    ("vdropl: vnode already reclaimed."));
 		VNASSERT((vp->v_iflag & VI_FREE) == 0, vp,
 		    ("vnode already free"));
 		VNASSERT(vp->v_holdcnt == 0, vp,
 		    ("vdropl: freeing when we shouldn't"));
 		active = vp->v_iflag & VI_ACTIVE;
 		vp->v_iflag &= ~VI_ACTIVE;
 		mp = vp->v_mount;
 		mtx_lock(&vnode_free_list_mtx);
 		if (active) {
 			TAILQ_REMOVE(&mp->mnt_activevnodelist, vp,
 			    v_actfreelist);
 			mp->mnt_activevnodelistsize--;
 		}
 		if (vp->v_iflag & VI_AGE) {
 			TAILQ_INSERT_HEAD(&vnode_free_list, vp, v_actfreelist);
 		} else {
 			TAILQ_INSERT_TAIL(&vnode_free_list, vp, v_actfreelist);
 		}
 		freevnodes++;
 		vp->v_iflag &= ~VI_AGE;
 		vp->v_iflag |= VI_FREE;
 		mtx_unlock(&vnode_free_list_mtx);
 		VI_UNLOCK(vp);
 		return;
 	}
 	/*
 	 * The vnode has been marked for destruction, so free it.
 	 */
 	CTR2(KTR_VFS, "%s: destroying the vnode %p", __func__, vp);
 	atomic_subtract_long(&numvnodes, 1);
 	bo = &vp->v_bufobj;
 	VNASSERT((vp->v_iflag & VI_FREE) == 0, vp,
 	    ("cleaned vnode still on the free list."));
 	VNASSERT(vp->v_data == NULL, vp, ("cleaned vnode isn't"));
 	VNASSERT(vp->v_holdcnt == 0, vp, ("Non-zero hold count"));
 	VNASSERT(vp->v_usecount == 0, vp, ("Non-zero use count"));
 	VNASSERT(vp->v_writecount == 0, vp, ("Non-zero write count"));
 	VNASSERT(bo->bo_numoutput == 0, vp, ("Clean vnode has pending I/O's"));
 	VNASSERT(bo->bo_clean.bv_cnt == 0, vp, ("cleanbufcnt not 0"));
 	VNASSERT(pctrie_is_empty(&bo->bo_clean.bv_root), vp,
 	    ("clean blk trie not empty"));
 	VNASSERT(bo->bo_dirty.bv_cnt == 0, vp, ("dirtybufcnt not 0"));
 	VNASSERT(pctrie_is_empty(&bo->bo_dirty.bv_root), vp,
 	    ("dirty blk trie not empty"));
 	VNASSERT(TAILQ_EMPTY(&vp->v_cache_dst), vp, ("vp has namecache dst"));
 	VNASSERT(LIST_EMPTY(&vp->v_cache_src), vp, ("vp has namecache src"));
 	VNASSERT(vp->v_cache_dd == NULL, vp, ("vp has namecache for .."));
 	VI_UNLOCK(vp);
 #ifdef MAC
 	mac_vnode_destroy(vp);
 #endif
 	if (vp->v_pollinfo != NULL)
 		destroy_vpollinfo(vp->v_pollinfo);
 #ifdef INVARIANTS
 	/* XXX Elsewhere we detect an already freed vnode via NULL v_op. */
 	vp->v_op = NULL;
 #endif
 	rangelock_destroy(&vp->v_rl);
 	lockdestroy(vp->v_vnlock);
 	mtx_destroy(&vp->v_interlock);
 	rw_destroy(BO_LOCKPTR(bo));
 	uma_zfree(vnode_zone, vp);
 }
 
 /*
  * Call VOP_INACTIVE on the vnode and manage the DOINGINACT and OWEINACT
  * flags.  DOINGINACT prevents us from recursing in calls to vinactive.
  * OWEINACT tracks whether a vnode missed a call to inactive due to a
  * failed lock upgrade.
  */
 void
 vinactive(struct vnode *vp, struct thread *td)
 {
 	struct vm_object *obj;
 
 	ASSERT_VOP_ELOCKED(vp, "vinactive");
 	ASSERT_VI_LOCKED(vp, "vinactive");
 	VNASSERT((vp->v_iflag & VI_DOINGINACT) == 0, vp,
 	    ("vinactive: recursed on VI_DOINGINACT"));
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	vp->v_iflag |= VI_DOINGINACT;
 	vp->v_iflag &= ~VI_OWEINACT;
 	VI_UNLOCK(vp);
 	/*
 	 * Before moving off the active list, we must be sure that any
 	 * modified pages are on the vnode's dirty list since these will
 	 * no longer be checked once the vnode is on the inactive list.
 	 * Because the vnode vm object keeps a hold reference on the vnode
 	 * if there is at least one resident non-cached page, the vnode
 	 * cannot leave the active list without the page cleanup done.
 	 */
 	obj = vp->v_object;
 	if (obj != NULL && (obj->flags & OBJ_MIGHTBEDIRTY) != 0) {
 		VM_OBJECT_WLOCK(obj);
 		vm_object_page_clean(obj, 0, 0, OBJPC_NOSYNC);
 		VM_OBJECT_WUNLOCK(obj);
 	}
 	VOP_INACTIVE(vp, td);
 	VI_LOCK(vp);
 	VNASSERT(vp->v_iflag & VI_DOINGINACT, vp,
 	    ("vinactive: lost VI_DOINGINACT"));
 	vp->v_iflag &= ~VI_DOINGINACT;
 }
 
 /*
  * Remove any vnodes in the vnode table belonging to mount point mp.
  *
  * If FORCECLOSE is not specified, there should not be any active ones,
  * return error if any are found (nb: this is a user error, not a
  * system error). If FORCECLOSE is specified, detach any active vnodes
  * that are found.
  *
  * If WRITECLOSE is set, only flush out regular file vnodes open for
  * writing.
  *
  * SKIPSYSTEM causes any vnodes marked VV_SYSTEM to be skipped.
  *
  * `rootrefs' specifies the base reference count for the root vnode
  * of this filesystem. The root vnode is considered busy if its
  * v_usecount exceeds this value. On a successful return, vflush()
  * will call vrele() on the root vnode exactly rootrefs times.
  * If the SKIPSYSTEM or WRITECLOSE flags are specified, rootrefs must
  * be zero.
  */
 #ifdef DIAGNOSTIC
 static int busyprt = 0;		/* print out busy vnodes */
 SYSCTL_INT(_debug, OID_AUTO, busyprt, CTLFLAG_RW, &busyprt, 0, "Print out busy vnodes");
 #endif
 
 int
 vflush(struct mount *mp, int rootrefs, int flags, struct thread *td)
 {
 	struct vnode *vp, *mvp, *rootvp = NULL;
 	struct vattr vattr;
 	int busy = 0, error;
 
 	CTR4(KTR_VFS, "%s: mp %p with rootrefs %d and flags %d", __func__, mp,
 	    rootrefs, flags);
 	if (rootrefs > 0) {
 		KASSERT((flags & (SKIPSYSTEM | WRITECLOSE)) == 0,
 		    ("vflush: bad args"));
 		/*
 		 * Get the filesystem root vnode. We can vput() it
 		 * immediately, since with rootrefs > 0, it won't go away.
 		 */
 		if ((error = VFS_ROOT(mp, LK_EXCLUSIVE, &rootvp)) != 0) {
 			CTR2(KTR_VFS, "%s: vfs_root lookup failed with %d",
 			    __func__, error);
 			return (error);
 		}
 		vput(rootvp);
 	}
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		vholdl(vp);
 		error = vn_lock(vp, LK_INTERLOCK | LK_EXCLUSIVE);
 		if (error) {
 			vdrop(vp);
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			goto loop;
 		}
 		/*
 		 * Skip over vnodes marked VV_SYSTEM.
 		 */
 		if ((flags & SKIPSYSTEM) && (vp->v_vflag & VV_SYSTEM)) {
 			VOP_UNLOCK(vp, 0);
 			vdrop(vp);
 			continue;
 		}
 		/*
 		 * If WRITECLOSE is set, flush out unlinked but still open
 		 * files (even if open only for reading) and regular file
 		 * vnodes open for writing.
 		 */
 		if (flags & WRITECLOSE) {
 			if (vp->v_object != NULL) {
 				VM_OBJECT_WLOCK(vp->v_object);
 				vm_object_page_clean(vp->v_object, 0, 0, 0);
 				VM_OBJECT_WUNLOCK(vp->v_object);
 			}
 			error = VOP_FSYNC(vp, MNT_WAIT, td);
 			if (error != 0) {
 				VOP_UNLOCK(vp, 0);
 				vdrop(vp);
 				MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 				return (error);
 			}
 			error = VOP_GETATTR(vp, &vattr, td->td_ucred);
 			VI_LOCK(vp);
 
 			if ((vp->v_type == VNON ||
 			    (error == 0 && vattr.va_nlink > 0)) &&
 			    (vp->v_writecount == 0 || vp->v_type != VREG)) {
 				VOP_UNLOCK(vp, 0);
 				vdropl(vp);
 				continue;
 			}
 		} else
 			VI_LOCK(vp);
 		/*
 		 * With v_usecount == 0, all we need to do is clear out the
 		 * vnode data structures and we are done.
 		 *
 		 * If FORCECLOSE is set, forcibly close the vnode.
 		 */
 		if (vp->v_usecount == 0 || (flags & FORCECLOSE)) {
 			VNASSERT(vp->v_usecount == 0 ||
 			    vp->v_op != &devfs_specops ||
 			    (vp->v_type != VCHR && vp->v_type != VBLK), vp,
 			    ("device VNODE %p is FORCECLOSED", vp));
 			vgonel(vp);
 		} else {
 			busy++;
 #ifdef DIAGNOSTIC
 			if (busyprt)
 				vprint("vflush: busy vnode", vp);
 #endif
 		}
 		VOP_UNLOCK(vp, 0);
 		vdropl(vp);
 	}
 	if (rootrefs > 0 && (flags & FORCECLOSE) == 0) {
 		/*
 		 * If just the root vnode is busy, and if its refcount
 		 * is equal to `rootrefs', then go ahead and kill it.
 		 */
 		VI_LOCK(rootvp);
 		KASSERT(busy > 0, ("vflush: not busy"));
 		VNASSERT(rootvp->v_usecount >= rootrefs, rootvp,
 		    ("vflush: usecount %d < rootrefs %d",
 		     rootvp->v_usecount, rootrefs));
 		if (busy == 1 && rootvp->v_usecount == rootrefs) {
 			VOP_LOCK(rootvp, LK_EXCLUSIVE|LK_INTERLOCK);
 			vgone(rootvp);
 			VOP_UNLOCK(rootvp, 0);
 			busy = 0;
 		} else
 			VI_UNLOCK(rootvp);
 	}
 	if (busy) {
 		CTR2(KTR_VFS, "%s: failing as %d vnodes are busy", __func__,
 		    busy);
 		return (EBUSY);
 	}
 	for (; rootrefs > 0; rootrefs--)
 		vrele(rootvp);
 	return (0);
 }
 
 /*
  * Recycle an unused vnode to the front of the free list.
  */
 int
 vrecycle(struct vnode *vp)
 {
 	int recycled;
 
 	ASSERT_VOP_ELOCKED(vp, "vrecycle");
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	recycled = 0;
 	VI_LOCK(vp);
 	if (vp->v_usecount == 0) {
 		recycled = 1;
 		vgonel(vp);
 	}
 	VI_UNLOCK(vp);
 	return (recycled);
 }
 
 /*
  * Eliminate all activity associated with a vnode
  * in preparation for reuse.
  */
 void
 vgone(struct vnode *vp)
 {
 	VI_LOCK(vp);
 	vgonel(vp);
 	VI_UNLOCK(vp);
 }
 
 static void
 notify_lowervp_vfs_dummy(struct mount *mp __unused,
     struct vnode *lowervp __unused)
 {
 }
 
 /*
  * Notify upper mounts about a reclaimed or unlinked vnode.
  */
 void
 vfs_notify_upper(struct vnode *vp, int event)
 {
 	static struct vfsops vgonel_vfsops = {
 		.vfs_reclaim_lowervp = notify_lowervp_vfs_dummy,
 		.vfs_unlink_lowervp = notify_lowervp_vfs_dummy,
 	};
 	struct mount *mp, *ump, *mmp;
 
 	mp = vp->v_mount;
 	if (mp == NULL)
 		return;
 
 	MNT_ILOCK(mp);
 	if (TAILQ_EMPTY(&mp->mnt_uppers))
 		goto unlock;
 	MNT_IUNLOCK(mp);
 	mmp = malloc(sizeof(struct mount), M_TEMP, M_WAITOK | M_ZERO);
 	mmp->mnt_op = &vgonel_vfsops;
 	mmp->mnt_kern_flag |= MNTK_MARKER;
 	MNT_ILOCK(mp);
 	mp->mnt_kern_flag |= MNTK_VGONE_UPPER;
 	for (ump = TAILQ_FIRST(&mp->mnt_uppers); ump != NULL;) {
 		if ((ump->mnt_kern_flag & MNTK_MARKER) != 0) {
 			ump = TAILQ_NEXT(ump, mnt_upper_link);
 			continue;
 		}
 		TAILQ_INSERT_AFTER(&mp->mnt_uppers, ump, mmp, mnt_upper_link);
 		MNT_IUNLOCK(mp);
 		switch (event) {
 		case VFS_NOTIFY_UPPER_RECLAIM:
 			VFS_RECLAIM_LOWERVP(ump, vp);
 			break;
 		case VFS_NOTIFY_UPPER_UNLINK:
 			VFS_UNLINK_LOWERVP(ump, vp);
 			break;
 		default:
 			KASSERT(0, ("invalid event %d", event));
 			break;
 		}
 		MNT_ILOCK(mp);
 		ump = TAILQ_NEXT(mmp, mnt_upper_link);
 		TAILQ_REMOVE(&mp->mnt_uppers, mmp, mnt_upper_link);
 	}
 	free(mmp, M_TEMP);
 	mp->mnt_kern_flag &= ~MNTK_VGONE_UPPER;
 	if ((mp->mnt_kern_flag & MNTK_VGONE_WAITER) != 0) {
 		mp->mnt_kern_flag &= ~MNTK_VGONE_WAITER;
 		wakeup(&mp->mnt_uppers);
 	}
 unlock:
 	MNT_IUNLOCK(mp);
 }
 
 /*
  * vgone, with the vp interlock held.
  */
 void
 vgonel(struct vnode *vp)
 {
 	struct thread *td;
 	int oweinact;
 	int active;
 	struct mount *mp;
 
 	ASSERT_VOP_ELOCKED(vp, "vgonel");
 	ASSERT_VI_LOCKED(vp, "vgonel");
 	VNASSERT(vp->v_holdcnt, vp,
 	    ("vgonel: vp %p has no reference.", vp));
 	CTR2(KTR_VFS, "%s: vp %p", __func__, vp);
 	td = curthread;
 
 	/*
 	 * Don't vgonel if we're already doomed.
 	 */
 	if (vp->v_iflag & VI_DOOMED)
 		return;
 	vp->v_iflag |= VI_DOOMED;
 
 	/*
 	 * Check to see if the vnode is in use.  If so, we have to call
 	 * VOP_CLOSE() and VOP_INACTIVE().
 	 */
 	active = vp->v_usecount;
 	oweinact = (vp->v_iflag & VI_OWEINACT);
 	VI_UNLOCK(vp);
 	vfs_notify_upper(vp, VFS_NOTIFY_UPPER_RECLAIM);
 
 	/*
 	 * If purging an active vnode, it must be closed and
 	 * deactivated before being reclaimed.
 	 */
 	if (active)
 		VOP_CLOSE(vp, FNONBLOCK, NOCRED, td);
 	if (oweinact || active) {
 		VI_LOCK(vp);
 		if ((vp->v_iflag & VI_DOINGINACT) == 0)
 			vinactive(vp, td);
 		VI_UNLOCK(vp);
 	}
 	if (vp->v_type == VSOCK)
 		vfs_unp_reclaim(vp);
 
 	/*
 	 * Clean out any buffers associated with the vnode.
 	 * If the flush fails, just toss the buffers.
 	 */
 	mp = NULL;
 	if (!TAILQ_EMPTY(&vp->v_bufobj.bo_dirty.bv_hd))
 		(void) vn_start_secondary_write(vp, &mp, V_WAIT);
 	if (vinvalbuf(vp, V_SAVE, 0, 0) != 0) {
 		while (vinvalbuf(vp, 0, 0, 0) != 0)
 			;
 	}
 #ifdef INVARIANTS
 	BO_LOCK(&vp->v_bufobj);
 	KASSERT(TAILQ_EMPTY(&vp->v_bufobj.bo_dirty.bv_hd) &&
 	    vp->v_bufobj.bo_dirty.bv_cnt == 0 &&
 	    TAILQ_EMPTY(&vp->v_bufobj.bo_clean.bv_hd) &&
 	    vp->v_bufobj.bo_clean.bv_cnt == 0,
 	    ("vp %p bufobj not invalidated", vp));
 	vp->v_bufobj.bo_flag |= BO_DEAD;
 	BO_UNLOCK(&vp->v_bufobj);
 #endif
 
 	/*
 	 * Reclaim the vnode.
 	 */
 	if (VOP_RECLAIM(vp, td))
 		panic("vgone: cannot reclaim");
 	if (mp != NULL)
 		vn_finished_secondary_write(mp);
 	VNASSERT(vp->v_object == NULL, vp,
 	    ("vop_reclaim left v_object vp=%p, tag=%s", vp, vp->v_tag));
 	/*
 	 * Clear the advisory locks and wake up waiting threads.
 	 */
 	(void)VOP_ADVLOCKPURGE(vp);
 	/*
 	 * Delete from old mount point vnode list.
 	 */
 	delmntque(vp);
 	cache_purge(vp);
 	/*
 	 * Done with purge, reset to the standard lock and invalidate
 	 * the vnode.
 	 */
 	VI_LOCK(vp);
 	vp->v_vnlock = &vp->v_lock;
 	vp->v_op = &dead_vnodeops;
 	vp->v_tag = "none";
 	vp->v_type = VBAD;
 }
 
 /*
  * Calculate the total number of references to a special device.
  */
 int
 vcount(struct vnode *vp)
 {
 	int count;
 
 	dev_lock();
 	count = vp->v_rdev->si_usecount;
 	dev_unlock();
 	return (count);
 }
 
 /*
  * Same as above, but using the struct cdev * as argument.
  */
 int
 count_dev(struct cdev *dev)
 {
 	int count;
 
 	dev_lock();
 	count = dev->si_usecount;
 	dev_unlock();
 	return (count);
 }
 
 /*
  * Print out a description of a vnode.
  */
 static char *typename[] =
 {"VNON", "VREG", "VDIR", "VBLK", "VCHR", "VLNK", "VSOCK", "VFIFO", "VBAD",
  "VMARKER"};
 
 void
 vn_printf(struct vnode *vp, const char *fmt, ...)
 {
 	va_list ap;
 	char buf[256], buf2[16];
 	u_long flags;
 
 	va_start(ap, fmt);
 	vprintf(fmt, ap);
 	va_end(ap);
 	printf("%p: ", (void *)vp);
 	printf("tag %s, type %s\n", vp->v_tag, typename[vp->v_type]);
 	printf("    usecount %d, writecount %d, refcount %d mountedhere %p\n",
 	    vp->v_usecount, vp->v_writecount, vp->v_holdcnt, vp->v_mountedhere);
 	buf[0] = '\0';
 	buf[1] = '\0';
 	if (vp->v_vflag & VV_ROOT)
 		strlcat(buf, "|VV_ROOT", sizeof(buf));
 	if (vp->v_vflag & VV_ISTTY)
 		strlcat(buf, "|VV_ISTTY", sizeof(buf));
 	if (vp->v_vflag & VV_NOSYNC)
 		strlcat(buf, "|VV_NOSYNC", sizeof(buf));
 	if (vp->v_vflag & VV_ETERNALDEV)
 		strlcat(buf, "|VV_ETERNALDEV", sizeof(buf));
 	if (vp->v_vflag & VV_CACHEDLABEL)
 		strlcat(buf, "|VV_CACHEDLABEL", sizeof(buf));
 	if (vp->v_vflag & VV_TEXT)
 		strlcat(buf, "|VV_TEXT", sizeof(buf));
 	if (vp->v_vflag & VV_COPYONWRITE)
 		strlcat(buf, "|VV_COPYONWRITE", sizeof(buf));
 	if (vp->v_vflag & VV_SYSTEM)
 		strlcat(buf, "|VV_SYSTEM", sizeof(buf));
 	if (vp->v_vflag & VV_PROCDEP)
 		strlcat(buf, "|VV_PROCDEP", sizeof(buf));
 	if (vp->v_vflag & VV_NOKNOTE)
 		strlcat(buf, "|VV_NOKNOTE", sizeof(buf));
 	if (vp->v_vflag & VV_DELETED)
 		strlcat(buf, "|VV_DELETED", sizeof(buf));
 	if (vp->v_vflag & VV_MD)
 		strlcat(buf, "|VV_MD", sizeof(buf));
 	if (vp->v_vflag & VV_FORCEINSMQ)
 		strlcat(buf, "|VV_FORCEINSMQ", sizeof(buf));
 	flags = vp->v_vflag & ~(VV_ROOT | VV_ISTTY | VV_NOSYNC | VV_ETERNALDEV |
 	    VV_CACHEDLABEL | VV_TEXT | VV_COPYONWRITE | VV_SYSTEM | VV_PROCDEP |
 	    VV_NOKNOTE | VV_DELETED | VV_MD | VV_FORCEINSMQ);
 	if (flags != 0) {
 		snprintf(buf2, sizeof(buf2), "|VV(0x%lx)", flags);
 		strlcat(buf, buf2, sizeof(buf));
 	}
 	if (vp->v_iflag & VI_MOUNT)
 		strlcat(buf, "|VI_MOUNT", sizeof(buf));
 	if (vp->v_iflag & VI_AGE)
 		strlcat(buf, "|VI_AGE", sizeof(buf));
 	if (vp->v_iflag & VI_DOOMED)
 		strlcat(buf, "|VI_DOOMED", sizeof(buf));
 	if (vp->v_iflag & VI_FREE)
 		strlcat(buf, "|VI_FREE", sizeof(buf));
 	if (vp->v_iflag & VI_ACTIVE)
 		strlcat(buf, "|VI_ACTIVE", sizeof(buf));
 	if (vp->v_iflag & VI_DOINGINACT)
 		strlcat(buf, "|VI_DOINGINACT", sizeof(buf));
 	if (vp->v_iflag & VI_OWEINACT)
 		strlcat(buf, "|VI_OWEINACT", sizeof(buf));
 	flags = vp->v_iflag & ~(VI_MOUNT | VI_AGE | VI_DOOMED | VI_FREE |
 	    VI_ACTIVE | VI_DOINGINACT | VI_OWEINACT);
 	if (flags != 0) {
 		snprintf(buf2, sizeof(buf2), "|VI(0x%lx)", flags);
 		strlcat(buf, buf2, sizeof(buf));
 	}
 	printf("    flags (%s)\n", buf + 1);
 	if (mtx_owned(VI_MTX(vp)))
 		printf(" VI_LOCKed");
 	if (vp->v_object != NULL)
 		printf("    v_object %p ref %d pages %d "
 		    "cleanbuf %d dirtybuf %d\n",
 		    vp->v_object, vp->v_object->ref_count,
 		    vp->v_object->resident_page_count,
 		    vp->v_bufobj.bo_clean.bv_cnt,
 		    vp->v_bufobj.bo_dirty.bv_cnt);
 	printf("    ");
 	lockmgr_printinfo(vp->v_vnlock);
 	if (vp->v_data != NULL)
 		VOP_PRINT(vp);
 }
 
 #ifdef DDB
 /*
  * List all of the locked vnodes in the system.
  * Called when debugging the kernel.
  */
 DB_SHOW_COMMAND(lockedvnods, lockedvnodes)
 {
 	struct mount *mp;
 	struct vnode *vp;
 
 	/*
 	 * Note: because this is DDB, we can't obey the locking semantics
 	 * for these structures, which means we could catch an inconsistent
 	 * state and dereference a nasty pointer.  Not much to be done
 	 * about that.
 	 */
 	db_printf("Locked vnodes\n");
 	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 		TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
 			if (vp->v_type != VMARKER && VOP_ISLOCKED(vp))
 				vprint("", vp);
 		}
 	}
 }
 
 /*
  * Show details about the given vnode.
  */
 DB_SHOW_COMMAND(vnode, db_show_vnode)
 {
 	struct vnode *vp;
 
 	if (!have_addr)
 		return;
 	vp = (struct vnode *)addr;
 	vn_printf(vp, "vnode ");
 }
 
 /*
  * Show details about the given mount point.
  */
 DB_SHOW_COMMAND(mount, db_show_mount)
 {
 	struct mount *mp;
 	struct vfsopt *opt;
 	struct statfs *sp;
 	struct vnode *vp;
 	char buf[512];
 	uint64_t mflags;
 	u_int flags;
 
 	if (!have_addr) {
 		/* No address given, print short info about all mount points. */
 		TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 			db_printf("%p %s on %s (%s)\n", mp,
 			    mp->mnt_stat.f_mntfromname,
 			    mp->mnt_stat.f_mntonname,
 			    mp->mnt_stat.f_fstypename);
 			if (db_pager_quit)
 				break;
 		}
 		db_printf("\nMore info: show mount <addr>\n");
 		return;
 	}
 
 	mp = (struct mount *)addr;
 	db_printf("%p %s on %s (%s)\n", mp, mp->mnt_stat.f_mntfromname,
 	    mp->mnt_stat.f_mntonname, mp->mnt_stat.f_fstypename);
 
 	buf[0] = '\0';
 	mflags = mp->mnt_flag;
 #define	MNT_FLAG(flag)	do {						\
 	if (mflags & (flag)) {						\
 		if (buf[0] != '\0')					\
 			strlcat(buf, ", ", sizeof(buf));		\
 		strlcat(buf, (#flag) + 4, sizeof(buf));			\
 		mflags &= ~(flag);					\
 	}								\
 } while (0)
 	MNT_FLAG(MNT_RDONLY);
 	MNT_FLAG(MNT_SYNCHRONOUS);
 	MNT_FLAG(MNT_NOEXEC);
 	MNT_FLAG(MNT_NOSUID);
 	MNT_FLAG(MNT_NFS4ACLS);
 	MNT_FLAG(MNT_UNION);
 	MNT_FLAG(MNT_ASYNC);
 	MNT_FLAG(MNT_SUIDDIR);
 	MNT_FLAG(MNT_SOFTDEP);
 	MNT_FLAG(MNT_NOSYMFOLLOW);
 	MNT_FLAG(MNT_GJOURNAL);
 	MNT_FLAG(MNT_MULTILABEL);
 	MNT_FLAG(MNT_ACLS);
 	MNT_FLAG(MNT_NOATIME);
 	MNT_FLAG(MNT_NOCLUSTERR);
 	MNT_FLAG(MNT_NOCLUSTERW);
 	MNT_FLAG(MNT_SUJ);
 	MNT_FLAG(MNT_EXRDONLY);
 	MNT_FLAG(MNT_EXPORTED);
 	MNT_FLAG(MNT_DEFEXPORTED);
 	MNT_FLAG(MNT_EXPORTANON);
 	MNT_FLAG(MNT_EXKERB);
 	MNT_FLAG(MNT_EXPUBLIC);
 	MNT_FLAG(MNT_LOCAL);
 	MNT_FLAG(MNT_QUOTA);
 	MNT_FLAG(MNT_ROOTFS);
 	MNT_FLAG(MNT_USER);
 	MNT_FLAG(MNT_IGNORE);
 	MNT_FLAG(MNT_UPDATE);
 	MNT_FLAG(MNT_DELEXPORT);
 	MNT_FLAG(MNT_RELOAD);
 	MNT_FLAG(MNT_FORCE);
 	MNT_FLAG(MNT_SNAPSHOT);
 	MNT_FLAG(MNT_BYFSID);
 #undef MNT_FLAG
 	if (mflags != 0) {
 		if (buf[0] != '\0')
 			strlcat(buf, ", ", sizeof(buf));
 		snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf),
 		    "0x%016jx", mflags);
 	}
 	db_printf("    mnt_flag = %s\n", buf);
 
 	buf[0] = '\0';
 	flags = mp->mnt_kern_flag;
 #define	MNT_KERN_FLAG(flag)	do {					\
 	if (flags & (flag)) {						\
 		if (buf[0] != '\0')					\
 			strlcat(buf, ", ", sizeof(buf));		\
 		strlcat(buf, (#flag) + 5, sizeof(buf));			\
 		flags &= ~(flag);					\
 	}								\
 } while (0)
 	MNT_KERN_FLAG(MNTK_UNMOUNTF);
 	MNT_KERN_FLAG(MNTK_ASYNC);
 	MNT_KERN_FLAG(MNTK_SOFTDEP);
 	MNT_KERN_FLAG(MNTK_NOINSMNTQ);
 	MNT_KERN_FLAG(MNTK_DRAINING);
 	MNT_KERN_FLAG(MNTK_REFEXPIRE);
 	MNT_KERN_FLAG(MNTK_EXTENDED_SHARED);
 	MNT_KERN_FLAG(MNTK_SHARED_WRITES);
 	MNT_KERN_FLAG(MNTK_NO_IOPF);
 	MNT_KERN_FLAG(MNTK_VGONE_UPPER);
 	MNT_KERN_FLAG(MNTK_VGONE_WAITER);
 	MNT_KERN_FLAG(MNTK_LOOKUP_EXCL_DOTDOT);
 	MNT_KERN_FLAG(MNTK_MARKER);
+	MNT_KERN_FLAG(MNTK_USES_BCACHE);
 	MNT_KERN_FLAG(MNTK_NOASYNC);
 	MNT_KERN_FLAG(MNTK_UNMOUNT);
 	MNT_KERN_FLAG(MNTK_MWAIT);
 	MNT_KERN_FLAG(MNTK_SUSPEND);
 	MNT_KERN_FLAG(MNTK_SUSPEND2);
 	MNT_KERN_FLAG(MNTK_SUSPENDED);
 	MNT_KERN_FLAG(MNTK_LOOKUP_SHARED);
 	MNT_KERN_FLAG(MNTK_NOKNOTE);
 #undef MNT_KERN_FLAG
 	if (flags != 0) {
 		if (buf[0] != '\0')
 			strlcat(buf, ", ", sizeof(buf));
 		snprintf(buf + strlen(buf), sizeof(buf) - strlen(buf),
 		    "0x%08x", flags);
 	}
 	db_printf("    mnt_kern_flag = %s\n", buf);
 
 	db_printf("    mnt_opt = ");
 	opt = TAILQ_FIRST(mp->mnt_opt);
 	if (opt != NULL) {
 		db_printf("%s", opt->name);
 		opt = TAILQ_NEXT(opt, link);
 		while (opt != NULL) {
 			db_printf(", %s", opt->name);
 			opt = TAILQ_NEXT(opt, link);
 		}
 	}
 	db_printf("\n");
 
 	sp = &mp->mnt_stat;
 	db_printf("    mnt_stat = { version=%u type=%u flags=0x%016jx "
 	    "bsize=%ju iosize=%ju blocks=%ju bfree=%ju bavail=%jd files=%ju "
 	    "ffree=%jd syncwrites=%ju asyncwrites=%ju syncreads=%ju "
 	    "asyncreads=%ju namemax=%u owner=%u fsid=[%d, %d] }\n",
 	    (u_int)sp->f_version, (u_int)sp->f_type, (uintmax_t)sp->f_flags,
 	    (uintmax_t)sp->f_bsize, (uintmax_t)sp->f_iosize,
 	    (uintmax_t)sp->f_blocks, (uintmax_t)sp->f_bfree,
 	    (intmax_t)sp->f_bavail, (uintmax_t)sp->f_files,
 	    (intmax_t)sp->f_ffree, (uintmax_t)sp->f_syncwrites,
 	    (uintmax_t)sp->f_asyncwrites, (uintmax_t)sp->f_syncreads,
 	    (uintmax_t)sp->f_asyncreads, (u_int)sp->f_namemax,
 	    (u_int)sp->f_owner, (int)sp->f_fsid.val[0], (int)sp->f_fsid.val[1]);
 
 	db_printf("    mnt_cred = { uid=%u ruid=%u",
 	    (u_int)mp->mnt_cred->cr_uid, (u_int)mp->mnt_cred->cr_ruid);
 	if (jailed(mp->mnt_cred))
 		db_printf(", jail=%d", mp->mnt_cred->cr_prison->pr_id);
 	db_printf(" }\n");
 	db_printf("    mnt_ref = %d\n", mp->mnt_ref);
 	db_printf("    mnt_gen = %d\n", mp->mnt_gen);
 	db_printf("    mnt_nvnodelistsize = %d\n", mp->mnt_nvnodelistsize);
 	db_printf("    mnt_activevnodelistsize = %d\n",
 	    mp->mnt_activevnodelistsize);
 	db_printf("    mnt_writeopcount = %d\n", mp->mnt_writeopcount);
 	db_printf("    mnt_maxsymlinklen = %d\n", mp->mnt_maxsymlinklen);
 	db_printf("    mnt_iosize_max = %d\n", mp->mnt_iosize_max);
 	db_printf("    mnt_hashseed = %u\n", mp->mnt_hashseed);
 	db_printf("    mnt_lockref = %d\n", mp->mnt_lockref);
 	db_printf("    mnt_secondary_writes = %d\n", mp->mnt_secondary_writes);
 	db_printf("    mnt_secondary_accwrites = %d\n",
 	    mp->mnt_secondary_accwrites);
 	db_printf("    mnt_gjprovider = %s\n",
 	    mp->mnt_gjprovider != NULL ? mp->mnt_gjprovider : "NULL");
 
 	db_printf("\n\nList of active vnodes\n");
 	TAILQ_FOREACH(vp, &mp->mnt_activevnodelist, v_actfreelist) {
 		if (vp->v_type != VMARKER) {
 			vn_printf(vp, "vnode ");
 			if (db_pager_quit)
 				break;
 		}
 	}
 	db_printf("\n\nList of inactive vnodes\n");
 	TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
 		if (vp->v_type != VMARKER && (vp->v_iflag & VI_ACTIVE) == 0) {
 			vn_printf(vp, "vnode ");
 			if (db_pager_quit)
 				break;
 		}
 	}
 }
 #endif	/* DDB */
 
 /*
  * Fill in a struct xvfsconf based on a struct vfsconf.
  */
 static int
 vfsconf2x(struct sysctl_req *req, struct vfsconf *vfsp)
 {
 	struct xvfsconf xvfsp;
 
 	bzero(&xvfsp, sizeof(xvfsp));
 	strcpy(xvfsp.vfc_name, vfsp->vfc_name);
 	xvfsp.vfc_typenum = vfsp->vfc_typenum;
 	xvfsp.vfc_refcount = vfsp->vfc_refcount;
 	xvfsp.vfc_flags = vfsp->vfc_flags;
 	/*
 	 * These are unused in userland, we keep them
 	 * to not break binary compatibility.
 	 */
 	xvfsp.vfc_vfsops = NULL;
 	xvfsp.vfc_next = NULL;
 	return (SYSCTL_OUT(req, &xvfsp, sizeof(xvfsp)));
 }
 
 #ifdef COMPAT_FREEBSD32
 struct xvfsconf32 {
 	uint32_t	vfc_vfsops;
 	char		vfc_name[MFSNAMELEN];
 	int32_t		vfc_typenum;
 	int32_t		vfc_refcount;
 	int32_t		vfc_flags;
 	uint32_t	vfc_next;
 };
 
 static int
 vfsconf2x32(struct sysctl_req *req, struct vfsconf *vfsp)
 {
 	struct xvfsconf32 xvfsp;
 
 	strcpy(xvfsp.vfc_name, vfsp->vfc_name);
 	xvfsp.vfc_typenum = vfsp->vfc_typenum;
 	xvfsp.vfc_refcount = vfsp->vfc_refcount;
 	xvfsp.vfc_flags = vfsp->vfc_flags;
 	xvfsp.vfc_vfsops = 0;
 	xvfsp.vfc_next = 0;
 	return (SYSCTL_OUT(req, &xvfsp, sizeof(xvfsp)));
 }
 #endif
 
 /*
  * Top level filesystem related information gathering.
  */
 static int
 sysctl_vfs_conflist(SYSCTL_HANDLER_ARGS)
 {
 	struct vfsconf *vfsp;
 	int error;
 
 	error = 0;
 	vfsconf_slock();
 	TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) {
 #ifdef COMPAT_FREEBSD32
 		if (req->flags & SCTL_MASK32)
 			error = vfsconf2x32(req, vfsp);
 		else
 #endif
 			error = vfsconf2x(req, vfsp);
 		if (error)
 			break;
 	}
 	vfsconf_sunlock();
 	return (error);
 }
 
 SYSCTL_PROC(_vfs, OID_AUTO, conflist, CTLTYPE_OPAQUE | CTLFLAG_RD |
     CTLFLAG_MPSAFE, NULL, 0, sysctl_vfs_conflist,
     "S,xvfsconf", "List of all configured filesystems");
 
 #ifndef BURN_BRIDGES
 static int	sysctl_ovfs_conf(SYSCTL_HANDLER_ARGS);
 
 static int
 vfs_sysctl(SYSCTL_HANDLER_ARGS)
 {
 	int *name = (int *)arg1 - 1;	/* XXX */
 	u_int namelen = arg2 + 1;	/* XXX */
 	struct vfsconf *vfsp;
 
 	log(LOG_WARNING, "userland calling deprecated sysctl, "
 	    "please rebuild world\n");
 
 #if 1 || defined(COMPAT_PRELITE2)
 	/* Resolve ambiguity between VFS_VFSCONF and VFS_GENERIC. */
 	if (namelen == 1)
 		return (sysctl_ovfs_conf(oidp, arg1, arg2, req));
 #endif
 
 	switch (name[1]) {
 	case VFS_MAXTYPENUM:
 		if (namelen != 2)
 			return (ENOTDIR);
 		return (SYSCTL_OUT(req, &maxvfsconf, sizeof(int)));
 	case VFS_CONF:
 		if (namelen != 3)
 			return (ENOTDIR);	/* overloaded */
 		vfsconf_slock();
 		TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) {
 			if (vfsp->vfc_typenum == name[2])
 				break;
 		}
 		vfsconf_sunlock();
 		if (vfsp == NULL)
 			return (EOPNOTSUPP);
 #ifdef COMPAT_FREEBSD32
 		if (req->flags & SCTL_MASK32)
 			return (vfsconf2x32(req, vfsp));
 		else
 #endif
 			return (vfsconf2x(req, vfsp));
 	}
 	return (EOPNOTSUPP);
 }
 
 static SYSCTL_NODE(_vfs, VFS_GENERIC, generic, CTLFLAG_RD | CTLFLAG_SKIP |
     CTLFLAG_MPSAFE, vfs_sysctl,
     "Generic filesystem");
 
 #if 1 || defined(COMPAT_PRELITE2)
 
 static int
 sysctl_ovfs_conf(SYSCTL_HANDLER_ARGS)
 {
 	int error;
 	struct vfsconf *vfsp;
 	struct ovfsconf ovfs;
 
 	vfsconf_slock();
 	TAILQ_FOREACH(vfsp, &vfsconf, vfc_list) {
 		bzero(&ovfs, sizeof(ovfs));
 		ovfs.vfc_vfsops = vfsp->vfc_vfsops;	/* XXX used as flag */
 		strcpy(ovfs.vfc_name, vfsp->vfc_name);
 		ovfs.vfc_index = vfsp->vfc_typenum;
 		ovfs.vfc_refcount = vfsp->vfc_refcount;
 		ovfs.vfc_flags = vfsp->vfc_flags;
 		error = SYSCTL_OUT(req, &ovfs, sizeof ovfs);
 		if (error != 0) {
 			vfsconf_sunlock();
 			return (error);
 		}
 	}
 	vfsconf_sunlock();
 	return (0);
 }
 
 #endif /* 1 || COMPAT_PRELITE2 */
 #endif /* !BURN_BRIDGES */
 
 #define KINFO_VNODESLOP		10
 #ifdef notyet
 /*
  * Dump vnode list (via sysctl).
  */
 /* ARGSUSED */
 static int
 sysctl_vnode(SYSCTL_HANDLER_ARGS)
 {
 	struct xvnode *xvn;
 	struct mount *mp;
 	struct vnode *vp;
 	int error, len, n;
 
 	/*
 	 * Stale numvnodes access is not fatal here.
 	 */
 	req->lock = 0;
 	len = (numvnodes + KINFO_VNODESLOP) * sizeof *xvn;
 	if (!req->oldptr)
 		/* Make an estimate */
 		return (SYSCTL_OUT(req, 0, len));
 
 	error = sysctl_wire_old_buffer(req, 0);
 	if (error != 0)
 		return (error);
 	xvn = malloc(len, M_TEMP, M_ZERO | M_WAITOK);
 	n = 0;
 	mtx_lock(&mountlist_mtx);
 	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK))
 			continue;
 		MNT_ILOCK(mp);
 		TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
 			if (n == len)
 				break;
 			vref(vp);
 			xvn[n].xv_size = sizeof *xvn;
 			xvn[n].xv_vnode = vp;
 			xvn[n].xv_id = 0;	/* XXX compat */
 #define XV_COPY(field) xvn[n].xv_##field = vp->v_##field
 			XV_COPY(usecount);
 			XV_COPY(writecount);
 			XV_COPY(holdcnt);
 			XV_COPY(mount);
 			XV_COPY(numoutput);
 			XV_COPY(type);
 #undef XV_COPY
 			xvn[n].xv_flag = vp->v_vflag;
 
 			switch (vp->v_type) {
 			case VREG:
 			case VDIR:
 			case VLNK:
 				break;
 			case VBLK:
 			case VCHR:
 				if (vp->v_rdev == NULL) {
 					vrele(vp);
 					continue;
 				}
 				xvn[n].xv_dev = dev2udev(vp->v_rdev);
 				break;
 			case VSOCK:
 				xvn[n].xv_socket = vp->v_socket;
 				break;
 			case VFIFO:
 				xvn[n].xv_fifo = vp->v_fifoinfo;
 				break;
 			case VNON:
 			case VBAD:
 			default:
 				/* shouldn't happen? */
 				vrele(vp);
 				continue;
 			}
 			vrele(vp);
 			++n;
 		}
 		MNT_IUNLOCK(mp);
 		mtx_lock(&mountlist_mtx);
 		vfs_unbusy(mp);
 		if (n == len)
 			break;
 	}
 	mtx_unlock(&mountlist_mtx);
 
 	error = SYSCTL_OUT(req, xvn, n * sizeof *xvn);
 	free(xvn, M_TEMP);
 	return (error);
 }
 
 SYSCTL_PROC(_kern, KERN_VNODE, vnode, CTLTYPE_OPAQUE | CTLFLAG_RD |
     CTLFLAG_MPSAFE, 0, 0, sysctl_vnode, "S,xvnode",
     "");
 #endif
 
 /*
  * Unmount all filesystems. The list is traversed in reverse order
  * of mounting to avoid dependencies.
  */
 void
 vfs_unmountall(void)
 {
 	struct mount *mp;
 	struct thread *td;
 	int error;
 
 	CTR1(KTR_VFS, "%s: unmounting all filesystems", __func__);
 	td = curthread;
 
 	/*
 	 * Since this only runs when rebooting, it is not interlocked.
 	 */
 	while(!TAILQ_EMPTY(&mountlist)) {
 		mp = TAILQ_LAST(&mountlist, mntlist);
 		error = dounmount(mp, MNT_FORCE, td);
 		if (error) {
 			TAILQ_REMOVE(&mountlist, mp, mnt_list);
 			/*
 			 * XXX: Due to the way in which we mount the root
 			 * file system off of devfs, devfs will generate a
 			 * "busy" warning when we try to unmount it before
 			 * the root.  Don't print a warning as a result in
 			 * order to avoid false positive errors that may
 			 * cause needless upset.
 			 */
 			if (strcmp(mp->mnt_vfc->vfc_name, "devfs") != 0) {
 				printf("unmount of %s failed (",
 				    mp->mnt_stat.f_mntonname);
 				if (error == EBUSY)
 					printf("BUSY)\n");
 				else
 					printf("%d)\n", error);
 			}
 		} else {
 			/* The unmount has removed mp from the mountlist */
 		}
 	}
 }
 
 /*
  * perform msync on all vnodes under a mount point
  * the mount point must be locked.
  */
 void
 vfs_msync(struct mount *mp, int flags)
 {
 	struct vnode *vp, *mvp;
 	struct vm_object *obj;
 
 	CTR2(KTR_VFS, "%s: mp %p", __func__, mp);
 	MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) {
 		obj = vp->v_object;
 		if (obj != NULL && (obj->flags & OBJ_MIGHTBEDIRTY) != 0 &&
 		    (flags == MNT_WAIT || VOP_ISLOCKED(vp) == 0)) {
 			if (!vget(vp,
 			    LK_EXCLUSIVE | LK_RETRY | LK_INTERLOCK,
 			    curthread)) {
 				if (vp->v_vflag & VV_NOSYNC) {	/* unlinked */
 					vput(vp);
 					continue;
 				}
 
 				obj = vp->v_object;
 				if (obj != NULL) {
 					VM_OBJECT_WLOCK(obj);
 					vm_object_page_clean(obj, 0, 0,
 					    flags == MNT_WAIT ?
 					    OBJPC_SYNC : OBJPC_NOSYNC);
 					VM_OBJECT_WUNLOCK(obj);
 				}
 				vput(vp);
 			}
 		} else
 			VI_UNLOCK(vp);
 	}
 }
 
 static void
 destroy_vpollinfo_free(struct vpollinfo *vi)
 {
 
 	knlist_destroy(&vi->vpi_selinfo.si_note);
 	mtx_destroy(&vi->vpi_lock);
 	uma_zfree(vnodepoll_zone, vi);
 }
 
 static void
 destroy_vpollinfo(struct vpollinfo *vi)
 {
 
 	knlist_clear(&vi->vpi_selinfo.si_note, 1);
 	seldrain(&vi->vpi_selinfo);
 	destroy_vpollinfo_free(vi);
 }
 
 /*
  * Initialize per-vnode helper structure to hold poll-related state.
  */
 void
 v_addpollinfo(struct vnode *vp)
 {
 	struct vpollinfo *vi;
 
 	if (vp->v_pollinfo != NULL)
 		return;
 	vi = uma_zalloc(vnodepoll_zone, M_WAITOK);
 	mtx_init(&vi->vpi_lock, "vnode pollinfo", NULL, MTX_DEF);
 	knlist_init(&vi->vpi_selinfo.si_note, vp, vfs_knllock,
 	    vfs_knlunlock, vfs_knl_assert_locked, vfs_knl_assert_unlocked);
 	VI_LOCK(vp);
 	if (vp->v_pollinfo != NULL) {
 		VI_UNLOCK(vp);
 		destroy_vpollinfo_free(vi);
 		return;
 	}
 	vp->v_pollinfo = vi;
 	VI_UNLOCK(vp);
 }
 
 /*
  * Record a process's interest in events which might happen to
  * a vnode.  Because poll uses the historic select-style interface
  * internally, this routine serves as both the ``check for any
  * pending events'' and the ``record my interest in future events''
  * functions.  (These are done together, while the lock is held,
  * to avoid race conditions.)
  */
 int
 vn_pollrecord(struct vnode *vp, struct thread *td, int events)
 {
 
 	v_addpollinfo(vp);
 	mtx_lock(&vp->v_pollinfo->vpi_lock);
 	if (vp->v_pollinfo->vpi_revents & events) {
 		/*
 		 * This leaves events we are not interested
 		 * in available for the other process which
 		 * presumably had requested them
 		 * (otherwise they would never have been
 		 * recorded).
 		 */
 		events &= vp->v_pollinfo->vpi_revents;
 		vp->v_pollinfo->vpi_revents &= ~events;
 
 		mtx_unlock(&vp->v_pollinfo->vpi_lock);
 		return (events);
 	}
 	vp->v_pollinfo->vpi_events |= events;
 	selrecord(td, &vp->v_pollinfo->vpi_selinfo);
 	mtx_unlock(&vp->v_pollinfo->vpi_lock);
 	return (0);
 }
 
 /*
  * Routine to create and manage a filesystem syncer vnode.
  */
 #define sync_close ((int (*)(struct  vop_close_args *))nullop)
 static int	sync_fsync(struct  vop_fsync_args *);
 static int	sync_inactive(struct  vop_inactive_args *);
 static int	sync_reclaim(struct  vop_reclaim_args *);
 
 static struct vop_vector sync_vnodeops = {
 	.vop_bypass =	VOP_EOPNOTSUPP,
 	.vop_close =	sync_close,		/* close */
 	.vop_fsync =	sync_fsync,		/* fsync */
 	.vop_inactive =	sync_inactive,	/* inactive */
 	.vop_reclaim =	sync_reclaim,	/* reclaim */
 	.vop_lock1 =	vop_stdlock,	/* lock */
 	.vop_unlock =	vop_stdunlock,	/* unlock */
 	.vop_islocked =	vop_stdislocked,	/* islocked */
 };
 
 /*
  * Create a new filesystem syncer vnode for the specified mount point.
  */
 void
 vfs_allocate_syncvnode(struct mount *mp)
 {
 	struct vnode *vp;
 	struct bufobj *bo;
 	static long start, incr, next;
 	int error;
 
 	/* Allocate a new vnode */
 	error = getnewvnode("syncer", mp, &sync_vnodeops, &vp);
 	if (error != 0)
 		panic("vfs_allocate_syncvnode: getnewvnode() failed");
 	vp->v_type = VNON;
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	vp->v_vflag |= VV_FORCEINSMQ;
 	error = insmntque(vp, mp);
 	if (error != 0)
 		panic("vfs_allocate_syncvnode: insmntque() failed");
 	vp->v_vflag &= ~VV_FORCEINSMQ;
 	VOP_UNLOCK(vp, 0);
 	/*
 	 * Place the vnode onto the syncer worklist. We attempt to
 	 * scatter them about on the list so that they will go off
 	 * at evenly distributed times even if all the filesystems
 	 * are mounted at once.
 	 */
 	next += incr;
 	if (next == 0 || next > syncer_maxdelay) {
 		start /= 2;
 		incr /= 2;
 		if (start == 0) {
 			start = syncer_maxdelay / 2;
 			incr = syncer_maxdelay;
 		}
 		next = start;
 	}
 	bo = &vp->v_bufobj;
 	BO_LOCK(bo);
 	vn_syncer_add_to_worklist(bo, syncdelay > 0 ? next % syncdelay : 0);
 	/* XXX - vn_syncer_add_to_worklist() also grabs and drops sync_mtx. */
 	mtx_lock(&sync_mtx);
 	sync_vnode_count++;
 	if (mp->mnt_syncer == NULL) {
 		mp->mnt_syncer = vp;
 		vp = NULL;
 	}
 	mtx_unlock(&sync_mtx);
 	BO_UNLOCK(bo);
 	if (vp != NULL) {
 		vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 		vgone(vp);
 		vput(vp);
 	}
 }
 
 void
 vfs_deallocate_syncvnode(struct mount *mp)
 {
 	struct vnode *vp;
 
 	mtx_lock(&sync_mtx);
 	vp = mp->mnt_syncer;
 	if (vp != NULL)
 		mp->mnt_syncer = NULL;
 	mtx_unlock(&sync_mtx);
 	if (vp != NULL)
 		vrele(vp);
 }
 
 /*
  * Do a lazy sync of the filesystem.
  */
 static int
 sync_fsync(struct vop_fsync_args *ap)
 {
 	struct vnode *syncvp = ap->a_vp;
 	struct mount *mp = syncvp->v_mount;
 	int error, save;
 	struct bufobj *bo;
 
 	/*
 	 * We only need to do something if this is a lazy evaluation.
 	 */
 	if (ap->a_waitfor != MNT_LAZY)
 		return (0);
 
 	/*
 	 * Move ourselves to the back of the sync list.
 	 */
 	bo = &syncvp->v_bufobj;
 	BO_LOCK(bo);
 	vn_syncer_add_to_worklist(bo, syncdelay);
 	BO_UNLOCK(bo);
 
 	/*
 	 * Walk the list of vnodes pushing all that are dirty and
 	 * not already on the sync list.
 	 */
 	if (vfs_busy(mp, MBF_NOWAIT) != 0)
 		return (0);
 	if (vn_start_write(NULL, &mp, V_NOWAIT) != 0) {
 		vfs_unbusy(mp);
 		return (0);
 	}
 	save = curthread_pflags_set(TDP_SYNCIO);
 	vfs_msync(mp, MNT_NOWAIT);
 	error = VFS_SYNC(mp, MNT_LAZY);
 	curthread_pflags_restore(save);
 	vn_finished_write(mp);
 	vfs_unbusy(mp);
 	return (error);
 }
 
 /*
  * The syncer vnode is no longer referenced.
  */
 static int
 sync_inactive(struct vop_inactive_args *ap)
 {
 
 	vgone(ap->a_vp);
 	return (0);
 }
 
 /*
  * The syncer vnode is no longer needed and is being decommissioned.
  *
  * Modifications to the worklist must be protected by sync_mtx.
  */
 static int
 sync_reclaim(struct vop_reclaim_args *ap)
 {
 	struct vnode *vp = ap->a_vp;
 	struct bufobj *bo;
 
 	bo = &vp->v_bufobj;
 	BO_LOCK(bo);
 	mtx_lock(&sync_mtx);
 	if (vp->v_mount->mnt_syncer == vp)
 		vp->v_mount->mnt_syncer = NULL;
 	if (bo->bo_flag & BO_ONWORKLST) {
 		LIST_REMOVE(bo, bo_synclist);
 		syncer_worklist_len--;
 		sync_vnode_count--;
 		bo->bo_flag &= ~BO_ONWORKLST;
 	}
 	mtx_unlock(&sync_mtx);
 	BO_UNLOCK(bo);
 
 	return (0);
 }
 
 /*
  * Check if vnode represents a disk device
  */
 int
 vn_isdisk(struct vnode *vp, int *errp)
 {
 	int error;
 
 	if (vp->v_type != VCHR) {
 		error = ENOTBLK;
 		goto out;
 	}
 	error = 0;
 	dev_lock();
 	if (vp->v_rdev == NULL)
 		error = ENXIO;
 	else if (vp->v_rdev->si_devsw == NULL)
 		error = ENXIO;
 	else if (!(vp->v_rdev->si_devsw->d_flags & D_DISK))
 		error = ENOTBLK;
 	dev_unlock();
 out:
 	if (errp != NULL)
 		*errp = error;
 	return (error == 0);
 }
 
 /*
  * Common filesystem object access control check routine.  Accepts a
  * vnode's type, "mode", uid and gid, requested access mode, credentials,
  * and optional call-by-reference privused argument allowing vaccess()
  * to indicate to the caller whether privilege was used to satisfy the
  * request (obsoleted).  Returns 0 on success, or an errno on failure.
  */
 int
 vaccess(enum vtype type, mode_t file_mode, uid_t file_uid, gid_t file_gid,
     accmode_t accmode, struct ucred *cred, int *privused)
 {
 	accmode_t dac_granted;
 	accmode_t priv_granted;
 
 	KASSERT((accmode & ~(VEXEC | VWRITE | VREAD | VADMIN | VAPPEND)) == 0,
 	    ("invalid bit in accmode"));
 	KASSERT((accmode & VAPPEND) == 0 || (accmode & VWRITE),
 	    ("VAPPEND without VWRITE"));
 
 	/*
 	 * Look for a normal, non-privileged way to access the file/directory
 	 * as requested.  If it exists, go with that.
 	 */
 
 	if (privused != NULL)
 		*privused = 0;
 
 	dac_granted = 0;
 
 	/* Check the owner. */
 	if (cred->cr_uid == file_uid) {
 		dac_granted |= VADMIN;
 		if (file_mode & S_IXUSR)
 			dac_granted |= VEXEC;
 		if (file_mode & S_IRUSR)
 			dac_granted |= VREAD;
 		if (file_mode & S_IWUSR)
 			dac_granted |= (VWRITE | VAPPEND);
 
 		if ((accmode & dac_granted) == accmode)
 			return (0);
 
 		goto privcheck;
 	}
 
 	/* Otherwise, check the groups (first match) */
 	if (groupmember(file_gid, cred)) {
 		if (file_mode & S_IXGRP)
 			dac_granted |= VEXEC;
 		if (file_mode & S_IRGRP)
 			dac_granted |= VREAD;
 		if (file_mode & S_IWGRP)
 			dac_granted |= (VWRITE | VAPPEND);
 
 		if ((accmode & dac_granted) == accmode)
 			return (0);
 
 		goto privcheck;
 	}
 
 	/* Otherwise, check everyone else. */
 	if (file_mode & S_IXOTH)
 		dac_granted |= VEXEC;
 	if (file_mode & S_IROTH)
 		dac_granted |= VREAD;
 	if (file_mode & S_IWOTH)
 		dac_granted |= (VWRITE | VAPPEND);
 	if ((accmode & dac_granted) == accmode)
 		return (0);
 
 privcheck:
 	/*
 	 * Build a privilege mask to determine if the set of privileges
 	 * satisfies the requirements when combined with the granted mask
 	 * from above.  For each privilege, if the privilege is required,
 	 * bitwise or the request type onto the priv_granted mask.
 	 */
 	priv_granted = 0;
 
 	if (type == VDIR) {
 		/*
 		 * For directories, use PRIV_VFS_LOOKUP to satisfy VEXEC
 		 * requests, instead of PRIV_VFS_EXEC.
 		 */
 		if ((accmode & VEXEC) && ((dac_granted & VEXEC) == 0) &&
 		    !priv_check_cred(cred, PRIV_VFS_LOOKUP, 0))
 			priv_granted |= VEXEC;
 	} else {
 		/*
 		 * Ensure that at least one execute bit is on. Otherwise,
 		 * a privileged user will always succeed, and we don't want
 		 * this to happen unless the file really is executable.
 		 */
 		if ((accmode & VEXEC) && ((dac_granted & VEXEC) == 0) &&
 		    (file_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0 &&
 		    !priv_check_cred(cred, PRIV_VFS_EXEC, 0))
 			priv_granted |= VEXEC;
 	}
 
 	if ((accmode & VREAD) && ((dac_granted & VREAD) == 0) &&
 	    !priv_check_cred(cred, PRIV_VFS_READ, 0))
 		priv_granted |= VREAD;
 
 	if ((accmode & VWRITE) && ((dac_granted & VWRITE) == 0) &&
 	    !priv_check_cred(cred, PRIV_VFS_WRITE, 0))
 		priv_granted |= (VWRITE | VAPPEND);
 
 	if ((accmode & VADMIN) && ((dac_granted & VADMIN) == 0) &&
 	    !priv_check_cred(cred, PRIV_VFS_ADMIN, 0))
 		priv_granted |= VADMIN;
 
 	if ((accmode & (priv_granted | dac_granted)) == accmode) {
 		/* XXX audit: privilege used */
 		if (privused != NULL)
 			*privused = 1;
 		return (0);
 	}
 
 	return ((accmode & VADMIN) ? EPERM : EACCES);
 }
 
 /*
  * Credential check based on process requesting service, and per-attribute
  * permissions.
  */
 int
 extattr_check_cred(struct vnode *vp, int attrnamespace, struct ucred *cred,
     struct thread *td, accmode_t accmode)
 {
 
 	/*
 	 * Kernel-invoked always succeeds.
 	 */
 	if (cred == NOCRED)
 		return (0);
 
 	/*
 	 * Do not allow privileged processes in jail to directly manipulate
 	 * system attributes.
 	 */
 	switch (attrnamespace) {
 	case EXTATTR_NAMESPACE_SYSTEM:
 		/* Potentially should be: return (EPERM); */
 		return (priv_check_cred(cred, PRIV_VFS_EXTATTR_SYSTEM, 0));
 	case EXTATTR_NAMESPACE_USER:
 		return (VOP_ACCESS(vp, accmode, cred, td));
 	default:
 		return (EPERM);
 	}
 }
 
 #ifdef DEBUG_VFS_LOCKS
 /*
  * This only exists to suppress warnings from unlocked specfs accesses.  It is
  * no longer ok to have an unlocked VFS.
  */
 #define	IGNORE_LOCK(vp) (panicstr != NULL || (vp) == NULL ||		\
 	(vp)->v_type == VCHR ||	(vp)->v_type == VBAD)
 
 int vfs_badlock_ddb = 1;	/* Drop into debugger on violation. */
 SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_ddb, CTLFLAG_RW, &vfs_badlock_ddb, 0,
     "Drop into debugger on lock violation");
 
 int vfs_badlock_mutex = 1;	/* Check for interlock across VOPs. */
 SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_mutex, CTLFLAG_RW, &vfs_badlock_mutex,
     0, "Check for interlock across VOPs");
 
 int vfs_badlock_print = 1;	/* Print lock violations. */
 SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_print, CTLFLAG_RW, &vfs_badlock_print,
     0, "Print lock violations");
 
 #ifdef KDB
 int vfs_badlock_backtrace = 1;	/* Print backtrace at lock violations. */
 SYSCTL_INT(_debug, OID_AUTO, vfs_badlock_backtrace, CTLFLAG_RW,
     &vfs_badlock_backtrace, 0, "Print backtrace at lock violations");
 #endif
 
 static void
 vfs_badlock(const char *msg, const char *str, struct vnode *vp)
 {
 
 #ifdef KDB
 	if (vfs_badlock_backtrace)
 		kdb_backtrace();
 #endif
 	if (vfs_badlock_print)
 		printf("%s: %p %s\n", str, (void *)vp, msg);
 	if (vfs_badlock_ddb)
 		kdb_enter(KDB_WHY_VFSLOCK, "lock violation");
 }
 
 void
 assert_vi_locked(struct vnode *vp, const char *str)
 {
 
 	if (vfs_badlock_mutex && !mtx_owned(VI_MTX(vp)))
 		vfs_badlock("interlock is not locked but should be", str, vp);
 }
 
 void
 assert_vi_unlocked(struct vnode *vp, const char *str)
 {
 
 	if (vfs_badlock_mutex && mtx_owned(VI_MTX(vp)))
 		vfs_badlock("interlock is locked but should not be", str, vp);
 }
 
 void
 assert_vop_locked(struct vnode *vp, const char *str)
 {
 	int locked;
 
 	if (!IGNORE_LOCK(vp)) {
 		locked = VOP_ISLOCKED(vp);
 		if (locked == 0 || locked == LK_EXCLOTHER)
 			vfs_badlock("is not locked but should be", str, vp);
 	}
 }
 
 void
 assert_vop_unlocked(struct vnode *vp, const char *str)
 {
 
 	if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) == LK_EXCLUSIVE)
 		vfs_badlock("is locked but should not be", str, vp);
 }
 
 void
 assert_vop_elocked(struct vnode *vp, const char *str)
 {
 
 	if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_EXCLUSIVE)
 		vfs_badlock("is not exclusive locked but should be", str, vp);
 }
 
 #if 0
 void
 assert_vop_elocked_other(struct vnode *vp, const char *str)
 {
 
 	if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_EXCLOTHER)
 		vfs_badlock("is not exclusive locked by another thread",
 		    str, vp);
 }
 
 void
 assert_vop_slocked(struct vnode *vp, const char *str)
 {
 
 	if (!IGNORE_LOCK(vp) && VOP_ISLOCKED(vp) != LK_SHARED)
 		vfs_badlock("is not locked shared but should be", str, vp);
 }
 #endif /* 0 */
 #endif /* DEBUG_VFS_LOCKS */
 
 void
 vop_rename_fail(struct vop_rename_args *ap)
 {
 
 	if (ap->a_tvp != NULL)
 		vput(ap->a_tvp);
 	if (ap->a_tdvp == ap->a_tvp)
 		vrele(ap->a_tdvp);
 	else
 		vput(ap->a_tdvp);
 	vrele(ap->a_fdvp);
 	vrele(ap->a_fvp);
 }
 
 void
 vop_rename_pre(void *ap)
 {
 	struct vop_rename_args *a = ap;
 
 #ifdef DEBUG_VFS_LOCKS
 	if (a->a_tvp)
 		ASSERT_VI_UNLOCKED(a->a_tvp, "VOP_RENAME");
 	ASSERT_VI_UNLOCKED(a->a_tdvp, "VOP_RENAME");
 	ASSERT_VI_UNLOCKED(a->a_fvp, "VOP_RENAME");
 	ASSERT_VI_UNLOCKED(a->a_fdvp, "VOP_RENAME");
 
 	/* Check the source (from). */
 	if (a->a_tdvp->v_vnlock != a->a_fdvp->v_vnlock &&
 	    (a->a_tvp == NULL || a->a_tvp->v_vnlock != a->a_fdvp->v_vnlock))
 		ASSERT_VOP_UNLOCKED(a->a_fdvp, "vop_rename: fdvp locked");
 	if (a->a_tvp == NULL || a->a_tvp->v_vnlock != a->a_fvp->v_vnlock)
 		ASSERT_VOP_UNLOCKED(a->a_fvp, "vop_rename: fvp locked");
 
 	/* Check the target. */
 	if (a->a_tvp)
 		ASSERT_VOP_LOCKED(a->a_tvp, "vop_rename: tvp not locked");
 	ASSERT_VOP_LOCKED(a->a_tdvp, "vop_rename: tdvp not locked");
 #endif
 	if (a->a_tdvp != a->a_fdvp)
 		vhold(a->a_fdvp);
 	if (a->a_tvp != a->a_fvp)
 		vhold(a->a_fvp);
 	vhold(a->a_tdvp);
 	if (a->a_tvp)
 		vhold(a->a_tvp);
 }
 
 void
 vop_strategy_pre(void *ap)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vop_strategy_args *a;
 	struct buf *bp;
 
 	a = ap;
 	bp = a->a_bp;
 
 	/*
 	 * Cluster ops lock their component buffers but not the IO container.
 	 */
 	if ((bp->b_flags & B_CLUSTER) != 0)
 		return;
 
 	if (panicstr == NULL && !BUF_ISLOCKED(bp)) {
 		if (vfs_badlock_print)
 			printf(
 			    "VOP_STRATEGY: bp is not locked but should be\n");
 		if (vfs_badlock_ddb)
 			kdb_enter(KDB_WHY_VFSLOCK, "lock violation");
 	}
 #endif
 }
 
 void
 vop_lock_pre(void *ap)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vop_lock1_args *a = ap;
 
 	if ((a->a_flags & LK_INTERLOCK) == 0)
 		ASSERT_VI_UNLOCKED(a->a_vp, "VOP_LOCK");
 	else
 		ASSERT_VI_LOCKED(a->a_vp, "VOP_LOCK");
 #endif
 }
 
 void
 vop_lock_post(void *ap, int rc)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vop_lock1_args *a = ap;
 
 	ASSERT_VI_UNLOCKED(a->a_vp, "VOP_LOCK");
 	if (rc == 0 && (a->a_flags & LK_EXCLOTHER) == 0)
 		ASSERT_VOP_LOCKED(a->a_vp, "VOP_LOCK");
 #endif
 }
 
 void
 vop_unlock_pre(void *ap)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vop_unlock_args *a = ap;
 
 	if (a->a_flags & LK_INTERLOCK)
 		ASSERT_VI_LOCKED(a->a_vp, "VOP_UNLOCK");
 	ASSERT_VOP_LOCKED(a->a_vp, "VOP_UNLOCK");
 #endif
 }
 
 void
 vop_unlock_post(void *ap, int rc)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vop_unlock_args *a = ap;
 
 	if (a->a_flags & LK_INTERLOCK)
 		ASSERT_VI_UNLOCKED(a->a_vp, "VOP_UNLOCK");
 #endif
 }
 
 void
 vop_create_post(void *ap, int rc)
 {
 	struct vop_create_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE);
 }
 
 void
 vop_deleteextattr_post(void *ap, int rc)
 {
 	struct vop_deleteextattr_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB);
 }
 
 void
 vop_link_post(void *ap, int rc)
 {
 	struct vop_link_args *a = ap;
 
 	if (!rc) {
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_LINK);
 		VFS_KNOTE_LOCKED(a->a_tdvp, NOTE_WRITE);
 	}
 }
 
 void
 vop_mkdir_post(void *ap, int rc)
 {
 	struct vop_mkdir_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE | NOTE_LINK);
 }
 
 void
 vop_mknod_post(void *ap, int rc)
 {
 	struct vop_mknod_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE);
 }
 
 void
 vop_remove_post(void *ap, int rc)
 {
 	struct vop_remove_args *a = ap;
 
 	if (!rc) {
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE);
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_DELETE);
 	}
 }
 
 void
 vop_rename_post(void *ap, int rc)
 {
 	struct vop_rename_args *a = ap;
 
 	if (!rc) {
 		VFS_KNOTE_UNLOCKED(a->a_fdvp, NOTE_WRITE);
 		VFS_KNOTE_UNLOCKED(a->a_tdvp, NOTE_WRITE);
 		VFS_KNOTE_UNLOCKED(a->a_fvp, NOTE_RENAME);
 		if (a->a_tvp)
 			VFS_KNOTE_UNLOCKED(a->a_tvp, NOTE_DELETE);
 	}
 	if (a->a_tdvp != a->a_fdvp)
 		vdrop(a->a_fdvp);
 	if (a->a_tvp != a->a_fvp)
 		vdrop(a->a_fvp);
 	vdrop(a->a_tdvp);
 	if (a->a_tvp)
 		vdrop(a->a_tvp);
 }
 
 void
 vop_rmdir_post(void *ap, int rc)
 {
 	struct vop_rmdir_args *a = ap;
 
 	if (!rc) {
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE | NOTE_LINK);
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_DELETE);
 	}
 }
 
 void
 vop_setattr_post(void *ap, int rc)
 {
 	struct vop_setattr_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB);
 }
 
 void
 vop_setextattr_post(void *ap, int rc)
 {
 	struct vop_setextattr_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_vp, NOTE_ATTRIB);
 }
 
 void
 vop_symlink_post(void *ap, int rc)
 {
 	struct vop_symlink_args *a = ap;
 
 	if (!rc)
 		VFS_KNOTE_LOCKED(a->a_dvp, NOTE_WRITE);
 }
 
 static struct knlist fs_knlist;
 
 static void
 vfs_event_init(void *arg)
 {
 	knlist_init_mtx(&fs_knlist, NULL);
 }
 /* XXX - correct order? */
 SYSINIT(vfs_knlist, SI_SUB_VFS, SI_ORDER_ANY, vfs_event_init, NULL);
 
 void
 vfs_event_signal(fsid_t *fsid, uint32_t event, intptr_t data __unused)
 {
 
 	KNOTE_UNLOCKED(&fs_knlist, event);
 }
 
 static int	filt_fsattach(struct knote *kn);
 static void	filt_fsdetach(struct knote *kn);
 static int	filt_fsevent(struct knote *kn, long hint);
 
 struct filterops fs_filtops = {
 	.f_isfd = 0,
 	.f_attach = filt_fsattach,
 	.f_detach = filt_fsdetach,
 	.f_event = filt_fsevent
 };
 
 static int
 filt_fsattach(struct knote *kn)
 {
 
 	kn->kn_flags |= EV_CLEAR;
 	knlist_add(&fs_knlist, kn, 0);
 	return (0);
 }
 
 static void
 filt_fsdetach(struct knote *kn)
 {
 
 	knlist_remove(&fs_knlist, kn, 0);
 }
 
 static int
 filt_fsevent(struct knote *kn, long hint)
 {
 
 	kn->kn_fflags |= hint;
 	return (kn->kn_fflags != 0);
 }
 
 static int
 sysctl_vfs_ctl(SYSCTL_HANDLER_ARGS)
 {
 	struct vfsidctl vc;
 	int error;
 	struct mount *mp;
 
 	error = SYSCTL_IN(req, &vc, sizeof(vc));
 	if (error)
 		return (error);
 	if (vc.vc_vers != VFS_CTL_VERS1)
 		return (EINVAL);
 	mp = vfs_getvfs(&vc.vc_fsid);
 	if (mp == NULL)
 		return (ENOENT);
 	/* ensure that a specific sysctl goes to the right filesystem. */
 	if (strcmp(vc.vc_fstypename, "*") != 0 &&
 	    strcmp(vc.vc_fstypename, mp->mnt_vfc->vfc_name) != 0) {
 		vfs_rel(mp);
 		return (EINVAL);
 	}
 	VCTLTOREQ(&vc, req);
 	error = VFS_SYSCTL(mp, vc.vc_op, req);
 	vfs_rel(mp);
 	return (error);
 }
 
 SYSCTL_PROC(_vfs, OID_AUTO, ctl, CTLTYPE_OPAQUE | CTLFLAG_WR,
     NULL, 0, sysctl_vfs_ctl, "",
     "Sysctl by fsid");
 
 /*
  * Function to initialize a va_filerev field sensibly.
  * XXX: Wouldn't a random number make a lot more sense ??
  */
 u_quad_t
 init_va_filerev(void)
 {
 	struct bintime bt;
 
 	getbinuptime(&bt);
 	return (((u_quad_t)bt.sec << 32LL) | (bt.frac >> 32LL));
 }
 
 static int	filt_vfsread(struct knote *kn, long hint);
 static int	filt_vfswrite(struct knote *kn, long hint);
 static int	filt_vfsvnode(struct knote *kn, long hint);
 static void	filt_vfsdetach(struct knote *kn);
 static struct filterops vfsread_filtops = {
 	.f_isfd = 1,
 	.f_detach = filt_vfsdetach,
 	.f_event = filt_vfsread
 };
 static struct filterops vfswrite_filtops = {
 	.f_isfd = 1,
 	.f_detach = filt_vfsdetach,
 	.f_event = filt_vfswrite
 };
 static struct filterops vfsvnode_filtops = {
 	.f_isfd = 1,
 	.f_detach = filt_vfsdetach,
 	.f_event = filt_vfsvnode
 };
 
 static void
 vfs_knllock(void *arg)
 {
 	struct vnode *vp = arg;
 
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 }
 
 static void
 vfs_knlunlock(void *arg)
 {
 	struct vnode *vp = arg;
 
 	VOP_UNLOCK(vp, 0);
 }
 
 static void
 vfs_knl_assert_locked(void *arg)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vnode *vp = arg;
 
 	ASSERT_VOP_LOCKED(vp, "vfs_knl_assert_locked");
 #endif
 }
 
 static void
 vfs_knl_assert_unlocked(void *arg)
 {
 #ifdef DEBUG_VFS_LOCKS
 	struct vnode *vp = arg;
 
 	ASSERT_VOP_UNLOCKED(vp, "vfs_knl_assert_unlocked");
 #endif
 }
 
 int
 vfs_kqfilter(struct vop_kqfilter_args *ap)
 {
 	struct vnode *vp = ap->a_vp;
 	struct knote *kn = ap->a_kn;
 	struct knlist *knl;
 
 	switch (kn->kn_filter) {
 	case EVFILT_READ:
 		kn->kn_fop = &vfsread_filtops;
 		break;
 	case EVFILT_WRITE:
 		kn->kn_fop = &vfswrite_filtops;
 		break;
 	case EVFILT_VNODE:
 		kn->kn_fop = &vfsvnode_filtops;
 		break;
 	default:
 		return (EINVAL);
 	}
 
 	kn->kn_hook = (caddr_t)vp;
 
 	v_addpollinfo(vp);
 	if (vp->v_pollinfo == NULL)
 		return (ENOMEM);
 	knl = &vp->v_pollinfo->vpi_selinfo.si_note;
 	vhold(vp);
 	knlist_add(knl, kn, 0);
 
 	return (0);
 }
 
 /*
  * Detach knote from vnode
  */
 static void
 filt_vfsdetach(struct knote *kn)
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 
 	KASSERT(vp->v_pollinfo != NULL, ("Missing v_pollinfo"));
 	knlist_remove(&vp->v_pollinfo->vpi_selinfo.si_note, kn, 0);
 	vdrop(vp);
 }
 
 /*ARGSUSED*/
 static int
 filt_vfsread(struct knote *kn, long hint)
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 	struct vattr va;
 	int res;
 
 	/*
 	 * filesystem is gone, so set the EOF flag and schedule
 	 * the knote for deletion.
 	 */
 	if (hint == NOTE_REVOKE) {
 		VI_LOCK(vp);
 		kn->kn_flags |= (EV_EOF | EV_ONESHOT);
 		VI_UNLOCK(vp);
 		return (1);
 	}
 
 	if (VOP_GETATTR(vp, &va, curthread->td_ucred))
 		return (0);
 
 	VI_LOCK(vp);
 	kn->kn_data = va.va_size - kn->kn_fp->f_offset;
 	res = (kn->kn_data != 0);
 	VI_UNLOCK(vp);
 	return (res);
 }
 
 /*ARGSUSED*/
 static int
 filt_vfswrite(struct knote *kn, long hint)
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 
 	VI_LOCK(vp);
 
 	/*
 	 * filesystem is gone, so set the EOF flag and schedule
 	 * the knote for deletion.
 	 */
 	if (hint == NOTE_REVOKE)
 		kn->kn_flags |= (EV_EOF | EV_ONESHOT);
 
 	kn->kn_data = 0;
 	VI_UNLOCK(vp);
 	return (1);
 }
 
 static int
 filt_vfsvnode(struct knote *kn, long hint)
 {
 	struct vnode *vp = (struct vnode *)kn->kn_hook;
 	int res;
 
 	VI_LOCK(vp);
 	if (kn->kn_sfflags & hint)
 		kn->kn_fflags |= hint;
 	if (hint == NOTE_REVOKE) {
 		kn->kn_flags |= EV_EOF;
 		VI_UNLOCK(vp);
 		return (1);
 	}
 	res = (kn->kn_fflags != 0);
 	VI_UNLOCK(vp);
 	return (res);
 }
 
 int
 vfs_read_dirent(struct vop_readdir_args *ap, struct dirent *dp, off_t off)
 {
 	int error;
 
 	if (dp->d_reclen > ap->a_uio->uio_resid)
 		return (ENAMETOOLONG);
 	error = uiomove(dp, dp->d_reclen, ap->a_uio);
 	if (error) {
 		if (ap->a_ncookies != NULL) {
 			if (ap->a_cookies != NULL)
 				free(ap->a_cookies, M_TEMP);
 			ap->a_cookies = NULL;
 			*ap->a_ncookies = 0;
 		}
 		return (error);
 	}
 	if (ap->a_ncookies == NULL)
 		return (0);
 
 	KASSERT(ap->a_cookies,
 	    ("NULL ap->a_cookies value with non-NULL ap->a_ncookies!"));
 
 	*ap->a_cookies = realloc(*ap->a_cookies,
 	    (*ap->a_ncookies + 1) * sizeof(u_long), M_TEMP, M_WAITOK | M_ZERO);
 	(*ap->a_cookies)[*ap->a_ncookies] = off;
 	return (0);
 }
 
 /*
  * Mark for update the access time of the file if the filesystem
  * supports VOP_MARKATIME.  This functionality is used by execve and
  * mmap, so we want to avoid the I/O implied by directly setting
  * va_atime for the sake of efficiency.
  */
 void
 vfs_mark_atime(struct vnode *vp, struct ucred *cred)
 {
 	struct mount *mp;
 
 	mp = vp->v_mount;
 	ASSERT_VOP_LOCKED(vp, "vfs_mark_atime");
 	if (mp != NULL && (mp->mnt_flag & (MNT_NOATIME | MNT_RDONLY)) == 0)
 		(void)VOP_MARKATIME(vp);
 }
 
 /*
  * The purpose of this routine is to remove granularity from accmode_t,
  * reducing it into standard unix access bits - VEXEC, VREAD, VWRITE,
  * VADMIN and VAPPEND.
  *
  * If it returns 0, the caller is supposed to continue with the usual
  * access checks using 'accmode' as modified by this routine.  If it
  * returns nonzero value, the caller is supposed to return that value
  * as errno.
  *
  * Note that after this routine runs, accmode may be zero.
  */
 int
 vfs_unixify_accmode(accmode_t *accmode)
 {
 	/*
 	 * There is no way to specify explicit "deny" rule using
 	 * file mode or POSIX.1e ACLs.
 	 */
 	if (*accmode & VEXPLICIT_DENY) {
 		*accmode = 0;
 		return (0);
 	}
 
 	/*
 	 * None of these can be translated into usual access bits.
 	 * Also, the common case for NFSv4 ACLs is to not contain
 	 * either of these bits. Caller should check for VWRITE
 	 * on the containing directory instead.
 	 */
 	if (*accmode & (VDELETE_CHILD | VDELETE))
 		return (EPERM);
 
 	if (*accmode & VADMIN_PERMS) {
 		*accmode &= ~VADMIN_PERMS;
 		*accmode |= VADMIN;
 	}
 
 	/*
 	 * There is no way to deny VREAD_ATTRIBUTES, VREAD_ACL
 	 * or VSYNCHRONIZE using file mode or POSIX.1e ACL.
 	 */
 	*accmode &= ~(VSTAT_PERMS | VSYNCHRONIZE);
 
 	return (0);
 }
 
 /*
  * These are helper functions for filesystems to traverse all
  * their vnodes.  See MNT_VNODE_FOREACH_ALL() in sys/mount.h.
  *
  * This interface replaces MNT_VNODE_FOREACH.
  */
 
 MALLOC_DEFINE(M_VNODE_MARKER, "vnodemarker", "vnode marker");
 
 struct vnode *
 __mnt_vnode_next_all(struct vnode **mvp, struct mount *mp)
 {
 	struct vnode *vp;
 
 	if (should_yield())
 		kern_yield(PRI_USER);
 	MNT_ILOCK(mp);
 	KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch"));
 	vp = TAILQ_NEXT(*mvp, v_nmntvnodes);
 	while (vp != NULL && (vp->v_type == VMARKER ||
 	    (vp->v_iflag & VI_DOOMED) != 0))
 		vp = TAILQ_NEXT(vp, v_nmntvnodes);
 
 	/* Check if we are done */
 	if (vp == NULL) {
 		__mnt_vnode_markerfree_all(mvp, mp);
 		/* MNT_IUNLOCK(mp); -- done in above function */
 		mtx_assert(MNT_MTX(mp), MA_NOTOWNED);
 		return (NULL);
 	}
 	TAILQ_REMOVE(&mp->mnt_nvnodelist, *mvp, v_nmntvnodes);
 	TAILQ_INSERT_AFTER(&mp->mnt_nvnodelist, vp, *mvp, v_nmntvnodes);
 	VI_LOCK(vp);
 	MNT_IUNLOCK(mp);
 	return (vp);
 }
 
 struct vnode *
 __mnt_vnode_first_all(struct vnode **mvp, struct mount *mp)
 {
 	struct vnode *vp;
 
 	*mvp = malloc(sizeof(struct vnode), M_VNODE_MARKER, M_WAITOK | M_ZERO);
 	MNT_ILOCK(mp);
 	MNT_REF(mp);
 	(*mvp)->v_type = VMARKER;
 
 	vp = TAILQ_FIRST(&mp->mnt_nvnodelist);
 	while (vp != NULL && (vp->v_type == VMARKER ||
 	    (vp->v_iflag & VI_DOOMED) != 0))
 		vp = TAILQ_NEXT(vp, v_nmntvnodes);
 
 	/* Check if we are done */
 	if (vp == NULL) {
 		MNT_REL(mp);
 		MNT_IUNLOCK(mp);
 		free(*mvp, M_VNODE_MARKER);
 		*mvp = NULL;
 		return (NULL);
 	}
 	(*mvp)->v_mount = mp;
 	TAILQ_INSERT_AFTER(&mp->mnt_nvnodelist, vp, *mvp, v_nmntvnodes);
 	VI_LOCK(vp);
 	MNT_IUNLOCK(mp);
 	return (vp);
 }
 
 
 void
 __mnt_vnode_markerfree_all(struct vnode **mvp, struct mount *mp)
 {
 
 	if (*mvp == NULL) {
 		MNT_IUNLOCK(mp);
 		return;
 	}
 
 	mtx_assert(MNT_MTX(mp), MA_OWNED);
 
 	KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch"));
 	TAILQ_REMOVE(&mp->mnt_nvnodelist, *mvp, v_nmntvnodes);
 	MNT_REL(mp);
 	MNT_IUNLOCK(mp);
 	free(*mvp, M_VNODE_MARKER);
 	*mvp = NULL;
 }
 
 /*
  * These are helper functions for filesystems to traverse their
  * active vnodes.  See MNT_VNODE_FOREACH_ACTIVE() in sys/mount.h
  */
 static void
 mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *mp)
 {
 
 	KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch"));
 
 	MNT_ILOCK(mp);
 	MNT_REL(mp);
 	MNT_IUNLOCK(mp);
 	free(*mvp, M_VNODE_MARKER);
 	*mvp = NULL;
 }
 
 static struct vnode *
 mnt_vnode_next_active(struct vnode **mvp, struct mount *mp)
 {
 	struct vnode *vp, *nvp;
 
 	mtx_assert(&vnode_free_list_mtx, MA_OWNED);
 	KASSERT((*mvp)->v_mount == mp, ("marker vnode mount list mismatch"));
 restart:
 	vp = TAILQ_NEXT(*mvp, v_actfreelist);
 	TAILQ_REMOVE(&mp->mnt_activevnodelist, *mvp, v_actfreelist);
 	while (vp != NULL) {
 		if (vp->v_type == VMARKER) {
 			vp = TAILQ_NEXT(vp, v_actfreelist);
 			continue;
 		}
 		if (!VI_TRYLOCK(vp)) {
 			if (mp_ncpus == 1 || should_yield()) {
 				TAILQ_INSERT_BEFORE(vp, *mvp, v_actfreelist);
 				mtx_unlock(&vnode_free_list_mtx);
 				pause("vnacti", 1);
 				mtx_lock(&vnode_free_list_mtx);
 				goto restart;
 			}
 			continue;
 		}
 		KASSERT(vp->v_type != VMARKER, ("locked marker %p", vp));
 		KASSERT(vp->v_mount == mp || vp->v_mount == NULL,
 		    ("alien vnode on the active list %p %p", vp, mp));
 		if (vp->v_mount == mp && (vp->v_iflag & VI_DOOMED) == 0)
 			break;
 		nvp = TAILQ_NEXT(vp, v_actfreelist);
 		VI_UNLOCK(vp);
 		vp = nvp;
 	}
 
 	/* Check if we are done */
 	if (vp == NULL) {
 		mtx_unlock(&vnode_free_list_mtx);
 		mnt_vnode_markerfree_active(mvp, mp);
 		return (NULL);
 	}
 	TAILQ_INSERT_AFTER(&mp->mnt_activevnodelist, vp, *mvp, v_actfreelist);
 	mtx_unlock(&vnode_free_list_mtx);
 	ASSERT_VI_LOCKED(vp, "active iter");
 	KASSERT((vp->v_iflag & VI_ACTIVE) != 0, ("Non-active vp %p", vp));
 	return (vp);
 }
 
 struct vnode *
 __mnt_vnode_next_active(struct vnode **mvp, struct mount *mp)
 {
 
 	if (should_yield())
 		kern_yield(PRI_USER);
 	mtx_lock(&vnode_free_list_mtx);
 	return (mnt_vnode_next_active(mvp, mp));
 }
 
 struct vnode *
 __mnt_vnode_first_active(struct vnode **mvp, struct mount *mp)
 {
 	struct vnode *vp;
 
 	*mvp = malloc(sizeof(struct vnode), M_VNODE_MARKER, M_WAITOK | M_ZERO);
 	MNT_ILOCK(mp);
 	MNT_REF(mp);
 	MNT_IUNLOCK(mp);
 	(*mvp)->v_type = VMARKER;
 	(*mvp)->v_mount = mp;
 
 	mtx_lock(&vnode_free_list_mtx);
 	vp = TAILQ_FIRST(&mp->mnt_activevnodelist);
 	if (vp == NULL) {
 		mtx_unlock(&vnode_free_list_mtx);
 		mnt_vnode_markerfree_active(mvp, mp);
 		return (NULL);
 	}
 	TAILQ_INSERT_BEFORE(vp, *mvp, v_actfreelist);
 	return (mnt_vnode_next_active(mvp, mp));
 }
 
 void
 __mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *mp)
 {
 
 	if (*mvp == NULL)
 		return;
 
 	mtx_lock(&vnode_free_list_mtx);
 	TAILQ_REMOVE(&mp->mnt_activevnodelist, *mvp, v_actfreelist);
 	mtx_unlock(&vnode_free_list_mtx);
 	mnt_vnode_markerfree_active(mvp, mp);
 }
Index: user/ngie/more-tests/sys/kern/vfs_syscalls.c
===================================================================
--- user/ngie/more-tests/sys/kern/vfs_syscalls.c	(revision 281584)
+++ user/ngie/more-tests/sys/kern/vfs_syscalls.c	(revision 281585)
@@ -1,4753 +1,4760 @@
 /*-
  * Copyright (c) 1989, 1993
  *	The Regents of the University of California.  All rights reserved.
  * (c) UNIX System Laboratories, Inc.
  * All or some portions of this file are derived from material licensed
  * to the University of California by American Telephone and Telegraph
  * Co. or Unix System Laboratories, Inc. and are reproduced herein with
  * the permission of UNIX System Laboratories, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)vfs_syscalls.c	8.13 (Berkeley) 4/15/94
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_capsicum.h"
 #include "opt_compat.h"
 #include "opt_ktrace.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/bio.h>
 #include <sys/buf.h>
 #include <sys/capsicum.h>
 #include <sys/disk.h>
 #include <sys/sysent.h>
 #include <sys/malloc.h>
 #include <sys/mount.h>
 #include <sys/mutex.h>
 #include <sys/sysproto.h>
 #include <sys/namei.h>
 #include <sys/filedesc.h>
 #include <sys/kernel.h>
 #include <sys/fcntl.h>
 #include <sys/file.h>
 #include <sys/filio.h>
 #include <sys/limits.h>
 #include <sys/linker.h>
 #include <sys/rwlock.h>
 #include <sys/sdt.h>
 #include <sys/stat.h>
 #include <sys/sx.h>
 #include <sys/unistd.h>
 #include <sys/vnode.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/dirent.h>
 #include <sys/jail.h>
 #include <sys/syscallsubr.h>
 #include <sys/sysctl.h>
 #ifdef KTRACE
 #include <sys/ktrace.h>
 #endif
 
 #include <machine/stdarg.h>
 
 #include <security/audit/audit.h>
 #include <security/mac/mac_framework.h>
 
 #include <vm/vm.h>
 #include <vm/vm_object.h>
 #include <vm/vm_page.h>
 #include <vm/uma.h>
 
 #include <ufs/ufs/quota.h>
 
 MALLOC_DEFINE(M_FADVISE, "fadvise", "posix_fadvise(2) information");
 
 SDT_PROVIDER_DEFINE(vfs);
 SDT_PROBE_DEFINE2(vfs, , stat, mode, "char *", "int");
 SDT_PROBE_DEFINE2(vfs, , stat, reg, "char *", "int");
 
 static int chroot_refuse_vdir_fds(struct filedesc *fdp);
 static int kern_chflagsat(struct thread *td, int fd, const char *path,
     enum uio_seg pathseg, u_long flags, int atflag);
 static int setfflags(struct thread *td, struct vnode *, u_long);
 static int getutimes(const struct timeval *, enum uio_seg, struct timespec *);
 static int getutimens(const struct timespec *, enum uio_seg,
     struct timespec *, int *);
 static int setutimes(struct thread *td, struct vnode *,
     const struct timespec *, int, int);
 static int vn_access(struct vnode *vp, int user_flags, struct ucred *cred,
     struct thread *td);
 
 /*
  * The module initialization routine for POSIX asynchronous I/O will
  * set this to the version of AIO that it implements.  (Zero means
  * that it is not implemented.)  This value is used here by pathconf()
  * and in kern_descrip.c by fpathconf().
  */
 int async_io_version;
 
 /*
  * Sync each mounted filesystem.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct sync_args {
 	int     dummy;
 };
 #endif
 /* ARGSUSED */
 int
 sys_sync(td, uap)
 	struct thread *td;
 	struct sync_args *uap;
 {
 	struct mount *mp, *nmp;
 	int save;
 
 	mtx_lock(&mountlist_mtx);
 	for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
 		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
 			nmp = TAILQ_NEXT(mp, mnt_list);
 			continue;
 		}
 		if ((mp->mnt_flag & MNT_RDONLY) == 0 &&
 		    vn_start_write(NULL, &mp, V_NOWAIT) == 0) {
 			save = curthread_pflags_set(TDP_SYNCIO);
 			vfs_msync(mp, MNT_NOWAIT);
 			VFS_SYNC(mp, MNT_NOWAIT);
 			curthread_pflags_restore(save);
 			vn_finished_write(mp);
 		}
 		mtx_lock(&mountlist_mtx);
 		nmp = TAILQ_NEXT(mp, mnt_list);
 		vfs_unbusy(mp);
 	}
 	mtx_unlock(&mountlist_mtx);
 	return (0);
 }
 
 /*
  * Change filesystem quotas.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct quotactl_args {
 	char *path;
 	int cmd;
 	int uid;
 	caddr_t arg;
 };
 #endif
 int
 sys_quotactl(td, uap)
 	struct thread *td;
 	register struct quotactl_args /* {
 		char *path;
 		int cmd;
 		int uid;
 		caddr_t arg;
 	} */ *uap;
 {
 	struct mount *mp;
 	struct nameidata nd;
 	int error;
 
 	AUDIT_ARG_CMD(uap->cmd);
 	AUDIT_ARG_UID(uap->uid);
 	if (!prison_allow(td->td_ucred, PR_ALLOW_QUOTAS))
 		return (EPERM);
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE,
 	    uap->path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	mp = nd.ni_vp->v_mount;
 	vfs_ref(mp);
 	vput(nd.ni_vp);
 	error = vfs_busy(mp, 0);
 	vfs_rel(mp);
 	if (error != 0)
 		return (error);
 	error = VFS_QUOTACTL(mp, uap->cmd, uap->uid, uap->arg);
 
 	/*
 	 * Since the quota-on operation typically needs to open the quota
 	 * file, the Q_QUOTAON handler needs to unbusy the mount point
 	 * before calling into namei.  Otherwise, unmount might be
 	 * started between two vfs_busy() invocations (first is ours,
 	 * second is from mount point cross-walk code in lookup()),
 	 * causing deadlock.
 	 *
 	 * Require that Q_QUOTAON handles the vfs_busy() reference on
 	 * its own, always returning with an unbusied mount point.
 	 */
 	if ((uap->cmd >> SUBCMDSHIFT) != Q_QUOTAON)
 		vfs_unbusy(mp);
 	return (error);
 }
 
 /*
  * Used by statfs conversion routines to scale the block size up if
  * necessary so that all of the block counts are <= 'max_size'.  Note
  * that 'max_size' should be a bitmask, i.e. 2^n - 1 for some non-zero
  * value of 'n'.
  */
 void
 statfs_scale_blocks(struct statfs *sf, long max_size)
 {
 	uint64_t count;
 	int shift;
 
 	KASSERT(powerof2(max_size + 1), ("%s: invalid max_size", __func__));
 
 	/*
 	 * Attempt to scale the block counts to give a more accurate
 	 * overview to userland of the ratio of free space to used
 	 * space.  To do this, find the largest block count and compute
 	 * a divisor that lets it fit into a signed integer <= max_size.
 	 */
 	if (sf->f_bavail < 0)
 		count = -sf->f_bavail;
 	else
 		count = sf->f_bavail;
 	count = MAX(sf->f_blocks, MAX(sf->f_bfree, count));
 	if (count <= max_size)
 		return;
 
 	count >>= flsl(max_size);
 	shift = 0;
 	while (count > 0) {
 		shift++;
 		count >>=1;
 	}
 
 	sf->f_bsize <<= shift;
 	sf->f_blocks >>= shift;
 	sf->f_bfree >>= shift;
 	sf->f_bavail >>= shift;
 }
 
 /*
  * Get filesystem statistics.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct statfs_args {
 	char *path;
 	struct statfs *buf;
 };
 #endif
 int
 sys_statfs(td, uap)
 	struct thread *td;
 	register struct statfs_args /* {
 		char *path;
 		struct statfs *buf;
 	} */ *uap;
 {
 	struct statfs sf;
 	int error;
 
 	error = kern_statfs(td, uap->path, UIO_USERSPACE, &sf);
 	if (error == 0)
 		error = copyout(&sf, uap->buf, sizeof(sf));
 	return (error);
 }
 
 int
 kern_statfs(struct thread *td, char *path, enum uio_seg pathseg,
     struct statfs *buf)
 {
 	struct mount *mp;
 	struct statfs *sp, sb;
 	struct nameidata nd;
 	int error;
 
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1,
 	    pathseg, path, td);
 	error = namei(&nd);
 	if (error != 0)
 		return (error);
 	mp = nd.ni_vp->v_mount;
 	vfs_ref(mp);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_vp);
 	error = vfs_busy(mp, 0);
 	vfs_rel(mp);
 	if (error != 0)
 		return (error);
 #ifdef MAC
 	error = mac_mount_check_stat(td->td_ucred, mp);
 	if (error != 0)
 		goto out;
 #endif
 	/*
 	 * Set these in case the underlying filesystem fails to do so.
 	 */
 	sp = &mp->mnt_stat;
 	sp->f_version = STATFS_VERSION;
 	sp->f_namemax = NAME_MAX;
 	sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
 	error = VFS_STATFS(mp, sp);
 	if (error != 0)
 		goto out;
 	if (priv_check(td, PRIV_VFS_GENERATION)) {
 		bcopy(sp, &sb, sizeof(sb));
 		sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0;
 		prison_enforce_statfs(td->td_ucred, mp, &sb);
 		sp = &sb;
 	}
 	*buf = *sp;
 out:
 	vfs_unbusy(mp);
 	return (error);
 }
 
 /*
  * Get filesystem statistics.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fstatfs_args {
 	int fd;
 	struct statfs *buf;
 };
 #endif
 int
 sys_fstatfs(td, uap)
 	struct thread *td;
 	register struct fstatfs_args /* {
 		int fd;
 		struct statfs *buf;
 	} */ *uap;
 {
 	struct statfs sf;
 	int error;
 
 	error = kern_fstatfs(td, uap->fd, &sf);
 	if (error == 0)
 		error = copyout(&sf, uap->buf, sizeof(sf));
 	return (error);
 }
 
 int
 kern_fstatfs(struct thread *td, int fd, struct statfs *buf)
 {
 	struct file *fp;
 	struct mount *mp;
 	struct statfs *sp, sb;
 	struct vnode *vp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(fd);
 	error = getvnode(td->td_proc->p_fd, fd,
 	    cap_rights_init(&rights, CAP_FSTATFS), &fp);
 	if (error != 0)
 		return (error);
 	vp = fp->f_vnode;
 	vn_lock(vp, LK_SHARED | LK_RETRY);
 #ifdef AUDIT
 	AUDIT_ARG_VNODE1(vp);
 #endif
 	mp = vp->v_mount;
 	if (mp)
 		vfs_ref(mp);
 	VOP_UNLOCK(vp, 0);
 	fdrop(fp, td);
 	if (mp == NULL) {
 		error = EBADF;
 		goto out;
 	}
 	error = vfs_busy(mp, 0);
 	vfs_rel(mp);
 	if (error != 0)
 		return (error);
 #ifdef MAC
 	error = mac_mount_check_stat(td->td_ucred, mp);
 	if (error != 0)
 		goto out;
 #endif
 	/*
 	 * Set these in case the underlying filesystem fails to do so.
 	 */
 	sp = &mp->mnt_stat;
 	sp->f_version = STATFS_VERSION;
 	sp->f_namemax = NAME_MAX;
 	sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
 	error = VFS_STATFS(mp, sp);
 	if (error != 0)
 		goto out;
 	if (priv_check(td, PRIV_VFS_GENERATION)) {
 		bcopy(sp, &sb, sizeof(sb));
 		sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0;
 		prison_enforce_statfs(td->td_ucred, mp, &sb);
 		sp = &sb;
 	}
 	*buf = *sp;
 out:
 	if (mp)
 		vfs_unbusy(mp);
 	return (error);
 }
 
 /*
  * Get statistics on all filesystems.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct getfsstat_args {
 	struct statfs *buf;
 	long bufsize;
 	int flags;
 };
 #endif
 int
 sys_getfsstat(td, uap)
 	struct thread *td;
 	register struct getfsstat_args /* {
 		struct statfs *buf;
 		long bufsize;
 		int flags;
 	} */ *uap;
 {
+	size_t count;
+	int error;
 
-	return (kern_getfsstat(td, &uap->buf, uap->bufsize, UIO_USERSPACE,
-	    uap->flags));
+	error = kern_getfsstat(td, &uap->buf, uap->bufsize, &count,
+	    UIO_USERSPACE, uap->flags);
+	if (error == 0)
+		td->td_retval[0] = count;
+	return (error);
 }
 
 /*
  * If (bufsize > 0 && bufseg == UIO_SYSSPACE)
  *	The caller is responsible for freeing memory which will be allocated
  *	in '*buf'.
  */
 int
 kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
-    enum uio_seg bufseg, int flags)
+    size_t *countp, enum uio_seg bufseg, int flags)
 {
 	struct mount *mp, *nmp;
 	struct statfs *sfsp, *sp, sb;
 	size_t count, maxcount;
 	int error;
 
 	maxcount = bufsize / sizeof(struct statfs);
 	if (bufsize == 0)
 		sfsp = NULL;
 	else if (bufseg == UIO_USERSPACE)
 		sfsp = *buf;
 	else /* if (bufseg == UIO_SYSSPACE) */ {
 		count = 0;
 		mtx_lock(&mountlist_mtx);
 		TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 			count++;
 		}
 		mtx_unlock(&mountlist_mtx);
 		if (maxcount > count)
 			maxcount = count;
 		sfsp = *buf = malloc(maxcount * sizeof(struct statfs), M_TEMP,
 		    M_WAITOK);
 	}
 	count = 0;
 	mtx_lock(&mountlist_mtx);
 	for (mp = TAILQ_FIRST(&mountlist); mp != NULL; mp = nmp) {
 		if (prison_canseemount(td->td_ucred, mp) != 0) {
 			nmp = TAILQ_NEXT(mp, mnt_list);
 			continue;
 		}
 #ifdef MAC
 		if (mac_mount_check_stat(td->td_ucred, mp) != 0) {
 			nmp = TAILQ_NEXT(mp, mnt_list);
 			continue;
 		}
 #endif
 		if (vfs_busy(mp, MBF_NOWAIT | MBF_MNTLSTLOCK)) {
 			nmp = TAILQ_NEXT(mp, mnt_list);
 			continue;
 		}
 		if (sfsp && count < maxcount) {
 			sp = &mp->mnt_stat;
 			/*
 			 * Set these in case the underlying filesystem
 			 * fails to do so.
 			 */
 			sp->f_version = STATFS_VERSION;
 			sp->f_namemax = NAME_MAX;
 			sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
 			/*
 			 * If MNT_NOWAIT or MNT_LAZY is specified, do not
 			 * refresh the fsstat cache. MNT_NOWAIT or MNT_LAZY
 			 * overrides MNT_WAIT.
 			 */
 			if (((flags & (MNT_LAZY|MNT_NOWAIT)) == 0 ||
 			    (flags & MNT_WAIT)) &&
 			    (error = VFS_STATFS(mp, sp))) {
 				mtx_lock(&mountlist_mtx);
 				nmp = TAILQ_NEXT(mp, mnt_list);
 				vfs_unbusy(mp);
 				continue;
 			}
 			if (priv_check(td, PRIV_VFS_GENERATION)) {
 				bcopy(sp, &sb, sizeof(sb));
 				sb.f_fsid.val[0] = sb.f_fsid.val[1] = 0;
 				prison_enforce_statfs(td->td_ucred, mp, &sb);
 				sp = &sb;
 			}
 			if (bufseg == UIO_SYSSPACE)
 				bcopy(sp, sfsp, sizeof(*sp));
 			else /* if (bufseg == UIO_USERSPACE) */ {
 				error = copyout(sp, sfsp, sizeof(*sp));
 				if (error != 0) {
 					vfs_unbusy(mp);
 					return (error);
 				}
 			}
 			sfsp++;
 		}
 		count++;
 		mtx_lock(&mountlist_mtx);
 		nmp = TAILQ_NEXT(mp, mnt_list);
 		vfs_unbusy(mp);
 	}
 	mtx_unlock(&mountlist_mtx);
 	if (sfsp && count > maxcount)
-		td->td_retval[0] = maxcount;
+		*countp = maxcount;
 	else
-		td->td_retval[0] = count;
+		*countp = count;
 	return (0);
 }
 
 #ifdef COMPAT_FREEBSD4
 /*
  * Get old format filesystem statistics.
  */
 static void cvtstatfs(struct statfs *, struct ostatfs *);
 
 #ifndef _SYS_SYSPROTO_H_
 struct freebsd4_statfs_args {
 	char *path;
 	struct ostatfs *buf;
 };
 #endif
 int
 freebsd4_statfs(td, uap)
 	struct thread *td;
 	struct freebsd4_statfs_args /* {
 		char *path;
 		struct ostatfs *buf;
 	} */ *uap;
 {
 	struct ostatfs osb;
 	struct statfs sf;
 	int error;
 
 	error = kern_statfs(td, uap->path, UIO_USERSPACE, &sf);
 	if (error != 0)
 		return (error);
 	cvtstatfs(&sf, &osb);
 	return (copyout(&osb, uap->buf, sizeof(osb)));
 }
 
 /*
  * Get filesystem statistics.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct freebsd4_fstatfs_args {
 	int fd;
 	struct ostatfs *buf;
 };
 #endif
 int
 freebsd4_fstatfs(td, uap)
 	struct thread *td;
 	struct freebsd4_fstatfs_args /* {
 		int fd;
 		struct ostatfs *buf;
 	} */ *uap;
 {
 	struct ostatfs osb;
 	struct statfs sf;
 	int error;
 
 	error = kern_fstatfs(td, uap->fd, &sf);
 	if (error != 0)
 		return (error);
 	cvtstatfs(&sf, &osb);
 	return (copyout(&osb, uap->buf, sizeof(osb)));
 }
 
 /*
  * Get statistics on all filesystems.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct freebsd4_getfsstat_args {
 	struct ostatfs *buf;
 	long bufsize;
 	int flags;
 };
 #endif
 int
 freebsd4_getfsstat(td, uap)
 	struct thread *td;
 	register struct freebsd4_getfsstat_args /* {
 		struct ostatfs *buf;
 		long bufsize;
 		int flags;
 	} */ *uap;
 {
 	struct statfs *buf, *sp;
 	struct ostatfs osb;
 	size_t count, size;
 	int error;
 
 	count = uap->bufsize / sizeof(struct ostatfs);
 	size = count * sizeof(struct statfs);
-	error = kern_getfsstat(td, &buf, size, UIO_SYSSPACE, uap->flags);
+	error = kern_getfsstat(td, &buf, size, &count, UIO_SYSSPACE,
+	    uap->flags);
+	if (error == 0)
+		td->td_retval[0] = count;
 	if (size > 0) {
-		count = td->td_retval[0];
 		sp = buf;
 		while (count > 0 && error == 0) {
 			cvtstatfs(sp, &osb);
 			error = copyout(&osb, uap->buf, sizeof(osb));
 			sp++;
 			uap->buf++;
 			count--;
 		}
 		free(buf, M_TEMP);
 	}
 	return (error);
 }
 
 /*
  * Implement fstatfs() for (NFS) file handles.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct freebsd4_fhstatfs_args {
 	struct fhandle *u_fhp;
 	struct ostatfs *buf;
 };
 #endif
 int
 freebsd4_fhstatfs(td, uap)
 	struct thread *td;
 	struct freebsd4_fhstatfs_args /* {
 		struct fhandle *u_fhp;
 		struct ostatfs *buf;
 	} */ *uap;
 {
 	struct ostatfs osb;
 	struct statfs sf;
 	fhandle_t fh;
 	int error;
 
 	error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t));
 	if (error != 0)
 		return (error);
 	error = kern_fhstatfs(td, fh, &sf);
 	if (error != 0)
 		return (error);
 	cvtstatfs(&sf, &osb);
 	return (copyout(&osb, uap->buf, sizeof(osb)));
 }
 
 /*
  * Convert a new format statfs structure to an old format statfs structure.
  */
 static void
 cvtstatfs(nsp, osp)
 	struct statfs *nsp;
 	struct ostatfs *osp;
 {
 
 	statfs_scale_blocks(nsp, LONG_MAX);
 	bzero(osp, sizeof(*osp));
 	osp->f_bsize = nsp->f_bsize;
 	osp->f_iosize = MIN(nsp->f_iosize, LONG_MAX);
 	osp->f_blocks = nsp->f_blocks;
 	osp->f_bfree = nsp->f_bfree;
 	osp->f_bavail = nsp->f_bavail;
 	osp->f_files = MIN(nsp->f_files, LONG_MAX);
 	osp->f_ffree = MIN(nsp->f_ffree, LONG_MAX);
 	osp->f_owner = nsp->f_owner;
 	osp->f_type = nsp->f_type;
 	osp->f_flags = nsp->f_flags;
 	osp->f_syncwrites = MIN(nsp->f_syncwrites, LONG_MAX);
 	osp->f_asyncwrites = MIN(nsp->f_asyncwrites, LONG_MAX);
 	osp->f_syncreads = MIN(nsp->f_syncreads, LONG_MAX);
 	osp->f_asyncreads = MIN(nsp->f_asyncreads, LONG_MAX);
 	strlcpy(osp->f_fstypename, nsp->f_fstypename,
 	    MIN(MFSNAMELEN, OMFSNAMELEN));
 	strlcpy(osp->f_mntonname, nsp->f_mntonname,
 	    MIN(MNAMELEN, OMNAMELEN));
 	strlcpy(osp->f_mntfromname, nsp->f_mntfromname,
 	    MIN(MNAMELEN, OMNAMELEN));
 	osp->f_fsid = nsp->f_fsid;
 }
 #endif /* COMPAT_FREEBSD4 */
 
 /*
  * Change current working directory to a given file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fchdir_args {
 	int	fd;
 };
 #endif
 int
 sys_fchdir(td, uap)
 	struct thread *td;
 	struct fchdir_args /* {
 		int fd;
 	} */ *uap;
 {
 	register struct filedesc *fdp = td->td_proc->p_fd;
 	struct vnode *vp, *tdp, *vpold;
 	struct mount *mp;
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(uap->fd);
 	error = getvnode(fdp, uap->fd, cap_rights_init(&rights, CAP_FCHDIR),
 	    &fp);
 	if (error != 0)
 		return (error);
 	vp = fp->f_vnode;
 	VREF(vp);
 	fdrop(fp, td);
 	vn_lock(vp, LK_SHARED | LK_RETRY);
 	AUDIT_ARG_VNODE1(vp);
 	error = change_dir(vp, td);
 	while (!error && (mp = vp->v_mountedhere) != NULL) {
 		if (vfs_busy(mp, 0))
 			continue;
 		error = VFS_ROOT(mp, LK_SHARED, &tdp);
 		vfs_unbusy(mp);
 		if (error != 0)
 			break;
 		vput(vp);
 		vp = tdp;
 	}
 	if (error != 0) {
 		vput(vp);
 		return (error);
 	}
 	VOP_UNLOCK(vp, 0);
 	FILEDESC_XLOCK(fdp);
 	vpold = fdp->fd_cdir;
 	fdp->fd_cdir = vp;
 	FILEDESC_XUNLOCK(fdp);
 	vrele(vpold);
 	return (0);
 }
 
 /*
  * Change current working directory (``.'').
  */
 #ifndef _SYS_SYSPROTO_H_
 struct chdir_args {
 	char	*path;
 };
 #endif
 int
 sys_chdir(td, uap)
 	struct thread *td;
 	struct chdir_args /* {
 		char *path;
 	} */ *uap;
 {
 
 	return (kern_chdir(td, uap->path, UIO_USERSPACE));
 }
 
 int
 kern_chdir(struct thread *td, char *path, enum uio_seg pathseg)
 {
 	register struct filedesc *fdp = td->td_proc->p_fd;
 	struct nameidata nd;
 	struct vnode *vp;
 	int error;
 
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1,
 	    pathseg, path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	if ((error = change_dir(nd.ni_vp, td)) != 0) {
 		vput(nd.ni_vp);
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		return (error);
 	}
 	VOP_UNLOCK(nd.ni_vp, 0);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	FILEDESC_XLOCK(fdp);
 	vp = fdp->fd_cdir;
 	fdp->fd_cdir = nd.ni_vp;
 	FILEDESC_XUNLOCK(fdp);
 	vrele(vp);
 	return (0);
 }
 
 /*
  * Helper function for the stricter chroot(2) security check:  Refuse if
  * any file descriptors refer to open directories.
  */
 static int
 chroot_refuse_vdir_fds(fdp)
 	struct filedesc *fdp;
 {
 	struct vnode *vp;
 	struct file *fp;
 	int fd;
 
 	FILEDESC_LOCK_ASSERT(fdp);
 
 	for (fd = 0; fd <= fdp->fd_lastfile; fd++) {
 		fp = fget_locked(fdp, fd);
 		if (fp == NULL)
 			continue;
 		if (fp->f_type == DTYPE_VNODE) {
 			vp = fp->f_vnode;
 			if (vp->v_type == VDIR)
 				return (EPERM);
 		}
 	}
 	return (0);
 }
 
 /*
  * This sysctl determines if we will allow a process to chroot(2) if it
  * has a directory open:
  *	0: disallowed for all processes.
  *	1: allowed for processes that were not already chroot(2)'ed.
  *	2: allowed for all processes.
  */
 
 static int chroot_allow_open_directories = 1;
 
 SYSCTL_INT(_kern, OID_AUTO, chroot_allow_open_directories, CTLFLAG_RW,
      &chroot_allow_open_directories, 0,
      "Allow a process to chroot(2) if it has a directory open");
 
 /*
  * Change notion of root (``/'') directory.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct chroot_args {
 	char	*path;
 };
 #endif
 int
 sys_chroot(td, uap)
 	struct thread *td;
 	struct chroot_args /* {
 		char *path;
 	} */ *uap;
 {
 	struct nameidata nd;
 	int error;
 
 	error = priv_check(td, PRIV_VFS_CHROOT);
 	if (error != 0)
 		return (error);
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1,
 	    UIO_USERSPACE, uap->path, td);
 	error = namei(&nd);
 	if (error != 0)
 		goto error;
 	error = change_dir(nd.ni_vp, td);
 	if (error != 0)
 		goto e_vunlock;
 #ifdef MAC
 	error = mac_vnode_check_chroot(td->td_ucred, nd.ni_vp);
 	if (error != 0)
 		goto e_vunlock;
 #endif
 	VOP_UNLOCK(nd.ni_vp, 0);
 	error = change_root(nd.ni_vp, td);
 	vrele(nd.ni_vp);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	return (error);
 e_vunlock:
 	vput(nd.ni_vp);
 error:
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	return (error);
 }
 
 /*
  * Common routine for chroot and chdir.  Callers must provide a locked vnode
  * instance.
  */
 int
 change_dir(vp, td)
 	struct vnode *vp;
 	struct thread *td;
 {
 #ifdef MAC
 	int error;
 #endif
 
 	ASSERT_VOP_LOCKED(vp, "change_dir(): vp not locked");
 	if (vp->v_type != VDIR)
 		return (ENOTDIR);
 #ifdef MAC
 	error = mac_vnode_check_chdir(td->td_ucred, vp);
 	if (error != 0)
 		return (error);
 #endif
 	return (VOP_ACCESS(vp, VEXEC, td->td_ucred, td));
 }
 
 /*
  * Common routine for sys_chroot() and jail_attach().  The caller is
  * responsible for invoking priv_check() and mac_vnode_check_chroot() to
  * authorize this operation.
  */
 int
 change_root(vp, td)
 	struct vnode *vp;
 	struct thread *td;
 {
 	struct filedesc *fdp;
 	struct vnode *oldvp;
 	int error;
 
 	fdp = td->td_proc->p_fd;
 	FILEDESC_XLOCK(fdp);
 	if (chroot_allow_open_directories == 0 ||
 	    (chroot_allow_open_directories == 1 && fdp->fd_rdir != rootvnode)) {
 		error = chroot_refuse_vdir_fds(fdp);
 		if (error != 0) {
 			FILEDESC_XUNLOCK(fdp);
 			return (error);
 		}
 	}
 	oldvp = fdp->fd_rdir;
 	fdp->fd_rdir = vp;
 	VREF(fdp->fd_rdir);
 	if (!fdp->fd_jdir) {
 		fdp->fd_jdir = vp;
 		VREF(fdp->fd_jdir);
 	}
 	FILEDESC_XUNLOCK(fdp);
 	vrele(oldvp);
 	return (0);
 }
 
 static __inline void
 flags_to_rights(int flags, cap_rights_t *rightsp)
 {
 
 	if (flags & O_EXEC) {
 		cap_rights_set(rightsp, CAP_FEXECVE);
 	} else {
 		switch ((flags & O_ACCMODE)) {
 		case O_RDONLY:
 			cap_rights_set(rightsp, CAP_READ);
 			break;
 		case O_RDWR:
 			cap_rights_set(rightsp, CAP_READ);
 			/* FALLTHROUGH */
 		case O_WRONLY:
 			cap_rights_set(rightsp, CAP_WRITE);
 			if (!(flags & (O_APPEND | O_TRUNC)))
 				cap_rights_set(rightsp, CAP_SEEK);
 			break;
 		}
 	}
 
 	if (flags & O_CREAT)
 		cap_rights_set(rightsp, CAP_CREATE);
 
 	if (flags & O_TRUNC)
 		cap_rights_set(rightsp, CAP_FTRUNCATE);
 
 	if (flags & (O_SYNC | O_FSYNC))
 		cap_rights_set(rightsp, CAP_FSYNC);
 
 	if (flags & (O_EXLOCK | O_SHLOCK))
 		cap_rights_set(rightsp, CAP_FLOCK);
 }
 
 /*
  * Check permissions, allocate an open file structure, and call the device
  * open routine if any.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct open_args {
 	char	*path;
 	int	flags;
 	int	mode;
 };
 #endif
 int
 sys_open(td, uap)
 	struct thread *td;
 	register struct open_args /* {
 		char *path;
 		int flags;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_openat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->flags, uap->mode));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct openat_args {
 	int	fd;
 	char	*path;
 	int	flag;
 	int	mode;
 };
 #endif
 int
 sys_openat(struct thread *td, struct openat_args *uap)
 {
 
 	return (kern_openat(td, uap->fd, uap->path, UIO_USERSPACE, uap->flag,
 	    uap->mode));
 }
 
 int
 kern_openat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     int flags, int mode)
 {
 	struct proc *p = td->td_proc;
 	struct filedesc *fdp = p->p_fd;
 	struct file *fp;
 	struct vnode *vp;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int cmode, error, indx;
 
 	indx = -1;
 
 	AUDIT_ARG_FFLAGS(flags);
 	AUDIT_ARG_MODE(mode);
 	/* XXX: audit dirfd */
 	cap_rights_init(&rights, CAP_LOOKUP);
 	flags_to_rights(flags, &rights);
 	/*
 	 * Only one of the O_EXEC, O_RDONLY, O_WRONLY and O_RDWR flags
 	 * may be specified.
 	 */
 	if (flags & O_EXEC) {
 		if (flags & O_ACCMODE)
 			return (EINVAL);
 	} else if ((flags & O_ACCMODE) == O_ACCMODE) {
 		return (EINVAL);
 	} else {
 		flags = FFLAGS(flags);
 	}
 
 	/*
 	 * Allocate the file structure, but don't install a descriptor yet.
 	 */
 	error = falloc_noinstall(td, &fp);
 	if (error != 0)
 		return (error);
 	/*
 	 * An extra reference on `fp' has been held for us by
 	 * falloc_noinstall().
 	 */
 	/* Set the flags early so the finit in devfs can pick them up. */
 	fp->f_flag = flags & FMASK;
 	cmode = ((mode & ~fdp->fd_cmask) & ALLPERMS) & ~S_ISTXT;
 	NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, fd,
 	    &rights, td);
 	td->td_dupfd = -1;		/* XXX check for fdopen */
 	error = vn_open(&nd, &flags, cmode, fp);
 	if (error != 0) {
 		/*
 		 * If the vn_open replaced the method vector, something
 		 * wondrous happened deep below and we just pass it up
 		 * pretending we know what we do.
 		 */
 		if (error == ENXIO && fp->f_ops != &badfileops)
 			goto success;
 
 		/*
 		 * Handle special fdopen() case. bleh.
 		 *
 		 * Don't do this for relative (capability) lookups; we don't
 		 * understand exactly what would happen, and we don't think
 		 * that it ever should.
 		 */
 		if (nd.ni_strictrelative == 0 &&
 		    (error == ENODEV || error == ENXIO) &&
 		    td->td_dupfd >= 0) {
 			error = dupfdopen(td, fdp, td->td_dupfd, flags, error,
 			    &indx);
 			if (error == 0)
 				goto success;
 		}
 
 		goto bad;
 	}
 	td->td_dupfd = 0;
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vp = nd.ni_vp;
 
 	/*
 	 * Store the vnode, for any f_type. Typically, the vnode use
 	 * count is decremented by direct call to vn_closefile() for
 	 * files that switched type in the cdevsw fdopen() method.
 	 */
 	fp->f_vnode = vp;
 	/*
 	 * If the file wasn't claimed by devfs bind it to the normal
 	 * vnode operations here.
 	 */
 	if (fp->f_ops == &badfileops) {
 		KASSERT(vp->v_type != VFIFO, ("Unexpected fifo."));
 		fp->f_seqcount = 1;
 		finit(fp, (flags & FMASK) | (fp->f_flag & FHASLOCK),
 		    DTYPE_VNODE, vp, &vnops);
 	}
 
 	VOP_UNLOCK(vp, 0);
 	if (flags & O_TRUNC) {
 		error = fo_truncate(fp, 0, td->td_ucred, td);
 		if (error != 0)
 			goto bad;
 	}
 success:
 	/*
 	 * If we haven't already installed the FD (for dupfdopen), do so now.
 	 */
 	if (indx == -1) {
 		struct filecaps *fcaps;
 
 #ifdef CAPABILITIES
 		if (nd.ni_strictrelative == 1)
 			fcaps = &nd.ni_filecaps;
 		else
 #endif
 			fcaps = NULL;
 		error = finstall(td, fp, &indx, flags, fcaps);
 		/* On success finstall() consumes fcaps. */
 		if (error != 0) {
 			filecaps_free(&nd.ni_filecaps);
 			goto bad;
 		}
 	} else {
 		filecaps_free(&nd.ni_filecaps);
 	}
 
 	/*
 	 * Release our private reference, leaving the one associated with
 	 * the descriptor table intact.
 	 */
 	fdrop(fp, td);
 	td->td_retval[0] = indx;
 	return (0);
 bad:
 	KASSERT(indx == -1, ("indx=%d, should be -1", indx));
 	fdrop(fp, td);
 	return (error);
 }
 
 #ifdef COMPAT_43
 /*
  * Create a file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct ocreat_args {
 	char	*path;
 	int	mode;
 };
 #endif
 int
 ocreat(td, uap)
 	struct thread *td;
 	register struct ocreat_args /* {
 		char *path;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_openat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    O_WRONLY | O_CREAT | O_TRUNC, uap->mode));
 }
 #endif /* COMPAT_43 */
 
 /*
  * Create a special file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct mknod_args {
 	char	*path;
 	int	mode;
 	int	dev;
 };
 #endif
 int
 sys_mknod(td, uap)
 	struct thread *td;
 	register struct mknod_args /* {
 		char *path;
 		int mode;
 		int dev;
 	} */ *uap;
 {
 
 	return (kern_mknodat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->mode, uap->dev));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct mknodat_args {
 	int	fd;
 	char	*path;
 	mode_t	mode;
 	dev_t	dev;
 };
 #endif
 int
 sys_mknodat(struct thread *td, struct mknodat_args *uap)
 {
 
 	return (kern_mknodat(td, uap->fd, uap->path, UIO_USERSPACE, uap->mode,
 	    uap->dev));
 }
 
 int
 kern_mknodat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     int mode, int dev)
 {
 	struct vnode *vp;
 	struct mount *mp;
 	struct vattr vattr;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error, whiteout = 0;
 
 	AUDIT_ARG_MODE(mode);
 	AUDIT_ARG_DEV(dev);
 	switch (mode & S_IFMT) {
 	case S_IFCHR:
 	case S_IFBLK:
 		error = priv_check(td, PRIV_VFS_MKNOD_DEV);
 		break;
 	case S_IFMT:
 		error = priv_check(td, PRIV_VFS_MKNOD_BAD);
 		break;
 	case S_IFWHT:
 		error = priv_check(td, PRIV_VFS_MKNOD_WHT);
 		break;
 	case S_IFIFO:
 		if (dev == 0)
 			return (kern_mkfifoat(td, fd, path, pathseg, mode));
 		/* FALLTHROUGH */
 	default:
 		error = EINVAL;
 		break;
 	}
 	if (error != 0)
 		return (error);
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 |
 	    NOCACHE, pathseg, path, fd, cap_rights_init(&rights, CAP_MKNODAT),
 	    td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	vp = nd.ni_vp;
 	if (vp != NULL) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		if (vp == nd.ni_dvp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		vrele(vp);
 		return (EEXIST);
 	} else {
 		VATTR_NULL(&vattr);
 		vattr.va_mode = (mode & ALLPERMS) &
 		    ~td->td_proc->p_fd->fd_cmask;
 		vattr.va_rdev = dev;
 		whiteout = 0;
 
 		switch (mode & S_IFMT) {
 		case S_IFMT:	/* used by badsect to flag bad sectors */
 			vattr.va_type = VBAD;
 			break;
 		case S_IFCHR:
 			vattr.va_type = VCHR;
 			break;
 		case S_IFBLK:
 			vattr.va_type = VBLK;
 			break;
 		case S_IFWHT:
 			whiteout = 1;
 			break;
 		default:
 			panic("kern_mknodat: invalid mode");
 		}
 	}
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			return (error);
 		goto restart;
 	}
 #ifdef MAC
 	if (error == 0 && !whiteout)
 		error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp,
 		    &nd.ni_cnd, &vattr);
 #endif
 	if (error == 0) {
 		if (whiteout)
 			error = VOP_WHITEOUT(nd.ni_dvp, &nd.ni_cnd, CREATE);
 		else {
 			error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp,
 			    &nd.ni_cnd, &vattr);
 			if (error == 0)
 				vput(nd.ni_vp);
 		}
 	}
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_dvp);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Create a named pipe.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct mkfifo_args {
 	char	*path;
 	int	mode;
 };
 #endif
 int
 sys_mkfifo(td, uap)
 	struct thread *td;
 	register struct mkfifo_args /* {
 		char *path;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_mkfifoat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->mode));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct mkfifoat_args {
 	int	fd;
 	char	*path;
 	mode_t	mode;
 };
 #endif
 int
 sys_mkfifoat(struct thread *td, struct mkfifoat_args *uap)
 {
 
 	return (kern_mkfifoat(td, uap->fd, uap->path, UIO_USERSPACE,
 	    uap->mode));
 }
 
 int
 kern_mkfifoat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     int mode)
 {
 	struct mount *mp;
 	struct vattr vattr;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_MODE(mode);
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 |
 	    NOCACHE, pathseg, path, fd, cap_rights_init(&rights, CAP_MKFIFOAT),
 	    td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	if (nd.ni_vp != NULL) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		if (nd.ni_vp == nd.ni_dvp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		vrele(nd.ni_vp);
 		return (EEXIST);
 	}
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			return (error);
 		goto restart;
 	}
 	VATTR_NULL(&vattr);
 	vattr.va_type = VFIFO;
 	vattr.va_mode = (mode & ALLPERMS) & ~td->td_proc->p_fd->fd_cmask;
 #ifdef MAC
 	error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd,
 	    &vattr);
 	if (error != 0)
 		goto out;
 #endif
 	error = VOP_MKNOD(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr);
 	if (error == 0)
 		vput(nd.ni_vp);
 #ifdef MAC
 out:
 #endif
 	vput(nd.ni_dvp);
 	vn_finished_write(mp);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	return (error);
 }
 
 /*
  * Make a hard file link.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct link_args {
 	char	*path;
 	char	*link;
 };
 #endif
 int
 sys_link(td, uap)
 	struct thread *td;
 	register struct link_args /* {
 		char *path;
 		char *link;
 	} */ *uap;
 {
 
 	return (kern_linkat(td, AT_FDCWD, AT_FDCWD, uap->path, uap->link,
 	    UIO_USERSPACE, FOLLOW));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct linkat_args {
 	int	fd1;
 	char	*path1;
 	int	fd2;
 	char	*path2;
 	int	flag;
 };
 #endif
 int
 sys_linkat(struct thread *td, struct linkat_args *uap)
 {
 	int flag;
 
 	flag = uap->flag;
 	if (flag & ~AT_SYMLINK_FOLLOW)
 		return (EINVAL);
 
 	return (kern_linkat(td, uap->fd1, uap->fd2, uap->path1, uap->path2,
 	    UIO_USERSPACE, (flag & AT_SYMLINK_FOLLOW) ? FOLLOW : NOFOLLOW));
 }
 
 int hardlink_check_uid = 0;
 SYSCTL_INT(_security_bsd, OID_AUTO, hardlink_check_uid, CTLFLAG_RW,
     &hardlink_check_uid, 0,
     "Unprivileged processes cannot create hard links to files owned by other "
     "users");
 static int hardlink_check_gid = 0;
 SYSCTL_INT(_security_bsd, OID_AUTO, hardlink_check_gid, CTLFLAG_RW,
     &hardlink_check_gid, 0,
     "Unprivileged processes cannot create hard links to files owned by other "
     "groups");
 
 static int
 can_hardlink(struct vnode *vp, struct ucred *cred)
 {
 	struct vattr va;
 	int error;
 
 	if (!hardlink_check_uid && !hardlink_check_gid)
 		return (0);
 
 	error = VOP_GETATTR(vp, &va, cred);
 	if (error != 0)
 		return (error);
 
 	if (hardlink_check_uid && cred->cr_uid != va.va_uid) {
 		error = priv_check_cred(cred, PRIV_VFS_LINK, 0);
 		if (error != 0)
 			return (error);
 	}
 
 	if (hardlink_check_gid && !groupmember(va.va_gid, cred)) {
 		error = priv_check_cred(cred, PRIV_VFS_LINK, 0);
 		if (error != 0)
 			return (error);
 	}
 
 	return (0);
 }
 
 int
 kern_linkat(struct thread *td, int fd1, int fd2, char *path1, char *path2,
     enum uio_seg segflg, int follow)
 {
 	struct vnode *vp;
 	struct mount *mp;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error;
 
 again:
 	bwillwrite();
 	NDINIT_AT(&nd, LOOKUP, follow | AUDITVNODE1, segflg, path1, fd1, td);
 
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vp = nd.ni_vp;
 	if (vp->v_type == VDIR) {
 		vrele(vp);
 		return (EPERM);		/* POSIX */
 	}
 	NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE2 |
 	    NOCACHE, segflg, path2, fd2, cap_rights_init(&rights, CAP_LINKAT),
 	    td);
 	if ((error = namei(&nd)) == 0) {
 		if (nd.ni_vp != NULL) {
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 			if (nd.ni_dvp == nd.ni_vp)
 				vrele(nd.ni_dvp);
 			else
 				vput(nd.ni_dvp);
 			vrele(nd.ni_vp);
 			vrele(vp);
 			return (EEXIST);
 		} else if (nd.ni_dvp->v_mount != vp->v_mount) {
 			/*
 			 * Cross-device link.  No need to recheck
 			 * vp->v_type, since it cannot change, except
 			 * to VBAD.
 			 */
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 			vput(nd.ni_dvp);
 			vrele(vp);
 			return (EXDEV);
 		} else if ((error = vn_lock(vp, LK_EXCLUSIVE)) == 0) {
 			error = can_hardlink(vp, td->td_ucred);
 #ifdef MAC
 			if (error == 0)
 				error = mac_vnode_check_link(td->td_ucred,
 				    nd.ni_dvp, vp, &nd.ni_cnd);
 #endif
 			if (error != 0) {
 				vput(vp);
 				vput(nd.ni_dvp);
 				NDFREE(&nd, NDF_ONLY_PNBUF);
 				return (error);
 			}
 			error = vn_start_write(vp, &mp, V_NOWAIT);
 			if (error != 0) {
 				vput(vp);
 				vput(nd.ni_dvp);
 				NDFREE(&nd, NDF_ONLY_PNBUF);
 				error = vn_start_write(NULL, &mp,
 				    V_XSLEEP | PCATCH);
 				if (error != 0)
 					return (error);
 				goto again;
 			}
 			error = VOP_LINK(nd.ni_dvp, vp, &nd.ni_cnd);
 			VOP_UNLOCK(vp, 0);
 			vput(nd.ni_dvp);
 			vn_finished_write(mp);
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 		} else {
 			vput(nd.ni_dvp);
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 			vrele(vp);
 			goto again;
 		}
 	}
 	vrele(vp);
 	return (error);
 }
 
 /*
  * Make a symbolic link.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct symlink_args {
 	char	*path;
 	char	*link;
 };
 #endif
 int
 sys_symlink(td, uap)
 	struct thread *td;
 	register struct symlink_args /* {
 		char *path;
 		char *link;
 	} */ *uap;
 {
 
 	return (kern_symlinkat(td, uap->path, AT_FDCWD, uap->link,
 	    UIO_USERSPACE));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct symlinkat_args {
 	char	*path1;
 	int	fd;
 	char	*path2;
 };
 #endif
 int
 sys_symlinkat(struct thread *td, struct symlinkat_args *uap)
 {
 
 	return (kern_symlinkat(td, uap->path1, uap->fd, uap->path2,
 	    UIO_USERSPACE));
 }
 
 int
 kern_symlinkat(struct thread *td, char *path1, int fd, char *path2,
     enum uio_seg segflg)
 {
 	struct mount *mp;
 	struct vattr vattr;
 	char *syspath;
 	struct nameidata nd;
 	int error;
 	cap_rights_t rights;
 
 	if (segflg == UIO_SYSSPACE) {
 		syspath = path1;
 	} else {
 		syspath = uma_zalloc(namei_zone, M_WAITOK);
 		if ((error = copyinstr(path1, syspath, MAXPATHLEN, NULL)) != 0)
 			goto out;
 	}
 	AUDIT_ARG_TEXT(syspath);
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 |
 	    NOCACHE, segflg, path2, fd, cap_rights_init(&rights, CAP_SYMLINKAT),
 	    td);
 	if ((error = namei(&nd)) != 0)
 		goto out;
 	if (nd.ni_vp) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		if (nd.ni_vp == nd.ni_dvp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		vrele(nd.ni_vp);
 		error = EEXIST;
 		goto out;
 	}
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			goto out;
 		goto restart;
 	}
 	VATTR_NULL(&vattr);
 	vattr.va_mode = ACCESSPERMS & ~td->td_proc->p_fd->fd_cmask;
 #ifdef MAC
 	vattr.va_type = VLNK;
 	error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd,
 	    &vattr);
 	if (error != 0)
 		goto out2;
 #endif
 	error = VOP_SYMLINK(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr, syspath);
 	if (error == 0)
 		vput(nd.ni_vp);
 #ifdef MAC
 out2:
 #endif
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_dvp);
 	vn_finished_write(mp);
 out:
 	if (segflg != UIO_SYSSPACE)
 		uma_zfree(namei_zone, syspath);
 	return (error);
 }
 
 /*
  * Delete a whiteout from the filesystem.
  */
 int
 sys_undelete(td, uap)
 	struct thread *td;
 	register struct undelete_args /* {
 		char *path;
 	} */ *uap;
 {
 	struct mount *mp;
 	struct nameidata nd;
 	int error;
 
 restart:
 	bwillwrite();
 	NDINIT(&nd, DELETE, LOCKPARENT | DOWHITEOUT | AUDITVNODE1,
 	    UIO_USERSPACE, uap->path, td);
 	error = namei(&nd);
 	if (error != 0)
 		return (error);
 
 	if (nd.ni_vp != NULLVP || !(nd.ni_cnd.cn_flags & ISWHITEOUT)) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		if (nd.ni_vp == nd.ni_dvp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		if (nd.ni_vp)
 			vrele(nd.ni_vp);
 		return (EEXIST);
 	}
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			return (error);
 		goto restart;
 	}
 	error = VOP_WHITEOUT(nd.ni_dvp, &nd.ni_cnd, DELETE);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_dvp);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Delete a name from the filesystem.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct unlink_args {
 	char	*path;
 };
 #endif
 int
 sys_unlink(td, uap)
 	struct thread *td;
 	struct unlink_args /* {
 		char *path;
 	} */ *uap;
 {
 
 	return (kern_unlinkat(td, AT_FDCWD, uap->path, UIO_USERSPACE, 0));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct unlinkat_args {
 	int	fd;
 	char	*path;
 	int	flag;
 };
 #endif
 int
 sys_unlinkat(struct thread *td, struct unlinkat_args *uap)
 {
 	int flag = uap->flag;
 	int fd = uap->fd;
 	char *path = uap->path;
 
 	if (flag & ~AT_REMOVEDIR)
 		return (EINVAL);
 
 	if (flag & AT_REMOVEDIR)
 		return (kern_rmdirat(td, fd, path, UIO_USERSPACE));
 	else
 		return (kern_unlinkat(td, fd, path, UIO_USERSPACE, 0));
 }
 
 int
 kern_unlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     ino_t oldinum)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	struct nameidata nd;
 	struct stat sb;
 	cap_rights_t rights;
 	int error;
 
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, DELETE, LOCKPARENT | LOCKLEAF | AUDITVNODE1,
 	    pathseg, path, fd, cap_rights_init(&rights, CAP_UNLINKAT), td);
 	if ((error = namei(&nd)) != 0)
 		return (error == EINVAL ? EPERM : error);
 	vp = nd.ni_vp;
 	if (vp->v_type == VDIR && oldinum == 0) {
 		error = EPERM;		/* POSIX */
 	} else if (oldinum != 0 &&
 		  ((error = vn_stat(vp, &sb, td->td_ucred, NOCRED, td)) == 0) &&
 		  sb.st_ino != oldinum) {
 			error = EIDRM;	/* Identifier removed */
 	} else {
 		/*
 		 * The root of a mounted filesystem cannot be deleted.
 		 *
 		 * XXX: can this only be a VDIR case?
 		 */
 		if (vp->v_vflag & VV_ROOT)
 			error = EBUSY;
 	}
 	if (error == 0) {
 		if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 			NDFREE(&nd, NDF_ONLY_PNBUF);
 			vput(nd.ni_dvp);
 			if (vp == nd.ni_dvp)
 				vrele(vp);
 			else
 				vput(vp);
 			if ((error = vn_start_write(NULL, &mp,
 			    V_XSLEEP | PCATCH)) != 0)
 				return (error);
 			goto restart;
 		}
 #ifdef MAC
 		error = mac_vnode_check_unlink(td->td_ucred, nd.ni_dvp, vp,
 		    &nd.ni_cnd);
 		if (error != 0)
 			goto out;
 #endif
 		vfs_notify_upper(vp, VFS_NOTIFY_UPPER_UNLINK);
 		error = VOP_REMOVE(nd.ni_dvp, vp, &nd.ni_cnd);
 #ifdef MAC
 out:
 #endif
 		vn_finished_write(mp);
 	}
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_dvp);
 	if (vp == nd.ni_dvp)
 		vrele(vp);
 	else
 		vput(vp);
 	return (error);
 }
 
 /*
  * Reposition read/write file offset.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lseek_args {
 	int	fd;
 	int	pad;
 	off_t	offset;
 	int	whence;
 };
 #endif
 int
 sys_lseek(td, uap)
 	struct thread *td;
 	register struct lseek_args /* {
 		int fd;
 		int pad;
 		off_t offset;
 		int whence;
 	} */ *uap;
 {
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(uap->fd);
 	error = fget(td, uap->fd, cap_rights_init(&rights, CAP_SEEK), &fp);
 	if (error != 0)
 		return (error);
 	error = (fp->f_ops->fo_flags & DFLAG_SEEKABLE) != 0 ?
 	    fo_seek(fp, uap->offset, uap->whence, td) : ESPIPE;
 	fdrop(fp, td);
 	return (error);
 }
 
 #if defined(COMPAT_43)
 /*
  * Reposition read/write file offset.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct olseek_args {
 	int	fd;
 	long	offset;
 	int	whence;
 };
 #endif
 int
 olseek(td, uap)
 	struct thread *td;
 	register struct olseek_args /* {
 		int fd;
 		long offset;
 		int whence;
 	} */ *uap;
 {
 	struct lseek_args /* {
 		int fd;
 		int pad;
 		off_t offset;
 		int whence;
 	} */ nuap;
 
 	nuap.fd = uap->fd;
 	nuap.offset = uap->offset;
 	nuap.whence = uap->whence;
 	return (sys_lseek(td, &nuap));
 }
 #endif /* COMPAT_43 */
 
 /* Version with the 'pad' argument */
 int
 freebsd6_lseek(td, uap)
 	struct thread *td;
 	register struct freebsd6_lseek_args *uap;
 {
 	struct lseek_args ouap;
 
 	ouap.fd = uap->fd;
 	ouap.offset = uap->offset;
 	ouap.whence = uap->whence;
 	return (sys_lseek(td, &ouap));
 }
 
 /*
  * Check access permissions using passed credentials.
  */
 static int
 vn_access(vp, user_flags, cred, td)
 	struct vnode	*vp;
 	int		user_flags;
 	struct ucred	*cred;
 	struct thread	*td;
 {
 	accmode_t accmode;
 	int error;
 
 	/* Flags == 0 means only check for existence. */
 	if (user_flags == 0)
 		return (0);
 
 	accmode = 0;
 	if (user_flags & R_OK)
 		accmode |= VREAD;
 	if (user_flags & W_OK)
 		accmode |= VWRITE;
 	if (user_flags & X_OK)
 		accmode |= VEXEC;
 #ifdef MAC
 	error = mac_vnode_check_access(cred, vp, accmode);
 	if (error != 0)
 		return (error);
 #endif
 	if ((accmode & VWRITE) == 0 || (error = vn_writechk(vp)) == 0)
 		error = VOP_ACCESS(vp, accmode, cred, td);
 	return (error);
 }
 
 /*
  * Check access permissions using "real" credentials.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct access_args {
 	char	*path;
 	int	amode;
 };
 #endif
 int
 sys_access(td, uap)
 	struct thread *td;
 	register struct access_args /* {
 		char *path;
 		int amode;
 	} */ *uap;
 {
 
 	return (kern_accessat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    0, uap->amode));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct faccessat_args {
 	int	fd;
 	char	*path;
 	int	amode;
 	int	flag;
 };
 #endif
 int
 sys_faccessat(struct thread *td, struct faccessat_args *uap)
 {
 
 	return (kern_accessat(td, uap->fd, uap->path, UIO_USERSPACE, uap->flag,
 	    uap->amode));
 }
 
 int
 kern_accessat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     int flag, int amode)
 {
 	struct ucred *cred, *usecred;
 	struct vnode *vp;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error;
 
 	if (flag & ~AT_EACCESS)
 		return (EINVAL);
 	if (amode != F_OK && (amode & ~(R_OK | W_OK | X_OK)) != 0)
 		return (EINVAL);
 
 	/*
 	 * Create and modify a temporary credential instead of one that
 	 * is potentially shared (if we need one).
 	 */
 	cred = td->td_ucred;
 	if ((flag & AT_EACCESS) == 0 &&
 	    ((cred->cr_uid != cred->cr_ruid ||
 	    cred->cr_rgid != cred->cr_groups[0]))) {
 		usecred = crdup(cred);
 		usecred->cr_uid = cred->cr_ruid;
 		usecred->cr_groups[0] = cred->cr_rgid;
 		td->td_ucred = usecred;
 	} else
 		usecred = cred;
 	AUDIT_ARG_VALUE(amode);
 	NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | LOCKSHARED | LOCKLEAF |
 	    AUDITVNODE1, pathseg, path, fd, cap_rights_init(&rights, CAP_FSTAT),
 	    td);
 	if ((error = namei(&nd)) != 0)
 		goto out;
 	vp = nd.ni_vp;
 
 	error = vn_access(vp, amode, usecred, td);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(vp);
 out:
 	if (usecred != cred) {
 		td->td_ucred = cred;
 		crfree(usecred);
 	}
 	return (error);
 }
 
 /*
  * Check access permissions using "effective" credentials.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct eaccess_args {
 	char	*path;
 	int	amode;
 };
 #endif
 int
 sys_eaccess(td, uap)
 	struct thread *td;
 	register struct eaccess_args /* {
 		char *path;
 		int amode;
 	} */ *uap;
 {
 
 	return (kern_accessat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    AT_EACCESS, uap->amode));
 }
 
 #if defined(COMPAT_43)
 /*
  * Get file status; this version follows links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct ostat_args {
 	char	*path;
 	struct ostat *ub;
 };
 #endif
 int
 ostat(td, uap)
 	struct thread *td;
 	register struct ostat_args /* {
 		char *path;
 		struct ostat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	struct ostat osb;
 	int error;
 
 	error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    &sb, NULL);
 	if (error != 0)
 		return (error);
 	cvtstat(&sb, &osb);
 	return (copyout(&osb, uap->ub, sizeof (osb)));
 }
 
 /*
  * Get file status; this version does not follow links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct olstat_args {
 	char	*path;
 	struct ostat *ub;
 };
 #endif
 int
 olstat(td, uap)
 	struct thread *td;
 	register struct olstat_args /* {
 		char *path;
 		struct ostat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	struct ostat osb;
 	int error;
 
 	error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error != 0)
 		return (error);
 	cvtstat(&sb, &osb);
 	return (copyout(&osb, uap->ub, sizeof (osb)));
 }
 
 /*
  * Convert from a new stat structure to an old ostat structure.
  */
 void
 cvtstat(st, ost)
 	struct stat *st;
 	struct ostat *ost;
 {
 
 	ost->st_dev = st->st_dev;
 	ost->st_ino = st->st_ino;
 	ost->st_mode = st->st_mode;
 	ost->st_nlink = st->st_nlink;
 	ost->st_uid = st->st_uid;
 	ost->st_gid = st->st_gid;
 	ost->st_rdev = st->st_rdev;
 	if (st->st_size < (quad_t)1 << 32)
 		ost->st_size = st->st_size;
 	else
 		ost->st_size = -2;
 	ost->st_atim = st->st_atim;
 	ost->st_mtim = st->st_mtim;
 	ost->st_ctim = st->st_ctim;
 	ost->st_blksize = st->st_blksize;
 	ost->st_blocks = st->st_blocks;
 	ost->st_flags = st->st_flags;
 	ost->st_gen = st->st_gen;
 }
 #endif /* COMPAT_43 */
 
 /*
  * Get file status; this version follows links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct stat_args {
 	char	*path;
 	struct stat *ub;
 };
 #endif
 int
 sys_stat(td, uap)
 	struct thread *td;
 	register struct stat_args /* {
 		char *path;
 		struct stat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	int error;
 
 	error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    &sb, NULL);
 	if (error == 0)
 		error = copyout(&sb, uap->ub, sizeof (sb));
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct fstatat_args {
 	int	fd;
 	char	*path;
 	struct stat	*buf;
 	int	flag;
 };
 #endif
 int
 sys_fstatat(struct thread *td, struct fstatat_args *uap)
 {
 	struct stat sb;
 	int error;
 
 	error = kern_statat(td, uap->flag, uap->fd, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error == 0)
 		error = copyout(&sb, uap->buf, sizeof (sb));
 	return (error);
 }
 
 int
 kern_statat(struct thread *td, int flag, int fd, char *path,
     enum uio_seg pathseg, struct stat *sbp,
     void (*hook)(struct vnode *vp, struct stat *sbp))
 {
 	struct nameidata nd;
 	struct stat sb;
 	cap_rights_t rights;
 	int error;
 
 	if (flag & ~AT_SYMLINK_NOFOLLOW)
 		return (EINVAL);
 
 	NDINIT_ATRIGHTS(&nd, LOOKUP, ((flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW :
 	    FOLLOW) | LOCKSHARED | LOCKLEAF | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FSTAT), td);
 
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	error = vn_stat(nd.ni_vp, &sb, td->td_ucred, NOCRED, td);
 	if (error == 0) {
 		SDT_PROBE(vfs, , stat, mode, path, sb.st_mode, 0, 0, 0);
 		if (S_ISREG(sb.st_mode))
 			SDT_PROBE(vfs, , stat, reg, path, pathseg, 0, 0, 0);
 		if (__predict_false(hook != NULL))
 			hook(nd.ni_vp, &sb);
 	}
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_vp);
 	if (error != 0)
 		return (error);
 	*sbp = sb;
 #ifdef KTRACE
 	if (KTRPOINT(td, KTR_STRUCT))
 		ktrstat(&sb);
 #endif
 	return (0);
 }
 
 /*
  * Get file status; this version does not follow links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lstat_args {
 	char	*path;
 	struct stat *ub;
 };
 #endif
 int
 sys_lstat(td, uap)
 	struct thread *td;
 	register struct lstat_args /* {
 		char *path;
 		struct stat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	int error;
 
 	error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error == 0)
 		error = copyout(&sb, uap->ub, sizeof (sb));
 	return (error);
 }
 
 /*
  * Implementation of the NetBSD [l]stat() functions.
  */
 void
 cvtnstat(sb, nsb)
 	struct stat *sb;
 	struct nstat *nsb;
 {
 
 	bzero(nsb, sizeof *nsb);
 	nsb->st_dev = sb->st_dev;
 	nsb->st_ino = sb->st_ino;
 	nsb->st_mode = sb->st_mode;
 	nsb->st_nlink = sb->st_nlink;
 	nsb->st_uid = sb->st_uid;
 	nsb->st_gid = sb->st_gid;
 	nsb->st_rdev = sb->st_rdev;
 	nsb->st_atim = sb->st_atim;
 	nsb->st_mtim = sb->st_mtim;
 	nsb->st_ctim = sb->st_ctim;
 	nsb->st_size = sb->st_size;
 	nsb->st_blocks = sb->st_blocks;
 	nsb->st_blksize = sb->st_blksize;
 	nsb->st_flags = sb->st_flags;
 	nsb->st_gen = sb->st_gen;
 	nsb->st_birthtim = sb->st_birthtim;
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct nstat_args {
 	char	*path;
 	struct nstat *ub;
 };
 #endif
 int
 sys_nstat(td, uap)
 	struct thread *td;
 	register struct nstat_args /* {
 		char *path;
 		struct nstat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	struct nstat nsb;
 	int error;
 
 	error = kern_statat(td, 0, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    &sb, NULL);
 	if (error != 0)
 		return (error);
 	cvtnstat(&sb, &nsb);
 	return (copyout(&nsb, uap->ub, sizeof (nsb)));
 }
 
 /*
  * NetBSD lstat.  Get file status; this version does not follow links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct nlstat_args {
 	char	*path;
 	struct nstat *ub;
 };
 #endif
 int
 sys_nlstat(td, uap)
 	struct thread *td;
 	register struct nlstat_args /* {
 		char *path;
 		struct nstat *ub;
 	} */ *uap;
 {
 	struct stat sb;
 	struct nstat nsb;
 	int error;
 
 	error = kern_statat(td, AT_SYMLINK_NOFOLLOW, AT_FDCWD, uap->path,
 	    UIO_USERSPACE, &sb, NULL);
 	if (error != 0)
 		return (error);
 	cvtnstat(&sb, &nsb);
 	return (copyout(&nsb, uap->ub, sizeof (nsb)));
 }
 
 /*
  * Get configurable pathname variables.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct pathconf_args {
 	char	*path;
 	int	name;
 };
 #endif
 int
 sys_pathconf(td, uap)
 	struct thread *td;
 	register struct pathconf_args /* {
 		char *path;
 		int name;
 	} */ *uap;
 {
 
 	return (kern_pathconf(td, uap->path, UIO_USERSPACE, uap->name, FOLLOW));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct lpathconf_args {
 	char	*path;
 	int	name;
 };
 #endif
 int
 sys_lpathconf(td, uap)
 	struct thread *td;
 	register struct lpathconf_args /* {
 		char *path;
 		int name;
 	} */ *uap;
 {
 
 	return (kern_pathconf(td, uap->path, UIO_USERSPACE, uap->name,
 	    NOFOLLOW));
 }
 
 int
 kern_pathconf(struct thread *td, char *path, enum uio_seg pathseg, int name,
     u_long flags)
 {
 	struct nameidata nd;
 	int error;
 
 	NDINIT(&nd, LOOKUP, LOCKSHARED | LOCKLEAF | AUDITVNODE1 | flags,
 	    pathseg, path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 
 	/* If asynchronous I/O is available, it works for all files. */
 	if (name == _PC_ASYNC_IO)
 		td->td_retval[0] = async_io_version;
 	else
 		error = VOP_PATHCONF(nd.ni_vp, name, td->td_retval);
 	vput(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Return target name of a symbolic link.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct readlink_args {
 	char	*path;
 	char	*buf;
 	size_t	count;
 };
 #endif
 int
 sys_readlink(td, uap)
 	struct thread *td;
 	register struct readlink_args /* {
 		char *path;
 		char *buf;
 		size_t count;
 	} */ *uap;
 {
 
 	return (kern_readlinkat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->buf, UIO_USERSPACE, uap->count));
 }
 #ifndef _SYS_SYSPROTO_H_
 struct readlinkat_args {
 	int	fd;
 	char	*path;
 	char	*buf;
 	size_t	bufsize;
 };
 #endif
 int
 sys_readlinkat(struct thread *td, struct readlinkat_args *uap)
 {
 
 	return (kern_readlinkat(td, uap->fd, uap->path, UIO_USERSPACE,
 	    uap->buf, UIO_USERSPACE, uap->bufsize));
 }
 
 int
 kern_readlinkat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     char *buf, enum uio_seg bufseg, size_t count)
 {
 	struct vnode *vp;
 	struct iovec aiov;
 	struct uio auio;
 	struct nameidata nd;
 	int error;
 
 	if (count > IOSIZE_MAX)
 		return (EINVAL);
 
 	NDINIT_AT(&nd, LOOKUP, NOFOLLOW | LOCKSHARED | LOCKLEAF | AUDITVNODE1,
 	    pathseg, path, fd, td);
 
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vp = nd.ni_vp;
 #ifdef MAC
 	error = mac_vnode_check_readlink(td->td_ucred, vp);
 	if (error != 0) {
 		vput(vp);
 		return (error);
 	}
 #endif
 	if (vp->v_type != VLNK)
 		error = EINVAL;
 	else {
 		aiov.iov_base = buf;
 		aiov.iov_len = count;
 		auio.uio_iov = &aiov;
 		auio.uio_iovcnt = 1;
 		auio.uio_offset = 0;
 		auio.uio_rw = UIO_READ;
 		auio.uio_segflg = bufseg;
 		auio.uio_td = td;
 		auio.uio_resid = count;
 		error = VOP_READLINK(vp, &auio, td->td_ucred);
 		td->td_retval[0] = count - auio.uio_resid;
 	}
 	vput(vp);
 	return (error);
 }
 
 /*
  * Common implementation code for chflags() and fchflags().
  */
 static int
 setfflags(td, vp, flags)
 	struct thread *td;
 	struct vnode *vp;
 	u_long flags;
 {
 	struct mount *mp;
 	struct vattr vattr;
 	int error;
 
 	/* We can't support the value matching VNOVAL. */
 	if (flags == VNOVAL)
 		return (EOPNOTSUPP);
 
 	/*
 	 * Prevent non-root users from setting flags on devices.  When
 	 * a device is reused, users can retain ownership of the device
 	 * if they are allowed to set flags and programs assume that
 	 * chown can't fail when done as root.
 	 */
 	if (vp->v_type == VCHR || vp->v_type == VBLK) {
 		error = priv_check(td, PRIV_VFS_CHFLAGS_DEV);
 		if (error != 0)
 			return (error);
 	}
 
 	if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0)
 		return (error);
 	VATTR_NULL(&vattr);
 	vattr.va_flags = flags;
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 #ifdef MAC
 	error = mac_vnode_check_setflags(td->td_ucred, vp, vattr.va_flags);
 	if (error == 0)
 #endif
 		error = VOP_SETATTR(vp, &vattr, td->td_ucred);
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Change flags of a file given a path name.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct chflags_args {
 	const char *path;
 	u_long	flags;
 };
 #endif
 int
 sys_chflags(td, uap)
 	struct thread *td;
 	register struct chflags_args /* {
 		const char *path;
 		u_long flags;
 	} */ *uap;
 {
 
 	return (kern_chflagsat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->flags, 0));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct chflagsat_args {
 	int	fd;
 	const char *path;
 	u_long	flags;
 	int	atflag;
 };
 #endif
 int
 sys_chflagsat(struct thread *td, struct chflagsat_args *uap)
 {
 	int fd = uap->fd;
 	const char *path = uap->path;
 	u_long flags = uap->flags;
 	int atflag = uap->atflag;
 
 	if (atflag & ~AT_SYMLINK_NOFOLLOW)
 		return (EINVAL);
 
 	return (kern_chflagsat(td, fd, path, UIO_USERSPACE, flags, atflag));
 }
 
 /*
  * Same as chflags() but doesn't follow symlinks.
  */
 int
 sys_lchflags(td, uap)
 	struct thread *td;
 	register struct lchflags_args /* {
 		const char *path;
 		u_long flags;
 	} */ *uap;
 {
 
 	return (kern_chflagsat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->flags, AT_SYMLINK_NOFOLLOW));
 }
 
 static int
 kern_chflagsat(struct thread *td, int fd, const char *path,
     enum uio_seg pathseg, u_long flags, int atflag)
 {
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error, follow;
 
 	AUDIT_ARG_FFLAGS(flags);
 	follow = (atflag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW;
 	NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FCHFLAGS), td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	error = setfflags(td, nd.ni_vp, flags);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Change flags of a file given a file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fchflags_args {
 	int	fd;
 	u_long	flags;
 };
 #endif
 int
 sys_fchflags(td, uap)
 	struct thread *td;
 	register struct fchflags_args /* {
 		int fd;
 		u_long flags;
 	} */ *uap;
 {
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(uap->fd);
 	AUDIT_ARG_FFLAGS(uap->flags);
 	error = getvnode(td->td_proc->p_fd, uap->fd,
 	    cap_rights_init(&rights, CAP_FCHFLAGS), &fp);
 	if (error != 0)
 		return (error);
 #ifdef AUDIT
 	vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY);
 	AUDIT_ARG_VNODE1(fp->f_vnode);
 	VOP_UNLOCK(fp->f_vnode, 0);
 #endif
 	error = setfflags(td, fp->f_vnode, uap->flags);
 	fdrop(fp, td);
 	return (error);
 }
 
 /*
  * Common implementation code for chmod(), lchmod() and fchmod().
  */
 int
 setfmode(td, cred, vp, mode)
 	struct thread *td;
 	struct ucred *cred;
 	struct vnode *vp;
 	int mode;
 {
 	struct mount *mp;
 	struct vattr vattr;
 	int error;
 
 	if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0)
 		return (error);
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	VATTR_NULL(&vattr);
 	vattr.va_mode = mode & ALLPERMS;
 #ifdef MAC
 	error = mac_vnode_check_setmode(cred, vp, vattr.va_mode);
 	if (error == 0)
 #endif
 		error = VOP_SETATTR(vp, &vattr, cred);
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Change mode of a file given path name.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct chmod_args {
 	char	*path;
 	int	mode;
 };
 #endif
 int
 sys_chmod(td, uap)
 	struct thread *td;
 	register struct chmod_args /* {
 		char *path;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_fchmodat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->mode, 0));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct fchmodat_args {
 	int	fd;
 	char	*path;
 	mode_t	mode;
 	int	flag;
 };
 #endif
 int
 sys_fchmodat(struct thread *td, struct fchmodat_args *uap)
 {
 	int flag = uap->flag;
 	int fd = uap->fd;
 	char *path = uap->path;
 	mode_t mode = uap->mode;
 
 	if (flag & ~AT_SYMLINK_NOFOLLOW)
 		return (EINVAL);
 
 	return (kern_fchmodat(td, fd, path, UIO_USERSPACE, mode, flag));
 }
 
 /*
  * Change mode of a file given path name; this version does not follow links.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lchmod_args {
 	char	*path;
 	int	mode;
 };
 #endif
 int
 sys_lchmod(td, uap)
 	struct thread *td;
 	register struct lchmod_args /* {
 		char *path;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_fchmodat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->mode, AT_SYMLINK_NOFOLLOW));
 }
 
 int
 kern_fchmodat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     mode_t mode, int flag)
 {
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error, follow;
 
 	AUDIT_ARG_MODE(mode);
 	follow = (flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW;
 	NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FCHMOD), td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	error = setfmode(td, td->td_ucred, nd.ni_vp, mode);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Change mode of a file given a file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fchmod_args {
 	int	fd;
 	int	mode;
 };
 #endif
 int
 sys_fchmod(struct thread *td, struct fchmod_args *uap)
 {
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(uap->fd);
 	AUDIT_ARG_MODE(uap->mode);
 
 	error = fget(td, uap->fd, cap_rights_init(&rights, CAP_FCHMOD), &fp);
 	if (error != 0)
 		return (error);
 	error = fo_chmod(fp, uap->mode, td->td_ucred, td);
 	fdrop(fp, td);
 	return (error);
 }
 
 /*
  * Common implementation for chown(), lchown(), and fchown().
  */
 int
 setfown(td, cred, vp, uid, gid)
 	struct thread *td;
 	struct ucred *cred;
 	struct vnode *vp;
 	uid_t uid;
 	gid_t gid;
 {
 	struct mount *mp;
 	struct vattr vattr;
 	int error;
 
 	if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0)
 		return (error);
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	VATTR_NULL(&vattr);
 	vattr.va_uid = uid;
 	vattr.va_gid = gid;
 #ifdef MAC
 	error = mac_vnode_check_setowner(cred, vp, vattr.va_uid,
 	    vattr.va_gid);
 	if (error == 0)
 #endif
 		error = VOP_SETATTR(vp, &vattr, cred);
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Set ownership given a path name.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct chown_args {
 	char	*path;
 	int	uid;
 	int	gid;
 };
 #endif
 int
 sys_chown(td, uap)
 	struct thread *td;
 	register struct chown_args /* {
 		char *path;
 		int uid;
 		int gid;
 	} */ *uap;
 {
 
 	return (kern_fchownat(td, AT_FDCWD, uap->path, UIO_USERSPACE, uap->uid,
 	    uap->gid, 0));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct fchownat_args {
 	int fd;
 	const char * path;
 	uid_t uid;
 	gid_t gid;
 	int flag;
 };
 #endif
 int
 sys_fchownat(struct thread *td, struct fchownat_args *uap)
 {
 	int flag;
 
 	flag = uap->flag;
 	if (flag & ~AT_SYMLINK_NOFOLLOW)
 		return (EINVAL);
 
 	return (kern_fchownat(td, uap->fd, uap->path, UIO_USERSPACE, uap->uid,
 	    uap->gid, uap->flag));
 }
 
 int
 kern_fchownat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     int uid, int gid, int flag)
 {
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error, follow;
 
 	AUDIT_ARG_OWNER(uid, gid);
 	follow = (flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW : FOLLOW;
 	NDINIT_ATRIGHTS(&nd, LOOKUP, follow | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FCHOWN), td);
 
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	error = setfown(td, td->td_ucred, nd.ni_vp, uid, gid);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Set ownership given a path name; do not follow symlinks.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lchown_args {
 	char	*path;
 	int	uid;
 	int	gid;
 };
 #endif
 int
 sys_lchown(td, uap)
 	struct thread *td;
 	register struct lchown_args /* {
 		char *path;
 		int uid;
 		int gid;
 	} */ *uap;
 {
 
 	return (kern_fchownat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->uid, uap->gid, AT_SYMLINK_NOFOLLOW));
 }
 
 /*
  * Set ownership given a file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fchown_args {
 	int	fd;
 	int	uid;
 	int	gid;
 };
 #endif
 int
 sys_fchown(td, uap)
 	struct thread *td;
 	register struct fchown_args /* {
 		int fd;
 		int uid;
 		int gid;
 	} */ *uap;
 {
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(uap->fd);
 	AUDIT_ARG_OWNER(uap->uid, uap->gid);
 	error = fget(td, uap->fd, cap_rights_init(&rights, CAP_FCHOWN), &fp);
 	if (error != 0)
 		return (error);
 	error = fo_chown(fp, uap->uid, uap->gid, td->td_ucred, td);
 	fdrop(fp, td);
 	return (error);
 }
 
 /*
  * Common implementation code for utimes(), lutimes(), and futimes().
  */
 static int
 getutimes(usrtvp, tvpseg, tsp)
 	const struct timeval *usrtvp;
 	enum uio_seg tvpseg;
 	struct timespec *tsp;
 {
 	struct timeval tv[2];
 	const struct timeval *tvp;
 	int error;
 
 	if (usrtvp == NULL) {
 		vfs_timestamp(&tsp[0]);
 		tsp[1] = tsp[0];
 	} else {
 		if (tvpseg == UIO_SYSSPACE) {
 			tvp = usrtvp;
 		} else {
 			if ((error = copyin(usrtvp, tv, sizeof(tv))) != 0)
 				return (error);
 			tvp = tv;
 		}
 
 		if (tvp[0].tv_usec < 0 || tvp[0].tv_usec >= 1000000 ||
 		    tvp[1].tv_usec < 0 || tvp[1].tv_usec >= 1000000)
 			return (EINVAL);
 		TIMEVAL_TO_TIMESPEC(&tvp[0], &tsp[0]);
 		TIMEVAL_TO_TIMESPEC(&tvp[1], &tsp[1]);
 	}
 	return (0);
 }
 
 /*
  * Common implementation code for futimens(), utimensat().
  */
 #define	UTIMENS_NULL	0x1
 #define	UTIMENS_EXIT	0x2
 static int
 getutimens(const struct timespec *usrtsp, enum uio_seg tspseg,
     struct timespec *tsp, int *retflags)
 {
 	struct timespec tsnow;
 	int error;
 
 	vfs_timestamp(&tsnow);
 	*retflags = 0;
 	if (usrtsp == NULL) {
 		tsp[0] = tsnow;
 		tsp[1] = tsnow;
 		*retflags |= UTIMENS_NULL;
 		return (0);
 	}
 	if (tspseg == UIO_SYSSPACE) {
 		tsp[0] = usrtsp[0];
 		tsp[1] = usrtsp[1];
 	} else if ((error = copyin(usrtsp, tsp, sizeof(*tsp) * 2)) != 0)
 		return (error);
 	if (tsp[0].tv_nsec == UTIME_OMIT && tsp[1].tv_nsec == UTIME_OMIT)
 		*retflags |= UTIMENS_EXIT;
 	if (tsp[0].tv_nsec == UTIME_NOW && tsp[1].tv_nsec == UTIME_NOW)
 		*retflags |= UTIMENS_NULL;
 	if (tsp[0].tv_nsec == UTIME_OMIT)
 		tsp[0].tv_sec = VNOVAL;
 	else if (tsp[0].tv_nsec == UTIME_NOW)
 		tsp[0] = tsnow;
 	else if (tsp[0].tv_nsec < 0 || tsp[0].tv_nsec >= 1000000000L)
 		return (EINVAL);
 	if (tsp[1].tv_nsec == UTIME_OMIT)
 		tsp[1].tv_sec = VNOVAL;
 	else if (tsp[1].tv_nsec == UTIME_NOW)
 		tsp[1] = tsnow;
 	else if (tsp[1].tv_nsec < 0 || tsp[1].tv_nsec >= 1000000000L)
 		return (EINVAL);
 
 	return (0);
 }
 
 /*
  * Common implementation code for utimes(), lutimes(), futimes(), futimens(),
  * and utimensat().
  */
 static int
 setutimes(td, vp, ts, numtimes, nullflag)
 	struct thread *td;
 	struct vnode *vp;
 	const struct timespec *ts;
 	int numtimes;
 	int nullflag;
 {
 	struct mount *mp;
 	struct vattr vattr;
 	int error, setbirthtime;
 
 	if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0)
 		return (error);
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	setbirthtime = 0;
 	if (numtimes < 3 && !VOP_GETATTR(vp, &vattr, td->td_ucred) &&
 	    timespeccmp(&ts[1], &vattr.va_birthtime, < ))
 		setbirthtime = 1;
 	VATTR_NULL(&vattr);
 	vattr.va_atime = ts[0];
 	vattr.va_mtime = ts[1];
 	if (setbirthtime)
 		vattr.va_birthtime = ts[1];
 	if (numtimes > 2)
 		vattr.va_birthtime = ts[2];
 	if (nullflag)
 		vattr.va_vaflags |= VA_UTIMES_NULL;
 #ifdef MAC
 	error = mac_vnode_check_setutimes(td->td_ucred, vp, vattr.va_atime,
 	    vattr.va_mtime);
 #endif
 	if (error == 0)
 		error = VOP_SETATTR(vp, &vattr, td->td_ucred);
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Set the access and modification times of a file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct utimes_args {
 	char	*path;
 	struct	timeval *tptr;
 };
 #endif
 int
 sys_utimes(td, uap)
 	struct thread *td;
 	register struct utimes_args /* {
 		char *path;
 		struct timeval *tptr;
 	} */ *uap;
 {
 
 	return (kern_utimesat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->tptr, UIO_USERSPACE));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct futimesat_args {
 	int fd;
 	const char * path;
 	const struct timeval * times;
 };
 #endif
 int
 sys_futimesat(struct thread *td, struct futimesat_args *uap)
 {
 
 	return (kern_utimesat(td, uap->fd, uap->path, UIO_USERSPACE,
 	    uap->times, UIO_USERSPACE));
 }
 
 int
 kern_utimesat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     struct timeval *tptr, enum uio_seg tptrseg)
 {
 	struct nameidata nd;
 	struct timespec ts[2];
 	cap_rights_t rights;
 	int error;
 
 	if ((error = getutimes(tptr, tptrseg, ts)) != 0)
 		return (error);
 	NDINIT_ATRIGHTS(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FUTIMES), td);
 
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	error = setutimes(td, nd.ni_vp, ts, 2, tptr == NULL);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Set the access and modification times of a file; do not follow symlinks.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lutimes_args {
 	char	*path;
 	struct	timeval *tptr;
 };
 #endif
 int
 sys_lutimes(td, uap)
 	struct thread *td;
 	register struct lutimes_args /* {
 		char *path;
 		struct timeval *tptr;
 	} */ *uap;
 {
 
 	return (kern_lutimes(td, uap->path, UIO_USERSPACE, uap->tptr,
 	    UIO_USERSPACE));
 }
 
 int
 kern_lutimes(struct thread *td, char *path, enum uio_seg pathseg,
     struct timeval *tptr, enum uio_seg tptrseg)
 {
 	struct timespec ts[2];
 	struct nameidata nd;
 	int error;
 
 	if ((error = getutimes(tptr, tptrseg, ts)) != 0)
 		return (error);
 	NDINIT(&nd, LOOKUP, NOFOLLOW | AUDITVNODE1, pathseg, path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	error = setutimes(td, nd.ni_vp, ts, 2, tptr == NULL);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Set the access and modification times of a file given a file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct futimes_args {
 	int	fd;
 	struct	timeval *tptr;
 };
 #endif
 int
 sys_futimes(td, uap)
 	struct thread *td;
 	register struct futimes_args /* {
 		int  fd;
 		struct timeval *tptr;
 	} */ *uap;
 {
 
 	return (kern_futimes(td, uap->fd, uap->tptr, UIO_USERSPACE));
 }
 
 int
 kern_futimes(struct thread *td, int fd, struct timeval *tptr,
     enum uio_seg tptrseg)
 {
 	struct timespec ts[2];
 	struct file *fp;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_FD(fd);
 	error = getutimes(tptr, tptrseg, ts);
 	if (error != 0)
 		return (error);
 	error = getvnode(td->td_proc->p_fd, fd,
 	    cap_rights_init(&rights, CAP_FUTIMES), &fp);
 	if (error != 0)
 		return (error);
 #ifdef AUDIT
 	vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY);
 	AUDIT_ARG_VNODE1(fp->f_vnode);
 	VOP_UNLOCK(fp->f_vnode, 0);
 #endif
 	error = setutimes(td, fp->f_vnode, ts, 2, tptr == NULL);
 	fdrop(fp, td);
 	return (error);
 }
 
 int
 sys_futimens(struct thread *td, struct futimens_args *uap)
 {
 
 	return (kern_futimens(td, uap->fd, uap->times, UIO_USERSPACE));
 }
 
 int
 kern_futimens(struct thread *td, int fd, struct timespec *tptr,
     enum uio_seg tptrseg)
 {
 	struct timespec ts[2];
 	struct file *fp;
 	cap_rights_t rights;
 	int error, flags;
 
 	AUDIT_ARG_FD(fd);
 	error = getutimens(tptr, tptrseg, ts, &flags);
 	if (error != 0)
 		return (error);
 	if (flags & UTIMENS_EXIT)
 		return (0);
 	error = getvnode(td->td_proc->p_fd, fd,
 	    cap_rights_init(&rights, CAP_FUTIMES), &fp);
 	if (error != 0)
 		return (error);
 #ifdef AUDIT
 	vn_lock(fp->f_vnode, LK_SHARED | LK_RETRY);
 	AUDIT_ARG_VNODE1(fp->f_vnode);
 	VOP_UNLOCK(fp->f_vnode, 0);
 #endif
 	error = setutimes(td, fp->f_vnode, ts, 2, flags & UTIMENS_NULL);
 	fdrop(fp, td);
 	return (error);
 }
 
 int
 sys_utimensat(struct thread *td, struct utimensat_args *uap)
 {
 
 	return (kern_utimensat(td, uap->fd, uap->path, UIO_USERSPACE,
 	    uap->times, UIO_USERSPACE, uap->flag));
 }
 
 int
 kern_utimensat(struct thread *td, int fd, char *path, enum uio_seg pathseg,
     struct timespec *tptr, enum uio_seg tptrseg, int flag)
 {
 	struct nameidata nd;
 	struct timespec ts[2];
 	cap_rights_t rights;
 	int error, flags;
 
 	if (flag & ~AT_SYMLINK_NOFOLLOW)
 		return (EINVAL);
 
 	if ((error = getutimens(tptr, tptrseg, ts, &flags)) != 0)
 		return (error);
 	NDINIT_ATRIGHTS(&nd, LOOKUP, ((flag & AT_SYMLINK_NOFOLLOW) ? NOFOLLOW :
 	    FOLLOW) | AUDITVNODE1, pathseg, path, fd,
 	    cap_rights_init(&rights, CAP_FUTIMES), td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	/*
 	 * We are allowed to call namei() even when both times are
 	 * UTIME_OMIT.  POSIX states:
 	 * "If both tv_nsec fields are UTIME_OMIT... EACCES may be detected."
 	 * "Search permission is denied by a component of the path prefix."
 	 */
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	if ((flags & UTIMENS_EXIT) == 0)
 		error = setutimes(td, nd.ni_vp, ts, 2, flags & UTIMENS_NULL);
 	vrele(nd.ni_vp);
 	return (error);
 }
 
 /*
  * Truncate a file given its path name.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct truncate_args {
 	char	*path;
 	int	pad;
 	off_t	length;
 };
 #endif
 int
 sys_truncate(td, uap)
 	struct thread *td;
 	register struct truncate_args /* {
 		char *path;
 		int pad;
 		off_t length;
 	} */ *uap;
 {
 
 	return (kern_truncate(td, uap->path, UIO_USERSPACE, uap->length));
 }
 
 int
 kern_truncate(struct thread *td, char *path, enum uio_seg pathseg, off_t length)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	void *rl_cookie;
 	struct vattr vattr;
 	struct nameidata nd;
 	int error;
 
 	if (length < 0)
 		return (EINVAL);
 	NDINIT(&nd, LOOKUP, FOLLOW | AUDITVNODE1, pathseg, path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	vp = nd.ni_vp;
 	rl_cookie = vn_rangelock_wlock(vp, 0, OFF_MAX);
 	if ((error = vn_start_write(vp, &mp, V_WAIT | PCATCH)) != 0) {
 		vn_rangelock_unlock(vp, rl_cookie);
 		vrele(vp);
 		return (error);
 	}
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
 	if (vp->v_type == VDIR)
 		error = EISDIR;
 #ifdef MAC
 	else if ((error = mac_vnode_check_write(td->td_ucred, NOCRED, vp))) {
 	}
 #endif
 	else if ((error = vn_writechk(vp)) == 0 &&
 	    (error = VOP_ACCESS(vp, VWRITE, td->td_ucred, td)) == 0) {
 		VATTR_NULL(&vattr);
 		vattr.va_size = length;
 		error = VOP_SETATTR(vp, &vattr, td->td_ucred);
 	}
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 	vn_rangelock_unlock(vp, rl_cookie);
 	vrele(vp);
 	return (error);
 }
 
 #if defined(COMPAT_43)
 /*
  * Truncate a file given its path name.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct otruncate_args {
 	char	*path;
 	long	length;
 };
 #endif
 int
 otruncate(td, uap)
 	struct thread *td;
 	register struct otruncate_args /* {
 		char *path;
 		long length;
 	} */ *uap;
 {
 	struct truncate_args /* {
 		char *path;
 		int pad;
 		off_t length;
 	} */ nuap;
 
 	nuap.path = uap->path;
 	nuap.length = uap->length;
 	return (sys_truncate(td, &nuap));
 }
 #endif /* COMPAT_43 */
 
 /* Versions with the pad argument */
 int
 freebsd6_truncate(struct thread *td, struct freebsd6_truncate_args *uap)
 {
 	struct truncate_args ouap;
 
 	ouap.path = uap->path;
 	ouap.length = uap->length;
 	return (sys_truncate(td, &ouap));
 }
 
 int
 freebsd6_ftruncate(struct thread *td, struct freebsd6_ftruncate_args *uap)
 {
 	struct ftruncate_args ouap;
 
 	ouap.fd = uap->fd;
 	ouap.length = uap->length;
 	return (sys_ftruncate(td, &ouap));
 }
 
 /*
  * Sync an open file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fsync_args {
 	int	fd;
 };
 #endif
 int
 sys_fsync(td, uap)
 	struct thread *td;
 	struct fsync_args /* {
 		int fd;
 	} */ *uap;
 {
 	struct vnode *vp;
 	struct mount *mp;
 	struct file *fp;
 	cap_rights_t rights;
 	int error, lock_flags;
 
 	AUDIT_ARG_FD(uap->fd);
 	error = getvnode(td->td_proc->p_fd, uap->fd,
 	    cap_rights_init(&rights, CAP_FSYNC), &fp);
 	if (error != 0)
 		return (error);
 	vp = fp->f_vnode;
 	error = vn_start_write(vp, &mp, V_WAIT | PCATCH);
 	if (error != 0)
 		goto drop;
 	if (MNT_SHARED_WRITES(mp) ||
 	    ((mp == NULL) && MNT_SHARED_WRITES(vp->v_mount))) {
 		lock_flags = LK_SHARED;
 	} else {
 		lock_flags = LK_EXCLUSIVE;
 	}
 	vn_lock(vp, lock_flags | LK_RETRY);
 	AUDIT_ARG_VNODE1(vp);
 	if (vp->v_object != NULL) {
 		VM_OBJECT_WLOCK(vp->v_object);
 		vm_object_page_clean(vp->v_object, 0, 0, 0);
 		VM_OBJECT_WUNLOCK(vp->v_object);
 	}
 	error = VOP_FSYNC(vp, MNT_WAIT, td);
 
 	VOP_UNLOCK(vp, 0);
 	vn_finished_write(mp);
 drop:
 	fdrop(fp, td);
 	return (error);
 }
 
 /*
  * Rename files.  Source and destination must either both be directories, or
  * both not be directories.  If target is a directory, it must be empty.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct rename_args {
 	char	*from;
 	char	*to;
 };
 #endif
 int
 sys_rename(td, uap)
 	struct thread *td;
 	register struct rename_args /* {
 		char *from;
 		char *to;
 	} */ *uap;
 {
 
 	return (kern_renameat(td, AT_FDCWD, uap->from, AT_FDCWD,
 	    uap->to, UIO_USERSPACE));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct renameat_args {
 	int	oldfd;
 	char	*old;
 	int	newfd;
 	char	*new;
 };
 #endif
 int
 sys_renameat(struct thread *td, struct renameat_args *uap)
 {
 
 	return (kern_renameat(td, uap->oldfd, uap->old, uap->newfd, uap->new,
 	    UIO_USERSPACE));
 }
 
 int
 kern_renameat(struct thread *td, int oldfd, char *old, int newfd, char *new,
     enum uio_seg pathseg)
 {
 	struct mount *mp = NULL;
 	struct vnode *tvp, *fvp, *tdvp;
 	struct nameidata fromnd, tond;
 	cap_rights_t rights;
 	int error;
 
 again:
 	bwillwrite();
 #ifdef MAC
 	NDINIT_ATRIGHTS(&fromnd, DELETE, LOCKPARENT | LOCKLEAF | SAVESTART |
 	    AUDITVNODE1, pathseg, old, oldfd,
 	    cap_rights_init(&rights, CAP_RENAMEAT), td);
 #else
 	NDINIT_ATRIGHTS(&fromnd, DELETE, WANTPARENT | SAVESTART | AUDITVNODE1,
 	    pathseg, old, oldfd, cap_rights_init(&rights, CAP_RENAMEAT), td);
 #endif
 
 	if ((error = namei(&fromnd)) != 0)
 		return (error);
 #ifdef MAC
 	error = mac_vnode_check_rename_from(td->td_ucred, fromnd.ni_dvp,
 	    fromnd.ni_vp, &fromnd.ni_cnd);
 	VOP_UNLOCK(fromnd.ni_dvp, 0);
 	if (fromnd.ni_dvp != fromnd.ni_vp)
 		VOP_UNLOCK(fromnd.ni_vp, 0);
 #endif
 	fvp = fromnd.ni_vp;
 	NDINIT_ATRIGHTS(&tond, RENAME, LOCKPARENT | LOCKLEAF | NOCACHE |
 	    SAVESTART | AUDITVNODE2, pathseg, new, newfd,
 	    cap_rights_init(&rights, CAP_LINKAT), td);
 	if (fromnd.ni_vp->v_type == VDIR)
 		tond.ni_cnd.cn_flags |= WILLBEDIR;
 	if ((error = namei(&tond)) != 0) {
 		/* Translate error code for rename("dir1", "dir2/."). */
 		if (error == EISDIR && fvp->v_type == VDIR)
 			error = EINVAL;
 		NDFREE(&fromnd, NDF_ONLY_PNBUF);
 		vrele(fromnd.ni_dvp);
 		vrele(fvp);
 		goto out1;
 	}
 	tdvp = tond.ni_dvp;
 	tvp = tond.ni_vp;
 	error = vn_start_write(fvp, &mp, V_NOWAIT);
 	if (error != 0) {
 		NDFREE(&fromnd, NDF_ONLY_PNBUF);
 		NDFREE(&tond, NDF_ONLY_PNBUF);
 		if (tvp != NULL)
 			vput(tvp);
 		if (tdvp == tvp)
 			vrele(tdvp);
 		else
 			vput(tdvp);
 		vrele(fromnd.ni_dvp);
 		vrele(fvp);
 		vrele(tond.ni_startdir);
 		if (fromnd.ni_startdir != NULL)
 			vrele(fromnd.ni_startdir);
 		error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH);
 		if (error != 0)
 			return (error);
 		goto again;
 	}
 	if (tvp != NULL) {
 		if (fvp->v_type == VDIR && tvp->v_type != VDIR) {
 			error = ENOTDIR;
 			goto out;
 		} else if (fvp->v_type != VDIR && tvp->v_type == VDIR) {
 			error = EISDIR;
 			goto out;
 		}
 #ifdef CAPABILITIES
 		if (newfd != AT_FDCWD) {
 			/*
 			 * If the target already exists we require CAP_UNLINKAT
 			 * from 'newfd'.
 			 */
 			error = cap_check(&tond.ni_filecaps.fc_rights,
 			    cap_rights_init(&rights, CAP_UNLINKAT));
 			if (error != 0)
 				goto out;
 		}
 #endif
 	}
 	if (fvp == tdvp) {
 		error = EINVAL;
 		goto out;
 	}
 	/*
 	 * If the source is the same as the destination (that is, if they
 	 * are links to the same vnode), then there is nothing to do.
 	 */
 	if (fvp == tvp)
 		error = -1;
 #ifdef MAC
 	else
 		error = mac_vnode_check_rename_to(td->td_ucred, tdvp,
 		    tond.ni_vp, fromnd.ni_dvp == tdvp, &tond.ni_cnd);
 #endif
 out:
 	if (error == 0) {
 		error = VOP_RENAME(fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd,
 		    tond.ni_dvp, tond.ni_vp, &tond.ni_cnd);
 		NDFREE(&fromnd, NDF_ONLY_PNBUF);
 		NDFREE(&tond, NDF_ONLY_PNBUF);
 	} else {
 		NDFREE(&fromnd, NDF_ONLY_PNBUF);
 		NDFREE(&tond, NDF_ONLY_PNBUF);
 		if (tvp != NULL)
 			vput(tvp);
 		if (tdvp == tvp)
 			vrele(tdvp);
 		else
 			vput(tdvp);
 		vrele(fromnd.ni_dvp);
 		vrele(fvp);
 	}
 	vrele(tond.ni_startdir);
 	vn_finished_write(mp);
 out1:
 	if (fromnd.ni_startdir)
 		vrele(fromnd.ni_startdir);
 	if (error == -1)
 		return (0);
 	return (error);
 }
 
 /*
  * Make a directory file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct mkdir_args {
 	char	*path;
 	int	mode;
 };
 #endif
 int
 sys_mkdir(td, uap)
 	struct thread *td;
 	register struct mkdir_args /* {
 		char *path;
 		int mode;
 	} */ *uap;
 {
 
 	return (kern_mkdirat(td, AT_FDCWD, uap->path, UIO_USERSPACE,
 	    uap->mode));
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct mkdirat_args {
 	int	fd;
 	char	*path;
 	mode_t	mode;
 };
 #endif
 int
 sys_mkdirat(struct thread *td, struct mkdirat_args *uap)
 {
 
 	return (kern_mkdirat(td, uap->fd, uap->path, UIO_USERSPACE, uap->mode));
 }
 
 int
 kern_mkdirat(struct thread *td, int fd, char *path, enum uio_seg segflg,
     int mode)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	struct vattr vattr;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error;
 
 	AUDIT_ARG_MODE(mode);
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, CREATE, LOCKPARENT | SAVENAME | AUDITVNODE1 |
 	    NOCACHE, segflg, path, fd, cap_rights_init(&rights, CAP_MKDIRAT),
 	    td);
 	nd.ni_cnd.cn_flags |= WILLBEDIR;
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	vp = nd.ni_vp;
 	if (vp != NULL) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		/*
 		 * XXX namei called with LOCKPARENT but not LOCKLEAF has
 		 * the strange behaviour of leaving the vnode unlocked
 		 * if the target is the same vnode as the parent.
 		 */
 		if (vp == nd.ni_dvp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		vrele(vp);
 		return (EEXIST);
 	}
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			return (error);
 		goto restart;
 	}
 	VATTR_NULL(&vattr);
 	vattr.va_type = VDIR;
 	vattr.va_mode = (mode & ACCESSPERMS) &~ td->td_proc->p_fd->fd_cmask;
 #ifdef MAC
 	error = mac_vnode_check_create(td->td_ucred, nd.ni_dvp, &nd.ni_cnd,
 	    &vattr);
 	if (error != 0)
 		goto out;
 #endif
 	error = VOP_MKDIR(nd.ni_dvp, &nd.ni_vp, &nd.ni_cnd, &vattr);
 #ifdef MAC
 out:
 #endif
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(nd.ni_dvp);
 	if (error == 0)
 		vput(nd.ni_vp);
 	vn_finished_write(mp);
 	return (error);
 }
 
 /*
  * Remove a directory file.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct rmdir_args {
 	char	*path;
 };
 #endif
 int
 sys_rmdir(td, uap)
 	struct thread *td;
 	struct rmdir_args /* {
 		char *path;
 	} */ *uap;
 {
 
 	return (kern_rmdirat(td, AT_FDCWD, uap->path, UIO_USERSPACE));
 }
 
 int
 kern_rmdirat(struct thread *td, int fd, char *path, enum uio_seg pathseg)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	struct nameidata nd;
 	cap_rights_t rights;
 	int error;
 
 restart:
 	bwillwrite();
 	NDINIT_ATRIGHTS(&nd, DELETE, LOCKPARENT | LOCKLEAF | AUDITVNODE1,
 	    pathseg, path, fd, cap_rights_init(&rights, CAP_UNLINKAT), td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	vp = nd.ni_vp;
 	if (vp->v_type != VDIR) {
 		error = ENOTDIR;
 		goto out;
 	}
 	/*
 	 * No rmdir "." please.
 	 */
 	if (nd.ni_dvp == vp) {
 		error = EINVAL;
 		goto out;
 	}
 	/*
 	 * The root of a mounted filesystem cannot be deleted.
 	 */
 	if (vp->v_vflag & VV_ROOT) {
 		error = EBUSY;
 		goto out;
 	}
 #ifdef MAC
 	error = mac_vnode_check_unlink(td->td_ucred, nd.ni_dvp, vp,
 	    &nd.ni_cnd);
 	if (error != 0)
 		goto out;
 #endif
 	if (vn_start_write(nd.ni_dvp, &mp, V_NOWAIT) != 0) {
 		NDFREE(&nd, NDF_ONLY_PNBUF);
 		vput(vp);
 		if (nd.ni_dvp == vp)
 			vrele(nd.ni_dvp);
 		else
 			vput(nd.ni_dvp);
 		if ((error = vn_start_write(NULL, &mp, V_XSLEEP | PCATCH)) != 0)
 			return (error);
 		goto restart;
 	}
 	vfs_notify_upper(vp, VFS_NOTIFY_UPPER_UNLINK);
 	error = VOP_RMDIR(nd.ni_dvp, nd.ni_vp, &nd.ni_cnd);
 	vn_finished_write(mp);
 out:
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vput(vp);
 	if (nd.ni_dvp == vp)
 		vrele(nd.ni_dvp);
 	else
 		vput(nd.ni_dvp);
 	return (error);
 }
 
 #ifdef COMPAT_43
 /*
  * Read a block of directory entries in a filesystem independent format.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct ogetdirentries_args {
 	int	fd;
 	char	*buf;
 	u_int	count;
 	long	*basep;
 };
 #endif
 int
 ogetdirentries(struct thread *td, struct ogetdirentries_args *uap)
 {
 	long loff;
 	int error;
 
 	error = kern_ogetdirentries(td, uap, &loff);
 	if (error == 0)
 		error = copyout(&loff, uap->basep, sizeof(long));
 	return (error);
 }
 
 int
 kern_ogetdirentries(struct thread *td, struct ogetdirentries_args *uap,
     long *ploff)
 {
 	struct vnode *vp;
 	struct file *fp;
 	struct uio auio, kuio;
 	struct iovec aiov, kiov;
 	struct dirent *dp, *edp;
 	cap_rights_t rights;
 	caddr_t dirbuf;
 	int error, eofflag, readcnt;
 	long loff;
 	off_t foffset;
 
 	/* XXX arbitrary sanity limit on `count'. */
 	if (uap->count > 64 * 1024)
 		return (EINVAL);
 	error = getvnode(td->td_proc->p_fd, uap->fd,
 	    cap_rights_init(&rights, CAP_READ), &fp);
 	if (error != 0)
 		return (error);
 	if ((fp->f_flag & FREAD) == 0) {
 		fdrop(fp, td);
 		return (EBADF);
 	}
 	vp = fp->f_vnode;
 	foffset = foffset_lock(fp, 0);
 unionread:
 	if (vp->v_type != VDIR) {
 		foffset_unlock(fp, foffset, 0);
 		fdrop(fp, td);
 		return (EINVAL);
 	}
 	aiov.iov_base = uap->buf;
 	aiov.iov_len = uap->count;
 	auio.uio_iov = &aiov;
 	auio.uio_iovcnt = 1;
 	auio.uio_rw = UIO_READ;
 	auio.uio_segflg = UIO_USERSPACE;
 	auio.uio_td = td;
 	auio.uio_resid = uap->count;
 	vn_lock(vp, LK_SHARED | LK_RETRY);
 	loff = auio.uio_offset = foffset;
 #ifdef MAC
 	error = mac_vnode_check_readdir(td->td_ucred, vp);
 	if (error != 0) {
 		VOP_UNLOCK(vp, 0);
 		foffset_unlock(fp, foffset, FOF_NOUPDATE);
 		fdrop(fp, td);
 		return (error);
 	}
 #endif
 #	if (BYTE_ORDER != LITTLE_ENDIAN)
 		if (vp->v_mount->mnt_maxsymlinklen <= 0) {
 			error = VOP_READDIR(vp, &auio, fp->f_cred, &eofflag,
 			    NULL, NULL);
 			foffset = auio.uio_offset;
 		} else
 #	endif
 	{
 		kuio = auio;
 		kuio.uio_iov = &kiov;
 		kuio.uio_segflg = UIO_SYSSPACE;
 		kiov.iov_len = uap->count;
 		dirbuf = malloc(uap->count, M_TEMP, M_WAITOK);
 		kiov.iov_base = dirbuf;
 		error = VOP_READDIR(vp, &kuio, fp->f_cred, &eofflag,
 			    NULL, NULL);
 		foffset = kuio.uio_offset;
 		if (error == 0) {
 			readcnt = uap->count - kuio.uio_resid;
 			edp = (struct dirent *)&dirbuf[readcnt];
 			for (dp = (struct dirent *)dirbuf; dp < edp; ) {
 #				if (BYTE_ORDER == LITTLE_ENDIAN)
 					/*
 					 * The expected low byte of
 					 * dp->d_namlen is our dp->d_type.
 					 * The high MBZ byte of dp->d_namlen
 					 * is our dp->d_namlen.
 					 */
 					dp->d_type = dp->d_namlen;
 					dp->d_namlen = 0;
 #				else
 					/*
 					 * The dp->d_type is the high byte
 					 * of the expected dp->d_namlen,
 					 * so must be zero'ed.
 					 */
 					dp->d_type = 0;
 #				endif
 				if (dp->d_reclen > 0) {
 					dp = (struct dirent *)
 					    ((char *)dp + dp->d_reclen);
 				} else {
 					error = EIO;
 					break;
 				}
 			}
 			if (dp >= edp)
 				error = uiomove(dirbuf, readcnt, &auio);
 		}
 		free(dirbuf, M_TEMP);
 	}
 	if (error != 0) {
 		VOP_UNLOCK(vp, 0);
 		foffset_unlock(fp, foffset, 0);
 		fdrop(fp, td);
 		return (error);
 	}
 	if (uap->count == auio.uio_resid &&
 	    (vp->v_vflag & VV_ROOT) &&
 	    (vp->v_mount->mnt_flag & MNT_UNION)) {
 		struct vnode *tvp = vp;
 		vp = vp->v_mount->mnt_vnodecovered;
 		VREF(vp);
 		fp->f_vnode = vp;
 		fp->f_data = vp;
 		foffset = 0;
 		vput(tvp);
 		goto unionread;
 	}
 	VOP_UNLOCK(vp, 0);
 	foffset_unlock(fp, foffset, 0);
 	fdrop(fp, td);
 	td->td_retval[0] = uap->count - auio.uio_resid;
 	if (error == 0)
 		*ploff = loff;
 	return (error);
 }
 #endif /* COMPAT_43 */
 
 /*
  * Read a block of directory entries in a filesystem independent format.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct getdirentries_args {
 	int	fd;
 	char	*buf;
 	u_int	count;
 	long	*basep;
 };
 #endif
 int
 sys_getdirentries(td, uap)
 	struct thread *td;
 	register struct getdirentries_args /* {
 		int fd;
 		char *buf;
 		u_int count;
 		long *basep;
 	} */ *uap;
 {
 	long base;
 	int error;
 
 	error = kern_getdirentries(td, uap->fd, uap->buf, uap->count, &base,
 	    NULL, UIO_USERSPACE);
 	if (error != 0)
 		return (error);
 	if (uap->basep != NULL)
 		error = copyout(&base, uap->basep, sizeof(long));
 	return (error);
 }
 
 int
 kern_getdirentries(struct thread *td, int fd, char *buf, u_int count,
     long *basep, ssize_t *residp, enum uio_seg bufseg)
 {
 	struct vnode *vp;
 	struct file *fp;
 	struct uio auio;
 	struct iovec aiov;
 	cap_rights_t rights;
 	long loff;
 	int error, eofflag;
 	off_t foffset;
 
 	AUDIT_ARG_FD(fd);
 	if (count > IOSIZE_MAX)
 		return (EINVAL);
 	auio.uio_resid = count;
 	error = getvnode(td->td_proc->p_fd, fd,
 	    cap_rights_init(&rights, CAP_READ), &fp);
 	if (error != 0)
 		return (error);
 	if ((fp->f_flag & FREAD) == 0) {
 		fdrop(fp, td);
 		return (EBADF);
 	}
 	vp = fp->f_vnode;
 	foffset = foffset_lock(fp, 0);
 unionread:
 	if (vp->v_type != VDIR) {
 		error = EINVAL;
 		goto fail;
 	}
 	aiov.iov_base = buf;
 	aiov.iov_len = count;
 	auio.uio_iov = &aiov;
 	auio.uio_iovcnt = 1;
 	auio.uio_rw = UIO_READ;
 	auio.uio_segflg = bufseg;
 	auio.uio_td = td;
 	vn_lock(vp, LK_SHARED | LK_RETRY);
 	AUDIT_ARG_VNODE1(vp);
 	loff = auio.uio_offset = foffset;
 #ifdef MAC
 	error = mac_vnode_check_readdir(td->td_ucred, vp);
 	if (error == 0)
 #endif
 		error = VOP_READDIR(vp, &auio, fp->f_cred, &eofflag, NULL,
 		    NULL);
 	foffset = auio.uio_offset;
 	if (error != 0) {
 		VOP_UNLOCK(vp, 0);
 		goto fail;
 	}
 	if (count == auio.uio_resid &&
 	    (vp->v_vflag & VV_ROOT) &&
 	    (vp->v_mount->mnt_flag & MNT_UNION)) {
 		struct vnode *tvp = vp;
 
 		vp = vp->v_mount->mnt_vnodecovered;
 		VREF(vp);
 		fp->f_vnode = vp;
 		fp->f_data = vp;
 		foffset = 0;
 		vput(tvp);
 		goto unionread;
 	}
 	VOP_UNLOCK(vp, 0);
 	*basep = loff;
 	if (residp != NULL)
 		*residp = auio.uio_resid;
 	td->td_retval[0] = count - auio.uio_resid;
 fail:
 	foffset_unlock(fp, foffset, 0);
 	fdrop(fp, td);
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct getdents_args {
 	int fd;
 	char *buf;
 	size_t count;
 };
 #endif
 int
 sys_getdents(td, uap)
 	struct thread *td;
 	register struct getdents_args /* {
 		int fd;
 		char *buf;
 		size_t count;
 	} */ *uap;
 {
 	struct getdirentries_args ap;
 
 	ap.fd = uap->fd;
 	ap.buf = uap->buf;
 	ap.count = uap->count;
 	ap.basep = NULL;
 	return (sys_getdirentries(td, &ap));
 }
 
 /*
  * Set the mode mask for creation of filesystem nodes.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct umask_args {
 	int	newmask;
 };
 #endif
 int
 sys_umask(td, uap)
 	struct thread *td;
 	struct umask_args /* {
 		int newmask;
 	} */ *uap;
 {
 	register struct filedesc *fdp;
 
 	FILEDESC_XLOCK(td->td_proc->p_fd);
 	fdp = td->td_proc->p_fd;
 	td->td_retval[0] = fdp->fd_cmask;
 	fdp->fd_cmask = uap->newmask & ALLPERMS;
 	FILEDESC_XUNLOCK(td->td_proc->p_fd);
 	return (0);
 }
 
 /*
  * Void all references to file by ripping underlying filesystem away from
  * vnode.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct revoke_args {
 	char	*path;
 };
 #endif
 int
 sys_revoke(td, uap)
 	struct thread *td;
 	register struct revoke_args /* {
 		char *path;
 	} */ *uap;
 {
 	struct vnode *vp;
 	struct vattr vattr;
 	struct nameidata nd;
 	int error;
 
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE,
 	    uap->path, td);
 	if ((error = namei(&nd)) != 0)
 		return (error);
 	vp = nd.ni_vp;
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	if (vp->v_type != VCHR || vp->v_rdev == NULL) {
 		error = EINVAL;
 		goto out;
 	}
 #ifdef MAC
 	error = mac_vnode_check_revoke(td->td_ucred, vp);
 	if (error != 0)
 		goto out;
 #endif
 	error = VOP_GETATTR(vp, &vattr, td->td_ucred);
 	if (error != 0)
 		goto out;
 	if (td->td_ucred->cr_uid != vattr.va_uid) {
 		error = priv_check(td, PRIV_VFS_ADMIN);
 		if (error != 0)
 			goto out;
 	}
 	if (vcount(vp) > 1)
 		VOP_REVOKE(vp, REVOKEALL);
 out:
 	vput(vp);
 	return (error);
 }
 
 /*
  * Convert a user file descriptor to a kernel file entry and check that, if it
  * is a capability, the correct rights are present. A reference on the file
  * entry is held upon returning.
  */
 int
 getvnode(struct filedesc *fdp, int fd, cap_rights_t *rightsp, struct file **fpp)
 {
 	struct file *fp;
 	int error;
 
 	error = fget_unlocked(fdp, fd, rightsp, &fp, NULL);
 	if (error != 0)
 		return (error);
 
 	/*
 	 * The file could be not of the vnode type, or it may be not
 	 * yet fully initialized, in which case the f_vnode pointer
 	 * may be set, but f_ops is still badfileops.  E.g.,
 	 * devfs_open() transiently creates such a situation to
 	 * facilitate csw d_fdopen().
 	 *
 	 * Dupfdopen() handling in kern_openat() installs the
 	 * half-baked file into the process descriptor table, allowing
 	 * other threads to dereference it. Guard against the race by
 	 * checking f_ops.
 	 */
 	if (fp->f_vnode == NULL || fp->f_ops == &badfileops) {
 		fdrop(fp, curthread);
 		return (EINVAL);
 	}
 	*fpp = fp;
 	return (0);
 }
 
 
 /*
  * Get an (NFS) file handle.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct lgetfh_args {
 	char	*fname;
 	fhandle_t *fhp;
 };
 #endif
 int
 sys_lgetfh(td, uap)
 	struct thread *td;
 	register struct lgetfh_args *uap;
 {
 	struct nameidata nd;
 	fhandle_t fh;
 	register struct vnode *vp;
 	int error;
 
 	error = priv_check(td, PRIV_VFS_GETFH);
 	if (error != 0)
 		return (error);
 	NDINIT(&nd, LOOKUP, NOFOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE,
 	    uap->fname, td);
 	error = namei(&nd);
 	if (error != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vp = nd.ni_vp;
 	bzero(&fh, sizeof(fh));
 	fh.fh_fsid = vp->v_mount->mnt_stat.f_fsid;
 	error = VOP_VPTOFH(vp, &fh.fh_fid);
 	vput(vp);
 	if (error == 0)
 		error = copyout(&fh, uap->fhp, sizeof (fh));
 	return (error);
 }
 
 #ifndef _SYS_SYSPROTO_H_
 struct getfh_args {
 	char	*fname;
 	fhandle_t *fhp;
 };
 #endif
 int
 sys_getfh(td, uap)
 	struct thread *td;
 	register struct getfh_args *uap;
 {
 	struct nameidata nd;
 	fhandle_t fh;
 	register struct vnode *vp;
 	int error;
 
 	error = priv_check(td, PRIV_VFS_GETFH);
 	if (error != 0)
 		return (error);
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | AUDITVNODE1, UIO_USERSPACE,
 	    uap->fname, td);
 	error = namei(&nd);
 	if (error != 0)
 		return (error);
 	NDFREE(&nd, NDF_ONLY_PNBUF);
 	vp = nd.ni_vp;
 	bzero(&fh, sizeof(fh));
 	fh.fh_fsid = vp->v_mount->mnt_stat.f_fsid;
 	error = VOP_VPTOFH(vp, &fh.fh_fid);
 	vput(vp);
 	if (error == 0)
 		error = copyout(&fh, uap->fhp, sizeof (fh));
 	return (error);
 }
 
 /*
  * syscall for the rpc.lockd to use to translate an NFS file handle into an
  * open descriptor.
  *
  * warning: do not remove the priv_check() call or this becomes one giant
  * security hole.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fhopen_args {
 	const struct fhandle *u_fhp;
 	int flags;
 };
 #endif
 int
 sys_fhopen(td, uap)
 	struct thread *td;
 	struct fhopen_args /* {
 		const struct fhandle *u_fhp;
 		int flags;
 	} */ *uap;
 {
 	struct mount *mp;
 	struct vnode *vp;
 	struct fhandle fhp;
 	struct file *fp;
 	int fmode, error;
 	int indx;
 
 	error = priv_check(td, PRIV_VFS_FHOPEN);
 	if (error != 0)
 		return (error);
 	indx = -1;
 	fmode = FFLAGS(uap->flags);
 	/* why not allow a non-read/write open for our lockd? */
 	if (((fmode & (FREAD | FWRITE)) == 0) || (fmode & O_CREAT))
 		return (EINVAL);
 	error = copyin(uap->u_fhp, &fhp, sizeof(fhp));
 	if (error != 0)
 		return (error);
 	/* find the mount point */
 	mp = vfs_busyfs(&fhp.fh_fsid);
 	if (mp == NULL)
 		return (ESTALE);
 	/* now give me my vnode, it gets returned to me locked */
 	error = VFS_FHTOVP(mp, &fhp.fh_fid, LK_EXCLUSIVE, &vp);
 	vfs_unbusy(mp);
 	if (error != 0)
 		return (error);
 
 	error = falloc_noinstall(td, &fp);
 	if (error != 0) {
 		vput(vp);
 		return (error);
 	}
 	/*
 	 * An extra reference on `fp' has been held for us by
 	 * falloc_noinstall().
 	 */
 
 #ifdef INVARIANTS
 	td->td_dupfd = -1;
 #endif
 	error = vn_open_vnode(vp, fmode, td->td_ucred, td, fp);
 	if (error != 0) {
 		KASSERT(fp->f_ops == &badfileops,
 		    ("VOP_OPEN in fhopen() set f_ops"));
 		KASSERT(td->td_dupfd < 0,
 		    ("fhopen() encountered fdopen()"));
 
 		vput(vp);
 		goto bad;
 	}
 #ifdef INVARIANTS
 	td->td_dupfd = 0;
 #endif
 	fp->f_vnode = vp;
 	fp->f_seqcount = 1;
 	finit(fp, (fmode & FMASK) | (fp->f_flag & FHASLOCK), DTYPE_VNODE, vp,
 	    &vnops);
 	VOP_UNLOCK(vp, 0);
 	if ((fmode & O_TRUNC) != 0) {
 		error = fo_truncate(fp, 0, td->td_ucred, td);
 		if (error != 0)
 			goto bad;
 	}
 
 	error = finstall(td, fp, &indx, fmode, NULL);
 bad:
 	fdrop(fp, td);
 	td->td_retval[0] = indx;
 	return (error);
 }
 
 /*
  * Stat an (NFS) file handle.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fhstat_args {
 	struct fhandle *u_fhp;
 	struct stat *sb;
 };
 #endif
 int
 sys_fhstat(td, uap)
 	struct thread *td;
 	register struct fhstat_args /* {
 		struct fhandle *u_fhp;
 		struct stat *sb;
 	} */ *uap;
 {
 	struct stat sb;
 	struct fhandle fh;
 	int error;
 
 	error = copyin(uap->u_fhp, &fh, sizeof(fh));
 	if (error != 0)
 		return (error);
 	error = kern_fhstat(td, fh, &sb);
 	if (error == 0)
 		error = copyout(&sb, uap->sb, sizeof(sb));
 	return (error);
 }
 
 int
 kern_fhstat(struct thread *td, struct fhandle fh, struct stat *sb)
 {
 	struct mount *mp;
 	struct vnode *vp;
 	int error;
 
 	error = priv_check(td, PRIV_VFS_FHSTAT);
 	if (error != 0)
 		return (error);
 	if ((mp = vfs_busyfs(&fh.fh_fsid)) == NULL)
 		return (ESTALE);
 	error = VFS_FHTOVP(mp, &fh.fh_fid, LK_EXCLUSIVE, &vp);
 	vfs_unbusy(mp);
 	if (error != 0)
 		return (error);
 	error = vn_stat(vp, sb, td->td_ucred, NOCRED, td);
 	vput(vp);
 	return (error);
 }
 
 /*
  * Implement fstatfs() for (NFS) file handles.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct fhstatfs_args {
 	struct fhandle *u_fhp;
 	struct statfs *buf;
 };
 #endif
 int
 sys_fhstatfs(td, uap)
 	struct thread *td;
 	struct fhstatfs_args /* {
 		struct fhandle *u_fhp;
 		struct statfs *buf;
 	} */ *uap;
 {
 	struct statfs sf;
 	fhandle_t fh;
 	int error;
 
 	error = copyin(uap->u_fhp, &fh, sizeof(fhandle_t));
 	if (error != 0)
 		return (error);
 	error = kern_fhstatfs(td, fh, &sf);
 	if (error != 0)
 		return (error);
 	return (copyout(&sf, uap->buf, sizeof(sf)));
 }
 
 int
 kern_fhstatfs(struct thread *td, fhandle_t fh, struct statfs *buf)
 {
 	struct statfs *sp;
 	struct mount *mp;
 	struct vnode *vp;
 	int error;
 
 	error = priv_check(td, PRIV_VFS_FHSTATFS);
 	if (error != 0)
 		return (error);
 	if ((mp = vfs_busyfs(&fh.fh_fsid)) == NULL)
 		return (ESTALE);
 	error = VFS_FHTOVP(mp, &fh.fh_fid, LK_EXCLUSIVE, &vp);
 	if (error != 0) {
 		vfs_unbusy(mp);
 		return (error);
 	}
 	vput(vp);
 	error = prison_canseemount(td->td_ucred, mp);
 	if (error != 0)
 		goto out;
 #ifdef MAC
 	error = mac_mount_check_stat(td->td_ucred, mp);
 	if (error != 0)
 		goto out;
 #endif
 	/*
 	 * Set these in case the underlying filesystem fails to do so.
 	 */
 	sp = &mp->mnt_stat;
 	sp->f_version = STATFS_VERSION;
 	sp->f_namemax = NAME_MAX;
 	sp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
 	error = VFS_STATFS(mp, sp);
 	if (error == 0)
 		*buf = *sp;
 out:
 	vfs_unbusy(mp);
 	return (error);
 }
 
 int
 kern_posix_fallocate(struct thread *td, int fd, off_t offset, off_t len)
 {
 	struct file *fp;
 	struct mount *mp;
 	struct vnode *vp;
 	cap_rights_t rights;
 	off_t olen, ooffset;
 	int error;
 
 	if (offset < 0 || len <= 0)
 		return (EINVAL);
 	/* Check for wrap. */
 	if (offset > OFF_MAX - len)
 		return (EFBIG);
 	error = fget(td, fd, cap_rights_init(&rights, CAP_WRITE), &fp);
 	if (error != 0)
 		return (error);
 	if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0) {
 		error = ESPIPE;
 		goto out;
 	}
 	if ((fp->f_flag & FWRITE) == 0) {
 		error = EBADF;
 		goto out;
 	}
 	if (fp->f_type != DTYPE_VNODE) {
 		error = ENODEV;
 		goto out;
 	}
 	vp = fp->f_vnode;
 	if (vp->v_type != VREG) {
 		error = ENODEV;
 		goto out;
 	}
 
 	/* Allocating blocks may take a long time, so iterate. */
 	for (;;) {
 		olen = len;
 		ooffset = offset;
 
 		bwillwrite();
 		mp = NULL;
 		error = vn_start_write(vp, &mp, V_WAIT | PCATCH);
 		if (error != 0)
 			break;
 		error = vn_lock(vp, LK_EXCLUSIVE);
 		if (error != 0) {
 			vn_finished_write(mp);
 			break;
 		}
 #ifdef MAC
 		error = mac_vnode_check_write(td->td_ucred, fp->f_cred, vp);
 		if (error == 0)
 #endif
 			error = VOP_ALLOCATE(vp, &offset, &len);
 		VOP_UNLOCK(vp, 0);
 		vn_finished_write(mp);
 
 		if (olen + ooffset != offset + len) {
 			panic("offset + len changed from %jx/%jx to %jx/%jx",
 			    ooffset, olen, offset, len);
 		}
 		if (error != 0 || len == 0)
 			break;
 		KASSERT(olen > len, ("Iteration did not make progress?"));
 		maybe_yield();
 	}
  out:
 	fdrop(fp, td);
 	return (error);
 }
 
 int
 sys_posix_fallocate(struct thread *td, struct posix_fallocate_args *uap)
 {
 
 	td->td_retval[0] = kern_posix_fallocate(td, uap->fd, uap->offset,
 	    uap->len);
 	return (0);
 }
 
 /*
  * Unlike madvise(2), we do not make a best effort to remember every
  * possible caching hint.  Instead, we remember the last setting with
  * the exception that we will allow POSIX_FADV_NORMAL to adjust the
  * region of any current setting.
  */
 int
 kern_posix_fadvise(struct thread *td, int fd, off_t offset, off_t len,
     int advice)
 {
 	struct fadvise_info *fa, *new;
 	struct file *fp;
 	struct vnode *vp;
 	cap_rights_t rights;
 	off_t end;
 	int error;
 
 	if (offset < 0 || len < 0 || offset > OFF_MAX - len)
 		return (EINVAL);
 	switch (advice) {
 	case POSIX_FADV_SEQUENTIAL:
 	case POSIX_FADV_RANDOM:
 	case POSIX_FADV_NOREUSE:
 		new = malloc(sizeof(*fa), M_FADVISE, M_WAITOK);
 		break;
 	case POSIX_FADV_NORMAL:
 	case POSIX_FADV_WILLNEED:
 	case POSIX_FADV_DONTNEED:
 		new = NULL;
 		break;
 	default:
 		return (EINVAL);
 	}
 	/* XXX: CAP_POSIX_FADVISE? */
 	error = fget(td, fd, cap_rights_init(&rights), &fp);
 	if (error != 0)
 		goto out;
 	if ((fp->f_ops->fo_flags & DFLAG_SEEKABLE) == 0) {
 		error = ESPIPE;
 		goto out;
 	}
 	if (fp->f_type != DTYPE_VNODE) {
 		error = ENODEV;
 		goto out;
 	}
 	vp = fp->f_vnode;
 	if (vp->v_type != VREG) {
 		error = ENODEV;
 		goto out;
 	}
 	if (len == 0)
 		end = OFF_MAX;
 	else
 		end = offset + len - 1;
 	switch (advice) {
 	case POSIX_FADV_SEQUENTIAL:
 	case POSIX_FADV_RANDOM:
 	case POSIX_FADV_NOREUSE:
 		/*
 		 * Try to merge any existing non-standard region with
 		 * this new region if possible, otherwise create a new
 		 * non-standard region for this request.
 		 */
 		mtx_pool_lock(mtxpool_sleep, fp);
 		fa = fp->f_advice;
 		if (fa != NULL && fa->fa_advice == advice &&
 		    ((fa->fa_start <= end && fa->fa_end >= offset) ||
 		    (end != OFF_MAX && fa->fa_start == end + 1) ||
 		    (fa->fa_end != OFF_MAX && fa->fa_end + 1 == offset))) {
 			if (offset < fa->fa_start)
 				fa->fa_start = offset;
 			if (end > fa->fa_end)
 				fa->fa_end = end;
 		} else {
 			new->fa_advice = advice;
 			new->fa_start = offset;
 			new->fa_end = end;
 			new->fa_prevstart = 0;
 			new->fa_prevend = 0;
 			fp->f_advice = new;
 			new = fa;
 		}
 		mtx_pool_unlock(mtxpool_sleep, fp);
 		break;
 	case POSIX_FADV_NORMAL:
 		/*
 		 * If the "normal" region overlaps with an existing
 		 * non-standard region, trim or remove the
 		 * non-standard region.
 		 */
 		mtx_pool_lock(mtxpool_sleep, fp);
 		fa = fp->f_advice;
 		if (fa != NULL) {
 			if (offset <= fa->fa_start && end >= fa->fa_end) {
 				new = fa;
 				fp->f_advice = NULL;
 			} else if (offset <= fa->fa_start &&
 			    end >= fa->fa_start)
 				fa->fa_start = end + 1;
 			else if (offset <= fa->fa_end && end >= fa->fa_end)
 				fa->fa_end = offset - 1;
 			else if (offset >= fa->fa_start && end <= fa->fa_end) {
 				/*
 				 * If the "normal" region is a middle
 				 * portion of the existing
 				 * non-standard region, just remove
 				 * the whole thing rather than picking
 				 * one side or the other to
 				 * preserve.
 				 */
 				new = fa;
 				fp->f_advice = NULL;
 			}
 		}
 		mtx_pool_unlock(mtxpool_sleep, fp);
 		break;
 	case POSIX_FADV_WILLNEED:
 	case POSIX_FADV_DONTNEED:
 		error = VOP_ADVISE(vp, offset, end, advice);
 		break;
 	}
 out:
 	if (fp != NULL)
 		fdrop(fp, td);
 	free(new, M_FADVISE);
 	return (error);
 }
 
 int
 sys_posix_fadvise(struct thread *td, struct posix_fadvise_args *uap)
 {
 
 	td->td_retval[0] = kern_posix_fadvise(td, uap->fd, uap->offset,
 	    uap->len, uap->advice);
 	return (0);
 }
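
The region handling in kern_posix_fadvise() above can be tried outside the
kernel.  The following is a minimal userland sketch, not the kernel code: the
sk_* names and SK_OFF_MAX are hypothetical stand-ins, and the mtx_pool locking
and the M_FADVISE allocation are left out.  It only mirrors the inclusive end
calculation and the overlap-or-adjacent merge test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's OFF_MAX. */
#define	SK_OFF_MAX	INT64_MAX

/* Hypothetical, reduced form of an advice region; end is inclusive. */
struct sk_region {
	int64_t	start;
	int64_t	end;
};

/*
 * Inclusive end of the range (offset, len).  A len of 0 means "through
 * the end of the file", as in the code above.
 */
static int64_t
sk_region_end(int64_t offset, int64_t len)
{

	/* Same preconditions the syscall checks before using the range. */
	assert(offset >= 0 && len >= 0 && offset <= SK_OFF_MAX - len);
	return (len == 0 ? SK_OFF_MAX : offset + len - 1);
}

/*
 * Widen *r to cover [offset, end] if the two ranges overlap or are
 * directly adjacent.  Returns 1 if merged, 0 if the ranges are disjoint.
 * The SK_OFF_MAX guards keep "end + 1" from overflowing.
 */
static int
sk_region_merge(struct sk_region *r, int64_t offset, int64_t end)
{

	if ((r->start <= end && r->end >= offset) ||
	    (end != SK_OFF_MAX && r->start == end + 1) ||
	    (r->end != SK_OFF_MAX && r->end + 1 == offset)) {
		if (offset < r->start)
			r->start = offset;
		if (end > r->end)
			r->end = end;
		return (1);
	}
	return (0);
}
```

In this sketch, merging the request (offset 200, len 50) into an existing
region [100, 199] succeeds because the two ranges are adjacent, while a
request far away from the region is reported as disjoint.  In the kernel code
the disjoint case replaces the stored region with the new one.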
Index: user/ngie/more-tests/sys/net/if_types.h
===================================================================
--- user/ngie/more-tests/sys/net/if_types.h	(revision 281584)
+++ user/ngie/more-tests/sys/net/if_types.h	(revision 281585)
@@ -1,252 +1,252 @@
 /*-
  * Copyright (c) 1989, 1993, 1994
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)if_types.h	8.3 (Berkeley) 4/28/95
  * $FreeBSD$
  * $NetBSD: if_types.h,v 1.16 2000/04/19 06:30:53 itojun Exp $
  */
 
 #ifndef _NET_IF_TYPES_H_
 #define _NET_IF_TYPES_H_
 
 /*
  * Interface types for benefit of parsing media address headers.
  * This list is derived from the SNMP list of ifTypes, originally
  * documented in RFC1573, now maintained as:
  *
  * 	http://www.iana.org/assignments/smi-numbers
  */
 
 #define	IFT_OTHER	0x1		/* none of the following */
 #define	IFT_1822	0x2		/* old-style arpanet imp */
 #define	IFT_HDH1822	0x3		/* HDH arpanet imp */
 #define	IFT_X25DDN	0x4		/* x25 to imp */
 #define	IFT_X25		0x5		/* PDN X25 interface (RFC877) */
 #define	IFT_ETHER	0x6		/* Ethernet CSMA/CD */
 #define	IFT_ISO88023	0x7		/* CSMA/CD */
 #define	IFT_ISO88024	0x8		/* Token Bus */
 #define	IFT_ISO88025	0x9		/* Token Ring */
 #define	IFT_ISO88026	0xa		/* MAN */
 #define	IFT_STARLAN	0xb
 #define	IFT_P10		0xc		/* Proteon 10MBit ring */
 #define	IFT_P80		0xd		/* Proteon 80MBit ring */
 #define	IFT_HY		0xe		/* Hyperchannel */
 #define	IFT_FDDI	0xf
 #define	IFT_LAPB	0x10
 #define	IFT_SDLC	0x11
 #define	IFT_T1		0x12
 #define	IFT_CEPT	0x13		/* E1 - european T1 */
 #define	IFT_ISDNBASIC	0x14
 #define	IFT_ISDNPRIMARY	0x15
 #define	IFT_PTPSERIAL	0x16		/* Proprietary PTP serial */
 #define	IFT_PPP		0x17		/* RFC 1331 */
 #define	IFT_LOOP	0x18		/* loopback */
 #define	IFT_EON		0x19		/* ISO over IP */
 #define	IFT_XETHER	0x1a		/* obsolete 3MB experimental ethernet */
 #define	IFT_NSIP	0x1b		/* XNS over IP */
 #define	IFT_SLIP	0x1c		/* IP over generic TTY */
 #define	IFT_ULTRA	0x1d		/* Ultra Technologies */
 #define	IFT_DS3		0x1e		/* Generic T3 */
 #define	IFT_SIP		0x1f		/* SMDS */
 #define	IFT_FRELAY	0x20		/* Frame Relay DTE only */
 #define	IFT_RS232	0x21
 #define	IFT_PARA	0x22		/* parallel-port */
 #define	IFT_ARCNET	0x23
 #define	IFT_ARCNETPLUS	0x24
 #define	IFT_ATM		0x25		/* ATM cells */
 #define	IFT_MIOX25	0x26
 #define	IFT_SONET	0x27		/* SONET or SDH */
 #define	IFT_X25PLE	0x28
 #define	IFT_ISO88022LLC	0x29
 #define	IFT_LOCALTALK	0x2a
 #define	IFT_SMDSDXI	0x2b
 #define	IFT_FRELAYDCE	0x2c		/* Frame Relay DCE */
 #define	IFT_V35		0x2d
 #define	IFT_HSSI	0x2e
 #define	IFT_HIPPI	0x2f
 #define	IFT_MODEM	0x30		/* Generic Modem */
 #define	IFT_AAL5	0x31		/* AAL5 over ATM */
 #define	IFT_SONETPATH	0x32
 #define	IFT_SONETVT	0x33
 #define	IFT_SMDSICIP	0x34		/* SMDS InterCarrier Interface */
 #define	IFT_PROPVIRTUAL	0x35		/* Proprietary Virtual/internal */
 #define	IFT_PROPMUX	0x36		/* Proprietary Multiplexing */
 #define	IFT_IEEE80212		   0x37 /* 100BaseVG */
 #define	IFT_FIBRECHANNEL	   0x38 /* Fibre Channel */
 #define	IFT_HIPPIINTERFACE	   0x39 /* HIPPI interfaces	 */
 #define	IFT_FRAMERELAYINTERCONNECT 0x3a /* Obsolete, use either 0x20 or 0x2c */
 #define	IFT_AFLANE8023		   0x3b /* ATM Emulated LAN for 802.3 */
 #define	IFT_AFLANE8025		   0x3c /* ATM Emulated LAN for 802.5 */
 #define	IFT_CCTEMUL		   0x3d /* ATM Emulated circuit		  */
 #define	IFT_FASTETHER		   0x3e /* Fast Ethernet (100BaseT) */
 #define	IFT_ISDN		   0x3f /* ISDN and X.25	    */
 #define	IFT_V11			   0x40 /* CCITT V.11/X.21		*/
 #define	IFT_V36			   0x41 /* CCITT V.36			*/
 #define	IFT_G703AT64K		   0x42 /* CCITT G703 at 64Kbps */
 #define	IFT_G703AT2MB		   0x43 /* Obsolete see DS1-MIB */
 #define	IFT_QLLC		   0x44 /* SNA QLLC			*/
 #define	IFT_FASTETHERFX		   0x45 /* Fast Ethernet (100BaseFX)	*/
 #define	IFT_CHANNEL		   0x46 /* channel			*/
 #define	IFT_IEEE80211		   0x47 /* radio spread spectrum	*/
 #define	IFT_IBM370PARCHAN	   0x48 /* IBM System 360/370 OEMI Channel */
 #define	IFT_ESCON		   0x49 /* IBM Enterprise Systems Connection */
 #define	IFT_DLSW		   0x4a /* Data Link Switching */
 #define	IFT_ISDNS		   0x4b /* ISDN S/T interface */
 #define	IFT_ISDNU		   0x4c /* ISDN U interface */
 #define	IFT_LAPD		   0x4d /* Link Access Protocol D */
 #define	IFT_IPSWITCH		   0x4e /* IP Switching Objects */
 #define	IFT_RSRB		   0x4f /* Remote Source Route Bridging */
 #define	IFT_ATMLOGICAL		   0x50 /* ATM Logical Port */
 #define	IFT_DS0			   0x51 /* Digital Signal Level 0 */
 #define	IFT_DS0BUNDLE		   0x52 /* group of ds0s on the same ds1 */
 #define	IFT_BSC			   0x53 /* Bisynchronous Protocol */
 #define	IFT_ASYNC		   0x54 /* Asynchronous Protocol */
 #define	IFT_CNR			   0x55 /* Combat Net Radio */
 #define	IFT_ISO88025DTR		   0x56 /* ISO 802.5r DTR */
 #define	IFT_EPLRS		   0x57 /* Ext Pos Loc Report Sys */
 #define	IFT_ARAP		   0x58 /* Appletalk Remote Access Protocol */
 #define	IFT_PROPCNLS		   0x59 /* Proprietary Connectionless Protocol*/
 #define	IFT_HOSTPAD		   0x5a /* CCITT-ITU X.29 PAD Protocol */
 #define	IFT_TERMPAD		   0x5b /* CCITT-ITU X.3 PAD Facility */
 #define	IFT_FRAMERELAYMPI	   0x5c /* Multiproto Interconnect over FR */
 #define	IFT_X213		   0x5d /* CCITT-ITU X213 */
 #define	IFT_ADSL		   0x5e /* Asymmetric Digital Subscriber Loop */
 #define	IFT_RADSL		   0x5f /* Rate-Adapt. Digital Subscriber Loop*/
 #define	IFT_SDSL		   0x60 /* Symmetric Digital Subscriber Loop */
 #define	IFT_VDSL		   0x61 /* Very H-Speed Digital Subscrib. Loop*/
 #define	IFT_ISO88025CRFPINT	   0x62 /* ISO 802.5 CRFP */
 #define	IFT_MYRINET		   0x63 /* Myricom Myrinet */
 #define	IFT_VOICEEM		   0x64 /* voice recEive and transMit */
 #define	IFT_VOICEFXO		   0x65 /* voice Foreign Exchange Office */
 #define	IFT_VOICEFXS		   0x66 /* voice Foreign Exchange Station */
 #define	IFT_VOICEENCAP		   0x67 /* voice encapsulation */
 #define	IFT_VOICEOVERIP		   0x68 /* voice over IP encapsulation */
 #define	IFT_ATMDXI		   0x69 /* ATM DXI */
 #define	IFT_ATMFUNI		   0x6a /* ATM FUNI */
 #define	IFT_ATMIMA		   0x6b /* ATM IMA		      */
 #define	IFT_PPPMULTILINKBUNDLE	   0x6c /* PPP Multilink Bundle */
 #define	IFT_IPOVERCDLC		   0x6d /* IBM ipOverCdlc */
 #define	IFT_IPOVERCLAW		   0x6e /* IBM Common Link Access to Workstn */
 #define	IFT_STACKTOSTACK	   0x6f /* IBM stackToStack */
 #define	IFT_VIRTUALIPADDRESS	   0x70 /* IBM VIPA */
 #define	IFT_MPC			   0x71 /* IBM multi-protocol channel support */
 #define	IFT_IPOVERATM		   0x72 /* IBM ipOverAtm */
 #define	IFT_ISO88025FIBER	   0x73 /* ISO 802.5j Fiber Token Ring */
 #define	IFT_TDLC		   0x74 /* IBM twinaxial data link control */
 #define	IFT_GIGABITETHERNET	   0x75 /* Gigabit Ethernet */
 #define	IFT_HDLC		   0x76 /* HDLC */
 #define	IFT_LAPF		   0x77 /* LAP F */
 #define	IFT_V37			   0x78 /* V.37 */
 #define	IFT_X25MLP		   0x79 /* Multi-Link Protocol */
 #define	IFT_X25HUNTGROUP	   0x7a /* X25 Hunt Group */
 #define	IFT_TRANSPHDLC		   0x7b /* Transp HDLC */
 #define	IFT_INTERLEAVE		   0x7c /* Interleave channel */
 #define	IFT_FAST		   0x7d /* Fast channel */
 #define	IFT_IP			   0x7e /* IP (for APPN HPR in IP networks) */
 #define	IFT_DOCSCABLEMACLAYER	   0x7f /* CATV Mac Layer */
 #define	IFT_DOCSCABLEDOWNSTREAM	   0x80 /* CATV Downstream interface */
 #define	IFT_DOCSCABLEUPSTREAM	   0x81 /* CATV Upstream interface */
 #define	IFT_A12MPPSWITCH	   0x82	/* Avalon Parallel Processor */
 #define	IFT_TUNNEL		   0x83	/* Encapsulation interface */
 #define	IFT_COFFEE		   0x84	/* coffee pot */
 #define	IFT_CES			   0x85	/* Circuit Emulation Service */
 #define	IFT_ATMSUBINTERFACE	   0x86	/* (x)  ATM Sub Interface */
 #define	IFT_L2VLAN		   0x87	/* Layer 2 Virtual LAN using 802.1Q */
 #define	IFT_L3IPVLAN		   0x88	/* Layer 3 Virtual LAN - IP Protocol */
 #define	IFT_L3IPXVLAN		   0x89	/* Layer 3 Virtual LAN - IPX Prot. */
 #define	IFT_DIGITALPOWERLINE	   0x8a	/* IP over Power Lines */
 #define	IFT_MEDIAMAILOVERIP	   0x8b	/* (xxx)  Multimedia Mail over IP */
 #define	IFT_DTM			   0x8c	/* Dynamic synchronous Transfer Mode */
 #define	IFT_DCN			   0x8d	/* Data Communications Network */
 #define	IFT_IPFORWARD		   0x8e	/* IP Forwarding Interface */
 #define	IFT_MSDSL		   0x8f	/* Multi-rate Symmetric DSL */
 #define	IFT_IEEE1394		   0x90	/* IEEE1394 High Performance SerialBus*/
 #define	IFT_IFGSN		   0x91	/* HIPPI-6400 */
 #define	IFT_DVBRCCMACLAYER	   0x92	/* DVB-RCC MAC Layer */
 #define	IFT_DVBRCCDOWNSTREAM	   0x93	/* DVB-RCC Downstream Channel */
 #define	IFT_DVBRCCUPSTREAM	   0x94	/* DVB-RCC Upstream Channel */
 #define	IFT_ATMVIRTUAL		   0x95	/* ATM Virtual Interface */
 #define	IFT_MPLSTUNNEL		   0x96	/* MPLS Tunnel Virtual Interface */
 #define	IFT_SRP			   0x97	/* Spatial Reuse Protocol */
 #define	IFT_VOICEOVERATM	   0x98	/* Voice over ATM */
 #define	IFT_VOICEOVERFRAMERELAY	   0x99	/* Voice Over Frame Relay */
 #define	IFT_IDSL		   0x9a	/* Digital Subscriber Loop over ISDN */
 #define	IFT_COMPOSITELINK	   0x9b	/* Avici Composite Link Interface */
 #define	IFT_SS7SIGLINK		   0x9c	/* SS7 Signaling Link */
 #define	IFT_PROPWIRELESSP2P	   0x9d	/* Prop. P2P wireless interface */
 #define	IFT_FRFORWARD		   0x9e	/* Frame forward Interface */
 #define	IFT_RFC1483		   0x9f	/* Multiprotocol over ATM AAL5 */
 #define	IFT_USB			   0xa0	/* USB Interface */
 #define	IFT_IEEE8023ADLAG	   0xa1	/* IEEE 802.3ad Link Aggregate*/
 #define	IFT_BGPPOLICYACCOUNTING	   0xa2	/* BGP Policy Accounting */
 #define	IFT_FRF16MFRBUNDLE	   0xa3	/* FRF.16 Multilink Frame Relay*/
 #define	IFT_H323GATEKEEPER	   0xa4	/* H323 Gatekeeper */
 #define	IFT_H323PROXY		   0xa5	/* H323 Voice and Video Proxy */
 #define	IFT_MPLS		   0xa6	/* MPLS */
 #define	IFT_MFSIGLINK		   0xa7	/* Multi-frequency signaling link */
 #define	IFT_HDSL2		   0xa8	/* High Bit-Rate DSL, 2nd gen. */
 #define	IFT_SHDSL		   0xa9	/* Multirate HDSL2 */
 #define	IFT_DS1FDL		   0xaa	/* Facility Data Link (4Kbps) on a DS1*/
 #define	IFT_POS			   0xab	/* Packet over SONET/SDH Interface */
 #define	IFT_DVBASILN		   0xac	/* DVB-ASI Input */
 #define	IFT_DVBASIOUT		   0xad	/* DVB-ASI Output */
 #define	IFT_PLC			   0xae	/* Power Line Communications */
 #define	IFT_NFAS		   0xaf	/* Non-Facility Associated Signaling */
 #define	IFT_TR008		   0xb0	/* TR008 */
 #define	IFT_GR303RDT		   0xb1	/* Remote Digital Terminal */
 #define	IFT_GR303IDT		   0xb2	/* Integrated Digital Terminal */
 #define	IFT_ISUP		   0xb3	/* ISUP */
 #define	IFT_PROPDOCSWIRELESSMACLAYER	   0xb4	/* prop/Wireless MAC Layer */
 #define	IFT_PROPDOCSWIRELESSDOWNSTREAM	   0xb5	/* prop/Wireless Downstream */
 #define	IFT_PROPDOCSWIRELESSUPSTREAM	   0xb6	/* prop/Wireless Upstream */
 #define	IFT_HIPERLAN2		   0xb7	/* HIPERLAN Type 2 Radio Interface */
 #define	IFT_PROPBWAP2MP		   0xb8	/* PropBroadbandWirelessAccess P2MP*/
 #define	IFT_SONETOVERHEADCHANNEL   0xb9	/* SONET Overhead Channel */
 #define	IFT_DIGITALWRAPPEROVERHEADCHANNEL  0xba	/* Digital Wrapper Overhead */
 #define	IFT_AAL2		   0xbb	/* ATM adaptation layer 2 */
 #define	IFT_RADIOMAC		   0xbc	/* MAC layer over radio links */
 #define	IFT_ATMRADIO		   0xbd	/* ATM over radio links */
 #define	IFT_IMT			   0xbe /* Inter-Machine Trunks */
 #define	IFT_MVL			   0xbf /* Multiple Virtual Lines DSL */
 #define	IFT_REACHDSL		   0xc0 /* Long Reach DSL */
 #define	IFT_FRDLCIENDPT		   0xc1 /* Frame Relay DLCI End Point */
 #define	IFT_ATMVCIENDPT		   0xc2 /* ATM VCI End Point */
 #define	IFT_OPTICALCHANNEL	   0xc3 /* Optical Channel */
 #define	IFT_OPTICALTRANSPORT	   0xc4 /* Optical Transport */
 #define	IFT_INFINIBAND		   0xc7	/* Infiniband */
 #define	IFT_BRIDGE		   0xd1 /* Transparent bridge interface */
 
 #define	IFT_STF			   0xd7	/* 6to4 interface */
 
-/* not based on IANA assignments */
-#define	IFT_GIF		0xf0
-#define	IFT_PVC		0xf1
-#define	IFT_ENC		0xf4
-#define	IFT_PFLOG	0xf6
-#define	IFT_PFSYNC	0xf7
+/* FreeBSD specific, not based on IANA assignments */
+#define	IFT_GIF		0xf0 /* Generic tunnel interface */
+#define	IFT_PVC		0xf1 /* Unused */
+#define	IFT_ENC		0xf4 /* Encapsulating interface */
+#define	IFT_PFLOG	0xf6 /* PF packet filter logging */
+#define	IFT_PFSYNC	0xf7 /* PF packet filter synchronization */
 #endif /* !_NET_IF_TYPES_H_ */
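
The pfvar.h diff that follows changes PF_AEQ in the dual-stack (PF_INET_INET6) case so that the full 128-bit comparison only applies when the address family is AF_INET6. The sketch below is one way to see the effect. It is not part of the commit: `struct addr128`, the `OLD_`/`NEW_` macro names, and the wrapper and demo functions are hypothetical names introduced here for illustration, and the macro bodies are copied from the removed and added lines of that diff.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>		/* AF_UNSPEC, AF_INET, AF_INET6 */

/* Hypothetical stand-in for struct pf_addr: only the addr32 view is needed. */
struct addr128 {
	uint32_t addr32[4];
};

/* Dual-stack PF_AEQ as it read before this revision. */
#define OLD_PF_AEQ(a, b, c) \
	((c == AF_INET && (a)->addr32[0] == (b)->addr32[0]) || \
	((a)->addr32[3] == (b)->addr32[3] && \
	(a)->addr32[2] == (b)->addr32[2] && \
	(a)->addr32[1] == (b)->addr32[1] && \
	(a)->addr32[0] == (b)->addr32[0]))

/* Dual-stack PF_AEQ after this revision: the 128-bit arm needs AF_INET6. */
#define NEW_PF_AEQ(a, b, c) \
	((c == AF_INET && (a)->addr32[0] == (b)->addr32[0]) || \
	(c == AF_INET6 && (a)->addr32[3] == (b)->addr32[3] && \
	(a)->addr32[2] == (b)->addr32[2] && \
	(a)->addr32[1] == (b)->addr32[1] && \
	(a)->addr32[0] == (b)->addr32[0]))

/* Function wrappers so the two variants are easy to call side by side. */
int
old_pf_aeq(const struct addr128 *a, const struct addr128 *b, int af)
{
	return (OLD_PF_AEQ(a, b, af));
}

int
new_pf_aeq(const struct addr128 *a, const struct addr128 *b, int af)
{
	return (NEW_PF_AEQ(a, b, af));
}

/* Illustrative checks; call from any test harness. */
void
pf_aeq_demo(void)
{
	struct addr128 a = { { 1, 2, 3, 4 } };
	struct addr128 same = { { 1, 2, 3, 4 } };
	struct addr128 word0 = { { 1, 9, 9, 9 } };	/* only word 0 matches */

	/* Identical addresses, known families: both variants agree. */
	assert(old_pf_aeq(&a, &same, AF_INET6) && new_pf_aeq(&a, &same, AF_INET6));
	assert(old_pf_aeq(&a, &same, AF_INET) && new_pf_aeq(&a, &same, AF_INET));

	/* AF_INET compares word 0 only, before and after the change. */
	assert(old_pf_aeq(&a, &word0, AF_INET) && new_pf_aeq(&a, &word0, AF_INET));
	assert(!old_pf_aeq(&a, &word0, AF_INET6) && !new_pf_aeq(&a, &word0, AF_INET6));

	/* The behavioural difference: a family that is neither INET nor INET6. */
	assert(old_pf_aeq(&a, &same, AF_UNSPEC));
	assert(!new_pf_aeq(&a, &same, AF_UNSPEC));
}
```

Under these assumptions the only input where the two variants differ is an address family other than AF_INET or AF_INET6. PF_AZERO receives the same AF_INET6 guard in the same hunk. The rewritten PF_ANEQ, by contrast, appears to be logically equivalent to the old form: the removed `c == AF_INET` arm already implied a word-0 mismatch, which the four-word comparison covers.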
Index: user/ngie/more-tests/sys/net/pfvar.h
===================================================================
--- user/ngie/more-tests/sys/net/pfvar.h	(revision 281584)
+++ user/ngie/more-tests/sys/net/pfvar.h	(revision 281585)
@@ -1,1744 +1,1743 @@
 /*
  * Copyright (c) 2001 Daniel Hartmeier
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  *
  *    - Redistributions of source code must retain the above copyright
  *      notice, this list of conditions and the following disclaimer.
  *    - Redistributions in binary form must reproduce the above
  *      copyright notice, this list of conditions and the following
  *      disclaimer in the documentation and/or other materials provided
  *      with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
  * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
  * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
  * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
  * POSSIBILITY OF SUCH DAMAGE.
  *
  *	$OpenBSD: pfvar.h,v 1.282 2009/01/29 15:12:28 pyr Exp $
  *	$FreeBSD$
  */
 
 #ifndef _NET_PFVAR_H_
 #define _NET_PFVAR_H_
 
 #include <sys/param.h>
 #include <sys/queue.h>
 #include <sys/counter.h>
 #include <sys/refcount.h>
 #include <sys/tree.h>
 
 #include <net/radix.h>
 #include <netinet/in.h>
 
 #include <netpfil/pf/pf.h>
 #include <netpfil/pf/pf_altq.h>
 #include <netpfil/pf/pf_mtag.h>
 
 struct pf_addr {
 	union {
 		struct in_addr		v4;
 		struct in6_addr		v6;
 		u_int8_t		addr8[16];
 		u_int16_t		addr16[8];
 		u_int32_t		addr32[4];
 	} pfa;		    /* 128-bit address */
 #define v4	pfa.v4
 #define v6	pfa.v6
 #define addr8	pfa.addr8
 #define addr16	pfa.addr16
 #define addr32	pfa.addr32
 };
 
 #define PFI_AFLAG_NETWORK	0x01
 #define PFI_AFLAG_BROADCAST	0x02
 #define PFI_AFLAG_PEER		0x04
 #define PFI_AFLAG_MODEMASK	0x07
 #define PFI_AFLAG_NOALIAS	0x08
 
 struct pf_addr_wrap {
 	union {
 		struct {
 			struct pf_addr		 addr;
 			struct pf_addr		 mask;
 		}			 a;
 		char			 ifname[IFNAMSIZ];
 		char			 tblname[PF_TABLE_NAME_SIZE];
 	}			 v;
 	union {
 		struct pfi_dynaddr	*dyn;
 		struct pfr_ktable	*tbl;
 		int			 dyncnt;
 		int			 tblcnt;
 	}			 p;
 	u_int8_t		 type;		/* PF_ADDR_* */
 	u_int8_t		 iflags;	/* PFI_AFLAG_* */
 };
 
 #ifdef _KERNEL
 
 struct pfi_dynaddr {
 	TAILQ_ENTRY(pfi_dynaddr)	 entry;
 	struct pf_addr			 pfid_addr4;
 	struct pf_addr			 pfid_mask4;
 	struct pf_addr			 pfid_addr6;
 	struct pf_addr			 pfid_mask6;
 	struct pfr_ktable		*pfid_kt;
 	struct pfi_kif			*pfid_kif;
 	int				 pfid_net;	/* mask or 128 */
 	int				 pfid_acnt4;	/* address count IPv4 */
 	int				 pfid_acnt6;	/* address count IPv6 */
 	sa_family_t			 pfid_af;	/* rule af */
 	u_int8_t			 pfid_iflags;	/* PFI_AFLAG_* */
 };
 
 /*
  * Address manipulation macros
  */
 #define	HTONL(x)	(x) = htonl((__uint32_t)(x))
 #define	HTONS(x)	(x) = htons((__uint16_t)(x))
 #define	NTOHL(x)	(x) = ntohl((__uint32_t)(x))
 #define	NTOHS(x)	(x) = ntohs((__uint16_t)(x))
 
 #define	PF_NAME		"pf"
 
 #define	PF_HASHROW_ASSERT(h)	mtx_assert(&(h)->lock, MA_OWNED)
 #define	PF_HASHROW_LOCK(h)	mtx_lock(&(h)->lock)
 #define	PF_HASHROW_UNLOCK(h)	mtx_unlock(&(h)->lock)
 
 #define	PF_STATE_LOCK(s)						\
 	do {								\
 		struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH(s)];	\
 		PF_HASHROW_LOCK(_ih);					\
 	} while (0)
 
 #define	PF_STATE_UNLOCK(s)						\
 	do {								\
 		struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH((s))];	\
 		PF_HASHROW_UNLOCK(_ih);					\
 	} while (0)
 
 #ifdef INVARIANTS
 #define	PF_STATE_LOCK_ASSERT(s)						\
 	do {								\
 		struct pf_idhash *_ih = &V_pf_idhash[PF_IDHASH(s)];	\
 		PF_HASHROW_ASSERT(_ih);					\
 	} while (0)
 #else /* !INVARIANTS */
 #define	PF_STATE_LOCK_ASSERT(s)		do {} while (0)
 #endif /* INVARIANTS */
 
 extern struct mtx pf_unlnkdrules_mtx;
 #define	PF_UNLNKDRULES_LOCK()	mtx_lock(&pf_unlnkdrules_mtx)
 #define	PF_UNLNKDRULES_UNLOCK()	mtx_unlock(&pf_unlnkdrules_mtx)
 
 extern struct rwlock pf_rules_lock;
 #define	PF_RULES_RLOCK()	rw_rlock(&pf_rules_lock)
 #define	PF_RULES_RUNLOCK()	rw_runlock(&pf_rules_lock)
 #define	PF_RULES_WLOCK()	rw_wlock(&pf_rules_lock)
 #define	PF_RULES_WUNLOCK()	rw_wunlock(&pf_rules_lock)
 #define	PF_RULES_ASSERT()	rw_assert(&pf_rules_lock, RA_LOCKED)
 #define	PF_RULES_RASSERT()	rw_assert(&pf_rules_lock, RA_RLOCKED)
 #define	PF_RULES_WASSERT()	rw_assert(&pf_rules_lock, RA_WLOCKED)
 
 #define	PF_MODVER	1
 #define	PFLOG_MODVER	1
 #define	PFSYNC_MODVER	1
 
 #define	PFLOG_MINVER	1
 #define	PFLOG_PREFVER	PFLOG_MODVER
 #define	PFLOG_MAXVER	1
 #define	PFSYNC_MINVER	1
 #define	PFSYNC_PREFVER	PFSYNC_MODVER
 #define	PFSYNC_MAXVER	1
 
 #ifdef INET
 #ifndef INET6
 #define	PF_INET_ONLY
 #endif /* ! INET6 */
 #endif /* INET */
 
 #ifdef INET6
 #ifndef INET
 #define	PF_INET6_ONLY
 #endif /* ! INET */
 #endif /* INET6 */
 
 #ifdef INET
 #ifdef INET6
 #define	PF_INET_INET6
 #endif /* INET6 */
 #endif /* INET */
 
 #else
 
 #define	PF_INET_INET6
 
 #endif /* _KERNEL */
 
 /* Both IPv4 and IPv6 */
 #ifdef PF_INET_INET6
 
 #define PF_AEQ(a, b, c) \
 	((c == AF_INET && (a)->addr32[0] == (b)->addr32[0]) || \
-	((a)->addr32[3] == (b)->addr32[3] && \
+	(c == AF_INET6 && (a)->addr32[3] == (b)->addr32[3] && \
 	(a)->addr32[2] == (b)->addr32[2] && \
 	(a)->addr32[1] == (b)->addr32[1] && \
 	(a)->addr32[0] == (b)->addr32[0])) \
 
 #define PF_ANEQ(a, b, c) \
-	((c == AF_INET && (a)->addr32[0] != (b)->addr32[0]) || \
-	((a)->addr32[3] != (b)->addr32[3] || \
-	(a)->addr32[2] != (b)->addr32[2] || \
+	((a)->addr32[0] != (b)->addr32[0] || \
 	(a)->addr32[1] != (b)->addr32[1] || \
-	(a)->addr32[0] != (b)->addr32[0])) \
+	(a)->addr32[2] != (b)->addr32[2] || \
+	(a)->addr32[3] != (b)->addr32[3]) \
 
 #define PF_AZERO(a, c) \
 	((c == AF_INET && !(a)->addr32[0]) || \
-	(!(a)->addr32[0] && !(a)->addr32[1] && \
+	(c == AF_INET6 && !(a)->addr32[0] && !(a)->addr32[1] && \
 	!(a)->addr32[2] && !(a)->addr32[3] )) \
 
 #define PF_MATCHA(n, a, m, b, f) \
 	pf_match_addr(n, a, m, b, f)
 
 #define PF_ACPY(a, b, f) \
 	pf_addrcpy(a, b, f)
 
 #define PF_AINC(a, f) \
 	pf_addr_inc(a, f)
 
 #define PF_POOLMASK(a, b, c, d, f) \
 	pf_poolmask(a, b, c, d, f)
 
 #else
 
 /* Just IPv6 */
 
 #ifdef PF_INET6_ONLY
 
 #define PF_AEQ(a, b, c) \
 	((a)->addr32[3] == (b)->addr32[3] && \
 	(a)->addr32[2] == (b)->addr32[2] && \
 	(a)->addr32[1] == (b)->addr32[1] && \
 	(a)->addr32[0] == (b)->addr32[0]) \
 
 #define PF_ANEQ(a, b, c) \
 	((a)->addr32[3] != (b)->addr32[3] || \
 	(a)->addr32[2] != (b)->addr32[2] || \
 	(a)->addr32[1] != (b)->addr32[1] || \
 	(a)->addr32[0] != (b)->addr32[0]) \
 
 #define PF_AZERO(a, c) \
 	(!(a)->addr32[0] && \
 	!(a)->addr32[1] && \
 	!(a)->addr32[2] && \
 	!(a)->addr32[3] ) \
 
 #define PF_MATCHA(n, a, m, b, f) \
 	pf_match_addr(n, a, m, b, f)
 
 #define PF_ACPY(a, b, f) \
 	pf_addrcpy(a, b, f)
 
 #define PF_AINC(a, f) \
 	pf_addr_inc(a, f)
 
 #define PF_POOLMASK(a, b, c, d, f) \
 	pf_poolmask(a, b, c, d, f)
 
 #else
 
 /* Just IPv4 */
 #ifdef PF_INET_ONLY
 
 #define PF_AEQ(a, b, c) \
 	((a)->addr32[0] == (b)->addr32[0])
 
 #define PF_ANEQ(a, b, c) \
 	((a)->addr32[0] != (b)->addr32[0])
 
 #define PF_AZERO(a, c) \
 	(!(a)->addr32[0])
 
 #define PF_MATCHA(n, a, m, b, f) \
 	pf_match_addr(n, a, m, b, f)
 
 #define PF_ACPY(a, b, f) \
 	(a)->v4.s_addr = (b)->v4.s_addr
 
 #define PF_AINC(a, f) \
 	do { \
 		(a)->addr32[0] = htonl(ntohl((a)->addr32[0]) + 1); \
 	} while (0)
 
 #define PF_POOLMASK(a, b, c, d, f) \
 	do { \
 		(a)->addr32[0] = ((b)->addr32[0] & (c)->addr32[0]) | \
 		(((c)->addr32[0] ^ 0xffffffff ) & (d)->addr32[0]); \
 	} while (0)
 
 #endif /* PF_INET_ONLY */
 #endif /* PF_INET6_ONLY */
 #endif /* PF_INET_INET6 */
 
 /*
  * XXX callers not FIB-aware in our version of pf yet.
  * OpenBSD fixed it later it seems, 2010/05/07 13:33:16 claudio.
  */
 #define	PF_MISMATCHAW(aw, x, af, neg, ifp, rtid)			\
 	(								\
 		(((aw)->type == PF_ADDR_NOROUTE &&			\
 		    pf_routable((x), (af), NULL, (rtid))) ||		\
 		(((aw)->type == PF_ADDR_URPFFAILED && (ifp) != NULL &&	\
 		    pf_routable((x), (af), (ifp), (rtid))) ||		\
 		((aw)->type == PF_ADDR_TABLE &&				\
 		    !pfr_match_addr((aw)->p.tbl, (x), (af))) ||		\
 		((aw)->type == PF_ADDR_DYNIFTL &&			\
 		    !pfi_match_addr((aw)->p.dyn, (x), (af))) ||		\
 		((aw)->type == PF_ADDR_RANGE &&				\
 		    !pf_match_addr_range(&(aw)->v.a.addr,		\
 		    &(aw)->v.a.mask, (x), (af))) ||			\
 		((aw)->type == PF_ADDR_ADDRMASK &&			\
 		    !PF_AZERO(&(aw)->v.a.mask, (af)) &&			\
 		    !PF_MATCHA(0, &(aw)->v.a.addr,			\
 		    &(aw)->v.a.mask, (x), (af))))) !=			\
 		(neg)							\
 	)
 
 
 struct pf_rule_uid {
 	uid_t		 uid[2];
 	u_int8_t	 op;
 };
 
 struct pf_rule_gid {
 	uid_t		 gid[2];
 	u_int8_t	 op;
 };
 
 struct pf_rule_addr {
 	struct pf_addr_wrap	 addr;
 	u_int16_t		 port[2];
 	u_int8_t		 neg;
 	u_int8_t		 port_op;
 };
 
 struct pf_pooladdr {
 	struct pf_addr_wrap		 addr;
 	TAILQ_ENTRY(pf_pooladdr)	 entries;
 	char				 ifname[IFNAMSIZ];
 	struct pfi_kif			*kif;
 };
 
 TAILQ_HEAD(pf_palist, pf_pooladdr);
 
 struct pf_poolhashkey {
 	union {
 		u_int8_t		key8[16];
 		u_int16_t		key16[8];
 		u_int32_t		key32[4];
 	} pfk;		    /* 128-bit hash key */
 #define key8	pfk.key8
 #define key16	pfk.key16
 #define key32	pfk.key32
 };
 
 struct pf_pool {
 	struct pf_palist	 list;
 	struct pf_pooladdr	*cur;
 	struct pf_poolhashkey	 key;
 	struct pf_addr		 counter;
 	int			 tblidx;
 	u_int16_t		 proxy_port[2];
 	u_int8_t		 opts;
 };
 
 
 /* A packed Operating System description for fingerprinting */
 typedef u_int32_t pf_osfp_t;
 #define PF_OSFP_ANY	((pf_osfp_t)0)
 #define PF_OSFP_UNKNOWN	((pf_osfp_t)-1)
 #define PF_OSFP_NOMATCH	((pf_osfp_t)-2)
 
 struct pf_osfp_entry {
 	SLIST_ENTRY(pf_osfp_entry) fp_entry;
 	pf_osfp_t		fp_os;
 	int			fp_enflags;
 #define PF_OSFP_EXPANDED	0x001		/* expanded entry */
 #define PF_OSFP_GENERIC		0x002		/* generic signature */
 #define PF_OSFP_NODETAIL	0x004		/* no p0f details */
 #define PF_OSFP_LEN	32
 	char			fp_class_nm[PF_OSFP_LEN];
 	char			fp_version_nm[PF_OSFP_LEN];
 	char			fp_subtype_nm[PF_OSFP_LEN];
 };
 #define PF_OSFP_ENTRY_EQ(a, b) \
     ((a)->fp_os == (b)->fp_os && \
     memcmp((a)->fp_class_nm, (b)->fp_class_nm, PF_OSFP_LEN) == 0 && \
     memcmp((a)->fp_version_nm, (b)->fp_version_nm, PF_OSFP_LEN) == 0 && \
     memcmp((a)->fp_subtype_nm, (b)->fp_subtype_nm, PF_OSFP_LEN) == 0)
 
 /* handle pf_osfp_t packing */
 #define _FP_RESERVED_BIT	1  /* For the special negative #defines */
 #define _FP_UNUSED_BITS		1
 #define _FP_CLASS_BITS		10 /* OS Class (Windows, Linux) */
 #define _FP_VERSION_BITS	10 /* OS version (95, 98, NT, 2.4.54, 3.2) */
 #define _FP_SUBTYPE_BITS	10 /* patch level (NT SP4, SP3, ECN patch) */
 #define PF_OSFP_UNPACK(osfp, class, version, subtype) do { \
 	(class) = ((osfp) >> (_FP_VERSION_BITS+_FP_SUBTYPE_BITS)) & \
 	    ((1 << _FP_CLASS_BITS) - 1); \
 	(version) = ((osfp) >> _FP_SUBTYPE_BITS) & \
 	    ((1 << _FP_VERSION_BITS) - 1);\
 	(subtype) = (osfp) & ((1 << _FP_SUBTYPE_BITS) - 1); \
 } while(0)
 #define PF_OSFP_PACK(osfp, class, version, subtype) do { \
 	(osfp) = ((class) & ((1 << _FP_CLASS_BITS) - 1)) << (_FP_VERSION_BITS \
 	    + _FP_SUBTYPE_BITS); \
 	(osfp) |= ((version) & ((1 << _FP_VERSION_BITS) - 1)) << \
 	    _FP_SUBTYPE_BITS; \
 	(osfp) |= (subtype) & ((1 << _FP_SUBTYPE_BITS) - 1); \
 } while(0)
 
 /* the fingerprint of an OS's TCP SYN packet */
 typedef u_int64_t	pf_tcpopts_t;
 struct pf_os_fingerprint {
 	SLIST_HEAD(pf_osfp_enlist, pf_osfp_entry) fp_oses; /* list of matches */
 	pf_tcpopts_t		fp_tcpopts;	/* packed TCP options */
 	u_int16_t		fp_wsize;	/* TCP window size */
 	u_int16_t		fp_psize;	/* ip->ip_len */
 	u_int16_t		fp_mss;		/* TCP MSS */
 	u_int16_t		fp_flags;
 #define PF_OSFP_WSIZE_MOD	0x0001		/* Window modulus */
 #define PF_OSFP_WSIZE_DC	0x0002		/* Window don't care */
 #define PF_OSFP_WSIZE_MSS	0x0004		/* Window multiple of MSS */
 #define PF_OSFP_WSIZE_MTU	0x0008		/* Window multiple of MTU */
 #define PF_OSFP_PSIZE_MOD	0x0010		/* packet size modulus */
 #define PF_OSFP_PSIZE_DC	0x0020		/* packet size don't care */
 #define PF_OSFP_WSCALE		0x0040		/* TCP window scaling */
 #define PF_OSFP_WSCALE_MOD	0x0080		/* TCP window scale modulus */
 #define PF_OSFP_WSCALE_DC	0x0100		/* TCP window scale dont-care */
 #define PF_OSFP_MSS		0x0200		/* TCP MSS */
 #define PF_OSFP_MSS_MOD		0x0400		/* TCP MSS modulus */
 #define PF_OSFP_MSS_DC		0x0800		/* TCP MSS dont-care */
 #define PF_OSFP_DF		0x1000		/* IPv4 don't fragment bit */
 #define PF_OSFP_TS0		0x2000		/* Zero timestamp */
 #define PF_OSFP_INET6		0x4000		/* IPv6 */
 	u_int8_t		fp_optcnt;	/* TCP option count */
 	u_int8_t		fp_wscale;	/* TCP window scaling */
 	u_int8_t		fp_ttl;		/* IPv4 TTL */
 #define PF_OSFP_MAXTTL_OFFSET	40
 /* TCP options packing */
 #define PF_OSFP_TCPOPT_NOP	0x0		/* TCP NOP option */
 #define PF_OSFP_TCPOPT_WSCALE	0x1		/* TCP window scaling option */
 #define PF_OSFP_TCPOPT_MSS	0x2		/* TCP max segment size opt */
 #define PF_OSFP_TCPOPT_SACK	0x3		/* TCP SACK OK option */
 #define PF_OSFP_TCPOPT_TS	0x4		/* TCP timestamp option */
 #define PF_OSFP_TCPOPT_BITS	3		/* bits used by each option */
 #define PF_OSFP_MAX_OPTS \
     (sizeof(((struct pf_os_fingerprint *)0)->fp_tcpopts) * 8) \
     / PF_OSFP_TCPOPT_BITS
 
 	SLIST_ENTRY(pf_os_fingerprint)	fp_next;
 };
 
 struct pf_osfp_ioctl {
 	struct pf_osfp_entry	fp_os;
 	pf_tcpopts_t		fp_tcpopts;	/* packed TCP options */
 	u_int16_t		fp_wsize;	/* TCP window size */
 	u_int16_t		fp_psize;	/* ip->ip_len */
 	u_int16_t		fp_mss;		/* TCP MSS */
 	u_int16_t		fp_flags;
 	u_int8_t		fp_optcnt;	/* TCP option count */
 	u_int8_t		fp_wscale;	/* TCP window scaling */
 	u_int8_t		fp_ttl;		/* IPv4 TTL */
 
 	int			fp_getnum;	/* DIOCOSFPGET number */
 };
 
 
 union pf_rule_ptr {
 	struct pf_rule		*ptr;
 	u_int32_t		 nr;
 };
 
 #define	PF_ANCHOR_NAME_SIZE	 64
 
 struct pf_rule {
 	struct pf_rule_addr	 src;
 	struct pf_rule_addr	 dst;
 #define PF_SKIP_IFP		0
 #define PF_SKIP_DIR		1
 #define PF_SKIP_AF		2
 #define PF_SKIP_PROTO		3
 #define PF_SKIP_SRC_ADDR	4
 #define PF_SKIP_SRC_PORT	5
 #define PF_SKIP_DST_ADDR	6
 #define PF_SKIP_DST_PORT	7
 #define PF_SKIP_COUNT		8
 	union pf_rule_ptr	 skip[PF_SKIP_COUNT];
 #define PF_RULE_LABEL_SIZE	 64
 	char			 label[PF_RULE_LABEL_SIZE];
 	char			 ifname[IFNAMSIZ];
 	char			 qname[PF_QNAME_SIZE];
 	char			 pqname[PF_QNAME_SIZE];
 #define	PF_TAG_NAME_SIZE	 64
 	char			 tagname[PF_TAG_NAME_SIZE];
 	char			 match_tagname[PF_TAG_NAME_SIZE];
 
 	char			 overload_tblname[PF_TABLE_NAME_SIZE];
 
 	TAILQ_ENTRY(pf_rule)	 entries;
 	struct pf_pool		 rpool;
 
 	u_int64_t		 evaluations;
 	u_int64_t		 packets[2];
 	u_int64_t		 bytes[2];
 
 	struct pfi_kif		*kif;
 	struct pf_anchor	*anchor;
 	struct pfr_ktable	*overload_tbl;
 
 	pf_osfp_t		 os_fingerprint;
 
 	int			 rtableid;
 	u_int32_t		 timeout[PFTM_MAX];
 	u_int32_t		 max_states;
 	u_int32_t		 max_src_nodes;
 	u_int32_t		 max_src_states;
 	u_int32_t		 max_src_conn;
 	struct {
 		u_int32_t		limit;
 		u_int32_t		seconds;
 	}			 max_src_conn_rate;
 	u_int32_t		 qid;
 	u_int32_t		 pqid;
 	u_int32_t		 rt_listid;
 	u_int32_t		 nr;
 	u_int32_t		 prob;
 	uid_t			 cuid;
 	pid_t			 cpid;
 
 	counter_u64_t		 states_cur;
 	counter_u64_t		 states_tot;
 	counter_u64_t		 src_nodes;
 
 	u_int16_t		 return_icmp;
 	u_int16_t		 return_icmp6;
 	u_int16_t		 max_mss;
 	u_int16_t		 tag;
 	u_int16_t		 match_tag;
 	u_int16_t		 spare2;			/* netgraph */
 
 	struct pf_rule_uid	 uid;
 	struct pf_rule_gid	 gid;
 
 	u_int32_t		 rule_flag;
 	u_int8_t		 action;
 	u_int8_t		 direction;
 	u_int8_t		 log;
 	u_int8_t		 logif;
 	u_int8_t		 quick;
 	u_int8_t		 ifnot;
 	u_int8_t		 match_tag_not;
 	u_int8_t		 natpass;
 
 #define PF_STATE_NORMAL		0x1
 #define PF_STATE_MODULATE	0x2
 #define PF_STATE_SYNPROXY	0x3
 	u_int8_t		 keep_state;
 	sa_family_t		 af;
 	u_int8_t		 proto;
 	u_int8_t		 type;
 	u_int8_t		 code;
 	u_int8_t		 flags;
 	u_int8_t		 flagset;
 	u_int8_t		 min_ttl;
 	u_int8_t		 allow_opts;
 	u_int8_t		 rt;
 	u_int8_t		 return_ttl;
 	u_int8_t		 tos;
 	u_int8_t		 set_tos;
 	u_int8_t		 anchor_relative;
 	u_int8_t		 anchor_wildcard;
 
 #define PF_FLUSH		0x01
 #define PF_FLUSH_GLOBAL		0x02
 	u_int8_t		 flush;
 
 	struct {
 		struct pf_addr		addr;
 		u_int16_t		port;
 	}			divert;
 
 	uint64_t		 u_states_cur;
 	uint64_t		 u_states_tot;
 	uint64_t		 u_src_nodes;
 };
 
 /* rule flags */
 #define	PFRULE_DROP		0x0000
 #define	PFRULE_RETURNRST	0x0001
 #define	PFRULE_FRAGMENT		0x0002
 #define	PFRULE_RETURNICMP	0x0004
 #define	PFRULE_RETURN		0x0008
 #define	PFRULE_NOSYNC		0x0010
 #define PFRULE_SRCTRACK		0x0020  /* track source states */
 #define PFRULE_RULESRCTRACK	0x0040  /* per rule */
 #define	PFRULE_REFS		0x0080	/* rule has references */
 
 /* scrub flags */
 #define	PFRULE_NODF		0x0100
 #define	PFRULE_FRAGCROP		0x0200	/* non-buffering frag cache */
 #define	PFRULE_FRAGDROP		0x0400	/* drop funny fragments */
 #define PFRULE_RANDOMID		0x0800
 #define PFRULE_REASSEMBLE_TCP	0x1000
 #define PFRULE_SET_TOS		0x2000
 
 /* rule flags again */
 #define PFRULE_IFBOUND		0x00010000	/* if-bound */
 #define PFRULE_STATESLOPPY	0x00020000	/* sloppy state tracking */
 
 #define PFSTATE_HIWAT		10000	/* default state table size */
 #define PFSTATE_ADAPT_START	6000	/* default adaptive timeout start */
 #define PFSTATE_ADAPT_END	12000	/* default adaptive timeout end */
 
 
 struct pf_threshold {
 	u_int32_t	limit;
 #define	PF_THRESHOLD_MULT	1000
 #define PF_THRESHOLD_MAX	0xffffffff / PF_THRESHOLD_MULT
 	u_int32_t	seconds;
 	u_int32_t	count;
 	u_int32_t	last;
 };
 
 struct pf_src_node {
 	LIST_ENTRY(pf_src_node) entry;
 	struct pf_addr	 addr;
 	struct pf_addr	 raddr;
 	union pf_rule_ptr rule;
 	struct pfi_kif	*kif;
 	u_int64_t	 bytes[2];
 	u_int64_t	 packets[2];
 	u_int32_t	 states;
 	u_int32_t	 conn;
 	struct pf_threshold	conn_rate;
 	u_int32_t	 creation;
 	u_int32_t	 expire;
 	sa_family_t	 af;
 	u_int8_t	 ruletype;
 };
 
 #define PFSNODE_HIWAT		10000	/* default source node table size */
 
 struct pf_state_scrub {
 	struct timeval	pfss_last;	/* time received last packet	*/
 	u_int32_t	pfss_tsecr;	/* last echoed timestamp	*/
 	u_int32_t	pfss_tsval;	/* largest timestamp		*/
 	u_int32_t	pfss_tsval0;	/* original timestamp		*/
 	u_int16_t	pfss_flags;
 #define PFSS_TIMESTAMP	0x0001		/* modulate timestamp		*/
 #define PFSS_PAWS	0x0010		/* stricter PAWS checks		*/
 #define PFSS_PAWS_IDLED	0x0020		/* was idle too long.  no PAWS	*/
 #define PFSS_DATA_TS	0x0040		/* timestamp on data packets	*/
 #define PFSS_DATA_NOTS	0x0080		/* no timestamp on data packets	*/
 	u_int8_t	pfss_ttl;	/* stashed TTL			*/
 	u_int8_t	pad;
 	u_int32_t	pfss_ts_mod;	/* timestamp modulation		*/
 };
 
 struct pf_state_host {
 	struct pf_addr	addr;
 	u_int16_t	port;
 	u_int16_t	pad;
 };
 
 struct pf_state_peer {
 	struct pf_state_scrub	*scrub;	/* state is scrubbed		*/
 	u_int32_t	seqlo;		/* Max sequence number sent	*/
 	u_int32_t	seqhi;		/* Max the other end ACKd + win	*/
 	u_int32_t	seqdiff;	/* Sequence number modulator	*/
 	u_int16_t	max_win;	/* largest window (pre scaling)	*/
 	u_int16_t	mss;		/* Maximum segment size option	*/
 	u_int8_t	state;		/* active state level		*/
 	u_int8_t	wscale;		/* window scaling factor	*/
 	u_int8_t	tcp_est;	/* Did we reach TCPS_ESTABLISHED */
 	u_int8_t	pad[1];
 };
 
 /* Keep synced with struct pf_state_key. */
 struct pf_state_key_cmp {
 	struct pf_addr	 addr[2];
 	u_int16_t	 port[2];
 	sa_family_t	 af;
 	u_int8_t	 proto;
 	u_int8_t	 pad[2];
 };
 
 struct pf_state_key {
 	struct pf_addr	 addr[2];
 	u_int16_t	 port[2];
 	sa_family_t	 af;
 	u_int8_t	 proto;
 	u_int8_t	 pad[2];
 
 	LIST_ENTRY(pf_state_key) entry;
 	TAILQ_HEAD(, pf_state)	 states[2];
 };
 
 /* Keep synced with struct pf_state. */
 struct pf_state_cmp {
 	u_int64_t		 id;
 	u_int32_t		 creatorid;
 	u_int8_t		 direction;
 	u_int8_t		 pad[3];
 };
 
 struct pf_state {
 	u_int64_t		 id;
 	u_int32_t		 creatorid;
 	u_int8_t		 direction;
 	u_int8_t		 pad[3];
 
 	u_int			 refs;
 	TAILQ_ENTRY(pf_state)	 sync_list;
 	TAILQ_ENTRY(pf_state)	 key_list[2];
 	LIST_ENTRY(pf_state)	 entry;
 	struct pf_state_peer	 src;
 	struct pf_state_peer	 dst;
 	union pf_rule_ptr	 rule;
 	union pf_rule_ptr	 anchor;
 	union pf_rule_ptr	 nat_rule;
 	struct pf_addr		 rt_addr;
 	struct pf_state_key	*key[2];	/* addresses stack and wire  */
 	struct pfi_kif		*kif;
 	struct pfi_kif		*rt_kif;
 	struct pf_src_node	*src_node;
 	struct pf_src_node	*nat_src_node;
 	u_int64_t		 packets[2];
 	u_int64_t		 bytes[2];
 	u_int32_t		 creation;
 	u_int32_t	 	 expire;
 	u_int32_t		 pfsync_time;
 	u_int16_t		 tag;
 	u_int8_t		 log;
 	u_int8_t		 state_flags;
 #define	PFSTATE_ALLOWOPTS	0x01
 #define	PFSTATE_SLOPPY		0x02
 /*  was	PFSTATE_PFLOW		0x04 */
 #define	PFSTATE_NOSYNC		0x08
 #define	PFSTATE_ACK		0x10
 	u_int8_t		 timeout;
 	u_int8_t		 sync_state; /* PFSYNC_S_x */
 
 	/* XXX */
 	u_int8_t		 sync_updates;
 	u_int8_t		_tail[3];
 };
 
 /*
  * Unified state structures for pulling states out of the kernel
  * used by pfsync(4) and the pf(4) ioctl.
  */
 struct pfsync_state_scrub {
 	u_int16_t	pfss_flags;
 	u_int8_t	pfss_ttl;	/* stashed TTL		*/
 #define PFSYNC_SCRUB_FLAG_VALID		0x01
 	u_int8_t	scrub_flag;
 	u_int32_t	pfss_ts_mod;	/* timestamp modulation	*/
 } __packed;
 
 struct pfsync_state_peer {
 	struct pfsync_state_scrub scrub;	/* state is scrubbed	*/
 	u_int32_t	seqlo;		/* Max sequence number sent	*/
 	u_int32_t	seqhi;		/* Max the other end ACKd + win	*/
 	u_int32_t	seqdiff;	/* Sequence number modulator	*/
 	u_int16_t	max_win;	/* largest window (pre scaling)	*/
 	u_int16_t	mss;		/* Maximum segment size option	*/
 	u_int8_t	state;		/* active state level		*/
 	u_int8_t	wscale;		/* window scaling factor	*/
 	u_int8_t	pad[6];
 } __packed;
 
 struct pfsync_state_key {
 	struct pf_addr	 addr[2];
 	u_int16_t	 port[2];
 };
 
 struct pfsync_state {
 	u_int64_t	 id;
 	char		 ifname[IFNAMSIZ];
 	struct pfsync_state_key	key[2];
 	struct pfsync_state_peer src;
 	struct pfsync_state_peer dst;
 	struct pf_addr	 rt_addr;
 	u_int32_t	 rule;
 	u_int32_t	 anchor;
 	u_int32_t	 nat_rule;
 	u_int32_t	 creation;
 	u_int32_t	 expire;
 	u_int32_t	 packets[2][2];
 	u_int32_t	 bytes[2][2];
 	u_int32_t	 creatorid;
 	sa_family_t	 af;
 	u_int8_t	 proto;
 	u_int8_t	 direction;
 	u_int8_t	 __spare[2];
 	u_int8_t	 log;
 	u_int8_t	 state_flags;
 	u_int8_t	 timeout;
 	u_int8_t	 sync_flags;
 	u_int8_t	 updates;
 } __packed;
 
 #ifdef _KERNEL
 /* pfsync */
 typedef int		pfsync_state_import_t(struct pfsync_state *, u_int8_t);
 typedef	void		pfsync_insert_state_t(struct pf_state *);
 typedef	void		pfsync_update_state_t(struct pf_state *);
 typedef	void		pfsync_delete_state_t(struct pf_state *);
 typedef void		pfsync_clear_states_t(u_int32_t, const char *);
 typedef int		pfsync_defer_t(struct pf_state *, struct mbuf *);
 
 extern pfsync_state_import_t	*pfsync_state_import_ptr;
 extern pfsync_insert_state_t	*pfsync_insert_state_ptr;
 extern pfsync_update_state_t	*pfsync_update_state_ptr;
 extern pfsync_delete_state_t	*pfsync_delete_state_ptr;
 extern pfsync_clear_states_t	*pfsync_clear_states_ptr;
 extern pfsync_defer_t		*pfsync_defer_ptr;
 
 void			pfsync_state_export(struct pfsync_state *,
 			    struct pf_state *);
 
 /* pflog */
 struct pf_ruleset;
 struct pf_pdesc;
 typedef int pflog_packet_t(struct pfi_kif *, struct mbuf *, sa_family_t,
     u_int8_t, u_int8_t, struct pf_rule *, struct pf_rule *,
     struct pf_ruleset *, struct pf_pdesc *, int);
 extern pflog_packet_t		*pflog_packet_ptr;
 
 #define	V_pf_end_threads	VNET(pf_end_threads)
 #endif /* _KERNEL */
 
 #define	PFSYNC_FLAG_SRCNODE	0x04
 #define	PFSYNC_FLAG_NATSRCNODE	0x08
 
 /* for copies to/from network byte order */
 /* ioctl interface also uses network byte order */
 #define pf_state_peer_hton(s,d) do {		\
 	(d)->seqlo = htonl((s)->seqlo);		\
 	(d)->seqhi = htonl((s)->seqhi);		\
 	(d)->seqdiff = htonl((s)->seqdiff);	\
 	(d)->max_win = htons((s)->max_win);	\
 	(d)->mss = htons((s)->mss);		\
 	(d)->state = (s)->state;		\
 	(d)->wscale = (s)->wscale;		\
 	if ((s)->scrub) {						\
 		(d)->scrub.pfss_flags = 				\
 		    htons((s)->scrub->pfss_flags & PFSS_TIMESTAMP);	\
 		(d)->scrub.pfss_ttl = (s)->scrub->pfss_ttl;		\
 		(d)->scrub.pfss_ts_mod = htonl((s)->scrub->pfss_ts_mod);\
 		(d)->scrub.scrub_flag = PFSYNC_SCRUB_FLAG_VALID;	\
 	}								\
 } while (0)
 
 #define pf_state_peer_ntoh(s,d) do {		\
 	(d)->seqlo = ntohl((s)->seqlo);		\
 	(d)->seqhi = ntohl((s)->seqhi);		\
 	(d)->seqdiff = ntohl((s)->seqdiff);	\
 	(d)->max_win = ntohs((s)->max_win);	\
 	(d)->mss = ntohs((s)->mss);		\
 	(d)->state = (s)->state;		\
 	(d)->wscale = (s)->wscale;		\
 	if ((s)->scrub.scrub_flag == PFSYNC_SCRUB_FLAG_VALID && 	\
 	    (d)->scrub != NULL) {					\
 		(d)->scrub->pfss_flags =				\
 		    ntohs((s)->scrub.pfss_flags) & PFSS_TIMESTAMP;	\
 		(d)->scrub->pfss_ttl = (s)->scrub.pfss_ttl;		\
 		(d)->scrub->pfss_ts_mod = ntohl((s)->scrub.pfss_ts_mod);\
 	}								\
 } while (0)
 
 #define pf_state_counter_hton(s,d) do {				\
 	d[0] = htonl((s>>32)&0xffffffff);			\
 	d[1] = htonl(s&0xffffffff);				\
 } while (0)
 
 #define pf_state_counter_from_pfsync(s)				\
 	(((u_int64_t)(s[0])<<32) | (u_int64_t)(s[1]))
 
 #define pf_state_counter_ntoh(s,d) do {				\
 	d = ntohl(s[0]);					\
 	d = d<<32;						\
 	d += ntohl(s[1]);					\
 } while (0)
 
 TAILQ_HEAD(pf_rulequeue, pf_rule);
 
 struct pf_anchor;
 
 struct pf_ruleset {
 	struct {
 		struct pf_rulequeue	 queues[2];
 		struct {
 			struct pf_rulequeue	*ptr;
 			struct pf_rule		**ptr_array;
 			u_int32_t		 rcount;
 			u_int32_t		 ticket;
 			int			 open;
 		}			 active, inactive;
 	}			 rules[PF_RULESET_MAX];
 	struct pf_anchor	*anchor;
 	u_int32_t		 tticket;
 	int			 tables;
 	int			 topen;
 };
 
 RB_HEAD(pf_anchor_global, pf_anchor);
 RB_HEAD(pf_anchor_node, pf_anchor);
 struct pf_anchor {
 	RB_ENTRY(pf_anchor)	 entry_global;
 	RB_ENTRY(pf_anchor)	 entry_node;
 	struct pf_anchor	*parent;
 	struct pf_anchor_node	 children;
 	char			 name[PF_ANCHOR_NAME_SIZE];
 	char			 path[MAXPATHLEN];
 	struct pf_ruleset	 ruleset;
 	int			 refcnt;	/* anchor rules */
 	int			 match;	/* XXX: used for pfctl black magic */
 };
 RB_PROTOTYPE(pf_anchor_global, pf_anchor, entry_global, pf_anchor_compare);
 RB_PROTOTYPE(pf_anchor_node, pf_anchor, entry_node, pf_anchor_compare);
 
 #define PF_RESERVED_ANCHOR	"_pf"
 
 #define PFR_TFLAG_PERSIST	0x00000001
 #define PFR_TFLAG_CONST		0x00000002
 #define PFR_TFLAG_ACTIVE	0x00000004
 #define PFR_TFLAG_INACTIVE	0x00000008
 #define PFR_TFLAG_REFERENCED	0x00000010
 #define PFR_TFLAG_REFDANCHOR	0x00000020
 #define PFR_TFLAG_COUNTERS	0x00000040
 /* Adjust masks below when adding flags. */
 #define PFR_TFLAG_USRMASK	(PFR_TFLAG_PERSIST	| \
 				 PFR_TFLAG_CONST	| \
 				 PFR_TFLAG_COUNTERS)
 #define PFR_TFLAG_SETMASK	(PFR_TFLAG_ACTIVE	| \
 				 PFR_TFLAG_INACTIVE	| \
 				 PFR_TFLAG_REFERENCED	| \
 				 PFR_TFLAG_REFDANCHOR)
 #define PFR_TFLAG_ALLMASK	(PFR_TFLAG_PERSIST	| \
 				 PFR_TFLAG_CONST	| \
 				 PFR_TFLAG_ACTIVE	| \
 				 PFR_TFLAG_INACTIVE	| \
 				 PFR_TFLAG_REFERENCED	| \
 				 PFR_TFLAG_REFDANCHOR	| \
 				 PFR_TFLAG_COUNTERS)
 
 struct pf_anchor_stackframe;
 
 struct pfr_table {
 	char			 pfrt_anchor[MAXPATHLEN];
 	char			 pfrt_name[PF_TABLE_NAME_SIZE];
 	u_int32_t		 pfrt_flags;
 	u_int8_t		 pfrt_fback;
 };
 
 enum { PFR_FB_NONE, PFR_FB_MATCH, PFR_FB_ADDED, PFR_FB_DELETED,
 	PFR_FB_CHANGED, PFR_FB_CLEARED, PFR_FB_DUPLICATE,
 	PFR_FB_NOTMATCH, PFR_FB_CONFLICT, PFR_FB_NOCOUNT, PFR_FB_MAX };
 
 struct pfr_addr {
 	union {
 		struct in_addr	 _pfra_ip4addr;
 		struct in6_addr	 _pfra_ip6addr;
 	}		 pfra_u;
 	u_int8_t	 pfra_af;
 	u_int8_t	 pfra_net;
 	u_int8_t	 pfra_not;
 	u_int8_t	 pfra_fback;
 };
 #define	pfra_ip4addr	pfra_u._pfra_ip4addr
 #define	pfra_ip6addr	pfra_u._pfra_ip6addr
 
 enum { PFR_DIR_IN, PFR_DIR_OUT, PFR_DIR_MAX };
 enum { PFR_OP_BLOCK, PFR_OP_PASS, PFR_OP_ADDR_MAX, PFR_OP_TABLE_MAX };
 #define PFR_OP_XPASS	PFR_OP_ADDR_MAX
 
 struct pfr_astats {
 	struct pfr_addr	 pfras_a;
 	u_int64_t	 pfras_packets[PFR_DIR_MAX][PFR_OP_ADDR_MAX];
 	u_int64_t	 pfras_bytes[PFR_DIR_MAX][PFR_OP_ADDR_MAX];
 	long		 pfras_tzero;
 };
 
 enum { PFR_REFCNT_RULE, PFR_REFCNT_ANCHOR, PFR_REFCNT_MAX };
 
 struct pfr_tstats {
 	struct pfr_table pfrts_t;
 	u_int64_t	 pfrts_packets[PFR_DIR_MAX][PFR_OP_TABLE_MAX];
 	u_int64_t	 pfrts_bytes[PFR_DIR_MAX][PFR_OP_TABLE_MAX];
 	u_int64_t	 pfrts_match;
 	u_int64_t	 pfrts_nomatch;
 	long		 pfrts_tzero;
 	int		 pfrts_cnt;
 	int		 pfrts_refcnt[PFR_REFCNT_MAX];
 };
 #define	pfrts_name	pfrts_t.pfrt_name
 #define pfrts_flags	pfrts_t.pfrt_flags
 
 #ifndef _SOCKADDR_UNION_DEFINED
 #define	_SOCKADDR_UNION_DEFINED
 union sockaddr_union {
 	struct sockaddr		sa;
 	struct sockaddr_in	sin;
 	struct sockaddr_in6	sin6;
 };
 #endif /* _SOCKADDR_UNION_DEFINED */
 
 struct pfr_kcounters {
 	u_int64_t		 pfrkc_packets[PFR_DIR_MAX][PFR_OP_ADDR_MAX];
 	u_int64_t		 pfrkc_bytes[PFR_DIR_MAX][PFR_OP_ADDR_MAX];
 };
 
 SLIST_HEAD(pfr_kentryworkq, pfr_kentry);
 struct pfr_kentry {
 	struct radix_node	 pfrke_node[2];
 	union sockaddr_union	 pfrke_sa;
 	SLIST_ENTRY(pfr_kentry)	 pfrke_workq;
 	struct pfr_kcounters	*pfrke_counters;
 	long			 pfrke_tzero;
 	u_int8_t		 pfrke_af;
 	u_int8_t		 pfrke_net;
 	u_int8_t		 pfrke_not;
 	u_int8_t		 pfrke_mark;
 };
 
 SLIST_HEAD(pfr_ktableworkq, pfr_ktable);
 RB_HEAD(pfr_ktablehead, pfr_ktable);
 struct pfr_ktable {
 	struct pfr_tstats	 pfrkt_ts;
 	RB_ENTRY(pfr_ktable)	 pfrkt_tree;
 	SLIST_ENTRY(pfr_ktable)	 pfrkt_workq;
 	struct radix_node_head	*pfrkt_ip4;
 	struct radix_node_head	*pfrkt_ip6;
 	struct pfr_ktable	*pfrkt_shadow;
 	struct pfr_ktable	*pfrkt_root;
 	struct pf_ruleset	*pfrkt_rs;
 	long			 pfrkt_larg;
 	int			 pfrkt_nflags;
 };
 #define pfrkt_t		pfrkt_ts.pfrts_t
 #define pfrkt_name	pfrkt_t.pfrt_name
 #define pfrkt_anchor	pfrkt_t.pfrt_anchor
 #define pfrkt_ruleset	pfrkt_t.pfrt_ruleset
 #define pfrkt_flags	pfrkt_t.pfrt_flags
 #define pfrkt_cnt	pfrkt_ts.pfrts_cnt
 #define pfrkt_refcnt	pfrkt_ts.pfrts_refcnt
 #define pfrkt_packets	pfrkt_ts.pfrts_packets
 #define pfrkt_bytes	pfrkt_ts.pfrts_bytes
 #define pfrkt_match	pfrkt_ts.pfrts_match
 #define pfrkt_nomatch	pfrkt_ts.pfrts_nomatch
 #define pfrkt_tzero	pfrkt_ts.pfrts_tzero
 
 /* keep synced with pfi_kif, used in RB_FIND */
 struct pfi_kif_cmp {
 	char				 pfik_name[IFNAMSIZ];
 };
 
 struct pfi_kif {
 	char				 pfik_name[IFNAMSIZ];
 	union {
 		RB_ENTRY(pfi_kif)	 _pfik_tree;
 		LIST_ENTRY(pfi_kif)	 _pfik_list;
 	} _pfik_glue;
 #define	pfik_tree	_pfik_glue._pfik_tree
 #define	pfik_list	_pfik_glue._pfik_list
 	u_int64_t			 pfik_packets[2][2][2];
 	u_int64_t			 pfik_bytes[2][2][2];
 	u_int32_t			 pfik_tzero;
 	u_int				 pfik_flags;
 	struct ifnet			*pfik_ifp;
 	struct ifg_group		*pfik_group;
 	u_int				 pfik_rulerefs;
 	TAILQ_HEAD(, pfi_dynaddr)	 pfik_dynaddrs;
 };
 
 #define	PFI_IFLAG_REFS		0x0001	/* has state references */
 #define PFI_IFLAG_SKIP		0x0100	/* skip filtering on interface */
 
 struct pf_pdesc {
 	struct {
 		int	 done;
 		uid_t	 uid;
 		gid_t	 gid;
 	}		 lookup;
 	u_int64_t	 tot_len;	/* Make Mickey money */
 	union {
 		struct tcphdr		*tcp;
 		struct udphdr		*udp;
 		struct icmp		*icmp;
 #ifdef INET6
 		struct icmp6_hdr	*icmp6;
 #endif /* INET6 */
 		void			*any;
 	} hdr;
 
 	struct pf_rule	*nat_rule;	/* nat/rdr rule applied to packet */
 	struct pf_addr	*src;		/* src address */
 	struct pf_addr	*dst;		/* dst address */
 	u_int16_t *sport;
 	u_int16_t *dport;
 	struct pf_mtag	*pf_mtag;
 
 	u_int32_t	 p_len;		/* total length of payload */
 
 	u_int16_t	*ip_sum;
 	u_int16_t	*proto_sum;
 	u_int16_t	 flags;		/* Let SCRUB trigger behavior in
 					 * state code. Easier than tags */
 #define PFDESC_TCP_NORM	0x0001		/* TCP shall be statefully scrubbed */
 #define PFDESC_IP_REAS	0x0002		/* IP frags would've been reassembled */
 	sa_family_t	 af;
 	u_int8_t	 proto;
 	u_int8_t	 tos;
 	u_int8_t	 dir;		/* direction */
 	u_int8_t	 sidx;		/* key index for source */
 	u_int8_t	 didx;		/* key index for destination */
 };
 
 /* flags for RDR options */
 #define PF_DPORT_RANGE	0x01		/* Dest port uses range */
 #define PF_RPORT_RANGE	0x02		/* RDR'ed port uses range */
 
 /* UDP state enumeration */
 #define PFUDPS_NO_TRAFFIC	0
 #define PFUDPS_SINGLE		1
 #define PFUDPS_MULTIPLE		2
 
 #define PFUDPS_NSTATES		3	/* number of state levels */
 
 #define PFUDPS_NAMES { \
 	"NO_TRAFFIC", \
 	"SINGLE", \
 	"MULTIPLE", \
 	NULL \
 }
 
 /* Other protocol state enumeration */
 #define PFOTHERS_NO_TRAFFIC	0
 #define PFOTHERS_SINGLE		1
 #define PFOTHERS_MULTIPLE	2
 
 #define PFOTHERS_NSTATES	3	/* number of state levels */
 
 #define PFOTHERS_NAMES { \
 	"NO_TRAFFIC", \
 	"SINGLE", \
 	"MULTIPLE", \
 	NULL \
 }
 
 #define ACTION_SET(a, x) \
 	do { \
 		if ((a) != NULL) \
 			*(a) = (x); \
 	} while (0)
 
 #define REASON_SET(a, x) \
 	do { \
 		if ((a) != NULL) \
 			*(a) = (x); \
 		if (x < PFRES_MAX) \
 			counter_u64_add(V_pf_status.counters[x], 1); \
 	} while (0)
 
 struct pf_kstatus {
 	counter_u64_t	counters[PFRES_MAX]; /* reason for passing/dropping */
 	counter_u64_t	lcounters[LCNT_MAX]; /* limit counters */
 	counter_u64_t	fcounters[FCNT_MAX]; /* state operation counters */
 	counter_u64_t	scounters[SCNT_MAX]; /* src_node operation counters */
 	uint32_t	states;
 	uint32_t	src_nodes;
 	uint32_t	running;
 	uint32_t	since;
 	uint32_t	debug;
 	uint32_t	hostid;
 	char		ifname[IFNAMSIZ];
 	uint8_t		pf_chksum[PF_MD5_DIGEST_LENGTH];
 };
 
 struct pf_divert {
 	union {
 		struct in_addr	ipv4;
 		struct in6_addr	ipv6;
 	}		addr;
 	u_int16_t	port;
 };
 
 #define PFFRAG_FRENT_HIWAT	5000	/* Number of fragment entries */
 #define PFR_KENTRY_HIWAT	200000	/* Number of table entries */
 
 /*
  * ioctl parameter structures
  */
 
 struct pfioc_pooladdr {
 	u_int32_t		 action;
 	u_int32_t		 ticket;
 	u_int32_t		 nr;
 	u_int32_t		 r_num;
 	u_int8_t		 r_action;
 	u_int8_t		 r_last;
 	u_int8_t		 af;
 	char			 anchor[MAXPATHLEN];
 	struct pf_pooladdr	 addr;
 };
 
 struct pfioc_rule {
 	u_int32_t	 action;
 	u_int32_t	 ticket;
 	u_int32_t	 pool_ticket;
 	u_int32_t	 nr;
 	char		 anchor[MAXPATHLEN];
 	char		 anchor_call[MAXPATHLEN];
 	struct pf_rule	 rule;
 };
 
 struct pfioc_natlook {
 	struct pf_addr	 saddr;
 	struct pf_addr	 daddr;
 	struct pf_addr	 rsaddr;
 	struct pf_addr	 rdaddr;
 	u_int16_t	 sport;
 	u_int16_t	 dport;
 	u_int16_t	 rsport;
 	u_int16_t	 rdport;
 	sa_family_t	 af;
 	u_int8_t	 proto;
 	u_int8_t	 direction;
 };
 
 struct pfioc_state {
 	struct pfsync_state	state;
 };
 
 struct pfioc_src_node_kill {
 	sa_family_t psnk_af;
 	struct pf_rule_addr psnk_src;
 	struct pf_rule_addr psnk_dst;
 	u_int		    psnk_killed;
 };
 
 struct pfioc_state_kill {
 	struct pf_state_cmp	psk_pfcmp;
 	sa_family_t		psk_af;
 	int			psk_proto;
 	struct pf_rule_addr	psk_src;
 	struct pf_rule_addr	psk_dst;
 	char			psk_ifname[IFNAMSIZ];
 	char			psk_label[PF_RULE_LABEL_SIZE];
 	u_int			psk_killed;
 };
 
 struct pfioc_states {
 	int	ps_len;
 	union {
 		caddr_t			 psu_buf;
 		struct pfsync_state	*psu_states;
 	} ps_u;
 #define ps_buf		ps_u.psu_buf
 #define ps_states	ps_u.psu_states
 };
 
 struct pfioc_src_nodes {
 	int	psn_len;
 	union {
 		caddr_t		 psu_buf;
 		struct pf_src_node	*psu_src_nodes;
 	} psn_u;
 #define psn_buf		psn_u.psu_buf
 #define psn_src_nodes	psn_u.psu_src_nodes
 };
 
 struct pfioc_if {
 	char		 ifname[IFNAMSIZ];
 };
 
 struct pfioc_tm {
 	int		 timeout;
 	int		 seconds;
 };
 
 struct pfioc_limit {
 	int		 index;
 	unsigned	 limit;
 };
 
 struct pfioc_altq {
 	u_int32_t	 action;
 	u_int32_t	 ticket;
 	u_int32_t	 nr;
 	struct pf_altq	 altq;
 };
 
 struct pfioc_qstats {
 	u_int32_t	 ticket;
 	u_int32_t	 nr;
 	void		*buf;
 	int		 nbytes;
 	u_int8_t	 scheduler;
 };
 
 struct pfioc_ruleset {
 	u_int32_t	 nr;
 	char		 path[MAXPATHLEN];
 	char		 name[PF_ANCHOR_NAME_SIZE];
 };
 
 #define PF_RULESET_ALTQ		(PF_RULESET_MAX)
 #define PF_RULESET_TABLE	(PF_RULESET_MAX+1)
 struct pfioc_trans {
 	int		 size;	/* number of elements */
 	int		 esize; /* size of each element in bytes */
 	struct pfioc_trans_e {
 		int		rs_num;
 		char		anchor[MAXPATHLEN];
 		u_int32_t	ticket;
 	}		*array;
 };
 
 #define PFR_FLAG_ATOMIC		0x00000001	/* unused */
 #define PFR_FLAG_DUMMY		0x00000002
 #define PFR_FLAG_FEEDBACK	0x00000004
 #define PFR_FLAG_CLSTATS	0x00000008
 #define PFR_FLAG_ADDRSTOO	0x00000010
 #define PFR_FLAG_REPLACE	0x00000020
 #define PFR_FLAG_ALLRSETS	0x00000040
 #define PFR_FLAG_ALLMASK	0x0000007F
 #ifdef _KERNEL
 #define PFR_FLAG_USERIOCTL	0x10000000
 #endif
 
 struct pfioc_table {
 	struct pfr_table	 pfrio_table;
 	void			*pfrio_buffer;
 	int			 pfrio_esize;
 	int			 pfrio_size;
 	int			 pfrio_size2;
 	int			 pfrio_nadd;
 	int			 pfrio_ndel;
 	int			 pfrio_nchange;
 	int			 pfrio_flags;
 	u_int32_t		 pfrio_ticket;
 };
 #define	pfrio_exists	pfrio_nadd
 #define	pfrio_nzero	pfrio_nadd
 #define	pfrio_nmatch	pfrio_nadd
 #define pfrio_naddr	pfrio_size2
 #define pfrio_setflag	pfrio_size2
 #define pfrio_clrflag	pfrio_nadd
 
 struct pfioc_iface {
 	char	 pfiio_name[IFNAMSIZ];
 	void	*pfiio_buffer;
 	int	 pfiio_esize;
 	int	 pfiio_size;
 	int	 pfiio_nzero;
 	int	 pfiio_flags;
 };
 
 
 /*
  * ioctl operations
  */
 
 #define DIOCSTART	_IO  ('D',  1)
 #define DIOCSTOP	_IO  ('D',  2)
 #define DIOCADDRULE	_IOWR('D',  4, struct pfioc_rule)
 #define DIOCGETRULES	_IOWR('D',  6, struct pfioc_rule)
 #define DIOCGETRULE	_IOWR('D',  7, struct pfioc_rule)
 /* XXX cut 8 - 17 */
 #define DIOCCLRSTATES	_IOWR('D', 18, struct pfioc_state_kill)
 #define DIOCGETSTATE	_IOWR('D', 19, struct pfioc_state)
 #define DIOCSETSTATUSIF _IOWR('D', 20, struct pfioc_if)
 #define DIOCGETSTATUS	_IOWR('D', 21, struct pf_status)
 #define DIOCCLRSTATUS	_IO  ('D', 22)
 #define DIOCNATLOOK	_IOWR('D', 23, struct pfioc_natlook)
 #define DIOCSETDEBUG	_IOWR('D', 24, u_int32_t)
 #define DIOCGETSTATES	_IOWR('D', 25, struct pfioc_states)
 #define DIOCCHANGERULE	_IOWR('D', 26, struct pfioc_rule)
 /* XXX cut 26 - 28 */
 #define DIOCSETTIMEOUT	_IOWR('D', 29, struct pfioc_tm)
 #define DIOCGETTIMEOUT	_IOWR('D', 30, struct pfioc_tm)
 #define DIOCADDSTATE	_IOWR('D', 37, struct pfioc_state)
 #define DIOCCLRRULECTRS	_IO  ('D', 38)
 #define DIOCGETLIMIT	_IOWR('D', 39, struct pfioc_limit)
 #define DIOCSETLIMIT	_IOWR('D', 40, struct pfioc_limit)
 #define DIOCKILLSTATES	_IOWR('D', 41, struct pfioc_state_kill)
 #define DIOCSTARTALTQ	_IO  ('D', 42)
 #define DIOCSTOPALTQ	_IO  ('D', 43)
 #define DIOCADDALTQ	_IOWR('D', 45, struct pfioc_altq)
 #define DIOCGETALTQS	_IOWR('D', 47, struct pfioc_altq)
 #define DIOCGETALTQ	_IOWR('D', 48, struct pfioc_altq)
 #define DIOCCHANGEALTQ	_IOWR('D', 49, struct pfioc_altq)
 #define DIOCGETQSTATS	_IOWR('D', 50, struct pfioc_qstats)
 #define DIOCBEGINADDRS	_IOWR('D', 51, struct pfioc_pooladdr)
 #define DIOCADDADDR	_IOWR('D', 52, struct pfioc_pooladdr)
 #define DIOCGETADDRS	_IOWR('D', 53, struct pfioc_pooladdr)
 #define DIOCGETADDR	_IOWR('D', 54, struct pfioc_pooladdr)
 #define DIOCCHANGEADDR	_IOWR('D', 55, struct pfioc_pooladdr)
 /* XXX cut 55 - 57 */
 #define	DIOCGETRULESETS	_IOWR('D', 58, struct pfioc_ruleset)
 #define	DIOCGETRULESET	_IOWR('D', 59, struct pfioc_ruleset)
 #define	DIOCRCLRTABLES	_IOWR('D', 60, struct pfioc_table)
 #define	DIOCRADDTABLES	_IOWR('D', 61, struct pfioc_table)
 #define	DIOCRDELTABLES	_IOWR('D', 62, struct pfioc_table)
 #define	DIOCRGETTABLES	_IOWR('D', 63, struct pfioc_table)
 #define	DIOCRGETTSTATS	_IOWR('D', 64, struct pfioc_table)
 #define DIOCRCLRTSTATS	_IOWR('D', 65, struct pfioc_table)
 #define	DIOCRCLRADDRS	_IOWR('D', 66, struct pfioc_table)
 #define	DIOCRADDADDRS	_IOWR('D', 67, struct pfioc_table)
 #define	DIOCRDELADDRS	_IOWR('D', 68, struct pfioc_table)
 #define	DIOCRSETADDRS	_IOWR('D', 69, struct pfioc_table)
 #define	DIOCRGETADDRS	_IOWR('D', 70, struct pfioc_table)
 #define	DIOCRGETASTATS	_IOWR('D', 71, struct pfioc_table)
 #define	DIOCRCLRASTATS	_IOWR('D', 72, struct pfioc_table)
 #define	DIOCRTSTADDRS	_IOWR('D', 73, struct pfioc_table)
 #define	DIOCRSETTFLAGS	_IOWR('D', 74, struct pfioc_table)
 #define	DIOCRINADEFINE	_IOWR('D', 77, struct pfioc_table)
 #define	DIOCOSFPFLUSH	_IO('D', 78)
 #define	DIOCOSFPADD	_IOWR('D', 79, struct pf_osfp_ioctl)
 #define	DIOCOSFPGET	_IOWR('D', 80, struct pf_osfp_ioctl)
 #define	DIOCXBEGIN	_IOWR('D', 81, struct pfioc_trans)
 #define	DIOCXCOMMIT	_IOWR('D', 82, struct pfioc_trans)
 #define	DIOCXROLLBACK	_IOWR('D', 83, struct pfioc_trans)
 #define	DIOCGETSRCNODES	_IOWR('D', 84, struct pfioc_src_nodes)
 #define	DIOCCLRSRCNODES	_IO('D', 85)
 #define	DIOCSETHOSTID	_IOWR('D', 86, u_int32_t)
 #define	DIOCIGETIFACES	_IOWR('D', 87, struct pfioc_iface)
 #define	DIOCSETIFFLAG	_IOWR('D', 89, struct pfioc_iface)
 #define	DIOCCLRIFFLAG	_IOWR('D', 90, struct pfioc_iface)
 #define	DIOCKILLSRCNODES	_IOWR('D', 91, struct pfioc_src_node_kill)
 struct pf_ifspeed {
 	char			ifname[IFNAMSIZ];
 	u_int32_t		baudrate;
 };
 #define	DIOCGIFSPEED	_IOWR('D', 92, struct pf_ifspeed)
 
 #ifdef _KERNEL
 LIST_HEAD(pf_src_node_list, pf_src_node);
 struct pf_srchash {
 	struct pf_src_node_list		nodes;
 	struct mtx			lock;
 };
 
 struct pf_keyhash {
 	LIST_HEAD(, pf_state_key)	keys;
 	struct mtx			lock;
 };
 
 struct pf_idhash {
 	LIST_HEAD(, pf_state)		states;
 	struct mtx			lock;
 };
 
 extern u_long		pf_hashmask;
 extern u_long		pf_srchashmask;
 #define	PF_HASHSIZ	(32768)
 VNET_DECLARE(struct pf_keyhash *, pf_keyhash);
 VNET_DECLARE(struct pf_idhash *, pf_idhash);
 #define V_pf_keyhash	VNET(pf_keyhash)
 #define	V_pf_idhash	VNET(pf_idhash)
 VNET_DECLARE(struct pf_srchash *, pf_srchash);
 #define	V_pf_srchash	VNET(pf_srchash)
 
 #define PF_IDHASH(s)	(be64toh((s)->id) % (pf_hashmask + 1))
 
 VNET_DECLARE(void *, pf_swi_cookie);
 #define V_pf_swi_cookie	VNET(pf_swi_cookie)
 
 VNET_DECLARE(uint64_t, pf_stateid[MAXCPU]);
 #define	V_pf_stateid	VNET(pf_stateid)
 
 TAILQ_HEAD(pf_altqqueue, pf_altq);
 VNET_DECLARE(struct pf_altqqueue,	 pf_altqs[2]);
 #define	V_pf_altqs			 VNET(pf_altqs)
 VNET_DECLARE(struct pf_palist,		 pf_pabuf);
 #define	V_pf_pabuf			 VNET(pf_pabuf)
 
 VNET_DECLARE(u_int32_t,			 ticket_altqs_active);
 #define	V_ticket_altqs_active		 VNET(ticket_altqs_active)
 VNET_DECLARE(u_int32_t,			 ticket_altqs_inactive);
 #define	V_ticket_altqs_inactive		 VNET(ticket_altqs_inactive)
 VNET_DECLARE(int,			 altqs_inactive_open);
 #define	V_altqs_inactive_open		 VNET(altqs_inactive_open)
 VNET_DECLARE(u_int32_t,			 ticket_pabuf);
 #define	V_ticket_pabuf			 VNET(ticket_pabuf)
 VNET_DECLARE(struct pf_altqqueue *,	 pf_altqs_active);
 #define	V_pf_altqs_active		 VNET(pf_altqs_active)
 VNET_DECLARE(struct pf_altqqueue *,	 pf_altqs_inactive);
 #define	V_pf_altqs_inactive		 VNET(pf_altqs_inactive)
 
 VNET_DECLARE(struct pf_rulequeue, pf_unlinked_rules);
 #define	V_pf_unlinked_rules	VNET(pf_unlinked_rules)
 
 void				 pf_initialize(void);
 void				 pf_mtag_initialize(void);
 void				 pf_mtag_cleanup(void);
 void				 pf_cleanup(void);
 
 struct pf_mtag			*pf_get_mtag(struct mbuf *);
 
 extern void			 pf_calc_skip_steps(struct pf_rulequeue *);
 #ifdef ALTQ
 extern	void			 pf_altq_ifnet_event(struct ifnet *, int);
 #endif
 VNET_DECLARE(uma_zone_t,	 pf_state_z);
 #define	V_pf_state_z		 VNET(pf_state_z)
 VNET_DECLARE(uma_zone_t,	 pf_state_key_z);
 #define	V_pf_state_key_z	 VNET(pf_state_key_z)
 VNET_DECLARE(uma_zone_t,	 pf_state_scrub_z);
 #define	V_pf_state_scrub_z	 VNET(pf_state_scrub_z)
 
 extern void			 pf_purge_thread(void *);
 extern void			 pf_intr(void *);
 extern void			 pf_purge_expired_src_nodes(void);
 
 extern int			 pf_unlink_state(struct pf_state *, u_int);
 #define	PF_ENTER_LOCKED		0x00000001
 #define	PF_RETURN_LOCKED	0x00000002
 extern int			 pf_state_insert(struct pfi_kif *,
 				    struct pf_state_key *,
 				    struct pf_state_key *,
 				    struct pf_state *);
 extern void			 pf_free_state(struct pf_state *);
 
 static __inline void
 pf_ref_state(struct pf_state *s)
 {
 
 	refcount_acquire(&s->refs);
 }
 
 static __inline int
 pf_release_state(struct pf_state *s)
 {
 
 	if (refcount_release(&s->refs)) {
 		pf_free_state(s);
 		return (1);
 	} else
 		return (0);
 }
 
 extern struct pf_state		*pf_find_state_byid(uint64_t, uint32_t);
 extern struct pf_state		*pf_find_state_all(struct pf_state_key_cmp *,
 				    u_int, int *);
 extern struct pf_src_node	*pf_find_src_node(struct pf_addr *,
 				    struct pf_rule *, sa_family_t, int);
 extern void			 pf_unlink_src_node(struct pf_src_node *);
 extern u_int			 pf_free_src_nodes(struct pf_src_node_list *);
 extern void			 pf_print_state(struct pf_state *);
 extern void			 pf_print_flags(u_int8_t);
 extern u_int16_t		 pf_cksum_fixup(u_int16_t, u_int16_t, u_int16_t,
 				    u_int8_t);
 
 VNET_DECLARE(struct ifnet *,		 sync_ifp);
 #define	V_sync_ifp			 VNET(sync_ifp)
 VNET_DECLARE(struct pf_rule,		 pf_default_rule);
 #define	V_pf_default_rule		  VNET(pf_default_rule)
 extern void			 pf_addrcpy(struct pf_addr *, struct pf_addr *,
 				    u_int8_t);
 void				pf_free_rule(struct pf_rule *);
 
 #ifdef INET
 int	pf_test(int, struct ifnet *, struct mbuf **, struct inpcb *);
 int	pf_normalize_ip(struct mbuf **, int, struct pfi_kif *, u_short *,
 	    struct pf_pdesc *);
 #endif /* INET */
 
 #ifdef INET6
 int	pf_test6(int, struct ifnet *, struct mbuf **, struct inpcb *);
 int	pf_normalize_ip6(struct mbuf **, int, struct pfi_kif *, u_short *,
 	    struct pf_pdesc *);
 void	pf_poolmask(struct pf_addr *, struct pf_addr*,
 	    struct pf_addr *, struct pf_addr *, u_int8_t);
 void	pf_addr_inc(struct pf_addr *, sa_family_t);
 int	pf_refragment6(struct ifnet *, struct mbuf **, struct m_tag *);
 #endif /* INET6 */
 
 u_int32_t	pf_new_isn(struct pf_state *);
 void   *pf_pull_hdr(struct mbuf *, int, void *, int, u_short *, u_short *,
 	    sa_family_t);
 void	pf_change_a(void *, u_int16_t *, u_int32_t, u_int8_t);
 void	pf_send_deferred_syn(struct pf_state *);
 int	pf_match_addr(u_int8_t, struct pf_addr *, struct pf_addr *,
 	    struct pf_addr *, sa_family_t);
 int	pf_match_addr_range(struct pf_addr *, struct pf_addr *,
 	    struct pf_addr *, sa_family_t);
 int	pf_match_port(u_int8_t, u_int16_t, u_int16_t, u_int16_t);
 
 void	pf_normalize_init(void);
 void	pf_normalize_cleanup(void);
 int	pf_normalize_tcp(int, struct pfi_kif *, struct mbuf *, int, int, void *,
 	    struct pf_pdesc *);
 void	pf_normalize_tcp_cleanup(struct pf_state *);
 int	pf_normalize_tcp_init(struct mbuf *, int, struct pf_pdesc *,
 	    struct tcphdr *, struct pf_state_peer *, struct pf_state_peer *);
 int	pf_normalize_tcp_stateful(struct mbuf *, int, struct pf_pdesc *,
 	    u_short *, struct tcphdr *, struct pf_state *,
 	    struct pf_state_peer *, struct pf_state_peer *, int *);
 u_int32_t
 	pf_state_expires(const struct pf_state *);
 void	pf_purge_expired_fragments(void);
 int	pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *,
 	    int);
 int	pf_socket_lookup(int, struct pf_pdesc *, struct mbuf *);
 struct pf_state_key *pf_alloc_state_key(int);
 void	pfr_initialize(void);
 void	pfr_cleanup(void);
 int	pfr_match_addr(struct pfr_ktable *, struct pf_addr *, sa_family_t);
 void	pfr_update_stats(struct pfr_ktable *, struct pf_addr *, sa_family_t,
 	    u_int64_t, int, int, int);
 int	pfr_pool_get(struct pfr_ktable *, int *, struct pf_addr *, sa_family_t);
 void	pfr_dynaddr_update(struct pfr_ktable *, struct pfi_dynaddr *);
 struct pfr_ktable *
 	pfr_attach_table(struct pf_ruleset *, char *);
 void	pfr_detach_table(struct pfr_ktable *);
 int	pfr_clr_tables(struct pfr_table *, int *, int);
 int	pfr_add_tables(struct pfr_table *, int, int *, int);
 int	pfr_del_tables(struct pfr_table *, int, int *, int);
 int	pfr_get_tables(struct pfr_table *, struct pfr_table *, int *, int);
 int	pfr_get_tstats(struct pfr_table *, struct pfr_tstats *, int *, int);
 int	pfr_clr_tstats(struct pfr_table *, int, int *, int);
 int	pfr_set_tflags(struct pfr_table *, int, int, int, int *, int *, int);
 int	pfr_clr_addrs(struct pfr_table *, int *, int);
 int	pfr_insert_kentry(struct pfr_ktable *, struct pfr_addr *, long);
 int	pfr_add_addrs(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int);
 int	pfr_del_addrs(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int);
 int	pfr_set_addrs(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int *, int *, int *, int, u_int32_t);
 int	pfr_get_addrs(struct pfr_table *, struct pfr_addr *, int *, int);
 int	pfr_get_astats(struct pfr_table *, struct pfr_astats *, int *, int);
 int	pfr_clr_astats(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int);
 int	pfr_tst_addrs(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int);
 int	pfr_ina_begin(struct pfr_table *, u_int32_t *, int *, int);
 int	pfr_ina_rollback(struct pfr_table *, u_int32_t, int *, int);
 int	pfr_ina_commit(struct pfr_table *, u_int32_t, int *, int *, int);
 int	pfr_ina_define(struct pfr_table *, struct pfr_addr *, int, int *,
 	    int *, u_int32_t, int);
 
 MALLOC_DECLARE(PFI_MTYPE);
 VNET_DECLARE(struct pfi_kif *,		 pfi_all);
 #define	V_pfi_all	 		 VNET(pfi_all)
 
 void		 pfi_initialize(void);
 void		 pfi_cleanup(void);
 void		 pfi_kif_ref(struct pfi_kif *);
 void		 pfi_kif_unref(struct pfi_kif *);
 struct pfi_kif	*pfi_kif_find(const char *);
 struct pfi_kif	*pfi_kif_attach(struct pfi_kif *, const char *);
 int		 pfi_kif_match(struct pfi_kif *, struct pfi_kif *);
 void		 pfi_kif_purge(void);
 int		 pfi_match_addr(struct pfi_dynaddr *, struct pf_addr *,
 		    sa_family_t);
 int		 pfi_dynaddr_setup(struct pf_addr_wrap *, sa_family_t);
 void		 pfi_dynaddr_remove(struct pfi_dynaddr *);
 void		 pfi_dynaddr_copyout(struct pf_addr_wrap *);
 void		 pfi_update_status(const char *, struct pf_status *);
 void		 pfi_get_ifaces(const char *, struct pfi_kif *, int *);
 int		 pfi_set_flags(const char *, int);
 int		 pfi_clear_flags(const char *, int);
 
 int		 pf_match_tag(struct mbuf *, struct pf_rule *, int *, int);
 int		 pf_tag_packet(struct mbuf *, struct pf_pdesc *, int);
 int		 pf_addr_cmp(struct pf_addr *, struct pf_addr *,
 		    sa_family_t);
 void		 pf_qid2qname(u_int32_t, char *);
 
 VNET_DECLARE(struct pf_kstatus, pf_status);
 #define	V_pf_status	VNET(pf_status)
 
 struct pf_limit {
 	uma_zone_t	zone;
 	u_int		limit;
 };
 VNET_DECLARE(struct pf_limit, pf_limits[PF_LIMIT_MAX]);
 #define	V_pf_limits VNET(pf_limits)
 
 #endif /* _KERNEL */
 
 #ifdef _KERNEL
 VNET_DECLARE(struct pf_anchor_global,		 pf_anchors);
 #define	V_pf_anchors				 VNET(pf_anchors)
 VNET_DECLARE(struct pf_anchor,			 pf_main_anchor);
 #define	V_pf_main_anchor			 VNET(pf_main_anchor)
 #define pf_main_ruleset	V_pf_main_anchor.ruleset
 #endif
 
 /* these ruleset functions can be linked into userland programs (pfctl) */
 int			 pf_get_ruleset_number(u_int8_t);
 void			 pf_init_ruleset(struct pf_ruleset *);
 int			 pf_anchor_setup(struct pf_rule *,
 			    const struct pf_ruleset *, const char *);
 int			 pf_anchor_copyout(const struct pf_ruleset *,
 			    const struct pf_rule *, struct pfioc_rule *);
 void			 pf_anchor_remove(struct pf_rule *);
 void			 pf_remove_if_empty_ruleset(struct pf_ruleset *);
 struct pf_ruleset	*pf_find_ruleset(const char *);
 struct pf_ruleset	*pf_find_or_create_ruleset(const char *);
 void			 pf_rs_initialize(void);
 
 /* The fingerprint functions can be linked into userland programs (tcpdump) */
 int	pf_osfp_add(struct pf_osfp_ioctl *);
 #ifdef _KERNEL
 struct pf_osfp_enlist *
 	pf_osfp_fingerprint(struct pf_pdesc *, struct mbuf *, int,
 	    const struct tcphdr *);
 #endif /* _KERNEL */
 void	pf_osfp_flush(void);
 int	pf_osfp_get(struct pf_osfp_ioctl *);
 int	pf_osfp_match(struct pf_osfp_enlist *, pf_osfp_t);
 
 #ifdef _KERNEL
 void			 pf_print_host(struct pf_addr *, u_int16_t, u_int8_t);
 
 void			 pf_step_into_anchor(struct pf_anchor_stackframe *, int *,
 			    struct pf_ruleset **, int, struct pf_rule **,
 			    struct pf_rule **, int *);
 int			 pf_step_out_of_anchor(struct pf_anchor_stackframe *, int *,
 			    struct pf_ruleset **, int, struct pf_rule **,
 			    struct pf_rule **, int *);
 
 int			 pf_map_addr(u_int8_t, struct pf_rule *,
 			    struct pf_addr *, struct pf_addr *,
 			    struct pf_addr *, struct pf_src_node **);
 struct pf_rule		*pf_get_translation(struct pf_pdesc *, struct mbuf *,
 			    int, int, struct pfi_kif *, struct pf_src_node **,
 			    struct pf_state_key **, struct pf_state_key **,
 			    struct pf_addr *, struct pf_addr *,
 			    uint16_t, uint16_t, struct pf_anchor_stackframe *);
 
 struct pf_state_key	*pf_state_key_setup(struct pf_pdesc *, struct pf_addr *,
 			    struct pf_addr *, u_int16_t, u_int16_t);
 struct pf_state_key	*pf_state_key_clone(struct pf_state_key *);
 #endif /* _KERNEL */
 
 #endif /* _NET_PFVAR_H_ */
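The PFR_TFLAG_* masks in the pfvar.h hunk above split the table flag bits into
a user-settable group (PFR_TFLAG_USRMASK) and a kernel-managed group
(PFR_TFLAG_SETMASK), which together make up PFR_TFLAG_ALLMASK.  The standalone
sketch below copies the flag values from that header and checks the partition.
The helper names pfr_flags_valid_from_user() and pfr_masks_consistent() are
hypothetical and are not part of pf; they only illustrate how such masks are
typically applied.

```c
#include <stdint.h>

/* Flag values copied from the pfvar.h hunk above. */
#define PFR_TFLAG_PERSIST	0x00000001
#define PFR_TFLAG_CONST		0x00000002
#define PFR_TFLAG_ACTIVE	0x00000004
#define PFR_TFLAG_INACTIVE	0x00000008
#define PFR_TFLAG_REFERENCED	0x00000010
#define PFR_TFLAG_REFDANCHOR	0x00000020
#define PFR_TFLAG_COUNTERS	0x00000040

#define PFR_TFLAG_USRMASK	(PFR_TFLAG_PERSIST | PFR_TFLAG_CONST | \
				 PFR_TFLAG_COUNTERS)
#define PFR_TFLAG_SETMASK	(PFR_TFLAG_ACTIVE | PFR_TFLAG_INACTIVE | \
				 PFR_TFLAG_REFERENCED | PFR_TFLAG_REFDANCHOR)
#define PFR_TFLAG_ALLMASK	(PFR_TFLAG_USRMASK | PFR_TFLAG_SETMASK)

/*
 * Hypothetical helper: returns nonzero when 'flags' contains only bits
 * from the user-settable group.
 */
static int
pfr_flags_valid_from_user(uint32_t flags)
{

	return ((flags & ~(uint32_t)PFR_TFLAG_USRMASK) == 0);
}

/*
 * Hypothetical helper: returns nonzero when the two groups are disjoint
 * and together cover every defined flag bit (0x7f).
 */
static int
pfr_masks_consistent(void)
{

	return ((PFR_TFLAG_USRMASK & PFR_TFLAG_SETMASK) == 0 &&
	    PFR_TFLAG_ALLMASK == 0x0000007f);
}
```

This is why the header carries the "Adjust masks below when adding flags"
comment: a new flag has to be added to exactly one of the two groups, or a
check like pfr_masks_consistent() would no longer hold.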
Index: user/ngie/more-tests/sys/net/route.c
===================================================================
--- user/ngie/more-tests/sys/net/route.c	(revision 281584)
+++ user/ngie/more-tests/sys/net/route.c	(revision 281585)
@@ -1,2024 +1,2023 @@
 /*-
  * Copyright (c) 1980, 1986, 1991, 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)route.c	8.3.1.1 (Berkeley) 2/23/95
  * $FreeBSD$
  */
 /************************************************************************
  * Note: In this file a 'fib' is a "forwarding information base"	*
  * Which is the new name for an in kernel routing (next hop) table.	*
  ***********************************************************************/
 
 #include "opt_inet.h"
 #include "opt_inet6.h"
 #include "opt_route.h"
 #include "opt_sctp.h"
 #include "opt_mrouting.h"
 #include "opt_mpath.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
-#include <sys/syslog.h>
 #include <sys/malloc.h>
 #include <sys/mbuf.h>
 #include <sys/socket.h>
 #include <sys/sysctl.h>
 #include <sys/syslog.h>
 #include <sys/sysproto.h>
 #include <sys/proc.h>
 #include <sys/domain.h>
 #include <sys/kernel.h>
 
 #include <net/if.h>
 #include <net/if_var.h>
 #include <net/if_dl.h>
 #include <net/route.h>
 #include <net/vnet.h>
 #include <net/flowtable.h>
 
 #ifdef RADIX_MPATH
 #include <net/radix_mpath.h>
 #endif
 
 #include <netinet/in.h>
 #include <netinet/ip_mroute.h>
 
 #include <vm/uma.h>
 
 #define	RT_MAXFIBS	UINT16_MAX
 
 /* Kernel config default option. */
 #ifdef ROUTETABLES
 #if ROUTETABLES <= 0
 #error "ROUTETABLES defined too low"
 #endif
 #if ROUTETABLES > RT_MAXFIBS
 #error "ROUTETABLES defined too big"
 #endif
 #define	RT_NUMFIBS	ROUTETABLES
 #endif /* ROUTETABLES */
 /* Initialize to default if not otherwise set. */
 #ifndef	RT_NUMFIBS
 #define	RT_NUMFIBS	1
 #endif
 
 #if defined(INET) || defined(INET6)
 #ifdef SCTP
 extern void sctp_addr_change(struct ifaddr *ifa, int cmd);
 #endif /* SCTP */
 #endif
 
 
 /* This is read-only. */
 u_int rt_numfibs = RT_NUMFIBS;
 SYSCTL_UINT(_net, OID_AUTO, fibs, CTLFLAG_RDTUN, &rt_numfibs, 0, "");
 
 /*
  * By default add routes to all fibs for new interfaces.
  * Once this is set to 0 then only allocate routes on interface
  * changes for the FIB of the caller when adding a new set of addresses
  * to an interface.  XXX this is a shotgun approach to a problem that needs
  * a more fine grained solution.. that will come.
  * XXX also has the problems getting the FIB from curthread which will not
  * always work given the fib can be overridden and prefixes can be added
  * from the network stack context.
  */
 VNET_DEFINE(u_int, rt_add_addr_allfibs) = 1;
 SYSCTL_UINT(_net, OID_AUTO, add_addr_allfibs, CTLFLAG_RWTUN | CTLFLAG_VNET,
     &VNET_NAME(rt_add_addr_allfibs), 0, "");
 
 VNET_DEFINE(struct rtstat, rtstat);
 #define	V_rtstat	VNET(rtstat)
 
 VNET_DEFINE(struct radix_node_head *, rt_tables);
 #define	V_rt_tables	VNET(rt_tables)
 
 VNET_DEFINE(int, rttrash);		/* routes not in table but not freed */
 #define	V_rttrash	VNET(rttrash)
 
 
 /*
  * Convert a 'struct radix_node *' to a 'struct rtentry *'.
  * The operation can be done safely (in this code) because a
  * 'struct rtentry' starts with two 'struct radix_node''s, the first
  * one representing leaf nodes in the routing tree, which is
  * what the code in radix.c passes us as a 'struct radix_node'.
  *
  * But because there are a lot of assumptions in this conversion,
  * do not cast explicitly, but always use the macro below.
  */
 #define RNTORT(p)	((struct rtentry *)(p))
 
 static VNET_DEFINE(uma_zone_t, rtzone);		/* Routing table UMA zone. */
 #define	V_rtzone	VNET(rtzone)
 
 static int rtrequest1_fib_change(struct radix_node_head *, struct rt_addrinfo *,
     struct rtentry **, u_int);
 static void rt_setmetrics(const struct rt_addrinfo *, struct rtentry *);
 
 struct if_mtuinfo
 {
 	struct ifnet	*ifp;
 	int		mtu;
 };
 
 static int	if_updatemtu_cb(struct radix_node *, void *);
 
 /*
  * handler for net.my_fibnum
  */
 static int
 sysctl_my_fibnum(SYSCTL_HANDLER_ARGS)
 {
         int fibnum;
         int error;
  
         fibnum = curthread->td_proc->p_fibnum;
         error = sysctl_handle_int(oidp, &fibnum, 0, req);
         return (error);
 }
 
 SYSCTL_PROC(_net, OID_AUTO, my_fibnum, CTLTYPE_INT|CTLFLAG_RD,
             NULL, 0, &sysctl_my_fibnum, "I", "default FIB of caller");
 
 static __inline struct radix_node_head **
 rt_tables_get_rnh_ptr(int table, int fam)
 {
 	struct radix_node_head **rnh;
 
 	KASSERT(table >= 0 && table < rt_numfibs, ("%s: table out of bounds.",
 	    __func__));
 	KASSERT(fam >= 0 && fam < (AF_MAX+1), ("%s: fam out of bounds.",
 	    __func__));
 
 	/* rnh is [fib=0][af=0]. */
 	rnh = (struct radix_node_head **)V_rt_tables;
 	/* Get the offset to the requested table and fam. */
 	rnh += table * (AF_MAX+1) + fam;
 
 	return (rnh);
 }
 
 struct radix_node_head *
 rt_tables_get_rnh(int table, int fam)
 {
 
 	return (*rt_tables_get_rnh_ptr(table, fam));
 }
 
 /*
  * route initialization must occur before ip6_init2(), which happens at
  * SI_ORDER_MIDDLE.
  */
 static void
 route_init(void)
 {
 
 	/* whack the tunable ints into line. */
 	if (rt_numfibs > RT_MAXFIBS)
 		rt_numfibs = RT_MAXFIBS;
 	if (rt_numfibs == 0)
 		rt_numfibs = 1;
 }
 SYSINIT(route_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD, route_init, 0);
 
 static int
 rtentry_zinit(void *mem, int size, int how)
 {
 	struct rtentry *rt = mem;
 
 	rt->rt_pksent = counter_u64_alloc(how);
 	if (rt->rt_pksent == NULL)
 		return (ENOMEM);
 
 	RT_LOCK_INIT(rt);
 
 	return (0);
 }
 
 static void
 rtentry_zfini(void *mem, int size)
 {
 	struct rtentry *rt = mem;
 
 	RT_LOCK_DESTROY(rt);
 	counter_u64_free(rt->rt_pksent);
 }
 
 static int
 rtentry_ctor(void *mem, int size, void *arg, int how)
 {
 	struct rtentry *rt = mem;
 
 	bzero(rt, offsetof(struct rtentry, rt_endzero));
 	counter_u64_zero(rt->rt_pksent);
 
 	return (0);
 }
 
 static void
 rtentry_dtor(void *mem, int size, void *arg)
 {
 	struct rtentry *rt = mem;
 
 	RT_UNLOCK_COND(rt);
 }
 
 static void
 vnet_route_init(const void *unused __unused)
 {
 	struct domain *dom;
 	struct radix_node_head **rnh;
 	int table;
 	int fam;
 
 	V_rt_tables = malloc(rt_numfibs * (AF_MAX+1) *
 	    sizeof(struct radix_node_head *), M_RTABLE, M_WAITOK|M_ZERO);
 
 	V_rtzone = uma_zcreate("rtentry", sizeof(struct rtentry),
 	    rtentry_ctor, rtentry_dtor,
 	    rtentry_zinit, rtentry_zfini, UMA_ALIGN_PTR, 0);
 	for (dom = domains; dom; dom = dom->dom_next) {
 		if (dom->dom_rtattach == NULL)
 			continue;
 
 		for  (table = 0; table < rt_numfibs; table++) {
 			fam = dom->dom_family;
 			if (table != 0 && fam != AF_INET6 && fam != AF_INET)
 				break;
 
 			rnh = rt_tables_get_rnh_ptr(table, fam);
 			if (rnh == NULL)
 				panic("%s: rnh NULL", __func__);
 			dom->dom_rtattach((void **)rnh, 0);
 		}
 	}
 }
 VNET_SYSINIT(vnet_route_init, SI_SUB_PROTO_DOMAIN, SI_ORDER_FOURTH,
     vnet_route_init, 0);
 
 #ifdef VIMAGE
 static void
 vnet_route_uninit(const void *unused __unused)
 {
 	int table;
 	int fam;
 	struct domain *dom;
 	struct radix_node_head **rnh;
 
 	for (dom = domains; dom; dom = dom->dom_next) {
 		if (dom->dom_rtdetach == NULL)
 			continue;
 
 		for (table = 0; table < rt_numfibs; table++) {
 			fam = dom->dom_family;
 
 			if (table != 0 && fam != AF_INET6 && fam != AF_INET)
 				break;
 
 			rnh = rt_tables_get_rnh_ptr(table, fam);
 			if (rnh == NULL)
 				panic("%s: rnh NULL", __func__);
 			dom->dom_rtdetach((void **)rnh, 0);
 		}
 	}
 
 	free(V_rt_tables, M_RTABLE);
 	uma_zdestroy(V_rtzone);
 }
 VNET_SYSUNINIT(vnet_route_uninit, SI_SUB_PROTO_DOMAIN, SI_ORDER_THIRD,
     vnet_route_uninit, 0);
 #endif
 
 #ifndef _SYS_SYSPROTO_H_
 struct setfib_args {
 	int     fibnum;
 };
 #endif
 int
 sys_setfib(struct thread *td, struct setfib_args *uap)
 {
 	if (uap->fibnum < 0 || uap->fibnum >= rt_numfibs)
 		return (EINVAL);
 	td->td_proc->p_fibnum = uap->fibnum;
 	return (0);
 }
 
 /*
  * Packet routing routines.
  */
 void
 rtalloc(struct route *ro)
 {
 
 	rtalloc_ign_fib(ro, 0UL, RT_DEFAULT_FIB);
 }
 
 void
 rtalloc_fib(struct route *ro, u_int fibnum)
 {
 	rtalloc_ign_fib(ro, 0UL, fibnum);
 }
 
 void
 rtalloc_ign(struct route *ro, u_long ignore)
 {
 	struct rtentry *rt;
 
 	if ((rt = ro->ro_rt) != NULL) {
 		if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP)
 			return;
 		RTFREE(rt);
 		ro->ro_rt = NULL;
 	}
 	ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, RT_DEFAULT_FIB);
 	if (ro->ro_rt)
 		RT_UNLOCK(ro->ro_rt);
 }
 
 void
 rtalloc_ign_fib(struct route *ro, u_long ignore, u_int fibnum)
 {
 	struct rtentry *rt;
 
 	if ((rt = ro->ro_rt) != NULL) {
 		if (rt->rt_ifp != NULL && rt->rt_flags & RTF_UP)
 			return;
 		RTFREE(rt);
 		ro->ro_rt = NULL;
 	}
 	ro->ro_rt = rtalloc1_fib(&ro->ro_dst, 1, ignore, fibnum);
 	if (ro->ro_rt)
 		RT_UNLOCK(ro->ro_rt);
 }
 
 /*
  * Look up the route that matches the address given,
  * or at least try.  Create a cloned route if needed.
  *
  * The returned route, if any, is locked.
  */
 struct rtentry *
 rtalloc1(struct sockaddr *dst, int report, u_long ignflags)
 {
 
 	return (rtalloc1_fib(dst, report, ignflags, RT_DEFAULT_FIB));
 }
 
 struct rtentry *
 rtalloc1_fib(struct sockaddr *dst, int report, u_long ignflags,
 		    u_int fibnum)
 {
 	struct radix_node_head *rnh;
 	struct radix_node *rn;
 	struct rtentry *newrt;
 	struct rt_addrinfo info;
 	int err = 0, msgtype = RTM_MISS;
 	int needlock;
 
 	KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum"));
 	rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
 	newrt = NULL;
 	if (rnh == NULL)
 		goto miss;
 
 	/*
 	 * Look up the address in the table for that Address Family
 	 */
 	needlock = !(ignflags & RTF_RNH_LOCKED);
 	if (needlock)
 		RADIX_NODE_HEAD_RLOCK(rnh);
 #ifdef INVARIANTS	
 	else
 		RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
 #endif
 	rn = rnh->rnh_matchaddr(dst, rnh);
 	if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) {
 		newrt = RNTORT(rn);
 		RT_LOCK(newrt);
 		RT_ADDREF(newrt);
 		if (needlock)
 			RADIX_NODE_HEAD_RUNLOCK(rnh);
 		goto done;
 
 	} else if (needlock)
 		RADIX_NODE_HEAD_RUNLOCK(rnh);
 	
 	/*
 	 * Either we hit the root or couldn't find any match,
 	 * which basically means
 	 * "can't get there from here".
 	 */
 miss:
 	V_rtstat.rts_unreach++;
 
 	if (report) {
 		/*
 		 * If required, report the failure to the supervising
 		 * Authorities.
 		 * For a delete, this is not an error. (report == 0)
 		 */
 		bzero(&info, sizeof(info));
 		info.rti_info[RTAX_DST] = dst;
 		rt_missmsg_fib(msgtype, &info, 0, err, fibnum);
 	}	
 done:
 	if (newrt)
 		RT_LOCK_ASSERT(newrt);
 	return (newrt);
 }
 
 /*
  * Remove a reference count from an rtentry.
  * If the count gets low enough, take it out of the routing table
  */
 void
 rtfree(struct rtentry *rt)
 {
 	struct radix_node_head *rnh;
 
 	KASSERT(rt != NULL,("%s: NULL rt", __func__));
 	rnh = rt_tables_get_rnh(rt->rt_fibnum, rt_key(rt)->sa_family);
 	KASSERT(rnh != NULL,("%s: NULL rnh", __func__));
 
 	RT_LOCK_ASSERT(rt);
 
 	/*
 	 * The callers should use RTFREE_LOCKED() or RTFREE(), so
 	 * we should come here exactly with the last reference.
 	 */
 	RT_REMREF(rt);
 	if (rt->rt_refcnt > 0) {
 		log(LOG_DEBUG, "%s: %p has %d refs\n", __func__, rt, rt->rt_refcnt);
 		goto done;
 	}
 
 	/*
 	 * On last reference give the "close method" a chance
 	 * to cleanup private state.  This also permits (for
 	 * IPv4 and IPv6) a chance to decide if the routing table
 	 * entry should be purged immediately or at a later time.
 	 * When an immediate purge is to happen the close routine
 	 * typically calls rtexpunge which clears the RTF_UP flag
 	 * on the entry so that the code below reclaims the storage.
 	 */
 	if (rt->rt_refcnt == 0 && rnh->rnh_close)
 		rnh->rnh_close((struct radix_node *)rt, rnh);
 
 	/*
 	 * If we are no longer "up" (and ref == 0)
 	 * then we can free the resources associated
 	 * with the route.
 	 */
 	if ((rt->rt_flags & RTF_UP) == 0) {
 		if (rt->rt_nodes->rn_flags & (RNF_ACTIVE | RNF_ROOT))
 			panic("rtfree 2");
 		/*
 		 * The rtentry must have been removed from the routing table,
 		 * so it is represented in rttrash.  Remove that now.
 		 */
 		V_rttrash--;
 #ifdef	DIAGNOSTIC
 		if (rt->rt_refcnt < 0) {
 			printf("rtfree: %p not freed (neg refs)\n", rt);
 			goto done;
 		}
 #endif
 		/*
 		 * Release the references we hold on other items,
 		 * e.g. other routes and ifaddrs.
 		 */
 		if (rt->rt_ifa)
 			ifa_free(rt->rt_ifa);
 		/*
 		 * The key is separately alloc'd so free it (see rt_setgate()).
 		 * This also frees the gateway, as they are always malloc'd
 		 * together.
 		 */
 		Free(rt_key(rt));
 
 		/*
 		 * and the rtentry itself of course
 		 */
 		uma_zfree(V_rtzone, rt);
 		return;
 	}
 done:
 	RT_UNLOCK(rt);
 }
 
 
 /*
  * Force a routing table entry to the specified
  * destination to go through the given gateway.
  * Normally called as a result of a routing redirect
  * message from the network layer.
  */
 void
 rtredirect(struct sockaddr *dst,
 	struct sockaddr *gateway,
 	struct sockaddr *netmask,
 	int flags,
 	struct sockaddr *src)
 {
 
 	rtredirect_fib(dst, gateway, netmask, flags, src, RT_DEFAULT_FIB);
 }
 
 void
 rtredirect_fib(struct sockaddr *dst,
 	struct sockaddr *gateway,
 	struct sockaddr *netmask,
 	int flags,
 	struct sockaddr *src,
 	u_int fibnum)
 {
 	struct rtentry *rt, *rt0 = NULL;
 	int error = 0;
 	short *stat = NULL;
 	struct rt_addrinfo info;
 	struct ifaddr *ifa;
 	struct radix_node_head *rnh;
 
 	ifa = NULL;
 	rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
 	if (rnh == NULL) {
 		error = EAFNOSUPPORT;
 		goto out;
 	}
 
 	/* verify the gateway is directly reachable */
 	if ((ifa = ifa_ifwithnet(gateway, 0, fibnum)) == NULL) {
 		error = ENETUNREACH;
 		goto out;
 	}
 	rt = rtalloc1_fib(dst, 0, 0UL, fibnum);	/* NB: rt is locked */
 	/*
 	 * If the redirect isn't from our current router for this dst,
 	 * it's either old or wrong.  If it redirects us to ourselves,
 	 * we have a routing loop, perhaps as a result of an interface
 	 * going down recently.
 	 */
 	if (!(flags & RTF_DONE) && rt &&
 	     (!sa_equal(src, rt->rt_gateway) || rt->rt_ifa != ifa))
 		error = EINVAL;
 	else if (ifa_ifwithaddr_check(gateway))
 		error = EHOSTUNREACH;
 	if (error)
 		goto done;
 	/*
 	 * Create a new entry if we just got back a wildcard entry
 	 * or the lookup failed.  This is necessary for hosts
 	 * which use routing redirects generated by smart gateways
 	 * to dynamically build the routing tables.
 	 */
 	if (rt == NULL || (rt_mask(rt) && rt_mask(rt)->sa_len < 2))
 		goto create;
 	/*
 	 * Don't listen to the redirect if it's
 	 * for a route to an interface.
 	 */
 	if (rt->rt_flags & RTF_GATEWAY) {
 		if (((rt->rt_flags & RTF_HOST) == 0) && (flags & RTF_HOST)) {
 			/*
 			 * Changing from route to net => route to host.
 			 * Create new route, rather than smashing route to net.
 			 */
 		create:
 			rt0 = rt;
 			rt = NULL;
 		
 			flags |=  RTF_GATEWAY | RTF_DYNAMIC;
 			bzero((caddr_t)&info, sizeof(info));
 			info.rti_info[RTAX_DST] = dst;
 			info.rti_info[RTAX_GATEWAY] = gateway;
 			info.rti_info[RTAX_NETMASK] = netmask;
 			info.rti_ifa = ifa;
 			info.rti_flags = flags;
 			if (rt0 != NULL)
 				RT_UNLOCK(rt0);	/* drop lock to avoid LOR with RNH */
 			error = rtrequest1_fib(RTM_ADD, &info, &rt, fibnum);
 			if (rt != NULL) {
 				RT_LOCK(rt);
 				if (rt0 != NULL)
 					EVENTHANDLER_INVOKE(route_redirect_event, rt0, rt, dst);
 				flags = rt->rt_flags;
 			}
 			if (rt0 != NULL)
 				RTFREE(rt0);
 			
 			stat = &V_rtstat.rts_dynamic;
 		} else {
 			struct rtentry *gwrt;
 
 			/*
 			 * Smash the current notion of the gateway to
 			 * this destination.  Should check about netmask!!!
 			 */
 			rt->rt_flags |= RTF_MODIFIED;
 			flags |= RTF_MODIFIED;
 			stat = &V_rtstat.rts_newgateway;
 			/*
 			 * add the key and gateway (in one malloc'd chunk).
 			 */
 			RT_UNLOCK(rt);
 			RADIX_NODE_HEAD_LOCK(rnh);
 			RT_LOCK(rt);
 			rt_setgate(rt, rt_key(rt), gateway);
 			gwrt = rtalloc1(gateway, 1, RTF_RNH_LOCKED);
 			RADIX_NODE_HEAD_UNLOCK(rnh);
 			EVENTHANDLER_INVOKE(route_redirect_event, rt, gwrt, dst);
 			RTFREE_LOCKED(gwrt);
 		}
 	} else
 		error = EHOSTUNREACH;
 done:
 	if (rt)
 		RTFREE_LOCKED(rt);
 out:
 	if (error)
 		V_rtstat.rts_badredirect++;
 	else if (stat != NULL)
 		(*stat)++;
 	bzero((caddr_t)&info, sizeof(info));
 	info.rti_info[RTAX_DST] = dst;
 	info.rti_info[RTAX_GATEWAY] = gateway;
 	info.rti_info[RTAX_NETMASK] = netmask;
 	info.rti_info[RTAX_AUTHOR] = src;
 	rt_missmsg_fib(RTM_REDIRECT, &info, flags, error, fibnum);
 	if (ifa != NULL)
 		ifa_free(ifa);
 }
 
 int
 rtioctl(u_long req, caddr_t data)
 {
 
 	return (rtioctl_fib(req, data, RT_DEFAULT_FIB));
 }
 
 /*
  * Routing table ioctl interface.
  */
 int
 rtioctl_fib(u_long req, caddr_t data, u_int fibnum)
 {
 
 	/*
 	 * If more ioctl commands are added here, make sure the proper
 	 * super-user checks are being performed because it is possible for
 	 * prison-root to make it this far if raw sockets have been enabled
 	 * in jails.
 	 */
 #ifdef INET
 	/* Multicast goop, grrr... */
 	return (mrt_ioctl ? mrt_ioctl(req, data, fibnum) : EOPNOTSUPP);
 #else /* INET */
 	return (ENXIO);
 #endif /* INET */
 }
 
 struct ifaddr *
 ifa_ifwithroute(int flags, struct sockaddr *dst, struct sockaddr *gateway,
 				u_int fibnum)
 {
 	struct ifaddr *ifa;
 	int not_found = 0;
 
 	if ((flags & RTF_GATEWAY) == 0) {
 		/*
 		 * If we are adding a route to an interface,
 		 * and the interface is a point-to-point link,
 		 * we should search for the destination
 		 * as our clue to the interface.  Otherwise
 		 * we can use the local address.
 		 */
 		ifa = NULL;
 		if (flags & RTF_HOST)
 			ifa = ifa_ifwithdstaddr(dst, fibnum);
 		if (ifa == NULL)
 			ifa = ifa_ifwithaddr(gateway);
 	} else {
 		/*
 		 * If we are adding a route to a remote net
 		 * or host, the gateway may still be on the
 		 * other end of a point-to-point link.
 		 */
 		ifa = ifa_ifwithdstaddr(gateway, fibnum);
 	}
 	if (ifa == NULL)
 		ifa = ifa_ifwithnet(gateway, 0, fibnum);
 	if (ifa == NULL) {
 		struct rtentry *rt = rtalloc1_fib(gateway, 0, RTF_RNH_LOCKED, fibnum);
 		if (rt == NULL)
 			return (NULL);
 		/*
 		 * dismiss a gateway that is reachable only
 		 * through the default router
 		 */
 		switch (gateway->sa_family) {
 		case AF_INET:
 			if (satosin(rt_key(rt))->sin_addr.s_addr == INADDR_ANY)
 				not_found = 1;
 			break;
 		case AF_INET6:
 			if (IN6_IS_ADDR_UNSPECIFIED(&satosin6(rt_key(rt))->sin6_addr))
 				not_found = 1;
 			break;
 		default:
 			break;
 		}
 		if (!not_found && rt->rt_ifa != NULL) {
 			ifa = rt->rt_ifa;
 			ifa_ref(ifa);
 		}
 		RT_REMREF(rt);
 		RT_UNLOCK(rt);
 		if (not_found || ifa == NULL)
 			return (NULL);
 	}
 	if (ifa->ifa_addr->sa_family != dst->sa_family) {
 		struct ifaddr *oifa = ifa;
 		ifa = ifaof_ifpforaddr(dst, ifa->ifa_ifp);
 		if (ifa == NULL)
 			ifa = oifa;
 		else
 			ifa_free(oifa);
 	}
 	return (ifa);
 }
 
 /*
  * Do appropriate manipulations of a routing tree given
  * all the bits of info needed
  */
 int
 rtrequest(int req,
 	struct sockaddr *dst,
 	struct sockaddr *gateway,
 	struct sockaddr *netmask,
 	int flags,
 	struct rtentry **ret_nrt)
 {
 
 	return (rtrequest_fib(req, dst, gateway, netmask, flags, ret_nrt,
 	    RT_DEFAULT_FIB));
 }
 
 int
 rtrequest_fib(int req,
 	struct sockaddr *dst,
 	struct sockaddr *gateway,
 	struct sockaddr *netmask,
 	int flags,
 	struct rtentry **ret_nrt,
 	u_int fibnum)
 {
 	struct rt_addrinfo info;
 
 	if (dst->sa_len == 0)
 		return (EINVAL);
 
 	bzero((caddr_t)&info, sizeof(info));
 	info.rti_flags = flags;
 	info.rti_info[RTAX_DST] = dst;
 	info.rti_info[RTAX_GATEWAY] = gateway;
 	info.rti_info[RTAX_NETMASK] = netmask;
 	return (rtrequest1_fib(req, &info, ret_nrt, fibnum));
 }
 
 /*
  * These (questionable) definitions of apparent local variables apply
  * to the next two functions.  XXXXXX!!!
  */
 #define	dst	info->rti_info[RTAX_DST]
 #define	gateway	info->rti_info[RTAX_GATEWAY]
 #define	netmask	info->rti_info[RTAX_NETMASK]
 #define	ifaaddr	info->rti_info[RTAX_IFA]
 #define	ifpaddr	info->rti_info[RTAX_IFP]
 #define	flags	info->rti_flags
 
 int
 rt_getifa(struct rt_addrinfo *info)
 {
 
 	return (rt_getifa_fib(info, RT_DEFAULT_FIB));
 }
 
 /*
  * Look up rt_addrinfo for a specific fib.  Note that if rti_ifa is defined,
  * it will be referenced so the caller must free it.
  */
 int
 rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum)
 {
 	struct ifaddr *ifa;
 	int error = 0;
 
 	/*
 	 * ifp may be specified by sockaddr_dl
 	 * when protocol address is ambiguous.
 	 */
 	if (info->rti_ifp == NULL && ifpaddr != NULL &&
 	    ifpaddr->sa_family == AF_LINK &&
 	    (ifa = ifa_ifwithnet(ifpaddr, 0, fibnum)) != NULL) {
 		info->rti_ifp = ifa->ifa_ifp;
 		ifa_free(ifa);
 	}
 	if (info->rti_ifa == NULL && ifaaddr != NULL)
 		info->rti_ifa = ifa_ifwithaddr(ifaaddr);
 	if (info->rti_ifa == NULL) {
 		struct sockaddr *sa;
 
 		sa = ifaaddr != NULL ? ifaaddr :
 		    (gateway != NULL ? gateway : dst);
 		if (sa != NULL && info->rti_ifp != NULL)
 			info->rti_ifa = ifaof_ifpforaddr(sa, info->rti_ifp);
 		else if (dst != NULL && gateway != NULL)
 			info->rti_ifa = ifa_ifwithroute(flags, dst, gateway,
 							fibnum);
 		else if (sa != NULL)
 			info->rti_ifa = ifa_ifwithroute(flags, sa, sa,
 							fibnum);
 	}
 	if ((ifa = info->rti_ifa) != NULL) {
 		if (info->rti_ifp == NULL)
 			info->rti_ifp = ifa->ifa_ifp;
 	} else
 		error = ENETUNREACH;
 	return (error);
 }
 
 /*
  * Expunges references to a route that's about to be reclaimed.
  * The route must be locked.
  */
 int
 rt_expunge(struct radix_node_head *rnh, struct rtentry *rt)
 {
 #if !defined(RADIX_MPATH)
 	struct radix_node *rn;
 #else
 	struct rt_addrinfo info;
 	int fib;
 	struct rtentry *rt0;
 #endif
 	struct ifaddr *ifa;
 	int error = 0;
 
 	RT_LOCK_ASSERT(rt);
 	RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
 
 #ifdef RADIX_MPATH
 	fib = rt->rt_fibnum;
 	bzero(&info, sizeof(info));
 	info.rti_ifp = rt->rt_ifp;
 	info.rti_flags = RTF_RNH_LOCKED;
 	info.rti_info[RTAX_DST] = rt_key(rt);
 	info.rti_info[RTAX_GATEWAY] = rt->rt_ifa->ifa_addr;
 
 	RT_UNLOCK(rt);
 	error = rtrequest1_fib(RTM_DELETE, &info, &rt0, fib);
 
 	if (error == 0 && rt0 != NULL) {
 		rt = rt0;
 		RT_LOCK(rt);
 	} else if (error != 0) {
 		RT_LOCK(rt);
 		return (error);
 	}
 #else
 	/*
 	 * Remove the item from the tree; it should be there,
 	 * but when callers invoke us blindly it may not (sigh).
 	 */
 	rn = rnh->rnh_deladdr(rt_key(rt), rt_mask(rt), rnh);
 	if (rn == NULL) {
 		error = ESRCH;
 		goto bad;
 	}
 	KASSERT((rn->rn_flags & (RNF_ACTIVE | RNF_ROOT)) == 0,
 		("unexpected flags 0x%x", rn->rn_flags));
 	KASSERT(rt == RNTORT(rn),
 		("lookup mismatch, rt %p rn %p", rt, rn));
 #endif /* RADIX_MPATH */
 
 	rt->rt_flags &= ~RTF_UP;
 
 	/*
 	 * Give the protocol a chance to keep things in sync.
 	 */
 	if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest) {
 		struct rt_addrinfo info;
 
 		bzero((caddr_t)&info, sizeof(info));
 		info.rti_flags = rt->rt_flags;
 		info.rti_info[RTAX_DST] = rt_key(rt);
 		info.rti_info[RTAX_GATEWAY] = rt->rt_gateway;
 		info.rti_info[RTAX_NETMASK] = rt_mask(rt);
 		ifa->ifa_rtrequest(RTM_DELETE, rt, &info);
 	}
 
 	/*
 	 * one more rtentry floating around that is not
 	 * linked to the routing table.
 	 */
 	V_rttrash++;
 #if !defined(RADIX_MPATH)
 bad:
 #endif
 	return (error);
 }
 
 static int
 if_updatemtu_cb(struct radix_node *rn, void *arg)
 {
 	struct rtentry *rt;
 	struct if_mtuinfo *ifmtu;
 
 	rt = (struct rtentry *)rn;
 	ifmtu = (struct if_mtuinfo *)arg;
 
 	if (rt->rt_ifp != ifmtu->ifp)
 		return (0);
 
 	if (rt->rt_mtu >= ifmtu->mtu) {
 		/* We have to decrease mtu regardless of flags */
 		rt->rt_mtu = ifmtu->mtu;
 		return (0);
 	}
 
 	/*
 	 * New MTU is bigger.  Check if we are allowed to alter it.
 	 */
 	if ((rt->rt_flags & (RTF_FIXEDMTU | RTF_GATEWAY | RTF_HOST)) != 0) {
 
 		/*
 		 * Skip routes with user-supplied MTU and
 		 * non-interface routes
 		 */
 		return (0);
 	}
 
 	/* We are safe to update route MTU */
 	rt->rt_mtu = ifmtu->mtu;
 
 	return (0);
 }
 
 void
 rt_updatemtu(struct ifnet *ifp)
 {
 	struct if_mtuinfo ifmtu;
 	struct radix_node_head *rnh;
 	int i, j;
 
 	ifmtu.ifp = ifp;
 
 	/*
 	 * Try to update rt_mtu for all routes using this interface.
 	 * Unfortunately the only way to do this is to traverse all
 	 * routing tables in all fibs/domains.
 	 */
 	for (i = 1; i <= AF_MAX; i++) {
 		ifmtu.mtu = if_getmtu_family(ifp, i);
 		for (j = 0; j < rt_numfibs; j++) {
 			rnh = rt_tables_get_rnh(j, i);
 			if (rnh == NULL)
 				continue;
 			RADIX_NODE_HEAD_LOCK(rnh);
 			rnh->rnh_walktree(rnh, if_updatemtu_cb, &ifmtu);
 			RADIX_NODE_HEAD_UNLOCK(rnh);
 		}
 	}
 }
 
 
 #if 0
 int p_sockaddr(char *buf, int buflen, struct sockaddr *s);
 int rt_print(char *buf, int buflen, struct rtentry *rt);
 
 int
 p_sockaddr(char *buf, int buflen, struct sockaddr *s)
 {
 	void *paddr = NULL;
 
 	switch (s->sa_family) {
 	case AF_INET:
 		paddr = &((struct sockaddr_in *)s)->sin_addr;
 		break;
 	case AF_INET6:
 		paddr = &((struct sockaddr_in6 *)s)->sin6_addr;
 		break;
 	}
 
 	if (paddr == NULL)
 		return (0);
 
 	if (inet_ntop(s->sa_family, paddr, buf, buflen) == NULL)
 		return (0);
 	
 	return (strlen(buf));
 }
 
 int
 rt_print(char *buf, int buflen, struct rtentry *rt)
 {
 	struct sockaddr *addr, *mask;
 	int i = 0;
 
 	addr = rt_key(rt);
 	mask = rt_mask(rt);
 
 	i = p_sockaddr(buf, buflen, addr);
 	if (!(rt->rt_flags & RTF_HOST)) {
 		buf[i++] = '/';
 		i += p_sockaddr(buf + i, buflen - i, mask);
 	}
 
 	if (rt->rt_flags & RTF_GATEWAY) {
 		buf[i++] = '>';
 		i += p_sockaddr(buf + i, buflen - i, rt->rt_gateway);
 	}
 
 	return (i);
 }
 #endif
 
 #ifdef RADIX_MPATH
 static int
 rn_mpath_update(int req, struct rt_addrinfo *info,
     struct radix_node_head *rnh, struct rtentry **ret_nrt)
 {
 	/*
 	 * if we got multipath routes, we require users to specify
 	 * a matching RTAX_GATEWAY.
 	 */
 	struct rtentry *rt, *rto = NULL;
 	struct radix_node *rn;
 	int error = 0;
 
 	rn = rnh->rnh_lookup(dst, netmask, rnh);
 	if (rn == NULL)
 		return (ESRCH);
 	rto = rt = RNTORT(rn);
 
 	rt = rt_mpath_matchgate(rt, gateway);
 	if (rt == NULL)
 		return (ESRCH);
 	/*
 	 * this is the first entry in the chain
 	 */
 	if (rto == rt) {
 		rn = rn_mpath_next((struct radix_node *)rt);
 		/*
 		 * there is another entry, now it's active
 		 */
 		if (rn) {
 			rto = RNTORT(rn);
 			RT_LOCK(rto);
 			rto->rt_flags |= RTF_UP;
 			RT_UNLOCK(rto);
 		} else if (rt->rt_flags & RTF_GATEWAY) {
 			/*
 			 * For gateway routes, we need to
 			 * make sure that we are deleting
 			 * the correct gateway.
 			 * rt_mpath_matchgate() does not
 			 * check the case when there is only
 			 * one route in the chain.
 			 */
 			if (gateway &&
 			    (rt->rt_gateway->sa_len != gateway->sa_len ||
 				memcmp(rt->rt_gateway, gateway, gateway->sa_len)))
 				error = ESRCH;
 			else {
 				/*
 				 * remove from tree before returning it
 				 * to the caller
 				 */
 				rn = rnh->rnh_deladdr(dst, netmask, rnh);
 				KASSERT(rt == RNTORT(rn), ("radix node disappeared"));
 				goto gwdelete;
 			}
 			
 		}
 		/*
 		 * use the normal delete code to remove
 		 * the first entry
 		 */
 		if (req != RTM_DELETE) 
 			goto nondelete;
 
 		error = ENOENT;
 		goto done;
 	}
 		
 	/*
 	 * if the entry is 2nd and on up
 	 */
 	if ((req == RTM_DELETE) && !rt_mpath_deldup(rto, rt))
 		panic ("rtrequest1: rt_mpath_deldup");
 gwdelete:
 	RT_LOCK(rt);
 	RT_ADDREF(rt);
 	if (req == RTM_DELETE) {
 		rt->rt_flags &= ~RTF_UP;
 		/*
 		 * One more rtentry floating around that is not
 		 * linked to the routing table. rttrash will be decremented
 		 * when RTFREE(rt) is eventually called.
 		 */
 		V_rttrash++;
 	}
 	
 nondelete:
 	if (req != RTM_DELETE)
 		panic("unrecognized request %d", req);
 	
 
 	/*
 	 * If the caller wants it, then it can have it,
 	 * but it's up to it to free the rtentry as we won't be
 	 * doing it.
 	 */
 	if (ret_nrt) {
 		*ret_nrt = rt;
 		RT_UNLOCK(rt);
 	} else
 		RTFREE_LOCKED(rt);
 done:
 	return (error);
 }
 #endif
 
 int
 rtrequest1_fib(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt,
 				u_int fibnum)
 {
 	int error = 0, needlock = 0;
 	struct rtentry *rt;
 #ifdef FLOWTABLE
 	struct rtentry *rt0;
 #endif
 	struct radix_node *rn;
 	struct radix_node_head *rnh;
 	struct ifaddr *ifa;
 	struct sockaddr *ndst;
 	struct sockaddr_storage mdst;
 #define senderr(x) { error = x ; goto bad; }
 
 	KASSERT((fibnum < rt_numfibs), ("rtrequest1_fib: bad fibnum"));
 	switch (dst->sa_family) {
 	case AF_INET6:
 	case AF_INET:
 		/* We support multiple FIBs. */
 		break;
 	default:
 		fibnum = RT_DEFAULT_FIB;
 		break;
 	}
 
 	/*
 	 * Find the correct routing tree to use for this Address Family
 	 */
 	rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
 	if (rnh == NULL)
 		return (EAFNOSUPPORT);
 	needlock = ((flags & RTF_RNH_LOCKED) == 0);
 	flags &= ~RTF_RNH_LOCKED;
 	if (needlock)
 		RADIX_NODE_HEAD_LOCK(rnh);
 	else
 		RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
 	/*
 	 * If we are adding a host route then we don't want to put
 	 * a netmask in the tree, nor do we want to clone it.
 	 */
 	if (flags & RTF_HOST)
 		netmask = NULL;
 
 	switch (req) {
 	case RTM_DELETE:
 		if (netmask) {
 			rt_maskedcopy(dst, (struct sockaddr *)&mdst, netmask);
 			dst = (struct sockaddr *)&mdst;
 		}
 #ifdef RADIX_MPATH
 		if (rn_mpath_capable(rnh)) {
 			error = rn_mpath_update(req, info, rnh, ret_nrt);
 			/*
 			 * "bad" holds true for the success case
 			 * as well
 			 */
 			if (error != ENOENT)
 				goto bad;
 			error = 0;
 		}
 #endif
 		if ((flags & RTF_PINNED) == 0) {
 			/* Check if target route can be deleted */
 			rt = (struct rtentry *)rnh->rnh_lookup(dst,
 			    netmask, rnh);
 			if ((rt != NULL) && (rt->rt_flags & RTF_PINNED))
 				senderr(EADDRINUSE);
 		}
 
 		/*
 		 * Remove the item from the tree and return it.
 		 * Complain if it is not there and do no more processing.
 		 */
 		rn = rnh->rnh_deladdr(dst, netmask, rnh);
 		if (rn == NULL)
 			senderr(ESRCH);
 		if (rn->rn_flags & (RNF_ACTIVE | RNF_ROOT))
 			panic ("rtrequest delete");
 		rt = RNTORT(rn);
 		RT_LOCK(rt);
 		RT_ADDREF(rt);
 		rt->rt_flags &= ~RTF_UP;
 
 		/*
 		 * give the protocol a chance to keep things in sync.
 		 */
 		if ((ifa = rt->rt_ifa) && ifa->ifa_rtrequest)
 			ifa->ifa_rtrequest(RTM_DELETE, rt, info);
 
 		/*
 		 * One more rtentry floating around that is not
 		 * linked to the routing table. rttrash will be decremented
 		 * when RTFREE(rt) is eventually called.
 		 */
 		V_rttrash++;
 
 		/*
 		 * If the caller wants it, then it can have it,
 		 * but it's up to it to free the rtentry as we won't be
 		 * doing it.
 		 */
 		if (ret_nrt) {
 			*ret_nrt = rt;
 			RT_UNLOCK(rt);
 		} else
 			RTFREE_LOCKED(rt);
 		break;
 	case RTM_RESOLVE:
 		/*
 		 * RTM_RESOLVE was only used for route cloning;
 		 * it is kept here for compatibility.
 		 */
 		break;
 	case RTM_ADD:
 		if ((flags & RTF_GATEWAY) && !gateway)
 			senderr(EINVAL);
 		if (dst && gateway && (dst->sa_family != gateway->sa_family) && 
 		    (gateway->sa_family != AF_UNSPEC) && (gateway->sa_family != AF_LINK))
 			senderr(EINVAL);
 
 		if (info->rti_ifa == NULL) {
 			error = rt_getifa_fib(info, fibnum);
 			if (error)
 				senderr(error);
 		} else
 			ifa_ref(info->rti_ifa);
 		ifa = info->rti_ifa;
 		rt = uma_zalloc(V_rtzone, M_NOWAIT);
 		if (rt == NULL) {
 			ifa_free(ifa);
 			senderr(ENOBUFS);
 		}
 		rt->rt_flags = RTF_UP | flags;
 		rt->rt_fibnum = fibnum;
 		/*
 		 * Add the gateway. Possibly re-malloc-ing the storage for it.
 		 */
 		RT_LOCK(rt);
 		if ((error = rt_setgate(rt, dst, gateway)) != 0) {
 			ifa_free(ifa);
 			uma_zfree(V_rtzone, rt);
 			senderr(error);
 		}
 
 		/*
 		 * point to the (possibly newly malloc'd) dest address.
 		 */
 		ndst = (struct sockaddr *)rt_key(rt);
 
 		/*
 		 * make sure it contains the value we want (masked if needed).
 		 */
 		if (netmask) {
 			rt_maskedcopy(dst, ndst, netmask);
 		} else
 			bcopy(dst, ndst, dst->sa_len);
 
 		/*
 		 * We use the ifa reference returned by rt_getifa_fib().
 		 * This moved from below so that rnh->rnh_addaddr() can
 		 * examine the ifa and ifa->ifa_ifp if it so desires.
 		 */
 		rt->rt_ifa = ifa;
 		rt->rt_ifp = ifa->ifa_ifp;
 		rt->rt_weight = 1;
 
 		rt_setmetrics(info, rt);
 
 #ifdef RADIX_MPATH
 		/* do not permit exactly the same dst/mask/gw pair */
 		if (rn_mpath_capable(rnh) &&
 			rt_mpath_conflict(rnh, rt, netmask)) {
 			ifa_free(rt->rt_ifa);
 			Free(rt_key(rt));
 			uma_zfree(V_rtzone, rt);
 			senderr(EEXIST);
 		}
 #endif
 
 #ifdef FLOWTABLE
 		rt0 = NULL;
 		/* "flow-table" only supports IPv6 and IPv4 at the moment. */
 		switch (dst->sa_family) {
 #ifdef INET6
 		case AF_INET6:
 #endif
 #ifdef INET
 		case AF_INET:
 #endif
 #if defined(INET6) || defined(INET)
 			rn = rnh->rnh_matchaddr(dst, rnh);
 			if (rn && ((rn->rn_flags & RNF_ROOT) == 0)) {
 				struct sockaddr *mask;
 				u_char *m, *n;
 				int len;
 				
 				/*
 				 * compare mask to see if the new route is
 				 * more specific than the existing one
 				 */
 				rt0 = RNTORT(rn);
 				RT_LOCK(rt0);
 				RT_ADDREF(rt0);
 				RT_UNLOCK(rt0);
 				/*
 				 * A host route is already present, so 
 				 * leave the flow-table entries as is.
 				 */
 				if (rt0->rt_flags & RTF_HOST) {
 					RTFREE(rt0);
 					rt0 = NULL;
 				} else if (!(flags & RTF_HOST) && netmask) {
 					mask = rt_mask(rt0);
 					len = mask->sa_len;
 					m = (u_char *)mask;
 					n = (u_char *)netmask;
 					while (len-- > 0) {
 						if (*n != *m)
 							break;
 						n++;
 						m++;
 					}
 					if (len == 0 || (*n < *m)) {
 						RTFREE(rt0);
 						rt0 = NULL;
 					}
 				}
 			}
 #endif/* INET6 || INET */
 		}
 #endif /* FLOWTABLE */
 
 		/* XXX mtu manipulation will be done in rnh_addaddr -- itojun */
 		rn = rnh->rnh_addaddr(ndst, netmask, rnh, rt->rt_nodes);
 		/*
 		 * If it still failed to go into the tree,
 		 * then un-make it (this should be a function)
 		 */
 		if (rn == NULL) {
 			ifa_free(rt->rt_ifa);
 			Free(rt_key(rt));
 			uma_zfree(V_rtzone, rt);
 #ifdef FLOWTABLE
 			if (rt0 != NULL)
 				RTFREE(rt0);
 #endif
 			senderr(EEXIST);
 		} 
 #ifdef FLOWTABLE
 		else if (rt0 != NULL) {
 			flowtable_route_flush(dst->sa_family, rt0);
 			RTFREE(rt0);
 		}
 #endif
 
 		/*
 		 * If this protocol has something to add to this then
 		 * allow it to do that as well.
 		 */
 		if (ifa->ifa_rtrequest)
 			ifa->ifa_rtrequest(req, rt, info);
 
 		/*
 		 * actually return a resultant rtentry and
 		 * give the caller a single reference.
 		 */
 		if (ret_nrt) {
 			*ret_nrt = rt;
 			RT_ADDREF(rt);
 		}
 		RT_UNLOCK(rt);
 		break;
 	case RTM_CHANGE:
 		error = rtrequest1_fib_change(rnh, info, ret_nrt, fibnum);
 		break;
 	default:
 		error = EOPNOTSUPP;
 	}
 bad:
 	if (needlock)
 		RADIX_NODE_HEAD_UNLOCK(rnh);
 	return (error);
 #undef senderr
 }
 
 #undef dst
 #undef gateway
 #undef netmask
 #undef ifaaddr
 #undef ifpaddr
 #undef flags
 
 static int
 rtrequest1_fib_change(struct radix_node_head *rnh, struct rt_addrinfo *info,
     struct rtentry **ret_nrt, u_int fibnum)
 {
 	struct rtentry *rt = NULL;
 	int error = 0;
 	int free_ifa = 0;
 	int family, mtu;
 	struct if_mtuinfo ifmtu;
 
 	rt = (struct rtentry *)rnh->rnh_lookup(info->rti_info[RTAX_DST],
 	    info->rti_info[RTAX_NETMASK], rnh);
 
 	if (rt == NULL)
 		return (ESRCH);
 
 #ifdef RADIX_MPATH
 	/*
 	 * If we got multipath routes,
 	 * we require users to specify a matching RTAX_GATEWAY.
 	 */
 	if (rn_mpath_capable(rnh)) {
 		rt = rt_mpath_matchgate(rt, info->rti_info[RTAX_GATEWAY]);
 		if (rt == NULL)
 			return (ESRCH);
 	}
 #endif
 
 	RT_LOCK(rt);
 
 	rt_setmetrics(info, rt);
 
 	/*
 	 * New gateway could require new ifaddr, ifp;
 	 * flags may also be different; ifp may be specified
 	 * by ll sockaddr when protocol address is ambiguous
 	 */
 	if (((rt->rt_flags & RTF_GATEWAY) &&
 	    info->rti_info[RTAX_GATEWAY] != NULL) ||
 	    info->rti_info[RTAX_IFP] != NULL ||
 	    (info->rti_info[RTAX_IFA] != NULL &&
 	     !sa_equal(info->rti_info[RTAX_IFA], rt->rt_ifa->ifa_addr))) {
 
 		error = rt_getifa_fib(info, fibnum);
 		if (info->rti_ifa != NULL)
 			free_ifa = 1;
 
 		if (error != 0)
 			goto bad;
 	}
 
 	/* Check if outgoing interface has changed */
 	if (info->rti_ifa != NULL && info->rti_ifa != rt->rt_ifa &&
 	    rt->rt_ifa != NULL && rt->rt_ifa->ifa_rtrequest != NULL) {
 		rt->rt_ifa->ifa_rtrequest(RTM_DELETE, rt, info);
 		ifa_free(rt->rt_ifa);
 	}
 	/* Update gateway address */
 	if (info->rti_info[RTAX_GATEWAY] != NULL) {
 		error = rt_setgate(rt, rt_key(rt), info->rti_info[RTAX_GATEWAY]);
 		if (error != 0)
 			goto bad;
 
 		rt->rt_flags &= ~RTF_GATEWAY;
 		rt->rt_flags |= (RTF_GATEWAY & info->rti_flags);
 	}
 
 	if (info->rti_ifa != NULL && info->rti_ifa != rt->rt_ifa) {
 		ifa_ref(info->rti_ifa);
 		rt->rt_ifa = info->rti_ifa;
 		rt->rt_ifp = info->rti_ifp;
 	}
 	/* Allow some flags to be toggled on change. */
 	rt->rt_flags &= ~RTF_FMASK;
 	rt->rt_flags |= info->rti_flags & RTF_FMASK;
 
 	if (rt->rt_ifa && rt->rt_ifa->ifa_rtrequest != NULL)
 	       rt->rt_ifa->ifa_rtrequest(RTM_ADD, rt, info);
 
 	/* Alter route MTU if necessary */
 	if (rt->rt_ifp != NULL) {
 		family = info->rti_info[RTAX_DST]->sa_family;
 		mtu = if_getmtu_family(rt->rt_ifp, family);
 		/* Set default MTU */
 		if (rt->rt_mtu == 0)
 			rt->rt_mtu = mtu;
 		if (rt->rt_mtu != mtu) {
 			/* Check if we really need to update */
 			ifmtu.ifp = rt->rt_ifp;
 			ifmtu.mtu = mtu;
 			if_updatemtu_cb(rt->rt_nodes, &ifmtu);
 		}
 	}
 
 	if (ret_nrt) {
 		*ret_nrt = rt;
 		RT_ADDREF(rt);
 	}
 bad:
 	RT_UNLOCK(rt);
 	if (free_ifa != 0)
 		ifa_free(info->rti_ifa);
 	return (error);
 }
 
 static void
 rt_setmetrics(const struct rt_addrinfo *info, struct rtentry *rt)
 {
 
 	if (info->rti_mflags & RTV_MTU) {
 		if (info->rti_rmx->rmx_mtu != 0) {
 
 			/*
 			 * MTU was explicitly provided by user.
 			 * Keep it.
 			 */
 			rt->rt_flags |= RTF_FIXEDMTU;
 		} else {
 
 			/*
 			 * User explicitly sets MTU to 0.
 			 * Assume rollback to default.
 			 */
 			rt->rt_flags &= ~RTF_FIXEDMTU;
 		}
 		rt->rt_mtu = info->rti_rmx->rmx_mtu;
 	}
 	if (info->rti_mflags & RTV_WEIGHT)
 		rt->rt_weight = info->rti_rmx->rmx_weight;
 	/* Userland -> kernel timebase conversion. */
 	if (info->rti_mflags & RTV_EXPIRE)
 		rt->rt_expire = info->rti_rmx->rmx_expire ?
 		    info->rti_rmx->rmx_expire - time_second + time_uptime : 0;
 }
 
 int
 rt_setgate(struct rtentry *rt, struct sockaddr *dst, struct sockaddr *gate)
 {
 	/* XXX dst may be overwritten, can we move this to below */
 	int dlen = SA_SIZE(dst), glen = SA_SIZE(gate);
 #ifdef INVARIANTS
 	struct radix_node_head *rnh;
 
 	rnh = rt_tables_get_rnh(rt->rt_fibnum, dst->sa_family);
 #endif
 
 	RT_LOCK_ASSERT(rt);
 	RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
 	
 	/*
 	 * Prepare to store the gateway in rt->rt_gateway.
 	 * Both dst and gateway are stored one after the other in the same
 	 * malloc'd chunk. If we have room, we can reuse the old buffer,
 	 * rt_gateway already points to the right place.
 	 * Otherwise, malloc a new block and update the 'dst' address.
 	 */
 	if (rt->rt_gateway == NULL || glen > SA_SIZE(rt->rt_gateway)) {
 		caddr_t new;
 
 		R_Malloc(new, caddr_t, dlen + glen);
 		if (new == NULL)
 			return (ENOBUFS);
 		/*
 		 * XXX note, we copy from *dst and not *rt_key(rt) because
 		 * rt_setgate() can be called to initialize a newly
 		 * allocated route entry, in which case rt_key(rt) == NULL
 		 * (and also rt->rt_gateway == NULL).
 		 * Free()/free() handle a NULL argument just fine.
 		 */
 		bcopy(dst, new, dlen);
 		Free(rt_key(rt));	/* free old block, if any */
 		rt_key(rt) = (struct sockaddr *)new;
 		rt->rt_gateway = (struct sockaddr *)(new + dlen);
 	}
 
 	/*
 	 * Copy the new gateway value into the memory chunk.
 	 */
 	bcopy(gate, rt->rt_gateway, glen);
 
 	return (0);
 }
 
 void
 rt_maskedcopy(struct sockaddr *src, struct sockaddr *dst, struct sockaddr *netmask)
 {
 	u_char *cp1 = (u_char *)src;
 	u_char *cp2 = (u_char *)dst;
 	u_char *cp3 = (u_char *)netmask;
 	u_char *cplim = cp2 + *cp3;
 	u_char *cplim2 = cp2 + *cp1;
 
 	*cp2++ = *cp1++; *cp2++ = *cp1++; /* copies sa_len & sa_family */
 	cp3 += 2;
 	if (cplim > cplim2)
 		cplim = cplim2;
 	while (cp2 < cplim)
 		*cp2++ = *cp1++ & *cp3++;
 	if (cp2 < cplim2)
 		bzero((caddr_t)cp2, (unsigned)(cplim2 - cp2));
 }
 
 /*
  * Set up a routing table entry, normally
  * for an interface.
  */
 #define _SOCKADDR_TMPSIZE 128 /* Not too big.. kernel stack size is limited */
 static inline int
 rtinit1(struct ifaddr *ifa, int cmd, int flags, int fibnum)
 {
 	struct sockaddr *dst;
 	struct sockaddr *netmask;
 	struct rtentry *rt = NULL;
 	struct rt_addrinfo info;
 	int error = 0;
 	int startfib, endfib;
 	char tempbuf[_SOCKADDR_TMPSIZE];
 	int didwork = 0;
 	int a_failure = 0;
 	static struct sockaddr_dl null_sdl = {sizeof(null_sdl), AF_LINK};
 	struct radix_node_head *rnh;
 
 	if (flags & RTF_HOST) {
 		dst = ifa->ifa_dstaddr;
 		netmask = NULL;
 	} else {
 		dst = ifa->ifa_addr;
 		netmask = ifa->ifa_netmask;
 	}
 	if (dst->sa_len == 0)
 		return (EINVAL);
 	switch (dst->sa_family) {
 	case AF_INET6:
 	case AF_INET:
 		/* We support multiple FIBs. */
 		break;
 	default:
 		fibnum = RT_DEFAULT_FIB;
 		break;
 	}
 	if (fibnum == RT_ALL_FIBS) {
 		if (V_rt_add_addr_allfibs == 0 && cmd == (int)RTM_ADD)
 			startfib = endfib = ifa->ifa_ifp->if_fib;
 		else {
 			startfib = 0;
 			endfib = rt_numfibs - 1;
 		}
 	} else {
 		KASSERT((fibnum < rt_numfibs), ("rtinit1: bad fibnum"));
 		startfib = fibnum;
 		endfib = fibnum;
 	}
 
 	/*
 	 * If it's a delete, check that if it exists,
 	 * it's on the correct interface or we might scrub
 	 * a route to another ifa which would
 	 * be confusing at best and possibly worse.
 	 */
 	if (cmd == RTM_DELETE) {
 		/*
 		 * It's a delete, so it should already exist..
 		 * If it's a net, mask off the host bits
 		 * (Assuming we have a mask)
 		 * XXX this is kinda inet specific..
 		 */
 		if (netmask != NULL) {
 			rt_maskedcopy(dst, (struct sockaddr *)tempbuf, netmask);
 			dst = (struct sockaddr *)tempbuf;
 		}
 	}
 	/*
 	 * Now go through all the requested tables (fibs) and do the
 	 * requested action. Realistically, this will either be fib 0
 	 * for protocols that don't do multiple tables or all the
 	 * tables for those that do.
 	 */
 	for (fibnum = startfib; fibnum <= endfib; fibnum++) {
 		if (cmd == RTM_DELETE) {
 			struct radix_node *rn;
 			/*
 			 * Look up an rtentry that is in the routing tree and
 			 * contains the correct info.
 			 */
 			rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
 			if (rnh == NULL)
 				/* this table doesn't exist but others might */
 				continue;
 			RADIX_NODE_HEAD_RLOCK(rnh);
 			rn = rnh->rnh_lookup(dst, netmask, rnh);
 #ifdef RADIX_MPATH
 			if (rn_mpath_capable(rnh)) {
 
 				if (rn == NULL) 
 					error = ESRCH;
 				else {
 					rt = RNTORT(rn);
 					/*
 					 * for interface route the
 					 * rt->rt_gateway is sockaddr_intf
 					 * for cloning ARP entries, so
 					 * rt_mpath_matchgate must use the
 					 * interface address
 					 */
 					rt = rt_mpath_matchgate(rt,
 					    ifa->ifa_addr);
 					if (rt == NULL) 
 						error = ESRCH;
 				}
 			}
 #endif
 			error = (rn == NULL ||
 			    (rn->rn_flags & RNF_ROOT) ||
 			    RNTORT(rn)->rt_ifa != ifa);
 			RADIX_NODE_HEAD_RUNLOCK(rnh);
 			if (error) {
 				/* this is only an error if bad on ALL tables */
 				continue;
 			}
 		}
 		/*
 		 * Do the actual request
 		 */
 		bzero((caddr_t)&info, sizeof(info));
 		info.rti_ifa = ifa;
 		info.rti_flags = flags |
 		    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
 		info.rti_info[RTAX_DST] = dst;
 		/* 
 		 * doing this for compatibility reasons
 		 */
 		if (cmd == RTM_ADD)
 			info.rti_info[RTAX_GATEWAY] =
 			    (struct sockaddr *)&null_sdl;
 		else
 			info.rti_info[RTAX_GATEWAY] = ifa->ifa_addr;
 		info.rti_info[RTAX_NETMASK] = netmask;
 		error = rtrequest1_fib(cmd, &info, &rt, fibnum);
 
 		if ((error == EEXIST) && (cmd == RTM_ADD)) {
 			/*
 			 * Interface route addition failed.
 			 * Atomically delete current prefix generating
 			 * RTM_DELETE message, and retry adding
 			 * interface prefix.
 			 */
 			rnh = rt_tables_get_rnh(fibnum, dst->sa_family);
 			RADIX_NODE_HEAD_LOCK(rnh);
 
 			/* Delete old prefix */
 			info.rti_ifa = NULL;
 			info.rti_flags = RTF_RNH_LOCKED;
 
 			error = rtrequest1_fib(RTM_DELETE, &info, NULL, fibnum);
 			if (error == 0) {
 				info.rti_ifa = ifa;
 				info.rti_flags = flags | RTF_RNH_LOCKED |
 				    (ifa->ifa_flags & ~IFA_RTSELF) | RTF_PINNED;
 				error = rtrequest1_fib(cmd, &info, &rt, fibnum);
 			}
 
 			RADIX_NODE_HEAD_UNLOCK(rnh);
 		}
 
 
 		if (error == 0 && rt != NULL) {
 			/*
 			 * notify any listening routing agents of the change
 			 */
 			RT_LOCK(rt);
 #ifdef RADIX_MPATH
 			/*
 			 * in case address alias finds the first address
 			 * e.g. ifconfig bge0 192.0.2.246/24
 			 * e.g. ifconfig bge0 192.0.2.247/24
 			 * the address set in the route is 192.0.2.246
 			 * so we need to replace it with 192.0.2.247
 			 */
 			if (memcmp(rt->rt_ifa->ifa_addr,
 			    ifa->ifa_addr, ifa->ifa_addr->sa_len)) {
 				ifa_free(rt->rt_ifa);
 				ifa_ref(ifa);
 				rt->rt_ifp = ifa->ifa_ifp;
 				rt->rt_ifa = ifa;
 			}
 #endif
 			/* 
 			 * doing this for compatibility reasons
 			 */
 			if (cmd == RTM_ADD) {
 			    ((struct sockaddr_dl *)rt->rt_gateway)->sdl_type  =
 				rt->rt_ifp->if_type;
 			    ((struct sockaddr_dl *)rt->rt_gateway)->sdl_index =
 				rt->rt_ifp->if_index;
 			}
 			RT_ADDREF(rt);
 			RT_UNLOCK(rt);
 			rt_newaddrmsg_fib(cmd, ifa, error, rt, fibnum);
 			RT_LOCK(rt);
 			RT_REMREF(rt);
 			if (cmd == RTM_DELETE) {
 				/*
 				 * If we are deleting, and we found an entry,
 				 * then it's been removed from the tree..
 				 * now throw it away.
 				 */
 				RTFREE_LOCKED(rt);
 			} else {
 				if (cmd == RTM_ADD) {
 					/*
 					 * We just wanted to add it..
 					 * we don't actually need a reference.
 					 */
 					RT_REMREF(rt);
 				}
 				RT_UNLOCK(rt);
 			}
 			didwork = 1;
 		}
 		if (error)
 			a_failure = error;
 	}
 	if (cmd == RTM_DELETE) {
 		if (didwork) {
 			error = 0;
 		} else {
 			/* we only give an error if it wasn't in any table */
 			error = ((flags & RTF_HOST) ?
 			    EHOSTUNREACH : ENETUNREACH);
 		}
 	} else {
 		if (a_failure) {
 			/* return an error if any of them failed */
 			error = a_failure;
 		}
 	}
 	return (error);
 }
 
 /*
  * Set up a routing table entry, normally
  * for an interface.
  */
 int
 rtinit(struct ifaddr *ifa, int cmd, int flags)
 {
 	struct sockaddr *dst;
 	int fib = RT_DEFAULT_FIB;
 
 	if (flags & RTF_HOST) {
 		dst = ifa->ifa_dstaddr;
 	} else {
 		dst = ifa->ifa_addr;
 	}
 
 	switch (dst->sa_family) {
 	case AF_INET6:
 	case AF_INET:
 		/* We do support multiple FIBs. */
 		fib = RT_ALL_FIBS;
 		break;
 	}
 	return (rtinit1(ifa, cmd, flags, fib));
 }
 
 /*
  * Announce interface address arrival/withdraw
  * Returns 0 on success.
  */
 int
 rt_addrmsg(int cmd, struct ifaddr *ifa, int fibnum)
 {
 
 	KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE,
 	    ("unexpected cmd %d", cmd));
 	
 	KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs),
 	    ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs));
 
 #if defined(INET) || defined(INET6)
 #ifdef SCTP
 	/*
 	 * notify the SCTP stack
 	 * this will only get called when an address is added/deleted
 	 * XXX pass the ifaddr struct instead of ifa->ifa_addr...
 	 */
 	sctp_addr_change(ifa, cmd);
 #endif /* SCTP */
 #endif
 	return (rtsock_addrmsg(cmd, ifa, fibnum));
 }
 
 /*
  * Announce route addition/removal.
  * Users of this function MUST validate input data BEFORE calling.
  * However we have to be able to handle invalid data:
  * if some userland app sends us "invalid" route message (invalid mask,
  * no dst, wrong address families, etc...) we need to pass it back
  * to app (and any other rtsock consumers) with rtm_errno field set to
  * non-zero value.
  * Returns 0 on success.
  */
 int
 rt_routemsg(int cmd, struct ifnet *ifp, int error, struct rtentry *rt,
     int fibnum)
 {
 
 	KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE,
 	    ("unexpected cmd %d", cmd));
 	
 	KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs),
 	    ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs));
 
 	KASSERT(rt_key(rt) != NULL, ("%s: rt_key must be supplied", __func__));
 
 	return (rtsock_routemsg(cmd, ifp, error, rt, fibnum));
 }
 
 void
 rt_newaddrmsg(int cmd, struct ifaddr *ifa, int error, struct rtentry *rt)
 {
 
 	rt_newaddrmsg_fib(cmd, ifa, error, rt, RT_ALL_FIBS);
 }
 
 /*
  * This is called to generate messages from the routing socket
  * indicating a network interface has had addresses associated with it.
  */
 void
 rt_newaddrmsg_fib(int cmd, struct ifaddr *ifa, int error, struct rtentry *rt,
     int fibnum)
 {
 
 	KASSERT(cmd == RTM_ADD || cmd == RTM_DELETE,
 	    ("unexpected cmd %d", cmd));
 	KASSERT(fibnum == RT_ALL_FIBS || (fibnum >= 0 && fibnum < rt_numfibs),
 	    ("%s: fib out of range 0 <=%d<%d", __func__, fibnum, rt_numfibs));
 
 	if (cmd == RTM_ADD) {
 		rt_addrmsg(cmd, ifa, fibnum);
 		if (rt != NULL)
 			rt_routemsg(cmd, ifa->ifa_ifp, error, rt, fibnum);
 	} else {
 		if (rt != NULL)
 			rt_routemsg(cmd, ifa->ifa_ifp, error, rt, fibnum);
 		rt_addrmsg(cmd, ifa, fibnum);
 	}
 }
 
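Note on rt_maskedcopy() in the route.c context above: the function copies
sa_len and sa_family unmasked, ANDs the remaining address bytes with the
netmask, and zeroes any tail the mask does not cover. The following is a
minimal userland sketch of that behaviour, not the kernel code. It assumes
plain byte buffers laid out like a BSD sockaddr (byte 0 is the length,
byte 1 the family), and masked_copy() is a hypothetical name used only for
illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userland sketch of the masking done by rt_maskedcopy().
 * Assumption: src, dst and mask are byte buffers shaped like a BSD
 * sockaddr, and dst has room for src[0] bytes (at least 2).
 * masked_copy() is a hypothetical helper, not a kernel function.
 */
static void
masked_copy(const unsigned char *src, unsigned char *dst,
    const unsigned char *mask)
{
	size_t srclen = src[0];
	size_t masklen = mask[0];
	size_t lim = masklen < srclen ? masklen : srclen;
	size_t i;

	/* Length and family are copied as-is. */
	dst[0] = src[0];
	dst[1] = src[1];
	/* AND the bytes covered by both the source and the mask. */
	for (i = 2; i < lim; i++)
		dst[i] = src[i] & mask[i];
	/* Zero whatever the mask did not reach. */
	if (i < srclen)
		memset(dst + i, 0, srclen - i);
}
```

With an 8-byte source and a 7-byte mask, the last source byte is cleared
rather than copied, which is why a short netmask still yields a fully
initialised destination.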
Index: user/ngie/more-tests/sys/netinet/ip_reass.c
===================================================================
--- user/ngie/more-tests/sys/netinet/ip_reass.c	(revision 281584)
+++ user/ngie/more-tests/sys/netinet/ip_reass.c	(revision 281585)
@@ -1,657 +1,658 @@
 /*-
  * Copyright (c) 2015 Gleb Smirnoff <glebius@FreeBSD.org>
  * Copyright (c) 2015 Adrian Chadd <adrian@FreeBSD.org>
  * Copyright (c) 1982, 1986, 1988, 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)ip_input.c	8.2 (Berkeley) 1/4/94
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_rss.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/eventhandler.h>
 #include <sys/hash.h>
 #include <sys/mbuf.h>
 #include <sys/malloc.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <sys/sysctl.h>
 
 #include <net/rss_config.h>
+#include <net/netisr.h>
 #include <net/vnet.h>
 
 #include <netinet/in.h>
 #include <netinet/ip.h>
 #include <netinet/ip_var.h>
 #include <netinet/in_rss.h>
 #ifdef MAC
 #include <security/mac/mac_framework.h>
 #endif
 
 SYSCTL_DECL(_net_inet_ip);
 
 /*
  * Reassembly headers are stored in hash buckets.
  */
 #define	IPREASS_NHASH_LOG2	6
 #define	IPREASS_NHASH		(1 << IPREASS_NHASH_LOG2)
 #define	IPREASS_HMASK		(IPREASS_NHASH - 1)
 
 struct ipqbucket {
 	TAILQ_HEAD(ipqhead, ipq) head;
 	struct mtx		 lock;
 };
 
 static VNET_DEFINE(struct ipqbucket, ipq[IPREASS_NHASH]);
 #define	V_ipq		VNET(ipq)
 static VNET_DEFINE(uint32_t, ipq_hashseed);
 #define V_ipq_hashseed   VNET(ipq_hashseed)
 
 #define	IPQ_LOCK(i)	mtx_lock(&V_ipq[i].lock)
 #define	IPQ_TRYLOCK(i)	mtx_trylock(&V_ipq[i].lock)
 #define	IPQ_UNLOCK(i)	mtx_unlock(&V_ipq[i].lock)
 #define	IPQ_LOCK_ASSERT(i)	mtx_assert(&V_ipq[i].lock, MA_OWNED)
 
 void		ipreass_init(void);
 void		ipreass_drain(void);
 void		ipreass_slowtimo(void);
 #ifdef VIMAGE
 void		ipreass_destroy(void);
 #endif
 static int	sysctl_maxfragpackets(SYSCTL_HANDLER_ARGS);
 static void	ipreass_zone_change(void *);
 static void	ipreass_drain_tomax(void);
 static void	ipq_free(struct ipqhead *, struct ipq *);
 static struct ipq * ipq_reuse(int);
 
 static inline void
 ipq_timeout(struct ipqhead *head, struct ipq *fp)
 {
 
 	IPSTAT_ADD(ips_fragtimeout, fp->ipq_nfrags);
 	ipq_free(head, fp);
 }
 
 static inline void
 ipq_drop(struct ipqhead *head, struct ipq *fp)
 {
 
 	IPSTAT_ADD(ips_fragdropped, fp->ipq_nfrags);
 	ipq_free(head, fp);
 }
 
 static VNET_DEFINE(uma_zone_t, ipq_zone);
 #define	V_ipq_zone	VNET(ipq_zone)
 SYSCTL_PROC(_net_inet_ip, OID_AUTO, maxfragpackets, CTLFLAG_VNET |
     CTLTYPE_INT | CTLFLAG_RW, NULL, 0, sysctl_maxfragpackets, "I",
     "Maximum number of IPv4 fragment reassembly queue entries");
 SYSCTL_UMA_CUR(_net_inet_ip, OID_AUTO, fragpackets, CTLFLAG_VNET,
     &VNET_NAME(ipq_zone),
     "Current number of IPv4 fragment reassembly queue entries");
 
 static VNET_DEFINE(int, noreass);
 #define	V_noreass	VNET(noreass)
 
 static VNET_DEFINE(int, maxfragsperpacket);
 #define	V_maxfragsperpacket	VNET(maxfragsperpacket)
 SYSCTL_INT(_net_inet_ip, OID_AUTO, maxfragsperpacket, CTLFLAG_VNET | CTLFLAG_RW,
     &VNET_NAME(maxfragsperpacket), 0,
     "Maximum number of IPv4 fragments allowed per packet");
 
 /*
  * Take incoming datagram fragment and try to reassemble it into
  * whole datagram.  If the argument is the first fragment or one
  * in between the function will return NULL and store the mbuf
  * in the fragment chain.  If the argument is the last fragment
  * the packet will be reassembled and the pointer to the new
  * mbuf returned for further processing.  Only m_tags attached
  * to the first packet/fragment are preserved.
  * The IP header is *NOT* adjusted out of iplen.
  */
 #define	M_IP_FRAG	M_PROTO9
 struct mbuf *
 ip_reass(struct mbuf *m)
 {
 	struct ip *ip;
 	struct mbuf *p, *q, *nq, *t;
 	struct ipq *fp;
 	struct ipqhead *head;
 	int i, hlen, next;
 	u_int8_t ecn, ecn0;
 	uint32_t hash;
 #ifdef	RSS
 	uint32_t rss_hash, rss_type;
 #endif
 
 	/*
 	 * If reassembly is disabled or maxfragsperpacket is 0,
 	 * never accept fragments.
 	 */
 	if (V_noreass == 1 || V_maxfragsperpacket == 0) {
 		IPSTAT_INC(ips_fragments);
 		IPSTAT_INC(ips_fragdropped);
 		m_freem(m);
 		return (NULL);
 	}
 
 	ip = mtod(m, struct ip *);
 	hlen = ip->ip_hl << 2;
 
 	/*
 	 * Adjust ip_len to not reflect header,
 	 * convert offset of this to bytes.
 	 */
 	ip->ip_len = htons(ntohs(ip->ip_len) - hlen);
 	if (ip->ip_off & htons(IP_MF)) {
 		/*
 		 * Make sure that fragments have a data length
 		 * that's a non-zero multiple of 8 bytes.
 		 */
 		if (ip->ip_len == htons(0) || (ntohs(ip->ip_len) & 0x7) != 0) {
 			IPSTAT_INC(ips_toosmall); /* XXX */
 			IPSTAT_INC(ips_fragdropped);
 			m_freem(m);
 			return (NULL);
 		}
 		m->m_flags |= M_IP_FRAG;
 	} else
 		m->m_flags &= ~M_IP_FRAG;
 	ip->ip_off = htons(ntohs(ip->ip_off) << 3);
 
 	/*
 	 * Attempt reassembly; if it succeeds, proceed.
 	 * ip_reass() will return a different mbuf.
 	 */
 	IPSTAT_INC(ips_fragments);
 	m->m_pkthdr.PH_loc.ptr = ip;
 
 	/*
 	 * Presence of header sizes in mbufs
 	 * would confuse code below.
 	 */
 	m->m_data += hlen;
 	m->m_len -= hlen;
 
 	hash = ip->ip_src.s_addr ^ ip->ip_id;
 	hash = jenkins_hash32(&hash, 1, V_ipq_hashseed) & IPREASS_HMASK;
 	head = &V_ipq[hash].head;
 	IPQ_LOCK(hash);
 
 	/*
 	 * Look for queue of fragments
 	 * of this datagram.
 	 */
 	TAILQ_FOREACH(fp, head, ipq_list)
 		if (ip->ip_id == fp->ipq_id &&
 		    ip->ip_src.s_addr == fp->ipq_src.s_addr &&
 		    ip->ip_dst.s_addr == fp->ipq_dst.s_addr &&
 #ifdef MAC
 		    mac_ipq_match(m, fp) &&
 #endif
 		    ip->ip_p == fp->ipq_p)
 			break;
 	/*
 	 * If first fragment to arrive, create a reassembly queue.
 	 */
 	if (fp == NULL) {
 		fp = uma_zalloc(V_ipq_zone, M_NOWAIT);
 		if (fp == NULL)
 			fp = ipq_reuse(hash);
 #ifdef MAC
 		if (mac_ipq_init(fp, M_NOWAIT) != 0) {
 			uma_zfree(V_ipq_zone, fp);
 			fp = NULL;
 			goto dropfrag;
 		}
 		mac_ipq_create(m, fp);
 #endif
 		TAILQ_INSERT_HEAD(head, fp, ipq_list);
 		fp->ipq_nfrags = 1;
 		fp->ipq_ttl = IPFRAGTTL;
 		fp->ipq_p = ip->ip_p;
 		fp->ipq_id = ip->ip_id;
 		fp->ipq_src = ip->ip_src;
 		fp->ipq_dst = ip->ip_dst;
 		fp->ipq_frags = m;
 		m->m_nextpkt = NULL;
 		goto done;
 	} else {
 		fp->ipq_nfrags++;
 #ifdef MAC
 		mac_ipq_update(m, fp);
 #endif
 	}
 
 #define GETIP(m)	((struct ip*)((m)->m_pkthdr.PH_loc.ptr))
 
 	/*
 	 * Handle ECN by comparing this segment with the first one;
 	 * if CE is set, do not lose CE.
 	 * drop if CE and not-ECT are mixed for the same packet.
 	 */
 	ecn = ip->ip_tos & IPTOS_ECN_MASK;
 	ecn0 = GETIP(fp->ipq_frags)->ip_tos & IPTOS_ECN_MASK;
 	if (ecn == IPTOS_ECN_CE) {
 		if (ecn0 == IPTOS_ECN_NOTECT)
 			goto dropfrag;
 		if (ecn0 != IPTOS_ECN_CE)
 			GETIP(fp->ipq_frags)->ip_tos |= IPTOS_ECN_CE;
 	}
 	if (ecn == IPTOS_ECN_NOTECT && ecn0 != IPTOS_ECN_NOTECT)
 		goto dropfrag;
 
 	/*
 	 * Find a segment which begins after this one does.
 	 */
 	for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt)
 		if (ntohs(GETIP(q)->ip_off) > ntohs(ip->ip_off))
 			break;
 
 	/*
 	 * If there is a preceding segment, it may provide some of
 	 * our data already.  If so, drop the data from the incoming
 	 * segment.  If it provides all of our data, drop us, otherwise
 	 * stick new segment in the proper place.
 	 *
 	 * If some of the data is dropped from the preceding
 	 * segment, then its checksum is invalidated.
 	 */
 	if (p) {
 		i = ntohs(GETIP(p)->ip_off) + ntohs(GETIP(p)->ip_len) -
 		    ntohs(ip->ip_off);
 		if (i > 0) {
 			if (i >= ntohs(ip->ip_len))
 				goto dropfrag;
 			m_adj(m, i);
 			m->m_pkthdr.csum_flags = 0;
 			ip->ip_off = htons(ntohs(ip->ip_off) + i);
 			ip->ip_len = htons(ntohs(ip->ip_len) - i);
 		}
 		m->m_nextpkt = p->m_nextpkt;
 		p->m_nextpkt = m;
 	} else {
 		m->m_nextpkt = fp->ipq_frags;
 		fp->ipq_frags = m;
 	}
 
 	/*
 	 * While we overlap succeeding segments trim them or,
 	 * if they are completely covered, dequeue them.
 	 */
 	for (; q != NULL && ntohs(ip->ip_off) + ntohs(ip->ip_len) >
 	    ntohs(GETIP(q)->ip_off); q = nq) {
 		i = (ntohs(ip->ip_off) + ntohs(ip->ip_len)) -
 		    ntohs(GETIP(q)->ip_off);
 		if (i < ntohs(GETIP(q)->ip_len)) {
 			GETIP(q)->ip_len = htons(ntohs(GETIP(q)->ip_len) - i);
 			GETIP(q)->ip_off = htons(ntohs(GETIP(q)->ip_off) + i);
 			m_adj(q, i);
 			q->m_pkthdr.csum_flags = 0;
 			break;
 		}
 		nq = q->m_nextpkt;
 		m->m_nextpkt = nq;
 		IPSTAT_INC(ips_fragdropped);
 		fp->ipq_nfrags--;
 		m_freem(q);
 	}
 
 	/*
 	 * Check for complete reassembly and perform frag per packet
 	 * limiting.
 	 *
 	 * Frag limiting is performed here so that the nth frag has
 	 * a chance to complete the packet before we drop the packet.
 	 * As a result, n+1 frags are actually allowed per packet, but
 	 * only n will ever be stored. (n = maxfragsperpacket.)
 	 *
 	 */
 	next = 0;
 	for (p = NULL, q = fp->ipq_frags; q; p = q, q = q->m_nextpkt) {
 		if (ntohs(GETIP(q)->ip_off) != next) {
 			if (fp->ipq_nfrags > V_maxfragsperpacket)
 				ipq_drop(head, fp);
 			goto done;
 		}
 		next += ntohs(GETIP(q)->ip_len);
 	}
 	/* Make sure the last packet didn't have the IP_MF flag */
 	if (p->m_flags & M_IP_FRAG) {
 		if (fp->ipq_nfrags > V_maxfragsperpacket)
 			ipq_drop(head, fp);
 		goto done;
 	}
 
 	/*
 	 * Reassembly is complete.  Make sure the packet is a sane size.
 	 */
 	q = fp->ipq_frags;
 	ip = GETIP(q);
 	if (next + (ip->ip_hl << 2) > IP_MAXPACKET) {
 		IPSTAT_INC(ips_toolong);
 		ipq_drop(head, fp);
 		goto done;
 	}
 
 	/*
 	 * Concatenate fragments.
 	 */
 	m = q;
 	t = m->m_next;
 	m->m_next = NULL;
 	m_cat(m, t);
 	nq = q->m_nextpkt;
 	q->m_nextpkt = NULL;
 	for (q = nq; q != NULL; q = nq) {
 		nq = q->m_nextpkt;
 		q->m_nextpkt = NULL;
 		m->m_pkthdr.csum_flags &= q->m_pkthdr.csum_flags;
 		m->m_pkthdr.csum_data += q->m_pkthdr.csum_data;
 		m_cat(m, q);
 	}
 	/*
 	 * In order to do checksumming faster we do 'end-around carry' here
 	 * (and not in for{} loop), though it implies we are not going to
 	 * reassemble more than 64k fragments.
 	 */
 	while (m->m_pkthdr.csum_data & 0xffff0000)
 		m->m_pkthdr.csum_data = (m->m_pkthdr.csum_data & 0xffff) +
 		    (m->m_pkthdr.csum_data >> 16);
 #ifdef MAC
 	mac_ipq_reassemble(fp, m);
 	mac_ipq_destroy(fp);
 #endif
 
 	/*
 	 * Create header for new ip packet by modifying header of first
 	 * packet;  dequeue and discard fragment reassembly header.
 	 * Make header visible.
 	 */
 	ip->ip_len = htons((ip->ip_hl << 2) + next);
 	ip->ip_src = fp->ipq_src;
 	ip->ip_dst = fp->ipq_dst;
 	TAILQ_REMOVE(head, fp, ipq_list);
 	uma_zfree(V_ipq_zone, fp);
 	m->m_len += (ip->ip_hl << 2);
 	m->m_data -= (ip->ip_hl << 2);
 	/* some debugging cruft by sklower, below, will go away soon */
 	if (m->m_flags & M_PKTHDR)	/* XXX this should be done elsewhere */
 		m_fixhdr(m);
 	IPSTAT_INC(ips_reassembled);
 	IPQ_UNLOCK(hash);
 
 #ifdef	RSS
 	/*
 	 * Query the RSS layer for the flowid / flowtype for the
 	 * mbuf payload.
 	 *
 	 * For now, just assume we have to calculate a new one.
 	 * Later on we should check to see if the assigned flowid matches
 	 * what RSS wants for the given IP protocol and if so, just keep it.
 	 *
 	 * We then queue into the relevant netisr so it can be dispatched
 	 * to the correct CPU.
 	 *
 	 * Note - this may return 1, which means the flowid in the mbuf
 	 * is correct for the configured RSS hash types and can be used.
 	 */
 	if (rss_mbuf_software_hash_v4(m, 0, &rss_hash, &rss_type) == 0) {
 		m->m_pkthdr.flowid = rss_hash;
 		M_HASHTYPE_SET(m, rss_type);
 	}
 
 	/*
 	 * Queue/dispatch for reprocessing.
 	 *
 	 * Note: this is much slower than just handling the frame in the
 	 * current receive context.  It's likely worth investigating
 	 * why this is.
 	 */
 	netisr_dispatch(NETISR_IP_DIRECT, m);
 	return (NULL);
 #endif
 
 	/* Handle in-line */
 	return (m);
 
 dropfrag:
 	IPSTAT_INC(ips_fragdropped);
 	if (fp != NULL)
 		fp->ipq_nfrags--;
 	m_freem(m);
 done:
 	IPQ_UNLOCK(hash);
 	return (NULL);
 
 #undef GETIP
 }
 
 /*
  * Initialize IP reassembly structures.
  */
 void
 ipreass_init(void)
 {
 
 	for (int i = 0; i < IPREASS_NHASH; i++) {
 		TAILQ_INIT(&V_ipq[i].head);
 		mtx_init(&V_ipq[i].lock, "IP reassembly", NULL,
 		    MTX_DEF | MTX_DUPOK);
 	}
 	V_ipq_hashseed = arc4random();
 	V_maxfragsperpacket = 16;
 	V_ipq_zone = uma_zcreate("ipq", sizeof(struct ipq), NULL, NULL, NULL,
 	    NULL, UMA_ALIGN_PTR, 0);
 	uma_zone_set_max(V_ipq_zone, nmbclusters / 32);
 
 	if (IS_DEFAULT_VNET(curvnet))
 		EVENTHANDLER_REGISTER(nmbclusters_change, ipreass_zone_change,
 		    NULL, EVENTHANDLER_PRI_ANY);
 }
 
 /*
  * If a timer expires on a reassembly queue, discard it.
  */
 void
 ipreass_slowtimo(void)
 {
 	struct ipq *fp, *tmp;
 
 	for (int i = 0; i < IPREASS_NHASH; i++) {
 		IPQ_LOCK(i);
 		TAILQ_FOREACH_SAFE(fp, &V_ipq[i].head, ipq_list, tmp)
 			if (--fp->ipq_ttl == 0)
 				ipq_timeout(&V_ipq[i].head, fp);
 		IPQ_UNLOCK(i);
 	}
 }
 
 /*
  * Drain off all datagram fragments.
  */
 void
 ipreass_drain(void)
 {
 
 	for (int i = 0; i < IPREASS_NHASH; i++) {
 		IPQ_LOCK(i);
 		while (!TAILQ_EMPTY(&V_ipq[i].head))
 			ipq_drop(&V_ipq[i].head, TAILQ_FIRST(&V_ipq[i].head));
 		IPQ_UNLOCK(i);
 	}
 }
 
 #ifdef VIMAGE
 /*
  * Destroy IP reassembly structures.
  */
 void
 ipreass_destroy(void)
 {
 
 	ipreass_drain();
 	uma_zdestroy(V_ipq_zone);
 	for (int i = 0; i < IPREASS_NHASH; i++)
 		mtx_destroy(&V_ipq[i].lock);
 }
 #endif
 
 /*
  * After maxnipq has been updated, propagate the change to UMA.  The UMA zone
  * max has slightly different semantics than the sysctl, for historical
  * reasons.
  */
 static void
 ipreass_drain_tomax(void)
 {
 	int target;
 
 	/*
 	 * If we are over the maximum number of fragments,
 	 * drain off enough to get down to the new limit,
 	 * stripping off last elements on queues.  Every
 	 * run we strip the oldest element from each bucket.
 	 */
 	target = uma_zone_get_max(V_ipq_zone);
 	while (uma_zone_get_cur(V_ipq_zone) > target) {
 		struct ipq *fp;
 
 		for (int i = 0; i < IPREASS_NHASH; i++) {
 			IPQ_LOCK(i);
 			fp = TAILQ_LAST(&V_ipq[i].head, ipqhead);
 			if (fp != NULL)
 				ipq_timeout(&V_ipq[i].head, fp);
 			IPQ_UNLOCK(i);
 		}
 	}
 }
 
 static void
 ipreass_zone_change(void *tag)
 {
 
 	uma_zone_set_max(V_ipq_zone, nmbclusters / 32);
 	ipreass_drain_tomax();
 }
 
 /*
  * Change the limit on the UMA zone, or disable fragment allocation
  * entirely.  Since 0 and -1 are special values here, we need our own
  * handler instead of sysctl_handle_uma_zone_max().
  */
 static int
 sysctl_maxfragpackets(SYSCTL_HANDLER_ARGS)
 {
 	int error, max;
 
 	if (V_noreass == 0) {
 		max = uma_zone_get_max(V_ipq_zone);
 		if (max == 0)
 			max = -1;
 	} else 
 		max = 0;
 	error = sysctl_handle_int(oidp, &max, 0, req);
 	if (error || !req->newptr)
 		return (error);
 	if (max > 0) {
 		/*
 		 * XXXRW: Might be a good idea to sanity check the argument
 		 * and place an extreme upper bound.
 		 */
 		max = uma_zone_set_max(V_ipq_zone, max);
 		ipreass_drain_tomax();
 		V_noreass = 0;
 	} else if (max == 0) {
 		V_noreass = 1;
 		ipreass_drain();
 	} else if (max == -1) {
 		V_noreass = 0;
 		uma_zone_set_max(V_ipq_zone, 0);
 	} else
 		return (EINVAL);
 	return (0);
 }
 
 /*
  * Look for an old fragment queue header that can be reused.  Try to
  * reuse a header from the currently locked hash bucket.
  */
 static struct ipq *
 ipq_reuse(int start)
 {
 	struct ipq *fp;
 	int i;
 
 	IPQ_LOCK_ASSERT(start);
 
 	for (i = start;; i++) {
 		if (i == IPREASS_NHASH)
 			i = 0;
 		if (i != start && IPQ_TRYLOCK(i) == 0)
 			continue;
 		fp = TAILQ_LAST(&V_ipq[i].head, ipqhead);
 		if (fp) {
 			struct mbuf *m;
 
 			IPSTAT_ADD(ips_fragtimeout, fp->ipq_nfrags);
 			while (fp->ipq_frags) {
 				m = fp->ipq_frags;
 				fp->ipq_frags = m->m_nextpkt;
 				m_freem(m);
 			}
 			TAILQ_REMOVE(&V_ipq[i].head, fp, ipq_list);
 			if (i != start)
 				IPQ_UNLOCK(i);
 			IPQ_LOCK_ASSERT(start);
 			return (fp);
 		}
 		if (i != start)
 			IPQ_UNLOCK(i);
 	}
 }
 
 /*
  * Free a fragment reassembly header and all associated datagrams.
  */
 static void
 ipq_free(struct ipqhead *fhp, struct ipq *fp)
 {
 	struct mbuf *q;
 
 	while (fp->ipq_frags) {
 		q = fp->ipq_frags;
 		fp->ipq_frags = q->m_nextpkt;
 		m_freem(q);
 	}
 	TAILQ_REMOVE(fhp, fp, ipq_list);
 	uma_zfree(V_ipq_zone, fp);
 }
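Note on the checksum handling in ip_reass() above: the per-fragment
csum_data values are summed into a 32-bit field while the fragments are
concatenated, and the carries are folded back in once, after the loop.
Below is a minimal userland sketch of that end-around carry fold. It
assumes the accumulator is a plain uint32_t holding a sum of 16-bit
partial checksums; fold_csum_data() is a hypothetical name and is not
part of the kernel.

```c
#include <stdint.h>

/*
 * Fold a 32-bit sum of 16-bit partial checksums down to 16 bits by
 * repeatedly adding the high half into the low half (end-around carry).
 * Hypothetical helper for illustration only.
 */
static uint16_t
fold_csum_data(uint32_t sum)
{
	/* Loop because the addition itself can produce a new carry. */
	while (sum & 0xffff0000)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}
```

Deferring the fold is safe only while the 32-bit accumulator cannot
overflow, which is the 64k-fragment limit the kernel comment mentions.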
Index: user/ngie/more-tests/sys/netpfil/pf/pf.c
===================================================================
--- user/ngie/more-tests/sys/netpfil/pf/pf.c	(revision 281584)
+++ user/ngie/more-tests/sys/netpfil/pf/pf.c	(revision 281585)
@@ -1,6444 +1,6444 @@
 /*-
  * Copyright (c) 2001 Daniel Hartmeier
  * Copyright (c) 2002 - 2008 Henning Brauer
  * Copyright (c) 2012 Gleb Smirnoff <glebius@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  *
  *    - Redistributions of source code must retain the above copyright
  *      notice, this list of conditions and the following disclaimer.
  *    - Redistributions in binary form must reproduce the above
  *      copyright notice, this list of conditions and the following
  *      disclaimer in the documentation and/or other materials provided
  *      with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
  * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
  * COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
  * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
  * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
  * ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
  * POSSIBILITY OF SUCH DAMAGE.
  *
  * Effort sponsored in part by the Defense Advanced Research Projects
  * Agency (DARPA) and Air Force Research Laboratory, Air Force
  * Materiel Command, USAF, under agreement number F30602-01-2-0537.
  *
  *	$OpenBSD: pf.c,v 1.634 2009/02/27 12:37:45 henning Exp $
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_inet.h"
 #include "opt_inet6.h"
 #include "opt_bpf.h"
 #include "opt_pf.h"
 
 #include <sys/param.h>
 #include <sys/bus.h>
 #include <sys/endian.h>
 #include <sys/hash.h>
 #include <sys/interrupt.h>
 #include <sys/kernel.h>
 #include <sys/kthread.h>
 #include <sys/limits.h>
 #include <sys/mbuf.h>
 #include <sys/md5.h>
 #include <sys/random.h>
 #include <sys/refcount.h>
 #include <sys/socket.h>
 #include <sys/sysctl.h>
 #include <sys/taskqueue.h>
 #include <sys/ucred.h>
 
 #include <net/if.h>
 #include <net/if_var.h>
 #include <net/if_types.h>
 #include <net/route.h>
 #include <net/radix_mpath.h>
 #include <net/vnet.h>
 
 #include <net/pfvar.h>
 #include <net/if_pflog.h>
 #include <net/if_pfsync.h>
 
 #include <netinet/in_pcb.h>
 #include <netinet/in_var.h>
 #include <netinet/ip.h>
 #include <netinet/ip_fw.h>
 #include <netinet/ip_icmp.h>
 #include <netinet/icmp_var.h>
 #include <netinet/ip_var.h>
 #include <netinet/tcp.h>
 #include <netinet/tcp_fsm.h>
 #include <netinet/tcp_seq.h>
 #include <netinet/tcp_timer.h>
 #include <netinet/tcp_var.h>
 #include <netinet/udp.h>
 #include <netinet/udp_var.h>
 
 #include <netpfil/ipfw/ip_fw_private.h> /* XXX: only for DIR_IN/DIR_OUT */
 
 #ifdef INET6
 #include <netinet/ip6.h>
 #include <netinet/icmp6.h>
 #include <netinet6/nd6.h>
 #include <netinet6/ip6_var.h>
 #include <netinet6/in6_pcb.h>
 #endif /* INET6 */
 
 #include <machine/in_cksum.h>
 #include <security/mac/mac_framework.h>
 
 #define	DPFPRINTF(n, x)	if (V_pf_status.debug >= (n)) printf x
 
 /*
  * Global variables
  */
 
 /* state tables */
 VNET_DEFINE(struct pf_altqqueue,	 pf_altqs[2]);
 VNET_DEFINE(struct pf_palist,		 pf_pabuf);
 VNET_DEFINE(struct pf_altqqueue *,	 pf_altqs_active);
 VNET_DEFINE(struct pf_altqqueue *,	 pf_altqs_inactive);
 VNET_DEFINE(struct pf_kstatus,		 pf_status);
 
 VNET_DEFINE(u_int32_t,			 ticket_altqs_active);
 VNET_DEFINE(u_int32_t,			 ticket_altqs_inactive);
 VNET_DEFINE(int,			 altqs_inactive_open);
 VNET_DEFINE(u_int32_t,			 ticket_pabuf);
 
 VNET_DEFINE(MD5_CTX,			 pf_tcp_secret_ctx);
 #define	V_pf_tcp_secret_ctx		 VNET(pf_tcp_secret_ctx)
 VNET_DEFINE(u_char,			 pf_tcp_secret[16]);
 #define	V_pf_tcp_secret			 VNET(pf_tcp_secret)
 VNET_DEFINE(int,			 pf_tcp_secret_init);
 #define	V_pf_tcp_secret_init		 VNET(pf_tcp_secret_init)
 VNET_DEFINE(int,			 pf_tcp_iss_off);
 #define	V_pf_tcp_iss_off		 VNET(pf_tcp_iss_off)
 
 /*
  * Queue for pf_intr() sends.
  */
 static MALLOC_DEFINE(M_PFTEMP, "pf_temp", "pf(4) temporary allocations");
 struct pf_send_entry {
 	STAILQ_ENTRY(pf_send_entry)	pfse_next;
 	struct mbuf			*pfse_m;
 	enum {
 		PFSE_IP,
 		PFSE_IP6,
 		PFSE_ICMP,
 		PFSE_ICMP6,
 	}				pfse_type;
 	struct {
 		int		type;
 		int		code;
 		int		mtu;
 	} icmpopts;
 };
 
 STAILQ_HEAD(pf_send_head, pf_send_entry);
 static VNET_DEFINE(struct pf_send_head, pf_sendqueue);
 #define	V_pf_sendqueue	VNET(pf_sendqueue)
 
 static struct mtx pf_sendqueue_mtx;
 #define	PF_SENDQ_LOCK()		mtx_lock(&pf_sendqueue_mtx)
 #define	PF_SENDQ_UNLOCK()	mtx_unlock(&pf_sendqueue_mtx)
 
 /*
  * Queue for pf_overload_task() tasks.
  */
 struct pf_overload_entry {
 	SLIST_ENTRY(pf_overload_entry)	next;
 	struct pf_addr  		addr;
 	sa_family_t			af;
 	uint8_t				dir;
 	struct pf_rule  		*rule;
 };
 
 SLIST_HEAD(pf_overload_head, pf_overload_entry);
 static VNET_DEFINE(struct pf_overload_head, pf_overloadqueue);
 #define V_pf_overloadqueue	VNET(pf_overloadqueue)
 static VNET_DEFINE(struct task, pf_overloadtask);
 #define	V_pf_overloadtask	VNET(pf_overloadtask)
 
 static struct mtx pf_overloadqueue_mtx;
 #define	PF_OVERLOADQ_LOCK()	mtx_lock(&pf_overloadqueue_mtx)
 #define	PF_OVERLOADQ_UNLOCK()	mtx_unlock(&pf_overloadqueue_mtx)
 
 VNET_DEFINE(struct pf_rulequeue, pf_unlinked_rules);
 struct mtx pf_unlnkdrules_mtx;
 
 static VNET_DEFINE(uma_zone_t,	pf_sources_z);
 #define	V_pf_sources_z	VNET(pf_sources_z)
 uma_zone_t		pf_mtag_z;
 VNET_DEFINE(uma_zone_t,	 pf_state_z);
 VNET_DEFINE(uma_zone_t,	 pf_state_key_z);
 
 VNET_DEFINE(uint64_t, pf_stateid[MAXCPU]);
 #define	PFID_CPUBITS	8
 #define	PFID_CPUSHIFT	(sizeof(uint64_t) * NBBY - PFID_CPUBITS)
 #define	PFID_CPUMASK	((uint64_t)((1 << PFID_CPUBITS) - 1) <<	PFID_CPUSHIFT)
 #define	PFID_MAXID	(~PFID_CPUMASK)
 CTASSERT((1 << PFID_CPUBITS) >= MAXCPU);
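 /*
  * Illustrative sketch, not part of this change: the PFID_* macros above
  * reserve the top 8 bits of a 64-bit state ID for the CPU number and
  * leave the low 56 bits for a per-CPU counter.  The helper names below
  * (make_state_id, state_id_cpu, state_id_counter) are hypothetical and
  * exist only to show the bit layout in userland C.
  */

```c
#include <assert.h>
#include <stdint.h>

#define	CPU_BITS	8
#define	CPU_SHIFT	(64 - CPU_BITS)
#define	CPU_MASK	((uint64_t)((1u << CPU_BITS) - 1) << CPU_SHIFT)
#define	MAX_ID		(~CPU_MASK)

/* Combine a per-CPU counter value with the CPU number (hypothetical). */
static uint64_t
make_state_id(uint64_t counter, unsigned cpu)
{
	assert(cpu < (1u << CPU_BITS));
	assert(counter <= MAX_ID);
	return (counter | ((uint64_t)cpu << CPU_SHIFT));
}

/* Recover the CPU number from the top CPU_BITS bits. */
static unsigned
state_id_cpu(uint64_t id)
{
	return ((unsigned)(id >> CPU_SHIFT));
}

/* Recover the counter from the remaining low bits. */
static uint64_t
state_id_counter(uint64_t id)
{
	return (id & MAX_ID);
}
```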
 
 static void		 pf_src_tree_remove_state(struct pf_state *);
 static void		 pf_init_threshold(struct pf_threshold *, u_int32_t,
 			    u_int32_t);
 static void		 pf_add_threshold(struct pf_threshold *);
 static int		 pf_check_threshold(struct pf_threshold *);
 
 static void		 pf_change_ap(struct pf_addr *, u_int16_t *,
 			    u_int16_t *, u_int16_t *, struct pf_addr *,
 			    u_int16_t, u_int8_t, sa_family_t);
 static int		 pf_modulate_sack(struct mbuf *, int, struct pf_pdesc *,
 			    struct tcphdr *, struct pf_state_peer *);
 static void		 pf_change_icmp(struct pf_addr *, u_int16_t *,
 			    struct pf_addr *, struct pf_addr *, u_int16_t,
 			    u_int16_t *, u_int16_t *, u_int16_t *,
 			    u_int16_t *, u_int8_t, sa_family_t);
 static void		 pf_send_tcp(struct mbuf *,
 			    const struct pf_rule *, sa_family_t,
 			    const struct pf_addr *, const struct pf_addr *,
 			    u_int16_t, u_int16_t, u_int32_t, u_int32_t,
 			    u_int8_t, u_int16_t, u_int16_t, u_int8_t, int,
 			    u_int16_t, struct ifnet *);
 static void		 pf_send_icmp(struct mbuf *, u_int8_t, u_int8_t,
 			    sa_family_t, struct pf_rule *);
 static void		 pf_detach_state(struct pf_state *);
 static int		 pf_state_key_attach(struct pf_state_key *,
 			    struct pf_state_key *, struct pf_state *);
 static void		 pf_state_key_detach(struct pf_state *, int);
 static int		 pf_state_key_ctor(void *, int, void *, int);
 static u_int32_t	 pf_tcp_iss(struct pf_pdesc *);
 static int		 pf_test_rule(struct pf_rule **, struct pf_state **,
 			    int, struct pfi_kif *, struct mbuf *, int,
 			    struct pf_pdesc *, struct pf_rule **,
 			    struct pf_ruleset **, struct inpcb *);
 static int		 pf_create_state(struct pf_rule *, struct pf_rule *,
 			    struct pf_rule *, struct pf_pdesc *,
 			    struct pf_src_node *, struct pf_state_key *,
 			    struct pf_state_key *, struct mbuf *, int,
 			    u_int16_t, u_int16_t, int *, struct pfi_kif *,
 			    struct pf_state **, int, u_int16_t, u_int16_t,
 			    int);
 static int		 pf_test_fragment(struct pf_rule **, int,
 			    struct pfi_kif *, struct mbuf *, void *,
 			    struct pf_pdesc *, struct pf_rule **,
 			    struct pf_ruleset **);
 static int		 pf_tcp_track_full(struct pf_state_peer *,
 			    struct pf_state_peer *, struct pf_state **,
 			    struct pfi_kif *, struct mbuf *, int,
 			    struct pf_pdesc *, u_short *, int *);
 static int		 pf_tcp_track_sloppy(struct pf_state_peer *,
 			    struct pf_state_peer *, struct pf_state **,
 			    struct pf_pdesc *, u_short *);
 static int		 pf_test_state_tcp(struct pf_state **, int,
 			    struct pfi_kif *, struct mbuf *, int,
 			    void *, struct pf_pdesc *, u_short *);
 static int		 pf_test_state_udp(struct pf_state **, int,
 			    struct pfi_kif *, struct mbuf *, int,
 			    void *, struct pf_pdesc *);
 static int		 pf_test_state_icmp(struct pf_state **, int,
 			    struct pfi_kif *, struct mbuf *, int,
 			    void *, struct pf_pdesc *, u_short *);
 static int		 pf_test_state_other(struct pf_state **, int,
 			    struct pfi_kif *, struct mbuf *, struct pf_pdesc *);
 static u_int8_t		 pf_get_wscale(struct mbuf *, int, u_int16_t,
 			    sa_family_t);
 static u_int16_t	 pf_get_mss(struct mbuf *, int, u_int16_t,
 			    sa_family_t);
 static u_int16_t	 pf_calc_mss(struct pf_addr *, sa_family_t,
 				int, u_int16_t);
 static int		 pf_check_proto_cksum(struct mbuf *, int, int,
 			    u_int8_t, sa_family_t);
 static void		 pf_print_state_parts(struct pf_state *,
 			    struct pf_state_key *, struct pf_state_key *);
 static int		 pf_addr_wrap_neq(struct pf_addr_wrap *,
 			    struct pf_addr_wrap *);
 static struct pf_state	*pf_find_state(struct pfi_kif *,
 			    struct pf_state_key_cmp *, u_int);
 static int		 pf_src_connlimit(struct pf_state **);
 static void		 pf_overload_task(void *v, int pending);
 static int		 pf_insert_src_node(struct pf_src_node **,
 			    struct pf_rule *, struct pf_addr *, sa_family_t);
 static u_int		 pf_purge_expired_states(u_int, int);
 static void		 pf_purge_unlinked_rules(void);
 static int		 pf_mtag_uminit(void *, int, int);
 static void		 pf_mtag_free(struct m_tag *);
 #ifdef INET
 static void		 pf_route(struct mbuf **, struct pf_rule *, int,
 			    struct ifnet *, struct pf_state *,
 			    struct pf_pdesc *);
 #endif /* INET */
 #ifdef INET6
 static void		 pf_change_a6(struct pf_addr *, u_int16_t *,
 			    struct pf_addr *, u_int8_t);
 static void		 pf_route6(struct mbuf **, struct pf_rule *, int,
 			    struct ifnet *, struct pf_state *,
 			    struct pf_pdesc *);
 #endif /* INET6 */
 
 int in4_cksum(struct mbuf *m, u_int8_t nxt, int off, int len);
 
 VNET_DECLARE(int, pf_end_threads);
 
 VNET_DEFINE(struct pf_limit, pf_limits[PF_LIMIT_MAX]);
 
 #define	PACKET_LOOPED(pd)	((pd)->pf_mtag &&			\
 				 (pd)->pf_mtag->flags & PF_PACKET_LOOPED)
 
 #define	STATE_LOOKUP(i, k, d, s, pd)					\
 	do {								\
 		(s) = pf_find_state((i), (k), (d));			\
 		if ((s) == NULL)					\
 			return (PF_DROP);				\
 		if (PACKET_LOOPED(pd))					\
 			return (PF_PASS);				\
 		if ((d) == PF_OUT &&					\
 		    (((s)->rule.ptr->rt == PF_ROUTETO &&		\
 		    (s)->rule.ptr->direction == PF_OUT) ||		\
 		    ((s)->rule.ptr->rt == PF_REPLYTO &&			\
 		    (s)->rule.ptr->direction == PF_IN)) &&		\
 		    (s)->rt_kif != NULL &&				\
 		    (s)->rt_kif != (i))					\
 			return (PF_PASS);				\
 	} while (0)
 
 #define	BOUND_IFACE(r, k) \
 	((r)->rule_flag & PFRULE_IFBOUND) ? (k) : V_pfi_all
 
 #define	STATE_INC_COUNTERS(s)						\
 	do {								\
 		counter_u64_add(s->rule.ptr->states_cur, 1);		\
 		counter_u64_add(s->rule.ptr->states_tot, 1);		\
 		if (s->anchor.ptr != NULL) {				\
 			counter_u64_add(s->anchor.ptr->states_cur, 1);	\
 			counter_u64_add(s->anchor.ptr->states_tot, 1);	\
 		}							\
 		if (s->nat_rule.ptr != NULL) {				\
 			counter_u64_add(s->nat_rule.ptr->states_cur, 1);\
 			counter_u64_add(s->nat_rule.ptr->states_tot, 1);\
 		}							\
 	} while (0)
 
 #define	STATE_DEC_COUNTERS(s)						\
 	do {								\
 		if (s->nat_rule.ptr != NULL)				\
 			counter_u64_add(s->nat_rule.ptr->states_cur, -1);\
 		if (s->anchor.ptr != NULL)				\
 			counter_u64_add(s->anchor.ptr->states_cur, -1);	\
 		counter_u64_add(s->rule.ptr->states_cur, -1);		\
 	} while (0)
 
 static MALLOC_DEFINE(M_PFHASH, "pf_hash", "pf(4) hash header structures");
 VNET_DEFINE(struct pf_keyhash *, pf_keyhash);
 VNET_DEFINE(struct pf_idhash *, pf_idhash);
 VNET_DEFINE(struct pf_srchash *, pf_srchash);
 
 SYSCTL_NODE(_net, OID_AUTO, pf, CTLFLAG_RW, 0, "pf(4)");
 
 u_long	pf_hashmask;
 u_long	pf_srchashmask;
 static u_long	pf_hashsize;
 static u_long	pf_srchashsize;
 
 SYSCTL_ULONG(_net_pf, OID_AUTO, states_hashsize, CTLFLAG_RDTUN,
     &pf_hashsize, 0, "Size of pf(4) states hashtable");
 SYSCTL_ULONG(_net_pf, OID_AUTO, source_nodes_hashsize, CTLFLAG_RDTUN,
     &pf_srchashsize, 0, "Size of pf(4) source nodes hashtable");
 
 VNET_DEFINE(void *, pf_swi_cookie);
 
 VNET_DEFINE(uint32_t, pf_hashseed);
 #define	V_pf_hashseed	VNET(pf_hashseed)
 
 int
 pf_addr_cmp(struct pf_addr *a, struct pf_addr *b, sa_family_t af)
 {
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		if (a->addr32[0] > b->addr32[0])
 			return (1);
 		if (a->addr32[0] < b->addr32[0])
 			return (-1);
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		if (a->addr32[3] > b->addr32[3])
 			return (1);
 		if (a->addr32[3] < b->addr32[3])
 			return (-1);
 		if (a->addr32[2] > b->addr32[2])
 			return (1);
 		if (a->addr32[2] < b->addr32[2])
 			return (-1);
 		if (a->addr32[1] > b->addr32[1])
 			return (1);
 		if (a->addr32[1] < b->addr32[1])
 			return (-1);
 		if (a->addr32[0] > b->addr32[0])
 			return (1);
 		if (a->addr32[0] < b->addr32[0])
 			return (-1);
 		break;
 #endif /* INET6 */
 	default:
 		panic("%s: unknown address family %u", __func__, af);
 	}
 	return (0);
 }
 
 static __inline uint32_t
 pf_hashkey(struct pf_state_key *sk)
 {
 	uint32_t h;
 
 	h = murmur3_32_hash32((uint32_t *)sk,
 	    sizeof(struct pf_state_key_cmp)/sizeof(uint32_t),
 	    V_pf_hashseed);
 
 	return (h & pf_hashmask);
 }
 
 static __inline uint32_t
 pf_hashsrc(struct pf_addr *addr, sa_family_t af)
 {
 	uint32_t h;
 
 	switch (af) {
 	case AF_INET:
 		h = murmur3_32_hash32((uint32_t *)&addr->v4,
 		    sizeof(addr->v4)/sizeof(uint32_t), V_pf_hashseed);
 		break;
 	case AF_INET6:
 		h = murmur3_32_hash32((uint32_t *)&addr->v6,
 		    sizeof(addr->v6)/sizeof(uint32_t), V_pf_hashseed);
 		break;
 	default:
 		panic("%s: unknown address family %u", __func__, af);
 	}
 
 	return (h & pf_srchashmask);
 }
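 
 /*
  * Illustrative sketch, not part of this change: pf_hashkey() and
  * pf_hashsrc() reduce a 32-bit hash to a table row with "h & mask",
  * which is only equivalent to "h % size" when the table size is a
  * power of two.  The sketch uses FNV-1a as a stand-in hash (it is NOT
  * murmur3), and all names in it are hypothetical.
  */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
static int
is_power_of_2(uint32_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/* Seeded 32-bit FNV-1a, used here only as a simple stand-in hash. */
static uint32_t
fnv1a32(const void *buf, size_t len, uint32_t seed)
{
	const unsigned char *p = buf;
	uint32_t h = 2166136261u ^ seed;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return (h);
}

/* Map a key to a row of a table with (mask + 1) rows. */
static uint32_t
row_index(const void *key, size_t len, uint32_t seed, uint32_t mask)
{
	assert(is_power_of_2(mask + 1));
	return (fnv1a32(key, len, seed) & mask);
}
```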
 
 #ifdef INET6
 void
 pf_addrcpy(struct pf_addr *dst, struct pf_addr *src, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		dst->addr32[0] = src->addr32[0];
 		break;
 #endif /* INET */
 	case AF_INET6:
 		dst->addr32[0] = src->addr32[0];
 		dst->addr32[1] = src->addr32[1];
 		dst->addr32[2] = src->addr32[2];
 		dst->addr32[3] = src->addr32[3];
 		break;
 	}
 }
 #endif /* INET6 */
 
 static void
 pf_init_threshold(struct pf_threshold *threshold,
     u_int32_t limit, u_int32_t seconds)
 {
 	threshold->limit = limit * PF_THRESHOLD_MULT;
 	threshold->seconds = seconds;
 	threshold->count = 0;
 	threshold->last = time_uptime;
 }
 
 static void
 pf_add_threshold(struct pf_threshold *threshold)
 {
 	u_int32_t t = time_uptime, diff = t - threshold->last;
 
 	if (diff >= threshold->seconds)
 		threshold->count = 0;
 	else
 		threshold->count -= threshold->count * diff /
 		    threshold->seconds;
 	threshold->count += PF_THRESHOLD_MULT;
 	threshold->last = t;
 }
 
 static int
 pf_check_threshold(struct pf_threshold *threshold)
 {
 	return (threshold->count > threshold->limit);
 }
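 
 /*
  * Illustrative sketch, not part of this change: the three threshold
  * functions above implement a linearly decaying rate counter.  The
  * userland version below takes the current time as a parameter instead
  * of reading time_uptime, and assumes a scale factor of 1000 for
  * THRESHOLD_MULT; the struct and function names are hypothetical.
  */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	THRESHOLD_MULT	1000	/* assumed scale factor */

struct threshold {
	uint32_t limit;		/* scaled limit */
	uint32_t seconds;	/* decay window */
	uint32_t count;		/* scaled, decayed event count */
	uint32_t last;		/* time of the last event */
};

static void
threshold_init(struct threshold *t, uint32_t limit, uint32_t seconds,
    uint32_t now)
{
	assert(t != NULL);
	t->limit = limit * THRESHOLD_MULT;
	t->seconds = seconds;
	t->count = 0;
	t->last = now;
}

/* Record one event: decay the old count by elapsed/window, then add one. */
static void
threshold_add(struct threshold *t, uint32_t now)
{
	uint32_t diff = now - t->last;

	if (diff >= t->seconds)
		t->count = 0;
	else
		t->count -= t->count * diff / t->seconds;
	t->count += THRESHOLD_MULT;
	t->last = now;
}

static int
threshold_exceeded(const struct threshold *t)
{
	return (t->count > t->limit);
}
```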
 
 static int
 pf_src_connlimit(struct pf_state **state)
 {
 	struct pf_overload_entry *pfoe;
 	int bad = 0;
 
 	PF_STATE_LOCK_ASSERT(*state);
 
 	(*state)->src_node->conn++;
 	(*state)->src.tcp_est = 1;
 	pf_add_threshold(&(*state)->src_node->conn_rate);
 
 	if ((*state)->rule.ptr->max_src_conn &&
 	    (*state)->rule.ptr->max_src_conn <
 	    (*state)->src_node->conn) {
 		counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONN], 1);
 		bad++;
 	}
 
 	if ((*state)->rule.ptr->max_src_conn_rate.limit &&
 	    pf_check_threshold(&(*state)->src_node->conn_rate)) {
 		counter_u64_add(V_pf_status.lcounters[LCNT_SRCCONNRATE], 1);
 		bad++;
 	}
 
 	if (!bad)
 		return (0);
 
 	/* Kill this state. */
 	(*state)->timeout = PFTM_PURGE;
 	(*state)->src.state = (*state)->dst.state = TCPS_CLOSED;
 
 	if ((*state)->rule.ptr->overload_tbl == NULL)
 		return (1);
 
 	/* Schedule overloading and flushing task. */
 	pfoe = malloc(sizeof(*pfoe), M_PFTEMP, M_NOWAIT);
 	if (pfoe == NULL)
 		return (1);	/* too bad :( */
 
 	bcopy(&(*state)->src_node->addr, &pfoe->addr, sizeof(pfoe->addr));
 	pfoe->af = (*state)->key[PF_SK_WIRE]->af;
 	pfoe->rule = (*state)->rule.ptr;
 	pfoe->dir = (*state)->direction;
 	PF_OVERLOADQ_LOCK();
 	SLIST_INSERT_HEAD(&V_pf_overloadqueue, pfoe, next);
 	PF_OVERLOADQ_UNLOCK();
 	taskqueue_enqueue(taskqueue_swi, &V_pf_overloadtask);
 
 	return (1);
 }
 
 static void
 pf_overload_task(void *v, int pending)
 {
 	struct pf_overload_head queue;
 	struct pfr_addr p;
 	struct pf_overload_entry *pfoe, *pfoe1;
 	uint32_t killed = 0;
 
 	CURVNET_SET((struct vnet *)v);
 
 	PF_OVERLOADQ_LOCK();
 	queue = V_pf_overloadqueue;
 	SLIST_INIT(&V_pf_overloadqueue);
 	PF_OVERLOADQ_UNLOCK();
 
 	bzero(&p, sizeof(p));
 	SLIST_FOREACH(pfoe, &queue, next) {
 		counter_u64_add(V_pf_status.lcounters[LCNT_OVERLOAD_TABLE], 1);
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			printf("%s: blocking address ", __func__);
 			pf_print_host(&pfoe->addr, 0, pfoe->af);
 			printf("\n");
 		}
 
 		p.pfra_af = pfoe->af;
 		switch (pfoe->af) {
 #ifdef INET
 		case AF_INET:
 			p.pfra_net = 32;
 			p.pfra_ip4addr = pfoe->addr.v4;
 			break;
 #endif
 #ifdef INET6
 		case AF_INET6:
 			p.pfra_net = 128;
 			p.pfra_ip6addr = pfoe->addr.v6;
 			break;
 #endif
 		}
 
 		PF_RULES_WLOCK();
 		pfr_insert_kentry(pfoe->rule->overload_tbl, &p, time_second);
 		PF_RULES_WUNLOCK();
 	}
 
 	/*
 	 * Remove the entries that don't need flushing.
 	 */
 	SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1)
 		if (pfoe->rule->flush == 0) {
 			SLIST_REMOVE(&queue, pfoe, pf_overload_entry, next);
 			free(pfoe, M_PFTEMP);
 		} else
 			counter_u64_add(
 			    V_pf_status.lcounters[LCNT_OVERLOAD_FLUSH], 1);
 
 	/* If nothing to flush, return. */
 	if (SLIST_EMPTY(&queue)) {
 		CURVNET_RESTORE();
 		return;
 	}
 
 	for (int i = 0; i <= pf_hashmask; i++) {
 		struct pf_idhash *ih = &V_pf_idhash[i];
 		struct pf_state_key *sk;
 		struct pf_state *s;
 
 		PF_HASHROW_LOCK(ih);
 		LIST_FOREACH(s, &ih->states, entry) {
 		    sk = s->key[PF_SK_WIRE];
 		    SLIST_FOREACH(pfoe, &queue, next)
 			if (sk->af == pfoe->af &&
 			    ((pfoe->rule->flush & PF_FLUSH_GLOBAL) ||
 			    pfoe->rule == s->rule.ptr) &&
 			    ((pfoe->dir == PF_OUT &&
 			    PF_AEQ(&pfoe->addr, &sk->addr[1], sk->af)) ||
 			    (pfoe->dir == PF_IN &&
 			    PF_AEQ(&pfoe->addr, &sk->addr[0], sk->af)))) {
 				s->timeout = PFTM_PURGE;
 				s->src.state = s->dst.state = TCPS_CLOSED;
 				killed++;
 			}
 		}
 		PF_HASHROW_UNLOCK(ih);
 	}
 	SLIST_FOREACH_SAFE(pfoe, &queue, next, pfoe1)
 		free(pfoe, M_PFTEMP);
 	if (V_pf_status.debug >= PF_DEBUG_MISC)
 		printf("%s: %u states killed\n", __func__, killed);
 
 	CURVNET_RESTORE();
 }
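 
 /*
  * Illustrative sketch, not part of this change: pf_overload_task()
  * copies the shared list head while holding the queue mutex, resets
  * the shared list, and then works on its private copy without the
  * mutex.  The sketch below shows only that "steal the whole list"
  * step with a hand-rolled singly linked list; locking is omitted and
  * all names are hypothetical.
  */

```c
#include <assert.h>
#include <stddef.h>

struct entry {
	struct entry *next;
	int value;
};

struct queue {
	struct entry *head;
};

/* Producer side: push at the head (would run under the queue lock). */
static void
queue_push(struct queue *q, struct entry *e)
{
	assert(q != NULL && e != NULL);
	e->next = q->head;
	q->head = e;
}

/*
 * Consumer side: take every entry at once and leave the shared queue
 * empty (would also run under the queue lock, but only briefly).
 */
static struct queue
queue_steal(struct queue *q)
{
	struct queue stolen = *q;

	q->head = NULL;
	return (stolen);
}

static int
queue_count(const struct queue *q)
{
	const struct entry *e;
	int n = 0;

	for (e = q->head; e != NULL; e = e->next)
		n++;
	return (n);
}
```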
 
 /*
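  * Look up a source node.  On a hit, the node's state count is
  * incremented and the hash row is unlocked.  On a miss, if returnlocked
  * is non-zero, return with the hash row still locked, so that the
  * caller can consistently allocate and insert a new node.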
  */
 struct pf_src_node *
 pf_find_src_node(struct pf_addr *src, struct pf_rule *rule, sa_family_t af,
 	int returnlocked)
 {
 	struct pf_srchash *sh;
 	struct pf_src_node *n;
 
 	counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_SEARCH], 1);
 
 	sh = &V_pf_srchash[pf_hashsrc(src, af)];
 	PF_HASHROW_LOCK(sh);
 	LIST_FOREACH(n, &sh->nodes, entry)
 		if (n->rule.ptr == rule && n->af == af &&
 		    ((af == AF_INET && n->addr.v4.s_addr == src->v4.s_addr) ||
 		    (af == AF_INET6 && bcmp(&n->addr, src, sizeof(*src)) == 0)))
 			break;
 	if (n != NULL) {
 		n->states++;
 		PF_HASHROW_UNLOCK(sh);
 	} else if (returnlocked == 0)
 		PF_HASHROW_UNLOCK(sh);
 
 	return (n);
 }
 
 static int
 pf_insert_src_node(struct pf_src_node **sn, struct pf_rule *rule,
     struct pf_addr *src, sa_family_t af)
 {
 
 	KASSERT((rule->rule_flag & PFRULE_RULESRCTRACK ||
 	    rule->rpool.opts & PF_POOL_STICKYADDR),
 	    ("%s for non-tracking rule %p", __func__, rule));
 
 	if (*sn == NULL)
 		*sn = pf_find_src_node(src, rule, af, 1);
 
 	if (*sn == NULL) {
 		struct pf_srchash *sh = &V_pf_srchash[pf_hashsrc(src, af)];
 
 		PF_HASHROW_ASSERT(sh);
 
 		if (!rule->max_src_nodes ||
 		    counter_u64_fetch(rule->src_nodes) < rule->max_src_nodes)
 			(*sn) = uma_zalloc(V_pf_sources_z, M_NOWAIT | M_ZERO);
 		else
 			counter_u64_add(V_pf_status.lcounters[LCNT_SRCNODES],
 			    1);
 		if ((*sn) == NULL) {
 			PF_HASHROW_UNLOCK(sh);
 			return (-1);
 		}
 
 		pf_init_threshold(&(*sn)->conn_rate,
 		    rule->max_src_conn_rate.limit,
 		    rule->max_src_conn_rate.seconds);
 
 		(*sn)->af = af;
 		(*sn)->rule.ptr = rule;
 		PF_ACPY(&(*sn)->addr, src, af);
 		LIST_INSERT_HEAD(&sh->nodes, *sn, entry);
 		(*sn)->creation = time_uptime;
 		(*sn)->ruletype = rule->action;
 		(*sn)->states = 1;
 		if ((*sn)->rule.ptr != NULL)
 			counter_u64_add((*sn)->rule.ptr->src_nodes, 1);
 		PF_HASHROW_UNLOCK(sh);
 		counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_INSERT], 1);
 	} else {
 		if (rule->max_src_states &&
 		    (*sn)->states >= rule->max_src_states) {
 			counter_u64_add(V_pf_status.lcounters[LCNT_SRCSTATES],
 			    1);
 			return (-1);
 		}
 	}
 	return (0);
 }
 
 void
 pf_unlink_src_node(struct pf_src_node *src)
 {
 
 	PF_HASHROW_ASSERT(&V_pf_srchash[pf_hashsrc(&src->addr, src->af)]);
 	LIST_REMOVE(src, entry);
 	if (src->rule.ptr)
 		counter_u64_add(src->rule.ptr->src_nodes, -1);
 }
 
 u_int
 pf_free_src_nodes(struct pf_src_node_list *head)
 {
 	struct pf_src_node *sn, *tmp;
 	u_int count = 0;
 
 	LIST_FOREACH_SAFE(sn, head, entry, tmp) {
 		uma_zfree(V_pf_sources_z, sn);
 		count++;
 	}
 
 	counter_u64_add(V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], count);
 
 	return (count);
 }
 
 void
 pf_mtag_initialize()
 {
 
 	pf_mtag_z = uma_zcreate("pf mtags", sizeof(struct m_tag) +
 	    sizeof(struct pf_mtag), NULL, NULL, pf_mtag_uminit, NULL,
 	    UMA_ALIGN_PTR, 0);
 }
 
 /* Per-vnet data storage structures initialization. */
 void
 pf_initialize()
 {
 	struct pf_keyhash	*kh;
 	struct pf_idhash	*ih;
 	struct pf_srchash	*sh;
 	u_int i;
 
 	if (pf_hashsize == 0 || !powerof2(pf_hashsize))
 		pf_hashsize = PF_HASHSIZ;
 	if (pf_srchashsize == 0 || !powerof2(pf_srchashsize))
 		pf_srchashsize = PF_HASHSIZ / 4;
 
 	V_pf_hashseed = arc4random();
 
 	/* States and state keys storage. */
 	V_pf_state_z = uma_zcreate("pf states", sizeof(struct pf_state),
 	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
 	V_pf_limits[PF_LIMIT_STATES].zone = V_pf_state_z;
 	uma_zone_set_max(V_pf_state_z, PFSTATE_HIWAT);
 	uma_zone_set_warning(V_pf_state_z, "PF states limit reached");
 
 	V_pf_state_key_z = uma_zcreate("pf state keys",
 	    sizeof(struct pf_state_key), pf_state_key_ctor, NULL, NULL, NULL,
 	    UMA_ALIGN_PTR, 0);
 	V_pf_keyhash = malloc(pf_hashsize * sizeof(struct pf_keyhash),
 	    M_PFHASH, M_WAITOK | M_ZERO);
 	V_pf_idhash = malloc(pf_hashsize * sizeof(struct pf_idhash),
 	    M_PFHASH, M_WAITOK | M_ZERO);
 	pf_hashmask = pf_hashsize - 1;
 	for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask;
 	    i++, kh++, ih++) {
 		mtx_init(&kh->lock, "pf_keyhash", NULL, MTX_DEF | MTX_DUPOK);
 		mtx_init(&ih->lock, "pf_idhash", NULL, MTX_DEF);
 	}
 
 	/* Source nodes. */
 	V_pf_sources_z = uma_zcreate("pf source nodes",
 	    sizeof(struct pf_src_node), NULL, NULL, NULL, NULL, UMA_ALIGN_PTR,
 	    0);
 	V_pf_limits[PF_LIMIT_SRC_NODES].zone = V_pf_sources_z;
 	uma_zone_set_max(V_pf_sources_z, PFSNODE_HIWAT);
 	uma_zone_set_warning(V_pf_sources_z, "PF source nodes limit reached");
 	V_pf_srchash = malloc(pf_srchashsize * sizeof(struct pf_srchash),
 	    M_PFHASH, M_WAITOK | M_ZERO);
 	pf_srchashmask = pf_srchashsize - 1;
 	for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++)
 		mtx_init(&sh->lock, "pf_srchash", NULL, MTX_DEF);
 
 	/* ALTQ */
 	TAILQ_INIT(&V_pf_altqs[0]);
 	TAILQ_INIT(&V_pf_altqs[1]);
 	TAILQ_INIT(&V_pf_pabuf);
 	V_pf_altqs_active = &V_pf_altqs[0];
 	V_pf_altqs_inactive = &V_pf_altqs[1];
 
 
 	/* Send & overload+flush queues. */
 	STAILQ_INIT(&V_pf_sendqueue);
 	SLIST_INIT(&V_pf_overloadqueue);
 	TASK_INIT(&V_pf_overloadtask, 0, pf_overload_task, curvnet);
 	mtx_init(&pf_sendqueue_mtx, "pf send queue", NULL, MTX_DEF);
 	mtx_init(&pf_overloadqueue_mtx, "pf overload/flush queue", NULL,
 	    MTX_DEF);
 
 	/* Unlinked, but may be referenced rules. */
 	TAILQ_INIT(&V_pf_unlinked_rules);
 	mtx_init(&pf_unlnkdrules_mtx, "pf unlinked rules", NULL, MTX_DEF);
 }
 
 void
 pf_mtag_cleanup()
 {
 
 	uma_zdestroy(pf_mtag_z);
 }
 
 void
 pf_cleanup()
 {
 	struct pf_keyhash	*kh;
 	struct pf_idhash	*ih;
 	struct pf_srchash	*sh;
 	struct pf_send_entry	*pfse, *next;
 	u_int i;
 
 	for (i = 0, kh = V_pf_keyhash, ih = V_pf_idhash; i <= pf_hashmask;
 	    i++, kh++, ih++) {
 		KASSERT(LIST_EMPTY(&kh->keys), ("%s: key hash not empty",
 		    __func__));
 		KASSERT(LIST_EMPTY(&ih->states), ("%s: id hash not empty",
 		    __func__));
 		mtx_destroy(&kh->lock);
 		mtx_destroy(&ih->lock);
 	}
 	free(V_pf_keyhash, M_PFHASH);
 	free(V_pf_idhash, M_PFHASH);
 
 	for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) {
 		KASSERT(LIST_EMPTY(&sh->nodes),
 		    ("%s: source node hash not empty", __func__));
 		mtx_destroy(&sh->lock);
 	}
 	free(V_pf_srchash, M_PFHASH);
 
 	STAILQ_FOREACH_SAFE(pfse, &V_pf_sendqueue, pfse_next, next) {
 		m_freem(pfse->pfse_m);
 		free(pfse, M_PFTEMP);
 	}
 
 	mtx_destroy(&pf_sendqueue_mtx);
 	mtx_destroy(&pf_overloadqueue_mtx);
 	mtx_destroy(&pf_unlnkdrules_mtx);
 
 	uma_zdestroy(V_pf_sources_z);
 	uma_zdestroy(V_pf_state_z);
 	uma_zdestroy(V_pf_state_key_z);
 }
 
 static int
 pf_mtag_uminit(void *mem, int size, int how)
 {
 	struct m_tag *t;
 
 	t = (struct m_tag *)mem;
 	t->m_tag_cookie = MTAG_ABI_COMPAT;
 	t->m_tag_id = PACKET_TAG_PF;
 	t->m_tag_len = sizeof(struct pf_mtag);
 	t->m_tag_free = pf_mtag_free;
 
 	return (0);
 }
 
 static void
 pf_mtag_free(struct m_tag *t)
 {
 
 	uma_zfree(pf_mtag_z, t);
 }
 
 struct pf_mtag *
 pf_get_mtag(struct mbuf *m)
 {
 	struct m_tag *mtag;
 
 	if ((mtag = m_tag_find(m, PACKET_TAG_PF, NULL)) != NULL)
 		return ((struct pf_mtag *)(mtag + 1));
 
 	mtag = uma_zalloc(pf_mtag_z, M_NOWAIT);
 	if (mtag == NULL)
 		return (NULL);
 	bzero(mtag + 1, sizeof(struct pf_mtag));
 	m_tag_prepend(m, mtag);
 
 	return ((struct pf_mtag *)(mtag + 1));
 }
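 
 /*
  * Illustrative sketch, not part of this change: pf_get_mtag() relies on
  * the tag header and the pf payload living in a single allocation, so
  * the payload is found at "mtag + 1".  The userland sketch below shows
  * that layout with malloc(3); the struct and function names are
  * hypothetical and do not match the real m_tag or pf_mtag layouts.
  */

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct tag_hdr {
	struct tag_hdr *next;
	unsigned id;
	unsigned len;		/* payload length in bytes */
};

struct tag_payload {
	int flags;
	int tag;
};

/* Allocate header and zeroed payload back to back in one block. */
static struct tag_hdr *
tag_alloc(unsigned id)
{
	struct tag_hdr *t;

	t = malloc(sizeof(*t) + sizeof(struct tag_payload));
	if (t == NULL)
		return (NULL);
	t->next = NULL;
	t->id = id;
	t->len = sizeof(struct tag_payload);
	memset(t + 1, 0, sizeof(struct tag_payload));
	return (t);
}

/* The payload starts immediately after the header. */
static struct tag_payload *
tag_data(struct tag_hdr *t)
{
	assert(t != NULL);
	return ((struct tag_payload *)(t + 1));
}
```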
 
 static int
 pf_state_key_attach(struct pf_state_key *skw, struct pf_state_key *sks,
     struct pf_state *s)
 {
 	struct pf_keyhash	*khs, *khw, *kh;
 	struct pf_state_key	*sk, *cur;
 	struct pf_state		*si, *olds = NULL;
 	int idx;
 
 	KASSERT(s->refs == 0, ("%s: state not pristine", __func__));
 	KASSERT(s->key[PF_SK_WIRE] == NULL, ("%s: state has key", __func__));
 	KASSERT(s->key[PF_SK_STACK] == NULL, ("%s: state has key", __func__));
 
 	/*
 	 * We need to lock the hash slots of both keys. To avoid deadlock
 	 * we always lock the slot with the lower address first. Unlock
 	 * order isn't important.
 	 *
 	 * We also need to lock the ID hash slot before dropping the key
 	 * locks. On success we return with the ID hash slot locked.
 	 */
 
 	if (skw == sks) {
 		khs = khw = &V_pf_keyhash[pf_hashkey(skw)];
 		PF_HASHROW_LOCK(khs);
 	} else {
 		khs = &V_pf_keyhash[pf_hashkey(sks)];
 		khw = &V_pf_keyhash[pf_hashkey(skw)];
 		if (khs == khw) {
 			PF_HASHROW_LOCK(khs);
 		} else if (khs < khw) {
 			PF_HASHROW_LOCK(khs);
 			PF_HASHROW_LOCK(khw);
 		} else {
 			PF_HASHROW_LOCK(khw);
 			PF_HASHROW_LOCK(khs);
 		}
 	}
 
 #define	KEYS_UNLOCK()	do {			\
 	if (khs != khw) {			\
 		PF_HASHROW_UNLOCK(khs);		\
 		PF_HASHROW_UNLOCK(khw);		\
 	} else					\
 		PF_HASHROW_UNLOCK(khs);		\
 } while (0)
 
 	/*
 	 * First run: start with wire key.
 	 */
 	sk = skw;
 	kh = khw;
 	idx = PF_SK_WIRE;
 
 keyattach:
 	LIST_FOREACH(cur, &kh->keys, entry)
 		if (bcmp(cur, sk, sizeof(struct pf_state_key_cmp)) == 0)
 			break;
 
 	if (cur != NULL) {
 		/* Key exists. Check for same kif, if none, add to key. */
 		TAILQ_FOREACH(si, &cur->states[idx], key_list[idx]) {
 			struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(si)];
 
 			PF_HASHROW_LOCK(ih);
 			if (si->kif == s->kif &&
 			    si->direction == s->direction) {
 				if (sk->proto == IPPROTO_TCP &&
 				    si->src.state >= TCPS_FIN_WAIT_2 &&
 				    si->dst.state >= TCPS_FIN_WAIT_2) {
 					/*
 					 * New state matches an old >FIN_WAIT_2
 					 * state. We can't drop key hash locks,
 					 * thus we can't unlink it properly.
 					 *
 					 * As a workaround we drop it into
 					 * TCPS_CLOSED state, schedule purge
 					 * ASAP and push it into the very end
 					 * of the slot TAILQ, so that it won't
 					 * conflict with our new state.
 					 */
 					si->src.state = si->dst.state =
 					    TCPS_CLOSED;
 					si->timeout = PFTM_PURGE;
 					olds = si;
 				} else {
 					if (V_pf_status.debug >= PF_DEBUG_MISC) {
 						printf("pf: %s key attach "
 						    "failed on %s: ",
 						    (idx == PF_SK_WIRE) ?
 						    "wire" : "stack",
 						    s->kif->pfik_name);
 						pf_print_state_parts(s,
 						    (idx == PF_SK_WIRE) ?
 						    sk : NULL,
 						    (idx == PF_SK_STACK) ?
 						    sk : NULL);
 						printf(", existing: ");
 						pf_print_state_parts(si,
 						    (idx == PF_SK_WIRE) ?
 						    sk : NULL,
 						    (idx == PF_SK_STACK) ?
 						    sk : NULL);
 						printf("\n");
 					}
 					PF_HASHROW_UNLOCK(ih);
 					KEYS_UNLOCK();
 					uma_zfree(V_pf_state_key_z, sk);
 					if (idx == PF_SK_STACK)
 						pf_detach_state(s);
 					return (EEXIST); /* collision! */
 				}
 			}
 			PF_HASHROW_UNLOCK(ih);
 		}
 		uma_zfree(V_pf_state_key_z, sk);
 		s->key[idx] = cur;
 	} else {
 		LIST_INSERT_HEAD(&kh->keys, sk, entry);
 		s->key[idx] = sk;
 	}
 
 stateattach:
 	/* List is sorted, if-bound states before floating. */
 	if (s->kif == V_pfi_all)
 		TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], s, key_list[idx]);
 	else
 		TAILQ_INSERT_HEAD(&s->key[idx]->states[idx], s, key_list[idx]);
 
 	if (olds) {
 		TAILQ_REMOVE(&s->key[idx]->states[idx], olds, key_list[idx]);
 		TAILQ_INSERT_TAIL(&s->key[idx]->states[idx], olds,
 		    key_list[idx]);
 		olds = NULL;
 	}
 
 	/*
 	 * Attach done. Decide whether, and how, to attach
 	 * the second key.
 	 */
 	if (sks == skw) {
 		s->key[PF_SK_STACK] = s->key[PF_SK_WIRE];
 		idx = PF_SK_STACK;
 		sks = NULL;
 		goto stateattach;
 	} else if (sks != NULL) {
 		/*
 		 * Continue attaching with stack key.
 		 */
 		sk = sks;
 		kh = khs;
 		idx = PF_SK_STACK;
 		sks = NULL;
 		goto keyattach;
 	}
 
 	PF_STATE_LOCK(s);
 	KEYS_UNLOCK();
 
 	KASSERT(s->key[PF_SK_WIRE] != NULL && s->key[PF_SK_STACK] != NULL,
 	    ("%s failure", __func__));
 
 	return (0);
 #undef	KEYS_UNLOCK
 }
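 
 /*
  * Illustrative sketch, not part of this change: pf_state_key_attach()
  * avoids deadlock by always taking the hash row with the lower address
  * first, and by taking a single lock when both keys hash to the same
  * row.  The sketch below models rows as elements of one array and
  * records the acquisition order instead of using real mutexes; all
  * names are hypothetical.
  */

```c
#include <assert.h>
#include <stddef.h>

struct row {
	int locked;
	int order;	/* sequence number of the last acquisition */
};

static int lock_seq;

static void
row_lock(struct row *r)
{
	assert(!r->locked);	/* a real mutex would block or deadlock here */
	r->locked = 1;
	r->order = ++lock_seq;
}

static void
row_unlock(struct row *r)
{
	assert(r->locked);
	r->locked = 0;
}

/* Lock two rows of the same table in address order; once if identical. */
static void
rows_lock_pair(struct row *a, struct row *b)
{
	if (a == b) {
		row_lock(a);
	} else if (a < b) {
		row_lock(a);
		row_lock(b);
	} else {
		row_lock(b);
		row_lock(a);
	}
}

/* Unlock order does not matter for deadlock avoidance. */
static void
rows_unlock_pair(struct row *a, struct row *b)
{
	row_unlock(a);
	if (a != b)
		row_unlock(b);
}
```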
 
 static void
 pf_detach_state(struct pf_state *s)
 {
 	struct pf_state_key *sks = s->key[PF_SK_STACK];
 	struct pf_keyhash *kh;
 
 	if (sks != NULL) {
 		kh = &V_pf_keyhash[pf_hashkey(sks)];
 		PF_HASHROW_LOCK(kh);
 		if (s->key[PF_SK_STACK] != NULL)
 			pf_state_key_detach(s, PF_SK_STACK);
 		/*
 		 * If both point to same key, then we are done.
 		 */
 		if (sks == s->key[PF_SK_WIRE]) {
 			pf_state_key_detach(s, PF_SK_WIRE);
 			PF_HASHROW_UNLOCK(kh);
 			return;
 		}
 		PF_HASHROW_UNLOCK(kh);
 	}
 
 	if (s->key[PF_SK_WIRE] != NULL) {
 		kh = &V_pf_keyhash[pf_hashkey(s->key[PF_SK_WIRE])];
 		PF_HASHROW_LOCK(kh);
 		if (s->key[PF_SK_WIRE] != NULL)
 			pf_state_key_detach(s, PF_SK_WIRE);
 		PF_HASHROW_UNLOCK(kh);
 	}
 }
 
 static void
 pf_state_key_detach(struct pf_state *s, int idx)
 {
 	struct pf_state_key *sk = s->key[idx];
 #ifdef INVARIANTS
 	struct pf_keyhash *kh = &V_pf_keyhash[pf_hashkey(sk)];
 
 	PF_HASHROW_ASSERT(kh);
 #endif
 	TAILQ_REMOVE(&sk->states[idx], s, key_list[idx]);
 	s->key[idx] = NULL;
 
 	if (TAILQ_EMPTY(&sk->states[0]) && TAILQ_EMPTY(&sk->states[1])) {
 		LIST_REMOVE(sk, entry);
 		uma_zfree(V_pf_state_key_z, sk);
 	}
 }
 
 static int
 pf_state_key_ctor(void *mem, int size, void *arg, int flags)
 {
 	struct pf_state_key *sk = mem;
 
 	bzero(sk, sizeof(struct pf_state_key_cmp));
 	TAILQ_INIT(&sk->states[PF_SK_WIRE]);
 	TAILQ_INIT(&sk->states[PF_SK_STACK]);
 
 	return (0);
 }
 
 struct pf_state_key *
 pf_state_key_setup(struct pf_pdesc *pd, struct pf_addr *saddr,
 	struct pf_addr *daddr, u_int16_t sport, u_int16_t dport)
 {
 	struct pf_state_key *sk;
 
 	sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT);
 	if (sk == NULL)
 		return (NULL);
 
 	PF_ACPY(&sk->addr[pd->sidx], saddr, pd->af);
 	PF_ACPY(&sk->addr[pd->didx], daddr, pd->af);
 	sk->port[pd->sidx] = sport;
 	sk->port[pd->didx] = dport;
 	sk->proto = pd->proto;
 	sk->af = pd->af;
 
 	return (sk);
 }
 
 struct pf_state_key *
 pf_state_key_clone(struct pf_state_key *orig)
 {
 	struct pf_state_key *sk;
 
 	sk = uma_zalloc(V_pf_state_key_z, M_NOWAIT);
 	if (sk == NULL)
 		return (NULL);
 
 	bcopy(orig, sk, sizeof(struct pf_state_key_cmp));
 
 	return (sk);
 }
 
 int
 pf_state_insert(struct pfi_kif *kif, struct pf_state_key *skw,
     struct pf_state_key *sks, struct pf_state *s)
 {
 	struct pf_idhash *ih;
 	struct pf_state *cur;
 	int error;
 
 	KASSERT(TAILQ_EMPTY(&sks->states[0]) && TAILQ_EMPTY(&sks->states[1]),
 	    ("%s: sks not pristine", __func__));
 	KASSERT(TAILQ_EMPTY(&skw->states[0]) && TAILQ_EMPTY(&skw->states[1]),
 	    ("%s: skw not pristine", __func__));
 	KASSERT(s->refs == 0, ("%s: state not pristine", __func__));
 
 	s->kif = kif;
 
 	if (s->id == 0 && s->creatorid == 0) {
 		/* XXX: should be atomic, but probability of collision low */
 		if ((s->id = V_pf_stateid[curcpu]++) == PFID_MAXID)
 			V_pf_stateid[curcpu] = 1;
 		s->id |= (uint64_t)curcpu << PFID_CPUSHIFT;
 		s->id = htobe64(s->id);
 		s->creatorid = V_pf_status.hostid;
 	}
 
 	/* Returns with the ID hash row locked on success. */
 	if ((error = pf_state_key_attach(skw, sks, s)) != 0)
 		return (error);
 
 	ih = &V_pf_idhash[PF_IDHASH(s)];
 	PF_HASHROW_ASSERT(ih);
 	LIST_FOREACH(cur, &ih->states, entry)
 		if (cur->id == s->id && cur->creatorid == s->creatorid)
 			break;
 
 	if (cur != NULL) {
 		PF_HASHROW_UNLOCK(ih);
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			printf("pf: state ID collision: "
 			    "id: %016llx creatorid: %08x\n",
 			    (unsigned long long)be64toh(s->id),
 			    ntohl(s->creatorid));
 		}
 		pf_detach_state(s);
 		return (EEXIST);
 	}
 	LIST_INSERT_HEAD(&ih->states, s, entry);
 	/* One for keys, one for ID hash. */
 	refcount_init(&s->refs, 2);
 
 	counter_u64_add(V_pf_status.fcounters[FCNT_STATE_INSERT], 1);
 	if (pfsync_insert_state_ptr != NULL)
 		pfsync_insert_state_ptr(s);
 
 	/* Returns locked. */
 	return (0);
 }
 
 /*
  * Find state by ID: returns with locked row on success.
  */
 struct pf_state *
 pf_find_state_byid(uint64_t id, uint32_t creatorid)
 {
 	struct pf_idhash *ih;
 	struct pf_state *s;
 
 	counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1);
 
 	ih = &V_pf_idhash[(be64toh(id) % (pf_hashmask + 1))];
 
 	PF_HASHROW_LOCK(ih);
 	LIST_FOREACH(s, &ih->states, entry)
 		if (s->id == id && s->creatorid == creatorid)
 			break;
 
 	if (s == NULL)
 		PF_HASHROW_UNLOCK(ih);
 
 	return (s);
 }
 
 /*
  * Find state by key.
  * Returns with ID hash slot locked on success.
  */
 static struct pf_state *
 pf_find_state(struct pfi_kif *kif, struct pf_state_key_cmp *key, u_int dir)
 {
 	struct pf_keyhash	*kh;
 	struct pf_state_key	*sk;
 	struct pf_state		*s;
 	int idx;
 
 	counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1);
 
 	kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)];
 
 	PF_HASHROW_LOCK(kh);
 	LIST_FOREACH(sk, &kh->keys, entry)
 		if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0)
 			break;
 	if (sk == NULL) {
 		PF_HASHROW_UNLOCK(kh);
 		return (NULL);
 	}
 
 	idx = (dir == PF_IN ? PF_SK_WIRE : PF_SK_STACK);
 
 	/* List is sorted, if-bound states before floating ones. */
 	TAILQ_FOREACH(s, &sk->states[idx], key_list[idx])
 		if (s->kif == V_pfi_all || s->kif == kif) {
 			PF_STATE_LOCK(s);
 			PF_HASHROW_UNLOCK(kh);
 			if (s->timeout >= PFTM_MAX) {
 				/*
 				 * State is either being processed by
 				 * pf_unlink_state() in an other thread, or
 				 * is scheduled for immediate expiry.
 				 */
 				PF_STATE_UNLOCK(s);
 				return (NULL);
 			}
 			return (s);
 		}
 	PF_HASHROW_UNLOCK(kh);
 
 	return (NULL);
 }
 
 struct pf_state *
 pf_find_state_all(struct pf_state_key_cmp *key, u_int dir, int *more)
 {
 	struct pf_keyhash	*kh;
 	struct pf_state_key	*sk;
 	struct pf_state		*s, *ret = NULL;
 	int			 idx, inout = 0;
 
 	counter_u64_add(V_pf_status.fcounters[FCNT_STATE_SEARCH], 1);
 
 	kh = &V_pf_keyhash[pf_hashkey((struct pf_state_key *)key)];
 
 	PF_HASHROW_LOCK(kh);
 	LIST_FOREACH(sk, &kh->keys, entry)
 		if (bcmp(sk, key, sizeof(struct pf_state_key_cmp)) == 0)
 			break;
 	if (sk == NULL) {
 		PF_HASHROW_UNLOCK(kh);
 		return (NULL);
 	}
 	switch (dir) {
 	case PF_IN:
 		idx = PF_SK_WIRE;
 		break;
 	case PF_OUT:
 		idx = PF_SK_STACK;
 		break;
 	case PF_INOUT:
 		idx = PF_SK_WIRE;
 		inout = 1;
 		break;
 	default:
 		panic("%s: dir %u", __func__, dir);
 	}
 second_run:
 	TAILQ_FOREACH(s, &sk->states[idx], key_list[idx]) {
 		if (more == NULL) {
 			PF_HASHROW_UNLOCK(kh);
 			return (s);
 		}
 
 		if (ret)
 			(*more)++;
 		else
 			ret = s;
 	}
 	if (inout == 1) {
 		inout = 0;
 		idx = PF_SK_STACK;
 		goto second_run;
 	}
 	PF_HASHROW_UNLOCK(kh);
 
 	return (ret);
 }
 
 /* END state table stuff */
 
 static void
 pf_send(struct pf_send_entry *pfse)
 {
 
 	PF_SENDQ_LOCK();
 	STAILQ_INSERT_TAIL(&V_pf_sendqueue, pfse, pfse_next);
 	PF_SENDQ_UNLOCK();
 	swi_sched(V_pf_swi_cookie, 0);
 }
 
 void
 pf_intr(void *v)
 {
 	struct pf_send_head queue;
 	struct pf_send_entry *pfse, *next;
 
 	CURVNET_SET((struct vnet *)v);
 
 	PF_SENDQ_LOCK();
 	queue = V_pf_sendqueue;
 	STAILQ_INIT(&V_pf_sendqueue);
 	PF_SENDQ_UNLOCK();
 
 	STAILQ_FOREACH_SAFE(pfse, &queue, pfse_next, next) {
 		switch (pfse->pfse_type) {
 #ifdef INET
 		case PFSE_IP:
 			ip_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL);
 			break;
 		case PFSE_ICMP:
 			icmp_error(pfse->pfse_m, pfse->icmpopts.type,
 			    pfse->icmpopts.code, 0, pfse->icmpopts.mtu);
 			break;
 #endif /* INET */
 #ifdef INET6
 		case PFSE_IP6:
 			ip6_output(pfse->pfse_m, NULL, NULL, 0, NULL, NULL,
 			    NULL);
 			break;
 		case PFSE_ICMP6:
 			icmp6_error(pfse->pfse_m, pfse->icmpopts.type,
 			    pfse->icmpopts.code, pfse->icmpopts.mtu);
 			break;
 #endif /* INET6 */
 		default:
 			panic("%s: unknown type", __func__);
 		}
 		free(pfse, M_PFTEMP);
 	}
 	CURVNET_RESTORE();
 }
 
 void
 pf_purge_thread(void *v)
 {
 	u_int idx = 0;
 
 	CURVNET_SET((struct vnet *)v);
 
 	for (;;) {
 		PF_RULES_RLOCK();
 		rw_sleep(pf_purge_thread, &pf_rules_lock, 0, "pftm", hz / 10);
 
 		if (V_pf_end_threads) {
 			/*
 			 * To clean up all kifs and rules we need
 			 * two runs: first one clears reference flags,
 			 * then pf_purge_expired_states() doesn't
 			 * raise them, and then second run frees.
 			 */
 			PF_RULES_RUNLOCK();
 			pf_purge_unlinked_rules();
 			pfi_kif_purge();
 
 			/*
 			 * Now purge everything.
 			 */
 			pf_purge_expired_states(0, pf_hashmask);
 			pf_purge_expired_fragments();
 			pf_purge_expired_src_nodes();
 
 			/*
 			 * Now all kifs & rules should be unreferenced,
 			 * thus should be successfully freed.
 			 */
 			pf_purge_unlinked_rules();
 			pfi_kif_purge();
 
 			/*
 			 * Announce success and exit.
 			 */
 			PF_RULES_RLOCK();
 			V_pf_end_threads++;
 			PF_RULES_RUNLOCK();
 			wakeup(pf_purge_thread);
 			kproc_exit(0);
 		}
 		PF_RULES_RUNLOCK();
 
 		/* Process 1/interval fraction of the state table every run. */
 		idx = pf_purge_expired_states(idx, pf_hashmask /
 			    (V_pf_default_rule.timeout[PFTM_INTERVAL] * 10));
 
 		/* Purge other expired types every PFTM_INTERVAL seconds. */
 		if (idx == 0) {
 			/*
 			 * Order is important:
 			 * - states and src nodes reference rules
 			 * - states and rules reference kifs
 			 */
 			pf_purge_expired_fragments();
 			pf_purge_expired_src_nodes();
 			pf_purge_unlinked_rules();
 			pfi_kif_purge();
 		}
 	}
 	/* not reached */
 	CURVNET_RESTORE();
 }
 
 u_int32_t
 pf_state_expires(const struct pf_state *state)
 {
 	u_int32_t	timeout;
 	u_int32_t	start;
 	u_int32_t	end;
 	u_int32_t	states;
 
 	/* handle all PFTM_* > PFTM_MAX here */
 	if (state->timeout == PFTM_PURGE)
 		return (time_uptime);
 	KASSERT(state->timeout != PFTM_UNLINKED,
 	    ("pf_state_expires: timeout == PFTM_UNLINKED"));
 	KASSERT((state->timeout < PFTM_MAX),
 	    ("pf_state_expires: timeout > PFTM_MAX"));
 	timeout = state->rule.ptr->timeout[state->timeout];
 	if (!timeout)
 		timeout = V_pf_default_rule.timeout[state->timeout];
 	start = state->rule.ptr->timeout[PFTM_ADAPTIVE_START];
 	if (start) {
 		end = state->rule.ptr->timeout[PFTM_ADAPTIVE_END];
 		states = counter_u64_fetch(state->rule.ptr->states_cur);
 	} else {
 		start = V_pf_default_rule.timeout[PFTM_ADAPTIVE_START];
 		end = V_pf_default_rule.timeout[PFTM_ADAPTIVE_END];
 		states = V_pf_status.states;
 	}
 	if (end && states > start && start < end) {
 		if (states < end)
 			return (state->expire + timeout * (end - states) /
 			    (end - start));
 		else
 			return (time_uptime);
 	}
 	return (state->expire + timeout);
 }
 
 void
 pf_purge_expired_src_nodes()
 {
 	struct pf_src_node_list	 freelist;
 	struct pf_srchash	*sh;
 	struct pf_src_node	*cur, *next;
 	int i;
 
 	LIST_INIT(&freelist);
 	for (i = 0, sh = V_pf_srchash; i <= pf_srchashmask; i++, sh++) {
 	    PF_HASHROW_LOCK(sh);
 	    LIST_FOREACH_SAFE(cur, &sh->nodes, entry, next)
 		if (cur->states == 0 && cur->expire <= time_uptime) {
 			pf_unlink_src_node(cur);
 			LIST_INSERT_HEAD(&freelist, cur, entry);
 		} else if (cur->rule.ptr != NULL)
 			cur->rule.ptr->rule_flag |= PFRULE_REFS;
 	    PF_HASHROW_UNLOCK(sh);
 	}
 
 	pf_free_src_nodes(&freelist);
 
 	V_pf_status.src_nodes = uma_zone_get_cur(V_pf_sources_z);
 }
 
 static void
 pf_src_tree_remove_state(struct pf_state *s)
 {
 	struct pf_src_node *sn;
 	struct pf_srchash *sh;
 	uint32_t timeout;
 
 	timeout = s->rule.ptr->timeout[PFTM_SRC_NODE] ?
 	    s->rule.ptr->timeout[PFTM_SRC_NODE] :
 	    V_pf_default_rule.timeout[PFTM_SRC_NODE];
 
 	if (s->src_node != NULL) {
 		sn = s->src_node;
 		sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)];
 	    	PF_HASHROW_LOCK(sh);
 		if (s->src.tcp_est)
 			--sn->conn;
 		if (--sn->states == 0)
 			sn->expire = time_uptime + timeout;
 	    	PF_HASHROW_UNLOCK(sh);
 	}
 	if (s->nat_src_node != s->src_node && s->nat_src_node != NULL) {
 		sn = s->nat_src_node;
 		sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)];
 	    	PF_HASHROW_LOCK(sh);
 		if (--sn->states == 0)
 			sn->expire = time_uptime + timeout;
 	    	PF_HASHROW_UNLOCK(sh);
 	}
 	s->src_node = s->nat_src_node = NULL;
 }
 
 /*
  * Unlink and potentially free a state. Function may be
  * called with ID hash row locked, but always returns
  * unlocked, since it needs to go through key hash locking.
  */
 int
 pf_unlink_state(struct pf_state *s, u_int flags)
 {
 	struct pf_idhash *ih = &V_pf_idhash[PF_IDHASH(s)];
 
 	if ((flags & PF_ENTER_LOCKED) == 0)
 		PF_HASHROW_LOCK(ih);
 	else
 		PF_HASHROW_ASSERT(ih);
 
 	if (s->timeout == PFTM_UNLINKED) {
 		/*
 		 * State is being processed
 		 * by pf_unlink_state() in
 		 * another thread.
 		 */
 		PF_HASHROW_UNLOCK(ih);
 		return (0);	/* XXXGL: undefined actually */
 	}
 
 	if (s->src.state == PF_TCPS_PROXY_DST) {
 		/* XXX wire key the right one? */
 		pf_send_tcp(NULL, s->rule.ptr, s->key[PF_SK_WIRE]->af,
 		    &s->key[PF_SK_WIRE]->addr[1],
 		    &s->key[PF_SK_WIRE]->addr[0],
 		    s->key[PF_SK_WIRE]->port[1],
 		    s->key[PF_SK_WIRE]->port[0],
 		    s->src.seqhi, s->src.seqlo + 1,
 		    TH_RST|TH_ACK, 0, 0, 0, 1, s->tag, NULL);
 	}
 
 	LIST_REMOVE(s, entry);
 	pf_src_tree_remove_state(s);
 
 	if (pfsync_delete_state_ptr != NULL)
 		pfsync_delete_state_ptr(s);
 
 	STATE_DEC_COUNTERS(s);
 
 	s->timeout = PFTM_UNLINKED;
 
 	PF_HASHROW_UNLOCK(ih);
 
 	pf_detach_state(s);
 	refcount_release(&s->refs);
 
 	return (pf_release_state(s));
 }
 
 void
 pf_free_state(struct pf_state *cur)
 {
 
 	KASSERT(cur->refs == 0, ("%s: %p has refs", __func__, cur));
 	KASSERT(cur->timeout == PFTM_UNLINKED, ("%s: timeout %u", __func__,
 	    cur->timeout));
 
 	pf_normalize_tcp_cleanup(cur);
 	uma_zfree(V_pf_state_z, cur);
 	counter_u64_add(V_pf_status.fcounters[FCNT_STATE_REMOVALS], 1);
 }
 
 /*
  * Called only from pf_purge_thread(), thus serialized.
  */
 static u_int
 pf_purge_expired_states(u_int i, int maxcheck)
 {
 	struct pf_idhash *ih;
 	struct pf_state *s;
 
 	V_pf_status.states = uma_zone_get_cur(V_pf_state_z);
 
 	/*
 	 * Go through hash and unlink states that expire now.
 	 */
 	while (maxcheck > 0) {
 
 		ih = &V_pf_idhash[i];
 relock:
 		PF_HASHROW_LOCK(ih);
 		LIST_FOREACH(s, &ih->states, entry) {
 			if (pf_state_expires(s) <= time_uptime) {
 				V_pf_status.states -=
 				    pf_unlink_state(s, PF_ENTER_LOCKED);
 				goto relock;
 			}
 			s->rule.ptr->rule_flag |= PFRULE_REFS;
 			if (s->nat_rule.ptr != NULL)
 				s->nat_rule.ptr->rule_flag |= PFRULE_REFS;
 			if (s->anchor.ptr != NULL)
 				s->anchor.ptr->rule_flag |= PFRULE_REFS;
 			s->kif->pfik_flags |= PFI_IFLAG_REFS;
 			if (s->rt_kif)
 				s->rt_kif->pfik_flags |= PFI_IFLAG_REFS;
 		}
 		PF_HASHROW_UNLOCK(ih);
 
 		/* Return when we hit end of hash. */
 		if (++i > pf_hashmask) {
 			V_pf_status.states = uma_zone_get_cur(V_pf_state_z);
 			return (0);
 		}
 
 		maxcheck--;
 	}
 
 	V_pf_status.states = uma_zone_get_cur(V_pf_state_z);
 
 	return (i);
 }
 
 static void
 pf_purge_unlinked_rules()
 {
 	struct pf_rulequeue tmpq;
 	struct pf_rule *r, *r1;
 
 	/*
 	 * If we have overloading task pending, then we'd
 	 * better skip purging this time. There is a tiny
 	 * probability that overloading task references
 	 * an already unlinked rule.
 	 */
 	PF_OVERLOADQ_LOCK();
 	if (!SLIST_EMPTY(&V_pf_overloadqueue)) {
 		PF_OVERLOADQ_UNLOCK();
 		return;
 	}
 	PF_OVERLOADQ_UNLOCK();
 
 	/*
 	 * Do naive mark-and-sweep garbage collecting of old rules.
 	 * Reference flag is raised by pf_purge_expired_states()
 	 * and pf_purge_expired_src_nodes().
 	 *
 	 * To avoid LOR between PF_UNLNKDRULES_LOCK/PF_RULES_WLOCK,
 	 * use a temporary queue.
 	 */
 	TAILQ_INIT(&tmpq);
 	PF_UNLNKDRULES_LOCK();
 	TAILQ_FOREACH_SAFE(r, &V_pf_unlinked_rules, entries, r1) {
 		if (!(r->rule_flag & PFRULE_REFS)) {
 			TAILQ_REMOVE(&V_pf_unlinked_rules, r, entries);
 			TAILQ_INSERT_TAIL(&tmpq, r, entries);
 		} else
 			r->rule_flag &= ~PFRULE_REFS;
 	}
 	PF_UNLNKDRULES_UNLOCK();
 
 	if (!TAILQ_EMPTY(&tmpq)) {
 		PF_RULES_WLOCK();
 		TAILQ_FOREACH_SAFE(r, &tmpq, entries, r1) {
 			TAILQ_REMOVE(&tmpq, r, entries);
 			pf_free_rule(r);
 		}
 		PF_RULES_WUNLOCK();
 	}
 }
 
 void
 pf_print_host(struct pf_addr *addr, u_int16_t p, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET: {
 		u_int32_t a = ntohl(addr->addr32[0]);
 		printf("%u.%u.%u.%u", (a>>24)&255, (a>>16)&255,
 		    (a>>8)&255, a&255);
 		if (p) {
 			p = ntohs(p);
 			printf(":%u", p);
 		}
 		break;
 	}
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6: {
 		u_int16_t b;
 		u_int8_t i, curstart, curend, maxstart, maxend;
 		curstart = curend = maxstart = maxend = 255;
 		for (i = 0; i < 8; i++) {
 			if (!addr->addr16[i]) {
 				if (curstart == 255)
 					curstart = i;
 				curend = i;
 			} else {
 				if ((curend - curstart) >
 				    (maxend - maxstart)) {
 					maxstart = curstart;
 					maxend = curend;
 				}
 				curstart = curend = 255;
 			}
 		}
 		if ((curend - curstart) >
 		    (maxend - maxstart)) {
 			maxstart = curstart;
 			maxend = curend;
 		}
 		for (i = 0; i < 8; i++) {
 			if (i >= maxstart && i <= maxend) {
 				if (i == 0)
 					printf(":");
 				if (i == maxend)
 					printf(":");
 			} else {
 				b = ntohs(addr->addr16[i]);
 				printf("%x", b);
 				if (i < 7)
 					printf(":");
 			}
 		}
 		if (p) {
 			p = ntohs(p);
 			printf("[%u]", p);
 		}
 		break;
 	}
 #endif /* INET6 */
 	}
 }
 
 void
 pf_print_state(struct pf_state *s)
 {
 	pf_print_state_parts(s, NULL, NULL);
 }
 
 static void
 pf_print_state_parts(struct pf_state *s,
     struct pf_state_key *skwp, struct pf_state_key *sksp)
 {
 	struct pf_state_key *skw, *sks;
 	u_int8_t proto, dir;
 
 	/* Do our best to fill these, but they're skipped if NULL */
 	skw = skwp ? skwp : (s ? s->key[PF_SK_WIRE] : NULL);
 	sks = sksp ? sksp : (s ? s->key[PF_SK_STACK] : NULL);
 	proto = skw ? skw->proto : (sks ? sks->proto : 0);
 	dir = s ? s->direction : 0;
 
 	switch (proto) {
 	case IPPROTO_IPV4:
 		printf("IPv4");
 		break;
 	case IPPROTO_IPV6:
 		printf("IPv6");
 		break;
 	case IPPROTO_TCP:
 		printf("TCP");
 		break;
 	case IPPROTO_UDP:
 		printf("UDP");
 		break;
 	case IPPROTO_ICMP:
 		printf("ICMP");
 		break;
 	case IPPROTO_ICMPV6:
 		printf("ICMPv6");
 		break;
 	default:
 		printf("%u", proto);
 		break;
 	}
 	switch (dir) {
 	case PF_IN:
 		printf(" in");
 		break;
 	case PF_OUT:
 		printf(" out");
 		break;
 	}
 	if (skw) {
 		printf(" wire: ");
 		pf_print_host(&skw->addr[0], skw->port[0], skw->af);
 		printf(" ");
 		pf_print_host(&skw->addr[1], skw->port[1], skw->af);
 	}
 	if (sks) {
 		printf(" stack: ");
 		if (sks != skw) {
 			pf_print_host(&sks->addr[0], sks->port[0], sks->af);
 			printf(" ");
 			pf_print_host(&sks->addr[1], sks->port[1], sks->af);
 		} else
 			printf("-");
 	}
 	if (s) {
 		if (proto == IPPROTO_TCP) {
 			printf(" [lo=%u high=%u win=%u modulator=%u",
 			    s->src.seqlo, s->src.seqhi,
 			    s->src.max_win, s->src.seqdiff);
 			if (s->src.wscale && s->dst.wscale)
 				printf(" wscale=%u",
 				    s->src.wscale & PF_WSCALE_MASK);
 			printf("]");
 			printf(" [lo=%u high=%u win=%u modulator=%u",
 			    s->dst.seqlo, s->dst.seqhi,
 			    s->dst.max_win, s->dst.seqdiff);
 			if (s->src.wscale && s->dst.wscale)
 				printf(" wscale=%u",
 				s->dst.wscale & PF_WSCALE_MASK);
 			printf("]");
 		}
 		printf(" %u:%u", s->src.state, s->dst.state);
 	}
 }
 
 void
 pf_print_flags(u_int8_t f)
 {
 	if (f)
 		printf(" ");
 	if (f & TH_FIN)
 		printf("F");
 	if (f & TH_SYN)
 		printf("S");
 	if (f & TH_RST)
 		printf("R");
 	if (f & TH_PUSH)
 		printf("P");
 	if (f & TH_ACK)
 		printf("A");
 	if (f & TH_URG)
 		printf("U");
 	if (f & TH_ECE)
 		printf("E");
 	if (f & TH_CWR)
 		printf("W");
 }
 
 #define	PF_SET_SKIP_STEPS(i)					\
 	do {							\
 		while (head[i] != cur) {			\
 			head[i]->skip[i].ptr = cur;		\
 			head[i] = TAILQ_NEXT(head[i], entries);	\
 		}						\
 	} while (0)
 
 void
 pf_calc_skip_steps(struct pf_rulequeue *rules)
 {
 	struct pf_rule *cur, *prev, *head[PF_SKIP_COUNT];
 	int i;
 
 	cur = TAILQ_FIRST(rules);
 	prev = cur;
 	for (i = 0; i < PF_SKIP_COUNT; ++i)
 		head[i] = cur;
 	while (cur != NULL) {
 
 		if (cur->kif != prev->kif || cur->ifnot != prev->ifnot)
 			PF_SET_SKIP_STEPS(PF_SKIP_IFP);
 		if (cur->direction != prev->direction)
 			PF_SET_SKIP_STEPS(PF_SKIP_DIR);
 		if (cur->af != prev->af)
 			PF_SET_SKIP_STEPS(PF_SKIP_AF);
 		if (cur->proto != prev->proto)
 			PF_SET_SKIP_STEPS(PF_SKIP_PROTO);
 		if (cur->src.neg != prev->src.neg ||
 		    pf_addr_wrap_neq(&cur->src.addr, &prev->src.addr))
 			PF_SET_SKIP_STEPS(PF_SKIP_SRC_ADDR);
 		if (cur->src.port[0] != prev->src.port[0] ||
 		    cur->src.port[1] != prev->src.port[1] ||
 		    cur->src.port_op != prev->src.port_op)
 			PF_SET_SKIP_STEPS(PF_SKIP_SRC_PORT);
 		if (cur->dst.neg != prev->dst.neg ||
 		    pf_addr_wrap_neq(&cur->dst.addr, &prev->dst.addr))
 			PF_SET_SKIP_STEPS(PF_SKIP_DST_ADDR);
 		if (cur->dst.port[0] != prev->dst.port[0] ||
 		    cur->dst.port[1] != prev->dst.port[1] ||
 		    cur->dst.port_op != prev->dst.port_op)
 			PF_SET_SKIP_STEPS(PF_SKIP_DST_PORT);
 
 		prev = cur;
 		cur = TAILQ_NEXT(cur, entries);
 	}
 	for (i = 0; i < PF_SKIP_COUNT; ++i)
 		PF_SET_SKIP_STEPS(i);
 }
 
 static int
 pf_addr_wrap_neq(struct pf_addr_wrap *aw1, struct pf_addr_wrap *aw2)
 {
 	if (aw1->type != aw2->type)
 		return (1);
 	switch (aw1->type) {
 	case PF_ADDR_ADDRMASK:
 	case PF_ADDR_RANGE:
 		if (PF_ANEQ(&aw1->v.a.addr, &aw2->v.a.addr, 0))
 			return (1);
 		if (PF_ANEQ(&aw1->v.a.mask, &aw2->v.a.mask, 0))
 			return (1);
 		return (0);
 	case PF_ADDR_DYNIFTL:
 		return (aw1->p.dyn->pfid_kt != aw2->p.dyn->pfid_kt);
 	case PF_ADDR_NOROUTE:
 	case PF_ADDR_URPFFAILED:
 		return (0);
 	case PF_ADDR_TABLE:
 		return (aw1->p.tbl != aw2->p.tbl);
 	default:
 		printf("invalid address type: %d\n", aw1->type);
 		return (1);
 	}
 }
 
 u_int16_t
 pf_cksum_fixup(u_int16_t cksum, u_int16_t old, u_int16_t new, u_int8_t udp)
 {
 	u_int32_t	l;
 
 	if (udp && !cksum)
 		return (0x0000);
 	l = cksum + old - new;
 	l = (l >> 16) + (l & 65535);
 	l = l & 65535;
 	if (udp && !l)
 		return (0xFFFF);
 	return (l);
 }
 
 static void
 pf_change_ap(struct pf_addr *a, u_int16_t *p, u_int16_t *ic, u_int16_t *pc,
     struct pf_addr *an, u_int16_t pn, u_int8_t u, sa_family_t af)
 {
 	struct pf_addr	ao;
 	u_int16_t	po = *p;
 
 	PF_ACPY(&ao, a, af);
 	PF_ACPY(a, an, af);
 
 	*p = pn;
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		*ic = pf_cksum_fixup(pf_cksum_fixup(*ic,
 		    ao.addr16[0], an->addr16[0], 0),
 		    ao.addr16[1], an->addr16[1], 0);
 		*p = pn;
 		*pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc,
 		    ao.addr16[0], an->addr16[0], u),
 		    ao.addr16[1], an->addr16[1], u),
 		    po, pn, u);
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		*pc = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 		    pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 		    pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(*pc,
 		    ao.addr16[0], an->addr16[0], u),
 		    ao.addr16[1], an->addr16[1], u),
 		    ao.addr16[2], an->addr16[2], u),
 		    ao.addr16[3], an->addr16[3], u),
 		    ao.addr16[4], an->addr16[4], u),
 		    ao.addr16[5], an->addr16[5], u),
 		    ao.addr16[6], an->addr16[6], u),
 		    ao.addr16[7], an->addr16[7], u),
 		    po, pn, u);
 		break;
 #endif /* INET6 */
 	}
 }
 
 
 /* Changes a u_int32_t.  Uses a void * so there are no alignment restrictions */
 void
 pf_change_a(void *a, u_int16_t *c, u_int32_t an, u_int8_t u)
 {
 	u_int32_t	ao;
 
 	memcpy(&ao, a, sizeof(ao));
 	memcpy(a, &an, sizeof(u_int32_t));
 	*c = pf_cksum_fixup(pf_cksum_fixup(*c, ao / 65536, an / 65536, u),
 	    ao % 65536, an % 65536, u);
 }
 
 #ifdef INET6
 static void
 pf_change_a6(struct pf_addr *a, u_int16_t *c, struct pf_addr *an, u_int8_t u)
 {
 	struct pf_addr	ao;
 
 	PF_ACPY(&ao, a, AF_INET6);
 	PF_ACPY(a, an, AF_INET6);
 
 	*c = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 	    pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 	    pf_cksum_fixup(pf_cksum_fixup(*c,
 	    ao.addr16[0], an->addr16[0], u),
 	    ao.addr16[1], an->addr16[1], u),
 	    ao.addr16[2], an->addr16[2], u),
 	    ao.addr16[3], an->addr16[3], u),
 	    ao.addr16[4], an->addr16[4], u),
 	    ao.addr16[5], an->addr16[5], u),
 	    ao.addr16[6], an->addr16[6], u),
 	    ao.addr16[7], an->addr16[7], u);
 }
 #endif /* INET6 */
 
 static void
 pf_change_icmp(struct pf_addr *ia, u_int16_t *ip, struct pf_addr *oa,
     struct pf_addr *na, u_int16_t np, u_int16_t *pc, u_int16_t *h2c,
     u_int16_t *ic, u_int16_t *hc, u_int8_t u, sa_family_t af)
 {
 	struct pf_addr	oia, ooa;
 
 	PF_ACPY(&oia, ia, af);
 	if (oa)
 		PF_ACPY(&ooa, oa, af);
 
 	/* Change inner protocol port, fix inner protocol checksum. */
 	if (ip != NULL) {
 		u_int16_t	oip = *ip;
 		u_int32_t	opc;
 
 		if (pc != NULL)
 			opc = *pc;
 		*ip = np;
 		if (pc != NULL)
 			*pc = pf_cksum_fixup(*pc, oip, *ip, u);
 		*ic = pf_cksum_fixup(*ic, oip, *ip, 0);
 		if (pc != NULL)
 			*ic = pf_cksum_fixup(*ic, opc, *pc, 0);
 	}
 	/* Change inner ip address, fix inner ip and icmp checksums. */
 	PF_ACPY(ia, na, af);
 	switch (af) {
 #ifdef INET
 	case AF_INET: {
 		u_int32_t	 oh2c = *h2c;
 
 		*h2c = pf_cksum_fixup(pf_cksum_fixup(*h2c,
 		    oia.addr16[0], ia->addr16[0], 0),
 		    oia.addr16[1], ia->addr16[1], 0);
 		*ic = pf_cksum_fixup(pf_cksum_fixup(*ic,
 		    oia.addr16[0], ia->addr16[0], 0),
 		    oia.addr16[1], ia->addr16[1], 0);
 		*ic = pf_cksum_fixup(*ic, oh2c, *h2c, 0);
 		break;
 	}
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		*ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 		    pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 		    pf_cksum_fixup(pf_cksum_fixup(*ic,
 		    oia.addr16[0], ia->addr16[0], u),
 		    oia.addr16[1], ia->addr16[1], u),
 		    oia.addr16[2], ia->addr16[2], u),
 		    oia.addr16[3], ia->addr16[3], u),
 		    oia.addr16[4], ia->addr16[4], u),
 		    oia.addr16[5], ia->addr16[5], u),
 		    oia.addr16[6], ia->addr16[6], u),
 		    oia.addr16[7], ia->addr16[7], u);
 		break;
 #endif /* INET6 */
 	}
 	/* Outer ip address, fix outer ip or icmpv6 checksum, if necessary. */
 	if (oa) {
 		PF_ACPY(oa, na, af);
 		switch (af) {
 #ifdef INET
 		case AF_INET:
 			*hc = pf_cksum_fixup(pf_cksum_fixup(*hc,
 			    ooa.addr16[0], oa->addr16[0], 0),
 			    ooa.addr16[1], oa->addr16[1], 0);
 			break;
 #endif /* INET */
 #ifdef INET6
 		case AF_INET6:
 			*ic = pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 			    pf_cksum_fixup(pf_cksum_fixup(pf_cksum_fixup(
 			    pf_cksum_fixup(pf_cksum_fixup(*ic,
 			    ooa.addr16[0], oa->addr16[0], u),
 			    ooa.addr16[1], oa->addr16[1], u),
 			    ooa.addr16[2], oa->addr16[2], u),
 			    ooa.addr16[3], oa->addr16[3], u),
 			    ooa.addr16[4], oa->addr16[4], u),
 			    ooa.addr16[5], oa->addr16[5], u),
 			    ooa.addr16[6], oa->addr16[6], u),
 			    ooa.addr16[7], oa->addr16[7], u);
 			break;
 #endif /* INET6 */
 		}
 	}
 }
 
 
 /*
  * Need to modulate the sequence numbers in the TCP SACK option
  * (credits to Krzysztof Pfaff for report and patch)
  */
 static int
 pf_modulate_sack(struct mbuf *m, int off, struct pf_pdesc *pd,
     struct tcphdr *th, struct pf_state_peer *dst)
 {
 	int hlen = (th->th_off << 2) - sizeof(*th), thoptlen = hlen;
 	u_int8_t opts[TCP_MAXOLEN], *opt = opts;
 	int copyback = 0, i, olen;
 	struct sackblk sack;
 
 #define	TCPOLEN_SACKLEN	(TCPOLEN_SACK + 2)
 	if (hlen < TCPOLEN_SACKLEN ||
 	    !pf_pull_hdr(m, off + sizeof(*th), opts, hlen, NULL, NULL, pd->af))
 		return (0);
 
 	while (hlen >= TCPOLEN_SACKLEN) {
 		olen = opt[1];
 		switch (*opt) {
 		case TCPOPT_EOL:	/* FALLTHROUGH */
 		case TCPOPT_NOP:
 			opt++;
 			hlen--;
 			break;
 		case TCPOPT_SACK:
 			if (olen > hlen)
 				olen = hlen;
 			if (olen >= TCPOLEN_SACKLEN) {
 				for (i = 2; i + TCPOLEN_SACK <= olen;
 				    i += TCPOLEN_SACK) {
 					memcpy(&sack, &opt[i], sizeof(sack));
 					pf_change_a(&sack.start, &th->th_sum,
 					    htonl(ntohl(sack.start) -
 					    dst->seqdiff), 0);
 					pf_change_a(&sack.end, &th->th_sum,
 					    htonl(ntohl(sack.end) -
 					    dst->seqdiff), 0);
 					memcpy(&opt[i], &sack, sizeof(sack));
 				}
 				copyback = 1;
 			}
 			/* FALLTHROUGH */
 		default:
 			if (olen < 2)
 				olen = 2;
 			hlen -= olen;
 			opt += olen;
 		}
 	}
 
 	if (copyback)
 		m_copyback(m, off + sizeof(*th), thoptlen, (caddr_t)opts);
 	return (copyback);
 }
 
 static void
 pf_send_tcp(struct mbuf *replyto, const struct pf_rule *r, sa_family_t af,
     const struct pf_addr *saddr, const struct pf_addr *daddr,
     u_int16_t sport, u_int16_t dport, u_int32_t seq, u_int32_t ack,
     u_int8_t flags, u_int16_t win, u_int16_t mss, u_int8_t ttl, int tag,
     u_int16_t rtag, struct ifnet *ifp)
 {
 	struct pf_send_entry *pfse;
 	struct mbuf	*m;
 	int		 len, tlen;
 #ifdef INET
 	struct ip	*h = NULL;
 #endif /* INET */
 #ifdef INET6
 	struct ip6_hdr	*h6 = NULL;
 #endif /* INET6 */
 	struct tcphdr	*th;
 	char		*opt;
 	struct pf_mtag  *pf_mtag;
 
 	len = 0;
 	th = NULL;
 
 	/* maximum segment size tcp option */
 	tlen = sizeof(struct tcphdr);
 	if (mss)
 		tlen += 4;
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		len = sizeof(struct ip) + tlen;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		len = sizeof(struct ip6_hdr) + tlen;
 		break;
 #endif /* INET6 */
 	default:
 		panic("%s: unsupported af %d", __func__, af);
 	}
 
 	/* Allocate outgoing queue entry, mbuf and mbuf tag. */
 	pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT);
 	if (pfse == NULL)
 		return;
 	m = m_gethdr(M_NOWAIT, MT_DATA);
 	if (m == NULL) {
 		free(pfse, M_PFTEMP);
 		return;
 	}
 #ifdef MAC
 	mac_netinet_firewall_send(m);
 #endif
 	if ((pf_mtag = pf_get_mtag(m)) == NULL) {
 		free(pfse, M_PFTEMP);
 		m_freem(m);
 		return;
 	}
 	if (tag)
 		m->m_flags |= M_SKIP_FIREWALL;
 	pf_mtag->tag = rtag;
 
 	if (r != NULL && r->rtableid >= 0)
 		M_SETFIB(m, r->rtableid);
 
 #ifdef ALTQ
 	if (r != NULL && r->qid) {
 		pf_mtag->qid = r->qid;
 
 		/* add hints for ecn */
 		pf_mtag->hdr = mtod(m, struct ip *);
 	}
 #endif /* ALTQ */
 	m->m_data += max_linkhdr;
 	m->m_pkthdr.len = m->m_len = len;
 	m->m_pkthdr.rcvif = NULL;
 	bzero(m->m_data, len);
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		h = mtod(m, struct ip *);
 
 		/* IP header fields included in the TCP checksum */
 		h->ip_p = IPPROTO_TCP;
 		h->ip_len = htons(tlen);
 		h->ip_src.s_addr = saddr->v4.s_addr;
 		h->ip_dst.s_addr = daddr->v4.s_addr;
 
 		th = (struct tcphdr *)((caddr_t)h + sizeof(struct ip));
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		h6 = mtod(m, struct ip6_hdr *);
 
 		/* IP header fields included in the TCP checksum */
 		h6->ip6_nxt = IPPROTO_TCP;
 		h6->ip6_plen = htons(tlen);
 		memcpy(&h6->ip6_src, &saddr->v6, sizeof(struct in6_addr));
 		memcpy(&h6->ip6_dst, &daddr->v6, sizeof(struct in6_addr));
 
 		th = (struct tcphdr *)((caddr_t)h6 + sizeof(struct ip6_hdr));
 		break;
 #endif /* INET6 */
 	}
 
 	/* TCP header */
 	th->th_sport = sport;
 	th->th_dport = dport;
 	th->th_seq = htonl(seq);
 	th->th_ack = htonl(ack);
 	th->th_off = tlen >> 2;
 	th->th_flags = flags;
 	th->th_win = htons(win);
 
 	if (mss) {
 		opt = (char *)(th + 1);
 		opt[0] = TCPOPT_MAXSEG;
 		opt[1] = 4;
 		HTONS(mss);
 		bcopy((caddr_t)&mss, (caddr_t)(opt + 2), 2);
 	}
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		/* TCP checksum */
 		th->th_sum = in_cksum(m, len);
 
 		/* Finish the IP header */
 		h->ip_v = 4;
 		h->ip_hl = sizeof(*h) >> 2;
 		h->ip_tos = IPTOS_LOWDELAY;
 		h->ip_off = htons(V_path_mtu_discovery ? IP_DF : 0);
 		h->ip_len = htons(len);
 		h->ip_ttl = ttl ? ttl : V_ip_defttl;
 		h->ip_sum = 0;
 
 		pfse->pfse_type = PFSE_IP;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		/* TCP checksum */
 		th->th_sum = in6_cksum(m, IPPROTO_TCP,
 		    sizeof(struct ip6_hdr), tlen);
 
 		h6->ip6_vfc |= IPV6_VERSION;
 		h6->ip6_hlim = IPV6_DEFHLIM;
 
 		pfse->pfse_type = PFSE_IP6;
 		break;
 #endif /* INET6 */
 	}
 	pfse->pfse_m = m;
 	pf_send(pfse);
 }
 
 static void
 pf_send_icmp(struct mbuf *m, u_int8_t type, u_int8_t code, sa_family_t af,
     struct pf_rule *r)
 {
 	struct pf_send_entry *pfse;
 	struct mbuf *m0;
 	struct pf_mtag *pf_mtag;
 
 	/* Allocate outgoing queue entry, mbuf and mbuf tag. */
 	pfse = malloc(sizeof(*pfse), M_PFTEMP, M_NOWAIT);
 	if (pfse == NULL)
 		return;
 
 	if ((m0 = m_copypacket(m, M_NOWAIT)) == NULL) {
 		free(pfse, M_PFTEMP);
 		return;
 	}
 
 	if ((pf_mtag = pf_get_mtag(m0)) == NULL) {
 		free(pfse, M_PFTEMP);
 		return;
 	}
 	/* XXX: revisit */
 	m0->m_flags |= M_SKIP_FIREWALL;
 
 	if (r->rtableid >= 0)
 		M_SETFIB(m0, r->rtableid);
 
 #ifdef ALTQ
 	if (r->qid) {
 		pf_mtag->qid = r->qid;
 		/* add hints for ecn */
 		pf_mtag->hdr = mtod(m0, struct ip *);
 	}
 #endif /* ALTQ */
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		pfse->pfse_type = PFSE_ICMP;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		pfse->pfse_type = PFSE_ICMP6;
 		break;
 #endif /* INET6 */
 	}
 	pfse->pfse_m = m0;
 	pfse->icmpopts.type = type;
 	pfse->icmpopts.code = code;
 	pf_send(pfse);
 }
 
 /*
  * Return 1 if the addresses a and b match (with mask m), otherwise return 0.
  * If n is 0, they match if they are equal. If n is != 0, they match if they
  * are different.
  */
 int
 pf_match_addr(u_int8_t n, struct pf_addr *a, struct pf_addr *m,
     struct pf_addr *b, sa_family_t af)
 {
 	int	match = 0;
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		if ((a->addr32[0] & m->addr32[0]) ==
 		    (b->addr32[0] & m->addr32[0]))
 			match++;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		if (((a->addr32[0] & m->addr32[0]) ==
 		     (b->addr32[0] & m->addr32[0])) &&
 		    ((a->addr32[1] & m->addr32[1]) ==
 		     (b->addr32[1] & m->addr32[1])) &&
 		    ((a->addr32[2] & m->addr32[2]) ==
 		     (b->addr32[2] & m->addr32[2])) &&
 		    ((a->addr32[3] & m->addr32[3]) ==
 		     (b->addr32[3] & m->addr32[3])))
 			match++;
 		break;
 #endif /* INET6 */
 	}
 	if (match) {
 		if (n)
 			return (0);
 		else
 			return (1);
 	} else {
 		if (n)
 			return (1);
 		else
 			return (0);
 	}
 }
 
 /*
  * Return 1 if b <= a <= e, otherwise return 0.
  */
 int
 pf_match_addr_range(struct pf_addr *b, struct pf_addr *e,
     struct pf_addr *a, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		if ((a->addr32[0] < b->addr32[0]) ||
 		    (a->addr32[0] > e->addr32[0]))
 			return (0);
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6: {
 		int	i;
 
 		/* check a >= b */
 		for (i = 0; i < 4; ++i)
 			if (a->addr32[i] > b->addr32[i])
 				break;
 			else if (a->addr32[i] < b->addr32[i])
 				return (0);
 		/* check a <= e */
 		for (i = 0; i < 4; ++i)
 			if (a->addr32[i] < e->addr32[i])
 				break;
 			else if (a->addr32[i] > e->addr32[i])
 				return (0);
 		break;
 	}
 #endif /* INET6 */
 	}
 	return (1);
 }
 
 static int
 pf_match(u_int8_t op, u_int32_t a1, u_int32_t a2, u_int32_t p)
 {
 	switch (op) {
 	case PF_OP_IRG:
 		return ((p > a1) && (p < a2));
 	case PF_OP_XRG:
 		return ((p < a1) || (p > a2));
 	case PF_OP_RRG:
 		return ((p >= a1) && (p <= a2));
 	case PF_OP_EQ:
 		return (p == a1);
 	case PF_OP_NE:
 		return (p != a1);
 	case PF_OP_LT:
 		return (p < a1);
 	case PF_OP_LE:
 		return (p <= a1);
 	case PF_OP_GT:
 		return (p > a1);
 	case PF_OP_GE:
 		return (p >= a1);
 	}
 	return (0); /* never reached */
 }
 
 int
 pf_match_port(u_int8_t op, u_int16_t a1, u_int16_t a2, u_int16_t p)
 {
 	NTOHS(a1);
 	NTOHS(a2);
 	NTOHS(p);
 	return (pf_match(op, a1, a2, p));
 }
 
 static int
 pf_match_uid(u_int8_t op, uid_t a1, uid_t a2, uid_t u)
 {
 	if (u == UID_MAX && op != PF_OP_EQ && op != PF_OP_NE)
 		return (0);
 	return (pf_match(op, a1, a2, u));
 }
 
 static int
 pf_match_gid(u_int8_t op, gid_t a1, gid_t a2, gid_t g)
 {
 	if (g == GID_MAX && op != PF_OP_EQ && op != PF_OP_NE)
 		return (0);
 	return (pf_match(op, a1, a2, g));
 }
 
 int
 pf_match_tag(struct mbuf *m, struct pf_rule *r, int *tag, int mtag)
 {
 	if (*tag == -1)
 		*tag = mtag;
 
 	return ((!r->match_tag_not && r->match_tag == *tag) ||
 	    (r->match_tag_not && r->match_tag != *tag));
 }
 
 int
 pf_tag_packet(struct mbuf *m, struct pf_pdesc *pd, int tag)
 {
 
 	KASSERT(tag > 0, ("%s: tag %d", __func__, tag));
 
 	if (pd->pf_mtag == NULL && ((pd->pf_mtag = pf_get_mtag(m)) == NULL))
 		return (ENOMEM);
 
 	pd->pf_mtag->tag = tag;
 
 	return (0);
 }
 
 #define	PF_ANCHOR_STACKSIZE	32
 struct pf_anchor_stackframe {
 	struct pf_ruleset	*rs;
 	struct pf_rule		*r;	/* XXX: + match bit */
 	struct pf_anchor	*child;
 };
 
 /*
  * XXX: We rely on malloc(9) returning pointer-aligned addresses.
  */
 #define	PF_ANCHORSTACK_MATCH	0x00000001
 #define	PF_ANCHORSTACK_MASK	(PF_ANCHORSTACK_MATCH)
 
 #define	PF_ANCHOR_MATCH(f)	((uintptr_t)(f)->r & PF_ANCHORSTACK_MATCH)
 #define	PF_ANCHOR_RULE(f)	(struct pf_rule *)			\
 				((uintptr_t)(f)->r & ~PF_ANCHORSTACK_MASK)
 #define	PF_ANCHOR_SET_MATCH(f)	do { (f)->r = (void *) 			\
 				((uintptr_t)(f)->r | PF_ANCHORSTACK_MATCH);  \
 } while (0)
 
 void
 pf_step_into_anchor(struct pf_anchor_stackframe *stack, int *depth,
     struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a,
     int *match)
 {
 	struct pf_anchor_stackframe	*f;
 
 	PF_RULES_RASSERT();
 
 	if (match)
 		*match = 0;
 	if (*depth >= PF_ANCHOR_STACKSIZE) {
 		printf("%s: anchor stack overflow on %s\n",
 		    __func__, (*r)->anchor->name);
 		*r = TAILQ_NEXT(*r, entries);
 		return;
 	} else if (*depth == 0 && a != NULL)
 		*a = *r;
 	f = stack + (*depth)++;
 	f->rs = *rs;
 	f->r = *r;
 	if ((*r)->anchor_wildcard) {
 		struct pf_anchor_node *parent = &(*r)->anchor->children;
 
 		if ((f->child = RB_MIN(pf_anchor_node, parent)) == NULL) {
 			*r = NULL;
 			return;
 		}
 		*rs = &f->child->ruleset;
 	} else {
 		f->child = NULL;
 		*rs = &(*r)->anchor->ruleset;
 	}
 	*r = TAILQ_FIRST((*rs)->rules[n].active.ptr);
 }
 
 int
 pf_step_out_of_anchor(struct pf_anchor_stackframe *stack, int *depth,
     struct pf_ruleset **rs, int n, struct pf_rule **r, struct pf_rule **a,
     int *match)
 {
 	struct pf_anchor_stackframe	*f;
 	struct pf_rule *fr;
 	int quick = 0;
 
 	PF_RULES_RASSERT();
 
 	do {
 		if (*depth <= 0)
 			break;
 		f = stack + *depth - 1;
 		fr = PF_ANCHOR_RULE(f);
 		if (f->child != NULL) {
 			struct pf_anchor_node *parent;
 
 			/*
 			 * This block traverses through
 			 * a wildcard anchor.
 			 */
 			parent = &fr->anchor->children;
 			if (match != NULL && *match) {
 				/*
 				 * If any of "*" matched, then
 				 * "foo/ *" matched, mark frame
 				 * appropriately.
 				 */
 				PF_ANCHOR_SET_MATCH(f);
 				*match = 0;
 			}
 			f->child = RB_NEXT(pf_anchor_node, parent, f->child);
 			if (f->child != NULL) {
 				*rs = &f->child->ruleset;
 				*r = TAILQ_FIRST((*rs)->rules[n].active.ptr);
 				if (*r == NULL)
 					continue;
 				else
 					break;
 			}
 		}
 		(*depth)--;
 		if (*depth == 0 && a != NULL)
 			*a = NULL;
 		*rs = f->rs;
 		if (PF_ANCHOR_MATCH(f) || (match != NULL && *match))
 			quick = fr->quick;
 		*r = TAILQ_NEXT(fr, entries);
 	} while (*r == NULL);
 
 	return (quick);
 }
 
 #ifdef INET6
 void
 pf_poolmask(struct pf_addr *naddr, struct pf_addr *raddr,
     struct pf_addr *rmask, struct pf_addr *saddr, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) |
 		((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]);
 		break;
 #endif /* INET */
 	case AF_INET6:
 		naddr->addr32[0] = (raddr->addr32[0] & rmask->addr32[0]) |
 		((rmask->addr32[0] ^ 0xffffffff ) & saddr->addr32[0]);
 		naddr->addr32[1] = (raddr->addr32[1] & rmask->addr32[1]) |
 		((rmask->addr32[1] ^ 0xffffffff ) & saddr->addr32[1]);
 		naddr->addr32[2] = (raddr->addr32[2] & rmask->addr32[2]) |
 		((rmask->addr32[2] ^ 0xffffffff ) & saddr->addr32[2]);
 		naddr->addr32[3] = (raddr->addr32[3] & rmask->addr32[3]) |
 		((rmask->addr32[3] ^ 0xffffffff ) & saddr->addr32[3]);
 		break;
 	}
 }
 
 void
 pf_addr_inc(struct pf_addr *addr, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		addr->addr32[0] = htonl(ntohl(addr->addr32[0]) + 1);
 		break;
 #endif /* INET */
 	case AF_INET6:
 		if (addr->addr32[3] == 0xffffffff) {
 			addr->addr32[3] = 0;
 			if (addr->addr32[2] == 0xffffffff) {
 				addr->addr32[2] = 0;
 				if (addr->addr32[1] == 0xffffffff) {
 					addr->addr32[1] = 0;
 					addr->addr32[0] =
 					    htonl(ntohl(addr->addr32[0]) + 1);
 				} else
 					addr->addr32[1] =
 					    htonl(ntohl(addr->addr32[1]) + 1);
 			} else
 				addr->addr32[2] =
 				    htonl(ntohl(addr->addr32[2]) + 1);
 		} else
 			addr->addr32[3] =
 			    htonl(ntohl(addr->addr32[3]) + 1);
 		break;
 	}
 }
 #endif /* INET6 */
 
 int
 pf_socket_lookup(int direction, struct pf_pdesc *pd, struct mbuf *m)
 {
 	struct pf_addr		*saddr, *daddr;
 	u_int16_t		 sport, dport;
 	struct inpcbinfo	*pi;
 	struct inpcb		*inp;
 
 	pd->lookup.uid = UID_MAX;
 	pd->lookup.gid = GID_MAX;
 
 	switch (pd->proto) {
 	case IPPROTO_TCP:
 		if (pd->hdr.tcp == NULL)
 			return (-1);
 		sport = pd->hdr.tcp->th_sport;
 		dport = pd->hdr.tcp->th_dport;
 		pi = &V_tcbinfo;
 		break;
 	case IPPROTO_UDP:
 		if (pd->hdr.udp == NULL)
 			return (-1);
 		sport = pd->hdr.udp->uh_sport;
 		dport = pd->hdr.udp->uh_dport;
 		pi = &V_udbinfo;
 		break;
 	default:
 		return (-1);
 	}
 	if (direction == PF_IN) {
 		saddr = pd->src;
 		daddr = pd->dst;
 	} else {
 		u_int16_t	p;
 
 		p = sport;
 		sport = dport;
 		dport = p;
 		saddr = pd->dst;
 		daddr = pd->src;
 	}
 	switch (pd->af) {
 #ifdef INET
 	case AF_INET:
 		inp = in_pcblookup_mbuf(pi, saddr->v4, sport, daddr->v4,
 		    dport, INPLOOKUP_RLOCKPCB, NULL, m);
 		if (inp == NULL) {
 			inp = in_pcblookup_mbuf(pi, saddr->v4, sport,
 			   daddr->v4, dport, INPLOOKUP_WILDCARD |
 			   INPLOOKUP_RLOCKPCB, NULL, m);
 			if (inp == NULL)
 				return (-1);
 		}
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport, &daddr->v6,
 		    dport, INPLOOKUP_RLOCKPCB, NULL, m);
 		if (inp == NULL) {
 			inp = in6_pcblookup_mbuf(pi, &saddr->v6, sport,
 			    &daddr->v6, dport, INPLOOKUP_WILDCARD |
 			    INPLOOKUP_RLOCKPCB, NULL, m);
 			if (inp == NULL)
 				return (-1);
 		}
 		break;
 #endif /* INET6 */
 
 	default:
 		return (-1);
 	}
 	INP_RLOCK_ASSERT(inp);
 	pd->lookup.uid = inp->inp_cred->cr_uid;
 	pd->lookup.gid = inp->inp_cred->cr_groups[0];
 	INP_RUNLOCK(inp);
 
 	return (1);
 }
 
 static u_int8_t
 pf_get_wscale(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af)
 {
 	int		 hlen;
 	u_int8_t	 hdr[60];
 	u_int8_t	*opt, optlen;
 	u_int8_t	 wscale = 0;
 
 	hlen = th_off << 2;		/* hlen <= sizeof(hdr) */
 	if (hlen <= sizeof(struct tcphdr))
 		return (0);
 	if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af))
 		return (0);
 	opt = hdr + sizeof(struct tcphdr);
 	hlen -= sizeof(struct tcphdr);
 	while (hlen >= 3) {
 		switch (*opt) {
 		case TCPOPT_EOL:
 		case TCPOPT_NOP:
 			++opt;
 			--hlen;
 			break;
 		case TCPOPT_WINDOW:
 			wscale = opt[2];
 			if (wscale > TCP_MAX_WINSHIFT)
 				wscale = TCP_MAX_WINSHIFT;
 			wscale |= PF_WSCALE_FLAG;
 			/* FALLTHROUGH */
 		default:
 			optlen = opt[1];
 			if (optlen < 2)
 				optlen = 2;
 			hlen -= optlen;
 			opt += optlen;
 			break;
 		}
 	}
 	return (wscale);
 }
 
 static u_int16_t
 pf_get_mss(struct mbuf *m, int off, u_int16_t th_off, sa_family_t af)
 {
 	int		 hlen;
 	u_int8_t	 hdr[60];
 	u_int8_t	*opt, optlen;
 	u_int16_t	 mss = V_tcp_mssdflt;
 
 	hlen = th_off << 2;	/* hlen <= sizeof(hdr) */
 	if (hlen <= sizeof(struct tcphdr))
 		return (0);
 	if (!pf_pull_hdr(m, off, hdr, hlen, NULL, NULL, af))
 		return (0);
 	opt = hdr + sizeof(struct tcphdr);
 	hlen -= sizeof(struct tcphdr);
 	while (hlen >= TCPOLEN_MAXSEG) {
 		switch (*opt) {
 		case TCPOPT_EOL:
 		case TCPOPT_NOP:
 			++opt;
 			--hlen;
 			break;
 		case TCPOPT_MAXSEG:
 			bcopy((caddr_t)(opt + 2), (caddr_t)&mss, 2);
 			NTOHS(mss);
 			/* FALLTHROUGH */
 		default:
 			optlen = opt[1];
 			if (optlen < 2)
 				optlen = 2;
 			hlen -= optlen;
 			opt += optlen;
 			break;
 		}
 	}
 	return (mss);
 }
 
 static u_int16_t
 pf_calc_mss(struct pf_addr *addr, sa_family_t af, int rtableid, u_int16_t offer)
 {
 #ifdef INET
 	struct sockaddr_in	*dst;
 	struct route		 ro;
 #endif /* INET */
 #ifdef INET6
 	struct sockaddr_in6	*dst6;
 	struct route_in6	 ro6;
 #endif /* INET6 */
 	struct rtentry		*rt = NULL;
 	int			 hlen = 0;
 	u_int16_t		 mss = V_tcp_mssdflt;
 
 	switch (af) {
 #ifdef INET
 	case AF_INET:
 		hlen = sizeof(struct ip);
 		bzero(&ro, sizeof(ro));
 		dst = (struct sockaddr_in *)&ro.ro_dst;
 		dst->sin_family = AF_INET;
 		dst->sin_len = sizeof(*dst);
 		dst->sin_addr = addr->v4;
 		in_rtalloc_ign(&ro, 0, rtableid);
 		rt = ro.ro_rt;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6:
 		hlen = sizeof(struct ip6_hdr);
 		bzero(&ro6, sizeof(ro6));
 		dst6 = (struct sockaddr_in6 *)&ro6.ro_dst;
 		dst6->sin6_family = AF_INET6;
 		dst6->sin6_len = sizeof(*dst6);
 		dst6->sin6_addr = addr->v6;
 		in6_rtalloc_ign(&ro6, 0, rtableid);
 		rt = ro6.ro_rt;
 		break;
 #endif /* INET6 */
 	}
 
 	if (rt && rt->rt_ifp) {
 		mss = rt->rt_ifp->if_mtu - hlen - sizeof(struct tcphdr);
 		mss = max(V_tcp_mssdflt, mss);
 		RTFREE(rt);
 	}
 	mss = min(mss, offer);
 	mss = max(mss, 64);		/* sanity - at least max opt space */
 	return (mss);
 }
 
 static u_int32_t
 pf_tcp_iss(struct pf_pdesc *pd)
 {
 	MD5_CTX ctx;
 	u_int32_t digest[4];
 
 	if (V_pf_tcp_secret_init == 0) {
 		read_random(&V_pf_tcp_secret, sizeof(V_pf_tcp_secret));
 		MD5Init(&V_pf_tcp_secret_ctx);
 		MD5Update(&V_pf_tcp_secret_ctx, V_pf_tcp_secret,
 		    sizeof(V_pf_tcp_secret));
 		V_pf_tcp_secret_init = 1;
 	}
 
 	ctx = V_pf_tcp_secret_ctx;
 
 	MD5Update(&ctx, (char *)&pd->hdr.tcp->th_sport, sizeof(u_short));
 	MD5Update(&ctx, (char *)&pd->hdr.tcp->th_dport, sizeof(u_short));
 	if (pd->af == AF_INET6) {
 		MD5Update(&ctx, (char *)&pd->src->v6, sizeof(struct in6_addr));
 		MD5Update(&ctx, (char *)&pd->dst->v6, sizeof(struct in6_addr));
 	} else {
 		MD5Update(&ctx, (char *)&pd->src->v4, sizeof(struct in_addr));
 		MD5Update(&ctx, (char *)&pd->dst->v4, sizeof(struct in_addr));
 	}
 	MD5Final((u_char *)digest, &ctx);
 	V_pf_tcp_iss_off += 4096;
 #define	ISN_RANDOM_INCREMENT (4096 - 1)
 	return (digest[0] + (arc4random() & ISN_RANDOM_INCREMENT) +
 	    V_pf_tcp_iss_off);
 #undef	ISN_RANDOM_INCREMENT
 }
 
 static int
 pf_test_rule(struct pf_rule **rm, struct pf_state **sm, int direction,
     struct pfi_kif *kif, struct mbuf *m, int off, struct pf_pdesc *pd,
     struct pf_rule **am, struct pf_ruleset **rsm, struct inpcb *inp)
 {
 	struct pf_rule		*nr = NULL;
 	struct pf_addr		* const saddr = pd->src;
 	struct pf_addr		* const daddr = pd->dst;
 	sa_family_t		 af = pd->af;
 	struct pf_rule		*r, *a = NULL;
 	struct pf_ruleset	*ruleset = NULL;
 	struct pf_src_node	*nsn = NULL;
 	struct tcphdr		*th = pd->hdr.tcp;
 	struct pf_state_key	*sk = NULL, *nk = NULL;
 	u_short			 reason;
 	int			 rewrite = 0, hdrlen = 0;
 	int			 tag = -1, rtableid = -1;
 	int			 asd = 0;
 	int			 match = 0;
 	int			 state_icmp = 0;
 	u_int16_t		 sport = 0, dport = 0;
 	u_int16_t		 bproto_sum = 0, bip_sum = 0;
 	u_int8_t		 icmptype = 0, icmpcode = 0;
 	struct pf_anchor_stackframe	anchor_stack[PF_ANCHOR_STACKSIZE];
 
 	PF_RULES_RASSERT();
 
 	if (inp != NULL) {
 		INP_LOCK_ASSERT(inp);
 		pd->lookup.uid = inp->inp_cred->cr_uid;
 		pd->lookup.gid = inp->inp_cred->cr_groups[0];
 		pd->lookup.done = 1;
 	}
 
 	switch (pd->proto) {
 	case IPPROTO_TCP:
 		sport = th->th_sport;
 		dport = th->th_dport;
 		hdrlen = sizeof(*th);
 		break;
 	case IPPROTO_UDP:
 		sport = pd->hdr.udp->uh_sport;
 		dport = pd->hdr.udp->uh_dport;
 		hdrlen = sizeof(*pd->hdr.udp);
 		break;
 #ifdef INET
 	case IPPROTO_ICMP:
 		if (pd->af != AF_INET)
 			break;
 		sport = dport = pd->hdr.icmp->icmp_id;
 		hdrlen = sizeof(*pd->hdr.icmp);
 		icmptype = pd->hdr.icmp->icmp_type;
 		icmpcode = pd->hdr.icmp->icmp_code;
 
 		if (icmptype == ICMP_UNREACH ||
 		    icmptype == ICMP_SOURCEQUENCH ||
 		    icmptype == ICMP_REDIRECT ||
 		    icmptype == ICMP_TIMXCEED ||
 		    icmptype == ICMP_PARAMPROB)
 			state_icmp++;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case IPPROTO_ICMPV6:
 		if (af != AF_INET6)
 			break;
 		sport = dport = pd->hdr.icmp6->icmp6_id;
 		hdrlen = sizeof(*pd->hdr.icmp6);
 		icmptype = pd->hdr.icmp6->icmp6_type;
 		icmpcode = pd->hdr.icmp6->icmp6_code;
 
 		if (icmptype == ICMP6_DST_UNREACH ||
 		    icmptype == ICMP6_PACKET_TOO_BIG ||
 		    icmptype == ICMP6_TIME_EXCEEDED ||
 		    icmptype == ICMP6_PARAM_PROB)
 			state_icmp++;
 		break;
 #endif /* INET6 */
 	default:
 		sport = dport = hdrlen = 0;
 		break;
 	}
 
 	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
 
 	/* check packet for BINAT/NAT/RDR */
 	if ((nr = pf_get_translation(pd, m, off, direction, kif, &nsn, &sk,
 	    &nk, saddr, daddr, sport, dport, anchor_stack)) != NULL) {
 		KASSERT(sk != NULL, ("%s: null sk", __func__));
 		KASSERT(nk != NULL, ("%s: null nk", __func__));
 
 		if (pd->ip_sum)
 			bip_sum = *pd->ip_sum;
 
 		switch (pd->proto) {
 		case IPPROTO_TCP:
 			bproto_sum = th->th_sum;
 			pd->proto_sum = &th->th_sum;
 
 			if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) ||
 			    nk->port[pd->sidx] != sport) {
 				pf_change_ap(saddr, &th->th_sport, pd->ip_sum,
 				    &th->th_sum, &nk->addr[pd->sidx],
 				    nk->port[pd->sidx], 0, af);
 				pd->sport = &th->th_sport;
 				sport = th->th_sport;
 			}
 
 			if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) ||
 			    nk->port[pd->didx] != dport) {
 				pf_change_ap(daddr, &th->th_dport, pd->ip_sum,
 				    &th->th_sum, &nk->addr[pd->didx],
 				    nk->port[pd->didx], 0, af);
 				dport = th->th_dport;
 				pd->dport = &th->th_dport;
 			}
 			rewrite++;
 			break;
 		case IPPROTO_UDP:
 			bproto_sum = pd->hdr.udp->uh_sum;
 			pd->proto_sum = &pd->hdr.udp->uh_sum;
 
 			if (PF_ANEQ(saddr, &nk->addr[pd->sidx], af) ||
 			    nk->port[pd->sidx] != sport) {
 				pf_change_ap(saddr, &pd->hdr.udp->uh_sport,
 				    pd->ip_sum, &pd->hdr.udp->uh_sum,
 				    &nk->addr[pd->sidx],
 				    nk->port[pd->sidx], 1, af);
 				sport = pd->hdr.udp->uh_sport;
 				pd->sport = &pd->hdr.udp->uh_sport;
 			}
 
 			if (PF_ANEQ(daddr, &nk->addr[pd->didx], af) ||
 			    nk->port[pd->didx] != dport) {
 				pf_change_ap(daddr, &pd->hdr.udp->uh_dport,
 				    pd->ip_sum, &pd->hdr.udp->uh_sum,
 				    &nk->addr[pd->didx],
 				    nk->port[pd->didx], 1, af);
 				dport = pd->hdr.udp->uh_dport;
 				pd->dport = &pd->hdr.udp->uh_dport;
 			}
 			rewrite++;
 			break;
 #ifdef INET
 		case IPPROTO_ICMP:
 			nk->port[0] = nk->port[1];
 			if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET))
 				pf_change_a(&saddr->v4.s_addr, pd->ip_sum,
 				    nk->addr[pd->sidx].v4.s_addr, 0);
 
 			if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET))
 				pf_change_a(&daddr->v4.s_addr, pd->ip_sum,
 				    nk->addr[pd->didx].v4.s_addr, 0);
 
 			if (nk->port[1] != pd->hdr.icmp->icmp_id) {
 				pd->hdr.icmp->icmp_cksum = pf_cksum_fixup(
 				    pd->hdr.icmp->icmp_cksum, sport,
 				    nk->port[1], 0);
 				pd->hdr.icmp->icmp_id = nk->port[1];
 				pd->sport = &pd->hdr.icmp->icmp_id;
 			}
 			m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp);
 			break;
 #endif /* INET */
 #ifdef INET6
 		case IPPROTO_ICMPV6:
 			nk->port[0] = nk->port[1];
 			if (PF_ANEQ(saddr, &nk->addr[pd->sidx], AF_INET6))
 				pf_change_a6(saddr, &pd->hdr.icmp6->icmp6_cksum,
 				    &nk->addr[pd->sidx], 0);
 
 			if (PF_ANEQ(daddr, &nk->addr[pd->didx], AF_INET6))
 				pf_change_a6(daddr, &pd->hdr.icmp6->icmp6_cksum,
 				    &nk->addr[pd->didx], 0);
 			rewrite++;
 			break;
 #endif /* INET6 */
 		default:
 			switch (af) {
 #ifdef INET
 			case AF_INET:
 				if (PF_ANEQ(saddr,
 				    &nk->addr[pd->sidx], AF_INET))
 					pf_change_a(&saddr->v4.s_addr,
 					    pd->ip_sum,
 					    nk->addr[pd->sidx].v4.s_addr, 0);
 
 				if (PF_ANEQ(daddr,
 				    &nk->addr[pd->didx], AF_INET))
 					pf_change_a(&daddr->v4.s_addr,
 					    pd->ip_sum,
 					    nk->addr[pd->didx].v4.s_addr, 0);
 				break;
 #endif /* INET */
 #ifdef INET6
 			case AF_INET6:
 				if (PF_ANEQ(saddr,
 				    &nk->addr[pd->sidx], AF_INET6))
 					PF_ACPY(saddr, &nk->addr[pd->sidx], af);
 
 				if (PF_ANEQ(daddr,
 				    &nk->addr[pd->didx], AF_INET6))
 					PF_ACPY(daddr, &nk->addr[pd->didx], af);
 				break;
 #endif /* INET6 */
 			}
 			break;
 		}
 		if (nr->natpass)
 			r = NULL;
 		pd->nat_rule = nr;
 	}
 
 	while (r != NULL) {
 		r->evaluations++;
 		if (pfi_kif_match(r->kif, kif) == r->ifnot)
 			r = r->skip[PF_SKIP_IFP].ptr;
 		else if (r->direction && r->direction != direction)
 			r = r->skip[PF_SKIP_DIR].ptr;
 		else if (r->af && r->af != af)
 			r = r->skip[PF_SKIP_AF].ptr;
 		else if (r->proto && r->proto != pd->proto)
 			r = r->skip[PF_SKIP_PROTO].ptr;
 		else if (PF_MISMATCHAW(&r->src.addr, saddr, af,
 		    r->src.neg, kif, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_SRC_ADDR].ptr;
 		/* tcp/udp only. port_op always 0 in other cases */
 		else if (r->src.port_op && !pf_match_port(r->src.port_op,
 		    r->src.port[0], r->src.port[1], sport))
 			r = r->skip[PF_SKIP_SRC_PORT].ptr;
 		else if (PF_MISMATCHAW(&r->dst.addr, daddr, af,
 		    r->dst.neg, NULL, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_DST_ADDR].ptr;
 		/* tcp/udp only. port_op always 0 in other cases */
 		else if (r->dst.port_op && !pf_match_port(r->dst.port_op,
 		    r->dst.port[0], r->dst.port[1], dport))
 			r = r->skip[PF_SKIP_DST_PORT].ptr;
 		/* icmp only. type always 0 in other cases */
 		else if (r->type && r->type != icmptype + 1)
 			r = TAILQ_NEXT(r, entries);
 		/* icmp only. type always 0 in other cases */
 		else if (r->code && r->code != icmpcode + 1)
 			r = TAILQ_NEXT(r, entries);
 		else if (r->tos && !(r->tos == pd->tos))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->rule_flag & PFRULE_FRAGMENT)
 			r = TAILQ_NEXT(r, entries);
 		else if (pd->proto == IPPROTO_TCP &&
 		    (r->flagset & th->th_flags) != r->flags)
 			r = TAILQ_NEXT(r, entries);
 		/* tcp/udp only. uid.op always 0 in other cases */
 		else if (r->uid.op && (pd->lookup.done || (pd->lookup.done =
 		    pf_socket_lookup(direction, pd, m), 1)) &&
 		    !pf_match_uid(r->uid.op, r->uid.uid[0], r->uid.uid[1],
 		    pd->lookup.uid))
 			r = TAILQ_NEXT(r, entries);
 		/* tcp/udp only. gid.op always 0 in other cases */
 		else if (r->gid.op && (pd->lookup.done || (pd->lookup.done =
 		    pf_socket_lookup(direction, pd, m), 1)) &&
 		    !pf_match_gid(r->gid.op, r->gid.gid[0], r->gid.gid[1],
 		    pd->lookup.gid))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->prob &&
 		    r->prob <= arc4random())
 			r = TAILQ_NEXT(r, entries);
 		else if (r->match_tag && !pf_match_tag(m, r, &tag,
 		    pd->pf_mtag ? pd->pf_mtag->tag : 0))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->os_fingerprint != PF_OSFP_ANY &&
 		    (pd->proto != IPPROTO_TCP || !pf_osfp_match(
 		    pf_osfp_fingerprint(pd, m, off, th),
 		    r->os_fingerprint)))
 			r = TAILQ_NEXT(r, entries);
 		else {
 			if (r->tag)
 				tag = r->tag;
 			if (r->rtableid >= 0)
 				rtableid = r->rtableid;
 			if (r->anchor == NULL) {
 				match = 1;
 				*rm = r;
 				*am = a;
 				*rsm = ruleset;
 				if ((*rm)->quick)
 					break;
 				r = TAILQ_NEXT(r, entries);
 			} else
 				pf_step_into_anchor(anchor_stack, &asd,
 				    &ruleset, PF_RULESET_FILTER, &r, &a,
 				    &match);
 		}
 		if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd,
 		    &ruleset, PF_RULESET_FILTER, &r, &a, &match))
 			break;
 	}
 	r = *rm;
 	a = *am;
 	ruleset = *rsm;
 
 	REASON_SET(&reason, PFRES_MATCH);
 
 	if (r->log || (nr != NULL && nr->log)) {
 		if (rewrite)
 			m_copyback(m, off, hdrlen, pd->hdr.any);
 		PFLOG_PACKET(kif, m, af, direction, reason, r->log ? r : nr, a,
 		    ruleset, pd, 1);
 	}
 
 	if ((r->action == PF_DROP) &&
 	    ((r->rule_flag & PFRULE_RETURNRST) ||
 	    (r->rule_flag & PFRULE_RETURNICMP) ||
 	    (r->rule_flag & PFRULE_RETURN))) {
 		/* undo NAT changes, if they have taken place */
 		if (nr != NULL) {
 			PF_ACPY(saddr, &sk->addr[pd->sidx], af);
 			PF_ACPY(daddr, &sk->addr[pd->didx], af);
 			if (pd->sport)
 				*pd->sport = sk->port[pd->sidx];
 			if (pd->dport)
 				*pd->dport = sk->port[pd->didx];
 			if (pd->proto_sum)
 				*pd->proto_sum = bproto_sum;
 			if (pd->ip_sum)
 				*pd->ip_sum = bip_sum;
 			m_copyback(m, off, hdrlen, pd->hdr.any);
 		}
 		if (pd->proto == IPPROTO_TCP &&
 		    ((r->rule_flag & PFRULE_RETURNRST) ||
 		    (r->rule_flag & PFRULE_RETURN)) &&
 		    !(th->th_flags & TH_RST)) {
 			u_int32_t	 ack = ntohl(th->th_seq) + pd->p_len;
 			int		 len = 0;
 #ifdef INET
 			struct ip	*h4;
 #endif
 #ifdef INET6
 			struct ip6_hdr	*h6;
 #endif
 
 			switch (af) {
 #ifdef INET
 			case AF_INET:
 				h4 = mtod(m, struct ip *);
 				len = ntohs(h4->ip_len) - off;
 				break;
 #endif
 #ifdef INET6
 			case AF_INET6:
 				h6 = mtod(m, struct ip6_hdr *);
 				len = ntohs(h6->ip6_plen) - (off - sizeof(*h6));
 				break;
 #endif
 			}
 
 			if (pf_check_proto_cksum(m, off, len, IPPROTO_TCP, af))
 				REASON_SET(&reason, PFRES_PROTCKSUM);
 			else {
 				if (th->th_flags & TH_SYN)
 					ack++;
 				if (th->th_flags & TH_FIN)
 					ack++;
 				pf_send_tcp(m, r, af, pd->dst,
 				    pd->src, th->th_dport, th->th_sport,
 				    ntohl(th->th_ack), ack, TH_RST|TH_ACK, 0, 0,
 				    r->return_ttl, 1, 0, kif->pfik_ifp);
 			}
 		} else if (pd->proto != IPPROTO_ICMP && af == AF_INET &&
 		    r->return_icmp)
 			pf_send_icmp(m, r->return_icmp >> 8,
 			    r->return_icmp & 255, af, r);
 		else if (pd->proto != IPPROTO_ICMPV6 && af == AF_INET6 &&
 		    r->return_icmp6)
 			pf_send_icmp(m, r->return_icmp6 >> 8,
 			    r->return_icmp6 & 255, af, r);
 	}
 
 	if (r->action == PF_DROP)
 		goto cleanup;
 
 	if (tag > 0 && pf_tag_packet(m, pd, tag)) {
 		REASON_SET(&reason, PFRES_MEMORY);
 		goto cleanup;
 	}
 	if (rtableid >= 0)
 		M_SETFIB(m, rtableid);
 
 	if (!state_icmp && (r->keep_state || nr != NULL ||
 	    (pd->flags & PFDESC_TCP_NORM))) {
 		int action;
 		action = pf_create_state(r, nr, a, pd, nsn, nk, sk, m, off,
 		    sport, dport, &rewrite, kif, sm, tag, bproto_sum, bip_sum,
 		    hdrlen);
 		if (action != PF_PASS)
 			return (action);
 	} else {
 		if (sk != NULL)
 			uma_zfree(V_pf_state_key_z, sk);
 		if (nk != NULL)
 			uma_zfree(V_pf_state_key_z, nk);
 	}
 
 	/* copy back packet headers if we performed NAT operations */
 	if (rewrite)
 		m_copyback(m, off, hdrlen, pd->hdr.any);
 
 	if (*sm != NULL && !((*sm)->state_flags & PFSTATE_NOSYNC) &&
 	    direction == PF_OUT &&
 	    pfsync_defer_ptr != NULL && pfsync_defer_ptr(*sm, m))
 		/*
 		 * We want the state created, but we don't
 		 * want to send this in case a partner
 		 * firewall has to know about it to allow
 		 * replies through it.
 		 */
 		return (PF_DEFER);
 
 	return (PF_PASS);
 
 cleanup:
 	if (sk != NULL)
 		uma_zfree(V_pf_state_key_z, sk);
 	if (nk != NULL)
 		uma_zfree(V_pf_state_key_z, nk);
 	return (PF_DROP);
 }
 
 static int
 pf_create_state(struct pf_rule *r, struct pf_rule *nr, struct pf_rule *a,
     struct pf_pdesc *pd, struct pf_src_node *nsn, struct pf_state_key *nk,
     struct pf_state_key *sk, struct mbuf *m, int off, u_int16_t sport,
     u_int16_t dport, int *rewrite, struct pfi_kif *kif, struct pf_state **sm,
     int tag, u_int16_t bproto_sum, u_int16_t bip_sum, int hdrlen)
 {
 	struct pf_state		*s = NULL;
 	struct pf_src_node	*sn = NULL;
 	struct tcphdr		*th = pd->hdr.tcp;
 	u_int16_t		 mss = V_tcp_mssdflt;
 	u_short			 reason;
 
 	/* check maximums */
 	if (r->max_states &&
 	    (counter_u64_fetch(r->states_cur) >= r->max_states)) {
 		counter_u64_add(V_pf_status.lcounters[LCNT_STATES], 1);
 		REASON_SET(&reason, PFRES_MAXSTATES);
 		return (PF_DROP);
 	}
 	/* src node for filter rule */
 	if ((r->rule_flag & PFRULE_SRCTRACK ||
 	    r->rpool.opts & PF_POOL_STICKYADDR) &&
 	    pf_insert_src_node(&sn, r, pd->src, pd->af) != 0) {
 		REASON_SET(&reason, PFRES_SRCLIMIT);
 		goto csfailed;
 	}
 	/* src node for translation rule */
 	if (nr != NULL && (nr->rpool.opts & PF_POOL_STICKYADDR) &&
 	    pf_insert_src_node(&nsn, nr, &sk->addr[pd->sidx], pd->af)) {
 		REASON_SET(&reason, PFRES_SRCLIMIT);
 		goto csfailed;
 	}
 	s = uma_zalloc(V_pf_state_z, M_NOWAIT | M_ZERO);
 	if (s == NULL) {
 		REASON_SET(&reason, PFRES_MEMORY);
 		goto csfailed;
 	}
 	s->rule.ptr = r;
 	s->nat_rule.ptr = nr;
 	s->anchor.ptr = a;
 	STATE_INC_COUNTERS(s);
 	if (r->allow_opts)
 		s->state_flags |= PFSTATE_ALLOWOPTS;
 	if (r->rule_flag & PFRULE_STATESLOPPY)
 		s->state_flags |= PFSTATE_SLOPPY;
 	s->log = r->log & PF_LOG_ALL;
 	s->sync_state = PFSYNC_S_NONE;
 	if (nr != NULL)
 		s->log |= nr->log & PF_LOG_ALL;
 	switch (pd->proto) {
 	case IPPROTO_TCP:
 		s->src.seqlo = ntohl(th->th_seq);
 		s->src.seqhi = s->src.seqlo + pd->p_len + 1;
 		if ((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN &&
 		    r->keep_state == PF_STATE_MODULATE) {
 			/* Generate sequence number modulator */
 			if ((s->src.seqdiff = pf_tcp_iss(pd) - s->src.seqlo) ==
 			    0)
 				s->src.seqdiff = 1;
 			pf_change_a(&th->th_seq, &th->th_sum,
 			    htonl(s->src.seqlo + s->src.seqdiff), 0);
 			*rewrite = 1;
 		} else
 			s->src.seqdiff = 0;
 		if (th->th_flags & TH_SYN) {
 			s->src.seqhi++;
 			s->src.wscale = pf_get_wscale(m, off,
 			    th->th_off, pd->af);
 		}
 		s->src.max_win = MAX(ntohs(th->th_win), 1);
 		if (s->src.wscale & PF_WSCALE_MASK) {
 			/* Remove scale factor from initial window */
 			int win = s->src.max_win;
 			win += 1 << (s->src.wscale & PF_WSCALE_MASK);
 			s->src.max_win = (win - 1) >>
 			    (s->src.wscale & PF_WSCALE_MASK);
 		}
 		if (th->th_flags & TH_FIN)
 			s->src.seqhi++;
 		s->dst.seqhi = 1;
 		s->dst.max_win = 1;
 		s->src.state = TCPS_SYN_SENT;
 		s->dst.state = TCPS_CLOSED;
 		s->timeout = PFTM_TCP_FIRST_PACKET;
 		break;
 	case IPPROTO_UDP:
 		s->src.state = PFUDPS_SINGLE;
 		s->dst.state = PFUDPS_NO_TRAFFIC;
 		s->timeout = PFTM_UDP_FIRST_PACKET;
 		break;
 	case IPPROTO_ICMP:
 #ifdef INET6
 	case IPPROTO_ICMPV6:
 #endif
 		s->timeout = PFTM_ICMP_FIRST_PACKET;
 		break;
 	default:
 		s->src.state = PFOTHERS_SINGLE;
 		s->dst.state = PFOTHERS_NO_TRAFFIC;
 		s->timeout = PFTM_OTHER_FIRST_PACKET;
 	}
 
 	if (r->rt && r->rt != PF_FASTROUTE) {
 		if (pf_map_addr(pd->af, r, pd->src, &s->rt_addr, NULL, &sn)) {
 			REASON_SET(&reason, PFRES_MAPFAILED);
 			pf_src_tree_remove_state(s);
 			STATE_DEC_COUNTERS(s);
 			uma_zfree(V_pf_state_z, s);
 			goto csfailed;
 		}
 		s->rt_kif = r->rpool.cur->kif;
 	}
 
 	s->creation = time_uptime;
 	s->expire = time_uptime;
 
 	if (sn != NULL)
 		s->src_node = sn;
 	if (nsn != NULL) {
 		/* XXX We only modify one side for now. */
 		PF_ACPY(&nsn->raddr, &nk->addr[1], pd->af);
 		s->nat_src_node = nsn;
 	}
 	if (pd->proto == IPPROTO_TCP) {
 		if ((pd->flags & PFDESC_TCP_NORM) && pf_normalize_tcp_init(m,
 		    off, pd, th, &s->src, &s->dst)) {
 			REASON_SET(&reason, PFRES_MEMORY);
 			pf_src_tree_remove_state(s);
 			STATE_DEC_COUNTERS(s);
 			uma_zfree(V_pf_state_z, s);
 			return (PF_DROP);
 		}
 		if ((pd->flags & PFDESC_TCP_NORM) && s->src.scrub &&
 		    pf_normalize_tcp_stateful(m, off, pd, &reason, th, s,
 		    &s->src, &s->dst, rewrite)) {
 			/* This really shouldn't happen!!! */
 			DPFPRINTF(PF_DEBUG_URGENT,
 			    ("pf_normalize_tcp_stateful failed on first pkt"));
 			pf_normalize_tcp_cleanup(s);
 			pf_src_tree_remove_state(s);
 			STATE_DEC_COUNTERS(s);
 			uma_zfree(V_pf_state_z, s);
 			return (PF_DROP);
 		}
 	}
 	s->direction = pd->dir;
 
 	/*
 	 * sk/nk may already have been set up by pf_get_translation().
 	 */
 	if (nr == NULL) {
 		KASSERT((sk == NULL && nk == NULL), ("%s: nr %p sk %p, nk %p",
 		    __func__, nr, sk, nk));
 		sk = pf_state_key_setup(pd, pd->src, pd->dst, sport, dport);
 		if (sk == NULL)
 			goto csfailed;
 		nk = sk;
 	} else
 		KASSERT((sk != NULL && nk != NULL), ("%s: nr %p sk %p, nk %p",
 		    __func__, nr, sk, nk));
 
 	/* Swap sk/nk for PF_OUT. */
 	if (pf_state_insert(BOUND_IFACE(r, kif),
 	    (pd->dir == PF_IN) ? sk : nk,
 	    (pd->dir == PF_IN) ? nk : sk, s)) {
 		if (pd->proto == IPPROTO_TCP)
 			pf_normalize_tcp_cleanup(s);
 		REASON_SET(&reason, PFRES_STATEINS);
 		pf_src_tree_remove_state(s);
 		STATE_DEC_COUNTERS(s);
 		uma_zfree(V_pf_state_z, s);
 		return (PF_DROP);
 	} else
 		*sm = s;
 
 	if (tag > 0)
 		s->tag = tag;
 	if (pd->proto == IPPROTO_TCP && (th->th_flags & (TH_SYN|TH_ACK)) ==
 	    TH_SYN && r->keep_state == PF_STATE_SYNPROXY) {
 		s->src.state = PF_TCPS_PROXY_SRC;
 		/* undo NAT changes, if they have taken place */
 		if (nr != NULL) {
 			struct pf_state_key *skt = s->key[PF_SK_WIRE];
 			if (pd->dir == PF_OUT)
 				skt = s->key[PF_SK_STACK];
 			PF_ACPY(pd->src, &skt->addr[pd->sidx], pd->af);
 			PF_ACPY(pd->dst, &skt->addr[pd->didx], pd->af);
 			if (pd->sport)
 				*pd->sport = skt->port[pd->sidx];
 			if (pd->dport)
 				*pd->dport = skt->port[pd->didx];
 			if (pd->proto_sum)
 				*pd->proto_sum = bproto_sum;
 			if (pd->ip_sum)
 				*pd->ip_sum = bip_sum;
 			m_copyback(m, off, hdrlen, pd->hdr.any);
 		}
 		s->src.seqhi = htonl(arc4random());
 		/* Find mss option */
 		int rtid = M_GETFIB(m);
 		mss = pf_get_mss(m, off, th->th_off, pd->af);
 		mss = pf_calc_mss(pd->src, pd->af, rtid, mss);
 		mss = pf_calc_mss(pd->dst, pd->af, rtid, mss);
 		s->src.mss = mss;
 		pf_send_tcp(NULL, r, pd->af, pd->dst, pd->src, th->th_dport,
 		    th->th_sport, s->src.seqhi, ntohl(th->th_seq) + 1,
 		    TH_SYN|TH_ACK, 0, s->src.mss, 0, 1, 0, NULL);
 		REASON_SET(&reason, PFRES_SYNPROXY);
 		return (PF_SYNPROXY_DROP);
 	}
 
 	return (PF_PASS);
 
 csfailed:
 	if (sk != NULL)
 		uma_zfree(V_pf_state_key_z, sk);
 	if (nk != NULL)
 		uma_zfree(V_pf_state_key_z, nk);
 
 	if (sn != NULL) {
 		struct pf_srchash *sh;
 
 		sh = &V_pf_srchash[pf_hashsrc(&sn->addr, sn->af)];
 		PF_HASHROW_LOCK(sh);
 		if (--sn->states == 0 && sn->expire == 0) {
 			pf_unlink_src_node(sn);
 			uma_zfree(V_pf_sources_z, sn);
 			counter_u64_add(
 			    V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1);
 		}
 		PF_HASHROW_UNLOCK(sh);
 	}
 
 	if (nsn != sn && nsn != NULL) {
 		struct pf_srchash *sh;
 
 		sh = &V_pf_srchash[pf_hashsrc(&nsn->addr, nsn->af)];
 		PF_HASHROW_LOCK(sh);
 		if (--nsn->states == 1 && nsn->expire == 0) {
 			pf_unlink_src_node(nsn);
 			uma_zfree(V_pf_sources_z, nsn);
 			counter_u64_add(
 			    V_pf_status.scounters[SCNT_SRC_NODE_REMOVALS], 1);
 		}
 		PF_HASHROW_UNLOCK(sh);
 	}
 
 	return (PF_DROP);
 }
 
 static int
 pf_test_fragment(struct pf_rule **rm, int direction, struct pfi_kif *kif,
     struct mbuf *m, void *h, struct pf_pdesc *pd, struct pf_rule **am,
     struct pf_ruleset **rsm)
 {
 	struct pf_rule		*r, *a = NULL;
 	struct pf_ruleset	*ruleset = NULL;
 	sa_family_t		 af = pd->af;
 	u_short			 reason;
 	int			 tag = -1;
 	int			 asd = 0;
 	int			 match = 0;
 	struct pf_anchor_stackframe	anchor_stack[PF_ANCHOR_STACKSIZE];
 
 	PF_RULES_RASSERT();
 
 	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_FILTER].active.ptr);
 	while (r != NULL) {
 		r->evaluations++;
 		if (pfi_kif_match(r->kif, kif) == r->ifnot)
 			r = r->skip[PF_SKIP_IFP].ptr;
 		else if (r->direction && r->direction != direction)
 			r = r->skip[PF_SKIP_DIR].ptr;
 		else if (r->af && r->af != af)
 			r = r->skip[PF_SKIP_AF].ptr;
 		else if (r->proto && r->proto != pd->proto)
 			r = r->skip[PF_SKIP_PROTO].ptr;
 		else if (PF_MISMATCHAW(&r->src.addr, pd->src, af,
 		    r->src.neg, kif, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_SRC_ADDR].ptr;
 		else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af,
 		    r->dst.neg, NULL, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_DST_ADDR].ptr;
 		else if (r->tos && !(r->tos == pd->tos))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->os_fingerprint != PF_OSFP_ANY)
 			r = TAILQ_NEXT(r, entries);
 		else if (pd->proto == IPPROTO_UDP &&
 		    (r->src.port_op || r->dst.port_op))
 			r = TAILQ_NEXT(r, entries);
 		else if (pd->proto == IPPROTO_TCP &&
 		    (r->src.port_op || r->dst.port_op || r->flagset))
 			r = TAILQ_NEXT(r, entries);
 		else if ((pd->proto == IPPROTO_ICMP ||
 		    pd->proto == IPPROTO_ICMPV6) &&
 		    (r->type || r->code))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->prob && r->prob <=
 		    (arc4random() % (UINT_MAX - 1) + 1))
 			r = TAILQ_NEXT(r, entries);
 		else if (r->match_tag && !pf_match_tag(m, r, &tag,
 		    pd->pf_mtag ? pd->pf_mtag->tag : 0))
 			r = TAILQ_NEXT(r, entries);
 		else {
 			if (r->anchor == NULL) {
 				match = 1;
 				*rm = r;
 				*am = a;
 				*rsm = ruleset;
 				if ((*rm)->quick)
 					break;
 				r = TAILQ_NEXT(r, entries);
 			} else
 				pf_step_into_anchor(anchor_stack, &asd,
 				    &ruleset, PF_RULESET_FILTER, &r, &a,
 				    &match);
 		}
 		if (r == NULL && pf_step_out_of_anchor(anchor_stack, &asd,
 		    &ruleset, PF_RULESET_FILTER, &r, &a, &match))
 			break;
 	}
 	r = *rm;
 	a = *am;
 	ruleset = *rsm;
 
 	REASON_SET(&reason, PFRES_MATCH);
 
 	if (r->log)
 		PFLOG_PACKET(kif, m, af, direction, reason, r, a, ruleset, pd,
 		    1);
 
 	if (r->action != PF_PASS)
 		return (PF_DROP);
 
 	if (tag > 0 && pf_tag_packet(m, pd, tag)) {
 		REASON_SET(&reason, PFRES_MEMORY);
 		return (PF_DROP);
 	}
 
 	return (PF_PASS);
 }
 
 static int
 pf_tcp_track_full(struct pf_state_peer *src, struct pf_state_peer *dst,
 	struct pf_state **state, struct pfi_kif *kif, struct mbuf *m, int off,
 	struct pf_pdesc *pd, u_short *reason, int *copyback)
 {
 	struct tcphdr		*th = pd->hdr.tcp;
 	u_int16_t		 win = ntohs(th->th_win);
 	u_int32_t		 ack, end, seq, orig_seq;
 	u_int8_t		 sws, dws;
 	int			 ackskew;
 
 	if (src->wscale && dst->wscale && !(th->th_flags & TH_SYN)) {
 		sws = src->wscale & PF_WSCALE_MASK;
 		dws = dst->wscale & PF_WSCALE_MASK;
 	} else
 		sws = dws = 0;
 
 	/*
 	 * Sequence tracking algorithm from Guido van Rooij's paper:
 	 *   http://www.madison-gurkha.com/publications/tcp_filtering/
 	 *	tcp_filtering.ps
 	 */
 
 	orig_seq = seq = ntohl(th->th_seq);
 	if (src->seqlo == 0) {
 		/* First packet from this end. Set its state */
 
 		if ((pd->flags & PFDESC_TCP_NORM || dst->scrub) &&
 		    src->scrub == NULL) {
 			if (pf_normalize_tcp_init(m, off, pd, th, src, dst)) {
 				REASON_SET(reason, PFRES_MEMORY);
 				return (PF_DROP);
 			}
 		}
 
 		/* Deferred generation of sequence number modulator */
 		if (dst->seqdiff && !src->seqdiff) {
 			/* use random iss for the TCP server */
 			while ((src->seqdiff = arc4random() - seq) == 0)
 				;
 			ack = ntohl(th->th_ack) - dst->seqdiff;
 			pf_change_a(&th->th_seq, &th->th_sum, htonl(seq +
 			    src->seqdiff), 0);
 			pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0);
 			*copyback = 1;
 		} else {
 			ack = ntohl(th->th_ack);
 		}
 
 		end = seq + pd->p_len;
 		if (th->th_flags & TH_SYN) {
 			end++;
 			if (dst->wscale & PF_WSCALE_FLAG) {
 				src->wscale = pf_get_wscale(m, off, th->th_off,
 				    pd->af);
 				if (src->wscale & PF_WSCALE_FLAG) {
 					/* Remove scale factor from initial
 					 * window */
 					sws = src->wscale & PF_WSCALE_MASK;
 					win = ((u_int32_t)win + (1 << sws) - 1)
 					    >> sws;
 					dws = dst->wscale & PF_WSCALE_MASK;
 				} else {
 					/* fixup other window */
 					dst->max_win <<= dst->wscale &
 					    PF_WSCALE_MASK;
 					/* in case of a retrans SYN|ACK */
 					dst->wscale = 0;
 				}
 			}
 		}
 		if (th->th_flags & TH_FIN)
 			end++;
 
 		src->seqlo = seq;
 		if (src->state < TCPS_SYN_SENT)
 			src->state = TCPS_SYN_SENT;
 
 		/*
 		 * May need to slide the window (seqhi may have been set by
 		 * the crappy stack check or if we picked up the connection
 		 * after establishment)
 		 */
 		if (src->seqhi == 1 ||
 		    SEQ_GEQ(end + MAX(1, dst->max_win << dws), src->seqhi))
 			src->seqhi = end + MAX(1, dst->max_win << dws);
 		if (win > src->max_win)
 			src->max_win = win;
 
 	} else {
 		ack = ntohl(th->th_ack) - dst->seqdiff;
 		if (src->seqdiff) {
 			/* Modulate sequence numbers */
 			pf_change_a(&th->th_seq, &th->th_sum, htonl(seq +
 			    src->seqdiff), 0);
 			pf_change_a(&th->th_ack, &th->th_sum, htonl(ack), 0);
 			*copyback = 1;
 		}
 		end = seq + pd->p_len;
 		if (th->th_flags & TH_SYN)
 			end++;
 		if (th->th_flags & TH_FIN)
 			end++;
 	}
 
 	if ((th->th_flags & TH_ACK) == 0) {
 		/* Let it pass through the ack skew check */
 		ack = dst->seqlo;
 	} else if ((ack == 0 &&
 	    (th->th_flags & (TH_ACK|TH_RST)) == (TH_ACK|TH_RST)) ||
 	    /* broken tcp stacks do not set ack */
 	    (dst->state < TCPS_SYN_SENT)) {
 		/*
 		 * Many stacks (ours included) will set the ACK number in an
 		 * FIN|ACK if the SYN times out -- no sequence to ACK.
 		 */
 		ack = dst->seqlo;
 	}
 
 	if (seq == end) {
 		/* Ease sequencing restrictions on no data packets */
 		seq = src->seqlo;
 		end = seq;
 	}
 
 	ackskew = dst->seqlo - ack;
 
 
 	/*
 	 * Need to demodulate the sequence numbers in any TCP SACK options
 	 * (Selective ACK). We could optionally validate the SACK values
 	 * against the current ACK window, either forwards or backwards, but
 	 * I'm not confident that SACK has been implemented properly
 	 * everywhere. It wouldn't surprise me if several stacks accidentally
 	 * SACK too far backwards of previously ACKed data. There really aren't
 	 * any security implications of bad SACKing unless the target stack
 	 * doesn't validate the option length correctly. Someone trying to
 	 * spoof into a TCP connection won't bother blindly sending SACK
 	 * options anyway.
 	 */
 	if (dst->seqdiff && (th->th_off << 2) > sizeof(struct tcphdr)) {
 		if (pf_modulate_sack(m, off, pd, th, dst))
 			*copyback = 1;
 	}
 
 
 #define	MAXACKWINDOW (0xffff + 1500)	/* 1500 is an arbitrary fudge factor */
 	if (SEQ_GEQ(src->seqhi, end) &&
 	    /* Last octet inside other's window space */
 	    SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) &&
 	    /* Retrans: not more than one window back */
 	    (ackskew >= -MAXACKWINDOW) &&
 	    /* Acking not more than one reassembled fragment backwards */
 	    (ackskew <= (MAXACKWINDOW << sws)) &&
 	    /* Acking not more than one window forward */
 	    ((th->th_flags & TH_RST) == 0 || orig_seq == src->seqlo ||
 	    (orig_seq == src->seqlo + 1) || (orig_seq + 1 == src->seqlo) ||
 	    (pd->flags & PFDESC_IP_REAS) == 0)) {
 	    /* Require an exact/+1 sequence match on resets when possible */
 
 		if (dst->scrub || src->scrub) {
 			if (pf_normalize_tcp_stateful(m, off, pd, reason, th,
 			    *state, src, dst, copyback))
 				return (PF_DROP);
 		}
 
 		/* update max window */
 		if (src->max_win < win)
 			src->max_win = win;
 		/* synchronize sequencing */
 		if (SEQ_GT(end, src->seqlo))
 			src->seqlo = end;
 		/* slide the window of what the other end can send */
 		if (SEQ_GEQ(ack + (win << sws), dst->seqhi))
 			dst->seqhi = ack + MAX((win << sws), 1);
 
 
 		/* update states */
 		if (th->th_flags & TH_SYN)
 			if (src->state < TCPS_SYN_SENT)
 				src->state = TCPS_SYN_SENT;
 		if (th->th_flags & TH_FIN)
 			if (src->state < TCPS_CLOSING)
 				src->state = TCPS_CLOSING;
 		if (th->th_flags & TH_ACK) {
 			if (dst->state == TCPS_SYN_SENT) {
 				dst->state = TCPS_ESTABLISHED;
 				if (src->state == TCPS_ESTABLISHED &&
 				    (*state)->src_node != NULL &&
 				    pf_src_connlimit(state)) {
 					REASON_SET(reason, PFRES_SRCLIMIT);
 					return (PF_DROP);
 				}
 			} else if (dst->state == TCPS_CLOSING)
 				dst->state = TCPS_FIN_WAIT_2;
 		}
 		if (th->th_flags & TH_RST)
 			src->state = dst->state = TCPS_TIME_WAIT;
 
 		/* update expire time */
 		(*state)->expire = time_uptime;
 		if (src->state >= TCPS_FIN_WAIT_2 &&
 		    dst->state >= TCPS_FIN_WAIT_2)
 			(*state)->timeout = PFTM_TCP_CLOSED;
 		else if (src->state >= TCPS_CLOSING &&
 		    dst->state >= TCPS_CLOSING)
 			(*state)->timeout = PFTM_TCP_FIN_WAIT;
 		else if (src->state < TCPS_ESTABLISHED ||
 		    dst->state < TCPS_ESTABLISHED)
 			(*state)->timeout = PFTM_TCP_OPENING;
 		else if (src->state >= TCPS_CLOSING ||
 		    dst->state >= TCPS_CLOSING)
 			(*state)->timeout = PFTM_TCP_CLOSING;
 		else
 			(*state)->timeout = PFTM_TCP_ESTABLISHED;
 
 		/* Fall through to PASS packet */
 
 	} else if ((dst->state < TCPS_SYN_SENT ||
 		dst->state >= TCPS_FIN_WAIT_2 ||
 		src->state >= TCPS_FIN_WAIT_2) &&
 	    SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) &&
 	    /* Within a window forward of the originating packet */
 	    SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW)) {
 	    /* Within a window backward of the originating packet */
 
 		/*
 		 * This currently handles three situations:
 		 *  1) Stupid stacks will shotgun SYNs before their peer
 		 *     replies.
 		 *  2) When PF catches an already established stream (the
 		 *     firewall rebooted, the state table was flushed, routes
 		 *     changed...)
 		 *  3) Packets get funky immediately after the connection
 		 *     closes (this should catch Solaris spurious ACK|FINs
 		 *     that web servers like to spew after a close)
 		 *
 		 * This must be a little more careful than the above code
 		 * since packet floods will also be caught here. We don't
 		 * update the TTL here to mitigate the damage of a packet
 		 * flood and so the same code can handle awkward establishment
 		 * and a loosened connection close.
 		 * In the establishment case, a correct peer response will
 		 * validate the connection, go through the normal state code
 		 * and keep updating the state TTL.
 		 */
 
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			printf("pf: loose state match: ");
 			pf_print_state(*state);
 			pf_print_flags(th->th_flags);
 			printf(" seq=%u (%u) ack=%u len=%u ackskew=%d "
 			    "pkts=%llu:%llu dir=%s,%s\n", seq, orig_seq, ack,
 			    pd->p_len, ackskew, (unsigned long long)(*state)->packets[0],
 			    (unsigned long long)(*state)->packets[1],
 			    pd->dir == PF_IN ? "in" : "out",
 			    pd->dir == (*state)->direction ? "fwd" : "rev");
 		}
 
 		if (dst->scrub || src->scrub) {
 			if (pf_normalize_tcp_stateful(m, off, pd, reason, th,
 			    *state, src, dst, copyback))
 				return (PF_DROP);
 		}
 
 		/* update max window */
 		if (src->max_win < win)
 			src->max_win = win;
 		/* synchronize sequencing */
 		if (SEQ_GT(end, src->seqlo))
 			src->seqlo = end;
 		/* slide the window of what the other end can send */
 		if (SEQ_GEQ(ack + (win << sws), dst->seqhi))
 			dst->seqhi = ack + MAX((win << sws), 1);
 
 		/*
 		 * Cannot set dst->seqhi here since this could be a shotgunned
 		 * SYN and not an already established connection.
 		 */
 
 		if (th->th_flags & TH_FIN)
 			if (src->state < TCPS_CLOSING)
 				src->state = TCPS_CLOSING;
 		if (th->th_flags & TH_RST)
 			src->state = dst->state = TCPS_TIME_WAIT;
 
 		/* Fall through to PASS packet */
 
 	} else {
 		if ((*state)->dst.state == TCPS_SYN_SENT &&
 		    (*state)->src.state == TCPS_SYN_SENT) {
 			/* Send RST for state mismatches during handshake */
 			if (!(th->th_flags & TH_RST))
 				pf_send_tcp(NULL, (*state)->rule.ptr, pd->af,
 				    pd->dst, pd->src, th->th_dport,
 				    th->th_sport, ntohl(th->th_ack), 0,
 				    TH_RST, 0, 0,
 				    (*state)->rule.ptr->return_ttl, 1, 0,
 				    kif->pfik_ifp);
 			src->seqlo = 0;
 			src->seqhi = 1;
 			src->max_win = 1;
 		} else if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			printf("pf: BAD state: ");
 			pf_print_state(*state);
 			pf_print_flags(th->th_flags);
 			printf(" seq=%u (%u) ack=%u len=%u ackskew=%d "
 			    "pkts=%llu:%llu dir=%s,%s\n",
 			    seq, orig_seq, ack, pd->p_len, ackskew,
 			    (unsigned long long)(*state)->packets[0],
 			    (unsigned long long)(*state)->packets[1],
 			    pd->dir == PF_IN ? "in" : "out",
 			    pd->dir == (*state)->direction ? "fwd" : "rev");
 			printf("pf: State failure on: %c %c %c %c | %c %c\n",
 			    SEQ_GEQ(src->seqhi, end) ? ' ' : '1',
 			    SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)) ?
 			    ' ': '2',
 			    (ackskew >= -MAXACKWINDOW) ? ' ' : '3',
 			    (ackskew <= (MAXACKWINDOW << sws)) ? ' ' : '4',
 			    SEQ_GEQ(src->seqhi + MAXACKWINDOW, end) ?' ' :'5',
 			    SEQ_GEQ(seq, src->seqlo - MAXACKWINDOW) ?' ' :'6');
 		}
 		REASON_SET(reason, PFRES_BADSTATE);
 		return (PF_DROP);
 	}
 
 	return (PF_PASS);
 }
 
 static int
 pf_tcp_track_sloppy(struct pf_state_peer *src, struct pf_state_peer *dst,
 	struct pf_state **state, struct pf_pdesc *pd, u_short *reason)
 {
 	struct tcphdr		*th = pd->hdr.tcp;
 
 	if (th->th_flags & TH_SYN)
 		if (src->state < TCPS_SYN_SENT)
 			src->state = TCPS_SYN_SENT;
 	if (th->th_flags & TH_FIN)
 		if (src->state < TCPS_CLOSING)
 			src->state = TCPS_CLOSING;
 	if (th->th_flags & TH_ACK) {
 		if (dst->state == TCPS_SYN_SENT) {
 			dst->state = TCPS_ESTABLISHED;
 			if (src->state == TCPS_ESTABLISHED &&
 			    (*state)->src_node != NULL &&
 			    pf_src_connlimit(state)) {
 				REASON_SET(reason, PFRES_SRCLIMIT);
 				return (PF_DROP);
 			}
 		} else if (dst->state == TCPS_CLOSING) {
 			dst->state = TCPS_FIN_WAIT_2;
 		} else if (src->state == TCPS_SYN_SENT &&
 		    dst->state < TCPS_SYN_SENT) {
 			/*
 			 * Handle a special sloppy case where we only see one
 			 * half of the connection. If there is an ACK after
 			 * the initial SYN without ever seeing a packet from
 			 * the destination, set the connection to established.
 			 */
 			dst->state = src->state = TCPS_ESTABLISHED;
 			if ((*state)->src_node != NULL &&
 			    pf_src_connlimit(state)) {
 				REASON_SET(reason, PFRES_SRCLIMIT);
 				return (PF_DROP);
 			}
 		} else if (src->state == TCPS_CLOSING &&
 		    dst->state == TCPS_ESTABLISHED &&
 		    dst->seqlo == 0) {
 			/*
 			 * Handle the closing of half connections where we
 			 * don't see the full bidirectional FIN/ACK+ACK
 			 * handshake.
 			 */
 			dst->state = TCPS_CLOSING;
 		}
 	}
 	if (th->th_flags & TH_RST)
 		src->state = dst->state = TCPS_TIME_WAIT;
 
 	/* update expire time */
 	(*state)->expire = time_uptime;
 	if (src->state >= TCPS_FIN_WAIT_2 &&
 	    dst->state >= TCPS_FIN_WAIT_2)
 		(*state)->timeout = PFTM_TCP_CLOSED;
 	else if (src->state >= TCPS_CLOSING &&
 	    dst->state >= TCPS_CLOSING)
 		(*state)->timeout = PFTM_TCP_FIN_WAIT;
 	else if (src->state < TCPS_ESTABLISHED ||
 	    dst->state < TCPS_ESTABLISHED)
 		(*state)->timeout = PFTM_TCP_OPENING;
 	else if (src->state >= TCPS_CLOSING ||
 	    dst->state >= TCPS_CLOSING)
 		(*state)->timeout = PFTM_TCP_CLOSING;
 	else
 		(*state)->timeout = PFTM_TCP_ESTABLISHED;
 
 	return (PF_PASS);
 }
 
 static int
 pf_test_state_tcp(struct pf_state **state, int direction, struct pfi_kif *kif,
     struct mbuf *m, int off, void *h, struct pf_pdesc *pd,
     u_short *reason)
 {
 	struct pf_state_key_cmp	 key;
 	struct tcphdr		*th = pd->hdr.tcp;
 	int			 copyback = 0;
 	struct pf_state_peer	*src, *dst;
 	struct pf_state_key	*sk;
 
 	bzero(&key, sizeof(key));
 	key.af = pd->af;
 	key.proto = IPPROTO_TCP;
 	if (direction == PF_IN)	{	/* wire side, straight */
 		PF_ACPY(&key.addr[0], pd->src, key.af);
 		PF_ACPY(&key.addr[1], pd->dst, key.af);
 		key.port[0] = th->th_sport;
 		key.port[1] = th->th_dport;
 	} else {			/* stack side, reverse */
 		PF_ACPY(&key.addr[1], pd->src, key.af);
 		PF_ACPY(&key.addr[0], pd->dst, key.af);
 		key.port[1] = th->th_sport;
 		key.port[0] = th->th_dport;
 	}
 
 	STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 	if (direction == (*state)->direction) {
 		src = &(*state)->src;
 		dst = &(*state)->dst;
 	} else {
 		src = &(*state)->dst;
 		dst = &(*state)->src;
 	}
 
 	sk = (*state)->key[pd->didx];
 
 	if ((*state)->src.state == PF_TCPS_PROXY_SRC) {
 		if (direction != (*state)->direction) {
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_SYNPROXY_DROP);
 		}
 		if (th->th_flags & TH_SYN) {
 			if (ntohl(th->th_seq) != (*state)->src.seqlo) {
 				REASON_SET(reason, PFRES_SYNPROXY);
 				return (PF_DROP);
 			}
 			pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst,
 			    pd->src, th->th_dport, th->th_sport,
 			    (*state)->src.seqhi, ntohl(th->th_seq) + 1,
 			    TH_SYN|TH_ACK, 0, (*state)->src.mss, 0, 1, 0, NULL);
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_SYNPROXY_DROP);
 		} else if (!(th->th_flags & TH_ACK) ||
 		    (ntohl(th->th_ack) != (*state)->src.seqhi + 1) ||
 		    (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) {
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_DROP);
 		} else if ((*state)->src_node != NULL &&
 		    pf_src_connlimit(state)) {
 			REASON_SET(reason, PFRES_SRCLIMIT);
 			return (PF_DROP);
 		} else
 			(*state)->src.state = PF_TCPS_PROXY_DST;
 	}
 	if ((*state)->src.state == PF_TCPS_PROXY_DST) {
 		if (direction == (*state)->direction) {
 			if (((th->th_flags & (TH_SYN|TH_ACK)) != TH_ACK) ||
 			    (ntohl(th->th_ack) != (*state)->src.seqhi + 1) ||
 			    (ntohl(th->th_seq) != (*state)->src.seqlo + 1)) {
 				REASON_SET(reason, PFRES_SYNPROXY);
 				return (PF_DROP);
 			}
 			(*state)->src.max_win = MAX(ntohs(th->th_win), 1);
 			if ((*state)->dst.seqhi == 1)
 				(*state)->dst.seqhi = htonl(arc4random());
 			pf_send_tcp(NULL, (*state)->rule.ptr, pd->af,
 			    &sk->addr[pd->sidx], &sk->addr[pd->didx],
 			    sk->port[pd->sidx], sk->port[pd->didx],
 			    (*state)->dst.seqhi, 0, TH_SYN, 0,
 			    (*state)->src.mss, 0, 0, (*state)->tag, NULL);
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_SYNPROXY_DROP);
 		} else if (((th->th_flags & (TH_SYN|TH_ACK)) !=
 		    (TH_SYN|TH_ACK)) ||
 		    (ntohl(th->th_ack) != (*state)->dst.seqhi + 1)) {
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_DROP);
 		} else {
 			(*state)->dst.max_win = MAX(ntohs(th->th_win), 1);
 			(*state)->dst.seqlo = ntohl(th->th_seq);
 			pf_send_tcp(NULL, (*state)->rule.ptr, pd->af, pd->dst,
 			    pd->src, th->th_dport, th->th_sport,
 			    ntohl(th->th_ack), ntohl(th->th_seq) + 1,
 			    TH_ACK, (*state)->src.max_win, 0, 0, 0,
 			    (*state)->tag, NULL);
 			pf_send_tcp(NULL, (*state)->rule.ptr, pd->af,
 			    &sk->addr[pd->sidx], &sk->addr[pd->didx],
 			    sk->port[pd->sidx], sk->port[pd->didx],
 			    (*state)->src.seqhi + 1, (*state)->src.seqlo + 1,
 			    TH_ACK, (*state)->dst.max_win, 0, 0, 1, 0, NULL);
 			(*state)->src.seqdiff = (*state)->dst.seqhi -
 			    (*state)->src.seqlo;
 			(*state)->dst.seqdiff = (*state)->src.seqhi -
 			    (*state)->dst.seqlo;
 			(*state)->src.seqhi = (*state)->src.seqlo +
 			    (*state)->dst.max_win;
 			(*state)->dst.seqhi = (*state)->dst.seqlo +
 			    (*state)->src.max_win;
 			(*state)->src.wscale = (*state)->dst.wscale = 0;
 			(*state)->src.state = (*state)->dst.state =
 			    TCPS_ESTABLISHED;
 			REASON_SET(reason, PFRES_SYNPROXY);
 			return (PF_SYNPROXY_DROP);
 		}
 	}
 
 	if (((th->th_flags & (TH_SYN|TH_ACK)) == TH_SYN) &&
 	    dst->state >= TCPS_FIN_WAIT_2 &&
 	    src->state >= TCPS_FIN_WAIT_2) {
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			printf("pf: state reuse ");
 			pf_print_state(*state);
 			pf_print_flags(th->th_flags);
 			printf("\n");
 		}
 		/* XXX make sure it's the same direction ?? */
 		(*state)->src.state = (*state)->dst.state = TCPS_CLOSED;
 		pf_unlink_state(*state, PF_ENTER_LOCKED);
 		*state = NULL;
 		return (PF_DROP);
 	}
 
 	if ((*state)->state_flags & PFSTATE_SLOPPY) {
 		if (pf_tcp_track_sloppy(src, dst, state, pd, reason) == PF_DROP)
 			return (PF_DROP);
 	} else {
 		if (pf_tcp_track_full(src, dst, state, kif, m, off, pd, reason,
 		    &copyback) == PF_DROP)
 			return (PF_DROP);
 	}
 
 	/* translate source/destination address, if necessary */
 	if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) {
 		struct pf_state_key *nk = (*state)->key[pd->didx];
 
 		if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) ||
 		    nk->port[pd->sidx] != th->th_sport)
 			pf_change_ap(pd->src, &th->th_sport, pd->ip_sum,
 			    &th->th_sum, &nk->addr[pd->sidx],
 			    nk->port[pd->sidx], 0, pd->af);
 
 		if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) ||
 		    nk->port[pd->didx] != th->th_dport)
 			pf_change_ap(pd->dst, &th->th_dport, pd->ip_sum,
 			    &th->th_sum, &nk->addr[pd->didx],
 			    nk->port[pd->didx], 0, pd->af);
 		copyback = 1;
 	}
 
 	/* Copyback sequence modulation or stateful scrub changes if needed */
 	if (copyback)
 		m_copyback(m, off, sizeof(*th), (caddr_t)th);
 
 	return (PF_PASS);
 }
 
 static int
 pf_test_state_udp(struct pf_state **state, int direction, struct pfi_kif *kif,
     struct mbuf *m, int off, void *h, struct pf_pdesc *pd)
 {
 	struct pf_state_peer	*src, *dst;
 	struct pf_state_key_cmp	 key;
 	struct udphdr		*uh = pd->hdr.udp;
 
 	bzero(&key, sizeof(key));
 	key.af = pd->af;
 	key.proto = IPPROTO_UDP;
 	if (direction == PF_IN)	{	/* wire side, straight */
 		PF_ACPY(&key.addr[0], pd->src, key.af);
 		PF_ACPY(&key.addr[1], pd->dst, key.af);
 		key.port[0] = uh->uh_sport;
 		key.port[1] = uh->uh_dport;
 	} else {			/* stack side, reverse */
 		PF_ACPY(&key.addr[1], pd->src, key.af);
 		PF_ACPY(&key.addr[0], pd->dst, key.af);
 		key.port[1] = uh->uh_sport;
 		key.port[0] = uh->uh_dport;
 	}
 
 	STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 	if (direction == (*state)->direction) {
 		src = &(*state)->src;
 		dst = &(*state)->dst;
 	} else {
 		src = &(*state)->dst;
 		dst = &(*state)->src;
 	}
 
 	/* update states */
 	if (src->state < PFUDPS_SINGLE)
 		src->state = PFUDPS_SINGLE;
 	if (dst->state == PFUDPS_SINGLE)
 		dst->state = PFUDPS_MULTIPLE;
 
 	/* update expire time */
 	(*state)->expire = time_uptime;
 	if (src->state == PFUDPS_MULTIPLE && dst->state == PFUDPS_MULTIPLE)
 		(*state)->timeout = PFTM_UDP_MULTIPLE;
 	else
 		(*state)->timeout = PFTM_UDP_SINGLE;
 
 	/* translate source/destination address, if necessary */
 	if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) {
 		struct pf_state_key *nk = (*state)->key[pd->didx];
 
 		if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], pd->af) ||
 		    nk->port[pd->sidx] != uh->uh_sport)
 			pf_change_ap(pd->src, &uh->uh_sport, pd->ip_sum,
 			    &uh->uh_sum, &nk->addr[pd->sidx],
 			    nk->port[pd->sidx], 1, pd->af);
 
 		if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], pd->af) ||
 		    nk->port[pd->didx] != uh->uh_dport)
 			pf_change_ap(pd->dst, &uh->uh_dport, pd->ip_sum,
 			    &uh->uh_sum, &nk->addr[pd->didx],
 			    nk->port[pd->didx], 1, pd->af);
 		m_copyback(m, off, sizeof(*uh), (caddr_t)uh);
 	}
 
 	return (PF_PASS);
 }
 
 static int
 pf_test_state_icmp(struct pf_state **state, int direction, struct pfi_kif *kif,
     struct mbuf *m, int off, void *h, struct pf_pdesc *pd, u_short *reason)
 {
 	struct pf_addr  *saddr = pd->src, *daddr = pd->dst;
 	u_int16_t	 icmpid = 0, *icmpsum;
 	u_int8_t	 icmptype;
 	int		 state_icmp = 0;
 	struct pf_state_key_cmp key;
 
 	bzero(&key, sizeof(key));
 	switch (pd->proto) {
 #ifdef INET
 	case IPPROTO_ICMP:
 		icmptype = pd->hdr.icmp->icmp_type;
 		icmpid = pd->hdr.icmp->icmp_id;
 		icmpsum = &pd->hdr.icmp->icmp_cksum;
 
 		if (icmptype == ICMP_UNREACH ||
 		    icmptype == ICMP_SOURCEQUENCH ||
 		    icmptype == ICMP_REDIRECT ||
 		    icmptype == ICMP_TIMXCEED ||
 		    icmptype == ICMP_PARAMPROB)
 			state_icmp++;
 		break;
 #endif /* INET */
 #ifdef INET6
 	case IPPROTO_ICMPV6:
 		icmptype = pd->hdr.icmp6->icmp6_type;
 		icmpid = pd->hdr.icmp6->icmp6_id;
 		icmpsum = &pd->hdr.icmp6->icmp6_cksum;
 
 		if (icmptype == ICMP6_DST_UNREACH ||
 		    icmptype == ICMP6_PACKET_TOO_BIG ||
 		    icmptype == ICMP6_TIME_EXCEEDED ||
 		    icmptype == ICMP6_PARAM_PROB)
 			state_icmp++;
 		break;
 #endif /* INET6 */
 	}
 
 	if (!state_icmp) {
 
 		/*
 		 * ICMP query/reply message not related to a TCP/UDP packet.
 		 * Search for an ICMP state.
 		 */
 		key.af = pd->af;
 		key.proto = pd->proto;
 		key.port[0] = key.port[1] = icmpid;
 		if (direction == PF_IN)	{	/* wire side, straight */
 			PF_ACPY(&key.addr[0], pd->src, key.af);
 			PF_ACPY(&key.addr[1], pd->dst, key.af);
 		} else {			/* stack side, reverse */
 			PF_ACPY(&key.addr[1], pd->src, key.af);
 			PF_ACPY(&key.addr[0], pd->dst, key.af);
 		}
 
 		STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 		(*state)->expire = time_uptime;
 		(*state)->timeout = PFTM_ICMP_ERROR_REPLY;
 
 		/* translate source/destination address, if necessary */
 		if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) {
 			struct pf_state_key *nk = (*state)->key[pd->didx];
 
 			switch (pd->af) {
 #ifdef INET
 			case AF_INET:
 				if (PF_ANEQ(pd->src,
 				    &nk->addr[pd->sidx], AF_INET))
 					pf_change_a(&saddr->v4.s_addr,
 					    pd->ip_sum,
 					    nk->addr[pd->sidx].v4.s_addr, 0);
 
 				if (PF_ANEQ(pd->dst, &nk->addr[pd->didx],
 				    AF_INET))
 					pf_change_a(&daddr->v4.s_addr,
 					    pd->ip_sum,
 					    nk->addr[pd->didx].v4.s_addr, 0);
 
 				if (nk->port[0] !=
 				    pd->hdr.icmp->icmp_id) {
 					pd->hdr.icmp->icmp_cksum =
 					    pf_cksum_fixup(
 					    pd->hdr.icmp->icmp_cksum, icmpid,
 					    nk->port[pd->sidx], 0);
 					pd->hdr.icmp->icmp_id =
 					    nk->port[pd->sidx];
 				}
 
 				m_copyback(m, off, ICMP_MINLEN,
 				    (caddr_t )pd->hdr.icmp);
 				break;
 #endif /* INET */
 #ifdef INET6
 			case AF_INET6:
 				if (PF_ANEQ(pd->src,
 				    &nk->addr[pd->sidx], AF_INET6))
 					pf_change_a6(saddr,
 					    &pd->hdr.icmp6->icmp6_cksum,
 					    &nk->addr[pd->sidx], 0);
 
 				if (PF_ANEQ(pd->dst,
 				    &nk->addr[pd->didx], AF_INET6))
 					pf_change_a6(daddr,
 					    &pd->hdr.icmp6->icmp6_cksum,
 					    &nk->addr[pd->didx], 0);
 
 				m_copyback(m, off, sizeof(struct icmp6_hdr),
 				    (caddr_t )pd->hdr.icmp6);
 				break;
 #endif /* INET6 */
 			}
 		}
 		return (PF_PASS);
 
 	} else {
 		/*
 		 * ICMP error message in response to a TCP/UDP packet.
 		 * Extract the inner TCP/UDP header and search for that state.
 		 */
 
 		struct pf_pdesc	pd2;
 		bzero(&pd2, sizeof pd2);
 #ifdef INET
 		struct ip	h2;
 #endif /* INET */
 #ifdef INET6
 		struct ip6_hdr	h2_6;
 		int		terminal = 0;
 #endif /* INET6 */
 		int		ipoff2 = 0;
 		int		off2 = 0;
 
 		pd2.af = pd->af;
 		/* Payload packet is from the opposite direction. */
 		pd2.sidx = (direction == PF_IN) ? 1 : 0;
 		pd2.didx = (direction == PF_IN) ? 0 : 1;
 		switch (pd->af) {
 #ifdef INET
 		case AF_INET:
 			/* offset of h2 in mbuf chain */
 			ipoff2 = off + ICMP_MINLEN;
 
 			if (!pf_pull_hdr(m, ipoff2, &h2, sizeof(h2),
 			    NULL, reason, pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(ip)\n"));
 				return (PF_DROP);
 			}
 			/*
 			 * ICMP error messages don't refer to non-first
 			 * fragments
 			 */
 			if (h2.ip_off & htons(IP_OFFMASK)) {
 				REASON_SET(reason, PFRES_FRAG);
 				return (PF_DROP);
 			}
 
 			/* offset of protocol header that follows h2 */
 			off2 = ipoff2 + (h2.ip_hl << 2);
 
 			pd2.proto = h2.ip_p;
 			pd2.src = (struct pf_addr *)&h2.ip_src;
 			pd2.dst = (struct pf_addr *)&h2.ip_dst;
 			pd2.ip_sum = &h2.ip_sum;
 			break;
 #endif /* INET */
 #ifdef INET6
 		case AF_INET6:
 			ipoff2 = off + sizeof(struct icmp6_hdr);
 
 			if (!pf_pull_hdr(m, ipoff2, &h2_6, sizeof(h2_6),
 			    NULL, reason, pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(ip6)\n"));
 				return (PF_DROP);
 			}
 			pd2.proto = h2_6.ip6_nxt;
 			pd2.src = (struct pf_addr *)&h2_6.ip6_src;
 			pd2.dst = (struct pf_addr *)&h2_6.ip6_dst;
 			pd2.ip_sum = NULL;
 			off2 = ipoff2 + sizeof(h2_6);
 			do {
 				switch (pd2.proto) {
 				case IPPROTO_FRAGMENT:
 					/*
 					 * ICMPv6 error messages for
 					 * non-first fragments
 					 */
 					REASON_SET(reason, PFRES_FRAG);
 					return (PF_DROP);
 				case IPPROTO_AH:
 				case IPPROTO_HOPOPTS:
 				case IPPROTO_ROUTING:
 				case IPPROTO_DSTOPTS: {
 					/* get next header and header length */
 					struct ip6_ext opt6;
 
 					if (!pf_pull_hdr(m, off2, &opt6,
 					    sizeof(opt6), NULL, reason,
 					    pd2.af)) {
 						DPFPRINTF(PF_DEBUG_MISC,
 						    ("pf: ICMPv6 short opt\n"));
 						return (PF_DROP);
 					}
 					if (pd2.proto == IPPROTO_AH)
 						off2 += (opt6.ip6e_len + 2) * 4;
 					else
 						off2 += (opt6.ip6e_len + 1) * 8;
 					pd2.proto = opt6.ip6e_nxt;
 					/* go to the next header */
 					break;
 				}
 				default:
 					terminal++;
 					break;
 				}
 			} while (!terminal);
 			break;
 #endif /* INET6 */
 		}
 
 		switch (pd2.proto) {
 		case IPPROTO_TCP: {
 			struct tcphdr		 th;
 			u_int32_t		 seq;
 			struct pf_state_peer	*src, *dst;
 			u_int8_t		 dws;
 			int			 copyback = 0;
 
 			/*
 			 * Only the first 8 bytes of the TCP header can be
 			 * expected. Don't access any TCP header fields after
 			 * th_seq, an ackskew test is not possible.
 			 */
 			if (!pf_pull_hdr(m, off2, &th, 8, NULL, reason,
 			    pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(tcp)\n"));
 				return (PF_DROP);
 			}
 
 			key.af = pd2.af;
 			key.proto = IPPROTO_TCP;
 			PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af);
 			PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af);
 			key.port[pd2.sidx] = th.th_sport;
 			key.port[pd2.didx] = th.th_dport;
 
 			STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 			if (direction == (*state)->direction) {
 				src = &(*state)->dst;
 				dst = &(*state)->src;
 			} else {
 				src = &(*state)->src;
 				dst = &(*state)->dst;
 			}
 
 			if (src->wscale && dst->wscale)
 				dws = dst->wscale & PF_WSCALE_MASK;
 			else
 				dws = 0;
 
 			/* Demodulate sequence number */
 			seq = ntohl(th.th_seq) - src->seqdiff;
 			if (src->seqdiff) {
 				pf_change_a(&th.th_seq, icmpsum,
 				    htonl(seq), 0);
 				copyback = 1;
 			}
 
 			if (!((*state)->state_flags & PFSTATE_SLOPPY) &&
 			    (!SEQ_GEQ(src->seqhi, seq) ||
 			    !SEQ_GEQ(seq, src->seqlo - (dst->max_win << dws)))) {
 				if (V_pf_status.debug >= PF_DEBUG_MISC) {
 					printf("pf: BAD ICMP %d:%d ",
 					    icmptype, pd->hdr.icmp->icmp_code);
 					pf_print_host(pd->src, 0, pd->af);
 					printf(" -> ");
 					pf_print_host(pd->dst, 0, pd->af);
 					printf(" state: ");
 					pf_print_state(*state);
 					printf(" seq=%u\n", seq);
 				}
 				REASON_SET(reason, PFRES_BADSTATE);
 				return (PF_DROP);
 			} else {
 				if (V_pf_status.debug >= PF_DEBUG_MISC) {
 					printf("pf: OK ICMP %d:%d ",
 					    icmptype, pd->hdr.icmp->icmp_code);
 					pf_print_host(pd->src, 0, pd->af);
 					printf(" -> ");
 					pf_print_host(pd->dst, 0, pd->af);
 					printf(" state: ");
 					pf_print_state(*state);
 					printf(" seq=%u\n", seq);
 				}
 			}
 
 			/* translate source/destination address, if necessary */
 			if ((*state)->key[PF_SK_WIRE] !=
 			    (*state)->key[PF_SK_STACK]) {
 				struct pf_state_key *nk =
 				    (*state)->key[pd->didx];
 
 				if (PF_ANEQ(pd2.src,
 				    &nk->addr[pd2.sidx], pd2.af) ||
 				    nk->port[pd2.sidx] != th.th_sport)
 					pf_change_icmp(pd2.src, &th.th_sport,
 					    daddr, &nk->addr[pd2.sidx],
 					    nk->port[pd2.sidx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, pd2.af);
 
 				if (PF_ANEQ(pd2.dst,
 				    &nk->addr[pd2.didx], pd2.af) ||
 				    nk->port[pd2.didx] != th.th_dport)
 					pf_change_icmp(pd2.dst, &th.th_dport,
 					    NULL, /* XXX Inbound NAT? */
 					    &nk->addr[pd2.didx],
 					    nk->port[pd2.didx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, pd2.af);
 				copyback = 1;
 			}
 
 			if (copyback) {
 				switch (pd2.af) {
 #ifdef INET
 				case AF_INET:
 					m_copyback(m, off, ICMP_MINLEN,
 					    (caddr_t )pd->hdr.icmp);
 					m_copyback(m, ipoff2, sizeof(h2),
 					    (caddr_t )&h2);
 					break;
 #endif /* INET */
 #ifdef INET6
 				case AF_INET6:
 					m_copyback(m, off,
 					    sizeof(struct icmp6_hdr),
 					    (caddr_t )pd->hdr.icmp6);
 					m_copyback(m, ipoff2, sizeof(h2_6),
 					    (caddr_t )&h2_6);
 					break;
 #endif /* INET6 */
 				}
 				m_copyback(m, off2, 8, (caddr_t)&th);
 			}
 
 			return (PF_PASS);
 			break;
 		}
 		case IPPROTO_UDP: {
 			struct udphdr		uh;
 
 			if (!pf_pull_hdr(m, off2, &uh, sizeof(uh),
 			    NULL, reason, pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(udp)\n"));
 				return (PF_DROP);
 			}
 
 			key.af = pd2.af;
 			key.proto = IPPROTO_UDP;
 			PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af);
 			PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af);
 			key.port[pd2.sidx] = uh.uh_sport;
 			key.port[pd2.didx] = uh.uh_dport;
 
 			STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 			/* translate source/destination address, if necessary */
 			if ((*state)->key[PF_SK_WIRE] !=
 			    (*state)->key[PF_SK_STACK]) {
 				struct pf_state_key *nk =
 				    (*state)->key[pd->didx];
 
 				if (PF_ANEQ(pd2.src,
 				    &nk->addr[pd2.sidx], pd2.af) ||
 				    nk->port[pd2.sidx] != uh.uh_sport)
 					pf_change_icmp(pd2.src, &uh.uh_sport,
 					    daddr, &nk->addr[pd2.sidx],
 					    nk->port[pd2.sidx], &uh.uh_sum,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 1, pd2.af);
 
 				if (PF_ANEQ(pd2.dst,
 				    &nk->addr[pd2.didx], pd2.af) ||
 				    nk->port[pd2.didx] != uh.uh_dport)
 					pf_change_icmp(pd2.dst, &uh.uh_dport,
 					    NULL, /* XXX Inbound NAT? */
 					    &nk->addr[pd2.didx],
 					    nk->port[pd2.didx], &uh.uh_sum,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 1, pd2.af);
 
 				switch (pd2.af) {
 #ifdef INET
 				case AF_INET:
 					m_copyback(m, off, ICMP_MINLEN,
 					    (caddr_t )pd->hdr.icmp);
 					m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2);
 					break;
 #endif /* INET */
 #ifdef INET6
 				case AF_INET6:
 					m_copyback(m, off,
 					    sizeof(struct icmp6_hdr),
 					    (caddr_t )pd->hdr.icmp6);
 					m_copyback(m, ipoff2, sizeof(h2_6),
 					    (caddr_t )&h2_6);
 					break;
 #endif /* INET6 */
 				}
 				m_copyback(m, off2, sizeof(uh), (caddr_t)&uh);
 			}
 			return (PF_PASS);
 			break;
 		}
 #ifdef INET
 		case IPPROTO_ICMP: {
 			struct icmp		iih;
 
 			if (!pf_pull_hdr(m, off2, &iih, ICMP_MINLEN,
 			    NULL, reason, pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(icmp)\n"));
 				return (PF_DROP);
 			}
 
 			key.af = pd2.af;
 			key.proto = IPPROTO_ICMP;
 			PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af);
 			PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af);
 			key.port[0] = key.port[1] = iih.icmp_id;
 
 			STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 			/* translate source/destination address, if necessary */
 			if ((*state)->key[PF_SK_WIRE] !=
 			    (*state)->key[PF_SK_STACK]) {
 				struct pf_state_key *nk =
 				    (*state)->key[pd->didx];
 
 				if (PF_ANEQ(pd2.src,
 				    &nk->addr[pd2.sidx], pd2.af) ||
 				    nk->port[pd2.sidx] != iih.icmp_id)
 					pf_change_icmp(pd2.src, &iih.icmp_id,
 					    daddr, &nk->addr[pd2.sidx],
 					    nk->port[pd2.sidx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, AF_INET);
 
 				if (PF_ANEQ(pd2.dst,
 				    &nk->addr[pd2.didx], pd2.af) ||
 				    nk->port[pd2.didx] != iih.icmp_id)
 					pf_change_icmp(pd2.dst, &iih.icmp_id,
 					    NULL, /* XXX Inbound NAT? */
 					    &nk->addr[pd2.didx],
 					    nk->port[pd2.didx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, AF_INET);
 
 				m_copyback(m, off, ICMP_MINLEN, (caddr_t)pd->hdr.icmp);
 				m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2);
 				m_copyback(m, off2, ICMP_MINLEN, (caddr_t)&iih);
 			}
 			return (PF_PASS);
 			break;
 		}
 #endif /* INET */
 #ifdef INET6
 		case IPPROTO_ICMPV6: {
 			struct icmp6_hdr	iih;
 
 			if (!pf_pull_hdr(m, off2, &iih,
 			    sizeof(struct icmp6_hdr), NULL, reason, pd2.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: ICMP error message too short "
 				    "(icmp6)\n"));
 				return (PF_DROP);
 			}
 
 			key.af = pd2.af;
 			key.proto = IPPROTO_ICMPV6;
 			PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af);
 			PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af);
 			key.port[0] = key.port[1] = iih.icmp6_id;
 
 			STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 			/* translate source/destination address, if necessary */
 			if ((*state)->key[PF_SK_WIRE] !=
 			    (*state)->key[PF_SK_STACK]) {
 				struct pf_state_key *nk =
 				    (*state)->key[pd->didx];
 
 				if (PF_ANEQ(pd2.src,
 				    &nk->addr[pd2.sidx], pd2.af) ||
 				    nk->port[pd2.sidx] != iih.icmp6_id)
 					pf_change_icmp(pd2.src, &iih.icmp6_id,
 					    daddr, &nk->addr[pd2.sidx],
 					    nk->port[pd2.sidx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, AF_INET6);
 
 				if (PF_ANEQ(pd2.dst,
 				    &nk->addr[pd2.didx], pd2.af) ||
 				    nk->port[pd2.didx] != iih.icmp6_id)
 					pf_change_icmp(pd2.dst, &iih.icmp6_id,
 					    NULL, /* XXX Inbound NAT? */
 					    &nk->addr[pd2.didx],
 					    nk->port[pd2.didx], NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, AF_INET6);
 
 				m_copyback(m, off, sizeof(struct icmp6_hdr),
 				    (caddr_t)pd->hdr.icmp6);
 				m_copyback(m, ipoff2, sizeof(h2_6), (caddr_t)&h2_6);
 				m_copyback(m, off2, sizeof(struct icmp6_hdr),
 				    (caddr_t)&iih);
 			}
 			return (PF_PASS);
 			break;
 		}
 #endif /* INET6 */
 		default: {
 			key.af = pd2.af;
 			key.proto = pd2.proto;
 			PF_ACPY(&key.addr[pd2.sidx], pd2.src, key.af);
 			PF_ACPY(&key.addr[pd2.didx], pd2.dst, key.af);
 			key.port[0] = key.port[1] = 0;
 
 			STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 			/* translate source/destination address, if necessary */
 			if ((*state)->key[PF_SK_WIRE] !=
 			    (*state)->key[PF_SK_STACK]) {
 				struct pf_state_key *nk =
 				    (*state)->key[pd->didx];
 
 				if (PF_ANEQ(pd2.src,
 				    &nk->addr[pd2.sidx], pd2.af))
 					pf_change_icmp(pd2.src, NULL, daddr,
 					    &nk->addr[pd2.sidx], 0, NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, pd2.af);
 
 				if (PF_ANEQ(pd2.dst,
 				    &nk->addr[pd2.didx], pd2.af))
 					pf_change_icmp(pd2.dst, NULL,
 					    NULL, /* XXX Inbound NAT? */
 					    &nk->addr[pd2.didx], 0, NULL,
 					    pd2.ip_sum, icmpsum,
 					    pd->ip_sum, 0, pd2.af);
 
 				switch (pd2.af) {
 #ifdef INET
 				case AF_INET:
 					m_copyback(m, off, ICMP_MINLEN,
 					    (caddr_t)pd->hdr.icmp);
 					m_copyback(m, ipoff2, sizeof(h2), (caddr_t)&h2);
 					break;
 #endif /* INET */
 #ifdef INET6
 				case AF_INET6:
 					m_copyback(m, off,
 					    sizeof(struct icmp6_hdr),
 					    (caddr_t )pd->hdr.icmp6);
 					m_copyback(m, ipoff2, sizeof(h2_6),
 					    (caddr_t )&h2_6);
 					break;
 #endif /* INET6 */
 				}
 			}
 			return (PF_PASS);
 			break;
 		}
 		}
 	}
 }
 
 static int
 pf_test_state_other(struct pf_state **state, int direction, struct pfi_kif *kif,
     struct mbuf *m, struct pf_pdesc *pd)
 {
 	struct pf_state_peer	*src, *dst;
 	struct pf_state_key_cmp	 key;
 
 	bzero(&key, sizeof(key));
 	key.af = pd->af;
 	key.proto = pd->proto;
 	if (direction == PF_IN) {
 		PF_ACPY(&key.addr[0], pd->src, key.af);
 		PF_ACPY(&key.addr[1], pd->dst, key.af);
 		key.port[0] = key.port[1] = 0;
 	} else {
 		PF_ACPY(&key.addr[1], pd->src, key.af);
 		PF_ACPY(&key.addr[0], pd->dst, key.af);
 		key.port[1] = key.port[0] = 0;
 	}
 
 	STATE_LOOKUP(kif, &key, direction, *state, pd);
 
 	if (direction == (*state)->direction) {
 		src = &(*state)->src;
 		dst = &(*state)->dst;
 	} else {
 		src = &(*state)->dst;
 		dst = &(*state)->src;
 	}
 
 	/* update states */
 	if (src->state < PFOTHERS_SINGLE)
 		src->state = PFOTHERS_SINGLE;
 	if (dst->state == PFOTHERS_SINGLE)
 		dst->state = PFOTHERS_MULTIPLE;
 
 	/* update expire time */
 	(*state)->expire = time_uptime;
 	if (src->state == PFOTHERS_MULTIPLE && dst->state == PFOTHERS_MULTIPLE)
 		(*state)->timeout = PFTM_OTHER_MULTIPLE;
 	else
 		(*state)->timeout = PFTM_OTHER_SINGLE;
 
 	/* translate source/destination address, if necessary */
 	if ((*state)->key[PF_SK_WIRE] != (*state)->key[PF_SK_STACK]) {
 		struct pf_state_key *nk = (*state)->key[pd->didx];
 
 		KASSERT(nk, ("%s: nk is null", __func__));
 		KASSERT(pd, ("%s: pd is null", __func__));
 		KASSERT(pd->src, ("%s: pd->src is null", __func__));
 		KASSERT(pd->dst, ("%s: pd->dst is null", __func__));
 		switch (pd->af) {
 #ifdef INET
 		case AF_INET:
 			if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET))
 				pf_change_a(&pd->src->v4.s_addr,
 				    pd->ip_sum,
 				    nk->addr[pd->sidx].v4.s_addr,
 				    0);
 
 			if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET))
 				pf_change_a(&pd->dst->v4.s_addr,
 				    pd->ip_sum,
 				    nk->addr[pd->didx].v4.s_addr,
 				    0);
 
 			break;
 #endif /* INET */
 #ifdef INET6
 		case AF_INET6:
 			if (PF_ANEQ(pd->src, &nk->addr[pd->sidx], AF_INET6))
 				PF_ACPY(pd->src, &nk->addr[pd->sidx], pd->af);
 
 			if (PF_ANEQ(pd->dst, &nk->addr[pd->didx], AF_INET6))
 				PF_ACPY(pd->dst, &nk->addr[pd->didx], pd->af);
 
 			break;
 #endif /* INET6 */
 		}
 	}
 	return (PF_PASS);
 }
 
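 /*
  * off is measured from the start of the mbuf chain, and the IP or IPv6
  * header must be at the start of that chain.  On success, len bytes are
  * copied into p and p is returned.  On failure, NULL is returned and
  * *actionp and *reasonp are set when those pointers are non-NULL.
  */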
 void *
 pf_pull_hdr(struct mbuf *m, int off, void *p, int len,
     u_short *actionp, u_short *reasonp, sa_family_t af)
 {
 	switch (af) {
 #ifdef INET
 	case AF_INET: {
 		struct ip	*h = mtod(m, struct ip *);
 		u_int16_t	 fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3;
 
 		if (fragoff) {
 			if (fragoff >= len)
 				ACTION_SET(actionp, PF_PASS);
 			else {
 				ACTION_SET(actionp, PF_DROP);
 				REASON_SET(reasonp, PFRES_FRAG);
 			}
 			return (NULL);
 		}
 		if (m->m_pkthdr.len < off + len ||
 		    ntohs(h->ip_len) < off + len) {
 			ACTION_SET(actionp, PF_DROP);
 			REASON_SET(reasonp, PFRES_SHORT);
 			return (NULL);
 		}
 		break;
 	}
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6: {
 		struct ip6_hdr	*h = mtod(m, struct ip6_hdr *);
 
 		if (m->m_pkthdr.len < off + len ||
 		    (ntohs(h->ip6_plen) + sizeof(struct ip6_hdr)) <
 		    (unsigned)(off + len)) {
 			ACTION_SET(actionp, PF_DROP);
 			REASON_SET(reasonp, PFRES_SHORT);
 			return (NULL);
 		}
 		break;
 	}
 #endif /* INET6 */
 	}
 	m_copydata(m, off, len, p);
 	return (p);
 }
 
 int
 pf_routable(struct pf_addr *addr, sa_family_t af, struct pfi_kif *kif,
     int rtableid)
 {
 #ifdef RADIX_MPATH
 	struct radix_node_head	*rnh;
 #endif
 	struct sockaddr_in	*dst;
 	int			 ret = 1;
 	int			 check_mpath;
 #ifdef INET6
 	struct sockaddr_in6	*dst6;
 	struct route_in6	 ro;
 #else
 	struct route		 ro;
 #endif
 	struct radix_node	*rn;
 	struct rtentry		*rt;
 	struct ifnet		*ifp;
 
 	check_mpath = 0;
 #ifdef RADIX_MPATH
 	/* XXX: stick to table 0 for now */
 	rnh = rt_tables_get_rnh(0, af);
 	if (rnh != NULL && rn_mpath_capable(rnh))
 		check_mpath = 1;
 #endif
 	bzero(&ro, sizeof(ro));
 	switch (af) {
 	case AF_INET:
 		dst = satosin(&ro.ro_dst);
 		dst->sin_family = AF_INET;
 		dst->sin_len = sizeof(*dst);
 		dst->sin_addr = addr->v4;
 		break;
 #ifdef INET6
 	case AF_INET6:
 		/*
 		 * Skip check for addresses with embedded interface scope,
 		 * as they would always match anyway.
 		 */
 		if (IN6_IS_SCOPE_EMBED(&addr->v6))
 			goto out;
 		dst6 = (struct sockaddr_in6 *)&ro.ro_dst;
 		dst6->sin6_family = AF_INET6;
 		dst6->sin6_len = sizeof(*dst6);
 		dst6->sin6_addr = addr->v6;
 		break;
 #endif /* INET6 */
 	default:
 		return (0);
 	}
 
 	/* Skip checks for ipsec interfaces */
 	if (kif != NULL && kif->pfik_ifp->if_type == IFT_ENC)
 		goto out;
 
 	switch (af) {
 #ifdef INET6
 	case AF_INET6:
 		in6_rtalloc_ign(&ro, 0, rtableid);
 		break;
 #endif
 #ifdef INET
 	case AF_INET:
 		in_rtalloc_ign((struct route *)&ro, 0, rtableid);
 		break;
 #endif
 	default:
 		rtalloc_ign((struct route *)&ro, 0);	/* No/default FIB. */
 		break;
 	}
 
 	if (ro.ro_rt != NULL) {
 		/* No interface given, this is a no-route check */
 		if (kif == NULL)
 			goto out;
 
 		if (kif->pfik_ifp == NULL) {
 			ret = 0;
 			goto out;
 		}
 
 		/* Perform uRPF check if passed input interface */
 		ret = 0;
 		rn = (struct radix_node *)ro.ro_rt;
 		do {
 			rt = (struct rtentry *)rn;
 			ifp = rt->rt_ifp;
 
 			if (kif->pfik_ifp == ifp)
 				ret = 1;
 #ifdef RADIX_MPATH
 			rn = rn_mpath_next(rn);
 #endif
 		} while (check_mpath == 1 && rn != NULL && ret == 0);
 	} else
 		ret = 0;
 out:
 	if (ro.ro_rt != NULL)
 		RTFREE(ro.ro_rt);
 	return (ret);
 }
 
 #ifdef INET
 static void
 pf_route(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp,
     struct pf_state *s, struct pf_pdesc *pd)
 {
 	struct mbuf		*m0, *m1;
 	struct sockaddr_in	dst;
 	struct ip		*ip;
 	struct ifnet		*ifp = NULL;
 	struct pf_addr		 naddr;
 	struct pf_src_node	*sn = NULL;
 	int			 error = 0;
 	uint16_t		 ip_len, ip_off;
 
 	KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__));
 	KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction",
 	    __func__));
 
 	if ((pd->pf_mtag == NULL &&
 	    ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) ||
 	    pd->pf_mtag->routed++ > 3) {
 		m0 = *m;
 		*m = NULL;
 		goto bad_locked;
 	}
 
 	if (r->rt == PF_DUPTO) {
 		if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) {
 			if (s)
 				PF_STATE_UNLOCK(s);
 			return;
 		}
 	} else {
 		if ((r->rt == PF_REPLYTO) == (r->direction == dir)) {
 			if (s)
 				PF_STATE_UNLOCK(s);
 			return;
 		}
 		m0 = *m;
 	}
 
 	ip = mtod(m0, struct ip *);
 
 	bzero(&dst, sizeof(dst));
 	dst.sin_family = AF_INET;
 	dst.sin_len = sizeof(dst);
 	dst.sin_addr = ip->ip_dst;
 
 	if (r->rt == PF_FASTROUTE) {
 		struct rtentry *rt;
 
 		if (s)
 			PF_STATE_UNLOCK(s);
 		rt = rtalloc1_fib(sintosa(&dst), 0, 0, M_GETFIB(m0));
 		if (rt == NULL) {
 			KMOD_IPSTAT_INC(ips_noroute);
 			error = EHOSTUNREACH;
 			goto bad;
 		}
 
 		ifp = rt->rt_ifp;
 		counter_u64_add(rt->rt_pksent, 1);
 
 		if (rt->rt_flags & RTF_GATEWAY)
 			bcopy(satosin(rt->rt_gateway), &dst, sizeof(dst));
 		RTFREE_LOCKED(rt);
 	} else {
 		if (TAILQ_EMPTY(&r->rpool.list)) {
 			DPFPRINTF(PF_DEBUG_URGENT,
 			    ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__));
 			goto bad_locked;
 		}
 		if (s == NULL) {
 			pf_map_addr(AF_INET, r, (struct pf_addr *)&ip->ip_src,
 			    &naddr, NULL, &sn);
 			if (!PF_AZERO(&naddr, AF_INET))
 				dst.sin_addr.s_addr = naddr.v4.s_addr;
 			ifp = r->rpool.cur->kif ?
 			    r->rpool.cur->kif->pfik_ifp : NULL;
 		} else {
 			if (!PF_AZERO(&s->rt_addr, AF_INET))
 				dst.sin_addr.s_addr =
 				    s->rt_addr.v4.s_addr;
 			ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL;
 			PF_STATE_UNLOCK(s);
 		}
 	}
 	if (ifp == NULL)
 		goto bad;
 
 	if (oifp != ifp) {
 		if (pf_test(PF_OUT, ifp, &m0, NULL) != PF_PASS)
 			goto bad;
 		else if (m0 == NULL)
 			goto done;
 		if (m0->m_len < sizeof(struct ip)) {
 			DPFPRINTF(PF_DEBUG_URGENT,
 			    ("%s: m0->m_len < sizeof(struct ip)\n", __func__));
 			goto bad;
 		}
 		ip = mtod(m0, struct ip *);
 	}
 
 	if (ifp->if_flags & IFF_LOOPBACK)
 		m0->m_flags |= M_SKIP_FIREWALL;
 
 	ip_len = ntohs(ip->ip_len);
 	ip_off = ntohs(ip->ip_off);
 
 	/* Copied from FreeBSD 10.0-CURRENT ip_output. */
 	m0->m_pkthdr.csum_flags |= CSUM_IP;
 	if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA & ~ifp->if_hwassist) {
 		in_delayed_cksum(m0);
 		m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA;
 	}
 #ifdef SCTP
 	if (m0->m_pkthdr.csum_flags & CSUM_SCTP & ~ifp->if_hwassist) {
 		sctp_delayed_cksum(m0, (uint32_t)(ip->ip_hl << 2));
 		m0->m_pkthdr.csum_flags &= ~CSUM_SCTP;
 	}
 #endif
 
 	/*
 	 * If small enough for interface, or the interface will take
 	 * care of the fragmentation for us, we can just send directly.
 	 */
 	if (ip_len <= ifp->if_mtu ||
 	    (m0->m_pkthdr.csum_flags & ifp->if_hwassist & CSUM_TSO) != 0) {
 		ip->ip_sum = 0;
 		if (m0->m_pkthdr.csum_flags & CSUM_IP & ~ifp->if_hwassist) {
 			ip->ip_sum = in_cksum(m0, ip->ip_hl << 2);
 			m0->m_pkthdr.csum_flags &= ~CSUM_IP;
 		}
 		m_clrprotoflags(m0);	/* Avoid confusing lower layers. */
 		error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL);
 		goto done;
 	}
 
 	/* Balk when DF bit is set or the interface didn't support TSO. */
 	if ((ip_off & IP_DF) || (m0->m_pkthdr.csum_flags & CSUM_TSO)) {
 		error = EMSGSIZE;
 		KMOD_IPSTAT_INC(ips_cantfrag);
 		if (r->rt != PF_DUPTO) {
 			icmp_error(m0, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG, 0,
 			    ifp->if_mtu);
 			goto done;
 		} else
 			goto bad;
 	}
 
 	error = ip_fragment(ip, &m0, ifp->if_mtu, ifp->if_hwassist);
 	if (error)
 		goto bad;
 
 	for (; m0; m0 = m1) {
 		m1 = m0->m_nextpkt;
 		m0->m_nextpkt = NULL;
 		if (error == 0) {
 			m_clrprotoflags(m0);
 			error = (*ifp->if_output)(ifp, m0, sintosa(&dst), NULL);
 		} else
 			m_freem(m0);
 	}
 
 	if (error == 0)
 		KMOD_IPSTAT_INC(ips_fragmented);
 
 done:
 	if (r->rt != PF_DUPTO)
 		*m = NULL;
 	return;
 
 bad_locked:
 	if (s)
 		PF_STATE_UNLOCK(s);
 bad:
 	m_freem(m0);
 	goto done;
 }
 #endif /* INET */
 
 #ifdef INET6
 static void
 pf_route6(struct mbuf **m, struct pf_rule *r, int dir, struct ifnet *oifp,
     struct pf_state *s, struct pf_pdesc *pd)
 {
 	struct mbuf		*m0;
 	struct sockaddr_in6	dst;
 	struct ip6_hdr		*ip6;
 	struct ifnet		*ifp = NULL;
 	struct pf_addr		 naddr;
 	struct pf_src_node	*sn = NULL;
 
 	KASSERT(m && *m && r && oifp, ("%s: invalid parameters", __func__));
 	KASSERT(dir == PF_IN || dir == PF_OUT, ("%s: invalid direction",
 	    __func__));
 
 	if ((pd->pf_mtag == NULL &&
 	    ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) ||
 	    pd->pf_mtag->routed++ > 3) {
 		m0 = *m;
 		*m = NULL;
 		goto bad_locked;
 	}
 
 	if (r->rt == PF_DUPTO) {
 		if ((m0 = m_dup(*m, M_NOWAIT)) == NULL) {
 			if (s)
 				PF_STATE_UNLOCK(s);
 			return;
 		}
 	} else {
 		if ((r->rt == PF_REPLYTO) == (r->direction == dir)) {
 			if (s)
 				PF_STATE_UNLOCK(s);
 			return;
 		}
 		m0 = *m;
 	}
 
 	ip6 = mtod(m0, struct ip6_hdr *);
 
 	bzero(&dst, sizeof(dst));
 	dst.sin6_family = AF_INET6;
 	dst.sin6_len = sizeof(dst);
 	dst.sin6_addr = ip6->ip6_dst;
 
 	/* Cheat. XXX why only in the v6 case??? */
 	if (r->rt == PF_FASTROUTE) {
 		if (s)
 			PF_STATE_UNLOCK(s);
 		m0->m_flags |= M_SKIP_FIREWALL;
 		ip6_output(m0, NULL, NULL, 0, NULL, NULL, NULL);
 		*m = NULL;
 		return;
 	}
 
 	if (TAILQ_EMPTY(&r->rpool.list)) {
 		DPFPRINTF(PF_DEBUG_URGENT,
 		    ("%s: TAILQ_EMPTY(&r->rpool.list)\n", __func__));
 		goto bad_locked;
 	}
 	if (s == NULL) {
 		pf_map_addr(AF_INET6, r, (struct pf_addr *)&ip6->ip6_src,
 		    &naddr, NULL, &sn);
 		if (!PF_AZERO(&naddr, AF_INET6))
 			PF_ACPY((struct pf_addr *)&dst.sin6_addr,
 			    &naddr, AF_INET6);
 		ifp = r->rpool.cur->kif ? r->rpool.cur->kif->pfik_ifp : NULL;
 	} else {
 		if (!PF_AZERO(&s->rt_addr, AF_INET6))
 			PF_ACPY((struct pf_addr *)&dst.sin6_addr,
 			    &s->rt_addr, AF_INET6);
 		ifp = s->rt_kif ? s->rt_kif->pfik_ifp : NULL;
 	}
 
 	if (s)
 		PF_STATE_UNLOCK(s);
 
 	if (ifp == NULL)
 		goto bad;
 
 	if (oifp != ifp) {
 		if (pf_test6(PF_FWD, ifp, &m0, NULL) != PF_PASS)
 			goto bad;
 		else if (m0 == NULL)
 			goto done;
 		if (m0->m_len < sizeof(struct ip6_hdr)) {
 			DPFPRINTF(PF_DEBUG_URGENT,
 			    ("%s: m0->m_len < sizeof(struct ip6_hdr)\n",
 			    __func__));
 			goto bad;
 		}
 		ip6 = mtod(m0, struct ip6_hdr *);
 	}
 
 	if (ifp->if_flags & IFF_LOOPBACK)
 		m0->m_flags |= M_SKIP_FIREWALL;
 
 	/*
 	 * If the packet is too large for the outgoing interface,
 	 * send back an icmp6 error.
 	 */
 	if (IN6_IS_SCOPE_EMBED(&dst.sin6_addr))
 		dst.sin6_addr.s6_addr16[1] = htons(ifp->if_index);
 	if ((u_long)m0->m_pkthdr.len <= ifp->if_mtu)
 		nd6_output(ifp, ifp, m0, &dst, NULL);
 	else {
 		in6_ifstat_inc(ifp, ifs6_in_toobig);
 		if (r->rt != PF_DUPTO)
 			icmp6_error(m0, ICMP6_PACKET_TOO_BIG, 0, ifp->if_mtu);
 		else
 			goto bad;
 	}
 
 done:
 	if (r->rt != PF_DUPTO)
 		*m = NULL;
 	return;
 
 bad_locked:
 	if (s)
 		PF_STATE_UNLOCK(s);
 bad:
 	m_freem(m0);
 	goto done;
 }
 #endif /* INET6 */
 
 /*
  * FreeBSD supports cksum offloads for the following drivers.
  *  em(4), fxp(4), ixgb(4), lge(4), ndis(4), nge(4), re(4),
  *   ti(4), txp(4), xl(4)
  *
  * CSUM_DATA_VALID | CSUM_PSEUDO_HDR :
  *  network driver performed cksum including pseudo header, need to verify
  *   csum_data
  * CSUM_DATA_VALID :
  *  network driver performed cksum, needs additional pseudo header
  *  cksum computation with partial csum_data (i.e. lack of H/W support for
  *  pseudo header, for instance hme(4), sk(4) and possibly gem(4))
  *
  * After validating the cksum of the packet, set both the CSUM_DATA_VALID
  * and CSUM_PSEUDO_HDR flags in order to avoid recomputation of the cksum
  * in the upper TCP/UDP layer.
  * Also, set csum_data to 0xffff to force cksum validation.
  */
 static int
 pf_check_proto_cksum(struct mbuf *m, int off, int len, u_int8_t p, sa_family_t af)
 {
 	u_int16_t sum = 0;
 	int hw_assist = 0;
 	struct ip *ip;
 
 	if (off < sizeof(struct ip) || len < sizeof(struct udphdr))
 		return (1);
 	if (m->m_pkthdr.len < off + len)
 		return (1);
 
 	switch (p) {
 	case IPPROTO_TCP:
 		if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) {
 			if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) {
 				sum = m->m_pkthdr.csum_data;
 			} else {
 				ip = mtod(m, struct ip *);
 				sum = in_pseudo(ip->ip_src.s_addr,
 				    ip->ip_dst.s_addr, htonl((u_short)len +
 				    m->m_pkthdr.csum_data + IPPROTO_TCP));
 			}
 			sum ^= 0xffff;
 			++hw_assist;
 		}
 		break;
 	case IPPROTO_UDP:
 		if (m->m_pkthdr.csum_flags & CSUM_DATA_VALID) {
 			if (m->m_pkthdr.csum_flags & CSUM_PSEUDO_HDR) {
 				sum = m->m_pkthdr.csum_data;
 			} else {
 				ip = mtod(m, struct ip *);
 				sum = in_pseudo(ip->ip_src.s_addr,
 				    ip->ip_dst.s_addr, htonl((u_short)len +
 				    m->m_pkthdr.csum_data + IPPROTO_UDP));
 			}
 			sum ^= 0xffff;
 			++hw_assist;
 		}
 		break;
 	case IPPROTO_ICMP:
 #ifdef INET6
 	case IPPROTO_ICMPV6:
 #endif /* INET6 */
 		break;
 	default:
 		return (1);
 	}
 
 	if (!hw_assist) {
 		switch (af) {
 		case AF_INET:
 			if (p == IPPROTO_ICMP) {
 				if (m->m_len < off)
 					return (1);
 				m->m_data += off;
 				m->m_len -= off;
 				sum = in_cksum(m, len);
 				m->m_data -= off;
 				m->m_len += off;
 			} else {
 				if (m->m_len < sizeof(struct ip))
 					return (1);
 				sum = in4_cksum(m, p, off, len);
 			}
 			break;
 #ifdef INET6
 		case AF_INET6:
 			if (m->m_len < sizeof(struct ip6_hdr))
 				return (1);
 			sum = in6_cksum(m, p, off, len);
 			break;
 #endif /* INET6 */
 		default:
 			return (1);
 		}
 	}
 	if (sum) {
 		switch (p) {
 		case IPPROTO_TCP:
 		    {
 			KMOD_TCPSTAT_INC(tcps_rcvbadsum);
 			break;
 		    }
 		case IPPROTO_UDP:
 		    {
 			KMOD_UDPSTAT_INC(udps_badsum);
 			break;
 		    }
 #ifdef INET
 		case IPPROTO_ICMP:
 		    {
 			KMOD_ICMPSTAT_INC(icps_checksum);
 			break;
 		    }
 #endif
 #ifdef INET6
 		case IPPROTO_ICMPV6:
 		    {
 			KMOD_ICMP6STAT_INC(icp6s_checksum);
 			break;
 		    }
 #endif /* INET6 */
 		}
 		return (1);
 	} else {
 		if (p == IPPROTO_TCP || p == IPPROTO_UDP) {
 			m->m_pkthdr.csum_flags |=
 			    (CSUM_DATA_VALID | CSUM_PSEUDO_HDR);
 			m->m_pkthdr.csum_data = 0xffff;
 		}
 	}
 	return (0);
 }
 
 
 #ifdef INET
 int
 pf_test(int dir, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp)
 {
 	struct pfi_kif		*kif;
 	u_short			 action, reason = 0, log = 0;
 	struct mbuf		*m = *m0;
 	struct ip		*h = NULL;
 	struct m_tag		*ipfwtag;
 	struct pf_rule		*a = NULL, *r = &V_pf_default_rule, *tr, *nr;
 	struct pf_state		*s = NULL;
 	struct pf_ruleset	*ruleset = NULL;
 	struct pf_pdesc		 pd;
 	int			 off, dirndx, pqid = 0;
 
 	M_ASSERTPKTHDR(m);
 
 	if (!V_pf_status.running)
 		return (PF_PASS);
 
 	memset(&pd, 0, sizeof(pd));
 
 	kif = (struct pfi_kif *)ifp->if_pf_kif;
 
 	if (kif == NULL) {
 		DPFPRINTF(PF_DEBUG_URGENT,
 		    ("pf_test: kif == NULL, if_xname %s\n", ifp->if_xname));
 		return (PF_DROP);
 	}
 	if (kif->pfik_flags & PFI_IFLAG_SKIP)
 		return (PF_PASS);
 
 	if (m->m_flags & M_SKIP_FIREWALL)
 		return (PF_PASS);
 
 	pd.pf_mtag = pf_find_mtag(m);
 
 	PF_RULES_RLOCK();
 
 	if (ip_divert_ptr != NULL &&
 	    ((ipfwtag = m_tag_locate(m, MTAG_IPFW_RULE, 0, NULL)) != NULL)) {
 		struct ipfw_rule_ref *rr = (struct ipfw_rule_ref *)(ipfwtag+1);
 		if (rr->info & IPFW_IS_DIVERT && rr->rulenum == 0) {
 			if (pd.pf_mtag == NULL &&
 			    ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) {
 				action = PF_DROP;
 				goto done;
 			}
 			pd.pf_mtag->flags |= PF_PACKET_LOOPED;
 			m_tag_delete(m, ipfwtag);
 		}
 		if (pd.pf_mtag && pd.pf_mtag->flags & PF_FASTFWD_OURS_PRESENT) {
 			m->m_flags |= M_FASTFWD_OURS;
 			pd.pf_mtag->flags &= ~PF_FASTFWD_OURS_PRESENT;
 		}
 	} else if (pf_normalize_ip(m0, dir, kif, &reason, &pd) != PF_PASS) {
 		/* We do IP header normalization and packet reassembly here */
 		action = PF_DROP;
 		goto done;
 	}
 	m = *m0;	/* pf_normalize messes with m0 */
 	h = mtod(m, struct ip *);
 
 	off = h->ip_hl << 2;
 	if (off < (int)sizeof(struct ip)) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_SHORT);
 		log = 1;
 		goto done;
 	}
 
 	pd.src = (struct pf_addr *)&h->ip_src;
 	pd.dst = (struct pf_addr *)&h->ip_dst;
 	pd.sport = pd.dport = NULL;
 	pd.ip_sum = &h->ip_sum;
 	pd.proto_sum = NULL;
 	pd.proto = h->ip_p;
 	pd.dir = dir;
 	pd.sidx = (dir == PF_IN) ? 0 : 1;
 	pd.didx = (dir == PF_IN) ? 1 : 0;
 	pd.af = AF_INET;
 	pd.tos = h->ip_tos;
 	pd.tot_len = ntohs(h->ip_len);
 
 	/* handle fragments that didn't get reassembled by normalization */
 	if (h->ip_off & htons(IP_MF | IP_OFFMASK)) {
 		action = pf_test_fragment(&r, dir, kif, m, h,
 		    &pd, &a, &ruleset);
 		goto done;
 	}
 
 	switch (h->ip_p) {
 
 	case IPPROTO_TCP: {
 		struct tcphdr	th;
 
 		pd.hdr.tcp = &th;
 		if (!pf_pull_hdr(m, off, &th, sizeof(th),
 		    &action, &reason, AF_INET)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		pd.p_len = pd.tot_len - off - (th.th_off << 2);
 		if ((th.th_flags & TH_ACK) && pd.p_len == 0)
 			pqid = 1;
 		action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd);
 		if (action == PF_DROP)
 			goto done;
 		action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd,
 		    &reason);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 	case IPPROTO_UDP: {
 		struct udphdr	uh;
 
 		pd.hdr.udp = &uh;
 		if (!pf_pull_hdr(m, off, &uh, sizeof(uh),
 		    &action, &reason, AF_INET)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		if (uh.uh_dport == 0 ||
 		    ntohs(uh.uh_ulen) > m->m_pkthdr.len - off ||
 		    ntohs(uh.uh_ulen) < sizeof(struct udphdr)) {
 			action = PF_DROP;
 			REASON_SET(&reason, PFRES_SHORT);
 			goto done;
 		}
 		action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 	case IPPROTO_ICMP: {
 		struct icmp	ih;
 
 		pd.hdr.icmp = &ih;
 		if (!pf_pull_hdr(m, off, &ih, ICMP_MINLEN,
 		    &action, &reason, AF_INET)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		action = pf_test_state_icmp(&s, dir, kif, m, off, h, &pd,
 		    &reason);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 #ifdef INET6
 	case IPPROTO_ICMPV6: {
 		action = PF_DROP;
 		DPFPRINTF(PF_DEBUG_MISC,
 		    ("pf: dropping IPv4 packet with ICMPv6 payload\n"));
 		goto done;
 	}
 #endif
 
 	default:
 		action = pf_test_state_other(&s, dir, kif, m, &pd);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 done:
 	PF_RULES_RUNLOCK();
 	if (action == PF_PASS && h->ip_hl > 5 &&
 	    !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_IPOPTIONS);
 		log = 1;
 		DPFPRINTF(PF_DEBUG_MISC,
 		    ("pf: dropping packet with ip options\n"));
 	}
 
 	if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_MEMORY);
 	}
 	if (r->rtableid >= 0)
 		M_SETFIB(m, r->rtableid);
 
 #ifdef ALTQ
 	if (action == PF_PASS && r->qid) {
 		if (pd.pf_mtag == NULL &&
 		    ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) {
 			action = PF_DROP;
 			REASON_SET(&reason, PFRES_MEMORY);
 		} else {
 			/* Only touch the tag if it was found or allocated. */
 			if (pqid || (pd.tos & IPTOS_LOWDELAY))
 				pd.pf_mtag->qid = r->pqid;
 			else
 				pd.pf_mtag->qid = r->qid;
 			/* add hints for ecn */
 			pd.pf_mtag->hdr = h;
 		}
 	}
 #endif /* ALTQ */
 
 	/*
 	 * connections redirected to loopback should not match sockets
 	 * bound specifically to loopback due to security implications,
 	 * see tcp_input() and in_pcblookup_listen().
 	 */
 	if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP ||
 	    pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL &&
 	    (s->nat_rule.ptr->action == PF_RDR ||
 	    s->nat_rule.ptr->action == PF_BINAT) &&
 	    (ntohl(pd.dst->v4.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET)
 		m->m_flags |= M_SKIP_FIREWALL;
 
 	if (action == PF_PASS && r->divert.port && ip_divert_ptr != NULL &&
 	    !PACKET_LOOPED(&pd)) {
 
 		ipfwtag = m_tag_alloc(MTAG_IPFW_RULE, 0,
 		    sizeof(struct ipfw_rule_ref), M_NOWAIT | M_ZERO);
 		if (ipfwtag != NULL) {
 			((struct ipfw_rule_ref *)(ipfwtag+1))->info =
 			    ntohs(r->divert.port);
 			((struct ipfw_rule_ref *)(ipfwtag+1))->rulenum = dir;
 
 			if (s)
 				PF_STATE_UNLOCK(s);
 
 			m_tag_prepend(m, ipfwtag);
 			if (m->m_flags & M_FASTFWD_OURS) {
 				if (pd.pf_mtag == NULL &&
 				    ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) {
 					action = PF_DROP;
 					REASON_SET(&reason, PFRES_MEMORY);
 					log = 1;
 					DPFPRINTF(PF_DEBUG_MISC,
 					    ("pf: failed to allocate tag\n"));
 				} else {
 					pd.pf_mtag->flags |=
 					    PF_FASTFWD_OURS_PRESENT;
 					m->m_flags &= ~M_FASTFWD_OURS;
 				}
 			}
 			ip_divert_ptr(*m0, dir == PF_IN ? DIR_IN : DIR_OUT);
 			*m0 = NULL;
 
 			return (action);
 		} else {
 			/* XXX: ipfw has the same behaviour! */
 			action = PF_DROP;
 			REASON_SET(&reason, PFRES_MEMORY);
 			log = 1;
 			DPFPRINTF(PF_DEBUG_MISC,
 			    ("pf: failed to allocate divert tag\n"));
 		}
 	}
 
 	if (log) {
 		struct pf_rule *lr;
 
 		if (s != NULL && s->nat_rule.ptr != NULL &&
 		    s->nat_rule.ptr->log & PF_LOG_ALL)
 			lr = s->nat_rule.ptr;
 		else
 			lr = r;
 		PFLOG_PACKET(kif, m, AF_INET, dir, reason, lr, a, ruleset, &pd,
 		    (s == NULL));
 	}
 
 	kif->pfik_bytes[0][dir == PF_OUT][action != PF_PASS] += pd.tot_len;
 	kif->pfik_packets[0][dir == PF_OUT][action != PF_PASS]++;
 
 	if (action == PF_PASS || r->action == PF_DROP) {
 		dirndx = (dir == PF_OUT);
 		r->packets[dirndx]++;
 		r->bytes[dirndx] += pd.tot_len;
 		if (a != NULL) {
 			a->packets[dirndx]++;
 			a->bytes[dirndx] += pd.tot_len;
 		}
 		if (s != NULL) {
 			if (s->nat_rule.ptr != NULL) {
 				s->nat_rule.ptr->packets[dirndx]++;
 				s->nat_rule.ptr->bytes[dirndx] += pd.tot_len;
 			}
 			if (s->src_node != NULL) {
 				s->src_node->packets[dirndx]++;
 				s->src_node->bytes[dirndx] += pd.tot_len;
 			}
 			if (s->nat_src_node != NULL) {
 				s->nat_src_node->packets[dirndx]++;
 				s->nat_src_node->bytes[dirndx] += pd.tot_len;
 			}
 			dirndx = (dir == s->direction) ? 0 : 1;
 			s->packets[dirndx]++;
 			s->bytes[dirndx] += pd.tot_len;
 		}
 		tr = r;
 		nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule;
 		if (nr != NULL && r == &V_pf_default_rule)
 			tr = nr;
 		if (tr->src.addr.type == PF_ADDR_TABLE)
 			pfr_update_stats(tr->src.addr.p.tbl,
 			    (s == NULL) ? pd.src :
 			    &s->key[(s->direction == PF_IN)]->
 				addr[(s->direction == PF_OUT)],
 			    pd.af, pd.tot_len, dir == PF_OUT,
 			    r->action == PF_PASS, tr->src.neg);
 		if (tr->dst.addr.type == PF_ADDR_TABLE)
 			pfr_update_stats(tr->dst.addr.p.tbl,
 			    (s == NULL) ? pd.dst :
 			    &s->key[(s->direction == PF_IN)]->
 				addr[(s->direction == PF_IN)],
 			    pd.af, pd.tot_len, dir == PF_OUT,
 			    r->action == PF_PASS, tr->dst.neg);
 	}
 
 	switch (action) {
 	case PF_SYNPROXY_DROP:
 		m_freem(*m0);
 	case PF_DEFER:
 		*m0 = NULL;
 		action = PF_PASS;
 		break;
 	case PF_DROP:
 		m_freem(*m0);
 		*m0 = NULL;
 		break;
 	default:
 		/* pf_route() returns unlocked. */
 		if (r->rt) {
 			pf_route(m0, r, dir, kif->pfik_ifp, s, &pd);
 			return (action);
 		}
 		break;
 	}
 	if (s)
 		PF_STATE_UNLOCK(s);
 
 	return (action);
 }
 #endif /* INET */
 
 #ifdef INET6
 int
 pf_test6(int dir, struct ifnet *ifp, struct mbuf **m0, struct inpcb *inp)
 {
 	struct pfi_kif		*kif;
 	u_short			 action, reason = 0, log = 0;
 	struct mbuf		*m = *m0, *n = NULL;
 	struct m_tag		*mtag;
 	struct ip6_hdr		*h = NULL;
 	struct pf_rule		*a = NULL, *r = &V_pf_default_rule, *tr, *nr;
 	struct pf_state		*s = NULL;
 	struct pf_ruleset	*ruleset = NULL;
 	struct pf_pdesc		 pd;
 	int			 off, terminal = 0, dirndx, rh_cnt = 0;
 	int			 fwdir = dir;
 
 	M_ASSERTPKTHDR(m);
 
-	if (ifp != m->m_pkthdr.rcvif)
+	if (dir == PF_OUT && m->m_pkthdr.rcvif && ifp != m->m_pkthdr.rcvif)
 		fwdir = PF_FWD;
 
 	if (!V_pf_status.running)
 		return (PF_PASS);
 
 	memset(&pd, 0, sizeof(pd));
 	pd.pf_mtag = pf_find_mtag(m);
 
 	if (pd.pf_mtag && pd.pf_mtag->flags & PF_TAG_GENERATED)
 		return (PF_PASS);
 
 	kif = (struct pfi_kif *)ifp->if_pf_kif;
 	if (kif == NULL) {
 		DPFPRINTF(PF_DEBUG_URGENT,
 		    ("pf_test6: kif == NULL, if_xname %s\n", ifp->if_xname));
 		return (PF_DROP);
 	}
 	if (kif->pfik_flags & PFI_IFLAG_SKIP)
 		return (PF_PASS);
 
 	if (m->m_flags & M_SKIP_FIREWALL)
 		return (PF_PASS);
 
 	PF_RULES_RLOCK();
 
 	/* We do IP header normalization and packet reassembly here */
 	if (pf_normalize_ip6(m0, dir, kif, &reason, &pd) != PF_PASS) {
 		action = PF_DROP;
 		goto done;
 	}
 	m = *m0;	/* pf_normalize messes with m0 */
 	h = mtod(m, struct ip6_hdr *);
 
 #if 1
 	/*
 	 * we do not support jumbogram yet.  if we keep going, zero ip6_plen
 	 * will do something bad, so drop the packet for now.
 	 */
 	if (htons(h->ip6_plen) == 0) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_NORM);	/*XXX*/
 		goto done;
 	}
 #endif
 
 	pd.src = (struct pf_addr *)&h->ip6_src;
 	pd.dst = (struct pf_addr *)&h->ip6_dst;
 	pd.sport = pd.dport = NULL;
 	pd.ip_sum = NULL;
 	pd.proto_sum = NULL;
 	pd.dir = dir;
 	pd.sidx = (dir == PF_IN) ? 0 : 1;
 	pd.didx = (dir == PF_IN) ? 1 : 0;
 	pd.af = AF_INET6;
 	pd.tos = 0;
 	pd.tot_len = ntohs(h->ip6_plen) + sizeof(struct ip6_hdr);
 
 	off = ((caddr_t)h - m->m_data) + sizeof(struct ip6_hdr);
 	pd.proto = h->ip6_nxt;
 	do {
 		switch (pd.proto) {
 		case IPPROTO_FRAGMENT:
 			action = pf_test_fragment(&r, dir, kif, m, h,
 			    &pd, &a, &ruleset);
 			if (action == PF_DROP)
 				REASON_SET(&reason, PFRES_FRAG);
 			goto done;
 		case IPPROTO_ROUTING: {
 			struct ip6_rthdr rthdr;
 
 			if (rh_cnt++) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: IPv6 more than one rthdr\n"));
 				action = PF_DROP;
 				REASON_SET(&reason, PFRES_IPOPTIONS);
 				log = 1;
 				goto done;
 			}
 			if (!pf_pull_hdr(m, off, &rthdr, sizeof(rthdr), NULL,
 			    &reason, pd.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: IPv6 short rthdr\n"));
 				action = PF_DROP;
 				REASON_SET(&reason, PFRES_SHORT);
 				log = 1;
 				goto done;
 			}
 			if (rthdr.ip6r_type == IPV6_RTHDR_TYPE_0) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: IPv6 rthdr0\n"));
 				action = PF_DROP;
 				REASON_SET(&reason, PFRES_IPOPTIONS);
 				log = 1;
 				goto done;
 			}
 			/* FALLTHROUGH */
 		}
 		case IPPROTO_AH:
 		case IPPROTO_HOPOPTS:
 		case IPPROTO_DSTOPTS: {
 			/* get next header and header length */
 			struct ip6_ext	opt6;
 
 			if (!pf_pull_hdr(m, off, &opt6, sizeof(opt6),
 			    NULL, &reason, pd.af)) {
 				DPFPRINTF(PF_DEBUG_MISC,
 				    ("pf: IPv6 short opt\n"));
 				action = PF_DROP;
 				log = 1;
 				goto done;
 			}
 			if (pd.proto == IPPROTO_AH)
 				off += (opt6.ip6e_len + 2) * 4;
 			else
 				off += (opt6.ip6e_len + 1) * 8;
 			pd.proto = opt6.ip6e_nxt;
 			/* goto the next header */
 			break;
 		}
 		default:
 			terminal++;
 			break;
 		}
 	} while (!terminal);
 
 	/* if there's no routing header, use unmodified mbuf for checksumming */
 	if (!n)
 		n = m;
 
 	switch (pd.proto) {
 
 	case IPPROTO_TCP: {
 		struct tcphdr	th;
 
 		pd.hdr.tcp = &th;
 		if (!pf_pull_hdr(m, off, &th, sizeof(th),
 		    &action, &reason, AF_INET6)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		pd.p_len = pd.tot_len - off - (th.th_off << 2);
 		action = pf_normalize_tcp(dir, kif, m, 0, off, h, &pd);
 		if (action == PF_DROP)
 			goto done;
 		action = pf_test_state_tcp(&s, dir, kif, m, off, h, &pd,
 		    &reason);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 	case IPPROTO_UDP: {
 		struct udphdr	uh;
 
 		pd.hdr.udp = &uh;
 		if (!pf_pull_hdr(m, off, &uh, sizeof(uh),
 		    &action, &reason, AF_INET6)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		if (uh.uh_dport == 0 ||
 		    ntohs(uh.uh_ulen) > m->m_pkthdr.len - off ||
 		    ntohs(uh.uh_ulen) < sizeof(struct udphdr)) {
 			action = PF_DROP;
 			REASON_SET(&reason, PFRES_SHORT);
 			goto done;
 		}
 		action = pf_test_state_udp(&s, dir, kif, m, off, h, &pd);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 	case IPPROTO_ICMP: {
 		action = PF_DROP;
 		DPFPRINTF(PF_DEBUG_MISC,
 		    ("pf: dropping IPv6 packet with ICMPv4 payload\n"));
 		goto done;
 	}
 
 	case IPPROTO_ICMPV6: {
 		struct icmp6_hdr	ih;
 
 		pd.hdr.icmp6 = &ih;
 		if (!pf_pull_hdr(m, off, &ih, sizeof(ih),
 		    &action, &reason, AF_INET6)) {
 			log = action != PF_PASS;
 			goto done;
 		}
 		action = pf_test_state_icmp(&s, dir, kif,
 		    m, off, h, &pd, &reason);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 	default:
 		action = pf_test_state_other(&s, dir, kif, m, &pd);
 		if (action == PF_PASS) {
 			if (pfsync_update_state_ptr != NULL)
 				pfsync_update_state_ptr(s);
 			r = s->rule.ptr;
 			a = s->anchor.ptr;
 			log = s->log;
 		} else if (s == NULL)
 			action = pf_test_rule(&r, &s, dir, kif, m, off, &pd,
 			    &a, &ruleset, inp);
 		break;
 	}
 
 done:
 	PF_RULES_RUNLOCK();
 	if (n != m) {
 		m_freem(n);
 		n = NULL;
 	}
 
 	/* handle dangerous IPv6 extension headers. */
 	if (action == PF_PASS && rh_cnt &&
 	    !((s && s->state_flags & PFSTATE_ALLOWOPTS) || r->allow_opts)) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_IPOPTIONS);
 		log = 1;
 		DPFPRINTF(PF_DEBUG_MISC,
 		    ("pf: dropping packet with dangerous v6 headers\n"));
 	}
 
 	if (s && s->tag > 0 && pf_tag_packet(m, &pd, s->tag)) {
 		action = PF_DROP;
 		REASON_SET(&reason, PFRES_MEMORY);
 	}
 	if (r->rtableid >= 0)
 		M_SETFIB(m, r->rtableid);
 
 #ifdef ALTQ
 	if (action == PF_PASS && r->qid) {
 		if (pd.pf_mtag == NULL &&
 		    ((pd.pf_mtag = pf_get_mtag(m)) == NULL)) {
 			action = PF_DROP;
 			REASON_SET(&reason, PFRES_MEMORY);
 		}
 		if (pd.tos & IPTOS_LOWDELAY)
 			pd.pf_mtag->qid = r->pqid;
 		else
 			pd.pf_mtag->qid = r->qid;
 		/* add hints for ecn */
 		pd.pf_mtag->hdr = h;
 	}
 #endif /* ALTQ */
 
 	if (dir == PF_IN && action == PF_PASS && (pd.proto == IPPROTO_TCP ||
 	    pd.proto == IPPROTO_UDP) && s != NULL && s->nat_rule.ptr != NULL &&
 	    (s->nat_rule.ptr->action == PF_RDR ||
 	    s->nat_rule.ptr->action == PF_BINAT) &&
 	    IN6_IS_ADDR_LOOPBACK(&pd.dst->v6))
 		m->m_flags |= M_SKIP_FIREWALL;
 
 	/* XXX: Anybody working on it?! */
 	if (r->divert.port)
 		printf("pf: divert(9) is not supported for IPv6\n");
 
 	if (log) {
 		struct pf_rule *lr;
 
 		if (s != NULL && s->nat_rule.ptr != NULL &&
 		    s->nat_rule.ptr->log & PF_LOG_ALL)
 			lr = s->nat_rule.ptr;
 		else
 			lr = r;
 		PFLOG_PACKET(kif, m, AF_INET6, dir, reason, lr, a, ruleset,
 		    &pd, (s == NULL));
 	}
 
 	kif->pfik_bytes[1][dir == PF_OUT][action != PF_PASS] += pd.tot_len;
 	kif->pfik_packets[1][dir == PF_OUT][action != PF_PASS]++;
 
 	if (action == PF_PASS || r->action == PF_DROP) {
 		dirndx = (dir == PF_OUT);
 		r->packets[dirndx]++;
 		r->bytes[dirndx] += pd.tot_len;
 		if (a != NULL) {
 			a->packets[dirndx]++;
 			a->bytes[dirndx] += pd.tot_len;
 		}
 		if (s != NULL) {
 			if (s->nat_rule.ptr != NULL) {
 				s->nat_rule.ptr->packets[dirndx]++;
 				s->nat_rule.ptr->bytes[dirndx] += pd.tot_len;
 			}
 			if (s->src_node != NULL) {
 				s->src_node->packets[dirndx]++;
 				s->src_node->bytes[dirndx] += pd.tot_len;
 			}
 			if (s->nat_src_node != NULL) {
 				s->nat_src_node->packets[dirndx]++;
 				s->nat_src_node->bytes[dirndx] += pd.tot_len;
 			}
 			dirndx = (dir == s->direction) ? 0 : 1;
 			s->packets[dirndx]++;
 			s->bytes[dirndx] += pd.tot_len;
 		}
 		tr = r;
 		nr = (s != NULL) ? s->nat_rule.ptr : pd.nat_rule;
 		if (nr != NULL && r == &V_pf_default_rule)
 			tr = nr;
 		if (tr->src.addr.type == PF_ADDR_TABLE)
 			pfr_update_stats(tr->src.addr.p.tbl,
 			    (s == NULL) ? pd.src :
 			    &s->key[(s->direction == PF_IN)]->addr[0],
 			    pd.af, pd.tot_len, dir == PF_OUT,
 			    r->action == PF_PASS, tr->src.neg);
 		if (tr->dst.addr.type == PF_ADDR_TABLE)
 			pfr_update_stats(tr->dst.addr.p.tbl,
 			    (s == NULL) ? pd.dst :
 			    &s->key[(s->direction == PF_IN)]->addr[1],
 			    pd.af, pd.tot_len, dir == PF_OUT,
 			    r->action == PF_PASS, tr->dst.neg);
 	}
 
 	switch (action) {
 	case PF_SYNPROXY_DROP:
 		m_freem(*m0);
 	case PF_DEFER:
 		*m0 = NULL;
 		action = PF_PASS;
 		break;
 	case PF_DROP:
 		m_freem(*m0);
 		*m0 = NULL;
 		break;
 	default:
 		/* pf_route6() returns unlocked. */
 		if (r->rt) {
 			pf_route6(m0, r, dir, kif->pfik_ifp, s, &pd);
 			return (action);
 		}
 		break;
 	}
 
 	if (s)
 		PF_STATE_UNLOCK(s);
 
 	/* If reassembled packet passed, create new fragments. */
 	if (action == PF_PASS && *m0 && fwdir == PF_FWD &&
 	    (mtag = m_tag_find(m, PF_REASSEMBLED, NULL)) != NULL)
 		action = pf_refragment6(ifp, m0, mtag);
 
 	return (action);
 }
 #endif /* INET6 */
Index: user/ngie/more-tests/sys/netpfil/pf/pf_norm.c
===================================================================
--- user/ngie/more-tests/sys/netpfil/pf/pf_norm.c	(revision 281584)
+++ user/ngie/more-tests/sys/netpfil/pf/pf_norm.c	(revision 281585)
@@ -1,2294 +1,2294 @@
 /*-
  * Copyright 2001 Niels Provos <provos@citi.umich.edu>
  * Copyright 2011 Alexander Bluhm <bluhm@openbsd.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  *	$OpenBSD: pf_norm.c,v 1.114 2009/01/29 14:11:45 henning Exp $
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_inet.h"
 #include "opt_inet6.h"
 #include "opt_pf.h"
 
 #include <sys/param.h>
 #include <sys/lock.h>
 #include <sys/mbuf.h>
 #include <sys/mutex.h>
 #include <sys/refcount.h>
 #include <sys/rwlock.h>
 #include <sys/socket.h>
 
 #include <net/if.h>
 #include <net/vnet.h>
 #include <net/pfvar.h>
 #include <net/if_pflog.h>
 
 #include <netinet/in.h>
 #include <netinet/ip.h>
 #include <netinet/ip_var.h>
 #include <netinet6/ip6_var.h>
 #include <netinet/tcp.h>
 #include <netinet/tcp_fsm.h>
 #include <netinet/tcp_seq.h>
 
 #ifdef INET6
 #include <netinet/ip6.h>
 #endif /* INET6 */
 
 struct pf_frent {
 	TAILQ_ENTRY(pf_frent)	fr_next;
 	struct mbuf	*fe_m;
 	uint16_t	fe_hdrlen;	/* ipv4 header length with ip options
 					   ipv6, extension, fragment header */
 	uint16_t	fe_extoff;	/* last extension header offset or 0 */
 	uint16_t	fe_len;		/* fragment length */
 	uint16_t	fe_off;		/* fragment offset */
 	uint16_t	fe_mff;		/* more fragment flag */
 };
 
 struct pf_fragment_cmp {
 	struct pf_addr	frc_src;
 	struct pf_addr	frc_dst;
 	uint32_t	frc_id;
 	sa_family_t	frc_af;
 	uint8_t		frc_proto;
 	uint8_t		frc_direction;
 };
 
 struct pf_fragment {
 	struct pf_fragment_cmp	fr_key;
 #define fr_src	fr_key.frc_src
 #define fr_dst	fr_key.frc_dst
 #define fr_id	fr_key.frc_id
 #define fr_af	fr_key.frc_af
 #define fr_proto	fr_key.frc_proto
 #define fr_direction	fr_key.frc_direction
 
 	RB_ENTRY(pf_fragment) fr_entry;
 	TAILQ_ENTRY(pf_fragment) frag_next;
 	uint8_t		fr_flags;	/* status flags */
 #define PFFRAG_SEENLAST		0x0001	/* Seen the last fragment for this */
 #define PFFRAG_NOBUFFER		0x0002	/* Non-buffering fragment cache */
 #define PFFRAG_DROP		0x0004	/* Drop all fragments */
 #define BUFFER_FRAGMENTS(fr)	(!((fr)->fr_flags & PFFRAG_NOBUFFER))
 	uint16_t	fr_max;		/* fragment data max */
 	uint32_t	fr_timeout;
 	uint16_t	fr_maxlen;	/* maximum length of single fragment */
 	TAILQ_HEAD(pf_fragq, pf_frent) fr_queue;
 };
 
 struct pf_fragment_tag {
 	uint16_t	ft_hdrlen;	/* header length of reassembled pkt */
 	uint16_t	ft_extoff;	/* last extension header offset or 0 */
 	uint16_t	ft_maxlen;	/* maximum fragment payload length */
 	uint32_t	ft_id;		/* fragment id */
 };
 
 static struct mtx pf_frag_mtx;
 #define PF_FRAG_LOCK()		mtx_lock(&pf_frag_mtx)
 #define PF_FRAG_UNLOCK()	mtx_unlock(&pf_frag_mtx)
 #define PF_FRAG_ASSERT()	mtx_assert(&pf_frag_mtx, MA_OWNED)
 
 VNET_DEFINE(uma_zone_t, pf_state_scrub_z);	/* XXX: shared with pfsync */
 
 static VNET_DEFINE(uma_zone_t, pf_frent_z);
 #define	V_pf_frent_z	VNET(pf_frent_z)
 static VNET_DEFINE(uma_zone_t, pf_frag_z);
 #define	V_pf_frag_z	VNET(pf_frag_z)
 
 TAILQ_HEAD(pf_fragqueue, pf_fragment);
 TAILQ_HEAD(pf_cachequeue, pf_fragment);
 static VNET_DEFINE(struct pf_fragqueue,	pf_fragqueue);
 #define	V_pf_fragqueue			VNET(pf_fragqueue)
 static VNET_DEFINE(struct pf_cachequeue,	pf_cachequeue);
 #define	V_pf_cachequeue			VNET(pf_cachequeue)
 RB_HEAD(pf_frag_tree, pf_fragment);
 static VNET_DEFINE(struct pf_frag_tree,	pf_frag_tree);
 #define	V_pf_frag_tree			VNET(pf_frag_tree)
 static VNET_DEFINE(struct pf_frag_tree,	pf_cache_tree);
 #define	V_pf_cache_tree			VNET(pf_cache_tree)
 static int		 pf_frag_compare(struct pf_fragment *,
 			    struct pf_fragment *);
 static RB_PROTOTYPE(pf_frag_tree, pf_fragment, fr_entry, pf_frag_compare);
 static RB_GENERATE(pf_frag_tree, pf_fragment, fr_entry, pf_frag_compare);
 
 static void	pf_flush_fragments(void);
 static void	pf_free_fragment(struct pf_fragment *);
 static void	pf_remove_fragment(struct pf_fragment *);
 static int	pf_normalize_tcpopt(struct pf_rule *, struct mbuf *,
 		    struct tcphdr *, int, sa_family_t);
 static struct pf_frent *pf_create_fragment(u_short *);
 static struct pf_fragment *pf_find_fragment(struct pf_fragment_cmp *key,
 		    struct pf_frag_tree *tree);
 static struct pf_fragment *pf_fillup_fragment(struct pf_fragment_cmp *,
 		    struct pf_frent *, u_short *);
 static int	pf_isfull_fragment(struct pf_fragment *);
 static struct mbuf *pf_join_fragment(struct pf_fragment *);
 #ifdef INET
 static void	pf_scrub_ip(struct mbuf **, uint32_t, uint8_t, uint8_t);
 static int	pf_reassemble(struct mbuf **, struct ip *, int, u_short *);
 static struct mbuf *pf_fragcache(struct mbuf **, struct ip*,
 		    struct pf_fragment **, int, int, int *);
 #endif	/* INET */
 #ifdef INET6
 static int	pf_reassemble6(struct mbuf **, struct ip6_hdr *,
 		    struct ip6_frag *, uint16_t, uint16_t, int, u_short *);
 static void	pf_scrub_ip6(struct mbuf **, uint8_t);
 #endif	/* INET6 */
 
 #define	DPFPRINTF(x) do {				\
 	if (V_pf_status.debug >= PF_DEBUG_MISC) {	\
 		printf("%s: ", __func__);		\
 		printf x ;				\
 	}						\
 } while(0)
 
 #ifdef INET
 static void
 pf_ip2key(struct ip *ip, int dir, struct pf_fragment_cmp *key)
 {
 
 	key->frc_src.v4 = ip->ip_src;
 	key->frc_dst.v4 = ip->ip_dst;
 	key->frc_af = AF_INET;
 	key->frc_proto = ip->ip_p;
 	key->frc_id = ip->ip_id;
 	key->frc_direction = dir;
 }
 #endif	/* INET */
 
 void
 pf_normalize_init(void)
 {
 
 	V_pf_frag_z = uma_zcreate("pf frags", sizeof(struct pf_fragment),
 	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
 	V_pf_frent_z = uma_zcreate("pf frag entries", sizeof(struct pf_frent),
 	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
 	V_pf_state_scrub_z = uma_zcreate("pf state scrubs",
 	    sizeof(struct pf_state_scrub),  NULL, NULL, NULL, NULL,
 	    UMA_ALIGN_PTR, 0);
 
 	V_pf_limits[PF_LIMIT_FRAGS].zone = V_pf_frent_z;
 	V_pf_limits[PF_LIMIT_FRAGS].limit = PFFRAG_FRENT_HIWAT;
 	uma_zone_set_max(V_pf_frent_z, PFFRAG_FRENT_HIWAT);
 	uma_zone_set_warning(V_pf_frent_z, "PF frag entries limit reached");
 
 	mtx_init(&pf_frag_mtx, "pf fragments", NULL, MTX_DEF);
 
 	TAILQ_INIT(&V_pf_fragqueue);
 	TAILQ_INIT(&V_pf_cachequeue);
 }
 
 void
 pf_normalize_cleanup(void)
 {
 
 	uma_zdestroy(V_pf_state_scrub_z);
 	uma_zdestroy(V_pf_frent_z);
 	uma_zdestroy(V_pf_frag_z);
 
 	mtx_destroy(&pf_frag_mtx);
 }
 
 static int
 pf_frag_compare(struct pf_fragment *a, struct pf_fragment *b)
 {
 	int	diff;
 
 	if ((diff = a->fr_id - b->fr_id) != 0)
 		return (diff);
 	if ((diff = a->fr_proto - b->fr_proto) != 0)
 		return (diff);
 	if ((diff = a->fr_af - b->fr_af) != 0)
 		return (diff);
 	if ((diff = pf_addr_cmp(&a->fr_src, &b->fr_src, a->fr_af)) != 0)
 		return (diff);
 	if ((diff = pf_addr_cmp(&a->fr_dst, &b->fr_dst, a->fr_af)) != 0)
 		return (diff);
 	return (0);
 }
 
 void
 pf_purge_expired_fragments(void)
 {
 	struct pf_fragment	*frag;
 	u_int32_t		 expire = time_uptime -
 				    V_pf_default_rule.timeout[PFTM_FRAG];
 
 	PF_FRAG_LOCK();
 	while ((frag = TAILQ_LAST(&V_pf_fragqueue, pf_fragqueue)) != NULL) {
 		KASSERT((BUFFER_FRAGMENTS(frag)),
 		    ("BUFFER_FRAGMENTS(frag) == 0: %s", __FUNCTION__));
 		if (frag->fr_timeout > expire)
 			break;
 
 		DPFPRINTF(("expiring %d(%p)\n", frag->fr_id, frag));
 		pf_free_fragment(frag);
 	}
 
 	while ((frag = TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue)) != NULL) {
 		KASSERT((!BUFFER_FRAGMENTS(frag)),
 		    ("BUFFER_FRAGMENTS(frag) != 0: %s", __FUNCTION__));
 		if (frag->fr_timeout > expire)
 			break;
 
 		DPFPRINTF(("expiring %d(%p)\n", frag->fr_id, frag));
 		pf_free_fragment(frag);
 		KASSERT((TAILQ_EMPTY(&V_pf_cachequeue) ||
 		    TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue) != frag),
 		    ("!(TAILQ_EMPTY() || TAILQ_LAST() != frag): %s",
 		    __FUNCTION__));
 	}
 	PF_FRAG_UNLOCK();
 }
 
 /*
  * Try to flush old fragments to make space for new ones
  */
 static void
 pf_flush_fragments(void)
 {
 	struct pf_fragment	*frag, *cache;
 	int			 goal;
 
 	PF_FRAG_ASSERT();
 
 	goal = uma_zone_get_cur(V_pf_frent_z) * 9 / 10;
 	DPFPRINTF(("trying to free %d frag entries\n", goal));
 	while (goal < uma_zone_get_cur(V_pf_frent_z)) {
 		frag = TAILQ_LAST(&V_pf_fragqueue, pf_fragqueue);
 		if (frag)
 			pf_free_fragment(frag);
 		cache = TAILQ_LAST(&V_pf_cachequeue, pf_cachequeue);
 		if (cache)
 			pf_free_fragment(cache);
 		if (frag == NULL && cache == NULL)
 			break;
 	}
 }
 
 /* Frees the fragments and all associated entries */
 static void
 pf_free_fragment(struct pf_fragment *frag)
 {
 	struct pf_frent		*frent;
 
 	PF_FRAG_ASSERT();
 
 	/* Free all fragments */
 	if (BUFFER_FRAGMENTS(frag)) {
 		for (frent = TAILQ_FIRST(&frag->fr_queue); frent;
 		    frent = TAILQ_FIRST(&frag->fr_queue)) {
 			TAILQ_REMOVE(&frag->fr_queue, frent, fr_next);
 
 			m_freem(frent->fe_m);
 			uma_zfree(V_pf_frent_z, frent);
 		}
 	} else {
 		for (frent = TAILQ_FIRST(&frag->fr_queue); frent;
 		    frent = TAILQ_FIRST(&frag->fr_queue)) {
 			TAILQ_REMOVE(&frag->fr_queue, frent, fr_next);
 
 			KASSERT((TAILQ_EMPTY(&frag->fr_queue) ||
 			    TAILQ_FIRST(&frag->fr_queue)->fe_off >
 			    frent->fe_len),
 			    ("! (TAILQ_EMPTY() || TAILQ_FIRST()->fe_off >"
 			    " frent->fe_len): %s", __func__));
 
 			uma_zfree(V_pf_frent_z, frent);
 		}
 	}
 
 	pf_remove_fragment(frag);
 }
 
 static struct pf_fragment *
 pf_find_fragment(struct pf_fragment_cmp *key, struct pf_frag_tree *tree)
 {
 	struct pf_fragment	*frag;
 
 	PF_FRAG_ASSERT();
 
 	frag = RB_FIND(pf_frag_tree, tree, (struct pf_fragment *)key);
 	if (frag != NULL) {
 		/* XXX Are we sure we want to update the timeout? */
 		frag->fr_timeout = time_uptime;
 		if (BUFFER_FRAGMENTS(frag)) {
 			TAILQ_REMOVE(&V_pf_fragqueue, frag, frag_next);
 			TAILQ_INSERT_HEAD(&V_pf_fragqueue, frag, frag_next);
 		} else {
 			TAILQ_REMOVE(&V_pf_cachequeue, frag, frag_next);
 			TAILQ_INSERT_HEAD(&V_pf_cachequeue, frag, frag_next);
 		}
 	}
 
 	return (frag);
 }
 
 /* Removes a fragment from the fragment queue and frees the fragment */
 static void
 pf_remove_fragment(struct pf_fragment *frag)
 {
 
 	PF_FRAG_ASSERT();
 
 	if (BUFFER_FRAGMENTS(frag)) {
 		RB_REMOVE(pf_frag_tree, &V_pf_frag_tree, frag);
 		TAILQ_REMOVE(&V_pf_fragqueue, frag, frag_next);
 		uma_zfree(V_pf_frag_z, frag);
 	} else {
 		RB_REMOVE(pf_frag_tree, &V_pf_cache_tree, frag);
 		TAILQ_REMOVE(&V_pf_cachequeue, frag, frag_next);
 		uma_zfree(V_pf_frag_z, frag);
 	}
 }
 
 static struct pf_frent *
 pf_create_fragment(u_short *reason)
 {
 	struct pf_frent *frent;
 
 	PF_FRAG_ASSERT();
 
 	frent = uma_zalloc(V_pf_frent_z, M_NOWAIT);
 	if (frent == NULL) {
 		pf_flush_fragments();
 		frent = uma_zalloc(V_pf_frent_z, M_NOWAIT);
 		if (frent == NULL) {
 			REASON_SET(reason, PFRES_MEMORY);
 			return (NULL);
 		}
 	}
 
 	return (frent);
 }
 
 static struct pf_fragment *
 pf_fillup_fragment(struct pf_fragment_cmp *key, struct pf_frent *frent,
 		u_short *reason)
 {
 	struct pf_frent		*after, *next, *prev;
 	struct pf_fragment	*frag;
 	uint16_t		total;
 
 	PF_FRAG_ASSERT();
 
 	/* No empty fragments. */
 	if (frent->fe_len == 0) {
 		DPFPRINTF(("bad fragment: len 0"));
 		goto bad_fragment;
 	}
 
 	/* All fragments are 8 byte aligned. */
 	if (frent->fe_mff && (frent->fe_len & 0x7)) {
 		DPFPRINTF(("bad fragment: mff and len %d", frent->fe_len));
 		goto bad_fragment;
 	}
 
 	/* Respect maximum length, IP_MAXPACKET == IPV6_MAXPACKET. */
 	if (frent->fe_off + frent->fe_len > IP_MAXPACKET) {
 		DPFPRINTF(("bad fragment: max packet %d",
 		    frent->fe_off + frent->fe_len));
 		goto bad_fragment;
 	}
 
 	DPFPRINTF((key->frc_af == AF_INET ?
 	    "reass frag %d @ %d-%d" : "reass frag %#08x @ %d-%d",
 	    key->frc_id, frent->fe_off, frent->fe_off + frent->fe_len));
 
 	/* Fully buffer all of the fragments in this fragment queue. */
 	frag = pf_find_fragment(key, &V_pf_frag_tree);
 
 	/* Create a new reassembly queue for this packet. */
 	if (frag == NULL) {
 		frag = uma_zalloc(V_pf_frag_z, M_NOWAIT);
 		if (frag == NULL) {
 			pf_flush_fragments();
 			frag = uma_zalloc(V_pf_frag_z, M_NOWAIT);
 			if (frag == NULL) {
 				REASON_SET(reason, PFRES_MEMORY);
 				goto drop_fragment;
 			}
 		}
 
 		*(struct pf_fragment_cmp *)frag = *key;
 		frag->fr_timeout = time_second;
 		frag->fr_maxlen = frent->fe_len;
 		TAILQ_INIT(&frag->fr_queue);
 
 		RB_INSERT(pf_frag_tree, &V_pf_frag_tree, frag);
 		TAILQ_INSERT_HEAD(&V_pf_fragqueue, frag, frag_next);
 
 		/* We do not have a previous fragment. */
 		TAILQ_INSERT_HEAD(&frag->fr_queue, frent, fr_next);
 
 		return (frag);
 	}
 
 	KASSERT(!TAILQ_EMPTY(&frag->fr_queue), ("!TAILQ_EMPTY()->fr_queue"));
 
 	/* Remember maximum fragment len for refragmentation. */
 	if (frent->fe_len > frag->fr_maxlen)
 		frag->fr_maxlen = frent->fe_len;
 
 	/* Maximum data we have seen already. */
 	total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off +
 		TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len;
 
 	/* Non terminal fragments must have more fragments flag. */
 	if (frent->fe_off + frent->fe_len < total && !frent->fe_mff)
 		goto bad_fragment;
 
 	/* Check if we saw the last fragment already. */
 	if (!TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_mff) {
 		if (frent->fe_off + frent->fe_len > total ||
 		    (frent->fe_off + frent->fe_len == total && frent->fe_mff))
 			goto bad_fragment;
 	} else {
 		if (frent->fe_off + frent->fe_len == total && !frent->fe_mff)
 			goto bad_fragment;
 	}
 
 	/* Find a fragment after the current one. */
 	prev = NULL;
 	TAILQ_FOREACH(after, &frag->fr_queue, fr_next) {
 		if (after->fe_off > frent->fe_off)
 			break;
 		prev = after;
 	}
 
 	KASSERT(prev != NULL || after != NULL,
 	    ("prev != NULL || after != NULL"));
 
 	if (prev != NULL && prev->fe_off + prev->fe_len > frent->fe_off) {
 		uint16_t precut;
 
 		precut = prev->fe_off + prev->fe_len - frent->fe_off;
 		if (precut >= frent->fe_len)
 			goto bad_fragment;
 		DPFPRINTF(("overlap -%d", precut));
 		m_adj(frent->fe_m, precut);
 		frent->fe_off += precut;
 		frent->fe_len -= precut;
 	}
 
 	for (; after != NULL && frent->fe_off + frent->fe_len > after->fe_off;
 	    after = next) {
 		uint16_t aftercut;
 
 		aftercut = frent->fe_off + frent->fe_len - after->fe_off;
 		DPFPRINTF(("adjust overlap %d", aftercut));
 		if (aftercut < after->fe_len) {
 			m_adj(after->fe_m, aftercut);
 			after->fe_off += aftercut;
 			after->fe_len -= aftercut;
 			break;
 		}
 
 		/* This fragment is completely overlapped, lose it. */
 		next = TAILQ_NEXT(after, fr_next);
 		m_freem(after->fe_m);
 		TAILQ_REMOVE(&frag->fr_queue, after, fr_next);
 		uma_zfree(V_pf_frent_z, after);
 	}
 
 	if (prev == NULL)
 		TAILQ_INSERT_HEAD(&frag->fr_queue, frent, fr_next);
 	else
 		TAILQ_INSERT_AFTER(&frag->fr_queue, prev, frent, fr_next);
 
 	return (frag);
 
 bad_fragment:
 	REASON_SET(reason, PFRES_FRAG);
 drop_fragment:
 	uma_zfree(V_pf_frent_z, frent);
 	return (NULL);
 }
 
 static int
 pf_isfull_fragment(struct pf_fragment *frag)
 {
 	struct pf_frent	*frent, *next;
 	uint16_t off, total;
 
 	/* Check if we are completely reassembled */
 	if (TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_mff)
 		return (0);
 
 	/* Maximum data we have seen already */
 	total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off +
 		TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len;
 
 	/* Check if we have all the data */
 	off = 0;
 	for (frent = TAILQ_FIRST(&frag->fr_queue); frent; frent = next) {
 		next = TAILQ_NEXT(frent, fr_next);
 
 		off += frent->fe_len;
 		if (off < total && (next == NULL || next->fe_off != off)) {
 			DPFPRINTF(("missing fragment at %d, next %d, total %d",
 			    off, next == NULL ? -1 : next->fe_off, total));
 			return (0);
 		}
 	}
 	DPFPRINTF(("%d < %d?", off, total));
 	if (off < total)
 		return (0);
 	KASSERT(off == total, ("off == total"));
 
 	return (1);
 }
 
 static struct mbuf *
 pf_join_fragment(struct pf_fragment *frag)
 {
 	struct mbuf *m, *m2;
 	struct pf_frent	*frent, *next;
 
 	frent = TAILQ_FIRST(&frag->fr_queue);
 	next = TAILQ_NEXT(frent, fr_next);
 
 	m = frent->fe_m;
 	m_adj(m, (frent->fe_hdrlen + frent->fe_len) - m->m_pkthdr.len);
 	uma_zfree(V_pf_frent_z, frent);
 	for (frent = next; frent != NULL; frent = next) {
 		next = TAILQ_NEXT(frent, fr_next);
 
 		m2 = frent->fe_m;
 		/* Strip off ip header. */
 		m_adj(m2, frent->fe_hdrlen);
 		/* Strip off any trailing bytes. */
 		m_adj(m2, frent->fe_len - m2->m_pkthdr.len);
 
 		uma_zfree(V_pf_frent_z, frent);
 		m_cat(m, m2);
 	}
 
 	/* Remove from fragment queue. */
 	pf_remove_fragment(frag);
 
 	return (m);
 }
 
 #ifdef INET
 static int
 pf_reassemble(struct mbuf **m0, struct ip *ip, int dir, u_short *reason)
 {
 	struct mbuf		*m = *m0;
 	struct pf_frent		*frent;
 	struct pf_fragment	*frag;
 	struct pf_fragment_cmp	key;
 	uint16_t		total, hdrlen;
 
 	/* Get an entry for the fragment queue */
 	if ((frent = pf_create_fragment(reason)) == NULL)
 		return (PF_DROP);
 
 	frent->fe_m = m;
 	frent->fe_hdrlen = ip->ip_hl << 2;
 	frent->fe_extoff = 0;
 	frent->fe_len = ntohs(ip->ip_len) - (ip->ip_hl << 2);
 	frent->fe_off = (ntohs(ip->ip_off) & IP_OFFMASK) << 3;
 	frent->fe_mff = ntohs(ip->ip_off) & IP_MF;
 
 	pf_ip2key(ip, dir, &key);
 
 	if ((frag = pf_fillup_fragment(&key, frent, reason)) == NULL)
 		return (PF_DROP);
 
 	/* The mbuf is part of the fragment entry, no direct free or access */
 	m = *m0 = NULL;
 
 	if (!pf_isfull_fragment(frag))
 		return (PF_PASS);  /* drop because *m0 is NULL, no error */
 
 	/* We have all the data */
 	frent = TAILQ_FIRST(&frag->fr_queue);
 	KASSERT(frent != NULL, ("frent != NULL"));
 	total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off +
 		TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len;
 	hdrlen = frent->fe_hdrlen;
 
 	m = *m0 = pf_join_fragment(frag);
 	frag = NULL;
 
 	if (m->m_flags & M_PKTHDR) {
 		int plen = 0;
 		for (m = *m0; m; m = m->m_next)
 			plen += m->m_len;
 		m = *m0;
 		m->m_pkthdr.len = plen;
 	}
 
 	ip = mtod(m, struct ip *);
 	ip->ip_len = htons(hdrlen + total);
 	ip->ip_off &= ~(IP_MF|IP_OFFMASK);
 
 	if (hdrlen + total > IP_MAXPACKET) {
 		DPFPRINTF(("drop: too big: %d", total));
 		ip->ip_len = 0;
 		REASON_SET(reason, PFRES_SHORT);
 		/* PF_DROP requires a valid mbuf *m0 in pf_test() */
 		return (PF_DROP);
 	}
 
 	DPFPRINTF(("complete: %p(%d)\n", m, ntohs(ip->ip_len)));
 	return (PF_PASS);
 }
 #endif	/* INET */
 
 #ifdef INET6
 static int
 pf_reassemble6(struct mbuf **m0, struct ip6_hdr *ip6, struct ip6_frag *fraghdr,
     uint16_t hdrlen, uint16_t extoff, int dir, u_short *reason)
 {
 	struct mbuf		*m = *m0;
 	struct pf_frent		*frent;
 	struct pf_fragment	*frag;
 	struct pf_fragment_cmp	 key;
 	struct m_tag		*mtag;
 	struct pf_fragment_tag	*ftag;
 	int			 off;
 	uint32_t		 frag_id;
 	uint16_t		 total, maxlen;
 	uint8_t			 proto;
 
 	PF_FRAG_LOCK();
 
 	/* Get an entry for the fragment queue. */
 	if ((frent = pf_create_fragment(reason)) == NULL) {
 		PF_FRAG_UNLOCK();
 		return (PF_DROP);
 	}
 
 	frent->fe_m = m;
 	frent->fe_hdrlen = hdrlen;
 	frent->fe_extoff = extoff;
 	frent->fe_len = sizeof(struct ip6_hdr) + ntohs(ip6->ip6_plen) - hdrlen;
 	frent->fe_off = ntohs(fraghdr->ip6f_offlg & IP6F_OFF_MASK);
 	frent->fe_mff = fraghdr->ip6f_offlg & IP6F_MORE_FRAG;
 
 	key.frc_src.v6 = ip6->ip6_src;
 	key.frc_dst.v6 = ip6->ip6_dst;
 	key.frc_af = AF_INET6;
 	/* Only the first fragment's protocol is relevant. */
 	key.frc_proto = 0;
 	key.frc_id = fraghdr->ip6f_ident;
 	key.frc_direction = dir;
 
 	if ((frag = pf_fillup_fragment(&key, frent, reason)) == NULL) {
 		PF_FRAG_UNLOCK();
 		return (PF_DROP);
 	}
 
 	/* The mbuf is part of the fragment entry, no direct free or access. */
 	m = *m0 = NULL;
 
 	if (!pf_isfull_fragment(frag)) {
 		PF_FRAG_UNLOCK();
 		return (PF_PASS);  /* Drop because *m0 is NULL, no error. */
 	}
 
 	/* We have all the data. */
 	extoff = frent->fe_extoff;
 	maxlen = frag->fr_maxlen;
 	frag_id = frag->fr_id;
 	frent = TAILQ_FIRST(&frag->fr_queue);
 	KASSERT(frent != NULL, ("frent != NULL"));
 	total = TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_off +
 		TAILQ_LAST(&frag->fr_queue, pf_fragq)->fe_len;
 	hdrlen = frent->fe_hdrlen - sizeof(struct ip6_frag);
 
 	m = *m0 = pf_join_fragment(frag);
 	frag = NULL;
 
 	PF_FRAG_UNLOCK();
 
 	/* Take protocol from first fragment header. */
 	m = m_getptr(m, hdrlen + offsetof(struct ip6_frag, ip6f_nxt), &off);
 	KASSERT(m, ("%s: short mbuf chain", __func__));
 	proto = *(mtod(m, caddr_t) + off);
 	m = *m0;
 
 	/* Delete frag6 header */
 	if (ip6_deletefraghdr(m, hdrlen, M_NOWAIT) != 0)
 		goto fail;
 
 	if (m->m_flags & M_PKTHDR) {
 		int plen = 0;
 		for (m = *m0; m; m = m->m_next)
 			plen += m->m_len;
 		m = *m0;
 		m->m_pkthdr.len = plen;
 	}
 
 	if ((mtag = m_tag_get(PF_REASSEMBLED, sizeof(struct pf_fragment_tag),
 	    M_NOWAIT)) == NULL)
 		goto fail;
 	ftag = (struct pf_fragment_tag *)(mtag + 1);
 	ftag->ft_hdrlen = hdrlen;
 	ftag->ft_extoff = extoff;
 	ftag->ft_maxlen = maxlen;
 	ftag->ft_id = frag_id;
 	m_tag_prepend(m, mtag);
 
 	ip6 = mtod(m, struct ip6_hdr *);
 	ip6->ip6_plen = htons(hdrlen - sizeof(struct ip6_hdr) + total);
 	if (extoff) {
 		/* Write protocol into next field of last extension header. */
 		m = m_getptr(m, extoff + offsetof(struct ip6_ext, ip6e_nxt),
 		    &off);
 		KASSERT(m, ("%s: short mbuf chain", __func__));
 		*(mtod(m, char *) + off) = proto;
 		m = *m0;
 	} else
 		ip6->ip6_nxt = proto;
 
 	if (hdrlen - sizeof(struct ip6_hdr) + total > IPV6_MAXPACKET) {
 		DPFPRINTF(("drop: too big: %d", total));
 		ip6->ip6_plen = 0;
 		REASON_SET(reason, PFRES_SHORT);
 		/* PF_DROP requires a valid mbuf *m0 in pf_test6(). */
 		return (PF_DROP);
 	}
 
 	DPFPRINTF(("complete: %p(%d)", m, ntohs(ip6->ip6_plen)));
 	return (PF_PASS);
 
 fail:
 	REASON_SET(reason, PFRES_MEMORY);
 	/* PF_DROP requires a valid mbuf *m0 in pf_test6(), will free later. */
 	return (PF_DROP);
 }
 #endif	/* INET6 */
 
 #ifdef INET
 static struct mbuf *
 pf_fragcache(struct mbuf **m0, struct ip *h, struct pf_fragment **frag, int mff,
     int drop, int *nomem)
 {
 	struct mbuf		*m = *m0;
 	struct pf_frent		*frp, *fra, *cur = NULL;
 	int			 ip_len = ntohs(h->ip_len) - (h->ip_hl << 2);
 	u_int16_t		 off = ntohs(h->ip_off) << 3;
 	u_int16_t		 max = ip_len + off;
 	int			 hosed = 0;
 
 	PF_FRAG_ASSERT();
 	KASSERT((*frag == NULL || !BUFFER_FRAGMENTS(*frag)),
 	    ("!(*frag == NULL || !BUFFER_FRAGMENTS(*frag)): %s", __FUNCTION__));
 
 	/* Create a new range queue for this packet */
 	if (*frag == NULL) {
 		*frag = uma_zalloc(V_pf_frag_z, M_NOWAIT);
 		if (*frag == NULL) {
 			pf_flush_fragments();
 			*frag = uma_zalloc(V_pf_frag_z, M_NOWAIT);
 			if (*frag == NULL)
 				goto no_mem;
 		}
 
 		/* Get an entry for the queue */
 		cur = uma_zalloc(V_pf_frent_z, M_NOWAIT);
 		if (cur == NULL) {
 			uma_zfree(V_pf_frag_z, *frag);
 			*frag = NULL;
 			goto no_mem;
 		}
 
 		(*frag)->fr_flags = PFFRAG_NOBUFFER;
 		(*frag)->fr_max = 0;
 		(*frag)->fr_src.v4 = h->ip_src;
 		(*frag)->fr_dst.v4 = h->ip_dst;
 		(*frag)->fr_id = h->ip_id;
 		(*frag)->fr_timeout = time_uptime;
 
 		cur->fe_off = off;
 		cur->fe_len = max; /* TODO: fe_len = max - off ? */
 		TAILQ_INIT(&(*frag)->fr_queue);
 		TAILQ_INSERT_HEAD(&(*frag)->fr_queue, cur, fr_next);
 
 		RB_INSERT(pf_frag_tree, &V_pf_cache_tree, *frag);
 		TAILQ_INSERT_HEAD(&V_pf_cachequeue, *frag, frag_next);
 
 		DPFPRINTF(("fragcache[%d]: new %d-%d\n", h->ip_id, off, max));
 
 		goto pass;
 	}
 
 	/*
 	 * Find a fragment after the current one:
 	 *  - off contains the real shifted offset.
 	 */
 	frp = NULL;
 	TAILQ_FOREACH(fra, &(*frag)->fr_queue, fr_next) {
 		if (fra->fe_off > off)
 			break;
 		frp = fra;
 	}
 
 	KASSERT((frp != NULL || fra != NULL),
 	    ("!(frp != NULL || fra != NULL): %s", __FUNCTION__));
 
 	if (frp != NULL) {
 		int	precut;
 
 		precut = frp->fe_len - off;
 		if (precut >= ip_len) {
 			/* Fragment is entirely a duplicate */
 			DPFPRINTF(("fragcache[%d]: dead (%d-%d) %d-%d\n",
 			    h->ip_id, frp->fe_off, frp->fe_len, off, max));
 			goto drop_fragment;
 		}
 		if (precut == 0) {
 			/* They are adjacent.  Fixup cache entry */
 			DPFPRINTF(("fragcache[%d]: adjacent (%d-%d) %d-%d\n",
 			    h->ip_id, frp->fe_off, frp->fe_len, off, max));
 			frp->fe_len = max;
 		} else if (precut > 0) {
 			/* The first part of this payload overlaps with a
 			 * fragment that has already been passed.
 			 * Need to trim off the first part of the payload.
 			 * But to do so easily, we need to create another
 			 * mbuf to throw the original header into.
 			 */
 
 			DPFPRINTF(("fragcache[%d]: chop %d (%d-%d) %d-%d\n",
 			    h->ip_id, precut, frp->fe_off, frp->fe_len, off,
 			    max));
 
 			off += precut;
 			max -= precut;
 			/* Update the previous frag to encompass this one */
 			frp->fe_len = max;
 
 			if (!drop) {
 				/* XXX Optimization opportunity
 				 * This is a very heavy way to trim the payload.
 				 * we could do it much faster by diddling mbuf
 				 * internals but that would be even less legible
 				 * than this mbuf magic.  For my next trick,
 				 * I'll pull a rabbit out of my laptop.
 				 */
 				*m0 = m_dup(m, M_NOWAIT);
 				if (*m0 == NULL)
 					goto no_mem;
 				/* From KAME Project : We have missed this! */
 				m_adj(*m0, (h->ip_hl << 2) -
 				    (*m0)->m_pkthdr.len);
 
 				KASSERT(((*m0)->m_next == NULL),
 				    ("(*m0)->m_next != NULL: %s",
 				    __FUNCTION__));
 				m_adj(m, precut + (h->ip_hl << 2));
 				m_cat(*m0, m);
 				m = *m0;
 				if (m->m_flags & M_PKTHDR) {
 					int plen = 0;
 					struct mbuf *t;
 					for (t = m; t; t = t->m_next)
 						plen += t->m_len;
 					m->m_pkthdr.len = plen;
 				}
 
 
 				h = mtod(m, struct ip *);
 
 				KASSERT(((int)m->m_len ==
 				    ntohs(h->ip_len) - precut),
 				    ("m->m_len != ntohs(h->ip_len) - precut: %s",
 				    __FUNCTION__));
 				h->ip_off = htons(ntohs(h->ip_off) +
 				    (precut >> 3));
 				h->ip_len = htons(ntohs(h->ip_len) - precut);
 			} else {
 				hosed++;
 			}
 		} else {
 			/* There is a gap between fragments */
 
 			DPFPRINTF(("fragcache[%d]: gap %d (%d-%d) %d-%d\n",
 			    h->ip_id, -precut, frp->fe_off, frp->fe_len, off,
 			    max));
 
 			cur = uma_zalloc(V_pf_frent_z, M_NOWAIT);
 			if (cur == NULL)
 				goto no_mem;
 
 			cur->fe_off = off;
 			cur->fe_len = max;
 			TAILQ_INSERT_AFTER(&(*frag)->fr_queue, frp, cur, fr_next);
 		}
 	}
 
 	if (fra != NULL) {
 		int	aftercut;
 		int	merge = 0;
 
 		aftercut = max - fra->fe_off;
 		if (aftercut == 0) {
 			/* Adjacent fragments */
 			DPFPRINTF(("fragcache[%d]: adjacent %d-%d (%d-%d)\n",
 			    h->ip_id, off, max, fra->fe_off, fra->fe_len));
 			fra->fe_off = off;
 			merge = 1;
 		} else if (aftercut > 0) {
 			/* Need to chop off the tail of this fragment */
 			DPFPRINTF(("fragcache[%d]: chop %d %d-%d (%d-%d)\n",
 			    h->ip_id, aftercut, off, max, fra->fe_off,
 			    fra->fe_len));
 			fra->fe_off = off;
 			max -= aftercut;
 
 			merge = 1;
 
 			if (!drop) {
 				m_adj(m, -aftercut);
 				if (m->m_flags & M_PKTHDR) {
 					int plen = 0;
 					struct mbuf *t;
 					for (t = m; t; t = t->m_next)
 						plen += t->m_len;
 					m->m_pkthdr.len = plen;
 				}
 				h = mtod(m, struct ip *);
 				KASSERT(((int)m->m_len == ntohs(h->ip_len) - aftercut),
 				    ("m->m_len != ntohs(h->ip_len) - aftercut: %s",
 				    __FUNCTION__));
 				h->ip_len = htons(ntohs(h->ip_len) - aftercut);
 			} else {
 				hosed++;
 			}
 		} else if (frp == NULL) {
 			/* There is a gap between fragments */
 			DPFPRINTF(("fragcache[%d]: gap %d %d-%d (%d-%d)\n",
 			    h->ip_id, -aftercut, off, max, fra->fe_off,
 			    fra->fe_len));
 
 			cur = uma_zalloc(V_pf_frent_z, M_NOWAIT);
 			if (cur == NULL)
 				goto no_mem;
 
 			cur->fe_off = off;
 			cur->fe_len = max;
 			TAILQ_INSERT_HEAD(&(*frag)->fr_queue, cur, fr_next);
 		}
 
 
 		/* Need to glue together two separate fragment descriptors */
 		if (merge) {
 			if (cur && fra->fe_off <= cur->fe_len) {
 				/* Need to merge in a previous 'cur' */
 				DPFPRINTF(("fragcache[%d]: adjacent(merge "
 				    "%d-%d) %d-%d (%d-%d)\n",
 				    h->ip_id, cur->fe_off, cur->fe_len, off,
 				    max, fra->fe_off, fra->fe_len));
 				fra->fe_off = cur->fe_off;
 				TAILQ_REMOVE(&(*frag)->fr_queue, cur, fr_next);
 				uma_zfree(V_pf_frent_z, cur);
 				cur = NULL;
 
 			} else if (frp && fra->fe_off <= frp->fe_len) {
 				/* Need to merge in a modified 'frp' */
 				KASSERT((cur == NULL), ("cur != NULL: %s",
 				    __FUNCTION__));
 				DPFPRINTF(("fragcache[%d]: adjacent(merge "
 				    "%d-%d) %d-%d (%d-%d)\n",
 				    h->ip_id, frp->fe_off, frp->fe_len, off,
 				    max, fra->fe_off, fra->fe_len));
 				fra->fe_off = frp->fe_off;
 				TAILQ_REMOVE(&(*frag)->fr_queue, frp, fr_next);
 				uma_zfree(V_pf_frent_z, frp);
 				frp = NULL;
 
 			}
 		}
 	}
 
 	if (hosed) {
 		/*
 		 * We must keep tracking the overall fragment even when
 		 * we're going to drop it anyway so that we know when to
 		 * free the overall descriptor.  Thus we drop the frag late.
 		 */
 		goto drop_fragment;
 	}
 
 
  pass:
 	/* Update maximum data size */
 	if ((*frag)->fr_max < max)
 		(*frag)->fr_max = max;
 
 	/* This is the last segment */
 	if (!mff)
 		(*frag)->fr_flags |= PFFRAG_SEENLAST;
 
 	/* Check if we are completely reassembled */
 	if (((*frag)->fr_flags & PFFRAG_SEENLAST) &&
 	    TAILQ_FIRST(&(*frag)->fr_queue)->fe_off == 0 &&
 	    TAILQ_FIRST(&(*frag)->fr_queue)->fe_len == (*frag)->fr_max) {
 		/* Remove from fragment queue */
 		DPFPRINTF(("fragcache[%d]: done 0-%d\n", h->ip_id,
 		    (*frag)->fr_max));
 		pf_free_fragment(*frag);
 		*frag = NULL;
 	}
 
 	return (m);
 
  no_mem:
 	*nomem = 1;
 
 	/* Still need to pay attention to !IP_MF */
 	if (!mff && *frag != NULL)
 		(*frag)->fr_flags |= PFFRAG_SEENLAST;
 
 	m_freem(m);
 	return (NULL);
 
  drop_fragment:
 
 	/* Still need to pay attention to !IP_MF */
 	if (!mff && *frag != NULL)
 		(*frag)->fr_flags |= PFFRAG_SEENLAST;
 
 	if (drop) {
 		/* This fragment has been deemed bad.  Don't reassemble */
 		if (((*frag)->fr_flags & PFFRAG_DROP) == 0)
 			DPFPRINTF(("fragcache[%d]: dropping overall fragment\n",
 			    h->ip_id));
 		(*frag)->fr_flags |= PFFRAG_DROP;
 	}
 
 	m_freem(m);
 	return (NULL);
 }
 #endif	/* INET */
 
 #ifdef INET6
 int
 pf_refragment6(struct ifnet *ifp, struct mbuf **m0, struct m_tag *mtag)
 {
 	struct mbuf		*m = *m0, *t;
 	struct pf_fragment_tag	*ftag = (struct pf_fragment_tag *)(mtag + 1);
 	struct pf_pdesc		 pd;
 	uint32_t		 frag_id;
 	uint16_t		 hdrlen, extoff, maxlen;
 	uint8_t			 proto;
 	int			 error, action;
 
 	hdrlen = ftag->ft_hdrlen;
 	extoff = ftag->ft_extoff;
 	maxlen = ftag->ft_maxlen;
 	frag_id = ftag->ft_id;
 	m_tag_delete(m, mtag);
 	mtag = NULL;
 	ftag = NULL;
 
 	if (extoff) {
 		int off;
 
 		/* Use protocol from next field of last extension header */
 		m = m_getptr(m, extoff + offsetof(struct ip6_ext, ip6e_nxt),
 		    &off);
 		KASSERT((m != NULL), ("pf_refragment6: short mbuf chain"));
 		proto = *(mtod(m, caddr_t) + off);
 		*(mtod(m, char *) + off) = IPPROTO_FRAGMENT;
 		m = *m0;
 	} else {
 		struct ip6_hdr *hdr;
 
 		hdr = mtod(m, struct ip6_hdr *);
 		proto = hdr->ip6_nxt;
 		hdr->ip6_nxt = IPPROTO_FRAGMENT;
 	}
 
 	/*
 	 * Maxlen may be less than 8 if there was only a single
 	 * fragment.  As it was fragmented before, add a fragment
 	 * header also for a single fragment.  If total or maxlen
 	 * is less than 8, ip6_fragment() will return EMSGSIZE and
 	 * we drop the packet.
 	 */
 	error = ip6_fragment(ifp, m, hdrlen, proto, maxlen, frag_id);
 	m = (*m0)->m_nextpkt;
 	(*m0)->m_nextpkt = NULL;
 	if (error == 0) {
 		/* The first mbuf contains the unfragmented packet. */
 		m_freem(*m0);
 		*m0 = NULL;
 		action = PF_PASS;
 	} else {
 		/* Drop expects an mbuf to free. */
 		DPFPRINTF(("refragment error %d", error));
 		action = PF_DROP;
 	}
 	for (t = m; m; m = t) {
 		t = m->m_nextpkt;
 		m->m_nextpkt = NULL;
 		m->m_flags |= M_SKIP_FIREWALL;
 		memset(&pd, 0, sizeof(pd));
 		pd.pf_mtag = pf_find_mtag(m);
 		if (error == 0)
 			ip6_forward(m, 0);
 		else
 			m_freem(m);
 	}
 
 	return (action);
 }
 #endif /* INET6 */
 
 #ifdef INET
 int
 pf_normalize_ip(struct mbuf **m0, int dir, struct pfi_kif *kif, u_short *reason,
     struct pf_pdesc *pd)
 {
 	struct mbuf		*m = *m0;
 	struct pf_rule		*r;
 	struct pf_fragment	*frag = NULL;
 	struct pf_fragment_cmp	key;
 	struct ip		*h = mtod(m, struct ip *);
 	int			 mff = (ntohs(h->ip_off) & IP_MF);
 	int			 hlen = h->ip_hl << 2;
 	u_int16_t		 fragoff = (ntohs(h->ip_off) & IP_OFFMASK) << 3;
 	u_int16_t		 max;
 	int			 ip_len;
 	int			 ip_off;
 	int			 tag = -1;
 	int			 verdict;
 
 	PF_RULES_RASSERT();
 
 	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr);
 	while (r != NULL) {
 		r->evaluations++;
 		if (pfi_kif_match(r->kif, kif) == r->ifnot)
 			r = r->skip[PF_SKIP_IFP].ptr;
 		else if (r->direction && r->direction != dir)
 			r = r->skip[PF_SKIP_DIR].ptr;
 		else if (r->af && r->af != AF_INET)
 			r = r->skip[PF_SKIP_AF].ptr;
 		else if (r->proto && r->proto != h->ip_p)
 			r = r->skip[PF_SKIP_PROTO].ptr;
 		else if (PF_MISMATCHAW(&r->src.addr,
 		    (struct pf_addr *)&h->ip_src.s_addr, AF_INET,
 		    r->src.neg, kif, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_SRC_ADDR].ptr;
 		else if (PF_MISMATCHAW(&r->dst.addr,
 		    (struct pf_addr *)&h->ip_dst.s_addr, AF_INET,
 		    r->dst.neg, NULL, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_DST_ADDR].ptr;
 		else if (r->match_tag && !pf_match_tag(m, r, &tag,
 		    pd->pf_mtag ? pd->pf_mtag->tag : 0))
 			r = TAILQ_NEXT(r, entries);
 		else
 			break;
 	}
 
 	if (r == NULL || r->action == PF_NOSCRUB)
 		return (PF_PASS);
 	else {
 		r->packets[dir == PF_OUT]++;
 		r->bytes[dir == PF_OUT] += pd->tot_len;
 	}
 
 	/* Check for illegal packets */
 	if (hlen < (int)sizeof(struct ip))
 		goto drop;
 
 	if (hlen > ntohs(h->ip_len))
 		goto drop;
 
 	/* Clear IP_DF if the rule uses the no-df option */
 	if (r->rule_flag & PFRULE_NODF && h->ip_off & htons(IP_DF)) {
 		u_int16_t ip_off = h->ip_off;
 
 		h->ip_off &= htons(~IP_DF);
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0);
 	}
 
 	/* We will need other tests here */
 	if (!fragoff && !mff)
 		goto no_fragment;
 
 	/* We're dealing with a fragment now. Don't allow fragments
 	 * with IP_DF to enter the cache. If the flag was cleared by
 	 * no-df above, fine. Otherwise drop it.
 	 */
 	if (h->ip_off & htons(IP_DF)) {
 		DPFPRINTF(("IP_DF\n"));
 		goto bad;
 	}
 
 	ip_len = ntohs(h->ip_len) - hlen;
 	ip_off = (ntohs(h->ip_off) & IP_OFFMASK) << 3;
 
 	/* All fragments are 8 byte aligned */
 	if (mff && (ip_len & 0x7)) {
 		DPFPRINTF(("mff and %d\n", ip_len));
 		goto bad;
 	}
 
 	/* Respect maximum length */
 	if (fragoff + ip_len > IP_MAXPACKET) {
 		DPFPRINTF(("max packet %d\n", fragoff + ip_len));
 		goto bad;
 	}
 	max = fragoff + ip_len;
 
 	if ((r->rule_flag & (PFRULE_FRAGCROP|PFRULE_FRAGDROP)) == 0) {
 
 		/* Fully buffer all of the fragments */
 		PF_FRAG_LOCK();
 
 		pf_ip2key(h, dir, &key);
 		frag = pf_find_fragment(&key, &V_pf_frag_tree);
 
 		/* Check if we saw the last fragment already */
 		if (frag != NULL && (frag->fr_flags & PFFRAG_SEENLAST) &&
 		    max > frag->fr_max)
 			goto bad;
 
 		/* Might return a completely reassembled mbuf, or NULL */
 		DPFPRINTF(("reass frag %d @ %d-%d\n", h->ip_id, fragoff, max));
 		verdict = pf_reassemble(m0, h, dir, reason);
 		PF_FRAG_UNLOCK();
 
 		if (verdict != PF_PASS)
 			return (PF_DROP);
 
 		m = *m0;
 		if (m == NULL)
 			return (PF_DROP);
 
 		if (frag != NULL && (frag->fr_flags & PFFRAG_DROP))
 			goto drop;
 
 		h = mtod(m, struct ip *);
 	} else {
 		/* non-buffering fragment cache (drops or masks overlaps) */
 		int	nomem = 0;
 
 		if (dir == PF_OUT && pd->pf_mtag &&
 		    pd->pf_mtag->flags & PF_TAG_FRAGCACHE) {
 			/*
 			 * Already passed the fragment cache in the
 			 * input direction.  If we continued, it would
 			 * appear to be a dup and would be dropped.
 			 */
 			goto fragment_pass;
 		}
 
 		PF_FRAG_LOCK();
 		pf_ip2key(h, dir, &key);
 		frag = pf_find_fragment(&key, &V_pf_cache_tree);
 
 		/* Check if we saw the last fragment already */
 		if (frag != NULL && (frag->fr_flags & PFFRAG_SEENLAST) &&
 		    max > frag->fr_max) {
 			if (r->rule_flag & PFRULE_FRAGDROP)
 				frag->fr_flags |= PFFRAG_DROP;
 			goto bad;
 		}
 
 		*m0 = m = pf_fragcache(m0, h, &frag, mff,
 		    (r->rule_flag & PFRULE_FRAGDROP) ? 1 : 0, &nomem);
 		PF_FRAG_UNLOCK();
 		if (m == NULL) {
 			if (nomem)
 				goto no_mem;
 			goto drop;
 		}
 
 		if (dir == PF_IN) {
 			/* Use mtag from copied and trimmed mbuf chain. */
 			pd->pf_mtag = pf_get_mtag(m);
 			if (pd->pf_mtag == NULL) {
 				m_freem(m);
 				*m0 = NULL;
 				goto no_mem;
 			}
 			pd->pf_mtag->flags |= PF_TAG_FRAGCACHE;
 		}
 
 		if (frag != NULL && (frag->fr_flags & PFFRAG_DROP))
 			goto drop;
 		goto fragment_pass;
 	}
 
  no_fragment:
 	/* At this point, only IP_DF is allowed in ip_off */
 	if (h->ip_off & ~htons(IP_DF)) {
 		u_int16_t ip_off = h->ip_off;
 
 		h->ip_off &= htons(IP_DF);
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0);
 	}
 
 	/* not missing a return here */
 
  fragment_pass:
 	pf_scrub_ip(&m, r->rule_flag, r->min_ttl, r->set_tos);
 
 	if ((r->rule_flag & (PFRULE_FRAGCROP|PFRULE_FRAGDROP)) == 0)
 		pd->flags |= PFDESC_IP_REAS;
 	return (PF_PASS);
 
  no_mem:
 	REASON_SET(reason, PFRES_MEMORY);
 	if (r != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd,
 		    1);
 	return (PF_DROP);
 
  drop:
 	REASON_SET(reason, PFRES_NORM);
 	if (r != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd,
 		    1);
 	return (PF_DROP);
 
  bad:
 	DPFPRINTF(("dropping bad fragment\n"));
 
 	/* Free associated fragments */
 	if (frag != NULL) {
 		pf_free_fragment(frag);
 		PF_FRAG_UNLOCK();
 	}
 
 	REASON_SET(reason, PFRES_FRAG);
 	if (r != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET, dir, *reason, r, NULL, NULL, pd,
 		    1);
 
 	return (PF_DROP);
 }
 #endif
 
 #ifdef INET6
 int
 pf_normalize_ip6(struct mbuf **m0, int dir, struct pfi_kif *kif,
     u_short *reason, struct pf_pdesc *pd)
 {
 	struct mbuf		*m = *m0;
 	struct pf_rule		*r;
 	struct ip6_hdr		*h = mtod(m, struct ip6_hdr *);
 	int			 extoff;
 	int			 off;
 	struct ip6_ext		 ext;
 	struct ip6_opt		 opt;
 	struct ip6_opt_jumbo	 jumbo;
 	struct ip6_frag		 frag;
 	u_int32_t		 jumbolen = 0, plen;
 	int			 optend;
 	int			 ooff;
 	u_int8_t		 proto;
 	int			 terminal;
 
 	PF_RULES_RASSERT();
 
 	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr);
 	while (r != NULL) {
 		r->evaluations++;
 		if (pfi_kif_match(r->kif, kif) == r->ifnot)
 			r = r->skip[PF_SKIP_IFP].ptr;
 		else if (r->direction && r->direction != dir)
 			r = r->skip[PF_SKIP_DIR].ptr;
 		else if (r->af && r->af != AF_INET6)
 			r = r->skip[PF_SKIP_AF].ptr;
 #if 0 /* header chain! */
 		else if (r->proto && r->proto != h->ip6_nxt)
 			r = r->skip[PF_SKIP_PROTO].ptr;
 #endif
 		else if (PF_MISMATCHAW(&r->src.addr,
 		    (struct pf_addr *)&h->ip6_src, AF_INET6,
 		    r->src.neg, kif, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_SRC_ADDR].ptr;
 		else if (PF_MISMATCHAW(&r->dst.addr,
 		    (struct pf_addr *)&h->ip6_dst, AF_INET6,
 		    r->dst.neg, NULL, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_DST_ADDR].ptr;
 		else
 			break;
 	}
 
 	if (r == NULL || r->action == PF_NOSCRUB)
 		return (PF_PASS);
 	else {
 		r->packets[dir == PF_OUT]++;
 		r->bytes[dir == PF_OUT] += pd->tot_len;
 	}
 
 	/* Check for illegal packets */
 	if (sizeof(struct ip6_hdr) + IPV6_MAXPACKET < m->m_pkthdr.len)
 		goto drop;
 
 	extoff = 0;
 	off = sizeof(struct ip6_hdr);
 	proto = h->ip6_nxt;
 	terminal = 0;
 	do {
 		switch (proto) {
 		case IPPROTO_FRAGMENT:
 			goto fragment;
 			break;
 		case IPPROTO_AH:
 		case IPPROTO_ROUTING:
 		case IPPROTO_DSTOPTS:
 			if (!pf_pull_hdr(m, off, &ext, sizeof(ext), NULL,
 			    NULL, AF_INET6))
 				goto shortpkt;
 			extoff = off;
 			if (proto == IPPROTO_AH)
 				off += (ext.ip6e_len + 2) * 4;
 			else
 				off += (ext.ip6e_len + 1) * 8;
 			proto = ext.ip6e_nxt;
 			break;
 		case IPPROTO_HOPOPTS:
 			if (!pf_pull_hdr(m, off, &ext, sizeof(ext), NULL,
 			    NULL, AF_INET6))
 				goto shortpkt;
 			extoff = off;
 			optend = off + (ext.ip6e_len + 1) * 8;
 			ooff = off + sizeof(ext);
 			do {
 				if (!pf_pull_hdr(m, ooff, &opt.ip6o_type,
 				    sizeof(opt.ip6o_type), NULL, NULL,
 				    AF_INET6))
 					goto shortpkt;
 				if (opt.ip6o_type == IP6OPT_PAD1) {
 					ooff++;
 					continue;
 				}
 				if (!pf_pull_hdr(m, ooff, &opt, sizeof(opt),
 				    NULL, NULL, AF_INET6))
 					goto shortpkt;
 				if (ooff + sizeof(opt) + opt.ip6o_len > optend)
 					goto drop;
 				switch (opt.ip6o_type) {
 				case IP6OPT_JUMBO:
 					if (h->ip6_plen != 0)
 						goto drop;
 					if (!pf_pull_hdr(m, ooff, &jumbo,
 					    sizeof(jumbo), NULL, NULL,
 					    AF_INET6))
 						goto shortpkt;
 					memcpy(&jumbolen, jumbo.ip6oj_jumbo_len,
 					    sizeof(jumbolen));
 					jumbolen = ntohl(jumbolen);
 					if (jumbolen <= IPV6_MAXPACKET)
 						goto drop;
 					if (sizeof(struct ip6_hdr) + jumbolen !=
 					    m->m_pkthdr.len)
 						goto drop;
 					break;
 				default:
 					break;
 				}
 				ooff += sizeof(opt) + opt.ip6o_len;
 			} while (ooff < optend);
 
 			off = optend;
 			proto = ext.ip6e_nxt;
 			break;
 		default:
 			terminal = 1;
 			break;
 		}
 	} while (!terminal);
 
 	/* jumbo payload option must be present, or plen > 0 */
 	if (ntohs(h->ip6_plen) == 0)
 		plen = jumbolen;
 	else
 		plen = ntohs(h->ip6_plen);
 	if (plen == 0)
 		goto drop;
 	if (sizeof(struct ip6_hdr) + plen > m->m_pkthdr.len)
 		goto shortpkt;
 
 	pf_scrub_ip6(&m, r->min_ttl);
 
 	return (PF_PASS);
 
  fragment:
 	/* Jumbo payload packets cannot be fragmented. */
 	plen = ntohs(h->ip6_plen);
 	if (plen == 0 || jumbolen)
 		goto drop;
 	if (sizeof(struct ip6_hdr) + plen > m->m_pkthdr.len)
 		goto shortpkt;
 
 	if (!pf_pull_hdr(m, off, &frag, sizeof(frag), NULL, NULL, AF_INET6))
 		goto shortpkt;
 
 	/* Offset now points to data portion. */
 	off += sizeof(frag);
 
 	/* Returns PF_DROP or *m0 is NULL or completely reassembled mbuf. */
 	if (pf_reassemble6(m0, h, &frag, off, extoff, dir, reason) != PF_PASS)
 		return (PF_DROP);
 	m = *m0;
 	if (m == NULL)
 		return (PF_DROP);
 
 	pd->flags |= PFDESC_IP_REAS;
 	return (PF_PASS);
 
  shortpkt:
 	REASON_SET(reason, PFRES_SHORT);
 	if (r != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET6, dir, *reason, r, NULL, NULL, pd,
 		    1);
 	return (PF_DROP);
 
  drop:
 	REASON_SET(reason, PFRES_NORM);
 	if (r != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET6, dir, *reason, r, NULL, NULL, pd,
 		    1);
 	return (PF_DROP);
 }
 #endif /* INET6 */
 
 int
 pf_normalize_tcp(int dir, struct pfi_kif *kif, struct mbuf *m, int ipoff,
     int off, void *h, struct pf_pdesc *pd)
 {
 	struct pf_rule	*r, *rm = NULL;
 	struct tcphdr	*th = pd->hdr.tcp;
 	int		 rewrite = 0;
 	u_short		 reason;
 	u_int8_t	 flags;
 	sa_family_t	 af = pd->af;
 
 	PF_RULES_RASSERT();
 
 	r = TAILQ_FIRST(pf_main_ruleset.rules[PF_RULESET_SCRUB].active.ptr);
 	while (r != NULL) {
 		r->evaluations++;
 		if (pfi_kif_match(r->kif, kif) == r->ifnot)
 			r = r->skip[PF_SKIP_IFP].ptr;
 		else if (r->direction && r->direction != dir)
 			r = r->skip[PF_SKIP_DIR].ptr;
 		else if (r->af && r->af != af)
 			r = r->skip[PF_SKIP_AF].ptr;
 		else if (r->proto && r->proto != pd->proto)
 			r = r->skip[PF_SKIP_PROTO].ptr;
 		else if (PF_MISMATCHAW(&r->src.addr, pd->src, af,
 		    r->src.neg, kif, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_SRC_ADDR].ptr;
 		else if (r->src.port_op && !pf_match_port(r->src.port_op,
 			    r->src.port[0], r->src.port[1], th->th_sport))
 			r = r->skip[PF_SKIP_SRC_PORT].ptr;
 		else if (PF_MISMATCHAW(&r->dst.addr, pd->dst, af,
 		    r->dst.neg, NULL, M_GETFIB(m)))
 			r = r->skip[PF_SKIP_DST_ADDR].ptr;
 		else if (r->dst.port_op && !pf_match_port(r->dst.port_op,
 			    r->dst.port[0], r->dst.port[1], th->th_dport))
 			r = r->skip[PF_SKIP_DST_PORT].ptr;
 		else if (r->os_fingerprint != PF_OSFP_ANY && !pf_osfp_match(
 			    pf_osfp_fingerprint(pd, m, off, th),
 			    r->os_fingerprint))
 			r = TAILQ_NEXT(r, entries);
 		else {
 			rm = r;
 			break;
 		}
 	}
 
 	if (rm == NULL || rm->action == PF_NOSCRUB)
 		return (PF_PASS);
 	else {
 		r->packets[dir == PF_OUT]++;
 		r->bytes[dir == PF_OUT] += pd->tot_len;
 	}
 
 	if (rm->rule_flag & PFRULE_REASSEMBLE_TCP)
 		pd->flags |= PFDESC_TCP_NORM;
 
 	flags = th->th_flags;
 	if (flags & TH_SYN) {
 		/* Illegal packet */
 		if (flags & TH_RST)
 			goto tcp_drop;
 
 		if (flags & TH_FIN)
-			flags &= ~TH_FIN;
+			goto tcp_drop;
 	} else {
 		/* Illegal packet */
 		if (!(flags & (TH_ACK|TH_RST)))
 			goto tcp_drop;
 	}
 
 	if (!(flags & TH_ACK)) {
 		/* These flags are only valid if ACK is set */
 		if ((flags & TH_FIN) || (flags & TH_PUSH) || (flags & TH_URG))
 			goto tcp_drop;
 	}
 
 	/* Check for illegal header length */
 	if (th->th_off < (sizeof(struct tcphdr) >> 2))
 		goto tcp_drop;
 
 	/* If flags changed, or reserved data set, then adjust */
 	if (flags != th->th_flags || th->th_x2 != 0) {
 		u_int16_t	ov, nv;
 
 		ov = *(u_int16_t *)(&th->th_ack + 1);
 		th->th_flags = flags;
 		th->th_x2 = 0;
 		nv = *(u_int16_t *)(&th->th_ack + 1);
 
 		th->th_sum = pf_cksum_fixup(th->th_sum, ov, nv, 0);
 		rewrite = 1;
 	}
 
 	/* Remove urgent pointer, if TH_URG is not set */
 	if (!(flags & TH_URG) && th->th_urp) {
 		th->th_sum = pf_cksum_fixup(th->th_sum, th->th_urp, 0, 0);
 		th->th_urp = 0;
 		rewrite = 1;
 	}
 
 	/* Process options */
 	if (r->max_mss && pf_normalize_tcpopt(r, m, th, off, pd->af))
 		rewrite = 1;
 
 	/* copy back packet headers if we sanitized */
 	if (rewrite)
 		m_copyback(m, off, sizeof(*th), (caddr_t)th);
 
 	return (PF_PASS);
 
  tcp_drop:
 	REASON_SET(&reason, PFRES_NORM);
 	if (rm != NULL && r->log)
 		PFLOG_PACKET(kif, m, AF_INET, dir, reason, r, NULL, NULL, pd,
 		    1);
 	return (PF_DROP);
 }
 
 int
 pf_normalize_tcp_init(struct mbuf *m, int off, struct pf_pdesc *pd,
     struct tcphdr *th, struct pf_state_peer *src, struct pf_state_peer *dst)
 {
 	u_int32_t tsval, tsecr;
 	u_int8_t hdr[60];
 	u_int8_t *opt;
 
 	KASSERT((src->scrub == NULL),
 	    ("pf_normalize_tcp_init: src->scrub != NULL"));
 
 	src->scrub = uma_zalloc(V_pf_state_scrub_z, M_ZERO | M_NOWAIT);
 	if (src->scrub == NULL)
 		return (1);
 
 	switch (pd->af) {
 #ifdef INET
 	case AF_INET: {
 		struct ip *h = mtod(m, struct ip *);
 		src->scrub->pfss_ttl = h->ip_ttl;
 		break;
 	}
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6: {
 		struct ip6_hdr *h = mtod(m, struct ip6_hdr *);
 		src->scrub->pfss_ttl = h->ip6_hlim;
 		break;
 	}
 #endif /* INET6 */
 	}
 
 
 	/*
 	 * All normalizations below are only begun if we see the start of
 	 * the connections.  They must all set an enabled bit in pfss_flags
 	 */
 	if ((th->th_flags & TH_SYN) == 0)
 		return (0);
 
 
 	if (th->th_off > (sizeof(struct tcphdr) >> 2) && src->scrub &&
 	    pf_pull_hdr(m, off, hdr, th->th_off << 2, NULL, NULL, pd->af)) {
 		/* Diddle with TCP options */
 		int hlen;
 		opt = hdr + sizeof(struct tcphdr);
 		hlen = (th->th_off << 2) - sizeof(struct tcphdr);
 		while (hlen >= TCPOLEN_TIMESTAMP) {
 			switch (*opt) {
 			case TCPOPT_EOL:	/* FALLTHROUGH */
 			case TCPOPT_NOP:
 				opt++;
 				hlen--;
 				break;
 			case TCPOPT_TIMESTAMP:
 				if (opt[1] >= TCPOLEN_TIMESTAMP) {
 					src->scrub->pfss_flags |=
 					    PFSS_TIMESTAMP;
 					src->scrub->pfss_ts_mod =
 					    htonl(arc4random());
 
 					/* note PFSS_PAWS not set yet */
 					memcpy(&tsval, &opt[2],
 					    sizeof(u_int32_t));
 					memcpy(&tsecr, &opt[6],
 					    sizeof(u_int32_t));
 					src->scrub->pfss_tsval0 = ntohl(tsval);
 					src->scrub->pfss_tsval = ntohl(tsval);
 					src->scrub->pfss_tsecr = ntohl(tsecr);
 					getmicrouptime(&src->scrub->pfss_last);
 				}
 				/* FALLTHROUGH */
 			default:
 				hlen -= MAX(opt[1], 2);
 				opt += MAX(opt[1], 2);
 				break;
 			}
 		}
 	}
 
 	return (0);
 }
 
 void
 pf_normalize_tcp_cleanup(struct pf_state *state)
 {
 	if (state->src.scrub)
 		uma_zfree(V_pf_state_scrub_z, state->src.scrub);
 	if (state->dst.scrub)
 		uma_zfree(V_pf_state_scrub_z, state->dst.scrub);
 
 	/* Someday... flush the TCP segment reassembly descriptors. */
 }
 
 int
 pf_normalize_tcp_stateful(struct mbuf *m, int off, struct pf_pdesc *pd,
     u_short *reason, struct tcphdr *th, struct pf_state *state,
     struct pf_state_peer *src, struct pf_state_peer *dst, int *writeback)
 {
 	struct timeval uptime;
 	u_int32_t tsval, tsecr;
 	u_int tsval_from_last;
 	u_int8_t hdr[60];
 	u_int8_t *opt;
 	int copyback = 0;
 	int got_ts = 0;
 
 	KASSERT((src->scrub || dst->scrub),
 	    ("%s: src->scrub && dst->scrub!", __func__));
 
 	/*
 	 * Enforce the minimum TTL seen for this connection.  Negate a common
 	 * technique to evade an intrusion detection system and confuse
 	 * firewall state code.
 	 */
 	switch (pd->af) {
 #ifdef INET
 	case AF_INET: {
 		if (src->scrub) {
 			struct ip *h = mtod(m, struct ip *);
 			if (h->ip_ttl > src->scrub->pfss_ttl)
 				src->scrub->pfss_ttl = h->ip_ttl;
 			h->ip_ttl = src->scrub->pfss_ttl;
 		}
 		break;
 	}
 #endif /* INET */
 #ifdef INET6
 	case AF_INET6: {
 		if (src->scrub) {
 			struct ip6_hdr *h = mtod(m, struct ip6_hdr *);
 			if (h->ip6_hlim > src->scrub->pfss_ttl)
 				src->scrub->pfss_ttl = h->ip6_hlim;
 			h->ip6_hlim = src->scrub->pfss_ttl;
 		}
 		break;
 	}
 #endif /* INET6 */
 	}
 
 	if (th->th_off > (sizeof(struct tcphdr) >> 2) &&
 	    ((src->scrub && (src->scrub->pfss_flags & PFSS_TIMESTAMP)) ||
 	    (dst->scrub && (dst->scrub->pfss_flags & PFSS_TIMESTAMP))) &&
 	    pf_pull_hdr(m, off, hdr, th->th_off << 2, NULL, NULL, pd->af)) {
 		/* Diddle with TCP options */
 		int hlen;
 		opt = hdr + sizeof(struct tcphdr);
 		hlen = (th->th_off << 2) - sizeof(struct tcphdr);
 		while (hlen >= TCPOLEN_TIMESTAMP) {
 			switch (*opt) {
 			case TCPOPT_EOL:	/* FALLTHROUGH */
 			case TCPOPT_NOP:
 				opt++;
 				hlen--;
 				break;
 			case TCPOPT_TIMESTAMP:
 				/* Modulate the timestamps.  Can be used for
 				 * NAT detection, OS uptime determination or
 				 * reboot detection.
 				 */
 
 				if (got_ts) {
 					/* Huh?  Multiple timestamps!? */
 					if (V_pf_status.debug >= PF_DEBUG_MISC) {
 						DPFPRINTF(("multiple TS??"));
 						pf_print_state(state);
 						printf("\n");
 					}
 					REASON_SET(reason, PFRES_TS);
 					return (PF_DROP);
 				}
 				if (opt[1] >= TCPOLEN_TIMESTAMP) {
 					memcpy(&tsval, &opt[2],
 					    sizeof(u_int32_t));
 					if (tsval && src->scrub &&
 					    (src->scrub->pfss_flags &
 					    PFSS_TIMESTAMP)) {
 						tsval = ntohl(tsval);
 						pf_change_a(&opt[2],
 						    &th->th_sum,
 						    htonl(tsval +
 						    src->scrub->pfss_ts_mod),
 						    0);
 						copyback = 1;
 					}
 
 					/* Modulate TS reply iff valid (!0) */
 					memcpy(&tsecr, &opt[6],
 					    sizeof(u_int32_t));
 					if (tsecr && dst->scrub &&
 					    (dst->scrub->pfss_flags &
 					    PFSS_TIMESTAMP)) {
 						tsecr = ntohl(tsecr)
 						    - dst->scrub->pfss_ts_mod;
 						pf_change_a(&opt[6],
 						    &th->th_sum, htonl(tsecr),
 						    0);
 						copyback = 1;
 					}
 					got_ts = 1;
 				}
 				/* FALLTHROUGH */
 			default:
 				hlen -= MAX(opt[1], 2);
 				opt += MAX(opt[1], 2);
 				break;
 			}
 		}
 		if (copyback) {
 			/* Copy back the options; caller copies back header */
 			*writeback = 1;
 			m_copyback(m, off + sizeof(struct tcphdr),
 			    (th->th_off << 2) - sizeof(struct tcphdr), hdr +
 			    sizeof(struct tcphdr));
 		}
 	}
 
 
 	/*
 	 * Must invalidate PAWS checks on connections idle for too long.
 	 * The fastest allowed timestamp clock is 1ms.  That turns out to
 	 * be about 24 days before it wraps.  XXX Right now our lowerbound
 	 * TS echo check only works for the first 12 days of a connection
 	 * when the TS has exhausted half its 32bit space.
 	 */
 #define TS_MAX_IDLE	(24*24*60*60)
 #define TS_MAX_CONN	(12*24*60*60)	/* XXX remove when better tsecr check */
 
 	getmicrouptime(&uptime);
 	if (src->scrub && (src->scrub->pfss_flags & PFSS_PAWS) &&
 	    (uptime.tv_sec - src->scrub->pfss_last.tv_sec > TS_MAX_IDLE ||
 	    time_uptime - state->creation > TS_MAX_CONN))  {
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			DPFPRINTF(("src idled out of PAWS\n"));
 			pf_print_state(state);
 			printf("\n");
 		}
 		src->scrub->pfss_flags = (src->scrub->pfss_flags & ~PFSS_PAWS)
 		    | PFSS_PAWS_IDLED;
 	}
 	if (dst->scrub && (dst->scrub->pfss_flags & PFSS_PAWS) &&
 	    uptime.tv_sec - dst->scrub->pfss_last.tv_sec > TS_MAX_IDLE) {
 		if (V_pf_status.debug >= PF_DEBUG_MISC) {
 			DPFPRINTF(("dst idled out of PAWS\n"));
 			pf_print_state(state);
 			printf("\n");
 		}
 		dst->scrub->pfss_flags = (dst->scrub->pfss_flags & ~PFSS_PAWS)
 		    | PFSS_PAWS_IDLED;
 	}
 
 	if (got_ts && src->scrub && dst->scrub &&
 	    (src->scrub->pfss_flags & PFSS_PAWS) &&
 	    (dst->scrub->pfss_flags & PFSS_PAWS)) {
 		/* Validate that the timestamps are "in-window".
 		 * RFC1323 describes TCP Timestamp options that allow
 		 * measurement of RTT (round trip time) and PAWS
 		 * (protection against wrapped sequence numbers).  PAWS
 		 * gives us a set of rules for rejecting packets on
 		 * long fat pipes (packets that were somehow delayed
 		 * in transit longer than the time it took to send the
 		 * full TCP sequence space of 4GB).  We can use these
 		 * rules and infer a few others that will let us treat
 		 * the 32bit timestamp and the 32bit echoed timestamp
 		 * as sequence numbers to prevent a blind attacker from
 		 * inserting packets into a connection.
 		 *
 		 * RFC1323 tells us:
 		 *  - The timestamp on this packet must be greater than
 		 *    or equal to the last value echoed by the other
 		 *    endpoint.  The RFC says older packets will be discarded
 		 *    since they are dups that have already been acked.
 		 *    This gives us a lowerbound on the timestamp.
 		 *        timestamp >= other last echoed timestamp
 		 *  - The timestamp will be less than or equal to
 		 *    the last timestamp plus the time between the
 		 *    last packet and now.  The RFC defines the max
 		 *    clock rate as 1ms.  We will allow clocks to be
 		 *    up to 10% fast and will allow a total difference
 		 *    of 30 seconds due to a route change.  And this
 		 *    gives us an upperbound on the timestamp.
 		 *        timestamp <= last timestamp + max ticks
 		 *    We have to be careful here.  Windows will send an
 		 *    initial timestamp of zero and then initialize it
 		 *    to a random value after the 3whs; presumably to
 		 *    avoid a DoS by having to call an expensive RNG
 		 *    during a SYN flood.  Proof MS has at least one
 		 *    good security geek.
 		 *
 		 *  - The TCP timestamp option must also echo the other
 		 *    endpoint's timestamp.  The timestamp echoed is the
 		 *    one carried on the earliest unacknowledged segment
 		 *    on the left edge of the sequence window.  The RFC
 		 *    states that the host will reject any echoed
 		 *    timestamps that were larger than any ever sent.
 		 *    This gives us an upperbound on the TS echo.
 		 *        tsecr <= largest_tsval
 		 *  - The lowerbound on the TS echo is a little more
 		 *    tricky to determine.  The other endpoint's echoed
 		 *    values will not decrease.  But there may be
 		 *    network conditions that re-order packets and
 		 *    cause our view of them to decrease.  For now the
 		 *    only lowerbound we can safely determine is that
 		 *    the TS echo will never be less than the original
 		 *    TS.  XXX There is probably a better lowerbound.
 		 *    Remove TS_MAX_CONN with better lowerbound check.
 		 *        tsecr >= other original TS
 		 *
 		 * It is also important to note that the fastest
 		 * timestamp clock of 1ms will wrap its 32bit space in
 		 * 24 days.  So we just disable TS checking after 24
 		 * days of idle time.  We actually must use a 12d
 		 * connection limit until we can come up with a better
 		 * lowerbound to the TS echo check.
 		 */
 		struct timeval delta_ts;
 		int ts_fudge;
 
 
 		/*
 		 * PFTM_TS_DIFF is how many seconds of leeway to allow
 		 * a host's timestamp.  This can happen if the previous
 		 * packet got delayed in transit for much longer than
 		 * this packet.
 		 */
 		if ((ts_fudge = state->rule.ptr->timeout[PFTM_TS_DIFF]) == 0)
 			ts_fudge = V_pf_default_rule.timeout[PFTM_TS_DIFF];
 
 		/* Calculate max ticks since the last timestamp */
 #define TS_MAXFREQ	1100		/* RFC max TS freq of 1kHz + 10% skew */
 #define TS_MICROSECS	1000000		/* microseconds per second */
 		delta_ts = uptime;
 		timevalsub(&delta_ts, &src->scrub->pfss_last);
 		tsval_from_last = (delta_ts.tv_sec + ts_fudge) * TS_MAXFREQ;
 		tsval_from_last += delta_ts.tv_usec / (TS_MICROSECS/TS_MAXFREQ);
 
 		if ((src->state >= TCPS_ESTABLISHED &&
 		    dst->state >= TCPS_ESTABLISHED) &&
 		    (SEQ_LT(tsval, dst->scrub->pfss_tsecr) ||
 		    SEQ_GT(tsval, src->scrub->pfss_tsval + tsval_from_last) ||
 		    (tsecr && (SEQ_GT(tsecr, dst->scrub->pfss_tsval) ||
 		    SEQ_LT(tsecr, dst->scrub->pfss_tsval0))))) {
 			/* Bad RFC1323 implementation or an insertion attack.
 			 *
 			 * - Solaris 2.6 and 2.7 are known to send another ACK
 			 *   after the FIN,FIN|ACK,ACK closing that carries
 			 *   an old timestamp.
 			 */
 
 			DPFPRINTF(("Timestamp failed %c%c%c%c\n",
 			    SEQ_LT(tsval, dst->scrub->pfss_tsecr) ? '0' : ' ',
 			    SEQ_GT(tsval, src->scrub->pfss_tsval +
 			    tsval_from_last) ? '1' : ' ',
 			    SEQ_GT(tsecr, dst->scrub->pfss_tsval) ? '2' : ' ',
 			    SEQ_LT(tsecr, dst->scrub->pfss_tsval0)? '3' : ' '));
 			DPFPRINTF((" tsval: %u  tsecr: %u  +ticks: %u  "
 			    "idle: %jus %lums\n",
 			    tsval, tsecr, tsval_from_last,
 			    (uintmax_t)delta_ts.tv_sec,
 			    delta_ts.tv_usec / 1000));
 			DPFPRINTF((" src->tsval: %u  tsecr: %u\n",
 			    src->scrub->pfss_tsval, src->scrub->pfss_tsecr));
 			DPFPRINTF((" dst->tsval: %u  tsecr: %u  tsval0: %u"
 			    "\n", dst->scrub->pfss_tsval,
 			    dst->scrub->pfss_tsecr, dst->scrub->pfss_tsval0));
 			if (V_pf_status.debug >= PF_DEBUG_MISC) {
 				pf_print_state(state);
 				pf_print_flags(th->th_flags);
 				printf("\n");
 			}
 			REASON_SET(reason, PFRES_TS);
 			return (PF_DROP);
 		}
 
 		/* XXX I'd really like to require tsecr but it's optional */
 
 	} else if (!got_ts && (th->th_flags & TH_RST) == 0 &&
 	    ((src->state == TCPS_ESTABLISHED && dst->state == TCPS_ESTABLISHED)
 	    || pd->p_len > 0 || (th->th_flags & TH_SYN)) &&
 	    src->scrub && dst->scrub &&
 	    (src->scrub->pfss_flags & PFSS_PAWS) &&
 	    (dst->scrub->pfss_flags & PFSS_PAWS)) {
 		/* Didn't send a timestamp.  Timestamps aren't really useful
 		 * when:
 		 *  - connection opening or closing (often not even sent).
 	 *    but we must not let an attacker put a FIN on a
 		 *    data packet to sneak it through our ESTABLISHED check.
 		 *  - on a TCP reset.  RFC suggests not even looking at TS.
 		 *  - on an empty ACK.  The TS will not be echoed so it will
 		 *    probably not help keep the RTT calculation in sync and
 		 *    there isn't as much danger when the sequence numbers
 		 *    got wrapped.  So some stacks don't include TS on empty
 		 *    ACKs :-(
 		 *
 		 * To minimize the disruption to mostly RFC1323 conformant
 		 * stacks, we will only require timestamps on data packets.
 		 *
 		 * And what do ya know, we cannot require timestamps on data
 		 * packets.  There appear to be devices that do legitimate
 		 * TCP connection hijacking.  There are HTTP devices that allow
 		 * a 3whs (with timestamps) and then buffer the HTTP request.
 		 * If the intermediate device has the HTTP response cache, it
 		 * will spoof the response but not bother timestamping its
 		 * packets.  So we can look for the presence of a timestamp in
 		 * the first data packet and if there, require it in all future
 		 * packets.
 		 */
 
 		if (pd->p_len > 0 && (src->scrub->pfss_flags & PFSS_DATA_TS)) {
 			/*
 			 * Hey!  Someone tried to sneak a packet in.  Or the
 			 * stack changed its RFC1323 behavior?!?!
 			 */
 			if (V_pf_status.debug >= PF_DEBUG_MISC) {
 				DPFPRINTF(("Did not receive expected RFC1323 "
 				    "timestamp\n"));
 				pf_print_state(state);
 				pf_print_flags(th->th_flags);
 				printf("\n");
 			}
 			REASON_SET(reason, PFRES_TS);
 			return (PF_DROP);
 		}
 	}
 
 
 	/*
 	 * We will note if a host sends his data packets with or without
 	 * timestamps.  And require all data packets to contain a timestamp
 	 * if the first does.  PAWS implicitly requires that all data packets be
 	 * timestamped.  But I think there are middle-man devices that hijack
 	 * TCP streams immediately after the 3whs and don't timestamp their
 	 * packets (seen in a WWW accelerator or cache).
 	 */
 	if (pd->p_len > 0 && src->scrub && (src->scrub->pfss_flags &
 	    (PFSS_TIMESTAMP|PFSS_DATA_TS|PFSS_DATA_NOTS)) == PFSS_TIMESTAMP) {
 		if (got_ts)
 			src->scrub->pfss_flags |= PFSS_DATA_TS;
 		else {
 			src->scrub->pfss_flags |= PFSS_DATA_NOTS;
 			if (V_pf_status.debug >= PF_DEBUG_MISC && dst->scrub &&
 			    (dst->scrub->pfss_flags & PFSS_TIMESTAMP)) {
 				/* Don't warn if other host rejected RFC1323 */
 				DPFPRINTF(("Broken RFC1323 stack did not "
 				    "timestamp data packet. Disabled PAWS "
 				    "security.\n"));
 				pf_print_state(state);
 				pf_print_flags(th->th_flags);
 				printf("\n");
 			}
 		}
 	}
 
 
 	/*
 	 * Update PAWS values
 	 */
 	if (got_ts && src->scrub && PFSS_TIMESTAMP == (src->scrub->pfss_flags &
 	    (PFSS_PAWS_IDLED|PFSS_TIMESTAMP))) {
 		getmicrouptime(&src->scrub->pfss_last);
 		if (SEQ_GEQ(tsval, src->scrub->pfss_tsval) ||
 		    (src->scrub->pfss_flags & PFSS_PAWS) == 0)
 			src->scrub->pfss_tsval = tsval;
 
 		if (tsecr) {
 			if (SEQ_GEQ(tsecr, src->scrub->pfss_tsecr) ||
 			    (src->scrub->pfss_flags & PFSS_PAWS) == 0)
 				src->scrub->pfss_tsecr = tsecr;
 
 			if ((src->scrub->pfss_flags & PFSS_PAWS) == 0 &&
 			    (SEQ_LT(tsval, src->scrub->pfss_tsval0) ||
 			    src->scrub->pfss_tsval0 == 0)) {
 				/* tsval0 MUST be the lowest timestamp */
 				src->scrub->pfss_tsval0 = tsval;
 			}
 
 			/* Only fully initialized after a TS gets echoed */
 			if ((src->scrub->pfss_flags & PFSS_PAWS) == 0)
 				src->scrub->pfss_flags |= PFSS_PAWS;
 		}
 	}
 
 	/* I have a dream....  TCP segment reassembly.... */
 	return (0);
 }
 
 static int
 pf_normalize_tcpopt(struct pf_rule *r, struct mbuf *m, struct tcphdr *th,
     int off, sa_family_t af)
 {
 	u_int16_t	*mss;
 	int		 thoff;
 	int		 opt, cnt, optlen = 0;
 	int		 rewrite = 0;
 	u_char		 opts[TCP_MAXOLEN];
 	u_char		*optp = opts;
 
 	thoff = th->th_off << 2;
 	cnt = thoff - sizeof(struct tcphdr);
 
 	if (cnt > 0 && !pf_pull_hdr(m, off + sizeof(*th), opts, cnt,
 	    NULL, NULL, af))
 		return (rewrite);
 
 	for (; cnt > 0; cnt -= optlen, optp += optlen) {
 		opt = optp[0];
 		if (opt == TCPOPT_EOL)
 			break;
 		if (opt == TCPOPT_NOP)
 			optlen = 1;
 		else {
 			if (cnt < 2)
 				break;
 			optlen = optp[1];
 			if (optlen < 2 || optlen > cnt)
 				break;
 		}
 		switch (opt) {
 		case TCPOPT_MAXSEG:
 			mss = (u_int16_t *)(optp + 2);
 			if ((ntohs(*mss)) > r->max_mss) {
 				th->th_sum = pf_cksum_fixup(th->th_sum,
 				    *mss, htons(r->max_mss), 0);
 				*mss = htons(r->max_mss);
 				rewrite = 1;
 			}
 			break;
 		default:
 			break;
 		}
 	}
 
 	if (rewrite)
 		m_copyback(m, off + sizeof(*th), thoff - sizeof(*th), opts);
 
 	return (rewrite);
 }
 
 #ifdef INET
 static void
 pf_scrub_ip(struct mbuf **m0, u_int32_t flags, u_int8_t min_ttl, u_int8_t tos)
 {
 	struct mbuf		*m = *m0;
 	struct ip		*h = mtod(m, struct ip *);
 
 	/* Clear IP_DF if no-df was requested */
 	if (flags & PFRULE_NODF && h->ip_off & htons(IP_DF)) {
 		u_int16_t ip_off = h->ip_off;
 
 		h->ip_off &= htons(~IP_DF);
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_off, h->ip_off, 0);
 	}
 
 	/* Enforce a minimum ttl, may cause endless packet loops */
 	if (min_ttl && h->ip_ttl < min_ttl) {
 		u_int16_t ip_ttl = h->ip_ttl;
 
 		h->ip_ttl = min_ttl;
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_ttl, h->ip_ttl, 0);
 	}
 
 	/* Enforce tos */
 	if (flags & PFRULE_SET_TOS) {
 		u_int16_t	ov, nv;
 
 		ov = *(u_int16_t *)h;
 		h->ip_tos = tos;
 		nv = *(u_int16_t *)h;
 
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ov, nv, 0);
 	}
 
 	/* random-id, but not for fragments */
 	if (flags & PFRULE_RANDOMID && !(h->ip_off & ~htons(IP_DF))) {
 		uint16_t ip_id = h->ip_id;
 
 		ip_fillid(h);
 		h->ip_sum = pf_cksum_fixup(h->ip_sum, ip_id, h->ip_id, 0);
 	}
 }
 #endif /* INET */
 
 #ifdef INET6
 static void
 pf_scrub_ip6(struct mbuf **m0, u_int8_t min_ttl)
 {
 	struct mbuf		*m = *m0;
 	struct ip6_hdr		*h = mtod(m, struct ip6_hdr *);
 
 	/* Enforce a minimum ttl, may cause endless packet loops */
 	if (min_ttl && h->ip6_hlim < min_ttl)
 		h->ip6_hlim = min_ttl;
 }
 #endif
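The TSval window test in pf_normalize_tcp_stateful() above (lower bound from
the peer's last echo, upper bound from the idle time scaled by TS_MAXFREQ) can
be tried outside the kernel.  What follows is a minimal userland sketch, not
the pf code itself: max_ts_ticks() and ts_in_window() are hypothetical helper
names, the SEQ_* macros are redefined locally in the usual signed-difference
form, and the 30 second fudge mentioned below is only an example value.

```c
#include <stdint.h>

/* Serial-number comparisons, redefined locally (assumption: 32-bit wrap). */
#define SEQ_LT(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)
#define SEQ_GT(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)

#define TS_MAXFREQ	1100		/* 1kHz timestamp clock + 10% skew */
#define TS_MICROSECS	1000000		/* microseconds per second */

/*
 * Hypothetical helper: the most ticks a peer's timestamp clock may have
 * advanced after idle_sec seconds and idle_usec microseconds, with
 * fudge_sec seconds of leeway for packets delayed in transit.
 */
static uint32_t
max_ts_ticks(long idle_sec, long idle_usec, int fudge_sec)
{
	uint32_t ticks;

	ticks = (uint32_t)(idle_sec + fudge_sec) * TS_MAXFREQ;
	ticks += (uint32_t)(idle_usec / (TS_MICROSECS / TS_MAXFREQ));
	return (ticks);
}

/*
 * Hypothetical helper: returns 1 if tsval is not older than the peer's
 * last echoed value and not further ahead of the sender's last tsval
 * than the allowed number of ticks, 0 otherwise.
 */
static int
ts_in_window(uint32_t tsval, uint32_t peer_last_tsecr, uint32_t last_tsval,
    uint32_t ticks)
{
	if (SEQ_LT(tsval, peer_last_tsecr))
		return (0);		/* below the lower bound */
	if (SEQ_GT(tsval, last_tsval + ticks))
		return (0);		/* above the upper bound */
	return (1);
}
```

With 2.5 seconds of idle time and a 30 second fudge, max_ts_ticks(2, 500000, 30)
allows 35750 ticks.  Because the comparisons are done on the signed 32-bit
difference, a tsval that has just wrapped past zero is still accepted when the
previous values sit near the top of the 32-bit space.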
Index: user/ngie/more-tests/sys/sys/imgact.h
===================================================================
--- user/ngie/more-tests/sys/sys/imgact.h	(revision 281584)
+++ user/ngie/more-tests/sys/sys/imgact.h	(revision 281585)
@@ -1,102 +1,103 @@
 /*-
  * Copyright (c) 1993, David Greenman
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef _SYS_IMGACT_H_
 #define	_SYS_IMGACT_H_
 
 #include <sys/uio.h>
 
 #include <vm/vm.h>
 
 #define MAXSHELLCMDLEN	PAGE_SIZE
 
 struct image_args {
 	char *buf;		/* pointer to string buffer */
 	char *begin_argv;	/* beginning of argv in buf */
 	char *begin_envv;	/* beginning of envv in buf */
 	char *endp;		/* current `end' pointer of arg & env strings */
 	char *fname;            /* pointer to filename of executable (system space) */
 	char *fname_buf;	/* pointer to optional malloc(M_TEMP) buffer */
 	int stringspace;	/* space left in arg & env buffer */
 	int argc;		/* count of argument strings */
 	int envc;		/* count of environment strings */
 	int fd;			/* file descriptor of the executable */
 };
 
 struct image_params {
 	struct proc *proc;	/* our process struct */
 	struct label *execlabel;	/* optional exec label */
 	struct vnode *vp;	/* pointer to vnode of file to exec */
 	struct vm_object *object;	/* The vm object for this vp */
 	struct vattr *attr;	/* attributes of file */
 	const char *image_header; /* head of file to exec */
 	unsigned long entry_addr; /* entry address of target executable */
 	unsigned long reloc_base; /* load address of image */
 	char vmspace_destroyed;	/* flag - we've blown away original vm space */
 #define IMGACT_SHELL	0x1
 #define IMGACT_BINMISC	0x2
 	unsigned char interpreted;	/* mask of interpreters that have run */
 	char opened;		/* flag - we have opened executable vnode */
 	char *interpreter_name;	/* name of the interpreter */
 	void *auxargs;		/* ELF Auxinfo structure pointer */
 	struct sf_buf *firstpage;	/* first page that we mapped */
 	unsigned long ps_strings; /* PS_STRINGS for BSD/OS binaries */
 	size_t auxarg_size;
 	struct image_args *args;	/* system call arguments */
 	struct sysentvec *sysent;	/* system entry vector */
 	char *execpath;
 	unsigned long execpathp;
 	char *freepath;
 	unsigned long canary;
 	int canarylen;
 	unsigned long pagesizes;
 	int pagesizeslen;
 	vm_prot_t stack_prot;
+	u_long stack_sz;
 };
 
 #ifdef _KERNEL
 struct sysentvec;
 struct thread;
 
 #define IMGACT_CORE_COMPRESS	0x01
 
 int	exec_alloc_args(struct image_args *);
 int	exec_check_permissions(struct image_params *);
 register_t *exec_copyout_strings(struct image_params *);
 void	exec_free_args(struct image_args *);
 int	exec_new_vmspace(struct image_params *, struct sysentvec *);
 void	exec_setregs(struct thread *, struct image_params *, u_long);
 int	exec_shell_imgact(struct image_params *);
 int	exec_copyin_args(struct image_args *, char *, enum uio_seg,
 	char **, char **);
 #endif
 
 #endif /* !_SYS_IMGACT_H_ */
Index: user/ngie/more-tests/sys/sys/mount.h
===================================================================
--- user/ngie/more-tests/sys/sys/mount.h	(revision 281584)
+++ user/ngie/more-tests/sys/sys/mount.h	(revision 281585)
@@ -1,949 +1,950 @@
 /*-
  * Copyright (c) 1989, 1991, 1993
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)mount.h	8.21 (Berkeley) 5/20/95
  * $FreeBSD$
  */
 
 #ifndef _SYS_MOUNT_H_
 #define _SYS_MOUNT_H_
 
 #include <sys/ucred.h>
 #include <sys/queue.h>
 #ifdef _KERNEL
 #include <sys/lock.h>
 #include <sys/lockmgr.h>
 #include <sys/_mutex.h>
 #include <sys/_sx.h>
 #endif
 
 /*
  * NOTE: When changing statfs structure, mount structure, MNT_* flags or
  * MNTK_* flags also update DDB show mount command in vfs_subr.c.
  */
 
 typedef struct fsid { int32_t val[2]; } fsid_t;	/* filesystem id type */
 
 /*
  * File identifier.
  * These are unique per filesystem on a single machine.
  */
 #define	MAXFIDSZ	16
 
 struct fid {
 	u_short		fid_len;		/* length of data in bytes */
 	u_short		fid_data0;		/* force longword alignment */
 	char		fid_data[MAXFIDSZ];	/* data (variable length) */
 };
 
 /*
  * filesystem statistics
  */
 #define	MFSNAMELEN	16		/* length of type name including null */
 #define	MNAMELEN	88		/* size of on/from name bufs */
 #define	STATFS_VERSION	0x20030518	/* current version number */
 struct statfs {
 	uint32_t f_version;		/* structure version number */
 	uint32_t f_type;		/* type of filesystem */
 	uint64_t f_flags;		/* copy of mount exported flags */
 	uint64_t f_bsize;		/* filesystem fragment size */
 	uint64_t f_iosize;		/* optimal transfer block size */
 	uint64_t f_blocks;		/* total data blocks in filesystem */
 	uint64_t f_bfree;		/* free blocks in filesystem */
 	int64_t	 f_bavail;		/* free blocks avail to non-superuser */
 	uint64_t f_files;		/* total file nodes in filesystem */
 	int64_t	 f_ffree;		/* free nodes avail to non-superuser */
 	uint64_t f_syncwrites;		/* count of sync writes since mount */
 	uint64_t f_asyncwrites;		/* count of async writes since mount */
 	uint64_t f_syncreads;		/* count of sync reads since mount */
 	uint64_t f_asyncreads;		/* count of async reads since mount */
 	uint64_t f_spare[10];		/* unused spare */
 	uint32_t f_namemax;		/* maximum filename length */
 	uid_t	  f_owner;		/* user that mounted the filesystem */
 	fsid_t	  f_fsid;		/* filesystem id */
 	char	  f_charspare[80];	    /* spare string space */
 	char	  f_fstypename[MFSNAMELEN]; /* filesystem type name */
 	char	  f_mntfromname[MNAMELEN];  /* mounted filesystem */
 	char	  f_mntonname[MNAMELEN];    /* directory on which mounted */
 };
 
 #ifdef _KERNEL
 #define	OMFSNAMELEN	16	/* length of fs type name, including null */
 #define	OMNAMELEN	(88 - 2 * sizeof(long))	/* size of on/from name bufs */
 
 /* XXX getfsstat.2 is out of date with write and read counter changes here. */
 /* XXX statfs.2 is out of date with read counter changes here. */
 struct ostatfs {
 	long	f_spare2;		/* placeholder */
 	long	f_bsize;		/* fundamental filesystem block size */
 	long	f_iosize;		/* optimal transfer block size */
 	long	f_blocks;		/* total data blocks in filesystem */
 	long	f_bfree;		/* free blocks in fs */
 	long	f_bavail;		/* free blocks avail to non-superuser */
 	long	f_files;		/* total file nodes in filesystem */
 	long	f_ffree;		/* free file nodes in fs */
 	fsid_t	f_fsid;			/* filesystem id */
 	uid_t	f_owner;		/* user that mounted the filesystem */
 	int	f_type;			/* type of filesystem */
 	int	f_flags;		/* copy of mount exported flags */
 	long	f_syncwrites;		/* count of sync writes since mount */
 	long	f_asyncwrites;		/* count of async writes since mount */
 	char	f_fstypename[OMFSNAMELEN]; /* fs type name */
 	char	f_mntonname[OMNAMELEN];	/* directory on which mounted */
 	long	f_syncreads;		/* count of sync reads since mount */
 	long	f_asyncreads;		/* count of async reads since mount */
 	short	f_spares1;		/* unused spare */
 	char	f_mntfromname[OMNAMELEN];/* mounted filesystem */
 	short	f_spares2;		/* unused spare */
 	/*
 	 * XXX on machines where longs are aligned to 8-byte boundaries, there
 	 * is an unnamed int32_t here.  This spare was after the apparent end
 	 * of the struct until we bit off the read counters from f_mntonname.
 	 */
 	long	f_spare[2];		/* unused spare */
 };
 
 TAILQ_HEAD(vnodelst, vnode);
 
 /* Mount options list */
 TAILQ_HEAD(vfsoptlist, vfsopt);
 struct vfsopt {
 	TAILQ_ENTRY(vfsopt) link;
 	char	*name;
 	void	*value;
 	int	len;
 	int	pos;
 	int	seen;
 };
 
 /*
  * Structure per mounted filesystem.  Each mounted filesystem has an
  * array of operations and an instance record.  The filesystems are
  * put on a doubly linked list.
  *
  * Lock reference:
  *	m - mountlist_mtx
  *	i - interlock
  *	v - vnode freelist mutex
  *
  * Unmarked fields are considered stable as long as a ref is held.
  *
  */
 struct mount {
 	struct mtx	mnt_mtx;		/* mount structure interlock */
 	int		mnt_gen;		/* struct mount generation */
 #define	mnt_startzero	mnt_list
 	TAILQ_ENTRY(mount) mnt_list;		/* (m) mount list */
 	struct vfsops	*mnt_op;		/* operations on fs */
 	struct vfsconf	*mnt_vfc;		/* configuration info */
 	struct vnode	*mnt_vnodecovered;	/* vnode we mounted on */
 	struct vnode	*mnt_syncer;		/* syncer vnode */
 	int		mnt_ref;		/* (i) Reference count */
 	struct vnodelst	mnt_nvnodelist;		/* (i) list of vnodes */
 	int		mnt_nvnodelistsize;	/* (i) # of vnodes */
 	struct vnodelst	mnt_activevnodelist;	/* (v) list of active vnodes */
 	int		mnt_activevnodelistsize;/* (v) # of active vnodes */
 	int		mnt_writeopcount;	/* (i) write syscalls pending */
 	int		mnt_kern_flag;		/* (i) kernel only flags */
 	uint64_t	mnt_flag;		/* (i) flags shared with user */
 	struct vfsoptlist *mnt_opt;		/* current mount options */
 	struct vfsoptlist *mnt_optnew;		/* new options passed to fs */
 	int		mnt_maxsymlinklen;	/* max size of short symlink */
 	struct statfs	mnt_stat;		/* cache of filesystem stats */
 	struct ucred	*mnt_cred;		/* credentials of mounter */
 	void *		mnt_data;		/* private data */
 	time_t		mnt_time;		/* last time written */
 	int		mnt_iosize_max;		/* max size for clusters, etc */
 	struct netexport *mnt_export;		/* export list */
 	struct label	*mnt_label;		/* MAC label for the fs */
 	u_int		mnt_hashseed;		/* Random seed for vfs_hash */
 	int		mnt_lockref;		/* (i) Lock reference count */
 	int		mnt_secondary_writes;   /* (i) # of secondary writes */
 	int		mnt_secondary_accwrites;/* (i) secondary wr. starts */
 	struct thread	*mnt_susp_owner;	/* (i) thread owning suspension */
 #define	mnt_endzero	mnt_gjprovider
 	char		*mnt_gjprovider;	/* gjournal provider name */
 	struct lock	mnt_explock;		/* vfs_export walkers lock */
 	TAILQ_ENTRY(mount) mnt_upper_link;	/* (m) we in the all uppers */
 	TAILQ_HEAD(, mount) mnt_uppers;		/* (m) upper mounts over us */
 };
 
 /*
  * Definitions for MNT_VNODE_FOREACH_ALL.
  */
 struct vnode *__mnt_vnode_next_all(struct vnode **mvp, struct mount *mp);
 struct vnode *__mnt_vnode_first_all(struct vnode **mvp, struct mount *mp);
 void          __mnt_vnode_markerfree_all(struct vnode **mvp, struct mount *mp);
 
 #define MNT_VNODE_FOREACH_ALL(vp, mp, mvp)				\
 	for (vp = __mnt_vnode_first_all(&(mvp), (mp));			\
 		(vp) != NULL; vp = __mnt_vnode_next_all(&(mvp), (mp)))
 
 #define MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp)				\
 	do {								\
 		MNT_ILOCK(mp);						\
 		__mnt_vnode_markerfree_all(&(mvp), (mp));		\
 		/* MNT_IUNLOCK(mp); -- done in above function */	\
 		mtx_assert(MNT_MTX(mp), MA_NOTOWNED);			\
 	} while (0)
 
 /*
  * Definitions for MNT_VNODE_FOREACH_ACTIVE.
  */
 struct vnode *__mnt_vnode_next_active(struct vnode **mvp, struct mount *mp);
 struct vnode *__mnt_vnode_first_active(struct vnode **mvp, struct mount *mp);
 void          __mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *);
 
 #define MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) 				\
 	for (vp = __mnt_vnode_first_active(&(mvp), (mp)); 		\
 		(vp) != NULL; vp = __mnt_vnode_next_active(&(mvp), (mp)))
 
 #define MNT_VNODE_FOREACH_ACTIVE_ABORT(mp, mvp)				\
 	__mnt_vnode_markerfree_active(&(mvp), (mp))
 
 #define	MNT_ILOCK(mp)	mtx_lock(&(mp)->mnt_mtx)
 #define	MNT_ITRYLOCK(mp) mtx_trylock(&(mp)->mnt_mtx)
 #define	MNT_IUNLOCK(mp)	mtx_unlock(&(mp)->mnt_mtx)
 #define	MNT_MTX(mp)	(&(mp)->mnt_mtx)
 #define	MNT_REF(mp)	(mp)->mnt_ref++
 #define	MNT_REL(mp)	do {						\
 	KASSERT((mp)->mnt_ref > 0, ("negative mnt_ref"));		\
 	(mp)->mnt_ref--;						\
 	if ((mp)->mnt_ref == 0)						\
 		wakeup((mp));						\
 } while (0)
 
 #endif /* _KERNEL */
 
 /*
  * User specifiable flags, stored in mnt_flag.
  */
 #define	MNT_RDONLY	0x0000000000000001ULL /* read only filesystem */
 #define	MNT_SYNCHRONOUS	0x0000000000000002ULL /* fs written synchronously */
 #define	MNT_NOEXEC	0x0000000000000004ULL /* can't exec from filesystem */
 #define	MNT_NOSUID	0x0000000000000008ULL /* don't honor setuid fs bits */
 #define	MNT_NFS4ACLS	0x0000000000000010ULL /* enable NFS version 4 ACLs */
 #define	MNT_UNION	0x0000000000000020ULL /* union with underlying fs */
 #define	MNT_ASYNC	0x0000000000000040ULL /* fs written asynchronously */
 #define	MNT_SUIDDIR	0x0000000000100000ULL /* special SUID dir handling */
 #define	MNT_SOFTDEP	0x0000000000200000ULL /* using soft updates */
 #define	MNT_NOSYMFOLLOW	0x0000000000400000ULL /* do not follow symlinks */
 #define	MNT_GJOURNAL	0x0000000002000000ULL /* GEOM journal support enabled */
 #define	MNT_MULTILABEL	0x0000000004000000ULL /* MAC support for objects */
 #define	MNT_ACLS	0x0000000008000000ULL /* ACL support enabled */
 #define	MNT_NOATIME	0x0000000010000000ULL /* dont update file access time */
 #define	MNT_NOCLUSTERR	0x0000000040000000ULL /* disable cluster read */
 #define	MNT_NOCLUSTERW	0x0000000080000000ULL /* disable cluster write */
 #define	MNT_SUJ		0x0000000100000000ULL /* using journaled soft updates */
 #define	MNT_AUTOMOUNTED	0x0000000200000000ULL /* mounted by automountd(8) */
 
 /*
  * NFS export related mount flags.
  */
 #define	MNT_EXRDONLY	0x0000000000000080ULL	/* exported read only */
 #define	MNT_EXPORTED	0x0000000000000100ULL	/* filesystem is exported */
 #define	MNT_DEFEXPORTED	0x0000000000000200ULL	/* exported to the world */
 #define	MNT_EXPORTANON	0x0000000000000400ULL	/* anon uid mapping for all */
 #define	MNT_EXKERB	0x0000000000000800ULL	/* exported with Kerberos */
 #define	MNT_EXPUBLIC	0x0000000020000000ULL	/* public export (WebNFS) */
 
 /*
  * Flags set by internal operations,
  * but visible to the user.
  * XXX some of these are not quite right.. (I've never seen the root flag set)
  */
 #define	MNT_LOCAL	0x0000000000001000ULL /* filesystem is stored locally */
 #define	MNT_QUOTA	0x0000000000002000ULL /* quotas are enabled on fs */
 #define	MNT_ROOTFS	0x0000000000004000ULL /* identifies the root fs */
 #define	MNT_USER	0x0000000000008000ULL /* mounted by a user */
 #define	MNT_IGNORE	0x0000000000800000ULL /* do not show entry in df */
 
 /*
  * Mask of flags that are visible to statfs().
  * XXX I think that this could now become (~(MNT_CMDFLAGS))
  * but the 'mount' program may need changing to handle this.
  */
 #define	MNT_VISFLAGMASK	(MNT_RDONLY	| MNT_SYNCHRONOUS | MNT_NOEXEC	| \
 			MNT_NOSUID	| MNT_UNION	| MNT_SUJ	| \
 			MNT_ASYNC	| MNT_EXRDONLY	| MNT_EXPORTED	| \
 			MNT_DEFEXPORTED	| MNT_EXPORTANON| MNT_EXKERB	| \
 			MNT_LOCAL	| MNT_USER	| MNT_QUOTA	| \
 			MNT_ROOTFS	| MNT_NOATIME	| MNT_NOCLUSTERR| \
 			MNT_NOCLUSTERW	| MNT_SUIDDIR	| MNT_SOFTDEP	| \
 			MNT_IGNORE	| MNT_EXPUBLIC	| MNT_NOSYMFOLLOW | \
 			MNT_GJOURNAL	| MNT_MULTILABEL | MNT_ACLS	| \
 			MNT_NFS4ACLS	| MNT_AUTOMOUNTED)
 
 /* Mask of flags that can be updated. */
 #define	MNT_UPDATEMASK (MNT_NOSUID	| MNT_NOEXEC	| \
 			MNT_SYNCHRONOUS	| MNT_UNION	| MNT_ASYNC	| \
 			MNT_NOATIME | \
 			MNT_NOSYMFOLLOW	| MNT_IGNORE	| \
 			MNT_NOCLUSTERR	| MNT_NOCLUSTERW | MNT_SUIDDIR	| \
 			MNT_ACLS	| MNT_USER	| MNT_NFS4ACLS	| \
 			MNT_AUTOMOUNTED)
 
 /*
  * External filesystem command modifier flags.
  * Unmount can use the MNT_FORCE flag.
  * XXX: These are not STATES and really should be somewhere else.
  * XXX: MNT_BYFSID collides with MNT_ACLS, but because MNT_ACLS is only used for
  *      mount(2) and MNT_BYFSID is only used for unmount(2) it's harmless.
  */
 #define	MNT_UPDATE	0x0000000000010000ULL /* not real mount, just update */
 #define	MNT_DELEXPORT	0x0000000000020000ULL /* delete export host lists */
 #define	MNT_RELOAD	0x0000000000040000ULL /* reload filesystem data */
 #define	MNT_FORCE	0x0000000000080000ULL /* force unmount or readonly */
 #define	MNT_SNAPSHOT	0x0000000001000000ULL /* snapshot the filesystem */
 #define	MNT_BYFSID	0x0000000008000000ULL /* specify filesystem by ID. */
 #define MNT_CMDFLAGS   (MNT_UPDATE	| MNT_DELEXPORT	| MNT_RELOAD	| \
 			MNT_FORCE	| MNT_SNAPSHOT	| MNT_BYFSID)
 /*
  * Internal filesystem control flags stored in mnt_kern_flag.
  *
  * MNTK_UNMOUNT locks the mount entry so that name lookup cannot proceed
  * past the mount point.  This keeps the subtree stable during mounts
  * and unmounts.
  *
  * MNTK_UNMOUNTF permits filesystems to detect a forced unmount while
  * dounmount() is still waiting to lock the mountpoint. This allows
  * the filesystem to cancel operations that might otherwise deadlock
  * with the unmount attempt (used by NFS).
  *
  * MNTK_NOINSMNTQ is a strict subset of MNTK_UNMOUNT. They are separated
  * to allow a failed unmount attempt to restore the syncer vnode for
  * the mount.
  */
 #define MNTK_UNMOUNTF	0x00000001	/* forced unmount in progress */
 #define MNTK_ASYNC	0x00000002	/* filtered async flag */
 #define MNTK_SOFTDEP	0x00000004	/* async disabled by softdep */
 #define MNTK_NOINSMNTQ	0x00000008	/* insmntque is not allowed */
 #define	MNTK_DRAINING	0x00000010	/* lock draining is happening */
 #define	MNTK_REFEXPIRE	0x00000020	/* refcount expiring is happening */
 #define MNTK_EXTENDED_SHARED	0x00000040 /* Allow shared locking for more ops */
 #define	MNTK_SHARED_WRITES	0x00000080 /* Allow shared locking for writes */
 #define	MNTK_NO_IOPF	0x00000100	/* Disallow page faults during reads
 					   and writes. Filesystem shall properly
 					   handle i/o state on EFAULT. */
 #define	MNTK_VGONE_UPPER	0x00000200
 #define	MNTK_VGONE_WAITER	0x00000400
 #define	MNTK_LOOKUP_EXCL_DOTDOT	0x00000800
 #define	MNTK_MARKER		0x00001000
 #define	MNTK_UNMAPPED_BUFS	0x00002000
+#define	MNTK_USES_BCACHE	0x00004000 /* FS uses the buffer cache. */
 #define MNTK_NOASYNC	0x00800000	/* disable async */
 #define MNTK_UNMOUNT	0x01000000	/* unmount in progress */
 #define	MNTK_MWAIT	0x02000000	/* waiting for unmount to finish */
 #define	MNTK_SUSPEND	0x08000000	/* request write suspension */
 #define	MNTK_SUSPEND2	0x04000000	/* block secondary writes */
 #define	MNTK_SUSPENDED	0x10000000	/* write operations are suspended */
 #define	MNTK_SUSPENDABLE	0x20000000 /* writes can be suspended */
 #define MNTK_LOOKUP_SHARED	0x40000000 /* FS supports shared lock lookups */
 #define	MNTK_NOKNOTE	0x80000000	/* Don't send KNOTEs from VOP hooks */
 
 #ifdef _KERNEL
 static inline int
 MNT_SHARED_WRITES(struct mount *mp)
 {
 
 	return (mp != NULL && (mp->mnt_kern_flag & MNTK_SHARED_WRITES) != 0);
 }
 
 static inline int
 MNT_EXTENDED_SHARED(struct mount *mp)
 {
 
 	return (mp != NULL && (mp->mnt_kern_flag & MNTK_EXTENDED_SHARED) != 0);
 }
 #endif
 
 /*
  * Sysctl CTL_VFS definitions.
  *
  * Second level identifier specifies which filesystem. Second level
  * identifier VFS_VFSCONF returns information about all filesystems.
  * Second level identifier VFS_GENERIC is non-terminal.
  */
 #define	VFS_VFSCONF		0	/* get configured filesystems */
 #define	VFS_GENERIC		0	/* generic filesystem information */
 /*
  * Third level identifiers for VFS_GENERIC are given below; third
  * level identifiers for specific filesystems are given in their
  * mount specific header files.
  */
 #define VFS_MAXTYPENUM	1	/* int: highest defined filesystem type */
 #define VFS_CONF	2	/* struct: vfsconf for filesystem given
 				   as next argument */
 
 /*
  * Flags for various system call interfaces.
  *
  * waitfor flags to vfs_sync() and getfsstat()
  */
 #define MNT_WAIT	1	/* synchronously wait for I/O to complete */
 #define MNT_NOWAIT	2	/* start all I/O, but do not wait for it */
 #define MNT_LAZY	3	/* push data not written by filesystem syncer */
 #define MNT_SUSPEND	4	/* Suspend file system after sync */
 
 /*
  * Generic file handle
  */
 struct fhandle {
 	fsid_t	fh_fsid;	/* Filesystem id of mount point */
 	struct	fid fh_fid;	/* Filesys specific id */
 };
 typedef struct fhandle	fhandle_t;
 
 /*
  * Old export arguments without security flavor list
  */
 struct oexport_args {
 	int	ex_flags;		/* export related flags */
 	uid_t	ex_root;		/* mapping for root uid */
 	struct	xucred ex_anon;		/* mapping for anonymous user */
 	struct	sockaddr *ex_addr;	/* net address to which exported */
 	u_char	ex_addrlen;		/* and the net address length */
 	struct	sockaddr *ex_mask;	/* mask of valid bits in saddr */
 	u_char	ex_masklen;		/* and the smask length */
 	char	*ex_indexfile;		/* index file for WebNFS URLs */
 };
 
 /*
  * Export arguments for local filesystem mount calls.
  */
 #define	MAXSECFLAVORS	5
 struct export_args {
 	int	ex_flags;		/* export related flags */
 	uid_t	ex_root;		/* mapping for root uid */
 	struct	xucred ex_anon;		/* mapping for anonymous user */
 	struct	sockaddr *ex_addr;	/* net address to which exported */
 	u_char	ex_addrlen;		/* and the net address length */
 	struct	sockaddr *ex_mask;	/* mask of valid bits in saddr */
 	u_char	ex_masklen;		/* and the smask length */
 	char	*ex_indexfile;		/* index file for WebNFS URLs */
 	int	ex_numsecflavors;	/* security flavor count */
 	int	ex_secflavors[MAXSECFLAVORS]; /* list of security flavors */
 };
 
 /*
  * Structure holding information for a publicly exported filesystem
  * (WebNFS). Currently the specs allow just for one such filesystem.
  */
 struct nfs_public {
 	int		np_valid;	/* Do we hold valid information */
 	fhandle_t	np_handle;	/* Filehandle for pub fs (internal) */
 	struct mount	*np_mount;	/* Mountpoint of exported fs */
 	char		*np_index;	/* Index file */
 };
 
 /*
  * Filesystem configuration information. One of these exists for each
  * type of filesystem supported by the kernel. These are searched at
  * mount time to identify the requested filesystem.
  *
  * XXX: Never change the first two members!
  */
 struct vfsconf {
 	u_int	vfc_version;		/* ABI version number */
 	char	vfc_name[MFSNAMELEN];	/* filesystem type name */
 	struct	vfsops *vfc_vfsops;	/* filesystem operations vector */
 	int	vfc_typenum;		/* historic filesystem type number */
 	int	vfc_refcount;		/* number mounted of this type */
 	int	vfc_flags;		/* permanent flags */
 	struct	vfsoptdecl *vfc_opts;	/* mount options */
 	TAILQ_ENTRY(vfsconf) vfc_list;	/* list of vfsconfs */
 };
 
 /* Userland version of the struct vfsconf. */
 struct xvfsconf {
 	struct	vfsops *vfc_vfsops;	/* filesystem operations vector */
 	char	vfc_name[MFSNAMELEN];	/* filesystem type name */
 	int	vfc_typenum;		/* historic filesystem type number */
 	int	vfc_refcount;		/* number mounted of this type */
 	int	vfc_flags;		/* permanent flags */
 	struct	vfsconf *vfc_next;	/* next in list */
 };
 
 #ifndef BURN_BRIDGES
 struct ovfsconf {
 	void	*vfc_vfsops;
 	char	vfc_name[32];
 	int	vfc_index;
 	int	vfc_refcount;
 	int	vfc_flags;
 };
 #endif
 
 /*
  * NB: these flags refer to IMPLEMENTATION properties, not properties of
  * any actual mounts; i.e., it does not make sense to change the flags.
  */
 #define	VFCF_STATIC	0x00010000	/* statically compiled into kernel */
 #define	VFCF_NETWORK	0x00020000	/* may get data over the network */
 #define	VFCF_READONLY	0x00040000	/* writes are not implemented */
 #define	VFCF_SYNTHETIC	0x00080000	/* data does not represent real files */
 #define	VFCF_LOOPBACK	0x00100000	/* aliases some other mounted FS */
 #define	VFCF_UNICODE	0x00200000	/* stores file names as Unicode */
 #define	VFCF_JAIL	0x00400000	/* can be mounted from within a jail */
 #define	VFCF_DELEGADMIN	0x00800000	/* supports delegated administration */
 #define	VFCF_SBDRY	0x01000000	/* defer stop requests */
 
 typedef uint32_t fsctlop_t;
 
 struct vfsidctl {
 	int		vc_vers;	/* should be VFSIDCTL_VERS1 (below) */
 	fsid_t		vc_fsid;	/* fsid to operate on */
 	char		vc_fstypename[MFSNAMELEN];
 					/* type of fs 'nfs' or '*' */
 	fsctlop_t	vc_op;		/* operation VFS_CTL_* (below) */
 	void		*vc_ptr;	/* pointer to data structure */
 	size_t		vc_len;		/* sizeof said structure */
 	u_int32_t	vc_spare[12];	/* spare (must be zero) */
 };
 
 /* vfsidctl API version. */
 #define VFS_CTL_VERS1	0x01
 
 /*
  * New style VFS sysctls, do not reuse/conflict with the namespace for
  * private sysctls.
  * All "global" sysctl ops have the 17th bit (0x00010000) set:
  * 0x...1....
  * Private sysctl ops should have the 17th bit unset.
  */
 #define VFS_CTL_QUERY	0x00010001	/* anything wrong? (vfsquery) */
 #define VFS_CTL_TIMEO	0x00010002	/* set timeout for vfs notification */
 #define VFS_CTL_NOLOCKS	0x00010003	/* disable file locking */
 
 struct vfsquery {
 	u_int32_t	vq_flags;
 	u_int32_t	vq_spare[31];
 };
 
 /* vfsquery flags */
 #define VQ_NOTRESP	0x0001	/* server down */
 #define VQ_NEEDAUTH	0x0002	/* server bad auth */
 #define VQ_LOWDISK	0x0004	/* we're low on space */
 #define VQ_MOUNT	0x0008	/* new filesystem arrived */
 #define VQ_UNMOUNT	0x0010	/* filesystem has left */
 #define VQ_DEAD		0x0020	/* filesystem is dead, needs force unmount */
 #define VQ_ASSIST	0x0040	/* filesystem needs assistance from external
 				   program */
 #define VQ_NOTRESPLOCK	0x0080	/* server lockd down */
 #define VQ_FLAG0100	0x0100	/* placeholder */
 #define VQ_FLAG0200	0x0200	/* placeholder */
 #define VQ_FLAG0400	0x0400	/* placeholder */
 #define VQ_FLAG0800	0x0800	/* placeholder */
 #define VQ_FLAG1000	0x1000	/* placeholder */
 #define VQ_FLAG2000	0x2000	/* placeholder */
 #define VQ_FLAG4000	0x4000	/* placeholder */
 #define VQ_FLAG8000	0x8000	/* placeholder */
 
 #ifdef _KERNEL
 /* Point a sysctl request at a vfsidctl's data. */
 #define VCTLTOREQ(vc, req)						\
 	do {								\
 		(req)->newptr = (vc)->vc_ptr;				\
 		(req)->newlen = (vc)->vc_len;				\
 		(req)->newidx = 0;					\
 	} while (0)
 #endif
 
 struct iovec;
 struct uio;
 
 #ifdef _KERNEL
 
 /*
  * vfs_busy specific flags and mask.
  */
 #define	MBF_NOWAIT	0x01
 #define	MBF_MNTLSTLOCK	0x02
 #define	MBF_MASK	(MBF_NOWAIT | MBF_MNTLSTLOCK)
 
 #ifdef MALLOC_DECLARE
 MALLOC_DECLARE(M_MOUNT);
 #endif
 extern int maxvfsconf;		/* highest defined filesystem type */
 extern int nfs_mount_type;	/* vfc_typenum for nfs, or -1 */
 
 TAILQ_HEAD(vfsconfhead, vfsconf);
 extern struct vfsconfhead vfsconf;
 
 /*
  * Operations supported on mounted filesystem.
  */
 struct mount_args;
 struct nameidata;
 struct sysctl_req;
 struct mntarg;
 
 typedef int vfs_cmount_t(struct mntarg *ma, void *data, uint64_t flags);
 typedef int vfs_unmount_t(struct mount *mp, int mntflags);
 typedef int vfs_root_t(struct mount *mp, int flags, struct vnode **vpp);
 typedef	int vfs_quotactl_t(struct mount *mp, int cmds, uid_t uid, void *arg);
 typedef	int vfs_statfs_t(struct mount *mp, struct statfs *sbp);
 typedef	int vfs_sync_t(struct mount *mp, int waitfor);
 typedef	int vfs_vget_t(struct mount *mp, ino_t ino, int flags,
 		    struct vnode **vpp);
 typedef	int vfs_fhtovp_t(struct mount *mp, struct fid *fhp,
 		    int flags, struct vnode **vpp);
 typedef	int vfs_checkexp_t(struct mount *mp, struct sockaddr *nam,
 		    int *extflagsp, struct ucred **credanonp,
 		    int *numsecflavors, int **secflavors);
 typedef	int vfs_init_t(struct vfsconf *);
 typedef	int vfs_uninit_t(struct vfsconf *);
 typedef	int vfs_extattrctl_t(struct mount *mp, int cmd,
 		    struct vnode *filename_vp, int attrnamespace,
 		    const char *attrname);
 typedef	int vfs_mount_t(struct mount *mp);
 typedef int vfs_sysctl_t(struct mount *mp, fsctlop_t op,
 		    struct sysctl_req *req);
 typedef void vfs_susp_clean_t(struct mount *mp);
 typedef void vfs_notify_lowervp_t(struct mount *mp, struct vnode *lowervp);
 typedef void vfs_purge_t(struct mount *mp);
 
 struct vfsops {
 	vfs_mount_t		*vfs_mount;
 	vfs_cmount_t		*vfs_cmount;
 	vfs_unmount_t		*vfs_unmount;
 	vfs_root_t		*vfs_root;
 	vfs_quotactl_t		*vfs_quotactl;
 	vfs_statfs_t		*vfs_statfs;
 	vfs_sync_t		*vfs_sync;
 	vfs_vget_t		*vfs_vget;
 	vfs_fhtovp_t		*vfs_fhtovp;
 	vfs_checkexp_t		*vfs_checkexp;
 	vfs_init_t		*vfs_init;
 	vfs_uninit_t		*vfs_uninit;
 	vfs_extattrctl_t	*vfs_extattrctl;
 	vfs_sysctl_t		*vfs_sysctl;
 	vfs_susp_clean_t	*vfs_susp_clean;
 	vfs_notify_lowervp_t	*vfs_reclaim_lowervp;
 	vfs_notify_lowervp_t	*vfs_unlink_lowervp;
 	vfs_purge_t		*vfs_purge;
 	vfs_mount_t		*vfs_spare[6];	/* spares for ABI compat */
 };
 
 vfs_statfs_t	__vfs_statfs;
 
 #define	VFS_PROLOGUE(MP)	do {					\
 	struct mount *mp__;						\
 	int _enable_stops;						\
 									\
 	mp__ = (MP);							\
 	_enable_stops = (mp__ != NULL &&				\
 	    (mp__->mnt_vfc->vfc_flags & VFCF_SBDRY) && sigdeferstop())
 
 #define	VFS_EPILOGUE(MP)						\
 	if (_enable_stops)						\
 		sigallowstop();						\
 } while (0)
 
 #define	VFS_MOUNT(MP) ({						\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_mount)(MP);				\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_UNMOUNT(MP, FORCE) ({					\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_unmount)(MP, FORCE);			\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_ROOT(MP, FLAGS, VPP) ({					\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_root)(MP, FLAGS, VPP);		\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_QUOTACTL(MP, C, U, A) ({					\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_quotactl)(MP, C, U, A);		\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_STATFS(MP, SBP) ({						\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = __vfs_statfs((MP), (SBP));				\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_SYNC(MP, WAIT) ({						\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_sync)(MP, WAIT);			\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_VGET(MP, INO, FLAGS, VPP) ({				\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_vget)(MP, INO, FLAGS, VPP);		\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_FHTOVP(MP, FIDP, FLAGS, VPP) ({				\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_fhtovp)(MP, FIDP, FLAGS, VPP);	\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_CHECKEXP(MP, NAM, EXFLG, CRED, NUMSEC, SEC) ({		\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_checkexp)(MP, NAM, EXFLG, CRED, NUMSEC,\
 	    SEC);							\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_EXTATTRCTL(MP, C, FN, NS, N) ({				\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_extattrctl)(MP, C, FN, NS, N);	\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_SYSCTL(MP, OP, REQ) ({					\
 	int _rc;							\
 									\
 	VFS_PROLOGUE(MP);						\
 	_rc = (*(MP)->mnt_op->vfs_sysctl)(MP, OP, REQ);			\
 	VFS_EPILOGUE(MP);						\
 	_rc; })
 
 #define	VFS_SUSP_CLEAN(MP) do {						\
 	if (*(MP)->mnt_op->vfs_susp_clean != NULL) {			\
 		VFS_PROLOGUE(MP);					\
 		(*(MP)->mnt_op->vfs_susp_clean)(MP);			\
 		VFS_EPILOGUE(MP);					\
 	}								\
 } while (0)
 
 #define	VFS_RECLAIM_LOWERVP(MP, VP) do {				\
 	if (*(MP)->mnt_op->vfs_reclaim_lowervp != NULL) {		\
 		VFS_PROLOGUE(MP);					\
 		(*(MP)->mnt_op->vfs_reclaim_lowervp)((MP), (VP));	\
 		VFS_EPILOGUE(MP);					\
 	}								\
 } while (0)
 
 #define	VFS_UNLINK_LOWERVP(MP, VP) do {					\
 	if (*(MP)->mnt_op->vfs_unlink_lowervp != NULL) {		\
 		VFS_PROLOGUE(MP);					\
 		(*(MP)->mnt_op->vfs_unlink_lowervp)((MP), (VP));	\
 		VFS_EPILOGUE(MP);					\
 	}								\
 } while (0)
 
 #define	VFS_PURGE(MP) do {						\
 	if (*(MP)->mnt_op->vfs_purge != NULL) {				\
 		VFS_PROLOGUE(MP);					\
 		(*(MP)->mnt_op->vfs_purge)(MP);				\
 		VFS_EPILOGUE(MP);					\
 	}								\
 } while (0)
 
 #define VFS_KNOTE_LOCKED(vp, hint) do					\
 {									\
 	if (((vp)->v_vflag & VV_NOKNOTE) == 0)				\
 		VN_KNOTE((vp), (hint), KNF_LISTLOCKED);			\
 } while (0)
 
 #define VFS_KNOTE_UNLOCKED(vp, hint) do					\
 {									\
 	if (((vp)->v_vflag & VV_NOKNOTE) == 0)				\
 		VN_KNOTE((vp), (hint), 0);				\
 } while (0)
 
 #define	VFS_NOTIFY_UPPER_RECLAIM	1
 #define	VFS_NOTIFY_UPPER_UNLINK		2
 
 #include <sys/module.h>
 
 /*
  * Version numbers.
  */
 #define VFS_VERSION_00	0x19660120
 #define VFS_VERSION_01	0x20121030
 #define VFS_VERSION	VFS_VERSION_01
 
 #define VFS_SET(vfsops, fsname, flags) \
 	static struct vfsconf fsname ## _vfsconf = {		\
 		.vfc_version = VFS_VERSION,			\
 		.vfc_name = #fsname,				\
 		.vfc_vfsops = &vfsops,				\
 		.vfc_typenum = -1,				\
 		.vfc_flags = flags,				\
 	};							\
 	static moduledata_t fsname ## _mod = {			\
 		#fsname,					\
 		vfs_modevent,					\
 		& fsname ## _vfsconf				\
 	};							\
 	DECLARE_MODULE(fsname, fsname ## _mod, SI_SUB_VFS, SI_ORDER_MIDDLE)
 
 /*
  * exported vnode operations
  */
 
 int	dounmount(struct mount *, int, struct thread *);
 
 int	kernel_mount(struct mntarg *ma, uint64_t flags);
 int	kernel_vmount(int flags, ...);
 struct mntarg *mount_arg(struct mntarg *ma, const char *name, const void *val, int len);
 struct mntarg *mount_argb(struct mntarg *ma, int flag, const char *name);
 struct mntarg *mount_argf(struct mntarg *ma, const char *name, const char *fmt, ...);
 struct mntarg *mount_argsu(struct mntarg *ma, const char *name, const void *val, int len);
 void	statfs_scale_blocks(struct statfs *sf, long max_size);
 struct vfsconf *vfs_byname(const char *);
 struct vfsconf *vfs_byname_kld(const char *, struct thread *td, int *);
 void	vfs_mount_destroy(struct mount *);
 void	vfs_event_signal(fsid_t *, u_int32_t, intptr_t);
 void	vfs_freeopts(struct vfsoptlist *opts);
 void	vfs_deleteopt(struct vfsoptlist *opts, const char *name);
 int	vfs_buildopts(struct uio *auio, struct vfsoptlist **options);
 int	vfs_flagopt(struct vfsoptlist *opts, const char *name, uint64_t *w,
 	    uint64_t val);
 int	vfs_getopt(struct vfsoptlist *, const char *, void **, int *);
 int	vfs_getopt_pos(struct vfsoptlist *opts, const char *name);
 int	vfs_getopt_size(struct vfsoptlist *opts, const char *name,
 	    off_t *value);
 char	*vfs_getopts(struct vfsoptlist *, const char *, int *error);
 int	vfs_copyopt(struct vfsoptlist *, const char *, void *, int);
 int	vfs_filteropt(struct vfsoptlist *, const char **legal);
 void	vfs_opterror(struct vfsoptlist *opts, const char *fmt, ...);
 int	vfs_scanopt(struct vfsoptlist *opts, const char *name, const char *fmt, ...);
 int	vfs_setopt(struct vfsoptlist *opts, const char *name, void *value,
 	    int len);
 int	vfs_setopt_part(struct vfsoptlist *opts, const char *name, void *value,
 	    int len);
 int	vfs_setopts(struct vfsoptlist *opts, const char *name,
 	    const char *value);
 int	vfs_setpublicfs			    /* set publicly exported fs */
 	    (struct mount *, struct netexport *, struct export_args *);
 void	vfs_msync(struct mount *, int);
 int	vfs_busy(struct mount *, int);
 int	vfs_export			 /* process mount export info */
 	    (struct mount *, struct export_args *);
 void	vfs_allocate_syncvnode(struct mount *);
 void	vfs_deallocate_syncvnode(struct mount *);
 int	vfs_donmount(struct thread *td, uint64_t fsflags,
 	    struct uio *fsoptions);
 void	vfs_getnewfsid(struct mount *);
 struct cdev *vfs_getrootfsid(struct mount *);
 struct	mount *vfs_getvfs(fsid_t *);      /* return vfs given fsid */
 struct	mount *vfs_busyfs(fsid_t *);
 int	vfs_modevent(module_t, int, void *);
 void	vfs_mount_error(struct mount *, const char *, ...);
 void	vfs_mountroot(void);			/* mount our root filesystem */
 void	vfs_mountedfrom(struct mount *, const char *from);
 void	vfs_notify_upper(struct vnode *, int);
 void	vfs_oexport_conv(const struct oexport_args *oexp,
 	    struct export_args *exp);
 void	vfs_ref(struct mount *);
 void	vfs_rel(struct mount *);
 struct mount *vfs_mount_alloc(struct vnode *, struct vfsconf *, const char *,
 	    struct ucred *);
 int	vfs_suser(struct mount *, struct thread *);
 void	vfs_unbusy(struct mount *);
 void	vfs_unmountall(void);
 extern	TAILQ_HEAD(mntlist, mount) mountlist;	/* mounted filesystem list */
 extern	struct mtx mountlist_mtx;
 extern	struct nfs_public nfs_pub;
 extern	struct sx vfsconf_sx;
 #define	vfsconf_lock()		sx_xlock(&vfsconf_sx)
 #define	vfsconf_unlock()	sx_xunlock(&vfsconf_sx)
 #define	vfsconf_slock()		sx_slock(&vfsconf_sx)
 #define	vfsconf_sunlock()	sx_sunlock(&vfsconf_sx)
 
 /*
  * Declarations for these vfs default operations are located in
  * kern/vfs_default.c.  They will be automatically used to replace
  * null entries in VFS ops tables when registering a new filesystem
  * type in the global table.
  */
 vfs_root_t		vfs_stdroot;
 vfs_quotactl_t		vfs_stdquotactl;
 vfs_statfs_t		vfs_stdstatfs;
 vfs_sync_t		vfs_stdsync;
 vfs_sync_t		vfs_stdnosync;
 vfs_vget_t		vfs_stdvget;
 vfs_fhtovp_t		vfs_stdfhtovp;
 vfs_checkexp_t		vfs_stdcheckexp;
 vfs_init_t		vfs_stdinit;
 vfs_uninit_t		vfs_stduninit;
 vfs_extattrctl_t	vfs_stdextattrctl;
 vfs_sysctl_t		vfs_stdsysctl;
 
 void	syncer_suspend(void);
 void	syncer_resume(void);
 
 #else /* !_KERNEL */
 
 #include <sys/cdefs.h>
 
 struct stat;
 
 __BEGIN_DECLS
 int	fhopen(const struct fhandle *, int);
 int	fhstat(const struct fhandle *, struct stat *);
 int	fhstatfs(const struct fhandle *, struct statfs *);
 int	fstatfs(int, struct statfs *);
 int	getfh(const char *, fhandle_t *);
 int	getfsstat(struct statfs *, long, int);
 int	getmntinfo(struct statfs **, int);
 int	lgetfh(const char *, fhandle_t *);
 int	mount(const char *, const char *, int, void *);
 int	nmount(struct iovec *, unsigned int, int);
 int	statfs(const char *, struct statfs *);
 int	unmount(const char *, int);
 
 /* C library stuff */
 int	getvfsbyname(const char *, struct xvfsconf *);
 __END_DECLS
 
 #endif /* _KERNEL */
 
 #endif /* !_SYS_MOUNT_H_ */
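
Note on the MNTK_USES_BCACHE addition in the hunk above: the following is a
minimal sketch, not code from the tree, of how a kernel flag like this one is
typically tested. It follows the pattern of the MNT_SHARED_WRITES() inline
already in the header. The flag values are copied from the hunk; struct
sketch_mount and both sketch_* helpers are hypothetical names introduced only
for illustration, so that the example builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Values copied from the hunk above and redefined locally so that the
 * sketch compiles without <sys/mount.h>.
 */
#define	MNTK_UNMAPPED_BUFS	0x00002000
#define	MNTK_USES_BCACHE	0x00004000	/* FS uses the buffer cache. */
#define	MNTK_NOASYNC		0x00800000	/* disable async */

/* Hypothetical stand-in for struct mount; only the flag word is modelled. */
struct sketch_mount {
	uint32_t	mnt_kern_flag;
};

/*
 * Hypothetical helper modelled on MNT_SHARED_WRITES(): returns 1 when the
 * mount exists and has MNTK_USES_BCACHE set, 0 otherwise.
 */
static inline int
sketch_uses_bcache(const struct sketch_mount *mp)
{

	return (mp != NULL && (mp->mnt_kern_flag & MNTK_USES_BCACHE) != 0);
}

/*
 * Hypothetical check: returns 1 when the new bit overlaps neither of the
 * two neighbouring definitions shown in the hunk.
 */
static inline int
sketch_bcache_bit_is_free(void)
{

	return ((MNTK_USES_BCACHE & (MNTK_UNMAPPED_BUFS | MNTK_NOASYNC)) == 0);
}
```

A caller would set the bit with mnt_kern_flag |= MNTK_USES_BCACHE and later
query it through the helper. Whether the real consumers use an inline like
this or test the bit directly is not shown in this diff.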
Index: user/ngie/more-tests/sys/sys/param.h
===================================================================
--- user/ngie/more-tests/sys/sys/param.h	(revision 281584)
+++ user/ngie/more-tests/sys/sys/param.h	(revision 281585)
@@ -1,348 +1,348 @@
 /*-
  * Copyright (c) 1982, 1986, 1989, 1993
  *	The Regents of the University of California.  All rights reserved.
  * (c) UNIX System Laboratories, Inc.
  * All or some portions of this file are derived from material licensed
  * to the University of California by American Telephone and Telegraph
  * Co. or Unix System Laboratories, Inc. and are reproduced herein with
  * the permission of UNIX System Laboratories, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)param.h	8.3 (Berkeley) 4/4/95
  * $FreeBSD$
  */
 
 #ifndef _SYS_PARAM_H_
 #define _SYS_PARAM_H_
 
 #include <sys/_null.h>
 
 #define	BSD	199506		/* System version (year & month). */
 #define BSD4_3	1
 #define BSD4_4	1
 
 /* 
  * __FreeBSD_version numbers are documented in the Porter's Handbook.
  * If you bump the version for any reason, you should update the documentation
  * there.
  * Currently this lives here in the doc/ repository:
  *
- *	head/en_US.ISO8859-1/books/porters-handbook/book.xml
+ *	head/en_US.ISO8859-1/books/porters-handbook/versions/chapter.xml
  *
  * scheme is:  <major><two digit minor>Rxx
  *		'R' is in the range 0 to 4 if this is a release branch or
  *		x.0-CURRENT before RELENG_*_0 is created, otherwise 'R' is
  *		in the range 5 to 9.
  */
 #undef __FreeBSD_version
-#define __FreeBSD_version 1100068	/* Master, propagated to newvers */
+#define __FreeBSD_version 1100069	/* Master, propagated to newvers */
 
 /*
  * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD,
  * which by definition is always true on FreeBSD. This macro is also defined
  * on other systems that use the kernel of FreeBSD, such as GNU/kFreeBSD.
  *
  * It is tempting to use this macro in userland code when we want to enable
  * kernel-specific routines, and in fact it's fine to do this in code that
  * is part of FreeBSD itself.  However, be aware that as presence of this
  * macro is still not widespread (e.g. older FreeBSD versions, 3rd party
  * compilers, etc), it is STRONGLY DISCOURAGED to check for this macro in
  * external applications without also checking for __FreeBSD__ as an
  * alternative.
  */
 #undef __FreeBSD_kernel__
 #define __FreeBSD_kernel__
 
 #ifdef _KERNEL
 #define	P_OSREL_SIGWAIT		700000
 #define	P_OSREL_SIGSEGV		700004
 #define	P_OSREL_MAP_ANON	800104
 #define	P_OSREL_MAP_FSTRICT	1100036
 
 #define	P_OSREL_MAJOR(x)	((x) / 100000)
 #endif
 
 #ifndef LOCORE
 #include <sys/types.h>
 #endif
 
 /*
  * Machine-independent constants (some used in following include files).
  * Redefined constants are from POSIX 1003.1 limits file.
  *
  * MAXCOMLEN should be >= sizeof(ac_comm) (see <acct.h>)
  */
 #include <sys/syslimits.h>
 
 #define	MAXCOMLEN	19		/* max command name remembered */
 #define	MAXINTERP	PATH_MAX	/* max interpreter file name length */
 #define	MAXLOGNAME	33		/* max login name length (incl. NUL) */
 #define	MAXUPRC		CHILD_MAX	/* max simultaneous processes */
 #define	NCARGS		ARG_MAX		/* max bytes for an exec function */
 #define	NGROUPS		(NGROUPS_MAX+1)	/* max number groups */
 #define	NOFILE		OPEN_MAX	/* max open files per process */
 #define	NOGROUP		65535		/* marker for empty group set member */
 #define MAXHOSTNAMELEN	256		/* max hostname size */
 #define SPECNAMELEN	63		/* max length of devicename */
 
 /* More types and definitions used throughout the kernel. */
 #ifdef _KERNEL
 #include <sys/cdefs.h>
 #include <sys/errno.h>
 #ifndef LOCORE
 #include <sys/time.h>
 #include <sys/priority.h>
 #endif
 
 #ifndef FALSE
 #define	FALSE	0
 #endif
 #ifndef TRUE
 #define	TRUE	1
 #endif
 #endif
 
 #ifndef _KERNEL
 /* Signals. */
 #include <sys/signal.h>
 #endif
 
 /* Machine type dependent parameters. */
 #include <machine/param.h>
 #ifndef _KERNEL
 #include <sys/limits.h>
 #endif
 
 #ifndef DEV_BSHIFT
 #define	DEV_BSHIFT	9		/* log2(DEV_BSIZE) */
 #endif
 #define	DEV_BSIZE	(1<<DEV_BSHIFT)
 
 #ifndef BLKDEV_IOSIZE
 #define BLKDEV_IOSIZE  PAGE_SIZE	/* default block device I/O size */
 #endif
 #ifndef DFLTPHYS
 #define DFLTPHYS	(64 * 1024)	/* default max raw I/O transfer size */
 #endif
 #ifndef MAXPHYS
 #define MAXPHYS		(128 * 1024)	/* max raw I/O transfer size */
 #endif
 #ifndef MAXDUMPPGS
 #define MAXDUMPPGS	(DFLTPHYS/PAGE_SIZE)
 #endif
 
 /*
  * Constants related to network buffer management.
  * MCLBYTES must be no larger than PAGE_SIZE.
  */
 #ifndef	MSIZE
 #define	MSIZE		256		/* size of an mbuf */
 #endif
 
 #ifndef	MCLSHIFT
 #define MCLSHIFT	11		/* convert bytes to mbuf clusters */
 #endif	/* MCLSHIFT */
 
 #define MCLBYTES	(1 << MCLSHIFT)	/* size of an mbuf cluster */
 
 #if PAGE_SIZE < 2048
 #define	MJUMPAGESIZE	MCLBYTES
 #elif PAGE_SIZE <= 8192
 #define	MJUMPAGESIZE	PAGE_SIZE
 #else
 #define	MJUMPAGESIZE	(8 * 1024)
 #endif
 
 #define	MJUM9BYTES	(9 * 1024)	/* jumbo cluster 9k */
 #define	MJUM16BYTES	(16 * 1024)	/* jumbo cluster 16k */
 
 /*
  * Some macros for units conversion
  */
 
 /* clicks to bytes */
 #ifndef ctob
 #define ctob(x)	((x)<<PAGE_SHIFT)
 #endif
 
 /* bytes to clicks */
 #ifndef btoc
 #define btoc(x)	(((vm_offset_t)(x)+PAGE_MASK)>>PAGE_SHIFT)
 #endif
 
 /*
  * btodb() is messy and perhaps slow because `bytes' may be an off_t.  We
  * want to shift an unsigned type to avoid sign extension and we don't
  * want to widen `bytes' unnecessarily.  Assume that the result fits in
  * a daddr_t.
  */
 #ifndef btodb
 #define btodb(bytes)	 		/* calculates (bytes / DEV_BSIZE) */ \
 	(sizeof (bytes) > sizeof(long) \
 	 ? (daddr_t)((unsigned long long)(bytes) >> DEV_BSHIFT) \
 	 : (daddr_t)((unsigned long)(bytes) >> DEV_BSHIFT))
 #endif
 
 #ifndef dbtob
 #define dbtob(db)			/* calculates (db * DEV_BSIZE) */ \
 	((off_t)(db) << DEV_BSHIFT)
 #endif
 
 #define	PRIMASK	0x0ff
 #define	PCATCH	0x100		/* OR'd with pri for tsleep to check signals */
 #define	PDROP	0x200	/* OR'd with pri to stop re-entry of interlock mutex */
 
 #define	NZERO	0		/* default "nice" */
 
 #define	NBBY	8		/* number of bits in a byte */
 #define	NBPW	sizeof(int)	/* number of bytes per word (integer) */
 
 #define	CMASK	022		/* default file mask: S_IWGRP|S_IWOTH */
 
 #define	NODEV	(dev_t)(-1)	/* non-existent device */
 
 /*
  * File system parameters and macros.
  *
  * MAXBSIZE -	Filesystems are made out of blocks of at most MAXBSIZE bytes
  *		per block.  MAXBSIZE may be made larger without affecting
  *		any existing filesystems as long as it does not exceed MAXPHYS,
  *		and may be made smaller at the risk of not being able to use
  *		filesystems which require a block size exceeding MAXBSIZE.
  *
  * BKVASIZE -	Nominal buffer space per buffer, in bytes.  BKVASIZE is the
  *		minimum KVM memory reservation the kernel is willing to make.
  *		Filesystems can of course request smaller chunks.  Actual 
  *		backing memory uses a chunk size of a page (PAGE_SIZE).
  *
  *		If you make BKVASIZE too small you risk seriously fragmenting
  *		the buffer KVM map which may slow things down a bit.  If you
  *		make it too big the kernel will not be able to optimally use 
  *		the KVM memory reserved for the buffer cache and will wind 
  *		up with too-few buffers.
  *
  *		The default is 16384, roughly 2x the block size used by a
  *		normal UFS filesystem.
  */
 #define MAXBSIZE	65536	/* must be power of 2 */
 #define BKVASIZE	16384	/* must be power of 2 */
 #define BKVAMASK	(BKVASIZE-1)
 
 /*
  * MAXPATHLEN defines the longest permissible path length after expanding
  * symbolic links. It is used to allocate a temporary buffer from the buffer
  * pool in which to do the name expansion, hence should be a power of two,
  * and must be less than or equal to MAXBSIZE.  MAXSYMLINKS defines the
  * maximum number of symbolic links that may be expanded in a path name.
  * It should be set high enough to allow all legitimate uses, but halt
  * infinite loops reasonably quickly.
  */
 #define	MAXPATHLEN	PATH_MAX
 #define MAXSYMLINKS	32
 
 /* Bit map related macros. */
 #define	setbit(a,i)	(((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY))
 #define	clrbit(a,i)	(((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY)))
 #define	isset(a,i)							\
 	(((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY)))
 #define	isclr(a,i)							\
 	((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0)
 
 /* Macros for counting and rounding. */
 #ifndef howmany
 #define	howmany(x, y)	(((x)+((y)-1))/(y))
 #endif
 #define	nitems(x)	(sizeof((x)) / sizeof((x)[0]))
 #define	rounddown(x, y)	(((x)/(y))*(y))
 #define	rounddown2(x, y) ((x)&(~((y)-1)))          /* if y is power of two */
 #define	roundup(x, y)	((((x)+((y)-1))/(y))*(y))  /* to any y */
 #define	roundup2(x, y)	(((x)+((y)-1))&(~((y)-1))) /* if y is power of two */
 #define powerof2(x)	((((x)-1)&(x))==0)
 
 /* Macros for min/max. */
 #define	MIN(a,b) (((a)<(b))?(a):(b))
 #define	MAX(a,b) (((a)>(b))?(a):(b))
 
 #ifdef _KERNEL
 /*
  * Basic byte order function prototypes for non-inline functions.
  */
 #ifndef LOCORE
 #ifndef _BYTEORDER_PROTOTYPED
 #define	_BYTEORDER_PROTOTYPED
 __BEGIN_DECLS
 __uint32_t	 htonl(__uint32_t);
 __uint16_t	 htons(__uint16_t);
 __uint32_t	 ntohl(__uint32_t);
 __uint16_t	 ntohs(__uint16_t);
 __END_DECLS
 #endif
 #endif
 
 #ifndef lint
 #ifndef _BYTEORDER_FUNC_DEFINED
 #define	_BYTEORDER_FUNC_DEFINED
 #define	htonl(x)	__htonl(x)
 #define	htons(x)	__htons(x)
 #define	ntohl(x)	__ntohl(x)
 #define	ntohs(x)	__ntohs(x)
 #endif /* !_BYTEORDER_FUNC_DEFINED */
 #endif /* lint */
 #endif /* _KERNEL */
 
 /*
  * Scale factor for scaled integers used to count %cpu time and load avgs.
  *
  * The number of CPU `tick's that map to a unique `%age' can be expressed
  * by the formula (1 / (2 ^ (FSHIFT - 11))).  The maximum load average that
  * can be calculated (assuming 32 bits) can be closely approximated using
  * the formula (2 ^ (2 * (16 - FSHIFT))) for (FSHIFT < 15).
  *
  * For the scheduler to maintain a 1:1 mapping of CPU `tick' to `%age',
  * FSHIFT must be at least 11; this gives us a maximum load avg of ~1024.
  */
 #define	FSHIFT	11		/* bits to right of fixed binary point */
 #define FSCALE	(1<<FSHIFT)
 
 #define dbtoc(db)			/* calculates devblks to pages */ \
 	((db + (ctodb(1) - 1)) >> (PAGE_SHIFT - DEV_BSHIFT))
  
 #define ctodb(db)			/* calculates pages to devblks */ \
 	((db) << (PAGE_SHIFT - DEV_BSHIFT))
 
 /*
  * Old spelling of __containerof().
  */
 #define	member2struct(s, m, x)						\
 	((struct s *)(void *)((char *)(x) - offsetof(struct s, m)))
 
 /*
  * Access a variable length array that has been declared as a fixed
  * length array.
  */
 #define __PAST_END(array, offset) (((__typeof__(*(array)) *)(array))[offset])
 
 #endif	/* _SYS_PARAM_H_ */
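The counting, rounding, bitmap, and fixed-point macros in the sys/param.h context above are compact but easy to misuse, so a small illustration may help. What follows is a minimal userland sketch, not part of this patch: it copies a few of the macros locally so that it compiles outside the kernel tree. The helper names bitmap_bytes, align_up, count_set, and fixpt_to_hundredths are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of macros from the hunk above (illustration only). */
#define	NBBY	8
#define	setbit(a,i)	(((unsigned char *)(a))[(i)/NBBY] |= 1<<((i)%NBBY))
#define	clrbit(a,i)	(((unsigned char *)(a))[(i)/NBBY] &= ~(1<<((i)%NBBY)))
#define	isset(a,i)	(((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY)))
#define	isclr(a,i)	((((const unsigned char *)(a))[(i)/NBBY] & (1<<((i)%NBBY))) == 0)
#define	howmany(x, y)	(((x)+((y)-1))/(y))
#define	rounddown(x, y)	(((x)/(y))*(y))
#define	rounddown2(x, y) ((x)&(~((y)-1)))
#define	roundup(x, y)	((((x)+((y)-1))/(y))*(y))
#define	roundup2(x, y)	(((x)+((y)-1))&(~((y)-1)))
#define	powerof2(x)	((((x)-1)&(x))==0)
#define	FSHIFT	11
#define	FSCALE	(1<<FSHIFT)

/* Hypothetical helper: bytes needed to hold a bitmap of nbits bits. */
static size_t
bitmap_bytes(size_t nbits)
{
	return (howmany(nbits, NBBY));
}

/*
 * Hypothetical helper: round len up to a power-of-two alignment.
 * powerof2(0) evaluates to true, so zero is rejected separately.
 */
static size_t
align_up(size_t len, size_t align)
{
	assert(align != 0 && powerof2(align));
	return (roundup2(len, align));
}

/* Hypothetical helper: count the bits set among the first nbits of map. */
static size_t
count_set(const unsigned char *map, size_t nbits)
{
	size_t i, n = 0;

	for (i = 0; i < nbits; i++)
		if (isset(map, i))
			n++;
	return (n);
}

/*
 * Hypothetical helper: convert an FSCALE-scaled fixed-point value to
 * hundredths, so a value representing 1.5 yields 150.
 */
static unsigned long long
fixpt_to_hundredths(unsigned int v)
{
	return (((unsigned long long)v * 100) >> FSHIFT);
}
```

A caller might size a bitmap with bitmap_bytes(), mark entries with setbit(), and tally them with count_set(). roundup() and rounddown() accept any divisor, while the roundup2() and rounddown2() variants are only correct when the second argument is a power of two, which is why align_up() asserts that condition first.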
Index: user/ngie/more-tests/sys/sys/syscallsubr.h
===================================================================
--- user/ngie/more-tests/sys/sys/syscallsubr.h	(revision 281584)
+++ user/ngie/more-tests/sys/sys/syscallsubr.h	(revision 281585)
@@ -1,240 +1,240 @@
 /*-
  * Copyright (c) 2002 Ian Dowse.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef _SYS_SYSCALLSUBR_H_
 #define _SYS_SYSCALLSUBR_H_
 
 #include <sys/signal.h>
 #include <sys/uio.h>
 #include <sys/socket.h>
 #include <sys/mac.h>
 #include <sys/mount.h>
 
 struct file;
 enum idtype;
 struct itimerval;
 struct image_args;
 struct jail;
 struct kevent;
 struct kevent_copyops;
 struct kld_file_stat;
 struct ksiginfo;
 struct mbuf;
 struct msghdr;
 struct msqid_ds;
 struct pollfd;
 struct ogetdirentries_args;
 struct rlimit;
 struct rusage;
 union semun;
 struct sendfile_args;
 struct sockaddr;
 struct stat;
 struct thr_param;
 struct __wrusage;
 
 int	kern___getcwd(struct thread *td, char *buf, enum uio_seg bufseg,
 	    u_int buflen);
 int	kern_accept(struct thread *td, int s, struct sockaddr **name,
 	    socklen_t *namelen, struct file **fp);
 int	kern_accept4(struct thread *td, int s, struct sockaddr **name,
 	    socklen_t *namelen, int flags, struct file **fp);
 int	kern_accessat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, int flags, int mode);
 int	kern_adjtime(struct thread *td, struct timeval *delta,
 	    struct timeval *olddelta);
 int	kern_alternate_path(struct thread *td, const char *prefix, const char *path,
 	    enum uio_seg pathseg, char **pathbuf, int create, int dirfd);
 int	kern_bindat(struct thread *td, int dirfd, int fd, struct sockaddr *sa);
 int	kern_cap_ioctls_limit(struct thread *td, int fd, u_long *cmds,
 	    size_t ncmds);
 int	kern_chdir(struct thread *td, char *path, enum uio_seg pathseg);
 int	kern_clock_getcpuclockid2(struct thread *td, id_t id, int which,
 	    clockid_t *clk_id);
 int	kern_clock_getres(struct thread *td, clockid_t clock_id,
 	    struct timespec *ts);
 int	kern_clock_gettime(struct thread *td, clockid_t clock_id,
 	    struct timespec *ats);
 int	kern_clock_settime(struct thread *td, clockid_t clock_id,
 	    struct timespec *ats);
 int	kern_close(struct thread *td, int fd);
 int	kern_connectat(struct thread *td, int dirfd, int fd,
 	    struct sockaddr *sa);
 int	kern_execve(struct thread *td, struct image_args *args,
 	    struct mac *mac_p);
 int	kern_fchmodat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, mode_t mode, int flag);
 int	kern_fchownat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, int uid, int gid, int flag);
 int	kern_fcntl(struct thread *td, int fd, int cmd, intptr_t arg);
 int	kern_fcntl_freebsd(struct thread *td, int fd, int cmd, long arg);
 int	kern_fhstat(struct thread *td, fhandle_t fh, struct stat *buf);
 int	kern_fhstatfs(struct thread *td, fhandle_t fh, struct statfs *buf);
 int	kern_fstat(struct thread *td, int fd, struct stat *sbp);
 int	kern_fstatfs(struct thread *td, int fd, struct statfs *buf);
 int	kern_ftruncate(struct thread *td, int fd, off_t length);
 int	kern_futimes(struct thread *td, int fd, struct timeval *tptr,
 	    enum uio_seg tptrseg);
 int	kern_futimens(struct thread *td, int fd, struct timespec *tptr,
 	    enum uio_seg tptrseg);
 int	kern_getdirentries(struct thread *td, int fd, char *buf, u_int count,
 	    long *basep, ssize_t *residp, enum uio_seg bufseg);
 int	kern_getfsstat(struct thread *td, struct statfs **buf, size_t bufsize,
-	    enum uio_seg bufseg, int flags);
+	    size_t *countp, enum uio_seg bufseg, int flags);
 int	kern_getitimer(struct thread *, u_int, struct itimerval *);
 int	kern_getppid(struct thread *);
 int	kern_getpeername(struct thread *td, int fd, struct sockaddr **sa,
 	    socklen_t *alen);
 int	kern_getrusage(struct thread *td, int who, struct rusage *rup);
 int	kern_getsockname(struct thread *td, int fd, struct sockaddr **sa,
 	    socklen_t *alen);
 int	kern_getsockopt(struct thread *td, int s, int level, int name,
 	    void *optval, enum uio_seg valseg, socklen_t *valsize);
 int	kern_ioctl(struct thread *td, int fd, u_long com, caddr_t data);
 int	kern_jail(struct thread *td, struct jail *j);
 int	kern_jail_get(struct thread *td, struct uio *options, int flags);
 int	kern_jail_set(struct thread *td, struct uio *options, int flags);
 int	kern_kevent(struct thread *td, int fd, int nchanges, int nevents,
 	    struct kevent_copyops *k_ops, const struct timespec *timeout);
 int	kern_kldload(struct thread *td, const char *file, int *fileid);
 int	kern_kldstat(struct thread *td, int fileid, struct kld_file_stat *stat);
 int	kern_kldunload(struct thread *td, int fileid, int flags);
 int	kern_linkat(struct thread *td, int fd1, int fd2, char *path1,
 	    char *path2, enum uio_seg segflg, int follow);
 int	kern_lutimes(struct thread *td, char *path, enum uio_seg pathseg,
 	    struct timeval *tptr, enum uio_seg tptrseg);
 int	kern_mkdirat(struct thread *td, int fd, char *path,
 	    enum uio_seg segflg, int mode);
 int	kern_mkfifoat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, int mode);
 int	kern_mknodat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, int mode, int dev);
 int	kern_msgctl(struct thread *, int, int, struct msqid_ds *);
 int	kern_msgsnd(struct thread *, int, const void *, size_t, int, long);
 int	kern_msgrcv(struct thread *, int, void *, size_t, long, int, long *);
 int     kern_nanosleep(struct thread *td, struct timespec *rqt,
 	    struct timespec *rmt);
 int	kern_ogetdirentries(struct thread *td, struct ogetdirentries_args *uap,
 	    long *ploff);
 int	kern_openat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, int flags, int mode);
 int	kern_pathconf(struct thread *td, char *path, enum uio_seg pathseg,
 	    int name, u_long flags);
 int	kern_pipe(struct thread *td, int fildes[2]);
 int	kern_pipe2(struct thread *td, int fildes[2], int flags);
 int	kern_poll(struct thread *td, struct pollfd *fds, u_int nfds,
 	    struct timespec *tsp, sigset_t *uset);
 int	kern_posix_fadvise(struct thread *td, int fd, off_t offset, off_t len,
 	    int advice);
 int	kern_posix_fallocate(struct thread *td, int fd, off_t offset,
 	    off_t len);
 int	kern_procctl(struct thread *td, enum idtype idtype, id_t id, int com,
 	    void *data);
 int	kern_preadv(struct thread *td, int fd, struct uio *auio, off_t offset);
 int	kern_pselect(struct thread *td, int nd, fd_set *in, fd_set *ou,
 	    fd_set *ex, struct timeval *tvp, sigset_t *uset, int abi_nfdbits);
 int	kern_ptrace(struct thread *td, int req, pid_t pid, void *addr,
 	    int data);
 int	kern_pwritev(struct thread *td, int fd, struct uio *auio, off_t offset);
 int	kern_readlinkat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, char *buf, enum uio_seg bufseg, size_t count);
 int	kern_readv(struct thread *td, int fd, struct uio *auio);
 int	kern_recvit(struct thread *td, int s, struct msghdr *mp,
 	    enum uio_seg fromseg, struct mbuf **controlp);
 int	kern_renameat(struct thread *td, int oldfd, char *old, int newfd,
 	    char *new, enum uio_seg pathseg);
 int	kern_rmdirat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg);
 int	kern_sched_rr_get_interval(struct thread *td, pid_t pid,
 	    struct timespec *ts);
 int	kern_semctl(struct thread *td, int semid, int semnum, int cmd,
 	    union semun *arg, register_t *rval);
 int	kern_select(struct thread *td, int nd, fd_set *fd_in, fd_set *fd_ou,
 	    fd_set *fd_ex, struct timeval *tvp, int abi_nfdbits);
 int	kern_sendfile(struct thread *td, struct sendfile_args *uap,
 	    struct uio *hdr_uio, struct uio *trl_uio, int compat);
 int	kern_sendit(struct thread *td, int s, struct msghdr *mp, int flags,
 	    struct mbuf *control, enum uio_seg segflg);
 int	kern_setgroups(struct thread *td, u_int ngrp, gid_t *groups);
 int	kern_setitimer(struct thread *, u_int, struct itimerval *,
 	    struct itimerval *);
 int	kern_setrlimit(struct thread *, u_int, struct rlimit *);
 int	kern_setsockopt(struct thread *td, int s, int level, int name,
 	    void *optval, enum uio_seg valseg, socklen_t valsize);
 int	kern_settimeofday(struct thread *td, struct timeval *tv,
 	    struct timezone *tzp);
 int	kern_shmat(struct thread *td, int shmid, const void *shmaddr,
 	    int shmflg);
 int	kern_shmctl(struct thread *td, int shmid, int cmd, void *buf,
 	    size_t *bufsz);
 int	kern_sigaction(struct thread *td, int sig, struct sigaction *act,
 	    struct sigaction *oact, int flags);
 int	kern_sigaltstack(struct thread *td, stack_t *ss, stack_t *oss);
 int	kern_sigprocmask(struct thread *td, int how,
 	    sigset_t *set, sigset_t *oset, int flags);
 int	kern_sigsuspend(struct thread *td, sigset_t mask);
 int	kern_sigtimedwait(struct thread *td, sigset_t waitset,
 	    struct ksiginfo *ksi, struct timespec *timeout);
 int	kern_statat(struct thread *td, int flag, int fd, char *path,
 	    enum uio_seg pathseg, struct stat *sbp,
 	    void (*hook)(struct vnode *vp, struct stat *sbp));
 int	kern_statfs(struct thread *td, char *path, enum uio_seg pathseg,
 	    struct statfs *buf);
 int	kern_symlinkat(struct thread *td, char *path1, int fd, char *path2,
 	    enum uio_seg segflg);
 int	kern_ktimer_create(struct thread *td, clockid_t clock_id,
 	    struct sigevent *evp, int *timerid, int preset_id);
 int	kern_ktimer_delete(struct thread *, int);
 int	kern_ktimer_settime(struct thread *td, int timer_id, int flags,
 	    struct itimerspec *val, struct itimerspec *oval);
 int	kern_ktimer_gettime(struct thread *td, int timer_id,
 	    struct itimerspec *val);
 int	kern_ktimer_getoverrun(struct thread *td, int timer_id);
 int	kern_thr_new(struct thread *td, struct thr_param *param);
 int	kern_thr_suspend(struct thread *td, struct timespec *tsp);
 int	kern_truncate(struct thread *td, char *path, enum uio_seg pathseg,
 	    off_t length);
 int	kern_unlinkat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, ino_t oldinum);
 int	kern_utimesat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, struct timeval *tptr, enum uio_seg tptrseg);
 int	kern_utimensat(struct thread *td, int fd, char *path,
 	    enum uio_seg pathseg, struct timespec *tptr, enum uio_seg tptrseg,
 	    int follow);
 int	kern_wait(struct thread *td, pid_t pid, int *status, int options,
 	    struct rusage *rup);
 int	kern_wait6(struct thread *td, enum idtype idtype, id_t id, int *status,
 	    int options, struct __wrusage *wrup, siginfo_t *sip);
 int	kern_writev(struct thread *td, int fd, struct uio *auio);
 int	kern_socketpair(struct thread *td, int domain, int type, int protocol,
 	    int *rsv);
 
 /* flags for kern_sigaction */
 #define	KSA_OSIGSET	0x0001	/* uses osigact_t */
 #define	KSA_FREEBSD4	0x0002	/* uses ucontext4 */
 
 #endif /* !_SYS_SYSCALLSUBR_H_ */
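The only change in the syscallsubr.h hunk above is the new size_t *countp argument to kern_getfsstat(). The patch text does not state its semantics; the sketch below assumes it is an out-parameter that carries the entry count while the return value carries only an error code. Everything here is a hypothetical userland stand-in (demo_statfs, demo_mounts, demo_getfsstat) and not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct statfs: just a mount point name. */
struct demo_statfs {
	char	f_mntonname[32];
};

/* Hypothetical table of mounted filesystems used by the sketch. */
static const struct demo_statfs demo_mounts[] = {
	{ "/" }, { "/usr" }, { "/var" }
};

/*
 * Hypothetical function with the assumed calling shape: the return
 * value is an errno-style code and the count is stored in *countp.
 * With buf == NULL only the total number of entries is reported.
 */
static int
demo_getfsstat(struct demo_statfs *buf, size_t bufsize, size_t *countp)
{
	size_t total = sizeof(demo_mounts) / sizeof(demo_mounts[0]);
	size_t max, i;

	assert(countp != NULL);
	if (buf == NULL) {
		*countp = total;
		return (0);
	}
	if (bufsize < sizeof(*buf))
		return (EINVAL);
	max = bufsize / sizeof(*buf);
	for (i = 0; i < total && i < max; i++)
		buf[i] = demo_mounts[i];
	*countp = i;		/* entries actually copied */
	return (0);
}
```

Under this assumption a caller first checks the returned error and only then reads the count, which keeps the error path and the result separate.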
Index: user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c
===================================================================
--- user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c	(revision 281584)
+++ user/ngie/more-tests/sys/ufs/ffs/ffs_vfsops.c	(revision 281585)
@@ -1,2216 +1,2217 @@
 /*-
  * Copyright (c) 1989, 1991, 1993, 1994
  *	The Regents of the University of California.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 4. Neither the name of the University nor the names of its contributors
  *    may be used to endorse or promote products derived from this software
  *    without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  *	@(#)ffs_vfsops.c	8.31 (Berkeley) 5/20/95
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include "opt_quota.h"
 #include "opt_ufs.h"
 #include "opt_ffs.h"
 #include "opt_ddb.h"
 
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/namei.h>
 #include <sys/priv.h>
 #include <sys/proc.h>
 #include <sys/kernel.h>
 #include <sys/vnode.h>
 #include <sys/mount.h>
 #include <sys/bio.h>
 #include <sys/buf.h>
 #include <sys/conf.h>
 #include <sys/fcntl.h>
 #include <sys/ioccom.h>
 #include <sys/malloc.h>
 #include <sys/mutex.h>
 #include <sys/rwlock.h>
 
 #include <security/mac/mac_framework.h>
 
 #include <ufs/ufs/extattr.h>
 #include <ufs/ufs/gjournal.h>
 #include <ufs/ufs/quota.h>
 #include <ufs/ufs/ufsmount.h>
 #include <ufs/ufs/inode.h>
 #include <ufs/ufs/ufs_extern.h>
 
 #include <ufs/ffs/fs.h>
 #include <ufs/ffs/ffs_extern.h>
 
 #include <vm/vm.h>
 #include <vm/uma.h>
 #include <vm/vm_page.h>
 
 #include <geom/geom.h>
 #include <geom/geom_vfs.h>
 
 #include <ddb/ddb.h>
 
 static uma_zone_t uma_inode, uma_ufs1, uma_ufs2;
 
 static int	ffs_mountfs(struct vnode *, struct mount *, struct thread *);
 static void	ffs_oldfscompat_read(struct fs *, struct ufsmount *,
 		    ufs2_daddr_t);
 static void	ffs_ifree(struct ufsmount *ump, struct inode *ip);
 static int	ffs_sync_lazy(struct mount *mp);
 
 static vfs_init_t ffs_init;
 static vfs_uninit_t ffs_uninit;
 static vfs_extattrctl_t ffs_extattrctl;
 static vfs_cmount_t ffs_cmount;
 static vfs_unmount_t ffs_unmount;
 static vfs_mount_t ffs_mount;
 static vfs_statfs_t ffs_statfs;
 static vfs_fhtovp_t ffs_fhtovp;
 static vfs_sync_t ffs_sync;
 
 static struct vfsops ufs_vfsops = {
 	.vfs_extattrctl =	ffs_extattrctl,
 	.vfs_fhtovp =		ffs_fhtovp,
 	.vfs_init =		ffs_init,
 	.vfs_mount =		ffs_mount,
 	.vfs_cmount =		ffs_cmount,
 	.vfs_quotactl =		ufs_quotactl,
 	.vfs_root =		ufs_root,
 	.vfs_statfs =		ffs_statfs,
 	.vfs_sync =		ffs_sync,
 	.vfs_uninit =		ffs_uninit,
 	.vfs_unmount =		ffs_unmount,
 	.vfs_vget =		ffs_vget,
 	.vfs_susp_clean =	process_deferred_inactive,
 };
 
 VFS_SET(ufs_vfsops, ufs, 0);
 MODULE_VERSION(ufs, 1);
 
 static b_strategy_t ffs_geom_strategy;
 static b_write_t ffs_bufwrite;
 
 static struct buf_ops ffs_ops = {
 	.bop_name =	"FFS",
 	.bop_write =	ffs_bufwrite,
 	.bop_strategy =	ffs_geom_strategy,
 	.bop_sync =	bufsync,
 #ifdef NO_FFS_SNAPSHOT
 	.bop_bdflush =	bufbdflush,
 #else
 	.bop_bdflush =	ffs_bdflush,
 #endif
 };
 
 /*
  * Note that userquota and groupquota options are not currently used
  * by UFS/FFS code and generally mount(8) does not pass those options
  * from userland, but they can be passed by loader(8) via
  * vfs.root.mountfrom.options.
  */
 static const char *ffs_opts[] = { "acls", "async", "noatime", "noclusterr",
     "noclusterw", "noexec", "export", "force", "from", "groupquota",
     "multilabel", "nfsv4acls", "fsckpid", "snapshot", "nosuid", "suiddir",
     "nosymfollow", "sync", "union", "userquota", NULL };
 
 static int
 ffs_mount(struct mount *mp)
 {
 	struct vnode *devvp;
 	struct thread *td;
 	struct ufsmount *ump = NULL;
 	struct fs *fs;
 	pid_t fsckpid = 0;
 	int error, flags;
 	uint64_t mntorflags;
 	accmode_t accmode;
 	struct nameidata ndp;
 	char *fspec;
 
 	td = curthread;
 	if (vfs_filteropt(mp->mnt_optnew, ffs_opts))
 		return (EINVAL);
 	if (uma_inode == NULL) {
 		uma_inode = uma_zcreate("FFS inode",
 		    sizeof(struct inode), NULL, NULL, NULL, NULL,
 		    UMA_ALIGN_PTR, 0);
 		uma_ufs1 = uma_zcreate("FFS1 dinode",
 		    sizeof(struct ufs1_dinode), NULL, NULL, NULL, NULL,
 		    UMA_ALIGN_PTR, 0);
 		uma_ufs2 = uma_zcreate("FFS2 dinode",
 		    sizeof(struct ufs2_dinode), NULL, NULL, NULL, NULL,
 		    UMA_ALIGN_PTR, 0);
 	}
 
 	vfs_deleteopt(mp->mnt_optnew, "groupquota");
 	vfs_deleteopt(mp->mnt_optnew, "userquota");
 
 	fspec = vfs_getopts(mp->mnt_optnew, "from", &error);
 	if (error)
 		return (error);
 
 	mntorflags = 0;
 	if (vfs_getopt(mp->mnt_optnew, "acls", NULL, NULL) == 0)
 		mntorflags |= MNT_ACLS;
 
 	if (vfs_getopt(mp->mnt_optnew, "snapshot", NULL, NULL) == 0) {
 		mntorflags |= MNT_SNAPSHOT;
 		/*
 		 * Once we have set the MNT_SNAPSHOT flag, do not
 		 * persist "snapshot" in the options list.
 		 */
 		vfs_deleteopt(mp->mnt_optnew, "snapshot");
 		vfs_deleteopt(mp->mnt_opt, "snapshot");
 	}
 
 	if (vfs_getopt(mp->mnt_optnew, "fsckpid", NULL, NULL) == 0 &&
 	    vfs_scanopt(mp->mnt_optnew, "fsckpid", "%d", &fsckpid) == 1) {
 		/*
 		 * Once we have set the restricted PID, do not
 		 * persist "fsckpid" in the options list.
 		 */
 		vfs_deleteopt(mp->mnt_optnew, "fsckpid");
 		vfs_deleteopt(mp->mnt_opt, "fsckpid");
 		if (mp->mnt_flag & MNT_UPDATE) {
 			if (VFSTOUFS(mp)->um_fs->fs_ronly == 0 &&
 			     vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0) == 0) {
 				vfs_mount_error(mp,
 				    "Checker enable: Must be read-only");
 				return (EINVAL);
 			}
 		} else if (vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0) == 0) {
 			vfs_mount_error(mp,
 			    "Checker enable: Must be read-only");
 			return (EINVAL);
 		}
 		/* Set to -1 if we are done */
 		if (fsckpid == 0)
 			fsckpid = -1;
 	}
 
 	if (vfs_getopt(mp->mnt_optnew, "nfsv4acls", NULL, NULL) == 0) {
 		if (mntorflags & MNT_ACLS) {
 			vfs_mount_error(mp,
 			    "\"acls\" and \"nfsv4acls\" options "
 			    "are mutually exclusive");
 			return (EINVAL);
 		}
 		mntorflags |= MNT_NFS4ACLS;
 	}
 
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= mntorflags;
 	MNT_IUNLOCK(mp);
 	/*
 	 * If updating, check whether changing from read-only to
 	 * read/write; if there is no device name, that's all we do.
 	 */
 	if (mp->mnt_flag & MNT_UPDATE) {
 		ump = VFSTOUFS(mp);
 		fs = ump->um_fs;
 		devvp = ump->um_devvp;
 		if (fsckpid == -1 && ump->um_fsckpid > 0) {
 			if ((error = ffs_flushfiles(mp, WRITECLOSE, td)) != 0 ||
 			    (error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0)
 				return (error);
 			DROP_GIANT();
 			g_topology_lock();
 			/*
 			 * Return to normal read-only mode.
 			 */
 			error = g_access(ump->um_cp, 0, -1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			ump->um_fsckpid = 0;
 		}
 		if (fs->fs_ronly == 0 &&
 		    vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) {
 			/*
 			 * Flush any dirty data and suspend filesystem.
 			 */
 			if ((error = vn_start_write(NULL, &mp, V_WAIT)) != 0)
 				return (error);
 			error = vfs_write_suspend_umnt(mp);
 			if (error != 0)
 				return (error);
 			/*
 			 * Check for and optionally get rid of files open
 			 * for writing.
 			 */
 			flags = WRITECLOSE;
 			if (mp->mnt_flag & MNT_FORCE)
 				flags |= FORCECLOSE;
 			if (MOUNTEDSOFTDEP(mp)) {
 				error = softdep_flushfiles(mp, flags, td);
 			} else {
 				error = ffs_flushfiles(mp, flags, td);
 			}
 			if (error) {
 				vfs_write_resume(mp, 0);
 				return (error);
 			}
 			if (fs->fs_pendingblocks != 0 ||
 			    fs->fs_pendinginodes != 0) {
 				printf("WARNING: %s Update error: blocks %jd "
 				    "files %d\n", fs->fs_fsmnt, 
 				    (intmax_t)fs->fs_pendingblocks,
 				    fs->fs_pendinginodes);
 				fs->fs_pendingblocks = 0;
 				fs->fs_pendinginodes = 0;
 			}
 			if ((fs->fs_flags & (FS_UNCLEAN | FS_NEEDSFSCK)) == 0)
 				fs->fs_clean = 1;
 			if ((error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0) {
 				fs->fs_ronly = 0;
 				fs->fs_clean = 0;
 				vfs_write_resume(mp, 0);
 				return (error);
 			}
 			if (MOUNTEDSOFTDEP(mp))
 				softdep_unmount(mp);
 			DROP_GIANT();
 			g_topology_lock();
 			/*
 			 * Drop our write and exclusive access.
 			 */
 			g_access(ump->um_cp, 0, -1, -1);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			fs->fs_ronly = 1;
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 			/*
 			 * Allow the writers to note that filesystem
 			 * is ro now.
 			 */
 			vfs_write_resume(mp, 0);
 		}
 		if ((mp->mnt_flag & MNT_RELOAD) &&
 		    (error = ffs_reload(mp, td, 0)) != 0)
 			return (error);
 		if (fs->fs_ronly &&
 		    !vfs_flagopt(mp->mnt_optnew, "ro", NULL, 0)) {
 			/*
 			 * If we are running a checker, do not allow upgrade.
 			 */
 			if (ump->um_fsckpid > 0) {
 				vfs_mount_error(mp,
 				    "Active checker, cannot upgrade to write");
 				return (EINVAL);
 			}
 			/*
 			 * If upgrade to read-write by non-root, then verify
 			 * that user has necessary permissions on the device.
 			 */
 			vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 			error = VOP_ACCESS(devvp, VREAD | VWRITE,
 			    td->td_ucred, td);
 			if (error)
 				error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 			if (error) {
 				VOP_UNLOCK(devvp, 0);
 				return (error);
 			}
 			VOP_UNLOCK(devvp, 0);
 			fs->fs_flags &= ~FS_UNCLEAN;
 			if (fs->fs_clean == 0) {
 				fs->fs_flags |= FS_UNCLEAN;
 				if ((mp->mnt_flag & MNT_FORCE) ||
 				    ((fs->fs_flags &
 				     (FS_SUJ | FS_NEEDSFSCK)) == 0 &&
 				     (fs->fs_flags & FS_DOSOFTDEP))) {
 					printf("WARNING: %s was not properly "
 					   "dismounted\n", fs->fs_fsmnt);
 				} else {
 					vfs_mount_error(mp,
 					   "R/W mount of %s denied. %s.%s",
 					   fs->fs_fsmnt,
 					   "Filesystem is not clean - run fsck",
 					   (fs->fs_flags & FS_SUJ) == 0 ? "" :
 					   " Forced mount will invalidate"
 					   " journal contents");
 					return (EPERM);
 				}
 			}
 			DROP_GIANT();
 			g_topology_lock();
 			/*
 			 * Request exclusive write access.
 			 */
 			error = g_access(ump->um_cp, 0, 1, 1);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error)
 				return (error);
 			if ((error = vn_start_write(NULL, &mp, V_WAIT)) != 0)
 				return (error);
 			fs->fs_ronly = 0;
 			MNT_ILOCK(mp);
 			mp->mnt_flag &= ~MNT_RDONLY;
 			MNT_IUNLOCK(mp);
 			fs->fs_mtime = time_second;
 			/* check to see if we need to start softdep */
 			if ((fs->fs_flags & FS_DOSOFTDEP) &&
 			    (error = softdep_mount(devvp, mp, fs, td->td_ucred))){
 				vn_finished_write(mp);
 				return (error);
 			}
 			fs->fs_clean = 0;
 			if ((error = ffs_sbupdate(ump, MNT_WAIT, 0)) != 0) {
 				vn_finished_write(mp);
 				return (error);
 			}
 			if (fs->fs_snapinum[0] != 0)
 				ffs_snapshot_mount(mp);
 			vn_finished_write(mp);
 		}
 		/*
 		 * Soft updates is incompatible with "async",
 		 * so if we are doing softupdates stop the user
 		 * from setting the async flag in an update.
 		 * Softdep_mount() clears it in an initial mount
 		 * or ro->rw remount.
 		 */
 		if (MOUNTEDSOFTDEP(mp)) {
 			/* XXX: Reset too late ? */
 			MNT_ILOCK(mp);
 			mp->mnt_flag &= ~MNT_ASYNC;
 			MNT_IUNLOCK(mp);
 		}
 		/*
 		 * Keep MNT_ACLS flag if it is stored in superblock.
 		 */
 		if ((fs->fs_flags & FS_ACLS) != 0) {
 			/* XXX: Set too late ? */
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_ACLS;
 			MNT_IUNLOCK(mp);
 		}
 
 		if ((fs->fs_flags & FS_NFS4ACLS) != 0) {
 			/* XXX: Set too late ? */
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_NFS4ACLS;
 			MNT_IUNLOCK(mp);
 		}
 		/*
 		 * If this is a request from fsck to clean up the filesystem,
 		 * then allow the specified pid to proceed.
 		 */
 		if (fsckpid > 0) {
 			if (ump->um_fsckpid != 0) {
 				vfs_mount_error(mp,
 				    "Active checker already running on %s",
 				    fs->fs_fsmnt);
 				return (EINVAL);
 			}
 			KASSERT(MOUNTEDSOFTDEP(mp) == 0,
 			    ("soft updates enabled on read-only file system"));
 			DROP_GIANT();
 			g_topology_lock();
 			/*
 			 * Request write access.
 			 */
 			error = g_access(ump->um_cp, 0, 1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error) {
 				vfs_mount_error(mp,
 				    "Checker activation failed on %s",
 				    fs->fs_fsmnt);
 				return (error);
 			}
 			ump->um_fsckpid = fsckpid;
 			if (fs->fs_snapinum[0] != 0)
 				ffs_snapshot_mount(mp);
 			fs->fs_mtime = time_second;
 			fs->fs_fmod = 1;
 			fs->fs_clean = 0;
 			(void) ffs_sbupdate(ump, MNT_WAIT, 0);
 		}
 
 		/*
 		 * If this is a snapshot request, take the snapshot.
 		 */
 		if (mp->mnt_flag & MNT_SNAPSHOT)
 			return (ffs_snapshot(mp, fspec));
 	}
 
 	/*
 	 * Not an update, or updating the name: look up the name
 	 * and verify that it refers to a sensible disk device.
 	 */
 	NDINIT(&ndp, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspec, td);
 	if ((error = namei(&ndp)) != 0)
 		return (error);
 	NDFREE(&ndp, NDF_ONLY_PNBUF);
 	devvp = ndp.ni_vp;
 	if (!vn_isdisk(devvp, &error)) {
 		vput(devvp);
 		return (error);
 	}
 
 	/*
 	 * If mount by non-root, then verify that user has necessary
 	 * permissions on the device.
 	 */
 	accmode = VREAD;
 	if ((mp->mnt_flag & MNT_RDONLY) == 0)
 		accmode |= VWRITE;
 	error = VOP_ACCESS(devvp, accmode, td->td_ucred, td);
 	if (error)
 		error = priv_check(td, PRIV_VFS_MOUNT_PERM);
 	if (error) {
 		vput(devvp);
 		return (error);
 	}
 
 	if (mp->mnt_flag & MNT_UPDATE) {
 		/*
 		 * Update only
 		 *
 		 * If it's not the same vnode, or at least the same device
 		 * then it's not correct.
 		 */
 
 		if (devvp->v_rdev != ump->um_devvp->v_rdev)
 			error = EINVAL;	/* needs translation */
 		vput(devvp);
 		if (error)
 			return (error);
 	} else {
 		/*
 		 * New mount
 		 *
 		 * We need the name for the mount point (also used for
 		 * "last mounted on") copied in. If an error occurs,
 		 * the mount point is discarded by the upper level code.
 		 * Note that vfs_mount() populates f_mntonname for us.
 		 */
 		if ((error = ffs_mountfs(devvp, mp, td)) != 0) {
 			vrele(devvp);
 			return (error);
 		}
 		if (fsckpid > 0) {
 			KASSERT(MOUNTEDSOFTDEP(mp) == 0,
 			    ("soft updates enabled on read-only file system"));
 			ump = VFSTOUFS(mp);
 			fs = ump->um_fs;
 			DROP_GIANT();
 			g_topology_lock();
 			/*
 			 * Request write access.
 			 */
 			error = g_access(ump->um_cp, 0, 1, 0);
 			g_topology_unlock();
 			PICKUP_GIANT();
 			if (error) {
 				printf("WARNING: %s: Checker activation "
 				    "failed\n", fs->fs_fsmnt);
 			} else { 
 				ump->um_fsckpid = fsckpid;
 				if (fs->fs_snapinum[0] != 0)
 					ffs_snapshot_mount(mp);
 				fs->fs_mtime = time_second;
 				fs->fs_clean = 0;
 				(void) ffs_sbupdate(ump, MNT_WAIT, 0);
 			}
 		}
 	}
 	vfs_mountedfrom(mp, fspec);
 	return (0);
 }
 
 /*
  * Compatibility with old mount system call.
  */
 
 static int
 ffs_cmount(struct mntarg *ma, void *data, uint64_t flags)
 {
 	struct ufs_args args;
 	struct export_args exp;
 	int error;
 
 	if (data == NULL)
 		return (EINVAL);
 	error = copyin(data, &args, sizeof args);
 	if (error)
 		return (error);
 	vfs_oexport_conv(&args.export, &exp);
 
 	ma = mount_argsu(ma, "from", args.fspec, MAXPATHLEN);
 	ma = mount_arg(ma, "export", &exp, sizeof(exp));
 	error = kernel_mount(ma, flags);
 
 	return (error);
 }
 
 /*
  * Reload all incore data for a filesystem (used after running fsck on
  * the root filesystem and finding things to fix). If the 'force' flag
  * is 0, the filesystem must be mounted read-only.
  *
  * Things to do to update the mount:
  *	1) invalidate all cached meta-data.
  *	2) re-read superblock from disk.
  *	3) re-read summary information from disk.
  *	4) invalidate all inactive vnodes.
  *	5) invalidate all cached file data.
  *	6) re-read inode data for all active vnodes.
  */
 int
 ffs_reload(struct mount *mp, struct thread *td, int force)
 {
 	struct vnode *vp, *mvp, *devvp;
 	struct inode *ip;
 	void *space;
 	struct buf *bp;
 	struct fs *fs, *newfs;
 	struct ufsmount *ump;
 	ufs2_daddr_t sblockloc;
 	int i, blks, size, error;
 	int32_t *lp;
 
 	ump = VFSTOUFS(mp);
 
 	MNT_ILOCK(mp);
 	if ((mp->mnt_flag & MNT_RDONLY) == 0 && force == 0) {
 		MNT_IUNLOCK(mp);
 		return (EINVAL);
 	}
 	MNT_IUNLOCK(mp);
 	
 	/*
 	 * Step 1: invalidate all cached meta-data.
 	 */
 	devvp = VFSTOUFS(mp)->um_devvp;
 	vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 	if (vinvalbuf(devvp, 0, 0, 0) != 0)
 		panic("ffs_reload: dirty1");
 	VOP_UNLOCK(devvp, 0);
 
 	/*
 	 * Step 2: re-read superblock from disk.
 	 */
 	fs = VFSTOUFS(mp)->um_fs;
 	if ((error = bread(devvp, btodb(fs->fs_sblockloc), fs->fs_sbsize,
 	    NOCRED, &bp)) != 0)
 		return (error);
 	newfs = (struct fs *)bp->b_data;
 	if ((newfs->fs_magic != FS_UFS1_MAGIC &&
 	     newfs->fs_magic != FS_UFS2_MAGIC) ||
 	    newfs->fs_bsize > MAXBSIZE ||
 	    newfs->fs_bsize < sizeof(struct fs)) {
 			brelse(bp);
 			return (EIO);		/* XXX needs translation */
 	}
 	/*
 	 * Copy pointer fields back into superblock before copying in	XXX
 	 * new superblock. These should really be in the ufsmount.	XXX
 	 * Note that important parameters (eg fs_ncg) are unchanged.
 	 */
 	newfs->fs_csp = fs->fs_csp;
 	newfs->fs_maxcluster = fs->fs_maxcluster;
 	newfs->fs_contigdirs = fs->fs_contigdirs;
 	newfs->fs_active = fs->fs_active;
 	newfs->fs_ronly = fs->fs_ronly;
 	sblockloc = fs->fs_sblockloc;
 	bcopy(newfs, fs, (u_int)fs->fs_sbsize);
 	brelse(bp);
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	ffs_oldfscompat_read(fs, VFSTOUFS(mp), sblockloc);
 	UFS_LOCK(ump);
 	if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) {
 		printf("WARNING: %s: reload pending error: blocks %jd "
 		    "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks,
 		    fs->fs_pendinginodes);
 		fs->fs_pendingblocks = 0;
 		fs->fs_pendinginodes = 0;
 	}
 	UFS_UNLOCK(ump);
 
 	/*
 	 * Step 3: re-read summary information from disk.
 	 */
 	size = fs->fs_cssize;
 	blks = howmany(size, fs->fs_fsize);
 	if (fs->fs_contigsumsize > 0)
 		size += fs->fs_ncg * sizeof(int32_t);
 	size += fs->fs_ncg * sizeof(u_int8_t);
 	free(fs->fs_csp, M_UFSMNT);
 	space = malloc((u_long)size, M_UFSMNT, M_WAITOK);
 	fs->fs_csp = space;
 	for (i = 0; i < blks; i += fs->fs_frag) {
 		size = fs->fs_bsize;
 		if (i + fs->fs_frag > blks)
 			size = (blks - i) * fs->fs_fsize;
 		error = bread(devvp, fsbtodb(fs, fs->fs_csaddr + i), size,
 		    NOCRED, &bp);
 		if (error)
 			return (error);
 		bcopy(bp->b_data, space, (u_int)size);
 		space = (char *)space + size;
 		brelse(bp);
 	}
 	/*
 	 * We no longer know anything about clusters per cylinder group.
 	 */
 	if (fs->fs_contigsumsize > 0) {
 		fs->fs_maxcluster = lp = space;
 		for (i = 0; i < fs->fs_ncg; i++)
 			*lp++ = fs->fs_contigsumsize;
 		space = lp;
 	}
 	size = fs->fs_ncg * sizeof(u_int8_t);
 	fs->fs_contigdirs = (u_int8_t *)space;
 	bzero(fs->fs_contigdirs, size);
 
 loop:
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		/*
 		 * Skip syncer vnode.
 		 */
 		if (vp->v_type == VNON) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		/*
 		 * Step 4: invalidate all cached file data.
 		 */
 		if (vget(vp, LK_EXCLUSIVE | LK_INTERLOCK, td)) {
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			goto loop;
 		}
 		if (vinvalbuf(vp, 0, 0, 0))
 			panic("ffs_reload: dirty2");
 		/*
 		 * Step 5: re-read inode data for all active vnodes.
 		 */
 		ip = VTOI(vp);
 		error =
 		    bread(devvp, fsbtodb(fs, ino_to_fsba(fs, ip->i_number)),
 		    (int)fs->fs_bsize, NOCRED, &bp);
 		if (error) {
 			VOP_UNLOCK(vp, 0);
 			vrele(vp);
 			MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 			return (error);
 		}
 		ffs_load_inode(bp, ip, fs, ip->i_number);
 		ip->i_effnlink = ip->i_nlink;
 		brelse(bp);
 		VOP_UNLOCK(vp, 0);
 		vrele(vp);
 	}
 	return (0);
 }
 
 /*
  * Possible superblock locations ordered from most to least likely.
  */
 static int sblock_try[] = SBLOCKSEARCH;
 
 /*
  * Common code for mount and mountroot
  */
 static int
 ffs_mountfs(devvp, mp, td)
 	struct vnode *devvp;
 	struct mount *mp;
 	struct thread *td;
 {
 	struct ufsmount *ump;
 	struct buf *bp;
 	struct fs *fs;
 	struct cdev *dev;
 	void *space;
 	ufs2_daddr_t sblockloc;
 	int error, i, blks, size, ronly;
 	int32_t *lp;
 	struct ucred *cred;
 	struct g_consumer *cp;
 	struct mount *nmp;
 
 	bp = NULL;
 	ump = NULL;
 	cred = td ? td->td_ucred : NOCRED;
 	ronly = (mp->mnt_flag & MNT_RDONLY) != 0;
 
 	dev = devvp->v_rdev;
 	dev_ref(dev);
 	DROP_GIANT();
 	g_topology_lock();
 	error = g_vfs_open(devvp, &cp, "ffs", ronly ? 0 : 1);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	VOP_UNLOCK(devvp, 0);
 	if (error)
 		goto out;
 	if (devvp->v_rdev->si_iosize_max != 0)
 		mp->mnt_iosize_max = devvp->v_rdev->si_iosize_max;
 	if (mp->mnt_iosize_max > MAXPHYS)
 		mp->mnt_iosize_max = MAXPHYS;
 
 	devvp->v_bufobj.bo_ops = &ffs_ops;
 
 	fs = NULL;
 	sblockloc = 0;
 	/*
 	 * Try reading the superblock in each of its possible locations.
 	 */
 	for (i = 0; sblock_try[i] != -1; i++) {
 		if ((SBLOCKSIZE % cp->provider->sectorsize) != 0) {
 			error = EINVAL;
 			vfs_mount_error(mp,
 			    "Invalid sectorsize %d for superblock size %d",
 			    cp->provider->sectorsize, SBLOCKSIZE);
 			goto out;
 		}
 		if ((error = bread(devvp, btodb(sblock_try[i]), SBLOCKSIZE,
 		    cred, &bp)) != 0)
 			goto out;
 		fs = (struct fs *)bp->b_data;
 		sblockloc = sblock_try[i];
 		if ((fs->fs_magic == FS_UFS1_MAGIC ||
 		     (fs->fs_magic == FS_UFS2_MAGIC &&
 		      (fs->fs_sblockloc == sblockloc ||
 		       (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0))) &&
 		    fs->fs_bsize <= MAXBSIZE &&
 		    fs->fs_bsize >= sizeof(struct fs))
 			break;
 		brelse(bp);
 		bp = NULL;
 	}
 	if (sblock_try[i] == -1) {
 		error = EINVAL;		/* XXX needs translation */
 		goto out;
 	}
 	fs->fs_fmod = 0;
 	fs->fs_flags &= ~FS_INDEXDIRS;	/* no support for directory indices */
 	fs->fs_flags &= ~FS_UNCLEAN;
 	if (fs->fs_clean == 0) {
 		fs->fs_flags |= FS_UNCLEAN;
 		if (ronly || (mp->mnt_flag & MNT_FORCE) ||
 		    ((fs->fs_flags & (FS_SUJ | FS_NEEDSFSCK)) == 0 &&
 		     (fs->fs_flags & FS_DOSOFTDEP))) {
 			printf("WARNING: %s was not properly dismounted\n",
 			    fs->fs_fsmnt);
 		} else {
 			vfs_mount_error(mp, "R/W mount of %s denied. %s%s",
 			    fs->fs_fsmnt, "Filesystem is not clean - run fsck.",
 			    (fs->fs_flags & FS_SUJ) == 0 ? "" :
 			    " Forced mount will invalidate journal contents");
 			error = EPERM;
 			goto out;
 		}
 		if ((fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) &&
 		    (mp->mnt_flag & MNT_FORCE)) {
 			printf("WARNING: %s: lost blocks %jd files %d\n",
 			    fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks,
 			    fs->fs_pendinginodes);
 			fs->fs_pendingblocks = 0;
 			fs->fs_pendinginodes = 0;
 		}
 	}
 	if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) {
 		printf("WARNING: %s: mount pending error: blocks %jd "
 		    "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks,
 		    fs->fs_pendinginodes);
 		fs->fs_pendingblocks = 0;
 		fs->fs_pendinginodes = 0;
 	}
 	if ((fs->fs_flags & FS_GJOURNAL) != 0) {
 #ifdef UFS_GJOURNAL
 		/*
 		 * Get journal provider name.
 		 */
 		size = 1024;
 		mp->mnt_gjprovider = malloc(size, M_UFSMNT, M_WAITOK);
 		if (g_io_getattr("GJOURNAL::provider", cp, &size,
 		    mp->mnt_gjprovider) == 0) {
 			mp->mnt_gjprovider = realloc(mp->mnt_gjprovider, size,
 			    M_UFSMNT, M_WAITOK);
 			MNT_ILOCK(mp);
 			mp->mnt_flag |= MNT_GJOURNAL;
 			MNT_IUNLOCK(mp);
 		} else {
 			printf("WARNING: %s: GJOURNAL flag on fs "
 			    "but no gjournal provider below\n",
 			    mp->mnt_stat.f_mntonname);
 			free(mp->mnt_gjprovider, M_UFSMNT);
 			mp->mnt_gjprovider = NULL;
 		}
 #else
 		printf("WARNING: %s: GJOURNAL flag on fs but no "
 		    "UFS_GJOURNAL support\n", mp->mnt_stat.f_mntonname);
 #endif
 	} else {
 		mp->mnt_gjprovider = NULL;
 	}
 	ump = malloc(sizeof *ump, M_UFSMNT, M_WAITOK | M_ZERO);
 	ump->um_cp = cp;
 	ump->um_bo = &devvp->v_bufobj;
 	ump->um_fs = malloc((u_long)fs->fs_sbsize, M_UFSMNT, M_WAITOK);
 	if (fs->fs_magic == FS_UFS1_MAGIC) {
 		ump->um_fstype = UFS1;
 		ump->um_balloc = ffs_balloc_ufs1;
 	} else {
 		ump->um_fstype = UFS2;
 		ump->um_balloc = ffs_balloc_ufs2;
 	}
 	ump->um_blkatoff = ffs_blkatoff;
 	ump->um_truncate = ffs_truncate;
 	ump->um_update = ffs_update;
 	ump->um_valloc = ffs_valloc;
 	ump->um_vfree = ffs_vfree;
 	ump->um_ifree = ffs_ifree;
 	ump->um_rdonly = ffs_rdonly;
 	ump->um_snapgone = ffs_snapgone;
 	mtx_init(UFS_MTX(ump), "FFS", "FFS Lock", MTX_DEF);
 	bcopy(bp->b_data, ump->um_fs, (u_int)fs->fs_sbsize);
 	if (fs->fs_sbsize < SBLOCKSIZE)
 		bp->b_flags |= B_INVAL | B_NOCACHE;
 	brelse(bp);
 	bp = NULL;
 	fs = ump->um_fs;
 	ffs_oldfscompat_read(fs, ump, sblockloc);
 	fs->fs_ronly = ronly;
 	size = fs->fs_cssize;
 	blks = howmany(size, fs->fs_fsize);
 	if (fs->fs_contigsumsize > 0)
 		size += fs->fs_ncg * sizeof(int32_t);
 	size += fs->fs_ncg * sizeof(u_int8_t);
 	space = malloc((u_long)size, M_UFSMNT, M_WAITOK);
 	fs->fs_csp = space;
 	for (i = 0; i < blks; i += fs->fs_frag) {
 		size = fs->fs_bsize;
 		if (i + fs->fs_frag > blks)
 			size = (blks - i) * fs->fs_fsize;
 		if ((error = bread(devvp, fsbtodb(fs, fs->fs_csaddr + i), size,
 		    cred, &bp)) != 0) {
 			free(fs->fs_csp, M_UFSMNT);
 			goto out;
 		}
 		bcopy(bp->b_data, space, (u_int)size);
 		space = (char *)space + size;
 		brelse(bp);
 		bp = NULL;
 	}
 	if (fs->fs_contigsumsize > 0) {
 		fs->fs_maxcluster = lp = space;
 		for (i = 0; i < fs->fs_ncg; i++)
 			*lp++ = fs->fs_contigsumsize;
 		space = lp;
 	}
 	size = fs->fs_ncg * sizeof(u_int8_t);
 	fs->fs_contigdirs = (u_int8_t *)space;
 	bzero(fs->fs_contigdirs, size);
 	fs->fs_active = NULL;
 	mp->mnt_data = ump;
 	mp->mnt_stat.f_fsid.val[0] = fs->fs_id[0];
 	mp->mnt_stat.f_fsid.val[1] = fs->fs_id[1];
 	nmp = NULL;
 	if (fs->fs_id[0] == 0 || fs->fs_id[1] == 0 ||
 	    (nmp = vfs_getvfs(&mp->mnt_stat.f_fsid))) {
 		if (nmp)
 			vfs_rel(nmp);
 		vfs_getnewfsid(mp);
 	}
 	mp->mnt_maxsymlinklen = fs->fs_maxsymlinklen;
 	MNT_ILOCK(mp);
 	mp->mnt_flag |= MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 	if ((fs->fs_flags & FS_MULTILABEL) != 0) {
 #ifdef MAC
 		MNT_ILOCK(mp);
 		mp->mnt_flag |= MNT_MULTILABEL;
 		MNT_IUNLOCK(mp);
 #else
 		printf("WARNING: %s: multilabel flag on fs but "
 		    "no MAC support\n", mp->mnt_stat.f_mntonname);
 #endif
 	}
 	if ((fs->fs_flags & FS_ACLS) != 0) {
 #ifdef UFS_ACL
 		MNT_ILOCK(mp);
 
 		if (mp->mnt_flag & MNT_NFS4ACLS)
 			printf("WARNING: %s: ACLs flag on fs conflicts with "
 			    "\"nfsv4acls\" mount option; option ignored\n",
 			    mp->mnt_stat.f_mntonname);
 		mp->mnt_flag &= ~MNT_NFS4ACLS;
 		mp->mnt_flag |= MNT_ACLS;
 
 		MNT_IUNLOCK(mp);
 #else
 		printf("WARNING: %s: ACLs flag on fs but no ACLs support\n",
 		    mp->mnt_stat.f_mntonname);
 #endif
 	}
 	if ((fs->fs_flags & FS_NFS4ACLS) != 0) {
 #ifdef UFS_ACL
 		MNT_ILOCK(mp);
 
 		if (mp->mnt_flag & MNT_ACLS)
 			printf("WARNING: %s: NFSv4 ACLs flag on fs conflicts "
 			    "with \"acls\" mount option; option ignored\n",
 			    mp->mnt_stat.f_mntonname);
 		mp->mnt_flag &= ~MNT_ACLS;
 		mp->mnt_flag |= MNT_NFS4ACLS;
 
 		MNT_IUNLOCK(mp);
 #else
 		printf("WARNING: %s: NFSv4 ACLs flag on fs but no "
 		    "ACLs support\n", mp->mnt_stat.f_mntonname);
 #endif
 	}
 	if ((fs->fs_flags & FS_TRIM) != 0) {
 		size = sizeof(int);
 		if (g_io_getattr("GEOM::candelete", cp, &size,
 		    &ump->um_candelete) == 0) {
 			if (!ump->um_candelete)
 				printf("WARNING: %s: TRIM flag on fs but disk "
 				    "does not support TRIM\n",
 				    mp->mnt_stat.f_mntonname);
 		} else {
 			printf("WARNING: %s: TRIM flag on fs but disk does "
 			    "not confirm that it supports TRIM\n",
 			    mp->mnt_stat.f_mntonname);
 			ump->um_candelete = 0;
 		}
 	}
 
 	ump->um_mountp = mp;
 	ump->um_dev = dev;
 	ump->um_devvp = devvp;
 	ump->um_nindir = fs->fs_nindir;
 	ump->um_bptrtodb = fs->fs_fsbtodb;
 	ump->um_seqinc = fs->fs_frag;
 	for (i = 0; i < MAXQUOTAS; i++)
 		ump->um_quotas[i] = NULLVP;
 #ifdef UFS_EXTATTR
 	ufs_extattr_uepm_init(&ump->um_extattr);
 #endif
 	/*
 	 * Set FS local "last mounted on" information (NULL pad)
 	 */
 	bzero(fs->fs_fsmnt, MAXMNTLEN);
 	strlcpy(fs->fs_fsmnt, mp->mnt_stat.f_mntonname, MAXMNTLEN);
 	mp->mnt_stat.f_iosize = fs->fs_bsize;
 
 	if (mp->mnt_flag & MNT_ROOTFS) {
 		/*
 		 * Root mount; update timestamp in mount structure.
 		 * This will be used by the common root mount code
 		 * to update the system clock.
 		 */
 		mp->mnt_time = fs->fs_time;
 	}
 
 	if (ronly == 0) {
 		fs->fs_mtime = time_second;
 		if ((fs->fs_flags & FS_DOSOFTDEP) &&
 		    (error = softdep_mount(devvp, mp, fs, cred)) != 0) {
 			free(fs->fs_csp, M_UFSMNT);
 			ffs_flushfiles(mp, FORCECLOSE, td);
 			goto out;
 		}
 		if (devvp->v_type == VCHR && devvp->v_rdev != NULL)
 			devvp->v_rdev->si_mountpt = mp;
 		if (fs->fs_snapinum[0] != 0)
 			ffs_snapshot_mount(mp);
 		fs->fs_fmod = 1;
 		fs->fs_clean = 0;
 		(void) ffs_sbupdate(ump, MNT_WAIT, 0);
 	}
 	/*
 	 * Initialize filesystem stat information in mount struct.
 	 */
 	MNT_ILOCK(mp);
 	mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_EXTENDED_SHARED |
-	    MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS | MNTK_SUSPENDABLE;
+	    MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS | MNTK_SUSPENDABLE |
+	    MNTK_USES_BCACHE;
 	MNT_IUNLOCK(mp);
 #ifdef UFS_EXTATTR
 #ifdef UFS_EXTATTR_AUTOSTART
 	/*
 	 *
 	 * Auto-starting does the following:
 	 *	- check for /.attribute in the fs, and extattr_start if so
 	 *	- for each file in .attribute, enable that file with
 	 * 	  an attribute of the same name.
 	 * Not clear how to report errors -- probably eat them.
 	 * This would all happen while the filesystem was busy/not
 	 * available, so would effectively be "atomic".
 	 */
 	(void) ufs_extattr_autostart(mp, td);
 #endif /* UFS_EXTATTR_AUTOSTART */
 #endif /* UFS_EXTATTR */
 	return (0);
 out:
 	if (bp)
 		brelse(bp);
 	if (cp != NULL) {
 		DROP_GIANT();
 		g_topology_lock();
 		g_vfs_close(cp);
 		g_topology_unlock();
 		PICKUP_GIANT();
 	}
 	if (ump) {
 		mtx_destroy(UFS_MTX(ump));
 		if (mp->mnt_gjprovider != NULL) {
 			free(mp->mnt_gjprovider, M_UFSMNT);
 			mp->mnt_gjprovider = NULL;
 		}
 		free(ump->um_fs, M_UFSMNT);
 		free(ump, M_UFSMNT);
 		mp->mnt_data = NULL;
 	}
 	dev_rel(dev);
 	return (error);
 }
 
 #include <sys/sysctl.h>
 static int bigcgs = 0;
 SYSCTL_INT(_debug, OID_AUTO, bigcgs, CTLFLAG_RW, &bigcgs, 0, "");
 
 /*
  * Sanity checks for loading old filesystem superblocks.
  * See ffs_oldfscompat_write below for unwound actions.
  *
  * XXX - Parts get retired eventually.
  * Unfortunately new bits get added.
  */
 static void
 ffs_oldfscompat_read(fs, ump, sblockloc)
 	struct fs *fs;
 	struct ufsmount *ump;
 	ufs2_daddr_t sblockloc;
 {
 	off_t maxfilesize;
 
 	/*
 	 * If not yet done, update fs_flags location and value of fs_sblockloc.
 	 */
 	if ((fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) {
 		fs->fs_flags = fs->fs_old_flags;
 		fs->fs_old_flags |= FS_FLAGS_UPDATED;
 		fs->fs_sblockloc = sblockloc;
 	}
 	/*
 	 * If not yet done, update UFS1 superblock with new wider fields.
 	 */
 	if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_maxbsize != fs->fs_bsize) {
 		fs->fs_maxbsize = fs->fs_bsize;
 		fs->fs_time = fs->fs_old_time;
 		fs->fs_size = fs->fs_old_size;
 		fs->fs_dsize = fs->fs_old_dsize;
 		fs->fs_csaddr = fs->fs_old_csaddr;
 		fs->fs_cstotal.cs_ndir = fs->fs_old_cstotal.cs_ndir;
 		fs->fs_cstotal.cs_nbfree = fs->fs_old_cstotal.cs_nbfree;
 		fs->fs_cstotal.cs_nifree = fs->fs_old_cstotal.cs_nifree;
 		fs->fs_cstotal.cs_nffree = fs->fs_old_cstotal.cs_nffree;
 	}
 	if (fs->fs_magic == FS_UFS1_MAGIC &&
 	    fs->fs_old_inodefmt < FS_44INODEFMT) {
 		fs->fs_maxfilesize = ((uint64_t)1 << 31) - 1;
 		fs->fs_qbmask = ~fs->fs_bmask;
 		fs->fs_qfmask = ~fs->fs_fmask;
 	}
 	if (fs->fs_magic == FS_UFS1_MAGIC) {
 		ump->um_savedmaxfilesize = fs->fs_maxfilesize;
 		maxfilesize = (uint64_t)0x80000000 * fs->fs_bsize - 1;
 		if (fs->fs_maxfilesize > maxfilesize)
 			fs->fs_maxfilesize = maxfilesize;
 	}
 	/* Compatibility for old filesystems */
 	if (fs->fs_avgfilesize <= 0)
 		fs->fs_avgfilesize = AVFILESIZ;
 	if (fs->fs_avgfpdir <= 0)
 		fs->fs_avgfpdir = AFPDIR;
 	if (bigcgs) {
 		fs->fs_save_cgsize = fs->fs_cgsize;
 		fs->fs_cgsize = fs->fs_bsize;
 	}
 }
 
 /*
  * Unwinding superblock updates for old filesystems.
  * See ffs_oldfscompat_read above for details.
  *
  * XXX - Parts get retired eventually.
  * Unfortunately new bits get added.
  */
 void
 ffs_oldfscompat_write(fs, ump)
 	struct fs *fs;
 	struct ufsmount *ump;
 {
 
 	/*
 	 * Copy back UFS2 updated fields that UFS1 inspects.
 	 */
 	if (fs->fs_magic == FS_UFS1_MAGIC) {
 		fs->fs_old_time = fs->fs_time;
 		fs->fs_old_cstotal.cs_ndir = fs->fs_cstotal.cs_ndir;
 		fs->fs_old_cstotal.cs_nbfree = fs->fs_cstotal.cs_nbfree;
 		fs->fs_old_cstotal.cs_nifree = fs->fs_cstotal.cs_nifree;
 		fs->fs_old_cstotal.cs_nffree = fs->fs_cstotal.cs_nffree;
 		fs->fs_maxfilesize = ump->um_savedmaxfilesize;
 	}
 	if (bigcgs) {
 		fs->fs_cgsize = fs->fs_save_cgsize;
 		fs->fs_save_cgsize = 0;
 	}
 }
 
 /*
  * unmount system call
  */
 static int
 ffs_unmount(mp, mntflags)
 	struct mount *mp;
 	int mntflags;
 {
 	struct thread *td;
 	struct ufsmount *ump = VFSTOUFS(mp);
 	struct fs *fs;
 	int error, flags, susp;
 #ifdef UFS_EXTATTR
 	int e_restart;
 #endif
 
 	flags = 0;
 	td = curthread;
 	fs = ump->um_fs;
 	susp = 0;
 	if (mntflags & MNT_FORCE) {
 		flags |= FORCECLOSE;
 		susp = fs->fs_ronly == 0;
 	}
 #ifdef UFS_EXTATTR
 	if ((error = ufs_extattr_stop(mp, td))) {
 		if (error != EOPNOTSUPP)
 			printf("WARNING: unmount %s: ufs_extattr_stop "
 			    "returned errno %d\n", mp->mnt_stat.f_mntonname,
 			    error);
 		e_restart = 0;
 	} else {
 		ufs_extattr_uepm_destroy(&ump->um_extattr);
 		e_restart = 1;
 	}
 #endif
 	if (susp) {
 		error = vfs_write_suspend_umnt(mp);
 		if (error != 0)
 			goto fail1;
 	}
 	if (MOUNTEDSOFTDEP(mp))
 		error = softdep_flushfiles(mp, flags, td);
 	else
 		error = ffs_flushfiles(mp, flags, td);
 	if (error != 0 && error != ENXIO)
 		goto fail;
 
 	UFS_LOCK(ump);
 	if (fs->fs_pendingblocks != 0 || fs->fs_pendinginodes != 0) {
 		printf("WARNING: unmount %s: pending error: blocks %jd "
 		    "files %d\n", fs->fs_fsmnt, (intmax_t)fs->fs_pendingblocks,
 		    fs->fs_pendinginodes);
 		fs->fs_pendingblocks = 0;
 		fs->fs_pendinginodes = 0;
 	}
 	UFS_UNLOCK(ump);
 	if (MOUNTEDSOFTDEP(mp))
 		softdep_unmount(mp);
 	if (fs->fs_ronly == 0 || ump->um_fsckpid > 0) {
 		fs->fs_clean = fs->fs_flags & (FS_UNCLEAN|FS_NEEDSFSCK) ? 0 : 1;
 		error = ffs_sbupdate(ump, MNT_WAIT, 0);
 		if (error && error != ENXIO) {
 			fs->fs_clean = 0;
 			goto fail;
 		}
 	}
 	if (susp)
 		vfs_write_resume(mp, VR_START_WRITE);
 	DROP_GIANT();
 	g_topology_lock();
 	if (ump->um_fsckpid > 0) {
 		/*
 		 * Return to normal read-only mode.
 		 */
 		error = g_access(ump->um_cp, 0, -1, 0);
 		ump->um_fsckpid = 0;
 	}
 	g_vfs_close(ump->um_cp);
 	g_topology_unlock();
 	PICKUP_GIANT();
 	if (ump->um_devvp->v_type == VCHR && ump->um_devvp->v_rdev != NULL)
 		ump->um_devvp->v_rdev->si_mountpt = NULL;
 	vrele(ump->um_devvp);
 	dev_rel(ump->um_dev);
 	mtx_destroy(UFS_MTX(ump));
 	if (mp->mnt_gjprovider != NULL) {
 		free(mp->mnt_gjprovider, M_UFSMNT);
 		mp->mnt_gjprovider = NULL;
 	}
 	free(fs->fs_csp, M_UFSMNT);
 	free(fs, M_UFSMNT);
 	free(ump, M_UFSMNT);
 	mp->mnt_data = NULL;
 	MNT_ILOCK(mp);
 	mp->mnt_flag &= ~MNT_LOCAL;
 	MNT_IUNLOCK(mp);
 	return (error);
 
 fail:
 	if (susp)
 		vfs_write_resume(mp, VR_START_WRITE);
 fail1:
 #ifdef UFS_EXTATTR
 	if (e_restart) {
 		ufs_extattr_uepm_init(&ump->um_extattr);
 #ifdef UFS_EXTATTR_AUTOSTART
 		(void) ufs_extattr_autostart(mp, td);
 #endif
 	}
 #endif
 
 	return (error);
 }
 
 /*
  * Flush out all the files in a filesystem.
  */
 int
 ffs_flushfiles(mp, flags, td)
 	struct mount *mp;
 	int flags;
 	struct thread *td;
 {
 	struct ufsmount *ump;
 	int qerror, error;
 
 	ump = VFSTOUFS(mp);
 	qerror = 0;
 #ifdef QUOTA
 	if (mp->mnt_flag & MNT_QUOTA) {
 		int i;
 		error = vflush(mp, 0, SKIPSYSTEM|flags, td);
 		if (error)
 			return (error);
 		for (i = 0; i < MAXQUOTAS; i++) {
 			error = quotaoff(td, mp, i);
 			if (error != 0) {
 				if ((flags & EARLYFLUSH) == 0)
 					return (error);
 				else
 					qerror = error;
 			}
 		}
 
 		/*
 		 * Here we fall through to vflush again to ensure that
 		 * we have gotten rid of all the system vnodes, unless
 		 * quotas must not be closed.
 		 */
 	}
 #endif
 	ASSERT_VOP_LOCKED(ump->um_devvp, "ffs_flushfiles");
 	if (ump->um_devvp->v_vflag & VV_COPYONWRITE) {
 		if ((error = vflush(mp, 0, SKIPSYSTEM | flags, td)) != 0)
 			return (error);
 		ffs_snapshot_unmount(mp);
 		flags |= FORCECLOSE;
 		/*
 		 * Here we fall through to vflush again to ensure
 		 * that we have gotten rid of all the system vnodes.
 		 */
 	}
 
 	/*
 	 * Do not close system files if quotas were not closed, to be
 	 * able to sync the remaining dquots.  The freeblks softupdate
 	 * workitems might hold a reference on a dquot, preventing
 	 * quotaoff() from completing.  Next round of
 	 * softdep_flushworklist() iteration should process the
 	 * blockers, allowing the next run of quotaoff() to finally
 	 * flush held dquots.
 	 *
 	 * Otherwise, flush all the files.
 	 */
 	if (qerror == 0 && (error = vflush(mp, 0, flags, td)) != 0)
 		return (error);
 
 	/*
 	 * Flush filesystem metadata.
 	 */
 	vn_lock(ump->um_devvp, LK_EXCLUSIVE | LK_RETRY);
 	error = VOP_FSYNC(ump->um_devvp, MNT_WAIT, td);
 	VOP_UNLOCK(ump->um_devvp, 0);
 	return (error);
 }
 
 /*
  * Get filesystem statistics.
  */
 static int
 ffs_statfs(mp, sbp)
 	struct mount *mp;
 	struct statfs *sbp;
 {
 	struct ufsmount *ump;
 	struct fs *fs;
 
 	ump = VFSTOUFS(mp);
 	fs = ump->um_fs;
 	if (fs->fs_magic != FS_UFS1_MAGIC && fs->fs_magic != FS_UFS2_MAGIC)
 		panic("ffs_statfs");
 	sbp->f_version = STATFS_VERSION;
 	sbp->f_bsize = fs->fs_fsize;
 	sbp->f_iosize = fs->fs_bsize;
 	sbp->f_blocks = fs->fs_dsize;
 	UFS_LOCK(ump);
 	sbp->f_bfree = fs->fs_cstotal.cs_nbfree * fs->fs_frag +
 	    fs->fs_cstotal.cs_nffree + dbtofsb(fs, fs->fs_pendingblocks);
 	sbp->f_bavail = freespace(fs, fs->fs_minfree) +
 	    dbtofsb(fs, fs->fs_pendingblocks);
 	sbp->f_files = fs->fs_ncg * fs->fs_ipg - ROOTINO;
 	sbp->f_ffree = fs->fs_cstotal.cs_nifree + fs->fs_pendinginodes;
 	UFS_UNLOCK(ump);
 	sbp->f_namemax = NAME_MAX;
 	return (0);
 }
 
 /*
  * For a lazy sync, we only care about access times, quotas and the
  * superblock.  Other filesystem changes are already converted to
  * cylinder group block or inode block updates and are written to
  * disk by the syncer.
  */
 static int
 ffs_sync_lazy(mp)
 	struct mount *mp;
 {
 	struct vnode *mvp, *vp;
 	struct inode *ip;
 	struct thread *td;
 	int allerror, error;
 
 	allerror = 0;
 	td = curthread;
 	if ((mp->mnt_flag & MNT_NOATIME) != 0)
 		goto qupdate;
 	MNT_VNODE_FOREACH_ACTIVE(vp, mp, mvp) {
 		if (vp->v_type == VNON) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		ip = VTOI(vp);
 
 		/*
 		 * The IN_ACCESS flag is converted to IN_MODIFIED by
 		 * ufs_close() and ufs_getattr() by the calls to
 		 * ufs_itimes_locked(), without subsequent UFS_UPDATE().
 		 * Also test all the other timestamp flags, to pick up
 		 * any other cases that could be missed.
 		 */
 		if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_MODIFIED |
 		    IN_UPDATE)) == 0) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		if ((error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
 		    td)) != 0)
 			continue;
 		error = ffs_update(vp, 0);
 		if (error != 0)
 			allerror = error;
 		vput(vp);
 	}
 
 qupdate:
 #ifdef QUOTA
 	qsync(mp);
 #endif
 
 	if (VFSTOUFS(mp)->um_fs->fs_fmod != 0 &&
 	    (error = ffs_sbupdate(VFSTOUFS(mp), MNT_LAZY, 0)) != 0)
 		allerror = error;
 	return (allerror);
 }
 
 /*
  * Go through the disk queues to initiate sandbagged IO;
  * go through the inodes to write those that have been modified;
  * initiate the writing of the super block if it has been modified.
  *
  * Note: we are always called with the filesystem marked busy using
  * vfs_busy().
  */
 static int
 ffs_sync(mp, waitfor)
 	struct mount *mp;
 	int waitfor;
 {
 	struct vnode *mvp, *vp, *devvp;
 	struct thread *td;
 	struct inode *ip;
 	struct ufsmount *ump = VFSTOUFS(mp);
 	struct fs *fs;
 	int error, count, wait, lockreq, allerror = 0;
 	int suspend;
 	int suspended;
 	int secondary_writes;
 	int secondary_accwrites;
 	int softdep_deps;
 	int softdep_accdeps;
 	struct bufobj *bo;
 
 	wait = 0;
 	suspend = 0;
 	suspended = 0;
 	td = curthread;
 	fs = ump->um_fs;
 	if (fs->fs_fmod != 0 && fs->fs_ronly != 0 && ump->um_fsckpid == 0)
 		panic("%s: ffs_sync: modification on read-only filesystem",
 		    fs->fs_fsmnt);
 	if (waitfor == MNT_LAZY) {
 		if (!rebooting)
 			return (ffs_sync_lazy(mp));
 		waitfor = MNT_NOWAIT;
 	}
 
 	/*
 	 * Write back each (modified) inode.
 	 */
 	lockreq = LK_EXCLUSIVE | LK_NOWAIT;
 	if (waitfor == MNT_SUSPEND) {
 		suspend = 1;
 		waitfor = MNT_WAIT;
 	}
 	if (waitfor == MNT_WAIT) {
 		wait = 1;
 		lockreq = LK_EXCLUSIVE;
 	}
 	lockreq |= LK_INTERLOCK | LK_SLEEPFAIL;
 loop:
 	/* Grab snapshot of secondary write counts */
 	MNT_ILOCK(mp);
 	secondary_writes = mp->mnt_secondary_writes;
 	secondary_accwrites = mp->mnt_secondary_accwrites;
 	MNT_IUNLOCK(mp);
 
 	/* Grab snapshot of softdep dependency counts */
 	softdep_get_depcounts(mp, &softdep_deps, &softdep_accdeps);
 
 	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
 		/*
 		 * Depend on the vnode interlock to keep things stable enough
 		 * for a quick test.  Since there might be hundreds of
 		 * thousands of vnodes, we cannot afford even a subroutine
 		 * call unless there's a good chance that we have work to do.
 		 */
 		if (vp->v_type == VNON) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		ip = VTOI(vp);
 		if ((ip->i_flag &
 		    (IN_ACCESS | IN_CHANGE | IN_MODIFIED | IN_UPDATE)) == 0 &&
 		    vp->v_bufobj.bo_dirty.bv_cnt == 0) {
 			VI_UNLOCK(vp);
 			continue;
 		}
 		if ((error = vget(vp, lockreq, td)) != 0) {
 			if (error == ENOENT || error == ENOLCK) {
 				MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
 				goto loop;
 			}
 			continue;
 		}
 		if ((error = ffs_syncvnode(vp, waitfor, 0)) != 0)
 			allerror = error;
 		vput(vp);
 	}
 	/*
 	 * Force stale filesystem control information to be flushed.
 	 */
 	if (waitfor == MNT_WAIT || rebooting) {
 		if ((error = softdep_flushworklist(ump->um_mountp, &count, td)))
 			allerror = error;
 		/* Flushed work items may create new vnodes to clean */
 		if (allerror == 0 && count)
 			goto loop;
 	}
 #ifdef QUOTA
 	qsync(mp);
 #endif
 
 	devvp = ump->um_devvp;
 	bo = &devvp->v_bufobj;
 	BO_LOCK(bo);
 	if (bo->bo_numoutput > 0 || bo->bo_dirty.bv_cnt > 0) {
 		BO_UNLOCK(bo);
 		vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY);
 		error = VOP_FSYNC(devvp, waitfor, td);
 		VOP_UNLOCK(devvp, 0);
 		if (MOUNTEDSOFTDEP(mp) && (error == 0 || error == EAGAIN))
 			error = ffs_sbupdate(ump, waitfor, 0);
 		if (error != 0)
 			allerror = error;
 		if (allerror == 0 && waitfor == MNT_WAIT)
 			goto loop;
 	} else if (suspend != 0) {
 		if (softdep_check_suspend(mp,
 					  devvp,
 					  softdep_deps,
 					  softdep_accdeps,
 					  secondary_writes,
 					  secondary_accwrites) != 0) {
 			MNT_IUNLOCK(mp);
 			goto loop;	/* More work needed */
 		}
 		mtx_assert(MNT_MTX(mp), MA_OWNED);
 		mp->mnt_kern_flag |= MNTK_SUSPEND2 | MNTK_SUSPENDED;
 		MNT_IUNLOCK(mp);
 		suspended = 1;
 	} else
 		BO_UNLOCK(bo);
 	/*
 	 * Write back modified superblock.
 	 */
 	if (fs->fs_fmod != 0 &&
 	    (error = ffs_sbupdate(ump, waitfor, suspended)) != 0)
 		allerror = error;
 	return (allerror);
 }
 
 int
 ffs_vget(mp, ino, flags, vpp)
 	struct mount *mp;
 	ino_t ino;
 	int flags;
 	struct vnode **vpp;
 {
 	return (ffs_vgetf(mp, ino, flags, vpp, 0));
 }
 
 int
 ffs_vgetf(mp, ino, flags, vpp, ffs_flags)
 	struct mount *mp;
 	ino_t ino;
 	int flags;
 	struct vnode **vpp;
 	int ffs_flags;
 {
 	struct fs *fs;
 	struct inode *ip;
 	struct ufsmount *ump;
 	struct buf *bp;
 	struct vnode *vp;
 	struct cdev *dev;
 	int error;
 
 	error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL);
 	if (error || *vpp != NULL)
 		return (error);
 
 	/*
 	 * We must promote to an exclusive lock for vnode creation.  This
 	 * can happen if lookup is passed LOCKSHARED.
 	 */
 	if ((flags & LK_TYPE_MASK) == LK_SHARED) {
 		flags &= ~LK_TYPE_MASK;
 		flags |= LK_EXCLUSIVE;
 	}
 
 	/*
 	 * We do not lock vnode creation as it is believed to be too
 	 * expensive for such a rare case as simultaneous creation of a vnode
 	 * for the same ino by different processes. We just allow them to race
 	 * and check later to decide who wins. Let the race begin!
 	 */
 
 	ump = VFSTOUFS(mp);
 	dev = ump->um_dev;
 	fs = ump->um_fs;
 	ip = uma_zalloc(uma_inode, M_WAITOK | M_ZERO);
 
 	/* Allocate a new vnode/inode. */
 	if (fs->fs_magic == FS_UFS1_MAGIC)
 		error = getnewvnode("ufs", mp, &ffs_vnodeops1, &vp);
 	else
 		error = getnewvnode("ufs", mp, &ffs_vnodeops2, &vp);
 	if (error) {
 		*vpp = NULL;
 		uma_zfree(uma_inode, ip);
 		return (error);
 	}
 	/*
 	 * FFS supports recursive locking.
 	 */
 	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
 	VN_LOCK_AREC(vp);
 	vp->v_data = ip;
 	vp->v_bufobj.bo_bsize = fs->fs_bsize;
 	ip->i_vnode = vp;
 	ip->i_ump = ump;
 	ip->i_fs = fs;
 	ip->i_dev = dev;
 	ip->i_number = ino;
 	ip->i_ea_refs = 0;
 #ifdef QUOTA
 	{
 		int i;
 		for (i = 0; i < MAXQUOTAS; i++)
 			ip->i_dquot[i] = NODQUOT;
 	}
 #endif
 
 	if (ffs_flags & FFSV_FORCEINSMQ)
 		vp->v_vflag |= VV_FORCEINSMQ;
 	error = insmntque(vp, mp);
 	if (error != 0) {
 		uma_zfree(uma_inode, ip);
 		*vpp = NULL;
 		return (error);
 	}
 	vp->v_vflag &= ~VV_FORCEINSMQ;
 	error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL);
 	if (error || *vpp != NULL)
 		return (error);
 
 	/* Read in the disk contents for the inode, copy into the inode. */
 	error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)),
 	    (int)fs->fs_bsize, NOCRED, &bp);
 	if (error) {
 		/*
 		 * The inode does not contain anything useful, so it would
 		 * be misleading to leave it on its hash chain. With mode
 		 * still zero, it will be unlinked and returned to the free
 		 * list by vput().
 		 */
 		brelse(bp);
 		vput(vp);
 		*vpp = NULL;
 		return (error);
 	}
 	if (ip->i_ump->um_fstype == UFS1)
 		ip->i_din1 = uma_zalloc(uma_ufs1, M_WAITOK);
 	else
 		ip->i_din2 = uma_zalloc(uma_ufs2, M_WAITOK);
 	ffs_load_inode(bp, ip, fs, ino);
 	if (DOINGSOFTDEP(vp))
 		softdep_load_inodeblock(ip);
 	else
 		ip->i_effnlink = ip->i_nlink;
 	bqrelse(bp);
 
 	/*
 	 * Initialize the vnode from the inode, check for aliases.
 	 * Note that the underlying vnode may have changed.
 	 */
 	if (ip->i_ump->um_fstype == UFS1)
 		error = ufs_vinit(mp, &ffs_fifoops1, &vp);
 	else
 		error = ufs_vinit(mp, &ffs_fifoops2, &vp);
 	if (error) {
 		vput(vp);
 		*vpp = NULL;
 		return (error);
 	}
 
 	/*
 	 * Finish inode initialization.
 	 */
 	if (vp->v_type != VFIFO) {
 		/* FFS supports shared locking for all files except fifos. */
 		VN_LOCK_ASHARE(vp);
 	}
 
 	/*
 	 * Set up a generation number for this inode if it does not
 	 * already have one. This should only happen on old filesystems.
 	 */
 	if (ip->i_gen == 0) {
 		ip->i_gen = arc4random() / 2 + 1;
 		if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) {
 			ip->i_flag |= IN_MODIFIED;
 			DIP_SET(ip, i_gen, ip->i_gen);
 		}
 	}
 #ifdef MAC
 	if ((mp->mnt_flag & MNT_MULTILABEL) && ip->i_mode) {
 		/*
 		 * If this vnode is already allocated, and we're running
 		 * multi-label, attempt to perform a label association
 		 * from the extended attributes on the inode.
 		 */
 		error = mac_vnode_associate_extattr(mp, vp);
 		if (error) {
 			/* ufs_inactive will release ip->i_devvp ref. */
 			vput(vp);
 			*vpp = NULL;
 			return (error);
 		}
 	}
 #endif
 
 	*vpp = vp;
 	return (0);
 }
 
 /*
  * File handle to vnode
  *
  * Have to be really careful about stale file handles:
  * - check that the inode number is valid
  * - call ffs_vget() to get the locked inode
  * - check for an unallocated inode (i_mode == 0)
  * - check that the given client host has export rights and return
  *   those rights via exflagsp and credanonp
  */
 static int
 ffs_fhtovp(mp, fhp, flags, vpp)
 	struct mount *mp;
 	struct fid *fhp;
 	int flags;
 	struct vnode **vpp;
 {
 	struct ufid *ufhp;
 	struct fs *fs;
 
 	ufhp = (struct ufid *)fhp;
 	fs = VFSTOUFS(mp)->um_fs;
 	if (ufhp->ufid_ino < ROOTINO ||
 	    ufhp->ufid_ino >= fs->fs_ncg * fs->fs_ipg)
 		return (ESTALE);
 	return (ufs_fhtovp(mp, ufhp, flags, vpp));
 }
 
 /*
  * Initialize the filesystem.
  */
 static int
 ffs_init(vfsp)
 	struct vfsconf *vfsp;
 {
 
 	ffs_susp_initialize();
 	softdep_initialize();
 	return (ufs_init(vfsp));
 }
 
 /*
  * Undo the work of ffs_init().
  */
 static int
 ffs_uninit(vfsp)
 	struct vfsconf *vfsp;
 {
 	int ret;
 
 	ret = ufs_uninit(vfsp);
 	softdep_uninitialize();
 	ffs_susp_uninitialize();
 	return (ret);
 }
 
 /*
  * Write a superblock and associated information back to disk.
  */
 int
 ffs_sbupdate(ump, waitfor, suspended)
 	struct ufsmount *ump;
 	int waitfor;
 	int suspended;
 {
 	struct fs *fs = ump->um_fs;
 	struct buf *sbbp;
 	struct buf *bp;
 	int blks;
 	void *space;
 	int i, size, error, allerror = 0;
 
 	if (fs->fs_ronly == 1 &&
 	    (ump->um_mountp->mnt_flag & (MNT_RDONLY | MNT_UPDATE)) !=
 	    (MNT_RDONLY | MNT_UPDATE) && ump->um_fsckpid == 0)
 		panic("ffs_sbupdate: write read-only filesystem");
 	/*
 	 * We use the superblock's buf to serialize calls to ffs_sbupdate().
 	 */
 	sbbp = getblk(ump->um_devvp, btodb(fs->fs_sblockloc),
 	    (int)fs->fs_sbsize, 0, 0, 0);
 	/*
 	 * First write back the summary information.
 	 */
 	blks = howmany(fs->fs_cssize, fs->fs_fsize);
 	space = fs->fs_csp;
 	for (i = 0; i < blks; i += fs->fs_frag) {
 		size = fs->fs_bsize;
 		if (i + fs->fs_frag > blks)
 			size = (blks - i) * fs->fs_fsize;
 		bp = getblk(ump->um_devvp, fsbtodb(fs, fs->fs_csaddr + i),
 		    size, 0, 0, 0);
 		bcopy(space, bp->b_data, (u_int)size);
 		space = (char *)space + size;
 		if (suspended)
 			bp->b_flags |= B_VALIDSUSPWRT;
 		if (waitfor != MNT_WAIT)
 			bawrite(bp);
 		else if ((error = bwrite(bp)) != 0)
 			allerror = error;
 	}
 	/*
 	 * Now write back the superblock itself. If any errors occurred
 	 * up to this point, then fail so that the superblock avoids
 	 * being written out as clean.
 	 */
 	if (allerror) {
 		brelse(sbbp);
 		return (allerror);
 	}
 	bp = sbbp;
 	if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_sblockloc != SBLOCK_UFS1 &&
 	    (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) {
 		printf("WARNING: %s: correcting fs_sblockloc from %jd to %d\n",
 		    fs->fs_fsmnt, fs->fs_sblockloc, SBLOCK_UFS1);
 		fs->fs_sblockloc = SBLOCK_UFS1;
 	}
 	if (fs->fs_magic == FS_UFS2_MAGIC && fs->fs_sblockloc != SBLOCK_UFS2 &&
 	    (fs->fs_old_flags & FS_FLAGS_UPDATED) == 0) {
 		printf("WARNING: %s: correcting fs_sblockloc from %jd to %d\n",
 		    fs->fs_fsmnt, fs->fs_sblockloc, SBLOCK_UFS2);
 		fs->fs_sblockloc = SBLOCK_UFS2;
 	}
 	fs->fs_fmod = 0;
 	fs->fs_time = time_second;
 	if (MOUNTEDSOFTDEP(ump->um_mountp))
 		softdep_setup_sbupdate(ump, (struct fs *)bp->b_data, bp);
 	bcopy((caddr_t)fs, bp->b_data, (u_int)fs->fs_sbsize);
 	ffs_oldfscompat_write((struct fs *)bp->b_data, ump);
 	if (suspended)
 		bp->b_flags |= B_VALIDSUSPWRT;
 	if (waitfor != MNT_WAIT)
 		bawrite(bp);
 	else if ((error = bwrite(bp)) != 0)
 		allerror = error;
 	return (allerror);
 }
 
 static int
 ffs_extattrctl(struct mount *mp, int cmd, struct vnode *filename_vp,
 	int attrnamespace, const char *attrname)
 {
 
 #ifdef UFS_EXTATTR
 	return (ufs_extattrctl(mp, cmd, filename_vp, attrnamespace,
 	    attrname));
 #else
 	return (vfs_stdextattrctl(mp, cmd, filename_vp, attrnamespace,
 	    attrname));
 #endif
 }
 
 static void
 ffs_ifree(struct ufsmount *ump, struct inode *ip)
 {
 
 	if (ump->um_fstype == UFS1 && ip->i_din1 != NULL)
 		uma_zfree(uma_ufs1, ip->i_din1);
 	else if (ip->i_din2 != NULL)
 		uma_zfree(uma_ufs2, ip->i_din2);
 	uma_zfree(uma_inode, ip);
 }
 
 static int dobkgrdwrite = 1;
 SYSCTL_INT(_debug, OID_AUTO, dobkgrdwrite, CTLFLAG_RW, &dobkgrdwrite, 0,
     "Do background writes (honoring the BV_BKGRDWRITE flag)?");
 
 /*
  * Complete a background write started from bwrite.
  */
 static void
 ffs_backgroundwritedone(struct buf *bp)
 {
 	struct bufobj *bufobj;
 	struct buf *origbp;
 
 	/*
 	 * Find the original buffer that we are writing.
 	 */
 	bufobj = bp->b_bufobj;
 	BO_LOCK(bufobj);
 	if ((origbp = gbincore(bp->b_bufobj, bp->b_lblkno)) == NULL)
 		panic("backgroundwritedone: lost buffer");
 	BO_UNLOCK(bufobj);
 	/*
 	 * Process dependencies then return any unfinished ones.
 	 */
 	pbrelvp(bp);
 	if (!LIST_EMPTY(&bp->b_dep))
 		buf_complete(bp);
 #ifdef SOFTUPDATES
 	if (!LIST_EMPTY(&bp->b_dep))
 		softdep_move_dependencies(bp, origbp);
 #endif
 	/*
 	 * This buffer is marked B_NOCACHE so when it is released
 	 * by biodone it will be tossed.
 	 */
 	bp->b_flags |= B_NOCACHE;
 	bp->b_flags &= ~B_CACHE;
 	bufdone(bp);
 	BO_LOCK(bufobj);
 	/*
 	 * Clear the BV_BKGRDINPROG flag in the original buffer
 	 * and awaken it if it is waiting for the write to complete.
 	 * If BV_BKGRDINPROG is not set in the original buffer it must
 	 * have been released and re-instantiated - which is not legal.
 	 */
 	KASSERT((origbp->b_vflags & BV_BKGRDINPROG),
 	    ("backgroundwritedone: lost buffer2"));
 	origbp->b_vflags &= ~BV_BKGRDINPROG;
 	if (origbp->b_vflags & BV_BKGRDWAIT) {
 		origbp->b_vflags &= ~BV_BKGRDWAIT;
 		wakeup(&origbp->b_xflags);
 	}
 	BO_UNLOCK(bufobj);
 }
 
 
 /*
  * Write, release buffer on completion.  (Done by iodone
  * if async).  Do not bother writing anything if the buffer
  * is invalid.
  *
  * Note that we set B_CACHE here, indicating that buffer is
  * fully valid and thus cacheable.  This is true even of NFS
  * now so we set it generally.  This could be set either here
  * or in biodone() since the I/O is synchronous.  We put it
  * here.
  */
 static int
 ffs_bufwrite(struct buf *bp)
 {
 	struct buf *newbp;
 	int oldflags;
 
 	CTR3(KTR_BUF, "bufwrite(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags);
 	if (bp->b_flags & B_INVAL) {
 		brelse(bp);
 		return (0);
 	}
 
 	oldflags = bp->b_flags;
 
 	if (!BUF_ISLOCKED(bp))
 		panic("bufwrite: buffer is not busy???");
 	/*
 	 * If a background write is already in progress, delay
 	 * writing this block if it is asynchronous. Otherwise
 	 * wait for the background write to complete.
 	 */
 	BO_LOCK(bp->b_bufobj);
 	if (bp->b_vflags & BV_BKGRDINPROG) {
 		if (bp->b_flags & B_ASYNC) {
 			BO_UNLOCK(bp->b_bufobj);
 			bdwrite(bp);
 			return (0);
 		}
 		bp->b_vflags |= BV_BKGRDWAIT;
 		msleep(&bp->b_xflags, BO_LOCKPTR(bp->b_bufobj), PRIBIO,
 		    "bwrbg", 0);
 		if (bp->b_vflags & BV_BKGRDINPROG)
 			panic("bufwrite: still writing");
 	}
 	BO_UNLOCK(bp->b_bufobj);
 
 	/*
 	 * If this buffer is marked for background writing and we
 	 * do not have to wait for it, make a copy and write the
 	 * copy so as to leave this buffer ready for further use.
 	 *
 	 * This optimization eats a lot of memory.  If we have a page
 	 * or buffer shortfall we can't do it.
 	 */
 	if (dobkgrdwrite && (bp->b_xflags & BX_BKGRDWRITE) &&
 	    (bp->b_flags & B_ASYNC) &&
 	    !vm_page_count_severe() &&
 	    !buf_dirty_count_severe()) {
 		KASSERT(bp->b_iodone == NULL,
 		    ("bufwrite: needs chained iodone (%p)", bp->b_iodone));
 
 		/* get a new block */
 		newbp = geteblk(bp->b_bufsize, GB_NOWAIT_BD);
 		if (newbp == NULL)
 			goto normal_write;
 
 		KASSERT((bp->b_flags & B_UNMAPPED) == 0, ("Unmapped cg"));
 		memcpy(newbp->b_data, bp->b_data, bp->b_bufsize);
 		BO_LOCK(bp->b_bufobj);
 		bp->b_vflags |= BV_BKGRDINPROG;
 		BO_UNLOCK(bp->b_bufobj);
 		newbp->b_xflags |= BX_BKGRDMARKER;
 		newbp->b_lblkno = bp->b_lblkno;
 		newbp->b_blkno = bp->b_blkno;
 		newbp->b_offset = bp->b_offset;
 		newbp->b_iodone = ffs_backgroundwritedone;
 		newbp->b_flags |= B_ASYNC;
 		newbp->b_flags &= ~B_INVAL;
 		pbgetvp(bp->b_vp, newbp);
 
 #ifdef SOFTUPDATES
 		/*
 		 * Move over the dependencies.  If there are rollbacks,
 		 * leave the parent buffer dirtied as it will need to
 		 * be written again.
 		 */
 		if (LIST_EMPTY(&bp->b_dep) ||
 		    softdep_move_dependencies(bp, newbp) == 0)
 			bundirty(bp);
 #else
 		bundirty(bp);
 #endif
 
 		/*
 		 * Initiate write on the copy, release the original.  The
 		 * BKGRDINPROG flag prevents it from going away until 
 		 * the background write completes.
 		 */
 		bqrelse(bp);
 		bp = newbp;
 	} else
 		/* Mark the buffer clean */
 		bundirty(bp);
 
 
 	/* Let the normal bufwrite do the rest for us */
 normal_write:
 	return (bufwrite(bp));
 }
 
 
 static void
 ffs_geom_strategy(struct bufobj *bo, struct buf *bp)
 {
 	struct vnode *vp;
 	int error;
 	struct buf *tbp;
 	int nocopy;
 
 	vp = bo->__bo_vnode;
 	if (bp->b_iocmd == BIO_WRITE) {
 		if ((bp->b_flags & B_VALIDSUSPWRT) == 0 &&
 		    bp->b_vp != NULL && bp->b_vp->v_mount != NULL &&
 		    (bp->b_vp->v_mount->mnt_kern_flag & MNTK_SUSPENDED) != 0)
 			panic("ffs_geom_strategy: bad I/O");
 		nocopy = bp->b_flags & B_NOCOPY;
 		bp->b_flags &= ~(B_VALIDSUSPWRT | B_NOCOPY);
 		if ((vp->v_vflag & VV_COPYONWRITE) && nocopy == 0 &&
 		    vp->v_rdev->si_snapdata != NULL) {
 			if ((bp->b_flags & B_CLUSTER) != 0) {
 				runningbufwakeup(bp);
 				TAILQ_FOREACH(tbp, &bp->b_cluster.cluster_head,
 					      b_cluster.cluster_entry) {
 					error = ffs_copyonwrite(vp, tbp);
 					if (error != 0 &&
 					    error != EOPNOTSUPP) {
 						bp->b_error = error;
 						bp->b_ioflags |= BIO_ERROR;
 						bufdone(bp);
 						return;
 					}
 				}
 				bp->b_runningbufspace = bp->b_bufsize;
 				atomic_add_long(&runningbufspace,
 					       bp->b_runningbufspace);
 			} else {
 				error = ffs_copyonwrite(vp, bp);
 				if (error != 0 && error != EOPNOTSUPP) {
 					bp->b_error = error;
 					bp->b_ioflags |= BIO_ERROR;
 					bufdone(bp);
 					return;
 				}
 			}
 		}
 #ifdef SOFTUPDATES
 		if ((bp->b_flags & B_CLUSTER) != 0) {
 			TAILQ_FOREACH(tbp, &bp->b_cluster.cluster_head,
 				      b_cluster.cluster_entry) {
 				if (!LIST_EMPTY(&tbp->b_dep))
 					buf_start(tbp);
 			}
 		} else {
 			if (!LIST_EMPTY(&bp->b_dep))
 				buf_start(bp);
 		}
 
 #endif
 	}
 	g_vfs_strategy(bo, bp);
 }
 
 int
 ffs_own_mount(const struct mount *mp)
 {
 
 	if (mp->mnt_op == &ufs_vfsops)
 		return (1);
 	return (0);
 }
 
 #ifdef	DDB
 #ifdef SOFTUPDATES
 
 /* defined in ffs_softdep.c */
 extern void db_print_ffs(struct ufsmount *ump);
 
 DB_SHOW_COMMAND(ffs, db_show_ffs)
 {
 	struct mount *mp;
 	struct ufsmount *ump;
 
 	if (have_addr) {
 		ump = VFSTOUFS((struct mount *)addr);
 		db_print_ffs(ump);
 		return;
 	}
 
 	TAILQ_FOREACH(mp, &mountlist, mnt_list) {
 		if (!strcmp(mp->mnt_stat.f_fstypename, ufs_vfsconf.vfc_name))
 			db_print_ffs(VFSTOUFS(mp));
 	}
 }
 
 #endif	/* SOFTUPDATES */
 #endif	/* DDB */
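Note on the summary write-back loop in ffs_sbupdate() above: it walks the
cylinder-group summary area in steps of fs_frag fragments, writing a full
block (fs_bsize) each time and a shorter tail for the last, partial chunk.
The following is a minimal user-space sketch of that size arithmetic only.
The helper names cs_chunk_size and cs_total_bytes are hypothetical and do
not appear in the kernel source, and the sketch assumes
fs_bsize == fs_frag * fs_fsize.

```c
#include <assert.h>

/* Ceiling division, equivalent to howmany() from <sys/param.h>. */
#define HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/*
 * Hypothetical helper (not in the kernel source): size in bytes of the
 * summary chunk that starts at fragment index i, out of blks fragments.
 */
static int
cs_chunk_size(int i, int blks, int frag, int fsize)
{
	int size;

	assert(i >= 0 && i < blks && frag > 0 && fsize > 0);
	size = frag * fsize;		/* a full block, assumed == fs_bsize */
	if (i + frag > blks)		/* final, partial chunk */
		size = (blks - i) * fsize;
	return (size);
}

/* Hypothetical helper: total bytes covered by all chunks of the loop. */
static long
cs_total_bytes(int cssize, int frag, int fsize)
{
	int blks = HOWMANY(cssize, fsize);
	long total = 0;
	int i;

	/* Same stepping as the loop in ffs_sbupdate(): i += fs_frag. */
	for (i = 0; i < blks; i += frag)
		total += cs_chunk_size(i, blks, frag, fsize);
	return (total);
}
```

For example, with a 70000-byte summary area, 4096-byte fragments and 8
fragments per block, the area rounds up to 18 fragments and the loop would
issue chunks of 32768, 32768 and 8192 bytes.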
Index: user/ngie/more-tests/sys
===================================================================
--- user/ngie/more-tests/sys	(revision 281584)
+++ user/ngie/more-tests/sys	(revision 281585)

Property changes on: user/ngie/more-tests/sys
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/sys:r281504-281584
Index: user/ngie/more-tests/usr.bin/Makefile
===================================================================
--- user/ngie/more-tests/usr.bin/Makefile	(revision 281584)
+++ user/ngie/more-tests/usr.bin/Makefile	(revision 281585)
@@ -1,425 +1,431 @@
 #	From: @(#)Makefile	8.3 (Berkeley) 1/7/94
 # $FreeBSD$
 
 .include <src.opts.mk>
 
 # XXX MISSING:		deroff diction graph learn plot
 #			spell spline struct xsend
 # XXX Use GNU versions: diff ld patch
 # Moved to secure: bdes
 #
 
 SUBDIR=	${_addr2line} \
 	alias \
 	apply \
 	asa \
 	awk \
 	banner \
 	basename \
 	brandelf \
 	bsdiff \
 	bzip2 \
 	bzip2recover \
 	cap_mkdb \
 	chat \
 	chpass \
 	cksum \
 	${_clang} \
 	cmp \
 	col \
 	colldef \
 	colrm \
 	column \
 	comm \
 	compress \
 	cpuset \
 	csplit \
 	ctlstat \
 	cut \
 	demandoc \
 	dirname \
 	dpv \
 	du \
 	elf2aout \
 	${_elfcopy} \
 	elfdump \
 	enigma \
 	env \
 	expand \
 	false \
 	fetch \
 	find \
 	fmt \
 	fold \
 	fstat \
 	fsync \
 	gcore \
 	gencat \
 	getconf \
 	getent \
 	getopt \
 	grep \
 	gzip \
 	head \
 	hexdump \
 	${_iconv} \
 	id \
 	ipcrm \
 	ipcs \
 	join \
 	jot \
 	${_kdump} \
 	keylogin \
 	keylogout \
 	killall \
 	ktrace \
 	ktrdump \
 	lam \
 	lastcomm \
 	ldd \
 	leave \
 	less \
 	lessecho \
 	lesskey \
 	limits \
 	locale \
 	lock \
 	lockf \
 	logger \
 	login \
 	logins \
 	logname \
 	look \
 	lorder \
 	lsvfs \
 	lzmainfo \
 	m4 \
 	${_makewhatis} \
 	${_man} \
 	mandoc \
 	mesg \
 	minigzip \
 	ministat \
 	${_mkcsmapper} \
 	mkdep \
 	${_mkesdb} \
 	mkfifo \
 	mkimg \
 	mklocale \
 	mktemp \
 	mkulzma \
 	mkuzip \
 	mt \
 	ncal \
 	netstat \
 	newgrp \
 	nfsstat \
 	nice \
 	nl \
 	${_nm} \
 	nohup \
 	opieinfo \
 	opiekey \
 	opiepasswd \
 	pagesize \
 	passwd \
 	paste \
 	patch \
 	pathchk \
 	perror \
 	pr \
 	printenv \
 	printf \
 	procstat \
 	protect \
 	rctl \
 	${_readelf} \
 	renice \
 	rev \
 	revoke \
 	rpcinfo \
 	rs \
 	rup \
 	rusers \
 	rwall \
 	script \
 	sed \
 	send-pr \
 	seq \
 	shar \
 	showmount \
 	${_size} \
 	sockstat \
 	soeliminate \
 	sort \
 	split \
 	stat \
 	stdbuf \
 	${_strings} \
 	su \
 	systat \
 	tabs \
 	tail \
 	tar \
 	tcopy \
 	tee \
 	${_tests} \
 	time \
 	timeout \
 	tip \
 	top \
 	touch \
 	tput \
 	tr \
 	true \
 	truncate \
 	${_truss} \
 	tset \
 	tsort \
 	tty \
 	uname \
 	unexpand \
 	uniq \
 	unzip \
 	units \
 	unvis \
 	uudecode \
 	uuencode \
 	vis \
 	vmstat \
 	w \
 	wall \
 	wc \
 	what \
 	whereis \
 	which \
 	whois \
 	write \
 	xargs \
 	xinstall \
 	xo \
 	xz \
 	xzdec \
 	yes \
 	${_ypcat} \
 	${_ypmatch} \
 	${_ypwhich}
 
 # NB: keep these sorted by MK_* knobs
 
 .if ${MK_AT} != "no"
 SUBDIR+=	at
 .endif
 
 .if ${MK_ATM} != "no"
 SUBDIR+=	atm
 .endif
 
 .if ${MK_BLUETOOTH} != "no"
 SUBDIR+=	bluetooth
 .endif
 
 .if ${MK_BSD_CPIO} != "no"
 SUBDIR+=	cpio
 .endif
 
 .if ${MK_CALENDAR} != "no"
 SUBDIR+=	calendar
 .endif
 
 .if ${MK_CLANG} != "no"
 _clang=		clang
 .endif
 
 .if ${MK_EE} != "no"
 SUBDIR+=	ee
 .endif
 
 .if ${MK_ELFTOOLCHAIN_TOOLS} != "no"
 _addr2line=	addr2line
 _elfcopy=	elfcopy
 _nm=		nm
 _readelf=	readelf
 _size=		size
 _strings=	strings
 .endif
 
 .if ${MK_FILE} != "no"
 SUBDIR+=	file
 .endif
 
 .if ${MK_FINGER} != "no"
 SUBDIR+=	finger
 .endif
 
 .if ${MK_FMAKE} != "no"
 SUBDIR+=	make
 .endif
 
 .if ${MK_FTP} != "no"
 SUBDIR+=	ftp
 .endif
 
 .if ${MK_GPL_DTC} != "yes"
 SUBDIR+=	dtc
 .endif
 
 .if ${MK_GROFF} != "no"
 SUBDIR+=	vgrind
 .endif
 
 .if ${MK_HESIOD} != "no"
 SUBDIR+=	hesinfo
 .endif
 
 .if ${MK_ICONV} != "no"
 _iconv=		iconv
 _mkcsmapper=	mkcsmapper
 _mkesdb=	mkesdb
 .endif
 
 .if ${MK_ISCSI} != "no"
 SUBDIR+=	iscsictl
 .endif
 
 .if ${MK_KDUMP} != "no"
 SUBDIR+=        kdump
+.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO truss does not build
 SUBDIR+=        truss
 .endif
+.endif
 
 .if ${MK_KERBEROS_SUPPORT} != "no"
 SUBDIR+=	compile_et
 .endif
 
 .if ${MK_LDNS_UTILS} != "no"
 SUBDIR+=	drill
 SUBDIR+=	host
 .endif
 
 .if ${MK_LOCATE} != "no"
 SUBDIR+=	locate
 .endif
 
 # XXX msgs?
 .if ${MK_MAIL} != "no"
 SUBDIR+=	biff
 SUBDIR+=	from
 SUBDIR+=	mail
 SUBDIR+=	msgs
 .endif
 
 .if ${MK_MAKE} != "no"
 SUBDIR+=	bmake
 .endif
 
 .if ${MK_MAN_UTILS} != "no"
 SUBDIR+=	catman
 _makewhatis=	makewhatis
 _man=		man
 .endif
 
 .if ${MK_NETCAT} != "no"
 SUBDIR+=	nc
 .endif
 
 .if ${MK_NIS} != "no"
 SUBDIR+=	ypcat
 SUBDIR+=	ypmatch
 SUBDIR+=	ypwhich
 .endif
 
 .if ${MK_OPENSSH} != "no"
 SUBDIR+=	ssh-copy-id
 .endif
 
 .if ${MK_OPENSSL} != "no"
 SUBDIR+=	bc
 SUBDIR+=	chkey
 SUBDIR+=	dc
 SUBDIR+=	newkey
 .endif
 
 .if ${MK_QUOTAS} != "no"
 SUBDIR+=	quota
 .endif
 
 .if ${MK_RCMDS} != "no"
 SUBDIR+=	rlogin
 SUBDIR+=	rsh
 SUBDIR+=	ruptime
 SUBDIR+=	rwho
 .endif
 
 .if ${MK_SENDMAIL} != "no"
 SUBDIR+=	vacation
 .endif
 
 .if ${MK_TALK} != "no"
 SUBDIR+=	talk
 .endif
 
 .if ${MK_TELNET} != "no"
 SUBDIR+=	telnet
 .endif
 
 .if ${MK_TESTS} != "no"
 _tests=		tests
 .endif
 
 .if ${MK_TEXTPROC} != "no"
 SUBDIR+=	checknr
 SUBDIR+=	colcrt
 SUBDIR+=	ul
 .endif
 
 .if ${MK_TFTP} != "no"
 SUBDIR+=	tftp
 .endif
 
 .if ${MK_TOOLCHAIN} != "no"
 SUBDIR+=	ar
 SUBDIR+=	c89
 SUBDIR+=	c99
 SUBDIR+=	ctags
 SUBDIR+=	file2c
+.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO gprof does not build
 SUBDIR+=	gprof
+.endif
 SUBDIR+=	indent
 SUBDIR+=	lex
 SUBDIR+=	mkstr
 SUBDIR+=	rpcgen
 SUBDIR+=	unifdef
+.if ${MACHINE_ARCH} != "aarch64" # ARM64TODO xlint does not build
 SUBDIR+=	xlint
+.endif
 SUBDIR+=	xstr
 SUBDIR+=	yacc
 .endif
 
 .if ${MK_VI} != "no"
 SUBDIR+=	vi
 .endif
 
 .if ${MK_VT} != "no"
 SUBDIR+=	vtfontcvt
 .endif
 
 .if ${MK_USB} != "no"
 SUBDIR+=	usbhidaction
 SUBDIR+=	usbhidctl
 .endif
 
 .if ${MK_UTMPX} != "no"
 SUBDIR+=	last
 SUBDIR+=	users
 SUBDIR+=	who
 .endif
 
 .if ${MK_SVN} == "yes" || ${MK_SVNLITE} == "yes"
 SUBDIR+=	svn
 .endif
 
 .include <bsd.arch.inc.mk>
 
 SUBDIR:=	${SUBDIR:O}
 
 SUBDIR_PARALLEL=
 
 .include <bsd.subdir.mk>
Index: user/ngie/more-tests/usr.bin/gzip/gzip.c
===================================================================
--- user/ngie/more-tests/usr.bin/gzip/gzip.c	(revision 281584)
+++ user/ngie/more-tests/usr.bin/gzip/gzip.c	(revision 281585)
@@ -1,2170 +1,2173 @@
 /*	$NetBSD: gzip.c,v 1.107 2015/01/13 02:37:20 mrg Exp $	*/
 
 /*-
  * Copyright (c) 1997, 1998, 2003, 2004, 2006 Matthew R. Green
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
  * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
  * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
  * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 #ifndef lint
 __COPYRIGHT("@(#) Copyright (c) 1997, 1998, 2003, 2004, 2006\
  Matthew R. Green.  All rights reserved.");
 __FBSDID("$FreeBSD$");
 #endif /* not lint */
 
 /*
  * gzip.c -- GPL free gzip using zlib.
  *
  * RFC 1950 covers the zlib format
  * RFC 1951 covers the deflate format
  * RFC 1952 covers the gzip format
  *
  * TODO:
  *	- use mmap where possible
  *	- make bzip2/compress -v/-t/-l support work as well as possible
  */
 
 #include <sys/param.h>
 #include <sys/stat.h>
 #include <sys/time.h>
 
 #include <inttypes.h>
 #include <unistd.h>
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
 #include <err.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <zlib.h>
 #include <fts.h>
 #include <libgen.h>
 #include <stdarg.h>
 #include <getopt.h>
 #include <time.h>
 
 /* what type of file are we dealing with */
 enum filetype {
 	FT_GZIP,
 #ifndef NO_BZIP2_SUPPORT
 	FT_BZIP2,
 #endif
 #ifndef NO_COMPRESS_SUPPORT
 	FT_Z,
 #endif
 #ifndef NO_PACK_SUPPORT
 	FT_PACK,
 #endif
 #ifndef NO_XZ_SUPPORT
 	FT_XZ,
 #endif
 	FT_LAST,
 	FT_UNKNOWN
 };
 
 #ifndef NO_BZIP2_SUPPORT
 #include <bzlib.h>
 
 #define BZ2_SUFFIX	".bz2"
 #define BZIP2_MAGIC	"\102\132\150"
 #endif
 
 #ifndef NO_COMPRESS_SUPPORT
 #define Z_SUFFIX	".Z"
 #define Z_MAGIC		"\037\235"
 #endif
 
 #ifndef NO_PACK_SUPPORT
 #define PACK_MAGIC	"\037\036"
 #endif
 
 #ifndef NO_XZ_SUPPORT
 #include <lzma.h>
 #define XZ_SUFFIX	".xz"
 #define XZ_MAGIC	"\3757zXZ"
 #endif
 
 #define GZ_SUFFIX	".gz"
 
 #define BUFLEN		(64 * 1024)
 
 #define GZIP_MAGIC0	0x1F
 #define GZIP_MAGIC1	0x8B
 #define GZIP_OMAGIC1	0x9E
 
 #define GZIP_TIMESTAMP	(off_t)4
 #define GZIP_ORIGNAME	(off_t)10
 
 #define HEAD_CRC	0x02
 #define EXTRA_FIELD	0x04
 #define ORIG_NAME	0x08
 #define COMMENT		0x10
 
 #define OS_CODE		3	/* Unix */
 
 typedef struct {
     const char	*zipped;
     int		ziplen;
     const char	*normal;	/* for unzip - must not be longer than zipped */
 } suffixes_t;
 static suffixes_t suffixes[] = {
 #define	SUFFIX(Z, N) {Z, sizeof Z - 1, N}
 	SUFFIX(GZ_SUFFIX,	""),	/* Overwritten by -S .xxx */
 #ifndef SMALL
 	SUFFIX(GZ_SUFFIX,	""),
 	SUFFIX(".z",		""),
 	SUFFIX("-gz",		""),
 	SUFFIX("-z",		""),
 	SUFFIX("_z",		""),
 	SUFFIX(".taz",		".tar"),
 	SUFFIX(".tgz",		".tar"),
 #ifndef NO_BZIP2_SUPPORT
 	SUFFIX(BZ2_SUFFIX,	""),
 	SUFFIX(".tbz",		".tar"),
 	SUFFIX(".tbz2",		".tar"),
 #endif
 #ifndef NO_COMPRESS_SUPPORT
 	SUFFIX(Z_SUFFIX,	""),
 #endif
 #ifndef NO_XZ_SUPPORT
 	SUFFIX(XZ_SUFFIX,	""),
 #endif
 	SUFFIX(GZ_SUFFIX,	""),	/* Overwritten by -S "" */
 #endif /* SMALL */
 #undef SUFFIX
 };
 #define NUM_SUFFIXES (sizeof suffixes / sizeof suffixes[0])
 #define SUFFIX_MAXLEN	30
 
 static	const char	gzip_version[] = "FreeBSD gzip 20150413";
 
 #ifndef SMALL
 static	const char	gzip_copyright[] = \
 "   Copyright (c) 1997, 1998, 2003, 2004, 2006 Matthew R. Green\n"
 "   All rights reserved.\n"
 "\n"
 "   Redistribution and use in source and binary forms, with or without\n"
 "   modification, are permitted provided that the following conditions\n"
 "   are met:\n"
 "   1. Redistributions of source code must retain the above copyright\n"
 "      notice, this list of conditions and the following disclaimer.\n"
 "   2. Redistributions in binary form must reproduce the above copyright\n"
 "      notice, this list of conditions and the following disclaimer in the\n"
 "      documentation and/or other materials provided with the distribution.\n"
 "\n"
 "   THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR\n"
 "   IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES\n"
 "   OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.\n"
 "   IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,\n"
 "   INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n"
 "   BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n"
 "   LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED\n"
 "   AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\n"
 "   OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY\n"
 "   OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF\n"
 "   SUCH DAMAGE.";
 #endif
 
 static	int	cflag;			/* stdout mode */
 static	int	dflag;			/* decompress mode */
 static	int	lflag;			/* list mode */
 static	int	numflag = 6;		/* gzip -1..-9 value */
 
 #ifndef SMALL
 static	int	fflag;			/* force mode */
 static	int	kflag;			/* don't delete input files */
 static	int	nflag;			/* don't save name/timestamp */
 static	int	Nflag;			/* don't restore name/timestamp */
 static	int	qflag;			/* quiet mode */
 static	int	rflag;			/* recursive mode */
 static	int	tflag;			/* test */
 static	int	vflag;			/* verbose mode */
 static	const char *remove_file = NULL;	/* file to be removed upon SIGINT */
 #else
 #define		qflag	0
 #define		tflag	0
 #endif
 
 static	int	exit_value = 0;		/* exit value */
 
 static	char	*infile;		/* name of file coming in */
 
 static	void	maybe_err(const char *fmt, ...) __printflike(1, 2) __dead2;
 #if !defined(NO_BZIP2_SUPPORT) || !defined(NO_PACK_SUPPORT) ||	\
     !defined(NO_XZ_SUPPORT)
 static	void	maybe_errx(const char *fmt, ...) __printflike(1, 2) __dead2;
 #endif
 static	void	maybe_warn(const char *fmt, ...) __printflike(1, 2);
 static	void	maybe_warnx(const char *fmt, ...) __printflike(1, 2);
 static	enum filetype file_gettype(u_char *);
 #ifdef SMALL
 #define gz_compress(if, of, sz, fn, tm) gz_compress(if, of, sz)
 #endif
 static	off_t	gz_compress(int, int, off_t *, const char *, uint32_t);
 static	off_t	gz_uncompress(int, int, char *, size_t, off_t *, const char *);
 static	off_t	file_compress(char *, char *, size_t);
 static	off_t	file_uncompress(char *, char *, size_t);
 static	void	handle_pathname(char *);
 static	void	handle_file(char *, struct stat *);
 static	void	handle_stdin(void);
 static	void	handle_stdout(void);
 static	void	print_ratio(off_t, off_t, FILE *);
 static	void	print_list(int fd, off_t, const char *, time_t);
 static	void	usage(void) __dead2;
 static	void	display_version(void) __dead2;
 #ifndef SMALL
 static	void	display_license(void);
 static	void	sigint_handler(int);
 #endif
 static	const suffixes_t *check_suffix(char *, int);
 static	ssize_t	read_retry(int, void *, size_t);
 
 #ifdef SMALL
 #define unlink_input(f, sb) unlink(f)
 #else
 static	off_t	cat_fd(unsigned char *, size_t, off_t *, int fd);
 static	void	prepend_gzip(char *, int *, char ***);
 static	void	handle_dir(char *);
 static	void	print_verbage(const char *, const char *, off_t, off_t);
 static	void	print_test(const char *, int);
 static	void	copymodes(int fd, const struct stat *, const char *file);
 static	int	check_outfile(const char *outfile);
 #endif
 
 #ifndef NO_BZIP2_SUPPORT
 static	off_t	unbzip2(int, int, char *, size_t, off_t *);
 #endif
 
 #ifndef NO_COMPRESS_SUPPORT
 static	FILE 	*zdopen(int);
 static	off_t	zuncompress(FILE *, FILE *, char *, size_t, off_t *);
 #endif
 
 #ifndef NO_PACK_SUPPORT
 static	off_t	unpack(int, int, char *, size_t, off_t *);
 #endif
 
 #ifndef NO_XZ_SUPPORT
 static	off_t	unxz(int, int, char *, size_t, off_t *);
 #endif
 
 #ifdef SMALL
 #define getopt_long(a,b,c,d,e) getopt(a,b,c)
 #else
 static const struct option longopts[] = {
 	{ "stdout",		no_argument,		0,	'c' },
 	{ "to-stdout",		no_argument,		0,	'c' },
 	{ "decompress",		no_argument,		0,	'd' },
 	{ "uncompress",		no_argument,		0,	'd' },
 	{ "force",		no_argument,		0,	'f' },
 	{ "help",		no_argument,		0,	'h' },
 	{ "keep",		no_argument,		0,	'k' },
 	{ "list",		no_argument,		0,	'l' },
 	{ "no-name",		no_argument,		0,	'n' },
 	{ "name",		no_argument,		0,	'N' },
 	{ "quiet",		no_argument,		0,	'q' },
 	{ "recursive",		no_argument,		0,	'r' },
 	{ "suffix",		required_argument,	0,	'S' },
 	{ "test",		no_argument,		0,	't' },
 	{ "verbose",		no_argument,		0,	'v' },
 	{ "version",		no_argument,		0,	'V' },
 	{ "fast",		no_argument,		0,	'1' },
 	{ "best",		no_argument,		0,	'9' },
 	{ "ascii",		no_argument,		0,	'a' },
 	{ "license",		no_argument,		0,	'L' },
 	{ NULL,			no_argument,		0,	0 },
 };
 #endif
 
 int
 main(int argc, char **argv)
 {
 	const char *progname = getprogname();
 #ifndef SMALL
 	char *gzip;
 	int len;
 #endif
 	int ch;
 
 #ifndef SMALL
 	if ((gzip = getenv("GZIP")) != NULL)
 		prepend_gzip(gzip, &argc, &argv);
 	signal(SIGINT, sigint_handler);
 #endif
 
 	/*
 	 * XXX
 	 * handle being called `gunzip', `zcat' and `gzcat'
 	 */
 	if (strcmp(progname, "gunzip") == 0)
 		dflag = 1;
 	else if (strcmp(progname, "zcat") == 0 ||
 		 strcmp(progname, "gzcat") == 0)
 		dflag = cflag = 1;
 
 #ifdef SMALL
 #define OPT_LIST "123456789cdhlV"
 #else
 #define OPT_LIST "123456789acdfhklLNnqrS:tVv"
 #endif
 
 	while ((ch = getopt_long(argc, argv, OPT_LIST, longopts, NULL)) != -1) {
 		switch (ch) {
 		case '1': case '2': case '3':
 		case '4': case '5': case '6':
 		case '7': case '8': case '9':
 			numflag = ch - '0';
 			break;
 		case 'c':
 			cflag = 1;
 			break;
 		case 'd':
 			dflag = 1;
 			break;
 		case 'l':
 			lflag = 1;
 			dflag = 1;
 			break;
 		case 'V':
 			display_version();
 			/* NOTREACHED */
 #ifndef SMALL
 		case 'a':
 			fprintf(stderr, "%s: option --ascii ignored on this system\n", progname);
 			break;
 		case 'f':
 			fflag = 1;
 			break;
 		case 'k':
 			kflag = 1;
 			break;
 		case 'L':
 			display_license();
 			/* NOTREACHED */
 		case 'N':
 			nflag = 0;
 			Nflag = 1;
 			break;
 		case 'n':
 			nflag = 1;
 			Nflag = 0;
 			break;
 		case 'q':
 			qflag = 1;
 			break;
 		case 'r':
 			rflag = 1;
 			break;
 		case 'S':
 			len = strlen(optarg);
 			if (len != 0) {
 				if (len > SUFFIX_MAXLEN)
 					errx(1, "incorrect suffix: '%s': too long", optarg);
 				suffixes[0].zipped = optarg;
 				suffixes[0].ziplen = len;
 			} else {
 				suffixes[NUM_SUFFIXES - 1].zipped = "";
 				suffixes[NUM_SUFFIXES - 1].ziplen = 0;
 			}
 			break;
 		case 't':
 			cflag = 1;
 			tflag = 1;
 			dflag = 1;
 			break;
 		case 'v':
 			vflag = 1;
 			break;
 #endif
 		default:
 			usage();
 			/* NOTREACHED */
 		}
 	}
 	argv += optind;
 	argc -= optind;
 
 	if (argc == 0) {
 		if (dflag)	/* stdin mode */
 			handle_stdin();
 		else		/* stdout mode */
 			handle_stdout();
 	} else {
 		do {
 			handle_pathname(argv[0]);
 		} while (*++argv);
 	}
 #ifndef SMALL
 	if (qflag == 0 && lflag && argc > 1)
 		print_list(-1, 0, "(totals)", 0);
 #endif
 	exit(exit_value);
 }
 
 /* maybe print a warning */
 void
 maybe_warn(const char *fmt, ...)
 {
 	va_list ap;
 
 	if (qflag == 0) {
 		va_start(ap, fmt);
 		vwarn(fmt, ap);
 		va_end(ap);
 	}
 	if (exit_value == 0)
 		exit_value = 1;
 }
 
 /* ... without an errno. */
 void
 maybe_warnx(const char *fmt, ...)
 {
 	va_list ap;
 
 	if (qflag == 0) {
 		va_start(ap, fmt);
 		vwarnx(fmt, ap);
 		va_end(ap);
 	}
 	if (exit_value == 0)
 		exit_value = 1;
 }
 
 /* maybe print an error */
 void
 maybe_err(const char *fmt, ...)
 {
 	va_list ap;
 
 	if (qflag == 0) {
 		va_start(ap, fmt);
 		vwarn(fmt, ap);
 		va_end(ap);
 	}
 	exit(2);
 }
 
 #if !defined(NO_BZIP2_SUPPORT) || !defined(NO_PACK_SUPPORT) ||	\
     !defined(NO_XZ_SUPPORT)
 /* ... without an errno. */
 void
 maybe_errx(const char *fmt, ...)
 {
 	va_list ap;
 
 	if (qflag == 0) {
 		va_start(ap, fmt);
 		vwarnx(fmt, ap);
 		va_end(ap);
 	}
 	exit(2);
 }
 #endif
 
 #ifndef SMALL
 /* split up $GZIP and prepend it to the argument list */
 static void
 prepend_gzip(char *gzip, int *argc, char ***argv)
 {
 	char *s, **nargv, **ac;
 	int nenvarg = 0, i;
 
 	/* scan how many arguments there are */
 	for (s = gzip;;) {
 		while (*s == ' ' || *s == '\t')
 			s++;
 		if (*s == 0)
 			goto count_done;
 		nenvarg++;
 		while (*s != ' ' && *s != '\t')
 			if (*s++ == 0)
 				goto count_done;
 	}
 count_done:
 	/* punt early */
 	if (nenvarg == 0)
 		return;
 
 	*argc += nenvarg;
 	ac = *argv;
 
 	nargv = (char **)malloc((*argc + 1) * sizeof(char *));
 	if (nargv == NULL)
 		maybe_err("malloc");
 
 	/* stash this away */
 	*argv = nargv;
 
 	/* copy the program name first */
 	i = 0;
 	nargv[i++] = *(ac++);
 
 	/* take a copy of $GZIP and add it to the array */
 	s = strdup(gzip);
 	if (s == NULL)
 		maybe_err("strdup");
 	for (;;) {
 		/* Skip whitespace. */
 		while (*s == ' ' || *s == '\t')
 			s++;
 		if (*s == 0)
 			goto copy_done;
 		nargv[i++] = s;
 		/* Find the end of this argument. */
 		while (*s != ' ' && *s != '\t')
 			if (*s++ == 0)
 				/* Argument followed by NUL. */
 				goto copy_done;
 		/* Terminate by overwriting ' ' or '\t' with NUL. */
 		*s++ = 0;
 	}
 copy_done:
 
 	/* copy the original arguments and a NULL */
 	while (*ac)
 		nargv[i++] = *(ac++);
 	nargv[i] = NULL;
 }
 #endif
 
 /* compress input to output. Return bytes read, -1 on error */
 static off_t
 gz_compress(int in, int out, off_t *gsizep, const char *origname, uint32_t mtime)
 {
 	z_stream z;
 	char *outbufp, *inbufp;
 	off_t in_tot = 0, out_tot = 0;
 	ssize_t in_size;
 	int i, error;
 	uLong crc;
 #ifdef SMALL
 	static char header[] = { GZIP_MAGIC0, GZIP_MAGIC1, Z_DEFLATED, 0,
 				 0, 0, 0, 0,
 				 0, OS_CODE };
 #endif
 
 	outbufp = malloc(BUFLEN);
 	inbufp = malloc(BUFLEN);
 	if (outbufp == NULL || inbufp == NULL) {
 		maybe_err("malloc failed");
 		goto out;
 	}
 
 	memset(&z, 0, sizeof z);
 	z.zalloc = Z_NULL;
 	z.zfree = Z_NULL;
 	z.opaque = 0;
 
 #ifdef SMALL
 	memcpy(outbufp, header, sizeof header);
 	i = sizeof header;
 #else
 	if (nflag != 0) {
 		mtime = 0;
 		origname = "";
 	}
 
 	i = snprintf(outbufp, BUFLEN, "%c%c%c%c%c%c%c%c%c%c%s", 
 		     GZIP_MAGIC0, GZIP_MAGIC1, Z_DEFLATED,
 		     *origname ? ORIG_NAME : 0,
 		     mtime & 0xff,
 		     (mtime >> 8) & 0xff,
 		     (mtime >> 16) & 0xff,
 		     (mtime >> 24) & 0xff,
 		     numflag == 1 ? 4 : numflag == 9 ? 2 : 0,
 		     OS_CODE, origname);
 	if (i >= BUFLEN)     
 		/* this needs PATH_MAX > BUFLEN ... */
 		maybe_err("snprintf");
 	if (*origname)
 		i++;
 #endif
 
 	z.next_out = (unsigned char *)outbufp + i;
 	z.avail_out = BUFLEN - i;
 
 	error = deflateInit2(&z, numflag, Z_DEFLATED,
 			     (-MAX_WBITS), 8, Z_DEFAULT_STRATEGY);
 	if (error != Z_OK) {
 		maybe_warnx("deflateInit2 failed");
 		in_tot = -1;
 		goto out;
 	}
 
 	crc = crc32(0L, Z_NULL, 0);
 	for (;;) {
 		if (z.avail_out == 0) {
 			if (write(out, outbufp, BUFLEN) != BUFLEN) {
 				maybe_warn("write");
 				out_tot = -1;
 				goto out;
 			}
 
 			out_tot += BUFLEN;
 			z.next_out = (unsigned char *)outbufp;
 			z.avail_out = BUFLEN;
 		}
 
 		if (z.avail_in == 0) {
 			in_size = read(in, inbufp, BUFLEN);
 			if (in_size < 0) {
 				maybe_warn("read");
 				in_tot = -1;
 				goto out;
 			}
 			if (in_size == 0)
 				break;
 
 			crc = crc32(crc, (const Bytef *)inbufp, (unsigned)in_size);
 			in_tot += in_size;
 			z.next_in = (unsigned char *)inbufp;
 			z.avail_in = in_size;
 		}
 
 		error = deflate(&z, Z_NO_FLUSH);
 		if (error != Z_OK && error != Z_STREAM_END) {
 			maybe_warnx("deflate failed");
 			in_tot = -1;
 			goto out;
 		}
 	}
 
 	/* clean up */
 	for (;;) {
 		size_t len;
 		ssize_t w;
 
 		error = deflate(&z, Z_FINISH);
 		if (error != Z_OK && error != Z_STREAM_END) {
 			maybe_warnx("deflate failed");
 			in_tot = -1;
 			goto out;
 		}
 
 		len = (char *)z.next_out - outbufp;
 
 		w = write(out, outbufp, len);
 		if (w == -1 || (size_t)w != len) {
 			maybe_warn("write");
 			out_tot = -1;
 			goto out;
 		}
 		out_tot += len;
 		z.next_out = (unsigned char *)outbufp;
 		z.avail_out = BUFLEN;
 
 		if (error == Z_STREAM_END)
 			break;
 	}
 
 	if (deflateEnd(&z) != Z_OK) {
 		maybe_warnx("deflateEnd failed");
 		in_tot = -1;
 		goto out;
 	}
 
 	i = snprintf(outbufp, BUFLEN, "%c%c%c%c%c%c%c%c", 
 		 (int)crc & 0xff,
 		 (int)(crc >> 8) & 0xff,
 		 (int)(crc >> 16) & 0xff,
 		 (int)(crc >> 24) & 0xff,
 		 (int)in_tot & 0xff,
 		 (int)(in_tot >> 8) & 0xff,
 		 (int)(in_tot >> 16) & 0xff,
 		 (int)(in_tot >> 24) & 0xff);
 	if (i != 8)
 		maybe_err("snprintf");
 	if (write(out, outbufp, i) != i) {
 		maybe_warn("write");
 		in_tot = -1;
 	} else
 		out_tot += i;
 
 out:
 	if (inbufp != NULL)
 		free(inbufp);
 	if (outbufp != NULL)
 		free(outbufp);
 	if (gsizep)
 		*gsizep = out_tot;
 	return in_tot;
 }
 
 /*
  * uncompress input to output then close the input.  return the
  * uncompressed size written, and put the compressed sized read
  * into `*gsizep'.
  */
 static off_t
 gz_uncompress(int in, int out, char *pre, size_t prelen, off_t *gsizep,
 	      const char *filename)
 {
 	z_stream z;
 	char *outbufp, *inbufp;
 	off_t out_tot = -1, in_tot = 0;
 	uint32_t out_sub_tot = 0;
 	enum {
 		GZSTATE_MAGIC0,
 		GZSTATE_MAGIC1,
 		GZSTATE_METHOD,
 		GZSTATE_FLAGS,
 		GZSTATE_SKIPPING,
 		GZSTATE_EXTRA,
 		GZSTATE_EXTRA2,
 		GZSTATE_EXTRA3,
 		GZSTATE_ORIGNAME,
 		GZSTATE_COMMENT,
 		GZSTATE_HEAD_CRC1,
 		GZSTATE_HEAD_CRC2,
 		GZSTATE_INIT,
 		GZSTATE_READ,
 		GZSTATE_CRC,
 		GZSTATE_LEN,
 	} state = GZSTATE_MAGIC0;
 	int flags = 0, skip_count = 0;
 	int error = Z_STREAM_ERROR, done_reading = 0;
 	uLong crc = 0;
 	ssize_t wr;
 	int needmore = 0;
 
 #define ADVANCE()       { z.next_in++; z.avail_in--; }
 
 	if ((outbufp = malloc(BUFLEN)) == NULL) {
 		maybe_err("malloc failed");
 		goto out2;
 	}
 	if ((inbufp = malloc(BUFLEN)) == NULL) {
 		maybe_err("malloc failed");
 		goto out1;
 	}
 
 	memset(&z, 0, sizeof z);
 	z.avail_in = prelen;
 	z.next_in = (unsigned char *)pre;
 	z.avail_out = BUFLEN;
 	z.next_out = (unsigned char *)outbufp;
 	z.zalloc = NULL;
 	z.zfree = NULL;
 	z.opaque = 0;
 
 	in_tot = prelen;
 	out_tot = 0;
 
 	for (;;) {
 		if ((z.avail_in == 0 || needmore) && done_reading == 0) {
 			ssize_t in_size;
 
 			if (z.avail_in > 0) {
 				memmove(inbufp, z.next_in, z.avail_in);
 			}
 			z.next_in = (unsigned char *)inbufp;
 			in_size = read(in, z.next_in + z.avail_in,
 			    BUFLEN - z.avail_in);
 
 			if (in_size == -1) {
 				maybe_warn("failed to read stdin");
 				goto stop_and_fail;
 			} else if (in_size == 0) {
 				done_reading = 1;
 			}
 
 			z.avail_in += in_size;
 			needmore = 0;
 
 			in_tot += in_size;
 		}
 		if (z.avail_in == 0) {
 			if (done_reading && state != GZSTATE_MAGIC0) {
 				maybe_warnx("%s: unexpected end of file",
 					    filename);
 				goto stop_and_fail;
 			}
 			goto stop;
 		}
 		switch (state) {
 		case GZSTATE_MAGIC0:
 			if (*z.next_in != GZIP_MAGIC0) {
 				if (in_tot > 0) {
 					maybe_warnx("%s: trailing garbage "
 						    "ignored", filename);
 					goto stop;
 				}
 				maybe_warnx("input not gziped (MAGIC0)");
 				goto stop_and_fail;
 			}
 			ADVANCE();
 			state++;
 			out_sub_tot = 0;
 			crc = crc32(0L, Z_NULL, 0);
 			break;
 
 		case GZSTATE_MAGIC1:
 			if (*z.next_in != GZIP_MAGIC1 &&
 			    *z.next_in != GZIP_OMAGIC1) {
 				maybe_warnx("input not gziped (MAGIC1)");
 				goto stop_and_fail;
 			}
 			ADVANCE();
 			state++;
 			break;
 
 		case GZSTATE_METHOD:
 			if (*z.next_in != Z_DEFLATED) {
 				maybe_warnx("unknown compression method");
 				goto stop_and_fail;
 			}
 			ADVANCE();
 			state++;
 			break;
 
 		case GZSTATE_FLAGS:
 			flags = *z.next_in;
 			ADVANCE();
 			skip_count = 6;
 			state++;
 			break;
 
 		case GZSTATE_SKIPPING:
 			if (skip_count > 0) {
 				skip_count--;
 				ADVANCE();
 			} else
 				state++;
 			break;
 
 		case GZSTATE_EXTRA:
 			if ((flags & EXTRA_FIELD) == 0) {
 				state = GZSTATE_ORIGNAME;
 				break;
 			}
 			skip_count = *z.next_in;
 			ADVANCE();
 			state++;
 			break;
 
 		case GZSTATE_EXTRA2:
 			skip_count |= ((*z.next_in) << 8);
 			ADVANCE();
 			state++;
 			break;
 
 		case GZSTATE_EXTRA3:
 			if (skip_count > 0) {
 				skip_count--;
 				ADVANCE();
 			} else
 				state++;
 			break;
 
 		case GZSTATE_ORIGNAME:
 			if ((flags & ORIG_NAME) == 0) {
 				state++;
 				break;
 			}
 			if (*z.next_in == 0)
 				state++;
 			ADVANCE();
 			break;
 
 		case GZSTATE_COMMENT:
 			if ((flags & COMMENT) == 0) {
 				state++;
 				break;
 			}
 			if (*z.next_in == 0)
 				state++;
 			ADVANCE();
 			break;
 
 		case GZSTATE_HEAD_CRC1:
 			if (flags & HEAD_CRC)
 				skip_count = 2;
 			else
 				skip_count = 0;
 			state++;
 			break;
 
 		case GZSTATE_HEAD_CRC2:
 			if (skip_count > 0) {
 				skip_count--;
 				ADVANCE();
 			} else
 				state++;
 			break;
 
 		case GZSTATE_INIT:
 			if (inflateInit2(&z, -MAX_WBITS) != Z_OK) {
 				maybe_warnx("failed to inflateInit");
 				goto stop_and_fail;
 			}
 			state++;
 			break;
 
 		case GZSTATE_READ:
 			error = inflate(&z, Z_FINISH);
 			switch (error) {
 			/* Z_BUF_ERROR goes with Z_FINISH... */
 			case Z_BUF_ERROR:
 				if (z.avail_out > 0 && !done_reading)
 					continue;
 				/* FALLTHROUGH */
 			case Z_STREAM_END:
 			case Z_OK:
 				break;
 
 			case Z_NEED_DICT:
 				maybe_warnx("Z_NEED_DICT error");
 				goto stop_and_fail;
 			case Z_DATA_ERROR:
 				maybe_warnx("data stream error");
 				goto stop_and_fail;
 			case Z_STREAM_ERROR:
 				maybe_warnx("internal stream error");
 				goto stop_and_fail;
 			case Z_MEM_ERROR:
 				maybe_warnx("memory allocation error");
 				goto stop_and_fail;
 
 			default:
 				maybe_warn("unknown error from inflate(): %d",
 				    error);
 			}
 			wr = BUFLEN - z.avail_out;
 
 			if (wr != 0) {
 				crc = crc32(crc, (const Bytef *)outbufp, (unsigned)wr);
 				if (
 #ifndef SMALL
 				    /* don't write anything with -t */
 				    tflag == 0 &&
 #endif
 				    write(out, outbufp, wr) != wr) {
 					maybe_warn("error writing to output");
 					goto stop_and_fail;
 				}
 
 				out_tot += wr;
 				out_sub_tot += wr;
 			}
 
 			if (error == Z_STREAM_END) {
 				inflateEnd(&z);
 				state++;
 			}
 
 			z.next_out = (unsigned char *)outbufp;
 			z.avail_out = BUFLEN;
 
 			break;
 		case GZSTATE_CRC:
 			{
 				uLong origcrc;
 
 				if (z.avail_in < 4) {
 					if (!done_reading) {
 						needmore = 1;
 						continue;
 					}
 					maybe_warnx("truncated input");
 					goto stop_and_fail;
 				}
 				origcrc = ((unsigned)z.next_in[0] & 0xff) |
 					((unsigned)z.next_in[1] & 0xff) << 8 |
 					((unsigned)z.next_in[2] & 0xff) << 16 |
 					((unsigned)z.next_in[3] & 0xff) << 24;
 				if (origcrc != crc) {
 					maybe_warnx("invalid compressed"
 					     " data--crc error");
 					goto stop_and_fail;
 				}
 			}
 
 			z.avail_in -= 4;
 			z.next_in += 4;
 
 			if (!z.avail_in && done_reading) {
 				goto stop;
 			}
 			state++;
 			break;
 		case GZSTATE_LEN:
 			{
 				uLong origlen;
 
 				if (z.avail_in < 4) {
 					if (!done_reading) {
 						needmore = 1;
 						continue;
 					}
 					maybe_warnx("truncated input");
 					goto stop_and_fail;
 				}
 				origlen = ((unsigned)z.next_in[0] & 0xff) |
 					((unsigned)z.next_in[1] & 0xff) << 8 |
 					((unsigned)z.next_in[2] & 0xff) << 16 |
 					((unsigned)z.next_in[3] & 0xff) << 24;
 
 				if (origlen != out_sub_tot) {
 					maybe_warnx("invalid compressed"
 					     " data--length error");
 					goto stop_and_fail;
 				}
 			}
 				
 			z.avail_in -= 4;
 			z.next_in += 4;
 
 			if (error < 0) {
 				maybe_warnx("decompression error");
 				goto stop_and_fail;
 			}
 			state = GZSTATE_MAGIC0;
 			break;
 		}
 		continue;
 stop_and_fail:
 		out_tot = -1;
 stop:
 		break;
 	}
 	if (state > GZSTATE_INIT)
 		inflateEnd(&z);
 
 	free(inbufp);
 out1:
 	free(outbufp);
 out2:
 	if (gsizep)
 		*gsizep = in_tot;
 	return (out_tot);
 }
 
 #ifndef SMALL
 /*
  * set the owner, mode, flags & utimes using the given file descriptor.
  * file is only used in possible warning messages.
  */
 static void
 copymodes(int fd, const struct stat *sbp, const char *file)
 {
 	struct timespec times[2];
 	struct stat sb;
 
 	/*
 	 * If we have no info on the input, give this file some
 	 * default values and return.
 	 */
 	if (sbp == NULL) {
 		mode_t mask = umask(022);
 
 		(void)fchmod(fd, DEFFILEMODE & ~mask);
 		(void)umask(mask);
 		return; 
 	}
 	sb = *sbp;
 
 	/* if the chown fails, remove set-id bits as-per compress(1) */
 	if (fchown(fd, sb.st_uid, sb.st_gid) < 0) {
 		if (errno != EPERM)
 			maybe_warn("couldn't fchown: %s", file);
 		sb.st_mode &= ~(S_ISUID|S_ISGID);
 	}
 
 	/* we only allow set-id and the 9 normal permission bits */
 	sb.st_mode &= S_ISUID | S_ISGID | S_IRWXU | S_IRWXG | S_IRWXO;
 	if (fchmod(fd, sb.st_mode) < 0)
 		maybe_warn("couldn't fchmod: %s", file);
 
 	times[0] = sb.st_atim;
 	times[1] = sb.st_mtim;
 	if (futimens(fd, times) < 0)
 		maybe_warn("couldn't futimens: %s", file);
 
 	/* only try flags if they exist already */
 	if (sb.st_flags != 0 && fchflags(fd, sb.st_flags) < 0)
 		maybe_warn("couldn't fchflags: %s", file);
 }
 #endif
 
 /* what sort of file is this? */
 static enum filetype
 file_gettype(u_char *buf)
 {
 
 	if (buf[0] == GZIP_MAGIC0 &&
 	    (buf[1] == GZIP_MAGIC1 || buf[1] == GZIP_OMAGIC1))
 		return FT_GZIP;
 	else
 #ifndef NO_BZIP2_SUPPORT
 	if (memcmp(buf, BZIP2_MAGIC, 3) == 0 &&
 	    buf[3] >= '0' && buf[3] <= '9')
 		return FT_BZIP2;
 	else
 #endif
 #ifndef NO_COMPRESS_SUPPORT
 	if (memcmp(buf, Z_MAGIC, 2) == 0)
 		return FT_Z;
 	else
 #endif
 #ifndef NO_PACK_SUPPORT
 	if (memcmp(buf, PACK_MAGIC, 2) == 0)
 		return FT_PACK;
 	else
 #endif
 #ifndef NO_XZ_SUPPORT
 	if (memcmp(buf, XZ_MAGIC, 4) == 0)	/* XXX: We only have 4 bytes */
 		return FT_XZ;
 	else
 #endif
 		return FT_UNKNOWN;
 }
 
 #ifndef SMALL
 /* check the outfile is OK. */
 static int
 check_outfile(const char *outfile)
 {
 	struct stat sb;
 	int ok = 1;
 
 	if (lflag == 0 && stat(outfile, &sb) == 0) {
 		if (fflag)
 			unlink(outfile);
 		else if (isatty(STDIN_FILENO)) {
 			char ans[10] = { 'n', '\0' };	/* default */
 
 			fprintf(stderr, "%s already exists -- do you wish to "
 					"overwrite (y or n)? " , outfile);
 			(void)fgets(ans, sizeof(ans) - 1, stdin);
 			if (ans[0] != 'y' && ans[0] != 'Y') {
 				fprintf(stderr, "\tnot overwriting\n");
 				ok = 0;
 			} else
 				unlink(outfile);
 		} else {
 			maybe_warnx("%s already exists -- skipping", outfile);
 			ok = 0;
 		}
 	}
 	return ok;
 }
 
 static void
 unlink_input(const char *file, const struct stat *sb)
 {
 	struct stat nsb;
 
 	if (kflag)
 		return;
 	if (stat(file, &nsb) != 0)
 		/* Must be gone already */
 		return;
 	if (nsb.st_dev != sb->st_dev || nsb.st_ino != sb->st_ino)
 		/* Definitely a different file */
 		return;
 	unlink(file);
 }
 
 static void
 sigint_handler(int signo __unused)
 {
 
 	if (remove_file != NULL)
 		unlink(remove_file);
 	_exit(2);
 }
 #endif
 
 static const suffixes_t *
 check_suffix(char *file, int xlate)
 {
 	const suffixes_t *s;
 	int len = strlen(file);
 	char *sp;
 
 	for (s = suffixes; s != suffixes + NUM_SUFFIXES; s++) {
 		/* if it doesn't fit in "a.suf", don't bother */
 		if (s->ziplen >= len)
 			continue;
 		sp = file + len - s->ziplen;
 		if (strcmp(s->zipped, sp) != 0)
 			continue;
 		if (xlate)
 			strcpy(sp, s->normal);
 		return s;
 	}
 	return NULL;
 }
 
 /*
  * compress the given file: create a corresponding .gz file and remove the
  * original.
  */
 static off_t
 file_compress(char *file, char *outfile, size_t outsize)
 {
 	int in;
 	int out;
 	off_t size, insize;
 #ifndef SMALL
 	struct stat isb, osb;
 	const suffixes_t *suff;
 #endif
 
 	in = open(file, O_RDONLY);
 	if (in == -1) {
 		maybe_warn("can't open %s", file);
 		return (-1);
 	}
 
 #ifndef SMALL
 	if (fstat(in, &isb) != 0) {
 		maybe_warn("couldn't stat: %s", file);
 		close(in);
 		return (-1);
 	}
 #endif
 
 	if (cflag == 0) {
 #ifndef SMALL
 		if (isb.st_nlink > 1 && fflag == 0) {
 			maybe_warnx("%s has %d other link%s -- skipping",
 			    file, isb.st_nlink - 1,
 			    (isb.st_nlink - 1) == 1 ? "" : "s");
 			close(in);
 			return (-1);
 		}
 
 		if (fflag == 0 && (suff = check_suffix(file, 0)) &&
 		    suff->zipped[0] != 0) {
 			maybe_warnx("%s already has %s suffix -- unchanged",
 			    file, suff->zipped);
 			close(in);
 			return (-1);
 		}
 #endif
 
 		/* Add (usually) .gz to filename */
 		if ((size_t)snprintf(outfile, outsize, "%s%s",
 		    file, suffixes[0].zipped) >= outsize)
 			memcpy(outfile + outsize - suffixes[0].ziplen - 1,
 			    suffixes[0].zipped, suffixes[0].ziplen + 1);
 
 #ifndef SMALL
 		if (check_outfile(outfile) == 0) {
 			close(in);
 			return (-1);
 		}
 #endif
 	}
 
 	if (cflag == 0) {
 		out = open(outfile, O_WRONLY | O_CREAT | O_EXCL, 0600);
 		if (out == -1) {
 			maybe_warn("could not create output: %s", outfile);
 			close(in);
 			return (-1);
 		}
 #ifndef SMALL
 		remove_file = outfile;
 #endif
 	} else
 		out = STDOUT_FILENO;
 
 	insize = gz_compress(in, out, &size, basename(file), (uint32_t)isb.st_mtime);
 
 	(void)close(in);
 
 	/*
 	 * If there was an error, insize will be -1.
 	 * If we compressed to stdout, just return the size.
 	 * Otherwise stat the file and check it is the correct size.
 	 * We only blow away the file if we can stat the output and it
 	 * has the expected size.
 	 */
 	if (cflag != 0)
 		return (insize == -1 ? -1 : size);
 
 #ifndef SMALL
 	if (fstat(out, &osb) != 0) {
 		maybe_warn("couldn't stat: %s", outfile);
 		goto bad_outfile;
 	}
 
 	if (osb.st_size != size) {
 		maybe_warnx("output file: %s wrong size (%ju != %ju), deleting",
 		    outfile, (uintmax_t)osb.st_size, (uintmax_t)size);
 		goto bad_outfile;
 	}
 
 	copymodes(out, &isb, outfile);
 	remove_file = NULL;
 #endif
 	if (close(out) == -1)
 		maybe_warn("couldn't close output");
 
 	/* output is good, ok to delete input */
 	unlink_input(file, &isb);
 	return (size);
 
 #ifndef SMALL
     bad_outfile:
 	if (close(out) == -1)
 		maybe_warn("couldn't close output");
 
 	maybe_warnx("leaving original %s", file);
 	unlink(outfile);
 	return (size);
 #endif
 }
 
 /* uncompress the given file and remove the original */
 static off_t
 file_uncompress(char *file, char *outfile, size_t outsize)
 {
 	struct stat isb, osb;
 	off_t size;
 	ssize_t rbytes;
 	unsigned char header1[4];
 	enum filetype method;
 	int fd, ofd, zfd = -1;
 #ifndef SMALL
 	ssize_t rv;
 	time_t timestamp = 0;
 	char name[PATH_MAX + 1];
 #endif
 
 	/* gather the old name info */
 
 	fd = open(file, O_RDONLY);
 	if (fd < 0) {
 		maybe_warn("can't open %s", file);
 		goto lose;
 	}
 
 	strlcpy(outfile, file, outsize);
 	if (check_suffix(outfile, 1) == NULL && !(cflag || lflag)) {
 		maybe_warnx("%s: unknown suffix -- ignored", file);
 		goto lose;
 	}
 
 	rbytes = read(fd, header1, sizeof header1);
 	if (rbytes != sizeof header1) {
 		/* we don't want to fail here. */
 #ifndef SMALL
 		if (fflag)
 			goto lose;
 #endif
 		if (rbytes == -1)
 			maybe_warn("can't read %s", file);
 		else
 			goto unexpected_EOF;
 		goto lose;
 	}
 
 	method = file_gettype(header1);
 #ifndef SMALL
 	if (fflag == 0 && method == FT_UNKNOWN) {
 		maybe_warnx("%s: not in gzip format", file);
 		goto lose;
 	}
 
 #endif
 
 #ifndef SMALL
 	if (method == FT_GZIP && Nflag) {
 		unsigned char ts[4];	/* timestamp */
 
 		rv = pread(fd, ts, sizeof ts, GZIP_TIMESTAMP);
 		if (rv >= 0 && rv < (ssize_t)(sizeof ts))
 			goto unexpected_EOF;
 		if (rv == -1) {
 			if (!fflag)
 				maybe_warn("can't read %s", file);
 			goto lose;
 		}
 		timestamp = ts[3] << 24 | ts[2] << 16 | ts[1] << 8 | ts[0];
 
 		if (header1[3] & ORIG_NAME) {
-			rbytes = pread(fd, name, sizeof name, GZIP_ORIGNAME);
+			rbytes = pread(fd, name, sizeof(name) - 1, GZIP_ORIGNAME);
 			if (rbytes < 0) {
 				maybe_warn("can't read %s", file);
 				goto lose;
 			}
-			if (name[0] != 0) {
+			if (name[0] != '\0') {
 				char *dp, *nf;
+
+				/* Make sure that name is NUL-terminated */
+				name[rbytes] = '\0';
 
 				/* strip saved directory name */
 				nf = strrchr(name, '/');
 				if (nf == NULL)
 					nf = name;
 				else
 					nf++;
 
 				/* preserve original directory name */
 				dp = strrchr(file, '/');
 				if (dp == NULL)
 					dp = file;
 				else
 					dp++;
 				snprintf(outfile, outsize, "%.*s%.*s",
 						(int) (dp - file), 
 						file, (int) rbytes, nf);
 			}
 		}
 	}
 #endif
 	lseek(fd, 0, SEEK_SET);
 
 	if (cflag == 0 || lflag) {
 		if (fstat(fd, &isb) != 0)
 			goto lose;
 #ifndef SMALL
 		if (isb.st_nlink > 1 && lflag == 0 && fflag == 0) {
 			maybe_warnx("%s has %d other links -- skipping",
 			    file, isb.st_nlink - 1);
 			goto lose;
 		}
 		if (nflag == 0 && timestamp)
 			isb.st_mtime = timestamp;
 		if (check_outfile(outfile) == 0)
 			goto lose;
 #endif
 	}
 
 	if (cflag == 0 && lflag == 0) {
 		zfd = open(outfile, O_WRONLY|O_CREAT|O_EXCL, 0600);
 		if (zfd == STDOUT_FILENO) {
 			/* We won't close STDOUT_FILENO later... */
 			zfd = dup(zfd);
 			close(STDOUT_FILENO);
 		}
 		if (zfd == -1) {
 			maybe_warn("can't open %s", outfile);
 			goto lose;
 		}
 #ifndef SMALL
 		remove_file = outfile;
 #endif
 	} else
 		zfd = STDOUT_FILENO;
 
 	switch (method) {
 #ifndef NO_BZIP2_SUPPORT
 	case FT_BZIP2:
 		/* XXX */
 		if (lflag) {
 			maybe_warnx("no -l with bzip2 files");
 			goto lose;
 		}
 
 		size = unbzip2(fd, zfd, NULL, 0, NULL);
 		break;
 #endif
 
 #ifndef NO_COMPRESS_SUPPORT
 	case FT_Z: {
 		FILE *in, *out;
 
 		/* XXX */
 		if (lflag) {
 			maybe_warnx("no -l with Lempel-Ziv files");
 			goto lose;
 		}
 
 		if ((in = zdopen(fd)) == NULL) {
 			maybe_warn("zdopen for read: %s", file);
 			goto lose;
 		}
 
 		out = fdopen(dup(zfd), "w");
 		if (out == NULL) {
 			maybe_warn("fdopen for write: %s", outfile);
 			fclose(in);
 			goto lose;
 		}
 
 		size = zuncompress(in, out, NULL, 0, NULL);
 		/* need to fclose() if ferror() is true... */
 		if (ferror(in) | fclose(in)) {
 			maybe_warn("failed infile fclose");
 			unlink(outfile);
 			(void)fclose(out);
 		}
 		if (fclose(out) != 0) {
 			maybe_warn("failed outfile fclose");
 			unlink(outfile);
 			goto lose;
 		}
 		break;
 	}
 #endif
 
 #ifndef NO_PACK_SUPPORT
 	case FT_PACK:
 		if (lflag) {
 			maybe_warnx("no -l with packed files");
 			goto lose;
 		}
 
 		size = unpack(fd, zfd, NULL, 0, NULL);
 		break;
 #endif
 
 #ifndef NO_XZ_SUPPORT
 	case FT_XZ:
 		if (lflag) {
 			maybe_warnx("no -l with xz files");
 			goto lose;
 		}
 
 		size = unxz(fd, zfd, NULL, 0, NULL);
 		break;
 #endif
 
 #ifndef SMALL
 	case FT_UNKNOWN:
 		if (lflag) {
 			maybe_warnx("no -l for unknown filetypes");
 			goto lose;
 		}
 		size = cat_fd(NULL, 0, NULL, fd);
 		break;
 #endif
 	default:
 		if (lflag) {
 			print_list(fd, isb.st_size, outfile, isb.st_mtime);
 			close(fd);
 			return -1;	/* XXX */
 		}
 
 		size = gz_uncompress(fd, zfd, NULL, 0, NULL, file);
 		break;
 	}
 
 	if (close(fd) != 0)
 		maybe_warn("couldn't close input");
 	if (zfd != STDOUT_FILENO && close(zfd) != 0)
 		maybe_warn("couldn't close output");
 
 	if (size == -1) {
 		if (cflag == 0)
 			unlink(outfile);
 		maybe_warnx("%s: uncompress failed", file);
 		return -1;
 	}
 
 	/* if testing, or we uncompressed to stdout, this is all we need */
 #ifndef SMALL
 	if (tflag)
 		return size;
 #endif
 	/* if we are uncompressing to stdout, don't remove the file. */
 	if (cflag)
 		return size;
 
 	/*
 	 * if we create a file...
 	 */
 	/*
 	 * if we can't stat the file don't remove the file.
 	 */
 
 	ofd = open(outfile, O_RDWR, 0);
 	if (ofd == -1) {
 		maybe_warn("couldn't open (leaving original): %s",
 			   outfile);
 		return -1;
 	}
 	if (fstat(ofd, &osb) != 0) {
 		maybe_warn("couldn't stat (leaving original): %s",
 			   outfile);
 		close(ofd);
 		return -1;
 	}
 	if (osb.st_size != size) {
 		maybe_warnx("stat gave different size: %ju != %ju (leaving original)",
 		    (uintmax_t)size, (uintmax_t)osb.st_size);
 		close(ofd);
 		unlink(outfile);
 		return -1;
 	}
 #ifndef SMALL
 	copymodes(ofd, &isb, outfile);
 	remove_file = NULL;
 #endif
 	close(ofd);
 	unlink_input(file, &isb);
 	return size;
 
     unexpected_EOF:
 	maybe_warnx("%s: unexpected end of file", file);
     lose:
 	if (fd != -1)
 		close(fd);
 	if (zfd != -1 && zfd != STDOUT_FILENO)
 		close(zfd);
 	return -1;
 }
 
 #ifndef SMALL
 static off_t
 cat_fd(unsigned char * prepend, size_t count, off_t *gsizep, int fd)
 {
 	char buf[BUFLEN];
 	off_t in_tot;
 	ssize_t w;
 
 	in_tot = count;
 	w = write(STDOUT_FILENO, prepend, count);
 	if (w == -1 || (size_t)w != count) {
 		maybe_warn("write to stdout");
 		return -1;
 	}
 	for (;;) {
 		ssize_t rv;
 
 		rv = read(fd, buf, sizeof buf);
 		if (rv == 0)
 			break;
 		if (rv < 0) {
 			maybe_warn("read from fd %d", fd);
 			break;
 		}
 
 		if (write(STDOUT_FILENO, buf, rv) != rv) {
 			maybe_warn("write to stdout");
 			break;
 		}
 		in_tot += rv;
 	}
 
 	if (gsizep)
 		*gsizep = in_tot;
 	return (in_tot);
 }
 #endif
 
 static void
 handle_stdin(void)
 {
 	unsigned char header1[4];
 	off_t usize, gsize;
 	enum filetype method;
 	ssize_t bytes_read;
 #ifndef NO_COMPRESS_SUPPORT
 	FILE *in;
 #endif
 
 #ifndef SMALL
 	if (fflag == 0 && lflag == 0 && isatty(STDIN_FILENO)) {
 		maybe_warnx("standard input is a terminal -- ignoring");
 		return;
 	}
 #endif
 
 	if (lflag) {
 		struct stat isb;
 
 		/* XXX could read the whole file, etc. */
 		if (fstat(STDIN_FILENO, &isb) < 0) {
 			maybe_warn("fstat");
 			return;
 		}
 		print_list(STDIN_FILENO, isb.st_size, "stdout", isb.st_mtime);
 		return;
 	}
 
 	bytes_read = read_retry(STDIN_FILENO, header1, sizeof header1);
 	if (bytes_read == -1) {
 		maybe_warn("can't read stdin");
 		return;
 	} else if (bytes_read != sizeof(header1)) {
 		maybe_warnx("(stdin): unexpected end of file");
 		return;
 	}
 
 	method = file_gettype(header1);
 	switch (method) {
 	default:
 #ifndef SMALL
 		if (fflag == 0) {
 			maybe_warnx("unknown compression format");
 			return;
 		}
 		usize = cat_fd(header1, sizeof header1, &gsize, STDIN_FILENO);
 		break;
 #endif
 	case FT_GZIP:
 		usize = gz_uncompress(STDIN_FILENO, STDOUT_FILENO, 
 			      (char *)header1, sizeof header1, &gsize, "(stdin)");
 		break;
 #ifndef NO_BZIP2_SUPPORT
 	case FT_BZIP2:
 		usize = unbzip2(STDIN_FILENO, STDOUT_FILENO,
 				(char *)header1, sizeof header1, &gsize);
 		break;
 #endif
 #ifndef NO_COMPRESS_SUPPORT
 	case FT_Z:
 		if ((in = zdopen(STDIN_FILENO)) == NULL) {
 			maybe_warnx("zopen of stdin");
 			return;
 		}
 
 		usize = zuncompress(in, stdout, (char *)header1,
 		    sizeof header1, &gsize);
 		fclose(in);
 		break;
 #endif
 #ifndef NO_PACK_SUPPORT
 	case FT_PACK:
 		usize = unpack(STDIN_FILENO, STDOUT_FILENO,
 			       (char *)header1, sizeof header1, &gsize);
 		break;
 #endif
 #ifndef NO_XZ_SUPPORT
 	case FT_XZ:
 		usize = unxz(STDIN_FILENO, STDOUT_FILENO,
 			     (char *)header1, sizeof header1, &gsize);
 		break;
 #endif
 	}
 
 #ifndef SMALL
 	if (vflag && !tflag && usize != -1 && gsize != -1)
 		print_verbage(NULL, NULL, usize, gsize);
 	if (vflag && tflag)
 		print_test("(stdin)", usize != -1);
 #endif 
 
 }
 
 static void
 handle_stdout(void)
 {
 	off_t gsize, usize;
 	struct stat sb;
 	time_t systime;
 	uint32_t mtime;
 	int ret;
 
 #ifndef SMALL
 	if (fflag == 0 && isatty(STDOUT_FILENO)) {
 		maybe_warnx("standard output is a terminal -- ignoring");
 		return;
 	}
 #endif
 	/* If stdin is a file use its mtime, otherwise use current time */
 	ret = fstat(STDIN_FILENO, &sb);
 
 #ifndef SMALL
 	if (ret < 0) {
 		maybe_warn("Can't stat stdin");
 		return;
 	}
 #endif
 
 	if (S_ISREG(sb.st_mode))
 		mtime = (uint32_t)sb.st_mtime;
 	else {
 		systime = time(NULL);
 #ifndef SMALL
 		if (systime == -1) {
 			maybe_warn("time");
 			return;
 		} 
 #endif
 		mtime = (uint32_t)systime;
 	}
 	 		
 	usize = gz_compress(STDIN_FILENO, STDOUT_FILENO, &gsize, "", mtime);
 #ifndef SMALL
 	if (vflag && !tflag && usize != -1 && gsize != -1)
 		print_verbage(NULL, NULL, usize, gsize);
 #endif 
 }
 
 /* do what is asked for, for the path name */
 static void
 handle_pathname(char *path)
 {
 	char *opath = path, *s = NULL;
 	ssize_t len;
 	int slen;
 	struct stat sb;
 
 	/* check for stdout/stdin */
 	if (path[0] == '-' && path[1] == '\0') {
 		if (dflag)
 			handle_stdin();
 		else
 			handle_stdout();
 		return;
 	}
 
 retry:
 	if (stat(path, &sb) != 0 || (fflag == 0 && cflag == 0 &&
 	    lstat(path, &sb) != 0)) {
 		/* let's try <path>.gz if we're decompressing */
 		if (dflag && s == NULL && errno == ENOENT) {
 			len = strlen(path);
 			slen = suffixes[0].ziplen;
 			s = malloc(len + slen + 1);
 			if (s == NULL)
 				maybe_err("malloc");
 			memcpy(s, path, len);
 			memcpy(s + len, suffixes[0].zipped, slen + 1);
 			path = s;
 			goto retry;
 		}
 		maybe_warn("can't stat: %s", opath);
 		goto out;
 	}
 
 	if (S_ISDIR(sb.st_mode)) {
 #ifndef SMALL
 		if (rflag)
 			handle_dir(path);
 		else
 #endif
 			maybe_warnx("%s is a directory", path);
 		goto out;
 	}
 
 	if (S_ISREG(sb.st_mode))
 		handle_file(path, &sb);
 	else
 		maybe_warnx("%s is not a regular file", path);
 
 out:
 	if (s)
 		free(s);
 }
 
 /* compress/decompress a file */
 static void
 handle_file(char *file, struct stat *sbp)
 {
 	off_t usize, gsize;
 	char	outfile[PATH_MAX];
 
 	infile = file;
 	if (dflag) {
 		usize = file_uncompress(file, outfile, sizeof(outfile));
 #ifndef SMALL
 		if (vflag && tflag)
 			print_test(file, usize != -1);
 #endif
 		if (usize == -1)
 			return;
 		gsize = sbp->st_size;
 	} else {
 		gsize = file_compress(file, outfile, sizeof(outfile));
 		if (gsize == -1)
 			return;
 		usize = sbp->st_size;
 	}
 
 
 #ifndef SMALL
 	if (vflag && !tflag)
 		print_verbage(file, (cflag) ? NULL : outfile, usize, gsize);
 #endif
 }
 
 #ifndef SMALL
 /* this is used with -r to recursively descend directories */
 static void
 handle_dir(char *dir)
 {
 	char *path_argv[2];
 	FTS *fts;
 	FTSENT *entry;
 
 	path_argv[0] = dir;
 	path_argv[1] = NULL;
 	fts = fts_open(path_argv, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
 	if (fts == NULL) {
 		warn("couldn't fts_open %s", dir);
 		return;
 	}
 
 	while ((entry = fts_read(fts))) {
 		switch(entry->fts_info) {
 		case FTS_D:
 		case FTS_DP:
 			continue;
 
 		case FTS_DNR:
 		case FTS_ERR:
 		case FTS_NS:
 			maybe_warn("%s", entry->fts_path);
 			continue;
 		case FTS_F:
 			handle_file(entry->fts_path, entry->fts_statp);
 		}
 	}
 	(void)fts_close(fts);
 }
 #endif
 
 /* print a ratio - size reduction as a fraction of uncompressed size */
 static void
 print_ratio(off_t in, off_t out, FILE *where)
 {
 	int percent10;	/* 10 * percent */
 	off_t diff;
 	char buff[8];
 	int len;
 
 	diff = in - out/2;
 	if (diff <= 0)
 		/*
 		 * Output is more than double size of input! print -99.9%
 		 * Quite possibly we've failed to get the original size.
 		 */
 		percent10 = -999;
 	else {
 		/*
 		 * We only need 12 bits of result from the final division,
 		 * so reduce the values until a 32bit division will suffice.
 		 */
 		while (in > 0x100000) {
 			diff >>= 1;
 			in >>= 1;
 		}
 		if (in != 0)
 			percent10 = ((u_int)diff * 2000) / (u_int)in - 1000;
 		else
 			percent10 = 0;
 	}
 
 	len = snprintf(buff, sizeof buff, "%2.2d.", percent10);
 	/* Move the '.' to before the last digit */
 	buff[len - 1] = buff[len - 2];
 	buff[len - 2] = '.';
 	fprintf(where, "%5s%%", buff);
 }
 
 #ifndef SMALL
 /* print compression statistics, and the new name (if there is one!) */
 static void
 print_verbage(const char *file, const char *nfile, off_t usize, off_t gsize)
 {
 	if (file)
 		fprintf(stderr, "%s:%s  ", file,
 		    strlen(file) < 7 ? "\t\t" : "\t");
 	print_ratio(usize, gsize, stderr);
 	if (nfile)
 		fprintf(stderr, " -- replaced with %s", nfile);
 	fprintf(stderr, "\n");
 	fflush(stderr);
 }
 
 /* print test results */
 static void
 print_test(const char *file, int ok)
 {
 
 	if (exit_value == 0 && ok == 0)
 		exit_value = 1;
 	fprintf(stderr, "%s:%s  %s\n", file,
 	    strlen(file) < 7 ? "\t\t" : "\t", ok ? "OK" : "NOT OK");
 	fflush(stderr);
 }
 #endif
 
 /* print a file's info ala --list */
 /* eg:
   compressed uncompressed  ratio uncompressed_name
       354841      1679360  78.8% /usr/pkgsrc/distfiles/libglade-2.0.1.tar
 */
 static void
 print_list(int fd, off_t out, const char *outfile, time_t ts)
 {
 	static int first = 1;
 #ifndef SMALL
 	static off_t in_tot, out_tot;
 	uint32_t crc = 0;
 #endif
 	off_t in = 0, rv;
 
 	if (first) {
 #ifndef SMALL
 		if (vflag)
 			printf("method  crc     date  time  ");
 #endif
 		if (qflag == 0)
 			printf("  compressed uncompressed  "
 			       "ratio uncompressed_name\n");
 	}
 	first = 0;
 
 	/* print totals? */
 #ifndef SMALL
 	if (fd == -1) {
 		in = in_tot;
 		out = out_tot;
 	} else
 #endif
 	{
 		/* read the last 8 bytes: the CRC32, then the uncompressed size */
 		rv = lseek(fd, (off_t)(-8), SEEK_END);
 		if (rv != -1) {
 			unsigned char buf[8];
 			uint32_t usize;
 
 			rv = read(fd, (char *)buf, sizeof(buf));
 			if (rv == -1)
 				maybe_warn("read of uncompressed size");
 			else if (rv != sizeof(buf))
 				maybe_warnx("read of uncompressed size");
 
 			else {
 				usize = buf[4] | buf[5] << 8 |
 					buf[6] << 16 | buf[7] << 24;
 				in = (off_t)usize;
 #ifndef SMALL
 				crc = buf[0] | buf[1] << 8 |
 				      buf[2] << 16 | buf[3] << 24;
 #endif
 			}
 		}
 	}
 
 #ifndef SMALL
 	if (vflag && fd == -1)
 		printf("                            ");
 	else if (vflag) {
 		char *date = ctime(&ts);
 
 		/* skip the day, 1/100th second, and year */
 		date += 4;
 		date[12] = 0;
 		printf("%5s %08x %11s ", "defla"/*XXX*/, crc, date);
 	}
 	in_tot += in;
 	out_tot += out;
 #else
 	(void)&ts;	/* XXX */
 #endif
 	printf("%12llu %12llu ", (unsigned long long)out, (unsigned long long)in);
 	print_ratio(in, out, stdout);
 	printf(" %s\n", outfile);
 }
 
 /* display the usage of NetBSD gzip */
 static void
 usage(void)
 {
 
 	fprintf(stderr, "%s\n", gzip_version);
 	fprintf(stderr,
 #ifdef SMALL
     "usage: %s [-" OPT_LIST "] [<file> [<file> ...]]\n",
 #else
     "usage: %s [-123456789acdfhklLNnqrtVv] [-S .suffix] [<file> [<file> ...]]\n"
     " -1 --fast            fastest (worst) compression\n"
     " -2 .. -8             set compression level\n"
     " -9 --best            best (slowest) compression\n"
     " -c --stdout          write to stdout, keep original files\n"
     "    --to-stdout\n"
     " -d --decompress      uncompress files\n"
     "    --uncompress\n"
     " -f --force           force overwriting & compress links\n"
     " -h --help            display this help\n"
     " -k --keep            don't delete input files during operation\n"
     " -l --list            list compressed file contents\n"
     " -N --name            save or restore original file name and time stamp\n"
     " -n --no-name         don't save original file name or time stamp\n"
     " -q --quiet           output no warnings\n"
     " -r --recursive       recursively compress files in directories\n"
     " -S .suf              use suffix .suf instead of .gz\n"
     "    --suffix .suf\n"
     " -t --test            test compressed file\n"
     " -V --version         display program version\n"
     " -v --verbose         print extra statistics\n",
 #endif
 	    getprogname());
 	exit(0);
 }
 
 #ifndef SMALL
 /* display the license information of FreeBSD gzip */
 static void
 display_license(void)
 {
 
 	fprintf(stderr, "%s (based on NetBSD gzip 20150113)\n", gzip_version);
 	fprintf(stderr, "%s\n", gzip_copyright);
 	exit(0);
 }
 #endif
 
 /* display the version of NetBSD gzip */
 static void
 display_version(void)
 {
 
 	fprintf(stderr, "%s\n", gzip_version);
 	exit(0);
 }
 
 #ifndef NO_BZIP2_SUPPORT
 #include "unbzip2.c"
 #endif
 #ifndef NO_COMPRESS_SUPPORT
 #include "zuncompress.c"
 #endif
 #ifndef NO_PACK_SUPPORT
 #include "unpack.c"
 #endif
 #ifndef NO_XZ_SUPPORT
 #include "unxz.c"
 #endif
 
 static ssize_t
 read_retry(int fd, void *buf, size_t sz)
 {
 	char *cp = buf;
 	size_t left = MIN(sz, (size_t) SSIZE_MAX);
 
 	while (left > 0) {
 		ssize_t ret;
 
 		ret = read(fd, cp, left);
 		if (ret == -1) {
 			return ret;
 		} else if (ret == 0) {
 			break; /* EOF */
 		}
 		cp += ret;
 		left -= ret;
 	}
 
 	return sz - left;
 }
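Reviewer note on print_list() above: the code seeks to the last eight bytes of the file and decodes two little-endian 32-bit fields, which matches the gzip member trailer (CRC32 followed by ISIZE). A minimal sketch of that decoding follows. The names gz_trailer, le32 and parse_gz_trailer are hypothetical and are not part of gzip.c. Unlike the original expression, the sketch casts each byte to uint32_t before shifting, so a top byte of 0x80 or above is never shifted as a signed int.

```c
#include <stdint.h>

/* Hypothetical container for the two trailer fields (not part of gzip.c). */
struct gz_trailer {
	uint32_t crc;	/* CRC-32 of the uncompressed data */
	uint32_t isize;	/* uncompressed size modulo 2^32 */
};

/* Decode a little-endian 32-bit value from four bytes. */
static uint32_t
le32(const unsigned char *p)
{
	/* Cast before shifting so p[3] << 24 is done in unsigned arithmetic. */
	return ((uint32_t)p[0] | (uint32_t)p[1] << 8 |
	    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
}

/* Split the last eight bytes of a gzip member into CRC32 and ISIZE. */
static struct gz_trailer
parse_gz_trailer(const unsigned char buf[8])
{
	struct gz_trailer t;

	t.crc = le32(buf);	/* bytes 0..3 */
	t.isize = le32(buf + 4);	/* bytes 4..7 */
	return (t);
}
```

Because ISIZE is stored modulo 2^32, the value is only a lower bound for inputs of 4 GiB or more, which is why the --list ratio can be misleading for very large files.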
Index: user/ngie/more-tests/usr.bin/iconv/iconv.c
===================================================================
--- user/ngie/more-tests/usr.bin/iconv/iconv.c	(revision 281584)
+++ user/ngie/more-tests/usr.bin/iconv/iconv.c	(revision 281585)
@@ -1,220 +1,219 @@
 /* $FreeBSD$ */
 /* $NetBSD: iconv.c,v 1.16 2009/02/20 15:28:21 yamt Exp $ */
 
 /*-
  * Copyright (c)2003 Citrus Project,
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 
 #include <err.h>
 #include <errno.h>
 #include <getopt.h>
 #include <iconv.h>
 #include <limits.h>
 #include <locale.h>
 #include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
 static int		do_conv(FILE *, const char *, const char *, bool, bool);
 static int		do_list(unsigned int, const char * const *, void *);
 static void		usage(void) __dead2;
 
 static const struct option long_options[] = {
 	{"from-code",		required_argument,	NULL, 'f'},
 	{"list",		no_argument,		NULL, 'l'},
 	{"silent",		no_argument,		NULL, 's'},
         {"to-code",		required_argument,	NULL, 't'},
         {NULL,                  no_argument,            NULL, 0}
 };
 
 static void
 usage(void)
 {
 	(void)fprintf(stderr,
 	    "Usage:\t%1$s [-cs] -f <from_code> -t <to_code> [file ...]\n"
 	    "\t%1$s -f <from_code> [-cs] [-t <to_code>] [file ...]\n"
 	    "\t%1$s -t <to_code> [-cs] [-f <from_code>] [file ...]\n"
 	    "\t%1$s -l\n", getprogname());
 	exit(1);
 }
 
 #define INBUFSIZE 1024
 #define OUTBUFSIZE (INBUFSIZE * 2)
 static int
 do_conv(FILE *fp, const char *from, const char *to, bool silent,
     bool hide_invalid)
 {
 	iconv_t cd;
-	char inbuf[INBUFSIZE], outbuf[OUTBUFSIZE], *out;
+	char inbuf[INBUFSIZE], outbuf[OUTBUFSIZE], *in, *out;
 	unsigned long long invalids;
-	const char *in;
 	size_t inbytes, outbytes, ret;
 
 	if ((cd = iconv_open(to, from)) == (iconv_t)-1)
 		err(EXIT_FAILURE, "iconv_open(%s, %s)", to, from);
 
 	if (hide_invalid) {
 		int arg = 1;
 
 		if (iconvctl(cd, ICONV_SET_DISCARD_ILSEQ, (void *)&arg) == -1)
 			err(EXIT_FAILURE, NULL);
 	}
 	invalids = 0;
 	while ((inbytes = fread(inbuf, 1, INBUFSIZE, fp)) > 0) {
 		in = inbuf;
 		while (inbytes > 0) {
 			size_t inval;
 
 			out = outbuf;
 			outbytes = OUTBUFSIZE;
 			ret = __iconv(cd, &in, &inbytes, &out, &outbytes,
 			    0, &inval);
 			invalids += inval;
 			if (outbytes < OUTBUFSIZE)
 				(void)fwrite(outbuf, 1, OUTBUFSIZE - outbytes,
 				    stdout);
 			if (ret == (size_t)-1 && errno != E2BIG) {
 				if (errno != EINVAL || in == inbuf)
 					err(EXIT_FAILURE, "iconv()");
 
 				/* incomplete input character */
 				(void)memmove(inbuf, in, inbytes);
 				ret = fread(inbuf + inbytes, 1,
 				    INBUFSIZE - inbytes, fp);
 				if (ret == 0) {
 					fflush(stdout);
 					if (feof(fp))
 						errx(EXIT_FAILURE,
 						    "unexpected end of file; "
 						    "the last character is "
 						    "incomplete.");
 					else
 						err(EXIT_FAILURE, "fread()");
 				}
 				in = inbuf;
 				inbytes += ret;
 			}
 		}
 	}
 	/* reset the shift state of the output buffer */
 	outbytes = OUTBUFSIZE;
 	out = outbuf;
 	ret = iconv(cd, NULL, NULL, &out, &outbytes);
 	if (ret == (size_t)-1)
 		err(EXIT_FAILURE, "iconv()");
 	if (outbytes < OUTBUFSIZE)
 		(void)fwrite(outbuf, 1, OUTBUFSIZE - outbytes, stdout);
 
 	if (invalids > 0 && !silent)
 		warnx("warning: invalid characters: %llu", invalids);
 
 	iconv_close(cd);
 	return (invalids > 0);
 }
 
 static int
 do_list(unsigned int n, const char * const *list, void *data __unused)
 {
 	unsigned int i;
 
 	for(i = 0; i < n; i++) {
 		printf("%s", list[i]);
 		if (i < n - 1)
 			printf(" ");
 	}
 	printf("\n");
 
 	return (1);
 }
 
 int
 main(int argc, char **argv)
 {
 	FILE *fp;
 	char *opt_f, *opt_t;
 	int ch, i, res;
 	bool opt_c = false, opt_s = false;
 
 	opt_f = opt_t = strdup("");
 
 	setlocale(LC_ALL, "");
 	setprogname(argv[0]);
 
 	while ((ch = getopt_long(argc, argv, "csLlf:t:",
 	    long_options, NULL)) != -1) {
 		switch (ch) {
 		case 'c':
 			opt_c = true;
 			break;
 		case 's':
 			opt_s = true;
 			break;
 		case 'l':
 			/* list */
 			if (opt_s || opt_c || strcmp(opt_f, "") != 0 ||
 			    strcmp(opt_t, "") != 0) {
 				warnx("-l is not allowed with other flags.");
 				usage();
 			}
 			iconvlist(do_list, NULL);
 			return (EXIT_SUCCESS);
 		case 'f':
 			/* from */
 			if (optarg != NULL)
 				opt_f = strdup(optarg);
 			break;
 		case 't':
 			/* to */
 			if (optarg != NULL)
 				opt_t = strdup(optarg);
 			break;
 		default:
 			usage();
 		}
 	}
 	argc -= optind;
 	argv += optind;
 	if ((strcmp(opt_f, "") == 0) && (strcmp(opt_t, "") == 0))
 		usage();
 	if (argc == 0)
 		res = do_conv(stdin, opt_f, opt_t, opt_s, opt_c);
 	else {
 		res = 0;
 		for (i = 0; i < argc; i++) {
 			fp = (strcmp(argv[i], "-") != 0) ?
 			    fopen(argv[i], "r") : stdin;
 			if (fp == NULL)
 				err(EXIT_FAILURE, "Cannot open `%s'",
 				    argv[i]);
 			res |= do_conv(fp, opt_f, opt_t, opt_s, opt_c);
 			(void)fclose(fp);
 		}
 	}
 	return (res == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
 }
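Reviewer note on the iconv.c change above: the input pointer changes from const char * to char *, which is the type the POSIX iconv() prototype expects for its second argument. The sketch below shows the same call pattern using only the standard iconv_open(), iconv() and iconv_close() calls, not the FreeBSD-specific __iconv() used in do_conv(). The helper name convert_buf and its parameters are hypothetical, and it assumes the whole input and output fit in caller-supplied buffers.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of iconv.c): convert srclen bytes of src
 * from encoding `from` to encoding `to`, writing at most dstlen bytes to
 * dst.  Returns the number of bytes written, or (size_t)-1 on failure.
 */
static size_t
convert_buf(const char *to, const char *from, const char *src, size_t srclen,
    char *dst, size_t dstlen)
{
	iconv_t cd;
	char *in, *out;
	size_t inleft, outleft, ret;

	if ((cd = iconv_open(to, from)) == (iconv_t)-1)
		return ((size_t)-1);

	/* POSIX iconv() takes char **, so the const is cast away here. */
	in = (char *)src;
	inleft = srclen;
	out = dst;
	outleft = dstlen;

	ret = iconv(cd, &in, &inleft, &out, &outleft);
	if (ret != (size_t)-1) {
		/* Flush any pending shift sequence, as do_conv() does. */
		ret = iconv(cd, NULL, NULL, &out, &outleft);
	}
	iconv_close(cd);

	return (ret == (size_t)-1 ? (size_t)-1 : dstlen - outleft);
}
```

A caller might convert the three UTF-8 bytes of "h\xc3\xa9" to UTF-16LE with convert_buf("UTF-16LE", "UTF-8", s, 3, out, sizeof(out)). Unlike do_conv(), this sketch treats E2BIG and EINVAL as plain failures instead of refilling the buffers and retrying.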
Index: user/ngie/more-tests/usr.bin/ipcs/ipc.c
===================================================================
--- user/ngie/more-tests/usr.bin/ipcs/ipc.c	(revision 281584)
+++ user/ngie/more-tests/usr.bin/ipcs/ipc.c	(revision 281585)
@@ -1,206 +1,205 @@
 /*
  * Copyright (c) 1994 SigmaSoft, Th. Lockert <tholo@sigmasoft.com>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. The name of the author may not be used to endorse or promote products
  *    derived from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
  * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
  * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL
  * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
  * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
  * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
  * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
  * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * The split of ipcs.c into ipcs.c and ipc.c to accommodate the
  * changes in ipcrm.c was done by Edwin Groothuis <edwin@FreeBSD.org>
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #define _KERNEL
 #include <sys/sem.h>
 #include <sys/shm.h>
 #include <sys/msg.h>
 #undef _KERNEL
 
 #include <assert.h>
 #include <err.h>
 #include <kvm.h>
 #include <nlist.h>
 #include <stddef.h>
 #include <stdio.h>
 
 #include "ipc.h"
 
 int	use_sysctl = 1;
 struct semid_kernel	*sema;
 struct seminfo		seminfo;
 struct msginfo		msginfo;
 struct msqid_kernel	*msqids;
 struct shminfo		shminfo;
 struct shmid_kernel	*shmsegs;
-void	kget(int idx, void *addr, size_t size);
 
 struct nlist symbols[] = {
 	{ .n_name = "sema" },
 	{ .n_name = "seminfo" },
 	{ .n_name = "msginfo" },
 	{ .n_name = "msqids" },
 	{ .n_name = "shminfo" },
 	{ .n_name = "shmsegs" },
 	{ .n_name = NULL }
 };
 
 #define	SHMINFO_XVEC	X(shmmax, sizeof(u_long))			\
 			X(shmmin, sizeof(u_long))			\
 			X(shmmni, sizeof(u_long))			\
 			X(shmseg, sizeof(u_long))			\
 			X(shmall, sizeof(u_long))
 
 #define	SEMINFO_XVEC	X(semmni, sizeof(int))				\
 			X(semmns, sizeof(int))				\
 			X(semmnu, sizeof(int))				\
 			X(semmsl, sizeof(int))				\
 			X(semopm, sizeof(int))				\
 			X(semume, sizeof(int))				\
 			X(semusz, sizeof(int))				\
 			X(semvmx, sizeof(int))				\
 			X(semaem, sizeof(int))
 
 #define	MSGINFO_XVEC	X(msgmax, sizeof(int))				\
 			X(msgmni, sizeof(int))				\
 			X(msgmnb, sizeof(int))				\
 			X(msgtql, sizeof(int))				\
 			X(msgssz, sizeof(int))				\
 			X(msgseg, sizeof(int))
 
 #define	X(a, b)	{ "kern.ipc." #a, offsetof(TYPEC, a), (b) },
 #define	TYPEC	struct shminfo
 static struct scgs_vector shminfo_scgsv[] = { SHMINFO_XVEC { .sysctl=NULL } };
 #undef	TYPEC
 #define	TYPEC	struct seminfo
 static struct scgs_vector seminfo_scgsv[] = { SEMINFO_XVEC { .sysctl=NULL } };
 #undef	TYPEC
 #define	TYPEC	struct msginfo
 static struct scgs_vector msginfo_scgsv[] = { MSGINFO_XVEC { .sysctl=NULL } };
 #undef	TYPEC
 #undef	X
 
 kvm_t *kd;
 
 void
 sysctlgatherstruct(void *addr, size_t size, struct scgs_vector *vecarr)
 {
 	struct scgs_vector *xp;
 	size_t tsiz;
 	int rv;
 
 	for (xp = vecarr; xp->sysctl != NULL; xp++) {
 		assert(xp->offset <= size);
 		tsiz = xp->size;
 		rv = sysctlbyname(xp->sysctl, (char *)addr + xp->offset,
 		    &tsiz, NULL, 0);
 		if (rv == -1)
 			err(1, "sysctlbyname: %s", xp->sysctl);
 		if (tsiz != xp->size)
 			errx(1, "%s size mismatch (expected %zu, got %zu)",
 			    xp->sysctl, xp->size, tsiz);
 	}
 }
 
 void
 kget(int idx, void *addr, size_t size)
 {
 	const char *symn;		/* symbol name */
 	size_t tsiz;
 	int rv;
 	unsigned long kaddr;
 	const char *sym2sysctl[] = {	/* symbol to sysctl name table */
 		"kern.ipc.sema",
 		"kern.ipc.seminfo",
 		"kern.ipc.msginfo",
 		"kern.ipc.msqids",
 		"kern.ipc.shminfo",
 		"kern.ipc.shmsegs" };
 
 	assert((unsigned)idx <= sizeof(sym2sysctl) / sizeof(*sym2sysctl));
 	if (!use_sysctl) {
 		symn = symbols[idx].n_name;
 		if (*symn == '_')
 			symn++;
 		if (symbols[idx].n_type == 0 || symbols[idx].n_value == 0)
 			errx(1, "symbol %s undefined", symn);
 		/*
 		 * For some symbols, the value we retrieve is
 		 * actually a pointer; since we want the actual value,
 		 * we have to manually dereference it.
 		 */
 		switch (idx) {
 		case X_MSQIDS:
 			tsiz = sizeof(msqids);
 			rv = kvm_read(kd, symbols[idx].n_value,
 			    &msqids, tsiz);
 			kaddr = (u_long)msqids;
 			break;
 		case X_SHMSEGS:
 			tsiz = sizeof(shmsegs);
 			rv = kvm_read(kd, symbols[idx].n_value,
 			    &shmsegs, tsiz);
 			kaddr = (u_long)shmsegs;
 			break;
 		case X_SEMA:
 			tsiz = sizeof(sema);
 			rv = kvm_read(kd, symbols[idx].n_value,
 			    &sema, tsiz);
 			kaddr = (u_long)sema;
 			break;
 		default:
 			rv = tsiz = 0;
 			kaddr = symbols[idx].n_value;
 			break;
 		}
 		if ((unsigned)rv != tsiz)
 			errx(1, "%s: %s", symn, kvm_geterr(kd));
 		if ((unsigned)kvm_read(kd, kaddr, addr, size) != size)
 			errx(1, "%s: %s", symn, kvm_geterr(kd));
 	} else {
 		switch (idx) {
 		case X_SHMINFO:
 			sysctlgatherstruct(addr, size, shminfo_scgsv);
 			break;
 		case X_SEMINFO:
 			sysctlgatherstruct(addr, size, seminfo_scgsv);
 			break;
 		case X_MSGINFO:
 			sysctlgatherstruct(addr, size, msginfo_scgsv);
 			break;
 		default:
 			tsiz = size;
 			rv = sysctlbyname(sym2sysctl[idx], addr, &tsiz,
 			    NULL, 0);
 			if (rv == -1)
 				err(1, "sysctlbyname: %s", sym2sysctl[idx]);
 			if (tsiz != size)
 				errx(1, "%s size mismatch "
 				    "(expected %zu, got %zu)",
 				    sym2sysctl[idx], size, tsiz);
 			break;
 		}
 	}
 }
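Reviewer note on ipc.c above: SHMINFO_XVEC, SEMINFO_XVEC and MSGINFO_XVEC use the X-macro idiom. Each field list is written once, and the definition of X() in force at the point of expansion decides what every entry becomes. The sketch below reproduces the idiom in isolation. struct limits, struct field_desc, LIMITS_XVEC and the "example.limits." prefix are all hypothetical stand-ins, not names from ipc.c or ipc.h.

```c
#include <stddef.h>

/* Hypothetical structure standing in for struct shminfo and friends. */
struct limits {
	unsigned long	max_bytes;
	int		max_ids;
};

/* Same shape as struct scgs_vector in ipc.h: name, offset, size. */
struct field_desc {
	const char	*name;
	size_t		offset;
	size_t		size;
};

/* The field list is written exactly once... */
#define	LIMITS_XVEC	X(max_bytes, sizeof(unsigned long))		\
			X(max_ids, sizeof(int))

/* ...and X() decides what each entry expands to. */
#define	X(a, b)	{ "example.limits." #a, offsetof(struct limits, a), (b) },
static const struct field_desc limits_desc[] = {
	LIMITS_XVEC
	{ NULL, 0, 0 }	/* terminator, like { .sysctl=NULL } in ipc.c */
};
#undef	X
```

A loop in the style of sysctlgatherstruct() can then walk limits_desc until it reaches the entry whose name is NULL, copying each value to base + offset.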
Index: user/ngie/more-tests/usr.bin/ipcs/ipc.h
===================================================================
--- user/ngie/more-tests/usr.bin/ipcs/ipc.h	(revision 281584)
+++ user/ngie/more-tests/usr.bin/ipcs/ipc.h	(revision 281585)
@@ -1,71 +1,68 @@
 /*
  * Copyright (c) 1994 SigmaSoft, Th. Lockert <tholo@sigmasoft.com>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. The name of the author may not be used to endorse or promote products
  *    derived from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
  * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
  * AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL
  * THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
  * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
  * PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
  * OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
  * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
  * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * The split of ipcs.c into ipcs.c and ipc.c to accommodate the
  * changes in ipcrm.c was done by Edwin Groothuis <edwin@FreeBSD.org>
  *
  * $FreeBSD$
  */
 
 /* Part of struct nlist symbols[] */
 #define X_SEMA		0
 #define X_SEMINFO	1
 #define X_MSGINFO	2
 #define X_MSQIDS	3
 #define X_SHMINFO	4
 #define X_SHMSEGS	5
 
 #define	SHMINFO		1
 #define	SHMTOTAL	2
 #define	MSGINFO		4
 #define	MSGTOTAL	8
 #define	SEMINFO		16
 #define	SEMTOTAL	32
 
 #define IPC_TO_STR(x) (x == 'Q' ? "msq" : (x == 'M' ? "shm" : "sem"))
 #define IPC_TO_STRING(x) (x == 'Q' ? "message queue" : \
 	    (x == 'M' ? "shared memory segment" : "semaphore"))
 
 /* SysCtlGatherStruct structure. */
 struct scgs_vector {
 	const char *sysctl;
 	size_t offset;
 	size_t size;
 };
 
 void	kget(int idx, void *addr, size_t size);
 void	sysctlgatherstruct(void *addr, size_t size, struct scgs_vector *vec);
 
 extern int use_sysctl;
 extern struct nlist symbols[];
 extern kvm_t *kd;
 
 extern struct semid_kernel	*sema;
-extern struct seminfo		seminfo;
-extern struct msginfo		msginfo;
 extern struct msqid_kernel	*msqids;
-extern struct shminfo		shminfo;
 extern struct shmid_kernel	*shmsegs;
Index: user/ngie/more-tests/usr.bin/lockf/lockf.c
===================================================================
--- user/ngie/more-tests/usr.bin/lockf/lockf.c	(revision 281584)
+++ user/ngie/more-tests/usr.bin/lockf/lockf.c	(revision 281585)
@@ -1,241 +1,241 @@
 /*
  * Copyright (C) 1997 John D. Polstra.  All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY JOHN D. POLSTRA AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL JOHN D. POLSTRA OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/wait.h>
 
 #include <err.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <sysexits.h>
 #include <unistd.h>
 
 static int acquire_lock(const char *name, int flags);
 static void cleanup(void);
 static void killed(int sig);
 static void timeout(int sig);
 static void usage(void);
 static void wait_for_lock(const char *name);
 
 static const char *lockname;
 static int lockfd = -1;
 static int keep;
 static volatile sig_atomic_t timed_out;
 
 /*
  * Execute an arbitrary command while holding a file lock.
  */
 int
 main(int argc, char **argv)
 {
 	int ch, flags, silent, status, waitsec;
 	pid_t child;
 
 	silent = keep = 0;
 	flags = O_CREAT;
 	waitsec = -1;	/* Infinite. */
 	while ((ch = getopt(argc, argv, "sknt:")) != -1) {
 		switch (ch) {
 		case 'k':
 			keep = 1;
 			break;
 		case 'n':
 			flags &= ~O_CREAT;
 			break;
 		case 's':
 			silent = 1;
 			break;
 		case 't':
 		{
 			char *endptr;
 			waitsec = strtol(optarg, &endptr, 0);
 			if (*optarg == '\0' || *endptr != '\0' || waitsec < 0)
 				errx(EX_USAGE,
 				    "invalid timeout \"%s\"", optarg);
 		}
 			break;
 		default:
 			usage();
 		}
 	}
 	if (argc - optind < 2)
 		usage();
 	lockname = argv[optind++];
 	argc -= optind;
 	argv += optind;
 	if (waitsec > 0) {		/* Set up a timeout. */
 		struct sigaction act;
 
 		act.sa_handler = timeout;
 		sigemptyset(&act.sa_mask);
 		act.sa_flags = 0;	/* Note that we do not set SA_RESTART. */
 		sigaction(SIGALRM, &act, NULL);
 		alarm(waitsec);
 	}
 	/*
 	 * If the "-k" option is not given, then we must not block when
 	 * acquiring the lock.  If we did, then the lock holder would
 	 * unlink the file upon releasing the lock, and we would acquire
 	 * a lock on a file with no directory entry.  Then another
 	 * process could come along and acquire the same lock.  To avoid
 	 * this problem, we separate out the actions of waiting for the
 	 * lock to be available and of actually acquiring the lock.
 	 *
 	 * That approach produces behavior that is technically correct;
 	 * however, it causes some performance & ordering problems for
 	 * locks that have a lot of contention.  First, it is unfair in
 	 * the sense that a released lock isn't necessarily granted to
 	 * the process that has been waiting the longest.  A waiter may
 	 * be starved out indefinitely.  Second, it creates a thundering
 	 * herd situation each time the lock is released.
 	 *
 	 * When the "-k" option is used, the unlink race no longer
 	 * exists.  In that case we can block while acquiring the lock,
 	 * avoiding the separate step of waiting for the lock.  This
 	 * yields fairness and improved performance.
 	 */
 	lockfd = acquire_lock(lockname, flags | O_NONBLOCK);
 	while (lockfd == -1 && !timed_out && waitsec != 0) {
 		if (keep)
 			lockfd = acquire_lock(lockname, flags);
 		else {
 			wait_for_lock(lockname);
 			lockfd = acquire_lock(lockname, flags | O_NONBLOCK);
 		}
 	}
 	if (waitsec > 0)
 		alarm(0);
 	if (lockfd == -1) {		/* We failed to acquire the lock. */
 		if (silent)
 			exit(EX_TEMPFAIL);
 		errx(EX_TEMPFAIL, "%s: already locked", lockname);
 	}
 	/* At this point, we own the lock. */
 	if (atexit(cleanup) == -1)
 		err(EX_OSERR, "atexit failed");
 	if ((child = fork()) == -1)
 		err(EX_OSERR, "cannot fork");
 	if (child == 0) {	/* The child process. */
 		close(lockfd);
 		execvp(argv[0], argv);
 		warn("%s", argv[0]);
 		_exit(1);
 	}
 	/* This is the parent process. */
 	signal(SIGINT, SIG_IGN);
 	signal(SIGQUIT, SIG_IGN);
 	signal(SIGTERM, killed);
 	if (waitpid(child, &status, 0) == -1)
 		err(EX_OSERR, "waitpid failed");
 	return (WIFEXITED(status) ? WEXITSTATUS(status) : EX_SOFTWARE);
 }
 
 /*
  * Try to acquire a lock on the given file, creating the file if
  * necessary.  The flags argument is O_NONBLOCK or 0, depending on
  * whether we should wait for the lock.  Returns an open file descriptor
  * on success, or -1 on failure.
  */
 static int
 acquire_lock(const char *name, int flags)
 {
 	int fd;
 
-	if ((fd = open(name, flags|O_RDONLY|O_EXLOCK|flags, 0666)) == -1) {
+	if ((fd = open(name, O_RDONLY|O_EXLOCK|flags, 0666)) == -1) {
 		if (errno == EAGAIN || errno == EINTR)
 			return (-1);
 		err(EX_CANTCREAT, "cannot open %s", name);
 	}
 	return (fd);
 }
 
 /*
  * Remove the lock file.
  */
 static void
 cleanup(void)
 {
 
 	if (keep)
 		flock(lockfd, LOCK_UN);
 	else
 		unlink(lockname);
 }
 
 /*
  * Signal handler for SIGTERM.  Cleans up the lock file, then re-raises
  * the signal.
  */
 static void
 killed(int sig)
 {
 
 	cleanup();
 	signal(sig, SIG_DFL);
 	if (kill(getpid(), sig) == -1)
 		err(EX_OSERR, "kill failed");
 }
 
 /*
  * Signal handler for SIGALRM.
  */
 static void
 timeout(int sig __unused)
 {
 
 	timed_out = 1;
 }
 
 static void
 usage(void)
 {
 
 	fprintf(stderr,
 	    "usage: lockf [-kns] [-t seconds] file command [arguments]\n");
 	exit(EX_USAGE);
 }
 
 /*
  * Wait until it might be possible to acquire a lock on the given file.
  * If the file does not exist, return immediately without creating it.
  */
 static void
 wait_for_lock(const char *name)
 {
 	int fd;
 
 	if ((fd = open(name, O_RDONLY|O_EXLOCK, 0666)) == -1) {
 		if (errno == ENOENT || errno == EINTR)
 			return;
 		err(EX_CANTCREAT, "cannot open %s", name);
 	}
 	close(fd);
 }
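Reviewer note on acquire_lock() above: the change only drops the duplicated flags operand, so behavior is unchanged. For readers on systems without the BSD-specific O_EXLOCK open flag, the following sketch shows a non-blocking exclusive lock taken with flock(2) after open(2). This is a swapped-in technique, not what lockf.c does, and the function name try_lock_file is hypothetical. Because open and lock are separate steps here, it does not address the unlink race described in the long comment in main().

```c
#include <sys/file.h>

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical variant of acquire_lock() that calls flock(2) after
 * open(2) instead of passing O_EXLOCK to open(2).  Returns a locked
 * descriptor, or -1 with errno set (EWOULDBLOCK if the lock is held
 * through another open of the same file).
 */
static int
try_lock_file(const char *name)
{
	int fd, saved;

	if ((fd = open(name, O_RDONLY | O_CREAT, 0666)) == -1)
		return (-1);
	if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
		saved = errno;	/* close() may overwrite errno */
		close(fd);
		errno = saved;
		return (-1);
	}
	return (fd);
}
```

Closing the returned descriptor releases the lock, which is the same property the child process relies on when it calls close(lockfd) before execvp().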
Index: user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c
===================================================================
--- user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/bhyve/bhyverun.c	(revision 281585)
@@ -1,885 +1,887 @@
 /*-
  * Copyright (c) 2011 NetApp, Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY NETAPP, INC ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL NETAPP, INC OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/mman.h>
 #include <sys/time.h>
 
 #include <machine/atomic.h>
 #include <machine/segments.h>
 
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <err.h>
 #include <libgen.h>
 #include <unistd.h>
 #include <assert.h>
 #include <errno.h>
 #include <pthread.h>
 #include <pthread_np.h>
 #include <sysexits.h>
 
 #include <machine/vmm.h>
 #include <vmmapi.h>
 
 #include "bhyverun.h"
 #include "acpi.h"
 #include "inout.h"
 #include "dbgport.h"
 #include "ioapic.h"
 #include "mem.h"
 #include "mevent.h"
 #include "mptbl.h"
 #include "pci_emul.h"
 #include "pci_irq.h"
 #include "pci_lpc.h"
 #include "smbiostbl.h"
 #include "xmsr.h"
 #include "spinup_ap.h"
 #include "rtc.h"
 
 #define GUEST_NIO_PORT		0x488	/* guest upcalls via i/o port */
 
 #define MB		(1024UL * 1024)
 #define GB		(1024UL * MB)
 
 typedef int (*vmexit_handler_t)(struct vmctx *, struct vm_exit *, int *vcpu);
 extern int vmexit_task_switch(struct vmctx *, struct vm_exit *, int *vcpu);
 
 char *vmname;
 
 int guest_ncpus;
 char *guest_uuid_str;
 
 static int guest_vmexit_on_hlt, guest_vmexit_on_pause;
 static int virtio_msix = 1;
 static int x2apic_mode = 0;	/* default is xAPIC */
 
 static int strictio;
 static int strictmsr = 1;
 
 static int acpi;
 
 static char *progname;
 static const int BSP = 0;
 
 static cpuset_t cpumask;
 
 static void vm_loop(struct vmctx *ctx, int vcpu, uint64_t rip);
 
 static struct vm_exit vmexit[VM_MAXCPU];
 
 struct bhyvestats {
         uint64_t        vmexit_bogus;
         uint64_t        vmexit_bogus_switch;
         uint64_t        vmexit_hlt;
         uint64_t        vmexit_pause;
         uint64_t        vmexit_mtrap;
         uint64_t        vmexit_inst_emul;
         uint64_t        cpu_switch_rotate;
         uint64_t        cpu_switch_direct;
 } stats;
 
 struct mt_vmm_info {
 	pthread_t	mt_thr;
 	struct vmctx	*mt_ctx;
 	int		mt_vcpu;	
 } mt_vmm_info[VM_MAXCPU];
 
 static cpuset_t *vcpumap[VM_MAXCPU] = { NULL };
 
 static void
 usage(int code)
 {
 
         fprintf(stderr,
                 "Usage: %s [-abehuwxACHPWY] [-c vcpus] [-g <gdb port>] [-l <lpc>]\n"
 		"       %*s [-m mem] [-p vcpu:hostcpu] [-s <pci>] [-U uuid] <vm>\n"
 		"       -a: local apic is in xAPIC mode (deprecated)\n"
 		"       -A: create ACPI tables\n"
 		"       -c: # cpus (default 1)\n"
 		"       -C: include guest memory in core file\n"
 		"       -e: exit on unhandled I/O access\n"
 		"       -g: gdb port\n"
 		"       -h: help\n"
 		"       -H: vmexit from the guest on hlt\n"
 		"       -l: LPC device configuration\n"
 		"       -m: memory size in MB\n"
 		"       -p: pin 'vcpu' to 'hostcpu'\n"
 		"       -P: vmexit from the guest on pause\n"
 		"       -s: <slot,driver,configinfo> PCI slot config\n"
 		"       -u: RTC keeps UTC time\n"
 		"       -U: uuid\n"
 		"       -w: ignore unimplemented MSRs\n"
 		"       -W: force virtio to use single-vector MSI\n"
 		"       -x: local apic is in x2APIC mode\n"
 		"       -Y: disable MPtable generation\n",
 		progname, (int)strlen(progname), "");
 
 	exit(code);
 }
 
 static int
 pincpu_parse(const char *opt)
 {
 	int vcpu, pcpu;
 
 	if (sscanf(opt, "%d:%d", &vcpu, &pcpu) != 2) {
 		fprintf(stderr, "invalid format: %s\n", opt);
 		return (-1);
 	}
 
 	if (vcpu < 0 || vcpu >= VM_MAXCPU) {
 		fprintf(stderr, "vcpu '%d' outside valid range from 0 to %d\n",
 		    vcpu, VM_MAXCPU - 1);
 		return (-1);
 	}
 
 	if (pcpu < 0 || pcpu >= CPU_SETSIZE) {
 		fprintf(stderr, "hostcpu '%d' outside valid range from "
 		    "0 to %d\n", pcpu, CPU_SETSIZE - 1);
 		return (-1);
 	}
 
 	if (vcpumap[vcpu] == NULL) {
 		if ((vcpumap[vcpu] = malloc(sizeof(cpuset_t))) == NULL) {
 			perror("malloc");
 			return (-1);
 		}
 		CPU_ZERO(vcpumap[vcpu]);
 	}
 	CPU_SET(pcpu, vcpumap[vcpu]);
 	return (0);
 }
 
 void
 vm_inject_fault(void *arg, int vcpu, int vector, int errcode_valid,
     int errcode)
 {
 	struct vmctx *ctx;
 	int error, restart_instruction;
 
 	ctx = arg;
 	restart_instruction = 1;
 
 	error = vm_inject_exception(ctx, vcpu, vector, errcode_valid, errcode,
 	    restart_instruction);
 	assert(error == 0);
 }
 
 void *
 paddr_guest2host(struct vmctx *ctx, uintptr_t gaddr, size_t len)
 {
 
 	return (vm_map_gpa(ctx, gaddr, len));
 }
 
 int
 fbsdrun_vmexit_on_pause(void)
 {
 
 	return (guest_vmexit_on_pause);
 }
 
 int
 fbsdrun_vmexit_on_hlt(void)
 {
 
 	return (guest_vmexit_on_hlt);
 }
 
 int
 fbsdrun_virtio_msix(void)
 {
 
 	return (virtio_msix);
 }
 
 static void *
 fbsdrun_start_thread(void *param)
 {
 	char tname[MAXCOMLEN + 1];
 	struct mt_vmm_info *mtp;
 	int vcpu;
 
 	mtp = param;
 	vcpu = mtp->mt_vcpu;
 
 	snprintf(tname, sizeof(tname), "vcpu %d", vcpu);
 	pthread_set_name_np(mtp->mt_thr, tname);
 
 	vm_loop(mtp->mt_ctx, vcpu, vmexit[vcpu].rip);
 
 	/* not reached */
 	exit(1);
 	return (NULL);
 }
 
 void
 fbsdrun_addcpu(struct vmctx *ctx, int fromcpu, int newcpu, uint64_t rip)
 {
 	int error;
 
 	assert(fromcpu == BSP);
 
 	/*
 	 * The 'newcpu' must be activated in the context of 'fromcpu'. If
 	 * vm_activate_cpu() is delayed until newcpu's pthread starts running
 	 * then vmm.ko is out-of-sync with bhyve and this can create a race
 	 * with vm_suspend().
 	 */
 	error = vm_activate_cpu(ctx, newcpu);
 	assert(error == 0);
 
 	CPU_SET_ATOMIC(newcpu, &cpumask);
 
 	/*
 	 * Set up the vmexit struct to allow execution to start
 	 * at the given RIP
 	 */
 	vmexit[newcpu].rip = rip;
 	vmexit[newcpu].inst_length = 0;
 
 	mt_vmm_info[newcpu].mt_ctx = ctx;
 	mt_vmm_info[newcpu].mt_vcpu = newcpu;
 
 	error = pthread_create(&mt_vmm_info[newcpu].mt_thr, NULL,
 	    fbsdrun_start_thread, &mt_vmm_info[newcpu]);
 	assert(error == 0);
 }
 
 static int
 fbsdrun_deletecpu(struct vmctx *ctx, int vcpu)
 {
 
 	if (!CPU_ISSET(vcpu, &cpumask)) {
 		fprintf(stderr, "Attempting to delete unknown cpu %d\n", vcpu);
 		exit(1);
 	}
 
 	CPU_CLR_ATOMIC(vcpu, &cpumask);
 	return (CPU_EMPTY(&cpumask));
 }
 
 static int
 vmexit_handle_notify(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu,
 		     uint32_t eax)
 {
 #if BHYVE_DEBUG
 	/*
 	 * put guest-driven debug here
 	 */
 #endif
         return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_inout(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu)
 {
 	int error;
 	int bytes, port, in, out, string;
 	int vcpu;
 
 	vcpu = *pvcpu;
 
 	port = vme->u.inout.port;
 	bytes = vme->u.inout.bytes;
 	string = vme->u.inout.string;
 	in = vme->u.inout.in;
 	out = !in;
 
         /* Extra-special case of host notifications */
         if (out && port == GUEST_NIO_PORT) {
                 error = vmexit_handle_notify(ctx, vme, pvcpu, vme->u.inout.eax);
 		return (error);
 	}
 
 	error = emulate_inout(ctx, vcpu, vme, strictio);
 	if (error) {
-		fprintf(stderr, "Unhandled %s%c 0x%04x\n", in ? "in" : "out",
-		    bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'), port);
+		fprintf(stderr, "Unhandled %s%c 0x%04x at 0x%lx\n",
+		    in ? "in" : "out",
+		    bytes == 1 ? 'b' : (bytes == 2 ? 'w' : 'l'),
+		    port, vme->rip);
 		return (VMEXIT_ABORT);
 	} else {
 		return (VMEXIT_CONTINUE);
 	}
 }
 
 static int
 vmexit_rdmsr(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu)
 {
 	uint64_t val;
 	uint32_t eax, edx;
 	int error;
 
 	val = 0;
 	error = emulate_rdmsr(ctx, *pvcpu, vme->u.msr.code, &val);
 	if (error != 0) {
 		fprintf(stderr, "rdmsr to register %#x on vcpu %d\n",
 		    vme->u.msr.code, *pvcpu);
 		if (strictmsr) {
 			vm_inject_gp(ctx, *pvcpu);
 			return (VMEXIT_CONTINUE);
 		}
 	}
 
 	eax = val;
 	error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RAX, eax);
 	assert(error == 0);
 
 	edx = val >> 32;
 	error = vm_set_register(ctx, *pvcpu, VM_REG_GUEST_RDX, edx);
 	assert(error == 0);
 
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_wrmsr(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu)
 {
 	int error;
 
 	error = emulate_wrmsr(ctx, *pvcpu, vme->u.msr.code, vme->u.msr.wval);
 	if (error != 0) {
 		fprintf(stderr, "wrmsr to register %#x(%#lx) on vcpu %d\n",
 		    vme->u.msr.code, vme->u.msr.wval, *pvcpu);
 		if (strictmsr) {
 			vm_inject_gp(ctx, *pvcpu);
 			return (VMEXIT_CONTINUE);
 		}
 	}
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_spinup_ap(struct vmctx *ctx, struct vm_exit *vme, int *pvcpu)
 {
 	int newcpu;
 	int retval = VMEXIT_CONTINUE;
 
 	newcpu = spinup_ap(ctx, *pvcpu,
 			   vme->u.spinup_ap.vcpu, vme->u.spinup_ap.rip);
 
 	return (retval);
 }
 
 #define	DEBUG_EPT_MISCONFIG
 #ifdef DEBUG_EPT_MISCONFIG
 #define	EXIT_REASON_EPT_MISCONFIG	49
 #define	VMCS_GUEST_PHYSICAL_ADDRESS	0x00002400
 #define	VMCS_IDENT(x)			((x) | 0x80000000)
 
 static uint64_t ept_misconfig_gpa, ept_misconfig_pte[4];
 static int ept_misconfig_ptenum;
 #endif
 
 static int
 vmexit_vmx(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	fprintf(stderr, "vm exit[%d]\n", *pvcpu);
 	fprintf(stderr, "\treason\t\tVMX\n");
 	fprintf(stderr, "\trip\t\t0x%016lx\n", vmexit->rip);
 	fprintf(stderr, "\tinst_length\t%d\n", vmexit->inst_length);
 	fprintf(stderr, "\tstatus\t\t%d\n", vmexit->u.vmx.status);
 	fprintf(stderr, "\texit_reason\t%u\n", vmexit->u.vmx.exit_reason);
 	fprintf(stderr, "\tqualification\t0x%016lx\n",
 	    vmexit->u.vmx.exit_qualification);
 	fprintf(stderr, "\tinst_type\t\t%d\n", vmexit->u.vmx.inst_type);
 	fprintf(stderr, "\tinst_error\t\t%d\n", vmexit->u.vmx.inst_error);
 #ifdef DEBUG_EPT_MISCONFIG
 	if (vmexit->u.vmx.exit_reason == EXIT_REASON_EPT_MISCONFIG) {
 		vm_get_register(ctx, *pvcpu,
 		    VMCS_IDENT(VMCS_GUEST_PHYSICAL_ADDRESS),
 		    &ept_misconfig_gpa);
 		vm_get_gpa_pmap(ctx, ept_misconfig_gpa, ept_misconfig_pte,
 		    &ept_misconfig_ptenum);
 		fprintf(stderr, "\tEPT misconfiguration:\n");
 		fprintf(stderr, "\t\tGPA: %#lx\n", ept_misconfig_gpa);
 		fprintf(stderr, "\t\tPTE(%d): %#lx %#lx %#lx %#lx\n",
 		    ept_misconfig_ptenum, ept_misconfig_pte[0],
 		    ept_misconfig_pte[1], ept_misconfig_pte[2],
 		    ept_misconfig_pte[3]);
 	}
 #endif	/* DEBUG_EPT_MISCONFIG */
 	return (VMEXIT_ABORT);
 }
 
 static int
 vmexit_svm(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	fprintf(stderr, "vm exit[%d]\n", *pvcpu);
 	fprintf(stderr, "\treason\t\tSVM\n");
 	fprintf(stderr, "\trip\t\t0x%016lx\n", vmexit->rip);
 	fprintf(stderr, "\tinst_length\t%d\n", vmexit->inst_length);
 	fprintf(stderr, "\texitcode\t%#lx\n", vmexit->u.svm.exitcode);
 	fprintf(stderr, "\texitinfo1\t%#lx\n", vmexit->u.svm.exitinfo1);
 	fprintf(stderr, "\texitinfo2\t%#lx\n", vmexit->u.svm.exitinfo2);
 	return (VMEXIT_ABORT);
 }
 
 static int
 vmexit_bogus(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	assert(vmexit->inst_length == 0);
 
 	stats.vmexit_bogus++;
 
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_hlt(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	stats.vmexit_hlt++;
 
 	/*
 	 * Just continue execution with the next instruction. We use
 	 * the HLT VM exit as a way to be friendly with the host
 	 * scheduler.
 	 */
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_pause(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	stats.vmexit_pause++;
 
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_mtrap(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 
 	assert(vmexit->inst_length == 0);
 
 	stats.vmexit_mtrap++;
 
 	return (VMEXIT_CONTINUE);
 }
 
 static int
 vmexit_inst_emul(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 	int err, i;
 	struct vie *vie;
 
 	stats.vmexit_inst_emul++;
 
 	vie = &vmexit->u.inst_emul.vie;
 	err = emulate_mem(ctx, *pvcpu, vmexit->u.inst_emul.gpa,
 	    vie, &vmexit->u.inst_emul.paging);
 
 	if (err) {
 		if (err == ESRCH) {
 			fprintf(stderr, "Unhandled memory access to 0x%lx\n",
 			    vmexit->u.inst_emul.gpa);
 		}
 
 		fprintf(stderr, "Failed to emulate instruction [");
 		for (i = 0; i < vie->num_valid; i++) {
 			fprintf(stderr, "0x%02x%s", vie->inst[i],
 			    i != (vie->num_valid - 1) ? " " : "");
 		}
 		fprintf(stderr, "] at 0x%lx\n", vmexit->rip);
 		return (VMEXIT_ABORT);
 	}
 
 	return (VMEXIT_CONTINUE);
 }
 
 static pthread_mutex_t resetcpu_mtx = PTHREAD_MUTEX_INITIALIZER;
 static pthread_cond_t resetcpu_cond = PTHREAD_COND_INITIALIZER;
 
 static int
 vmexit_suspend(struct vmctx *ctx, struct vm_exit *vmexit, int *pvcpu)
 {
 	enum vm_suspend_how how;
 
 	how = vmexit->u.suspended.how;
 
 	fbsdrun_deletecpu(ctx, *pvcpu);
 
 	if (*pvcpu != BSP) {
 		pthread_mutex_lock(&resetcpu_mtx);
 		pthread_cond_signal(&resetcpu_cond);
 		pthread_mutex_unlock(&resetcpu_mtx);
 		pthread_exit(NULL);
 	}
 
 	pthread_mutex_lock(&resetcpu_mtx);
 	while (!CPU_EMPTY(&cpumask)) {
 		pthread_cond_wait(&resetcpu_cond, &resetcpu_mtx);
 	}
 	pthread_mutex_unlock(&resetcpu_mtx);
 
 	switch (how) {
 	case VM_SUSPEND_RESET:
 		exit(0);
 	case VM_SUSPEND_POWEROFF:
 		exit(1);
 	case VM_SUSPEND_HALT:
 		exit(2);
 	case VM_SUSPEND_TRIPLEFAULT:
 		exit(3);
 	default:
 		fprintf(stderr, "vmexit_suspend: invalid reason %d\n", how);
 		exit(100);
 	}
 	return (0);	/* NOTREACHED */
 }
 
 static vmexit_handler_t handler[VM_EXITCODE_MAX] = {
 	[VM_EXITCODE_INOUT]  = vmexit_inout,
 	[VM_EXITCODE_INOUT_STR]  = vmexit_inout,
 	[VM_EXITCODE_VMX]    = vmexit_vmx,
 	[VM_EXITCODE_SVM]    = vmexit_svm,
 	[VM_EXITCODE_BOGUS]  = vmexit_bogus,
 	[VM_EXITCODE_RDMSR]  = vmexit_rdmsr,
 	[VM_EXITCODE_WRMSR]  = vmexit_wrmsr,
 	[VM_EXITCODE_MTRAP]  = vmexit_mtrap,
 	[VM_EXITCODE_INST_EMUL] = vmexit_inst_emul,
 	[VM_EXITCODE_SPINUP_AP] = vmexit_spinup_ap,
 	[VM_EXITCODE_SUSPENDED] = vmexit_suspend,
 	[VM_EXITCODE_TASK_SWITCH] = vmexit_task_switch,
 };
 
 static void
 vm_loop(struct vmctx *ctx, int vcpu, uint64_t startrip)
 {
 	int error, rc, prevcpu;
 	enum vm_exitcode exitcode;
 	cpuset_t active_cpus;
 
 	if (vcpumap[vcpu] != NULL) {
 		error = pthread_setaffinity_np(pthread_self(),
 		    sizeof(cpuset_t), vcpumap[vcpu]);
 		assert(error == 0);
 	}
 
 	error = vm_active_cpus(ctx, &active_cpus);
 	assert(CPU_ISSET(vcpu, &active_cpus));
 
 	error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RIP, startrip);
 	assert(error == 0);
 
 	while (1) {
 		error = vm_run(ctx, vcpu, &vmexit[vcpu]);
 		if (error != 0)
 			break;
 
 		prevcpu = vcpu;
 
 		exitcode = vmexit[vcpu].exitcode;
 		if (exitcode >= VM_EXITCODE_MAX || handler[exitcode] == NULL) {
 			fprintf(stderr, "vm_loop: unexpected exitcode 0x%x\n",
 			    exitcode);
 			exit(1);
 		}
 
                 rc = (*handler[exitcode])(ctx, &vmexit[vcpu], &vcpu);
 
 		switch (rc) {
 		case VMEXIT_CONTINUE:
 			break;
 		case VMEXIT_ABORT:
 			abort();
 		default:
 			exit(1);
 		}
 	}
 	fprintf(stderr, "vm_run error %d, errno %d\n", error, errno);
 }
 
 static int
 num_vcpus_allowed(struct vmctx *ctx)
 {
 	int tmp, error;
 
 	error = vm_get_capability(ctx, BSP, VM_CAP_UNRESTRICTED_GUEST, &tmp);
 
 	/*
 	 * The guest is allowed to spinup more than one processor only if the
 	 * UNRESTRICTED_GUEST capability is available.
 	 */
 	if (error == 0)
 		return (VM_MAXCPU);
 	else
 		return (1);
 }
 
 void
 fbsdrun_set_capabilities(struct vmctx *ctx, int cpu)
 {
 	int err, tmp;
 
 	if (fbsdrun_vmexit_on_hlt()) {
 		err = vm_get_capability(ctx, cpu, VM_CAP_HALT_EXIT, &tmp);
 		if (err < 0) {
 			fprintf(stderr, "VM exit on HLT not supported\n");
 			exit(1);
 		}
 		vm_set_capability(ctx, cpu, VM_CAP_HALT_EXIT, 1);
 		if (cpu == BSP)
 			handler[VM_EXITCODE_HLT] = vmexit_hlt;
 	}
 
         if (fbsdrun_vmexit_on_pause()) {
 		/*
 		 * pause exit support required for this mode
 		 */
 		err = vm_get_capability(ctx, cpu, VM_CAP_PAUSE_EXIT, &tmp);
 		if (err < 0) {
 			fprintf(stderr,
 			    "SMP mux requested, no pause support\n");
 			exit(1);
 		}
 		vm_set_capability(ctx, cpu, VM_CAP_PAUSE_EXIT, 1);
 		if (cpu == BSP)
 			handler[VM_EXITCODE_PAUSE] = vmexit_pause;
         }
 
 	if (x2apic_mode)
 		err = vm_set_x2apic_state(ctx, cpu, X2APIC_ENABLED);
 	else
 		err = vm_set_x2apic_state(ctx, cpu, X2APIC_DISABLED);
 
 	if (err) {
 		fprintf(stderr, "Unable to set x2apic state (%d)\n", err);
 		exit(1);
 	}
 
 	vm_set_capability(ctx, cpu, VM_CAP_ENABLE_INVPCID, 1);
 }
 
 int
 main(int argc, char *argv[])
 {
 	int c, error, gdb_port, err, bvmcons;
 	int dump_guest_memory, max_vcpus, mptgen;
 	int rtc_localtime;
 	struct vmctx *ctx;
 	uint64_t rip;
 	size_t memsize;
 
 	bvmcons = 0;
 	dump_guest_memory = 0;
 	progname = basename(argv[0]);
 	gdb_port = 0;
 	guest_ncpus = 1;
 	memsize = 256 * MB;
 	mptgen = 1;
 	rtc_localtime = 1;
 
 	while ((c = getopt(argc, argv, "abehuwxACHIPWYp:g:c:s:m:l:U:")) != -1) {
 		switch (c) {
 		case 'a':
 			x2apic_mode = 0;
 			break;
 		case 'A':
 			acpi = 1;
 			break;
 		case 'b':
 			bvmcons = 1;
 			break;
 		case 'p':
                         if (pincpu_parse(optarg) != 0) {
                             errx(EX_USAGE, "invalid vcpu pinning "
                                  "configuration '%s'", optarg);
                         }
 			break;
                 case 'c':
 			guest_ncpus = atoi(optarg);
 			break;
 		case 'C':
 			dump_guest_memory = 1;
 			break;
 		case 'g':
 			gdb_port = atoi(optarg);
 			break;
 		case 'l':
 			if (lpc_device_parse(optarg) != 0) {
 				errx(EX_USAGE, "invalid lpc device "
 				    "configuration '%s'", optarg);
 			}
 			break;
 		case 's':
 			if (pci_parse_slot(optarg) != 0)
 				exit(1);
 			else
 				break;
                 case 'm':
 			error = vm_parse_memsize(optarg, &memsize);
 			if (error)
 				errx(EX_USAGE, "invalid memsize '%s'", optarg);
 			break;
 		case 'H':
 			guest_vmexit_on_hlt = 1;
 			break;
 		case 'I':
 			/*
 			 * The "-I" option was used to add an ioapic to the
 			 * virtual machine.
 			 *
 			 * An ioapic is now provided unconditionally for each
 			 * virtual machine and this option is now deprecated.
 			 */
 			break;
 		case 'P':
 			guest_vmexit_on_pause = 1;
 			break;
 		case 'e':
 			strictio = 1;
 			break;
 		case 'u':
 			rtc_localtime = 0;
 			break;
 		case 'U':
 			guest_uuid_str = optarg;
 			break;
 		case 'w':
 			strictmsr = 0;
 			break;
 		case 'W':
 			virtio_msix = 0;
 			break;
 		case 'x':
 			x2apic_mode = 1;
 			break;
 		case 'Y':
 			mptgen = 0;
 			break;
 		case 'h':
 			usage(0);			
 		default:
 			usage(1);
 		}
 	}
 	argc -= optind;
 	argv += optind;
 
 	if (argc != 1)
 		usage(1);
 
 	vmname = argv[0];
 
 	ctx = vm_open(vmname);
 	if (ctx == NULL) {
 		perror("vm_open");
 		exit(1);
 	}
 
 	max_vcpus = num_vcpus_allowed(ctx);
 	if (guest_ncpus > max_vcpus) {
 		fprintf(stderr, "%d vCPUs requested but only %d available\n",
 			guest_ncpus, max_vcpus);
 		exit(1);
 	}
 
 	fbsdrun_set_capabilities(ctx, BSP);
 
 	if (dump_guest_memory)
 		vm_set_memflags(ctx, VM_MEM_F_INCORE);
 	err = vm_setup_memory(ctx, memsize, VM_MMAP_ALL);
 	if (err) {
 		fprintf(stderr, "Unable to setup memory (%d)\n", err);
 		exit(1);
 	}
 
 	error = init_msr();
 	if (error) {
 		fprintf(stderr, "init_msr error %d\n", error);
 		exit(1);
 	}
 
 	init_mem();
 	init_inout();
 	pci_irq_init(ctx);
 	ioapic_init(ctx);
 
 	rtc_init(ctx, rtc_localtime);
 	sci_init(ctx);
 
 	/*
 	 * Exit if a device emulation finds an error in its initialization
 	 */
 	if (init_pci(ctx) != 0)
 		exit(1);
 
 	if (gdb_port != 0)
 		init_dbgport(gdb_port);
 
 	if (bvmcons)
 		init_bvmcons();
 
 	error = vm_get_register(ctx, BSP, VM_REG_GUEST_RIP, &rip);
 	assert(error == 0);
 
 	/*
 	 * build the guest tables, MP etc.
 	 */
 	if (mptgen) {
 		error = mptable_build(ctx, guest_ncpus);
 		if (error)
 			exit(1);
 	}
 
 	error = smbios_build(ctx);
 	assert(error == 0);
 
 	if (acpi) {
 		error = acpi_build(ctx, guest_ncpus);
 		assert(error == 0);
 	}
 
 	/*
 	 * Change the proc title to include the VM name.
 	 */
 	setproctitle("%s", vmname); 
 	
 	/*
 	 * Add CPU 0
 	 */
 	fbsdrun_addcpu(ctx, BSP, BSP, rip);
 
 	/*
 	 * Head off to the main event dispatch loop
 	 */
 	mevent_dispatch();
 
 	exit(1);
 }
Index: user/ngie/more-tests/usr.sbin/bhyve
===================================================================
--- user/ngie/more-tests/usr.sbin/bhyve	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/bhyve	(revision 281585)

Property changes on: user/ngie/more-tests/usr.sbin/bhyve
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/usr.sbin/bhyve:r281414-281584
Index: user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c
===================================================================
--- user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/bhyvectl/bhyvectl.c	(revision 281585)
@@ -1,2142 +1,2142 @@
 /*-
  * Copyright (c) 2011 NetApp, Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY NETAPP, INC ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL NETAPP, INC OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/types.h>
 #include <sys/sysctl.h>
 #include <sys/errno.h>
 #include <sys/mman.h>
 
 #include <stdio.h>
 #include <stdlib.h>
 #include <stdbool.h>
 #include <string.h>
 #include <unistd.h>
 #include <libgen.h>
 #include <libutil.h>
 #include <fcntl.h>
 #include <string.h>
 #include <getopt.h>
 #include <time.h>
 #include <assert.h>
 
 #include <machine/cpufunc.h>
 #include <machine/vmm.h>
 #include <machine/specialreg.h>
 #include <vmmapi.h>
 
 #include "amd/vmcb.h"
 #include "intel/vmcs.h"
 
 #define	MB	(1UL << 20)
 #define	GB	(1UL << 30)
 
 #define	REQ_ARG		required_argument
 #define	NO_ARG		no_argument
 #define	OPT_ARG		optional_argument
 
 static const char *progname;
 
 static void
 usage(bool cpu_intel)
 {
 
 	(void)fprintf(stderr,
 	"Usage: %s --vm=<vmname>\n"
 	"       [--cpu=<vcpu_number>]\n"
 	"       [--create]\n"
 	"       [--destroy]\n"
 	"       [--get-all]\n"
 	"       [--get-stats]\n"
 	"       [--set-desc-ds]\n"
 	"       [--get-desc-ds]\n"
 	"       [--set-desc-es]\n"
 	"       [--get-desc-es]\n"
 	"       [--set-desc-gs]\n"
 	"       [--get-desc-gs]\n"
 	"       [--set-desc-fs]\n"
 	"       [--get-desc-fs]\n"
 	"       [--set-desc-cs]\n"
 	"       [--get-desc-cs]\n"
 	"       [--set-desc-ss]\n"
 	"       [--get-desc-ss]\n"
 	"       [--set-desc-tr]\n"
 	"       [--get-desc-tr]\n"
 	"       [--set-desc-ldtr]\n"
 	"       [--get-desc-ldtr]\n"
 	"       [--set-desc-gdtr]\n"
 	"       [--get-desc-gdtr]\n"
 	"       [--set-desc-idtr]\n"
 	"       [--get-desc-idtr]\n"
 	"       [--run]\n"
 	"       [--capname=<capname>]\n"
 	"       [--getcap]\n"
 	"       [--setcap=<0|1>]\n"
 	"       [--desc-base=<BASE>]\n"
 	"       [--desc-limit=<LIMIT>]\n"
 	"       [--desc-access=<ACCESS>]\n"
 	"       [--set-cr0=<CR0>]\n"
 	"       [--get-cr0]\n"
 	"       [--set-cr3=<CR3>]\n"
 	"       [--get-cr3]\n"
 	"       [--set-cr4=<CR4>]\n"
 	"       [--get-cr4]\n"
 	"       [--set-dr7=<DR7>]\n"
 	"       [--get-dr7]\n"
 	"       [--set-rsp=<RSP>]\n"
 	"       [--get-rsp]\n"
 	"       [--set-rip=<RIP>]\n"
 	"       [--get-rip]\n"
 	"       [--get-rax]\n"
 	"       [--set-rax=<RAX>]\n"
 	"       [--get-rbx]\n"
 	"       [--get-rcx]\n"
 	"       [--get-rdx]\n"
 	"       [--get-rsi]\n"
 	"       [--get-rdi]\n"
 	"       [--get-rbp]\n"
 	"       [--get-r8]\n"
 	"       [--get-r9]\n"
 	"       [--get-r10]\n"
 	"       [--get-r11]\n"
 	"       [--get-r12]\n"
 	"       [--get-r13]\n"
 	"       [--get-r14]\n"
 	"       [--get-r15]\n"
 	"       [--set-rflags=<RFLAGS>]\n"
 	"       [--get-rflags]\n"
 	"       [--set-cs]\n"
 	"       [--get-cs]\n"
 	"       [--set-ds]\n"
 	"       [--get-ds]\n"
 	"       [--set-es]\n"
 	"       [--get-es]\n"
 	"       [--set-fs]\n"
 	"       [--get-fs]\n"
 	"       [--set-gs]\n"
 	"       [--get-gs]\n"
 	"       [--set-ss]\n"
 	"       [--get-ss]\n"
 	"       [--get-tr]\n"
 	"       [--get-ldtr]\n"
 	"       [--set-x2apic-state=<state>]\n"
 	"       [--get-x2apic-state]\n"
 	"       [--unassign-pptdev=<bus/slot/func>]\n"
 	"       [--set-mem=<memory in units of MB>]\n"
 	"       [--get-lowmem]\n"
 	"       [--get-highmem]\n"
 	"       [--get-gpa-pmap]\n"
 	"       [--assert-lapic-lvt=<pin>]\n"
 	"       [--inject-nmi]\n"
 	"       [--force-reset]\n"
 	"       [--force-poweroff]\n"
 	"       [--get-rtc-time]\n"
 	"       [--set-rtc-time=<secs>]\n"
 	"       [--get-rtc-nvram]\n"
 	"       [--set-rtc-nvram=<val>]\n"
 	"       [--rtc-nvram-offset=<offset>]\n"
 	"       [--get-active-cpus]\n"
 	"       [--get-suspended-cpus]\n"
 	"       [--get-intinfo]\n"
 	"       [--get-eptp]\n"
 	"       [--set-exception-bitmap]\n"
 	"       [--get-exception-bitmap]\n"
 	"       [--get-tsc-offset]\n"
 	"       [--get-guest-pat]\n"
 	"       [--get-io-bitmap-address]\n"
 	"       [--get-msr-bitmap]\n"
 	"       [--get-msr-bitmap-address]\n"
 	"       [--get-guest-sysenter]\n"
 	"       [--get-exit-reason]\n",
 	progname);
 
 	if (cpu_intel) {
 		(void)fprintf(stderr,
 		"       [--get-vmcs-pinbased-ctls]\n"
 		"       [--get-vmcs-procbased-ctls]\n"
 		"       [--get-vmcs-procbased-ctls2]\n"
 		"       [--get-vmcs-entry-interruption-info]\n"
 		"       [--set-vmcs-entry-interruption-info=<info>]\n"
 		"       [--get-vmcs-guest-physical-address]\n"
 		"       [--get-vmcs-guest-linear-address]\n"
 		"       [--get-vmcs-host-pat]\n"
 		"       [--get-vmcs-host-cr0]\n"
 		"       [--get-vmcs-host-cr3]\n"
 		"       [--get-vmcs-host-cr4]\n"
 		"       [--get-vmcs-host-rip]\n"
 		"       [--get-vmcs-host-rsp]\n"
 		"       [--get-vmcs-cr0-mask]\n"
 		"       [--get-vmcs-cr0-shadow]\n"
 		"       [--get-vmcs-cr4-mask]\n"
 		"       [--get-vmcs-cr4-shadow]\n"
 		"       [--get-vmcs-cr3-targets]\n"
 		"       [--get-vmcs-apic-access-address]\n"
 		"       [--get-vmcs-virtual-apic-address]\n"
 		"       [--get-vmcs-tpr-threshold]\n"
 		"       [--get-vmcs-vpid]\n"
 		"       [--get-vmcs-instruction-error]\n"
 		"       [--get-vmcs-exit-ctls]\n"
 		"       [--get-vmcs-entry-ctls]\n"
 		"       [--get-vmcs-link]\n"
 		"       [--get-vmcs-exit-qualification]\n"
 		"       [--get-vmcs-exit-interruption-info]\n"
 		"       [--get-vmcs-exit-interruption-error]\n"
 		"       [--get-vmcs-interruptibility]\n"
 		);
 	} else {
 		(void)fprintf(stderr,
 		"       [--get-vmcb-intercepts]\n"
 		"       [--get-vmcb-asid]\n"
 		"       [--get-vmcb-exit-details]\n"
 		"       [--get-vmcb-tlb-ctrl]\n"
 		"       [--get-vmcb-virq]\n"
 		"       [--get-avic-apic-bar]\n"
 		"       [--get-avic-backing-page]\n"
 		"       [--get-avic-table]\n"
 		);
 	}
 	exit(1);
 }
 
 static int get_rtc_time, set_rtc_time;
 static int get_rtc_nvram, set_rtc_nvram;
 static int rtc_nvram_offset;
 static uint8_t rtc_nvram_value;
 static time_t rtc_secs;
 
 static int get_stats, getcap, setcap, capval, get_gpa_pmap;
 static int inject_nmi, assert_lapic_lvt;
 static int force_reset, force_poweroff;
 static const char *capname;
 static int create, destroy, get_lowmem, get_highmem;
 static int get_intinfo;
 static int get_active_cpus, get_suspended_cpus;
 static uint64_t memsize;
 static int set_cr0, get_cr0, set_cr3, get_cr3, set_cr4, get_cr4;
 static int set_efer, get_efer;
 static int set_dr7, get_dr7;
 static int set_rsp, get_rsp, set_rip, get_rip, set_rflags, get_rflags;
 static int set_rax, get_rax;
 static int get_rbx, get_rcx, get_rdx, get_rsi, get_rdi, get_rbp;
 static int get_r8, get_r9, get_r10, get_r11, get_r12, get_r13, get_r14, get_r15;
 static int set_desc_ds, get_desc_ds;
 static int set_desc_es, get_desc_es;
 static int set_desc_fs, get_desc_fs;
 static int set_desc_gs, get_desc_gs;
 static int set_desc_cs, get_desc_cs;
 static int set_desc_ss, get_desc_ss;
 static int set_desc_gdtr, get_desc_gdtr;
 static int set_desc_idtr, get_desc_idtr;
 static int set_desc_tr, get_desc_tr;
 static int set_desc_ldtr, get_desc_ldtr;
 static int set_cs, set_ds, set_es, set_fs, set_gs, set_ss, set_tr, set_ldtr;
 static int get_cs, get_ds, get_es, get_fs, get_gs, get_ss, get_tr, get_ldtr;
 static int set_x2apic_state, get_x2apic_state;
 enum x2apic_state x2apic_state;
 static int unassign_pptdev, bus, slot, func;
 static int run;
 
 /*
  * VMCB specific.
  */
 static int get_vmcb_intercept, get_vmcb_exit_details, get_vmcb_tlb_ctrl;
 static int get_vmcb_virq, get_avic_table;
 
 /*
  * VMCS-specific fields
  */
 static int get_pinbased_ctls, get_procbased_ctls, get_procbased_ctls2;
 static int get_eptp, get_io_bitmap, get_tsc_offset;
 static int get_vmcs_entry_interruption_info, set_vmcs_entry_interruption_info;
 static int get_vmcs_interruptibility;
 uint32_t vmcs_entry_interruption_info;
 static int get_vmcs_gpa, get_vmcs_gla;
 static int get_exception_bitmap, set_exception_bitmap, exception_bitmap;
 static int get_cr0_mask, get_cr0_shadow;
 static int get_cr4_mask, get_cr4_shadow;
 static int get_cr3_targets;
 static int get_apic_access_addr, get_virtual_apic_addr, get_tpr_threshold;
 static int get_msr_bitmap, get_msr_bitmap_address;
 static int get_vpid_asid;
 static int get_inst_err, get_exit_ctls, get_entry_ctls;
 static int get_host_cr0, get_host_cr3, get_host_cr4;
 static int get_host_rip, get_host_rsp;
 static int get_guest_pat, get_host_pat;
 static int get_guest_sysenter, get_vmcs_link;
 static int get_exit_reason, get_vmcs_exit_qualification;
 static int get_vmcs_exit_interruption_info, get_vmcs_exit_interruption_error;
 
 static uint64_t desc_base;
 static uint32_t desc_limit, desc_access;
 
 static int get_all;
 
 static void
 dump_vm_run_exitcode(struct vm_exit *vmexit, int vcpu)
 {
 	printf("vm exit[%d]\n", vcpu);
 	printf("\trip\t\t0x%016lx\n", vmexit->rip);
 	printf("\tinst_length\t%d\n", vmexit->inst_length);
 	switch (vmexit->exitcode) {
 	case VM_EXITCODE_INOUT:
 		printf("\treason\t\tINOUT\n");
 		printf("\tdirection\t%s\n", vmexit->u.inout.in ? "IN" : "OUT");
 		printf("\tbytes\t\t%d\n", vmexit->u.inout.bytes);
 		printf("\tflags\t\t%s%s\n",
 			vmexit->u.inout.string ? "STRING " : "",
 			vmexit->u.inout.rep ? "REP " : "");
 		printf("\tport\t\t0x%04x\n", vmexit->u.inout.port);
 		printf("\teax\t\t0x%08x\n", vmexit->u.inout.eax);
 		break;
 	case VM_EXITCODE_VMX:
 		printf("\treason\t\tVMX\n");
 		printf("\tstatus\t\t%d\n", vmexit->u.vmx.status);
 		printf("\texit_reason\t0x%08x (%u)\n",
 		    vmexit->u.vmx.exit_reason, vmexit->u.vmx.exit_reason);
 		printf("\tqualification\t0x%016lx\n",
 			vmexit->u.vmx.exit_qualification);
 		printf("\tinst_type\t\t%d\n", vmexit->u.vmx.inst_type);
 		printf("\tinst_error\t\t%d\n", vmexit->u.vmx.inst_error);
 		break;
 	case VM_EXITCODE_SVM:
 		printf("\treason\t\tSVM\n");
 		printf("\texit_reason\t\t%#lx\n", vmexit->u.svm.exitcode);
 		printf("\texitinfo1\t\t%#lx\n", vmexit->u.svm.exitinfo1);
 		printf("\texitinfo2\t\t%#lx\n", vmexit->u.svm.exitinfo2);
 		break;
 	default:
 		printf("*** unknown vm run exitcode %d\n", vmexit->exitcode);
 		break;
 	}
 }
 
 /* AMD 6th generation and Intel compatible MSRs */
 #define MSR_AMD6TH_START	0xC0000000
 #define MSR_AMD6TH_END		0xC0001FFF
 /* AMD 7th and 8th generation compatible MSRs */
 #define MSR_AMD7TH_START	0xC0010000
 #define MSR_AMD7TH_END		0xC0011FFF
 
 static const char *
 msr_name(uint32_t msr)
 {
 	static char buf[32];
 
 	switch(msr) {
 	case MSR_TSC:
 		return ("MSR_TSC");
 	case MSR_EFER:
 		return ("MSR_EFER");
 	case MSR_STAR:
 		return ("MSR_STAR");
 	case MSR_LSTAR:	
 		return ("MSR_LSTAR");
 	case MSR_CSTAR:
 		return ("MSR_CSTAR");
 	case MSR_SF_MASK:
 		return ("MSR_SF_MASK");
 	case MSR_FSBASE:
 		return ("MSR_FSBASE");
 	case MSR_GSBASE:
 		return ("MSR_GSBASE");
 	case MSR_KGSBASE:
 		return ("MSR_KGSBASE");
 	case MSR_SYSENTER_CS_MSR:
 		return ("MSR_SYSENTER_CS_MSR");
 	case MSR_SYSENTER_ESP_MSR:
 		return ("MSR_SYSENTER_ESP_MSR");
 	case MSR_SYSENTER_EIP_MSR:
 		return ("MSR_SYSENTER_EIP_MSR");
 	case MSR_PAT:
 		return ("MSR_PAT");
 	}
 	snprintf(buf, sizeof(buf), "MSR       %#08x", msr);
 
 	return (buf);
 }
 
 static inline void
 print_msr_pm(uint64_t msr, int vcpu, int readable, int writeable)
 {
 
 	if (readable || writeable) {
 		printf("%-20s[%d]\t\t%c%c\n", msr_name(msr), vcpu,
 			readable ? 'R' : '-', writeable ? 'W' : '-');
 	}
 }
 
 /*
  * Reference APM vol2, section 15.11 MSR Intercepts.
  */
 static void
 dump_amd_msr_pm(const char *bitmap, int vcpu)
 {
 	int byte, bit, readable, writeable;
 	uint32_t msr;
 
 	for (msr = 0; msr < 0x2000; msr++) {
 		byte = msr / 4;
 		bit = (msr % 4) * 2;
 
 		/* Look at MSRs in the range 0x00000000 to 0x00001FFF */
 		readable = (bitmap[byte] & (1 << bit)) ? 0 : 1;
 		writeable = (bitmap[byte] & (2 << bit)) ?  0 : 1;
 		print_msr_pm(msr, vcpu, readable, writeable);
 
 		/* Look at MSRs in the range 0xC0000000 to 0xC0001FFF */
 		byte += 2048;
 		readable = (bitmap[byte] & (1 << bit)) ? 0 : 1;
 		writeable = (bitmap[byte] & (2 << bit)) ?  0 : 1;
 		print_msr_pm(msr + MSR_AMD6TH_START, vcpu, readable,
 				writeable);
 		
 		/* MSRs 0xC0010000 to 0xC0011FFF are AMD-only */
 		byte += 4096;
 		readable = (bitmap[byte] & (1 << bit)) ? 0 : 1;
 		writeable = (bitmap[byte] & (2 << bit)) ?  0 : 1;
 		print_msr_pm(msr + MSR_AMD7TH_START, vcpu, readable,
 				writeable);
 	}
 }
 
 /*
  * Reference Intel SDM Vol3 Section 24.6.9 MSR-Bitmap Address
  */
 static void
 dump_intel_msr_pm(const char *bitmap, int vcpu)
 {
 	int byte, bit, readable, writeable;
 	uint32_t msr;
 
 	for (msr = 0; msr < 0x2000; msr++) {
 		byte = msr / 8;
 		bit = msr & 0x7;
 
 		/* Look at MSRs in the range 0x00000000 to 0x00001FFF */
 		readable = (bitmap[byte] & (1 << bit)) ? 0 : 1;
 		writeable = (bitmap[2048 + byte] & (1 << bit)) ?  0 : 1;
 		print_msr_pm(msr, vcpu, readable, writeable);
 
 		/* Look at MSRs in the range 0xC0000000 to 0xC0001FFF */
 		byte += 1024;
 		readable = (bitmap[byte] & (1 << bit)) ? 0 : 1;
 		writeable = (bitmap[2048 + byte] & (1 << bit)) ?  0 : 1;
 		print_msr_pm(msr + MSR_AMD6TH_START, vcpu, readable,
 				writeable);
 	}
 }
 
 static int
 dump_msr_bitmap(int vcpu, uint64_t addr, bool cpu_intel)
 {
 	int error, fd, map_size;
 	const char *bitmap;
 
 	error = -1;
 	bitmap = MAP_FAILED;
 
 	fd = open("/dev/mem", O_RDONLY, 0);
 	if (fd < 0) {
 		perror("Couldn't open /dev/mem");
 		goto done;
 	}
 
 	if (cpu_intel)
 		map_size = PAGE_SIZE;
 	else
 		map_size = 2 * PAGE_SIZE;
 
 	bitmap = mmap(NULL, map_size, PROT_READ, MAP_SHARED, fd, addr);
 	if (bitmap == MAP_FAILED) {
 		perror("mmap failed");
 		goto done;
 	}
 	
 	if (cpu_intel)
 		dump_intel_msr_pm(bitmap, vcpu);
 	else	
 		dump_amd_msr_pm(bitmap, vcpu);
 
 	error = 0;
 done:
 	if (bitmap != MAP_FAILED)
 		munmap((void *)bitmap, map_size);
 	if (fd >= 0)
 		close(fd);
 
 	return (error);
 }
 
 static int
 vm_get_vmcs_field(struct vmctx *ctx, int vcpu, int field, uint64_t *ret_val)
 {
 
 	return (vm_get_register(ctx, vcpu, VMCS_IDENT(field), ret_val));
 }
 
 static int
 vm_set_vmcs_field(struct vmctx *ctx, int vcpu, int field, uint64_t val)
 {
 
 	return (vm_set_register(ctx, vcpu, VMCS_IDENT(field), val));
 }
 
 static int
 vm_get_vmcb_field(struct vmctx *ctx, int vcpu, int off, int bytes,
 	uint64_t *ret_val)
 {
 
 	return (vm_get_register(ctx, vcpu, VMCB_ACCESS(off, bytes), ret_val));
 }
 
 static int
 vm_set_vmcb_field(struct vmctx *ctx, int vcpu, int off, int bytes,
 	uint64_t val)
 {
 	
 	return (vm_set_register(ctx, vcpu, VMCB_ACCESS(off, bytes), val));
 }
 
 enum {
 	VMNAME = 1000,	/* avoid collision with return values from getopt */
 	VCPU,
 	SET_MEM,
 	SET_EFER,
 	SET_CR0,
 	SET_CR3,
 	SET_CR4,
 	SET_DR7,
 	SET_RSP,
 	SET_RIP,
 	SET_RAX,
 	SET_RFLAGS,
 	DESC_BASE,
 	DESC_LIMIT,
 	DESC_ACCESS,
 	SET_CS,
 	SET_DS,
 	SET_ES,
 	SET_FS,
 	SET_GS,
 	SET_SS,
 	SET_TR,
 	SET_LDTR,
 	SET_X2APIC_STATE,
 	SET_EXCEPTION_BITMAP,
 	SET_VMCS_ENTRY_INTERRUPTION_INFO,
 	SET_CAP,
 	CAPNAME,
 	UNASSIGN_PPTDEV,
 	GET_GPA_PMAP,
 	ASSERT_LAPIC_LVT,
 	SET_RTC_TIME,
 	SET_RTC_NVRAM,
 	RTC_NVRAM_OFFSET,
 };
 
 static void
 print_cpus(const char *banner, const cpuset_t *cpus)
 {
 	int i, first;
 
 	first = 1;
 	printf("%s:\t", banner);
 	if (!CPU_EMPTY(cpus)) {
 		for (i = 0; i < CPU_SETSIZE; i++) {
 			if (CPU_ISSET(i, cpus)) {
 				printf("%s%d", first ? " " : ", ", i);
 				first = 0;
 			}
 		}
 	} else
 		printf(" (none)");
 	printf("\n");
 }
 
 static void
 print_intinfo(const char *banner, uint64_t info)
 {
 	int type;
 
 	printf("%s:\t", banner);
 	if (info & VM_INTINFO_VALID) {
 		type = info & VM_INTINFO_TYPE;
 		switch (type) {
 		case VM_INTINFO_HWINTR:
 			printf("extint");
 			break;
 		case VM_INTINFO_NMI:
 			printf("nmi");
 			break;
 		case VM_INTINFO_SWINTR:
 			printf("swint");
 			break;
 		default:
 			printf("exception");
 			break;
 		}
 		printf(" vector %d", (int)VM_INTINFO_VECTOR(info));
 		if (info & VM_INTINFO_DEL_ERRCODE)
 			printf(" errcode %#x", (u_int)(info >> 32));
 	} else {
 		printf("n/a");
 	}
 	printf("\n");
 }
 
 static bool
 cpu_vendor_intel(void)
 {
 	u_int regs[4];
 	char cpu_vendor[13];
 
 	do_cpuid(0, regs);
 	((u_int *)&cpu_vendor)[0] = regs[1];
 	((u_int *)&cpu_vendor)[1] = regs[3];
 	((u_int *)&cpu_vendor)[2] = regs[2];
 	cpu_vendor[12] = '\0';
 
 	if (strcmp(cpu_vendor, "AuthenticAMD") == 0) {
 		return (false);
 	} else if (strcmp(cpu_vendor, "GenuineIntel") == 0) {
 		return (true);
 	} else {
 		fprintf(stderr, "Unknown cpu vendor \"%s\"\n", cpu_vendor);
 		exit(1);
 	}
 }
 
 static int
 get_all_registers(struct vmctx *ctx, int vcpu)
 {
 	uint64_t cr0, cr3, cr4, dr7, rsp, rip, rflags, efer;
 	uint64_t rax, rbx, rcx, rdx, rsi, rdi, rbp;
 	uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
-	int error;
+	int error = 0;
 
-	if (get_efer || get_all) {
+	if (!error && (get_efer || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_EFER, &efer);
 		if (error == 0)
 			printf("efer[%d]\t\t0x%016lx\n", vcpu, efer);
 	}
 
 	if (!error && (get_cr0 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR0, &cr0);
 		if (error == 0)
 			printf("cr0[%d]\t\t0x%016lx\n", vcpu, cr0);
 	}
 
 	if (!error && (get_cr3 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR3, &cr3);
 		if (error == 0)
 			printf("cr3[%d]\t\t0x%016lx\n", vcpu, cr3);
 	}
 
 	if (!error && (get_cr4 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CR4, &cr4);
 		if (error == 0)
 			printf("cr4[%d]\t\t0x%016lx\n", vcpu, cr4);
 	}
 
 	if (!error && (get_dr7 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_DR7, &dr7);
 		if (error == 0)
 			printf("dr7[%d]\t\t0x%016lx\n", vcpu, dr7);
 	}
 
 	if (!error && (get_rsp || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RSP, &rsp);
 		if (error == 0)
 			printf("rsp[%d]\t\t0x%016lx\n", vcpu, rsp);
 	}
 
 	if (!error && (get_rip || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RIP, &rip);
 		if (error == 0)
 			printf("rip[%d]\t\t0x%016lx\n", vcpu, rip);
 	}
 
 	if (!error && (get_rax || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RAX, &rax);
 		if (error == 0)
 			printf("rax[%d]\t\t0x%016lx\n", vcpu, rax);
 	}
 
 	if (!error && (get_rbx || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RBX, &rbx);
 		if (error == 0)
 			printf("rbx[%d]\t\t0x%016lx\n", vcpu, rbx);
 	}
 
 	if (!error && (get_rcx || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RCX, &rcx);
 		if (error == 0)
 			printf("rcx[%d]\t\t0x%016lx\n", vcpu, rcx);
 	}
 
 	if (!error && (get_rdx || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RDX, &rdx);
 		if (error == 0)
 			printf("rdx[%d]\t\t0x%016lx\n", vcpu, rdx);
 	}
 
 	if (!error && (get_rsi || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RSI, &rsi);
 		if (error == 0)
 			printf("rsi[%d]\t\t0x%016lx\n", vcpu, rsi);
 	}
 
 	if (!error && (get_rdi || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RDI, &rdi);
 		if (error == 0)
 			printf("rdi[%d]\t\t0x%016lx\n", vcpu, rdi);
 	}
 
 	if (!error && (get_rbp || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RBP, &rbp);
 		if (error == 0)
 			printf("rbp[%d]\t\t0x%016lx\n", vcpu, rbp);
 	}
 
 	if (!error && (get_r8 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R8, &r8);
 		if (error == 0)
 			printf("r8[%d]\t\t0x%016lx\n", vcpu, r8);
 	}
 
 	if (!error && (get_r9 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R9, &r9);
 		if (error == 0)
 			printf("r9[%d]\t\t0x%016lx\n", vcpu, r9);
 	}
 
 	if (!error && (get_r10 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R10, &r10);
 		if (error == 0)
 			printf("r10[%d]\t\t0x%016lx\n", vcpu, r10);
 	}
 
 	if (!error && (get_r11 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R11, &r11);
 		if (error == 0)
 			printf("r11[%d]\t\t0x%016lx\n", vcpu, r11);
 	}
 
 	if (!error && (get_r12 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R12, &r12);
 		if (error == 0)
 			printf("r12[%d]\t\t0x%016lx\n", vcpu, r12);
 	}
 
 	if (!error && (get_r13 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R13, &r13);
 		if (error == 0)
 			printf("r13[%d]\t\t0x%016lx\n", vcpu, r13);
 	}
 
 	if (!error && (get_r14 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R14, &r14);
 		if (error == 0)
 			printf("r14[%d]\t\t0x%016lx\n", vcpu, r14);
 	}
 
 	if (!error && (get_r15 || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_R15, &r15);
 		if (error == 0)
 			printf("r15[%d]\t\t0x%016lx\n", vcpu, r15);
 	}
 
 	if (!error && (get_rflags || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_RFLAGS,
 					&rflags);
 		if (error == 0)
 			printf("rflags[%d]\t0x%016lx\n", vcpu, rflags);
 	}
 	
 	return (error);
 }
 
 static int
 get_all_segments(struct vmctx *ctx, int vcpu)
 {
-	int error;
 	uint64_t cs, ds, es, fs, gs, ss, tr, ldtr;
+	int error = 0;
 
-	if (get_desc_ds || get_all) {
+	if (!error && (get_desc_ds || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_DS,
 				   &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("ds desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			      vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_es || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_ES,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("es desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_fs || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_FS,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("fs desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_gs || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_GS,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("gs desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_ss || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_SS,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("ss desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_cs || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_CS,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("cs desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_tr || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_TR,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("tr desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_ldtr || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_LDTR,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("ldtr desc[%d]\t0x%016lx/0x%08x/0x%08x\n",
 			       vcpu, desc_base, desc_limit, desc_access);
 		}
 	}
 
 	if (!error && (get_desc_gdtr || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_GDTR,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("gdtr[%d]\t\t0x%016lx/0x%08x\n",
 			       vcpu, desc_base, desc_limit);
 		}
 	}
 
 	if (!error && (get_desc_idtr || get_all)) {
 		error = vm_get_desc(ctx, vcpu, VM_REG_GUEST_IDTR,
 				    &desc_base, &desc_limit, &desc_access);
 		if (error == 0) {
 			printf("idtr[%d]\t\t0x%016lx/0x%08x\n",
 			       vcpu, desc_base, desc_limit);
 		}
 	}
 
 	if (!error && (get_cs || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_CS, &cs);
 		if (error == 0)
 			printf("cs[%d]\t\t0x%04lx\n", vcpu, cs);
 	}
 
 	if (!error && (get_ds || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_DS, &ds);
 		if (error == 0)
 			printf("ds[%d]\t\t0x%04lx\n", vcpu, ds);
 	}
 
 	if (!error && (get_es || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_ES, &es);
 		if (error == 0)
 			printf("es[%d]\t\t0x%04lx\n", vcpu, es);
 	}
 
 	if (!error && (get_fs || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_FS, &fs);
 		if (error == 0)
 			printf("fs[%d]\t\t0x%04lx\n", vcpu, fs);
 	}
 
 	if (!error && (get_gs || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_GS, &gs);
 		if (error == 0)
 			printf("gs[%d]\t\t0x%04lx\n", vcpu, gs);
 	}
 
 	if (!error && (get_ss || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_SS, &ss);
 		if (error == 0)
 			printf("ss[%d]\t\t0x%04lx\n", vcpu, ss);
 	}
 
 	if (!error && (get_tr || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_TR, &tr);
 		if (error == 0)
 			printf("tr[%d]\t\t0x%04lx\n", vcpu, tr);
 	}
 
 	if (!error && (get_ldtr || get_all)) {
 		error = vm_get_register(ctx, vcpu, VM_REG_GUEST_LDTR, &ldtr);
 		if (error == 0)
 			printf("ldtr[%d]\t\t0x%04lx\n", vcpu, ldtr);
 	}
 
 	return (error);
 }
 
 static int
 get_misc_vmcs(struct vmctx *ctx, int vcpu)
 {
 	uint64_t ctl, cr0, cr3, cr4, rsp, rip, pat, addr, u64;
-	int error;
-	
-	if (get_cr0_mask || get_all) {
+	int error = 0;
+
+	if (!error && (get_cr0_mask || get_all)) {
 		uint64_t cr0mask;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR0_MASK, &cr0mask);
 		if (error == 0)
 			printf("cr0_mask[%d]\t\t0x%016lx\n", vcpu, cr0mask);
 	}
 
 	if (!error && (get_cr0_shadow || get_all)) {
 		uint64_t cr0shadow;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR0_SHADOW,
 					  &cr0shadow);
 		if (error == 0)
 			printf("cr0_shadow[%d]\t\t0x%016lx\n", vcpu, cr0shadow);
 	}
 
 	if (!error && (get_cr4_mask || get_all)) {
 		uint64_t cr4mask;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR4_MASK, &cr4mask);
 		if (error == 0)
 			printf("cr4_mask[%d]\t\t0x%016lx\n", vcpu, cr4mask);
 	}
 
 	if (!error && (get_cr4_shadow || get_all)) {
 		uint64_t cr4shadow;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR4_SHADOW,
 					  &cr4shadow);
 		if (error == 0)
 			printf("cr4_shadow[%d]\t\t0x%016lx\n", vcpu, cr4shadow);
 	}
 	
 	if (!error && (get_cr3_targets || get_all)) {
 		uint64_t target_count, target_addr;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET_COUNT,
 					  &target_count);
 		if (error == 0) {
 			printf("cr3_target_count[%d]\t0x%016lx\n",
 				vcpu, target_count);
 		}
 
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET0,
 					  &target_addr);
 		if (error == 0) {
 			printf("cr3_target0[%d]\t\t0x%016lx\n",
 				vcpu, target_addr);
 		}
 
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET1,
 					  &target_addr);
 		if (error == 0) {
 			printf("cr3_target1[%d]\t\t0x%016lx\n",
 				vcpu, target_addr);
 		}
 
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET2,
 					  &target_addr);
 		if (error == 0) {
 			printf("cr3_target2[%d]\t\t0x%016lx\n",
 				vcpu, target_addr);
 		}
 
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_CR3_TARGET3,
 					  &target_addr);
 		if (error == 0) {
 			printf("cr3_target3[%d]\t\t0x%016lx\n",
 				vcpu, target_addr);
 		}
 	}
 
 	if (!error && (get_pinbased_ctls || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_PIN_BASED_CTLS, &ctl);
 		if (error == 0)
 			printf("pinbased_ctls[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_procbased_ctls || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu,
 					  VMCS_PRI_PROC_BASED_CTLS, &ctl);
 		if (error == 0)
 			printf("procbased_ctls[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_procbased_ctls2 || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu,
 					  VMCS_SEC_PROC_BASED_CTLS, &ctl);
 		if (error == 0)
 			printf("procbased_ctls2[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_vmcs_gla || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu,
 					  VMCS_GUEST_LINEAR_ADDRESS, &u64);
 		if (error == 0)
 			printf("gla[%d]\t\t0x%016lx\n", vcpu, u64);
 	}
 
 	if (!error && (get_vmcs_gpa || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu,
 					  VMCS_GUEST_PHYSICAL_ADDRESS, &u64);
 		if (error == 0)
 			printf("gpa[%d]\t\t0x%016lx\n", vcpu, u64);
 	}
 
 	if (!error && (get_vmcs_entry_interruption_info || 
 		get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_ENTRY_INTR_INFO,&u64);
 		if (error == 0) {
 			printf("entry_interruption_info[%d]\t0x%016lx\n",
 				vcpu, u64);
 		}
 	}
 	
 	if (!error && (get_tpr_threshold || get_all)) {
 		uint64_t threshold;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_TPR_THRESHOLD,
 					  &threshold);
 		if (error == 0)
 			printf("tpr_threshold[%d]\t0x%016lx\n", vcpu, threshold);
 	}
 
 	if (!error && (get_inst_err || get_all)) {
 		uint64_t insterr;
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_INSTRUCTION_ERROR,
 					  &insterr);
 		if (error == 0) {
 			printf("instruction_error[%d]\t0x%016lx\n",
 				vcpu, insterr);
 		}
 	}
 	
 	if (!error && (get_exit_ctls || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_CTLS, &ctl);
 		if (error == 0)
 			printf("exit_ctls[%d]\t\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_entry_ctls || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_ENTRY_CTLS, &ctl);
 		if (error == 0)
 			printf("entry_ctls[%d]\t\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_host_pat || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_IA32_PAT, &pat);
 		if (error == 0)
 			printf("host_pat[%d]\t\t0x%016lx\n", vcpu, pat);
 	}
 
 	if (!error && (get_host_cr0 || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR0, &cr0);
 		if (error == 0)
 			printf("host_cr0[%d]\t\t0x%016lx\n", vcpu, cr0);
 	}
 
 	if (!error && (get_host_cr3 || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR3, &cr3);
 		if (error == 0)
 			printf("host_cr3[%d]\t\t0x%016lx\n", vcpu, cr3);
 	}
 
 	if (!error && (get_host_cr4 || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_CR4, &cr4);
 		if (error == 0)
 			printf("host_cr4[%d]\t\t0x%016lx\n", vcpu, cr4);
 	}
 
 	if (!error && (get_host_rip || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_RIP, &rip);
 		if (error == 0)
 			printf("host_rip[%d]\t\t0x%016lx\n", vcpu, rip);
 	}
 
 	if (!error && (get_host_rsp || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_HOST_RSP, &rsp);
 		if (error == 0)
 			printf("host_rsp[%d]\t\t0x%016lx\n", vcpu, rsp);
 	}
 	
 	if (!error && (get_vmcs_link || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_LINK_POINTER, &addr);
 		if (error == 0)
 			printf("vmcs_pointer[%d]\t0x%016lx\n", vcpu, addr);
 	}
 
 	if (!error && (get_vmcs_exit_interruption_info || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_INTR_INFO, &u64);
 		if (error == 0) {
 			printf("vmcs_exit_interruption_info[%d]\t0x%016lx\n",
 				vcpu, u64);
 		}
 	}
 
 	if (!error && (get_vmcs_exit_interruption_error || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_INTR_ERRCODE,
 		    			  &u64);
 		if (error == 0) {
 			printf("vmcs_exit_interruption_error[%d]\t0x%016lx\n",
 				vcpu, u64);
 		}
 	}
 
 	if (!error && (get_vmcs_interruptibility || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu,
 					  VMCS_GUEST_INTERRUPTIBILITY, &u64);
 		if (error == 0) {
 			printf("vmcs_guest_interruptibility[%d]\t0x%016lx\n",
 				vcpu, u64);
 		}
 	}
 	
 	if (!error && (get_vmcs_exit_qualification || get_all)) {
 		error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_QUALIFICATION,
 					  &u64);
 		if (error == 0)
 			printf("vmcs_exit_qualification[%d]\t0x%016lx\n",
 				vcpu, u64);
 	}
 	
 	return (error);
 }
 
 static int
 get_misc_vmcb(struct vmctx *ctx, int vcpu)
 {
 	uint64_t ctl, addr;
-	int error;
+	int error = 0;
 
-	if (get_vmcb_intercept || get_all) {
+	if (!error && (get_vmcb_intercept || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_CR_INTERCEPT, 4,
 		    &ctl);
 		if (error == 0)
 			printf("cr_intercept[%d]\t0x%08x\n", vcpu, (int)ctl);
 
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_DR_INTERCEPT, 4,
 		    &ctl);
 		if (error == 0)
 			printf("dr_intercept[%d]\t0x%08x\n", vcpu, (int)ctl);
 
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXC_INTERCEPT, 4,
 		    &ctl);
 		if (error == 0)
 			printf("exc_intercept[%d]\t0x%08x\n", vcpu, (int)ctl);
 
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_INST1_INTERCEPT,
 		    4, &ctl);
 		if (error == 0)
 			printf("inst1_intercept[%d]\t0x%08x\n", vcpu, (int)ctl);
 
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_INST2_INTERCEPT,
 		    4, &ctl);
 		if (error == 0)
 			printf("inst2_intercept[%d]\t0x%08x\n", vcpu, (int)ctl);
 	}
 
 	if (!error && (get_vmcb_tlb_ctrl || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_TLB_CTRL,
 					  4, &ctl);
 		if (error == 0)
 			printf("TLB ctrl[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_vmcb_exit_details || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINFO1,
 					  8, &ctl);
 		if (error == 0)
 			printf("exitinfo1[%d]\t0x%016lx\n", vcpu, ctl);
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINFO2,
 					  8, &ctl);
 		if (error == 0)
 			printf("exitinfo2[%d]\t0x%016lx\n", vcpu, ctl);
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_EXITINTINFO,
 					  8, &ctl);
 		if (error == 0)
 			printf("exitintinfo[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_vmcb_virq || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_VIRQ,
 					  8, &ctl);
 		if (error == 0)
 			printf("v_irq/tpr[%d]\t0x%016lx\n", vcpu, ctl);
 	}
 
 	if (!error && (get_apic_access_addr || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_BAR, 8,
 					  &addr);
 		if (error == 0)
 			printf("AVIC apic_bar[%d]\t0x%016lx\n", vcpu, addr);
 	}
 
 	if (!error && (get_virtual_apic_addr || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_PAGE, 8,
 					  &addr);
 		if (error == 0)
 			printf("AVIC backing page[%d]\t0x%016lx\n", vcpu, addr);
 	}
 
 	if (!error && (get_avic_table || get_all)) {
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_LT, 8,
 					  &addr);
 		if (error == 0)
 			printf("AVIC logical table[%d]\t0x%016lx\n",
 				vcpu, addr);
 		error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_AVIC_PT, 8,
 					  &addr);
 		if (error == 0)
 			printf("AVIC physical table[%d]\t0x%016lx\n",
 				vcpu, addr);
 	}
 
 	return (error);
 }
 
 static struct option *
 setup_options(bool cpu_intel)
 {
 	const struct option common_opts[] = {
 		{ "vm",		REQ_ARG,	0,	VMNAME },
 		{ "cpu",	REQ_ARG,	0,	VCPU },
 		{ "set-mem",	REQ_ARG,	0,	SET_MEM },
 		{ "set-efer",	REQ_ARG,	0,	SET_EFER },
 		{ "set-cr0",	REQ_ARG,	0,	SET_CR0 },
 		{ "set-cr3",	REQ_ARG,	0,	SET_CR3 },
 		{ "set-cr4",	REQ_ARG,	0,	SET_CR4 },
 		{ "set-dr7",	REQ_ARG,	0,	SET_DR7 },
 		{ "set-rsp",	REQ_ARG,	0,	SET_RSP },
 		{ "set-rip",	REQ_ARG,	0,	SET_RIP },
 		{ "set-rax",	REQ_ARG,	0,	SET_RAX },
 		{ "set-rflags",	REQ_ARG,	0,	SET_RFLAGS },
 		{ "desc-base",	REQ_ARG,	0,	DESC_BASE },
 		{ "desc-limit",	REQ_ARG,	0,	DESC_LIMIT },
 		{ "desc-access",REQ_ARG,	0,	DESC_ACCESS },
 		{ "set-cs",	REQ_ARG,	0,	SET_CS },
 		{ "set-ds",	REQ_ARG,	0,	SET_DS },
 		{ "set-es",	REQ_ARG,	0,	SET_ES },
 		{ "set-fs",	REQ_ARG,	0,	SET_FS },
 		{ "set-gs",	REQ_ARG,	0,	SET_GS },
 		{ "set-ss",	REQ_ARG,	0,	SET_SS },
 		{ "set-tr",	REQ_ARG,	0,	SET_TR },
 		{ "set-ldtr",	REQ_ARG,	0,	SET_LDTR },
 		{ "set-x2apic-state",REQ_ARG,	0,	SET_X2APIC_STATE },
 		{ "set-exception-bitmap",
 				REQ_ARG,	0, SET_EXCEPTION_BITMAP },
 		{ "capname",	REQ_ARG,	0,	CAPNAME },
 		{ "unassign-pptdev", REQ_ARG,	0,	UNASSIGN_PPTDEV },
 		{ "setcap",	REQ_ARG,	0,	SET_CAP },
 		{ "get-gpa-pmap", REQ_ARG,	0,	GET_GPA_PMAP },
 		{ "assert-lapic-lvt", REQ_ARG,	0,	ASSERT_LAPIC_LVT },
 		{ "get-rtc-time", NO_ARG,	&get_rtc_time,	1 },
 		{ "set-rtc-time", REQ_ARG,	0,	SET_RTC_TIME },
 		{ "rtc-nvram-offset", REQ_ARG,	0,	RTC_NVRAM_OFFSET },
 		{ "get-rtc-nvram", NO_ARG,	&get_rtc_nvram,	1 },
 		{ "set-rtc-nvram", REQ_ARG,	0,	SET_RTC_NVRAM },
 		{ "getcap",	NO_ARG,		&getcap,	1 },
 		{ "get-stats",	NO_ARG,		&get_stats,	1 },
 		{ "get-desc-ds",NO_ARG,		&get_desc_ds,	1 },
 		{ "set-desc-ds",NO_ARG,		&set_desc_ds,	1 },
 		{ "get-desc-es",NO_ARG,		&get_desc_es,	1 },
 		{ "set-desc-es",NO_ARG,		&set_desc_es,	1 },
 		{ "get-desc-ss",NO_ARG,		&get_desc_ss,	1 },
 		{ "set-desc-ss",NO_ARG,		&set_desc_ss,	1 },
 		{ "get-desc-cs",NO_ARG,		&get_desc_cs,	1 },
 		{ "set-desc-cs",NO_ARG,		&set_desc_cs,	1 },
 		{ "get-desc-fs",NO_ARG,		&get_desc_fs,	1 },
 		{ "set-desc-fs",NO_ARG,		&set_desc_fs,	1 },
 		{ "get-desc-gs",NO_ARG,		&get_desc_gs,	1 },
 		{ "set-desc-gs",NO_ARG,		&set_desc_gs,	1 },
 		{ "get-desc-tr",NO_ARG,		&get_desc_tr,	1 },
 		{ "set-desc-tr",NO_ARG,		&set_desc_tr,	1 },
 		{ "set-desc-ldtr", NO_ARG,	&set_desc_ldtr,	1 },
 		{ "get-desc-ldtr", NO_ARG,	&get_desc_ldtr,	1 },
 		{ "set-desc-gdtr", NO_ARG,	&set_desc_gdtr, 1 },
 		{ "get-desc-gdtr", NO_ARG,	&get_desc_gdtr, 1 },
 		{ "set-desc-idtr", NO_ARG,	&set_desc_idtr, 1 },
 		{ "get-desc-idtr", NO_ARG,	&get_desc_idtr, 1 },
 		{ "get-lowmem", NO_ARG,		&get_lowmem,	1 },
 		{ "get-highmem",NO_ARG,		&get_highmem,	1 },
 		{ "get-efer",	NO_ARG,		&get_efer,	1 },
 		{ "get-cr0",	NO_ARG,		&get_cr0,	1 },
 		{ "get-cr3",	NO_ARG,		&get_cr3,	1 },
 		{ "get-cr4",	NO_ARG,		&get_cr4,	1 },
 		{ "get-dr7",	NO_ARG,		&get_dr7,	1 },
 		{ "get-rsp",	NO_ARG,		&get_rsp,	1 },
 		{ "get-rip",	NO_ARG,		&get_rip,	1 },
 		{ "get-rax",	NO_ARG,		&get_rax,	1 },
 		{ "get-rbx",	NO_ARG,		&get_rbx,	1 },
 		{ "get-rcx",	NO_ARG,		&get_rcx,	1 },
 		{ "get-rdx",	NO_ARG,		&get_rdx,	1 },
 		{ "get-rsi",	NO_ARG,		&get_rsi,	1 },
 		{ "get-rdi",	NO_ARG,		&get_rdi,	1 },
 		{ "get-rbp",	NO_ARG,		&get_rbp,	1 },
 		{ "get-r8",	NO_ARG,		&get_r8,	1 },
 		{ "get-r9",	NO_ARG,		&get_r9,	1 },
 		{ "get-r10",	NO_ARG,		&get_r10,	1 },
 		{ "get-r11",	NO_ARG,		&get_r11,	1 },
 		{ "get-r12",	NO_ARG,		&get_r12,	1 },
 		{ "get-r13",	NO_ARG,		&get_r13,	1 },
 		{ "get-r14",	NO_ARG,		&get_r14,	1 },
 		{ "get-r15",	NO_ARG,		&get_r15,	1 },
 		{ "get-rflags",	NO_ARG,		&get_rflags,	1 },
 		{ "get-cs",	NO_ARG,		&get_cs,	1 },
 		{ "get-ds",	NO_ARG,		&get_ds,	1 },
 		{ "get-es",	NO_ARG,		&get_es,	1 },
 		{ "get-fs",	NO_ARG,		&get_fs,	1 },
 		{ "get-gs",	NO_ARG,		&get_gs,	1 },
 		{ "get-ss",	NO_ARG,		&get_ss,	1 },
 		{ "get-tr",	NO_ARG,		&get_tr,	1 },
 		{ "get-ldtr",	NO_ARG,		&get_ldtr,	1 },
 		{ "get-eptp", 	NO_ARG,		&get_eptp,	1 },
 		{ "get-exception-bitmap",
 					NO_ARG,	&get_exception_bitmap,  1 },
 		{ "get-io-bitmap-address",
 					NO_ARG,	&get_io_bitmap,		1 },
 		{ "get-tsc-offset", 	NO_ARG, &get_tsc_offset, 	1 },
 		{ "get-msr-bitmap",
 					NO_ARG,	&get_msr_bitmap, 	1 },
 		{ "get-msr-bitmap-address",
 					NO_ARG,	&get_msr_bitmap_address, 1 },
 		{ "get-guest-pat",	NO_ARG,	&get_guest_pat,		1 },
 		{ "get-guest-sysenter",
 					NO_ARG,	&get_guest_sysenter, 	1 },
 		{ "get-exit-reason",
 					NO_ARG,	&get_exit_reason, 	1 },
 		{ "get-x2apic-state",	NO_ARG,	&get_x2apic_state, 	1 },
 		{ "get-all",		NO_ARG,	&get_all,		1 },
 		{ "run",		NO_ARG,	&run,			1 },
 		{ "create",		NO_ARG,	&create,		1 },
 		{ "destroy",		NO_ARG,	&destroy,		1 },
 		{ "inject-nmi",		NO_ARG,	&inject_nmi,		1 },
 		{ "force-reset",	NO_ARG,	&force_reset,		1 },
 		{ "force-poweroff", 	NO_ARG,	&force_poweroff, 	1 },
 		{ "get-active-cpus", 	NO_ARG,	&get_active_cpus, 	1 },
 		{ "get-suspended-cpus", NO_ARG,	&get_suspended_cpus, 	1 },
 		{ "get-intinfo", 	NO_ARG,	&get_intinfo,		1 },
 	};
 
 	const struct option intel_opts[] = {
 		{ "get-vmcs-pinbased-ctls",
 				NO_ARG,		&get_pinbased_ctls, 1 },
 		{ "get-vmcs-procbased-ctls",
 				NO_ARG,		&get_procbased_ctls, 1 },
 		{ "get-vmcs-procbased-ctls2",
 				NO_ARG,		&get_procbased_ctls2, 1 },
 		{ "get-vmcs-guest-linear-address",
 				NO_ARG,		&get_vmcs_gla,	1 },
 		{ "get-vmcs-guest-physical-address",
 				NO_ARG,		&get_vmcs_gpa,	1 },
 		{ "get-vmcs-entry-interruption-info",
 				NO_ARG, &get_vmcs_entry_interruption_info, 1},
 		{ "get-vmcs-cr0-mask", NO_ARG,	&get_cr0_mask,	1 },
 		{ "get-vmcs-cr0-shadow", NO_ARG,&get_cr0_shadow, 1 },
 		{ "get-vmcs-cr4-mask", 		NO_ARG,	&get_cr4_mask,	  1 },
 		{ "get-vmcs-cr4-shadow", 	NO_ARG, &get_cr4_shadow,  1 },
 		{ "get-vmcs-cr3-targets", 	NO_ARG, &get_cr3_targets, 1 },
 		{ "get-vmcs-tpr-threshold",
 					NO_ARG,	&get_tpr_threshold, 1 },
 		{ "get-vmcs-vpid", 	NO_ARG,	&get_vpid_asid,	    1 },
 		{ "get-vmcs-exit-ctls", NO_ARG,	&get_exit_ctls,	    1 },
 		{ "get-vmcs-entry-ctls",
 					NO_ARG,	&get_entry_ctls, 1 },
 		{ "get-vmcs-instruction-error",
 					NO_ARG,	&get_inst_err,	1 },
 		{ "get-vmcs-host-pat",	NO_ARG,	&get_host_pat,	1 },
 		{ "get-vmcs-host-cr0",
 					NO_ARG,	&get_host_cr0,	1 },
 		{ "set-vmcs-entry-interruption-info",
 				REQ_ARG, 0, SET_VMCS_ENTRY_INTERRUPTION_INFO },
 		{ "get-vmcs-exit-qualification",
 				NO_ARG,	&get_vmcs_exit_qualification, 1 },
 		{ "get-vmcs-interruptibility",
 				NO_ARG, &get_vmcs_interruptibility, 1 },
 		{ "get-vmcs-exit-interruption-error",
 				NO_ARG,	&get_vmcs_exit_interruption_error, 1 },
 		{ "get-vmcs-exit-interruption-info",
 				NO_ARG,	&get_vmcs_exit_interruption_info, 1 },
 		{ "get-vmcs-link", 	NO_ARG,		&get_vmcs_link, 1 },
 		{ "get-vmcs-host-cr3",
 					NO_ARG,		&get_host_cr3,	1 },
 		{ "get-vmcs-host-cr4",
 				NO_ARG,		&get_host_cr4,	1 },
 		{ "get-vmcs-host-rip",
 				NO_ARG,		&get_host_rip,	1 },
 		{ "get-vmcs-host-rsp",
 				NO_ARG,		&get_host_rsp,	1 },
 		{ "get-apic-access-address",
 				NO_ARG,		&get_apic_access_addr, 1},
 		{ "get-virtual-apic-address",
 				NO_ARG,		&get_virtual_apic_addr, 1}
 	};
 
 	const struct option amd_opts[] = {
 		{ "get-vmcb-intercepts",
 				NO_ARG,	&get_vmcb_intercept, 	1 },
 		{ "get-vmcb-asid", 
 				NO_ARG,	&get_vpid_asid,	     	1 },
 		{ "get-vmcb-exit-details",
 				NO_ARG, &get_vmcb_exit_details,	1 },
 		{ "get-vmcb-tlb-ctrl",
 				NO_ARG, &get_vmcb_tlb_ctrl, 	1 },
 		{ "get-vmcb-virq",
 				NO_ARG, &get_vmcb_virq, 	1 },
 		{ "get-avic-apic-bar",
 				NO_ARG,	&get_apic_access_addr, 	1 },
 		{ "get-avic-backing-page",
 				NO_ARG,	&get_virtual_apic_addr, 1 },
 		{ "get-avic-table",
 				NO_ARG,	&get_avic_table, 	1 }
 	};
 
 	const struct option null_opt = {
 		NULL, 0, NULL, 0
 	};
 
 	struct option *all_opts;
 	char *cp;
 	int optlen;
 
 	optlen = sizeof(common_opts);
 
 	if (cpu_intel)
 		optlen += sizeof(intel_opts);
 	else
 		optlen += sizeof(amd_opts);
 
 	optlen += sizeof(null_opt);
 
 	all_opts = malloc(optlen);
 
 	cp = (char *)all_opts;
 	memcpy(cp, common_opts, sizeof(common_opts));
 	cp += sizeof(common_opts);
 
 	if (cpu_intel) {
 		memcpy(cp, intel_opts, sizeof(intel_opts));
 		cp += sizeof(intel_opts);
 	} else {
 		memcpy(cp, amd_opts, sizeof(amd_opts));
 		cp += sizeof(amd_opts);
 	}
 
 	memcpy(cp, &null_opt, sizeof(null_opt));
 	cp += sizeof(null_opt);
 
 	return (all_opts);
 }
 
 static const char *
 wday_str(int idx)
 {
 	static const char *weekdays[] = {
 		"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
 	};
 
 	if (idx >= 0 && idx < 7)
 		return (weekdays[idx]);
 	else
 		return ("UNK");
 }
 
 static const char *
 mon_str(int idx)
 {
 	static const char *months[] = {
 		"Jan", "Feb", "Mar", "Apr", "May", "Jun",
 		"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
 	};
 
 	if (idx >= 0 && idx < 12)
 		return (months[idx]);
 	else
 		return ("UNK");
 }
 
 int
 main(int argc, char *argv[])
 {
 	char *vmname;
 	int error, ch, vcpu, ptenum;
 	vm_paddr_t gpa, gpa_pmap;
 	size_t len;
 	struct vm_exit vmexit;
 	uint64_t rax, cr0, cr3, cr4, dr7, rsp, rip, rflags, efer, pat;
 	uint64_t eptp, bm, addr, u64, pteval[4], *pte, info[2];
 	struct vmctx *ctx;
 	int wired;
 	cpuset_t cpus;
 	bool cpu_intel;
 	uint64_t cs, ds, es, fs, gs, ss, tr, ldtr;
 	struct tm tm;
 	struct option *opts;
 
 	cpu_intel = cpu_vendor_intel();
 	opts = setup_options(cpu_intel);
 
 	vcpu = 0;
 	vmname = NULL;
 	assert_lapic_lvt = -1;
 	progname = basename(argv[0]);
 
 	while ((ch = getopt_long(argc, argv, "", opts, NULL)) != -1) {
 		switch (ch) {
 		case 0:
 			break;
 		case VMNAME:
 			vmname = optarg;
 			break;
 		case VCPU:
 			vcpu = atoi(optarg);
 			break;
 		case SET_MEM:
 			memsize = atoi(optarg) * MB;
 			memsize = roundup(memsize, 2 * MB);
 			break;
 		case SET_EFER:
 			efer = strtoul(optarg, NULL, 0);
 			set_efer = 1;
 			break;
 		case SET_CR0:
 			cr0 = strtoul(optarg, NULL, 0);
 			set_cr0 = 1;
 			break;
 		case SET_CR3:
 			cr3 = strtoul(optarg, NULL, 0);
 			set_cr3 = 1;
 			break;
 		case SET_CR4:
 			cr4 = strtoul(optarg, NULL, 0);
 			set_cr4 = 1;
 			break;
 		case SET_DR7:
 			dr7 = strtoul(optarg, NULL, 0);
 			set_dr7 = 1;
 			break;
 		case SET_RSP:
 			rsp = strtoul(optarg, NULL, 0);
 			set_rsp = 1;
 			break;
 		case SET_RIP:
 			rip = strtoul(optarg, NULL, 0);
 			set_rip = 1;
 			break;
 		case SET_RAX:
 			rax = strtoul(optarg, NULL, 0);
 			set_rax = 1;
 			break;
 		case SET_RFLAGS:
 			rflags = strtoul(optarg, NULL, 0);
 			set_rflags = 1;
 			break;
 		case DESC_BASE:
 			desc_base = strtoul(optarg, NULL, 0);
 			break;
 		case DESC_LIMIT:
 			desc_limit = strtoul(optarg, NULL, 0);
 			break;
 		case DESC_ACCESS:
 			desc_access = strtoul(optarg, NULL, 0);
 			break;
 		case SET_CS:
 			cs = strtoul(optarg, NULL, 0);
 			set_cs = 1;
 			break;
 		case SET_DS:
 			ds = strtoul(optarg, NULL, 0);
 			set_ds = 1;
 			break;
 		case SET_ES:
 			es = strtoul(optarg, NULL, 0);
 			set_es = 1;
 			break;
 		case SET_FS:
 			fs = strtoul(optarg, NULL, 0);
 			set_fs = 1;
 			break;
 		case SET_GS:
 			gs = strtoul(optarg, NULL, 0);
 			set_gs = 1;
 			break;
 		case SET_SS:
 			ss = strtoul(optarg, NULL, 0);
 			set_ss = 1;
 			break;
 		case SET_TR:
 			tr = strtoul(optarg, NULL, 0);
 			set_tr = 1;
 			break;
 		case SET_LDTR:
 			ldtr = strtoul(optarg, NULL, 0);
 			set_ldtr = 1;
 			break;
 		case SET_X2APIC_STATE:
 			x2apic_state = strtol(optarg, NULL, 0);
 			set_x2apic_state = 1;
 			break;
 		case SET_EXCEPTION_BITMAP:
 			exception_bitmap = strtoul(optarg, NULL, 0);
 			set_exception_bitmap = 1;
 			break;
 		case SET_VMCS_ENTRY_INTERRUPTION_INFO:
 			vmcs_entry_interruption_info = strtoul(optarg, NULL, 0);
 			set_vmcs_entry_interruption_info = 1;
 			break;
 		case SET_CAP:
 			capval = strtoul(optarg, NULL, 0);
 			setcap = 1;
 			break;
 		case SET_RTC_TIME:
 			rtc_secs = strtoul(optarg, NULL, 0);
 			set_rtc_time = 1;
 			break;
 		case SET_RTC_NVRAM:
 			rtc_nvram_value = (uint8_t)strtoul(optarg, NULL, 0);
 			set_rtc_nvram = 1;
 			break;
 		case RTC_NVRAM_OFFSET:
 			rtc_nvram_offset = strtoul(optarg, NULL, 0);
 			break;
 		case GET_GPA_PMAP:
 			gpa_pmap = strtoul(optarg, NULL, 0);
 			get_gpa_pmap = 1;
 			break;
 		case CAPNAME:
 			capname = optarg;
 			break;
 		case UNASSIGN_PPTDEV:
 			unassign_pptdev = 1;
 			if (sscanf(optarg, "%d/%d/%d", &bus, &slot, &func) != 3)
 				usage(cpu_intel);
 			break;
 		case ASSERT_LAPIC_LVT:
 			assert_lapic_lvt = atoi(optarg);
 			break;
 		default:
 			usage(cpu_intel);
 		}
 	}
 	argc -= optind;
 	argv += optind;
 
 	if (vmname == NULL)
 		usage(cpu_intel);
 
 	error = 0;
 
 	if (!error && create)
 		error = vm_create(vmname);
 
 	if (!error) {
 		ctx = vm_open(vmname);
 		if (ctx == NULL) {
 			printf("VM:%s is not created.\n", vmname);
 			exit (1);
 		}
 	}
 
 	if (!error && memsize)
 		error = vm_setup_memory(ctx, memsize, VM_MMAP_NONE);
 
 	if (!error && set_efer)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_EFER, efer);
 
 	if (!error && set_cr0)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR0, cr0);
 
 	if (!error && set_cr3)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR3, cr3);
 
 	if (!error && set_cr4)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CR4, cr4);
 
 	if (!error && set_dr7)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_DR7, dr7);
 
 	if (!error && set_rsp)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RSP, rsp);
 
 	if (!error && set_rip)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RIP, rip);
 
 	if (!error && set_rax)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RAX, rax);
 
 	if (!error && set_rflags) {
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_RFLAGS,
 					rflags);
 	}
 
 	if (!error && set_desc_ds) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_DS,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_es) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_ES,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_ss) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_SS,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_cs) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_CS,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_fs) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_FS,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_gs) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_GS,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_tr) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_TR,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_ldtr) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_LDTR,
 				    desc_base, desc_limit, desc_access);
 	}
 
 	if (!error && set_desc_gdtr) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_GDTR,
 				    desc_base, desc_limit, 0);
 	}
 
 	if (!error && set_desc_idtr) {
 		error = vm_set_desc(ctx, vcpu, VM_REG_GUEST_IDTR,
 				    desc_base, desc_limit, 0);
 	}
 
 	if (!error && set_cs)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_CS, cs);
 
 	if (!error && set_ds)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_DS, ds);
 
 	if (!error && set_es)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_ES, es);
 
 	if (!error && set_fs)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_FS, fs);
 
 	if (!error && set_gs)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_GS, gs);
 
 	if (!error && set_ss)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_SS, ss);
 
 	if (!error && set_tr)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_TR, tr);
 
 	if (!error && set_ldtr)
 		error = vm_set_register(ctx, vcpu, VM_REG_GUEST_LDTR, ldtr);
 
 	if (!error && set_x2apic_state)
 		error = vm_set_x2apic_state(ctx, vcpu, x2apic_state);
 
 	if (!error && unassign_pptdev)
 		error = vm_unassign_pptdev(ctx, bus, slot, func);
 
 	if (!error && set_exception_bitmap) {
 		if (cpu_intel)
 			error = vm_set_vmcs_field(ctx, vcpu,
 						  VMCS_EXCEPTION_BITMAP,
 						  exception_bitmap);
 		else
 			error = vm_set_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_EXC_INTERCEPT,
 						  4, exception_bitmap);
 	}
 
 	if (!error && cpu_intel && set_vmcs_entry_interruption_info) {
 		error = vm_set_vmcs_field(ctx, vcpu, VMCS_ENTRY_INTR_INFO,
 					  vmcs_entry_interruption_info);
 	}
 
 	if (!error && inject_nmi) {
 		error = vm_inject_nmi(ctx, vcpu);
 	}
 
 	if (!error && assert_lapic_lvt != -1) {
 		error = vm_lapic_local_irq(ctx, vcpu, assert_lapic_lvt);
 	}
 
 	if (!error && (get_lowmem || get_all)) {
 		gpa = 0;
 		error = vm_get_memory_seg(ctx, gpa, &len, &wired);
 		if (error == 0)
 			printf("lowmem\t\t0x%016lx/%ld%s\n", gpa, len,
 			    wired ? " wired" : "");
 	}
 
 	if (!error && (get_highmem || get_all)) {
 		gpa = 4 * GB;
 		error = vm_get_memory_seg(ctx, gpa, &len, &wired);
 		if (error == 0)
 			printf("highmem\t\t0x%016lx/%ld%s\n", gpa, len,
 			    wired ? " wired" : "");
 	}
 
 	if (!error)
 		error = get_all_registers(ctx, vcpu);
 
 	if (!error)
 		error = get_all_segments(ctx, vcpu);
 
 	if (!error) {
 		if (cpu_intel)
 			error = get_misc_vmcs(ctx, vcpu);
 		else
 			error = get_misc_vmcb(ctx, vcpu);
 	}
 	
 	if (!error && (get_x2apic_state || get_all)) {
 		error = vm_get_x2apic_state(ctx, vcpu, &x2apic_state);
 		if (error == 0)
 			printf("x2apic_state[%d]\t%d\n", vcpu, x2apic_state);
 	}
 
 	if (!error && (get_eptp || get_all)) {
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_EPTP, &eptp);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_NPT_BASE,
 						   8, &eptp);
 		if (error == 0)
 			printf("%s[%d]\t\t0x%016lx\n",
 				cpu_intel ? "eptp" : "rvi/npt", vcpu, eptp);
 	}
 
 	if (!error && (get_exception_bitmap || get_all)) {
 		if(cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu,
 						VMCS_EXCEPTION_BITMAP, &bm);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_EXC_INTERCEPT,
 						  4, &bm);
 		if (error == 0)
 			printf("exception_bitmap[%d]\t%#lx\n", vcpu, bm);
 	}
 
 	if (!error && (get_io_bitmap || get_all)) {
 		if (cpu_intel) {
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_IO_BITMAP_A,
 						  &bm);
 			if (error == 0)
 				printf("io_bitmap_a[%d]\t%#lx\n", vcpu, bm);
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_IO_BITMAP_B,
 						  &bm);
 			if (error == 0)
 				printf("io_bitmap_b[%d]\t%#lx\n", vcpu, bm);
 		} else {
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_IO_PERM, 8, &bm);
 			if (error == 0)
 				printf("io_bitmap[%d]\t%#lx\n", vcpu, bm);
 		}
 	}
 
 	if (!error && (get_tsc_offset || get_all)) {
 		uint64_t tscoff;
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_TSC_OFFSET,
 						  &tscoff);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_TSC_OFFSET, 
 						  8, &tscoff);
 		if (error == 0)
 			printf("tsc_offset[%d]\t0x%016lx\n", vcpu, tscoff);
 	}
 
 	if (!error && (get_msr_bitmap_address || get_all)) {
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_MSR_BITMAP, 
 						  &addr);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_MSR_PERM, 8, &addr);
 		if (error == 0)
 			printf("msr_bitmap[%d]\t\t%#lx\n", vcpu, addr);
 	}
 
 	if (!error && (get_msr_bitmap || get_all)) {
 		if (cpu_intel) {
 			error = vm_get_vmcs_field(ctx, vcpu, 
 						  VMCS_MSR_BITMAP, &addr);
 		} else {
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_MSR_PERM, 8,
 						  &addr);
 		}
 
 		if (error == 0)
 			error = dump_msr_bitmap(vcpu, addr, cpu_intel);
 	}
 
 	if (!error && (get_vpid_asid || get_all)) {
 		uint64_t vpid;
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_VPID, &vpid);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu, VMCB_OFF_ASID, 
 						  4, &vpid);
 		if (error == 0)
 			printf("%s[%d]\t\t0x%04lx\n", 
 				cpu_intel ? "vpid" : "asid", vcpu, vpid);
 	}
 
 	if (!error && (get_guest_pat || get_all)) {
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu,
 						  VMCS_GUEST_IA32_PAT, &pat);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_GUEST_PAT, 8, &pat);
 		if (error == 0)
 			printf("guest_pat[%d]\t\t0x%016lx\n", vcpu, pat);
 	}
 
 	if (!error && (get_guest_sysenter || get_all)) {
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu,
 						  VMCS_GUEST_IA32_SYSENTER_CS,
 						  &cs);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_SYSENTER_CS, 8,
 						  &cs);
 
 		if (error == 0)
 			printf("guest_sysenter_cs[%d]\t%#lx\n", vcpu, cs);
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu,
 						  VMCS_GUEST_IA32_SYSENTER_ESP,
 						  &rsp);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_SYSENTER_ESP, 8,
 						  &rsp);
 
 		if (error == 0)
 			printf("guest_sysenter_sp[%d]\t%#lx\n", vcpu, rsp);
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu,
 						  VMCS_GUEST_IA32_SYSENTER_EIP,
 						  &rip);
 		else
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_SYSENTER_EIP, 8, 
 						  &rip);
 		if (error == 0)
 			printf("guest_sysenter_ip[%d]\t%#lx\n", vcpu, rip);
 	}
 
 	if (!error && (get_exit_reason || get_all)) {
 		if (cpu_intel)
 			error = vm_get_vmcs_field(ctx, vcpu, VMCS_EXIT_REASON,
 						  &u64);
 		else	
 			error = vm_get_vmcb_field(ctx, vcpu,
 						  VMCB_OFF_EXIT_REASON, 8,
 						  &u64);
 		if (error == 0)
 			printf("exit_reason[%d]\t%#lx\n", vcpu, u64);
 	}
 
 	if (!error && setcap) {
 		int captype;
 		captype = vm_capability_name2type(capname);
 		error = vm_set_capability(ctx, vcpu, captype, capval);
 		if (error != 0 && errno == ENOENT)
 			printf("Capability \"%s\" is not available\n", capname);
 	}
 
 	if (!error && get_gpa_pmap) {
 		error = vm_get_gpa_pmap(ctx, gpa_pmap, pteval, &ptenum);
 		if (error == 0) {
 			printf("gpa %#lx:", gpa_pmap);
 			pte = &pteval[0];
 			while (ptenum-- > 0)
 				printf(" %#lx", *pte++);
 			printf("\n");
 		}
 	}
 
 	if (!error && set_rtc_nvram)
 		error = vm_rtc_write(ctx, rtc_nvram_offset, rtc_nvram_value);
 
 	if (!error && (get_rtc_nvram || get_all)) {
 		error = vm_rtc_read(ctx, rtc_nvram_offset, &rtc_nvram_value);
 		if (error == 0) {
 			printf("rtc nvram[%03d]: 0x%02x\n", rtc_nvram_offset,
 			    rtc_nvram_value);
 		}
 	}
 
 	if (!error && set_rtc_time)
 		error = vm_rtc_settime(ctx, rtc_secs);
 
 	if (!error && (get_rtc_time || get_all)) {
 		error = vm_rtc_gettime(ctx, &rtc_secs);
 		if (error == 0) {
 			gmtime_r(&rtc_secs, &tm);
 			printf("rtc time %#lx: %s %s %02d %02d:%02d:%02d %d\n",
 			    rtc_secs, wday_str(tm.tm_wday), mon_str(tm.tm_mon),
 			    tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec,
 			    1900 + tm.tm_year);
 		}
 	}
 
 	if (!error && (getcap || get_all)) {
 		int captype, val, getcaptype;
 
 		if (getcap && capname)
 			getcaptype = vm_capability_name2type(capname);
 		else
 			getcaptype = -1;
 
 		for (captype = 0; captype < VM_CAP_MAX; captype++) {
 			if (getcaptype >= 0 && captype != getcaptype)
 				continue;
 			error = vm_get_capability(ctx, vcpu, captype, &val);
 			if (error == 0) {
 				printf("Capability \"%s\" is %s on vcpu %d\n",
 					vm_capability_type2name(captype),
 					val ? "set" : "not set", vcpu);
 			} else if (errno == ENOENT) {
 				error = 0;
 				printf("Capability \"%s\" is not available\n",
 					vm_capability_type2name(captype));
 			} else {
 				break;
 			}
 		}
 	}
 
 	if (!error && (get_active_cpus || get_all)) {
 		error = vm_active_cpus(ctx, &cpus);
 		if (!error)
 			print_cpus("active cpus", &cpus);
 	}
 
 	if (!error && (get_suspended_cpus || get_all)) {
 		error = vm_suspended_cpus(ctx, &cpus);
 		if (!error)
 			print_cpus("suspended cpus", &cpus);
 	}
 
 	if (!error && (get_intinfo || get_all)) {
 		error = vm_get_intinfo(ctx, vcpu, &info[0], &info[1]);
 		if (!error) {
 			print_intinfo("pending", info[0]);
 			print_intinfo("current", info[1]);
 		}
 	}
 
 	if (!error && (get_stats || get_all)) {
 		int i, num_stats;
 		uint64_t *stats;
 		struct timeval tv;
 		const char *desc;
 
 		stats = vm_get_stats(ctx, vcpu, &tv, &num_stats);
 		if (stats != NULL) {
 			printf("vcpu%d stats:\n", vcpu);
 			for (i = 0; i < num_stats; i++) {
 				desc = vm_get_stat_desc(ctx, i);
 				printf("%-40s\t%ld\n", desc, stats[i]);
 			}
 		}
 	}
 
 	if (!error && run) {
 		error = vm_run(ctx, vcpu, &vmexit);
 		if (error == 0)
 			dump_vm_run_exitcode(&vmexit, vcpu);
 		else
 			printf("vm_run error %d\n", error);
 	}
 
 	if (!error && force_reset)
 		error = vm_suspend(ctx, VM_SUSPEND_RESET);
 
 	if (!error && force_poweroff)
 		error = vm_suspend(ctx, VM_SUSPEND_POWEROFF);
 
 	if (error)
 		printf("errno = %d\n", errno);
 
 	if (!error && destroy)
 		vm_destroy(ctx);
 
 	free (opts);
 	exit(error);
 }
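Reviewer note (not part of the patch): the option handling in the file above converts every numeric argument with strtoul(optarg, NULL, 0) and never checks whether the conversion succeeded, so an argument such as "12abc" is silently read as 12. A minimal sketch of a stricter conversion follows. The helper name parse_u64 is hypothetical and does not exist in bhyvectl; only standard C library calls are used.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in bhyvectl): convert a decimal, octal (leading 0)
 * or hexadecimal (leading 0x) string to a 64-bit value.  Returns false when
 * the string is empty, has trailing characters, or is out of range.
 * As with strtoul(3), a leading minus sign is accepted and the value wraps.
 */
static bool
parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	if (s == NULL || *s == '\0')
		return (false);
	errno = 0;
	v = strtoull(s, &end, 0);
	/* end == s: no digits consumed; *end != '\0': trailing junk. */
	if (errno != 0 || end == s || *end != '\0')
		return (false);
	*out = (uint64_t)v;
	return (true);
}
```

A caller in the option switch could then report a usage error when parse_u64() returns false, instead of passing a partially parsed value on to vm_set_register().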
Index: user/ngie/more-tests/usr.sbin/bhyvectl
===================================================================
--- user/ngie/more-tests/usr.sbin/bhyvectl	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/bhyvectl	(revision 281585)

Property changes on: user/ngie/more-tests/usr.sbin/bhyvectl
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head/usr.sbin/bhyvectl:r281414-281584
Index: user/ngie/more-tests/usr.sbin/ctld/discovery.c
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/discovery.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/discovery.c	(revision 281585)
@@ -1,337 +1,336 @@
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <assert.h>
-#include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <netinet/in.h>
 #include <netdb.h>
 #include <sys/socket.h>
 
 #include "ctld.h"
 #include "iscsi_proto.h"
 
 static struct pdu *
 text_receive(struct connection *conn)
 {
 	struct pdu *request;
 	struct iscsi_bhs_text_request *bhstr;
 
 	request = pdu_new(conn);
 	pdu_receive(request);
 	if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) !=
 	    ISCSI_BHS_OPCODE_TEXT_REQUEST)
 		log_errx(1, "protocol error: received invalid opcode 0x%x",
 		    request->pdu_bhs->bhs_opcode);
 	bhstr = (struct iscsi_bhs_text_request *)request->pdu_bhs;
 #if 0
 	if ((bhstr->bhstr_flags & ISCSI_BHSTR_FLAGS_FINAL) == 0)
 		log_errx(1, "received Text PDU without the \"F\" flag");
 #endif
 	/*
 	 * XXX: Implement the C flag some day.
 	 */
 	if ((bhstr->bhstr_flags & BHSTR_FLAGS_CONTINUE) != 0)
 		log_errx(1, "received Text PDU with unsupported \"C\" flag");
 	if (ISCSI_SNLT(ntohl(bhstr->bhstr_cmdsn), conn->conn_cmdsn)) {
 		log_errx(1, "received Text PDU with decreasing CmdSN: "
 		    "was %u, is %u", conn->conn_cmdsn, ntohl(bhstr->bhstr_cmdsn));
 	}
 	if (ntohl(bhstr->bhstr_expstatsn) != conn->conn_statsn) {
 		log_errx(1, "received Text PDU with wrong StatSN: "
 		    "is %u, should be %u", ntohl(bhstr->bhstr_expstatsn),
 		    conn->conn_statsn);
 	}
 	conn->conn_cmdsn = ntohl(bhstr->bhstr_cmdsn);
 	if ((bhstr->bhstr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0)
 		conn->conn_cmdsn++;
 
 	return (request);
 }
 
 static struct pdu *
 text_new_response(struct pdu *request)
 {
 	struct pdu *response;
 	struct connection *conn;
 	struct iscsi_bhs_text_request *bhstr;
 	struct iscsi_bhs_text_response *bhstr2;
 
 	bhstr = (struct iscsi_bhs_text_request *)request->pdu_bhs;
 	conn = request->pdu_connection;
 
 	response = pdu_new_response(request);
 	bhstr2 = (struct iscsi_bhs_text_response *)response->pdu_bhs;
 	bhstr2->bhstr_opcode = ISCSI_BHS_OPCODE_TEXT_RESPONSE;
 	bhstr2->bhstr_flags = BHSTR_FLAGS_FINAL;
 	bhstr2->bhstr_lun = bhstr->bhstr_lun;
 	bhstr2->bhstr_initiator_task_tag = bhstr->bhstr_initiator_task_tag;
 	bhstr2->bhstr_target_transfer_tag = bhstr->bhstr_target_transfer_tag;
 	bhstr2->bhstr_statsn = htonl(conn->conn_statsn++);
 	bhstr2->bhstr_expcmdsn = htonl(conn->conn_cmdsn);
 	bhstr2->bhstr_maxcmdsn = htonl(conn->conn_cmdsn);
 
 	return (response);
 }
 
 static struct pdu *
 logout_receive(struct connection *conn)
 {
 	struct pdu *request;
 	struct iscsi_bhs_logout_request *bhslr;
 
 	request = pdu_new(conn);
 	pdu_receive(request);
 	if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) !=
 	    ISCSI_BHS_OPCODE_LOGOUT_REQUEST)
 		log_errx(1, "protocol error: received invalid opcode 0x%x",
 		    request->pdu_bhs->bhs_opcode);
 	bhslr = (struct iscsi_bhs_logout_request *)request->pdu_bhs;
 	if ((bhslr->bhslr_reason & 0x7f) != BHSLR_REASON_CLOSE_SESSION)
 		log_debugx("received Logout PDU with invalid reason 0x%x; "
 		    "continuing anyway", bhslr->bhslr_reason & 0x7f);
 	if (ISCSI_SNLT(ntohl(bhslr->bhslr_cmdsn), conn->conn_cmdsn)) {
 		log_errx(1, "received Logout PDU with decreasing CmdSN: "
 		    "was %u, is %u", conn->conn_cmdsn,
 		    ntohl(bhslr->bhslr_cmdsn));
 	}
 	if (ntohl(bhslr->bhslr_expstatsn) != conn->conn_statsn) {
 		log_errx(1, "received Logout PDU with wrong StatSN: "
 		    "is %u, should be %u", ntohl(bhslr->bhslr_expstatsn),
 		    conn->conn_statsn);
 	}
 	conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn);
 	if ((bhslr->bhslr_opcode & ISCSI_BHS_OPCODE_IMMEDIATE) == 0)
 		conn->conn_cmdsn++;
 
 	return (request);
 }
 
 static struct pdu *
 logout_new_response(struct pdu *request)
 {
 	struct pdu *response;
 	struct connection *conn;
 	struct iscsi_bhs_logout_request *bhslr;
 	struct iscsi_bhs_logout_response *bhslr2;
 
 	bhslr = (struct iscsi_bhs_logout_request *)request->pdu_bhs;
 	conn = request->pdu_connection;
 
 	response = pdu_new_response(request);
 	bhslr2 = (struct iscsi_bhs_logout_response *)response->pdu_bhs;
 	bhslr2->bhslr_opcode = ISCSI_BHS_OPCODE_LOGOUT_RESPONSE;
 	bhslr2->bhslr_flags = 0x80;
 	bhslr2->bhslr_response = BHSLR_RESPONSE_CLOSED_SUCCESSFULLY;
 	bhslr2->bhslr_initiator_task_tag = bhslr->bhslr_initiator_task_tag;
 	bhslr2->bhslr_statsn = htonl(conn->conn_statsn++);
 	bhslr2->bhslr_expcmdsn = htonl(conn->conn_cmdsn);
 	bhslr2->bhslr_maxcmdsn = htonl(conn->conn_cmdsn);
 
 	return (response);
 }
 
 static void
 discovery_add_target(struct keys *response_keys, const struct target *targ)
 {
 	struct port *port;
 	struct portal *portal;
 	char *buf;
 	char hbuf[NI_MAXHOST], sbuf[NI_MAXSERV];
 	struct addrinfo *ai;
 	int ret;
 
 	keys_add(response_keys, "TargetName", targ->t_name);
 	TAILQ_FOREACH(port, &targ->t_ports, p_ts) {
 	    if (port->p_portal_group == NULL)
 		continue;
 	    TAILQ_FOREACH(portal, &port->p_portal_group->pg_portals, p_next) {
 		ai = portal->p_ai;
 		ret = getnameinfo(ai->ai_addr, ai->ai_addrlen,
 		    hbuf, sizeof(hbuf), sbuf, sizeof(sbuf),
 		    NI_NUMERICHOST | NI_NUMERICSERV);
 		if (ret != 0) {
 			log_warnx("getnameinfo: %s", gai_strerror(ret));
 			continue;
 		}
 		switch (ai->ai_addr->sa_family) {
 		case AF_INET:
 			if (strcmp(hbuf, "0.0.0.0") == 0)
 				continue;
 			ret = asprintf(&buf, "%s:%s,%d", hbuf, sbuf,
 			    port->p_portal_group->pg_tag);
 			break;
 		case AF_INET6:
 			if (strcmp(hbuf, "::") == 0)
 				continue;
 			ret = asprintf(&buf, "[%s]:%s,%d", hbuf, sbuf,
 			    port->p_portal_group->pg_tag);
 			break;
 		default:
 			continue;
 		}
 		if (ret <= 0)
 		    log_err(1, "asprintf");
 		keys_add(response_keys, "TargetAddress", buf);
 		free(buf);
 	    }
 	}
 }
 
 static bool
 discovery_target_filtered_out(const struct connection *conn,
     const struct port *port)
 {
 	const struct auth_group *ag;
 	const struct portal_group *pg;
 	const struct target *targ;
 	const struct auth *auth;
 	int error;
 
 	targ = port->p_target;
 	ag = port->p_auth_group;
 	if (ag == NULL)
 		ag = targ->t_auth_group;
 	pg = conn->conn_portal->p_portal_group;
 
 	assert(pg->pg_discovery_auth_group != PG_FILTER_UNKNOWN);
 
 	if (pg->pg_discovery_filter >= PG_FILTER_PORTAL &&
 	    auth_portal_check(ag, &conn->conn_initiator_sa) != 0) {
 		log_debugx("initiator does not match initiator portals "
 		    "allowed for target \"%s\"; skipping", targ->t_name);
 		return (true);
 	}
 
 	if (pg->pg_discovery_filter >= PG_FILTER_PORTAL_NAME &&
 	    auth_name_check(ag, conn->conn_initiator_name) != 0) {
 		log_debugx("initiator does not match initiator names "
 		    "allowed for target \"%s\"; skipping", targ->t_name);
 		return (true);
 	}
 
 	if (pg->pg_discovery_filter >= PG_FILTER_PORTAL_NAME_AUTH &&
 	    ag->ag_type != AG_TYPE_NO_AUTHENTICATION) {
 		if (conn->conn_chap == NULL) {
 			assert(pg->pg_discovery_auth_group->ag_type ==
 			    AG_TYPE_NO_AUTHENTICATION);
 
 			log_debugx("initiator didn't authenticate, but target "
 			    "\"%s\" requires CHAP; skipping", targ->t_name);
 			return (true);
 		}
 
 		assert(conn->conn_user != NULL);
 		auth = auth_find(ag, conn->conn_user);
 		if (auth == NULL) {
 			log_debugx("CHAP user \"%s\" doesn't match target "
 			    "\"%s\"; skipping", conn->conn_user, targ->t_name);
 			return (true);
 		}
 
 		error = chap_authenticate(conn->conn_chap, auth->a_secret);
 		if (error != 0) {
 			log_debugx("password for CHAP user \"%s\" doesn't "
 			    "match target \"%s\"; skipping",
 			    conn->conn_user, targ->t_name);
 			return (true);
 		}
 	}
 
 	return (false);
 }
 
 void
 discovery(struct connection *conn)
 {
 	struct pdu *request, *response;
 	struct keys *request_keys, *response_keys;
 	const struct port *port;
 	const struct portal_group *pg;
 	const char *send_targets;
 
 	pg = conn->conn_portal->p_portal_group;
 
 	log_debugx("beginning discovery session; waiting for Text PDU");
 	request = text_receive(conn);
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 
 	send_targets = keys_find(request_keys, "SendTargets");
 	if (send_targets == NULL)
 		log_errx(1, "received Text PDU without SendTargets");
 
 	response = text_new_response(request);
 	response_keys = keys_new();
 
 	if (strcmp(send_targets, "All") == 0) {
 		TAILQ_FOREACH(port, &pg->pg_ports, p_pgs) {
 			if (discovery_target_filtered_out(conn, port)) {
 				/* Ignore this target. */
 				continue;
 			}
 			discovery_add_target(response_keys, port->p_target);
 		}
 	} else {
 		port = port_find_in_pg(pg, send_targets);
 		if (port == NULL) {
 			log_debugx("initiator requested information on unknown "
 			    "target \"%s\"; returning nothing", send_targets);
 		} else {
 			if (discovery_target_filtered_out(conn, port)) {
 				/* Ignore this target. */
 			} else {
 				discovery_add_target(response_keys, port->p_target);
 			}
 		}
 	}
 	keys_save(response_keys, response);
 
 	pdu_send(response);
 	pdu_delete(response);
 	keys_delete(response_keys);
 	pdu_delete(request);
 	keys_delete(request_keys);
 
 	log_debugx("done sending targets; waiting for Logout PDU");
 	request = logout_receive(conn);
 	response = logout_new_response(request);
 
 	pdu_send(response);
 	pdu_delete(response);
 	pdu_delete(request);
 
 	log_debugx("discovery session done");
 }
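Reviewer note (not part of the patch): discovery_add_target() in the file above builds each TargetAddress value as "host:port,tag" for IPv4 and "[host]:port,tag" for IPv6, using asprintf(3), which is a BSD/GNU extension. The sketch below shows the same two formats with the standard snprintf(3) and a caller-supplied buffer. The function name format_target_address and its parameters are assumptions made for illustration; they are not part of ctld.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in ctld): write "host:serv,tag" or, for IPv6,
 * "[host]:serv,tag" into buf.  Returns false if the result was truncated
 * or snprintf() failed.
 */
static bool
format_target_address(char *buf, size_t buflen, bool is_ipv6,
    const char *host, const char *serv, int tag)
{
	int n;

	if (is_ipv6)
		n = snprintf(buf, buflen, "[%s]:%s,%d", host, serv, tag);
	else
		n = snprintf(buf, buflen, "%s:%s,%d", host, serv, tag);
	/* snprintf returns the length it wanted to write, excluding the NUL. */
	return (n > 0 && (size_t)n < buflen);
}
```

The brackets around the IPv6 literal keep the colons inside the address from being confused with the port separator.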
Index: user/ngie/more-tests/usr.sbin/ctld/isns.c
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/isns.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/isns.c	(revision 281585)
@@ -1,258 +1,252 @@
 /*-
  * Copyright (c) 2014 Alexander Motin <mav@FreeBSD.org>
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/time.h>
 #include <sys/socket.h>
 #include <sys/wait.h>
 #include <sys/endian.h>
 #include <netinet/in.h>
 #include <arpa/inet.h>
-#include <assert.h>
-#include <ctype.h>
-#include <errno.h>
 #include <netdb.h>
-#include <signal.h>
 #include <stdbool.h>
-#include <stdio.h>
-#include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 
 #include "ctld.h"
 #include "isns.h"
 
 struct isns_req *
 isns_req_alloc(void)
 {
 	struct isns_req *req;
 
 	req = calloc(sizeof(struct isns_req), 1);
 	if (req == NULL) {
 		log_err(1, "calloc");
 		return (NULL);
 	}
 	req->ir_buflen = sizeof(struct isns_hdr);
 	req->ir_usedlen = 0;
 	req->ir_buf = calloc(req->ir_buflen, 1);
 	if (req->ir_buf == NULL) {
 		free(req);
 		log_err(1, "calloc");
 		return (NULL);
 	}
 	return (req);
 }
 
 struct isns_req *
 isns_req_create(uint16_t func, uint16_t flags)
 {
 	struct isns_req *req;
 	struct isns_hdr *hdr;
 
 	req = isns_req_alloc();
 	req->ir_usedlen = sizeof(struct isns_hdr);
 	hdr = (struct isns_hdr *)req->ir_buf;
 	be16enc(hdr->ih_version, ISNS_VERSION);
 	be16enc(hdr->ih_function, func);
 	be16enc(hdr->ih_flags, flags);
 	return (req);
 }
 
 void
 isns_req_free(struct isns_req *req)
 {
 
 	free(req->ir_buf);
 	free(req);
 }
 
 static int
 isns_req_getspace(struct isns_req *req, uint32_t len)
 {
 	void *newbuf;
 	int newlen;
 
 	if (req->ir_usedlen + len <= req->ir_buflen)
 		return (0);
 	newlen = 1 << flsl(req->ir_usedlen + len);
 	newbuf = realloc(req->ir_buf, newlen);
 	if (newbuf == NULL) {
 		log_err(1, "realloc");
 		return (1);
 	}
 	req->ir_buf = newbuf;
 	req->ir_buflen = newlen;
 	return (0);
 }
 
 void
 isns_req_add(struct isns_req *req, uint32_t tag, uint32_t len,
     const void *value)
 {
 	struct isns_tlv *tlv;
 	uint32_t vlen;
 
 	vlen = len + ((len & 3) ? (4 - (len & 3)) : 0);
 	isns_req_getspace(req, sizeof(*tlv) + vlen);
 	tlv = (struct isns_tlv *)&req->ir_buf[req->ir_usedlen];
 	be32enc(tlv->it_tag, tag);
 	be32enc(tlv->it_length, vlen);
 	memcpy(tlv->it_value, value, len);
 	if (vlen != len)
 		memset(&tlv->it_value[len], 0, vlen - len);
 	req->ir_usedlen += sizeof(*tlv) + vlen;
 }
 
 void
 isns_req_add_delim(struct isns_req *req)
 {
 
 	isns_req_add(req, 0, 0, NULL);
 }
 
 void
 isns_req_add_str(struct isns_req *req, uint32_t tag, const char *value)
 {
 
 	isns_req_add(req, tag, strlen(value) + 1, value);
 }
 
 void
 isns_req_add_32(struct isns_req *req, uint32_t tag, uint32_t value)
 {
 	uint32_t beval;
 
 	be32enc(&beval, value);
 	isns_req_add(req, tag, sizeof(value), &beval);
 }
 
 void
 isns_req_add_addr(struct isns_req *req, uint32_t tag, struct addrinfo *ai)
 {
 	struct sockaddr_in *in4;
 	struct sockaddr_in6 *in6;
 	uint8_t buf[16];
 
 	switch (ai->ai_addr->sa_family) {
 	case AF_INET:
 		in4 = (struct sockaddr_in *)(void *)ai->ai_addr;
 		memset(buf, 0, 10);
 		buf[10] = 0xff;
 		buf[11] = 0xff;
 		memcpy(&buf[12], &in4->sin_addr, sizeof(in4->sin_addr));
 		isns_req_add(req, tag, sizeof(buf), buf);
 		break;
 	case AF_INET6:
 		in6 = (struct sockaddr_in6 *)(void *)ai->ai_addr;
 		isns_req_add(req, tag, sizeof(in6->sin6_addr), &in6->sin6_addr);
 		break;
 	default:
 		log_errx(1, "Unsupported address family %d",
 		    ai->ai_addr->sa_family);
 	}
 }
 
 void
 isns_req_add_port(struct isns_req *req, uint32_t tag, struct addrinfo *ai)
 {
 	struct sockaddr_in *in4;
 	struct sockaddr_in6 *in6;
 	uint32_t buf;
 
 	switch (ai->ai_addr->sa_family) {
 	case AF_INET:
 		in4 = (struct sockaddr_in *)(void *)ai->ai_addr;
 		be32enc(&buf, ntohs(in4->sin_port));
 		isns_req_add(req, tag, sizeof(buf), &buf);
 		break;
 	case AF_INET6:
 		in6 = (struct sockaddr_in6 *)(void *)ai->ai_addr;
 		be32enc(&buf, ntohs(in6->sin6_port));
 		isns_req_add(req, tag, sizeof(buf), &buf);
 		break;
 	default:
 		log_errx(1, "Unsupported address family %d",
 		    ai->ai_addr->sa_family);
 	}
 }
 
 int
 isns_req_send(int s, struct isns_req *req)
 {
 	struct isns_hdr *hdr;
 	int res;
 
 	hdr = (struct isns_hdr *)req->ir_buf;
 	be16enc(hdr->ih_length, req->ir_usedlen - sizeof(*hdr));
 	be16enc(hdr->ih_flags, be16dec(hdr->ih_flags) |
 	    ISNS_FLAG_LAST | ISNS_FLAG_FIRST);
 	be16enc(hdr->ih_transaction, 0);
 	be16enc(hdr->ih_sequence, 0);
 
 	res = write(s, req->ir_buf, req->ir_usedlen);
 	return ((res < 0) ? -1 : 0);
 }
 
 int
 isns_req_receive(int s, struct isns_req *req)
 {
 	struct isns_hdr *hdr;
 	ssize_t res, len;
 
 	req->ir_usedlen = 0;
 	isns_req_getspace(req, sizeof(*hdr));
 	res = read(s, req->ir_buf, sizeof(*hdr));
 	if (res < (ssize_t)sizeof(*hdr))
 		return (-1);
 	req->ir_usedlen = sizeof(*hdr);
 	hdr = (struct isns_hdr *)req->ir_buf;
 	if (be16dec(hdr->ih_version) != ISNS_VERSION)
 		return (-1);
 	if ((be16dec(hdr->ih_flags) & (ISNS_FLAG_LAST | ISNS_FLAG_FIRST)) !=
 	    (ISNS_FLAG_LAST | ISNS_FLAG_FIRST))
 		return (-1);
 	len = be16dec(hdr->ih_length);
 	isns_req_getspace(req, len);
 	res = read(s, &req->ir_buf[req->ir_usedlen], len);
 	if (res < len)
 		return (-1);
 	req->ir_usedlen += len;
 	return (0);
 }
 
 uint32_t
 isns_req_get_status(struct isns_req *req)
 {
 
 	if (req->ir_usedlen < sizeof(struct isns_hdr) + 4)
 		return (-1);
 	return (be32dec(&req->ir_buf[sizeof(struct isns_hdr)]));
 }
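Reviewer note (not part of the patch): isns_req_add() in the file above stores each attribute as a tag, a length, and a value padded with zero bytes to a multiple of four. The sketch below isolates that padding rule and the resulting byte layout. It is a standalone illustration under stated assumptions: the names tlv_padded_len, put_be32 and tlv_encode are hypothetical, a fixed caller-supplied buffer replaces the growable request buffer, and put_be32 stands in for be32enc(3) so the example builds outside FreeBSD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round len up to the next multiple of 4 (same expression as vlen above). */
static uint32_t
tlv_padded_len(uint32_t len)
{
	return (len + ((len & 3) ? (4 - (len & 3)) : 0));
}

/* Store a 32-bit value in big-endian order; stand-in for be32enc(3). */
static void
put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Hypothetical encoder: write tag, padded length and zero-padded value into
 * out, which holds cap bytes.  Returns the number of bytes written, or 0 if
 * the TLV does not fit or the padded length would overflow.
 */
static size_t
tlv_encode(uint8_t *out, size_t cap, uint32_t tag, const void *value,
    uint32_t len)
{
	uint32_t vlen = tlv_padded_len(len);

	assert(out != NULL);
	if (vlen < len || cap < 8 || vlen > cap - 8)
		return (0);
	put_be32(out, tag);
	put_be32(out + 4, vlen);
	if (len != 0)		/* avoid memcpy() from a NULL value */
		memcpy(out + 8, value, len);
	memset(out + 8 + len, 0, vlen - len);
	return (8 + (size_t)vlen);
}
```

For a 5-byte value the padded length is 8, so the encoded TLV occupies 16 bytes: 4 for the tag, 4 for the length, 5 for the value and 3 zero bytes. The len != 0 guard is a choice made in this sketch; the delimiter case in the patch passes a NULL value with length 0 straight to memcpy().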
Index: user/ngie/more-tests/usr.sbin/ctld/keys.c
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/keys.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/keys.c	(revision 281585)
@@ -1,198 +1,197 @@
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <assert.h>
-#include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include "ctld.h"
 
 struct keys *
 keys_new(void)
 {
 	struct keys *keys;
 
 	keys = calloc(1, sizeof(*keys));
 	if (keys == NULL)
 		log_err(1, "calloc");
 
 	return (keys);
 }
 
 void
 keys_delete(struct keys *keys)
 {
 
 	free(keys->keys_data);
 	free(keys);
 }
 
 void
 keys_load(struct keys *keys, const struct pdu *pdu)
 {
 	int i;
 	char *pair;
 	size_t pair_len;
 
 	if (pdu->pdu_data_len == 0)
 		return;
 
 	if (pdu->pdu_data[pdu->pdu_data_len - 1] != '\0')
 		log_errx(1, "protocol error: key not NUL-terminated");
 
 	assert(keys->keys_data == NULL);
 	keys->keys_data_len = pdu->pdu_data_len;
 	keys->keys_data = malloc(keys->keys_data_len);
 	if (keys->keys_data == NULL)
 		log_err(1, "malloc");
 	memcpy(keys->keys_data, pdu->pdu_data, keys->keys_data_len);
 
 	/*
 	 * XXX: Review this carefully.
 	 */
 	pair = keys->keys_data;
 	for (i = 0;; i++) {
 		if (i >= KEYS_MAX)
 			log_errx(1, "too many keys received");
 
 		pair_len = strlen(pair);
 
 		keys->keys_values[i] = pair;
 		keys->keys_names[i] = strsep(&keys->keys_values[i], "=");
 		if (keys->keys_names[i] == NULL || keys->keys_values[i] == NULL)
 			log_errx(1, "malformed keys");
 		log_debugx("key received: \"%s=%s\"",
 		    keys->keys_names[i], keys->keys_values[i]);
 
 		pair += pair_len + 1; /* +1 to skip the terminating '\0'. */
 		if (pair == keys->keys_data + keys->keys_data_len)
 			break;
 		assert(pair < keys->keys_data + keys->keys_data_len);
 	}
 }
 
 void
 keys_save(struct keys *keys, struct pdu *pdu)
 {
 	char *data;
 	size_t len;
 	int i;
 
 	/*
 	 * XXX: Not particularly efficient.
 	 */
 	len = 0;
 	for (i = 0; i < KEYS_MAX; i++) {
 		if (keys->keys_names[i] == NULL)
 			break;
 		/*
 		 * +1 for '=', +1 for '\0'.
 		 */
 		len += strlen(keys->keys_names[i]) +
 		    strlen(keys->keys_values[i]) + 2;
 	}
 
 	if (len == 0)
 		return;
 
 	data = malloc(len);
 	if (data == NULL)
 		log_err(1, "malloc");
 
 	pdu->pdu_data = data;
 	pdu->pdu_data_len = len;
 
 	for (i = 0; i < KEYS_MAX; i++) {
 		if (keys->keys_names[i] == NULL)
 			break;
 		data += sprintf(data, "%s=%s",
 		        keys->keys_names[i], keys->keys_values[i]);
 		data += 1; /* for '\0'. */
 	}
 }
 
 const char *
 keys_find(struct keys *keys, const char *name)
 {
 	int i;
 
 	/*
 	 * Note that we don't handle duplicated key names here,
 	 * as they are not supposed to happen in requests, and if they do,
 	 * it's an initiator error.
 	 */
 	for (i = 0; i < KEYS_MAX; i++) {
 		if (keys->keys_names[i] == NULL)
 			return (NULL);
 		if (strcmp(keys->keys_names[i], name) == 0)
 			return (keys->keys_values[i]);
 	}
 	return (NULL);
 }
 
 void
 keys_add(struct keys *keys, const char *name, const char *value)
 {
 	int i;
 
 	log_debugx("key to send: \"%s=%s\"", name, value);
 
 	/*
 	 * Note that we don't check for duplicates here, as they are perfectly
 	 * fine in responses, e.g. the "TargetName" keys in discovery session
 	 * response.
 	 */
 	for (i = 0; i < KEYS_MAX; i++) {
 		if (keys->keys_names[i] == NULL) {
 			keys->keys_names[i] = checked_strdup(name);
 			keys->keys_values[i] = checked_strdup(value);
 			return;
 		}
 	}
 	log_errx(1, "too many keys");
 }
 
 void
 keys_add_int(struct keys *keys, const char *name, int value)
 {
 	char *str;
 	int ret;
 
 	ret = asprintf(&str, "%d", value);
 	if (ret <= 0)
 		log_err(1, "asprintf");
 
 	keys_add(keys, name, str);
 	free(str);
 }
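Note on the key format handled by keys_load() above: the PDU data segment is
treated as a sequence of NUL-terminated "name=value" strings, split in place.
The following is a minimal standalone sketch of that idea, not the ctld code
itself. The names sketch_keys, sketch_keys_parse and SKETCH_KEYS_MAX are
hypothetical, and unlike keys_load() it returns -1 on malformed input instead
of exiting via log_errx().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_KEYS_MAX	8	/* hypothetical limit, stands in for KEYS_MAX */

struct sketch_keys {
	char *names[SKETCH_KEYS_MAX];
	char *values[SKETCH_KEYS_MAX];
};

/*
 * Split a writable buffer of NUL-terminated "name=value" pairs in place.
 * Returns the number of pairs found, or -1 if the buffer is malformed
 * (not NUL-terminated, a pair without '=', or too many pairs).
 */
static int
sketch_keys_parse(struct sketch_keys *k, char *data, size_t len)
{
	char *pair, *end, *eq;
	size_t pair_len;
	int n;

	assert(k != NULL && data != NULL);

	if (len == 0)
		return (0);
	if (data[len - 1] != '\0')
		return (-1);

	end = data + len;
	n = 0;
	for (pair = data; pair < end; pair += pair_len + 1) {
		/* Measure before splitting; writing '\0' shortens the string. */
		pair_len = strlen(pair);
		if (n >= SKETCH_KEYS_MAX)
			return (-1);
		eq = strchr(pair, '=');
		if (eq == NULL)
			return (-1);
		*eq = '\0';
		k->names[n] = pair;
		k->values[n] = eq + 1;
		n++;
	}
	return (n);
}
```

Because the final byte is checked to be NUL before the loop starts, every
strlen() call stays inside the buffer, which is the same property keys_load()
relies on before walking the pairs.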
Index: user/ngie/more-tests/usr.sbin/ctld/login.c
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/login.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/login.c	(revision 281585)
@@ -1,1006 +1,1004 @@
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <assert.h>
 #include <stdbool.h>
-#include <stdint.h>
-#include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <netinet/in.h>
 
 #include "ctld.h"
 #include "iscsi_proto.h"
 
 static void login_send_error(struct pdu *request,
     char class, char detail);
 
 static void
 login_set_nsg(struct pdu *response, int nsg)
 {
 	struct iscsi_bhs_login_response *bhslr;
 
 	assert(nsg == BHSLR_STAGE_SECURITY_NEGOTIATION ||
 	    nsg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION ||
 	    nsg == BHSLR_STAGE_FULL_FEATURE_PHASE);
 
 	bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 
 	bhslr->bhslr_flags &= 0xFC;
 	bhslr->bhslr_flags |= nsg;
 }
 
 static int
 login_csg(const struct pdu *request)
 {
 	struct iscsi_bhs_login_request *bhslr;
 
 	bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs;
 
 	return ((bhslr->bhslr_flags & 0x0C) >> 2);
 }
 
 static void
 login_set_csg(struct pdu *response, int csg)
 {
 	struct iscsi_bhs_login_response *bhslr;
 
 	assert(csg == BHSLR_STAGE_SECURITY_NEGOTIATION ||
 	    csg == BHSLR_STAGE_OPERATIONAL_NEGOTIATION ||
 	    csg == BHSLR_STAGE_FULL_FEATURE_PHASE);
 
 	bhslr = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 
 	bhslr->bhslr_flags &= 0xF3;
 	bhslr->bhslr_flags |= csg << 2;
 }
 
 static struct pdu *
 login_receive(struct connection *conn, bool initial)
 {
 	struct pdu *request;
 	struct iscsi_bhs_login_request *bhslr;
 
 	request = pdu_new(conn);
 	pdu_receive(request);
 	if ((request->pdu_bhs->bhs_opcode & ~ISCSI_BHS_OPCODE_IMMEDIATE) !=
 	    ISCSI_BHS_OPCODE_LOGIN_REQUEST) {
 		/*
 		 * The first PDU in a session is special: if we receive any
 		 * PDU other than a Login Request, we have to drop the
 		 * connection without sending a response ("A target receiving
 		 * any PDU except a Login request before the Login Phase is
 		 * started MUST immediately terminate the connection on which
 		 * the PDU was received.")
 		 */
 		if (initial == false)
 			login_send_error(request, 0x02, 0x0b);
 		log_errx(1, "protocol error: received invalid opcode 0x%x",
 		    request->pdu_bhs->bhs_opcode);
 	}
 	bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs;
 	/*
 	 * XXX: Implement the C flag some day.
 	 */
 	if ((bhslr->bhslr_flags & BHSLR_FLAGS_CONTINUE) != 0) {
 		login_send_error(request, 0x03, 0x00);
 		log_errx(1, "received Login PDU with unsupported \"C\" flag");
 	}
 	if (bhslr->bhslr_version_max != 0x00) {
 		login_send_error(request, 0x02, 0x05);
 		log_errx(1, "received Login PDU with unsupported "
 		    "Version-max 0x%x", bhslr->bhslr_version_max);
 	}
 	if (bhslr->bhslr_version_min != 0x00) {
 		login_send_error(request, 0x02, 0x05);
 		log_errx(1, "received Login PDU with unsupported "
 		    "Version-min 0x%x", bhslr->bhslr_version_min);
 	}
 	if (ISCSI_SNLT(ntohl(bhslr->bhslr_cmdsn), conn->conn_cmdsn)) {
 		login_send_error(request, 0x02, 0x05);
 		log_errx(1, "received Login PDU with decreasing CmdSN: "
 		    "was %u, is %u", conn->conn_cmdsn,
 		    ntohl(bhslr->bhslr_cmdsn));
 	}
 	if (initial == false &&
 	    ntohl(bhslr->bhslr_expstatsn) != conn->conn_statsn) {
 		login_send_error(request, 0x02, 0x05);
 		log_errx(1, "received Login PDU with wrong ExpStatSN: "
 		    "is %u, should be %u", ntohl(bhslr->bhslr_expstatsn),
 		    conn->conn_statsn);
 	}
 	conn->conn_cmdsn = ntohl(bhslr->bhslr_cmdsn);
 
 	return (request);
 }
 
 static struct pdu *
 login_new_response(struct pdu *request)
 {
 	struct pdu *response;
 	struct connection *conn;
 	struct iscsi_bhs_login_request *bhslr;
 	struct iscsi_bhs_login_response *bhslr2;
 
 	bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs;
 	conn = request->pdu_connection;
 
 	response = pdu_new_response(request);
 	bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 	bhslr2->bhslr_opcode = ISCSI_BHS_OPCODE_LOGIN_RESPONSE;
 	login_set_csg(response, BHSLR_STAGE_SECURITY_NEGOTIATION);
 	memcpy(bhslr2->bhslr_isid,
 	    bhslr->bhslr_isid, sizeof(bhslr2->bhslr_isid));
 	bhslr2->bhslr_initiator_task_tag = bhslr->bhslr_initiator_task_tag;
 	bhslr2->bhslr_statsn = htonl(conn->conn_statsn++);
 	bhslr2->bhslr_expcmdsn = htonl(conn->conn_cmdsn);
 	bhslr2->bhslr_maxcmdsn = htonl(conn->conn_cmdsn);
 
 	return (response);
 }
 
 static void
 login_send_error(struct pdu *request, char class, char detail)
 {
 	struct pdu *response;
 	struct iscsi_bhs_login_response *bhslr2;
 
 	log_debugx("sending Login Response PDU with failure class 0x%x/0x%x; "
 	    "see next line for reason", class, detail);
 	response = login_new_response(request);
 	bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 	bhslr2->bhslr_status_class = class;
 	bhslr2->bhslr_status_detail = detail;
 
 	pdu_send(response);
 	pdu_delete(response);
 }
 
 static int
 login_list_contains(const char *list, const char *what)
 {
 	char *tofree, *str, *token;
 
 	tofree = str = checked_strdup(list);
 
 	while ((token = strsep(&str, ",")) != NULL) {
 		if (strcmp(token, what) == 0) {
 			free(tofree);
 			return (1);
 		}
 	}
 	free(tofree);
 	return (0);
 }
 
 static int
 login_list_prefers(const char *list,
     const char *choice1, const char *choice2)
 {
 	char *tofree, *str, *token;
 
 	tofree = str = checked_strdup(list);
 
 	while ((token = strsep(&str, ",")) != NULL) {
 		if (strcmp(token, choice1) == 0) {
 			free(tofree);
 			return (1);
 		}
 		if (strcmp(token, choice2) == 0) {
 			free(tofree);
 			return (2);
 		}
 	}
 	free(tofree);
 	return (-1);
 }
 
 static struct pdu *
 login_receive_chap_a(struct connection *conn)
 {
 	struct pdu *request;
 	struct keys *request_keys;
 	const char *chap_a;
 
 	request = login_receive(conn, false);
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 
 	chap_a = keys_find(request_keys, "CHAP_A");
 	if (chap_a == NULL) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received CHAP Login PDU without CHAP_A");
 	}
 	if (login_list_contains(chap_a, "5") == 0) {
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "received CHAP Login PDU with unsupported CHAP_A "
 		    "\"%s\"", chap_a);
 	}
 	keys_delete(request_keys);
 
 	return (request);
 }
 
 static void
 login_send_chap_c(struct pdu *request, struct chap *chap)
 {
 	struct pdu *response;
 	struct keys *response_keys;
 	char *chap_c, *chap_i;
 
 	chap_c = chap_get_challenge(chap);
 	chap_i = chap_get_id(chap);
 
 	response = login_new_response(request);
 	response_keys = keys_new();
 	keys_add(response_keys, "CHAP_A", "5");
 	keys_add(response_keys, "CHAP_I", chap_i);
 	keys_add(response_keys, "CHAP_C", chap_c);
 	free(chap_i);
 	free(chap_c);
 	keys_save(response_keys, response);
 	pdu_send(response);
 	pdu_delete(response);
 	keys_delete(response_keys);
 }
 
 static struct pdu *
 login_receive_chap_r(struct connection *conn, struct auth_group *ag,
     struct chap *chap, const struct auth **authp)
 {
 	struct pdu *request;
 	struct keys *request_keys;
 	const char *chap_n, *chap_r;
 	const struct auth *auth;
 	int error;
 
 	request = login_receive(conn, false);
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 
 	chap_n = keys_find(request_keys, "CHAP_N");
 	if (chap_n == NULL) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received CHAP Login PDU without CHAP_N");
 	}
 	chap_r = keys_find(request_keys, "CHAP_R");
 	if (chap_r == NULL) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received CHAP Login PDU without CHAP_R");
 	}
 	error = chap_receive(chap, chap_r);
 	if (error != 0) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received CHAP Login PDU with malformed CHAP_R");
 	}
 
 	/*
 	 * Verify the response.
 	 */
 	assert(ag->ag_type == AG_TYPE_CHAP ||
 	    ag->ag_type == AG_TYPE_CHAP_MUTUAL);
 	auth = auth_find(ag, chap_n);
 	if (auth == NULL) {
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "received CHAP Login with invalid user \"%s\"",
 		    chap_n);
 	}
 
 	assert(auth->a_secret != NULL);
 	assert(strlen(auth->a_secret) > 0);
 
 	error = chap_authenticate(chap, auth->a_secret);
 	if (error != 0) {
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "CHAP authentication failed for user \"%s\"",
 		    auth->a_user);
 	}
 
 	keys_delete(request_keys);
 
 	*authp = auth;
 	return (request);
 }
 
 static void
 login_send_chap_success(struct pdu *request,
     const struct auth *auth)
 {
 	struct pdu *response;
 	struct keys *request_keys, *response_keys;
 	struct iscsi_bhs_login_response *bhslr2;
 	struct rchap *rchap;
 	const char *chap_i, *chap_c;
 	char *chap_r;
 	int error;
 
 	response = login_new_response(request);
 	bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 	bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT;
 	login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION);
 
 	/*
 	 * Actually, one more thing: mutual authentication.
 	 */
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 	chap_i = keys_find(request_keys, "CHAP_I");
 	chap_c = keys_find(request_keys, "CHAP_C");
 	if (chap_i != NULL || chap_c != NULL) {
 		if (chap_i == NULL) {
 			login_send_error(request, 0x02, 0x07);
 			log_errx(1, "initiator requested target "
 			    "authentication, but didn't send CHAP_I");
 		}
 		if (chap_c == NULL) {
 			login_send_error(request, 0x02, 0x07);
 			log_errx(1, "initiator requested target "
 			    "authentication, but didn't send CHAP_C");
 		}
 		if (auth->a_auth_group->ag_type != AG_TYPE_CHAP_MUTUAL) {
 			login_send_error(request, 0x02, 0x01);
 			log_errx(1, "initiator requests target authentication "
 			    "for user \"%s\", but mutual user/secret "
 			    "is not set", auth->a_user);
 		}
 
 		log_debugx("performing mutual authentication as user \"%s\"",
 		    auth->a_mutual_user);
 
 		rchap = rchap_new(auth->a_mutual_secret);
 		error = rchap_receive(rchap, chap_i, chap_c);
 		if (error != 0) {
 			login_send_error(request, 0x02, 0x07);
 			log_errx(1, "received CHAP Login PDU with malformed "
 			    "CHAP_I or CHAP_C");
 		}
 		chap_r = rchap_get_response(rchap);
 		rchap_delete(rchap);
 		response_keys = keys_new();
 		keys_add(response_keys, "CHAP_N", auth->a_mutual_user);
 		keys_add(response_keys, "CHAP_R", chap_r);
 		free(chap_r);
 		keys_save(response_keys, response);
 		keys_delete(response_keys);
 	} else {
 		log_debugx("initiator did not request target authentication");
 	}
 
 	keys_delete(request_keys);
 	pdu_send(response);
 	pdu_delete(response);
 }
 
 static void
 login_chap(struct connection *conn, struct auth_group *ag)
 {
 	const struct auth *auth;
 	struct chap *chap;
 	struct pdu *request;
 
 	/*
 	 * Receive CHAP_A PDU.
 	 */
 	log_debugx("beginning CHAP authentication; waiting for CHAP_A");
 	request = login_receive_chap_a(conn);
 
 	/*
 	 * Generate the challenge.
 	 */
 	chap = chap_new();
 
 	/*
 	 * Send the challenge.
 	 */
 	log_debugx("sending CHAP_C, binary challenge size is %zu bytes",
 	    sizeof(chap->chap_challenge));
 	login_send_chap_c(request, chap);
 	pdu_delete(request);
 
 	/*
 	 * Receive CHAP_N/CHAP_R PDU and authenticate.
 	 */
 	log_debugx("waiting for CHAP_N/CHAP_R");
 	request = login_receive_chap_r(conn, ag, chap, &auth);
 
 	/*
 	 * Yay, authentication succeeded!
 	 */
 	log_debugx("authentication succeeded for user \"%s\"; "
 	    "transitioning to Negotiation Phase", auth->a_user);
 	login_send_chap_success(request, auth);
 	pdu_delete(request);
 
 	/*
 	 * Leave username and CHAP information for discovery().
 	 */
 	conn->conn_user = auth->a_user;
 	conn->conn_chap = chap;
 }
 
 static void
 login_negotiate_key(struct pdu *request, const char *name,
     const char *value, bool skipped_security, struct keys *response_keys)
 {
 	int which;
 	size_t tmp;
 	struct connection *conn;
 
 	conn = request->pdu_connection;
 
 	if (strcmp(name, "InitiatorName") == 0) {
 		if (!skipped_security)
 			log_errx(1, "initiator resent InitiatorName");
 	} else if (strcmp(name, "SessionType") == 0) {
 		if (!skipped_security)
 			log_errx(1, "initiator resent SessionType");
 	} else if (strcmp(name, "TargetName") == 0) {
 		if (!skipped_security)
 			log_errx(1, "initiator resent TargetName");
 	} else if (strcmp(name, "InitiatorAlias") == 0) {
 		if (conn->conn_initiator_alias != NULL)
 			free(conn->conn_initiator_alias);
 		conn->conn_initiator_alias = checked_strdup(value);
 	} else if (strcmp(value, "Irrelevant") == 0) {
 		/* Ignore. */
 	} else if (strcmp(name, "HeaderDigest") == 0) {
 		/*
 		 * We don't handle digests for discovery sessions.
 		 */
 		if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) {
 			log_debugx("discovery session; digests disabled");
 			keys_add(response_keys, name, "None");
 			return;
 		}
 
 		which = login_list_prefers(value, "CRC32C", "None");
 		switch (which) {
 		case 1:
 			log_debugx("initiator prefers CRC32C "
 			    "for header digest; we'll use it");
 			conn->conn_header_digest = CONN_DIGEST_CRC32C;
 			keys_add(response_keys, name, "CRC32C");
 			break;
 		case 2:
 			log_debugx("initiator prefers not to do "
 			    "header digest; we'll comply");
 			keys_add(response_keys, name, "None");
 			break;
 		default:
 			log_warnx("initiator sent unrecognized "
 			    "HeaderDigest value \"%s\"; will use None", value);
 			keys_add(response_keys, name, "None");
 			break;
 		}
 	} else if (strcmp(name, "DataDigest") == 0) {
 		if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) {
 			log_debugx("discovery session; digests disabled");
 			keys_add(response_keys, name, "None");
 			return;
 		}
 
 		which = login_list_prefers(value, "CRC32C", "None");
 		switch (which) {
 		case 1:
 			log_debugx("initiator prefers CRC32C "
 			    "for data digest; we'll use it");
 			conn->conn_data_digest = CONN_DIGEST_CRC32C;
 			keys_add(response_keys, name, "CRC32C");
 			break;
 		case 2:
 			log_debugx("initiator prefers not to do "
 			    "data digest; we'll comply");
 			keys_add(response_keys, name, "None");
 			break;
 		default:
 			log_warnx("initiator sent unrecognized "
 			    "DataDigest value \"%s\"; will use None", value);
 			keys_add(response_keys, name, "None");
 			break;
 		}
 	} else if (strcmp(name, "MaxConnections") == 0) {
 		keys_add(response_keys, name, "1");
 	} else if (strcmp(name, "InitialR2T") == 0) {
 		keys_add(response_keys, name, "Yes");
 	} else if (strcmp(name, "ImmediateData") == 0) {
 		if (conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY) {
 			log_debugx("discovery session; ImmediateData irrelevant");
 			keys_add(response_keys, name, "Irrelevant");
 		} else {
 			if (strcmp(value, "Yes") == 0) {
 				conn->conn_immediate_data = true;
 				keys_add(response_keys, name, "Yes");
 			} else {
 				conn->conn_immediate_data = false;
 				keys_add(response_keys, name, "No");
 			}
 		}
 	} else if (strcmp(name, "MaxRecvDataSegmentLength") == 0) {
 		tmp = strtoul(value, NULL, 10);
 		if (tmp <= 0) {
 			login_send_error(request, 0x02, 0x00);
 			log_errx(1, "received invalid "
 			    "MaxRecvDataSegmentLength");
 		}
 		if (tmp > conn->conn_data_segment_limit) {
 			log_debugx("capping MaxRecvDataSegmentLength "
 			    "from %zu to %zu", tmp, conn->conn_data_segment_limit);
 			tmp = conn->conn_data_segment_limit;
 		}
 		conn->conn_max_data_segment_length = tmp;
 		keys_add_int(response_keys, name, tmp);
 	} else if (strcmp(name, "MaxBurstLength") == 0) {
 		tmp = strtoul(value, NULL, 10);
 		if (tmp <= 0) {
 			login_send_error(request, 0x02, 0x00);
 			log_errx(1, "received invalid MaxBurstLength");
 		}
 		if (tmp > MAX_BURST_LENGTH) {
 			log_debugx("capping MaxBurstLength from %zu to %d",
 			    tmp, MAX_BURST_LENGTH);
 			tmp = MAX_BURST_LENGTH;
 		}
 		conn->conn_max_burst_length = tmp;
 		keys_add_int(response_keys, name, tmp);
 	} else if (strcmp(name, "FirstBurstLength") == 0) {
 		tmp = strtoul(value, NULL, 10);
 		if (tmp <= 0) {
 			login_send_error(request, 0x02, 0x00);
 			log_errx(1, "received invalid "
 			    "FirstBurstLength");
 		}
 		if (tmp > conn->conn_data_segment_limit) {
 			log_debugx("capping FirstBurstLength from %zu to %zu",
 			    tmp, conn->conn_data_segment_limit);
 			tmp = conn->conn_data_segment_limit;
 		}
 		/*
 		 * We don't pass the value to the kernel; it only enforces
 		 * a hardcoded limit anyway.
 		 */
 		keys_add_int(response_keys, name, tmp);
 	} else if (strcmp(name, "DefaultTime2Wait") == 0) {
 		keys_add(response_keys, name, value);
 	} else if (strcmp(name, "DefaultTime2Retain") == 0) {
 		keys_add(response_keys, name, "0");
 	} else if (strcmp(name, "MaxOutstandingR2T") == 0) {
 		keys_add(response_keys, name, "1");
 	} else if (strcmp(name, "DataPDUInOrder") == 0) {
 		keys_add(response_keys, name, "Yes");
 	} else if (strcmp(name, "DataSequenceInOrder") == 0) {
 		keys_add(response_keys, name, "Yes");
 	} else if (strcmp(name, "ErrorRecoveryLevel") == 0) {
 		keys_add(response_keys, name, "0");
 	} else if (strcmp(name, "OFMarker") == 0) {
 		keys_add(response_keys, name, "No");
 	} else if (strcmp(name, "IFMarker") == 0) {
 		keys_add(response_keys, name, "No");
 	} else {
 		log_debugx("unknown key \"%s\"; responding "
 		    "with NotUnderstood", name);
 		keys_add(response_keys, name, "NotUnderstood");
 	}
 }
 
 static void
 login_redirect(struct pdu *request, const char *target_address)
 {
 	struct pdu *response;
 	struct iscsi_bhs_login_response *bhslr2;
 	struct keys *response_keys;
 
 	response = login_new_response(request);
 	login_set_csg(response, login_csg(request));
 	bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 	bhslr2->bhslr_status_class = 0x01;
 	bhslr2->bhslr_status_detail = 0x01;
 
 	response_keys = keys_new();
 	keys_add(response_keys, "TargetAddress", target_address);
 
 	keys_save(response_keys, response);
 	pdu_send(response);
 	pdu_delete(response);
 	keys_delete(response_keys);
 }
 
 static bool
 login_portal_redirect(struct connection *conn, struct pdu *request)
 {
 	const struct portal_group *pg;
 
 	pg = conn->conn_portal->p_portal_group;
 	if (pg->pg_redirection == NULL)
 		return (false);
 
 	log_debugx("portal-group \"%s\" configured to redirect to %s",
 	    pg->pg_name, pg->pg_redirection);
 	login_redirect(request, pg->pg_redirection);
 
 	return (true);
 }
 
 static bool
 login_target_redirect(struct connection *conn, struct pdu *request)
 {
 	const char *target_address;
 
 	assert(conn->conn_portal->p_portal_group->pg_redirection == NULL);
 
 	if (conn->conn_target == NULL)
 		return (false);
 
 	target_address = conn->conn_target->t_redirection;
 	if (target_address == NULL)
 		return (false);
 
 	log_debugx("target \"%s\" configured to redirect to %s",
 	    conn->conn_target->t_name, target_address);
 	login_redirect(request, target_address);
 
 	return (true);
 }
 
 static void
 login_negotiate(struct connection *conn, struct pdu *request)
 {
 	struct pdu *response;
 	struct iscsi_bhs_login_response *bhslr2;
 	struct keys *request_keys, *response_keys;
 	int i;
 	bool redirected, skipped_security;
 
 	if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 		/*
 		 * Query the kernel for the MaxDataSegmentLength it can handle.
 		 * In case of offload, it depends on hardware capabilities.
 		 */
 		assert(conn->conn_target != NULL);
 		kernel_limits(conn->conn_portal->p_portal_group->pg_offload,
 		    &conn->conn_data_segment_limit);
 	} else {
 		conn->conn_data_segment_limit = MAX_DATA_SEGMENT_LENGTH;
 	}
 
 	if (request == NULL) {
 		log_debugx("beginning operational parameter negotiation; "
 		    "waiting for Login PDU");
 		request = login_receive(conn, false);
 		skipped_security = false;
 	} else
 		skipped_security = true;
 
 	/*
 	 * RFC 3720, 10.13.5.  Status-Class and Status-Detail, says
 	 * the redirection SHOULD be accepted by the initiator before
 	 * authentication, but MUST be accepted afterwards; that's
 	 * why we're doing it here and not earlier.
 	 */
 	redirected = login_target_redirect(conn, request);
 	if (redirected) {
 		log_debugx("initiator redirected; exiting");
 		exit(0);
 	}
 
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 
 	response = login_new_response(request);
 	bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 	bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT;
 	bhslr2->bhslr_tsih = htons(0xbadd);
 	login_set_csg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION);
 	login_set_nsg(response, BHSLR_STAGE_FULL_FEATURE_PHASE);
 	response_keys = keys_new();
 
 	if (skipped_security &&
 	    conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 		if (conn->conn_target->t_alias != NULL)
 			keys_add(response_keys,
 			    "TargetAlias", conn->conn_target->t_alias);
 		keys_add_int(response_keys, "TargetPortalGroupTag",
 		    conn->conn_portal->p_portal_group->pg_tag);
 	}
 
 	for (i = 0; i < KEYS_MAX; i++) {
 		if (request_keys->keys_names[i] == NULL)
 			break;
 
 		login_negotiate_key(request, request_keys->keys_names[i],
 		    request_keys->keys_values[i], skipped_security,
 		    response_keys);
 	}
 
 	log_debugx("operational parameter negotiation done; "
 	    "transitioning to Full Feature Phase");
 
 	keys_save(response_keys, response);
 	pdu_send(response);
 	pdu_delete(response);
 	keys_delete(response_keys);
 	pdu_delete(request);
 	keys_delete(request_keys);
 }
 
 void
 login(struct connection *conn)
 {
 	struct pdu *request, *response;
 	struct iscsi_bhs_login_request *bhslr;
 	struct iscsi_bhs_login_response *bhslr2;
 	struct keys *request_keys, *response_keys;
 	struct auth_group *ag;
 	struct portal_group *pg;
 	const char *initiator_name, *initiator_alias, *session_type,
 	    *target_name, *auth_method;
 	bool redirected;
 
 	/*
 	 * Handle the initial Login Request - figure out required authentication
 	 * method and either transition to the next phase, if no authentication
 	 * is required, or call appropriate authentication code.
 	 */
 	log_debugx("beginning Login Phase; waiting for Login PDU");
 	request = login_receive(conn, true);
 	bhslr = (struct iscsi_bhs_login_request *)request->pdu_bhs;
 	if (bhslr->bhslr_tsih != 0) {
 		login_send_error(request, 0x02, 0x0a);
 		log_errx(1, "received Login PDU with non-zero TSIH");
 	}
 
 	pg = conn->conn_portal->p_portal_group;
 
 	memcpy(conn->conn_initiator_isid, bhslr->bhslr_isid,
 	    sizeof(conn->conn_initiator_isid));
 
 	/*
 	 * XXX: Implement the C flag some day.
 	 */
 	request_keys = keys_new();
 	keys_load(request_keys, request);
 
 	assert(conn->conn_initiator_name == NULL);
 	initiator_name = keys_find(request_keys, "InitiatorName");
 	if (initiator_name == NULL) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received Login PDU without InitiatorName");
 	}
 	if (valid_iscsi_name(initiator_name) == false) {
 		login_send_error(request, 0x02, 0x00);
 		log_errx(1, "received Login PDU with invalid InitiatorName");
 	}
 	conn->conn_initiator_name = checked_strdup(initiator_name);
 	log_set_peer_name(conn->conn_initiator_name);
 	/*
 	 * XXX: This doesn't work (does nothing) because of Capsicum.
 	 */
 	setproctitle("%s (%s)", conn->conn_initiator_addr, conn->conn_initiator_name);
 
 	redirected = login_portal_redirect(conn, request);
 	if (redirected) {
 		log_debugx("initiator redirected; exiting");
 		exit(0);
 	}
 
 	initiator_alias = keys_find(request_keys, "InitiatorAlias");
 	if (initiator_alias != NULL)
 		conn->conn_initiator_alias = checked_strdup(initiator_alias);
 
 	assert(conn->conn_session_type == CONN_SESSION_TYPE_NONE);
 	session_type = keys_find(request_keys, "SessionType");
 	if (session_type != NULL) {
 		if (strcmp(session_type, "Normal") == 0) {
 			conn->conn_session_type = CONN_SESSION_TYPE_NORMAL;
 		} else if (strcmp(session_type, "Discovery") == 0) {
 			conn->conn_session_type = CONN_SESSION_TYPE_DISCOVERY;
 		} else {
 			login_send_error(request, 0x02, 0x00);
 			log_errx(1, "received Login PDU with invalid "
 			    "SessionType \"%s\"", session_type);
 		}
 	} else
 		conn->conn_session_type = CONN_SESSION_TYPE_NORMAL;
 
 	assert(conn->conn_target == NULL);
 	if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 		target_name = keys_find(request_keys, "TargetName");
 		if (target_name == NULL) {
 			login_send_error(request, 0x02, 0x07);
 			log_errx(1, "received Login PDU without TargetName");
 		}
 
 		conn->conn_port = port_find_in_pg(pg, target_name);
 		if (conn->conn_port == NULL) {
 			login_send_error(request, 0x02, 0x03);
 			log_errx(1, "requested target \"%s\" not found",
 			    target_name);
 		}
 		conn->conn_target = conn->conn_port->p_target;
 	}
 
 	/*
 	 * At this point we know what kind of authentication we need.
 	 */
 	if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 		ag = conn->conn_port->p_auth_group;
 		if (ag == NULL)
 			ag = conn->conn_target->t_auth_group;
 		if (ag->ag_name != NULL) {
 			log_debugx("initiator requests to connect "
 			    "to target \"%s\"; auth-group \"%s\"",
 			    conn->conn_target->t_name,
 			    ag->ag_name);
 		} else {
 			log_debugx("initiator requests to connect "
 			    "to target \"%s\"", conn->conn_target->t_name);
 		}
 	} else {
 		assert(conn->conn_session_type == CONN_SESSION_TYPE_DISCOVERY);
 		ag = pg->pg_discovery_auth_group;
 		if (ag->ag_name != NULL) {
 			log_debugx("initiator requests "
 			    "discovery session; auth-group \"%s\"", ag->ag_name);
 		} else {
 			log_debugx("initiator requests discovery session");
 		}
 	}
 
 	/*
 	 * Enforce initiator-name and initiator-portal.
 	 */
 	if (auth_name_check(ag, initiator_name) != 0) {
 		login_send_error(request, 0x02, 0x02);
 		log_errx(1, "initiator does not match allowed initiator names");
 	}
 
 	if (auth_portal_check(ag, &conn->conn_initiator_sa) != 0) {
 		login_send_error(request, 0x02, 0x02);
 		log_errx(1, "initiator does not match allowed "
 		    "initiator portals");
 	}
 
 	/*
 	 * Let's see if the initiator intends to do any kind of authentication
 	 * at all.
 	 */
 	if (login_csg(request) == BHSLR_STAGE_OPERATIONAL_NEGOTIATION) {
 		if (ag->ag_type != AG_TYPE_NO_AUTHENTICATION) {
 			login_send_error(request, 0x02, 0x01);
 			log_errx(1, "initiator skipped the authentication, "
 			    "but authentication is required");
 		}
 
 		keys_delete(request_keys);
 
 		log_debugx("initiator skipped the authentication, "
 		    "and we don't need it; proceeding with negotiation");
 		login_negotiate(conn, request);
 		return;
 	}
 
 	if (ag->ag_type == AG_TYPE_NO_AUTHENTICATION) {
 		/*
 		 * Initiator might want to authenticate,
 		 * but we don't need it.
 		 */
 		log_debugx("authentication not required; "
 		    "transitioning to operational parameter negotiation");
 
 		if ((bhslr->bhslr_flags & BHSLR_FLAGS_TRANSIT) == 0)
 			log_warnx("initiator did not set the \"T\" flag; "
 			    "transitioning anyway");
 
 		response = login_new_response(request);
 		bhslr2 = (struct iscsi_bhs_login_response *)response->pdu_bhs;
 		bhslr2->bhslr_flags |= BHSLR_FLAGS_TRANSIT;
 		login_set_nsg(response, BHSLR_STAGE_OPERATIONAL_NEGOTIATION);
 		response_keys = keys_new();
 		/*
 		 * Required by Linux initiator.
 		 */
 		auth_method = keys_find(request_keys, "AuthMethod");
 		if (auth_method != NULL &&
 		    login_list_contains(auth_method, "None"))
 			keys_add(response_keys, "AuthMethod", "None");
 
 		if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 			if (conn->conn_target->t_alias != NULL)
 				keys_add(response_keys,
 				    "TargetAlias", conn->conn_target->t_alias);
 			keys_add_int(response_keys,
 			    "TargetPortalGroupTag", pg->pg_tag);
 		}
 		keys_save(response_keys, response);
 		pdu_send(response);
 		pdu_delete(response);
 		keys_delete(response_keys);
 		pdu_delete(request);
 		keys_delete(request_keys);
 
 		login_negotiate(conn, NULL);
 		return;
 	}
 
 	if (ag->ag_type == AG_TYPE_DENY) {
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "auth-type is \"deny\"");
 	}
 
 	if (ag->ag_type == AG_TYPE_UNKNOWN) {
 		/*
 		 * This can happen with empty auth-group.
 		 */
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "auth-type not set, denying access");
 	}
 
 	log_debugx("CHAP authentication required");
 
 	auth_method = keys_find(request_keys, "AuthMethod");
 	if (auth_method == NULL) {
 		login_send_error(request, 0x02, 0x07);
 		log_errx(1, "received Login PDU without AuthMethod");
 	}
 	/*
 	 * XXX: This should be Reject, not just a login failure (5.3.2).
 	 */
 	if (login_list_contains(auth_method, "CHAP") == 0) {
 		login_send_error(request, 0x02, 0x01);
 		log_errx(1, "initiator requests unsupported AuthMethod \"%s\" "
 		    "instead of \"CHAP\"", auth_method);
 	}
 
 	response = login_new_response(request);
 
 	response_keys = keys_new();
 	keys_add(response_keys, "AuthMethod", "CHAP");
 	if (conn->conn_session_type == CONN_SESSION_TYPE_NORMAL) {
 		if (conn->conn_target->t_alias != NULL)
 			keys_add(response_keys,
 			    "TargetAlias", conn->conn_target->t_alias);
 		keys_add_int(response_keys,
 		    "TargetPortalGroupTag", pg->pg_tag);
 	}
 	keys_save(response_keys, response);
 
 	pdu_send(response);
 	pdu_delete(response);
 	keys_delete(response_keys);
 	pdu_delete(request);
 	keys_delete(request_keys);
 
 	login_chap(conn, ag);
 
 	login_negotiate(conn, NULL);
 }
Index: user/ngie/more-tests/usr.sbin/ctld/parse.y
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/parse.y	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/parse.y	(revision 281585)
@@ -1,1062 +1,1061 @@
 %{
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <sys/queue.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <assert.h>
 #include <stdio.h>
-#include <stdint.h>
 #include <stdlib.h>
 #include <string.h>
 
 #include "ctld.h"
 
 extern FILE *yyin;
 extern char *yytext;
 extern int lineno;
 
 static struct conf *conf = NULL;
 static struct auth_group *auth_group = NULL;
 static struct portal_group *portal_group = NULL;
 static struct target *target = NULL;
 static struct lun *lun = NULL;
 
 extern void	yyerror(const char *);
 extern int	yylex(void);
 extern void	yyrestart(FILE *);
 
 %}
 
 %token ALIAS AUTH_GROUP AUTH_TYPE BACKEND BLOCKSIZE CHAP CHAP_MUTUAL
 %token CLOSING_BRACKET DEBUG DEVICE_ID DISCOVERY_AUTH_GROUP DISCOVERY_FILTER
 %token INITIATOR_NAME INITIATOR_PORTAL ISNS_SERVER ISNS_PERIOD ISNS_TIMEOUT
 %token LISTEN LISTEN_ISER LUN MAXPROC OFFLOAD OPENING_BRACKET OPTION
 %token PATH PIDFILE PORT PORTAL_GROUP REDIRECT SEMICOLON SERIAL SIZE STR
 %token TARGET TIMEOUT 
 
 %union
 {
 	char *str;
 }
 
 %token <str> STR
 
 %%
 
 statements:
 	|
 	statements statement
 	|
 	statements statement SEMICOLON
 	;
 
 statement:
 	debug
 	|
 	timeout
 	|
 	maxproc
 	|
 	pidfile
 	|
 	isns_server
 	|
 	isns_period
 	|
 	isns_timeout
 	|
 	auth_group
 	|
 	portal_group
 	|
 	lun
 	|
 	target
 	;
 
 debug:		DEBUG STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 			
 		conf->conf_debug = tmp;
 	}
 	;
 
 timeout:	TIMEOUT STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		conf->conf_timeout = tmp;
 	}
 	;
 
 maxproc:	MAXPROC STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		conf->conf_maxproc = tmp;
 	}
 	;
 
 pidfile:	PIDFILE STR
 	{
 		if (conf->conf_pidfile_path != NULL) {
 			log_warnx("pidfile specified more than once");
 			free($2);
 			return (1);
 		}
 		conf->conf_pidfile_path = $2;
 	}
 	;
 
 isns_server:	ISNS_SERVER STR
 	{
 		int error;
 
 		error = isns_new(conf, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 isns_period:	ISNS_PERIOD STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		conf->conf_isns_period = tmp;
 	}
 	;
 
 isns_timeout:	ISNS_TIMEOUT STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		conf->conf_isns_timeout = tmp;
 	}
 	;
 
 auth_group:	AUTH_GROUP auth_group_name
     OPENING_BRACKET auth_group_entries CLOSING_BRACKET
 	{
 		auth_group = NULL;
 	}
 	;
 
 auth_group_name:	STR
 	{
 		/*
 		 * Make it possible to redefine default
 		 * auth-group, but only once.
 		 */
 		if (strcmp($1, "default") == 0 &&
 		    conf->conf_default_ag_defined == false) {
 			auth_group = auth_group_find(conf, $1);
 			conf->conf_default_ag_defined = true;
 		} else {
 			auth_group = auth_group_new(conf, $1);
 		}
 		free($1);
 		if (auth_group == NULL)
 			return (1);
 	}
 	;
 
 auth_group_entries:
 	|
 	auth_group_entries auth_group_entry
 	|
 	auth_group_entries auth_group_entry SEMICOLON
 	;
 
 auth_group_entry:
 	auth_group_auth_type
 	|
 	auth_group_chap
 	|
 	auth_group_chap_mutual
 	|
 	auth_group_initiator_name
 	|
 	auth_group_initiator_portal
 	;
 
 auth_group_auth_type:	AUTH_TYPE STR
 	{
 		int error;
 
 		error = auth_group_set_type(auth_group, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 auth_group_chap:	CHAP STR STR
 	{
 		const struct auth *ca;
 
 		ca = auth_new_chap(auth_group, $2, $3);
 		free($2);
 		free($3);
 		if (ca == NULL)
 			return (1);
 	}
 	;
 
 auth_group_chap_mutual:	CHAP_MUTUAL STR STR STR STR
 	{
 		const struct auth *ca;
 
 		ca = auth_new_chap_mutual(auth_group, $2, $3, $4, $5);
 		free($2);
 		free($3);
 		free($4);
 		free($5);
 		if (ca == NULL)
 			return (1);
 	}
 	;
 
 auth_group_initiator_name:	INITIATOR_NAME STR
 	{
 		const struct auth_name *an;
 
 		an = auth_name_new(auth_group, $2);
 		free($2);
 		if (an == NULL)
 			return (1);
 	}
 	;
 
 auth_group_initiator_portal:	INITIATOR_PORTAL STR
 	{
 		const struct auth_portal *ap;
 
 		ap = auth_portal_new(auth_group, $2);
 		free($2);
 		if (ap == NULL)
 			return (1);
 	}
 	;
 
 portal_group:	PORTAL_GROUP portal_group_name
     OPENING_BRACKET portal_group_entries CLOSING_BRACKET
 	{
 		portal_group = NULL;
 	}
 	;
 
 portal_group_name:	STR
 	{
 		/*
 		 * Make it possible to redefine default
 		 * portal-group, but only once.
 		 */
 		if (strcmp($1, "default") == 0 &&
 		    conf->conf_default_pg_defined == false) {
 			portal_group = portal_group_find(conf, $1);
 			conf->conf_default_pg_defined = true;
 		} else {
 			portal_group = portal_group_new(conf, $1);
 		}
 		free($1);
 		if (portal_group == NULL)
 			return (1);
 	}
 	;
 
 portal_group_entries:
 	|
 	portal_group_entries portal_group_entry
 	|
 	portal_group_entries portal_group_entry SEMICOLON
 	;
 
 portal_group_entry:
 	portal_group_discovery_auth_group
 	|
 	portal_group_discovery_filter
 	|
 	portal_group_listen
 	|
 	portal_group_listen_iser
 	|
 	portal_group_offload
 	|
 	portal_group_redirect
 	;
 
 portal_group_discovery_auth_group:	DISCOVERY_AUTH_GROUP STR
 	{
 		if (portal_group->pg_discovery_auth_group != NULL) {
 			log_warnx("discovery-auth-group for portal-group "
 			    "\"%s\" specified more than once",
 			    portal_group->pg_name);
 			return (1);
 		}
 		portal_group->pg_discovery_auth_group =
 		    auth_group_find(conf, $2);
 		if (portal_group->pg_discovery_auth_group == NULL) {
 			log_warnx("unknown discovery-auth-group \"%s\" "
 			    "for portal-group \"%s\"",
 			    $2, portal_group->pg_name);
 			return (1);
 		}
 		free($2);
 	}
 	;
 
 portal_group_discovery_filter:	DISCOVERY_FILTER STR
 	{
 		int error;
 
 		error = portal_group_set_filter(portal_group, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 portal_group_listen:	LISTEN STR
 	{
 		int error;
 
 		error = portal_group_add_listen(portal_group, $2, false);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 portal_group_listen_iser:	LISTEN_ISER STR
 	{
 		int error;
 
 		error = portal_group_add_listen(portal_group, $2, true);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 portal_group_offload:	OFFLOAD STR
 	{
 		int error;
 
 		error = portal_group_set_offload(portal_group, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 portal_group_redirect:	REDIRECT STR
 	{
 		int error;
 
 		error = portal_group_set_redirection(portal_group, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 lun:	LUN lun_name
     OPENING_BRACKET lun_entries CLOSING_BRACKET
 	{
 		lun = NULL;
 	}
 	;
 
 lun_name:	STR
 	{
 		lun = lun_new(conf, $1);
 		free($1);
 		if (lun == NULL)
 			return (1);
 	}
 	;
 
 target:	TARGET target_name
     OPENING_BRACKET target_entries CLOSING_BRACKET
 	{
 		target = NULL;
 	}
 	;
 
 target_name:	STR
 	{
 		target = target_new(conf, $1);
 		free($1);
 		if (target == NULL)
 			return (1);
 	}
 	;
 
 target_entries:
 	|
 	target_entries target_entry
 	|
 	target_entries target_entry SEMICOLON
 	;
 
 target_entry:
 	target_alias
 	|
 	target_auth_group
 	|
 	target_auth_type
 	|
 	target_chap
 	|
 	target_chap_mutual
 	|
 	target_initiator_name
 	|
 	target_initiator_portal
 	|
 	target_portal_group
 	|
 	target_port
 	|
 	target_redirect
 	|
 	target_lun
 	|
 	target_lun_ref
 	;
 
 target_alias:	ALIAS STR
 	{
 		if (target->t_alias != NULL) {
 			log_warnx("alias for target \"%s\" "
 			    "specified more than once", target->t_name);
 			return (1);
 		}
 		target->t_alias = $2;
 	}
 	;
 
 target_auth_group:	AUTH_GROUP STR
 	{
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL)
 				log_warnx("auth-group for target \"%s\" "
 				    "specified more than once", target->t_name);
 			else
 				log_warnx("cannot use both auth-group and explicit "
 				    "authorisations for target \"%s\"",
 				    target->t_name);
 			return (1);
 		}
 		target->t_auth_group = auth_group_find(conf, $2);
 		if (target->t_auth_group == NULL) {
 			log_warnx("unknown auth-group \"%s\" for target "
 			    "\"%s\"", $2, target->t_name);
 			return (1);
 		}
 		free($2);
 	}
 	;
 
 target_auth_type:	AUTH_TYPE STR
 	{
 		int error;
 
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL) {
 				log_warnx("cannot use both auth-group and "
 				    "auth-type for target \"%s\"",
 				    target->t_name);
 				return (1);
 			}
 		} else {
 			target->t_auth_group = auth_group_new(conf, NULL);
 			if (target->t_auth_group == NULL) {
 				free($2);
 				return (1);
 			}
 			target->t_auth_group->ag_target = target;
 		}
 		error = auth_group_set_type(target->t_auth_group, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 target_chap:	CHAP STR STR
 	{
 		const struct auth *ca;
 
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL) {
 				log_warnx("cannot use both auth-group and "
 				    "chap for target \"%s\"",
 				    target->t_name);
 				free($2);
 				free($3);
 				return (1);
 			}
 		} else {
 			target->t_auth_group = auth_group_new(conf, NULL);
 			if (target->t_auth_group == NULL) {
 				free($2);
 				free($3);
 				return (1);
 			}
 			target->t_auth_group->ag_target = target;
 		}
 		ca = auth_new_chap(target->t_auth_group, $2, $3);
 		free($2);
 		free($3);
 		if (ca == NULL)
 			return (1);
 	}
 	;
 
 target_chap_mutual:	CHAP_MUTUAL STR STR STR STR
 	{
 		const struct auth *ca;
 
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL) {
 				log_warnx("cannot use both auth-group and "
 				    "chap-mutual for target \"%s\"",
 				    target->t_name);
 				free($2);
 				free($3);
 				free($4);
 				free($5);
 				return (1);
 			}
 		} else {
 			target->t_auth_group = auth_group_new(conf, NULL);
 			if (target->t_auth_group == NULL) {
 				free($2);
 				free($3);
 				free($4);
 				free($5);
 				return (1);
 			}
 			target->t_auth_group->ag_target = target;
 		}
 		ca = auth_new_chap_mutual(target->t_auth_group,
 		    $2, $3, $4, $5);
 		free($2);
 		free($3);
 		free($4);
 		free($5);
 		if (ca == NULL)
 			return (1);
 	}
 	;
 
 target_initiator_name:	INITIATOR_NAME STR
 	{
 		const struct auth_name *an;
 
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL) {
 				log_warnx("cannot use both auth-group and "
 				    "initiator-name for target \"%s\"",
 				    target->t_name);
 				free($2);
 				return (1);
 			}
 		} else {
 			target->t_auth_group = auth_group_new(conf, NULL);
 			if (target->t_auth_group == NULL) {
 				free($2);
 				return (1);
 			}
 			target->t_auth_group->ag_target = target;
 		}
 		an = auth_name_new(target->t_auth_group, $2);
 		free($2);
 		if (an == NULL)
 			return (1);
 	}
 	;
 
 target_initiator_portal:	INITIATOR_PORTAL STR
 	{
 		const struct auth_portal *ap;
 
 		if (target->t_auth_group != NULL) {
 			if (target->t_auth_group->ag_name != NULL) {
 				log_warnx("cannot use both auth-group and "
 				    "initiator-portal for target \"%s\"",
 				    target->t_name);
 				free($2);
 				return (1);
 			}
 		} else {
 			target->t_auth_group = auth_group_new(conf, NULL);
 			if (target->t_auth_group == NULL) {
 				free($2);
 				return (1);
 			}
 			target->t_auth_group->ag_target = target;
 		}
 		ap = auth_portal_new(target->t_auth_group, $2);
 		free($2);
 		if (ap == NULL)
 			return (1);
 	}
 	;
 
 target_portal_group:	PORTAL_GROUP STR STR
 	{
 		struct portal_group *tpg;
 		struct auth_group *tag;
 		struct port *tp;
 
 		tpg = portal_group_find(conf, $2);
 		if (tpg == NULL) {
 			log_warnx("unknown portal-group \"%s\" for target "
 			    "\"%s\"", $2, target->t_name);
 			free($2);
 			free($3);
 			return (1);
 		}
 		tag = auth_group_find(conf, $3);
 		if (tag == NULL) {
 			log_warnx("unknown auth-group \"%s\" for target "
 			    "\"%s\"", $3, target->t_name);
 			free($2);
 			free($3);
 			return (1);
 		}
 		tp = port_new(conf, target, tpg);
 		if (tp == NULL) {
 			log_warnx("can't link portal-group \"%s\" to target "
 			    "\"%s\"", $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		tp->p_auth_group = tag;
 		free($2);
 		free($3);
 	}
 	|		PORTAL_GROUP STR
 	{
 		struct portal_group *tpg;
 		struct port *tp;
 
 		tpg = portal_group_find(conf, $2);
 		if (tpg == NULL) {
 			log_warnx("unknown portal-group \"%s\" for target "
 			    "\"%s\"", $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		tp = port_new(conf, target, tpg);
 		if (tp == NULL) {
 			log_warnx("can't link portal-group \"%s\" to target "
 			    "\"%s\"", $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		free($2);
 	}
 	;
 
 target_port:	PORT STR
 	{
 		struct pport *pp;
 		struct port *tp;
 
 		pp = pport_find(conf, $2);
 		if (pp == NULL) {
 			log_warnx("unknown port \"%s\" for target \"%s\"",
 			    $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		if (!TAILQ_EMPTY(&pp->pp_ports)) {
 			log_warnx("can't link port \"%s\" to target \"%s\", "
 			    "port already linked to some target",
 			    $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		tp = port_new_pp(conf, target, pp);
 		if (tp == NULL) {
 			log_warnx("can't link port \"%s\" to target \"%s\"",
 			    $2, target->t_name);
 			free($2);
 			return (1);
 		}
 		free($2);
 	}
 	;
 
 target_redirect:	REDIRECT STR
 	{
 		int error;
 
 		error = target_set_redirection(target, $2);
 		free($2);
 		if (error != 0)
 			return (1);
 	}
 	;
 
 target_lun:	LUN lun_number
     OPENING_BRACKET lun_entries CLOSING_BRACKET
 	{
 		lun = NULL;
 	}
 	;
 
 lun_number:	STR
 	{
 		uint64_t tmp;
 		int ret;
 		char *name;
 
 		if (expand_number($1, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($1);
 			return (1);
 		}
 
 		ret = asprintf(&name, "%s,lun,%ju", target->t_name, tmp);
 		if (ret <= 0)
 			log_err(1, "asprintf");
 		lun = lun_new(conf, name);
 		if (lun == NULL)
 			return (1);
 
 		lun_set_scsiname(lun, name);
 		target->t_luns[tmp] = lun;
 	}
 	;
 
 target_lun_ref:	LUN STR STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			free($3);
 			return (1);
 		}
 		free($2);
 
 		lun = lun_find(conf, $3);
 		free($3);
 		if (lun == NULL)
 			return (1);
 
 		target->t_luns[tmp] = lun;
 	}
 	;
 
 lun_entries:
 	|
 	lun_entries lun_entry
 	|
 	lun_entries lun_entry SEMICOLON
 	;
 
 lun_entry:
 	lun_backend
 	|
 	lun_blocksize
 	|
 	lun_device_id
 	|
 	lun_option
 	|
 	lun_path
 	|
 	lun_serial
 	|
 	lun_size
 	;
 
 lun_backend:	BACKEND STR
 	{
 		if (lun->l_backend != NULL) {
 			log_warnx("backend for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			free($2);
 			return (1);
 		}
 		lun_set_backend(lun, $2);
 		free($2);
 	}
 	;
 
 lun_blocksize:	BLOCKSIZE STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		if (lun->l_blocksize != 0) {
 			log_warnx("blocksize for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			return (1);
 		}
 		lun_set_blocksize(lun, tmp);
 	}
 	;
 
 lun_device_id:	DEVICE_ID STR
 	{
 		if (lun->l_device_id != NULL) {
 			log_warnx("device_id for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			free($2);
 			return (1);
 		}
 		lun_set_device_id(lun, $2);
 		free($2);
 	}
 	;
 
 lun_option:	OPTION STR STR
 	{
 		struct lun_option *clo;
 
 		clo = lun_option_new(lun, $2, $3);
 		free($2);
 		free($3);
 		if (clo == NULL)
 			return (1);
 	}
 	;
 
 lun_path:	PATH STR
 	{
 		if (lun->l_path != NULL) {
 			log_warnx("path for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			free($2);
 			return (1);
 		}
 		lun_set_path(lun, $2);
 		free($2);
 	}
 	;
 
 lun_serial:	SERIAL STR
 	{
 		if (lun->l_serial != NULL) {
 			log_warnx("serial for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			free($2);
 			return (1);
 		}
 		lun_set_serial(lun, $2);
 		free($2);
 	}
 	;
 
 lun_size:	SIZE STR
 	{
 		uint64_t tmp;
 
 		if (expand_number($2, &tmp) != 0) {
 			yyerror("invalid numeric value");
 			free($2);
 			return (1);
 		}
 
 		if (lun->l_size != 0) {
 			log_warnx("size for lun \"%s\" "
 			    "specified more than once",
 			    lun->l_name);
 			return (1);
 		}
 		lun_set_size(lun, tmp);
 	}
 	;
 %%
 
 void
 yyerror(const char *str)
 {
 
 	log_warnx("error in configuration file at line %d near '%s': %s",
 	    lineno, yytext, str);
 }
 
 static void
 check_perms(const char *path)
 {
 	struct stat sb;
 	int error;
 
 	error = stat(path, &sb);
 	if (error != 0) {
 		log_warn("stat");
 		return;
 	}
 	if (sb.st_mode & S_IWOTH) {
 		log_warnx("%s is world-writable", path);
 	} else if (sb.st_mode & S_IROTH) {
 		log_warnx("%s is world-readable", path);
 	} else if (sb.st_mode & S_IXOTH) {
 		/*
 		 * Ok, this one doesn't matter, but still do it,
 		 * just for consistency.
 		 */
 		log_warnx("%s is world-executable", path);
 	}
 
 	/*
 	 * XXX: Should we also check for owner != 0?
 	 */
 }
 
 struct conf *
 conf_new_from_file(const char *path, struct conf *oldconf)
 {
 	struct auth_group *ag;
 	struct portal_group *pg;
 	struct pport *pp;
 	int error;
 
 	log_debugx("obtaining configuration from %s", path);
 
 	conf = conf_new();
 
 	TAILQ_FOREACH(pp, &oldconf->conf_pports, pp_next)
 		pport_copy(pp, conf);
 
 	ag = auth_group_new(conf, "default");
 	assert(ag != NULL);
 
 	ag = auth_group_new(conf, "no-authentication");
 	assert(ag != NULL);
 	ag->ag_type = AG_TYPE_NO_AUTHENTICATION;
 
 	ag = auth_group_new(conf, "no-access");
 	assert(ag != NULL);
 	ag->ag_type = AG_TYPE_DENY;
 
 	pg = portal_group_new(conf, "default");
 	assert(pg != NULL);
 
 	yyin = fopen(path, "r");
 	if (yyin == NULL) {
 		log_warn("unable to open configuration file %s", path);
 		conf_delete(conf);
 		return (NULL);
 	}
 	check_perms(path);
 	lineno = 1;
 	yyrestart(yyin);
 	error = yyparse();
 	auth_group = NULL;
 	portal_group = NULL;
 	target = NULL;
 	lun = NULL;
 	fclose(yyin);
 	if (error != 0) {
 		conf_delete(conf);
 		return (NULL);
 	}
 
 	if (conf->conf_default_ag_defined == false) {
 		log_debugx("auth-group \"default\" not defined; "
 		    "going with defaults");
 		ag = auth_group_find(conf, "default");
 		assert(ag != NULL);
 		ag->ag_type = AG_TYPE_DENY;
 	}
 
 	if (conf->conf_default_pg_defined == false) {
 		log_debugx("portal-group \"default\" not defined; "
 		    "going with defaults");
 		pg = portal_group_find(conf, "default");
 		assert(pg != NULL);
 		portal_group_add_listen(pg, "0.0.0.0:3260", false);
 		portal_group_add_listen(pg, "[::]:3260", false);
 	}
 
 	conf->conf_kernel_port_on = true;
 
 	error = conf_verify(conf);
 	if (error != 0) {
 		conf_delete(conf);
 		return (NULL);
 	}
 
 	return (conf);
 }
Index: user/ngie/more-tests/usr.sbin/ctld/pdu.c
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/pdu.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/pdu.c	(revision 281585)
@@ -1,266 +1,264 @@
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/types.h>
 #include <sys/uio.h>
 #include <assert.h>
-#include <stdint.h>
-#include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 
 #include "ctld.h"
 #include "iscsi_proto.h"
 
 #ifdef ICL_KERNEL_PROXY
 #include <sys/ioctl.h>
 #endif
 
 extern bool proxy_mode;
 
 static int
 pdu_ahs_length(const struct pdu *pdu)
 {
 
 	return (pdu->pdu_bhs->bhs_total_ahs_len * 4);
 }
 
 static int
 pdu_data_segment_length(const struct pdu *pdu)
 {
 	uint32_t len = 0;
 
 	len += pdu->pdu_bhs->bhs_data_segment_len[0];
 	len <<= 8;
 	len += pdu->pdu_bhs->bhs_data_segment_len[1];
 	len <<= 8;
 	len += pdu->pdu_bhs->bhs_data_segment_len[2];
 
 	return (len);
 }
 
 static void
 pdu_set_data_segment_length(struct pdu *pdu, uint32_t len)
 {
 
 	pdu->pdu_bhs->bhs_data_segment_len[2] = len;
 	pdu->pdu_bhs->bhs_data_segment_len[1] = len >> 8;
 	pdu->pdu_bhs->bhs_data_segment_len[0] = len >> 16;
 }
 
 struct pdu *
 pdu_new(struct connection *conn)
 {
 	struct pdu *pdu;
 
 	pdu = calloc(sizeof(*pdu), 1);
 	if (pdu == NULL)
 		log_err(1, "calloc");
 
 	pdu->pdu_bhs = calloc(sizeof(*pdu->pdu_bhs), 1);
 	if (pdu->pdu_bhs == NULL)
 		log_err(1, "calloc");
 
 	pdu->pdu_connection = conn;
 
 	return (pdu);
 }
 
 struct pdu *
 pdu_new_response(struct pdu *request)
 {
 
 	return (pdu_new(request->pdu_connection));
 }
 
 #ifdef ICL_KERNEL_PROXY
 
 static void
 pdu_receive_proxy(struct pdu *pdu)
 {
 	size_t len;
 
 	assert(proxy_mode);
 
 	kernel_receive(pdu);
 
 	len = pdu_ahs_length(pdu);
 	if (len > 0)
 		log_errx(1, "protocol error: non-empty AHS");
 
 	len = pdu_data_segment_length(pdu);
 	assert(len <= MAX_DATA_SEGMENT_LENGTH);
 	pdu->pdu_data_len = len;
 }
 
 static void
 pdu_send_proxy(struct pdu *pdu)
 {
 
 	assert(proxy_mode);
 
 	pdu_set_data_segment_length(pdu, pdu->pdu_data_len);
 	kernel_send(pdu);
 }
 
 #endif /* ICL_KERNEL_PROXY */
 
 static size_t
 pdu_padding(const struct pdu *pdu)
 {
 
 	if ((pdu->pdu_data_len % 4) != 0)
 		return (4 - (pdu->pdu_data_len % 4));
 
 	return (0);
 }
 
 static void
 pdu_read(int fd, char *data, size_t len)
 {
 	ssize_t ret;
 
 	while (len > 0) {
 		ret = read(fd, data, len);
 		if (ret < 0) {
 			if (timed_out())
 				log_errx(1, "exiting due to timeout");
 			log_err(1, "read");
 		} else if (ret == 0)
 			log_errx(1, "read: connection lost");
 		len -= ret;
 		data += ret;
 	}
 }
 
 void
 pdu_receive(struct pdu *pdu)
 {
 	size_t len, padding;
 	char dummy[4];
 
 #ifdef ICL_KERNEL_PROXY
 	if (proxy_mode)
 		return (pdu_receive_proxy(pdu));
 #endif
 
 	assert(proxy_mode == false);
 
 	pdu_read(pdu->pdu_connection->conn_socket,
 	    (char *)pdu->pdu_bhs, sizeof(*pdu->pdu_bhs));
 
 	len = pdu_ahs_length(pdu);
 	if (len > 0)
 		log_errx(1, "protocol error: non-empty AHS");
 
 	len = pdu_data_segment_length(pdu);
 	if (len > 0) {
 		if (len > MAX_DATA_SEGMENT_LENGTH) {
 			log_errx(1, "protocol error: received PDU "
 			    "with DataSegmentLength exceeding %d",
 			    MAX_DATA_SEGMENT_LENGTH);
 		}
 
 		pdu->pdu_data_len = len;
 		pdu->pdu_data = malloc(len);
 		if (pdu->pdu_data == NULL)
 			log_err(1, "malloc");
 
 		pdu_read(pdu->pdu_connection->conn_socket,
 		    (char *)pdu->pdu_data, pdu->pdu_data_len);
 
 		padding = pdu_padding(pdu);
 		if (padding != 0) {
 			assert(padding < sizeof(dummy));
 			pdu_read(pdu->pdu_connection->conn_socket,
 			    (char *)dummy, padding);
 		}
 	}
 }
 
 void
 pdu_send(struct pdu *pdu)
 {
 	ssize_t ret, total_len;
 	size_t padding;
 	uint32_t zero = 0;
 	struct iovec iov[3];
 	int iovcnt;
 
 #ifdef ICL_KERNEL_PROXY
 	if (proxy_mode)
 		return (pdu_send_proxy(pdu));
 #endif
 
 	assert(proxy_mode == false);
 
 	pdu_set_data_segment_length(pdu, pdu->pdu_data_len);
 	iov[0].iov_base = pdu->pdu_bhs;
 	iov[0].iov_len = sizeof(*pdu->pdu_bhs);
 	total_len = iov[0].iov_len;
 	iovcnt = 1;
 
 	if (pdu->pdu_data_len > 0) {
 		iov[1].iov_base = pdu->pdu_data;
 		iov[1].iov_len = pdu->pdu_data_len;
 		total_len += iov[1].iov_len;
 		iovcnt = 2;
 
 		padding = pdu_padding(pdu);
 		if (padding > 0) {
 			assert(padding < sizeof(zero));
 			iov[2].iov_base = &zero;
 			iov[2].iov_len = padding;
 			total_len += iov[2].iov_len;
 			iovcnt = 3;
 		}
 	}
 
 	ret = writev(pdu->pdu_connection->conn_socket, iov, iovcnt);
 	if (ret < 0) {
 		if (timed_out())
 			log_errx(1, "exiting due to timeout");
 		log_err(1, "writev");
 	}
 	if (ret != total_len)
 		log_errx(1, "short write");
 }
 
 void
 pdu_delete(struct pdu *pdu)
 {
 
 	free(pdu->pdu_data);
 	free(pdu->pdu_bhs);
 	free(pdu);
 }
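Note on the pdu.c hunk above: pdu_data_segment_length(), pdu_set_data_segment_length() and pdu_padding() together handle the three-byte, most-significant-byte-first DataSegmentLength field and the rounding of the data segment up to a four-byte boundary. The following is a minimal standalone sketch of that arithmetic, not code from ctld. The names struct dsl_field, dsl_encode(), dsl_decode() and dsl_padding() are hypothetical and exist only for this illustration; the real code works on struct pdu and its bhs_data_segment_len array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the three-byte DataSegmentLength field
 * of the Basic Header Segment; not the ctld structure.
 */
struct dsl_field {
	uint8_t bytes[3];
};

/* Store the low 24 bits of len, most significant byte first. */
static void
dsl_encode(struct dsl_field *f, uint32_t len)
{

	f->bytes[0] = (uint8_t)(len >> 16);
	f->bytes[1] = (uint8_t)(len >> 8);
	f->bytes[2] = (uint8_t)len;
}

/* Reassemble the 24-bit length from its three bytes. */
static uint32_t
dsl_decode(const struct dsl_field *f)
{

	return (((uint32_t)f->bytes[0] << 16) |
	    ((uint32_t)f->bytes[1] << 8) |
	    (uint32_t)f->bytes[2]);
}

/* Number of pad bytes needed to reach the next multiple of four. */
static size_t
dsl_padding(size_t data_len)
{

	return ((4 - (data_len % 4)) % 4);
}
```

The padding result is always between 0 and 3, which is why a four-byte scratch buffer on the receive side and a single zeroed uint32_t on the send side are large enough in the hunk above.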
Index: user/ngie/more-tests/usr.sbin/ctld/token.l
===================================================================
--- user/ngie/more-tests/usr.sbin/ctld/token.l	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/ctld/token.l	(revision 281585)
@@ -1,93 +1,92 @@
 %{
 /*-
  * Copyright (c) 2012 The FreeBSD Foundation
  * All rights reserved.
  *
  * This software was developed by Edward Tomasz Napierala under sponsorship
  * from the FreeBSD Foundation.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #include <stdio.h>
 #include <stdint.h>
 #include <string.h>
 
-#include "ctld.h"
 #include "y.tab.h"
 
 int lineno;
 
 #define	YY_DECL int yylex(void)
 extern int	yylex(void);
 
 %}
 
 %option noinput
 %option nounput
 
 %%
 alias			{ return ALIAS; }
 auth-group		{ return AUTH_GROUP; }
 auth-type		{ return AUTH_TYPE; }
 backend			{ return BACKEND; }
 blocksize		{ return BLOCKSIZE; }
 chap			{ return CHAP; }
 chap-mutual		{ return CHAP_MUTUAL; }
 debug			{ return DEBUG; }
 device-id		{ return DEVICE_ID; }
 discovery-auth-group	{ return DISCOVERY_AUTH_GROUP; }
 discovery-filter	{ return DISCOVERY_FILTER; }
 initiator-name		{ return INITIATOR_NAME; }
 initiator-portal	{ return INITIATOR_PORTAL; }
 listen			{ return LISTEN; }
 listen-iser		{ return LISTEN_ISER; }
 lun			{ return LUN; }
 maxproc			{ return MAXPROC; }
 offload			{ return OFFLOAD; }
 option			{ return OPTION; }
 path			{ return PATH; }
 pidfile			{ return PIDFILE; }
 isns-server		{ return ISNS_SERVER; }
 isns-period		{ return ISNS_PERIOD; }
 isns-timeout		{ return ISNS_TIMEOUT; }
 port			{ return PORT; }
 portal-group		{ return PORTAL_GROUP; }
 redirect		{ return REDIRECT; }
 serial			{ return SERIAL; }
 size			{ return SIZE; }
 target			{ return TARGET; }
 timeout			{ return TIMEOUT; }
 \"[^"]+\"		{ yylval.str = strndup(yytext + 1,
 			    strlen(yytext) - 2); return STR; }
 [a-zA-Z0-9\.\-_/\:\[\]]+ { yylval.str = strdup(yytext); return STR; }
 \{			{ return OPENING_BRACKET; }
 \}			{ return CLOSING_BRACKET; }
 #.*$			/* ignore comments */;
 \r\n			{ lineno++; }
 \n			{ lineno++; }
 ;			{ return SEMICOLON; }
 [ \t]+			/* ignore whitespace */;
 .			{ yylval.str = strdup(yytext); return STR; }
 %%
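Note on the fetch_check_params change in the freebsd-update.sh diff that
follows: inside a single "[ ... ]", "&&" is not a test operator.  The shell
splits the command line at "&&", so "[" runs without its closing "]" and
reports an error, and the remainder is then treated as a separate command.
The sketch below is illustrative only and is not part of the commit; the
function names pending_with_a and pending_with_and and the temporary marker
file are hypothetical stand-ins for the kerneldone check.  It shows the
spelling used by the fix next to an equivalent two-test spelling.

```shell
#!/bin/sh
# Illustrative only: function and variable names here are hypothetical.

# Form used by the fix: one test, with -a joining both conditions.
pending_with_a () {
	[ -f "$1" -a "$2" -eq 0 ]
}

# Equivalent form: two separate tests joined by the shell's && operator.
pending_with_and () {
	[ -f "$1" ] && [ "$2" -eq 0 ]
}

# Stand-in for ${BDHASH}-install/kerneldone.
MARKER=`mktemp`
for FORCE in 0 1; do
	if pending_with_a "${MARKER}" "${FORCE}"; then
		A=blocked
	else
		A=proceed
	fi
	if pending_with_and "${MARKER}" "${FORCE}"; then
		B=blocked
	else
		B=proceed
	fi
	echo "FORCEFETCH=${FORCE}: -a form: ${A}, && form: ${B}"
done
rm -f "${MARKER}"
```

Both forms should agree for every input.  The two-test form additionally
avoids the -a primary, which POSIX marks as obsolescent, so it may be the
safer choice in scripts meant to run under shells other than FreeBSD's sh.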
Index: user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh
===================================================================
--- user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/freebsd-update/freebsd-update.sh	(revision 281585)
@@ -1,3291 +1,3291 @@
 #!/bin/sh
 
 #-
 # Copyright 2004-2007 Colin Percival
 # All rights reserved
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted providing that the following conditions 
 # are met:
 # 1. Redistributions of source code must retain the above copyright
 #    notice, this list of conditions and the following disclaimer.
 # 2. Redistributions in binary form must reproduce the above copyright
 #    notice, this list of conditions and the following disclaimer in the
 #    documentation and/or other materials provided with the distribution.
 #
 # THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 # IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 # WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 # ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 # DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 # STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 # IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 # POSSIBILITY OF SUCH DAMAGE.
 
 # $FreeBSD$
 
 #### Usage function -- called from command-line handling code.
 
 # Usage instructions.  Options not listed:
 # --debug	-- don't filter output from utilities
 # --no-stats	-- don't show progress statistics while fetching files
 usage () {
 	cat <<EOF
 usage: `basename $0` [options] command ... [path]
 
 Options:
   -b basedir   -- Operate on a system mounted at basedir
                   (default: /)
   -d workdir   -- Store working files in workdir
                   (default: /var/db/freebsd-update/)
   -f conffile  -- Read configuration options from conffile
                   (default: /etc/freebsd-update.conf)
   -F           -- Force a fetch operation to proceed
   -k KEY       -- Trust an RSA key with SHA256 hash of KEY
   -r release   -- Target for upgrade (e.g., 6.2-RELEASE)
   -s server    -- Server from which to fetch updates
                   (default: update.FreeBSD.org)
   -t address   -- Mail output of cron command, if any, to address
                   (default: root)
   --not-running-from-cron
                -- Run without a tty, for use by automated tools
 Commands:
   fetch        -- Fetch updates from server
   cron         -- Sleep rand(3600) seconds, fetch updates, and send an
                   email if updates were found
   upgrade      -- Fetch upgrades to FreeBSD version specified via -r option
   install      -- Install downloaded updates or upgrades
   rollback     -- Uninstall most recently installed updates
   IDS          -- Compare the system against an index of "known good" files.
 EOF
 	exit 0
 }
 
 #### Configuration processing functions
 
 #-
 # Configuration options are set in the following order of priority:
 # 1. Command line options
 # 2. Configuration file options
 # 3. Default options
 # In addition, certain options (e.g., IgnorePaths) can be specified multiple
 # times and (as long as these are all in the same place, e.g., inside the
 # configuration file) they will accumulate.  Finally, because the path to the
 # configuration file can be specified at the command line, the entire command
 # line must be processed before we start reading the configuration file.
 #
 # Sound like a mess?  It is.  Here's how we handle this:
 # 1. Initialize CONFFILE and all the options to "".
 # 2. Process the command line.  Throw an error if a non-accumulating option
 #    is specified twice.
 # 3. If CONFFILE is "", set CONFFILE to /etc/freebsd-update.conf .
 # 4. For all the configuration options X, set X_saved to X.
 # 5. Initialize all the options to "".
 # 6. Read CONFFILE line by line, parsing options.
 # 7. For each configuration option X, set X to X_saved iff X_saved is not "".
 # 8. Repeat steps 4-7, except setting options to their default values at (6).
 
 CONFIGOPTIONS="KEYPRINT WORKDIR SERVERNAME MAILTO ALLOWADD ALLOWDELETE
     KEEPMODIFIEDMETADATA COMPONENTS IGNOREPATHS UPDATEIFUNMODIFIED
     BASEDIR VERBOSELEVEL TARGETRELEASE STRICTCOMPONENTS MERGECHANGES
     IDSIGNOREPATHS BACKUPKERNEL BACKUPKERNELDIR BACKUPKERNELSYMBOLFILES"
 
 # Set all the configuration options to "".
 nullconfig () {
 	for X in ${CONFIGOPTIONS}; do
 		eval ${X}=""
 	done
 }
 
 # For each configuration option X, set X_saved to X.
 saveconfig () {
 	for X in ${CONFIGOPTIONS}; do
 		eval ${X}_saved=\$${X}
 	done
 }
 
 # For each configuration option X, set X to X_saved if X_saved is not "".
 mergeconfig () {
 	for X in ${CONFIGOPTIONS}; do
 		eval _=\$${X}_saved
 		if ! [ -z "${_}" ]; then
 			eval ${X}=\$${X}_saved
 		fi
 	done
 }
 
 # Set the trusted keyprint.
 config_KeyPrint () {
 	if [ -z ${KEYPRINT} ]; then
 		KEYPRINT=$1
 	else
 		return 1
 	fi
 }
 
 # Set the working directory.
 config_WorkDir () {
 	if [ -z ${WORKDIR} ]; then
 		WORKDIR=$1
 	else
 		return 1
 	fi
 }
 
 # Set the name of the server (pool) from which to fetch updates
 config_ServerName () {
 	if [ -z ${SERVERNAME} ]; then
 		SERVERNAME=$1
 	else
 		return 1
 	fi
 }
 
 # Set the address to which 'cron' output will be mailed.
 config_MailTo () {
 	if [ -z ${MAILTO} ]; then
 		MAILTO=$1
 	else
 		return 1
 	fi
 }
 
 # Set whether FreeBSD Update is allowed to add files (or directories, or
 # symlinks) which did not previously exist.
 config_AllowAdd () {
 	if [ -z ${ALLOWADD} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			ALLOWADD=yes
 			;;
 		[Nn][Oo])
 			ALLOWADD=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 # Set whether FreeBSD Update is allowed to remove files/directories/symlinks.
 config_AllowDelete () {
 	if [ -z ${ALLOWDELETE} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			ALLOWDELETE=yes
 			;;
 		[Nn][Oo])
 			ALLOWDELETE=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 # Set whether FreeBSD Update should keep existing inode ownership,
 # permissions, and flags, in the event that they have been modified locally
 # after the release.
 config_KeepModifiedMetadata () {
 	if [ -z ${KEEPMODIFIEDMETADATA} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			KEEPMODIFIEDMETADATA=yes
 			;;
 		[Nn][Oo])
 			KEEPMODIFIEDMETADATA=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 # Add to the list of components which should be kept updated.
 config_Components () {
 	for C in $@; do
 		COMPONENTS="${COMPONENTS} ${C}"
 	done
 }
 
 # Add to the list of paths under which updates will be ignored.
 config_IgnorePaths () {
 	for C in $@; do
 		IGNOREPATHS="${IGNOREPATHS} ${C}"
 	done
 }
 
 # Add to the list of paths which IDS should ignore.
 config_IDSIgnorePaths () {
 	for C in $@; do
 		IDSIGNOREPATHS="${IDSIGNOREPATHS} ${C}"
 	done
 }
 
 # Add to the list of paths within which updates will be performed only if the
 # file on disk has not been modified locally.
 config_UpdateIfUnmodified () {
 	for C in $@; do
 		UPDATEIFUNMODIFIED="${UPDATEIFUNMODIFIED} ${C}"
 	done
 }
 
 # Add to the list of paths within which updates to text files will be merged
 # instead of overwritten.
 config_MergeChanges () {
 	for C in $@; do
 		MERGECHANGES="${MERGECHANGES} ${C}"
 	done
 }
 
 # Work on a FreeBSD installation mounted under $1
 config_BaseDir () {
 	if [ -z ${BASEDIR} ]; then
 		BASEDIR=$1
 	else
 		return 1
 	fi
 }
 
 # When fetching upgrades, should we assume the user wants exactly the
 # components listed in COMPONENTS, rather than trying to guess based on
 # what's currently installed?
 config_StrictComponents () {
 	if [ -z ${STRICTCOMPONENTS} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			STRICTCOMPONENTS=yes
 			;;
 		[Nn][Oo])
 			STRICTCOMPONENTS=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 # Upgrade to FreeBSD $1
 config_TargetRelease () {
 	if [ -z ${TARGETRELEASE} ]; then
 		TARGETRELEASE=$1
 	else
 		return 1
 	fi
 	if echo ${TARGETRELEASE} | grep -qE '^[0-9.]+$'; then
 		TARGETRELEASE="${TARGETRELEASE}-RELEASE"
 	fi
 }
 
 # Define what happens to output of utilities
 config_VerboseLevel () {
 	if [ -z ${VERBOSELEVEL} ]; then
 		case $1 in
 		[Dd][Ee][Bb][Uu][Gg])
 			VERBOSELEVEL=debug
 			;;
 		[Nn][Oo][Ss][Tt][Aa][Tt][Ss])
 			VERBOSELEVEL=nostats
 			;;
 		[Ss][Tt][Aa][Tt][Ss])
 			VERBOSELEVEL=stats
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 config_BackupKernel () {
 	if [ -z ${BACKUPKERNEL} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			BACKUPKERNEL=yes
 			;;
 		[Nn][Oo])
 			BACKUPKERNEL=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 config_BackupKernelDir () {
 	if [ -z ${BACKUPKERNELDIR} ]; then
 		if [ -z "$1" ]; then
 			echo "BackupKernelDir set to empty dir"
 			return 1
 		fi
 
 		# We check for some paths which would be extremely odd
 		# to use, but which could cause a lot of problems if
 		# used.
 		case $1 in
 		/|/bin|/boot|/etc|/lib|/libexec|/sbin|/usr|/var)
 			echo "BackupKernelDir set to invalid path $1"
 			return 1
 			;;
 		/*)
 			BACKUPKERNELDIR=$1
 			;;
 		*)
 			echo "BackupKernelDir ($1) is not an absolute path"
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 config_BackupKernelSymbolFiles () {
 	if [ -z ${BACKUPKERNELSYMBOLFILES} ]; then
 		case $1 in
 		[Yy][Ee][Ss])
 			BACKUPKERNELSYMBOLFILES=yes
 			;;
 		[Nn][Oo])
 			BACKUPKERNELSYMBOLFILES=no
 			;;
 		*)
 			return 1
 			;;
 		esac
 	else
 		return 1
 	fi
 }
 
 # Handle one line of configuration
 configline () {
 	if [ $# -eq 0 ]; then
 		return
 	fi
 
 	OPT=$1
 	shift
 	config_${OPT} $@
 }
 
 #### Parameter handling functions.
 
 # Initialize parameters to null, just in case they're
 # set in the environment.
 init_params () {
 	# Configuration settings
 	nullconfig
 
 	# No configuration file set yet
 	CONFFILE=""
 
 	# No commands specified yet
 	COMMANDS=""
 
 	# Force fetch to proceed
 	FORCEFETCH=0
 
 	# Run without a TTY
 	NOTTYOK=0
 }
 
 # Parse the command line
 parse_cmdline () {
 	while [ $# -gt 0 ]; do
 		case "$1" in
 		# Location of configuration file
 		-f)
 			if [ $# -eq 1 ]; then usage; fi
 			if [ ! -z "${CONFFILE}" ]; then usage; fi
 			shift; CONFFILE="$1"
 			;;
 		-F)
 			FORCEFETCH=1
 			;;
 		--not-running-from-cron)
 			NOTTYOK=1
 			;;
 
 		# Configuration file equivalents
 		-b)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_BaseDir $1 || usage
 			;;
 		-d)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_WorkDir $1 || usage
 			;;
 		-k)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_KeyPrint $1 || usage
 			;;
 		-s)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_ServerName $1 || usage
 			;;
 		-r)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_TargetRelease $1 || usage
 			;;
 		-t)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_MailTo $1 || usage
 			;;
 		-v)
 			if [ $# -eq 1 ]; then usage; fi; shift
 			config_VerboseLevel $1 || usage
 			;;
 
 		# Aliases for "-v debug" and "-v nostats"
 		--debug)
 			config_VerboseLevel debug || usage
 			;;
 		--no-stats)
 			config_VerboseLevel nostats || usage
 			;;
 
 		# Commands
 		cron | fetch | upgrade | install | rollback | IDS)
 			COMMANDS="${COMMANDS} $1"
 			;;
 
 		# Anything else is an error
 		*)
 			usage
 			;;
 		esac
 		shift
 	done
 
 	# Make sure we have at least one command
 	if [ -z "${COMMANDS}" ]; then
 		usage
 	fi
 }
 
 # Parse the configuration file
 parse_conffile () {
 	# If a configuration file was specified on the command line, check
 	# that it exists and is readable.
 	if [ ! -z "${CONFFILE}" ] && [ ! -r "${CONFFILE}" ]; then
 		echo -n "File does not exist "
 		echo -n "or is not readable: "
 		echo ${CONFFILE}
 		exit 1
 	fi
 
 	# If a configuration file was not specified on the command line,
 	# use the default configuration file path.  If that default does
 	# not exist, give up looking for any configuration.
 	if [ -z "${CONFFILE}" ]; then
 		CONFFILE="/etc/freebsd-update.conf"
 		if [ ! -r "${CONFFILE}" ]; then
 			return
 		fi
 	fi
 
 	# Save the configuration options specified on the command line, and
 	# clear all the options in preparation for reading the config file.
 	saveconfig
 	nullconfig
 
 	# Read the configuration file.  Anything after the first '#' is
 	# ignored, and any blank lines are ignored.
 	L=0
 	while read LINE; do
 		L=$(($L + 1))
 		LINEX=`echo "${LINE}" | cut -f 1 -d '#'`
 		if ! configline ${LINEX}; then
 			echo "Error processing configuration file, line $L:"
 			echo "==> ${LINE}"
 			exit 1
 		fi
 	done < ${CONFFILE}
 
 	# Merge the settings read from the configuration file with those
 	# provided at the command line.
 	mergeconfig
 }
 
 # Provide some default parameters
 default_params () {
 	# Save any parameters already configured, and clear the slate
 	saveconfig
 	nullconfig
 
 	# Default configurations
 	config_WorkDir /var/db/freebsd-update
 	config_MailTo root
 	config_AllowAdd yes
 	config_AllowDelete yes
 	config_KeepModifiedMetadata yes
 	config_BaseDir /
 	config_VerboseLevel stats
 	config_StrictComponents no
 	config_BackupKernel yes
 	config_BackupKernelDir /boot/kernel.old
 	config_BackupKernelSymbolFiles no
 
 	# Merge these defaults into the earlier-configured settings
 	mergeconfig
 }
 
 # Set utility output filtering options, based on ${VERBOSELEVEL}
 fetch_setup_verboselevel () {
 	case ${VERBOSELEVEL} in
 	debug)
 		QUIETREDIR="/dev/stderr"
 		QUIETFLAG=" "
 		STATSREDIR="/dev/stderr"
 		DDSTATS=".."
 		XARGST="-t"
 		NDEBUG=" "
 		;;
 	nostats)
 		QUIETREDIR=""
 		QUIETFLAG=""
 		STATSREDIR="/dev/null"
 		DDSTATS=".."
 		XARGST=""
 		NDEBUG=""
 		;;
 	stats)
 		QUIETREDIR="/dev/null"
 		QUIETFLAG="-q"
 		STATSREDIR="/dev/stdout"
 		DDSTATS=""
 		XARGST=""
 		NDEBUG="-n"
 		;;
 	esac
 }
 
 # Perform sanity checks and set some final parameters
 # in preparation for fetching files.  Figure out which
 # set of updates should be downloaded: If the user is
 # running *-p[0-9]+, strip off the last part; if the
 # user is running -SECURITY, call it -RELEASE.  Chdir
 # into the working directory.
 fetchupgrade_check_params () {
 	export HTTP_USER_AGENT="freebsd-update (${COMMAND}, `uname -r`)"
 
 	_SERVERNAME_z=\
 "SERVERNAME must be given via command line or configuration file."
 	_KEYPRINT_z="Key must be given via -k option or configuration file."
 	_KEYPRINT_bad="Invalid key fingerprint: "
 	_WORKDIR_bad="Directory does not exist or is not writable: "
 	_WORKDIR_bad2="Directory is not on a persistent filesystem: "
 
 	if [ -z "${SERVERNAME}" ]; then
 		echo -n "`basename $0`: "
 		echo "${_SERVERNAME_z}"
 		exit 1
 	fi
 	if [ -z "${KEYPRINT}" ]; then
 		echo -n "`basename $0`: "
 		echo "${_KEYPRINT_z}"
 		exit 1
 	fi
 	if ! echo "${KEYPRINT}" | grep -qE "^[0-9a-f]{64}$"; then
 		echo -n "`basename $0`: "
 		echo -n "${_KEYPRINT_bad}"
 		echo ${KEYPRINT}
 		exit 1
 	fi
 	if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then
 		echo -n "`basename $0`: "
 		echo -n "${_WORKDIR_bad}"
 		echo ${WORKDIR}
 		exit 1
 	fi
 	case `df -T ${WORKDIR}` in */dev/md[0-9]* | *tmpfs*)
 		echo -n "`basename $0`: "
 		echo -n "${_WORKDIR_bad2}"
 		echo ${WORKDIR}
 		exit 1
 		;;
 	esac
 	chmod 700 ${WORKDIR}
 	cd ${WORKDIR} || exit 1
 
 	# Generate release number.  The s/SECURITY/RELEASE/ bit exists
 	# to provide an upgrade path for FreeBSD Update 1.x users, since
 	# the kernels provided by FreeBSD Update 1.x are always labelled
 	# as X.Y-SECURITY.
 	RELNUM=`uname -r |
 	    sed -E 's,-p[0-9]+,,' |
 	    sed -E 's,-SECURITY,-RELEASE,'`
 	ARCH=`uname -m`
 	FETCHDIR=${RELNUM}/${ARCH}
 	PATCHDIR=${RELNUM}/${ARCH}/bp
 
 	# Figure out what directory contains the running kernel
 	BOOTFILE=`sysctl -n kern.bootfile`
 	KERNELDIR=${BOOTFILE%/kernel}
 	if ! [ -d ${KERNELDIR} ]; then
 		echo "Cannot identify running kernel"
 		exit 1
 	fi
 
 	# Figure out what kernel configuration is running.  We start with
 	# the output of `uname -i`, and then make the following adjustments:
 	# 1. Replace "SMP-GENERIC" with "SMP".  Why the SMP kernel config
 	# file says "ident SMP-GENERIC", I don't know...
 	# 2. If the kernel claims to be GENERIC _and_ ${ARCH} is "amd64"
 	# _and_ `sysctl kern.version` contains a line which ends "/SMP", then
 	# we're running an SMP kernel.  This mis-identification is a bug
 	# which was fixed in 6.2-STABLE.
 	KERNCONF=`uname -i`
 	if [ ${KERNCONF} = "SMP-GENERIC" ]; then
 		KERNCONF=SMP
 	fi
 	if [ ${KERNCONF} = "GENERIC" ] && [ ${ARCH} = "amd64" ]; then
 		if sysctl kern.version | grep -qE '/SMP$'; then
 			KERNCONF=SMP
 		fi
 	fi
 
 	# Define some paths
 	BSPATCH=/usr/bin/bspatch
 	SHA256=/sbin/sha256
 	PHTTPGET=/usr/libexec/phttpget
 
 	# Set up variables relating to VERBOSELEVEL
 	fetch_setup_verboselevel
 
 	# Construct a unique name from ${BASEDIR}
 	BDHASH=`echo ${BASEDIR} | sha256 -q`
 }
 
 # Perform sanity checks etc. before fetching updates.
 fetch_check_params () {
 	fetchupgrade_check_params
 
 	if ! [ -z "${TARGETRELEASE}" ]; then
 		echo -n "`basename $0`: "
 		echo -n "-r option is meaningless with 'fetch' command.  "
 		echo "(Did you mean 'upgrade' instead?)"
 		exit 1
 	fi
 
 	# Check that we have updates ready to install
-	if [ -f ${BDHASH}-install/kerneldone && $FORCEFETCH -eq 0 ]; then
+	if [ -f ${BDHASH}-install/kerneldone -a $FORCEFETCH -eq 0 ]; then
 		echo "You have a partially completed upgrade pending"
 		echo "Run '$0 install' first."
 		echo "Run '$0 fetch -F' to proceed anyway."
 		exit 1
 	fi
 }
 
 # Perform sanity checks etc. before fetching upgrades.
 upgrade_check_params () {
 	fetchupgrade_check_params
 
 	# Unless set otherwise, we're upgrading to the same kernel config.
 	NKERNCONF=${KERNCONF}
 
 	# We need TARGETRELEASE set
 	_TARGETRELEASE_z="Release target must be specified via -r option."
 	if [ -z "${TARGETRELEASE}" ]; then
 		echo -n "`basename $0`: "
 		echo "${_TARGETRELEASE_z}"
 		exit 1
 	fi
 
 	# The target release should be != the current release.
 	if [ "${TARGETRELEASE}" = "${RELNUM}" ]; then
 		echo -n "`basename $0`: "
 		echo "Cannot upgrade from ${RELNUM} to itself"
 		exit 1
 	fi
 
 	# Turning off AllowAdd or AllowDelete is a bad idea for upgrades.
 	if [ "${ALLOWADD}" = "no" ]; then
 		echo -n "`basename $0`: "
 		echo -n "WARNING: \"AllowAdd no\" is a bad idea "
 		echo "when upgrading between releases."
 		echo
 	fi
 	if [ "${ALLOWDELETE}" = "no" ]; then
 		echo -n "`basename $0`: "
 		echo -n "WARNING: \"AllowDelete no\" is a bad idea "
 		echo "when upgrading between releases."
 		echo
 	fi
 
 	# Set EDITOR to /usr/bin/vi if it isn't already set
 	: ${EDITOR:='/usr/bin/vi'}
 }
 
 # Perform sanity checks and set some final parameters in
 # preparation for installing updates.
 install_check_params () {
 	# Check that we are root.  All sorts of things won't work otherwise.
 	if [ `id -u` != 0 ]; then
 		echo "You must be root to run this."
 		exit 1
 	fi
 
 	# Check that securelevel <= 0.  Otherwise we can't update schg files.
 	if [ `sysctl -n kern.securelevel` -gt 0 ]; then
 		echo "Updates cannot be installed when the system securelevel"
 		echo "is greater than zero."
 		exit 1
 	fi
 
 	# Check that we have a working directory
 	_WORKDIR_bad="Directory does not exist or is not writable: "
 	if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then
 		echo -n "`basename $0`: "
 		echo -n "${_WORKDIR_bad}"
 		echo ${WORKDIR}
 		exit 1
 	fi
 	cd ${WORKDIR} || exit 1
 
 	# Construct a unique name from ${BASEDIR}
 	BDHASH=`echo ${BASEDIR} | sha256 -q`
 
 	# Check that we have updates ready to install
 	if ! [ -L ${BDHASH}-install ]; then
 		echo "No updates are available to install."
 		echo "Run '$0 fetch' first."
 		exit 1
 	fi
 	if ! [ -f ${BDHASH}-install/INDEX-OLD ] ||
 	    ! [ -f ${BDHASH}-install/INDEX-NEW ]; then
 		echo "Update manifest is corrupt -- this should never happen."
 		echo "Re-run '$0 fetch'."
 		exit 1
 	fi
 
 	# Figure out what directory contains the running kernel
 	BOOTFILE=`sysctl -n kern.bootfile`
 	KERNELDIR=${BOOTFILE%/kernel}
 	if ! [ -d ${KERNELDIR} ]; then
 		echo "Cannot identify running kernel"
 		exit 1
 	fi
 }
 
 # Perform sanity checks and set some final parameters in
 # preparation for UNinstalling updates.
 rollback_check_params () {
 	# Check that we are root.  All sorts of things won't work otherwise.
 	if [ `id -u` != 0 ]; then
 		echo "You must be root to run this."
 		exit 1
 	fi
 
 	# Check that we have a working directory
 	_WORKDIR_bad="Directory does not exist or is not writable: "
 	if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then
 		echo -n "`basename $0`: "
 		echo -n "${_WORKDIR_bad}"
 		echo ${WORKDIR}
 		exit 1
 	fi
 	cd ${WORKDIR} || exit 1
 
 	# Construct a unique name from ${BASEDIR}
 	BDHASH=`echo ${BASEDIR} | sha256 -q`
 
 	# Check that we have updates ready to rollback
 	if ! [ -L ${BDHASH}-rollback ]; then
 		echo "No rollback directory found."
 		exit 1
 	fi
 	if ! [ -f ${BDHASH}-rollback/INDEX-OLD ] ||
 	    ! [ -f ${BDHASH}-rollback/INDEX-NEW ]; then
 		echo "Update manifest is corrupt -- this should never happen."
 		exit 1
 	fi
 }
 
 # Perform sanity checks and set some final parameters
 # in preparation for comparing the system against the
 # published index.  Figure out which index we should
 # compare against: If the user is running *-p[0-9]+,
 # strip off the last part; if the user is running
 # -SECURITY, call it -RELEASE.  Chdir into the working
 # directory.
 IDS_check_params () {
 	export HTTP_USER_AGENT="freebsd-update (${COMMAND}, `uname -r`)"
 
 	_SERVERNAME_z=\
 "SERVERNAME must be given via command line or configuration file."
 	_KEYPRINT_z="Key must be given via -k option or configuration file."
 	_KEYPRINT_bad="Invalid key fingerprint: "
 	_WORKDIR_bad="Directory does not exist or is not writable: "
 
 	if [ -z "${SERVERNAME}" ]; then
 		echo -n "`basename $0`: "
 		echo "${_SERVERNAME_z}"
 		exit 1
 	fi
 	if [ -z "${KEYPRINT}" ]; then
 		echo -n "`basename $0`: "
 		echo "${_KEYPRINT_z}"
 		exit 1
 	fi
 	if ! echo "${KEYPRINT}" | grep -qE "^[0-9a-f]{64}$"; then
 		echo -n "`basename $0`: "
 		echo -n "${_KEYPRINT_bad}"
 		echo ${KEYPRINT}
 		exit 1
 	fi
 	if ! [ -d "${WORKDIR}" -a -w "${WORKDIR}" ]; then
 		echo -n "`basename $0`: "
 		echo -n "${_WORKDIR_bad}"
 		echo ${WORKDIR}
 		exit 1
 	fi
 	cd ${WORKDIR} || exit 1
 
 	# Generate release number.  The s/SECURITY/RELEASE/ bit exists
 	# to provide an upgrade path for FreeBSD Update 1.x users, since
 	# the kernels provided by FreeBSD Update 1.x are always labelled
 	# as X.Y-SECURITY.
 	RELNUM=`uname -r |
 	    sed -E 's,-p[0-9]+,,' |
 	    sed -E 's,-SECURITY,-RELEASE,'`
 	ARCH=`uname -m`
 	FETCHDIR=${RELNUM}/${ARCH}
 	PATCHDIR=${RELNUM}/${ARCH}/bp
 
 	# Figure out what directory contains the running kernel
 	BOOTFILE=`sysctl -n kern.bootfile`
 	KERNELDIR=${BOOTFILE%/kernel}
 	if ! [ -d ${KERNELDIR} ]; then
 		echo "Cannot identify running kernel"
 		exit 1
 	fi
 
 	# Figure out what kernel configuration is running.  We start with
 	# the output of `uname -i`, and then make the following adjustments:
 	# 1. Replace "SMP-GENERIC" with "SMP".  Why the SMP kernel config
 	# file says "ident SMP-GENERIC", I don't know...
 	# 2. If the kernel claims to be GENERIC _and_ ${ARCH} is "amd64"
 	# _and_ `sysctl kern.version` contains a line which ends "/SMP", then
 	# we're running an SMP kernel.  This mis-identification is a bug
 	# which was fixed in 6.2-STABLE.
 	KERNCONF=`uname -i`
 	if [ ${KERNCONF} = "SMP-GENERIC" ]; then
 		KERNCONF=SMP
 	fi
 	if [ ${KERNCONF} = "GENERIC" ] && [ ${ARCH} = "amd64" ]; then
 		if sysctl kern.version | grep -qE '/SMP$'; then
 			KERNCONF=SMP
 		fi
 	fi
 
 	# Define some paths
 	SHA256=/sbin/sha256
 	PHTTPGET=/usr/libexec/phttpget
 
 	# Set up variables relating to VERBOSELEVEL
 	fetch_setup_verboselevel
 }
 
 #### Core functionality -- the actual work gets done here
 
 # Use an SRV query to pick a server.  If the SRV query doesn't provide
 # a useful answer, use the server name specified by the user.
 # Put another way... look up _http._tcp.${SERVERNAME} and pick a server
 # from that; or if no servers are returned, use ${SERVERNAME}.
 # This allows a user to specify "update.FreeBSD.org" (in which case
 # freebsd-update will select one of the mirrors) or the name of one
 # particular mirror (in which case freebsd-update will use that server,
 # since there won't be an SRV entry for that name).
 #
 # We ignore the Port field, since we are always going to use port 80.
 
 # Fetch the mirror list, but do not pick a mirror yet.  Returns 1 if
 # no mirrors are available for any reason.
 fetch_pick_server_init () {
 	: > serverlist_tried
 
 # Check that host(1) exists (i.e., that the system wasn't built with
 # WITHOUT_BIND set) and don't try to find a mirror if it doesn't exist.
 	if ! which -s host; then
 		: > serverlist_full
 		return 1
 	fi
 
 	echo -n "Looking up ${SERVERNAME} mirrors... "
 
 # Issue the SRV query and pull out the Priority, Weight, and Target fields.
 # BIND 9 prints "$name has SRV record ..." while BIND 8 prints
 # "$name server selection ..."; we allow either format.
 	MLIST="_http._tcp.${SERVERNAME}"
 	host -t srv "${MLIST}" |
 	    sed -nE "s/${MLIST} (has SRV record|server selection) //p" |
 	    cut -f 1,2,4 -d ' ' |
 	    sed -e 's/\.$//' |
 	    sort > serverlist_full
 
 # If no records, give up -- we'll just use the server name we were given.
 	if [ `wc -l < serverlist_full` -eq 0 ]; then
 		echo "none found."
 		return 1
 	fi
 
 # Report how many mirrors we found.
 	echo `wc -l < serverlist_full` "mirrors found."
 
 # Generate a random seed for use in picking mirrors.  If HTTP_PROXY
 # is set, this will be used to generate the seed; otherwise, the seed
 # will be random.
 	if [ -n "${HTTP_PROXY}${http_proxy}" ]; then
 		RANDVALUE=`sha256 -qs "${HTTP_PROXY}${http_proxy}" |
 		    tr -d 'a-f' |
 		    cut -c 1-9`
 	else
 		RANDVALUE=`jot -r 1 0 999999999`
 	fi
 }
 
 # Pick a mirror.  Returns 1 if we have run out of mirrors to try.
 fetch_pick_server () {
 # Generate a list of not-yet-tried mirrors
 	sort serverlist_tried |
 	    comm -23 serverlist_full - > serverlist
 
 # Have we run out of mirrors?
 	if [ `wc -l < serverlist` -eq 0 ]; then
 		echo "No mirrors remaining, giving up."
 		return 1
 	fi
 
 # Find the highest priority level (lowest numeric value).
 	SRV_PRIORITY=`cut -f 1 -d ' ' serverlist | sort -n | head -1`
 
 # Add up the weights of the response lines at that priority level.
 	SRV_WSUM=0;
 	while read X; do
 		case "$X" in
 		${SRV_PRIORITY}\ *)
 			SRV_W=`echo $X | cut -f 2 -d ' '`
 			SRV_WSUM=$(($SRV_WSUM + $SRV_W))
 			;;
 		esac
 	done < serverlist
 
 # If all the weights are 0, pretend that they are all 1 instead.
 	if [ ${SRV_WSUM} -eq 0 ]; then
 		SRV_WSUM=`grep -E "^${SRV_PRIORITY} " serverlist | wc -l`
 		SRV_W_ADD=1
 	else
 		SRV_W_ADD=0
 	fi
 
 # Pick a value between 0 and the sum of the weights - 1
 	SRV_RND=`expr ${RANDVALUE} % ${SRV_WSUM}`
 
 # Read through the list of mirrors and set SERVERNAME.  Write the line
 # corresponding to the mirror we selected into serverlist_tried so that
 # we won't try it again.
 	while read X; do
 		case "$X" in
 		${SRV_PRIORITY}\ *)
 			SRV_W=`echo $X | cut -f 2 -d ' '`
 			SRV_W=$(($SRV_W + $SRV_W_ADD))
 			if [ $SRV_RND -lt $SRV_W ]; then
 				SERVERNAME=`echo $X | cut -f 3 -d ' '`
 				echo "$X" >> serverlist_tried
 				break
 			else
 				SRV_RND=$(($SRV_RND - $SRV_W))
 			fi
 			;;
 		esac
 	done < serverlist
 }
 
 # Take a list of ${oldhash}|${newhash} and output a list of needed patches,
 # i.e., those for which we have ${oldhash} and don't have ${newhash}.
 fetch_make_patchlist () {
 	grep -vE "^([0-9a-f]{64})\|\1$" |
 	    tr '|' ' ' |
 		while read X Y; do
 			if [ -f "files/${Y}.gz" ] ||
 			    [ ! -f "files/${X}.gz" ]; then
 				continue
 			fi
 			echo "${X}|${Y}"
 		done | uniq
 }
 
 # Print user-friendly progress statistics
 fetch_progress () {
 	LNC=0
 	while read x; do
 		LNC=$(($LNC + 1))
 		if [ $(($LNC % 10)) = 0 ]; then
 			echo -n $LNC
 		elif [ $(($LNC % 2)) = 0 ]; then
 			echo -n .
 		fi
 	done
 	echo -n " "
 }
 
 # Function for asking the user if everything is ok
 continuep () {
 	while read -p "Does this look reasonable (y/n)? " CONTINUE; do
 		case "${CONTINUE}" in
 		y*)
 			return 0
 			;;
 		n*)
 			return 1
 			;;
 		esac
 	done
 }
 
 # Initialize the working directory
 workdir_init () {
 	mkdir -p files
 	touch tINDEX.present
 }
 
 # Check that we have a public key with an appropriate hash, or
 # fetch the key if it doesn't exist.  Returns 1 if the key has
 # not yet been fetched.
 fetch_key () {
 	if [ -r pub.ssl ] && [ `${SHA256} -q pub.ssl` = ${KEYPRINT} ]; then
 		return 0
 	fi
 
 	echo -n "Fetching public key from ${SERVERNAME}... "
 	rm -f pub.ssl
 	fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/pub.ssl \
 	    2>${QUIETREDIR} || true
 	if ! [ -r pub.ssl ]; then
 		echo "failed."
 		return 1
 	fi
 	if ! [ `${SHA256} -q pub.ssl` = ${KEYPRINT} ]; then
 		echo "key has incorrect hash."
 		rm -f pub.ssl
 		return 1
 	fi
 	echo "done."
 }
 
 # Fetch metadata signature, aka "tag".
 fetch_tag () {
 	echo -n "Fetching metadata signature "
 	echo ${NDEBUG} "for ${RELNUM} from ${SERVERNAME}... "
 	rm -f latest.ssl
 	fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/latest.ssl	\
 	    2>${QUIETREDIR} || true
 	if ! [ -r latest.ssl ]; then
 		echo "failed."
 		return 1
 	fi
 
 	openssl rsautl -pubin -inkey pub.ssl -verify		\
 	    < latest.ssl > tag.new 2>${QUIETREDIR} || true
 	rm latest.ssl
 
 	if ! [ `wc -l < tag.new` = 1 ] ||
 	    ! grep -qE	\
     "^freebsd-update\|${ARCH}\|${RELNUM}\|[0-9]+\|[0-9a-f]{64}\|[0-9]{10}" \
 		tag.new; then
 		echo "invalid signature."
 		return 1
 	fi
 
 	echo "done."
 
 	RELPATCHNUM=`cut -f 4 -d '|' < tag.new`
 	TINDEXHASH=`cut -f 5 -d '|' < tag.new`
 	EOLTIME=`cut -f 6 -d '|' < tag.new`
 }
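 
 # Illustrative sketch only, not called anywhere: the helper name
 # example_tag_fields is hypothetical.  Given one tag line of the form
 #   freebsd-update|ARCH|RELNUM|PATCHNUM|TINDEXHASH|EOLTIME
 # (the layout checked by the regex in fetch_tag), it prints the three
 # values which fetch_tag stores in RELPATCHNUM, TINDEXHASH and EOLTIME.
 example_tag_fields () {
 	# Fields 4-6 are the patch number, index hash and EoL timestamp.
 	echo "$1" | cut -f 4-6 -d '|' | tr '|' ' '
 }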
 
 # Sanity-check the patch number in a tag, to make sure that we're not
 # going to "update" backwards and to prevent replay attacks.
 fetch_tagsanity () {
 	# Check that we're not going to move from -pX to -pY with Y < X.
 	RELPX=`uname -r | sed -E 's,.*-,,'`
 	if echo ${RELPX} | grep -qE '^p[0-9]+$'; then
 		RELPX=`echo ${RELPX} | cut -c 2-`
 	else
 		RELPX=0
 	fi
 	if [ "${RELPATCHNUM}" -lt "${RELPX}" ]; then
 		echo
 		echo -n "Files on mirror (${RELNUM}-p${RELPATCHNUM})"
 		echo " appear older than what"
 		echo "we are currently running (`uname -r`)!"
 		echo "Cowardly refusing to proceed any further."
 		return 1
 	fi
 
 	# If "tag" exists and corresponds to ${RELNUM}, make sure that
 	# it contains a patch number <= RELPATCHNUM, in order to protect
 	# against rollback (replay) attacks.
 	if [ -f tag ] &&
 	    grep -qE	\
     "^freebsd-update\|${ARCH}\|${RELNUM}\|[0-9]+\|[0-9a-f]{64}\|[0-9]{10}" \
 		tag; then
 		LASTRELPATCHNUM=`cut -f 4 -d '|' < tag`
 
 		if [ "${RELPATCHNUM}" -lt "${LASTRELPATCHNUM}" ]; then
 			echo
 			echo -n "Files on mirror (${RELNUM}-p${RELPATCHNUM})"
 			echo " are older than the"
 			echo -n "most recently seen updates"
 			echo " (${RELNUM}-p${LASTRELPATCHNUM})."
 			echo "Cowardly refusing to proceed any further."
 			return 1
 		fi
 	fi
 }
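 
 # Illustrative sketch only, not called anywhere: the helper name
 # example_patchlevel is hypothetical.  It applies the same steps as
 # fetch_tagsanity to a release string passed as $1 instead of to the
 # output of uname -r: "10.1-RELEASE-p9" yields 9, while a string with
 # no -pN suffix, such as "10.1-RELEASE", yields 0.
 example_patchlevel () {
 	# Keep only the text after the last '-'.
 	_pl=`echo "$1" | sed -E 's,.*-,,'`
 	if echo "${_pl}" | grep -qE '^p[0-9]+$'; then
 		# Strip the leading "p" to leave the bare number.
 		echo "${_pl}" | cut -c 2-
 	else
 		echo 0
 	fi
 }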
 
 # Fetch metadata index file
 fetch_metadata_index () {
 	echo ${NDEBUG} "Fetching metadata index... "
 	rm -f ${TINDEXHASH}
 	fetch ${QUIETFLAG} http://${SERVERNAME}/${FETCHDIR}/t/${TINDEXHASH}	\
 	    2>${QUIETREDIR}
 	if ! [ -f ${TINDEXHASH} ]; then
 		echo "failed."
 		return 1
 	fi
 	if [ `${SHA256} -q ${TINDEXHASH}` != ${TINDEXHASH} ]; then
 		echo "update metadata index corrupt."
 		return 1
 	fi
 	echo "done."
 }
 
 # Print an error message about signed metadata being bogus.
 fetch_metadata_bogus () {
 	echo
 	echo "The update metadata$1 is correctly signed, but"
 	echo "failed an integrity check."
 	echo "Cowardly refusing to proceed any further."
 	return 1
 }
 
 # Construct tINDEX.new by merging the lines named in $@ from ${TINDEXHASH}
 # with the lines not named in $@ from tINDEX.present (if that file exists).
 fetch_metadata_index_merge () {
 	for METAFILE in $@; do
 		if [ `grep -E "^${METAFILE}\|" ${TINDEXHASH} | wc -l`	\
 		    -ne 1 ]; then
 			fetch_metadata_bogus " index"
 			return 1
 		fi
 
 		grep -E "^${METAFILE}\|" ${TINDEXHASH}
 	done |
 	    sort > tINDEX.wanted
 
 	if [ -f tINDEX.present ]; then
 		join -t '|' -v 2 tINDEX.wanted tINDEX.present |
 		    sort -m - tINDEX.wanted > tINDEX.new
 		rm tINDEX.wanted
 	else
 		mv tINDEX.wanted tINDEX.new
 	fi
 }
 
 # Sanity check all the lines of tINDEX.new.  Even if more metadata lines
 # are added by future versions of the server, this won't cause problems,
 # since the only lines which appear in tINDEX.new are the ones which we
 # specifically grepped out of ${TINDEXHASH}.
 fetch_metadata_index_sanity () {
 	if grep -qvE '^[0-9A-Z.-]+\|[0-9a-f]{64}$' tINDEX.new; then
 		fetch_metadata_bogus " index"
 		return 1
 	fi
 }
 
 # Sanity check the metadata file $1.
 fetch_metadata_sanity () {
 	# Some aliases to save space later: ${P} is a character which can
 	# appear in a path; ${M} is the four numeric metadata fields; and
 	# ${H} is a sha256 hash.
 	P="[-+./:=,%@_[~[:alnum:]]"
 	M="[0-9]+\|[0-9]+\|[0-9]+\|[0-9]+"
 	H="[0-9a-f]{64}"
 
 	# Check that the first four fields make sense.
 	if gunzip -c < files/$1.gz |
 	    grep -qvE "^[a-z]+\|[0-9a-z]+\|${P}+\|[fdL-]\|"; then
 		fetch_metadata_bogus ""
 		return 1
 	fi
 
 	# Remove the first three fields.
 	gunzip -c < files/$1.gz |
 	    cut -f 4- -d '|' > sanitycheck.tmp
 
 	# Sanity check entries with type 'f'
 	if grep -E '^f' sanitycheck.tmp |
 	    grep -qvE "^f\|${M}\|${H}\|${P}*\$"; then
 		fetch_metadata_bogus ""
 		return 1
 	fi
 
 	# Sanity check entries with type 'd'
 	if grep -E '^d' sanitycheck.tmp |
 	    grep -qvE "^d\|${M}\|\|\$"; then
 		fetch_metadata_bogus ""
 		return 1
 	fi
 
 	# Sanity check entries with type 'L'
 	if grep -E '^L' sanitycheck.tmp |
 	    grep -qvE "^L\|${M}\|${P}*\|\$"; then
 		fetch_metadata_bogus ""
 		return 1
 	fi
 
 	# Sanity check entries with type '-'
 	if grep -E '^-' sanitycheck.tmp |
 	    grep -qvE "^-\|\|\|\|\|\|"; then
 		fetch_metadata_bogus ""
 		return 1
 	fi
 
 	# Clean up
 	rm sanitycheck.tmp
 }
 
 # Fetch the metadata index and metadata files listed in $@,
 # taking advantage of metadata patches where possible.
 fetch_metadata () {
 	fetch_metadata_index || return 1
 	fetch_metadata_index_merge $@ || return 1
 	fetch_metadata_index_sanity || return 1
 
 	# Generate a list of wanted metadata patches
 	join -t '|' -o 1.2,2.2 tINDEX.present tINDEX.new |
 	    fetch_make_patchlist > patchlist
 
 	if [ -s patchlist ]; then
 		# Attempt to fetch metadata patches
 		echo -n "Fetching `wc -l < patchlist | tr -d ' '` "
 		echo ${NDEBUG} "metadata patches.${DDSTATS}"
 		tr '|' '-' < patchlist |
 		    lam -s "${FETCHDIR}/tp/" - -s ".gz" |
 		    xargs ${XARGST} ${PHTTPGET} ${SERVERNAME}	\
 			2>${STATSREDIR} | fetch_progress
 		echo "done."
 
 		# Attempt to apply metadata patches
 		echo -n "Applying metadata patches... "
 		tr '|' ' ' < patchlist |
 		    while read X Y; do
 			if [ ! -f "${X}-${Y}.gz" ]; then continue; fi
 			gunzip -c < ${X}-${Y}.gz > diff
 			gunzip -c < files/${X}.gz > diff-OLD
 
 			# Figure out which lines are being added and removed
 			grep -E '^-' diff |
 			    cut -c 2- |
 			    while read PREFIX; do
 				look "${PREFIX}" diff-OLD
 			    done |
 			    sort > diff-rm
 			grep -E '^\+' diff |
 			    cut -c 2- > diff-add
 
 			# Generate the new file
 			comm -23 diff-OLD diff-rm |
 			    sort - diff-add > diff-NEW
 
 			if [ `${SHA256} -q diff-NEW` = ${Y} ]; then
 				mv diff-NEW files/${Y}
 				gzip -n files/${Y}
 			else
 				mv diff-NEW ${Y}.bad
 			fi
 			rm -f ${X}-${Y}.gz diff
 			rm -f diff-OLD diff-NEW diff-add diff-rm
 		done 2>${QUIETREDIR}
 		echo "done."
 	fi
 
 	# Update metadata without patches
 	cut -f 2 -d '|' < tINDEX.new |
 	    while read Y; do
 		if [ ! -f "files/${Y}.gz" ]; then
 			echo ${Y};
 		fi
 	    done |
 	    sort -u > filelist
 
 	if [ -s filelist ]; then
 		echo -n "Fetching `wc -l < filelist | tr -d ' '` "
 		echo ${NDEBUG} "metadata files... "
 		lam -s "${FETCHDIR}/m/" - -s ".gz" < filelist |
 		    xargs ${XARGST} ${PHTTPGET} ${SERVERNAME}	\
 		    2>${QUIETREDIR}
 
 		while read Y; do
 			if ! [ -f ${Y}.gz ]; then
 				echo "failed."
 				return 1
 			fi
 			if [ `gunzip -c < ${Y}.gz |
 			    ${SHA256} -q` = ${Y} ]; then
 				mv ${Y}.gz files/${Y}.gz
 			else
 				echo "metadata is corrupt."
 				return 1
 			fi
 		done < filelist
 		echo "done."
 	fi
 
 	# Sanity-check the metadata files.
 	cut -f 2 -d '|' tINDEX.new > filelist
 	while read X; do
 		fetch_metadata_sanity ${X} || return 1
 	done < filelist
 
 	# Remove files which are no longer needed
 	cut -f 2 -d '|' tINDEX.present |
 	    sort > oldfiles
 	cut -f 2 -d '|' tINDEX.new |
 	    sort |
 	    comm -13 - oldfiles |
 	    lam -s "files/" - -s ".gz" |
 	    xargs rm -f
 	rm patchlist filelist oldfiles
 	rm ${TINDEXHASH}
 
 	# We're done!
 	mv tINDEX.new tINDEX.present
 	mv tag.new tag
 
 	return 0
 }
 
 # Extract a subset of a downloaded metadata file containing only the parts
 # which are listed in COMPONENTS.
 fetch_filter_metadata_components () {
 	METAHASH=`look "$1|" tINDEX.present | cut -f 2 -d '|'`
 	gunzip -c < files/${METAHASH}.gz > $1.all
 
 	# Fish out the lines belonging to components we care about.
 	for C in ${COMPONENTS}; do
 		look "`echo ${C} | tr '/' '|'`|" $1.all
 	done > $1
 
 	# Remove temporary file.
 	rm $1.all
 }
 
 # Generate a filtered version of the metadata file $1 from the downloaded
 # file, by fishing out the lines corresponding to components we're trying
 # to keep updated, and then removing lines corresponding to paths we want
 # to ignore.
 fetch_filter_metadata () {
 	# Fish out the lines belonging to components we care about.
 	fetch_filter_metadata_components $1
 
 	# Canonicalize directory names by removing any trailing / in
 	# order to avoid listing directories multiple times if they
 	# belong to multiple components.  Turning "/" into "" doesn't
 	# matter, since we add a leading "/" when we use paths later.
 	cut -f 3- -d '|' $1 |
 	    sed -e 's,/|d|,|d|,' |
 	    sed -e 's,/|-|,|-|,' |
 	    sort -u > $1.tmp
 
 	# Figure out which lines to ignore and remove them.
 	for X in ${IGNOREPATHS}; do
 		grep -E "^${X}" $1.tmp
 	done |
 	    sort -u |
 	    comm -13 - $1.tmp > $1
 
 	# Remove temporary files.
 	rm $1.tmp
 }
 
 # Filter the metadata file $1 by adding lines with "/boot/$2"
 # replaced by ${KERNELDIR} (which is `sysctl -n kern.bootfile` minus the
 # trailing "/kernel"); and if "/boot/$2" does not exist, remove
 # the original lines which start with that.
 # Put another way: Deal with the fact that the FOO kernel is sometimes
 # installed in /boot/FOO/ and is sometimes installed elsewhere.
 fetch_filter_kernel_names () {
 	grep ^/boot/$2 $1 |
 	    sed -e "s,/boot/$2,${KERNELDIR},g" |
 	    sort - $1 > $1.tmp
 	mv $1.tmp $1
 
 	if ! [ -d /boot/$2 ]; then
 		grep -v ^/boot/$2 $1 > $1.tmp
 		mv $1.tmp $1
 	fi
 }
 
 # For all paths appearing in $1 or $3, inspect the system
 # and generate $2 describing what is currently installed.
 fetch_inspect_system () {
 	# No errors yet...
 	rm -f .err
 
 	# Tell the user why his disk is suddenly making lots of noise
 	echo -n "Inspecting system... "
 
 	# Generate list of files to inspect
 	cat $1 $3 |
 	    cut -f 1 -d '|' |
 	    sort -u > filelist
 
 	# Examine each file and output lines of the form
 	# /path/to/file|type|device-inum|user|group|perm|flags|value
 	# sorted by device and inode number.
 	while read F; do
 		# If the symlink/file/directory does not exist, record this.
 		if ! [ -e ${BASEDIR}/${F} ]; then
 			echo "${F}|-||||||"
 			continue
 		fi
 		if ! [ -r ${BASEDIR}/${F} ]; then
 			echo "Cannot read file: ${BASEDIR}/${F}"	\
 			    >/dev/stderr
 			touch .err
 			return 1
 		fi
 
 		# Otherwise, output an index line.
 		if [ -L ${BASEDIR}/${F} ]; then
 			echo -n "${F}|L|"
 			stat -n -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F};
 			readlink ${BASEDIR}/${F};
 		elif [ -f ${BASEDIR}/${F} ]; then
 			echo -n "${F}|f|"
 			stat -n -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F};
 			sha256 -q ${BASEDIR}/${F};
 		elif [ -d ${BASEDIR}/${F} ]; then
 			echo -n "${F}|d|"
 			stat -f '%d-%i|%u|%g|%Mp%Lp|%Of|' ${BASEDIR}/${F};
 		else
 			echo "Unknown file type: ${BASEDIR}/${F}"	\
 			    >/dev/stderr
 			touch .err
 			return 1
 		fi
 	done < filelist |
 	    sort -k 3,3 -t '|' > $2.tmp
 	rm filelist
 
 	# Check if an error occurred during system inspection
 	if [ -f .err ]; then
 		return 1
 	fi
 
 	# Convert to the form
 	# /path/to/file|type|user|group|perm|flags|value|hlink
 	# by resolving identical device and inode numbers into hard links.
 	cut -f 1,3 -d '|' $2.tmp |
 	    sort -k 1,1 -t '|' |
 	    sort -s -u -k 2,2 -t '|' |
 	    join -1 2 -2 3 -t '|' - $2.tmp |
 	    awk -F \| -v OFS=\|		\
 		'{
 		    if (($2 == $3) || ($4 == "-"))
 			print $3,$4,$5,$6,$7,$8,$9,""
 		    else
 			print $3,$4,$5,$6,$7,$8,$9,$2
 		}' |
 	    sort > $2
 	rm $2.tmp
 
 	# We're finished looking around
 	echo "done."
 }
 
 # For any paths matching ${MERGECHANGES}, compare $1 and $2 and find any
 # files which differ; generate $3 containing these paths and the old hashes.
 fetch_filter_mergechanges () {
 	# Pull out the paths and hashes of the files matching ${MERGECHANGES}.
 	for F in $1 $2; do
 		for X in ${MERGECHANGES}; do
 			grep -E "^${X}" ${F}
 		done |
 		    cut -f 1,2,7 -d '|' |
 		    sort > ${F}-values
 	done
 
 	# Any line in $2-values which doesn't appear in $1-values and is a
 	# file means that we should list the path in $3.
 	comm -13 $1-values $2-values |
 	    fgrep '|f|' |
 	    cut -f 1 -d '|' > $2-paths
 
 	# For each path, pull out one (and only one!) entry from $1-values.
 	# Note that we cannot distinguish which "old" version the user made
 	# changes to; but hopefully any changes which occur due to security
 	# updates will exist in both the "new" version and the version which
 	# the user has installed, so the merging will still work.
 	while read X; do
 		look "${X}|" $1-values |
 		    head -1
 	done < $2-paths > $3
 
 	# Clean up
 	rm $1-values $2-values $2-paths
 }
 
 # For any paths matching ${UPDATEIFUNMODIFIED}, remove lines from $[123]
 # which correspond to lines in $2 with hashes not matching $1 or $3, unless
 # the paths are listed in $4.  For entries in $2 marked "not present"
 # (aka. type -), remove lines from $[123] unless there is a corresponding
 # entry in $1.
 fetch_filter_unmodified_notpresent () {
 	# Figure out which lines of $1 and $3 correspond to bits which
 	# should only be updated if they haven't changed, and fish out
 	# the (path, type, value) tuples.
 	# NOTE: We don't consider a file to be "modified" if it matches
 	# the hash from $3.
 	for X in ${UPDATEIFUNMODIFIED}; do
 		grep -E "^${X}" $1
 		grep -E "^${X}" $3
 	done |
 	    cut -f 1,2,7 -d '|' |
 	    sort > $1-values
 
 	# Do the same for $2.
 	for X in ${UPDATEIFUNMODIFIED}; do
 		grep -E "^${X}" $2
 	done |
 	    cut -f 1,2,7 -d '|' |
 	    sort > $2-values
 
 	# Any entry in $2-values which is not in $1-values corresponds to
 	# a path which we need to remove from $1, $2, and $3, unless
 	# that path appears in $4.
 	comm -13 $1-values $2-values |
 	    sort -t '|' -k 1,1 > mlines.tmp
 	cut -f 1 -d '|' $4 |
 	    sort |
 	    join -v 2 -t '|' - mlines.tmp |
 	    sort > mlines
 	rm $1-values $2-values mlines.tmp
 
 	# Any lines in $2 which are not in $1 AND are "not present" lines
 	# also belong in mlines.
 	comm -13 $1 $2 |
 	    cut -f 1,2,7 -d '|' |
 	    fgrep '|-|' >> mlines
 
 	# Remove lines from $1, $2, and $3
 	for X in $1 $2 $3; do
 		sort -t '|' -k 1,1 ${X} > ${X}.tmp
 		cut -f 1 -d '|' < mlines |
 		    sort |
 		    join -v 2 -t '|' - ${X}.tmp |
 		    sort > ${X}
 		rm ${X}.tmp
 	done
 
 	# Store a list of the modified files, for future reference
 	fgrep -v '|-|' mlines |
 	    cut -f 1 -d '|' > modifiedfiles
 	rm mlines
 }
 
 # For each entry in $1 of type -, remove any corresponding
 # entry from $2 if ${ALLOWADD} != "yes".  Remove all entries
 # of type - from $1.
 fetch_filter_allowadd () {
 	cut -f 1,2 -d '|' < $1 |
 	    fgrep '|-' |
 	    cut -f 1 -d '|' > filesnotpresent
 
 	if [ ${ALLOWADD} != "yes" ]; then
 		sort < $2 |
 		    join -v 1 -t '|' - filesnotpresent |
 		    sort > $2.tmp
 		mv $2.tmp $2
 	fi
 
 	sort < $1 |
 	    join -v 1 -t '|' - filesnotpresent |
 	    sort > $1.tmp
 	mv $1.tmp $1
 	rm filesnotpresent
 }
 
 # If ${ALLOWDELETE} != "yes", then remove any entries from $1
 # which don't correspond to entries in $2.
 fetch_filter_allowdelete () {
 	# Produce lists of ${PATH}|${TYPE} pairs
 	for X in $1 $2; do
 		cut -f 1-2 -d '|' < ${X} |
 		    sort -u > ${X}.nodes
 	done
 
 	# Figure out which lines need to be removed from $1.
 	if [ ${ALLOWDELETE} != "yes" ]; then
 		comm -23 $1.nodes $2.nodes > $1.badnodes
 	else
 		: > $1.badnodes
 	fi
 
 	# Remove the relevant lines from $1
 	while read X; do
 		look "${X}|" $1
 	done < $1.badnodes |
 	    comm -13 - $1 > $1.tmp
 	mv $1.tmp $1
 
 	rm $1.badnodes $1.nodes $2.nodes
 }
 
 # If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in $2
 # with metadata not matching any entry in $1, replace the corresponding
 # line of $3 with one having the same metadata as the entry in $2.
 fetch_filter_modified_metadata () {
 	# Fish out the metadata from $1 and $2
 	for X in $1 $2; do
 		cut -f 1-6 -d '|' < ${X} > ${X}.metadata
 	done
 
 	# Find the metadata we need to keep
 	if [ ${KEEPMODIFIEDMETADATA} = "yes" ]; then
 		comm -13 $1.metadata $2.metadata > keepmeta
 	else
 		: > keepmeta
 	fi
 
 	# Extract the lines which we need to remove from $3, and
 	# construct the lines which we need to add to $3.
 	: > $3.remove
 	: > $3.add
 	while read LINE; do
 		NODE=`echo "${LINE}" | cut -f 1-2 -d '|'`
 		look "${NODE}|" $3 >> $3.remove
 		look "${NODE}|" $3 |
 		    cut -f 7- -d '|' |
 		    lam -s "${LINE}|" - >> $3.add
 	done < keepmeta
 
 	# Remove the specified lines and add the new lines.
 	sort $3.remove |
 	    comm -13 - $3 |
 	    sort -u - $3.add > $3.tmp
 	mv $3.tmp $3
 
 	rm keepmeta $1.metadata $2.metadata $3.add $3.remove
 }
 
 # Remove lines from $1 and $2 which are identical;
 # no need to update a file if it isn't changing.
 fetch_filter_uptodate () {
 	comm -23 $1 $2 > $1.tmp
 	comm -13 $1 $2 > $2.tmp
 
 	mv $1.tmp $1
 	mv $2.tmp $2
 }
 
 # Fetch any "clean" old versions of files we need for merging changes.
 fetch_files_premerge () {
 	# We only need to do anything if $1 is non-empty.
 	if [ -s $1 ]; then
 		# Tell the user what we're doing
 		echo -n "Fetching files from ${OLDRELNUM} for merging... "
 
 		# List of files wanted
 		fgrep '|f|' < $1 |
 		    cut -f 3 -d '|' |
 		    sort -u > files.wanted
 
 		# Only fetch the files we don't already have
 		while read Y; do
 			if [ ! -f "files/${Y}.gz" ]; then
 				echo ${Y};
 			fi
 		done < files.wanted > filelist
 
 		# Actually fetch them
 		lam -s "${OLDFETCHDIR}/f/" - -s ".gz" < filelist |
 		    xargs ${XARGST} ${PHTTPGET} ${SERVERNAME}	\
 		    2>${QUIETREDIR}
 
 		# Make sure we got them all, and move them into /files/
 		while read Y; do
 			if ! [ -f ${Y}.gz ]; then
 				echo "failed."
 				return 1
 			fi
 			if [ `gunzip -c < ${Y}.gz |
 			    ${SHA256} -q` = ${Y} ]; then
 				mv ${Y}.gz files/${Y}.gz
 			else
 				echo "${Y} has incorrect hash."
 				return 1
 			fi
 		done < filelist
 		echo "done."
 
 		# Clean up
 		rm filelist files.wanted
 	fi
 }
 
 # Prepare to fetch files: Generate a list of the files we need,
 # copy the unmodified files we have into /files/, and generate
 # a list of patches to download.
 fetch_files_prepare () {
 	# Tell the user why his disk is suddenly making lots of noise
 	echo -n "Preparing to download files... "
 
 	# Reduce indices to ${PATH}|${HASH} pairs
 	for X in $1 $2 $3; do
 		cut -f 1,2,7 -d '|' < ${X} |
 		    fgrep '|f|' |
 		    cut -f 1,3 -d '|' |
 		    sort > ${X}.hashes
 	done
 
 	# List of files wanted
 	cut -f 2 -d '|' < $3.hashes |
 	    sort -u |
 	    while read HASH; do
 		if ! [ -f files/${HASH}.gz ]; then
 			echo ${HASH}
 		fi
 	done > files.wanted
 
 	# Generate a list of unmodified files
 	comm -12 $1.hashes $2.hashes |
 	    sort -k 1,1 -t '|' > unmodified.files
 
 	# Copy all files into /files/.  We only need the unmodified files
 	# for use in patching; but we'll want all of them if the user asks
 	# to rollback the updates later.
 	while read LINE; do
 		F=`echo "${LINE}" | cut -f 1 -d '|'`
 		HASH=`echo "${LINE}" | cut -f 2 -d '|'`
 
 		# Skip files we already have.
 		if [ -f files/${HASH}.gz ]; then
 			continue
 		fi
 
 		# Make sure the file hasn't changed.
 		cp "${BASEDIR}/${F}" tmpfile
 		if [ `sha256 -q tmpfile` != ${HASH} ]; then
 			echo
 			echo "File changed while FreeBSD Update running: ${F}"
 			return 1
 		fi
 
 		# Place the file into storage.
 		gzip -c < tmpfile > files/${HASH}.gz
 		rm tmpfile
 	done < $2.hashes
 
 	# Produce a list of patches to download
 	sort -k 1,1 -t '|' $3.hashes |
 	    join -t '|' -o 2.2,1.2 - unmodified.files |
 	    fetch_make_patchlist > patchlist
 
 	# Garbage collect
 	rm unmodified.files $1.hashes $2.hashes $3.hashes
 
 	# We don't need the list of possible old files any more.
 	rm $1
 
 	# We're finished making noise
 	echo "done."
 }
 
 # Fetch files.
 fetch_files () {
 	# Attempt to fetch patches
 	if [ -s patchlist ]; then
 		echo -n "Fetching `wc -l < patchlist | tr -d ' '` "
 		echo ${NDEBUG} "patches.${DDSTATS}"
 		tr '|' '-' < patchlist |
 		    lam -s "${PATCHDIR}/" - |
 		    xargs ${XARGST} ${PHTTPGET} ${SERVERNAME}	\
 			2>${STATSREDIR} | fetch_progress
 		echo "done."
 
 		# Attempt to apply patches
 		echo -n "Applying patches... "
 		tr '|' ' ' < patchlist |
 		    while read X Y; do
 			if [ ! -f "${X}-${Y}" ]; then continue; fi
 			gunzip -c < files/${X}.gz > OLD
 
 			bspatch OLD NEW ${X}-${Y}
 
 			if [ `${SHA256} -q NEW` = ${Y} ]; then
 				mv NEW files/${Y}
 				gzip -n files/${Y}
 			fi
 			rm -f diff OLD NEW ${X}-${Y}
 		done 2>${QUIETREDIR}
 		echo "done."
 	fi
 
 	# Download files which couldn't be generated via patching
 	while read Y; do
 		if [ ! -f "files/${Y}.gz" ]; then
 			echo ${Y};
 		fi
 	done < files.wanted > filelist
 
 	if [ -s filelist ]; then
 		echo -n "Fetching `wc -l < filelist | tr -d ' '` "
 		echo ${NDEBUG} "files... "
 		lam -s "${FETCHDIR}/f/" - -s ".gz" < filelist |
 		    xargs ${XARGST} ${PHTTPGET} ${SERVERNAME}	\
 		    2>${QUIETREDIR}
 
 		while read Y; do
 			if ! [ -f ${Y}.gz ]; then
 				echo "failed."
 				return 1
 			fi
 			if [ `gunzip -c < ${Y}.gz |
 			    ${SHA256} -q` = ${Y} ]; then
 				mv ${Y}.gz files/${Y}.gz
 			else
 				echo "${Y} has incorrect hash."
 				return 1
 			fi
 		done < filelist
 		echo "done."
 	fi
 
 	# Clean up
 	rm files.wanted filelist patchlist
 }
 
 # Create and populate install manifest directory; and report what updates
 # are available.
 fetch_create_manifest () {
 	# If we have an existing install manifest, nuke it.
 	if [ -L "${BDHASH}-install" ]; then
 		rm -r ${BDHASH}-install/
 		rm ${BDHASH}-install
 	fi
 
 	# Report to the user if any updates were avoided due to local changes
 	if [ -s modifiedfiles ]; then
 		echo
 		echo -n "The following files are affected by updates, "
 		echo "but no changes have"
 		echo -n "been downloaded because the files have been "
 		echo "modified locally:"
 		cat modifiedfiles
 	fi | $PAGER
 	rm modifiedfiles
 
 	# If no files will be updated, tell the user and exit
 	if ! [ -s INDEX-PRESENT ] &&
 	    ! [ -s INDEX-NEW ]; then
 		rm INDEX-PRESENT INDEX-NEW
 		echo
 		echo -n "No updates needed to update system to "
 		echo "${RELNUM}-p${RELPATCHNUM}."
 		return
 	fi
 
 	# Divide files into (a) removed files, (b) added files, and
 	# (c) updated files.
 	cut -f 1 -d '|' < INDEX-PRESENT |
 	    sort > INDEX-PRESENT.flist
 	cut -f 1 -d '|' < INDEX-NEW |
 	    sort > INDEX-NEW.flist
 	comm -23 INDEX-PRESENT.flist INDEX-NEW.flist > files.removed
 	comm -13 INDEX-PRESENT.flist INDEX-NEW.flist > files.added
 	comm -12 INDEX-PRESENT.flist INDEX-NEW.flist > files.updated
 	rm INDEX-PRESENT.flist INDEX-NEW.flist
 
 	# Report removed files, if any
 	if [ -s files.removed ]; then
 		echo
 		echo -n "The following files will be removed "
 		echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:"
 		cat files.removed
 	fi | $PAGER
 	rm files.removed
 
 	# Report added files, if any
 	if [ -s files.added ]; then
 		echo
 		echo -n "The following files will be added "
 		echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:"
 		cat files.added
 	fi | $PAGER
 	rm files.added
 
 	# Report updated files, if any
 	if [ -s files.updated ]; then
 		echo
 		echo -n "The following files will be updated "
 		echo "as part of updating to ${RELNUM}-p${RELPATCHNUM}:"
 
 		cat files.updated
 	fi | $PAGER
 	rm files.updated
 
 	# Create a directory for the install manifest.
 	MDIR=`mktemp -d install.XXXXXX` || return 1
 
 	# Populate it
 	mv INDEX-PRESENT ${MDIR}/INDEX-OLD
 	mv INDEX-NEW ${MDIR}/INDEX-NEW
 
 	# Link it into place
 	ln -s ${MDIR} ${BDHASH}-install
 }
 
 # Warn about any upcoming EoL
 fetch_warn_eol () {
 	# What's the current time?
 	NOWTIME=`date "+%s"`
 
 	# When did we last warn about the EoL date?
 	if [ -f lasteolwarn ]; then
 		LASTWARN=`cat lasteolwarn`
 	else
 		LASTWARN=`expr ${NOWTIME} - 63072000`
 	fi
 
 	# If the EoL time is past, warn.
 	if [ ${EOLTIME} -lt ${NOWTIME} ]; then
 		echo
 		cat <<-EOF
 		WARNING: `uname -sr` HAS PASSED ITS END-OF-LIFE DATE.
 		Any security issues discovered after `date -r ${EOLTIME}`
 		will not have been corrected.
 		EOF
 		return 1
 	fi
 
 	# Figure out how long it has been since we last warned about the
 	# upcoming EoL, and how much longer we have left.
 	SINCEWARN=`expr ${NOWTIME} - ${LASTWARN}`
 	TIMELEFT=`expr ${EOLTIME} - ${NOWTIME}`
 
 	# Don't warn if the EoL is more than 3 months away
 	if [ ${TIMELEFT} -gt 7884000 ]; then
 		return 0
 	fi
 
 	# Don't warn if the time remaining is more than 3 times the time
 	# since the last warning.
 	if [ ${TIMELEFT} -gt `expr ${SINCEWARN} \* 3` ]; then
 		return 0
 	fi
 
 	# Figure out what time units to use.
 	if [ ${TIMELEFT} -lt 604800 ]; then
 		UNIT="day"
 		SIZE=86400
 	elif [ ${TIMELEFT} -lt 2678400 ]; then
 		UNIT="week"
 		SIZE=604800
 	else
 		UNIT="month"
 		SIZE=2678400
 	fi
 
 	# Compute the right number of units
 	NUM=`expr ${TIMELEFT} / ${SIZE}`
 	if [ ${NUM} != 1 ]; then
 		UNIT="${UNIT}s"
 	fi
 
 	# Print the warning
 	echo
 	cat <<-EOF
 		WARNING: `uname -sr` is approaching its End-of-Life date.
 		It is strongly recommended that you upgrade to a newer
 		release within the next ${NUM} ${UNIT}.
 	EOF
 
 	# Update the stored time of last warning
 	echo ${NOWTIME} > lasteolwarn
 }
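 
 # Illustrative sketch only, not called anywhere: the helper name
 # example_eol_units is hypothetical.  It repeats the unit selection used
 # by fetch_warn_eol for a number of seconds passed as $1, so 259200
 # (three days) prints "3 days" and 604800 (seven days) prints "1 week".
 example_eol_units () {
 	if [ "$1" -lt 604800 ]; then
 		_unit="day"
 		_size=86400
 	elif [ "$1" -lt 2678400 ]; then
 		_unit="week"
 		_size=604800
 	else
 		_unit="month"
 		_size=2678400
 	fi
 	# Integer division, then pluralize unless exactly one unit is left.
 	_num=$(($1 / ${_size}))
 	if [ "${_num}" != 1 ]; then
 		_unit="${_unit}s"
 	fi
 	echo "${_num} ${_unit}"
 }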
 
 # Do the actual work involved in "fetch" / "cron".
 fetch_run () {
 	workdir_init || return 1
 
 	# Prepare the mirror list.
 	fetch_pick_server_init && fetch_pick_server
 
 	# Try to fetch the public key until we run out of servers.
 	while ! fetch_key; do
 		fetch_pick_server || return 1
 	done
 
 	# Try to fetch the metadata index signature ("tag") until we run
 	# out of available servers; and sanity check the downloaded tag.
 	while ! fetch_tag; do
 		fetch_pick_server || return 1
 	done
 	fetch_tagsanity || return 1
 
 	# Fetch the latest INDEX-NEW and INDEX-OLD files.
 	fetch_metadata INDEX-NEW INDEX-OLD || return 1
 
 	# Generate filtered INDEX-NEW and INDEX-OLD files containing only
 	# the lines which (a) belong to components we care about, and (b)
 	# don't correspond to paths we're explicitly ignoring.
 	fetch_filter_metadata INDEX-NEW || return 1
 	fetch_filter_metadata INDEX-OLD || return 1
 
 	# Translate /boot/${KERNCONF} into ${KERNELDIR}
 	fetch_filter_kernel_names INDEX-NEW ${KERNCONF}
 	fetch_filter_kernel_names INDEX-OLD ${KERNCONF}
 
 	# For all paths appearing in INDEX-OLD or INDEX-NEW, inspect the
 	# system and generate an INDEX-PRESENT file.
 	fetch_inspect_system INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1
 
 	# Based on ${UPDATEIFUNMODIFIED}, remove lines from INDEX-* which
 	# correspond to lines in INDEX-PRESENT with hashes not appearing
 	# in INDEX-OLD or INDEX-NEW.  Also remove lines where the entry in
 	# INDEX-PRESENT has type - and there isn't a corresponding entry in
 	# INDEX-OLD with type -.
 	fetch_filter_unmodified_notpresent	\
 	    INDEX-OLD INDEX-PRESENT INDEX-NEW /dev/null
 
 	# For each entry in INDEX-PRESENT of type -, remove any corresponding
 	# entry from INDEX-NEW if ${ALLOWADD} != "yes".  Remove all entries
 	# of type - from INDEX-PRESENT.
 	fetch_filter_allowadd INDEX-PRESENT INDEX-NEW
 
 	# If ${ALLOWDELETE} != "yes", then remove any entries from
 	# INDEX-PRESENT which don't correspond to entries in INDEX-NEW.
 	fetch_filter_allowdelete INDEX-PRESENT INDEX-NEW
 
 	# If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in
 	# INDEX-PRESENT with metadata not matching any entry in INDEX-OLD,
 	# replace the corresponding line of INDEX-NEW with one having the
 	# same metadata as the entry in INDEX-PRESENT.
 	fetch_filter_modified_metadata INDEX-OLD INDEX-PRESENT INDEX-NEW
 
 	# Remove lines from INDEX-PRESENT and INDEX-NEW which are identical;
 	# no need to update a file if it isn't changing.
 	fetch_filter_uptodate INDEX-PRESENT INDEX-NEW
 
 	# Prepare to fetch files: Generate a list of the files we need,
 	# copy the unmodified files we have into /files/, and generate
 	# a list of patches to download.
 	fetch_files_prepare INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1
 
 	# Fetch files.
 	fetch_files || return 1
 
 	# Create and populate install manifest directory; and report what
 	# updates are available.
 	fetch_create_manifest || return 1
 
 	# Warn about any upcoming EoL
 	fetch_warn_eol || return 1
 }
 
 # If StrictComponents is not "yes", generate a new components list
 # with only the components which appear to be installed.
 upgrade_guess_components () {
 	if [ "${STRICTCOMPONENTS}" = "no" ]; then
 		# Generate filtered INDEX-ALL with only the components listed
 		# in COMPONENTS.
 		fetch_filter_metadata_components $1 || return 1
 
 		# Tell the user why his disk is suddenly making lots of noise
 		echo -n "Inspecting system... "
 
 		# Look at the files on disk, and assume that a component is
 		# supposed to be present if it is more than half-present.
 		cut -f 1-3 -d '|' < INDEX-ALL |
 		    tr '|' ' ' |
 		    while read C S F; do
 			if [ -e ${BASEDIR}/${F} ]; then
 				echo "+ ${C}|${S}"
 			fi
 			echo "= ${C}|${S}"
 		    done |
 		    sort |
 		    uniq -c |
 		    sed -E 's,^ +,,' > compfreq
 		grep ' = ' compfreq |
 		    cut -f 1,3 -d ' ' |
 		    sort -k 2,2 -t ' ' > compfreq.total
 		grep ' + ' compfreq |
 		    cut -f 1,3 -d ' ' |
 		    sort -k 2,2 -t ' ' > compfreq.present
 		join -t ' ' -1 2 -2 2 compfreq.present compfreq.total |
 		    while read S P T; do
 			if [ ${P} -gt `expr ${T} / 2` ]; then
 				echo ${S}
 			fi
 		    done > comp.present
 		cut -f 2 -d ' ' < compfreq.total > comp.total
 		rm INDEX-ALL compfreq compfreq.total compfreq.present
 
 		# We're done making noise.
 		echo "done."
 
 		# Sometimes the kernel isn't installed where INDEX-ALL
 		# thinks that it should be: In particular, it is often in
 		# /boot/kernel instead of /boot/GENERIC or /boot/SMP.  To
 		# deal with this, if "kernel|X" is listed in comp.total
 		# (i.e., is a component which would be upgraded if it is
 		# found to be present) we will add it to comp.present.
 		# If "kernel|<anything>" is in comp.total but "kernel|X" is
 		# not, we print a warning -- the user is running a kernel
 		# which isn't part of the release.
 		KCOMP=`echo ${KERNCONF} | tr 'A-Z' 'a-z'`
 		grep -E "^kernel\|${KCOMP}\$" comp.total >> comp.present
 
 		if grep -qE "^kernel\|" comp.total &&
 		    ! grep -qE "^kernel\|${KCOMP}\$" comp.total; then
 			cat <<-EOF
 
 WARNING: This system is running a "${KCOMP}" kernel, which is not a
 kernel configuration distributed as part of FreeBSD ${RELNUM}.
 This kernel will not be updated: you MUST update the kernel manually
 before running "$0 install".
 			EOF
 		fi
 
 		# Re-sort the list of installed components and generate
 		# the list of non-installed components.
 		sort -u < comp.present > comp.present.tmp
 		mv comp.present.tmp comp.present
 		comm -13 comp.present comp.total > comp.absent
 
 		# Ask the user to confirm that what we have is correct.  To
 		# reduce user confusion, translate "X|Y" back to "X/Y" (as
 		# subcomponents must be listed in the configuration file).
 		echo
 		echo -n "The following components of FreeBSD "
 		echo "seem to be installed:"
 		tr '|' '/' < comp.present |
 		    fmt -72
 		echo
 		echo -n "The following components of FreeBSD "
 		echo "do not seem to be installed:"
 		tr '|' '/' < comp.absent |
 		    fmt -72
 		echo
 		continuep || return 1
 		echo
 
 		# Suck the generated list of components into ${COMPONENTS}.
 		# Note that comp.present.tmp is used due to issues with
 		# pipelines and setting variables.
 		COMPONENTS=""
 		tr '|' '/' < comp.present > comp.present.tmp
 		while read C; do
 			COMPONENTS="${COMPONENTS} ${C}"
 		done < comp.present.tmp
 
 		# Delete temporary files
 		rm comp.present comp.present.tmp comp.absent comp.total
 	fi
 }
 
 # If StrictComponents is not "yes", COMPONENTS contains an entry
 # corresponding to the currently running kernel, and said kernel
 # does not exist in the new release, add "kernel/generic" to the
 # list of components.
 upgrade_guess_new_kernel () {
 	if [ "${STRICTCOMPONENTS}" = "no" ]; then
 		# Grab the unfiltered metadata file.
 		METAHASH=`look "$1|" tINDEX.present | cut -f 2 -d '|'`
 		gunzip -c < files/${METAHASH}.gz > $1.all
 
 		# If "kernel/${KCOMP}" is in ${COMPONENTS} and that component
 		# isn't in $1.all, we need to add kernel/generic.
 		for C in ${COMPONENTS}; do
 			if [ ${C} = "kernel/${KCOMP}" ] &&
 			    ! grep -qE "^kernel\|${KCOMP}\|" $1.all; then
 				COMPONENTS="${COMPONENTS} kernel/generic"
 				NKERNCONF="GENERIC"
 				cat <<-EOF
 
 WARNING: This system is running a "${KCOMP}" kernel, which is not a
 kernel configuration distributed as part of FreeBSD ${RELNUM}.
 As part of upgrading to FreeBSD ${RELNUM}, this kernel will be
 replaced with a "generic" kernel.
 				EOF
 				continuep || return 1
 			fi
 		done
 
 		# Don't need this any more...
 		rm $1.all
 	fi
 }
 
 # Convert INDEX-OLD (last release) and INDEX-ALL (new release) into
 # INDEX-OLD and INDEX-NEW files (in the sense of normal upgrades).
 upgrade_oldall_to_oldnew () {
 	# For each ${F}|... which appears in INDEX-ALL but does not appear
 	# in INDEX-OLD, add ${F}|-|||||| to INDEX-OLD.
 	cut -f 1 -d '|' < $1 |
 	    sort -u > $1.paths
 	cut -f 1 -d '|' < $2 |
 	    sort -u |
 	    comm -13 $1.paths - |
 	    lam - -s "|-||||||" |
 	    sort - $1 > $1.tmp
 	mv $1.tmp $1
 
 	# Remove lines from INDEX-OLD which also appear in INDEX-ALL
 	comm -23 $1 $2 > $1.tmp
 	mv $1.tmp $1
 
 	# Remove lines from INDEX-ALL which have a file name not appearing
 	# anywhere in INDEX-OLD (since these must be files which haven't
 	# changed -- if they were new, there would be an entry of type "-").
 	cut -f 1 -d '|' < $1 |
 	    sort -u > $1.paths
 	sort -k 1,1 -t '|' < $2 |
 	    join -t '|' - $1.paths |
 	    sort > $2.tmp
 	rm $1.paths
 	mv $2.tmp $2
 
 	# Rename INDEX-ALL to INDEX-NEW.
 	mv $2 $3
 }
 
 # Helper for upgrade_merge: Return zero (true) iff the two files differ only
 # in the contents of their RCS tags.
 samef () {
 	X=`sed -E 's/\\$FreeBSD.*\\$/\$FreeBSD\$/' < $1 | ${SHA256}`
 	Y=`sed -E 's/\\$FreeBSD.*\\$/\$FreeBSD\$/' < $2 | ${SHA256}`
 
 	if [ $X = $Y ]; then
 		return 0;
 	else
 		return 1;
 	fi
 }
 
 # From the list of "old" files in $1, merge changes in $2 with those in $3,
 # and update $3 to reflect the hashes of merged files.
 upgrade_merge () {
 	# We only need to do anything if $1 is non-empty.
 	if [ -s $1 ]; then
 		cut -f 1 -d '|' $1 |
 		    sort > $1-paths
 
 		# Create staging area for merging files
 		rm -rf merge/
 		while read F; do
 			D=`dirname ${F}`
 			mkdir -p merge/old/${D}
 			mkdir -p merge/${OLDRELNUM}/${D}
 			mkdir -p merge/${RELNUM}/${D}
 			mkdir -p merge/new/${D}
 		done < $1-paths
 
 		# Copy in files
 		while read F; do
 			# Currently installed file
 			V=`look "${F}|" $2 | cut -f 7 -d '|'`
 			gunzip < files/${V}.gz > merge/old/${F}
 
 			# Old release
 			if look "${F}|" $1 | fgrep -q "|f|"; then
 				V=`look "${F}|" $1 | cut -f 3 -d '|'`
 				gunzip < files/${V}.gz		\
 				    > merge/${OLDRELNUM}/${F}
 			fi
 
 			# New release
 			if look "${F}|" $3 | cut -f 1,2,7 -d '|' |
 			    fgrep -q "|f|"; then
 				V=`look "${F}|" $3 | cut -f 7 -d '|'`
 				gunzip < files/${V}.gz		\
 				    > merge/${RELNUM}/${F}
 			fi
 		done < $1-paths
 
 		# Attempt to automatically merge changes
 		echo -n "Attempting to automatically merge "
 		echo -n "changes in files..."
 		: > failed.merges
 		while read F; do
 			# If the file doesn't exist in the new release,
 			# the result of "merging changes" is having the file
 			# not exist.
 			if ! [ -f merge/${RELNUM}/${F} ]; then
 				continue
 			fi
 
 			# If the file didn't exist in the old release, we're
 			# going to throw away the existing file and hope that
 			# the version from the new release is what we want.
 			if ! [ -f merge/${OLDRELNUM}/${F} ]; then
 				cp merge/${RELNUM}/${F} merge/new/${F}
 				continue
 			fi
 
 			# Some files need special treatment.
 			case ${F} in
 			/etc/spwd.db | /etc/pwd.db | /etc/login.conf.db)
 				# Don't merge these -- we'll rebuild them
 				# after updates are installed.
 				cp merge/old/${F} merge/new/${F}
 				;;
 			*)
 				if ! merge -p -L "current version"	\
 				    -L "${OLDRELNUM}" -L "${RELNUM}"	\
 				    merge/old/${F}			\
 				    merge/${OLDRELNUM}/${F}		\
 				    merge/${RELNUM}/${F}		\
 				    > merge/new/${F} 2>/dev/null; then
 					echo ${F} >> failed.merges
 				fi
 				;;
 			esac
 		done < $1-paths
 		echo " done."
 
 		# Ask the user to handle any files which didn't merge.
 		while read F; do
 			# If the installed file differs from the version in
 			# the old release only due to RCS tag expansion
 			# then just use the version in the new release.
 			if samef merge/old/${F} merge/${OLDRELNUM}/${F}; then
 				cp merge/${RELNUM}/${F} merge/new/${F}
 				continue
 			fi
 
 			cat <<-EOF
 
 The following file could not be merged automatically: ${F}
 Press Enter to edit this file in ${EDITOR} and resolve the conflicts
 manually...
 			EOF
 			read dummy </dev/tty
 			${EDITOR} `pwd`/merge/new/${F} < /dev/tty
 		done < failed.merges
 		rm failed.merges
 
 		# Ask the user to confirm that they are happy with the result
 		# of merging files.
 		while read F; do
 			# Skip files which haven't changed except possibly
 			# in their RCS tags.
 			if [ -f merge/old/${F} ] && [ -f merge/new/${F} ] &&
 			    samef merge/old/${F} merge/new/${F}; then
 				continue
 			fi
 
 			# Skip files where the installed file differs from
 			# the old file only due to RCS tags.
 			if [ -f merge/old/${F} ] &&
 			    [ -f merge/${OLDRELNUM}/${F} ] &&
 			    samef merge/old/${F} merge/${OLDRELNUM}/${F}; then
 				continue
 			fi
 
 			# Warn about files which are ceasing to exist.
 			if ! [ -f merge/new/${F} ]; then
 				cat <<-EOF
 
 The following file will be removed, as it no longer exists in
 FreeBSD ${RELNUM}: ${F}
 				EOF
 				continuep < /dev/tty || return 1
 				continue
 			fi
 
 			# Print changes for the user's approval.
 			cat <<-EOF
 
 The following changes, which occurred between FreeBSD ${OLDRELNUM} and
 FreeBSD ${RELNUM} have been merged into ${F}:
 EOF
 			diff -U 5 -L "current version" -L "new version"	\
 			    merge/old/${F} merge/new/${F} || true
 			continuep < /dev/tty || return 1
 		done < $1-paths
 
 		# Store merged files.
 		while read F; do
 			if [ -f merge/new/${F} ]; then
 				V=`${SHA256} -q merge/new/${F}`
 
 				gzip -c < merge/new/${F} > files/${V}.gz
 				echo "${F}|${V}"
 			fi
 		done < $1-paths > newhashes
 
 		# Pull lines out from $3 which need to be updated to
 		# reflect merged files.
 		while read F; do
 			look "${F}|" $3
 		done < $1-paths > $3-oldlines
 
 		# Update lines to reflect merged files
 		join -t '|' -o 1.1,1.2,1.3,1.4,1.5,1.6,2.2,1.8		\
 		    $3-oldlines newhashes > $3-newlines
 
 		# Remove old lines from $3 and add new lines.
 		sort $3-oldlines |
 		    comm -13 - $3 |
 		    sort - $3-newlines > $3.tmp
 		mv $3.tmp $3
 
 		# Clean up
 		rm $1-paths newhashes $3-oldlines $3-newlines
 		rm -rf merge/
 	fi
 
 	# We're done with merging files.
 	rm $1
 }
 
 # Do the work involved in fetching upgrades to a new release
 upgrade_run () {
 	workdir_init || return 1
 
 	# Prepare the mirror list.
 	fetch_pick_server_init && fetch_pick_server
 
 	# Try to fetch the public key until we run out of servers.
 	while ! fetch_key; do
 		fetch_pick_server || return 1
 	done
  
 	# Try to fetch the metadata index signature ("tag") until we run
 	# out of available servers; and sanity check the downloaded tag.
 	while ! fetch_tag; do
 		fetch_pick_server || return 1
 	done
 	fetch_tagsanity || return 1
 
 	# Fetch the INDEX-OLD and INDEX-ALL.
 	fetch_metadata INDEX-OLD INDEX-ALL || return 1
 
 	# If StrictComponents is not "yes", generate a new components list
 	# with only the components which appear to be installed.
 	upgrade_guess_components INDEX-ALL || return 1
 
 	# Generate filtered INDEX-OLD and INDEX-ALL files containing only
 	# the components we want and without anything marked as "Ignore".
 	fetch_filter_metadata INDEX-OLD || return 1
 	fetch_filter_metadata INDEX-ALL || return 1
 
 	# Merge the INDEX-OLD and INDEX-ALL files into INDEX-OLD.
 	sort INDEX-OLD INDEX-ALL > INDEX-OLD.tmp
 	mv INDEX-OLD.tmp INDEX-OLD
 	rm INDEX-ALL
 
 	# Adjust variables for fetching files from the new release.
 	OLDRELNUM=${RELNUM}
 	RELNUM=${TARGETRELEASE}
 	OLDFETCHDIR=${FETCHDIR}
 	FETCHDIR=${RELNUM}/${ARCH}
 
 	# Try to fetch the NEW metadata index signature ("tag") until we run
 	# out of available servers; and sanity check the downloaded tag.
 	while ! fetch_tag; do
 		fetch_pick_server || return 1
 	done
 
 	# Fetch the new INDEX-ALL.
 	fetch_metadata INDEX-ALL || return 1
 
 	# If StrictComponents is not "yes", COMPONENTS contains an entry
 	# corresponding to the currently running kernel, and said kernel
 	# does not exist in the new release, add "kernel/generic" to the
 	# list of components.
 	upgrade_guess_new_kernel INDEX-ALL || return 1
 
 	# Filter INDEX-ALL to contain only the components we want and without
 	# anything marked as "Ignore".
 	fetch_filter_metadata INDEX-ALL || return 1
 
 	# Convert INDEX-OLD (last release) and INDEX-ALL (new release) into
 	# INDEX-OLD and INDEX-NEW files (in the sense of normal upgrades).
 	upgrade_oldall_to_oldnew INDEX-OLD INDEX-ALL INDEX-NEW
 
 	# Translate /boot/${KERNCONF} or /boot/${NKERNCONF} into ${KERNELDIR}
 	fetch_filter_kernel_names INDEX-NEW ${NKERNCONF}
 	fetch_filter_kernel_names INDEX-OLD ${KERNCONF}
 
 	# For all paths appearing in INDEX-OLD or INDEX-NEW, inspect the
 	# system and generate an INDEX-PRESENT file.
 	fetch_inspect_system INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1
 
 	# Based on ${MERGECHANGES}, generate a file tomerge-old with the
 	# paths and hashes of old versions of files to merge.
 	fetch_filter_mergechanges INDEX-OLD INDEX-PRESENT tomerge-old
 
 	# Based on ${UPDATEIFUNMODIFIED}, remove lines from INDEX-* which
 	# correspond to lines in INDEX-PRESENT with hashes not appearing
 	# in INDEX-OLD or INDEX-NEW.  Also remove lines where the entry in
 	# INDEX-PRESENT has type - and there isn't a corresponding entry in
 	# INDEX-OLD with type -.
 	fetch_filter_unmodified_notpresent	\
 	    INDEX-OLD INDEX-PRESENT INDEX-NEW tomerge-old
 
 	# For each entry in INDEX-PRESENT of type -, remove any corresponding
 	# entry from INDEX-NEW if ${ALLOWADD} != "yes".  Remove all entries
 	# of type - from INDEX-PRESENT.
 	fetch_filter_allowadd INDEX-PRESENT INDEX-NEW
 
 	# If ${ALLOWDELETE} != "yes", then remove any entries from
 	# INDEX-PRESENT which don't correspond to entries in INDEX-NEW.
 	fetch_filter_allowdelete INDEX-PRESENT INDEX-NEW
 
 	# If ${KEEPMODIFIEDMETADATA} == "yes", then for each entry in
 	# INDEX-PRESENT with metadata not matching any entry in INDEX-OLD,
 	# replace the corresponding line of INDEX-NEW with one having the
 	# same metadata as the entry in INDEX-PRESENT.
 	fetch_filter_modified_metadata INDEX-OLD INDEX-PRESENT INDEX-NEW
 
 	# Remove lines from INDEX-PRESENT and INDEX-NEW which are identical;
 	# no need to update a file if it isn't changing.
 	fetch_filter_uptodate INDEX-PRESENT INDEX-NEW
 
 	# Fetch "clean" files from the old release for merging changes.
 	fetch_files_premerge tomerge-old
 
 	# Prepare to fetch files: Generate a list of the files we need,
 	# copy the unmodified files we have into /files/, and generate
 	# a list of patches to download.
 	fetch_files_prepare INDEX-OLD INDEX-PRESENT INDEX-NEW || return 1
 
 	# Fetch patches from to-${RELNUM}/${ARCH}/bp/
 	PATCHDIR=to-${RELNUM}/${ARCH}/bp
 	fetch_files || return 1
 
 	# Merge configuration file changes.
 	upgrade_merge tomerge-old INDEX-PRESENT INDEX-NEW || return 1
 
 	# Create and populate install manifest directory; and report what
 	# updates are available.
 	fetch_create_manifest || return 1
 
 	# Leave a note behind to tell the "install" command that the kernel
 	# needs to be installed before the world.
 	touch ${BDHASH}-install/kernelfirst
 
 	# Remind the user that they need to run "freebsd-update install"
 	# to install the downloaded bits, in case they didn't RTFM.
 	echo "To install the downloaded upgrades, run \"$0 install\"."
 }
 
 # Make sure that all the file hashes mentioned in $@ have corresponding
 # gzipped files stored in /files/.
 install_verify () {
 	# Generate a list of hashes
 	cat $@ |
 	    cut -f 2,7 -d '|' |
 	    grep -E '^f' |
 	    cut -f 2 -d '|' |
 	    sort -u > filelist
 
 	# Make sure all the hashes exist
 	while read HASH; do
 		if ! [ -f files/${HASH}.gz ]; then
 			echo -n "Update files missing -- "
 			echo "this should never happen."
 			echo "Re-run '$0 fetch'."
 			return 1
 		fi
 	done < filelist
 
 	# Clean up
 	rm filelist
 }
 
 # Remove the system immutable flag from files
 install_unschg () {
 	# Generate file list
 	cat $@ |
 	    cut -f 1 -d '|' > filelist
 
 	# Remove flags
 	while read F; do
 		if ! [ -e ${BASEDIR}/${F} ]; then
 			continue
 		fi
 
 		chflags noschg ${BASEDIR}/${F} || return 1
 	done < filelist
 
 	# Clean up
 	rm filelist
 }
 
 # Decide which directory name to use for kernel backups.
 backup_kernel_finddir () {
 	CNT=0
 	while true ; do
 		# Pathname does not exist, so it is OK to use that name
 		# for backup directory.
 		if [ ! -e $BASEDIR/$BACKUPKERNELDIR ]; then
 			return 0
 		fi
 
 		# If the directory does exist, we only use it if it has our
 		# marker file.
 		if [ -d $BASEDIR/$BACKUPKERNELDIR -a \
 			-e $BASEDIR/$BACKUPKERNELDIR/.freebsd-update ]; then
 			return 0
 		fi
 
 		# We could not use current directory name, so add counter to
 		# the end and try again.
 		CNT=$((CNT + 1))
 		if [ $CNT -gt 9 ]; then
 			echo "Could not find valid backup dir ($BASEDIR/$BACKUPKERNELDIR)"
 			exit 1
 		fi
 		BACKUPKERNELDIR="`echo $BACKUPKERNELDIR | sed -Ee 's/[0-9]\$//'`"
 		BACKUPKERNELDIR="${BACKUPKERNELDIR}${CNT}"
 	done
 }
 
 # Backup the current kernel using hardlinks, if not disabled by user.
 # Since we delete all files in the directory used for previous backups
 # we create a marker file called ".freebsd-update" in the directory so
 # we can determine on the next run that the directory was created by
 # freebsd-update and we then do not accidentally remove user files in
 # the unlikely case that the user has created a directory with a
 # conflicting name.
 backup_kernel () {
 	# Only make a kernel backup if so configured.
 	if [ $BACKUPKERNEL != yes ]; then
 		return 0
 	fi
 
 	# Decide which directory name to use for kernel backups.
 	backup_kernel_finddir
 
 	# Remove old kernel backup files.  If $BACKUPKERNELDIR was
 	# "not ours", backup_kernel_finddir would have exited, so
 	# deleting the directory content is as safe as we can make it.
 	if [ -d $BASEDIR/$BACKUPKERNELDIR ]; then
 		rm -fr $BASEDIR/$BACKUPKERNELDIR
 	fi
 
 	# Create directories for backup.
 	mkdir -p $BASEDIR/$BACKUPKERNELDIR
 	mtree -cdn -p "${BASEDIR}/${KERNELDIR}" | \
 	    mtree -Ue -p "${BASEDIR}/${BACKUPKERNELDIR}" > /dev/null
 
 	# Mark the directory as having been created by freebsd-update.
 	touch $BASEDIR/$BACKUPKERNELDIR/.freebsd-update
 	if [ $? -ne 0 ]; then
 		echo "Could not create kernel backup directory"
 		exit 1
 	fi
 
 	# Disable pathname expansion to be sure *.symbols is not
 	# expanded.
 	set -f
 
 	# Use find to ignore symbol files, unless disabled by user.
 	if [ $BACKUPKERNELSYMBOLFILES = yes ]; then
 		FINDFILTER=""
 	else
 		FINDFILTER=-"a ! -name *.symbols"
 	fi
 
 	# Backup all the kernel files using hardlinks.
 	(cd ${BASEDIR}/${KERNELDIR} && find . -type f $FINDFILTER -exec \
 	    cp -pl '{}' ${BASEDIR}/${BACKUPKERNELDIR}/'{}' \;)
 
 	# Re-enable pathname expansion.
 	set +f
 }
 
 # Install new files
 install_from_index () {
 	# First pass: Do everything apart from setting file flags.  We
 	# can't set flags yet, because schg inhibits hard linking.
 	sort -k 1,1 -t '|' $1 |
 	    tr '|' ' ' |
 	    while read FPATH TYPE OWNER GROUP PERM FLAGS HASH LINK; do
 		case ${TYPE} in
 		d)
 			# Create a directory
 			install -d -o ${OWNER} -g ${GROUP}		\
 			    -m ${PERM} ${BASEDIR}/${FPATH}
 			;;
 		f)
 			if [ -z "${LINK}" ]; then
 				# Create a file, without setting flags.
 				gunzip < files/${HASH}.gz > ${HASH}
 				install -S -o ${OWNER} -g ${GROUP}	\
 				    -m ${PERM} ${HASH} ${BASEDIR}/${FPATH}
 				rm ${HASH}
 			else
 				# Create a hard link.
 				ln -f ${BASEDIR}/${LINK} ${BASEDIR}/${FPATH}
 			fi
 			;;
 		L)
 			# Create a symlink
 			ln -sfh ${HASH} ${BASEDIR}/${FPATH}
 			;;
 		esac
 	    done
 
 	# Perform a second pass, adding file flags.
 	tr '|' ' ' < $1 |
 	    while read FPATH TYPE OWNER GROUP PERM FLAGS HASH LINK; do
 		if [ ${TYPE} = "f" ] &&
 		    ! [ ${FLAGS} = "0" ]; then
 			chflags ${FLAGS} ${BASEDIR}/${FPATH}
 		fi
 	    done
 }
 
 # Remove files which we want to delete
 install_delete () {
 	# Generate list of new files
 	cut -f 1 -d '|' < $2 |
 	    sort > newfiles
 
 	# Generate subindex of old files we want to nuke
 	sort -k 1,1 -t '|' $1 |
 	    join -t '|' -v 1 - newfiles |
 	    sort -r -k 1,1 -t '|' |
 	    cut -f 1,2 -d '|' |
 	    tr '|' ' ' > killfiles
 
 	# Remove the offending bits
 	while read FPATH TYPE; do
 		case ${TYPE} in
 		d)
 			rmdir ${BASEDIR}/${FPATH}
 			;;
 		f)
 			rm ${BASEDIR}/${FPATH}
 			;;
 		L)
 			rm ${BASEDIR}/${FPATH}
 			;;
 		esac
 	done < killfiles
 
 	# Clean up
 	rm newfiles killfiles
 }
 
 # Install new files, delete old files, and update linker.hints
 install_files () {
 	# If we haven't already dealt with the kernel, deal with it.
 	if ! [ -f $1/kerneldone ]; then
 		grep -E '^/boot/' $1/INDEX-OLD > INDEX-OLD
 		grep -E '^/boot/' $1/INDEX-NEW > INDEX-NEW
 
 		# Backup current kernel before installing a new one
 		backup_kernel || return 1
 
 		# Install new files
 		install_from_index INDEX-NEW || return 1
 
 		# Remove files which need to be deleted
 		install_delete INDEX-OLD INDEX-NEW || return 1
 
 		# Update linker.hints if necessary
 		if [ -s INDEX-OLD -o -s INDEX-NEW ]; then
 			kldxref -R ${BASEDIR}/boot/ 2>/dev/null
 		fi
 
 		# We've finished updating the kernel.
 		touch $1/kerneldone
 
 		# Do we need to ask for a reboot now?
 		if [ -f $1/kernelfirst ] &&
 		    [ -s INDEX-OLD -o -s INDEX-NEW ]; then
 			cat <<-EOF
 
 Kernel updates have been installed.  Please reboot and run
 "$0 install" again to finish installing updates.
 			EOF
 			exit 0
 		fi
 	fi
 
 	# If we haven't already dealt with the world, deal with it.
 	if ! [ -f $1/worlddone ]; then
 		# Create any necessary directories first
 		grep -vE '^/boot/' $1/INDEX-NEW |
 		    grep -E '^[^|]+\|d\|' > INDEX-NEW
 		install_from_index INDEX-NEW || return 1
 
 		# Install new runtime linker
 		grep -vE '^/boot/' $1/INDEX-NEW |
 		    grep -vE '^[^|]+\|d\|' |
 		    grep -E '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' > INDEX-NEW
 		install_from_index INDEX-NEW || return 1
 
 		# Install new shared libraries next
 		grep -vE '^/boot/' $1/INDEX-NEW |
 		    grep -vE '^[^|]+\|d\|' |
 		    grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' |
 		    grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW
 		install_from_index INDEX-NEW || return 1
 
 		# Deal with everything else
 		grep -vE '^/boot/' $1/INDEX-OLD |
 		    grep -vE '^[^|]+\|d\|' |
 		    grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' |
 		    grep -vE '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-OLD
 		grep -vE '^/boot/' $1/INDEX-NEW |
 		    grep -vE '^[^|]+\|d\|' |
 		    grep -vE '^/libexec/ld-elf[^|]*\.so\.[0-9]+\|' |
 		    grep -vE '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW
 		install_from_index INDEX-NEW || return 1
 		install_delete INDEX-OLD INDEX-NEW || return 1
 
 		# Rebuild /etc/spwd.db and /etc/pwd.db if necessary.
 		if [ ${BASEDIR}/etc/master.passwd -nt ${BASEDIR}/etc/spwd.db ] ||
 		    [ ${BASEDIR}/etc/master.passwd -nt ${BASEDIR}/etc/pwd.db ]; then
 			pwd_mkdb -d ${BASEDIR}/etc ${BASEDIR}/etc/master.passwd
 		fi
 
 		# Rebuild /etc/login.conf.db if necessary.
 		if [ ${BASEDIR}/etc/login.conf -nt ${BASEDIR}/etc/login.conf.db ]; then
 			cap_mkdb ${BASEDIR}/etc/login.conf
 		fi
 
 		# We've finished installing the world and deleting old files
 		# which are not shared libraries.
 		touch $1/worlddone
 
 		# Do we need to ask the user to portupgrade now?
 		grep -vE '^/boot/' $1/INDEX-NEW |
 		    grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' |
 		    cut -f 1 -d '|' |
 		    sort > newfiles
 		if grep -vE '^/boot/' $1/INDEX-OLD |
 		    grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' |
 		    cut -f 1 -d '|' |
 		    sort |
 		    join -v 1 - newfiles |
 		    grep -q .; then
 			cat <<-EOF
 
 Completing this upgrade requires removing old shared object files.
 Please rebuild all installed 3rd party software (e.g., programs
 installed from the ports tree) and then run "$0 install"
 again to finish installing updates.
 			EOF
 			rm newfiles
 			exit 0
 		fi
 		rm newfiles
 	fi
 
 	# Remove old shared libraries
 	grep -vE '^/boot/' $1/INDEX-NEW |
 	    grep -vE '^[^|]+\|d\|' |
 	    grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-NEW
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -vE '^[^|]+\|d\|' |
 	    grep -E '^[^|]*/lib/[^|]*\.so\.[0-9]+\|' > INDEX-OLD
 	install_delete INDEX-OLD INDEX-NEW || return 1
 
 	# Remove old directories
 	grep -vE '^/boot/' $1/INDEX-NEW |
 	    grep -E '^[^|]+\|d\|' > INDEX-NEW
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -E '^[^|]+\|d\|' > INDEX-OLD
 	install_delete INDEX-OLD INDEX-NEW || return 1
 
 	# Remove temporary files
 	rm INDEX-OLD INDEX-NEW
 }
 
 # Rearrange bits to allow the installed updates to be rolled back
 install_setup_rollback () {
 	# Remove the "reboot after installing kernel", "kernel updated", and
 	# "finished installing the world" flags if present -- they are
 	# irrelevant when rolling back updates.
 	if [ -f ${BDHASH}-install/kernelfirst ]; then
 		rm ${BDHASH}-install/kernelfirst
 		rm ${BDHASH}-install/kerneldone
 	fi
 	if [ -f ${BDHASH}-install/worlddone ]; then
 		rm ${BDHASH}-install/worlddone
 	fi
 
 	if [ -L ${BDHASH}-rollback ]; then
 		mv ${BDHASH}-rollback ${BDHASH}-install/rollback
 	fi
 
 	mv ${BDHASH}-install ${BDHASH}-rollback
 }
 
 # Actually install updates
 install_run () {
 	echo -n "Installing updates..."
 
 	# Make sure we have all the files we should have
 	install_verify ${BDHASH}-install/INDEX-OLD	\
 	    ${BDHASH}-install/INDEX-NEW || return 1
 
 	# Remove system immutable flag from files
 	install_unschg ${BDHASH}-install/INDEX-OLD	\
 	    ${BDHASH}-install/INDEX-NEW || return 1
 
 	# Install new files, delete old files, and update linker.hints
 	install_files ${BDHASH}-install || return 1
 
 	# Rearrange bits to allow the installed updates to be rolled back
 	install_setup_rollback
 
 	echo " done."
 }
 
 # Rearrange bits to allow the previous set of updates to be rolled back next.
 rollback_setup_rollback () {
 	if [ -L ${BDHASH}-rollback/rollback ]; then
 		mv ${BDHASH}-rollback/rollback rollback-tmp
 		rm -r ${BDHASH}-rollback/
 		rm ${BDHASH}-rollback
 		mv rollback-tmp ${BDHASH}-rollback
 	else
 		rm -r ${BDHASH}-rollback/
 		rm ${BDHASH}-rollback
 	fi
 }
 
 # Install old files, delete new files, and update linker.hints
 rollback_files () {
 	# Install old shared library files which don't have the same path as
 	# a new shared library file.
 	grep -vE '^/boot/' $1/INDEX-NEW |
 	    grep -E '/lib/.*\.so\.[0-9]+\|' |
 	    cut -f 1 -d '|' |
 	    sort > INDEX-NEW.libs.flist
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -E '/lib/.*\.so\.[0-9]+\|' |
 	    sort -k 1,1 -t '|' - |
 	    join -t '|' -v 1 - INDEX-NEW.libs.flist > INDEX-OLD
 	install_from_index INDEX-OLD || return 1
 
 	# Deal with files which are neither kernel nor shared library
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -vE '/lib/.*\.so\.[0-9]+\|' > INDEX-OLD
 	grep -vE '^/boot/' $1/INDEX-NEW |
 	    grep -vE '/lib/.*\.so\.[0-9]+\|' > INDEX-NEW
 	install_from_index INDEX-OLD || return 1
 	install_delete INDEX-NEW INDEX-OLD || return 1
 
 	# Install any old shared library files which we didn't install above.
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -E '/lib/.*\.so\.[0-9]+\|' |
 	    sort -k 1,1 -t '|' - |
 	    join -t '|' - INDEX-NEW.libs.flist > INDEX-OLD
 	install_from_index INDEX-OLD || return 1
 
 	# Delete unneeded shared library files
 	grep -vE '^/boot/' $1/INDEX-OLD |
 	    grep -E '/lib/.*\.so\.[0-9]+\|' > INDEX-OLD
 	grep -vE '^/boot/' $1/INDEX-NEW |
 	    grep -E '/lib/.*\.so\.[0-9]+\|' > INDEX-NEW
 	install_delete INDEX-NEW INDEX-OLD || return 1
 
 	# Deal with kernel files
 	grep -E '^/boot/' $1/INDEX-OLD > INDEX-OLD
 	grep -E '^/boot/' $1/INDEX-NEW > INDEX-NEW
 	install_from_index INDEX-OLD || return 1
 	install_delete INDEX-NEW INDEX-OLD || return 1
 	if [ -s INDEX-OLD -o -s INDEX-NEW ]; then
 		kldxref -R /boot/ 2>/dev/null
 	fi
 
 	# Remove temporary files
 	rm INDEX-OLD INDEX-NEW INDEX-NEW.libs.flist
 }
 
 # Actually rollback updates
 rollback_run () {
 	echo -n "Uninstalling updates..."
 
 	# If there are updates waiting to be installed, remove them; we
 	# want the user to re-run 'fetch' after rolling back updates.
 	if [ -L ${BDHASH}-install ]; then
 		rm -r ${BDHASH}-install/
 		rm ${BDHASH}-install
 	fi
 
 	# Make sure we have all the files we should have
 	install_verify ${BDHASH}-rollback/INDEX-NEW	\
 	    ${BDHASH}-rollback/INDEX-OLD || return 1
 
 	# Remove system immutable flag from files
 	install_unschg ${BDHASH}-rollback/INDEX-NEW	\
 	    ${BDHASH}-rollback/INDEX-OLD || return 1
 
 	# Install old files, delete new files, and update linker.hints
 	rollback_files ${BDHASH}-rollback || return 1
 
 	# Remove the rollback directory and the symlink pointing to it; and
 	# rearrange bits to allow the previous set of updates to be rolled
 	# back next.
 	rollback_setup_rollback
 
 	echo " done."
 }
 
 # Compare INDEX-ALL and INDEX-PRESENT and print warnings about differences.
 IDS_compare () {
 	# Get all the lines which mismatch in something other than file
 	# flags.  We ignore file flags because sysinstall doesn't seem to
 	# set them when it installs FreeBSD; warning about these adds a
 	# very large amount of noise.
 	cut -f 1-5,7-8 -d '|' $1 > $1.noflags
 	sort -k 1,1 -t '|' $1.noflags > $1.sorted
 	cut -f 1-5,7-8 -d '|' $2 |
 	    comm -13 $1.noflags - |
 	    fgrep -v '|-|||||' |
 	    sort -k 1,1 -t '|' |
 	    join -t '|' $1.sorted - > INDEX-NOTMATCHING
 
 	# Ignore files which match IDSIGNOREPATHS.
 	for X in ${IDSIGNOREPATHS}; do
 		grep -E "^${X}" INDEX-NOTMATCHING
 	done |
 	    sort -u |
 	    comm -13 - INDEX-NOTMATCHING > INDEX-NOTMATCHING.tmp
 	mv INDEX-NOTMATCHING.tmp INDEX-NOTMATCHING
 
 	# Go through the lines and print warnings.
 	local IFS='|'
 	while read FPATH TYPE OWNER GROUP PERM HASH LINK P_TYPE P_OWNER P_GROUP P_PERM P_HASH P_LINK; do
 		# Warn about different object types.
 		if ! [ "${TYPE}" = "${P_TYPE}" ]; then
 			echo -n "${FPATH} is a "
 			case "${P_TYPE}" in
 			f)	echo -n "regular file, "
 				;;
 			d)	echo -n "directory, "
 				;;
 			L)	echo -n "symlink, "
 				;;
 			esac
 			echo -n "but should be a "
 			case "${TYPE}" in
 			f)	echo -n "regular file."
 				;;
 			d)	echo -n "directory."
 				;;
 			L)	echo -n "symlink."
 				;;
 			esac
 			echo
 
 			# Skip other tests, since they don't make sense if
 			# we're comparing different object types.
 			continue
 		fi
 
 		# Warn about different owners.
 		if ! [ "${OWNER}" = "${P_OWNER}" ]; then
 			echo -n "${FPATH} is owned by user id ${P_OWNER}, "
 			echo "but should be owned by user id ${OWNER}."
 		fi
 
 		# Warn about different groups.
 		if ! [ "${GROUP}" = "${P_GROUP}" ]; then
 			echo -n "${FPATH} is owned by group id ${P_GROUP}, "
 			echo "but should be owned by group id ${GROUP}."
 		fi
 
 		# Warn about different permissions.  We do not warn about
 		# different permissions on symlinks, since some archivers
 		# don't extract symlink permissions correctly and they are
 		# ignored anyway.
 		if ! [ "${PERM}" = "${P_PERM}" ] &&
 		    ! [ "${TYPE}" = "L" ]; then
 			echo -n "${FPATH} has ${P_PERM} permissions, "
 			echo "but should have ${PERM} permissions."
 		fi
 
 		# Warn about different file hashes / symlink destinations.
 		if ! [ "${HASH}" = "${P_HASH}" ]; then
 			if [ "${TYPE}" = "L" ]; then
 				echo -n "${FPATH} is a symlink to ${P_HASH}, "
 				echo "but should be a symlink to ${HASH}."
 			fi
 			if [ "${TYPE}" = "f" ]; then
 				echo -n "${FPATH} has SHA256 hash ${P_HASH}, "
 				echo "but should have SHA256 hash ${HASH}."
 			fi
 		fi
 
 		# We don't warn about different hard links, since some
 		# archivers break hard links, and as long as the underlying
 		# data is correct they really don't matter.
 	done < INDEX-NOTMATCHING
 
 	# Clean up
 	rm $1 $1.noflags $1.sorted $2 INDEX-NOTMATCHING
 }
 
 # Do the work involved in comparing the system to a "known good" index
 IDS_run () {
 	workdir_init || return 1
 
 	# Prepare the mirror list.
 	fetch_pick_server_init && fetch_pick_server
 
 	# Try to fetch the public key until we run out of servers.
 	while ! fetch_key; do
 		fetch_pick_server || return 1
 	done
  
 	# Try to fetch the metadata index signature ("tag") until we run
 	# out of available servers; and sanity check the downloaded tag.
 	while ! fetch_tag; do
 		fetch_pick_server || return 1
 	done
 	fetch_tagsanity || return 1
 
 	# Fetch INDEX-OLD and INDEX-ALL.
 	fetch_metadata INDEX-OLD INDEX-ALL || return 1
 
 	# Generate filtered INDEX-OLD and INDEX-ALL files containing only
 	# the components we want and without anything marked as "Ignore".
 	fetch_filter_metadata INDEX-OLD || return 1
 	fetch_filter_metadata INDEX-ALL || return 1
 
 	# Merge the INDEX-OLD and INDEX-ALL files into INDEX-ALL.
 	sort INDEX-OLD INDEX-ALL > INDEX-ALL.tmp
 	mv INDEX-ALL.tmp INDEX-ALL
 	rm INDEX-OLD
 
 	# Translate /boot/${KERNCONF} to ${KERNELDIR}
 	fetch_filter_kernel_names INDEX-ALL ${KERNCONF}
 
 	# Inspect the system and generate an INDEX-PRESENT file.
 	fetch_inspect_system INDEX-ALL INDEX-PRESENT /dev/null || return 1
 
 	# Compare INDEX-ALL and INDEX-PRESENT and print warnings about any
 	# differences.
 	IDS_compare INDEX-ALL INDEX-PRESENT
 }
 
 #### Main functions -- call parameter-handling and core functions
 
 # Using the command line, configuration file, and defaults,
 # set all the parameters which are needed later.
 get_params () {
 	init_params
 	parse_cmdline $@
 	parse_conffile
 	default_params
 }
 
 # Fetch command.  Make sure that we're being called
 # interactively, then run fetch_check_params and fetch_run
 cmd_fetch () {
-	if [ ! -t 0 && $NOTTYOK -eq 0 ]; then
+	if [ ! -t 0 -a $NOTTYOK -eq 0 ]; then
 		echo -n "`basename $0` fetch should not "
 		echo "be run non-interactively."
 		echo "Run `basename $0` cron instead."
 		exit 1
 	fi
 	fetch_check_params
 	fetch_run || exit 1
 }
 
 # Cron command.  Make sure the parameters are sensible; wait
 # rand(3600) seconds; then fetch updates.  While fetching updates,
 # send output to a temporary file; only print that file if the
 # fetching failed.
 cmd_cron () {
 	fetch_check_params
 	sleep `jot -r 1 0 3600`
 
 	TMPFILE=`mktemp /tmp/freebsd-update.XXXXXX` || exit 1
 	if ! fetch_run >> ${TMPFILE} ||
 	    ! grep -q "No updates needed" ${TMPFILE} ||
 	    [ ${VERBOSELEVEL} = "debug" ]; then
 		mail -s "`hostname` security updates" ${MAILTO} < ${TMPFILE}
 	fi
 
 	rm ${TMPFILE}
 }
 
 # Fetch files for upgrading to a new release.
 cmd_upgrade () {
 	upgrade_check_params
 	upgrade_run || exit 1
 }
 
 # Install downloaded updates.
 cmd_install () {
 	install_check_params
 	install_run || exit 1
 }
 
 # Rollback most recently installed updates.
 cmd_rollback () {
 	rollback_check_params
 	rollback_run || exit 1
 }
 
 # Compare system against a "known good" index.
 cmd_IDS () {
 	IDS_check_params
 	IDS_run || exit 1
 }
 
 #### Entry point
 
 # Make sure we find utilities from the base system
 export PATH=/sbin:/bin:/usr/sbin:/usr/bin:${PATH}
 
 # Set a pager if the user doesn't
 if [ -z "$PAGER" ]; then
 	PAGER=/usr/bin/more
 fi
 
 # Set LC_ALL in order to avoid problems with character ranges like [A-Z].
 export LC_ALL=C
 
 get_params $@
 for COMMAND in ${COMMANDS}; do
 	cmd_${COMMAND}
 done
Index: user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c
===================================================================
--- user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c	(revision 281584)
+++ user/ngie/more-tests/usr.sbin/vidcontrol/vidcontrol.c	(revision 281585)
@@ -1,1480 +1,1480 @@
 /*-
  * Copyright (c) 1994-1996 Søren Schmidt
  * All rights reserved.
  *
  * Portions of this software are based in part on the work of
  * Sascha Wildner <saw@online.de> contributed to The DragonFly Project
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer,
  *    in this position and unchanged.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. The name of the author may not be used to endorse or promote products
  *    derived from this software without specific prior written permission
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * $DragonFly: src/usr.sbin/vidcontrol/vidcontrol.c,v 1.10 2005/03/02 06:08:29 joerg Exp $
  */
 
 #ifndef lint
 static const char rcsid[] =
   "$FreeBSD$";
 #endif /* not lint */
 
 #include <ctype.h>
 #include <err.h>
 #include <limits.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
 #include <sys/fbio.h>
 #include <sys/consio.h>
 #include <sys/endian.h>
 #include <sys/errno.h>
 #include <sys/param.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <sys/sysctl.h>
 #include "path.h"
 #include "decode.h"
 
 
 #define	DATASIZE(x)	((x).w * (x).h * 256 / 8)
 
 /* Screen dump modes */
 #define DUMP_FMT_RAW	1
 #define DUMP_FMT_TXT	2
 /* Screen dump options */
 #define DUMP_FBF	0
 #define DUMP_ALL	1
 /* Screen dump file format revision */
 #define DUMP_FMT_REV	1
 
 static const char *legal_colors[16] = {
 	"black", "blue", "green", "cyan",
 	"red", "magenta", "brown", "white",
 	"grey", "lightblue", "lightgreen", "lightcyan",
 	"lightred", "lightmagenta", "yellow", "lightwhite"
 };
 
 static struct {
 	int			active_vty;
 	vid_info_t		console_info;
 	unsigned char		screen_map[256];
 	int			video_mode_number;
 	struct video_info	video_mode_info;
 } cur_info;
 
 struct vt4font_header {
 	uint8_t		magic[8];
 	uint8_t		width;
 	uint8_t		height;
 	uint16_t	pad;
 	uint32_t	glyph_count;
 	uint32_t	map_count[4];
 } __packed;
 
 static int	hex = 0;
 static int	vesa_cols;
 static int	vesa_rows;
 static int	font_height;
 static int	colors_changed;
 static int	video_mode_changed;
 static int	normal_fore_color, normal_back_color;
 static int	revers_fore_color, revers_back_color;
 static int	vt4_mode = 0;
 static struct	vid_info info;
 static struct	video_info new_mode_info;
 
 
 /*
  * Initialize revert data.
  *
  * NOTE: the following parameters are not yet saved/restored:
  *
  *   screen saver timeout
  *   cursor type
  *   mouse character and mouse show/hide state
  *   vty switching on/off state
  *   history buffer size
  *   history contents
  *   font maps
  */
 
 static void
 init(void)
 {
 	if (ioctl(0, VT_GETACTIVE, &cur_info.active_vty) == -1)
 		errc(1, errno, "getting active vty");
 
 	cur_info.console_info.size = sizeof(cur_info.console_info);
 
 	if (ioctl(0, CONS_GETINFO, &cur_info.console_info) == -1)
 		errc(1, errno, "getting console information");
 
 	/* vt(4) use unicode, so no screen mapping required. */
 	if (vt4_mode == 0 &&
 	    ioctl(0, GIO_SCRNMAP, &cur_info.screen_map) == -1)
 		errc(1, errno, "getting screen map");
 
 	if (ioctl(0, CONS_GET, &cur_info.video_mode_number) == -1)
 		errc(1, errno, "getting video mode number");
 
 	cur_info.video_mode_info.vi_mode = cur_info.video_mode_number;
 
 	if (ioctl(0, CONS_MODEINFO, &cur_info.video_mode_info) == -1)
 		errc(1, errno, "getting video mode parameters");
 
 	normal_fore_color = cur_info.console_info.mv_norm.fore;
 	normal_back_color = cur_info.console_info.mv_norm.back;
 	revers_fore_color = cur_info.console_info.mv_rev.fore;
 	revers_back_color = cur_info.console_info.mv_rev.back;
 }
 
 
 /*
  * If something goes wrong along the way we call revert() to go back to the
  * console state we came from (which is assumed to be working).
  *
  * NOTE: please also read the comments of init().
  */
 
 static void
 revert(void)
 {
 	int size[3];
 
 	ioctl(0, VT_ACTIVATE, cur_info.active_vty);
 
 	fprintf(stderr, "\033[=%dA", cur_info.console_info.mv_ovscan);
 	fprintf(stderr, "\033[=%dF", cur_info.console_info.mv_norm.fore);
 	fprintf(stderr, "\033[=%dG", cur_info.console_info.mv_norm.back);
 	fprintf(stderr, "\033[=%dH", cur_info.console_info.mv_rev.fore);
 	fprintf(stderr, "\033[=%dI", cur_info.console_info.mv_rev.back);
 
 	if (vt4_mode == 0)
 		ioctl(0, PIO_SCRNMAP, &cur_info.screen_map);
 
 	if (cur_info.video_mode_number >= M_VESA_BASE)
 		ioctl(0, _IO('V', cur_info.video_mode_number - M_VESA_BASE),
 		      NULL);
 	else
 		ioctl(0, _IO('S', cur_info.video_mode_number), NULL);
 
 	if (cur_info.video_mode_info.vi_flags & V_INFO_GRAPHICS) {
 		size[0] = cur_info.video_mode_info.vi_width / 8;
 		size[1] = cur_info.video_mode_info.vi_height /
 			  cur_info.console_info.font_size;
 		size[2] = cur_info.console_info.font_size;
 
 		ioctl(0, KDRASTER, size);
 	}
 }
 
 
 /*
  * Print a short usage string describing all options, then exit.
  */
 
 static void
 usage(void)
 {
 	if (vt4_mode)
 		fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n",
 "usage: vidcontrol [-CHPpx] [-b color] [-c appearance] [-f [[size] file]]",
 "                  [-g geometry] [-h size] [-i adapter | mode]",
 "                  [-M char] [-m on | off] [-r foreground background]",
 "                  [-S on | off] [-s number] [-T xterm | cons25] [-t N | off]",
 "                  [mode] [foreground [background]] [show]");
 	else
 		fprintf(stderr, "%s\n%s\n%s\n%s\n%s\n",
 "usage: vidcontrol [-CdHLPpx] [-b color] [-c appearance] [-f [size] file]",
 "                  [-g geometry] [-h size] [-i adapter | mode] [-l screen_map]",
 "                  [-M char] [-m on | off] [-r foreground background]",
 "                  [-S on | off] [-s number] [-T xterm | cons25] [-t N | off]",
 "                  [mode] [foreground [background]] [show]");
 	exit(1);
 }
 
 /* Detect presence of vt(4). */
 static int
 is_vt4(void)
 {
 	char vty_name[4] = "";
 	size_t len = sizeof(vty_name);
 
 	if (sysctlbyname("kern.vty", vty_name, &len, NULL, 0) != 0)
 		return (0);
 	return (strcmp(vty_name, "vt") == 0);
 }
 
 /*
  * Retrieve the next argument from the command line (for options that require
  * more than one argument).
  */
 
 static char *
 nextarg(int ac, char **av, int *indp, int oc, int strict)
 {
 	if (*indp < ac)
 		return(av[(*indp)++]);
 
 	if (strict != 0) {
 		revert();
 		errx(1, "option requires two arguments -- %c", oc);
 	}
 
 	return(NULL);
 }
 
 
 /*
  * Guess which file to open. Try to open each combination of a specified set
  * of file name components.
  */
 
 static FILE *
 openguess(const char *a[], const char *b[], const char *c[], const char *d[], char **name)
 {
 	FILE *f;
 	int i, j, k, l;
 
 	for (i = 0; a[i] != NULL; i++) {
 		for (j = 0; b[j] != NULL; j++) {
 			for (k = 0; c[k] != NULL; k++) {
 				for (l = 0; d[l] != NULL; l++) {
 					asprintf(name, "%s%s%s%s",
 						 a[i], b[j], c[k], d[l]);
 
 					f = fopen(*name, "r");
 
 					if (f != NULL)
 						return (f);
 
 					free(*name);
 				}
 			}
 		}
 	}
 	return (NULL);
 }
 
 
 /*
  * Load a screenmap from a file and set it.
  */
 
 static void
 load_scrnmap(const char *filename)
 {
 	FILE *fd;
 	int size;
 	char *name;
 	scrmap_t scrnmap;
 	const char *a[] = {"", SCRNMAP_PATH, NULL};
 	const char *b[] = {filename, NULL};
 	const char *c[] = {"", ".scm", NULL};
 	const char *d[] = {"", NULL};
 
 	fd = openguess(a, b, c, d, &name);
 
 	if (fd == NULL) {
 		revert();
 		errx(1, "screenmap file not found");
 	}
 
 	size = sizeof(scrnmap);
 
 	if (decode(fd, (char *)&scrnmap, size) != size) {
 		rewind(fd);
 
 		if (fread(&scrnmap, 1, size, fd) != (size_t)size) {
 			warnx("bad screenmap file");
 			fclose(fd);
 			revert();
 			errx(1, "bad screenmap file");
 		}
 	}
 
 	if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) {
 		revert();
 		errc(1, errno, "loading screenmap");
 	}
 
 	fclose(fd);
 }
 
 
 /*
  * Set the default screenmap.
  */
 
 static void
 load_default_scrnmap(void)
 {
 	scrmap_t scrnmap;
 	int i;
 
 	for (i=0; i<256; i++)
 		*((char*)&scrnmap + i) = i;
 
 	if (ioctl(0, PIO_SCRNMAP, &scrnmap) == -1) {
 		revert();
 		errc(1, errno, "loading default screenmap");
 	}
 }
 
 
 /*
  * Print the current screenmap to stdout.
  */
 
 static void
 print_scrnmap(void)
 {
 	unsigned char map[256];
 	size_t i;
 
 	if (ioctl(0, GIO_SCRNMAP, &map) == -1) {
 		revert();
 		errc(1, errno, "getting screenmap");
 	}
 	for (i=0; i<sizeof(map); i++) {
 		if (i != 0 && i % 16 == 0)
 			fprintf(stdout, "\n");
 
 		if (hex != 0)
 			fprintf(stdout, " %02x", map[i]);
 		else
 			fprintf(stdout, " %03d", map[i]);
 	}
 	fprintf(stdout, "\n");
 
 }
 
 
 /*
  * Determine a file's size.
  */
 
 static int
 fsize(FILE *file)
 {
 	struct stat sb;
 
 	if (fstat(fileno(file), &sb) == 0)
 		return sb.st_size;
 	else
 		return -1;
 }
 
 static vfnt_map_t *
 load_vt4mappingtable(unsigned int nmappings, FILE *f)
 {
 	vfnt_map_t *t;
 	unsigned int i;
 
 	if (nmappings == 0)
 		return (NULL);
 
 	t = malloc(sizeof *t * nmappings);
 
 	if (fread(t, sizeof *t * nmappings, 1, f) != 1) {
 		perror("mappings");
 		exit(1);
 	}
 
 	for (i = 0; i < nmappings; i++) {
 		t[i].src = be32toh(t[i].src);
 		t[i].dst = be16toh(t[i].dst);
 		t[i].len = be16toh(t[i].len);
 	}
 
 	return (t);
 }
 
 /*
  * Set the default vt font.
  */
 
 static void
 load_default_vt4font(void)
 {
 	if (ioctl(0, PIO_VFONT_DEFAULT) == -1) {
 		revert();
 		errc(1, errno, "loading default vt font");
 	}
 }
 
 static int
 load_vt4font(FILE *f)
 {
 	struct vt4font_header fh;
 	static vfnt_t vfnt;
 	size_t glyphsize;
 	unsigned int i;
 
 	if (fread(&fh, sizeof fh, 1, f) != 1) {
 		perror("file_header");
 		return (1);
 	}
 
 	if (memcmp(fh.magic, "VFNT0002", 8) != 0) {
 		fprintf(stderr, "Bad magic\n");
 		return (1);
 	}
 
 	for (i = 0; i < VFNT_MAPS; i++)
 		vfnt.map_count[i] = be32toh(fh.map_count[i]);
 	vfnt.glyph_count = be32toh(fh.glyph_count);
 	vfnt.width = fh.width;
 	vfnt.height = fh.height;
 
 	glyphsize = howmany(vfnt.width, 8) * vfnt.height * vfnt.glyph_count;
 	vfnt.glyphs = malloc(glyphsize);
 
 	if (fread(vfnt.glyphs, glyphsize, 1, f) != 1) {
 		perror("glyphs");
 		return (1);
 	}
 
 	for (i = 0; i < VFNT_MAPS; i++)
 		vfnt.map[i] = load_vt4mappingtable(vfnt.map_count[i], f);
 
 	if (ioctl(STDIN_FILENO, PIO_VFONT, &vfnt) == -1) {
 		perror("PIO_VFONT");
 		return (1);
 	}
 	return (0);
 }
 
 /*
  * Load a font from file and set it.
  */
 
 static void
 load_font(const char *type, const char *filename)
 {
 	FILE	*fd;
 	int	h, i, size, w;
 	unsigned long io = 0;	/* silence stupid gcc(1) in the Wall mode */
 	char	*name, *fontmap, size_sufx[6];
 	const char	*a[] = {"", FONT_PATH, NULL};
 	const char	*vt4a[] = {"", VT_FONT_PATH, NULL};
 	const char	*b[] = {filename, NULL};
 	const char	*c[] = {"", size_sufx, NULL};
 	const char	*d[] = {"", ".fnt", NULL};
 	vid_info_t _info;
 
 	struct sizeinfo {
 		int w;
 		int h;
 		unsigned long io;
 	} sizes[] = {{8, 16, PIO_FONT8x16},
 		     {8, 14, PIO_FONT8x14},
 		     {8,  8,  PIO_FONT8x8},
 		     {0,  0,            0}};
 
 	if (vt4_mode) {
 		size_sufx[0] = '\0';
 	} else {
 		_info.size = sizeof(_info);
 		if (ioctl(0, CONS_GETINFO, &_info) == -1) {
 			revert();
 			warn("failed to obtain current video mode parameters");
 			return;
 		}
 
 		snprintf(size_sufx, sizeof(size_sufx), "-8x%d", _info.font_size);
 	}
 	fd = openguess((vt4_mode == 0) ? a : vt4a, b, c, d, &name);
 
 	if (fd == NULL) {
 		revert();
 		errx(1, "%s: can't load font file", filename);
 	}
 
 	if (vt4_mode) {
 		if(load_vt4font(fd))
 			warn("failed to load font \"%s\"", filename);
 		fclose(fd);
 		return;
 	}
 
 	if (type != NULL) {
 		size = 0;
 		if (sscanf(type, "%dx%d", &w, &h) == 2) {
 			for (i = 0; sizes[i].w != 0; i++) {
 				if (sizes[i].w == w && sizes[i].h == h) {
 					size = DATASIZE(sizes[i]);
 					io = sizes[i].io;
 					font_height = sizes[i].h;
 				}
 			}
 		}
 		if (size == 0) {
 			fclose(fd);
 			revert();
 			errx(1, "%s: bad font size specification", type);
 		}
 	} else {
 		/* Apply heuristics */
 
 		int j;
 		int dsize[2];
 
 		size = DATASIZE(sizes[0]);
 		fontmap = (char*) malloc(size);
 		dsize[0] = decode(fd, fontmap, size);
 		dsize[1] = fsize(fd);
 		free(fontmap);
 
 		size = 0;
 		for (j = 0; j < 2; j++) {
 			for (i = 0; sizes[i].w != 0; i++) {
 				if (DATASIZE(sizes[i]) == dsize[j]) {
 					size = dsize[j];
 					io = sizes[i].io;
 					font_height = sizes[i].h;
 					j = 2;	/* XXX */
 					break;
 				}
 			}
 		}
 
 		if (size == 0) {
 			fclose(fd);
 			revert();
 			errx(1, "%s: can't guess font size", filename);
 		}
 
 		rewind(fd);
 	}
 
 	fontmap = (char*) malloc(size);
 
 	if (decode(fd, fontmap, size) != size) {
 		rewind(fd);
 		if (fsize(fd) != size ||
 		    fread(fontmap, 1, size, fd) != (size_t)size) {
 			warnx("%s: bad font file", filename);
 			fclose(fd);
 			free(fontmap);
 			revert();
 			errx(1, "%s: bad font file", filename);
 		}
 	}
 
 	if (ioctl(0, io, fontmap) == -1) {
 		revert();
 		errc(1, errno, "loading font");
 	}
 
 	fclose(fd);
 	free(fontmap);
 }
 
 
 /*
  * Set the timeout for the screensaver.
  */
 
 static void
 set_screensaver_timeout(char *arg)
 {
 	int nsec;
 
 	if (!strcmp(arg, "off")) {
 		nsec = 0;
 	} else {
 		nsec = atoi(arg);
 
 		if ((*arg == '\0') || (nsec < 1)) {
 			revert();
 			errx(1, "argument must be a positive number");
 		}
 	}
 
 	if (ioctl(0, CONS_BLANKTIME, &nsec) == -1) {
 		revert();
 		errc(1, errno, "setting screensaver period");
 	}
 }
 
 
 /*
  * Set the cursor's shape/type.
  */
 
 static void
 set_cursor_type(char *appearance)
 {
 	int type;
 
 	if (!strcmp(appearance, "normal"))
 		type = 0;
 	else if (!strcmp(appearance, "blink"))
 		type = 1;
 	else if (!strcmp(appearance, "destructive"))
 		type = 3;
 	else {
 		revert();
 		errx(1, "argument to -c must be normal, blink or destructive");
 	}
 
 	if (ioctl(0, CONS_CURSORTYPE, &type) == -1) {
 		revert();
 		errc(1, errno, "setting cursor type");
 	}
 }
 
 
 /*
  * Set the video mode.
  */
 
 static int
 video_mode(int argc, char **argv, int *mode_index)
 {
 	static struct {
 		const char *name;
 		unsigned long mode;
 		unsigned long mode_num;
 	} modes[] = {
 		{ "80x25",        SW_TEXT_80x25,   M_TEXT_80x25 },
 		{ "80x30",        SW_TEXT_80x30,   M_TEXT_80x30 },
 		{ "80x43",        SW_TEXT_80x43,   M_TEXT_80x43 },
 		{ "80x50",        SW_TEXT_80x50,   M_TEXT_80x50 },
 		{ "80x60",        SW_TEXT_80x60,   M_TEXT_80x60 },
 		{ "132x25",       SW_TEXT_132x25,  M_TEXT_132x25 },
 		{ "132x30",       SW_TEXT_132x30,  M_TEXT_132x30 },
 		{ "132x43",       SW_TEXT_132x43,  M_TEXT_132x43 },
 		{ "132x50",       SW_TEXT_132x50,  M_TEXT_132x50 },
 		{ "132x60",       SW_TEXT_132x60,  M_TEXT_132x60 },
 		{ "VGA_40x25",    SW_VGA_C40x25,   M_VGA_C40x25 },
 		{ "VGA_80x25",    SW_VGA_C80x25,   M_VGA_C80x25 },
 		{ "VGA_80x30",    SW_VGA_C80x30,   M_VGA_C80x30 },
 		{ "VGA_80x50",    SW_VGA_C80x50,   M_VGA_C80x50 },
 		{ "VGA_80x60",    SW_VGA_C80x60,   M_VGA_C80x60 },
 #ifdef SW_VGA_C90x25
 		{ "VGA_90x25",    SW_VGA_C90x25,   M_VGA_C90x25 },
 		{ "VGA_90x30",    SW_VGA_C90x30,   M_VGA_C90x30 },
 		{ "VGA_90x43",    SW_VGA_C90x43,   M_VGA_C90x43 },
 		{ "VGA_90x50",    SW_VGA_C90x50,   M_VGA_C90x50 },
 		{ "VGA_90x60",    SW_VGA_C90x60,   M_VGA_C90x60 },
 #endif
 		{ "VGA_320x200",	SW_VGA_CG320,	M_CG320 },
 		{ "EGA_80x25",		SW_ENH_C80x25,	M_ENH_C80x25 },
 		{ "EGA_80x43",		SW_ENH_C80x43,	M_ENH_C80x43 },
 		{ "VESA_132x25",	SW_VESA_C132x25,M_VESA_C132x25 },
 		{ "VESA_132x43",	SW_VESA_C132x43,M_VESA_C132x43 },
 		{ "VESA_132x50",	SW_VESA_C132x50,M_VESA_C132x50 },
 		{ "VESA_132x60",	SW_VESA_C132x60,M_VESA_C132x60 },
 		{ "VESA_800x600",	SW_VESA_800x600,M_VESA_800x600 },
 		{ NULL, 0, 0 },
 	};
 
 	int new_mode_num = 0;
 	unsigned long mode = 0;
 	int cur_mode; 
 	int ioerr;
 	int size[3];
 	int i;
 
 	if (ioctl(0, CONS_GET, &cur_mode) < 0)
 		err(1, "cannot get the current video mode");
 
 	/*
 	 * Parse the video mode argument...
 	 */
 
 	if (*mode_index < argc) {
 		if (!strncmp(argv[*mode_index], "MODE_", 5)) {
 			if (!isdigit(argv[*mode_index][5]))
 				errx(1, "invalid video mode number");
 
 			new_mode_num = atoi(&argv[*mode_index][5]);
 		} else {
 			for (i = 0; modes[i].name != NULL; ++i) {
 				if (!strcmp(argv[*mode_index], modes[i].name)) {
 					mode = modes[i].mode;
 					new_mode_num = modes[i].mode_num;
 					break;
 				}
 			}
 
 			if (modes[i].name == NULL)
 				return EXIT_FAILURE;
 			if (ioctl(0, mode, NULL) < 0) {
 				warn("cannot set videomode");
 				return EXIT_FAILURE;
 			}
 		}
 
 		/*
 		 * Collect enough information about the new video mode...
 		 */
 
 		new_mode_info.vi_mode = new_mode_num;
 
 		if (ioctl(0, CONS_MODEINFO, &new_mode_info) == -1) {
 			revert();
 			errc(1, errno, "obtaining new video mode parameters");
 		}
 
 		if (mode == 0) {
 			if (new_mode_num >= M_VESA_BASE)
 				mode = _IO('V', new_mode_num - M_VESA_BASE);
 			else
 				mode = _IO('S', new_mode_num);
 		}
 
 		/*
 		 * Try setting the new mode.
 		 */
 
 		if (ioctl(0, mode, NULL) == -1) {
 			revert();
 			errc(1, errno, "setting video mode");
 		}
 
 		/*
 		 * For raster modes it's not enough to just set the mode.
 		 * We also need to explicitly set the raster mode.
 		 */
 
 		if (new_mode_info.vi_flags & V_INFO_GRAPHICS) {
 			/* font size */
 
 			if (font_height == 0)
 				font_height = cur_info.console_info.font_size;
 
 			size[2] = font_height;
 
 			/* adjust columns */
 
 			if ((vesa_cols * 8 > new_mode_info.vi_width) ||
 			    (vesa_cols <= 0)) {
 				size[0] = new_mode_info.vi_width / 8;
 			} else {
 				size[0] = vesa_cols;
 			}
 
 			/* adjust rows */
 
 			if ((vesa_rows * font_height > new_mode_info.vi_height) ||
 			    (vesa_rows <= 0)) {
 				size[1] = new_mode_info.vi_height /
 					  font_height;
 			} else {
 				size[1] = vesa_rows;
 			}
 
 			/* set raster mode */
 
 			if (ioctl(0, KDRASTER, size)) {
 				ioerr = errno;
 				if (cur_mode >= M_VESA_BASE)
 					ioctl(0,
 					    _IO('V', cur_mode - M_VESA_BASE),
 					    NULL);
 				else
 					ioctl(0, _IO('S', cur_mode), NULL);
 				revert();
 				warnc(ioerr, "cannot activate raster display");
 				return EXIT_FAILURE;
 			}
 		}
 
 		video_mode_changed = 1;
 
 		(*mode_index)++;
 	}
 	return EXIT_SUCCESS;
 }
 
 
 /*
  * Return the number for a specified color name.
  */
 
 static int
 get_color_number(char *color)
 {
 	int i;
 
 	for (i=0; i<16; i++) {
 		if (!strcmp(color, legal_colors[i]))
 			return i;
 	}
 	return -1;
 }
 
 
 /*
  * Get normal text and background colors.
  */
 
 static void
 get_normal_colors(int argc, char **argv, int *_index)
 {
 	int color;
 
 	if (*_index < argc && (color = get_color_number(argv[*_index])) != -1) {
 		(*_index)++;
 		fprintf(stderr, "\033[=%dF", color);
 		normal_fore_color=color;
 		colors_changed = 1;
 		if (*_index < argc
 		    && (color = get_color_number(argv[*_index])) != -1
 		    && color < 8) {
 			(*_index)++;
 			fprintf(stderr, "\033[=%dG", color);
 			normal_back_color=color;
 		}
 	}
 }
 
 
 /*
  * Get reverse text and background colors.
  */
 
 static void
 get_reverse_colors(int argc, char **argv, int *_index)
 {
 	int color;
 
 	if ((color = get_color_number(argv[*(_index)-1])) != -1) {
 		fprintf(stderr, "\033[=%dH", color);
 		revers_fore_color=color;
 		colors_changed = 1;
 		if (*_index < argc
 		    && (color = get_color_number(argv[*_index])) != -1
 		    && color < 8) {
 			(*_index)++;
 			fprintf(stderr, "\033[=%dI", color);
 			revers_back_color=color;
 		}
 	}
 }
 
 
 /*
  * Set normal and reverse foreground and background colors.
  */
 
 static void
 set_colors(void)
 {
 	fprintf(stderr, "\033[=%dF", normal_fore_color);
 	fprintf(stderr, "\033[=%dG", normal_back_color);
 	fprintf(stderr, "\033[=%dH", revers_fore_color);
 	fprintf(stderr, "\033[=%dI", revers_back_color);
 }
 
 
 /*
  * Switch to virtual terminal #arg.
  */
 
 static void
 set_console(char *arg)
 {
 	int n;
 
 	if(!arg || strspn(arg,"0123456789") != strlen(arg)) {
 		revert();
 		errx(1, "bad console number");
 	}
 
 	n = atoi(arg);
 
 	if (n < 1 || n > 16) {
 		revert();
 		errx(1, "console number out of range");
 	} else if (ioctl(0, VT_ACTIVATE, n) == -1) {
 		revert();
 		errc(1, errno, "switching vty");
 	}
 }
 
 
 /*
  * Sets the border color.
  */
 
 static void
 set_border_color(char *arg)
 {
 	int color;
 
 	if ((color = get_color_number(arg)) != -1) {
 		fprintf(stderr, "\033[=%dA", color);
 	}
 	else
 		usage();
 }
 
 static void
 set_mouse_char(char *arg)
 {
 	struct mouse_info mouse;
 	long l;
 
 	l = strtol(arg, NULL, 0);
 
 	if ((l < 0) || (l > UCHAR_MAX - 3)) {
 		revert();
 		warnx("argument to -M must be 0 through %d", UCHAR_MAX - 3);
 		return;
 	}
 
 	mouse.operation = MOUSE_MOUSECHAR;
 	mouse.u.mouse_char = (int)l;
 
 	if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) {
 		revert();
 		errc(1, errno, "setting mouse character");
 	}
 }
 
 
 /*
  * Show/hide the mouse.
  */
 
 static void
 set_mouse(char *arg)
 {
 	struct mouse_info mouse;
 
 	if (!strcmp(arg, "on")) {
 		mouse.operation = MOUSE_SHOW;
 	} else if (!strcmp(arg, "off")) {
 		mouse.operation = MOUSE_HIDE;
 	} else {
 		revert();
 		errx(1, "argument to -m must be either on or off");
 	}
 
 	if (ioctl(0, CONS_MOUSECTL, &mouse) == -1) {
 		revert();
 		errc(1, errno, "%sing the mouse",
 		     mouse.operation == MOUSE_SHOW ? "show" : "hid");
 	}
 }
 
 
 static void
 set_lockswitch(char *arg)
 {
 	int data;
 
 	if (!strcmp(arg, "off")) {
 		data = 0x01;
 	} else if (!strcmp(arg, "on")) {
 		data = 0x02;
 	} else {
 		revert();
 		errx(1, "argument to -S must be either on or off");
 	}
 
 	if (ioctl(0, VT_LOCKSWITCH, &data) == -1) {
 		revert();
 		errc(1, errno, "turning %s vty switching",
 		     data == 0x01 ? "off" : "on");
 	}
 }
 
 
 /*
  * Return the adapter name for a specified type.
  */
 
 static const char
 *adapter_name(int type)
 {
     static struct {
 	int type;
 	const char *name;
     } names[] = {
 	{ KD_MONO,	"MDA" },
 	{ KD_HERCULES,	"Hercules" },
 	{ KD_CGA,	"CGA" },
 	{ KD_EGA,	"EGA" },
 	{ KD_VGA,	"VGA" },
 	{ KD_PC98,	"PC-98xx" },
 	{ KD_TGA,	"TGA" },
 	{ -1,		"Unknown" },
     };
 
     int i;
 
     for (i = 0; names[i].type != -1; ++i)
 	if (names[i].type == type)
 	    break;
     return names[i].name;
 }
 
 
 /*
  * Show graphics adapter information.
  */
 
 static void
 show_adapter_info(void)
 {
 	struct video_adapter_info ad;
 
 	ad.va_index = 0;
 
 	if (ioctl(0, CONS_ADPINFO, &ad) == -1) {
 		revert();
 		errc(1, errno, "obtaining adapter information");
 	}
 
 	printf("fb%d:\n", ad.va_index);
 	printf("    %.*s%d, type:%s%s (%d), flags:0x%x\n",
 	       (int)sizeof(ad.va_name), ad.va_name, ad.va_unit,
 	       (ad.va_flags & V_ADP_VESA) ? "VESA " : "",
 	       adapter_name(ad.va_type), ad.va_type, ad.va_flags);
 	printf("    initial mode:%d, current mode:%d, BIOS mode:%d\n",
 	       ad.va_initial_mode, ad.va_mode, ad.va_initial_bios_mode);
 	printf("    frame buffer window:0x%zx, buffer size:0x%zx\n",
 	       ad.va_window, ad.va_buffer_size);
 	printf("    window size:0x%zx, origin:0x%x\n",
 	       ad.va_window_size, ad.va_window_orig);
 	printf("    display start address (%d, %d), scan line width:%d\n",
 	       ad.va_disp_start.x, ad.va_disp_start.y, ad.va_line_width);
 	printf("    reserved:0x%zx\n", ad.va_unused0);
 }
 
 
 /*
  * Show video mode information.
  */
 
 static void
 show_mode_info(void)
 {
 	char buf[80];
 	struct video_info _info;
 	int c;
 	int mm;
 	int mode;
 
 	printf("    mode#     flags   type    size       "
 	       "font      window      linear buffer\n");
 	printf("---------------------------------------"
 	       "---------------------------------------\n");
 
 	for (mode = 0; mode <= M_VESA_MODE_MAX; ++mode) {
 		_info.vi_mode = mode;
 		if (ioctl(0, CONS_MODEINFO, &_info))
 			continue;
 		if (_info.vi_mode != mode)
 			continue;
 
 		printf("%3d (0x%03x)", mode, mode);
     		printf(" 0x%08x", _info.vi_flags);
 		if (_info.vi_flags & V_INFO_GRAPHICS) {
 			c = 'G';
 
 			if (_info.vi_mem_model == V_INFO_MM_PLANAR)
 				snprintf(buf, sizeof(buf), "%dx%dx%d %d",
 				    _info.vi_width, _info.vi_height, 
 				    _info.vi_depth, _info.vi_planes);
 			else {
 				switch (_info.vi_mem_model) {
 				case V_INFO_MM_PACKED:
 					mm = 'P';
 					break;
 				case V_INFO_MM_DIRECT:
 					mm = 'D';
 					break;
 				case V_INFO_MM_CGA:
 					mm = 'C';
 					break;
 				case V_INFO_MM_HGC:
 					mm = 'H';
 					break;
 				case V_INFO_MM_VGAX:
 					mm = 'V';
 					break;
 				default:
 					mm = ' ';
 					break;
 				}
 				snprintf(buf, sizeof(buf), "%dx%dx%d %c",
 				    _info.vi_width, _info.vi_height, 
 				    _info.vi_depth, mm);
 			}
 		} else {
 			c = 'T';
 
 			snprintf(buf, sizeof(buf), "%dx%d",
 				 _info.vi_width, _info.vi_height);
 		}
 
 		printf(" %c %-15s", c, buf);
 		snprintf(buf, sizeof(buf), "%dx%d", 
 			 _info.vi_cwidth, _info.vi_cheight); 
 		printf(" %-5s", buf);
     		printf(" 0x%05zx %2dk %2dk", 
 		       _info.vi_window, (int)_info.vi_window_size/1024, 
 		       (int)_info.vi_window_gran/1024);
     		printf(" 0x%08zx %dk\n",
 		       _info.vi_buffer, (int)_info.vi_buffer_size/1024);
 	}
 }
 
 
 static void
 show_info(char *arg)
 {
 	if (!strcmp(arg, "adapter")) {
 		show_adapter_info();
 	} else if (!strcmp(arg, "mode")) {
 		show_mode_info();
 	} else {
 		revert();
 		errx(1, "argument to -i must be either adapter or mode");
 	}
 }
 
 
 static void
 test_frame(void)
 {
 	int i, cur_mode, fore;
 
 	fore = 15;
 
 	if (ioctl(0, CONS_GET, &cur_mode) < 0)
 		err(1, "must be on a virtual console");
 	switch (cur_mode) {
 	case M_PC98_80x25:
 	case M_PC98_80x30:
 		fore = 7;
 		break;
 	}
 
 	fprintf(stdout, "\033[=0G\n\n");
 	for (i=0; i<8; i++) {
 		fprintf(stdout, "\033[=%dF\033[=0G        %2d \033[=%dF%-16s"
 				"\033[=%dF\033[=0G        %2d \033[=%dF%-16s        "
 				"\033[=%dF %2d \033[=%dGBACKGROUND\033[=0G\n",
 			fore, i, i, legal_colors[i],
 			fore, i+8, i+8, legal_colors[i+8],
 			fore, i, i);
 	}
 	fprintf(stdout, "\033[=%dF\033[=%dG\033[=%dH\033[=%dI\n",
 		info.mv_norm.fore, info.mv_norm.back,
 		info.mv_rev.fore, info.mv_rev.back);
 }
 
 
 /*
  * Snapshot the video memory of that terminal, using the CONS_SCRSHOT
  * ioctl, and writes the results to stdout either in the special
  * binary format (see manual page for details), or in the plain
  * text format.
  */
 
 static void
 dump_screen(int mode, int opt)
 {
 	scrshot_t shot;
 	vid_info_t _info;
 
 	_info.size = sizeof(_info);
 
 	if (ioctl(0, CONS_GETINFO, &_info) == -1) {
 		revert();
 		errc(1, errno, "obtaining current video mode parameters");
 		return;
 	}
 
 	shot.x = shot.y = 0;
 	shot.xsize = _info.mv_csz;
 	shot.ysize = _info.mv_rsz;
 	if (opt == DUMP_ALL)
 		shot.ysize += _info.mv_hsz;
 
 	shot.buf = alloca(shot.xsize * shot.ysize * sizeof(u_int16_t));
 	if (shot.buf == NULL) {
 		revert();
 		errx(1, "failed to allocate memory for dump");
 	}
 
 	if (ioctl(0, CONS_SCRSHOT, &shot) == -1) {
 		revert();
 		errc(1, errno, "dumping screen");
 	}
 
 	if (mode == DUMP_FMT_RAW) {
 		printf("SCRSHOT_%c%c%c%c", DUMP_FMT_REV, 2,
 		       shot.xsize, shot.ysize);
 
 		fflush(stdout);
 
 		write(STDOUT_FILENO, shot.buf,
 		      shot.xsize * shot.ysize * sizeof(u_int16_t));
 	} else {
 		char *line;
 		int x, y;
 		u_int16_t ch;
 
 		line = alloca(shot.xsize + 1);
 
 		if (line == NULL) {
 			revert();
 			errx(1, "failed to allocate memory for line buffer");
 		}
 
 		for (y = 0; y < shot.ysize; y++) {
 			for (x = 0; x < shot.xsize; x++) {
 				ch = shot.buf[x + (y * shot.xsize)];
 				ch &= 0xff;
 
 				if (isprint(ch) == 0)
 					ch = ' ';
 
 				line[x] = (char)ch;
 			}
 
 			/* Trim trailing spaces */
 
 			do {
 				line[x--] = '\0';
 			} while (line[x] == ' ' && x != 0);
 
 			puts(line);
 		}
 
 		fflush(stdout);
 	}
 }
 
 
 /*
  * Set the console history buffer size.
  */
 
 static void
 set_history(char *opt)
 {
 	int size;
 
 	size = atoi(opt);
 
 	if ((*opt == '\0') || size < 0) {
 		revert();
 		errx(1, "argument must be a positive number");
 	}
 
 	if (ioctl(0, CONS_HISTORY, &size) == -1) {
 		revert();
 		errc(1, errno, "setting history buffer size");
 	}
 }
 
 
 /*
  * Clear the console history buffer.
  */
 
 static void
 clear_history(void)
 {
 	if (ioctl(0, CONS_CLRHIST) == -1) {
 		revert();
 		errc(1, errno, "clearing history buffer");
 	}
 }
 
 static void
 set_terminal_mode(char *arg)
 {
 
 	if (strcmp(arg, "xterm") == 0)
 		fprintf(stderr, "\033[=T");
 	else if (strcmp(arg, "cons25") == 0)
 		fprintf(stderr, "\033[=1T");
 }
 
 
 int
 main(int argc, char **argv)
 {
 	char    *font, *type, *termmode;
 	const char *opts;
 	int	dumpmod, dumpopt, opt;
 	int	reterr;
 
 	vt4_mode = is_vt4();
 
 	init();
 
 	info.size = sizeof(info);
 
 	if (ioctl(0, CONS_GETINFO, &info) == -1)
 		err(1, "must be on a virtual console");
 	dumpmod = 0;
 	dumpopt = DUMP_FBF;
 	termmode = NULL;
 	if (vt4_mode)
 		opts = "b:Cc:fg:h:Hi:M:m:pPr:S:s:T:t:x";
 	else
-		opts = "b:Cc:df:g:h:Hi:l:LM:m:pPr:S:s:T:t:x";
+		opts = "b:Cc:dfg:h:Hi:l:LM:m:pPr:S:s:T:t:x";
 
 	while ((opt = getopt(argc, argv, opts)) != -1)
 		switch(opt) {
 		case 'b':
 			set_border_color(optarg);
 			break;
 		case 'C':
 			clear_history();
 			break;
 		case 'c':
 			set_cursor_type(optarg);
 			break;
 		case 'd':
 			if (vt4_mode)
 				break;
 			print_scrnmap();
 			break;
 		case 'f':
 			optarg = nextarg(argc, argv, &optind, 'f', 0);
 			if (optarg != NULL) {
 				font = nextarg(argc, argv, &optind, 'f', 0);
 
 				if (font == NULL) {
 					type = NULL;
 					font = optarg;
 				} else
 					type = optarg;
 
 				load_font(type, font);
 			} else {
 				if (!vt4_mode)
 					usage(); /* Switch syscons to ROM? */
 
 				load_default_vt4font();
 			}
 			break;
 		case 'g':
 			if (sscanf(optarg, "%dx%d",
 			    &vesa_cols, &vesa_rows) != 2) {
 				revert();
 				warnx("incorrect geometry: %s", optarg);
 				usage();
 			}
                 	break;
 		case 'h':
 			set_history(optarg);
 			break;
 		case 'H':
 			dumpopt = DUMP_ALL;
 			break;
 		case 'i':
 			show_info(optarg);
 			break;
 		case 'l':
 			if (vt4_mode)
 				break;
 			load_scrnmap(optarg);
 			break;
 		case 'L':
 			if (vt4_mode)
 				break;
 			load_default_scrnmap();
 			break;
 		case 'M':
 			set_mouse_char(optarg);
 			break;
 		case 'm':
 			set_mouse(optarg);
 			break;
 		case 'p':
 			dumpmod = DUMP_FMT_RAW;
 			break;
 		case 'P':
 			dumpmod = DUMP_FMT_TXT;
 			break;
 		case 'r':
 			get_reverse_colors(argc, argv, &optind);
 			break;
 		case 'S':
 			set_lockswitch(optarg);
 			break;
 		case 's':
 			set_console(optarg);
 			break;
 		case 'T':
 			if (strcmp(optarg, "xterm") != 0 &&
 			    strcmp(optarg, "cons25") != 0)
 				usage();
 			termmode = optarg;
 			break;
 		case 't':
 			set_screensaver_timeout(optarg);
 			break;
 		case 'x':
 			hex = 1;
 			break;
 		default:
 			usage();
 		}
 
 	if (dumpmod != 0)
 		dump_screen(dumpmod, dumpopt);
 	reterr = video_mode(argc, argv, &optind);
 	get_normal_colors(argc, argv, &optind);
 
 	if (optind < argc && !strcmp(argv[optind], "show")) {
 		test_frame();
 		optind++;
 	}
 
 	video_mode(argc, argv, &optind);
 	if (termmode != NULL)
 		set_terminal_mode(termmode);
 
 	get_normal_colors(argc, argv, &optind);
 
 	if (colors_changed || video_mode_changed) {
 		if (!(new_mode_info.vi_flags & V_INFO_GRAPHICS)) {
 			if ((normal_back_color < 8) && (revers_back_color < 8)) {
 				set_colors();
 			} else {
 				revert();
 				errx(1, "bg color for text modes must be < 8");
 			}
 		} else {
 			set_colors();
 		}
 	}
 
 	if ((optind != argc) || (argc == 1))
 		usage();
 	return reterr;
 }
 
Index: user/ngie/more-tests
===================================================================
--- user/ngie/more-tests	(revision 281584)
+++ user/ngie/more-tests	(revision 281585)

Property changes on: user/ngie/more-tests
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,1 ##
   Merged /head:r281504-281584