Index: projects/openssl111/UPDATING =================================================================== --- projects/openssl111/UPDATING (revision 339239) +++ projects/openssl111/UPDATING (revision 339240) @@ -1,1845 +1,1852 @@ Updating Information for FreeBSD current users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the COMMON ITEMS: section later in the file. These instructions assume that you basically know what you are doing. If not, then please consult the FreeBSD handbook: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html Items affecting the ports and packages system can be found in /usr/ports/UPDATING. Please read that file before running portupgrade. NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping from older versions of FreeBSD, try WITHOUT_CLANG and WITH_GCC to bootstrap to the tip of head, and then rebuild without this option. The bootstrap process from older version of current across the gcc/clang cutover is a bit fragile. NOTE TO PEOPLE WHO THINK THAT FreeBSD 12.x IS SLOW: FreeBSD 12.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS- related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define MALLOC_PRODUCTION in /etc/make.conf, or to merely disable the most expensive debugging functionality run "ln -s 'abort:false,junk:false' /etc/malloc.conf".) +20181006: + The legacy DRM modules and drivers have now been added to the loader's + module blacklist, in favor of loading them with kld_list in rc.conf(5). + The module blacklist may be overridden with the loader.conf(5) + 'module_blacklist' variable, but loading them via rc.conf(5) is strongly + encouraged. + 20181002: The cam(4) based nda(4) driver will be used over nvd(4) by default on powerpc64. You may set 'options NVME_USE_NVD=1' in your kernel conf or loader tunable 'hw.nvme.use_nvd=1' if you wish to use the existing driver. Make sure to edit /boot/etc/kboot.conf and fstab to use the nda device name. 20180913: Reproducible build mode is now on by default, in preparation for FreeBSD 12.0. This eliminates build metadata such as the user, host, and time from the kernel (and uname), unless the working tree corresponds to a modified checkout from a version control system. The previous behavior can be obtained by setting the /etc/src.conf knob WITHOUT_REPRODUCIBLE_BUILD. 20180826: The Yarrow CSPRNG has been removed from the kernel as it has not been supported by its designers since at least 2003. Fortuna has been the default since FreeBSD-11. 20180822: devctl freeze/thaw have gone into the tree, the rc scripts have been updated to use them and devmatch has been changed. You should update kernel, userland and rc scripts all at the same time. 20180818: The default interpreter has been switched from 4th to Lua. LOADER_DEFAULT_INTERP, documented in build(7), will override the default interpreter. 
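	As an example of the rc.conf(5) approach recommended in the 20181006 entry above, a DRM module can be loaded at boot with a line such as the following; the module name is only illustrative, so use the one that matches your graphics hardware:

		kld_list="radeonkms"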
	If you have custom FORTH code you will need to set LOADER_DEFAULT_INTERP=4th (valid values are 4th, lua or simp) in src.conf for the build. This will create default hard links between loader and loader_4th instead of loader and loader_lua, the new default. If you are using UEFI it will create the proper hard link to loader.efi.

	bhyve uses userboot.so. It remains 4th-only until some issues regarding coexisting with multiple versions of FreeBSD are resolved.

20180815:
	ls(1) now respects the COLORTERM environment variable used in other systems and software to indicate that a colored terminal is both supported and desired. If ls(1) is suddenly emitting colors, they may be disabled again by either removing the unwanted COLORTERM from your environment, or using `ls --color=never`. The ls(1) specific CLICOLOR may not be observed in a future release.

20180808:
	The default pager for most commands has been changed to "less". To restore the old behavior, set PAGER="more" and MANPAGER="more -s" in your environment.

20180731:
	The jedec_ts(4) driver has been removed. A superset of its functionality is available in the jedec_dimm(4) driver, and the manpage for that driver includes migration instructions. If you have "device jedec_ts" in your kernel configuration file, it must be removed.

20180730:
	amd64/GENERIC now has EFI runtime services, EFIRT, enabled by default. This should have no effect if the kernel is booted via BIOS/legacy boot. EFIRT may be disabled via a loader tunable, efi.rt.disabled, if a system has buggy firmware that prevents a successful boot due to use of runtime services.

20180727:
	Atmel AT91RM9200 and AT91SAM9, Cavium CNS 11xx and XScale support has been removed from the tree. These ports were obsolete and/or known to be broken for many years.

20180723:
	loader.efi has been augmented to participate more fully in the UEFI boot manager protocol. loader.efi will now look at the BootXXXX environment variable to determine if a specific kernel or root partition was specified. XXXX is derived from BootCurrent. efibootmgr(8) manages these standard UEFI variables.

20180720:
	zfsloader's functionality has now been folded into loader. zfsloader is no longer necessary once you've updated your boot blocks. For a transition period, we will install a hardlink for zfsloader to loader to allow a smooth transition until the boot blocks can be updated (a hard link because old zfs boot blocks don't understand symlinks).

20180719:
	ARM64 now has efifb support. If you want to have a serial console on your arm64 board when a screen is connected and the bootloader sets up a framebuffer for us to use, just add:

		boot_serial=YES
		boot_multicons=YES

	in /boot/loader.conf. For Raspberry Pi 3 (RPI) users, this is needed even if you don't have a screen connected, as the firmware will set up a framebuffer area that u-boot will expose as an EFI framebuffer.

20180719:
	New uid:gid added, ntpd:ntpd (123:123). Be sure to run mergemaster or take steps to update /etc/passwd before doing installworld on existing systems. Do not skip the "mergemaster -Fp" step before installworld, as described in the update procedures near the bottom of this document. Also, rc.d/ntpd now starts ntpd(8) as user ntpd if the new mac_ntpd(4) policy is available, unless ntpd_flags or the ntp config file contain options that change file/dir locations.
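	To add the new user and group by hand instead of via mergemaster, commands along these lines should work; the home directory, shell and comment shown are only illustrative, so match whatever mergemaster would install:

		pw groupadd ntpd -g 123
		pw useradd ntpd -u 123 -g ntpd -c "NTP daemon" -d /var/empty -s /usr/sbin/nologin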
When such options (e.g., "statsdir" or "crypto") are used, ntpd can still be run as non-root by setting ntpd_user=ntpd in rc.conf, after taking steps to ensure that all required files/dirs are accessible by the ntpd user. 20180717: Big endian arm support has been removed. 20180711: The static environment setup in kernel configs is no longer mutually exclusive with the loader(8) environment by default. In order to restore the previous default behavior of disabling the loader(8) environment if a static environment is present, you must specify loader_env.disabled=1 in the static environment. 20180705: The ABI of syscalls used by management tools like sockstat and netstat has been broken to allow 32-bit binaries to work on 64-bit kernels without modification. These programs will need to match the kernel in order to function. External programs may require minor modifications to accommodate a change of type in structures from pointers to 64-bit virtual addresses. 20180702: On i386 and amd64 atomics are now inlined. Out of tree modules using atomics will need to be rebuilt. 20180701: The '%I' format in the kern.corefile sysctl limits the number of core files that a process can generate to the number stored in the debug.ncores sysctl. The '%I' format is replaced by the single digit index. Previously, if all indexes were taken the kernel would overwrite only a core file with the highest index in a filename. Currently the system will create a new core file if there is a free index or if all slots are taken it will overwrite the oldest one. 20180630: Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20180628: r335753 introduced a new quoting method. However, etc/devd/devmatch.conf needed to be changed to work with it. This change was made with r335763 and requires a mergemaster / etcupdate / etc to update the installed file. 20180612: r334930 changed the interface between the NFS modules, so they all need to be rebuilt. r335018 did a __FreeBSD_version bump for this. 20180530: As of r334391 lld is the default amd64 system linker; it is installed as /usr/bin/ld. Kernel build workarounds (see 20180510 entry) are no longer necessary. 20180530: The kernel / userland interface for devinfo changed, so you'll need a new kernel and userland as a pair for it to work (rebuilding lib/libdevinfo is all that's required). devinfo and devmatch will not work, but everything else will when there's a mismatch. 20180523: The on-disk format for hwpmc callchain records has changed to include threadid corresponding to a given record. This changes the field offsets and thus requires that libpmcstat be rebuilt before using a kernel later than r334108. 20180517: The vxge(4) driver has been removed. This driver was introduced into HEAD one week before the Exar left the Ethernet market and is not known to be used. If you have device vxge in your kernel config file it must be removed. 20180510: The amd64 kernel now requires a ld that supports ifunc to produce a working kernel, either lld or a newer binutils. lld is built by default on amd64, and the 'buildkernel' target uses it automatically. However, it is not the default linker, so building the kernel the traditional way requires LD=ld.lld on the command line (or LD=/usr/local/bin/ld for binutils port/package). lld will soon be default, and this requirement will go away. 
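	For example, when building a kernel the traditional config(8)-and-make way, the override is passed on the make command line (an illustrative invocation; as noted above, the 'buildkernel' target needs no such override):

		make LD=ld.lld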
NOTE: As of r334391 lld is the default system linker on amd64, and no workaround is necessary. 20180508: The nxge(4) driver has been removed. This driver was for PCI-X 10g cards made by s2io/Neterion. The company was aquired by Exar and no longer sells or supports Ethernet products. If you have device nxge in your kernel config file it must be removed. 20180504: The tz database (tzdb) has been updated to 2018e. This version more correctly models time stamps in time zones with negative DST such as Europe/Dublin (from 1971 on), Europe/Prague (1946/7), and Africa/Windhoek (1994/2017). This does not affect the UT offsets, only time zone abbreviations and the tm_isdst flag. 20180502: The ixgb(4) driver has been removed. This driver was for an early and uncommon legacy PCI 10GbE for a single ASIC, Intel 82597EX. Intel quickly shifted to the long lived ixgbe family. If you have device ixgb in your kernel config file it must be removed. 20180501: The lmc(4) driver has been removed. This was a WAN interface card that was already reportedly rare in 2003, and had an ambiguous license. If you have device lmc in your kernel config file it must be removed. 20180413: Support for Arcnet networks has been removed. If you have device arcnet or device cm in your kernel config file they must be removed. 20180411: Support for FDDI networks has been removed. If you have device fddi or device fpa in your kernel config file they must be removed. 20180406: In addition to supporting RFC 3164 formatted messages, the syslogd(8) service is now capable of parsing RFC 5424 formatted log messages. The main benefit of using RFC 5424 is that clients may now send log messages with timestamps containing year numbers, microseconds and time zone offsets. Similarly, the syslog(3) C library function has been altered to send RFC 5424 formatted messages to the local system logging daemon. On systems using syslogd(8), this change should have no negative impact, as long as syslogd(8) and the C library are updated at the same time. On systems using a different system logging daemon, it may be necessary to make configuration adjustments, depending on the software used. When using syslog-ng, add the 'syslog-protocol' flag to local input sources to enable parsing of RFC 5424 formatted messages: source src { unix-dgram("/var/run/log" flags(syslog-protocol)); } When using rsyslog, disable the 'SysSock.UseSpecialParser' option of the 'imuxsock' module to let messages be processed by the regular RFC 3164/5424 parsing pipeline: module(load="imuxsock" SysSock.UseSpecialParser="off") Do note that these changes only affect communication between local applications and syslogd(8). The format that syslogd(8) uses to store messages on disk or forward messages to other systems remains unchanged. syslogd(8) still uses RFC 3164 for these purposes. Options to customize this behaviour will be added in the future. Utilities that process log files stored in /var/log are thus expected to continue to function as before. __FreeBSD_version has been incremented to 1200061 to denote this change. 20180328: Support for token ring networks has been removed. If you have "device token" in your kernel config you should remove it. No device drivers supported token ring. 20180323: makefs was modified to be able to tag ISO9660 El Torito boot catalog entries as EFI instead of overloading the i386 tag as done previously. The amd64 mkisoimages.sh script used to build amd64 ISO images for release was updated to use this. 
	This may mean that makefs must be updated before "make cdrom" can be run in the release directory. This should be as simple as:

		$ cd $SRCDIR/usr.sbin/makefs
		$ make depend all install

20180212:
	The FreeBSD boot loader has been enhanced with Lua scripting. It's purely opt-in for now, enabled by building WITH_LOADER_LUA and WITHOUT_FORTH in /etc/src.conf. Co-existence for the transition period will come shortly. Booting is a complex environment and test coverage for Lua-enabled loaders has been thin, so it would be prudent to assume it might not work and make provisions for backup boot methods.

20180211:
	devmatch functionality has been turned on in devd. It will automatically load drivers for unattached devices. This may cause unexpected drivers to be loaded. Please report any problems to current@ and imp@freebsd.org.

20180114:
	Clang, llvm, lld, lldb, compiler-rt and libc++ have been upgraded to 6.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.

20180110:
	LLVM's lld linker is now used as the FreeBSD/amd64 bootstrap linker. This means it is used to link the kernel and userland libraries and executables, but is not yet installed as /usr/bin/ld by default. To revert to ld.bfd as the bootstrap linker, in /etc/src.conf set WITHOUT_LLD_BOOTSTRAP=yes

20180110:
	On i386, pmtimer has been removed. Its functionality has been folded into apm. It was a no-op on ACPI in current for a while now (but was still needed on i386 in FreeBSD 11 and earlier). Users may need to remove it from kernel config files.

20180104:
	The use of RSS hash from the network card aka flowid has been disabled by default for lagg(4) as it's currently incompatible with the lacp and loadbalance protocols. This can be re-enabled by setting the following in loader.conf:

		net.link.lagg.default_use_flowid="1"

20180102:
	The SW_WATCHDOG option is no longer necessary to enable the hardclock-based software watchdog if no hardware watchdog is configured. As before, SW_WATCHDOG will cause the software watchdog to be enabled even if a hardware watchdog is configured.

20171215:
	r326887 fixes the issue described in the 20171214 UPDATING entry. r326888 flips the switch back to building GELI support always.

20171214:
	r362593 broke ZFS + GELI support for reasons unknown. However, it also broke ZFS support generally, so GELI has been turned off by default as the lesser evil in r326857. If you boot off ZFS and/or GELI, it might not be a good time to update.

20171125:
	PowerPC users must update loader(8) by rebuilding world before installing a new kernel, as the protocol connecting them has changed. Without the update, loader metadata will not be passed successfully to the kernel and users will have to enter their root partition at the kernel mountroot prompt to continue booting. Newer versions of loader can boot old kernels without issue.

20171110:
	The LOADER_FIREWIRE_SUPPORT build variable has been renamed to WITH/OUT_LOADER_FIREWIRE. LOADER_{NO_,}GELI_SUPPORT has been renamed to WITH/OUT_LOADER_GELI.

20171106:
	The naive and non-compliant support of posix_fallocate(2) in ZFS has been removed as of r325320. The system call now returns EINVAL when used on a ZFS file. Although the new behavior complies with the standard, some consumers are not prepared to cope with it. One known victim is lld prior to r325420.

20171102:
	Building in a FreeBSD src checkout will automatically create object directories now rather than store files in the current directory if 'make obj' was not run.
Calling 'make obj' is no longer necessary. This feature can be disabled by setting WITHOUT_AUTO_OBJ=yes in /etc/src-env.conf (not /etc/src.conf), or passing the option in the environment. 20171101: The default MAKEOBJDIR has changed from /usr/obj/ for native builds, and /usr/obj// for cross-builds, to a unified /usr/obj//. This behavior can be changed to the old format by setting WITHOUT_UNIFIED_OBJDIR=yes in /etc/src-env.conf, the environment, or with -DWITHOUT_UNIFIED_OBJDIR when building. The UNIFIED_OBJDIR option is a transitional feature that will be removed for 12.0 release; please migrate to the new format for any tools by looking up the OBJDIR used by 'make -V .OBJDIR' means rather than hardcoding paths. 20171028: The native-xtools target no longer installs the files by default to the OBJDIR. Use the native-xtools-install target with a DESTDIR to install to ${DESTDIR}/${NXTP} where NXTP defaults to /nxb-bin. 20171021: As part of the boot loader infrastructure cleanup, LOADER_*_SUPPORT options are changing from controlling the build if defined / undefined to controlling the build with explicit 'yes' or 'no' values. They will shift to WITH/WITHOUT options to match other options in the system. 20171010: libstand has turned into a private library for sys/boot use only. It is no longer supported as a public interface outside of sys/boot. 20171005: The arm port has split armv6 into armv6 and armv7. armv7 is now a valid TARGET_ARCH/MACHINE_ARCH setting. If you have an armv7 system and are running a kernel from before r324363, you will need to add MACHINE_ARCH=armv7 to 'make buildworld' to do a native build. 20171003: When building multiple kernels using KERNCONF, non-existent KERNCONF files will produce an error and buildkernel will fail. Previously missing KERNCONF files silently failed giving no indication as to why, only to subsequently discover during installkernel that the desired kernel was never built in the first place. 20170912: The default serial number format for CTL LUNs has changed. This will affect users who use /dev/diskid/* device nodes, or whose FibreChannel or iSCSI clients care about their LUNs' serial numbers. Users who require serial number stability should hardcode serial numbers in /etc/ctl.conf . 20170912: For 32-bit arm compiled for hard-float support, soft-floating point binaries now always get their shared libraries from LD_SOFT_LIBRARY_PATH (in the past, this was only used if /usr/libsoft also existed). Only users with a hard-float ld.so, but soft-float everything else should be affected. 20170826: The geli password typed at boot is now hidden. To restore the previous behavior, see geli(8) for configuration options. 20170825: Move PMTUD blackhole counters to TCPSTATS and remove them from bare sysctl values. Minor nit, but requires a rebuild of both world/kernel to complete. 20170814: "make check" behavior (made in ^/head@r295380) has been changed to execute from a limited sandbox, as opposed to executing from ${TESTSDIR}. Behavioral changes: - The "beforecheck" and "aftercheck" targets are now specified. - ${CHECKDIR} (added in commit noted above) has been removed. - Legacy behavior can be enabled by setting WITHOUT_MAKE_CHECK_USE_SANDBOX in src.conf(5) or the environment. If the limited sandbox mode is enabled, "make check" will execute "make distribution", then install, execute the tests, and clean up the sandbox if successful. 
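	As an aside illustrating the unified object directory from the 20171101 entry above, the path can be queried rather than hardcoded; the output shown is what a native amd64 build of a tree in /usr/src would typically print:

		$ cd /usr/src/bin/ls && make -V .OBJDIR
		/usr/obj/usr/src/amd64.amd64/bin/ls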
	The "make distribution" and "make install" targets are typically run as root to set appropriate permissions and ownership at installation time. The end-user should set "WITH_INSTALL_AS_USER" in src.conf(5) or the environment if executing "make check" with limited sandbox mode using an unprivileged user.

20170808:
	Since the switch to GPT disk labels, fsck for UFS/FFS has been unable to automatically find alternate superblocks. As of r322297, the information needed to find alternate superblocks has been moved to the end of the area reserved for the boot block. Filesystems created with a newfs of this vintage or later will create the recovery information. If you have a filesystem created prior to this change and wish to have a recovery block created for your filesystem, you can do so by running fsck in foreground mode (i.e., do not use the -p or -y options). As it starts, fsck will ask ``SAVE DATA TO FIND ALTERNATE SUPERBLOCKS'' to which you should answer yes.

20170728:
	As of r321665, an NFSv4 server configuration that services Kerberos mounts or clients that do not support the uid/gid in owner/owner_group string capability must explicitly enable the nfsuserd daemon by adding nfsuserd_enable="YES" to the machine's /etc/rc.conf file.

20170722:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 5.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.

20170701:
	WITHOUT_RCMDS is now the default. Set WITH_RCMDS if you need the r-commands (rlogin, rsh, etc.) to be built with the base system.

20170625:
	The FreeBSD/powerpc platform now uses a 64-bit type for time_t. This is a very major ABI incompatible change, so users of FreeBSD/powerpc must be careful when performing source upgrades. It is best to run 'make installworld' from an alternate root system, either a live CD/memory stick, or a temporary root partition. Additionally, all ports must be recompiled. powerpc64 is largely unaffected, except in the case of 32-bit compatibility. All 32-bit binaries will be affected.

20170623:
	Forward compatibility for the "ino64" project has been committed. This will allow most new binaries to run on older kernels in a limited fashion. This prevents many of the common foot-shooting actions in the upgrade and provides a limited ability to roll back the kernel across the ino64 upgrade. Complicated use cases may not work properly, though enough simpler ones work to allow recovery in most situations.

20170620:
	Switch back to the BSDL dtc (Device Tree Compiler). Set WITH_GPL_DTC if you require the GPL compiler.

20170618:
	The internal ABI used for communication between the NFS kernel modules was changed by r320085, so __FreeBSD_version was bumped to ensure all the NFS related modules are updated together.

20170617:
	The ABI of struct event was changed by extending the data member to 64bit and adding ext fields. For the upgrade, the same precautions as for the 20170523 "ino64" entry must be followed.

20170531:
	The GNU roff toolchain has been removed from base. To render manpages which are not supported by mandoc(1), man(1) can fall back on GNU roff from ports (and will recommend installing it). To render roff(7) documents, consider using GNU roff from ports or the heirloom doctools roff toolchain from ports, via pkg install groff or via pkg install heirloom-doctools.

20170524:
	The ath(4) and ath_hal(4) modules now build piecemeal to allow for smaller runtime footprint builds.
	This is useful for embedded systems which only require support for a single chipset. If you load it as a module, make sure this is in /boot/loader.conf:

		if_ath_load="YES"

	This will load the HAL, all chip/RF backends and if_ath_pci. If you have if_ath_pci in /boot/loader.conf, ensure it is after if_ath or it will not load any HAL chipset support.

	If you want to selectively load things (eg on ye cheape ARM/MIPS platforms where RAM is at a premium) you should:

	* load ath_hal
	* load the chip modules in question
	* load ath_rate, ath_dfs
	* load ath_main
	* load if_ath_pci and/or if_ath_ahb depending upon your particular bus bind type - this is where probe/attach is done.

	For further comments/feedback, poke adrian@.

20170523:
	The "ino64" 64-bit inode project has been committed, which extends a number of types to 64 bits. Upgrading in place requires care and adherence to the documented upgrade procedure. If using a custom kernel configuration ensure that the COMPAT_FREEBSD11 option is included (as during the upgrade the system will be running the ino64 kernel with the existing world). For the safest in-place upgrade begin by removing previous build artifacts via "rm -rf /usr/obj/*". Then, carefully follow the full procedure documented below under the heading "To rebuild everything and install it on the current system." Specifically, a reboot is required after installing the new kernel before installing world.

20170424:
	The NATM framework including the en(4), fatm(4), hatm(4), and patm(4) devices has been removed. Consumers should plan a migration before the end-of-life date for FreeBSD 11.

20170420:
	GNU diff has been replaced by a BSD licensed diff. Some features of GNU diff have not been implemented; if those are needed, a newer version of GNU diff is available via the diffutils package under the gdiff name.

20170413:
	As of r316810 for ipfilter, keep frags is no longer assumed when keep state is specified in a rule. r316810 aligns ipfilter with documentation in man pages separating keep frags from keep state. This allows keep state to be specified without forcing keep frags and allows keep frags to be specified independently of keep state. To maintain previous behaviour, also specify keep frags with keep state (as documented in ipf.conf.5).

20170407:
	arm64 builds now use the base system LLD 4.0.0 linker by default, instead of requiring that the aarch64-binutils port or package be installed. To continue using aarch64-binutils, set CROSS_BINUTILS_PREFIX=/usr/local/aarch64-freebsd/bin .

20170405:
	The UDP optimization in entry 20160818 that added the sysctl net.inet.udp.require_l2_bcast has been reverted. L2 broadcast packets will no longer be treated as L3 broadcast packets.

20170331:
	Binds and sends to the loopback addresses, IPv6 and IPv4, will now use any explicitly assigned loopback address available in the jail instead of using the first assigned address of the jail.

20170329:
	The ctl.ko module no longer implements the iSCSI target frontend: cfiscsi.ko does instead. If building cfiscsi.ko as a kernel module, the module can be loaded via one of the following methods:

	- `cfiscsi_load="YES"` in loader.conf(5).
	- Add `cfiscsi` to `$kld_list` in rc.conf(5).
	- ctladm(8)/ctld(8), when compiled with iSCSI support (`WITH_ISCSI=yes` in src.conf(5))

	Please see cfiscsi(4) for more details.

20170316:
	The mmcsd.ko module now additionally depends on geom_flashmap.ko.
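	As an illustration of the 20170413 ipfilter change above, a rule that wants the pre-r316810 behaviour now has to name both keywords; the rule itself is only an example:

		pass in quick proto tcp from any to any port = 22 keep state keep frags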
Also, mmc.ko and mmcsd.ko need to be a matching pair built from the same source (previously, the dependency of mmcsd.ko on mmc.ko was missing, but mmcsd.ko now will refuse to load if it is incompatible with mmc.ko). 20170315: The syntax of ipfw(8) named states was changed to avoid ambiguity. If you have used named states in the firewall rules, you need to modify them after installworld and before rebooting. Now named states must be prefixed with colon. 20170311: The old drm (sys/dev/drm/) drivers for i915 and radeon have been removed as the userland we provide cannot use them. The KMS version (sys/dev/drm2) supports the same hardware. 20170302: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 4.0.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20170221: The code that provides support for ZFS .zfs/ directory functionality has been reimplemented. It's not possible now to create a snapshot by mkdir under .zfs/snapshot/. That should be the only user visible change. 20170216: EISA bus support has been removed. The WITH_EISA option is no longer valid. 20170215: MCA bus support has been removed. 20170127: The WITH_LLD_AS_LD / WITHOUT_LLD_AS_LD build knobs have been renamed WITH_LLD_IS_LD / WITHOUT_LLD_IS_LD, for consistency with CLANG_IS_CC. 20170112: The EM_MULTIQUEUE kernel configuration option is deprecated now that the em(4) driver conforms to iflib specifications. 20170109: The igb(4), em(4) and lem(4) ethernet drivers are now implemented via IFLIB. If you have a custom kernel configuration that excludes em(4) but you use igb(4), you need to re-add em(4) to your custom configuration. 20161217: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.1. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161124: Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.9.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20161119: The layout of the pmap structure has changed for powerpc to put the pmap statistics at the front for all CPU variations. libkvm(3) and all tools that link against it need to be recompiled. 20161030: isl(4) and cyapa(4) drivers now require a new driver, chromebook_platform(4), to work properly on Chromebook-class hardware. On other types of hardware the drivers may need to be configured using device hints. Please see the corresponding manual pages for details. 20161017: The urtwn(4) driver was merged into rtwn(4) and now consists of rtwn(4) main module + rtwn_usb(4) and rtwn_pci(4) bus-specific parts. Also, firmware for RTL8188CE was renamed due to possible name conflict (rtwnrtl8192cU(B) -> rtwnrtl8192cE(B)) 20161015: GNU rcs has been removed from base. It is available as packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) before it was removed from base. 20161008: Use of the cc_cdg, cc_chd, cc_hd, or cc_vegas congestion control modules now requires that the kernel configuration contain the TCP_HHOOK option. (This option is included in the GENERIC kernel.) 20161003: The WITHOUT_ELFCOPY_AS_OBJCOPY src.conf(5) knob has been retired. ELF Tool Chain's elfcopy is always installed as /usr/bin/objcopy. 20160924: Relocatable object files with the extension of .So have been renamed to use an extension of .pico instead. 
The purpose of this change is to avoid a name clash with shared libraries on case-insensitive file systems. On those file systems, foo.So is the same file as foo.so. 20160918: GNU rcs has been turned off by default. It can (temporarily) be built again by adding WITH_RCS knob in src.conf. Otherwise, GNU rcs is available from packages: - rcs: Latest GPLv3 GNU rcs version. - rcs57: Copy of the latest version of GNU rcs (GPLv2) from base. 20160918: The backup_uses_rcs functionality has been removed from rc.subr. 20160908: The queue(3) debugging macro, QUEUE_MACRO_DEBUG, has been split into two separate components, QUEUE_MACRO_DEBUG_TRACE and QUEUE_MACRO_DEBUG_TRASH. Define both for the original QUEUE_MACRO_DEBUG behavior. 20160824: r304787 changed some ioctl interfaces between the iSCSI userspace programs and the kernel. ctladm, ctld, iscsictl, and iscsid must be rebuilt to work with new kernels. __FreeBSD_version has been bumped to 1200005. 20160818: The UDP receive code has been updated to only treat incoming UDP packets that were addressed to an L2 broadcast address as L3 broadcast packets. It is not expected that this will affect any standards-conforming UDP application. The new behaviour can be disabled by setting the sysctl net.inet.udp.require_l2_bcast to 0. 20160818: Remove the openbsd_poll system call. __FreeBSD_version has been bumped because of this. 20160708: The stable/11 branch has been created from head@r302406. 20160622: The libc stub for the pipe(2) system call has been replaced with a wrapper that calls the pipe2(2) system call and the pipe(2) system call is now only implemented by the kernels that include "options COMPAT_FREEBSD10" in their config file (this is the default). Users should ensure that this option is enabled in their kernel or upgrade userspace to r302092 before upgrading their kernel. 20160527: CAM will now strip leading spaces from SCSI disks' serial numbers. This will affect users who create UFS filesystems on SCSI disks using those disk's diskid device nodes. For example, if /etc/fstab previously contained a line like "/dev/diskid/DISK-%20%20%20%20%20%20%20ABCDEFG0123456", you should change it to "/dev/diskid/DISK-ABCDEFG0123456". Users of geom transforms like gmirror may also be affected. ZFS users should generally be fine. 20160523: The bitstring(3) API has been updated with new functionality and improved performance. But it is binary-incompatible with the old API. Objects built with the new headers may not be linked against objects built with the old headers. 20160520: The brk and sbrk functions have been removed from libc on arm64. Binutils from ports has been updated to not link to these functions and should be updated to the latest version before installing a new libc. 20160517: The armv6 port now defaults to hard float ABI. Limited support for running both hardfloat and soft float on the same system is available using the libraries installed with -DWITH_LIBSOFT. This has only been tested as an upgrade path for installworld and packages may fail or need manual intervention to run. New packages will be needed. To update an existing self-hosted armv6hf system, you must add TARGET_ARCH=armv6 on the make command line for both the build and the install steps. 20160510: Kernel modules compiled outside of a kernel build now default to installing to /boot/modules instead of /boot/kernel. Many kernel modules built this way (such as those in ports) already overrode KMODDIR explicitly to install into /boot/modules. 
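	For this 20160510 entry, a hand-built module can still be placed in /boot/kernel by overriding KMODDIR at install time; the module name below is only an example:

		cd /sys/modules/cpufreq
		make && make install KMODDIR=/boot/kernel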
However, manually building and installing a module from /sys/modules will now install to /boot/modules instead of /boot/kernel. 20160414: The CAM I/O scheduler has been committed to the kernel. There should be no user visible impact. This does enable NCQ Trim on ada SSDs. While the list of known rogues that claim support for this but actually corrupt data is believed to be complete, be on the lookout for data corruption. The known rogue list is believed to be complete: o Crucial MX100, M550 drives with MU01 firmware. o Micron M510 and M550 drives with MU01 firmware. o Micron M500 prior to MU07 firmware o Samsung 830, 840, and 850 all firmwares o FCCT M500 all firmwares Crucial has firmware http://www.crucial.com/usa/en/support-ssd-firmware with working NCQ TRIM. For Micron branded drives, see your sales rep for updated firmware. Black listed drives will work correctly because these drives work correctly so long as no NCQ TRIMs are sent to them. Given this list is the same as found in Linux, it's believed there are no other rogues in the market place. All other models from the above vendors work. To be safe, if you are at all concerned, you can quirk each of your drives to prevent NCQ from being sent by setting: kern.cam.ada.X.quirks="0x2" in loader.conf. If the drive requires the 4k sector quirk, set the quirks entry to 0x3. 20160330: The FAST_DEPEND build option has been removed and its functionality is now the one true way. The old mkdep(1) style of 'make depend' has been removed. See 20160311 for further details. 20160317: Resource range types have grown from unsigned long to uintmax_t. All drivers, and anything using libdevinfo, need to be recompiled. 20160311: WITH_FAST_DEPEND is now enabled by default for in-tree and out-of-tree builds. It no longer runs mkdep(1) during 'make depend', and the 'make depend' stage can safely be skipped now as it is auto ran when building 'make all' and will generate all SRCS and DPSRCS before building anything else. Dependencies are gathered at compile time with -MF flags kept in separate .depend files per object file. Users should run 'make cleandepend' once if using -DNO_CLEAN to clean out older stale .depend files. 20160306: On amd64, clang 3.8.0 can now insert sections of type AMD64_UNWIND into kernel modules. Therefore, if you load any kernel modules at boot time, please install the boot loaders after you install the kernel, but before rebooting, e.g.: make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installkernel KERNCONF=YOUR_KERNEL_HERE make -C sys/boot install Then follow the usual steps, described in the General Notes section, below. 20160305: Clang, llvm, lldb and compiler-rt have been upgraded to 3.8.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher. 20160301: The AIO subsystem is now a standard part of the kernel. The VFS_AIO kernel option and aio.ko kernel module have been removed. Due to stability concerns, asynchronous I/O requests are only permitted on sockets and raw disks by default. To enable asynchronous I/O requests on all file types, set the vfs.aio.enable_unsafe sysctl to a non-zero value. 20160226: The ELF object manipulation tool objcopy is now provided by the ELF Tool Chain project rather than by GNU binutils. It should be a drop-in replacement, with the addition of arm64 support. The (temporary) src.conf knob WITHOUT_ELFCOPY_AS_OBJCOPY knob may be set to obtain the GNU version if necessary. 
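	For the 20160301 entry above, enabling asynchronous I/O on all file types persistently amounts to a single sysctl.conf(5) line (shown as an example; it can also be set at runtime with sysctl(8)):

		vfs.aio.enable_unsafe=1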
20160129:
	Building ZFS pools on top of zvols is prohibited by default. That feature has never worked safely; it's always been prone to deadlocks. Using a zvol as the backing store for a VM guest's virtual disk will still work, even if the guest is using ZFS. Legacy behavior can be restored by setting vfs.zfs.vol.recursive=1.

20160119:
	The NONE and HPN patches have been removed from OpenSSH. They are still available in the security/openssh-portable port.

20160113:
	With the addition of ypldap(8), a new _ypldap user is now required during installworld. "mergemaster -p" can be used to add the user prior to installworld, as documented in the handbook.

20151216:
	The tftp loader (pxeboot) now uses the option root-path directive. As a consequence it no longer looks for a pxeboot.4th file on the tftp server. Instead it uses the regular /boot infrastructure as with the other loaders.

20151211:
	The code to start recording plug and play data into the modules has been committed. While the old tools will properly build a new kernel, a number of warnings about "unknown metadata record 4" will be produced for an older kldxref. To avoid such warnings, make sure to rebuild the kernel toolchain (or world). Make sure that you have r292078 or later when trying to build 292077 or later before rebuilding.

20151207:
	Debug data files are now built by default with 'make buildworld' and installed with 'make installworld'. This facilitates debugging but requires more disk space both during the build and for the installed world. Debug files may be disabled by setting WITHOUT_DEBUG_FILES=yes in src.conf(5).

20151130:
	r291527 changed the internal interface between the nfsd.ko and nfscommon.ko modules. As such, they must both be upgraded together. __FreeBSD_version has been bumped because of this.

20151108:
	The addition of support for unicode collation strings leads to a change in the order of files listed by ls(1), for example. To get back to the old behaviour, set the LC_COLLATE environment variable to "C". Database administrators will need to reindex their databases given collation results will be different. Due to a bug in install(1) it is recommended to remove the ancient locales before running make installworld:

		rm -rf /usr/share/locale/*

20151030:
	OpenSSL has been upgraded to 1.0.2d. Any binaries requiring libcrypto.so.7 or libssl.so.7 must be recompiled.

20151020:
	Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0. Kernel modules isp_2400_multi and isp_2500_multi were removed and should be replaced with isp_2400 and isp_2500 modules respectively.

20151017:
	The build previously allowed using 'make -n' to not recurse into sub-directories while showing what commands would be executed, and 'make -n -n' to recursively show commands. Now 'make -n' will recurse and 'make -N' will not.

20151012:
	If you specify SENDMAIL_MC or SENDMAIL_CF in make.conf, mergemaster and etcupdate will now use this file. A custom sendmail.cf is now updated via this mechanism rather than via installworld. If you had excluded sendmail.cf in mergemaster.rc or etcupdate.conf, you may want to remove the exclusion or change it to "always install". /etc/mail/sendmail.cf is now managed the same way regardless of whether SENDMAIL_MC/SENDMAIL_CF is used. If you are not using SENDMAIL_MC/SENDMAIL_CF there should be no change in behavior.

20151011:
	Compatibility shims for legacy ATA device names have been removed.
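	As an example of what this removal means in practice, an fstab entry that still uses a legacy name such as /dev/ad0s1a must be switched to the corresponding ada(4) name; note that the unit numbers assigned by ada(4) can differ from the old ad numbering, so check the device names reported at boot:

		/dev/ada0s1a	/	ufs	rw	1	1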
	The removed shims include the ATA_STATIC_ID kernel option, the kern.cam.ada.legacy_aliases and kern.geom.raid.legacy_aliases loader tunables, the kern.devalias.* environment variables, and the /dev/ad* and /dev/ar* symbolic links.

20151006:
	Clang, llvm, lldb, compiler-rt and libc++ have been upgraded to 3.7.0. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using clang 3.5.0 or higher.

20150924:
	Kernel debug files have been moved to /usr/lib/debug/boot/kernel/, and renamed from .symbols to .debug. This reduces the size requirements on the boot partition or file system and provides consistency with userland debug files. When using the supported kernel installation method the /usr/lib/debug/boot/kernel directory will be renamed (to kernel.old) as is done with /boot/kernel. Developers wishing to maintain the historical behavior of installing debug files in /boot/kernel/ can set KERN_DEBUGDIR="" in src.conf(5).

20150827:
	The wireless drivers have undergone changes that remove the 'parent interface' from the ifconfig -l output. The rc.d network scripts used to check for the presence of a parent interface in the list, so old scripts would fail to start wireless networking. Thus, an etcupdate(8) or mergemaster(8) run is required after a kernel update, to update your rc.d scripts in /etc.

20150827:
	pf no longer supports 'scrub fragment crop' or 'scrub fragment drop-ovl'. These configurations are now automatically interpreted as 'scrub fragment reassemble'.

20150817:
	Kernel-loadable modules for the random(4) device are back. To use them, the kernel must have

		device random
		options RANDOM_LOADABLE

	kldload(8) can then be used to load random_fortuna.ko or random_yarrow.ko. Please note that due to the indirect function calls that the loadable modules need to provide, the built-in variants will be slightly more efficient.

	The random(4) kernel option RANDOM_DUMMY has been retired due to unpopularity. It was not all that useful anyway.

20150813:
	The WITHOUT_ELFTOOLCHAIN_TOOLS src.conf(5) knob has been retired. Control over building the ELF Tool Chain tools is now provided by the WITHOUT_TOOLCHAIN knob.

20150810:
	The polarity of Pulse Per Second (PPS) capture events with the uart(4) driver has been corrected. Prior to this change the PPS "assert" event corresponded to the trailing edge of a positive PPS pulse and the "clear" event was the leading edge of the next pulse. As the width of a PPS pulse in a typical GPS receiver is on the order of 1 millisecond, most users will not notice any significant difference with this change. Anyone who has compensated for the historical polarity reversal by configuring a negative offset equal to the pulse width will need to remove that workaround.

20150809:
	The default group assigned to /dev/dri entries has been changed from 'wheel' to 'video' with the id of '44'. If you want to have access to the dri devices please add yourself to the video group with:

		# pw groupmod video -m $USER

20150806:
	The menu.rc and loader.rc files will now be replaced during upgrades. Please migrate local changes to menu.rc.local and loader.rc.local instead.

20150805:
	GNU Binutils versions of addr2line, c++filt, nm, readelf, size, strings and strip have been removed. The src.conf(5) knob WITHOUT_ELFTOOLCHAIN_TOOLS no longer provides the binutils tools.

20150728:
	As ZFS requires more kernel stack pages than is the default on some architectures e.g. i386, it now warns if KSTACK_PAGES is less than ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing).
Please consider using 'options KSTACK_PAGES=X' where X is greater than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations. 20150706: sendmail has been updated to 8.15.2. Starting with FreeBSD 11.0 and sendmail 8.15, sendmail uses uncompressed IPv6 addresses by default, i.e., they will not contain "::". For example, instead of ::1, it will be 0:0:0:0:0:0:0:1. This permits a zero subnet to have a more specific match, such as different map entries for IPv6:0:0 vs IPv6:0. This change requires that configuration data (including maps, files, classes, custom ruleset, etc.) must use the same format, so make certain such configuration data is upgrading. As a very simple check search for patterns like 'IPv6:[0-9a-fA-F:]*::' and 'IPv6::'. To return to the old behavior, set the m4 option confUSE_COMPRESSED_IPV6_ADDRESSES or the cf option UseCompressedIPv6Addresses. 20150630: The default kernel entropy-processing algorithm is now Fortuna, replacing Yarrow. Assuming you have 'device random' in your kernel config file, the configurations allow a kernel option to override this default. You may choose *ONE* of: options RANDOM_YARROW # Legacy /dev/random algorithm. options RANDOM_DUMMY # Blocking-only driver. If you have neither, you get Fortuna. For most people, read no further, Fortuna will give a /dev/random that works like it always used to, and the difference will be irrelevant. If you remove 'device random', you get *NO* kernel-processed entropy at all. This may be acceptable to folks building embedded systems, but has complications. Carry on reading, and it is assumed you know what you need. *PLEASE* read random(4) and random(9) if you are in the habit of tweaking kernel configs, and/or if you are a member of the embedded community, wanting specific and not-usual behaviour from your security subsystems. NOTE!! If you use RANDOM_DUMMY and/or have no 'device random', you will NOT have a functioning /dev/random, and many cryptographic features will not work, including SSH. You may also find strange behaviour from the random(3) set of library functions, in particular sranddev(3), srandomdev(3) and arc4random(3). The reason for this is that the KERN_ARND sysctl only returns entropy if it thinks it has some to share, and with RANDOM_DUMMY or no 'device random' this will never happen. 20150623: An additional fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284717. 20150616: FreeBSD's old make (fmake) has been removed from the system. It is available as the devel/fmake port or via pkg install fmake. 20150615: The fix for the issue described in the 20150614 sendmail entry below has been committed in revision 284436. The work around described in that entry is no longer needed unless the default setting is overridden by a confDH_PARAMETERS configuration setting of '5' or pointing to a 512 bit DH parameter file. 20150614: ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf and devel/kyua to version 0.20+ and adjust any calling code to work with Kyuafile and kyua. 20150614: The import of openssl to address the FreeBSD-SA-15:10.openssl security advisory includes a change which rejects handshakes with DH parameters below 768 bits. sendmail releases prior to 8.15.2 (not yet released), defaulted to a 512 bit DH parameter setting for client connections. To work around this interoperability, sendmail can be configured to use a 2048 bit DH parameter by: 1. 
	Edit /etc/mail/`hostname`.mc
	2. If a setting for confDH_PARAMETERS does not exist or exists and is set to a string beginning with '5', replace it with '2'.
	3. If a setting for confDH_PARAMETERS exists and is set to a file path, create a new file with:
		openssl dhparam -out /path/to/file 2048
	4. Rebuild the .cf file:
		cd /etc/mail/; make; make install
	5. Restart sendmail:
		cd /etc/mail/; make restart

	A sendmail patch is coming, at which time this file will be updated.

20150604:
	Generation of legacy formatted entries has been disabled by default in pwd_mkdb(8), as all base system consumers of the legacy formatted entries were converted to use the new format by default when the new, machine independent format was added; it has been supported since FreeBSD 5.x. Please see the pwd_mkdb(8) manual page for further details.

20150525:
	Clang and llvm have been upgraded to 3.6.1 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher.

20150521:
	TI platform code switched to using vendor DTS files and this update may break existing systems running on Beaglebone, Beaglebone Black, and Pandaboard:

	- dtb files should be regenerated/reinstalled. Filenames are the same but content is different now
	- GPIO addressing was changed, now each GPIO bank (32 pins per bank) has its own /dev/gpiocX device, e.g. pin 121 on /dev/gpioc0 in the old addressing scheme is now pin 25 on /dev/gpioc3.
	- Pandaboard: /etc/ttys should be updated, serial console device is now /dev/ttyu2, not /dev/ttyu0

20150501:
	soelim(1) from gnu/usr.bin/groff has been replaced by usr.bin/soelim. If you need the GNU extension from groff soelim(1), install groff from package: pkg install groff, or via ports: textproc/groff.

20150423:
	chmod, chflags, chown and chgrp now affect symlinks in -R mode as defined in symlink(7); previously symlinks were silently ignored.

20150415:
	The const qualifier has been removed from iconv(3) to comply with POSIX. The ports tree is aware of this from r384038 onwards.

20150416:
	Libraries specified by LIBADD in Makefiles must have a corresponding DPADD_ variable to ensure correct dependencies. This is now enforced in src.libnames.mk.

20150324:
	Support for SATA controllers that are handled by the more functional ahci(4), siis(4) and mvs(4) drivers has been removed from the legacy ata(4) driver. The ataahci and ataadaptec kernel modules were removed completely, replaced by the ahci and mvs modules respectively.

20150315:
	Clang, llvm and lldb have been upgraded to 3.6.0 release. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0 or higher.

20150307:
	The 32-bit PowerPC kernel has been changed to a position-independent executable. This can only be booted with a version of loader(8) newer than January 31, 2015, so make sure to update both world and kernel before rebooting.

20150217:
	If you are running a -CURRENT kernel since r273872 (Oct 30th, 2014), but before r278950, the RNG was not seeded properly. Immediately upgrade the kernel to r278950 or later and regenerate any keys (e.g. ssh keys or openssl keys) that were generated with a kernel from that range. This does not affect programs that directly used /dev/random or /dev/urandom. All userland uses of arc4random(3) are affected.

20150210:
	The autofs(4) ABI was changed in order to restore binary compatibility with 10.1-RELEASE. The automountd(8) daemon needs to be rebuilt to work with the new kernel.
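	For the GPIO renumbering in the 20150521 entry above, the conversion is simply bank = old_pin / 32 and pin = old_pin % 32; for example 121 = 3 * 32 + 25, which is why old pin 121 on /dev/gpioc0 becomes pin 25 on /dev/gpioc3.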
20150131: The powerpc64 kernel has been changed to a position-independent executable. This can only be booted with a new version of loader(8), so make sure to update both world and kernel before rebooting. 20150118: Clang and llvm have been upgraded to 3.5.1 release. This is a bugfix only release, no new features have been added. Please see the 20141231 entry below for information about prerequisites and upgrading, if you are not already using 3.5.0. 20150107: ELF tools addr2line, elfcopy (strip), nm, size, and strings are now taken from the ELF Tool Chain project rather than GNU binutils. They should be drop-in replacements, with the addition of arm64 support. The WITHOUT_ELFTOOLCHAIN_TOOLS= knob may be used to obtain the binutils tools, if necessary. See 20150805 for updated information. 20150105: The default Unbound configuration now enables remote control using a local socket. Users who have already enabled the local_unbound service should regenerate their configuration by running "service local_unbound setup" as root. 20150102: The GNU texinfo and GNU info pages have been removed. To be able to view GNU info pages please install texinfo from ports. 20141231: Clang, llvm and lldb have been upgraded to 3.5.0 release. As of this release, a prerequisite for building clang, llvm and lldb is a C++11 capable compiler and C++11 standard library. This means that to be able to successfully build the cross-tools stage of buildworld, with clang as the bootstrap compiler, your system compiler or cross compiler should either be clang 3.3 or later, or gcc 4.8 or later, and your system C++ library should be libc++, or libdstdc++ from gcc 4.8 or later. On any standard FreeBSD 10.x or 11.x installation, where clang and libc++ are on by default (that is, on x86 or arm), this should work out of the box. On 9.x installations where clang is enabled by default, e.g. on x86 and powerpc, libc++ will not be enabled by default, so libc++ should be built (with clang) and installed first. If both clang and libc++ are missing, build clang first, then use it to build libc++. On 8.x and earlier installations, upgrade to 9.x first, and then follow the instructions for 9.x above. Sparc64 and mips users are unaffected, as they still use gcc 4.2.1 by default, and do not build clang. Many embedded systems are resource constrained, and will not be able to build clang in a reasonable time, or in some cases at all. In those cases, cross building bootable systems on amd64 is a workaround. This new version of clang introduces a number of new warnings, of which the following are most likely to appear: -Wabsolute-value This warns in two cases, for both C and C++: * When the code is trying to take the absolute value of an unsigned quantity, which is effectively a no-op, and almost never what was intended. The code should be fixed, if at all possible. If you are sure that the unsigned quantity can be safely cast to signed, without loss of information or undefined behavior, you can add an explicit cast, or disable the warning. * When the code is trying to take an absolute value, but the called abs() variant is for the wrong type, which can lead to truncation. If you want to disable the warning instead of fixing the code, please make sure that truncation will not occur, or it might lead to unwanted side-effects. -Wtautological-undefined-compare and -Wundefined-bool-conversion These warn when C++ code is trying to compare 'this' against NULL, while 'this' should never be NULL in well-defined C++ code. 
	However, there is some legacy (pre C++11) code out there, which actively abuses this feature, which was less strictly defined in previous C++ versions. Squid and openjdk do this, for example. The warning can be turned off for C++98 and earlier, but compiling the code in C++11 mode might result in unexpected behavior; for example, the parts of the program that are unreachable could be optimized away.

20141222:
	The old NFS client and server (kernel options NFSCLIENT, NFSSERVER) kernel sources have been removed. The .h files remain, since some utilities include them. This will need to be fixed later. If "mount -t oldnfs ..." is attempted, it will fail. If the "-o" option on mountd(8), nfsd(8) or nfsstat(1) is used, the utilities will report errors.

20141121:
	The handling of LOCAL_LIB_DIRS has been altered to skip addition of directories to the top level SUBDIR variable when their parent directory is included in LOCAL_DIRS. Users with build systems with such hierarchies and without SUBDIR entries in the parent directory Makefiles should add them or add the directories to LOCAL_DIRS.

20141109:
	faith(4) and faithd(8) have been removed from the base system. Faith has been obsolete for a very long time.

20141104:
	vt(4), the new console driver, is enabled by default. It brings support for Unicode and double-width characters, as well as support for UEFI and integration with the KMS kernel video drivers.

	You may need to update your console settings in /etc/rc.conf, most probably the keymap. During boot, /etc/rc.d/syscons will indicate what you need to do.

	vt(4) still has issues and lacks some features compared to syscons(4). See the wiki for up-to-date information: https://wiki.freebsd.org/Newcons

	If you want to keep using syscons(4), you can do so by adding the following line to /boot/loader.conf:

		kern.vty=sc

20141102:
	pjdfstest has been integrated into kyua as an opt-in test suite. Please see share/doc/pjdfstest/README for more details on how to execute it.

20141009:
	gperf has been removed from the base system for architectures that use clang. Ports that require gperf will obtain it from the devel/gperf port.

20140923:
	pjdfstest has been moved from tools/regression/pjdfstest to contrib/pjdfstest .

20140922:
	At svn r271982, the default linux compat kernel ABI has been adjusted to 2.6.18 in support of the linux-c6 compat ports infrastructure update. If you wish to continue using the linux-f10 compat ports, add compat.linux.osrelease=2.6.16 to your local sysctl.conf. Users are encouraged to update their linux-compat packages to linux-c6 during their next update cycle.

20140729:
	The ofwfb driver, used to provide a graphics console on PowerPC when using vt(4), no longer allows mmap() of all physical memory. This will prevent Xorg on PowerPC with some ATI graphics cards from initializing properly unless x11-servers/xorg-server is updated to 1.12.4_8 or newer.

20140723:
	The xdev targets have been converted to using TARGET and TARGET_ARCH instead of XDEV and XDEV_ARCH.

20140719:
	The default unbound configuration has been modified to address issues with reverse lookups on networks that use private address ranges. If you use the local_unbound service, run "service local_unbound setup" as root to regenerate your configuration, then "service local_unbound reload" to load the new configuration.

20140709:
	The GNU texinfo and GNU info pages are no longer built and installed; the WITH_INFO knob has been added to allow building and installing them again.
UPDATE: see 20150102 entry on texinfo's removal 20140708: The GNU readline library is now an INTERNALLIB - that is, it is statically linked into consumers (GDB and variants) in the base system, and the shared library is no longer installed. The devel/readline port is available for third party software that requires readline. 20140702: The Itanium architecture (ia64) has been removed from the list of known architectures. This is the first step in the removal of the architecture. 20140701: Commit r268115 has added NFSv4.1 server support, merged from projects/nfsv4.1-server. Since this includes changes to the internal interfaces between the NFS related modules, a full build of the kernel and modules will be necessary. __FreeBSD_version has been bumped. 20140629: The WITHOUT_VT_SUPPORT kernel config knob has been renamed WITHOUT_VT. (The other _SUPPORT knobs have a consistent meaning which differs from the behaviour controlled by this knob.) 20140619: Maximal length of the serial number in CTL was increased from 16 to 64 chars, that breaks ABI. All CTL-related tools, such as ctladm and ctld, need to be rebuilt to work with a new kernel. 20140606: The libatf-c and libatf-c++ major versions were downgraded to 0 and 1 respectively to match the upstream numbers. They were out of sync because, when they were originally added to FreeBSD, the upstream versions were not respected. These libraries are private and not yet built by default, so renumbering them should be a non-issue. However, unclean source trees will yield broken test programs once the operator executes "make delete-old-libs" after a "make installworld". Additionally, the atf-sh binary was made private by moving it into /usr/libexec/. Already-built shell test programs will keep the path to the old binary so they will break after "make delete-old" is run. If you are using WITH_TESTS=yes (not the default), wipe the object tree and rebuild from scratch to prevent spurious test failures. This is only needed once: the misnumbered libraries and misplaced binaries have been added to OptionalObsoleteFiles.inc so they will be removed during a clean upgrade. 20140512: Clang and llvm have been upgraded to 3.4.1 release. 20140508: We bogusly installed src.opts.mk in /usr/share/mk. This file should be removed to avoid issues in the future (and has been added to ObsoleteFiles.inc). 20140505: /etc/src.conf now affects only builds of the FreeBSD src tree. In the past, it affected all builds that used the bsd.*.mk files. The old behavior was a bug, but people may have relied upon it. To get this behavior back, you can .include /etc/src.conf from /etc/make.conf (which is still global and isn't changed). This also changes the behavior of incremental builds inside the tree of individual directories. Set MAKESYSPATH to ".../share/mk" to do that. Although this has survived make universe and some upgrade scenarios, other upgrade scenarios may have broken. At least one form of temporary breakage was fixed with MAKESYSPATH settings for buildworld as well... In cases where MAKESYSPATH isn't working with this setting, you'll need to set it to the full path to your tree. One side effect of all this cleaning up is that bsd.compiler.mk is no longer implicitly included by bsd.own.mk. If you wish to use COMPILER_TYPE, you must now explicitly include bsd.compiler.mk as well. 20140430: The lindev device has been removed since /dev/full has been made a standard device. __FreeBSD_version has been bumped. 
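	For background, /dev/full behaves like /dev/zero for reads, but
	every write to it fails with ENOSPC, which makes it convenient
	for exercising error-handling paths that are otherwise hard to
	trigger.  A minimal illustration (example code only, not part
	of the base system):

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int
	main(void)
	{
		int fd = open("/dev/full", O_WRONLY);

		if (fd == -1) {
			perror("open /dev/full");
			return (1);
		}
		/* Writes to /dev/full always fail with ENOSPC. */
		if (write(fd, "x", 1) == -1)
			printf("write failed as expected: %s\n",
			    strerror(errno));
		close(fd);
		return (0);
	}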
20140424: The knob WITHOUT_VI was added to the base system, which controls building ex(1), vi(1), etc. Older releases of FreeBSD required ex(1) in order to reorder files share/termcap and didn't build ex(1) as a build tool, so building/installing with WITH_VI is highly advised for build hosts for older releases. This issue has been fixed in stable/9 and stable/10 in r277022 and r276991, respectively. 20140418: The YES_HESIOD knob has been removed. It has been obsolete for a decade. Please move to using WITH_HESIOD instead or your builds will silently lack HESIOD. 20140405: The uart(4) driver has been changed with respect to its handling of the low-level console. Previously the uart(4) driver prevented any process from changing the baudrate or the CLOCAL and HUPCL control flags. By removing the restrictions, operators can make changes to the serial console port without having to reboot. However, when getty(8) is started on the serial device that is associated with the low-level console, a misconfigured terminal line in /etc/ttys will now have a real impact. Before upgrading the kernel, make sure that /etc/ttys has the serial console device configured as 3wire without baudrate to preserve the previous behaviour. E.g: ttyu0 "/usr/libexec/getty 3wire" vt100 on secure 20140306: Support for libwrap (TCP wrappers) in rpcbind was disabled by default to improve performance. To re-enable it, if needed, run rpcbind with command line option -W. 20140226: Switched back to the GPL dtc compiler due to updates in the upstream dts files not being supported by the BSDL dtc compiler. You will need to rebuild your kernel toolchain to pick up the new compiler. Core dumps may result while building dtb files during a kernel build if you fail to do so. Set WITHOUT_GPL_DTC if you require the BSDL compiler. 20140216: Clang and llvm have been upgraded to 3.4 release. 20140216: The nve(4) driver has been removed. Please use the nfe(4) driver for NVIDIA nForce MCP Ethernet adapters instead. 20140212: An ABI incompatibility crept into the libc++ 3.4 import in r261283. This could cause certain C++ applications using shared libraries built against the previous version of libc++ to crash. The incompatibility has now been fixed, but any C++ applications or shared libraries built between r261283 and r261801 should be recompiled. 20140204: OpenSSH will now ignore errors caused by kernel lacking of Capsicum capability mode support. Please note that enabling the feature in kernel is still highly recommended. 20140131: OpenSSH is now built with sandbox support, and will use sandbox as the default privilege separation method. This requires Capsicum capability mode support in kernel. 20140128: The libelf and libdwarf libraries have been updated to newer versions from upstream. Shared library version numbers for these two libraries were bumped. Any ports or binaries requiring these two libraries should be recompiled. __FreeBSD_version is bumped to 1100006. 20140110: If a Makefile in a tests/ directory was auto-generating a Kyuafile instead of providing an explicit one, this would prevent such Makefile from providing its own Kyuafile in the future during NO_CLEAN builds. This has been fixed in the Makefiles but manual intervention is needed to clean an objdir if you use NO_CLEAN: # find /usr/obj -name Kyuafile | xargs rm -f 20131213: The behavior of gss_pseudo_random() for the krb5 mechanism has changed, for applications requesting a longer random string than produced by the underlying enctype's pseudo-random() function. 
In particular, the random string produced from a session key of enctype aes256-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will be different at the 17th octet and later, after this change. The counter used in the PRF+ construction is now encoded as a big-endian integer in accordance with RFC 4402. __FreeBSD_version is bumped to 1100004. 20131108: The WITHOUT_ATF build knob has been removed and its functionality has been subsumed into the more generic WITHOUT_TESTS. If you were using the former to disable the build of the ATF libraries, you should change your settings to use the latter. 20131025: The default version of mtree is nmtree which is obtained from NetBSD. The output is generally the same, but may vary slightly. If you found you need identical output adding "-F freebsd9" to the command line should do the trick. For the time being, the old mtree is available as fmtree. 20131014: libbsdyml has been renamed to libyaml and moved to /usr/lib/private. This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg 1.1.4_8 and verify bsdyml not linked in, before running "make delete-old-libs": # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean or # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml 20131010: The stable/10 branch has been created in subversion from head revision r256279. COMMON ITEMS: General Notes ------------- Avoid using make -j when upgrading. While generally safe, there are sometimes problems using -j to upgrade. If your upgrade fails with -j, please try again without -j. From time to time in the past there have been problems using -j with buildworld and/or installworld. This is especially true when upgrading between "distant" versions (eg one that cross a major release boundary or several minor releases, or when several months have passed on the -current branch). Sometimes, obscure build problems are the result of environment poisoning. This can happen because the make utility reads its environment when searching for values for global variables. To run your build attempts in an "environmental clean room", prefix all make commands with 'env -i '. See the env(1) manual page for more details. When upgrading from one major version to another it is generally best to upgrade to the latest code in the currently installed branch first, then do an upgrade to the new branch. This is the best-tested upgrade path, and has the highest probability of being successful. Please try this approach if you encounter problems with a major version upgrade. Since the stable 4.x branch point, one has generally been able to upgrade from anywhere in the most recent stable branch to head / current (or even the last couple of stable branches). See the top of this file when there's an exception. When upgrading a live system, having a root shell around before installing anything can help undo problems. Not having a root shell around can lead to problems if pam has changed too much from your starting point to allow continued authentication after the upgrade. This file should be read as a log of events. When a later event changes information of a prior event, the prior event should not be deleted. Instead, a pointer to the entry with the new information should be placed in the old entry. Readers of this file should also sanity check older entries before relying on them blindly. Authors of new entries should write them with this in mind. ZFS notes --------- When upgrading the boot ZFS pool to a new version, always follow these two steps: 1.) 
recompile and reinstall the ZFS boot loader and boot block (this is part of "make buildworld" and "make installworld") 2.) update the ZFS boot block on your boot drive The following example updates the ZFS boot block on the first partition (freebsd-boot) of a GPT partitioned drive ada0: "gpart bootcode -p /boot/gptzfsboot -i 1 ada0" Non-boot pools do not need these updates. To build a kernel ----------------- If you are updating from a prior version of FreeBSD (even one just a few days old), you should follow this procedure. It is the most failsafe as it uses a /usr/obj tree with a fresh mini-buildworld, make kernel-toolchain make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE To test a kernel once --------------------- If you just want to boot a kernel once (because you are not sure if it works, or if you want to boot a known bad kernel to provide debugging information) run make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel nextboot -k testkernel To rebuild everything and install it on the current system. ----------------------------------------------------------- # Note: sometimes if you are running current you gotta do more than # is listed here if you are upgrading from a really old current. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] To cross-install current onto a separate partition -------------------------------------------------- # In this approach we use a separate partition to hold # current's root, 'usr', and 'var' directories. A partition # holding "/", "/usr" and "/var" should be about 2GB in # size. make buildworld make buildkernel KERNCONF=YOUR_KERNEL_HERE make installworld DESTDIR=${CURRENT_ROOT} -DDB_FROM_SRC make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT} cp /etc/fstab ${CURRENT_ROOT}/etc/fstab # if newfs'd To upgrade in-place from stable to current ---------------------------------------------- make buildworld [9] make buildkernel KERNCONF=YOUR_KERNEL_HERE [8] make installkernel KERNCONF=YOUR_KERNEL_HERE [1] [3] mergemaster -Fp [5] make installworld mergemaster -Fi [4] make delete-old [6] Make sure that you've read the UPDATING file to understand the tweaks to various things you need. At this point in the life cycle of current, things change often and you are on your own to cope. The defaults can also change, so please read ALL of the UPDATING entries. Also, if you are tracking -current, you must be subscribed to freebsd-current@freebsd.org. Make sure that before you update your sources that you have read and understood all the recent messages there. If in doubt, please track -stable which has much fewer pitfalls. [1] If you have third party modules, such as vmware, you should disable them at this point so they don't crash your system on reboot. [3] From the bootblocks, boot -s, and then do fsck -p mount -u / mount -a cd src adjkerntz -i # if CMOS is wall time Also, when doing a major release upgrade, it is required that you boot into single user mode to do the installworld. [4] Note: This step is non-optional. Failure to do this step can result in a significant reduction in the functionality of the system. 
Attempting to do it by hand is not recommended and those that pursue this avenue should read this file carefully, as well as the archives of freebsd-current and freebsd-hackers mailing lists for potential gotchas. The -U option is also useful to consider. See mergemaster(8) for more information. [5] Usually this step is a no-op. However, from time to time you may need to do this if you get unknown user in the following step. It never hurts to do it all the time. You may need to install a new mergemaster (cd src/usr.sbin/mergemaster && make install) after the buildworld before this step if you last updated from current before 20130425 or from -stable before 20130430. [6] This only deletes old files and directories. Old libraries can be deleted by "make delete-old-libs", but you have to make sure that no program is using those libraries anymore. [8] In order to have a kernel that can run the 4.x binaries needed to do an installworld, you must include the COMPAT_FREEBSD4 option in your kernel. Failure to do so may leave you with a system that is hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is required to run the 5.x binaries on more recent kernels. And so on for COMPAT_FREEBSD6 and COMPAT_FREEBSD7. Make sure that you merge any new devices from GENERIC since the last time you updated your kernel config file. [9] If CPUTYPE is defined in your /etc/make.conf, make sure to use the "?=" instead of the "=" assignment operator, so that buildworld can override the CPUTYPE if it needs to. MAKEOBJDIRPREFIX must be defined in an environment variable, and not on the command line, or in /etc/make.conf. buildworld will warn if it is improperly defined. FORMAT: This file contains a list, in reverse chronological order, of major breakages in tracking -current. It is not guaranteed to be a complete list of such breakages, and only contains entries since September 23, 2011. If you need to see UPDATING entries from before that date, you will need to fetch an UPDATING file from an older FreeBSD release. Copyright information: Copyright 1998-2009 M. Warner Losh. All Rights Reserved. Redistribution, publication, translation and use, with or without modification, in full or in part, in any form or format of this document are permitted without further permission from the author. THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Contact Warner Losh if you have any questions about your use of this document. $FreeBSD$ Index: projects/openssl111/crypto/openssh/auth2.c =================================================================== --- projects/openssl111/crypto/openssh/auth2.c (revision 339239) +++ projects/openssl111/crypto/openssh/auth2.c (revision 339240) @@ -1,824 +1,824 @@ /* $OpenBSD: auth2.c,v 1.149 2018/07/11 18:53:29 markus Exp $ */ /* * Copyright (c) 2000 Markus Friedl. All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "includes.h" __RCSID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include "atomicio.h" #include "xmalloc.h" #include "ssh2.h" #include "packet.h" #include "log.h" #include "sshbuf.h" #include "misc.h" #include "servconf.h" #include "compat.h" #include "sshkey.h" #include "hostfile.h" #include "auth.h" #include "dispatch.h" #include "pathnames.h" #include "sshbuf.h" #include "ssherr.h" #include "blacklist_client.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "ssherr.h" #include "digest.h" /* import */ extern ServerOptions options; extern u_char *session_id2; extern u_int session_id2_len; extern struct sshbuf *loginmsg; /* methods */ extern Authmethod method_none; extern Authmethod method_pubkey; extern Authmethod method_passwd; extern Authmethod method_kbdint; extern Authmethod method_hostbased; #ifdef GSSAPI extern Authmethod method_gssapi; #endif Authmethod *authmethods[] = { &method_none, &method_pubkey, #ifdef GSSAPI &method_gssapi, #endif &method_passwd, &method_kbdint, &method_hostbased, NULL }; /* protocol */ static int input_service_request(int, u_int32_t, struct ssh *); static int input_userauth_request(int, u_int32_t, struct ssh *); /* helper */ static Authmethod *authmethod_lookup(Authctxt *, const char *); static char *authmethods_get(Authctxt *authctxt); #define MATCH_NONE 0 /* method or submethod mismatch */ #define MATCH_METHOD 1 /* method matches (no submethod specified) */ #define MATCH_BOTH 2 /* method and submethod match */ #define MATCH_PARTIAL 3 /* method matches, submethod can't be checked */ static int list_starts_with(const char *, const char *, const char *); char * auth2_read_banner(void) { struct stat st; char *banner = NULL; size_t len, n; int fd; if ((fd = open(options.banner, O_RDONLY)) == -1) return (NULL); if (fstat(fd, &st) == -1) { close(fd); return (NULL); } if (st.st_size <= 0 || st.st_size > 1*1024*1024) { close(fd); return (NULL); } len = (size_t)st.st_size; /* truncate */ banner = xmalloc(len + 1); n = atomicio(read, fd, banner, len); close(fd); if (n != len) { free(banner); return (NULL); } banner[n] = '\0'; return (banner); } void userauth_send_banner(const char *msg) { packet_start(SSH2_MSG_USERAUTH_BANNER); packet_put_cstring(msg); packet_put_cstring(""); 
/* language, unused */ packet_send(); debug("%s: sent", __func__); } static void userauth_banner(void) { char *banner = NULL; if (options.banner == NULL) return; if ((banner = PRIVSEP(auth2_read_banner())) == NULL) goto done; userauth_send_banner(banner); done: free(banner); } /* * loop until authctxt->success == TRUE */ void do_authentication2(Authctxt *authctxt) { struct ssh *ssh = active_state; /* XXX */ ssh->authctxt = authctxt; /* XXX move to caller */ ssh_dispatch_init(ssh, &dispatch_protocol_error); ssh_dispatch_set(ssh, SSH2_MSG_SERVICE_REQUEST, &input_service_request); ssh_dispatch_run_fatal(ssh, DISPATCH_BLOCK, &authctxt->success); ssh->authctxt = NULL; } /*ARGSUSED*/ static int input_service_request(int type, u_int32_t seq, struct ssh *ssh) { Authctxt *authctxt = ssh->authctxt; u_int len; int acceptit = 0; char *service = packet_get_cstring(&len); packet_check_eom(); if (authctxt == NULL) fatal("input_service_request: no authctxt"); if (strcmp(service, "ssh-userauth") == 0) { if (!authctxt->success) { acceptit = 1; /* now we can handle user-auth requests */ ssh_dispatch_set(ssh, SSH2_MSG_USERAUTH_REQUEST, &input_userauth_request); } } /* XXX all other service requests are denied */ if (acceptit) { packet_start(SSH2_MSG_SERVICE_ACCEPT); packet_put_cstring(service); packet_send(); packet_write_wait(); } else { debug("bad service request %s", service); packet_disconnect("bad service request %s", service); } free(service); return 0; } #define MIN_FAIL_DELAY_SECONDS 0.005 static double user_specific_delay(const char *user) { char b[512]; size_t len = ssh_digest_bytes(SSH_DIGEST_SHA512); u_char *hash = xmalloc(len); double delay; (void)snprintf(b, sizeof b, "%llu%s", (unsigned long long)options.timing_secret, user); if (ssh_digest_memory(SSH_DIGEST_SHA512, b, strlen(b), hash, len) != 0) fatal("%s: ssh_digest_memory", __func__); /* 0-4.2 ms of delay */ delay = (double)PEEK_U32(hash) / 1000 / 1000 / 1000 / 1000; freezero(hash, len); debug3("%s: user specific delay %0.3lfms", __func__, delay/1000); return MIN_FAIL_DELAY_SECONDS + delay; } static void ensure_minimum_time_since(double start, double seconds) { struct timespec ts; double elapsed = monotime_double() - start, req = seconds, remain; /* if we've already passed the requested time, scale up */ while ((remain = seconds - elapsed) < 0.0) seconds *= 2; ts.tv_sec = remain; ts.tv_nsec = (remain - ts.tv_sec) * 1000000000; debug3("%s: elapsed %0.3lfms, delaying %0.3lfms (requested %0.3lfms)", __func__, elapsed*1000, remain*1000, req*1000); nanosleep(&ts, NULL); } /*ARGSUSED*/ static int input_userauth_request(int type, u_int32_t seq, struct ssh *ssh) { Authctxt *authctxt = ssh->authctxt; Authmethod *m = NULL; char *user, *service, *method, *style = NULL; int authenticated = 0; double tstart = monotime_double(); #ifdef HAVE_LOGIN_CAP login_cap_t *lc; const char *from_host, *from_ip; #endif if (authctxt == NULL) fatal("input_userauth_request: no authctxt"); user = packet_get_cstring(NULL); service = packet_get_cstring(NULL); method = packet_get_cstring(NULL); debug("userauth-request for user %s service %s method %s", user, service, method); debug("attempt %d failures %d", authctxt->attempt, authctxt->failures); if ((style = strchr(user, ':')) != NULL) *style++ = 0; if (authctxt->attempt++ == 0) { /* setup auth context */ authctxt->pw = PRIVSEP(getpwnamallow(user)); authctxt->user = xstrdup(user); if (authctxt->pw && strcmp(service, "ssh-connection")==0) { authctxt->valid = 1; debug2("%s: setting up authctxt for %s", __func__, user); } else 
{ /* Invalid user, fake password information */ authctxt->pw = fakepw(); #ifdef SSH_AUDIT_EVENTS PRIVSEP(audit_event(SSH_INVALID_USER)); #endif } #ifdef USE_PAM if (options.use_pam) PRIVSEP(start_pam(authctxt)); #endif ssh_packet_set_log_preamble(ssh, "%suser %s", authctxt->valid ? "authenticating " : "invalid ", user); setproctitle("%s%s", authctxt->valid ? user : "unknown", use_privsep ? " [net]" : ""); authctxt->service = xstrdup(service); authctxt->style = style ? xstrdup(style) : NULL; if (use_privsep) mm_inform_authserv(service, style); userauth_banner(); if (auth2_setup_methods_lists(authctxt) != 0) packet_disconnect("no authentication methods enabled"); } else if (strcmp(user, authctxt->user) != 0 || strcmp(service, authctxt->service) != 0) { packet_disconnect("Change of username or service not allowed: " "(%s,%s) -> (%s,%s)", authctxt->user, authctxt->service, user, service); } #ifdef HAVE_LOGIN_CAP if (authctxt->pw != NULL && - (lc = login_getpwclass(authctxt->pw)) != NULL) { + (lc = PRIVSEP(login_getpwclass(authctxt->pw))) != NULL) { logit("user %s login class %s", authctxt->pw->pw_name, authctxt->pw->pw_class); from_host = auth_get_canonical_hostname(ssh, options.use_dns); from_ip = ssh_remote_ipaddr(ssh); if (!auth_hostok(lc, from_host, from_ip)) { logit("Denied connection for %.200s from %.200s [%.200s].", authctxt->pw->pw_name, from_host, from_ip); packet_disconnect("Sorry, you are not allowed to connect."); } if (!auth_timeok(lc, time(NULL))) { logit("LOGIN %.200s REFUSED (TIME) FROM %.200s", authctxt->pw->pw_name, from_host); packet_disconnect("Logins not available right now."); } - login_close(lc); + PRIVSEP(login_close(lc)); } #endif /* HAVE_LOGIN_CAP */ /* reset state */ auth2_challenge_stop(ssh); #ifdef GSSAPI /* XXX move to auth2_gssapi_stop() */ ssh_dispatch_set(ssh, SSH2_MSG_USERAUTH_GSSAPI_TOKEN, NULL); ssh_dispatch_set(ssh, SSH2_MSG_USERAUTH_GSSAPI_EXCHANGE_COMPLETE, NULL); #endif auth2_authctxt_reset_info(authctxt); authctxt->postponed = 0; authctxt->server_caused_failure = 0; /* try to authenticate user */ m = authmethod_lookup(authctxt, method); if (m != NULL && authctxt->failures < options.max_authtries) { debug2("input_userauth_request: try method %s", method); authenticated = m->userauth(ssh); } if (!authctxt->authenticated) ensure_minimum_time_since(tstart, user_specific_delay(authctxt->user)); userauth_finish(ssh, authenticated, method, NULL); free(service); free(user); free(method); return 0; } void userauth_finish(struct ssh *ssh, int authenticated, const char *method, const char *submethod) { Authctxt *authctxt = ssh->authctxt; char *methods; int partial = 0; if (!authctxt->valid && authenticated) fatal("INTERNAL ERROR: authenticated invalid user %s", authctxt->user); if (authenticated && authctxt->postponed) fatal("INTERNAL ERROR: authenticated and postponed"); /* Special handling for root */ if (authenticated && authctxt->pw->pw_uid == 0 && !auth_root_allowed(ssh, method)) { authenticated = 0; #ifdef SSH_AUDIT_EVENTS PRIVSEP(audit_event(SSH_LOGIN_ROOT_DENIED)); #endif } if (authenticated && options.num_auth_methods != 0) { if (!auth2_update_methods_lists(authctxt, method, submethod)) { authenticated = 0; partial = 1; } } /* Log before sending the reply */ auth_log(authctxt, authenticated, partial, method, submethod); /* Update information exposed to session */ if (authenticated || partial) auth2_update_session_info(authctxt, method, submethod); if (authctxt->postponed) return; #ifdef USE_PAM if (options.use_pam && authenticated) { int r; if 
(!PRIVSEP(do_pam_account())) { /* if PAM returned a message, send it to the user */ if (sshbuf_len(loginmsg) > 0) { if ((r = sshbuf_put(loginmsg, "\0", 1)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); userauth_send_banner(sshbuf_ptr(loginmsg)); packet_write_wait(); } fatal("Access denied for user %s by PAM account " "configuration", authctxt->user); } } #endif if (authenticated == 1) { /* turn off userauth */ ssh_dispatch_set(ssh, SSH2_MSG_USERAUTH_REQUEST, &dispatch_protocol_ignore); packet_start(SSH2_MSG_USERAUTH_SUCCESS); packet_send(); packet_write_wait(); /* now we can break out */ authctxt->success = 1; ssh_packet_set_log_preamble(ssh, "user %s", authctxt->user); } else { /* Allow initial try of "none" auth without failure penalty */ if (!partial && !authctxt->server_caused_failure && (authctxt->attempt > 1 || strcmp(method, "none") != 0)) { authctxt->failures++; BLACKLIST_NOTIFY(BLACKLIST_AUTH_FAIL, "ssh"); } if (authctxt->failures >= options.max_authtries) { #ifdef SSH_AUDIT_EVENTS PRIVSEP(audit_event(SSH_LOGIN_EXCEED_MAXTRIES)); #endif auth_maxtries_exceeded(authctxt); } methods = authmethods_get(authctxt); debug3("%s: failure partial=%d next methods=\"%s\"", __func__, partial, methods); packet_start(SSH2_MSG_USERAUTH_FAILURE); packet_put_cstring(methods); packet_put_char(partial); packet_send(); packet_write_wait(); free(methods); } } /* * Checks whether method is allowed by at least one AuthenticationMethods * methods list. Returns 1 if allowed, or no methods lists configured. * 0 otherwise. */ int auth2_method_allowed(Authctxt *authctxt, const char *method, const char *submethod) { u_int i; /* * NB. authctxt->num_auth_methods might be zero as a result of * auth2_setup_methods_lists(), so check the configuration. */ if (options.num_auth_methods == 0) return 1; for (i = 0; i < authctxt->num_auth_methods; i++) { if (list_starts_with(authctxt->auth_methods[i], method, submethod) != MATCH_NONE) return 1; } return 0; } static char * authmethods_get(Authctxt *authctxt) { struct sshbuf *b; char *list; int i, r; if ((b = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); for (i = 0; authmethods[i] != NULL; i++) { if (strcmp(authmethods[i]->name, "none") == 0) continue; if (authmethods[i]->enabled == NULL || *(authmethods[i]->enabled) == 0) continue; if (!auth2_method_allowed(authctxt, authmethods[i]->name, NULL)) continue; if ((r = sshbuf_putf(b, "%s%s", sshbuf_len(b) ? "," : "", authmethods[i]->name)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } if ((list = sshbuf_dup_string(b)) == NULL) fatal("%s: sshbuf_dup_string failed", __func__); sshbuf_free(b); return list; } static Authmethod * authmethod_lookup(Authctxt *authctxt, const char *name) { int i; if (name != NULL) for (i = 0; authmethods[i] != NULL; i++) if (authmethods[i]->enabled != NULL && *(authmethods[i]->enabled) != 0 && strcmp(name, authmethods[i]->name) == 0 && auth2_method_allowed(authctxt, authmethods[i]->name, NULL)) return authmethods[i]; debug2("Unrecognized authentication method name: %s", name ? name : "NULL"); return NULL; } /* * Check a comma-separated list of methods for validity. Is need_enable is * non-zero, then also require that the methods are enabled. * Returns 0 on success or -1 if the methods list is invalid. 
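 * For example, with both methods enabled, "publickey,password" is a
 * valid list; a submethod may follow a colon, as in
 * "keyboard-interactive:bsdauth" (only the part before the colon is
 * compared against the known method names).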
*/ int auth2_methods_valid(const char *_methods, int need_enable) { char *methods, *omethods, *method, *p; u_int i, found; int ret = -1; if (*_methods == '\0') { error("empty authentication method list"); return -1; } omethods = methods = xstrdup(_methods); while ((method = strsep(&methods, ",")) != NULL) { for (found = i = 0; !found && authmethods[i] != NULL; i++) { if ((p = strchr(method, ':')) != NULL) *p = '\0'; if (strcmp(method, authmethods[i]->name) != 0) continue; if (need_enable) { if (authmethods[i]->enabled == NULL || *(authmethods[i]->enabled) == 0) { error("Disabled method \"%s\" in " "AuthenticationMethods list \"%s\"", method, _methods); goto out; } } found = 1; break; } if (!found) { error("Unknown authentication method \"%s\" in list", method); goto out; } } ret = 0; out: free(omethods); return ret; } /* * Prune the AuthenticationMethods supplied in the configuration, removing * any methods lists that include disabled methods. Note that this might * leave authctxt->num_auth_methods == 0, even when multiple required auth * has been requested. For this reason, all tests for whether multiple is * enabled should consult options.num_auth_methods directly. */ int auth2_setup_methods_lists(Authctxt *authctxt) { u_int i; if (options.num_auth_methods == 0) return 0; debug3("%s: checking methods", __func__); authctxt->auth_methods = xcalloc(options.num_auth_methods, sizeof(*authctxt->auth_methods)); authctxt->num_auth_methods = 0; for (i = 0; i < options.num_auth_methods; i++) { if (auth2_methods_valid(options.auth_methods[i], 1) != 0) { logit("Authentication methods list \"%s\" contains " "disabled method, skipping", options.auth_methods[i]); continue; } debug("authentication methods list %d: %s", authctxt->num_auth_methods, options.auth_methods[i]); authctxt->auth_methods[authctxt->num_auth_methods++] = xstrdup(options.auth_methods[i]); } if (authctxt->num_auth_methods == 0) { error("No AuthenticationMethods left after eliminating " "disabled methods"); return -1; } return 0; } static int list_starts_with(const char *methods, const char *method, const char *submethod) { size_t l = strlen(method); int match; const char *p; if (strncmp(methods, method, l) != 0) return MATCH_NONE; p = methods + l; match = MATCH_METHOD; if (*p == ':') { if (!submethod) return MATCH_PARTIAL; l = strlen(submethod); p += 1; if (strncmp(submethod, p, l)) return MATCH_NONE; p += l; match = MATCH_BOTH; } if (*p != ',' && *p != '\0') return MATCH_NONE; return match; } /* * Remove method from the start of a comma-separated list of methods. * Returns 0 if the list of methods did not start with that method or 1 * if it did. */ static int remove_method(char **methods, const char *method, const char *submethod) { char *omethods = *methods, *p; size_t l = strlen(method); int match; match = list_starts_with(omethods, method, submethod); if (match != MATCH_METHOD && match != MATCH_BOTH) return 0; p = omethods + l; if (submethod && match == MATCH_BOTH) p += 1 + strlen(submethod); /* include colon */ if (*p == ',') p++; *methods = xstrdup(p); free(omethods); return 1; } /* * Called after successful authentication. Will remove the successful method * from the start of each list in which it occurs. If it was the last method * in any list, then authentication is deemed successful. * Returns 1 if the method completed any authentication list or 0 otherwise. 
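 * For example, if AuthenticationMethods contains "publickey,password",
 * a successful publickey authentication leaves "password" as the
 * remaining entry in that list; once a list becomes empty, the user is
 * considered fully authenticated.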
*/ int auth2_update_methods_lists(Authctxt *authctxt, const char *method, const char *submethod) { u_int i, found = 0; debug3("%s: updating methods list after \"%s\"", __func__, method); for (i = 0; i < authctxt->num_auth_methods; i++) { if (!remove_method(&(authctxt->auth_methods[i]), method, submethod)) continue; found = 1; if (*authctxt->auth_methods[i] == '\0') { debug2("authentication methods list %d complete", i); return 1; } debug3("authentication methods list %d remaining: \"%s\"", i, authctxt->auth_methods[i]); } /* This should not happen, but would be bad if it did */ if (!found) fatal("%s: method not in AuthenticationMethods", __func__); return 0; } /* Reset method-specific information */ void auth2_authctxt_reset_info(Authctxt *authctxt) { sshkey_free(authctxt->auth_method_key); free(authctxt->auth_method_info); authctxt->auth_method_key = NULL; authctxt->auth_method_info = NULL; } /* Record auth method-specific information for logs */ void auth2_record_info(Authctxt *authctxt, const char *fmt, ...) { va_list ap; int i; free(authctxt->auth_method_info); authctxt->auth_method_info = NULL; va_start(ap, fmt); i = vasprintf(&authctxt->auth_method_info, fmt, ap); va_end(ap); if (i < 0 || authctxt->auth_method_info == NULL) fatal("%s: vasprintf failed", __func__); } /* * Records a public key used in authentication. This is used for logging * and to ensure that the same key is not subsequently accepted again for * multiple authentication. */ void auth2_record_key(Authctxt *authctxt, int authenticated, const struct sshkey *key) { struct sshkey **tmp, *dup; int r; if ((r = sshkey_from_private(key, &dup)) != 0) fatal("%s: copy key: %s", __func__, ssh_err(r)); sshkey_free(authctxt->auth_method_key); authctxt->auth_method_key = dup; if (!authenticated) return; /* If authenticated, make sure we don't accept this key again */ if ((r = sshkey_from_private(key, &dup)) != 0) fatal("%s: copy key: %s", __func__, ssh_err(r)); if (authctxt->nprev_keys >= INT_MAX || (tmp = recallocarray(authctxt->prev_keys, authctxt->nprev_keys, authctxt->nprev_keys + 1, sizeof(*authctxt->prev_keys))) == NULL) fatal("%s: reallocarray failed", __func__); authctxt->prev_keys = tmp; authctxt->prev_keys[authctxt->nprev_keys] = dup; authctxt->nprev_keys++; } /* Checks whether a key has already been previously used for authentication */ int auth2_key_already_used(Authctxt *authctxt, const struct sshkey *key) { u_int i; char *fp; for (i = 0; i < authctxt->nprev_keys; i++) { if (sshkey_equal_public(key, authctxt->prev_keys[i])) { fp = sshkey_fingerprint(authctxt->prev_keys[i], options.fingerprint_hash, SSH_FP_DEFAULT); debug3("%s: key already used: %s %s", __func__, sshkey_type(authctxt->prev_keys[i]), fp == NULL ? "UNKNOWN" : fp); free(fp); return 1; } } return 0; } /* * Updates authctxt->session_info with details of authentication. Should be * whenever an authentication method succeeds. */ void auth2_update_session_info(Authctxt *authctxt, const char *method, const char *submethod) { int r; if (authctxt->session_info == NULL) { if ((authctxt->session_info = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); } /* Append method[/submethod] */ if ((r = sshbuf_putf(authctxt->session_info, "%s%s%s", method, submethod == NULL ? "" : "/", submethod == NULL ? 
"" : submethod)) != 0) fatal("%s: append method: %s", __func__, ssh_err(r)); /* Append key if present */ if (authctxt->auth_method_key != NULL) { if ((r = sshbuf_put_u8(authctxt->session_info, ' ')) != 0 || (r = sshkey_format_text(authctxt->auth_method_key, authctxt->session_info)) != 0) fatal("%s: append key: %s", __func__, ssh_err(r)); } if (authctxt->auth_method_info != NULL) { /* Ensure no ambiguity here */ if (strchr(authctxt->auth_method_info, '\n') != NULL) fatal("%s: auth_method_info contains \\n", __func__); if ((r = sshbuf_put_u8(authctxt->session_info, ' ')) != 0 || (r = sshbuf_putf(authctxt->session_info, "%s", authctxt->auth_method_info)) != 0) { fatal("%s: append method info: %s", __func__, ssh_err(r)); } } if ((r = sshbuf_put_u8(authctxt->session_info, '\n')) != 0) fatal("%s: append: %s", __func__, ssh_err(r)); } Index: projects/openssl111/crypto/openssh/monitor.c =================================================================== --- projects/openssl111/crypto/openssh/monitor.c (revision 339239) +++ projects/openssl111/crypto/openssh/monitor.c (revision 339240) @@ -1,1875 +1,1906 @@ /* $OpenBSD: monitor.c,v 1.186 2018/07/20 03:46:34 djm Exp $ */ /* * Copyright 2002 Niels Provos * Copyright 2002 Markus Friedl * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*/ #include "includes.h" #include #include #include #include #include #include #ifdef HAVE_PATHS_H #include #endif #include #include #ifdef HAVE_STDINT_H #include #endif #include #include #include #include #include #ifdef HAVE_POLL_H #include #else # ifdef HAVE_SYS_POLL_H # include # endif #endif #ifdef WITH_OPENSSL #include #endif #include "openbsd-compat/sys-tree.h" #include "openbsd-compat/sys-queue.h" #include "openbsd-compat/openssl-compat.h" #include "atomicio.h" #include "xmalloc.h" #include "ssh.h" #include "sshkey.h" #include "sshbuf.h" #include "hostfile.h" #include "auth.h" #include "cipher.h" #include "kex.h" #include "dh.h" #include "auth-pam.h" #include "packet.h" #include "auth-options.h" #include "sshpty.h" #include "channels.h" #include "session.h" #include "sshlogin.h" #include "canohost.h" #include "log.h" #include "misc.h" #include "servconf.h" #include "monitor.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "monitor_fdpass.h" #include "compat.h" #include "ssh2.h" #include "authfd.h" #include "match.h" #include "ssherr.h" #ifdef GSSAPI static Gssctxt *gsscontext = NULL; #endif /* Imports */ extern ServerOptions options; extern u_int utmp_len; extern u_char session_id[]; extern struct sshbuf *loginmsg; extern struct sshauthopt *auth_opts; /* XXX move to permanent ssh->authctxt? */ /* State exported from the child */ static struct sshbuf *child_state; /* Functions on the monitor that answer unprivileged requests */ int mm_answer_moduli(int, struct sshbuf *); int mm_answer_sign(int, struct sshbuf *); +int mm_answer_login_getpwclass(int, struct sshbuf *); int mm_answer_pwnamallow(int, struct sshbuf *); int mm_answer_auth2_read_banner(int, struct sshbuf *); int mm_answer_authserv(int, struct sshbuf *); int mm_answer_authpassword(int, struct sshbuf *); int mm_answer_bsdauthquery(int, struct sshbuf *); int mm_answer_bsdauthrespond(int, struct sshbuf *); int mm_answer_keyallowed(int, struct sshbuf *); int mm_answer_keyverify(int, struct sshbuf *); int mm_answer_pty(int, struct sshbuf *); int mm_answer_pty_cleanup(int, struct sshbuf *); int mm_answer_term(int, struct sshbuf *); int mm_answer_rsa_keyallowed(int, struct sshbuf *); int mm_answer_rsa_challenge(int, struct sshbuf *); int mm_answer_rsa_response(int, struct sshbuf *); int mm_answer_sesskey(int, struct sshbuf *); int mm_answer_sessid(int, struct sshbuf *); #ifdef USE_PAM int mm_answer_pam_start(int, struct sshbuf *); int mm_answer_pam_account(int, struct sshbuf *); int mm_answer_pam_init_ctx(int, struct sshbuf *); int mm_answer_pam_query(int, struct sshbuf *); int mm_answer_pam_respond(int, struct sshbuf *); int mm_answer_pam_free_ctx(int, struct sshbuf *); #endif #ifdef GSSAPI int mm_answer_gss_setup_ctx(int, struct sshbuf *); int mm_answer_gss_accept_ctx(int, struct sshbuf *); int mm_answer_gss_userok(int, struct sshbuf *); int mm_answer_gss_checkmic(int, struct sshbuf *); #endif #ifdef SSH_AUDIT_EVENTS int mm_answer_audit_event(int, struct sshbuf *); int mm_answer_audit_command(int, struct sshbuf *); #endif static int monitor_read_log(struct monitor *); static Authctxt *authctxt; /* local state for key verify */ static u_char *key_blob = NULL; static size_t key_bloblen = 0; static int key_blobtype = MM_NOKEY; static struct sshauthopt *key_opts = NULL; static char *hostbased_cuser = NULL; static char *hostbased_chost = NULL; static char *auth_method = "unknown"; static char *auth_submethod = NULL; static u_int session_id2_len = 0; static u_char *session_id2 = NULL; static pid_t 
monitor_child_pid; struct mon_table { enum monitor_reqtype type; int flags; int (*f)(int, struct sshbuf *); }; #define MON_ISAUTH 0x0004 /* Required for Authentication */ #define MON_AUTHDECIDE 0x0008 /* Decides Authentication */ #define MON_ONCE 0x0010 /* Disable after calling */ #define MON_ALOG 0x0020 /* Log auth attempt without authenticating */ #define MON_AUTH (MON_ISAUTH|MON_AUTHDECIDE) #define MON_PERMIT 0x1000 /* Request is permitted */ struct mon_table mon_dispatch_proto20[] = { #ifdef WITH_OPENSSL {MONITOR_REQ_MODULI, MON_ONCE, mm_answer_moduli}, #endif {MONITOR_REQ_SIGN, MON_ONCE, mm_answer_sign}, + {MONITOR_REQ_GETPWCLASS, MON_AUTH, mm_answer_login_getpwclass}, {MONITOR_REQ_PWNAM, MON_ONCE, mm_answer_pwnamallow}, {MONITOR_REQ_AUTHSERV, MON_ONCE, mm_answer_authserv}, {MONITOR_REQ_AUTH2_READ_BANNER, MON_ONCE, mm_answer_auth2_read_banner}, {MONITOR_REQ_AUTHPASSWORD, MON_AUTH, mm_answer_authpassword}, #ifdef USE_PAM {MONITOR_REQ_PAM_START, MON_ONCE, mm_answer_pam_start}, {MONITOR_REQ_PAM_ACCOUNT, 0, mm_answer_pam_account}, {MONITOR_REQ_PAM_INIT_CTX, MON_ONCE, mm_answer_pam_init_ctx}, {MONITOR_REQ_PAM_QUERY, 0, mm_answer_pam_query}, {MONITOR_REQ_PAM_RESPOND, MON_ONCE, mm_answer_pam_respond}, {MONITOR_REQ_PAM_FREE_CTX, MON_ONCE|MON_AUTHDECIDE, mm_answer_pam_free_ctx}, #endif #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, #endif #ifdef BSD_AUTH {MONITOR_REQ_BSDAUTHQUERY, MON_ISAUTH, mm_answer_bsdauthquery}, {MONITOR_REQ_BSDAUTHRESPOND, MON_AUTH, mm_answer_bsdauthrespond}, #endif {MONITOR_REQ_KEYALLOWED, MON_ISAUTH, mm_answer_keyallowed}, {MONITOR_REQ_KEYVERIFY, MON_AUTH, mm_answer_keyverify}, #ifdef GSSAPI {MONITOR_REQ_GSSSETUP, MON_ISAUTH, mm_answer_gss_setup_ctx}, {MONITOR_REQ_GSSSTEP, 0, mm_answer_gss_accept_ctx}, {MONITOR_REQ_GSSUSEROK, MON_ONCE|MON_AUTHDECIDE, mm_answer_gss_userok}, {MONITOR_REQ_GSSCHECKMIC, MON_ONCE, mm_answer_gss_checkmic}, #endif {0, 0, NULL} }; struct mon_table mon_dispatch_postauth20[] = { #ifdef WITH_OPENSSL {MONITOR_REQ_MODULI, 0, mm_answer_moduli}, #endif {MONITOR_REQ_SIGN, 0, mm_answer_sign}, {MONITOR_REQ_PTY, 0, mm_answer_pty}, {MONITOR_REQ_PTYCLEANUP, 0, mm_answer_pty_cleanup}, {MONITOR_REQ_TERM, 0, mm_answer_term}, #ifdef SSH_AUDIT_EVENTS {MONITOR_REQ_AUDIT_EVENT, MON_PERMIT, mm_answer_audit_event}, {MONITOR_REQ_AUDIT_COMMAND, MON_PERMIT, mm_answer_audit_command}, #endif {0, 0, NULL} }; struct mon_table *mon_dispatch; /* Specifies if a certain message is allowed at the moment */ static void monitor_permit(struct mon_table *ent, enum monitor_reqtype type, int permit) { while (ent->f != NULL) { if (ent->type == type) { ent->flags &= ~MON_PERMIT; ent->flags |= permit ? MON_PERMIT : 0; return; } ent++; } } static void monitor_permit_authentications(int permit) { struct mon_table *ent = mon_dispatch; while (ent->f != NULL) { if (ent->flags & MON_AUTH) { ent->flags &= ~MON_PERMIT; ent->flags |= permit ? 
MON_PERMIT : 0; } ent++; } } void monitor_child_preauth(Authctxt *_authctxt, struct monitor *pmonitor) { struct ssh *ssh = active_state; /* XXX */ struct mon_table *ent; int authenticated = 0, partial = 0; debug3("preauth child monitor started"); if (pmonitor->m_recvfd >= 0) close(pmonitor->m_recvfd); if (pmonitor->m_log_sendfd >= 0) close(pmonitor->m_log_sendfd); pmonitor->m_log_sendfd = pmonitor->m_recvfd = -1; authctxt = _authctxt; memset(authctxt, 0, sizeof(*authctxt)); ssh->authctxt = authctxt; authctxt->loginmsg = loginmsg; mon_dispatch = mon_dispatch_proto20; /* Permit requests for moduli and signatures */ monitor_permit(mon_dispatch, MONITOR_REQ_MODULI, 1); monitor_permit(mon_dispatch, MONITOR_REQ_SIGN, 1); /* The first few requests do not require asynchronous access */ while (!authenticated) { partial = 0; auth_method = "unknown"; auth_submethod = NULL; auth2_authctxt_reset_info(authctxt); authenticated = (monitor_read(pmonitor, mon_dispatch, &ent) == 1); /* Special handling for multiple required authentications */ if (options.num_auth_methods != 0) { if (authenticated && !auth2_update_methods_lists(authctxt, auth_method, auth_submethod)) { debug3("%s: method %s: partial", __func__, auth_method); authenticated = 0; partial = 1; } } if (authenticated) { if (!(ent->flags & MON_AUTHDECIDE)) fatal("%s: unexpected authentication from %d", __func__, ent->type); if (authctxt->pw->pw_uid == 0 && !auth_root_allowed(ssh, auth_method)) authenticated = 0; #ifdef USE_PAM /* PAM needs to perform account checks after auth */ if (options.use_pam && authenticated) { struct sshbuf *m; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_receive_expect(pmonitor->m_sendfd, MONITOR_REQ_PAM_ACCOUNT, m); authenticated = mm_answer_pam_account( pmonitor->m_sendfd, m); sshbuf_free(m); } #endif } if (ent->flags & (MON_AUTHDECIDE|MON_ALOG)) { auth_log(authctxt, authenticated, partial, auth_method, auth_submethod); if (!partial && !authenticated) authctxt->failures++; if (authenticated || partial) { auth2_update_session_info(authctxt, auth_method, auth_submethod); } } } if (!authctxt->valid) fatal("%s: authenticated invalid user", __func__); if (strcmp(auth_method, "unknown") == 0) fatal("%s: authentication method name unknown", __func__); debug("%s: %s has been authenticated by privileged process", __func__, authctxt->user); ssh->authctxt = NULL; ssh_packet_set_log_preamble(ssh, "user %s", authctxt->user); mm_get_keystate(pmonitor); /* Drain any buffered messages from the child */ while (pmonitor->m_log_recvfd != -1 && monitor_read_log(pmonitor) == 0) ; if (pmonitor->m_recvfd >= 0) close(pmonitor->m_recvfd); if (pmonitor->m_log_sendfd >= 0) close(pmonitor->m_log_sendfd); pmonitor->m_sendfd = pmonitor->m_log_recvfd = -1; } static void monitor_set_child_handler(pid_t pid) { monitor_child_pid = pid; } static void monitor_child_handler(int sig) { kill(monitor_child_pid, sig); } void monitor_child_postauth(struct monitor *pmonitor) { close(pmonitor->m_recvfd); pmonitor->m_recvfd = -1; monitor_set_child_handler(pmonitor->m_pid); signal(SIGHUP, &monitor_child_handler); signal(SIGTERM, &monitor_child_handler); signal(SIGINT, &monitor_child_handler); #ifdef SIGXFSZ signal(SIGXFSZ, SIG_IGN); #endif mon_dispatch = mon_dispatch_postauth20; /* Permit requests for moduli and signatures */ monitor_permit(mon_dispatch, MONITOR_REQ_MODULI, 1); monitor_permit(mon_dispatch, MONITOR_REQ_SIGN, 1); monitor_permit(mon_dispatch, MONITOR_REQ_TERM, 1); if (auth_opts->permit_pty_flag) { 
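		/*
		 * A pty was permitted by the authentication options
		 * (i.e. not disabled by restrict/no-pty), so also allow
		 * the pty-related requests from the postauth child.
		 */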
monitor_permit(mon_dispatch, MONITOR_REQ_PTY, 1); monitor_permit(mon_dispatch, MONITOR_REQ_PTYCLEANUP, 1); } for (;;) monitor_read(pmonitor, mon_dispatch, NULL); } static int monitor_read_log(struct monitor *pmonitor) { struct sshbuf *logmsg; u_int len, level; char *msg; u_char *p; int r; if ((logmsg = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); /* Read length */ if ((r = sshbuf_reserve(logmsg, 4, &p)) != 0) fatal("%s: reserve: %s", __func__, ssh_err(r)); if (atomicio(read, pmonitor->m_log_recvfd, p, 4) != 4) { if (errno == EPIPE) { sshbuf_free(logmsg); debug("%s: child log fd closed", __func__); close(pmonitor->m_log_recvfd); pmonitor->m_log_recvfd = -1; return -1; } fatal("%s: log fd read: %s", __func__, strerror(errno)); } if ((r = sshbuf_get_u32(logmsg, &len)) != 0) fatal("%s: get len: %s", __func__, ssh_err(r)); if (len <= 4 || len > 8192) fatal("%s: invalid log message length %u", __func__, len); /* Read severity, message */ sshbuf_reset(logmsg); if ((r = sshbuf_reserve(logmsg, len, &p)) != 0) fatal("%s: reserve: %s", __func__, ssh_err(r)); if (atomicio(read, pmonitor->m_log_recvfd, p, len) != len) fatal("%s: log fd read: %s", __func__, strerror(errno)); if ((r = sshbuf_get_u32(logmsg, &level)) != 0 || (r = sshbuf_get_cstring(logmsg, &msg, NULL)) != 0) fatal("%s: decode: %s", __func__, ssh_err(r)); /* Log it */ if (log_level_name(level) == NULL) fatal("%s: invalid log level %u (corrupted message?)", __func__, level); do_log2(level, "%s [preauth]", msg); sshbuf_free(logmsg); free(msg); return 0; } int monitor_read(struct monitor *pmonitor, struct mon_table *ent, struct mon_table **pent) { struct sshbuf *m; int r, ret; u_char type; struct pollfd pfd[2]; for (;;) { memset(&pfd, 0, sizeof(pfd)); pfd[0].fd = pmonitor->m_sendfd; pfd[0].events = POLLIN; pfd[1].fd = pmonitor->m_log_recvfd; pfd[1].events = pfd[1].fd == -1 ? 0 : POLLIN; if (poll(pfd, pfd[1].fd == -1 ? 1 : 2, -1) == -1) { if (errno == EINTR || errno == EAGAIN) continue; fatal("%s: poll: %s", __func__, strerror(errno)); } if (pfd[1].revents) { /* * Drain all log messages before processing next * monitor request. 
*/ monitor_read_log(pmonitor); continue; } if (pfd[0].revents) break; /* Continues below */ } if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); mm_request_receive(pmonitor->m_sendfd, m); if ((r = sshbuf_get_u8(m, &type)) != 0) fatal("%s: decode: %s", __func__, ssh_err(r)); debug3("%s: checking request %d", __func__, type); while (ent->f != NULL) { if (ent->type == type) break; ent++; } if (ent->f != NULL) { if (!(ent->flags & MON_PERMIT)) fatal("%s: unpermitted request %d", __func__, type); ret = (*ent->f)(pmonitor->m_sendfd, m); sshbuf_free(m); /* The child may use this request only once, disable it */ if (ent->flags & MON_ONCE) { debug2("%s: %d used once, disabling now", __func__, type); ent->flags &= ~MON_PERMIT; } if (pent != NULL) *pent = ent; return ret; } fatal("%s: unsupported request: %d", __func__, type); /* NOTREACHED */ return (-1); } /* allowed key state */ static int monitor_allowed_key(u_char *blob, u_int bloblen) { /* make sure key is allowed */ if (key_blob == NULL || key_bloblen != bloblen || timingsafe_bcmp(key_blob, blob, key_bloblen)) return (0); return (1); } static void monitor_reset_key_state(void) { /* reset state */ free(key_blob); free(hostbased_cuser); free(hostbased_chost); sshauthopt_free(key_opts); key_blob = NULL; key_bloblen = 0; key_blobtype = MM_NOKEY; key_opts = NULL; hostbased_cuser = NULL; hostbased_chost = NULL; } #ifdef WITH_OPENSSL int mm_answer_moduli(int sock, struct sshbuf *m) { DH *dh; const BIGNUM *dh_p, *dh_g; int r; u_int min, want, max; if ((r = sshbuf_get_u32(m, &min)) != 0 || (r = sshbuf_get_u32(m, &want)) != 0 || (r = sshbuf_get_u32(m, &max)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: got parameters: %d %d %d", __func__, min, want, max); /* We need to check here, too, in case the child got corrupted */ if (max < min || want < min || max < want) fatal("%s: bad parameters: %d %d %d", __func__, min, want, max); sshbuf_reset(m); dh = choose_dh(min, want, max); if (dh == NULL) { if ((r = sshbuf_put_u8(m, 0)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); return (0); } else { /* Send first bignum */ DH_get0_pqg(dh, &dh_p, NULL, &dh_g); if ((r = sshbuf_put_u8(m, 1)) != 0 || (r = sshbuf_put_bignum2(m, dh_p)) != 0 || (r = sshbuf_put_bignum2(m, dh_g)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); DH_free(dh); } mm_request_send(sock, MONITOR_ANS_MODULI, m); return (0); } #endif int mm_answer_sign(int sock, struct sshbuf *m) { struct ssh *ssh = active_state; /* XXX */ extern int auth_sock; /* XXX move to state struct? */ struct sshkey *key; struct sshbuf *sigbuf = NULL; u_char *p = NULL, *signature = NULL; char *alg = NULL; size_t datlen, siglen, alglen; int r, is_proof = 0; u_int keyid, compat; const char proof_req[] = "hostkeys-prove-00@openssh.com"; debug3("%s", __func__); if ((r = sshbuf_get_u32(m, &keyid)) != 0 || (r = sshbuf_get_string(m, &p, &datlen)) != 0 || (r = sshbuf_get_cstring(m, &alg, &alglen)) != 0 || (r = sshbuf_get_u32(m, &compat)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (keyid > INT_MAX) fatal("%s: invalid key ID", __func__); /* * Supported KEX types use SHA1 (20 bytes), SHA256 (32 bytes), * SHA384 (48 bytes) and SHA512 (64 bytes). * * Otherwise, verify the signature request is for a hostkey * proof. * * XXX perform similar check for KEX signature requests too? * it's not trivial, since what is signed is the hash, rather * than the full kex structure... 
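 *
 * In short: a data length of 20, 32, 48 or 64 bytes is signed as a
 * KEX hash; any other length must match a reconstructed
 * "hostkeys-prove-00@openssh.com" proof buffer below, or the request
 * is rejected.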
*/ if (datlen != 20 && datlen != 32 && datlen != 48 && datlen != 64) { /* * Construct expected hostkey proof and compare it to what * the client sent us. */ if (session_id2_len == 0) /* hostkeys is never first */ fatal("%s: bad data length: %zu", __func__, datlen); if ((key = get_hostkey_public_by_index(keyid, ssh)) == NULL) fatal("%s: no hostkey for index %d", __func__, keyid); if ((sigbuf = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); if ((r = sshbuf_put_cstring(sigbuf, proof_req)) != 0 || (r = sshbuf_put_string(sigbuf, session_id2, session_id2_len)) != 0 || (r = sshkey_puts(key, sigbuf)) != 0) fatal("%s: couldn't prepare private key " "proof buffer: %s", __func__, ssh_err(r)); if (datlen != sshbuf_len(sigbuf) || memcmp(p, sshbuf_ptr(sigbuf), sshbuf_len(sigbuf)) != 0) fatal("%s: bad data length: %zu, hostkey proof len %zu", __func__, datlen, sshbuf_len(sigbuf)); sshbuf_free(sigbuf); is_proof = 1; } /* save session id, it will be passed on the first call */ if (session_id2_len == 0) { session_id2_len = datlen; session_id2 = xmalloc(session_id2_len); memcpy(session_id2, p, session_id2_len); } if ((key = get_hostkey_by_index(keyid)) != NULL) { if ((r = sshkey_sign(key, &signature, &siglen, p, datlen, alg, compat)) != 0) fatal("%s: sshkey_sign failed: %s", __func__, ssh_err(r)); } else if ((key = get_hostkey_public_by_index(keyid, ssh)) != NULL && auth_sock > 0) { if ((r = ssh_agent_sign(auth_sock, key, &signature, &siglen, p, datlen, alg, compat)) != 0) { fatal("%s: ssh_agent_sign failed: %s", __func__, ssh_err(r)); } } else fatal("%s: no hostkey from index %d", __func__, keyid); debug3("%s: %s signature %p(%zu)", __func__, is_proof ? "KEX" : "hostkey proof", signature, siglen); sshbuf_reset(m); if ((r = sshbuf_put_string(m, signature, siglen)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); free(alg); free(p); free(signature); mm_request_send(sock, MONITOR_ANS_SIGN, m); /* Turn on permissions for getpwnam */ monitor_permit(mon_dispatch, MONITOR_REQ_PWNAM, 1); return (0); } +int +mm_answer_login_getpwclass(int sock, struct sshbuf *m) +{ + login_cap_t *lc; + struct passwd *pw; + int r; + u_int len; + + debug3("%s", __func__); + + pw = sshbuf_get_passwd(m); + if (pw == NULL) + fatal("%s: receive get struct passwd failed", __func__); + + lc = login_getpwclass(pw); + + sshbuf_reset(m); + + if (lc == NULL) { + if (r = sshbuf_put_u8(m, 0) != 0) + fatal("%s: buffer error: %s", __func__, ssh_err(r)); + goto out; + } + + if ((r = sshbuf_put_u8(m, 1)) != 0 || + (r = sshbuf_put_cstring(m, lc->lc_class)) != 0 || + (r = sshbuf_put_cstring(m, lc->lc_cap)) != 0 || + (r = sshbuf_put_cstring(m, lc->lc_style)) != 0) + fatal("%s: buffer error: %s", __func__, ssh_err(r)); + + login_close(lc); + out: + debug3("%s: sending MONITOR_ANS_GETPWCLASS", __func__); + mm_request_send(sock, MONITOR_ANS_GETPWCLASS, m); + + sshbuf_free_passwd(pw); + + return (0); +} + /* Retrieves the password entry and also checks if the user is permitted */ int mm_answer_pwnamallow(int sock, struct sshbuf *m) { struct ssh *ssh = active_state; /* XXX */ char *username; struct passwd *pwent; int r, allowed = 0; u_int i; debug3("%s", __func__); if (authctxt->attempt++ != 0) fatal("%s: multiple attempts for getpwnam", __func__); if ((r = sshbuf_get_cstring(m, &username, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); pwent = getpwnamallow(username); authctxt->user = xstrdup(username); setproctitle("%s [priv]", pwent ? 
username : "unknown"); free(username); sshbuf_reset(m); if (pwent == NULL) { if ((r = sshbuf_put_u8(m, 0)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); authctxt->pw = fakepw(); goto out; } allowed = 1; authctxt->pw = pwent; authctxt->valid = 1; - /* XXX don't sent pwent to unpriv; send fake class/dir/shell too */ if ((r = sshbuf_put_u8(m, 1)) != 0 || - (r = sshbuf_put_string(m, pwent, sizeof(*pwent))) != 0 || - (r = sshbuf_put_cstring(m, pwent->pw_name)) != 0 || - (r = sshbuf_put_cstring(m, "*")) != 0 || -#ifdef HAVE_STRUCT_PASSWD_PW_GECOS - (r = sshbuf_put_cstring(m, pwent->pw_gecos)) != 0 || -#endif -#ifdef HAVE_STRUCT_PASSWD_PW_CLASS - (r = sshbuf_put_cstring(m, pwent->pw_class)) != 0 || -#endif - (r = sshbuf_put_cstring(m, pwent->pw_dir)) != 0 || - (r = sshbuf_put_cstring(m, pwent->pw_shell)) != 0) + (r = sshbuf_put_passwd(m, pwent)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); out: ssh_packet_set_log_preamble(ssh, "%suser %s", authctxt->valid ? "authenticating" : "invalid ", authctxt->user); if ((r = sshbuf_put_string(m, &options, sizeof(options))) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #define M_CP_STROPT(x) do { \ if (options.x != NULL) { \ if ((r = sshbuf_put_cstring(m, options.x)) != 0) \ fatal("%s: buffer error: %s", \ __func__, ssh_err(r)); \ } \ } while (0) #define M_CP_STRARRAYOPT(x, nx) do { \ for (i = 0; i < options.nx; i++) { \ if ((r = sshbuf_put_cstring(m, options.x[i])) != 0) \ fatal("%s: buffer error: %s", \ __func__, ssh_err(r)); \ } \ } while (0) /* See comment in servconf.h */ COPY_MATCH_STRING_OPTS(); #undef M_CP_STROPT #undef M_CP_STRARRAYOPT /* Create valid auth method lists */ if (auth2_setup_methods_lists(authctxt) != 0) { /* * The monitor will continue long enough to let the child * run to it's packet_disconnect(), but it must not allow any * authentication to succeed. */ debug("%s: no valid authentication method lists", __func__); } debug3("%s: sending MONITOR_ANS_PWNAM: %d", __func__, allowed); mm_request_send(sock, MONITOR_ANS_PWNAM, m); /* Allow service/style information on the auth context */ monitor_permit(mon_dispatch, MONITOR_REQ_AUTHSERV, 1); monitor_permit(mon_dispatch, MONITOR_REQ_AUTH2_READ_BANNER, 1); #ifdef USE_PAM if (options.use_pam) monitor_permit(mon_dispatch, MONITOR_REQ_PAM_START, 1); #endif return (0); } int mm_answer_auth2_read_banner(int sock, struct sshbuf *m) { char *banner; int r; sshbuf_reset(m); banner = auth2_read_banner(); if ((r = sshbuf_put_cstring(m, banner != NULL ? 
banner : "")) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_AUTH2_READ_BANNER, m); free(banner); return (0); } int mm_answer_authserv(int sock, struct sshbuf *m) { int r; monitor_permit_authentications(1); if ((r = sshbuf_get_cstring(m, &authctxt->service, NULL)) != 0 || (r = sshbuf_get_cstring(m, &authctxt->style, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: service=%s, style=%s", __func__, authctxt->service, authctxt->style); if (strlen(authctxt->style) == 0) { free(authctxt->style); authctxt->style = NULL; } return (0); } int mm_answer_authpassword(int sock, struct sshbuf *m) { struct ssh *ssh = active_state; /* XXX */ static int call_count; char *passwd; int r, authenticated; size_t plen; if (!options.password_authentication) fatal("%s: password authentication not enabled", __func__); if ((r = sshbuf_get_cstring(m, &passwd, &plen)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); /* Only authenticate if the context is valid */ authenticated = options.password_authentication && auth_password(ssh, passwd); explicit_bzero(passwd, plen); free(passwd); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, authenticated)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #ifdef USE_PAM if ((r = sshbuf_put_u32(m, sshpam_get_maxtries_reached())) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #endif debug3("%s: sending result %d", __func__, authenticated); mm_request_send(sock, MONITOR_ANS_AUTHPASSWORD, m); call_count++; if (plen == 0 && call_count == 1) auth_method = "none"; else auth_method = "password"; /* Causes monitor loop to terminate if authenticated */ return (authenticated); } #ifdef BSD_AUTH int mm_answer_bsdauthquery(int sock, struct sshbuf *m) { char *name, *infotxt; u_int numprompts, *echo_on, success; char **prompts; int r; if (!options.kbd_interactive_authentication) fatal("%s: kbd-int authentication not enabled", __func__); success = bsdauth_query(authctxt, &name, &infotxt, &numprompts, &prompts, &echo_on) < 0 ? 
0 : 1; sshbuf_reset(m); if ((r = sshbuf_put_u32(m, success)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (success) { if ((r = sshbuf_put_cstring(m, prompts[0])) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } debug3("%s: sending challenge success: %u", __func__, success); mm_request_send(sock, MONITOR_ANS_BSDAUTHQUERY, m); if (success) { free(name); free(infotxt); free(prompts); free(echo_on); } return (0); } int mm_answer_bsdauthrespond(int sock, struct sshbuf *m) { char *response; int r, authok; if (!options.kbd_interactive_authentication) fatal("%s: kbd-int authentication not enabled", __func__); if (authctxt->as == NULL) fatal("%s: no bsd auth session", __func__); if ((r = sshbuf_get_cstring(m, &response, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); authok = options.challenge_response_authentication && auth_userresponse(authctxt->as, response, 0); authctxt->as = NULL; debug3("%s: <%s> = <%d>", __func__, response, authok); free(response); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, authok)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: sending authenticated: %d", __func__, authok); mm_request_send(sock, MONITOR_ANS_BSDAUTHRESPOND, m); auth_method = "keyboard-interactive"; auth_submethod = "bsdauth"; return (authok != 0); } #endif #ifdef USE_PAM int mm_answer_pam_start(int sock, struct sshbuf *m) { if (!options.use_pam) fatal("UsePAM not set, but ended up in %s anyway", __func__); start_pam(authctxt); monitor_permit(mon_dispatch, MONITOR_REQ_PAM_ACCOUNT, 1); if (options.kbd_interactive_authentication) monitor_permit(mon_dispatch, MONITOR_REQ_PAM_INIT_CTX, 1); return (0); } int mm_answer_pam_account(int sock, struct sshbuf *m) { u_int ret; int r; if (!options.use_pam) fatal("%s: PAM not enabled", __func__); ret = do_pam_account(); if ((r = sshbuf_put_u32(m, ret)) != 0 || (r = sshbuf_put_stringb(m, loginmsg)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_PAM_ACCOUNT, m); return (ret); } static void *sshpam_ctxt, *sshpam_authok; extern KbdintDevice sshpam_device; int mm_answer_pam_init_ctx(int sock, struct sshbuf *m) { u_int ok = 0; int r; debug3("%s", __func__); if (!options.kbd_interactive_authentication) fatal("%s: kbd-int authentication not enabled", __func__); if (sshpam_ctxt != NULL) fatal("%s: already called", __func__); sshpam_ctxt = (sshpam_device.init_ctx)(authctxt); sshpam_authok = NULL; sshbuf_reset(m); if (sshpam_ctxt != NULL) { monitor_permit(mon_dispatch, MONITOR_REQ_PAM_FREE_CTX, 1); monitor_permit(mon_dispatch, MONITOR_REQ_PAM_QUERY, 1); ok = 1; } if ((r = sshbuf_put_u32(m, ok)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_PAM_INIT_CTX, m); return (0); } int mm_answer_pam_query(int sock, struct sshbuf *m) { char *name = NULL, *info = NULL, **prompts = NULL; u_int i, num = 0, *echo_on = 0; int r, ret; debug3("%s", __func__); sshpam_authok = NULL; if (sshpam_ctxt == NULL) fatal("%s: no context", __func__); ret = (sshpam_device.query)(sshpam_ctxt, &name, &info, &num, &prompts, &echo_on); if (ret == 0 && num == 0) sshpam_authok = sshpam_ctxt; if (num > 1 || name == NULL || info == NULL) fatal("sshpam_device.query failed"); monitor_permit(mon_dispatch, MONITOR_REQ_PAM_RESPOND, 1); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, ret)) != 0 || (r = sshbuf_put_cstring(m, name)) != 0 || (r = sshbuf_put_cstring(m, info)) != 0 || (r = sshbuf_put_u32(m, sshpam_get_maxtries_reached())) != 0 || (r = sshbuf_put_u32(m, num)) 
!= 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); free(name); free(info); for (i = 0; i < num; ++i) { if ((r = sshbuf_put_cstring(m, prompts[i])) != 0 || (r = sshbuf_put_u32(m, echo_on[i])) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); free(prompts[i]); } free(prompts); free(echo_on); auth_method = "keyboard-interactive"; auth_submethod = "pam"; mm_request_send(sock, MONITOR_ANS_PAM_QUERY, m); return (0); } int mm_answer_pam_respond(int sock, struct sshbuf *m) { char **resp; u_int i, num; int r, ret; debug3("%s", __func__); if (sshpam_ctxt == NULL) fatal("%s: no context", __func__); sshpam_authok = NULL; if ((r = sshbuf_get_u32(m, &num)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (num > 0) { resp = xcalloc(num, sizeof(char *)); for (i = 0; i < num; ++i) { if ((r = sshbuf_get_cstring(m, &(resp[i]), NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } ret = (sshpam_device.respond)(sshpam_ctxt, num, resp); for (i = 0; i < num; ++i) free(resp[i]); free(resp); } else { ret = (sshpam_device.respond)(sshpam_ctxt, num, NULL); } sshbuf_reset(m); if ((r = sshbuf_put_u32(m, ret)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_PAM_RESPOND, m); auth_method = "keyboard-interactive"; auth_submethod = "pam"; if (ret == 0) sshpam_authok = sshpam_ctxt; return (0); } int mm_answer_pam_free_ctx(int sock, struct sshbuf *m) { int r = sshpam_authok != NULL && sshpam_authok == sshpam_ctxt; debug3("%s", __func__); if (sshpam_ctxt == NULL) fatal("%s: no context", __func__); (sshpam_device.free_ctx)(sshpam_ctxt); sshpam_ctxt = sshpam_authok = NULL; sshbuf_reset(m); mm_request_send(sock, MONITOR_ANS_PAM_FREE_CTX, m); /* Allow another attempt */ monitor_permit(mon_dispatch, MONITOR_REQ_PAM_INIT_CTX, 1); auth_method = "keyboard-interactive"; auth_submethod = "pam"; return r; } #endif int mm_answer_keyallowed(int sock, struct sshbuf *m) { struct ssh *ssh = active_state; /* XXX */ struct sshkey *key = NULL; char *cuser, *chost; u_int pubkey_auth_attempt; enum mm_keytype type = 0; int r, allowed = 0; struct sshauthopt *opts = NULL; debug3("%s entering", __func__); if ((r = sshbuf_get_u32(m, &type)) != 0 || (r = sshbuf_get_cstring(m, &cuser, NULL)) != 0 || (r = sshbuf_get_cstring(m, &chost, NULL)) != 0 || (r = sshkey_froms(m, &key)) != 0 || (r = sshbuf_get_u32(m, &pubkey_auth_attempt)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: key_from_blob: %p", __func__, key); if (key != NULL && authctxt->valid) { /* These should not make it past the privsep child */ if (sshkey_type_plain(key->type) == KEY_RSA && (datafellows & SSH_BUG_RSASIGMD5) != 0) fatal("%s: passed a SSH_BUG_RSASIGMD5 key", __func__); switch (type) { case MM_USERKEY: auth_method = "publickey"; if (!options.pubkey_authentication) break; if (auth2_key_already_used(authctxt, key)) break; if (match_pattern_list(sshkey_ssh_name(key), options.pubkey_key_types, 0) != 1) break; allowed = user_key_allowed(ssh, authctxt->pw, key, pubkey_auth_attempt, &opts); break; case MM_HOSTKEY: auth_method = "hostbased"; if (!options.hostbased_authentication) break; if (auth2_key_already_used(authctxt, key)) break; if (match_pattern_list(sshkey_ssh_name(key), options.hostbased_key_types, 0) != 1) break; allowed = hostbased_key_allowed(authctxt->pw, cuser, chost, key); auth2_record_info(authctxt, "client user \"%.100s\", client host \"%.100s\"", cuser, chost); break; default: fatal("%s: unknown key type %d", __func__, type); break; } } debug3("%s: %s 
authentication%s: %s key is %s", __func__, auth_method, pubkey_auth_attempt ? "" : " test", (key == NULL || !authctxt->valid) ? "invalid" : sshkey_type(key), allowed ? "allowed" : "not allowed"); auth2_record_key(authctxt, 0, key); /* clear temporarily storage (used by verify) */ monitor_reset_key_state(); if (allowed) { /* Save temporarily for comparison in verify */ if ((r = sshkey_to_blob(key, &key_blob, &key_bloblen)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); key_blobtype = type; key_opts = opts; hostbased_cuser = cuser; hostbased_chost = chost; } else { /* Log failed attempt */ auth_log(authctxt, 0, 0, auth_method, NULL); free(cuser); free(chost); } sshkey_free(key); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, allowed)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (opts != NULL && (r = sshauthopt_serialise(opts, m, 1)) != 0) fatal("%s: sshauthopt_serialise: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_KEYALLOWED, m); if (!allowed) sshauthopt_free(opts); return (0); } static int monitor_valid_userblob(u_char *data, u_int datalen) { struct sshbuf *b; const u_char *p; char *userstyle, *cp; size_t len; u_char type; int r, fail = 0; if ((b = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); if ((r = sshbuf_put(b, data, datalen)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (datafellows & SSH_OLD_SESSIONID) { p = sshbuf_ptr(b); len = sshbuf_len(b); if ((session_id2 == NULL) || (len < session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; if ((r = sshbuf_consume(b, session_id2_len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } else { if ((r = sshbuf_get_string_direct(b, &p, &len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if ((session_id2 == NULL) || (len != session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; } if ((r = sshbuf_get_u8(b, &type)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (type != SSH2_MSG_USERAUTH_REQUEST) fail++; if ((r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); xasprintf(&userstyle, "%s%s%s", authctxt->user, authctxt->style ? ":" : "", authctxt->style ? 
authctxt->style : ""); if (strcmp(userstyle, cp) != 0) { logit("wrong user name passed to monitor: " "expected %s != %.100s", userstyle, cp); fail++; } free(userstyle); free(cp); if ((r = sshbuf_skip_string(b)) != 0 || /* service */ (r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (strcmp("publickey", cp) != 0) fail++; free(cp); if ((r = sshbuf_get_u8(b, &type)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (type == 0) fail++; if ((r = sshbuf_skip_string(b)) != 0 || /* pkalg */ (r = sshbuf_skip_string(b)) != 0) /* pkblob */ fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (sshbuf_len(b) != 0) fail++; sshbuf_free(b); return (fail == 0); } static int monitor_valid_hostbasedblob(u_char *data, u_int datalen, char *cuser, char *chost) { struct sshbuf *b; const u_char *p; char *cp, *userstyle; size_t len; int r, fail = 0; u_char type; if ((b = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); if ((r = sshbuf_put(b, data, datalen)) != 0 || (r = sshbuf_get_string_direct(b, &p, &len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if ((session_id2 == NULL) || (len != session_id2_len) || (timingsafe_bcmp(p, session_id2, session_id2_len) != 0)) fail++; if ((r = sshbuf_get_u8(b, &type)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (type != SSH2_MSG_USERAUTH_REQUEST) fail++; if ((r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); xasprintf(&userstyle, "%s%s%s", authctxt->user, authctxt->style ? ":" : "", authctxt->style ? authctxt->style : ""); if (strcmp(userstyle, cp) != 0) { logit("wrong user name passed to monitor: " "expected %s != %.100s", userstyle, cp); fail++; } free(userstyle); free(cp); if ((r = sshbuf_skip_string(b)) != 0 || /* service */ (r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (strcmp(cp, "hostbased") != 0) fail++; free(cp); if ((r = sshbuf_skip_string(b)) != 0 || /* pkalg */ (r = sshbuf_skip_string(b)) != 0) /* pkblob */ fatal("%s: buffer error: %s", __func__, ssh_err(r)); /* verify client host, strip trailing dot if necessary */ if ((r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (((len = strlen(cp)) > 0) && cp[len - 1] == '.') cp[len - 1] = '\0'; if (strcmp(cp, chost) != 0) fail++; free(cp); /* verify client user */ if ((r = sshbuf_get_cstring(b, &cp, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (strcmp(cp, cuser) != 0) fail++; free(cp); if (sshbuf_len(b) != 0) fail++; sshbuf_free(b); return (fail == 0); } int mm_answer_keyverify(int sock, struct sshbuf *m) { struct ssh *ssh = active_state; /* XXX */ struct sshkey *key; u_char *signature, *data, *blob; char *sigalg; size_t signaturelen, datalen, bloblen; int r, ret, valid_data = 0, encoded_ret; if ((r = sshbuf_get_string(m, &blob, &bloblen)) != 0 || (r = sshbuf_get_string(m, &signature, &signaturelen)) != 0 || (r = sshbuf_get_string(m, &data, &datalen)) != 0 || (r = sshbuf_get_cstring(m, &sigalg, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (hostbased_cuser == NULL || hostbased_chost == NULL || !monitor_allowed_key(blob, bloblen)) fatal("%s: bad key, not previously allowed", __func__); /* Empty signature algorithm means NULL. */ if (*sigalg == '\0') { free(sigalg); sigalg = NULL; } /* XXX use sshkey_froms here; need to change key_blob, etc. 
*/ if ((r = sshkey_from_blob(blob, bloblen, &key)) != 0) fatal("%s: bad public key blob: %s", __func__, ssh_err(r)); switch (key_blobtype) { case MM_USERKEY: valid_data = monitor_valid_userblob(data, datalen); auth_method = "publickey"; break; case MM_HOSTKEY: valid_data = monitor_valid_hostbasedblob(data, datalen, hostbased_cuser, hostbased_chost); auth_method = "hostbased"; break; default: valid_data = 0; break; } if (!valid_data) fatal("%s: bad signature data blob", __func__); ret = sshkey_verify(key, signature, signaturelen, data, datalen, sigalg, active_state->compat); debug3("%s: %s %p signature %s", __func__, auth_method, key, (ret == 0) ? "verified" : "unverified"); auth2_record_key(authctxt, ret == 0, key); free(blob); free(signature); free(data); free(sigalg); if (key_blobtype == MM_USERKEY) auth_activate_options(ssh, key_opts); monitor_reset_key_state(); sshkey_free(key); sshbuf_reset(m); /* encode ret != 0 as positive integer, since we're sending u32 */ encoded_ret = (ret != 0); if ((r = sshbuf_put_u32(m, encoded_ret)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_KEYVERIFY, m); return ret == 0; } static void mm_record_login(Session *s, struct passwd *pw) { struct ssh *ssh = active_state; /* XXX */ socklen_t fromlen; struct sockaddr_storage from; /* * Get IP address of client. If the connection is not a socket, let * the address be 0.0.0.0. */ memset(&from, 0, sizeof(from)); fromlen = sizeof(from); if (packet_connection_is_on_socket()) { if (getpeername(packet_get_connection_in(), (struct sockaddr *)&from, &fromlen) < 0) { debug("getpeername: %.100s", strerror(errno)); cleanup_exit(255); } } /* Record that there was a login on that tty from the remote host. */ record_login(s->pid, s->tty, pw->pw_name, pw->pw_uid, session_get_remote_name_or_ip(ssh, utmp_len, options.use_dns), (struct sockaddr *)&from, fromlen); } static void mm_session_close(Session *s) { debug3("%s: session %d pid %ld", __func__, s->self, (long)s->pid); if (s->ttyfd != -1) { debug3("%s: tty %s ptyfd %d", __func__, s->tty, s->ptyfd); session_pty_cleanup2(s); } session_unused(s->self); } int mm_answer_pty(int sock, struct sshbuf *m) { extern struct monitor *pmonitor; Session *s; int r, res, fd0; debug3("%s entering", __func__); sshbuf_reset(m); s = session_new(); if (s == NULL) goto error; s->authctxt = authctxt; s->pw = authctxt->pw; s->pid = pmonitor->m_pid; res = pty_allocate(&s->ptyfd, &s->ttyfd, s->tty, sizeof(s->tty)); if (res == 0) goto error; pty_setowner(authctxt->pw, s->tty); if ((r = sshbuf_put_u32(m, 1)) != 0 || (r = sshbuf_put_cstring(m, s->tty)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); /* We need to trick ttyslot */ if (dup2(s->ttyfd, 0) == -1) fatal("%s: dup2", __func__); mm_record_login(s, authctxt->pw); /* Now we can close the file descriptor again */ close(0); /* send messages generated by record_login */ if ((r = sshbuf_put_stringb(m, loginmsg)) != 0) fatal("%s: put login message: %s", __func__, ssh_err(r)); sshbuf_reset(loginmsg); mm_request_send(sock, MONITOR_ANS_PTY, m); if (mm_send_fd(sock, s->ptyfd) == -1 || mm_send_fd(sock, s->ttyfd) == -1) fatal("%s: send fds failed", __func__); /* make sure nothing uses fd 0 */ if ((fd0 = open(_PATH_DEVNULL, O_RDONLY)) < 0) fatal("%s: open(/dev/null): %s", __func__, strerror(errno)); if (fd0 != 0) error("%s: fd0 %d != 0", __func__, fd0); /* slave is not needed */ close(s->ttyfd); s->ttyfd = s->ptyfd; /* no need to dup() because nobody closes ptyfd */ s->ptymaster = s->ptyfd; debug3("%s: 
tty %s ptyfd %d", __func__, s->tty, s->ttyfd); return (0); error: if (s != NULL) mm_session_close(s); if ((r = sshbuf_put_u32(m, 0)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_PTY, m); return (0); } int mm_answer_pty_cleanup(int sock, struct sshbuf *m) { Session *s; char *tty; int r; debug3("%s entering", __func__); if ((r = sshbuf_get_cstring(m, &tty, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if ((s = session_by_tty(tty)) != NULL) mm_session_close(s); sshbuf_reset(m); free(tty); return (0); } int mm_answer_term(int sock, struct sshbuf *req) { struct ssh *ssh = active_state; /* XXX */ extern struct monitor *pmonitor; int res, status; debug3("%s: tearing down sessions", __func__); /* The child is terminating */ session_destroy_all(ssh, &mm_session_close); #ifdef USE_PAM if (options.use_pam) sshpam_cleanup(); #endif while (waitpid(pmonitor->m_pid, &status, 0) == -1) if (errno != EINTR) exit(1); res = WIFEXITED(status) ? WEXITSTATUS(status) : 1; /* Terminate process */ exit(res); } #ifdef SSH_AUDIT_EVENTS /* Report that an audit event occurred */ int mm_answer_audit_event(int socket, struct sshbuf *m) { u_int n; ssh_audit_event_t event; int r; debug3("%s entering", __func__); if ((r = sshbuf_get_u32(m, &n)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); event = (ssh_audit_event_t)n; switch (event) { case SSH_AUTH_FAIL_PUBKEY: case SSH_AUTH_FAIL_HOSTBASED: case SSH_AUTH_FAIL_GSSAPI: case SSH_LOGIN_EXCEED_MAXTRIES: case SSH_LOGIN_ROOT_DENIED: case SSH_CONNECTION_CLOSE: case SSH_INVALID_USER: audit_event(event); break; default: fatal("Audit event type %d not permitted", event); } return (0); } int mm_answer_audit_command(int socket, struct sshbuf *m) { char *cmd; int r; debug3("%s entering", __func__); if ((r = sshbuf_get_cstring(m, &cmd, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); /* sanity check command, if so how? 
*/ audit_run_command(cmd); free(cmd); return (0); } #endif /* SSH_AUDIT_EVENTS */ void monitor_clear_keystate(struct monitor *pmonitor) { struct ssh *ssh = active_state; /* XXX */ ssh_clear_newkeys(ssh, MODE_IN); ssh_clear_newkeys(ssh, MODE_OUT); sshbuf_free(child_state); child_state = NULL; } void monitor_apply_keystate(struct monitor *pmonitor) { struct ssh *ssh = active_state; /* XXX */ struct kex *kex; int r; debug3("%s: packet_set_state", __func__); if ((r = ssh_packet_set_state(ssh, child_state)) != 0) fatal("%s: packet_set_state: %s", __func__, ssh_err(r)); sshbuf_free(child_state); child_state = NULL; if ((kex = ssh->kex) != NULL) { /* XXX set callbacks */ #ifdef WITH_OPENSSL kex->kex[KEX_DH_GRP1_SHA1] = kexdh_server; kex->kex[KEX_DH_GRP14_SHA1] = kexdh_server; kex->kex[KEX_DH_GRP14_SHA256] = kexdh_server; kex->kex[KEX_DH_GRP16_SHA512] = kexdh_server; kex->kex[KEX_DH_GRP18_SHA512] = kexdh_server; kex->kex[KEX_DH_GEX_SHA1] = kexgex_server; kex->kex[KEX_DH_GEX_SHA256] = kexgex_server; # ifdef OPENSSL_HAS_ECC kex->kex[KEX_ECDH_SHA2] = kexecdh_server; # endif #endif /* WITH_OPENSSL */ kex->kex[KEX_C25519_SHA256] = kexc25519_server; kex->load_host_public_key=&get_hostkey_public_by_type; kex->load_host_private_key=&get_hostkey_private_by_type; kex->host_key_index=&get_hostkey_index; kex->sign = sshd_hostkey_sign; } } /* This function requries careful sanity checking */ void mm_get_keystate(struct monitor *pmonitor) { debug3("%s: Waiting for new keys", __func__); if ((child_state = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_receive_expect(pmonitor->m_sendfd, MONITOR_REQ_KEYEXPORT, child_state); debug3("%s: GOT new keys", __func__); } /* XXX */ #define FD_CLOSEONEXEC(x) do { \ if (fcntl(x, F_SETFD, FD_CLOEXEC) == -1) \ fatal("fcntl(%d, F_SETFD)", x); \ } while (0) static void monitor_openfds(struct monitor *mon, int do_logfds) { int pair[2]; #ifdef SO_ZEROIZE int on = 1; #endif if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1) fatal("%s: socketpair: %s", __func__, strerror(errno)); #ifdef SO_ZEROIZE if (setsockopt(pair[0], SOL_SOCKET, SO_ZEROIZE, &on, sizeof(on)) < 0) error("setsockopt SO_ZEROIZE(0): %.100s", strerror(errno)); if (setsockopt(pair[1], SOL_SOCKET, SO_ZEROIZE, &on, sizeof(on)) < 0) error("setsockopt SO_ZEROIZE(1): %.100s", strerror(errno)); #endif FD_CLOSEONEXEC(pair[0]); FD_CLOSEONEXEC(pair[1]); mon->m_recvfd = pair[0]; mon->m_sendfd = pair[1]; if (do_logfds) { if (pipe(pair) == -1) fatal("%s: pipe: %s", __func__, strerror(errno)); FD_CLOSEONEXEC(pair[0]); FD_CLOSEONEXEC(pair[1]); mon->m_log_recvfd = pair[0]; mon->m_log_sendfd = pair[1]; } else mon->m_log_recvfd = mon->m_log_sendfd = -1; } #define MM_MEMSIZE 65536 struct monitor * monitor_init(void) { struct monitor *mon; mon = xcalloc(1, sizeof(*mon)); monitor_openfds(mon, 1); return mon; } void monitor_reinit(struct monitor *mon) { monitor_openfds(mon, 0); } #ifdef GSSAPI int mm_answer_gss_setup_ctx(int sock, struct sshbuf *m) { gss_OID_desc goid; OM_uint32 major; size_t len; u_char *p; int r; if (!options.gss_authentication) fatal("%s: GSSAPI authentication not enabled", __func__); if ((r = sshbuf_get_string(m, &p, &len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); goid.elements = p; goid.length = len; major = ssh_gssapi_server_ctx(&gsscontext, &goid); free(goid.elements); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, major)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_GSSSETUP, m); /* Now we have a context, enable the step 
*/ monitor_permit(mon_dispatch, MONITOR_REQ_GSSSTEP, 1); return (0); } int mm_answer_gss_accept_ctx(int sock, struct sshbuf *m) { gss_buffer_desc in; gss_buffer_desc out = GSS_C_EMPTY_BUFFER; OM_uint32 major, minor; OM_uint32 flags = 0; /* GSI needs this */ int r; if (!options.gss_authentication) fatal("%s: GSSAPI authentication not enabled", __func__); if ((r = ssh_gssapi_get_buffer_desc(m, &in)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); major = ssh_gssapi_accept_ctx(gsscontext, &in, &out, &flags); free(in.value); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, major)) != 0 || (r = sshbuf_put_string(m, out.value, out.length)) != 0 || (r = sshbuf_put_u32(m, flags)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_GSSSTEP, m); gss_release_buffer(&minor, &out); if (major == GSS_S_COMPLETE) { monitor_permit(mon_dispatch, MONITOR_REQ_GSSSTEP, 0); monitor_permit(mon_dispatch, MONITOR_REQ_GSSUSEROK, 1); monitor_permit(mon_dispatch, MONITOR_REQ_GSSCHECKMIC, 1); } return (0); } int mm_answer_gss_checkmic(int sock, struct sshbuf *m) { gss_buffer_desc gssbuf, mic; OM_uint32 ret; int r; if (!options.gss_authentication) fatal("%s: GSSAPI authentication not enabled", __func__); if ((r = ssh_gssapi_get_buffer_desc(m, &gssbuf)) != 0 || (r = ssh_gssapi_get_buffer_desc(m, &mic)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); ret = ssh_gssapi_checkmic(gsscontext, &gssbuf, &mic); free(gssbuf.value); free(mic.value); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, ret)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(sock, MONITOR_ANS_GSSCHECKMIC, m); if (!GSS_ERROR(ret)) monitor_permit(mon_dispatch, MONITOR_REQ_GSSUSEROK, 1); return (0); } int mm_answer_gss_userok(int sock, struct sshbuf *m) { int r, authenticated; const char *displayname; if (!options.gss_authentication) fatal("%s: GSSAPI authentication not enabled", __func__); authenticated = authctxt->valid && ssh_gssapi_userok(authctxt->user); sshbuf_reset(m); if ((r = sshbuf_put_u32(m, authenticated)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: sending result %d", __func__, authenticated); mm_request_send(sock, MONITOR_ANS_GSSUSEROK, m); auth_method = "gssapi-with-mic"; if ((displayname = ssh_gssapi_displayname()) != NULL) auth2_record_info(authctxt, "%s", displayname); /* Monitor loop will terminate if authenticated */ return (authenticated); } #endif /* GSSAPI */ Index: projects/openssl111/crypto/openssh/monitor.h =================================================================== --- projects/openssl111/crypto/openssh/monitor.h (revision 339239) +++ projects/openssl111/crypto/openssh/monitor.h (revision 339240) @@ -1,92 +1,93 @@ /* $OpenBSD: monitor.h,v 1.21 2018/07/09 21:53:45 markus Exp $ */ /* * Copyright 2002 Niels Provos * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef _MONITOR_H_ #define _MONITOR_H_ /* Please keep *_REQ_* values on even numbers and *_ANS_* on odd numbers */ enum monitor_reqtype { MONITOR_REQ_MODULI = 0, MONITOR_ANS_MODULI = 1, MONITOR_REQ_FREE = 2, MONITOR_REQ_AUTHSERV = 4, MONITOR_REQ_SIGN = 6, MONITOR_ANS_SIGN = 7, MONITOR_REQ_PWNAM = 8, MONITOR_ANS_PWNAM = 9, MONITOR_REQ_AUTH2_READ_BANNER = 10, MONITOR_ANS_AUTH2_READ_BANNER = 11, MONITOR_REQ_AUTHPASSWORD = 12, MONITOR_ANS_AUTHPASSWORD = 13, MONITOR_REQ_BSDAUTHQUERY = 14, MONITOR_ANS_BSDAUTHQUERY = 15, MONITOR_REQ_BSDAUTHRESPOND = 16, MONITOR_ANS_BSDAUTHRESPOND = 17, MONITOR_REQ_KEYALLOWED = 22, MONITOR_ANS_KEYALLOWED = 23, MONITOR_REQ_KEYVERIFY = 24, MONITOR_ANS_KEYVERIFY = 25, MONITOR_REQ_KEYEXPORT = 26, MONITOR_REQ_PTY = 28, MONITOR_ANS_PTY = 29, MONITOR_REQ_PTYCLEANUP = 30, MONITOR_REQ_SESSKEY = 32, MONITOR_ANS_SESSKEY = 33, MONITOR_REQ_SESSID = 34, MONITOR_REQ_RSAKEYALLOWED = 36, MONITOR_ANS_RSAKEYALLOWED = 37, MONITOR_REQ_RSACHALLENGE = 38, MONITOR_ANS_RSACHALLENGE = 39, MONITOR_REQ_RSARESPONSE = 40, MONITOR_ANS_RSARESPONSE = 41, MONITOR_REQ_GSSSETUP = 42, MONITOR_ANS_GSSSETUP = 43, MONITOR_REQ_GSSSTEP = 44, MONITOR_ANS_GSSSTEP = 45, MONITOR_REQ_GSSUSEROK = 46, MONITOR_ANS_GSSUSEROK = 47, MONITOR_REQ_GSSCHECKMIC = 48, MONITOR_ANS_GSSCHECKMIC = 49, - MONITOR_REQ_TERM = 50, + MONITOR_REQ_GETPWCLASS = 50, MONITOR_ANS_GETPWCLASS = 51, + MONITOR_REQ_TERM = 52, MONITOR_REQ_PAM_START = 100, MONITOR_REQ_PAM_ACCOUNT = 102, MONITOR_ANS_PAM_ACCOUNT = 103, MONITOR_REQ_PAM_INIT_CTX = 104, MONITOR_ANS_PAM_INIT_CTX = 105, MONITOR_REQ_PAM_QUERY = 106, MONITOR_ANS_PAM_QUERY = 107, MONITOR_REQ_PAM_RESPOND = 108, MONITOR_ANS_PAM_RESPOND = 109, MONITOR_REQ_PAM_FREE_CTX = 110, MONITOR_ANS_PAM_FREE_CTX = 111, MONITOR_REQ_AUDIT_EVENT = 112, MONITOR_REQ_AUDIT_COMMAND = 113, }; struct monitor { int m_recvfd; int m_sendfd; int m_log_recvfd; int m_log_sendfd; struct kex **m_pkex; pid_t m_pid; }; struct monitor *monitor_init(void); void monitor_reinit(struct monitor *); struct Authctxt; void monitor_child_preauth(struct Authctxt *, struct monitor *); void monitor_child_postauth(struct monitor *); struct mon_table; int monitor_read(struct monitor*, struct mon_table *, struct mon_table **); /* Prototypes for request sending and receiving */ void mm_request_send(int, enum monitor_reqtype, struct sshbuf *); void mm_request_receive(int, struct sshbuf *); void mm_request_receive_expect(int, enum monitor_reqtype, struct sshbuf *); #endif /* _MONITOR_H_ */ Index: projects/openssl111/crypto/openssh/monitor_wrap.c =================================================================== --- projects/openssl111/crypto/openssh/monitor_wrap.c (revision 339239) +++ projects/openssl111/crypto/openssh/monitor_wrap.c (revision 339240) @@ -1,1006 +1,1041 @@ /* $OpenBSD: monitor_wrap.c,v 1.107 2018/07/20 
03:46:34 djm Exp $ */ /* * Copyright 2002 Niels Provos * Copyright 2002 Markus Friedl * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "includes.h" #include #include #include #include #include #include #include #include #include #ifdef WITH_OPENSSL #include #include #include #endif #include "openbsd-compat/sys-queue.h" #include "xmalloc.h" #include "ssh.h" #ifdef WITH_OPENSSL #include "dh.h" #endif #include "sshbuf.h" #include "sshkey.h" #include "cipher.h" #include "kex.h" #include "hostfile.h" #include "auth.h" #include "auth-options.h" #include "packet.h" #include "mac.h" #include "log.h" #include "auth-pam.h" #include "monitor.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "atomicio.h" #include "monitor_fdpass.h" #include "misc.h" #include "channels.h" #include "session.h" #include "servconf.h" #include "ssherr.h" /* Imports */ extern struct monitor *pmonitor; extern struct sshbuf *loginmsg; extern ServerOptions options; void mm_log_handler(LogLevel level, const char *msg, void *ctx) { struct sshbuf *log_msg; struct monitor *mon = (struct monitor *)ctx; int r; size_t len; if (mon->m_log_sendfd == -1) fatal("%s: no log channel", __func__); if ((log_msg = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(log_msg, 0)) != 0 || /* length; filled below */ (r = sshbuf_put_u32(log_msg, level)) != 0 || (r = sshbuf_put_cstring(log_msg, msg)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if ((len = sshbuf_len(log_msg)) < 4 || len > 0xffffffff) fatal("%s: bad length %zu", __func__, len); POKE_U32(sshbuf_mutable_ptr(log_msg), len - 4); if (atomicio(vwrite, mon->m_log_sendfd, sshbuf_mutable_ptr(log_msg), len) != len) fatal("%s: write: %s", __func__, strerror(errno)); sshbuf_free(log_msg); } int mm_is_monitor(void) { /* * m_pid is only set in the privileged part, and * points to the unprivileged child. 
*/ return (pmonitor && pmonitor->m_pid > 0); } void mm_request_send(int sock, enum monitor_reqtype type, struct sshbuf *m) { size_t mlen = sshbuf_len(m); u_char buf[5]; debug3("%s entering: type %d", __func__, type); if (mlen >= 0xffffffff) fatal("%s: bad length %zu", __func__, mlen); POKE_U32(buf, mlen + 1); buf[4] = (u_char) type; /* 1st byte of payload is mesg-type */ if (atomicio(vwrite, sock, buf, sizeof(buf)) != sizeof(buf)) fatal("%s: write: %s", __func__, strerror(errno)); if (atomicio(vwrite, sock, sshbuf_mutable_ptr(m), mlen) != mlen) fatal("%s: write: %s", __func__, strerror(errno)); } void mm_request_receive(int sock, struct sshbuf *m) { u_char buf[4], *p = NULL; u_int msg_len; int r; debug3("%s entering", __func__); if (atomicio(read, sock, buf, sizeof(buf)) != sizeof(buf)) { if (errno == EPIPE) cleanup_exit(255); fatal("%s: read: %s", __func__, strerror(errno)); } msg_len = PEEK_U32(buf); if (msg_len > 256 * 1024) fatal("%s: read: bad msg_len %d", __func__, msg_len); sshbuf_reset(m); if ((r = sshbuf_reserve(m, msg_len, &p)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (atomicio(read, sock, p, msg_len) != msg_len) fatal("%s: read: %s", __func__, strerror(errno)); } void mm_request_receive_expect(int sock, enum monitor_reqtype type, struct sshbuf *m) { u_char rtype; int r; debug3("%s entering: type %d", __func__, type); mm_request_receive(sock, m); if ((r = sshbuf_get_u8(m, &rtype)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (rtype != type) fatal("%s: read: rtype %d != type %d", __func__, rtype, type); } #ifdef WITH_OPENSSL DH * mm_choose_dh(int min, int nbits, int max) { BIGNUM *p, *g; int r; u_char success = 0; struct sshbuf *m; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(m, min)) != 0 || (r = sshbuf_put_u32(m, nbits)) != 0 || (r = sshbuf_put_u32(m, max)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_MODULI, m); debug3("%s: waiting for MONITOR_ANS_MODULI", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_MODULI, m); if ((r = sshbuf_get_u8(m, &success)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (success == 0) fatal("%s: MONITOR_ANS_MODULI failed", __func__); if ((p = BN_new()) == NULL) fatal("%s: BN_new failed", __func__); if ((g = BN_new()) == NULL) fatal("%s: BN_new failed", __func__); if ((r = sshbuf_get_bignum2(m, p)) != 0 || (r = sshbuf_get_bignum2(m, g)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: remaining %zu", __func__, sshbuf_len(m)); sshbuf_free(m); return (dh_new_group(g, p)); } #endif int mm_sshkey_sign(struct sshkey *key, u_char **sigp, size_t *lenp, const u_char *data, size_t datalen, const char *hostkey_alg, u_int compat) { struct kex *kex = *pmonitor->m_pkex; struct sshbuf *m; u_int ndx = kex->host_key_index(key, 0, active_state); int r; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(m, ndx)) != 0 || (r = sshbuf_put_string(m, data, datalen)) != 0 || (r = sshbuf_put_cstring(m, hostkey_alg)) != 0 || (r = sshbuf_put_u32(m, compat)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_SIGN, m); debug3("%s: waiting for MONITOR_ANS_SIGN", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_SIGN, m); if ((r = sshbuf_get_string(m, sigp, lenp)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); 
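/*
 * The mm_login_getpwclass() wrapper that follows performs the FreeBSD
 * login-class lookup over the monitor socket: the unprivileged child
 * sends the serialised passwd entry (MONITOR_REQ_GETPWCLASS) and the
 * monitor answers with a status byte, followed by the lc_class, lc_cap
 * and lc_style strings when a class could be resolved.
 */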
sshbuf_free(m); return (0); } +login_cap_t * +mm_login_getpwclass(const struct passwd *pwent) +{ + int r; + struct sshbuf *m; + char rc; + login_cap_t *lc; + + debug3("%s entering", __func__); + + if ((m = sshbuf_new()) == NULL) + fatal("%s: sshbuf_new failed", __func__); + if ((r = sshbuf_put_passwd(m, pwent)) != 0) + fatal("%s: buffer error: %s", __func__, ssh_err(r)); + + mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GETPWCLASS, m); + + debug3("%s: waiting for MONITOR_ANS_GETPWCLASS", __func__); + mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GETPWCLASS, m); + + if ((r = sshbuf_get_u8(m, &rc)) != 0) + fatal("%s: buffer error: %s", __func__, ssh_err(r)); + + if (rc == 0) { + lc = NULL; + goto out; + } + + lc = xmalloc(sizeof(*lc)); + if ((r = sshbuf_get_cstring(m, &lc->lc_class, NULL)) != 0 || + (r = sshbuf_get_cstring(m, &lc->lc_cap, NULL)) != 0 || + (r = sshbuf_get_cstring(m, &lc->lc_style, NULL)) != 0) + fatal("%s: buffer error: %s", __func__, ssh_err(r)); + + out: + sshbuf_free(m); + + return (lc); +} + +void +mm_login_close(login_cap_t *lc) +{ + if (lc == NULL) + return; + free(lc->lc_style); + free(lc->lc_class); + free(lc->lc_cap); + free(lc); +} + struct passwd * mm_getpwnamallow(const char *username) { struct ssh *ssh = active_state; /* XXX */ struct sshbuf *m; struct passwd *pw; size_t len; u_int i; ServerOptions *newopts; int r; u_char ok; const u_char *p; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, username)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PWNAM, m); debug3("%s: waiting for MONITOR_ANS_PWNAM", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PWNAM, m); if ((r = sshbuf_get_u8(m, &ok)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (ok == 0) { pw = NULL; goto out; } - /* XXX don't like passing struct passwd like this */ - pw = xcalloc(sizeof(*pw), 1); - if ((r = sshbuf_get_string_direct(m, &p, &len)) != 0) - fatal("%s: buffer error: %s", __func__, ssh_err(r)); - if (len != sizeof(*pw)) - fatal("%s: struct passwd size mismatch", __func__); - memcpy(pw, p, sizeof(*pw)); - - if ((r = sshbuf_get_cstring(m, &pw->pw_name, NULL)) != 0 || - (r = sshbuf_get_cstring(m, &pw->pw_passwd, NULL)) != 0 || -#ifdef HAVE_STRUCT_PASSWD_PW_GECOS - (r = sshbuf_get_cstring(m, &pw->pw_gecos, NULL)) != 0 || -#endif -#ifdef HAVE_STRUCT_PASSWD_PW_CLASS - (r = sshbuf_get_cstring(m, &pw->pw_class, NULL)) != 0 || -#endif - (r = sshbuf_get_cstring(m, &pw->pw_dir, NULL)) != 0 || - (r = sshbuf_get_cstring(m, &pw->pw_shell, NULL)) != 0) - fatal("%s: buffer error: %s", __func__, ssh_err(r)); + pw = sshbuf_get_passwd(m); + if (pw == NULL) + fatal("%s: receive get struct passwd failed", __func__); out: /* copy options block as a Match directive may have changed some */ if ((r = sshbuf_get_string_direct(m, &p, &len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (len != sizeof(*newopts)) fatal("%s: option block size mismatch", __func__); newopts = xcalloc(sizeof(*newopts), 1); memcpy(newopts, p, sizeof(*newopts)); #define M_CP_STROPT(x) do { \ if (newopts->x != NULL) { \ if ((r = sshbuf_get_cstring(m, \ &newopts->x, NULL)) != 0) \ fatal("%s: buffer error: %s", \ __func__, ssh_err(r)); \ } \ } while (0) #define M_CP_STRARRAYOPT(x, nx) do { \ newopts->x = newopts->nx == 0 ? 
\ NULL : xcalloc(newopts->nx, sizeof(*newopts->x)); \ for (i = 0; i < newopts->nx; i++) { \ if ((r = sshbuf_get_cstring(m, \ &newopts->x[i], NULL)) != 0) \ fatal("%s: buffer error: %s", \ __func__, ssh_err(r)); \ } \ } while (0) /* See comment in servconf.h */ COPY_MATCH_STRING_OPTS(); #undef M_CP_STROPT #undef M_CP_STRARRAYOPT copy_set_server_options(&options, newopts, 1); log_change_level(options.log_level); process_permitopen(ssh, &options); free(newopts); sshbuf_free(m); return (pw); } char * mm_auth2_read_banner(void) { struct sshbuf *m; char *banner; int r; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTH2_READ_BANNER, m); sshbuf_reset(m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_AUTH2_READ_BANNER, m); if ((r = sshbuf_get_cstring(m, &banner, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); /* treat empty banner as missing banner */ if (strlen(banner) == 0) { free(banner); banner = NULL; } return (banner); } /* Inform the privileged process about service and style */ void mm_inform_authserv(char *service, char *style) { struct sshbuf *m; int r; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, service)) != 0 || (r = sshbuf_put_cstring(m, style ? style : "")) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTHSERV, m); sshbuf_free(m); } /* Do the password authentication */ int mm_auth_password(struct ssh *ssh, char *password) { struct sshbuf *m; int r, authenticated = 0; #ifdef USE_PAM u_int maxtries = 0; #endif debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, password)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUTHPASSWORD, m); debug3("%s: waiting for MONITOR_ANS_AUTHPASSWORD", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_AUTHPASSWORD, m); if ((r = sshbuf_get_u32(m, &authenticated)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #ifdef USE_PAM if ((r = sshbuf_get_u32(m, &maxtries)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (maxtries > INT_MAX) fatal("%s: bad maxtries %u", __func__, maxtries); sshpam_set_maxtries_reached(maxtries); #endif sshbuf_free(m); debug3("%s: user %sauthenticated", __func__, authenticated ? "" : "not "); return (authenticated); } int mm_user_key_allowed(struct ssh *ssh, struct passwd *pw, struct sshkey *key, int pubkey_auth_attempt, struct sshauthopt **authoptp) { return (mm_key_allowed(MM_USERKEY, NULL, NULL, key, pubkey_auth_attempt, authoptp)); } int mm_hostbased_key_allowed(struct passwd *pw, const char *user, const char *host, struct sshkey *key) { return (mm_key_allowed(MM_HOSTKEY, user, host, key, 0, NULL)); } int mm_key_allowed(enum mm_keytype type, const char *user, const char *host, struct sshkey *key, int pubkey_auth_attempt, struct sshauthopt **authoptp) { struct sshbuf *m; int r, allowed = 0; struct sshauthopt *opts = NULL; debug3("%s entering", __func__); if (authoptp != NULL) *authoptp = NULL; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(m, type)) != 0 || (r = sshbuf_put_cstring(m, user ? user : "")) != 0 || (r = sshbuf_put_cstring(m, host ? 
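	/* user/host are only meaningful for MM_HOSTKEY checks; the
	 * MM_USERKEY callers pass NULL and empty strings are sent. */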
host : "")) != 0 || (r = sshkey_puts(key, m)) != 0 || (r = sshbuf_put_u32(m, pubkey_auth_attempt)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_KEYALLOWED, m); debug3("%s: waiting for MONITOR_ANS_KEYALLOWED", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_KEYALLOWED, m); if ((r = sshbuf_get_u32(m, &allowed)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (allowed && type == MM_USERKEY) { if ((r = sshauthopt_deserialise(m, &opts)) != 0) fatal("%s: sshauthopt_deserialise: %s", __func__, ssh_err(r)); } sshbuf_free(m); if (authoptp != NULL) { *authoptp = opts; opts = NULL; } sshauthopt_free(opts); return allowed; } /* * This key verify needs to send the key type along, because the * privileged parent makes the decision if the key is allowed * for authentication. */ int mm_sshkey_verify(const struct sshkey *key, const u_char *sig, size_t siglen, const u_char *data, size_t datalen, const char *sigalg, u_int compat) { struct sshbuf *m; u_int encoded_ret = 0; int r; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshkey_puts(key, m)) != 0 || (r = sshbuf_put_string(m, sig, siglen)) != 0 || (r = sshbuf_put_string(m, data, datalen)) != 0 || (r = sshbuf_put_cstring(m, sigalg == NULL ? "" : sigalg)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_KEYVERIFY, m); debug3("%s: waiting for MONITOR_ANS_KEYVERIFY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_KEYVERIFY, m); if ((r = sshbuf_get_u32(m, &encoded_ret)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); if (encoded_ret != 0) return SSH_ERR_SIGNATURE_INVALID; return 0; } void mm_send_keystate(struct monitor *monitor) { struct ssh *ssh = active_state; /* XXX */ struct sshbuf *m; int r; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = ssh_packet_get_state(ssh, m)) != 0) fatal("%s: get_state failed: %s", __func__, ssh_err(r)); mm_request_send(monitor->m_recvfd, MONITOR_REQ_KEYEXPORT, m); debug3("%s: Finished sending state", __func__); sshbuf_free(m); } int mm_pty_allocate(int *ptyfd, int *ttyfd, char *namebuf, size_t namebuflen) { struct sshbuf *m; char *p, *msg; int success = 0, tmp1 = -1, tmp2 = -1, r; /* Kludge: ensure there are fds free to receive the pty/tty */ if ((tmp1 = dup(pmonitor->m_recvfd)) == -1 || (tmp2 = dup(pmonitor->m_recvfd)) == -1) { error("%s: cannot allocate fds for pty", __func__); if (tmp1 > 0) close(tmp1); if (tmp2 > 0) close(tmp2); return 0; } close(tmp1); close(tmp2); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PTY, m); debug3("%s: waiting for MONITOR_ANS_PTY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PTY, m); if ((r = sshbuf_get_u32(m, &success)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (success == 0) { debug3("%s: pty alloc failed", __func__); sshbuf_free(m); return (0); } if ((r = sshbuf_get_cstring(m, &p, NULL)) != 0 || (r = sshbuf_get_cstring(m, &msg, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); strlcpy(namebuf, p, namebuflen); /* Possible truncation */ free(p); if ((r = sshbuf_put(loginmsg, msg, strlen(msg))) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); free(msg); if ((*ptyfd = mm_receive_fd(pmonitor->m_recvfd)) == -1 || (*ttyfd = 
mm_receive_fd(pmonitor->m_recvfd)) == -1) fatal("%s: receive fds failed", __func__); /* Success */ return (1); } void mm_session_pty_cleanup2(Session *s) { struct sshbuf *m; int r; if (s->ttyfd == -1) return; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, s->tty)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PTYCLEANUP, m); sshbuf_free(m); /* closed dup'ed master */ if (s->ptymaster != -1 && close(s->ptymaster) < 0) error("close(s->ptymaster/%d): %s", s->ptymaster, strerror(errno)); /* unlink pty from session */ s->ttyfd = -1; } #ifdef USE_PAM void mm_start_pam(Authctxt *authctxt) { struct sshbuf *m; debug3("%s entering", __func__); if (!options.use_pam) fatal("UsePAM=no, but ended up in %s anyway", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_START, m); sshbuf_free(m); } u_int mm_do_pam_account(void) { struct sshbuf *m; u_int ret; char *msg; size_t msglen; int r; debug3("%s entering", __func__); if (!options.use_pam) fatal("UsePAM=no, but ended up in %s anyway", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_ACCOUNT, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_ACCOUNT, m); if ((r = sshbuf_get_u32(m, &ret)) != 0 || (r = sshbuf_get_cstring(m, &msg, &msglen)) != 0 || (r = sshbuf_put(loginmsg, msg, msglen)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); free(msg); sshbuf_free(m); debug3("%s returning %d", __func__, ret); return (ret); } void * mm_sshpam_init_ctx(Authctxt *authctxt) { struct sshbuf *m; int r, success; debug3("%s", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_INIT_CTX, m); debug3("%s: waiting for MONITOR_ANS_PAM_INIT_CTX", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_INIT_CTX, m); if ((r = sshbuf_get_u32(m, &success)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (success == 0) { debug3("%s: pam_init_ctx failed", __func__); sshbuf_free(m); return (NULL); } sshbuf_free(m); return (authctxt); } int mm_sshpam_query(void *ctx, char **name, char **info, u_int *num, char ***prompts, u_int **echo_on) { struct sshbuf *m; u_int i, n; int r, ret; debug3("%s", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_QUERY, m); debug3("%s: waiting for MONITOR_ANS_PAM_QUERY", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_QUERY, m); if ((r = sshbuf_get_u32(m, &ret)) != 0 || (r = sshbuf_get_cstring(m, name, NULL)) != 0 || (r = sshbuf_get_cstring(m, info, NULL)) != 0 || (r = sshbuf_get_u32(m, &n)) != 0 || (r = sshbuf_get_u32(m, num)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); debug3("%s: pam_query returned %d", __func__, ret); sshpam_set_maxtries_reached(n); if (*num > PAM_MAX_NUM_MSG) fatal("%s: received %u PAM messages, expected <= %u", __func__, *num, PAM_MAX_NUM_MSG); *prompts = xcalloc((*num + 1), sizeof(char *)); *echo_on = xcalloc((*num + 1), sizeof(u_int)); for (i = 0; i < *num; ++i) { if ((r = sshbuf_get_cstring(m, &((*prompts)[i]), NULL)) != 0 || (r = sshbuf_get_u32(m, &((*echo_on)[i]))) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } sshbuf_free(m); return (ret); } int 
mm_sshpam_respond(void *ctx, u_int num, char **resp) { struct sshbuf *m; u_int n, i; int r, ret; debug3("%s", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(m, num)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); for (i = 0; i < num; ++i) { if ((r = sshbuf_put_cstring(m, resp[i])) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); } mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_RESPOND, m); debug3("%s: waiting for MONITOR_ANS_PAM_RESPOND", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_RESPOND, m); if ((r = sshbuf_get_u32(m, &n)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); ret = (int)n; /* XXX */ debug3("%s: pam_respond returned %d", __func__, ret); sshbuf_free(m); return (ret); } void mm_sshpam_free_ctx(void *ctxtp) { struct sshbuf *m; debug3("%s", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_PAM_FREE_CTX, m); debug3("%s: waiting for MONITOR_ANS_PAM_FREE_CTX", __func__); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_PAM_FREE_CTX, m); sshbuf_free(m); } #endif /* USE_PAM */ /* Request process termination */ void mm_terminate(void) { struct sshbuf *m; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_TERM, m); sshbuf_free(m); } static void mm_chall_setup(char **name, char **infotxt, u_int *numprompts, char ***prompts, u_int **echo_on) { *name = xstrdup(""); *infotxt = xstrdup(""); *numprompts = 1; *prompts = xcalloc(*numprompts, sizeof(char *)); *echo_on = xcalloc(*numprompts, sizeof(u_int)); (*echo_on)[0] = 0; } int mm_bsdauth_query(void *ctx, char **name, char **infotxt, u_int *numprompts, char ***prompts, u_int **echo_on) { struct sshbuf *m; u_int success; char *challenge; int r; debug3("%s: entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_BSDAUTHQUERY, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_BSDAUTHQUERY, m); if ((r = sshbuf_get_u32(m, &success)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (success == 0) { debug3("%s: no challenge", __func__); sshbuf_free(m); return (-1); } /* Get the challenge, and format the response */ if ((r = sshbuf_get_cstring(m, &challenge, NULL)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); mm_chall_setup(name, infotxt, numprompts, prompts, echo_on); (*prompts)[0] = challenge; debug3("%s: received challenge: %s", __func__, challenge); return (0); } int mm_bsdauth_respond(void *ctx, u_int numresponses, char **responses) { struct sshbuf *m; int r, authok; debug3("%s: entering", __func__); if (numresponses != 1) return (-1); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, responses[0])) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_BSDAUTHRESPOND, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_BSDAUTHRESPOND, m); if ((r = sshbuf_get_u32(m, &authok)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); return ((authok == 0) ? 
-1 : 0); } #ifdef SSH_AUDIT_EVENTS void mm_audit_event(ssh_audit_event_t event) { struct sshbuf *m; int r; debug3("%s entering", __func__); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_u32(m, event)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUDIT_EVENT, m); sshbuf_free(m); } void mm_audit_run_command(const char *command) { struct sshbuf *m; int r; debug3("%s entering command %s", __func__, command); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_cstring(m, command)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_AUDIT_COMMAND, m); sshbuf_free(m); } #endif /* SSH_AUDIT_EVENTS */ #ifdef GSSAPI OM_uint32 mm_ssh_gssapi_server_ctx(Gssctxt **ctx, gss_OID goid) { struct sshbuf *m; OM_uint32 major; int r; /* Client doesn't get to see the context */ *ctx = NULL; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_string(m, goid->elements, goid->length)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSSETUP, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSSETUP, m); if ((r = sshbuf_get_u32(m, &major)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); return (major); } OM_uint32 mm_ssh_gssapi_accept_ctx(Gssctxt *ctx, gss_buffer_desc *in, gss_buffer_desc *out, OM_uint32 *flagsp) { struct sshbuf *m; OM_uint32 major; u_int flags; int r; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_string(m, in->value, in->length)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSSTEP, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSSTEP, m); if ((r = sshbuf_get_u32(m, &major)) != 0 || (r = ssh_gssapi_get_buffer_desc(m, out)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (flagsp != NULL) { if ((r = sshbuf_get_u32(m, &flags)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); *flagsp = flags; } sshbuf_free(m); return (major); } OM_uint32 mm_ssh_gssapi_checkmic(Gssctxt *ctx, gss_buffer_t gssbuf, gss_buffer_t gssmic) { struct sshbuf *m; OM_uint32 major; int r; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_string(m, gssbuf->value, gssbuf->length)) != 0 || (r = sshbuf_put_string(m, gssmic->value, gssmic->length)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSCHECKMIC, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSCHECKMIC, m); if ((r = sshbuf_get_u32(m, &major)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); return(major); } int mm_ssh_gssapi_userok(char *user) { struct sshbuf *m; int r, authenticated = 0; if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); mm_request_send(pmonitor->m_recvfd, MONITOR_REQ_GSSUSEROK, m); mm_request_receive_expect(pmonitor->m_recvfd, MONITOR_ANS_GSSUSEROK, m); if ((r = sshbuf_get_u32(m, &authenticated)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); sshbuf_free(m); debug3("%s: user %sauthenticated",__func__, authenticated ? 
"" : "not "); return (authenticated); } #endif /* GSSAPI */ Index: projects/openssl111/crypto/openssh/monitor_wrap.h =================================================================== --- projects/openssl111/crypto/openssh/monitor_wrap.h (revision 339239) +++ projects/openssl111/crypto/openssh/monitor_wrap.h (revision 339240) @@ -1,100 +1,104 @@ /* $OpenBSD: monitor_wrap.h,v 1.38 2018/07/11 18:53:29 markus Exp $ */ /* * Copyright 2002 Niels Provos * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef _MM_WRAP_H_ #define _MM_WRAP_H_ +#include + extern int use_privsep; #define PRIVSEP(x) (use_privsep ? 
mm_##x : x) enum mm_keytype { MM_NOKEY, MM_HOSTKEY, MM_USERKEY }; struct monitor; struct Authctxt; struct sshkey; struct sshauthopt; void mm_log_handler(LogLevel, const char *, void *); int mm_is_monitor(void); DH *mm_choose_dh(int, int, int); int mm_sshkey_sign(struct sshkey *, u_char **, size_t *, const u_char *, size_t, const char *, u_int compat); void mm_inform_authserv(char *, char *); struct passwd *mm_getpwnamallow(const char *); +login_cap_t *mm_login_getpwclass(const struct passwd *pwd); +void mm_login_close(login_cap_t *lc); char *mm_auth2_read_banner(void); int mm_auth_password(struct ssh *, char *); int mm_key_allowed(enum mm_keytype, const char *, const char *, struct sshkey *, int, struct sshauthopt **); int mm_user_key_allowed(struct ssh *, struct passwd *, struct sshkey *, int, struct sshauthopt **); int mm_hostbased_key_allowed(struct passwd *, const char *, const char *, struct sshkey *); int mm_sshkey_verify(const struct sshkey *, const u_char *, size_t, const u_char *, size_t, const char *, u_int); #ifdef GSSAPI OM_uint32 mm_ssh_gssapi_server_ctx(Gssctxt **, gss_OID); OM_uint32 mm_ssh_gssapi_accept_ctx(Gssctxt *, gss_buffer_desc *, gss_buffer_desc *, OM_uint32 *); int mm_ssh_gssapi_userok(char *user); OM_uint32 mm_ssh_gssapi_checkmic(Gssctxt *, gss_buffer_t, gss_buffer_t); #endif #ifdef USE_PAM void mm_start_pam(struct Authctxt *); u_int mm_do_pam_account(void); void *mm_sshpam_init_ctx(struct Authctxt *); int mm_sshpam_query(void *, char **, char **, u_int *, char ***, u_int **); int mm_sshpam_respond(void *, u_int, char **); void mm_sshpam_free_ctx(void *); #endif #ifdef SSH_AUDIT_EVENTS #include "audit.h" void mm_audit_event(ssh_audit_event_t); void mm_audit_run_command(const char *); #endif struct Session; void mm_terminate(void); int mm_pty_allocate(int *, int *, char *, size_t); void mm_session_pty_cleanup2(struct Session *); /* Key export functions */ struct newkeys *mm_newkeys_from_blob(u_char *, int); int mm_newkeys_to_blob(int, u_char **, u_int *); void monitor_clear_keystate(struct monitor *); void monitor_apply_keystate(struct monitor *); void mm_get_keystate(struct monitor *); void mm_send_keystate(struct monitor*); /* bsdauth */ int mm_bsdauth_query(void *, char **, char **, u_int *, char ***, u_int **); int mm_bsdauth_respond(void *, u_int, char **); #endif /* _MM_WRAP_H_ */ Index: projects/openssl111/crypto/openssh/sandbox-capsicum.c =================================================================== --- projects/openssl111/crypto/openssh/sandbox-capsicum.c (revision 339239) +++ projects/openssl111/crypto/openssh/sandbox-capsicum.c (revision 339240) @@ -1,123 +1,126 @@ /* * Copyright (c) 2011 Dag-Erling Smorgrav * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
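The monitor_wrap.h hunk above declares two new privilege-separation wrappers, mm_login_getpwclass() and mm_login_close(), next to the existing PRIVSEP() macro. The sketch below is illustrative only and not part of this change: the caller, headers, and error handling are assumptions; it simply shows how a privsep-aware caller reaches either the direct login_cap(3) routines or the mm_* monitor wrappers through PRIVSEP().

/*
 * Illustrative sketch, not part of this change: with use_privsep set,
 * PRIVSEP(login_getpwclass(pw)) expands to mm_login_getpwclass(pw), so the
 * /etc/login.conf lookup is performed by the privileged monitor; otherwise
 * login_getpwclass(pw) runs directly in this process.
 * Assumes the usual OpenSSH headers plus <login_cap.h> (-lutil).
 */
static void
apply_login_class(struct passwd *pw)
{
	login_cap_t *lc;

	if ((lc = PRIVSEP(login_getpwclass(pw))) == NULL)
		fatal("%s: no login class for %s", __func__, pw->pw_name);
	/* ... resource limits/umask from the class would be applied here ... */
	PRIVSEP(login_close(lc));
}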
*/ #include "includes.h" __RCSID("$FreeBSD$"); #ifdef SANDBOX_CAPSICUM #include #include #include #include #include #include #include #include #include #include #include +#include #include "log.h" #include "monitor.h" #include "ssh-sandbox.h" #include "xmalloc.h" /* * Capsicum sandbox that sets zero nfiles, nprocs and filesize rlimits, * limits rights on stdout, stdin, stderr, monitor and switches to * capability mode. */ struct ssh_sandbox { struct monitor *monitor; pid_t child_pid; }; struct ssh_sandbox * ssh_sandbox_init(struct monitor *monitor) { struct ssh_sandbox *box; /* * Strictly, we don't need to maintain any state here but we need * to return non-NULL to satisfy the API. */ debug3("%s: preparing capsicum sandbox", __func__); box = xcalloc(1, sizeof(*box)); box->monitor = monitor; box->child_pid = 0; return box; } void ssh_sandbox_child(struct ssh_sandbox *box) { struct rlimit rl_zero; cap_rights_t rights; + + caph_cache_tzdata(); rl_zero.rlim_cur = rl_zero.rlim_max = 0; if (setrlimit(RLIMIT_FSIZE, &rl_zero) == -1) fatal("%s: setrlimit(RLIMIT_FSIZE, { 0, 0 }): %s", __func__, strerror(errno)); #ifndef SANDBOX_SKIP_RLIMIT_NOFILE if (setrlimit(RLIMIT_NOFILE, &rl_zero) == -1) fatal("%s: setrlimit(RLIMIT_NOFILE, { 0, 0 }): %s", __func__, strerror(errno)); #endif if (setrlimit(RLIMIT_NPROC, &rl_zero) == -1) fatal("%s: setrlimit(RLIMIT_NPROC, { 0, 0 }): %s", __func__, strerror(errno)); cap_rights_init(&rights); if (cap_rights_limit(STDIN_FILENO, &rights) < 0 && errno != ENOSYS) fatal("can't limit stdin: %m"); if (cap_rights_limit(STDOUT_FILENO, &rights) < 0 && errno != ENOSYS) fatal("can't limit stdout: %m"); if (cap_rights_limit(STDERR_FILENO, &rights) < 0 && errno != ENOSYS) fatal("can't limit stderr: %m"); cap_rights_init(&rights, CAP_READ, CAP_WRITE); if (cap_rights_limit(box->monitor->m_recvfd, &rights) < 0 && errno != ENOSYS) fatal("%s: failed to limit the network socket", __func__); cap_rights_init(&rights, CAP_WRITE); if (cap_rights_limit(box->monitor->m_log_sendfd, &rights) < 0 && errno != ENOSYS) fatal("%s: failed to limit the logging socket", __func__); if (cap_enter() < 0 && errno != ENOSYS) fatal("%s: failed to enter capability mode", __func__); } void ssh_sandbox_parent_finish(struct ssh_sandbox *box) { free(box); debug3("%s: finished", __func__); } void ssh_sandbox_parent_preauth(struct ssh_sandbox *box, pid_t child_pid) { box->child_pid = child_pid; } #endif /* SANDBOX_CAPSICUM */ Index: projects/openssl111/crypto/openssh/sshbuf-getput-basic.c =================================================================== --- projects/openssl111/crypto/openssh/sshbuf-getput-basic.c (revision 339239) +++ projects/openssl111/crypto/openssh/sshbuf-getput-basic.c (revision 339240) @@ -1,464 +1,557 @@ /* $OpenBSD: sshbuf-getput-basic.c,v 1.7 2017/06/01 04:51:58 djm Exp $ */ /* * Copyright (c) 2011 Damien Miller * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. 
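The sandbox-capsicum.c hunk above pulls in FreeBSD's capsicum helpers and calls caph_cache_tzdata() at the top of ssh_sandbox_child(), before the rlimits are zeroed and cap_enter() is reached. A minimal standalone sketch of that pattern, assuming the capsicum_helpers(3) API and nothing else from OpenSSH:

/*
 * Sketch only, not part of this change: cache timezone data before entering
 * capability mode, because once cap_enter() succeeds the process can no
 * longer open /etc/localtime or /usr/share/zoneinfo and local-time
 * conversions would quietly degrade to UTC.
 */
#include <capsicum_helpers.h>
#include <err.h>

static void
enter_sandbox(void)
{
	caph_cache_tzdata();	/* pre-load timezone files while we still can */
	if (caph_enter() == -1)	/* cap_enter() that tolerates missing Capsicum */
		err(1, "cap_enter");
}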
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. */ #define SSHBUF_INTERNAL #include "includes.h" #include #include #include #include #include +#include "xmalloc.h" #include "ssherr.h" #include "sshbuf.h" int sshbuf_get(struct sshbuf *buf, void *v, size_t len) { const u_char *p = sshbuf_ptr(buf); int r; if ((r = sshbuf_consume(buf, len)) < 0) return r; if (v != NULL && len != 0) memcpy(v, p, len); return 0; } int sshbuf_get_u64(struct sshbuf *buf, u_int64_t *valp) { const u_char *p = sshbuf_ptr(buf); int r; if ((r = sshbuf_consume(buf, 8)) < 0) return r; if (valp != NULL) *valp = PEEK_U64(p); return 0; } int sshbuf_get_u32(struct sshbuf *buf, u_int32_t *valp) { const u_char *p = sshbuf_ptr(buf); int r; if ((r = sshbuf_consume(buf, 4)) < 0) return r; if (valp != NULL) *valp = PEEK_U32(p); return 0; } int sshbuf_get_u16(struct sshbuf *buf, u_int16_t *valp) { const u_char *p = sshbuf_ptr(buf); int r; if ((r = sshbuf_consume(buf, 2)) < 0) return r; if (valp != NULL) *valp = PEEK_U16(p); return 0; } int sshbuf_get_u8(struct sshbuf *buf, u_char *valp) { const u_char *p = sshbuf_ptr(buf); int r; if ((r = sshbuf_consume(buf, 1)) < 0) return r; if (valp != NULL) *valp = (u_int8_t)*p; return 0; } int sshbuf_get_string(struct sshbuf *buf, u_char **valp, size_t *lenp) { const u_char *val; size_t len; int r; if (valp != NULL) *valp = NULL; if (lenp != NULL) *lenp = 0; if ((r = sshbuf_get_string_direct(buf, &val, &len)) < 0) return r; if (valp != NULL) { if ((*valp = malloc(len + 1)) == NULL) { SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL")); return SSH_ERR_ALLOC_FAIL; } if (len != 0) memcpy(*valp, val, len); (*valp)[len] = '\0'; } if (lenp != NULL) *lenp = len; return 0; } int sshbuf_get_string_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp) { size_t len; const u_char *p; int r; if (valp != NULL) *valp = NULL; if (lenp != NULL) *lenp = 0; if ((r = sshbuf_peek_string_direct(buf, &p, &len)) < 0) return r; if (valp != NULL) *valp = p; if (lenp != NULL) *lenp = len; if (sshbuf_consume(buf, len + 4) != 0) { /* Shouldn't happen */ SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR")); SSHBUF_ABORT(); return SSH_ERR_INTERNAL_ERROR; } return 0; } int sshbuf_peek_string_direct(const struct sshbuf *buf, const u_char **valp, size_t *lenp) { u_int32_t len; const u_char *p = sshbuf_ptr(buf); if (valp != NULL) *valp = NULL; if (lenp != NULL) *lenp = 0; if (sshbuf_len(buf) < 4) { SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE")); return SSH_ERR_MESSAGE_INCOMPLETE; } len = PEEK_U32(p); if (len > SSHBUF_SIZE_MAX - 4) { SSHBUF_DBG(("SSH_ERR_STRING_TOO_LARGE")); return SSH_ERR_STRING_TOO_LARGE; } if (sshbuf_len(buf) - 4 < len) { SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE")); return SSH_ERR_MESSAGE_INCOMPLETE; } if (valp != NULL) *valp = p + 4; if (lenp != NULL) *lenp = len; return 0; } int sshbuf_get_cstring(struct sshbuf *buf, char **valp, size_t *lenp) { size_t len; const u_char *p, *z; int r; if (valp != NULL) *valp = NULL; if (lenp != NULL) *lenp = 0; if ((r = sshbuf_peek_string_direct(buf, &p, &len)) != 0) return r; /* Allow a \0 only at the end of the string */ if (len > 0 && (z = memchr(p , '\0', len)) != NULL && z < p + len - 1) { SSHBUF_DBG(("SSH_ERR_INVALID_FORMAT")); return SSH_ERR_INVALID_FORMAT; } if ((r = sshbuf_skip_string(buf)) != 0) return -1; if (valp != 
NULL) { if ((*valp = malloc(len + 1)) == NULL) { SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL")); return SSH_ERR_ALLOC_FAIL; } if (len != 0) memcpy(*valp, p, len); (*valp)[len] = '\0'; } if (lenp != NULL) *lenp = (size_t)len; return 0; } int sshbuf_get_stringb(struct sshbuf *buf, struct sshbuf *v) { u_int32_t len; u_char *p; int r; /* * Use sshbuf_peek_string_direct() to figure out if there is * a complete string in 'buf' and copy the string directly * into 'v'. */ if ((r = sshbuf_peek_string_direct(buf, NULL, NULL)) != 0 || (r = sshbuf_get_u32(buf, &len)) != 0 || (r = sshbuf_reserve(v, len, &p)) != 0 || (r = sshbuf_get(buf, p, len)) != 0) return r; return 0; } int sshbuf_put(struct sshbuf *buf, const void *v, size_t len) { u_char *p; int r; if ((r = sshbuf_reserve(buf, len, &p)) < 0) return r; if (len != 0) memcpy(p, v, len); return 0; } int sshbuf_putb(struct sshbuf *buf, const struct sshbuf *v) { return sshbuf_put(buf, sshbuf_ptr(v), sshbuf_len(v)); } int sshbuf_putf(struct sshbuf *buf, const char *fmt, ...) { va_list ap; int r; va_start(ap, fmt); r = sshbuf_putfv(buf, fmt, ap); va_end(ap); return r; } int sshbuf_putfv(struct sshbuf *buf, const char *fmt, va_list ap) { va_list ap2; int r, len; u_char *p; VA_COPY(ap2, ap); if ((len = vsnprintf(NULL, 0, fmt, ap2)) < 0) { r = SSH_ERR_INVALID_ARGUMENT; goto out; } if (len == 0) { r = 0; goto out; /* Nothing to do */ } va_end(ap2); VA_COPY(ap2, ap); if ((r = sshbuf_reserve(buf, (size_t)len + 1, &p)) < 0) goto out; if ((r = vsnprintf((char *)p, len + 1, fmt, ap2)) != len) { r = SSH_ERR_INTERNAL_ERROR; goto out; /* Shouldn't happen */ } /* Consume terminating \0 */ if ((r = sshbuf_consume_end(buf, 1)) != 0) goto out; r = 0; out: va_end(ap2); return r; } int sshbuf_put_u64(struct sshbuf *buf, u_int64_t val) { u_char *p; int r; if ((r = sshbuf_reserve(buf, 8, &p)) < 0) return r; POKE_U64(p, val); return 0; } int sshbuf_put_u32(struct sshbuf *buf, u_int32_t val) { u_char *p; int r; if ((r = sshbuf_reserve(buf, 4, &p)) < 0) return r; POKE_U32(p, val); return 0; } int sshbuf_put_u16(struct sshbuf *buf, u_int16_t val) { u_char *p; int r; if ((r = sshbuf_reserve(buf, 2, &p)) < 0) return r; POKE_U16(p, val); return 0; } int sshbuf_put_u8(struct sshbuf *buf, u_char val) { u_char *p; int r; if ((r = sshbuf_reserve(buf, 1, &p)) < 0) return r; p[0] = val; return 0; } int sshbuf_put_string(struct sshbuf *buf, const void *v, size_t len) { u_char *d; int r; if (len > SSHBUF_SIZE_MAX - 4) { SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE")); return SSH_ERR_NO_BUFFER_SPACE; } if ((r = sshbuf_reserve(buf, len + 4, &d)) < 0) return r; POKE_U32(d, len); if (len != 0) memcpy(d + 4, v, len); return 0; } int sshbuf_put_cstring(struct sshbuf *buf, const char *v) { return sshbuf_put_string(buf, v, v == NULL ? 
0 : strlen(v)); } int sshbuf_put_stringb(struct sshbuf *buf, const struct sshbuf *v) { return sshbuf_put_string(buf, sshbuf_ptr(v), sshbuf_len(v)); } int sshbuf_froms(struct sshbuf *buf, struct sshbuf **bufp) { const u_char *p; size_t len; struct sshbuf *ret; int r; if (buf == NULL || bufp == NULL) return SSH_ERR_INVALID_ARGUMENT; *bufp = NULL; if ((r = sshbuf_peek_string_direct(buf, &p, &len)) != 0) return r; if ((ret = sshbuf_from(p, len)) == NULL) return SSH_ERR_ALLOC_FAIL; if ((r = sshbuf_consume(buf, len + 4)) != 0 || /* Shouldn't happen */ (r = sshbuf_set_parent(ret, buf)) != 0) { sshbuf_free(ret); return r; } *bufp = ret; return 0; } int sshbuf_put_bignum2_bytes(struct sshbuf *buf, const void *v, size_t len) { u_char *d; const u_char *s = (const u_char *)v; int r, prepend; if (len > SSHBUF_SIZE_MAX - 5) { SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE")); return SSH_ERR_NO_BUFFER_SPACE; } /* Skip leading zero bytes */ for (; len > 0 && *s == 0; len--, s++) ; /* * If most significant bit is set then prepend a zero byte to * avoid interpretation as a negative number. */ prepend = len > 0 && (s[0] & 0x80) != 0; if ((r = sshbuf_reserve(buf, len + 4 + prepend, &d)) < 0) return r; POKE_U32(d, len + prepend); if (prepend) d[4] = 0; if (len != 0) memcpy(d + 4 + prepend, s, len); return 0; } int sshbuf_get_bignum2_bytes_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp) { const u_char *d; size_t len, olen; int r; if ((r = sshbuf_peek_string_direct(buf, &d, &olen)) < 0) return r; len = olen; /* Refuse negative (MSB set) bignums */ if ((len != 0 && (*d & 0x80) != 0)) return SSH_ERR_BIGNUM_IS_NEGATIVE; /* Refuse overlong bignums, allow prepended \0 to avoid MSB set */ if (len > SSHBUF_MAX_BIGNUM + 1 || (len == SSHBUF_MAX_BIGNUM + 1 && *d != 0)) return SSH_ERR_BIGNUM_TOO_LARGE; /* Trim leading zeros */ while (len > 0 && *d == 0x00) { d++; len--; } if (valp != NULL) *valp = d; if (lenp != NULL) *lenp = len; if (sshbuf_consume(buf, olen + 4) != 0) { /* Shouldn't happen */ SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR")); SSHBUF_ABORT(); return SSH_ERR_INTERNAL_ERROR; } return 0; +} + +/* + * store struct pwd + */ +int +sshbuf_put_passwd(struct sshbuf *buf, const struct passwd *pwent) +{ + int r; + + /* + * We never send pointer values of struct passwd. + * It is safe from wild pointer even if a new pointer member is added. 
+ */ + + if ((r = sshbuf_put_u64(buf, sizeof(*pwent)) != 0) || + (r = sshbuf_put_cstring(buf, pwent->pw_name)) != 0 || + (r = sshbuf_put_cstring(buf, "*")) != 0 || + (r = sshbuf_put_u32(buf, pwent->pw_uid)) != 0 || + (r = sshbuf_put_u32(buf, pwent->pw_gid)) != 0 || + (r = sshbuf_put_u64(buf, pwent->pw_change)) != 0 || +#ifdef HAVE_STRUCT_PASSWD_PW_GECOS + (r = sshbuf_put_cstring(buf, pwent->pw_gecos)) != 0 || +#endif +#ifdef HAVE_STRUCT_PASSWD_PW_CLASS + (r = sshbuf_put_cstring(buf, pwent->pw_class)) != 0 || +#endif + (r = sshbuf_put_cstring(buf, pwent->pw_dir)) != 0 || + (r = sshbuf_put_cstring(buf, pwent->pw_shell)) != 0 || + (r = sshbuf_put_u64(buf, pwent->pw_expire)) != 0 || + (r = sshbuf_put_u32(buf, pwent->pw_fields)) != 0) { + return r; + } + return 0; +} + +/* + * extract struct pwd + */ +struct passwd * +sshbuf_get_passwd(struct sshbuf *buf) +{ + struct passwd *pw; + int r; + size_t len; + + /* check if size of struct passwd is as same as sender's size */ + r = sshbuf_get_u64(buf, &len); + if (r != 0 || len != sizeof(*pw)) + return NULL; + + pw = xcalloc(1, sizeof(*pw)); + if (sshbuf_get_cstring(buf, &pw->pw_name, NULL) != 0 || + sshbuf_get_cstring(buf, &pw->pw_passwd, NULL) != 0 || + sshbuf_get_u32(buf, &pw->pw_uid) != 0 || + sshbuf_get_u32(buf, &pw->pw_gid) != 0 || + sshbuf_get_u64(buf, &pw->pw_change) != 0 || +#ifdef HAVE_STRUCT_PASSWD_PW_GECOS + sshbuf_get_cstring(buf, &pw->pw_gecos, NULL) != 0 || +#endif +#ifdef HAVE_STRUCT_PASSWD_PW_CLASS + sshbuf_get_cstring(buf, &pw->pw_class, NULL) != 0 || +#endif + sshbuf_get_cstring(buf, &pw->pw_dir, NULL) != 0 || + sshbuf_get_cstring(buf, &pw->pw_shell, NULL) != 0 || + sshbuf_get_u64(buf, &pw->pw_expire) != 0 || + sshbuf_get_u32(buf, &pw->pw_fields) != 0) { + sshbuf_free_passwd(pw); + return NULL; + } + return pw; +} + +/* + * free struct passwd obtained from sshbuf_get_passwd. + */ +void +sshbuf_free_passwd(struct passwd *pwent) +{ + if (pwent == NULL) + return; + free(pwent->pw_shell); + free(pwent->pw_dir); +#ifdef HAVE_STRUCT_PASSWD_PW_CLASS + free(pwent->pw_class); +#endif +#ifdef HAVE_STRUCT_PASSWD_PW_GECOS + free(pwent->pw_gecos); +#endif + free(pwent->pw_passwd); + free(pwent->pw_name); + free(pwent); } Index: projects/openssl111/crypto/openssh/sshbuf.h =================================================================== --- projects/openssl111/crypto/openssh/sshbuf.h (revision 339239) +++ projects/openssl111/crypto/openssh/sshbuf.h (revision 339240) @@ -1,348 +1,364 @@ /* $OpenBSD: sshbuf.h,v 1.11 2018/07/09 21:56:06 markus Exp $ */ /* * Copyright (c) 2011 Damien Miller * * Permission to use, copy, modify, and distribute this software for any * purpose with or without fee is hereby granted, provided that the above * copyright notice and this permission notice appear in all copies. * * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
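The sshbuf-getput-basic.c hunk above introduces sshbuf_put_passwd(), sshbuf_get_passwd() and sshbuf_free_passwd() for marshalling a struct passwd over the monitor socket. A short usage sketch follows (illustrative only; the wrapper function is not part of this change), showing the round trip and that the sender deliberately stores "*" in place of pw_passwd so the password hash never crosses the socket:

/*
 * Illustration only, not part of this change. Assumes the OpenSSH sshbuf and
 * log.h environment; both ends must share the same struct passwd layout,
 * since the receiver rejects a mismatched sizeof(struct passwd).
 */
static struct passwd *
copy_passwd_via_sshbuf(const struct passwd *pwent)
{
	struct sshbuf *m;
	struct passwd *copy;

	if ((m = sshbuf_new()) == NULL)
		fatal("%s: sshbuf_new failed", __func__);
	if (sshbuf_put_passwd(m, pwent) != 0)
		fatal("%s: sshbuf_put_passwd failed", __func__);
	if ((copy = sshbuf_get_passwd(m)) == NULL)	/* pw_passwd comes back as "*" */
		fatal("%s: sshbuf_get_passwd failed", __func__);
	sshbuf_free(m);
	return copy;		/* caller releases with sshbuf_free_passwd() */
}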
*/ #ifndef _SSHBUF_H #define _SSHBUF_H #include #include #include +#include #ifdef WITH_OPENSSL # include # ifdef OPENSSL_HAS_ECC # include # endif /* OPENSSL_HAS_ECC */ #endif /* WITH_OPENSSL */ #define SSHBUF_SIZE_MAX 0x8000000 /* Hard maximum size */ #define SSHBUF_REFS_MAX 0x100000 /* Max child buffers */ #define SSHBUF_MAX_BIGNUM (16384 / 8) /* Max bignum *bytes* */ #define SSHBUF_MAX_ECPOINT ((528 * 2 / 8) + 1) /* Max EC point *bytes* */ /* * NB. do not depend on the internals of this. It will be made opaque * one day. */ struct sshbuf { u_char *d; /* Data */ const u_char *cd; /* Const data */ size_t off; /* First available byte is buf->d + buf->off */ size_t size; /* Last byte is buf->d + buf->size - 1 */ size_t max_size; /* Maximum size of buffer */ size_t alloc; /* Total bytes allocated to buf->d */ int readonly; /* Refers to external, const data */ int dont_free; /* Kludge to support sshbuf_init */ u_int refcount; /* Tracks self and number of child buffers */ struct sshbuf *parent; /* If child, pointer to parent */ }; /* * Create a new sshbuf buffer. * Returns pointer to buffer on success, or NULL on allocation failure. */ struct sshbuf *sshbuf_new(void); /* * Create a new, read-only sshbuf buffer from existing data. * Returns pointer to buffer on success, or NULL on allocation failure. */ struct sshbuf *sshbuf_from(const void *blob, size_t len); /* * Create a new, read-only sshbuf buffer from the contents of an existing * buffer. The contents of "buf" must not change in the lifetime of the * resultant buffer. * Returns pointer to buffer on success, or NULL on allocation failure. */ struct sshbuf *sshbuf_fromb(struct sshbuf *buf); /* * Create a new, read-only sshbuf buffer from the contents of a string in * an existing buffer (the string is consumed in the process). * The contents of "buf" must not change in the lifetime of the resultant * buffer. * Returns pointer to buffer on success, or NULL on allocation failure. */ int sshbuf_froms(struct sshbuf *buf, struct sshbuf **bufp); /* * Clear and free buf */ void sshbuf_free(struct sshbuf *buf); /* * Reset buf, clearing its contents. NB. max_size is preserved. */ void sshbuf_reset(struct sshbuf *buf); /* * Return the maximum size of buf */ size_t sshbuf_max_size(const struct sshbuf *buf); /* * Set the maximum size of buf * Returns 0 on success, or a negative SSH_ERR_* error code on failure. */ int sshbuf_set_max_size(struct sshbuf *buf, size_t max_size); /* * Returns the length of data in buf */ size_t sshbuf_len(const struct sshbuf *buf); /* * Returns number of bytes left in buffer before hitting max_size. */ size_t sshbuf_avail(const struct sshbuf *buf); /* * Returns a read-only pointer to the start of the data in buf */ const u_char *sshbuf_ptr(const struct sshbuf *buf); /* * Returns a mutable pointer to the start of the data in buf, or * NULL if the buffer is read-only. */ u_char *sshbuf_mutable_ptr(const struct sshbuf *buf); /* * Check whether a reservation of size len will succeed in buf * Safer to use than direct comparisons again sshbuf_avail as it copes * with unsigned overflows correctly. * Returns 0 on success, or a negative SSH_ERR_* error code on failure. */ int sshbuf_check_reserve(const struct sshbuf *buf, size_t len); /* * Preallocates len additional bytes in buf. * Useful for cases where the caller knows how many bytes will ultimately be * required to avoid realloc in the buffer code. * Returns 0 on success, or a negative SSH_ERR_* error code on failure. 
*/ int sshbuf_allocate(struct sshbuf *buf, size_t len); /* * Reserve len bytes in buf. * Returns 0 on success and a pointer to the first reserved byte via the * optional dpp parameter or a negative * SSH_ERR_* error code on failure. */ int sshbuf_reserve(struct sshbuf *buf, size_t len, u_char **dpp); /* * Consume len bytes from the start of buf * Returns 0 on success, or a negative SSH_ERR_* error code on failure. */ int sshbuf_consume(struct sshbuf *buf, size_t len); /* * Consume len bytes from the end of buf * Returns 0 on success, or a negative SSH_ERR_* error code on failure. */ int sshbuf_consume_end(struct sshbuf *buf, size_t len); /* Extract or deposit some bytes */ int sshbuf_get(struct sshbuf *buf, void *v, size_t len); int sshbuf_put(struct sshbuf *buf, const void *v, size_t len); int sshbuf_putb(struct sshbuf *buf, const struct sshbuf *v); /* Append using a printf(3) format */ int sshbuf_putf(struct sshbuf *buf, const char *fmt, ...) __attribute__((format(printf, 2, 3))); int sshbuf_putfv(struct sshbuf *buf, const char *fmt, va_list ap); /* Functions to extract or store big-endian words of various sizes */ int sshbuf_get_u64(struct sshbuf *buf, u_int64_t *valp); int sshbuf_get_u32(struct sshbuf *buf, u_int32_t *valp); int sshbuf_get_u16(struct sshbuf *buf, u_int16_t *valp); int sshbuf_get_u8(struct sshbuf *buf, u_char *valp); int sshbuf_put_u64(struct sshbuf *buf, u_int64_t val); int sshbuf_put_u32(struct sshbuf *buf, u_int32_t val); int sshbuf_put_u16(struct sshbuf *buf, u_int16_t val); int sshbuf_put_u8(struct sshbuf *buf, u_char val); /* * Functions to extract or store SSH wire encoded strings (u32 len || data) * The "cstring" variants admit no \0 characters in the string contents. * Caller must free *valp. */ int sshbuf_get_string(struct sshbuf *buf, u_char **valp, size_t *lenp); int sshbuf_get_cstring(struct sshbuf *buf, char **valp, size_t *lenp); int sshbuf_get_stringb(struct sshbuf *buf, struct sshbuf *v); int sshbuf_put_string(struct sshbuf *buf, const void *v, size_t len); int sshbuf_put_cstring(struct sshbuf *buf, const char *v); int sshbuf_put_stringb(struct sshbuf *buf, const struct sshbuf *v); /* * "Direct" variant of sshbuf_get_string, returns pointer into the sshbuf to * avoid an malloc+memcpy. The pointer is guaranteed to be valid until the * next sshbuf-modifying function call. Caller does not free. */ int sshbuf_get_string_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp); /* Skip past a string */ #define sshbuf_skip_string(buf) sshbuf_get_string_direct(buf, NULL, NULL) /* Another variant: "peeks" into the buffer without modifying it */ int sshbuf_peek_string_direct(const struct sshbuf *buf, const u_char **valp, size_t *lenp); /* XXX peek_u8 / peek_u32 */ /* * Functions to extract or store SSH wire encoded bignums and elliptic * curve points. 
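For the string accessors documented above, the wire format is a 4-byte big-endian length followed by the data, and the cstring variants reject embedded NUL characters. A small round-trip sketch (illustrative only, assuming the OpenSSH sshbuf and log.h environment):

/* Illustration only, not part of this change. */
static void
cstring_roundtrip(void)
{
	struct sshbuf *buf;
	char *out = NULL;
	int r;

	if ((buf = sshbuf_new()) == NULL)
		fatal("%s: sshbuf_new failed", __func__);
	if ((r = sshbuf_put_cstring(buf, "example")) != 0 ||	/* 4 length bytes + 7 data bytes */
	    (r = sshbuf_get_cstring(buf, &out, NULL)) != 0)	/* allocates NUL-terminated copy */
		fatal("%s: buffer error: %s", __func__, ssh_err(r));
	free(out);		/* caller must free *valp */
	sshbuf_free(buf);
}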
*/ int sshbuf_put_bignum2_bytes(struct sshbuf *buf, const void *v, size_t len); int sshbuf_get_bignum2_bytes_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp); #ifdef WITH_OPENSSL int sshbuf_get_bignum2(struct sshbuf *buf, BIGNUM *v); int sshbuf_get_bignum1(struct sshbuf *buf, BIGNUM *v); int sshbuf_put_bignum2(struct sshbuf *buf, const BIGNUM *v); int sshbuf_put_bignum1(struct sshbuf *buf, const BIGNUM *v); # ifdef OPENSSL_HAS_ECC int sshbuf_get_ec(struct sshbuf *buf, EC_POINT *v, const EC_GROUP *g); int sshbuf_get_eckey(struct sshbuf *buf, EC_KEY *v); int sshbuf_put_ec(struct sshbuf *buf, const EC_POINT *v, const EC_GROUP *g); int sshbuf_put_eckey(struct sshbuf *buf, const EC_KEY *v); # endif /* OPENSSL_HAS_ECC */ #endif /* WITH_OPENSSL */ /* Dump the contents of the buffer in a human-readable format */ void sshbuf_dump(struct sshbuf *buf, FILE *f); /* Dump specified memory in a human-readable format */ void sshbuf_dump_data(const void *s, size_t len, FILE *f); /* Return the hexadecimal representation of the contents of the buffer */ char *sshbuf_dtob16(struct sshbuf *buf); /* Encode the contents of the buffer as base64 */ char *sshbuf_dtob64(struct sshbuf *buf); /* Decode base64 data and append it to the buffer */ int sshbuf_b64tod(struct sshbuf *buf, const char *b64); /* * Duplicate the contents of a buffer to a string (caller to free). * Returns NULL on buffer error, or if the buffer contains a premature * nul character. */ char *sshbuf_dup_string(struct sshbuf *buf); + +/* + * store struct pwd + */ +int sshbuf_put_passwd(struct sshbuf *buf, const struct passwd *pwent); + +/* + * extract struct pwd + */ +struct passwd *sshbuf_get_passwd(struct sshbuf *buf); + +/* + * free struct passwd obtained from sshbuf_get_passwd. + */ +void sshbuf_free_passwd(struct passwd *pwent); /* Macros for decoding/encoding integers */ #define PEEK_U64(p) \ (((u_int64_t)(((const u_char *)(p))[0]) << 56) | \ ((u_int64_t)(((const u_char *)(p))[1]) << 48) | \ ((u_int64_t)(((const u_char *)(p))[2]) << 40) | \ ((u_int64_t)(((const u_char *)(p))[3]) << 32) | \ ((u_int64_t)(((const u_char *)(p))[4]) << 24) | \ ((u_int64_t)(((const u_char *)(p))[5]) << 16) | \ ((u_int64_t)(((const u_char *)(p))[6]) << 8) | \ (u_int64_t)(((const u_char *)(p))[7])) #define PEEK_U32(p) \ (((u_int32_t)(((const u_char *)(p))[0]) << 24) | \ ((u_int32_t)(((const u_char *)(p))[1]) << 16) | \ ((u_int32_t)(((const u_char *)(p))[2]) << 8) | \ (u_int32_t)(((const u_char *)(p))[3])) #define PEEK_U16(p) \ (((u_int16_t)(((const u_char *)(p))[0]) << 8) | \ (u_int16_t)(((const u_char *)(p))[1])) #define POKE_U64(p, v) \ do { \ const u_int64_t __v = (v); \ ((u_char *)(p))[0] = (__v >> 56) & 0xff; \ ((u_char *)(p))[1] = (__v >> 48) & 0xff; \ ((u_char *)(p))[2] = (__v >> 40) & 0xff; \ ((u_char *)(p))[3] = (__v >> 32) & 0xff; \ ((u_char *)(p))[4] = (__v >> 24) & 0xff; \ ((u_char *)(p))[5] = (__v >> 16) & 0xff; \ ((u_char *)(p))[6] = (__v >> 8) & 0xff; \ ((u_char *)(p))[7] = __v & 0xff; \ } while (0) #define POKE_U32(p, v) \ do { \ const u_int32_t __v = (v); \ ((u_char *)(p))[0] = (__v >> 24) & 0xff; \ ((u_char *)(p))[1] = (__v >> 16) & 0xff; \ ((u_char *)(p))[2] = (__v >> 8) & 0xff; \ ((u_char *)(p))[3] = __v & 0xff; \ } while (0) #define POKE_U16(p, v) \ do { \ const u_int16_t __v = (v); \ ((u_char *)(p))[0] = (__v >> 8) & 0xff; \ ((u_char *)(p))[1] = __v & 0xff; \ } while (0) /* Internal definitions follow. 
Exposed for regress tests */ #ifdef SSHBUF_INTERNAL /* * Return the allocation size of buf */ size_t sshbuf_alloc(const struct sshbuf *buf); /* * Increment the reference count of buf. */ int sshbuf_set_parent(struct sshbuf *child, struct sshbuf *parent); /* * Return the parent buffer of buf, or NULL if it has no parent. */ const struct sshbuf *sshbuf_parent(const struct sshbuf *buf); /* * Return the reference count of buf */ u_int sshbuf_refcount(const struct sshbuf *buf); # define SSHBUF_SIZE_INIT 256 /* Initial allocation */ # define SSHBUF_SIZE_INC 256 /* Preferred increment length */ # define SSHBUF_PACK_MIN 8192 /* Minimim packable offset */ /* # define SSHBUF_ABORT abort */ /* # define SSHBUF_DEBUG */ # ifndef SSHBUF_ABORT # define SSHBUF_ABORT() # endif # ifdef SSHBUF_DEBUG # define SSHBUF_TELL(what) do { \ printf("%s:%d %s: %s size %zu alloc %zu off %zu max %zu\n", \ __FILE__, __LINE__, __func__, what, \ buf->size, buf->alloc, buf->off, buf->max_size); \ fflush(stdout); \ } while (0) # define SSHBUF_DBG(x) do { \ printf("%s:%d %s: ", __FILE__, __LINE__, __func__); \ printf x; \ printf("\n"); \ fflush(stdout); \ } while (0) # else # define SSHBUF_TELL(what) # define SSHBUF_DBG(x) # endif #endif /* SSHBUF_INTERNAL */ #endif /* _SSHBUF_H */ Index: projects/openssl111/crypto/openssh/sshd.c =================================================================== --- projects/openssl111/crypto/openssh/sshd.c (revision 339239) +++ projects/openssl111/crypto/openssh/sshd.c (revision 339240) @@ -1,2427 +1,2432 @@ /* $OpenBSD: sshd.c,v 1.514 2018/08/13 02:41:05 djm Exp $ */ /* * Author: Tatu Ylonen * Copyright (c) 1995 Tatu Ylonen , Espoo, Finland * All rights reserved * This program is the ssh daemon. It listens for connections from clients, * and performs authentication, executes use commands or shell, and forwards * information to/from the application to the user client over an encrypted * connection. This can also handle forwarding of X11, TCP/IP, and * authentication agent connections. * * As far as I am concerned, the code I have written for this software * can be used freely for any purpose. Any derived versions of this * software must be clearly marked as such, and if the derived work is * incompatible with the protocol description in the RFC file, it must be * called by a name other than "ssh" or "Secure Shell". * * SSH2 implementation: * Privilege Separation: * * Copyright (c) 2000, 2001, 2002 Markus Friedl. All rights reserved. * Copyright (c) 2002 Niels Provos. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "includes.h" __RCSID("$FreeBSD$"); #include #include #include #include #ifdef HAVE_SYS_STAT_H # include #endif #ifdef HAVE_SYS_TIME_H # include #endif #include "openbsd-compat/sys-tree.h" #include "openbsd-compat/sys-queue.h" #include #include #include #include #ifdef HAVE_PATHS_H #include #endif #include #include #include #include #include #include #include #include #include #ifdef WITH_OPENSSL #include #include #include #include "openbsd-compat/openssl-compat.h" #endif #ifdef HAVE_SECUREWARE #include #include #endif #ifdef __FreeBSD__ #include #if defined(GSSAPI) && defined(HAVE_GSSAPI_GSSAPI_H) #include #elif defined(GSSAPI) && defined(HAVE_GSSAPI_H) #include #endif #endif #include "xmalloc.h" #include "ssh.h" #include "ssh2.h" #include "sshpty.h" #include "packet.h" #include "log.h" #include "sshbuf.h" #include "misc.h" #include "match.h" #include "servconf.h" #include "uidswap.h" #include "compat.h" #include "cipher.h" #include "digest.h" #include "sshkey.h" #include "kex.h" #include "myproposal.h" #include "authfile.h" #include "pathnames.h" #include "atomicio.h" #include "canohost.h" #include "hostfile.h" #include "auth.h" #include "authfd.h" #include "msg.h" #include "dispatch.h" #include "channels.h" #include "session.h" #include "monitor.h" #ifdef GSSAPI #include "ssh-gss.h" #endif #include "monitor_wrap.h" #include "ssh-sandbox.h" #include "auth-options.h" #include "version.h" #include "ssherr.h" #include "blacklist_client.h" #ifdef LIBWRAP #include #include int allow_severity; int deny_severity; #endif /* LIBWRAP */ /* Re-exec fds */ #define REEXEC_DEVCRYPTO_RESERVED_FD (STDERR_FILENO + 1) #define REEXEC_STARTUP_PIPE_FD (STDERR_FILENO + 2) #define REEXEC_CONFIG_PASS_FD (STDERR_FILENO + 3) #define REEXEC_MIN_FREE_FD (STDERR_FILENO + 4) extern char *__progname; /* Server configuration options. */ ServerOptions options; /* Name of the server configuration file. */ char *config_file_name = _PATH_SERVER_CONFIG_FILE; /* * Debug mode flag. This can be set on the command line. If debug * mode is enabled, extra debugging output will be sent to the system * log, the daemon will not go to background, and will exit after processing * the first connection. */ int debug_flag = 0; /* * Indicating that the daemon should only test the configuration and keys. * If test_flag > 1 ("-T" flag), then sshd will also dump the effective * configuration, optionally using connection information provided by the * "-C" flag. */ int test_flag = 0; /* Flag indicating that the daemon is being started from inetd. */ int inetd_flag = 0; /* Flag indicating that sshd should not detach and become a daemon. */ int no_daemon_flag = 0; /* debug goes to stderr unless inetd_flag is set */ int log_stderr = 0; /* Saved arguments to main(). */ char **saved_argv; int saved_argc; /* re-exec */ int rexeced_flag = 0; int rexec_flag = 1; int rexec_argc = 0; char **rexec_argv; /* * The sockets that the server is listening; this is used in the SIGHUP * signal handler. 
*/ #define MAX_LISTEN_SOCKS 16 int listen_socks[MAX_LISTEN_SOCKS]; int num_listen_socks = 0; /* * the client's version string, passed by sshd2 in compat mode. if != NULL, * sshd will skip the version-number exchange */ char *client_version_string = NULL; char *server_version_string = NULL; /* Daemon's agent connection */ int auth_sock = -1; int have_agent = 0; /* * Any really sensitive data in the application is contained in this * structure. The idea is that this structure could be locked into memory so * that the pages do not get written into swap. However, there are some * problems. The private key contains BIGNUMs, and we do not (in principle) * have access to the internals of them, and locking just the structure is * not very useful. Currently, memory locking is not implemented. */ struct { struct sshkey **host_keys; /* all private host keys */ struct sshkey **host_pubkeys; /* all public host keys */ struct sshkey **host_certificates; /* all public host certificates */ int have_ssh2_key; } sensitive_data; /* This is set to true when a signal is received. */ static volatile sig_atomic_t received_sighup = 0; static volatile sig_atomic_t received_sigterm = 0; /* session identifier, used by RSA-auth */ u_char session_id[16]; /* same for ssh2 */ u_char *session_id2 = NULL; u_int session_id2_len = 0; /* record remote hostname or ip */ u_int utmp_len = HOST_NAME_MAX+1; /* options.max_startup sized array of fd ints */ int *startup_pipes = NULL; int startup_pipe; /* in child */ /* variables used for privilege separation */ int use_privsep = -1; struct monitor *pmonitor = NULL; int privsep_is_preauth = 1; static int privsep_chroot = 1; /* global authentication context */ Authctxt *the_authctxt = NULL; /* global key/cert auth options. XXX move to permanent ssh->authctxt? */ struct sshauthopt *auth_opts = NULL; /* sshd_config buffer */ struct sshbuf *cfg; /* message to be displayed after login */ struct sshbuf *loginmsg; /* Unprivileged user */ struct passwd *privsep_pw = NULL; /* Prototypes for various functions defined later in this file. */ void destroy_sensitive_data(void); void demote_sensitive_data(void); static void do_ssh2_kex(void); /* * Close all listening sockets */ static void close_listen_socks(void) { int i; for (i = 0; i < num_listen_socks; i++) close(listen_socks[i]); num_listen_socks = -1; } static void close_startup_pipes(void) { int i; if (startup_pipes) for (i = 0; i < options.max_startups; i++) if (startup_pipes[i] != -1) close(startup_pipes[i]); } /* * Signal handler for SIGHUP. Sshd execs itself when it receives SIGHUP; * the effect is to reread the configuration file (and to regenerate * the server key). */ /*ARGSUSED*/ static void sighup_handler(int sig) { int save_errno = errno; received_sighup = 1; errno = save_errno; } /* * Called from the main program after receiving SIGHUP. * Restarts the server. */ static void sighup_restart(void) { logit("Received SIGHUP; restarting."); if (options.pid_file != NULL) unlink(options.pid_file); platform_pre_restart(); close_listen_socks(); close_startup_pipes(); alarm(0); /* alarm timer persists across exec */ signal(SIGHUP, SIG_IGN); /* will be restored after exec */ execv(saved_argv[0], saved_argv); logit("RESTART FAILED: av[0]='%.100s', error: %.100s.", saved_argv[0], strerror(errno)); exit(1); } /* * Generic signal handler for terminating signals in the master daemon. */ /*ARGSUSED*/ static void sigterm_handler(int sig) { received_sigterm = sig; } /* * SIGCHLD handler. This is called whenever a child dies. 
This will then * reap any zombies left by exited children. */ /*ARGSUSED*/ static void main_sigchld_handler(int sig) { int save_errno = errno; pid_t pid; int status; while ((pid = waitpid(-1, &status, WNOHANG)) > 0 || (pid < 0 && errno == EINTR)) ; errno = save_errno; } /* * Signal handler for the alarm after the login grace period has expired. */ /*ARGSUSED*/ static void grace_alarm_handler(int sig) { if (use_privsep && pmonitor != NULL && pmonitor->m_pid > 0) kill(pmonitor->m_pid, SIGALRM); /* * Try to kill any processes that we have spawned, E.g. authorized * keys command helpers. */ if (getpgid(0) == getpid()) { signal(SIGTERM, SIG_IGN); kill(0, SIGTERM); } BLACKLIST_NOTIFY(BLACKLIST_AUTH_FAIL, "ssh"); /* Log error and exit. */ sigdie("Timeout before authentication for %s port %d", ssh_remote_ipaddr(active_state), ssh_remote_port(active_state)); } static void sshd_exchange_identification(struct ssh *ssh, int sock_in, int sock_out) { u_int i; int remote_major, remote_minor; char *s; char buf[256]; /* Must not be larger than remote_version. */ char remote_version[256]; /* Must be at least as big as buf. */ xasprintf(&server_version_string, "SSH-%d.%d-%.100s%s%s\r\n", PROTOCOL_MAJOR_2, PROTOCOL_MINOR_2, SSH_VERSION, *options.version_addendum == '\0' ? "" : " ", options.version_addendum); /* Send our protocol version identification. */ if (atomicio(vwrite, sock_out, server_version_string, strlen(server_version_string)) != strlen(server_version_string)) { logit("Could not write ident string to %s port %d", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh)); cleanup_exit(255); } /* Read other sides version identification. */ memset(buf, 0, sizeof(buf)); for (i = 0; i < sizeof(buf) - 1; i++) { if (atomicio(read, sock_in, &buf[i], 1) != 1) { logit("Did not receive identification string " "from %s port %d", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh)); cleanup_exit(255); } if (buf[i] == '\r') { buf[i] = 0; /* Kludge for F-Secure Macintosh < 1.0.2 */ if (i == 12 && strncmp(buf, "SSH-1.5-W1.0", 12) == 0) break; continue; } if (buf[i] == '\n') { buf[i] = 0; break; } } buf[sizeof(buf) - 1] = 0; client_version_string = xstrdup(buf); /* * Check that the versions match. In future this might accept * several versions and set appropriate flags to handle them. */ if (sscanf(client_version_string, "SSH-%d.%d-%[^\n]\n", &remote_major, &remote_minor, remote_version) != 3) { s = "Protocol mismatch.\n"; (void) atomicio(vwrite, sock_out, s, strlen(s)); logit("Bad protocol version identification '%.100s' " "from %s port %d", client_version_string, ssh_remote_ipaddr(ssh), ssh_remote_port(ssh)); close(sock_in); close(sock_out); cleanup_exit(255); } debug("Client protocol version %d.%d; client software version %.100s", remote_major, remote_minor, remote_version); ssh->compat = compat_datafellows(remote_version); if ((ssh->compat & SSH_BUG_PROBE) != 0) { logit("probed from %s port %d with %s. Don't panic.", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh), client_version_string); cleanup_exit(255); } if ((ssh->compat & SSH_BUG_SCANNER) != 0) { logit("scanned from %s port %d with %s. 
Don't panic.", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh), client_version_string); cleanup_exit(255); } if ((ssh->compat & SSH_BUG_RSASIGMD5) != 0) { logit("Client version \"%.100s\" uses unsafe RSA signature " "scheme; disabling use of RSA keys", remote_version); } chop(server_version_string); debug("Local version string %.200s", server_version_string); if (remote_major != 2 && !(remote_major == 1 && remote_minor == 99)) { s = "Protocol major versions differ.\n"; (void) atomicio(vwrite, sock_out, s, strlen(s)); close(sock_in); close(sock_out); logit("Protocol major versions differ for %s port %d: " "%.200s vs. %.200s", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh), server_version_string, client_version_string); cleanup_exit(255); } } /* Destroy the host and server keys. They will no longer be needed. */ void destroy_sensitive_data(void) { u_int i; for (i = 0; i < options.num_host_key_files; i++) { if (sensitive_data.host_keys[i]) { sshkey_free(sensitive_data.host_keys[i]); sensitive_data.host_keys[i] = NULL; } if (sensitive_data.host_certificates[i]) { sshkey_free(sensitive_data.host_certificates[i]); sensitive_data.host_certificates[i] = NULL; } } } /* Demote private to public keys for network child */ void demote_sensitive_data(void) { struct sshkey *tmp; u_int i; int r; for (i = 0; i < options.num_host_key_files; i++) { if (sensitive_data.host_keys[i]) { if ((r = sshkey_from_private( sensitive_data.host_keys[i], &tmp)) != 0) fatal("could not demote host %s key: %s", sshkey_type(sensitive_data.host_keys[i]), ssh_err(r)); sshkey_free(sensitive_data.host_keys[i]); sensitive_data.host_keys[i] = tmp; } /* Certs do not need demotion */ } } static void reseed_prngs(void) { u_int32_t rnd[256]; #ifdef WITH_OPENSSL RAND_poll(); #endif arc4random_stir(); /* noop on recent arc4random() implementations */ arc4random_buf(rnd, sizeof(rnd)); /* let arc4random notice PID change */ #ifdef WITH_OPENSSL RAND_seed(rnd, sizeof(rnd)); /* give libcrypto a chance to notice the PID change */ if ((RAND_bytes((u_char *)rnd, 1)) != 1) fatal("%s: RAND_bytes failed", __func__); #endif explicit_bzero(rnd, sizeof(rnd)); } static void privsep_preauth_child(void) { gid_t gidset[1]; /* Enable challenge-response authentication for privilege separation */ privsep_challenge_enable(); #ifdef GSSAPI /* Cache supported mechanism OIDs for later use */ if (options.gss_authentication) ssh_gssapi_prepare_supported_oids(); #endif reseed_prngs(); /* Demote the private keys to public keys. 
*/ demote_sensitive_data(); /* Demote the child */ if (privsep_chroot) { /* Change our root directory */ if (chroot(_PATH_PRIVSEP_CHROOT_DIR) == -1) fatal("chroot(\"%s\"): %s", _PATH_PRIVSEP_CHROOT_DIR, strerror(errno)); if (chdir("/") == -1) fatal("chdir(\"/\"): %s", strerror(errno)); /* Drop our privileges */ debug3("privsep user:group %u:%u", (u_int)privsep_pw->pw_uid, (u_int)privsep_pw->pw_gid); gidset[0] = privsep_pw->pw_gid; if (setgroups(1, gidset) < 0) fatal("setgroups: %.100s", strerror(errno)); permanently_set_uid(privsep_pw); } } static int privsep_preauth(Authctxt *authctxt) { int status, r; pid_t pid; struct ssh_sandbox *box = NULL; /* Set up unprivileged child process to deal with network data */ pmonitor = monitor_init(); /* Store a pointer to the kex for later rekeying */ pmonitor->m_pkex = &active_state->kex; if (use_privsep == PRIVSEP_ON) box = ssh_sandbox_init(pmonitor); pid = fork(); if (pid == -1) { fatal("fork of unprivileged child failed"); } else if (pid != 0) { debug2("Network child is on pid %ld", (long)pid); pmonitor->m_pid = pid; if (have_agent) { r = ssh_get_authentication_socket(&auth_sock); if (r != 0) { error("Could not get agent socket: %s", ssh_err(r)); have_agent = 0; } } if (box != NULL) ssh_sandbox_parent_preauth(box, pid); monitor_child_preauth(authctxt, pmonitor); /* Wait for the child's exit status */ while (waitpid(pid, &status, 0) < 0) { if (errno == EINTR) continue; pmonitor->m_pid = -1; fatal("%s: waitpid: %s", __func__, strerror(errno)); } privsep_is_preauth = 0; pmonitor->m_pid = -1; if (WIFEXITED(status)) { if (WEXITSTATUS(status) != 0) fatal("%s: preauth child exited with status %d", __func__, WEXITSTATUS(status)); } else if (WIFSIGNALED(status)) fatal("%s: preauth child terminated by signal %d", __func__, WTERMSIG(status)); if (box != NULL) ssh_sandbox_parent_finish(box); return 1; } else { /* child */ close(pmonitor->m_sendfd); close(pmonitor->m_log_recvfd); /* Arrange for logging to be sent to the monitor */ set_log_handler(mm_log_handler, pmonitor); privsep_preauth_child(); setproctitle("%s", "[net]"); if (box != NULL) ssh_sandbox_child(box); return 0; } } static void privsep_postauth(Authctxt *authctxt) { #ifdef DISABLE_FD_PASSING if (1) { #else if (authctxt->pw->pw_uid == 0) { #endif /* File descriptor passing is broken or root login */ use_privsep = 0; goto skip; } /* New socket pair */ monitor_reinit(pmonitor); pmonitor->m_pid = fork(); if (pmonitor->m_pid == -1) fatal("fork of unprivileged child failed"); else if (pmonitor->m_pid != 0) { verbose("User child is on pid %ld", (long)pmonitor->m_pid); sshbuf_reset(loginmsg); monitor_clear_keystate(pmonitor); monitor_child_postauth(pmonitor); /* NEVERREACHED */ exit(0); } /* child */ close(pmonitor->m_sendfd); pmonitor->m_sendfd = -1; /* Demote the private keys to public keys. */ demote_sensitive_data(); reseed_prngs(); /* Drop privileges */ do_setusercontext(authctxt->pw); skip: /* It is safe now to apply the key state */ monitor_apply_keystate(pmonitor); /* * Tell the packet layer that authentication was successful, since * this information is not part of the key state. */ packet_set_authenticated(); } static void append_hostkey_type(struct sshbuf *b, const char *s) { int r; if (match_pattern_list(s, options.hostkeyalgorithms, 0) != 1) { debug3("%s: %s key not permitted by HostkeyAlgorithms", __func__, s); return; } if ((r = sshbuf_putf(b, "%s%s", sshbuf_len(b) > 0 ? 
"," : "", s)) != 0) fatal("%s: sshbuf_putf: %s", __func__, ssh_err(r)); } static char * list_hostkey_types(void) { struct sshbuf *b; struct sshkey *key; char *ret; u_int i; if ((b = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); for (i = 0; i < options.num_host_key_files; i++) { key = sensitive_data.host_keys[i]; if (key == NULL) key = sensitive_data.host_pubkeys[i]; if (key == NULL) continue; switch (key->type) { case KEY_RSA: /* for RSA we also support SHA2 signatures */ append_hostkey_type(b, "rsa-sha2-512"); append_hostkey_type(b, "rsa-sha2-256"); /* FALLTHROUGH */ case KEY_DSA: case KEY_ECDSA: case KEY_ED25519: case KEY_XMSS: append_hostkey_type(b, sshkey_ssh_name(key)); break; } /* If the private key has a cert peer, then list that too */ key = sensitive_data.host_certificates[i]; if (key == NULL) continue; switch (key->type) { case KEY_RSA_CERT: /* for RSA we also support SHA2 signatures */ append_hostkey_type(b, "rsa-sha2-512-cert-v01@openssh.com"); append_hostkey_type(b, "rsa-sha2-256-cert-v01@openssh.com"); /* FALLTHROUGH */ case KEY_DSA_CERT: case KEY_ECDSA_CERT: case KEY_ED25519_CERT: case KEY_XMSS_CERT: append_hostkey_type(b, sshkey_ssh_name(key)); break; } } if ((ret = sshbuf_dup_string(b)) == NULL) fatal("%s: sshbuf_dup_string failed", __func__); sshbuf_free(b); debug("%s: %s", __func__, ret); return ret; } static struct sshkey * get_hostkey_by_type(int type, int nid, int need_private, struct ssh *ssh) { u_int i; struct sshkey *key; for (i = 0; i < options.num_host_key_files; i++) { switch (type) { case KEY_RSA_CERT: case KEY_DSA_CERT: case KEY_ECDSA_CERT: case KEY_ED25519_CERT: case KEY_XMSS_CERT: key = sensitive_data.host_certificates[i]; break; default: key = sensitive_data.host_keys[i]; if (key == NULL && !need_private) key = sensitive_data.host_pubkeys[i]; break; } if (key != NULL && key->type == type && (key->type != KEY_ECDSA || key->ecdsa_nid == nid)) return need_private ? sensitive_data.host_keys[i] : key; } return NULL; } struct sshkey * get_hostkey_public_by_type(int type, int nid, struct ssh *ssh) { return get_hostkey_by_type(type, nid, 0, ssh); } struct sshkey * get_hostkey_private_by_type(int type, int nid, struct ssh *ssh) { return get_hostkey_by_type(type, nid, 1, ssh); } struct sshkey * get_hostkey_by_index(int ind) { if (ind < 0 || (u_int)ind >= options.num_host_key_files) return (NULL); return (sensitive_data.host_keys[ind]); } struct sshkey * get_hostkey_public_by_index(int ind, struct ssh *ssh) { if (ind < 0 || (u_int)ind >= options.num_host_key_files) return (NULL); return (sensitive_data.host_pubkeys[ind]); } int get_hostkey_index(struct sshkey *key, int compare, struct ssh *ssh) { u_int i; for (i = 0; i < options.num_host_key_files; i++) { if (sshkey_is_cert(key)) { if (key == sensitive_data.host_certificates[i] || (compare && sensitive_data.host_certificates[i] && sshkey_equal(key, sensitive_data.host_certificates[i]))) return (i); } else { if (key == sensitive_data.host_keys[i] || (compare && sensitive_data.host_keys[i] && sshkey_equal(key, sensitive_data.host_keys[i]))) return (i); if (key == sensitive_data.host_pubkeys[i] || (compare && sensitive_data.host_pubkeys[i] && sshkey_equal(key, sensitive_data.host_pubkeys[i]))) return (i); } } return (-1); } /* Inform the client of all hostkeys */ static void notify_hostkeys(struct ssh *ssh) { struct sshbuf *buf; struct sshkey *key; u_int i, nkeys; int r; char *fp; /* Some clients cannot cope with the hostkeys message, skip those. 
*/ if (datafellows & SSH_BUG_HOSTKEYS) return; if ((buf = sshbuf_new()) == NULL) fatal("%s: sshbuf_new", __func__); for (i = nkeys = 0; i < options.num_host_key_files; i++) { key = get_hostkey_public_by_index(i, ssh); if (key == NULL || key->type == KEY_UNSPEC || sshkey_is_cert(key)) continue; fp = sshkey_fingerprint(key, options.fingerprint_hash, SSH_FP_DEFAULT); debug3("%s: key %d: %s %s", __func__, i, sshkey_ssh_name(key), fp); free(fp); if (nkeys == 0) { packet_start(SSH2_MSG_GLOBAL_REQUEST); packet_put_cstring("hostkeys-00@openssh.com"); packet_put_char(0); /* want-reply */ } sshbuf_reset(buf); if ((r = sshkey_putb(key, buf)) != 0) fatal("%s: couldn't put hostkey %d: %s", __func__, i, ssh_err(r)); packet_put_string(sshbuf_ptr(buf), sshbuf_len(buf)); nkeys++; } debug3("%s: sent %u hostkeys", __func__, nkeys); if (nkeys == 0) fatal("%s: no hostkeys", __func__); packet_send(); sshbuf_free(buf); } /* * returns 1 if connection should be dropped, 0 otherwise. * dropping starts at connection #max_startups_begin with a probability * of (max_startups_rate/100). the probability increases linearly until * all connections are dropped for startups > max_startups */ static int drop_connection(int startups) { int p, r; if (startups < options.max_startups_begin) return 0; if (startups >= options.max_startups) return 1; if (options.max_startups_rate == 100) return 1; p = 100 - options.max_startups_rate; p *= startups - options.max_startups_begin; p /= options.max_startups - options.max_startups_begin; p += options.max_startups_rate; r = arc4random_uniform(100); debug("drop_connection: p %d, r %d", p, r); return (r < p) ? 1 : 0; } static void usage(void) { if (options.version_addendum && *options.version_addendum != '\0') fprintf(stderr, "%s %s, %s\n", SSH_RELEASE, options.version_addendum, OPENSSL_VERSION_STRING); else fprintf(stderr, "%s, %s\n", SSH_RELEASE, OPENSSL_VERSION_STRING); fprintf(stderr, "usage: sshd [-46DdeiqTt] [-C connection_spec] [-c host_cert_file]\n" " [-E log_file] [-f config_file] [-g login_grace_time]\n" " [-h host_key_file] [-o option] [-p port] [-u len]\n" ); exit(1); } static void send_rexec_state(int fd, struct sshbuf *conf) { struct sshbuf *m; int r; debug3("%s: entering fd = %d config len %zu", __func__, fd, sshbuf_len(conf)); /* * Protocol from reexec master to child: * string configuration * string rngseed (only if OpenSSL is not self-seeded) */ if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if ((r = sshbuf_put_stringb(m, conf)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #if defined(WITH_OPENSSL) && !defined(OPENSSL_PRNG_ONLY) rexec_send_rng_seed(m); #endif if (ssh_msg_send(fd, 0, m) == -1) fatal("%s: ssh_msg_send failed", __func__); sshbuf_free(m); debug3("%s: done", __func__); } static void recv_rexec_state(int fd, struct sshbuf *conf) { struct sshbuf *m; u_char *cp, ver; size_t len; int r; debug3("%s: entering fd = %d", __func__, fd); if ((m = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if (ssh_msg_recv(fd, m) == -1) fatal("%s: ssh_msg_recv failed", __func__); if ((r = sshbuf_get_u8(m, &ver)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (ver != 0) fatal("%s: rexec version mismatch", __func__); if ((r = sshbuf_get_string(m, &cp, &len)) != 0) fatal("%s: buffer error: %s", __func__, ssh_err(r)); if (conf != NULL && (r = sshbuf_put(conf, cp, len))) fatal("%s: buffer error: %s", __func__, ssh_err(r)); #if defined(WITH_OPENSSL) && !defined(OPENSSL_PRNG_ONLY) rexec_recv_rng_seed(m); #endif 
free(cp); sshbuf_free(m); debug3("%s: done", __func__); } /* Accept a connection from inetd */ static void server_accept_inetd(int *sock_in, int *sock_out) { int fd; startup_pipe = -1; if (rexeced_flag) { close(REEXEC_CONFIG_PASS_FD); *sock_in = *sock_out = dup(STDIN_FILENO); if (!debug_flag) { startup_pipe = dup(REEXEC_STARTUP_PIPE_FD); close(REEXEC_STARTUP_PIPE_FD); } } else { *sock_in = dup(STDIN_FILENO); *sock_out = dup(STDOUT_FILENO); } /* * We intentionally do not close the descriptors 0, 1, and 2 * as our code for setting the descriptors won't work if * ttyfd happens to be one of those. */ if ((fd = open(_PATH_DEVNULL, O_RDWR, 0)) != -1) { dup2(fd, STDIN_FILENO); dup2(fd, STDOUT_FILENO); if (!log_stderr) dup2(fd, STDERR_FILENO); if (fd > (log_stderr ? STDERR_FILENO : STDOUT_FILENO)) close(fd); } debug("inetd sockets after dupping: %d, %d", *sock_in, *sock_out); } /* * Listen for TCP connections */ static void listen_on_addrs(struct listenaddr *la) { int ret, listen_sock; struct addrinfo *ai; char ntop[NI_MAXHOST], strport[NI_MAXSERV]; int socksize; socklen_t len; for (ai = la->addrs; ai; ai = ai->ai_next) { if (ai->ai_family != AF_INET && ai->ai_family != AF_INET6) continue; if (num_listen_socks >= MAX_LISTEN_SOCKS) fatal("Too many listen sockets. " "Enlarge MAX_LISTEN_SOCKS"); if ((ret = getnameinfo(ai->ai_addr, ai->ai_addrlen, ntop, sizeof(ntop), strport, sizeof(strport), NI_NUMERICHOST|NI_NUMERICSERV)) != 0) { error("getnameinfo failed: %.100s", ssh_gai_strerror(ret)); continue; } /* Create socket for listening. */ listen_sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol); if (listen_sock < 0) { /* kernel may not support ipv6 */ verbose("socket: %.100s", strerror(errno)); continue; } if (set_nonblock(listen_sock) == -1) { close(listen_sock); continue; } if (fcntl(listen_sock, F_SETFD, FD_CLOEXEC) == -1) { verbose("socket: CLOEXEC: %s", strerror(errno)); close(listen_sock); continue; } /* Socket options */ set_reuseaddr(listen_sock); if (la->rdomain != NULL && set_rdomain(listen_sock, la->rdomain) == -1) { close(listen_sock); continue; } /* Only communicate in IPv6 over AF_INET6 sockets. */ if (ai->ai_family == AF_INET6) sock_set_v6only(listen_sock); debug("Bind to port %s on %s.", strport, ntop); len = sizeof(socksize); getsockopt(listen_sock, SOL_SOCKET, SO_RCVBUF, &socksize, &len); debug("Server TCP RWIN socket size: %d", socksize); /* Bind the socket to the desired port. */ if (bind(listen_sock, ai->ai_addr, ai->ai_addrlen) < 0) { error("Bind to port %s on %s failed: %.200s.", strport, ntop, strerror(errno)); close(listen_sock); continue; } listen_socks[num_listen_socks] = listen_sock; num_listen_socks++; /* Start listening on the port. */ if (listen(listen_sock, SSH_LISTEN_BACKLOG) < 0) fatal("listen on [%s]:%s: %.100s", ntop, strport, strerror(errno)); logit("Server listening on %s port %s%s%s.", ntop, strport, la->rdomain == NULL ? "" : " rdomain ", la->rdomain == NULL ? "" : la->rdomain); } } static void server_listen(void) { u_int i; for (i = 0; i < options.num_listen_addrs; i++) { listen_on_addrs(&options.listen_addrs[i]); freeaddrinfo(options.listen_addrs[i].addrs); free(options.listen_addrs[i].rdomain); memset(&options.listen_addrs[i], 0, sizeof(options.listen_addrs[i])); } free(options.listen_addrs); options.listen_addrs = NULL; options.num_listen_addrs = 0; if (!num_listen_socks) fatal("Cannot bind any address."); } /* * The main TCP accept loop. Note that, for the non-debug case, returns * from this function are in a forked subprocess. 
*/ static void server_accept_loop(int *sock_in, int *sock_out, int *newsock, int *config_s) { fd_set *fdset; int i, j, ret, maxfd; int startups = 0; int startup_p[2] = { -1 , -1 }; struct sockaddr_storage from; socklen_t fromlen; pid_t pid; u_char rnd[256]; /* setup fd set for accept */ fdset = NULL; maxfd = 0; for (i = 0; i < num_listen_socks; i++) if (listen_socks[i] > maxfd) maxfd = listen_socks[i]; /* pipes connected to unauthenticated childs */ startup_pipes = xcalloc(options.max_startups, sizeof(int)); for (i = 0; i < options.max_startups; i++) startup_pipes[i] = -1; /* * Stay listening for connections until the system crashes or * the daemon is killed with a signal. */ for (;;) { if (received_sighup) sighup_restart(); free(fdset); fdset = xcalloc(howmany(maxfd + 1, NFDBITS), sizeof(fd_mask)); for (i = 0; i < num_listen_socks; i++) FD_SET(listen_socks[i], fdset); for (i = 0; i < options.max_startups; i++) if (startup_pipes[i] != -1) FD_SET(startup_pipes[i], fdset); /* Wait in select until there is a connection. */ ret = select(maxfd+1, fdset, NULL, NULL, NULL); if (ret < 0 && errno != EINTR) error("select: %.100s", strerror(errno)); if (received_sigterm) { logit("Received signal %d; terminating.", (int) received_sigterm); close_listen_socks(); if (options.pid_file != NULL) unlink(options.pid_file); exit(received_sigterm == SIGTERM ? 0 : 255); } if (ret < 0) continue; for (i = 0; i < options.max_startups; i++) if (startup_pipes[i] != -1 && FD_ISSET(startup_pipes[i], fdset)) { /* * the read end of the pipe is ready * if the child has closed the pipe * after successful authentication * or if the child has died */ close(startup_pipes[i]); startup_pipes[i] = -1; startups--; } for (i = 0; i < num_listen_socks; i++) { if (!FD_ISSET(listen_socks[i], fdset)) continue; fromlen = sizeof(from); *newsock = accept(listen_socks[i], (struct sockaddr *)&from, &fromlen); if (*newsock < 0) { if (errno != EINTR && errno != EWOULDBLOCK && errno != ECONNABORTED && errno != EAGAIN) error("accept: %.100s", strerror(errno)); if (errno == EMFILE || errno == ENFILE) usleep(100 * 1000); continue; } if (unset_nonblock(*newsock) == -1) { close(*newsock); continue; } if (drop_connection(startups) == 1) { char *laddr = get_local_ipaddr(*newsock); char *raddr = get_peer_ipaddr(*newsock); verbose("drop connection #%d from [%s]:%d " "on [%s]:%d past MaxStartups", startups, raddr, get_peer_port(*newsock), laddr, get_local_port(*newsock)); free(laddr); free(raddr); close(*newsock); continue; } if (pipe(startup_p) == -1) { close(*newsock); continue; } if (rexec_flag && socketpair(AF_UNIX, SOCK_STREAM, 0, config_s) == -1) { error("reexec socketpair: %s", strerror(errno)); close(*newsock); close(startup_p[0]); close(startup_p[1]); continue; } for (j = 0; j < options.max_startups; j++) if (startup_pipes[j] == -1) { startup_pipes[j] = startup_p[0]; if (maxfd < startup_p[0]) maxfd = startup_p[0]; startups++; break; } /* * Got connection. Fork a child to handle it, unless * we are in debugging mode. */ if (debug_flag) { /* * In debugging mode. Close the listening * socket, and start processing the * connection without forking. */ debug("Server will not fork when running in debugging mode."); close_listen_socks(); *sock_in = *newsock; *sock_out = *newsock; close(startup_p[0]); close(startup_p[1]); startup_pipe = -1; pid = getpid(); if (rexec_flag) { send_rexec_state(config_s[0], cfg); close(config_s[0]); } break; } /* * Normal production daemon. Fork, and have * the child process the connection. 
The * parent continues listening. */ platform_pre_fork(); if ((pid = fork()) == 0) { /* * Child. Close the listening and * max_startup sockets. Start using * the accepted socket. Reinitialize * logging (since our pid has changed). * We break out of the loop to handle * the connection. */ platform_post_fork_child(); startup_pipe = startup_p[1]; close_startup_pipes(); close_listen_socks(); *sock_in = *newsock; *sock_out = *newsock; log_init(__progname, options.log_level, options.log_facility, log_stderr); if (rexec_flag) close(config_s[0]); break; } /* Parent. Stay in the loop. */ platform_post_fork_parent(pid); if (pid < 0) error("fork: %.100s", strerror(errno)); else debug("Forked child %ld.", (long)pid); close(startup_p[1]); if (rexec_flag) { send_rexec_state(config_s[0], cfg); close(config_s[0]); close(config_s[1]); } close(*newsock); /* * Ensure that our random state differs * from that of the child */ arc4random_stir(); arc4random_buf(rnd, sizeof(rnd)); #ifdef WITH_OPENSSL RAND_seed(rnd, sizeof(rnd)); if ((RAND_bytes((u_char *)rnd, 1)) != 1) fatal("%s: RAND_bytes failed", __func__); #endif explicit_bzero(rnd, sizeof(rnd)); } /* child process check (or debug mode) */ if (num_listen_socks < 0) break; } } /* * If IP options are supported, make sure there are none (log and * return an error if any are found). Basically we are worried about * source routing; it can be used to pretend you are somebody * (ip-address) you are not. That itself may be "almost acceptable" * under certain circumstances, but rhosts authentication is useless * if source routing is accepted. Notice also that if we just dropped * source routing here, the other side could use IP spoofing to do * rest of the interaction and could still bypass security. So we * exit here if we detect any IP options. */ static void check_ip_options(struct ssh *ssh) { #ifdef IP_OPTIONS int sock_in = ssh_packet_get_connection_in(ssh); struct sockaddr_storage from; u_char opts[200]; socklen_t i, option_size = sizeof(opts), fromlen = sizeof(from); char text[sizeof(opts) * 3 + 1]; memset(&from, 0, sizeof(from)); if (getpeername(sock_in, (struct sockaddr *)&from, &fromlen) < 0) return; if (from.ss_family != AF_INET) return; /* XXX IPv6 options? */ if (getsockopt(sock_in, IPPROTO_IP, IP_OPTIONS, opts, &option_size) >= 0 && option_size != 0) { text[0] = '\0'; for (i = 0; i < option_size; i++) snprintf(text + i*3, sizeof(text) - i*3, " %2.2x", opts[i]); fatal("Connection from %.100s port %d with IP opts: %.800s", ssh_remote_ipaddr(ssh), ssh_remote_port(ssh), text); } return; #endif /* IP_OPTIONS */ } /* Set the routing domain for this process */ static void set_process_rdomain(struct ssh *ssh, const char *name) { #if defined(HAVE_SYS_SET_PROCESS_RDOMAIN) if (name == NULL) return; /* default */ if (strcmp(name, "%D") == 0) { /* "expands" to routing domain of connection */ if ((name = ssh_packet_rdomain_in(ssh)) == NULL) return; } /* NB. 
We don't pass 'ssh' to sys_set_process_rdomain() */ return sys_set_process_rdomain(name); #elif defined(__OpenBSD__) int rtable, ortable = getrtable(); const char *errstr; if (name == NULL) return; /* default */ if (strcmp(name, "%D") == 0) { /* "expands" to routing domain of connection */ if ((name = ssh_packet_rdomain_in(ssh)) == NULL) return; } rtable = (int)strtonum(name, 0, 255, &errstr); if (errstr != NULL) /* Shouldn't happen */ fatal("Invalid routing domain \"%s\": %s", name, errstr); if (rtable != ortable && setrtable(rtable) != 0) fatal("Unable to set routing domain %d: %s", rtable, strerror(errno)); debug("%s: set routing domain %d (was %d)", __func__, rtable, ortable); #else /* defined(__OpenBSD__) */ fatal("Unable to set routing domain: not supported in this platform"); #endif } static void accumulate_host_timing_secret(struct sshbuf *server_cfg, const struct sshkey *key) { static struct ssh_digest_ctx *ctx; u_char *hash; size_t len; struct sshbuf *buf; int r; if (ctx == NULL && (ctx = ssh_digest_start(SSH_DIGEST_SHA512)) == NULL) fatal("%s: ssh_digest_start", __func__); if (key == NULL) { /* finalize */ /* add server config in case we are using agent for host keys */ if (ssh_digest_update(ctx, sshbuf_ptr(server_cfg), sshbuf_len(server_cfg)) != 0) fatal("%s: ssh_digest_update", __func__); len = ssh_digest_bytes(SSH_DIGEST_SHA512); hash = xmalloc(len); if (ssh_digest_final(ctx, hash, len) != 0) fatal("%s: ssh_digest_final", __func__); options.timing_secret = PEEK_U64(hash); freezero(hash, len); ssh_digest_free(ctx); ctx = NULL; return; } if ((buf = sshbuf_new()) == NULL) fatal("%s could not allocate buffer", __func__); if ((r = sshkey_private_serialize(key, buf)) != 0) fatal("sshkey_private_serialize: %s", ssh_err(r)); if (ssh_digest_update(ctx, sshbuf_ptr(buf), sshbuf_len(buf)) != 0) fatal("%s: ssh_digest_update", __func__); sshbuf_reset(buf); sshbuf_free(buf); } /* * Main program for the daemon. */ int main(int ac, char **av) { struct ssh *ssh = NULL; extern char *optarg; extern int optind; int r, opt, on = 1, already_daemon, remote_port; int sock_in = -1, sock_out = -1, newsock = -1; const char *remote_ip, *rdomain; char *fp, *line, *laddr, *logfile = NULL; int config_s[2] = { -1 , -1 }; u_int i, j; u_int64_t ibytes, obytes; mode_t new_umask; struct sshkey *key; struct sshkey *pubkey; int keytype; Authctxt *authctxt; struct connection_info *connection_info = NULL; ssh_malloc_init(); /* must be called before any mallocs */ #ifdef HAVE_SECUREWARE (void)set_auth_parameters(ac, av); #endif __progname = ssh_get_progname(av[0]); /* Save argv. Duplicate so setproctitle emulation doesn't clobber it */ saved_argc = ac; rexec_argc = ac; saved_argv = xcalloc(ac + 1, sizeof(*saved_argv)); for (i = 0; (int)i < ac; i++) saved_argv[i] = xstrdup(av[i]); saved_argv[i] = NULL; #ifndef HAVE_SETPROCTITLE /* Prepare for later setproctitle emulation */ compat_init_setproctitle(ac, av); av = saved_argv; #endif if (geteuid() == 0 && setgroups(0, NULL) == -1) debug("setgroups(): %.200s", strerror(errno)); /* Ensure that fds 0, 1 and 2 are open or directed to /dev/null */ sanitise_stdfd(); /* Initialize configuration options to their default values. */ initialize_server_options(&options); /* Parse command-line arguments. 
*/ while ((opt = getopt(ac, av, "C:E:b:c:f:g:h:k:o:p:u:46DQRTdeiqrt")) != -1) { switch (opt) { case '4': options.address_family = AF_INET; break; case '6': options.address_family = AF_INET6; break; case 'f': config_file_name = optarg; break; case 'c': servconf_add_hostcert("[command-line]", 0, &options, optarg); break; case 'd': if (debug_flag == 0) { debug_flag = 1; options.log_level = SYSLOG_LEVEL_DEBUG1; } else if (options.log_level < SYSLOG_LEVEL_DEBUG3) options.log_level++; break; case 'D': no_daemon_flag = 1; break; case 'E': logfile = optarg; /* FALLTHROUGH */ case 'e': log_stderr = 1; break; case 'i': inetd_flag = 1; break; case 'r': rexec_flag = 0; break; case 'R': rexeced_flag = 1; inetd_flag = 1; break; case 'Q': /* ignored */ break; case 'q': options.log_level = SYSLOG_LEVEL_QUIET; break; case 'b': /* protocol 1, ignored */ break; case 'p': options.ports_from_cmdline = 1; if (options.num_ports >= MAX_PORTS) { fprintf(stderr, "too many ports.\n"); exit(1); } options.ports[options.num_ports++] = a2port(optarg); if (options.ports[options.num_ports-1] <= 0) { fprintf(stderr, "Bad port number.\n"); exit(1); } break; case 'g': if ((options.login_grace_time = convtime(optarg)) == -1) { fprintf(stderr, "Invalid login grace time.\n"); exit(1); } break; case 'k': /* protocol 1, ignored */ break; case 'h': servconf_add_hostkey("[command-line]", 0, &options, optarg); break; case 't': test_flag = 1; break; case 'T': test_flag = 2; break; case 'C': connection_info = get_connection_info(0, 0); if (parse_server_match_testspec(connection_info, optarg) == -1) exit(1); break; case 'u': utmp_len = (u_int)strtonum(optarg, 0, HOST_NAME_MAX+1+1, NULL); if (utmp_len > HOST_NAME_MAX+1) { fprintf(stderr, "Invalid utmp length.\n"); exit(1); } break; case 'o': line = xstrdup(optarg); if (process_server_config_line(&options, line, "command-line", 0, NULL, NULL) != 0) exit(1); free(line); break; case '?': default: usage(); break; } } if (rexeced_flag || inetd_flag) rexec_flag = 0; if (!test_flag && (rexec_flag && (av[0] == NULL || *av[0] != '/'))) fatal("sshd re-exec requires execution with an absolute path"); if (rexeced_flag) closefrom(REEXEC_MIN_FREE_FD); else closefrom(REEXEC_DEVCRYPTO_RESERVED_FD); #ifdef WITH_OPENSSL OpenSSL_add_all_algorithms(); #endif /* If requested, redirect the logs to the specified logfile. */ if (logfile != NULL) log_redirect_stderr_to(logfile); /* * Force logging to stderr until we have loaded the private host * key (unless started from inetd) */ log_init(__progname, options.log_level == SYSLOG_LEVEL_NOT_SET ? SYSLOG_LEVEL_INFO : options.log_level, options.log_facility == SYSLOG_FACILITY_NOT_SET ? SYSLOG_FACILITY_AUTH : options.log_facility, log_stderr || !inetd_flag); /* * Unset KRB5CCNAME, otherwise the user's session may inherit it from * root's environment */ if (getenv("KRB5CCNAME") != NULL) (void) unsetenv("KRB5CCNAME"); sensitive_data.have_ssh2_key = 0; /* * If we're not doing an extended test do not silently ignore connection * test params. */ if (test_flag < 2 && connection_info != NULL) fatal("Config test connection parameter (-C) provided without " "test mode (-T)"); /* Fetch our configuration */ if ((cfg = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); if (rexeced_flag) recv_rexec_state(REEXEC_CONFIG_PASS_FD, cfg); else if (strcasecmp(config_file_name, "none") != 0) load_server_config(config_file_name, cfg); parse_server_config(&options, rexeced_flag ? 
"rexec" : config_file_name, cfg, NULL); seed_rng(); /* Fill in default values for those options not explicitly set. */ fill_default_server_options(&options); /* challenge-response is implemented via keyboard interactive */ if (options.challenge_response_authentication) options.kbd_interactive_authentication = 1; /* Check that options are sensible */ if (options.authorized_keys_command_user == NULL && (options.authorized_keys_command != NULL && strcasecmp(options.authorized_keys_command, "none") != 0)) fatal("AuthorizedKeysCommand set without " "AuthorizedKeysCommandUser"); if (options.authorized_principals_command_user == NULL && (options.authorized_principals_command != NULL && strcasecmp(options.authorized_principals_command, "none") != 0)) fatal("AuthorizedPrincipalsCommand set without " "AuthorizedPrincipalsCommandUser"); /* * Check whether there is any path through configured auth methods. * Unfortunately it is not possible to verify this generally before * daemonisation in the presence of Match block, but this catches * and warns for trivial misconfigurations that could break login. */ if (options.num_auth_methods != 0) { for (i = 0; i < options.num_auth_methods; i++) { if (auth2_methods_valid(options.auth_methods[i], 1) == 0) break; } if (i >= options.num_auth_methods) fatal("AuthenticationMethods cannot be satisfied by " "enabled authentication methods"); } /* Check that there are no remaining arguments. */ if (optind < ac) { fprintf(stderr, "Extra argument %s.\n", av[optind]); exit(1); } debug("sshd version %s, %s", SSH_VERSION, #ifdef WITH_OPENSSL SSLeay_version(SSLEAY_VERSION) #else "without OpenSSL" #endif ); /* Store privilege separation user for later use if required. */ privsep_chroot = use_privsep && (getuid() == 0 || geteuid() == 0); if ((privsep_pw = getpwnam(SSH_PRIVSEP_USER)) == NULL) { if (privsep_chroot || options.kerberos_authentication) fatal("Privilege separation user %s does not exist", SSH_PRIVSEP_USER); } else { privsep_pw = pwcopy(privsep_pw); freezero(privsep_pw->pw_passwd, strlen(privsep_pw->pw_passwd)); privsep_pw->pw_passwd = xstrdup("*"); } endpwent(); /* load host keys */ sensitive_data.host_keys = xcalloc(options.num_host_key_files, sizeof(struct sshkey *)); sensitive_data.host_pubkeys = xcalloc(options.num_host_key_files, sizeof(struct sshkey *)); if (options.host_key_agent) { if (strcmp(options.host_key_agent, SSH_AUTHSOCKET_ENV_NAME)) setenv(SSH_AUTHSOCKET_ENV_NAME, options.host_key_agent, 1); if ((r = ssh_get_authentication_socket(NULL)) == 0) have_agent = 1; else error("Could not connect to agent \"%s\": %s", options.host_key_agent, ssh_err(r)); } for (i = 0; i < options.num_host_key_files; i++) { if (options.host_key_files[i] == NULL) continue; if ((r = sshkey_load_private(options.host_key_files[i], "", &key, NULL)) != 0 && r != SSH_ERR_SYSTEM_ERROR) error("Error loading host key \"%s\": %s", options.host_key_files[i], ssh_err(r)); if ((r = sshkey_load_public(options.host_key_files[i], &pubkey, NULL)) != 0 && r != SSH_ERR_SYSTEM_ERROR) error("Error loading host key \"%s\": %s", options.host_key_files[i], ssh_err(r)); if (pubkey == NULL && key != NULL) if ((r = sshkey_from_private(key, &pubkey)) != 0) fatal("Could not demote key: \"%s\": %s", options.host_key_files[i], ssh_err(r)); sensitive_data.host_keys[i] = key; sensitive_data.host_pubkeys[i] = pubkey; if (key == NULL && pubkey != NULL && have_agent) { debug("will rely on agent for hostkey %s", options.host_key_files[i]); keytype = pubkey->type; } else if (key != NULL) { keytype = key->type; 
accumulate_host_timing_secret(cfg, key); } else { error("Could not load host key: %s", options.host_key_files[i]); sensitive_data.host_keys[i] = NULL; sensitive_data.host_pubkeys[i] = NULL; continue; } switch (keytype) { case KEY_RSA: case KEY_DSA: case KEY_ECDSA: case KEY_ED25519: case KEY_XMSS: if (have_agent || key != NULL) sensitive_data.have_ssh2_key = 1; break; } if ((fp = sshkey_fingerprint(pubkey, options.fingerprint_hash, SSH_FP_DEFAULT)) == NULL) fatal("sshkey_fingerprint failed"); debug("%s host key #%d: %s %s", key ? "private" : "agent", i, sshkey_ssh_name(pubkey), fp); free(fp); } accumulate_host_timing_secret(cfg, NULL); if (!sensitive_data.have_ssh2_key) { logit("sshd: no hostkeys available -- exiting."); exit(1); } /* * Load certificates. They are stored in an array at identical * indices to the public keys that they relate to. */ sensitive_data.host_certificates = xcalloc(options.num_host_key_files, sizeof(struct sshkey *)); for (i = 0; i < options.num_host_key_files; i++) sensitive_data.host_certificates[i] = NULL; for (i = 0; i < options.num_host_cert_files; i++) { if (options.host_cert_files[i] == NULL) continue; if ((r = sshkey_load_public(options.host_cert_files[i], &key, NULL)) != 0) { error("Could not load host certificate \"%s\": %s", options.host_cert_files[i], ssh_err(r)); continue; } if (!sshkey_is_cert(key)) { error("Certificate file is not a certificate: %s", options.host_cert_files[i]); sshkey_free(key); continue; } /* Find matching private key */ for (j = 0; j < options.num_host_key_files; j++) { if (sshkey_equal_public(key, sensitive_data.host_keys[j])) { sensitive_data.host_certificates[j] = key; break; } } if (j >= options.num_host_key_files) { error("No matching private key for certificate: %s", options.host_cert_files[i]); sshkey_free(key); continue; } sensitive_data.host_certificates[j] = key; debug("host certificate: #%u type %d %s", j, key->type, sshkey_type(key)); } if (privsep_chroot) { struct stat st; if ((stat(_PATH_PRIVSEP_CHROOT_DIR, &st) == -1) || (S_ISDIR(st.st_mode) == 0)) fatal("Missing privilege separation directory: %s", _PATH_PRIVSEP_CHROOT_DIR); #ifdef HAVE_CYGWIN if (check_ntsec(_PATH_PRIVSEP_CHROOT_DIR) && (st.st_uid != getuid () || (st.st_mode & (S_IWGRP|S_IWOTH)) != 0)) #else if (st.st_uid != 0 || (st.st_mode & (S_IWGRP|S_IWOTH)) != 0) #endif fatal("%s must be owned by root and not group or " "world-writable.", _PATH_PRIVSEP_CHROOT_DIR); } if (test_flag > 1) { /* * If no connection info was provided by -C then use * use a blank one that will cause no predicate to match. */ if (connection_info == NULL) connection_info = get_connection_info(0, 0); parse_server_match_config(&options, connection_info); dump_config(&options); } /* Configuration looks good, so exit if in test mode. */ if (test_flag) exit(0); /* * Clear out any supplemental groups we may have inherited. This * prevents inadvertent creation of files with bad modes (in the * portable version at least, it's certainly possible for PAM * to create a file, and we can't control the code in every * module which might be used). 
*/ if (setgroups(0, NULL) < 0) debug("setgroups() failed: %.200s", strerror(errno)); if (rexec_flag) { if (rexec_argc < 0) fatal("rexec_argc %d < 0", rexec_argc); rexec_argv = xcalloc(rexec_argc + 2, sizeof(char *)); for (i = 0; i < (u_int)rexec_argc; i++) { debug("rexec_argv[%d]='%s'", i, saved_argv[i]); rexec_argv[i] = saved_argv[i]; } rexec_argv[rexec_argc] = "-R"; rexec_argv[rexec_argc + 1] = NULL; } /* Ensure that umask disallows at least group and world write */ new_umask = umask(0077) | 0022; (void) umask(new_umask); /* Initialize the log (it is reinitialized below in case we forked). */ if (debug_flag && (!inetd_flag || rexeced_flag)) log_stderr = 1; log_init(__progname, options.log_level, options.log_facility, log_stderr); /* * If not in debugging mode, not started from inetd and not already * daemonized (eg re-exec via SIGHUP), disconnect from the controlling * terminal, and fork. The original process exits. */ already_daemon = daemonized(); if (!(debug_flag || inetd_flag || no_daemon_flag || already_daemon)) { if (daemon(0, 0) < 0) fatal("daemon() failed: %.200s", strerror(errno)); disconnect_controlling_tty(); } /* Reinitialize the log (because of the fork above). */ log_init(__progname, options.log_level, options.log_facility, log_stderr); /* Avoid killing the process in high-pressure swapping environments. */ if (!inetd_flag && madvise(NULL, 0, MADV_PROTECT) != 0) debug("madvise(): %.200s", strerror(errno)); /* Chdir to the root directory so that the current disk can be unmounted if desired. */ if (chdir("/") == -1) error("chdir(\"/\"): %s", strerror(errno)); /* ignore SIGPIPE */ signal(SIGPIPE, SIG_IGN); /* Get a connection, either from inetd or a listening TCP socket */ if (inetd_flag) { server_accept_inetd(&sock_in, &sock_out); } else { platform_pre_listen(); server_listen(); signal(SIGHUP, sighup_handler); signal(SIGCHLD, main_sigchld_handler); signal(SIGTERM, sigterm_handler); signal(SIGQUIT, sigterm_handler); /* * Write out the pid file after the sigterm handler * is setup and the listen sockets are bound */ if (options.pid_file != NULL && !debug_flag) { FILE *f = fopen(options.pid_file, "w"); if (f == NULL) { error("Couldn't create pid file \"%s\": %s", options.pid_file, strerror(errno)); } else { fprintf(f, "%ld\n", (long) getpid()); fclose(f); } } /* Accept a connection and return in a forked child */ server_accept_loop(&sock_in, &sock_out, &newsock, config_s); } /* This is the child processing a new connection. */ setproctitle("%s", "[accepted]"); /* * Create a new session and process group since the 4.4BSD * setlogin() affects the entire process group. We don't * want the child to be able to affect the parent. */ #if !defined(SSHD_ACQUIRES_CTTY) /* * If setsid is called, on some platforms sshd will later acquire a * controlling terminal which will result in "could not set * controlling tty" errors. 
*/ if (!debug_flag && !inetd_flag && setsid() < 0) error("setsid: %.100s", strerror(errno)); #endif if (rexec_flag) { int fd; debug("rexec start in %d out %d newsock %d pipe %d sock %d", sock_in, sock_out, newsock, startup_pipe, config_s[0]); dup2(newsock, STDIN_FILENO); dup2(STDIN_FILENO, STDOUT_FILENO); if (startup_pipe == -1) close(REEXEC_STARTUP_PIPE_FD); else if (startup_pipe != REEXEC_STARTUP_PIPE_FD) { dup2(startup_pipe, REEXEC_STARTUP_PIPE_FD); close(startup_pipe); startup_pipe = REEXEC_STARTUP_PIPE_FD; } dup2(config_s[1], REEXEC_CONFIG_PASS_FD); close(config_s[1]); execv(rexec_argv[0], rexec_argv); /* Reexec has failed, fall back and continue */ error("rexec of %s failed: %s", rexec_argv[0], strerror(errno)); recv_rexec_state(REEXEC_CONFIG_PASS_FD, NULL); log_init(__progname, options.log_level, options.log_facility, log_stderr); /* Clean up fds */ close(REEXEC_CONFIG_PASS_FD); newsock = sock_out = sock_in = dup(STDIN_FILENO); if ((fd = open(_PATH_DEVNULL, O_RDWR, 0)) != -1) { dup2(fd, STDIN_FILENO); dup2(fd, STDOUT_FILENO); if (fd > STDERR_FILENO) close(fd); } debug("rexec cleanup in %d out %d newsock %d pipe %d sock %d", sock_in, sock_out, newsock, startup_pipe, config_s[0]); } /* Executed child processes don't need these. */ fcntl(sock_out, F_SETFD, FD_CLOEXEC); fcntl(sock_in, F_SETFD, FD_CLOEXEC); /* * Disable the key regeneration alarm. We will not regenerate the * key since we are no longer in a position to give it to anyone. We * will not restart on SIGHUP since it no longer makes sense. */ alarm(0); signal(SIGALRM, SIG_DFL); signal(SIGHUP, SIG_DFL); signal(SIGTERM, SIG_DFL); signal(SIGQUIT, SIG_DFL); signal(SIGCHLD, SIG_DFL); signal(SIGINT, SIG_DFL); #ifdef __FreeBSD__ /* * Initialize the resolver. This may not happen automatically * before privsep chroot(). */ if ((_res.options & RES_INIT) == 0) { debug("res_init()"); res_init(); } #ifdef GSSAPI /* * Force GSS-API to parse its configuration and load any * mechanism plugins. */ { gss_OID_set mechs; OM_uint32 minor_status; gss_indicate_mechs(&minor_status, &mechs); gss_release_oid_set(&minor_status, &mechs); } #endif #endif /* * Register our connection. This turns encryption off because we do * not have a key. */ packet_set_connection(sock_in, sock_out); packet_set_server(); ssh = active_state; /* XXX */ check_ip_options(ssh); /* Prepare the channels layer */ channel_init_channels(ssh); channel_set_af(ssh, options.address_family); process_permitopen(ssh, &options); /* Set SO_KEEPALIVE if requested. */ if (options.tcp_keep_alive && packet_connection_is_on_socket() && setsockopt(sock_in, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) error("setsockopt SO_KEEPALIVE: %.100s", strerror(errno)); if ((remote_port = ssh_remote_port(ssh)) < 0) { debug("ssh_remote_port failed"); cleanup_exit(255); } if (options.routing_domain != NULL) set_process_rdomain(ssh, options.routing_domain); /* * The rest of the code depends on the fact that * ssh_remote_ipaddr() caches the remote ip, even if * the socket goes away. */ remote_ip = ssh_remote_ipaddr(ssh); +#ifdef HAVE_LOGIN_CAP + /* Also caches remote hostname for sandboxed child. */ + auth_get_canonical_hostname(ssh, options.use_dns); +#endif + #ifdef SSH_AUDIT_EVENTS audit_connection_from(remote_ip, remote_port); #endif #ifdef LIBWRAP allow_severity = options.log_facility|LOG_INFO; deny_severity = options.log_facility|LOG_WARNING; /* Check whether logins are denied from this host. 
*/ if (packet_connection_is_on_socket()) { struct request_info req; request_init(&req, RQ_DAEMON, __progname, RQ_FILE, sock_in, 0); fromhost(&req); if (!hosts_access(&req)) { debug("Connection refused by tcp wrapper"); refuse(&req); /* NOTREACHED */ fatal("libwrap refuse returns"); } } #endif /* LIBWRAP */ rdomain = ssh_packet_rdomain_in(ssh); /* Log the connection. */ laddr = get_local_ipaddr(sock_in); verbose("Connection from %s port %d on %s port %d%s%s%s", remote_ip, remote_port, laddr, ssh_local_port(ssh), rdomain == NULL ? "" : " rdomain \"", rdomain == NULL ? "" : rdomain, rdomain == NULL ? "" : "\""); free(laddr); /* * We don't want to listen forever unless the other side * successfully authenticates itself. So we set up an alarm which is * cleared after successful authentication. A limit of zero * indicates no limit. Note that we don't set the alarm in debugging * mode; it is just annoying to have the server exit just when you * are about to discover the bug. */ signal(SIGALRM, grace_alarm_handler); if (!debug_flag) alarm(options.login_grace_time); sshd_exchange_identification(ssh, sock_in, sock_out); packet_set_nonblocking(); /* allocate authentication context */ authctxt = xcalloc(1, sizeof(*authctxt)); authctxt->loginmsg = loginmsg; /* XXX global for cleanup, access from other modules */ the_authctxt = authctxt; /* Set default key authentication options */ if ((auth_opts = sshauthopt_new_with_keys_defaults()) == NULL) fatal("allocation failed"); /* prepare buffer to collect messages to display to user after login */ if ((loginmsg = sshbuf_new()) == NULL) fatal("%s: sshbuf_new failed", __func__); auth_debug_reset(); BLACKLIST_INIT(); if (use_privsep) { if (privsep_preauth(authctxt) == 1) goto authenticated; } else if (have_agent) { if ((r = ssh_get_authentication_socket(&auth_sock)) != 0) { error("Unable to get agent socket: %s", ssh_err(r)); have_agent = 0; } } /* perform the key exchange */ /* authenticate user and start session */ do_ssh2_kex(); do_authentication2(authctxt); /* * If we use privilege separation, the unprivileged child transfers * the current keystate and exits */ if (use_privsep) { mm_send_keystate(pmonitor); packet_clear_keys(); exit(0); } authenticated: /* * Cancel the alarm we set to limit the time taken for * authentication. */ alarm(0); signal(SIGALRM, SIG_DFL); authctxt->authenticated = 1; if (startup_pipe != -1) { close(startup_pipe); startup_pipe = -1; } #ifdef SSH_AUDIT_EVENTS audit_event(SSH_AUTH_SUCCESS); #endif #ifdef GSSAPI if (options.gss_authentication) { temporarily_use_uid(authctxt->pw); ssh_gssapi_storecreds(); restore_uid(); } #endif #ifdef USE_PAM if (options.use_pam) { do_pam_setcred(1); do_pam_session(ssh); } #endif /* * In privilege separation, we fork another child and prepare * file descriptor passing. */ if (use_privsep) { privsep_postauth(authctxt); /* the monitor process [priv] will not return */ } packet_set_timeout(options.client_alive_interval, options.client_alive_count_max); /* Try to send all our hostkeys to the client */ notify_hostkeys(ssh); /* Start session. */ do_authenticated(ssh, authctxt); /* The connection has been terminated. 
*/ packet_get_bytes(&ibytes, &obytes); verbose("Transferred: sent %llu, received %llu bytes", (unsigned long long)obytes, (unsigned long long)ibytes); verbose("Closing connection to %.500s port %d", remote_ip, remote_port); #ifdef USE_PAM if (options.use_pam) finish_pam(); #endif /* USE_PAM */ #ifdef SSH_AUDIT_EVENTS PRIVSEP(audit_event(SSH_CONNECTION_CLOSE)); #endif packet_close(); if (use_privsep) mm_terminate(); exit(0); } int sshd_hostkey_sign(struct sshkey *privkey, struct sshkey *pubkey, u_char **signature, size_t *slenp, const u_char *data, size_t dlen, const char *alg, u_int flag) { int r; if (privkey) { if (PRIVSEP(sshkey_sign(privkey, signature, slenp, data, dlen, alg, datafellows)) < 0) fatal("%s: key_sign failed", __func__); } else if (use_privsep) { if (mm_sshkey_sign(pubkey, signature, slenp, data, dlen, alg, datafellows) < 0) fatal("%s: pubkey_sign failed", __func__); } else { if ((r = ssh_agent_sign(auth_sock, pubkey, signature, slenp, data, dlen, alg, datafellows)) != 0) fatal("%s: ssh_agent_sign failed: %s", __func__, ssh_err(r)); } return 0; } /* SSH2 key exchange */ static void do_ssh2_kex(void) { char *myproposal[PROPOSAL_MAX] = { KEX_SERVER }; struct kex *kex; int r; myproposal[PROPOSAL_KEX_ALGS] = compat_kex_proposal( options.kex_algorithms); myproposal[PROPOSAL_ENC_ALGS_CTOS] = compat_cipher_proposal( options.ciphers); myproposal[PROPOSAL_ENC_ALGS_STOC] = compat_cipher_proposal( options.ciphers); myproposal[PROPOSAL_MAC_ALGS_CTOS] = myproposal[PROPOSAL_MAC_ALGS_STOC] = options.macs; if (options.compression == COMP_NONE) { myproposal[PROPOSAL_COMP_ALGS_CTOS] = myproposal[PROPOSAL_COMP_ALGS_STOC] = "none"; } if (options.rekey_limit || options.rekey_interval) packet_set_rekey_limits(options.rekey_limit, options.rekey_interval); myproposal[PROPOSAL_SERVER_HOST_KEY_ALGS] = compat_pkalg_proposal( list_hostkey_types()); /* start key exchange */ if ((r = kex_setup(active_state, myproposal)) != 0) fatal("kex_setup: %s", ssh_err(r)); kex = active_state->kex; #ifdef WITH_OPENSSL kex->kex[KEX_DH_GRP1_SHA1] = kexdh_server; kex->kex[KEX_DH_GRP14_SHA1] = kexdh_server; kex->kex[KEX_DH_GRP14_SHA256] = kexdh_server; kex->kex[KEX_DH_GRP16_SHA512] = kexdh_server; kex->kex[KEX_DH_GRP18_SHA512] = kexdh_server; kex->kex[KEX_DH_GEX_SHA1] = kexgex_server; kex->kex[KEX_DH_GEX_SHA256] = kexgex_server; # ifdef OPENSSL_HAS_ECC kex->kex[KEX_ECDH_SHA2] = kexecdh_server; # endif #endif kex->kex[KEX_C25519_SHA256] = kexc25519_server; kex->server = 1; kex->client_version_string=client_version_string; kex->server_version_string=server_version_string; kex->load_host_public_key=&get_hostkey_public_by_type; kex->load_host_private_key=&get_hostkey_private_by_type; kex->host_key_index=&get_hostkey_index; kex->sign = sshd_hostkey_sign; ssh_dispatch_run_fatal(active_state, DISPATCH_BLOCK, &kex->done); session_id2 = kex->session_id; session_id2_len = kex->session_id_len; #ifdef DEBUG_KEXDH /* send 1st encrypted/maced/compressed message */ packet_start(SSH2_MSG_IGNORE); packet_put_cstring("markus"); packet_send(); packet_write_wait(); #endif debug("KEX done"); } /* server specific fatal cleanup */ void cleanup_exit(int i) { struct ssh *ssh = active_state; /* XXX */ if (the_authctxt) { do_cleanup(ssh, the_authctxt); if (use_privsep && privsep_is_preauth && pmonitor != NULL && pmonitor->m_pid > 1) { debug("Killing privsep child %d", pmonitor->m_pid); if (kill(pmonitor->m_pid, SIGKILL) != 0 && errno != ESRCH) error("%s: kill(%d): %s", __func__, pmonitor->m_pid, strerror(errno)); } } #ifdef SSH_AUDIT_EVENTS /* 
done after do_cleanup so it can cancel the PAM auth 'thread' */ if (!use_privsep || mm_is_monitor()) audit_event(SSH_CONNECTION_ABANDON); #endif _exit(i); } Index: projects/openssl111/crypto/openssh =================================================================== --- projects/openssl111/crypto/openssh (revision 339239) +++ projects/openssl111/crypto/openssh (revision 339240) Property changes on: projects/openssl111/crypto/openssh ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/crypto/openssh:r339215-339239 Index: projects/openssl111/lib/libc/amd64/string/memset.S =================================================================== --- projects/openssl111/lib/libc/amd64/string/memset.S (revision 339239) +++ projects/openssl111/lib/libc/amd64/string/memset.S (revision 339240) @@ -1,77 +1,131 @@ /*- * Copyright (c) 2018 The FreeBSD Foundation * * This software was developed by Mateusz Guzik * under sponsorship from the FreeBSD Foundation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * $FreeBSD$ */ #include __FBSDID("$FreeBSD$"); -.macro MEMSET bzero +.macro MEMSET bzero erms .if \bzero == 1 movq %rsi,%rcx movq %rsi,%rdx xorl %eax,%eax .else movq %rdi,%r9 movq %rdx,%rcx movzbq %sil,%r8 movabs $0x0101010101010101,%rax imulq %r8,%rax .endif - cmpq $15,%rcx - jbe 1f - shrq $3,%rcx - rep - stosq - movq %rdx,%rcx - andq $7,%rcx - jne 1f + + cmpq $32,%rcx + jb 1016f + + cmpq $256,%rcx + ja 1256f + +1032: + movq %rax,(%rdi) + movq %rax,8(%rdi) + movq %rax,16(%rdi) + movq %rax,24(%rdi) + leaq 32(%rdi),%rdi + subq $32,%rcx + cmpq $32,%rcx + jae 1032b + cmpb $0,%cl + je 1000f +1016: + cmpb $16,%cl + jl 1008f + movq %rax,(%rdi) + movq %rax,8(%rdi) + subb $16,%cl + jz 1000f + leaq 16(%rdi),%rdi +1008: + cmpb $8,%cl + jl 1004f + movq %rax,(%rdi) + subb $8,%cl + jz 1000f + leaq 8(%rdi),%rdi +1004: + cmpb $4,%cl + jl 1002f + movl %eax,(%rdi) + subb $4,%cl + jz 1000f + leaq 4(%rdi),%rdi +1002: + cmpb $2,%cl + jl 1001f + movw %ax,(%rdi) + subb $2,%cl + jz 1000f + leaq 2(%rdi),%rdi +1001: + cmpb $1,%cl + jl 1000f + movb %al,(%rdi) +1000: .if \bzero == 0 movq %r9,%rax .endif ret -1: + +1256: +.if \erms == 1 rep stosb +.else + shrq $3,%rcx + rep + stosq + movq %rdx,%rcx + andb $7,%cl + jne 1004b +.endif .if \bzero == 0 movq %r9,%rax .endif ret .endm #ifndef BZERO ENTRY(memset) - MEMSET bzero=0 + MEMSET bzero=0 erms=0 END(memset) #else ENTRY(bzero) - MEMSET bzero=1 + MEMSET bzero=1 erms=0 END(bzero) #endif .section .note.GNU-stack,"",%progbits Index: projects/openssl111/sbin/init/rc.conf =================================================================== --- projects/openssl111/sbin/init/rc.conf (revision 339239) +++ projects/openssl111/sbin/init/rc.conf (revision 339240) @@ -1,750 +1,750 @@ #!/bin/sh # This is rc.conf - a file full of useful variables that you can set # to change the default startup behavior of your system. You should # not edit this file! Put any overrides into one of the ${rc_conf_files} # instead and you will be able to update these defaults later without # spamming your local configuration information. # # The ${rc_conf_files} files should only contain values which override # values set in this file. This eases the upgrade path when defaults # are changed and new features are added. # # All arguments must be in double or single quotes. # # For a more detailed explanation of all the rc.conf variables, please # refer to the rc.conf(5) manual page. # # $FreeBSD$ ############################################################## ### Important initial Boot-time options #################### ############################################################## # rc_debug can't be set here without interferring with rc.subr's setting it # when the kenv variable rc.debug is set. #rc_debug="NO" # Set to YES to enable debugging output from rc.d rc_info="NO" # Enables display of informational messages at boot. rc_startmsgs="YES" # Show "Starting foo:" messages at boot rcshutdown_timeout="90" # Seconds to wait before terminating rc.shutdown early_late_divider="FILESYSTEMS" # Script that separates early/late # stages of the boot process. Make sure you know # the ramifications if you change this. # See rc.conf(5) for more details. always_force_depends="NO" # Set to check that indicated dependencies are # running during boot (can increase boot time). apm_enable="NO" # Set to YES to enable APM BIOS functions (or NO). apmd_enable="NO" # Run apmd to handle APM event from userland. apmd_flags="" # Flags to apmd (if enabled). ddb_enable="NO" # Set to YES to load ddb scripts at boot. 
ddb_config="/etc/ddb.conf" # ddb(8) config file. devd_enable="YES" # Run devd, to trigger programs on device tree changes. devd_flags="" # Additional flags for devd(8). devmatch_enable="YES" # Demand load kernel modules based on device ids. devmatch_blacklist="" # List of modules (w/o .ko) to exclude from devmatch. #kld_list="" # Kernel modules to load after local disks are mounted kldxref_enable="YES" # Build linker.hints files with kldxref(8). kldxref_clobber="NO" # Overwrite old linker.hints at boot. kldxref_module_path="" # Override kern.module_path. A ';'-delimited list. powerd_enable="NO" # Run powerd to lower our power usage. powerd_flags="" # Flags to powerd (if enabled). tmpmfs="AUTO" # Set to YES to always create an mfs /tmp, NO to never tmpsize="20m" # Size of mfs /tmp if created tmpmfs_flags="-S" # Extra mdmfs options for the mfs /tmp varmfs="AUTO" # Set to YES to always create an mfs /var, NO to never varsize="32m" # Size of mfs /var if created varmfs_flags="-S" # Extra mount options for the mfs /var mfs_type="auto" # "md", "tmpfs", "auto" to prefer tmpfs with md as fallback populate_var="AUTO" # Set to YES to always (re)populate /var, NO to never cleanvar_enable="YES" # Clean the /var directory local_startup="/usr/local/etc/rc.d" # startup script dirs. script_name_sep=" " # Change if your startup scripts' names contain spaces rc_conf_files="/etc/rc.conf /etc/rc.conf.local" # ZFS support zfs_enable="NO" # Set to YES to automatically mount ZFS file systems # ZFSD support zfsd_enable="NO" # Set to YES to automatically start the ZFS fault # management daemon. gptboot_enable="YES" # GPT boot success/failure reporting. # Experimental - test before enabling gbde_autoattach_all="NO" # YES automatically mounts gbde devices from fstab gbde_devices="NO" # Devices to automatically attach (list, or AUTO) gbde_attach_attempts="3" # Number of times to attempt attaching gbde devices gbde_lockdir="/etc" # Where to look for gbde lockfiles # GELI disk encryption configuration. geli_devices="" # List of devices to automatically attach in addition to # GELI devices listed in /etc/fstab. geli_groups="" # List of groups containing devices to automatically # attach with the same keyfiles and passphrase geli_tries="" # Number of times to attempt attaching geli device. # If empty, kern.geom.eli.tries will be used. geli_default_flags="" # Default flags for geli(8). geli_autodetach="YES" # Automatically detach on last close. # Providers are marked as such when all file systems are # mounted. # Example use. #geli_devices="da1 mirror/home" #geli_da1_flags="-p -k /etc/geli/da1.keys" #geli_da1_autodetach="NO" #geli_mirror_home_flags="-k /etc/geli/home.keys" #geli_groups="storage backup" #geli_storage_flags="-k /etc/geli/storage.keys" #geli_storage_devices="ada0 ada1" #geli_backup_flags="-j /etc/geli/backup.passfile -k /etc/geli/backup.keys" #geli_backup_devices="ada2 ada3" root_rw_mount="YES" # Set to NO to inhibit remounting root read-write. root_hold_delay="30" # Time to wait for root mount hold release. fsck_y_enable="NO" # Set to YES to do fsck -y if the initial preen fails. fsck_y_flags="-T ffs:-R -T ufs:-R" # Additional flags for fsck -y background_fsck="YES" # Attempt to run fsck in the background where possible. background_fsck_delay="60" # Time to wait (seconds) before starting the fsck. growfs_enable="NO" # Set to YES to attempt to grow the root filesystem on boot netfs_types="nfs:NFS smbfs:SMB" # Net filesystems. 
extra_netfs_types="NO" # List of network extra filesystem types for delayed # mount at startup (or NO). ############################################################## ### Network configuration sub-section ###################### ############################################################## ### Basic network and firewall/security options: ### hostname="" # Set this! hostid_enable="YES" # Set host UUID. hostid_file="/etc/hostid" # File with hostuuid. nisdomainname="NO" # Set to NIS domain if using NIS (or NO). dhclient_program="/sbin/dhclient" # Path to dhcp client program. dhclient_flags="" # Extra flags to pass to dhcp client. #dhclient_flags_fxp0="" # Extra dhclient flags for fxp0 only background_dhclient="NO" # Start dhcp client in the background. #background_dhclient_fxp0="YES" # Start dhcp client on fxp0 in the background. synchronous_dhclient="NO" # Start dhclient directly on configured # interfaces during startup. defaultroute_delay="30" # Time to wait for a default route on a DHCP interface. defaultroute_carrier_delay="5" # Time to wait for carrier while waiting for a default route. netif_enable="YES" # Set to YES to initialize network interfaces netif_ipexpand_max="2048" # Maximum number of IP addrs in a range spec. wpa_supplicant_program="/usr/sbin/wpa_supplicant" wpa_supplicant_flags="-s" # Extra flags to pass to wpa_supplicant wpa_supplicant_conf_file="/etc/wpa_supplicant.conf" # firewall_enable="NO" # Set to YES to enable firewall functionality firewall_script="/etc/rc.firewall" # Which script to run to set up the firewall firewall_type="UNKNOWN" # Firewall type (see /etc/rc.firewall) firewall_quiet="NO" # Set to YES to suppress rule display firewall_logging="NO" # Set to YES to enable events logging firewall_logif="NO" # Set to YES to create logging-pseudo interface firewall_flags="" # Flags passed to ipfw when type is a file firewall_coscripts="" # List of executables/scripts to run after # firewall starts/stops firewall_client_net="192.0.2.0/24" # IPv4 Network address for "client" # firewall. #firewall_client_net_ipv6="2001:db8:2:1::/64" # IPv6 network prefix for # "client" firewall. firewall_simple_iif="ed1" # Inside network interface for "simple" # firewall. firewall_simple_inet="192.0.2.16/28" # Inside network address for "simple" # firewall. firewall_simple_oif="ed0" # Outside network interface for "simple" # firewall. firewall_simple_onet="192.0.2.0/28" # Outside network address for "simple" # firewall. #firewall_simple_iif_ipv6="ed1" # Inside IPv6 network interface for "simple" # firewall. #firewall_simple_inet_ipv6="2001:db8:2:800::/56" # Inside IPv6 network prefix # for "simple" firewall. #firewall_simple_oif_ipv6="ed0" # Outside IPv6 network interface for "simple" # firewall. #firewall_simple_onet_ipv6="2001:db8:2:0::/56" # Outside IPv6 network prefix # for "simple" firewall. -firewall_myservices="" # List of TCP ports on which this host +firewall_myservices="" # List of ports/protocols on which this host # offers services for "workstation" firewall. firewall_allowservices="" # List of IPs which have access to # $firewall_myservices for "workstation" # firewall. firewall_trusted="" # List of IPs which have full access to this # host for "workstation" firewall. firewall_logdeny="NO" # Set to YES to log default denied incoming # packets for "workstation" firewall. firewall_nologports="135-139,445 1026,1027 1433,1434" # List of TCP/UDP ports # for which denied incoming packets are not # logged for "workstation" firewall. 
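# Example use ("workstation" firewall; illustrative values only -- the port
# list and "any" source below are assumptions, adjust for your host).
# Entries in firewall_myservices are port/protocol pairs.
#firewall_myservices="22/tcp 80/tcp"
#firewall_allowservices="any"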
firewall_nat_enable="NO" # Enable kernel NAT (if firewall_enable == YES) firewall_nat_interface="" # Public interface or IPaddress to use firewall_nat_flags="" # Additional configuration parameters dummynet_enable="NO" # Load the dummynet(4) module ipfw_netflow_enable="NO" # Enable netflow logging via ng_netflow ip_portrange_first="NO" # Set first dynamically allocated port ip_portrange_last="NO" # Set last dynamically allocated port ike_enable="NO" # Enable IKE daemon (usually racoon or isakmpd) ike_program="/usr/local/sbin/isakmpd" # Path to IKE daemon ike_flags="" # Additional flags for IKE daemon ipsec_enable="NO" # Set to YES to run setkey on ipsec_file ipsec_file="/etc/ipsec.conf" # Name of config file for setkey natd_program="/sbin/natd" # path to natd, if you want a different one. natd_enable="NO" # Enable natd (if firewall_enable == YES). natd_interface="" # Public interface or IPaddress to use. natd_flags="" # Additional flags for natd. ipfilter_enable="NO" # Set to YES to enable ipfilter functionality ipfilter_program="/sbin/ipf" # where the ipfilter program lives ipfilter_rules="/etc/ipf.rules" # rules definition file for ipfilter, see # /usr/src/contrib/ipfilter/rules for examples ipfilter_flags="" # additional flags for ipfilter ipnat_enable="NO" # Set to YES to enable ipnat functionality ipnat_program="/sbin/ipnat" # where the ipnat program lives ipnat_rules="/etc/ipnat.rules" # rules definition file for ipnat ipnat_flags="" # additional flags for ipnat ipmon_enable="NO" # Set to YES for ipmon; needs ipfilter or ipnat ipmon_program="/sbin/ipmon" # where the ipfilter monitor program lives ipmon_flags="-Ds" # typically "-Ds" or "-D /var/log/ipflog" ipfs_enable="NO" # Set to YES to enable saving and restoring # of state tables at shutdown and boot ipfs_program="/sbin/ipfs" # where the ipfs program lives ipfs_flags="" # additional flags for ipfs pf_enable="NO" # Set to YES to enable packet filter (pf) pf_rules="/etc/pf.conf" # rules definition file for pf pf_program="/sbin/pfctl" # where the pfctl program lives pf_flags="" # additional flags for pfctl pflog_enable="NO" # Set to YES to enable packet filter logging pflog_logfile="/var/log/pflog" # where pflogd should store the logfile pflog_program="/sbin/pflogd" # where the pflogd program lives pflog_flags="" # additional flags for pflogd ftpproxy_enable="NO" # Set to YES to enable ftp-proxy(8) for pf ftpproxy_flags="" # additional flags for ftp-proxy(8) pfsync_enable="NO" # Expose pf state to other hosts for syncing pfsync_syncdev="" # Interface for pfsync to work through pfsync_syncpeer="" # IP address of pfsync peer host pfsync_ifconfig="" # Additional options to ifconfig(8) for pfsync tcp_extensions="YES" # Set to NO to turn off RFC1323 extensions. log_in_vain="0" # >=1 to log connects to ports w/o listeners. tcp_keepalive="YES" # Enable stale TCP connection timeout (or NO). tcp_drop_synfin="NO" # Set to YES to drop TCP packets with SYN+FIN # NOTE: this violates the TCP specification icmp_drop_redirect="NO" # Set to YES to ignore ICMP REDIRECT packets icmp_log_redirect="NO" # Set to YES to log ICMP REDIRECT packets network_interfaces="auto" # List of network interfaces (or "auto"). cloned_interfaces="" # List of cloned network interfaces to create. #cloned_interfaces="gif0 gif1 gif2 gif3" # Pre-cloning GENERIC config. #ifconfig_lo0="inet 127.0.0.1" # default loopback device configuration. #ifconfig_lo0_alias0="inet 127.0.0.254 netmask 0xffffffff" # Sample alias entry. 
#ifconfig_ed0_ipv6="inet6 2001:db8:1::1 prefixlen 64" # Sample IPv6 addr entry #ifconfig_ed0_alias0="inet6 2001:db8:2::1 prefixlen 64" # Sample IPv6 alias #ifconfig_fxp0_name="net0" # Change interface name from fxp0 to net0. #vlans_fxp0="101 vlan0" # vlan(4) interfaces for fxp0 device #create_args_vlan0="vlan 102" # vlan tag for vlan0 device #wlans_ath0="wlan0" # wlan(4) interfaces for ath0 device #wlandebug_wlan0="scan+auth+assoc" # Set debug flags with wlandebug(8) #ipv4_addrs_fxp0="192.168.0.1/24 192.168.1.1-5/28" # example IPv4 address entry. # #autobridge_interfaces="bridge0" # List of bridges to check #autobridge_bridge0="tap* vlan0" # Interface glob to automatically add to the bridge # # If you have any sppp(4) interfaces above, you might also want to set # the following parameters. Refer to spppcontrol(8) for their meaning. sppp_interfaces="" # List of sppp interfaces. #sppp_interfaces="...0" # example: sppp over ... #spppconfig_...0="authproto=chap myauthname=foo myauthsecret='top secret' hisauthname=some-gw hisauthsecret='another secret'" # User ppp configuration. ppp_enable="NO" # Start user-ppp (or NO). ppp_program="/usr/sbin/ppp" # Path to user-ppp program. ppp_mode="auto" # Choice of "auto", "ddial", "direct" or "dedicated". # For details see man page for ppp(8). Default is auto. ppp_nat="YES" # Use PPP's internal network address translation or NO. ppp_profile="papchap" # Which profile to use from /etc/ppp/ppp.conf. ppp_user="root" # Which user to run ppp as # Start multiple instances of ppp at boot time #ppp_profile="profile1 profile2 profile3" # Which profiles to use #ppp_profile1_mode="ddial" # Override ppp mode for profile1 #ppp_profile2_nat="NO" # Override nat mode for profile2 # profile3 uses default ppp_mode and ppp_nat ### Network daemon (miscellaneous) ### hostapd_enable="NO" # Run hostap daemon. syslogd_enable="YES" # Run syslog daemon (or NO). syslogd_program="/usr/sbin/syslogd" # path to syslogd, if you want a different one. syslogd_flags="-s" # Flags to syslogd (if enabled). syslogd_oomprotect="YES" # Don't kill syslogd when swap space is exhausted. altlog_proglist="" # List of chrooted applicatioins in /var inetd_enable="NO" # Run the network daemon dispatcher (YES/NO). inetd_program="/usr/sbin/inetd" # path to inetd, if you want a different one. inetd_flags="-wW -C 60" # Optional flags to inetd iscsid_enable="NO" # iSCSI initiator daemon. iscsictl_enable="NO" # iSCSI initiator autostart. iscsictl_flags="-Aa" # Optional flags to iscsictl. hastd_enable="NO" # Run the HAST daemon (YES/NO). hastd_program="/sbin/hastd" # path to hastd, if you want a different one. hastd_flags="" # Optional flags to hastd. ctld_enable="NO" # CAM Target Layer / iSCSI target daemon. local_unbound_enable="NO" # local caching resolver blacklistd_enable="NO" # Run blacklistd daemon (YES/NO). blacklistd_flags="" # Optional flags for blacklistd(8). resolv_enable="YES" # Enable resolv / resolvconf # # kerberos. Do not run the admin daemons on slave servers # kdc_enable="NO" # Run a kerberos 5 KDC (or NO). 
kdc_program="/usr/libexec/kdc" # path to kerberos 5 KDC kdc_flags="" # Additional flags to the kerberos 5 KDC kadmind_enable="NO" # Run kadmind (or NO) kadmind_program="/usr/libexec/kadmind" # path to kadmind kpasswdd_enable="NO" # Run kpasswdd (or NO) kpasswdd_program="/usr/libexec/kpasswdd" # path to kpasswdd kfd_enable="NO" # Run kfd (or NO) kfd_program="/usr/libexec/kfd" # path to kerberos 5 kfd daemon kfd_flags="" ipropd_master_enable="NO" # Run Heimdal incremental propagation daemon # (master daemon). ipropd_master_program="/usr/libexec/ipropd-master" ipropd_master_flags="" # Flags to ipropd-master. ipropd_master_keytab="/etc/krb5.keytab" # keytab for ipropd-master. ipropd_master_slaves="" # slave node names used for /var/heimdal/slaves. ipropd_slave_enable="NO" # Run Heimdal incremental propagation daemon # (slave daemon). ipropd_slave_program="/usr/libexec/ipropd-slave" ipropd_slave_flags="" # Flags to ipropd-slave. ipropd_slave_keytab="/etc/krb5.keytab" # keytab for ipropd-slave. ipropd_slave_master="" # master node name. gssd_enable="NO" # Run the gssd daemon (or NO). gssd_program="/usr/sbin/gssd" # Path to gssd. gssd_flags="" # Flags for gssd. rwhod_enable="NO" # Run the rwho daemon (or NO). rwhod_flags="" # Flags for rwhod rarpd_enable="NO" # Run rarpd (or NO). rarpd_flags="-a" # Flags to rarpd. bootparamd_enable="NO" # Run bootparamd (or NO). bootparamd_flags="" # Flags to bootparamd pppoed_enable="NO" # Run the PPP over Ethernet daemon. pppoed_provider="*" # Provider and ppp(8) config file entry. pppoed_flags="-P /var/run/pppoed.pid" # Flags to pppoed (if enabled). pppoed_interface="fxp0" # The interface that pppoed runs on. sshd_enable="NO" # Enable sshd sshd_program="/usr/sbin/sshd" # path to sshd, if you want a different one. sshd_flags="" # Additional flags for sshd. ftpd_enable="NO" # Enable stand-alone ftpd. ftpd_program="/usr/libexec/ftpd" # Path to ftpd, if you want a different one. ftpd_flags="" # Additional flags to stand-alone ftpd. ### Network daemon (NFS): All need rpcbind_enable="YES" ### amd_enable="NO" # Run amd service with $amd_flags (or NO). amd_program="/usr/sbin/amd" # path to amd, if you want a different one. amd_flags="-a /.amd_mnt -l syslog /host /etc/amd.map /net /etc/amd.map" amd_map_program="NO" # Can be set to "ypcat -k amd.master" autofs_enable="NO" # Run autofs daemons. automount_flags="" # Flags to automount(8) (if autofs enabled). automountd_flags="" # Flags to automountd(8) (if autofs enabled). autounmountd_flags="" # Flags to autounmountd(8) (if autofs enabled). nfs_client_enable="NO" # This host is an NFS client (or NO). nfs_access_cache="60" # Client cache timeout in seconds nfs_server_enable="NO" # This host is an NFS server (or NO). nfs_server_flags="-u -t" # Flags to nfsd (if enabled). nfs_server_managegids="NO" # The NFS server maps gids for AUTH_SYS (or NO). mountd_enable="NO" # Run mountd (or NO). mountd_flags="-r -S" # Flags to mountd (if NFS server enabled). weak_mountd_authentication="NO" # Allow non-root mount requests to be served. nfs_reserved_port_only="NO" # Provide NFS only on secure port (or NO). nfs_bufpackets="" # bufspace (in packets) for client rpc_lockd_enable="NO" # Run NFS rpc.lockd needed for client/server. rpc_lockd_flags="" # Flags to rpc.lockd (if enabled). rpc_statd_enable="NO" # Run NFS rpc.statd needed for client/server. rpc_statd_flags="" # Flags to rpc.statd (if enabled). rpcbind_enable="NO" # Run the portmapper service (YES/NO). 
rpcbind_program="/usr/sbin/rpcbind" # path to rpcbind, if you want a different one. rpcbind_flags="" # Flags to rpcbind (if enabled). rpc_ypupdated_enable="NO" # Run if NIS master and SecureRPC (or NO). keyserv_enable="NO" # Run the SecureRPC keyserver (or NO). keyserv_flags="" # Flags to keyserv (if enabled). nfsv4_server_enable="NO" # Enable support for NFSv4 nfscbd_enable="NO" # NFSv4 client side callback daemon nfscbd_flags="" # Flags for nfscbd nfsuserd_enable="NO" # NFSv4 user/group name mapping daemon nfsuserd_flags="" # Flags for nfsuserd ### Network Time Services options: ### timed_enable="NO" # Run the time daemon (or NO). timed_flags="" # Flags to timed (if enabled). ntpdate_enable="NO" # Run ntpdate to sync time on boot (or NO). ntpdate_program="/usr/sbin/ntpdate" # path to ntpdate, if you want a different one. ntpdate_flags="-b" # Flags to ntpdate (if enabled). ntpdate_config="/etc/ntp.conf" # ntpdate(8) configuration file ntpdate_hosts="" # Whitespace-separated list of ntpdate(8) servers. ntpd_enable="NO" # Run ntpd Network Time Protocol (or NO). ntpd_program="/usr/sbin/ntpd" # path to ntpd, if you want a different one. ntpd_config="/etc/ntp.conf" # ntpd(8) configuration file ntpd_sync_on_start="NO" # Sync time on ntpd startup, even if offset is high ntpd_flags="" # Additional flags to ntpd ntp_src_leapfile="/etc/ntp/leap-seconds" # Initial source for ntpd leapfile ntp_db_leapfile="/var/db/ntpd.leap-seconds.list" # Working copy (updated weekly) leapfile ntp_leapfile_sources="https://www.ietf.org/timezones/data/leap-seconds.list" # Source from which to fetch leapfile ntp_leapfile_fetch_opts="-mq" # Options to use for ntp leapfile fetch, # e.g. --no-verify-peer ntp_leapfile_expiry_days=30 # Check for new leapfile 30 days prior to # expiry. ntp_leapfile_fetch_verbose="NO" # Be verbose during NTP leapfile fetch # Network Information Services (NIS) options: All need rpcbind_enable="YES" ### nis_client_enable="NO" # We're an NIS client (or NO). nis_client_flags="" # Flags to ypbind (if enabled). nis_ypset_enable="NO" # Run ypset at boot time (or NO). nis_ypset_flags="" # Flags to ypset (if enabled). nis_server_enable="NO" # We're an NIS server (or NO). nis_server_flags="" # Flags to ypserv (if enabled). nis_ypxfrd_enable="NO" # Run rpc.ypxfrd at boot time (or NO). nis_ypxfrd_flags="" # Flags to rpc.ypxfrd (if enabled). nis_yppasswdd_enable="NO" # Run rpc.yppasswdd at boot time (or NO). nis_yppasswdd_flags="" # Flags to rpc.yppasswdd (if enabled). nis_ypldap_enable="NO" # Run ypldap at boot time (or NO). nis_ypldap_flags="" # Flags to ypldap (if enabled). ### SNMP daemon ### # Be sure to understand the security implications of running SNMP v1/v2 # in your network. bsnmpd_enable="NO" # Run the SNMP daemon (or NO). bsnmpd_flags="" # Flags for bsnmpd. ### Network routing options: ### defaultrouter="NO" # Set to default gateway (or NO). static_arp_pairs="" # Set to static ARP list (or leave empty). static_ndp_pairs="" # Set to static NDP list (or leave empty). static_routes="" # Set to static route list (or leave empty). gateway_enable="NO" # Set to YES if this host will be a gateway. routed_enable="NO" # Set to YES to enable a routing daemon. routed_program="/sbin/routed" # Name of routing daemon to use if enabled. routed_flags="-q" # Flags for routing daemon. arpproxy_all="NO" # replaces obsolete kernel option ARP_PROXYALL. 
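# Illustrative example with made-up addresses: each name listed in
# $static_routes gets a matching route_<name> variable whose value is passed
# to "route add":
#static_routes="corpnet"
#route_corpnet="-net 10.10.0.0/16 192.168.0.254"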
forward_sourceroute="NO" # do source routing (only if gateway_enable is set to "YES") accept_sourceroute="NO" # accept source routed packets to us ### Bluetooth ### hcsecd_enable="NO" # Enable hcsecd(8) (or NO) hcsecd_config="/etc/bluetooth/hcsecd.conf" # hcsecd(8) configuration file sdpd_enable="NO" # Enable sdpd(8) (or NO) sdpd_control="/var/run/sdp" # sdpd(8) control socket sdpd_groupname="nobody" # set sdpd(8) user/group to run as after sdpd_username="nobody" # it initializes bthidd_enable="NO" # Enable bthidd(8) (or NO) bthidd_config="/etc/bluetooth/bthidd.conf" # bthidd(8) configuration file bthidd_hids="/var/db/bthidd.hids" # bthidd(8) known HID devices file bthidd_evdev_support="AUTO" # AUTO depends on EVDEV_SUPPORT kernel option rfcomm_pppd_server_enable="NO" # Enable rfcomm_pppd(8) in server mode (or NO) rfcomm_pppd_server_profile="one two" # Profiles to use from /etc/ppp/ppp.conf # #rfcomm_pppd_server_one_bdaddr="" # Override local bdaddr for 'one' rfcomm_pppd_server_one_channel="1" # Override local channel for 'one' #rfcomm_pppd_server_one_register_sp="NO" # Override SP and DUN register #rfcomm_pppd_server_one_register_dun="NO" # for 'one' # #rfcomm_pppd_server_two_bdaddr="" # Override local bdaddr for 'two' rfcomm_pppd_server_two_channel="3" # Override local channel for 'two' #rfcomm_pppd_server_two_register_sp="NO" # Override SP and DUN register #rfcomm_pppd_server_two_register_dun="NO" # for 'two' ubthidhci_enable="NO" # Switch a USB BT controller present on #ubthidhci_busnum="3" # bus 3 and addr 2 from HID mode to HCI mode. #ubthidhci_addr="2" # Check usbconfig list to find the correct # numbers for your system. ### Network link/usability verification options netwait_enable="NO" # Enable rc.d/netwait (or NO) #netwait_ip="" # Wait for ping response from any IP in this list. netwait_timeout="60" # Total number of seconds to perform pings. #netwait_if="" # Wait for active link on each intf in this list. netwait_if_timeout="30" # Total number of seconds to monitor link state. ### Miscellaneous network options: ### icmp_bmcastecho="NO" # respond to broadcast ping packets ### IPv6 options: ### ipv6_network_interfaces="auto" # List of IPv6 network interfaces # (or "auto" or "none"). ipv6_activate_all_interfaces="NO" # If NO, interfaces which have no # corresponding $ifconfig_IF_ipv6 are # marked as IFDISABLED for security # reasons. ipv6_defaultrouter="NO" # Set to IPv6 default gateway (or NO). #ipv6_defaultrouter="2002:c058:6301::" # Use this for 6to4 (RFC 3068) ipv6_static_routes="" # Set to static route list (or leave empty). #ipv6_static_routes="xxx" # An example to set fec0:0000:0000:0006::/64 # route toward loopback interface. #ipv6_route_xxx="fec0:0000:0000:0006:: -prefixlen 64 ::1" ipv6_gateway_enable="NO" # Set to YES if this host will be a gateway. ipv6_cpe_wanif="NO" # Set to the upstream interface name if this # node will work as a router to forward IPv6 # packets not explicitly addressed to itself. ipv6_privacy="NO" # Use privacy address on RA-receiving IFs # (RFC 4941) route6d_enable="NO" # Set to YES to enable an IPv6 routing daemon. route6d_program="/usr/sbin/route6d" # Name of IPv6 routing daemon. route6d_flags="" # Flags to IPv6 routing daemon. #route6d_flags="-l" # Example for route6d with only IPv6 site local # addrs. #route6d_flags="-q" # If you want to run a routing daemon on an end # node, you should stop advertisement. #ipv6_network_interfaces="ed0 ep0" # Examples for router # or static configuration for end node. # Choose correct prefix value.
#ipv6_prefix_ed0="fec0:0000:0000:0001 fec0:0000:0000:0002" # Examples for rtr. #ipv6_prefix_ep0="fec0:0000:0000:0003 fec0:0000:0000:0004" # Examples for rtr. ipv6_default_interface="NO" # Default output interface for scoped addrs. # This works only with # ipv6_gateway_enable="NO". rtsol_flags="" # Flags to IPv6 router solicitation. rtsold_enable="NO" # Set to YES to enable an IPv6 router # solicitation daemon. rtsold_flags="-a" # Flags to an IPv6 router solicitation # daemon. rtadvd_enable="NO" # Set to YES to enable an IPv6 router # advertisement daemon. If set to YES, # this router becomes a possible candidate # IPv6 default router for local subnets. rtadvd_interfaces="" # Interfaces on which rtadvd sends RA packets. stf_interface_ipv4addr="" # Local IPv4 addr for 6to4 IPv6 over IPv4 # tunneling interface. Specify this entry # to enable 6to4 interface. stf_interface_ipv4plen="0" # Prefix length for 6to4 IPv4 addr, # to limit peer addr range. Effective value # is 0-31. stf_interface_ipv6_ifid="0:0:0:1" # IPv6 interface id for stf0. # If you like, you can set "AUTO" for this. stf_interface_ipv6_slaid="0000" # IPv6 Site Level Aggregator for stf0 ipv6_ipv4mapping="NO" # Set to "YES" to enable IPv4 mapped IPv6 addr # communication. (like ::ffff:a.b.c.d) ipv6_ipfilter_rules="/etc/ipf6.rules" # rules definition file for ipfilter, # see /usr/src/contrib/ipfilter/rules # for examples ip6addrctl_enable="YES" # Set to YES to enable default address selection ip6addrctl_verbose="NO" # Set to YES to enable verbose configuration messages ip6addrctl_policy="AUTO" # A pre-defined address selection policy # (ipv4_prefer, ipv6_prefer, or AUTO) ############################################################## ### System console options ################################# ############################################################## keyboard="" # keyboard device to use (default /dev/kbd0). keymap="NO" # keymap in /usr/share/{syscons,vt}/keymaps/* (or NO). keyrate="NO" # keyboard rate to: slow, normal, fast (or NO). keybell="NO" # See kbdcontrol(1) for options. Use "off" to disable. keychange="NO" # function keys default values (or NO). cursor="NO" # cursor type {normal|blink|destructive} (or NO). scrnmap="NO" # screen map in /usr/share/syscons/scrnmaps/* (or NO). font8x16="NO" # font 8x16 from /usr/share/{syscons,vt}/fonts/* (or NO). font8x14="NO" # font 8x14 from /usr/share/{syscons,vt}/fonts/* (or NO). font8x8="NO" # font 8x8 from /usr/share/{syscons,vt}/fonts/* (or NO). blanktime="300" # blank time (in seconds) or "NO" to turn it off. saver="NO" # screen saver: Uses /boot/kernel/${saver}_saver.ko moused_nondefault_enable="YES" # Treat non-default mice as enabled unless # specifically overridden in rc.conf(5). moused_enable="NO" # Run the mouse daemon. moused_type="auto" # See man page for rc.conf(5) for available settings. moused_port="/dev/psm0" # Set to your mouse port. moused_flags="" # Any additional flags to moused. mousechar_start="NO" # if 0xd0-0xd3 default range is occupied in your # language code table, specify alternative range # start like mousechar_start=3, see vidcontrol(1) allscreens_flags="" # Set this vidcontrol mode for all virtual screens allscreens_kbdflags="" # Set this kbdcontrol mode for all virtual screens ############################################################## ### Mail Transfer Agent (MTA) options ###################### ############################################################## mta_start_script="/etc/rc.sendmail" # Script to start your chosen MTA, called by /etc/rc.
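# Illustrative example, not part of the defaults: to run no sendmail daemons
# at all, the four sendmail_* daemon knobs below would all be set to NO:
#sendmail_enable="NO"
#sendmail_submit_enable="NO"
#sendmail_outbound_enable="NO"
#sendmail_msp_queue_enable="NO"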
# Settings for /etc/rc.sendmail and /etc/rc.d/sendmail: sendmail_enable="NO" # Run the sendmail inbound daemon (YES/NO). sendmail_pidfile="/var/run/sendmail.pid" # sendmail pid file sendmail_procname="/usr/sbin/sendmail" # sendmail process name sendmail_flags="-L sm-mta -bd -q30m" # Flags to sendmail (as a server) sendmail_cert_create="YES" # Create a server certificate if none (YES/NO) #sendmail_cert_cn="CN" # CN of the generated certificate sendmail_submit_enable="YES" # Start a localhost-only MTA for mail submission sendmail_submit_flags="-L sm-mta -bd -q30m -ODaemonPortOptions=Addr=localhost" # Flags for localhost-only MTA sendmail_outbound_enable="YES" # Dequeue stuck mail (YES/NO). sendmail_outbound_flags="-L sm-queue -q30m" # Flags to sendmail (outbound only) sendmail_msp_queue_enable="YES" # Dequeue stuck clientmqueue mail (YES/NO). sendmail_msp_queue_flags="-L sm-msp-queue -Ac -q30m" # Flags for sendmail_msp_queue daemon. sendmail_rebuild_aliases="NO" # Run newaliases if necessary (YES/NO). ############################################################## ### Miscellaneous administrative options ################### ############################################################## auditd_enable="NO" # Run the audit daemon. auditd_program="/usr/sbin/auditd" # Path to the audit daemon. auditd_flags="" # Which options to pass to the audit daemon. auditdistd_enable="NO" # Run the auditdistd daemon. auditdistd_program="/usr/sbin/auditdistd" # Path to the auditdistd daemon. auditdistd_flags="" # Which options to pass to the auditdistd daemon. cron_enable="YES" # Run the periodic job daemon. cron_program="/usr/sbin/cron" # Which cron executable to run (if enabled). cron_dst="YES" # Handle DST transitions intelligently (YES/NO) cron_flags="" # Which options to pass to the cron daemon. cfumass_enable="NO" # Create default LUN for cfumass(4). cfumass_dir="/var/cfumass" # Directory with the LUN's contents. cfumass_image="/var/tmp/cfumass.img" # LUN's backing file path. lpd_enable="NO" # Run the line printer daemon. lpd_program="/usr/sbin/lpd" # path to lpd, if you want a different one. lpd_flags="" # Flags to lpd (if enabled). nscd_enable="NO" # Run the nsswitch caching daemon. chkprintcap_enable="NO" # Run chkprintcap(8) before running lpd. chkprintcap_flags="-d" # Create missing directories by default. dumpdev="AUTO" # Device to crashdump to (device name, AUTO, or NO). dumpon_flags="" # Options to pass to dumpon(8), followed by dumpdev. dumpdir="/var/crash" # Directory where crash dumps are to be stored savecore_enable="YES" # Extract core from dump devices if any savecore_flags="-m 10" # Used if dumpdev is enabled above, and present. # By default, only the 10 most recent kernel dumps # are saved. crashinfo_enable="YES" # Automatically generate crash dump summary. crashinfo_program="/usr/sbin/crashinfo" # Script to generate crash dump summary. quota_enable="NO" # turn on quotas on startup (or NO). check_quotas="YES" # Check quotas on startup (or NO). quotaon_flags="-a" # Turn quotas on for all file systems (if enabled) quotaoff_flags="-a" # Turn quotas off for all file systems at shutdown quotacheck_flags="-a" # Check all file system quotas (if enabled) accounting_enable="NO" # Turn on process accounting (or NO). ibcs2_enable="NO" # Ibcs2 (SCO) emulation loaded at startup (or NO). ibcs2_loaders="coff" # List of additional Ibcs2 loaders (or NO). firstboot_sentinel="/firstboot" # Scripts with "firstboot" keyword are run if # this file exists.
Should be on a R/W filesystem so # the file can be deleted after the boot completes. # Emulation/compatibility services provided by /etc/rc.d/abi sysvipc_enable="NO" # Load System V IPC primitives at startup (or NO). linux_enable="NO" # Linux binary compatibility loaded at startup (or NO). clear_tmp_enable="NO" # Clear /tmp at startup. clear_tmp_X="YES" # Clear and recreate X11-related directories in /tmp ldconfig_insecure="NO" # Set to YES to disable ldconfig security checks ldconfig_paths="/usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg" # shared library search paths ldconfig32_paths="/usr/lib32 /usr/lib32/compat" # 32-bit compatibility shared library search paths ldconfigsoft_paths="/usr/libsoft /usr/libsoft/compat /usr/local/libsoft" # soft float compatibility shared library search paths # Note: temporarily with extra stuff for transition ldconfig_paths_aout="/usr/lib/compat/aout /usr/local/lib/aout" # a.out shared library search paths ldconfig_local_dirs="/usr/local/libdata/ldconfig" # Local directories with ldconfig configuration files. ldconfig_local32_dirs="/usr/local/libdata/ldconfig32" # Local directories with 32-bit compatibility ldconfig # configuration files. ldconfig_localsoft_dirs="/usr/local/libdata/ldconfigsoft" # Local directories with soft float compatibility ldconfig # configuration files. kern_securelevel_enable="NO" # kernel security level (see security(7)) kern_securelevel="-1" # range: -1..3 ; `-1' is the most insecure # Note that setting securelevel to 0 will result # in the system booting with securelevel set to 1, as # init(8) will raise the level when rc(8) completes. update_motd="YES" # update version info in /etc/motd (or NO) entropy_boot_file="/boot/entropy" # Set to NO to disable very early # (used at early boot time) entropy caching through reboots. entropy_file="/entropy" # Set to NO to disable late (used when going multi-user) # entropy through reboots. # /var/db/entropy-file is preferred if / is not avail. entropy_dir="/var/db/entropy" # Set to NO to disable caching entropy via cron. entropy_save_sz="4096" # Size of the entropy cache files. entropy_save_num="8" # Number of entropy cache files to save. harvest_mask="511" # Entropy device harvests all but the very invasive sources. # (See 'sysctl kern.random.harvest' and random(4)) dmesg_enable="YES" # Save dmesg(8) to /var/run/dmesg.boot watchdogd_enable="NO" # Start the software watchdog daemon watchdogd_flags="" # Flags to watchdogd (if enabled) devfs_rulesets="/etc/defaults/devfs.rules /etc/devfs.rules" # Files containing # devfs(8) rules. devfs_system_ruleset="" # The name (NOT number) of a ruleset to apply to /dev devfs_set_rulesets="" # A list of /mount/dev=ruleset_name settings to # apply (must be mounted already, i.e. fstab(5)) devfs_load_rulesets="YES" # Enable to always load the default rulesets performance_cx_lowest="NONE" # Online CPU idle state performance_cpu_freq="NONE" # Online CPU frequency economy_cx_lowest="Cmax" # Offline CPU idle state economy_cpu_freq="NONE" # Offline CPU frequency virecover_enable="YES" # Perform housekeeping for the vi(1) editor ugidfw_enable="NO" # Load mac_bsdextended(4) rules on boot bsdextended_script="/etc/rc.bsdextended" # Default mac_bsdextended(4) # ruleset file. newsyslog_enable="YES" # Run newsyslog at startup. newsyslog_flags="-CN" # Newsyslog flags to create marked files mixer_enable="YES" # Run the sound mixer. 
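# Illustrative example for devfs_set_rulesets above, with a hypothetical
# mount point: it applies a named ruleset to a devfs instance that is
# already mounted, e.g. a jail's /dev:
#devfs_set_rulesets="/var/jail/www/dev=devfsrules_jail"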
opensm_enable="NO" # Opensm(8) for infiniband devices defaults to off # rctl(8) requires kernel options RACCT and RCTL rctl_enable="YES" # Load rctl(8) rules on boot rctl_rules="/etc/rctl.conf" # rctl(8) ruleset. See rctl.conf(5). iovctl_files="" # Config files for iovctl(8) ############################################################## ### Jail Configuration (see rc.conf(5) manual page) ########## ############################################################## jail_enable="NO" # Set to NO to disable starting of any jails jail_confwarn="YES" # Prevent warning about obsolete per-jail configuration jail_parallel_start="NO" # Start jails in the background jail_list="" # Space separated list of names of jails jail_reverse_stop="NO" # Stop jails in reverse order ############################################################## ### Define source_rc_confs, the mechanism used by /etc/rc.* ## ### scripts to source rc_conf_files overrides safely. ## ############################################################## if [ -z "${source_rc_confs_defined}" ]; then source_rc_confs_defined=yes source_rc_confs() { local i sourced_files for i in ${rc_conf_files}; do case ${sourced_files} in *:$i:*) ;; *) sourced_files="${sourced_files}:$i:" if [ -r $i ]; then . $i fi ;; esac done # Re-do process to pick up [possibly] redefined $rc_conf_files for i in ${rc_conf_files}; do case ${sourced_files} in *:$i:*) ;; *) sourced_files="${sourced_files}:$i:" if [ -r $i ]; then . $i fi ;; esac done } fi # Allow vendors to override FreeBSD defaults in /etc/default/rc.conf # without the need to carefully manage /etc/rc.conf. if [ -r /etc/defaults/vendor.conf ]; then . /etc/defaults/vendor.conf fi Index: projects/openssl111/stand/defaults/loader.conf =================================================================== --- projects/openssl111/stand/defaults/loader.conf (revision 339239) +++ projects/openssl111/stand/defaults/loader.conf (revision 339240) @@ -1,169 +1,170 @@ # This is loader.conf - a file full of useful variables that you can # set to change the default load behavior of your system. You should # not edit this file! Put any overrides into one of the # loader_conf_files instead and you will be able to update these # defaults later without spamming your local configuration information. # # All arguments must be in double quotes. # # $FreeBSD$ ### Basic configuration options ############################ exec="echo Loading /boot/defaults/loader.conf" kernel="kernel" # /boot sub-directory containing kernel and modules bootfile="kernel" # Kernel name (possibly absolute path) kernel_options="" # Flags to be passed to the kernel loader_conf_files="/boot/device.hints /boot/loader.conf /boot/loader.conf.local" nextboot_conf="/boot/nextboot.conf" nextboot_enable="NO" verbose_loading="NO" # Set to YES for verbose loader output ### Splash screen configuration ############################ splash_bmp_load="NO" # Set this to YES for bmp splash screen! splash_pcx_load="NO" # Set this to YES for pcx splash screen! splash_txt_load="NO" # Set this to YES for TheDraw splash screen! vesa_load="NO" # Set this to YES to load the vesa module bitmap_load="NO" # Set this to YES if you want splash screen! 
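# Illustrative example, not part of the defaults: a working splash screen
# needs the image decoder, the bitmap itself and, for images above VGA
# resolution, the vesa module (the file name below is hypothetical):
#splash_bmp_load="YES"
#vesa_load="YES"
#bitmap_load="YES"
#bitmap_name="/boot/splash.bmp"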
bitmap_name="splash.bmp" # Set this to the name of the file bitmap_type="splash_image_data" # and place it on the module_path ### Screen saver modules ################################### # This is best done in rc.conf screensave_load="NO" # Set to YES to load a screensaver module screensave_name="green_saver" # Set to the name of the screensaver module ### Random number generator configuration ################## # See rc.conf(5). The entropy_boot_file config variable must agree with the # settings below. entropy_cache_load="YES" # Set this to NO to disable loading # entropy at boot time entropy_cache_name="/boot/entropy" # Set this to the name of the file entropy_cache_type="boot_entropy_cache" # Required for the kernel to find # the boot-time entropy cache. This # must not change value even if the # _name above does change! ### RAM Blacklist configuration ############################ ram_blacklist_load="NO" # Set this to YES to load a file # containing a list of addresses to # exclude from the running system. ram_blacklist_name="/boot/blacklist.txt" # Set this to the name of the file ram_blacklist_type="ram_blacklist" # Required for the kernel to find # the blacklist module ### Microcode loading configuration ######################## cpu_microcode_load="NO" # Set this to YES to load and apply a # microcode update file during boot. cpu_microcode_name="/boot/firmware/ucode.bin" # Set this to the microcode # update file path. cpu_microcode_type="cpu_microcode" # Required for the kernel to find # the microcode update file. ### ACPI settings ########################################## acpi_dsdt_load="NO" # DSDT Overriding acpi_dsdt_type="acpi_dsdt" # Don't change this acpi_dsdt_name="/boot/acpi_dsdt.aml" # Override DSDT in BIOS by this file acpi_video_load="NO" # Load the ACPI video extension driver ### Audit settings ######################################### audit_event_load="NO" # Preload audit_event config audit_event_name="/etc/security/audit_event" audit_event_type="etc_security_audit_event" ### Initial memory disk settings ########################### #mdroot_load="YES" # The "mdroot" prefix is arbitrary. #mdroot_type="md_image" # Create md(4) disk at boot. #mdroot_name="/boot/root.img" # Path to a file containing the image. #rootdev="ufs:/dev/md0" # Set the root filesystem to md(4) device. ### Loader settings ######################################## #loader_delay="3" # Delay in seconds before loading anything. # Default is unset and disabled (no delay). #autoboot_delay="10" # Delay in seconds before autobooting, # -1 for no user interrupts, NO to disable #password="" # Prevent changes to boot options #bootlock_password="" # Prevent booting (see check-password.4th(8)) #geom_eli_passphrase_prompt="NO" # Prompt for geli(8) passphrase to mount root bootenv_autolist="YES" # Auto populate the list of ZFS Boot Environments #beastie_disable="NO" # Turn the beastie boot menu on and off efi_max_resolution="1x1" # Set the max resolution for EFI loader to use: # 480p, 720p, 1080p, 2160p/4k, 5k, or specify # WidthxHeight (e.g. 
1920x1080) #kernels="kernel kernel.old" # Kernels to display in the boot menu #loader_logo="orbbw" # Desired logo: orbbw, orb, fbsdbw, beastiebw, beastie, none #comconsole_speed="9600" # Set the current serial console speed #console="vidconsole" # A comma separated list of console(s) #currdev="disk1s1a" # Set the current device module_path="/boot/modules;/boot/dtb;/boot/dtb/overlays" # Set the module search path +module_blacklist="drm drm2 radeonkms i915kms amdgpu" # Loader module blacklist #prompt="\\${interpret}" # Set the command prompt #root_disk_unit="0" # Force the root disk unit number #rootdev="disk1s1a" # Set the root filesystem #dumpdev="disk1s1b" # Set a dump device early in the boot process #tftp.blksize="1428" # Set the RFC 2348 TFTP block size. # If the TFTP server does not support RFC 2348, # the block size is set to 512. Valid: (8,9007) #twiddle_divisor="1" # >1 means slow down the progress indicator. ### Kernel settings ######################################## # The following boot_ variables are enabled by setting them to any value. # Their presence in the kernel environment (see kenv(1)) has the same # effect as setting the given boot flag (see boot(8)). #boot_askname="" # -a: Prompt the user for the name of the root device #boot_cdrom="" # -C: Attempt to mount root file system from CD-ROM #boot_ddb="" # -d: Instructs the kernel to start in the DDB debugger #boot_dfltroot="" # -r: Use the statically configured root file system #boot_gdb="" # -g: Selects gdb-remote mode for the kernel debugger #boot_multicons="" # -D: Use multiple consoles #boot_mute="" # -m: Mute the console #boot_pause="" # -p: Pause after each line during device probing #boot_serial="" # -h: Use serial console #boot_single="" # -s: Start system in single-user mode #boot_verbose="" # -v: Causes extra debugging information to be printed #init_path="/sbin/init:/sbin/oinit:/sbin/init.bak:/rescue/init" # Sets the list of init candidates #init_shell="/bin/sh" # The shell binary used by init(8). #init_script="" # Initial script to run by init(8) before chrooting. #init_chroot="" # Directory for init(8) to chroot into. ### Kernel tunables ######################################## #hw.physmem="1G" # Limit physical memory. See loader(8) #kern.dfldsiz="" # Set the initial data size limit #kern.dflssiz="" # Set the initial stack size limit #kern.hz="100" # Set the kernel interval timer rate #kern.maxbcache="" # Set the max buffer cache KVA storage #kern.maxdsiz="" # Set the max data size #kern.maxfiles="" # Set the sys. 
wide open files limit #kern.maxproc="" # Set the maximum # of processes #kern.maxssiz="" # Set the max stack size #kern.maxswzone="" # Set the max swmeta KVA storage #kern.maxtsiz="" # Set the max text size #kern.maxusers="32" # Set size of various static tables #kern.msgbufsize="65536" # Set size of kernel message buffer #kern.nbuf="" # Set the number of buffer headers #kern.ncallout="" # Set the maximum # of timer events #kern.ngroups="1023" # Set the maximum # of supplemental groups #kern.sgrowsiz="" # Set the amount to grow stack #kern.cam.boot_delay="10000" # Delay (in ms) of root mount for CAM bus # registration, useful for USB sticks as root #kern.cam.scsi_delay="2000" # Delay (in ms) before probing SCSI #kern.ipc.maxsockets="" # Set the maximum number of sockets available #kern.ipc.nmbclusters="" # Set the number of mbuf clusters #kern.ipc.nsfbufs="" # Set the number of sendfile(2) bufs #net.inet.tcp.tcbhashsize="" # Set the value of TCBHASHSIZE #vfs.root.mountfrom="" # Specify root partition #vm.kmem_size="" # Sets the size of kernel memory (bytes) #debug.kdb.break_to_debugger="0" # Allow console to break into debugger. #debug.ktr.cpumask="0xf" # Bitmask of CPUs to enable KTR on #debug.ktr.mask="0x1200" # Bitmask of KTR events to enable #debug.ktr.verbose="1" # Enable console dump of KTR events ### Module loading syntax example ########################## #module_load="YES" # loads module "module" #module_name="realname" # uses "realname" instead of "module" #module_type="type" # passes "-t type" to load #module_flags="flags" # passes "flags" to the module #module_before="cmd" # executes "cmd" before loading the module #module_after="cmd" # executes "cmd" after loading the module #module_error="cmd" # executes "cmd" if load fails Index: projects/openssl111/stand/defaults/loader.conf.5 =================================================================== --- projects/openssl111/stand/defaults/loader.conf.5 (revision 339239) +++ projects/openssl111/stand/defaults/loader.conf.5 (revision 339240) @@ -1,365 +1,374 @@ .\" Copyright (c) 1999 Daniel C. Sobral .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. 
.\" .\" $FreeBSD$ -.Dd August 28, 2018 +.Dd October 6, 2018 .Dt LOADER.CONF 5 .Os .Sh NAME .Nm loader.conf .Nd "system bootstrap configuration information" .Sh DESCRIPTION The file .Nm contains descriptive information on bootstrapping the system. Through it you can specify the kernel to be booted, parameters to be passed to it, and additional modules to be loaded; and generally set all variables described in .Xr loader 8 . .Pp .Sh SYNTAX Though .Nm Ns 's format was defined explicitly to resemble .Xr rc.conf 5 , and can be sourced by .Xr sh 1 , some settings are treated in a special fashion. Also, the behavior of some settings is defined by the setting's suffix; the prefix identifies which module the setting controls. .Pp The general parsing rules are: .Bl -bullet .It Spaces and empty lines are ignored. .It A # sign will mark the remainder of the line as a comment. .It Only one setting can be present on each line. .El .Pp All settings have the following format: .Pp .Dl variable="value" .Pp Unless it belongs to one of the classes of settings that receive special treatment, a setting will set the value of a .Xr loader 8 environment variable. The settings that receive special treatment are listed below. Settings beginning with .Qq * below define the modules to be loaded and may have any prefix; the prefix identifies a module. All such settings sharing a common prefix refer to the same module. .Bl -tag -width Ar .It Ar exec Immediately executes a .Xr loader 8 command. This type of setting cannot be processed by programs other than .Xr loader 8 , so its use should be avoided. Multiple instances of it will be processed independently. .It Ar loader_conf_files Defines additional configuration files to be processed right after the present file. .It Ar kernel Name of the kernel to be loaded. If no kernel name is set, no additional modules will be loaded. The name must be a subdirectory of .Pa /boot that contains a kernel. .It Ar kernel_options Flags to be passed to the kernel. .It Ar vfs.root.mountfrom Specify the root partition to mount. For example: .Pp .Dl vfs.root.mountfrom="ufs:/dev/da0s1a" .Pp .Xr loader 8 automatically calculates the value of this tunable from .Pa /etc/fstab from the partition the kernel was loaded from. The calculated value might be calculated incorrectly when .Pa /etc/fstab is not available during .Xr loader 8 startup (as during diskless booting from NFS), or if a different device is desired by the user. The preferred value can be set in .Pa /loader.conf . .Pp The value can also be overridden from the .Xr loader 8 command line. This is useful for system recovery when .Pa /etc/fstab is damaged, lost, or read from the wrong partition. .It Ar password Protect boot menu with a password without interrupting .Ic autoboot process. The password should be in clear text format. If a password is set, boot menu will not appear until any key is pressed during countdown period specified by .Va autoboot_delay variable or .Ic autoboot process fails. In both cases user should provide specified password to be able to access boot menu. .It Ar bootlock_password Provides a password to be required by check-password before execution is allowed to continue. The password should be in clear text format. If a password is set, the user must provide specified password to boot. .It Ar verbose_loading If set to .Dq YES , module names will be displayed as they are loaded. +.It Ar module_blacklist +Blacklist of modules. 
+Modules specified in the blacklist will not be loaded automatically with a +.Ar *_load +directive, but they may be loaded directly at the +.Xr loader 8 +prompt. +Blacklisted modules may still be loaded indirectly as dependencies of other +modules. .It Ar *_load If set to .Dq YES , that module will be loaded. If no name is defined (see below), the module's name is taken to be the same as the prefix. .It Ar *_name Defines the name of the module. .It Ar *_type Defines the module's type. If none is given, it defaults to a kld module. .It Ar *_flags Flags and parameters to be passed to the module. .It Ar *_before Commands to be executed before the module is loaded. Use of this setting should be avoided. .It Ar *_after Commands to be executed after the module is loaded. Use of this setting should be avoided. .It Ar *_error Commands to be executed if the loading of a module fails. Except for the special value .Dq abort , which aborts the bootstrap process, use of this setting should be avoided. .El .Pp .Em WARNING: developers should never use these suffixes for any kernel environment variables (tunables) or conflicts will result. .Sh DEFAULT SETTINGS Most of .Nm Ns 's default settings can be ignored. The few of them which are important or useful are: .Bl -tag -width bootfile -offset indent .It Va bitmap_load .Pq Dq NO If set to .Dq YES , a bitmap will be loaded to be displayed on screen while booting. .It Va bitmap_name .Pq Dq Pa /boot/splash.bmp Name of the bitmap to be loaded. Any other name can be used. .It Va comconsole_speed .Dq ( 9600 or the value of the .Va BOOT_COMCONSOLE_SPEED variable when .Xr loader 8 was compiled). Sets the speed of the serial console. If the previous boot loader stage specified that a serial console is in use then the default speed is determined from the current serial port speed setting. .It Va console .Pq Dq vidconsole .Dq comconsole selects serial console, .Dq vidconsole selects the video console, .Dq nullconsole selects a mute console (useful for systems with neither a video console nor a serial port), and .Dq spinconsole selects the video console which prevents any input and hides all output replacing it with .Dq spinning character (useful for embedded products and such). .It Va efi_max_resolution Specify the maximum desired resolution for the EFI console. The following values are accepted: .Bl -column "WidthxHeight" .It Sy Value Ta Sy Resolution .It 480p Ta 640x480 .It 720p Ta 1280x720 .It 1080p Ta 1920x1080 .It 2160p Ta 3840x2160 .It 4k Ta 3840x2160 .It 5k Ta 5120x2880 .It Va Width Ns x Ns Va Height Ta Va Width Ns x Ns Va Height .El .It Va kernel .Pq Dq kernel .It Va kernels .Pq Dq kernel kernel.old Space or comma separated list of kernels to present in the boot menu. .It Va loader_conf_files .Pq Dq Pa /boot/loader.conf /boot/loader.conf.local .It Va splash_bmp_load .Pq Dq NO If set to .Dq YES , will load the splash screen module, making it possible to display a bmp image on the screen while booting. .It Va splash_pcx_load .Pq Dq NO If set to .Dq YES , will load the splash screen module, making it possible to display a pcx image on the screen while booting. .It Va vesa_load .Pq Dq NO If set to .Dq YES , the vesa module will be loaded, enabling bitmaps above VGA resolution to be displayed. .It Va beastie_disable If set to .Dq YES , the beastie boot menu will be skipped. .It Va loader_logo Pq Dq Li orbbw Selects a desired logo in the beastie boot menu.
Possible values are: .Dq Li orbbw , .Dq Li orb , .Dq Li fbsdbw , .Dq Li beastiebw , .Dq Li beastie , and .Dq Li none . .It Va loader_color If set to .Dq NO , the beastie boot menu will be displayed without ANSI coloring. .It Va entropy_cache_load .Pq Dq YES If set to .Dq NO , the very early boot-time entropy file will not be loaded. See the entropy entries in .Xr rc.conf 5 . .It Va entropy_cache_name .Pq Dq /boot/entropy The name of the very early boot-time entropy cache file. .It Va cpu_microcode_load .Pq Dq NO If set to .Dq YES , the microcode update file specified by .Va cpu_microcode_name will be loaded and applied very early during boot. This provides functionality similar to .Xr cpucontrol 8 but ensures that CPU features enabled by microcode updates can be used by the kernel. The update will be re-applied automatically when resuming from an ACPI sleep state. If the update file contains updates for multiple processor models, the kernel will search for and extract a matching update. Currently this setting is supported only on Intel .Dv i386 and .Dv amd64 processors. It has no effect on other processor types. .It Va cpu_microcode_name A path to a microcode update file. .El .Sh OTHER SETTINGS Other settings that may be used in .Nm that have no default value: .Bl -tag -width bootfile -offset indent .It Va fdt_overlays Specifies a comma-delimited list of FDT overlays to apply. .Pa /boot/dtb/overlays is created by default for overlays to be placed in. .It Va kernels_autodetect If set to .Dq YES , attempt to auto-detect kernels installed in .Pa /boot . This is an option specific to the Lua-based loader. It is not available in the default Forth-based loader. .El .Sh FILES .Bl -tag -width /boot/defaults/loader.conf -compact .It Pa /boot/defaults/loader.conf default settings -- do not change this file. .It Pa /boot/loader.conf user defined settings. .It Pa /boot/loader.conf.local machine-specific settings for sites with a common loader.conf. .El .Sh SEE ALSO .Xr rc.conf 5 , .Xr boot 8 , .Xr cpucontrol 8 , .Xr loader 8 , .Xr loader.4th 8 .Sh HISTORY The file .Nm first appeared in .Fx 3.2 . .Sh AUTHORS This manual page was written by .An Daniel C. Sobral Aq dcs@FreeBSD.org . .Sh BUGS The .Xr loader 8 stops reading .Nm when it encounters a syntax error, so any options which are vital for booting a particular system (i.e.\& .Dq Va hw.ata.ata_dma Ns "=0" ) should precede any experimental additions to .Nm . Index: projects/openssl111/stand/lua/config.lua =================================================================== --- projects/openssl111/stand/lua/config.lua (revision 339239) +++ projects/openssl111/stand/lua/config.lua (revision 339240) @@ -1,624 +1,643 @@ -- -- SPDX-License-Identifier: BSD-2-Clause-FreeBSD -- -- Copyright (c) 2015 Pedro Souza -- Copyright (C) 2018 Kyle Evans -- All rights reserved. -- -- Redistribution and use in source and binary forms, with or without -- modification, are permitted provided that the following conditions -- are met: -- 1. Redistributions of source code must retain the above copyright -- notice, this list of conditions and the following disclaimer. -- 2. Redistributions in binary form must reproduce the above copyright -- notice, this list of conditions and the following disclaimer in the -- documentation and/or other materials provided with the distribution. 
-- -- THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND -- ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -- IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -- ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE -- FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -- DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS -- OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) -- HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT -- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY -- OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF -- SUCH DAMAGE. -- -- $FreeBSD$ -- local hook = require("hook") local config = {} local modules = {} local carousel_choices = {} -- Which variables we changed local env_changed = {} -- Values to restore env to (nil to unset) local env_restore = {} local MSG_FAILEXEC = "Failed to exec '%s'" local MSG_FAILSETENV = "Failed to '%s' with value: %s" local MSG_FAILOPENCFG = "Failed to open config: '%s'" local MSG_FAILREADCFG = "Failed to read config: '%s'" local MSG_FAILPARSECFG = "Failed to parse config: '%s'" local MSG_FAILEXBEF = "Failed to execute '%s' before loading '%s'" local MSG_FAILEXMOD = "Failed to execute '%s'" local MSG_FAILEXAF = "Failed to execute '%s' after loading '%s'" local MSG_MALFORMED = "Malformed line (%d):\n\t'%s'" local MSG_DEFAULTKERNFAIL = "No kernel set, failed to load from module_path" local MSG_KERNFAIL = "Failed to load kernel '%s'" local MSG_XENKERNFAIL = "Failed to load Xen kernel '%s'" local MSG_XENKERNLOADING = "Loading Xen kernel..." local MSG_KERNLOADING = "Loading kernel..." local MSG_MODLOADING = "Loading configured modules..." +local MSG_MODBLACKLIST = "Not loading blacklisted module '%s'" local MSG_MODLOADFAIL = "Could not load one or more modules!" local MODULEEXPR = '([%w-_]+)' local QVALEXPR = "\"([%w%s%p]-)\"" local QVALREPL = QVALEXPR:gsub('%%', '%%%%') local WORDEXPR = "([%w]+)" local WORDREPL = WORDEXPR:gsub('%%', '%%%%') local function restoreEnv() -- Examine changed environment variables for k, v in pairs(env_changed) do local restore_value = env_restore[k] if restore_value == nil then -- This one doesn't need restored for some reason goto continue end local current_value = loader.getenv(k) if current_value ~= v then -- This was overwritten by some action taken on the menu -- most likely; we'll leave it be. goto continue end restore_value = restore_value.value if restore_value ~= nil then loader.setenv(k, restore_value) else loader.unsetenv(k) end ::continue:: end env_changed = {} env_restore = {} end local function setEnv(key, value) -- Track the original value for this if we haven't already if env_restore[key] == nil then env_restore[key] = {value = loader.getenv(key)} end env_changed[key] = value return loader.setenv(key, value) end -- name here is one of 'name', 'type', flags', 'before', 'after', or 'error.' -- These are set from lines in loader.conf(5): ${key}_${name}="${value}" where -- ${key} is a module name. local function setKey(key, name, value) if modules[key] == nil then modules[key] = {} end modules[key][name] = value end -- Escapes the named value for use as a literal in a replacement pattern. -- e.g. dhcp.host-name gets turned into dhcp%.host%-name to remove the special -- meaning. 
local function escapeName(name) return name:gsub("([%p])", "%%%1") end local function processEnvVar(value) for name in value:gmatch("${([^}]+)}") do local replacement = loader.getenv(name) or "" value = value:gsub("${" .. escapeName(name) .. "}", replacement) end for name in value:gmatch("$([%w%p]+)%s*") do local replacement = loader.getenv(name) or "" value = value:gsub("$" .. escapeName(name), replacement) end return value end local function checkPattern(line, pattern) local function _realCheck(_line, _pattern) return _line:match(_pattern) end if pattern:find('$VALUE') then local k, v, c k, v, c = _realCheck(line, pattern:gsub('$VALUE', QVALREPL)) if k ~= nil then return k,v, c end return _realCheck(line, pattern:gsub('$VALUE', WORDREPL)) else return _realCheck(line, pattern) end end -- str in this table is a regex pattern. It will automatically be anchored to -- the beginning of a line and any preceding whitespace will be skipped. The -- pattern should have no more than two captures patterns, which correspond to -- the two parameters (usually 'key' and 'value') that are passed to the -- process function. All trailing characters will be validated. Any $VALUE -- token included in a pattern will be tried first with a quoted value capture -- group, then a single-word value capture group. This is our kludge for Lua -- regex not supporting branching. -- -- We have two special entries in this table: the first is the first entry, -- a full-line comment. The second is for 'exec' handling. Both have a single -- capture group, but the difference is that the full-line comment pattern will -- match the entire line. This does not run afoul of the later end of line -- validation that we'll do after a match. However, the 'exec' pattern will. -- We document the exceptions with a special 'groups' index that indicates -- the number of capture groups, if not two. We'll use this later to do -- validation on the proper entry. -- local pattern_table = { { str = "(#.*)", process = function(_, _) end, groups = 1, }, -- module_load="value" { str = MODULEEXPR .. "_load%s*=%s*$VALUE", process = function(k, v) if modules[k] == nil then modules[k] = {} end modules[k].load = v:upper() end, }, -- module_name="value" { str = MODULEEXPR .. "_name%s*=%s*$VALUE", process = function(k, v) setKey(k, "name", v) end, }, -- module_type="value" { str = MODULEEXPR .. "_type%s*=%s*$VALUE", process = function(k, v) setKey(k, "type", v) end, }, -- module_flags="value" { str = MODULEEXPR .. "_flags%s*=%s*$VALUE", process = function(k, v) setKey(k, "flags", v) end, }, -- module_before="value" { str = MODULEEXPR .. "_before%s*=%s*$VALUE", process = function(k, v) setKey(k, "before", v) end, }, -- module_after="value" { str = MODULEEXPR .. "_after%s*=%s*$VALUE", process = function(k, v) setKey(k, "after", v) end, }, -- module_error="value" { str = MODULEEXPR .. "_error%s*=%s*$VALUE", process = function(k, v) setKey(k, "error", v) end, }, -- exec="command" { str = "exec%s*=%s*" .. 
QVALEXPR, process = function(k, _) if cli_execute_unparsed(k) ~= 0 then print(MSG_FAILEXEC:format(k)) end end, groups = 1, }, -- env_var="value" { str = "([%w%p]+)%s*=%s*$VALUE", process = function(k, v) if setEnv(k, processEnvVar(v)) ~= 0 then print(MSG_FAILSETENV:format(k, v)) end end, }, -- env_var=num { str = "([%w%p]+)%s*=%s*(-?%d+)", process = function(k, v) if setEnv(k, processEnvVar(v)) ~= 0 then print(MSG_FAILSETENV:format(k, tostring(v))) end end, }, } local function isValidComment(line) if line ~= nil then local s = line:match("^%s*#.*") if s == nil then s = line:match("^%s*$") end if s == nil then return false end end return true end +local function getBlacklist() + local blacklist_str = loader.getenv('module_blacklist') + if blacklist_str == nil then + return nil + end + + local blacklist = {} + for mod in blacklist_str:gmatch("[;, ]?([%w-_]+)[;, ]?") do + blacklist[mod] = true + end + return blacklist +end + local function loadModule(mod, silent) local status = true + local blacklist = getBlacklist() local pstatus for k, v in pairs(mod) do if v.load ~= nil and v.load:lower() == "yes" then + local module_name = v.name or k + if blacklist ~= nil and blacklist[module_name] ~= nil then + if not silent then + print(MSG_MODBLACKLIST:format(module_name)) + end + goto continue + end local str = "load " if v.type ~= nil then str = str .. "-t " .. v.type .. " " end - if v.name ~= nil then - str = str .. v.name - else - str = str .. k - end + str = str .. module_name if v.flags ~= nil then str = str .. " " .. v.flags end if v.before ~= nil then pstatus = cli_execute_unparsed(v.before) == 0 if not pstatus and not silent then print(MSG_FAILEXBEF:format(v.before, k)) end status = status and pstatus end if cli_execute_unparsed(str) ~= 0 then if not silent then print(MSG_FAILEXMOD:format(str)) end if v.error ~= nil then cli_execute_unparsed(v.error) end status = false end if v.after ~= nil then pstatus = cli_execute_unparsed(v.after) == 0 if not pstatus and not silent then print(MSG_FAILEXAF:format(v.after, k)) end status = status and pstatus end end + ::continue:: end return status end local function readConfFiles(loaded_files) local f = loader.getenv("loader_conf_files") if f ~= nil then for name in f:gmatch("([%w%p]+)%s*") do if loaded_files[name] ~= nil then goto continue end local prefiles = loader.getenv("loader_conf_files") print("Loading " .. name) -- These may or may not exist, and that's ok. Do a -- silent parse so that we complain on parse errors but -- not for them simply not existing. if not config.processFile(name, true) then print(MSG_FAILPARSECFG:format(name)) end loaded_files[name] = true local newfiles = loader.getenv("loader_conf_files") if prefiles ~= newfiles then readConfFiles(loaded_files) end ::continue:: end end end local function readFile(name, silent) local f = io.open(name) if f == nil then if not silent then print(MSG_FAILOPENCFG:format(name)) end return nil end local text, _ = io.read(f) -- We might have read in the whole file, this won't be needed any more.
io.close(f) if text == nil and not silent then print(MSG_FAILREADCFG:format(name)) end return text end local function checkNextboot() local nextboot_file = loader.getenv("nextboot_conf") if nextboot_file == nil then return end local text = readFile(nextboot_file, true) if text == nil then return end if text:match("^nextboot_enable=\"NO\"") ~= nil then -- We're done; nextboot is not enabled return end if not config.parse(text) then print(MSG_FAILPARSECFG:format(nextboot_file)) end -- Attempt to rewrite the first line and only the first line of the -- nextboot_file. We overwrite it with nextboot_enable="NO", then -- check for that on load. -- It's worth noting that this won't work on every filesystem, so we -- won't do anything notable if we have any errors in this process. local nfile = io.open(nextboot_file, 'w') if nfile ~= nil then -- We need the trailing space here to account for the extra -- character taken up by the string nextboot_enable="YES" -- Or new end quotation mark lands on the S, and we want to -- rewrite the entirety of the first line. io.write(nfile, "nextboot_enable=\"NO\" ") io.close(nfile) end end -- Module exports config.verbose = false -- The first item in every carousel is always the default item. function config.getCarouselIndex(id) return carousel_choices[id] or 1 end function config.setCarouselIndex(id, idx) carousel_choices[id] = idx end -- Returns true if we processed the file successfully, false if we did not. -- If 'silent' is true, being unable to read the file is not considered a -- failure. function config.processFile(name, silent) if silent == nil then silent = false end local text = readFile(name, silent) if text == nil then return silent end return config.parse(text) end -- silent runs will not return false if we fail to open the file function config.parse(text) local n = 1 local status = true for line in text:gmatch("([^\n]+)") do if line:match("^%s*$") == nil then for _, val in ipairs(pattern_table) do local pattern = '^%s*' .. val.str .. '%s*(.*)'; local cgroups = val.groups or 2 local k, v, c = checkPattern(line, pattern) if k ~= nil then -- Offset by one, drats if cgroups == 1 then c = v v = nil end if isValidComment(c) then val.process(k, v) goto nextline end break end end print(MSG_MALFORMED:format(n, line)) status = false end ::nextline:: n = n + 1 end return status end -- other_kernel is optionally the name of a kernel to load, if not the default -- or autoloaded default from the module_path function config.loadKernel(other_kernel) local flags = loader.getenv("kernel_options") or "" local kernel = other_kernel or loader.getenv("kernel") local function tryLoad(names) for name in names:gmatch("([^;]+)%s*;?") do local r = loader.perform("load " .. name .. " " .. flags) if r == 0 then return name end end return nil end local function getModulePath() local module_path = loader.getenv("module_path") local kernel_path = loader.getenv("kernel_path") if kernel_path == nil then return module_path end -- Strip the loaded kernel path from module_path. This currently assumes -- that the kernel path will be prepended to the module_path when it's -- found. kernel_path = escapeName(kernel_path .. ';') return module_path:gsub(kernel_path, '') end local function loadBootfile() local bootfile = loader.getenv("bootfile") -- append default kernel name if bootfile == nil then bootfile = "kernel" else bootfile = bootfile .. 
";kernel" end return tryLoad(bootfile) end -- kernel not set, try load from default module_path if kernel == nil then local res = loadBootfile() if res ~= nil then -- Default kernel is loaded config.kernel_loaded = nil return true else print(MSG_DEFAULTKERNFAIL) return false end else -- Use our cached module_path, so we don't end up with multiple -- automatically added kernel paths to our final module_path local module_path = getModulePath() local res if other_kernel ~= nil then kernel = other_kernel end -- first try load kernel with module_path = /boot/${kernel} -- then try load with module_path=${kernel} local paths = {"/boot/" .. kernel, kernel} for _, v in pairs(paths) do loader.setenv("module_path", v) res = loadBootfile() -- succeeded, add path to module_path if res ~= nil then config.kernel_loaded = kernel if module_path ~= nil then loader.setenv("module_path", v .. ";" .. module_path) loader.setenv("kernel_path", v) end return true end end -- failed to load with ${kernel} as a directory -- try as a file res = tryLoad(kernel) if res ~= nil then config.kernel_loaded = kernel return true else print(MSG_KERNFAIL:format(kernel)) return false end end end function config.selectKernel(kernel) config.kernel_selected = kernel end function config.load(file, reloading) if not file then file = "/boot/defaults/loader.conf" end if not config.processFile(file) then print(MSG_FAILPARSECFG:format(file)) end local loaded_files = {file = true} readConfFiles(loaded_files) checkNextboot() local verbose = loader.getenv("verbose_loading") or "no" config.verbose = verbose:lower() == "yes" if not reloading then hook.runAll("config.loaded") end end -- Reload configuration function config.reload(file) modules = {} restoreEnv() config.load(file, true) hook.runAll("config.reloaded") end function config.loadelf() local xen_kernel = loader.getenv('xen_kernel') local kernel = config.kernel_selected or config.kernel_loaded local loaded if xen_kernel ~= nil then print(MSG_XENKERNLOADING) if cli_execute_unparsed('load ' .. xen_kernel) ~= 0 then print(MSG_XENKERNFAIL:format(xen_kernel)) return end end print(MSG_KERNLOADING) loaded = config.loadKernel(kernel) if not loaded then return end print(MSG_MODLOADING) if not loadModule(modules, not config.verbose) then print(MSG_MODLOADFAIL) end end hook.registerType("config.loaded") hook.registerType("config.reloaded") return config Index: projects/openssl111/stand/lua/core.lua =================================================================== --- projects/openssl111/stand/lua/core.lua (revision 339239) +++ projects/openssl111/stand/lua/core.lua (revision 339240) @@ -1,379 +1,396 @@ -- -- SPDX-License-Identifier: BSD-2-Clause-FreeBSD -- -- Copyright (c) 2015 Pedro Souza -- Copyright (c) 2018 Kyle Evans -- All rights reserved. -- -- Redistribution and use in source and binary forms, with or without -- modification, are permitted provided that the following conditions -- are met: -- 1. Redistributions of source code must retain the above copyright -- notice, this list of conditions and the following disclaimer. -- 2. Redistributions in binary form must reproduce the above copyright -- notice, this list of conditions and the following disclaimer in the -- documentation and/or other materials provided with the distribution. -- -- THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND -- ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE -- IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -- ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE -- FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -- DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS -- OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) -- HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT -- LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY -- OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF -- SUCH DAMAGE. -- -- $FreeBSD$ -- local config = require("config") local hook = require("hook") local core = {} +local default_safe_mode = false +local default_single_user = false +local default_verbose = false + local function composeLoaderCmd(cmd_name, argstr) if argstr ~= nil then cmd_name = cmd_name .. " " .. argstr end return cmd_name end +local function recordDefaults() + -- On i386, hint.acpi.0.rsdp will be set before we're loaded. On !i386, + -- it will generally be set upon execution of the kernel. Because of + -- this, we can't (or don't really want to) detect/disable ACPI on !i386 + -- reliably. Just set it enabled if we detect it and leave well enough + -- alone if we don't. + local boot_acpi = core.isSystem386() and core.getACPIPresent(false) + local boot_single = loader.getenv("boot_single") or "no" + local boot_verbose = loader.getenv("boot_verbose") or "no" + default_single_user = boot_single:lower() ~= "no" + default_verbose = boot_verbose:lower() ~= "no" + + if boot_acpi then + core.setACPI(true) + end + core.setSingleUser(default_single_user) + core.setVerbose(default_verbose) +end + + -- Globals -- try_include will return the loaded module on success, or nil on failure. -- A message will also be printed on failure, with one exception: non-verbose -- loading will suppress 'module not found' errors. function try_include(module) local status, ret = pcall(require, module) -- ret is the module if we succeeded. if status then return ret end -- Otherwise, ret is just a message; filter out ENOENT unless we're -- doing a verbose load. As a consequence, try_include prior to loading -- configuration will not display 'module not found'. All other errors -- in loading will be printed. if config.verbose or ret:match("^module .+ not found") == nil then error(ret, 2) end return nil end -- Module exports -- Commonly appearing constants core.KEY_BACKSPACE = 8 core.KEY_ENTER = 13 core.KEY_DELETE = 127 -- Note that this is a decimal representation, despite the leading 0 that in -- other contexts (outside of Lua) may mean 'octal' core.KEYSTR_ESCAPE = "\027" core.KEYSTR_CSI = core.KEYSTR_ESCAPE .. 
"[" core.MENU_RETURN = "return" core.MENU_ENTRY = "entry" core.MENU_SEPARATOR = "separator" core.MENU_SUBMENU = "submenu" core.MENU_CAROUSEL_ENTRY = "carousel_entry" function core.setVerbose(verbose) if verbose == nil then verbose = not core.verbose end if verbose then loader.setenv("boot_verbose", "YES") else loader.unsetenv("boot_verbose") end core.verbose = verbose end function core.setSingleUser(single_user) if single_user == nil then single_user = not core.su end if single_user then loader.setenv("boot_single", "YES") else loader.unsetenv("boot_single") end core.su = single_user end function core.getACPIPresent(checking_system_defaults) local c = loader.getenv("hint.acpi.0.rsdp") if c ~= nil then if checking_system_defaults then return true end -- Otherwise, respect disabled if it's set c = loader.getenv("hint.acpi.0.disabled") return c == nil or tonumber(c) ~= 1 end return false end function core.setACPI(acpi) if acpi == nil then acpi = not core.acpi end if acpi then loader.setenv("acpi_load", "YES") loader.setenv("hint.acpi.0.disabled", "0") loader.unsetenv("loader.acpi_disabled_by_user") else loader.unsetenv("acpi_load") loader.setenv("hint.acpi.0.disabled", "1") loader.setenv("loader.acpi_disabled_by_user", "1") end core.acpi = acpi end function core.setSafeMode(safe_mode) if safe_mode == nil then safe_mode = not core.sm end if safe_mode then loader.setenv("kern.smp.disabled", "1") loader.setenv("hw.ata.ata_dma", "0") loader.setenv("hw.ata.atapi_dma", "0") loader.setenv("hw.ata.wc", "0") loader.setenv("hw.eisa_slots", "0") loader.setenv("kern.eventtimer.periodic", "1") loader.setenv("kern.geom.part.check_integrity", "0") else loader.unsetenv("kern.smp.disabled") loader.unsetenv("hw.ata.ata_dma") loader.unsetenv("hw.ata.atapi_dma") loader.unsetenv("hw.ata.wc") loader.unsetenv("hw.eisa_slots") loader.unsetenv("kern.eventtimer.periodic") loader.unsetenv("kern.geom.part.check_integrity") end core.sm = safe_mode end function core.clearCachedKernels() -- Clear the kernel cache on config changes, autodetect might have -- changed or if we've switched boot environments then we could have -- a new kernel set. core.cached_kernels = nil end function core.kernelList() if core.cached_kernels ~= nil then return core.cached_kernels end local k = loader.getenv("kernel") local v = loader.getenv("kernels") local autodetect = loader.getenv("kernels_autodetect") or "" local kernels = {} local unique = {} local i = 0 if k ~= nil then i = i + 1 kernels[i] = k unique[k] = true end if v ~= nil then for n in v:gmatch("([^;, ]+)[;, ]?") do if unique[n] == nil then i = i + 1 kernels[i] = n unique[n] = true end end end -- Base whether we autodetect kernels or not on a loader.conf(5) -- setting, kernels_autodetect. If it's set to 'yes', we'll add -- any kernels we detect based on the criteria described. if autodetect:lower() ~= "yes" then core.cached_kernels = kernels return core.cached_kernels end -- Automatically detect other bootable kernel directories using a -- heuristic. Any directory in /boot that contains an ordinary file -- named "kernel" is considered eligible. for file in lfs.dir("/boot") do local fname = "/boot/" .. file if file == "." or file == ".." then goto continue end if lfs.attributes(fname, "mode") ~= "directory" then goto continue end if lfs.attributes(fname .. 
"/kernel", "mode") ~= "file" then goto continue end if unique[file] == nil then i = i + 1 kernels[i] = file unique[file] = true end ::continue:: end core.cached_kernels = kernels return core.cached_kernels end function core.bootenvDefault() return loader.getenv("zfs_be_active") end function core.bootenvList() local bootenv_count = tonumber(loader.getenv("bootenvs_count")) local bootenvs = {} local curenv local envcount = 0 local unique = {} if bootenv_count == nil or bootenv_count <= 0 then return bootenvs end -- Currently selected bootenv is always first/default curenv = core.bootenvDefault() if curenv ~= nil then envcount = envcount + 1 bootenvs[envcount] = curenv unique[curenv] = true end for curenv_idx = 0, bootenv_count - 1 do curenv = loader.getenv("bootenvs[" .. curenv_idx .. "]") if curenv ~= nil and unique[curenv] == nil then envcount = envcount + 1 bootenvs[envcount] = curenv unique[curenv] = true end end return bootenvs end function core.setDefaults() core.setACPI(core.getACPIPresent(true)) - core.setSafeMode(false) - core.setSingleUser(false) - core.setVerbose(false) + core.setSafeMode(default_safe_mode) + core.setSingleUser(default_single_user) + core.setVerbose(default_verbose) end function core.autoboot(argstr) -- loadelf() only if we've not already loaded a kernel if loader.getenv("kernelname") == nil then config.loadelf() end loader.perform(composeLoaderCmd("autoboot", argstr)) end function core.boot(argstr) -- loadelf() only if we've not already loaded a kernel if loader.getenv("kernelname") == nil then config.loadelf() end loader.perform(composeLoaderCmd("boot", argstr)) end function core.isSingleUserBoot() local single_user = loader.getenv("boot_single") return single_user ~= nil and single_user:lower() == "yes" end function core.isUEFIBoot() local efiver = loader.getenv("efi-version") return efiver ~= nil end function core.isZFSBoot() local c = loader.getenv("currdev") if c ~= nil then return c:match("^zfs:") ~= nil end return false end function core.isSerialBoot() local s = loader.getenv("boot_serial") if s ~= nil then return true end local m = loader.getenv("boot_multicons") if m ~= nil then return true end return false end function core.isSystem386() return loader.machine_arch == "i386" end -- Is the menu skipped in the environment in which we've booted? function core.isMenuSkipped() return string.lower(loader.getenv("beastie_disable") or "") == "yes" end -- This may be a better candidate for a 'utility' module. function core.deepCopyTable(tbl) local new_tbl = {} for k, v in pairs(tbl) do if type(v) == "table" then new_tbl[k] = core.deepCopyTable(v) else new_tbl[k] = v end end return new_tbl end -- XXX This should go away if we get the table lib into shape for importing. -- As of now, it requires some 'os' functions, so we'll implement this in lua -- for our uses function core.popFrontTable(tbl) -- Shouldn't reasonably happen if #tbl == 0 then return nil, nil elseif #tbl == 1 then return tbl[1], {} end local first_value = tbl[1] local new_tbl = {} -- This is not a cheap operation for k, v in ipairs(tbl) do if k > 1 then new_tbl[k - 1] = v end end return first_value, new_tbl end --- On i386, hint.acpi.0.rsdp will be set before we're loaded. On !i386, it will --- generally be set upon execution of the kernel. Because of this, we can't (or --- don't really want to) detect/disable ACPI on !i386 reliably. Just set it --- enabled if we detect it and leave well enough alone if we don't. 
-if core.isSystem386() and core.getACPIPresent(false) then - core.setACPI(true) -end - +recordDefaults() hook.register("config.reloaded", core.clearCachedKernels) return core Index: projects/openssl111/sys/amd64/conf/GENERIC =================================================================== --- projects/openssl111/sys/amd64/conf/GENERIC (revision 339239) +++ projects/openssl111/sys/amd64/conf/GENERIC (revision 339240) @@ -1,380 +1,379 @@ # # GENERIC -- Generic kernel configuration file for FreeBSD/amd64 # # For more information on this file, please read the config(5) manual page, # and/or the handbook section on Kernel Configuration Files: # # https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html # # The handbook is also available locally in /usr/share/doc/handbook # if you've installed the doc distribution, otherwise always see the # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the # latest information. # # An exhaustive list of options and more detailed explanations of the # device lines is also present in the ../../conf/NOTES and NOTES files. # If you are in doubt as to the purpose or necessity of a line, check first # in NOTES. # # $FreeBSD$ cpu HAMMER ident GENERIC makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support options SCHED_ULE # ULE scheduler options NUMA # Non-Uniform Memory Architecture support options PREEMPTION # Enable kernel thread preemption options VIMAGE # Subsystem virtualization, e.g. VNET options INET # InterNETworking options INET6 # IPv6 communications protocols options IPSEC # IP (v4/v6) security options IPSEC_SUPPORT # Allow kldload of ipsec and tcpmd5 options TCP_OFFLOAD # TCP offload options TCP_BLACKBOX # Enhanced TCP event logging options TCP_HHOOK # hhook(9) framework for TCP options TCP_RFC7413 # TCP Fast Open options SCTP # Stream Control Transmission Protocol options FFS # Berkeley Fast Filesystem options SOFTUPDATES # Enable FFS soft updates support options UFS_ACL # Support for access control lists options UFS_DIRHASH # Improve performance on big directories options UFS_GJOURNAL # Enable gjournal-based UFS journaling options QUOTA # Enable disk quotas for UFS options MD_ROOT # MD is a potential root device options NFSCL # Network Filesystem Client options NFSD # Network Filesystem Server options NFSLOCKD # Network Lock Manager options NFS_ROOT # NFS usable as /, requires NFSCL options MSDOSFS # MSDOS Filesystem options CD9660 # ISO 9660 Filesystem options PROCFS # Process filesystem (requires PSEUDOFS) options PSEUDOFS # Pseudo-filesystem framework -options GEOM_PART_GPT # GUID Partition Tables. options GEOM_RAID # Soft RAID functionality. 
options GEOM_LABEL # Provides labelization options EFIRT # EFI Runtime Services support options COMPAT_FREEBSD32 # Compatible with i386 binaries options COMPAT_FREEBSD4 # Compatible with FreeBSD4 options COMPAT_FREEBSD5 # Compatible with FreeBSD5 options COMPAT_FREEBSD6 # Compatible with FreeBSD6 options COMPAT_FREEBSD7 # Compatible with FreeBSD7 options COMPAT_FREEBSD9 # Compatible with FreeBSD9 options COMPAT_FREEBSD10 # Compatible with FreeBSD10 options COMPAT_FREEBSD11 # Compatible with FreeBSD11 options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI options KTRACE # ktrace(1) support options STACK # stack(9) support options SYSVSHM # SYSV-style shared memory options SYSVMSG # SYSV-style message queues options SYSVSEM # SYSV-style semaphores options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed. options KBD_INSTALL_CDEV # install a CDEV entry in /dev options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4) options AUDIT # Security event auditing options CAPABILITY_MODE # Capsicum capability mode options CAPABILITIES # Capsicum capabilities options MAC # TrustedBSD MAC Framework options KDTRACE_FRAME # Ensure frames are compiled in options KDTRACE_HOOKS # Kernel DTrace hooks options DDB_CTF # Kernel ELF linker loads CTF data options INCLUDE_CONFIG_FILE # Include this file in kernel options RACCT # Resource accounting framework options RACCT_DEFAULT_TO_DISABLED # Set kern.racct.enable=0 by default options RCTL # Resource limits # Debugging support. Always need this: options KDB # Enable kernel debugger support. options KDB_TRACE # Print a stack trace for a panic. # For full debugger support use (turn off in stable branch): options BUF_TRACKING # Track buffer history options DDB # Support DDB. options FULL_BUF_TRACKING # Track more buffer history options GDB # Support remote GDB. options DEADLKRES # Enable the deadlock resolver options INVARIANTS # Enable calls of extra sanity checking options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS options WITNESS # Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones # Kernel dump features. options EKCD # Support for encrypted kernel dumps options GZIO # gzip-compressed kernel and user dumps options ZSTDIO # zstd-compressed kernel and user dumps options NETDUMP # netdump(4) client support # Make an SMP-capable kernel by default options SMP # Symmetric MultiProcessor Kernel options EARLY_AP_STARTUP # CPU frequency control device cpufreq # Bus support. 
device acpi options ACPI_DMAR device pci options PCI_HP # PCI-Express native HotPlug options PCI_IOV # PCI SR-IOV support # Floppy drives device fdc # ATA controllers device ahci # AHCI-compatible SATA controllers device ata # Legacy ATA/SATA controllers device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA device siis # SiliconImage SiI3124/SiI3132/SiI3531 SATA # SCSI Controllers device ahc # AHA2940 and onboard AIC7xxx devices device ahd # AHA39320/29320 and onboard AIC79xx devices device esp # AMD Am53C974 (Tekram DC-390(T)) device hptiop # Highpoint RocketRaid 3xxx series device isp # Qlogic family #device ispfw # Firmware for QLogic HBAs- normally a module device mpt # LSI-Logic MPT-Fusion device mps # LSI-Logic MPT-Fusion 2 device mpr # LSI-Logic MPT-Fusion 3 #device ncr # NCR/Symbios Logic device sym # NCR/Symbios Logic (newer chipsets + those of `ncr') device trm # Tekram DC395U/UW/F DC315U adapters device adv # Advansys SCSI adapters device adw # Advansys wide SCSI adapters device aic # Adaptec 15[012]x SCSI adapters, AIC-6[23]60. device bt # Buslogic/Mylex MultiMaster SCSI adapters device isci # Intel C600 SAS controller device ocs_fc # Emulex FC adapters # ATA/SCSI peripherals device scbus # SCSI bus (required for ATA/SCSI) device ch # SCSI media changers device da # Direct Access (disks) device sa # Sequential Access (tape etc) device cd # CD device pass # Passthrough device (direct ATA/SCSI access) device ses # Enclosure Services (SES and SAF-TE) #device ctl # CAM Target Layer # RAID controllers interfaced to the SCSI subsystem device amr # AMI MegaRAID device arcmsr # Areca SATA II RAID device ciss # Compaq Smart RAID 5* device dpt # DPT Smartcache III, IV - See NOTES for options device hptmv # Highpoint RocketRAID 182x device hptnr # Highpoint DC7280, R750 device hptrr # Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx device hpt27xx # Highpoint RocketRAID 27xx device iir # Intel Integrated RAID device ips # IBM (Adaptec) ServeRAID device mly # Mylex AcceleRAID/eXtremeRAID device twa # 3ware 9000 series PATA/SATA RAID device smartpqi # Microsemi smartpqi driver device tws # LSI 3ware 9750 SATA+SAS 6Gb/s RAID controller # RAID controllers device aac # Adaptec FSA RAID device aacp # SCSI passthrough for aac (requires CAM) device aacraid # Adaptec by PMC RAID device ida # Compaq Smart RAID device mfi # LSI MegaRAID SAS device mlx # Mylex DAC960 family device mrsas # LSI/Avago MegaRAID SAS/SATA, 6Gb/s and 12Gb/s device pmspcv # PMC-Sierra SAS/SATA Controller driver #XXX pointer/int warnings #device pst # Promise Supertrak SX6000 device twe # 3ware ATA RAID # NVM Express (NVMe) support device nvme # base NVMe driver device nvd # expose NVMe namespaces as disks, depends on nvme # atkbdc0 controls both the keyboard and the PS/2 mouse device atkbdc # AT keyboard controller device atkbd # AT keyboard device psm # PS/2 mouse device kbdmux # keyboard multiplexer device vga # VGA video card driver options VESA # Add support for VESA BIOS Extensions (VBE) device splash # Splash screen and screen saver support # syscons is the default console driver, resembling an SCO console device sc options SC_PIXEL_MODE # add support for the raster text mode # vt is the new video console driver device vt device vt_vga device vt_efifb device agp # support several AGP chipsets # PCCARD (PCMCIA) support # PCMCIA and cardbus bridge support device cbb # cardbus (yenta) bridge device pccard # PC Card (16-bit) bus device cardbus # CardBus (32-bit) bus # Serial (COM) ports device uart # Generic UART driver # 
Parallel port device ppc device ppbus # Parallel port bus (required) device lpt # Printer device ppi # Parallel port interface device #device vpo # Requires scbus and da device puc # Multi I/O cards and multi-channel UARTs # PCI Ethernet NICs. device bxe # Broadcom NetXtreme II BCM5771X/BCM578XX 10GbE device de # DEC/Intel DC21x4x (``Tulip'') device em # Intel PRO/1000 Gigabit Ethernet Family device ix # Intel PRO/10GbE PCIE PF Ethernet device ixv # Intel PRO/10GbE PCIE VF Ethernet device ixl # Intel XL710 40Gbe PCIE Ethernet #options IXL_IW # Enable iWARP Client Interface in ixl(4) #device ixlv # Intel XL710 40Gbe VF PCIE Ethernet device le # AMD Am7900 LANCE and Am79C9xx PCnet device ti # Alteon Networks Tigon I/II gigabit Ethernet device txp # 3Com 3cR990 (``Typhoon'') device vx # 3Com 3c590, 3c595 (``Vortex'') # PCI Ethernet NICs that use the common MII bus controller code. # NOTE: Be sure to keep the 'device miibus' line in order to use these NICs! device miibus # MII bus support device ae # Attansic/Atheros L2 FastEthernet device age # Attansic/Atheros L1 Gigabit Ethernet device alc # Atheros AR8131/AR8132 Ethernet device ale # Atheros AR8121/AR8113/AR8114 Ethernet device bce # Broadcom BCM5706/BCM5708 Gigabit Ethernet device bfe # Broadcom BCM440x 10/100 Ethernet device bge # Broadcom BCM570xx Gigabit Ethernet device cas # Sun Cassini/Cassini+ and NS DP83065 Saturn device dc # DEC/Intel 21143 and various workalikes device et # Agere ET1310 10/100/Gigabit Ethernet device fxp # Intel EtherExpress PRO/100B (82557, 82558) device gem # Sun GEM/Sun ERI/Apple GMAC device hme # Sun HME (Happy Meal Ethernet) device jme # JMicron JMC250 Gigabit/JMC260 Fast Ethernet device lge # Level 1 LXT1001 gigabit Ethernet device msk # Marvell/SysKonnect Yukon II Gigabit Ethernet device nfe # nVidia nForce MCP on-board Ethernet device nge # NatSemi DP83820 gigabit Ethernet device pcn # AMD Am79C97x PCI 10/100 (precedence over 'le') device re # RealTek 8139C+/8169/8169S/8110S device rl # RealTek 8129/8139 device sf # Adaptec AIC-6915 (``Starfire'') device sge # Silicon Integrated Systems SiS190/191 device sis # Silicon Integrated Systems SiS 900/SiS 7016 device sk # SysKonnect SK-984x & SK-982x gigabit Ethernet device ste # Sundance ST201 (D-Link DFE-550TX) device stge # Sundance/Tamarack TC9021 gigabit Ethernet device tl # Texas Instruments ThunderLAN device tx # SMC EtherPower II (83c170 ``EPIC'') device vge # VIA VT612x gigabit Ethernet device vr # VIA Rhine, Rhine II device wb # Winbond W89C840F device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'') # Wireless NIC cards device wlan # 802.11 support options IEEE80211_DEBUG # enable debug msgs options IEEE80211_AMPDU_AGE # age frames in AMPDU reorder q's options IEEE80211_SUPPORT_MESH # enable 802.11s draft support device wlan_wep # 802.11 WEP support device wlan_ccmp # 802.11 CCMP support device wlan_tkip # 802.11 TKIP support device wlan_amrr # AMRR transmit rate control algorithm device an # Aironet 4500/4800 802.11 wireless NICs. device ath # Atheros NICs device ath_pci # Atheros pci/cardbus glue device ath_hal # pci/cardbus chip support options AH_SUPPORT_AR5416 # enable AR5416 tx/rx descriptors options AH_AR5416_INTERRUPT_MITIGATION # AR5416 interrupt mitigation options ATH_ENABLE_11N # Enable 802.11n support for AR5416 and later device ath_rate_sample # SampleRate tx rate control for ath #device bwi # Broadcom BCM430x/BCM431x wireless NICs. #device bwn # Broadcom BCM43xx wireless NICs. device ipw # Intel 2100 wireless NICs. 
device iwi # Intel 2200BG/2225BG/2915ABG wireless NICs. device iwn # Intel 4965/1000/5000/6000 wireless NICs. device malo # Marvell Libertas wireless NICs. device mwl # Marvell 88W8363 802.11n wireless NICs. device ral # Ralink Technology RT2500 wireless NICs. device wi # WaveLAN/Intersil/Symbol 802.11 wireless NICs. device wpi # Intel 3945ABG wireless NICs. # Pseudo devices. device crypto # core crypto support device loop # Network loopback device random # Entropy device device padlock_rng # VIA Padlock RNG device rdrand_rng # Intel Bull Mountain RNG device ether # Ethernet support device vlan # 802.1Q VLAN support device tun # Packet tunnel. device md # Memory "disks" device gif # IPv6 and IPv4 tunneling device firmware # firmware assist module # The `bpf' device enables the Berkeley Packet Filter. # Be aware of the administrative consequences of enabling this! # Note that 'bpf' is required for DHCP. device bpf # Berkeley packet filter # USB support options USB_DEBUG # enable debug msgs device uhci # UHCI PCI->USB interface device ohci # OHCI PCI->USB interface device ehci # EHCI PCI->USB interface (USB 2.0) device xhci # XHCI PCI->USB interface (USB 3.0) device usb # USB Bus (required) device ukbd # Keyboard device umass # Disks/Mass storage - Requires scbus and da # Sound support device sound # Generic sound driver (required) device snd_cmi # CMedia CMI8338/CMI8738 device snd_csa # Crystal Semiconductor CS461x/428x device snd_emu10kx # Creative SoundBlaster Live! and Audigy device snd_es137x # Ensoniq AudioPCI ES137x device snd_hda # Intel High Definition Audio device snd_ich # Intel, NVidia and other ICH AC'97 Audio device snd_via8233 # VIA VT8233x Audio # MMC/SD device mmc # MMC/SD bus device mmcsd # MMC/SD memory card device sdhci # Generic PCI SD Host Controller # VirtIO support device virtio # Generic VirtIO bus (required) device virtio_pci # VirtIO PCI device device vtnet # VirtIO Ethernet device device virtio_blk # VirtIO Block device device virtio_scsi # VirtIO SCSI device device virtio_balloon # VirtIO Memory Balloon device # HyperV drivers and enhancement support device hyperv # HyperV drivers # Xen HVM Guest Optimizations # NOTE: XENHVM depends on xenpci. They must be added or removed together. 
options XENHVM # Xen HVM kernel infrastructure device xenpci # Xen HVM Hypervisor services driver # VMware support device vmx # VMware VMXNET3 Ethernet # Netmap provides direct access to TX/RX rings on supported NICs device netmap # netmap(4) support Index: projects/openssl111/sys/amd64/conf/GENERIC-MMCCAM =================================================================== --- projects/openssl111/sys/amd64/conf/GENERIC-MMCCAM (revision 339239) +++ projects/openssl111/sys/amd64/conf/GENERIC-MMCCAM (revision 339240) @@ -1,36 +1,35 @@ # MMCCAM is the kernel config for doing MMC on CAM development # and testing on bhyve # $FreeBSD$ include MINIMAL ident GENERIC-MMCCAM # Access GPT-formatted and labeled root volume -options GEOM_PART_GPT options GEOM_LABEL # UART -- for bhyve console device uart # kgdb stub device bvmdebug # VirtIO support, needed for bhyve device virtio # Generic VirtIO bus (required) device virtio_pci # VirtIO PCI device device vtnet # VirtIO Ethernet device device virtio_blk # VirtIO Block device device virtio_scsi # VirtIO SCSI device device virtio_balloon # VirtIO Memory Balloon device # CAM-specific stuff device pass device scbus device da options MMCCAM # Add CAMDEBUG stuff options CAMDEBUG options CAM_DEBUG_FLAGS=(CAM_DEBUG_INFO|CAM_DEBUG_PROBE|CAM_DEBUG_PERIPH) Index: projects/openssl111/sys/arm64/conf/GENERIC =================================================================== --- projects/openssl111/sys/arm64/conf/GENERIC (revision 339239) +++ projects/openssl111/sys/arm64/conf/GENERIC (revision 339240) @@ -1,277 +1,276 @@ # # GENERIC -- Generic kernel configuration file for FreeBSD/arm64 # # For more information on this file, please read the config(5) manual page, # and/or the handbook section on Kernel Configuration Files: # # https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html # # The handbook is also available locally in /usr/share/doc/handbook # if you've installed the doc distribution, otherwise always see the # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the # latest information. # # An exhaustive list of options and more detailed explanations of the # device lines is also present in the ../../conf/NOTES and NOTES files. # If you are in doubt as to the purpose or necessity of a line, check first # in NOTES. # # $FreeBSD$ cpu ARM64 ident GENERIC makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support options SCHED_ULE # ULE scheduler options PREEMPTION # Enable kernel thread preemption options VIMAGE # Subsystem virtualization, e.g. 
VNET options INET # InterNETworking options INET6 # IPv6 communications protocols options IPSEC # IP (v4/v6) security options IPSEC_SUPPORT # Allow kldload of ipsec and tcpmd5 options TCP_HHOOK # hhook(9) framework for TCP options TCP_OFFLOAD # TCP offload options TCP_RFC7413 # TCP Fast Open options SCTP # Stream Control Transmission Protocol options FFS # Berkeley Fast Filesystem options SOFTUPDATES # Enable FFS soft updates support options UFS_ACL # Support for access control lists options UFS_DIRHASH # Improve performance on big directories options UFS_GJOURNAL # Enable gjournal-based UFS journaling options QUOTA # Enable disk quotas for UFS options MD_ROOT # MD is a potential root device options NFSCL # Network Filesystem Client options NFSD # Network Filesystem Server options NFSLOCKD # Network Lock Manager options NFS_ROOT # NFS usable as /, requires NFSCL options MSDOSFS # MSDOS Filesystem options CD9660 # ISO 9660 Filesystem options PROCFS # Process filesystem (requires PSEUDOFS) options PSEUDOFS # Pseudo-filesystem framework -options GEOM_PART_GPT # GUID Partition Tables. options GEOM_RAID # Soft RAID functionality. options GEOM_LABEL # Provides labelization options COMPAT_FREEBSD32 # Incomplete, but used by cloudabi32.ko. options COMPAT_FREEBSD11 # Compatible with FreeBSD11 options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI options KTRACE # ktrace(1) support options STACK # stack(9) support options SYSVSHM # SYSV-style shared memory options SYSVMSG # SYSV-style message queues options SYSVSEM # SYSV-style semaphores options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed. options KBD_INSTALL_CDEV # install a CDEV entry in /dev options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4) options AUDIT # Security event auditing options CAPABILITY_MODE # Capsicum capability mode options CAPABILITIES # Capsicum capabilities options MAC # TrustedBSD MAC Framework options KDTRACE_FRAME # Ensure frames are compiled in options KDTRACE_HOOKS # Kernel DTrace hooks options VFP # Floating-point support options RACCT # Resource accounting framework options RACCT_DEFAULT_TO_DISABLED # Set kern.racct.enable=0 by default options RCTL # Resource limits options SMP options INTRNG # Debugging support. Always need this: options KDB # Enable kernel debugger support. options KDB_TRACE # Print a stack trace for a panic. # For full debugger support use (turn off in stable branch): options DDB # Support DDB. #options GDB # Support remote GDB. options DEADLKRES # Enable the deadlock resolver options INVARIANTS # Enable calls of extra sanity checking options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS options WITNESS # Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones options ALT_BREAK_TO_DEBUGGER # Enter debugger on keyboard escape sequence options USB_DEBUG # enable debug msgs # Kernel dump features. 
options EKCD # Support for encrypted kernel dumps options GZIO # gzip-compressed kernel and user dumps options ZSTDIO # zstd-compressed kernel and user dumps options NETDUMP # netdump(4) client support # SoC support options SOC_ALLWINNER_A64 options SOC_ALLWINNER_H5 options SOC_CAVM_THUNDERX options SOC_HISI_HI6220 options SOC_BRCM_BCM2837 options SOC_ROCKCHIP_RK3328 options SOC_XILINX_ZYNQ # Timer drivers device a10_timer # Annapurna Alpine drivers device al_ccu # Alpine Cache Coherency Unit device al_nb_service # Alpine North Bridge Service device al_iofic # I/O Fabric Interrupt Controller device al_serdes # Serializer/Deserializer device al_udma # Universal DMA # Qualcomm Snapdragon drivers device qcom_gcc # Global Clock Controller # VirtIO support device virtio device virtio_pci device virtio_mmio device virtio_blk device vtnet # CPU frequency control device cpufreq # Bus drivers device pci device al_pci # Annapurna Alpine PCI-E options PCI_HP # PCI-Express native HotPlug options PCI_IOV # PCI SR-IOV support # Ethernet NICs device mdio device mii device miibus # MII bus support device awg # Allwinner EMAC Gigabit Ethernet device axgbe # AMD Opteron A1100 integrated NIC device em # Intel PRO/1000 Gigabit Ethernet Family device ix # Intel 10Gb Ethernet Family device msk # Marvell/SysKonnect Yukon II Gigabit Ethernet device neta # Marvell Armada 370/38x/XP/3700 NIC device smc # SMSC LAN91C111 device vnic # Cavium ThunderX NIC device al_eth # Annapurna Alpine Ethernet NIC device dwc_rk # Rockchip Designware # Block devices device ahci device scbus device da # ATA/SCSI peripherals device pass # Passthrough device (direct ATA/SCSI access) # MMC/SD/SDIO Card slot support device sdhci device sdhci_xenon # Marvell Xenon SD/MMC controller device aw_mmc # Allwinner SD/MMC controller device mmc # mmc/sd bus device mmcsd # mmc/sd flash cards device dwmmc # Serial (COM) ports device uart # Generic UART driver device uart_msm # Qualcomm MSM UART driver device uart_mu # RPI3 aux port device uart_mvebu # Armada 3700 UART driver device uart_ns8250 # ns8250-type UART driver device uart_snps device pl011 # USB support device aw_ehci # Allwinner EHCI USB interface (USB 2.0) device aw_usbphy # Allwinner USB PHY device dwcotg # DWC OTG controller device ohci # OHCI USB interface device ehci # EHCI USB interface (USB 2.0) device ehci_mv # Marvell EHCI USB interface device xhci # XHCI PCI->USB interface (USB 3.0) device xhci_mv # Marvell XHCI USB interface device usb # USB Bus (required) device ukbd # Keyboard device umass # Disks/Mass storage - Requires scbus and da # USB ethernet support device muge device smcphy device smsc # GPIO device aw_gpio # Allwinner GPIO controller device gpio device gpioled device fdt_pinctrl # I2C device aw_rsb # Allwinner Reduced Serial Bus device bcm2835_bsc # Broadcom BCM283x I2C bus device iicbus device iic device twsi # Allwinner I2C controller # Clock and reset controllers device aw_ccu # Allwinner clock controller # Interrupt controllers device aw_nmi # Allwinner NMI support # Real-time clock support device aw_rtc # Allwinner Real-time Clock device mv_rtc # Marvell Real-time Clock # Watchdog controllers device aw_wdog # Allwinner Watchdog # Power management controllers device axp81x # X-Powers AXP81x PMIC # EFUSE device aw_sid # Allwinner Secure ID EFUSE # Thermal sensors device aw_thermal # Allwinner Thermal Sensor Controller # SPI device spibus device bcm2835_spi # Broadcom BCM283x SPI bus # Console device vt device kbdmux device vt_efifb # Pseudo devices. 
device crypto # core crypto support device loop # Network loopback device random # Entropy device device ether # Ethernet support device vlan # 802.1Q VLAN support device tun # Packet tunnel. device md # Memory "disks" device gif # IPv6 and IPv4 tunneling device firmware # firmware assist module options EFIRT # EFI Runtime Services # EXT_RESOURCES pseudo devices options EXT_RESOURCES device clk device phy device hwreset device nvmem device regulator device syscon device aw_syscon # The `bpf' device enables the Berkeley Packet Filter. # Be aware of the administrative consequences of enabling this! # Note that 'bpf' is required for DHCP. device bpf # Berkeley packet filter # Chip-specific errata options THUNDERX_PASS_1_1_ERRATA options FDT device acpi # DTBs makeoptions MODULES_EXTRA="dtb/allwinner" Index: projects/openssl111/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c =================================================================== --- projects/openssl111/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c (revision 339239) +++ projects/openssl111/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c (revision 339240) @@ -1,3938 +1,3939 @@ /* * CDDL HEADER START * * The contents of this file are subject to the terms of the * Common Development and Distribution License (the "License"). * You may not use this file except in compliance with the License. * * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE * or http://www.opensolaris.org/os/licensing. * See the License for the specific language governing permissions * and limitations under the License. * * When distributing Covered Code, include this CDDL HEADER in each * file and include the License file at usr/src/OPENSOLARIS.LICENSE. * If applicable, add the following below this CDDL HEADER, with the * fields enclosed by brackets "[]" replaced with your own identifying * information: Portions Copyright [yyyy] [name of copyright owner] * * CDDL HEADER END */ /* * Copyright (c) 2008, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2011, 2018 by Delphix. All rights reserved. * Copyright 2016 Gary Mills * Copyright (c) 2011, 2017 by Delphix. All rights reserved. * Copyright 2017 Joyent, Inc. * Copyright (c) 2017 Datto Inc. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef _KERNEL #include #endif /* * Grand theory statement on scan queue sorting * * Scanning is implemented by recursively traversing all indirection levels * in an object and reading all blocks referenced from said objects. This * results in us approximately traversing the object from lowest logical * offset to the highest. For best performance, we would want the logical * blocks to be physically contiguous. However, this is frequently not the * case with pools given the allocation patterns of copy-on-write filesystems. * So instead, we put the I/Os into a reordering queue and issue them in a * way that will most benefit physical disks (LBA-order). * * Queue management: * * Ideally, we would want to scan all metadata and queue up all block I/O * prior to starting to issue it, because that allows us to do an optimal * sorting job. This can however consume large amounts of memory. Therefore * we continuously monitor the size of the queues and constrain them to 5% * (zfs_scan_mem_lim_fact) of physmem. 
If the queues grow larger than this * limit, we clear out a few of the largest extents at the head of the queues * to make room for more scanning. Hopefully, these extents will be fairly * large and contiguous, allowing us to approach sequential I/O throughput * even without a fully sorted tree. * * Metadata scanning takes place in dsl_scan_visit(), which is called from * dsl_scan_sync() every spa_sync(). If we have either fully scanned all * metadata on the pool, or we need to make room in memory because our * queues are too large, dsl_scan_visit() is postponed and * scan_io_queues_run() is called from dsl_scan_sync() instead. This implies * that metadata scanning and queued I/O issuing are mutually exclusive. This * allows us to provide maximum sequential I/O throughput for the majority of * I/O's issued since sequential I/O performance is significantly negatively * impacted if it is interleaved with random I/O. * * Implementation Notes * * One side effect of the queued scanning algorithm is that the scanning code * needs to be notified whenever a block is freed. This is needed to allow * the scanning code to remove these I/Os from the issuing queue. Additionally, * we do not attempt to queue gang blocks to be issued sequentially since this * is very hard to do and would have an extremely limitted performance benefit. * Instead, we simply issue gang I/Os as soon as we find them using the legacy * algorithm. * * Backwards compatibility * * This new algorithm is backwards compatible with the legacy on-disk data * structures (and therefore does not require a new feature flag). * Periodically during scanning (see zfs_scan_checkpoint_intval), the scan * will stop scanning metadata (in logical order) and wait for all outstanding * sorted I/O to complete. Once this is done, we write out a checkpoint * bookmark, indicating that we have scanned everything logically before it. * If the pool is imported on a machine without the new sorting algorithm, * the scan simply resumes from the last checkpoint using the legacy algorithm. */ typedef int (scan_cb_t)(dsl_pool_t *, const blkptr_t *, const zbookmark_phys_t *); static scan_cb_t dsl_scan_scrub_cb; static int scan_ds_queue_compare(const void *a, const void *b); static int scan_prefetch_queue_compare(const void *a, const void *b); static void scan_ds_queue_clear(dsl_scan_t *scn); static boolean_t scan_ds_queue_contains(dsl_scan_t *scn, uint64_t dsobj, uint64_t *txg); static void scan_ds_queue_insert(dsl_scan_t *scn, uint64_t dsobj, uint64_t txg); static void scan_ds_queue_remove(dsl_scan_t *scn, uint64_t dsobj); static void scan_ds_queue_sync(dsl_scan_t *scn, dmu_tx_t *tx); extern int zfs_vdev_async_write_active_min_dirty_percent; /* * By default zfs will check to ensure it is not over the hard memory * limit before each txg. If finer-grained control of this is needed * this value can be set to 1 to enable checking before scanning each * block. */ int zfs_scan_strict_mem_lim = B_FALSE; /* * Maximum number of parallelly executing I/Os per top-level vdev. * Tune with care. Very high settings (hundreds) are known to trigger * some firmware bugs and resets on certain SSDs. */ int zfs_top_maxinflight = 32; /* maximum I/Os per top-level */ unsigned int zfs_resilver_delay = 2; /* number of ticks to delay resilver -- 2 is a good number */ unsigned int zfs_scrub_delay = 4; /* number of ticks to delay scrub -- 4 is a good number */ unsigned int zfs_scan_idle = 50; /* idle window in clock ticks */ /* * Maximum number of parallelly executed bytes per leaf vdev. 
We attempt * to strike a balance here between keeping the vdev queues full of I/Os * at all times and not overflowing the queues to cause long latency, * which would cause long txg sync times. No matter what, we will not * overload the drives with I/O, since that is protected by * zfs_vdev_scrub_max_active. */ unsigned long zfs_scan_vdev_limit = 4 << 20; int zfs_scan_issue_strategy = 0; int zfs_scan_legacy = B_FALSE; /* don't queue & sort zios, go direct */ uint64_t zfs_scan_max_ext_gap = 2 << 20; /* in bytes */ unsigned int zfs_scan_checkpoint_intval = 7200; /* seconds */ #define ZFS_SCAN_CHECKPOINT_INTVAL SEC_TO_TICK(zfs_scan_checkpoint_intval) /* * fill_weight is non-tunable at runtime, so we copy it at module init from * zfs_scan_fill_weight. Runtime adjustments to zfs_scan_fill_weight would * break queue sorting. */ uint64_t zfs_scan_fill_weight = 3; static uint64_t fill_weight; /* See dsl_scan_should_clear() for details on the memory limit tunables */ uint64_t zfs_scan_mem_lim_min = 16 << 20; /* bytes */ uint64_t zfs_scan_mem_lim_soft_max = 128 << 20; /* bytes */ int zfs_scan_mem_lim_fact = 20; /* fraction of physmem */ int zfs_scan_mem_lim_soft_fact = 20; /* fraction of mem lim above */ unsigned int zfs_scrub_min_time_ms = 1000; /* min millisecs to scrub per txg */ unsigned int zfs_free_min_time_ms = 1000; /* min millisecs to free per txg */ unsigned int zfs_obsolete_min_time_ms = 500; /* min millisecs to obsolete per txg */ unsigned int zfs_resilver_min_time_ms = 3000; /* min millisecs to resilver per txg */ boolean_t zfs_no_scrub_io = B_FALSE; /* set to disable scrub i/o */ boolean_t zfs_no_scrub_prefetch = B_FALSE; /* set to disable scrub prefetch */ SYSCTL_DECL(_vfs_zfs); SYSCTL_UINT(_vfs_zfs, OID_AUTO, top_maxinflight, CTLFLAG_RWTUN, &zfs_top_maxinflight, 0, "Maximum I/Os per top-level vdev"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, resilver_delay, CTLFLAG_RWTUN, &zfs_resilver_delay, 0, "Number of ticks to delay resilver"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, scrub_delay, CTLFLAG_RWTUN, &zfs_scrub_delay, 0, "Number of ticks to delay scrub"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, scan_idle, CTLFLAG_RWTUN, &zfs_scan_idle, 0, "Idle scan window in clock ticks"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, scan_min_time_ms, CTLFLAG_RWTUN, &zfs_scrub_min_time_ms, 0, "Min millisecs to scrub per txg"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, free_min_time_ms, CTLFLAG_RWTUN, &zfs_free_min_time_ms, 0, "Min millisecs to free per txg"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, resilver_min_time_ms, CTLFLAG_RWTUN, &zfs_resilver_min_time_ms, 0, "Min millisecs to resilver per txg"); SYSCTL_INT(_vfs_zfs, OID_AUTO, no_scrub_io, CTLFLAG_RWTUN, &zfs_no_scrub_io, 0, "Disable scrub I/O"); SYSCTL_INT(_vfs_zfs, OID_AUTO, no_scrub_prefetch, CTLFLAG_RWTUN, &zfs_no_scrub_prefetch, 0, "Disable scrub prefetching"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, zfs_scan_legacy, CTLFLAG_RWTUN, &zfs_scan_legacy, 0, "Scrub using legacy non-sequential method"); SYSCTL_UINT(_vfs_zfs, OID_AUTO, zfs_scan_checkpoint_interval, CTLFLAG_RWTUN, &zfs_scan_checkpoint_intval, 0, "Scan progress on-disk checkpointing interval"); enum ddt_class zfs_scrub_ddt_class_max = DDT_CLASS_DUPLICATE; /* max number of blocks to free in a single TXG */ uint64_t zfs_async_block_max_blocks = UINT64_MAX; SYSCTL_UQUAD(_vfs_zfs, OID_AUTO, free_max_blocks, CTLFLAG_RWTUN, &zfs_async_block_max_blocks, 0, "Maximum number of blocks to free in one TXG"); /* * We wait a few txgs after importing a pool to begin scanning so that * the import / mounting code isn't held up by scrub / resilver IO. 
* Unfortunately, it is a bit difficult to determine exactly how long * this will take since userspace will trigger fs mounts asynchronously * and the kernel will create zvol minors asynchronously. As a result, * the value provided here is a bit arbitrary, but represents a * reasonable estimate of how many txgs it will take to finish fully * importing a pool */ #define SCAN_IMPORT_WAIT_TXGS 5 #define DSL_SCAN_IS_SCRUB_RESILVER(scn) \ ((scn)->scn_phys.scn_func == POOL_SCAN_SCRUB || \ (scn)->scn_phys.scn_func == POOL_SCAN_RESILVER) extern int zfs_txg_timeout; /* * Enable/disable the processing of the free_bpobj object. */ boolean_t zfs_free_bpobj_enabled = B_TRUE; SYSCTL_INT(_vfs_zfs, OID_AUTO, free_bpobj_enabled, CTLFLAG_RWTUN, &zfs_free_bpobj_enabled, 0, "Enable free_bpobj processing"); /* the order has to match pool_scan_type */ static scan_cb_t *scan_funcs[POOL_SCAN_FUNCS] = { NULL, dsl_scan_scrub_cb, /* POOL_SCAN_SCRUB */ dsl_scan_scrub_cb, /* POOL_SCAN_RESILVER */ }; /* In core node for the scn->scn_queue. Represents a dataset to be scanned */ typedef struct { uint64_t sds_dsobj; uint64_t sds_txg; avl_node_t sds_node; } scan_ds_t; /* * This controls what conditions are placed on dsl_scan_sync_state(): * SYNC_OPTIONAL) write out scn_phys iff scn_bytes_pending == 0 * SYNC_MANDATORY) write out scn_phys always. scn_bytes_pending must be 0. * SYNC_CACHED) if scn_bytes_pending == 0, write out scn_phys. Otherwise * write out the scn_phys_cached version. * See dsl_scan_sync_state for details. */ typedef enum { SYNC_OPTIONAL, SYNC_MANDATORY, SYNC_CACHED } state_sync_type_t; /* * This struct represents the minimum information needed to reconstruct a * zio for sequential scanning. This is useful because many of these will * accumulate in the sequential IO queues before being issued, so saving * memory matters here. */ typedef struct scan_io { /* fields from blkptr_t */ uint64_t sio_offset; uint64_t sio_blk_prop; uint64_t sio_phys_birth; uint64_t sio_birth; zio_cksum_t sio_cksum; uint32_t sio_asize; /* fields from zio_t */ int sio_flags; zbookmark_phys_t sio_zb; /* members for queue sorting */ union { avl_node_t sio_addr_node; /* link into issueing queue */ list_node_t sio_list_node; /* link for issuing to disk */ } sio_nodes; } scan_io_t; struct dsl_scan_io_queue { dsl_scan_t *q_scn; /* associated dsl_scan_t */ vdev_t *q_vd; /* top-level vdev that this queue represents */ /* trees used for sorting I/Os and extents of I/Os */ range_tree_t *q_exts_by_addr; avl_tree_t q_exts_by_size; avl_tree_t q_sios_by_addr; /* members for zio rate limiting */ uint64_t q_maxinflight_bytes; uint64_t q_inflight_bytes; kcondvar_t q_zio_cv; /* used under vd->vdev_scan_io_queue_lock */ /* per txg statistics */ uint64_t q_total_seg_size_this_txg; uint64_t q_segs_this_txg; uint64_t q_total_zio_size_this_txg; uint64_t q_zios_this_txg; }; /* private data for dsl_scan_prefetch_cb() */ typedef struct scan_prefetch_ctx { refcount_t spc_refcnt; /* refcount for memory management */ dsl_scan_t *spc_scn; /* dsl_scan_t for the pool */ boolean_t spc_root; /* is this prefetch for an objset? 
*/ uint8_t spc_indblkshift; /* dn_indblkshift of current dnode */ uint16_t spc_datablkszsec; /* dn_idatablkszsec of current dnode */ } scan_prefetch_ctx_t; /* private data for dsl_scan_prefetch() */ typedef struct scan_prefetch_issue_ctx { avl_node_t spic_avl_node; /* link into scn->scn_prefetch_queue */ scan_prefetch_ctx_t *spic_spc; /* spc for the callback */ blkptr_t spic_bp; /* bp to prefetch */ zbookmark_phys_t spic_zb; /* bookmark to prefetch */ } scan_prefetch_issue_ctx_t; static void scan_exec_io(dsl_pool_t *dp, const blkptr_t *bp, int zio_flags, const zbookmark_phys_t *zb, dsl_scan_io_queue_t *queue); static void scan_io_queue_insert_impl(dsl_scan_io_queue_t *queue, scan_io_t *sio); static dsl_scan_io_queue_t *scan_io_queue_create(vdev_t *vd); static void scan_io_queues_destroy(dsl_scan_t *scn); static kmem_cache_t *sio_cache; void scan_init(void) { /* * This is used in ext_size_compare() to weight segments * based on how sparse they are. This cannot be changed * mid-scan and the tree comparison functions don't currently * have a mechansim for passing additional context to the * compare functions. Thus we store this value globally and * we only allow it to be set at module intiailization time */ fill_weight = zfs_scan_fill_weight; sio_cache = kmem_cache_create("sio_cache", sizeof (scan_io_t), 0, NULL, NULL, NULL, NULL, NULL, 0); } void scan_fini(void) { kmem_cache_destroy(sio_cache); } static inline boolean_t dsl_scan_is_running(const dsl_scan_t *scn) { return (scn->scn_phys.scn_state == DSS_SCANNING); } boolean_t dsl_scan_resilvering(dsl_pool_t *dp) { return (dsl_scan_is_running(dp->dp_scan) && dp->dp_scan->scn_phys.scn_func == POOL_SCAN_RESILVER); } static inline void sio2bp(const scan_io_t *sio, blkptr_t *bp, uint64_t vdev_id) { bzero(bp, sizeof (*bp)); DVA_SET_ASIZE(&bp->blk_dva[0], sio->sio_asize); DVA_SET_VDEV(&bp->blk_dva[0], vdev_id); DVA_SET_OFFSET(&bp->blk_dva[0], sio->sio_offset); bp->blk_prop = sio->sio_blk_prop; bp->blk_phys_birth = sio->sio_phys_birth; bp->blk_birth = sio->sio_birth; bp->blk_fill = 1; /* we always only work with data pointers */ bp->blk_cksum = sio->sio_cksum; } static inline void bp2sio(const blkptr_t *bp, scan_io_t *sio, int dva_i) { /* we discard the vdev id, since we can deduce it from the queue */ sio->sio_offset = DVA_GET_OFFSET(&bp->blk_dva[dva_i]); sio->sio_asize = DVA_GET_ASIZE(&bp->blk_dva[dva_i]); sio->sio_blk_prop = bp->blk_prop; sio->sio_phys_birth = bp->blk_phys_birth; sio->sio_birth = bp->blk_birth; sio->sio_cksum = bp->blk_cksum; } void dsl_scan_global_init(void) { /* * This is used in ext_size_compare() to weight segments * based on how sparse they are. This cannot be changed * mid-scan and the tree comparison functions don't currently * have a mechansim for passing additional context to the * compare functions. Thus we store this value globally and * we only allow it to be set at module intiailization time */ fill_weight = zfs_scan_fill_weight; } int dsl_scan_init(dsl_pool_t *dp, uint64_t txg) { int err; dsl_scan_t *scn; spa_t *spa = dp->dp_spa; uint64_t f; scn = dp->dp_scan = kmem_zalloc(sizeof (dsl_scan_t), KM_SLEEP); scn->scn_dp = dp; /* * It's possible that we're resuming a scan after a reboot so * make sure that the scan_async_destroying flag is initialized * appropriately. 
*/ ASSERT(!scn->scn_async_destroying); scn->scn_async_destroying = spa_feature_is_active(dp->dp_spa, SPA_FEATURE_ASYNC_DESTROY); bcopy(&scn->scn_phys, &scn->scn_phys_cached, sizeof (scn->scn_phys)); avl_create(&scn->scn_queue, scan_ds_queue_compare, sizeof (scan_ds_t), offsetof(scan_ds_t, sds_node)); avl_create(&scn->scn_prefetch_queue, scan_prefetch_queue_compare, sizeof (scan_prefetch_issue_ctx_t), offsetof(scan_prefetch_issue_ctx_t, spic_avl_node)); err = zap_lookup(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, "scrub_func", sizeof (uint64_t), 1, &f); if (err == 0) { /* * There was an old-style scrub in progress. Restart a * new-style scrub from the beginning. */ scn->scn_restart_txg = txg; zfs_dbgmsg("old-style scrub was in progress; " "restarting new-style scrub in txg %llu", (longlong_t)scn->scn_restart_txg); /* * Load the queue obj from the old location so that it * can be freed by dsl_scan_done(). */ (void) zap_lookup(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, "scrub_queue", sizeof (uint64_t), 1, &scn->scn_phys.scn_queue_obj); } else { err = zap_lookup(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_SCAN, sizeof (uint64_t), SCAN_PHYS_NUMINTS, &scn->scn_phys); if (err == ENOENT) return (0); else if (err) return (err); /* * We might be restarting after a reboot, so jump the issued * counter to how far we've scanned. We know we're consistent * up to here. */ scn->scn_issued_before_pass = scn->scn_phys.scn_examined; if (dsl_scan_is_running(scn) && spa_prev_software_version(dp->dp_spa) < SPA_VERSION_SCAN) { /* * A new-type scrub was in progress on an old * pool, and the pool was accessed by old * software. Restart from the beginning, since * the old software may have changed the pool in * the meantime. */ scn->scn_restart_txg = txg; zfs_dbgmsg("new-style scrub was modified " "by old software; restarting in txg %llu", (longlong_t)scn->scn_restart_txg); } } /* reload the queue into the in-core state */ if (scn->scn_phys.scn_queue_obj != 0) { zap_cursor_t zc; zap_attribute_t za; for (zap_cursor_init(&zc, dp->dp_meta_objset, scn->scn_phys.scn_queue_obj); zap_cursor_retrieve(&zc, &za) == 0; (void) zap_cursor_advance(&zc)) { scan_ds_queue_insert(scn, zfs_strtonum(za.za_name, NULL), za.za_first_integer); } zap_cursor_fini(&zc); } spa_scan_stat_init(spa); return (0); } void dsl_scan_fini(dsl_pool_t *dp) { if (dp->dp_scan != NULL) { dsl_scan_t *scn = dp->dp_scan; if (scn->scn_taskq != NULL) taskq_destroy(scn->scn_taskq); scan_ds_queue_clear(scn); avl_destroy(&scn->scn_queue); avl_destroy(&scn->scn_prefetch_queue); kmem_free(dp->dp_scan, sizeof (dsl_scan_t)); dp->dp_scan = NULL; } } static boolean_t dsl_scan_restarting(dsl_scan_t *scn, dmu_tx_t *tx) { return (scn->scn_restart_txg != 0 && scn->scn_restart_txg <= tx->tx_txg); } boolean_t dsl_scan_scrubbing(const dsl_pool_t *dp) { dsl_scan_phys_t *scn_phys = &dp->dp_scan->scn_phys; return (scn_phys->scn_state == DSS_SCANNING && scn_phys->scn_func == POOL_SCAN_SCRUB); } boolean_t dsl_scan_is_paused_scrub(const dsl_scan_t *scn) { return (dsl_scan_scrubbing(scn->scn_dp) && scn->scn_phys.scn_flags & DSF_SCRUB_PAUSED); } /* * Writes out a persistent dsl_scan_phys_t record to the pool directory. * Because we can be running in the block sorting algorithm, we do not always * want to write out the record, only when it is "safe" to do so. This safety * condition is achieved by making sure that the sorting queues are empty * (scn_bytes_pending == 0). 
When this condition is not true, the sync'd state * is inconsistent with how much actual scanning progress has been made. The * kind of sync to be performed is specified by the sync_type argument. If the * sync is optional, we only sync if the queues are empty. If the sync is * mandatory, we do a hard ASSERT to make sure that the queues are empty. The * third possible state is a "cached" sync. This is done in response to: * 1) The dataset that was in the last sync'd dsl_scan_phys_t having been * destroyed, so we wouldn't be able to restart scanning from it. * 2) The snapshot that was in the last sync'd dsl_scan_phys_t having been * superseded by a newer snapshot. * 3) The dataset that was in the last sync'd dsl_scan_phys_t having been * swapped with its clone. * In all cases, a cached sync simply rewrites the last record we've written, * just slightly modified. For the modifications that are performed to the * last written dsl_scan_phys_t, see dsl_scan_ds_destroyed, * dsl_scan_ds_snapshotted and dsl_scan_ds_clone_swapped. */ static void dsl_scan_sync_state(dsl_scan_t *scn, dmu_tx_t *tx, state_sync_type_t sync_type) { int i; spa_t *spa = scn->scn_dp->dp_spa; ASSERT(sync_type != SYNC_MANDATORY || scn->scn_bytes_pending == 0); if (scn->scn_bytes_pending == 0) { for (i = 0; i < spa->spa_root_vdev->vdev_children; i++) { vdev_t *vd = spa->spa_root_vdev->vdev_child[i]; dsl_scan_io_queue_t *q = vd->vdev_scan_io_queue; if (q == NULL) continue; mutex_enter(&vd->vdev_scan_io_queue_lock); ASSERT3P(avl_first(&q->q_sios_by_addr), ==, NULL); ASSERT3P(avl_first(&q->q_exts_by_size), ==, NULL); ASSERT3P(range_tree_first(q->q_exts_by_addr), ==, NULL); mutex_exit(&vd->vdev_scan_io_queue_lock); } if (scn->scn_phys.scn_queue_obj != 0) scan_ds_queue_sync(scn, tx); VERIFY0(zap_update(scn->scn_dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_SCAN, sizeof (uint64_t), SCAN_PHYS_NUMINTS, &scn->scn_phys, tx)); bcopy(&scn->scn_phys, &scn->scn_phys_cached, sizeof (scn->scn_phys)); if (scn->scn_checkpointing) zfs_dbgmsg("finish scan checkpoint"); scn->scn_checkpointing = B_FALSE; scn->scn_last_checkpoint = ddi_get_lbolt(); } else if (sync_type == SYNC_CACHED) { VERIFY0(zap_update(scn->scn_dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_SCAN, sizeof (uint64_t), SCAN_PHYS_NUMINTS, &scn->scn_phys_cached, tx)); } } /* ARGSUSED */ static int dsl_scan_setup_check(void *arg, dmu_tx_t *tx) { dsl_scan_t *scn = dmu_tx_pool(tx)->dp_scan; if (dsl_scan_is_running(scn)) return (SET_ERROR(EBUSY)); return (0); } static void dsl_scan_setup_sync(void *arg, dmu_tx_t *tx) { dsl_scan_t *scn = dmu_tx_pool(tx)->dp_scan; pool_scan_func_t *funcp = arg; dmu_object_type_t ot = 0; dsl_pool_t *dp = scn->scn_dp; spa_t *spa = dp->dp_spa; ASSERT(!dsl_scan_is_running(scn)); ASSERT(*funcp > POOL_SCAN_NONE && *funcp < POOL_SCAN_FUNCS); bzero(&scn->scn_phys, sizeof (scn->scn_phys)); scn->scn_phys.scn_func = *funcp; scn->scn_phys.scn_state = DSS_SCANNING; scn->scn_phys.scn_min_txg = 0; scn->scn_phys.scn_max_txg = tx->tx_txg; scn->scn_phys.scn_ddt_class_max = DDT_CLASSES - 1; /* the entire DDT */ scn->scn_phys.scn_start_time = gethrestime_sec(); scn->scn_phys.scn_errors = 0; scn->scn_phys.scn_to_examine = spa->spa_root_vdev->vdev_stat.vs_alloc; scn->scn_issued_before_pass = 0; scn->scn_restart_txg = 0; scn->scn_done_txg = 0; scn->scn_last_checkpoint = 0; scn->scn_checkpointing = B_FALSE; spa_scan_stat_init(spa); if (DSL_SCAN_IS_SCRUB_RESILVER(scn)) { scn->scn_phys.scn_ddt_class_max = zfs_scrub_ddt_class_max; /* rewrite all disk labels */ 
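	/*
	 * In the lines below, dirtying the root vdev config pushes the new
	 * scan state out to every disk label.  If any vdev's DTL indicates
	 * missing data, vdev_resilver_needed() narrows scn_min_txg and
	 * scn_max_txg to the affected txg range and a resilver start event
	 * is posted; otherwise the full txg range chosen above is kept and
	 * a scrub start event is posted instead.
	 */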
vdev_config_dirty(spa->spa_root_vdev); if (vdev_resilver_needed(spa->spa_root_vdev, &scn->scn_phys.scn_min_txg, &scn->scn_phys.scn_max_txg)) { spa_event_notify(spa, NULL, NULL, ESC_ZFS_RESILVER_START); } else { spa_event_notify(spa, NULL, NULL, ESC_ZFS_SCRUB_START); } spa->spa_scrub_started = B_TRUE; /* * If this is an incremental scrub, limit the DDT scrub phase * to just the auto-ditto class (for correctness); the rest * of the scrub should go faster using top-down pruning. */ if (scn->scn_phys.scn_min_txg > TXG_INITIAL) scn->scn_phys.scn_ddt_class_max = DDT_CLASS_DITTO; } /* back to the generic stuff */ if (dp->dp_blkstats == NULL) { dp->dp_blkstats = kmem_alloc(sizeof (zfs_all_blkstats_t), KM_SLEEP); mutex_init(&dp->dp_blkstats->zab_lock, NULL, MUTEX_DEFAULT, NULL); } bzero(&dp->dp_blkstats->zab_type, sizeof (dp->dp_blkstats->zab_type)); if (spa_version(spa) < SPA_VERSION_DSL_SCRUB) ot = DMU_OT_ZAP_OTHER; scn->scn_phys.scn_queue_obj = zap_create(dp->dp_meta_objset, ot ? ot : DMU_OT_SCAN_QUEUE, DMU_OT_NONE, 0, tx); bcopy(&scn->scn_phys, &scn->scn_phys_cached, sizeof (scn->scn_phys)); dsl_scan_sync_state(scn, tx, SYNC_MANDATORY); spa_history_log_internal(spa, "scan setup", tx, "func=%u mintxg=%llu maxtxg=%llu", *funcp, scn->scn_phys.scn_min_txg, scn->scn_phys.scn_max_txg); } /* * Called by the ZFS_IOC_POOL_SCAN ioctl to start a scrub or resilver. * Can also be called to resume a paused scrub. */ int dsl_scan(dsl_pool_t *dp, pool_scan_func_t func) { spa_t *spa = dp->dp_spa; dsl_scan_t *scn = dp->dp_scan; /* * Purge all vdev caches and probe all devices. We do this here * rather than in sync context because this requires a writer lock * on the spa_config lock, which we can't do from sync context. The * spa_scrub_reopen flag indicates that vdev_open() should not * attempt to start another scrub. */ spa_vdev_state_enter(spa, SCL_NONE); spa->spa_scrub_reopen = B_TRUE; vdev_reopen(spa->spa_root_vdev); spa->spa_scrub_reopen = B_FALSE; (void) spa_vdev_state_exit(spa, NULL, 0); if (func == POOL_SCAN_SCRUB && dsl_scan_is_paused_scrub(scn)) { /* got scrub start cmd, resume paused scrub */ int err = dsl_scrub_set_pause_resume(scn->scn_dp, POOL_SCRUB_NORMAL); if (err == 0) { spa_event_notify(spa, NULL, NULL, ESC_ZFS_SCRUB_RESUME); return (ECANCELED); } return (SET_ERROR(err)); } return (dsl_sync_task(spa_name(spa), dsl_scan_setup_check, dsl_scan_setup_sync, &func, 0, ZFS_SPACE_CHECK_EXTRA_RESERVED)); } /* ARGSUSED */ static void dsl_scan_done(dsl_scan_t *scn, boolean_t complete, dmu_tx_t *tx) { static const char *old_names[] = { "scrub_bookmark", "scrub_ddt_bookmark", "scrub_ddt_class_max", "scrub_queue", "scrub_min_txg", "scrub_max_txg", "scrub_func", "scrub_errors", NULL }; dsl_pool_t *dp = scn->scn_dp; spa_t *spa = dp->dp_spa; int i; /* Remove any remnants of an old-style scrub. */ for (i = 0; old_names[i]; i++) { (void) zap_remove(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, old_names[i], tx); } if (scn->scn_phys.scn_queue_obj != 0) { VERIFY0(dmu_object_free(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, tx)); scn->scn_phys.scn_queue_obj = 0; } scan_ds_queue_clear(scn); scn->scn_phys.scn_flags &= ~DSF_SCRUB_PAUSED; /* * If we were "restarted" from a stopped state, don't bother * with anything else. */ if (!dsl_scan_is_running(scn)) { ASSERT(!scn->scn_is_sorted); return; } if (scn->scn_is_sorted) { scan_io_queues_destroy(scn); scn->scn_is_sorted = B_FALSE; if (scn->scn_taskq != NULL) { taskq_destroy(scn->scn_taskq); scn->scn_taskq = NULL; } } scn->scn_phys.scn_state = complete ? 
DSS_FINISHED : DSS_CANCELED; if (dsl_scan_restarting(scn, tx)) spa_history_log_internal(spa, "scan aborted, restarting", tx, "errors=%llu", spa_get_errlog_size(spa)); else if (!complete) spa_history_log_internal(spa, "scan cancelled", tx, "errors=%llu", spa_get_errlog_size(spa)); else spa_history_log_internal(spa, "scan done", tx, "errors=%llu", spa_get_errlog_size(spa)); if (DSL_SCAN_IS_SCRUB_RESILVER(scn)) { spa->spa_scrub_started = B_FALSE; spa->spa_scrub_active = B_FALSE; /* * If the scrub/resilver completed, update all DTLs to * reflect this. Whether it succeeded or not, vacate * all temporary scrub DTLs. * * As the scrub does not currently support traversing * data that have been freed but are part of a checkpoint, * we don't mark the scrub as done in the DTLs as faults * may still exist in those vdevs. */ if (complete && !spa_feature_is_active(spa, SPA_FEATURE_POOL_CHECKPOINT)) { vdev_dtl_reassess(spa->spa_root_vdev, tx->tx_txg, scn->scn_phys.scn_max_txg, B_TRUE); spa_event_notify(spa, NULL, NULL, scn->scn_phys.scn_min_txg ? ESC_ZFS_RESILVER_FINISH : ESC_ZFS_SCRUB_FINISH); } else { vdev_dtl_reassess(spa->spa_root_vdev, tx->tx_txg, 0, B_TRUE); } spa_errlog_rotate(spa); /* * We may have finished replacing a device. * Let the async thread assess this and handle the detach. */ spa_async_request(spa, SPA_ASYNC_RESILVER_DONE); } scn->scn_phys.scn_end_time = gethrestime_sec(); ASSERT(!dsl_scan_is_running(scn)); } /* ARGSUSED */ static int dsl_scan_cancel_check(void *arg, dmu_tx_t *tx) { dsl_scan_t *scn = dmu_tx_pool(tx)->dp_scan; if (!dsl_scan_is_running(scn)) return (SET_ERROR(ENOENT)); return (0); } /* ARGSUSED */ static void dsl_scan_cancel_sync(void *arg, dmu_tx_t *tx) { dsl_scan_t *scn = dmu_tx_pool(tx)->dp_scan; dsl_scan_done(scn, B_FALSE, tx); dsl_scan_sync_state(scn, tx, SYNC_MANDATORY); spa_event_notify(scn->scn_dp->dp_spa, NULL, NULL, ESC_ZFS_SCRUB_ABORT); } int dsl_scan_cancel(dsl_pool_t *dp) { return (dsl_sync_task(spa_name(dp->dp_spa), dsl_scan_cancel_check, dsl_scan_cancel_sync, NULL, 3, ZFS_SPACE_CHECK_RESERVED)); } static int dsl_scrub_pause_resume_check(void *arg, dmu_tx_t *tx) { pool_scrub_cmd_t *cmd = arg; dsl_pool_t *dp = dmu_tx_pool(tx); dsl_scan_t *scn = dp->dp_scan; if (*cmd == POOL_SCRUB_PAUSE) { /* can't pause a scrub when there is no in-progress scrub */ if (!dsl_scan_scrubbing(dp)) return (SET_ERROR(ENOENT)); /* can't pause a paused scrub */ if (dsl_scan_is_paused_scrub(scn)) return (SET_ERROR(EBUSY)); } else if (*cmd != POOL_SCRUB_NORMAL) { return (SET_ERROR(ENOTSUP)); } return (0); } static void dsl_scrub_pause_resume_sync(void *arg, dmu_tx_t *tx) { pool_scrub_cmd_t *cmd = arg; dsl_pool_t *dp = dmu_tx_pool(tx); spa_t *spa = dp->dp_spa; dsl_scan_t *scn = dp->dp_scan; if (*cmd == POOL_SCRUB_PAUSE) { /* can't pause a scrub when there is no in-progress scrub */ spa->spa_scan_pass_scrub_pause = gethrestime_sec(); scn->scn_phys.scn_flags |= DSF_SCRUB_PAUSED; dsl_scan_sync_state(scn, tx, SYNC_CACHED); spa_event_notify(spa, NULL, NULL, ESC_ZFS_SCRUB_PAUSED); } else { ASSERT3U(*cmd, ==, POOL_SCRUB_NORMAL); if (dsl_scan_is_paused_scrub(scn)) { /* * We need to keep track of how much time we spend * paused per pass so that we can adjust the scrub rate * shown in the output of 'zpool status' */ spa->spa_scan_pass_scrub_spent_paused += gethrestime_sec() - spa->spa_scan_pass_scrub_pause; spa->spa_scan_pass_scrub_pause = 0; scn->scn_phys.scn_flags &= ~DSF_SCRUB_PAUSED; dsl_scan_sync_state(scn, tx, SYNC_CACHED); } } } /* * Set scrub pause/resume state if it makes sense to do so 
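 * (in practice this is driven from userland, e.g. "zpool scrub -p" to
 * pause a running scrub and a subsequent "zpool scrub" to resume it)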
*/ int dsl_scrub_set_pause_resume(const dsl_pool_t *dp, pool_scrub_cmd_t cmd) { return (dsl_sync_task(spa_name(dp->dp_spa), dsl_scrub_pause_resume_check, dsl_scrub_pause_resume_sync, &cmd, 3, ZFS_SPACE_CHECK_RESERVED)); } /* start a new scan, or restart an existing one. */ void dsl_resilver_restart(dsl_pool_t *dp, uint64_t txg) { if (txg == 0) { dmu_tx_t *tx; tx = dmu_tx_create_dd(dp->dp_mos_dir); VERIFY(0 == dmu_tx_assign(tx, TXG_WAIT)); txg = dmu_tx_get_txg(tx); dp->dp_scan->scn_restart_txg = txg; dmu_tx_commit(tx); } else { dp->dp_scan->scn_restart_txg = txg; } zfs_dbgmsg("restarting resilver txg=%llu", txg); } void dsl_free(dsl_pool_t *dp, uint64_t txg, const blkptr_t *bp) { zio_free(dp->dp_spa, txg, bp); } void dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t *bpp) { ASSERT(dsl_pool_sync_context(dp)); zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, BP_GET_PSIZE(bpp), pio->io_flags)); } static int scan_ds_queue_compare(const void *a, const void *b) { const scan_ds_t *sds_a = a, *sds_b = b; if (sds_a->sds_dsobj < sds_b->sds_dsobj) return (-1); if (sds_a->sds_dsobj == sds_b->sds_dsobj) return (0); return (1); } static void scan_ds_queue_clear(dsl_scan_t *scn) { void *cookie = NULL; scan_ds_t *sds; while ((sds = avl_destroy_nodes(&scn->scn_queue, &cookie)) != NULL) { kmem_free(sds, sizeof (*sds)); } } static boolean_t scan_ds_queue_contains(dsl_scan_t *scn, uint64_t dsobj, uint64_t *txg) { scan_ds_t srch, *sds; srch.sds_dsobj = dsobj; sds = avl_find(&scn->scn_queue, &srch, NULL); if (sds != NULL && txg != NULL) *txg = sds->sds_txg; return (sds != NULL); } static void scan_ds_queue_insert(dsl_scan_t *scn, uint64_t dsobj, uint64_t txg) { scan_ds_t *sds; avl_index_t where; sds = kmem_zalloc(sizeof (*sds), KM_SLEEP); sds->sds_dsobj = dsobj; sds->sds_txg = txg; VERIFY3P(avl_find(&scn->scn_queue, sds, &where), ==, NULL); avl_insert(&scn->scn_queue, sds, where); } static void scan_ds_queue_remove(dsl_scan_t *scn, uint64_t dsobj) { scan_ds_t srch, *sds; srch.sds_dsobj = dsobj; sds = avl_find(&scn->scn_queue, &srch, NULL); VERIFY(sds != NULL); avl_remove(&scn->scn_queue, sds); kmem_free(sds, sizeof (*sds)); } static void scan_ds_queue_sync(dsl_scan_t *scn, dmu_tx_t *tx) { dsl_pool_t *dp = scn->scn_dp; spa_t *spa = dp->dp_spa; dmu_object_type_t ot = (spa_version(spa) >= SPA_VERSION_DSL_SCRUB) ? DMU_OT_SCAN_QUEUE : DMU_OT_ZAP_OTHER; ASSERT0(scn->scn_bytes_pending); ASSERT(scn->scn_phys.scn_queue_obj != 0); VERIFY0(dmu_object_free(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, tx)); scn->scn_phys.scn_queue_obj = zap_create(dp->dp_meta_objset, ot, DMU_OT_NONE, 0, tx); for (scan_ds_t *sds = avl_first(&scn->scn_queue); sds != NULL; sds = AVL_NEXT(&scn->scn_queue, sds)) { VERIFY0(zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, sds->sds_dsobj, sds->sds_txg, tx)); } } /* * Computes the memory limit state that we're currently in. A sorted scan * needs quite a bit of memory to hold the sorting queue, so we need to * reasonably constrain the size so it doesn't impact overall system * performance. We compute two limits: * 1) Hard memory limit: if the amount of memory used by the sorting * queues on a pool gets above this value, we stop the metadata * scanning portion and start issuing the queued up and sorted * I/Os to reduce memory usage. * This limit is calculated as a fraction of physmem (by default 5%). * We constrain the lower bound of the hard limit to an absolute * minimum of zfs_scan_mem_lim_min (default: 16 MiB). 
We also constrain * the upper bound to 5% of the total pool size - no chance we'll * ever need that much memory, but just to keep the value in check. * 2) Soft memory limit: once we hit the hard memory limit, we start * issuing I/O to reduce queue memory usage, but we don't want to * completely empty out the queues, since we might be able to find I/Os * that will fill in the gaps of our non-sequential IOs at some point * in the future. So we stop the issuing of I/Os once the amount of * memory used drops below the soft limit (at which point we stop issuing * I/O and start scanning metadata again). * * This limit is calculated by subtracting a fraction of the hard * limit from the hard limit. By default this fraction is 5%, so * the soft limit is 95% of the hard limit. We cap the size of the * difference between the hard and soft limits at an absolute * maximum of zfs_scan_mem_lim_soft_max (default: 128 MiB) - this is * sufficient to not cause too frequent switching between the * metadata scan and I/O issue (even at 2k recordsize, 128 MiB's * worth of queues is about 1.2 GiB of on-pool data, so scanning * that should take at least a decent fraction of a second). */ static boolean_t dsl_scan_should_clear(dsl_scan_t *scn) { vdev_t *rvd = scn->scn_dp->dp_spa->spa_root_vdev; uint64_t mlim_hard, mlim_soft, mused; uint64_t alloc = metaslab_class_get_alloc(spa_normal_class( scn->scn_dp->dp_spa)); mlim_hard = MAX((physmem / zfs_scan_mem_lim_fact) * PAGESIZE, zfs_scan_mem_lim_min); mlim_hard = MIN(mlim_hard, alloc / 20); mlim_soft = mlim_hard - MIN(mlim_hard / zfs_scan_mem_lim_soft_fact, zfs_scan_mem_lim_soft_max); mused = 0; for (uint64_t i = 0; i < rvd->vdev_children; i++) { vdev_t *tvd = rvd->vdev_child[i]; dsl_scan_io_queue_t *queue; mutex_enter(&tvd->vdev_scan_io_queue_lock); queue = tvd->vdev_scan_io_queue; if (queue != NULL) { /* #extents in exts_by_size = # in exts_by_addr */ mused += avl_numnodes(&queue->q_exts_by_size) * sizeof (range_seg_t) + avl_numnodes(&queue->q_sios_by_addr) * sizeof (scan_io_t); } mutex_exit(&tvd->vdev_scan_io_queue_lock); } dprintf("current scan memory usage: %llu bytes\n", (longlong_t)mused); if (mused == 0) ASSERT0(scn->scn_bytes_pending); /* * If we are above our hard limit, we need to clear out memory. * If we are below our soft limit, we need to accumulate sequential IOs. * Otherwise, we should keep doing whatever we are currently doing. */ if (mused >= mlim_hard) return (B_TRUE); else if (mused < mlim_soft) return (B_FALSE); else return (scn->scn_clearing); } static boolean_t dsl_scan_check_suspend(dsl_scan_t *scn, const zbookmark_phys_t *zb) { /* we never skip user/group accounting objects */ if (zb && (int64_t)zb->zb_object < 0) return (B_FALSE); if (scn->scn_suspending) return (B_TRUE); /* we're already suspending */ if (!ZB_IS_ZERO(&scn->scn_phys.scn_bookmark)) return (B_FALSE); /* we're resuming */ /* We only know how to resume from level-0 blocks. */ if (zb && zb->zb_level != 0) return (B_FALSE); /* * We suspend if: * - we have scanned for at least the minimum time (default 1 sec * for scrub, 3 sec for resilver), and either we have sufficient * dirty data that we are starting to write more quickly * (default 30%), or someone is explicitly waiting for this txg * to complete. * or * - the spa is shutting down because this pool is being exported * or the machine is rebooting. 
* or * - the scan queue has reached its memory use limit */ uint64_t elapsed_nanosecs = gethrtime(); uint64_t curr_time_ns = gethrtime(); uint64_t scan_time_ns = curr_time_ns - scn->scn_sync_start_time; uint64_t sync_time_ns = curr_time_ns - scn->scn_dp->dp_spa->spa_sync_starttime; int dirty_pct = scn->scn_dp->dp_dirty_total * 100 / zfs_dirty_data_max; int mintime = (scn->scn_phys.scn_func == POOL_SCAN_RESILVER) ? zfs_resilver_min_time_ms : zfs_scrub_min_time_ms; if ((NSEC2MSEC(scan_time_ns) > mintime && (dirty_pct >= zfs_vdev_async_write_active_min_dirty_percent || txg_sync_waiting(scn->scn_dp) || NSEC2SEC(sync_time_ns) >= zfs_txg_timeout)) || spa_shutting_down(scn->scn_dp->dp_spa) || (zfs_scan_strict_mem_lim && dsl_scan_should_clear(scn))) { if (zb) { dprintf("suspending at bookmark %llx/%llx/%llx/%llx\n", (longlong_t)zb->zb_objset, (longlong_t)zb->zb_object, (longlong_t)zb->zb_level, (longlong_t)zb->zb_blkid); scn->scn_phys.scn_bookmark = *zb; } else { dsl_scan_phys_t *scnp = &scn->scn_phys; dprintf("suspending at DDT bookmark " "%llx/%llx/%llx/%llx\n", (longlong_t)scnp->scn_ddt_bookmark.ddb_class, (longlong_t)scnp->scn_ddt_bookmark.ddb_type, (longlong_t)scnp->scn_ddt_bookmark.ddb_checksum, (longlong_t)scnp->scn_ddt_bookmark.ddb_cursor); } scn->scn_suspending = B_TRUE; return (B_TRUE); } return (B_FALSE); } typedef struct zil_scan_arg { dsl_pool_t *zsa_dp; zil_header_t *zsa_zh; } zil_scan_arg_t; /* ARGSUSED */ static int dsl_scan_zil_block(zilog_t *zilog, blkptr_t *bp, void *arg, uint64_t claim_txg) { zil_scan_arg_t *zsa = arg; dsl_pool_t *dp = zsa->zsa_dp; dsl_scan_t *scn = dp->dp_scan; zil_header_t *zh = zsa->zsa_zh; zbookmark_phys_t zb; if (BP_IS_HOLE(bp) || bp->blk_birth <= scn->scn_phys.scn_cur_min_txg) return (0); /* * One block ("stubby") can be allocated a long time ago; we * want to visit that one because it has been allocated * (on-disk) even if it hasn't been claimed (even though for * scrub there's nothing to do to it). */ if (claim_txg == 0 && bp->blk_birth >= spa_min_claim_txg(dp->dp_spa)) return (0); SET_BOOKMARK(&zb, zh->zh_log.blk_cksum.zc_word[ZIL_ZC_OBJSET], ZB_ZIL_OBJECT, ZB_ZIL_LEVEL, bp->blk_cksum.zc_word[ZIL_ZC_SEQ]); VERIFY(0 == scan_funcs[scn->scn_phys.scn_func](dp, bp, &zb)); return (0); } /* ARGSUSED */ static int dsl_scan_zil_record(zilog_t *zilog, lr_t *lrc, void *arg, uint64_t claim_txg) { if (lrc->lrc_txtype == TX_WRITE) { zil_scan_arg_t *zsa = arg; dsl_pool_t *dp = zsa->zsa_dp; dsl_scan_t *scn = dp->dp_scan; zil_header_t *zh = zsa->zsa_zh; lr_write_t *lr = (lr_write_t *)lrc; blkptr_t *bp = &lr->lr_blkptr; zbookmark_phys_t zb; if (BP_IS_HOLE(bp) || bp->blk_birth <= scn->scn_phys.scn_cur_min_txg) return (0); /* * birth can be < claim_txg if this record's txg is * already txg sync'ed (but this log block contains * other records that are not synced) */ if (claim_txg == 0 || bp->blk_birth < claim_txg) return (0); SET_BOOKMARK(&zb, zh->zh_log.blk_cksum.zc_word[ZIL_ZC_OBJSET], lr->lr_foid, ZB_ZIL_LEVEL, lr->lr_offset / BP_GET_LSIZE(bp)); VERIFY(0 == scan_funcs[scn->scn_phys.scn_func](dp, bp, &zb)); } return (0); } static void dsl_scan_zil(dsl_pool_t *dp, zil_header_t *zh) { uint64_t claim_txg = zh->zh_claim_txg; zil_scan_arg_t zsa = { dp, zh }; zilog_t *zilog; ASSERT(spa_writeable(dp->dp_spa)); /* * We only want to visit blocks that have been claimed * but not yet replayed.
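 * A zh_claim_txg of zero means the log was never claimed (typically
 * because there was nothing to replay at import), so there is nothing
 * for the scan to visit.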
*/ if (claim_txg == 0) return; zilog = zil_alloc(dp->dp_meta_objset, zh); (void) zil_parse(zilog, dsl_scan_zil_block, dsl_scan_zil_record, &zsa, claim_txg); zil_free(zilog); } /* * We compare scan_prefetch_issue_ctx_t's based on their bookmarks. The idea * here is to sort the AVL tree by the order each block will be needed. */ static int scan_prefetch_queue_compare(const void *a, const void *b) { const scan_prefetch_issue_ctx_t *spic_a = a, *spic_b = b; const scan_prefetch_ctx_t *spc_a = spic_a->spic_spc; const scan_prefetch_ctx_t *spc_b = spic_b->spic_spc; return (zbookmark_compare(spc_a->spc_datablkszsec, spc_a->spc_indblkshift, spc_b->spc_datablkszsec, spc_b->spc_indblkshift, &spic_a->spic_zb, &spic_b->spic_zb)); } static void scan_prefetch_ctx_rele(scan_prefetch_ctx_t *spc, void *tag) { if (refcount_remove(&spc->spc_refcnt, tag) == 0) { refcount_destroy(&spc->spc_refcnt); kmem_free(spc, sizeof (scan_prefetch_ctx_t)); } } static scan_prefetch_ctx_t * scan_prefetch_ctx_create(dsl_scan_t *scn, dnode_phys_t *dnp, void *tag) { scan_prefetch_ctx_t *spc; spc = kmem_alloc(sizeof (scan_prefetch_ctx_t), KM_SLEEP); refcount_create(&spc->spc_refcnt); refcount_add(&spc->spc_refcnt, tag); spc->spc_scn = scn; if (dnp != NULL) { spc->spc_datablkszsec = dnp->dn_datablkszsec; spc->spc_indblkshift = dnp->dn_indblkshift; spc->spc_root = B_FALSE; } else { spc->spc_datablkszsec = 0; spc->spc_indblkshift = 0; spc->spc_root = B_TRUE; } return (spc); } static void scan_prefetch_ctx_add_ref(scan_prefetch_ctx_t *spc, void *tag) { refcount_add(&spc->spc_refcnt, tag); } static boolean_t dsl_scan_check_prefetch_resume(scan_prefetch_ctx_t *spc, const zbookmark_phys_t *zb) { zbookmark_phys_t *last_zb = &spc->spc_scn->scn_prefetch_bookmark; dnode_phys_t tmp_dnp; dnode_phys_t *dnp = (spc->spc_root) ? NULL : &tmp_dnp; if (zb->zb_objset != last_zb->zb_objset) return (B_TRUE); if ((int64_t)zb->zb_object < 0) return (B_FALSE); tmp_dnp.dn_datablkszsec = spc->spc_datablkszsec; tmp_dnp.dn_indblkshift = spc->spc_indblkshift; if (zbookmark_subtree_completed(dnp, zb, last_zb)) return (B_TRUE); return (B_FALSE); } static void dsl_scan_prefetch(scan_prefetch_ctx_t *spc, blkptr_t *bp, zbookmark_phys_t *zb) { avl_index_t idx; dsl_scan_t *scn = spc->spc_scn; spa_t *spa = scn->scn_dp->dp_spa; scan_prefetch_issue_ctx_t *spic; if (zfs_no_scrub_prefetch) return; if (BP_IS_HOLE(bp) || bp->blk_birth <= scn->scn_phys.scn_cur_min_txg || (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_DNODE && BP_GET_TYPE(bp) != DMU_OT_OBJSET)) return; if (dsl_scan_check_prefetch_resume(spc, zb)) return; scan_prefetch_ctx_add_ref(spc, scn); spic = kmem_alloc(sizeof (scan_prefetch_issue_ctx_t), KM_SLEEP); spic->spic_spc = spc; spic->spic_bp = *bp; spic->spic_zb = *zb; /* * Add the IO to the queue of blocks to prefetch. This allows us to * prioritize blocks that we will need first for the main traversal * thread. 
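 * The queue is an AVL tree ordered by bookmark (see
 * scan_prefetch_queue_compare()), so the prefetch thread issues reads
 * in roughly the order the traversal will need the blocks.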
*/ mutex_enter(&spa->spa_scrub_lock); if (avl_find(&scn->scn_prefetch_queue, spic, &idx) != NULL) { /* this block is already queued for prefetch */ kmem_free(spic, sizeof (scan_prefetch_issue_ctx_t)); scan_prefetch_ctx_rele(spc, scn); mutex_exit(&spa->spa_scrub_lock); return; } avl_insert(&scn->scn_prefetch_queue, spic, idx); cv_broadcast(&spa->spa_scrub_io_cv); mutex_exit(&spa->spa_scrub_lock); } static void dsl_scan_prefetch_dnode(dsl_scan_t *scn, dnode_phys_t *dnp, uint64_t objset, uint64_t object) { int i; zbookmark_phys_t zb; scan_prefetch_ctx_t *spc; if (dnp->dn_nblkptr == 0 && !(dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR)) return; SET_BOOKMARK(&zb, objset, object, 0, 0); spc = scan_prefetch_ctx_create(scn, dnp, FTAG); for (i = 0; i < dnp->dn_nblkptr; i++) { zb.zb_level = BP_GET_LEVEL(&dnp->dn_blkptr[i]); zb.zb_blkid = i; dsl_scan_prefetch(spc, &dnp->dn_blkptr[i], &zb); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { zb.zb_level = 0; zb.zb_blkid = DMU_SPILL_BLKID; dsl_scan_prefetch(spc, &dnp->dn_spill, &zb); } scan_prefetch_ctx_rele(spc, FTAG); } void dsl_scan_prefetch_cb(zio_t *zio, const zbookmark_phys_t *zb, const blkptr_t *bp, arc_buf_t *buf, void *private) { scan_prefetch_ctx_t *spc = private; dsl_scan_t *scn = spc->spc_scn; spa_t *spa = scn->scn_dp->dp_spa; /* broadcast that the IO has completed for rate limiting purposes */ mutex_enter(&spa->spa_scrub_lock); ASSERT3U(spa->spa_scrub_inflight, >=, BP_GET_PSIZE(bp)); spa->spa_scrub_inflight -= BP_GET_PSIZE(bp); cv_broadcast(&spa->spa_scrub_io_cv); mutex_exit(&spa->spa_scrub_lock); /* if there was an error or we are done prefetching, just cleanup */ if (buf == NULL || scn->scn_suspending) goto out; if (BP_GET_LEVEL(bp) > 0) { int i; blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; zbookmark_phys_t czb; for (i = 0, cbp = buf->b_data; i < epb; i++, cbp++) { SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); dsl_scan_prefetch(spc, cbp, &czb); } } else if (BP_GET_TYPE(bp) == DMU_OT_DNODE) { dnode_phys_t *cdnp = buf->b_data; int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; for (i = 0, cdnp = buf->b_data; i < epb; i += cdnp->dn_extra_slots + 1, cdnp += cdnp->dn_extra_slots + 1) { dsl_scan_prefetch_dnode(scn, cdnp, zb->zb_objset, zb->zb_blkid * epb + i); } } else if (BP_GET_TYPE(bp) == DMU_OT_OBJSET) { objset_phys_t *osp = buf->b_data; dsl_scan_prefetch_dnode(scn, &osp->os_meta_dnode, zb->zb_objset, DMU_META_DNODE_OBJECT); if (OBJSET_BUF_HAS_USERUSED(buf)) { dsl_scan_prefetch_dnode(scn, &osp->os_groupused_dnode, zb->zb_objset, DMU_GROUPUSED_OBJECT); dsl_scan_prefetch_dnode(scn, &osp->os_userused_dnode, zb->zb_objset, DMU_USERUSED_OBJECT); } } out: if (buf != NULL) arc_buf_destroy(buf, private); scan_prefetch_ctx_rele(spc, scn); } /* ARGSUSED */ static void dsl_scan_prefetch_thread(void *arg) { dsl_scan_t *scn = arg; spa_t *spa = scn->scn_dp->dp_spa; vdev_t *rvd = spa->spa_root_vdev; uint64_t maxinflight = rvd->vdev_children * zfs_top_maxinflight; scan_prefetch_issue_ctx_t *spic; /* loop until we are told to stop */ while (!scn->scn_prefetch_stop) { arc_flags_t flags = ARC_FLAG_NOWAIT | ARC_FLAG_PRESCIENT_PREFETCH | ARC_FLAG_PREFETCH; int zio_flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD; mutex_enter(&spa->spa_scrub_lock); /* * Wait until we have an IO to issue and are not above our * maximum in flight limit.
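 * spa_scrub_io_cv is broadcast both when new prefetches are queued
 * (dsl_scan_prefetch()) and when in-flight prefetch reads complete
 * (dsl_scan_prefetch_cb()), so both conditions are rechecked after
 * every wakeup.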
*/ while (!scn->scn_prefetch_stop && (avl_numnodes(&scn->scn_prefetch_queue) == 0 || spa->spa_scrub_inflight >= scn->scn_maxinflight_bytes)) { cv_wait(&spa->spa_scrub_io_cv, &spa->spa_scrub_lock); } /* recheck if we should stop since we waited for the cv */ if (scn->scn_prefetch_stop) { mutex_exit(&spa->spa_scrub_lock); break; } /* remove the prefetch IO from the tree */ spic = avl_first(&scn->scn_prefetch_queue); spa->spa_scrub_inflight += BP_GET_PSIZE(&spic->spic_bp); avl_remove(&scn->scn_prefetch_queue, spic); mutex_exit(&spa->spa_scrub_lock); /* issue the prefetch asynchronously */ (void) arc_read(scn->scn_zio_root, scn->scn_dp->dp_spa, &spic->spic_bp, dsl_scan_prefetch_cb, spic->spic_spc, ZIO_PRIORITY_SCRUB, zio_flags, &flags, &spic->spic_zb); kmem_free(spic, sizeof (scan_prefetch_issue_ctx_t)); } ASSERT(scn->scn_prefetch_stop); /* free any prefetches we didn't get to complete */ mutex_enter(&spa->spa_scrub_lock); while ((spic = avl_first(&scn->scn_prefetch_queue)) != NULL) { avl_remove(&scn->scn_prefetch_queue, spic); scan_prefetch_ctx_rele(spic->spic_spc, scn); kmem_free(spic, sizeof (scan_prefetch_issue_ctx_t)); } ASSERT0(avl_numnodes(&scn->scn_prefetch_queue)); mutex_exit(&spa->spa_scrub_lock); } static boolean_t dsl_scan_check_resume(dsl_scan_t *scn, const dnode_phys_t *dnp, const zbookmark_phys_t *zb) { /* * We never skip over user/group accounting objects (obj<0) */ if (!ZB_IS_ZERO(&scn->scn_phys.scn_bookmark) && (int64_t)zb->zb_object >= 0) { /* * If we already visited this bp & everything below (in * a prior txg sync), don't bother doing it again. */ if (zbookmark_subtree_completed(dnp, zb, &scn->scn_phys.scn_bookmark)) return (B_TRUE); /* * If we found the block we're trying to resume from, or * we went past it to a different object, zero it out to * indicate that it's OK to start checking for suspending * again. */ if (bcmp(zb, &scn->scn_phys.scn_bookmark, sizeof (*zb)) == 0 || zb->zb_object > scn->scn_phys.scn_bookmark.zb_object) { dprintf("resuming at %llx/%llx/%llx/%llx\n", (longlong_t)zb->zb_objset, (longlong_t)zb->zb_object, (longlong_t)zb->zb_level, (longlong_t)zb->zb_blkid); bzero(&scn->scn_phys.scn_bookmark, sizeof (*zb)); } } return (B_FALSE); } static void dsl_scan_visitbp(blkptr_t *bp, const zbookmark_phys_t *zb, dnode_phys_t *dnp, dsl_dataset_t *ds, dsl_scan_t *scn, dmu_objset_type_t ostype, dmu_tx_t *tx); static void dsl_scan_visitdnode( dsl_scan_t *, dsl_dataset_t *ds, dmu_objset_type_t ostype, dnode_phys_t *dnp, uint64_t object, dmu_tx_t *tx); /* * Return nonzero on i/o error.
*/ static int dsl_scan_recurse(dsl_scan_t *scn, dsl_dataset_t *ds, dmu_objset_type_t ostype, dnode_phys_t *dnp, const blkptr_t *bp, const zbookmark_phys_t *zb, dmu_tx_t *tx) { dsl_pool_t *dp = scn->scn_dp; int zio_flags = ZIO_FLAG_CANFAIL | ZIO_FLAG_SCAN_THREAD; int err; if (BP_GET_LEVEL(bp) > 0) { arc_flags_t flags = ARC_FLAG_WAIT; int i; blkptr_t *cbp; int epb = BP_GET_LSIZE(bp) >> SPA_BLKPTRSHIFT; arc_buf_t *buf; err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_SCRUB, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; return (err); } for (i = 0, cbp = buf->b_data; i < epb; i++, cbp++) { zbookmark_phys_t czb; SET_BOOKMARK(&czb, zb->zb_objset, zb->zb_object, zb->zb_level - 1, zb->zb_blkid * epb + i); dsl_scan_visitbp(cbp, &czb, dnp, ds, scn, ostype, tx); } arc_buf_destroy(buf, &buf); } else if (BP_GET_TYPE(bp) == DMU_OT_DNODE) { arc_flags_t flags = ARC_FLAG_WAIT; dnode_phys_t *cdnp; int i; int epb = BP_GET_LSIZE(bp) >> DNODE_SHIFT; arc_buf_t *buf; err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_SCRUB, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; return (err); } for (i = 0, cdnp = buf->b_data; i < epb; i += cdnp->dn_extra_slots + 1, cdnp += cdnp->dn_extra_slots + 1) { dsl_scan_visitdnode(scn, ds, ostype, cdnp, zb->zb_blkid * epb + i, tx); } arc_buf_destroy(buf, &buf); } else if (BP_GET_TYPE(bp) == DMU_OT_OBJSET) { arc_flags_t flags = ARC_FLAG_WAIT; objset_phys_t *osp; arc_buf_t *buf; err = arc_read(NULL, dp->dp_spa, bp, arc_getbuf_func, &buf, ZIO_PRIORITY_SCRUB, zio_flags, &flags, zb); if (err) { scn->scn_phys.scn_errors++; return (err); } osp = buf->b_data; dsl_scan_visitdnode(scn, ds, osp->os_type, &osp->os_meta_dnode, DMU_META_DNODE_OBJECT, tx); if (OBJSET_BUF_HAS_USERUSED(buf)) { /* * We also always visit user/group accounting * objects, and never skip them, even if we are * suspending. This is necessary so that the space * deltas from this txg get integrated. */ dsl_scan_visitdnode(scn, ds, osp->os_type, &osp->os_groupused_dnode, DMU_GROUPUSED_OBJECT, tx); dsl_scan_visitdnode(scn, ds, osp->os_type, &osp->os_userused_dnode, DMU_USERUSED_OBJECT, tx); } arc_buf_destroy(buf, &buf); } return (0); } static void dsl_scan_visitdnode(dsl_scan_t *scn, dsl_dataset_t *ds, dmu_objset_type_t ostype, dnode_phys_t *dnp, uint64_t object, dmu_tx_t *tx) { int j; for (j = 0; j < dnp->dn_nblkptr; j++) { zbookmark_phys_t czb; SET_BOOKMARK(&czb, ds ? ds->ds_object : 0, object, dnp->dn_nlevels - 1, j); dsl_scan_visitbp(&dnp->dn_blkptr[j], &czb, dnp, ds, scn, ostype, tx); } if (dnp->dn_flags & DNODE_FLAG_SPILL_BLKPTR) { zbookmark_phys_t czb; SET_BOOKMARK(&czb, ds ? ds->ds_object : 0, object, 0, DMU_SPILL_BLKID); dsl_scan_visitbp(DN_SPILL_BLKPTR(dnp), &czb, dnp, ds, scn, ostype, tx); } } /* * The arguments are in this order because mdb can only print the * first 5; we want them to be useful. */ static void dsl_scan_visitbp(blkptr_t *bp, const zbookmark_phys_t *zb, dnode_phys_t *dnp, dsl_dataset_t *ds, dsl_scan_t *scn, dmu_objset_type_t ostype, dmu_tx_t *tx) { dsl_pool_t *dp = scn->scn_dp; blkptr_t *bp_toread = NULL; if (dsl_scan_check_suspend(scn, zb)) return; if (dsl_scan_check_resume(scn, dnp, zb)) return; scn->scn_visited_this_txg++; dprintf_bp(bp, "visiting ds=%p/%llu zb=%llx/%llx/%llx/%llx bp=%p", ds, ds ? 
ds->ds_object : 0, zb->zb_objset, zb->zb_object, zb->zb_level, zb->zb_blkid, bp); if (BP_IS_HOLE(bp)) { scn->scn_holes_this_txg++; return; } if (bp->blk_birth <= scn->scn_phys.scn_cur_min_txg) { scn->scn_lt_min_this_txg++; return; } bp_toread = kmem_alloc(sizeof (blkptr_t), KM_SLEEP); *bp_toread = *bp; if (dsl_scan_recurse(scn, ds, ostype, dnp, bp_toread, zb, tx) != 0) goto out; /* * If dsl_scan_ddt() has already visited this block, it will have * already done any translations or scrubbing, so don't call the * callback again. */ if (ddt_class_contains(dp->dp_spa, scn->scn_phys.scn_ddt_class_max, bp)) { scn->scn_ddt_contained_this_txg++; goto out; } /* * If this block is from the future (after cur_max_txg), then we * are doing this on behalf of a deleted snapshot, and we will * revisit the future block on the next pass of this dataset. * Don't scan it now unless we need to because something * under it was modified. */ if (BP_PHYSICAL_BIRTH(bp) > scn->scn_phys.scn_cur_max_txg) { scn->scn_gt_max_this_txg++; goto out; } scan_funcs[scn->scn_phys.scn_func](dp, bp, zb); out: kmem_free(bp_toread, sizeof (blkptr_t)); } static void dsl_scan_visit_rootbp(dsl_scan_t *scn, dsl_dataset_t *ds, blkptr_t *bp, dmu_tx_t *tx) { zbookmark_phys_t zb; scan_prefetch_ctx_t *spc; SET_BOOKMARK(&zb, ds ? ds->ds_object : DMU_META_OBJSET, ZB_ROOT_OBJECT, ZB_ROOT_LEVEL, ZB_ROOT_BLKID); if (ZB_IS_ZERO(&scn->scn_phys.scn_bookmark)) { SET_BOOKMARK(&scn->scn_prefetch_bookmark, zb.zb_objset, 0, 0, 0); } else { scn->scn_prefetch_bookmark = scn->scn_phys.scn_bookmark; } scn->scn_objsets_visited_this_txg++; spc = scan_prefetch_ctx_create(scn, NULL, FTAG); dsl_scan_prefetch(spc, bp, &zb); scan_prefetch_ctx_rele(spc, FTAG); dsl_scan_visitbp(bp, &zb, NULL, ds, scn, DMU_OST_NONE, tx); dprintf_ds(ds, "finished scan%s", ""); } static void ds_destroyed_scn_phys(dsl_dataset_t *ds, dsl_scan_phys_t *scn_phys) { if (scn_phys->scn_bookmark.zb_objset == ds->ds_object) { if (ds->ds_is_snapshot) { /* * Note: * - scn_cur_{min,max}_txg stays the same. * - Setting the flag is not really necessary if * scn_cur_max_txg == scn_max_txg, because there * is nothing after this snapshot that we care * about. However, we set it anyway and then * ignore it when we retraverse it in * dsl_scan_visitds(). */ scn_phys->scn_bookmark.zb_objset = dsl_dataset_phys(ds)->ds_next_snap_obj; zfs_dbgmsg("destroying ds %llu; currently traversing; " "reset zb_objset to %llu", (u_longlong_t)ds->ds_object, (u_longlong_t)dsl_dataset_phys(ds)-> ds_next_snap_obj); scn_phys->scn_flags |= DSF_VISIT_DS_AGAIN; } else { SET_BOOKMARK(&scn_phys->scn_bookmark, ZB_DESTROYED_OBJSET, 0, 0, 0); zfs_dbgmsg("destroying ds %llu; currently traversing; " "reset bookmark to -1,0,0,0", (u_longlong_t)ds->ds_object); } } } /* * Invoked when a dataset is destroyed. We need to make sure that: * * 1) If it is the dataset that was currently being scanned, we write * a new dsl_scan_phys_t and mark the objset reference in it * as destroyed. * 2) Remove it from the work queue, if it was present. * * If the dataset was actually a snapshot, instead of marking the dataset * as destroyed, we substitute the next snapshot in line.
*/ void dsl_scan_ds_destroyed(dsl_dataset_t *ds, dmu_tx_t *tx) { dsl_pool_t *dp = ds->ds_dir->dd_pool; dsl_scan_t *scn = dp->dp_scan; uint64_t mintxg; if (!dsl_scan_is_running(scn)) return; ds_destroyed_scn_phys(ds, &scn->scn_phys); ds_destroyed_scn_phys(ds, &scn->scn_phys_cached); if (scan_ds_queue_contains(scn, ds->ds_object, &mintxg)) { scan_ds_queue_remove(scn, ds->ds_object); if (ds->ds_is_snapshot) scan_ds_queue_insert(scn, dsl_dataset_phys(ds)->ds_next_snap_obj, mintxg); } if (zap_lookup_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds->ds_object, &mintxg) == 0) { ASSERT3U(dsl_dataset_phys(ds)->ds_num_children, <=, 1); VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds->ds_object, tx)); if (ds->ds_is_snapshot) { /* * We keep the same mintxg; it could be > * ds_creation_txg if the previous snapshot was * deleted too. */ VERIFY(zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, dsl_dataset_phys(ds)->ds_next_snap_obj, mintxg, tx) == 0); zfs_dbgmsg("destroying ds %llu; in queue; " "replacing with %llu", (u_longlong_t)ds->ds_object, (u_longlong_t)dsl_dataset_phys(ds)-> ds_next_snap_obj); } else { zfs_dbgmsg("destroying ds %llu; in queue; removing", (u_longlong_t)ds->ds_object); } } /* * dsl_scan_sync() should be called after this, and should sync * out our changed state, but just to be safe, do it here. */ dsl_scan_sync_state(scn, tx, SYNC_CACHED); } static void ds_snapshotted_bookmark(dsl_dataset_t *ds, zbookmark_phys_t *scn_bookmark) { if (scn_bookmark->zb_objset == ds->ds_object) { scn_bookmark->zb_objset = dsl_dataset_phys(ds)->ds_prev_snap_obj; zfs_dbgmsg("snapshotting ds %llu; currently traversing; " "reset zb_objset to %llu", (u_longlong_t)ds->ds_object, (u_longlong_t)dsl_dataset_phys(ds)->ds_prev_snap_obj); } } /* * Called when a dataset is snapshotted. If we were currently traversing * this snapshot, we reset our bookmark to point at the newly created * snapshot. We also modify our work queue to remove the old snapshot and * replace with the new one. 
*/ void dsl_scan_ds_snapshotted(dsl_dataset_t *ds, dmu_tx_t *tx) { dsl_pool_t *dp = ds->ds_dir->dd_pool; dsl_scan_t *scn = dp->dp_scan; uint64_t mintxg; if (!dsl_scan_is_running(scn)) return; ASSERT(dsl_dataset_phys(ds)->ds_prev_snap_obj != 0); ds_snapshotted_bookmark(ds, &scn->scn_phys.scn_bookmark); ds_snapshotted_bookmark(ds, &scn->scn_phys_cached.scn_bookmark); if (scan_ds_queue_contains(scn, ds->ds_object, &mintxg)) { scan_ds_queue_remove(scn, ds->ds_object); scan_ds_queue_insert(scn, dsl_dataset_phys(ds)->ds_prev_snap_obj, mintxg); } if (zap_lookup_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds->ds_object, &mintxg) == 0) { VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds->ds_object, tx)); VERIFY(zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, dsl_dataset_phys(ds)->ds_prev_snap_obj, mintxg, tx) == 0); zfs_dbgmsg("snapshotting ds %llu; in queue; " "replacing with %llu", (u_longlong_t)ds->ds_object, (u_longlong_t)dsl_dataset_phys(ds)->ds_prev_snap_obj); } dsl_scan_sync_state(scn, tx, SYNC_CACHED); } static void ds_clone_swapped_bookmark(dsl_dataset_t *ds1, dsl_dataset_t *ds2, zbookmark_phys_t *scn_bookmark) { if (scn_bookmark->zb_objset == ds1->ds_object) { scn_bookmark->zb_objset = ds2->ds_object; zfs_dbgmsg("clone_swap ds %llu; currently traversing; " "reset zb_objset to %llu", (u_longlong_t)ds1->ds_object, (u_longlong_t)ds2->ds_object); } else if (scn_bookmark->zb_objset == ds2->ds_object) { scn_bookmark->zb_objset = ds1->ds_object; zfs_dbgmsg("clone_swap ds %llu; currently traversing; " "reset zb_objset to %llu", (u_longlong_t)ds2->ds_object, (u_longlong_t)ds1->ds_object); } } /* * Called when a parent dataset and its clone are swapped. If we were * currently traversing the dataset, we need to switch to traversing the * newly promoted parent. 
*/ void dsl_scan_ds_clone_swapped(dsl_dataset_t *ds1, dsl_dataset_t *ds2, dmu_tx_t *tx) { dsl_pool_t *dp = ds1->ds_dir->dd_pool; dsl_scan_t *scn = dp->dp_scan; uint64_t mintxg; if (!dsl_scan_is_running(scn)) return; ds_clone_swapped_bookmark(ds1, ds2, &scn->scn_phys.scn_bookmark); ds_clone_swapped_bookmark(ds1, ds2, &scn->scn_phys_cached.scn_bookmark); if (scan_ds_queue_contains(scn, ds1->ds_object, &mintxg)) { scan_ds_queue_remove(scn, ds1->ds_object); scan_ds_queue_insert(scn, ds2->ds_object, mintxg); } if (scan_ds_queue_contains(scn, ds2->ds_object, &mintxg)) { scan_ds_queue_remove(scn, ds2->ds_object); scan_ds_queue_insert(scn, ds1->ds_object, mintxg); } if (zap_lookup_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds1->ds_object, &mintxg) == 0) { int err; ASSERT3U(mintxg, ==, dsl_dataset_phys(ds1)->ds_prev_snap_txg); ASSERT3U(mintxg, ==, dsl_dataset_phys(ds2)->ds_prev_snap_txg); VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds1->ds_object, tx)); err = zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds2->ds_object, mintxg, tx); VERIFY(err == 0 || err == EEXIST); if (err == EEXIST) { /* Both were there to begin with */ VERIFY(0 == zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds1->ds_object, mintxg, tx)); } zfs_dbgmsg("clone_swap ds %llu; in queue; " "replacing with %llu", (u_longlong_t)ds1->ds_object, (u_longlong_t)ds2->ds_object); } if (zap_lookup_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds2->ds_object, &mintxg) == 0) { ASSERT3U(mintxg, ==, dsl_dataset_phys(ds1)->ds_prev_snap_txg); ASSERT3U(mintxg, ==, dsl_dataset_phys(ds2)->ds_prev_snap_txg); VERIFY3U(0, ==, zap_remove_int(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds2->ds_object, tx)); VERIFY(0 == zap_add_int_key(dp->dp_meta_objset, scn->scn_phys.scn_queue_obj, ds1->ds_object, mintxg, tx)); zfs_dbgmsg("clone_swap ds %llu; in queue; " "replacing with %llu", (u_longlong_t)ds2->ds_object, (u_longlong_t)ds1->ds_object); } dsl_scan_sync_state(scn, tx, SYNC_CACHED); } /* ARGSUSED */ static int enqueue_clones_cb(dsl_pool_t *dp, dsl_dataset_t *hds, void *arg) { uint64_t originobj = *(uint64_t *)arg; dsl_dataset_t *ds; int err; dsl_scan_t *scn = dp->dp_scan; if (dsl_dir_phys(hds->ds_dir)->dd_origin_obj != originobj) return (0); err = dsl_dataset_hold_obj(dp, hds->ds_object, FTAG, &ds); if (err) return (err); while (dsl_dataset_phys(ds)->ds_prev_snap_obj != originobj) { dsl_dataset_t *prev; err = dsl_dataset_hold_obj(dp, dsl_dataset_phys(ds)->ds_prev_snap_obj, FTAG, &prev); dsl_dataset_rele(ds, FTAG); if (err) return (err); ds = prev; } scan_ds_queue_insert(scn, ds->ds_object, dsl_dataset_phys(ds)->ds_prev_snap_txg); dsl_dataset_rele(ds, FTAG); return (0); } static void dsl_scan_visitds(dsl_scan_t *scn, uint64_t dsobj, dmu_tx_t *tx) { dsl_pool_t *dp = scn->scn_dp; dsl_dataset_t *ds; VERIFY3U(0, ==, dsl_dataset_hold_obj(dp, dsobj, FTAG, &ds)); if (scn->scn_phys.scn_cur_min_txg >= scn->scn_phys.scn_max_txg) { /* * This can happen if this snapshot was created after the * scan started, and we already completed a previous snapshot * that was created after the scan started. This snapshot * only references blocks with: * * birth < our ds_creation_txg * cur_min_txg is no less than ds_creation_txg. * We have already visited these blocks. * or * birth > scn_max_txg * The scan requested not to visit these blocks. * * Subsequent snapshots (and clones) can reference our * blocks, or blocks with even higher birth times. 
* Therefore we do not need to visit them either, * so we do not add them to the work queue. * * Note that checking for cur_min_txg >= cur_max_txg * is not sufficient, because in that case we may need to * visit subsequent snapshots. This happens when min_txg > 0, * which raises cur_min_txg. In this case we will visit * this dataset but skip all of its blocks, because the * rootbp's birth time is < cur_min_txg. Then we will * add the next snapshots/clones to the work queue. */ char *dsname = kmem_alloc(MAXNAMELEN, KM_SLEEP); dsl_dataset_name(ds, dsname); zfs_dbgmsg("scanning dataset %llu (%s) is unnecessary because " "cur_min_txg (%llu) >= max_txg (%llu)", (longlong_t)dsobj, dsname, (longlong_t)scn->scn_phys.scn_cur_min_txg, (longlong_t)scn->scn_phys.scn_max_txg); kmem_free(dsname, MAXNAMELEN); goto out; } /* * Only the ZIL in the head (non-snapshot) is valid. Even though * snapshots can have ZIL block pointers (which may be the same * BP as in the head), they must be ignored. In addition, $ORIGIN * doesn't have an objset (i.e. its ds_bp is a hole) so we don't * need to look for a ZIL in it either. So we traverse the ZIL here, * rather than in scan_recurse(), because the regular snapshot * block-sharing rules don't apply to it. */ if (DSL_SCAN_IS_SCRUB_RESILVER(scn) && !dsl_dataset_is_snapshot(ds) && (dp->dp_origin_snap == NULL || ds->ds_dir != dp->dp_origin_snap->ds_dir)) { objset_t *os; if (dmu_objset_from_ds(ds, &os) != 0) { goto out; } dsl_scan_zil(dp, &os->os_zil_header); } /* * Iterate over the bps in this ds. */ dmu_buf_will_dirty(ds->ds_dbuf, tx); rrw_enter(&ds->ds_bp_rwlock, RW_READER, FTAG); dsl_scan_visit_rootbp(scn, ds, &dsl_dataset_phys(ds)->ds_bp, tx); rrw_exit(&ds->ds_bp_rwlock, FTAG); char *dsname = kmem_alloc(ZFS_MAX_DATASET_NAME_LEN, KM_SLEEP); dsl_dataset_name(ds, dsname); zfs_dbgmsg("scanned dataset %llu (%s) with min=%llu max=%llu; " "suspending=%u", (longlong_t)dsobj, dsname, (longlong_t)scn->scn_phys.scn_cur_min_txg, (longlong_t)scn->scn_phys.scn_cur_max_txg, (int)scn->scn_suspending); kmem_free(dsname, ZFS_MAX_DATASET_NAME_LEN); if (scn->scn_suspending) goto out; /* * We've finished this pass over this dataset. */ /* * If we did not completely visit this dataset, do another pass. */ if (scn->scn_phys.scn_flags & DSF_VISIT_DS_AGAIN) { zfs_dbgmsg("incomplete pass; visiting again"); scn->scn_phys.scn_flags &= ~DSF_VISIT_DS_AGAIN; scan_ds_queue_insert(scn, ds->ds_object, scn->scn_phys.scn_cur_max_txg); goto out; } /* * Add descendent datasets to work queue. */ if (dsl_dataset_phys(ds)->ds_next_snap_obj != 0) { scan_ds_queue_insert(scn, dsl_dataset_phys(ds)->ds_next_snap_obj, dsl_dataset_phys(ds)->ds_creation_txg); } if (dsl_dataset_phys(ds)->ds_num_children > 1) { boolean_t usenext = B_FALSE; if (dsl_dataset_phys(ds)->ds_next_clones_obj != 0) { uint64_t count; /* * A bug in a previous version of the code could * cause upgrade_clones_cb() to not set * ds_next_snap_obj when it should, leading to a * missing entry. Therefore we can only use the * next_clones_obj when its count is correct.
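 * Otherwise we fall back to walking every dataset under the root dir
 * with dmu_objset_find_dp() and filtering on the origin in
 * enqueue_clones_cb(), which is correct but slower.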
*/ int err = zap_count(dp->dp_meta_objset, dsl_dataset_phys(ds)->ds_next_clones_obj, &count); if (err == 0 && count == dsl_dataset_phys(ds)->ds_num_children - 1) usenext = B_TRUE; } if (usenext) { zap_cursor_t zc; zap_attribute_t za; for (zap_cursor_init(&zc, dp->dp_meta_objset, dsl_dataset_phys(ds)->ds_next_clones_obj); zap_cursor_retrieve(&zc, &za) == 0; (void) zap_cursor_advance(&zc)) { scan_ds_queue_insert(scn, zfs_strtonum(za.za_name, NULL), dsl_dataset_phys(ds)->ds_creation_txg); } zap_cursor_fini(&zc); } else { VERIFY0(dmu_objset_find_dp(dp, dp->dp_root_dir_obj, enqueue_clones_cb, &ds->ds_object, DS_FIND_CHILDREN)); } } out: dsl_dataset_rele(ds, FTAG); } /* ARGSUSED */ static int enqueue_cb(dsl_pool_t *dp, dsl_dataset_t *hds, void *arg) { dsl_dataset_t *ds; int err; dsl_scan_t *scn = dp->dp_scan; err = dsl_dataset_hold_obj(dp, hds->ds_object, FTAG, &ds); if (err) return (err); while (dsl_dataset_phys(ds)->ds_prev_snap_obj != 0) { dsl_dataset_t *prev; err = dsl_dataset_hold_obj(dp, dsl_dataset_phys(ds)->ds_prev_snap_obj, FTAG, &prev); if (err) { dsl_dataset_rele(ds, FTAG); return (err); } /* * If this is a clone, we don't need to worry about it for now. */ if (dsl_dataset_phys(prev)->ds_next_snap_obj != ds->ds_object) { dsl_dataset_rele(ds, FTAG); dsl_dataset_rele(prev, FTAG); return (0); } dsl_dataset_rele(ds, FTAG); ds = prev; } scan_ds_queue_insert(scn, ds->ds_object, dsl_dataset_phys(ds)->ds_prev_snap_txg); dsl_dataset_rele(ds, FTAG); return (0); } /* ARGSUSED */ void dsl_scan_ddt_entry(dsl_scan_t *scn, enum zio_checksum checksum, ddt_entry_t *dde, dmu_tx_t *tx) { const ddt_key_t *ddk = &dde->dde_key; ddt_phys_t *ddp = dde->dde_phys; blkptr_t bp; zbookmark_phys_t zb = { 0 }; int p; if (scn->scn_phys.scn_state != DSS_SCANNING) return; for (p = 0; p < DDT_PHYS_TYPES; p++, ddp++) { if (ddp->ddp_phys_birth == 0 || ddp->ddp_phys_birth > scn->scn_phys.scn_max_txg) continue; ddt_bp_create(checksum, ddk, ddp, &bp); scn->scn_visited_this_txg++; scan_funcs[scn->scn_phys.scn_func](scn->scn_dp, &bp, &zb); } } /* * Scrub/dedup interaction. * * If there are N references to a deduped block, we don't want to scrub it * N times -- ideally, we should scrub it exactly once. * * We leverage the fact that the dde's replication class (enum ddt_class) * is ordered from highest replication class (DDT_CLASS_DITTO) to lowest * (DDT_CLASS_UNIQUE) so that we may walk the DDT in that order. * * To prevent excess scrubbing, the scrub begins by walking the DDT * to find all blocks with refcnt > 1, and scrubs each of these once. * Since there are two replication classes which contain blocks with * refcnt > 1, we scrub the highest replication class (DDT_CLASS_DITTO) first. * Finally the top-down scrub begins, only visiting blocks with refcnt == 1. * * There would be nothing more to say if a block's refcnt couldn't change * during a scrub, but of course it can so we must account for changes * in a block's replication class. * * Here's an example of what can occur: * * If a block has refcnt > 1 during the DDT scrub phase, but has refcnt == 1 * when visited during the top-down scrub phase, it will be scrubbed twice. * This negates our scrub optimization, but is otherwise harmless. * * If a block has refcnt == 1 during the DDT scrub phase, but has refcnt > 1 * on each visit during the top-down scrub phase, it will never be scrubbed. 
* To catch this, ddt_sync_entry() notifies the scrub code whenever a block's * reference class transitions to a higher level (i.e DDT_CLASS_UNIQUE to * DDT_CLASS_DUPLICATE); if it transitions from refcnt == 1 to refcnt > 1 * while a scrub is in progress, it scrubs the block right then. */ static void dsl_scan_ddt(dsl_scan_t *scn, dmu_tx_t *tx) { ddt_bookmark_t *ddb = &scn->scn_phys.scn_ddt_bookmark; ddt_entry_t dde = { 0 }; int error; uint64_t n = 0; while ((error = ddt_walk(scn->scn_dp->dp_spa, ddb, &dde)) == 0) { ddt_t *ddt; if (ddb->ddb_class > scn->scn_phys.scn_ddt_class_max) break; dprintf("visiting ddb=%llu/%llu/%llu/%llx\n", (longlong_t)ddb->ddb_class, (longlong_t)ddb->ddb_type, (longlong_t)ddb->ddb_checksum, (longlong_t)ddb->ddb_cursor); /* There should be no pending changes to the dedup table */ ddt = scn->scn_dp->dp_spa->spa_ddt[ddb->ddb_checksum]; ASSERT(avl_first(&ddt->ddt_tree) == NULL); dsl_scan_ddt_entry(scn, ddb->ddb_checksum, &dde, tx); n++; if (dsl_scan_check_suspend(scn, NULL)) break; } zfs_dbgmsg("scanned %llu ddt entries with class_max = %u; " "suspending=%u", (longlong_t)n, (int)scn->scn_phys.scn_ddt_class_max, (int)scn->scn_suspending); ASSERT(error == 0 || error == ENOENT); ASSERT(error != ENOENT || ddb->ddb_class > scn->scn_phys.scn_ddt_class_max); } static uint64_t dsl_scan_ds_maxtxg(dsl_dataset_t *ds) { uint64_t smt = ds->ds_dir->dd_pool->dp_scan->scn_phys.scn_max_txg; if (ds->ds_is_snapshot) return (MIN(smt, dsl_dataset_phys(ds)->ds_creation_txg)); return (smt); } static void dsl_scan_visit(dsl_scan_t *scn, dmu_tx_t *tx) { scan_ds_t *sds; dsl_pool_t *dp = scn->scn_dp; if (scn->scn_phys.scn_ddt_bookmark.ddb_class <= scn->scn_phys.scn_ddt_class_max) { scn->scn_phys.scn_cur_min_txg = scn->scn_phys.scn_min_txg; scn->scn_phys.scn_cur_max_txg = scn->scn_phys.scn_max_txg; dsl_scan_ddt(scn, tx); if (scn->scn_suspending) return; } if (scn->scn_phys.scn_bookmark.zb_objset == DMU_META_OBJSET) { /* First do the MOS & ORIGIN */ scn->scn_phys.scn_cur_min_txg = scn->scn_phys.scn_min_txg; scn->scn_phys.scn_cur_max_txg = scn->scn_phys.scn_max_txg; dsl_scan_visit_rootbp(scn, NULL, &dp->dp_meta_rootbp, tx); spa_set_rootblkptr(dp->dp_spa, &dp->dp_meta_rootbp); if (scn->scn_suspending) return; if (spa_version(dp->dp_spa) < SPA_VERSION_DSL_SCRUB) { VERIFY0(dmu_objset_find_dp(dp, dp->dp_root_dir_obj, enqueue_cb, NULL, DS_FIND_CHILDREN)); } else { dsl_scan_visitds(scn, dp->dp_origin_snap->ds_object, tx); } ASSERT(!scn->scn_suspending); } else if (scn->scn_phys.scn_bookmark.zb_objset != ZB_DESTROYED_OBJSET) { uint64_t dsobj = scn->scn_phys.scn_bookmark.zb_objset; /* * If we were suspended, continue from here. Note if the * ds we were suspended on was deleted, the zb_objset may * be -1, so we will skip this and find a new objset * below. */ dsl_scan_visitds(scn, dsobj, tx); if (scn->scn_suspending) return; } /* * In case we suspended right at the end of the ds, zero the * bookmark so we don't think that we're still trying to resume. */ bzero(&scn->scn_phys.scn_bookmark, sizeof (zbookmark_phys_t)); /* * Keep pulling things out of the dataset avl queue. Updates to the * persistent zap-object-as-queue happen only at checkpoints. 
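 * (the in-memory queue is written back to the on-disk ZAP by
 * scan_ds_queue_sync(), called from dsl_scan_sync_state())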
*/ while ((sds = avl_first(&scn->scn_queue)) != NULL) { dsl_dataset_t *ds; uint64_t dsobj = sds->sds_dsobj; uint64_t txg = sds->sds_txg; /* dequeue and free the ds from the queue */ scan_ds_queue_remove(scn, dsobj); sds = NULL; /* must not be touched after removal */ /* Set up min / max txg */ VERIFY3U(0, ==, dsl_dataset_hold_obj(dp, dsobj, FTAG, &ds)); if (txg != 0) { scn->scn_phys.scn_cur_min_txg = MAX(scn->scn_phys.scn_min_txg, txg); } else { scn->scn_phys.scn_cur_min_txg = MAX(scn->scn_phys.scn_min_txg, dsl_dataset_phys(ds)->ds_prev_snap_txg); } scn->scn_phys.scn_cur_max_txg = dsl_scan_ds_maxtxg(ds); dsl_dataset_rele(ds, FTAG); dsl_scan_visitds(scn, dsobj, tx); if (scn->scn_suspending) return; } /* No more objsets to fetch, we're done */ scn->scn_phys.scn_bookmark.zb_objset = ZB_DESTROYED_OBJSET; ASSERT0(scn->scn_suspending); } static uint64_t dsl_scan_count_leaves(vdev_t *vd) { uint64_t i, leaves = 0; /* we only count leaves that belong to the main pool and are readable */ if (vd->vdev_islog || vd->vdev_isspare || vd->vdev_isl2cache || !vdev_readable(vd)) return (0); if (vd->vdev_ops->vdev_op_leaf) return (1); for (i = 0; i < vd->vdev_children; i++) { leaves += dsl_scan_count_leaves(vd->vdev_child[i]); } return (leaves); } static void scan_io_queues_update_zio_stats(dsl_scan_io_queue_t *q, const blkptr_t *bp) { int i; uint64_t cur_size = 0; for (i = 0; i < BP_GET_NDVAS(bp); i++) { cur_size += DVA_GET_ASIZE(&bp->blk_dva[i]); } q->q_total_zio_size_this_txg += cur_size; q->q_zios_this_txg++; } static void scan_io_queues_update_seg_stats(dsl_scan_io_queue_t *q, uint64_t start, uint64_t end) { q->q_total_seg_size_this_txg += end - start; q->q_segs_this_txg++; } static boolean_t scan_io_queue_check_suspend(dsl_scan_t *scn) { /* See comment in dsl_scan_check_suspend() */ uint64_t curr_time_ns = gethrtime(); uint64_t scan_time_ns = curr_time_ns - scn->scn_sync_start_time; uint64_t sync_time_ns = curr_time_ns - scn->scn_dp->dp_spa->spa_sync_starttime; int dirty_pct = scn->scn_dp->dp_dirty_total * 100 / zfs_dirty_data_max; int mintime = (scn->scn_phys.scn_func == POOL_SCAN_RESILVER) ? zfs_resilver_min_time_ms : zfs_scrub_min_time_ms; return ((NSEC2MSEC(scan_time_ns) > mintime && (dirty_pct >= zfs_vdev_async_write_active_min_dirty_percent || txg_sync_waiting(scn->scn_dp) || NSEC2SEC(sync_time_ns) >= zfs_txg_timeout)) || spa_shutting_down(scn->scn_dp->dp_spa)); } /* * Given a list of scan_io_t's in io_list, this issues the io's out to * disk. This consumes the io_list and frees the scan_io_t's. This is * called when emptying queues, either when we're up against the memory * limit or when we have finished scanning. Returns B_TRUE if we stopped * processing the list before we finished. Any zios that were not issued * will remain in the io_list. 
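 * The caller (scan_io_queues_run_one()) re-inserts any leftover
 * scan_io_t's into the queue via scan_io_queue_insert_impl().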
*/ static boolean_t scan_io_queue_issue(dsl_scan_io_queue_t *queue, list_t *io_list) { dsl_scan_t *scn = queue->q_scn; scan_io_t *sio; int64_t bytes_issued = 0; boolean_t suspended = B_FALSE; while ((sio = list_head(io_list)) != NULL) { blkptr_t bp; if (scan_io_queue_check_suspend(scn)) { suspended = B_TRUE; break; } sio2bp(sio, &bp, queue->q_vd->vdev_id); bytes_issued += sio->sio_asize; scan_exec_io(scn->scn_dp, &bp, sio->sio_flags, &sio->sio_zb, queue); (void) list_remove_head(io_list); scan_io_queues_update_zio_stats(queue, &bp); kmem_free(sio, sizeof (*sio)); } atomic_add_64(&scn->scn_bytes_pending, -bytes_issued); return (suspended); } /* * Given a range_seg_t (extent) and a list, this function passes over a * scan queue and gathers up the appropriate ios which fit into that * scan seg (starting from lowest LBA). At the end, we remove the segment * from the q_exts_by_addr range tree. */ static boolean_t scan_io_queue_gather(dsl_scan_io_queue_t *queue, range_seg_t *rs, list_t *list) { scan_io_t srch_sio, *sio, *next_sio; avl_index_t idx; uint_t num_sios = 0; int64_t bytes_issued = 0; ASSERT(rs != NULL); ASSERT(MUTEX_HELD(&queue->q_vd->vdev_scan_io_queue_lock)); srch_sio.sio_offset = rs->rs_start; /* * The exact start of the extent might not contain any matching zios, * so if that's the case, examine the next one in the tree. */ sio = avl_find(&queue->q_sios_by_addr, &srch_sio, &idx); if (sio == NULL) sio = avl_nearest(&queue->q_sios_by_addr, idx, AVL_AFTER); while (sio != NULL && sio->sio_offset < rs->rs_end && num_sios <= 32) { ASSERT3U(sio->sio_offset, >=, rs->rs_start); ASSERT3U(sio->sio_offset + sio->sio_asize, <=, rs->rs_end); next_sio = AVL_NEXT(&queue->q_sios_by_addr, sio); avl_remove(&queue->q_sios_by_addr, sio); bytes_issued += sio->sio_asize; num_sios++; list_insert_tail(list, sio); sio = next_sio; } /* * We limit the number of sios we process at once to 32 to avoid * biting off more than we can chew. If we didn't take everything * in the segment we update it to reflect the work we were able to * complete. Otherwise, we remove it from the range tree entirely. */ if (sio != NULL && sio->sio_offset < rs->rs_end) { range_tree_adjust_fill(queue->q_exts_by_addr, rs, -bytes_issued); range_tree_resize_segment(queue->q_exts_by_addr, rs, sio->sio_offset, rs->rs_end - sio->sio_offset); return (B_TRUE); } else { range_tree_remove(queue->q_exts_by_addr, rs->rs_start, rs->rs_end - rs->rs_start); return (B_FALSE); } } /* * This is called from the queue emptying thread and selects the next * extent from which we are to issue io's. The behavior of this function * depends on the state of the scan, the current memory consumption and * whether or not we are performing a scan shutdown. * 1) We select extents in an elevator algorithm (LBA-order) if the scan * needs to perform a checkpoint * 2) We select the largest available extent if we are up against the * memory limit. * 3) Otherwise we don't select any extents. 
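 * Returning NULL here ends the issuing loop in scan_io_queues_run_one()
 * for this txg; any remaining extents stay queued for a later pass.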
*/ static const range_seg_t * scan_io_queue_fetch_ext(dsl_scan_io_queue_t *queue) { dsl_scan_t *scn = queue->q_scn; ASSERT(MUTEX_HELD(&queue->q_vd->vdev_scan_io_queue_lock)); ASSERT(scn->scn_is_sorted); /* handle tunable overrides */ if (scn->scn_checkpointing || scn->scn_clearing) { if (zfs_scan_issue_strategy == 1) { return (range_tree_first(queue->q_exts_by_addr)); } else if (zfs_scan_issue_strategy == 2) { return (avl_first(&queue->q_exts_by_size)); } } /* * During normal clearing, we want to issue our largest segments * first, keeping IO as sequential as possible, and leaving the * smaller extents for later with the hope that they might eventually * grow to larger sequential segments. However, when the scan is * checkpointing, no new extents will be added to the sorting queue, * so the way we are sorted now is as good as it will ever get. * In this case, we instead switch to issuing extents in LBA order. */ if (scn->scn_checkpointing) { return (range_tree_first(queue->q_exts_by_addr)); } else if (scn->scn_clearing) { return (avl_first(&queue->q_exts_by_size)); } else { return (NULL); } } static void scan_io_queues_run_one(void *arg) { dsl_scan_io_queue_t *queue = arg; kmutex_t *q_lock = &queue->q_vd->vdev_scan_io_queue_lock; boolean_t suspended = B_FALSE; range_seg_t *rs = NULL; scan_io_t *sio = NULL; list_t sio_list; uint64_t bytes_per_leaf = zfs_scan_vdev_limit; uint64_t nr_leaves = dsl_scan_count_leaves(queue->q_vd); ASSERT(queue->q_scn->scn_is_sorted); list_create(&sio_list, sizeof (scan_io_t), offsetof(scan_io_t, sio_nodes.sio_list_node)); mutex_enter(q_lock); /* calculate maximum in-flight bytes for this txg (min 1MB) */ queue->q_maxinflight_bytes = MAX(nr_leaves * bytes_per_leaf, 1ULL << 20); /* reset per-queue scan statistics for this txg */ queue->q_total_seg_size_this_txg = 0; queue->q_segs_this_txg = 0; queue->q_total_zio_size_this_txg = 0; queue->q_zios_this_txg = 0; /* loop until we have run out of time or sios */ while ((rs = (range_seg_t*)scan_io_queue_fetch_ext(queue)) != NULL) { uint64_t seg_start = 0, seg_end = 0; boolean_t more_left = B_TRUE; ASSERT(list_is_empty(&sio_list)); /* loop while we still have sios left to process in this rs */ while (more_left) { scan_io_t *first_sio, *last_sio; /* * We have selected which extent needs to be * processed next. Gather up the corresponding sios. */ more_left = scan_io_queue_gather(queue, rs, &sio_list); ASSERT(!list_is_empty(&sio_list)); first_sio = list_head(&sio_list); last_sio = list_tail(&sio_list); seg_end = last_sio->sio_offset + last_sio->sio_asize; if (seg_start == 0) seg_start = first_sio->sio_offset; /* * Issuing sios can take a long time so drop the * queue lock. The sio queue won't be updated by * other threads since we're in syncing context so * we can be sure that our trees will remain exactly * as we left them. */ mutex_exit(q_lock); suspended = scan_io_queue_issue(queue, &sio_list); mutex_enter(q_lock); if (suspended) break; } /* update statistics for debugging purposes */ scan_io_queues_update_seg_stats(queue, seg_start, seg_end); if (suspended) break; } /* If we were suspended in the middle of processing, * requeue any unfinished sios and exit. */ while ((sio = list_head(&sio_list)) != NULL) { list_remove(&sio_list, sio); scan_io_queue_insert_impl(queue, sio); } mutex_exit(q_lock); list_destroy(&sio_list); } /* * Performs an emptying run on all scan queues in the pool. This just * punches out one thread per top-level vdev, each of which processes * only that vdev's scan queue. 
We can parallelize the I/O here because * we know that each queue's io's only affect its own top-level vdev. * * This function waits for the queue runs to complete, and must be * called from dsl_scan_sync (or in general, syncing context). */ static void scan_io_queues_run(dsl_scan_t *scn) { spa_t *spa = scn->scn_dp->dp_spa; ASSERT(scn->scn_is_sorted); ASSERT(spa_config_held(spa, SCL_CONFIG, RW_READER)); if (scn->scn_bytes_pending == 0) return; if (scn->scn_taskq == NULL) { char *tq_name = kmem_zalloc(ZFS_MAX_DATASET_NAME_LEN + 16, KM_SLEEP); int nthreads = spa->spa_root_vdev->vdev_children; /* * We need to make this taskq *always* execute as many * threads in parallel as we have top-level vdevs and no * less, otherwise strange serialization of the calls to * scan_io_queues_run_one can occur during spa_sync runs * and that significantly impacts performance. */ (void) snprintf(tq_name, ZFS_MAX_DATASET_NAME_LEN + 16, "dsl_scan_tq_%s", spa->spa_name); scn->scn_taskq = taskq_create(tq_name, nthreads, minclsyspri, nthreads, nthreads, TASKQ_PREPOPULATE); kmem_free(tq_name, ZFS_MAX_DATASET_NAME_LEN + 16); } for (uint64_t i = 0; i < spa->spa_root_vdev->vdev_children; i++) { vdev_t *vd = spa->spa_root_vdev->vdev_child[i]; mutex_enter(&vd->vdev_scan_io_queue_lock); if (vd->vdev_scan_io_queue != NULL) { VERIFY(taskq_dispatch(scn->scn_taskq, scan_io_queues_run_one, vd->vdev_scan_io_queue, TQ_SLEEP) != TASKQID_INVALID); } mutex_exit(&vd->vdev_scan_io_queue_lock); } /* * Wait for the queues to finish issuing their IOs for this run * before we return. There may still be IOs in flight at this * point. */ taskq_wait(scn->scn_taskq); } static boolean_t dsl_scan_async_block_should_pause(dsl_scan_t *scn) { uint64_t elapsed_nanosecs; if (zfs_recover) return (B_FALSE); if (scn->scn_visited_this_txg >= zfs_async_block_max_blocks) return (B_TRUE); elapsed_nanosecs = gethrtime() - scn->scn_sync_start_time; return (elapsed_nanosecs / NANOSEC > zfs_txg_timeout || (NSEC2MSEC(elapsed_nanosecs) > scn->scn_async_block_min_time_ms && txg_sync_waiting(scn->scn_dp)) || spa_shutting_down(scn->scn_dp->dp_spa)); } static int dsl_scan_free_block_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx) { dsl_scan_t *scn = arg; if (!scn->scn_is_bptree || (BP_GET_LEVEL(bp) == 0 && BP_GET_TYPE(bp) != DMU_OT_OBJSET)) { if (dsl_scan_async_block_should_pause(scn)) return (SET_ERROR(ERESTART)); } zio_nowait(zio_free_sync(scn->scn_zio_root, scn->scn_dp->dp_spa, dmu_tx_get_txg(tx), bp, BP_GET_PSIZE(bp), 0)); dsl_dir_diduse_space(tx->tx_pool->dp_free_dir, DD_USED_HEAD, -bp_get_dsize_sync(scn->scn_dp->dp_spa, bp), -BP_GET_PSIZE(bp), -BP_GET_UCSIZE(bp), tx); scn->scn_visited_this_txg++; return (0); } static void dsl_scan_update_stats(dsl_scan_t *scn) { spa_t *spa = scn->scn_dp->dp_spa; uint64_t i; uint64_t seg_size_total = 0, zio_size_total = 0; uint64_t seg_count_total = 0, zio_count_total = 0; for (i = 0; i < spa->spa_root_vdev->vdev_children; i++) { vdev_t *vd = spa->spa_root_vdev->vdev_child[i]; dsl_scan_io_queue_t *queue = vd->vdev_scan_io_queue; if (queue == NULL) continue; seg_size_total += queue->q_total_seg_size_this_txg; zio_size_total += queue->q_total_zio_size_this_txg; seg_count_total += queue->q_segs_this_txg; zio_count_total += queue->q_zios_this_txg; } if (seg_count_total == 0 || zio_count_total == 0) { scn->scn_avg_seg_size_this_txg = 0; scn->scn_avg_zio_size_this_txg = 0; scn->scn_segs_this_txg = 0; scn->scn_zios_this_txg = 0; return; } scn->scn_avg_seg_size_this_txg = seg_size_total / seg_count_total; scn->scn_avg_zio_size_this_txg
= zio_size_total / zio_count_total; scn->scn_segs_this_txg = seg_count_total; scn->scn_zios_this_txg = zio_count_total; } static int dsl_scan_obsolete_block_cb(void *arg, const blkptr_t *bp, dmu_tx_t *tx) { dsl_scan_t *scn = arg; const dva_t *dva = &bp->blk_dva[0]; if (dsl_scan_async_block_should_pause(scn)) return (SET_ERROR(ERESTART)); spa_vdev_indirect_mark_obsolete(scn->scn_dp->dp_spa, DVA_GET_VDEV(dva), DVA_GET_OFFSET(dva), DVA_GET_ASIZE(dva), tx); scn->scn_visited_this_txg++; return (0); } boolean_t dsl_scan_active(dsl_scan_t *scn) { spa_t *spa = scn->scn_dp->dp_spa; uint64_t used = 0, comp, uncomp; if (spa->spa_load_state != SPA_LOAD_NONE) return (B_FALSE); if (spa_shutting_down(spa)) return (B_FALSE); if ((dsl_scan_is_running(scn) && !dsl_scan_is_paused_scrub(scn)) || (scn->scn_async_destroying && !scn->scn_async_stalled)) return (B_TRUE); if (spa_version(scn->scn_dp->dp_spa) >= SPA_VERSION_DEADLISTS) { (void) bpobj_space(&scn->scn_dp->dp_free_bpobj, &used, &comp, &uncomp); } return (used != 0); } static boolean_t dsl_scan_need_resilver(spa_t *spa, const dva_t *dva, size_t psize, uint64_t phys_birth) { vdev_t *vd; + vd = vdev_lookup_top(spa, DVA_GET_VDEV(dva)); + if (vd->vdev_ops == &vdev_indirect_ops) { /* * The indirect vdev can point to multiple * vdevs. For simplicity, always create * the resilver zio_t. zio_vdev_io_start() * will bypass the child resilver i/o's if * they are on vdevs that don't have DTL's. */ return (B_TRUE); } + if (DVA_GET_GANG(dva)) { /* * Gang members may be spread across multiple * vdevs, so the best estimate we have is the * scrub range, which has already been checked. * XXX -- it would be better to change our * allocation policy to ensure that all * gang members reside on the same vdev. */ return (B_TRUE); } - - vd = vdev_lookup_top(spa, DVA_GET_VDEV(dva)); /* * Check if the txg falls within the range which must be * resilvered. DVAs outside this range can always be skipped. */ if (!vdev_dtl_contains(vd, DTL_PARTIAL, phys_birth, 1)) return (B_FALSE); /* * Check if the top-level vdev must resilver this offset. * When the offset does not intersect with a dirty leaf DTL * then it may be possible to skip the resilver IO. The psize * is provided instead of asize to simplify the check for RAIDZ. 
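* (On RAID-Z, a DVA's allocated size also covers parity and padding and therefore varies with the vdev layout; psize does not, which keeps the check below layout-independent.)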
*/ if (!vdev_dtl_need_resilver(vd, DVA_GET_OFFSET(dva), psize)) return (B_FALSE); return (B_TRUE); } static int dsl_process_async_destroys(dsl_pool_t *dp, dmu_tx_t *tx) { int err = 0; dsl_scan_t *scn = dp->dp_scan; spa_t *spa = dp->dp_spa; if (spa_suspend_async_destroy(spa)) return (0); if (zfs_free_bpobj_enabled && spa_version(spa) >= SPA_VERSION_DEADLISTS) { scn->scn_is_bptree = B_FALSE; scn->scn_async_block_min_time_ms = zfs_free_min_time_ms; scn->scn_zio_root = zio_root(spa, NULL, NULL, ZIO_FLAG_MUSTSUCCEED); err = bpobj_iterate(&dp->dp_free_bpobj, dsl_scan_free_block_cb, scn, tx); VERIFY0(zio_wait(scn->scn_zio_root)); scn->scn_zio_root = NULL; if (err != 0 && err != ERESTART) zfs_panic_recover("error %u from bpobj_iterate()", err); } if (err == 0 && spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)) { ASSERT(scn->scn_async_destroying); scn->scn_is_bptree = B_TRUE; scn->scn_zio_root = zio_root(spa, NULL, NULL, ZIO_FLAG_MUSTSUCCEED); err = bptree_iterate(dp->dp_meta_objset, dp->dp_bptree_obj, B_TRUE, dsl_scan_free_block_cb, scn, tx); VERIFY0(zio_wait(scn->scn_zio_root)); scn->scn_zio_root = NULL; if (err == EIO || err == ECKSUM) { err = 0; } else if (err != 0 && err != ERESTART) { zfs_panic_recover("error %u from " "traverse_dataset_destroyed()", err); } if (bptree_is_empty(dp->dp_meta_objset, dp->dp_bptree_obj)) { /* finished; deactivate async destroy feature */ spa_feature_decr(spa, SPA_FEATURE_ASYNC_DESTROY, tx); ASSERT(!spa_feature_is_active(spa, SPA_FEATURE_ASYNC_DESTROY)); VERIFY0(zap_remove(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_BPTREE_OBJ, tx)); VERIFY0(bptree_free(dp->dp_meta_objset, dp->dp_bptree_obj, tx)); dp->dp_bptree_obj = 0; scn->scn_async_destroying = B_FALSE; scn->scn_async_stalled = B_FALSE; } else { /* * If we didn't make progress, mark the async * destroy as stalled, so that we will not initiate * a spa_sync() on its behalf. Note that we only * check this if we are not finished, because if the * bptree had no blocks for us to visit, we can * finish without "making progress". */ scn->scn_async_stalled = (scn->scn_visited_this_txg == 0); } } if (scn->scn_visited_this_txg) { zfs_dbgmsg("freed %llu blocks in %llums from " "free_bpobj/bptree txg %llu; err=%d", (longlong_t)scn->scn_visited_this_txg, (longlong_t) NSEC2MSEC(gethrtime() - scn->scn_sync_start_time), (longlong_t)tx->tx_txg, err); scn->scn_visited_this_txg = 0; /* * Write out changes to the DDT that may be required as a * result of the blocks freed. This ensures that the DDT * is clean when a scrub/resilver runs. */ ddt_sync(spa, tx->tx_txg); } if (err != 0) return (err); if (dp->dp_free_dir != NULL && !scn->scn_async_destroying && zfs_free_leak_on_eio && (dsl_dir_phys(dp->dp_free_dir)->dd_used_bytes != 0 || dsl_dir_phys(dp->dp_free_dir)->dd_compressed_bytes != 0 || dsl_dir_phys(dp->dp_free_dir)->dd_uncompressed_bytes != 0)) { /* * We have finished background destroying, but there is still * some space left in the dp_free_dir. Transfer this leaked * space to the dp_leak_dir. 
*/ if (dp->dp_leak_dir == NULL) { rrw_enter(&dp->dp_config_rwlock, RW_WRITER, FTAG); (void) dsl_dir_create_sync(dp, dp->dp_root_dir, LEAK_DIR_NAME, tx); VERIFY0(dsl_pool_open_special_dir(dp, LEAK_DIR_NAME, &dp->dp_leak_dir)); rrw_exit(&dp->dp_config_rwlock, FTAG); } dsl_dir_diduse_space(dp->dp_leak_dir, DD_USED_HEAD, dsl_dir_phys(dp->dp_free_dir)->dd_used_bytes, dsl_dir_phys(dp->dp_free_dir)->dd_compressed_bytes, dsl_dir_phys(dp->dp_free_dir)->dd_uncompressed_bytes, tx); dsl_dir_diduse_space(dp->dp_free_dir, DD_USED_HEAD, -dsl_dir_phys(dp->dp_free_dir)->dd_used_bytes, -dsl_dir_phys(dp->dp_free_dir)->dd_compressed_bytes, -dsl_dir_phys(dp->dp_free_dir)->dd_uncompressed_bytes, tx); } if (dp->dp_free_dir != NULL && !scn->scn_async_destroying) { /* finished; verify that space accounting went to zero */ ASSERT0(dsl_dir_phys(dp->dp_free_dir)->dd_used_bytes); ASSERT0(dsl_dir_phys(dp->dp_free_dir)->dd_compressed_bytes); ASSERT0(dsl_dir_phys(dp->dp_free_dir)->dd_uncompressed_bytes); } EQUIV(bpobj_is_open(&dp->dp_obsolete_bpobj), 0 == zap_contains(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_OBSOLETE_BPOBJ)); if (err == 0 && bpobj_is_open(&dp->dp_obsolete_bpobj)) { ASSERT(spa_feature_is_active(dp->dp_spa, SPA_FEATURE_OBSOLETE_COUNTS)); scn->scn_is_bptree = B_FALSE; scn->scn_async_block_min_time_ms = zfs_obsolete_min_time_ms; err = bpobj_iterate(&dp->dp_obsolete_bpobj, dsl_scan_obsolete_block_cb, scn, tx); if (err != 0 && err != ERESTART) zfs_panic_recover("error %u from bpobj_iterate()", err); if (bpobj_is_empty(&dp->dp_obsolete_bpobj)) dsl_pool_destroy_obsolete_bpobj(dp, tx); } return (0); } /* * This is the primary entry point for scans that is called from syncing * context. Scans must happen entirely during syncing context so that we * can guarantee that blocks we are currently scanning will not change out * from under us. While a scan is active, this function controls how quickly * transaction groups proceed, instead of the normal handling provided by * txg_sync_thread(). */ void dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t *tx) { dsl_scan_t *scn = dp->dp_scan; spa_t *spa = dp->dp_spa; int err = 0; state_sync_type_t sync_type = SYNC_OPTIONAL; /* * Check for scn_restart_txg before checking spa_load_state, so * that we can restart an old-style scan while the pool is being * imported (see dsl_scan_init). */ if (dsl_scan_restarting(scn, tx)) { pool_scan_func_t func = POOL_SCAN_SCRUB; dsl_scan_done(scn, B_FALSE, tx); if (vdev_resilver_needed(spa->spa_root_vdev, NULL, NULL)) func = POOL_SCAN_RESILVER; zfs_dbgmsg("restarting scan func=%u txg=%llu", func, (longlong_t)tx->tx_txg); dsl_scan_setup_sync(&func, tx); } /* * Only process scans in sync pass 1. */ if (spa_sync_pass(dp->dp_spa) > 1) return; /* * If the spa is shutting down, then stop scanning. This will * ensure that the scan does not dirty any new data during the * shutdown phase. */ if (spa_shutting_down(spa)) return; /* * If the scan is inactive due to a stalled async destroy, try again. */ if (!scn->scn_async_stalled && !dsl_scan_active(scn)) return; /* reset scan statistics */ scn->scn_visited_this_txg = 0; scn->scn_holes_this_txg = 0; scn->scn_lt_min_this_txg = 0; scn->scn_gt_max_this_txg = 0; scn->scn_ddt_contained_this_txg = 0; scn->scn_objsets_visited_this_txg = 0; scn->scn_avg_seg_size_this_txg = 0; scn->scn_segs_this_txg = 0; scn->scn_avg_zio_size_this_txg = 0; scn->scn_zios_this_txg = 0; scn->scn_suspending = B_FALSE; scn->scn_sync_start_time = gethrtime(); spa->spa_scrub_active = B_TRUE; /* * First process the async destroys.
If we pause, don't do * any scrubbing or resilvering. This ensures that there are no * async destroys while we are scanning, so the scan code doesn't * have to worry about traversing it. It is also faster to free the * blocks than to scrub them. */ err = dsl_process_async_destroys(dp, tx); if (err != 0) return; if (!dsl_scan_is_running(scn) || dsl_scan_is_paused_scrub(scn)) return; /* * Wait a few txgs after importing to begin scanning so that * we can get the pool imported quickly. */ if (spa->spa_syncing_txg < spa->spa_first_txg + SCAN_IMPORT_WAIT_TXGS) return; /* * It is possible to switch from unsorted to sorted at any time, * but afterwards the scan will remain sorted unless reloaded from * a checkpoint after a reboot. */ if (!zfs_scan_legacy) { scn->scn_is_sorted = B_TRUE; if (scn->scn_last_checkpoint == 0) scn->scn_last_checkpoint = ddi_get_lbolt(); } /* * For sorted scans, determine what kind of work we will be doing * this txg based on our memory limitations and whether or not we * need to perform a checkpoint. */ if (scn->scn_is_sorted) { /* * If we are over our checkpoint interval, set scn_clearing * so that we can begin checkpointing immediately. The * checkpoint allows us to save a consistent bookmark * representing how much data we have scrubbed so far. * Otherwise, use the memory limit to determine if we should * scan for metadata or start issuing scrub IOs. We accumulate * metadata until we hit our hard memory limit at which point * we issue scrub IOs until we are at our soft memory limit. */ if (scn->scn_checkpointing || ddi_get_lbolt() - scn->scn_last_checkpoint > SEC_TO_TICK(zfs_scan_checkpoint_intval)) { if (!scn->scn_checkpointing) zfs_dbgmsg("begin scan checkpoint"); scn->scn_checkpointing = B_TRUE; scn->scn_clearing = B_TRUE; } else { boolean_t should_clear = dsl_scan_should_clear(scn); if (should_clear && !scn->scn_clearing) { zfs_dbgmsg("begin scan clearing"); scn->scn_clearing = B_TRUE; } else if (!should_clear && scn->scn_clearing) { zfs_dbgmsg("finish scan clearing"); scn->scn_clearing = B_FALSE; } } } else { ASSERT0(scn->scn_checkpointing); ASSERT0(scn->scn_clearing); } if (!scn->scn_clearing && scn->scn_done_txg == 0) { /* Need to scan metadata for more blocks to scrub */ dsl_scan_phys_t *scnp = &scn->scn_phys; taskqid_t prefetch_tqid; uint64_t bytes_per_leaf = zfs_scan_vdev_limit; uint64_t nr_leaves = dsl_scan_count_leaves(spa->spa_root_vdev); /* * Calculate the max number of in-flight bytes for pool-wide * scanning operations (minimum 1MB). Limits for the issuing * phase are done per top-level vdev and are handled separately.
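* For example, a pool with 10 leaf vdevs and a 4 MiB per-leaf limit (zfs_scan_vdev_limit) would be capped at 40 MiB of in-flight scan I/O pool-wide.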
*/ scn->scn_maxinflight_bytes = MAX(nr_leaves * bytes_per_leaf, 1ULL << 20); if (scnp->scn_ddt_bookmark.ddb_class <= scnp->scn_ddt_class_max) { ASSERT(ZB_IS_ZERO(&scnp->scn_bookmark)); zfs_dbgmsg("doing scan sync txg %llu; " "ddt bm=%llu/%llu/%llu/%llx", (longlong_t)tx->tx_txg, (longlong_t)scnp->scn_ddt_bookmark.ddb_class, (longlong_t)scnp->scn_ddt_bookmark.ddb_type, (longlong_t)scnp->scn_ddt_bookmark.ddb_checksum, (longlong_t)scnp->scn_ddt_bookmark.ddb_cursor); } else { zfs_dbgmsg("doing scan sync txg %llu; " "bm=%llu/%llu/%llu/%llu", (longlong_t)tx->tx_txg, (longlong_t)scnp->scn_bookmark.zb_objset, (longlong_t)scnp->scn_bookmark.zb_object, (longlong_t)scnp->scn_bookmark.zb_level, (longlong_t)scnp->scn_bookmark.zb_blkid); } scn->scn_zio_root = zio_root(dp->dp_spa, NULL, NULL, ZIO_FLAG_CANFAIL); scn->scn_prefetch_stop = B_FALSE; prefetch_tqid = taskq_dispatch(dp->dp_sync_taskq, dsl_scan_prefetch_thread, scn, TQ_SLEEP); ASSERT(prefetch_tqid != TASKQID_INVALID); dsl_pool_config_enter(dp, FTAG); dsl_scan_visit(scn, tx); dsl_pool_config_exit(dp, FTAG); mutex_enter(&dp->dp_spa->spa_scrub_lock); scn->scn_prefetch_stop = B_TRUE; cv_broadcast(&spa->spa_scrub_io_cv); mutex_exit(&dp->dp_spa->spa_scrub_lock); taskq_wait_id(dp->dp_sync_taskq, prefetch_tqid); (void) zio_wait(scn->scn_zio_root); scn->scn_zio_root = NULL; zfs_dbgmsg("scan visited %llu blocks in %llums " "(%llu os's, %llu holes, %llu < mintxg, " "%llu in ddt, %llu > maxtxg)", (longlong_t)scn->scn_visited_this_txg, (longlong_t)NSEC2MSEC(gethrtime() - scn->scn_sync_start_time), (longlong_t)scn->scn_objsets_visited_this_txg, (longlong_t)scn->scn_holes_this_txg, (longlong_t)scn->scn_lt_min_this_txg, (longlong_t)scn->scn_ddt_contained_this_txg, (longlong_t)scn->scn_gt_max_this_txg); if (!scn->scn_suspending) { ASSERT0(avl_numnodes(&scn->scn_queue)); scn->scn_done_txg = tx->tx_txg + 1; if (scn->scn_is_sorted) { scn->scn_checkpointing = B_TRUE; scn->scn_clearing = B_TRUE; } zfs_dbgmsg("scan complete txg %llu", (longlong_t)tx->tx_txg); } } else if (scn->scn_is_sorted && scn->scn_bytes_pending != 0) { /* need to issue scrubbing IOs from per-vdev queues */ scn->scn_zio_root = zio_root(dp->dp_spa, NULL, NULL, ZIO_FLAG_CANFAIL); scan_io_queues_run(scn); (void) zio_wait(scn->scn_zio_root); scn->scn_zio_root = NULL; /* calculate and dprintf the current memory usage */ (void) dsl_scan_should_clear(scn); dsl_scan_update_stats(scn); zfs_dbgmsg("scrubbed %llu blocks (%llu segs) in %llums " "(avg_block_size = %llu, avg_seg_size = %llu)", (longlong_t)scn->scn_zios_this_txg, (longlong_t)scn->scn_segs_this_txg, (longlong_t)NSEC2MSEC(gethrtime() - scn->scn_sync_start_time), (longlong_t)scn->scn_avg_zio_size_this_txg, (longlong_t)scn->scn_avg_seg_size_this_txg); } else if (scn->scn_done_txg != 0 && scn->scn_done_txg <= tx->tx_txg) { /* Finished with everything. Mark the scrub as complete */ zfs_dbgmsg("scan issuing complete txg %llu", (longlong_t)tx->tx_txg); ASSERT3U(scn->scn_done_txg, !=, 0); ASSERT0(spa->spa_scrub_inflight); ASSERT0(scn->scn_bytes_pending); dsl_scan_done(scn, B_TRUE, tx); sync_type = SYNC_MANDATORY; } dsl_scan_sync_state(scn, tx, sync_type); } static void count_block(dsl_scan_t *scn, zfs_all_blkstats_t *zab, const blkptr_t *bp) { int i; /* update the spa's stats on how many bytes we have issued */ for (i = 0; i < BP_GET_NDVAS(bp); i++) { atomic_add_64(&scn->scn_dp->dp_spa->spa_scan_pass_issued, DVA_GET_ASIZE(&bp->blk_dva[i])); } /* * If we resume after a reboot, zab will be NULL; don't record * incomplete stats in that case. 
*/ if (zab == NULL) return; mutex_enter(&zab->zab_lock); for (i = 0; i < 4; i++) { int l = (i < 2) ? BP_GET_LEVEL(bp) : DN_MAX_LEVELS; int t = (i & 1) ? BP_GET_TYPE(bp) : DMU_OT_TOTAL; if (t & DMU_OT_NEWTYPE) t = DMU_OT_OTHER; zfs_blkstat_t *zb = &zab->zab_type[l][t]; int equal; zb->zb_count++; zb->zb_asize += BP_GET_ASIZE(bp); zb->zb_lsize += BP_GET_LSIZE(bp); zb->zb_psize += BP_GET_PSIZE(bp); zb->zb_gangs += BP_COUNT_GANG(bp); switch (BP_GET_NDVAS(bp)) { case 2: if (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[1])) zb->zb_ditto_2_of_2_samevdev++; break; case 3: equal = (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[1])) + (DVA_GET_VDEV(&bp->blk_dva[0]) == DVA_GET_VDEV(&bp->blk_dva[2])) + (DVA_GET_VDEV(&bp->blk_dva[1]) == DVA_GET_VDEV(&bp->blk_dva[2])); if (equal == 1) zb->zb_ditto_2_of_3_samevdev++; else if (equal == 3) zb->zb_ditto_3_of_3_samevdev++; break; } } mutex_exit(&zab->zab_lock); } static void scan_io_queue_insert_impl(dsl_scan_io_queue_t *queue, scan_io_t *sio) { avl_index_t idx; int64_t asize = sio->sio_asize; dsl_scan_t *scn = queue->q_scn; ASSERT(MUTEX_HELD(&queue->q_vd->vdev_scan_io_queue_lock)); if (avl_find(&queue->q_sios_by_addr, sio, &idx) != NULL) { /* block is already scheduled for reading */ atomic_add_64(&scn->scn_bytes_pending, -asize); kmem_free(sio, sizeof (*sio)); return; } avl_insert(&queue->q_sios_by_addr, sio, idx); range_tree_add(queue->q_exts_by_addr, sio->sio_offset, asize); } /* * Given all the info we got from our metadata scanning process, we * construct a scan_io_t and insert it into the scan sorting queue. The * I/O must already be suitable for us to process. This is controlled * by dsl_scan_enqueue(). */ static void scan_io_queue_insert(dsl_scan_io_queue_t *queue, const blkptr_t *bp, int dva_i, int zio_flags, const zbookmark_phys_t *zb) { dsl_scan_t *scn = queue->q_scn; scan_io_t *sio = kmem_zalloc(sizeof (*sio), KM_SLEEP); ASSERT0(BP_IS_GANG(bp)); ASSERT(MUTEX_HELD(&queue->q_vd->vdev_scan_io_queue_lock)); bp2sio(bp, sio, dva_i); sio->sio_flags = zio_flags; sio->sio_zb = *zb; /* * Increment the bytes pending counter now so that we can't * get an integer underflow in case the worker processes the * zio before we get to incrementing this counter. */ atomic_add_64(&scn->scn_bytes_pending, sio->sio_asize); scan_io_queue_insert_impl(queue, sio); } /* * Given a set of I/O parameters as discovered by the metadata traversal * process, attempts to place the I/O into the sorted queues (if allowed), * or immediately executes the I/O. */ static void dsl_scan_enqueue(dsl_pool_t *dp, const blkptr_t *bp, int zio_flags, const zbookmark_phys_t *zb) { spa_t *spa = dp->dp_spa; ASSERT(!BP_IS_EMBEDDED(bp)); /* * Gang blocks are hard to issue sequentially, so we just issue them * here immediately instead of queuing them. 
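* (As noted in dsl_scan_need_resilver() above, gang members may also span multiple top-level vdevs, so there is no single per-vdev queue that a gang block naturally belongs to.)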
*/ if (!dp->dp_scan->scn_is_sorted || BP_IS_GANG(bp)) { scan_exec_io(dp, bp, zio_flags, zb, NULL); return; } for (int i = 0; i < BP_GET_NDVAS(bp); i++) { dva_t dva; vdev_t *vdev; dva = bp->blk_dva[i]; vdev = vdev_lookup_top(spa, DVA_GET_VDEV(&dva)); ASSERT(vdev != NULL); mutex_enter(&vdev->vdev_scan_io_queue_lock); if (vdev->vdev_scan_io_queue == NULL) vdev->vdev_scan_io_queue = scan_io_queue_create(vdev); ASSERT(dp->dp_scan != NULL); scan_io_queue_insert(vdev->vdev_scan_io_queue, bp, i, zio_flags, zb); mutex_exit(&vdev->vdev_scan_io_queue_lock); } } static int dsl_scan_scrub_cb(dsl_pool_t *dp, const blkptr_t *bp, const zbookmark_phys_t *zb) { dsl_scan_t *scn = dp->dp_scan; spa_t *spa = dp->dp_spa; uint64_t phys_birth = BP_PHYSICAL_BIRTH(bp); size_t psize = BP_GET_PSIZE(bp); boolean_t needs_io; int zio_flags = ZIO_FLAG_SCAN_THREAD | ZIO_FLAG_RAW | ZIO_FLAG_CANFAIL; int d; if (phys_birth <= scn->scn_phys.scn_min_txg || phys_birth >= scn->scn_phys.scn_max_txg) { count_block(scn, dp->dp_blkstats, bp); return (0); } /* Embedded BP's have phys_birth==0, so we reject them above. */ ASSERT(!BP_IS_EMBEDDED(bp)); ASSERT(DSL_SCAN_IS_SCRUB_RESILVER(scn)); if (scn->scn_phys.scn_func == POOL_SCAN_SCRUB) { zio_flags |= ZIO_FLAG_SCRUB; needs_io = B_TRUE; } else { ASSERT3U(scn->scn_phys.scn_func, ==, POOL_SCAN_RESILVER); zio_flags |= ZIO_FLAG_RESILVER; needs_io = B_FALSE; } /* If it's an intent log block, failure is expected. */ if (zb->zb_level == ZB_ZIL_LEVEL) zio_flags |= ZIO_FLAG_SPECULATIVE; for (d = 0; d < BP_GET_NDVAS(bp); d++) { const dva_t *dva = &bp->blk_dva[d]; /* * Keep track of how much data we've examined so that * zpool(1M) status can make useful progress reports. */ scn->scn_phys.scn_examined += DVA_GET_ASIZE(dva); spa->spa_scan_pass_exam += DVA_GET_ASIZE(dva); /* if it's a resilver, this may not be in the target range */ if (!needs_io) needs_io = dsl_scan_need_resilver(spa, dva, psize, phys_birth); } if (needs_io && !zfs_no_scrub_io) { dsl_scan_enqueue(dp, bp, zio_flags, zb); } else { count_block(scn, dp->dp_blkstats, bp); } /* do not relocate this block */ return (0); } static void dsl_scan_scrub_done(zio_t *zio) { spa_t *spa = zio->io_spa; blkptr_t *bp = zio->io_bp; dsl_scan_io_queue_t *queue = zio->io_private; abd_free(zio->io_abd); if (queue == NULL) { mutex_enter(&spa->spa_scrub_lock); ASSERT3U(spa->spa_scrub_inflight, >=, BP_GET_PSIZE(bp)); spa->spa_scrub_inflight -= BP_GET_PSIZE(bp); cv_broadcast(&spa->spa_scrub_io_cv); mutex_exit(&spa->spa_scrub_lock); } else { mutex_enter(&queue->q_vd->vdev_scan_io_queue_lock); ASSERT3U(queue->q_inflight_bytes, >=, BP_GET_PSIZE(bp)); queue->q_inflight_bytes -= BP_GET_PSIZE(bp); cv_broadcast(&queue->q_zio_cv); mutex_exit(&queue->q_vd->vdev_scan_io_queue_lock); } if (zio->io_error && (zio->io_error != ECKSUM || !(zio->io_flags & ZIO_FLAG_SPECULATIVE))) { atomic_inc_64(&spa->spa_dsl_pool->dp_scan->scn_phys.scn_errors); } } /* * Given a scanning zio's information, executes the zio. The zio need * not necessarily be only sortable, this function simply executes the * zio, no matter what it is. The optional queue argument allows the * caller to specify that they want per top level vdev IO rate limiting * instead of the legacy global limiting. 
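* (Concretely, in the body below a NULL queue is throttled against the pool-wide spa_scrub_inflight counter under spa_scrub_lock, while a non-NULL queue is throttled against its own q_inflight_bytes under the per-vdev queue lock.)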
*/ static void scan_exec_io(dsl_pool_t *dp, const blkptr_t *bp, int zio_flags, const zbookmark_phys_t *zb, dsl_scan_io_queue_t *queue) { spa_t *spa = dp->dp_spa; dsl_scan_t *scn = dp->dp_scan; size_t size = BP_GET_PSIZE(bp); abd_t *data = abd_alloc_for_io(size, B_FALSE); unsigned int scan_delay = 0; if (queue == NULL) { mutex_enter(&spa->spa_scrub_lock); while (spa->spa_scrub_inflight >= scn->scn_maxinflight_bytes) cv_wait(&spa->spa_scrub_io_cv, &spa->spa_scrub_lock); spa->spa_scrub_inflight += BP_GET_PSIZE(bp); mutex_exit(&spa->spa_scrub_lock); } else { kmutex_t *q_lock = &queue->q_vd->vdev_scan_io_queue_lock; mutex_enter(q_lock); while (queue->q_inflight_bytes >= queue->q_maxinflight_bytes) cv_wait(&queue->q_zio_cv, q_lock); queue->q_inflight_bytes += BP_GET_PSIZE(bp); mutex_exit(q_lock); } if (zio_flags & ZIO_FLAG_RESILVER) scan_delay = zfs_resilver_delay; else { ASSERT(zio_flags & ZIO_FLAG_SCRUB); scan_delay = zfs_scrub_delay; } if (scan_delay && (ddi_get_lbolt64() - spa->spa_last_io <= zfs_scan_idle)) delay(MAX((int)scan_delay, 0)); count_block(dp->dp_scan, dp->dp_blkstats, bp); zio_nowait(zio_read(dp->dp_scan->scn_zio_root, spa, bp, data, size, dsl_scan_scrub_done, queue, ZIO_PRIORITY_SCRUB, zio_flags, zb)); } /* * This is the primary extent sorting algorithm. We balance two parameters: * 1) how many bytes of I/O are in an extent * 2) how well the extent is filled with I/O (as a fraction of its total size) * Since we allow extents to have gaps between their constituent I/Os, it's * possible to have a fairly large extent that contains the same amount of * I/O bytes as a much smaller extent, which just packs the I/O more tightly. * The algorithm sorts based on a score calculated from the extent's size, * the relative fill volume (in %) and a "fill weight" parameter that controls * the split between whether we prefer larger extents or more well populated * extents: * * SCORE = FILL_IN_BYTES + (FILL_IN_PERCENT * FILL_IN_BYTES * FILL_WEIGHT) * * Example: * 1) assume extsz = 64 MiB * 2) assume fill = 32 MiB (extent is half full) * 3) assume fill_weight = 3 * 4) SCORE = 32M + (((32M * 100) / 64M) * 3 * 32M) / 100 * SCORE = 32M + (50 * 3 * 32M) / 100 * SCORE = 32M + (4800M / 100) * SCORE = 32M + 48M * ^ ^ * | +--- final total relative fill-based score * +--------- final total fill-based score * SCORE = 80M * * As can be seen, at fill_weight=3, the algorithm is slightly biased towards * extents that are more completely filled (in a 3:2 ratio) vs just larger. * Note that as an optimization, we replace multiplication and division by * 100 with bitshifting by 7 (which effectively multiplies and divides by 128). */ static int ext_size_compare(const void *x, const void *y) { const range_seg_t *rsa = x, *rsb = y; uint64_t sa = rsa->rs_end - rsa->rs_start, sb = rsb->rs_end - rsb->rs_start; uint64_t score_a, score_b; score_a = rsa->rs_fill + ((((rsa->rs_fill << 7) / sa) * fill_weight * rsa->rs_fill) >> 7); score_b = rsb->rs_fill + ((((rsb->rs_fill << 7) / sb) * fill_weight * rsb->rs_fill) >> 7); if (score_a > score_b) return (-1); if (score_a == score_b) { if (rsa->rs_start < rsb->rs_start) return (-1); if (rsa->rs_start == rsb->rs_start) return (0); return (1); } return (1); } /* * Comparator for the q_sios_by_addr tree. Sorting is simply performed * based on LBA-order (from lowest to highest).
*/ static int io_addr_compare(const void *x, const void *y) { const scan_io_t *a = x, *b = y; if (a->sio_offset < b->sio_offset) return (-1); if (a->sio_offset == b->sio_offset) return (0); return (1); } /* IO queues are created on demand when they are needed. */ static dsl_scan_io_queue_t * scan_io_queue_create(vdev_t *vd) { dsl_scan_t *scn = vd->vdev_spa->spa_dsl_pool->dp_scan; dsl_scan_io_queue_t *q = kmem_zalloc(sizeof (*q), KM_SLEEP); q->q_scn = scn; q->q_vd = vd; cv_init(&q->q_zio_cv, NULL, CV_DEFAULT, NULL); q->q_exts_by_addr = range_tree_create_impl(&rt_avl_ops, &q->q_exts_by_size, ext_size_compare, zfs_scan_max_ext_gap); avl_create(&q->q_sios_by_addr, io_addr_compare, sizeof (scan_io_t), offsetof(scan_io_t, sio_nodes.sio_addr_node)); return (q); } /* * Destroys a scan queue and all segments and scan_io_t's contained in it. * No further execution of I/O occurs, anything pending in the queue is * simply freed without being executed. */ void dsl_scan_io_queue_destroy(dsl_scan_io_queue_t *queue) { dsl_scan_t *scn = queue->q_scn; scan_io_t *sio; void *cookie = NULL; int64_t bytes_dequeued = 0; ASSERT(MUTEX_HELD(&queue->q_vd->vdev_scan_io_queue_lock)); while ((sio = avl_destroy_nodes(&queue->q_sios_by_addr, &cookie)) != NULL) { ASSERT(range_tree_contains(queue->q_exts_by_addr, sio->sio_offset, sio->sio_asize)); bytes_dequeued += sio->sio_asize; kmem_free(sio, sizeof (*sio)); } atomic_add_64(&scn->scn_bytes_pending, -bytes_dequeued); range_tree_vacate(queue->q_exts_by_addr, NULL, queue); range_tree_destroy(queue->q_exts_by_addr); avl_destroy(&queue->q_sios_by_addr); cv_destroy(&queue->q_zio_cv); kmem_free(queue, sizeof (*queue)); } /* * Properly transfers a dsl_scan_queue_t from `svd' to `tvd'. This is * called on behalf of vdev_top_transfer when creating or destroying * a mirror vdev due to zpool attach/detach. */ void dsl_scan_io_queue_vdev_xfer(vdev_t *svd, vdev_t *tvd) { mutex_enter(&svd->vdev_scan_io_queue_lock); mutex_enter(&tvd->vdev_scan_io_queue_lock); VERIFY3P(tvd->vdev_scan_io_queue, ==, NULL); tvd->vdev_scan_io_queue = svd->vdev_scan_io_queue; svd->vdev_scan_io_queue = NULL; if (tvd->vdev_scan_io_queue != NULL) tvd->vdev_scan_io_queue->q_vd = tvd; mutex_exit(&tvd->vdev_scan_io_queue_lock); mutex_exit(&svd->vdev_scan_io_queue_lock); } static void scan_io_queues_destroy(dsl_scan_t *scn) { vdev_t *rvd = scn->scn_dp->dp_spa->spa_root_vdev; for (uint64_t i = 0; i < rvd->vdev_children; i++) { vdev_t *tvd = rvd->vdev_child[i]; mutex_enter(&tvd->vdev_scan_io_queue_lock); if (tvd->vdev_scan_io_queue != NULL) dsl_scan_io_queue_destroy(tvd->vdev_scan_io_queue); tvd->vdev_scan_io_queue = NULL; mutex_exit(&tvd->vdev_scan_io_queue_lock); } } static void dsl_scan_freed_dva(spa_t *spa, const blkptr_t *bp, int dva_i) { dsl_pool_t *dp = spa->spa_dsl_pool; dsl_scan_t *scn = dp->dp_scan; vdev_t *vdev; kmutex_t *q_lock; dsl_scan_io_queue_t *queue; scan_io_t srch, *sio; avl_index_t idx; uint64_t start, size; vdev = vdev_lookup_top(spa, DVA_GET_VDEV(&bp->blk_dva[dva_i])); ASSERT(vdev != NULL); q_lock = &vdev->vdev_scan_io_queue_lock; queue = vdev->vdev_scan_io_queue; mutex_enter(q_lock); if (queue == NULL) { mutex_exit(q_lock); return; } bp2sio(bp, &srch, dva_i); start = srch.sio_offset; size = srch.sio_asize; /* * We can find the zio in two states: * 1) Cold, just sitting in the queue of zio's to be issued at * some point in the future. 
In this case, all we do is * remove the zio from the q_sios_by_addr tree, decrement * its data volume from the containing range_seg_t and * resort the q_exts_by_size tree to reflect that the * range_seg_t has lost some of its 'fill'. We don't shorten * the range_seg_t - this is usually rare enough not to be * worth the extra hassle of trying to keep track of precise * extent boundaries. * 2) Hot, where the zio is currently in-flight in * dsl_scan_issue_ios. In this case, we can't simply * reach in and stop the in-flight zio's, so we instead * block the caller. Eventually, dsl_scan_issue_ios will * be done with issuing the zio's it gathered and will * signal us. */ sio = avl_find(&queue->q_sios_by_addr, &srch, &idx); if (sio != NULL) { int64_t asize = sio->sio_asize; blkptr_t tmpbp; /* Got it while it was cold in the queue */ ASSERT3U(start, ==, sio->sio_offset); ASSERT3U(size, ==, asize); avl_remove(&queue->q_sios_by_addr, sio); ASSERT(range_tree_contains(queue->q_exts_by_addr, start, size)); range_tree_remove_fill(queue->q_exts_by_addr, start, size); /* * We only update scn_bytes_pending in the cold path, * otherwise it will already have been accounted for as * part of the zio's execution. */ atomic_add_64(&scn->scn_bytes_pending, -asize); /* count the block as though we issued it */ sio2bp(sio, &tmpbp, dva_i); count_block(scn, dp->dp_blkstats, &tmpbp); kmem_free(sio, sizeof (*sio)); } mutex_exit(q_lock); } /* * Callback invoked when a zio_free() zio is executing. This needs to be * intercepted to prevent the zio from deallocating a particular portion * of disk space and it then getting reallocated and written to, while we * still have it queued up for processing. */ void dsl_scan_freed(spa_t *spa, const blkptr_t *bp) { dsl_pool_t *dp = spa->spa_dsl_pool; dsl_scan_t *scn = dp->dp_scan; ASSERT(!BP_IS_EMBEDDED(bp)); ASSERT(scn != NULL); if (!dsl_scan_is_running(scn)) return; for (int i = 0; i < BP_GET_NDVAS(bp); i++) dsl_scan_freed_dva(spa, bp, i); } Index: projects/openssl111/sys/cddl/contrib/opensolaris =================================================================== --- projects/openssl111/sys/cddl/contrib/opensolaris (revision 339239) +++ projects/openssl111/sys/cddl/contrib/opensolaris (revision 339240) Property changes on: projects/openssl111/sys/cddl/contrib/opensolaris ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head/sys/cddl/contrib/opensolaris:r339215-339239 Index: projects/openssl111/sys/dev/e1000/if_em.c =================================================================== --- projects/openssl111/sys/dev/e1000/if_em.c (revision 339239) +++ projects/openssl111/sys/dev/e1000/if_em.c (revision 339240) @@ -1,4543 +1,4541 @@ /*- * SPDX-License-Identifier: BSD-2-Clause * * Copyright (c) 2016 Nicole Graziano * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution.
* * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* $FreeBSD$ */ #include "if_em.h" #include #include #define em_mac_min e1000_82547 #define igb_mac_min e1000_82575 /********************************************************************* * Driver version: *********************************************************************/ char em_driver_version[] = "7.6.1-k"; /********************************************************************* * PCI Device ID Table * * Used by probe to select devices to load on * Last field stores an index into e1000_strings * Last entry must be all 0s * * { Vendor ID, Device ID, SubVendor ID, SubDevice ID, String Index } *********************************************************************/ static pci_vendor_info_t em_vendor_info_array[] = { /* Intel(R) PRO/1000 Network Connection - Legacy em*/ PVID(0x8086, E1000_DEV_ID_82540EM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EM_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82540EP_LP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541EI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541ER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541ER_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541EI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82541GI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82542, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82543GC_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82543GC_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544EI_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544EI_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544GC_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82544GC_LOM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545EM_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545EM_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82545GM_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546EB_COPPER, "Intel(R) PRO/1000 Network Connection"), 
PVID(0x8086, E1000_DEV_ID_82546EB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546EB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_PCIE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82547EI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82547EI_MOBILE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82547GI, "Intel(R) PRO/1000 Network Connection"), /* Intel(R) PRO/1000 Network Connection - em */ PVID(0x8086, E1000_DEV_ID_82571EB_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES_DUAL, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_SERDES_QUAD, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_COPPER_LP, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571EB_QUAD_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82571PT_QUAD_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_COPPER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_FIBER, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82572EI_SERDES, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573E, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573E_IAMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82573L, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82583V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_COPPER_SPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_SERDES_SPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_COPPER_DPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_80003ES2LAN_SERDES_DPT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_M_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_C, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE_GT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IFE_G, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_IGP_M, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH8_82567V_3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_AMT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, 
E1000_DEV_ID_ICH9_IGP_C, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IGP_M_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE_GT, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_IFE_G, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH9_BM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82574L, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_82574LA, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_R_BM_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_LF, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_ICH10_D_BM_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_M_HV_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_M_HV_LC, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_D_HV_DM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_D_HV_DC, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH2_LV_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH2_LV_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPT_I217_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPT_I217_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPTLP_I218_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LPTLP_I218_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_LM2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_V2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_LM3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_I218_V3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V2, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_LBG_I219_LM3, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM4, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V4, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_LM5, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_SPT_I219_V5, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_CNP_I219_LM6, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_CNP_I219_V6, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_CNP_I219_LM7, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_CNP_I219_V7, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_ICP_I219_LM8, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_ICP_I219_V8, "Intel(R) 
PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_ICP_I219_LM9, "Intel(R) PRO/1000 Network Connection"), PVID(0x8086, E1000_DEV_ID_PCH_ICP_I219_V9, "Intel(R) PRO/1000 Network Connection"), /* required last entry */ PVID_END }; static pci_vendor_info_t igb_vendor_info_array[] = { /* Intel(R) PRO/1000 Network Connection - igb */ PVID(0x8086, E1000_DEV_ID_82575EB_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82575EB_FIBER_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82575GB_QUAD_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_NS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_NS_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_SERDES_QUAD, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_QUAD_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_QUAD_COPPER_ET2, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82576_VF, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_COPPER_DUAL, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_82580_QUAD_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_SFP, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_DH89XXCC_BACKPLANE, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I350_VF, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_IT, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_OEM1, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_COPPER_FLASHLESS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SERDES_FLASHLESS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_FIBER, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SERDES, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I210_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I211_COPPER, "Intel(R) 
PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_BACKPLANE_1GBPS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_BACKPLANE_2_5GBPS, "Intel(R) PRO/1000 PCI-Express Network Driver"), PVID(0x8086, E1000_DEV_ID_I354_SGMII, "Intel(R) PRO/1000 PCI-Express Network Driver"), /* required last entry */ PVID_END }; /********************************************************************* * Function prototypes *********************************************************************/ static void *em_register(device_t dev); static void *igb_register(device_t dev); static int em_if_attach_pre(if_ctx_t ctx); static int em_if_attach_post(if_ctx_t ctx); static int em_if_detach(if_ctx_t ctx); static int em_if_shutdown(if_ctx_t ctx); static int em_if_suspend(if_ctx_t ctx); static int em_if_resume(if_ctx_t ctx); static int em_if_tx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int ntxqs, int ntxqsets); static int em_if_rx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int nrxqs, int nrxqsets); static void em_if_queues_free(if_ctx_t ctx); static uint64_t em_if_get_counter(if_ctx_t, ift_counter); static void em_if_init(if_ctx_t ctx); static void em_if_stop(if_ctx_t ctx); static void em_if_media_status(if_ctx_t, struct ifmediareq *); static int em_if_media_change(if_ctx_t ctx); static int em_if_mtu_set(if_ctx_t ctx, uint32_t mtu); static void em_if_timer(if_ctx_t ctx, uint16_t qid); static void em_if_vlan_register(if_ctx_t ctx, u16 vtag); static void em_if_vlan_unregister(if_ctx_t ctx, u16 vtag); static void em_identify_hardware(if_ctx_t ctx); static int em_allocate_pci_resources(if_ctx_t ctx); static void em_free_pci_resources(if_ctx_t ctx); static void em_reset(if_ctx_t ctx); static int em_setup_interface(if_ctx_t ctx); static int em_setup_msix(if_ctx_t ctx); static void em_initialize_transmit_unit(if_ctx_t ctx); static void em_initialize_receive_unit(if_ctx_t ctx); static void em_if_enable_intr(if_ctx_t ctx); static void em_if_disable_intr(if_ctx_t ctx); static int em_if_rx_queue_intr_enable(if_ctx_t ctx, uint16_t rxqid); static int em_if_tx_queue_intr_enable(if_ctx_t ctx, uint16_t txqid); static void em_if_multi_set(if_ctx_t ctx); static void em_if_update_admin_status(if_ctx_t ctx); static void em_if_debug(if_ctx_t ctx); static void em_update_stats_counters(struct adapter *); static void em_add_hw_stats(struct adapter *adapter); static int em_if_set_promisc(if_ctx_t ctx, int flags); static void em_setup_vlan_hw_support(struct adapter *); static int em_sysctl_nvm_info(SYSCTL_HANDLER_ARGS); static void em_print_nvm_info(struct adapter *); static int em_sysctl_debug_info(SYSCTL_HANDLER_ARGS); static int em_get_rs(SYSCTL_HANDLER_ARGS); static void em_print_debug_info(struct adapter *); static int em_is_valid_ether_addr(u8 *); static int em_sysctl_int_delay(SYSCTL_HANDLER_ARGS); static void em_add_int_delay_sysctl(struct adapter *, const char *, const char *, struct em_int_delay_info *, int, int); /* Management and WOL Support */ static void em_init_manageability(struct adapter *); static void em_release_manageability(struct adapter *); static void em_get_hw_control(struct adapter *); static void em_release_hw_control(struct adapter *); static void em_get_wakeup(if_ctx_t ctx); static void em_enable_wakeup(if_ctx_t ctx); static int em_enable_phy_wakeup(struct adapter *); static void em_disable_aspm(struct adapter *); int em_intr(void *arg); static void em_disable_promisc(if_ctx_t ctx); /* MSIX handlers */ static int 
em_if_msix_intr_assign(if_ctx_t, int); static int em_msix_link(void *); static void em_handle_link(void *context); static void em_enable_vectors_82574(if_ctx_t); static int em_set_flowcntl(SYSCTL_HANDLER_ARGS); static int em_sysctl_eee(SYSCTL_HANDLER_ARGS); static void em_if_led_func(if_ctx_t ctx, int onoff); static int em_get_regs(SYSCTL_HANDLER_ARGS); static void lem_smartspeed(struct adapter *adapter); static void igb_configure_queues(struct adapter *adapter); /********************************************************************* * FreeBSD Device Interface Entry Points *********************************************************************/ static device_method_t em_methods[] = { /* Device interface */ DEVMETHOD(device_register, em_register), DEVMETHOD(device_probe, iflib_device_probe), DEVMETHOD(device_attach, iflib_device_attach), DEVMETHOD(device_detach, iflib_device_detach), DEVMETHOD(device_shutdown, iflib_device_shutdown), DEVMETHOD(device_suspend, iflib_device_suspend), DEVMETHOD(device_resume, iflib_device_resume), DEVMETHOD_END }; static device_method_t igb_methods[] = { /* Device interface */ DEVMETHOD(device_register, igb_register), DEVMETHOD(device_probe, iflib_device_probe), DEVMETHOD(device_attach, iflib_device_attach), DEVMETHOD(device_detach, iflib_device_detach), DEVMETHOD(device_shutdown, iflib_device_shutdown), DEVMETHOD(device_suspend, iflib_device_suspend), DEVMETHOD(device_resume, iflib_device_resume), DEVMETHOD_END }; static driver_t em_driver = { "em", em_methods, sizeof(struct adapter), }; static devclass_t em_devclass; DRIVER_MODULE(em, pci, em_driver, em_devclass, 0, 0); MODULE_DEPEND(em, pci, 1, 1, 1); MODULE_DEPEND(em, ether, 1, 1, 1); MODULE_DEPEND(em, iflib, 1, 1, 1); IFLIB_PNP_INFO(pci, em, em_vendor_info_array); static driver_t igb_driver = { "igb", igb_methods, sizeof(struct adapter), }; static devclass_t igb_devclass; DRIVER_MODULE(igb, pci, igb_driver, igb_devclass, 0, 0); MODULE_DEPEND(igb, pci, 1, 1, 1); MODULE_DEPEND(igb, ether, 1, 1, 1); MODULE_DEPEND(igb, iflib, 1, 1, 1); IFLIB_PNP_INFO(pci, igb, igb_vendor_info_array); static device_method_t em_if_methods[] = { DEVMETHOD(ifdi_attach_pre, em_if_attach_pre), DEVMETHOD(ifdi_attach_post, em_if_attach_post), DEVMETHOD(ifdi_detach, em_if_detach), DEVMETHOD(ifdi_shutdown, em_if_shutdown), DEVMETHOD(ifdi_suspend, em_if_suspend), DEVMETHOD(ifdi_resume, em_if_resume), DEVMETHOD(ifdi_init, em_if_init), DEVMETHOD(ifdi_stop, em_if_stop), DEVMETHOD(ifdi_msix_intr_assign, em_if_msix_intr_assign), DEVMETHOD(ifdi_intr_enable, em_if_enable_intr), DEVMETHOD(ifdi_intr_disable, em_if_disable_intr), DEVMETHOD(ifdi_tx_queues_alloc, em_if_tx_queues_alloc), DEVMETHOD(ifdi_rx_queues_alloc, em_if_rx_queues_alloc), DEVMETHOD(ifdi_queues_free, em_if_queues_free), DEVMETHOD(ifdi_update_admin_status, em_if_update_admin_status), DEVMETHOD(ifdi_multi_set, em_if_multi_set), DEVMETHOD(ifdi_media_status, em_if_media_status), DEVMETHOD(ifdi_media_change, em_if_media_change), DEVMETHOD(ifdi_mtu_set, em_if_mtu_set), DEVMETHOD(ifdi_promisc_set, em_if_set_promisc), DEVMETHOD(ifdi_timer, em_if_timer), DEVMETHOD(ifdi_vlan_register, em_if_vlan_register), DEVMETHOD(ifdi_vlan_unregister, em_if_vlan_unregister), DEVMETHOD(ifdi_get_counter, em_if_get_counter), DEVMETHOD(ifdi_led_func, em_if_led_func), DEVMETHOD(ifdi_rx_queue_intr_enable, em_if_rx_queue_intr_enable), DEVMETHOD(ifdi_tx_queue_intr_enable, em_if_tx_queue_intr_enable), DEVMETHOD(ifdi_debug, em_if_debug), DEVMETHOD_END }; /* * note that if (adapter->msix_mem) is replaced by: * if 
(adapter->intr_type == IFLIB_INTR_MSIX) */ static driver_t em_if_driver = { "em_if", em_if_methods, sizeof(struct adapter) }; /********************************************************************* * Tunable default values. *********************************************************************/ #define EM_TICKS_TO_USECS(ticks) ((1024 * (ticks) + 500) / 1000) #define EM_USECS_TO_TICKS(usecs) ((1000 * (usecs) + 512) / 1024) #define MAX_INTS_PER_SEC 8000 #define DEFAULT_ITR (1000000000/(MAX_INTS_PER_SEC * 256)) /* Allow common code without TSO */ #ifndef CSUM_TSO #define CSUM_TSO 0 #endif static SYSCTL_NODE(_hw, OID_AUTO, em, CTLFLAG_RD, 0, "EM driver parameters"); static int em_disable_crc_stripping = 0; SYSCTL_INT(_hw_em, OID_AUTO, disable_crc_stripping, CTLFLAG_RDTUN, &em_disable_crc_stripping, 0, "Disable CRC Stripping"); static int em_tx_int_delay_dflt = EM_TICKS_TO_USECS(EM_TIDV); static int em_rx_int_delay_dflt = EM_TICKS_TO_USECS(EM_RDTR); SYSCTL_INT(_hw_em, OID_AUTO, tx_int_delay, CTLFLAG_RDTUN, &em_tx_int_delay_dflt, 0, "Default transmit interrupt delay in usecs"); SYSCTL_INT(_hw_em, OID_AUTO, rx_int_delay, CTLFLAG_RDTUN, &em_rx_int_delay_dflt, 0, "Default receive interrupt delay in usecs"); static int em_tx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_TADV); static int em_rx_abs_int_delay_dflt = EM_TICKS_TO_USECS(EM_RADV); SYSCTL_INT(_hw_em, OID_AUTO, tx_abs_int_delay, CTLFLAG_RDTUN, &em_tx_abs_int_delay_dflt, 0, "Default transmit interrupt delay limit in usecs"); SYSCTL_INT(_hw_em, OID_AUTO, rx_abs_int_delay, CTLFLAG_RDTUN, &em_rx_abs_int_delay_dflt, 0, "Default receive interrupt delay limit in usecs"); static int em_smart_pwr_down = FALSE; SYSCTL_INT(_hw_em, OID_AUTO, smart_pwr_down, CTLFLAG_RDTUN, &em_smart_pwr_down, 0, "Set to true to leave smart power down enabled on newer adapters"); /* Controls whether promiscuous also shows bad packets */ static int em_debug_sbp = TRUE; SYSCTL_INT(_hw_em, OID_AUTO, sbp, CTLFLAG_RDTUN, &em_debug_sbp, 0, "Show bad packets in promiscuous mode"); /* How many packets rxeof tries to clean at a time */ static int em_rx_process_limit = 100; SYSCTL_INT(_hw_em, OID_AUTO, rx_process_limit, CTLFLAG_RDTUN, &em_rx_process_limit, 0, "Maximum number of received packets to process " "at a time, -1 means unlimited"); /* Energy efficient ethernet - default to OFF */ static int eee_setting = 1; SYSCTL_INT(_hw_em, OID_AUTO, eee_setting, CTLFLAG_RDTUN, &eee_setting, 0, "Enable Energy Efficient Ethernet"); /* ** Tuneable Interrupt rate */ static int em_max_interrupt_rate = 8000; SYSCTL_INT(_hw_em, OID_AUTO, max_interrupt_rate, CTLFLAG_RDTUN, &em_max_interrupt_rate, 0, "Maximum interrupts per second"); /* Global used in WOL setup with multiport cards */ static int global_quad_port_a = 0; extern struct if_txrx igb_txrx; extern struct if_txrx em_txrx; extern struct if_txrx lem_txrx; static struct if_shared_ctx em_sctx_init = { .isc_magic = IFLIB_MAGIC, .isc_q_align = PAGE_SIZE, .isc_tx_maxsize = EM_TSO_SIZE + sizeof(struct ether_vlan_header), .isc_tx_maxsegsize = PAGE_SIZE, .isc_tso_maxsize = EM_TSO_SIZE + sizeof(struct ether_vlan_header), .isc_tso_maxsegsize = EM_TSO_SEG_SIZE, .isc_rx_maxsize = MJUM9BYTES, .isc_rx_nsegments = 1, .isc_rx_maxsegsize = MJUM9BYTES, .isc_nfl = 1, .isc_nrxqs = 1, .isc_ntxqs = 1, .isc_admin_intrcnt = 1, .isc_vendor_info = em_vendor_info_array, .isc_driver_version = em_driver_version, .isc_driver = &em_if_driver, .isc_flags = IFLIB_NEED_SCRATCH | IFLIB_TSO_INIT_IP | IFLIB_NEED_ZERO_CSUM, .isc_nrxd_min = {EM_MIN_RXD}, .isc_ntxd_min = 
{EM_MIN_TXD}, .isc_nrxd_max = {EM_MAX_RXD}, .isc_ntxd_max = {EM_MAX_TXD}, .isc_nrxd_default = {EM_DEFAULT_RXD}, .isc_ntxd_default = {EM_DEFAULT_TXD}, }; if_shared_ctx_t em_sctx = &em_sctx_init; static struct if_shared_ctx igb_sctx_init = { .isc_magic = IFLIB_MAGIC, .isc_q_align = PAGE_SIZE, .isc_tx_maxsize = EM_TSO_SIZE + sizeof(struct ether_vlan_header), .isc_tx_maxsegsize = PAGE_SIZE, .isc_tso_maxsize = EM_TSO_SIZE + sizeof(struct ether_vlan_header), .isc_tso_maxsegsize = EM_TSO_SEG_SIZE, .isc_rx_maxsize = MJUM9BYTES, .isc_rx_nsegments = 1, .isc_rx_maxsegsize = MJUM9BYTES, .isc_nfl = 1, .isc_nrxqs = 1, .isc_ntxqs = 1, .isc_admin_intrcnt = 1, .isc_vendor_info = igb_vendor_info_array, .isc_driver_version = em_driver_version, .isc_driver = &em_if_driver, .isc_flags = IFLIB_NEED_SCRATCH | IFLIB_TSO_INIT_IP | IFLIB_NEED_ZERO_CSUM, .isc_nrxd_min = {EM_MIN_RXD}, .isc_ntxd_min = {EM_MIN_TXD}, .isc_nrxd_max = {IGB_MAX_RXD}, .isc_ntxd_max = {IGB_MAX_TXD}, .isc_nrxd_default = {EM_DEFAULT_RXD}, .isc_ntxd_default = {EM_DEFAULT_TXD}, }; if_shared_ctx_t igb_sctx = &igb_sctx_init; /***************************************************************** * * Dump Registers * ****************************************************************/ #define IGB_REGS_LEN 739 static int em_get_regs(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *)arg1; struct e1000_hw *hw = &adapter->hw; struct sbuf *sb; u32 *regs_buff; int rc; regs_buff = malloc(sizeof(u32) * IGB_REGS_LEN, M_DEVBUF, M_WAITOK); memset(regs_buff, 0, IGB_REGS_LEN * sizeof(u32)); rc = sysctl_wire_old_buffer(req, 0); MPASS(rc == 0); if (rc != 0) { free(regs_buff, M_DEVBUF); return (rc); } sb = sbuf_new_for_sysctl(NULL, NULL, 32*400, req); MPASS(sb != NULL); if (sb == NULL) { free(regs_buff, M_DEVBUF); return (ENOMEM); } /* General Registers */ regs_buff[0] = E1000_READ_REG(hw, E1000_CTRL); regs_buff[1] = E1000_READ_REG(hw, E1000_STATUS); regs_buff[2] = E1000_READ_REG(hw, E1000_CTRL_EXT); regs_buff[3] = E1000_READ_REG(hw, E1000_ICR); regs_buff[4] = E1000_READ_REG(hw, E1000_RCTL); regs_buff[5] = E1000_READ_REG(hw, E1000_RDLEN(0)); regs_buff[6] = E1000_READ_REG(hw, E1000_RDH(0)); regs_buff[7] = E1000_READ_REG(hw, E1000_RDT(0)); regs_buff[8] = E1000_READ_REG(hw, E1000_RXDCTL(0)); regs_buff[9] = E1000_READ_REG(hw, E1000_RDBAL(0)); regs_buff[10] = E1000_READ_REG(hw, E1000_RDBAH(0)); regs_buff[11] = E1000_READ_REG(hw, E1000_TCTL); regs_buff[12] = E1000_READ_REG(hw, E1000_TDBAL(0)); regs_buff[13] = E1000_READ_REG(hw, E1000_TDBAH(0)); regs_buff[14] = E1000_READ_REG(hw, E1000_TDLEN(0)); regs_buff[15] = E1000_READ_REG(hw, E1000_TDH(0)); regs_buff[16] = E1000_READ_REG(hw, E1000_TDT(0)); regs_buff[17] = E1000_READ_REG(hw, E1000_TXDCTL(0)); regs_buff[18] = E1000_READ_REG(hw, E1000_TDFH); regs_buff[19] = E1000_READ_REG(hw, E1000_TDFT); regs_buff[20] = E1000_READ_REG(hw, E1000_TDFHS); regs_buff[21] = E1000_READ_REG(hw, E1000_TDFPC); sbuf_printf(sb, "General Registers\n"); sbuf_printf(sb, "\tCTRL\t %08x\n", regs_buff[0]); sbuf_printf(sb, "\tSTATUS\t %08x\n", regs_buff[1]); sbuf_printf(sb, "\tCTRL_EXIT\t %08x\n\n", regs_buff[2]); sbuf_printf(sb, "Interrupt Registers\n"); sbuf_printf(sb, "\tICR\t %08x\n\n", regs_buff[3]); sbuf_printf(sb, "RX Registers\n"); sbuf_printf(sb, "\tRCTL\t %08x\n", regs_buff[4]); sbuf_printf(sb, "\tRDLEN\t %08x\n", regs_buff[5]); sbuf_printf(sb, "\tRDH\t %08x\n", regs_buff[6]); sbuf_printf(sb, "\tRDT\t %08x\n", regs_buff[7]); sbuf_printf(sb, "\tRXDCTL\t %08x\n", regs_buff[8]); sbuf_printf(sb, "\tRDBAL\t %08x\n", regs_buff[9]); 
sbuf_printf(sb, "\tRDBAH\t %08x\n\n", regs_buff[10]); sbuf_printf(sb, "TX Registers\n"); sbuf_printf(sb, "\tTCTL\t %08x\n", regs_buff[11]); sbuf_printf(sb, "\tTDBAL\t %08x\n", regs_buff[12]); sbuf_printf(sb, "\tTDBAH\t %08x\n", regs_buff[13]); sbuf_printf(sb, "\tTDLEN\t %08x\n", regs_buff[14]); sbuf_printf(sb, "\tTDH\t %08x\n", regs_buff[15]); sbuf_printf(sb, "\tTDT\t %08x\n", regs_buff[16]); sbuf_printf(sb, "\tTXDCTL\t %08x\n", regs_buff[17]); sbuf_printf(sb, "\tTDFH\t %08x\n", regs_buff[18]); sbuf_printf(sb, "\tTDFT\t %08x\n", regs_buff[19]); sbuf_printf(sb, "\tTDFHS\t %08x\n", regs_buff[20]); sbuf_printf(sb, "\tTDFPC\t %08x\n\n", regs_buff[21]); free(regs_buff, M_DEVBUF); #ifdef DUMP_DESCS { if_softc_ctx_t scctx = adapter->shared; struct rx_ring *rxr = &rx_que->rxr; struct tx_ring *txr = &tx_que->txr; int ntxd = scctx->isc_ntxd[0]; int nrxd = scctx->isc_nrxd[0]; int j; for (j = 0; j < nrxd; j++) { u32 staterr = le32toh(rxr->rx_base[j].wb.upper.status_error); u32 length = le32toh(rxr->rx_base[j].wb.upper.length); sbuf_printf(sb, "\tReceive Descriptor Address %d: %08" PRIx64 " Error:%d Length:%d\n", j, rxr->rx_base[j].read.buffer_addr, staterr, length); } for (j = 0; j < min(ntxd, 256); j++) { unsigned int *ptr = (unsigned int *)&txr->tx_base[j]; sbuf_printf(sb, "\tTXD[%03d] [0]: %08x [1]: %08x [2]: %08x [3]: %08x eop: %d DD=%d\n", j, ptr[0], ptr[1], ptr[2], ptr[3], buf->eop, buf->eop != -1 ? txr->tx_base[buf->eop].upper.fields.status & E1000_TXD_STAT_DD : 0); } } #endif rc = sbuf_finish(sb); sbuf_delete(sb); return(rc); } static void * em_register(device_t dev) { return (em_sctx); } static void * igb_register(device_t dev) { return (igb_sctx); } static int em_set_num_queues(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); int maxqueues; /* Sanity check based on HW */ switch (adapter->hw.mac.type) { case e1000_82576: case e1000_82580: case e1000_i350: case e1000_i354: maxqueues = 8; break; case e1000_i210: case e1000_82575: maxqueues = 4; break; case e1000_i211: case e1000_82574: maxqueues = 2; break; default: maxqueues = 1; break; } return (maxqueues); } #define LEM_CAPS \ IFCAP_HWCSUM | IFCAP_VLAN_MTU | IFCAP_VLAN_HWTAGGING | \ IFCAP_VLAN_HWCSUM | IFCAP_WOL | IFCAP_VLAN_HWFILTER #define EM_CAPS \ IFCAP_HWCSUM | IFCAP_VLAN_MTU | IFCAP_VLAN_HWTAGGING | \ IFCAP_VLAN_HWCSUM | IFCAP_WOL | IFCAP_VLAN_HWFILTER | IFCAP_TSO4 | \ IFCAP_LRO | IFCAP_VLAN_HWTSO #define IGB_CAPS \ IFCAP_HWCSUM | IFCAP_VLAN_MTU | IFCAP_VLAN_HWTAGGING | \ IFCAP_VLAN_HWCSUM | IFCAP_WOL | IFCAP_VLAN_HWFILTER | IFCAP_TSO4 | \ IFCAP_LRO | IFCAP_VLAN_HWTSO | IFCAP_JUMBO_MTU | IFCAP_HWCSUM_IPV6 |\ IFCAP_TSO6 /********************************************************************* * Device initialization routine * * The attach entry point is called when the driver is being loaded. * This routine identifies the type of hardware, allocates all resources * and initializes the hardware. 
* * return 0 on success, positive on failure *********************************************************************/ static int em_if_attach_pre(if_ctx_t ctx) { struct adapter *adapter; if_softc_ctx_t scctx; device_t dev; struct e1000_hw *hw; int error = 0; INIT_DEBUGOUT("em_if_attach_pre begin"); dev = iflib_get_dev(ctx); adapter = iflib_get_softc(ctx); if (resource_disabled("em", device_get_unit(dev))) { device_printf(dev, "Disabled by device hint\n"); return (ENXIO); } adapter->ctx = adapter->osdep.ctx = ctx; adapter->dev = adapter->osdep.dev = dev; scctx = adapter->shared = iflib_get_softc_ctx(ctx); adapter->media = iflib_get_media(ctx); hw = &adapter->hw; adapter->tx_process_limit = scctx->isc_ntxd[0]; /* SYSCTL stuff */ SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "nvm", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_nvm_info, "I", "NVM Information"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "debug", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_debug_info, "I", "Debug Information"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "fc", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_set_flowcntl, "I", "Flow Control"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "reg_dump", CTLTYPE_STRING | CTLFLAG_RD, adapter, 0, em_get_regs, "A", "Dump Registers"); SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "rs_dump", CTLTYPE_INT | CTLFLAG_RW, adapter, 0, em_get_rs, "I", "Dump RS indexes"); /* Determine hardware and mac info */ em_identify_hardware(ctx); scctx->isc_msix_bar = PCIR_BAR(EM_MSIX_BAR); scctx->isc_tx_nsegments = EM_MAX_SCATTER; scctx->isc_nrxqsets_max = scctx->isc_ntxqsets_max = em_set_num_queues(ctx); device_printf(dev, "attach_pre capping queues at %d\n", scctx->isc_ntxqsets_max); if (adapter->hw.mac.type >= igb_mac_min) { int try_second_bar; scctx->isc_txqsizes[0] = roundup2(scctx->isc_ntxd[0] * sizeof(union e1000_adv_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2(scctx->isc_nrxd[0] * sizeof(union e1000_adv_rx_desc), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(union e1000_adv_tx_desc); scctx->isc_rxd_size[0] = sizeof(union e1000_adv_rx_desc); scctx->isc_txrx = &igb_txrx; scctx->isc_tx_tso_segments_max = EM_MAX_SCATTER; scctx->isc_tx_tso_size_max = EM_TSO_SIZE; scctx->isc_tx_tso_segsize_max = EM_TSO_SEG_SIZE; scctx->isc_capabilities = scctx->isc_capenable = IGB_CAPS; scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_TSO | CSUM_IP6_TCP | CSUM_IP6_UDP; if (adapter->hw.mac.type != e1000_82575) scctx->isc_tx_csum_flags |= CSUM_SCTP | CSUM_IP6_SCTP; /* ** Some new devices, as with ixgbe, now may ** use a different BAR, so we need to keep ** track of which is used. 
*/ try_second_bar = pci_read_config(dev, scctx->isc_msix_bar, 4); if (try_second_bar == 0) scctx->isc_msix_bar += 4; } else if (adapter->hw.mac.type >= em_mac_min) { scctx->isc_txqsizes[0] = roundup2(scctx->isc_ntxd[0]* sizeof(struct e1000_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2(scctx->isc_nrxd[0] * sizeof(union e1000_rx_desc_extended), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(struct e1000_tx_desc); scctx->isc_rxd_size[0] = sizeof(union e1000_rx_desc_extended); scctx->isc_txrx = &em_txrx; scctx->isc_tx_tso_segments_max = EM_MAX_SCATTER; scctx->isc_tx_tso_size_max = EM_TSO_SIZE; scctx->isc_tx_tso_segsize_max = EM_TSO_SEG_SIZE; scctx->isc_capabilities = scctx->isc_capenable = EM_CAPS; /* * For EM-class devices, don't enable IFCAP_{TSO4,VLAN_HWTSO} * by default as we don't have workarounds for all associated * silicon errata. E.g., with several MACs such as 82573E, * TSO only works at Gigabit speed and otherwise can cause the * hardware to hang (which also would be next to impossible to * work around given that already queued TSO-using descriptors * would need to be flushed and vlan(4) reconfigured at runtime * in case of a link speed change). Moreover, MACs like 82579 * still can hang at Gigabit even with all publicly documented * TSO workarounds implemented. Generally, the penalty of * these workarounds is rather high and may involve copying * mbuf data around, so the advantages of TSO lapse. Still, TSO may * work for a few MACs of this class - at least when sticking * with Gigabit - in which case users may enable TSO manually. */ scctx->isc_capenable &= ~(IFCAP_TSO4 | IFCAP_VLAN_HWTSO); scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO; } else { scctx->isc_txqsizes[0] = roundup2((scctx->isc_ntxd[0] + 1) * sizeof(struct e1000_tx_desc), EM_DBA_ALIGN); scctx->isc_rxqsizes[0] = roundup2((scctx->isc_nrxd[0] + 1) * sizeof(struct e1000_rx_desc), EM_DBA_ALIGN); scctx->isc_txd_size[0] = sizeof(struct e1000_tx_desc); scctx->isc_rxd_size[0] = sizeof(struct e1000_rx_desc); scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP; scctx->isc_txrx = &lem_txrx; scctx->isc_capabilities = scctx->isc_capenable = LEM_CAPS; if (adapter->hw.mac.type < e1000_82543) scctx->isc_capenable &= ~(IFCAP_HWCSUM|IFCAP_VLAN_HWCSUM); scctx->isc_msix_bar = 0; } /* Setup PCI resources */ if (em_allocate_pci_resources(ctx)) { device_printf(dev, "Allocation of PCI resources failed\n"); error = ENXIO; goto err_pci; } /* ** For ICH8 and family we need to ** map the flash memory, and this ** must happen after the MAC is ** identified */ if ((hw->mac.type == e1000_ich8lan) || (hw->mac.type == e1000_ich9lan) || (hw->mac.type == e1000_ich10lan) || (hw->mac.type == e1000_pchlan) || (hw->mac.type == e1000_pch2lan) || (hw->mac.type == e1000_pch_lpt)) { int rid = EM_BAR_TYPE_FLASH; adapter->flash = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->flash == NULL) { device_printf(dev, "Mapping of Flash failed\n"); error = ENXIO; goto err_pci; } /* This is used in the shared code */ hw->flash_address = (u8 *)adapter->flash; adapter->osdep.flash_bus_space_tag = rman_get_bustag(adapter->flash); adapter->osdep.flash_bus_space_handle = rman_get_bushandle(adapter->flash); } /* ** In the new SPT device flash is not a ** separate BAR, rather it is also in BAR0, ** so use the same tag and an offset handle for the ** FLASH read/write macros in the shared code.
*/ else if (hw->mac.type >= e1000_pch_spt) { adapter->osdep.flash_bus_space_tag = adapter->osdep.mem_bus_space_tag; adapter->osdep.flash_bus_space_handle = adapter->osdep.mem_bus_space_handle + E1000_FLASH_BASE_ADDR; } /* Do Shared Code initialization */ error = e1000_setup_init_funcs(hw, TRUE); if (error) { device_printf(dev, "Setup of Shared code failed, error %d\n", error); error = ENXIO; goto err_pci; } em_setup_msix(ctx); e1000_get_bus_info(hw); /* Set up some sysctls for the tunable interrupt delays */ em_add_int_delay_sysctl(adapter, "rx_int_delay", "receive interrupt delay in usecs", &adapter->rx_int_delay, E1000_REGISTER(hw, E1000_RDTR), em_rx_int_delay_dflt); em_add_int_delay_sysctl(adapter, "tx_int_delay", "transmit interrupt delay in usecs", &adapter->tx_int_delay, E1000_REGISTER(hw, E1000_TIDV), em_tx_int_delay_dflt); em_add_int_delay_sysctl(adapter, "rx_abs_int_delay", "receive interrupt delay limit in usecs", &adapter->rx_abs_int_delay, E1000_REGISTER(hw, E1000_RADV), em_rx_abs_int_delay_dflt); em_add_int_delay_sysctl(adapter, "tx_abs_int_delay", "transmit interrupt delay limit in usecs", &adapter->tx_abs_int_delay, E1000_REGISTER(hw, E1000_TADV), em_tx_abs_int_delay_dflt); em_add_int_delay_sysctl(adapter, "itr", "interrupt delay limit in usecs/4", &adapter->tx_itr, E1000_REGISTER(hw, E1000_ITR), DEFAULT_ITR); hw->mac.autoneg = DO_AUTO_NEG; hw->phy.autoneg_wait_to_complete = FALSE; hw->phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; if (adapter->hw.mac.type < em_mac_min) { e1000_init_script_state_82541(&adapter->hw, TRUE); e1000_set_tbi_compatibility_82543(&adapter->hw, TRUE); } /* Copper options */ if (hw->phy.media_type == e1000_media_type_copper) { hw->phy.mdix = AUTO_ALL_MODES; hw->phy.disable_polarity_correction = FALSE; hw->phy.ms_type = EM_MASTER_SLAVE; } /* * Set the frame limits assuming * standard ethernet sized frames. */ scctx->isc_max_frame_size = adapter->hw.mac.max_frame_size = ETHERMTU + ETHER_HDR_LEN + ETHERNET_FCS_SIZE; /* * This controls when hardware reports transmit completion * status. */ hw->mac.report_tx_early = 1; /* Allocate multicast array memory. */ adapter->mta = malloc(sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES, M_DEVBUF, M_NOWAIT); if (adapter->mta == NULL) { device_printf(dev, "Can not allocate multicast setup array\n"); error = ENOMEM; goto err_late; } /* Check SOL/IDER usage */ if (e1000_check_reset_block(hw)) device_printf(dev, "PHY reset is blocked" " due to SOL/IDER session.\n"); /* Sysctl for setting Energy Efficient Ethernet */ hw->dev_spec.ich8lan.eee_disable = eee_setting; SYSCTL_ADD_PROC(device_get_sysctl_ctx(dev), SYSCTL_CHILDREN(device_get_sysctl_tree(dev)), OID_AUTO, "eee_control", CTLTYPE_INT|CTLFLAG_RW, adapter, 0, em_sysctl_eee, "I", "Disable Energy Efficient Ethernet"); /* ** Start from a known state, this is ** important in reading the nvm and ** mac from that. */ e1000_reset_hw(hw); /* Make sure we have a good EEPROM before we read from it */ if (e1000_validate_nvm_checksum(hw) < 0) { /* ** Some PCI-E parts fail the first check due to ** the link being in sleep state, call it again, ** if it fails a second time its a real issue. 
*/ if (e1000_validate_nvm_checksum(hw) < 0) { device_printf(dev, "The EEPROM Checksum Is Not Valid\n"); error = EIO; goto err_late; } } /* Copy the permanent MAC address out of the EEPROM */ if (e1000_read_mac_addr(hw) < 0) { device_printf(dev, "EEPROM read error while reading MAC" " address\n"); error = EIO; goto err_late; } if (!em_is_valid_ether_addr(hw->mac.addr)) { device_printf(dev, "Invalid MAC address\n"); error = EIO; goto err_late; } /* Disable ULP support */ e1000_disable_ulp_lpt_lp(hw, TRUE); /* * Get Wake-on-Lan and Management info for later use */ em_get_wakeup(ctx); /* Enable only WOL MAGIC by default */ scctx->isc_capenable &= ~IFCAP_WOL; if (adapter->wol != 0) scctx->isc_capenable |= IFCAP_WOL_MAGIC; iflib_set_mac(ctx, hw->mac.addr); return (0); err_late: em_release_hw_control(adapter); err_pci: em_free_pci_resources(ctx); free(adapter->mta, M_DEVBUF); return (error); } static int em_if_attach_post(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; int error = 0; /* Setup OS specific network interface */ error = em_setup_interface(ctx); if (error != 0) { goto err_late; } em_reset(ctx); /* Initialize statistics */ em_update_stats_counters(adapter); hw->mac.get_link_status = 1; em_if_update_admin_status(ctx); em_add_hw_stats(adapter); /* Non-AMT based hardware can now take control from firmware */ if (adapter->has_manage && !adapter->has_amt) em_get_hw_control(adapter); INIT_DEBUGOUT("em_if_attach_post: end"); return (error); err_late: em_release_hw_control(adapter); em_free_pci_resources(ctx); em_if_queues_free(ctx); free(adapter->mta, M_DEVBUF); return (error); } /********************************************************************* * Device removal routine * * The detach entry point is called when the driver is being removed. * This routine stops the adapter and deallocates all the resources * that were allocated for driver operation. * * return 0 on success, positive on failure *********************************************************************/ static int em_if_detach(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); INIT_DEBUGOUT("em_detach: begin"); e1000_phy_hw_reset(&adapter->hw); em_release_manageability(adapter); em_release_hw_control(adapter); em_free_pci_resources(ctx); return (0); } /********************************************************************* * * Shutdown entry point * **********************************************************************/ static int em_if_shutdown(if_ctx_t ctx) { return em_if_suspend(ctx); } /* * Suspend/resume device methods. 
*/ static int em_if_suspend(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); em_release_manageability(adapter); em_release_hw_control(adapter); em_enable_wakeup(ctx); return (0); } static int em_if_resume(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if (adapter->hw.mac.type == e1000_pch2lan) e1000_resume_workarounds_pchlan(&adapter->hw); em_if_init(ctx); em_init_manageability(adapter); return(0); } static int em_if_mtu_set(if_ctx_t ctx, uint32_t mtu) { int max_frame_size; struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = iflib_get_softc_ctx(ctx); IOCTL_DEBUGOUT("ioctl rcv'd: SIOCSIFMTU (Set Interface MTU)"); switch (adapter->hw.mac.type) { case e1000_82571: case e1000_82572: case e1000_ich9lan: case e1000_ich10lan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_pch_cnp: case e1000_82574: case e1000_82583: case e1000_80003es2lan: /* 9K Jumbo Frame size */ max_frame_size = 9234; break; case e1000_pchlan: max_frame_size = 4096; break; case e1000_82542: case e1000_ich8lan: /* Adapters that do not support jumbo frames */ max_frame_size = ETHER_MAX_LEN; break; default: if (adapter->hw.mac.type >= igb_mac_min) max_frame_size = 9234; else /* lem */ max_frame_size = MAX_JUMBO_FRAME_SIZE; } if (mtu > max_frame_size - ETHER_HDR_LEN - ETHER_CRC_LEN) { return (EINVAL); } scctx->isc_max_frame_size = adapter->hw.mac.max_frame_size = mtu + ETHER_HDR_LEN + ETHER_CRC_LEN; return (0); } /********************************************************************* * Init entry point * * This routine is used in two ways. It is used by the stack as * init entry point in network interface structure. It is also used * by the driver as a hw/sw initialization routine to get to a * consistent state. * * return 0 on success, positive on failure **********************************************************************/ static void em_if_init(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); struct em_tx_queue *tx_que; int i; INIT_DEBUGOUT("em_if_init: begin"); /* Get the latest mac address, User can use a LAA */ bcopy(if_getlladdr(ifp), adapter->hw.mac.addr, ETHER_ADDR_LEN); /* Put the address into the Receive Address Array */ e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); /* * With the 82571 adapter, RAR[0] may be overwritten * when the other port is reset, we make a duplicate * in RAR[14] for that eventuality, this assures * the interface continues to function. 
*/ if (adapter->hw.mac.type == e1000_82571) { e1000_set_laa_state_82571(&adapter->hw, TRUE); e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, E1000_RAR_ENTRIES - 1); } /* Initialize the hardware */ em_reset(ctx); em_if_update_admin_status(ctx); for (i = 0, tx_que = adapter->tx_queues; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; txr->tx_rs_cidx = txr->tx_rs_pidx = txr->tx_cidx_processed = 0; } /* Setup VLAN support, basic and offload if available */ E1000_WRITE_REG(&adapter->hw, E1000_VET, ETHERTYPE_VLAN); /* Clear bad data from Rx FIFOs */ if (adapter->hw.mac.type >= igb_mac_min) e1000_rx_fifo_flush_82575(&adapter->hw); /* Configure for OS presence */ em_init_manageability(adapter); /* Prepare transmit descriptors and buffers */ em_initialize_transmit_unit(ctx); /* Setup Multicast table */ em_if_multi_set(ctx); /* * Figure out the desired mbuf * pool for doing jumbos */ if (adapter->hw.mac.max_frame_size <= 2048) adapter->rx_mbuf_sz = MCLBYTES; #ifndef CONTIGMALLOC_WORKS else adapter->rx_mbuf_sz = MJUMPAGESIZE; #else else if (adapter->hw.mac.max_frame_size <= 4096) adapter->rx_mbuf_sz = MJUMPAGESIZE; else adapter->rx_mbuf_sz = MJUM9BYTES; #endif em_initialize_receive_unit(ctx); /* Use real VLAN Filter support? */ if (if_getcapenable(ifp) & IFCAP_VLAN_HWTAGGING) { if (if_getcapenable(ifp) & IFCAP_VLAN_HWFILTER) /* Use real VLAN Filter support */ em_setup_vlan_hw_support(adapter); else { u32 ctrl; ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= E1000_CTRL_VME; E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); } } /* Don't lose promiscuous settings */ em_if_set_promisc(ctx, IFF_PROMISC); e1000_clear_hw_cntrs_base_generic(&adapter->hw); /* MSI/X configuration for 82574 */ if (adapter->hw.mac.type == e1000_82574) { int tmp = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); tmp |= E1000_CTRL_EXT_PBA_CLR; E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, tmp); /* Set the IVAR - interrupt vector routing. */ E1000_WRITE_REG(&adapter->hw, E1000_IVAR, adapter->ivars); } else if (adapter->intr_type == IFLIB_INTR_MSIX) /* Set up queue routing */ igb_configure_queues(adapter); /* this clears any pending interrupts */ E1000_READ_REG(&adapter->hw, E1000_ICR); E1000_WRITE_REG(&adapter->hw, E1000_ICS, E1000_ICS_LSC); /* AMT based hardware can now take control from firmware */ if (adapter->has_manage && adapter->has_amt) em_get_hw_control(adapter); /* Set Energy Efficient Ethernet */ if (adapter->hw.mac.type >= igb_mac_min && adapter->hw.phy.media_type == e1000_media_type_copper) { if (adapter->hw.mac.type == e1000_i354) e1000_set_eee_i354(&adapter->hw, TRUE, TRUE); else e1000_set_eee_i350(&adapter->hw, TRUE, TRUE); } } /********************************************************************* * * Fast Legacy/MSI Combined Interrupt Service routine * *********************************************************************/ int em_intr(void *arg) { struct adapter *adapter = arg; if_ctx_t ctx = adapter->ctx; u32 reg_icr; reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (adapter->intr_type != IFLIB_INTR_LEGACY) goto skip_stray; /* Hot eject? */ if (reg_icr == 0xffffffff) return FILTER_STRAY; /* Definitely not our interrupt. */ if (reg_icr == 0x0) return FILTER_STRAY; /* * Starting with the 82571 chip, bit 31 should be used to * determine whether the interrupt belongs to us. 
*/ if (adapter->hw.mac.type >= e1000_82571 && (reg_icr & E1000_ICR_INT_ASSERTED) == 0) return FILTER_STRAY; skip_stray: /* Link status change */ if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { adapter->hw.mac.get_link_status = 1; iflib_admin_intr_deferred(ctx); } if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; return (FILTER_SCHEDULE_THREAD); } static void igb_rx_enable_queue(struct adapter *adapter, struct em_rx_queue *rxq) { E1000_WRITE_REG(&adapter->hw, E1000_EIMS, rxq->eims); } static void em_rx_enable_queue(struct adapter *adapter, struct em_rx_queue *rxq) { E1000_WRITE_REG(&adapter->hw, E1000_IMS, rxq->eims); } static void igb_tx_enable_queue(struct adapter *adapter, struct em_tx_queue *txq) { E1000_WRITE_REG(&adapter->hw, E1000_EIMS, txq->eims); } static void em_tx_enable_queue(struct adapter *adapter, struct em_tx_queue *txq) { E1000_WRITE_REG(&adapter->hw, E1000_IMS, txq->eims); } static int em_if_rx_queue_intr_enable(if_ctx_t ctx, uint16_t rxqid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *rxq = &adapter->rx_queues[rxqid]; if (adapter->hw.mac.type >= igb_mac_min) igb_rx_enable_queue(adapter, rxq); else em_rx_enable_queue(adapter, rxq); return (0); } static int em_if_tx_queue_intr_enable(if_ctx_t ctx, uint16_t txqid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_tx_queue *txq = &adapter->tx_queues[txqid]; if (adapter->hw.mac.type >= igb_mac_min) igb_tx_enable_queue(adapter, txq); else em_tx_enable_queue(adapter, txq); return (0); } /********************************************************************* * * MSIX RX Interrupt Service routine * **********************************************************************/ static int em_msix_que(void *arg) { struct em_rx_queue *que = arg; ++que->irqs; return (FILTER_SCHEDULE_THREAD); } /********************************************************************* * * MSIX Link Fast Interrupt Service routine * **********************************************************************/ static int em_msix_link(void *arg) { struct adapter *adapter = arg; u32 reg_icr; ++adapter->link_irq; MPASS(adapter->hw.back != NULL); reg_icr = E1000_READ_REG(&adapter->hw, E1000_ICR); if (reg_icr & E1000_ICR_RXO) adapter->rx_overruns++; if (reg_icr & (E1000_ICR_RXSEQ | E1000_ICR_LSC)) { em_handle_link(adapter->ctx); } else { E1000_WRITE_REG(&adapter->hw, E1000_IMS, EM_MSIX_LINK | E1000_IMS_LSC); if (adapter->hw.mac.type >= igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_EIMS, adapter->link_mask); } /* * Because we must read the ICR for this interrupt * it may clear other causes using autoclear, for * this reason we simply create a soft interrupt * for all these vectors. */ if (reg_icr && adapter->hw.mac.type < igb_mac_min) { E1000_WRITE_REG(&adapter->hw, E1000_ICS, adapter->ims); } return (FILTER_HANDLED); } static void em_handle_link(void *context) { if_ctx_t ctx = context; struct adapter *adapter = iflib_get_softc(ctx); adapter->hw.mac.get_link_status = 1; iflib_admin_intr_deferred(ctx); } /********************************************************************* * * Media Ioctl callback * * This routine is called whenever the user queries the status of * the interface using ifconfig. 
* **********************************************************************/ static void em_if_media_status(if_ctx_t ctx, struct ifmediareq *ifmr) { struct adapter *adapter = iflib_get_softc(ctx); u_char fiber_type = IFM_1000_SX; INIT_DEBUGOUT("em_if_media_status: begin"); iflib_admin_intr_deferred(ctx); ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; if (!adapter->link_active) { return; } ifmr->ifm_status |= IFM_ACTIVE; if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmr->ifm_active |= fiber_type | IFM_FDX; } else { switch (adapter->link_speed) { case 10: ifmr->ifm_active |= IFM_10_T; break; case 100: ifmr->ifm_active |= IFM_100_TX; break; case 1000: ifmr->ifm_active |= IFM_1000_T; break; } if (adapter->link_duplex == FULL_DUPLEX) ifmr->ifm_active |= IFM_FDX; else ifmr->ifm_active |= IFM_HDX; } } /********************************************************************* * * Media Ioctl callback * * This routine is called when the user changes speed/duplex using * media/mediopt option with ifconfig. * **********************************************************************/ static int em_if_media_change(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifmedia *ifm = iflib_get_media(ctx); INIT_DEBUGOUT("em_if_media_change: begin"); if (IFM_TYPE(ifm->ifm_media) != IFM_ETHER) return (EINVAL); switch (IFM_SUBTYPE(ifm->ifm_media)) { case IFM_AUTO: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = AUTONEG_ADV_DEFAULT; break; case IFM_1000_LX: case IFM_1000_SX: case IFM_1000_T: adapter->hw.mac.autoneg = DO_AUTO_NEG; adapter->hw.phy.autoneg_advertised = ADVERTISE_1000_FULL; break; case IFM_100_TX: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_100_HALF; break; case IFM_10_T: adapter->hw.mac.autoneg = FALSE; adapter->hw.phy.autoneg_advertised = 0; if ((ifm->ifm_media & IFM_GMASK) == IFM_FDX) adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_FULL; else adapter->hw.mac.forced_speed_duplex = ADVERTISE_10_HALF; break; default: device_printf(adapter->dev, "Unsupported media type\n"); } em_if_init(ctx); return (0); } static int em_if_set_promisc(if_ctx_t ctx, int flags) { struct adapter *adapter = iflib_get_softc(ctx); u32 reg_rctl; em_disable_promisc(ctx); reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (flags & IFF_PROMISC) { reg_rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE); /* Turn this on if you want to see bad packets */ if (em_debug_sbp) reg_rctl |= E1000_RCTL_SBP; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else if (flags & IFF_ALLMULTI) { reg_rctl |= E1000_RCTL_MPE; reg_rctl &= ~E1000_RCTL_UPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } return (0); } static void em_disable_promisc(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); u32 reg_rctl; int mcnt = 0; reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= (~E1000_RCTL_UPE); if (if_getflags(ifp) & IFF_ALLMULTI) mcnt = MAX_NUM_MULTICAST_ADDRESSES; else mcnt = if_multiaddr_count(ifp, MAX_NUM_MULTICAST_ADDRESSES); /* Don't disable if in MAX groups */ if (mcnt < MAX_NUM_MULTICAST_ADDRESSES) reg_rctl &= (~E1000_RCTL_MPE); reg_rctl &= (~E1000_RCTL_SBP); 
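/* Note: the adjusted RCTL value is written back just below; UPE and SBP are always cleared here, while MPE is cleared only when the multicast filter table is not saturated. */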
E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } /********************************************************************* * Multicast Update * * This routine is called whenever multicast address list is updated. * **********************************************************************/ static void em_if_multi_set(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); u32 reg_rctl = 0; u8 *mta; /* Multicast array memory */ int mcnt = 0; IOCTL_DEBUGOUT("em_set_multi: begin"); mta = adapter->mta; bzero(mta, sizeof(u8) * ETH_ADDR_LEN * MAX_NUM_MULTICAST_ADDRESSES); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_clear_mwi(&adapter->hw); reg_rctl |= E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); } if_multiaddr_array(ifp, mta, &mcnt, MAX_NUM_MULTICAST_ADDRESSES); if (mcnt >= MAX_NUM_MULTICAST_ADDRESSES) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); } else e1000_update_mc_addr_list(&adapter->hw, mta, mcnt); if (adapter->hw.mac.type == e1000_82542 && adapter->hw.revision_id == E1000_REVISION_2) { reg_rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); reg_rctl &= ~E1000_RCTL_RST; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, reg_rctl); msec_delay(5); if (adapter->hw.bus.pci_cmd_word & CMD_MEM_WRT_INVALIDATE) e1000_pci_set_mwi(&adapter->hw); } } /********************************************************************* * Timer routine * * This routine checks for link status and updates statistics. * **********************************************************************/ static void em_if_timer(if_ctx_t ctx, uint16_t qid) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *que; int i; int trigger = 0; if (qid != 0) return; iflib_admin_intr_deferred(ctx); /* Mask to use in the irq trigger */ if (adapter->intr_type == IFLIB_INTR_MSIX) { for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) trigger |= que->eims; } else { trigger = E1000_ICS_RXDMT0; } } static void em_if_update_admin_status(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; device_t dev = iflib_get_dev(ctx); u32 link_check, thstat, ctrl; link_check = thstat = ctrl = 0; /* Get the cached link value or read phy for real */ switch (hw->phy.media_type) { case e1000_media_type_copper: if (hw->mac.get_link_status) { if (hw->mac.type == e1000_pch_spt) msec_delay(50); /* Do the work to read phy */ e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; if (link_check) /* ESB2 fix */ e1000_cfg_on_link_up(hw); } else { link_check = TRUE; } break; case e1000_media_type_fiber: e1000_check_for_link(hw); link_check = (E1000_READ_REG(hw, E1000_STATUS) & E1000_STATUS_LU); break; case e1000_media_type_internal_serdes: e1000_check_for_link(hw); link_check = adapter->hw.mac.serdes_has_link; break; /* VF device is type_unknown */ case e1000_media_type_unknown: e1000_check_for_link(hw); link_check = !hw->mac.get_link_status; /* FALLTHROUGH */ default: break; } /* Check for thermal downshift or shutdown */ if (hw->mac.type == e1000_i350) { thstat = E1000_READ_REG(hw, E1000_THSTAT); ctrl = E1000_READ_REG(hw, E1000_CTRL_EXT); } /* Now check for a transition */ if (link_check && (adapter->link_active == 0)) { 
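/* Link has just come up: latch the negotiated speed/duplex, apply the speed-dependent TARC and PHY workarounds below, then report the new state to iflib. */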
e1000_get_speed_and_duplex(hw, &adapter->link_speed, &adapter->link_duplex); /* Check if we must disable SPEED_MODE bit on PCI-E */ if ((adapter->link_speed != SPEED_1000) && ((hw->mac.type == e1000_82571) || (hw->mac.type == e1000_82572))) { int tarc0; tarc0 = E1000_READ_REG(hw, E1000_TARC(0)); tarc0 &= ~TARC_SPEED_MODE_BIT; E1000_WRITE_REG(hw, E1000_TARC(0), tarc0); } if (bootverbose) device_printf(dev, "Link is up %d Mbps %s\n", adapter->link_speed, ((adapter->link_duplex == FULL_DUPLEX) ? "Full Duplex" : "Half Duplex")); adapter->link_active = 1; adapter->smartspeed = 0; if ((ctrl & E1000_CTRL_EXT_LINK_MODE_MASK) == E1000_CTRL_EXT_LINK_MODE_GMII && (thstat & E1000_THSTAT_LINK_THROTTLE)) device_printf(dev, "Link: thermal downshift\n"); /* Delay Link Up for Phy update */ if (((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211)) && (hw->phy.id == I210_I_PHY_ID)) msec_delay(I210_LINK_DELAY); /* Reset if the media type changed. */ if ((hw->dev_spec._82575.media_changed) && (adapter->hw.mac.type >= igb_mac_min)) { hw->dev_spec._82575.media_changed = false; adapter->flags |= IGB_MEDIA_RESET; em_reset(ctx); } iflib_link_state_change(ctx, LINK_STATE_UP, IF_Mbps(adapter->link_speed)); - printf("Link state changed to up\n"); } else if (!link_check && (adapter->link_active == 1)) { adapter->link_speed = 0; adapter->link_duplex = 0; adapter->link_active = 0; iflib_link_state_change(ctx, LINK_STATE_DOWN, 0); - printf("Link state changed to down\n"); } em_update_stats_counters(adapter); /* Reset LAA into RAR[0] on 82571 */ if ((adapter->hw.mac.type == e1000_82571) && e1000_get_laa_state_82571(&adapter->hw)) e1000_rar_set(&adapter->hw, adapter->hw.mac.addr, 0); if (adapter->hw.mac.type < em_mac_min) lem_smartspeed(adapter); E1000_WRITE_REG(&adapter->hw, E1000_IMS, EM_MSIX_LINK | E1000_IMS_LSC); } /********************************************************************* * * This routine disables all traffic on the adapter by issuing a * global reset on the MAC and deallocates TX/RX buffers. * * This routine should always be called with BOTH the CORE * and TX locks. **********************************************************************/ static void em_if_stop(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); INIT_DEBUGOUT("em_stop: begin"); e1000_reset_hw(&adapter->hw); if (adapter->hw.mac.type >= e1000_82544) E1000_WRITE_REG(&adapter->hw, E1000_WUFC, 0); e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } /********************************************************************* * * Determine hardware revision. 
* **********************************************************************/ static void em_identify_hardware(if_ctx_t ctx) { device_t dev = iflib_get_dev(ctx); struct adapter *adapter = iflib_get_softc(ctx); /* Make sure our PCI config space has the necessary stuff set */ adapter->hw.bus.pci_cmd_word = pci_read_config(dev, PCIR_COMMAND, 2); /* Save off the information about this board */ adapter->hw.vendor_id = pci_get_vendor(dev); adapter->hw.device_id = pci_get_device(dev); adapter->hw.revision_id = pci_read_config(dev, PCIR_REVID, 1); adapter->hw.subsystem_vendor_id = pci_read_config(dev, PCIR_SUBVEND_0, 2); adapter->hw.subsystem_device_id = pci_read_config(dev, PCIR_SUBDEV_0, 2); /* Do Shared Code Init and Setup */ if (e1000_set_mac_type(&adapter->hw)) { device_printf(dev, "Setup init failure\n"); return; } } static int em_allocate_pci_resources(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); int rid, val; rid = PCIR_BAR(0); adapter->memory = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE); if (adapter->memory == NULL) { device_printf(dev, "Unable to allocate bus resource: memory\n"); return (ENXIO); } adapter->osdep.mem_bus_space_tag = rman_get_bustag(adapter->memory); adapter->osdep.mem_bus_space_handle = rman_get_bushandle(adapter->memory); adapter->hw.hw_addr = (u8 *)&adapter->osdep.mem_bus_space_handle; /* Only older adapters use IO mapping */ if (adapter->hw.mac.type < em_mac_min && adapter->hw.mac.type > e1000_82543) { /* Figure our where our IO BAR is ? */ for (rid = PCIR_BAR(0); rid < PCIR_CIS;) { val = pci_read_config(dev, rid, 4); if (EM_BAR_TYPE(val) == EM_BAR_TYPE_IO) { adapter->io_rid = rid; break; } rid += 4; /* check for 64bit BAR */ if (EM_BAR_MEM_TYPE(val) == EM_BAR_MEM_TYPE_64BIT) rid += 4; } if (rid >= PCIR_CIS) { device_printf(dev, "Unable to locate IO BAR\n"); return (ENXIO); } adapter->ioport = bus_alloc_resource_any(dev, SYS_RES_IOPORT, &adapter->io_rid, RF_ACTIVE); if (adapter->ioport == NULL) { device_printf(dev, "Unable to allocate bus resource: " "ioport\n"); return (ENXIO); } adapter->hw.io_base = 0; adapter->osdep.io_bus_space_tag = rman_get_bustag(adapter->ioport); adapter->osdep.io_bus_space_handle = rman_get_bushandle(adapter->ioport); } adapter->hw.back = &adapter->osdep; return (0); } /********************************************************************* * * Setup the MSIX Interrupt handlers * **********************************************************************/ static int em_if_msix_intr_assign(if_ctx_t ctx, int msix) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *rx_que = adapter->rx_queues; struct em_tx_queue *tx_que = adapter->tx_queues; int error, rid, i, vector = 0, rx_vectors; char buf[16]; /* First set up ring resources */ for (i = 0; i < adapter->rx_num_queues; i++, rx_que++, vector++) { rid = vector + 1; snprintf(buf, sizeof(buf), "rxq%d", i); error = iflib_irq_alloc_generic(ctx, &rx_que->que_irq, rid, IFLIB_INTR_RXTX, em_msix_que, rx_que, rx_que->me, buf); if (error) { device_printf(iflib_get_dev(ctx), "Failed to allocate que int %d err: %d", i, error); adapter->rx_num_queues = i + 1; goto fail; } rx_que->msix = vector; /* * Set the bit to enable interrupt * in E1000_IMS -- bits 20 and 21 * are for RX0 and RX1, note this has * NOTHING to do with the MSIX vector */ if (adapter->hw.mac.type == e1000_82574) { rx_que->eims = 1 << (20 + i); adapter->ims |= rx_que->eims; adapter->ivars |= (8 | rx_que->msix) << (i * 4); } else if (adapter->hw.mac.type == 
e1000_82575) rx_que->eims = E1000_EICR_TX_QUEUE0 << vector; else rx_que->eims = 1 << vector; } rx_vectors = vector; vector = 0; for (i = 0; i < adapter->tx_num_queues; i++, tx_que++, vector++) { snprintf(buf, sizeof(buf), "txq%d", i); tx_que = &adapter->tx_queues[i]; iflib_softirq_alloc_generic(ctx, &adapter->rx_queues[i % adapter->rx_num_queues].que_irq, IFLIB_INTR_TX, tx_que, tx_que->me, buf); tx_que->msix = (vector % adapter->tx_num_queues); /* * Set the bit to enable interrupt * in E1000_IMS -- bits 22 and 23 * are for TX0 and TX1, note this has * NOTHING to do with the MSIX vector */ if (adapter->hw.mac.type == e1000_82574) { tx_que->eims = 1 << (22 + i); adapter->ims |= tx_que->eims; adapter->ivars |= (8 | tx_que->msix) << (8 + (i * 4)); } else if (adapter->hw.mac.type == e1000_82575) { tx_que->eims = E1000_EICR_TX_QUEUE0 << (i % adapter->tx_num_queues); } else { tx_que->eims = 1 << (i % adapter->tx_num_queues); } } /* Link interrupt */ rid = rx_vectors + 1; error = iflib_irq_alloc_generic(ctx, &adapter->irq, rid, IFLIB_INTR_ADMIN, em_msix_link, adapter, 0, "aq"); if (error) { device_printf(iflib_get_dev(ctx), "Failed to register admin handler"); goto fail; } adapter->linkvec = rx_vectors; if (adapter->hw.mac.type < igb_mac_min) { adapter->ivars |= (8 | rx_vectors) << 16; adapter->ivars |= 0x80000000; } return (0); fail: iflib_irq_free(ctx, &adapter->irq); rx_que = adapter->rx_queues; for (int i = 0; i < adapter->rx_num_queues; i++, rx_que++) iflib_irq_free(ctx, &rx_que->que_irq); return (error); } static void igb_configure_queues(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; struct em_rx_queue *rx_que; struct em_tx_queue *tx_que; u32 tmp, ivar = 0, newitr = 0; /* First turn on RSS capability */ if (adapter->hw.mac.type != e1000_82575) E1000_WRITE_REG(hw, E1000_GPIE, E1000_GPIE_MSIX_MODE | E1000_GPIE_EIAME | E1000_GPIE_PBA | E1000_GPIE_NSICR); /* Turn on MSIX */ switch (adapter->hw.mac.type) { case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: /* RX entries */ for (int i = 0; i < adapter->rx_num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); rx_que = &adapter->rx_queues[i]; if (i & 1) { ivar &= 0xFF00FFFF; ivar |= (rx_que->msix | E1000_IVAR_VALID) << 16; } else { ivar &= 0xFFFFFF00; ivar |= rx_que->msix | E1000_IVAR_VALID; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); } /* TX entries */ for (int i = 0; i < adapter->tx_num_queues; i++) { u32 index = i >> 1; ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); tx_que = &adapter->tx_queues[i]; if (i & 1) { ivar &= 0x00FFFFFF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 24; } else { ivar &= 0xFFFF00FF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 8; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= tx_que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82576: /* RX entries */ for (int i = 0; i < adapter->rx_num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); rx_que = &adapter->rx_queues[i]; if (i < 8) { ivar &= 0xFFFFFF00; ivar |= rx_que->msix | E1000_IVAR_VALID; } else { ivar &= 0xFF00FFFF; ivar |= (rx_que->msix | E1000_IVAR_VALID) << 16; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= rx_que->eims; } /* TX entries */ 
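/* 82576 TX entries: queue i programs byte 1 of IVAR[i & 7] for queues 0-7 and byte 3 for queues 8-15 (the RX entries above use bytes 0 and 2). */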
for (int i = 0; i < adapter->tx_num_queues; i++) { u32 index = i & 0x7; /* Each IVAR has two entries */ ivar = E1000_READ_REG_ARRAY(hw, E1000_IVAR0, index); tx_que = &adapter->tx_queues[i]; if (i < 8) { ivar &= 0xFFFF00FF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 8; } else { ivar &= 0x00FFFFFF; ivar |= (tx_que->msix | E1000_IVAR_VALID) << 24; } E1000_WRITE_REG_ARRAY(hw, E1000_IVAR0, index, ivar); adapter->que_mask |= tx_que->eims; } /* And for the link interrupt */ ivar = (adapter->linkvec | E1000_IVAR_VALID) << 8; adapter->link_mask = 1 << adapter->linkvec; E1000_WRITE_REG(hw, E1000_IVAR_MISC, ivar); break; case e1000_82575: /* enable MSI-X support*/ tmp = E1000_READ_REG(hw, E1000_CTRL_EXT); tmp |= E1000_CTRL_EXT_PBA_CLR; /* Auto-Mask interrupts upon ICR read. */ tmp |= E1000_CTRL_EXT_EIAME; tmp |= E1000_CTRL_EXT_IRCA; E1000_WRITE_REG(hw, E1000_CTRL_EXT, tmp); /* Queues */ for (int i = 0; i < adapter->rx_num_queues; i++) { rx_que = &adapter->rx_queues[i]; tmp = E1000_EICR_RX_QUEUE0 << i; tmp |= E1000_EICR_TX_QUEUE0 << i; rx_que->eims = tmp; E1000_WRITE_REG_ARRAY(hw, E1000_MSIXBM(0), i, rx_que->eims); adapter->que_mask |= rx_que->eims; } /* Link */ E1000_WRITE_REG(hw, E1000_MSIXBM(adapter->linkvec), E1000_EIMS_OTHER); adapter->link_mask |= E1000_EIMS_OTHER; default: break; } /* Set the starting interrupt rate */ if (em_max_interrupt_rate > 0) newitr = (4000000 / em_max_interrupt_rate) & 0x7FFC; if (hw->mac.type == e1000_82575) newitr |= newitr << 16; else newitr |= E1000_EITR_CNT_IGNR; for (int i = 0; i < adapter->rx_num_queues; i++) { rx_que = &adapter->rx_queues[i]; E1000_WRITE_REG(hw, E1000_EITR(rx_que->msix), newitr); } return; } static void em_free_pci_resources(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct em_rx_queue *que = adapter->rx_queues; device_t dev = iflib_get_dev(ctx); /* Release all msix queue resources */ if (adapter->intr_type == IFLIB_INTR_MSIX) iflib_irq_free(ctx, &adapter->irq); for (int i = 0; i < adapter->rx_num_queues; i++, que++) { iflib_irq_free(ctx, &que->que_irq); } /* First release all the interrupt resources */ if (adapter->memory != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, PCIR_BAR(0), adapter->memory); adapter->memory = NULL; } if (adapter->flash != NULL) { bus_release_resource(dev, SYS_RES_MEMORY, EM_FLASH, adapter->flash); adapter->flash = NULL; } if (adapter->ioport != NULL) bus_release_resource(dev, SYS_RES_IOPORT, adapter->io_rid, adapter->ioport); } /* Setup MSI or MSI/X */ static int em_setup_msix(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if (adapter->hw.mac.type == e1000_82574) { em_enable_vectors_82574(ctx); } return (0); } /********************************************************************* * * Initialize the hardware to a configuration * as specified by the adapter structure. 
* **********************************************************************/ static void lem_smartspeed(struct adapter *adapter) { u16 phy_tmp; if (adapter->link_active || (adapter->hw.phy.type != e1000_phy_igp) || adapter->hw.mac.autoneg == 0 || (adapter->hw.phy.autoneg_advertised & ADVERTISE_1000_FULL) == 0) return; if (adapter->smartspeed == 0) { /* If Master/Slave config fault is asserted twice, * we assume back-to-back */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (!(phy_tmp & SR_1000T_MS_CONFIG_FAULT)) return; e1000_read_phy_reg(&adapter->hw, PHY_1000T_STATUS, &phy_tmp); if (phy_tmp & SR_1000T_MS_CONFIG_FAULT) { e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); if(phy_tmp & CR_1000T_MS_ENABLE) { phy_tmp &= ~CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); adapter->smartspeed++; if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } } return; } else if(adapter->smartspeed == EM_SMARTSPEED_DOWNSHIFT) { /* If still no link, perhaps using 2/3 pair cable */ e1000_read_phy_reg(&adapter->hw, PHY_1000T_CTRL, &phy_tmp); phy_tmp |= CR_1000T_MS_ENABLE; e1000_write_phy_reg(&adapter->hw, PHY_1000T_CTRL, phy_tmp); if(adapter->hw.mac.autoneg && !e1000_copper_link_autoneg(&adapter->hw) && !e1000_read_phy_reg(&adapter->hw, PHY_CONTROL, &phy_tmp)) { phy_tmp |= (MII_CR_AUTO_NEG_EN | MII_CR_RESTART_AUTO_NEG); e1000_write_phy_reg(&adapter->hw, PHY_CONTROL, phy_tmp); } } /* Restart process after EM_SMARTSPEED_MAX iterations */ if(adapter->smartspeed++ == EM_SMARTSPEED_MAX) adapter->smartspeed = 0; } /********************************************************************* * * Initialize the DMA Coalescing feature * **********************************************************************/ static void igb_init_dmac(struct adapter *adapter, u32 pba) { device_t dev = adapter->dev; struct e1000_hw *hw = &adapter->hw; u32 dmac, reg = ~E1000_DMACR_DMAC_EN; u16 hwm; u16 max_frame_size; if (hw->mac.type == e1000_i211) return; max_frame_size = adapter->shared->isc_max_frame_size; if (hw->mac.type > e1000_82580) { if (adapter->dmac == 0) { /* Disabling it */ E1000_WRITE_REG(hw, E1000_DMACR, reg); return; } else device_printf(dev, "DMA Coalescing enabled\n"); /* Set starting threshold */ E1000_WRITE_REG(hw, E1000_DMCTXTH, 0); hwm = 64 * pba - max_frame_size / 16; if (hwm < 64 * (pba - 6)) hwm = 64 * (pba - 6); reg = E1000_READ_REG(hw, E1000_FCRTC); reg &= ~E1000_FCRTC_RTH_COAL_MASK; reg |= ((hwm << E1000_FCRTC_RTH_COAL_SHIFT) & E1000_FCRTC_RTH_COAL_MASK); E1000_WRITE_REG(hw, E1000_FCRTC, reg); dmac = pba - max_frame_size / 512; if (dmac < pba - 10) dmac = pba - 10; reg = E1000_READ_REG(hw, E1000_DMACR); reg &= ~E1000_DMACR_DMACTHR_MASK; reg |= ((dmac << E1000_DMACR_DMACTHR_SHIFT) & E1000_DMACR_DMACTHR_MASK); /* transition to L0x or L1 if available..*/ reg |= (E1000_DMACR_DMAC_EN | E1000_DMACR_DMAC_LX_MASK); /* Check if status is 2.5Gb backplane connection * before configuration of watchdog timer, which is * in msec values in 12.8usec intervals * watchdog timer= msec values in 32usec intervals * for non 2.5Gb connection */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= ((adapter->dmac * 5) >> 6); else reg |= (adapter->dmac >> 5); } else { reg |= 
(adapter->dmac >> 5); } E1000_WRITE_REG(hw, E1000_DMACR, reg); E1000_WRITE_REG(hw, E1000_DMCRTRH, 0); /* Set the interval before transition */ reg = E1000_READ_REG(hw, E1000_DMCTLX); if (hw->mac.type == e1000_i350) reg |= IGB_DMCTLX_DCFLUSH_DIS; /* ** in 2.5Gb connection, TTLX unit is 0.4 usec ** which is 0x4*2 = 0xA. But delay is still 4 usec */ if (hw->mac.type == e1000_i354) { int status = E1000_READ_REG(hw, E1000_STATUS); if ((status & E1000_STATUS_2P5_SKU) && (!(status & E1000_STATUS_2P5_SKU_OVER))) reg |= 0xA; else reg |= 0x4; } else { reg |= 0x4; } E1000_WRITE_REG(hw, E1000_DMCTLX, reg); /* free space in tx packet buffer to wake from DMA coal */ E1000_WRITE_REG(hw, E1000_DMCTXTH, (IGB_TXPBSIZE - (2 * max_frame_size)) >> 6); /* make low power state decision controlled by DMA coal */ reg = E1000_READ_REG(hw, E1000_PCIEMISC); reg &= ~E1000_PCIEMISC_LX_DECISION; E1000_WRITE_REG(hw, E1000_PCIEMISC, reg); } else if (hw->mac.type == e1000_82580) { u32 reg = E1000_READ_REG(hw, E1000_PCIEMISC); E1000_WRITE_REG(hw, E1000_PCIEMISC, reg & ~E1000_PCIEMISC_LX_DECISION); E1000_WRITE_REG(hw, E1000_DMACR, 0); } } static void em_reset(if_ctx_t ctx) { device_t dev = iflib_get_dev(ctx); struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); struct e1000_hw *hw = &adapter->hw; u16 rx_buffer_size; u32 pba; INIT_DEBUGOUT("em_reset: begin"); /* Let the firmware know the OS is in control */ em_get_hw_control(adapter); /* Set up smart power down as default off on newer adapters. */ if (!em_smart_pwr_down && (hw->mac.type == e1000_82571 || hw->mac.type == e1000_82572)) { u16 phy_tmp = 0; /* Speed up time to link by disabling smart power down. */ e1000_read_phy_reg(hw, IGP02E1000_PHY_POWER_MGMT, &phy_tmp); phy_tmp &= ~IGP02E1000_PM_SPD; e1000_write_phy_reg(hw, IGP02E1000_PHY_POWER_MGMT, phy_tmp); } /* * Packet Buffer Allocation (PBA) * Writing PBA sets the receive portion of the buffer * the remainder is used for the transmit buffer. 
*/ switch (hw->mac.type) { /* Total Packet Buffer on these is 48K */ case e1000_82571: case e1000_82572: case e1000_80003es2lan: pba = E1000_PBA_32K; /* 32K for Rx, 16K for Tx */ break; case e1000_82573: /* 82573: Total Packet Buffer is 32K */ pba = E1000_PBA_12K; /* 12K for Rx, 20K for Tx */ break; case e1000_82574: case e1000_82583: pba = E1000_PBA_20K; /* 20K for Rx, 20K for Tx */ break; case e1000_ich8lan: pba = E1000_PBA_8K; break; case e1000_ich9lan: case e1000_ich10lan: /* Boost Receive side for jumbo frames */ if (adapter->hw.mac.max_frame_size > 4096) pba = E1000_PBA_14K; else pba = E1000_PBA_10K; break; case e1000_pchlan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_pch_cnp: pba = E1000_PBA_26K; break; case e1000_82575: pba = E1000_PBA_32K; break; case e1000_82576: case e1000_vfadapt: pba = E1000_READ_REG(hw, E1000_RXPBS); pba &= E1000_RXPBS_SIZE_MASK_82576; break; case e1000_82580: case e1000_i350: case e1000_i354: case e1000_vfadapt_i350: pba = E1000_READ_REG(hw, E1000_RXPBS); pba = e1000_rxpbs_adjust_82580(pba); break; case e1000_i210: case e1000_i211: pba = E1000_PBA_34K; break; default: if (adapter->hw.mac.max_frame_size > 8192) pba = E1000_PBA_40K; /* 40K for Rx, 24K for Tx */ else pba = E1000_PBA_48K; /* 48K for Rx, 16K for Tx */ } /* Special needs in case of Jumbo frames */ if ((hw->mac.type == e1000_82575) && (ifp->if_mtu > ETHERMTU)) { u32 tx_space, min_tx, min_rx; pba = E1000_READ_REG(hw, E1000_PBA); tx_space = pba >> 16; pba &= 0xffff; min_tx = (adapter->hw.mac.max_frame_size + sizeof(struct e1000_tx_desc) - ETHERNET_FCS_SIZE) * 2; min_tx = roundup2(min_tx, 1024); min_tx >>= 10; min_rx = adapter->hw.mac.max_frame_size; min_rx = roundup2(min_rx, 1024); min_rx >>= 10; if (tx_space < min_tx && ((min_tx - tx_space) < pba)) { pba = pba - (min_tx - tx_space); /* * if short on rx space, rx wins * and must trump tx adjustment */ if (pba < min_rx) pba = min_rx; } E1000_WRITE_REG(hw, E1000_PBA, pba); } if (hw->mac.type < igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_PBA, pba); INIT_DEBUGOUT1("em_reset: pba=%dK",pba); /* * These parameters control the automatic generation (Tx) and * response (Rx) to Ethernet PAUSE frames. * - High water mark should allow for at least two frames to be * received after sending an XOFF. * - Low water mark works best when it is very near the high water mark. * This allows the receiver to restart by sending XON when it has * drained a bit. Here we use an arbitrary value of 1500 which will * restart after one full frame is pulled from the buffer. There * could be several smaller frames in the buffer and if so they will * not trigger the XON until their total number reduces the buffer * by 1500. * - The pause time is fairly large at 1000 x 512ns = 512 usec. */ rx_buffer_size = (pba & 0xffff) << 10; hw->fc.high_water = rx_buffer_size - roundup2(adapter->hw.mac.max_frame_size, 1024); hw->fc.low_water = hw->fc.high_water - 1500; if (adapter->fc) /* locally set flow control value? 
*/ hw->fc.requested_mode = adapter->fc; else hw->fc.requested_mode = e1000_fc_full; if (hw->mac.type == e1000_80003es2lan) hw->fc.pause_time = 0xFFFF; else hw->fc.pause_time = EM_FC_PAUSE_TIME; hw->fc.send_xon = TRUE; /* Device specific overrides/settings */ switch (hw->mac.type) { case e1000_pchlan: /* Workaround: no TX flow ctrl for PCH */ hw->fc.requested_mode = e1000_fc_rx_pause; hw->fc.pause_time = 0xFFFF; /* override */ if (if_getmtu(ifp) > ETHERMTU) { hw->fc.high_water = 0x3500; hw->fc.low_water = 0x1500; } else { hw->fc.high_water = 0x5000; hw->fc.low_water = 0x3000; } hw->fc.refresh_time = 0x1000; break; case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_pch_cnp: hw->fc.high_water = 0x5C20; hw->fc.low_water = 0x5048; hw->fc.pause_time = 0x0650; hw->fc.refresh_time = 0x0400; /* Jumbos need adjusted PBA */ if (if_getmtu(ifp) > ETHERMTU) E1000_WRITE_REG(hw, E1000_PBA, 12); else E1000_WRITE_REG(hw, E1000_PBA, 26); break; case e1000_82575: case e1000_82576: /* 8-byte granularity */ hw->fc.low_water = hw->fc.high_water - 8; break; case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: /* 16-byte granularity */ hw->fc.low_water = hw->fc.high_water - 16; break; case e1000_ich9lan: case e1000_ich10lan: if (if_getmtu(ifp) > ETHERMTU) { hw->fc.high_water = 0x2800; hw->fc.low_water = hw->fc.high_water - 8; break; } /* FALLTHROUGH */ default: if (hw->mac.type == e1000_80003es2lan) hw->fc.pause_time = 0xFFFF; break; } /* Issue a global reset */ e1000_reset_hw(hw); if (adapter->hw.mac.type >= igb_mac_min) { E1000_WRITE_REG(hw, E1000_WUC, 0); } else { E1000_WRITE_REG(hw, E1000_WUFC, 0); em_disable_aspm(adapter); } if (adapter->flags & IGB_MEDIA_RESET) { e1000_setup_init_funcs(hw, TRUE); e1000_get_bus_info(hw); adapter->flags &= ~IGB_MEDIA_RESET; } /* and a re-init */ if (e1000_init_hw(hw) < 0) { device_printf(dev, "Hardware Initialization Failed\n"); return; } if (adapter->hw.mac.type >= igb_mac_min) igb_init_dmac(adapter, pba); E1000_WRITE_REG(hw, E1000_VET, ETHERTYPE_VLAN); e1000_get_phy_info(hw); e1000_check_for_link(hw); } #define RSSKEYLEN 10 static void em_initialize_rss_mapping(struct adapter *adapter) { uint8_t rss_key[4 * RSSKEYLEN]; uint32_t reta = 0; struct e1000_hw *hw = &adapter->hw; int i; /* * Configure RSS key */ arc4rand(rss_key, sizeof(rss_key), 0); for (i = 0; i < RSSKEYLEN; ++i) { uint32_t rssrk = 0; rssrk = EM_RSSRK_VAL(rss_key, i); E1000_WRITE_REG(hw,E1000_RSSRK(i), rssrk); } /* * Configure RSS redirect table in following fashion: * (hash & ring_cnt_mask) == rdr_table[(hash & rdr_table_mask)] */ for (i = 0; i < sizeof(reta); ++i) { uint32_t q; q = (i % adapter->rx_num_queues) << 7; reta |= q << (8 * i); } for (i = 0; i < 32; ++i) E1000_WRITE_REG(hw, E1000_RETA(i), reta); E1000_WRITE_REG(hw, E1000_MRQC, E1000_MRQC_RSS_ENABLE_2Q | E1000_MRQC_RSS_FIELD_IPV4_TCP | E1000_MRQC_RSS_FIELD_IPV4 | E1000_MRQC_RSS_FIELD_IPV6_TCP_EX | E1000_MRQC_RSS_FIELD_IPV6_EX | E1000_MRQC_RSS_FIELD_IPV6); } static void igb_initialize_rss_mapping(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; int i; int queue_id; u32 reta; u32 rss_key[10], mrqc, shift = 0; /* XXX? */ if (adapter->hw.mac.type == e1000_82575) shift = 6; /* * The redirection table controls which destination * queue each bucket redirects traffic to. * Each DWORD represents four queues, with the LSB * being the first queue in the DWORD. * * This just allocates buckets to queues using round-robin * allocation. 
* * NOTE: It Just Happens to line up with the default * RSS allocation method. */ /* Warning FM follows */ reta = 0; for (i = 0; i < 128; i++) { #ifdef RSS queue_id = rss_get_indirection_to_bucket(i); /* * If we have more queues than buckets, we'll * end up mapping buckets to a subset of the * queues. * * If we have more buckets than queues, we'll * end up instead assigning multiple buckets * to queues. * * Both are suboptimal, but we need to handle * the case so we don't go out of bounds * indexing arrays and such. */ queue_id = queue_id % adapter->rx_num_queues; #else queue_id = (i % adapter->rx_num_queues); #endif /* Adjust if required */ queue_id = queue_id << shift; /* * The low 8 bits are for hash value (n+0); * The next 8 bits are for hash value (n+1), etc. */ reta = reta >> 8; reta = reta | ( ((uint32_t) queue_id) << 24); if ((i & 3) == 3) { E1000_WRITE_REG(hw, E1000_RETA(i >> 2), reta); reta = 0; } } /* Now fill in hash table */ /* * MRQC: Multiple Receive Queues Command * Set queuing to RSS control, number depends on the device. */ mrqc = E1000_MRQC_ENABLE_RSS_8Q; #ifdef RSS /* XXX ew typecasting */ rss_getkey((uint8_t *) &rss_key); #else arc4rand(&rss_key, sizeof(rss_key), 0); #endif for (i = 0; i < 10; i++) E1000_WRITE_REG_ARRAY(hw, E1000_RSSRK(0), i, rss_key[i]); /* * Configure the RSS fields to hash upon. */ mrqc |= (E1000_MRQC_RSS_FIELD_IPV4 | E1000_MRQC_RSS_FIELD_IPV4_TCP); mrqc |= (E1000_MRQC_RSS_FIELD_IPV6 | E1000_MRQC_RSS_FIELD_IPV6_TCP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV4_UDP | E1000_MRQC_RSS_FIELD_IPV6_UDP); mrqc |=( E1000_MRQC_RSS_FIELD_IPV6_UDP_EX | E1000_MRQC_RSS_FIELD_IPV6_TCP_EX); E1000_WRITE_REG(hw, E1000_MRQC, mrqc); } /********************************************************************* * * Setup networking device structure and register an interface. 
* **********************************************************************/ static int em_setup_interface(if_ctx_t ctx) { struct ifnet *ifp = iflib_get_ifp(ctx); struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; INIT_DEBUGOUT("em_setup_interface: begin"); /* Single Queue */ if (adapter->tx_num_queues == 1) { if_setsendqlen(ifp, scctx->isc_ntxd[0] - 1); if_setsendqready(ifp); } /* * Specify the media types supported by this adapter and register * callbacks to update media and link information */ if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) { u_char fiber_type = IFM_1000_SX; /* default type */ if (adapter->hw.mac.type == e1000_82545) fiber_type = IFM_1000_LX; ifmedia_add(adapter->media, IFM_ETHER | fiber_type | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | fiber_type, 0, NULL); } else { ifmedia_add(adapter->media, IFM_ETHER | IFM_10_T, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_10_T | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_100_TX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_100_TX | IFM_FDX, 0, NULL); if (adapter->hw.phy.type != e1000_phy_ife) { ifmedia_add(adapter->media, IFM_ETHER | IFM_1000_T | IFM_FDX, 0, NULL); ifmedia_add(adapter->media, IFM_ETHER | IFM_1000_T, 0, NULL); } } ifmedia_add(adapter->media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(adapter->media, IFM_ETHER | IFM_AUTO); return (0); } static int em_if_tx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int ntxqs, int ntxqsets) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; int error = E1000_SUCCESS; struct em_tx_queue *que; int i, j; MPASS(adapter->tx_num_queues > 0); MPASS(adapter->tx_num_queues == ntxqsets); /* First allocate the top level queue structs */ if (!(adapter->tx_queues = (struct em_tx_queue *) malloc(sizeof(struct em_tx_queue) * adapter->tx_num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "Unable to allocate queue memory\n"); return(ENOMEM); } for (i = 0, que = adapter->tx_queues; i < adapter->tx_num_queues; i++, que++) { /* Set up some basics */ struct tx_ring *txr = &que->txr; txr->adapter = que->adapter = adapter; que->me = txr->me = i; /* Allocate report status array */ if (!(txr->tx_rsq = (qidx_t *) malloc(sizeof(qidx_t) * scctx->isc_ntxd[0], M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "failed to allocate rs_idxs memory\n"); error = ENOMEM; goto fail; } for (j = 0; j < scctx->isc_ntxd[0]; j++) txr->tx_rsq[j] = QIDX_INVALID; /* get the virtual and physical address of the hardware queues */ txr->tx_base = (struct e1000_tx_desc *)vaddrs[i*ntxqs]; txr->tx_paddr = paddrs[i*ntxqs]; } device_printf(iflib_get_dev(ctx), "allocated for %d tx_queues\n", adapter->tx_num_queues); return (0); fail: em_if_queues_free(ctx); return (error); } static int em_if_rx_queues_alloc(if_ctx_t ctx, caddr_t *vaddrs, uint64_t *paddrs, int nrxqs, int nrxqsets) { struct adapter *adapter = iflib_get_softc(ctx); int error = E1000_SUCCESS; struct em_rx_queue *que; int i; MPASS(adapter->rx_num_queues > 0); MPASS(adapter->rx_num_queues == nrxqsets); /* First allocate the top level queue structs */ if (!(adapter->rx_queues = (struct em_rx_queue *) malloc(sizeof(struct em_rx_queue) * adapter->rx_num_queues, M_DEVBUF, M_NOWAIT | M_ZERO))) { device_printf(iflib_get_dev(ctx), "Unable to allocate queue memory\n"); error = ENOMEM; goto fail; } for (i = 0, que = 
adapter->rx_queues; i < nrxqsets; i++, que++) { /* Set up some basics */ struct rx_ring *rxr = &que->rxr; rxr->adapter = que->adapter = adapter; rxr->que = que; que->me = rxr->me = i; /* get the virtual and physical address of the hardware queues */ rxr->rx_base = (union e1000_rx_desc_extended *)vaddrs[i*nrxqs]; rxr->rx_paddr = paddrs[i*nrxqs]; } device_printf(iflib_get_dev(ctx), "allocated for %d rx_queues\n", adapter->rx_num_queues); return (0); fail: em_if_queues_free(ctx); return (error); } static void em_if_queues_free(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct em_tx_queue *tx_que = adapter->tx_queues; struct em_rx_queue *rx_que = adapter->rx_queues; if (tx_que != NULL) { for (int i = 0; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; if (txr->tx_rsq == NULL) break; free(txr->tx_rsq, M_DEVBUF); txr->tx_rsq = NULL; } free(adapter->tx_queues, M_DEVBUF); adapter->tx_queues = NULL; } if (rx_que != NULL) { free(adapter->rx_queues, M_DEVBUF); adapter->rx_queues = NULL; } em_release_hw_control(adapter); if (adapter->mta != NULL) { free(adapter->mta, M_DEVBUF); } } /********************************************************************* * * Enable transmit unit. * **********************************************************************/ static void em_initialize_transmit_unit(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; struct em_tx_queue *que; struct tx_ring *txr; struct e1000_hw *hw = &adapter->hw; u32 tctl, txdctl = 0, tarc, tipg = 0; INIT_DEBUGOUT("em_initialize_transmit_unit: begin"); for (int i = 0; i < adapter->tx_num_queues; i++, txr++) { u64 bus_addr; caddr_t offp, endp; que = &adapter->tx_queues[i]; txr = &que->txr; bus_addr = txr->tx_paddr; /* Clear checksum offload context. 
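		 * Zeroing from csum_flags through the end of the ring structure
		 * discards any cached offload state, so the first packet sent on
		 * this queue sets up a fresh context descriptor.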
*/ offp = (caddr_t)&txr->csum_flags; endp = (caddr_t)(txr + 1); bzero(offp, endp - offp); /* Base and Len of TX Ring */ E1000_WRITE_REG(hw, E1000_TDLEN(i), scctx->isc_ntxd[0] * sizeof(struct e1000_tx_desc)); E1000_WRITE_REG(hw, E1000_TDBAH(i), (u32)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_TDBAL(i), (u32)bus_addr); /* Init the HEAD/TAIL indices */ E1000_WRITE_REG(hw, E1000_TDT(i), 0); E1000_WRITE_REG(hw, E1000_TDH(i), 0); HW_DEBUGOUT2("Base = %x, Length = %x\n", E1000_READ_REG(&adapter->hw, E1000_TDBAL(i)), E1000_READ_REG(&adapter->hw, E1000_TDLEN(i))); txdctl = 0; /* clear txdctl */ txdctl |= 0x1f; /* PTHRESH */ txdctl |= 1 << 8; /* HTHRESH */ txdctl |= 1 << 16;/* WTHRESH */ txdctl |= 1 << 22; /* Reserved bit 22 must always be 1 */ txdctl |= E1000_TXDCTL_GRAN; txdctl |= 1 << 25; /* LWTHRESH */ E1000_WRITE_REG(hw, E1000_TXDCTL(i), txdctl); } /* Set the default values for the Tx Inter Packet Gap timer */ switch (adapter->hw.mac.type) { case e1000_80003es2lan: tipg = DEFAULT_82543_TIPG_IPGR1; tipg |= DEFAULT_80003ES2LAN_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; break; case e1000_82542: tipg = DEFAULT_82542_TIPG_IPGT; tipg |= DEFAULT_82542_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82542_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; break; default: if ((adapter->hw.phy.media_type == e1000_media_type_fiber) || (adapter->hw.phy.media_type == e1000_media_type_internal_serdes)) tipg = DEFAULT_82543_TIPG_IPGT_FIBER; else tipg = DEFAULT_82543_TIPG_IPGT_COPPER; tipg |= DEFAULT_82543_TIPG_IPGR1 << E1000_TIPG_IPGR1_SHIFT; tipg |= DEFAULT_82543_TIPG_IPGR2 << E1000_TIPG_IPGR2_SHIFT; } E1000_WRITE_REG(&adapter->hw, E1000_TIPG, tipg); E1000_WRITE_REG(&adapter->hw, E1000_TIDV, adapter->tx_int_delay.value); if(adapter->hw.mac.type >= e1000_82540) E1000_WRITE_REG(&adapter->hw, E1000_TADV, adapter->tx_abs_int_delay.value); if ((adapter->hw.mac.type == e1000_82571) || (adapter->hw.mac.type == e1000_82572)) { tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= TARC_SPEED_MODE_BIT; E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); } else if (adapter->hw.mac.type == e1000_80003es2lan) { /* errata: program both queues to unweighted RR */ tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= 1; E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(1)); tarc |= 1; E1000_WRITE_REG(&adapter->hw, E1000_TARC(1), tarc); } else if (adapter->hw.mac.type == e1000_82574) { tarc = E1000_READ_REG(&adapter->hw, E1000_TARC(0)); tarc |= TARC_ERRATA_BIT; if ( adapter->tx_num_queues > 1) { tarc |= (TARC_COMPENSATION_MODE | TARC_MQ_FIX); E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); E1000_WRITE_REG(&adapter->hw, E1000_TARC(1), tarc); } else E1000_WRITE_REG(&adapter->hw, E1000_TARC(0), tarc); } if (adapter->tx_int_delay.value > 0) adapter->txd_cmd |= E1000_TXD_CMD_IDE; /* Program the Transmit Control Register */ tctl = E1000_READ_REG(&adapter->hw, E1000_TCTL); tctl &= ~E1000_TCTL_CT; tctl |= (E1000_TCTL_PSP | E1000_TCTL_RTLC | E1000_TCTL_EN | (E1000_COLLISION_THRESHOLD << E1000_CT_SHIFT)); if (adapter->hw.mac.type >= e1000_82571) tctl |= E1000_TCTL_MULR; /* This write will effectively turn on the transmit unit. 
*/ E1000_WRITE_REG(&adapter->hw, E1000_TCTL, tctl); /* SPT and KBL errata workarounds */ if (hw->mac.type == e1000_pch_spt) { u32 reg; reg = E1000_READ_REG(hw, E1000_IOSFPC); reg |= E1000_RCTL_RDMTS_HEX; E1000_WRITE_REG(hw, E1000_IOSFPC, reg); /* i218-i219 Specification Update 1.5.4.5 */ reg = E1000_READ_REG(hw, E1000_TARC(0)); reg &= ~E1000_TARC0_CB_MULTIQ_3_REQ; reg |= E1000_TARC0_CB_MULTIQ_2_REQ; E1000_WRITE_REG(hw, E1000_TARC(0), reg); } } /********************************************************************* * * Enable receive unit. * **********************************************************************/ static void em_initialize_receive_unit(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); if_softc_ctx_t scctx = adapter->shared; struct ifnet *ifp = iflib_get_ifp(ctx); struct e1000_hw *hw = &adapter->hw; struct em_rx_queue *que; int i; u32 rctl, rxcsum, rfctl; INIT_DEBUGOUT("em_initialize_receive_units: begin"); /* * Make sure receives are disabled while setting * up the descriptor ring */ rctl = E1000_READ_REG(hw, E1000_RCTL); /* Do not disable if ever enabled on this hardware */ if ((hw->mac.type != e1000_82574) && (hw->mac.type != e1000_82583)) E1000_WRITE_REG(hw, E1000_RCTL, rctl & ~E1000_RCTL_EN); /* Setup the Receive Control Register */ rctl &= ~(3 << E1000_RCTL_MO_SHIFT); rctl |= E1000_RCTL_EN | E1000_RCTL_BAM | E1000_RCTL_LBM_NO | E1000_RCTL_RDMTS_HALF | (hw->mac.mc_filter_type << E1000_RCTL_MO_SHIFT); /* Do not store bad packets */ rctl &= ~E1000_RCTL_SBP; /* Enable Long Packet receive */ if (if_getmtu(ifp) > ETHERMTU) rctl |= E1000_RCTL_LPE; else rctl &= ~E1000_RCTL_LPE; /* Strip the CRC */ if (!em_disable_crc_stripping) rctl |= E1000_RCTL_SECRC; if (adapter->hw.mac.type >= e1000_82540) { E1000_WRITE_REG(&adapter->hw, E1000_RADV, adapter->rx_abs_int_delay.value); /* * Set the interrupt throttling rate. Value is calculated * as DEFAULT_ITR = 1/(MAX_INTS_PER_SEC * 256ns) */ E1000_WRITE_REG(hw, E1000_ITR, DEFAULT_ITR); } E1000_WRITE_REG(&adapter->hw, E1000_RDTR, adapter->rx_int_delay.value); /* Use extended rx descriptor formats */ rfctl = E1000_READ_REG(hw, E1000_RFCTL); rfctl |= E1000_RFCTL_EXTEN; /* * When using MSIX interrupts we need to throttle * using the EITR register (82574 only) */ if (hw->mac.type == e1000_82574) { for (int i = 0; i < 4; i++) E1000_WRITE_REG(hw, E1000_EITR_82574(i), DEFAULT_ITR); /* Disable accelerated acknowledge */ rfctl |= E1000_RFCTL_ACK_DIS; } E1000_WRITE_REG(hw, E1000_RFCTL, rfctl); rxcsum = E1000_READ_REG(hw, E1000_RXCSUM); if (if_getcapenable(ifp) & IFCAP_RXCSUM && adapter->hw.mac.type >= e1000_82543) { if (adapter->tx_num_queues > 1) { if (adapter->hw.mac.type >= igb_mac_min) { rxcsum |= E1000_RXCSUM_PCSD; if (hw->mac.type != e1000_82575) rxcsum |= E1000_RXCSUM_CRCOFL; } else rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL | E1000_RXCSUM_PCSD; } else { if (adapter->hw.mac.type >= igb_mac_min) rxcsum |= E1000_RXCSUM_IPPCSE; else rxcsum |= E1000_RXCSUM_TUOFL | E1000_RXCSUM_IPOFL; if (adapter->hw.mac.type > e1000_82575) rxcsum |= E1000_RXCSUM_CRCOFL; } } else rxcsum &= ~E1000_RXCSUM_TUOFL; E1000_WRITE_REG(hw, E1000_RXCSUM, rxcsum); if (adapter->rx_num_queues > 1) { if (adapter->hw.mac.type >= igb_mac_min) igb_initialize_rss_mapping(adapter); else em_initialize_rss_mapping(adapter); } /* * XXX TEMPORARY WORKAROUND: on some systems with 82573 * long latencies are observed, like Lenovo X60. 
This * change eliminates the problem, but since having positive * values in RDTR is a known source of problems on other * platforms another solution is being sought. */ if (hw->mac.type == e1000_82573) E1000_WRITE_REG(hw, E1000_RDTR, 0x20); for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) { struct rx_ring *rxr = &que->rxr; /* Setup the Base and Length of the Rx Descriptor Ring */ u64 bus_addr = rxr->rx_paddr; #if 0 u32 rdt = adapter->rx_num_queues -1; /* default */ #endif E1000_WRITE_REG(hw, E1000_RDLEN(i), scctx->isc_nrxd[0] * sizeof(union e1000_rx_desc_extended)); E1000_WRITE_REG(hw, E1000_RDBAH(i), (u32)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_RDBAL(i), (u32)bus_addr); /* Setup the Head and Tail Descriptor Pointers */ E1000_WRITE_REG(hw, E1000_RDH(i), 0); E1000_WRITE_REG(hw, E1000_RDT(i), 0); } /* * Set PTHRESH for improved jumbo performance * According to 10.2.5.11 of Intel 82574 Datasheet, * RXDCTL(1) is written whenever RXDCTL(0) is written. * Only write to RXDCTL(1) if there is a need for different * settings. */ if (((adapter->hw.mac.type == e1000_ich9lan) || (adapter->hw.mac.type == e1000_pch2lan) || (adapter->hw.mac.type == e1000_ich10lan)) && (if_getmtu(ifp) > ETHERMTU)) { u32 rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(0)); E1000_WRITE_REG(hw, E1000_RXDCTL(0), rxdctl | 3); } else if (adapter->hw.mac.type == e1000_82574) { for (int i = 0; i < adapter->rx_num_queues; i++) { u32 rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(i)); rxdctl |= 0x20; /* PTHRESH */ rxdctl |= 4 << 8; /* HTHRESH */ rxdctl |= 4 << 16;/* WTHRESH */ rxdctl |= 1 << 24; /* Switch to granularity */ E1000_WRITE_REG(hw, E1000_RXDCTL(i), rxdctl); } } else if (adapter->hw.mac.type >= igb_mac_min) { u32 psize, srrctl = 0; if (if_getmtu(ifp) > ETHERMTU) { /* Set maximum packet len */ if (adapter->rx_mbuf_sz <= 4096) { srrctl |= 4096 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX; } else if (adapter->rx_mbuf_sz > 4096) { srrctl |= 8192 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX; } psize = scctx->isc_max_frame_size; /* are we on a vlan? */ if (ifp->if_vlantrunk != NULL) psize += VLAN_TAG_SIZE; E1000_WRITE_REG(&adapter->hw, E1000_RLPML, psize); } else { srrctl |= 2048 >> E1000_SRRCTL_BSIZEPKT_SHIFT; rctl |= E1000_RCTL_SZ_2048; } /* * If TX flow control is disabled and there's >1 queue defined, * enable DROP. * * This drops frames rather than hanging the RX MAC for all queues. */ if ((adapter->rx_num_queues > 1) && (adapter->fc == e1000_fc_none || adapter->fc == e1000_fc_rx_pause)) { srrctl |= E1000_SRRCTL_DROP_EN; } /* Setup the Base and Length of the Rx Descriptor Rings */ for (i = 0, que = adapter->rx_queues; i < adapter->rx_num_queues; i++, que++) { struct rx_ring *rxr = &que->rxr; u64 bus_addr = rxr->rx_paddr; u32 rxdctl; #ifdef notyet /* Configure for header split? 
-- ignore for now */ rxr->hdr_split = igb_header_split; #else srrctl |= E1000_SRRCTL_DESCTYPE_ADV_ONEBUF; #endif E1000_WRITE_REG(hw, E1000_RDLEN(i), scctx->isc_nrxd[0] * sizeof(struct e1000_rx_desc)); E1000_WRITE_REG(hw, E1000_RDBAH(i), (uint32_t)(bus_addr >> 32)); E1000_WRITE_REG(hw, E1000_RDBAL(i), (uint32_t)bus_addr); E1000_WRITE_REG(hw, E1000_SRRCTL(i), srrctl); /* Enable this Queue */ rxdctl = E1000_READ_REG(hw, E1000_RXDCTL(i)); rxdctl |= E1000_RXDCTL_QUEUE_ENABLE; rxdctl &= 0xFFF00000; rxdctl |= IGB_RX_PTHRESH; rxdctl |= IGB_RX_HTHRESH << 8; rxdctl |= IGB_RX_WTHRESH << 16; E1000_WRITE_REG(hw, E1000_RXDCTL(i), rxdctl); } } else if (adapter->hw.mac.type >= e1000_pch2lan) { if (if_getmtu(ifp) > ETHERMTU) e1000_lv_jumbo_workaround_ich8lan(hw, TRUE); else e1000_lv_jumbo_workaround_ich8lan(hw, FALSE); } /* Make sure VLAN Filters are off */ rctl &= ~E1000_RCTL_VFE; if (adapter->hw.mac.type < igb_mac_min) { if (adapter->rx_mbuf_sz == MCLBYTES) rctl |= E1000_RCTL_SZ_2048; else if (adapter->rx_mbuf_sz == MJUMPAGESIZE) rctl |= E1000_RCTL_SZ_4096 | E1000_RCTL_BSEX; else if (adapter->rx_mbuf_sz > MJUMPAGESIZE) rctl |= E1000_RCTL_SZ_8192 | E1000_RCTL_BSEX; /* ensure we clear use DTYPE of 00 here */ rctl &= ~0x00000C00; } /* Write out the settings */ E1000_WRITE_REG(hw, E1000_RCTL, rctl); return; } static void em_if_vlan_register(if_ctx_t ctx, u16 vtag) { struct adapter *adapter = iflib_get_softc(ctx); u32 index, bit; index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] |= (1 << bit); ++adapter->num_vlans; } static void em_if_vlan_unregister(if_ctx_t ctx, u16 vtag) { struct adapter *adapter = iflib_get_softc(ctx); u32 index, bit; index = (vtag >> 5) & 0x7F; bit = vtag & 0x1F; adapter->shadow_vfta[index] &= ~(1 << bit); --adapter->num_vlans; } static void em_setup_vlan_hw_support(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 reg; /* * We get here thru init_locked, meaning * a soft reset, this has already cleared * the VFTA and other state, so if there * have been no vlan's registered do nothing. */ if (adapter->num_vlans == 0) return; /* * A soft reset zero's out the VFTA, so * we need to repopulate it now. 
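	 * Each word of the 128-entry shadow VFTA tracks 32 VLAN IDs, so
	 * em_if_vlan_register() recorded VLAN ID v at word (v >> 5),
	 * bit (v & 0x1f); VLAN 100, for instance, sits in
	 * shadow_vfta[3], bit 4.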
*/ for (int i = 0; i < EM_VFTA_SIZE; i++) if (adapter->shadow_vfta[i] != 0) E1000_WRITE_REG_ARRAY(hw, E1000_VFTA, i, adapter->shadow_vfta[i]); reg = E1000_READ_REG(hw, E1000_CTRL); reg |= E1000_CTRL_VME; E1000_WRITE_REG(hw, E1000_CTRL, reg); /* Enable the Filter Table */ reg = E1000_READ_REG(hw, E1000_RCTL); reg &= ~E1000_RCTL_CFIEN; reg |= E1000_RCTL_VFE; E1000_WRITE_REG(hw, E1000_RCTL, reg); } static void em_if_enable_intr(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; u32 ims_mask = IMS_ENABLE_MASK; if (hw->mac.type == e1000_82574) { E1000_WRITE_REG(hw, EM_EIAC, EM_MSIX_MASK); ims_mask |= adapter->ims; } else if (adapter->intr_type == IFLIB_INTR_MSIX && hw->mac.type >= igb_mac_min) { u32 mask = (adapter->que_mask | adapter->link_mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIAM, mask); E1000_WRITE_REG(&adapter->hw, E1000_EIMS, mask); ims_mask = E1000_IMS_LSC; } E1000_WRITE_REG(hw, E1000_IMS, ims_mask); } static void em_if_disable_intr(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; if (adapter->intr_type == IFLIB_INTR_MSIX) { if (hw->mac.type >= igb_mac_min) E1000_WRITE_REG(&adapter->hw, E1000_EIMC, ~0); E1000_WRITE_REG(&adapter->hw, E1000_EIAC, 0); } E1000_WRITE_REG(&adapter->hw, E1000_IMC, 0xffffffff); } /* * Bit of a misnomer, what this really means is * to enable OS management of the system... aka * to disable special hardware management features */ static void em_init_manageability(struct adapter *adapter) { /* A shared code workaround */ #define E1000_82542_MANC2H E1000_MANC2H if (adapter->has_manage) { int manc2h = E1000_READ_REG(&adapter->hw, E1000_MANC2H); int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* disable hardware interception of ARP */ manc &= ~(E1000_MANC_ARP_EN); /* enable receiving management packets to the host */ manc |= E1000_MANC_EN_MNG2HOST; #define E1000_MNG2HOST_PORT_623 (1 << 5) #define E1000_MNG2HOST_PORT_664 (1 << 6) manc2h |= E1000_MNG2HOST_PORT_623; manc2h |= E1000_MNG2HOST_PORT_664; E1000_WRITE_REG(&adapter->hw, E1000_MANC2H, manc2h); E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * Give control back to hardware management * controller if there is one. */ static void em_release_manageability(struct adapter *adapter) { if (adapter->has_manage) { int manc = E1000_READ_REG(&adapter->hw, E1000_MANC); /* re-enable hardware interception of ARP */ manc |= E1000_MANC_ARP_EN; manc &= ~E1000_MANC_EN_MNG2HOST; E1000_WRITE_REG(&adapter->hw, E1000_MANC, manc); } } /* * em_get_hw_control sets the {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means * that the driver is loaded. For AMT version type f/w * this means that the network i/f is open. */ static void em_get_hw_control(struct adapter *adapter) { u32 ctrl_ext, swsm; if (adapter->vf_ifp) return; if (adapter->hw.mac.type == e1000_82573) { swsm = E1000_READ_REG(&adapter->hw, E1000_SWSM); E1000_WRITE_REG(&adapter->hw, E1000_SWSM, swsm | E1000_SWSM_DRV_LOAD); return; } /* else */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext | E1000_CTRL_EXT_DRV_LOAD); } /* * em_release_hw_control resets {CTRL_EXT|FWSM}:DRV_LOAD bit. * For ASF and Pass Through versions of f/w this means that * the driver is no longer loaded. For AMT versions of the * f/w this means that the network i/f is closed. 
*/ static void em_release_hw_control(struct adapter *adapter) { u32 ctrl_ext, swsm; if (!adapter->has_manage) return; if (adapter->hw.mac.type == e1000_82573) { swsm = E1000_READ_REG(&adapter->hw, E1000_SWSM); E1000_WRITE_REG(&adapter->hw, E1000_SWSM, swsm & ~E1000_SWSM_DRV_LOAD); return; } /* else */ ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext & ~E1000_CTRL_EXT_DRV_LOAD); return; } static int em_is_valid_ether_addr(u8 *addr) { char zero_addr[6] = { 0, 0, 0, 0, 0, 0 }; if ((addr[0] & 1) || (!bcmp(addr, zero_addr, ETHER_ADDR_LEN))) { return (FALSE); } return (TRUE); } /* ** Parse the interface capabilities with regard ** to both system management and wake-on-lan for ** later use. */ static void em_get_wakeup(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); u16 eeprom_data = 0, device_id, apme_mask; adapter->has_manage = e1000_enable_mng_pass_thru(&adapter->hw); apme_mask = EM_EEPROM_APME; switch (adapter->hw.mac.type) { case e1000_82542: case e1000_82543: break; case e1000_82544: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL2_REG, 1, &eeprom_data); apme_mask = EM_82544_APME; break; case e1000_82546: case e1000_82546_rev_3: if (adapter->hw.bus.func == 1) { e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data); break; } else e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; case e1000_82573: case e1000_82583: adapter->has_amt = TRUE; /* FALLTHROUGH */ case e1000_82571: case e1000_82572: case e1000_80003es2lan: if (adapter->hw.bus.func == 1) { e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_B, 1, &eeprom_data); break; } else e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; case e1000_ich8lan: case e1000_ich9lan: case e1000_ich10lan: case e1000_pchlan: case e1000_pch2lan: case e1000_pch_lpt: case e1000_pch_spt: case e1000_82575: /* listing all igb devices */ case e1000_82576: case e1000_82580: case e1000_i350: case e1000_i354: case e1000_i210: case e1000_i211: case e1000_vfadapt: case e1000_vfadapt_i350: apme_mask = E1000_WUC_APME; adapter->has_amt = TRUE; eeprom_data = E1000_READ_REG(&adapter->hw, E1000_WUC); break; default: e1000_read_nvm(&adapter->hw, NVM_INIT_CONTROL3_PORT_A, 1, &eeprom_data); break; } if (eeprom_data & apme_mask) adapter->wol = (E1000_WUFC_MAG | E1000_WUFC_MC); /* * We have the eeprom settings, now apply the special cases * where the eeprom may be wrong or the board won't support * wake on lan on a particular port */ device_id = pci_get_device(dev); switch (device_id) { case E1000_DEV_ID_82546GB_PCIE: adapter->wol = 0; break; case E1000_DEV_ID_82546EB_FIBER: case E1000_DEV_ID_82546GB_FIBER: /* Wake events only supported on port A for dual fiber * regardless of eeprom setting */ if (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_FUNC_1) adapter->wol = 0; break; case E1000_DEV_ID_82546GB_QUAD_COPPER_KSP3: /* if quad port adapter, disable WoL on all but port A */ if (global_quad_port_a != 0) adapter->wol = 0; /* Reset for multiple quad port adapters */ if (++global_quad_port_a == 4) global_quad_port_a = 0; break; case E1000_DEV_ID_82571EB_FIBER: /* Wake events only supported on port A for dual fiber * regardless of eeprom setting */ if (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_FUNC_1) adapter->wol = 0; break; case E1000_DEV_ID_82571EB_QUAD_COPPER: case E1000_DEV_ID_82571EB_QUAD_FIBER: case E1000_DEV_ID_82571EB_QUAD_COPPER_LP: /* if quad port adapter, 
disable WoL on all but port A */ if (global_quad_port_a != 0) adapter->wol = 0; /* Reset for multiple quad port adapters */ if (++global_quad_port_a == 4) global_quad_port_a = 0; break; } return; } /* * Enable PCI Wake On Lan capability */ static void em_enable_wakeup(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); device_t dev = iflib_get_dev(ctx); if_t ifp = iflib_get_ifp(ctx); int error = 0; u32 pmc, ctrl, ctrl_ext, rctl; u16 status; if (pci_find_cap(dev, PCIY_PMG, &pmc) != 0) return; /* * Determine type of Wakeup: note that wol * is set with all bits on by default. */ if ((if_getcapenable(ifp) & IFCAP_WOL_MAGIC) == 0) adapter->wol &= ~E1000_WUFC_MAG; if ((if_getcapenable(ifp) & IFCAP_WOL_UCAST) == 0) adapter->wol &= ~E1000_WUFC_EX; if ((if_getcapenable(ifp) & IFCAP_WOL_MCAST) == 0) adapter->wol &= ~E1000_WUFC_MC; else { rctl = E1000_READ_REG(&adapter->hw, E1000_RCTL); rctl |= E1000_RCTL_MPE; E1000_WRITE_REG(&adapter->hw, E1000_RCTL, rctl); } if (!(adapter->wol & (E1000_WUFC_EX | E1000_WUFC_MAG | E1000_WUFC_MC))) goto pme; /* Advertise the wakeup capability */ ctrl = E1000_READ_REG(&adapter->hw, E1000_CTRL); ctrl |= (E1000_CTRL_SWDPIN2 | E1000_CTRL_SWDPIN3); E1000_WRITE_REG(&adapter->hw, E1000_CTRL, ctrl); /* Keep the laser running on Fiber adapters */ if (adapter->hw.phy.media_type == e1000_media_type_fiber || adapter->hw.phy.media_type == e1000_media_type_internal_serdes) { ctrl_ext = E1000_READ_REG(&adapter->hw, E1000_CTRL_EXT); ctrl_ext |= E1000_CTRL_EXT_SDP3_DATA; E1000_WRITE_REG(&adapter->hw, E1000_CTRL_EXT, ctrl_ext); } if ((adapter->hw.mac.type == e1000_ich8lan) || (adapter->hw.mac.type == e1000_pchlan) || (adapter->hw.mac.type == e1000_ich9lan) || (adapter->hw.mac.type == e1000_ich10lan)) e1000_suspend_workarounds_ich8lan(&adapter->hw); if ( adapter->hw.mac.type >= e1000_pchlan) { error = em_enable_phy_wakeup(adapter); if (error) goto pme; } else { /* Enable wakeup by the MAC */ E1000_WRITE_REG(&adapter->hw, E1000_WUC, E1000_WUC_PME_EN); E1000_WRITE_REG(&adapter->hw, E1000_WUFC, adapter->wol); } if (adapter->hw.phy.type == e1000_phy_igp_3) e1000_igp3_phy_powerdown_workaround_ich8lan(&adapter->hw); pme: status = pci_read_config(dev, pmc + PCIR_POWER_STATUS, 2); status &= ~(PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE); if (!error && (if_getcapenable(ifp) & IFCAP_WOL)) status |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE; pci_write_config(dev, pmc + PCIR_POWER_STATUS, status, 2); return; } /* * WOL in the newer chipset interfaces (pchlan) * require thing to be copied into the phy */ static int em_enable_phy_wakeup(struct adapter *adapter) { struct e1000_hw *hw = &adapter->hw; u32 mreg, ret = 0; u16 preg; /* copy MAC RARs to PHY RARs */ e1000_copy_rx_addrs_to_phy_ich8lan(hw); /* copy MAC MTA to PHY MTA */ for (int i = 0; i < adapter->hw.mac.mta_reg_count; i++) { mreg = E1000_READ_REG_ARRAY(hw, E1000_MTA, i); e1000_write_phy_reg(hw, BM_MTA(i), (u16)(mreg & 0xFFFF)); e1000_write_phy_reg(hw, BM_MTA(i) + 1, (u16)((mreg >> 16) & 0xFFFF)); } /* configure PHY Rx Control register */ e1000_read_phy_reg(&adapter->hw, BM_RCTL, &preg); mreg = E1000_READ_REG(hw, E1000_RCTL); if (mreg & E1000_RCTL_UPE) preg |= BM_RCTL_UPE; if (mreg & E1000_RCTL_MPE) preg |= BM_RCTL_MPE; preg &= ~(BM_RCTL_MO_MASK); if (mreg & E1000_RCTL_MO_3) preg |= (((mreg & E1000_RCTL_MO_3) >> E1000_RCTL_MO_SHIFT) << BM_RCTL_MO_SHIFT); if (mreg & E1000_RCTL_BAM) preg |= BM_RCTL_BAM; if (mreg & E1000_RCTL_PMCF) preg |= BM_RCTL_PMCF; mreg = E1000_READ_REG(hw, E1000_CTRL); if (mreg & E1000_CTRL_RFCE) preg |= BM_RCTL_RFCE; 
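	/*
	 * Write the mirrored receive-control bits into the PHY's BM_RCTL
	 * register so the PHY applies the same filtering while the MAC is
	 * powered down for wake-on-LAN.
	 */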
e1000_write_phy_reg(&adapter->hw, BM_RCTL, preg); /* enable PHY wakeup in MAC register */ E1000_WRITE_REG(hw, E1000_WUC, E1000_WUC_PHY_WAKE | E1000_WUC_PME_EN | E1000_WUC_APME); E1000_WRITE_REG(hw, E1000_WUFC, adapter->wol); /* configure and enable PHY wakeup in PHY registers */ e1000_write_phy_reg(&adapter->hw, BM_WUFC, adapter->wol); e1000_write_phy_reg(&adapter->hw, BM_WUC, E1000_WUC_PME_EN); /* activate PHY wakeup */ ret = hw->phy.ops.acquire(hw); if (ret) { printf("Could not acquire PHY\n"); return ret; } e1000_write_phy_reg_mdic(hw, IGP01E1000_PHY_PAGE_SELECT, (BM_WUC_ENABLE_PAGE << IGP_PAGE_SHIFT)); ret = e1000_read_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, &preg); if (ret) { printf("Could not read PHY page 769\n"); goto out; } preg |= BM_WUC_ENABLE_BIT | BM_WUC_HOST_WU_BIT; ret = e1000_write_phy_reg_mdic(hw, BM_WUC_ENABLE_REG, preg); if (ret) printf("Could not set PHY Host Wakeup bit\n"); out: hw->phy.ops.release(hw); return ret; } static void em_if_led_func(if_ctx_t ctx, int onoff) { struct adapter *adapter = iflib_get_softc(ctx); if (onoff) { e1000_setup_led(&adapter->hw); e1000_led_on(&adapter->hw); } else { e1000_led_off(&adapter->hw); e1000_cleanup_led(&adapter->hw); } } /* * Disable the L0S and L1 LINK states */ static void em_disable_aspm(struct adapter *adapter) { int base, reg; u16 link_cap,link_ctrl; device_t dev = adapter->dev; switch (adapter->hw.mac.type) { case e1000_82573: case e1000_82574: case e1000_82583: break; default: return; } if (pci_find_cap(dev, PCIY_EXPRESS, &base) != 0) return; reg = base + PCIER_LINK_CAP; link_cap = pci_read_config(dev, reg, 2); if ((link_cap & PCIEM_LINK_CAP_ASPM) == 0) return; reg = base + PCIER_LINK_CTL; link_ctrl = pci_read_config(dev, reg, 2); link_ctrl &= ~PCIEM_LINK_CTL_ASPMC; pci_write_config(dev, reg, link_ctrl, 2); return; } /********************************************************************** * * Update the board statistics counters. * **********************************************************************/ static void em_update_stats_counters(struct adapter *adapter) { if(adapter->hw.phy.media_type == e1000_media_type_copper || (E1000_READ_REG(&adapter->hw, E1000_STATUS) & E1000_STATUS_LU)) { adapter->stats.symerrs += E1000_READ_REG(&adapter->hw, E1000_SYMERRS); adapter->stats.sec += E1000_READ_REG(&adapter->hw, E1000_SEC); } adapter->stats.crcerrs += E1000_READ_REG(&adapter->hw, E1000_CRCERRS); adapter->stats.mpc += E1000_READ_REG(&adapter->hw, E1000_MPC); adapter->stats.scc += E1000_READ_REG(&adapter->hw, E1000_SCC); adapter->stats.ecol += E1000_READ_REG(&adapter->hw, E1000_ECOL); adapter->stats.mcc += E1000_READ_REG(&adapter->hw, E1000_MCC); adapter->stats.latecol += E1000_READ_REG(&adapter->hw, E1000_LATECOL); adapter->stats.colc += E1000_READ_REG(&adapter->hw, E1000_COLC); adapter->stats.dc += E1000_READ_REG(&adapter->hw, E1000_DC); adapter->stats.rlec += E1000_READ_REG(&adapter->hw, E1000_RLEC); adapter->stats.xonrxc += E1000_READ_REG(&adapter->hw, E1000_XONRXC); adapter->stats.xontxc += E1000_READ_REG(&adapter->hw, E1000_XONTXC); adapter->stats.xoffrxc += E1000_READ_REG(&adapter->hw, E1000_XOFFRXC); /* ** For watchdog management we need to know if we have been ** paused during the last interval, so capture that here. 
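	** The watchdog logic uses isc_pause_frames so that a port being
	** flow-controlled by its link partner is not mistaken for a hung
	** transmit queue.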
*/ adapter->shared->isc_pause_frames = adapter->stats.xoffrxc; adapter->stats.xofftxc += E1000_READ_REG(&adapter->hw, E1000_XOFFTXC); adapter->stats.fcruc += E1000_READ_REG(&adapter->hw, E1000_FCRUC); adapter->stats.prc64 += E1000_READ_REG(&adapter->hw, E1000_PRC64); adapter->stats.prc127 += E1000_READ_REG(&adapter->hw, E1000_PRC127); adapter->stats.prc255 += E1000_READ_REG(&adapter->hw, E1000_PRC255); adapter->stats.prc511 += E1000_READ_REG(&adapter->hw, E1000_PRC511); adapter->stats.prc1023 += E1000_READ_REG(&adapter->hw, E1000_PRC1023); adapter->stats.prc1522 += E1000_READ_REG(&adapter->hw, E1000_PRC1522); adapter->stats.gprc += E1000_READ_REG(&adapter->hw, E1000_GPRC); adapter->stats.bprc += E1000_READ_REG(&adapter->hw, E1000_BPRC); adapter->stats.mprc += E1000_READ_REG(&adapter->hw, E1000_MPRC); adapter->stats.gptc += E1000_READ_REG(&adapter->hw, E1000_GPTC); /* For the 64-bit byte counters the low dword must be read first. */ /* Both registers clear on the read of the high dword */ adapter->stats.gorc += E1000_READ_REG(&adapter->hw, E1000_GORCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GORCH) << 32); adapter->stats.gotc += E1000_READ_REG(&adapter->hw, E1000_GOTCL) + ((u64)E1000_READ_REG(&adapter->hw, E1000_GOTCH) << 32); adapter->stats.rnbc += E1000_READ_REG(&adapter->hw, E1000_RNBC); adapter->stats.ruc += E1000_READ_REG(&adapter->hw, E1000_RUC); adapter->stats.rfc += E1000_READ_REG(&adapter->hw, E1000_RFC); adapter->stats.roc += E1000_READ_REG(&adapter->hw, E1000_ROC); adapter->stats.rjc += E1000_READ_REG(&adapter->hw, E1000_RJC); adapter->stats.tor += E1000_READ_REG(&adapter->hw, E1000_TORH); adapter->stats.tot += E1000_READ_REG(&adapter->hw, E1000_TOTH); adapter->stats.tpr += E1000_READ_REG(&adapter->hw, E1000_TPR); adapter->stats.tpt += E1000_READ_REG(&adapter->hw, E1000_TPT); adapter->stats.ptc64 += E1000_READ_REG(&adapter->hw, E1000_PTC64); adapter->stats.ptc127 += E1000_READ_REG(&adapter->hw, E1000_PTC127); adapter->stats.ptc255 += E1000_READ_REG(&adapter->hw, E1000_PTC255); adapter->stats.ptc511 += E1000_READ_REG(&adapter->hw, E1000_PTC511); adapter->stats.ptc1023 += E1000_READ_REG(&adapter->hw, E1000_PTC1023); adapter->stats.ptc1522 += E1000_READ_REG(&adapter->hw, E1000_PTC1522); adapter->stats.mptc += E1000_READ_REG(&adapter->hw, E1000_MPTC); adapter->stats.bptc += E1000_READ_REG(&adapter->hw, E1000_BPTC); /* Interrupt Counts */ adapter->stats.iac += E1000_READ_REG(&adapter->hw, E1000_IAC); adapter->stats.icrxptc += E1000_READ_REG(&adapter->hw, E1000_ICRXPTC); adapter->stats.icrxatc += E1000_READ_REG(&adapter->hw, E1000_ICRXATC); adapter->stats.ictxptc += E1000_READ_REG(&adapter->hw, E1000_ICTXPTC); adapter->stats.ictxatc += E1000_READ_REG(&adapter->hw, E1000_ICTXATC); adapter->stats.ictxqec += E1000_READ_REG(&adapter->hw, E1000_ICTXQEC); adapter->stats.ictxqmtc += E1000_READ_REG(&adapter->hw, E1000_ICTXQMTC); adapter->stats.icrxdmtc += E1000_READ_REG(&adapter->hw, E1000_ICRXDMTC); adapter->stats.icrxoc += E1000_READ_REG(&adapter->hw, E1000_ICRXOC); if (adapter->hw.mac.type >= e1000_82543) { adapter->stats.algnerrc += E1000_READ_REG(&adapter->hw, E1000_ALGNERRC); adapter->stats.rxerrc += E1000_READ_REG(&adapter->hw, E1000_RXERRC); adapter->stats.tncrs += E1000_READ_REG(&adapter->hw, E1000_TNCRS); adapter->stats.cexterr += E1000_READ_REG(&adapter->hw, E1000_CEXTERR); adapter->stats.tsctc += E1000_READ_REG(&adapter->hw, E1000_TSCTC); adapter->stats.tsctfc += E1000_READ_REG(&adapter->hw, E1000_TSCTFC); } } static uint64_t em_if_get_counter(if_ctx_t ctx, ift_counter cnt) { 
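	/*
	 * Translate iflib counter queries into the hardware statistics
	 * accumulated by em_update_stats_counters(); counters we do not
	 * track explicitly fall through to the iflib defaults.
	 */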
struct adapter *adapter = iflib_get_softc(ctx); struct ifnet *ifp = iflib_get_ifp(ctx); switch (cnt) { case IFCOUNTER_COLLISIONS: return (adapter->stats.colc); case IFCOUNTER_IERRORS: return (adapter->dropped_pkts + adapter->stats.rxerrc + adapter->stats.crcerrs + adapter->stats.algnerrc + adapter->stats.ruc + adapter->stats.roc + adapter->stats.mpc + adapter->stats.cexterr); case IFCOUNTER_OERRORS: return (adapter->stats.ecol + adapter->stats.latecol + adapter->watchdog_events); default: return (if_get_counter_default(ifp, cnt)); } } /* Export a single 32-bit register via a read-only sysctl. */ static int em_sysctl_reg_handler(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; u_int val; adapter = oidp->oid_arg1; val = E1000_READ_REG(&adapter->hw, oidp->oid_arg2); return (sysctl_handle_int(oidp, &val, 0, req)); } /* * Add sysctl variables, one per statistic, to the system. */ static void em_add_hw_stats(struct adapter *adapter) { device_t dev = iflib_get_dev(adapter->ctx); struct em_tx_queue *tx_que = adapter->tx_queues; struct em_rx_queue *rx_que = adapter->rx_queues; struct sysctl_ctx_list *ctx = device_get_sysctl_ctx(dev); struct sysctl_oid *tree = device_get_sysctl_tree(dev); struct sysctl_oid_list *child = SYSCTL_CHILDREN(tree); struct e1000_hw_stats *stats = &adapter->stats; struct sysctl_oid *stat_node, *queue_node, *int_node; struct sysctl_oid_list *stat_list, *queue_list, *int_list; #define QUEUE_NAME_LEN 32 char namebuf[QUEUE_NAME_LEN]; /* Driver Statistics */ SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "dropped", CTLFLAG_RD, &adapter->dropped_pkts, "Driver dropped packets"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "link_irq", CTLFLAG_RD, &adapter->link_irq, "Link MSIX IRQ Handled"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "mbuf_defrag_fail", CTLFLAG_RD, &adapter->mbuf_defrag_failed, "Defragmenting mbuf chain failed"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "tx_dma_fail", CTLFLAG_RD, &adapter->no_tx_dma_setup, "Driver tx dma failure in xmit"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "rx_overruns", CTLFLAG_RD, &adapter->rx_overruns, "RX overruns"); SYSCTL_ADD_ULONG(ctx, child, OID_AUTO, "watchdog_timeouts", CTLFLAG_RD, &adapter->watchdog_events, "Watchdog timeouts"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "device_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_CTRL, em_sysctl_reg_handler, "IU", "Device Control Register"); SYSCTL_ADD_PROC(ctx, child, OID_AUTO, "rx_control", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RCTL, em_sysctl_reg_handler, "IU", "Receiver Control Register"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_high_water", CTLFLAG_RD, &adapter->hw.fc.high_water, 0, "Flow Control High Watermark"); SYSCTL_ADD_UINT(ctx, child, OID_AUTO, "fc_low_water", CTLFLAG_RD, &adapter->hw.fc.low_water, 0, "Flow Control Low Watermark"); for (int i = 0; i < adapter->tx_num_queues; i++, tx_que++) { struct tx_ring *txr = &tx_que->txr; snprintf(namebuf, QUEUE_NAME_LEN, "queue_tx_%d", i); queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "TX Queue Name"); queue_list = SYSCTL_CHILDREN(queue_node); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDH(txr->me), em_sysctl_reg_handler, "IU", "Transmit Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "txd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_TDT(txr->me), em_sysctl_reg_handler, "IU", "Transmit Descriptor Tail"); SYSCTL_ADD_ULONG(ctx, queue_list, OID_AUTO, "tx_irq", CTLFLAG_RD, &txr->tx_irq, "Queue MSI-X Transmit Interrupts"); } for (int j = 0; j < 
adapter->rx_num_queues; j++, rx_que++) { struct rx_ring *rxr = &rx_que->rxr; snprintf(namebuf, QUEUE_NAME_LEN, "queue_rx_%d", j); queue_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "RX Queue Name"); queue_list = SYSCTL_CHILDREN(queue_node); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_head", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDH(rxr->me), em_sysctl_reg_handler, "IU", "Receive Descriptor Head"); SYSCTL_ADD_PROC(ctx, queue_list, OID_AUTO, "rxd_tail", CTLTYPE_UINT | CTLFLAG_RD, adapter, E1000_RDT(rxr->me), em_sysctl_reg_handler, "IU", "Receive Descriptor Tail"); SYSCTL_ADD_ULONG(ctx, queue_list, OID_AUTO, "rx_irq", CTLFLAG_RD, &rxr->rx_irq, "Queue MSI-X Receive Interrupts"); } /* MAC stats get their own sub node */ stat_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "mac_stats", CTLFLAG_RD, NULL, "Statistics"); stat_list = SYSCTL_CHILDREN(stat_node); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "excess_coll", CTLFLAG_RD, &stats->ecol, "Excessive collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "single_coll", CTLFLAG_RD, &stats->scc, "Single collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "multiple_coll", CTLFLAG_RD, &stats->mcc, "Multiple collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "late_coll", CTLFLAG_RD, &stats->latecol, "Late collisions"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "collision_count", CTLFLAG_RD, &stats->colc, "Collision Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "symbol_errors", CTLFLAG_RD, &adapter->stats.symerrs, "Symbol Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "sequence_errors", CTLFLAG_RD, &adapter->stats.sec, "Sequence Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "defer_count", CTLFLAG_RD, &adapter->stats.dc, "Defer Count"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "missed_packets", CTLFLAG_RD, &adapter->stats.mpc, "Missed Packets"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_no_buff", CTLFLAG_RD, &adapter->stats.rnbc, "Receive No Buffers"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_undersize", CTLFLAG_RD, &adapter->stats.ruc, "Receive Undersize"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_fragmented", CTLFLAG_RD, &adapter->stats.rfc, "Fragmented Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_oversize", CTLFLAG_RD, &adapter->stats.roc, "Oversized Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_jabber", CTLFLAG_RD, &adapter->stats.rjc, "Recevied Jabber"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "recv_errs", CTLFLAG_RD, &adapter->stats.rxerrc, "Receive Errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "crc_errs", CTLFLAG_RD, &adapter->stats.crcerrs, "CRC errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "alignment_errs", CTLFLAG_RD, &adapter->stats.algnerrc, "Alignment Errors"); /* On 82575 these are collision counts */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "coll_ext_errs", CTLFLAG_RD, &adapter->stats.cexterr, "Collision/Carrier extension errors"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_recvd", CTLFLAG_RD, &adapter->stats.xonrxc, "XON Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xon_txd", CTLFLAG_RD, &adapter->stats.xontxc, "XON Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_recvd", CTLFLAG_RD, &adapter->stats.xoffrxc, "XOFF Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "xoff_txd", CTLFLAG_RD, &adapter->stats.xofftxc, "XOFF Transmitted"); /* Packet Reception Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_recvd", CTLFLAG_RD, 
&adapter->stats.tpr, "Total Packets Received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_recvd", CTLFLAG_RD, &adapter->stats.gprc, "Good Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.bprc, "Broadcast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_recvd", CTLFLAG_RD, &adapter->stats.mprc, "Multicast Packets Received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_64", CTLFLAG_RD, &adapter->stats.prc64, "64 byte frames received "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_65_127", CTLFLAG_RD, &adapter->stats.prc127, "65-127 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_128_255", CTLFLAG_RD, &adapter->stats.prc255, "128-255 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_256_511", CTLFLAG_RD, &adapter->stats.prc511, "256-511 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_512_1023", CTLFLAG_RD, &adapter->stats.prc1023, "512-1023 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "rx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.prc1522, "1023-1522 byte frames received"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_recvd", CTLFLAG_RD, &adapter->stats.gorc, "Good Octets Received"); /* Packet Transmission Stats */ SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_octets_txd", CTLFLAG_RD, &adapter->stats.gotc, "Good Octets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "total_pkts_txd", CTLFLAG_RD, &adapter->stats.tpt, "Total Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "good_pkts_txd", CTLFLAG_RD, &adapter->stats.gptc, "Good Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "bcast_pkts_txd", CTLFLAG_RD, &adapter->stats.bptc, "Broadcast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "mcast_pkts_txd", CTLFLAG_RD, &adapter->stats.mptc, "Multicast Packets Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_64", CTLFLAG_RD, &adapter->stats.ptc64, "64 byte frames transmitted "); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_65_127", CTLFLAG_RD, &adapter->stats.ptc127, "65-127 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_128_255", CTLFLAG_RD, &adapter->stats.ptc255, "128-255 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_256_511", CTLFLAG_RD, &adapter->stats.ptc511, "256-511 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_512_1023", CTLFLAG_RD, &adapter->stats.ptc1023, "512-1023 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tx_frames_1024_1522", CTLFLAG_RD, &adapter->stats.ptc1522, "1024-1522 byte frames transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_txd", CTLFLAG_RD, &adapter->stats.tsctc, "TSO Contexts Transmitted"); SYSCTL_ADD_UQUAD(ctx, stat_list, OID_AUTO, "tso_ctx_fail", CTLFLAG_RD, &adapter->stats.tsctfc, "TSO Contexts Failed"); /* Interrupt Stats */ int_node = SYSCTL_ADD_NODE(ctx, child, OID_AUTO, "interrupts", CTLFLAG_RD, NULL, "Interrupt Statistics"); int_list = SYSCTL_CHILDREN(int_node); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "asserts", CTLFLAG_RD, &adapter->stats.iac, "Interrupt Assertion Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_pkt_timer", CTLFLAG_RD, &adapter->stats.icrxptc, "Interrupt Cause Rx Pkt Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_abs_timer", CTLFLAG_RD, 
&adapter->stats.icrxatc, "Interrupt Cause Rx Abs Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_pkt_timer", CTLFLAG_RD, &adapter->stats.ictxptc, "Interrupt Cause Tx Pkt Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_abs_timer", CTLFLAG_RD, &adapter->stats.ictxatc, "Interrupt Cause Tx Abs Timer Expire Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_queue_empty", CTLFLAG_RD, &adapter->stats.ictxqec, "Interrupt Cause Tx Queue Empty Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "tx_queue_min_thresh", CTLFLAG_RD, &adapter->stats.ictxqmtc, "Interrupt Cause Tx Queue Min Thresh Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_desc_min_thresh", CTLFLAG_RD, &adapter->stats.icrxdmtc, "Interrupt Cause Rx Desc Min Thresh Count"); SYSCTL_ADD_UQUAD(ctx, int_list, OID_AUTO, "rx_overrun", CTLFLAG_RD, &adapter->stats.icrxoc, "Interrupt Cause Receiver Overrun Count"); } /********************************************************************** * * This routine provides a way to dump out the adapter eeprom, * often a useful debug/service tool. This only dumps the first * 32 words, stuff that matters is in that extent. * **********************************************************************/ static int em_sysctl_nvm_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *)arg1; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); /* * This value will cause a hex dump of the * first 32 16-bit words of the EEPROM to * the screen. */ if (result == 1) em_print_nvm_info(adapter); return (error); } static void em_print_nvm_info(struct adapter *adapter) { u16 eeprom_data; int i, j, row = 0; /* Its a bit crude, but it gets the job done */ printf("\nInterface EEPROM Dump:\n"); printf("Offset\n0x0000 "); for (i = 0, j = 0; i < 32; i++, j++) { if (j == 8) { /* Make the offset block */ j = 0; ++row; printf("\n0x00%x0 ",row); } e1000_read_nvm(&adapter->hw, i, 1, &eeprom_data); printf("%04x ", eeprom_data); } printf("\n"); } static int em_sysctl_int_delay(SYSCTL_HANDLER_ARGS) { struct em_int_delay_info *info; struct adapter *adapter; u32 regval; int error, usecs, ticks; info = (struct em_int_delay_info *) arg1; usecs = info->value; error = sysctl_handle_int(oidp, &usecs, 0, req); if (error != 0 || req->newptr == NULL) return (error); if (usecs < 0 || usecs > EM_TICKS_TO_USECS(65535)) return (EINVAL); info->value = usecs; ticks = EM_USECS_TO_TICKS(usecs); if (info->offset == E1000_ITR) /* units are 256ns here */ ticks *= 4; adapter = info->adapter; regval = E1000_READ_OFFSET(&adapter->hw, info->offset); regval = (regval & ~0xffff) | (ticks & 0xffff); /* Handle a few special cases. */ switch (info->offset) { case E1000_RDTR: break; case E1000_TIDV: if (ticks == 0) { adapter->txd_cmd &= ~E1000_TXD_CMD_IDE; /* Don't write 0 into the TIDV register. 
*/ regval++; } else adapter->txd_cmd |= E1000_TXD_CMD_IDE; break; } E1000_WRITE_OFFSET(&adapter->hw, info->offset, regval); return (0); } static void em_add_int_delay_sysctl(struct adapter *adapter, const char *name, const char *description, struct em_int_delay_info *info, int offset, int value) { info->adapter = adapter; info->offset = offset; info->value = value; SYSCTL_ADD_PROC(device_get_sysctl_ctx(adapter->dev), SYSCTL_CHILDREN(device_get_sysctl_tree(adapter->dev)), OID_AUTO, name, CTLTYPE_INT|CTLFLAG_RW, info, 0, em_sysctl_int_delay, "I", description); } /* * Set flow control using sysctl: * Flow control values: * 0 - off * 1 - rx pause * 2 - tx pause * 3 - full */ static int em_set_flowcntl(SYSCTL_HANDLER_ARGS) { int error; static int input = 3; /* default is full */ struct adapter *adapter = (struct adapter *) arg1; error = sysctl_handle_int(oidp, &input, 0, req); if ((error) || (req->newptr == NULL)) return (error); if (input == adapter->fc) /* no change? */ return (error); switch (input) { case e1000_fc_rx_pause: case e1000_fc_tx_pause: case e1000_fc_full: case e1000_fc_none: adapter->hw.fc.requested_mode = input; adapter->fc = input; break; default: /* Do nothing */ return (error); } adapter->hw.fc.current_mode = adapter->hw.fc.requested_mode; e1000_force_mac_fc(&adapter->hw); return (error); } /* * Manage Energy Efficient Ethernet: * Control values: * 0/1 - enabled/disabled */ static int em_sysctl_eee(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error, value; value = adapter->hw.dev_spec.ich8lan.eee_disable; error = sysctl_handle_int(oidp, &value, 0, req); if (error || req->newptr == NULL) return (error); adapter->hw.dev_spec.ich8lan.eee_disable = (value != 0); em_if_init(adapter->ctx); return (0); } static int em_sysctl_debug_info(SYSCTL_HANDLER_ARGS) { struct adapter *adapter; int error; int result; result = -1; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); if (result == 1) { adapter = (struct adapter *) arg1; em_print_debug_info(adapter); } return (error); } static int em_get_rs(SYSCTL_HANDLER_ARGS) { struct adapter *adapter = (struct adapter *) arg1; int error; int result; result = 0; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr || result != 1) return (error); em_dump_rs(adapter); return (error); } static void em_if_debug(if_ctx_t ctx) { em_dump_rs(iflib_get_softc(ctx)); } /* * This routine is meant to be fluid, add whatever is * needed for debugging a problem. 
-jfv */ static void em_print_debug_info(struct adapter *adapter) { device_t dev = iflib_get_dev(adapter->ctx); struct ifnet *ifp = iflib_get_ifp(adapter->ctx); struct tx_ring *txr = &adapter->tx_queues->txr; struct rx_ring *rxr = &adapter->rx_queues->rxr; if (if_getdrvflags(ifp) & IFF_DRV_RUNNING) printf("Interface is RUNNING "); else printf("Interface is NOT RUNNING\n"); if (if_getdrvflags(ifp) & IFF_DRV_OACTIVE) printf("and INACTIVE\n"); else printf("and ACTIVE\n"); for (int i = 0; i < adapter->tx_num_queues; i++, txr++) { device_printf(dev, "TX Queue %d ------\n", i); device_printf(dev, "hw tdh = %d, hw tdt = %d\n", E1000_READ_REG(&adapter->hw, E1000_TDH(i)), E1000_READ_REG(&adapter->hw, E1000_TDT(i))); } for (int j=0; j < adapter->rx_num_queues; j++, rxr++) { device_printf(dev, "RX Queue %d ------\n", j); device_printf(dev, "hw rdh = %d, hw rdt = %d\n", E1000_READ_REG(&adapter->hw, E1000_RDH(j)), E1000_READ_REG(&adapter->hw, E1000_RDT(j))); } } /* * 82574 only: * Write a new value to the EEPROM increasing the number of MSIX * vectors from 3 to 5, for proper multiqueue support. */ static void em_enable_vectors_82574(if_ctx_t ctx) { struct adapter *adapter = iflib_get_softc(ctx); struct e1000_hw *hw = &adapter->hw; device_t dev = iflib_get_dev(ctx); u16 edata; e1000_read_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata); printf("Current cap: %#06x\n", edata); if (((edata & EM_NVM_MSIX_N_MASK) >> EM_NVM_MSIX_N_SHIFT) != 4) { device_printf(dev, "Writing to eeprom: increasing " "reported MSIX vectors from 3 to 5...\n"); edata &= ~(EM_NVM_MSIX_N_MASK); edata |= 4 << EM_NVM_MSIX_N_SHIFT; e1000_write_nvm(hw, EM_NVM_PCIE_CTRL, 1, &edata); e1000_update_nvm_checksum(hw); device_printf(dev, "Writing to eeprom: done\n"); } } Index: projects/openssl111/sys/dev/e1000/igb_txrx.c =================================================================== --- projects/openssl111/sys/dev/e1000/igb_txrx.c (revision 339239) +++ projects/openssl111/sys/dev/e1000/igb_txrx.c (revision 339240) @@ -1,584 +1,584 @@ /*- * Copyright (c) 2016 Matthew Macy * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ /* $FreeBSD$ */ #include "if_em.h" #ifdef RSS #include #include #endif #ifdef VERBOSE_DEBUG #define DPRINTF device_printf #else #define DPRINTF(...) 
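/* Without VERBOSE_DEBUG, DPRINTF() expands to nothing. */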
#endif /********************************************************************* * Local Function prototypes *********************************************************************/ static int igb_isc_txd_encap(void *arg, if_pkt_info_t pi); static void igb_isc_txd_flush(void *arg, uint16_t txqid, qidx_t pidx); static int igb_isc_txd_credits_update(void *arg, uint16_t txqid, bool clear); static void igb_isc_rxd_refill(void *arg, if_rxd_update_t iru); static void igb_isc_rxd_flush(void *arg, uint16_t rxqid, uint8_t flid __unused, qidx_t pidx); static int igb_isc_rxd_available(void *arg, uint16_t rxqid, qidx_t idx, qidx_t budget); static int igb_isc_rxd_pkt_get(void *arg, if_rxd_info_t ri); static int igb_tx_ctx_setup(struct tx_ring *txr, if_pkt_info_t pi, u32 *cmd_type_len, u32 *olinfo_status); static int igb_tso_setup(struct tx_ring *txr, if_pkt_info_t pi, u32 *cmd_type_len, u32 *olinfo_status); static void igb_rx_checksum(u32 staterr, if_rxd_info_t ri, u32 ptype); static int igb_determine_rsstype(u16 pkt_info); extern void igb_if_enable_intr(if_ctx_t ctx); extern int em_intr(void *arg); struct if_txrx igb_txrx = { .ift_txd_encap = igb_isc_txd_encap, .ift_txd_flush = igb_isc_txd_flush, .ift_txd_credits_update = igb_isc_txd_credits_update, .ift_rxd_available = igb_isc_rxd_available, .ift_rxd_pkt_get = igb_isc_rxd_pkt_get, .ift_rxd_refill = igb_isc_rxd_refill, .ift_rxd_flush = igb_isc_rxd_flush, .ift_legacy_intr = em_intr }; extern if_shared_ctx_t em_sctx; /********************************************************************** * * Setup work for hardware segmentation offload (TSO) on * adapters using advanced tx descriptors * **********************************************************************/ static int igb_tso_setup(struct tx_ring *txr, if_pkt_info_t pi, u32 *cmd_type_len, u32 *olinfo_status) { struct e1000_adv_tx_context_desc *TXD; struct adapter *adapter = txr->adapter; u32 type_tucmd_mlhl = 0, vlan_macip_lens = 0; u32 mss_l4len_idx = 0; u32 paylen; switch(pi->ipi_etype) { case ETHERTYPE_IPV6: type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV6; break; case ETHERTYPE_IP: type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4; /* Tell transmit desc to also do IPv4 checksum. 
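		 * (IXSM). IPv6 has no header checksum, so this is only set on
		 * the IPv4 TSO path.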
*/ *olinfo_status |= E1000_TXD_POPTS_IXSM << 8; break; default: panic("%s: CSUM_TSO but no supported IP version (0x%04x)", __func__, ntohs(pi->ipi_etype)); break; } TXD = (struct e1000_adv_tx_context_desc *) &txr->tx_base[pi->ipi_pidx]; /* This is used in the transmit desc in encap */ paylen = pi->ipi_len - pi->ipi_ehdrlen - pi->ipi_ip_hlen - pi->ipi_tcp_hlen; /* VLAN MACLEN IPLEN */ if (pi->ipi_mflags & M_VLANTAG) { vlan_macip_lens |= (pi->ipi_vtag << E1000_ADVTXD_VLAN_SHIFT); } vlan_macip_lens |= pi->ipi_ehdrlen << E1000_ADVTXD_MACLEN_SHIFT; vlan_macip_lens |= pi->ipi_ip_hlen; TXD->vlan_macip_lens = htole32(vlan_macip_lens); /* ADV DTYPE TUCMD */ type_tucmd_mlhl |= E1000_ADVTXD_DCMD_DEXT | E1000_ADVTXD_DTYP_CTXT; type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP; TXD->type_tucmd_mlhl = htole32(type_tucmd_mlhl); /* MSS L4LEN IDX */ mss_l4len_idx |= (pi->ipi_tso_segsz << E1000_ADVTXD_MSS_SHIFT); mss_l4len_idx |= (pi->ipi_tcp_hlen << E1000_ADVTXD_L4LEN_SHIFT); /* 82575 needs the queue index added */ if (adapter->hw.mac.type == e1000_82575) mss_l4len_idx |= txr->me << 4; TXD->mss_l4len_idx = htole32(mss_l4len_idx); TXD->seqnum_seed = htole32(0); *cmd_type_len |= E1000_ADVTXD_DCMD_TSE; *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; *olinfo_status |= paylen << E1000_ADVTXD_PAYLEN_SHIFT; return (1); } /********************************************************************* * * Advanced Context Descriptor setup for VLAN, CSUM or TSO * **********************************************************************/ static int igb_tx_ctx_setup(struct tx_ring *txr, if_pkt_info_t pi, u32 *cmd_type_len, u32 *olinfo_status) { struct e1000_adv_tx_context_desc *TXD; struct adapter *adapter = txr->adapter; u32 vlan_macip_lens, type_tucmd_mlhl; u32 mss_l4len_idx; mss_l4len_idx = vlan_macip_lens = type_tucmd_mlhl = 0; - int offload = TRUE; /* First check if TSO is to be used */ if (pi->ipi_csum_flags & CSUM_TSO) return (igb_tso_setup(txr, pi, cmd_type_len, olinfo_status)); /* Indicate the whole packet as payload when not doing TSO */ *olinfo_status |= pi->ipi_len << E1000_ADVTXD_PAYLEN_SHIFT; /* Now ready a context descriptor */ TXD = (struct e1000_adv_tx_context_desc *) &txr->tx_base[pi->ipi_pidx]; /* ** In advanced descriptors the vlan tag must ** be placed into the context descriptor. Hence ** we need to make one even if not doing offloads. 
*/ if (pi->ipi_mflags & M_VLANTAG) { vlan_macip_lens |= (pi->ipi_vtag << E1000_ADVTXD_VLAN_SHIFT); } else if ((pi->ipi_csum_flags & IGB_CSUM_OFFLOAD) == 0) { return (0); } /* Set the ether header length */ vlan_macip_lens |= pi->ipi_ehdrlen << E1000_ADVTXD_MACLEN_SHIFT; switch(pi->ipi_etype) { case ETHERTYPE_IP: type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4; break; case ETHERTYPE_IPV6: type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV6; break; default: - offload = FALSE; break; } vlan_macip_lens |= pi->ipi_ip_hlen; type_tucmd_mlhl |= E1000_ADVTXD_DCMD_DEXT | E1000_ADVTXD_DTYP_CTXT; switch (pi->ipi_ipproto) { case IPPROTO_TCP: - if (pi->ipi_csum_flags & (CSUM_IP_TCP | CSUM_IP6_TCP)) + if (pi->ipi_csum_flags & (CSUM_IP_TCP | CSUM_IP6_TCP)) { type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_TCP; + *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; + } break; case IPPROTO_UDP: - if (pi->ipi_csum_flags & (CSUM_IP_UDP | CSUM_IP6_UDP)) + if (pi->ipi_csum_flags & (CSUM_IP_UDP | CSUM_IP6_UDP)) { type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_UDP; + *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; + } break; case IPPROTO_SCTP: - if (pi->ipi_csum_flags & (CSUM_IP_SCTP | CSUM_IP6_SCTP)) + if (pi->ipi_csum_flags & (CSUM_IP_SCTP | CSUM_IP6_SCTP)) { type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_L4T_SCTP; + *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; + } break; default: - offload = FALSE; break; } - - if (offload) /* For the TX descriptor setup */ - *olinfo_status |= E1000_TXD_POPTS_TXSM << 8; /* 82575 needs the queue index added */ if (adapter->hw.mac.type == e1000_82575) mss_l4len_idx = txr->me << 4; /* Now copy bits into descriptor */ TXD->vlan_macip_lens = htole32(vlan_macip_lens); TXD->type_tucmd_mlhl = htole32(type_tucmd_mlhl); TXD->seqnum_seed = htole32(0); TXD->mss_l4len_idx = htole32(mss_l4len_idx); return (1); } static int igb_isc_txd_encap(void *arg, if_pkt_info_t pi) { struct adapter *sc = arg; if_softc_ctx_t scctx = sc->shared; struct em_tx_queue *que = &sc->tx_queues[pi->ipi_qsidx]; struct tx_ring *txr = &que->txr; int nsegs = pi->ipi_nsegs; bus_dma_segment_t *segs = pi->ipi_segs; union e1000_adv_tx_desc *txd = NULL; int i, j, pidx_last; u32 olinfo_status, cmd_type_len, txd_flags; qidx_t ntxd; pidx_last = olinfo_status = 0; /* Basic descriptor defines */ cmd_type_len = (E1000_ADVTXD_DTYP_DATA | E1000_ADVTXD_DCMD_IFCS | E1000_ADVTXD_DCMD_DEXT); if (pi->ipi_mflags & M_VLANTAG) cmd_type_len |= E1000_ADVTXD_DCMD_VLE; i = pi->ipi_pidx; ntxd = scctx->isc_ntxd[0]; txd_flags = pi->ipi_flags & IPI_TX_INTR ? 
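/* request a Report Status write-back only for packets iflib flagged with IPI_TX_INTR */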
E1000_ADVTXD_DCMD_RS : 0; /* Consume the first descriptor */ i += igb_tx_ctx_setup(txr, pi, &cmd_type_len, &olinfo_status); if (i == scctx->isc_ntxd[0]) i = 0; /* 82575 needs the queue index added */ if (sc->hw.mac.type == e1000_82575) olinfo_status |= txr->me << 4; for (j = 0; j < nsegs; j++) { bus_size_t seglen; bus_addr_t segaddr; txd = (union e1000_adv_tx_desc *)&txr->tx_base[i]; seglen = segs[j].ds_len; segaddr = htole64(segs[j].ds_addr); txd->read.buffer_addr = segaddr; txd->read.cmd_type_len = htole32(E1000_TXD_CMD_IFCS | cmd_type_len | seglen); txd->read.olinfo_status = htole32(olinfo_status); pidx_last = i; if (++i == scctx->isc_ntxd[0]) { i = 0; } } if (txd_flags) { txr->tx_rsq[txr->tx_rs_pidx] = pidx_last; txr->tx_rs_pidx = (txr->tx_rs_pidx+1) & (ntxd-1); MPASS(txr->tx_rs_pidx != txr->tx_rs_cidx); } txd->read.cmd_type_len |= htole32(E1000_TXD_CMD_EOP | txd_flags); pi->ipi_new_pidx = i; return (0); } static void igb_isc_txd_flush(void *arg, uint16_t txqid, qidx_t pidx) { struct adapter *adapter = arg; struct em_tx_queue *que = &adapter->tx_queues[txqid]; struct tx_ring *txr = &que->txr; E1000_WRITE_REG(&adapter->hw, E1000_TDT(txr->me), pidx); } static int igb_isc_txd_credits_update(void *arg, uint16_t txqid, bool clear) { struct adapter *adapter = arg; if_softc_ctx_t scctx = adapter->shared; struct em_tx_queue *que = &adapter->tx_queues[txqid]; struct tx_ring *txr = &que->txr; qidx_t processed = 0; int updated; qidx_t cur, prev, ntxd, rs_cidx; int32_t delta; uint8_t status; rs_cidx = txr->tx_rs_cidx; if (rs_cidx == txr->tx_rs_pidx) return (0); cur = txr->tx_rsq[rs_cidx]; status = ((union e1000_adv_tx_desc *)&txr->tx_base[cur])->wb.status; updated = !!(status & E1000_TXD_STAT_DD); if (!clear || !updated) return (updated); prev = txr->tx_cidx_processed; ntxd = scctx->isc_ntxd[0]; do { delta = (int32_t)cur - (int32_t)prev; MPASS(prev == 0 || delta != 0); if (delta < 0) delta += ntxd; processed += delta; prev = cur; rs_cidx = (rs_cidx + 1) & (ntxd-1); if (rs_cidx == txr->tx_rs_pidx) break; cur = txr->tx_rsq[rs_cidx]; status = ((union e1000_adv_tx_desc *)&txr->tx_base[cur])->wb.status; } while ((status & E1000_TXD_STAT_DD)); txr->tx_rs_cidx = rs_cidx; txr->tx_cidx_processed = prev; return (processed); } static void igb_isc_rxd_refill(void *arg, if_rxd_update_t iru) { struct adapter *sc = arg; if_softc_ctx_t scctx = sc->shared; uint16_t rxqid = iru->iru_qsidx; struct em_rx_queue *que = &sc->rx_queues[rxqid]; union e1000_adv_rx_desc *rxd; struct rx_ring *rxr = &que->rxr; uint64_t *paddrs; uint32_t next_pidx, pidx; uint16_t count; int i; paddrs = iru->iru_paddrs; pidx = iru->iru_pidx; count = iru->iru_count; for (i = 0, next_pidx = pidx; i < count; i++) { rxd = (union e1000_adv_rx_desc *)&rxr->rx_base[next_pidx]; rxd->read.pkt_addr = htole64(paddrs[i]); if (++next_pidx == scctx->isc_nrxd[0]) next_pidx = 0; } } static void igb_isc_rxd_flush(void *arg, uint16_t rxqid, uint8_t flid __unused, qidx_t pidx) { struct adapter *sc = arg; struct em_rx_queue *que = &sc->rx_queues[rxqid]; struct rx_ring *rxr = &que->rxr; E1000_WRITE_REG(&sc->hw, E1000_RDT(rxr->me), pidx); } static int igb_isc_rxd_available(void *arg, uint16_t rxqid, qidx_t idx, qidx_t budget) { struct adapter *sc = arg; if_softc_ctx_t scctx = sc->shared; struct em_rx_queue *que = &sc->rx_queues[rxqid]; struct rx_ring *rxr = &que->rxr; union e1000_adv_rx_desc *rxd; u32 staterr = 0; int cnt, i, iter; if (budget == 1) { rxd = (union e1000_adv_rx_desc *)&rxr->rx_base[idx]; staterr = le32toh(rxd->wb.upper.status_error); return (staterr & 
E1000_RXD_STAT_DD); } for (iter = cnt = 0, i = idx; iter < scctx->isc_nrxd[0] && iter <= budget;) { rxd = (union e1000_adv_rx_desc *)&rxr->rx_base[i]; staterr = le32toh(rxd->wb.upper.status_error); if ((staterr & E1000_RXD_STAT_DD) == 0) break; if (++i == scctx->isc_nrxd[0]) { i = 0; } if (staterr & E1000_RXD_STAT_EOP) cnt++; iter++; } return (cnt); } /**************************************************************** * Routine sends data which has been dma'ed into host memory * to upper layer. Initialize ri structure. * * Returns 0 upon success, errno on failure ***************************************************************/ static int igb_isc_rxd_pkt_get(void *arg, if_rxd_info_t ri) { struct adapter *adapter = arg; if_softc_ctx_t scctx = adapter->shared; struct em_rx_queue *que = &adapter->rx_queues[ri->iri_qsidx]; struct rx_ring *rxr = &que->rxr; struct ifnet *ifp = iflib_get_ifp(adapter->ctx); union e1000_adv_rx_desc *rxd; u16 pkt_info, len; u16 vtag = 0; u32 ptype; u32 staterr = 0; bool eop; int i = 0; int cidx = ri->iri_cidx; do { rxd = (union e1000_adv_rx_desc *)&rxr->rx_base[cidx]; staterr = le32toh(rxd->wb.upper.status_error); pkt_info = le16toh(rxd->wb.lower.lo_dword.hs_rss.pkt_info); MPASS ((staterr & E1000_RXD_STAT_DD) != 0); len = le16toh(rxd->wb.upper.length); ptype = le32toh(rxd->wb.lower.lo_dword.data) & IGB_PKTTYPE_MASK; ri->iri_len += len; rxr->rx_bytes += ri->iri_len; rxd->wb.upper.status_error = 0; eop = ((staterr & E1000_RXD_STAT_EOP) == E1000_RXD_STAT_EOP); if (((adapter->hw.mac.type == e1000_i350) || (adapter->hw.mac.type == e1000_i354)) && (staterr & E1000_RXDEXT_STATERR_LB)) vtag = be16toh(rxd->wb.upper.vlan); else vtag = le16toh(rxd->wb.upper.vlan); /* Make sure bad packets are discarded */ if (eop && ((staterr & E1000_RXDEXT_ERR_FRAME_ERR_MASK) != 0)) { adapter->dropped_pkts++; ++rxr->rx_discarded; return (EBADMSG); } ri->iri_frags[i].irf_flid = 0; ri->iri_frags[i].irf_idx = cidx; ri->iri_frags[i].irf_len = len; if (++cidx == scctx->isc_nrxd[0]) cidx = 0; #ifdef notyet if (rxr->hdr_split == TRUE) { ri->iri_frags[i].irf_flid = 1; ri->iri_frags[i].irf_idx = cidx; if (++cidx == scctx->isc_nrxd[0]) cidx = 0; } #endif i++; } while (!eop); rxr->rx_packets++; if ((ifp->if_capenable & IFCAP_RXCSUM) != 0) igb_rx_checksum(staterr, ri, ptype); if ((ifp->if_capenable & IFCAP_VLAN_HWTAGGING) != 0 && (staterr & E1000_RXD_STAT_VP) != 0) { ri->iri_vtag = vtag; ri->iri_flags |= M_VLANTAG; } ri->iri_flowid = le32toh(rxd->wb.lower.hi_dword.rss); ri->iri_rsstype = igb_determine_rsstype(pkt_info); ri->iri_nfrags = i; return (0); } /********************************************************************* * * Verify that the hardware indicated that the checksum is valid. * Inform the stack about the status of checksum so that stack * doesn't spend time verifying the checksum. * *********************************************************************/ static void igb_rx_checksum(u32 staterr, if_rxd_info_t ri, u32 ptype) { u16 status = (u16)staterr; u8 errors = (u8) (staterr >> 24); bool sctp = FALSE; /* Ignore Checksum bit is set */ if (status & E1000_RXD_STAT_IXSM) { ri->iri_csum_flags = 0; return; } if ((ptype & E1000_RXDADV_PKTTYPE_ETQF) == 0 && (ptype & E1000_RXDADV_PKTTYPE_SCTP) != 0) sctp = 1; else sctp = 0; if (status & E1000_RXD_STAT_IPCS) { /* Did it pass? 
*/ if (!(errors & E1000_RXD_ERR_IPE)) { /* IP Checksum Good */ ri->iri_csum_flags = CSUM_IP_CHECKED; ri->iri_csum_flags |= CSUM_IP_VALID; } else ri->iri_csum_flags = 0; } if (status & (E1000_RXD_STAT_TCPCS | E1000_RXD_STAT_UDPCS)) { u64 type = (CSUM_DATA_VALID | CSUM_PSEUDO_HDR); if (sctp) /* reassign */ type = CSUM_SCTP_VALID; /* Did it pass? */ if (!(errors & E1000_RXD_ERR_TCPE)) { ri->iri_csum_flags |= type; if (sctp == 0) ri->iri_csum_data = htons(0xffff); } } return; } /******************************************************************** * * Parse the packet type to determine the appropriate hash * ******************************************************************/ static int igb_determine_rsstype(u16 pkt_info) { switch (pkt_info & E1000_RXDADV_RSSTYPE_MASK) { case E1000_RXDADV_RSSTYPE_IPV4_TCP: return M_HASHTYPE_RSS_TCP_IPV4; case E1000_RXDADV_RSSTYPE_IPV4: return M_HASHTYPE_RSS_IPV4; case E1000_RXDADV_RSSTYPE_IPV6_TCP: return M_HASHTYPE_RSS_TCP_IPV6; case E1000_RXDADV_RSSTYPE_IPV6_EX: return M_HASHTYPE_RSS_IPV6_EX; case E1000_RXDADV_RSSTYPE_IPV6: return M_HASHTYPE_RSS_IPV6; case E1000_RXDADV_RSSTYPE_IPV6_TCP_EX: return M_HASHTYPE_RSS_TCP_IPV6_EX; default: return M_HASHTYPE_OPAQUE; } } Index: projects/openssl111/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c =================================================================== --- projects/openssl111/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c (revision 339239) +++ projects/openssl111/sys/dev/mlx4/mlx4_en/mlx4_en_netdev.c (revision 339240) @@ -1,2881 +1,2891 @@ /* * Copyright (c) 2007, 2014 Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include #include #ifdef CONFIG_NET_RX_BUSY_POLL #include #endif #include #include #include #include #include #include #include #include #include "en.h" #include "en_port.h" static void mlx4_en_sysctl_stat(struct mlx4_en_priv *priv); static void mlx4_en_sysctl_conf(struct mlx4_en_priv *priv); #ifdef CONFIG_NET_RX_BUSY_POLL /* must be called with local_bh_disable()d */ static int mlx4_en_low_latency_recv(struct napi_struct *napi) { struct mlx4_en_cq *cq = container_of(napi, struct mlx4_en_cq, napi); struct net_device *dev = cq->dev; struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_rx_ring *rx_ring = priv->rx_ring[cq->ring]; int done; if (!priv->port_up) return LL_FLUSH_FAILED; if (!mlx4_en_cq_lock_poll(cq)) return LL_FLUSH_BUSY; done = mlx4_en_process_rx_cq(dev, cq, 4); #ifdef LL_EXTENDED_STATS if (likely(done)) rx_ring->cleaned += done; else rx_ring->misses++; #endif mlx4_en_cq_unlock_poll(cq); return done; } #endif /* CONFIG_NET_RX_BUSY_POLL */ #ifdef CONFIG_RFS_ACCEL struct mlx4_en_filter { struct list_head next; struct work_struct work; u8 ip_proto; __be32 src_ip; __be32 dst_ip; __be16 src_port; __be16 dst_port; int rxq_index; struct mlx4_en_priv *priv; u32 flow_id; /* RFS infrastructure id */ int id; /* mlx4_en driver id */ u64 reg_id; /* Flow steering API id */ u8 activated; /* Used to prevent expiry before filter * is attached */ struct hlist_node filter_chain; }; static void mlx4_en_filter_rfs_expire(struct mlx4_en_priv *priv); static enum mlx4_net_trans_rule_id mlx4_ip_proto_to_trans_rule_id(u8 ip_proto) { switch (ip_proto) { case IPPROTO_UDP: return MLX4_NET_TRANS_RULE_ID_UDP; case IPPROTO_TCP: return MLX4_NET_TRANS_RULE_ID_TCP; default: return MLX4_NET_TRANS_RULE_NUM; } }; static void mlx4_en_filter_work(struct work_struct *work) { struct mlx4_en_filter *filter = container_of(work, struct mlx4_en_filter, work); struct mlx4_en_priv *priv = filter->priv; struct mlx4_spec_list spec_tcp_udp = { .id = mlx4_ip_proto_to_trans_rule_id(filter->ip_proto), { .tcp_udp = { .dst_port = filter->dst_port, .dst_port_msk = (__force __be16)-1, .src_port = filter->src_port, .src_port_msk = (__force __be16)-1, }, }, }; struct mlx4_spec_list spec_ip = { .id = MLX4_NET_TRANS_RULE_ID_IPV4, { .ipv4 = { .dst_ip = filter->dst_ip, .dst_ip_msk = (__force __be32)-1, .src_ip = filter->src_ip, .src_ip_msk = (__force __be32)-1, }, }, }; struct mlx4_spec_list spec_eth = { .id = MLX4_NET_TRANS_RULE_ID_ETH, }; struct mlx4_net_trans_rule rule = { .list = LIST_HEAD_INIT(rule.list), .queue_mode = MLX4_NET_TRANS_Q_LIFO, .exclusive = 1, .allow_loopback = 1, .promisc_mode = MLX4_FS_REGULAR, .port = priv->port, .priority = MLX4_DOMAIN_RFS, }; int rc; __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16); if (spec_tcp_udp.id >= MLX4_NET_TRANS_RULE_NUM) { en_warn(priv, "RFS: ignoring unsupported ip protocol (%d)\n", filter->ip_proto); goto ignore; } list_add_tail(&spec_eth.list, &rule.list); list_add_tail(&spec_ip.list, &rule.list); list_add_tail(&spec_tcp_udp.list, &rule.list); rule.qpn = priv->rss_map.qps[filter->rxq_index].qpn; memcpy(spec_eth.eth.dst_mac, priv->dev->dev_addr, ETH_ALEN); memcpy(spec_eth.eth.dst_mac_msk, &mac_mask, ETH_ALEN); filter->activated = 0; if (filter->reg_id) { rc = mlx4_flow_detach(priv->mdev->dev, filter->reg_id); if (rc && rc != -ENOENT) en_err(priv, "Error detaching flow. rc = %d\n", rc); } rc = mlx4_flow_attach(priv->mdev->dev, &rule, &filter->reg_id); if (rc) en_err(priv, "Error attaching flow. 
err = %d\n", rc); ignore: mlx4_en_filter_rfs_expire(priv); filter->activated = 1; } static inline struct hlist_head * filter_hash_bucket(struct mlx4_en_priv *priv, __be32 src_ip, __be32 dst_ip, __be16 src_port, __be16 dst_port) { unsigned long l; int bucket_idx; l = (__force unsigned long)src_port | ((__force unsigned long)dst_port << 2); l ^= (__force unsigned long)(src_ip ^ dst_ip); bucket_idx = hash_long(l, MLX4_EN_FILTER_HASH_SHIFT); return &priv->filter_hash[bucket_idx]; } static struct mlx4_en_filter * mlx4_en_filter_alloc(struct mlx4_en_priv *priv, int rxq_index, __be32 src_ip, __be32 dst_ip, u8 ip_proto, __be16 src_port, __be16 dst_port, u32 flow_id) { struct mlx4_en_filter *filter = NULL; filter = kzalloc(sizeof(struct mlx4_en_filter), GFP_ATOMIC); if (!filter) return NULL; filter->priv = priv; filter->rxq_index = rxq_index; INIT_WORK(&filter->work, mlx4_en_filter_work); filter->src_ip = src_ip; filter->dst_ip = dst_ip; filter->ip_proto = ip_proto; filter->src_port = src_port; filter->dst_port = dst_port; filter->flow_id = flow_id; filter->id = priv->last_filter_id++ % RPS_NO_FILTER; list_add_tail(&filter->next, &priv->filters); hlist_add_head(&filter->filter_chain, filter_hash_bucket(priv, src_ip, dst_ip, src_port, dst_port)); return filter; } static void mlx4_en_filter_free(struct mlx4_en_filter *filter) { struct mlx4_en_priv *priv = filter->priv; int rc; list_del(&filter->next); rc = mlx4_flow_detach(priv->mdev->dev, filter->reg_id); if (rc && rc != -ENOENT) en_err(priv, "Error detaching flow. rc = %d\n", rc); kfree(filter); } static inline struct mlx4_en_filter * mlx4_en_filter_find(struct mlx4_en_priv *priv, __be32 src_ip, __be32 dst_ip, u8 ip_proto, __be16 src_port, __be16 dst_port) { struct mlx4_en_filter *filter; struct mlx4_en_filter *ret = NULL; hlist_for_each_entry(filter, filter_hash_bucket(priv, src_ip, dst_ip, src_port, dst_port), filter_chain) { if (filter->src_ip == src_ip && filter->dst_ip == dst_ip && filter->ip_proto == ip_proto && filter->src_port == src_port && filter->dst_port == dst_port) { ret = filter; break; } } return ret; } static int mlx4_en_filter_rfs(struct net_device *net_dev, const struct sk_buff *skb, u16 rxq_index, u32 flow_id) { struct mlx4_en_priv *priv = netdev_priv(net_dev); struct mlx4_en_filter *filter; const struct iphdr *ip; const __be16 *ports; u8 ip_proto; __be32 src_ip; __be32 dst_ip; __be16 src_port; __be16 dst_port; int nhoff = skb_network_offset(skb); int ret = 0; if (skb->protocol != htons(ETH_P_IP)) return -EPROTONOSUPPORT; ip = (const struct iphdr *)(skb->data + nhoff); if (ip_is_fragment(ip)) return -EPROTONOSUPPORT; if ((ip->protocol != IPPROTO_TCP) && (ip->protocol != IPPROTO_UDP)) return -EPROTONOSUPPORT; ports = (const __be16 *)(skb->data + nhoff + 4 * ip->ihl); ip_proto = ip->protocol; src_ip = ip->saddr; dst_ip = ip->daddr; src_port = ports[0]; dst_port = ports[1]; spin_lock_bh(&priv->filters_lock); filter = mlx4_en_filter_find(priv, src_ip, dst_ip, ip_proto, src_port, dst_port); if (filter) { if (filter->rxq_index == rxq_index) goto out; filter->rxq_index = rxq_index; } else { filter = mlx4_en_filter_alloc(priv, rxq_index, src_ip, dst_ip, ip_proto, src_port, dst_port, flow_id); if (!filter) { ret = -ENOMEM; goto err; } } queue_work(priv->mdev->workqueue, &filter->work); out: ret = filter->id; err: spin_unlock_bh(&priv->filters_lock); return ret; } void mlx4_en_cleanup_filters(struct mlx4_en_priv *priv) { struct mlx4_en_filter *filter, *tmp; LIST_HEAD(del_list); spin_lock_bh(&priv->filters_lock); 
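/* Move every installed filter to a private list under the lock, then cancel its work and free it outside the lock. */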
list_for_each_entry_safe(filter, tmp, &priv->filters, next) { list_move(&filter->next, &del_list); hlist_del(&filter->filter_chain); } spin_unlock_bh(&priv->filters_lock); list_for_each_entry_safe(filter, tmp, &del_list, next) { cancel_work_sync(&filter->work); mlx4_en_filter_free(filter); } } static void mlx4_en_filter_rfs_expire(struct mlx4_en_priv *priv) { struct mlx4_en_filter *filter = NULL, *tmp, *last_filter = NULL; LIST_HEAD(del_list); int i = 0; spin_lock_bh(&priv->filters_lock); list_for_each_entry_safe(filter, tmp, &priv->filters, next) { if (i > MLX4_EN_FILTER_EXPIRY_QUOTA) break; if (filter->activated && !work_pending(&filter->work) && rps_may_expire_flow(priv->dev, filter->rxq_index, filter->flow_id, filter->id)) { list_move(&filter->next, &del_list); hlist_del(&filter->filter_chain); } else last_filter = filter; i++; } if (last_filter && (&last_filter->next != priv->filters.next)) list_move(&priv->filters, &last_filter->next); spin_unlock_bh(&priv->filters_lock); list_for_each_entry_safe(filter, tmp, &del_list, next) mlx4_en_filter_free(filter); } #endif static void mlx4_en_vlan_rx_add_vid(void *arg, struct net_device *dev, u16 vid) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int err; int idx; if (arg != priv) return; en_dbg(HW, priv, "adding VLAN:%d\n", vid); set_bit(vid, priv->active_vlans); /* Add VID to port VLAN filter */ mutex_lock(&mdev->state_lock); if (mdev->device_up && priv->port_up) { err = mlx4_SET_VLAN_FLTR(mdev->dev, priv); if (err) en_err(priv, "Failed configuring VLAN filter\n"); } if (mlx4_register_vlan(mdev->dev, priv->port, vid, &idx)) en_dbg(HW, priv, "failed adding vlan %d\n", vid); mutex_unlock(&mdev->state_lock); } static void mlx4_en_vlan_rx_kill_vid(void *arg, struct net_device *dev, u16 vid) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int err; if (arg != priv) return; en_dbg(HW, priv, "Killing VID:%d\n", vid); clear_bit(vid, priv->active_vlans); /* Remove VID from port VLAN filter */ mutex_lock(&mdev->state_lock); mlx4_unregister_vlan(mdev->dev, priv->port, vid); if (mdev->device_up && priv->port_up) { err = mlx4_SET_VLAN_FLTR(mdev->dev, priv); if (err) en_err(priv, "Failed configuring VLAN filter\n"); } mutex_unlock(&mdev->state_lock); } static int mlx4_en_tunnel_steer_add(struct mlx4_en_priv *priv, unsigned char *addr, int qpn, u64 *reg_id) { int err; if (priv->mdev->dev->caps.tunnel_offload_mode != MLX4_TUNNEL_OFFLOAD_MODE_VXLAN || priv->mdev->dev->caps.dmfs_high_steer_mode == MLX4_STEERING_DMFS_A0_STATIC) return 0; /* do nothing */ err = mlx4_tunnel_steer_add(priv->mdev->dev, addr, priv->port, qpn, MLX4_DOMAIN_NIC, reg_id); if (err) { en_err(priv, "failed to add vxlan steering rule, err %d\n", err); return err; } en_dbg(DRV, priv, "added vxlan steering rule, mac %pM reg_id %llx\n", addr, (long long)*reg_id); return 0; } static int mlx4_en_uc_steer_add(struct mlx4_en_priv *priv, unsigned char *mac, int *qpn, u64 *reg_id) { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_dev *dev = mdev->dev; int err; switch (dev->caps.steering_mode) { case MLX4_STEERING_MODE_B0: { struct mlx4_qp qp; u8 gid[16] = {0}; qp.qpn = *qpn; memcpy(&gid[10], mac, ETH_ALEN); gid[5] = priv->port; err = mlx4_unicast_attach(dev, &qp, gid, 0, MLX4_PROT_ETH); break; } case MLX4_STEERING_MODE_DEVICE_MANAGED: { struct mlx4_spec_list spec_eth = { {NULL} }; __be64 mac_mask = cpu_to_be64(MLX4_MAC_MASK << 16); struct mlx4_net_trans_rule rule = { .queue_mode = MLX4_NET_TRANS_Q_FIFO, .exclusive = 
0, .allow_loopback = 1, .promisc_mode = MLX4_FS_REGULAR, .priority = MLX4_DOMAIN_NIC, }; rule.port = priv->port; rule.qpn = *qpn; INIT_LIST_HEAD(&rule.list); spec_eth.id = MLX4_NET_TRANS_RULE_ID_ETH; memcpy(spec_eth.eth.dst_mac, mac, ETH_ALEN); memcpy(spec_eth.eth.dst_mac_msk, &mac_mask, ETH_ALEN); list_add_tail(&spec_eth.list, &rule.list); err = mlx4_flow_attach(dev, &rule, reg_id); break; } default: return -EINVAL; } if (err) en_warn(priv, "Failed Attaching Unicast\n"); return err; } static void mlx4_en_uc_steer_release(struct mlx4_en_priv *priv, unsigned char *mac, int qpn, u64 reg_id) { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_dev *dev = mdev->dev; switch (dev->caps.steering_mode) { case MLX4_STEERING_MODE_B0: { struct mlx4_qp qp; u8 gid[16] = {0}; qp.qpn = qpn; memcpy(&gid[10], mac, ETH_ALEN); gid[5] = priv->port; mlx4_unicast_detach(dev, &qp, gid, MLX4_PROT_ETH); break; } case MLX4_STEERING_MODE_DEVICE_MANAGED: { mlx4_flow_detach(dev, reg_id); break; } default: en_err(priv, "Invalid steering mode.\n"); } } static int mlx4_en_get_qp(struct mlx4_en_priv *priv) { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_dev *dev = mdev->dev; int index = 0; int err = 0; int *qpn = &priv->base_qpn; u64 mac = mlx4_mac_to_u64(IF_LLADDR(priv->dev)); en_dbg(DRV, priv, "Registering MAC: %pM for adding\n", IF_LLADDR(priv->dev)); index = mlx4_register_mac(dev, priv->port, mac); if (index < 0) { err = index; en_err(priv, "Failed adding MAC: %pM\n", IF_LLADDR(priv->dev)); return err; } if (dev->caps.steering_mode == MLX4_STEERING_MODE_A0) { int base_qpn = mlx4_get_base_qpn(dev, priv->port); *qpn = base_qpn + index; return 0; } err = mlx4_qp_reserve_range(dev, 1, 1, qpn, MLX4_RESERVE_A0_QP); en_dbg(DRV, priv, "Reserved qp %d\n", *qpn); if (err) { en_err(priv, "Failed to reserve qp for mac registration\n"); mlx4_unregister_mac(dev, priv->port, mac); return err; } return 0; } static void mlx4_en_put_qp(struct mlx4_en_priv *priv) { struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_dev *dev = mdev->dev; int qpn = priv->base_qpn; if (dev->caps.steering_mode == MLX4_STEERING_MODE_A0) { u64 mac = mlx4_mac_to_u64(IF_LLADDR(priv->dev)); en_dbg(DRV, priv, "Registering MAC: %pM for deleting\n", IF_LLADDR(priv->dev)); mlx4_unregister_mac(dev, priv->port, mac); } else { en_dbg(DRV, priv, "Releasing qp: port %d, qpn %d\n", priv->port, qpn); mlx4_qp_release_range(dev, qpn, 1); priv->flags &= ~MLX4_EN_FLAG_FORCE_PROMISC; } } static void mlx4_en_clear_uclist(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_addr_list *tmp, *uc_to_del; list_for_each_entry_safe(uc_to_del, tmp, &priv->uc_list, list) { list_del(&uc_to_del->list); kfree(uc_to_del); } } static void mlx4_en_cache_uclist(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_addr_list *tmp; struct ifaddr *ifa; mlx4_en_clear_uclist(dev); if_addr_rlock(dev); CK_STAILQ_FOREACH(ifa, &dev->if_addrhead, ifa_link) { if (ifa->ifa_addr->sa_family != AF_LINK) continue; if (((struct sockaddr_dl *)ifa->ifa_addr)->sdl_alen != ETHER_ADDR_LEN) continue; tmp = kzalloc(sizeof(struct mlx4_en_addr_list), GFP_ATOMIC); if (tmp == NULL) { en_err(priv, "Failed to allocate address list\n"); break; } memcpy(tmp->addr, LLADDR((struct sockaddr_dl *)ifa->ifa_addr), ETH_ALEN); list_add_tail(&tmp->list, &priv->uc_list); } if_addr_runlock(dev); } static void mlx4_en_clear_mclist(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_addr_list *tmp, *mc_to_del; 
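/* Drop all entries from the cached multicast list. */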
list_for_each_entry_safe(mc_to_del, tmp, &priv->mc_list, list) { list_del(&mc_to_del->list); kfree(mc_to_del); } } static void mlx4_en_cache_mclist(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_addr_list *tmp; struct ifmultiaddr *ifma; mlx4_en_clear_mclist(dev); if_maddr_rlock(dev); CK_STAILQ_FOREACH(ifma, &dev->if_multiaddrs, ifma_link) { if (ifma->ifma_addr->sa_family != AF_LINK) continue; if (((struct sockaddr_dl *)ifma->ifma_addr)->sdl_alen != ETHER_ADDR_LEN) continue; tmp = kzalloc(sizeof(struct mlx4_en_addr_list), GFP_ATOMIC); if (tmp == NULL) { en_err(priv, "Failed to allocate address list\n"); break; } memcpy(tmp->addr, LLADDR((struct sockaddr_dl *)ifma->ifma_addr), ETH_ALEN); list_add_tail(&tmp->list, &priv->mc_list); } if_maddr_runlock(dev); } static void update_addr_list_flags(struct mlx4_en_priv *priv, struct list_head *dst, struct list_head *src) { struct mlx4_en_addr_list *dst_tmp, *src_tmp, *new_mc; bool found; /* Find all the entries that should be removed from dst, * These are the entries that are not found in src */ list_for_each_entry(dst_tmp, dst, list) { found = false; list_for_each_entry(src_tmp, src, list) { if (!memcmp(dst_tmp->addr, src_tmp->addr, ETH_ALEN)) { found = true; break; } } if (!found) dst_tmp->action = MLX4_ADDR_LIST_REM; } /* Add entries that exist in src but not in dst * mark them as need to add */ list_for_each_entry(src_tmp, src, list) { found = false; list_for_each_entry(dst_tmp, dst, list) { if (!memcmp(dst_tmp->addr, src_tmp->addr, ETH_ALEN)) { dst_tmp->action = MLX4_ADDR_LIST_NONE; found = true; break; } } if (!found) { new_mc = kmalloc(sizeof(struct mlx4_en_addr_list), GFP_KERNEL); if (!new_mc) { en_err(priv, "Failed to allocate current multicast list\n"); return; } memcpy(new_mc, src_tmp, sizeof(struct mlx4_en_addr_list)); new_mc->action = MLX4_ADDR_LIST_ADD; list_add_tail(&new_mc->list, dst); } } } static void mlx4_en_set_rx_mode(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); if (!priv->port_up) return; queue_work(priv->mdev->workqueue, &priv->rx_mode_task); } static void mlx4_en_set_promisc_mode(struct mlx4_en_priv *priv, struct mlx4_en_dev *mdev) { int err = 0; if (!(priv->flags & MLX4_EN_FLAG_PROMISC)) { priv->flags |= MLX4_EN_FLAG_PROMISC; /* Enable promiscouos mode */ switch (mdev->dev->caps.steering_mode) { case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_add(mdev->dev, priv->port, priv->base_qpn, MLX4_FS_ALL_DEFAULT); if (err) en_err(priv, "Failed enabling promiscuous mode\n"); priv->flags |= MLX4_EN_FLAG_MC_PROMISC; break; case MLX4_STEERING_MODE_B0: err = mlx4_unicast_promisc_add(mdev->dev, priv->base_qpn, priv->port); if (err) en_err(priv, "Failed enabling unicast promiscuous mode\n"); /* Add the default qp number as multicast * promisc */ if (!(priv->flags & MLX4_EN_FLAG_MC_PROMISC)) { err = mlx4_multicast_promisc_add(mdev->dev, priv->base_qpn, priv->port); if (err) en_err(priv, "Failed enabling multicast promiscuous mode\n"); priv->flags |= MLX4_EN_FLAG_MC_PROMISC; } break; case MLX4_STEERING_MODE_A0: err = mlx4_SET_PORT_qpn_calc(mdev->dev, priv->port, priv->base_qpn, 1); if (err) en_err(priv, "Failed enabling promiscuous mode\n"); break; } /* Disable port multicast filter (unconditionally) */ err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 0, MLX4_MCAST_DISABLE); if (err) en_err(priv, "Failed disabling multicast filter\n"); } } static void mlx4_en_clear_promisc_mode(struct mlx4_en_priv *priv, struct mlx4_en_dev *mdev) { int err = 0; 
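/* Tear down promiscuous steering according to the device's steering mode. */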
priv->flags &= ~MLX4_EN_FLAG_PROMISC; /* Disable promiscouos mode */ switch (mdev->dev->caps.steering_mode) { case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, MLX4_FS_ALL_DEFAULT); if (err) en_err(priv, "Failed disabling promiscuous mode\n"); priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC; break; case MLX4_STEERING_MODE_B0: err = mlx4_unicast_promisc_remove(mdev->dev, priv->base_qpn, priv->port); if (err) en_err(priv, "Failed disabling unicast promiscuous mode\n"); /* Disable Multicast promisc */ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) { err = mlx4_multicast_promisc_remove(mdev->dev, priv->base_qpn, priv->port); if (err) en_err(priv, "Failed disabling multicast promiscuous mode\n"); priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC; } break; case MLX4_STEERING_MODE_A0: err = mlx4_SET_PORT_qpn_calc(mdev->dev, priv->port, priv->base_qpn, 0); if (err) en_err(priv, "Failed disabling promiscuous mode\n"); break; } } static void mlx4_en_do_multicast(struct mlx4_en_priv *priv, struct net_device *dev, struct mlx4_en_dev *mdev) { struct mlx4_en_addr_list *addr_list, *tmp; u8 mc_list[16] = {0}; int err = 0; u64 mcast_addr = 0; /* Enable/disable the multicast filter according to IFF_ALLMULTI */ if (dev->if_flags & IFF_ALLMULTI) { err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 0, MLX4_MCAST_DISABLE); if (err) en_err(priv, "Failed disabling multicast filter\n"); /* Add the default qp number as multicast promisc */ if (!(priv->flags & MLX4_EN_FLAG_MC_PROMISC)) { switch (mdev->dev->caps.steering_mode) { case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_add(mdev->dev, priv->port, priv->base_qpn, MLX4_FS_MC_DEFAULT); break; case MLX4_STEERING_MODE_B0: err = mlx4_multicast_promisc_add(mdev->dev, priv->base_qpn, priv->port); break; case MLX4_STEERING_MODE_A0: break; } if (err) en_err(priv, "Failed entering multicast promisc mode\n"); priv->flags |= MLX4_EN_FLAG_MC_PROMISC; } } else { /* Disable Multicast promisc */ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) { switch (mdev->dev->caps.steering_mode) { case MLX4_STEERING_MODE_DEVICE_MANAGED: err = mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, MLX4_FS_MC_DEFAULT); break; case MLX4_STEERING_MODE_B0: err = mlx4_multicast_promisc_remove(mdev->dev, priv->base_qpn, priv->port); break; case MLX4_STEERING_MODE_A0: break; } if (err) en_err(priv, "Failed disabling multicast promiscuous mode\n"); priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC; } - /* Update unicast list */ - mlx4_en_cache_uclist(dev); - - update_addr_list_flags(priv, &priv->curr_uc_list, &priv->uc_list); - - list_for_each_entry_safe(addr_list, tmp, &priv->curr_uc_list, list) { - if (addr_list->action == MLX4_ADDR_LIST_REM) { - mlx4_en_uc_steer_release(priv, addr_list->addr, - priv->rss_map.indir_qp.qpn, - addr_list->reg_id); - /* remove from list */ - list_del(&addr_list->list); - kfree(addr_list); - } else if (addr_list->action == MLX4_ADDR_LIST_ADD) { - err = mlx4_en_uc_steer_add(priv, addr_list->addr, - &priv->rss_map.indir_qp.qpn, - &addr_list->reg_id); - if (err) - en_err(priv, "Fail to add unicast address\n"); - } - } - err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 0, MLX4_MCAST_DISABLE); if (err) en_err(priv, "Failed disabling multicast filter\n"); /* Flush mcast filter and init it with broadcast address */ mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, ETH_BCAST, 1, MLX4_MCAST_CONFIG); /* Update multicast list - we cache all addresses so they won't * change while HW is updated holding the command semaphor */ 
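/* Program the port multicast filter from the cached list, then attach/detach steering rules for added and removed addresses. */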
mlx4_en_cache_mclist(dev); list_for_each_entry(addr_list, &priv->mc_list, list) { mcast_addr = mlx4_mac_to_u64(addr_list->addr); mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, mcast_addr, 0, MLX4_MCAST_CONFIG); } err = mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 0, MLX4_MCAST_ENABLE); if (err) en_err(priv, "Failed enabling multicast filter\n"); update_addr_list_flags(priv, &priv->curr_mc_list, &priv->mc_list); list_for_each_entry_safe(addr_list, tmp, &priv->curr_mc_list, list) { if (addr_list->action == MLX4_ADDR_LIST_REM) { /* detach this address and delete from list */ memcpy(&mc_list[10], addr_list->addr, ETH_ALEN); mc_list[5] = priv->port; err = mlx4_multicast_detach(mdev->dev, &priv->rss_map.indir_qp, mc_list, MLX4_PROT_ETH, addr_list->reg_id); if (err) en_err(priv, "Fail to detach multicast address\n"); if (addr_list->tunnel_reg_id) { err = mlx4_flow_detach(priv->mdev->dev, addr_list->tunnel_reg_id); if (err) en_err(priv, "Failed to detach multicast address\n"); } /* remove from list */ list_del(&addr_list->list); kfree(addr_list); } else if (addr_list->action == MLX4_ADDR_LIST_ADD) { /* attach the address */ memcpy(&mc_list[10], addr_list->addr, ETH_ALEN); /* needed for B0 steering support */ mc_list[5] = priv->port; err = mlx4_multicast_attach(mdev->dev, &priv->rss_map.indir_qp, mc_list, priv->port, 0, MLX4_PROT_ETH, &addr_list->reg_id); if (err) en_err(priv, "Fail to attach multicast address\n"); err = mlx4_en_tunnel_steer_add(priv, &mc_list[10], priv->base_qpn, &addr_list->tunnel_reg_id); if (err) en_err(priv, "Failed to attach multicast address\n"); } } } } +static void mlx4_en_do_unicast(struct mlx4_en_priv *priv, + struct net_device *dev, + struct mlx4_en_dev *mdev) +{ + struct mlx4_en_addr_list *addr_list, *tmp; + int err; + + /* Update unicast list */ + mlx4_en_cache_uclist(dev); + + update_addr_list_flags(priv, &priv->curr_uc_list, &priv->uc_list); + + list_for_each_entry_safe(addr_list, tmp, &priv->curr_uc_list, list) { + if (addr_list->action == MLX4_ADDR_LIST_REM) { + mlx4_en_uc_steer_release(priv, addr_list->addr, + priv->rss_map.indir_qp.qpn, + addr_list->reg_id); + /* remove from list */ + list_del(&addr_list->list); + kfree(addr_list); + } else if (addr_list->action == MLX4_ADDR_LIST_ADD) { + err = mlx4_en_uc_steer_add(priv, addr_list->addr, + &priv->rss_map.indir_qp.qpn, + &addr_list->reg_id); + if (err) + en_err(priv, "Fail to add unicast address\n"); + } + } +} + static void mlx4_en_do_set_rx_mode(struct work_struct *work) { struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv, rx_mode_task); struct mlx4_en_dev *mdev = priv->mdev; struct net_device *dev = priv->dev; mutex_lock(&mdev->state_lock); if (!mdev->device_up) { en_dbg(HW, priv, "Card is not up, ignoring rx mode change.\n"); goto out; } if (!priv->port_up) { en_dbg(HW, priv, "Port is down, ignoring rx mode change.\n"); goto out; } if (!mlx4_en_QUERY_PORT(mdev, priv->port)) { if (priv->port_state.link_state) { priv->last_link_state = MLX4_DEV_EVENT_PORT_UP; /* update netif baudrate */ priv->dev->if_baudrate = IF_Mbps(priv->port_state.link_speed); /* Important note: the following call for if_link_state_change * is needed for interface up scenario (start port, link state * change) */ if_link_state_change(priv->dev, LINK_STATE_UP); en_dbg(HW, priv, "Link Up\n"); } } + /* Set unicast rules */ + mlx4_en_do_unicast(priv, dev, mdev); + /* Promsicuous mode: disable all filters */ if ((dev->if_flags & IFF_PROMISC) || (priv->flags & MLX4_EN_FLAG_FORCE_PROMISC)) { mlx4_en_set_promisc_mode(priv, mdev); - 
goto out; + } else if (priv->flags & MLX4_EN_FLAG_PROMISC) { + /* Not in promiscuous mode */ + mlx4_en_clear_promisc_mode(priv, mdev); } - /* Not in promiscuous mode */ - if (priv->flags & MLX4_EN_FLAG_PROMISC) - mlx4_en_clear_promisc_mode(priv, mdev); - + /* Set multicast rules */ mlx4_en_do_multicast(priv, dev, mdev); out: mutex_unlock(&mdev->state_lock); } static void mlx4_en_watchdog_timeout(void *arg) { struct mlx4_en_priv *priv = arg; struct mlx4_en_dev *mdev = priv->mdev; en_dbg(DRV, priv, "Scheduling watchdog\n"); queue_work(mdev->workqueue, &priv->watchdog_task); if (priv->port_up) callout_reset(&priv->watchdog_timer, MLX4_EN_WATCHDOG_TIMEOUT, mlx4_en_watchdog_timeout, priv); } static void mlx4_en_set_default_moderation(struct mlx4_en_priv *priv) { struct mlx4_en_cq *cq; int i; /* If we haven't received a specific coalescing setting * (module param), we set the moderation parameters as follows: * - moder_cnt is set to the number of mtu sized packets to * satisfy our coalescing target. * - moder_time is set to a fixed value. */ priv->rx_frames = MLX4_EN_RX_COAL_TARGET; priv->rx_usecs = MLX4_EN_RX_COAL_TIME; priv->tx_frames = MLX4_EN_TX_COAL_PKTS; priv->tx_usecs = MLX4_EN_TX_COAL_TIME; en_dbg(INTR, priv, "Default coalesing params for mtu: %u - " "rx_frames:%d rx_usecs:%d\n", (unsigned)priv->dev->if_mtu, priv->rx_frames, priv->rx_usecs); /* Setup cq moderation params */ for (i = 0; i < priv->rx_ring_num; i++) { cq = priv->rx_cq[i]; cq->moder_cnt = priv->rx_frames; cq->moder_time = priv->rx_usecs; priv->last_moder_time[i] = MLX4_EN_AUTO_CONF; priv->last_moder_packets[i] = 0; priv->last_moder_bytes[i] = 0; } for (i = 0; i < priv->tx_ring_num; i++) { cq = priv->tx_cq[i]; cq->moder_cnt = priv->tx_frames; cq->moder_time = priv->tx_usecs; } /* Reset auto-moderation params */ priv->pkt_rate_low = MLX4_EN_RX_RATE_LOW; priv->rx_usecs_low = MLX4_EN_RX_COAL_TIME_LOW; priv->pkt_rate_high = MLX4_EN_RX_RATE_HIGH; priv->rx_usecs_high = MLX4_EN_RX_COAL_TIME_HIGH; priv->sample_interval = MLX4_EN_SAMPLE_INTERVAL; priv->adaptive_rx_coal = 1; priv->last_moder_jiffies = 0; priv->last_moder_tx_packets = 0; } static void mlx4_en_auto_moderation(struct mlx4_en_priv *priv) { unsigned long period = (unsigned long) (jiffies - priv->last_moder_jiffies); struct mlx4_en_cq *cq; unsigned long packets; unsigned long rate; unsigned long avg_pkt_size; unsigned long rx_packets; unsigned long rx_bytes; unsigned long rx_pkt_diff; int moder_time; int ring, err; if (!priv->adaptive_rx_coal || period < priv->sample_interval * HZ) return; for (ring = 0; ring < priv->rx_ring_num; ring++) { spin_lock(&priv->stats_lock); rx_packets = priv->rx_ring[ring]->packets; rx_bytes = priv->rx_ring[ring]->bytes; spin_unlock(&priv->stats_lock); rx_pkt_diff = ((unsigned long) (rx_packets - priv->last_moder_packets[ring])); packets = rx_pkt_diff; rate = packets * HZ / period; avg_pkt_size = packets ? 
((unsigned long) (rx_bytes - priv->last_moder_bytes[ring])) / packets : 0; /* Apply auto-moderation only when packet rate * exceeds a rate that it matters */ if (rate > (MLX4_EN_RX_RATE_THRESH / priv->rx_ring_num) && avg_pkt_size > MLX4_EN_AVG_PKT_SMALL) { if (rate < priv->pkt_rate_low) moder_time = priv->rx_usecs_low; else if (rate > priv->pkt_rate_high) moder_time = priv->rx_usecs_high; else moder_time = (rate - priv->pkt_rate_low) * (priv->rx_usecs_high - priv->rx_usecs_low) / (priv->pkt_rate_high - priv->pkt_rate_low) + priv->rx_usecs_low; } else { moder_time = priv->rx_usecs_low; } if (moder_time != priv->last_moder_time[ring]) { priv->last_moder_time[ring] = moder_time; cq = priv->rx_cq[ring]; cq->moder_time = moder_time; cq->moder_cnt = priv->rx_frames; err = mlx4_en_set_cq_moder(priv, cq); if (err) en_err(priv, "Failed modifying moderation for cq:%d\n", ring); } priv->last_moder_packets[ring] = rx_packets; priv->last_moder_bytes[ring] = rx_bytes; } priv->last_moder_jiffies = jiffies; } static void mlx4_en_do_get_stats(struct work_struct *work) { struct delayed_work *delay = to_delayed_work(work); struct mlx4_en_priv *priv = container_of(delay, struct mlx4_en_priv, stats_task); struct mlx4_en_dev *mdev = priv->mdev; int err; mutex_lock(&mdev->state_lock); if (mdev->device_up) { if (priv->port_up) { if (mlx4_is_slave(mdev->dev)) err = mlx4_en_get_vport_stats(mdev, priv->port); else err = mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 0); if (err) en_dbg(HW, priv, "Could not update stats\n"); mlx4_en_auto_moderation(priv); } queue_delayed_work(mdev->workqueue, &priv->stats_task, STATS_DELAY); } mutex_unlock(&mdev->state_lock); } /* mlx4_en_service_task - Run service task for tasks that needed to be done * periodically */ static void mlx4_en_service_task(struct work_struct *work) { struct delayed_work *delay = to_delayed_work(work); struct mlx4_en_priv *priv = container_of(delay, struct mlx4_en_priv, service_task); struct mlx4_en_dev *mdev = priv->mdev; mutex_lock(&mdev->state_lock); if (mdev->device_up) { queue_delayed_work(mdev->workqueue, &priv->service_task, SERVICE_TASK_DELAY); } mutex_unlock(&mdev->state_lock); } static void mlx4_en_linkstate(struct work_struct *work) { struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv, linkstate_task); struct mlx4_en_dev *mdev = priv->mdev; int linkstate = priv->link_state; mutex_lock(&mdev->state_lock); /* If observable port state changed set carrier state and * report to system log */ if (priv->last_link_state != linkstate) { if (linkstate == MLX4_DEV_EVENT_PORT_DOWN) { en_info(priv, "Link Down\n"); if_link_state_change(priv->dev, LINK_STATE_DOWN); /* update netif baudrate */ priv->dev->if_baudrate = 0; /* make sure the port is up before notifying the OS. * This is tricky since we get here on INIT_PORT and * in such case we can't tell the OS the port is up. * To solve this there is a call to if_link_state_change * in set_rx_mode. 
* */ } else if (priv->port_up && (linkstate == MLX4_DEV_EVENT_PORT_UP)){ if (mlx4_en_QUERY_PORT(priv->mdev, priv->port)) en_info(priv, "Query port failed\n"); priv->dev->if_baudrate = IF_Mbps(priv->port_state.link_speed); en_info(priv, "Link Up\n"); if_link_state_change(priv->dev, LINK_STATE_UP); } } priv->last_link_state = linkstate; mutex_unlock(&mdev->state_lock); } int mlx4_en_start_port(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_en_cq *cq; struct mlx4_en_tx_ring *tx_ring; int rx_index = 0; int tx_index = 0; int err = 0; int i; int j; u8 mc_list[16] = {0}; if (priv->port_up) { en_dbg(DRV, priv, "start port called while port already up\n"); return 0; } INIT_LIST_HEAD(&priv->mc_list); INIT_LIST_HEAD(&priv->uc_list); INIT_LIST_HEAD(&priv->curr_mc_list); INIT_LIST_HEAD(&priv->curr_uc_list); INIT_LIST_HEAD(&priv->ethtool_list); /* Calculate Rx buf size */ dev->if_mtu = min(dev->if_mtu, priv->max_mtu); mlx4_en_calc_rx_buf(dev); en_dbg(DRV, priv, "Rx buf size:%d\n", priv->rx_mb_size); /* Configure rx cq's and rings */ err = mlx4_en_activate_rx_rings(priv); if (err) { en_err(priv, "Failed to activate RX rings\n"); return err; } for (i = 0; i < priv->rx_ring_num; i++) { cq = priv->rx_cq[i]; mlx4_en_cq_init_lock(cq); err = mlx4_en_activate_cq(priv, cq, i); if (err) { en_err(priv, "Failed activating Rx CQ\n"); goto cq_err; } for (j = 0; j < cq->size; j++) cq->buf[j].owner_sr_opcode = MLX4_CQE_OWNER_MASK; err = mlx4_en_set_cq_moder(priv, cq); if (err) { en_err(priv, "Failed setting cq moderation parameters"); mlx4_en_deactivate_cq(priv, cq); goto cq_err; } mlx4_en_arm_cq(priv, cq); priv->rx_ring[i]->cqn = cq->mcq.cqn; ++rx_index; } /* Set qp number */ en_dbg(DRV, priv, "Getting qp number for port %d\n", priv->port); err = mlx4_en_get_qp(priv); if (err) { en_err(priv, "Failed getting eth qp\n"); goto cq_err; } mdev->mac_removed[priv->port] = 0; priv->counter_index = mlx4_get_default_counter_index(mdev->dev, priv->port); err = mlx4_en_config_rss_steer(priv); if (err) { en_err(priv, "Failed configuring rss steering\n"); goto mac_err; } err = mlx4_en_create_drop_qp(priv); if (err) goto rss_err; /* Configure tx cq's and rings */ for (i = 0; i < priv->tx_ring_num; i++) { /* Configure cq */ cq = priv->tx_cq[i]; err = mlx4_en_activate_cq(priv, cq, i); if (err) { en_err(priv, "Failed activating Tx CQ\n"); goto tx_err; } err = mlx4_en_set_cq_moder(priv, cq); if (err) { en_err(priv, "Failed setting cq moderation parameters"); mlx4_en_deactivate_cq(priv, cq); goto tx_err; } en_dbg(DRV, priv, "Resetting index of collapsed CQ:%d to -1\n", i); cq->buf->wqe_index = cpu_to_be16(0xffff); /* Configure ring */ tx_ring = priv->tx_ring[i]; err = mlx4_en_activate_tx_ring(priv, tx_ring, cq->mcq.cqn, i / priv->num_tx_rings_p_up); if (err) { en_err(priv, "Failed activating Tx ring %d\n", i); mlx4_en_deactivate_cq(priv, cq); goto tx_err; } /* Arm CQ for TX completions */ mlx4_en_arm_cq(priv, cq); /* Set initial ownership of all Tx TXBBs to SW (1) */ for (j = 0; j < tx_ring->buf_size; j += STAMP_STRIDE) *((u32 *) (tx_ring->buf + j)) = INIT_OWNER_BIT; ++tx_index; } /* Configure port */ err = mlx4_SET_PORT_general(mdev->dev, priv->port, priv->rx_mb_size, priv->prof->tx_pause, priv->prof->tx_ppp, priv->prof->rx_pause, priv->prof->rx_ppp); if (err) { en_err(priv, "Failed setting port general configurations for port %d, with error %d\n", priv->port, err); goto tx_err; } /* Set default qp number */ err = mlx4_SET_PORT_qpn_calc(mdev->dev, priv->port, 
priv->base_qpn, 0); if (err) { en_err(priv, "Failed setting default qp numbers\n"); goto tx_err; } /* Init port */ en_dbg(HW, priv, "Initializing port\n"); err = mlx4_INIT_PORT(mdev->dev, priv->port); if (err) { en_err(priv, "Failed Initializing port\n"); goto tx_err; } /* Attach rx QP to bradcast address */ memset(&mc_list[10], 0xff, ETH_ALEN); mc_list[5] = priv->port; /* needed for B0 steering support */ if (mlx4_multicast_attach(mdev->dev, &priv->rss_map.indir_qp, mc_list, priv->port, 0, MLX4_PROT_ETH, &priv->broadcast_id)) mlx4_warn(mdev, "Failed Attaching Broadcast\n"); /* Must redo promiscuous mode setup. */ priv->flags &= ~(MLX4_EN_FLAG_PROMISC | MLX4_EN_FLAG_MC_PROMISC); /* Schedule multicast task to populate multicast list */ queue_work(mdev->workqueue, &priv->rx_mode_task); priv->port_up = true; /* Enable the queues. */ dev->if_drv_flags &= ~IFF_DRV_OACTIVE; dev->if_drv_flags |= IFF_DRV_RUNNING; #ifdef CONFIG_DEBUG_FS mlx4_en_create_debug_files(priv); #endif callout_reset(&priv->watchdog_timer, MLX4_EN_WATCHDOG_TIMEOUT, mlx4_en_watchdog_timeout, priv); return 0; tx_err: while (tx_index--) { mlx4_en_deactivate_tx_ring(priv, priv->tx_ring[tx_index]); mlx4_en_deactivate_cq(priv, priv->tx_cq[tx_index]); } mlx4_en_destroy_drop_qp(priv); rss_err: mlx4_en_release_rss_steer(priv); mac_err: mlx4_en_put_qp(priv); cq_err: while (rx_index--) mlx4_en_deactivate_cq(priv, priv->rx_cq[rx_index]); for (i = 0; i < priv->rx_ring_num; i++) mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]); return err; /* need to close devices */ } void mlx4_en_stop_port(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; struct mlx4_en_addr_list *addr_list, *tmp; int i; u8 mc_list[16] = {0}; if (!priv->port_up) { en_dbg(DRV, priv, "stop port called while port already down\n"); return; } #ifdef CONFIG_DEBUG_FS mlx4_en_delete_debug_files(priv); #endif /* close port*/ mlx4_CLOSE_PORT(mdev->dev, priv->port); /* Set port as not active */ priv->port_up = false; priv->counter_index = MLX4_SINK_COUNTER_INDEX(mdev->dev); /* Promsicuous mode */ if (mdev->dev->caps.steering_mode == MLX4_STEERING_MODE_DEVICE_MANAGED) { priv->flags &= ~(MLX4_EN_FLAG_PROMISC | MLX4_EN_FLAG_MC_PROMISC); mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, MLX4_FS_ALL_DEFAULT); mlx4_flow_steer_promisc_remove(mdev->dev, priv->port, MLX4_FS_MC_DEFAULT); } else if (priv->flags & MLX4_EN_FLAG_PROMISC) { priv->flags &= ~MLX4_EN_FLAG_PROMISC; /* Disable promiscouos mode */ mlx4_unicast_promisc_remove(mdev->dev, priv->base_qpn, priv->port); /* Disable Multicast promisc */ if (priv->flags & MLX4_EN_FLAG_MC_PROMISC) { mlx4_multicast_promisc_remove(mdev->dev, priv->base_qpn, priv->port); priv->flags &= ~MLX4_EN_FLAG_MC_PROMISC; } } /* Detach All unicasts */ list_for_each_entry(addr_list, &priv->curr_uc_list, list) { mlx4_en_uc_steer_release(priv, addr_list->addr, priv->rss_map.indir_qp.qpn, addr_list->reg_id); } mlx4_en_clear_uclist(dev); list_for_each_entry_safe(addr_list, tmp, &priv->curr_uc_list, list) { list_del(&addr_list->list); kfree(addr_list); } /* Detach All multicasts */ memset(&mc_list[10], 0xff, ETH_ALEN); mc_list[5] = priv->port; /* needed for B0 steering support */ mlx4_multicast_detach(mdev->dev, &priv->rss_map.indir_qp, mc_list, MLX4_PROT_ETH, priv->broadcast_id); list_for_each_entry(addr_list, &priv->curr_mc_list, list) { memcpy(&mc_list[10], addr_list->addr, ETH_ALEN); mc_list[5] = priv->port; mlx4_multicast_detach(mdev->dev, &priv->rss_map.indir_qp, mc_list, MLX4_PROT_ETH, 
addr_list->reg_id); } mlx4_en_clear_mclist(dev); list_for_each_entry_safe(addr_list, tmp, &priv->curr_mc_list, list) { list_del(&addr_list->list); kfree(addr_list); } /* Flush multicast filter */ mlx4_SET_MCAST_FLTR(mdev->dev, priv->port, 0, 1, MLX4_MCAST_CONFIG); mlx4_en_destroy_drop_qp(priv); /* Free TX Rings */ for (i = 0; i < priv->tx_ring_num; i++) { mlx4_en_deactivate_tx_ring(priv, priv->tx_ring[i]); mlx4_en_deactivate_cq(priv, priv->tx_cq[i]); } msleep(10); for (i = 0; i < priv->tx_ring_num; i++) mlx4_en_free_tx_buf(dev, priv->tx_ring[i]); /* Free RSS qps */ mlx4_en_release_rss_steer(priv); /* Unregister Mac address for the port */ mlx4_en_put_qp(priv); mdev->mac_removed[priv->port] = 1; /* Free RX Rings */ for (i = 0; i < priv->rx_ring_num; i++) { struct mlx4_en_cq *cq = priv->rx_cq[i]; mlx4_en_deactivate_rx_ring(priv, priv->rx_ring[i]); mlx4_en_deactivate_cq(priv, cq); } callout_stop(&priv->watchdog_timer); dev->if_drv_flags &= ~(IFF_DRV_RUNNING | IFF_DRV_OACTIVE); } static void mlx4_en_restart(struct work_struct *work) { struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv, watchdog_task); struct mlx4_en_dev *mdev = priv->mdev; struct net_device *dev = priv->dev; struct mlx4_en_tx_ring *ring; int i; if (priv->blocked == 0 || priv->port_up == 0) return; for (i = 0; i < priv->tx_ring_num; i++) { ring = priv->tx_ring[i]; if (ring->blocked && ring->watchdog_time + MLX4_EN_WATCHDOG_TIMEOUT < ticks) goto reset; } return; reset: priv->port_stats.tx_timeout++; en_dbg(DRV, priv, "Watchdog task called for port %d\n", priv->port); mutex_lock(&mdev->state_lock); if (priv->port_up) { mlx4_en_stop_port(dev); //for (i = 0; i < priv->tx_ring_num; i++) // netdev_tx_reset_queue(priv->tx_ring[i]->tx_queue); if (mlx4_en_start_port(dev)) en_err(priv, "Failed restarting port %d\n", priv->port); } mutex_unlock(&mdev->state_lock); } static void mlx4_en_clear_stats(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int i; if (!mlx4_is_slave(mdev->dev)) if (mlx4_en_DUMP_ETH_STATS(mdev, priv->port, 1)) en_dbg(HW, priv, "Failed dumping statistics\n"); memset(&priv->pstats, 0, sizeof(priv->pstats)); memset(&priv->pkstats, 0, sizeof(priv->pkstats)); memset(&priv->port_stats, 0, sizeof(priv->port_stats)); memset(&priv->vport_stats, 0, sizeof(priv->vport_stats)); for (i = 0; i < priv->tx_ring_num; i++) { priv->tx_ring[i]->bytes = 0; priv->tx_ring[i]->packets = 0; priv->tx_ring[i]->tx_csum = 0; priv->tx_ring[i]->oversized_packets = 0; } for (i = 0; i < priv->rx_ring_num; i++) { priv->rx_ring[i]->bytes = 0; priv->rx_ring[i]->packets = 0; priv->rx_ring[i]->csum_ok = 0; priv->rx_ring[i]->csum_none = 0; } } static void mlx4_en_open(void* arg) { struct mlx4_en_priv *priv; struct mlx4_en_dev *mdev; struct net_device *dev; int err = 0; priv = arg; mdev = priv->mdev; dev = priv->dev; mutex_lock(&mdev->state_lock); if (!mdev->device_up) { en_err(priv, "Cannot open - device down/disabled\n"); goto out; } /* Reset HW statistics and SW counters */ mlx4_en_clear_stats(dev); err = mlx4_en_start_port(dev); if (err) en_err(priv, "Failed starting port:%d\n", priv->port); out: mutex_unlock(&mdev->state_lock); return; } void mlx4_en_free_resources(struct mlx4_en_priv *priv) { int i; #ifdef CONFIG_RFS_ACCEL if (priv->dev->rx_cpu_rmap) { free_irq_cpu_rmap(priv->dev->rx_cpu_rmap); priv->dev->rx_cpu_rmap = NULL; } #endif for (i = 0; i < priv->tx_ring_num; i++) { if (priv->tx_ring && priv->tx_ring[i]) mlx4_en_destroy_tx_ring(priv, &priv->tx_ring[i]); if 
(priv->tx_cq && priv->tx_cq[i]) mlx4_en_destroy_cq(priv, &priv->tx_cq[i]); } for (i = 0; i < priv->rx_ring_num; i++) { if (priv->rx_ring[i]) mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i], priv->prof->rx_ring_size, priv->stride); if (priv->rx_cq[i]) mlx4_en_destroy_cq(priv, &priv->rx_cq[i]); } if (priv->stat_sysctl != NULL) sysctl_ctx_free(&priv->stat_ctx); } int mlx4_en_alloc_resources(struct mlx4_en_priv *priv) { struct mlx4_en_port_profile *prof = priv->prof; int i; int node = 0; /* Create rx Rings */ for (i = 0; i < priv->rx_ring_num; i++) { if (mlx4_en_create_cq(priv, &priv->rx_cq[i], prof->rx_ring_size, i, RX, node)) goto err; if (mlx4_en_create_rx_ring(priv, &priv->rx_ring[i], prof->rx_ring_size, node)) goto err; } /* Create tx Rings */ for (i = 0; i < priv->tx_ring_num; i++) { if (mlx4_en_create_cq(priv, &priv->tx_cq[i], prof->tx_ring_size, i, TX, node)) goto err; if (mlx4_en_create_tx_ring(priv, &priv->tx_ring[i], prof->tx_ring_size, TXBB_SIZE, node, i)) goto err; } #ifdef CONFIG_RFS_ACCEL priv->dev->rx_cpu_rmap = alloc_irq_cpu_rmap(priv->rx_ring_num); if (!priv->dev->rx_cpu_rmap) goto err; #endif /* Re-create stat sysctls in case the number of rings changed. */ mlx4_en_sysctl_stat(priv); return 0; err: en_err(priv, "Failed to allocate NIC resources\n"); for (i = 0; i < priv->rx_ring_num; i++) { if (priv->rx_ring[i]) mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i], prof->rx_ring_size, priv->stride); if (priv->rx_cq[i]) mlx4_en_destroy_cq(priv, &priv->rx_cq[i]); } for (i = 0; i < priv->tx_ring_num; i++) { if (priv->tx_ring[i]) mlx4_en_destroy_tx_ring(priv, &priv->tx_ring[i]); if (priv->tx_cq[i]) mlx4_en_destroy_cq(priv, &priv->tx_cq[i]); } priv->port_up = false; return -ENOMEM; } struct en_port_attribute { struct attribute attr; ssize_t (*show)(struct en_port *, struct en_port_attribute *, char *buf); ssize_t (*store)(struct en_port *, struct en_port_attribute *, char *buf, size_t count); }; #define PORT_ATTR_RO(_name) \ struct en_port_attribute en_port_attr_##_name = __ATTR_RO(_name) #define EN_PORT_ATTR(_name, _mode, _show, _store) \ struct en_port_attribute en_port_attr_##_name = __ATTR(_name, _mode, _show, _store) void mlx4_en_destroy_netdev(struct net_device *dev) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; en_dbg(DRV, priv, "Destroying netdev on port:%d\n", priv->port); /* don't allow more IOCTLs */ priv->gone = 1; /* XXX wait a bit to allow IOCTL handlers to complete */ pause("W", hz); if (priv->vlan_attach != NULL) EVENTHANDLER_DEREGISTER(vlan_config, priv->vlan_attach); if (priv->vlan_detach != NULL) EVENTHANDLER_DEREGISTER(vlan_unconfig, priv->vlan_detach); /* Unregister device - this will close the port if it was up */ if (priv->registered) { mutex_lock(&mdev->state_lock); ether_ifdetach(dev); mutex_unlock(&mdev->state_lock); } mutex_lock(&mdev->state_lock); mlx4_en_stop_port(dev); mutex_unlock(&mdev->state_lock); if (priv->allocated) mlx4_free_hwq_res(mdev->dev, &priv->res, MLX4_EN_PAGE_SIZE); cancel_delayed_work(&priv->stats_task); cancel_delayed_work(&priv->service_task); /* flush any pending task for this netdev */ flush_workqueue(mdev->workqueue); callout_drain(&priv->watchdog_timer); /* Detach the netdev so tasks would not attempt to access it */ mutex_lock(&mdev->state_lock); mdev->pndev[priv->port] = NULL; mutex_unlock(&mdev->state_lock); mlx4_en_free_resources(priv); /* freeing the sysctl conf cannot be called from within mlx4_en_free_resources */ if (priv->conf_sysctl != NULL) sysctl_ctx_free(&priv->conf_ctx); 
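/* Finally release the ring/CQ pointer arrays, the private softc and the ifnet itself. */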
kfree(priv->tx_ring); kfree(priv->tx_cq); kfree(priv); if_free(dev); } static int mlx4_en_change_mtu(struct net_device *dev, int new_mtu) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int err = 0; en_dbg(DRV, priv, "Change MTU called - current:%u new:%u\n", (unsigned)dev->if_mtu, (unsigned)new_mtu); if ((new_mtu < MLX4_EN_MIN_MTU) || (new_mtu > priv->max_mtu)) { en_err(priv, "Bad MTU size:%d, max %u.\n", new_mtu, priv->max_mtu); return -EPERM; } mutex_lock(&mdev->state_lock); dev->if_mtu = new_mtu; if (dev->if_drv_flags & IFF_DRV_RUNNING) { if (!mdev->device_up) { /* NIC is probably restarting - let watchdog task reset * * the port */ en_dbg(DRV, priv, "Change MTU called with card down!?\n"); } else { mlx4_en_stop_port(dev); err = mlx4_en_start_port(dev); if (err) { en_err(priv, "Failed restarting port:%d\n", priv->port); queue_work(mdev->workqueue, &priv->watchdog_task); } } } mutex_unlock(&mdev->state_lock); return 0; } static int mlx4_en_calc_media(struct mlx4_en_priv *priv) { int trans_type; int active; active = IFM_ETHER; if (priv->last_link_state == MLX4_DEV_EVENT_PORT_DOWN) return (active); active |= IFM_FDX; trans_type = priv->port_state.transceiver; /* XXX I don't know all of the transceiver values. */ switch (priv->port_state.link_speed) { case 100: active |= IFM_100_T; break; case 1000: active |= IFM_1000_T; break; case 10000: if (trans_type > 0 && trans_type <= 0xC) active |= IFM_10G_SR; else if (trans_type == 0x80 || trans_type == 0) active |= IFM_10G_CX4; break; case 40000: active |= IFM_40G_CR4; break; } if (priv->prof->tx_pause) active |= IFM_ETH_TXPAUSE; if (priv->prof->rx_pause) active |= IFM_ETH_RXPAUSE; return (active); } static void mlx4_en_media_status(struct ifnet *dev, struct ifmediareq *ifmr) { struct mlx4_en_priv *priv; priv = dev->if_softc; ifmr->ifm_status = IFM_AVALID; if (priv->last_link_state != MLX4_DEV_EVENT_PORT_DOWN) ifmr->ifm_status |= IFM_ACTIVE; ifmr->ifm_active = mlx4_en_calc_media(priv); return; } static int mlx4_en_media_change(struct ifnet *dev) { struct mlx4_en_priv *priv; struct ifmedia *ifm; int rxpause; int txpause; int error; priv = dev->if_softc; ifm = &priv->media; rxpause = txpause = 0; error = 0; if (IFM_TYPE(ifm->ifm_media) != IFM_ETHER) return (EINVAL); switch (IFM_SUBTYPE(ifm->ifm_media)) { case IFM_AUTO: break; case IFM_10G_SR: case IFM_10G_CX4: case IFM_1000_T: case IFM_40G_CR4: if ((IFM_SUBTYPE(ifm->ifm_media) == IFM_SUBTYPE(mlx4_en_calc_media(priv))) && (ifm->ifm_media & IFM_FDX)) break; /* Fallthrough */ default: printf("%s: Only auto media type\n", if_name(dev)); return (EINVAL); } /* Allow user to set/clear pause */ if (IFM_OPTIONS(ifm->ifm_media) & IFM_ETH_RXPAUSE) rxpause = 1; if (IFM_OPTIONS(ifm->ifm_media) & IFM_ETH_TXPAUSE) txpause = 1; if (priv->prof->tx_pause != txpause || priv->prof->rx_pause != rxpause) { priv->prof->tx_pause = txpause; priv->prof->rx_pause = rxpause; error = -mlx4_SET_PORT_general(priv->mdev->dev, priv->port, priv->rx_mb_size + ETHER_CRC_LEN, priv->prof->tx_pause, priv->prof->tx_ppp, priv->prof->rx_pause, priv->prof->rx_ppp); } return (error); } static int mlx4_en_ioctl(struct ifnet *dev, u_long command, caddr_t data) { struct mlx4_en_priv *priv; struct mlx4_en_dev *mdev; struct ifreq *ifr; int error; int mask; struct ifrsskey *ifrk; const u32 *key; struct ifrsshash *ifrh; u8 rss_mask; error = 0; mask = 0; priv = dev->if_softc; /* check if detaching */ if (priv == NULL || priv->gone != 0) return (ENXIO); mdev = priv->mdev; ifr = (struct ifreq *) data; switch 
(command) { case SIOCSIFMTU: error = -mlx4_en_change_mtu(dev, ifr->ifr_mtu); break; case SIOCSIFFLAGS: if (dev->if_flags & IFF_UP) { if ((dev->if_drv_flags & IFF_DRV_RUNNING) == 0) { mutex_lock(&mdev->state_lock); mlx4_en_start_port(dev); mutex_unlock(&mdev->state_lock); } else { mlx4_en_set_rx_mode(dev); } } else { mutex_lock(&mdev->state_lock); if (dev->if_drv_flags & IFF_DRV_RUNNING) { mlx4_en_stop_port(dev); if_link_state_change(dev, LINK_STATE_DOWN); } mutex_unlock(&mdev->state_lock); } break; case SIOCADDMULTI: case SIOCDELMULTI: mlx4_en_set_rx_mode(dev); break; case SIOCSIFMEDIA: case SIOCGIFMEDIA: error = ifmedia_ioctl(dev, ifr, &priv->media, command); break; case SIOCSIFCAP: mutex_lock(&mdev->state_lock); mask = ifr->ifr_reqcap ^ dev->if_capenable; if (mask & IFCAP_TXCSUM) { dev->if_capenable ^= IFCAP_TXCSUM; dev->if_hwassist ^= (CSUM_TCP | CSUM_UDP | CSUM_IP); if (IFCAP_TSO4 & dev->if_capenable && !(IFCAP_TXCSUM & dev->if_capenable)) { dev->if_capenable &= ~IFCAP_TSO4; dev->if_hwassist &= ~CSUM_IP_TSO; if_printf(dev, "tso4 disabled due to -txcsum.\n"); } } if (mask & IFCAP_TXCSUM_IPV6) { dev->if_capenable ^= IFCAP_TXCSUM_IPV6; dev->if_hwassist ^= (CSUM_UDP_IPV6 | CSUM_TCP_IPV6); if (IFCAP_TSO6 & dev->if_capenable && !(IFCAP_TXCSUM_IPV6 & dev->if_capenable)) { dev->if_capenable &= ~IFCAP_TSO6; dev->if_hwassist &= ~CSUM_IP6_TSO; if_printf(dev, "tso6 disabled due to -txcsum6.\n"); } } if (mask & IFCAP_RXCSUM) dev->if_capenable ^= IFCAP_RXCSUM; if (mask & IFCAP_RXCSUM_IPV6) dev->if_capenable ^= IFCAP_RXCSUM_IPV6; if (mask & IFCAP_TSO4) { if (!(IFCAP_TSO4 & dev->if_capenable) && !(IFCAP_TXCSUM & dev->if_capenable)) { if_printf(dev, "enable txcsum first.\n"); error = EAGAIN; goto out; } dev->if_capenable ^= IFCAP_TSO4; dev->if_hwassist ^= CSUM_IP_TSO; } if (mask & IFCAP_TSO6) { if (!(IFCAP_TSO6 & dev->if_capenable) && !(IFCAP_TXCSUM_IPV6 & dev->if_capenable)) { if_printf(dev, "enable txcsum6 first.\n"); error = EAGAIN; goto out; } dev->if_capenable ^= IFCAP_TSO6; dev->if_hwassist ^= CSUM_IP6_TSO; } if (mask & IFCAP_LRO) dev->if_capenable ^= IFCAP_LRO; if (mask & IFCAP_VLAN_HWTAGGING) dev->if_capenable ^= IFCAP_VLAN_HWTAGGING; if (mask & IFCAP_VLAN_HWFILTER) dev->if_capenable ^= IFCAP_VLAN_HWFILTER; if (mask & IFCAP_WOL_MAGIC) dev->if_capenable ^= IFCAP_WOL_MAGIC; if (dev->if_drv_flags & IFF_DRV_RUNNING) mlx4_en_start_port(dev); out: mutex_unlock(&mdev->state_lock); VLAN_CAPABILITIES(dev); break; #if __FreeBSD_version >= 1100036 case SIOCGI2C: { struct ifi2creq i2c; error = copyin(ifr_data_get_ptr(ifr), &i2c, sizeof(i2c)); if (error) break; if (i2c.len > sizeof(i2c.data)) { error = EINVAL; break; } /* * Note that we ignore i2c.addr here. The driver hardcodes * the address to 0x50, while standard expects it to be 0xA0. 
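 * (Both values refer to the same module EEPROM: 0xA0 is the 8-bit,
 * write form of the 7-bit address 0x50, i.e. 0x50 << 1, as used by the
 * SFF-8436/SFF-8472 management interface specifications.)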
*/ error = mlx4_get_module_info(mdev->dev, priv->port, i2c.offset, i2c.len, i2c.data); if (error < 0) { error = -error; break; } error = copyout(&i2c, ifr_data_get_ptr(ifr), sizeof(i2c)); break; } #endif case SIOCGIFRSSKEY: ifrk = (struct ifrsskey *)data; ifrk->ifrk_func = RSS_FUNC_TOEPLITZ; mutex_lock(&mdev->state_lock); key = mlx4_en_get_rss_key(priv, &ifrk->ifrk_keylen); if (ifrk->ifrk_keylen > RSS_KEYLEN) error = EINVAL; else memcpy(ifrk->ifrk_key, key, ifrk->ifrk_keylen); mutex_unlock(&mdev->state_lock); break; case SIOCGIFRSSHASH: mutex_lock(&mdev->state_lock); rss_mask = mlx4_en_get_rss_mask(priv); mutex_unlock(&mdev->state_lock); ifrh = (struct ifrsshash *)data; ifrh->ifrh_func = RSS_FUNC_TOEPLITZ; ifrh->ifrh_types = 0; if (rss_mask & MLX4_RSS_IPV4) ifrh->ifrh_types |= RSS_TYPE_IPV4; if (rss_mask & MLX4_RSS_TCP_IPV4) ifrh->ifrh_types |= RSS_TYPE_TCP_IPV4; if (rss_mask & MLX4_RSS_IPV6) ifrh->ifrh_types |= RSS_TYPE_IPV6; if (rss_mask & MLX4_RSS_TCP_IPV6) ifrh->ifrh_types |= RSS_TYPE_TCP_IPV6; if (rss_mask & MLX4_RSS_UDP_IPV4) ifrh->ifrh_types |= RSS_TYPE_UDP_IPV4; if (rss_mask & MLX4_RSS_UDP_IPV6) ifrh->ifrh_types |= RSS_TYPE_UDP_IPV6; break; default: error = ether_ioctl(dev, command, data); break; } return (error); } int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port, struct mlx4_en_port_profile *prof) { struct net_device *dev; struct mlx4_en_priv *priv; uint8_t dev_addr[ETHER_ADDR_LEN]; int err; int i; priv = kzalloc(sizeof(*priv), GFP_KERNEL); dev = priv->dev = if_alloc(IFT_ETHER); if (dev == NULL) { en_err(priv, "Net device allocation failed\n"); kfree(priv); return -ENOMEM; } dev->if_softc = priv; if_initname(dev, "mlxen", (device_get_unit( mdev->pdev->dev.bsddev) * MLX4_MAX_PORTS) + port - 1); dev->if_mtu = ETHERMTU; dev->if_init = mlx4_en_open; dev->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST; dev->if_ioctl = mlx4_en_ioctl; dev->if_transmit = mlx4_en_transmit; dev->if_qflush = mlx4_en_qflush; dev->if_snd.ifq_maxlen = prof->tx_ring_size; /* * Initialize driver private data */ priv->counter_index = 0xff; spin_lock_init(&priv->stats_lock); INIT_WORK(&priv->rx_mode_task, mlx4_en_do_set_rx_mode); INIT_WORK(&priv->watchdog_task, mlx4_en_restart); INIT_WORK(&priv->linkstate_task, mlx4_en_linkstate); INIT_DELAYED_WORK(&priv->stats_task, mlx4_en_do_get_stats); INIT_DELAYED_WORK(&priv->service_task, mlx4_en_service_task); callout_init(&priv->watchdog_timer, 1); #ifdef CONFIG_RFS_ACCEL INIT_LIST_HEAD(&priv->filters); spin_lock_init(&priv->filters_lock); #endif priv->msg_enable = MLX4_EN_MSG_LEVEL; priv->dev = dev; priv->mdev = mdev; priv->ddev = &mdev->pdev->dev; priv->prof = prof; priv->port = port; priv->port_up = false; priv->flags = prof->flags; priv->num_tx_rings_p_up = mdev->profile.num_tx_rings_p_up; priv->tx_ring_num = prof->tx_ring_num; priv->tx_ring = kcalloc(MAX_TX_RINGS, sizeof(struct mlx4_en_tx_ring *), GFP_KERNEL); if (!priv->tx_ring) { err = -ENOMEM; goto out; } priv->tx_cq = kcalloc(sizeof(struct mlx4_en_cq *), MAX_TX_RINGS, GFP_KERNEL); if (!priv->tx_cq) { err = -ENOMEM; goto out; } priv->rx_ring_num = prof->rx_ring_num; priv->cqe_factor = (mdev->dev->caps.cqe_size == 64) ? 
1 : 0; priv->mac_index = -1; priv->last_ifq_jiffies = 0; priv->if_counters_rx_errors = 0; priv->if_counters_rx_no_buffer = 0; #ifdef CONFIG_MLX4_EN_DCB if (!mlx4_is_slave(priv->mdev->dev)) { priv->dcbx_cap = DCB_CAP_DCBX_HOST; priv->flags |= MLX4_EN_FLAG_DCB_ENABLED; if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETS_CFG) { dev->dcbnl_ops = &mlx4_en_dcbnl_ops; } else { en_info(priv, "QoS disabled - no HW support\n"); dev->dcbnl_ops = &mlx4_en_dcbnl_pfc_ops; } } #endif /* Query for default mac and max mtu */ priv->max_mtu = mdev->dev->caps.eth_mtu_cap[priv->port]; priv->mac = mdev->dev->caps.def_mac[priv->port]; if (ILLEGAL_MAC(priv->mac)) { #if BITS_PER_LONG == 64 en_err(priv, "Port: %d, invalid mac burned: 0x%lx, quiting\n", priv->port, priv->mac); #elif BITS_PER_LONG == 32 en_err(priv, "Port: %d, invalid mac burned: 0x%llx, quiting\n", priv->port, priv->mac); #endif err = -EINVAL; goto out; } priv->stride = roundup_pow_of_two(sizeof(struct mlx4_en_rx_desc) + DS_SIZE); mlx4_en_sysctl_conf(priv); err = mlx4_en_alloc_resources(priv); if (err) goto out; /* Allocate page for receive rings */ err = mlx4_alloc_hwq_res(mdev->dev, &priv->res, MLX4_EN_PAGE_SIZE, MLX4_EN_PAGE_SIZE); if (err) { en_err(priv, "Failed to allocate page for rx qps\n"); goto out; } priv->allocated = 1; /* * Set driver features */ dev->if_capabilities |= IFCAP_HWCSUM | IFCAP_HWCSUM_IPV6; dev->if_capabilities |= IFCAP_VLAN_MTU | IFCAP_VLAN_HWTAGGING; dev->if_capabilities |= IFCAP_VLAN_HWCSUM | IFCAP_VLAN_HWFILTER; dev->if_capabilities |= IFCAP_LINKSTATE | IFCAP_JUMBO_MTU; dev->if_capabilities |= IFCAP_LRO; dev->if_capabilities |= IFCAP_HWSTATS; if (mdev->LSO_support) dev->if_capabilities |= IFCAP_TSO4 | IFCAP_TSO6 | IFCAP_VLAN_HWTSO; #if __FreeBSD_version >= 1100000 /* set TSO limits so that we don't have to drop TX packets */ dev->if_hw_tsomax = MLX4_EN_TX_MAX_PAYLOAD_SIZE - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN) /* hdr */; dev->if_hw_tsomaxsegcount = MLX4_EN_TX_MAX_MBUF_FRAGS - 1 /* hdr */; dev->if_hw_tsomaxsegsize = MLX4_EN_TX_MAX_MBUF_SIZE; #endif dev->if_capenable = dev->if_capabilities; dev->if_hwassist = 0; if (dev->if_capenable & (IFCAP_TSO4 | IFCAP_TSO6)) dev->if_hwassist |= CSUM_TSO; if (dev->if_capenable & IFCAP_TXCSUM) dev->if_hwassist |= (CSUM_TCP | CSUM_UDP | CSUM_IP); if (dev->if_capenable & IFCAP_TXCSUM_IPV6) dev->if_hwassist |= (CSUM_UDP_IPV6 | CSUM_TCP_IPV6); /* Register for VLAN events */ priv->vlan_attach = EVENTHANDLER_REGISTER(vlan_config, mlx4_en_vlan_rx_add_vid, priv, EVENTHANDLER_PRI_FIRST); priv->vlan_detach = EVENTHANDLER_REGISTER(vlan_unconfig, mlx4_en_vlan_rx_kill_vid, priv, EVENTHANDLER_PRI_FIRST); mdev->pndev[priv->port] = dev; priv->last_link_state = MLX4_DEV_EVENT_PORT_DOWN; mlx4_en_set_default_moderation(priv); /* Set default MAC */ for (i = 0; i < ETHER_ADDR_LEN; i++) dev_addr[ETHER_ADDR_LEN - 1 - i] = (u8) (priv->mac >> (8 * i)); ether_ifattach(dev, dev_addr); if_link_state_change(dev, LINK_STATE_DOWN); ifmedia_init(&priv->media, IFM_IMASK | IFM_ETH_FMASK, mlx4_en_media_change, mlx4_en_media_status); ifmedia_add(&priv->media, IFM_ETHER | IFM_FDX | IFM_1000_T, 0, NULL); ifmedia_add(&priv->media, IFM_ETHER | IFM_FDX | IFM_10G_SR, 0, NULL); ifmedia_add(&priv->media, IFM_ETHER | IFM_FDX | IFM_10G_CX4, 0, NULL); ifmedia_add(&priv->media, IFM_ETHER | IFM_FDX | IFM_40G_CR4, 0, NULL); ifmedia_add(&priv->media, IFM_ETHER | IFM_AUTO, 0, NULL); ifmedia_set(&priv->media, IFM_ETHER | IFM_AUTO); en_warn(priv, "Using %d TX rings\n", prof->tx_ring_num); en_warn(priv, "Using %d RX rings\n", 
prof->rx_ring_num); priv->registered = 1; en_warn(priv, "Using %d TX rings\n", prof->tx_ring_num); en_warn(priv, "Using %d RX rings\n", prof->rx_ring_num); priv->rx_mb_size = dev->if_mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN; err = mlx4_SET_PORT_general(mdev->dev, priv->port, priv->rx_mb_size, prof->tx_pause, prof->tx_ppp, prof->rx_pause, prof->rx_ppp); if (err) { en_err(priv, "Failed setting port general configurations " "for port %d, with error %d\n", priv->port, err); goto out; } /* Init port */ en_warn(priv, "Initializing port\n"); err = mlx4_INIT_PORT(mdev->dev, priv->port); if (err) { en_err(priv, "Failed Initializing port\n"); goto out; } queue_delayed_work(mdev->workqueue, &priv->stats_task, STATS_DELAY); if (mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_TS) queue_delayed_work(mdev->workqueue, &priv->service_task, SERVICE_TASK_DELAY); return 0; out: mlx4_en_destroy_netdev(dev); return err; } static int mlx4_en_set_ring_size(struct net_device *dev, int rx_size, int tx_size) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int port_up = 0; int err = 0; rx_size = roundup_pow_of_two(rx_size); rx_size = max_t(u32, rx_size, MLX4_EN_MIN_RX_SIZE); rx_size = min_t(u32, rx_size, MLX4_EN_MAX_RX_SIZE); tx_size = roundup_pow_of_two(tx_size); tx_size = max_t(u32, tx_size, MLX4_EN_MIN_TX_SIZE); tx_size = min_t(u32, tx_size, MLX4_EN_MAX_TX_SIZE); if (rx_size == (priv->port_up ? priv->rx_ring[0]->actual_size : priv->rx_ring[0]->size) && tx_size == priv->tx_ring[0]->size) return 0; mutex_lock(&mdev->state_lock); if (priv->port_up) { port_up = 1; mlx4_en_stop_port(dev); } mlx4_en_free_resources(priv); priv->prof->tx_ring_size = tx_size; priv->prof->rx_ring_size = rx_size; err = mlx4_en_alloc_resources(priv); if (err) { en_err(priv, "Failed reallocating port resources\n"); goto out; } if (port_up) { err = mlx4_en_start_port(dev); if (err) en_err(priv, "Failed starting port\n"); } out: mutex_unlock(&mdev->state_lock); return err; } static int mlx4_en_set_rx_ring_size(SYSCTL_HANDLER_ARGS) { struct mlx4_en_priv *priv; int size; int error; priv = arg1; size = priv->prof->rx_ring_size; error = sysctl_handle_int(oidp, &size, 0, req); if (error || !req->newptr) return (error); error = -mlx4_en_set_ring_size(priv->dev, size, priv->prof->tx_ring_size); return (error); } static int mlx4_en_set_tx_ring_size(SYSCTL_HANDLER_ARGS) { struct mlx4_en_priv *priv; int size; int error; priv = arg1; size = priv->prof->tx_ring_size; error = sysctl_handle_int(oidp, &size, 0, req); if (error || !req->newptr) return (error); error = -mlx4_en_set_ring_size(priv->dev, priv->prof->rx_ring_size, size); return (error); } static int mlx4_en_get_module_info(struct net_device *dev, struct ethtool_modinfo *modinfo) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int ret; u8 data[4]; /* Read first 2 bytes to get Module & REV ID */ ret = mlx4_get_module_info(mdev->dev, priv->port, 0/*offset*/, 2/*size*/, data); if (ret < 2) { en_err(priv, "Failed to read eeprom module first two bytes, error: 0x%x\n", -ret); return -EIO; } switch (data[0] /* identifier */) { case MLX4_MODULE_ID_QSFP: modinfo->type = ETH_MODULE_SFF_8436; modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN; break; case MLX4_MODULE_ID_QSFP_PLUS: if (data[1] >= 0x3) { /* revision id */ modinfo->type = ETH_MODULE_SFF_8636; modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN; } else { modinfo->type = ETH_MODULE_SFF_8436; modinfo->eeprom_len = ETH_MODULE_SFF_8436_LEN; } break; case MLX4_MODULE_ID_QSFP28: 
modinfo->type = ETH_MODULE_SFF_8636; modinfo->eeprom_len = ETH_MODULE_SFF_8636_LEN; break; case MLX4_MODULE_ID_SFP: modinfo->type = ETH_MODULE_SFF_8472; modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN; break; default: en_err(priv, "mlx4_en_get_module_info : Not recognized cable type\n"); return -EINVAL; } return 0; } static int mlx4_en_get_module_eeprom(struct net_device *dev, struct ethtool_eeprom *ee, u8 *data) { struct mlx4_en_priv *priv = netdev_priv(dev); struct mlx4_en_dev *mdev = priv->mdev; int offset = ee->offset; int i = 0, ret; if (ee->len == 0) return -EINVAL; memset(data, 0, ee->len); while (i < ee->len) { en_dbg(DRV, priv, "mlx4_get_module_info i(%d) offset(%d) len(%d)\n", i, offset, ee->len - i); ret = mlx4_get_module_info(mdev->dev, priv->port, offset, ee->len - i, data + i); if (!ret) /* Done reading */ return 0; if (ret < 0) { en_err(priv, "mlx4_get_module_info i(%d) offset(%d) bytes_to_read(%d) - FAILED (0x%x)\n", i, offset, ee->len - i, ret); return -1; } i += ret; offset += ret; } return 0; } static void mlx4_en_print_eeprom(u8 *data, __u32 len) { int i; int j = 0; int row = 0; const int NUM_OF_BYTES = 16; printf("\nOffset\t\tValues\n"); printf("------\t\t------\n"); while(row < len){ printf("0x%04x\t\t",row); for(i=0; i < NUM_OF_BYTES; i++){ printf("%02x ", data[j]); row++; j++; } printf("\n"); } } /* Read cable EEPROM module information by first inspecting the first * two bytes to get the length and then read the rest of the information. * The information is printed to dmesg. */ static int mlx4_en_read_eeprom(SYSCTL_HANDLER_ARGS) { u8* data; int error; int result = 0; struct mlx4_en_priv *priv; struct net_device *dev; struct ethtool_modinfo modinfo; struct ethtool_eeprom ee; error = sysctl_handle_int(oidp, &result, 0, req); if (error || !req->newptr) return (error); if (result == 1) { priv = arg1; dev = priv->dev; data = kmalloc(PAGE_SIZE, GFP_KERNEL); error = mlx4_en_get_module_info(dev, &modinfo); if (error) { en_err(priv, "mlx4_en_get_module_info returned with error - FAILED (0x%x)\n", -error); goto out; } ee.len = modinfo.eeprom_len; ee.offset = 0; error = mlx4_en_get_module_eeprom(dev, &ee, data); if (error) { en_err(priv, "mlx4_en_get_module_eeprom returned with error - FAILED (0x%x)\n", -error); /* Continue printing partial information in case of an error */ } /* EEPROM information will be printed in dmesg */ mlx4_en_print_eeprom(data, ee.len); out: kfree(data); } /* Return zero to prevent sysctl failure. */ return (0); } static int mlx4_en_set_tx_ppp(SYSCTL_HANDLER_ARGS) { struct mlx4_en_priv *priv; int ppp; int error; priv = arg1; ppp = priv->prof->tx_ppp; error = sysctl_handle_int(oidp, &ppp, 0, req); if (error || !req->newptr) return (error); if (ppp > 0xff || ppp < 0) return (-EINVAL); priv->prof->tx_ppp = ppp; error = -mlx4_SET_PORT_general(priv->mdev->dev, priv->port, priv->rx_mb_size + ETHER_CRC_LEN, priv->prof->tx_pause, priv->prof->tx_ppp, priv->prof->rx_pause, priv->prof->rx_ppp); return (error); } static int mlx4_en_set_rx_ppp(SYSCTL_HANDLER_ARGS) { struct mlx4_en_priv *priv; struct mlx4_en_dev *mdev; int ppp; int error; int port_up; port_up = 0; priv = arg1; mdev = priv->mdev; ppp = priv->prof->rx_ppp; error = sysctl_handle_int(oidp, &ppp, 0, req); if (error || !req->newptr) return (error); if (ppp > 0xff || ppp < 0) return (-EINVAL); /* See if we have to change the number of tx queues. 
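 * Toggling rx_ppp between zero and non-zero requires the port resources
 * to be released and reallocated, so the port is stopped and restarted
 * around the change; otherwise the new value is simply pushed to the
 * hardware with mlx4_SET_PORT_general() further below.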
*/ if (!ppp != !priv->prof->rx_ppp) { mutex_lock(&mdev->state_lock); if (priv->port_up) { port_up = 1; mlx4_en_stop_port(priv->dev); } mlx4_en_free_resources(priv); priv->prof->rx_ppp = ppp; error = -mlx4_en_alloc_resources(priv); if (error) en_err(priv, "Failed reallocating port resources\n"); if (error == 0 && port_up) { error = -mlx4_en_start_port(priv->dev); if (error) en_err(priv, "Failed starting port\n"); } mutex_unlock(&mdev->state_lock); return (error); } priv->prof->rx_ppp = ppp; error = -mlx4_SET_PORT_general(priv->mdev->dev, priv->port, priv->rx_mb_size + ETHER_CRC_LEN, priv->prof->tx_pause, priv->prof->tx_ppp, priv->prof->rx_pause, priv->prof->rx_ppp); return (error); } static void mlx4_en_sysctl_conf(struct mlx4_en_priv *priv) { struct net_device *dev; struct sysctl_ctx_list *ctx; struct sysctl_oid *node; struct sysctl_oid_list *node_list; struct sysctl_oid *coal; struct sysctl_oid_list *coal_list; const char *pnameunit; dev = priv->dev; ctx = &priv->conf_ctx; pnameunit = device_get_nameunit(priv->mdev->pdev->dev.bsddev); sysctl_ctx_init(ctx); priv->conf_sysctl = SYSCTL_ADD_NODE(ctx, SYSCTL_STATIC_CHILDREN(_hw), OID_AUTO, dev->if_xname, CTLFLAG_RD, 0, "mlx4 10gig ethernet"); node = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(priv->conf_sysctl), OID_AUTO, "conf", CTLFLAG_RD, NULL, "Configuration"); node_list = SYSCTL_CHILDREN(node); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "msg_enable", CTLFLAG_RW, &priv->msg_enable, 0, "Driver message enable bitfield"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "rx_rings", CTLFLAG_RD, &priv->rx_ring_num, 0, "Number of receive rings"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "tx_rings", CTLFLAG_RD, &priv->tx_ring_num, 0, "Number of transmit rings"); SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "rx_size", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, mlx4_en_set_rx_ring_size, "I", "Receive ring size"); SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "tx_size", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, mlx4_en_set_tx_ring_size, "I", "Transmit ring size"); SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "tx_ppp", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, mlx4_en_set_tx_ppp, "I", "TX Per-priority pause"); SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "rx_ppp", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, mlx4_en_set_rx_ppp, "I", "RX Per-priority pause"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "port_num", CTLFLAG_RD, &priv->port, 0, "Port Number"); SYSCTL_ADD_STRING(ctx, node_list, OID_AUTO, "device_name", CTLFLAG_RD, __DECONST(void *, pnameunit), 0, "PCI device name"); /* Add coalescer configuration. 
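 * The knobs added here (pkt_rate_low/high, rx_usecs_low/high,
 * sample_interval and adaptive_rx_coal) tune or disable the adaptive
 * interrupt moderation at run time, e.g. via
 * "sysctl hw.mlxen0.conf.coalesce.adaptive_rx_coal=0" for the first
 * interface.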
*/ coal = SYSCTL_ADD_NODE(ctx, node_list, OID_AUTO, "coalesce", CTLFLAG_RD, NULL, "Interrupt coalesce configuration"); coal_list = SYSCTL_CHILDREN(coal); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "pkt_rate_low", CTLFLAG_RW, &priv->pkt_rate_low, 0, "Packets per-second for minimum delay"); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "rx_usecs_low", CTLFLAG_RW, &priv->rx_usecs_low, 0, "Minimum RX delay in micro-seconds"); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "pkt_rate_high", CTLFLAG_RW, &priv->pkt_rate_high, 0, "Packets per-second for maximum delay"); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "rx_usecs_high", CTLFLAG_RW, &priv->rx_usecs_high, 0, "Maximum RX delay in micro-seconds"); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "sample_interval", CTLFLAG_RW, &priv->sample_interval, 0, "adaptive frequency in units of HZ ticks"); SYSCTL_ADD_UINT(ctx, coal_list, OID_AUTO, "adaptive_rx_coal", CTLFLAG_RW, &priv->adaptive_rx_coal, 0, "Enable adaptive rx coalescing"); /* EEPROM support */ SYSCTL_ADD_PROC(ctx, node_list, OID_AUTO, "eeprom_info", CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, priv, 0, mlx4_en_read_eeprom, "I", "EEPROM information"); } static void mlx4_en_sysctl_stat(struct mlx4_en_priv *priv) { struct sysctl_ctx_list *ctx; struct sysctl_oid_list *node_list; struct sysctl_oid *ring_node; struct sysctl_oid_list *ring_list; struct mlx4_en_tx_ring *tx_ring; struct mlx4_en_rx_ring *rx_ring; char namebuf[128]; int i; ctx = &priv->stat_ctx; sysctl_ctx_init(ctx); priv->stat_sysctl = SYSCTL_ADD_NODE(ctx, SYSCTL_CHILDREN(priv->conf_sysctl), OID_AUTO, "stat", CTLFLAG_RD, NULL, "Statistics"); node_list = SYSCTL_CHILDREN(priv->stat_sysctl); #ifdef MLX4_EN_PERF_STAT SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "tx_poll", CTLFLAG_RD, &priv->pstats.tx_poll, "TX Poll calls"); SYSCTL_ADD_QUAD(ctx, node_list, OID_AUTO, "tx_pktsz_avg", CTLFLAG_RD, &priv->pstats.tx_pktsz_avg, "TX average packet size"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "inflight_avg", CTLFLAG_RD, &priv->pstats.inflight_avg, "TX average packets in-flight"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "tx_coal_avg", CTLFLAG_RD, &priv->pstats.tx_coal_avg, "TX average coalesced completions"); SYSCTL_ADD_UINT(ctx, node_list, OID_AUTO, "rx_coal_avg", CTLFLAG_RD, &priv->pstats.rx_coal_avg, "RX average coalesced completions"); #endif SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tso_packets", CTLFLAG_RD, &priv->port_stats.tso_packets, 0, "TSO packets sent"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "queue_stopped", CTLFLAG_RD, &priv->port_stats.queue_stopped, 0, "Queue full"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "wake_queue", CTLFLAG_RD, &priv->port_stats.wake_queue, 0, "Queue resumed after full"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_timeout", CTLFLAG_RD, &priv->port_stats.tx_timeout, 0, "Transmit timeouts"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_oversized_packets", CTLFLAG_RD, &priv->port_stats.oversized_packets, 0, "TX oversized packets, m_defrag failed"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_alloc_failed", CTLFLAG_RD, &priv->port_stats.rx_alloc_failed, 0, "RX failed to allocate mbuf"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_chksum_good", CTLFLAG_RD, &priv->port_stats.rx_chksum_good, 0, "RX checksum offload success"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_chksum_none", CTLFLAG_RD, &priv->port_stats.rx_chksum_none, 0, "RX without checksum offload"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_chksum_offload", CTLFLAG_RD, &priv->port_stats.tx_chksum_offload, 0, "TX checksum offloads"); SYSCTL_ADD_U64(ctx, 
node_list, OID_AUTO, "defrag_attempts", CTLFLAG_RD, &priv->port_stats.defrag_attempts, 0, "Oversized chains defragged"); /* Could strdup the names and add in a loop. This is simpler. */ SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_bytes", CTLFLAG_RD, &priv->pkstats.rx_bytes, 0, "RX Bytes"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_packets", CTLFLAG_RD, &priv->pkstats.rx_packets, 0, "RX packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_multicast_packets", CTLFLAG_RD, &priv->pkstats.rx_multicast_packets, 0, "RX Multicast Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_broadcast_packets", CTLFLAG_RD, &priv->pkstats.rx_broadcast_packets, 0, "RX Broadcast Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_errors", CTLFLAG_RD, &priv->pkstats.rx_errors, 0, "RX Errors"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_dropped", CTLFLAG_RD, &priv->pkstats.rx_dropped, 0, "RX Dropped"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_length_errors", CTLFLAG_RD, &priv->pkstats.rx_length_errors, 0, "RX Length Errors"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_over_errors", CTLFLAG_RD, &priv->pkstats.rx_over_errors, 0, "RX Over Errors"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_crc_errors", CTLFLAG_RD, &priv->pkstats.rx_crc_errors, 0, "RX CRC Errors"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_jabbers", CTLFLAG_RD, &priv->pkstats.rx_jabbers, 0, "RX Jabbers"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_in_range_length_error", CTLFLAG_RD, &priv->pkstats.rx_in_range_length_error, 0, "RX IN_Range Length Error"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_out_range_length_error", CTLFLAG_RD, &priv->pkstats.rx_out_range_length_error, 0, "RX Out Range Length Error"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_lt_64_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_lt_64_bytes_packets, 0, "RX Lt 64 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_127_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_127_bytes_packets, 0, "RX 127 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_255_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_255_bytes_packets, 0, "RX 255 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_511_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_511_bytes_packets, 0, "RX 511 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_1023_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_1023_bytes_packets, 0, "RX 1023 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_1518_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_1518_bytes_packets, 0, "RX 1518 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_1522_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_1522_bytes_packets, 0, "RX 1522 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_1548_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_1548_bytes_packets, 0, "RX 1548 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "rx_gt_1548_bytes_packets", CTLFLAG_RD, &priv->pkstats.rx_gt_1548_bytes_packets, 0, "RX Greater Then 1548 bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_packets", CTLFLAG_RD, &priv->pkstats.tx_packets, 0, "TX packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_bytes", CTLFLAG_RD, &priv->pkstats.tx_bytes, 0, "TX Bytes"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_multicast_packets", CTLFLAG_RD, &priv->pkstats.tx_multicast_packets, 0, "TX Multicast Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_broadcast_packets", CTLFLAG_RD, &priv->pkstats.tx_broadcast_packets, 0, "TX Broadcast Packets"); SYSCTL_ADD_U64(ctx, 
node_list, OID_AUTO, "tx_errors", CTLFLAG_RD, &priv->pkstats.tx_errors, 0, "TX Errors"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_dropped", CTLFLAG_RD, &priv->pkstats.tx_dropped, 0, "TX Dropped"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_lt_64_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_lt_64_bytes_packets, 0, "TX Less Then 64 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_127_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_127_bytes_packets, 0, "TX 127 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_255_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_255_bytes_packets, 0, "TX 255 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_511_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_511_bytes_packets, 0, "TX 511 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_1023_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_1023_bytes_packets, 0, "TX 1023 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_1518_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_1518_bytes_packets, 0, "TX 1518 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_1522_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_1522_bytes_packets, 0, "TX 1522 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_1548_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_1548_bytes_packets, 0, "TX 1548 Bytes Packets"); SYSCTL_ADD_U64(ctx, node_list, OID_AUTO, "tx_gt_1548_bytes_packets", CTLFLAG_RD, &priv->pkstats.tx_gt_1548_bytes_packets, 0, "TX Greater Then 1548 Bytes Packets"); for (i = 0; i < priv->tx_ring_num; i++) { tx_ring = priv->tx_ring[i]; snprintf(namebuf, sizeof(namebuf), "tx_ring%d", i); ring_node = SYSCTL_ADD_NODE(ctx, node_list, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "TX Ring"); ring_list = SYSCTL_CHILDREN(ring_node); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "packets", CTLFLAG_RD, &tx_ring->packets, 0, "TX packets"); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "bytes", CTLFLAG_RD, &tx_ring->bytes, 0, "TX bytes"); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "tso_packets", CTLFLAG_RD, &tx_ring->tso_packets, 0, "TSO packets"); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "defrag_attempts", CTLFLAG_RD, &tx_ring->defrag_attempts, 0, "Oversized chains defragged"); } for (i = 0; i < priv->rx_ring_num; i++) { rx_ring = priv->rx_ring[i]; snprintf(namebuf, sizeof(namebuf), "rx_ring%d", i); ring_node = SYSCTL_ADD_NODE(ctx, node_list, OID_AUTO, namebuf, CTLFLAG_RD, NULL, "RX Ring"); ring_list = SYSCTL_CHILDREN(ring_node); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "packets", CTLFLAG_RD, &rx_ring->packets, 0, "RX packets"); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "bytes", CTLFLAG_RD, &rx_ring->bytes, 0, "RX bytes"); SYSCTL_ADD_U64(ctx, ring_list, OID_AUTO, "error", CTLFLAG_RD, &rx_ring->errors, 0, "RX soft errors"); } } Index: projects/openssl111/sys/i386/conf/GENERIC =================================================================== --- projects/openssl111/sys/i386/conf/GENERIC (revision 339239) +++ projects/openssl111/sys/i386/conf/GENERIC (revision 339240) @@ -1,380 +1,379 @@ # # GENERIC -- Generic kernel configuration file for FreeBSD/i386 # # For more information on this file, please read the config(5) manual page, # and/or the handbook section on Kernel Configuration Files: # # https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html # # The handbook is also available locally in /usr/share/doc/handbook # if you've installed the doc distribution, otherwise always see the # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the # latest 
information. # # An exhaustive list of options and more detailed explanations of the # device lines is also present in the ../../conf/NOTES and NOTES files. # If you are in doubt as to the purpose or necessity of a line, check first # in NOTES. # # $FreeBSD$ cpu I486_CPU cpu I586_CPU cpu I686_CPU ident GENERIC makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support options SCHED_ULE # ULE scheduler options PREEMPTION # Enable kernel thread preemption options VIMAGE # Subsystem virtualization, e.g. VNET options INET # InterNETworking options INET6 # IPv6 communications protocols options IPSEC # IP (v4/v6) security options IPSEC_SUPPORT # Allow kldload of ipsec and tcpmd5 options TCP_HHOOK # hhook(9) framework for TCP options TCP_OFFLOAD # TCP offload options SCTP # Stream Control Transmission Protocol options FFS # Berkeley Fast Filesystem options SOFTUPDATES # Enable FFS soft updates support options UFS_ACL # Support for access control lists options UFS_DIRHASH # Improve performance on big directories options UFS_GJOURNAL # Enable gjournal-based UFS journaling options QUOTA # Enable disk quotas for UFS options MD_ROOT # MD is a potential root device options NFSCL # Network Filesystem Client options NFSD # Network Filesystem Server options NFSLOCKD # Network Lock Manager options NFS_ROOT # NFS usable as /, requires NFSCL options MSDOSFS # MSDOS Filesystem options CD9660 # ISO 9660 Filesystem options PROCFS # Process filesystem (requires PSEUDOFS) options PSEUDOFS # Pseudo-filesystem framework -options GEOM_PART_GPT # GUID Partition Tables. options GEOM_RAID # Soft RAID functionality. options GEOM_LABEL # Provides labelization options COMPAT_FREEBSD4 # Compatible with FreeBSD4 options COMPAT_FREEBSD5 # Compatible with FreeBSD5 options COMPAT_FREEBSD6 # Compatible with FreeBSD6 options COMPAT_FREEBSD7 # Compatible with FreeBSD7 options COMPAT_FREEBSD9 # Compatible with FreeBSD9 options COMPAT_FREEBSD10 # Compatible with FreeBSD10 options COMPAT_FREEBSD11 # Compatible with FreeBSD11 options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI options KTRACE # ktrace(1) support options STACK # stack(9) support options SYSVSHM # SYSV-style shared memory options SYSVMSG # SYSV-style message queues options SYSVSEM # SYSV-style semaphores options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed. options KBD_INSTALL_CDEV # install a CDEV entry in /dev options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4) options AUDIT # Security event auditing options CAPABILITY_MODE # Capsicum capability mode options CAPABILITIES # Capsicum capabilities options MAC # TrustedBSD MAC Framework options KDTRACE_HOOKS # Kernel DTrace hooks options DDB_CTF # Kernel ELF linker loads CTF data options INCLUDE_CONFIG_FILE # Include this file in kernel options RACCT # Resource accounting framework options RACCT_DEFAULT_TO_DISABLED # Set kern.racct.enable=0 by default options RCTL # Resource limits # Debugging support. Always need this: options KDB # Enable kernel debugger support. options KDB_TRACE # Print a stack trace for a panic. # For full debugger support use (turn off in stable branch): options DDB # Support DDB. options GDB # Support remote GDB. 
options DEADLKRES # Enable the deadlock resolver options INVARIANTS # Enable calls of extra sanity checking options INVARIANT_SUPPORT # Extra sanity checks of internal structures, required by INVARIANTS options WITNESS # Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN # Don't run witness on spinlocks for speed options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones # Kernel dump features. options EKCD # Support for encrypted kernel dumps options GZIO # gzip-compressed kernel and user dumps options ZSTDIO # zstd-compressed kernel and user dumps options NETDUMP # netdump(4) client support # To make an SMP kernel, the next two lines are needed options SMP # Symmetric MultiProcessor Kernel device apic # I/O APIC options EARLY_AP_STARTUP # CPU frequency control device cpufreq # Bus support. device acpi device pci options PCI_HP # PCI-Express native HotPlug options PCI_IOV # PCI SR-IOV support # Floppy drives device fdc # ATA controllers device ahci # AHCI-compatible SATA controllers device ata # Legacy ATA/SATA controllers device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA device siis # SiliconImage SiI3124/SiI3132/SiI3531 SATA # SCSI Controllers device ahc # AHA2940 and onboard AIC7xxx devices device esp # AMD Am53C974 (Tekram DC-390(T)) device hptiop # Highpoint RocketRaid 3xxx series device isp # Qlogic family #device ispfw # Firmware for QLogic HBAs- normally a module device mpt # LSI-Logic MPT-Fusion device mps # LSI-Logic MPT-Fusion 2 device mpr # LSI-Logic MPT-Fusion 3 #device ncr # NCR/Symbios Logic device sym # NCR/Symbios Logic (newer chipsets + those of `ncr') device trm # Tekram DC395U/UW/F DC315U adapters device adv # Advansys SCSI adapters device adw # Advansys wide SCSI adapters device aha # Adaptec 154x SCSI adapters device aic # Adaptec 15[012]x SCSI adapters, AIC-6[23]60. 
device bt # Buslogic/Mylex MultiMaster SCSI adapters device ncv # NCR 53C500 device nsp # Workbit Ninja SCSI-3 device stg # TMC 18C30/18C50 device isci # Intel C600 SAS controller # ATA/SCSI peripherals device scbus # SCSI bus (required for ATA/SCSI) device ch # SCSI media changers device da # Direct Access (disks) device sa # Sequential Access (tape etc) device cd # CD device pass # Passthrough device (direct ATA/SCSI access) device ses # Enclosure Services (SES and SAF-TE) #device ctl # CAM Target Layer # RAID controllers interfaced to the SCSI subsystem device amr # AMI MegaRAID device arcmsr # Areca SATA II RAID device ciss # Compaq Smart RAID 5* device dpt # DPT Smartcache III, IV - See NOTES for options device hptmv # Highpoint RocketRAID 182x device hptnr # Highpoint DC7280, R750 device hptrr # Highpoint RocketRAID 17xx, 22xx, 23xx, 25xx device hpt27xx # Highpoint RocketRAID 27xx device iir # Intel Integrated RAID device ips # IBM (Adaptec) ServeRAID device mly # Mylex AcceleRAID/eXtremeRAID device twa # 3ware 9000 series PATA/SATA RAID device tws # LSI 3ware 9750 SATA+SAS 6Gb/s RAID controller # RAID controllers device aac # Adaptec FSA RAID device aacp # SCSI passthrough for aac (requires CAM) device aacraid # Adaptec by PMC RAID device ida # Compaq Smart RAID device mfi # LSI MegaRAID SAS device mlx # Mylex DAC960 family device mrsas # LSI/Avago MegaRAID SAS/SATA, 6Gb/s and 12Gb/s device pmspcv # PMC-Sierra SAS/SATA Controller driver device pst # Promise Supertrak SX6000 device twe # 3ware ATA RAID # NVM Express (NVMe) support device nvme # base NVMe driver device nvd # expose NVMe namespace as disks, depends on nvme # atkbdc0 controls both the keyboard and the PS/2 mouse device atkbdc # AT keyboard controller device atkbd # AT keyboard device psm # PS/2 mouse device kbdmux # keyboard multiplexer device vga # VGA video card driver options VESA # Add support for VESA BIOS Extensions (VBE) device splash # Splash screen and screen saver support # syscons is the default console driver, resembling an SCO console device sc options SC_PIXEL_MODE # add support for the raster text mode # vt is the new video console driver device vt device vt_vga device agp # support several AGP chipsets # Power management support (see NOTES for more options) #device apm # PCCARD (PCMCIA) support # PCMCIA and cardbus bridge support device cbb # cardbus (yenta) bridge device pccard # PC Card (16-bit) bus device cardbus # CardBus (32-bit) bus # Serial (COM) ports device uart # Generic UART driver # Parallel port device ppc device ppbus # Parallel port bus (required) device lpt # Printer device ppi # Parallel port interface device #device vpo # Requires scbus and da device puc # Multi I/O cards and multi-channel UARTs # PCI Ethernet NICs. device bxe # Broadcom NetXtreme II BCM5771X/BCM578XX 10GbE device de # DEC/Intel DC21x4x (``Tulip'') device em # Intel PRO/1000 Gigabit Ethernet Family device le # AMD Am7900 LANCE and Am79C9xx PCnet device ti # Alteon Networks Tigon I/II gigabit Ethernet device txp # 3Com 3cR990 (``Typhoon'') device vx # 3Com 3c590, 3c595 (``Vortex'') # PCI Ethernet NICs that use the common MII bus controller code. # NOTE: Be sure to keep the 'device miibus' line in order to use these NICs! 
device miibus # MII bus support device ae # Attansic/Atheros L2 FastEthernet device age # Attansic/Atheros L1 Gigabit Ethernet device alc # Atheros AR8131/AR8132 Ethernet device ale # Atheros AR8121/AR8113/AR8114 Ethernet device bce # Broadcom BCM5706/BCM5708 Gigabit Ethernet device bfe # Broadcom BCM440x 10/100 Ethernet device bge # Broadcom BCM570xx Gigabit Ethernet device cas # Sun Cassini/Cassini+ and NS DP83065 Saturn device dc # DEC/Intel 21143 and various workalikes device et # Agere ET1310 10/100/Gigabit Ethernet device fxp # Intel EtherExpress PRO/100B (82557, 82558) device gem # Sun GEM/Sun ERI/Apple GMAC device hme # Sun HME (Happy Meal Ethernet) device jme # JMicron JMC250 Gigabit/JMC260 Fast Ethernet device lge # Level 1 LXT1001 gigabit Ethernet device msk # Marvell/SysKonnect Yukon II Gigabit Ethernet device nfe # nVidia nForce MCP on-board Ethernet device nge # NatSemi DP83820 gigabit Ethernet device pcn # AMD Am79C97x PCI 10/100 (precedence over 'le') device re # RealTek 8139C+/8169/8169S/8110S device rl # RealTek 8129/8139 device sf # Adaptec AIC-6915 (``Starfire'') device sge # Silicon Integrated Systems SiS190/191 device sis # Silicon Integrated Systems SiS 900/SiS 7016 device sk # SysKonnect SK-984x & SK-982x gigabit Ethernet device ste # Sundance ST201 (D-Link DFE-550TX) device stge # Sundance/Tamarack TC9021 gigabit Ethernet device tl # Texas Instruments ThunderLAN device tx # SMC EtherPower II (83c170 ``EPIC'') device vge # VIA VT612x gigabit Ethernet device vr # VIA Rhine, Rhine II device vte # DM&P Vortex86 RDC R6040 Fast Ethernet device wb # Winbond W89C840F device xl # 3Com 3c90x (``Boomerang'', ``Cyclone'') # ISA Ethernet NICs. pccard NICs included. device cs # Crystal Semiconductor CS89x0 NIC # 'device ed' requires 'device miibus' device ed # NE[12]000, SMC Ultra, 3c503, DS8390 cards device ex # Intel EtherExpress Pro/10 and Pro/10+ device ep # Etherlink III based cards device fe # Fujitsu MB8696x based cards device sn # SMC's 9000 series of Ethernet chips device xe # Xircom pccard Ethernet # Wireless NIC cards device wlan # 802.11 support options IEEE80211_DEBUG # enable debug msgs options IEEE80211_AMPDU_AGE # age frames in AMPDU reorder q's options IEEE80211_SUPPORT_MESH # enable 802.11s draft support device wlan_wep # 802.11 WEP support device wlan_ccmp # 802.11 CCMP support device wlan_tkip # 802.11 TKIP support device wlan_amrr # AMRR transmit rate control algorithm device an # Aironet 4500/4800 802.11 wireless NICs. device ath # Atheros NICs device ath_pci # Atheros pci/cardbus glue device ath_hal # pci/cardbus chip support options AH_SUPPORT_AR5416 # enable AR5416 tx/rx descriptors options AH_AR5416_INTERRUPT_MITIGATION # AR5416 interrupt mitigation options ATH_ENABLE_11N # Enable 802.11n support for AR5416 and later device ath_rate_sample # SampleRate tx rate control for ath #device bwi # Broadcom BCM430x/BCM431x wireless NICs. #device bwn # Broadcom BCM43xx wireless NICs. device ipw # Intel 2100 wireless NICs. device iwi # Intel 2200BG/2225BG/2915ABG wireless NICs. device iwn # Intel 4965/1000/5000/6000 wireless NICs. device malo # Marvell Libertas wireless NICs. device mwl # Marvell 88W8363 802.11n wireless NICs. device ral # Ralink Technology RT2500 wireless NICs. device wi # WaveLAN/Intersil/Symbol 802.11 wireless NICs. device wpi # Intel 3945ABG wireless NICs. # Pseudo devices. 
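# Many of the entries below can also be built as modules and loaded at
# boot instead of being compiled in; for example, VLAN support can be
# provided by loading the if_vlan module with if_vlan_load="YES" in
# loader.conf(5) or through kld_list in rc.conf(5).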
device crypto # core crypto support device loop # Network loopback device random # Entropy device device padlock_rng # VIA Padlock RNG device rdrand_rng # Intel Bull Mountain RNG device ether # Ethernet support device vlan # 802.1Q VLAN support device tun # Packet tunnel. device md # Memory "disks" device gif # IPv6 and IPv4 tunneling device firmware # firmware assist module # The `bpf' device enables the Berkeley Packet Filter. # Be aware of the administrative consequences of enabling this! # Note that 'bpf' is required for DHCP. device bpf # Berkeley packet filter # USB support options USB_DEBUG # enable debug msgs device uhci # UHCI PCI->USB interface device ohci # OHCI PCI->USB interface device ehci # EHCI PCI->USB interface (USB 2.0) device xhci # XHCI PCI->USB interface (USB 3.0) device usb # USB Bus (required) device ukbd # Keyboard device umass # Disks/Mass storage - Requires scbus and da # Sound support device sound # Generic sound driver (required) device snd_cmi # CMedia CMI8338/CMI8738 device snd_csa # Crystal Semiconductor CS461x/428x device snd_emu10kx # Creative SoundBlaster Live! and Audigy device snd_es137x # Ensoniq AudioPCI ES137x device snd_hda # Intel High Definition Audio device snd_ich # Intel, NVidia and other ICH AC'97 Audio device snd_via8233 # VIA VT8233x Audio # MMC/SD device mmc # MMC/SD bus device mmcsd # MMC/SD memory card device sdhci # Generic PCI SD Host Controller # VirtIO support device virtio # Generic VirtIO bus (required) device virtio_pci # VirtIO PCI device device vtnet # VirtIO Ethernet device device virtio_blk # VirtIO Block device device virtio_scsi # VirtIO SCSI device device virtio_balloon # VirtIO Memory Balloon device # HyperV drivers and enchancement support device hyperv # HyperV drivers # Xen HVM Guest Optimizations # NOTE: XENHVM depends on xenpci. They must be added or removed together. options XENHVM # Xen HVM kernel infrastructure device xenpci # Xen HVM Hypervisor services driver # VMware support device vmx # VMware VMXNET3 Ethernet Index: projects/openssl111/sys/kern/kern_jail.c =================================================================== --- projects/openssl111/sys/kern/kern_jail.c (revision 339239) +++ projects/openssl111/sys/kern/kern_jail.c (revision 339240) @@ -1,4208 +1,4210 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright (c) 1999 Poul-Henning Kamp. * Copyright (c) 2008 Bjoern A. Zeeb. * Copyright (c) 2009 James Gritton. * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_ddb.h" #include "opt_inet.h" #include "opt_inet6.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef DDB #include #endif /* DDB */ #include #define DEFAULT_HOSTUUID "00000000-0000-0000-0000-000000000000" MALLOC_DEFINE(M_PRISON, "prison", "Prison structures"); static MALLOC_DEFINE(M_PRISON_RACCT, "prison_racct", "Prison racct structures"); /* Keep struct prison prison0 and some code in kern_jail_set() readable. */ #ifdef INET #ifdef INET6 #define _PR_IP_SADDRSEL PR_IP4_SADDRSEL|PR_IP6_SADDRSEL #else #define _PR_IP_SADDRSEL PR_IP4_SADDRSEL #endif #else /* !INET */ #ifdef INET6 #define _PR_IP_SADDRSEL PR_IP6_SADDRSEL #else #define _PR_IP_SADDRSEL 0 #endif #endif /* prison0 describes what is "real" about the system. */ struct prison prison0 = { .pr_id = 0, .pr_name = "0", .pr_ref = 1, .pr_uref = 1, .pr_path = "/", .pr_securelevel = -1, .pr_devfs_rsnum = 0, .pr_childmax = JAIL_MAX, .pr_hostuuid = DEFAULT_HOSTUUID, .pr_children = LIST_HEAD_INITIALIZER(prison0.pr_children), #ifdef VIMAGE .pr_flags = PR_HOST|PR_VNET|_PR_IP_SADDRSEL, #else .pr_flags = PR_HOST|_PR_IP_SADDRSEL, #endif .pr_allow = PR_ALLOW_ALL_STATIC, }; MTX_SYSINIT(prison0, &prison0.pr_mtx, "jail mutex", MTX_DEF); struct bool_flags { const char *name; const char *noname; unsigned flag; }; struct jailsys_flags { const char *name; unsigned disable; unsigned new; }; /* allprison, allprison_racct and lastprid are protected by allprison_lock. */ struct sx allprison_lock; SX_SYSINIT(allprison_lock, &allprison_lock, "allprison"); struct prisonlist allprison = TAILQ_HEAD_INITIALIZER(allprison); LIST_HEAD(, prison_racct) allprison_racct; int lastprid = 0; static int do_jail_attach(struct thread *td, struct prison *pr); static void prison_complete(void *context, int pending); static void prison_deref(struct prison *pr, int flags); static char *prison_path(struct prison *pr1, struct prison *pr2); static void prison_remove_one(struct prison *pr); #ifdef RACCT static void prison_racct_attach(struct prison *pr); static void prison_racct_modify(struct prison *pr); static void prison_racct_detach(struct prison *pr); #endif /* Flags for prison_deref */ #define PD_DEREF 0x01 #define PD_DEUREF 0x02 #define PD_LOCKED 0x04 #define PD_LIST_SLOCKED 0x08 #define PD_LIST_XLOCKED 0x10 /* * Parameter names corresponding to PR_* flag values. Size values are for kvm * as we cannot figure out the size of a sparse array, or an array without a * terminating entry. 
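 * (The *_size constants below are exported precisely so that libkvm
 * consumers can learn the array sizes from the kernel symbol table.)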
*/ static struct bool_flags pr_flag_bool[] = { {"persist", "nopersist", PR_PERSIST}, #ifdef INET {"ip4.saddrsel", "ip4.nosaddrsel", PR_IP4_SADDRSEL}, #endif #ifdef INET6 {"ip6.saddrsel", "ip6.nosaddrsel", PR_IP6_SADDRSEL}, #endif }; const size_t pr_flag_bool_size = sizeof(pr_flag_bool); static struct jailsys_flags pr_flag_jailsys[] = { {"host", 0, PR_HOST}, #ifdef VIMAGE {"vnet", 0, PR_VNET}, #endif #ifdef INET {"ip4", PR_IP4_USER, PR_IP4_USER}, #endif #ifdef INET6 {"ip6", PR_IP6_USER, PR_IP6_USER}, #endif }; const size_t pr_flag_jailsys_size = sizeof(pr_flag_jailsys); /* Make this array full-size so dynamic parameters can be added. */ static struct bool_flags pr_flag_allow[NBBY * NBPW] = { {"allow.set_hostname", "allow.noset_hostname", PR_ALLOW_SET_HOSTNAME}, {"allow.sysvipc", "allow.nosysvipc", PR_ALLOW_SYSVIPC}, {"allow.raw_sockets", "allow.noraw_sockets", PR_ALLOW_RAW_SOCKETS}, {"allow.chflags", "allow.nochflags", PR_ALLOW_CHFLAGS}, {"allow.mount", "allow.nomount", PR_ALLOW_MOUNT}, {"allow.quotas", "allow.noquotas", PR_ALLOW_QUOTAS}, {"allow.socket_af", "allow.nosocket_af", PR_ALLOW_SOCKET_AF}, {"allow.mlock", "allow.nomlock", PR_ALLOW_MLOCK}, {"allow.reserved_ports", "allow.noreserved_ports", PR_ALLOW_RESERVED_PORTS}, }; const size_t pr_flag_allow_size = sizeof(pr_flag_allow); #define JAIL_DEFAULT_ALLOW (PR_ALLOW_SET_HOSTNAME | PR_ALLOW_RESERVED_PORTS) #define JAIL_DEFAULT_ENFORCE_STATFS 2 #define JAIL_DEFAULT_DEVFS_RSNUM 0 static unsigned jail_default_allow = JAIL_DEFAULT_ALLOW; static int jail_default_enforce_statfs = JAIL_DEFAULT_ENFORCE_STATFS; static int jail_default_devfs_rsnum = JAIL_DEFAULT_DEVFS_RSNUM; #if defined(INET) || defined(INET6) static unsigned jail_max_af_ips = 255; #endif /* * Initialize the parts of prison0 that can't be static-initialized with * constants. This is called from proc0_init() after creating thread0 cpuset. */ void prison0_init(void) { prison0.pr_cpuset = cpuset_ref(thread0.td_cpuset); prison0.pr_osreldate = osreldate; strlcpy(prison0.pr_osrelease, osrelease, sizeof(prison0.pr_osrelease)); } /* * struct jail_args { * struct jail *jail; * }; */ int sys_jail(struct thread *td, struct jail_args *uap) { uint32_t version; int error; struct jail j; error = copyin(uap->jail, &version, sizeof(uint32_t)); if (error) return (error); switch (version) { case 0: { struct jail_v0 j0; /* FreeBSD single IPv4 jails. */ bzero(&j, sizeof(struct jail)); error = copyin(uap->jail, &j0, sizeof(struct jail_v0)); if (error) return (error); j.version = j0.version; j.path = j0.path; j.hostname = j0.hostname; j.ip4s = htonl(j0.ip_number); /* jail_v0 is host order */ break; } case 1: /* * Version 1 was used by multi-IPv4 jail implementations * that never made it into the official kernel. */ return (EINVAL); case 2: /* JAIL_API_VERSION */ /* FreeBSD multi-IPv4/IPv6,noIP jails. */ error = copyin(uap->jail, &j, sizeof(struct jail)); if (error) return (error); break; default: /* Sci-Fi jails are not supported, sorry. 
*/ return (EINVAL); } return (kern_jail(td, &j)); } int kern_jail(struct thread *td, struct jail *j) { struct iovec optiov[2 * (4 + nitems(pr_flag_allow) #ifdef INET + 1 #endif #ifdef INET6 + 1 #endif )]; struct uio opt; char *u_path, *u_hostname, *u_name; struct bool_flags *bf; #ifdef INET uint32_t ip4s; struct in_addr *u_ip4; #endif #ifdef INET6 struct in6_addr *u_ip6; #endif size_t tmplen; int error, enforce_statfs; bzero(&optiov, sizeof(optiov)); opt.uio_iov = optiov; opt.uio_iovcnt = 0; opt.uio_offset = -1; opt.uio_resid = -1; opt.uio_segflg = UIO_SYSSPACE; opt.uio_rw = UIO_READ; opt.uio_td = td; /* Set permissions for top-level jails from sysctls. */ if (!jailed(td->td_ucred)) { for (bf = pr_flag_allow; bf < pr_flag_allow + nitems(pr_flag_allow) && bf->flag != 0; bf++) { optiov[opt.uio_iovcnt].iov_base = __DECONST(char *, (jail_default_allow & bf->flag) ? bf->name : bf->noname); optiov[opt.uio_iovcnt].iov_len = strlen(optiov[opt.uio_iovcnt].iov_base) + 1; opt.uio_iovcnt += 2; } optiov[opt.uio_iovcnt].iov_base = "enforce_statfs"; optiov[opt.uio_iovcnt].iov_len = sizeof("enforce_statfs"); opt.uio_iovcnt++; enforce_statfs = jail_default_enforce_statfs; optiov[opt.uio_iovcnt].iov_base = &enforce_statfs; optiov[opt.uio_iovcnt].iov_len = sizeof(enforce_statfs); opt.uio_iovcnt++; } tmplen = MAXPATHLEN + MAXHOSTNAMELEN + MAXHOSTNAMELEN; #ifdef INET ip4s = (j->version == 0) ? 1 : j->ip4s; if (ip4s > jail_max_af_ips) return (EINVAL); tmplen += ip4s * sizeof(struct in_addr); #else if (j->ip4s > 0) return (EINVAL); #endif #ifdef INET6 if (j->ip6s > jail_max_af_ips) return (EINVAL); tmplen += j->ip6s * sizeof(struct in6_addr); #else if (j->ip6s > 0) return (EINVAL); #endif u_path = malloc(tmplen, M_TEMP, M_WAITOK); u_hostname = u_path + MAXPATHLEN; u_name = u_hostname + MAXHOSTNAMELEN; #ifdef INET u_ip4 = (struct in_addr *)(u_name + MAXHOSTNAMELEN); #endif #ifdef INET6 #ifdef INET u_ip6 = (struct in6_addr *)(u_ip4 + ip4s); #else u_ip6 = (struct in6_addr *)(u_name + MAXHOSTNAMELEN); #endif #endif optiov[opt.uio_iovcnt].iov_base = "path"; optiov[opt.uio_iovcnt].iov_len = sizeof("path"); opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = u_path; error = copyinstr(j->path, u_path, MAXPATHLEN, &optiov[opt.uio_iovcnt].iov_len); if (error) { free(u_path, M_TEMP); return (error); } opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = "host.hostname"; optiov[opt.uio_iovcnt].iov_len = sizeof("host.hostname"); opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = u_hostname; error = copyinstr(j->hostname, u_hostname, MAXHOSTNAMELEN, &optiov[opt.uio_iovcnt].iov_len); if (error) { free(u_path, M_TEMP); return (error); } opt.uio_iovcnt++; if (j->jailname != NULL) { optiov[opt.uio_iovcnt].iov_base = "name"; optiov[opt.uio_iovcnt].iov_len = sizeof("name"); opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = u_name; error = copyinstr(j->jailname, u_name, MAXHOSTNAMELEN, &optiov[opt.uio_iovcnt].iov_len); if (error) { free(u_path, M_TEMP); return (error); } opt.uio_iovcnt++; } #ifdef INET optiov[opt.uio_iovcnt].iov_base = "ip4.addr"; optiov[opt.uio_iovcnt].iov_len = sizeof("ip4.addr"); opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = u_ip4; optiov[opt.uio_iovcnt].iov_len = ip4s * sizeof(struct in_addr); if (j->version == 0) u_ip4->s_addr = j->ip4s; else { error = copyin(j->ip4, u_ip4, optiov[opt.uio_iovcnt].iov_len); if (error) { free(u_path, M_TEMP); return (error); } } opt.uio_iovcnt++; #endif #ifdef INET6 optiov[opt.uio_iovcnt].iov_base = "ip6.addr"; optiov[opt.uio_iovcnt].iov_len = sizeof("ip6.addr"); 
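/*
 * Illustrative only: the vector assembled here is a flat list of
 * name/value iovec pairs, e.g. for a hypothetical top-level call
 *     "path" "/data/jail/www"   "host.hostname" "www.example.org"
 *     "name" "www"   "ip4.addr" <array of struct in_addr>
 *     "ip6.addr" <array of struct in6_addr>
 * preceded, for non-jailed callers, by the default allow.* booleans
 * (with empty values) and "enforce_statfs".  kern_jail_set() consumes
 * it exactly as it would the iovecs passed in by jail_set(2).
 */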
opt.uio_iovcnt++; optiov[opt.uio_iovcnt].iov_base = u_ip6; optiov[opt.uio_iovcnt].iov_len = j->ip6s * sizeof(struct in6_addr); error = copyin(j->ip6, u_ip6, optiov[opt.uio_iovcnt].iov_len); if (error) { free(u_path, M_TEMP); return (error); } opt.uio_iovcnt++; #endif KASSERT(opt.uio_iovcnt <= nitems(optiov), ("kern_jail: too many iovecs (%d)", opt.uio_iovcnt)); error = kern_jail_set(td, &opt, JAIL_CREATE | JAIL_ATTACH); free(u_path, M_TEMP); return (error); } /* * struct jail_set_args { * struct iovec *iovp; * unsigned int iovcnt; * int flags; * }; */ int sys_jail_set(struct thread *td, struct jail_set_args *uap) { struct uio *auio; int error; /* Check that we have an even number of iovecs. */ if (uap->iovcnt & 1) return (EINVAL); error = copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_jail_set(td, auio, uap->flags); free(auio, M_IOV); return (error); } int kern_jail_set(struct thread *td, struct uio *optuio, int flags) { struct nameidata nd; #ifdef INET struct in_addr *ip4; #endif #ifdef INET6 struct in6_addr *ip6; #endif struct vfsopt *opt; struct vfsoptlist *opts; struct prison *pr, *deadpr, *mypr, *ppr, *tpr; struct vnode *root; char *domain, *errmsg, *host, *name, *namelc, *p, *path, *uuid; char *g_path, *osrelstr; struct bool_flags *bf; struct jailsys_flags *jsf; #if defined(INET) || defined(INET6) struct prison *tppr; void *op; #endif unsigned long hid; size_t namelen, onamelen, pnamelen; int born, created, cuflags, descend, enforce; int error, errmsg_len, errmsg_pos; int gotchildmax, gotenforce, gothid, gotrsnum, gotslevel; int jid, jsys, len, level; int childmax, osreldt, rsnum, slevel; int fullpath_disabled; #if defined(INET) || defined(INET6) int ii, ij; #endif #ifdef INET int ip4s, redo_ip4; #endif #ifdef INET6 int ip6s, redo_ip6; #endif uint64_t pr_allow, ch_allow, pr_flags, ch_flags; unsigned tallow; char numbuf[12]; error = priv_check(td, PRIV_JAIL_SET); if (!error && (flags & JAIL_ATTACH)) error = priv_check(td, PRIV_JAIL_ATTACH); if (error) return (error); mypr = td->td_ucred->cr_prison; if ((flags & JAIL_CREATE) && mypr->pr_childmax == 0) return (EPERM); if (flags & ~JAIL_SET_MASK) return (EINVAL); /* * Check all the parameters before committing to anything. Not all * errors can be caught early, but we may as well try. Also, this * takes care of some expensive stuff (path lookup) before getting * the allprison lock. * * XXX Jails are not filesystems, and jail parameters are not mount * options. But it makes more sense to re-use the vfsopt code * than duplicate it under a different name. 
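 * A request built by jail(8), for example, arrives as name/value iovec
 * pairs such as { "path", "/some/root" } or { "persist", NULL } (boolean
 * parameters carry a NULL, zero-length value), and vfs_buildopts() below
 * turns that list into the vfsoptlist consumed by the rest of this
 * function.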
*/ error = vfs_buildopts(optuio, &opts); if (error) return (error); #ifdef INET ip4 = NULL; #endif #ifdef INET6 ip6 = NULL; #endif g_path = NULL; cuflags = flags & (JAIL_CREATE | JAIL_UPDATE); if (!cuflags) { error = EINVAL; vfs_opterror(opts, "no valid operation (create or update)"); goto done_errmsg; } error = vfs_copyopt(opts, "jid", &jid, sizeof(jid)); if (error == ENOENT) jid = 0; else if (error != 0) goto done_free; error = vfs_copyopt(opts, "securelevel", &slevel, sizeof(slevel)); if (error == ENOENT) gotslevel = 0; else if (error != 0) goto done_free; else gotslevel = 1; error = vfs_copyopt(opts, "children.max", &childmax, sizeof(childmax)); if (error == ENOENT) gotchildmax = 0; else if (error != 0) goto done_free; else gotchildmax = 1; error = vfs_copyopt(opts, "enforce_statfs", &enforce, sizeof(enforce)); if (error == ENOENT) gotenforce = 0; else if (error != 0) goto done_free; else if (enforce < 0 || enforce > 2) { error = EINVAL; goto done_free; } else gotenforce = 1; error = vfs_copyopt(opts, "devfs_ruleset", &rsnum, sizeof(rsnum)); if (error == ENOENT) gotrsnum = 0; else if (error != 0) goto done_free; else gotrsnum = 1; pr_flags = ch_flags = 0; for (bf = pr_flag_bool; bf < pr_flag_bool + nitems(pr_flag_bool); bf++) { vfs_flagopt(opts, bf->name, &pr_flags, bf->flag); vfs_flagopt(opts, bf->noname, &ch_flags, bf->flag); } ch_flags |= pr_flags; for (jsf = pr_flag_jailsys; jsf < pr_flag_jailsys + nitems(pr_flag_jailsys); jsf++) { error = vfs_copyopt(opts, jsf->name, &jsys, sizeof(jsys)); if (error == ENOENT) continue; if (error != 0) goto done_free; switch (jsys) { case JAIL_SYS_DISABLE: if (!jsf->disable) { error = EINVAL; goto done_free; } pr_flags |= jsf->disable; break; case JAIL_SYS_NEW: pr_flags |= jsf->new; break; case JAIL_SYS_INHERIT: break; default: error = EINVAL; goto done_free; } ch_flags |= jsf->new | jsf->disable; } if ((flags & (JAIL_CREATE | JAIL_UPDATE | JAIL_ATTACH)) == JAIL_CREATE && !(pr_flags & PR_PERSIST)) { error = EINVAL; vfs_opterror(opts, "new jail must persist or attach"); goto done_errmsg; } #ifdef VIMAGE if ((flags & JAIL_UPDATE) && (ch_flags & PR_VNET)) { error = EINVAL; vfs_opterror(opts, "vnet cannot be changed after creation"); goto done_errmsg; } #endif #ifdef INET if ((flags & JAIL_UPDATE) && (ch_flags & PR_IP4_USER)) { error = EINVAL; vfs_opterror(opts, "ip4 cannot be changed after creation"); goto done_errmsg; } #endif #ifdef INET6 if ((flags & JAIL_UPDATE) && (ch_flags & PR_IP6_USER)) { error = EINVAL; vfs_opterror(opts, "ip6 cannot be changed after creation"); goto done_errmsg; } #endif pr_allow = ch_allow = 0; for (bf = pr_flag_allow; bf < pr_flag_allow + nitems(pr_flag_allow) && bf->flag != 0; bf++) { vfs_flagopt(opts, bf->name, &pr_allow, bf->flag); vfs_flagopt(opts, bf->noname, &ch_allow, bf->flag); } ch_allow |= pr_allow; error = vfs_getopt(opts, "name", (void **)&name, &len); if (error == ENOENT) name = NULL; else if (error != 0) goto done_free; else { if (len == 0 || name[len - 1] != '\0') { error = EINVAL; goto done_free; } if (len > MAXHOSTNAMELEN) { error = ENAMETOOLONG; goto done_free; } } error = vfs_getopt(opts, "host.hostname", (void **)&host, &len); if (error == ENOENT) host = NULL; else if (error != 0) goto done_free; else { ch_flags |= PR_HOST; pr_flags |= PR_HOST; if (len == 0 || host[len - 1] != '\0') { error = EINVAL; goto done_free; } if (len > MAXHOSTNAMELEN) { error = ENAMETOOLONG; goto done_free; } } error = vfs_getopt(opts, "host.domainname", (void **)&domain, &len); if (error == ENOENT) domain = NULL; else if 
(error != 0) goto done_free; else { ch_flags |= PR_HOST; pr_flags |= PR_HOST; if (len == 0 || domain[len - 1] != '\0') { error = EINVAL; goto done_free; } if (len > MAXHOSTNAMELEN) { error = ENAMETOOLONG; goto done_free; } } error = vfs_getopt(opts, "host.hostuuid", (void **)&uuid, &len); if (error == ENOENT) uuid = NULL; else if (error != 0) goto done_free; else { ch_flags |= PR_HOST; pr_flags |= PR_HOST; if (len == 0 || uuid[len - 1] != '\0') { error = EINVAL; goto done_free; } if (len > HOSTUUIDLEN) { error = ENAMETOOLONG; goto done_free; } } #ifdef COMPAT_FREEBSD32 if (SV_PROC_FLAG(td->td_proc, SV_ILP32)) { uint32_t hid32; error = vfs_copyopt(opts, "host.hostid", &hid32, sizeof(hid32)); hid = hid32; } else #endif error = vfs_copyopt(opts, "host.hostid", &hid, sizeof(hid)); if (error == ENOENT) gothid = 0; else if (error != 0) goto done_free; else { gothid = 1; ch_flags |= PR_HOST; pr_flags |= PR_HOST; } #ifdef INET error = vfs_getopt(opts, "ip4.addr", &op, &ip4s); if (error == ENOENT) ip4s = 0; else if (error != 0) goto done_free; else if (ip4s & (sizeof(*ip4) - 1)) { error = EINVAL; goto done_free; } else { ch_flags |= PR_IP4_USER; pr_flags |= PR_IP4_USER; if (ip4s > 0) { ip4s /= sizeof(*ip4); if (ip4s > jail_max_af_ips) { error = EINVAL; vfs_opterror(opts, "too many IPv4 addresses"); goto done_errmsg; } ip4 = malloc(ip4s * sizeof(*ip4), M_PRISON, M_WAITOK); bcopy(op, ip4, ip4s * sizeof(*ip4)); /* * IP addresses are all sorted but ip[0] to preserve * the primary IP address as given from userland. * This special IP is used for unbound outgoing * connections as well for "loopback" traffic in case * source address selection cannot find any more fitting * address to connect from. */ if (ip4s > 1) qsort(ip4 + 1, ip4s - 1, sizeof(*ip4), prison_qcmp_v4); /* * Check for duplicate addresses and do some simple * zero and broadcast checks. If users give other bogus * addresses it is their problem. * * We do not have to care about byte order for these * checks so we will do them in NBO. 
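 * Since entries 1..n-1 were sorted above, duplicates among them are
 * adjacent; ip4[0] was left in place, so it is instead compared against
 * every following entry.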
*/ for (ii = 0; ii < ip4s; ii++) { if (ip4[ii].s_addr == INADDR_ANY || ip4[ii].s_addr == INADDR_BROADCAST) { error = EINVAL; goto done_free; } if ((ii+1) < ip4s && (ip4[0].s_addr == ip4[ii+1].s_addr || ip4[ii].s_addr == ip4[ii+1].s_addr)) { error = EINVAL; goto done_free; } } } } #endif #ifdef INET6 error = vfs_getopt(opts, "ip6.addr", &op, &ip6s); if (error == ENOENT) ip6s = 0; else if (error != 0) goto done_free; else if (ip6s & (sizeof(*ip6) - 1)) { error = EINVAL; goto done_free; } else { ch_flags |= PR_IP6_USER; pr_flags |= PR_IP6_USER; if (ip6s > 0) { ip6s /= sizeof(*ip6); if (ip6s > jail_max_af_ips) { error = EINVAL; vfs_opterror(opts, "too many IPv6 addresses"); goto done_errmsg; } ip6 = malloc(ip6s * sizeof(*ip6), M_PRISON, M_WAITOK); bcopy(op, ip6, ip6s * sizeof(*ip6)); if (ip6s > 1) qsort(ip6 + 1, ip6s - 1, sizeof(*ip6), prison_qcmp_v6); for (ii = 0; ii < ip6s; ii++) { if (IN6_IS_ADDR_UNSPECIFIED(&ip6[ii])) { error = EINVAL; goto done_free; } if ((ii+1) < ip6s && (IN6_ARE_ADDR_EQUAL(&ip6[0], &ip6[ii+1]) || IN6_ARE_ADDR_EQUAL(&ip6[ii], &ip6[ii+1]))) { error = EINVAL; goto done_free; } } } } #endif #if defined(VIMAGE) && (defined(INET) || defined(INET6)) if ((ch_flags & PR_VNET) && (ch_flags & (PR_IP4_USER | PR_IP6_USER))) { error = EINVAL; vfs_opterror(opts, "vnet jails cannot have IP address restrictions"); goto done_errmsg; } #endif error = vfs_getopt(opts, "osrelease", (void **)&osrelstr, &len); if (error == ENOENT) osrelstr = NULL; else if (error != 0) goto done_free; else { if (flags & JAIL_UPDATE) { error = EINVAL; vfs_opterror(opts, "osrelease cannot be changed after creation"); goto done_errmsg; } if (len == 0 || len >= OSRELEASELEN) { error = EINVAL; vfs_opterror(opts, "osrelease string must be 1-%d bytes long", OSRELEASELEN - 1); goto done_errmsg; } } error = vfs_copyopt(opts, "osreldate", &osreldt, sizeof(osreldt)); if (error == ENOENT) osreldt = 0; else if (error != 0) goto done_free; else { if (flags & JAIL_UPDATE) { error = EINVAL; vfs_opterror(opts, "osreldate cannot be changed after creation"); goto done_errmsg; } if (osreldt == 0) { error = EINVAL; vfs_opterror(opts, "osreldate cannot be 0"); goto done_errmsg; } } fullpath_disabled = 0; root = NULL; error = vfs_getopt(opts, "path", (void **)&path, &len); if (error == ENOENT) path = NULL; else if (error != 0) goto done_free; else { if (flags & JAIL_UPDATE) { error = EINVAL; vfs_opterror(opts, "path cannot be changed after creation"); goto done_errmsg; } if (len == 0 || path[len - 1] != '\0') { error = EINVAL; goto done_free; } NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, path, td); error = namei(&nd); if (error) goto done_free; root = nd.ni_vp; NDFREE(&nd, NDF_ONLY_PNBUF); g_path = malloc(MAXPATHLEN, M_TEMP, M_WAITOK); strlcpy(g_path, path, MAXPATHLEN); error = vn_path_to_global_path(td, root, g_path, MAXPATHLEN); if (error == 0) path = g_path; else if (error == ENODEV) { /* proceed if sysctl debug.disablefullpath == 1 */ fullpath_disabled = 1; if (len < 2 || (len == 2 && path[0] == '/')) path = NULL; } else { /* exit on other errors */ goto done_free; } if (root->v_type != VDIR) { error = ENOTDIR; vput(root); goto done_free; } VOP_UNLOCK(root, 0); if (fullpath_disabled) { /* Leave room for a real-root full pathname. */ if (len + (path[0] == '/' && strcmp(mypr->pr_path, "/") ? strlen(mypr->pr_path) : 0) > MAXPATHLEN) { error = ENAMETOOLONG; vrele(root); goto done_free; } } } /* * Find the specified jail, or at least its parent. * This abuses the file error codes ENOENT and EEXIST. 
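 * ENOENT is returned when a jail that should exist does not, and EEXIST
 * when a jail being created already does.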
*/ pr = NULL; ppr = mypr; if (cuflags == JAIL_CREATE && jid == 0 && name != NULL) { namelc = strrchr(name, '.'); jid = strtoul(namelc != NULL ? namelc + 1 : name, &p, 10); if (*p != '\0') jid = 0; } sx_xlock(&allprison_lock); if (jid != 0) { /* * See if a requested jid already exists. There is an * information leak here if the jid exists but is not within * the caller's jail hierarchy. Jail creators will get EEXIST * even though they cannot see the jail, and CREATE | UPDATE * will return ENOENT which is not normally a valid error. */ if (jid < 0) { error = EINVAL; vfs_opterror(opts, "negative jid"); goto done_unlock_list; } pr = prison_find(jid); if (pr != NULL) { ppr = pr->pr_parent; /* Create: jid must not exist. */ if (cuflags == JAIL_CREATE) { mtx_unlock(&pr->pr_mtx); error = EEXIST; vfs_opterror(opts, "jail %d already exists", jid); goto done_unlock_list; } if (!prison_ischild(mypr, pr)) { mtx_unlock(&pr->pr_mtx); pr = NULL; } else if (pr->pr_uref == 0) { if (!(flags & JAIL_DYING)) { mtx_unlock(&pr->pr_mtx); error = ENOENT; vfs_opterror(opts, "jail %d is dying", jid); goto done_unlock_list; } else if ((flags & JAIL_ATTACH) || (pr_flags & PR_PERSIST)) { /* * A dying jail might be resurrected * (via attach or persist), but first * it must determine if another jail * has claimed its name. Accomplish * this by implicitly re-setting the * name. */ if (name == NULL) name = prison_name(mypr, pr); } } } if (pr == NULL) { /* Update: jid must exist. */ if (cuflags == JAIL_UPDATE) { error = ENOENT; vfs_opterror(opts, "jail %d not found", jid); goto done_unlock_list; } } } /* * If the caller provided a name, look for a jail by that name. * This has different semantics for creates and updates keyed by jid * (where the name must not already exist in a different jail), * and updates keyed by the name itself (where the name must exist * because that is the jail being updated). */ namelc = NULL; if (name != NULL) { namelc = strrchr(name, '.'); if (namelc == NULL) namelc = name; else { /* * This is a hierarchical name. Split it into the * parent and child names, and make sure the parent * exists or matches an already found jail. */ if (pr != NULL) { if (strncmp(name, ppr->pr_name, namelc - name) || ppr->pr_name[namelc - name] != '\0') { mtx_unlock(&pr->pr_mtx); error = EINVAL; vfs_opterror(opts, "cannot change jail's parent"); goto done_unlock_list; } } else { *namelc = '\0'; ppr = prison_find_name(mypr, name); if (ppr == NULL) { error = ENOENT; vfs_opterror(opts, "jail \"%s\" not found", name); goto done_unlock_list; } mtx_unlock(&ppr->pr_mtx); *namelc = '.'; } namelc++; } if (namelc[0] != '\0') { pnamelen = (ppr == &prison0) ? 0 : strlen(ppr->pr_name) + 1; name_again: deadpr = NULL; FOREACH_PRISON_CHILD(ppr, tpr) { if (tpr != pr && tpr->pr_ref > 0 && !strcmp(tpr->pr_name + pnamelen, namelc)) { if (pr == NULL && cuflags != JAIL_CREATE) { mtx_lock(&tpr->pr_mtx); if (tpr->pr_ref > 0) { /* * Use this jail * for updates. */ if (tpr->pr_uref > 0) { pr = tpr; break; } deadpr = tpr; } mtx_unlock(&tpr->pr_mtx); } else if (tpr->pr_uref > 0) { /* * Create, or update(jid): * name must not exist in an * active sibling jail. */ error = EEXIST; if (pr != NULL) mtx_unlock(&pr->pr_mtx); vfs_opterror(opts, "jail \"%s\" already exists", name); goto done_unlock_list; } } } /* If no active jail is found, use a dying one. 
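 * This only happens when the caller asked for JAIL_DYING; a plain
 * update by name reports the jail as dying instead.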
*/ if (deadpr != NULL && pr == NULL) { if (flags & JAIL_DYING) { mtx_lock(&deadpr->pr_mtx); if (deadpr->pr_ref == 0) { mtx_unlock(&deadpr->pr_mtx); goto name_again; } pr = deadpr; } else if (cuflags == JAIL_UPDATE) { error = ENOENT; vfs_opterror(opts, "jail \"%s\" is dying", name); goto done_unlock_list; } } /* Update: name must exist if no jid. */ else if (cuflags == JAIL_UPDATE && pr == NULL) { error = ENOENT; vfs_opterror(opts, "jail \"%s\" not found", name); goto done_unlock_list; } } } /* Update: must provide a jid or name. */ else if (cuflags == JAIL_UPDATE && pr == NULL) { error = ENOENT; vfs_opterror(opts, "update specified no jail"); goto done_unlock_list; } /* If there's no prison to update, create a new one and link it in. */ if (pr == NULL) { for (tpr = mypr; tpr != NULL; tpr = tpr->pr_parent) if (tpr->pr_childcount >= tpr->pr_childmax) { error = EPERM; vfs_opterror(opts, "prison limit exceeded"); goto done_unlock_list; } created = 1; mtx_lock(&ppr->pr_mtx); if (ppr->pr_ref == 0) { mtx_unlock(&ppr->pr_mtx); error = ENOENT; vfs_opterror(opts, "jail \"%s\" not found", prison_name(mypr, ppr)); goto done_unlock_list; } ppr->pr_ref++; ppr->pr_uref++; mtx_unlock(&ppr->pr_mtx); pr = malloc(sizeof(*pr), M_PRISON, M_WAITOK | M_ZERO); if (jid == 0) { /* Find the next free jid. */ jid = lastprid + 1; findnext: if (jid == JAIL_MAX) jid = 1; TAILQ_FOREACH(tpr, &allprison, pr_list) { if (tpr->pr_id < jid) continue; if (tpr->pr_id > jid || tpr->pr_ref == 0) { TAILQ_INSERT_BEFORE(tpr, pr, pr_list); break; } if (jid == lastprid) { error = EAGAIN; vfs_opterror(opts, "no available jail IDs"); free(pr, M_PRISON); prison_deref(ppr, PD_DEREF | PD_DEUREF | PD_LIST_XLOCKED); goto done_releroot; } jid++; goto findnext; } lastprid = jid; } else { /* * The jail already has a jid (that did not yet exist), * so just find where to insert it. */ TAILQ_FOREACH(tpr, &allprison, pr_list) if (tpr->pr_id >= jid) { TAILQ_INSERT_BEFORE(tpr, pr, pr_list); break; } } if (tpr == NULL) TAILQ_INSERT_TAIL(&allprison, pr, pr_list); LIST_INSERT_HEAD(&ppr->pr_children, pr, pr_sibling); for (tpr = ppr; tpr != NULL; tpr = tpr->pr_parent) tpr->pr_childcount++; pr->pr_parent = ppr; pr->pr_id = jid; /* Set some default values, and inherit some from the parent. */ if (namelc == NULL) namelc = ""; if (path == NULL) { path = "/"; root = mypr->pr_root; vref(root); } strlcpy(pr->pr_hostuuid, DEFAULT_HOSTUUID, HOSTUUIDLEN); pr->pr_flags |= PR_HOST; #if defined(INET) || defined(INET6) #ifdef VIMAGE if (!(pr_flags & PR_VNET)) #endif { #ifdef INET if (!(ch_flags & PR_IP4_USER)) pr->pr_flags |= PR_IP4 | PR_IP4_USER; else if (!(pr_flags & PR_IP4_USER)) { pr->pr_flags |= ppr->pr_flags & PR_IP4; if (ppr->pr_ip4 != NULL) { pr->pr_ip4s = ppr->pr_ip4s; pr->pr_ip4 = malloc(pr->pr_ip4s * sizeof(struct in_addr), M_PRISON, M_WAITOK); bcopy(ppr->pr_ip4, pr->pr_ip4, pr->pr_ip4s * sizeof(*pr->pr_ip4)); } } #endif #ifdef INET6 if (!(ch_flags & PR_IP6_USER)) pr->pr_flags |= PR_IP6 | PR_IP6_USER; else if (!(pr_flags & PR_IP6_USER)) { pr->pr_flags |= ppr->pr_flags & PR_IP6; if (ppr->pr_ip6 != NULL) { pr->pr_ip6s = ppr->pr_ip6s; pr->pr_ip6 = malloc(pr->pr_ip6s * sizeof(struct in6_addr), M_PRISON, M_WAITOK); bcopy(ppr->pr_ip6, pr->pr_ip6, pr->pr_ip6s * sizeof(*pr->pr_ip6)); } } #endif } #endif /* Source address selection is always on by default. 
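 * It can be turned off per address family with the ip4.nosaddrsel and
 * ip6.nosaddrsel parameters.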
*/ pr->pr_flags |= _PR_IP_SADDRSEL; pr->pr_securelevel = ppr->pr_securelevel; pr->pr_allow = JAIL_DEFAULT_ALLOW & ppr->pr_allow; pr->pr_enforce_statfs = jail_default_enforce_statfs; pr->pr_devfs_rsnum = ppr->pr_devfs_rsnum; pr->pr_osreldate = osreldt ? osreldt : ppr->pr_osreldate; if (osrelstr == NULL) strcpy(pr->pr_osrelease, ppr->pr_osrelease); else strcpy(pr->pr_osrelease, osrelstr); LIST_INIT(&pr->pr_children); mtx_init(&pr->pr_mtx, "jail mutex", NULL, MTX_DEF | MTX_DUPOK); TASK_INIT(&pr->pr_task, 0, prison_complete, pr); #ifdef VIMAGE /* Allocate a new vnet if specified. */ pr->pr_vnet = (pr_flags & PR_VNET) ? vnet_alloc() : ppr->pr_vnet; #endif /* * Allocate a dedicated cpuset for each jail. * Unlike other initial settings, this may return an erorr. */ error = cpuset_create_root(ppr, &pr->pr_cpuset); if (error) { prison_deref(pr, PD_LIST_XLOCKED); goto done_releroot; } mtx_lock(&pr->pr_mtx); /* * New prisons do not yet have a reference, because we do not * want others to see the incomplete prison once the * allprison_lock is downgraded. */ } else { created = 0; /* * Grab a reference for existing prisons, to ensure they * continue to exist for the duration of the call. */ pr->pr_ref++; #if defined(VIMAGE) && (defined(INET) || defined(INET6)) if ((pr->pr_flags & PR_VNET) && (ch_flags & (PR_IP4_USER | PR_IP6_USER))) { error = EINVAL; vfs_opterror(opts, "vnet jails cannot have IP address restrictions"); goto done_deref_locked; } #endif #ifdef INET if (PR_IP4_USER & ch_flags & (pr_flags ^ pr->pr_flags)) { error = EINVAL; vfs_opterror(opts, "ip4 cannot be changed after creation"); goto done_deref_locked; } #endif #ifdef INET6 if (PR_IP6_USER & ch_flags & (pr_flags ^ pr->pr_flags)) { error = EINVAL; vfs_opterror(opts, "ip6 cannot be changed after creation"); goto done_deref_locked; } #endif } /* Do final error checking before setting anything. */ if (gotslevel) { if (slevel < ppr->pr_securelevel) { error = EPERM; goto done_deref_locked; } } if (gotchildmax) { if (childmax >= ppr->pr_childmax) { error = EPERM; goto done_deref_locked; } } if (gotenforce) { if (enforce < ppr->pr_enforce_statfs) { error = EPERM; goto done_deref_locked; } } if (gotrsnum) { /* * devfs_rsnum is a uint16_t */ if (rsnum < 0 || rsnum > 65535) { error = EINVAL; goto done_deref_locked; } /* * Nested jails always inherit parent's devfs ruleset */ if (jailed(td->td_ucred)) { if (rsnum > 0 && rsnum != ppr->pr_devfs_rsnum) { error = EPERM; goto done_deref_locked; } else rsnum = ppr->pr_devfs_rsnum; } } #ifdef INET if (ip4s > 0) { if (ppr->pr_flags & PR_IP4) { /* * Make sure the new set of IP addresses is a * subset of the parent's list. Don't worry * about the parent being unlocked, as any * setting is done with allprison_lock held. */ for (ij = 0; ij < ppr->pr_ip4s; ij++) if (ip4[0].s_addr == ppr->pr_ip4[ij].s_addr) break; if (ij == ppr->pr_ip4s) { error = EPERM; goto done_deref_locked; } if (ip4s > 1) { for (ii = ij = 1; ii < ip4s; ii++) { if (ip4[ii].s_addr == ppr->pr_ip4[0].s_addr) continue; for (; ij < ppr->pr_ip4s; ij++) if (ip4[ii].s_addr == ppr->pr_ip4[ij].s_addr) break; if (ij == ppr->pr_ip4s) break; } if (ij == ppr->pr_ip4s) { error = EPERM; goto done_deref_locked; } } } /* * Check for conflicting IP addresses. We permit them * if there is no more than one IP on each jail. If * there is a duplicate on a jail with more than one * IP stop checking and return error. 
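 * The search below starts at the closest ancestor that has its own
 * vnet (or at prison0 when there is none), so jails with distinct
 * network stacks never clash here.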
*/ - tppr = ppr; #ifdef VIMAGE - for (; tppr != &prison0; tppr = tppr->pr_parent) + for (tppr = ppr; tppr != &prison0; tppr = tppr->pr_parent) if (tppr->pr_flags & PR_VNET) break; +#else + tppr = &prison0; #endif FOREACH_PRISON_DESCENDANT(tppr, tpr, descend) { if (tpr == pr || #ifdef VIMAGE (tpr != tppr && (tpr->pr_flags & PR_VNET)) || #endif tpr->pr_uref == 0) { descend = 0; continue; } if (!(tpr->pr_flags & PR_IP4_USER)) continue; descend = 0; if (tpr->pr_ip4 == NULL || (ip4s == 1 && tpr->pr_ip4s == 1)) continue; for (ii = 0; ii < ip4s; ii++) { if (prison_check_ip4_locked(tpr, &ip4[ii]) == 0) { error = EADDRINUSE; vfs_opterror(opts, "IPv4 addresses clash"); goto done_deref_locked; } } } } #endif #ifdef INET6 if (ip6s > 0) { if (ppr->pr_flags & PR_IP6) { /* * Make sure the new set of IP addresses is a * subset of the parent's list. */ for (ij = 0; ij < ppr->pr_ip6s; ij++) if (IN6_ARE_ADDR_EQUAL(&ip6[0], &ppr->pr_ip6[ij])) break; if (ij == ppr->pr_ip6s) { error = EPERM; goto done_deref_locked; } if (ip6s > 1) { for (ii = ij = 1; ii < ip6s; ii++) { if (IN6_ARE_ADDR_EQUAL(&ip6[ii], &ppr->pr_ip6[0])) continue; for (; ij < ppr->pr_ip6s; ij++) if (IN6_ARE_ADDR_EQUAL( &ip6[ii], &ppr->pr_ip6[ij])) break; if (ij == ppr->pr_ip6s) break; } if (ij == ppr->pr_ip6s) { error = EPERM; goto done_deref_locked; } } } /* Check for conflicting IP addresses. */ - tppr = ppr; #ifdef VIMAGE - for (; tppr != &prison0; tppr = tppr->pr_parent) + for (tppr = ppr; tppr != &prison0; tppr = tppr->pr_parent) if (tppr->pr_flags & PR_VNET) break; +#else + tppr = &prison0; #endif FOREACH_PRISON_DESCENDANT(tppr, tpr, descend) { if (tpr == pr || #ifdef VIMAGE (tpr != tppr && (tpr->pr_flags & PR_VNET)) || #endif tpr->pr_uref == 0) { descend = 0; continue; } if (!(tpr->pr_flags & PR_IP6_USER)) continue; descend = 0; if (tpr->pr_ip6 == NULL || (ip6s == 1 && tpr->pr_ip6s == 1)) continue; for (ii = 0; ii < ip6s; ii++) { if (prison_check_ip6_locked(tpr, &ip6[ii]) == 0) { error = EADDRINUSE; vfs_opterror(opts, "IPv6 addresses clash"); goto done_deref_locked; } } } } #endif onamelen = namelen = 0; if (namelc != NULL) { /* Give a default name of the jid. Also allow the name to be * explicitly the jid - but not any other number, and only in * normal form (no leading zero/etc). */ if (namelc[0] == '\0') snprintf(namelc = numbuf, sizeof(numbuf), "%d", jid); else if ((strtoul(namelc, &p, 10) != jid || namelc[0] < '1' || namelc[0] > '9') && *p == '\0') { error = EINVAL; vfs_opterror(opts, "name cannot be numeric (unless it is the jid)"); goto done_deref_locked; } /* * Make sure the name isn't too long for the prison or its * children. */ pnamelen = (ppr == &prison0) ? 0 : strlen(ppr->pr_name) + 1; onamelen = strlen(pr->pr_name + pnamelen); namelen = strlen(namelc); if (pnamelen + namelen + 1 > sizeof(pr->pr_name)) { error = ENAMETOOLONG; goto done_deref_locked; } FOREACH_PRISON_DESCENDANT(pr, tpr, descend) { if (strlen(tpr->pr_name) + (namelen - onamelen) >= sizeof(pr->pr_name)) { error = ENAMETOOLONG; goto done_deref_locked; } } } if (pr_allow & ~ppr->pr_allow) { error = EPERM; goto done_deref_locked; } /* * Let modules check their parameters. This requires unlocking and * then re-locking the prison, but this is still a valid state as long * as allprison_lock remains xlocked. */ mtx_unlock(&pr->pr_mtx); error = osd_jail_call(pr, PR_METHOD_CHECK, opts); if (error != 0) { prison_deref(pr, created ? 
PD_LIST_XLOCKED : PD_DEREF | PD_LIST_XLOCKED); goto done_releroot; } mtx_lock(&pr->pr_mtx); /* At this point, all valid parameters should have been noted. */ TAILQ_FOREACH(opt, opts, link) { if (!opt->seen && strcmp(opt->name, "errmsg")) { error = EINVAL; vfs_opterror(opts, "unknown parameter: %s", opt->name); goto done_deref_locked; } } /* Set the parameters of the prison. */ #ifdef INET redo_ip4 = 0; if (pr_flags & PR_IP4_USER) { pr->pr_flags |= PR_IP4; free(pr->pr_ip4, M_PRISON); pr->pr_ip4s = ip4s; pr->pr_ip4 = ip4; ip4 = NULL; FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { #ifdef VIMAGE if (tpr->pr_flags & PR_VNET) { descend = 0; continue; } #endif if (prison_restrict_ip4(tpr, NULL)) { redo_ip4 = 1; descend = 0; } } } #endif #ifdef INET6 redo_ip6 = 0; if (pr_flags & PR_IP6_USER) { pr->pr_flags |= PR_IP6; free(pr->pr_ip6, M_PRISON); pr->pr_ip6s = ip6s; pr->pr_ip6 = ip6; ip6 = NULL; FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { #ifdef VIMAGE if (tpr->pr_flags & PR_VNET) { descend = 0; continue; } #endif if (prison_restrict_ip6(tpr, NULL)) { redo_ip6 = 1; descend = 0; } } } #endif if (gotslevel) { pr->pr_securelevel = slevel; /* Set all child jails to be at least this level. */ FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) if (tpr->pr_securelevel < slevel) tpr->pr_securelevel = slevel; } if (gotchildmax) { pr->pr_childmax = childmax; /* Set all child jails to under this limit. */ FOREACH_PRISON_DESCENDANT_LOCKED_LEVEL(pr, tpr, descend, level) if (tpr->pr_childmax > childmax - level) tpr->pr_childmax = childmax > level ? childmax - level : 0; } if (gotenforce) { pr->pr_enforce_statfs = enforce; /* Pass this restriction on to the children. */ FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) if (tpr->pr_enforce_statfs < enforce) tpr->pr_enforce_statfs = enforce; } if (gotrsnum) { pr->pr_devfs_rsnum = rsnum; /* Pass this restriction on to the children. */ FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) tpr->pr_devfs_rsnum = rsnum; } if (namelc != NULL) { if (ppr == &prison0) strlcpy(pr->pr_name, namelc, sizeof(pr->pr_name)); else snprintf(pr->pr_name, sizeof(pr->pr_name), "%s.%s", ppr->pr_name, namelc); /* Change this component of child names. */ FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { bcopy(tpr->pr_name + onamelen, tpr->pr_name + namelen, strlen(tpr->pr_name + onamelen) + 1); bcopy(pr->pr_name, tpr->pr_name, namelen); } } if (path != NULL) { /* Try to keep a real-rooted full pathname. */ if (fullpath_disabled && path[0] == '/' && strcmp(mypr->pr_path, "/")) snprintf(pr->pr_path, sizeof(pr->pr_path), "%s%s", mypr->pr_path, path); else strlcpy(pr->pr_path, path, sizeof(pr->pr_path)); pr->pr_root = root; } if (PR_HOST & ch_flags & ~pr_flags) { if (pr->pr_flags & PR_HOST) { /* * Copy the parent's host info. As with pr_ip4 above, * the lack of a lock on the parent is not a problem; * it is always set with allprison_lock at least * shared, and is held exclusively here. */ strlcpy(pr->pr_hostname, pr->pr_parent->pr_hostname, sizeof(pr->pr_hostname)); strlcpy(pr->pr_domainname, pr->pr_parent->pr_domainname, sizeof(pr->pr_domainname)); strlcpy(pr->pr_hostuuid, pr->pr_parent->pr_hostuuid, sizeof(pr->pr_hostuuid)); pr->pr_hostid = pr->pr_parent->pr_hostid; } } else if (host != NULL || domain != NULL || uuid != NULL || gothid) { /* Set this prison, and any descendants without PR_HOST. 
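 * A descendant that manages its own host info (PR_HOST set) is skipped
 * along with its entire subtree.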
*/ if (host != NULL) strlcpy(pr->pr_hostname, host, sizeof(pr->pr_hostname)); if (domain != NULL) strlcpy(pr->pr_domainname, domain, sizeof(pr->pr_domainname)); if (uuid != NULL) strlcpy(pr->pr_hostuuid, uuid, sizeof(pr->pr_hostuuid)); if (gothid) pr->pr_hostid = hid; FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { if (tpr->pr_flags & PR_HOST) descend = 0; else { if (host != NULL) strlcpy(tpr->pr_hostname, pr->pr_hostname, sizeof(tpr->pr_hostname)); if (domain != NULL) strlcpy(tpr->pr_domainname, pr->pr_domainname, sizeof(tpr->pr_domainname)); if (uuid != NULL) strlcpy(tpr->pr_hostuuid, pr->pr_hostuuid, sizeof(tpr->pr_hostuuid)); if (gothid) tpr->pr_hostid = hid; } } } if ((tallow = ch_allow & ~pr_allow)) { /* Clear allow bits in all children. */ FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) tpr->pr_allow &= ~tallow; } pr->pr_allow = (pr->pr_allow & ~ch_allow) | pr_allow; /* * Persistent prisons get an extra reference, and prisons losing their * persist flag lose that reference. Only do this for existing prisons * for now, so new ones will remain unseen until after the module * handlers have completed. */ born = pr->pr_uref == 0; if (!created && (ch_flags & PR_PERSIST & (pr_flags ^ pr->pr_flags))) { if (pr_flags & PR_PERSIST) { pr->pr_ref++; pr->pr_uref++; } else { pr->pr_ref--; pr->pr_uref--; } } pr->pr_flags = (pr->pr_flags & ~ch_flags) | pr_flags; mtx_unlock(&pr->pr_mtx); #ifdef RACCT if (racct_enable && created) prison_racct_attach(pr); #endif /* Locks may have prevented a complete restriction of child IP * addresses. If so, allocate some more memory and try again. */ #ifdef INET while (redo_ip4) { ip4s = pr->pr_ip4s; ip4 = malloc(ip4s * sizeof(*ip4), M_PRISON, M_WAITOK); mtx_lock(&pr->pr_mtx); redo_ip4 = 0; FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { #ifdef VIMAGE if (tpr->pr_flags & PR_VNET) { descend = 0; continue; } #endif if (prison_restrict_ip4(tpr, ip4)) { if (ip4 != NULL) ip4 = NULL; else redo_ip4 = 1; } } mtx_unlock(&pr->pr_mtx); } #endif #ifdef INET6 while (redo_ip6) { ip6s = pr->pr_ip6s; ip6 = malloc(ip6s * sizeof(*ip6), M_PRISON, M_WAITOK); mtx_lock(&pr->pr_mtx); redo_ip6 = 0; FOREACH_PRISON_DESCENDANT_LOCKED(pr, tpr, descend) { #ifdef VIMAGE if (tpr->pr_flags & PR_VNET) { descend = 0; continue; } #endif if (prison_restrict_ip6(tpr, ip6)) { if (ip6 != NULL) ip6 = NULL; else redo_ip6 = 1; } } mtx_unlock(&pr->pr_mtx); } #endif /* Let the modules do their work. */ sx_downgrade(&allprison_lock); if (born) { error = osd_jail_call(pr, PR_METHOD_CREATE, opts); if (error) { (void)osd_jail_call(pr, PR_METHOD_REMOVE, NULL); prison_deref(pr, created ? PD_LIST_SLOCKED : PD_DEREF | PD_LIST_SLOCKED); goto done_errmsg; } } error = osd_jail_call(pr, PR_METHOD_SET, opts); if (error) { if (born) (void)osd_jail_call(pr, PR_METHOD_REMOVE, NULL); prison_deref(pr, created ? PD_LIST_SLOCKED : PD_DEREF | PD_LIST_SLOCKED); goto done_errmsg; } /* Attach this process to the prison if requested. */ if (flags & JAIL_ATTACH) { mtx_lock(&pr->pr_mtx); error = do_jail_attach(td, pr); if (error) { vfs_opterror(opts, "attach failed"); if (!created) prison_deref(pr, PD_DEREF); goto done_errmsg; } } #ifdef RACCT if (racct_enable && !created) { if (!(flags & JAIL_ATTACH)) sx_sunlock(&allprison_lock); prison_racct_modify(pr); if (!(flags & JAIL_ATTACH)) sx_slock(&allprison_lock); } #endif td->td_retval[0] = pr->pr_id; /* * Now that it is all there, drop the temporary reference from existing * prisons. 
Or add a reference to newly created persistent prisons * (which was not done earlier so that the prison would not be publicly * visible). */ if (!created) { prison_deref(pr, (flags & JAIL_ATTACH) ? PD_DEREF : PD_DEREF | PD_LIST_SLOCKED); } else { if (pr_flags & PR_PERSIST) { mtx_lock(&pr->pr_mtx); pr->pr_ref++; pr->pr_uref++; mtx_unlock(&pr->pr_mtx); } if (!(flags & JAIL_ATTACH)) sx_sunlock(&allprison_lock); } goto done_free; done_deref_locked: prison_deref(pr, created ? PD_LOCKED | PD_LIST_XLOCKED : PD_DEREF | PD_LOCKED | PD_LIST_XLOCKED); goto done_releroot; done_unlock_list: sx_xunlock(&allprison_lock); done_releroot: if (root != NULL) vrele(root); done_errmsg: if (error) { if (vfs_getopt(opts, "errmsg", (void **)&errmsg, &errmsg_len) == 0 && errmsg_len > 0) { errmsg_pos = 2 * vfs_getopt_pos(opts, "errmsg") + 1; if (optuio->uio_segflg == UIO_SYSSPACE) bcopy(errmsg, optuio->uio_iov[errmsg_pos].iov_base, errmsg_len); else copyout(errmsg, optuio->uio_iov[errmsg_pos].iov_base, errmsg_len); } } done_free: #ifdef INET free(ip4, M_PRISON); #endif #ifdef INET6 free(ip6, M_PRISON); #endif if (g_path != NULL) free(g_path, M_TEMP); vfs_freeopts(opts); return (error); } /* * struct jail_get_args { * struct iovec *iovp; * unsigned int iovcnt; * int flags; * }; */ int sys_jail_get(struct thread *td, struct jail_get_args *uap) { struct uio *auio; int error; /* Check that we have an even number of iovecs. */ if (uap->iovcnt & 1) return (EINVAL); error = copyinuio(uap->iovp, uap->iovcnt, &auio); if (error) return (error); error = kern_jail_get(td, auio, uap->flags); if (error == 0) error = copyout(auio->uio_iov, uap->iovp, uap->iovcnt * sizeof (struct iovec)); free(auio, M_IOV); return (error); } int kern_jail_get(struct thread *td, struct uio *optuio, int flags) { struct bool_flags *bf; struct jailsys_flags *jsf; struct prison *pr, *mypr; struct vfsopt *opt; struct vfsoptlist *opts; char *errmsg, *name; int error, errmsg_len, errmsg_pos, i, jid, len, locked, pos; unsigned f; if (flags & ~JAIL_GET_MASK) return (EINVAL); /* Get the parameter list. */ error = vfs_buildopts(optuio, &opts); if (error) return (error); errmsg_pos = vfs_getopt_pos(opts, "errmsg"); mypr = td->td_ucred->cr_prison; /* * Find the prison specified by one of: lastjid, jid, name. 
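 * "lastjid" iterates over the jails visible to the caller (the first
 * jail with an ID greater than the given one), while "jid" and "name"
 * look up one specific jail.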
*/ sx_slock(&allprison_lock); error = vfs_copyopt(opts, "lastjid", &jid, sizeof(jid)); if (error == 0) { TAILQ_FOREACH(pr, &allprison, pr_list) { if (pr->pr_id > jid && prison_ischild(mypr, pr)) { mtx_lock(&pr->pr_mtx); if (pr->pr_ref > 0 && (pr->pr_uref > 0 || (flags & JAIL_DYING))) break; mtx_unlock(&pr->pr_mtx); } } if (pr != NULL) goto found_prison; error = ENOENT; vfs_opterror(opts, "no jail after %d", jid); goto done_unlock_list; } else if (error != ENOENT) goto done_unlock_list; error = vfs_copyopt(opts, "jid", &jid, sizeof(jid)); if (error == 0) { if (jid != 0) { pr = prison_find_child(mypr, jid); if (pr != NULL) { if (pr->pr_uref == 0 && !(flags & JAIL_DYING)) { mtx_unlock(&pr->pr_mtx); error = ENOENT; vfs_opterror(opts, "jail %d is dying", jid); goto done_unlock_list; } goto found_prison; } error = ENOENT; vfs_opterror(opts, "jail %d not found", jid); goto done_unlock_list; } } else if (error != ENOENT) goto done_unlock_list; error = vfs_getopt(opts, "name", (void **)&name, &len); if (error == 0) { if (len == 0 || name[len - 1] != '\0') { error = EINVAL; goto done_unlock_list; } pr = prison_find_name(mypr, name); if (pr != NULL) { if (pr->pr_uref == 0 && !(flags & JAIL_DYING)) { mtx_unlock(&pr->pr_mtx); error = ENOENT; vfs_opterror(opts, "jail \"%s\" is dying", name); goto done_unlock_list; } goto found_prison; } error = ENOENT; vfs_opterror(opts, "jail \"%s\" not found", name); goto done_unlock_list; } else if (error != ENOENT) goto done_unlock_list; vfs_opterror(opts, "no jail specified"); error = ENOENT; goto done_unlock_list; found_prison: /* Get the parameters of the prison. */ pr->pr_ref++; locked = PD_LOCKED; td->td_retval[0] = pr->pr_id; error = vfs_setopt(opts, "jid", &pr->pr_id, sizeof(pr->pr_id)); if (error != 0 && error != ENOENT) goto done_deref; i = (pr->pr_parent == mypr) ? 
0 : pr->pr_parent->pr_id; error = vfs_setopt(opts, "parent", &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "name", prison_name(mypr, pr)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "cpuset.id", &pr->pr_cpuset->cs_id, sizeof(pr->pr_cpuset->cs_id)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "path", prison_path(mypr, pr)); if (error != 0 && error != ENOENT) goto done_deref; #ifdef INET error = vfs_setopt_part(opts, "ip4.addr", pr->pr_ip4, pr->pr_ip4s * sizeof(*pr->pr_ip4)); if (error != 0 && error != ENOENT) goto done_deref; #endif #ifdef INET6 error = vfs_setopt_part(opts, "ip6.addr", pr->pr_ip6, pr->pr_ip6s * sizeof(*pr->pr_ip6)); if (error != 0 && error != ENOENT) goto done_deref; #endif error = vfs_setopt(opts, "securelevel", &pr->pr_securelevel, sizeof(pr->pr_securelevel)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "children.cur", &pr->pr_childcount, sizeof(pr->pr_childcount)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "children.max", &pr->pr_childmax, sizeof(pr->pr_childmax)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "host.hostname", pr->pr_hostname); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "host.domainname", pr->pr_domainname); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "host.hostuuid", pr->pr_hostuuid); if (error != 0 && error != ENOENT) goto done_deref; #ifdef COMPAT_FREEBSD32 if (SV_PROC_FLAG(td->td_proc, SV_ILP32)) { uint32_t hid32 = pr->pr_hostid; error = vfs_setopt(opts, "host.hostid", &hid32, sizeof(hid32)); } else #endif error = vfs_setopt(opts, "host.hostid", &pr->pr_hostid, sizeof(pr->pr_hostid)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "enforce_statfs", &pr->pr_enforce_statfs, sizeof(pr->pr_enforce_statfs)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "devfs_ruleset", &pr->pr_devfs_rsnum, sizeof(pr->pr_devfs_rsnum)); if (error != 0 && error != ENOENT) goto done_deref; for (bf = pr_flag_bool; bf < pr_flag_bool + nitems(pr_flag_bool); bf++) { i = (pr->pr_flags & bf->flag) ? 1 : 0; error = vfs_setopt(opts, bf->name, &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; i = !i; error = vfs_setopt(opts, bf->noname, &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; } for (jsf = pr_flag_jailsys; jsf < pr_flag_jailsys + nitems(pr_flag_jailsys); jsf++) { f = pr->pr_flags & (jsf->disable | jsf->new); i = (f != 0 && f == jsf->disable) ? JAIL_SYS_DISABLE : (f == jsf->new) ? JAIL_SYS_NEW : JAIL_SYS_INHERIT; error = vfs_setopt(opts, jsf->name, &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; } for (bf = pr_flag_allow; bf < pr_flag_allow + nitems(pr_flag_allow) && bf->flag != 0; bf++) { i = (pr->pr_allow & bf->flag) ? 
1 : 0; error = vfs_setopt(opts, bf->name, &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; i = !i; error = vfs_setopt(opts, bf->noname, &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; } i = (pr->pr_uref == 0); error = vfs_setopt(opts, "dying", &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; i = !i; error = vfs_setopt(opts, "nodying", &i, sizeof(i)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopt(opts, "osreldate", &pr->pr_osreldate, sizeof(pr->pr_osreldate)); if (error != 0 && error != ENOENT) goto done_deref; error = vfs_setopts(opts, "osrelease", pr->pr_osrelease); if (error != 0 && error != ENOENT) goto done_deref; /* Get the module parameters. */ mtx_unlock(&pr->pr_mtx); locked = 0; error = osd_jail_call(pr, PR_METHOD_GET, opts); if (error) goto done_deref; prison_deref(pr, PD_DEREF | PD_LIST_SLOCKED); /* By now, all parameters should have been noted. */ TAILQ_FOREACH(opt, opts, link) { if (!opt->seen && strcmp(opt->name, "errmsg")) { error = EINVAL; vfs_opterror(opts, "unknown parameter: %s", opt->name); goto done_errmsg; } } /* Write the fetched parameters back to userspace. */ error = 0; TAILQ_FOREACH(opt, opts, link) { if (opt->pos >= 0 && opt->pos != errmsg_pos) { pos = 2 * opt->pos + 1; optuio->uio_iov[pos].iov_len = opt->len; if (opt->value != NULL) { if (optuio->uio_segflg == UIO_SYSSPACE) { bcopy(opt->value, optuio->uio_iov[pos].iov_base, opt->len); } else { error = copyout(opt->value, optuio->uio_iov[pos].iov_base, opt->len); if (error) break; } } } } goto done_errmsg; done_deref: prison_deref(pr, locked | PD_DEREF | PD_LIST_SLOCKED); goto done_errmsg; done_unlock_list: sx_sunlock(&allprison_lock); done_errmsg: if (error && errmsg_pos >= 0) { vfs_getopt(opts, "errmsg", (void **)&errmsg, &errmsg_len); errmsg_pos = 2 * errmsg_pos + 1; if (errmsg_len > 0) { if (optuio->uio_segflg == UIO_SYSSPACE) bcopy(errmsg, optuio->uio_iov[errmsg_pos].iov_base, errmsg_len); else copyout(errmsg, optuio->uio_iov[errmsg_pos].iov_base, errmsg_len); } } vfs_freeopts(opts); return (error); } /* * struct jail_remove_args { * int jid; * }; */ int sys_jail_remove(struct thread *td, struct jail_remove_args *uap) { struct prison *pr, *cpr, *lpr, *tpr; int descend, error; error = priv_check(td, PRIV_JAIL_REMOVE); if (error) return (error); sx_xlock(&allprison_lock); pr = prison_find_child(td->td_ucred->cr_prison, uap->jid); if (pr == NULL) { sx_xunlock(&allprison_lock); return (EINVAL); } /* Remove all descendants of this prison, then remove this prison. */ pr->pr_ref++; if (!LIST_EMPTY(&pr->pr_children)) { mtx_unlock(&pr->pr_mtx); lpr = NULL; FOREACH_PRISON_DESCENDANT(pr, cpr, descend) { mtx_lock(&cpr->pr_mtx); if (cpr->pr_ref > 0) { tpr = cpr; cpr->pr_ref++; } else { /* Already removed - do not do it again. */ tpr = NULL; } mtx_unlock(&cpr->pr_mtx); if (lpr != NULL) { mtx_lock(&lpr->pr_mtx); prison_remove_one(lpr); sx_xlock(&allprison_lock); } lpr = tpr; } if (lpr != NULL) { mtx_lock(&lpr->pr_mtx); prison_remove_one(lpr); sx_xlock(&allprison_lock); } mtx_lock(&pr->pr_mtx); } prison_remove_one(pr); return (0); } static void prison_remove_one(struct prison *pr) { struct proc *p; int deuref; /* If the prison was persistent, it is not anymore. */ deuref = 0; if (pr->pr_flags & PR_PERSIST) { pr->pr_ref--; deuref = PD_DEUREF; pr->pr_flags &= ~PR_PERSIST; } /* * jail_remove added a reference. If that's the only one, remove * the prison now. 
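 * Otherwise the locks are dropped and every process still attached to
 * the prison is sent SIGKILL before that reference is released.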
*/ KASSERT(pr->pr_ref > 0, ("prison_remove_one removing a dead prison (jid=%d)", pr->pr_id)); if (pr->pr_ref == 1) { prison_deref(pr, deuref | PD_DEREF | PD_LOCKED | PD_LIST_XLOCKED); return; } mtx_unlock(&pr->pr_mtx); sx_xunlock(&allprison_lock); /* * Kill all processes unfortunate enough to be attached to this prison. */ sx_slock(&allproc_lock); FOREACH_PROC_IN_SYSTEM(p) { PROC_LOCK(p); if (p->p_state != PRS_NEW && p->p_ucred && p->p_ucred->cr_prison == pr) kern_psignal(p, SIGKILL); PROC_UNLOCK(p); } sx_sunlock(&allproc_lock); /* Remove the temporary reference added by jail_remove. */ prison_deref(pr, deuref | PD_DEREF); } /* * struct jail_attach_args { * int jid; * }; */ int sys_jail_attach(struct thread *td, struct jail_attach_args *uap) { struct prison *pr; int error; error = priv_check(td, PRIV_JAIL_ATTACH); if (error) return (error); /* * Start with exclusive hold on allprison_lock to ensure that a possible * PR_METHOD_REMOVE call isn't concurrent with jail_set or jail_remove. * But then immediately downgrade it since we don't need to stop * readers. */ sx_xlock(&allprison_lock); sx_downgrade(&allprison_lock); pr = prison_find_child(td->td_ucred->cr_prison, uap->jid); if (pr == NULL) { sx_sunlock(&allprison_lock); return (EINVAL); } /* * Do not allow a process to attach to a prison that is not * considered to be "alive". */ if (pr->pr_uref == 0) { mtx_unlock(&pr->pr_mtx); sx_sunlock(&allprison_lock); return (EINVAL); } return (do_jail_attach(td, pr)); } static int do_jail_attach(struct thread *td, struct prison *pr) { struct proc *p; struct ucred *newcred, *oldcred; int error; /* * XXX: Note that there is a slight race here if two threads * in the same privileged process attempt to attach to two * different jails at the same time. It is important for * user processes not to do this, or they might end up with * a process root from one prison, but attached to the jail * of another. */ pr->pr_ref++; pr->pr_uref++; mtx_unlock(&pr->pr_mtx); /* Let modules do whatever they need to prepare for attaching. */ error = osd_jail_call(pr, PR_METHOD_ATTACH, td); if (error) { prison_deref(pr, PD_DEREF | PD_DEUREF | PD_LIST_SLOCKED); return (error); } sx_sunlock(&allprison_lock); /* * Reparent the newly attached process to this jail. */ p = td->td_proc; error = cpuset_setproc_update_set(p, pr->pr_cpuset); if (error) goto e_revert_osd; vn_lock(pr->pr_root, LK_EXCLUSIVE | LK_RETRY); if ((error = change_dir(pr->pr_root, td)) != 0) goto e_unlock; #ifdef MAC if ((error = mac_vnode_check_chroot(td->td_ucred, pr->pr_root))) goto e_unlock; #endif VOP_UNLOCK(pr->pr_root, 0); if ((error = pwd_chroot(td, pr->pr_root))) goto e_revert_osd; newcred = crget(); PROC_LOCK(p); oldcred = crcopysafe(p, newcred); newcred->cr_prison = pr; proc_set_cred(p, newcred); setsugid(p); #ifdef RACCT racct_proc_ucred_changed(p, oldcred, newcred); crhold(newcred); #endif PROC_UNLOCK(p); #ifdef RCTL rctl_proc_ucred_changed(p, newcred); crfree(newcred); #endif prison_deref(oldcred->cr_prison, PD_DEREF | PD_DEUREF); crfree(oldcred); return (0); e_unlock: VOP_UNLOCK(pr->pr_root, 0); e_revert_osd: /* Tell modules this thread is still in its old jail after all. */ (void)osd_jail_call(td->td_ucred->cr_prison, PR_METHOD_ATTACH, td); prison_deref(pr, PD_DEREF | PD_DEUREF); return (error); } /* * Returns a locked prison instance, or NULL on failure. 
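 * The allprison_lock must be held by the caller; the returned prison's
 * mutex is left locked for the caller to release.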
*/ struct prison * prison_find(int prid) { struct prison *pr; sx_assert(&allprison_lock, SX_LOCKED); TAILQ_FOREACH(pr, &allprison, pr_list) { if (pr->pr_id == prid) { mtx_lock(&pr->pr_mtx); if (pr->pr_ref > 0) return (pr); mtx_unlock(&pr->pr_mtx); } } return (NULL); } /* * Find a prison that is a descendant of mypr. Returns a locked prison or NULL. */ struct prison * prison_find_child(struct prison *mypr, int prid) { struct prison *pr; int descend; sx_assert(&allprison_lock, SX_LOCKED); FOREACH_PRISON_DESCENDANT(mypr, pr, descend) { if (pr->pr_id == prid) { mtx_lock(&pr->pr_mtx); if (pr->pr_ref > 0) return (pr); mtx_unlock(&pr->pr_mtx); } } return (NULL); } /* * Look for the name relative to mypr. Returns a locked prison or NULL. */ struct prison * prison_find_name(struct prison *mypr, const char *name) { struct prison *pr, *deadpr; size_t mylen; int descend; sx_assert(&allprison_lock, SX_LOCKED); mylen = (mypr == &prison0) ? 0 : strlen(mypr->pr_name) + 1; again: deadpr = NULL; FOREACH_PRISON_DESCENDANT(mypr, pr, descend) { if (!strcmp(pr->pr_name + mylen, name)) { mtx_lock(&pr->pr_mtx); if (pr->pr_ref > 0) { if (pr->pr_uref > 0) return (pr); deadpr = pr; } mtx_unlock(&pr->pr_mtx); } } /* There was no valid prison - perhaps there was a dying one. */ if (deadpr != NULL) { mtx_lock(&deadpr->pr_mtx); if (deadpr->pr_ref == 0) { mtx_unlock(&deadpr->pr_mtx); goto again; } } return (deadpr); } /* * See if a prison has the specific flag set. */ int prison_flag(struct ucred *cred, unsigned flag) { /* This is an atomic read, so no locking is necessary. */ return (cred->cr_prison->pr_flags & flag); } int prison_allow(struct ucred *cred, unsigned flag) { /* This is an atomic read, so no locking is necessary. */ return (cred->cr_prison->pr_allow & flag); } /* * Remove a prison reference. If that was the last reference, remove the * prison itself - but not in this context in case there are locks held. */ void prison_free_locked(struct prison *pr) { int ref; mtx_assert(&pr->pr_mtx, MA_OWNED); ref = --pr->pr_ref; mtx_unlock(&pr->pr_mtx); if (ref == 0) taskqueue_enqueue(taskqueue_thread, &pr->pr_task); } void prison_free(struct prison *pr) { mtx_lock(&pr->pr_mtx); prison_free_locked(pr); } /* * Complete a call to either prison_free or prison_proc_free. */ static void prison_complete(void *context, int pending) { struct prison *pr = context; sx_xlock(&allprison_lock); mtx_lock(&pr->pr_mtx); prison_deref(pr, pr->pr_uref ? PD_DEREF | PD_DEUREF | PD_LOCKED | PD_LIST_XLOCKED : PD_LOCKED | PD_LIST_XLOCKED); } /* * Remove a prison reference (usually). This internal version assumes no * mutexes are held, except perhaps the prison itself. If there are no more * references, release and delist the prison. On completion, the prison lock * and the allprison lock are both unlocked. */ static void prison_deref(struct prison *pr, int flags) { struct prison *ppr, *tpr; int ref, lasturef; if (!(flags & PD_LOCKED)) mtx_lock(&pr->pr_mtx); for (;;) { if (flags & PD_DEUREF) { KASSERT(pr->pr_uref > 0, ("prison_deref PD_DEUREF on a dead prison (jid=%d)", pr->pr_id)); pr->pr_uref--; lasturef = pr->pr_uref == 0; if (lasturef) pr->pr_ref++; KASSERT(prison0.pr_uref != 0, ("prison0 pr_uref=0")); } else lasturef = 0; if (flags & PD_DEREF) { KASSERT(pr->pr_ref > 0, ("prison_deref PD_DEREF on a dead prison (jid=%d)", pr->pr_id)); pr->pr_ref--; } ref = pr->pr_ref; mtx_unlock(&pr->pr_mtx); /* * Tell the modules if the last user reference was removed * (even it sticks around in dying state). 
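 * The reference bumped above keeps the prison from going away while
 * PR_METHOD_REMOVE runs; allprison_lock is acquired here if the caller
 * did not already hold it.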
*/ if (lasturef) { if (!(flags & (PD_LIST_SLOCKED | PD_LIST_XLOCKED))) { sx_xlock(&allprison_lock); flags |= PD_LIST_XLOCKED; } (void)osd_jail_call(pr, PR_METHOD_REMOVE, NULL); mtx_lock(&pr->pr_mtx); ref = --pr->pr_ref; mtx_unlock(&pr->pr_mtx); } /* If the prison still has references, nothing else to do. */ if (ref > 0) { if (flags & PD_LIST_SLOCKED) sx_sunlock(&allprison_lock); else if (flags & PD_LIST_XLOCKED) sx_xunlock(&allprison_lock); return; } if (flags & PD_LIST_SLOCKED) { if (!sx_try_upgrade(&allprison_lock)) { sx_sunlock(&allprison_lock); sx_xlock(&allprison_lock); } } else if (!(flags & PD_LIST_XLOCKED)) sx_xlock(&allprison_lock); TAILQ_REMOVE(&allprison, pr, pr_list); LIST_REMOVE(pr, pr_sibling); ppr = pr->pr_parent; for (tpr = ppr; tpr != NULL; tpr = tpr->pr_parent) tpr->pr_childcount--; sx_xunlock(&allprison_lock); #ifdef VIMAGE if (pr->pr_vnet != ppr->pr_vnet) vnet_destroy(pr->pr_vnet); #endif if (pr->pr_root != NULL) vrele(pr->pr_root); mtx_destroy(&pr->pr_mtx); #ifdef INET free(pr->pr_ip4, M_PRISON); #endif #ifdef INET6 free(pr->pr_ip6, M_PRISON); #endif if (pr->pr_cpuset != NULL) cpuset_rel(pr->pr_cpuset); osd_jail_exit(pr); #ifdef RACCT if (racct_enable) prison_racct_detach(pr); #endif free(pr, M_PRISON); /* Removing a prison frees a reference on its parent. */ pr = ppr; mtx_lock(&pr->pr_mtx); flags = PD_DEREF | PD_DEUREF; } } void prison_hold_locked(struct prison *pr) { mtx_assert(&pr->pr_mtx, MA_OWNED); KASSERT(pr->pr_ref > 0, ("Trying to hold dead prison %p (jid=%d).", pr, pr->pr_id)); pr->pr_ref++; } void prison_hold(struct prison *pr) { mtx_lock(&pr->pr_mtx); prison_hold_locked(pr); mtx_unlock(&pr->pr_mtx); } void prison_proc_hold(struct prison *pr) { mtx_lock(&pr->pr_mtx); KASSERT(pr->pr_uref > 0, ("Cannot add a process to a non-alive prison (jid=%d)", pr->pr_id)); pr->pr_uref++; mtx_unlock(&pr->pr_mtx); } void prison_proc_free(struct prison *pr) { mtx_lock(&pr->pr_mtx); KASSERT(pr->pr_uref > 0, ("Trying to kill a process in a dead prison (jid=%d)", pr->pr_id)); if (pr->pr_uref > 1) pr->pr_uref--; else { /* * Don't remove the last user reference in this context, which * is expected to be a process that is not only locked, but * also half dead. */ pr->pr_ref++; mtx_unlock(&pr->pr_mtx); taskqueue_enqueue(taskqueue_thread, &pr->pr_task); return; } mtx_unlock(&pr->pr_mtx); } /* * Check if a jail supports the given address family. * * Returns 0 if not jailed or the address family is supported, EAFNOSUPPORT * if not. */ int prison_check_af(struct ucred *cred, int af) { struct prison *pr; int error; KASSERT(cred != NULL, ("%s: cred is NULL", __func__)); pr = cred->cr_prison; #ifdef VIMAGE /* Prisons with their own network stack are not limited. */ if (prison_owns_vnet(cred)) return (0); #endif error = 0; switch (af) { #ifdef INET case AF_INET: if (pr->pr_flags & PR_IP4) { mtx_lock(&pr->pr_mtx); if ((pr->pr_flags & PR_IP4) && pr->pr_ip4 == NULL) error = EAFNOSUPPORT; mtx_unlock(&pr->pr_mtx); } break; #endif #ifdef INET6 case AF_INET6: if (pr->pr_flags & PR_IP6) { mtx_lock(&pr->pr_mtx); if ((pr->pr_flags & PR_IP6) && pr->pr_ip6 == NULL) error = EAFNOSUPPORT; mtx_unlock(&pr->pr_mtx); } break; #endif case AF_LOCAL: case AF_ROUTE: break; default: if (!(pr->pr_allow & PR_ALLOW_SOCKET_AF)) error = EAFNOSUPPORT; } return (error); } /* * Check if given address belongs to the jail referenced by cred (wrapper to * prison_check_ip[46]). 
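 * AF_INET and AF_INET6 addresses are checked against the prison's
 * address lists; any other family is only subject to the
 * PR_ALLOW_SOCKET_AF check.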
* * Returns 0 if jail doesn't restrict the address family or if address belongs * to jail, EADDRNOTAVAIL if the address doesn't belong, or EAFNOSUPPORT if * the jail doesn't allow the address family. IPv4 Address passed in in NBO. */ int prison_if(struct ucred *cred, struct sockaddr *sa) { #ifdef INET struct sockaddr_in *sai; #endif #ifdef INET6 struct sockaddr_in6 *sai6; #endif int error; KASSERT(cred != NULL, ("%s: cred is NULL", __func__)); KASSERT(sa != NULL, ("%s: sa is NULL", __func__)); #ifdef VIMAGE if (prison_owns_vnet(cred)) return (0); #endif error = 0; switch (sa->sa_family) { #ifdef INET case AF_INET: sai = (struct sockaddr_in *)sa; error = prison_check_ip4(cred, &sai->sin_addr); break; #endif #ifdef INET6 case AF_INET6: sai6 = (struct sockaddr_in6 *)sa; error = prison_check_ip6(cred, &sai6->sin6_addr); break; #endif default: if (!(cred->cr_prison->pr_allow & PR_ALLOW_SOCKET_AF)) error = EAFNOSUPPORT; } return (error); } /* * Return 0 if jails permit p1 to frob p2, otherwise ESRCH. */ int prison_check(struct ucred *cred1, struct ucred *cred2) { return ((cred1->cr_prison == cred2->cr_prison || prison_ischild(cred1->cr_prison, cred2->cr_prison)) ? 0 : ESRCH); } /* * Return 1 if p2 is a child of p1, otherwise 0. */ int prison_ischild(struct prison *pr1, struct prison *pr2) { for (pr2 = pr2->pr_parent; pr2 != NULL; pr2 = pr2->pr_parent) if (pr1 == pr2) return (1); return (0); } /* * Return 1 if the passed credential is in a jail, otherwise 0. */ int jailed(struct ucred *cred) { return (cred->cr_prison != &prison0); } /* * Return 1 if the passed credential is in a jail and that jail does not * have its own virtual network stack, otherwise 0. */ int jailed_without_vnet(struct ucred *cred) { if (!jailed(cred)) return (0); #ifdef VIMAGE if (prison_owns_vnet(cred)) return (0); #endif return (1); } /* * Return the correct hostname (domainname, et al) for the passed credential. */ void getcredhostname(struct ucred *cred, char *buf, size_t size) { struct prison *pr; /* * A NULL credential can be used to shortcut to the physical * system's hostname. */ pr = (cred != NULL) ? cred->cr_prison : &prison0; mtx_lock(&pr->pr_mtx); strlcpy(buf, pr->pr_hostname, size); mtx_unlock(&pr->pr_mtx); } void getcreddomainname(struct ucred *cred, char *buf, size_t size) { mtx_lock(&cred->cr_prison->pr_mtx); strlcpy(buf, cred->cr_prison->pr_domainname, size); mtx_unlock(&cred->cr_prison->pr_mtx); } void getcredhostuuid(struct ucred *cred, char *buf, size_t size) { mtx_lock(&cred->cr_prison->pr_mtx); strlcpy(buf, cred->cr_prison->pr_hostuuid, size); mtx_unlock(&cred->cr_prison->pr_mtx); } void getcredhostid(struct ucred *cred, unsigned long *hostid) { mtx_lock(&cred->cr_prison->pr_mtx); *hostid = cred->cr_prison->pr_hostid; mtx_unlock(&cred->cr_prison->pr_mtx); } #ifdef VIMAGE /* * Determine whether the prison represented by cred owns * its vnet rather than having it inherited. * * Returns 1 in case the prison owns the vnet, 0 otherwise. */ int prison_owns_vnet(struct ucred *cred) { /* * vnets cannot be added/removed after jail creation, * so no need to lock here. */ return (cred->cr_prison->pr_flags & PR_VNET ? 1 : 0); } #endif /* * Determine whether the subject represented by cred can "see" * status of a mount point. * Returns: 0 for permitted, ENOENT otherwise. * XXX: This function should be called cr_canseemount() and should be * placed in kern_prot.c. 
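 * Visibility follows enforce_statfs: 0 shows every mount point, 1 only
 * those under the jail's root, and 2 only the file system the jail's
 * root resides on.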
*/ int prison_canseemount(struct ucred *cred, struct mount *mp) { struct prison *pr; struct statfs *sp; size_t len; pr = cred->cr_prison; if (pr->pr_enforce_statfs == 0) return (0); if (pr->pr_root->v_mount == mp) return (0); if (pr->pr_enforce_statfs == 2) return (ENOENT); /* * If jail's chroot directory is set to "/" we should be able to see * all mount-points from inside a jail. * This is ugly check, but this is the only situation when jail's * directory ends with '/'. */ if (strcmp(pr->pr_path, "/") == 0) return (0); len = strlen(pr->pr_path); sp = &mp->mnt_stat; if (strncmp(pr->pr_path, sp->f_mntonname, len) != 0) return (ENOENT); /* * Be sure that we don't have situation where jail's root directory * is "/some/path" and mount point is "/some/pathpath". */ if (sp->f_mntonname[len] != '\0' && sp->f_mntonname[len] != '/') return (ENOENT); return (0); } void prison_enforce_statfs(struct ucred *cred, struct mount *mp, struct statfs *sp) { char jpath[MAXPATHLEN]; struct prison *pr; size_t len; pr = cred->cr_prison; if (pr->pr_enforce_statfs == 0) return; if (prison_canseemount(cred, mp) != 0) { bzero(sp->f_mntonname, sizeof(sp->f_mntonname)); strlcpy(sp->f_mntonname, "[restricted]", sizeof(sp->f_mntonname)); return; } if (pr->pr_root->v_mount == mp) { /* * Clear current buffer data, so we are sure nothing from * the valid path left there. */ bzero(sp->f_mntonname, sizeof(sp->f_mntonname)); *sp->f_mntonname = '/'; return; } /* * If jail's chroot directory is set to "/" we should be able to see * all mount-points from inside a jail. */ if (strcmp(pr->pr_path, "/") == 0) return; len = strlen(pr->pr_path); strlcpy(jpath, sp->f_mntonname + len, sizeof(jpath)); /* * Clear current buffer data, so we are sure nothing from * the valid path left there. */ bzero(sp->f_mntonname, sizeof(sp->f_mntonname)); if (*jpath == '\0') { /* Should never happen. */ *sp->f_mntonname = '/'; } else { strlcpy(sp->f_mntonname, jpath, sizeof(sp->f_mntonname)); } } /* * Check with permission for a specific privilege is granted within jail. We * have a specific list of accepted privileges; the rest are denied. */ int prison_priv_check(struct ucred *cred, int priv) { if (!jailed(cred)) return (0); #ifdef VIMAGE /* * Privileges specific to prisons with a virtual network stack. * There might be a duplicate entry here in case the privilege * is only granted conditionally in the legacy jail case. */ switch (priv) { #ifdef notyet /* * NFS-specific privileges. */ case PRIV_NFS_DAEMON: case PRIV_NFS_LOCKD: #endif /* * Network stack privileges. */ case PRIV_NET_BRIDGE: case PRIV_NET_GRE: case PRIV_NET_BPF: case PRIV_NET_RAW: /* Dup, cond. in legacy jail case. */ case PRIV_NET_ROUTE: case PRIV_NET_TAP: case PRIV_NET_SETIFMTU: case PRIV_NET_SETIFFLAGS: case PRIV_NET_SETIFCAP: case PRIV_NET_SETIFDESCR: case PRIV_NET_SETIFNAME : case PRIV_NET_SETIFMETRIC: case PRIV_NET_SETIFPHYS: case PRIV_NET_SETIFMAC: case PRIV_NET_ADDMULTI: case PRIV_NET_DELMULTI: case PRIV_NET_HWIOCTL: case PRIV_NET_SETLLADDR: case PRIV_NET_ADDIFGROUP: case PRIV_NET_DELIFGROUP: case PRIV_NET_IFCREATE: case PRIV_NET_IFDESTROY: case PRIV_NET_ADDIFADDR: case PRIV_NET_DELIFADDR: case PRIV_NET_LAGG: case PRIV_NET_GIF: case PRIV_NET_SETIFVNET: case PRIV_NET_SETIFFIB: /* * 802.11-related privileges. */ case PRIV_NET80211_GETKEY: #ifdef notyet case PRIV_NET80211_MANAGE: /* XXX-BZ discuss with sam@ */ #endif #ifdef notyet /* * ATM privileges. */ case PRIV_NETATM_CFG: case PRIV_NETATM_ADD: case PRIV_NETATM_DEL: case PRIV_NETATM_SET: /* * Bluetooth privileges. 
*/ case PRIV_NETBLUETOOTH_RAW: #endif /* * Netgraph and netgraph module privileges. */ case PRIV_NETGRAPH_CONTROL: #ifdef notyet case PRIV_NETGRAPH_TTY: #endif /* * IPv4 and IPv6 privileges. */ case PRIV_NETINET_IPFW: case PRIV_NETINET_DIVERT: case PRIV_NETINET_PF: case PRIV_NETINET_DUMMYNET: case PRIV_NETINET_CARP: case PRIV_NETINET_MROUTE: case PRIV_NETINET_RAW: case PRIV_NETINET_ADDRCTRL6: case PRIV_NETINET_ND6: case PRIV_NETINET_SCOPE6: case PRIV_NETINET_ALIFETIME6: case PRIV_NETINET_IPSEC: case PRIV_NETINET_BINDANY: #ifdef notyet /* * NCP privileges. */ case PRIV_NETNCP: /* * SMB privileges. */ case PRIV_NETSMB: #endif /* * No default: or deny here. * In case of no permit fall through to next switch(). */ if (cred->cr_prison->pr_flags & PR_VNET) return (0); } #endif /* VIMAGE */ switch (priv) { /* * Allow ktrace privileges for root in jail. */ case PRIV_KTRACE: #if 0 /* * Allow jailed processes to configure audit identity and * submit audit records (login, etc). In the future we may * want to further refine the relationship between audit and * jail. */ case PRIV_AUDIT_GETAUDIT: case PRIV_AUDIT_SETAUDIT: case PRIV_AUDIT_SUBMIT: #endif /* * Allow jailed processes to manipulate process UNIX * credentials in any way they see fit. */ case PRIV_CRED_SETUID: case PRIV_CRED_SETEUID: case PRIV_CRED_SETGID: case PRIV_CRED_SETEGID: case PRIV_CRED_SETGROUPS: case PRIV_CRED_SETREUID: case PRIV_CRED_SETREGID: case PRIV_CRED_SETRESUID: case PRIV_CRED_SETRESGID: /* * Jail implements visibility constraints already, so allow * jailed root to override uid/gid-based constraints. */ case PRIV_SEEOTHERGIDS: case PRIV_SEEOTHERUIDS: /* * Jail implements inter-process debugging limits already, so * allow jailed root various debugging privileges. */ case PRIV_DEBUG_DIFFCRED: case PRIV_DEBUG_SUGID: case PRIV_DEBUG_UNPRIV: /* * Allow jail to set various resource limits and login * properties, and for now, exceed process resource limits. */ case PRIV_PROC_LIMIT: case PRIV_PROC_SETLOGIN: case PRIV_PROC_SETRLIMIT: /* * System V and POSIX IPC privileges are granted in jail. */ case PRIV_IPC_READ: case PRIV_IPC_WRITE: case PRIV_IPC_ADMIN: case PRIV_IPC_MSGSIZE: case PRIV_MQ_ADMIN: /* * Jail operations within a jail work on child jails. */ case PRIV_JAIL_ATTACH: case PRIV_JAIL_SET: case PRIV_JAIL_REMOVE: /* * Jail implements its own inter-process limits, so allow * root processes in jail to change scheduling on other * processes in the same jail. Likewise for signalling. */ case PRIV_SCHED_DIFFCRED: case PRIV_SCHED_CPUSET: case PRIV_SIGNAL_DIFFCRED: case PRIV_SIGNAL_SUGID: /* * Allow jailed processes to write to sysctls marked as jail * writable. */ case PRIV_SYSCTL_WRITEJAIL: /* * Allow root in jail to manage a variety of quota * properties. These should likely be conditional on a * configuration option. */ case PRIV_VFS_GETQUOTA: case PRIV_VFS_SETQUOTA: /* * Since Jail relies on chroot() to implement file system * protections, grant many VFS privileges to root in jail. * Be careful to exclude mount-related and NFS-related * privileges. */ case PRIV_VFS_READ: case PRIV_VFS_WRITE: case PRIV_VFS_ADMIN: case PRIV_VFS_EXEC: case PRIV_VFS_LOOKUP: case PRIV_VFS_BLOCKRESERVE: /* XXXRW: Slightly surprising. 
*/ case PRIV_VFS_CHFLAGS_DEV: case PRIV_VFS_CHOWN: case PRIV_VFS_CHROOT: case PRIV_VFS_RETAINSUGID: case PRIV_VFS_FCHROOT: case PRIV_VFS_LINK: case PRIV_VFS_SETGID: case PRIV_VFS_STAT: case PRIV_VFS_STICKYFILE: /* * As in the non-jail case, non-root users are expected to be * able to read kernel/phyiscal memory (provided /dev/[k]mem * exists in the jail and they have permission to access it). */ case PRIV_KMEM_READ: return (0); /* * Depending on the global setting, allow privilege of * setting system flags. */ case PRIV_VFS_SYSFLAGS: if (cred->cr_prison->pr_allow & PR_ALLOW_CHFLAGS) return (0); else return (EPERM); /* * Depending on the global setting, allow privilege of * mounting/unmounting file systems. */ case PRIV_VFS_MOUNT: case PRIV_VFS_UNMOUNT: case PRIV_VFS_MOUNT_NONUSER: case PRIV_VFS_MOUNT_OWNER: if (cred->cr_prison->pr_allow & PR_ALLOW_MOUNT && cred->cr_prison->pr_enforce_statfs < 2) return (0); else return (EPERM); /* * Conditionnaly allow locking (unlocking) physical pages * in memory. */ case PRIV_VM_MLOCK: case PRIV_VM_MUNLOCK: if (cred->cr_prison->pr_allow & PR_ALLOW_MLOCK) return (0); else return (EPERM); /* * Conditionally allow jailed root to bind reserved ports. */ case PRIV_NETINET_RESERVEDPORT: if (cred->cr_prison->pr_allow & PR_ALLOW_RESERVED_PORTS) return (0); else return (EPERM); /* * Allow jailed root to reuse in-use ports. */ case PRIV_NETINET_REUSEPORT: return (0); /* * Allow jailed root to set certain IPv4/6 (option) headers. */ case PRIV_NETINET_SETHDROPTS: return (0); /* * Conditionally allow creating raw sockets in jail. */ case PRIV_NETINET_RAW: if (cred->cr_prison->pr_allow & PR_ALLOW_RAW_SOCKETS) return (0); else return (EPERM); /* * Since jail implements its own visibility limits on netstat * sysctls, allow getcred. This allows identd to work in * jail. */ case PRIV_NETINET_GETCRED: return (0); /* * Allow jailed root to set loginclass. */ case PRIV_PROC_SETLOGINCLASS: return (0); default: /* * In all remaining cases, deny the privilege request. This * includes almost all network privileges, many system * configuration privileges. */ return (EPERM); } } /* * Return the part of pr2's name that is relative to pr1, or the whole name * if it does not directly follow. */ char * prison_name(struct prison *pr1, struct prison *pr2) { char *name; /* Jails see themselves as "0" (if they see themselves at all). */ if (pr1 == pr2) return "0"; name = pr2->pr_name; if (prison_ischild(pr1, pr2)) { /* * pr1 isn't locked (and allprison_lock may not be either) * so its length can't be counted on. But the number of dots * can be counted on - and counted. */ for (; pr1 != &prison0; pr1 = pr1->pr_parent) name = strchr(name, '.') + 1; } return (name); } /* * Return the part of pr2's path that is relative to pr1, or the whole path * if it does not directly follow. */ static char * prison_path(struct prison *pr1, struct prison *pr2) { char *path1, *path2; int len1; path1 = pr1->pr_path; path2 = pr2->pr_path; if (!strcmp(path1, "/")) return (path2); len1 = strlen(path1); if (strncmp(path1, path2, len1)) return (path2); if (path2[len1] == '\0') return "/"; if (path2[len1] == '/') return (path2 + len1); return (path2); } /* * Jail-related sysctls. 
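 *
 * Everything that follows hangs off the security.jail sysctl tree.  As an
 * illustration only (not part of this file), a process can test its own
 * jail status from userland with something like:
 *
 *	int v;
 *	size_t len = sizeof(v);
 *	sysctlbyname("security.jail.jailed", &v, &len, NULL, 0);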
*/ static SYSCTL_NODE(_security, OID_AUTO, jail, CTLFLAG_RW, 0, "Jails"); static int sysctl_jail_list(SYSCTL_HANDLER_ARGS) { struct xprison *xp; struct prison *pr, *cpr; #ifdef INET struct in_addr *ip4 = NULL; int ip4s = 0; #endif #ifdef INET6 struct in6_addr *ip6 = NULL; int ip6s = 0; #endif int descend, error; xp = malloc(sizeof(*xp), M_TEMP, M_WAITOK); pr = req->td->td_ucred->cr_prison; error = 0; sx_slock(&allprison_lock); FOREACH_PRISON_DESCENDANT(pr, cpr, descend) { #if defined(INET) || defined(INET6) again: #endif mtx_lock(&cpr->pr_mtx); #ifdef INET if (cpr->pr_ip4s > 0) { if (ip4s < cpr->pr_ip4s) { ip4s = cpr->pr_ip4s; mtx_unlock(&cpr->pr_mtx); ip4 = realloc(ip4, ip4s * sizeof(struct in_addr), M_TEMP, M_WAITOK); goto again; } bcopy(cpr->pr_ip4, ip4, cpr->pr_ip4s * sizeof(struct in_addr)); } #endif #ifdef INET6 if (cpr->pr_ip6s > 0) { if (ip6s < cpr->pr_ip6s) { ip6s = cpr->pr_ip6s; mtx_unlock(&cpr->pr_mtx); ip6 = realloc(ip6, ip6s * sizeof(struct in6_addr), M_TEMP, M_WAITOK); goto again; } bcopy(cpr->pr_ip6, ip6, cpr->pr_ip6s * sizeof(struct in6_addr)); } #endif if (cpr->pr_ref == 0) { mtx_unlock(&cpr->pr_mtx); continue; } bzero(xp, sizeof(*xp)); xp->pr_version = XPRISON_VERSION; xp->pr_id = cpr->pr_id; xp->pr_state = cpr->pr_uref > 0 ? PRISON_STATE_ALIVE : PRISON_STATE_DYING; strlcpy(xp->pr_path, prison_path(pr, cpr), sizeof(xp->pr_path)); strlcpy(xp->pr_host, cpr->pr_hostname, sizeof(xp->pr_host)); strlcpy(xp->pr_name, prison_name(pr, cpr), sizeof(xp->pr_name)); #ifdef INET xp->pr_ip4s = cpr->pr_ip4s; #endif #ifdef INET6 xp->pr_ip6s = cpr->pr_ip6s; #endif mtx_unlock(&cpr->pr_mtx); error = SYSCTL_OUT(req, xp, sizeof(*xp)); if (error) break; #ifdef INET if (xp->pr_ip4s > 0) { error = SYSCTL_OUT(req, ip4, xp->pr_ip4s * sizeof(struct in_addr)); if (error) break; } #endif #ifdef INET6 if (xp->pr_ip6s > 0) { error = SYSCTL_OUT(req, ip6, xp->pr_ip6s * sizeof(struct in6_addr)); if (error) break; } #endif } sx_sunlock(&allprison_lock); free(xp, M_TEMP); #ifdef INET free(ip4, M_TEMP); #endif #ifdef INET6 free(ip6, M_TEMP); #endif return (error); } SYSCTL_OID(_security_jail, OID_AUTO, list, CTLTYPE_STRUCT | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_jail_list, "S", "List of active jails"); static int sysctl_jail_jailed(SYSCTL_HANDLER_ARGS) { int error, injail; injail = jailed(req->td->td_ucred); error = SYSCTL_OUT(req, &injail, sizeof(injail)); return (error); } SYSCTL_PROC(_security_jail, OID_AUTO, jailed, CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_jail_jailed, "I", "Process in jail?"); static int sysctl_jail_vnet(SYSCTL_HANDLER_ARGS) { int error, havevnet; #ifdef VIMAGE struct ucred *cred = req->td->td_ucred; havevnet = jailed(cred) && prison_owns_vnet(cred); #else havevnet = 0; #endif error = SYSCTL_OUT(req, &havevnet, sizeof(havevnet)); return (error); } SYSCTL_PROC(_security_jail, OID_AUTO, vnet, CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, NULL, 0, sysctl_jail_vnet, "I", "Jail owns vnet?"); #if defined(INET) || defined(INET6) SYSCTL_UINT(_security_jail, OID_AUTO, jail_max_af_ips, CTLFLAG_RW, &jail_max_af_ips, 0, "Number of IP addresses a jail may have at most per address family (deprecated)"); #endif /* * Default parameters for jail(2) compatibility. For historical reasons, * the sysctl names have varying similarity to the parameter names. Prisons * just see their own parameters, and can't change them. */ static int sysctl_jail_default_allow(SYSCTL_HANDLER_ARGS) { struct prison *pr; int allow, error, i; pr = req->td->td_ucred->cr_prison; allow = (pr == &prison0) ? 
jail_default_allow : pr->pr_allow; /* Get the current flag value, and convert it to a boolean. */ i = (allow & arg2) ? 1 : 0; if (arg1 != NULL) i = !i; error = sysctl_handle_int(oidp, &i, 0, req); if (error || !req->newptr) return (error); i = i ? arg2 : 0; if (arg1 != NULL) i ^= arg2; /* * The sysctls don't have CTLFLAGS_PRISON, so assume prison0 * for writing. */ mtx_lock(&prison0.pr_mtx); jail_default_allow = (jail_default_allow & ~arg2) | i; mtx_unlock(&prison0.pr_mtx); return (0); } SYSCTL_PROC(_security_jail, OID_AUTO, set_hostname_allowed, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, PR_ALLOW_SET_HOSTNAME, sysctl_jail_default_allow, "I", "Processes in jail can set their hostnames (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, socket_unixiproute_only, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, (void *)1, PR_ALLOW_SOCKET_AF, sysctl_jail_default_allow, "I", "Processes in jail are limited to creating UNIX/IP/route sockets only (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, sysvipc_allowed, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, PR_ALLOW_SYSVIPC, sysctl_jail_default_allow, "I", "Processes in jail can use System V IPC primitives (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, allow_raw_sockets, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, PR_ALLOW_RAW_SOCKETS, sysctl_jail_default_allow, "I", "Prison root can create raw sockets (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, chflags_allowed, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, PR_ALLOW_CHFLAGS, sysctl_jail_default_allow, "I", "Processes in jail can alter system file flags (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, mount_allowed, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, PR_ALLOW_MOUNT, sysctl_jail_default_allow, "I", "Processes in jail can mount/unmount jail-friendly file systems (deprecated)"); static int sysctl_jail_default_level(SYSCTL_HANDLER_ARGS) { struct prison *pr; int level, error; pr = req->td->td_ucred->cr_prison; level = (pr == &prison0) ? *(int *)arg1 : *(int *)((char *)pr + arg2); error = sysctl_handle_int(oidp, &level, 0, req); if (error || !req->newptr) return (error); *(int *)arg1 = level; return (0); } SYSCTL_PROC(_security_jail, OID_AUTO, enforce_statfs, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, &jail_default_enforce_statfs, offsetof(struct prison, pr_enforce_statfs), sysctl_jail_default_level, "I", "Processes in jail cannot see all mounted file systems (deprecated)"); SYSCTL_PROC(_security_jail, OID_AUTO, devfs_ruleset, CTLTYPE_INT | CTLFLAG_RD | CTLFLAG_MPSAFE, &jail_default_devfs_rsnum, offsetof(struct prison, pr_devfs_rsnum), sysctl_jail_default_level, "I", "Ruleset for the devfs filesystem in jail (deprecated)"); /* * Nodes to describe jail parameters. Maximum length of string parameters * is returned in the string itself, and the other parameters exist merely * to make themselves and their types known. 
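 *
 * For example (illustrative only), reading security.jail.param.path
 * reports MAXPATHLEN as a decimal string, which tells a caller such as
 * libjail how large a buffer to supply for the "path" parameter when
 * using jail_get(2).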
*/ SYSCTL_NODE(_security_jail, OID_AUTO, param, CTLFLAG_RW, 0, "Jail parameters"); int sysctl_jail_param(SYSCTL_HANDLER_ARGS) { int i; long l; size_t s; char numbuf[12]; switch (oidp->oid_kind & CTLTYPE) { case CTLTYPE_LONG: case CTLTYPE_ULONG: l = 0; #ifdef SCTL_MASK32 if (!(req->flags & SCTL_MASK32)) #endif return (SYSCTL_OUT(req, &l, sizeof(l))); case CTLTYPE_INT: case CTLTYPE_UINT: i = 0; return (SYSCTL_OUT(req, &i, sizeof(i))); case CTLTYPE_STRING: snprintf(numbuf, sizeof(numbuf), "%jd", (intmax_t)arg2); return (sysctl_handle_string(oidp, numbuf, sizeof(numbuf), req)); case CTLTYPE_STRUCT: s = (size_t)arg2; return (SYSCTL_OUT(req, &s, sizeof(s))); } return (0); } /* * CTLFLAG_RDTUN in the following indicates jail parameters that can be set at * jail creation time but cannot be changed in an existing jail. */ SYSCTL_JAIL_PARAM(, jid, CTLTYPE_INT | CTLFLAG_RDTUN, "I", "Jail ID"); SYSCTL_JAIL_PARAM(, parent, CTLTYPE_INT | CTLFLAG_RD, "I", "Jail parent ID"); SYSCTL_JAIL_PARAM_STRING(, name, CTLFLAG_RW, MAXHOSTNAMELEN, "Jail name"); SYSCTL_JAIL_PARAM_STRING(, path, CTLFLAG_RDTUN, MAXPATHLEN, "Jail root path"); SYSCTL_JAIL_PARAM(, securelevel, CTLTYPE_INT | CTLFLAG_RW, "I", "Jail secure level"); SYSCTL_JAIL_PARAM(, osreldate, CTLTYPE_INT | CTLFLAG_RDTUN, "I", "Jail value for kern.osreldate and uname -K"); SYSCTL_JAIL_PARAM_STRING(, osrelease, CTLFLAG_RDTUN, OSRELEASELEN, "Jail value for kern.osrelease and uname -r"); SYSCTL_JAIL_PARAM(, enforce_statfs, CTLTYPE_INT | CTLFLAG_RW, "I", "Jail cannot see all mounted file systems"); SYSCTL_JAIL_PARAM(, devfs_ruleset, CTLTYPE_INT | CTLFLAG_RW, "I", "Ruleset for in-jail devfs mounts"); SYSCTL_JAIL_PARAM(, persist, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail persistence"); #ifdef VIMAGE SYSCTL_JAIL_PARAM(, vnet, CTLTYPE_INT | CTLFLAG_RDTUN, "E,jailsys", "Virtual network stack"); #endif SYSCTL_JAIL_PARAM(, dying, CTLTYPE_INT | CTLFLAG_RD, "B", "Jail is in the process of shutting down"); SYSCTL_JAIL_PARAM_NODE(children, "Number of child jails"); SYSCTL_JAIL_PARAM(_children, cur, CTLTYPE_INT | CTLFLAG_RD, "I", "Current number of child jails"); SYSCTL_JAIL_PARAM(_children, max, CTLTYPE_INT | CTLFLAG_RW, "I", "Maximum number of child jails"); SYSCTL_JAIL_PARAM_SYS_NODE(host, CTLFLAG_RW, "Jail host info"); SYSCTL_JAIL_PARAM_STRING(_host, hostname, CTLFLAG_RW, MAXHOSTNAMELEN, "Jail hostname"); SYSCTL_JAIL_PARAM_STRING(_host, domainname, CTLFLAG_RW, MAXHOSTNAMELEN, "Jail NIS domainname"); SYSCTL_JAIL_PARAM_STRING(_host, hostuuid, CTLFLAG_RW, HOSTUUIDLEN, "Jail host UUID"); SYSCTL_JAIL_PARAM(_host, hostid, CTLTYPE_ULONG | CTLFLAG_RW, "LU", "Jail host ID"); SYSCTL_JAIL_PARAM_NODE(cpuset, "Jail cpuset"); SYSCTL_JAIL_PARAM(_cpuset, id, CTLTYPE_INT | CTLFLAG_RD, "I", "Jail cpuset ID"); #ifdef INET SYSCTL_JAIL_PARAM_SYS_NODE(ip4, CTLFLAG_RDTUN, "Jail IPv4 address virtualization"); SYSCTL_JAIL_PARAM_STRUCT(_ip4, addr, CTLFLAG_RW, sizeof(struct in_addr), "S,in_addr,a", "Jail IPv4 addresses"); SYSCTL_JAIL_PARAM(_ip4, saddrsel, CTLTYPE_INT | CTLFLAG_RW, "B", "Do (not) use IPv4 source address selection rather than the " "primary jail IPv4 address."); #endif #ifdef INET6 SYSCTL_JAIL_PARAM_SYS_NODE(ip6, CTLFLAG_RDTUN, "Jail IPv6 address virtualization"); SYSCTL_JAIL_PARAM_STRUCT(_ip6, addr, CTLFLAG_RW, sizeof(struct in6_addr), "S,in6_addr,a", "Jail IPv6 addresses"); SYSCTL_JAIL_PARAM(_ip6, saddrsel, CTLTYPE_INT | CTLFLAG_RW, "B", "Do (not) use IPv6 source address selection rather than the " "primary jail IPv6 address."); #endif SYSCTL_JAIL_PARAM_NODE(allow, "Jail permission 
flags"); SYSCTL_JAIL_PARAM(_allow, set_hostname, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may set hostname"); SYSCTL_JAIL_PARAM(_allow, sysvipc, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may use SYSV IPC"); SYSCTL_JAIL_PARAM(_allow, raw_sockets, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may create raw sockets"); SYSCTL_JAIL_PARAM(_allow, chflags, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may alter system file flags"); SYSCTL_JAIL_PARAM(_allow, quotas, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may set file quotas"); SYSCTL_JAIL_PARAM(_allow, socket_af, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may create sockets other than just UNIX/IPv4/IPv6/route"); SYSCTL_JAIL_PARAM(_allow, mlock, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may lock (unlock) physical pages in memory"); SYSCTL_JAIL_PARAM(_allow, reserved_ports, CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may bind sockets to reserved ports"); SYSCTL_JAIL_PARAM_SUBNODE(allow, mount, "Jail mount/unmount permission flags"); SYSCTL_JAIL_PARAM(_allow_mount, , CTLTYPE_INT | CTLFLAG_RW, "B", "Jail may mount/unmount jail-friendly file systems in general"); /* * Add a dynamic parameter allow., or allow... Return * its associated bit in the pr_allow bitmask, or zero if the parameter was * not created. */ unsigned prison_add_allow(const char *prefix, const char *name, const char *prefix_descr, const char *descr) { struct bool_flags *bf; struct sysctl_oid *parent; char *allow_name, *allow_noname, *allowed; #ifndef NO_SYSCTL_DESCR char *descr_deprecated; #endif unsigned allow_flag; if (prefix ? asprintf(&allow_name, M_PRISON, "allow.%s.%s", prefix, name) < 0 || asprintf(&allow_noname, M_PRISON, "allow.%s.no%s", prefix, name) < 0 : asprintf(&allow_name, M_PRISON, "allow.%s", name) < 0 || asprintf(&allow_noname, M_PRISON, "allow.no%s", name) < 0) { free(allow_name, M_PRISON); return 0; } /* * See if this parameter has already beed added, i.e. a module was * previously loaded/unloaded. */ mtx_lock(&prison0.pr_mtx); for (bf = pr_flag_allow; bf < pr_flag_allow + nitems(pr_flag_allow) && bf->flag != 0; bf++) { if (strcmp(bf->name, allow_name) == 0) { allow_flag = bf->flag; goto no_add; } } /* * Find a free bit in prison0's pr_allow, failing if there are none * (which shouldn't happen as long as we keep track of how many * potential dynamic flags exist). */ for (allow_flag = 1;; allow_flag <<= 1) { if (allow_flag == 0) goto no_add; if ((prison0.pr_allow & allow_flag) == 0) break; } /* * Note the parameter in the next open slot in pr_flag_allow. * Set the flag last so code that checks pr_flag_allow can do so * without locking. */ for (bf = pr_flag_allow; bf->flag != 0; bf++) if (bf == pr_flag_allow + nitems(pr_flag_allow)) { /* This should never happen, but is not fatal. */ allow_flag = 0; goto no_add; } prison0.pr_allow |= allow_flag; bf->name = allow_name; bf->noname = allow_noname; bf->flag = allow_flag; mtx_unlock(&prison0.pr_mtx); /* * Create sysctls for the paramter, and the back-compat global * permission. */ parent = prefix ? SYSCTL_ADD_NODE(NULL, SYSCTL_CHILDREN(&sysctl___security_jail_param_allow), OID_AUTO, prefix, 0, 0, prefix_descr) : &sysctl___security_jail_param_allow; (void)SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(parent), OID_AUTO, name, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, 0, sysctl_jail_param, "B", descr); if ((prefix ? 
asprintf(&allowed, M_TEMP, "%s_%s_allowed", prefix, name) : asprintf(&allowed, M_TEMP, "%s_allowed", name)) >= 0) { #ifndef NO_SYSCTL_DESCR (void)asprintf(&descr_deprecated, M_TEMP, "%s (deprecated)", descr); #endif (void)SYSCTL_ADD_PROC(NULL, SYSCTL_CHILDREN(&sysctl___security_jail), OID_AUTO, allowed, CTLTYPE_INT | CTLFLAG_RW | CTLFLAG_MPSAFE, NULL, allow_flag, sysctl_jail_default_allow, "I", descr_deprecated); #ifndef NO_SYSCTL_DESCR free(descr_deprecated, M_TEMP); #endif free(allowed, M_TEMP); } return allow_flag; no_add: mtx_unlock(&prison0.pr_mtx); free(allow_name, M_PRISON); free(allow_noname, M_PRISON); return allow_flag; } /* * The VFS system will register jail-aware filesystems here. They each get * a parameter allow.mount.xxxfs and a flag to check when a jailed user * attempts to mount. */ void prison_add_vfs(struct vfsconf *vfsp) { #ifdef NO_SYSCTL_DESCR vfsp->vfc_prison_flag = prison_add_allow("mount", vfsp->vfc_name, NULL, NULL); #else char *descr; (void)asprintf(&descr, M_TEMP, "Jail may mount the %s file system", vfsp->vfc_name); vfsp->vfc_prison_flag = prison_add_allow("mount", vfsp->vfc_name, NULL, descr); free(descr, M_TEMP); #endif } #ifdef RACCT void prison_racct_foreach(void (*callback)(struct racct *racct, void *arg2, void *arg3), void (*pre)(void), void (*post)(void), void *arg2, void *arg3) { struct prison_racct *prr; ASSERT_RACCT_ENABLED(); sx_slock(&allprison_lock); if (pre != NULL) (pre)(); LIST_FOREACH(prr, &allprison_racct, prr_next) (callback)(prr->prr_racct, arg2, arg3); if (post != NULL) (post)(); sx_sunlock(&allprison_lock); } static struct prison_racct * prison_racct_find_locked(const char *name) { struct prison_racct *prr; ASSERT_RACCT_ENABLED(); sx_assert(&allprison_lock, SA_XLOCKED); if (name[0] == '\0' || strlen(name) >= MAXHOSTNAMELEN) return (NULL); LIST_FOREACH(prr, &allprison_racct, prr_next) { if (strcmp(name, prr->prr_name) != 0) continue; /* Found prison_racct with a matching name? */ prison_racct_hold(prr); return (prr); } /* Add new prison_racct. */ prr = malloc(sizeof(*prr), M_PRISON_RACCT, M_ZERO | M_WAITOK); racct_create(&prr->prr_racct); strcpy(prr->prr_name, name); refcount_init(&prr->prr_refcount, 1); LIST_INSERT_HEAD(&allprison_racct, prr, prr_next); return (prr); } struct prison_racct * prison_racct_find(const char *name) { struct prison_racct *prr; ASSERT_RACCT_ENABLED(); sx_xlock(&allprison_lock); prr = prison_racct_find_locked(name); sx_xunlock(&allprison_lock); return (prr); } void prison_racct_hold(struct prison_racct *prr) { ASSERT_RACCT_ENABLED(); refcount_acquire(&prr->prr_refcount); } static void prison_racct_free_locked(struct prison_racct *prr) { ASSERT_RACCT_ENABLED(); sx_assert(&allprison_lock, SA_XLOCKED); if (refcount_release(&prr->prr_refcount)) { racct_destroy(&prr->prr_racct); LIST_REMOVE(prr, prr_next); free(prr, M_PRISON_RACCT); } } void prison_racct_free(struct prison_racct *prr) { int old; ASSERT_RACCT_ENABLED(); sx_assert(&allprison_lock, SA_UNLOCKED); old = prr->prr_refcount; if (old > 1 && atomic_cmpset_int(&prr->prr_refcount, old, old - 1)) return; sx_xlock(&allprison_lock); prison_racct_free_locked(prr); sx_xunlock(&allprison_lock); } static void prison_racct_attach(struct prison *pr) { struct prison_racct *prr; ASSERT_RACCT_ENABLED(); sx_assert(&allprison_lock, SA_XLOCKED); prr = prison_racct_find_locked(pr->pr_name); KASSERT(prr != NULL, ("cannot find prison_racct")); pr->pr_prison_racct = prr; } /* * Handle jail renaming. 
From the racct point of view, renaming means * moving from one prison_racct to another. */ static void prison_racct_modify(struct prison *pr) { #ifdef RCTL struct proc *p; struct ucred *cred; #endif struct prison_racct *oldprr; ASSERT_RACCT_ENABLED(); sx_slock(&allproc_lock); sx_xlock(&allprison_lock); if (strcmp(pr->pr_name, pr->pr_prison_racct->prr_name) == 0) { sx_xunlock(&allprison_lock); sx_sunlock(&allproc_lock); return; } oldprr = pr->pr_prison_racct; pr->pr_prison_racct = NULL; prison_racct_attach(pr); /* * Move resource utilisation records. */ racct_move(pr->pr_prison_racct->prr_racct, oldprr->prr_racct); #ifdef RCTL /* * Force rctl to reattach rules to processes. */ FOREACH_PROC_IN_SYSTEM(p) { PROC_LOCK(p); cred = crhold(p->p_ucred); PROC_UNLOCK(p); rctl_proc_ucred_changed(p, cred); crfree(cred); } #endif sx_sunlock(&allproc_lock); prison_racct_free_locked(oldprr); sx_xunlock(&allprison_lock); } static void prison_racct_detach(struct prison *pr) { ASSERT_RACCT_ENABLED(); sx_assert(&allprison_lock, SA_UNLOCKED); if (pr->pr_prison_racct == NULL) return; prison_racct_free(pr->pr_prison_racct); pr->pr_prison_racct = NULL; } #endif /* RACCT */ #ifdef DDB static void db_show_prison(struct prison *pr) { struct bool_flags *bf; struct jailsys_flags *jsf; #if defined(INET) || defined(INET6) int ii; #endif unsigned f; #ifdef INET char ip4buf[INET_ADDRSTRLEN]; #endif #ifdef INET6 char ip6buf[INET6_ADDRSTRLEN]; #endif db_printf("prison %p:\n", pr); db_printf(" jid = %d\n", pr->pr_id); db_printf(" name = %s\n", pr->pr_name); db_printf(" parent = %p\n", pr->pr_parent); db_printf(" ref = %d\n", pr->pr_ref); db_printf(" uref = %d\n", pr->pr_uref); db_printf(" path = %s\n", pr->pr_path); db_printf(" cpuset = %d\n", pr->pr_cpuset ? pr->pr_cpuset->cs_id : -1); #ifdef VIMAGE db_printf(" vnet = %p\n", pr->pr_vnet); #endif db_printf(" root = %p\n", pr->pr_root); db_printf(" securelevel = %d\n", pr->pr_securelevel); db_printf(" devfs_rsnum = %d\n", pr->pr_devfs_rsnum); db_printf(" children.max = %d\n", pr->pr_childmax); db_printf(" children.cur = %d\n", pr->pr_childcount); db_printf(" child = %p\n", LIST_FIRST(&pr->pr_children)); db_printf(" sibling = %p\n", LIST_NEXT(pr, pr_sibling)); db_printf(" flags = 0x%x", pr->pr_flags); for (bf = pr_flag_bool; bf < pr_flag_bool + nitems(pr_flag_bool); bf++) if (pr->pr_flags & bf->flag) db_printf(" %s", bf->name); for (jsf = pr_flag_jailsys; jsf < pr_flag_jailsys + nitems(pr_flag_jailsys); jsf++) { f = pr->pr_flags & (jsf->disable | jsf->new); db_printf(" %-16s= %s\n", jsf->name, (f != 0 && f == jsf->disable) ? "disable" : (f == jsf->new) ? "new" : "inherit"); } db_printf(" allow = 0x%x", pr->pr_allow); for (bf = pr_flag_allow; bf < pr_flag_allow + nitems(pr_flag_allow) && bf->flag != 0; bf++) if (pr->pr_allow & bf->flag) db_printf(" %s", bf->name); db_printf("\n"); db_printf(" enforce_statfs = %d\n", pr->pr_enforce_statfs); db_printf(" host.hostname = %s\n", pr->pr_hostname); db_printf(" host.domainname = %s\n", pr->pr_domainname); db_printf(" host.hostuuid = %s\n", pr->pr_hostuuid); db_printf(" host.hostid = %lu\n", pr->pr_hostid); #ifdef INET db_printf(" ip4s = %d\n", pr->pr_ip4s); for (ii = 0; ii < pr->pr_ip4s; ii++) db_printf(" %s %s\n", ii == 0 ? "ip4.addr =" : " ", inet_ntoa_r(pr->pr_ip4[ii], ip4buf)); #endif #ifdef INET6 db_printf(" ip6s = %d\n", pr->pr_ip6s); for (ii = 0; ii < pr->pr_ip6s; ii++) db_printf(" %s %s\n", ii == 0 ? 
"ip6.addr =" : " ", ip6_sprintf(ip6buf, &pr->pr_ip6[ii])); #endif } DB_SHOW_COMMAND(prison, db_show_prison_command) { struct prison *pr; if (!have_addr) { /* * Show all prisons in the list, and prison0 which is not * listed. */ db_show_prison(&prison0); if (!db_pager_quit) { TAILQ_FOREACH(pr, &allprison, pr_list) { db_show_prison(pr); if (db_pager_quit) break; } } return; } if (addr == 0) pr = &prison0; else { /* Look for a prison with the ID and with references. */ TAILQ_FOREACH(pr, &allprison, pr_list) if (pr->pr_id == addr && pr->pr_ref > 0) break; if (pr == NULL) /* Look again, without requiring a reference. */ TAILQ_FOREACH(pr, &allprison, pr_list) if (pr->pr_id == addr) break; if (pr == NULL) /* Assume address points to a valid prison. */ pr = (struct prison *)addr; } db_show_prison(pr); } #endif /* DDB */ Index: projects/openssl111/sys/netinet/ip_output.c =================================================================== --- projects/openssl111/sys/netinet/ip_output.c (revision 339239) +++ projects/openssl111/sys/netinet/ip_output.c (revision 339240) @@ -1,1461 +1,1462 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 1982, 1986, 1988, 1990, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
* * @(#)ip_output.c 8.3 (Berkeley) 1/21/94 */ #include __FBSDID("$FreeBSD$"); #include "opt_inet.h" #include "opt_ratelimit.h" #include "opt_ipsec.h" #include "opt_mbuf_stress_test.h" #include "opt_mpath.h" #include "opt_route.h" #include "opt_sctp.h" #include "opt_rss.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef RADIX_MPATH #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef SCTP #include #include #endif #include #include #include #ifdef MBUF_STRESS_TEST static int mbuf_frag_size = 0; SYSCTL_INT(_net_inet_ip, OID_AUTO, mbuf_frag_size, CTLFLAG_RW, &mbuf_frag_size, 0, "Fragment outgoing mbufs to this size"); #endif static void ip_mloopback(struct ifnet *, const struct mbuf *, int); extern int in_mcast_loop; extern struct protosw inetsw[]; static inline int ip_output_pfil(struct mbuf **mp, struct ifnet *ifp, struct inpcb *inp, struct sockaddr_in *dst, int *fibnum, int *error) { struct m_tag *fwd_tag = NULL; struct mbuf *m; struct in_addr odst; struct ip *ip; m = *mp; ip = mtod(m, struct ip *); /* Run through list of hooks for output packets. */ odst.s_addr = ip->ip_dst.s_addr; *error = pfil_run_hooks(&V_inet_pfil_hook, mp, ifp, PFIL_OUT, 0, inp); m = *mp; if ((*error) != 0 || m == NULL) return 1; /* Finished */ ip = mtod(m, struct ip *); /* See if destination IP address was changed by packet filter. */ if (odst.s_addr != ip->ip_dst.s_addr) { m->m_flags |= M_SKIP_FIREWALL; /* If destination is now ourself drop to ip_input(). */ if (in_localip(ip->ip_dst)) { m->m_flags |= M_FASTFWD_OURS; if (m->m_pkthdr.rcvif == NULL) m->m_pkthdr.rcvif = V_loif; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; m->m_pkthdr.csum_data = 0xffff; } m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID; #ifdef SCTP if (m->m_pkthdr.csum_flags & CSUM_SCTP) m->m_pkthdr.csum_flags |= CSUM_SCTP_VALID; #endif *error = netisr_queue(NETISR_IP, m); return 1; /* Finished */ } bzero(dst, sizeof(*dst)); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = ip->ip_dst; return -1; /* Reloop */ } /* See if fib was changed by packet filter. */ if ((*fibnum) != M_GETFIB(m)) { m->m_flags |= M_SKIP_FIREWALL; *fibnum = M_GETFIB(m); return -1; /* Reloop for FIB change */ } /* See if local, if yes, send it to netisr with IP_FASTFWD_OURS. */ if (m->m_flags & M_FASTFWD_OURS) { if (m->m_pkthdr.rcvif == NULL) m->m_pkthdr.rcvif = V_loif; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { m->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; m->m_pkthdr.csum_data = 0xffff; } #ifdef SCTP if (m->m_pkthdr.csum_flags & CSUM_SCTP) m->m_pkthdr.csum_flags |= CSUM_SCTP_VALID; #endif m->m_pkthdr.csum_flags |= CSUM_IP_CHECKED | CSUM_IP_VALID; *error = netisr_queue(NETISR_IP, m); return 1; /* Finished */ } /* Or forward to some other address? */ if ((m->m_flags & M_IP_NEXTHOP) && ((fwd_tag = m_tag_find(m, PACKET_TAG_IPFORWARD, NULL)) != NULL)) { bcopy((fwd_tag+1), dst, sizeof(struct sockaddr_in)); m->m_flags |= M_SKIP_FIREWALL; m->m_flags &= ~M_IP_NEXTHOP; m_tag_delete(m, fwd_tag); return -1; /* Reloop for CHANGE of dst */ } return 0; } /* * IP output. The packet in mbuf chain m contains a skeletal IP * header (with len, off, ttl, proto, tos, src, dst). * The mbuf chain containing the packet will be freed. 
* The mbuf opt, if present, will not be freed. * If route ro is present and has ro_rt initialized, route lookup would be * skipped and ro->ro_rt would be used. If ro is present but ro->ro_rt is NULL, * then result of route lookup is stored in ro->ro_rt. * * In the IP forwarding case, the packet will arrive with options already * inserted, so must have a NULL opt pointer. */ int ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo, struct inpcb *inp) { struct rm_priotracker in_ifa_tracker; struct ip *ip; struct ifnet *ifp = NULL; /* keep compiler happy */ struct mbuf *m0; int hlen = sizeof (struct ip); int mtu; int error = 0; struct sockaddr_in *dst; const struct sockaddr_in *gw; struct in_ifaddr *ia; int isbroadcast; uint16_t ip_len, ip_off; struct route iproute; struct rtentry *rte; /* cache for ro->ro_rt */ uint32_t fibnum; #if defined(IPSEC) || defined(IPSEC_SUPPORT) int no_route_but_check_spd = 0; #endif M_ASSERTPKTHDR(m); if (inp != NULL) { INP_LOCK_ASSERT(inp); M_SETFIB(m, inp->inp_inc.inc_fibnum); if ((flags & IP_NODEFAULTFLOWID) == 0) { m->m_pkthdr.flowid = inp->inp_flowid; M_HASHTYPE_SET(m, inp->inp_flowtype); } } if (ro == NULL) { ro = &iproute; bzero(ro, sizeof (*ro)); } if (opt) { int len = 0; m = ip_insertoptions(m, opt, &len); if (len != 0) hlen = len; /* ip->ip_hl is updated above */ } ip = mtod(m, struct ip *); ip_len = ntohs(ip->ip_len); ip_off = ntohs(ip->ip_off); if ((flags & (IP_FORWARDING|IP_RAWOUTPUT)) == 0) { ip->ip_v = IPVERSION; ip->ip_hl = hlen >> 2; ip_fillid(ip); - IPSTAT_INC(ips_localout); } else { /* Header already set, fetch hlen from there */ hlen = ip->ip_hl << 2; } + if ((flags & IP_FORWARDING) == 0) + IPSTAT_INC(ips_localout); /* * dst/gw handling: * * dst can be rewritten but always points to &ro->ro_dst. * gw is readonly but can point either to dst OR rt_gateway, * therefore we need restore gw if we're redoing lookup. */ gw = dst = (struct sockaddr_in *)&ro->ro_dst; fibnum = (inp != NULL) ? inp->inp_inc.inc_fibnum : M_GETFIB(m); rte = ro->ro_rt; if (rte == NULL) { bzero(dst, sizeof(*dst)); dst->sin_family = AF_INET; dst->sin_len = sizeof(*dst); dst->sin_addr = ip->ip_dst; } NET_EPOCH_ENTER(); again: /* * Validate route against routing table additions; * a better/more specific route might have been added. */ if (inp) RT_VALIDATE(ro, &inp->inp_rt_cookie, fibnum); /* * If there is a cached route, * check that it is to the same destination * and is still up. If not, free it and try again. * The address family should also be checked in case of sharing the * cache with IPv6. * Also check whether routing cache needs invalidation. */ rte = ro->ro_rt; if (rte && ((rte->rt_flags & RTF_UP) == 0 || rte->rt_ifp == NULL || !RT_LINK_IS_UP(rte->rt_ifp) || dst->sin_family != AF_INET || dst->sin_addr.s_addr != ip->ip_dst.s_addr)) { RO_INVALIDATE_CACHE(ro); rte = NULL; } ia = NULL; /* * If routing to interface only, short circuit routing lookup. * The use of an all-ones broadcast address implies this; an * interface is specified by the broadcast address of an interface, * or the destination address of a ptp interface. 
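 *
 * In other words, IP_SENDONES forces the all-ones broadcast address out the
 * matching interface, while IP_ROUTETOIF sends directly to an interface
 * selected by its destination or network address, bypassing the routing
 * table; both pin ip_ttl to 1.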
*/ if (flags & IP_SENDONES) { if ((ia = ifatoia(ifa_ifwithbroadaddr(sintosa(dst), M_GETFIB(m)))) == NULL && (ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst), M_GETFIB(m)))) == NULL) { IPSTAT_INC(ips_noroute); error = ENETUNREACH; goto bad; } ip->ip_dst.s_addr = INADDR_BROADCAST; dst->sin_addr = ip->ip_dst; ifp = ia->ia_ifp; ip->ip_ttl = 1; isbroadcast = 1; } else if (flags & IP_ROUTETOIF) { if ((ia = ifatoia(ifa_ifwithdstaddr(sintosa(dst), M_GETFIB(m)))) == NULL && (ia = ifatoia(ifa_ifwithnet(sintosa(dst), 0, M_GETFIB(m)))) == NULL) { IPSTAT_INC(ips_noroute); error = ENETUNREACH; goto bad; } ifp = ia->ia_ifp; ip->ip_ttl = 1; isbroadcast = ifp->if_flags & IFF_BROADCAST ? in_ifaddr_broadcast(dst->sin_addr, ia) : 0; } else if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr)) && imo != NULL && imo->imo_multicast_ifp != NULL) { /* * Bypass the normal routing lookup for multicast * packets if the interface is specified. */ ifp = imo->imo_multicast_ifp; IFP_TO_IA(ifp, ia, &in_ifa_tracker); isbroadcast = 0; /* fool gcc */ } else { /* * We want to do any cloning requested by the link layer, * as this is probably required in all cases for correct * operation (as it is for ARP). */ if (rte == NULL) { #ifdef RADIX_MPATH rtalloc_mpath_fib(ro, ntohl(ip->ip_src.s_addr ^ ip->ip_dst.s_addr), fibnum); #else in_rtalloc_ign(ro, 0, fibnum); #endif rte = ro->ro_rt; } if (rte == NULL || (rte->rt_flags & RTF_UP) == 0 || rte->rt_ifp == NULL || !RT_LINK_IS_UP(rte->rt_ifp)) { #if defined(IPSEC) || defined(IPSEC_SUPPORT) /* * There is no route for this packet, but it is * possible that a matching SPD entry exists. */ no_route_but_check_spd = 1; mtu = 0; /* Silence GCC warning. */ goto sendit; #endif IPSTAT_INC(ips_noroute); error = EHOSTUNREACH; goto bad; } ia = ifatoia(rte->rt_ifa); ifp = rte->rt_ifp; counter_u64_add(rte->rt_pksent, 1); rt_update_ro_flags(ro); if (rte->rt_flags & RTF_GATEWAY) gw = (struct sockaddr_in *)rte->rt_gateway; if (rte->rt_flags & RTF_HOST) isbroadcast = (rte->rt_flags & RTF_BROADCAST); else if (ifp->if_flags & IFF_BROADCAST) isbroadcast = in_ifaddr_broadcast(gw->sin_addr, ia); else isbroadcast = 0; } /* * Calculate MTU. If we have a route that is up, use that, * otherwise use the interface's MTU. */ if (rte != NULL && (rte->rt_flags & (RTF_UP|RTF_HOST))) mtu = rte->rt_mtu; else mtu = ifp->if_mtu; /* Catch a possible divide by zero later. */ KASSERT(mtu > 0, ("%s: mtu %d <= 0, rte=%p (rt_flags=0x%08x) ifp=%p", __func__, mtu, rte, (rte != NULL) ? rte->rt_flags : 0, ifp)); if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr))) { m->m_flags |= M_MCAST; /* * IP destination address is multicast. Make sure "gw" * still points to the address in "ro". (It may have been * changed to point to a gateway address, above.) */ gw = dst; /* * See if the caller provided any multicast options */ if (imo != NULL) { ip->ip_ttl = imo->imo_multicast_ttl; if (imo->imo_multicast_vif != -1) ip->ip_src.s_addr = ip_mcast_src ? ip_mcast_src(imo->imo_multicast_vif) : INADDR_ANY; } else ip->ip_ttl = IP_DEFAULT_MULTICAST_TTL; /* * Confirm that the outgoing interface supports multicast. */ if ((imo == NULL) || (imo->imo_multicast_vif == -1)) { if ((ifp->if_flags & IFF_MULTICAST) == 0) { IPSTAT_INC(ips_noroute); error = ENETUNREACH; goto bad; } } /* * If source address not specified yet, use address * of outgoing interface. */ if (ip->ip_src.s_addr == INADDR_ANY) { /* Interface may have no addresses. 
*/ if (ia != NULL) ip->ip_src = IA_SIN(ia)->sin_addr; } if ((imo == NULL && in_mcast_loop) || (imo && imo->imo_multicast_loop)) { /* * Loop back multicast datagram if not expressly * forbidden to do so, even if we are not a member * of the group; ip_input() will filter it later, * thus deferring a hash lookup and mutex acquisition * at the expense of a cheap copy using m_copym(). */ ip_mloopback(ifp, m, hlen); } else { /* * If we are acting as a multicast router, perform * multicast forwarding as if the packet had just * arrived on the interface to which we are about * to send. The multicast forwarding function * recursively calls this function, using the * IP_FORWARDING flag to prevent infinite recursion. * * Multicasts that are looped back by ip_mloopback(), * above, will be forwarded by the ip_input() routine, * if necessary. */ if (V_ip_mrouter && (flags & IP_FORWARDING) == 0) { /* * If rsvp daemon is not running, do not * set ip_moptions. This ensures that the packet * is multicast and not just sent down one link * as prescribed by rsvpd. */ if (!V_rsvp_on) imo = NULL; if (ip_mforward && ip_mforward(ip, ifp, m, imo) != 0) { m_freem(m); goto done; } } } /* * Multicasts with a time-to-live of zero may be looped- * back, above, but must not be transmitted on a network. * Also, multicasts addressed to the loopback interface * are not sent -- the above call to ip_mloopback() will * loop back a copy. ip_input() will drop the copy if * this host does not belong to the destination group on * the loopback interface. */ if (ip->ip_ttl == 0 || ifp->if_flags & IFF_LOOPBACK) { m_freem(m); goto done; } goto sendit; } /* * If the source address is not specified yet, use the address * of the outoing interface. */ if (ip->ip_src.s_addr == INADDR_ANY) { /* Interface may have no addresses. */ if (ia != NULL) { ip->ip_src = IA_SIN(ia)->sin_addr; } } /* * Look for broadcast address and * verify user is allowed to send * such a packet. */ if (isbroadcast) { if ((ifp->if_flags & IFF_BROADCAST) == 0) { error = EADDRNOTAVAIL; goto bad; } if ((flags & IP_ALLOWBROADCAST) == 0) { error = EACCES; goto bad; } /* don't allow broadcast messages to be fragmented */ if (ip_len > mtu) { error = EMSGSIZE; goto bad; } m->m_flags |= M_BCAST; } else { m->m_flags &= ~M_BCAST; } sendit: #if defined(IPSEC) || defined(IPSEC_SUPPORT) if (IPSEC_ENABLED(ipv4)) { if ((error = IPSEC_OUTPUT(ipv4, m, inp)) != 0) { if (error == EINPROGRESS) error = 0; goto done; } } /* * Check if there was a route for this packet; return error if not. */ if (no_route_but_check_spd) { IPSTAT_INC(ips_noroute); error = EHOSTUNREACH; goto bad; } /* Update variables that are affected by ipsec4_output(). */ ip = mtod(m, struct ip *); hlen = ip->ip_hl << 2; #endif /* IPSEC */ /* Jump over all PFIL processing if hooks are not active. */ if (PFIL_HOOKED(&V_inet_pfil_hook)) { switch (ip_output_pfil(&m, ifp, inp, dst, &fibnum, &error)) { case 1: /* Finished */ goto done; case 0: /* Continue normally */ ip = mtod(m, struct ip *); break; case -1: /* Need to try again */ /* Reset everything for a new round */ RO_RTFREE(ro); ro->ro_prepend = NULL; rte = NULL; gw = dst; ip = mtod(m, struct ip *); goto again; } } /* 127/8 must not appear on wire - RFC1122. 
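 * Loopback source or destination addresses are therefore rejected with
 * EADDRNOTAVAIL unless the outgoing interface is itself a loopback
 * interface.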
*/ if ((ntohl(ip->ip_dst.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET || (ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) { if ((ifp->if_flags & IFF_LOOPBACK) == 0) { IPSTAT_INC(ips_badaddr); error = EADDRNOTAVAIL; goto bad; } } m->m_pkthdr.csum_flags |= CSUM_IP; if (m->m_pkthdr.csum_flags & CSUM_DELAY_DATA & ~ifp->if_hwassist) { in_delayed_cksum(m); m->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } #ifdef SCTP if (m->m_pkthdr.csum_flags & CSUM_SCTP & ~ifp->if_hwassist) { sctp_delayed_cksum(m, (uint32_t)(ip->ip_hl << 2)); m->m_pkthdr.csum_flags &= ~CSUM_SCTP; } #endif /* * If small enough for interface, or the interface will take * care of the fragmentation for us, we can just send directly. */ if (ip_len <= mtu || (m->m_pkthdr.csum_flags & ifp->if_hwassist & CSUM_TSO) != 0) { ip->ip_sum = 0; if (m->m_pkthdr.csum_flags & CSUM_IP & ~ifp->if_hwassist) { ip->ip_sum = in_cksum(m, hlen); m->m_pkthdr.csum_flags &= ~CSUM_IP; } /* * Record statistics for this interface address. * With CSUM_TSO the byte/packet count will be slightly * incorrect because we count the IP+TCP headers only * once instead of for every generated packet. */ if (!(flags & IP_FORWARDING) && ia) { if (m->m_pkthdr.csum_flags & CSUM_TSO) counter_u64_add(ia->ia_ifa.ifa_opackets, m->m_pkthdr.len / m->m_pkthdr.tso_segsz); else counter_u64_add(ia->ia_ifa.ifa_opackets, 1); counter_u64_add(ia->ia_ifa.ifa_obytes, m->m_pkthdr.len); } #ifdef MBUF_STRESS_TEST if (mbuf_frag_size && m->m_pkthdr.len > mbuf_frag_size) m = m_fragment(m, M_NOWAIT, mbuf_frag_size); #endif /* * Reset layer specific mbuf flags * to avoid confusing lower layers. */ m_clrprotoflags(m); IP_PROBE(send, NULL, NULL, ip, ifp, ip, NULL); #ifdef RATELIMIT if (inp != NULL) { if (inp->inp_flags2 & INP_RATE_LIMIT_CHANGED) in_pcboutput_txrtlmt(inp, ifp, m); /* stamp send tag on mbuf */ m->m_pkthdr.snd_tag = inp->inp_snd_tag; } else { m->m_pkthdr.snd_tag = NULL; } #endif error = (*ifp->if_output)(ifp, m, (const struct sockaddr *)gw, ro); #ifdef RATELIMIT /* check for route change */ if (error == EAGAIN) in_pcboutput_eagain(inp); #endif goto done; } /* Balk when DF bit is set or the interface didn't support TSO. */ if ((ip_off & IP_DF) || (m->m_pkthdr.csum_flags & CSUM_TSO)) { error = EMSGSIZE; IPSTAT_INC(ips_cantfrag); goto bad; } /* * Too large for interface; fragment if possible. If successful, * on return, m will point to a list of packets to be sent. */ error = ip_fragment(ip, &m, mtu, ifp->if_hwassist); if (error) goto bad; for (; m; m = m0) { m0 = m->m_nextpkt; m->m_nextpkt = 0; if (error == 0) { /* Record statistics for this interface address. */ if (ia != NULL) { counter_u64_add(ia->ia_ifa.ifa_opackets, 1); counter_u64_add(ia->ia_ifa.ifa_obytes, m->m_pkthdr.len); } /* * Reset layer specific mbuf flags * to avoid confusing upper layers. 
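 * (m_clrprotoflags() below clears the protocol-layer M_PROTO* flag bits
 * before the packet is handed to the interface output routine.)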
*/ m_clrprotoflags(m); IP_PROBE(send, NULL, NULL, mtod(m, struct ip *), ifp, mtod(m, struct ip *), NULL); #ifdef RATELIMIT if (inp != NULL) { if (inp->inp_flags2 & INP_RATE_LIMIT_CHANGED) in_pcboutput_txrtlmt(inp, ifp, m); /* stamp send tag on mbuf */ m->m_pkthdr.snd_tag = inp->inp_snd_tag; } else { m->m_pkthdr.snd_tag = NULL; } #endif error = (*ifp->if_output)(ifp, m, (const struct sockaddr *)gw, ro); #ifdef RATELIMIT /* check for route change */ if (error == EAGAIN) in_pcboutput_eagain(inp); #endif } else m_freem(m); } if (error == 0) IPSTAT_INC(ips_fragmented); done: if (ro == &iproute) RO_RTFREE(ro); else if (rte == NULL) /* * If the caller supplied a route but somehow the reference * to it has been released need to prevent the caller * calling RTFREE on it again. */ ro->ro_rt = NULL; NET_EPOCH_EXIT(); return (error); bad: m_freem(m); goto done; } /* * Create a chain of fragments which fit the given mtu. m_frag points to the * mbuf to be fragmented; on return it points to the chain with the fragments. * Return 0 if no error. If error, m_frag may contain a partially built * chain of fragments that should be freed by the caller. * * if_hwassist_flags is the hw offload capabilities (see if_data.ifi_hwassist) */ int ip_fragment(struct ip *ip, struct mbuf **m_frag, int mtu, u_long if_hwassist_flags) { int error = 0; int hlen = ip->ip_hl << 2; int len = (mtu - hlen) & ~7; /* size of payload in each fragment */ int off; struct mbuf *m0 = *m_frag; /* the original packet */ int firstlen; struct mbuf **mnext; int nfrags; uint16_t ip_len, ip_off; ip_len = ntohs(ip->ip_len); ip_off = ntohs(ip->ip_off); if (ip_off & IP_DF) { /* Fragmentation not allowed */ IPSTAT_INC(ips_cantfrag); return EMSGSIZE; } /* * Must be able to put at least 8 bytes per fragment. */ if (len < 8) return EMSGSIZE; /* * If the interface will not calculate checksums on * fragmented packets, then do it here. */ if (m0->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { in_delayed_cksum(m0); m0->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; } #ifdef SCTP if (m0->m_pkthdr.csum_flags & CSUM_SCTP) { sctp_delayed_cksum(m0, hlen); m0->m_pkthdr.csum_flags &= ~CSUM_SCTP; } #endif if (len > PAGE_SIZE) { /* * Fragment large datagrams such that each segment * contains a multiple of PAGE_SIZE amount of data, * plus headers. This enables a receiver to perform * page-flipping zero-copy optimizations. * * XXX When does this help given that sender and receiver * could have different page sizes, and also mtu could * be less than the receiver's page size ? */ int newlen; off = MIN(mtu, m0->m_pkthdr.len); /* * firstlen (off - hlen) must be aligned on an * 8-byte boundary */ if (off < hlen) goto smart_frag_failure; off = ((off - hlen) & ~7) + hlen; newlen = (~PAGE_MASK) & mtu; if ((newlen + sizeof (struct ip)) > mtu) { /* we failed, go back the default */ smart_frag_failure: newlen = len; off = hlen + len; } len = newlen; } else { off = hlen + len; } firstlen = off - hlen; mnext = &m0->m_nextpkt; /* pointer to next packet */ /* * Loop through length of segment after first fragment, * make new header and copy data of each part and link onto chain. * Here, m0 is the original packet, m is the fragment being created. * The fragments are linked off the m_nextpkt of the original * packet, which after processing serves as the first fragment. 
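 *
 * Each pass allocates a new header mbuf, duplicates the packet header and
 * IP header (plus any copied options) into it, and attaches the payload as
 * an m_copym() chain that references the original data.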
*/ for (nfrags = 1; off < ip_len; off += len, nfrags++) { struct ip *mhip; /* ip header on the fragment */ struct mbuf *m; int mhlen = sizeof (struct ip); m = m_gethdr(M_NOWAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; IPSTAT_INC(ips_odropped); goto done; } /* * Make sure the complete packet header gets copied * from the originating mbuf to the newly created * mbuf. This also ensures that existing firewall * classification(s), VLAN tags and so on get copied * to the resulting fragmented packet(s): */ if (m_dup_pkthdr(m, m0, M_NOWAIT) == 0) { m_free(m); error = ENOBUFS; IPSTAT_INC(ips_odropped); goto done; } /* * In the first mbuf, leave room for the link header, then * copy the original IP header including options. The payload * goes into an additional mbuf chain returned by m_copym(). */ m->m_data += max_linkhdr; mhip = mtod(m, struct ip *); *mhip = *ip; if (hlen > sizeof (struct ip)) { mhlen = ip_optcopy(ip, mhip) + sizeof (struct ip); mhip->ip_v = IPVERSION; mhip->ip_hl = mhlen >> 2; } m->m_len = mhlen; /* XXX do we need to add ip_off below ? */ mhip->ip_off = ((off - hlen) >> 3) + ip_off; if (off + len >= ip_len) len = ip_len - off; else mhip->ip_off |= IP_MF; mhip->ip_len = htons((u_short)(len + mhlen)); m->m_next = m_copym(m0, off, len, M_NOWAIT); if (m->m_next == NULL) { /* copy failed */ m_free(m); error = ENOBUFS; /* ??? */ IPSTAT_INC(ips_odropped); goto done; } m->m_pkthdr.len = mhlen + len; #ifdef MAC mac_netinet_fragment(m0, m); #endif mhip->ip_off = htons(mhip->ip_off); mhip->ip_sum = 0; if (m->m_pkthdr.csum_flags & CSUM_IP & ~if_hwassist_flags) { mhip->ip_sum = in_cksum(m, mhlen); m->m_pkthdr.csum_flags &= ~CSUM_IP; } *mnext = m; mnext = &m->m_nextpkt; } IPSTAT_ADD(ips_ofragments, nfrags); /* * Update first fragment by trimming what's been copied out * and updating header. */ m_adj(m0, hlen + firstlen - ip_len); m0->m_pkthdr.len = hlen + firstlen; ip->ip_len = htons((u_short)m0->m_pkthdr.len); ip->ip_off = htons(ip_off | IP_MF); ip->ip_sum = 0; if (m0->m_pkthdr.csum_flags & CSUM_IP & ~if_hwassist_flags) { ip->ip_sum = in_cksum(m0, hlen); m0->m_pkthdr.csum_flags &= ~CSUM_IP; } done: *m_frag = m0; return error; } void in_delayed_cksum(struct mbuf *m) { struct ip *ip; struct udphdr *uh; uint16_t cklen, csum, offset; ip = mtod(m, struct ip *); offset = ip->ip_hl << 2 ; if (m->m_pkthdr.csum_flags & CSUM_UDP) { /* if udp header is not in the first mbuf copy udplen */ if (offset + sizeof(struct udphdr) > m->m_len) { m_copydata(m, offset + offsetof(struct udphdr, uh_ulen), sizeof(cklen), (caddr_t)&cklen); cklen = ntohs(cklen); } else { uh = (struct udphdr *)mtodo(m, offset); cklen = ntohs(uh->uh_ulen); } csum = in_cksum_skip(m, cklen + offset, offset); if (csum == 0) csum = 0xffff; } else { cklen = ntohs(ip->ip_len); csum = in_cksum_skip(m, cklen, offset); } offset += m->m_pkthdr.csum_data; /* checksum offset */ if (offset + sizeof(csum) > m->m_len) m_copyback(m, offset, sizeof(csum), (caddr_t)&csum); else *(u_short *)mtodo(m, offset) = csum; } /* * IP socket option processing. 
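 *
 * A minimal userland sketch of the SOPT_SET path handled here (assumed
 * example, not part of this file):
 *
 *	int tos = IPTOS_LOWDELAY;
 *	setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));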
*/ int ip_ctloutput(struct socket *so, struct sockopt *sopt) { struct inpcb *inp = sotoinpcb(so); int error, optval; #ifdef RSS uint32_t rss_bucket; int retval; #endif error = optval = 0; if (sopt->sopt_level != IPPROTO_IP) { error = EINVAL; if (sopt->sopt_level == SOL_SOCKET && sopt->sopt_dir == SOPT_SET) { switch (sopt->sopt_name) { case SO_REUSEADDR: INP_WLOCK(inp); if ((so->so_options & SO_REUSEADDR) != 0) inp->inp_flags2 |= INP_REUSEADDR; else inp->inp_flags2 &= ~INP_REUSEADDR; INP_WUNLOCK(inp); error = 0; break; case SO_REUSEPORT: INP_WLOCK(inp); if ((so->so_options & SO_REUSEPORT) != 0) inp->inp_flags2 |= INP_REUSEPORT; else inp->inp_flags2 &= ~INP_REUSEPORT; INP_WUNLOCK(inp); error = 0; break; case SO_REUSEPORT_LB: INP_WLOCK(inp); if ((so->so_options & SO_REUSEPORT_LB) != 0) inp->inp_flags2 |= INP_REUSEPORT_LB; else inp->inp_flags2 &= ~INP_REUSEPORT_LB; INP_WUNLOCK(inp); error = 0; break; case SO_SETFIB: INP_WLOCK(inp); inp->inp_inc.inc_fibnum = so->so_fibnum; INP_WUNLOCK(inp); error = 0; break; case SO_MAX_PACING_RATE: #ifdef RATELIMIT INP_WLOCK(inp); inp->inp_flags2 |= INP_RATE_LIMIT_CHANGED; INP_WUNLOCK(inp); error = 0; #else error = EOPNOTSUPP; #endif break; default: break; } } return (error); } switch (sopt->sopt_dir) { case SOPT_SET: switch (sopt->sopt_name) { case IP_OPTIONS: #ifdef notyet case IP_RETOPTS: #endif { struct mbuf *m; if (sopt->sopt_valsize > MLEN) { error = EMSGSIZE; break; } m = m_get(sopt->sopt_td ? M_WAITOK : M_NOWAIT, MT_DATA); if (m == NULL) { error = ENOBUFS; break; } m->m_len = sopt->sopt_valsize; error = sooptcopyin(sopt, mtod(m, char *), m->m_len, m->m_len); if (error) { m_free(m); break; } INP_WLOCK(inp); error = ip_pcbopts(inp, sopt->sopt_name, m); INP_WUNLOCK(inp); return (error); } case IP_BINDANY: if (sopt->sopt_td != NULL) { error = priv_check(sopt->sopt_td, PRIV_NETINET_BINDANY); if (error) break; } /* FALLTHROUGH */ case IP_BINDMULTI: #ifdef RSS case IP_RSS_LISTEN_BUCKET: #endif case IP_TOS: case IP_TTL: case IP_MINTTL: case IP_RECVOPTS: case IP_RECVRETOPTS: case IP_ORIGDSTADDR: case IP_RECVDSTADDR: case IP_RECVTTL: case IP_RECVIF: case IP_ONESBCAST: case IP_DONTFRAG: case IP_RECVTOS: case IP_RECVFLOWID: #ifdef RSS case IP_RECVRSSBUCKETID: #endif error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; switch (sopt->sopt_name) { case IP_TOS: inp->inp_ip_tos = optval; break; case IP_TTL: inp->inp_ip_ttl = optval; break; case IP_MINTTL: if (optval >= 0 && optval <= MAXTTL) inp->inp_ip_minttl = optval; else error = EINVAL; break; #define OPTSET(bit) do { \ INP_WLOCK(inp); \ if (optval) \ inp->inp_flags |= bit; \ else \ inp->inp_flags &= ~bit; \ INP_WUNLOCK(inp); \ } while (0) #define OPTSET2(bit, val) do { \ INP_WLOCK(inp); \ if (val) \ inp->inp_flags2 |= bit; \ else \ inp->inp_flags2 &= ~bit; \ INP_WUNLOCK(inp); \ } while (0) case IP_RECVOPTS: OPTSET(INP_RECVOPTS); break; case IP_RECVRETOPTS: OPTSET(INP_RECVRETOPTS); break; case IP_RECVDSTADDR: OPTSET(INP_RECVDSTADDR); break; case IP_ORIGDSTADDR: OPTSET2(INP_ORIGDSTADDR, optval); break; case IP_RECVTTL: OPTSET(INP_RECVTTL); break; case IP_RECVIF: OPTSET(INP_RECVIF); break; case IP_ONESBCAST: OPTSET(INP_ONESBCAST); break; case IP_DONTFRAG: OPTSET(INP_DONTFRAG); break; case IP_BINDANY: OPTSET(INP_BINDANY); break; case IP_RECVTOS: OPTSET(INP_RECVTOS); break; case IP_BINDMULTI: OPTSET2(INP_BINDMULTI, optval); break; case IP_RECVFLOWID: OPTSET2(INP_RECVFLOWID, optval); break; #ifdef RSS case IP_RSS_LISTEN_BUCKET: if ((optval >= 0) && (optval < rss_getnumbuckets())) { 
inp->inp_rss_listen_bucket = optval; OPTSET2(INP_RSS_BUCKET_SET, 1); } else { error = EINVAL; } break; case IP_RECVRSSBUCKETID: OPTSET2(INP_RECVRSSBUCKETID, optval); break; #endif } break; #undef OPTSET #undef OPTSET2 /* * Multicast socket options are processed by the in_mcast * module. */ case IP_MULTICAST_IF: case IP_MULTICAST_VIF: case IP_MULTICAST_TTL: case IP_MULTICAST_LOOP: case IP_ADD_MEMBERSHIP: case IP_DROP_MEMBERSHIP: case IP_ADD_SOURCE_MEMBERSHIP: case IP_DROP_SOURCE_MEMBERSHIP: case IP_BLOCK_SOURCE: case IP_UNBLOCK_SOURCE: case IP_MSFILTER: case MCAST_JOIN_GROUP: case MCAST_LEAVE_GROUP: case MCAST_JOIN_SOURCE_GROUP: case MCAST_LEAVE_SOURCE_GROUP: case MCAST_BLOCK_SOURCE: case MCAST_UNBLOCK_SOURCE: error = inp_setmoptions(inp, sopt); break; case IP_PORTRANGE: error = sooptcopyin(sopt, &optval, sizeof optval, sizeof optval); if (error) break; INP_WLOCK(inp); switch (optval) { case IP_PORTRANGE_DEFAULT: inp->inp_flags &= ~(INP_LOWPORT); inp->inp_flags &= ~(INP_HIGHPORT); break; case IP_PORTRANGE_HIGH: inp->inp_flags &= ~(INP_LOWPORT); inp->inp_flags |= INP_HIGHPORT; break; case IP_PORTRANGE_LOW: inp->inp_flags &= ~(INP_HIGHPORT); inp->inp_flags |= INP_LOWPORT; break; default: error = EINVAL; break; } INP_WUNLOCK(inp); break; #if defined(IPSEC) || defined(IPSEC_SUPPORT) case IP_IPSEC_POLICY: if (IPSEC_ENABLED(ipv4)) { error = IPSEC_PCBCTL(ipv4, inp, sopt); break; } /* FALLTHROUGH */ #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; case SOPT_GET: switch (sopt->sopt_name) { case IP_OPTIONS: case IP_RETOPTS: INP_RLOCK(inp); if (inp->inp_options) { struct mbuf *options; options = m_dup(inp->inp_options, M_NOWAIT); INP_RUNLOCK(inp); if (options != NULL) { error = sooptcopyout(sopt, mtod(options, char *), options->m_len); m_freem(options); } else error = ENOMEM; } else { INP_RUNLOCK(inp); sopt->sopt_valsize = 0; } break; case IP_TOS: case IP_TTL: case IP_MINTTL: case IP_RECVOPTS: case IP_RECVRETOPTS: case IP_ORIGDSTADDR: case IP_RECVDSTADDR: case IP_RECVTTL: case IP_RECVIF: case IP_PORTRANGE: case IP_ONESBCAST: case IP_DONTFRAG: case IP_BINDANY: case IP_RECVTOS: case IP_BINDMULTI: case IP_FLOWID: case IP_FLOWTYPE: case IP_RECVFLOWID: #ifdef RSS case IP_RSSBUCKETID: case IP_RECVRSSBUCKETID: #endif switch (sopt->sopt_name) { case IP_TOS: optval = inp->inp_ip_tos; break; case IP_TTL: optval = inp->inp_ip_ttl; break; case IP_MINTTL: optval = inp->inp_ip_minttl; break; #define OPTBIT(bit) (inp->inp_flags & bit ? 1 : 0) #define OPTBIT2(bit) (inp->inp_flags2 & bit ? 
1 : 0) case IP_RECVOPTS: optval = OPTBIT(INP_RECVOPTS); break; case IP_RECVRETOPTS: optval = OPTBIT(INP_RECVRETOPTS); break; case IP_RECVDSTADDR: optval = OPTBIT(INP_RECVDSTADDR); break; case IP_ORIGDSTADDR: optval = OPTBIT2(INP_ORIGDSTADDR); break; case IP_RECVTTL: optval = OPTBIT(INP_RECVTTL); break; case IP_RECVIF: optval = OPTBIT(INP_RECVIF); break; case IP_PORTRANGE: if (inp->inp_flags & INP_HIGHPORT) optval = IP_PORTRANGE_HIGH; else if (inp->inp_flags & INP_LOWPORT) optval = IP_PORTRANGE_LOW; else optval = 0; break; case IP_ONESBCAST: optval = OPTBIT(INP_ONESBCAST); break; case IP_DONTFRAG: optval = OPTBIT(INP_DONTFRAG); break; case IP_BINDANY: optval = OPTBIT(INP_BINDANY); break; case IP_RECVTOS: optval = OPTBIT(INP_RECVTOS); break; case IP_FLOWID: optval = inp->inp_flowid; break; case IP_FLOWTYPE: optval = inp->inp_flowtype; break; case IP_RECVFLOWID: optval = OPTBIT2(INP_RECVFLOWID); break; #ifdef RSS case IP_RSSBUCKETID: retval = rss_hash2bucket(inp->inp_flowid, inp->inp_flowtype, &rss_bucket); if (retval == 0) optval = rss_bucket; else error = EINVAL; break; case IP_RECVRSSBUCKETID: optval = OPTBIT2(INP_RECVRSSBUCKETID); break; #endif case IP_BINDMULTI: optval = OPTBIT2(INP_BINDMULTI); break; } error = sooptcopyout(sopt, &optval, sizeof optval); break; /* * Multicast socket options are processed by the in_mcast * module. */ case IP_MULTICAST_IF: case IP_MULTICAST_VIF: case IP_MULTICAST_TTL: case IP_MULTICAST_LOOP: case IP_MSFILTER: error = inp_getmoptions(inp, sopt); break; #if defined(IPSEC) || defined(IPSEC_SUPPORT) case IP_IPSEC_POLICY: if (IPSEC_ENABLED(ipv4)) { error = IPSEC_PCBCTL(ipv4, inp, sopt); break; } /* FALLTHROUGH */ #endif /* IPSEC */ default: error = ENOPROTOOPT; break; } break; } return (error); } /* * Routine called from ip_output() to loop back a copy of an IP multicast * packet to the input queue of a specified interface. Note that this * calls the output routine of the loopback "driver", but with an interface * pointer that might NOT be a loopback interface -- evil, but easier than * replicating that code here. */ static void ip_mloopback(struct ifnet *ifp, const struct mbuf *m, int hlen) { struct ip *ip; struct mbuf *copym; /* * Make a deep copy of the packet because we're going to * modify the pack in order to generate checksums. */ copym = m_dup(m, M_NOWAIT); if (copym != NULL && (!M_WRITABLE(copym) || copym->m_len < hlen)) copym = m_pullup(copym, hlen); if (copym != NULL) { /* If needed, compute the checksum and mark it as valid. */ if (copym->m_pkthdr.csum_flags & CSUM_DELAY_DATA) { in_delayed_cksum(copym); copym->m_pkthdr.csum_flags &= ~CSUM_DELAY_DATA; copym->m_pkthdr.csum_flags |= CSUM_DATA_VALID | CSUM_PSEUDO_HDR; copym->m_pkthdr.csum_data = 0xffff; } /* * We don't bother to fragment if the IP length is greater * than the interface's MTU. Can this possibly matter? */ ip = mtod(copym, struct ip *); ip->ip_sum = 0; ip->ip_sum = in_cksum(copym, hlen); if_simloop(ifp, copym, AF_INET, 0); } } Index: projects/openssl111/sys/netinet/sctp_output.c =================================================================== --- projects/openssl111/sys/netinet/sctp_output.c (revision 339239) +++ projects/openssl111/sys/netinet/sctp_output.c (revision 339240) @@ -1,13867 +1,13848 @@ /*- * SPDX-License-Identifier: BSD-3-Clause * * Copyright (c) 2001-2008, by Cisco Systems, Inc. All rights reserved. * Copyright (c) 2008-2012, by Randall Stewart. All rights reserved. * Copyright (c) 2008-2012, by Michael Tuexen. All rights reserved. 
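The ip_ctloutput() handler above is the kernel half of the ordinary IPPROTO_IP socket-option interface. As an illustrative aside (not part of this change), a minimal userland sketch exercising a few of the options it services; error handling omitted:

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>

int
main(void)
{
	int s = socket(AF_INET, SOCK_DGRAM, 0);
	int tos = 0x10;				/* IPTOS_LOWDELAY */
	int range = IP_PORTRANGE_HIGH;
	int on = 1, ttl = 0;
	socklen_t len = sizeof(ttl);

	setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));		/* SOPT_SET, IP_TOS case */
	setsockopt(s, IPPROTO_IP, IP_PORTRANGE, &range, sizeof(range));
	setsockopt(s, IPPROTO_IP, IP_RECVTTL, &on, sizeof(on));	/* OPTSET(INP_RECVTTL) */
	getsockopt(s, IPPROTO_IP, IP_TTL, &ttl, &len);			/* SOPT_GET, IP_TTL case */
	printf("default ttl: %d\n", ttl);
	return (0);
}

Boolean options such as IP_RECVTTL go through the OPTSET/OPTSET2 macros above, which flip the corresponding INP_* flag in the inpcb under the write lock; scalar options like IP_TOS and IP_TTL are stored directly in inp_ip_tos and inp_ip_ttl.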
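ip_mloopback() above zeroes ip_sum and recomputes it with in_cksum() over the first hlen bytes of the copy. What in_cksum() produces is the standard Internet checksum (RFC 1071): the ones'-complement of the 16-bit ones'-complement sum of the data. A minimal non-kernel sketch of that arithmetic (the real routine additionally walks mbuf chains):

#include <stdint.h>
#include <stddef.h>

/* RFC 1071 Internet checksum over a contiguous buffer. */
static uint16_t
cksum_rfc1071(const void *buf, size_t len)
{
	const uint16_t *p = buf;
	uint32_t sum = 0;

	while (len > 1) {
		sum += *p++;
		len -= 2;
	}
	if (len == 1)				/* odd trailing byte */
		sum += *(const uint8_t *)p;
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}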
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * a) Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * b) Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the distribution. * * c) Neither the name of Cisco Systems, Inc. nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #if defined(INET) || defined(INET6) #include #endif #include #include #include #define SCTP_MAX_GAPS_INARRAY 4 struct sack_track { uint8_t right_edge; /* mergable on the right edge */ uint8_t left_edge; /* mergable on the left edge */ uint8_t num_entries; uint8_t spare; struct sctp_gap_ack_block gaps[SCTP_MAX_GAPS_INARRAY]; }; const struct sack_track sack_array[256] = { {0, 0, 0, 0, /* 0x00 */ {{0, 0}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x01 */ {{0, 0}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x02 */ {{1, 1}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x03 */ {{0, 1}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x04 */ {{2, 2}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x05 */ {{0, 0}, {2, 2}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x06 */ {{1, 2}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x07 */ {{0, 2}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x08 */ {{3, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x09 */ {{0, 0}, {3, 3}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x0a */ {{1, 1}, {3, 3}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x0b */ {{0, 1}, {3, 3}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x0c */ {{2, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x0d */ {{0, 0}, {2, 3}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x0e */ {{1, 3}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x0f */ {{0, 3}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x10 */ {{4, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x11 */ {{0, 0}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x12 */ {{1, 1}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x13 */ {{0, 1}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x14 */ {{2, 2}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x15 */ {{0, 0}, {2, 2}, {4, 4}, {0, 0} } }, {0, 0, 2, 0, /* 0x16 */ {{1, 2}, {4, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x17 */ {{0, 2}, {4, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x18 */ {{3, 4}, {0, 0}, {0, 0}, {0, 0} 
} }, {1, 0, 2, 0, /* 0x19 */ {{0, 0}, {3, 4}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x1a */ {{1, 1}, {3, 4}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x1b */ {{0, 1}, {3, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x1c */ {{2, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x1d */ {{0, 0}, {2, 4}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x1e */ {{1, 4}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x1f */ {{0, 4}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x20 */ {{5, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x21 */ {{0, 0}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x22 */ {{1, 1}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x23 */ {{0, 1}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x24 */ {{2, 2}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x25 */ {{0, 0}, {2, 2}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x26 */ {{1, 2}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x27 */ {{0, 2}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x28 */ {{3, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x29 */ {{0, 0}, {3, 3}, {5, 5}, {0, 0} } }, {0, 0, 3, 0, /* 0x2a */ {{1, 1}, {3, 3}, {5, 5}, {0, 0} } }, {1, 0, 3, 0, /* 0x2b */ {{0, 1}, {3, 3}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x2c */ {{2, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x2d */ {{0, 0}, {2, 3}, {5, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x2e */ {{1, 3}, {5, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x2f */ {{0, 3}, {5, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x30 */ {{4, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x31 */ {{0, 0}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x32 */ {{1, 1}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x33 */ {{0, 1}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x34 */ {{2, 2}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x35 */ {{0, 0}, {2, 2}, {4, 5}, {0, 0} } }, {0, 0, 2, 0, /* 0x36 */ {{1, 2}, {4, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x37 */ {{0, 2}, {4, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x38 */ {{3, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x39 */ {{0, 0}, {3, 5}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x3a */ {{1, 1}, {3, 5}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x3b */ {{0, 1}, {3, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x3c */ {{2, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x3d */ {{0, 0}, {2, 5}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x3e */ {{1, 5}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x3f */ {{0, 5}, {0, 0}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x40 */ {{6, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x41 */ {{0, 0}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x42 */ {{1, 1}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x43 */ {{0, 1}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x44 */ {{2, 2}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x45 */ {{0, 0}, {2, 2}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x46 */ {{1, 2}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x47 */ {{0, 2}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x48 */ {{3, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x49 */ {{0, 0}, {3, 3}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x4a */ {{1, 1}, {3, 3}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x4b */ {{0, 1}, {3, 3}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x4c */ {{2, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x4d */ {{0, 0}, {2, 3}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x4e */ {{1, 3}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x4f */ {{0, 3}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x50 */ {{4, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x51 */ {{0, 0}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x52 */ {{1, 1}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 
0, /* 0x53 */ {{0, 1}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x54 */ {{2, 2}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 4, 0, /* 0x55 */ {{0, 0}, {2, 2}, {4, 4}, {6, 6} } }, {0, 0, 3, 0, /* 0x56 */ {{1, 2}, {4, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x57 */ {{0, 2}, {4, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x58 */ {{3, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x59 */ {{0, 0}, {3, 4}, {6, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x5a */ {{1, 1}, {3, 4}, {6, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x5b */ {{0, 1}, {3, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x5c */ {{2, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x5d */ {{0, 0}, {2, 4}, {6, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x5e */ {{1, 4}, {6, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x5f */ {{0, 4}, {6, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x60 */ {{5, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x61 */ {{0, 0}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x62 */ {{1, 1}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x63 */ {{0, 1}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x64 */ {{2, 2}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x65 */ {{0, 0}, {2, 2}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x66 */ {{1, 2}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x67 */ {{0, 2}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x68 */ {{3, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x69 */ {{0, 0}, {3, 3}, {5, 6}, {0, 0} } }, {0, 0, 3, 0, /* 0x6a */ {{1, 1}, {3, 3}, {5, 6}, {0, 0} } }, {1, 0, 3, 0, /* 0x6b */ {{0, 1}, {3, 3}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x6c */ {{2, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x6d */ {{0, 0}, {2, 3}, {5, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x6e */ {{1, 3}, {5, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x6f */ {{0, 3}, {5, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x70 */ {{4, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x71 */ {{0, 0}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x72 */ {{1, 1}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x73 */ {{0, 1}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x74 */ {{2, 2}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 3, 0, /* 0x75 */ {{0, 0}, {2, 2}, {4, 6}, {0, 0} } }, {0, 0, 2, 0, /* 0x76 */ {{1, 2}, {4, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x77 */ {{0, 2}, {4, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x78 */ {{3, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x79 */ {{0, 0}, {3, 6}, {0, 0}, {0, 0} } }, {0, 0, 2, 0, /* 0x7a */ {{1, 1}, {3, 6}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x7b */ {{0, 1}, {3, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x7c */ {{2, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 2, 0, /* 0x7d */ {{0, 0}, {2, 6}, {0, 0}, {0, 0} } }, {0, 0, 1, 0, /* 0x7e */ {{1, 6}, {0, 0}, {0, 0}, {0, 0} } }, {1, 0, 1, 0, /* 0x7f */ {{0, 6}, {0, 0}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0x80 */ {{7, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x81 */ {{0, 0}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x82 */ {{1, 1}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x83 */ {{0, 1}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x84 */ {{2, 2}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x85 */ {{0, 0}, {2, 2}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x86 */ {{1, 2}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x87 */ {{0, 2}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x88 */ {{3, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x89 */ {{0, 0}, {3, 3}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x8a */ {{1, 1}, {3, 3}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x8b */ {{0, 1}, {3, 3}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x8c */ {{2, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x8d */ 
{{0, 0}, {2, 3}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x8e */ {{1, 3}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x8f */ {{0, 3}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0x90 */ {{4, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x91 */ {{0, 0}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x92 */ {{1, 1}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x93 */ {{0, 1}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x94 */ {{2, 2}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0x95 */ {{0, 0}, {2, 2}, {4, 4}, {7, 7} } }, {0, 1, 3, 0, /* 0x96 */ {{1, 2}, {4, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x97 */ {{0, 2}, {4, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x98 */ {{3, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x99 */ {{0, 0}, {3, 4}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0x9a */ {{1, 1}, {3, 4}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0x9b */ {{0, 1}, {3, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x9c */ {{2, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0x9d */ {{0, 0}, {2, 4}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0x9e */ {{1, 4}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0x9f */ {{0, 4}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xa0 */ {{5, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xa1 */ {{0, 0}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa2 */ {{1, 1}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xa3 */ {{0, 1}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa4 */ {{2, 2}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xa5 */ {{0, 0}, {2, 2}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xa6 */ {{1, 2}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xa7 */ {{0, 2}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xa8 */ {{3, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xa9 */ {{0, 0}, {3, 3}, {5, 5}, {7, 7} } }, {0, 1, 4, 0, /* 0xaa */ {{1, 1}, {3, 3}, {5, 5}, {7, 7} } }, {1, 1, 4, 0, /* 0xab */ {{0, 1}, {3, 3}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xac */ {{2, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xad */ {{0, 0}, {2, 3}, {5, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xae */ {{1, 3}, {5, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xaf */ {{0, 3}, {5, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xb0 */ {{4, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xb1 */ {{0, 0}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xb2 */ {{1, 1}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xb3 */ {{0, 1}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xb4 */ {{2, 2}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xb5 */ {{0, 0}, {2, 2}, {4, 5}, {7, 7} } }, {0, 1, 3, 0, /* 0xb6 */ {{1, 2}, {4, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xb7 */ {{0, 2}, {4, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xb8 */ {{3, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xb9 */ {{0, 0}, {3, 5}, {7, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xba */ {{1, 1}, {3, 5}, {7, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xbb */ {{0, 1}, {3, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xbc */ {{2, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xbd */ {{0, 0}, {2, 5}, {7, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xbe */ {{1, 5}, {7, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xbf */ {{0, 5}, {7, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xc0 */ {{6, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc1 */ {{0, 0}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc2 */ {{1, 1}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc3 */ {{0, 1}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc4 */ {{2, 2}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xc5 */ {{0, 0}, {2, 2}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xc6 */ {{1, 2}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xc7 */ {{0, 2}, {6, 7}, 
{0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xc8 */ {{3, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xc9 */ {{0, 0}, {3, 3}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xca */ {{1, 1}, {3, 3}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xcb */ {{0, 1}, {3, 3}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xcc */ {{2, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xcd */ {{0, 0}, {2, 3}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xce */ {{1, 3}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xcf */ {{0, 3}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xd0 */ {{4, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xd1 */ {{0, 0}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xd2 */ {{1, 1}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xd3 */ {{0, 1}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xd4 */ {{2, 2}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 4, 0, /* 0xd5 */ {{0, 0}, {2, 2}, {4, 4}, {6, 7} } }, {0, 1, 3, 0, /* 0xd6 */ {{1, 2}, {4, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xd7 */ {{0, 2}, {4, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xd8 */ {{3, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xd9 */ {{0, 0}, {3, 4}, {6, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xda */ {{1, 1}, {3, 4}, {6, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xdb */ {{0, 1}, {3, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xdc */ {{2, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xdd */ {{0, 0}, {2, 4}, {6, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xde */ {{1, 4}, {6, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xdf */ {{0, 4}, {6, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xe0 */ {{5, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe1 */ {{0, 0}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe2 */ {{1, 1}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe3 */ {{0, 1}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe4 */ {{2, 2}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xe5 */ {{0, 0}, {2, 2}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xe6 */ {{1, 2}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xe7 */ {{0, 2}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xe8 */ {{3, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xe9 */ {{0, 0}, {3, 3}, {5, 7}, {0, 0} } }, {0, 1, 3, 0, /* 0xea */ {{1, 1}, {3, 3}, {5, 7}, {0, 0} } }, {1, 1, 3, 0, /* 0xeb */ {{0, 1}, {3, 3}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xec */ {{2, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xed */ {{0, 0}, {2, 3}, {5, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xee */ {{1, 3}, {5, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xef */ {{0, 3}, {5, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xf0 */ {{4, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf1 */ {{0, 0}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xf2 */ {{1, 1}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf3 */ {{0, 1}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xf4 */ {{2, 2}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 3, 0, /* 0xf5 */ {{0, 0}, {2, 2}, {4, 7}, {0, 0} } }, {0, 1, 2, 0, /* 0xf6 */ {{1, 2}, {4, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf7 */ {{0, 2}, {4, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xf8 */ {{3, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xf9 */ {{0, 0}, {3, 7}, {0, 0}, {0, 0} } }, {0, 1, 2, 0, /* 0xfa */ {{1, 1}, {3, 7}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xfb */ {{0, 1}, {3, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xfc */ {{2, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 2, 0, /* 0xfd */ {{0, 0}, {2, 7}, {0, 0}, {0, 0} } }, {0, 1, 1, 0, /* 0xfe */ {{1, 7}, {0, 0}, {0, 0}, {0, 0} } }, {1, 1, 1, 0, /* 0xff */ {{0, 7}, {0, 0}, {0, 0}, {0, 0} } } }; int sctp_is_address_in_scope(struct sctp_ifa *ifa, struct sctp_scoping *scope, int do_update) { if 
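The sack_array table above precomputes, for each possible value of one byte taken from the SACK mapping array, the gap-ack fragments that byte contributes: the bit-offset ranges of set bits, how many fragments there are, and whether a run touches bit 0 (right_edge) or bit 7 (left_edge) and can therefore be merged with a run in the neighbouring byte. A worked example, a fragment only, with the values copied verbatim from the entry for 0x15:

	/* 0x15 = 0b00010101: bits 0, 2 and 4 set, three isolated fragments. */
	const struct sack_track *t = &sack_array[0x15];

	/* Table entry: {1, 0, 3, 0, {{0, 0}, {2, 2}, {4, 4}, {0, 0}}} */
	/* t->right_edge  == 1   bit 0 is part of a run                */
	/* t->left_edge   == 0   bit 7 is not                          */
	/* t->num_entries == 3   three gap-ack blocks from this byte   */
	/* t->gaps[0] = {0, 0}, t->gaps[1] = {2, 2}, t->gaps[2] = {4, 4} */

Since bit j of byte i in a bitmap stands for offset 8*i + j, these per-byte offsets translate into absolute gap ranges by adding eight times the byte's position in the mapping array.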
((scope->loopback_scope == 0) && (ifa->ifn_p) && SCTP_IFN_IS_IFT_LOOP(ifa->ifn_p)) { /* * skip loopback if not in scope * */ return (0); } switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: if (scope->ipv4_addr_legal) { struct sockaddr_in *sin; sin = &ifa->address.sin; if (sin->sin_addr.s_addr == 0) { /* not in scope , unspecified */ return (0); } if ((scope->ipv4_local_scope == 0) && (IN4_ISPRIVATE_ADDRESS(&sin->sin_addr))) { /* private address not in scope */ return (0); } } else { return (0); } break; #endif #ifdef INET6 case AF_INET6: if (scope->ipv6_addr_legal) { struct sockaddr_in6 *sin6; /* * Must update the flags, bummer, which means any * IFA locks must now be applied HERE <-> */ if (do_update) { sctp_gather_internal_ifa_flags(ifa); } if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { return (0); } /* ok to use deprecated addresses? */ sin6 = &ifa->address.sin6; if (IN6_IS_ADDR_UNSPECIFIED(&sin6->sin6_addr)) { /* skip unspecifed addresses */ return (0); } if ( /* (local_scope == 0) && */ (IN6_IS_ADDR_LINKLOCAL(&sin6->sin6_addr))) { return (0); } if ((scope->site_scope == 0) && (IN6_IS_ADDR_SITELOCAL(&sin6->sin6_addr))) { return (0); } } else { return (0); } break; #endif default: return (0); } return (1); } static struct mbuf * sctp_add_addr_to_mbuf(struct mbuf *m, struct sctp_ifa *ifa, uint16_t *len) { #if defined(INET) || defined(INET6) struct sctp_paramhdr *paramh; struct mbuf *mret; uint16_t plen; #endif switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: plen = (uint16_t)sizeof(struct sctp_ipv4addr_param); break; #endif #ifdef INET6 case AF_INET6: plen = (uint16_t)sizeof(struct sctp_ipv6addr_param); break; #endif default: return (m); } #if defined(INET) || defined(INET6) if (M_TRAILINGSPACE(m) >= plen) { /* easy side we just drop it on the end */ paramh = (struct sctp_paramhdr *)(SCTP_BUF_AT(m, SCTP_BUF_LEN(m))); mret = m; } else { /* Need more space */ mret = m; while (SCTP_BUF_NEXT(mret) != NULL) { mret = SCTP_BUF_NEXT(mret); } SCTP_BUF_NEXT(mret) = sctp_get_mbuf_for_msg(plen, 0, M_NOWAIT, 1, MT_DATA); if (SCTP_BUF_NEXT(mret) == NULL) { /* We are hosed, can't add more addresses */ return (m); } mret = SCTP_BUF_NEXT(mret); paramh = mtod(mret, struct sctp_paramhdr *); } /* now add the parameter */ switch (ifa->address.sa.sa_family) { #ifdef INET case AF_INET: { struct sctp_ipv4addr_param *ipv4p; struct sockaddr_in *sin; sin = &ifa->address.sin; ipv4p = (struct sctp_ipv4addr_param *)paramh; paramh->param_type = htons(SCTP_IPV4_ADDRESS); paramh->param_length = htons(plen); ipv4p->addr = sin->sin_addr.s_addr; SCTP_BUF_LEN(mret) += plen; break; } #endif #ifdef INET6 case AF_INET6: { struct sctp_ipv6addr_param *ipv6p; struct sockaddr_in6 *sin6; sin6 = &ifa->address.sin6; ipv6p = (struct sctp_ipv6addr_param *)paramh; paramh->param_type = htons(SCTP_IPV6_ADDRESS); paramh->param_length = htons(plen); memcpy(ipv6p->addr, &sin6->sin6_addr, sizeof(ipv6p->addr)); /* clear embedded scope in the address */ in6_clearscope((struct in6_addr *)ipv6p->addr); SCTP_BUF_LEN(mret) += plen; break; } #endif default: return (m); } if (len != NULL) { *len += plen; } return (mret); #endif } struct mbuf * sctp_add_addresses_to_i_ia(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_scoping *scope, struct mbuf *m_at, int cnt_inits_to, uint16_t *padding_len, uint16_t *chunk_len) { struct sctp_vrf *vrf = NULL; int cnt, limit_out = 0, total_count; uint32_t vrf_id; vrf_id = inp->def_vrf_id; SCTP_IPI_ADDR_RLOCK(); vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) { 
SCTP_IPI_ADDR_RUNLOCK(); return (m_at); } if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUNDALL) { struct sctp_ifa *sctp_ifap; struct sctp_ifn *sctp_ifnp; cnt = cnt_inits_to; if (vrf->total_ifa_count > SCTP_COUNT_LIMIT) { limit_out = 1; cnt = SCTP_ADDRESS_LIMIT; goto skip_count; } LIST_FOREACH(sctp_ifnp, &vrf->ifnlist, next_ifn) { if ((scope->loopback_scope == 0) && SCTP_IFN_IS_IFT_LOOP(sctp_ifnp)) { /* * Skip loopback devices if loopback_scope * not set */ continue; } LIST_FOREACH(sctp_ifap, &sctp_ifnp->ifalist, next_ifa) { #ifdef INET if ((sctp_ifap->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifap->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin6.sin6_addr) != 0)) { continue; } #endif if (sctp_is_addr_restricted(stcb, sctp_ifap)) { continue; } if (sctp_is_address_in_scope(sctp_ifap, scope, 1) == 0) { continue; } cnt++; if (cnt > SCTP_ADDRESS_LIMIT) { break; } } if (cnt > SCTP_ADDRESS_LIMIT) { break; } } skip_count: if (cnt > 1) { total_count = 0; LIST_FOREACH(sctp_ifnp, &vrf->ifnlist, next_ifn) { cnt = 0; if ((scope->loopback_scope == 0) && SCTP_IFN_IS_IFT_LOOP(sctp_ifnp)) { /* * Skip loopback devices if * loopback_scope not set */ continue; } LIST_FOREACH(sctp_ifap, &sctp_ifnp->ifalist, next_ifa) { #ifdef INET if ((sctp_ifap->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifap->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifap->address.sin6.sin6_addr) != 0)) { continue; } #endif if (sctp_is_addr_restricted(stcb, sctp_ifap)) { continue; } if (sctp_is_address_in_scope(sctp_ifap, scope, 0) == 0) { continue; } if ((chunk_len != NULL) && (padding_len != NULL) && (*padding_len > 0)) { memset(mtod(m_at, caddr_t)+*chunk_len, 0, *padding_len); SCTP_BUF_LEN(m_at) += *padding_len; *chunk_len += *padding_len; *padding_len = 0; } m_at = sctp_add_addr_to_mbuf(m_at, sctp_ifap, chunk_len); if (limit_out) { cnt++; total_count++; if (cnt >= 2) { /* * two from each * address */ break; } if (total_count > SCTP_ADDRESS_LIMIT) { /* No more addresses */ break; } } } } } } else { struct sctp_laddr *laddr; cnt = cnt_inits_to; /* First, how many ? */ LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { continue; } if (laddr->ifa->localifa_flags & SCTP_BEING_DELETED) /* * Address being deleted by the system, dont * list. */ continue; if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* * Address being deleted on this ep don't * list. */ continue; } if (sctp_is_address_in_scope(laddr->ifa, scope, 1) == 0) { continue; } cnt++; } /* * To get through a NAT we only list addresses if we have * more than one. That way if you just bind a single address * we let the source of the init dictate our address. 
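sctp_add_addr_to_mbuf() above appends one address parameter per listed local address to the INIT/INIT-ACK chunk being built. The encoding is the usual SCTP parameter TLV (RFC 4960): a 4-byte header of type and length followed by the raw address, giving 8 bytes for an IPv4 Address Parameter (type 5) and 20 bytes for an IPv6 Address Parameter (type 6), which is what the sizeof(struct sctp_ipv4addr_param) and sizeof(struct sctp_ipv6addr_param) lengths above come out to. For illustration, the bytes emitted for the IPv4 address 192.0.2.1 would be:

	0x00 0x05   0x00 0x08   0xc0 0x00 0x02 0x01
	|  type 5 | | length 8 | |     address     |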
*/ if (cnt > 1) { cnt = cnt_inits_to; LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { continue; } if (laddr->ifa->localifa_flags & SCTP_BEING_DELETED) { continue; } if (sctp_is_address_in_scope(laddr->ifa, scope, 0) == 0) { continue; } if ((chunk_len != NULL) && (padding_len != NULL) && (*padding_len > 0)) { memset(mtod(m_at, caddr_t)+*chunk_len, 0, *padding_len); SCTP_BUF_LEN(m_at) += *padding_len; *chunk_len += *padding_len; *padding_len = 0; } m_at = sctp_add_addr_to_mbuf(m_at, laddr->ifa, chunk_len); cnt++; if (cnt >= SCTP_ADDRESS_LIMIT) { break; } } } } SCTP_IPI_ADDR_RUNLOCK(); return (m_at); } static struct sctp_ifa * sctp_is_ifa_addr_preferred(struct sctp_ifa *ifa, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { uint8_t dest_is_global = 0; /* dest_is_priv is true if destination is a private address */ /* dest_is_loop is true if destination is a loopback addresses */ /** * Here we determine if its a preferred address. A preferred address * means it is the same scope or higher scope then the destination. * L = loopback, P = private, G = global * ----------------------------------------- * src | dest | result * ---------------------------------------- * L | L | yes * ----------------------------------------- * P | L | yes-v4 no-v6 * ----------------------------------------- * G | L | yes-v4 no-v6 * ----------------------------------------- * L | P | no * ----------------------------------------- * P | P | yes * ----------------------------------------- * G | P | no * ----------------------------------------- * L | G | no * ----------------------------------------- * P | G | no * ----------------------------------------- * G | G | yes * ----------------------------------------- */ if (ifa->address.sa.sa_family != fam) { /* forget mis-matched family */ return (NULL); } if ((dest_is_priv == 0) && (dest_is_loop == 0)) { dest_is_global = 1; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "Is destination preferred:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &ifa->address.sa); /* Ok the address may be ok */ #ifdef INET6 if (fam == AF_INET6) { /* ok to use deprecated addresses? no lets not! 
*/ if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:1\n"); return (NULL); } if (ifa->src_is_priv && !ifa->src_is_loop) { if (dest_is_loop) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:2\n"); return (NULL); } } if (ifa->src_is_glob) { if (dest_is_loop) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:3\n"); return (NULL); } } } #endif /* * Now that we know what is what, implement or table this could in * theory be done slicker (it used to be), but this is * straightforward and easier to validate :-) */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "src_loop:%d src_priv:%d src_glob:%d\n", ifa->src_is_loop, ifa->src_is_priv, ifa->src_is_glob); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dest_loop:%d dest_priv:%d dest_glob:%d\n", dest_is_loop, dest_is_priv, dest_is_global); if ((ifa->src_is_loop) && (dest_is_priv)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:4\n"); return (NULL); } if ((ifa->src_is_glob) && (dest_is_priv)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:5\n"); return (NULL); } if ((ifa->src_is_loop) && (dest_is_global)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:6\n"); return (NULL); } if ((ifa->src_is_priv) && (dest_is_global)) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "NO:7\n"); return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "YES\n"); /* its a preferred address */ return (ifa); } static struct sctp_ifa * sctp_is_ifa_addr_acceptable(struct sctp_ifa *ifa, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { uint8_t dest_is_global = 0; /** * Here we determine if its a acceptable address. A acceptable * address means it is the same scope or higher scope but we can * allow for NAT which means its ok to have a global dest and a * private src. * * L = loopback, P = private, G = global * ----------------------------------------- * src | dest | result * ----------------------------------------- * L | L | yes * ----------------------------------------- * P | L | yes-v4 no-v6 * ----------------------------------------- * G | L | yes * ----------------------------------------- * L | P | no * ----------------------------------------- * P | P | yes * ----------------------------------------- * G | P | yes - May not work * ----------------------------------------- * L | G | no * ----------------------------------------- * P | G | yes - May not work * ----------------------------------------- * G | G | yes * ----------------------------------------- */ if (ifa->address.sa.sa_family != fam) { /* forget non matching family */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa_fam:%d fam:%d\n", ifa->address.sa.sa_family, fam); return (NULL); } /* Ok the address may be ok */ SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, &ifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dst_is_loop:%d dest_is_priv:%d\n", dest_is_loop, dest_is_priv); if ((dest_is_loop == 0) && (dest_is_priv == 0)) { dest_is_global = 1; } #ifdef INET6 if (fam == AF_INET6) { /* ok to use deprecated addresses? */ if (ifa->localifa_flags & SCTP_ADDR_IFA_UNUSEABLE) { return (NULL); } if (ifa->src_is_priv) { /* Special case, linklocal to loop */ if (dest_is_loop) return (NULL); } } #endif /* * Now that we know what is what, implement our table. 
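The L/P/G matrices above boil down to small predicates over scope classes. A sketch of the IPv4 "preferred" column as a standalone function (the enum and function names are invented for illustration; the real code also applies the IPv6 deprecated-address and link-local exceptions handled separately above):

enum addr_scope { SCOPE_LOOPBACK, SCOPE_PRIVATE, SCOPE_GLOBAL };

/*
 * IPv4 "preferred" rule from the matrix: the source must match the
 * destination's scope class, except that any source may reach a
 * loopback destination.
 */
static int
src_is_preferred_v4(enum addr_scope src, enum addr_scope dst)
{
	switch (dst) {
	case SCOPE_LOOPBACK:
		return (1);			/* L, P, G -> L: yes for v4 */
	case SCOPE_PRIVATE:
		return (src == SCOPE_PRIVATE);	/* only P -> P */
	case SCOPE_GLOBAL:
		return (src == SCOPE_GLOBAL);	/* only G -> G */
	}
	return (0);
}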
This could in * theory be done slicker (it used to be), but this is * straightforward and easier to validate :-) */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa->src_is_loop:%d dest_is_priv:%d\n", ifa->src_is_loop, dest_is_priv); if ((ifa->src_is_loop == 1) && (dest_is_priv)) { return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "ifa->src_is_loop:%d dest_is_glob:%d\n", ifa->src_is_loop, dest_is_global); if ((ifa->src_is_loop == 1) && (dest_is_global)) { return (NULL); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "address is acceptable\n"); /* its an acceptable address */ return (ifa); } int sctp_is_addr_restricted(struct sctp_tcb *stcb, struct sctp_ifa *ifa) { struct sctp_laddr *laddr; if (stcb == NULL) { /* There are no restrictions, no TCB :-) */ return (0); } LIST_FOREACH(laddr, &stcb->asoc.sctp_restricted_addrs, sctp_nxt_addr) { if (laddr->ifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "%s: NULL ifa\n", __func__); continue; } if (laddr->ifa == ifa) { /* Yes it is on the list */ return (1); } } return (0); } int sctp_is_addr_in_ep(struct sctp_inpcb *inp, struct sctp_ifa *ifa) { struct sctp_laddr *laddr; if (ifa == NULL) return (0); LIST_FOREACH(laddr, &inp->sctp_addr_list, sctp_nxt_addr) { if (laddr->ifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "%s: NULL ifa\n", __func__); continue; } if ((laddr->ifa == ifa) && laddr->action == 0) /* same pointer */ return (1); } return (0); } static struct sctp_ifa * sctp_choose_boundspecific_inp(struct sctp_inpcb *inp, sctp_route_t *ro, uint32_t vrf_id, int non_asoc_addr_ok, uint8_t dest_is_priv, uint8_t dest_is_loop, sa_family_t fam) { struct sctp_laddr *laddr, *starting_point; void *ifn; int resettotop = 0; struct sctp_ifn *sctp_ifn; struct sctp_ifa *sctp_ifa, *sifa; struct sctp_vrf *vrf; uint32_t ifn_index; vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); sctp_ifn = sctp_find_ifn(ifn, ifn_index); /* * first question, is the ifn we will emit on in our list, if so, we * want such an address. Note that we first looked for a preferred * address. */ if (sctp_ifn) { /* is a preferred one on the interface we route out? */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_preferred(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (sctp_is_addr_in_ep(inp, sifa)) { atomic_add_int(&sifa->refcount, 1); return (sifa); } } } /* * ok, now we now need to find one on the list of the addresses. We * can't get one on the emitting interface so let's find first a * preferred one. If not that an acceptable one otherwise... we * return NULL. 
*/ starting_point = inp->next_addr_touse; once_again: if (inp->next_addr_touse == NULL) { inp->next_addr_touse = LIST_FIRST(&inp->sctp_addr_list); resettotop = 1; } for (laddr = inp->next_addr_touse; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_preferred(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (resettotop == 0) { inp->next_addr_touse = NULL; goto once_again; } inp->next_addr_touse = starting_point; resettotop = 0; once_again_too: if (inp->next_addr_touse == NULL) { inp->next_addr_touse = LIST_FIRST(&inp->sctp_addr_list); resettotop = 1; } /* ok, what about an acceptable address in the inp */ for (laddr = inp->next_addr_touse; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_acceptable(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (resettotop == 0) { inp->next_addr_touse = NULL; goto once_again_too; } /* * no address bound can be a source for the destination we are in * trouble */ return (NULL); } static struct sctp_ifa * sctp_choose_boundspecific_stcb(struct sctp_inpcb *inp, struct sctp_tcb *stcb, sctp_route_t *ro, uint32_t vrf_id, uint8_t dest_is_priv, uint8_t dest_is_loop, int non_asoc_addr_ok, sa_family_t fam) { struct sctp_laddr *laddr, *starting_point; void *ifn; struct sctp_ifn *sctp_ifn; struct sctp_ifa *sctp_ifa, *sifa; uint8_t start_at_beginning = 0; struct sctp_vrf *vrf; uint32_t ifn_index; /* * first question, is the ifn we will emit on in our list, if so, we * want that one. */ vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); sctp_ifn = sctp_find_ifn(ifn, ifn_index); /* * first question, is the ifn we will emit on in our list? If so, * we want that one. First we look for a preferred. Second, we go * for an acceptable. 
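The once_again / once_again_too loops above implement a cursor scan with a single wrap: resume at inp->next_addr_touse, run to the end of the bound-address list, then start over once from the head so every entry is considered. The same idea on a plain array, as a self-contained sketch with invented names:

/* Scan nitems entries starting at *cursor, wrapping once, and return the
 * index of the first entry accepted by ok(); -1 if none qualifies. */
static int
scan_with_wrap(const int *items, int nitems, int *cursor,
    int (*ok)(int))
{
	int i, idx;

	for (i = 0; i < nitems; i++) {
		idx = (*cursor + i) % nitems;
		if (ok(items[idx])) {
			*cursor = idx;		/* remember where we stopped */
			return (idx);
		}
	}
	return (-1);
}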
*/ if (sctp_ifn) { /* first try for a preferred address on the ep */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; if (sctp_is_addr_in_ep(inp, sctp_ifa)) { sifa = sctp_is_ifa_addr_preferred(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } atomic_add_int(&sifa->refcount, 1); return (sifa); } } /* next try for an acceptable address on the ep */ LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; if (sctp_is_addr_in_ep(inp, sctp_ifa)) { sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } atomic_add_int(&sifa->refcount, 1); return (sifa); } } } /* * if we can't find one like that then we must look at all addresses * bound to pick one at first preferable then secondly acceptable. 
*/ starting_point = stcb->asoc.last_used_address; sctp_from_the_top: if (stcb->asoc.last_used_address == NULL) { start_at_beginning = 1; stcb->asoc.last_used_address = LIST_FIRST(&inp->sctp_addr_list); } /* search beginning with the last used address */ for (laddr = stcb->asoc.last_used_address; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_preferred(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } stcb->asoc.last_used_address = laddr; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (start_at_beginning == 0) { stcb->asoc.last_used_address = NULL; goto sctp_from_the_top; } /* now try for any higher scope than the destination */ stcb->asoc.last_used_address = starting_point; start_at_beginning = 0; sctp_from_the_top2: if (stcb->asoc.last_used_address == NULL) { start_at_beginning = 1; stcb->asoc.last_used_address = LIST_FIRST(&inp->sctp_addr_list); } /* search beginning with the last used address */ for (laddr = stcb->asoc.last_used_address; laddr; laddr = LIST_NEXT(laddr, sctp_nxt_addr)) { if (laddr->ifa == NULL) { /* address has been removed */ continue; } if (laddr->action == SCTP_DEL_IP_ADDRESS) { /* address is being deleted */ continue; } sifa = sctp_is_ifa_addr_acceptable(laddr->ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* on the no-no list */ continue; } stcb->asoc.last_used_address = laddr; atomic_add_int(&sifa->refcount, 1); return (sifa); } if (start_at_beginning == 0) { stcb->asoc.last_used_address = NULL; goto sctp_from_the_top2; } return (NULL); } static struct sctp_ifa * sctp_select_nth_preferred_addr_from_ifn_boundall(struct sctp_ifn *ifn, struct sctp_inpcb *inp, struct sctp_tcb *stcb, int non_asoc_addr_ok, uint8_t dest_is_loop, uint8_t dest_is_priv, int addr_wanted, sa_family_t fam, sctp_route_t *ro ) { struct sctp_ifa *ifa, *sifa; int num_eligible_addr = 0; #ifdef INET6 struct sockaddr_in6 sin6, lsa6; if (fam == AF_INET6) { memcpy(&sin6, &ro->ro_dst, sizeof(struct sockaddr_in6)); (void)sa6_recoverscope(&sin6); } #endif /* INET6 */ LIST_FOREACH(ifa, &ifn->ifalist, next_ifa) { #ifdef INET if ((ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_preferred(ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; #ifdef INET6 if (fam == AF_INET6 && dest_is_loop && sifa->src_is_loop && sifa->src_is_priv) { /* * don't allow fe80::1 to be a src on loop ::1, we * don't list it to the peer so we will get an * abort. 
*/ continue; } if (fam == AF_INET6 && IN6_IS_ADDR_LINKLOCAL(&sifa->address.sin6.sin6_addr) && IN6_IS_ADDR_LINKLOCAL(&sin6.sin6_addr)) { /* * link-local <-> link-local must belong to the same * scope. */ memcpy(&lsa6, &sifa->address.sin6, sizeof(struct sockaddr_in6)); (void)sa6_recoverscope(&lsa6); if (sin6.sin6_scope_id != lsa6.sin6_scope_id) { continue; } } #endif /* INET6 */ /* * Check if the IPv6 address matches to next-hop. In the * mobile case, old IPv6 address may be not deleted from the * interface. Then, the interface has previous and new * addresses. We should use one corresponding to the * next-hop. (by micchie) */ #ifdef INET6 if (stcb && fam == AF_INET6 && sctp_is_mobility_feature_on(stcb->sctp_ep, SCTP_MOBILITY_BASE)) { if (sctp_v6src_match_nexthop(&sifa->address.sin6, ro) == 0) { continue; } } #endif #ifdef INET /* Avoid topologically incorrect IPv4 address */ if (stcb && fam == AF_INET && sctp_is_mobility_feature_on(stcb->sctp_ep, SCTP_MOBILITY_BASE)) { if (sctp_v4src_match_nexthop(sifa, ro) == 0) { continue; } } #endif if (stcb) { if (sctp_is_address_in_scope(ifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ continue; } } if (num_eligible_addr >= addr_wanted) { return (sifa); } num_eligible_addr++; } return (NULL); } static int sctp_count_num_preferred_boundall(struct sctp_ifn *ifn, struct sctp_inpcb *inp, struct sctp_tcb *stcb, int non_asoc_addr_ok, uint8_t dest_is_loop, uint8_t dest_is_priv, sa_family_t fam) { struct sctp_ifa *ifa, *sifa; int num_eligible_addr = 0; LIST_FOREACH(ifa, &ifn->ifalist, next_ifa) { #ifdef INET if ((ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((ifa->address.sa.sa_family == AF_INET6) && (stcb != NULL) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) { continue; } sifa = sctp_is_ifa_addr_preferred(ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) { continue; } if (stcb) { if (sctp_is_address_in_scope(ifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ continue; } } num_eligible_addr++; } return (num_eligible_addr); } static struct sctp_ifa * sctp_choose_boundall(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_nets *net, sctp_route_t *ro, uint32_t vrf_id, uint8_t dest_is_priv, uint8_t dest_is_loop, int non_asoc_addr_ok, sa_family_t fam) { int cur_addr_num = 0, num_preferred = 0; void *ifn; struct sctp_ifn *sctp_ifn, *looked_at = NULL, *emit_ifn; struct sctp_ifa *sctp_ifa, *sifa; uint32_t ifn_index; struct sctp_vrf *vrf; #ifdef INET int retried = 0; #endif /*- * For boundall we can use any address in the association. * If non_asoc_addr_ok is set we can use any address (at least in * theory). So we look for preferred addresses first. If we find one, * we use it. Otherwise we next try to get an address on the * interface, which we should be able to do (unless non_asoc_addr_ok * is false and we are routed out that way). 
In these cases where we * can't use the address of the interface we go through all the * ifn's looking for an address we can use and fill that in. Punting * means we send back address 0, which will probably cause problems * actually since then IP will fill in the address of the route ifn, * which means we probably already rejected it.. i.e. here comes an * abort :-<. */ vrf = sctp_find_vrf(vrf_id); if (vrf == NULL) return (NULL); ifn = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); ifn_index = SCTP_GET_IF_INDEX_FROM_ROUTE(ro); SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifn from route:%p ifn_index:%d\n", ifn, ifn_index); emit_ifn = looked_at = sctp_ifn = sctp_find_ifn(ifn, ifn_index); if (sctp_ifn == NULL) { /* ?? We don't have this guy ?? */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "No ifn emit interface?\n"); goto bound_all_plan_b; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifn_index:%d name:%s is emit interface\n", ifn_index, sctp_ifn->ifn_name); if (net) { cur_addr_num = net->indx_of_eligible_next_to_use; } num_preferred = sctp_count_num_preferred_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, fam); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Found %d preferred source addresses for intf:%s\n", num_preferred, sctp_ifn->ifn_name); if (num_preferred == 0) { /* * no eligible addresses, we must use some other interface * address if we can find one. */ goto bound_all_plan_b; } /* * Ok we have num_eligible_addr set with how many we can use, this * may vary from call to call due to addresses being deprecated * etc.. */ if (cur_addr_num >= num_preferred) { cur_addr_num = 0; } /* * select the nth address from the list (where cur_addr_num is the * nth) and 0 is the first one, 1 is the second one etc... */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "cur_addr_num:%d\n", cur_addr_num); sctp_ifa = sctp_select_nth_preferred_addr_from_ifn_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, cur_addr_num, fam, ro); /* if sctp_ifa is NULL something changed??, fall to plan b. */ if (sctp_ifa) { atomic_add_int(&sctp_ifa->refcount, 1); if (net) { /* save off where the next one we will want */ net->indx_of_eligible_next_to_use = cur_addr_num + 1; } return (sctp_ifa); } /* * plan_b: Look at all interfaces and find a preferred address. If * no preferred fall through to plan_c. */ bound_all_plan_b: SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan B\n"); LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Examine interface %s\n", sctp_ifn->ifn_name); if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "skip\n"); continue; } if ((sctp_ifn == looked_at) && looked_at) { /* already looked at this guy */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "already seen\n"); continue; } num_preferred = sctp_count_num_preferred_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, fam); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Found ifn:%p %d preferred source addresses\n", ifn, num_preferred); if (num_preferred == 0) { /* None on this interface. */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "No preferred -- skipping to next\n"); continue; } SCTPDBG(SCTP_DEBUG_OUTPUT2, "num preferred:%d on interface:%p cur_addr_num:%d\n", num_preferred, (void *)sctp_ifn, cur_addr_num); /* * Ok we have num_eligible_addr set with how many we can * use, this may vary from call to call due to addresses * being deprecated etc.. 
*/ if (cur_addr_num >= num_preferred) { cur_addr_num = 0; } sifa = sctp_select_nth_preferred_addr_from_ifn_boundall(sctp_ifn, inp, stcb, non_asoc_addr_ok, dest_is_loop, dest_is_priv, cur_addr_num, fam, ro); if (sifa == NULL) continue; if (net) { net->indx_of_eligible_next_to_use = cur_addr_num + 1; SCTPDBG(SCTP_DEBUG_OUTPUT2, "we selected %d\n", cur_addr_num); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Source:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &sifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT2, "Dest:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &net->ro._l_addr.sa); } atomic_add_int(&sifa->refcount, 1); return (sifa); } #ifdef INET again_with_private_addresses_allowed: #endif /* plan_c: do we have an acceptable address on the emit interface */ sifa = NULL; SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan C: find acceptable on interface\n"); if (emit_ifn == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jump to Plan D - no emit_ifn\n"); goto plan_d; } LIST_FOREACH(sctp_ifa, &emit_ifn->ifalist, next_ifa) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "ifa:%p\n", (void *)sctp_ifa); #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jailed\n"); continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Jailed\n"); continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "Defer\n"); continue; } sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "IFA not acceptable\n"); continue; } if (stcb) { if (sctp_is_address_in_scope(sifa, &stcb->asoc.scope, 0) == 0) { SCTPDBG(SCTP_DEBUG_OUTPUT2, "NOT in scope\n"); sifa = NULL; continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some reason.. * probably not yet added. */ SCTPDBG(SCTP_DEBUG_OUTPUT2, "Its restricted\n"); sifa = NULL; continue; } } atomic_add_int(&sifa->refcount, 1); goto out; } plan_d: /* * plan_d: We are in trouble. No preferred address on the emit * interface. And not even a preferred address on all interfaces. Go * out and see if we can find an acceptable address somewhere * amongst all interfaces. 
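 * To recap the overall search order in this function: plan A is a
 * preferred address on the interface the route goes out of; plan B is a
 * preferred address on any other interface in the VRF; plan C is an
 * acceptable address on the emitting interface; plan D (here) is an
 * acceptable address on any interface at all. For IPv4, if even that
 * fails and the local scope was off, the whole search is retried once
 * with private addresses allowed
 * (again_with_private_addresses_allowed). If every plan fails, NULL is
 * returned and, as noted above, IP fills in the address of the route's
 * interface, one we probably already rejected, so an ABORT from the
 * peer is the likely outcome.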
*/ SCTPDBG(SCTP_DEBUG_OUTPUT2, "Trying Plan D looked_at is %p\n", (void *)looked_at); LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ continue; } LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (sifa == NULL) continue; if (stcb) { if (sctp_is_address_in_scope(sifa, &stcb->asoc.scope, 0) == 0) { sifa = NULL; continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, sifa)) && (!sctp_is_addr_pending(stcb, sifa)))) { /* * It is restricted for some * reason.. probably not yet added. */ sifa = NULL; continue; } } goto out; } } #ifdef INET if (stcb) { if ((retried == 0) && (stcb->asoc.scope.ipv4_local_scope == 0)) { stcb->asoc.scope.ipv4_local_scope = 1; retried = 1; goto again_with_private_addresses_allowed; } else if (retried == 1) { stcb->asoc.scope.ipv4_local_scope = 0; } } #endif out: #ifdef INET if (sifa) { if (retried == 1) { LIST_FOREACH(sctp_ifn, &vrf->ifnlist, next_ifn) { if (dest_is_loop == 0 && SCTP_IFN_IS_IFT_LOOP(sctp_ifn)) { /* wrong base scope */ continue; } LIST_FOREACH(sctp_ifa, &sctp_ifn->ifalist, next_ifa) { struct sctp_ifa *tmp_sifa; #ifdef INET if ((sctp_ifa->address.sa.sa_family == AF_INET) && (prison_check_ip4(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin.sin_addr) != 0)) { continue; } #endif #ifdef INET6 if ((sctp_ifa->address.sa.sa_family == AF_INET6) && (prison_check_ip6(inp->ip_inp.inp.inp_cred, &sctp_ifa->address.sin6.sin6_addr) != 0)) { continue; } #endif if ((sctp_ifa->localifa_flags & SCTP_ADDR_DEFER_USE) && (non_asoc_addr_ok == 0)) continue; tmp_sifa = sctp_is_ifa_addr_acceptable(sctp_ifa, dest_is_loop, dest_is_priv, fam); if (tmp_sifa == NULL) { continue; } if (tmp_sifa == sifa) { continue; } if (stcb) { if (sctp_is_address_in_scope(tmp_sifa, &stcb->asoc.scope, 0) == 0) { continue; } if (((non_asoc_addr_ok == 0) && (sctp_is_addr_restricted(stcb, tmp_sifa))) || (non_asoc_addr_ok && (sctp_is_addr_restricted(stcb, tmp_sifa)) && (!sctp_is_addr_pending(stcb, tmp_sifa)))) { /* * It is restricted * for some reason.. * probably not yet * added. */ continue; } } if ((tmp_sifa->address.sin.sin_family == AF_INET) && (IN4_ISPRIVATE_ADDRESS(&(tmp_sifa->address.sin.sin_addr)))) { sctp_add_local_addr_restricted(stcb, tmp_sifa); } } } } atomic_add_int(&sifa->refcount, 1); } #endif return (sifa); } /* tcb may be NULL */ struct sctp_ifa * sctp_source_address_selection(struct sctp_inpcb *inp, struct sctp_tcb *stcb, sctp_route_t *ro, struct sctp_nets *net, int non_asoc_addr_ok, uint32_t vrf_id) { struct sctp_ifa *answer; uint8_t dest_is_priv, dest_is_loop; sa_family_t fam; #ifdef INET struct sockaddr_in *to = (struct sockaddr_in *)&ro->ro_dst; #endif #ifdef INET6 struct sockaddr_in6 *to6 = (struct sockaddr_in6 *)&ro->ro_dst; #endif /** * Rules: * - Find the route if needed, cache if I can. * - Look at interface address in route, Is it in the bound list. If so we * have the best source. 
* - If not we must rotate amongst the addresses. * * Cavets and issues * * Do we need to pay attention to scope. We can have a private address * or a global address we are sourcing or sending to. So if we draw * it out * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * For V4 * ------------------------------------------ * source * dest * result * ----------------------------------------- * Private * Global * NAT * ----------------------------------------- * Private * Private * No problem * ----------------------------------------- * Global * Private * Huh, How will this work? * ----------------------------------------- * Global * Global * No Problem *------------------------------------------ * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * For V6 *------------------------------------------ * source * dest * result * ----------------------------------------- * Linklocal * Global * * ----------------------------------------- * Linklocal * Linklocal * No problem * ----------------------------------------- * Global * Linklocal * Huh, How will this work? * ----------------------------------------- * Global * Global * No Problem *------------------------------------------ * zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz * * And then we add to that what happens if there are multiple addresses * assigned to an interface. Remember the ifa on a ifn is a linked * list of addresses. So one interface can have more than one IP * address. What happens if we have both a private and a global * address? Do we then use context of destination to sort out which * one is best? And what about NAT's sending P->G may get you a NAT * translation, or should you select the G thats on the interface in * preference. * * Decisions: * * - count the number of addresses on the interface. * - if it is one, no problem except case . * For we will assume a NAT out there. * - if there are more than one, then we need to worry about scope P * or G. We should prefer G -> G and P -> P if possible. * Then as a secondary fall back to mixed types G->P being a last * ditch one. * - The above all works for bound all, but bound specific we need to * use the same concept but instead only consider the bound * addresses. If the bound set is NOT assigned to the interface then * we must use rotation amongst the bound addresses.. */ if (ro->ro_rt == NULL) { /* * Need a route to cache. */ SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } if (ro->ro_rt == NULL) { return (NULL); } fam = ro->ro_dst.sa_family; dest_is_priv = dest_is_loop = 0; /* Setup our scopes for the destination */ switch (fam) { #ifdef INET case AF_INET: /* Scope based on outbound address */ if (IN4_ISLOOPBACK_ADDRESS(&to->sin_addr)) { dest_is_loop = 1; if (net != NULL) { /* mark it as local */ net->addr_is_local = 1; } } else if ((IN4_ISPRIVATE_ADDRESS(&to->sin_addr))) { dest_is_priv = 1; } break; #endif #ifdef INET6 case AF_INET6: /* Scope based on outbound address */ if (IN6_IS_ADDR_LOOPBACK(&to6->sin6_addr) || SCTP_ROUTE_IS_REAL_LOOP(ro)) { /* * If the address is a loopback address, which * consists of "::1" OR "fe80::1%lo0", we are * loopback scope. But we don't use dest_is_priv * (link local addresses). 
*/ dest_is_loop = 1; if (net != NULL) { /* mark it as local */ net->addr_is_local = 1; } } else if (IN6_IS_ADDR_LINKLOCAL(&to6->sin6_addr)) { dest_is_priv = 1; } break; #endif } SCTPDBG(SCTP_DEBUG_OUTPUT2, "Select source addr for:"); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)&ro->ro_dst); SCTP_IPI_ADDR_RLOCK(); if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUNDALL) { /* * Bound all case */ answer = sctp_choose_boundall(inp, stcb, net, ro, vrf_id, dest_is_priv, dest_is_loop, non_asoc_addr_ok, fam); SCTP_IPI_ADDR_RUNLOCK(); return (answer); } /* * Subset bound case */ if (stcb) { answer = sctp_choose_boundspecific_stcb(inp, stcb, ro, vrf_id, dest_is_priv, dest_is_loop, non_asoc_addr_ok, fam); } else { answer = sctp_choose_boundspecific_inp(inp, ro, vrf_id, non_asoc_addr_ok, dest_is_priv, dest_is_loop, fam); } SCTP_IPI_ADDR_RUNLOCK(); return (answer); } static int sctp_find_cmsg(int c_type, void *data, struct mbuf *control, size_t cpsize) { struct cmsghdr cmh; struct sctp_sndinfo sndinfo; struct sctp_prinfo prinfo; struct sctp_authinfo authinfo; int tot_len, rem_len, cmsg_data_len, cmsg_data_off, off; int found; /* * Independent of how many mbufs, find the c_type inside the control * structure and copy out the data. */ found = 0; tot_len = SCTP_BUF_LEN(control); for (off = 0; off < tot_len; off += CMSG_ALIGN(cmh.cmsg_len)) { rem_len = tot_len - off; if (rem_len < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ return (found); } m_copydata(control, off, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ return (found); } if ((cmh.cmsg_len > INT_MAX) || ((int)cmh.cmsg_len > rem_len)) { /* We don't have the complete CMSG. */ return (found); } cmsg_data_len = (int)cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh)); cmsg_data_off = off + CMSG_ALIGN(sizeof(cmh)); if ((cmh.cmsg_level == IPPROTO_SCTP) && ((c_type == cmh.cmsg_type) || ((c_type == SCTP_SNDRCV) && ((cmh.cmsg_type == SCTP_SNDINFO) || (cmh.cmsg_type == SCTP_PRINFO) || (cmh.cmsg_type == SCTP_AUTHINFO))))) { if (c_type == cmh.cmsg_type) { if (cpsize > INT_MAX) { return (found); } if (cmsg_data_len < (int)cpsize) { return (found); } /* It is exactly what we want. Copy it out. 
*/ m_copydata(control, cmsg_data_off, (int)cpsize, (caddr_t)data); return (1); } else { struct sctp_sndrcvinfo *sndrcvinfo; sndrcvinfo = (struct sctp_sndrcvinfo *)data; if (found == 0) { if (cpsize < sizeof(struct sctp_sndrcvinfo)) { return (found); } memset(sndrcvinfo, 0, sizeof(struct sctp_sndrcvinfo)); } switch (cmh.cmsg_type) { case SCTP_SNDINFO: if (cmsg_data_len < (int)sizeof(struct sctp_sndinfo)) { return (found); } m_copydata(control, cmsg_data_off, sizeof(struct sctp_sndinfo), (caddr_t)&sndinfo); sndrcvinfo->sinfo_stream = sndinfo.snd_sid; sndrcvinfo->sinfo_flags = sndinfo.snd_flags; sndrcvinfo->sinfo_ppid = sndinfo.snd_ppid; sndrcvinfo->sinfo_context = sndinfo.snd_context; sndrcvinfo->sinfo_assoc_id = sndinfo.snd_assoc_id; break; case SCTP_PRINFO: if (cmsg_data_len < (int)sizeof(struct sctp_prinfo)) { return (found); } m_copydata(control, cmsg_data_off, sizeof(struct sctp_prinfo), (caddr_t)&prinfo); if (prinfo.pr_policy != SCTP_PR_SCTP_NONE) { sndrcvinfo->sinfo_timetolive = prinfo.pr_value; } else { sndrcvinfo->sinfo_timetolive = 0; } sndrcvinfo->sinfo_flags |= prinfo.pr_policy; break; case SCTP_AUTHINFO: if (cmsg_data_len < (int)sizeof(struct sctp_authinfo)) { return (found); } m_copydata(control, cmsg_data_off, sizeof(struct sctp_authinfo), (caddr_t)&authinfo); sndrcvinfo->sinfo_keynumber_valid = 1; sndrcvinfo->sinfo_keynumber = authinfo.auth_keynumber; break; default: return (found); } found = 1; } } } return (found); } static int sctp_process_cmsgs_for_init(struct sctp_tcb *stcb, struct mbuf *control, int *error) { struct cmsghdr cmh; struct sctp_initmsg initmsg; #ifdef INET struct sockaddr_in sin; #endif #ifdef INET6 struct sockaddr_in6 sin6; #endif int tot_len, rem_len, cmsg_data_len, cmsg_data_off, off; tot_len = SCTP_BUF_LEN(control); for (off = 0; off < tot_len; off += CMSG_ALIGN(cmh.cmsg_len)) { rem_len = tot_len - off; if (rem_len < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ *error = EINVAL; return (1); } m_copydata(control, off, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ *error = EINVAL; return (1); } if ((cmh.cmsg_len > INT_MAX) || ((int)cmh.cmsg_len > rem_len)) { /* We don't have the complete CMSG. 
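Both cmsg loops here (and the one in sctp_findassociation_cmsgs further down) validate each header the same way before trusting cmsg_len. A minimal userland sketch of that validation order, over a plain byte buffer instead of an mbuf chain, with a fallback CMSG_ALIGN in case the platform headers do not expose one:

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

#ifndef CMSG_ALIGN				/* fallback if the platform hides it */
#define	CMSG_ALIGN(n) (((n) + sizeof(long) - 1) & ~(sizeof(long) - 1))
#endif

/* Validation order used above: room for a header, header claims at least its
 * own size, and the claimed length stays inside what is left of the buffer. */
static int
walk_cmsgs(const unsigned char *buf, size_t tot_len)
{
	struct cmsghdr cmh;
	size_t off = 0;
	int count = 0;

	while (off + sizeof(cmh) <= tot_len) {
		memcpy(&cmh, buf + off, sizeof(cmh));
		if (cmh.cmsg_len < sizeof(cmh))
			break;			/* truncated header */
		if (cmh.cmsg_len > tot_len - off)
			break;			/* runs past the buffer */
		count++;
		off += CMSG_ALIGN(cmh.cmsg_len);
	}
	return (count);
}

int main(void)
{
	unsigned char buf[64];
	struct cmsghdr cmh;

	memset(buf, 0, sizeof(buf));
	memset(&cmh, 0, sizeof(cmh));
	cmh.cmsg_len = CMSG_LEN(sizeof(int));	/* one int of payload */
	memcpy(buf, &cmh, sizeof(cmh));
	printf("valid cmsgs: %d\n", walk_cmsgs(buf, CMSG_SPACE(sizeof(int))));
	return (0);
}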
*/ *error = EINVAL; return (1); } cmsg_data_len = (int)cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh)); cmsg_data_off = off + CMSG_ALIGN(sizeof(cmh)); if (cmh.cmsg_level == IPPROTO_SCTP) { switch (cmh.cmsg_type) { case SCTP_INIT: if (cmsg_data_len < (int)sizeof(struct sctp_initmsg)) { *error = EINVAL; return (1); } m_copydata(control, cmsg_data_off, sizeof(struct sctp_initmsg), (caddr_t)&initmsg); if (initmsg.sinit_max_attempts) stcb->asoc.max_init_times = initmsg.sinit_max_attempts; if (initmsg.sinit_num_ostreams) stcb->asoc.pre_open_streams = initmsg.sinit_num_ostreams; if (initmsg.sinit_max_instreams) stcb->asoc.max_inbound_streams = initmsg.sinit_max_instreams; if (initmsg.sinit_max_init_timeo) stcb->asoc.initial_init_rto_max = initmsg.sinit_max_init_timeo; if (stcb->asoc.streamoutcnt < stcb->asoc.pre_open_streams) { struct sctp_stream_out *tmp_str; unsigned int i; #if defined(SCTP_DETAILED_STR_STATS) int j; #endif /* Default is NOT correct */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Ok, default:%d pre_open:%d\n", stcb->asoc.streamoutcnt, stcb->asoc.pre_open_streams); SCTP_TCB_UNLOCK(stcb); SCTP_MALLOC(tmp_str, struct sctp_stream_out *, (stcb->asoc.pre_open_streams * sizeof(struct sctp_stream_out)), SCTP_M_STRMO); SCTP_TCB_LOCK(stcb); if (tmp_str != NULL) { SCTP_FREE(stcb->asoc.strmout, SCTP_M_STRMO); stcb->asoc.strmout = tmp_str; stcb->asoc.strm_realoutsize = stcb->asoc.streamoutcnt = stcb->asoc.pre_open_streams; } else { stcb->asoc.pre_open_streams = stcb->asoc.streamoutcnt; } for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = 0; stcb->asoc.strmout[i].next_mid_ordered = 0; stcb->asoc.strmout[i].next_mid_unordered = 0; #if defined(SCTP_DETAILED_STR_STATS) for (j = 0; j < SCTP_PR_SCTP_MAX + 1; j++) { stcb->asoc.strmout[i].abandoned_sent[j] = 0; stcb->asoc.strmout[i].abandoned_unsent[j] = 0; } #else stcb->asoc.strmout[i].abandoned_sent[0] = 0; stcb->asoc.strmout[i].abandoned_unsent[0] = 0; #endif stcb->asoc.strmout[i].sid = i; stcb->asoc.strmout[i].last_msg_incomplete = 0; stcb->asoc.strmout[i].state = SCTP_STREAM_OPENING; stcb->asoc.ss_functions.sctp_ss_init_stream(stcb, &stcb->asoc.strmout[i], NULL); } } break; #ifdef INET case SCTP_DSTADDRV4: if (cmsg_data_len < (int)sizeof(struct in_addr)) { *error = EINVAL; return (1); } memset(&sin, 0, sizeof(struct sockaddr_in)); sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_port = stcb->rport; m_copydata(control, cmsg_data_off, sizeof(struct in_addr), (caddr_t)&sin.sin_addr); if ((sin.sin_addr.s_addr == INADDR_ANY) || (sin.sin_addr.s_addr == INADDR_BROADCAST) || IN_MULTICAST(ntohl(sin.sin_addr.s_addr))) { *error = EINVAL; return (1); } if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } break; #endif #ifdef INET6 case SCTP_DSTADDRV6: if (cmsg_data_len < (int)sizeof(struct in6_addr)) { *error = EINVAL; return (1); } memset(&sin6, 0, sizeof(struct sockaddr_in6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_port = stcb->rport; m_copydata(control, cmsg_data_off, sizeof(struct in6_addr), (caddr_t)&sin6.sin6_addr); if (IN6_IS_ADDR_UNSPECIFIED(&sin6.sin6_addr) || IN6_IS_ADDR_MULTICAST(&sin6.sin6_addr)) { *error = EINVAL; return (1); } #ifdef INET if (IN6_IS_ADDR_V4MAPPED(&sin6.sin6_addr)) { in6_sin6_2_sin(&sin, &sin6); if ((sin.sin_addr.s_addr == INADDR_ANY) || (sin.sin_addr.s_addr == INADDR_BROADCAST) || 
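For reference, the SCTP_INIT case above is the kernel side of the RFC 6458 ancillary-data interface: userland attaches a struct sctp_initmsg to the first send and the stream and retry limits are applied before the implicit INIT goes out. A hedged sketch of building that cmsg (the stream counts are arbitrary, and it assumes an SCTP-capable system providing these headers):

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/sctp.h>	/* RFC 6458 headers: base on FreeBSD, lksctp on Linux */

int main(void)
{
	char cbuf[CMSG_SPACE(sizeof(struct sctp_initmsg))];
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct sctp_initmsg init;

	memset(&msg, 0, sizeof(msg));
	memset(cbuf, 0, sizeof(cbuf));
	memset(&init, 0, sizeof(init));
	init.sinit_num_ostreams = 10;		/* arbitrary example values */
	init.sinit_max_instreams = 10;
	init.sinit_max_attempts = 4;

	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = IPPROTO_SCTP;
	cmsg->cmsg_type = SCTP_INIT;
	cmsg->cmsg_len = CMSG_LEN(sizeof(init));
	memcpy(CMSG_DATA(cmsg), &init, sizeof(init));

	/* msg would now go to sendmsg(2) on a one-to-many SCTP socket. */
	printf("control bytes prepared: %zu\n", (size_t)msg.msg_controllen);
	return (0);
}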
IN_MULTICAST(ntohl(sin.sin_addr.s_addr))) { *error = EINVAL; return (1); } if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } } else #endif if (sctp_add_remote_addr(stcb, (struct sockaddr *)&sin6, NULL, stcb->asoc.port, SCTP_DONOT_SETSCOPE, SCTP_ADDR_IS_CONFIRMED)) { *error = ENOBUFS; return (1); } break; #endif default: break; } } } return (0); } static struct sctp_tcb * sctp_findassociation_cmsgs(struct sctp_inpcb **inp_p, uint16_t port, struct mbuf *control, struct sctp_nets **net_p, int *error) { struct cmsghdr cmh; struct sctp_tcb *stcb; struct sockaddr *addr; #ifdef INET struct sockaddr_in sin; #endif #ifdef INET6 struct sockaddr_in6 sin6; #endif int tot_len, rem_len, cmsg_data_len, cmsg_data_off, off; tot_len = SCTP_BUF_LEN(control); for (off = 0; off < tot_len; off += CMSG_ALIGN(cmh.cmsg_len)) { rem_len = tot_len - off; if (rem_len < (int)CMSG_ALIGN(sizeof(cmh))) { /* There is not enough room for one more. */ *error = EINVAL; return (NULL); } m_copydata(control, off, sizeof(cmh), (caddr_t)&cmh); if (cmh.cmsg_len < CMSG_ALIGN(sizeof(cmh))) { /* We dont't have a complete CMSG header. */ *error = EINVAL; return (NULL); } if ((cmh.cmsg_len > INT_MAX) || ((int)cmh.cmsg_len > rem_len)) { /* We don't have the complete CMSG. */ *error = EINVAL; return (NULL); } cmsg_data_len = (int)cmh.cmsg_len - CMSG_ALIGN(sizeof(cmh)); cmsg_data_off = off + CMSG_ALIGN(sizeof(cmh)); if (cmh.cmsg_level == IPPROTO_SCTP) { switch (cmh.cmsg_type) { #ifdef INET case SCTP_DSTADDRV4: if (cmsg_data_len < (int)sizeof(struct in_addr)) { *error = EINVAL; return (NULL); } memset(&sin, 0, sizeof(struct sockaddr_in)); sin.sin_family = AF_INET; sin.sin_len = sizeof(struct sockaddr_in); sin.sin_port = port; m_copydata(control, cmsg_data_off, sizeof(struct in_addr), (caddr_t)&sin.sin_addr); addr = (struct sockaddr *)&sin; break; #endif #ifdef INET6 case SCTP_DSTADDRV6: if (cmsg_data_len < (int)sizeof(struct in6_addr)) { *error = EINVAL; return (NULL); } memset(&sin6, 0, sizeof(struct sockaddr_in6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(struct sockaddr_in6); sin6.sin6_port = port; m_copydata(control, cmsg_data_off, sizeof(struct in6_addr), (caddr_t)&sin6.sin6_addr); #ifdef INET if (IN6_IS_ADDR_V4MAPPED(&sin6.sin6_addr)) { in6_sin6_2_sin(&sin, &sin6); addr = (struct sockaddr *)&sin; } else #endif addr = (struct sockaddr *)&sin6; break; #endif default: addr = NULL; break; } if (addr) { stcb = sctp_findassociation_ep_addr(inp_p, addr, net_p, NULL, NULL); if (stcb != NULL) { return (stcb); } } } } return (NULL); } static struct mbuf * sctp_add_cookie(struct mbuf *init, int init_offset, struct mbuf *initack, int initack_offset, struct sctp_state_cookie *stc_in, uint8_t **signature) { struct mbuf *copy_init, *copy_initack, *m_at, *sig, *mret; struct sctp_state_cookie *stc; struct sctp_paramhdr *ph; uint8_t *foo; int sig_offset; uint16_t cookie_sz; mret = sctp_get_mbuf_for_msg((sizeof(struct sctp_state_cookie) + sizeof(struct sctp_paramhdr)), 0, M_NOWAIT, 1, MT_DATA); if (mret == NULL) { return (NULL); } copy_init = SCTP_M_COPYM(init, init_offset, M_COPYALL, M_NOWAIT); if (copy_init == NULL) { sctp_m_freem(mret); return (NULL); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(copy_init, SCTP_MBUF_ICOPY); } #endif copy_initack = SCTP_M_COPYM(initack, initack_offset, M_COPYALL, M_NOWAIT); if (copy_initack == NULL) { sctp_m_freem(mret); 
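sctp_add_cookie() just below finishes by summing SCTP_BUF_LEN() over the whole chain (parameter header plus cookie body, copied INIT, copied INIT-ACK, signature) and writing that total into the parameter length. A toy stand-in for that accumulation, using a hand-rolled buffer chain and invented sizes instead of real mbufs:

#include <stdio.h>

struct buf {				/* stand-in for an mbuf: just length and next */
	int len;
	struct buf *next;
};

static int
chain_len(struct buf *b)
{
	int total = 0;

	for (; b != NULL; b = b->next)
		total += b->len;
	return (total);
}

int main(void)
{
	struct buf sig = { 20, NULL };		/* hypothetical signature bytes */
	struct buf iack = { 132, &sig };	/* copied INIT-ACK */
	struct buf init = { 84, &iack };	/* copied INIT */
	struct buf hdr = { 8 + 144, &init };	/* param header + cookie body */

	printf("cookie param_length = %d\n", chain_len(&hdr));
	return (0);
}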
sctp_m_freem(copy_init); return (NULL); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(copy_initack, SCTP_MBUF_ICOPY); } #endif /* easy side we just drop it on the end */ ph = mtod(mret, struct sctp_paramhdr *); SCTP_BUF_LEN(mret) = sizeof(struct sctp_state_cookie) + sizeof(struct sctp_paramhdr); stc = (struct sctp_state_cookie *)((caddr_t)ph + sizeof(struct sctp_paramhdr)); ph->param_type = htons(SCTP_STATE_COOKIE); ph->param_length = 0; /* fill in at the end */ /* Fill in the stc cookie data */ memcpy(stc, stc_in, sizeof(struct sctp_state_cookie)); /* tack the INIT and then the INIT-ACK onto the chain */ cookie_sz = 0; for (m_at = mret; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { SCTP_BUF_NEXT(m_at) = copy_init; break; } } for (m_at = copy_init; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { SCTP_BUF_NEXT(m_at) = copy_initack; break; } } for (m_at = copy_initack; m_at; m_at = SCTP_BUF_NEXT(m_at)) { cookie_sz += SCTP_BUF_LEN(m_at); if (SCTP_BUF_NEXT(m_at) == NULL) { break; } } sig = sctp_get_mbuf_for_msg(SCTP_SECRET_SIZE, 0, M_NOWAIT, 1, MT_DATA); if (sig == NULL) { /* no space, so free the entire chain */ sctp_m_freem(mret); return (NULL); } SCTP_BUF_LEN(sig) = 0; SCTP_BUF_NEXT(m_at) = sig; sig_offset = 0; foo = (uint8_t *)(mtod(sig, caddr_t)+sig_offset); memset(foo, 0, SCTP_SIGNATURE_SIZE); *signature = foo; SCTP_BUF_LEN(sig) += SCTP_SIGNATURE_SIZE; cookie_sz += SCTP_SIGNATURE_SIZE; ph->param_length = htons(cookie_sz); return (mret); } static uint8_t sctp_get_ect(struct sctp_tcb *stcb) { if ((stcb != NULL) && (stcb->asoc.ecn_supported == 1)) { return (SCTP_ECT0_BIT); } else { return (0); } } #if defined(INET) || defined(INET6) static void sctp_handle_no_route(struct sctp_tcb *stcb, struct sctp_nets *net, int so_locked) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "dropped packet - no valid source addr\n"); if (net) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Destination was "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT1, &net->ro._l_addr.sa); if (net->dest_state & SCTP_ADDR_CONFIRMED) { if ((net->dest_state & SCTP_ADDR_REACHABLE) && stcb) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "no route takes interface %p down\n", (void *)net); sctp_ulp_notify(SCTP_NOTIFY_INTERFACE_DOWN, stcb, 0, (void *)net, so_locked); net->dest_state &= ~SCTP_ADDR_REACHABLE; net->dest_state &= ~SCTP_ADDR_PF; } } if (stcb) { if (net == stcb->asoc.primary_destination) { /* need a new primary */ struct sctp_nets *alt; alt = sctp_find_alternate_net(stcb, net, 0); if (alt != net) { if (stcb->asoc.alternate) { sctp_free_remote_addr(stcb->asoc.alternate); } stcb->asoc.alternate = alt; atomic_add_int(&stcb->asoc.alternate->ref_count, 1); if (net->ro._s_addr) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } } } } } #endif static int sctp_lowlevel_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, /* may be NULL */ struct sctp_nets *net, struct sockaddr *to, struct mbuf *m, uint32_t auth_offset, struct sctp_auth_chunk *auth, uint16_t auth_keyid, int nofragment_flag, int ecn_ok, int out_of_asoc_ok, uint16_t src_port, uint16_t dest_port, uint32_t v_tag, uint16_t port, union sctp_sockstore *over_addr, uint8_t mflowtype, uint32_t mflowid, #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) int so_locked SCTP_UNUSED #else int so_locked #endif ) { /* nofragment_flag to tell if IP_DF should be set (IPv4 only) */ /** * Given a mbuf chain (via 
SCTP_BUF_NEXT()) that holds a packet header * WITH an SCTPHDR but no IP header, endpoint inp and sa structure: * - fill in the HMAC digest of any AUTH chunk in the packet. * - calculate and fill in the SCTP checksum. * - prepend an IP address header. * - if boundall use INADDR_ANY. * - if boundspecific do source address selection. * - set fragmentation option for ipV4. * - On return from IP output, check/adjust mtu size of output * interface and smallest_mtu size as well. */ /* Will need ifdefs around this */ struct mbuf *newm; struct sctphdr *sctphdr; int packet_length; int ret; #if defined(INET) || defined(INET6) uint32_t vrf_id; #endif #if defined(INET) || defined(INET6) struct mbuf *o_pak; sctp_route_t *ro = NULL; struct udphdr *udp = NULL; #endif uint8_t tos_value; #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) struct socket *so = NULL; #endif if ((net) && (net->dest_state & SCTP_ADDR_OUT_OF_SCOPE)) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); sctp_m_freem(m); return (EFAULT); } #if defined(INET) || defined(INET6) if (stcb) { vrf_id = stcb->asoc.vrf_id; } else { vrf_id = inp->def_vrf_id; } #endif /* fill in the HMAC digest for any AUTH chunk in the packet */ if ((auth != NULL) && (stcb != NULL)) { sctp_fill_hmac_digest_m(m, auth_offset, auth, stcb, auth_keyid); } if (net) { tos_value = net->dscp; } else if (stcb) { tos_value = stcb->asoc.default_dscp; } else { tos_value = inp->sctp_ep.default_dscp; } switch (to->sa_family) { #ifdef INET case AF_INET: { struct ip *ip = NULL; sctp_route_t iproute; int len; len = SCTP_MIN_V4_OVERHEAD; if (port) { len += sizeof(struct udphdr); } newm = sctp_get_mbuf_for_msg(len, 1, M_NOWAIT, 1, MT_DATA); if (newm == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ALIGN_TO_END(newm, len); SCTP_BUF_LEN(newm) = len; SCTP_BUF_NEXT(newm) = m; m = newm; if (net != NULL) { m->m_pkthdr.flowid = net->flowid; M_HASHTYPE_SET(m, net->flowtype); } else { m->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(m, mflowtype); } packet_length = sctp_calculate_len(m); ip = mtod(m, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = (sizeof(struct ip) >> 2); if (tos_value == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. 
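A few lines below, the IPv4 path clears the two low ECN bits of the TOS byte and, when ECN was negotiated, sets ECT(0) via sctp_get_ect(). A small sketch of that bit manipulation in isolation (the DSCP value is hypothetical; ECT(0) is the 0b10 codepoint from RFC 3168):

#include <stdio.h>
#include <stdint.h>

#define	ECT0_BIT	0x02		/* ECT(0) codepoint, RFC 3168 */

int main(void)
{
	uint8_t tos = 0xb9;		/* hypothetical DSCP with stray low bits set */
	int ecn_ok = 1;

	tos &= 0xfc;			/* strip whatever ECN bits were supplied */
	if (ecn_ok)
		tos |= ECT0_BIT;	/* mark the packet as ECN capable */
	printf("tos byte on the wire: 0x%02x\n", tos);
	return (0);
}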
*/ tos_value = inp->ip_inp.inp.inp_ip_tos; } tos_value &= 0xfc; if (ecn_ok) { tos_value |= sctp_get_ect(stcb); } if ((nofragment_flag) && (port == 0)) { ip->ip_off = htons(IP_DF); } else { ip->ip_off = htons(0); } /* FreeBSD has a function for ip_id's */ ip_fillid(ip); ip->ip_ttl = inp->ip_inp.inp.inp_ip_ttl; ip->ip_len = htons(packet_length); ip->ip_tos = tos_value; if (port) { ip->ip_p = IPPROTO_UDP; } else { ip->ip_p = IPPROTO_SCTP; } ip->ip_sum = 0; if (net == NULL) { ro = &iproute; memset(&iproute, 0, sizeof(iproute)); memcpy(&ro->ro_dst, to, to->sa_len); } else { ro = (sctp_route_t *)&net->ro; } /* Now the address selection part */ ip->ip_dst.s_addr = ((struct sockaddr_in *)to)->sin_addr.s_addr; /* call the routine to select the src address */ if (net && out_of_asoc_ok == 0) { if (net->ro._s_addr && (net->ro._s_addr->localifa_flags & (SCTP_BEING_DELETED | SCTP_ADDR_IFA_UNUSEABLE))) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; net->src_addr_selected = 0; if (ro->ro_rt) { RTFREE(ro->ro_rt); ro->ro_rt = NULL; } } if (net->src_addr_selected == 0) { /* Cache the source address */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, ro, net, 0, vrf_id); net->src_addr_selected = 1; } if (net->ro._s_addr == NULL) { /* No route to host */ net->src_addr_selected = 0; sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } ip->ip_src = net->ro._s_addr->address.sin.sin_addr; } else { if (over_addr == NULL) { struct sctp_ifa *_lsrc; _lsrc = sctp_source_address_selection(inp, stcb, ro, net, out_of_asoc_ok, vrf_id); if (_lsrc == NULL) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } ip->ip_src = _lsrc->address.sin.sin_addr; sctp_free_ifa(_lsrc); } else { ip->ip_src = over_addr->sin.sin_addr; SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } } if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } udp = (struct udphdr *)((caddr_t)ip + sizeof(struct ip)); udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_ulen = htons((uint16_t)(packet_length - sizeof(struct ip))); if (V_udp_cksum) { udp->uh_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, udp->uh_ulen + htons(IPPROTO_UDP)); } else { udp->uh_sum = 0; } sctphdr = (struct sctphdr *)((caddr_t)udp + sizeof(struct udphdr)); } else { sctphdr = (struct sctphdr *)((caddr_t)ip + sizeof(struct ip)); } sctphdr->src_port = src_port; sctphdr->dest_port = dest_port; sctphdr->v_tag = v_tag; sctphdr->checksum = 0; /* * If source address selection fails and we find no * route then the ip_output should fail as well with * a NO_ROUTE_TO_HOST type error. We probably should * catch that somewhere and abort the association * right away (assuming this is an INIT being sent). */ if (ro->ro_rt == NULL) { /* * src addr selection failed to find a route * (or valid source addr), so we can't get * there from here (yet)! 
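The UDP header built above implements SCTP-over-UDP encapsulation (RFC 6951): the local tunneling port goes into the source port, the peer's encapsulation port into the destination, and uh_ulen covers everything after the IPv4 header. A standalone sketch with hypothetical lengths, assuming BSD-style uh_* field names in struct udphdr (9899 is the IANA-registered encapsulation port, not necessarily what the tunneling sysctl is set to):

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/udp.h>

int main(void)
{
	struct udphdr udp;
	uint16_t packet_length = 120;	/* hypothetical: IP + UDP + SCTP packet */
	uint16_t ip_hdr_len = 20;

	memset(&udp, 0, sizeof(udp));
	udp.uh_sport = htons(9899);	/* IANA SCTP/UDP encapsulation port */
	udp.uh_dport = htons(9899);
	udp.uh_ulen = htons((uint16_t)(packet_length - ip_hdr_len));
	udp.uh_sum = 0;			/* optional for IPv4; may be filled later */
	printf("udp length field = %u\n", ntohs(udp.uh_ulen));
	return (0);
}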
*/ sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } if (ro != &iproute) { memcpy(&iproute, ro, sizeof(*ro)); } SCTPDBG(SCTP_DEBUG_OUTPUT3, "Calling ipv4 output routine from low level src addr:%x\n", (uint32_t)(ntohl(ip->ip_src.s_addr))); SCTPDBG(SCTP_DEBUG_OUTPUT3, "Destination is %x\n", (uint32_t)(ntohl(ip->ip_dst.s_addr))); SCTPDBG(SCTP_DEBUG_OUTPUT3, "RTP route is %p through\n", (void *)ro->ro_rt); if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { /* failed to prepend data, give up */ SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); sctp_m_freem(m); return (ENOMEM); } SCTP_ATTACH_CHAIN(o_pak, m, packet_length); if (port) { sctphdr->checksum = sctp_calculate_cksum(m, sizeof(struct ip) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); if (V_udp_cksum) { SCTP_ENABLE_UDP_CSUM(o_pak); } } else { m->m_pkthdr.csum_flags = CSUM_SCTP; m->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) sctp_packet_log(o_pak); #endif /* send it out. table id is taken from stcb */ #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { so = SCTP_INP_SO(inp); SCTP_SOCKET_UNLOCK(so, 0); } #endif SCTP_PROBE5(send, NULL, stcb, ip, stcb, sctphdr); SCTP_IP_OUTPUT(ret, o_pak, ro, stcb, vrf_id); #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { atomic_add_int(&stcb->asoc.refcnt, 1); SCTP_TCB_UNLOCK(stcb); SCTP_SOCKET_LOCK(so, 0); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); } #endif if (port) { UDPSTAT_INC(udps_opackets); } SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); if (ret) SCTP_STAT_INCR(sctps_senderrors); SCTPDBG(SCTP_DEBUG_OUTPUT3, "IP output returns %d\n", ret); if (net == NULL) { /* free tempy routes */ RO_RTFREE(ro); } else { if ((ro->ro_rt != NULL) && (net->ro._s_addr) && ((net->dest_state & SCTP_ADDR_NO_PMTUD) == 0)) { uint32_t mtu; mtu = SCTP_GATHER_MTU_FROM_ROUTE(net->ro._s_addr, &net->ro._l_addr.sa, ro->ro_rt); if (mtu > 0) { if (net->port) { mtu -= sizeof(struct udphdr); } if ((stcb != NULL) && (stcb->asoc.smallest_mtu > mtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, mtu); } net->mtu = mtu; } } else if (ro->ro_rt == NULL) { /* route was freed */ if (net->ro._s_addr && net->src_addr_selected) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } } return (ret); } #endif #ifdef INET6 case AF_INET6: { uint32_t flowlabel, flowinfo; struct ip6_hdr *ip6h; struct route_in6 ip6route; struct ifnet *ifp; struct sockaddr_in6 *sin6, tmp, *lsa6, lsa6_tmp; int prev_scope = 0; struct sockaddr_in6 lsa6_storage; int error; u_short prev_port = 0; int len; if (net) { flowlabel = net->flowlabel; } else if (stcb) { flowlabel = stcb->asoc.default_flowlabel; } else { flowlabel = inp->sctp_ep.default_flowlabel; } if (flowlabel == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. 
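When the packet is not handed to a CSUM_SCTP-capable NIC, sctp_calculate_cksum() computes a CRC32c over the packet, as in the UDP-encapsulated branch above. For comparison, a bit-at-a-time CRC32c; the kernel uses a faster table or instruction based version, so this only illustrates the polynomial and its well-known check value:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Bit-at-a-time CRC32c (Castagnoli), reflected polynomial 0x82f63b78. */
static uint32_t
crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xffffffffU;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82f63b78U & (0U - (crc & 1)));
	}
	return (~crc);
}

int main(void)
{
	const char *s = "123456789";

	/* The standard CRC-32C check value for "123456789" is 0xe3069283. */
	printf("crc32c = 0x%08x\n", crc32c((const uint8_t *)s, strlen(s)));
	return (0);
}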
*/ flowlabel = ntohl(((struct in6pcb *)inp)->in6p_flowinfo); } flowlabel &= 0x000fffff; len = SCTP_MIN_OVERHEAD; if (port) { len += sizeof(struct udphdr); } newm = sctp_get_mbuf_for_msg(len, 1, M_NOWAIT, 1, MT_DATA); if (newm == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ALIGN_TO_END(newm, len); SCTP_BUF_LEN(newm) = len; SCTP_BUF_NEXT(newm) = m; m = newm; if (net != NULL) { m->m_pkthdr.flowid = net->flowid; M_HASHTYPE_SET(m, net->flowtype); } else { m->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(m, mflowtype); } packet_length = sctp_calculate_len(m); ip6h = mtod(m, struct ip6_hdr *); /* protect *sin6 from overwrite */ sin6 = (struct sockaddr_in6 *)to; tmp = *sin6; sin6 = &tmp; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); sctp_m_freem(m); return (EINVAL); } if (net == NULL) { memset(&ip6route, 0, sizeof(ip6route)); ro = (sctp_route_t *)&ip6route; memcpy(&ro->ro_dst, sin6, sin6->sin6_len); } else { ro = (sctp_route_t *)&net->ro; } /* * We assume here that inp_flow is in host byte * order within the TCB! */ if (tos_value == 0) { /* * This means especially, that it is not set * at the SCTP layer. So use the value from * the IP layer. */ tos_value = (ntohl(((struct in6pcb *)inp)->in6p_flowinfo) >> 20) & 0xff; } tos_value &= 0xfc; if (ecn_ok) { tos_value |= sctp_get_ect(stcb); } flowinfo = 0x06; flowinfo <<= 8; flowinfo |= tos_value; flowinfo <<= 20; flowinfo |= flowlabel; ip6h->ip6_flow = htonl(flowinfo); if (port) { ip6h->ip6_nxt = IPPROTO_UDP; } else { ip6h->ip6_nxt = IPPROTO_SCTP; } ip6h->ip6_plen = htons((uint16_t)(packet_length - sizeof(struct ip6_hdr))); ip6h->ip6_dst = sin6->sin6_addr; /* * Add SRC address selection here: we can only reuse * to a limited degree the kame src-addr-sel, since * we can try their selection but it may not be * bound. 
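The flowinfo shifting above packs the first 32 bits of the IPv6 header: version (4 bits), traffic class (8 bits) and flow label (20 bits). The same packing in isolation, with made-up DSCP and flow label values:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint32_t tos_value = 0x2e << 2;		/* hypothetical: DSCP EF, ECN bits clear */
	uint32_t flowlabel = 0x12345 & 0x000fffff;
	uint32_t flowinfo;

	flowinfo = 6;				/* IP version */
	flowinfo <<= 8;
	flowinfo |= tos_value;			/* traffic class */
	flowinfo <<= 20;
	flowinfo |= flowlabel;			/* 20-bit flow label */
	printf("ip6_flow (host order) = 0x%08x\n", flowinfo);
	return (0);
}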
*/ memset(&lsa6_tmp, 0, sizeof(lsa6_tmp)); lsa6_tmp.sin6_family = AF_INET6; lsa6_tmp.sin6_len = sizeof(lsa6_tmp); lsa6 = &lsa6_tmp; if (net && out_of_asoc_ok == 0) { if (net->ro._s_addr && (net->ro._s_addr->localifa_flags & (SCTP_BEING_DELETED | SCTP_ADDR_IFA_UNUSEABLE))) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; net->src_addr_selected = 0; if (ro->ro_rt) { RTFREE(ro->ro_rt); ro->ro_rt = NULL; } } if (net->src_addr_selected == 0) { sin6 = (struct sockaddr_in6 *)&net->ro._l_addr; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); sctp_m_freem(m); return (EINVAL); } /* Cache the source address */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, ro, net, 0, vrf_id); (void)sa6_recoverscope(sin6); net->src_addr_selected = 1; } if (net->ro._s_addr == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "V6:No route to host\n"); net->src_addr_selected = 0; sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } lsa6->sin6_addr = net->ro._s_addr->address.sin6.sin6_addr; } else { sin6 = (struct sockaddr_in6 *)&ro->ro_dst; /* KAME hack: embed scopeid */ if (sa6_embedscope(sin6, MODULE_GLOBAL(ip6_use_defzone)) != 0) { SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); sctp_m_freem(m); return (EINVAL); } if (over_addr == NULL) { struct sctp_ifa *_lsrc; _lsrc = sctp_source_address_selection(inp, stcb, ro, net, out_of_asoc_ok, vrf_id); if (_lsrc == NULL) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } lsa6->sin6_addr = _lsrc->address.sin6.sin6_addr; sctp_free_ifa(_lsrc); } else { lsa6->sin6_addr = over_addr->sin6.sin6_addr; SCTP_RTALLOC(ro, vrf_id, inp->fibnum); } (void)sa6_recoverscope(sin6); } lsa6->sin6_port = inp->sctp_lport; if (ro->ro_rt == NULL) { /* * src addr selection failed to find a route * (or valid source addr), so we can't get * there from here! */ sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } /* * XXX: sa6 may not have a valid sin6_scope_id in * the non-SCOPEDROUTING case. 
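The sa6_embedscope()/sa6_recoverscope() calls in this block are the in-kernel (KAME) way of carrying the zone of a link-local address inside the address itself. Userland instead disambiguates a link-local destination through sin6_scope_id, as in this sketch (the interface name is hypothetical; if_nametoindex() simply returns 0 if it does not exist):

#include <stdio.h>
#include <string.h>
#include <net/if.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
	struct sockaddr_in6 dst;

	memset(&dst, 0, sizeof(dst));
	dst.sin6_family = AF_INET6;
	inet_pton(AF_INET6, "fe80::1", &dst.sin6_addr);
	dst.sin6_scope_id = if_nametoindex("em0");	/* hypothetical interface */
	printf("zone for fe80::1: %u\n", dst.sin6_scope_id);
	return (0);
}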
*/ memset(&lsa6_storage, 0, sizeof(lsa6_storage)); lsa6_storage.sin6_family = AF_INET6; lsa6_storage.sin6_len = sizeof(lsa6_storage); lsa6_storage.sin6_addr = lsa6->sin6_addr; if ((error = sa6_recoverscope(&lsa6_storage)) != 0) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "recover scope fails error %d\n", error); sctp_m_freem(m); return (error); } /* XXX */ lsa6_storage.sin6_addr = lsa6->sin6_addr; lsa6_storage.sin6_port = inp->sctp_lport; lsa6 = &lsa6_storage; ip6h->ip6_src = lsa6->sin6_addr; if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_handle_no_route(stcb, net, so_locked); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EHOSTUNREACH); sctp_m_freem(m); return (EHOSTUNREACH); } udp = (struct udphdr *)((caddr_t)ip6h + sizeof(struct ip6_hdr)); udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_ulen = htons((uint16_t)(packet_length - sizeof(struct ip6_hdr))); udp->uh_sum = 0; sctphdr = (struct sctphdr *)((caddr_t)udp + sizeof(struct udphdr)); } else { sctphdr = (struct sctphdr *)((caddr_t)ip6h + sizeof(struct ip6_hdr)); } sctphdr->src_port = src_port; sctphdr->dest_port = dest_port; sctphdr->v_tag = v_tag; sctphdr->checksum = 0; /* * We set the hop limit now since there is a good * chance that our ro pointer is now filled */ ip6h->ip6_hlim = SCTP_GET_HLIM(inp, ro); ifp = SCTP_GET_IFN_VOID_FROM_ROUTE(ro); #ifdef SCTP_DEBUG /* Copy to be sure something bad is not happening */ sin6->sin6_addr = ip6h->ip6_dst; lsa6->sin6_addr = ip6h->ip6_src; #endif SCTPDBG(SCTP_DEBUG_OUTPUT3, "Calling ipv6 output routine from low level\n"); SCTPDBG(SCTP_DEBUG_OUTPUT3, "src: "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, (struct sockaddr *)lsa6); SCTPDBG(SCTP_DEBUG_OUTPUT3, "dst: "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT3, (struct sockaddr *)sin6); if (net) { sin6 = (struct sockaddr_in6 *)&net->ro._l_addr; /* * preserve the port and scope for link * local send */ prev_scope = sin6->sin6_scope_id; prev_port = sin6->sin6_port; } if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { /* failed to prepend data, give up */ sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_ATTACH_CHAIN(o_pak, m, packet_length); if (port) { sctphdr->checksum = sctp_calculate_cksum(m, sizeof(struct ip6_hdr) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); if ((udp->uh_sum = in6_cksum(o_pak, IPPROTO_UDP, sizeof(struct ip6_hdr), packet_length - sizeof(struct ip6_hdr))) == 0) { udp->uh_sum = 0xffff; } } else { m->m_pkthdr.csum_flags = CSUM_SCTP_IPV6; m->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); } /* send it out. 
table id is taken from stcb */ #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { so = SCTP_INP_SO(inp); SCTP_SOCKET_UNLOCK(so, 0); } #endif #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) sctp_packet_log(o_pak); #endif SCTP_PROBE5(send, NULL, stcb, ip6h, stcb, sctphdr); SCTP_IP6_OUTPUT(ret, o_pak, (struct route_in6 *)ro, &ifp, stcb, vrf_id); #if defined(__APPLE__) || defined(SCTP_SO_LOCK_TESTING) if ((SCTP_BASE_SYSCTL(sctp_output_unlocked)) && (so_locked)) { atomic_add_int(&stcb->asoc.refcnt, 1); SCTP_TCB_UNLOCK(stcb); SCTP_SOCKET_LOCK(so, 0); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); } #endif if (net) { /* for link local this must be done */ sin6->sin6_scope_id = prev_scope; sin6->sin6_port = prev_port; } SCTPDBG(SCTP_DEBUG_OUTPUT3, "return from send is %d\n", ret); if (port) { UDPSTAT_INC(udps_opackets); } SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); if (ret) { SCTP_STAT_INCR(sctps_senderrors); } if (net == NULL) { /* Now if we had a temp route free it */ RO_RTFREE(ro); } else { /* * PMTU check versus smallest asoc MTU goes * here */ if (ro->ro_rt == NULL) { /* Route was freed */ if (net->ro._s_addr && net->src_addr_selected) { sctp_free_ifa(net->ro._s_addr); net->ro._s_addr = NULL; } net->src_addr_selected = 0; } if ((ro->ro_rt != NULL) && (net->ro._s_addr) && ((net->dest_state & SCTP_ADDR_NO_PMTUD) == 0)) { uint32_t mtu; mtu = SCTP_GATHER_MTU_FROM_ROUTE(net->ro._s_addr, &net->ro._l_addr.sa, ro->ro_rt); if (mtu > 0) { if (net->port) { mtu -= sizeof(struct udphdr); } if ((stcb != NULL) && (stcb->asoc.smallest_mtu > mtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, mtu); } net->mtu = mtu; } } else if (ifp) { if (ND_IFINFO(ifp)->linkmtu && (stcb->asoc.smallest_mtu > ND_IFINFO(ifp)->linkmtu)) { sctp_mtu_size_reset(inp, &stcb->asoc, ND_IFINFO(ifp)->linkmtu); } } } return (ret); } #endif default: SCTPDBG(SCTP_DEBUG_OUTPUT1, "Unknown protocol (TSNH) type %d\n", ((struct sockaddr *)to)->sa_family); sctp_m_freem(m); SCTP_LTRACE_ERR_RET_PKT(m, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); return (EFAULT); } } void sctp_send_initiate(struct sctp_inpcb *inp, struct sctp_tcb *stcb, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct mbuf *m, *m_last; struct sctp_nets *net; struct sctp_init_chunk *init; struct sctp_supported_addr_param *sup_addr; struct sctp_adaptation_layer_indication *ali; struct sctp_supported_chunk_types_param *pr_supported; struct sctp_paramhdr *ph; int cnt_inits_to = 0; int error; uint16_t num_ext, chunk_len, padding_len, parameter_len; /* INIT's always go to the primary (and usually ONLY address) */ net = stcb->asoc.primary_destination; if (net == NULL) { net = TAILQ_FIRST(&stcb->asoc.nets); if (net == NULL) { /* TSNH */ return; } /* we confirm any address we send an INIT to */ net->dest_state &= ~SCTP_ADDR_UNCONFIRMED; (void)sctp_set_primary_addr(stcb, NULL, net); } else { /* we confirm any address we send an INIT to */ net->dest_state &= ~SCTP_ADDR_UNCONFIRMED; } SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT\n"); #ifdef INET6 if (net->ro._l_addr.sa.sa_family == AF_INET6) { /* * special hook, if we are sending to link local it will not * show up in our private address count. 
*/ if (IN6_IS_ADDR_LINKLOCAL(&net->ro._l_addr.sin6.sin6_addr)) cnt_inits_to = 1; } #endif if (SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* This case should not happen */ SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - failed timer?\n"); return; } /* start the INIT timer */ sctp_timer_start(SCTP_TIMER_TYPE_INIT, inp, stcb, net); m = sctp_get_mbuf_for_msg(MCLBYTES, 1, M_NOWAIT, 1, MT_DATA); if (m == NULL) { /* No memory, INIT timer will re-attempt. */ SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - mbuf?\n"); return; } chunk_len = (uint16_t)sizeof(struct sctp_init_chunk); padding_len = 0; /* Now lets put the chunk header in place */ init = mtod(m, struct sctp_init_chunk *); /* now the chunk header */ init->ch.chunk_type = SCTP_INITIATION; init->ch.chunk_flags = 0; /* fill in later from mbuf we build */ init->ch.chunk_length = 0; /* place in my tag */ init->init.initiate_tag = htonl(stcb->asoc.my_vtag); /* set up some of the credits. */ init->init.a_rwnd = htonl(max(inp->sctp_socket ? SCTP_SB_LIMIT_RCV(inp->sctp_socket) : 0, SCTP_MINIMAL_RWND)); init->init.num_outbound_streams = htons(stcb->asoc.pre_open_streams); init->init.num_inbound_streams = htons(stcb->asoc.max_inbound_streams); init->init.initial_tsn = htonl(stcb->asoc.init_seq_number); /* Adaptation layer indication parameter */ if (inp->sctp_ep.adaptation_layer_indicator_provided) { parameter_len = (uint16_t)sizeof(struct sctp_adaptation_layer_indication); ali = (struct sctp_adaptation_layer_indication *)(mtod(m, caddr_t)+chunk_len); ali->ph.param_type = htons(SCTP_ULP_ADAPTATION); ali->ph.param_length = htons(parameter_len); ali->indication = htonl(inp->sctp_ep.adaptation_layer_indicator); chunk_len += parameter_len; } /* ECN parameter */ if (stcb->asoc.ecn_supported == 1) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_ECN_CAPABLE); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* PR-SCTP supported parameter */ if (stcb->asoc.prsctp_supported == 1) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_PRSCTP_SUPPORTED); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* Add NAT friendly parameter. 
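The parameter blocks that follow all pad each TLV to a 4-byte boundary: the length field counts the real bytes, SCTP_SIZE32() gives the padded size used to find the next parameter, and the difference rides in padding_len until it has to be zero-filled. A tiny sketch of that rounding, assuming a hypothetical 6-byte parameter:

#include <stdio.h>
#include <stdint.h>

static uint16_t
size32(uint16_t len)
{
	return ((uint16_t)((len + 3) & ~3));
}

int main(void)
{
	uint16_t parameter_len = 6;	/* hypothetical parameter length */
	uint16_t padding_len = size32(parameter_len) - parameter_len;

	printf("length on wire %d, padding %d, next param at offset %d\n",
	    parameter_len, padding_len, parameter_len + padding_len);
	return (0);
}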
*/ if (SCTP_BASE_SYSCTL(sctp_inits_include_nat_friendly)) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_HAS_NAT_SUPPORT); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* And now tell the peer which extensions we support */ num_ext = 0; pr_supported = (struct sctp_supported_chunk_types_param *)(mtod(m, caddr_t)+chunk_len); if (stcb->asoc.prsctp_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_FORWARD_CUM_TSN; if (stcb->asoc.idata_supported) { pr_supported->chunk_types[num_ext++] = SCTP_IFORWARD_CUM_TSN; } } if (stcb->asoc.auth_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_AUTHENTICATION; } if (stcb->asoc.asconf_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_ASCONF; pr_supported->chunk_types[num_ext++] = SCTP_ASCONF_ACK; } if (stcb->asoc.reconfig_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_STREAM_RESET; } if (stcb->asoc.idata_supported) { pr_supported->chunk_types[num_ext++] = SCTP_IDATA; } if (stcb->asoc.nrsack_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_NR_SELECTIVE_ACK; } if (stcb->asoc.pktdrop_supported == 1) { pr_supported->chunk_types[num_ext++] = SCTP_PACKET_DROPPED; } if (num_ext > 0) { parameter_len = (uint16_t)sizeof(struct sctp_supported_chunk_types_param) + num_ext; pr_supported->ph.param_type = htons(SCTP_SUPPORTED_CHUNK_EXT); pr_supported->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add authentication parameters */ if (stcb->asoc.auth_supported) { /* attach RANDOM parameter, if available */ if (stcb->asoc.authinfo.random != NULL) { struct sctp_auth_random *randp; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } randp = (struct sctp_auth_random *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)sizeof(struct sctp_auth_random) + stcb->asoc.authinfo.random_len; /* random key already contains the header */ memcpy(randp, stcb->asoc.authinfo.random->key, parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add HMAC_ALGO parameter */ if (stcb->asoc.local_hmacs != NULL) { struct sctp_auth_hmac_algo *hmacs; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } hmacs = (struct sctp_auth_hmac_algo *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)(sizeof(struct sctp_auth_hmac_algo) + stcb->asoc.local_hmacs->num_algo * sizeof(uint16_t)); hmacs->ph.param_type = htons(SCTP_HMAC_LIST); hmacs->ph.param_length = htons(parameter_len); sctp_serialize_hmaclist(stcb->asoc.local_hmacs, (uint8_t *)hmacs->hmac_ids); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add CHUNKS parameter */ if (stcb->asoc.local_auth_chunks != NULL) { struct sctp_auth_chunk_list *chunks; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } chunks = (struct sctp_auth_chunk_list *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)(sizeof(struct sctp_auth_chunk_list) + sctp_auth_get_chklist_size(stcb->asoc.local_auth_chunks)); chunks->ph.param_type = htons(SCTP_CHUNK_LIST); chunks->ph.param_length = htons(parameter_len); sctp_serialize_auth_chunks(stcb->asoc.local_auth_chunks, chunks->chunk_types); padding_len = 
SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } } /* now any cookie time extensions */ if (stcb->asoc.cookie_preserve_req) { struct sctp_cookie_perserve_param *cookie_preserve; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } parameter_len = (uint16_t)sizeof(struct sctp_cookie_perserve_param); cookie_preserve = (struct sctp_cookie_perserve_param *)(mtod(m, caddr_t)+chunk_len); cookie_preserve->ph.param_type = htons(SCTP_COOKIE_PRESERVE); cookie_preserve->ph.param_length = htons(parameter_len); cookie_preserve->time = htonl(stcb->asoc.cookie_preserve_req); stcb->asoc.cookie_preserve_req = 0; chunk_len += parameter_len; } if (stcb->asoc.scope.ipv4_addr_legal || stcb->asoc.scope.ipv6_addr_legal) { uint8_t i; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); if (stcb->asoc.scope.ipv4_addr_legal) { parameter_len += (uint16_t)sizeof(uint16_t); } if (stcb->asoc.scope.ipv6_addr_legal) { parameter_len += (uint16_t)sizeof(uint16_t); } sup_addr = (struct sctp_supported_addr_param *)(mtod(m, caddr_t)+chunk_len); sup_addr->ph.param_type = htons(SCTP_SUPPORTED_ADDRTYPE); sup_addr->ph.param_length = htons(parameter_len); i = 0; if (stcb->asoc.scope.ipv4_addr_legal) { sup_addr->addr_type[i++] = htons(SCTP_IPV4_ADDRESS); } if (stcb->asoc.scope.ipv6_addr_legal) { sup_addr->addr_type[i++] = htons(SCTP_IPV6_ADDRESS); } padding_len = 4 - 2 * i; chunk_len += parameter_len; } SCTP_BUF_LEN(m) = chunk_len; /* now the addresses */ /* * To optimize this we could put the scoping stuff into a structure * and remove the individual uint8's from the assoc structure. Then * we could just sifa in the address within the stcb. But for now * this is a quick hack to get the address stuff teased apart. */ m_last = sctp_add_addresses_to_i_ia(inp, stcb, &stcb->asoc.scope, m, cnt_inits_to, &padding_len, &chunk_len); init->ch.chunk_length = htons(chunk_len); if (padding_len > 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(m); return; } } SCTPDBG(SCTP_DEBUG_OUTPUT4, "Sending INIT - calls lowlevel_output\n"); if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, m, 0, NULL, 0, 0, 0, 0, inp->sctp_lport, stcb->rport, htonl(0), net->port, NULL, 0, 0, so_locked))) { SCTPDBG(SCTP_DEBUG_OUTPUT4, "Gak send error %d\n", error); if (error == ENOBUFS) { stcb->asoc.ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } } else { stcb->asoc.ifp_had_enobuf = 0; } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); (void)SCTP_GETTIME_TIMEVAL(&net->last_sent_time); } struct mbuf * sctp_arethere_unrecognized_parameters(struct mbuf *in_initpkt, int param_offset, int *abort_processing, struct sctp_chunkhdr *cp, int *nat_friendly) { /* * Given a mbuf containing an INIT or INIT-ACK with the param_offset * being equal to the beginning of the params i.e. (iphlen + * sizeof(struct sctp_init_msg) parse through the parameters to the * end of the mbuf verifying that all parameters are known. * * For unknown parameters build and return a mbuf with * UNRECOGNIZED_PARAMETER errors. If the flags indicate to stop * processing this chunk stop, and set *abort_processing to 1. * * By having param_offset be pre-set to where parameters begin it is * hoped that this routine may be reused in the future by new * features. 
*/ struct sctp_paramhdr *phdr, params; struct mbuf *mat, *op_err; - char tempbuf[SCTP_PARAM_BUFFER_SIZE]; int at, limit, pad_needed; uint16_t ptype, plen, padded_size; int err_at; *abort_processing = 0; mat = in_initpkt; err_at = 0; limit = ntohs(cp->chunk_length) - sizeof(struct sctp_init_chunk); at = param_offset; op_err = NULL; SCTPDBG(SCTP_DEBUG_OUTPUT1, "Check for unrecognized param's\n"); phdr = sctp_get_next_param(mat, at, ¶ms, sizeof(params)); while ((phdr != NULL) && ((size_t)limit >= sizeof(struct sctp_paramhdr))) { ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); if ((plen > limit) || (plen < sizeof(struct sctp_paramhdr))) { /* wacked parameter */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error %d\n", plen); goto invalid_size; } limit -= SCTP_SIZE32(plen); /*- * All parameters for all chunks that we know/understand are * listed here. We process them other places and make * appropriate stop actions per the upper bits. However this * is the generic routine processor's can call to get back * an operr.. to either incorporate (init-ack) or send. */ padded_size = SCTP_SIZE32(plen); switch (ptype) { /* Param's with variable size */ case SCTP_HEARTBEAT_INFO: case SCTP_STATE_COOKIE: case SCTP_UNRECOG_PARAM: case SCTP_ERROR_CAUSE_IND: /* ok skip fwd */ at += padded_size; break; /* Param's with variable size within a range */ case SCTP_CHUNK_LIST: case SCTP_SUPPORTED_CHUNK_EXT: if (padded_size > (sizeof(struct sctp_supported_chunk_types_param) + (sizeof(uint8_t) * SCTP_MAX_SUPPORTED_EXT))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error chklist %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SUPPORTED_ADDRTYPE: if (padded_size > SCTP_MAX_ADDR_PARAMS_SIZE) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error supaddrtype %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_RANDOM: if (padded_size > (sizeof(struct sctp_auth_random) + SCTP_RANDOM_MAX_SIZE)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error random %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SET_PRIM_ADDR: case SCTP_DEL_IP_ADDRESS: case SCTP_ADD_IP_ADDRESS: if ((padded_size != sizeof(struct sctp_asconf_addrv4_param)) && (padded_size != sizeof(struct sctp_asconf_addr_param))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error setprim %d\n", plen); goto invalid_size; } at += padded_size; break; /* Param's with a fixed size */ case SCTP_IPV4_ADDRESS: if (padded_size != sizeof(struct sctp_ipv4addr_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ipv4 addr %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_IPV6_ADDRESS: if (padded_size != sizeof(struct sctp_ipv6addr_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ipv6 addr %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_COOKIE_PRESERVE: if (padded_size != sizeof(struct sctp_cookie_perserve_param)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error cookie-preserve %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_HAS_NAT_SUPPORT: *nat_friendly = 1; /* fall through */ case SCTP_PRSCTP_SUPPORTED: if (padded_size != sizeof(struct sctp_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error prsctp/nat support %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_ECN_CAPABLE: if (padded_size != sizeof(struct sctp_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error ecn %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_ULP_ADAPTATION: if (padded_size != 
sizeof(struct sctp_adaptation_layer_indication)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error adapatation %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_SUCCESS_REPORT: if (padded_size != sizeof(struct sctp_asconf_paramhdr)) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Invalid size - error success %d\n", plen); goto invalid_size; } at += padded_size; break; case SCTP_HOSTNAME_ADDRESS: { /* We can NOT handle HOST NAME addresses!! */ int l_len; SCTPDBG(SCTP_DEBUG_OUTPUT1, "Can't handle hostname addresses.. abort processing\n"); *abort_processing = 1; if (op_err == NULL) { /* Ok need to try to get a mbuf */ #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); - l_len += plen; - l_len += sizeof(struct sctp_paramhdr); + l_len += sizeof(struct sctp_gen_error_cause); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; /* - * pre-reserve space for ip - * and sctp header and - * chunk hdr + * Pre-reserve space for IP, + * SCTP, and chunk header. */ #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if (op_err) { /* If we have space */ - struct sctp_paramhdr s; + struct sctp_gen_error_cause cause; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } - s.param_type = htons(SCTP_CAUSE_UNRESOLVABLE_ADDR); - s.param_length = htons(sizeof(s) + plen); - m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); - err_at += sizeof(s); - if (plen > sizeof(tempbuf)) { - plen = sizeof(tempbuf); - } - phdr = sctp_get_next_param(mat, at, (struct sctp_paramhdr *)tempbuf, plen); - if (phdr == NULL) { + cause.code = htons(SCTP_CAUSE_UNRESOLVABLE_ADDR); + cause.length = htons((uint16_t)(sizeof(struct sctp_gen_error_cause) + plen)); + m_copyback(op_err, err_at, sizeof(struct sctp_gen_error_cause), (caddr_t)&cause); + err_at += sizeof(struct sctp_gen_error_cause); + SCTP_BUF_NEXT(op_err) = SCTP_M_COPYM(mat, at, plen, M_NOWAIT); + if (SCTP_BUF_NEXT(op_err) == NULL) { sctp_m_freem(op_err); - /* - * we are out of memory but - * we still need to have a - * look at what to do (the - * system is in trouble - * though). - */ return (NULL); } - m_copyback(op_err, err_at, plen, (caddr_t)phdr); } return (op_err); break; } default: /* * we do not recognize the parameter figure out what * we do. */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "Hit default param %x\n", ptype); if ((ptype & 0x4000) == 0x4000) { /* Report bit is set?? 
*/ SCTPDBG(SCTP_DEBUG_OUTPUT1, "report op err\n"); if (op_err == NULL) { int l_len; /* Ok need to try to get an mbuf */ #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); - l_len += plen; l_len += sizeof(struct sctp_paramhdr); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if (op_err) { /* If we have space */ struct sctp_paramhdr s; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } s.param_type = htons(SCTP_UNRECOG_PARAM); - s.param_length = htons(sizeof(s) + plen); - m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); - err_at += sizeof(s); - if (plen > sizeof(tempbuf)) { - plen = sizeof(tempbuf); - } - phdr = sctp_get_next_param(mat, at, (struct sctp_paramhdr *)tempbuf, plen); - if (phdr == NULL) { + s.param_length = htons((uint16_t)sizeof(struct sctp_paramhdr) + plen); + m_copyback(op_err, err_at, sizeof(struct sctp_paramhdr), (caddr_t)&s); + err_at += sizeof(struct sctp_paramhdr); + SCTP_BUF_NEXT(op_err) = SCTP_M_COPYM(mat, at, plen, M_NOWAIT); + if (SCTP_BUF_NEXT(op_err) == NULL) { sctp_m_freem(op_err); /* * we are out of memory but * we still need to have a * look at what to do (the * system is in trouble * though). */ op_err = NULL; goto more_processing; } - m_copyback(op_err, err_at, plen, (caddr_t)phdr); err_at += plen; } } more_processing: if ((ptype & 0x8000) == 0x0000) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "stop proc\n"); return (op_err); } else { /* skip this chunk and continue processing */ SCTPDBG(SCTP_DEBUG_OUTPUT1, "move on\n"); at += SCTP_SIZE32(plen); } break; } phdr = sctp_get_next_param(mat, at, ¶ms, sizeof(params)); } return (op_err); invalid_size: SCTPDBG(SCTP_DEBUG_OUTPUT1, "abort flag set\n"); *abort_processing = 1; if ((op_err == NULL) && phdr) { int l_len; #ifdef INET6 l_len = SCTP_MIN_OVERHEAD; #else l_len = SCTP_MIN_V4_OVERHEAD; #endif l_len += sizeof(struct sctp_chunkhdr); l_len += (2 * sizeof(struct sctp_paramhdr)); op_err = sctp_get_mbuf_for_msg(l_len, 0, M_NOWAIT, 1, MT_DATA); if (op_err) { SCTP_BUF_LEN(op_err) = 0; #ifdef INET6 SCTP_BUF_RESV_UF(op_err, sizeof(struct ip6_hdr)); #else SCTP_BUF_RESV_UF(op_err, sizeof(struct ip)); #endif SCTP_BUF_RESV_UF(op_err, sizeof(struct sctphdr)); SCTP_BUF_RESV_UF(op_err, sizeof(struct sctp_chunkhdr)); } } if ((op_err) && phdr) { struct sctp_paramhdr s; if (err_at % 4) { uint32_t cpthis = 0; pad_needed = 4 - (err_at % 4); m_copyback(op_err, err_at, pad_needed, (caddr_t)&cpthis); err_at += pad_needed; } s.param_type = htons(SCTP_CAUSE_PROTOCOL_VIOLATION); s.param_length = htons(sizeof(s) + sizeof(struct sctp_paramhdr)); m_copyback(op_err, err_at, sizeof(s), (caddr_t)&s); err_at += sizeof(s); /* Only copy back the p-hdr that caused the issue */ m_copyback(op_err, err_at, sizeof(struct sctp_paramhdr), (caddr_t)phdr); } return (op_err); } static int sctp_are_there_new_addresses(struct sctp_association *asoc, struct mbuf *in_initpkt, int offset, struct sockaddr *src) { /* * Given a INIT packet, look through the packet to verify that there * are NO new addresses. As we go through the parameters add reports * of any un-understood parameters that require an error. 
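The (ptype & 0x4000) and (ptype & 0x8000) tests above implement the RFC 4960 section 3.2.1 rules for unknown parameters: the top bit says whether to keep processing the rest of the chunk, the next bit whether to report the parameter back in an Unrecognized Parameter cause. Decoded in isolation, with a made-up parameter type:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint16_t ptype = 0xc006;	/* made-up unrecognized parameter type */
	int report = (ptype & 0x4000) != 0;
	int keep_going = (ptype & 0x8000) != 0;

	printf("report in op_err: %d, continue processing: %d\n",
	    report, keep_going);
	return (0);
}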
Also we * must return (1) to drop the packet if we see a un-understood * parameter that tells us to drop the chunk. */ struct sockaddr *sa_touse; struct sockaddr *sa; struct sctp_paramhdr *phdr, params; uint16_t ptype, plen; uint8_t fnd; struct sctp_nets *net; int check_src; #ifdef INET struct sockaddr_in sin4, *sa4; #endif #ifdef INET6 struct sockaddr_in6 sin6, *sa6; #endif #ifdef INET memset(&sin4, 0, sizeof(sin4)); sin4.sin_family = AF_INET; sin4.sin_len = sizeof(sin4); #endif #ifdef INET6 memset(&sin6, 0, sizeof(sin6)); sin6.sin6_family = AF_INET6; sin6.sin6_len = sizeof(sin6); #endif /* First what about the src address of the pkt ? */ check_src = 0; switch (src->sa_family) { #ifdef INET case AF_INET: if (asoc->scope.ipv4_addr_legal) { check_src = 1; } break; #endif #ifdef INET6 case AF_INET6: if (asoc->scope.ipv6_addr_legal) { check_src = 1; } break; #endif default: /* TSNH */ break; } if (check_src) { fnd = 0; TAILQ_FOREACH(net, &asoc->nets, sctp_next) { sa = (struct sockaddr *)&net->ro._l_addr; if (sa->sa_family == src->sa_family) { #ifdef INET if (sa->sa_family == AF_INET) { struct sockaddr_in *src4; sa4 = (struct sockaddr_in *)sa; src4 = (struct sockaddr_in *)src; if (sa4->sin_addr.s_addr == src4->sin_addr.s_addr) { fnd = 1; break; } } #endif #ifdef INET6 if (sa->sa_family == AF_INET6) { struct sockaddr_in6 *src6; sa6 = (struct sockaddr_in6 *)sa; src6 = (struct sockaddr_in6 *)src; if (SCTP6_ARE_ADDR_EQUAL(sa6, src6)) { fnd = 1; break; } } #endif } } if (fnd == 0) { /* New address added! no need to look further. */ return (1); } } /* Ok so far lets munge through the rest of the packet */ offset += sizeof(struct sctp_init_chunk); phdr = sctp_get_next_param(in_initpkt, offset, ¶ms, sizeof(params)); while (phdr) { sa_touse = NULL; ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); switch (ptype) { #ifdef INET case SCTP_IPV4_ADDRESS: { struct sctp_ipv4addr_param *p4, p4_buf; if (plen != sizeof(struct sctp_ipv4addr_param)) { return (1); } phdr = sctp_get_next_param(in_initpkt, offset, (struct sctp_paramhdr *)&p4_buf, sizeof(p4_buf)); if (phdr == NULL) { return (1); } if (asoc->scope.ipv4_addr_legal) { p4 = (struct sctp_ipv4addr_param *)phdr; sin4.sin_addr.s_addr = p4->addr; sa_touse = (struct sockaddr *)&sin4; } break; } #endif #ifdef INET6 case SCTP_IPV6_ADDRESS: { struct sctp_ipv6addr_param *p6, p6_buf; if (plen != sizeof(struct sctp_ipv6addr_param)) { return (1); } phdr = sctp_get_next_param(in_initpkt, offset, (struct sctp_paramhdr *)&p6_buf, sizeof(p6_buf)); if (phdr == NULL) { return (1); } if (asoc->scope.ipv6_addr_legal) { p6 = (struct sctp_ipv6addr_param *)phdr; memcpy((caddr_t)&sin6.sin6_addr, p6->addr, sizeof(p6->addr)); sa_touse = (struct sockaddr *)&sin6; } break; } #endif default: sa_touse = NULL; break; } if (sa_touse) { /* ok, sa_touse points to one to check */ fnd = 0; TAILQ_FOREACH(net, &asoc->nets, sctp_next) { sa = (struct sockaddr *)&net->ro._l_addr; if (sa->sa_family != sa_touse->sa_family) { continue; } #ifdef INET if (sa->sa_family == AF_INET) { sa4 = (struct sockaddr_in *)sa; if (sa4->sin_addr.s_addr == sin4.sin_addr.s_addr) { fnd = 1; break; } } #endif #ifdef INET6 if (sa->sa_family == AF_INET6) { sa6 = (struct sockaddr_in6 *)sa; if (SCTP6_ARE_ADDR_EQUAL( sa6, &sin6)) { fnd = 1; break; } } #endif } if (!fnd) { /* New addr added! 
no need to look further */ return (1); } } offset += SCTP_SIZE32(plen); phdr = sctp_get_next_param(in_initpkt, offset, ¶ms, sizeof(params)); } return (0); } /* * Given a MBUF chain that was sent into us containing an INIT. Build a * INIT-ACK with COOKIE and send back. We assume that the in_initpkt has done * a pullup to include IPv6/4header, SCTP header and initial part of INIT * message (i.e. the struct sctp_init_msg). */ void sctp_send_initiate_ack(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_nets *src_net, struct mbuf *init_pkt, int iphlen, int offset, struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, struct sctp_init_chunk *init_chk, uint8_t mflowtype, uint32_t mflowid, uint32_t vrf_id, uint16_t port) { struct sctp_association *asoc; struct mbuf *m, *m_tmp, *m_last, *m_cookie, *op_err; struct sctp_init_ack_chunk *initack; struct sctp_adaptation_layer_indication *ali; struct sctp_supported_chunk_types_param *pr_supported; struct sctp_paramhdr *ph; union sctp_sockstore *over_addr; struct sctp_scoping scp; struct timeval now; #ifdef INET struct sockaddr_in *dst4 = (struct sockaddr_in *)dst; struct sockaddr_in *src4 = (struct sockaddr_in *)src; struct sockaddr_in *sin; #endif #ifdef INET6 struct sockaddr_in6 *dst6 = (struct sockaddr_in6 *)dst; struct sockaddr_in6 *src6 = (struct sockaddr_in6 *)src; struct sockaddr_in6 *sin6; #endif struct sockaddr *to; struct sctp_state_cookie stc; struct sctp_nets *net = NULL; uint8_t *signature = NULL; int cnt_inits_to = 0; uint16_t his_limit, i_want; int abort_flag; int nat_friendly = 0; int error; struct socket *so; uint16_t num_ext, chunk_len, padding_len, parameter_len; if (stcb) { asoc = &stcb->asoc; } else { asoc = NULL; } if ((asoc != NULL) && (SCTP_GET_STATE(stcb) != SCTP_STATE_COOKIE_WAIT)) { if (sctp_are_there_new_addresses(asoc, init_pkt, offset, src)) { /* * new addresses, out of here in non-cookie-wait * states * * Send an ABORT, without the new address error * cause. This looks no different than if no * listener was present. */ op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), "Address added"); sctp_send_abort(init_pkt, iphlen, src, dst, sh, 0, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } if (src_net != NULL && (src_net->port != port)) { /* * change of remote encapsulation port, out of here * in non-cookie-wait states * * Send an ABORT, without an specific error cause. * This looks no different than if no listener was * present. */ op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), "Remote encapsulation port changed"); sctp_send_abort(init_pkt, iphlen, src, dst, sh, 0, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } } abort_flag = 0; op_err = sctp_arethere_unrecognized_parameters(init_pkt, (offset + sizeof(struct sctp_init_chunk)), &abort_flag, (struct sctp_chunkhdr *)init_chk, &nat_friendly); if (abort_flag) { do_a_abort: if (op_err == NULL) { char msg[SCTP_DIAG_INFO_LEN]; snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); } sctp_send_abort(init_pkt, iphlen, src, dst, sh, init_chk->init.initiate_tag, op_err, mflowtype, mflowid, inp->fibnum, vrf_id, port); return; } m = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (m == NULL) { /* No memory, INIT timer will re-attempt. 
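 * (Dropping the reply here is harmless: the peer's T1-init timer will
 * expire and the INIT will be retransmitted, per RFC 4960.)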
*/ if (op_err) sctp_m_freem(op_err); return; } chunk_len = (uint16_t)sizeof(struct sctp_init_ack_chunk); padding_len = 0; /* * We might not overwrite the identification[] completely and on * some platforms time_entered will contain some padding. Therefore * zero out the cookie to avoid putting uninitialized memory on the * wire. */ memset(&stc, 0, sizeof(struct sctp_state_cookie)); /* the time I built cookie */ (void)SCTP_GETTIME_TIMEVAL(&now); stc.time_entered.tv_sec = now.tv_sec; stc.time_entered.tv_usec = now.tv_usec; /* populate any tie tags */ if (asoc != NULL) { /* unlock before tag selections */ stc.tie_tag_my_vtag = asoc->my_vtag_nonce; stc.tie_tag_peer_vtag = asoc->peer_vtag_nonce; stc.cookie_life = asoc->cookie_life; net = asoc->primary_destination; } else { stc.tie_tag_my_vtag = 0; stc.tie_tag_peer_vtag = 0; /* life I will award this cookie */ stc.cookie_life = inp->sctp_ep.def_cookie_life; } /* copy in the ports for later check */ stc.myport = sh->dest_port; stc.peerport = sh->src_port; /* * If we wanted to honor cookie life extensions, we would add to * stc.cookie_life. For now we should NOT honor any extension */ stc.site_scope = stc.local_scope = stc.loopback_scope = 0; if (inp->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { stc.ipv6_addr_legal = 1; if (SCTP_IPV6_V6ONLY(inp)) { stc.ipv4_addr_legal = 0; } else { stc.ipv4_addr_legal = 1; } } else { stc.ipv6_addr_legal = 0; stc.ipv4_addr_legal = 1; } stc.ipv4_scope = 0; if (net == NULL) { to = src; switch (dst->sa_family) { #ifdef INET case AF_INET: { /* lookup address */ stc.address[0] = src4->sin_addr.s_addr; stc.address[1] = 0; stc.address[2] = 0; stc.address[3] = 0; stc.addr_type = SCTP_IPV4_ADDRESS; /* local from address */ stc.laddress[0] = dst4->sin_addr.s_addr; stc.laddress[1] = 0; stc.laddress[2] = 0; stc.laddress[3] = 0; stc.laddr_type = SCTP_IPV4_ADDRESS; /* scope_id is only for v6 */ stc.scope_id = 0; if ((IN4_ISPRIVATE_ADDRESS(&src4->sin_addr)) || (IN4_ISPRIVATE_ADDRESS(&dst4->sin_addr))) { stc.ipv4_scope = 1; } /* Must use the address in this case */ if (sctp_is_address_on_local_host(src, vrf_id)) { stc.loopback_scope = 1; stc.ipv4_scope = 1; stc.site_scope = 1; stc.local_scope = 0; } break; } #endif #ifdef INET6 case AF_INET6: { stc.addr_type = SCTP_IPV6_ADDRESS; memcpy(&stc.address, &src6->sin6_addr, sizeof(struct in6_addr)); stc.scope_id = ntohs(in6_getscope(&src6->sin6_addr)); if (sctp_is_address_on_local_host(src, vrf_id)) { stc.loopback_scope = 1; stc.local_scope = 0; stc.site_scope = 1; stc.ipv4_scope = 1; } else if (IN6_IS_ADDR_LINKLOCAL(&src6->sin6_addr) || IN6_IS_ADDR_LINKLOCAL(&dst6->sin6_addr)) { /* * If the new destination or source * is a LINK_LOCAL we must have * common both site and local scope. * Don't set local scope though * since we must depend on the * source to be added implicitly. We * cannot assure just because we * share one link that all links are * common. */ stc.local_scope = 0; stc.site_scope = 1; stc.ipv4_scope = 1; /* * we start counting for the private * address stuff at 1. since the * link local we source from won't * show up in our scoped count. */ cnt_inits_to = 1; /* * pull out the scope_id from * incoming pkt */ } else if (IN6_IS_ADDR_SITELOCAL(&src6->sin6_addr) || IN6_IS_ADDR_SITELOCAL(&dst6->sin6_addr)) { /* * If the new destination or source * is SITE_LOCAL then we must have * site scope in common. 
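 * All of these scope decisions end up in the state cookie (stc), so an
 * association later rebuilt from the echoed COOKIE inherits the same
 * address scoping computed here.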
*/ stc.site_scope = 1; } memcpy(&stc.laddress, &dst6->sin6_addr, sizeof(struct in6_addr)); stc.laddr_type = SCTP_IPV6_ADDRESS; break; } #endif default: /* TSNH */ goto do_a_abort; break; } } else { /* set the scope per the existing tcb */ #ifdef INET6 struct sctp_nets *lnet; #endif stc.loopback_scope = asoc->scope.loopback_scope; stc.ipv4_scope = asoc->scope.ipv4_local_scope; stc.site_scope = asoc->scope.site_scope; stc.local_scope = asoc->scope.local_scope; #ifdef INET6 /* Why do we not consider IPv4 LL addresses? */ TAILQ_FOREACH(lnet, &asoc->nets, sctp_next) { if (lnet->ro._l_addr.sin6.sin6_family == AF_INET6) { if (IN6_IS_ADDR_LINKLOCAL(&lnet->ro._l_addr.sin6.sin6_addr)) { /* * if we have a LL address, start * counting at 1. */ cnt_inits_to = 1; } } } #endif /* use the net pointer */ to = (struct sockaddr *)&net->ro._l_addr; switch (to->sa_family) { #ifdef INET case AF_INET: sin = (struct sockaddr_in *)to; stc.address[0] = sin->sin_addr.s_addr; stc.address[1] = 0; stc.address[2] = 0; stc.address[3] = 0; stc.addr_type = SCTP_IPV4_ADDRESS; if (net->src_addr_selected == 0) { /* * strange case here, the INIT should have * did the selection. */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, (sctp_route_t *)&net->ro, net, 0, vrf_id); if (net->ro._s_addr == NULL) return; net->src_addr_selected = 1; } stc.laddress[0] = net->ro._s_addr->address.sin.sin_addr.s_addr; stc.laddress[1] = 0; stc.laddress[2] = 0; stc.laddress[3] = 0; stc.laddr_type = SCTP_IPV4_ADDRESS; /* scope_id is only for v6 */ stc.scope_id = 0; break; #endif #ifdef INET6 case AF_INET6: sin6 = (struct sockaddr_in6 *)to; memcpy(&stc.address, &sin6->sin6_addr, sizeof(struct in6_addr)); stc.addr_type = SCTP_IPV6_ADDRESS; stc.scope_id = sin6->sin6_scope_id; if (net->src_addr_selected == 0) { /* * strange case here, the INIT should have * done the selection. */ net->ro._s_addr = sctp_source_address_selection(inp, stcb, (sctp_route_t *)&net->ro, net, 0, vrf_id); if (net->ro._s_addr == NULL) return; net->src_addr_selected = 1; } memcpy(&stc.laddress, &net->ro._s_addr->address.sin6.sin6_addr, sizeof(struct in6_addr)); stc.laddr_type = SCTP_IPV6_ADDRESS; break; #endif } } /* Now lets put the SCTP header in place */ initack = mtod(m, struct sctp_init_ack_chunk *); /* Save it off for quick ref */ stc.peers_vtag = ntohl(init_chk->init.initiate_tag); /* who are we */ memcpy(stc.identification, SCTP_VERSION_STRING, min(strlen(SCTP_VERSION_STRING), sizeof(stc.identification))); memset(stc.reserved, 0, SCTP_RESERVE_SPACE); /* now the chunk header */ initack->ch.chunk_type = SCTP_INITIATION_ACK; initack->ch.chunk_flags = 0; /* fill in later from mbuf we build */ initack->ch.chunk_length = 0; /* place in my tag */ if ((asoc != NULL) && ((SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_INUSE) || (SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_ECHOED))) { /* re-use the v-tags and init-seq here */ initack->init.initiate_tag = htonl(asoc->my_vtag); initack->init.initial_tsn = htonl(asoc->init_seq_number); } else { uint32_t vtag, itsn; if (asoc) { atomic_add_int(&asoc->refcnt, 1); SCTP_TCB_UNLOCK(stcb); new_tag: vtag = sctp_select_a_tag(inp, inp->sctp_lport, sh->src_port, 1); if ((asoc->peer_supports_nat) && (vtag == asoc->my_vtag)) { /* * Got a duplicate vtag on some guy behind a * nat make sure we don't use it. 
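 * (With a NAT-friendly peer the verification tag is what tells
 * associations sharing one external address apart, so handing back the
 * peer's own vtag would make them ambiguous; hence the retry with a
 * freshly selected tag.)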
*/ goto new_tag; } initack->init.initiate_tag = htonl(vtag); /* get a TSN to use too */ itsn = sctp_select_initial_TSN(&inp->sctp_ep); initack->init.initial_tsn = htonl(itsn); SCTP_TCB_LOCK(stcb); atomic_add_int(&asoc->refcnt, -1); } else { SCTP_INP_INCR_REF(inp); SCTP_INP_RUNLOCK(inp); vtag = sctp_select_a_tag(inp, inp->sctp_lport, sh->src_port, 1); initack->init.initiate_tag = htonl(vtag); /* get a TSN to use too */ initack->init.initial_tsn = htonl(sctp_select_initial_TSN(&inp->sctp_ep)); SCTP_INP_RLOCK(inp); SCTP_INP_DECR_REF(inp); } } /* save away my tag to */ stc.my_vtag = initack->init.initiate_tag; /* set up some of the credits. */ so = inp->sctp_socket; if (so == NULL) { /* memory problem */ sctp_m_freem(m); return; } else { initack->init.a_rwnd = htonl(max(SCTP_SB_LIMIT_RCV(so), SCTP_MINIMAL_RWND)); } /* set what I want */ his_limit = ntohs(init_chk->init.num_inbound_streams); /* choose what I want */ if (asoc != NULL) { if (asoc->streamoutcnt > asoc->pre_open_streams) { i_want = asoc->streamoutcnt; } else { i_want = asoc->pre_open_streams; } } else { i_want = inp->sctp_ep.pre_open_stream_count; } if (his_limit < i_want) { /* I Want more :< */ initack->init.num_outbound_streams = init_chk->init.num_inbound_streams; } else { /* I can have what I want :> */ initack->init.num_outbound_streams = htons(i_want); } /* tell him his limit. */ initack->init.num_inbound_streams = htons(inp->sctp_ep.max_open_streams_intome); /* adaptation layer indication parameter */ if (inp->sctp_ep.adaptation_layer_indicator_provided) { parameter_len = (uint16_t)sizeof(struct sctp_adaptation_layer_indication); ali = (struct sctp_adaptation_layer_indication *)(mtod(m, caddr_t)+chunk_len); ali->ph.param_type = htons(SCTP_ULP_ADAPTATION); ali->ph.param_length = htons(parameter_len); ali->indication = htonl(inp->sctp_ep.adaptation_layer_indicator); chunk_len += parameter_len; } /* ECN parameter */ if (((asoc != NULL) && (asoc->ecn_supported == 1)) || ((asoc == NULL) && (inp->ecn_supported == 1))) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_ECN_CAPABLE); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* PR-SCTP supported parameter */ if (((asoc != NULL) && (asoc->prsctp_supported == 1)) || ((asoc == NULL) && (inp->prsctp_supported == 1))) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_PRSCTP_SUPPORTED); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* Add NAT friendly parameter */ if (nat_friendly) { parameter_len = (uint16_t)sizeof(struct sctp_paramhdr); ph = (struct sctp_paramhdr *)(mtod(m, caddr_t)+chunk_len); ph->param_type = htons(SCTP_HAS_NAT_SUPPORT); ph->param_length = htons(parameter_len); chunk_len += parameter_len; } /* And now tell the peer which extensions we support */ num_ext = 0; pr_supported = (struct sctp_supported_chunk_types_param *)(mtod(m, caddr_t)+chunk_len); if (((asoc != NULL) && (asoc->prsctp_supported == 1)) || ((asoc == NULL) && (inp->prsctp_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_FORWARD_CUM_TSN; if (((asoc != NULL) && (asoc->idata_supported == 1)) || ((asoc == NULL) && (inp->idata_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_IFORWARD_CUM_TSN; } } if (((asoc != NULL) && (asoc->auth_supported == 1)) || ((asoc == NULL) && (inp->auth_supported == 1))) { pr_supported->chunk_types[num_ext++] = 
SCTP_AUTHENTICATION; } if (((asoc != NULL) && (asoc->asconf_supported == 1)) || ((asoc == NULL) && (inp->asconf_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_ASCONF; pr_supported->chunk_types[num_ext++] = SCTP_ASCONF_ACK; } if (((asoc != NULL) && (asoc->reconfig_supported == 1)) || ((asoc == NULL) && (inp->reconfig_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_STREAM_RESET; } if (((asoc != NULL) && (asoc->idata_supported == 1)) || ((asoc == NULL) && (inp->idata_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_IDATA; } if (((asoc != NULL) && (asoc->nrsack_supported == 1)) || ((asoc == NULL) && (inp->nrsack_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_NR_SELECTIVE_ACK; } if (((asoc != NULL) && (asoc->pktdrop_supported == 1)) || ((asoc == NULL) && (inp->pktdrop_supported == 1))) { pr_supported->chunk_types[num_ext++] = SCTP_PACKET_DROPPED; } if (num_ext > 0) { parameter_len = (uint16_t)sizeof(struct sctp_supported_chunk_types_param) + num_ext; pr_supported->ph.param_type = htons(SCTP_SUPPORTED_CHUNK_EXT); pr_supported->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } /* add authentication parameters */ if (((asoc != NULL) && (asoc->auth_supported == 1)) || ((asoc == NULL) && (inp->auth_supported == 1))) { struct sctp_auth_random *randp; struct sctp_auth_hmac_algo *hmacs; struct sctp_auth_chunk_list *chunks; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* generate and add RANDOM parameter */ randp = (struct sctp_auth_random *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)sizeof(struct sctp_auth_random) + SCTP_AUTH_RANDOM_SIZE_DEFAULT; randp->ph.param_type = htons(SCTP_RANDOM); randp->ph.param_length = htons(parameter_len); SCTP_READ_RANDOM(randp->random_data, SCTP_AUTH_RANDOM_SIZE_DEFAULT); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* add HMAC_ALGO parameter */ hmacs = (struct sctp_auth_hmac_algo *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)sizeof(struct sctp_auth_hmac_algo) + sctp_serialize_hmaclist(inp->sctp_ep.local_hmacs, (uint8_t *)hmacs->hmac_ids); hmacs->ph.param_type = htons(SCTP_HMAC_LIST); hmacs->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; padding_len = 0; } /* add CHUNKS parameter */ chunks = (struct sctp_auth_chunk_list *)(mtod(m, caddr_t)+chunk_len); parameter_len = (uint16_t)sizeof(struct sctp_auth_chunk_list) + sctp_serialize_auth_chunks(inp->sctp_ep.local_auth_chunks, chunks->chunk_types); chunks->ph.param_type = htons(SCTP_CHUNK_LIST); chunks->ph.param_length = htons(parameter_len); padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; } SCTP_BUF_LEN(m) = chunk_len; m_last = m; /* now the addresses */ /* * To optimize this we could put the scoping stuff into a structure * and remove the individual uint8's from the stc structure. Then we * could just sifa in the address within the stc.. but for now this * is a quick hack to get the address stuff teased apart. 
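 * Note on the parameter bookkeeping used above and below: each
 * parameter is written at mtod(m, caddr_t) + chunk_len and must start
 * on a 32-bit boundary, so the pad for a parameter of length len is
 * SCTP_SIZE32(len) - len (e.g. a 6-byte parameter is followed by 2 pad
 * bytes). The pad is carried in padding_len and only written out when
 * the next parameter (or the final padding) is appended.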
*/ scp.ipv4_addr_legal = stc.ipv4_addr_legal; scp.ipv6_addr_legal = stc.ipv6_addr_legal; scp.loopback_scope = stc.loopback_scope; scp.ipv4_local_scope = stc.ipv4_scope; scp.local_scope = stc.local_scope; scp.site_scope = stc.site_scope; m_last = sctp_add_addresses_to_i_ia(inp, stcb, &scp, m_last, cnt_inits_to, &padding_len, &chunk_len); /* padding_len can only be positive, if no addresses have been added */ if (padding_len > 0) { memset(mtod(m, caddr_t)+chunk_len, 0, padding_len); chunk_len += padding_len; SCTP_BUF_LEN(m) += padding_len; padding_len = 0; } /* tack on the operational error if present */ if (op_err) { parameter_len = 0; for (m_tmp = op_err; m_tmp != NULL; m_tmp = SCTP_BUF_NEXT(m_tmp)) { parameter_len += SCTP_BUF_LEN(m_tmp); } padding_len = SCTP_SIZE32(parameter_len) - parameter_len; SCTP_BUF_NEXT(m_last) = op_err; while (SCTP_BUF_NEXT(m_last) != NULL) { m_last = SCTP_BUF_NEXT(m_last); } chunk_len += parameter_len; } if (padding_len > 0) { m_last = sctp_add_pad_tombuf(m_last, padding_len); if (m_last == NULL) { /* Houston we have a problem, no space */ sctp_m_freem(m); return; } chunk_len += padding_len; padding_len = 0; } /* Now we must build a cookie */ m_cookie = sctp_add_cookie(init_pkt, offset, m, 0, &stc, &signature); if (m_cookie == NULL) { /* memory problem */ sctp_m_freem(m); return; } /* Now append the cookie to the end and update the space/size */ SCTP_BUF_NEXT(m_last) = m_cookie; parameter_len = 0; for (m_tmp = m_cookie; m_tmp != NULL; m_tmp = SCTP_BUF_NEXT(m_tmp)) { parameter_len += SCTP_BUF_LEN(m_tmp); if (SCTP_BUF_NEXT(m_tmp) == NULL) { m_last = m_tmp; } } padding_len = SCTP_SIZE32(parameter_len) - parameter_len; chunk_len += parameter_len; /* * Place in the size, but we don't include the last pad (if any) in * the INIT-ACK. */ initack->ch.chunk_length = htons(chunk_len); /* * Time to sign the cookie, we don't sign over the cookie signature * though thus we set trailer. */ (void)sctp_hmac_m(SCTP_HMAC, (uint8_t *)inp->sctp_ep.secret_key[(int)(inp->sctp_ep.current_secret_number)], SCTP_SECRET_SIZE, m_cookie, sizeof(struct sctp_paramhdr), (uint8_t *)signature, SCTP_SIGNATURE_SIZE); /* * We sifa 0 here to NOT set IP_DF if its IPv4, we ignore the return * here since the timer will drive a retranmission. */ if (padding_len > 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(m); return; } } if (stc.loopback_scope) { over_addr = (union sctp_sockstore *)dst; } else { over_addr = NULL; } if ((error = sctp_lowlevel_chunk_output(inp, NULL, NULL, to, m, 0, NULL, 0, 0, 0, 0, inp->sctp_lport, sh->src_port, init_chk->init.initiate_tag, port, over_addr, mflowtype, mflowid, SCTP_SO_NOT_LOCKED))) { SCTPDBG(SCTP_DEBUG_OUTPUT4, "Gak send error %d\n", error); if (error == ENOBUFS) { if (asoc != NULL) { asoc->ifp_had_enobuf = 1; } SCTP_STAT_INCR(sctps_lowlevelerr); } } else { if (asoc != NULL) { asoc->ifp_had_enobuf = 0; } } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } static void sctp_prune_prsctp(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_sndrcvinfo *srcv, int dataout) { int freed_spc = 0; struct sctp_tmit_chunk *chk, *nchk; SCTP_TCB_LOCK_ASSERT(stcb); if ((asoc->prsctp_supported) && (asoc->sent_queue_cnt_removeable > 0)) { TAILQ_FOREACH(chk, &asoc->sent_queue, sctp_next) { /* * Look for chunks marked with the PR_SCTP flag AND * the buffer space flag. If the one being sent is * equal or greater priority then purge the old one * and free some space. 
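 * For the buffer-limited policy the priority travels in
 * rec.data.timetodrop.tv_sec, so an old chunk whose stored priority is
 * numerically >= the new send's sinfo_timetolive (lower number means
 * higher priority) is eligible to be dropped.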
*/ if (PR_SCTP_BUF_ENABLED(chk->flags)) { /* * This one is PR-SCTP AND buffer space * limited type */ if (chk->rec.data.timetodrop.tv_sec >= (long)srcv->sinfo_timetolive) { /* * Lower numbers equates to higher * priority so if the one we are * looking at has a larger or equal * priority we want to drop the data * and NOT retransmit it. */ if (chk->data) { /* * We release the book_size * if the mbuf is here */ int ret_spc; uint8_t sent; if (chk->sent > SCTP_DATAGRAM_UNSENT) sent = 1; else sent = 0; ret_spc = sctp_release_pr_sctp_chunk(stcb, chk, sent, SCTP_SO_LOCKED); freed_spc += ret_spc; if (freed_spc >= dataout) { return; } } /* if chunk was present */ } /* if of sufficient priority */ } /* if chunk has enabled */ } /* tailqforeach */ TAILQ_FOREACH_SAFE(chk, &asoc->send_queue, sctp_next, nchk) { /* Here we must move to the sent queue and mark */ if (PR_SCTP_BUF_ENABLED(chk->flags)) { if (chk->rec.data.timetodrop.tv_sec >= (long)srcv->sinfo_timetolive) { if (chk->data) { /* * We release the book_size * if the mbuf is here */ int ret_spc; ret_spc = sctp_release_pr_sctp_chunk(stcb, chk, 0, SCTP_SO_LOCKED); freed_spc += ret_spc; if (freed_spc >= dataout) { return; } } /* end if chk->data */ } /* end if right class */ } /* end if chk pr-sctp */ } /* tailqforeachsafe (chk) */ } /* if enabled in asoc */ } int sctp_get_frag_point(struct sctp_tcb *stcb, struct sctp_association *asoc) { int siz, ovh; /* * For endpoints that have both v6 and v4 addresses we must reserve * room for the ipv6 header, for those that are only dealing with V4 * we use a larger frag point. */ if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { ovh = SCTP_MIN_OVERHEAD; } else { ovh = SCTP_MIN_V4_OVERHEAD; } ovh += SCTP_DATA_CHUNK_OVERHEAD(stcb); if (stcb->asoc.sctp_frag_point > asoc->smallest_mtu) siz = asoc->smallest_mtu - ovh; else siz = (stcb->asoc.sctp_frag_point - ovh); /* * if (siz > (MCLBYTES-sizeof(struct sctp_data_chunk))) { */ /* A data chunk MUST fit in a cluster */ /* siz = (MCLBYTES - sizeof(struct sctp_data_chunk)); */ /* } */ /* adjust for an AUTH chunk if DATA requires auth */ if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) siz -= sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); if (siz % 4) { /* make it an even word boundary please */ siz -= (siz % 4); } return (siz); } static void sctp_set_prsctp_policy(struct sctp_stream_queue_pending *sp) { /* * We assume that the user wants PR_SCTP_TTL if the user provides a * positive lifetime but does not specify any PR_SCTP policy. */ if (PR_SCTP_ENABLED(sp->sinfo_flags)) { sp->act_flags |= PR_SCTP_POLICY(sp->sinfo_flags); } else if (sp->timetolive > 0) { sp->sinfo_flags |= SCTP_PR_SCTP_TTL; sp->act_flags |= PR_SCTP_POLICY(sp->sinfo_flags); } else { return; } switch (PR_SCTP_POLICY(sp->sinfo_flags)) { case CHUNK_FLAGS_PR_SCTP_BUF: /* * Time to live is a priority stored in tv_sec when doing * the buffer drop thing. */ sp->ts.tv_sec = sp->timetolive; sp->ts.tv_usec = 0; break; case CHUNK_FLAGS_PR_SCTP_TTL: { struct timeval tv; (void)SCTP_GETTIME_TIMEVAL(&sp->ts); tv.tv_sec = sp->timetolive / 1000; tv.tv_usec = (sp->timetolive * 1000) % 1000000; /* * TODO sctp_constants.h needs alternative time * macros when _KERNEL is undefined. */ timevaladd(&sp->ts, &tv); } break; case CHUNK_FLAGS_PR_SCTP_RTX: /* * Time to live is a the number or retransmissions stored in * tv_sec. 
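 * (That is, for the RTX policy sp->timetolive is interpreted as the
 * maximum number of retransmissions and is carried in ts.tv_sec rather
 * than as a time value.)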
*/ sp->ts.tv_sec = sp->timetolive; sp->ts.tv_usec = 0; break; default: SCTPDBG(SCTP_DEBUG_USRREQ1, "Unknown PR_SCTP policy %u.\n", PR_SCTP_POLICY(sp->sinfo_flags)); break; } } static int sctp_msg_append(struct sctp_tcb *stcb, struct sctp_nets *net, struct mbuf *m, struct sctp_sndrcvinfo *srcv, int hold_stcb_lock) { int error = 0; struct mbuf *at; struct sctp_stream_queue_pending *sp = NULL; struct sctp_stream_out *strm; /* * Given an mbuf chain, put it into the association send queue and * place it on the wheel */ if (srcv->sinfo_stream >= stcb->asoc.streamoutcnt) { /* Invalid stream number */ SCTP_LTRACE_ERR_RET_PKT(m, NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_now; } if ((stcb->asoc.stream_locked) && (stcb->asoc.stream_locked_on != srcv->sinfo_stream)) { SCTP_LTRACE_ERR_RET_PKT(m, NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_now; } strm = &stcb->asoc.strmout[srcv->sinfo_stream]; /* Now can we send this? */ if ((SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_RECEIVED) || (stcb->asoc.state & SCTP_STATE_SHUTDOWN_PENDING)) { /* got data while shutting down */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_now; } sctp_alloc_a_strmoq(stcb, sp); if (sp == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); error = ENOMEM; goto out_now; } sp->sinfo_flags = srcv->sinfo_flags; sp->timetolive = srcv->sinfo_timetolive; sp->ppid = srcv->sinfo_ppid; sp->context = srcv->sinfo_context; sp->fsn = 0; if (sp->sinfo_flags & SCTP_ADDR_OVER) { sp->net = net; atomic_add_int(&sp->net->ref_count, 1); } else { sp->net = NULL; } (void)SCTP_GETTIME_TIMEVAL(&sp->ts); sp->sid = srcv->sinfo_stream; sp->msg_is_complete = 1; sp->sender_all_done = 1; sp->some_taken = 0; sp->data = m; sp->tail_mbuf = NULL; sctp_set_prsctp_policy(sp); /* * We could in theory (for sendall) sifa the length in, but we would * still have to hunt through the chain since we need to setup the * tail_mbuf */ sp->length = 0; for (at = m; at; at = SCTP_BUF_NEXT(at)) { if (SCTP_BUF_NEXT(at) == NULL) sp->tail_mbuf = at; sp->length += SCTP_BUF_LEN(at); } if (srcv->sinfo_keynumber_valid) { sp->auth_keyid = srcv->sinfo_keynumber; } else { sp->auth_keyid = stcb->asoc.authinfo.active_keyid; } if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) { sctp_auth_key_acquire(stcb, sp->auth_keyid); sp->holds_key_ref = 1; } if (hold_stcb_lock == 0) { SCTP_TCB_SEND_LOCK(stcb); } sctp_snd_sb_alloc(stcb, sp->length); atomic_add_int(&stcb->asoc.stream_queue_cnt, 1); TAILQ_INSERT_TAIL(&strm->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_add_to_stream(stcb, &stcb->asoc, strm, sp, 1); m = NULL; if (hold_stcb_lock == 0) { SCTP_TCB_SEND_UNLOCK(stcb); } out_now: if (m) { sctp_m_freem(m); } return (error); } static struct mbuf * sctp_copy_mbufchain(struct mbuf *clonechain, struct mbuf *outchain, struct mbuf **endofchain, int can_take_mbuf, int sizeofcpy, uint8_t copy_by_ref) { struct mbuf *m; struct mbuf *appendchain; caddr_t cp; int len; if (endofchain == NULL) { /* error */ error_out: if (outchain) sctp_m_freem(outchain); return (NULL); } if (can_take_mbuf) { appendchain = clonechain; } else { if (!copy_by_ref && (sizeofcpy <= (int)((((SCTP_BASE_SYSCTL(sctp_mbuf_threshold_count) - 1) * MLEN) + MHLEN))) ) { /* Its not in a cluster */ if (*endofchain == NULL) { /* lets get a mbuf cluster */ if (outchain 
== NULL) { /* This is the general case */ new_mbuf: outchain = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_HEADER); if (outchain == NULL) { goto error_out; } SCTP_BUF_LEN(outchain) = 0; *endofchain = outchain; /* get the prepend space */ SCTP_BUF_RESV_UF(outchain, (SCTP_FIRST_MBUF_RESV + 4)); } else { /* * We really should not get a NULL * in endofchain */ /* find end */ m = outchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } /* sanity */ if (*endofchain == NULL) { /* * huh, TSNH XXX maybe we * should panic */ sctp_m_freem(outchain); goto new_mbuf; } } /* get the new end of length */ len = (int)M_TRAILINGSPACE(*endofchain); } else { /* how much is left at the end? */ len = (int)M_TRAILINGSPACE(*endofchain); } /* Find the end of the data, for appending */ cp = (mtod((*endofchain), caddr_t)+SCTP_BUF_LEN((*endofchain))); /* Now lets copy it out */ if (len >= sizeofcpy) { /* It all fits, copy it in */ m_copydata(clonechain, 0, sizeofcpy, cp); SCTP_BUF_LEN((*endofchain)) += sizeofcpy; } else { /* fill up the end of the chain */ if (len > 0) { m_copydata(clonechain, 0, len, cp); SCTP_BUF_LEN((*endofchain)) += len; /* now we need another one */ sizeofcpy -= len; } m = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_HEADER); if (m == NULL) { /* We failed */ goto error_out; } SCTP_BUF_NEXT((*endofchain)) = m; *endofchain = m; cp = mtod((*endofchain), caddr_t); m_copydata(clonechain, len, sizeofcpy, cp); SCTP_BUF_LEN((*endofchain)) += sizeofcpy; } return (outchain); } else { /* copy the old fashion way */ appendchain = SCTP_M_COPYM(clonechain, 0, M_COPYALL, M_NOWAIT); #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(appendchain, SCTP_MBUF_ICOPY); } #endif } } if (appendchain == NULL) { /* error */ if (outchain) sctp_m_freem(outchain); return (NULL); } if (outchain) { /* tack on to the end */ if (*endofchain != NULL) { SCTP_BUF_NEXT(((*endofchain))) = appendchain; } else { m = outchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { SCTP_BUF_NEXT(m) = appendchain; break; } m = SCTP_BUF_NEXT(m); } } /* * save off the end and update the end-chain position */ m = appendchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } return (outchain); } else { /* save off the end and update the end-chain position */ m = appendchain; while (m) { if (SCTP_BUF_NEXT(m) == NULL) { *endofchain = m; break; } m = SCTP_BUF_NEXT(m); } return (appendchain); } } static int sctp_med_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *num_out, int *reason_code, int control_only, int from_where, struct timeval *now, int *now_filled, int frag_point, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ); static void sctp_sendall_iterator(struct sctp_inpcb *inp, struct sctp_tcb *stcb, void *ptr, uint32_t val SCTP_UNUSED) { struct sctp_copy_all *ca; struct mbuf *m; int ret = 0; int added_control = 0; int un_sent, do_chunk_output = 1; struct sctp_association *asoc; struct sctp_nets *net; ca = (struct sctp_copy_all *)ptr; if (ca->m == NULL) { return; } if (ca->inp != inp) { /* TSNH */ return; } if (ca->sndlen > 0) { m = SCTP_M_COPYM(ca->m, 0, M_COPYALL, M_NOWAIT); if (m == NULL) { /* can't copy so we are done */ ca->cnt_failed++; return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(m, SCTP_MBUF_ICOPY); } #endif } else { m = 
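/*
 * ca->m is the master copy of the user's message held by the
 * SCTP_SENDALL iterator; each association visited gets its own chain
 * via SCTP_M_COPYM above, and a zero-length send simply leaves m NULL
 * here.
 */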
NULL; } SCTP_TCB_LOCK_ASSERT(stcb); if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } if (ca->sndrcv.sinfo_flags & SCTP_ABORT) { /* Abort this assoc with m as the user defined reason */ if (m != NULL) { SCTP_BUF_PREPEND(m, sizeof(struct sctp_paramhdr), M_NOWAIT); } else { m = sctp_get_mbuf_for_msg(sizeof(struct sctp_paramhdr), 0, M_NOWAIT, 1, MT_DATA); SCTP_BUF_LEN(m) = sizeof(struct sctp_paramhdr); } if (m != NULL) { struct sctp_paramhdr *ph; ph = mtod(m, struct sctp_paramhdr *); ph->param_type = htons(SCTP_CAUSE_USER_INITIATED_ABT); ph->param_length = htons((uint16_t)(sizeof(struct sctp_paramhdr) + ca->sndlen)); } /* * We add one here to keep the assoc from dis-appearing on * us. */ atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(inp, stcb, m, SCTP_SO_NOT_LOCKED); /* * sctp_abort_an_association calls sctp_free_asoc() free * association will NOT free it since we incremented the * refcnt .. we do this to prevent it being freed and things * getting tricky since we could end up (from free_asoc) * calling inpcb_free which would get a recursive lock call * to the iterator lock.. But as a consequence of that the * stcb will return to us un-locked.. since free_asoc * returns with either no TCB or the TCB unlocked, we must * relock.. to unlock in the iterator timer :-0 */ SCTP_TCB_LOCK(stcb); atomic_add_int(&stcb->asoc.refcnt, -1); goto no_chunk_output; } else { if (m) { ret = sctp_msg_append(stcb, net, m, &ca->sndrcv, 1); } asoc = &stcb->asoc; if (ca->sndrcv.sinfo_flags & SCTP_EOF) { /* shutdown this assoc */ if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && sctp_is_there_unsent_data(stcb, SCTP_SO_NOT_LOCKED) == 0) { if ((*asoc->ss_functions.sctp_ss_is_user_msgs_incomplete) (stcb, asoc)) { goto abort_anyway; } /* * there is nothing queued to send, so I'm * done... */ if ((SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { /* * only send SHUTDOWN the first time * through */ if (SCTP_GET_STATE(stcb) == SCTP_STATE_OPEN) { SCTP_STAT_DECR_GAUGE32(sctps_currestab); } SCTP_SET_STATE(stcb, SCTP_STATE_SHUTDOWN_SENT); sctp_stop_timers_for_shutdown(stcb); sctp_send_shutdown(stcb, net); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWN, stcb->sctp_ep, stcb, net); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); added_control = 1; do_chunk_output = 0; } } else { /* * we still got (or just got) data to send, * so set SHUTDOWN_PENDING */ /* * XXX sockets draft says that SCTP_EOF * should be sent with no data. 
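 * In other words, SCTP_EOF asks for a graceful shutdown: if nothing is
 * queued a SHUTDOWN is sent immediately, otherwise SHUTDOWN_PENDING is
 * set and the SHUTDOWN goes out once the queues drain.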
currently, * we will allow user data to be sent first * and move to SHUTDOWN-PENDING */ if ((SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { if ((*asoc->ss_functions.sctp_ss_is_user_msgs_incomplete) (stcb, asoc)) { SCTP_ADD_SUBSTATE(stcb, SCTP_STATE_PARTIAL_MSG_LEFT); } SCTP_ADD_SUBSTATE(stcb, SCTP_STATE_SHUTDOWN_PENDING); if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (asoc->state & SCTP_STATE_PARTIAL_MSG_LEFT)) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; abort_anyway: snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, SCTP_SO_NOT_LOCKED); atomic_add_int(&stcb->asoc.refcnt, -1); goto no_chunk_output; } sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); } } } } un_sent = ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) + (stcb->asoc.stream_queue_cnt * SCTP_DATA_CHUNK_OVERHEAD(stcb))); if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { do_chunk_output = 0; } if (do_chunk_output) sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_NOT_LOCKED); else if (added_control) { int num_out, reason, now_filled = 0; struct timeval now; int frag_point; frag_point = sctp_get_frag_point(stcb, &stcb->asoc); (void)sctp_med_chunk_output(inp, stcb, &stcb->asoc, &num_out, &reason, 1, 1, &now, &now_filled, frag_point, SCTP_SO_NOT_LOCKED); } no_chunk_output: if (ret) { ca->cnt_failed++; } else { ca->cnt_sent++; } } static void sctp_sendall_completes(void *ptr, uint32_t val SCTP_UNUSED) { struct sctp_copy_all *ca; ca = (struct sctp_copy_all *)ptr; /* * Do a notify here? Kacheong suggests that the notify be done at * the send time.. so you would push up a notification if any send * failed. 
Don't know if this is feasible since the only failures we * have is "memory" related and if you cannot get an mbuf to send * the data you surely can't get an mbuf to send up to notify the * user you can't send the data :-> */ /* now free everything */ sctp_m_freem(ca->m); SCTP_FREE(ca, SCTP_M_COPYAL); } static struct mbuf * sctp_copy_out_all(struct uio *uio, int len) { struct mbuf *ret, *at; int left, willcpy, cancpy, error; ret = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_WAITOK, 1, MT_DATA); if (ret == NULL) { /* TSNH */ return (NULL); } left = len; SCTP_BUF_LEN(ret) = 0; /* save space for the data chunk header */ cancpy = (int)M_TRAILINGSPACE(ret); willcpy = min(cancpy, left); at = ret; while (left > 0) { /* Align data to the end */ error = uiomove(mtod(at, caddr_t), willcpy, uio); if (error) { err_out_now: sctp_m_freem(at); return (NULL); } SCTP_BUF_LEN(at) = willcpy; SCTP_BUF_NEXT_PKT(at) = SCTP_BUF_NEXT(at) = 0; left -= willcpy; if (left > 0) { SCTP_BUF_NEXT(at) = sctp_get_mbuf_for_msg(left, 0, M_WAITOK, 1, MT_DATA); if (SCTP_BUF_NEXT(at) == NULL) { goto err_out_now; } at = SCTP_BUF_NEXT(at); SCTP_BUF_LEN(at) = 0; cancpy = (int)M_TRAILINGSPACE(at); willcpy = min(cancpy, left); } } return (ret); } static int sctp_sendall(struct sctp_inpcb *inp, struct uio *uio, struct mbuf *m, struct sctp_sndrcvinfo *srcv) { int ret; struct sctp_copy_all *ca; SCTP_MALLOC(ca, struct sctp_copy_all *, sizeof(struct sctp_copy_all), SCTP_M_COPYAL); if (ca == NULL) { sctp_m_freem(m); SCTP_LTRACE_ERR_RET(inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } memset(ca, 0, sizeof(struct sctp_copy_all)); ca->inp = inp; if (srcv) { memcpy(&ca->sndrcv, srcv, sizeof(struct sctp_nonpad_sndrcvinfo)); } /* * take off the sendall flag, it would be bad if we failed to do * this :-0 */ ca->sndrcv.sinfo_flags &= ~SCTP_SENDALL; /* get length and mbuf chain */ if (uio) { ca->sndlen = (int)uio->uio_resid; ca->m = sctp_copy_out_all(uio, ca->sndlen); if (ca->m == NULL) { SCTP_FREE(ca, SCTP_M_COPYAL); SCTP_LTRACE_ERR_RET(inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } } else { /* Gather the length of the send */ struct mbuf *mat; ca->sndlen = 0; for (mat = m; mat; mat = SCTP_BUF_NEXT(mat)) { ca->sndlen += SCTP_BUF_LEN(mat); } } ret = sctp_initiate_iterator(NULL, sctp_sendall_iterator, NULL, SCTP_PCB_ANY_FLAGS, SCTP_PCB_ANY_FEATURES, SCTP_ASOC_ANY_STATE, (void *)ca, 0, sctp_sendall_completes, inp, 1); if (ret) { SCTP_PRINTF("Failed to initiate iterator for sendall\n"); SCTP_FREE(ca, SCTP_M_COPYAL); SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EFAULT); return (EFAULT); } return (0); } void sctp_toss_old_cookies(struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_tmit_chunk *chk, *nchk; TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt--; if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); } } } void sctp_toss_old_asconf(struct sctp_tcb *stcb) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk, *nchk; struct sctp_asconf_chunk *acp; asoc = &stcb->asoc; TAILQ_FOREACH_SAFE(chk, &asoc->asconf_send_queue, sctp_next, nchk) { /* find SCTP_ASCONF chunk in queue */ if (chk->rec.chunk_id.id == SCTP_ASCONF) { if (chk->data) { acp = mtod(chk->data, struct sctp_asconf_chunk *); if (SCTP_TSN_GT(ntohl(acp->serial_number), asoc->asconf_seq_out_acked)) { /* Not Acked 
yet */ break; } } TAILQ_REMOVE(&asoc->asconf_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt--; if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); } } } static void sctp_clean_up_datalist(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_tmit_chunk **data_list, int bundle_at, struct sctp_nets *net) { int i; struct sctp_tmit_chunk *tp1; for (i = 0; i < bundle_at; i++) { /* off of the send queue */ TAILQ_REMOVE(&asoc->send_queue, data_list[i], sctp_next); asoc->send_queue_cnt--; if (i > 0) { /* * Any chunk NOT 0 you zap the time chunk 0 gets * zapped or set based on if a RTO measurment is * needed. */ data_list[i]->do_rtt = 0; } /* record time */ data_list[i]->sent_rcv_time = net->last_sent_time; data_list[i]->rec.data.cwnd_at_send = net->cwnd; data_list[i]->rec.data.fast_retran_tsn = data_list[i]->rec.data.tsn; if (data_list[i]->whoTo == NULL) { data_list[i]->whoTo = net; atomic_add_int(&net->ref_count, 1); } /* on to the sent queue */ tp1 = TAILQ_LAST(&asoc->sent_queue, sctpchunk_listhead); if ((tp1) && SCTP_TSN_GT(tp1->rec.data.tsn, data_list[i]->rec.data.tsn)) { struct sctp_tmit_chunk *tpp; /* need to move back */ back_up_more: tpp = TAILQ_PREV(tp1, sctpchunk_listhead, sctp_next); if (tpp == NULL) { TAILQ_INSERT_BEFORE(tp1, data_list[i], sctp_next); goto all_done; } tp1 = tpp; if (SCTP_TSN_GT(tp1->rec.data.tsn, data_list[i]->rec.data.tsn)) { goto back_up_more; } TAILQ_INSERT_AFTER(&asoc->sent_queue, tp1, data_list[i], sctp_next); } else { TAILQ_INSERT_TAIL(&asoc->sent_queue, data_list[i], sctp_next); } all_done: /* This does not lower until the cum-ack passes it */ asoc->sent_queue_cnt++; if ((asoc->peers_rwnd <= 0) && (asoc->total_flight == 0) && (bundle_at == 1)) { /* Mark the chunk as being a window probe */ SCTP_STAT_INCR(sctps_windowprobed); } #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC2, 3); #endif data_list[i]->sent = SCTP_DATAGRAM_SENT; data_list[i]->snd_count = 1; data_list[i]->rec.data.chunk_was_revoked = 0; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_FLIGHT_LOGGING_ENABLE) { sctp_misc_ints(SCTP_FLIGHT_LOG_UP, data_list[i]->whoTo->flight_size, data_list[i]->book_size, (uint32_t)(uintptr_t)data_list[i]->whoTo, data_list[i]->rec.data.tsn); } sctp_flight_size_increase(data_list[i]); sctp_total_flight_increase(stcb, data_list[i]); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_RWND_ENABLE) { sctp_log_rwnd(SCTP_DECREASE_PEER_RWND, asoc->peers_rwnd, data_list[i]->send_size, SCTP_BASE_SYSCTL(sctp_peer_chunk_oh)); } asoc->peers_rwnd = sctp_sbspace_sub(asoc->peers_rwnd, (uint32_t)(data_list[i]->send_size + SCTP_BASE_SYSCTL(sctp_peer_chunk_oh))); if (asoc->peers_rwnd < stcb->sctp_ep->sctp_ep.sctp_sws_sender) { /* SWS sender side engages */ asoc->peers_rwnd = 0; } } if (asoc->cc_functions.sctp_cwnd_update_packet_transmitted) { (*asoc->cc_functions.sctp_cwnd_update_packet_transmitted) (stcb, net); } } static void sctp_clean_up_ctl(struct sctp_tcb *stcb, struct sctp_association *asoc, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_tmit_chunk *chk, *nchk; TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK) || /* EY */ (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) || (chk->rec.chunk_id.id == SCTP_HEARTBEAT_ACK) || (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN) || 
(chk->rec.chunk_id.id == SCTP_SHUTDOWN_ACK) || (chk->rec.chunk_id.id == SCTP_OPERATION_ERROR) || (chk->rec.chunk_id.id == SCTP_PACKET_DROPPED) || (chk->rec.chunk_id.id == SCTP_COOKIE_ACK) || (chk->rec.chunk_id.id == SCTP_ECN_CWR) || (chk->rec.chunk_id.id == SCTP_ASCONF_ACK)) { /* Stray chunks must be cleaned up */ clean_up_anyway: TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt--; if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) { asoc->fwd_tsn_cnt--; } sctp_free_a_chunk(stcb, chk, so_locked); } else if (chk->rec.chunk_id.id == SCTP_STREAM_RESET) { /* special handling, we must look into the param */ if (chk != asoc->str_reset) { goto clean_up_anyway; } } } } static uint32_t sctp_can_we_split_this(struct sctp_tcb *stcb, uint32_t length, uint32_t space_left, uint32_t frag_point, int eeor_on) { /* * Make a decision on if I should split a msg into multiple parts. * This is only asked of incomplete messages. */ if (eeor_on) { /* * If we are doing EEOR we need to always send it if its the * entire thing, since it might be all the guy is putting in * the hopper. */ if (space_left >= length) { /*- * If we have data outstanding, * we get another chance when the sack * arrives to transmit - wait for more data */ if (stcb->asoc.total_flight == 0) { /* * If nothing is in flight, we zero the * packet counter. */ return (length); } return (0); } else { /* You can fill the rest */ return (space_left); } } /*- * For those strange folk that make the send buffer * smaller than our fragmentation point, we can't * get a full msg in so we have to allow splitting. */ if (SCTP_SB_LIMIT_SND(stcb->sctp_socket) < frag_point) { return (length); } if ((length <= space_left) || ((length - space_left) < SCTP_BASE_SYSCTL(sctp_min_residual))) { /* Sub-optimial residual don't split in non-eeor mode. */ return (0); } /* * If we reach here length is larger than the space_left. Do we wish * to split it for the sake of packet putting together? */ if (space_left >= min(SCTP_BASE_SYSCTL(sctp_min_split_point), frag_point)) { /* Its ok to split it */ return (min(space_left, frag_point)); } /* Nope, can't split */ return (0); } static uint32_t sctp_move_to_outqueue(struct sctp_tcb *stcb, struct sctp_stream_out *strq, uint32_t space_left, uint32_t frag_point, int *giveup, int eeor_mode, int *bail, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /* Move from the stream to the send_queue keeping track of the total */ struct sctp_association *asoc; struct sctp_stream_queue_pending *sp; struct sctp_tmit_chunk *chk; struct sctp_data_chunk *dchkh = NULL; struct sctp_idata_chunk *ndchkh = NULL; uint32_t to_move, length; int leading; uint8_t rcv_flags = 0; uint8_t some_taken; uint8_t send_lock_up = 0; SCTP_TCB_LOCK_ASSERT(stcb); asoc = &stcb->asoc; one_more_time: /* sa_ignore FREED_MEMORY */ sp = TAILQ_FIRST(&strq->outqueue); if (sp == NULL) { if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } sp = TAILQ_FIRST(&strq->outqueue); if (sp) { goto one_more_time; } if ((sctp_is_feature_on(stcb->sctp_ep, SCTP_PCB_FLAGS_EXPLICIT_EOR) == 0) && (stcb->asoc.idata_supported == 0) && (strq->last_msg_incomplete)) { SCTP_PRINTF("Huh? 
Stream:%d lm_in_c=%d but queue is NULL\n", strq->sid, strq->last_msg_incomplete); strq->last_msg_incomplete = 0; } to_move = 0; if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); send_lock_up = 0; } goto out_of; } if ((sp->msg_is_complete) && (sp->length == 0)) { if (sp->sender_all_done) { /* * We are doing deferred cleanup. Last time through * when we took all the data the sender_all_done was * not set. */ if ((sp->put_last_out == 0) && (sp->discard_rest == 0)) { SCTP_PRINTF("Gak, put out entire msg with NO end!-1\n"); SCTP_PRINTF("sender_done:%d len:%d msg_comp:%d put_last_out:%d send_lock:%d\n", sp->sender_all_done, sp->length, sp->msg_is_complete, sp->put_last_out, send_lock_up); } if ((TAILQ_NEXT(sp, next) == NULL) && (send_lock_up == 0)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } atomic_subtract_int(&asoc->stream_queue_cnt, 1); TAILQ_REMOVE(&strq->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_remove_from_stream(stcb, asoc, strq, sp, send_lock_up); if ((strq->state == SCTP_STREAM_RESET_PENDING) && (strq->chunks_on_queues == 0) && TAILQ_EMPTY(&strq->outqueue)) { stcb->asoc.trigger_reset = 1; } if (sp->net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; } sctp_free_a_strmoq(stcb, sp, so_locked); /* we can't be locked to it */ if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); send_lock_up = 0; } /* back to get the next msg */ goto one_more_time; } else { /* * sender just finished this but still holds a * reference */ *giveup = 1; to_move = 0; goto out_of; } } else { /* is there some to get */ if (sp->length == 0) { /* no */ *giveup = 1; to_move = 0; goto out_of; } else if (sp->discard_rest) { if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } /* Whack down the size */ atomic_subtract_int(&stcb->asoc.total_output_queue_size, sp->length); if ((stcb->sctp_socket != NULL) && ((stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) || (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL))) { atomic_subtract_int(&stcb->sctp_socket->so_snd.sb_cc, sp->length); } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; sp->tail_mbuf = NULL; } sp->length = 0; sp->some_taken = 1; *giveup = 1; to_move = 0; goto out_of; } } some_taken = sp->some_taken; re_look: length = sp->length; if (sp->msg_is_complete) { /* The message is complete */ to_move = min(length, frag_point); if (to_move == length) { /* All of it fits in the MTU */ if (sp->some_taken) { rcv_flags |= SCTP_DATA_LAST_FRAG; } else { rcv_flags |= SCTP_DATA_NOT_FRAG; } sp->put_last_out = 1; if (sp->sinfo_flags & SCTP_SACK_IMMEDIATELY) { rcv_flags |= SCTP_DATA_SACK_IMMEDIATELY; } } else { /* Not all of it fits, we fragment */ if (sp->some_taken == 0) { rcv_flags |= SCTP_DATA_FIRST_FRAG; } sp->some_taken = 1; } } else { to_move = sctp_can_we_split_this(stcb, length, space_left, frag_point, eeor_mode); if (to_move) { /*- * We use a snapshot of length in case it * is expanding during the compare. */ uint32_t llen; llen = length; if (to_move >= llen) { to_move = llen; if (send_lock_up == 0) { /*- * We are taking all of an incomplete msg * thus we need a send lock. */ SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; if (sp->msg_is_complete) { /* * the sender finished the * msg */ goto re_look; } } } if (sp->some_taken == 0) { rcv_flags |= SCTP_DATA_FIRST_FRAG; sp->some_taken = 1; } } else { /* Nothing to take. 
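 * sctp_can_we_split_this() returned 0, e.g. because the residual left
 * behind would be smaller than sctp_min_residual; giveup makes
 * sctp_fill_outqueue() stop pulling data for this destination on this
 * pass.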
*/ *giveup = 1; to_move = 0; goto out_of; } } /* If we reach here, we can copy out a chunk */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* No chunk memory */ *giveup = 1; to_move = 0; goto out_of; } /* * Setup for unordered if needed by looking at the user sent info * flags. */ if (sp->sinfo_flags & SCTP_UNORDERED) { rcv_flags |= SCTP_DATA_UNORDERED; } if (SCTP_BASE_SYSCTL(sctp_enable_sack_immediately) && (sp->sinfo_flags & SCTP_EOF) == SCTP_EOF) { rcv_flags |= SCTP_DATA_SACK_IMMEDIATELY; } /* clear out the chunk before setting up */ memset(chk, 0, sizeof(*chk)); chk->rec.data.rcv_flags = rcv_flags; if (to_move >= length) { /* we think we can steal the whole thing */ if ((sp->sender_all_done == 0) && (send_lock_up == 0)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } if (to_move < sp->length) { /* bail, it changed */ goto dont_do_it; } chk->data = sp->data; chk->last_mbuf = sp->tail_mbuf; /* register the stealing */ sp->data = sp->tail_mbuf = NULL; } else { struct mbuf *m; dont_do_it: chk->data = SCTP_M_COPYM(sp->data, 0, to_move, M_NOWAIT); chk->last_mbuf = NULL; if (chk->data == NULL) { sp->some_taken = some_taken; sctp_free_a_chunk(stcb, chk, so_locked); *bail = 1; to_move = 0; goto out_of; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(chk->data, SCTP_MBUF_ICOPY); } #endif /* Pull off the data */ m_adj(sp->data, to_move); /* Now lets work our way down and compact it */ m = sp->data; while (m && (SCTP_BUF_LEN(m) == 0)) { sp->data = SCTP_BUF_NEXT(m); SCTP_BUF_NEXT(m) = NULL; if (sp->tail_mbuf == m) { /*- * Freeing tail? TSNH since * we supposedly were taking less * than the sp->length. */ #ifdef INVARIANTS panic("Huh, freing tail? - TSNH"); #else SCTP_PRINTF("Huh, freeing tail? - TSNH\n"); sp->tail_mbuf = sp->data = NULL; sp->length = 0; #endif } sctp_m_free(m); m = sp->data; } } if (SCTP_BUF_IS_EXTENDED(chk->data)) { chk->copy_by_ref = 1; } else { chk->copy_by_ref = 0; } /* * get last_mbuf and counts of mb usage This is ugly but hopefully * its only one mbuf. */ if (chk->last_mbuf == NULL) { chk->last_mbuf = chk->data; while (SCTP_BUF_NEXT(chk->last_mbuf) != NULL) { chk->last_mbuf = SCTP_BUF_NEXT(chk->last_mbuf); } } if (to_move > length) { /*- This should not happen either * since we always lower to_move to the size * of sp->length if its larger. */ #ifdef INVARIANTS panic("Huh, how can to_move be larger?"); #else SCTP_PRINTF("Huh, how can to_move be larger?\n"); sp->length = 0; #endif } else { atomic_subtract_int(&sp->length, to_move); } leading = SCTP_DATA_CHUNK_OVERHEAD(stcb); if (M_LEADINGSPACE(chk->data) < leading) { /* Not enough room for a chunk header, get some */ struct mbuf *m; m = sctp_get_mbuf_for_msg(1, 0, M_NOWAIT, 1, MT_DATA); if (m == NULL) { /* * we're in trouble here. _PREPEND below will free * all the data if there is no leading space, so we * must put the data back and restore. 
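 * The unsteal/reassemble code below exists only for that failure path:
 * the data taken from sp is handed back before the chunk is freed.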
*/ if (send_lock_up == 0) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } if (sp->data == NULL) { /* unsteal the data */ sp->data = chk->data; sp->tail_mbuf = chk->last_mbuf; } else { struct mbuf *m_tmp; /* reassemble the data */ m_tmp = sp->data; sp->data = chk->data; SCTP_BUF_NEXT(chk->last_mbuf) = m_tmp; } sp->some_taken = some_taken; atomic_add_int(&sp->length, to_move); chk->data = NULL; *bail = 1; sctp_free_a_chunk(stcb, chk, so_locked); to_move = 0; goto out_of; } else { SCTP_BUF_LEN(m) = 0; SCTP_BUF_NEXT(m) = chk->data; chk->data = m; M_ALIGN(chk->data, 4); } } SCTP_BUF_PREPEND(chk->data, SCTP_DATA_CHUNK_OVERHEAD(stcb), M_NOWAIT); if (chk->data == NULL) { /* HELP, TSNH since we assured it would not above? */ #ifdef INVARIANTS panic("prepend failes HELP?"); #else SCTP_PRINTF("prepend fails HELP?\n"); sctp_free_a_chunk(stcb, chk, so_locked); #endif *bail = 1; to_move = 0; goto out_of; } sctp_snd_sb_alloc(stcb, SCTP_DATA_CHUNK_OVERHEAD(stcb)); chk->book_size = chk->send_size = (uint16_t)(to_move + SCTP_DATA_CHUNK_OVERHEAD(stcb)); chk->book_size_scale = 0; chk->sent = SCTP_DATAGRAM_UNSENT; chk->flags = 0; chk->asoc = &stcb->asoc; chk->pad_inplace = 0; chk->no_fr_allowed = 0; if (stcb->asoc.idata_supported == 0) { if (rcv_flags & SCTP_DATA_UNORDERED) { /* Just use 0. The receiver ignores the values. */ chk->rec.data.mid = 0; } else { chk->rec.data.mid = strq->next_mid_ordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_ordered++; } } } else { if (rcv_flags & SCTP_DATA_UNORDERED) { chk->rec.data.mid = strq->next_mid_unordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_unordered++; } } else { chk->rec.data.mid = strq->next_mid_ordered; if (rcv_flags & SCTP_DATA_LAST_FRAG) { strq->next_mid_ordered++; } } } chk->rec.data.sid = sp->sid; chk->rec.data.ppid = sp->ppid; chk->rec.data.context = sp->context; chk->rec.data.doing_fast_retransmit = 0; chk->rec.data.timetodrop = sp->ts; chk->flags = sp->act_flags; if (sp->net) { chk->whoTo = sp->net; atomic_add_int(&chk->whoTo->ref_count, 1); } else chk->whoTo = NULL; if (sp->holds_key_ref) { chk->auth_keyid = sp->auth_keyid; sctp_auth_key_acquire(stcb, chk->auth_keyid); chk->holds_key_ref = 1; } chk->rec.data.tsn = atomic_fetchadd_int(&asoc->sending_seq, 1); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_AT_SEND_2_OUTQ) { sctp_misc_ints(SCTP_STRMOUT_LOG_SEND, (uint32_t)(uintptr_t)stcb, sp->length, (uint32_t)((chk->rec.data.sid << 16) | (0x0000ffff & chk->rec.data.mid)), chk->rec.data.tsn); } if (stcb->asoc.idata_supported == 0) { dchkh = mtod(chk->data, struct sctp_data_chunk *); } else { ndchkh = mtod(chk->data, struct sctp_idata_chunk *); } /* * Put the rest of the things in place now. Size was done earlier in * previous loop prior to padding. 
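 * Below the on-wire header is filled in: a classic DATA chunk carries a
 * 16-bit SSN, while an I-DATA chunk carries a 32-bit MID plus an FSN
 * that replaces the PPID field on every fragment after the first
 * (sp->fsn == 0 selects the PPID variant).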
*/ #ifdef SCTP_ASOCLOG_OF_TSNS SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->tsn_out_at >= SCTP_TSN_LOG_SIZE) { asoc->tsn_out_at = 0; asoc->tsn_out_wrapped = 1; } asoc->out_tsnlog[asoc->tsn_out_at].tsn = chk->rec.data.tsn; asoc->out_tsnlog[asoc->tsn_out_at].strm = chk->rec.data.sid; asoc->out_tsnlog[asoc->tsn_out_at].seq = chk->rec.data.mid; asoc->out_tsnlog[asoc->tsn_out_at].sz = chk->send_size; asoc->out_tsnlog[asoc->tsn_out_at].flgs = chk->rec.data.rcv_flags; asoc->out_tsnlog[asoc->tsn_out_at].stcb = (void *)stcb; asoc->out_tsnlog[asoc->tsn_out_at].in_pos = asoc->tsn_out_at; asoc->out_tsnlog[asoc->tsn_out_at].in_out = 2; asoc->tsn_out_at++; #endif if (stcb->asoc.idata_supported == 0) { dchkh->ch.chunk_type = SCTP_DATA; dchkh->ch.chunk_flags = chk->rec.data.rcv_flags; dchkh->dp.tsn = htonl(chk->rec.data.tsn); dchkh->dp.sid = htons(strq->sid); dchkh->dp.ssn = htons((uint16_t)chk->rec.data.mid); dchkh->dp.ppid = chk->rec.data.ppid; dchkh->ch.chunk_length = htons(chk->send_size); } else { ndchkh->ch.chunk_type = SCTP_IDATA; ndchkh->ch.chunk_flags = chk->rec.data.rcv_flags; ndchkh->dp.tsn = htonl(chk->rec.data.tsn); ndchkh->dp.sid = htons(strq->sid); ndchkh->dp.reserved = htons(0); ndchkh->dp.mid = htonl(chk->rec.data.mid); if (sp->fsn == 0) ndchkh->dp.ppid_fsn.ppid = chk->rec.data.ppid; else ndchkh->dp.ppid_fsn.fsn = htonl(sp->fsn); sp->fsn++; ndchkh->ch.chunk_length = htons(chk->send_size); } /* Now advance the chk->send_size by the actual pad needed. */ if (chk->send_size < SCTP_SIZE32(chk->book_size)) { /* need a pad */ struct mbuf *lm; int pads; pads = SCTP_SIZE32(chk->book_size) - chk->send_size; lm = sctp_pad_lastmbuf(chk->data, pads, chk->last_mbuf); if (lm != NULL) { chk->last_mbuf = lm; chk->pad_inplace = 1; } chk->send_size += pads; } if (PR_SCTP_ENABLED(chk->flags)) { asoc->pr_sctp_cnt++; } if (sp->msg_is_complete && (sp->length == 0) && (sp->sender_all_done)) { /* All done pull and kill the message */ if (sp->put_last_out == 0) { SCTP_PRINTF("Gak, put out entire msg with NO end!-2\n"); SCTP_PRINTF("sender_done:%d len:%d msg_comp:%d put_last_out:%d send_lock:%d\n", sp->sender_all_done, sp->length, sp->msg_is_complete, sp->put_last_out, send_lock_up); } if ((send_lock_up == 0) && (TAILQ_NEXT(sp, next) == NULL)) { SCTP_TCB_SEND_LOCK(stcb); send_lock_up = 1; } atomic_subtract_int(&asoc->stream_queue_cnt, 1); TAILQ_REMOVE(&strq->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_remove_from_stream(stcb, asoc, strq, sp, send_lock_up); if ((strq->state == SCTP_STREAM_RESET_PENDING) && (strq->chunks_on_queues == 0) && TAILQ_EMPTY(&strq->outqueue)) { stcb->asoc.trigger_reset = 1; } if (sp->net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } if (sp->data) { sctp_m_freem(sp->data); sp->data = NULL; } sctp_free_a_strmoq(stcb, sp, so_locked); } asoc->chunks_on_out_queue++; strq->chunks_on_queues++; TAILQ_INSERT_TAIL(&asoc->send_queue, chk, sctp_next); asoc->send_queue_cnt++; out_of: if (send_lock_up) { SCTP_TCB_SEND_UNLOCK(stcb); } return (to_move); } static void sctp_fill_outqueue(struct sctp_tcb *stcb, struct sctp_nets *net, int frag_point, int eeor_mode, int *quit_now, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_association *asoc; struct sctp_stream_out *strq; uint32_t space_left, moved, total_moved; int bail, giveup; SCTP_TCB_LOCK_ASSERT(stcb); asoc = &stcb->asoc; total_moved = 0; switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: space_left = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case 
AF_INET6: space_left = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ space_left = net->mtu; break; } /* Need an allowance for the data chunk header too */ space_left -= SCTP_DATA_CHUNK_OVERHEAD(stcb); /* must make even word boundary */ space_left &= 0xfffffffc; strq = stcb->asoc.ss_functions.sctp_ss_select_stream(stcb, net, asoc); giveup = 0; bail = 0; while ((space_left > 0) && (strq != NULL)) { moved = sctp_move_to_outqueue(stcb, strq, space_left, frag_point, &giveup, eeor_mode, &bail, so_locked); stcb->asoc.ss_functions.sctp_ss_scheduled(stcb, net, asoc, strq, moved); if ((giveup != 0) || (bail != 0)) { break; } strq = stcb->asoc.ss_functions.sctp_ss_select_stream(stcb, net, asoc); total_moved += moved; space_left -= moved; if (space_left >= SCTP_DATA_CHUNK_OVERHEAD(stcb)) { space_left -= SCTP_DATA_CHUNK_OVERHEAD(stcb); } else { space_left = 0; } space_left &= 0xfffffffc; } if (bail != 0) *quit_now = 1; stcb->asoc.ss_functions.sctp_ss_packet_done(stcb, net, asoc); if (total_moved == 0) { if ((stcb->asoc.sctp_cmt_on_off == 0) && (net == stcb->asoc.primary_destination)) { /* ran dry for primary network net */ SCTP_STAT_INCR(sctps_primary_randry); } else if (stcb->asoc.sctp_cmt_on_off > 0) { /* ran dry with CMT on */ SCTP_STAT_INCR(sctps_cmt_randry); } } } void sctp_fix_ecn_echo(struct sctp_association *asoc) { struct sctp_tmit_chunk *chk; TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == SCTP_ECN_ECHO) { chk->sent = SCTP_DATAGRAM_UNSENT; } } } void sctp_move_chunks_from_net(struct sctp_tcb *stcb, struct sctp_nets *net) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_stream_queue_pending *sp; unsigned int i; if (net == NULL) { return; } asoc = &stcb->asoc; for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_FOREACH(sp, &stcb->asoc.strmout[i].outqueue, next) { if (sp->net == net) { sctp_free_remote_addr(sp->net); sp->net = NULL; } } } TAILQ_FOREACH(chk, &asoc->send_queue, sctp_next) { if (chk->whoTo == net) { sctp_free_remote_addr(chk->whoTo); chk->whoTo = NULL; } } } int sctp_med_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *num_out, int *reason_code, int control_only, int from_where, struct timeval *now, int *now_filled, int frag_point, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /** * Ok this is the generic chunk service queue. we must do the * following: * - Service the stream queue that is next, moving any * message (note I must get a complete message i.e. FIRST/MIDDLE and * LAST to the out queue in one pass) and assigning TSN's. This * only applys though if the peer does not support NDATA. For NDATA * chunks its ok to not send the entire message ;-) * - Check to see if the cwnd/rwnd allows any output, if so we go ahead and * fomulate and send the low level chunks. Making sure to combine * any control in the control chunk queue also. 
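 * Per destination the ASCONF queue is serviced first, then the remaining
 * control chunks, and finally DATA chunks are bundled in, all into a single
 * packet that is kept within the path MTU and bounded by cwnd/rwnd.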
*/ struct sctp_nets *net, *start_at, *sack_goes_to = NULL, *old_start_at = NULL; struct mbuf *outchain, *endoutchain; struct sctp_tmit_chunk *chk, *nchk; /* temp arrays for unlinking */ struct sctp_tmit_chunk *data_list[SCTP_MAX_DATA_BUNDLING]; int no_fragmentflg, error; unsigned int max_rwnd_per_dest, max_send_per_dest; int one_chunk, hbflag, skip_data_for_this_net; int asconf, cookie, no_out_cnt; int bundle_at, ctl_cnt, no_data_chunks, eeor_mode; unsigned int mtu, r_mtu, omtu, mx_mtu, to_out; int tsns_sent = 0; uint32_t auth_offset = 0; struct sctp_auth_chunk *auth = NULL; uint16_t auth_keyid; int override_ok = 1; int skip_fill_up = 0; int data_auth_reqd = 0; /* * JRS 5/14/07 - Add flag for whether a heartbeat is sent to the * destination. */ int quit_now = 0; *num_out = 0; *reason_code = 0; auth_keyid = stcb->asoc.authinfo.active_keyid; if ((asoc->state & SCTP_STATE_SHUTDOWN_PENDING) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_RECEIVED) || (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR))) { eeor_mode = 1; } else { eeor_mode = 0; } ctl_cnt = no_out_cnt = asconf = cookie = 0; /* * First lets prime the pump. For each destination, if there is room * in the flight size, attempt to pull an MTU's worth out of the * stream queues into the general send_queue */ #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC2, 2); #endif SCTP_TCB_LOCK_ASSERT(stcb); hbflag = 0; if (control_only) no_data_chunks = 1; else no_data_chunks = 0; /* Nothing to possible to send? */ if ((TAILQ_EMPTY(&asoc->control_send_queue) || (asoc->ctrl_queue_cnt == stcb->asoc.ecn_echo_cnt_onq)) && TAILQ_EMPTY(&asoc->asconf_send_queue) && TAILQ_EMPTY(&asoc->send_queue) && sctp_is_there_unsent_data(stcb, so_locked) == 0) { nothing_to_send: *reason_code = 9; return (0); } if (asoc->peers_rwnd == 0) { /* No room in peers rwnd */ *reason_code = 1; if (asoc->total_flight > 0) { /* we are allowed one chunk in flight */ no_data_chunks = 1; } } if (stcb->asoc.ecn_echo_cnt_onq) { /* Record where a sack goes, if any */ if (no_data_chunks && (asoc->ctrl_queue_cnt == stcb->asoc.ecn_echo_cnt_onq)) { /* Nothing but ECNe to send - we don't do that */ goto nothing_to_send; } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK)) { sack_goes_to = chk->whoTo; break; } } } max_rwnd_per_dest = ((asoc->peers_rwnd + asoc->total_flight) / asoc->numnets); if (stcb->sctp_socket) max_send_per_dest = SCTP_SB_LIMIT_SND(stcb->sctp_socket) / asoc->numnets; else max_send_per_dest = 0; if (no_data_chunks == 0) { /* How many non-directed chunks are there? */ TAILQ_FOREACH(chk, &asoc->send_queue, sctp_next) { if (chk->whoTo == NULL) { /* * We already have non-directed chunks on * the queue, no need to do a fill-up. */ skip_fill_up = 1; break; } } } if ((no_data_chunks == 0) && (skip_fill_up == 0) && (!stcb->asoc.ss_functions.sctp_ss_is_empty(stcb, asoc))) { TAILQ_FOREACH(net, &asoc->nets, sctp_next) { /* * This for loop we are in takes in each net, if * its's got space in cwnd and has data sent to it * (when CMT is off) then it calls * sctp_fill_outqueue for the net. This gets data on * the send queue for that network. * * In sctp_fill_outqueue TSN's are assigned and data * is copied out of the stream buffers. Note mostly * copy by reference (we hope). 
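 * Destinations that are in PF state, unreachable or still unconfirmed
 * (other than the chosen alternate), or whose flight size already reaches
 * their cwnd, are skipped here.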
*/ net->window_probe = 0; if ((net != stcb->asoc.alternate) && ((net->dest_state & SCTP_ADDR_PF) || (!(net->dest_state & SCTP_ADDR_REACHABLE)) || (net->dest_state & SCTP_ADDR_UNCONFIRMED))) { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 1, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } continue; } if ((stcb->asoc.cc_functions.sctp_cwnd_new_transmission_begins) && (net->flight_size == 0)) { (*stcb->asoc.cc_functions.sctp_cwnd_new_transmission_begins) (stcb, net); } if (net->flight_size >= net->cwnd) { /* skip this network, no room - can't fill */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 3, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } continue; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, 4, SCTP_CWND_LOG_FILL_OUTQ_CALLED); } sctp_fill_outqueue(stcb, net, frag_point, eeor_mode, &quit_now, so_locked); if (quit_now) { /* memory alloc failure */ no_data_chunks = 1; break; } } } /* now service each destination and send out what we can for it */ /* Nothing to send? */ if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->asconf_send_queue) && TAILQ_EMPTY(&asoc->send_queue)) { *reason_code = 8; return (0); } if (asoc->sctp_cmt_on_off > 0) { /* get the last start point */ start_at = asoc->last_net_cmt_send_started; if (start_at == NULL) { /* null so to beginning */ start_at = TAILQ_FIRST(&asoc->nets); } else { start_at = TAILQ_NEXT(asoc->last_net_cmt_send_started, sctp_next); if (start_at == NULL) { start_at = TAILQ_FIRST(&asoc->nets); } } asoc->last_net_cmt_send_started = start_at; } else { start_at = TAILQ_FIRST(&asoc->nets); } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->whoTo == NULL) { if (asoc->alternate) { chk->whoTo = asoc->alternate; } else { chk->whoTo = asoc->primary_destination; } atomic_add_int(&chk->whoTo->ref_count, 1); } } old_start_at = NULL; again_one_more_time: for (net = start_at; net != NULL; net = TAILQ_NEXT(net, sctp_next)) { /* how much can we send? */ /* SCTPDBG("Examine for sending net:%x\n", (uint32_t)net); */ if (old_start_at && (old_start_at == net)) { /* through list ocmpletely. */ break; } tsns_sent = 0xa; if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->asconf_send_queue) && (net->flight_size >= net->cwnd)) { /* * Nothing on control or asconf and flight is full, * we can skip even in the CMT case. 
*/ continue; } bundle_at = 0; endoutchain = outchain = NULL; no_fragmentflg = 1; one_chunk = 0; if (net->dest_state & SCTP_ADDR_UNCONFIRMED) { skip_data_for_this_net = 1; } else { skip_data_for_this_net = 0; } switch (((struct sockaddr *)&net->ro._l_addr)->sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } mx_mtu = mtu; to_out = 0; if (mtu > asoc->peers_rwnd) { if (asoc->total_flight > 0) { /* We have a packet in flight somewhere */ r_mtu = asoc->peers_rwnd; } else { /* We are always allowed to send one MTU out */ one_chunk = 1; r_mtu = mtu; } } else { r_mtu = mtu; } error = 0; /************************/ /* ASCONF transmission */ /************************/ /* Now first lets go through the asconf queue */ TAILQ_FOREACH_SAFE(chk, &asoc->asconf_send_queue, sctp_next, nchk) { if (chk->rec.chunk_id.id != SCTP_ASCONF) { continue; } if (chk->whoTo == NULL) { if (asoc->alternate == NULL) { if (asoc->primary_destination != net) { break; } } else { if (asoc->alternate != net) { break; } } } else { if (chk->whoTo != net) { break; } } if (chk->data == NULL) { break; } if (chk->sent != SCTP_DATAGRAM_UNSENT && chk->sent != SCTP_DATAGRAM_RESEND) { break; } /* * if no AUTH is yet included and this chunk * requires it, make sure to account for it. We * don't apply the size until the AUTH chunk is * actually added below in case there is no room for * this chunk. NOTE: we overload the use of "omtu" * here */ if ((auth == NULL) && sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks)) { omtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else omtu = 0; /* Here we do NOT factor the r_mtu */ if ((chk->send_size < (int)(mtu - omtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* * We probably should glom the mbuf chain * from the chk->data for control but the * problem is it becomes yet one more level * of tracking to do if for some reason * output fails. Then I have got to * reconstruct the merged control chain.. el * yucko.. for now we take the easy way and * do the copy */ /* * Add an AUTH chunk, if chunk requires it * save the offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, (int)chk->rec.chunk_id.can_take_data, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { *reason_code = 8; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); /* update our MTU size */ if (mtu > (chk->send_size + omtu)) mtu -= (chk->send_size + omtu); else mtu = 0; to_out += (chk->send_size + omtu); /* Do clear IP_DF ? 
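 * Yes - when the chunk is marked CHUNK_FLAGS_FRAGMENT_OK we clear
 * no_fragmentflg below so the resulting packet may be fragmented.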
*/ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } if (chk->rec.chunk_id.can_take_data) chk->data = NULL; /* * set hb flag since we can use these for * RTO */ hbflag = 1; asconf = 1; /* * should sysctl this: don't bundle data * with ASCONF since it requires AUTH */ no_data_chunks = 1; chk->sent = SCTP_DATAGRAM_SENT; if (chk->whoTo == NULL) { chk->whoTo = net; atomic_add_int(&net->ref_count, 1); } chk->snd_count++; if (mtu == 0) { /* * Ok we are out of room but we can * output without effecting the * flight size since this little guy * is a control only packet. */ sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf flag as * it is used to do appropriate * source address selection. */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; hbflag = 0; if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* * error, we could not * output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } /* error, could not output */ if (error == EHOSTUNREACH) { /* * Destination went * unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 7; break; } else { asoc->ifp_had_enobuf = 0; } /* * increase the number we sent, if a * cookie is sent we don't tell them * any was sent out. */ outchain = endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) *num_out += ctl_cnt; /* recalc a clean slate and setup */ switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } to_out = 0; no_fragmentflg = 1; } } } if (error != 0) { /* try next net */ continue; } /************************/ /* Control transmission */ /************************/ /* Now first lets go through the control queue */ TAILQ_FOREACH_SAFE(chk, &asoc->control_send_queue, sctp_next, nchk) { if ((sack_goes_to) && (chk->rec.chunk_id.id == SCTP_ECN_ECHO) && (chk->whoTo != sack_goes_to)) { /* * if we have a sack in queue, and we are * looking at an ecn echo that is NOT queued * to where the sack is going.. */ if (chk->whoTo == net) { /* * Don't transmit it to where its * going (current net) */ continue; } else if (sack_goes_to == net) { /* * But do transmit it to this * address */ goto skip_net_check; } } if (chk->whoTo == NULL) { if (asoc->alternate == NULL) { if (asoc->primary_destination != net) { continue; } } else { if (asoc->alternate != net) { continue; } } } else { if (chk->whoTo != net) { continue; } } skip_net_check: if (chk->data == NULL) { continue; } if (chk->sent != SCTP_DATAGRAM_UNSENT) { /* * It must be unsent. Cookies and ASCONF's * hang around but there timers will force * when marked for resend. */ continue; } /* * if no AUTH is yet included and this chunk * requires it, make sure to account for it. We * don't apply the size until the AUTH chunk is * actually added below in case there is no room for * this chunk. 
NOTE: we overload the use of "omtu" * here */ if ((auth == NULL) && sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks)) { omtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else omtu = 0; /* Here we do NOT factor the r_mtu */ if ((chk->send_size <= (int)(mtu - omtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* * We probably should glom the mbuf chain * from the chk->data for control but the * problem is it becomes yet one more level * of tracking to do if for some reason * output fails. Then I have got to * reconstruct the merged control chain.. el * yucko.. for now we take the easy way and * do the copy */ /* * Add an AUTH chunk, if chunk requires it * save the offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, (int)chk->rec.chunk_id.can_take_data, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { *reason_code = 8; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); /* update our MTU size */ if (mtu > (chk->send_size + omtu)) mtu -= (chk->send_size + omtu); else mtu = 0; to_out += (chk->send_size + omtu); /* Do clear IP_DF ? */ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } if (chk->rec.chunk_id.can_take_data) chk->data = NULL; /* Mark things to be removed, if needed */ if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK) || /* EY */ (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) || (chk->rec.chunk_id.id == SCTP_HEARTBEAT_ACK) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN) || (chk->rec.chunk_id.id == SCTP_SHUTDOWN_ACK) || (chk->rec.chunk_id.id == SCTP_OPERATION_ERROR) || (chk->rec.chunk_id.id == SCTP_COOKIE_ACK) || (chk->rec.chunk_id.id == SCTP_ECN_CWR) || (chk->rec.chunk_id.id == SCTP_PACKET_DROPPED) || (chk->rec.chunk_id.id == SCTP_ASCONF_ACK)) { if (chk->rec.chunk_id.id == SCTP_HEARTBEAT_REQUEST) { hbflag = 1; } /* remove these chunks at the end */ if ((chk->rec.chunk_id.id == SCTP_SELECTIVE_ACK) || (chk->rec.chunk_id.id == SCTP_NR_SELECTIVE_ACK)) { /* turn off the timer */ if (SCTP_OS_TIMER_PENDING(&stcb->asoc.dack_timer.timer)) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_1); } } ctl_cnt++; } else { /* * Other chunks, since they have * timers running (i.e. COOKIE) we * just "trust" that it gets sent or * retransmitted. */ ctl_cnt++; if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { cookie = 1; no_out_cnt = 1; } else if (chk->rec.chunk_id.id == SCTP_ECN_ECHO) { /* * Increment ecne send count * here this means we may be * over-zealous in our * counting if the send * fails, but its the best * place to do it (we used * to do it in the queue of * the chunk, but that did * not tell how many times * it was sent. */ SCTP_STAT_INCR(sctps_sendecne); } chk->sent = SCTP_DATAGRAM_SENT; if (chk->whoTo == NULL) { chk->whoTo = net; atomic_add_int(&net->ref_count, 1); } chk->snd_count++; } if (mtu == 0) { /* * Ok we are out of room but we can * output without effecting the * flight size since this little guy * is a control only packet. 
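 * The control chain built so far is flushed right here and the
 * per-destination budget (mtu, to_out, no_fragmentflg) is reset before
 * bundling continues.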
*/ if (asconf) { sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf * flag as it is used to do * appropriate source * address selection. */ } if (cookie) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, net); cookie = 0; } /* Only HB or ASCONF advances time */ if (hbflag) { if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; hbflag = 0; } if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* * error, we could not * output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } if (error == EHOSTUNREACH) { /* * Destination went * unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 7; break; } else { asoc->ifp_had_enobuf = 0; } /* * increase the number we sent, if a * cookie is sent we don't tell them * any was sent out. */ outchain = endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) *num_out += ctl_cnt; /* recalc a clean slate and setup */ switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } to_out = 0; no_fragmentflg = 1; } } } if (error != 0) { /* try next net */ continue; } /* JRI: if dest is in PF state, do not send data to it */ if ((asoc->sctp_cmt_on_off > 0) && (net != stcb->asoc.alternate) && (net->dest_state & SCTP_ADDR_PF)) { goto no_data_fill; } if (net->flight_size >= net->cwnd) { goto no_data_fill; } if ((asoc->sctp_cmt_on_off > 0) && (SCTP_BASE_SYSCTL(sctp_buffer_splitting) & SCTP_RECV_BUFFER_SPLITTING) && (net->flight_size > max_rwnd_per_dest)) { goto no_data_fill; } /* * We need a specific accounting for the usage of the send * buffer. We also need to check the number of messages per * net. For now, this is better than nothing and it disabled * by default... */ if ((asoc->sctp_cmt_on_off > 0) && (SCTP_BASE_SYSCTL(sctp_buffer_splitting) & SCTP_SEND_BUFFER_SPLITTING) && (max_send_per_dest > 0) && (net->flight_size > max_send_per_dest)) { goto no_data_fill; } /*********************/ /* Data transmission */ /*********************/ /* * if AUTH for DATA is required and no AUTH has been added * yet, account for this in the mtu now... if no data can be * bundled, this adjustment won't matter anyways since the * packet will be going out... 
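 * omtu computed below is the full per-address-family payload room and is
 * only used here to sanity check chunks that turn out to be larger than
 * the path MTU.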
*/ data_auth_reqd = sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks); if (data_auth_reqd && (auth == NULL)) { mtu -= sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } /* now lets add any data within the MTU constraints */ switch (((struct sockaddr *)&net->ro._l_addr)->sa_family) { #ifdef INET case AF_INET: if (net->mtu > SCTP_MIN_V4_OVERHEAD) omtu = net->mtu - SCTP_MIN_V4_OVERHEAD; else omtu = 0; break; #endif #ifdef INET6 case AF_INET6: if (net->mtu > SCTP_MIN_OVERHEAD) omtu = net->mtu - SCTP_MIN_OVERHEAD; else omtu = 0; break; #endif default: /* TSNH */ omtu = 0; break; } if ((((SCTP_GET_STATE(stcb) == SCTP_STATE_OPEN) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_RECEIVED)) && (skip_data_for_this_net == 0)) || (cookie)) { TAILQ_FOREACH_SAFE(chk, &asoc->send_queue, sctp_next, nchk) { if (no_data_chunks) { /* let only control go out */ *reason_code = 1; break; } if (net->flight_size >= net->cwnd) { /* skip this net, no room for data */ *reason_code = 2; break; } if ((chk->whoTo != NULL) && (chk->whoTo != net)) { /* Don't send the chunk on this net */ continue; } if (asoc->sctp_cmt_on_off == 0) { if ((asoc->alternate) && (asoc->alternate != net) && (chk->whoTo == NULL)) { continue; } else if ((net != asoc->primary_destination) && (asoc->alternate == NULL) && (chk->whoTo == NULL)) { continue; } } if ((chk->send_size > omtu) && ((chk->flags & CHUNK_FLAGS_FRAGMENT_OK) == 0)) { /*- * strange, we have a chunk that is * to big for its destination and * yet no fragment ok flag. * Something went wrong when the * PMTU changed...we did not mark * this chunk for some reason?? I * will fix it here by letting IP * fragment it for now and printing * a warning. This really should not * happen ... */ SCTP_PRINTF("Warning chunk of %d bytes > mtu:%d and yet PMTU disc missed\n", chk->send_size, mtu); chk->flags |= CHUNK_FLAGS_FRAGMENT_OK; } if (SCTP_BASE_SYSCTL(sctp_enable_sack_immediately) && (asoc->state & SCTP_STATE_SHUTDOWN_PENDING)) { struct sctp_data_chunk *dchkh; dchkh = mtod(chk->data, struct sctp_data_chunk *); dchkh->ch.chunk_flags |= SCTP_DATA_SACK_IMMEDIATELY; } if (((chk->send_size <= mtu) && (chk->send_size <= r_mtu)) || ((chk->flags & CHUNK_FLAGS_FRAGMENT_OK) && (chk->send_size <= asoc->peers_rwnd))) { /* ok we will add this one */ /* * Add an AUTH chunk, if chunk * requires it, save the offset into * the chain for AUTH */ if (data_auth_reqd) { if (auth == NULL) { outchain = sctp_add_auth_chunk(outchain, &endoutchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = chk->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { /* * use this data's * keyid */ auth_keyid = chk->auth_keyid; override_ok = 0; } else if (auth_keyid != chk->auth_keyid) { /* * different keyid, * so done bundling */ break; } } outchain = sctp_copy_mbufchain(chk->data, outchain, &endoutchain, 0, chk->send_size, chk->copy_by_ref); if (outchain == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "No memory?\n"); if (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } *reason_code = 3; SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* upate our MTU size */ /* Do clear IP_DF ? 
*/ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* unsigned subtraction of mtu */ if (mtu > chk->send_size) mtu -= chk->send_size; else mtu = 0; /* unsigned subtraction of r_mtu */ if (r_mtu > chk->send_size) r_mtu -= chk->send_size; else r_mtu = 0; to_out += chk->send_size; if ((to_out > mx_mtu) && no_fragmentflg) { #ifdef INVARIANTS panic("Exceeding mtu of %d out size is %d", mx_mtu, to_out); #else SCTP_PRINTF("Exceeding mtu of %d out size is %d\n", mx_mtu, to_out); #endif } chk->window_probe = 0; data_list[bundle_at++] = chk; if (bundle_at >= SCTP_MAX_DATA_BUNDLING) { break; } if (chk->sent == SCTP_DATAGRAM_UNSENT) { if ((chk->rec.data.rcv_flags & SCTP_DATA_UNORDERED) == 0) { SCTP_STAT_INCR_COUNTER64(sctps_outorderchunks); } else { SCTP_STAT_INCR_COUNTER64(sctps_outunorderchunks); } if (((chk->rec.data.rcv_flags & SCTP_DATA_LAST_FRAG) == SCTP_DATA_LAST_FRAG) && ((chk->rec.data.rcv_flags & SCTP_DATA_FIRST_FRAG) == 0)) /* * Count number of * user msg's that * were fragmented * we do this by * counting when we * see a LAST * fragment only. */ SCTP_STAT_INCR_COUNTER64(sctps_fragusrmsgs); } if ((mtu == 0) || (r_mtu == 0) || (one_chunk)) { if ((one_chunk) && (stcb->asoc.total_flight == 0)) { data_list[0]->window_probe = 1; net->window_probe = 1; } break; } } else { /* * Must be sent in order of the * TSN's (on a network) */ break; } } /* for (chunk gather loop for this net) */ } /* if asoc.state OPEN */ no_data_fill: /* Is there something to send for this destination? */ if (outchain) { /* We may need to start a control timer or two */ if (asconf) { sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, net); /* * do NOT clear the asconf flag as it is * used to do appropriate source address * selection. */ } if (cookie) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, net); cookie = 0; } /* must start a send timer if data is being sent */ if (bundle_at && (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer))) { /* * no timer running on this destination * restart it. */ sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } if (bundle_at || hbflag) { /* For data/asconf and hb set time */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(now); *now_filled = 1; } net->last_sent_time = *now; } /* Now send it, if there is anything to send :> */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, outchain, auth_offset, auth, auth_keyid, no_fragmentflg, bundle_at, asconf, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* error, we could not output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (from_where == 0) { SCTP_STAT_INCR(sctps_lowlevelerrusr); } if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } if (error == EHOSTUNREACH) { /* * Destination went unreachable * during this send */ sctp_move_chunks_from_net(stcb, net); } *reason_code = 6; /*- * I add this line to be paranoid. As far as * I can tell the continue, takes us back to * the top of the for, but just to make sure * I will reset these again here. */ ctl_cnt = bundle_at = 0; continue; /* This takes us back to the * for() for the nets. 
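 * ctl_cnt and bundle_at were reset just above, so nothing is counted
 * twice on the next destination.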
*/ } else { asoc->ifp_had_enobuf = 0; } endoutchain = NULL; auth = NULL; auth_offset = 0; if (!no_out_cnt) { *num_out += (ctl_cnt + bundle_at); } if (bundle_at) { /* setup for a RTO measurement */ tsns_sent = data_list[0]->rec.data.tsn; /* fill time if not already filled */ if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(&asoc->time_last_sent); *now_filled = 1; *now = asoc->time_last_sent; } else { asoc->time_last_sent = *now; } if (net->rto_needed) { data_list[0]->do_rtt = 1; net->rto_needed = 0; } SCTP_STAT_INCR_BY(sctps_senddata, bundle_at); sctp_clean_up_datalist(stcb, asoc, data_list, bundle_at, net); } if (one_chunk) { break; } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, tsns_sent, SCTP_CWND_LOG_FROM_SEND); } } if (old_start_at == NULL) { old_start_at = start_at; start_at = TAILQ_FIRST(&asoc->nets); if (old_start_at) goto again_one_more_time; } /* * At the end there should be no NON timed chunks hanging on this * queue. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, *num_out, SCTP_CWND_LOG_FROM_SEND); } if ((*num_out == 0) && (*reason_code == 0)) { *reason_code = 4; } else { *reason_code = 5; } sctp_clean_up_ctl(stcb, asoc, so_locked); return (0); } void sctp_queue_op_err(struct sctp_tcb *stcb, struct mbuf *op_err) { /*- * Prepend a OPERATIONAL_ERROR chunk header and put on the end of * the control chunk queue. */ struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; struct mbuf *mat, *last_mbuf; uint32_t chunk_length; uint16_t padding_length; SCTP_TCB_LOCK_ASSERT(stcb); SCTP_BUF_PREPEND(op_err, sizeof(struct sctp_chunkhdr), M_NOWAIT); if (op_err == NULL) { return; } last_mbuf = NULL; chunk_length = 0; for (mat = op_err; mat != NULL; mat = SCTP_BUF_NEXT(mat)) { chunk_length += SCTP_BUF_LEN(mat); if (SCTP_BUF_NEXT(mat) == NULL) { last_mbuf = mat; } } if (chunk_length > SCTP_MAX_CHUNK_LENGTH) { sctp_m_freem(op_err); return; } padding_length = chunk_length % 4; if (padding_length != 0) { padding_length = 4 - padding_length; } if (padding_length != 0) { if (sctp_add_pad_tombuf(last_mbuf, padding_length) == NULL) { sctp_m_freem(op_err); return; } } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(op_err); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_OPERATION_ERROR; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->send_size = (uint16_t)chunk_length; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = op_err; chk->whoTo = NULL; hdr = mtod(op_err, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_OPERATION_ERROR; hdr->chunk_flags = 0; hdr->chunk_length = htons(chk->send_size); TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } int sctp_send_cookie_echo(struct mbuf *m, int offset, struct sctp_tcb *stcb, struct sctp_nets *net) { /*- * pull out the cookie and put it at the front of the control chunk * queue. 
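 * The STATE-COOKIE parameter is located in the received INIT-ACK, copied
 * (padded to a 4-byte boundary), relabelled as a COOKIE-ECHO chunk and
 * inserted at the head of the control_send_queue.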
*/ int at; struct mbuf *cookie; struct sctp_paramhdr param, *phdr; struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; uint16_t ptype, plen; SCTP_TCB_LOCK_ASSERT(stcb); /* First find the cookie in the param area */ cookie = NULL; at = offset + sizeof(struct sctp_init_chunk); for (;;) { phdr = sctp_get_next_param(m, at, ¶m, sizeof(param)); if (phdr == NULL) { return (-3); } ptype = ntohs(phdr->param_type); plen = ntohs(phdr->param_length); if (ptype == SCTP_STATE_COOKIE) { int pad; /* found the cookie */ if ((pad = (plen % 4))) { plen += 4 - pad; } cookie = SCTP_M_COPYM(m, at, plen, M_NOWAIT); if (cookie == NULL) { /* No memory */ return (-2); } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(cookie, SCTP_MBUF_ICOPY); } #endif break; } at += SCTP_SIZE32(plen); } /* ok, we got the cookie lets change it into a cookie echo chunk */ /* first the change from param to cookie */ hdr = mtod(cookie, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_COOKIE_ECHO; hdr->chunk_flags = 0; /* get the chunk stuff now and place it in the FRONT of the queue */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(cookie); return (-5); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_COOKIE_ECHO; chk->rec.chunk_id.can_take_data = 0; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->send_size = plen; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = cookie; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); TAILQ_INSERT_HEAD(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return (0); } void sctp_send_heartbeat_ack(struct sctp_tcb *stcb, struct mbuf *m, int offset, int chk_length, struct sctp_nets *net) { /* * take a HB request and make it into a HB ack and send it. 
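 * The heartbeat information is copied from the received chunk, the chunk
 * type is rewritten to HEARTBEAT-ACK, the chunk is padded to a 4-byte
 * boundary if needed, and it is queued on the control_send_queue.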
*/ struct mbuf *outchain; struct sctp_chunkhdr *chdr; struct sctp_tmit_chunk *chk; if (net == NULL) /* must have a net pointer */ return; outchain = SCTP_M_COPYM(m, offset, chk_length, M_NOWAIT); if (outchain == NULL) { /* gak out of memory */ return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(outchain, SCTP_MBUF_ICOPY); } #endif chdr = mtod(outchain, struct sctp_chunkhdr *); chdr->chunk_type = SCTP_HEARTBEAT_ACK; chdr->chunk_flags = 0; if (chk_length % 4) { /* need pad */ uint32_t cpthis = 0; int padlen; padlen = 4 - (chk_length % 4); m_copyback(outchain, chk_length, padlen, (caddr_t)&cpthis); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(outchain); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_HEARTBEAT_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = chk_length; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = outchain; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } void sctp_send_cookie_ack(struct sctp_tcb *stcb) { /* formulate and queue a cookie-ack back to sender */ struct mbuf *cookie_ack; struct sctp_chunkhdr *hdr; struct sctp_tmit_chunk *chk; SCTP_TCB_LOCK_ASSERT(stcb); cookie_ack = sctp_get_mbuf_for_msg(sizeof(struct sctp_chunkhdr), 0, M_NOWAIT, 1, MT_HEADER); if (cookie_ack == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(cookie_ack, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(cookie_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_COOKIE_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_chunkhdr); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = cookie_ack; if (chk->asoc->last_control_chunk_from != NULL) { chk->whoTo = chk->asoc->last_control_chunk_from; atomic_add_int(&chk->whoTo->ref_count, 1); } else { chk->whoTo = NULL; } hdr = mtod(cookie_ack, struct sctp_chunkhdr *); hdr->chunk_type = SCTP_COOKIE_ACK; hdr->chunk_flags = 0; hdr->chunk_length = htons(chk->send_size); SCTP_BUF_LEN(cookie_ack) = chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_shutdown_ack(struct sctp_tcb *stcb, struct sctp_nets *net) { /* formulate and queue a SHUTDOWN-ACK back to the sender */ struct mbuf *m_shutdown_ack; struct sctp_shutdown_ack_chunk *ack_cp; struct sctp_tmit_chunk *chk; m_shutdown_ack = sctp_get_mbuf_for_msg(sizeof(struct sctp_shutdown_ack_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown_ack == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(m_shutdown_ack, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_shutdown_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_SHUTDOWN_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_chunkhdr); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = m_shutdown_ack; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } ack_cp = mtod(m_shutdown_ack, struct sctp_shutdown_ack_chunk *); ack_cp->ch.chunk_type = SCTP_SHUTDOWN_ACK; ack_cp->ch.chunk_flags = 0; ack_cp->ch.chunk_length = htons(chk->send_size); SCTP_BUF_LEN(m_shutdown_ack) = 
chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_shutdown(struct sctp_tcb *stcb, struct sctp_nets *net) { /* formulate and queue a SHUTDOWN to the sender */ struct mbuf *m_shutdown; struct sctp_shutdown_chunk *shutdown_cp; struct sctp_tmit_chunk *chk; TAILQ_FOREACH(chk, &stcb->asoc.control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == SCTP_SHUTDOWN) { /* We already have a SHUTDOWN queued. Reuse it. */ if (chk->whoTo) { sctp_free_remote_addr(chk->whoTo); chk->whoTo = NULL; } break; } } if (chk == NULL) { m_shutdown = sctp_get_mbuf_for_msg(sizeof(struct sctp_shutdown_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown == NULL) { /* no mbuf's */ return; } SCTP_BUF_RESV_UF(m_shutdown, SCTP_MIN_OVERHEAD); sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_shutdown); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_SHUTDOWN; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->send_size = sizeof(struct sctp_shutdown_chunk); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->data = m_shutdown; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } shutdown_cp = mtod(m_shutdown, struct sctp_shutdown_chunk *); shutdown_cp->ch.chunk_type = SCTP_SHUTDOWN; shutdown_cp->ch.chunk_flags = 0; shutdown_cp->ch.chunk_length = htons(chk->send_size); shutdown_cp->cumulative_tsn_ack = htonl(stcb->asoc.cumulative_tsn); SCTP_BUF_LEN(m_shutdown) = chk->send_size; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } else { TAILQ_REMOVE(&stcb->asoc.control_send_queue, chk, sctp_next); chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } shutdown_cp = mtod(chk->data, struct sctp_shutdown_chunk *); shutdown_cp->cumulative_tsn_ack = htonl(stcb->asoc.cumulative_tsn); TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); } return; } void sctp_send_asconf(struct sctp_tcb *stcb, struct sctp_nets *net, int addr_locked) { /* * formulate and queue an ASCONF to the peer. ASCONF parameters * should be queued on the assoc queue. */ struct sctp_tmit_chunk *chk; struct mbuf *m_asconf; int len; SCTP_TCB_LOCK_ASSERT(stcb); if ((!TAILQ_EMPTY(&stcb->asoc.asconf_send_queue)) && (!sctp_is_feature_on(stcb->sctp_ep, SCTP_PCB_FLAGS_MULTIPLE_ASCONFS))) { /* can't send a new one if there is one in flight already */ return; } /* compose an ASCONF chunk, maximum length is PMTU */ m_asconf = sctp_compose_asconf(stcb, &len, addr_locked); if (m_asconf == NULL) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ sctp_m_freem(m_asconf); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ASCONF; chk->rec.chunk_id.can_take_data = 0; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->data = m_asconf; chk->send_size = len; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } TAILQ_INSERT_TAIL(&chk->asoc->asconf_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; return; } void sctp_send_asconf_ack(struct sctp_tcb *stcb) { /* * formulate and queue a asconf-ack back to sender. the asconf-ack * must be stored in the tcb. 
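 * Every ASCONF-ACK still cached on asoc.asconf_ack_sent is copied and
 * queued; when this is a retransmission an alternate destination is
 * selected for it.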
*/ struct sctp_tmit_chunk *chk; struct sctp_asconf_ack *ack, *latest_ack; struct mbuf *m_ack; struct sctp_nets *net = NULL; SCTP_TCB_LOCK_ASSERT(stcb); /* Get the latest ASCONF-ACK */ latest_ack = TAILQ_LAST(&stcb->asoc.asconf_ack_sent, sctp_asconf_ackhead); if (latest_ack == NULL) { return; } if (latest_ack->last_sent_to != NULL && latest_ack->last_sent_to == stcb->asoc.last_control_chunk_from) { /* we're doing a retransmission */ net = sctp_find_alternate_net(stcb, stcb->asoc.last_control_chunk_from, 0); if (net == NULL) { /* no alternate */ if (stcb->asoc.last_control_chunk_from == NULL) { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } else { net = stcb->asoc.last_control_chunk_from; } } } else { /* normal case */ if (stcb->asoc.last_control_chunk_from == NULL) { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } else { net = stcb->asoc.last_control_chunk_from; } } latest_ack->last_sent_to = net; TAILQ_FOREACH(ack, &stcb->asoc.asconf_ack_sent, next) { if (ack->data == NULL) { continue; } /* copy the asconf_ack */ m_ack = SCTP_M_COPYM(ack->data, 0, M_COPYALL, M_NOWAIT); if (m_ack == NULL) { /* couldn't copy it */ return; } #ifdef SCTP_MBUF_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_MBUF_LOGGING_ENABLE) { sctp_log_mbc(m_ack, SCTP_MBUF_ICOPY); } #endif sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { /* no memory */ if (m_ack) sctp_m_freem(m_ack); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ASCONF_ACK; chk->rec.chunk_id.can_take_data = 1; chk->flags = CHUNK_FLAGS_FRAGMENT_OK; chk->whoTo = net; if (chk->whoTo) { atomic_add_int(&chk->whoTo->ref_count, 1); } chk->data = m_ack; chk->send_size = ack->len; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->asoc = &stcb->asoc; TAILQ_INSERT_TAIL(&chk->asoc->control_send_queue, chk, sctp_next); chk->asoc->ctrl_queue_cnt++; } return; } static int sctp_chunk_retransmission(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc, int *cnt_out, struct timeval *now, int *now_filled, int *fr_done, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * send out one MTU of retransmission. If fast_retransmit is * happening we ignore the cwnd. Otherwise we obey the cwnd and * rwnd. For a Cookie or Asconf in the control chunk queue we * retransmit them by themselves. * * For data chunks we will pick out the lowest TSN's in the sent_queue * marked for resend and bundle them all together (up to a MTU of * destination). The address to send to should have been * selected/changed where the retransmission was marked (i.e. in FR * or t3-timeout routines). 
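 * In this implementation that also covers stream reset and FORWARD-TSN
 * chunks marked for resend; once such a control packet has gone out the
 * routine returns without bundling any data.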
*/ struct sctp_tmit_chunk *data_list[SCTP_MAX_DATA_BUNDLING]; struct sctp_tmit_chunk *chk, *fwd; struct mbuf *m, *endofchain; struct sctp_nets *net = NULL; uint32_t tsns_sent = 0; int no_fragmentflg, bundle_at, cnt_thru; unsigned int mtu; int error, i, one_chunk, fwd_tsn, ctl_cnt, tmr_started; struct sctp_auth_chunk *auth = NULL; uint32_t auth_offset = 0; uint16_t auth_keyid; int override_ok = 1; int data_auth_reqd = 0; uint32_t dmtu = 0; SCTP_TCB_LOCK_ASSERT(stcb); tmr_started = ctl_cnt = bundle_at = error = 0; no_fragmentflg = 1; fwd_tsn = 0; *cnt_out = 0; fwd = NULL; endofchain = m = NULL; auth_keyid = stcb->asoc.authinfo.active_keyid; #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC3, 1); #endif if ((TAILQ_EMPTY(&asoc->sent_queue)) && (TAILQ_EMPTY(&asoc->control_send_queue))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "SCTP hits empty queue with cnt set to %d?\n", asoc->sent_queue_retran_cnt); asoc->sent_queue_cnt = 0; asoc->sent_queue_cnt_removeable = 0; /* send back 0/0 so we enter normal transmission */ *cnt_out = 0; return (0); } TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) || (chk->rec.chunk_id.id == SCTP_STREAM_RESET) || (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN)) { if (chk->sent != SCTP_DATAGRAM_RESEND) { continue; } if (chk->rec.chunk_id.id == SCTP_STREAM_RESET) { if (chk != asoc->str_reset) { /* * not eligible for retran if its * not ours */ continue; } } ctl_cnt++; if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) { fwd_tsn = 1; } /* * Add an AUTH chunk, if chunk requires it save the * offset into the chain for AUTH */ if ((auth == NULL) && (sctp_auth_is_required_chunk(chk->rec.chunk_id.id, stcb->asoc.peer_auth_chunks))) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, chk->rec.chunk_id.id); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } m = sctp_copy_mbufchain(chk->data, m, &endofchain, 0, chk->send_size, chk->copy_by_ref); break; } } one_chunk = 0; cnt_thru = 0; /* do we have control chunks to retransmit? */ if (m != NULL) { /* Start a timer no matter if we succeed or fail */ if (chk->rec.chunk_id.id == SCTP_COOKIE_ECHO) { sctp_timer_start(SCTP_TIMER_TYPE_COOKIE, inp, stcb, chk->whoTo); } else if (chk->rec.chunk_id.id == SCTP_ASCONF) sctp_timer_start(SCTP_TIMER_TYPE_ASCONF, inp, stcb, chk->whoTo); chk->snd_count++; /* update our count */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, chk->whoTo, (struct sockaddr *)&chk->whoTo->ro._l_addr, m, auth_offset, auth, stcb->asoc.authinfo.active_keyid, no_fragmentflg, 0, 0, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), chk->whoTo->port, NULL, 0, 0, so_locked))) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } return (error); } else { asoc->ifp_had_enobuf = 0; } endofchain = NULL; auth = NULL; auth_offset = 0; /* * We don't want to mark the net->sent time here since this * we use this for HB and retrans cannot measure RTT */ /* (void)SCTP_GETTIME_TIMEVAL(&chk->whoTo->last_sent_time); */ *cnt_out += 1; chk->sent = SCTP_DATAGRAM_SENT; sctp_ucount_decr(stcb->asoc.sent_queue_retran_cnt); if (fwd_tsn == 0) { return (0); } else { /* Clean up the fwd-tsn list */ sctp_clean_up_ctl(stcb, asoc, so_locked); return (0); } } /* * Ok, it is just data retransmission we need to do or that and a * fwd-tsn with it all. 
*/ if (TAILQ_EMPTY(&asoc->sent_queue)) { return (SCTP_RETRAN_DONE); } if ((SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_ECHOED) || (SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_WAIT)) { /* not yet open, resend the cookie and that is it */ return (1); } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(20, inp, stcb, NULL); #endif data_auth_reqd = sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks); TAILQ_FOREACH(chk, &asoc->sent_queue, sctp_next) { if (chk->sent != SCTP_DATAGRAM_RESEND) { /* No, not sent to this net or not ready for rtx */ continue; } if (chk->data == NULL) { SCTP_PRINTF("TSN:%x chk->snd_count:%d chk->sent:%d can't retran - no data\n", chk->rec.data.tsn, chk->snd_count, chk->sent); continue; } if ((SCTP_BASE_SYSCTL(sctp_max_retran_chunk)) && (chk->snd_count >= SCTP_BASE_SYSCTL(sctp_max_retran_chunk))) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; snprintf(msg, sizeof(msg), "TSN %8.8x retransmitted %d times, giving up", chk->rec.data.tsn, chk->snd_count); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); atomic_add_int(&stcb->asoc.refcnt, 1); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, so_locked); SCTP_TCB_LOCK(stcb); atomic_subtract_int(&stcb->asoc.refcnt, 1); return (SCTP_RETRAN_EXIT); } /* pick up the net */ net = chk->whoTo; switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: mtu = net->mtu - SCTP_MIN_V4_OVERHEAD; break; #endif #ifdef INET6 case AF_INET6: mtu = net->mtu - SCTP_MIN_OVERHEAD; break; #endif default: /* TSNH */ mtu = net->mtu; break; } if ((asoc->peers_rwnd < mtu) && (asoc->total_flight > 0)) { /* No room in peers rwnd */ uint32_t tsn; tsn = asoc->last_acked_seq + 1; if (tsn == chk->rec.data.tsn) { /* * we make a special exception for this * case. The peer has no rwnd but is missing * the lowest chunk.. which is probably what * is holding up the rwnd. */ goto one_chunk_around; } return (1); } one_chunk_around: if (asoc->peers_rwnd < mtu) { one_chunk = 1; if ((asoc->peers_rwnd == 0) && (asoc->total_flight == 0)) { chk->window_probe = 1; chk->whoTo->window_probe = 1; } } #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC3, 2); #endif bundle_at = 0; m = NULL; net->fast_retran_ip = 0; if (chk->rec.data.doing_fast_retransmit == 0) { /* * if no FR in progress skip destination that have * flight_size > cwnd. */ if (net->flight_size >= net->cwnd) { continue; } } else { /* * Mark the destination net to have FR recovery * limits put on it. */ *fr_done = 1; net->fast_retran_ip = 1; } /* * if no AUTH is yet included and this chunk requires it, * make sure to account for it. We don't apply the size * until the AUTH chunk is actually added below in case * there is no room for this chunk. */ if (data_auth_reqd && (auth == NULL)) { dmtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else dmtu = 0; if ((chk->send_size <= (mtu - dmtu)) || (chk->flags & CHUNK_FLAGS_FRAGMENT_OK)) { /* ok we will add this one */ if (data_auth_reqd) { if (auth == NULL) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = chk->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { auth_keyid = chk->auth_keyid; override_ok = 0; } else if (chk->auth_keyid != auth_keyid) { /* different keyid, so done bundling */ break; } } m = sctp_copy_mbufchain(chk->data, m, &endofchain, 0, chk->send_size, chk->copy_by_ref); if (m == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* Do clear IP_DF ? 
*/ if (chk->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* upate our MTU size */ if (mtu > (chk->send_size + dmtu)) mtu -= (chk->send_size + dmtu); else mtu = 0; data_list[bundle_at++] = chk; if (one_chunk && (asoc->total_flight <= 0)) { SCTP_STAT_INCR(sctps_windowprobed); } } if (one_chunk == 0) { /* * now are there anymore forward from chk to pick * up? */ for (fwd = TAILQ_NEXT(chk, sctp_next); fwd != NULL; fwd = TAILQ_NEXT(fwd, sctp_next)) { if (fwd->sent != SCTP_DATAGRAM_RESEND) { /* Nope, not for retran */ continue; } if (fwd->whoTo != net) { /* Nope, not the net in question */ continue; } if (data_auth_reqd && (auth == NULL)) { dmtu = sctp_get_auth_chunk_len(stcb->asoc.peer_hmac_id); } else dmtu = 0; if (fwd->send_size <= (mtu - dmtu)) { if (data_auth_reqd) { if (auth == NULL) { m = sctp_add_auth_chunk(m, &endofchain, &auth, &auth_offset, stcb, SCTP_DATA); auth_keyid = fwd->auth_keyid; override_ok = 0; SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else if (override_ok) { auth_keyid = fwd->auth_keyid; override_ok = 0; } else if (fwd->auth_keyid != auth_keyid) { /* * different keyid, * so done bundling */ break; } } m = sctp_copy_mbufchain(fwd->data, m, &endofchain, 0, fwd->send_size, fwd->copy_by_ref); if (m == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } /* Do clear IP_DF ? */ if (fwd->flags & CHUNK_FLAGS_FRAGMENT_OK) { no_fragmentflg = 0; } /* upate our MTU size */ if (mtu > (fwd->send_size + dmtu)) mtu -= (fwd->send_size + dmtu); else mtu = 0; data_list[bundle_at++] = fwd; if (bundle_at >= SCTP_MAX_DATA_BUNDLING) { break; } } else { /* can't fit so we are done */ break; } } } /* Is there something to send for this destination? */ if (m) { /* * No matter if we fail/or succeed we should start a * timer. A failure is like a lost IP packet :-) */ if (!SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* * no timer running on this destination * restart it. 
*/ sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); tmr_started = 1; } /* Now lets send it, if there is anything to send :> */ if ((error = sctp_lowlevel_chunk_output(inp, stcb, net, (struct sockaddr *)&net->ro._l_addr, m, auth_offset, auth, auth_keyid, no_fragmentflg, 0, 0, inp->sctp_lport, stcb->rport, htonl(stcb->asoc.peer_vtag), net->port, NULL, 0, 0, so_locked))) { /* error, we could not output */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (error == ENOBUFS) { asoc->ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } return (error); } else { asoc->ifp_had_enobuf = 0; } endofchain = NULL; auth = NULL; auth_offset = 0; /* For HB's */ /* * We don't want to mark the net->sent time here * since this we use this for HB and retrans cannot * measure RTT */ /* (void)SCTP_GETTIME_TIMEVAL(&net->last_sent_time); */ /* For auto-close */ cnt_thru++; if (*now_filled == 0) { (void)SCTP_GETTIME_TIMEVAL(&asoc->time_last_sent); *now = asoc->time_last_sent; *now_filled = 1; } else { asoc->time_last_sent = *now; } *cnt_out += bundle_at; #ifdef SCTP_AUDITING_ENABLED sctp_audit_log(0xC4, bundle_at); #endif if (bundle_at) { tsns_sent = data_list[0]->rec.data.tsn; } for (i = 0; i < bundle_at; i++) { SCTP_STAT_INCR(sctps_sendretransdata); data_list[i]->sent = SCTP_DATAGRAM_SENT; /* * When we have a revoked data, and we * retransmit it, then we clear the revoked * flag since this flag dictates if we * subtracted from the fs */ if (data_list[i]->rec.data.chunk_was_revoked) { /* Deflate the cwnd */ data_list[i]->whoTo->cwnd -= data_list[i]->book_size; data_list[i]->rec.data.chunk_was_revoked = 0; } data_list[i]->snd_count++; sctp_ucount_decr(asoc->sent_queue_retran_cnt); /* record the time */ data_list[i]->sent_rcv_time = asoc->time_last_sent; if (data_list[i]->book_size_scale) { /* * need to double the book size on * this one */ data_list[i]->book_size_scale = 0; /* * Since we double the booksize, we * must also double the output queue * size, since this get shrunk when * we free by this amount. */ atomic_add_int(&((asoc)->total_output_queue_size), data_list[i]->book_size); data_list[i]->book_size *= 2; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_RWND_ENABLE) { sctp_log_rwnd(SCTP_DECREASE_PEER_RWND, asoc->peers_rwnd, data_list[i]->send_size, SCTP_BASE_SYSCTL(sctp_peer_chunk_oh)); } asoc->peers_rwnd = sctp_sbspace_sub(asoc->peers_rwnd, (uint32_t)(data_list[i]->send_size + SCTP_BASE_SYSCTL(sctp_peer_chunk_oh))); } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_FLIGHT_LOGGING_ENABLE) { sctp_misc_ints(SCTP_FLIGHT_LOG_UP_RSND, data_list[i]->whoTo->flight_size, data_list[i]->book_size, (uint32_t)(uintptr_t)data_list[i]->whoTo, data_list[i]->rec.data.tsn); } sctp_flight_size_increase(data_list[i]); sctp_total_flight_increase(stcb, data_list[i]); if (asoc->peers_rwnd < stcb->sctp_ep->sctp_ep.sctp_sws_sender) { /* SWS sender side engages */ asoc->peers_rwnd = 0; } if ((i == 0) && (data_list[i]->rec.data.doing_fast_retransmit)) { SCTP_STAT_INCR(sctps_sendfastretrans); if ((data_list[i] == TAILQ_FIRST(&asoc->sent_queue)) && (tmr_started == 0)) { /*- * ok we just fast-retrans'd * the lowest TSN, i.e the * first on the list. In * this case we want to give * some more time to get a * SACK back without a * t3-expiring. 
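 * That grace period is given by stopping and immediately restarting the
 * T3-rtx send timer for this destination.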
*/ sctp_timer_stop(SCTP_TIMER_TYPE_SEND, inp, stcb, net, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_2); sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, net); } } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, net, tsns_sent, SCTP_CWND_LOG_FROM_RESEND); } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(21, inp, stcb, NULL); #endif } else { /* None will fit */ return (1); } if (asoc->sent_queue_retran_cnt <= 0) { /* all done we have no more to retran */ asoc->sent_queue_retran_cnt = 0; break; } if (one_chunk) { /* No more room in rwnd */ return (1); } /* stop the for loop here. we sent out a packet */ break; } return (0); } static void sctp_timer_validation(struct sctp_inpcb *inp, struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_nets *net; /* Validate that a timer is running somewhere */ TAILQ_FOREACH(net, &asoc->nets, sctp_next) { if (SCTP_OS_TIMER_PENDING(&net->rxt_timer.timer)) { /* Here is a timer */ return; } } SCTP_TCB_LOCK_ASSERT(stcb); /* Gak, we did not have a timer somewhere */ SCTPDBG(SCTP_DEBUG_OUTPUT3, "Deadlock avoided starting timer on a dest at retran\n"); if (asoc->alternate) { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, asoc->alternate); } else { sctp_timer_start(SCTP_TIMER_TYPE_SEND, inp, stcb, asoc->primary_destination); } return; } void sctp_chunk_output(struct sctp_inpcb *inp, struct sctp_tcb *stcb, int from_where, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * Ok this is the generic chunk service queue. we must do the * following: * - See if there are retransmits pending, if so we must * do these first. * - Service the stream queue that is next, moving any * message (note I must get a complete message i.e. * FIRST/MIDDLE and LAST to the out queue in one pass) and assigning * TSN's * - Check to see if the cwnd/rwnd allows any output, if so we * go ahead and fomulate and send the low level chunks. Making sure * to combine any control in the control chunk queue also. */ struct sctp_association *asoc; struct sctp_nets *net; int error = 0, num_out, tot_out = 0, ret = 0, reason_code; unsigned int burst_cnt = 0; struct timeval now; int now_filled = 0; int nagle_on; int frag_point = sctp_get_frag_point(stcb, &stcb->asoc); int un_sent = 0; int fr_done; unsigned int tot_frs = 0; asoc = &stcb->asoc; do_it_again: /* The Nagle algorithm is only applied when handling a send call. */ if (from_where == SCTP_OUTPUT_FROM_USR_SEND) { if (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_NODELAY)) { nagle_on = 0; } else { nagle_on = 1; } } else { nagle_on = 0; } SCTP_TCB_LOCK_ASSERT(stcb); un_sent = (stcb->asoc.total_output_queue_size - stcb->asoc.total_flight); if ((un_sent <= 0) && (TAILQ_EMPTY(&asoc->control_send_queue)) && (TAILQ_EMPTY(&asoc->asconf_send_queue)) && (asoc->sent_queue_retran_cnt == 0) && (asoc->trigger_reset == 0)) { /* Nothing to do unless there is something to be sent left */ return; } /* * Do we have something to send, data or control AND a sack timer * running, if so piggy-back the sack. */ if (SCTP_OS_TIMER_PENDING(&stcb->asoc.dack_timer.timer)) { sctp_send_sack(stcb, so_locked); (void)SCTP_OS_TIMER_STOP(&stcb->asoc.dack_timer.timer); } while (asoc->sent_queue_retran_cnt) { /*- * Ok, it is retransmission time only, we send out only ONE * packet with a single call off to the retran code. */ if (from_where == SCTP_OUTPUT_FROM_COOKIE_ACK) { /*- * Special hook for handling cookiess discarded * by peer that carried data. 
Send cookie-ack only * and then the next call will get the retran's. */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); return; } else if (from_where != SCTP_OUTPUT_FROM_HB_TMR) { /* if it's not from an HB then do it */ fr_done = 0; ret = sctp_chunk_retransmission(inp, stcb, asoc, &num_out, &now, &now_filled, &fr_done, so_locked); if (fr_done) { tot_frs++; } } else { /* * it's from any other place, we don't allow retran * output (only control) */ ret = 1; } if (ret > 0) { /* Can't send anymore */ /*- * now let's push out control by calling med-level * output once. this assures that we WILL send HB's * if queued too. */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); #ifdef SCTP_AUDITING_ENABLED sctp_auditing(8, inp, stcb, NULL); #endif sctp_timer_validation(inp, stcb, asoc); return; } if (ret < 0) { /*- * The count was off.. retran is not happening so do * the normal retransmission. */ #ifdef SCTP_AUDITING_ENABLED sctp_auditing(9, inp, stcb, NULL); #endif if (ret == SCTP_RETRAN_EXIT) { return; } break; } if (from_where == SCTP_OUTPUT_FROM_T3) { /* Only one transmission allowed out of a timeout */ #ifdef SCTP_AUDITING_ENABLED sctp_auditing(10, inp, stcb, NULL); #endif /* Push out any control */ (void)sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 1, from_where, &now, &now_filled, frag_point, so_locked); return; } if ((asoc->fr_max_burst > 0) && (tot_frs >= asoc->fr_max_burst)) { /* Hit FR burst limit */ return; } if ((num_out == 0) && (ret == 0)) { /* No more retrans to send */ break; } } #ifdef SCTP_AUDITING_ENABLED sctp_auditing(12, inp, stcb, NULL); #endif /* Check for bad destinations, if they exist move chunks around. */ TAILQ_FOREACH(net, &asoc->nets, sctp_next) { if (!(net->dest_state & SCTP_ADDR_REACHABLE)) { /*- * if possible move things off of this address we * still may send below due to the dormant state but * we try to find an alternate address to send to * and if we have one we move all queued data on the * out wheel to this alternate address.
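 * The move is only attempted when the net still holds more than
 * one reference (ref_count > 1); sctp_move_chunks_from_net()
 * then re-homes everything that was queued to the unreachable
 * destination.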
*/ if (net->ref_count > 1) sctp_move_chunks_from_net(stcb, net); } else { /*- * if ((asoc->sat_network) || (net->addr_is_local)) * { burst_limit = asoc->max_burst * * SCTP_SAT_NETWORK_BURST_INCR; } */ if (asoc->max_burst > 0) { if (SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst)) { if ((net->flight_size + (asoc->max_burst * net->mtu)) < net->cwnd) { /* * JRS - Use the congestion * control given in the * congestion control module */ asoc->cc_functions.sctp_cwnd_update_after_output(stcb, net, asoc->max_burst); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, net, 0, asoc->max_burst, SCTP_MAX_BURST_APPLIED); } SCTP_STAT_INCR(sctps_maxburstqueued); } net->fast_retran_ip = 0; } else { if (net->flight_size == 0) { /* * Should be decaying the * cwnd here */ ; } } } } } burst_cnt = 0; do { error = sctp_med_chunk_output(inp, stcb, asoc, &num_out, &reason_code, 0, from_where, &now, &now_filled, frag_point, so_locked); if (error) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "Error %d was returned from med-c-op\n", error); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, asoc->primary_destination, error, burst_cnt, SCTP_MAX_BURST_ERROR_STOP); } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, error, SCTP_SEND_NOW_COMPLETES); sctp_log_cwnd(stcb, NULL, 0xdeadbeef, SCTP_SEND_NOW_COMPLETES); } break; } SCTPDBG(SCTP_DEBUG_OUTPUT3, "m-c-o put out %d\n", num_out); tot_out += num_out; burst_cnt++; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, num_out, SCTP_SEND_NOW_COMPLETES); if (num_out == 0) { sctp_log_cwnd(stcb, NULL, reason_code, SCTP_SEND_NOW_COMPLETES); } } if (nagle_on) { /* * When the Nagle algorithm is used, look at how * much is unsent, then if its smaller than an MTU * and we have data in flight we stop, except if we * are handling a fragmented user message. */ un_sent = stcb->asoc.total_output_queue_size - stcb->asoc.total_flight; if ((un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD)) && (stcb->asoc.total_flight > 0)) { /* && sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR))) {*/ break; } } if (TAILQ_EMPTY(&asoc->control_send_queue) && TAILQ_EMPTY(&asoc->send_queue) && sctp_is_there_unsent_data(stcb, so_locked) == 0) { /* Nothing left to send */ break; } if ((stcb->asoc.total_output_queue_size - stcb->asoc.total_flight) <= 0) { /* Nothing left to send */ break; } } while (num_out && ((asoc->max_burst == 0) || SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst) || (burst_cnt < asoc->max_burst))); if (SCTP_BASE_SYSCTL(sctp_use_cwnd_based_maxburst) == 0) { if ((asoc->max_burst > 0) && (burst_cnt >= asoc->max_burst)) { SCTP_STAT_INCR(sctps_maxburstqueued); asoc->burst_limit_applied = 1; if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_MAXBURST_ENABLE) { sctp_log_maxburst(stcb, asoc->primary_destination, 0, burst_cnt, SCTP_MAX_BURST_APPLIED); } } else { asoc->burst_limit_applied = 0; } } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_CWND_LOGGING_ENABLE) { sctp_log_cwnd(stcb, NULL, tot_out, SCTP_SEND_NOW_COMPLETES); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "Ok, we have put out %d chunks\n", tot_out); /*- * Now we need to clean up the control chunk chain if a ECNE is on * it. It must be marked as UNSENT again so next call will continue * to send it until such time that we get a CWR, to remove it. 
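 * The call to sctp_fix_ecn_echo() below takes care of this, and
 * is only made when ecn_echo_cnt_onq indicates an ECNE is still
 * sitting on the control queue.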
*/ if (stcb->asoc.ecn_echo_cnt_onq) sctp_fix_ecn_echo(asoc); if (stcb->asoc.trigger_reset) { if (sctp_send_stream_reset_out_if_possible(stcb, so_locked) == 0) { goto do_it_again; } } return; } int sctp_output( struct sctp_inpcb *inp, struct mbuf *m, struct sockaddr *addr, struct mbuf *control, struct thread *p, int flags) { if (inp == NULL) { SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (inp->sctp_socket == NULL) { SCTP_LTRACE_ERR_RET_PKT(m, inp, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } return (sctp_sosend(inp->sctp_socket, addr, (struct uio *)NULL, m, control, flags, p )); } void send_forward_tsn(struct sctp_tcb *stcb, struct sctp_association *asoc) { struct sctp_tmit_chunk *chk, *at, *tp1, *last; struct sctp_forward_tsn_chunk *fwdtsn; struct sctp_strseq *strseq; struct sctp_strseq_mid *strseq_m; uint32_t advance_peer_ack_point; unsigned int cnt_of_space, i, ovh; unsigned int space_needed; unsigned int cnt_of_skipped = 0; SCTP_TCB_LOCK_ASSERT(stcb); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == SCTP_FORWARD_CUM_TSN) { /* mark it to unsent */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; /* Do we correct its output location? */ if (chk->whoTo) { sctp_free_remote_addr(chk->whoTo); chk->whoTo = NULL; } goto sctp_fill_in_rest; } } /* Ok if we reach here we must build one */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } asoc->fwd_tsn_cnt++; chk->copy_by_ref = 0; /* * We don't do the old thing here since this is used not for on-wire * but to tell if we are sending a fwd-tsn by the stack during * output. And if its a IFORWARD or a FORWARD it is a fwd-tsn. */ chk->rec.chunk_id.id = SCTP_FORWARD_CUM_TSN; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = asoc; chk->whoTo = NULL; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; sctp_fill_in_rest: /*- * Here we go through and fill out the part that deals with * stream/seq of the ones we skip. */ SCTP_BUF_LEN(chk->data) = 0; TAILQ_FOREACH(at, &asoc->sent_queue, sctp_next) { if ((at->sent != SCTP_FORWARD_TSN_SKIP) && (at->sent != SCTP_DATAGRAM_NR_ACKED)) { /* no more to look at */ break; } if (!asoc->idata_supported && (at->rec.data.rcv_flags & SCTP_DATA_UNORDERED)) { /* We don't report these */ continue; } cnt_of_skipped++; } if (asoc->idata_supported) { space_needed = (sizeof(struct sctp_forward_tsn_chunk) + (cnt_of_skipped * sizeof(struct sctp_strseq_mid))); } else { space_needed = (sizeof(struct sctp_forward_tsn_chunk) + (cnt_of_skipped * sizeof(struct sctp_strseq))); } cnt_of_space = (unsigned int)M_TRAILINGSPACE(chk->data); if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) { ovh = SCTP_MIN_OVERHEAD; } else { ovh = SCTP_MIN_V4_OVERHEAD; } if (cnt_of_space > (asoc->smallest_mtu - ovh)) { /* trim to a mtu size */ cnt_of_space = asoc->smallest_mtu - ovh; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, 0, cnt_of_skipped, asoc->advanced_peer_ack_point); } advance_peer_ack_point = asoc->advanced_peer_ack_point; if (cnt_of_space < space_needed) { /*- * ok we must trim down the chunk by lowering the * advance peer ack point. 
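 * The entry count is recomputed from the space actually
 * available (cnt_of_space minus the fixed part of the
 * FORWARD-TSN chunk, divided by the per-entry size), and the
 * sent queue is then walked that many entries to find the
 * highest TSN the trimmed chunk can still report.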
*/ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, 0xff, cnt_of_space, space_needed); } cnt_of_skipped = cnt_of_space - sizeof(struct sctp_forward_tsn_chunk); if (asoc->idata_supported) { cnt_of_skipped /= sizeof(struct sctp_strseq_mid); } else { cnt_of_skipped /= sizeof(struct sctp_strseq); } /*- * Go through and find the TSN that will be the one * we report. */ at = TAILQ_FIRST(&asoc->sent_queue); if (at != NULL) { for (i = 0; i < cnt_of_skipped; i++) { tp1 = TAILQ_NEXT(at, sctp_next); if (tp1 == NULL) { break; } at = tp1; } } if (at && SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LOG_TRY_ADVANCE) { sctp_misc_ints(SCTP_FWD_TSN_CHECK, 0xff, cnt_of_skipped, at->rec.data.tsn, asoc->advanced_peer_ack_point); } last = at; /*- * last now points to last one I can report, update * peer ack point */ if (last) { advance_peer_ack_point = last->rec.data.tsn; } if (asoc->idata_supported) { space_needed = sizeof(struct sctp_forward_tsn_chunk) + cnt_of_skipped * sizeof(struct sctp_strseq_mid); } else { space_needed = sizeof(struct sctp_forward_tsn_chunk) + cnt_of_skipped * sizeof(struct sctp_strseq); } } chk->send_size = space_needed; /* Setup the chunk */ fwdtsn = mtod(chk->data, struct sctp_forward_tsn_chunk *); fwdtsn->ch.chunk_length = htons(chk->send_size); fwdtsn->ch.chunk_flags = 0; if (asoc->idata_supported) { fwdtsn->ch.chunk_type = SCTP_IFORWARD_CUM_TSN; } else { fwdtsn->ch.chunk_type = SCTP_FORWARD_CUM_TSN; } fwdtsn->new_cumulative_tsn = htonl(advance_peer_ack_point); SCTP_BUF_LEN(chk->data) = chk->send_size; fwdtsn++; /*- * Move pointer to after the fwdtsn and transfer to the * strseq pointer. */ if (asoc->idata_supported) { strseq_m = (struct sctp_strseq_mid *)fwdtsn; strseq = NULL; } else { strseq = (struct sctp_strseq *)fwdtsn; strseq_m = NULL; } /*- * Now populate the strseq list. This is done blindly * without pulling out duplicate stream info. This is * inefficent but won't harm the process since the peer will * look at these in sequence and will thus release anything. * It could mean we exceed the PMTU and chop off some that * we could have included.. but this is unlikely (aka 1432/4 * would mean 300+ stream seq's would have to be reported in * one FWD-TSN. With a bit of work we can later FIX this to * optimize and pull out duplicates.. but it does add more * overhead. So for now... not! */ i = 0; TAILQ_FOREACH(at, &asoc->sent_queue, sctp_next) { if (i >= cnt_of_skipped) { break; } if (!asoc->idata_supported && (at->rec.data.rcv_flags & SCTP_DATA_UNORDERED)) { /* We don't report these */ continue; } if (at->rec.data.tsn == advance_peer_ack_point) { at->rec.data.fwd_tsn_cnt = 0; } if (asoc->idata_supported) { strseq_m->sid = htons(at->rec.data.sid); if (at->rec.data.rcv_flags & SCTP_DATA_UNORDERED) { strseq_m->flags = htons(PR_SCTP_UNORDERED_FLAG); } else { strseq_m->flags = 0; } strseq_m->mid = htonl(at->rec.data.mid); strseq_m++; } else { strseq->sid = htons(at->rec.data.sid); strseq->ssn = htons((uint16_t)at->rec.data.mid); strseq++; } i++; } return; } void sctp_send_sack(struct sctp_tcb *stcb, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { /*- * Queue up a SACK or NR-SACK in the control queue. * We must first check to see if a SACK or NR-SACK is * somehow on the control queue. * If so, we will take and and remove the old one. 
*/ struct sctp_association *asoc; struct sctp_tmit_chunk *chk, *a_chk; struct sctp_sack_chunk *sack; struct sctp_nr_sack_chunk *nr_sack; struct sctp_gap_ack_block *gap_descriptor; const struct sack_track *selector; int mergeable = 0; int offset; caddr_t limit; uint32_t *dup; int limit_reached = 0; unsigned int i, siz, j; unsigned int num_gap_blocks = 0, num_nr_gap_blocks = 0, space; int num_dups = 0; int space_req; uint32_t highest_tsn; uint8_t flags; uint8_t type; uint8_t tsn_map; if (stcb->asoc.nrsack_supported == 1) { type = SCTP_NR_SELECTIVE_ACK; } else { type = SCTP_SELECTIVE_ACK; } a_chk = NULL; asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->last_data_chunk_from == NULL) { /* Hmm we never received anything */ return; } sctp_slide_mapping_arrays(stcb); sctp_set_rwnd(stcb, asoc); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if (chk->rec.chunk_id.id == type) { /* Hmm, found a sack already on queue, remove it */ TAILQ_REMOVE(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt--; a_chk = chk; if (a_chk->data) { sctp_m_freem(a_chk->data); a_chk->data = NULL; } if (a_chk->whoTo) { sctp_free_remote_addr(a_chk->whoTo); a_chk->whoTo = NULL; } break; } } if (a_chk == NULL) { sctp_alloc_a_chunk(stcb, a_chk); if (a_chk == NULL) { /* No memory so we drop the idea, and set a timer */ if (stcb->asoc.delayed_ack) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_3); sctp_timer_start(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL); } else { stcb->asoc.send_sack = 1; } return; } a_chk->copy_by_ref = 0; a_chk->rec.chunk_id.id = type; a_chk->rec.chunk_id.can_take_data = 1; } /* Clear our pkt counts */ asoc->data_pkts_seen = 0; a_chk->flags = 0; a_chk->asoc = asoc; a_chk->snd_count = 0; a_chk->send_size = 0; /* fill in later */ a_chk->sent = SCTP_DATAGRAM_UNSENT; a_chk->whoTo = NULL; if (!(asoc->last_data_chunk_from->dest_state & SCTP_ADDR_REACHABLE)) { /*- * Ok, the destination for the SACK is unreachable, lets see if * we can select an alternate to asoc->last_data_chunk_from */ a_chk->whoTo = sctp_find_alternate_net(stcb, asoc->last_data_chunk_from, 0); if (a_chk->whoTo == NULL) { /* Nope, no alternate */ a_chk->whoTo = asoc->last_data_chunk_from; } } else { a_chk->whoTo = asoc->last_data_chunk_from; } if (a_chk->whoTo) { atomic_add_int(&a_chk->whoTo->ref_count, 1); } if (SCTP_TSN_GT(asoc->highest_tsn_inside_map, asoc->highest_tsn_inside_nr_map)) { highest_tsn = asoc->highest_tsn_inside_map; } else { highest_tsn = asoc->highest_tsn_inside_nr_map; } if (highest_tsn == asoc->cumulative_tsn) { /* no gaps */ if (type == SCTP_SELECTIVE_ACK) { space_req = sizeof(struct sctp_sack_chunk); } else { space_req = sizeof(struct sctp_nr_sack_chunk); } } else { /* gaps get a cluster */ space_req = MCLBYTES; } /* Ok now lets formulate a MBUF with our sack */ a_chk->data = sctp_get_mbuf_for_msg(space_req, 0, M_NOWAIT, 1, MT_DATA); if ((a_chk->data == NULL) || (a_chk->whoTo == NULL)) { /* rats, no mbuf memory */ if (a_chk->data) { /* was a problem with the destination */ sctp_m_freem(a_chk->data); a_chk->data = NULL; } sctp_free_a_chunk(stcb, a_chk, so_locked); /* sa_ignore NO_NULL_CHK */ if (stcb->asoc.delayed_ack) { sctp_timer_stop(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_4); sctp_timer_start(SCTP_TIMER_TYPE_RECV, stcb->sctp_ep, stcb, NULL); } else { stcb->asoc.send_sack = 1; } return; } /* ok, lets go through and fill it in */ SCTP_BUF_RESV_UF(a_chk->data, SCTP_MIN_OVERHEAD); space = 
(unsigned int)M_TRAILINGSPACE(a_chk->data); if (space > (a_chk->whoTo->mtu - SCTP_MIN_OVERHEAD)) { space = (a_chk->whoTo->mtu - SCTP_MIN_OVERHEAD); } limit = mtod(a_chk->data, caddr_t); limit += space; flags = 0; if ((asoc->sctp_cmt_on_off > 0) && SCTP_BASE_SYSCTL(sctp_cmt_use_dac)) { /*- * CMT DAC algorithm: If 2 (i.e., 0x10) packets have been * received, then set high bit to 1, else 0. Reset * pkts_rcvd. */ flags |= (asoc->cmt_dac_pkts_rcvd << 6); asoc->cmt_dac_pkts_rcvd = 0; } #ifdef SCTP_ASOCLOG_OF_TSNS stcb->asoc.cumack_logsnt[stcb->asoc.cumack_log_atsnt] = asoc->cumulative_tsn; stcb->asoc.cumack_log_atsnt++; if (stcb->asoc.cumack_log_atsnt >= SCTP_TSN_LOG_SIZE) { stcb->asoc.cumack_log_atsnt = 0; } #endif /* reset the readers interpretation */ stcb->freed_by_sorcv_sincelast = 0; if (type == SCTP_SELECTIVE_ACK) { sack = mtod(a_chk->data, struct sctp_sack_chunk *); nr_sack = NULL; gap_descriptor = (struct sctp_gap_ack_block *)((caddr_t)sack + sizeof(struct sctp_sack_chunk)); if (highest_tsn > asoc->mapping_array_base_tsn) { siz = (((highest_tsn - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - highest_tsn) + 1) + highest_tsn + 7) / 8; } } else { sack = NULL; nr_sack = mtod(a_chk->data, struct sctp_nr_sack_chunk *); gap_descriptor = (struct sctp_gap_ack_block *)((caddr_t)nr_sack + sizeof(struct sctp_nr_sack_chunk)); if (asoc->highest_tsn_inside_map > asoc->mapping_array_base_tsn) { siz = (((asoc->highest_tsn_inside_map - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - asoc->mapping_array_base_tsn) + 1) + asoc->highest_tsn_inside_map + 7) / 8; } } if (SCTP_TSN_GT(asoc->mapping_array_base_tsn, asoc->cumulative_tsn)) { offset = 1; } else { offset = asoc->mapping_array_base_tsn - asoc->cumulative_tsn; } if (((type == SCTP_SELECTIVE_ACK) && SCTP_TSN_GT(highest_tsn, asoc->cumulative_tsn)) || ((type == SCTP_NR_SELECTIVE_ACK) && SCTP_TSN_GT(asoc->highest_tsn_inside_map, asoc->cumulative_tsn))) { /* we have a gap .. maybe */ for (i = 0; i < siz; i++) { tsn_map = asoc->mapping_array[i]; if (type == SCTP_SELECTIVE_ACK) { tsn_map |= asoc->nr_mapping_array[i]; } if (i == 0) { /* * Clear all bits corresponding to TSNs * smaller or equal to the cumulative TSN. */ tsn_map &= (~0U << (1 - offset)); } selector = &sack_array[tsn_map]; if (mergeable && selector->right_edge) { /* * Backup, left and right edges were ok to * merge. 
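 * Each byte of the mapping array (8 TSNs) indexes the
 * sack_array[] lookup table of gap descriptors; when the edge
 * flags show a gap continuing across the byte boundary, the
 * merge is done by backing the descriptor pointer up one slot
 * instead of emitting a new block.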
*/ num_gap_blocks--; gap_descriptor--; } if (selector->num_entries == 0) mergeable = 0; else { for (j = 0; j < selector->num_entries; j++) { if (mergeable && selector->right_edge) { /* * do a merge by NOT setting * the left side */ mergeable = 0; } else { /* * no merge, set the left * side */ mergeable = 0; gap_descriptor->start = htons((selector->gaps[j].start + offset)); } gap_descriptor->end = htons((selector->gaps[j].end + offset)); num_gap_blocks++; gap_descriptor++; if (((caddr_t)gap_descriptor + sizeof(struct sctp_gap_ack_block)) > limit) { /* no more room */ limit_reached = 1; break; } } if (selector->left_edge) { mergeable = 1; } } if (limit_reached) { /* Reached the limit stop */ break; } offset += 8; } } if ((type == SCTP_NR_SELECTIVE_ACK) && (limit_reached == 0)) { mergeable = 0; if (asoc->highest_tsn_inside_nr_map > asoc->mapping_array_base_tsn) { siz = (((asoc->highest_tsn_inside_nr_map - asoc->mapping_array_base_tsn) + 1) + 7) / 8; } else { siz = (((MAX_TSN - asoc->mapping_array_base_tsn) + 1) + asoc->highest_tsn_inside_nr_map + 7) / 8; } if (SCTP_TSN_GT(asoc->mapping_array_base_tsn, asoc->cumulative_tsn)) { offset = 1; } else { offset = asoc->mapping_array_base_tsn - asoc->cumulative_tsn; } if (SCTP_TSN_GT(asoc->highest_tsn_inside_nr_map, asoc->cumulative_tsn)) { /* we have a gap .. maybe */ for (i = 0; i < siz; i++) { tsn_map = asoc->nr_mapping_array[i]; if (i == 0) { /* * Clear all bits corresponding to * TSNs smaller or equal to the * cumulative TSN. */ tsn_map &= (~0U << (1 - offset)); } selector = &sack_array[tsn_map]; if (mergeable && selector->right_edge) { /* * Backup, left and right edges were * ok to merge. */ num_nr_gap_blocks--; gap_descriptor--; } if (selector->num_entries == 0) mergeable = 0; else { for (j = 0; j < selector->num_entries; j++) { if (mergeable && selector->right_edge) { /* * do a merge by NOT * setting the left * side */ mergeable = 0; } else { /* * no merge, set the * left side */ mergeable = 0; gap_descriptor->start = htons((selector->gaps[j].start + offset)); } gap_descriptor->end = htons((selector->gaps[j].end + offset)); num_nr_gap_blocks++; gap_descriptor++; if (((caddr_t)gap_descriptor + sizeof(struct sctp_gap_ack_block)) > limit) { /* no more room */ limit_reached = 1; break; } } if (selector->left_edge) { mergeable = 1; } } if (limit_reached) { /* Reached the limit stop */ break; } offset += 8; } } } /* now we must add any dups we are going to report. */ if ((limit_reached == 0) && (asoc->numduptsns)) { dup = (uint32_t *)gap_descriptor; for (i = 0; i < asoc->numduptsns; i++) { *dup = htonl(asoc->dup_tsns[i]); dup++; num_dups++; if (((caddr_t)dup + sizeof(uint32_t)) > limit) { /* no more room */ break; } } asoc->numduptsns = 0; } /* * now that the chunk is prepared queue it to the control chunk * queue. 
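 * The chunk length below is computed from the number of gap
 * blocks and duplicate TSNs that were actually written before
 * the SACK (or NR-SACK) is appended to the queue.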
*/ if (type == SCTP_SELECTIVE_ACK) { a_chk->send_size = (uint16_t)(sizeof(struct sctp_sack_chunk) + (num_gap_blocks + num_nr_gap_blocks) * sizeof(struct sctp_gap_ack_block) + num_dups * sizeof(int32_t)); SCTP_BUF_LEN(a_chk->data) = a_chk->send_size; sack->sack.cum_tsn_ack = htonl(asoc->cumulative_tsn); sack->sack.a_rwnd = htonl(asoc->my_rwnd); sack->sack.num_gap_ack_blks = htons(num_gap_blocks); sack->sack.num_dup_tsns = htons(num_dups); sack->ch.chunk_type = type; sack->ch.chunk_flags = flags; sack->ch.chunk_length = htons(a_chk->send_size); } else { a_chk->send_size = (uint16_t)(sizeof(struct sctp_nr_sack_chunk) + (num_gap_blocks + num_nr_gap_blocks) * sizeof(struct sctp_gap_ack_block) + num_dups * sizeof(int32_t)); SCTP_BUF_LEN(a_chk->data) = a_chk->send_size; nr_sack->nr_sack.cum_tsn_ack = htonl(asoc->cumulative_tsn); nr_sack->nr_sack.a_rwnd = htonl(asoc->my_rwnd); nr_sack->nr_sack.num_gap_ack_blks = htons(num_gap_blocks); nr_sack->nr_sack.num_nr_gap_ack_blks = htons(num_nr_gap_blocks); nr_sack->nr_sack.num_dup_tsns = htons(num_dups); nr_sack->nr_sack.reserved = 0; nr_sack->ch.chunk_type = type; nr_sack->ch.chunk_flags = flags; nr_sack->ch.chunk_length = htons(a_chk->send_size); } TAILQ_INSERT_TAIL(&asoc->control_send_queue, a_chk, sctp_next); asoc->my_last_reported_rwnd = asoc->my_rwnd; asoc->ctrl_queue_cnt++; asoc->send_sack = 0; SCTP_STAT_INCR(sctps_sendsacks); return; } void sctp_send_abort_tcb(struct sctp_tcb *stcb, struct mbuf *operr, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct mbuf *m_abort, *m, *m_last; struct mbuf *m_out, *m_end = NULL; struct sctp_abort_chunk *abort; struct sctp_auth_chunk *auth = NULL; struct sctp_nets *net; uint32_t vtag; uint32_t auth_offset = 0; int error; uint16_t cause_len, chunk_len, padding_len; SCTP_TCB_LOCK_ASSERT(stcb); /*- * Add an AUTH chunk, if chunk requires it and save the offset into * the chain for AUTH */ if (sctp_auth_is_required_chunk(SCTP_ABORT_ASSOCIATION, stcb->asoc.peer_auth_chunks)) { m_out = sctp_add_auth_chunk(NULL, &m_end, &auth, &auth_offset, stcb, SCTP_ABORT_ASSOCIATION); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } else { m_out = NULL; } m_abort = sctp_get_mbuf_for_msg(sizeof(struct sctp_abort_chunk), 0, M_NOWAIT, 1, MT_HEADER); if (m_abort == NULL) { if (m_out) { sctp_m_freem(m_out); } if (operr) { sctp_m_freem(operr); } return; } /* link in any error */ SCTP_BUF_NEXT(m_abort) = operr; cause_len = 0; m_last = NULL; for (m = operr; m; m = SCTP_BUF_NEXT(m)) { cause_len += (uint16_t)SCTP_BUF_LEN(m); if (SCTP_BUF_NEXT(m) == NULL) { m_last = m; } } SCTP_BUF_LEN(m_abort) = sizeof(struct sctp_abort_chunk); chunk_len = (uint16_t)sizeof(struct sctp_abort_chunk) + cause_len; padding_len = SCTP_SIZE32(chunk_len) - chunk_len; if (m_out == NULL) { /* NO Auth chunk prepended, so reserve space in front */ SCTP_BUF_RESV_UF(m_abort, SCTP_MIN_OVERHEAD); m_out = m_abort; } else { /* Put AUTH chunk at the front of the chain */ SCTP_BUF_NEXT(m_end) = m_abort; } if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } /* Fill in the ABORT chunk header. */ abort = mtod(m_abort, struct sctp_abort_chunk *); abort->ch.chunk_type = SCTP_ABORT_ASSOCIATION; if (stcb->asoc.peer_vtag == 0) { /* This happens iff the assoc is in COOKIE-WAIT state. 
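 * In that case the ABORT carries our own verification tag and
 * sets the T bit (SCTP_HAD_NO_TCB) in the chunk flags;
 * otherwise the peer's tag is used and no flags are set.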
*/ vtag = stcb->asoc.my_vtag; abort->ch.chunk_flags = SCTP_HAD_NO_TCB; } else { vtag = stcb->asoc.peer_vtag; abort->ch.chunk_flags = 0; } abort->ch.chunk_length = htons(chunk_len); /* Add padding, if necessary. */ if (padding_len > 0) { if ((m_last == NULL) || (sctp_add_pad_tombuf(m_last, padding_len) == NULL)) { sctp_m_freem(m_out); return; } } if ((error = sctp_lowlevel_chunk_output(stcb->sctp_ep, stcb, net, (struct sockaddr *)&net->ro._l_addr, m_out, auth_offset, auth, stcb->asoc.authinfo.active_keyid, 1, 0, 0, stcb->sctp_ep->sctp_lport, stcb->rport, htonl(vtag), stcb->asoc.primary_destination->port, NULL, 0, 0, so_locked))) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (error == ENOBUFS) { stcb->asoc.ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } } else { stcb->asoc.ifp_had_enobuf = 0; } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); } void sctp_send_shutdown_complete(struct sctp_tcb *stcb, struct sctp_nets *net, int reflect_vtag) { /* formulate and SEND a SHUTDOWN-COMPLETE */ struct mbuf *m_shutdown_comp; struct sctp_shutdown_complete_chunk *shutdown_complete; uint32_t vtag; int error; uint8_t flags; m_shutdown_comp = sctp_get_mbuf_for_msg(sizeof(struct sctp_chunkhdr), 0, M_NOWAIT, 1, MT_HEADER); if (m_shutdown_comp == NULL) { /* no mbuf's */ return; } if (reflect_vtag) { flags = SCTP_HAD_NO_TCB; vtag = stcb->asoc.my_vtag; } else { flags = 0; vtag = stcb->asoc.peer_vtag; } shutdown_complete = mtod(m_shutdown_comp, struct sctp_shutdown_complete_chunk *); shutdown_complete->ch.chunk_type = SCTP_SHUTDOWN_COMPLETE; shutdown_complete->ch.chunk_flags = flags; shutdown_complete->ch.chunk_length = htons(sizeof(struct sctp_shutdown_complete_chunk)); SCTP_BUF_LEN(m_shutdown_comp) = sizeof(struct sctp_shutdown_complete_chunk); if ((error = sctp_lowlevel_chunk_output(stcb->sctp_ep, stcb, net, (struct sockaddr *)&net->ro._l_addr, m_shutdown_comp, 0, NULL, 0, 1, 0, 0, stcb->sctp_ep->sctp_lport, stcb->rport, htonl(vtag), net->port, NULL, 0, 0, SCTP_SO_NOT_LOCKED))) { SCTPDBG(SCTP_DEBUG_OUTPUT3, "Gak send error %d\n", error); if (error == ENOBUFS) { stcb->asoc.ifp_had_enobuf = 1; SCTP_STAT_INCR(sctps_lowlevelerr); } } else { stcb->asoc.ifp_had_enobuf = 0; } SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); return; } static void sctp_send_resp_msg(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, uint8_t type, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { struct mbuf *o_pak; struct mbuf *mout; struct sctphdr *shout; struct sctp_chunkhdr *ch; #if defined(INET) || defined(INET6) struct udphdr *udp; #endif int ret, len, cause_len, padding_len; #ifdef INET struct sockaddr_in *src_sin, *dst_sin; struct ip *ip; #endif #ifdef INET6 struct sockaddr_in6 *src_sin6, *dst_sin6; struct ip6_hdr *ip6; #endif /* Compute the length of the cause and add final padding. */ cause_len = 0; if (cause != NULL) { struct mbuf *m_at, *m_last = NULL; for (m_at = cause; m_at; m_at = SCTP_BUF_NEXT(m_at)) { if (SCTP_BUF_NEXT(m_at) == NULL) m_last = m_at; cause_len += SCTP_BUF_LEN(m_at); } padding_len = cause_len % 4; if (padding_len != 0) { padding_len = 4 - padding_len; } if (padding_len != 0) { if (sctp_add_pad_tombuf(m_last, padding_len) == NULL) { sctp_m_freem(cause); return; } } } else { padding_len = 0; } /* Get an mbuf for the header. 
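 * The mbuf is sized for the SCTP common header plus one chunk
 * header and, depending on the destination family and on UDP
 * encapsulation, an IPv4 or IPv6 header and a UDP header, with
 * max_linkhdr reserved up front for the link-layer header.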
*/ len = sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr); switch (dst->sa_family) { #ifdef INET case AF_INET: len += sizeof(struct ip); break; #endif #ifdef INET6 case AF_INET6: len += sizeof(struct ip6_hdr); break; #endif default: break; } #if defined(INET) || defined(INET6) if (port) { len += sizeof(struct udphdr); } #endif mout = sctp_get_mbuf_for_msg(len + max_linkhdr, 1, M_NOWAIT, 1, MT_DATA); if (mout == NULL) { if (cause) { sctp_m_freem(cause); } return; } SCTP_BUF_RESV_UF(mout, max_linkhdr); SCTP_BUF_LEN(mout) = len; SCTP_BUF_NEXT(mout) = cause; M_SETFIB(mout, fibnum); mout->m_pkthdr.flowid = mflowid; M_HASHTYPE_SET(mout, mflowtype); #ifdef INET ip = NULL; #endif #ifdef INET6 ip6 = NULL; #endif switch (dst->sa_family) { #ifdef INET case AF_INET: src_sin = (struct sockaddr_in *)src; dst_sin = (struct sockaddr_in *)dst; ip = mtod(mout, struct ip *); ip->ip_v = IPVERSION; ip->ip_hl = (sizeof(struct ip) >> 2); ip->ip_tos = 0; ip->ip_off = htons(IP_DF); ip_fillid(ip); ip->ip_ttl = MODULE_GLOBAL(ip_defttl); if (port) { ip->ip_p = IPPROTO_UDP; } else { ip->ip_p = IPPROTO_SCTP; } ip->ip_src.s_addr = dst_sin->sin_addr.s_addr; ip->ip_dst.s_addr = src_sin->sin_addr.s_addr; ip->ip_sum = 0; len = sizeof(struct ip); shout = (struct sctphdr *)((caddr_t)ip + len); break; #endif #ifdef INET6 case AF_INET6: src_sin6 = (struct sockaddr_in6 *)src; dst_sin6 = (struct sockaddr_in6 *)dst; ip6 = mtod(mout, struct ip6_hdr *); ip6->ip6_flow = htonl(0x60000000); if (V_ip6_auto_flowlabel) { ip6->ip6_flow |= (htonl(ip6_randomflowlabel()) & IPV6_FLOWLABEL_MASK); } ip6->ip6_hlim = MODULE_GLOBAL(ip6_defhlim); if (port) { ip6->ip6_nxt = IPPROTO_UDP; } else { ip6->ip6_nxt = IPPROTO_SCTP; } ip6->ip6_src = dst_sin6->sin6_addr; ip6->ip6_dst = src_sin6->sin6_addr; len = sizeof(struct ip6_hdr); shout = (struct sctphdr *)((caddr_t)ip6 + len); break; #endif default: len = 0; shout = mtod(mout, struct sctphdr *); break; } #if defined(INET) || defined(INET6) if (port) { if (htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)) == 0) { sctp_m_freem(mout); return; } udp = (struct udphdr *)shout; udp->uh_sport = htons(SCTP_BASE_SYSCTL(sctp_udp_tunneling_port)); udp->uh_dport = port; udp->uh_sum = 0; udp->uh_ulen = htons((uint16_t)(sizeof(struct udphdr) + sizeof(struct sctphdr) + sizeof(struct sctp_chunkhdr) + cause_len + padding_len)); len += sizeof(struct udphdr); shout = (struct sctphdr *)((caddr_t)shout + sizeof(struct udphdr)); } else { udp = NULL; } #endif shout->src_port = sh->dest_port; shout->dest_port = sh->src_port; shout->checksum = 0; if (vtag) { shout->v_tag = htonl(vtag); } else { shout->v_tag = sh->v_tag; } len += sizeof(struct sctphdr); ch = (struct sctp_chunkhdr *)((caddr_t)shout + sizeof(struct sctphdr)); ch->chunk_type = type; if (vtag) { ch->chunk_flags = 0; } else { ch->chunk_flags = SCTP_HAD_NO_TCB; } ch->chunk_length = htons((uint16_t)(sizeof(struct sctp_chunkhdr) + cause_len)); len += sizeof(struct sctp_chunkhdr); len += cause_len + padding_len; if (SCTP_GET_HEADER_FOR_OUTPUT(o_pak)) { sctp_m_freem(mout); return; } SCTP_ATTACH_CHAIN(o_pak, mout, len); switch (dst->sa_family) { #ifdef INET case AF_INET: if (port) { if (V_udp_cksum) { udp->uh_sum = in_pseudo(ip->ip_src.s_addr, ip->ip_dst.s_addr, udp->uh_ulen + htons(IPPROTO_UDP)); } else { udp->uh_sum = 0; } } ip->ip_len = htons(len); if (port) { shout->checksum = sctp_calculate_cksum(mout, sizeof(struct ip) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); if (V_udp_cksum) { SCTP_ENABLE_UDP_CSUM(o_pak); } } else { mout->m_pkthdr.csum_flags = 
CSUM_SCTP; mout->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) { sctp_packet_log(o_pak); } #endif SCTP_PROBE5(send, NULL, NULL, ip, NULL, shout); SCTP_IP_OUTPUT(ret, o_pak, NULL, NULL, vrf_id); break; #endif #ifdef INET6 case AF_INET6: ip6->ip6_plen = htons((uint16_t)(len - sizeof(struct ip6_hdr))); if (port) { shout->checksum = sctp_calculate_cksum(mout, sizeof(struct ip6_hdr) + sizeof(struct udphdr)); SCTP_STAT_INCR(sctps_sendswcrc); if ((udp->uh_sum = in6_cksum(o_pak, IPPROTO_UDP, sizeof(struct ip6_hdr), len - sizeof(struct ip6_hdr))) == 0) { udp->uh_sum = 0xffff; } } else { mout->m_pkthdr.csum_flags = CSUM_SCTP_IPV6; mout->m_pkthdr.csum_data = offsetof(struct sctphdr, checksum); SCTP_STAT_INCR(sctps_sendhwcrc); } #ifdef SCTP_PACKET_LOGGING if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_LAST_PACKET_TRACING) { sctp_packet_log(o_pak); } #endif SCTP_PROBE5(send, NULL, NULL, ip6, NULL, shout); SCTP_IP6_OUTPUT(ret, o_pak, NULL, NULL, NULL, vrf_id); break; #endif default: SCTPDBG(SCTP_DEBUG_OUTPUT1, "Unknown protocol (TSNH) type %d\n", dst->sa_family); sctp_m_freem(mout); SCTP_LTRACE_ERR_RET_PKT(mout, NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EFAULT); return; } SCTPDBG(SCTP_DEBUG_OUTPUT3, "return from send is %d\n", ret); if (port) { UDPSTAT_INC(udps_opackets); } SCTP_STAT_INCR(sctps_sendpackets); SCTP_STAT_INCR_COUNTER64(sctps_outpackets); SCTP_STAT_INCR_COUNTER64(sctps_outcontrolchunks); if (ret) { SCTP_STAT_INCR(sctps_senderrors); } return; } void sctp_send_shutdown_complete2(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { sctp_send_resp_msg(src, dst, sh, 0, SCTP_SHUTDOWN_COMPLETE, NULL, mflowtype, mflowid, fibnum, vrf_id, port); } void sctp_send_hb(struct sctp_tcb *stcb, struct sctp_nets *net, int so_locked #if !defined(__APPLE__) && !defined(SCTP_SO_LOCK_TESTING) SCTP_UNUSED #endif ) { struct sctp_tmit_chunk *chk; struct sctp_heartbeat_chunk *hb; struct timeval now; SCTP_TCB_LOCK_ASSERT(stcb); if (net == NULL) { return; } (void)SCTP_GETTIME_TIMEVAL(&now); switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: break; #endif #ifdef INET6 case AF_INET6: break; #endif default: return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTPDBG(SCTP_DEBUG_OUTPUT4, "Gak, can't get a chunk for hb\n"); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_HEARTBEAT_REQUEST; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_heartbeat_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, so_locked); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); /* Now we have a mbuf that we can fill in with the details */ hb = mtod(chk->data, struct sctp_heartbeat_chunk *); memset(hb, 0, sizeof(struct sctp_heartbeat_chunk)); /* fill out chunk header */ hb->ch.chunk_type = SCTP_HEARTBEAT_REQUEST; hb->ch.chunk_flags = 0; hb->ch.chunk_length = htons(chk->send_size); /* Fill out hb parameter */ hb->heartbeat.hb_info.ph.param_type = htons(SCTP_HEARTBEAT_INFO); hb->heartbeat.hb_info.ph.param_length = htons(sizeof(struct sctp_heartbeat_info_param)); 
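/*
 * The info parameter carries the time the HEARTBEAT was sent; the
 * random values below are only drawn for unconfirmed addresses and
 * are mirrored into net->heartbeat_random1/2, presumably so the
 * returned HEARTBEAT-ACK can be matched when confirming the path.
 */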
hb->heartbeat.hb_info.time_value_1 = now.tv_sec; hb->heartbeat.hb_info.time_value_2 = now.tv_usec; /* Did our user request this one, put it in */ hb->heartbeat.hb_info.addr_family = (uint8_t)net->ro._l_addr.sa.sa_family; hb->heartbeat.hb_info.addr_len = net->ro._l_addr.sa.sa_len; if (net->dest_state & SCTP_ADDR_UNCONFIRMED) { /* * we only take from the entropy pool if the address is not * confirmed. */ net->heartbeat_random1 = hb->heartbeat.hb_info.random_value1 = sctp_select_initial_TSN(&stcb->sctp_ep->sctp_ep); net->heartbeat_random2 = hb->heartbeat.hb_info.random_value2 = sctp_select_initial_TSN(&stcb->sctp_ep->sctp_ep); } else { net->heartbeat_random1 = hb->heartbeat.hb_info.random_value1 = 0; net->heartbeat_random2 = hb->heartbeat.hb_info.random_value2 = 0; } switch (net->ro._l_addr.sa.sa_family) { #ifdef INET case AF_INET: memcpy(hb->heartbeat.hb_info.address, &net->ro._l_addr.sin.sin_addr, sizeof(net->ro._l_addr.sin.sin_addr)); break; #endif #ifdef INET6 case AF_INET6: memcpy(hb->heartbeat.hb_info.address, &net->ro._l_addr.sin6.sin6_addr, sizeof(net->ro._l_addr.sin6.sin6_addr)); break; #endif default: if (chk->data) { sctp_m_freem(chk->data); chk->data = NULL; } sctp_free_a_chunk(stcb, chk, so_locked); return; break; } net->hb_responded = 0; TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); stcb->asoc.ctrl_queue_cnt++; SCTP_STAT_INCR(sctps_sendheartbeat); return; } void sctp_send_ecn_echo(struct sctp_tcb *stcb, struct sctp_nets *net, uint32_t high_tsn) { struct sctp_association *asoc; struct sctp_ecne_chunk *ecne; struct sctp_tmit_chunk *chk; if (net == NULL) { return; } asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_ECN_ECHO) && (net == chk->whoTo)) { /* found a previous ECN_ECHO update it if needed */ uint32_t cnt, ctsn; ecne = mtod(chk->data, struct sctp_ecne_chunk *); ctsn = ntohl(ecne->tsn); if (SCTP_TSN_GT(high_tsn, ctsn)) { ecne->tsn = htonl(high_tsn); SCTP_STAT_INCR(sctps_queue_upd_ecne); } cnt = ntohl(ecne->num_pkts_since_cwr); cnt++; ecne->num_pkts_since_cwr = htonl(cnt); return; } } /* nope could not find one to update so we must build one */ sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } SCTP_STAT_INCR(sctps_queue_upd_ecne); chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ECN_ECHO; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_ecne_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); stcb->asoc.ecn_echo_cnt_onq++; ecne = mtod(chk->data, struct sctp_ecne_chunk *); ecne->ch.chunk_type = SCTP_ECN_ECHO; ecne->ch.chunk_flags = 0; ecne->ch.chunk_length = htons(sizeof(struct sctp_ecne_chunk)); ecne->tsn = htonl(high_tsn); ecne->num_pkts_since_cwr = htonl(1); TAILQ_INSERT_HEAD(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_send_packet_dropped(struct sctp_tcb *stcb, struct sctp_nets *net, struct mbuf *m, int len, int iphlen, int bad_crc) { struct sctp_association *asoc; struct sctp_pktdrop_chunk *drp; struct sctp_tmit_chunk *chk; uint8_t *datap; int was_trunc = 0; int fullsz = 0; long spc; int offset; struct sctp_chunkhdr *ch, 
chunk_buf; unsigned int chk_length; if (!stcb) { return; } asoc = &stcb->asoc; SCTP_TCB_LOCK_ASSERT(stcb); if (asoc->pktdrop_supported == 0) { /*- * peer must declare support before I send one. */ return; } if (stcb->sctp_socket == NULL) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_PACKET_DROPPED; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; len -= iphlen; chk->send_size = len; /* Validate that we do not have an ABORT in here. */ offset = iphlen + sizeof(struct sctphdr); ch = (struct sctp_chunkhdr *)sctp_m_getptr(m, offset, sizeof(*ch), (uint8_t *)&chunk_buf); while (ch != NULL) { chk_length = ntohs(ch->chunk_length); if (chk_length < sizeof(*ch)) { /* break to abort land */ break; } switch (ch->chunk_type) { case SCTP_PACKET_DROPPED: case SCTP_ABORT_ASSOCIATION: case SCTP_INITIATION_ACK: /** * We don't respond with an PKT-DROP to an ABORT * or PKT-DROP. We also do not respond to an * INIT-ACK, because we can't know if the initiation * tag is correct or not. */ sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; default: break; } offset += SCTP_SIZE32(chk_length); ch = (struct sctp_chunkhdr *)sctp_m_getptr(m, offset, sizeof(*ch), (uint8_t *)&chunk_buf); } if ((len + SCTP_MAX_OVERHEAD + sizeof(struct sctp_pktdrop_chunk)) > min(stcb->asoc.smallest_mtu, MCLBYTES)) { /* * only send 1 mtu worth, trim off the excess on the end. */ fullsz = len; len = min(stcb->asoc.smallest_mtu, MCLBYTES) - SCTP_MAX_OVERHEAD; was_trunc = 1; } chk->asoc = &stcb->asoc; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { jump_out: sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); drp = mtod(chk->data, struct sctp_pktdrop_chunk *); if (drp == NULL) { sctp_m_freem(chk->data); chk->data = NULL; goto jump_out; } chk->book_size = SCTP_SIZE32((chk->send_size + sizeof(struct sctp_pktdrop_chunk) + sizeof(struct sctphdr) + SCTP_MED_OVERHEAD)); chk->book_size_scale = 0; if (was_trunc) { drp->ch.chunk_flags = SCTP_PACKET_TRUNCATED; drp->trunc_len = htons(fullsz); /* * Len is already adjusted to size minus overhead above take * out the pkt_drop chunk itself from it. 
*/ chk->send_size = (uint16_t)(len - sizeof(struct sctp_pktdrop_chunk)); len = chk->send_size; } else { /* no truncation needed */ drp->ch.chunk_flags = 0; drp->trunc_len = htons(0); } if (bad_crc) { drp->ch.chunk_flags |= SCTP_BADCRC; } chk->send_size += sizeof(struct sctp_pktdrop_chunk); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (net) { /* we should hit here */ chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); } else { chk->whoTo = NULL; } drp->ch.chunk_type = SCTP_PACKET_DROPPED; drp->ch.chunk_length = htons(chk->send_size); spc = SCTP_SB_LIMIT_RCV(stcb->sctp_socket); if (spc < 0) { spc = 0; } drp->bottle_bw = htonl(spc); if (asoc->my_rwnd) { drp->current_onq = htonl(asoc->size_on_reasm_queue + asoc->size_on_all_streams + asoc->my_rwnd_control_len + stcb->sctp_socket->so_rcv.sb_cc); } else { /*- * If my rwnd is 0, possibly from mbuf depletion as well as * space used, tell the peer there is NO space aka onq == bw */ drp->current_onq = htonl(spc); } drp->reserved = 0; datap = drp->data; m_copydata(m, iphlen, len, (caddr_t)datap); TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_send_cwr(struct sctp_tcb *stcb, struct sctp_nets *net, uint32_t high_tsn, uint8_t override) { struct sctp_association *asoc; struct sctp_cwr_chunk *cwr; struct sctp_tmit_chunk *chk; SCTP_TCB_LOCK_ASSERT(stcb); if (net == NULL) { return; } asoc = &stcb->asoc; TAILQ_FOREACH(chk, &asoc->control_send_queue, sctp_next) { if ((chk->rec.chunk_id.id == SCTP_ECN_CWR) && (net == chk->whoTo)) { /* * found a previous CWR queued to same destination * update it if needed */ uint32_t ctsn; cwr = mtod(chk->data, struct sctp_cwr_chunk *); ctsn = ntohl(cwr->tsn); if (SCTP_TSN_GT(high_tsn, ctsn)) { cwr->tsn = htonl(high_tsn); } if (override & SCTP_CWR_REDUCE_OVERRIDE) { /* Make sure override is carried */ cwr->ch.chunk_flags |= SCTP_CWR_REDUCE_OVERRIDE; } return; } } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_ECN_CWR; chk->rec.chunk_id.can_take_data = 1; chk->flags = 0; chk->asoc = &stcb->asoc; chk->send_size = sizeof(struct sctp_cwr_chunk); chk->data = sctp_get_mbuf_for_msg(chk->send_size, 0, M_NOWAIT, 1, MT_HEADER); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_NOT_LOCKED); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); SCTP_BUF_LEN(chk->data) = chk->send_size; chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; chk->whoTo = net; atomic_add_int(&chk->whoTo->ref_count, 1); cwr = mtod(chk->data, struct sctp_cwr_chunk *); cwr->ch.chunk_type = SCTP_ECN_CWR; cwr->ch.chunk_flags = override; cwr->ch.chunk_length = htons(sizeof(struct sctp_cwr_chunk)); cwr->tsn = htonl(high_tsn); TAILQ_INSERT_TAIL(&stcb->asoc.control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } static int sctp_add_stream_reset_out(struct sctp_tcb *stcb, struct sctp_tmit_chunk *chk, uint32_t seq, uint32_t resp_seq, uint32_t last_sent) { uint16_t len, old_len, i; struct sctp_stream_reset_out_request *req_out; struct sctp_chunkhdr *ch; int at; int number_entries = 0; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_out = (struct sctp_stream_reset_out_request *)((caddr_t)ch + len); /* now how long will this param be? 
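 * Only streams in SCTP_STREAM_RESET_PENDING with nothing left on
 * their queues are listed; if every outgoing stream qualifies,
 * the list is left empty (requesting a reset of all streams),
 * and the count is capped at SCTP_MAX_STREAMS_AT_ONCE_RESET.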
*/ for (i = 0; i < stcb->asoc.streamoutcnt; i++) { if ((stcb->asoc.strmout[i].state == SCTP_STREAM_RESET_PENDING) && (stcb->asoc.strmout[i].chunks_on_queues == 0) && TAILQ_EMPTY(&stcb->asoc.strmout[i].outqueue)) { number_entries++; } } if (number_entries == 0) { return (0); } if (number_entries == stcb->asoc.streamoutcnt) { number_entries = 0; } if (number_entries > SCTP_MAX_STREAMS_AT_ONCE_RESET) { number_entries = SCTP_MAX_STREAMS_AT_ONCE_RESET; } len = (uint16_t)(sizeof(struct sctp_stream_reset_out_request) + (sizeof(uint16_t) * number_entries)); req_out->ph.param_type = htons(SCTP_STR_RESET_OUT_REQUEST); req_out->ph.param_length = htons(len); req_out->request_seq = htonl(seq); req_out->response_seq = htonl(resp_seq); req_out->send_reset_at_tsn = htonl(last_sent); at = 0; if (number_entries) { for (i = 0; i < stcb->asoc.streamoutcnt; i++) { if ((stcb->asoc.strmout[i].state == SCTP_STREAM_RESET_PENDING) && (stcb->asoc.strmout[i].chunks_on_queues == 0) && TAILQ_EMPTY(&stcb->asoc.strmout[i].outqueue)) { req_out->list_of_streams[at] = htons(i); at++; stcb->asoc.strmout[i].state = SCTP_STREAM_RESET_IN_FLIGHT; if (at >= number_entries) { break; } } } } else { for (i = 0; i < stcb->asoc.streamoutcnt; i++) { stcb->asoc.strmout[i].state = SCTP_STREAM_RESET_IN_FLIGHT; } } if (SCTP_SIZE32(len) > len) { /*- * Need to worry about the pad we may end up adding to the * end. This is easy since the struct is either aligned to 4 * bytes or 2 bytes off. */ req_out->list_of_streams[number_entries] = 0; } /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return (1); } static void sctp_add_stream_reset_in(struct sctp_tmit_chunk *chk, int number_entries, uint16_t *list, uint32_t seq) { uint16_t len, old_len, i; struct sctp_stream_reset_in_request *req_in; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_in = (struct sctp_stream_reset_in_request *)((caddr_t)ch + len); /* now how long will this param be? */ len = (uint16_t)(sizeof(struct sctp_stream_reset_in_request) + (sizeof(uint16_t) * number_entries)); req_in->ph.param_type = htons(SCTP_STR_RESET_IN_REQUEST); req_in->ph.param_length = htons(len); req_in->request_seq = htonl(seq); if (number_entries) { for (i = 0; i < number_entries; i++) { req_in->list_of_streams[i] = htons(list[i]); } } if (SCTP_SIZE32(len) > len) { /*- * Need to worry about the pad we may end up adding to the * end. This is easy since the struct is either aligned to 4 * bytes or 2 bytes off. */ req_in->list_of_streams[number_entries] = 0; } /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return; } static void sctp_add_stream_reset_tsn(struct sctp_tmit_chunk *chk, uint32_t seq) { uint16_t len, old_len; struct sctp_stream_reset_tsn_request *req_tsn; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ req_tsn = (struct sctp_stream_reset_tsn_request *)((caddr_t)ch + len); /* now how long will this param be? 
*/ len = sizeof(struct sctp_stream_reset_tsn_request); req_tsn->ph.param_type = htons(SCTP_STR_RESET_TSN_REQUEST); req_tsn->ph.param_length = htons(len); req_tsn->request_seq = htonl(seq); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } void sctp_add_stream_reset_result(struct sctp_tmit_chunk *chk, uint32_t resp_seq, uint32_t result) { uint16_t len, old_len; struct sctp_stream_reset_response *resp; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ resp = (struct sctp_stream_reset_response *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_response); resp->ph.param_type = htons(SCTP_STR_RESET_RESPONSE); resp->ph.param_length = htons(len); resp->response_seq = htonl(resp_seq); resp->result = ntohl(result); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->book_size_scale = 0; chk->send_size = SCTP_SIZE32(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; return; } void sctp_send_deferred_reset_response(struct sctp_tcb *stcb, struct sctp_stream_reset_list *ent, int response) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; asoc = &stcb->asoc; /* * Reset our last reset action to the new one IP -> response * (PERFORMED probably). This assures that if we fail to send, a * retran from the peer will get the new response. */ asoc->last_reset_action[0] = response; if (asoc->stream_reset_outstanding) { return; } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return; } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_LOCKED); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return; } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); atomic_add_int(&chk->whoTo->ref_count, 1); SCTP_BUF_LEN(chk->data) = chk->send_size; sctp_add_stream_reset_result(chk, ent->seq, response); /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; } void sctp_add_stream_reset_result_tsn(struct sctp_tmit_chunk *chk, uint32_t resp_seq, uint32_t result, uint32_t send_una, uint32_t recv_next) { uint16_t len, old_len; struct sctp_stream_reset_response_tsn *resp; struct sctp_chunkhdr *ch; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ resp = (struct sctp_stream_reset_response_tsn *)((caddr_t)ch + len); /* now how long will this param be? 
*/ len = sizeof(struct sctp_stream_reset_response_tsn); resp->ph.param_type = htons(SCTP_STR_RESET_RESPONSE); resp->ph.param_length = htons(len); resp->response_seq = htonl(resp_seq); resp->result = htonl(result); resp->senders_next_tsn = htonl(send_una); resp->receivers_next_tsn = htonl(recv_next); /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->book_size = len + old_len; chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = chk->send_size; return; } static void sctp_add_an_out_stream(struct sctp_tmit_chunk *chk, uint32_t seq, uint16_t adding) { uint16_t len, old_len; struct sctp_chunkhdr *ch; struct sctp_stream_reset_add_strm *addstr; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ addstr = (struct sctp_stream_reset_add_strm *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_add_strm); /* Fill it out. */ addstr->ph.param_type = htons(SCTP_STR_RESET_ADD_OUT_STREAMS); addstr->ph.param_length = htons(len); addstr->request_seq = htonl(seq); addstr->number_of_streams = htons(adding); addstr->reserved = 0; /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } static void sctp_add_an_in_stream(struct sctp_tmit_chunk *chk, uint32_t seq, uint16_t adding) { uint16_t len, old_len; struct sctp_chunkhdr *ch; struct sctp_stream_reset_add_strm *addstr; ch = mtod(chk->data, struct sctp_chunkhdr *); old_len = len = SCTP_SIZE32(ntohs(ch->chunk_length)); /* get to new offset for the param. */ addstr = (struct sctp_stream_reset_add_strm *)((caddr_t)ch + len); /* now how long will this param be? */ len = sizeof(struct sctp_stream_reset_add_strm); /* Fill it out. 
*/ addstr->ph.param_type = htons(SCTP_STR_RESET_ADD_IN_STREAMS); addstr->ph.param_length = htons(len); addstr->request_seq = htonl(seq); addstr->number_of_streams = htons(adding); addstr->reserved = 0; /* now fix the chunk length */ ch->chunk_length = htons(len + old_len); chk->send_size = len + old_len; chk->book_size = SCTP_SIZE32(chk->send_size); chk->book_size_scale = 0; SCTP_BUF_LEN(chk->data) = SCTP_SIZE32(chk->send_size); return; } int sctp_send_stream_reset_out_if_possible(struct sctp_tcb *stcb, int so_locked) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; uint32_t seq; asoc = &stcb->asoc; asoc->trigger_reset = 0; if (asoc->stream_reset_outstanding) { return (EALREADY); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, so_locked); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); atomic_add_int(&chk->whoTo->ref_count, 1); SCTP_BUF_LEN(chk->data) = chk->send_size; seq = stcb->asoc.str_reset_seq_out; if (sctp_add_stream_reset_out(stcb, chk, seq, (stcb->asoc.str_reset_seq_in - 1), (stcb->asoc.sending_seq - 1))) { seq++; asoc->stream_reset_outstanding++; } else { m_freem(chk->data); chk->data = NULL; sctp_free_a_chunk(stcb, chk, so_locked); return (ENOENT); } asoc->str_reset = chk; /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; if (stcb->asoc.send_sack) { sctp_send_sack(stcb, so_locked); } sctp_timer_start(SCTP_TIMER_TYPE_STRRESET, stcb->sctp_ep, stcb, chk->whoTo); return (0); } int sctp_send_str_reset_req(struct sctp_tcb *stcb, uint16_t number_entries, uint16_t *list, uint8_t send_in_req, uint8_t send_tsn_req, uint8_t add_stream, uint16_t adding_o, uint16_t adding_i, uint8_t peer_asked) { struct sctp_association *asoc; struct sctp_tmit_chunk *chk; struct sctp_chunkhdr *ch; int can_send_out_req = 0; uint32_t seq; asoc = &stcb->asoc; if (asoc->stream_reset_outstanding) { /*- * Already one pending, must get ACK back to clear the flag. 
*/ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EBUSY); return (EBUSY); } if ((send_in_req == 0) && (send_tsn_req == 0) && (add_stream == 0)) { /* nothing to do */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } if (send_tsn_req && send_in_req) { /* error, can't do that */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } else if (send_in_req) { can_send_out_req = 1; } if (number_entries > (MCLBYTES - SCTP_MIN_OVERHEAD - sizeof(struct sctp_chunkhdr) - sizeof(struct sctp_stream_reset_out_request)) / sizeof(uint16_t)) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } sctp_alloc_a_chunk(stcb, chk); if (chk == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } chk->copy_by_ref = 0; chk->rec.chunk_id.id = SCTP_STREAM_RESET; chk->rec.chunk_id.can_take_data = 0; chk->flags = 0; chk->asoc = &stcb->asoc; chk->book_size = sizeof(struct sctp_chunkhdr); chk->send_size = SCTP_SIZE32(chk->book_size); chk->book_size_scale = 0; chk->data = sctp_get_mbuf_for_msg(MCLBYTES, 0, M_NOWAIT, 1, MT_DATA); if (chk->data == NULL) { sctp_free_a_chunk(stcb, chk, SCTP_SO_LOCKED); SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ENOMEM); return (ENOMEM); } SCTP_BUF_RESV_UF(chk->data, SCTP_MIN_OVERHEAD); /* setup chunk parameters */ chk->sent = SCTP_DATAGRAM_UNSENT; chk->snd_count = 0; if (stcb->asoc.alternate) { chk->whoTo = stcb->asoc.alternate; } else { chk->whoTo = stcb->asoc.primary_destination; } atomic_add_int(&chk->whoTo->ref_count, 1); ch = mtod(chk->data, struct sctp_chunkhdr *); ch->chunk_type = SCTP_STREAM_RESET; ch->chunk_flags = 0; ch->chunk_length = htons(chk->book_size); SCTP_BUF_LEN(chk->data) = chk->send_size; seq = stcb->asoc.str_reset_seq_out; if (can_send_out_req) { int ret; ret = sctp_add_stream_reset_out(stcb, chk, seq, (stcb->asoc.str_reset_seq_in - 1), (stcb->asoc.sending_seq - 1)); if (ret) { seq++; asoc->stream_reset_outstanding++; } } if ((add_stream & 1) && ((stcb->asoc.strm_realoutsize - stcb->asoc.streamoutcnt) < adding_o)) { /* Need to allocate more */ struct sctp_stream_out *oldstream; struct sctp_stream_queue_pending *sp, *nsp; int i; #if defined(SCTP_DETAILED_STR_STATS) int j; #endif oldstream = stcb->asoc.strmout; /* get some more */ SCTP_MALLOC(stcb->asoc.strmout, struct sctp_stream_out *, (stcb->asoc.streamoutcnt + adding_o) * sizeof(struct sctp_stream_out), SCTP_M_STRMO); if (stcb->asoc.strmout == NULL) { uint8_t x; stcb->asoc.strmout = oldstream; /* Turn off the bit */ x = add_stream & 0xfe; add_stream = x; goto skip_stuff; } /* * Ok now we proceed with copying the old out stuff and * initializing the new stuff. */ SCTP_TCB_SEND_LOCK(stcb); stcb->asoc.ss_functions.sctp_ss_clear(stcb, &stcb->asoc, 0, 1); for (i = 0; i < stcb->asoc.streamoutcnt; i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = oldstream[i].chunks_on_queues; stcb->asoc.strmout[i].next_mid_ordered = oldstream[i].next_mid_ordered; stcb->asoc.strmout[i].next_mid_unordered = oldstream[i].next_mid_unordered; stcb->asoc.strmout[i].last_msg_incomplete = oldstream[i].last_msg_incomplete; stcb->asoc.strmout[i].sid = i; stcb->asoc.strmout[i].state = oldstream[i].state; /* FIX ME FIX ME */ /* * This should be a SS_COPY operation FIX ME STREAM * SCHEDULER EXPERT */ stcb->asoc.ss_functions.sctp_ss_init_stream(stcb, &stcb->asoc.strmout[i], &oldstream[i]); /* now anything on those queues? 
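 * Any queued sends still sitting on the old stream's outqueue
 * are moved below onto the newly allocated stream entry, so
 * nothing the application has already queued is lost by the
 * reallocation.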
*/ TAILQ_FOREACH_SAFE(sp, &oldstream[i].outqueue, next, nsp) { TAILQ_REMOVE(&oldstream[i].outqueue, sp, next); TAILQ_INSERT_TAIL(&stcb->asoc.strmout[i].outqueue, sp, next); } } /* now the new streams */ stcb->asoc.ss_functions.sctp_ss_init(stcb, &stcb->asoc, 1); for (i = stcb->asoc.streamoutcnt; i < (stcb->asoc.streamoutcnt + adding_o); i++) { TAILQ_INIT(&stcb->asoc.strmout[i].outqueue); stcb->asoc.strmout[i].chunks_on_queues = 0; #if defined(SCTP_DETAILED_STR_STATS) for (j = 0; j < SCTP_PR_SCTP_MAX + 1; j++) { stcb->asoc.strmout[i].abandoned_sent[j] = 0; stcb->asoc.strmout[i].abandoned_unsent[j] = 0; } #else stcb->asoc.strmout[i].abandoned_sent[0] = 0; stcb->asoc.strmout[i].abandoned_unsent[0] = 0; #endif stcb->asoc.strmout[i].next_mid_ordered = 0; stcb->asoc.strmout[i].next_mid_unordered = 0; stcb->asoc.strmout[i].sid = i; stcb->asoc.strmout[i].last_msg_incomplete = 0; stcb->asoc.ss_functions.sctp_ss_init_stream(stcb, &stcb->asoc.strmout[i], NULL); stcb->asoc.strmout[i].state = SCTP_STREAM_CLOSED; } stcb->asoc.strm_realoutsize = stcb->asoc.streamoutcnt + adding_o; SCTP_FREE(oldstream, SCTP_M_STRMO); SCTP_TCB_SEND_UNLOCK(stcb); } skip_stuff: if ((add_stream & 1) && (adding_o > 0)) { asoc->strm_pending_add_size = adding_o; asoc->peer_req_out = peer_asked; sctp_add_an_out_stream(chk, seq, adding_o); seq++; asoc->stream_reset_outstanding++; } if ((add_stream & 2) && (adding_i > 0)) { sctp_add_an_in_stream(chk, seq, adding_i); seq++; asoc->stream_reset_outstanding++; } if (send_in_req) { sctp_add_stream_reset_in(chk, number_entries, list, seq); seq++; asoc->stream_reset_outstanding++; } if (send_tsn_req) { sctp_add_stream_reset_tsn(chk, seq); asoc->stream_reset_outstanding++; } asoc->str_reset = chk; /* insert the chunk for sending */ TAILQ_INSERT_TAIL(&asoc->control_send_queue, chk, sctp_next); asoc->ctrl_queue_cnt++; if (stcb->asoc.send_sack) { sctp_send_sack(stcb, SCTP_SO_LOCKED); } sctp_timer_start(SCTP_TIMER_TYPE_STRRESET, stcb->sctp_ep, stcb, chk->whoTo); return (0); } void sctp_send_abort(struct mbuf *m, int iphlen, struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { /* Don't respond to an ABORT with an ABORT. */ if (sctp_is_there_an_abort_here(m, iphlen, &vtag)) { if (cause) sctp_m_freem(cause); return; } sctp_send_resp_msg(src, dst, sh, vtag, SCTP_ABORT_ASSOCIATION, cause, mflowtype, mflowid, fibnum, vrf_id, port); return; } void sctp_send_operr_to(struct sockaddr *src, struct sockaddr *dst, struct sctphdr *sh, uint32_t vtag, struct mbuf *cause, uint8_t mflowtype, uint32_t mflowid, uint16_t fibnum, uint32_t vrf_id, uint16_t port) { sctp_send_resp_msg(src, dst, sh, vtag, SCTP_OPERATION_ERROR, cause, mflowtype, mflowid, fibnum, vrf_id, port); return; } static struct mbuf * sctp_copy_resume(struct uio *uio, int max_send_len, int user_marks_eor, int *error, uint32_t *sndout, struct mbuf **new_tail) { struct mbuf *m; m = m_uiotombuf(uio, M_WAITOK, max_send_len, 0, (M_PKTHDR | (user_marks_eor ? 
M_EOR : 0))); if (m == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOBUFS); *error = ENOBUFS; } else { *sndout = m_length(m, NULL); *new_tail = m_last(m); } return (m); } static int sctp_copy_one(struct sctp_stream_queue_pending *sp, struct uio *uio, int resv_upfront) { sp->data = m_uiotombuf(uio, M_WAITOK, sp->length, resv_upfront, 0); if (sp->data == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOBUFS); return (ENOBUFS); } sp->tail_mbuf = m_last(sp->data); return (0); } static struct sctp_stream_queue_pending * sctp_copy_it_in(struct sctp_tcb *stcb, struct sctp_association *asoc, struct sctp_sndrcvinfo *srcv, struct uio *uio, struct sctp_nets *net, int max_send_len, int user_marks_eor, int *error) { /*- * This routine must be very careful in its work. Protocol * processing is up and running so care must be taken to spl...() * when you need to do something that may effect the stcb/asoc. The * sb is locked however. When data is copied the protocol processing * should be enabled since this is a slower operation... */ struct sctp_stream_queue_pending *sp = NULL; int resv_in_first; *error = 0; /* Now can we send this? */ if ((SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_RECEIVED) || (asoc->state & SCTP_STATE_SHUTDOWN_PENDING)) { /* got data while shutting down */ SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); *error = ECONNRESET; goto out_now; } sctp_alloc_a_strmoq(stcb, sp); if (sp == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOMEM); *error = ENOMEM; goto out_now; } sp->act_flags = 0; sp->sender_all_done = 0; sp->sinfo_flags = srcv->sinfo_flags; sp->timetolive = srcv->sinfo_timetolive; sp->ppid = srcv->sinfo_ppid; sp->context = srcv->sinfo_context; sp->fsn = 0; (void)SCTP_GETTIME_TIMEVAL(&sp->ts); sp->sid = srcv->sinfo_stream; sp->length = (uint32_t)min(uio->uio_resid, max_send_len); if ((sp->length == (uint32_t)uio->uio_resid) && ((user_marks_eor == 0) || (srcv->sinfo_flags & SCTP_EOF) || (user_marks_eor && (srcv->sinfo_flags & SCTP_EOR)))) { sp->msg_is_complete = 1; } else { sp->msg_is_complete = 0; } sp->sender_all_done = 0; sp->some_taken = 0; sp->put_last_out = 0; resv_in_first = SCTP_DATA_CHUNK_OVERHEAD(stcb); sp->data = sp->tail_mbuf = NULL; if (sp->length == 0) { goto skip_copy; } if (srcv->sinfo_keynumber_valid) { sp->auth_keyid = srcv->sinfo_keynumber; } else { sp->auth_keyid = stcb->asoc.authinfo.active_keyid; } if (sctp_auth_is_required_chunk(SCTP_DATA, stcb->asoc.peer_auth_chunks)) { sctp_auth_key_acquire(stcb, sp->auth_keyid); sp->holds_key_ref = 1; } *error = sctp_copy_one(sp, uio, resv_in_first); skip_copy: if (*error) { sctp_free_a_strmoq(stcb, sp, SCTP_SO_LOCKED); sp = NULL; } else { if (sp->sinfo_flags & SCTP_ADDR_OVER) { sp->net = net; atomic_add_int(&sp->net->ref_count, 1); } else { sp->net = NULL; } sctp_set_prsctp_policy(sp); } out_now: return (sp); } int sctp_sosend(struct socket *so, struct sockaddr *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags, struct thread *p ) { int error, use_sndinfo = 0; struct sctp_sndrcvinfo sndrcvninfo; struct sockaddr *addr_to_use; #if defined(INET) && defined(INET6) struct sockaddr_in sin; #endif if (control) { /* process cmsg snd/rcv info (maybe a assoc-id) */ if (sctp_find_cmsg(SCTP_SNDRCV, (void *)&sndrcvninfo, control, sizeof(sndrcvninfo))) { /* got one */ use_sndinfo = 1; } } addr_to_use = addr; 
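	/*
	 * On kernels built with both INET and INET6, an IPv4-mapped IPv6
	 * destination is converted below with in6_sin6_2_sin() so that
	 * sctp_lower_sosend() sees a native sockaddr_in; any other address
	 * is passed through unchanged via addr_to_use.
	 */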
#if defined(INET) && defined(INET6) if ((addr) && (addr->sa_family == AF_INET6)) { struct sockaddr_in6 *sin6; sin6 = (struct sockaddr_in6 *)addr; if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr)) { in6_sin6_2_sin(&sin, sin6); addr_to_use = (struct sockaddr *)&sin; } } #endif error = sctp_lower_sosend(so, addr_to_use, uio, top, control, flags, use_sndinfo ? &sndrcvninfo : NULL ,p ); return (error); } int sctp_lower_sosend(struct socket *so, struct sockaddr *addr, struct uio *uio, struct mbuf *i_pak, struct mbuf *control, int flags, struct sctp_sndrcvinfo *srcv , struct thread *p ) { unsigned int sndlen = 0, max_len; int error, len; struct mbuf *top = NULL; int queue_only = 0, queue_only_for_init = 0; int free_cnt_applied = 0; int un_sent; int now_filled = 0; unsigned int inqueue_bytes = 0; struct sctp_block_entry be; struct sctp_inpcb *inp; struct sctp_tcb *stcb = NULL; struct timeval now; struct sctp_nets *net; struct sctp_association *asoc; struct sctp_inpcb *t_inp; int user_marks_eor; int create_lock_applied = 0; int nagle_applies = 0; int some_on_control = 0; int got_all_of_the_send = 0; int hold_tcblock = 0; int non_blocking = 0; uint32_t local_add_more, local_soresv = 0; uint16_t port; uint16_t sinfo_flags; sctp_assoc_t sinfo_assoc_id; error = 0; net = NULL; stcb = NULL; asoc = NULL; t_inp = inp = (struct sctp_inpcb *)so->so_pcb; if (inp == NULL) { SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; if (i_pak) { SCTP_RELEASE_PKT(i_pak); } return (error); } if ((uio == NULL) && (i_pak == NULL)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } user_marks_eor = sctp_is_feature_on(inp, SCTP_PCB_FLAGS_EXPLICIT_EOR); atomic_add_int(&inp->total_sends, 1); if (uio) { if (uio->uio_resid < 0) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); return (EINVAL); } sndlen = (unsigned int)uio->uio_resid; } else { top = SCTP_HEADER_TO_CHAIN(i_pak); sndlen = SCTP_HEADER_LEN(i_pak); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "Send called addr:%p send length %d\n", (void *)addr, sndlen); if ((inp->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) && SCTP_IS_LISTENING(inp)) { /* The listener can NOT send */ SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, ENOTCONN); error = ENOTCONN; goto out_unlocked; } /** * Pre-screen address, if one is given the sin-len * must be set correctly! 
*/ if (addr) { union sctp_sockstore *raddr = (union sctp_sockstore *)addr; switch (raddr->sa.sa_family) { #ifdef INET case AF_INET: if (raddr->sin.sin_len != sizeof(struct sockaddr_in)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } port = raddr->sin.sin_port; break; #endif #ifdef INET6 case AF_INET6: if (raddr->sin6.sin6_len != sizeof(struct sockaddr_in6)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } port = raddr->sin6.sin6_port; break; #endif default: SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EAFNOSUPPORT); error = EAFNOSUPPORT; goto out_unlocked; } } else port = 0; if (srcv) { sinfo_flags = srcv->sinfo_flags; sinfo_assoc_id = srcv->sinfo_assoc_id; if (INVALID_SINFO_FLAG(sinfo_flags) || PR_SCTP_INVALID_POLICY(sinfo_flags)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (srcv->sinfo_flags) SCTP_STAT_INCR(sctps_sends_with_flags); } else { sinfo_flags = inp->def_send.sinfo_flags; sinfo_assoc_id = inp->def_send.sinfo_assoc_id; } if (sinfo_flags & SCTP_SENDALL) { /* its a sendall */ error = sctp_sendall(inp, uio, top, srcv); top = NULL; goto out_unlocked; } if ((sinfo_flags & SCTP_ADDR_OVER) && (addr == NULL)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } /* now we must find the assoc */ if ((inp->sctp_flags & SCTP_PCB_FLAGS_CONNECTED) || (inp->sctp_flags & SCTP_PCB_FLAGS_IN_TCPPOOL)) { SCTP_INP_RLOCK(inp); stcb = LIST_FIRST(&inp->sctp_asoc_list); if (stcb) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } SCTP_INP_RUNLOCK(inp); } else if (sinfo_assoc_id) { stcb = sctp_findassociation_ep_asocid(inp, sinfo_assoc_id, 1); if (stcb != NULL) { hold_tcblock = 1; } } else if (addr) { /*- * Since we did not use findep we must * increment it, and if we don't find a tcb * decrement it. */ SCTP_INP_WLOCK(inp); SCTP_INP_INCR_REF(inp); SCTP_INP_WUNLOCK(inp); stcb = sctp_findassociation_ep_addr(&t_inp, addr, &net, NULL, NULL); if (stcb == NULL) { SCTP_INP_WLOCK(inp); SCTP_INP_DECR_REF(inp); SCTP_INP_WUNLOCK(inp); } else { hold_tcblock = 1; } } if ((stcb == NULL) && (addr)) { /* Possible implicit send? */ SCTP_ASOC_CREATE_LOCK(inp); create_lock_applied = 1; if ((inp->sctp_flags & SCTP_PCB_FLAGS_SOCKET_GONE) || (inp->sctp_flags & SCTP_PCB_FLAGS_SOCKET_ALLGONE)) { /* Should I really unlock ? 
*/ SCTP_LTRACE_ERR_RET(NULL, NULL, NULL, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (((inp->sctp_flags & SCTP_PCB_FLAGS_BOUND_V6) == 0) && (addr->sa_family == AF_INET6)) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } SCTP_INP_WLOCK(inp); SCTP_INP_INCR_REF(inp); SCTP_INP_WUNLOCK(inp); /* With the lock applied look again */ stcb = sctp_findassociation_ep_addr(&t_inp, addr, &net, NULL, NULL); if ((stcb == NULL) && (control != NULL) && (port > 0)) { stcb = sctp_findassociation_cmsgs(&t_inp, port, control, &net, &error); } if (stcb == NULL) { SCTP_INP_WLOCK(inp); SCTP_INP_DECR_REF(inp); SCTP_INP_WUNLOCK(inp); } else { hold_tcblock = 1; } if (error) { goto out_unlocked; } if (t_inp != inp) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOTCONN); error = ENOTCONN; goto out_unlocked; } } if (stcb == NULL) { if (addr == NULL) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOENT); error = ENOENT; goto out_unlocked; } else { /* We must go ahead and start the INIT process */ uint32_t vrf_id; if ((sinfo_flags & SCTP_ABORT) || ((sinfo_flags & SCTP_EOF) && (sndlen == 0))) { /*- * User asks to abort a non-existant assoc, * or EOF a non-existant assoc with no data */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOENT); error = ENOENT; goto out_unlocked; } /* get an asoc/stcb struct */ vrf_id = inp->def_vrf_id; #ifdef INVARIANTS if (create_lock_applied == 0) { panic("Error, should hold create lock and I don't?"); } #endif stcb = sctp_aloc_assoc(inp, addr, &error, 0, vrf_id, inp->sctp_ep.pre_open_stream_count, inp->sctp_ep.port, p); if (stcb == NULL) { /* Error is setup for us in the call */ goto out_unlocked; } if (stcb->sctp_ep->sctp_flags & SCTP_PCB_FLAGS_TCPTYPE) { stcb->sctp_ep->sctp_flags |= SCTP_PCB_FLAGS_CONNECTED; /* * Set the connected flag so we can queue * data */ soisconnecting(so); } hold_tcblock = 1; if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); create_lock_applied = 0; } else { SCTP_PRINTF("Huh-3? create lock should have been on??\n"); } /* * Turn on queue only flag to prevent data from * being sent */ queue_only = 1; asoc = &stcb->asoc; SCTP_SET_STATE(stcb, SCTP_STATE_COOKIE_WAIT); (void)SCTP_GETTIME_TIMEVAL(&asoc->time_entered); /* initialize authentication params for the assoc */ sctp_initialize_auth_params(inp, stcb); if (control) { if (sctp_process_cmsgs_for_init(stcb, control, &error)) { sctp_free_assoc(inp, stcb, SCTP_PCBFREE_FORCE, SCTP_FROM_SCTP_OUTPUT + SCTP_LOC_5); hold_tcblock = 0; stcb = NULL; goto out_unlocked; } } /* out with the INIT */ queue_only_for_init = 1; /*- * we may want to dig in after this call and adjust the MTU * value. It defaulted to 1500 (constant) but the ro * structure may now have an update and thus we may need to * change it BEFORE we append the message. 
*/ } } else asoc = &stcb->asoc; if (srcv == NULL) srcv = (struct sctp_sndrcvinfo *)&asoc->def_send; if (srcv->sinfo_flags & SCTP_ADDR_OVER) { if (addr) net = sctp_findnet(stcb, addr); else net = NULL; if ((net == NULL) || ((port != 0) && (port != stcb->rport))) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } } else { if (stcb->asoc.alternate) { net = stcb->asoc.alternate; } else { net = stcb->asoc.primary_destination; } } atomic_add_int(&stcb->total_sends, 1); /* Keep the stcb from being freed under our feet */ atomic_add_int(&asoc->refcnt, 1); free_cnt_applied = 1; if (sctp_is_feature_on(inp, SCTP_PCB_FLAGS_NO_FRAGMENT)) { if (sndlen > asoc->smallest_mtu) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out_unlocked; } } if (SCTP_SO_IS_NBIO(so) || (flags & MSG_NBIO) ) { non_blocking = 1; } /* would we block? */ if (non_blocking) { uint32_t amount; if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); if (user_marks_eor == 0) { amount = sndlen; } else { amount = 1; } if ((SCTP_SB_LIMIT_SND(so) < (amount + inqueue_bytes + stcb->asoc.sb_send_resv)) || (stcb->asoc.chunks_on_out_queue >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EWOULDBLOCK); if (sndlen > SCTP_SB_LIMIT_SND(so)) error = EMSGSIZE; else error = EWOULDBLOCK; goto out_unlocked; } stcb->asoc.sb_send_resv += sndlen; SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } else { atomic_add_int(&stcb->asoc.sb_send_resv, sndlen); } local_soresv = sndlen; if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_unlocked; } if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); create_lock_applied = 0; } /* Is the stream no. valid? */ if (srcv->sinfo_stream >= asoc->streamoutcnt) { /* Invalid stream number */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if ((asoc->strmout[srcv->sinfo_stream].state != SCTP_STREAM_OPEN) && (asoc->strmout[srcv->sinfo_stream].state != SCTP_STREAM_OPENING)) { /* * Can't queue any data while stream reset is underway. */ if (asoc->strmout[srcv->sinfo_stream].state > SCTP_STREAM_OPEN) { error = EAGAIN; } else { error = EINVAL; } SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, error); goto out_unlocked; } if ((SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_ECHOED)) { queue_only = 1; } /* we are now done with all control */ if (control) { sctp_m_freem(control); control = NULL; } if ((SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_SENT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_RECEIVED) || (SCTP_GET_STATE(stcb) == SCTP_STATE_SHUTDOWN_ACK_SENT) || (asoc->state & SCTP_STATE_SHUTDOWN_PENDING)) { if (srcv->sinfo_flags & SCTP_ABORT) { ; } else { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; goto out_unlocked; } } /* Ok, we will attempt a msgsnd :> */ if (p) { p->td_ru.ru_msgsnd++; } /* Are we aborting? 
*/ if (srcv->sinfo_flags & SCTP_ABORT) { struct mbuf *mm; int tot_demand, tot_out = 0, max_out; SCTP_STAT_INCR(sctps_sends_with_abort); if ((SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_WAIT) || (SCTP_GET_STATE(stcb) == SCTP_STATE_COOKIE_ECHOED)) { /* It has to be up before we abort */ /* how big is the user initiated abort? */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } if (top) { struct mbuf *cntm = NULL; mm = sctp_get_mbuf_for_msg(sizeof(struct sctp_paramhdr), 0, M_WAITOK, 1, MT_DATA); if (sndlen != 0) { for (cntm = top; cntm; cntm = SCTP_BUF_NEXT(cntm)) { tot_out += SCTP_BUF_LEN(cntm); } } } else { /* Must fit in a MTU */ tot_out = sndlen; tot_demand = (tot_out + sizeof(struct sctp_paramhdr)); if (tot_demand > SCTP_DEFAULT_ADD_MORE) { /* To big */ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out; } mm = sctp_get_mbuf_for_msg(tot_demand, 0, M_WAITOK, 1, MT_DATA); } if (mm == NULL) { SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, ENOMEM); error = ENOMEM; goto out; } max_out = asoc->smallest_mtu - sizeof(struct sctp_paramhdr); max_out -= sizeof(struct sctp_abort_msg); if (tot_out > max_out) { tot_out = max_out; } if (mm) { struct sctp_paramhdr *ph; /* now move forward the data pointer */ ph = mtod(mm, struct sctp_paramhdr *); ph->param_type = htons(SCTP_CAUSE_USER_INITIATED_ABT); ph->param_length = htons((uint16_t)(sizeof(struct sctp_paramhdr) + tot_out)); ph++; SCTP_BUF_LEN(mm) = tot_out + sizeof(struct sctp_paramhdr); if (top == NULL) { error = uiomove((caddr_t)ph, (int)tot_out, uio); if (error) { /*- * Here if we can't get his data we * still abort we just don't get to * send the users note :-0 */ sctp_m_freem(mm); mm = NULL; } } else { if (sndlen != 0) { SCTP_BUF_NEXT(mm) = top; } } } if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); } atomic_add_int(&stcb->asoc.refcnt, -1); free_cnt_applied = 0; /* release this lock, otherwise we hang on ourselves */ sctp_abort_an_association(stcb->sctp_ep, stcb, mm, SCTP_SO_LOCKED); /* now relock the stcb so everything is sane */ hold_tcblock = 0; stcb = NULL; /* * In this case top is already chained to mm avoid double * free, since we free it below if top != NULL and driver * would free it after sending the packet out */ if (sndlen != 0) { top = NULL; } goto out_unlocked; } /* Calculate the maximum we can send */ inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) { if (non_blocking) { /* we already checked for non-blocking above. */ max_len = sndlen; } else { max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; } } else { max_len = 0; } if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } if (asoc->strmout == NULL) { /* huh? software error */ SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EFAULT); error = EFAULT; goto out_unlocked; } /* Unless E_EOR mode is on, we must make a send FIT in one call. */ if ((user_marks_eor == 0) && (sndlen > SCTP_SB_LIMIT_SND(stcb->sctp_socket))) { /* It will NEVER fit */ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EMSGSIZE); error = EMSGSIZE; goto out_unlocked; } if ((uio == NULL) && user_marks_eor) { /*- * We do not support eeor mode for * sending with mbuf chains (like sendfile). 
*/ SCTP_LTRACE_ERR_RET(NULL, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out_unlocked; } if (user_marks_eor) { local_add_more = min(SCTP_SB_LIMIT_SND(so), SCTP_BASE_SYSCTL(sctp_add_more_threshold)); } else { /*- * For non-eeor the whole message must fit in * the socket send buffer. */ local_add_more = sndlen; } len = 0; if (non_blocking) { goto skip_preblock; } if (((max_len <= local_add_more) && (SCTP_SB_LIMIT_SND(so) >= local_add_more)) || (max_len == 0) || ((stcb->asoc.chunks_on_out_queue + stcb->asoc.stream_queue_cnt) >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { /* No room right now ! */ SOCKBUF_LOCK(&so->so_snd); inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); while ((SCTP_SB_LIMIT_SND(so) < (inqueue_bytes + local_add_more)) || ((stcb->asoc.stream_queue_cnt + stcb->asoc.chunks_on_out_queue) >= SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue))) { SCTPDBG(SCTP_DEBUG_OUTPUT1, "pre_block limit:%u <(inq:%d + %d) || (%d+%d > %d)\n", (unsigned int)SCTP_SB_LIMIT_SND(so), inqueue_bytes, local_add_more, stcb->asoc.stream_queue_cnt, stcb->asoc.chunks_on_out_queue, SCTP_BASE_SYSCTL(sctp_max_chunks_on_queue)); if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_INTO_BLKA, asoc, sndlen); } be.error = 0; stcb->block_entry = &be; error = sbwait(&so->so_snd); stcb->block_entry = NULL; if (error || so->so_error || be.error) { if (error == 0) { if (so->so_error) error = so->so_error; if (be.error) { error = be.error; } } SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_OUTOF_BLK, asoc, stcb->asoc.total_output_queue_size); } if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); } if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) { max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; } else { max_len = 0; } SOCKBUF_UNLOCK(&so->so_snd); } skip_preblock: if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { goto out_unlocked; } /* * sndlen covers for mbuf case uio_resid covers for the non-mbuf * case NOTE: uio will be null when top/mbuf is passed */ if (sndlen == 0) { if (srcv->sinfo_flags & SCTP_EOF) { got_all_of_the_send = 1; goto dataless_eof; } else { SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } } if (top == NULL) { struct sctp_stream_queue_pending *sp; struct sctp_stream_out *strm; uint32_t sndout; SCTP_TCB_SEND_LOCK(stcb); if ((asoc->stream_locked) && (asoc->stream_locked_on != srcv->sinfo_stream)) { SCTP_TCB_SEND_UNLOCK(stcb); SCTP_LTRACE_ERR_RET(inp, stcb, net, SCTP_FROM_SCTP_OUTPUT, EINVAL); error = EINVAL; goto out; } SCTP_TCB_SEND_UNLOCK(stcb); strm = &stcb->asoc.strmout[srcv->sinfo_stream]; if (strm->last_msg_incomplete == 0) { do_a_copy_in: sp = sctp_copy_it_in(stcb, asoc, srcv, uio, net, max_len, user_marks_eor, &error); if (error) { goto out; } SCTP_TCB_SEND_LOCK(stcb); if (sp->msg_is_complete) { strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } else { /* * Just got locked to this guy in case of an * interrupt. 
*/ strm->last_msg_incomplete = 1; if (stcb->asoc.idata_supported == 0) { asoc->stream_locked = 1; asoc->stream_locked_on = srcv->sinfo_stream; } sp->sender_all_done = 0; } sctp_snd_sb_alloc(stcb, sp->length); atomic_add_int(&asoc->stream_queue_cnt, 1); if (srcv->sinfo_flags & SCTP_UNORDERED) { SCTP_STAT_INCR(sctps_sends_with_unord); } TAILQ_INSERT_TAIL(&strm->outqueue, sp, next); stcb->asoc.ss_functions.sctp_ss_add_to_stream(stcb, asoc, strm, sp, 1); SCTP_TCB_SEND_UNLOCK(stcb); } else { SCTP_TCB_SEND_LOCK(stcb); sp = TAILQ_LAST(&strm->outqueue, sctp_streamhead); SCTP_TCB_SEND_UNLOCK(stcb); if (sp == NULL) { /* ???? Huh ??? last msg is gone */ #ifdef INVARIANTS panic("Warning: Last msg marked incomplete, yet nothing left?"); #else SCTP_PRINTF("Warning: Last msg marked incomplete, yet nothing left?\n"); strm->last_msg_incomplete = 0; #endif goto do_a_copy_in; } } while (uio->uio_resid > 0) { /* How much room do we have? */ struct mbuf *new_tail, *mm; inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; else max_len = 0; if ((max_len > SCTP_BASE_SYSCTL(sctp_add_more_threshold)) || (max_len && (SCTP_SB_LIMIT_SND(so) < SCTP_BASE_SYSCTL(sctp_add_more_threshold))) || (uio->uio_resid && (uio->uio_resid <= (int)max_len))) { sndout = 0; new_tail = NULL; if (hold_tcblock) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } mm = sctp_copy_resume(uio, max_len, user_marks_eor, &error, &sndout, &new_tail); if ((mm == NULL) || error) { if (mm) { sctp_m_freem(mm); } goto out; } /* Update the mbuf and count */ SCTP_TCB_SEND_LOCK(stcb); if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { /* * we need to get out. Peer probably * aborted. */ sctp_m_freem(mm); if (stcb->asoc.state & SCTP_PCB_FLAGS_WAS_ABORTED) { SCTP_LTRACE_ERR_RET(NULL, stcb, NULL, SCTP_FROM_SCTP_OUTPUT, ECONNRESET); error = ECONNRESET; } SCTP_TCB_SEND_UNLOCK(stcb); goto out; } if (sp->tail_mbuf) { /* tack it to the end */ SCTP_BUF_NEXT(sp->tail_mbuf) = mm; sp->tail_mbuf = new_tail; } else { /* A stolen mbuf */ sp->data = mm; sp->tail_mbuf = new_tail; } sctp_snd_sb_alloc(stcb, sndout); atomic_add_int(&sp->length, sndout); len += sndout; if (srcv->sinfo_flags & SCTP_SACK_IMMEDIATELY) { sp->sinfo_flags |= SCTP_SACK_IMMEDIATELY; } /* Did we reach EOR? */ if ((uio->uio_resid == 0) && ((user_marks_eor == 0) || (srcv->sinfo_flags & SCTP_EOF) || (user_marks_eor && (srcv->sinfo_flags & SCTP_EOR)))) { sp->msg_is_complete = 1; } else { sp->msg_is_complete = 0; } SCTP_TCB_SEND_UNLOCK(stcb); } if (uio->uio_resid == 0) { /* got it all? */ continue; } /* PR-SCTP? 
*/ if ((asoc->prsctp_supported) && (asoc->sent_queue_cnt_removeable > 0)) { /* * This is ugly but we must assure locking * order */ if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } sctp_prune_prsctp(stcb, asoc, srcv, sndlen); inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); if (SCTP_SB_LIMIT_SND(so) > inqueue_bytes) max_len = SCTP_SB_LIMIT_SND(so) - inqueue_bytes; else max_len = 0; if (max_len > 0) { continue; } SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } /* wait for space now */ if (non_blocking) { /* Non-blocking io in place out */ goto skip_out_eof; } /* What about the INIT, send it maybe */ if (queue_only_for_init) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (SCTP_GET_STATE(stcb) == SCTP_STATE_OPEN) { /* a collision took us forward? */ queue_only = 0; } else { sctp_send_initiate(inp, stcb, SCTP_SO_LOCKED); SCTP_SET_STATE(stcb, SCTP_STATE_COOKIE_WAIT); queue_only = 1; } } if ((net->flight_size > net->cwnd) && (asoc->sctp_cmt_on_off == 0)) { SCTP_STAT_INCR(sctps_send_cwnd_avoid); queue_only = 1; } else if (asoc->ifp_had_enobuf) { SCTP_STAT_INCR(sctps_ifnomemqueued); if (net->flight_size > (2 * net->mtu)) { queue_only = 1; } asoc->ifp_had_enobuf = 0; } un_sent = stcb->asoc.total_output_queue_size - stcb->asoc.total_flight; if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (stcb->asoc.stream_queue_cnt < SCTP_MAX_DATA_BUNDLING) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { /*- * Ok, Nagle is set on and we have data outstanding. * Don't send anything and let SACKs drive out the * data unless we have a "full" segment to send. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { sctp_log_nagle_event(stcb, SCTP_NAGLE_APPLIED); } SCTP_STAT_INCR(sctps_naglequeued); nagle_applies = 1; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { if (sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) sctp_log_nagle_event(stcb, SCTP_NAGLE_SKIPPED); } SCTP_STAT_INCR(sctps_naglesent); nagle_applies = 0; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_misc_ints(SCTP_CWNDLOG_PRESEND, queue_only_for_init, queue_only, nagle_applies, un_sent); sctp_misc_ints(SCTP_CWNDLOG_PRESEND, stcb->asoc.total_output_queue_size, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_flight_count); } if (queue_only_for_init) queue_only_for_init = 0; if ((queue_only == 0) && (nagle_applies == 0)) { /*- * need to start chunk output * before blocking.. note that if * a lock is already applied, then * the input via the net is happening * and I don't need to start output :-D */ if (hold_tcblock == 0) { if (SCTP_TCB_TRYLOCK(stcb)) { hold_tcblock = 1; sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } } else { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } if (hold_tcblock == 1) { SCTP_TCB_UNLOCK(stcb); hold_tcblock = 0; } } SOCKBUF_LOCK(&so->so_snd); /*- * This is a bit strange, but I think it will * work. The total_output_queue_size is locked and * protected by the TCB_LOCK, which we just released. * There is a race that can occur between releasing it * above, and me getting the socket lock, where sacks * come in but we have not put the SB_WAIT on the * so_snd buffer to get the wakeup. After the LOCK * is applied the sack_processing will also need to * LOCK the so->so_snd to do the actual sowwakeup(). 
So * once we have the socket buffer lock if we recheck the * size we KNOW we will get to sleep safely with the * wakeup flag in place. */ inqueue_bytes = stcb->asoc.total_output_queue_size - (stcb->asoc.chunks_on_out_queue * SCTP_DATA_CHUNK_OVERHEAD(stcb)); if (SCTP_SB_LIMIT_SND(so) <= (inqueue_bytes + min(SCTP_BASE_SYSCTL(sctp_add_more_threshold), SCTP_SB_LIMIT_SND(so)))) { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_INTO_BLK, asoc, (size_t)uio->uio_resid); } be.error = 0; stcb->block_entry = &be; error = sbwait(&so->so_snd); stcb->block_entry = NULL; if (error || so->so_error || be.error) { if (error == 0) { if (so->so_error) error = so->so_error; if (be.error) { error = be.error; } } SOCKBUF_UNLOCK(&so->so_snd); goto out_unlocked; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_log_block(SCTP_BLOCK_LOG_OUTOF_BLK, asoc, stcb->asoc.total_output_queue_size); } } SOCKBUF_UNLOCK(&so->so_snd); if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { goto out_unlocked; } } SCTP_TCB_SEND_LOCK(stcb); if (stcb->asoc.state & SCTP_STATE_ABOUT_TO_BE_FREED) { SCTP_TCB_SEND_UNLOCK(stcb); goto out_unlocked; } if (sp) { if (sp->msg_is_complete == 0) { strm->last_msg_incomplete = 1; if (stcb->asoc.idata_supported == 0) { asoc->stream_locked = 1; asoc->stream_locked_on = srcv->sinfo_stream; } } else { sp->sender_all_done = 1; strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } } else { SCTP_PRINTF("Huh no sp TSNH?\n"); strm->last_msg_incomplete = 0; asoc->stream_locked = 0; } SCTP_TCB_SEND_UNLOCK(stcb); if (uio->uio_resid == 0) { got_all_of_the_send = 1; } } else { /* We send in a 0, since we do NOT have any locks */ error = sctp_msg_append(stcb, net, top, srcv, 0); top = NULL; if (srcv->sinfo_flags & SCTP_EOF) { /* * This should only happen for Panda for the mbuf * send case, which does NOT yet support EEOR mode. * Thus, we can just set this flag to do the proper * EOF handling. */ got_all_of_the_send = 1; } } if (error) { goto out; } dataless_eof: /* EOF thing ? */ if ((srcv->sinfo_flags & SCTP_EOF) && (got_all_of_the_send == 1)) { SCTP_STAT_INCR(sctps_sends_with_eof); error = 0; if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && sctp_is_there_unsent_data(stcb, SCTP_SO_LOCKED) == 0) { if ((*asoc->ss_functions.sctp_ss_is_user_msgs_incomplete) (stcb, asoc)) { goto abort_anyway; } /* there is nothing queued to send, so I'm done... */ if ((SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { struct sctp_nets *netp; /* only send SHUTDOWN the first time through */ if (SCTP_GET_STATE(stcb) == SCTP_STATE_OPEN) { SCTP_STAT_DECR_GAUGE32(sctps_currestab); } SCTP_SET_STATE(stcb, SCTP_STATE_SHUTDOWN_SENT); sctp_stop_timers_for_shutdown(stcb); if (stcb->asoc.alternate) { netp = stcb->asoc.alternate; } else { netp = stcb->asoc.primary_destination; } sctp_send_shutdown(stcb, netp); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWN, stcb->sctp_ep, stcb, netp); sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); } } else { /*- * we still got (or just got) data to send, so set * SHUTDOWN_PENDING */ /*- * XXX sockets draft says that SCTP_EOF should be * sent with no data. 
currently, we will allow user * data to be sent first and move to * SHUTDOWN-PENDING */ if ((SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_SENT) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_RECEIVED) && (SCTP_GET_STATE(stcb) != SCTP_STATE_SHUTDOWN_ACK_SENT)) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if ((*asoc->ss_functions.sctp_ss_is_user_msgs_incomplete) (stcb, asoc)) { SCTP_ADD_SUBSTATE(stcb, SCTP_STATE_PARTIAL_MSG_LEFT); } SCTP_ADD_SUBSTATE(stcb, SCTP_STATE_SHUTDOWN_PENDING); if (TAILQ_EMPTY(&asoc->send_queue) && TAILQ_EMPTY(&asoc->sent_queue) && (asoc->state & SCTP_STATE_PARTIAL_MSG_LEFT)) { struct mbuf *op_err; char msg[SCTP_DIAG_INFO_LEN]; abort_anyway: if (free_cnt_applied) { atomic_add_int(&stcb->asoc.refcnt, -1); free_cnt_applied = 0; } snprintf(msg, sizeof(msg), "%s:%d at %s", __FILE__, __LINE__, __func__); op_err = sctp_generate_cause(SCTP_BASE_SYSCTL(sctp_diag_info_code), msg); sctp_abort_an_association(stcb->sctp_ep, stcb, op_err, SCTP_SO_LOCKED); /* * now relock the stcb so everything * is sane */ hold_tcblock = 0; stcb = NULL; goto out; } sctp_timer_start(SCTP_TIMER_TYPE_SHUTDOWNGUARD, stcb->sctp_ep, stcb, asoc->primary_destination); sctp_feature_off(inp, SCTP_PCB_FLAGS_NODELAY); } } } skip_out_eof: if (!TAILQ_EMPTY(&stcb->asoc.control_send_queue)) { some_on_control = 1; } if (queue_only_for_init) { if (hold_tcblock == 0) { SCTP_TCB_LOCK(stcb); hold_tcblock = 1; } if (SCTP_GET_STATE(stcb) == SCTP_STATE_OPEN) { /* a collision took us forward? */ queue_only = 0; } else { sctp_send_initiate(inp, stcb, SCTP_SO_LOCKED); SCTP_SET_STATE(stcb, SCTP_STATE_COOKIE_WAIT); queue_only = 1; } } if ((net->flight_size > net->cwnd) && (stcb->asoc.sctp_cmt_on_off == 0)) { SCTP_STAT_INCR(sctps_send_cwnd_avoid); queue_only = 1; } else if (asoc->ifp_had_enobuf) { SCTP_STAT_INCR(sctps_ifnomemqueued); if (net->flight_size > (2 * net->mtu)) { queue_only = 1; } asoc->ifp_had_enobuf = 0; } un_sent = stcb->asoc.total_output_queue_size - stcb->asoc.total_flight; if ((sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) && (stcb->asoc.total_flight > 0) && (stcb->asoc.stream_queue_cnt < SCTP_MAX_DATA_BUNDLING) && (un_sent < (int)(stcb->asoc.smallest_mtu - SCTP_MIN_OVERHEAD))) { /*- * Ok, Nagle is set on and we have data outstanding. * Don't send anything and let SACKs drive out the * data unless wen have a "full" segment to send. */ if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { sctp_log_nagle_event(stcb, SCTP_NAGLE_APPLIED); } SCTP_STAT_INCR(sctps_naglequeued); nagle_applies = 1; } else { if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_NAGLE_LOGGING_ENABLE) { if (sctp_is_feature_off(inp, SCTP_PCB_FLAGS_NODELAY)) sctp_log_nagle_event(stcb, SCTP_NAGLE_SKIPPED); } SCTP_STAT_INCR(sctps_naglesent); nagle_applies = 0; } if (SCTP_BASE_SYSCTL(sctp_logging_level) & SCTP_BLK_LOGGING_ENABLE) { sctp_misc_ints(SCTP_CWNDLOG_PRESEND, queue_only_for_init, queue_only, nagle_applies, un_sent); sctp_misc_ints(SCTP_CWNDLOG_PRESEND, stcb->asoc.total_output_queue_size, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_flight_count); } if ((queue_only == 0) && (nagle_applies == 0) && (stcb->asoc.peers_rwnd && un_sent)) { /* we can attempt to send too. 
*/ if (hold_tcblock == 0) { /* * If there is activity recv'ing sacks no need to * send */ if (SCTP_TCB_TRYLOCK(stcb)) { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); hold_tcblock = 1; } } else { sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } } else if ((queue_only == 0) && (stcb->asoc.peers_rwnd == 0) && (stcb->asoc.total_flight == 0)) { /* We get to have a probe outstanding */ if (hold_tcblock == 0) { hold_tcblock = 1; SCTP_TCB_LOCK(stcb); } sctp_chunk_output(inp, stcb, SCTP_OUTPUT_FROM_USR_SEND, SCTP_SO_LOCKED); } else if (some_on_control) { int num_out, reason, frag_point; /* Here we do control only */ if (hold_tcblock == 0) { hold_tcblock = 1; SCTP_TCB_LOCK(stcb); } frag_point = sctp_get_frag_point(stcb, &stcb->asoc); (void)sctp_med_chunk_output(inp, stcb, &stcb->asoc, &num_out, &reason, 1, 1, &now, &now_filled, frag_point, SCTP_SO_LOCKED); } SCTPDBG(SCTP_DEBUG_OUTPUT1, "USR Send complete qo:%d prw:%d unsent:%d tf:%d cooq:%d toqs:%d err:%d\n", queue_only, stcb->asoc.peers_rwnd, un_sent, stcb->asoc.total_flight, stcb->asoc.chunks_on_out_queue, stcb->asoc.total_output_queue_size, error); out: out_unlocked: if (local_soresv && stcb) { atomic_subtract_int(&stcb->asoc.sb_send_resv, sndlen); } if (create_lock_applied) { SCTP_ASOC_CREATE_UNLOCK(inp); } if ((stcb) && hold_tcblock) { SCTP_TCB_UNLOCK(stcb); } if (stcb && free_cnt_applied) { atomic_add_int(&stcb->asoc.refcnt, -1); } #ifdef INVARIANTS if (stcb) { if (mtx_owned(&stcb->tcb_mtx)) { panic("Leaving with tcb mtx owned?"); } if (mtx_owned(&stcb->tcb_send_mtx)) { panic("Leaving with tcb send mtx owned?"); } } #endif if (top) { sctp_m_freem(top); } if (control) { sctp_m_freem(control); } return (error); } /* * generate an AUTHentication chunk, if required */ struct mbuf * sctp_add_auth_chunk(struct mbuf *m, struct mbuf **m_end, struct sctp_auth_chunk **auth_ret, uint32_t *offset, struct sctp_tcb *stcb, uint8_t chunk) { struct mbuf *m_auth; struct sctp_auth_chunk *auth; int chunk_len; struct mbuf *cn; if ((m_end == NULL) || (auth_ret == NULL) || (offset == NULL) || (stcb == NULL)) return (m); if (stcb->asoc.auth_supported == 0) { return (m); } /* does the requested chunk require auth? 
*/ if (!sctp_auth_is_required_chunk(chunk, stcb->asoc.peer_auth_chunks)) { return (m); } m_auth = sctp_get_mbuf_for_msg(sizeof(*auth), 0, M_NOWAIT, 1, MT_HEADER); if (m_auth == NULL) { /* no mbuf's */ return (m); } /* reserve some space if this will be the first mbuf */ if (m == NULL) SCTP_BUF_RESV_UF(m_auth, SCTP_MIN_OVERHEAD); /* fill in the AUTH chunk details */ auth = mtod(m_auth, struct sctp_auth_chunk *); memset(auth, 0, sizeof(*auth)); auth->ch.chunk_type = SCTP_AUTHENTICATION; auth->ch.chunk_flags = 0; chunk_len = sizeof(*auth) + sctp_get_hmac_digest_len(stcb->asoc.peer_hmac_id); auth->ch.chunk_length = htons(chunk_len); auth->hmac_id = htons(stcb->asoc.peer_hmac_id); /* key id and hmac digest will be computed and filled in upon send */ /* save the offset where the auth was inserted into the chain */ *offset = 0; for (cn = m; cn; cn = SCTP_BUF_NEXT(cn)) { *offset += SCTP_BUF_LEN(cn); } /* update length and return pointer to the auth chunk */ SCTP_BUF_LEN(m_auth) = chunk_len; m = sctp_copy_mbufchain(m_auth, m, m_end, 1, chunk_len, 0); if (auth_ret != NULL) *auth_ret = auth; return (m); } #ifdef INET6 int sctp_v6src_match_nexthop(struct sockaddr_in6 *src6, sctp_route_t *ro) { struct nd_prefix *pfx = NULL; struct nd_pfxrouter *pfxrtr = NULL; struct sockaddr_in6 gw6; if (ro == NULL || ro->ro_rt == NULL || src6->sin6_family != AF_INET6) return (0); /* get prefix entry of address */ ND6_RLOCK(); LIST_FOREACH(pfx, &MODULE_GLOBAL(nd_prefix), ndpr_entry) { if (pfx->ndpr_stateflags & NDPRF_DETACHED) continue; if (IN6_ARE_MASKED_ADDR_EQUAL(&pfx->ndpr_prefix.sin6_addr, &src6->sin6_addr, &pfx->ndpr_mask)) break; } /* no prefix entry in the prefix list */ if (pfx == NULL) { ND6_RUNLOCK(); SCTPDBG(SCTP_DEBUG_OUTPUT2, "No prefix entry for "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)src6); return (0); } SCTPDBG(SCTP_DEBUG_OUTPUT2, "v6src_match_nexthop(), Prefix entry is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)src6); /* search installed gateway from prefix entry */ LIST_FOREACH(pfxrtr, &pfx->ndpr_advrtrs, pfr_entry) { memset(&gw6, 0, sizeof(struct sockaddr_in6)); gw6.sin6_family = AF_INET6; gw6.sin6_len = sizeof(struct sockaddr_in6); memcpy(&gw6.sin6_addr, &pfxrtr->router->rtaddr, sizeof(struct in6_addr)); SCTPDBG(SCTP_DEBUG_OUTPUT2, "prefix router is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, (struct sockaddr *)&gw6); SCTPDBG(SCTP_DEBUG_OUTPUT2, "installed router is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, ro->ro_rt->rt_gateway); if (sctp_cmpaddr((struct sockaddr *)&gw6, ro->ro_rt->rt_gateway)) { ND6_RUNLOCK(); SCTPDBG(SCTP_DEBUG_OUTPUT2, "pfxrouter is installed\n"); return (1); } } ND6_RUNLOCK(); SCTPDBG(SCTP_DEBUG_OUTPUT2, "pfxrouter is not installed\n"); return (0); } #endif int sctp_v4src_match_nexthop(struct sctp_ifa *sifa, sctp_route_t *ro) { #ifdef INET struct sockaddr_in *sin, *mask; struct ifaddr *ifa; struct in_addr srcnetaddr, gwnetaddr; if (ro == NULL || ro->ro_rt == NULL || sifa->address.sa.sa_family != AF_INET) { return (0); } ifa = (struct ifaddr *)sifa->ifa; mask = (struct sockaddr_in *)(ifa->ifa_netmask); sin = &sifa->address.sin; srcnetaddr.s_addr = (sin->sin_addr.s_addr & mask->sin_addr.s_addr); SCTPDBG(SCTP_DEBUG_OUTPUT1, "match_nexthop4: src address is "); SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, &sifa->address.sa); SCTPDBG(SCTP_DEBUG_OUTPUT1, "network address is %x\n", srcnetaddr.s_addr); sin = (struct sockaddr_in *)ro->ro_rt->rt_gateway; gwnetaddr.s_addr = (sin->sin_addr.s_addr & mask->sin_addr.s_addr); SCTPDBG(SCTP_DEBUG_OUTPUT1, "match_nexthop4: nexthop is "); 
SCTPDBG_ADDR(SCTP_DEBUG_OUTPUT2, ro->ro_rt->rt_gateway); SCTPDBG(SCTP_DEBUG_OUTPUT1, "network address is %x\n", gwnetaddr.s_addr); if (srcnetaddr.s_addr == gwnetaddr.s_addr) { return (1); } #endif return (0); } Index: projects/openssl111/sys/powerpc/conf/GENERIC =================================================================== --- projects/openssl111/sys/powerpc/conf/GENERIC (revision 339239) +++ projects/openssl111/sys/powerpc/conf/GENERIC (revision 339240) @@ -1,228 +1,229 @@ # # GENERIC -- Generic kernel configuration file for FreeBSD/powerpc # # For more information on this file, please read the handbook section on # Kernel Configuration Files: # # https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html # # The handbook is also available locally in /usr/share/doc/handbook # if you've installed the doc distribution, otherwise always see the # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the # latest information. # # An exhaustive list of options and more detailed explanations of the # device lines is also present in the ../../conf/NOTES and NOTES files. # If you are in doubt as to the purpose or necessity of a line, check first # in NOTES. # # $FreeBSD$ cpu AIM ident GENERIC machine powerpc powerpc makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols makeoptions WITH_CTF=1 # Platform support options POWERMAC #NewWorld Apple PowerMacs options PSIM #GDB PSIM ppc simulator options MAMBO #IBM Mambo Full System Simulator options PSERIES #PAPR-compliant systems options FDT options SCHED_ULE #ULE scheduler options PREEMPTION #Enable kernel thread preemption options VIMAGE # Subsystem virtualization, e.g. VNET options INET #InterNETworking options INET6 #IPv6 communications protocols options IPSEC # IP (v4/v6) security options IPSEC_SUPPORT # Allow kldload of ipsec and tcpmd5 options TCP_HHOOK # hhook(9) framework for TCP +options TCP_RFC7413 # TCP Fast Open options SCTP #Stream Control Transmission Protocol options FFS #Berkeley Fast Filesystem options SOFTUPDATES #Enable FFS soft updates support options UFS_ACL #Support for access control lists options UFS_DIRHASH #Improve performance on big directories options UFS_GJOURNAL #Enable gjournal-based UFS journaling options QUOTA #Enable disk quotas for UFS options MD_ROOT #MD is a potential root device options NFSCL #Network Filesystem Client options NFSD #Network Filesystem Server options NFSLOCKD #Network Lock Manager options NFS_ROOT #NFS usable as root device options MSDOSFS #MSDOS Filesystem options CD9660 #ISO 9660 Filesystem options PROCFS #Process filesystem (requires PSEUDOFS) options PSEUDOFS #Pseudo-filesystem framework options GEOM_PART_APM #Apple Partition Maps. options GEOM_PART_GPT #GUID Partition Tables. 
options GEOM_LABEL #Provides labelization options COMPAT_FREEBSD4 #Keep this for a while options COMPAT_FREEBSD5 #Compatible with FreeBSD5 options COMPAT_FREEBSD6 #Compatible with FreeBSD6 options COMPAT_FREEBSD7 #Compatible with FreeBSD7 options COMPAT_FREEBSD9 # Compatible with FreeBSD9 options COMPAT_FREEBSD10 # Compatible with FreeBSD10 options COMPAT_FREEBSD11 # Compatible with FreeBSD11 options SCSI_DELAY=5000 #Delay (in ms) before probing SCSI options KTRACE #ktrace(1) syscall trace support options STACK #stack(9) support options SYSVSHM #SYSV-style shared memory options SYSVMSG #SYSV-style message queues options SYSVSEM #SYSV-style semaphores options _KPOSIX_PRIORITY_SCHEDULING #Posix P1003_1B real-time extensions options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4) options AUDIT # Security event auditing options CAPABILITY_MODE # Capsicum capability mode options CAPABILITIES # Capsicum capabilities options MAC # TrustedBSD MAC Framework options KDTRACE_HOOKS # Kernel DTrace hooks options DDB_CTF # Kernel ELF linker loads CTF data options INCLUDE_CONFIG_FILE # Include this file in kernel options RACCT # Resource accounting framework options RACCT_DEFAULT_TO_DISABLED # Set kern.racct.enable=0 by default options RCTL # Resource limits # Debugging support. Always need this: options KDB # Enable kernel debugger support. options KDB_TRACE # Print a stack trace for a panic. # For full debugger support use (turn off in stable branch): options DDB #Support DDB #options DEADLKRES #Enable the deadlock resolver options INVARIANTS #Enable calls of extra sanity checking options INVARIANT_SUPPORT #Extra sanity checks of internal structures, required by INVARIANTS options WITNESS #Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN #Don't run witness on spinlocks for speed options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones # Kernel dump features. options EKCD # Support for encrypted kernel dumps options GZIO # gzip-compressed kernel and user dumps options ZSTDIO # zstd-compressed kernel and user dumps options NETDUMP # netdump(4) client support # Make an SMP-capable kernel by default options SMP # Symmetric MultiProcessor Kernel # CPU frequency control device cpufreq # Standard busses device pci options PCI_HP # PCI-Express native HotPlug device agp # ATA controllers device ahci # AHCI-compatible SATA controllers device ata # Legacy ATA/SATA controllers device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA device siis # SiliconImage SiI3124/SiI3132/SiI3531 SATA # SCSI Controllers device ahc # AHA2940 and onboard AIC7xxx devices options AHC_ALLOW_MEMIO # Attempt to use memory mapped I/O device isp # Qlogic family device ispfw # Firmware module for Qlogic host adapters device mpt # LSI-Logic MPT-Fusion device mps # LSI-Logic MPT-Fusion 2 device sym # NCR/Symbios/LSI Logic 53C8XX/53C1010/53C1510D # ATA/SCSI peripherals device scbus # SCSI bus (required for ATA/SCSI) device da # Direct Access (disks) device sa # Sequential Access (tape etc) device cd # CD device pass # Passthrough device (direct ATA/SCSI access) # vt is the default console driver, resembling an SCO console device vt # Generic console driver (pulls in OF FB) device kbdmux # Serial (COM) ports device scc device uart device uart_z8530 # FireWire support device firewire # FireWire bus code device sbp # SCSI over FireWire (Requires scbus and da) device fwe # Ethernet over FireWire (non-standard!) # PCI Ethernet NICs that use the common MII bus controller code. 
device miibus # MII bus support device bge # Broadcom BCM570xx Gigabit Ethernet device bm # Apple BMAC Ethernet device gem # Sun GEM/Sun ERI/Apple GMAC device dc # DEC/Intel 21143 and various workalikes device fxp # Intel EtherExpress PRO/100B (82557, 82558) # Pseudo devices. device crypto # core crypto support device loop # Network loopback device random # Entropy device device ether # Ethernet support device vlan # 802.1Q VLAN support device tun # Packet tunnel. device md # Memory "disks" device ofwd # Open Firmware disks device gif # IPv6 and IPv4 tunneling device firmware # firmware assist module # The `bpf' device enables the Berkeley Packet Filter. # Be aware of the administrative consequences of enabling this! # Note that 'bpf' is required for DHCP. device bpf #Berkeley packet filter # USB support options USB_DEBUG # enable debug msgs device uhci # UHCI PCI->USB interface device ohci # OHCI PCI->USB interface device ehci # EHCI PCI->USB interface device usb # USB Bus (required) device uhid # "Human Interface Devices" device ukbd # Keyboard options KBD_INSTALL_CDEV # install a CDEV entry in /dev device ulpt # Printer device umass # Disks/Mass storage - Requires scbus and da0 device ums # Mouse device atp # Apple USB touchpad device urio # Diamond Rio 500 MP3 player # USB Ethernet device aue # ADMtek USB Ethernet device axe # ASIX Electronics USB Ethernet device cdce # Generic USB over Ethernet device cue # CATC USB Ethernet device kue # Kawasaki LSI USB Ethernet # Wireless NIC cards options IEEE80211_SUPPORT_MESH options AH_SUPPORT_AR5416 # Misc device iicbus # I2C bus code device kiic # Keywest I2C device ad7417 # PowerMac7,2 temperature sensor device adt746x # PowerBook5,8 temperature sensor device ds1631 # PowerMac11,2 temperature sensor device ds1775 # PowerMac7,2 temperature sensor device fcu # Apple Fan Control Unit device max6690 # PowerMac7,2 temperature sensor device powermac_nvram # Open Firmware configuration NVRAM device smu # Apple System Management Unit device adm1030 # Apple G4 MDD fan controller device atibl # ATI-based backlight driver for PowerBooks/iBooks device nvbl # nVidia-based backlight driver for PowerBooks/iBooks # ADB support device adb device cuda device pmu # Sound support device sound # Generic sound driver (required) device snd_ai2s # Apple I2S audio device snd_davbus # Apple DAVBUS audio device snd_uaudio # USB Audio Index: projects/openssl111/sys/powerpc/conf/GENERIC64 =================================================================== --- projects/openssl111/sys/powerpc/conf/GENERIC64 (revision 339239) +++ projects/openssl111/sys/powerpc/conf/GENERIC64 (revision 339240) @@ -1,236 +1,237 @@ # # GENERIC -- Generic kernel configuration file for FreeBSD/powerpc # # For more information on this file, please read the handbook section on # Kernel Configuration Files: # # https://www.FreeBSD.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-config.html # # The handbook is also available locally in /usr/share/doc/handbook # if you've installed the doc distribution, otherwise always see the # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the # latest information. # # An exhaustive list of options and more detailed explanations of the # device lines is also present in the ../../conf/NOTES and NOTES files. # If you are in doubt as to the purpose or necessity of a line, check first # in NOTES. 
# # $FreeBSD$ cpu AIM ident GENERIC machine powerpc powerpc64 makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols makeoptions WITH_CTF=1 # Platform support options POWERMAC #NewWorld Apple PowerMacs options PS3 #Sony Playstation 3 options MAMBO #IBM Mambo Full System Simulator options PSERIES #PAPR-compliant systems (e.g. IBM p) options POWERNV #Non-virtualized OpenPOWER systems options FDT #Flattened Device Tree options SCHED_ULE #ULE scheduler options PREEMPTION #Enable kernel thread preemption options VIMAGE # Subsystem virtualization, e.g. VNET options INET #InterNETworking options INET6 #IPv6 communications protocols options TCP_HHOOK # hhook(9) framework for TCP +options TCP_RFC7413 # TCP Fast Open options SCTP #Stream Control Transmission Protocol options FFS #Berkeley Fast Filesystem options SOFTUPDATES #Enable FFS soft updates support options UFS_ACL #Support for access control lists options UFS_DIRHASH #Improve performance on big directories options UFS_GJOURNAL #Enable gjournal-based UFS journaling options QUOTA #Enable disk quotas for UFS options MD_ROOT #MD is a potential root device options NFSCL #Network Filesystem Client options NFSD #Network Filesystem Server options NFSLOCKD #Network Lock Manager options NFS_ROOT #NFS usable as root device options MSDOSFS #MSDOS Filesystem options CD9660 #ISO 9660 Filesystem options PROCFS #Process filesystem (requires PSEUDOFS) options PSEUDOFS #Pseudo-filesystem framework options GEOM_PART_APM #Apple Partition Maps. options GEOM_PART_GPT #GUID Partition Tables. options GEOM_LABEL #Provides labelization options COMPAT_FREEBSD32 #Compatible with FreeBSD/powerpc binaries options COMPAT_FREEBSD5 #Compatible with FreeBSD5 options COMPAT_FREEBSD6 #Compatible with FreeBSD6 options COMPAT_FREEBSD7 #Compatible with FreeBSD7 options COMPAT_FREEBSD9 # Compatible with FreeBSD9 options COMPAT_FREEBSD10 # Compatible with FreeBSD10 options COMPAT_FREEBSD11 # Compatible with FreeBSD11 options SCSI_DELAY=5000 #Delay (in ms) before probing SCSI options KTRACE #ktrace(1) syscall trace support options STACK #stack(9) support options SYSVSHM #SYSV-style shared memory options SYSVMSG #SYSV-style message queues options SYSVSEM #SYSV-style semaphores options _KPOSIX_PRIORITY_SCHEDULING #Posix P1003_1B real-time extensions options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed. options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4) options AUDIT # Security event auditing options CAPABILITY_MODE # Capsicum capability mode options CAPABILITIES # Capsicum capabilities options MAC # TrustedBSD MAC Framework options KDTRACE_HOOKS # Kernel DTrace hooks options DDB_CTF # Kernel ELF linker loads CTF data options INCLUDE_CONFIG_FILE # Include this file in kernel # Debugging support. Always need this: options KDB # Enable kernel debugger support. options KDB_TRACE # Print a stack trace for a panic. # For full debugger support use (turn off in stable branch): options DDB #Support DDB #options DEADLKRES #Enable the deadlock resolver options INVARIANTS #Enable calls of extra sanity checking options INVARIANT_SUPPORT #Extra sanity checks of internal structures, required by INVARIANTS options WITNESS #Enable checks to detect deadlocks and cycles options WITNESS_SKIPSPIN #Don't run witness on spinlocks for speed options MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones # Kernel dump features. 
options EKCD # Support for encrypted kernel dumps options GZIO # gzip-compressed kernel and user dumps options ZSTDIO # zstd-compressed kernel and user dumps options NETDUMP # netdump(4) client support # Make an SMP-capable kernel by default options SMP # Symmetric MultiProcessor Kernel # CPU frequency control device cpufreq # Standard busses device pci options PCI_HP # PCI-Express native HotPlug device agp # ATA controllers device ahci # AHCI-compatible SATA controllers device ata # Legacy ATA/SATA controllers device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA device siis # SiliconImage SiI3124/SiI3132/SiI3531 SATA # NVM Express (NVMe) support device nvme # base NVMe driver options NVME_USE_NVD=0 # prefer the cam(4) based nda(4) driver device nvd # expose NVMe namespaces as disks, depends on nvme # SCSI Controllers device ahc # AHA2940 and onboard AIC7xxx devices options AHC_ALLOW_MEMIO # Attempt to use memory mapped I/O device isp # Qlogic family device ispfw # Firmware module for Qlogic host adapters device mpt # LSI-Logic MPT-Fusion device mps # LSI-Logic MPT-Fusion 2 device sym # NCR/Symbios/LSI Logic 53C8XX/53C1010/53C1510D # ATA/SCSI peripherals device scbus # SCSI bus (required for ATA/SCSI) device da # Direct Access (disks) device sa # Sequential Access (tape etc) device cd # CD device pass # Passthrough device (direct ATA/SCSI access) # vt is the default console driver, resembling an SCO console device vt # Core console driver device kbdmux # Serial (COM) ports device scc device uart device uart_z8530 # Ethernet hardware device em # Intel PRO/1000 Gigabit Ethernet Family device ix # Intel PRO/10GbE PCIE PF Ethernet Family device ixv # Intel PRO/10GbE PCIE VF Ethernet Family device glc # Sony Playstation 3 Ethernet device llan # IBM pSeries Virtual Ethernet device cxgbe # Chelsio 10/25G NIC # PCI Ethernet NICs that use the common MII bus controller code. device miibus # MII bus support device bge # Broadcom BCM570xx Gigabit Ethernet device gem # Sun GEM/Sun ERI/Apple GMAC device dc # DEC/Intel 21143 and various workalikes device fxp # Intel EtherExpress PRO/100B (82557, 82558) device re # RealTek 8139C+/8169/8169S/8110S device rl # RealTek 8129/8139 # Pseudo devices. device loop # Network loopback device random # Entropy device device ether # Ethernet support device vlan # 802.1Q VLAN support device tun # Packet tunnel. device md # Memory "disks" device ofwd # Open Firmware disks device gif # IPv6 and IPv4 tunneling device firmware # firmware assist module # The `bpf' device enables the Berkeley Packet Filter. # Be aware of the administrative consequences of enabling this! # Note that 'bpf' is required for DHCP. 
device bpf #Berkeley packet filter # USB support options USB_DEBUG # enable debug msgs device uhci # UHCI PCI->USB interface device ohci # OHCI PCI->USB interface device ehci # EHCI PCI->USB interface device xhci # XHCI PCI->USB interface device usb # USB Bus (required) device uhid # "Human Interface Devices" device ukbd # Keyboard options KBD_INSTALL_CDEV # install a CDEV entry in /dev device ulpt # Printer device umass # Disks/Mass storage - Requires scbus and da0 device ums # Mouse device urio # Diamond Rio 500 MP3 player # USB Ethernet device aue # ADMtek USB Ethernet device axe # ASIX Electronics USB Ethernet device cdce # Generic USB over Ethernet device cue # CATC USB Ethernet device kue # Kawasaki LSI USB Ethernet # Wireless NIC cards options IEEE80211_SUPPORT_MESH options AH_SUPPORT_AR5416 # FireWire support device firewire # FireWire bus code device sbp # SCSI over FireWire (Requires scbus and da) device fwe # Ethernet over FireWire (non-standard!) # Misc device iicbus # I2C bus code device iic device kiic # Keywest I2C device ad7417 # PowerMac7,2 temperature sensor device ds1631 # PowerMac11,2 temperature sensor device ds1775 # PowerMac7,2 temperature sensor device fcu # Apple Fan Control Unit device max6690 # PowerMac7,2 temperature sensor device powermac_nvram # Open Firmware configuration NVRAM device smu # Apple System Management Unit device atibl # ATI-based backlight driver for PowerBooks/iBooks device nvbl # nVidia-based backlight driver for PowerBooks/iBooks # ADB support device adb device pmu # Sound support device sound # Generic sound driver (required) device snd_ai2s # Apple I2S audio device snd_uaudio # USB Audio Index: projects/openssl111/sys/powerpc/powernv/opal_pci.c =================================================================== --- projects/openssl111/sys/powerpc/powernv/opal_pci.c (revision 339239) +++ projects/openssl111/sys/powerpc/powernv/opal_pci.c (revision 339240) @@ -1,682 +1,663 @@ /*- * Copyright (c) 2015-2016 Nathan Whitehorn * Copyright (c) 2017-2018 Semihalf * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include __FBSDID("$FreeBSD$"); #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pcib_if.h" #include "pic_if.h" #include "iommu_if.h" #include "opal.h" #define OPAL_PCI_TCE_MAX_ENTRIES (1024*1024UL) #define OPAL_PCI_TCE_DEFAULT_SEG_SIZE (16*1024*1024UL) #define OPAL_PCI_TCE_R (1UL << 0) #define OPAL_PCI_TCE_W (1UL << 1) #define PHB3_TCE_KILL_INVAL_ALL (1UL << 63) /* * Device interface. */ static int opalpci_probe(device_t); static int opalpci_attach(device_t); /* * pcib interface. */ static uint32_t opalpci_read_config(device_t, u_int, u_int, u_int, u_int, int); static void opalpci_write_config(device_t, u_int, u_int, u_int, u_int, u_int32_t, int); static int opalpci_alloc_msi(device_t dev, device_t child, int count, int maxcount, int *irqs); static int opalpci_release_msi(device_t dev, device_t child, int count, int *irqs); static int opalpci_alloc_msix(device_t dev, device_t child, int *irq); static int opalpci_release_msix(device_t dev, device_t child, int irq); static int opalpci_map_msi(device_t dev, device_t child, int irq, uint64_t *addr, uint32_t *data); static int opalpci_route_interrupt(device_t bus, device_t dev, int pin); /* * MSI PIC interface. */ static void opalpic_pic_enable(device_t dev, u_int irq, u_int vector); static void opalpic_pic_eoi(device_t dev, u_int irq); -static void opalpic_pic_mask(device_t dev, u_int irq); -static void opalpic_pic_unmask(device_t dev, u_int irq); /* * Commands */ #define OPAL_M32_WINDOW_TYPE 1 #define OPAL_M64_WINDOW_TYPE 2 #define OPAL_IO_WINDOW_TYPE 3 #define OPAL_RESET_PHB_COMPLETE 1 #define OPAL_RESET_PCI_IODA_TABLE 6 #define OPAL_DISABLE_M64 0 #define OPAL_ENABLE_M64_SPLIT 1 #define OPAL_ENABLE_M64_NON_SPLIT 2 #define OPAL_EEH_ACTION_CLEAR_FREEZE_MMIO 1 #define OPAL_EEH_ACTION_CLEAR_FREEZE_DMA 2 #define OPAL_EEH_ACTION_CLEAR_FREEZE_ALL 3 /* * Constants */ #define OPAL_PCI_DEFAULT_PE 1 /* * Driver methods. 
*/ static device_method_t opalpci_methods[] = { /* Device interface */ DEVMETHOD(device_probe, opalpci_probe), DEVMETHOD(device_attach, opalpci_attach), /* pcib interface */ DEVMETHOD(pcib_read_config, opalpci_read_config), DEVMETHOD(pcib_write_config, opalpci_write_config), DEVMETHOD(pcib_alloc_msi, opalpci_alloc_msi), DEVMETHOD(pcib_release_msi, opalpci_release_msi), DEVMETHOD(pcib_alloc_msix, opalpci_alloc_msix), DEVMETHOD(pcib_release_msix, opalpci_release_msix), DEVMETHOD(pcib_map_msi, opalpci_map_msi), DEVMETHOD(pcib_route_interrupt, opalpci_route_interrupt), /* PIC interface for MSIs */ DEVMETHOD(pic_enable, opalpic_pic_enable), DEVMETHOD(pic_eoi, opalpic_pic_eoi), - DEVMETHOD(pic_mask, opalpic_pic_mask), - DEVMETHOD(pic_unmask, opalpic_pic_unmask), DEVMETHOD_END }; struct opalpci_softc { struct ofw_pci_softc ofw_sc; uint64_t phb_id; vmem_t *msi_vmem; int msi_base; /* Base XIVE number */ int base_msi_irq; /* Base IRQ assigned by FreeBSD to this PIC */ uint64_t *tce; /* TCE table for 1:1 mapping */ struct resource *r_reg; }; static devclass_t opalpci_devclass; DEFINE_CLASS_1(pcib, opalpci_driver, opalpci_methods, sizeof(struct opalpci_softc), ofw_pci_driver); EARLY_DRIVER_MODULE(opalpci, ofwbus, opalpci_driver, opalpci_devclass, 0, 0, BUS_PASS_BUS); static int opalpci_probe(device_t dev) { const char *type; if (opal_check() != 0) return (ENXIO); type = ofw_bus_get_type(dev); if (type == NULL || (strcmp(type, "pci") != 0 && strcmp(type, "pciex") != 0)) return (ENXIO); if (!OF_hasprop(ofw_bus_get_node(dev), "ibm,opal-phbid")) return (ENXIO); device_set_desc(dev, "OPAL Host-PCI bridge"); return (BUS_PROBE_GENERIC); } static void pci_phb3_tce_invalidate_entire(struct opalpci_softc *sc) { mb(); bus_write_8(sc->r_reg, 0x210, PHB3_TCE_KILL_INVAL_ALL); mb(); } /* Simple function to round to a power of 2 */ static uint64_t round_pow2(uint64_t val) { return (1 << (flsl(val + (val - 1)) - 1)); } /* * Starting with skiboot 5.10 PCIe nodes have a new property, * "ibm,supported-tce-sizes", to denote the TCE sizes available. This allows us * to avoid hard-coding the maximum TCE size allowed, and instead provide a sane * default (however, the "sane" default, which works for all targets, is 64k, * limiting us to 64GB if we have 1M entries. 
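 * As a worked example of that limit, using the constants defined earlier in
 * this file: with 64 KB segments a full table of OPAL_PCI_TCE_MAX_ENTRIES
 * (1M) entries maps 64 KB * 1M = 64 GB, while 16 MB segments
 * (OPAL_PCI_TCE_DEFAULT_SEG_SIZE) map 16 MB * 1M = 16 TB.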
*/ static uint64_t max_tce_size(device_t dev) { phandle_t node; cell_t sizes[64]; /* Property is a list of bit-widths, up to 64-bits */ int count; node = ofw_bus_get_node(dev); count = OF_getencprop(node, "ibm,supported-tce-sizes", sizes, sizeof(sizes)); if (count < (int) sizeof(cell_t)) return OPAL_PCI_TCE_DEFAULT_SEG_SIZE; count /= sizeof(cell_t); return (1ULL << sizes[count - 1]); } static int opalpci_attach(device_t dev) { struct opalpci_softc *sc; cell_t id[2], m64ranges[2], m64window[6], npe; phandle_t node; int i, err; uint64_t maxmem; uint64_t entries; uint64_t tce_size; uint64_t tce_tbl_size; int m64bar; int rid; sc = device_get_softc(dev); node = ofw_bus_get_node(dev); switch (OF_getproplen(node, "ibm,opal-phbid")) { case 8: OF_getencprop(node, "ibm,opal-phbid", id, 8); sc->phb_id = ((uint64_t)id[0] << 32) | id[1]; break; case 4: OF_getencprop(node, "ibm,opal-phbid", id, 4); sc->phb_id = id[0]; break; default: device_printf(dev, "PHB ID property had wrong length (%zd)\n", OF_getproplen(node, "ibm,opal-phbid")); return (ENXIO); } if (bootverbose) device_printf(dev, "OPAL ID %#lx\n", sc->phb_id); rid = 0; sc->r_reg = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &rid, RF_ACTIVE | RF_SHAREABLE); if (sc->r_reg == NULL) { device_printf(dev, "Failed to allocate PHB[%jd] registers\n", (uintmax_t)sc->phb_id); return (ENXIO); } #if 0 /* * Reset PCI IODA table */ err = opal_call(OPAL_PCI_RESET, sc->phb_id, OPAL_RESET_PCI_IODA_TABLE, 1); if (err != 0) { device_printf(dev, "IODA table reset failed: %d\n", err); return (ENXIO); } err = opal_call(OPAL_PCI_RESET, sc->phb_id, OPAL_RESET_PHB_COMPLETE, 1); if (err < 0) { device_printf(dev, "PHB reset failed: %d\n", err); return (ENXIO); } if (err > 0) { while ((err = opal_call(OPAL_PCI_POLL, sc->phb_id)) > 0) { DELAY(1000*(err + 1)); /* Returns expected delay in ms */ } } if (err < 0) { device_printf(dev, "WARNING: PHB IODA reset poll failed: %d\n", err); } err = opal_call(OPAL_PCI_RESET, sc->phb_id, OPAL_RESET_PHB_COMPLETE, 0); if (err < 0) { device_printf(dev, "PHB reset failed: %d\n", err); return (ENXIO); } if (err > 0) { while ((err = opal_call(OPAL_PCI_POLL, sc->phb_id)) > 0) { DELAY(1000*(err + 1)); /* Returns expected delay in ms */ } } #endif /* * Map all devices on the bus to partitionable endpoint one until * such time as we start wanting to do things like bhyve. */ err = opal_call(OPAL_PCI_SET_PE, sc->phb_id, OPAL_PCI_DEFAULT_PE, 0, OPAL_PCI_BUS_ANY, OPAL_IGNORE_RID_DEVICE_NUMBER, OPAL_IGNORE_RID_FUNC_NUMBER, OPAL_MAP_PE); if (err != 0) { device_printf(dev, "PE mapping failed: %d\n", err); return (ENXIO); } /* * Turn on MMIO, mapped to PE 1 */ if (OF_getencprop(node, "ibm,opal-num-pes", &npe, 4) != 4) npe = 1; for (i = 0; i < npe; i++) { err = opal_call(OPAL_PCI_MAP_PE_MMIO_WINDOW, sc->phb_id, OPAL_PCI_DEFAULT_PE, OPAL_M32_WINDOW_TYPE, 0, i); if (err != 0) device_printf(dev, "MMIO %d map failed: %d\n", i, err); } if (OF_getencprop(node, "ibm,opal-available-m64-ranges", m64ranges, sizeof(m64ranges)) == sizeof(m64ranges)) m64bar = m64ranges[0]; else m64bar = 0; /* XXX: multiple M64 windows? 
*/ if (OF_getencprop(node, "ibm,opal-m64-window", m64window, sizeof(m64window)) == sizeof(m64window)) { opal_call(OPAL_PCI_PHB_MMIO_ENABLE, sc->phb_id, OPAL_M64_WINDOW_TYPE, m64bar, 0); opal_call(OPAL_PCI_SET_PHB_MEM_WINDOW, sc->phb_id, OPAL_M64_WINDOW_TYPE, m64bar /* index */, ((uint64_t)m64window[2] << 32) | m64window[3], 0, ((uint64_t)m64window[4] << 32) | m64window[5]); opal_call(OPAL_PCI_MAP_PE_MMIO_WINDOW, sc->phb_id, OPAL_PCI_DEFAULT_PE, OPAL_M64_WINDOW_TYPE, m64bar /* index */, 0); opal_call(OPAL_PCI_PHB_MMIO_ENABLE, sc->phb_id, OPAL_M64_WINDOW_TYPE, m64bar, OPAL_ENABLE_M64_NON_SPLIT); } /* * Enable IOMMU for PE1 - map everything 1:1 using * segments of max_tce_size size */ tce_size = max_tce_size(dev); maxmem = roundup2(powerpc_ptob(Maxmem), tce_size); entries = round_pow2(maxmem / tce_size); tce_tbl_size = max(entries * sizeof(uint64_t), 4096); if (entries > OPAL_PCI_TCE_MAX_ENTRIES) panic("POWERNV supports only %jdGB of memory space\n", (uintmax_t)((OPAL_PCI_TCE_MAX_ENTRIES * tce_size) >> 30)); if (bootverbose) device_printf(dev, "Mapping 0-%#jx for DMA\n", (uintmax_t)maxmem); sc->tce = contigmalloc(tce_tbl_size, M_DEVBUF, M_NOWAIT | M_ZERO, 0, BUS_SPACE_MAXADDR, tce_size, 0); if (sc->tce == NULL) panic("Failed to allocate TCE memory for PHB %jd\n", (uintmax_t)sc->phb_id); for (i = 0; i < entries; i++) sc->tce[i] = (i * tce_size) | OPAL_PCI_TCE_R | OPAL_PCI_TCE_W; /* Map TCE for every PE. It seems necessary for Power8 */ for (i = 0; i < npe; i++) { err = opal_call(OPAL_PCI_MAP_PE_DMA_WINDOW, sc->phb_id, i, (i << 1), 1, pmap_kextract((uint64_t)&sc->tce[0]), tce_tbl_size, tce_size); if (err != 0) { device_printf(dev, "DMA IOMMU mapping failed: %d\n", err); return (ENXIO); } err = opal_call(OPAL_PCI_MAP_PE_DMA_WINDOW_REAL, sc->phb_id, i, (i << 1) + 1, (1UL << 59), maxmem); if (err != 0) { device_printf(dev, "DMA 64b bypass mapping failed: %d\n", err); return (ENXIO); } } /* * Invalidate all previous TCE entries. * * TODO: add support for other PHBs than PHB3 */ pci_phb3_tce_invalidate_entire(sc); /* * Get MSI properties */ sc->msi_vmem = NULL; if (OF_getproplen(node, "ibm,opal-msi-ranges") > 0) { cell_t msi_ranges[2]; OF_getencprop(node, "ibm,opal-msi-ranges", msi_ranges, sizeof(msi_ranges)); sc->msi_base = msi_ranges[0]; sc->msi_vmem = vmem_create("OPAL MSI", msi_ranges[0], msi_ranges[1], 1, 16, M_BESTFIT | M_WAITOK); sc->base_msi_irq = powerpc_register_pic(dev, OF_xref_from_node(node), msi_ranges[0] + msi_ranges[1], 0, FALSE); if (bootverbose) device_printf(dev, "Supports %d MSIs starting at %d\n", msi_ranges[1], msi_ranges[0]); } /* * General OFW PCI attach */ err = ofw_pci_init(dev); if (err != 0) return (err); /* * Unfreeze non-config-space PCI operations. Let this fail silently * if e.g. there is no current freeze. 
*/ opal_call(OPAL_PCI_EEH_FREEZE_CLEAR, sc->phb_id, OPAL_PCI_DEFAULT_PE, OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); /* * OPAL stores 64-bit BARs in a special property rather than "ranges" */ if (OF_getencprop(node, "ibm,opal-m64-window", m64window, sizeof(m64window)) == sizeof(m64window)) { struct ofw_pci_range *rp; sc->ofw_sc.sc_nrange++; sc->ofw_sc.sc_range = realloc(sc->ofw_sc.sc_range, sc->ofw_sc.sc_nrange * sizeof(sc->ofw_sc.sc_range[0]), M_DEVBUF, M_WAITOK); rp = &sc->ofw_sc.sc_range[sc->ofw_sc.sc_nrange-1]; rp->pci_hi = OFW_PCI_PHYS_HI_SPACE_MEM64 | OFW_PCI_PHYS_HI_PREFETCHABLE; rp->pci = ((uint64_t)m64window[0] << 32) | m64window[1]; rp->host = ((uint64_t)m64window[2] << 32) | m64window[3]; rp->size = ((uint64_t)m64window[4] << 32) | m64window[5]; rman_manage_region(&sc->ofw_sc.sc_mem_rman, rp->pci, rp->pci + rp->size - 1); } return (ofw_pci_attach(dev)); } static uint32_t opalpci_read_config(device_t dev, u_int bus, u_int slot, u_int func, u_int reg, int width) { struct opalpci_softc *sc; uint64_t config_addr; uint8_t byte; uint16_t half; uint32_t word; int error; sc = device_get_softc(dev); config_addr = (bus << 8) | ((slot & 0x1f) << 3) | (func & 0x7); switch (width) { case 1: error = opal_call(OPAL_PCI_CONFIG_READ_BYTE, sc->phb_id, config_addr, reg, vtophys(&byte)); word = byte; break; case 2: error = opal_call(OPAL_PCI_CONFIG_READ_HALF_WORD, sc->phb_id, config_addr, reg, vtophys(&half)); word = half; break; case 4: error = opal_call(OPAL_PCI_CONFIG_READ_WORD, sc->phb_id, config_addr, reg, vtophys(&word)); break; default: error = OPAL_SUCCESS; word = 0xffffffff; } /* * Poking config state for non-existant devices can make * the host bridge hang up. Clear any errors. * * XXX: Make this conditional on the existence of a freeze */ opal_call(OPAL_PCI_EEH_FREEZE_CLEAR, sc->phb_id, OPAL_PCI_DEFAULT_PE, OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); if (error != OPAL_SUCCESS) word = 0xffffffff; return (word); } static void opalpci_write_config(device_t dev, u_int bus, u_int slot, u_int func, u_int reg, uint32_t val, int width) { struct opalpci_softc *sc; uint64_t config_addr; int error = OPAL_SUCCESS; sc = device_get_softc(dev); config_addr = (bus << 8) | ((slot & 0x1f) << 3) | (func & 0x7); switch (width) { case 1: error = opal_call(OPAL_PCI_CONFIG_WRITE_BYTE, sc->phb_id, config_addr, reg, val); break; case 2: error = opal_call(OPAL_PCI_CONFIG_WRITE_HALF_WORD, sc->phb_id, config_addr, reg, val); break; case 4: error = opal_call(OPAL_PCI_CONFIG_WRITE_WORD, sc->phb_id, config_addr, reg, val); break; } if (error != OPAL_SUCCESS) { /* * Poking config state for non-existant devices can make * the host bridge hang up. Clear any errors. 
*/ opal_call(OPAL_PCI_EEH_FREEZE_CLEAR, sc->phb_id, OPAL_PCI_DEFAULT_PE, OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); } } static int opalpci_route_interrupt(device_t bus, device_t dev, int pin) { return (pin); } static int opalpci_alloc_msi(device_t dev, device_t child, int count, int maxcount, int *irqs) { struct opalpci_softc *sc; vmem_addr_t start; phandle_t xref; int err, i; sc = device_get_softc(dev); if (sc->msi_vmem == NULL) return (ENODEV); err = vmem_xalloc(sc->msi_vmem, count, powerof2(count), 0, 0, VMEM_ADDR_MIN, VMEM_ADDR_MAX, M_BESTFIT | M_WAITOK, &start); if (err) return (err); xref = OF_xref_from_node(ofw_bus_get_node(dev)); for (i = 0; i < count; i++) irqs[i] = MAP_IRQ(xref, start + i); return (0); } static int opalpci_release_msi(device_t dev, device_t child, int count, int *irqs) { struct opalpci_softc *sc; sc = device_get_softc(dev); if (sc->msi_vmem == NULL) return (ENODEV); vmem_xfree(sc->msi_vmem, irqs[0] - sc->base_msi_irq, count); return (0); } static int opalpci_alloc_msix(device_t dev, device_t child, int *irq) { return (opalpci_alloc_msi(dev, child, 1, 1, irq)); } static int opalpci_release_msix(device_t dev, device_t child, int irq) { return (opalpci_release_msi(dev, child, 1, &irq)); } static int opalpci_map_msi(device_t dev, device_t child, int irq, uint64_t *addr, uint32_t *data) { struct opalpci_softc *sc; struct pci_devinfo *dinfo; int err, xive; sc = device_get_softc(dev); if (sc->msi_vmem == NULL) return (ENODEV); xive = irq - sc->base_msi_irq - sc->msi_base; opal_call(OPAL_PCI_SET_XIVE_PE, sc->phb_id, OPAL_PCI_DEFAULT_PE, xive); dinfo = device_get_ivars(child); if (dinfo->cfg.msi.msi_alloc > 0 && (dinfo->cfg.msi.msi_ctrl & PCIM_MSICTRL_64BIT) == 0) { uint32_t msi32; err = opal_call(OPAL_GET_MSI_32, sc->phb_id, OPAL_PCI_DEFAULT_PE, xive, 1, vtophys(&msi32), vtophys(data)); *addr = be32toh(msi32); } else { err = opal_call(OPAL_GET_MSI_64, sc->phb_id, OPAL_PCI_DEFAULT_PE, xive, 1, vtophys(addr), vtophys(data)); *addr = be64toh(*addr); } *data = be32toh(*data); if (bootverbose && err != 0) device_printf(child, "OPAL MSI mapping error: %d\n", err); return ((err == 0) ? 0 : ENXIO); } static void opalpic_pic_enable(device_t dev, u_int irq, u_int vector) { + struct opalpci_softc *sc = device_get_softc(dev); + PIC_ENABLE(root_pic, irq, vector); + opal_call(OPAL_PCI_MSI_EOI, sc->phb_id, irq); } static void opalpic_pic_eoi(device_t dev, u_int irq) { struct opalpci_softc *sc; sc = device_get_softc(dev); opal_call(OPAL_PCI_MSI_EOI, sc->phb_id, irq); PIC_EOI(root_pic, irq); } - -static void opalpic_pic_mask(device_t dev, u_int irq) -{ - PIC_MASK(root_pic, irq); -} - -static void opalpic_pic_unmask(device_t dev, u_int irq) -{ - struct opalpci_softc *sc; - - sc = device_get_softc(dev); - - PIC_UNMASK(root_pic, irq); - - opal_call(OPAL_PCI_MSI_EOI, sc->phb_id, irq); -} - - Index: projects/openssl111/sys/powerpc/pseries/xics.c =================================================================== --- projects/openssl111/sys/powerpc/pseries/xics.c (revision 339239) +++ projects/openssl111/sys/powerpc/pseries/xics.c (revision 339240) @@ -1,577 +1,569 @@ /*- * SPDX-License-Identifier: BSD-2-Clause-FreeBSD * * Copyright 2011 Nathan Whitehorn * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); #include "opt_platform.h" #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #ifdef POWERNV #include #endif #include "phyp-hvcall.h" #include "pic_if.h" #define XICP_PRIORITY 5 /* Random non-zero number */ #define XICP_IPI 2 #define MAX_XICP_IRQS (1<<24) /* 24-bit XIRR field */ #define XIVE_XICS_MODE_EMU 0 #define XIVE_XICS_MODE_EXP 1 static int xicp_probe(device_t); static int xicp_attach(device_t); static int xics_probe(device_t); static int xics_attach(device_t); static void xicp_bind(device_t dev, u_int irq, cpuset_t cpumask); static void xicp_dispatch(device_t, struct trapframe *); static void xicp_enable(device_t, u_int, u_int); static void xicp_eoi(device_t, u_int); static void xicp_ipi(device_t, u_int); static void xicp_mask(device_t, u_int); static void xicp_unmask(device_t, u_int); #ifdef POWERNV void xicp_smp_cpu_startup(void); #endif static device_method_t xicp_methods[] = { /* Device interface */ DEVMETHOD(device_probe, xicp_probe), DEVMETHOD(device_attach, xicp_attach), /* PIC interface */ DEVMETHOD(pic_bind, xicp_bind), DEVMETHOD(pic_dispatch, xicp_dispatch), DEVMETHOD(pic_enable, xicp_enable), DEVMETHOD(pic_eoi, xicp_eoi), DEVMETHOD(pic_ipi, xicp_ipi), DEVMETHOD(pic_mask, xicp_mask), DEVMETHOD(pic_unmask, xicp_unmask), DEVMETHOD_END }; static device_method_t xics_methods[] = { /* Device interface */ DEVMETHOD(device_probe, xics_probe), DEVMETHOD(device_attach, xics_attach), DEVMETHOD_END }; struct xicp_softc { struct mtx sc_mtx; struct resource *mem[MAXCPU]; int cpu_range[2]; int ibm_int_on; int ibm_int_off; int ibm_get_xive; int ibm_set_xive; /* XXX: inefficient -- hash table? tree? */ struct { int irq; int vector; int cpu; } intvecs[256]; int nintvecs; bool xics_emu; }; static driver_t xicp_driver = { "xicp", xicp_methods, sizeof(struct xicp_softc) }; static driver_t xics_driver = { "xics", xics_methods, 0 }; #ifdef POWERNV /* We can only pass physical addresses into OPAL. Kernel stacks are in the KVA, * not in the direct map, so we need to somehow extract the physical address. * However, pmap_kextract() takes locks, which is forbidden in a critical region * (which PMAP_DISPATCH() operates in). The kernel is mapped into the Direct * Map (0xc000....), and the CPU implicitly drops the top two bits when doing * real address by nature that the bus width is smaller than 64-bits. Placing * cpu_xirr into the DMAP lets us take advantage of this and avoids the * pmap_kextract() that would otherwise be needed if using the stack variable. 
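 * As an illustration (addresses hypothetical), a DMAP virtual address such
 * as 0xc000000001234000 corresponds to physical address 0x1234000 once the
 * top bits are dropped, so no pmap_kextract() call is needed for cpu_xirr.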
*/ static uint32_t cpu_xirr[MAXCPU]; #endif static devclass_t xicp_devclass; static devclass_t xics_devclass; EARLY_DRIVER_MODULE(xicp, ofwbus, xicp_driver, xicp_devclass, 0, 0, BUS_PASS_INTERRUPT-1); EARLY_DRIVER_MODULE(xics, ofwbus, xics_driver, xics_devclass, 0, 0, BUS_PASS_INTERRUPT); #ifdef POWERNV static struct resource * xicp_mem_for_cpu(int cpu) { device_t dev; struct xicp_softc *sc; int i; for (i = 0; (dev = devclass_get_device(xicp_devclass, i)) != NULL; i++){ sc = device_get_softc(dev); if (cpu >= sc->cpu_range[0] && cpu < sc->cpu_range[1]) return (sc->mem[cpu - sc->cpu_range[0]]); } return (NULL); } #endif static int xicp_probe(device_t dev) { if (!ofw_bus_is_compatible(dev, "ibm,ppc-xicp") && !ofw_bus_is_compatible(dev, "ibm,opal-intc")) return (ENXIO); device_set_desc(dev, "External Interrupt Presentation Controller"); return (BUS_PROBE_GENERIC); } static int xics_probe(device_t dev) { if (!ofw_bus_is_compatible(dev, "ibm,ppc-xics") && !ofw_bus_is_compatible(dev, "IBM,opal-xics")) return (ENXIO); device_set_desc(dev, "External Interrupt Source Controller"); return (BUS_PROBE_GENERIC); } static int xicp_attach(device_t dev) { struct xicp_softc *sc = device_get_softc(dev); phandle_t phandle = ofw_bus_get_node(dev); if (rtas_exists()) { sc->ibm_int_on = rtas_token_lookup("ibm,int-on"); sc->ibm_int_off = rtas_token_lookup("ibm,int-off"); sc->ibm_set_xive = rtas_token_lookup("ibm,set-xive"); sc->ibm_get_xive = rtas_token_lookup("ibm,get-xive"); #ifdef POWERNV } else if (opal_check() == 0) { /* No init needed */ #endif } else { device_printf(dev, "Cannot attach without RTAS or OPAL\n"); return (ENXIO); } if (OF_hasprop(phandle, "ibm,interrupt-server-ranges")) { OF_getencprop(phandle, "ibm,interrupt-server-ranges", sc->cpu_range, sizeof(sc->cpu_range)); sc->cpu_range[1] += sc->cpu_range[0]; device_printf(dev, "Handling CPUs %d-%d\n", sc->cpu_range[0], sc->cpu_range[1]-1); #ifdef POWERNV } else if (ofw_bus_is_compatible(dev, "ibm,opal-intc")) { /* * For now run POWER9 XIVE interrupt controller in XICS * compatibility mode. */ sc->xics_emu = true; opal_call(OPAL_XIVE_RESET, XIVE_XICS_MODE_EMU); #endif } else { sc->cpu_range[0] = 0; sc->cpu_range[1] = mp_ncpus; } #ifdef POWERNV if (mfmsr() & PSL_HV) { int i; if (sc->xics_emu) { opal_call(OPAL_INT_SET_CPPR, 0xff); for (i = 0; i < mp_ncpus; i++) { opal_call(OPAL_INT_SET_MFRR, pcpu_find(i)->pc_hwref, 0xff); } } else { for (i = 0; i < sc->cpu_range[1] - sc->cpu_range[0]; i++) { sc->mem[i] = bus_alloc_resource_any(dev, SYS_RES_MEMORY, &i, RF_ACTIVE); if (sc->mem[i] == NULL) { device_printf(dev, "Could not alloc mem " "resource %d\n", i); return (ENXIO); } /* Unmask interrupts on all cores */ bus_write_1(sc->mem[i], 4, 0xff); bus_write_1(sc->mem[i], 12, 0xff); } } } #endif mtx_init(&sc->sc_mtx, "XICP", NULL, MTX_DEF); sc->nintvecs = 0; powerpc_register_pic(dev, OF_xref_from_node(phandle), MAX_XICP_IRQS, 1 /* Number of IPIs */, FALSE); root_pic = dev; return (0); } static int xics_attach(device_t dev) { phandle_t phandle = ofw_bus_get_node(dev); /* The XICP (root PIC) will handle all our interrupts */ powerpc_register_pic(root_pic, OF_xref_from_node(phandle), MAX_XICP_IRQS, 1 /* Number of IPIs */, FALSE); return (0); } /* * PIC I/F methods. */ static void xicp_bind(device_t dev, u_int irq, cpuset_t cpumask) { struct xicp_softc *sc = device_get_softc(dev); cell_t status, cpu; int ncpus, i, error; /* Ignore IPIs */ if (irq == MAX_XICP_IRQS) return; /* * This doesn't appear to actually support affinity groups, so pick a * random CPU. 
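 * The code below counts the CPUs set in the mask, picks an index with
 * mftb() modulo that count as a cheap pseudo-random choice, and then walks
 * the mask a second time to find the CPU at that index.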
*/ ncpus = 0; CPU_FOREACH(cpu) if (CPU_ISSET(cpu, &cpumask)) ncpus++; i = mftb() % ncpus; ncpus = 0; CPU_FOREACH(cpu) { if (!CPU_ISSET(cpu, &cpumask)) continue; if (ncpus == i) break; ncpus++; } cpu = pcpu_find(cpu)->pc_hwref; /* XXX: super inefficient */ for (i = 0; i < sc->nintvecs; i++) { if (sc->intvecs[i].irq == irq) { sc->intvecs[i].cpu = cpu; break; } } KASSERT(i < sc->nintvecs, ("Binding non-configured interrupt")); if (rtas_exists()) error = rtas_call_method(sc->ibm_set_xive, 3, 1, irq, cpu, XICP_PRIORITY, &status); #ifdef POWERNV else error = opal_call(OPAL_SET_XIVE, irq, cpu << 2, XICP_PRIORITY); #endif if (error < 0) panic("Cannot bind interrupt %d to CPU %d", irq, cpu); } static void xicp_dispatch(device_t dev, struct trapframe *tf) { struct xicp_softc *sc; struct resource *regs = NULL; uint64_t xirr, junk; int i; sc = device_get_softc(dev); #ifdef POWERNV if ((mfmsr() & PSL_HV) && !sc->xics_emu) { regs = xicp_mem_for_cpu(PCPU_GET(hwref)); KASSERT(regs != NULL, ("Can't find regs for CPU %ld", (uintptr_t)PCPU_GET(hwref))); } #endif for (;;) { /* Return value in R4, use the PFT call */ if (regs) { xirr = bus_read_4(regs, 4); #ifdef POWERNV } else if (sc->xics_emu) { opal_call(OPAL_INT_GET_XIRR, &cpu_xirr[PCPU_GET(cpuid)], false); xirr = cpu_xirr[PCPU_GET(cpuid)]; #endif } else { /* Return value in R4, use the PFT call */ phyp_pft_hcall(H_XIRR, 0, 0, 0, 0, &xirr, &junk, &junk); } xirr &= 0x00ffffff; - if (xirr == 0) { /* No more pending interrupts? */ - if (regs) - bus_write_1(regs, 4, 0xff); -#ifdef POWERNV - else if (sc->xics_emu) - opal_call(OPAL_INT_SET_CPPR, 0xff); -#endif - else - phyp_hcall(H_CPPR, (uint64_t)0xff); + if (xirr == 0) /* No more pending interrupts? */ break; - } + if (xirr == XICP_IPI) { /* Magic number for IPIs */ xirr = MAX_XICP_IRQS; /* Map to FreeBSD magic */ /* Clear IPI */ if (regs) bus_write_1(regs, 12, 0xff); #ifdef POWERNV else if (sc->xics_emu) opal_call(OPAL_INT_SET_MFRR, PCPU_GET(hwref), 0xff); #endif else phyp_hcall(H_IPI, (uint64_t)(PCPU_GET(hwref)), 0xff); } /* XXX: super inefficient */ for (i = 0; i < sc->nintvecs; i++) { if (sc->intvecs[i].irq == xirr) break; } KASSERT(i < sc->nintvecs, ("Unmapped XIRR")); powerpc_dispatch_intr(sc->intvecs[i].vector, tf); } } static void xicp_enable(device_t dev, u_int irq, u_int vector) { struct xicp_softc *sc; cell_t status, cpu; sc = device_get_softc(dev); KASSERT(sc->nintvecs + 1 < nitems(sc->intvecs), ("Too many XICP interrupts")); /* Bind to this CPU to start: distrib. 
ID is last entry in gserver# */ cpu = PCPU_GET(hwref); mtx_lock(&sc->sc_mtx); sc->intvecs[sc->nintvecs].irq = irq; sc->intvecs[sc->nintvecs].vector = vector; sc->intvecs[sc->nintvecs].cpu = cpu; mb(); sc->nintvecs++; mtx_unlock(&sc->sc_mtx); /* IPIs are also enabled */ if (irq == MAX_XICP_IRQS) return; if (rtas_exists()) { rtas_call_method(sc->ibm_set_xive, 3, 1, irq, cpu, XICP_PRIORITY, &status); xicp_unmask(dev, irq); #ifdef POWERNV } else { status = opal_call(OPAL_SET_XIVE, irq, cpu << 2, XICP_PRIORITY); /* Unmask implicit for OPAL */ if (status != 0) panic("OPAL_SET_XIVE IRQ %d -> cpu %d failed: %d", irq, cpu, status); #endif } } static void xicp_eoi(device_t dev, u_int irq) { #ifdef POWERNV struct xicp_softc *sc; #endif uint64_t xirr; if (irq == MAX_XICP_IRQS) /* Remap IPI interrupt to internal value */ irq = XICP_IPI; - xirr = irq | (XICP_PRIORITY << 24); + xirr = irq | (0xff << 24); #ifdef POWERNV if (mfmsr() & PSL_HV) { sc = device_get_softc(dev); if (sc->xics_emu) opal_call(OPAL_INT_EOI, xirr); else bus_write_4(xicp_mem_for_cpu(PCPU_GET(hwref)), 4, xirr); } else #endif phyp_hcall(H_EOI, xirr); } static void xicp_ipi(device_t dev, u_int cpu) { #ifdef POWERNV struct xicp_softc *sc; cpu = pcpu_find(cpu)->pc_hwref; if (mfmsr() & PSL_HV) { sc = device_get_softc(dev); if (sc->xics_emu) { int64_t rv; rv = opal_call(OPAL_INT_SET_MFRR, cpu, XICP_PRIORITY); if (rv != 0) device_printf(dev, "IPI SET_MFRR result: %ld\n", rv); } else bus_write_1(xicp_mem_for_cpu(cpu), 12, XICP_PRIORITY); } else #endif phyp_hcall(H_IPI, (uint64_t)cpu, XICP_PRIORITY); } static void xicp_mask(device_t dev, u_int irq) { struct xicp_softc *sc = device_get_softc(dev); cell_t status; if (irq == MAX_XICP_IRQS) return; if (rtas_exists()) { rtas_call_method(sc->ibm_int_off, 1, 1, irq, &status); #ifdef POWERNV } else { int i; for (i = 0; i < sc->nintvecs; i++) { if (sc->intvecs[i].irq == irq) { break; } } KASSERT(i < sc->nintvecs, ("Masking unconfigured interrupt")); opal_call(OPAL_SET_XIVE, irq, sc->intvecs[i].cpu << 2, 0xff); #endif } } static void xicp_unmask(device_t dev, u_int irq) { struct xicp_softc *sc = device_get_softc(dev); cell_t status; if (irq == MAX_XICP_IRQS) return; if (rtas_exists()) { rtas_call_method(sc->ibm_int_on, 1, 1, irq, &status); #ifdef POWERNV } else { int i; for (i = 0; i < sc->nintvecs; i++) { if (sc->intvecs[i].irq == irq) { break; } } KASSERT(i < sc->nintvecs, ("Unmasking unconfigured interrupt")); opal_call(OPAL_SET_XIVE, irq, sc->intvecs[i].cpu << 2, XICP_PRIORITY); #endif } } #ifdef POWERNV /* This is only used on POWER9 systems with the XIVE's XICS emulation. */ void xicp_smp_cpu_startup(void) { struct xicp_softc *sc; if (mfmsr() & PSL_HV) { sc = device_get_softc(root_pic); if (sc->xics_emu) opal_call(OPAL_INT_SET_CPPR, 0xff); } } #endif Index: projects/openssl111/usr.bin/truss/syscalls.c =================================================================== --- projects/openssl111/usr.bin/truss/syscalls.c (revision 339239) +++ projects/openssl111/usr.bin/truss/syscalls.c (revision 339240) @@ -1,2710 +1,2714 @@ /*- * SPDX-License-Identifier: BSD-4-Clause * * Copyright 1997 Sean Eric Fagan * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Sean Eric Fagan * 4. Neither the name of the author may be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #include __FBSDID("$FreeBSD$"); /* * This file has routines used to print out system calls and their * arguments. */ #include #include #define _WANT_FREEBSD11_KEVENT #include #include #include #include #include #include #define _WANT_FREEBSD11_STAT #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "truss.h" #include "extern.h" #include "syscall.h" /* * This should probably be in its own file, sorted alphabetically. 
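 * Each entry below names a system call, gives its return class in .ret_type
 * and the number of decoded arguments in .nargs, and lists each argument in
 * .args as a { decoder, argument-index } pair, with the decoder optionally
 * OR'd with IN and/or OUT to indicate the direction in which the data is
 * copied.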
*/ static struct syscall decoded_syscalls[] = { /* Native ABI */ { .name = "__acl_aclcheck_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_aclcheck_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_aclcheck_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_delete_fd", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Acltype, 1 } } }, { .name = "__acl_delete_file", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Acltype, 1 } } }, { .name = "__acl_delete_link", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Acltype, 1 } } }, { .name = "__acl_get_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_get_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_get_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__acl_set_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Acltype, 1 }, { Ptr, 2 } } }, { .name = "__cap_rights_get", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { CapRights | OUT, 2 } } }, { .name = "__getcwd", .ret_type = 1, .nargs = 2, .args = { { Name | OUT, 0 }, { Int, 1 } } }, { .name = "_umtx_op", .ret_type = 1, .nargs = 5, .args = { { Ptr, 0 }, { Umtxop, 1 }, { LongHex, 2 }, { Ptr, 3 }, { Ptr, 4 } } }, { .name = "accept", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "access", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Accessmode, 1 } } }, { .name = "bind", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | IN, 1 }, { Socklent, 2 } } }, { .name = "bindat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Int, 1 }, { Sockaddr | IN, 2 }, { Int, 3 } } }, { .name = "break", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, { .name = "cap_fcntls_get", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapFcntlRights | OUT, 1 } } }, { .name = "cap_fcntls_limit", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapFcntlRights, 1 } } }, { .name = "cap_getmode", .ret_type = 1, .nargs = 1, .args = { { PUInt | OUT, 0 } } }, { .name = "cap_rights_limit", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CapRights, 1 } } }, { .name = "chdir", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "chflags", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { FileFlags, 1 } } }, { .name = "chflagsat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { FileFlags, 2 }, { Atflags, 3 } } }, { .name = "chmod", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "chown", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "chroot", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "clock_gettime", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec | OUT, 1 } } }, { .name = "close", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "compat11.fstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Stat11 | OUT, 1 } } }, { .name = "compat11.fstatat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, 
{ Stat11 | OUT, 2 }, { Atflags, 3 } } }, { .name = "compat11.kevent", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { Kevent11, 1 }, { Int, 2 }, { Kevent11 | OUT, 3 }, { Int, 4 }, { Timespec, 5 } } }, { .name = "compat11.lstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat11 | OUT, 1 } } }, { .name = "compat11.stat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat11 | OUT, 1 } } }, { .name = "connect", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | IN, 1 }, { Socklent, 2 } } }, { .name = "connectat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Int, 1 }, { Sockaddr | IN, 2 }, { Int, 3 } } }, { .name = "dup", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "dup2", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "eaccess", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Accessmode, 1 } } }, { .name = "execve", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { ExecArgs | IN, 1 }, { ExecEnv | IN, 2 } } }, { .name = "exit", .ret_type = 0, .nargs = 1, .args = { { Hex, 0 } } }, { .name = "extattr_delete_fd", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_delete_file", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_delete_link", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 } } }, { .name = "extattr_get_fd", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_get_file", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_get_link", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | OUT, 3 }, { Sizet, 4 } } }, { .name = "extattr_list_fd", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_list_file", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_list_link", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { BinString | OUT, 2 }, { Sizet, 3 } } }, { .name = "extattr_set_fd", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattr_set_file", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattr_set_link", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Extattrnamespace, 1 }, { Name, 2 }, { BinString | IN, 3 }, { Sizet, 4 } } }, { .name = "extattrctl", .ret_type = 1, .nargs = 5, .args = { { Name, 0 }, { Hex, 1 }, { Name, 2 }, { Extattrnamespace, 3 }, { Name, 4 } } }, { .name = "faccessat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Accessmode, 2 }, { Atflags, 3 } } }, { .name = "fchflags", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { FileFlags, 1 } } }, { .name = "fchmod", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Octal, 1 } } }, { .name = "fchmodat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 }, { Atflags, 3 } } }, { .name = "fchown", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "fchownat", 
.ret_type = 1, .nargs = 5, .args = { { Atfd, 0 }, { Name, 1 }, { Int, 2 }, { Int, 3 }, { Atflags, 4 } } }, { .name = "fcntl", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Fcntl, 1 }, { Fcntlflag, 2 } } }, { .name = "flock", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Flockop, 1 } } }, { .name = "fstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Stat | OUT, 1 } } }, { .name = "fstatat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Stat | OUT, 2 }, { Atflags, 3 } } }, { .name = "fstatfs", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { StatFs | OUT, 1 } } }, { .name = "ftruncate", .ret_type = 1, .nargs = 2, .args = { { Int | IN, 0 }, { QuadHex | IN, 1 } } }, { .name = "futimens", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec2 | IN, 1 } } }, { .name = "futimes", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timeval2 | IN, 1 } } }, { .name = "futimesat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Timeval2 | IN, 2 } } }, { .name = "getdirentries", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Int, 2 }, { PQuadHex | OUT, 3 } } }, { .name = "getfsstat", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Long, 1 }, { Getfsstatmode, 2 } } }, { .name = "getitimer", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Itimerval | OUT, 2 } } }, { .name = "getpeername", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "getpgid", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "getpriority", .ret_type = 1, .nargs = 2, .args = { { Priowhich, 0 }, { Int, 1 } } }, { .name = "getrandom", .ret_type = 1, .nargs = 3, .args = { { BinString | OUT, 0 }, { Sizet, 1 }, { UInt, 2 } } }, { .name = "getrlimit", .ret_type = 1, .nargs = 2, .args = { { Resource, 0 }, { Rlimit | OUT, 1 } } }, { .name = "getrusage", .ret_type = 1, .nargs = 2, .args = { { RusageWho, 0 }, { Rusage | OUT, 1 } } }, { .name = "getsid", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "getsockname", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Sockaddr | OUT, 1 }, { Ptr | OUT, 2 } } }, { .name = "getsockopt", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Sockoptlevel, 1 }, { Sockoptname, 2 }, { Ptr | OUT, 3 }, { Ptr | OUT, 4 } } }, { .name = "gettimeofday", .ret_type = 1, .nargs = 2, .args = { { Timeval | OUT, 0 }, { Ptr, 1 } } }, { .name = "ioctl", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Ioctl, 1 }, { Ptr, 2 } } }, { .name = "kevent", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { Kevent, 1 }, { Int, 2 }, { Kevent | OUT, 3 }, { Int, 4 }, { Timespec, 5 } } }, { .name = "kill", .ret_type = 1, .nargs = 2, .args = { { Int | IN, 0 }, { Signal | IN, 1 } } }, { .name = "kldfind", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "kldfirstmod", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldload", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "kldnext", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Ptr, 1 } } }, { .name = "kldsym", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Kldsymcmd, 1 }, { Ptr, 2 } } }, { .name = "kldunload", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "kldunloadf", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Kldunloadflags, 1 } } }, { .name = "kse_release", .ret_type = 0, .nargs = 1, .args = { { Timespec, 0 } } }, { .name = 
"lchflags", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { FileFlags, 1 } } }, { .name = "lchmod", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "lchown", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "link", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "linkat", .ret_type = 1, .nargs = 5, .args = { { Atfd, 0 }, { Name, 1 }, { Atfd, 2 }, { Name, 3 }, { Atflags, 4 } } }, { .name = "listen", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "lseek", .ret_type = 2, .nargs = 3, .args = { { Int, 0 }, { QuadHex, 1 }, { Whence, 2 } } }, { .name = "lstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat | OUT, 1 } } }, { .name = "lutimes", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Timeval2 | IN, 1 } } }, { .name = "madvise", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Madvice, 2 } } }, { .name = "minherit", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Minherit, 2 } } }, { .name = "mkdir", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "mkdirat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 } } }, { .name = "mkfifo", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Octal, 1 } } }, { .name = "mkfifoat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 } } }, { .name = "mknod", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Octal, 1 }, { Int, 2 } } }, { .name = "mknodat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Octal, 2 }, { Int, 3 } } }, { .name = "mlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "mlockall", .ret_type = 1, .nargs = 1, .args = { { Mlockall, 0 } } }, { .name = "mmap", .ret_type = 1, .nargs = 6, .args = { { Ptr, 0 }, { Sizet, 1 }, { Mprot, 2 }, { Mmapflags, 3 }, { Int, 4 }, { QuadHex, 5 } } }, { .name = "modfind", .ret_type = 1, .nargs = 1, .args = { { Name | IN, 0 } } }, { .name = "mount", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Name, 1 }, { Mountflags, 2 }, { Ptr, 3 } } }, { .name = "mprotect", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Mprot, 2 } } }, { .name = "msync", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Sizet, 1 }, { Msync, 2 } } }, { .name = "munlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "munmap", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Sizet, 1 } } }, { .name = "nanosleep", .ret_type = 1, .nargs = 1, .args = { { Timespec, 0 } } }, { .name = "nmount", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { UInt, 1 }, { Mountflags, 2 } } }, { .name = "open", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { Open, 1 }, { Octal, 2 } } }, { .name = "openat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Open, 2 }, { Octal, 3 } } }, { .name = "pathconf", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Pathconf, 1 } } }, { .name = "pipe", .ret_type = 1, .nargs = 1, .args = { { PipeFds | OUT, 0 } } }, { .name = "pipe2", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Pipe2, 1 } } }, { .name = "poll", .ret_type = 1, .nargs = 3, .args = { { Pollfd, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "posix_fadvise", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { QuadHex, 1 }, { QuadHex, 2 }, { Fadvice, 3 } } }, { .name = "posix_openpt", .ret_type = 1, .nargs = 1, .args = { { Open, 0 } } }, { .name = 
"pread", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 }, { QuadHex, 3 } } }, { .name = "procctl", .ret_type = 1, .nargs = 4, .args = { { Idtype, 0 }, { Quad, 1 }, { Procctl, 2 }, { Ptr, 3 } } }, { .name = "ptrace", .ret_type = 1, .nargs = 4, .args = { { Ptraceop, 0 }, { Int, 1 }, { Ptr, 2 }, { Int, 3 } } }, { .name = "pwrite", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 }, { QuadHex, 3 } } }, { .name = "quotactl", .ret_type = 1, .nargs = 4, .args = { { Name, 0 }, { Quotactlcmd, 1 }, { Int, 2 }, { Ptr, 3 } } }, { .name = "read", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 } } }, { .name = "readlink", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Readlinkres | OUT, 1 }, { Sizet, 2 } } }, { .name = "readlinkat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Readlinkres | OUT, 2 }, { Sizet, 3 } } }, { .name = "readv", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Iovec | OUT, 1 }, { Int, 2 } } }, { .name = "reboot", .ret_type = 1, .nargs = 1, .args = { { Reboothowto, 0 } } }, { .name = "recvfrom", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Sizet, 2 }, { Msgflags, 3 }, { Sockaddr | OUT, 4 }, { Ptr | OUT, 5 } } }, { .name = "recvmsg", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Msghdr | OUT, 1 }, { Msgflags, 2 } } }, { .name = "rename", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "renameat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name, 1 }, { Atfd, 2 }, { Name, 3 } } }, { .name = "rfork", .ret_type = 1, .nargs = 1, .args = { { Rforkflags, 0 } } }, { .name = "rmdir", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "rtprio", .ret_type = 1, .nargs = 3, .args = { { Rtpriofunc, 0 }, { Int, 1 }, { Ptr, 2 } } }, { .name = "rtprio_thread", .ret_type = 1, .nargs = 3, .args = { { Rtpriofunc, 0 }, { Int, 1 }, { Ptr, 2 } } }, { .name = "sched_get_priority_max", .ret_type = 1, .nargs = 1, .args = { { Schedpolicy, 0 } } }, { .name = "sched_get_priority_min", .ret_type = 1, .nargs = 1, .args = { { Schedpolicy, 0 } } }, { .name = "sched_getparam", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Schedparam | OUT, 1 } } }, { .name = "sched_getscheduler", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "sched_rr_get_interval", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Timespec | OUT, 1 } } }, { .name = "sched_setparam", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Schedparam, 1 } } }, { .name = "sched_setscheduler", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Schedpolicy, 1 }, { Schedparam, 2 } } }, { .name = "sctp_generic_recvmsg", .ret_type = 1, .nargs = 7, .args = { { Int, 0 }, { Iovec | OUT, 1 }, { Int, 2 }, { Sockaddr | OUT, 3 }, { Ptr | OUT, 4 }, { Sctpsndrcvinfo | OUT, 5 }, { Ptr | OUT, 6 } } }, { .name = "sctp_generic_sendmsg", .ret_type = 1, .nargs = 7, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 2 }, { Sockaddr | IN, 3 }, { Socklent, 4 }, { Sctpsndrcvinfo | IN, 5 }, { Msgflags, 6 } } }, { .name = "sctp_generic_sendmsg_iov", .ret_type = 1, .nargs = 7, .args = { { Int, 0 }, { Iovec | IN, 1 }, { Int, 2 }, { Sockaddr | IN, 3 }, { Socklent, 4 }, { Sctpsndrcvinfo | IN, 5 }, { Msgflags, 6 } } }, { .name = "select", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Fd_set, 1 }, { Fd_set, 2 }, { Fd_set, 3 }, { Timeval, 4 } } }, { .name = "sendmsg", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Msghdr | IN, 1 }, { 
Msgflags, 2 } } }, { .name = "sendto", .ret_type = 1, .nargs = 6, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 }, { Msgflags, 3 }, { Sockaddr | IN, 4 }, { Socklent | IN, 5 } } }, { .name = "setitimer", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Itimerval, 1 }, { Itimerval | OUT, 2 } } }, { .name = "setpriority", .ret_type = 1, .nargs = 3, .args = { { Priowhich, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "setrlimit", .ret_type = 1, .nargs = 2, .args = { { Resource, 0 }, { Rlimit | IN, 1 } } }, { .name = "setsockopt", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { Sockoptlevel, 1 }, { Sockoptname, 2 }, { Ptr | IN, 3 }, { Socklent, 4 } } }, + { .name = "shm_open", .ret_type = 1, .nargs = 3, + .args = { { Name | IN, 0 }, { Open, 1 }, { Octal, 2 } } }, + { .name = "shm_unlink", .ret_type = 1, .nargs = 1, + .args = { { Name | IN, 0 } } }, { .name = "shutdown", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Shutdown, 1 } } }, { .name = "sigaction", .ret_type = 1, .nargs = 3, .args = { { Signal, 0 }, { Sigaction | IN, 1 }, { Sigaction | OUT, 2 } } }, { .name = "sigpending", .ret_type = 1, .nargs = 1, .args = { { Sigset | OUT, 0 } } }, { .name = "sigprocmask", .ret_type = 1, .nargs = 3, .args = { { Sigprocmask, 0 }, { Sigset, 1 }, { Sigset | OUT, 2 } } }, { .name = "sigqueue", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Signal, 1 }, { LongHex, 2 } } }, { .name = "sigreturn", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, { .name = "sigsuspend", .ret_type = 1, .nargs = 1, .args = { { Sigset | IN, 0 } } }, { .name = "sigtimedwait", .ret_type = 1, .nargs = 3, .args = { { Sigset | IN, 0 }, { Siginfo | OUT, 1 }, { Timespec | IN, 2 } } }, { .name = "sigwait", .ret_type = 1, .nargs = 2, .args = { { Sigset | IN, 0 }, { PSig | OUT, 1 } } }, { .name = "sigwaitinfo", .ret_type = 1, .nargs = 2, .args = { { Sigset | IN, 0 }, { Siginfo | OUT, 1 } } }, { .name = "socket", .ret_type = 1, .nargs = 3, .args = { { Sockdomain, 0 }, { Socktype, 1 }, { Sockprotocol, 2 } } }, { .name = "stat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Stat | OUT, 1 } } }, { .name = "statfs", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { StatFs | OUT, 1 } } }, { .name = "symlink", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Name, 1 } } }, { .name = "symlinkat", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Atfd, 1 }, { Name, 2 } } }, { .name = "sysarch", .ret_type = 1, .nargs = 2, .args = { { Sysarch, 0 }, { Ptr, 1 } } }, { .name = "thr_kill", .ret_type = 1, .nargs = 2, .args = { { Long, 0 }, { Signal, 1 } } }, { .name = "thr_self", .ret_type = 1, .nargs = 1, .args = { { Ptr, 0 } } }, { .name = "thr_set_name", .ret_type = 1, .nargs = 2, .args = { { Long, 0 }, { Name, 1 } } }, { .name = "truncate", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { QuadHex | IN, 1 } } }, #if 0 /* Does not exist */ { .name = "umount", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Int, 2 } } }, #endif { .name = "unlink", .ret_type = 1, .nargs = 1, .args = { { Name, 0 } } }, { .name = "unlinkat", .ret_type = 1, .nargs = 3, .args = { { Atfd, 0 }, { Name, 1 }, { Atflags, 2 } } }, { .name = "unmount", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Mountflags, 1 } } }, { .name = "utimensat", .ret_type = 1, .nargs = 4, .args = { { Atfd, 0 }, { Name | IN, 1 }, { Timespec2 | IN, 2 }, { Atflags, 3 } } }, { .name = "utimes", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Timeval2 | IN, 1 } } }, { .name = "utrace", .ret_type = 1, .nargs = 1, .args = { { Utrace, 0 
} } }, { .name = "wait4", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { ExitStatus | OUT, 1 }, { Waitoptions, 2 }, { Rusage | OUT, 3 } } }, { .name = "wait6", .ret_type = 1, .nargs = 6, .args = { { Idtype, 0 }, { Quad, 1 }, { ExitStatus | OUT, 2 }, { Waitoptions, 3 }, { Rusage | OUT, 4 }, { Siginfo | OUT, 5 } } }, { .name = "write", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { Sizet, 2 } } }, { .name = "writev", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Iovec | IN, 1 }, { Int, 2 } } }, /* Linux ABI */ { .name = "linux_access", .ret_type = 1, .nargs = 2, .args = { { Name, 0 }, { Accessmode, 1 } } }, { .name = "linux_execve", .ret_type = 1, .nargs = 3, .args = { { Name | IN, 0 }, { ExecArgs | IN, 1 }, { ExecEnv | IN, 2 } } }, { .name = "linux_lseek", .ret_type = 2, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Whence, 2 } } }, { .name = "linux_mkdir", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Int, 1 } } }, { .name = "linux_newfstat", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Ptr | OUT, 1 } } }, { .name = "linux_newstat", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Ptr | OUT, 1 } } }, { .name = "linux_open", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Hex, 1 }, { Octal, 2 } } }, { .name = "linux_readlink", .ret_type = 1, .nargs = 3, .args = { { Name, 0 }, { Name | OUT, 1 }, { Sizet, 2 } } }, { .name = "linux_socketcall", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { LinuxSockArgs, 1 } } }, { .name = "linux_stat64", .ret_type = 1, .nargs = 2, .args = { { Name | IN, 0 }, { Ptr | OUT, 1 } } }, /* CloudABI system calls. */ { .name = "cloudabi_sys_clock_res_get", .ret_type = 1, .nargs = 1, .args = { { CloudABIClockID, 0 } } }, { .name = "cloudabi_sys_clock_time_get", .ret_type = 1, .nargs = 2, .args = { { CloudABIClockID, 0 }, { CloudABITimestamp, 1 } } }, { .name = "cloudabi_sys_condvar_signal", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 }, { UInt, 2 } } }, { .name = "cloudabi_sys_fd_close", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_create1", .ret_type = 1, .nargs = 1, .args = { { CloudABIFileType, 0 } } }, { .name = "cloudabi_sys_fd_create2", .ret_type = 1, .nargs = 2, .args = { { CloudABIFileType, 0 }, { PipeFds | OUT, 0 } } }, { .name = "cloudabi_sys_fd_datasync", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_dup", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_fd_replace", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_fd_seek", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { CloudABIWhence, 2 } } }, { .name = "cloudabi_sys_fd_stat_get", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABIFDStat | OUT, 1 } } }, { .name = "cloudabi_sys_fd_stat_put", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { CloudABIFDStat | IN, 1 }, { CloudABIFDSFlags, 2 } } }, { .name = "cloudabi_sys_fd_sync", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_file_advise", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 }, { CloudABIAdvice, 3 } } }, { .name = "cloudabi_sys_file_allocate", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { Int, 1 }, { Int, 2 } } }, { .name = "cloudabi_sys_file_create", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIFileType, 3 } } }, { .name = "cloudabi_sys_file_link", .ret_type = 1, .nargs = 4, .args = { { 
CloudABILookup, 0 }, { BinString | IN, 1 }, { Int, 3 }, { BinString | IN, 4 } } }, { .name = "cloudabi_sys_file_open", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIOFlags, 3 }, { CloudABIFDStat | IN, 4 } } }, { .name = "cloudabi_sys_file_readdir", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | OUT, 1 }, { Int, 2 }, { Int, 3 } } }, { .name = "cloudabi_sys_file_readlink", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { BinString | OUT, 3 }, { Int, 4 } } }, { .name = "cloudabi_sys_file_rename", .ret_type = 1, .nargs = 4, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 3 }, { BinString | IN, 4 } } }, { .name = "cloudabi_sys_file_stat_fget", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABIFileStat | OUT, 1 } } }, { .name = "cloudabi_sys_file_stat_fput", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { CloudABIFileStat | IN, 1 }, { CloudABIFSFlags, 2 } } }, { .name = "cloudabi_sys_file_stat_get", .ret_type = 1, .nargs = 3, .args = { { CloudABILookup, 0 }, { BinString | IN, 1 }, { CloudABIFileStat | OUT, 3 } } }, { .name = "cloudabi_sys_file_stat_put", .ret_type = 1, .nargs = 4, .args = { { CloudABILookup, 0 }, { BinString | IN, 1 }, { CloudABIFileStat | IN, 3 }, { CloudABIFSFlags, 4 } } }, { .name = "cloudabi_sys_file_symlink", .ret_type = 1, .nargs = 3, .args = { { BinString | IN, 0 }, { Int, 2 }, { BinString | IN, 3 } } }, { .name = "cloudabi_sys_file_unlink", .ret_type = 1, .nargs = 3, .args = { { Int, 0 }, { BinString | IN, 1 }, { CloudABIULFlags, 3 } } }, { .name = "cloudabi_sys_lock_unlock", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 } } }, { .name = "cloudabi_sys_mem_advise", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIAdvice, 2 } } }, { .name = "cloudabi_sys_mem_map", .ret_type = 1, .nargs = 6, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMProt, 2 }, { CloudABIMFlags, 3 }, { Int, 4 }, { Int, 5 } } }, { .name = "cloudabi_sys_mem_protect", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMProt, 2 } } }, { .name = "cloudabi_sys_mem_sync", .ret_type = 1, .nargs = 3, .args = { { Ptr, 0 }, { Int, 1 }, { CloudABIMSFlags, 2 } } }, { .name = "cloudabi_sys_mem_unmap", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_proc_exec", .ret_type = 1, .nargs = 5, .args = { { Int, 0 }, { BinString | IN, 1 }, { Int, 2 }, { IntArray, 3 }, { Int, 4 } } }, { .name = "cloudabi_sys_proc_exit", .ret_type = 1, .nargs = 1, .args = { { Int, 0 } } }, { .name = "cloudabi_sys_proc_fork", .ret_type = 1, .nargs = 0 }, { .name = "cloudabi_sys_proc_raise", .ret_type = 1, .nargs = 1, .args = { { CloudABISignal, 0 } } }, { .name = "cloudabi_sys_random_get", .ret_type = 1, .nargs = 2, .args = { { BinString | OUT, 0 }, { Int, 1 } } }, { .name = "cloudabi_sys_sock_shutdown", .ret_type = 1, .nargs = 2, .args = { { Int, 0 }, { CloudABISDFlags, 1 } } }, { .name = "cloudabi_sys_thread_exit", .ret_type = 1, .nargs = 2, .args = { { Ptr, 0 }, { CloudABIMFlags, 1 } } }, { .name = "cloudabi_sys_thread_yield", .ret_type = 1, .nargs = 0 }, { .name = 0 }, }; static STAILQ_HEAD(, syscall) syscalls; /* Xlat idea taken from strace */ struct xlat { int val; const char *str; }; #define X(a) { a, #a }, #define XEND { 0, NULL } static struct xlat poll_flags[] = { X(POLLSTANDARD) X(POLLIN) X(POLLPRI) X(POLLOUT) X(POLLERR) X(POLLHUP) X(POLLNVAL) X(POLLRDNORM) X(POLLRDBAND) X(POLLWRBAND) X(POLLINIGNEOF) XEND }; static struct xlat 
sigaction_flags[] = { X(SA_ONSTACK) X(SA_RESTART) X(SA_RESETHAND) X(SA_NOCLDSTOP) X(SA_NODEFER) X(SA_NOCLDWAIT) X(SA_SIGINFO) XEND }; static struct xlat linux_socketcall_ops[] = { X(LINUX_SOCKET) X(LINUX_BIND) X(LINUX_CONNECT) X(LINUX_LISTEN) X(LINUX_ACCEPT) X(LINUX_GETSOCKNAME) X(LINUX_GETPEERNAME) X(LINUX_SOCKETPAIR) X(LINUX_SEND) X(LINUX_RECV) X(LINUX_SENDTO) X(LINUX_RECVFROM) X(LINUX_SHUTDOWN) X(LINUX_SETSOCKOPT) X(LINUX_GETSOCKOPT) X(LINUX_SENDMSG) X(LINUX_RECVMSG) XEND }; #undef X #define X(a) { CLOUDABI_##a, #a }, static struct xlat cloudabi_advice[] = { X(ADVICE_DONTNEED) X(ADVICE_NOREUSE) X(ADVICE_NORMAL) X(ADVICE_RANDOM) X(ADVICE_SEQUENTIAL) X(ADVICE_WILLNEED) XEND }; static struct xlat cloudabi_clockid[] = { X(CLOCK_MONOTONIC) X(CLOCK_PROCESS_CPUTIME_ID) X(CLOCK_REALTIME) X(CLOCK_THREAD_CPUTIME_ID) XEND }; static struct xlat cloudabi_fdflags[] = { X(FDFLAG_APPEND) X(FDFLAG_DSYNC) X(FDFLAG_NONBLOCK) X(FDFLAG_RSYNC) X(FDFLAG_SYNC) XEND }; static struct xlat cloudabi_fdsflags[] = { X(FDSTAT_FLAGS) X(FDSTAT_RIGHTS) XEND }; static struct xlat cloudabi_filetype[] = { X(FILETYPE_UNKNOWN) X(FILETYPE_BLOCK_DEVICE) X(FILETYPE_CHARACTER_DEVICE) X(FILETYPE_DIRECTORY) X(FILETYPE_PROCESS) X(FILETYPE_REGULAR_FILE) X(FILETYPE_SHARED_MEMORY) X(FILETYPE_SOCKET_DGRAM) X(FILETYPE_SOCKET_STREAM) X(FILETYPE_SYMBOLIC_LINK) XEND }; static struct xlat cloudabi_fsflags[] = { X(FILESTAT_ATIM) X(FILESTAT_ATIM_NOW) X(FILESTAT_MTIM) X(FILESTAT_MTIM_NOW) X(FILESTAT_SIZE) XEND }; static struct xlat cloudabi_mflags[] = { X(MAP_ANON) X(MAP_FIXED) X(MAP_PRIVATE) X(MAP_SHARED) XEND }; static struct xlat cloudabi_mprot[] = { X(PROT_EXEC) X(PROT_WRITE) X(PROT_READ) XEND }; static struct xlat cloudabi_msflags[] = { X(MS_ASYNC) X(MS_INVALIDATE) X(MS_SYNC) XEND }; static struct xlat cloudabi_oflags[] = { X(O_CREAT) X(O_DIRECTORY) X(O_EXCL) X(O_TRUNC) XEND }; static struct xlat cloudabi_sdflags[] = { X(SHUT_RD) X(SHUT_WR) XEND }; static struct xlat cloudabi_signal[] = { X(SIGABRT) X(SIGALRM) X(SIGBUS) X(SIGCHLD) X(SIGCONT) X(SIGFPE) X(SIGHUP) X(SIGILL) X(SIGINT) X(SIGKILL) X(SIGPIPE) X(SIGQUIT) X(SIGSEGV) X(SIGSTOP) X(SIGSYS) X(SIGTERM) X(SIGTRAP) X(SIGTSTP) X(SIGTTIN) X(SIGTTOU) X(SIGURG) X(SIGUSR1) X(SIGUSR2) X(SIGVTALRM) X(SIGXCPU) X(SIGXFSZ) XEND }; static struct xlat cloudabi_ulflags[] = { X(UNLINK_REMOVEDIR) XEND }; static struct xlat cloudabi_whence[] = { X(WHENCE_CUR) X(WHENCE_END) X(WHENCE_SET) XEND }; #undef X #undef XEND /* * Searches an xlat array for a value, and returns it if found. Otherwise * return a string representation. */ static const char * lookup(struct xlat *xlat, int val, int base) { static char tmp[16]; for (; xlat->str != NULL; xlat++) if (xlat->val == val) return (xlat->str); switch (base) { case 8: sprintf(tmp, "0%o", val); break; case 16: sprintf(tmp, "0x%x", val); break; case 10: sprintf(tmp, "%u", val); break; default: errx(1,"Unknown lookup base"); break; } return (tmp); } static const char * xlookup(struct xlat *xlat, int val) { return (lookup(xlat, val, 16)); } /* * Searches an xlat array containing bitfield values. Remaining bits * set after removing the known ones are printed at the end: * IN|0x400. */ static char * xlookup_bits(struct xlat *xlat, int val) { int len, rem; static char str[512]; len = 0; rem = val; for (; xlat->str != NULL; xlat++) { if ((xlat->val & rem) == xlat->val) { /* * Don't print the "all-bits-zero" string unless all * bits are really zero. 
*/ if (xlat->val == 0 && val != 0) continue; len += sprintf(str + len, "%s|", xlat->str); rem &= ~(xlat->val); } } /* * If we have leftover bits or didn't match anything, print * the remainder. */ if (rem || len == 0) len += sprintf(str + len, "0x%x", rem); if (len && str[len - 1] == '|') len--; str[len] = 0; return (str); } static void print_integer_arg(const char *(*decoder)(int), FILE *fp, int value) { const char *str; str = decoder(value); if (str != NULL) fputs(str, fp); else fprintf(fp, "%d", value); } static void print_mask_arg(bool (*decoder)(FILE *, int, int *), FILE *fp, int value) { int rem; if (!decoder(fp, value, &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); } static void print_mask_arg32(bool (*decoder)(FILE *, uint32_t, uint32_t *), FILE *fp, uint32_t value) { uint32_t rem; if (!decoder(fp, value, &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); } #ifndef __LP64__ /* * Add argument padding to subsequent system calls after a Quad * syscall argument as needed. This used to be done by hand in the * decoded_syscalls table which was ugly and error prone. It is * simpler to do the fixup of offsets at initialization time than when * decoding arguments. */ static void quad_fixup(struct syscall *sc) { int offset, prev; u_int i; offset = 0; prev = -1; for (i = 0; i < sc->nargs; i++) { /* This arg type is a dummy that doesn't use offset. */ if ((sc->args[i].type & ARG_MASK) == PipeFds) continue; assert(prev < sc->args[i].offset); prev = sc->args[i].offset; sc->args[i].offset += offset; switch (sc->args[i].type & ARG_MASK) { case Quad: case QuadHex: #ifdef __powerpc__ /* * 64-bit arguments on 32-bit powerpc must be * 64-bit aligned. If the current offset is * not aligned, the calling convention inserts * a 32-bit pad argument that should be skipped. */ if (sc->args[i].offset % 2 == 1) { sc->args[i].offset++; offset++; } #endif offset++; default: break; } } } #endif void init_syscalls(void) { struct syscall *sc; STAILQ_INIT(&syscalls); for (sc = decoded_syscalls; sc->name != NULL; sc++) { #ifndef __LP64__ quad_fixup(sc); #endif STAILQ_INSERT_HEAD(&syscalls, sc, entries); } } static struct syscall * find_syscall(struct procabi *abi, u_int number) { struct extra_syscall *es; if (number < nitems(abi->syscalls)) return (abi->syscalls[number]); STAILQ_FOREACH(es, &abi->extra_syscalls, entries) { if (es->number == number) return (es->sc); } return (NULL); } static void add_syscall(struct procabi *abi, u_int number, struct syscall *sc) { struct extra_syscall *es; if (number < nitems(abi->syscalls)) { assert(abi->syscalls[number] == NULL); abi->syscalls[number] = sc; } else { es = malloc(sizeof(*es)); es->sc = sc; es->number = number; STAILQ_INSERT_TAIL(&abi->extra_syscalls, es, entries); } } /* * If/when the list gets big, it might be desirable to do it * as a hash table or binary search. */ struct syscall * get_syscall(struct threadinfo *t, u_int number, u_int nargs) { struct syscall *sc; const char *name; char *new_name; u_int i; sc = find_syscall(t->proc->abi, number); if (sc != NULL) return (sc); name = sysdecode_syscallname(t->proc->abi->abi, number); if (name == NULL) { asprintf(&new_name, "#%d", number); name = new_name; } else new_name = NULL; STAILQ_FOREACH(sc, &syscalls, entries) { if (strcmp(name, sc->name) == 0) { add_syscall(t->proc->abi, number, sc); free(new_name); return (sc); } } /* It is unknown. Add it into the list.
*/ #if DEBUG fprintf(stderr, "unknown syscall %s -- setting args to %d\n", name, nargs); #endif sc = calloc(1, sizeof(struct syscall)); sc->name = name; if (new_name != NULL) sc->unknown = true; sc->ret_type = 1; sc->nargs = nargs; for (i = 0; i < nargs; i++) { sc->args[i].offset = i; /* Treat all unknown arguments as LongHex. */ sc->args[i].type = LongHex; } STAILQ_INSERT_HEAD(&syscalls, sc, entries); add_syscall(t->proc->abi, number, sc); return (sc); } /* * Copy a fixed amount of bytes from the process. */ static int get_struct(pid_t pid, void *offset, void *buf, int len) { struct ptrace_io_desc iorequest; iorequest.piod_op = PIOD_READ_D; iorequest.piod_offs = offset; iorequest.piod_addr = buf; iorequest.piod_len = len; if (ptrace(PT_IO, pid, (caddr_t)&iorequest, 0) < 0) return (-1); return (0); } #define MAXSIZE 4096 /* * Copy a string from the process. Note that it is * expected to be a C string, but if max is set, it will * only get that much. */ static char * get_string(pid_t pid, void *addr, int max) { struct ptrace_io_desc iorequest; char *buf, *nbuf; size_t offset, size, totalsize; offset = 0; if (max) size = max + 1; else { /* Read up to the end of the current page. */ size = PAGE_SIZE - ((uintptr_t)addr % PAGE_SIZE); if (size > MAXSIZE) size = MAXSIZE; } totalsize = size; buf = malloc(totalsize); if (buf == NULL) return (NULL); for (;;) { iorequest.piod_op = PIOD_READ_D; iorequest.piod_offs = (char *)addr + offset; iorequest.piod_addr = buf + offset; iorequest.piod_len = size; if (ptrace(PT_IO, pid, (caddr_t)&iorequest, 0) < 0) { free(buf); return (NULL); } if (memchr(buf + offset, '\0', size) != NULL) return (buf); offset += size; if (totalsize < MAXSIZE && max == 0) { size = MAXSIZE - totalsize; if (size > PAGE_SIZE) size = PAGE_SIZE; nbuf = realloc(buf, totalsize + size); if (nbuf == NULL) { buf[totalsize - 1] = '\0'; return (buf); } buf = nbuf; totalsize += size; } else { buf[totalsize - 1] = '\0'; return (buf); } } } static const char * strsig2(int sig) { static char tmp[32]; const char *signame; signame = sysdecode_signal(sig); if (signame == NULL) { snprintf(tmp, sizeof(tmp), "%d", sig); signame = tmp; } return (signame); } static void print_kevent(FILE *fp, struct kevent *ke) { switch (ke->filter) { case EVFILT_READ: case EVFILT_WRITE: case EVFILT_VNODE: case EVFILT_PROC: case EVFILT_TIMER: case EVFILT_PROCDESC: case EVFILT_EMPTY: fprintf(fp, "%ju", (uintmax_t)ke->ident); break; case EVFILT_SIGNAL: fputs(strsig2(ke->ident), fp); break; default: fprintf(fp, "%p", (void *)ke->ident); } fprintf(fp, ","); print_integer_arg(sysdecode_kevent_filter, fp, ke->filter); fprintf(fp, ","); print_mask_arg(sysdecode_kevent_flags, fp, ke->flags); fprintf(fp, ","); sysdecode_kevent_fflags(fp, ke->filter, ke->fflags, 16); fprintf(fp, ",%#jx,%p", (uintmax_t)ke->data, ke->udata); } static void print_utrace(FILE *fp, void *utrace_addr, size_t len) { unsigned char *utrace_buffer; fprintf(fp, "{ "); if (sysdecode_utrace(fp, utrace_addr, len)) { fprintf(fp, " }"); return; } utrace_buffer = utrace_addr; fprintf(fp, "%zu:", len); while (len--) fprintf(fp, " %02x", *utrace_buffer++); fprintf(fp, " }"); } static void print_sockaddr(FILE *fp, struct trussinfo *trussinfo, void *arg, socklen_t len) { char addr[64]; struct sockaddr_in *lsin; struct sockaddr_in6 *lsin6; struct sockaddr_un *sun; struct sockaddr *sa; u_char *q; pid_t pid = trussinfo->curthread->proc->pid; if (arg == NULL) { fputs("NULL", fp); return; } /* If the length is too small, just bail. 
*/ if (len < sizeof(*sa)) { fprintf(fp, "%p", arg); return; } sa = calloc(1, len); if (get_struct(pid, arg, sa, len) == -1) { free(sa); fprintf(fp, "%p", arg); return; } switch (sa->sa_family) { case AF_INET: if (len < sizeof(*lsin)) goto sockaddr_short; lsin = (struct sockaddr_in *)(void *)sa; inet_ntop(AF_INET, &lsin->sin_addr, addr, sizeof(addr)); fprintf(fp, "{ AF_INET %s:%d }", addr, htons(lsin->sin_port)); break; case AF_INET6: if (len < sizeof(*lsin6)) goto sockaddr_short; lsin6 = (struct sockaddr_in6 *)(void *)sa; inet_ntop(AF_INET6, &lsin6->sin6_addr, addr, sizeof(addr)); fprintf(fp, "{ AF_INET6 [%s]:%d }", addr, htons(lsin6->sin6_port)); break; case AF_UNIX: sun = (struct sockaddr_un *)sa; fprintf(fp, "{ AF_UNIX \"%.*s\" }", (int)(len - offsetof(struct sockaddr_un, sun_path)), sun->sun_path); break; default: sockaddr_short: fprintf(fp, "{ sa_len = %d, sa_family = %d, sa_data = {", (int)sa->sa_len, (int)sa->sa_family); for (q = (u_char *)sa->sa_data; q < (u_char *)sa + len; q++) fprintf(fp, "%s 0x%02x", q == (u_char *)sa->sa_data ? "" : ",", *q); fputs(" } }", fp); } free(sa); } #define IOV_LIMIT 16 static void print_iovec(FILE *fp, struct trussinfo *trussinfo, void *arg, int iovcnt) { struct iovec iov[IOV_LIMIT]; size_t max_string = trussinfo->strsize; char tmp2[max_string + 1], *tmp3; size_t len; pid_t pid = trussinfo->curthread->proc->pid; int i; bool buf_truncated, iov_truncated; if (iovcnt <= 0) { fprintf(fp, "%p", arg); return; } if (iovcnt > IOV_LIMIT) { iovcnt = IOV_LIMIT; iov_truncated = true; } else { iov_truncated = false; } if (get_struct(pid, arg, &iov, iovcnt * sizeof(struct iovec)) == -1) { fprintf(fp, "%p", arg); return; } fputs("[", fp); for (i = 0; i < iovcnt; i++) { len = iov[i].iov_len; if (len > max_string) { len = max_string; buf_truncated = true; } else { buf_truncated = false; } fprintf(fp, "%s{", (i > 0) ? "," : ""); if (len && get_struct(pid, iov[i].iov_base, &tmp2, len) != -1) { tmp3 = malloc(len * 4 + 1); while (len) { if (strvisx(tmp3, tmp2, len, VIS_CSTYLE|VIS_TAB|VIS_NL) <= (int)max_string) break; len--; buf_truncated = true; } fprintf(fp, "\"%s\"%s", tmp3, buf_truncated ? "..." : ""); free(tmp3); } else { fprintf(fp, "%p", iov[i].iov_base); } fprintf(fp, ",%zu}", iov[i].iov_len); } fprintf(fp, "%s%s", iov_truncated ? ",..." : "", "]"); } static void print_gen_cmsg(FILE *fp, struct cmsghdr *cmsghdr) { u_char *q; fputs("{", fp); for (q = CMSG_DATA(cmsghdr); q < (u_char *)cmsghdr + cmsghdr->cmsg_len; q++) { fprintf(fp, "%s0x%02x", q == CMSG_DATA(cmsghdr) ? 
"" : ",", *q); } fputs("}", fp); } static void print_sctp_initmsg(FILE *fp, struct sctp_initmsg *init) { fprintf(fp, "{out=%u,", init->sinit_num_ostreams); fprintf(fp, "in=%u,", init->sinit_max_instreams); fprintf(fp, "max_rtx=%u,", init->sinit_max_attempts); fprintf(fp, "max_rto=%u}", init->sinit_max_init_timeo); } static void print_sctp_sndrcvinfo(FILE *fp, bool receive, struct sctp_sndrcvinfo *info) { fprintf(fp, "{sid=%u,", info->sinfo_stream); if (receive) { fprintf(fp, "ssn=%u,", info->sinfo_ssn); } fputs("flgs=", fp); sysdecode_sctp_sinfo_flags(fp, info->sinfo_flags); fprintf(fp, ",ppid=%u,", ntohl(info->sinfo_ppid)); if (!receive) { fprintf(fp, "ctx=%u,", info->sinfo_context); fprintf(fp, "ttl=%u,", info->sinfo_timetolive); } if (receive) { fprintf(fp, "tsn=%u,", info->sinfo_tsn); fprintf(fp, "cumtsn=%u,", info->sinfo_cumtsn); } fprintf(fp, "id=%u}", info->sinfo_assoc_id); } static void print_sctp_sndinfo(FILE *fp, struct sctp_sndinfo *info) { fprintf(fp, "{sid=%u,", info->snd_sid); fputs("flgs=", fp); print_mask_arg(sysdecode_sctp_snd_flags, fp, info->snd_flags); fprintf(fp, ",ppid=%u,", ntohl(info->snd_ppid)); fprintf(fp, "ctx=%u,", info->snd_context); fprintf(fp, "id=%u}", info->snd_assoc_id); } static void print_sctp_rcvinfo(FILE *fp, struct sctp_rcvinfo *info) { fprintf(fp, "{sid=%u,", info->rcv_sid); fprintf(fp, "ssn=%u,", info->rcv_ssn); fputs("flgs=", fp); print_mask_arg(sysdecode_sctp_rcv_flags, fp, info->rcv_flags); fprintf(fp, ",ppid=%u,", ntohl(info->rcv_ppid)); fprintf(fp, "tsn=%u,", info->rcv_tsn); fprintf(fp, "cumtsn=%u,", info->rcv_cumtsn); fprintf(fp, "ctx=%u,", info->rcv_context); fprintf(fp, "id=%u}", info->rcv_assoc_id); } static void print_sctp_nxtinfo(FILE *fp, struct sctp_nxtinfo *info) { fprintf(fp, "{sid=%u,", info->nxt_sid); fputs("flgs=", fp); print_mask_arg(sysdecode_sctp_nxt_flags, fp, info->nxt_flags); fprintf(fp, ",ppid=%u,", ntohl(info->nxt_ppid)); fprintf(fp, "len=%u,", info->nxt_length); fprintf(fp, "id=%u}", info->nxt_assoc_id); } static void print_sctp_prinfo(FILE *fp, struct sctp_prinfo *info) { fputs("{pol=", fp); print_integer_arg(sysdecode_sctp_pr_policy, fp, info->pr_policy); fprintf(fp, ",val=%u}", info->pr_value); } static void print_sctp_authinfo(FILE *fp, struct sctp_authinfo *info) { fprintf(fp, "{num=%u}", info->auth_keynumber); } static void print_sctp_ipv4_addr(FILE *fp, struct in_addr *addr) { char buf[INET_ADDRSTRLEN]; const char *s; s = inet_ntop(AF_INET, addr, buf, INET_ADDRSTRLEN); if (s != NULL) fprintf(fp, "{addr=%s}", s); else fputs("{addr=???}", fp); } static void print_sctp_ipv6_addr(FILE *fp, struct in6_addr *addr) { char buf[INET6_ADDRSTRLEN]; const char *s; s = inet_ntop(AF_INET6, addr, buf, INET6_ADDRSTRLEN); if (s != NULL) fprintf(fp, "{addr=%s}", s); else fputs("{addr=???}", fp); } static void print_sctp_cmsg(FILE *fp, bool receive, struct cmsghdr *cmsghdr) { void *data; socklen_t len; len = cmsghdr->cmsg_len; data = CMSG_DATA(cmsghdr); switch (cmsghdr->cmsg_type) { case SCTP_INIT: if (len == CMSG_LEN(sizeof(struct sctp_initmsg))) print_sctp_initmsg(fp, (struct sctp_initmsg *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_SNDRCV: if (len == CMSG_LEN(sizeof(struct sctp_sndrcvinfo))) print_sctp_sndrcvinfo(fp, receive, (struct sctp_sndrcvinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; #if 0 case SCTP_EXTRCV: if (len == CMSG_LEN(sizeof(struct sctp_extrcvinfo))) print_sctp_extrcvinfo(fp, (struct sctp_extrcvinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; #endif case SCTP_SNDINFO: if (len == 
CMSG_LEN(sizeof(struct sctp_sndinfo))) print_sctp_sndinfo(fp, (struct sctp_sndinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_RCVINFO: if (len == CMSG_LEN(sizeof(struct sctp_rcvinfo))) print_sctp_rcvinfo(fp, (struct sctp_rcvinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_NXTINFO: if (len == CMSG_LEN(sizeof(struct sctp_nxtinfo))) print_sctp_nxtinfo(fp, (struct sctp_nxtinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_PRINFO: if (len == CMSG_LEN(sizeof(struct sctp_prinfo))) print_sctp_prinfo(fp, (struct sctp_prinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_AUTHINFO: if (len == CMSG_LEN(sizeof(struct sctp_authinfo))) print_sctp_authinfo(fp, (struct sctp_authinfo *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_DSTADDRV4: if (len == CMSG_LEN(sizeof(struct in_addr))) print_sctp_ipv4_addr(fp, (struct in_addr *)data); else print_gen_cmsg(fp, cmsghdr); break; case SCTP_DSTADDRV6: if (len == CMSG_LEN(sizeof(struct in6_addr))) print_sctp_ipv6_addr(fp, (struct in6_addr *)data); else print_gen_cmsg(fp, cmsghdr); break; default: print_gen_cmsg(fp, cmsghdr); } } static void print_cmsgs(FILE *fp, pid_t pid, bool receive, struct msghdr *msghdr) { struct cmsghdr *cmsghdr; char *cmsgbuf; const char *temp; socklen_t len; int level, type; bool first; len = msghdr->msg_controllen; if (len == 0) { fputs("{}", fp); return; } cmsgbuf = calloc(1, len); if (get_struct(pid, msghdr->msg_control, cmsgbuf, len) == -1) { fprintf(fp, "%p", msghdr->msg_control); free(cmsgbuf); return; } msghdr->msg_control = cmsgbuf; first = true; fputs("{", fp); for (cmsghdr = CMSG_FIRSTHDR(msghdr); cmsghdr != NULL; cmsghdr = CMSG_NXTHDR(msghdr, cmsghdr)) { level = cmsghdr->cmsg_level; type = cmsghdr->cmsg_type; len = cmsghdr->cmsg_len; fprintf(fp, "%s{level=", first ? "" : ","); print_integer_arg(sysdecode_sockopt_level, fp, level); fputs(",type=", fp); temp = sysdecode_cmsg_type(level, type); if (temp) { fputs(temp, fp); } else { fprintf(fp, "%d", type); } fputs(",data=", fp); switch (level) { case IPPROTO_SCTP: print_sctp_cmsg(fp, receive, cmsghdr); break; default: print_gen_cmsg(fp, cmsghdr); break; } fputs("}", fp); first = false; } fputs("}", fp); free(cmsgbuf); } /* * Converts a syscall argument into a string. Said string is * allocated via malloc(), so needs to be free()'d. sc is * a pointer to the syscall description (see above); args is * an array of all of the system call arguments. */ char * print_arg(struct syscall_args *sc, unsigned long *args, long *retval, struct trussinfo *trussinfo) { FILE *fp; char *tmp; size_t tmplen; pid_t pid; fp = open_memstream(&tmp, &tmplen); pid = trussinfo->curthread->proc->pid; switch (sc->type & ARG_MASK) { case Hex: fprintf(fp, "0x%x", (int)args[sc->offset]); break; case Octal: fprintf(fp, "0%o", (int)args[sc->offset]); break; case Int: fprintf(fp, "%d", (int)args[sc->offset]); break; case UInt: fprintf(fp, "%u", (unsigned int)args[sc->offset]); break; case PUInt: { unsigned int val; if (get_struct(pid, (void *)args[sc->offset], &val, sizeof(val)) == 0) fprintf(fp, "{ %u }", val); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case LongHex: fprintf(fp, "0x%lx", args[sc->offset]); break; case Long: fprintf(fp, "%ld", args[sc->offset]); break; case Sizet: fprintf(fp, "%zu", (size_t)args[sc->offset]); break; case Name: { /* NULL-terminated string. 
*/ char *tmp2; tmp2 = get_string(pid, (void*)args[sc->offset], 0); fprintf(fp, "\"%s\"", tmp2); free(tmp2); break; } case BinString: { /* * Binary block of data that might have printable characters. * XXX If type|OUT, assume that the length is the syscall's * return value. Otherwise, assume that the length of the block * is in the next syscall argument. */ int max_string = trussinfo->strsize; char tmp2[max_string + 1], *tmp3; int len; int truncated = 0; if (sc->type & OUT) len = retval[0]; else len = args[sc->offset + 1]; /* * Don't print more than max_string characters, to avoid word * wrap. If we have to truncate put some ... after the string. */ if (len > max_string) { len = max_string; truncated = 1; } if (len && get_struct(pid, (void*)args[sc->offset], &tmp2, len) != -1) { tmp3 = malloc(len * 4 + 1); while (len) { if (strvisx(tmp3, tmp2, len, VIS_CSTYLE|VIS_TAB|VIS_NL) <= max_string) break; len--; truncated = 1; } fprintf(fp, "\"%s\"%s", tmp3, truncated ? "..." : ""); free(tmp3); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case ExecArgs: case ExecEnv: case StringArray: { uintptr_t addr; union { char *strarray[0]; char buf[PAGE_SIZE]; } u; char *string; size_t len; u_int first, i; /* * Only parse argv[] and environment arrays from exec calls * if requested. */ if (((sc->type & ARG_MASK) == ExecArgs && (trussinfo->flags & EXECVEARGS) == 0) || ((sc->type & ARG_MASK) == ExecEnv && (trussinfo->flags & EXECVEENVS) == 0)) { fprintf(fp, "0x%lx", args[sc->offset]); break; } /* * Read a page of pointers at a time. Punt if the top-level * pointer is not aligned. Note that the first read is of * a partial page. */ addr = args[sc->offset]; if (addr % sizeof(char *) != 0) { fprintf(fp, "0x%lx", args[sc->offset]); break; } len = PAGE_SIZE - (addr & PAGE_MASK); if (get_struct(pid, (void *)addr, u.buf, len) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } fputc('[', fp); first = 1; i = 0; while (u.strarray[i] != NULL) { string = get_string(pid, u.strarray[i], 0); fprintf(fp, "%s \"%s\"", first ? "" : ",", string); free(string); first = 0; i++; if (i == len / sizeof(char *)) { addr += len; len = PAGE_SIZE; if (get_struct(pid, (void *)addr, u.buf, len) == -1) { fprintf(fp, ", "); break; } i = 0; } } fputs(" ]", fp); break; } #ifdef __LP64__ case Quad: fprintf(fp, "%ld", args[sc->offset]); break; case QuadHex: fprintf(fp, "0x%lx", args[sc->offset]); break; #else case Quad: case QuadHex: { unsigned long long ll; #if _BYTE_ORDER == _LITTLE_ENDIAN ll = (unsigned long long)args[sc->offset + 1] << 32 | args[sc->offset]; #else ll = (unsigned long long)args[sc->offset] << 32 | args[sc->offset + 1]; #endif if ((sc->type & ARG_MASK) == Quad) fprintf(fp, "%lld", ll); else fprintf(fp, "0x%llx", ll); break; } #endif case PQuadHex: { uint64_t val; if (get_struct(pid, (void *)args[sc->offset], &val, sizeof(val)) == 0) fprintf(fp, "{ 0x%jx }", (uintmax_t)val); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Ptr: fprintf(fp, "0x%lx", args[sc->offset]); break; case Readlinkres: { char *tmp2; if (retval[0] == -1) break; tmp2 = get_string(pid, (void*)args[sc->offset], retval[0]); fprintf(fp, "\"%s\"", tmp2); free(tmp2); break; } case Ioctl: { const char *temp; unsigned long cmd; cmd = args[sc->offset]; temp = sysdecode_ioctlname(cmd); if (temp) fputs(temp, fp); else { fprintf(fp, "0x%lx { IO%s%s 0x%lx('%c'), %lu, %lu }", cmd, cmd & IOC_OUT ? "R" : "", cmd & IOC_IN ? "W" : "", IOCGROUP(cmd), isprint(IOCGROUP(cmd)) ? 
(char)IOCGROUP(cmd) : '?', cmd & 0xFF, IOCPARM_LEN(cmd)); } break; } case Timespec: { struct timespec ts; if (get_struct(pid, (void *)args[sc->offset], &ts, sizeof(ts)) != -1) fprintf(fp, "{ %jd.%09ld }", (intmax_t)ts.tv_sec, ts.tv_nsec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timespec2: { struct timespec ts[2]; const char *sep; unsigned int i; if (get_struct(pid, (void *)args[sc->offset], &ts, sizeof(ts)) != -1) { fputs("{ ", fp); sep = ""; for (i = 0; i < nitems(ts); i++) { fputs(sep, fp); sep = ", "; switch (ts[i].tv_nsec) { case UTIME_NOW: fprintf(fp, "UTIME_NOW"); break; case UTIME_OMIT: fprintf(fp, "UTIME_OMIT"); break; default: fprintf(fp, "%jd.%09ld", (intmax_t)ts[i].tv_sec, ts[i].tv_nsec); break; } } fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timeval: { struct timeval tv; if (get_struct(pid, (void *)args[sc->offset], &tv, sizeof(tv)) != -1) fprintf(fp, "{ %jd.%06ld }", (intmax_t)tv.tv_sec, tv.tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Timeval2: { struct timeval tv[2]; if (get_struct(pid, (void *)args[sc->offset], &tv, sizeof(tv)) != -1) fprintf(fp, "{ %jd.%06ld, %jd.%06ld }", (intmax_t)tv[0].tv_sec, tv[0].tv_usec, (intmax_t)tv[1].tv_sec, tv[1].tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Itimerval: { struct itimerval itv; if (get_struct(pid, (void *)args[sc->offset], &itv, sizeof(itv)) != -1) fprintf(fp, "{ %jd.%06ld, %jd.%06ld }", (intmax_t)itv.it_interval.tv_sec, itv.it_interval.tv_usec, (intmax_t)itv.it_value.tv_sec, itv.it_value.tv_usec); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case LinuxSockArgs: { struct linux_socketcall_args largs; if (get_struct(pid, (void *)args[sc->offset], (void *)&largs, sizeof(largs)) != -1) fprintf(fp, "{ %s, 0x%lx }", lookup(linux_socketcall_ops, largs.what, 10), (long unsigned int)largs.args); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Pollfd: { /* * XXX: A Pollfd argument expects the /next/ syscall argument * to be the number of fds in the array. This matches the poll * syscall. */ struct pollfd *pfd; int numfds = args[sc->offset + 1]; size_t bytes = sizeof(struct pollfd) * numfds; int i; if ((pfd = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for pollfd array", bytes); if (get_struct(pid, (void *)args[sc->offset], pfd, bytes) != -1) { fputs("{", fp); for (i = 0; i < numfds; i++) { fprintf(fp, " %d/%s", pfd[i].fd, xlookup_bits(poll_flags, pfd[i].events)); } fputs(" }", fp); } else { fprintf(fp, "0x%lx", args[sc->offset]); } free(pfd); break; } case Fd_set: { /* * XXX: A Fd_set argument expects the /first/ syscall argument * to be the number of fds in the array. This matches the * select syscall. */ fd_set *fds; int numfds = args[0]; size_t bytes = _howmany(numfds, _NFDBITS) * _NFDBITS; int i; if ((fds = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for fd_set array", bytes); if (get_struct(pid, (void *)args[sc->offset], fds, bytes) != -1) { fputs("{", fp); for (i = 0; i < numfds; i++) { if (FD_ISSET(i, fds)) fprintf(fp, " %d", i); } fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); free(fds); break; } case Signal: fputs(strsig2(args[sc->offset]), fp); break; case Sigset: { long sig; sigset_t ss; int i, first; sig = args[sc->offset]; if (get_struct(pid, (void *)args[sc->offset], (void *)&ss, sizeof(ss)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } fputs("{ ", fp); first = 1; for (i = 1; i < sys_nsig; i++) { if (sigismember(&ss, i)) { fprintf(fp, "%s%s", !first ? 
"|" : "", strsig2(i)); first = 0; } } if (!first) fputc(' ', fp); fputc('}', fp); break; } case Sigprocmask: print_integer_arg(sysdecode_sigprocmask_how, fp, args[sc->offset]); break; case Fcntlflag: /* XXX: Output depends on the value of the previous argument. */ if (sysdecode_fcntl_arg_p(args[sc->offset - 1])) sysdecode_fcntl_arg(fp, args[sc->offset - 1], args[sc->offset], 16); break; case Open: print_mask_arg(sysdecode_open_flags, fp, args[sc->offset]); break; case Fcntl: print_integer_arg(sysdecode_fcntl_cmd, fp, args[sc->offset]); break; case Mprot: print_mask_arg(sysdecode_mmap_prot, fp, args[sc->offset]); break; case Mmapflags: print_mask_arg(sysdecode_mmap_flags, fp, args[sc->offset]); break; case Whence: print_integer_arg(sysdecode_whence, fp, args[sc->offset]); break; case Sockdomain: print_integer_arg(sysdecode_socketdomain, fp, args[sc->offset]); break; case Socktype: print_mask_arg(sysdecode_socket_type, fp, args[sc->offset]); break; case Shutdown: print_integer_arg(sysdecode_shutdown_how, fp, args[sc->offset]); break; case Resource: print_integer_arg(sysdecode_rlimit, fp, args[sc->offset]); break; case RusageWho: print_integer_arg(sysdecode_getrusage_who, fp, args[sc->offset]); break; case Pathconf: print_integer_arg(sysdecode_pathconf_name, fp, args[sc->offset]); break; case Rforkflags: print_mask_arg(sysdecode_rfork_flags, fp, args[sc->offset]); break; case Sockaddr: { socklen_t len; if (args[sc->offset] == 0) { fputs("NULL", fp); break; } /* * Extract the address length from the next argument. If * this is an output sockaddr (OUT is set), then the * next argument is a pointer to a socklen_t. Otherwise * the next argument contains a socklen_t by value. */ if (sc->type & OUT) { if (get_struct(pid, (void *)args[sc->offset + 1], &len, sizeof(len)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } } else len = args[sc->offset + 1]; print_sockaddr(fp, trussinfo, (void *)args[sc->offset], len); break; } case Sigaction: { struct sigaction sa; if (get_struct(pid, (void *)args[sc->offset], &sa, sizeof(sa)) != -1) { fputs("{ ", fp); if (sa.sa_handler == SIG_DFL) fputs("SIG_DFL", fp); else if (sa.sa_handler == SIG_IGN) fputs("SIG_IGN", fp); else fprintf(fp, "%p", sa.sa_handler); fprintf(fp, " %s ss_t }", xlookup_bits(sigaction_flags, sa.sa_flags)); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Kevent: { /* * XXX XXX: The size of the array is determined by either the * next syscall argument, or by the syscall return value, * depending on which argument number we are. This matches the * kevent syscall, but luckily that's the only syscall that uses * them. 
*/ struct kevent *ke; int numevents = -1; size_t bytes; int i; if (sc->offset == 1) numevents = args[sc->offset+1]; else if (sc->offset == 3 && retval[0] != -1) numevents = retval[0]; if (numevents >= 0) { bytes = sizeof(struct kevent) * numevents; if ((ke = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for kevent array", bytes); } else ke = NULL; if (numevents >= 0 && get_struct(pid, (void *)args[sc->offset], ke, bytes) != -1) { fputc('{', fp); for (i = 0; i < numevents; i++) { fputc(' ', fp); print_kevent(fp, &ke[i]); } fputs(" }", fp); } else { fprintf(fp, "0x%lx", args[sc->offset]); } free(ke); break; } case Kevent11: { struct kevent_freebsd11 *ke11; struct kevent ke; int numevents = -1; size_t bytes; int i; if (sc->offset == 1) numevents = args[sc->offset+1]; else if (sc->offset == 3 && retval[0] != -1) numevents = retval[0]; if (numevents >= 0) { bytes = sizeof(struct kevent_freebsd11) * numevents; if ((ke11 = malloc(bytes)) == NULL) err(1, "Cannot malloc %zu bytes for kevent array", bytes); } else ke11 = NULL; memset(&ke, 0, sizeof(ke)); if (numevents >= 0 && get_struct(pid, (void *)args[sc->offset], ke11, bytes) != -1) { fputc('{', fp); for (i = 0; i < numevents; i++) { fputc(' ', fp); ke.ident = ke11[i].ident; ke.filter = ke11[i].filter; ke.flags = ke11[i].flags; ke.fflags = ke11[i].fflags; ke.data = ke11[i].data; ke.udata = ke11[i].udata; print_kevent(fp, &ke); } fputs(" }", fp); } else { fprintf(fp, "0x%lx", args[sc->offset]); } free(ke11); break; } case Stat: { struct stat st; if (get_struct(pid, (void *)args[sc->offset], &st, sizeof(st)) != -1) { char mode[12]; strmode(st.st_mode, mode); fprintf(fp, "{ mode=%s,inode=%ju,size=%jd,blksize=%ld }", mode, (uintmax_t)st.st_ino, (intmax_t)st.st_size, (long)st.st_blksize); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case Stat11: { struct freebsd11_stat st; if (get_struct(pid, (void *)args[sc->offset], &st, sizeof(st)) != -1) { char mode[12]; strmode(st.st_mode, mode); fprintf(fp, "{ mode=%s,inode=%ju,size=%jd,blksize=%ld }", mode, (uintmax_t)st.st_ino, (intmax_t)st.st_size, (long)st.st_blksize); } else { fprintf(fp, "0x%lx", args[sc->offset]); } break; } case StatFs: { unsigned int i; struct statfs buf; if (get_struct(pid, (void *)args[sc->offset], &buf, sizeof(buf)) != -1) { char fsid[17]; bzero(fsid, sizeof(fsid)); if (buf.f_fsid.val[0] != 0 || buf.f_fsid.val[1] != 0) { for (i = 0; i < sizeof(buf.f_fsid); i++) snprintf(&fsid[i*2], sizeof(fsid) - (i*2), "%02x", ((u_char *)&buf.f_fsid)[i]); } fprintf(fp, "{ fstypename=%s,mntonname=%s,mntfromname=%s," "fsid=%s }", buf.f_fstypename, buf.f_mntonname, buf.f_mntfromname, fsid); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Rusage: { struct rusage ru; if (get_struct(pid, (void *)args[sc->offset], &ru, sizeof(ru)) != -1) { fprintf(fp, "{ u=%jd.%06ld,s=%jd.%06ld,in=%ld,out=%ld }", (intmax_t)ru.ru_utime.tv_sec, ru.ru_utime.tv_usec, (intmax_t)ru.ru_stime.tv_sec, ru.ru_stime.tv_usec, ru.ru_inblock, ru.ru_oublock); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Rlimit: { struct rlimit rl; if (get_struct(pid, (void *)args[sc->offset], &rl, sizeof(rl)) != -1) { fprintf(fp, "{ cur=%ju,max=%ju }", rl.rlim_cur, rl.rlim_max); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case ExitStatus: { int status; if (get_struct(pid, (void *)args[sc->offset], &status, sizeof(status)) != -1) { fputs("{ ", fp); if (WIFCONTINUED(status)) fputs("CONTINUED", fp); else if (WIFEXITED(status)) fprintf(fp, "EXITED,val=%d", WEXITSTATUS(status)); else if 
(WIFSIGNALED(status)) fprintf(fp, "SIGNALED,sig=%s%s", strsig2(WTERMSIG(status)), WCOREDUMP(status) ? ",cored" : ""); else fprintf(fp, "STOPPED,sig=%s", strsig2(WTERMSIG(status))); fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Waitoptions: print_mask_arg(sysdecode_wait6_options, fp, args[sc->offset]); break; case Idtype: print_integer_arg(sysdecode_idtype, fp, args[sc->offset]); break; case Procctl: print_integer_arg(sysdecode_procctl_cmd, fp, args[sc->offset]); break; case Umtxop: print_integer_arg(sysdecode_umtx_op, fp, args[sc->offset]); break; case Atfd: print_integer_arg(sysdecode_atfd, fp, args[sc->offset]); break; case Atflags: print_mask_arg(sysdecode_atflags, fp, args[sc->offset]); break; case Accessmode: print_mask_arg(sysdecode_access_mode, fp, args[sc->offset]); break; case Sysarch: print_integer_arg(sysdecode_sysarch_number, fp, args[sc->offset]); break; case PipeFds: /* * The pipe() system call in the kernel returns its * two file descriptors via return values. However, * the interface exposed by libc is that pipe() * accepts a pointer to an array of descriptors. * Format the output to match the libc API by printing * the returned file descriptors as a fake argument. * * Overwrite the first retval to signal a successful * return as well. */ fprintf(fp, "{ %ld, %ld }", retval[0], retval[1]); retval[0] = 0; break; case Utrace: { size_t len; void *utrace_addr; len = args[sc->offset + 1]; utrace_addr = calloc(1, len); if (get_struct(pid, (void *)args[sc->offset], (void *)utrace_addr, len) != -1) print_utrace(fp, utrace_addr, len); else fprintf(fp, "0x%lx", args[sc->offset]); free(utrace_addr); break; } case IntArray: { int descriptors[16]; unsigned long i, ndescriptors; bool truncated; ndescriptors = args[sc->offset + 1]; truncated = false; if (ndescriptors > nitems(descriptors)) { ndescriptors = nitems(descriptors); truncated = true; } if (get_struct(pid, (void *)args[sc->offset], descriptors, ndescriptors * sizeof(descriptors[0])) != -1) { fprintf(fp, "{"); for (i = 0; i < ndescriptors; i++) fprintf(fp, i == 0 ? " %d" : ", %d", descriptors[i]); fprintf(fp, truncated ? ", ... 
}" : " }"); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Pipe2: print_mask_arg(sysdecode_pipe2_flags, fp, args[sc->offset]); break; case CapFcntlRights: { uint32_t rights; if (sc->type & OUT) { if (get_struct(pid, (void *)args[sc->offset], &rights, sizeof(rights)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } } else rights = args[sc->offset]; print_mask_arg32(sysdecode_cap_fcntlrights, fp, rights); break; } case Fadvice: print_integer_arg(sysdecode_fadvice, fp, args[sc->offset]); break; case FileFlags: { fflags_t rem; if (!sysdecode_fileflags(fp, args[sc->offset], &rem)) fprintf(fp, "0x%x", rem); else if (rem != 0) fprintf(fp, "|0x%x", rem); break; } case Flockop: print_mask_arg(sysdecode_flock_operation, fp, args[sc->offset]); break; case Getfsstatmode: print_integer_arg(sysdecode_getfsstat_mode, fp, args[sc->offset]); break; case Kldsymcmd: print_integer_arg(sysdecode_kldsym_cmd, fp, args[sc->offset]); break; case Kldunloadflags: print_integer_arg(sysdecode_kldunload_flags, fp, args[sc->offset]); break; case Madvice: print_integer_arg(sysdecode_madvice, fp, args[sc->offset]); break; case Socklent: fprintf(fp, "%u", (socklen_t)args[sc->offset]); break; case Sockprotocol: { const char *temp; int domain, protocol; domain = args[sc->offset - 2]; protocol = args[sc->offset]; if (protocol == 0) { fputs("0", fp); } else { temp = sysdecode_socket_protocol(domain, protocol); if (temp) { fputs(temp, fp); } else { fprintf(fp, "%d", protocol); } } break; } case Sockoptlevel: print_integer_arg(sysdecode_sockopt_level, fp, args[sc->offset]); break; case Sockoptname: { const char *temp; int level, name; level = args[sc->offset - 1]; name = args[sc->offset]; temp = sysdecode_sockopt_name(level, name); if (temp) { fputs(temp, fp); } else { fprintf(fp, "%d", name); } break; } case Msgflags: print_mask_arg(sysdecode_msg_flags, fp, args[sc->offset]); break; case CapRights: { cap_rights_t rights; if (get_struct(pid, (void *)args[sc->offset], &rights, sizeof(rights)) != -1) { fputs("{ ", fp); sysdecode_cap_rights(fp, &rights); fputs(" }", fp); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Acltype: print_integer_arg(sysdecode_acltype, fp, args[sc->offset]); break; case Extattrnamespace: print_integer_arg(sysdecode_extattrnamespace, fp, args[sc->offset]); break; case Minherit: print_integer_arg(sysdecode_minherit_inherit, fp, args[sc->offset]); break; case Mlockall: print_mask_arg(sysdecode_mlockall_flags, fp, args[sc->offset]); break; case Mountflags: print_mask_arg(sysdecode_mount_flags, fp, args[sc->offset]); break; case Msync: print_mask_arg(sysdecode_msync_flags, fp, args[sc->offset]); break; case Priowhich: print_integer_arg(sysdecode_prio_which, fp, args[sc->offset]); break; case Ptraceop: print_integer_arg(sysdecode_ptrace_request, fp, args[sc->offset]); break; case Quotactlcmd: if (!sysdecode_quotactl_cmd(fp, args[sc->offset])) fprintf(fp, "%#x", (int)args[sc->offset]); break; case Reboothowto: print_mask_arg(sysdecode_reboot_howto, fp, args[sc->offset]); break; case Rtpriofunc: print_integer_arg(sysdecode_rtprio_function, fp, args[sc->offset]); break; case Schedpolicy: print_integer_arg(sysdecode_scheduler_policy, fp, args[sc->offset]); break; case Schedparam: { struct sched_param sp; if (get_struct(pid, (void *)args[sc->offset], &sp, sizeof(sp)) != -1) fprintf(fp, "{ %d }", sp.sched_priority); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case PSig: { int sig; if (get_struct(pid, (void *)args[sc->offset], &sig, sizeof(sig)) == 0) fprintf(fp, "{ %s 
}", strsig2(sig)); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Siginfo: { siginfo_t si; if (get_struct(pid, (void *)args[sc->offset], &si, sizeof(si)) != -1) { fprintf(fp, "{ signo=%s", strsig2(si.si_signo)); decode_siginfo(fp, &si); fprintf(fp, " }"); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case Iovec: /* * Print argument as an array of struct iovec, where the next * syscall argument is the number of elements of the array. */ print_iovec(fp, trussinfo, (void *)args[sc->offset], (int)args[sc->offset + 1]); break; case Sctpsndrcvinfo: { struct sctp_sndrcvinfo info; if (get_struct(pid, (void *)args[sc->offset], &info, sizeof(struct sctp_sndrcvinfo)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } print_sctp_sndrcvinfo(fp, sc->type & OUT, &info); break; } case Msghdr: { struct msghdr msghdr; if (get_struct(pid, (void *)args[sc->offset], &msghdr, sizeof(struct msghdr)) == -1) { fprintf(fp, "0x%lx", args[sc->offset]); break; } fputs("{", fp); print_sockaddr(fp, trussinfo, msghdr.msg_name, msghdr.msg_namelen); fprintf(fp, ",%d,", msghdr.msg_namelen); print_iovec(fp, trussinfo, msghdr.msg_iov, msghdr.msg_iovlen); fprintf(fp, ",%d,", msghdr.msg_iovlen); print_cmsgs(fp, pid, sc->type & OUT, &msghdr); fprintf(fp, ",%u,", msghdr.msg_controllen); print_mask_arg(sysdecode_msg_flags, fp, msghdr.msg_flags); fputs("}", fp); break; } case CloudABIAdvice: fputs(xlookup(cloudabi_advice, args[sc->offset]), fp); break; case CloudABIClockID: fputs(xlookup(cloudabi_clockid, args[sc->offset]), fp); break; case CloudABIFDSFlags: fputs(xlookup_bits(cloudabi_fdsflags, args[sc->offset]), fp); break; case CloudABIFDStat: { cloudabi_fdstat_t fds; if (get_struct(pid, (void *)args[sc->offset], &fds, sizeof(fds)) != -1) { fprintf(fp, "{ %s, ", xlookup(cloudabi_filetype, fds.fs_filetype)); fprintf(fp, "%s, ... 
}", xlookup_bits(cloudabi_fdflags, fds.fs_flags)); } else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABIFileStat: { cloudabi_filestat_t fsb; if (get_struct(pid, (void *)args[sc->offset], &fsb, sizeof(fsb)) != -1) fprintf(fp, "{ %s, %ju }", xlookup(cloudabi_filetype, fsb.st_filetype), (uintmax_t)fsb.st_size); else fprintf(fp, "0x%lx", args[sc->offset]); break; } case CloudABIFileType: fputs(xlookup(cloudabi_filetype, args[sc->offset]), fp); break; case CloudABIFSFlags: fputs(xlookup_bits(cloudabi_fsflags, args[sc->offset]), fp); break; case CloudABILookup: if ((args[sc->offset] & CLOUDABI_LOOKUP_SYMLINK_FOLLOW) != 0) fprintf(fp, "%d|LOOKUP_SYMLINK_FOLLOW", (int)args[sc->offset]); else fprintf(fp, "%d", (int)args[sc->offset]); break; case CloudABIMFlags: fputs(xlookup_bits(cloudabi_mflags, args[sc->offset]), fp); break; case CloudABIMProt: fputs(xlookup_bits(cloudabi_mprot, args[sc->offset]), fp); break; case CloudABIMSFlags: fputs(xlookup_bits(cloudabi_msflags, args[sc->offset]), fp); break; case CloudABIOFlags: fputs(xlookup_bits(cloudabi_oflags, args[sc->offset]), fp); break; case CloudABISDFlags: fputs(xlookup_bits(cloudabi_sdflags, args[sc->offset]), fp); break; case CloudABISignal: fputs(xlookup(cloudabi_signal, args[sc->offset]), fp); break; case CloudABITimestamp: fprintf(fp, "%lu.%09lus", args[sc->offset] / 1000000000, args[sc->offset] % 1000000000); break; case CloudABIULFlags: fputs(xlookup_bits(cloudabi_ulflags, args[sc->offset]), fp); break; case CloudABIWhence: fputs(xlookup(cloudabi_whence, args[sc->offset]), fp); break; default: errx(1, "Invalid argument type %d\n", sc->type & ARG_MASK); } fclose(fp); return (tmp); } /* * Print (to outfile) the system call and its arguments. */ void print_syscall(struct trussinfo *trussinfo) { struct threadinfo *t; const char *name; char **s_args; int i, len, nargs; t = trussinfo->curthread; name = t->cs.sc->name; nargs = t->cs.nargs; s_args = t->cs.s_args; len = print_line_prefix(trussinfo); len += fprintf(trussinfo->outfile, "%s(", name); for (i = 0; i < nargs; i++) { if (s_args[i] != NULL) len += fprintf(trussinfo->outfile, "%s", s_args[i]); else len += fprintf(trussinfo->outfile, ""); len += fprintf(trussinfo->outfile, "%s", i < (nargs - 1) ? "," : ""); } len += fprintf(trussinfo->outfile, ")"); for (i = 0; i < 6 - (len / 8); i++) fprintf(trussinfo->outfile, "\t"); } void print_syscall_ret(struct trussinfo *trussinfo, int errorp, long *retval) { struct timespec timediff; struct threadinfo *t; struct syscall *sc; int error; t = trussinfo->curthread; sc = t->cs.sc; if (trussinfo->flags & COUNTONLY) { timespecsub(&t->after, &t->before, &timediff); timespecadd(&sc->time, &timediff, &sc->time); sc->ncalls++; if (errorp) sc->nerror++; return; } print_syscall(trussinfo); fflush(trussinfo->outfile); if (retval == NULL) { /* * This system call resulted in the current thread's exit, * so there is no return value or error to display. */ fprintf(trussinfo->outfile, "\n"); return; } if (errorp) { error = sysdecode_abi_to_freebsd_errno(t->proc->abi->abi, retval[0]); fprintf(trussinfo->outfile, " ERR#%ld '%s'\n", retval[0], error == INT_MAX ? 
"Unknown error" : strerror(error)); } #ifndef __LP64__ else if (sc->ret_type == 2) { off_t off; #if _BYTE_ORDER == _LITTLE_ENDIAN off = (off_t)retval[1] << 32 | retval[0]; #else off = (off_t)retval[0] << 32 | retval[1]; #endif fprintf(trussinfo->outfile, " = %jd (0x%jx)\n", (intmax_t)off, (intmax_t)off); } #endif else fprintf(trussinfo->outfile, " = %ld (0x%lx)\n", retval[0], retval[0]); } void print_summary(struct trussinfo *trussinfo) { struct timespec total = {0, 0}; struct syscall *sc; int ncall, nerror; fprintf(trussinfo->outfile, "%-20s%15s%8s%8s\n", "syscall", "seconds", "calls", "errors"); ncall = nerror = 0; STAILQ_FOREACH(sc, &syscalls, entries) if (sc->ncalls) { fprintf(trussinfo->outfile, "%-20s%5jd.%09ld%8d%8d\n", sc->name, (intmax_t)sc->time.tv_sec, sc->time.tv_nsec, sc->ncalls, sc->nerror); timespecadd(&total, &sc->time, &total); ncall += sc->ncalls; nerror += sc->nerror; } fprintf(trussinfo->outfile, "%20s%15s%8s%8s\n", "", "-------------", "-------", "-------"); fprintf(trussinfo->outfile, "%-20s%5jd.%09ld%8d%8d\n", "", (intmax_t)total.tv_sec, total.tv_nsec, ncall, nerror); } Index: projects/openssl111 =================================================================== --- projects/openssl111 (revision 339239) +++ projects/openssl111 (revision 339240) Property changes on: projects/openssl111 ___________________________________________________________________ Modified: svn:mergeinfo ## -0,0 +0,1 ## Merged /head:r339206-339212,339215-339239