Index: releng/10.3/UPDATING
===================================================================
--- releng/10.3/UPDATING	(revision 303983)
+++ releng/10.3/UPDATING	(revision 303984)
@@ -1,2319 +1,2335 @@
 Updating Information for FreeBSD current users
 
 This file is maintained and copyrighted by M. Warner Losh <imp@freebsd.org>.
 See end of file for further details.  For commonly done items, please see the
 COMMON ITEMS: section later in the file.  These instructions assume that you
 basically know what you are doing.  If not, then please consult the FreeBSD
 handbook:
 
     http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
 
 Items affecting the ports and packages system can be found in
 /usr/ports/UPDATING.  Please read that file before running portupgrade.
 
 NOTE: FreeBSD has switched from gcc to clang. If you have trouble bootstrapping
 from older versions of FreeBSD, try WITHOUT_CLANG to bootstrap to the tip of
 stable/10, and then rebuild without this option. The bootstrap process from
 older versions of current is a bit fragile.
 
+20160811	p7	FreeBSD-EN-16:10.dhclient
+			FreeBSD-EN-16:11.vmbus
+			FreeBSD-EN-16:12.hv_storvsc
+			FreeBSD-EN-16:13.vmbus
+			FreeBSD-EN-16:14.hv_storvsc
+			FreeBSD-EN-16:15.vmbus
+			FreeBSD-EN-16:16.hv_storvsc
+
+	Fix handling of unknown options from a DHCP server. [EN-16:10]
+	Fix a panic in hv_vmbus(4). [EN-16:11]
+	Fix missing hotplugged disk in hv_storvsc(4). [EN-16:12]
+	Fix the timecounter emulation in hv_vmbus(4). [EN-16:13]
+	Fix callout(9) handling in hv_storvsc(4). [EN-16:14]
+	Fix memory allocation issues in hv_vmbus(4). [EN-16:15]
+	Fix SCSI command handling in hv_storvsc(4). [EN-16:16]
+
 20160725	p6	FreeBSD-SA-16:25.bspatch
 			FreeBSD-EN-16:09.freebsd-update
 
 	Fix bspatch heap overflow vulnerability. [SA-16:25]
 
 	Fix freebsd-update(8) support of FreeBSD 11.0 release
 	distribution. [EN-16:09]
 
 20160604	p5	FreeBSD-SA-16:24.ntp
 
 	Fix multiple vulnerabilities of ntp.
 
 20160531	p4	FreeBSD-SA-16:20.linux
 			FreeBSD-SA-16:21.43bsd
 			FreeBSD-SA-16:22.libarchive
 
 	Fix kernel stack disclosure in Linux compatibility layer. [SA-16:20]
 	Fix kernel stack disclosure in 4.3BSD compatibility layer. [SA-16:21]
 	Fix directory traversal in cpio(1). [SA-16:22]
 
 
 20160517	p3	FreeBSD-SA-16:18.atkbd
 			FreeBSD-SA-16:19.sendmsg
 
 	Fix buffer overflow in keyboard driver. [SA-16:18]
 
 	Fix incorrect argument handling in sendmsg(2). [SA-16:19]
 
 20160504	p2	FreeBSD-SA-16:17.openssl
 			FreeBSD-EN-16:06.libc
 			FreeBSD-EN-16:07.ipi
 			FreeBSD-EN-16:08.zfs
 
 	Fix multiple OpenSSL vulnerabilities. [SA-16:17]
 
 	Fix performance regression in libc hash(3). [EN-16:06]
 
 	Fix excessive latency in x86 IPI delivery. [EN-16:07]
 
 	Fix memory leak in ZFS. [EN-16:08]
 
 20160429	p1	FreeBSD-SA-16:16.ntp
 
 	Fix multiple vulnerabilities of ntp.
 
 20160329:
 	10.3-RELEASE.
 
 20160124:
 	The NONE and HPN patches have been removed from OpenSSH.  They are
 	still available in the security/openssh-portable port.
 
 20151214:
 	r292223 changed the internal interface between the nfsd.ko and
 	nfscommon.ko modules. As such, they must both be upgraded together.
 	__FreeBSD_version has been bumped because of this.
 
 20151113:
 	Qlogic 24xx/25xx firmware images were updated from 5.5.0 to 7.3.0.
 	Kernel modules isp_2400_multi and isp_2500_multi were removed and
 	should be replaced with isp_2400 and isp_2500 modules respectively.
 
 20150806:
 	The menu.rc and loader.rc files will now be replaced during 
 	upgrades. Please migrate local changes to menu.rc.local and
 	loader.rc.local instead.
 
 20151026:
 	NTP has been upgraded to 4.2.8p4.
 
 20151025:
 	ALLOW_DEPRECATED_ATF_TOOLS/ATFFILE support has been removed from
 	atf.test.mk (included from bsd.test.mk). Please upgrade devel/atf
 	and devel/kyua to version 0.20+ and adjust any calling code to work
 	with Kyuafile and kyua.
 
 20150823:
 	The polarity of Pulse Per Second (PPS) capture events with the
 	uart(4) driver has been corrected.  Prior to this change the PPS
 	"assert" event corresponded to the trailing edge of a positive PPS
 	pulse and the "clear" event was the leading edge of the next pulse.
 	
 	As the width of a PPS pulse in a typical GPS receiver is on the
 	order of 1 millisecond, most users will not notice any significant
 	difference with this change.
 	
 	Anyone who has compensated for the historical polarity reversal by
 	configuring a negative offset equal to the pulse width will need to
 	remove that workaround.
 
 20150822:
 	Support for SATA controllers handled by the more functional ahci(4),
 	siis(4) and mvs(4) drivers was removed from the legacy ata(4) driver.
 	Kernel modules ataahci and ataadaptec were removed completely,
 	replaced by the ahci and mvs modules respectively.
 
 20150813:
 	10.2-RELEASE.
 
 20150731:
 	As ZFS requires more kernel stack pages than is the default on some
 	architectures e.g. i386, it now warns if KSTACK_PAGES is less than
 	ZFS_MIN_KSTACK_PAGES (which is 4 at the time of writing).
 
 	Please consider using 'options KSTACK_PAGES=X' where X is greater
 	than or equal to ZFS_MIN_KSTACK_PAGES i.e. 4 in such configurations.
 
 20150703:
 	The default Unbound configuration now enables remote control
 	using a local socket.  Users who have already enabled the
 	local_unbound service should regenerate their configuration
 	by running "service local_unbound setup" as root.
 
 20150624:
 	An additional fix for the issue described in the 20150614 sendmail
 	entry below has been committed in revision 284786.
 
 20150615:
 	The fix for the issue described in the 20150614 sendmail entry
 	below has been committed in revision 284485.  The workaround
 	described in that entry is no longer needed unless the
 	default setting is overridden by a confDH_PARAMETERS configuration
 	setting of '5' or pointing to a 512 bit DH parameter file.
 
 20150614:
 	The import of openssl to address the FreeBSD-SA-15:10.openssl
 	security advisory includes a change which rejects handshakes
 	with DH parameters below 768 bits.  sendmail releases prior
 	to 8.15.2 (not yet released) defaulted to a 512 bit
 	DH parameter setting for client connections.  To work around
 	this interoperability problem, sendmail can be configured to use
 	a 2048 bit DH parameter by:
 
 	1. Edit /etc/mail/`hostname`.mc 
 	2. If a setting for confDH_PARAMETERS does not exist or
 	   exists and is set to a string beginning with '5',
 	   replace it with '2'.
 	3. If a setting for confDH_PARAMETERS exists and is set to
 	   a file path, create a new file with:
 		openssl dhparam -out /path/to/file 2048
 	4. Rebuild the .cf file:
 		cd /etc/mail/; make; make install
 	5. Restart sendmail:
 		cd /etc/mail/; make restart
 
 	A sendmail patch is coming, at which time this file will be
 	updated.
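 
 	The decision in steps 2 and 3 above can be sketched in sh(1) as
 	follows.  The function name dh_action, the action labels it prints
 	and the sample file path are hypothetical, chosen only for
 	illustration, and a file path is assumed to be absolute.  The
 	sketch does not read or modify any sendmail file.
 
 ```sh
 # Hypothetical helper: map a confDH_PARAMETERS value to the action
 # described in steps 2 and 3.  An empty argument means "no setting".
 dh_action() {
 	case "$1" in
 	"")	echo "set to 2" ;;		# step 2: no setting exists
 	5*)	echo "set to 2" ;;		# step 2: string beginning with '5'
 	/*)	echo "regenerate file" ;;	# step 3: (absolute) file path
 	*)	echo "no change" ;;		# e.g. already '2'
 	esac
 }
 
 dh_action ""
 dh_action 512
 dh_action /etc/mail/dhparams.pem	# hypothetical path
 dh_action 2
 ```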
 
 20150601:
 	chmod, chflags, chown and chgrp now affect symlinks in -R mode as
 	defined in symlink(7); previously symlinks were silently ignored.
 
 20150430:
 	The const qualifier has been removed from iconv(3) to comply with
 	POSIX.  The ports tree is aware of this from r384038 onwards.
 
 20141215:
 	At svn r275807, the default linux compat kernel ABI has been adjusted
 	to 2.6.18 in support of the linux-c6 compat ports infrastructure
 	update.  If you wish to continue using the linux-f10 compat ports,
 	add compat.linux.osrelease=2.6.16 to your local sysctl.conf.  Users are
 	encouraged to update their linux-compat packages to linux-c6 during
 	their next update cycle.
 	
 	See ports/UPDATING 20141209 and 20141215 on migration to CentOS 6 ports.
 
 20141205:
 	pjdfstest has been integrated into kyua as an opt-in test suite.
 	Please see share/doc/pjdfstest/README for more details on how to
 	execute it.
 
 20141118:
 	10.1-RELEASE.
 
 20140904:
 	The ofwfb driver, used to provide a graphics console on PowerPC when
 	using vt(4), no longer allows mmap() of all of physical memory. This
 	will prevent Xorg on PowerPC with some ATI graphics cards from
 	initializing properly unless x11-servers/xorg-server is updated to
 	1.12.4_8 or newer.
 
 20140831:
 	The libatf-c and libatf-c++ major versions were downgraded to 0 and
 	1 respectively to match the upstream numbers.  They were out of
 	sync because, when they were originally added to FreeBSD, the
 	upstream versions were not respected.  These libraries are private
 	and not yet built by default, so renumbering them should be a
 	non-issue.  However, unclean source trees will yield broken test
 	programs once the operator executes "make delete-old-libs" after a
 	"make installworld".
 
 	Additionally, the atf-sh binary was made private by moving it into
 	/usr/libexec/.  Already-built shell test programs will keep the
 	path to the old binary so they will break after "make delete-old"
 	is run.
 
 	If you are using WITH_TESTS=yes (not the default), wipe the object
 	tree and rebuild from scratch to prevent spurious test failures.
 	This is only needed once: the misnumbered libraries and misplaced
 	binaries have been added to OptionalObsoleteFiles.inc so they will
 	be removed during a clean upgrade.
 
 20140814:
 	The ixgbe tunables now match their sysctl counterparts, for example:
 	hw.ixgbe.enable_aim => hw.ix.enable_aim
 	Anyone using ixgbe tunables should ensure they update /boot/loader.conf.
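 
 	A minimal sketch of the rename, assuming every affected tunable
 	differs only in the hw.ixgbe. prefix as in the example above
 	(only enable_aim is named in this entry).  The sample line is
 	hypothetical and /boot/loader.conf itself is not touched:
 
 ```sh
 # Rewrite the old prefix on a sample loader.conf line.
 old='hw.ixgbe.enable_aim="0"'
 new=$(printf '%s\n' "$old" | sed 's/^hw\.ixgbe\./hw.ix./')
 echo "$new"
 ```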
 
 20140801:
 	The NFSv4.1 server committed by r269398 changes the internal
 	function call interfaces used between the NFS and krpc modules.
 	As such, __FreeBSD_version was bumped.
 
 20140729:
 	The default unbound configuration has been modified to address
 	issues with reverse lookups on networks that use private
 	address ranges.  If you use the local_unbound service, run
 	"service local_unbound setup" as root to regenerate your
 	configuration, then "service local_unbound reload" to load the
 	new configuration.
 
 20140717:
 	It is no longer necessary to include the dwarf version in your DEBUG
 	options in your kernel config file. The bug that required it to be
 	placed in the config file has been fixed. DEBUG should now just
 	contain -g. The build system will automatically update things
 	to do the right thing.
 
 20140715:
 	Several ABI breaking changes were merged to CTL and new iSCSI code.
 	All CTL and iSCSI-related tools, such as ctladm, ctld, iscsid and
 	iscsictl need to be rebuilt to work with a new kernel.
 
 20140708:
 	The WITHOUT_VT_SUPPORT kernel config knob has been renamed
 	WITHOUT_VT.  (The other _SUPPORT knobs have a consistent meaning
 	which differs from the behaviour controlled by this knob.)
 
 20140608:
 	On i386 and amd64 systems, the onifconsole flag is now set by default
 	in /etc/ttys for ttyu0. This causes ttyu0 to be automatically enabled
 	as a login TTY if it is set in the bootloader as an active kernel
 	console. No changes in behavior should result otherwise. To revert to
 	the previous behavior, set ttyu0 to "off" in /etc/ttys.
 
 20140512:
 	Clang and llvm have been upgraded to 3.4.1 release.
 
 20140321:
 	Clang and llvm have been upgraded to 3.4 release.
 
 20140306:
 	If a Makefile in a tests/ directory was auto-generating a Kyuafile
 	instead of providing an explicit one, this would prevent such
 	Makefile from providing its own Kyuafile in the future during
 	NO_CLEAN builds.  This has been fixed in the Makefiles but manual
 	intervention is needed to clean an objdir if you use NO_CLEAN:
 	  # find /usr/obj -name Kyuafile | xargs rm -f
 
 20140303:
 	OpenSSH will now ignore errors caused by the kernel lacking Capsicum
 	capability mode support.  Please note that enabling the feature in
 	the kernel is still highly recommended.
 
 20140227:
 	OpenSSH is now built with sandbox support, and will use sandbox as
 	the default privilege separation method.  This requires Capsicum
 	capability mode support in the kernel.
 
 20140216:
 	The nve(4) driver for NVIDIA nForce MCP Ethernet adapters has
 	been deprecated and will not be part of FreeBSD 11.0 and later
 	releases.  If you use this driver, please consider switching to
 	the nfe(4) driver instead.
 
 20140120:
 	10.0-RELEASE.
 
 20131216:
 	The behavior of gss_pseudo_random() for the krb5 mechanism
 	has changed, for applications requesting a longer random string
 	than produced by the underlying enctype's pseudo-random() function.
 	In particular, the random string produced from a session key of
 	enctype aes128-cts-hmac-sha1-96 or aes256-cts-hmac-sha1-96 will
 	be different at the 17th octet and later, after this change.
 	The counter used in the PRF+ construction is now encoded as a
 	big-endian integer in accordance with RFC 4402.
 	__FreeBSD_version is bumped to 1000701.
 
 20131108:
 	The WITHOUT_ATF build knob has been removed and its functionality
 	has been subsumed into the more generic WITHOUT_TESTS.  If you were
 	using the former to disable the build of the ATF libraries, you
 	should change your settings to use the latter.
 
 20131031:
 	The default version of mtree is nmtree which is obtained from
 	NetBSD.  The output is generally the same, but may vary
 	slightly.  If you find you need identical output, adding
 	"-F freebsd9" to the command line should do the trick.  For the
 	time being, the old mtree is available as fmtree.
 
 20131014:
 	libbsdyml has been renamed to libyaml and moved to /usr/lib/private.
 	This will break ports-mgmt/pkg. Rebuild the port, or upgrade to pkg
 	1.1.4_8 and verify bsdyml not linked in, before running "make
 	delete-old-libs":
 	  # make -C /usr/ports/ports-mgmt/pkg build deinstall install clean
 	  or
 	  # pkg install pkg; ldd /usr/local/sbin/pkg | grep bsdyml
 
 20131010:
 	The rc.d/jail script has been updated to support the jail(8)
 	configuration file.  The "jail_<jname>_*" rc.conf(5) variables
 	for per-jail configuration are automatically converted to
 	/var/run/jail.<jname>.conf before the jail(8) utility is invoked.
 	This is transparently backward compatible.  See below about some
 	incompatibilities and the rc.conf(5) manual page for more details.
 
 	These variables are now deprecated in favor of the jail(8)
 	configuration file.  One can use the "rc.d/jail config <jname>"
 	command to generate a jail(8) configuration file in
 	/var/run/jail.<jname>.conf without running the jail(8) utility.
 	The default pathname of the configuration file is /etc/jail.conf;
 	a different one can be specified by using the $jail_conf or
 	$jail_<jname>_conf variables.
 
 	Please note that jail_devfs_ruleset accepts only an integer at
 	this moment.  Please consider replacing the ruleset name
 	with an integer.
 
 20130930:
 	BIND has been removed from the base system.  If all you need
 	is a local resolver, simply enable and start the local_unbound
 	service instead.  Otherwise, several versions of BIND are
 	available in the ports tree.   The dns/bind99 port is one example.
 
 	With this change, nslookup(1) and dig(1) are no longer in the base
 	system.  Users should instead use host(1) and drill(1) which are
 	in the base system.  Alternatively, nslookup and dig can
 	be obtained by installing the dns/bind-tools port.
 
 20130916:
 	With the addition of unbound(8), a new unbound user is now
 	required during installworld.  "mergemaster -p" can be used to
 	add the user prior to installworld, as documented in the handbook.
 
 20130911:
 	OpenSSH is now built with DNSSEC support, and will by default
 	silently trust signed SSHFP records.  This can be controlled with
 	the VerifyHostKeyDNS client configuration setting.  DNSSEC support
 	can be disabled entirely with the WITHOUT_LDNS option in src.conf.
 
 20130906:
 	The GNU Compiler Collection and C++ standard library (libstdc++)
 	are no longer built by default on platforms where clang is the system
 	compiler.  You can enable them with the WITH_GCC and WITH_GNUCXX
 	options in src.conf.  
 
 20130905:
 	The PROCDESC kernel option is now part of the GENERIC kernel
 	configuration and is required for rwhod(8) to work.
 	If you are using a custom kernel configuration, you should include
 	'options PROCDESC'.
 
 20130905:
 	The API and ABI related to the Capsicum framework were modified
 	in a backward incompatible way. The userland libraries and programs
 	have to be recompiled to work with the new kernel. This includes the
 	following libraries and programs, but the whole buildworld is
 	advised: libc, libprocstat, dhclient, tcpdump, hastd, hastctl,
 	kdump, procstat, rwho, rwhod, uniq.
 
 20130903:
 	AES-NI intrinsic support has been added to gcc.  The AES-NI module
 	has been updated to use this support.  A new gcc is required to build
 	the aesni module on both i386 and amd64.
 
 20130821:
 	The PADLOCK_RNG and RDRAND_RNG kernel options are now devices.
 	Thus "device padlock_rng" and "device rdrand_rng" should be
 	used instead of "options PADLOCK_RNG" & "options RDRAND_RNG".
 
 20130813:
 	WITH_ICONV has been split into two feature sets.  WITH_ICONV now
 	enables just the iconv* functionality and is now on by default.
 	WITH_LIBICONV_COMPAT enables the libiconv api and link time
 	compatibility.  Set WITHOUT_ICONV to build the old way.
 	If you have been using WITH_ICONV before, you will very likely
 	need to turn on WITH_LIBICONV_COMPAT.
 
 20130806:
 	The INVARIANTS option now enables DEBUG for code with OpenSolaris and
 	Illumos origin, including ZFS.  If you have INVARIANTS in your
 	kernel configuration, then there is no need to set DEBUG or ZFS_DEBUG
 	explicitly.
 	DEBUG used to enable witness(9) tracking of OpenSolaris (mostly ZFS)
 	locks if the WITNESS option was set.  Because that generated a lot of
 	witness(9) reports and all of them were believed to be false
 	positives, this is no longer done.  The new OPENSOLARIS_WITNESS
 	option can be used to achieve the previous behavior.
 
 20130806:
 	Timer values in IPv6 data structures now use time_uptime instead
 	of time_second.  Although this is not a user-visible functional
 	change, userland utilities which directly use them---ndp(8),
 	rtadvd(8), and rtsold(8) in the base system---need to be updated
 	to r253970 or later.
 
 20130802:
 	find -delete can now delete the pathnames given as arguments,
 	instead of only files found below them or if the pathname did
 	not contain any slashes. Formerly, the following error message
 	would result:
 
 	find: -delete: <path>: relative path potentially not safe
 
 	Deleting the pathnames given as arguments can be prevented
 	without error messages using -mindepth 1 or by changing
 	directory and passing "." as argument to find. This works in the
 	old as well as the new version of find.
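 
 	A minimal sketch of the -mindepth 1 form, run against a throwaway
 	directory created with mktemp(1).  The file names are arbitrary
 	and the variable names are chosen only for illustration:
 
 ```sh
 tmp=$(mktemp -d)
 mkdir -p "$tmp/a/b"
 touch "$tmp/a/b/file" "$tmp/top"
 # Delete everything below $tmp but not $tmp itself.
 find "$tmp" -mindepth 1 -delete
 kept=no
 [ -d "$tmp" ] && kept=yes
 left=$(ls -A "$tmp" | wc -l | tr -d ' ')
 echo "directory kept: $kept, entries left: $left"
 rmdir "$tmp"
 ```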
 
 20130726:
 	The behavior of devfs rules path matching has been changed.
 	A pattern is now always matched against the fully qualified devfs
 	path, and slash characters must be explicitly matched by
 	slashes in the pattern (FNM_PATHNAME). Rulesets involving devfs
 	subdirectories must be reviewed.
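 
 	sh(1) pathname expansion follows the same rule that a slash must
 	be matched explicitly, so it can serve as an analogy.  The sketch
 	below does not touch devfs or any ruleset; the usb/0.1.0 name is
 	only a sample created in a throwaway directory:
 
 ```sh
 tmp=$(mktemp -d)
 mkdir "$tmp/usb"
 touch "$tmp/usb/0.1.0"
 # "usb*" cannot cross the slash, so only the directory itself matches.
 flat=$(cd "$tmp" && echo usb*)
 # With the slash spelled out, the node below usb/ matches.
 deep=$(cd "$tmp" && echo usb/*)
 echo "usb*  -> $flat"
 echo "usb/* -> $deep"
 rm -rf "$tmp"
 ```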
 
 20130716:
 	The default ARM ABI has changed to the ARM EABI. The old ABI is
 	incompatible with the ARM EABI and all programs and modules will
 	need to be rebuilt to work with a new kernel.
 
 	To keep using the old ABI ensure the WITHOUT_ARM_EABI knob is set.
 
 	NOTE: Support for the old ABI will be removed in the future and
 	users are advised to upgrade.
 
 20130709:
 	pkg_install has been disconnected from the build.  If you really need
 	it, you should add WITH_PKGTOOLS to your src.conf(5).
 
 20130709:
 	Most network statistics structures were changed to be able to
 	keep 64-bit counters. Thus all tools that work with networking
 	statistics must be rebuilt (netstat(1), bsnmpd(1), etc.)
 
 20130618:
 	Fix a bug that allowed a tracing process (e.g. gdb) to write
 	to a memory-mapped file in the traced process's address space
 	even if neither the traced process nor the tracing process had
 	write access to that file.
 
 20130615:
 	CVS has been removed from the base system.  An exact copy
 	of the code is available from the devel/cvs port.
 
 20130613:
 	Some people report the following error after the switch to bmake:
 
 		make: illegal option -- J
 		usage: make [-BPSXeiknpqrstv] [-C directory] [-D variable]
 			...
 		*** [buildworld] Error code 2
 
 	This is likely due to an old instance of make in
 	${MAKEPATH} (${MAKEOBJDIRPREFIX}${.CURDIR}/make.${MACHINE}),
 	which src/Makefile will use blindly if it exists.  If
 	you see the above error:
 
 		rm -rf `make -V MAKEPATH`
 
 	should resolve it.
 
 20130516:
 	Use bmake by default.
 	Whereas before one could choose to build with bmake via
 	-DWITH_BMAKE one must now use -DWITHOUT_BMAKE to use the old
 	make. The goal is to remove these knobs for 10-RELEASE.
 
 	It is worth noting that bmake (like gmake) treats the command
 	line as the unit of failure, rather than statements within the
 	command line.  Thus '(cd some/where && dosomething)' is safer
 	than 'cd some/where; dosomething'. The '()' allows consistent
 	behavior in parallel build.
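 
 	The difference between the two forms can be seen with plain
 	sh(1).  This sketch does not invoke bmake; it assumes only that
 	the command line is run by a shell that does not stop at the
 	first failing statement.  The directory name is hypothetical and
 	must not exist:
 
 ```sh
 dir=/nonexistent/some/where
 # ';' form: the failed cd is ignored and the next statement runs anyway.
 unsafe=$(sh -c "cd $dir 2>/dev/null; echo ran")
 # '&&' form in a subshell: the failed cd stops the rest of the line.
 safe=$(sh -c "(cd $dir 2>/dev/null && echo ran)" || true)
 echo "with ';':  '$unsafe'"
 echo "with '&&': '$safe'"
 ```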
 
 20130429:
 	Fix a bug that allows NFS clients to issue READDIR on files.
 
 20130426:
 	The WITHOUT_IDEA option has been removed because
 	the IDEA patent expired.
 
 20130426:
 	The sysctl which controls TRIM support under ZFS has been renamed
 	from vfs.zfs.trim_disable -> vfs.zfs.trim.enabled and has been
 	enabled by default.
 
 20130425:
 	The mergemaster command now uses the default MAKEOBJDIRPREFIX
 	rather than creating its own in the temporary directory in
 	order to allow access to bootstrapped versions of tools such as
 	install and mtree.  When upgrading from a version of FreeBSD where
 	the install command does not support -l, you will need to
 	install a new mergemaster command if mergemaster -p is required.
 	This can be accomplished with the command (cd src/usr.sbin/mergemaster
 	&& make install).
 
 20130404:
 	Legacy ATA stack, disabled and replaced by new CAM-based one since
 	FreeBSD 9.0, completely removed from the sources.  Kernel modules
 	atadisk and atapi*, user-level tools atacontrol and burncd are
 	removed.  Kernel option `options ATA_CAM` is now permanently enabled
 	and removed.
 
 20130319:
 	SOCK_CLOEXEC and SOCK_NONBLOCK flags have been added to socket(2)
 	and socketpair(2). Software, in particular Kerberos, may
 	automatically detect and use these during building. The resulting
 	binaries will not work on older kernels.
 
 20130308:
 	CTL_DISABLE has also been added to the sparc64 GENERIC (for further
 	information, see the respective 20130304 entry).
 
 20130304:
 	Recent commits to callout(9) changed the size of struct callout,
 	so the KBI is probably heavily disturbed. Also, some functions
 	in callout(9)/sleep(9)/sleepqueue(9)/condvar(9) KPIs were replaced
 	by macros. No kernel module using them will load until it is
 	rebuilt, so a rebuild is required.
 
 	The ctl device has been re-enabled in GENERIC for i386 and amd64,
 	but does not initialize by default (because of the new CTL_DISABLE
 	option) to save memory.  To re-enable it, remove the CTL_DISABLE
 	option from the kernel config file or set kern.cam.ctl.disable=0
 	in /boot/loader.conf.
 
 20130301:
 	The ctl device has been disabled in GENERIC for i386 and amd64.
 	This was done due to the extra memory being allocated at system
 	initialisation time by the ctl driver which was only used if
 	a CAM target device was created.  This makes a FreeBSD system
 	unusable on 128MB or less of RAM.
 
 20130208:
 	A new compression method (lz4) has been merged to -HEAD.  Please
 	refer to zpool-features(7) for more information.
 
 	Please refer to the "ZFS notes" section of this file for information
 	on upgrading boot ZFS pools.
 
 20130129:
 	A BSD-licensed patch(1) variant has been added and is installed
 	as bsdpatch, with the GNU version remaining the default patch.
 	To invert the logic and use the BSD-licensed one as default,
 	while having the GNU version installed as gnupatch, rebuild
 	and install world with the WITH_BSD_PATCH knob set.
 
 20130121:
 	Due to the use of the new -l option to install(1) during build
 	and install, you must take care not to directly set the INSTALL
 	make variable in your /etc/make.conf, /etc/src.conf, or on the
 	command line.  If you wish to use the -C flag for all installs
 	you may be able to add INSTALL+=-C to /etc/make.conf or
 	/etc/src.conf.
 
 20130118:
 	The install(1) option -M has changed meaning and now takes an
 	argument that is a file or path to append logs to.  In the
 	unlikely event that -M was the last option on the command line
 	and the command line contained at least two files and a target
 	directory the first file will have logs appended to it.  The -M
 	option served little practical purpose in the last decade so its
 	use is expected to be extremely rare.
 
 20121223:
 	After switching to Clang as the default compiler some users of ZFS
 	on i386 systems started to experience stack overflow kernel panics.
 	Please consider using 'options KSTACK_PAGES=4' in such configurations.
 
 20121222:
 	GEOM_LABEL now mangles label names read from file system metadata.
 	Mangling affects labels containing spaces, non-printable characters,
 	'%' or '"'. Device names in /etc/fstab and other places may need to
 	be updated.
 
 20121217:
 	By default, only the 10 most recent kernel dumps will be saved.  To
 	restore the previous behaviour (no limit on the number of kernel dumps
 	stored in the dump directory) add the following line to /etc/rc.conf:
 
 		savecore_flags=""
 
 20121201:
 	With the addition of auditdistd(8), a new auditdistd user is now
 	required during installworld.  "mergemaster -p" can be used to
 	add the user prior to installworld, as documented in the handbook.
 
 20121117:
 	The sin6_scope_id member variable in struct sockaddr_in6 is now
 	filled by the kernel before passing the structure to the userland via
 	sysctl or routing socket.  This means the KAME-specific embedded scope
 	id in sin6_addr.s6_addr[2] is always cleared in userland application.
 	This behavior can be controlled by net.inet6.ip6.deembed_scopeid.
 	__FreeBSD_version is bumped to 1000025.
 
 20121105:
 	On i386 and amd64 systems WITH_CLANG_IS_CC is now the default.
 	This means that the world and kernel will be compiled with clang
 	and that clang will be installed as /usr/bin/cc, /usr/bin/c++,
 	and /usr/bin/cpp.  To disable this behavior and revert to building
 	with gcc, compile with WITHOUT_CLANG_IS_CC. Really old versions
 	of current may need to bootstrap WITHOUT_CLANG first if the clang
 	build fails (its compatibility window doesn't extend to the 9 stable
 	branch point).
 
 20121102:
 	The IPFIREWALL_FORWARD kernel option has been removed. Its
 	functionality is now turned on by default.
 
 20121023:
 	The ZERO_COPY_SOCKET kernel option has been removed and
 	split into SOCKET_SEND_COW and SOCKET_RECV_PFLIP.
 	NB: SOCKET_SEND_COW uses the VM page based copy-on-write
 	mechanism which is not safe and may result in kernel crashes.
 	NB: The SOCKET_RECV_PFLIP mechanism is useless as no current
 	driver supports disposable external page sized mbuf storage.
 	Proper replacements for both zero-copy mechanisms are under
 	consideration and will eventually lead to complete removal
 	of the two kernel options.
 
 20121023:
 	The IPv4 network stack has been converted to network byte
 	order. The following modules need to be recompiled together
 	with kernel: carp(4), divert(4), gif(4), siftr(4), gre(4),
 	pf(4), ipfw(4), ng_ipfw(4), stf(4).
 
 20121022:
 	Support for non-MPSAFE filesystems was removed from VFS. The
 	VFS_VERSION was bumped, all filesystem modules shall be
 	recompiled.
 
 20121018:
 	All the non-MPSAFE filesystems have been disconnected from
 	the build. The full list includes: codafs, hpfs, ntfs, nwfs,
 	portalfs, smbfs, xfs.
 
 20121016:
 	The interface cloning API and ABI has changed. The following
 	modules need to be recompiled together with kernel:
 	ipfw(4), pfsync(4), pflog(4), usb(4), wlan(4), stf(4),
 	vlan(4), disc(4), edsc(4), if_bridge(4), gif(4), tap(4),
 	faith(4), epair(4), enc(4), tun(4), if_lagg(4), gre(4).
 
 20121015:
 	The sdhci driver was split in two parts: sdhci (generic SD Host
 	Controller logic) and sdhci_pci (actual hardware driver).
 	No kernel config modifications are required, but if you
 	load sdhci as a module you must switch to sdhci_pci instead.
 
 20121014:
 	Import the FUSE kernel and userland support into base system.
 
 20121013:
 	The GNU sort(1) program has been removed since the BSD-licensed
 	sort(1) has been the default for quite some time and no serious
 	problems have been reported.  The corresponding WITH_GNU_SORT
 	knob has also gone.
 
 20121006:
 	The pfil(9) API/ABI for AF_INET family has been changed. Packet
 	filtering modules: pf(4), ipfw(4), ipfilter(4) need to be recompiled
 	with new kernel.
 
 20121001:
 	The net80211(4) ABI has been changed to allow for improved driver
 	PS-POLL and power-save support.  All wireless drivers need to be
 	recompiled to work with the new kernel.
 
 20120913:
 	The random(4) support for the VIA hardware random number
 	generator (`PADLOCK') is no longer enabled unconditionally.
 	Add the padlock_rng device in the custom kernel config if
 	needed.  The GENERIC kernels on i386 and amd64 do include the
 	device, so the change only affects the custom kernel
 	configurations.
 
 20120908:
 	The pf(4) packet filter ABI has been changed. pfctl(8) and
 	snmp_pf module need to be recompiled to work with new kernel.
 
 20120828:
 	A new ZFS feature flag "com.delphix:empty_bpobj" has been merged
 	to -HEAD. Pools that have empty_bpobj in active state can not be
 	imported read-write with ZFS implementations that do not support
 	this feature. For more information read the zpool-features(5)
 	manual page.
 
 20120727:
 	The sparc64 ZFS loader has been changed to no longer try to auto-
 	detect ZFS providers based on diskN aliases but now requires these
 	to be explicitly listed in the OFW boot-device environment variable. 
 
 20120712:
 	OpenSSL has been upgraded to 1.0.1c.  Any binaries requiring
 	libcrypto.so.6 or libssl.so.6 must be recompiled.  Also, there are
 	configuration changes.  Make sure to merge /etc/ssl/openssl.cnf.
 
 20120712:
 	The following sysctls and tunables have been renamed for consistency
 	with other variables:
 	  kern.cam.da.da_send_ordered   -> kern.cam.da.send_ordered
 	  kern.cam.ada.ada_send_ordered -> kern.cam.ada.send_ordered
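 
 	A minimal sketch of applying both renames to existing settings.
 	The function name rename_send_ordered and the sample lines are
 	hypothetical, and no configuration file is modified:
 
 ```sh
 # Hypothetical filter: rewrite the two old names at the start of a line.
 rename_send_ordered() {
 	sed -e 's/^kern\.cam\.da\.da_send_ordered/kern.cam.da.send_ordered/' \
 	    -e 's/^kern\.cam\.ada\.ada_send_ordered/kern.cam.ada.send_ordered/'
 }
 
 out=$(printf '%s\n' 'kern.cam.da.da_send_ordered=0' \
     'kern.cam.ada.ada_send_ordered=0' | rename_send_ordered)
 echo "$out"
 ```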
 
 20120628:
 	The sort utility has been replaced with BSD sort.  For now, GNU sort
 	is also available as "gnusort" or the default can be set back to
 	GNU sort by setting WITH_GNU_SORT.  In this case, BSD sort will be
 	installed as "bsdsort".
 
 20120611:
 	A new version of ZFS (pool version 5000) has been merged to -HEAD.
 	Starting with this version the old system of ZFS pool versioning
 	is superseded by "feature flags". This concept enables forward
 	compatibility against certain future changes in functionality of ZFS
 	pools. The first read-only compatible "feature flag" for ZFS pools
 	is named "com.delphix:async_destroy". For more information
 	read the new zpool-features(5) manual page.
 	Please refer to the "ZFS notes" section of this file for information
 	on upgrading boot ZFS pools.
 
 20120417:
 	The malloc(3) implementation embedded in libc now uses sources imported
 	as contrib/jemalloc.  The most disruptive API change is to
 	/etc/malloc.conf.  If your system has an old-style /etc/malloc.conf,
 	delete it prior to installworld, and optionally re-create it using the
 	new format after rebooting.  See malloc.conf(5) for details
 	(specifically the TUNING section and the "opt.*" entries in the MALLCTL
 	NAMESPACE section).
 
 20120328:
 	Big-endian MIPS TARGET_ARCH values no longer end in "eb".  mips64eb
 	is now spelled mips64.  mipsn32eb is now spelled mipsn32.  mipseb is
 	now spelled mips.  This is to aid compatibility with third-party
 	software that expects this naming scheme in uname(3).  Little-endian
 	settings are unchanged. If you are updating a big-endian mips64 machine
 	from before this change, you may need to set MACHINE_ARCH=mips64 in
 	your environment before the new build system will recognize your machine.
 
 20120306:
 	Disable by default the option VFS_ALLOW_NONMPSAFE for all supported
 	platforms.
 
 20120229:
 	Unix domain sockets now behave "as expected" on nullfs(5).  Previously
 	nullfs(5) did not pass through all behaviours to the underlying layer.
 	As a result, a socket bound on the lower layer could be reached only
 	through the lower path, and a socket bound on the upper layer only
 	through the upper path.  Now one can connect through both the lower
 	and the upper paths regardless of which layer's path was used to bind.
 
 20120211:
 	The getifaddrs upgrade path broken with 20111215 has been restored.
 	If you have upgraded in between 20111215 and 20120209 you need to
 	recompile libc again with your kernel.  You still need to recompile
 	world to be able to configure CARP but this restriction already
 	comes from 20111215.
 
 20120114:
 	The set_rcvar() function has been removed from /etc/rc.subr.  All
 	base and ports rc.d scripts have been updated, so if you have a
 	port installed with a script in /usr/local/etc/rc.d you can either
 	hand-edit the rcvar= line, or reinstall the port.
 
 	An easy way to handle the mass-update of /etc/rc.d:
 	rm /etc/rc.d/* && mergemaster -i
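
 	To find locally installed scripts that still need attention, something
 	like the following sketch may help.  The function name
 	list_set_rcvar_scripts is made up for this example, and it simply
 	lists every file that mentions set_rcvar anywhere, comments included.

```shell
# Hypothetical helper: list rc.d scripts that still mention set_rcvar.
# The directory defaults to /usr/local/etc/rc.d.
list_set_rcvar_scripts() {
    dir=${1:-/usr/local/etc/rc.d}
    grep -l 'set_rcvar' "$dir"/* 2>/dev/null || true
}
```

 	Each file printed is a candidate for hand-editing the rcvar= line or
 	for reinstalling the port that owns it.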
 
 20120109:
 	panic(9) now stops other CPUs on SMP systems, disables interrupts
 	on the current CPU and prevents other threads from running.
 	This behavior can be reverted using the kern.stop_scheduler_on_panic
 	tunable/sysctl.
 	The new behavior can be incompatible with kern.sync_on_panic.
 
 20111215:
 	The carp(4) facility has been changed significantly.  Both the
 	configuration of the CARP protocol via ifconfig(8) and the format
 	of CARP events submitted to devd(8) have changed.  See the manual
 	pages for more information.  The arpbalance feature of carp(4) is
 	currently not supported.
 
 	The size of struct in_aliasreq and struct in6_aliasreq has changed.
 	User utilities using SIOCAIFADDR or SIOCAIFADDR_IN6, e.g. ifconfig(8),
 	need to be recompiled.
 
 20111122:
 	The acpi_wmi(4) status device /dev/wmistat has been renamed to
 	/dev/wmistat0.
 
 20111108:
 	The VFS_ALLOW_NONMPSAFE option has been added in order to
 	explicitly support non-MPSAFE filesystems.
 	It is currently on by default for all supported platforms.
 
 20111101:
 	The broken amd(4) driver has been replaced with esp(4) in the amd64,
 	i386 and pc98 GENERIC kernel configuration files.
 
 20110930:
 	sysinstall has been removed.
 
 20110923:
 	The stable/9 branch was created in subversion.  This corresponds to
 	the RELENG_9 branch in CVS.
 
 20110913:
 	This commit modifies vfs_register() so that it uses a hash
 	calculation to set vfc_typenum, which is enabled by default.
 	The first time a system is booted after this change, the
 	vfc_typenum values will change for all file systems. The
 	main effect of this is a change to the NFS server file handles
 	for file systems that use vfc_typenum in their fsid, such as ZFS.
 	It will, however, prevent vfc_typenum from changing when file
 	systems are loaded in a different order for subsequent reboots.
 	To disable this, you can set vfs.typenumhash=0 in /boot/loader.conf
 	until you are ready to remount all NFS clients after a reboot.
 
 20110828:
 	Bump the shared library version numbers for libraries that
 	do not use symbol versioning, have changed their ABI compared
 	to stable/8, and whose shared library version was not yet bumped.
 	Done as part of the 9.0-RELEASE cycle.
 
 20110815:
 	During the merge of Capsicum features, the fget(9) KPI was modified.
 	This may require the rebuilding of out-of-tree device drivers --
 	issues have been reported specifically with the nVidia device driver.
 	__FreeBSD_version is bumped to 900041.
 
 	Also, there is a period between 20110811 and 20110814 where the
 	special devices /dev/{stdin,stdout,stderr} did not work correctly.
 	Building world from a kernel during that window may not work.
 
 20110628:
 	The packet filter (pf) code has been updated to OpenBSD 4.5.
 	You need to update userland tools to be in sync with kernel.
 	This update breaks backward compatibility with earlier pfsync(4)
 	versions.  Care must be taken when updating redundant firewall setups.
 
 20110608:
 	The following sysctls and tunables are retired on x86 platforms:
 		machdep.hlt_cpus
 		machdep.hlt_logical_cpus
 	The following sysctl is retired:
 		machdep.hyperthreading_allowed
 	The sysctls were supposed to provide a way to dynamically offline and
 	online selected CPUs on x86 platforms, but the implementation has not
 	been reliable, especially with the SCHED_ULE scheduler.
 	The machdep.hyperthreading_allowed tunable is still available to
 	ignore hyperthreading CPUs at the OS level.
 	Individual CPUs can be disabled using the hint.lapic.X.disabled
 	tunable, where X is the APIC ID of a CPU.  Be advised, though, that
 	disabling CPUs in a non-uniform fashion will result in a non-uniform
 	topology and may lead to sub-optimal system performance with
 	SCHED_ULE, which is the default scheduler.
 
 20110607:
 	The cpumask_t type is retired and cpuset_t is now used to describe
 	a mask of CPUs.
 
 20110531:
 	Changes to ifconfig(8) for dynamic address family detection mandate
 	that you are running a kernel of 20110525 or later.  Make sure to
 	follow the update procedure to boot a new kernel before installing
 	world.
 
 20110513:
 	Support for the sun4v architecture is officially dropped.
 
 20110503:
 	Several KPI breaking changes have been committed to the mii(4) layer,
 	the PHY drivers and consequently some Ethernet drivers using mii(4).
 	This means that miibus.ko and the modules of the affected Ethernet
 	drivers need to be recompiled.
 
 	Note to kernel developers: Given that the OUI bit reversion problem
 	was fixed as part of these changes, all mii(4) commits related to OUIs,
 	i.e. to sys/dev/mii/miidevs, PHY driver probing and vendor specific
 	handling, can no longer be merged verbatim to stable/8 and previous
 	branches.
 
 20110430:
 	Users of the Atheros AR71xx SoC code now need to add 'device ar71xx_pci'
 	into their kernel configurations along with 'device pci'.
 
 20110427:
 	The default NFS client is now the new NFS client, so fstype "newnfs"
 	is now "nfs" and the regular/old NFS client is now fstype "oldnfs".
 	Although mounts via fstype "nfs" will usually work without userland
 	changes, it is recommended that the mount(8) and mount_nfs(8)
 	commands be rebuilt from sources and that a link to mount_nfs called
 	mount_oldnfs be created. The new client is compiled into the
 	kernel with "options NFSCL" and this is needed for diskless root
 	file systems. The GENERIC kernel configs have been changed to use
 	NFSCL and NFSD (the new server) instead of NFSCLIENT and NFSSERVER.
 	To use the regular/old client, you can "mount -t oldnfs ...". For
 	a diskless root file system, you must also include a line like:
 	
 	vfs.root.mountfrom="oldnfs:"
 
 	in the boot/loader.conf on the root fs on the NFS server to make
 	a diskless root fs use the old client.
 
 20110424:
 	The GENERIC kernels for all architectures now default to the new
 	CAM-based ATA stack.  This means that all legacy ATA drivers were
 	removed from GENERIC and replaced by the respective CAM drivers.  If
 	you use ATA device names in /etc/fstab or other places, make sure to
 	update them (adX -> adaY, acdX -> cdY, afdX -> daY, astX -> saY,
 	where 'Y's are the sequential numbers starting from zero for each type
 	in order of detection, unless configured otherwise with tunables,
 	see cam(4)). There will be symbolic links created in /dev/ to map
 	old adX devices to the respective adaY. They should provide basic
 	compatibility for file systems mounting in most cases, but they do
 	not support old user-level APIs and do not have respective providers
 	in GEOM. Consider using updated management tools with new device names.
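
 	Because the new unit numbers depend on detection order, the renaming
 	cannot be automated safely, but the lines that need review can be
 	located.  The following is a minimal sketch; the function name
 	find_legacy_ata_names is hypothetical, and it only flags names of the
 	form /dev/adX, /dev/acdX, /dev/afdX and /dev/astX.

```shell
# Hypothetical helper: print (with line numbers) the lines of a file,
# /etc/fstab by default, that still use legacy ATA device names.
find_legacy_ata_names() {
    grep -nE '/dev/(ad|acd|afd|ast)[0-9]+' "${1:-/etc/fstab}" || true
}
```

 	New-style names such as /dev/ada0 are not reported, since the pattern
 	requires a digit directly after the legacy prefix.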
 
 	It is possible to load the ahci, ata, siis and mvs devices as modules,
 	but the ATA_CAM option should remain in the kernel configuration so
 	that the ata module works as a CAM driver supporting legacy ATA
 	controllers.  Device ata can still be used in modular fashion
 	(atacore + ...).  The atadisk and atapi* modules are not used and will
 	not affect operation in ATA_CAM mode.  Note that to use CAM-based ATA,
 	the kernel should include the CAM devices scbus, pass, da (or
 	explicitly ada), cd and optionally others.  All of them are parts of
 	the cam module.
 
 	ataraid(4) functionality is now supported by the RAID GEOM class.
 	To use it, load the geom_raid kernel module and use the graid(8) tool
 	for management.  Instead of /dev/arX device names, use /dev/raid/rX.
 
 	No kernel config options or code have been removed, so if a problem
 	arises, please report it and optionally revert to the old ATA stack.
 	To do so, remove the following from the kernel config:
 	    options        ATA_CAM
 	    device         ahci
 	    device         mvs
 	    device         siis
 	and add back instead:
 	    device         atadisk         # ATA disk drives
 	    device         ataraid         # ATA RAID drives
 	    device         atapicd         # ATAPI CDROM drives
 	    device         atapifd         # ATAPI floppy drives
 	    device         atapist         # ATAPI tape drives
 
 20110423:
 	The default NFS server has been changed to the new server, which
 	was referred to as the experimental server. If you need to switch
 	back to the old NFS server, you must now put the "-o" option on
 	both the mountd and nfsd commands. This can be done using the
 	mountd_flags and nfs_server_flags rc.conf variables until an
 	update to the rc scripts is committed, which is coming soon.
 
 20110418:
 	The GNU Objective-C runtime library (libobjc), and other Objective-C
 	related components have been removed from the base system.  If you
 	require an Objective-C library, please use one of the available ports.
 
 20110331:
 	ath(4) has been split into bus and device modules.  if_ath contains
 	the HAL, the TX rate control and the network device code.  if_ath_pci
 	contains the PCI bus glue.  For Atheros MIPS embedded systems,
 	if_ath_ahb contains the AHB glue.  On everything else, users need to
 	load both if_ath_pci and if_ath in order to use ath.
 
 	TO REPEAT: if_ath_ahb is not needed for normal users. Normal users only
 	need to load if_ath and if_ath_pci for ath(4) operation.
 
 20110314:
 	As part of the replacement of sysinstall, the process of building
 	release media has changed significantly. For details, please re-read
 	release(7), which has been updated to reflect the new build process.
 
 20110218:
 	GNU binutils 2.17.50 (as of 2007-07-03) has been merged to -HEAD.  This
 	is the last available version under GPLv2.  It brings a number of new
 	features, such as support for newer x86 CPUs (with SSE-3, SSSE-3, SSE
 	4.1 and SSE 4.2), better support for powerpc64, a number of new
 	directives, and lots of other small improvements.  See the ChangeLog
 	file in contrib/binutils for the full details.
 
 20110218:
 	IPsec's HMAC_SHA256-512 support has been fixed to be RFC4868
 	compliant, and will now use half of the hash for authentication.
 	This will break interoperability with all stacks (including all
 	existing FreeBSD versions) that implement
 	draft-ietf-ipsec-ciph-sha-256-00 (they use 96 bits of the hash for
 	authentication).
 	The only workaround with such peers is to use another HMAC
 	algorithm for IPsec ("phase 2") authentication.
 
 20110207:
 	Remove the uio_yield prototype and symbol.  This function has
 	been misnamed since it was introduced and should not be
 	globally exposed with this name.  The equivalent functionality
 	is now available using kern_yield(curthread->td_user_pri).
 	The function remains undocumented.
 
 20110112:
 	A SYSCTL_[ADD_]UQUAD was added for uint64_t pointers,
 	symmetric with the existing SYSCTL_[ADD_]QUAD.  Type checking
 	for scalar sysctls is defined but disabled.  Code that needs
 	UQUAD to pass the type checking, but must also compile on older
 	systems where the define is not present, can check for
 	__FreeBSD_version >= 900030.
 
 	The system dialog(1) has been replaced with a new version previously
 	in ports as devel/cdialog. dialog(1) is mostly command-line compatible
 	with the previous version, but the libdialog associated with it has
 	a largely incompatible API. As such, the original version of libdialog
 	will be kept temporarily as libodialog, until its base system consumers
 	are replaced or updated. Bump __FreeBSD_version to 900030.
 
 20110103:
 	If you are trying to run make universe on a -stable system, and you get
 	the following warning:
 	"Makefile", line 356: "Target architecture for i386/conf/GENERIC 
 	unknown.  config(8) likely too old."
 	or something similar to it, then you must upgrade your -stable system
 	to 8.2-Release or newer (really, any time after r210146 7/15/2010 in
 	stable/8) or build the config from the latest stable/8 branch and
 	install it on your system.
 
 	Prior to this date, building a current universe on an 8-stable system
 	from between 7/15/2010 and 1/2/2011 would result in a weird shell
 	parsing error in the first kernel build phase.  A new config on those
 	old systems will fix that problem for older versions of -current.
 
 20101228:
 	The TCP stack has been modified to allow Khelp modules to interact with
 	it via helper hook points and store per-connection data in the TCP
 	control block. Bump __FreeBSD_version to 900029. User space tools that
 	rely on the size of struct tcpcb in tcp_var.h (e.g. sockstat) need to
 	be recompiled.
 
 20101114:
 	Generic IEEE 802.3 annex 31B full duplex flow control support has been
 	added to mii(4).  bge(4), bce(4), msk(4), nfe(4) and stge(4), along
 	with brgphy(4), e1000phy(4) and ip1000phy(4), have been converted
 	to take advantage of it instead of using custom implementations.  This
 	means that these drivers no longer unconditionally advertise
 	support for flow control but only do so if flow control is a selected
 	media option.  The generic support was implemented this way in
 	order to allow flow control to be switched on and off via ifconfig(8),
 	with the PHY-specific default typically being off in order to protect
 	from unwanted effects.  Consequently, if you used flow control with
 	one of the above mentioned drivers you now need to explicitly enable
 	it, for example via:
 		ifconfig bge0 media auto mediaopt flowcontrol
 
 	Along with the above mentioned changes, generic support for setting
 	1000baseT master mode has also been added, and brgphy(4), ciphy(4),
 	e1000phy(4) and ip1000phy(4) have been converted to take
 	advantage of it.  This means that these drivers no longer take the
 	link0 parameter for selecting master mode; the master media option
 	has to be used instead, for example:
 		ifconfig bge0 media 1000baseT mediaopt full-duplex,master
 
 	Selection of master mode now is also available with all other PHY
 	drivers supporting 1000baseT.
 
 20101111:
 	The TCP stack has received a significant update to add support for
 	modularised congestion control and generally improve the clarity of
 	congestion control decisions. Bump __FreeBSD_version to 900025. User
 	space tools that rely on the size of struct tcpcb in tcp_var.h (e.g.
 	sockstat) need to be recompiled.
 
 20101002:
 	The man(1) utility has been replaced by a new version that no longer
 	uses /etc/manpath.config. Please consult man.conf(5) for how to
 	migrate local entries to the new format.
 
 20100928:
 	The copyright strings printed by login(1) and sshd(8) at the time of a
 	new connection have been removed to follow other operating systems and
 	upstream sshd.
 
 20100915:
 	A workaround for an ld bug that has since been fixed was removed from
 	the kernel code, so make sure that your system ld is built from
 	sources after revision 210245 from 2010-07-19 (r211583 if building
 	head kernel on stable/8, r211584 for stable/7; both from 2010-08-21).
 	A symptom of an incorrect ld version is different addresses for the
 	set_pcpu section and the __start_set_pcpu symbol in kernel and/or
 	modules.
 
 20100913:
 	The $ipv6_prefer variable in rc.conf(5) has been split into
 	$ip6addrctl_policy and $ipv6_activate_all_interfaces.
 
 	The $ip6addrctl_policy variable chooses a pre-defined
 	address selection policy set by ip6addrctl(8).  A value of
 	"ipv4_prefer", "ipv6_prefer" or "AUTO" can be specified.  The
 	default is "AUTO".
 
 	The $ipv6_activate_all_interfaces variable specifies whether the
 	IFDISABLED flag (see the 20090926 entry) is set on an interface with
 	no corresponding $ifconfig_IF_ipv6 line.  The default is "NO" for
 	security reasons.  If you want IPv6 link-local addresses on all
 	interfaces by default, set this to "YES".
 
 	The old ipv6_prefer="YES" is equivalent to
 	ipv6_activate_all_interfaces="YES" and
 	ip6addrctl_policy="ipv6_prefer".
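
 	The equivalence above could be applied to an existing rc.conf with a
 	filter along these lines.  This is a sketch only: the function name
 	convert_ipv6_prefer is hypothetical, and lines setting ipv6_prefer to
 	anything other than YES are deliberately left untouched, since no
 	equivalent is given for them here.

```shell
# Hypothetical helper: replace an ipv6_prefer="YES" line with the two
# settings it is equivalent to.  Reads rc.conf text on stdin.
convert_ipv6_prefer() {
    awk '
        /^ipv6_prefer="?[Yy][Ee][Ss]"?[ \t]*$/ {
            print "ipv6_activate_all_interfaces=\"YES\""
            print "ip6addrctl_policy=\"ipv6_prefer\""
            next
        }
        { print }
    '
}
```

 	For example, "convert_ipv6_prefer < /etc/rc.conf" prints the converted
 	configuration so it can be compared against the original.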
 
 20100913:
 	DTrace has grown support for userland tracing. Due to this, DTrace is
 	now i386 and amd64 only.
 	dtruss(1) is now installed by default on those systems and a new
 	kernel module is needed for userland tracing: fasttrap.
 	No changes to your kernel config file are necessary to enable
 	userland tracing, but you might consider adding 'STRIP=' and
 	'CFLAGS+=-fno-omit-frame-pointer' to your make.conf if you want
 	to have informative userland stack traces in DTrace (ustack).
 
 20100725:
 	The acpi_aiboost(4) driver has been removed in favor of the new
 	aibs(4) driver. You should update your kernel configuration file.
 
 20100722:
 	BSD grep has been imported to the base system and is built by
 	default.  It is completely BSD licensed, highly GNU-compatible, uses
 	less memory than its GNU counterpart and has a small codebase.
 	However, it is slower than its GNU counterpart.  The difference is
 	mostly noticeable for larger searches; for smaller ones it is
 	measurable but not significant.  The reasons are complex; the most
 	important factor is that we lack a modern and efficient regex library,
 	whereas GNU grep overcomes this by optimizing the searches internally.
 	Future work on improving the regex performance is planned.  In the
 	meantime, users that need better performance can build GNU grep
 	instead by setting the WITH_GNU_GREP knob.
 
 20100713:
 	Due to the import of powerpc64 support, all existing powerpc kernel
 	configuration files must be updated with a machine directive like this:
 	    machine powerpc powerpc
 
 	In addition, an updated config(8) is required to build powerpc kernels
 	after this change.
 
 20100713:
 	A new version of ZFS (version 15) has been merged to -HEAD.
 	This version uses a python library for the following subcommands:
 	zfs allow, zfs unallow, zfs groupspace, zfs userspace.
 	For full functionality of these commands the following port must
 	be installed: sysutils/py-zfs
 
 20100429:
 	'vm_page's are now hashed by physical address to an array of mutexes.
 	Currently this is only used to serialize access to hold_count.  Over
 	time the page queue mutex will be peeled away.  This changes the size
 	of pmap on every architecture and requires all callers of vm_page_hold
 	and vm_page_unhold to be updated.
  
 20100402:
 	WITH_CTF can now be specified in src.conf (not recommended, there
 	are some problems with static executables), make.conf (would also
 	affect ports which do not use GNU make and do not override the
 	compile targets) or in the kernel config (via "makeoptions
 	WITH_CTF=yes").
 	Before this change, WITH_CTF was silently ignored when specified in
 	these places, so make sure that WITH_CTF is not used in places where
 	it could lead to unwanted behavior.
 
 20100311:
 	The kernel option COMPAT_IA32 has been replaced with COMPAT_FREEBSD32
 	to allow 32-bit compatibility on non-x86 platforms. All kernel
 	configurations on amd64 and ia64 platforms using these options must
 	be modified accordingly.
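
 	For custom kernel configuration files, the rename can be done with a
 	one-line filter.  The sketch below is illustrative only; the function
 	name rename_compat_ia32 is made up for this example.

```shell
# Hypothetical helper: rename the COMPAT_IA32 option in kernel
# configuration text read from stdin.
rename_compat_ia32() {
    sed 's/COMPAT_IA32/COMPAT_FREEBSD32/g'
}
```

 	For example, "rename_compat_ia32 < MYKERNEL > MYKERNEL.new", where
 	MYKERNEL stands for a local configuration file.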
 
 20100113:
 	The utmp user accounting database has been replaced with utmpx,
 	the user accounting interface standardized by POSIX.
 	Unfortunately the semantics of utmp and utmpx don't match,
 	making it practically impossible to support both interfaces.
 	The user accounting database is used by tools like finger(1),
 	last(1), talk(1), w(1) and ac(8).
 
 	All applications in the base system use utmpx.  This means only
 	local binaries (e.g. from the ports tree) may still use these
 	utmp database files.  These applications must be rebuilt to make
 	use of utmpx.
 
 	After the system has been upgraded, it is safe to remove the old
 	log files (/var/run/utmp, /var/log/lastlog and /var/log/wtmp*),
 	assuming their contents are of no importance anymore.  Old wtmp
 	databases can only be used by last(1) and ac(8) after they have
 	been converted to the new format using wtmpcvt(1).
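
 	To see which of the old files are still present, a sketch like the
 	following can be used.  The function name list_legacy_utmp_files and
 	its optional root-directory argument are assumptions made for this
 	example; nothing is deleted by it.

```shell
# Hypothetical helper: print the legacy utmp-era files that exist below
# an optional alternate root directory (default: the running system).
list_legacy_utmp_files() {
    root=${1:-}
    for f in "$root"/var/run/utmp "$root"/var/log/lastlog \
             "$root"/var/log/wtmp*; do
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

 	Any wtmp files listed should be converted or archived first if their
 	contents still matter.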
 
 20100108:
 	Introduce the kernel thread "deadlock resolver" (which can be enabled
 	via the DEADLKRES option, see NOTES for more details) and the
 	sleepq_type() function for sleepqueues.
 
 20091202:
 	rc.firewall and rc.firewall6 have been unified, and
 	rc.firewall6 and rc.d/ip6fw have been removed.
 	With the removal of rc.d/ip6fw, the ipv6_firewall_* rc
 	variables are obsolete.  Instead, the following new rc
 	variables have been added to rc.d/ipfw:
 
 		firewall_client_net_ipv6, firewall_simple_iif_ipv6,
 		firewall_simple_inet_ipv6, firewall_simple_oif_ipv6,
 		firewall_simple_onet_ipv6, firewall_trusted_ipv6
 
 	The meanings correspond to the relevant IPv4 variables.
 
 20091125:
 	8.0-RELEASE.
 
 20091113:
 	The default terminal emulation for syscons(4) has been changed
 	from cons25 to xterm on all platforms except pc98.  This means
 	that the /etc/ttys file needs to be updated to ensure correct
 	operation of applications on the console.
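
 	One possible way to make this edit is sketched below.  It assumes
 	that the syscons(4) virtual terminals are the lines beginning with
 	"ttyv" and that the terminal type appears as a separate
 	whitespace-delimited field; the function name ttys_cons25_to_xterm is
 	hypothetical.

```shell
# Hypothetical helper: change the terminal type from cons25 to xterm on
# ttyv* lines.  Reads ttys text on stdin; variants such as cons25l1 are
# left alone because the match requires whitespace on both sides.
ttys_cons25_to_xterm() {
    sed '/^ttyv/s/\([[:space:]]\)cons25\([[:space:]]\)/\1xterm\2/'
}
```

 	For example, "ttys_cons25_to_xterm < /etc/ttys" prints the updated
 	contents for review.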
 
 	The terminal emulation style can be toggled per window by using
 	vidcontrol(1)'s -T flag.  The TEKEN_CONS25 kernel configuration
 	option can be used to change the compile-time default back to
 	cons25.
 
 	To prevent graphical artifacts, make sure the TERM environment
 	variable is set to match the terminal emulation that is being
 	performed by syscons(4).
 
 20091109:
 	The layout of the structure ieee80211req_scan_result has changed.
 	Applications that require wireless scan results (e.g. ifconfig(8))
 	from net80211 need to be recompiled.
 
 	Applications such as wpa_supplicant(8) may require a full world
 	build without using NO_CLEAN in order to get synchronized with the
 	new structure.
 
 20091025:
 	The iwn(4) driver has been updated to support the 5000 and 5150 series.
 	There's one kernel module for each firmware. Adding "device iwnfw"
 	to the kernel configuration file means including all three firmware
 	images inside the kernel. If you want to include just the one for
 	your wireless card, use the devices iwn4965fw, iwn5000fw or
 	iwn5150fw.
 
 20090926:
 	The rc.d/network_ipv6 IPv6 configuration script has been integrated
 	into rc.d/netif.  The changes are as follows:
 
 	1. To use IPv6, simply define $ifconfig_IF_ipv6 like $ifconfig_IF
 	   for IPv4.  For aliases, $ifconfig_IF_aliasN should be used.
 	   Note that both variables need the "inet6" keyword at the head.
 
 	   Do not set $ipv6_network_interfaces manually if you do not
 	   understand what you are doing.  It is not needed in most cases. 
 
 	   $ipv6_ifconfig_IF and $ipv6_ifconfig_IF_aliasN still work, but
 	   they are obsolete.
 
 	2. $ipv6_enable is obsolete.  Use $ipv6_prefer and
 	   "inet6 accept_rtadv" keyword in ifconfig(8) instead.
 
 	   If you define $ipv6_enable=YES, it means $ipv6_prefer=YES and
 	   all configured interfaces have "inet6 accept_rtadv" in the
 	   $ifconfig_IF_ipv6.  These are for backward compatibility.
 
 	3. A new variable $ipv6_prefer has been added.  If NO, IPv6
 	   functionality of interfaces with no corresponding
 	   $ifconfig_IF_ipv6 is disabled by using "inet6 ifdisabled" flag,
 	   and the default address selection policy of ip6addrctl(8) 
 	   is the IPv4-preferred one (see rc.d/ip6addrctl for more details).
 	   Note that if you want to configure IPv6 functionality on the
 	   disabled interfaces after boot, first you need to clear the flag by
 	   using ifconfig(8) like:
 
 		ifconfig em0 inet6 -ifdisabled
 
 	   If YES, the default address selection policy is set as
 	   IPv6-preferred.
 
 	   The default value of $ipv6_prefer is NO.
 
 	4. If your system needs to receive Router Advertisement messages,
 	   define "inet6 accept_rtadv" in $ifconfig_IF_ipv6.  The rc(8)
 	   scripts automatically invoke rtsol(8) when the interface becomes
 	   UP.  The Router Advertisement messages are used for SLAAC
 	   (State-Less Address AutoConfiguration).
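
 	Since interfaces without a $ifconfig_IF_ipv6 line are the ones that
 	get IPv6 disabled when $ipv6_prefer is NO, it can be useful to list
 	them.  The following is a minimal sketch; the function name
 	list_ifdisabled_candidates is hypothetical, and it assumes that
 	interface names consist only of letters and digits.

```shell
# Hypothetical helper: print interfaces that have an ifconfig_IF line
# but no matching ifconfig_IF_ipv6 line in an rc.conf-style file.
list_ifdisabled_candidates() {
    awk -F= '
        /^ifconfig_[A-Za-z0-9]+=/ { v4[substr($1, 10)] = 1 }
        /^ifconfig_[A-Za-z0-9]+_ipv6=/ {
            n = substr($1, 10); sub(/_ipv6$/, "", n); v6[n] = 1
        }
        END { for (n in v4) if (!(n in v6)) print n }
    ' "${1:-/etc/rc.conf}" | sort
}
```

 	Each name printed is an interface that needs its own
 	$ifconfig_IF_ipv6 line if IPv6 is wanted on it.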
 
 20090922:
 	802.11s D3.03 support was committed. This is incompatible with the
 	previous code, which was based on D3.0.
 
 20090912:
 	The sysctl variable net.inet6.ip6.accept_rtadv now sets the default
 	value of the per-interface flag ND6_IFF_ACCEPT_RTADV; it is no longer
 	a global knob controlling whether Router Advertisement messages are
 	accepted.  Also, a per-interface flag ND6_IFF_AUTO_LINKLOCAL has been
 	added, and the sysctl variable net.inet6.ip6.auto_linklocal sets its
 	default value.  The ifconfig(8) utility now supports these flags.
 
 20090910:
 	ZFS snapshots are now mounted with the MNT_IGNORE flag.  Use the -v
 	option of mount(8) and the -a option of df(1) to see them.
 
 20090825:
 	The old tunable hw.bus.devctl_disable has been superseded by
 	hw.bus.devctl_queue.  hw.bus.devctl_disable=1 in loader.conf should be
 	replaced by hw.bus.devctl_queue=0.  The default for this new tunable
 	is 1000.
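
 	The replacement can be sketched as a small filter over loader.conf.
 	The function name convert_devctl_tunable is made up for this example,
 	and only the exact setting described above (a value of 1, with or
 	without double quotes) is rewritten.

```shell
# Hypothetical helper: replace hw.bus.devctl_disable=1 with
# hw.bus.devctl_queue=0, keeping the quoting style of the original line.
convert_devctl_tunable() {
    sed -e 's/^hw\.bus\.devctl_disable="1"/hw.bus.devctl_queue="0"/' \
        -e 's/^hw\.bus\.devctl_disable=1$/hw.bus.devctl_queue=0/'
}
```

 	For example, "convert_devctl_tunable < /boot/loader.conf" prints the
 	converted contents.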
 
 20090813:
 	Remove the option STOP_NMI.  The default action is now to use NMI only
 	for KDB via the newly introduced function stop_cpus_hard() and
 	maintain stop_cpus() to just use a normal IPI_STOP on ia32 and amd64.
 
 20090803:
 	The stable/8 branch was created in subversion.  This corresponds to
 	the RELENG_8 branch in CVS.
 
 20090719:
 	Bump the shared library version numbers for all libraries that do not
 	use symbol versioning as part of the 8.0-RELEASE cycle.  Bump
 	__FreeBSD_version to 800105.
 
 20090714:
 	Due to changes in the implementation of virtual network stack support,
 	all network-related kernel modules must be recompiled.  As this change
 	breaks the ABI, bump __FreeBSD_version to 800104.
 
 20090713:
 	The TOE interface to the TCP syncache has been modified to remove
 	struct tcpopt (<netinet/tcp_var.h>) from the ABI of the network stack.
 	The cxgb driver is the only TOE consumer affected by this change, and
 	needs to be recompiled along with the kernel. As this change breaks
 	the ABI, bump __FreeBSD_version to 800103.
 
 20090712: 
 	Padding has been added to struct tcpcb, sackhint and tcpstat in
 	<netinet/tcp_var.h> to facilitate future MFCs and bug fixes whilst
 	maintaining the ABI. However, this change breaks the ABI, so bump
 	__FreeBSD_version to 800102. User space tools that rely on the size of
 	any of these structs (e.g. sockstat) need to be recompiled.
 
 20090630:
 	The NFS_LEGACYRPC option has been removed along with the old kernel
 	RPC implementation that this option selected. Kernel configurations
 	may need to be adjusted.
 
 20090629:
 	The network interface device nodes at /dev/net/<interface> have been
 	removed.  All ioctl operations can be performed the normal way using
 	routing sockets.  The kqueue functionality can generally be replaced
 	with routing sockets.
 
 20090628:
 	The documentation from the FreeBSD Documentation Project (Handbook,
 	FAQ, etc.) is now installed via packages by sysinstall(8) and under
 	the /usr/local/share/doc/freebsd directory instead of /usr/share/doc.
 
 20090624:
 	The ABI of various structures related to the SYSV IPC API has been
 	changed.  As a result, the COMPAT_FREEBSD[456] and COMPAT_43 kernel
 	options now all require COMPAT_FREEBSD7.  Bump __FreeBSD_version to
 	800100.
 
 20090622:
 	Layout of struct vnet has changed as routing related variables were
 	moved to their own Vimage module. Modules need to be recompiled.  Bump
 	__FreeBSD_version to 800099.
 
 20090619:
 	NGROUPS_MAX and NGROUPS have been increased from 16 to 1023 and 1024
 	respectively.  As long as no more than 16 groups per process are used,
 	no changes should be visible.  When more than 16 groups are used, old
 	binaries may fail if they call getgroups() or getgrouplist() with
 	statically sized storage.  Recompiling will work around this, but
 	applications should be modified to use dynamically allocated storage
 	for group arrays as POSIX.1-2008 does not cap an implementation's
 	number of supported groups at NGROUPS_MAX+1 as previous versions did.
 
 	NFS and portalfs mounts may also be affected as the list of groups is
 	truncated to 16.  Users of NFS who use more than 16 groups should
 	take care that negative group permissions are not used on the exported
 	file systems as they will not be reliable unless a GSSAPI based
 	authentication method is used.
 
 20090616: 
 	The ADAPTIVE_LOCKMGRS compile-time option has been introduced.  This
 	option compiles in support for adaptive spinning for lockmgrs
 	which want to enable it.  The lockinit() function now accepts the
 	LK_ADAPTIVE flag in order to make the lock object subject to adaptive
 	spinning when held in both write and read mode.
 
 20090613:
 	The layout of the structure returned by IEEE80211_IOC_STA_INFO has
 	changed.  User applications that use this ioctl need to be rebuilt.
 
 20090611:
 	The layout of struct thread has changed.  Kernel and modules need to
 	be rebuilt.
 
 20090608:
 	The layout of structs ifnet, domain, protosw and vnet_net has changed.
 	Kernel modules need to be rebuilt.  Bump __FreeBSD_version to 800097.
 
 20090602:
 	window(1) has been removed from the base system. It can now be
 	installed from ports. The port is called misc/window.
 
 20090601:
 	The way we are storing and accessing `routing table' entries has
 	changed. Programs reading the FIB, like netstat, need to be
 	re-compiled.
 
 20090601:
 	A new netisr implementation has been added for FreeBSD 8.  Network
 	protocol modules, such as igmp, ipdivert, and others, should be
 	rebuilt.
 	Bump __FreeBSD_version to 800096.
 
 20090530:
 	Remove the tunable/sysctl debug.mpsafevfs as its initial purpose is no
 	longer valid.
 
 20090530:
 	Add VOP_ACCESSX(9).  File system modules need to be rebuilt.
 	Bump __FreeBSD_version to 800094.
 
 20090529:
 	Add mnt_xflag field to 'struct mount'.  File system modules need to be
 	rebuilt.
 	Bump __FreeBSD_version to 800093.
 
 20090528:
 	The ADAPTIVE_SX compile-time option has been retired and the
 	NO_ADAPTIVE_SX option, which handles the reversed logic, has been
 	introduced.
 	The flags accepted by the sx_init_flags() KPI change accordingly: the
 	SX_ADAPTIVESPIN flag has been retired and the SX_NOADAPTIVE flag has
 	been introduced in order to handle the reversed logic.
 	Bump __FreeBSD_version to 800092.
 
 20090527:
 	Add support for hierarchical jails.  Remove global securelevel.
 	Bump __FreeBSD_version to 800091.
 
 20090523:
 	The layout of struct vnet_net has changed, therefore modules
 	need to be rebuilt.
 	Bump __FreeBSD_version to 800090.
 
 20090523:
 	The newly imported zic(8) produces a new format in the output. Please
 	run tzsetup(8) to install the newly created data to /etc/localtime.
 
 20090520:
 	The sysctl tree for the usb stack has been renamed from hw.usb2.* to
 	hw.usb.* and is now consistent again with previous releases.
 
 20090520:
 	802.11 monitor mode support was revised and driver APIs were changed.
 	Drivers dependent on net80211 now support DLT_IEEE802_11_RADIO instead
 	of DLT_IEEE802_11.  No user-visible data structures were changed but
 	applications that use DLT_IEEE802_11 may require changes.
 	Bump __FreeBSD_version to 800088.
 
 20090430:
 	The layout of the following structs has changed: sysctl_oid,
 	socket, ifnet, inpcbinfo, tcpcb, syncache_head, vnet_inet,
 	vnet_inet6 and vnet_ipfw.  Most modules need to be rebuilt or
 	panics may be experienced.  World rebuild is required for
 	correctly checking networking state from userland.
 	Bump __FreeBSD_version to 800085.
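
 	A minimal sketch of how a script might test for a bump such as this
 	one before deciding that modules and world must be rebuilt.  The
 	helper name version_at_least is hypothetical and not part of the
 	base system; only the kern.osreldate sysctl is assumed to exist.

```shell
# version_at_least CURRENT BUMP  (hypothetical helper)
# Succeeds when CURRENT, a __FreeBSD_version value, is at or past BUMP.
version_at_least() {
	[ "$1" -ge "$2" ]
}

# On FreeBSD the running kernel's value is available as kern.osreldate:
#   version_at_least "$(sysctl -n kern.osreldate)" 800085 &&
#       echo "kernel has the new struct layout; rebuild modules and world"
```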
 
 20090429:
 	MLDv2 and Source-Specific Multicast (SSM) have been merged
 	to the IPv6 stack. VIMAGE hooks are in but not yet used.
 	The implementation of SSM within FreeBSD's IPv6 stack closely
 	follows the IPv4 implementation.
 
 	For kernel developers:
 
 	* The most important changes are that the ip6_output() and
 	  ip6_input() paths no longer take the IN6_MULTI_LOCK,
 	  and this lock has been downgraded to a non-recursive mutex.
 
 	* As with the changes to the IPv4 stack to support SSM, filtering
 	  of inbound multicast traffic must now be performed by transport
 	  protocols within the IPv6 stack. This does not apply to TCP and
 	  SCTP, however, it does apply to UDP in IPv6 and raw IPv6.
 
 	* The KPIs used by IPv6 multicast are similar to those used by
 	  the IPv4 stack, with the following differences:
 	   * im6o_mc_filter() is analogous to imo_multicast_filter().
 	   * The legacy KAME entry points in6_joingroup() and in6_leavegroup()
 	     are shimmed to in6_mc_join() and in6_mc_leave() respectively.
 	   * IN6_LOOKUP_MULTI() has been deprecated and removed.
 	   * IPv6 relies on MLD for the DAD mechanism. KAME's internal KPIs
 	     for MLDv1 have an additional 'timer' argument which is used to
 	     jitter the initial membership report for the solicited-node
 	     multicast membership on-link.
 	   * This is not strictly needed for MLDv2, which already jitters
 	     its report transmissions.  However, the 'timer' argument is
 	     preserved in case MLDv1 is active on the interface.
 
 	* The KAME linked-list based IPv6 membership implementation has
 	  been refactored to use a vector similar to that used by the IPv4
 	  stack.
 	  Code which maintains a list of its own multicast memberships
 	  internally, e.g. carp, has been updated to reflect the new
 	  semantics.
 
 	* There is a known Lock Order Reversal (LOR) due to in6_setscope()
 	  acquiring the IF_AFDATA_LOCK and being called within ip6_output().
 	  Whilst MLDv2 tries to avoid this otherwise benign LOR, it is an
 	  implementation constraint which needs to be addressed in HEAD.
 
 	For application developers:
 
 	* The changes are broadly similar to those made for the IPv4
 	  stack.
 
 	* The use of IPv4 and IPv6 multicast socket options on the same
 	  socket, using mapped addresses, HAS NOT been tested or supported.
 
 	* There are a number of issues with the implementation of various
 	  IPv6 multicast APIs which need to be resolved in the API surface
 	  before the implementation is fully compatible with KAME userland
 	  use, and these are mostly to do with interface index treatment.
 
 	* The literature available discusses the use of either the delta / ASM
 	  API with setsockopt(2)/getsockopt(2), or the full-state / ASM API
 	  using setsourcefilter(3)/getsourcefilter(3). For more information
 	  please refer to RFC 3678, 'Socket Interface Extensions for
 	  Multicast Source Filters'.
 
 	* Applications which use the published RFC 3678 APIs should be fine.
 
 	For systems administrators:
 
 	* The mtest(8) utility has been refactored to support IPv6, in
 	  addition to IPv4. Interface addresses are no longer accepted
 	  as arguments, their names must be used instead. The utility
 	  will map the interface name to its first IPv4 address as
 	  returned by getifaddrs(3).
 
 	* The ifmcstat(8) utility has also been updated to print the MLDv2
 	  endpoint state and source filter lists via sysctl(3).
 
 	* The net.inet6.ip6.mcast.loop sysctl may be tuned to 0 to disable
 	  loopback of IPv6 multicast datagrams by default; it defaults to 1
 	  to preserve the existing behaviour. Disabling multicast loopback is
 	  recommended for optimal system performance.
 
 	* The IPv6 MROUTING code has been changed to examine this sysctl
 	  instead of attempting to perform a group lookup before looping
 	  back forwarded datagrams.
 
 	Bump __FreeBSD_version to 800084.
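
 	For the net.inet6.ip6.mcast.loop tuning mentioned above, one possible
 	way to record the setting so that it persists across reboots is
 	sketched here.  The helper set_sysctl_conf is hypothetical, and the
 	use of /etc/sysctl.conf as the target file is an assumption of this
 	sketch.

```shell
# set_sysctl_conf FILE NAME VALUE  (hypothetical helper)
# Records NAME=VALUE in a sysctl.conf-style FILE, replacing any earlier
# setting of NAME so the file ends up with exactly one entry for it.
set_sysctl_conf() {
	file=$1 name=$2 value=$3
	tmp="$file.tmp.$$"
	# Keep every line whose text before the first '=' is not NAME.
	if [ -f "$file" ]; then
		awk -F= -v n="$name" '$1 != n' "$file" > "$tmp"
	else
		: > "$tmp"
	fi
	printf '%s=%s\n' "$name" "$value" >> "$tmp"
	mv "$tmp" "$file"
}

# Typical use (as root): disable IPv6 multicast loopback at boot, then
# apply the same value to the running system.
#   set_sysctl_conf /etc/sysctl.conf net.inet6.ip6.mcast.loop 0
#   sysctl net.inet6.ip6.mcast.loop=0
```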
 
 20090422:
 	Implement low-level Bluetooth HCI API.
 	Bump __FreeBSD_version to 800083.
 
 20090419:
 	The layout of struct malloc_type, used by modules to register new
 	memory allocation types, has changed.  Most modules will need to
 	be rebuilt or panics may be experienced.
 	Bump __FreeBSD_version to 800081.
 
 20090415:
 	Anticipate overflowing inp_flags - add inp_flags2.
 	This changes most offsets in inpcb, so checking v4 connection
 	state will require a world rebuild.
 	Bump __FreeBSD_version to 800080.
 
 20090415:
 	Add an llentry to struct route and struct route_in6. Modules
 	embedding a struct route will need to be recompiled.
 	Bump __FreeBSD_version to 800079.
 
 20090414:
 	The size of rt_metrics_lite and by extension rtentry has changed.
 	Networking administration apps will need to be recompiled.
 	The route command now supports show as an alias for get, weighting
 	of routes, sticky and nostick flags to alter the behavior of stateful
 	load balancing.
 	Bump __FreeBSD_version to 800078.
 
 20090408:
 	Do not use Giant for kbdmux(4) locking. This is wrong and
 	apparently causing more problems than it solves. This will
 	re-open the issue where interrupt handlers may race with
 	kbdmux(4) in polling mode. Typical symptoms include (but are
 	not limited to) duplicated and/or missing characters when
 	low level console functions (such as gets) are used while
 	interrupts are enabled (for example geli password prompt,
 	mountroot prompt etc.). Disabling kbdmux(4) may help.
 
 20090407:
 	The size of structs vnet_net, vnet_inet and vnet_ipfw has changed;
 	kernel modules referencing any of the above need to be recompiled.
 	Bump __FreeBSD_version to 800075.
 
 20090320:
 	GEOM_PART has become the default partition slicer for storage devices,
 	replacing GEOM_MBR, GEOM_BSD, GEOM_PC98 and GEOM_GPT slicers. It
 	introduces some changes:
 
 	MSDOS/EBR: the devices created from MSDOS extended partition entries
 	(EBR) can be named differently than with GEOM_MBR and are now symlinks
 	to devices with offset-based names. fstabs may need to be modified.
 
 	BSD: the "geometry does not match label" warning is harmless in most
 	cases but it points to problems in file system misalignment with
 	disk geometry. The "c" partition is now implicit, covers the whole
 	top-level drive and cannot be (mis)used by users.
 
 	General: Kernel dumps are now not allowed to be written to devices
 	whose partition types indicate they are meant to be used for file
 	systems (or, in case of MSDOS partitions, as something else than
 	the "386BSD" type).
 
 	Most of these changes date approximately from 200812.
 
 20090319:
 	The uscanner(4) driver has been removed from the kernel. This follows
 	Linux removing theirs in 2.6 and making libusb the default interface
 	(supported by sane).
 
 20090319:
 	The multicast forwarding code has been cleaned up. netstat(1)
 	only relies on KVM now for printing bandwidth upcall meters.
 	The IPv4 and IPv6 modules are split into ip_mroute_mod and
 	ip6_mroute_mod respectively. The config(5) options for statically
 	compiling this code remain the same, i.e. 'options MROUTING'.
 
 20090315:
 	Support for the IFF_NEEDSGIANT network interface flag has been
 	removed, which means that non-MPSAFE network device drivers are no
 	longer supported.  In particular, if_ar, if_sr, and network device
 	drivers from the old (legacy) USB stack can no longer be built or
 	used.
 
 20090313:
 	POSIX.1 Native Language Support (NLS) has been enabled in libc and
 	a bunch of new language catalog files have also been added.
 	This means that some common libc messages are now localized and
 	they depend on the LC_MESSAGES environmental variable.
 
 20090313:
 	The k8temp(4) driver has been renamed to amdtemp(4) since
 	support for Family 10 and Family 11 CPU families was added.
 
 20090309:
 	IGMPv3 and Source-Specific Multicast (SSM) have been merged
 	to the IPv4 stack. VIMAGE hooks are in but not yet used.
 
 	For kernel developers, the most important changes are that the
 	ip_output() and ip_input() paths no longer take the IN_MULTI_LOCK(),
 	and this lock has been downgraded to a non-recursive mutex.
 
 	Transport protocols (UDP, Raw IP) are now responsible for filtering
 	inbound multicast traffic according to group membership and source
 	filters. The imo_multicast_filter() KPI exists for this purpose.
 	Transports which do not use multicast (SCTP, TCP) already reject
 	multicast by default. Forwarding and receive performance may improve
 	as a mutex acquisition is no longer needed in the ip_input()
 	low-level input path.  in_addmulti() and in_delmulti() are shimmed
 	to new KPIs which exist to support SSM in-kernel.
 
 	For application developers, it is recommended that loopback of
 	multicast datagrams be disabled for best performance, as this
 	will still cause the lock to be taken for each looped-back
 	datagram transmission. The net.inet.ip.mcast.loop sysctl may
 	be tuned to 0 to disable loopback by default; it defaults to 1
 	to preserve the existing behaviour.
 
 	For systems administrators, to obtain best performance with
 	multicast reception and multiple groups, it is always recommended
 	that a card with a suitably precise hash filter is used. Hash
 	collisions will still result in the lock being taken within the
 	transport protocol input path to check group membership.
 
 	If deploying FreeBSD in an environment with IGMP snooping switches,
 	it is recommended that the net.inet.igmp.sendlocal sysctl remain
 	enabled; this forces 224.0.0.0/24 group membership to be announced
 	via IGMP.
 
 	The size of 'struct igmpstat' has changed; netstat needs to be
 	recompiled to reflect this.
 	Bump __FreeBSD_version to 800070.
 
 20090309:
 	libusb20.so.1 is now installed as libusb.so.1 and the ports system
 	updated to use it. This requires a buildworld/installworld in order to
 	update the library and dependencies (usbconfig, etc). It is advisable to
 	rebuild all ports which use libusb. More specific directions are given
 	in the ports collection UPDATING file. Any /etc/libmap.conf entries for
 	libusb are no longer required and can be removed.
 
 20090302:
 	A workaround is committed to allow the creation of System V shared
 	memory segments of size > 2 GB on the 64-bit architectures.
 	Due to a limitation of the existing ABI, the shm_segsz member
 	of the struct shmid_ds, returned by shmctl(IPC_STAT) call is
 	wrong for large segments. Note that limits must be explicitly
 	raised to allow such segments to be created.
 
 20090301:
 	The layout of struct ifnet has changed, requiring a rebuild of all
 	network device driver modules.
 
 20090227:
 	The /dev handling for the new USB stack has changed, a
 	buildworld/installworld is required for libusb20.
 
 20090223:
 	The new USB2 stack has now been permanently moved in and all kernel and
 	module names reverted to their previous values (eg, usb, ehci, ohci,
 	ums, ...).  The old usb stack can be compiled in by prefixing the name
 	with the letter 'o'; the old usb modules have been removed.
 	Updating entry 20090216 for xorg and 20090215 for libmap may still
 	apply.
 
 20090217:
 	The rc.conf(5) option if_up_delay has been renamed to
 	defaultroute_delay to better reflect its purpose. If you have
 	customized this setting in /etc/rc.conf you need to update it to
 	use the new name.
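
 	The rename could be applied with something like the following sketch.
 	The helper rename_rc_var is hypothetical; it only prints the
 	rewritten file so the result can be reviewed before it replaces
 	/etc/rc.conf.

```shell
# rename_rc_var FILE OLD NEW  (hypothetical helper)
# Prints FILE with assignments to OLD rewritten to NEW; other lines,
# including comments that merely mention OLD, are left untouched.
rename_rc_var() {
	sed "s/^\([[:space:]]*\)$2=/\1$3=/" "$1"
}

# Review the result before replacing the real file, for example:
#   rename_rc_var /etc/rc.conf if_up_delay defaultroute_delay > /tmp/rc.conf.new
```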
 
 20090216:
 	xorg 7.4 wants to configure its input devices via hald which does not
 	yet work with USB2. If the keyboard/mouse does not work in xorg then
 	add
 		Option "AllowEmptyInput" "off"
 	to your ServerLayout section.  This will cause X to use the configured
 	kbd and mouse sections from your xorg.conf.
 
 20090215:
 	The GENERIC kernels for all architectures now default to the new USB2
 	stack. No kernel config options or code have been removed so if a
 	problem arises please report it and optionally revert to the old USB
 	stack. If you are loading USB kernel modules or have a custom kernel
 	that includes GENERIC then ensure that usb names are also changed over,
 	eg uftdi -> usb2_serial_ftdi.
 
 	Older programs linked against the ports libusb 0.1 need to be
 	redirected to the new stack's libusb20.  /etc/libmap.conf can
 	be used for this:
 		# Map old usb library to new one for usb2 stack
 		libusb-0.1.so.8	libusb20.so.1
 
 20090209:
 	All USB ethernet devices now attach as interfaces under the name ueN
 	(eg. ue0). This is to provide a predictable name as vendors often
 	change usb chipsets in a product without notice.
 
 20090203:
 	The ichsmb(4) driver has been changed to require SMBus slave
 	addresses be left-justified (xxxxxxx0b) rather than right-justified.
 	All of the other SMBus controller drivers require left-justified
 	slave addresses, so this change makes all the drivers provide the
 	same interface.
 
 20090201:
 	INET6 statistics (struct ip6stat) were updated.
 	netstat(1) needs to be recompiled.
 
 20090119:
 	NTFS has been removed from GENERIC kernel on amd64 to match
 	GENERIC on i386. This should not cause any issues since mount_ntfs(8)
 	will load ntfs.ko module automatically when NTFS support is
 	actually needed, unless ntfs.ko is not installed or security
 	level prohibits loading kernel modules. If either is the case,
 	"options NTFS" has to be added into kernel config.
 
 20090115:
 	TCP Appropriate Byte Counting (RFC 3465) support added to kernel.
 	New field in struct tcpcb breaks ABI, so bump __FreeBSD_version to
 	800061. User space tools that rely on the size of struct tcpcb in
 	tcp_var.h (e.g. sockstat) need to be recompiled.
 
 20081225:
 	ng_tty(4) module updated to match the new TTY subsystem.
 	Due to API change, user-level applications must be updated.
 	New API support added to mpd5 CVS and expected to be present
 	in next mpd5.3 release.
 
 20081219:
 	With __FreeBSD_version 800060 the makefs tool is part of
 	the base system (it was a port).
 
 20081216:
 	The afdata and ifnet locks have been changed from mutexes to
 	rwlocks, network modules will need to be re-compiled.
 
 20081214:
 	__FreeBSD_version 800059 incorporates the new arp-v2 rewrite.
 	RTF_CLONING, RTF_LLINFO and RTF_WASCLONED flags are eliminated.
 	The new code reduced struct rtentry{} by 16 bytes on 32-bit
 	architecture and 40 bytes on 64-bit architecture. The userland
 	applications "arp" and "ndp" have been updated accordingly.
 	The output from "netstat -r" shows only routing entries and
 	none of the L2 information.
 
 20081130:
 	__FreeBSD_version 800057 marks the switchover from the
 	binary ath hal to source code. Users must add the line:
 
 	options	AH_SUPPORT_AR5416
 
 	to their kernel config files when specifying:
 
 	device	ath_hal
 
 	The ath_hal module no longer exists; the code is now compiled
 	together with the driver in the ath module.  It is now
 	possible to tailor chip support (i.e. reduce the set of chips
 	and thereby the code size); consult ath_hal(4) for details.
 
 20081121:
 	__FreeBSD_version 800054 adds memory barriers to
 	<machine/atomic.h>, new interfaces to ifnet to facilitate
 	multiple hardware transmit queues for cards that support
 	them, and a lock-less ring-buffer implementation to
 	enable drivers to more efficiently manage queueing of
 	packets.
 
 20081117:
 	A new version of ZFS (version 13) has been merged to -HEAD.
 	This version has zpool attribute "listsnapshots" off by
 	default, which means "zfs list" does not show snapshots,
 	and is the same as Solaris behavior.
 
 20081028:
 	dummynet(4) ABI has changed. ipfw(8) needs to be recompiled.
 
 20081009:
 	The uhci, ohci, ehci and slhci USB Host controller drivers have
 	been put into separate modules. If you load the usb module
 	separately through loader.conf you will need to load the
 	appropriate *hci module as well. E.g. for a UHCI-based USB 2.0
 	controller add the following to loader.conf:
 
 		uhci_load="YES"
 		ehci_load="YES"
 
 20081009:
 	The ABI used by the PMC toolset has changed.  Please keep
 	userland (libpmc(3)) and the kernel module (hwpmc(4)) in
 	sync.
 
 20081009:
 	atapci kernel module now includes only generic PCI ATA
 	driver. AHCI driver moved to ataahci kernel module.
 	All vendor-specific code moved into separate kernel modules:
 	ataacard, ataacerlabs, ataadaptec, ataamd, ataati, atacenatek,
 	atacypress, atacyrix, atahighpoint, ataintel, ataite, atajmicron,
 	atamarvell, atamicron, atanational, atanetcell, atanvidia,
 	atapromise, ataserverworks, atasiliconimage, atasis, atavia
 
 20080820:
 	The TTY subsystem of the kernel has been replaced by a new
 	implementation, which provides better scalability and an
 	improved driver model. Most common drivers have been migrated to
 	the new TTY subsystem, while others have not. The following
 	drivers have not yet been ported to the new TTY layer:
 
 	PCI/ISA:
 		cy, digi, rc, rp, sio
 
 	USB:
 		ubser, ucycom
 
 	Line disciplines:
 		ng_h4, ng_tty, ppp, sl, snp
 
 	Adding these drivers to your kernel configuration file shall
 	cause compilation to fail.
 
 20080818:
 	ntpd has been upgraded to 4.2.4p5.
 
 20080801:
 	OpenSSH has been upgraded to 5.1p1.
 
 	For many years, FreeBSD's version of OpenSSH preferred DSA
 	over RSA for host and user authentication keys.  With this
 	upgrade, we've switched to the vendor's default of RSA over
 	DSA.  This may cause upgraded clients to warn about unknown
 	host keys even for previously known hosts.  Users should
 	follow the usual procedure for verifying host keys before
 	accepting the RSA key.
 
 	This can be circumvented by setting the "HostKeyAlgorithms"
 	option to "ssh-dss,ssh-rsa" in ~/.ssh/config or on the ssh
 	command line.
 
 	Please note that the sequence of keys offered for
 	authentication has been changed as well.  You may want to
 	specify IdentityFile in a different order to revert this
 	behavior.
 
 20080713:
 	The sio(4) driver has been removed from the i386 and amd64
 	kernel configuration files. This means uart(4) is now the
 	default serial port driver on those platforms as well.
 
 	To prevent collisions with the sio(4) driver, the uart(4) driver
 	uses different names for its device nodes. This means the
 	onboard serial port will now most likely be called "ttyu0"
 	instead of "ttyd0". You may need to reconfigure applications to
 	use the new device names.
 
 	When using the serial port as a boot console, be sure to update
 	/boot/device.hints and /etc/ttys before booting the new kernel.
 	If you forget to do so, you can still manually specify the hints
 	at the loader prompt:
 
 		set hint.uart.0.at="isa"
 		set hint.uart.0.port="0x3F8"
 		set hint.uart.0.flags="0x10"
 		set hint.uart.0.irq="4"
 		boot -s
 
 20080609:
 	The gpt(8) utility has been removed. Use gpart(8) to partition
 	disks instead.
 
 20080603:
 	The version that Linuxulator emulates was changed from 2.4.2
 	to 2.6.16. If you experience any problems with Linux binaries
 	please try to set sysctl compat.linux.osrelease to 2.4.2 and
 	if it fixes the problem contact the emulation mailing list.
 
 20080525:
 	ISDN4BSD (I4B) was removed from the src tree. You may need to
 	update your kernel configuration and remove relevant entries.
 
 20080509:
 	I have checked in code to support multiple routing tables.
 	See the man pages setfib(1) and setfib(2).
 	This is a hopefully backwards compatible version,
 	but to make use of it you need to compile your kernel
 	with options ROUTETABLES=2 (or more up to 16).
 
 20080420:
 	The 802.11 wireless support was redone to enable multi-bss
 	operation on devices that are capable.  The underlying device
 	is no longer used directly but instead wlanX devices are
 	cloned with ifconfig.  This requires changes to rc.conf files.
 	For example, change:
 		ifconfig_ath0="WPA DHCP"
 	to
 		wlans_ath0=wlan0
 		ifconfig_wlan0="WPA DHCP"
 	see rc.conf(5) for more details.  In addition, mergemaster of
 	/etc/rc.d is highly recommended.  Simultaneous update of userland
 	and kernel wouldn't hurt either.
 
 	As part of the multi-bss changes the wlan_scan_ap and wlan_scan_sta
 	modules were merged into the base wlan module.  All references
 	to these modules (e.g. in kernel config files) must be removed.
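
 	The rc.conf change described above can be sketched as a small filter.
 	The helper wlan_rc_lines is hypothetical, and it assumes the old
 	setting appears as a single ifconfig_<parent>= line at the start of
 	a line.

```shell
# wlan_rc_lines PARENT WLAN FILE  (hypothetical helper)
# Prints rc.conf-style FILE with the ifconfig_PARENT=... line replaced by
# the wlans_PARENT / ifconfig_WLAN pair used by the cloned wlan devices.
wlan_rc_lines() {
	awk -v p="$1" -v w="$2" '
		index($0, "ifconfig_" p "=") == 1 {
			# Emit the clone list first, then move the settings to wlanN.
			print "wlans_" p "=" w
			print "ifconfig_" w "=" substr($0, length("ifconfig_" p "=") + 1)
			next
		}
		{ print }
	' "$3"
}

# Example: wlan_rc_lines ath0 wlan0 /etc/rc.conf > /tmp/rc.conf.new
```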
 
 20080408:
 	psm(4) has gained write(2) support in native operation level.
 	Arbitrary commands can be written to /dev/psm%d and status can
 	be read back from it.  Therefore, an application is responsible
 	for status validation and error recovery.  It is a no-op in
 	other operation levels.
 
 20080312:
 	Support for KSE threading has been removed from the kernel.  To
 	run legacy applications linked against KSE libmap.conf may
 	be used.  The following libmap.conf may be used to ensure
 	compatibility with any prior release:
 
 	libpthread.so.1 libthr.so.1
 	libpthread.so.2 libthr.so.2
 	libkse.so.3 libthr.so.3
 
 20080301:
 	The layout of struct vmspace has changed. This affects libkvm
 	and any executables that link against libkvm and use the
 	kvm_getprocs() function. In particular, but not exclusively,
 	it affects ps(1), fstat(1), pkill(1), systat(1), top(1) and w(1).
 	The effects are minimal, but it's advisable to upgrade world
 	nonetheless.
 
 20080229:
 	The latest em driver no longer has support in it for the
 	82575 adapter, this is now moved to the igb driver. The
 	split was done to make new features that are incompatible
 	with older hardware easier to do.
 
 20080220:
 	The new geom_lvm(4) geom class has been renamed to geom_linux_lvm(4),
 	likewise the kernel option is now GEOM_LINUX_LVM.
 
 20080211:
 	The default NFS mount mode has changed from UDP to TCP for
 	increased reliability.  If you rely on (insecurely) NFS
 	mounting across a firewall you may need to update your
 	firewall rules.
 
 20080208:
 	Belatedly note the addition of m_collapse for compacting
 	mbuf chains.
 
 20080126:
 	The fts(3) structures have been changed to use adequate
 	integer types for their members and so to be able to cope
 	with huge file trees.  The old fts(3) ABI is preserved
 	through symbol versioning in libc, so third-party binaries
 	using fts(3) should still work, although they will not take
 	advantage of the extended types.  At the same time, some
 	third-party software might fail to build after this change
 	due to unportable assumptions made in its source code about
 	fts(3) structure members.  Such software should be fixed
 	by its vendor or, in the worst case, in the ports tree.
 	__FreeBSD_version 800015 marks this change for the unlikely
 	case that a portable fix is impossible.
 
 20080123:
 	To upgrade to -current after this date, you must be running
 	FreeBSD not older than 6.0-RELEASE.  Upgrading to -current
 	from 5.x now requires a stop over at RELENG_6 or RELENG_7 systems.
 
 20071128:
 	The ADAPTIVE_GIANT kernel option has been retired because its
 	functionality is the default now.
 
 20071118:
 	The AT keyboard emulation of sunkbd(4) has been turned on
 	by default. In order to make the special symbols of the Sun
 	keyboards driven by sunkbd(4) work under X these now have
 	to be configured the same way as Sun USB keyboards driven
 	by ukbd(4) (which also does AT keyboard emulation), e.g.:
 
 	Option	"XkbLayout" "us"
 	Option	"XkbRules" "xorg"
 	Option	"XkbSymbols" "pc(pc105)+sun_vndr/usb(sun_usb)+us"
 
 20071024:
 	It has been decided that it is desirable to provide ABI
 	backwards compatibility to the FreeBSD 4/5/6 versions of the
 	PCIOCGETCONF, PCIOCREAD and PCIOCWRITE IOCTLs, which was
 	broken with the introduction of PCI domain support (see the
 	20070930 entry). Unfortunately, this required the ABI of
 	PCIOCGETCONF to be broken again in order to be able to
 	provide backwards compatibility to the old version of that
 	IOCTL. Thus consumers of PCIOCGETCONF have to be recompiled
 	again. As for prominent ports this affects neither pciutils
 	nor xorg-server this time, the hal port needs to be rebuilt
 	however.
 
 20071020:
 	The misnamed kthread_create() and friends have been renamed
 	to kproc_create() etc. Many of the callers already
 	used kproc_start().
 	I will return kthread_create() and friends in a while
 	with implementations that actually create threads, not procs.
 	Renaming corresponds with version 800002.
 
 20071010:
 	RELENG_7 branched.
 
 COMMON ITEMS:
 
 	General Notes
 	-------------
 	Avoid using make -j when upgrading.  While generally safe, there are
 	sometimes problems using -j to upgrade.  If your upgrade fails with
 	-j, please try again without -j.  From time to time in the past there
 	have been problems using -j with buildworld and/or installworld.  This
 	is especially true when upgrading between "distant" versions (eg one
 	that crosses a major release boundary or several minor releases, or when
 	several months have passed on the -current branch).
 
 	Sometimes, obscure build problems are the result of environment
 	poisoning.  This can happen because the make utility reads its
 	environment when searching for values for global variables.  To run
 	your build attempts in an "environmental clean room", prefix all make
 	commands with 'env -i '.  See the env(1) manual page for more details.
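
 	A minimal sketch of such a "clean room" wrapper follows.  The name
 	clean_run is hypothetical, and passing a fixed minimal PATH is an
 	assumption of this sketch rather than something the text above
 	requires.

```shell
# clean_run CMD [ARGS...]  (hypothetical wrapper)
# Runs CMD with an empty environment apart from a minimal PATH, so stray
# exported variables cannot leak into the build.
clean_run() {
	env -i PATH=/sbin:/bin:/usr/sbin:/usr/bin "$@"
}

# Example: clean_run make buildworld
```

 	Any variable the build does need from the environment has to be
 	passed explicitly, since env -i discards everything else.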
 
 	When upgrading from one major version to another it is generally best
 	to upgrade to the latest code in the currently installed branch first,
 	then do an upgrade to the new branch. This is the best-tested upgrade
 	path, and has the highest probability of being successful.  Please try
 	this approach before reporting problems with a major version upgrade.
 
 	When upgrading a live system, having a root shell around before
 	installing anything can help undo problems. Not having a root shell
 	around can lead to problems if pam has changed too much from your
 	starting point to allow continued authentication after the upgrade.
 
 	ZFS notes
 	---------
 	When upgrading the boot ZFS pool to a new version, always follow
 	these two steps:
 
 	1.) recompile and reinstall the ZFS boot loader and boot block
 	(this is part of "make buildworld" and "make installworld")
 
 	2.) update the ZFS boot block on your boot drive
 
 	The following example updates the ZFS boot block on the first
 	partition (freebsd-boot) of a GPT partitioned drive ad0:
 	"gpart bootcode -p /boot/gptzfsboot -i 1 ad0"
 
 	Non-boot pools do not need these updates.
 
 	To build a kernel
 	-----------------
 	If you are updating from a prior version of FreeBSD (even one just
 	a few days old), you should follow this procedure.  It is the most
 	failsafe as it uses a /usr/obj tree with a fresh mini-buildworld:
 
 	make kernel-toolchain
 	make -DALWAYS_CHECK_MAKE buildkernel KERNCONF=YOUR_KERNEL_HERE
 	make -DALWAYS_CHECK_MAKE installkernel KERNCONF=YOUR_KERNEL_HERE
 
 	To test a kernel once
 	---------------------
 	If you just want to boot a kernel once (because you are not sure
 	if it works, or if you want to boot a known bad kernel to provide
 	debugging information) run
 	make installkernel KERNCONF=YOUR_KERNEL_HERE KODIR=/boot/testkernel
 	nextboot -k testkernel
 
 	To just build a kernel when you know that it won't mess you up
 	--------------------------------------------------------------
 	This assumes you are already running a CURRENT system.  Replace
 	${arch} with the architecture of your machine (e.g. "i386",
 	"arm", "amd64", "ia64", "pc98", "sparc64", "powerpc", "mips", etc).
 
 	cd src/sys/${arch}/conf
 	config KERNEL_NAME_HERE
 	cd ../compile/KERNEL_NAME_HERE
 	make depend
 	make
 	make install
 
 	If this fails, go to the "To build a kernel" section.
 
 	To rebuild everything and install it on the current system.
 	-----------------------------------------------------------
 	# Note: sometimes if you are running current you gotta do more than
 	# is listed here if you are upgrading from a really old current.
 
 	<make sure you have good level 0 dumps>
 	make buildworld
 	make kernel KERNCONF=YOUR_KERNEL_HERE
 							[1]
 	<reboot in single user>				[3]
 	mergemaster -p					[5]
 	make installworld
 	mergemaster -i					[4]
 	make delete-old					[6]
 	<reboot>
 
 	To cross-install current onto a separate partition
 	--------------------------------------------------
 	# In this approach we use a separate partition to hold
 	# current's root, 'usr', and 'var' directories.   A partition
 	# holding "/", "/usr" and "/var" should be about 2GB in
 	# size.
 
 	<make sure you have good level 0 dumps>
 	<boot into -stable>
 	make buildworld
 	make buildkernel KERNCONF=YOUR_KERNEL_HERE
 	<maybe newfs current's root partition>
 	<mount current's root partition on directory ${CURRENT_ROOT}>
 	make installworld DESTDIR=${CURRENT_ROOT}
 	make distribution DESTDIR=${CURRENT_ROOT} # if newfs'd
 	make installkernel KERNCONF=YOUR_KERNEL_HERE DESTDIR=${CURRENT_ROOT}
 	cp /etc/fstab ${CURRENT_ROOT}/etc/fstab 		   # if newfs'd
 	<edit ${CURRENT_ROOT}/etc/fstab to mount "/" from the correct partition>
 	<reboot into current>
 	<do a "native" rebuild/install as described in the previous section>
 	<maybe install compatibility libraries from ports/misc/compat*>
 	<reboot>
 
 
 	To upgrade in-place from stable to current
 	----------------------------------------------
 	<make sure you have good level 0 dumps>
 	make buildworld					[9]
 	make kernel KERNCONF=YOUR_KERNEL_HERE		[8]
 							[1]
 	<reboot in single user>				[3]
 	mergemaster -p					[5]
 	make installworld
 	mergemaster -i					[4]
 	make delete-old					[6]
 	<reboot>
 
 	Make sure that you've read the UPDATING file to understand the
 	tweaks to various things you need.  At this point in the life
 	cycle of current, things change often and you are on your own
 	to cope.  The defaults can also change, so please read ALL of
 	the UPDATING entries.
 
 	Also, if you are tracking -current, you must be subscribed to
 	freebsd-current@freebsd.org.  Make sure that before you update
 	your sources that you have read and understood all the recent
 	messages there.  If in doubt, please track -stable which has
 	far fewer pitfalls.
 
 	[1] If you have third party modules, such as vmware, you
 	should disable them at this point so they don't crash your
 	system on reboot.
 
 	[3] From the bootblocks, boot -s, and then do
 		fsck -p
 		mount -u /
 		mount -a
 		cd src
 		adjkerntz -i		# if CMOS is wall time
 	Also, when doing a major release upgrade, it is required that
 	you boot into single user mode to do the installworld.
 
 	[4] Note: This step is non-optional.  Failure to do this step
 	can result in a significant reduction in the functionality of the
 	system.  Attempting to do it by hand is not recommended and those
 	that pursue this avenue should read this file carefully, as well
 	as the archives of freebsd-current and freebsd-hackers mailing lists
 	for potential gotchas.  The -U option is also useful to consider.
 	See mergemaster(8) for more information.
 
 	[5] Usually this step is a noop.  However, from time to time
 	you may need to do this if you get unknown user in the following
 	step.  It never hurts to do it all the time.  You may need to
 	install a new mergemaster (cd src/usr.sbin/mergemaster && make
 	install) after the buildworld before this step if you last updated
 	from current before 20130425 or from -stable before 20130430.
 
 	[6] This only deletes old files and directories. Old libraries
 	can be deleted by "make delete-old-libs", but you have to make
 	sure that no program is using those libraries anymore.
 
 	[8] In order to have a kernel that can run the 4.x binaries needed to
 	do an installworld, you must include the COMPAT_FREEBSD4 option in
 	your kernel.  Failure to do so may leave you with a system that is
 	hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is
 	required to run the 5.x binaries on more recent kernels.  And so on
 	for COMPAT_FREEBSD6 and COMPAT_FREEBSD7.
 
 	Make sure that you merge any new devices from GENERIC since the
 	last time you updated your kernel config file.
 
 	[9] When checking out sources, you must include the -P flag to have
 	cvs prune empty directories.
 
 	If CPUTYPE is defined in your /etc/make.conf, make sure to use the
 	"?=" instead of the "=" assignment operator, so that buildworld can
 	override the CPUTYPE if it needs to.
 
 	MAKEOBJDIRPREFIX must be set as an environment variable, not on
 	the command line or in /etc/make.conf.  buildworld will warn if
 	it is improperly defined.
 FORMAT:
 
 This file contains a list, in reverse chronological order, of major
 breakages in tracking -current.  It is not guaranteed to be a complete
 list of such breakages, and only contains entries since October 10, 2007.
 If you need to see UPDATING entries from before that date, you will need
 to fetch an UPDATING file from an older FreeBSD release.
 
 Copyright information:
 
 Copyright 1998-2009 M. Warner Losh.  All Rights Reserved.
 
 Redistribution, publication, translation and use, with or without
 modification, in full or in part, in any form or format of this
 document are permitted without further permission from the author.
 
 THIS DOCUMENT IS PROVIDED BY WARNER LOSH ``AS IS'' AND ANY EXPRESS OR
 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED.  IN NO EVENT SHALL WARNER LOSH BE LIABLE FOR ANY DIRECT,
 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE.
 
 Contact Warner Losh if you have any questions about your use of
 this document.
 
 $FreeBSD$
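Note on the dhclient.c diff that follows: when a DHCPACK carries no renewal or rebinding time option, dhcpack() derives T1 as half the lease time and T2 as T1 + T1/2 + T1/4, enforces a 60-second minimum lease, and caps the absolute times at TIME_MAX. The sketch below restates that arithmetic as a standalone function. It is an illustration under stated assumptions, not code from the tree: the names sketch_compute_times, sketch_deadline, struct sketch_lease_times and SKETCH_TIME_MAX are hypothetical, it ignores server-supplied T1/T2 options, and it uses int64_t with an explicit saturating clamp in place of the time_t wraparound checks in dhcpack(), so results above the cap are clamped even where a 64-bit time_t would not wrap.

```c
#include <stdint.h>

/* Hypothetical cap, mirroring the TIME_MAX value defined in dhclient.c. */
#define SKETCH_TIME_MAX INT64_C(2147483647)

/* Absolute milestones for one lease; all names here are illustrative. */
struct sketch_lease_times {
	int64_t renewal;	/* T1: begin renewing the lease */
	int64_t rebind;		/* T2: begin rebinding */
	int64_t expiry;		/* lease may no longer be used */
};

/* Add an offset to "now", saturating at SKETCH_TIME_MAX. */
static int64_t
sketch_deadline(int64_t now, int64_t offset)
{
	int64_t t = now + offset;

	return (t > SKETCH_TIME_MAX ? SKETCH_TIME_MAX : t);
}

/*
 * Default timer derivation, assuming the server sent only a lease time.
 * "now" is assumed to be a small non-negative wall-clock value.
 */
static struct sketch_lease_times
sketch_compute_times(int64_t now, uint32_t lease_secs)
{
	struct sketch_lease_times t;
	int64_t expiry = lease_secs;
	int64_t renewal, rebind;

	/* Leases shorter than a minute are raised to 60 seconds. */
	if (expiry < 60)
		expiry = 60;

	renewal = expiry / 2;				/* T1 */
	rebind = renewal + renewal / 2 + renewal / 4;	/* T2 */

	t.renewal = sketch_deadline(now, renewal);
	t.rebind = sketch_deadline(now, rebind);
	t.expiry = sketch_deadline(now, expiry);
	return (t);
}
```

For a 3600-second lease received at time 1000, this yields renewal at 2800, rebinding at 4150, and expiry at 4600.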
Index: releng/10.3/sbin/dhclient/dhclient.c
===================================================================
--- releng/10.3/sbin/dhclient/dhclient.c	(revision 303983)
+++ releng/10.3/sbin/dhclient/dhclient.c	(revision 303984)
@@ -1,2762 +1,2763 @@
 /*	$OpenBSD: dhclient.c,v 1.63 2005/02/06 17:10:13 krw Exp $	*/
 
 /*
  * Copyright 2004 Henning Brauer <henning@openbsd.org>
  * Copyright (c) 1995, 1996, 1997, 1998, 1999
  * The Internet Software Consortium.    All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  *
  * 1. Redistributions of source code must retain the above copyright
  *    notice, this list of conditions and the following disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  * 3. Neither the name of The Internet Software Consortium nor the names
  *    of its contributors may be used to endorse or promote products derived
  *    from this software without specific prior written permission.
  *
  * THIS SOFTWARE IS PROVIDED BY THE INTERNET SOFTWARE CONSORTIUM AND
  * CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
  * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
  * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
  * DISCLAIMED.  IN NO EVENT SHALL THE INTERNET SOFTWARE CONSORTIUM OR
  * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
  * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
  * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
  * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
  * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  * SUCH DAMAGE.
  *
  * This software has been written for the Internet Software Consortium
  * by Ted Lemon <mellon@fugue.com> in cooperation with Vixie
  * Enterprises.  To learn more about the Internet Software Consortium,
  * see ``http://www.vix.com/isc''.  To learn more about Vixie
  * Enterprises, see ``http://www.vix.com''.
  *
  * This client was substantially modified and enhanced by Elliot Poger
  * for use on Linux while he was working on the MosquitoNet project at
  * Stanford.
  *
  * The current version owes much to Elliot's Linux enhancements, but
  * was substantially reorganized and partially rewritten by Ted Lemon
  * so as to use the same networking framework that the Internet Software
  * Consortium DHCP server uses.   Much system-specific configuration code
  * was moved into a shell script so that as support for more operating
  * systems is added, it will not be necessary to port and maintain
  * system-specific configuration code to these operating systems - instead,
  * the shell script can invoke the native tools to accomplish the same
  * purpose.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/capsicum.h>
 
 #include "dhcpd.h"
 #include "privsep.h"
 
 #include <sys/capsicum.h>
 
 #include <net80211/ieee80211_freebsd.h>
 
 #ifndef _PATH_VAREMPTY
 #define	_PATH_VAREMPTY	"/var/empty"
 #endif
 
 #define	PERIOD 0x2e
 #define	hyphenchar(c) ((c) == 0x2d)
 #define	bslashchar(c) ((c) == 0x5c)
 #define	periodchar(c) ((c) == PERIOD)
 #define	asterchar(c) ((c) == 0x2a)
 #define	alphachar(c) (((c) >= 0x41 && (c) <= 0x5a) || \
 	    ((c) >= 0x61 && (c) <= 0x7a))
 #define	digitchar(c) ((c) >= 0x30 && (c) <= 0x39)
 #define	whitechar(c) ((c) == ' ' || (c) == '\t')
 
 #define	borderchar(c) (alphachar(c) || digitchar(c))
 #define	middlechar(c) (borderchar(c) || hyphenchar(c))
 #define	domainchar(c) ((c) > 0x20 && (c) < 0x7f)
 
 #define	CLIENT_PATH "PATH=/usr/bin:/usr/sbin:/bin:/sbin"
 
 time_t cur_time;
 time_t default_lease_time = 43200; /* 12 hours... */
 
 char *path_dhclient_conf = _PATH_DHCLIENT_CONF;
 char *path_dhclient_db = NULL;
 
 int log_perror = 1;
 int privfd;
 int nullfd = -1;
 
 char hostname[_POSIX_HOST_NAME_MAX + 1];
 
 struct iaddr iaddr_broadcast = { 4, { 255, 255, 255, 255 } };
 struct in_addr inaddr_any, inaddr_broadcast;
 
 char *path_dhclient_pidfile;
 struct pidfh *pidfile;
 
 /*
  * ASSERT_STATE() does nothing now; it used to be
  * assert (state_is == state_shouldbe).
  */
 #define ASSERT_STATE(state_is, state_shouldbe) {}
 
 #define TIME_MAX 2147483647
 
 int		log_priority;
 int		no_daemon;
 int		unknown_ok = 1;
 int		routefd;
 
 struct interface_info	*ifi;
 
 int		 findproto(char *, int);
 struct sockaddr	*get_ifa(char *, int);
 void		 routehandler(struct protocol *);
 void		 usage(void);
 int		 check_option(struct client_lease *l, int option);
 int		 check_classless_option(unsigned char *data, int len);
 int		 ipv4addrs(char * buf);
 int		 res_hnok(const char *dn);
 int		 check_search(const char *srch);
 char		*option_as_string(unsigned int code, unsigned char *data, int len);
 int		 fork_privchld(int, int);
 
 #define	ROUNDUP(a) \
 	    ((a) > 0 ? (1 + (((a) - 1) | (sizeof(long) - 1))) : sizeof(long))
 #define	ADVANCE(x, n) (x += ROUNDUP((n)->sa_len))
 
 static time_t	scripttime;
 
 int
 findproto(char *cp, int n)
 {
 	struct sockaddr *sa;
 	int i;
 
 	if (n == 0)
 		return -1;
 	for (i = 1; i; i <<= 1) {
 		if (i & n) {
 			sa = (struct sockaddr *)cp;
 			switch (i) {
 			case RTA_IFA:
 			case RTA_DST:
 			case RTA_GATEWAY:
 			case RTA_NETMASK:
 				if (sa->sa_family == AF_INET)
 					return AF_INET;
 				if (sa->sa_family == AF_INET6)
 					return AF_INET6;
 				break;
 			case RTA_IFP:
 				break;
 			}
 			ADVANCE(cp, sa);
 		}
 	}
 	return (-1);
 }
 
 struct sockaddr *
 get_ifa(char *cp, int n)
 {
 	struct sockaddr *sa;
 	int i;
 
 	if (n == 0)
 		return (NULL);
 	for (i = 1; i; i <<= 1)
 		if (i & n) {
 			sa = (struct sockaddr *)cp;
 			if (i == RTA_IFA)
 				return (sa);
 			ADVANCE(cp, sa);
 		}
 
 	return (NULL);
 }
 
 struct iaddr defaddr = { 4 };
 uint8_t curbssid[6];
 
 static void
 disassoc(void *arg)
 {
 	struct interface_info *ifi = arg;
 
 	/*
 	 * Clear existing state.
 	 */
 	if (ifi->client->active != NULL) {
 		script_init("EXPIRE", NULL);
 		script_write_params("old_",
 		    ifi->client->active);
 		if (ifi->client->alias)
 			script_write_params("alias_",
 				ifi->client->alias);
 		script_go();
 	}
 	ifi->client->state = S_INIT;
 }
 
 /* ARGSUSED */
 void
 routehandler(struct protocol *p)
 {
 	char msg[2048], *addr;
 	struct rt_msghdr *rtm;
 	struct if_msghdr *ifm;
 	struct ifa_msghdr *ifam;
 	struct if_announcemsghdr *ifan;
 	struct ieee80211_join_event *jev;
 	struct client_lease *l;
 	time_t t = time(NULL);
 	struct sockaddr *sa;
 	struct iaddr a;
 	ssize_t n;
 	int linkstat;
 
 	n = read(routefd, &msg, sizeof(msg));
 	rtm = (struct rt_msghdr *)msg;
 	if (n < sizeof(rtm->rtm_msglen) || n < rtm->rtm_msglen ||
 	    rtm->rtm_version != RTM_VERSION)
 		return;
 
 	switch (rtm->rtm_type) {
 	case RTM_NEWADDR:
 	case RTM_DELADDR:
 		ifam = (struct ifa_msghdr *)rtm;
 
 		if (ifam->ifam_index != ifi->index)
 			break;
 		if (findproto((char *)(ifam + 1), ifam->ifam_addrs) != AF_INET)
 			break;
 		if (scripttime == 0 || t < scripttime + 10)
 			break;
 
 		sa = get_ifa((char *)(ifam + 1), ifam->ifam_addrs);
 		if (sa == NULL)
 			break;
 
 		if ((a.len = sizeof(struct in_addr)) > sizeof(a.iabuf))
 			error("king bula sez: len mismatch");
 		memcpy(a.iabuf, &((struct sockaddr_in *)sa)->sin_addr, a.len);
 		if (addr_eq(a, defaddr))
 			break;
 
 		for (l = ifi->client->active; l != NULL; l = l->next)
 			if (addr_eq(a, l->address))
 				break;
 
 		if (l == NULL)	/* added/deleted addr is not the one we set */
 			break;
 
 		addr = inet_ntoa(((struct sockaddr_in *)sa)->sin_addr);
 		if (rtm->rtm_type == RTM_NEWADDR)  {
 			/*
 			 * XXX: If someone other than us adds our address,
 			 * should we assume they are taking over from us,
 			 * delete the lease record, and exit without modifying
 			 * the interface?
 			 */
 			warning("My address (%s) was re-added", addr);
 		} else {
 			warning("My address (%s) was deleted, dhclient exiting",
 			    addr);
 			goto die;
 		}
 		break;
 	case RTM_IFINFO:
 		ifm = (struct if_msghdr *)rtm;
 		if (ifm->ifm_index != ifi->index)
 			break;
 		if ((rtm->rtm_flags & RTF_UP) == 0) {
 			warning("Interface %s is down, dhclient exiting",
 			    ifi->name);
 			goto die;
 		}
 		linkstat = interface_link_status(ifi->name);
 		if (linkstat != ifi->linkstat) {
 			debug("%s link state %s -> %s", ifi->name,
 			    ifi->linkstat ? "up" : "down",
 			    linkstat ? "up" : "down");
 			ifi->linkstat = linkstat;
 			if (linkstat)
 				state_reboot(ifi);
 		}
 		break;
 	case RTM_IFANNOUNCE:
 		ifan = (struct if_announcemsghdr *)rtm;
 		if (ifan->ifan_what == IFAN_DEPARTURE &&
 		    ifan->ifan_index == ifi->index) {
 			warning("Interface %s is gone, dhclient exiting",
 			    ifi->name);
 			goto die;
 		}
 		break;
 	case RTM_IEEE80211:
 		ifan = (struct if_announcemsghdr *)rtm;
 		if (ifan->ifan_index != ifi->index)
 			break;
 		switch (ifan->ifan_what) {
 		case RTM_IEEE80211_ASSOC:
 		case RTM_IEEE80211_REASSOC:
 			/*
 			 * Use assoc/reassoc event to kick state machine
 			 * in case we roam.  Otherwise fall back to the
 			 * normal state machine just like a wired network.
 			 */
 			jev = (struct ieee80211_join_event *) &ifan[1];
 			if (memcmp(curbssid, jev->iev_addr, 6)) {
 				disassoc(ifi);
 				state_reboot(ifi);
 			}
 			memcpy(curbssid, jev->iev_addr, 6);
 			break;
 		}
 		break;
 	default:
 		break;
 	}
 	return;
 
 die:
 	script_init("FAIL", NULL);
 	if (ifi->client->alias)
 		script_write_params("alias_", ifi->client->alias);
 	script_go();
 	if (pidfile != NULL)
 		pidfile_remove(pidfile);
 	exit(1);
 }
 
 int
 main(int argc, char *argv[])
 {
 	extern char		*__progname;
 	int			 ch, fd, quiet = 0, i = 0;
 	int			 pipe_fd[2];
 	int			 immediate_daemon = 0;
 	struct passwd		*pw;
 	pid_t			 otherpid;
 	cap_rights_t		 rights;
 
 	/* Initially, log errors to stderr as well as to syslogd. */
 	openlog(__progname, LOG_PID | LOG_NDELAY, DHCPD_LOG_FACILITY);
 	setlogmask(LOG_UPTO(LOG_DEBUG));
 
 	while ((ch = getopt(argc, argv, "bc:dl:p:qu")) != -1)
 		switch (ch) {
 		case 'b':
 			immediate_daemon = 1;
 			break;
 		case 'c':
 			path_dhclient_conf = optarg;
 			break;
 		case 'd':
 			no_daemon = 1;
 			break;
 		case 'l':
 			path_dhclient_db = optarg;
 			break;
 		case 'p':
 			path_dhclient_pidfile = optarg;
 			break;
 		case 'q':
 			quiet = 1;
 			break;
 		case 'u':
 			unknown_ok = 0;
 			break;
 		default:
 			usage();
 		}
 
 	argc -= optind;
 	argv += optind;
 
 	if (argc != 1)
 		usage();
 
 	if (path_dhclient_pidfile == NULL) {
 		asprintf(&path_dhclient_pidfile,
 		    "%sdhclient.%s.pid", _PATH_VARRUN, *argv);
 		if (path_dhclient_pidfile == NULL)
 			error("asprintf");
 	}
 	pidfile = pidfile_open(path_dhclient_pidfile, 0600, &otherpid);
 	if (pidfile == NULL) {
 		if (errno == EEXIST)
 			error("dhclient already running, pid: %d.", otherpid);
 		if (errno == EAGAIN)
 			error("dhclient already running.");
 		warning("Cannot open or create pidfile: %m");
 	}
 
 	if ((ifi = calloc(1, sizeof(struct interface_info))) == NULL)
 		error("calloc");
 	if (strlcpy(ifi->name, argv[0], IFNAMSIZ) >= IFNAMSIZ)
 		error("Interface name too long");
 	if (path_dhclient_db == NULL && asprintf(&path_dhclient_db, "%s.%s",
 	    _PATH_DHCLIENT_DB, ifi->name) == -1)
 		error("asprintf");
 
 	if (quiet)
 		log_perror = 0;
 
 	tzset();
 	time(&cur_time);
 
 	inaddr_broadcast.s_addr = INADDR_BROADCAST;
 	inaddr_any.s_addr = INADDR_ANY;
 
 	read_client_conf();
 
 	/* The next bit is potentially very time-consuming, so write out
 	   the pidfile right away.  We will write it out again with the
 	   correct pid after daemonizing. */
 	if (pidfile != NULL)
 		pidfile_write(pidfile);
 
 	if (!interface_link_status(ifi->name)) {
 		fprintf(stderr, "%s: no link ...", ifi->name);
 		fflush(stderr);
 		sleep(1);
 		while (!interface_link_status(ifi->name)) {
 			fprintf(stderr, ".");
 			fflush(stderr);
 			if (++i > 10) {
 				fprintf(stderr, " giving up\n");
 				exit(1);
 			}
 			sleep(1);
 		}
 		fprintf(stderr, " got link\n");
 	}
 	ifi->linkstat = 1;
 
 	if ((nullfd = open(_PATH_DEVNULL, O_RDWR, 0)) == -1)
 		error("cannot open %s: %m", _PATH_DEVNULL);
 
 	if ((pw = getpwnam("_dhcp")) == NULL) {
 		warning("no such user: _dhcp, falling back to \"nobody\"");
 		if ((pw = getpwnam("nobody")) == NULL)
 			error("no such user: nobody");
 	}
 
 	/*
 	 * Obtain hostname before entering capability mode - it won't be
 	 * possible then, as reading kern.hostname is not permitted.
 	 */
 	if (gethostname(hostname, sizeof(hostname)) < 0)
 		hostname[0] = '\0';
 
 	priv_script_init("PREINIT", NULL);
 	if (ifi->client->alias)
 		priv_script_write_params("alias_", ifi->client->alias);
 	priv_script_go();
 
 	/* set up the interface */
 	discover_interfaces(ifi);
 
 	if (pipe(pipe_fd) == -1)
 		error("pipe");
 
 	fork_privchld(pipe_fd[0], pipe_fd[1]);
 
 	close(ifi->ufdesc);
 	ifi->ufdesc = -1;
 	close(ifi->wfdesc);
 	ifi->wfdesc = -1;
 
 	close(pipe_fd[0]);
 	privfd = pipe_fd[1];
 	cap_rights_init(&rights, CAP_READ, CAP_WRITE);
 	if (cap_rights_limit(privfd, &rights) < 0 && errno != ENOSYS)
 		error("can't limit private descriptor: %m");
 
 	if ((fd = open(path_dhclient_db, O_RDONLY|O_EXLOCK|O_CREAT, 0)) == -1)
 		error("can't open and lock %s: %m", path_dhclient_db);
 	read_client_leases();
 	rewrite_client_leases();
 	close(fd);
 
 	if ((routefd = socket(PF_ROUTE, SOCK_RAW, 0)) != -1)
 		add_protocol("AF_ROUTE", routefd, routehandler, ifi);
 	if (shutdown(routefd, SHUT_WR) < 0)
 		error("can't shutdown route socket: %m");
 	cap_rights_init(&rights, CAP_EVENT, CAP_READ);
 	if (cap_rights_limit(routefd, &rights) < 0 && errno != ENOSYS)
 		error("can't limit route socket: %m");
 
 	if (chroot(_PATH_VAREMPTY) == -1)
 		error("chroot");
 	if (chdir("/") == -1)
 		error("chdir(\"/\")");
 
 	if (setgroups(1, &pw->pw_gid) ||
 	    setegid(pw->pw_gid) || setgid(pw->pw_gid) ||
 	    seteuid(pw->pw_uid) || setuid(pw->pw_uid))
 		error("can't drop privileges: %m");
 
 	endpwent();
 
 	setproctitle("%s", ifi->name);
 
 	if (cap_enter() < 0 && errno != ENOSYS)
 		error("can't enter capability mode: %m");
 
 	if (immediate_daemon)
 		go_daemon();
 
 	ifi->client->state = S_INIT;
 	state_reboot(ifi);
 
 	bootp_packet_handler = do_packet;
 
 	dispatch();
 
 	/* not reached */
 	return (0);
 }
 
 void
 usage(void)
 {
 	extern char	*__progname;
 
 	fprintf(stderr, "usage: %s [-bdqu] ", __progname);
 	fprintf(stderr, "[-c conffile] [-l leasefile] interface\n");
 	exit(1);
 }
 
 /*
  * Individual States:
  *
  * Each routine is called from the dhclient_state_machine() in one of
  * these conditions:
  * -> entering INIT state
  * -> recvpacket_flag == 0: timeout in this state
  * -> otherwise: received a packet in this state
  *
  * Return conditions as handled by dhclient_state_machine():
  * Returns 1, sendpacket_flag = 1: send packet, reset timer.
  * Returns 1, sendpacket_flag = 0: just reset the timer (wait for a milestone).
  * Returns 0: finish the nap which was interrupted for no good reason.
  *
  * Several per-interface variables are used to keep track of the process:
  *   active_lease: the lease that is being used on the interface
  *                 (null pointer if not configured yet).
  *   offered_leases: leases corresponding to DHCPOFFER messages that have
  *                   been sent to us by DHCP servers.
  *   acked_leases: leases corresponding to DHCPACK messages that have been
  *                 sent to us by DHCP servers.
  *   sendpacket: DHCP packet we're trying to send.
  *   destination: IP address to send sendpacket to
  * In addition, there are several relevant per-lease variables.
  *   T1_expiry, T2_expiry, lease_expiry: lease milestones
  * In the active lease, these control the process of renewing the lease;
  * In leases on the acked_leases list, this simply determines when we
  * can no longer legitimately use the lease.
  */
 
 void
 state_reboot(void *ipp)
 {
 	struct interface_info *ip = ipp;
 
 	/* If we don't remember an active lease, go straight to INIT. */
 	if (!ip->client->active || ip->client->active->is_bootp) {
 		state_init(ip);
 		return;
 	}
 
 	/* We are in the rebooting state. */
 	ip->client->state = S_REBOOTING;
 
 	/* make_request doesn't initialize xid because it normally comes
 	   from the DHCPDISCOVER, but we haven't sent a DHCPDISCOVER,
 	   so pick an xid now. */
 	ip->client->xid = arc4random();
 
 	/* Make a DHCPREQUEST packet, and set appropriate per-interface
 	   flags. */
 	make_request(ip, ip->client->active);
 	ip->client->destination = iaddr_broadcast;
 	ip->client->first_sending = cur_time;
 	ip->client->interval = ip->client->config->initial_interval;
 
 	/* Zap the medium list... */
 	ip->client->medium = NULL;
 
 	/* Send out the first DHCPREQUEST packet. */
 	send_request(ip);
 }
 
 /*
  * Called when a lease has completely expired and we've
  * been unable to renew it.
  */
 void
 state_init(void *ipp)
 {
 	struct interface_info *ip = ipp;
 
 	ASSERT_STATE(state, S_INIT);
 
 	/* Make a DHCPDISCOVER packet, and set appropriate per-interface
 	   flags. */
 	make_discover(ip, ip->client->active);
 	ip->client->xid = ip->client->packet.xid;
 	ip->client->destination = iaddr_broadcast;
 	ip->client->state = S_SELECTING;
 	ip->client->first_sending = cur_time;
 	ip->client->interval = ip->client->config->initial_interval;
 
 	/* Add an immediate timeout to cause the first DHCPDISCOVER packet
 	   to go out. */
 	send_discover(ip);
 }
 
 /*
  * state_selecting is called when one or more DHCPOFFER packets
  * have been received and a configurable period of time has passed.
  */
 void
 state_selecting(void *ipp)
 {
 	struct interface_info *ip = ipp;
 	struct client_lease *lp, *next, *picked;
 
 	ASSERT_STATE(state, S_SELECTING);
 
 	/* Cancel state_selecting and send_discover timeouts, since either
 	   one could have got us here. */
 	cancel_timeout(state_selecting, ip);
 	cancel_timeout(send_discover, ip);
 
 	/* We have received one or more DHCPOFFER packets.   Currently,
 	   the only criterion by which we judge leases is whether or
 	   not we get a response when we arp for them. */
 	picked = NULL;
 	for (lp = ip->client->offered_leases; lp; lp = next) {
 		next = lp->next;
 
 		/* Check to see if we got an ARPREPLY for the address
 		   in this particular lease. */
 		if (!picked) {
 			script_init("ARPCHECK", lp->medium);
 			script_write_params("check_", lp);
 
 			/* If the ARPCHECK code detects another
 			   machine using the offered address, it exits
 			   nonzero.  We need to send a DHCPDECLINE and
 			   toss the lease. */
 			if (script_go()) {
 				make_decline(ip, lp);
 				send_decline(ip);
 				goto freeit;
 			}
 			picked = lp;
 			picked->next = NULL;
 		} else {
 freeit:
 			free_client_lease(lp);
 		}
 	}
 	ip->client->offered_leases = NULL;
 
 	/* If we just tossed all the leases we were offered, go back
 	   to square one. */
 	if (!picked) {
 		ip->client->state = S_INIT;
 		state_init(ip);
 		return;
 	}
 
 	/* If it was a BOOTREPLY, we can just take the address right now. */
 	if (!picked->options[DHO_DHCP_MESSAGE_TYPE].len) {
 		ip->client->new = picked;
 
 		/* Make up some lease expiry times
 		   XXX these should be configurable. */
 		ip->client->new->expiry = cur_time + 12000;
 		ip->client->new->renewal += cur_time + 8000;
 		ip->client->new->rebind += cur_time + 10000;
 
 		ip->client->state = S_REQUESTING;
 
 		/* Bind to the address we received. */
 		bind_lease(ip);
 		return;
 	}
 
 	/* Go to the REQUESTING state. */
 	ip->client->destination = iaddr_broadcast;
 	ip->client->state = S_REQUESTING;
 	ip->client->first_sending = cur_time;
 	ip->client->interval = ip->client->config->initial_interval;
 
 	/* Make a DHCPREQUEST packet from the lease we picked. */
 	make_request(ip, picked);
 	ip->client->xid = ip->client->packet.xid;
 
 	/* Toss the lease we picked - we'll get it back in a DHCPACK. */
 	free_client_lease(picked);
 
 	/* Add an immediate timeout to send the first DHCPREQUEST packet. */
 	send_request(ip);
 }
 
 /* dhcpack is called when we receive a DHCPACK message after
    having sent out one or more DHCPREQUEST packets. */
 
 void
 dhcpack(struct packet *packet)
 {
 	struct interface_info *ip = packet->interface;
 	struct client_lease *lease;
 
 	/* If we're not receptive to an offer right now, or if the offer
 	   has an unrecognizable transaction id, then just drop it. */
 	if (packet->interface->client->xid != packet->raw->xid ||
 	    (packet->interface->hw_address.hlen != packet->raw->hlen) ||
 	    (memcmp(packet->interface->hw_address.haddr,
 	    packet->raw->chaddr, packet->raw->hlen)))
 		return;
 
 	if (ip->client->state != S_REBOOTING &&
 	    ip->client->state != S_REQUESTING &&
 	    ip->client->state != S_RENEWING &&
 	    ip->client->state != S_REBINDING)
 		return;
 
 	note("DHCPACK from %s", piaddr(packet->client_addr));
 
 	lease = packet_to_lease(packet);
 	if (!lease) {
 		note("packet_to_lease failed.");
 		return;
 	}
 
 	ip->client->new = lease;
 
 	/* Stop resending DHCPREQUEST. */
 	cancel_timeout(send_request, ip);
 
 	/* Figure out the lease time. */
 	if (ip->client->new->options[DHO_DHCP_LEASE_TIME].data)
 		ip->client->new->expiry = getULong(
 		    ip->client->new->options[DHO_DHCP_LEASE_TIME].data);
 	else
 		ip->client->new->expiry = default_lease_time;
 	/* A number that looks negative here is really just very large,
 	   because the lease expiry offset is unsigned. */
 	if (ip->client->new->expiry < 0)
 		ip->client->new->expiry = TIME_MAX;
 	/* XXX should be fixed by resetting the client state */
 	if (ip->client->new->expiry < 60)
 		ip->client->new->expiry = 60;
 
 	/* Take the server-provided renewal time if there is one;
 	   otherwise figure it out according to the spec. */
 	if (ip->client->new->options[DHO_DHCP_RENEWAL_TIME].len)
 		ip->client->new->renewal = getULong(
 		    ip->client->new->options[DHO_DHCP_RENEWAL_TIME].data);
 	else
 		ip->client->new->renewal = ip->client->new->expiry / 2;
 
 	/* Same deal with the rebind time. */
 	if (ip->client->new->options[DHO_DHCP_REBINDING_TIME].len)
 		ip->client->new->rebind = getULong(
 		    ip->client->new->options[DHO_DHCP_REBINDING_TIME].data);
 	else
 		ip->client->new->rebind = ip->client->new->renewal +
 		    ip->client->new->renewal / 2 + ip->client->new->renewal / 4;
 
 	ip->client->new->expiry += cur_time;
 	/* Lease lengths can never be negative. */
 	if (ip->client->new->expiry < cur_time)
 		ip->client->new->expiry = TIME_MAX;
 	ip->client->new->renewal += cur_time;
 	if (ip->client->new->renewal < cur_time)
 		ip->client->new->renewal = TIME_MAX;
 	ip->client->new->rebind += cur_time;
 	if (ip->client->new->rebind < cur_time)
 		ip->client->new->rebind = TIME_MAX;
 
 	bind_lease(ip);
 }
 
 void
 bind_lease(struct interface_info *ip)
 {
 	/* Remember the medium. */
 	ip->client->new->medium = ip->client->medium;
 
 	/* Write out the new lease. */
 	write_client_lease(ip, ip->client->new, 0);
 
 	/* Run the client script with the new parameters. */
 	script_init((ip->client->state == S_REQUESTING ? "BOUND" :
 	    (ip->client->state == S_RENEWING ? "RENEW" :
 	    (ip->client->state == S_REBOOTING ? "REBOOT" : "REBIND"))),
 	    ip->client->new->medium);
 	if (ip->client->active && ip->client->state != S_REBOOTING)
 		script_write_params("old_", ip->client->active);
 	script_write_params("new_", ip->client->new);
 	if (ip->client->alias)
 		script_write_params("alias_", ip->client->alias);
 	script_go();
 
 	/* Replace the old active lease with the new one. */
 	if (ip->client->active)
 		free_client_lease(ip->client->active);
 	ip->client->active = ip->client->new;
 	ip->client->new = NULL;
 
 	/* Set up a timeout to start the renewal process. */
 	add_timeout(ip->client->active->renewal, state_bound, ip);
 
 	note("bound to %s -- renewal in %d seconds.",
 	    piaddr(ip->client->active->address),
 	    (int)(ip->client->active->renewal - cur_time));
 	ip->client->state = S_BOUND;
 	reinitialize_interfaces();
 	go_daemon();
 }
 
 /*
  * state_bound is called when we've successfully bound to a particular
  * lease, but the renewal time on that lease has expired.   We are
  * expected to unicast a DHCPREQUEST to the server that gave us our
  * original lease.
  */
 void
 state_bound(void *ipp)
 {
 	struct interface_info *ip = ipp;
 
 	ASSERT_STATE(state, S_BOUND);
 
 	/* T1 has expired. */
 	make_request(ip, ip->client->active);
 	ip->client->xid = ip->client->packet.xid;
 
 	if (ip->client->active->options[DHO_DHCP_SERVER_IDENTIFIER].len == 4) {
 		memcpy(ip->client->destination.iabuf, ip->client->active->
 		    options[DHO_DHCP_SERVER_IDENTIFIER].data, 4);
 		ip->client->destination.len = 4;
 	} else
 		ip->client->destination = iaddr_broadcast;
 
 	ip->client->first_sending = cur_time;
 	ip->client->interval = ip->client->config->initial_interval;
 	ip->client->state = S_RENEWING;
 
 	/* Send the first packet immediately. */
 	send_request(ip);
 }
 
 void
 bootp(struct packet *packet)
 {
 	struct iaddrlist *ap;
 
 	if (packet->raw->op != BOOTREPLY)
 		return;
 
 	/* If there's a reject list, make sure this packet's sender isn't
 	   on it. */
 	for (ap = packet->interface->client->config->reject_list;
 	    ap; ap = ap->next) {
 		if (addr_eq(packet->client_addr, ap->addr)) {
 			note("BOOTREPLY from %s rejected.", piaddr(ap->addr));
 			return;
 		}
 	}
 	dhcpoffer(packet);
 }
 
 void
 dhcp(struct packet *packet)
 {
 	struct iaddrlist *ap;
 	void (*handler)(struct packet *);
 	char *type;
 
 	switch (packet->packet_type) {
 	case DHCPOFFER:
 		handler = dhcpoffer;
 		type = "DHCPOFFER";
 		break;
 	case DHCPNAK:
 		handler = dhcpnak;
 		type = "DHCPNACK";
 		break;
 	case DHCPACK:
 		handler = dhcpack;
 		type = "DHCPACK";
 		break;
 	default:
 		return;
 	}
 
 	/* If there's a reject list, make sure this packet's sender isn't
 	   on it. */
 	for (ap = packet->interface->client->config->reject_list;
 	    ap; ap = ap->next) {
 		if (addr_eq(packet->client_addr, ap->addr)) {
 			note("%s from %s rejected.", type, piaddr(ap->addr));
 			return;
 		}
 	}
 	(*handler)(packet);
 }
 
 void
 dhcpoffer(struct packet *packet)
 {
 	struct interface_info *ip = packet->interface;
 	struct client_lease *lease, *lp;
 	int i;
 	int arp_timeout_needed, stop_selecting;
 	char *name = packet->options[DHO_DHCP_MESSAGE_TYPE].len ?
 	    "DHCPOFFER" : "BOOTREPLY";
 
 	/* If we're not receptive to an offer right now, or if the offer
 	   has an unrecognizable transaction id, then just drop it. */
 	if (ip->client->state != S_SELECTING ||
 	    packet->interface->client->xid != packet->raw->xid ||
 	    (packet->interface->hw_address.hlen != packet->raw->hlen) ||
 	    (memcmp(packet->interface->hw_address.haddr,
 	    packet->raw->chaddr, packet->raw->hlen)))
 		return;
 
 	note("%s from %s", name, piaddr(packet->client_addr));
 
 
 	/* If this lease doesn't supply the minimum required parameters,
 	   blow it off. */
 	for (i = 0; ip->client->config->required_options[i]; i++) {
 		if (!packet->options[ip->client->config->
 		    required_options[i]].len) {
 			note("%s isn't satisfactory.", name);
 			return;
 		}
 	}
 
 	/* If we've already seen this lease, don't record it again. */
 	for (lease = ip->client->offered_leases;
 	    lease; lease = lease->next) {
 		if (lease->address.len == sizeof(packet->raw->yiaddr) &&
 		    !memcmp(lease->address.iabuf,
 		    &packet->raw->yiaddr, lease->address.len)) {
 			debug("%s already seen.", name);
 			return;
 		}
 	}
 
 	lease = packet_to_lease(packet);
 	if (!lease) {
 		note("packet_to_lease failed.");
 		return;
 	}
 
 	/* If this lease was acquired through a BOOTREPLY, record that
 	   fact. */
 	if (!packet->options[DHO_DHCP_MESSAGE_TYPE].len)
 		lease->is_bootp = 1;
 
 	/* Record the medium under which this lease was offered. */
 	lease->medium = ip->client->medium;
 
 	/* Send out an ARP Request for the offered IP address. */
 	script_init("ARPSEND", lease->medium);
 	script_write_params("check_", lease);
 	/* If the script can't send an ARP request without waiting,
 	   we'll be waiting when we do the ARPCHECK, so don't wait now. */
 	if (script_go())
 		arp_timeout_needed = 0;
 	else
 		arp_timeout_needed = 2;
 
 	/* Figure out when we're supposed to stop selecting. */
 	stop_selecting =
 	    ip->client->first_sending + ip->client->config->select_interval;
 
 	/* If this is the lease we asked for, put it at the head of the
 	   list, and don't mess with the arp request timeout. */
 	if (lease->address.len == ip->client->requested_address.len &&
 	    !memcmp(lease->address.iabuf,
 	    ip->client->requested_address.iabuf,
 	    ip->client->requested_address.len)) {
 		lease->next = ip->client->offered_leases;
 		ip->client->offered_leases = lease;
 	} else {
 		/* If we already have an offer, and arping for this
 		   offer would take us past the selection timeout,
 		   then don't extend the timeout - just hope for the
 		   best. */
 		if (ip->client->offered_leases &&
 		    (cur_time + arp_timeout_needed) > stop_selecting)
 			arp_timeout_needed = 0;
 
 		/* Put the lease at the end of the list. */
 		lease->next = NULL;
 		if (!ip->client->offered_leases)
 			ip->client->offered_leases = lease;
 		else {
 			for (lp = ip->client->offered_leases; lp->next;
 			    lp = lp->next)
 				;	/* nothing */
 			lp->next = lease;
 		}
 	}
 
 	/* If we're supposed to stop selecting before we've had time
 	   to wait for the ARPREPLY, add some delay to wait for
 	   the ARPREPLY. */
 	if (stop_selecting - cur_time < arp_timeout_needed)
 		stop_selecting = cur_time + arp_timeout_needed;
 
 	/* If the selecting interval has expired, go immediately to
 	   state_selecting().  Otherwise, time out into
 	   state_selecting at the select interval. */
 	if (stop_selecting <= 0)
 		state_selecting(ip);
 	else {
 		add_timeout(stop_selecting, state_selecting, ip);
 		cancel_timeout(send_discover, ip);
 	}
 }
 
 /* Allocate a client_lease structure and initialize it from the parameters
    in the specified packet. */
 
 struct client_lease *
 packet_to_lease(struct packet *packet)
 {
 	struct client_lease *lease;
 	int i;
 
 	lease = malloc(sizeof(struct client_lease));
 
 	if (!lease) {
 		warning("dhcpoffer: no memory to record lease.");
 		return (NULL);
 	}
 
 	memset(lease, 0, sizeof(*lease));
 
 	/* Copy the lease options. */
 	for (i = 0; i < 256; i++) {
 		if (packet->options[i].len) {
 			lease->options[i].data =
 			    malloc(packet->options[i].len + 1);
 			if (!lease->options[i].data) {
 				warning("dhcpoffer: no memory for option %d", i);
 				free_client_lease(lease);
 				return (NULL);
 			} else {
 				memcpy(lease->options[i].data,
 				    packet->options[i].data,
 				    packet->options[i].len);
 				lease->options[i].len =
 				    packet->options[i].len;
 				lease->options[i].data[lease->options[i].len] =
 				    0;
 			}
 			if (!check_option(lease,i)) {
 				/* ignore a bogus lease offer */
 				warning("Invalid lease option - ignoring offer");
 				free_client_lease(lease);
 				return (NULL);
 			}
 		}
 	}
 
 	lease->address.len = sizeof(packet->raw->yiaddr);
 	memcpy(lease->address.iabuf, &packet->raw->yiaddr, lease->address.len);
 
 	lease->nextserver.len = sizeof(packet->raw->siaddr);
 	memcpy(lease->nextserver.iabuf, &packet->raw->siaddr, lease->nextserver.len);
 
 	/* If the server name was filled out, copy it.
 	   Do not attempt to validate the server name as a host name.
 	   RFC 2131 merely states that sname is NUL-terminated (which we
 	   do not assume) and that it is the server's host name.  Since
 	   the ISC client and server allow arbitrary characters, we do
 	   as well. */
 	if ((!packet->options[DHO_DHCP_OPTION_OVERLOAD].len ||
 	    !(packet->options[DHO_DHCP_OPTION_OVERLOAD].data[0] & 2)) &&
 	    packet->raw->sname[0]) {
 		lease->server_name = malloc(DHCP_SNAME_LEN + 1);
 		if (!lease->server_name) {
 			warning("dhcpoffer: no memory for server name.");
 			free_client_lease(lease);
 			return (NULL);
 		}
 		memcpy(lease->server_name, packet->raw->sname, DHCP_SNAME_LEN);
 		lease->server_name[DHCP_SNAME_LEN]='\0';
 	}
 
 	/* Ditto for the filename. */
 	if ((!packet->options[DHO_DHCP_OPTION_OVERLOAD].len ||
 	    !(packet->options[DHO_DHCP_OPTION_OVERLOAD].data[0] & 1)) &&
 	    packet->raw->file[0]) {
 		/* Don't count on the NUL terminator. */
 		lease->filename = malloc(DHCP_FILE_LEN + 1);
 		if (!lease->filename) {
 			warning("dhcpoffer: no memory for filename.");
 			free_client_lease(lease);
 			return (NULL);
 		}
 		memcpy(lease->filename, packet->raw->file, DHCP_FILE_LEN);
 		lease->filename[DHCP_FILE_LEN]='\0';
 	}
 	return lease;
 }
 
 void
 dhcpnak(struct packet *packet)
 {
 	struct interface_info *ip = packet->interface;
 
 	/* If the NAK has an unrecognizable transaction id or hardware
 	   address, then just drop it. */
 	if (packet->interface->client->xid != packet->raw->xid ||
 	    (packet->interface->hw_address.hlen != packet->raw->hlen) ||
 	    (memcmp(packet->interface->hw_address.haddr,
 	    packet->raw->chaddr, packet->raw->hlen)))
 		return;
 
 	if (ip->client->state != S_REBOOTING &&
 	    ip->client->state != S_REQUESTING &&
 	    ip->client->state != S_RENEWING &&
 	    ip->client->state != S_REBINDING)
 		return;
 
 	note("DHCPNAK from %s", piaddr(packet->client_addr));
 
 	if (!ip->client->active) {
 		note("DHCPNAK with no active lease.\n");
 		return;
 	}
 
 	free_client_lease(ip->client->active);
 	ip->client->active = NULL;
 
 	/* Stop sending DHCPREQUEST packets... */
 	cancel_timeout(send_request, ip);
 
 	ip->client->state = S_INIT;
 	state_init(ip);
 }
 
 /* Send out a DHCPDISCOVER packet, and set a timeout to send out another
    one after the right interval has expired.  If we don't get an offer by
    the time we reach the panic interval, call the panic function. */
 
 void
 send_discover(void *ipp)
 {
 	struct interface_info *ip = ipp;
 	int interval, increase = 1;
 
 	/* Figure out how long it's been since we started transmitting. */
 	interval = cur_time - ip->client->first_sending;
 
 	/* If we're past the panic timeout, call the script and tell it
 	   we haven't found anything for this interface yet. */
 	if (interval > ip->client->config->timeout) {
 		state_panic(ip);
 		return;
 	}
 
 	/* If we're selecting media, try the whole list before doing
 	   the exponential backoff, but if we've already received an
 	   offer, stop looping, because we obviously have it right. */
 	if (!ip->client->offered_leases &&
 	    ip->client->config->media) {
 		int fail = 0;
 again:
 		if (ip->client->medium) {
 			ip->client->medium = ip->client->medium->next;
 			increase = 0;
 		}
 		if (!ip->client->medium) {
 			if (fail)
 				error("No valid media types for %s!", ip->name);
 			ip->client->medium = ip->client->config->media;
 			increase = 1;
 		}
 
 		note("Trying medium \"%s\" %d", ip->client->medium->string,
 		    increase);
 		script_init("MEDIUM", ip->client->medium);
 		if (script_go())
 			goto again;
 	}
 
 	/*
 	 * If we're supposed to increase the interval, do so.  If it's
 	 * currently zero (i.e., we haven't sent any packets yet), set
 	 * it to one; otherwise, add to it a random number between zero
 	 * and two times itself.  On average, this means that it will
 	 * double with every transmission.
 	 */
 	if (increase) {
 		if (!ip->client->interval)
 			ip->client->interval =
 			    ip->client->config->initial_interval;
 		else {
 			ip->client->interval += (arc4random() >> 2) %
 			    (2 * ip->client->interval);
 		}
 
 		/* Don't back off past the cutoff. */
 		if (ip->client->interval >
 		    ip->client->config->backoff_cutoff)
 			ip->client->interval =
 				((ip->client->config->backoff_cutoff / 2)
 				 + ((arc4random() >> 2) %
 				    ip->client->config->backoff_cutoff));
 	} else if (!ip->client->interval)
 		ip->client->interval =
 			ip->client->config->initial_interval;
 
 	/* If the backoff would take us to the panic timeout, just use that
 	   as the interval. */
 	if (cur_time + ip->client->interval >
 	    ip->client->first_sending + ip->client->config->timeout)
 		ip->client->interval =
 			(ip->client->first_sending +
 			 ip->client->config->timeout) - cur_time + 1;
 
 	/* Record the number of seconds since we started sending. */
 	if (interval < 65536)
 		ip->client->packet.secs = htons(interval);
 	else
 		ip->client->packet.secs = htons(65535);
 	ip->client->secs = ip->client->packet.secs;
 
 	note("DHCPDISCOVER on %s to %s port %d interval %d",
 	    ip->name, inet_ntoa(inaddr_broadcast), REMOTE_PORT,
 	    (int)ip->client->interval);
 
 	/* Send out a packet. */
 	send_packet_unpriv(privfd, &ip->client->packet,
 	    ip->client->packet_length, inaddr_any, inaddr_broadcast);
 
 	add_timeout(cur_time + ip->client->interval, send_discover, ip);
 }
 
 /*
  * state_panic gets called if we haven't received any offers in a preset
  * amount of time.  When this happens, we try to use existing leases
  * that haven't yet expired, and failing that, we call the client script
  * and hope it can do something.
  */
 void
 state_panic(void *ipp)
 {
 	struct interface_info *ip = ipp;
 	struct client_lease *loop = ip->client->active;
 	struct client_lease *lp;
 
 	note("No DHCPOFFERS received.");
 
 	/* We may not have an active lease, but we may have some
 	   predefined leases that we can try. */
 	if (!ip->client->active && ip->client->leases)
 		goto activate_next;
 
 	/* Run through the list of leases and see if one can be used. */
 	while (ip->client->active) {
 		if (ip->client->active->expiry > cur_time) {
 			note("Trying recorded lease %s",
 			    piaddr(ip->client->active->address));
 			/* Run the client script with the existing
 			   parameters. */
 			script_init("TIMEOUT",
 			    ip->client->active->medium);
 			script_write_params("new_", ip->client->active);
 			if (ip->client->alias)
 				script_write_params("alias_",
 				    ip->client->alias);
 
 			/* If the old lease is still good and doesn't
 			   yet need renewal, go into BOUND state and
 			   timeout at the renewal time. */
 			if (!script_go()) {
 				if (cur_time <
 				    ip->client->active->renewal) {
 					ip->client->state = S_BOUND;
 					note("bound: renewal in %d seconds.",
 					    (int)(ip->client->active->renewal -
 					    cur_time));
 					add_timeout(
 					    ip->client->active->renewal,
 					    state_bound, ip);
 				} else {
 					ip->client->state = S_BOUND;
 					note("bound: immediate renewal.");
 					state_bound(ip);
 				}
 				reinitialize_interfaces();
 				go_daemon();
 				return;
 			}
 		}
 
 		/* If there are no other leases, give up. */
 		if (!ip->client->leases) {
 			ip->client->leases = ip->client->active;
 			ip->client->active = NULL;
 			break;
 		}
 
 activate_next:
 		/* Otherwise, put the active lease at the end of the
 		   lease list, and try another lease. */
 		for (lp = ip->client->leases; lp->next; lp = lp->next)
 			;
 		lp->next = ip->client->active;
 		if (lp->next)
 			lp->next->next = NULL;
 		ip->client->active = ip->client->leases;
 		ip->client->leases = ip->client->leases->next;
 
 		/* If we already tried this lease, we've exhausted the
 		   set of leases, so we might as well give up for
 		   now. */
 		if (ip->client->active == loop)
 			break;
 		else if (!loop)
 			loop = ip->client->active;
 	}
 
 	/* No leases were available, or what was available didn't work, so
 	   tell the shell script that we failed to allocate an address,
 	   and try again later. */
 	note("No working leases in persistent database - sleeping.\n");
 	script_init("FAIL", NULL);
 	if (ip->client->alias)
 		script_write_params("alias_", ip->client->alias);
 	script_go();
 	ip->client->state = S_INIT;
 	add_timeout(cur_time + ip->client->config->retry_interval, state_init,
 	    ip);
 	go_daemon();
 }
 
 void
 send_request(void *ipp)
 {
 	struct interface_info *ip = ipp;
 	struct in_addr from, to;
 	int interval;
 
 	/* Figure out how long it's been since we started transmitting. */
 	interval = cur_time - ip->client->first_sending;
 
 	/* If we're in the INIT-REBOOT or REQUESTING state and we're
 	   past the reboot timeout, go to INIT and see if we can
 	   DISCOVER an address... */
 	/* XXX In the INIT-REBOOT state, if we don't get an ACK, it
 	   means either that we're on a network with no DHCP server,
 	   or that our server is down.  In the latter case, assuming
 	   that there is a backup DHCP server, DHCPDISCOVER will get
 	   us a new address, but we could also have successfully
 	   reused our old address.  In the former case, we're hosed
 	   anyway.  This is not a win-prone situation. */
 	if ((ip->client->state == S_REBOOTING ||
 	    ip->client->state == S_REQUESTING) &&
 	    interval > ip->client->config->reboot_timeout) {
 cancel:
 		ip->client->state = S_INIT;
 		cancel_timeout(send_request, ip);
 		state_init(ip);
 		return;
 	}
 
 	/* If we're in the reboot state, make sure the media is set up
 	   correctly. */
 	if (ip->client->state == S_REBOOTING &&
 	    !ip->client->medium &&
 	    ip->client->active->medium ) {
 		script_init("MEDIUM", ip->client->active->medium);
 
 		/* If the medium we chose won't fly, go to INIT state. */
 		if (script_go())
 			goto cancel;
 
 		/* Record the medium. */
 		ip->client->medium = ip->client->active->medium;
 	}
 
 	/* If the lease has expired, relinquish the address and go back
 	   to the INIT state. */
 	if (ip->client->state != S_REQUESTING &&
 	    cur_time > ip->client->active->expiry) {
 		/* Run the client script with the old lease parameters. */
 		script_init("EXPIRE", NULL);
 		script_write_params("old_", ip->client->active);
 		if (ip->client->alias)
 			script_write_params("alias_", ip->client->alias);
 		script_go();
 
 		/* Now do a preinit on the interface so that we can
 		   discover a new address. */
 		script_init("PREINIT", NULL);
 		if (ip->client->alias)
 			script_write_params("alias_", ip->client->alias);
 		script_go();
 
 		ip->client->state = S_INIT;
 		state_init(ip);
 		return;
 	}
 
 	/* Do the exponential backoff... */
 	if (!ip->client->interval)
 		ip->client->interval = ip->client->config->initial_interval;
 	else
 		ip->client->interval += ((arc4random() >> 2) %
 		    (2 * ip->client->interval));
 
 	/* Don't back off past the cutoff. */
 	if (ip->client->interval >
 	    ip->client->config->backoff_cutoff)
 		ip->client->interval =
 		    ((ip->client->config->backoff_cutoff / 2) +
 		    ((arc4random() >> 2) % ip->client->interval));
 
 	/* If the backoff would take us to the expiry time, just set the
 	   timeout to the expiry time. */
 	if (ip->client->state != S_REQUESTING &&
 	    cur_time + ip->client->interval >
 	    ip->client->active->expiry)
 		ip->client->interval =
 		    ip->client->active->expiry - cur_time + 1;
 
 	/* If the lease T2 time has elapsed, or if we're not yet bound,
 	   broadcast the DHCPREQUEST rather than unicasting. */
 	if (ip->client->state == S_REQUESTING ||
 	    ip->client->state == S_REBOOTING ||
 	    cur_time > ip->client->active->rebind)
 		to.s_addr = INADDR_BROADCAST;
 	else
 		memcpy(&to.s_addr, ip->client->destination.iabuf,
 		    sizeof(to.s_addr));
 
 	if (ip->client->state != S_REQUESTING)
 		memcpy(&from, ip->client->active->address.iabuf,
 		    sizeof(from));
 	else
 		from.s_addr = INADDR_ANY;
 
 	/* Record the number of seconds since we started sending. */
 	if (ip->client->state == S_REQUESTING)
 		ip->client->packet.secs = ip->client->secs;
 	else {
 		if (interval < 65536)
 			ip->client->packet.secs = htons(interval);
 		else
 			ip->client->packet.secs = htons(65535);
 	}
 
 	note("DHCPREQUEST on %s to %s port %d", ip->name, inet_ntoa(to),
 	    REMOTE_PORT);
 
 	/* Send out a packet. */
 	send_packet_unpriv(privfd, &ip->client->packet,
 	    ip->client->packet_length, from, to);
 
 	add_timeout(cur_time + ip->client->interval, send_request, ip);
 }
 
 void
 send_decline(void *ipp)
 {
 	struct interface_info *ip = ipp;
 
 	note("DHCPDECLINE on %s to %s port %d", ip->name,
 	    inet_ntoa(inaddr_broadcast), REMOTE_PORT);
 
 	/* Send out a packet. */
 	send_packet_unpriv(privfd, &ip->client->packet,
 	    ip->client->packet_length, inaddr_any, inaddr_broadcast);
 }
 
 void
 make_discover(struct interface_info *ip, struct client_lease *lease)
 {
 	unsigned char discover = DHCPDISCOVER;
 	struct tree_cache *options[256];
 	struct tree_cache option_elements[256];
 	int i;
 
 	memset(option_elements, 0, sizeof(option_elements));
 	memset(options, 0, sizeof(options));
 	memset(&ip->client->packet, 0, sizeof(ip->client->packet));
 
 	/* Set DHCP_MESSAGE_TYPE to DHCPDISCOVER */
 	i = DHO_DHCP_MESSAGE_TYPE;
 	options[i] = &option_elements[i];
 	options[i]->value = &discover;
 	options[i]->len = sizeof(discover);
 	options[i]->buf_size = sizeof(discover);
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* Request the options we want */
 	i  = DHO_DHCP_PARAMETER_REQUEST_LIST;
 	options[i] = &option_elements[i];
 	options[i]->value = ip->client->config->requested_options;
 	options[i]->len = ip->client->config->requested_option_count;
 	options[i]->buf_size =
 		ip->client->config->requested_option_count;
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* If we had an address, try to get it again. */
 	if (lease) {
 		ip->client->requested_address = lease->address;
 		i = DHO_DHCP_REQUESTED_ADDRESS;
 		options[i] = &option_elements[i];
 		options[i]->value = lease->address.iabuf;
 		options[i]->len = lease->address.len;
 		options[i]->buf_size = lease->address.len;
 		options[i]->timeout = 0xFFFFFFFF;
 	} else
 		ip->client->requested_address.len = 0;
 
 	/* Send any options requested in the config file. */
 	for (i = 0; i < 256; i++)
 		if (!options[i] &&
 		    ip->client->config->send_options[i].data) {
 			options[i] = &option_elements[i];
 			options[i]->value =
 			    ip->client->config->send_options[i].data;
 			options[i]->len =
 			    ip->client->config->send_options[i].len;
 			options[i]->buf_size =
 			    ip->client->config->send_options[i].len;
 			options[i]->timeout = 0xFFFFFFFF;
 		}
 
 	/* send host name if not set via config file. */
 	if (!options[DHO_HOST_NAME]) {
 		if (hostname[0] != '\0') {
 			size_t len;
 			char* posDot = strchr(hostname, '.');
 			if (posDot != NULL)
 				len = posDot - hostname;
 			else
 				len = strlen(hostname);
 			options[DHO_HOST_NAME] = &option_elements[DHO_HOST_NAME];
 			options[DHO_HOST_NAME]->value = hostname;
 			options[DHO_HOST_NAME]->len = len;
 			options[DHO_HOST_NAME]->buf_size = len;
 			options[DHO_HOST_NAME]->timeout = 0xFFFFFFFF;
 		}
 	}
 
 	/* set unique client identifier */
 	char client_ident[sizeof(struct hardware)];
 	if (!options[DHO_DHCP_CLIENT_IDENTIFIER]) {
 		int hwlen = (ip->hw_address.hlen < sizeof(client_ident)-1) ?
 				ip->hw_address.hlen : sizeof(client_ident)-1;
 		client_ident[0] = ip->hw_address.htype;
 		memcpy(&client_ident[1], ip->hw_address.haddr, hwlen);
 		options[DHO_DHCP_CLIENT_IDENTIFIER] = &option_elements[DHO_DHCP_CLIENT_IDENTIFIER];
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->value = client_ident;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->len = hwlen+1;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->buf_size = hwlen+1;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->timeout = 0xFFFFFFFF;
 	}
 
 	/* Set up the option buffer... */
 	ip->client->packet_length = cons_options(NULL, &ip->client->packet, 0,
 	    options, 0, 0, 0, NULL, 0);
 	if (ip->client->packet_length < BOOTP_MIN_LEN)
 		ip->client->packet_length = BOOTP_MIN_LEN;
 
 	ip->client->packet.op = BOOTREQUEST;
 	ip->client->packet.htype = ip->hw_address.htype;
 	ip->client->packet.hlen = ip->hw_address.hlen;
 	ip->client->packet.hops = 0;
 	ip->client->packet.xid = arc4random();
 	ip->client->packet.secs = 0; /* filled in by send_discover. */
 	ip->client->packet.flags = 0;
 
 	memset(&(ip->client->packet.ciaddr),
 	    0, sizeof(ip->client->packet.ciaddr));
 	memset(&(ip->client->packet.yiaddr),
 	    0, sizeof(ip->client->packet.yiaddr));
 	memset(&(ip->client->packet.siaddr),
 	    0, sizeof(ip->client->packet.siaddr));
 	memset(&(ip->client->packet.giaddr),
 	    0, sizeof(ip->client->packet.giaddr));
 	memcpy(ip->client->packet.chaddr,
 	    ip->hw_address.haddr, ip->hw_address.hlen);
 }
 
 
 void
 make_request(struct interface_info *ip, struct client_lease * lease)
 {
 	unsigned char request = DHCPREQUEST;
 	struct tree_cache *options[256];
 	struct tree_cache option_elements[256];
 	int i;
 
 	memset(options, 0, sizeof(options));
 	memset(&ip->client->packet, 0, sizeof(ip->client->packet));
 
 	/* Set DHCP_MESSAGE_TYPE to DHCPREQUEST */
 	i = DHO_DHCP_MESSAGE_TYPE;
 	options[i] = &option_elements[i];
 	options[i]->value = &request;
 	options[i]->len = sizeof(request);
 	options[i]->buf_size = sizeof(request);
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* Request the options we want */
 	i = DHO_DHCP_PARAMETER_REQUEST_LIST;
 	options[i] = &option_elements[i];
 	options[i]->value = ip->client->config->requested_options;
 	options[i]->len = ip->client->config->requested_option_count;
 	options[i]->buf_size =
 		ip->client->config->requested_option_count;
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* If we are requesting an address that hasn't yet been assigned
 	   to us, use the DHCP Requested Address option. */
 	if (ip->client->state == S_REQUESTING) {
 		/* Send back the server identifier... */
 		i = DHO_DHCP_SERVER_IDENTIFIER;
 		options[i] = &option_elements[i];
 		options[i]->value = lease->options[i].data;
 		options[i]->len = lease->options[i].len;
 		options[i]->buf_size = lease->options[i].len;
 		options[i]->timeout = 0xFFFFFFFF;
 	}
 	if (ip->client->state == S_REQUESTING ||
 	    ip->client->state == S_REBOOTING) {
 		ip->client->requested_address = lease->address;
 		i = DHO_DHCP_REQUESTED_ADDRESS;
 		options[i] = &option_elements[i];
 		options[i]->value = lease->address.iabuf;
 		options[i]->len = lease->address.len;
 		options[i]->buf_size = lease->address.len;
 		options[i]->timeout = 0xFFFFFFFF;
 	} else
 		ip->client->requested_address.len = 0;
 
 	/* Send any options requested in the config file. */
 	for (i = 0; i < 256; i++)
 		if (!options[i] &&
 		    ip->client->config->send_options[i].data) {
 			options[i] = &option_elements[i];
 			options[i]->value =
 			    ip->client->config->send_options[i].data;
 			options[i]->len =
 			    ip->client->config->send_options[i].len;
 			options[i]->buf_size =
 			    ip->client->config->send_options[i].len;
 			options[i]->timeout = 0xFFFFFFFF;
 		}
 
 	/* send host name if not set via config file. */
 	if (!options[DHO_HOST_NAME]) {
 		if (hostname[0] != '\0') {
 			size_t len;
 			char* posDot = strchr(hostname, '.');
 			if (posDot != NULL)
 				len = posDot - hostname;
 			else
 				len = strlen(hostname);
 			options[DHO_HOST_NAME] = &option_elements[DHO_HOST_NAME];
 			options[DHO_HOST_NAME]->value = hostname;
 			options[DHO_HOST_NAME]->len = len;
 			options[DHO_HOST_NAME]->buf_size = len;
 			options[DHO_HOST_NAME]->timeout = 0xFFFFFFFF;
 		}
 	}
 
 	/* set unique client identifier */
 	char client_ident[sizeof(struct hardware)];
 	if (!options[DHO_DHCP_CLIENT_IDENTIFIER]) {
 		int hwlen = (ip->hw_address.hlen < sizeof(client_ident)-1) ?
 				ip->hw_address.hlen : sizeof(client_ident)-1;
 		client_ident[0] = ip->hw_address.htype;
 		memcpy(&client_ident[1], ip->hw_address.haddr, hwlen);
 		options[DHO_DHCP_CLIENT_IDENTIFIER] = &option_elements[DHO_DHCP_CLIENT_IDENTIFIER];
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->value = client_ident;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->len = hwlen+1;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->buf_size = hwlen+1;
 		options[DHO_DHCP_CLIENT_IDENTIFIER]->timeout = 0xFFFFFFFF;
 	}
 
 	/* Set up the option buffer... */
 	ip->client->packet_length = cons_options(NULL, &ip->client->packet, 0,
 	    options, 0, 0, 0, NULL, 0);
 	if (ip->client->packet_length < BOOTP_MIN_LEN)
 		ip->client->packet_length = BOOTP_MIN_LEN;
 
 	ip->client->packet.op = BOOTREQUEST;
 	ip->client->packet.htype = ip->hw_address.htype;
 	ip->client->packet.hlen = ip->hw_address.hlen;
 	ip->client->packet.hops = 0;
 	ip->client->packet.xid = ip->client->xid;
 	ip->client->packet.secs = 0; /* Filled in by send_request. */
 
 	/* If we own the address we're requesting, put it in ciaddr;
 	   otherwise set ciaddr to zero. */
 	if (ip->client->state == S_BOUND ||
 	    ip->client->state == S_RENEWING ||
 	    ip->client->state == S_REBINDING) {
 		memcpy(&ip->client->packet.ciaddr,
 		    lease->address.iabuf, lease->address.len);
 		ip->client->packet.flags = 0;
 	} else {
 		memset(&ip->client->packet.ciaddr, 0,
 		    sizeof(ip->client->packet.ciaddr));
 		ip->client->packet.flags = 0;
 	}
 
 	memset(&ip->client->packet.yiaddr, 0,
 	    sizeof(ip->client->packet.yiaddr));
 	memset(&ip->client->packet.siaddr, 0,
 	    sizeof(ip->client->packet.siaddr));
 	memset(&ip->client->packet.giaddr, 0,
 	    sizeof(ip->client->packet.giaddr));
 	memcpy(ip->client->packet.chaddr,
 	    ip->hw_address.haddr, ip->hw_address.hlen);
 }
 
 void
 make_decline(struct interface_info *ip, struct client_lease *lease)
 {
 	struct tree_cache *options[256], message_type_tree;
 	struct tree_cache requested_address_tree;
 	struct tree_cache server_id_tree, client_id_tree;
 	unsigned char decline = DHCPDECLINE;
 	int i;
 
 	memset(options, 0, sizeof(options));
 	memset(&ip->client->packet, 0, sizeof(ip->client->packet));
 
 	/* Set DHCP_MESSAGE_TYPE to DHCPDECLINE */
 	i = DHO_DHCP_MESSAGE_TYPE;
 	options[i] = &message_type_tree;
 	options[i]->value = &decline;
 	options[i]->len = sizeof(decline);
 	options[i]->buf_size = sizeof(decline);
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* Send back the server identifier... */
 	i = DHO_DHCP_SERVER_IDENTIFIER;
 	options[i] = &server_id_tree;
 	options[i]->value = lease->options[i].data;
 	options[i]->len = lease->options[i].len;
 	options[i]->buf_size = lease->options[i].len;
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* Send back the address we're declining. */
 	i = DHO_DHCP_REQUESTED_ADDRESS;
 	options[i] = &requested_address_tree;
 	options[i]->value = lease->address.iabuf;
 	options[i]->len = lease->address.len;
 	options[i]->buf_size = lease->address.len;
 	options[i]->timeout = 0xFFFFFFFF;
 
 	/* Send the uid if the user supplied one. */
 	i = DHO_DHCP_CLIENT_IDENTIFIER;
 	if (ip->client->config->send_options[i].len) {
 		options[i] = &client_id_tree;
 		options[i]->value = ip->client->config->send_options[i].data;
 		options[i]->len = ip->client->config->send_options[i].len;
 		options[i]->buf_size = ip->client->config->send_options[i].len;
 		options[i]->timeout = 0xFFFFFFFF;
 	}
 
 
 	/* Set up the option buffer... */
 	ip->client->packet_length = cons_options(NULL, &ip->client->packet, 0,
 	    options, 0, 0, 0, NULL, 0);
 	if (ip->client->packet_length < BOOTP_MIN_LEN)
 		ip->client->packet_length = BOOTP_MIN_LEN;
 
 	ip->client->packet.op = BOOTREQUEST;
 	ip->client->packet.htype = ip->hw_address.htype;
 	ip->client->packet.hlen = ip->hw_address.hlen;
 	ip->client->packet.hops = 0;
 	ip->client->packet.xid = ip->client->xid;
 	ip->client->packet.secs = 0; /* Filled in by send_request. */
 	ip->client->packet.flags = 0;
 
 	/* ciaddr must always be zero. */
 	memset(&ip->client->packet.ciaddr, 0,
 	    sizeof(ip->client->packet.ciaddr));
 	memset(&ip->client->packet.yiaddr, 0,
 	    sizeof(ip->client->packet.yiaddr));
 	memset(&ip->client->packet.siaddr, 0,
 	    sizeof(ip->client->packet.siaddr));
 	memset(&ip->client->packet.giaddr, 0,
 	    sizeof(ip->client->packet.giaddr));
 	memcpy(ip->client->packet.chaddr,
 	    ip->hw_address.haddr, ip->hw_address.hlen);
 }
 
 void
 free_client_lease(struct client_lease *lease)
 {
 	int i;
 
 	if (lease->server_name)
 		free(lease->server_name);
 	if (lease->filename)
 		free(lease->filename);
 	for (i = 0; i < 256; i++) {
 		if (lease->options[i].len)
 			free(lease->options[i].data);
 	}
 	free(lease);
 }
 
 FILE *leaseFile;
 
 void
 rewrite_client_leases(void)
 {
 	struct client_lease *lp;
 	cap_rights_t rights;
 
 	if (!leaseFile) {
 		leaseFile = fopen(path_dhclient_db, "w");
 		if (!leaseFile)
 			error("can't create %s: %m", path_dhclient_db);
 		cap_rights_init(&rights, CAP_FCNTL, CAP_FSTAT, CAP_FSYNC,
 		    CAP_FTRUNCATE, CAP_SEEK, CAP_WRITE);
 		if (cap_rights_limit(fileno(leaseFile), &rights) < 0 &&
 		    errno != ENOSYS) {
 			error("can't limit lease descriptor: %m");
 		}
 		if (cap_fcntls_limit(fileno(leaseFile), CAP_FCNTL_GETFL) < 0 &&
 		    errno != ENOSYS) {
 			error("can't limit lease descriptor fcntls: %m");
 		}
 	} else {
 		fflush(leaseFile);
 		rewind(leaseFile);
 	}
 
 	for (lp = ifi->client->leases; lp; lp = lp->next)
 		write_client_lease(ifi, lp, 1);
 	if (ifi->client->active)
 		write_client_lease(ifi, ifi->client->active, 1);
 
 	fflush(leaseFile);
 	ftruncate(fileno(leaseFile), ftello(leaseFile));
 	fsync(fileno(leaseFile));
 }
 
 void
 write_client_lease(struct interface_info *ip, struct client_lease *lease,
     int rewrite)
 {
 	static int leases_written;
 	struct tm *t;
 	int i;
 
 	if (!rewrite) {
 		if (leases_written++ > 20) {
 			rewrite_client_leases();
 			leases_written = 0;
 		}
 	}
 
 	/* If the lease came from the config file, we don't need to stash
 	   a copy in the lease database. */
 	if (lease->is_static)
 		return;
 
 	if (!leaseFile) {	/* XXX */
 		leaseFile = fopen(path_dhclient_db, "w");
 		if (!leaseFile)
 			error("can't create %s: %m", path_dhclient_db);
 	}
 
 	fprintf(leaseFile, "lease {\n");
 	if (lease->is_bootp)
 		fprintf(leaseFile, "  bootp;\n");
 	fprintf(leaseFile, "  interface \"%s\";\n", ip->name);
 	fprintf(leaseFile, "  fixed-address %s;\n", piaddr(lease->address));
 	if (lease->nextserver.len == sizeof(inaddr_any) &&
 	    0 != memcmp(lease->nextserver.iabuf, &inaddr_any,
 	    sizeof(inaddr_any)))
 		fprintf(leaseFile, "  next-server %s;\n",
 		    piaddr(lease->nextserver));
 	if (lease->filename)
 		fprintf(leaseFile, "  filename \"%s\";\n", lease->filename);
 	if (lease->server_name)
 		fprintf(leaseFile, "  server-name \"%s\";\n",
 		    lease->server_name);
 	if (lease->medium)
 		fprintf(leaseFile, "  medium \"%s\";\n", lease->medium->string);
 	for (i = 0; i < 256; i++)
 		if (lease->options[i].len)
 			fprintf(leaseFile, "  option %s %s;\n",
 			    dhcp_options[i].name,
 			    pretty_print_option(i, lease->options[i].data,
 			    lease->options[i].len, 1, 1));
 
 	t = gmtime(&lease->renewal);
 	fprintf(leaseFile, "  renew %d %d/%d/%d %02d:%02d:%02d;\n",
 	    t->tm_wday, t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
 	    t->tm_hour, t->tm_min, t->tm_sec);
 	t = gmtime(&lease->rebind);
 	fprintf(leaseFile, "  rebind %d %d/%d/%d %02d:%02d:%02d;\n",
 	    t->tm_wday, t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
 	    t->tm_hour, t->tm_min, t->tm_sec);
 	t = gmtime(&lease->expiry);
 	fprintf(leaseFile, "  expire %d %d/%d/%d %02d:%02d:%02d;\n",
 	    t->tm_wday, t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
 	    t->tm_hour, t->tm_min, t->tm_sec);
 	fprintf(leaseFile, "}\n");
 	fflush(leaseFile);
 }
 
 void
 script_init(char *reason, struct string_list *medium)
 {
 	size_t		 len, mediumlen = 0;
 	struct imsg_hdr	 hdr;
 	struct buf	*buf;
 	int		 errs;
 
 	if (medium != NULL && medium->string != NULL)
 		mediumlen = strlen(medium->string);
 
 	hdr.code = IMSG_SCRIPT_INIT;
 	hdr.len = sizeof(struct imsg_hdr) +
 	    sizeof(size_t) + mediumlen +
 	    sizeof(size_t) + strlen(reason);
 
 	if ((buf = buf_open(hdr.len)) == NULL)
 		error("buf_open: %m");
 
 	errs = 0;
 	errs += buf_add(buf, &hdr, sizeof(hdr));
 	errs += buf_add(buf, &mediumlen, sizeof(mediumlen));
 	if (mediumlen > 0)
 		errs += buf_add(buf, medium->string, mediumlen);
 	len = strlen(reason);
 	errs += buf_add(buf, &len, sizeof(len));
 	errs += buf_add(buf, reason, len);
 
 	if (errs)
 		error("buf_add: %m");
 
 	if (buf_close(privfd, buf) == -1)
 		error("buf_close: %m");
 }
 
 void
 priv_script_init(char *reason, char *medium)
 {
 	struct interface_info *ip = ifi;
 
 	if (ip) {
 		ip->client->scriptEnvsize = 100;
 		if (ip->client->scriptEnv == NULL)
 			ip->client->scriptEnv =
 			    malloc(ip->client->scriptEnvsize * sizeof(char *));
 		if (ip->client->scriptEnv == NULL)
 			error("script_init: no memory for environment");
 
 		ip->client->scriptEnv[0] = strdup(CLIENT_PATH);
 		if (ip->client->scriptEnv[0] == NULL)
 			error("script_init: no memory for environment");
 
 		ip->client->scriptEnv[1] = NULL;
 
 		script_set_env(ip->client, "", "interface", ip->name);
 
 		if (medium)
 			script_set_env(ip->client, "", "medium", medium);
 
 		script_set_env(ip->client, "", "reason", reason);
 	}
 }
 
 void
 priv_script_write_params(char *prefix, struct client_lease *lease)
 {
 	struct interface_info *ip = ifi;
 	u_int8_t dbuf[1500], *dp = NULL;
 	int i, len;
 	char tbuf[128];
 
 	script_set_env(ip->client, prefix, "ip_address",
 	    piaddr(lease->address));
 
 	if (ip->client->config->default_actions[DHO_SUBNET_MASK] ==
 	    ACTION_SUPERSEDE) {
 		dp = ip->client->config->defaults[DHO_SUBNET_MASK].data;
 		len = ip->client->config->defaults[DHO_SUBNET_MASK].len;
 	} else {
 		dp = lease->options[DHO_SUBNET_MASK].data;
 		len = lease->options[DHO_SUBNET_MASK].len;
 	}
 	if (len && (len < sizeof(lease->address.iabuf))) {
 		struct iaddr netmask, subnet, broadcast;
 
 		memcpy(netmask.iabuf, dp, len);
 		netmask.len = len;
 		subnet = subnet_number(lease->address, netmask);
 		if (subnet.len) {
 			script_set_env(ip->client, prefix, "network_number",
 			    piaddr(subnet));
 			if (!lease->options[DHO_BROADCAST_ADDRESS].len) {
 				broadcast = broadcast_addr(subnet, netmask);
 				if (broadcast.len)
 					script_set_env(ip->client, prefix,
 					    "broadcast_address",
 					    piaddr(broadcast));
 			}
 		}
 	}
 
 	if (lease->filename)
 		script_set_env(ip->client, prefix, "filename", lease->filename);
 	if (lease->server_name)
 		script_set_env(ip->client, prefix, "server_name",
 		    lease->server_name);
 	for (i = 0; i < 256; i++) {
 		len = 0;
 
 		if (ip->client->config->defaults[i].len) {
 			if (lease->options[i].len) {
 				switch (
 				    ip->client->config->default_actions[i]) {
 				case ACTION_DEFAULT:
 					dp = lease->options[i].data;
 					len = lease->options[i].len;
 					break;
 				case ACTION_SUPERSEDE:
 supersede:
 					dp = ip->client->
 						config->defaults[i].data;
 					len = ip->client->
 						config->defaults[i].len;
 					break;
 				case ACTION_PREPEND:
 					len = ip->client->
 					    config->defaults[i].len +
 					    lease->options[i].len;
 					if (len >= sizeof(dbuf)) {
 						warning("no space to %s %s",
 						    "prepend option",
 						    dhcp_options[i].name);
 						goto supersede;
 					}
 					dp = dbuf;
 					memcpy(dp,
 						ip->client->
 						config->defaults[i].data,
 						ip->client->
 						config->defaults[i].len);
 					memcpy(dp + ip->client->
 						config->defaults[i].len,
 						lease->options[i].data,
 						lease->options[i].len);
 					dp[len] = '\0';
 					break;
 				case ACTION_APPEND:
 					/*
 					 * When we append, we assume that we're
 					 * appending to text.  Some MS servers
 					 * include a NUL byte at the end of
 					 * the search string provided.
 					 */
 					len = ip->client->
 					    config->defaults[i].len +
 					    lease->options[i].len;
 					if (len >= sizeof(dbuf)) {
 						warning("no space to %s %s",
 						    "append option",
 						    dhcp_options[i].name);
 						goto supersede;
 					}
 					memcpy(dbuf,
 						lease->options[i].data,
 						lease->options[i].len);
 					for (dp = dbuf + lease->options[i].len;
 					    dp > dbuf; dp--, len--)
 						if (dp[-1] != '\0')
 							break;
 					memcpy(dp,
 						ip->client->
 						config->defaults[i].data,
 						ip->client->
 						config->defaults[i].len);
 					dp = dbuf;
 					dp[len] = '\0';
 				}
 			} else {
 				dp = ip->client->
 					config->defaults[i].data;
 				len = ip->client->
 					config->defaults[i].len;
 			}
 		} else if (lease->options[i].len) {
 			len = lease->options[i].len;
 			dp = lease->options[i].data;
 		} else {
 			len = 0;
 		}
 		if (len) {
 			char name[256];
 
 			if (dhcp_option_ev_name(name, sizeof(name),
 			    &dhcp_options[i]))
 				script_set_env(ip->client, prefix, name,
 				    pretty_print_option(i, dp, len, 0, 0));
 		}
 	}
 	snprintf(tbuf, sizeof(tbuf), "%d", (int)lease->expiry);
 	script_set_env(ip->client, prefix, "expiry", tbuf);
 }
 
 void
 script_write_params(char *prefix, struct client_lease *lease)
 {
 	size_t		 fn_len = 0, sn_len = 0, pr_len = 0;
 	struct imsg_hdr	 hdr;
 	struct buf	*buf;
 	int		 errs, i;
 
 	if (lease->filename != NULL)
 		fn_len = strlen(lease->filename);
 	if (lease->server_name != NULL)
 		sn_len = strlen(lease->server_name);
 	if (prefix != NULL)
 		pr_len = strlen(prefix);
 
 	hdr.code = IMSG_SCRIPT_WRITE_PARAMS;
 	hdr.len = sizeof(hdr) + sizeof(struct client_lease) +
 	    sizeof(size_t) + fn_len + sizeof(size_t) + sn_len +
 	    sizeof(size_t) + pr_len;
 
 	for (i = 0; i < 256; i++)
 		hdr.len += sizeof(int) + lease->options[i].len;
 
 	scripttime = time(NULL);
 
 	if ((buf = buf_open(hdr.len)) == NULL)
 		error("buf_open: %m");
 
 	errs = 0;
 	errs += buf_add(buf, &hdr, sizeof(hdr));
 	errs += buf_add(buf, lease, sizeof(struct client_lease));
 	errs += buf_add(buf, &fn_len, sizeof(fn_len));
 	errs += buf_add(buf, lease->filename, fn_len);
 	errs += buf_add(buf, &sn_len, sizeof(sn_len));
 	errs += buf_add(buf, lease->server_name, sn_len);
 	errs += buf_add(buf, &pr_len, sizeof(pr_len));
 	errs += buf_add(buf, prefix, pr_len);
 
 	for (i = 0; i < 256; i++) {
 		errs += buf_add(buf, &lease->options[i].len,
 		    sizeof(lease->options[i].len));
 		errs += buf_add(buf, lease->options[i].data,
 		    lease->options[i].len);
 	}
 
 	if (errs)
 		error("buf_add: %m");
 
 	if (buf_close(privfd, buf) == -1)
 		error("buf_close: %m");
 }
 
 int
 script_go(void)
 {
 	struct imsg_hdr	 hdr;
 	struct buf	*buf;
 	int		 ret;
 
 	hdr.code = IMSG_SCRIPT_GO;
 	hdr.len = sizeof(struct imsg_hdr);
 
 	if ((buf = buf_open(hdr.len)) == NULL)
 		error("buf_open: %m");
 
 	if (buf_add(buf, &hdr, sizeof(hdr)))
 		error("buf_add: %m");
 
 	if (buf_close(privfd, buf) == -1)
 		error("buf_close: %m");
 
 	bzero(&hdr, sizeof(hdr));
 	buf_read(privfd, &hdr, sizeof(hdr));
 	if (hdr.code != IMSG_SCRIPT_GO_RET)
 		error("unexpected msg type %u", hdr.code);
 	if (hdr.len != sizeof(hdr) + sizeof(int))
 		error("received corrupted message");
 	buf_read(privfd, &ret, sizeof(ret));
 
 	scripttime = time(NULL);
 
 	return (ret);
 }
 
 int
 priv_script_go(void)
 {
 	char *scriptName, *argv[2], **envp, *epp[3], reason[] = "REASON=NBI";
 	static char client_path[] = CLIENT_PATH;
 	struct interface_info *ip = ifi;
 	int pid, wpid, wstatus;
 
 	scripttime = time(NULL);
 
 	if (ip) {
 		scriptName = ip->client->config->script_name;
 		envp = ip->client->scriptEnv;
 	} else {
 		scriptName = top_level_config.script_name;
 		epp[0] = reason;
 		epp[1] = client_path;
 		epp[2] = NULL;
 		envp = epp;
 	}
 
 	argv[0] = scriptName;
 	argv[1] = NULL;
 
 	pid = fork();
 	if (pid < 0) {
 		error("fork: %m");
 		wstatus = 0;
 	} else if (pid) {
 		do {
 			wpid = wait(&wstatus);
 		} while (wpid != pid && wpid > 0);
 		if (wpid < 0) {
 			error("wait: %m");
 			wstatus = 0;
 		}
 	} else {
 		execve(scriptName, argv, envp);
 		error("execve (%s, ...): %m", scriptName);
 	}
 
 	if (ip)
 		script_flush_env(ip->client);
 
 	return (wstatus & 0xff);
 }
 
 void
 script_set_env(struct client_state *client, const char *prefix,
     const char *name, const char *value)
 {
 	int i, j, namelen;
 
+	/* No `` or $() command substitution allowed in environment values! */
+	for (j=0; j < strlen(value); j++)
+		switch (value[j]) {
+		case '`':
+		case '$':
+			warning("illegal character (%c) in value '%s'",
+			    value[j], value);
+			/* Ignore this option */
+			return;
+		}
+
 	namelen = strlen(name);
 
 	for (i = 0; client->scriptEnv[i]; i++)
 		if (strncmp(client->scriptEnv[i], name, namelen) == 0 &&
 		    client->scriptEnv[i][namelen] == '=')
 			break;
 
 	if (client->scriptEnv[i])
 		/* Reuse the slot. */
 		free(client->scriptEnv[i]);
 	else {
 		/* New variable.  Expand if necessary. */
 		if (i >= client->scriptEnvsize - 1) {
 			char **newscriptEnv;
 			int newscriptEnvsize = client->scriptEnvsize + 50;
 
 			newscriptEnv = realloc(client->scriptEnv,
 			    newscriptEnvsize);
 			if (newscriptEnv == NULL) {
 				free(client->scriptEnv);
 				client->scriptEnv = NULL;
 				client->scriptEnvsize = 0;
 				error("script_set_env: no memory for variable");
 			}
 			client->scriptEnv = newscriptEnv;
 			client->scriptEnvsize = newscriptEnvsize;
 		}
 		/* need to set the NULL pointer at end of array beyond
 		   the new slot. */
 		client->scriptEnv[i + 1] = NULL;
 	}
 	/* Allocate space and format the variable in the appropriate slot. */
 	client->scriptEnv[i] = malloc(strlen(prefix) + strlen(name) + 1 +
 	    strlen(value) + 1);
 	if (client->scriptEnv[i] == NULL)
 		error("script_set_env: no memory for variable assignment");
-
-	/* No `` or $() command substitution allowed in environment values! */
-	for (j=0; j < strlen(value); j++)
-		switch (value[j]) {
-		case '`':
-		case '$':
-			error("illegal character (%c) in value '%s'", value[j],
-			    value);
-			/* not reached */
-		}
 	snprintf(client->scriptEnv[i], strlen(prefix) + strlen(name) +
 	    1 + strlen(value) + 1, "%s%s=%s", prefix, name, value);
 }
 
 void
 script_flush_env(struct client_state *client)
 {
 	int i;
 
 	for (i = 0; client->scriptEnv[i]; i++) {
 		free(client->scriptEnv[i]);
 		client->scriptEnv[i] = NULL;
 	}
 	client->scriptEnvsize = 0;
 }
 
 int
 dhcp_option_ev_name(char *buf, size_t buflen, struct option *option)
 {
 	int i;
 
 	for (i = 0; option->name[i]; i++) {
 		if (i + 1 == buflen)
 			return 0;
 		if (option->name[i] == '-')
 			buf[i] = '_';
 		else
 			buf[i] = option->name[i];
 	}
 
 	buf[i] = 0;
 	return 1;
 }
 
 void
 go_daemon(void)
 {
 	static int state = 0;
 	cap_rights_t rights;
 
 	if (no_daemon || state)
 		return;
 
 	state = 1;
 
 	/* Stop logging to stderr... */
 	log_perror = 0;
 
 	if (daemon(1, 0) == -1)
 		error("daemon");
 
 	cap_rights_init(&rights);
 
 	if (pidfile != NULL) {
 		pidfile_write(pidfile);
 		if (cap_rights_limit(pidfile_fileno(pidfile), &rights) < 0 &&
 		    errno != ENOSYS) {
 			error("can't limit pidfile descriptor: %m");
 		}
 	}
 
 	/* we are chrooted, daemon(3) fails to open /dev/null */
 	if (nullfd != -1) {
 		dup2(nullfd, STDIN_FILENO);
 		dup2(nullfd, STDOUT_FILENO);
 		dup2(nullfd, STDERR_FILENO);
 		close(nullfd);
 		nullfd = -1;
 	}
 
 	if (cap_rights_limit(STDIN_FILENO, &rights) < 0 && errno != ENOSYS)
 		error("can't limit stdin: %m");
 	cap_rights_init(&rights, CAP_WRITE);
 	if (cap_rights_limit(STDOUT_FILENO, &rights) < 0 && errno != ENOSYS)
 		error("can't limit stdout: %m");
 	if (cap_rights_limit(STDERR_FILENO, &rights) < 0 && errno != ENOSYS)
 		error("can't limit stderr: %m");
 }
 
 int
 check_option(struct client_lease *l, int option)
 {
 	char *opbuf;
 	char *sbuf;
 
 	/* we use this, since this is what gets passed to dhclient-script */
 
 	opbuf = pretty_print_option(option, l->options[option].data,
 	    l->options[option].len, 0, 0);
 
 	sbuf = option_as_string(option, l->options[option].data,
 	    l->options[option].len);
 
 	switch (option) {
 	case DHO_SUBNET_MASK:
 	case DHO_TIME_SERVERS:
 	case DHO_NAME_SERVERS:
 	case DHO_ROUTERS:
 	case DHO_DOMAIN_NAME_SERVERS:
 	case DHO_LOG_SERVERS:
 	case DHO_COOKIE_SERVERS:
 	case DHO_LPR_SERVERS:
 	case DHO_IMPRESS_SERVERS:
 	case DHO_RESOURCE_LOCATION_SERVERS:
 	case DHO_SWAP_SERVER:
 	case DHO_BROADCAST_ADDRESS:
 	case DHO_NIS_SERVERS:
 	case DHO_NTP_SERVERS:
 	case DHO_NETBIOS_NAME_SERVERS:
 	case DHO_NETBIOS_DD_SERVER:
 	case DHO_FONT_SERVERS:
 	case DHO_DHCP_SERVER_IDENTIFIER:
 	case DHO_NISPLUS_SERVERS:
 	case DHO_MOBILE_IP_HOME_AGENT:
 	case DHO_SMTP_SERVER:
 	case DHO_POP_SERVER:
 	case DHO_NNTP_SERVER:
 	case DHO_WWW_SERVER:
 	case DHO_FINGER_SERVER:
 	case DHO_IRC_SERVER:
 	case DHO_STREETTALK_SERVER:
 	case DHO_STREETTALK_DA_SERVER:
 		if (!ipv4addrs(opbuf)) {
 			warning("Invalid IP address in option: %s", opbuf);
 			return (0);
 		}
 		return (1)  ;
 	case DHO_HOST_NAME:
 	case DHO_NIS_DOMAIN:
 	case DHO_NISPLUS_DOMAIN:
 	case DHO_TFTP_SERVER_NAME:
 		if (!res_hnok(sbuf)) {
 			warning("Bogus Host Name option %d: %s (%s)", option,
 			    sbuf, opbuf);
 			l->options[option].len = 0;
 			free(l->options[option].data);
 		}
 		return (1);
 	case DHO_DOMAIN_NAME:
 	case DHO_DOMAIN_SEARCH:
 		if (!res_hnok(sbuf)) {
 			if (!check_search(sbuf)) {
 				warning("Bogus domain search list %d: %s (%s)",
 				    option, sbuf, opbuf);
 				l->options[option].len = 0;
 				free(l->options[option].data);
 			}
 		}
 		return (1);
 	case DHO_PAD:
 	case DHO_TIME_OFFSET:
 	case DHO_BOOT_SIZE:
 	case DHO_MERIT_DUMP:
 	case DHO_ROOT_PATH:
 	case DHO_EXTENSIONS_PATH:
 	case DHO_IP_FORWARDING:
 	case DHO_NON_LOCAL_SOURCE_ROUTING:
 	case DHO_POLICY_FILTER:
 	case DHO_MAX_DGRAM_REASSEMBLY:
 	case DHO_DEFAULT_IP_TTL:
 	case DHO_PATH_MTU_AGING_TIMEOUT:
 	case DHO_PATH_MTU_PLATEAU_TABLE:
 	case DHO_INTERFACE_MTU:
 	case DHO_ALL_SUBNETS_LOCAL:
 	case DHO_PERFORM_MASK_DISCOVERY:
 	case DHO_MASK_SUPPLIER:
 	case DHO_ROUTER_DISCOVERY:
 	case DHO_ROUTER_SOLICITATION_ADDRESS:
 	case DHO_STATIC_ROUTES:
 	case DHO_TRAILER_ENCAPSULATION:
 	case DHO_ARP_CACHE_TIMEOUT:
 	case DHO_IEEE802_3_ENCAPSULATION:
 	case DHO_DEFAULT_TCP_TTL:
 	case DHO_TCP_KEEPALIVE_INTERVAL:
 	case DHO_TCP_KEEPALIVE_GARBAGE:
 	case DHO_VENDOR_ENCAPSULATED_OPTIONS:
 	case DHO_NETBIOS_NODE_TYPE:
 	case DHO_NETBIOS_SCOPE:
 	case DHO_X_DISPLAY_MANAGER:
 	case DHO_DHCP_REQUESTED_ADDRESS:
 	case DHO_DHCP_LEASE_TIME:
 	case DHO_DHCP_OPTION_OVERLOAD:
 	case DHO_DHCP_MESSAGE_TYPE:
 	case DHO_DHCP_PARAMETER_REQUEST_LIST:
 	case DHO_DHCP_MESSAGE:
 	case DHO_DHCP_MAX_MESSAGE_SIZE:
 	case DHO_DHCP_RENEWAL_TIME:
 	case DHO_DHCP_REBINDING_TIME:
 	case DHO_DHCP_CLASS_IDENTIFIER:
 	case DHO_DHCP_CLIENT_IDENTIFIER:
 	case DHO_BOOTFILE_NAME:
 	case DHO_DHCP_USER_CLASS_ID:
 	case DHO_END:
 		return (1);
 	case DHO_CLASSLESS_ROUTES:
 		return (check_classless_option(l->options[option].data,
 		    l->options[option].len));
 	default:
 		warning("unknown dhcp option value 0x%x", option);
 		return (unknown_ok);
 	}
 }
 
 /* RFC 3442 The Classless Static Routes option checks */
 int
 check_classless_option(unsigned char *data, int len)
 {
 	int i = 0;
 	unsigned char width;
 	in_addr_t addr, mask;
 
 	if (len < 5) {
 		warning("Too small length: %d", len);
 		return (0);
 	}
 	while(i < len) {
 		width = data[i++];
 		if (width == 0) {
 			i += 4;
 			continue;
 		} else if (width < 9) {
 			addr =  (in_addr_t)(data[i]	<< 24);
 			i += 1;
 		} else if (width < 17) {
 			addr =  (in_addr_t)(data[i]	<< 24) +
 				(in_addr_t)(data[i + 1]	<< 16);
 			i += 2;
 		} else if (width < 25) {
 			addr =  (in_addr_t)(data[i]	<< 24) +
 				(in_addr_t)(data[i + 1]	<< 16) +
 				(in_addr_t)(data[i + 2]	<< 8);
 			i += 3;
 		} else if (width < 33) {
 			addr =  (in_addr_t)(data[i]	<< 24) +
 				(in_addr_t)(data[i + 1]	<< 16) +
 				(in_addr_t)(data[i + 2]	<< 8)  +
 				data[i + 3];
 			i += 4;
 		} else {
 			warning("Incorrect subnet width: %d", width);
 			return (0);
 		}
 		mask = (in_addr_t)(~0) << (32 - width);
 		addr = ntohl(addr);
 		mask = ntohl(mask);
 
 		/*
 		 * From RFC 3442:
 		 * ... After deriving a subnet number and subnet mask
 		 * from each destination descriptor, the DHCP client
 		 * MUST zero any bits in the subnet number where the
 		 * corresponding bit in the mask is zero...
 		 */
 		if ((addr & mask) != addr) {
 			addr &= mask;
 			data[i - 1] = (unsigned char)(
 				(addr >> (((32 - width)/8)*8)) & 0xFF);
 		}
 		i += 4;
 	}
 	if (i > len) {
 		warning("Incorrect data length: %d (must be %d)", len, i);
 		return (0);
 	}
 	return (1);
 }
 
 int
 res_hnok(const char *dn)
 {
 	int pch = PERIOD, ch = *dn++;
 
 	while (ch != '\0') {
 		int nch = *dn++;
 
 		if (periodchar(ch)) {
 			;
 		} else if (periodchar(pch)) {
 			if (!borderchar(ch))
 				return (0);
 		} else if (periodchar(nch) || nch == '\0') {
 			if (!borderchar(ch))
 				return (0);
 		} else {
 			if (!middlechar(ch))
 				return (0);
 		}
 		pch = ch, ch = nch;
 	}
 	return (1);
 }
 
 int
 check_search(const char *srch)
 {
         int pch = PERIOD, ch = *srch++;
 	int domains = 1;
 
 	/* 256 char limit re resolv.conf(5) */
 	if (strlen(srch) > 256)
 		return (0);
 
 	while (whitechar(ch))
 		ch = *srch++;
 
         while (ch != '\0') {
                 int nch = *srch++;
 
                 if (periodchar(ch) || whitechar(ch)) {
                         ;
                 } else if (periodchar(pch)) {
                         if (!borderchar(ch))
                                 return (0);
                 } else if (periodchar(nch) || nch == '\0') {
                         if (!borderchar(ch))
                                 return (0);
                 } else {
                         if (!middlechar(ch))
                                 return (0);
                 }
 		if (!whitechar(ch)) {
 			pch = ch;
 		} else {
 			while (whitechar(nch)) {
 				nch = *srch++;
 			}
 			if (nch != '\0')
 				domains++;
 			pch = PERIOD;
 		}
 		ch = nch;
         }
 	/* 6 domain limit re resolv.conf(5) */
 	if (domains > 6)
 		return (0);
         return (1);
 }
 
 /* Does buf consist only of dotted decimal ipv4 addrs?
  * return how many if so,
  * otherwise, return 0
  */
 int
 ipv4addrs(char * buf)
 {
 	struct in_addr jnk;
 	int count = 0;
 
 	while (inet_aton(buf, &jnk) == 1){
 		count++;
 		while (periodchar(*buf) || digitchar(*buf))
 			buf++;
 		if (*buf == '\0')
 			return (count);
 		while (*buf ==  ' ')
 			buf++;
 	}
 	return (0);
 }
 
 
 char *
 option_as_string(unsigned int code, unsigned char *data, int len)
 {
 	static char optbuf[32768]; /* XXX */
 	char *op = optbuf;
 	int opleft = sizeof(optbuf);
 	unsigned char *dp = data;
 
 	if (code > 255)
 		error("option_as_string: bad code %d", code);
 
 	for (; dp < data + len; dp++) {
 		if (!isascii(*dp) || !isprint(*dp)) {
 			if (dp + 1 != data + len || *dp != 0) {
 				snprintf(op, opleft, "\\%03o", *dp);
 				op += 4;
 				opleft -= 4;
 			}
 		} else if (*dp == '"' || *dp == '\'' || *dp == '$' ||
 		    *dp == '`' || *dp == '\\') {
 			*op++ = '\\';
 			*op++ = *dp;
 			opleft -= 2;
 		} else {
 			*op++ = *dp;
 			opleft--;
 		}
 	}
 	if (opleft < 1)
 		goto toobig;
 	*op = 0;
 	return optbuf;
 toobig:
 	warning("dhcp option too large");
 	return "<error>";
 }
 
 int
 fork_privchld(int fd, int fd2)
 {
 	struct pollfd pfd[1];
 	int nfds;
 
 	switch (fork()) {
 	case -1:
 		error("cannot fork");
 	case 0:
 		break;
 	default:
 		return (0);
 	}
 
 	setproctitle("%s [priv]", ifi->name);
 
 	setsid();
 	dup2(nullfd, STDIN_FILENO);
 	dup2(nullfd, STDOUT_FILENO);
 	dup2(nullfd, STDERR_FILENO);
 	close(nullfd);
 	close(fd2);
 	close(ifi->rfdesc);
 	ifi->rfdesc = -1;
 
 	for (;;) {
 		pfd[0].fd = fd;
 		pfd[0].events = POLLIN;
 		if ((nfds = poll(pfd, 1, INFTIM)) == -1)
 			if (errno != EINTR)
 				error("poll error");
 
 		if (nfds == 0 || !(pfd[0].revents & POLLIN))
 			continue;
 
 		dispatch_imsg(ifi, fd);
 	}
 }
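
Note on the script_set_env() hunk above (EN-16:10). The change moves the
backquote/dollar check ahead of the slot lookup and allocation, and turns the
fatal error() into a warning() followed by an early return, so an offending
value is skipped and the client keeps running. The fragment below is a minimal
standalone sketch of that validate-first pattern. It is not part of the patch
and is not dhclient code: value_is_safe() and format_env() are hypothetical
names introduced only for illustration, and the fixed caller-supplied buffer
stands in for the malloc()ed environment slot.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (assumption, not in dhclient): returns 1 when
 * value contains neither '`' nor '$', 0 otherwise.
 */
static int
value_is_safe(const char *value)
{
	return (strpbrk(value, "`$") == NULL);
}

/*
 * Hypothetical helper (assumption, not in dhclient): formats
 * "<prefix><name>=<value>" into dst.  The value is checked before
 * anything is written, mirroring the reordered check in the patch.
 * Returns 0 on success, -1 if the value is rejected or does not fit.
 */
static int
format_env(char *dst, size_t dstlen, const char *prefix, const char *name,
    const char *value)
{
	int n;

	if (!value_is_safe(value))
		return (-1);	/* ignore this option, leave dst alone */

	n = snprintf(dst, dstlen, "%s%s=%s", prefix, name, value);
	return ((n < 0 || (size_t)n >= dstlen) ? -1 : 0);
}
```

A caller that gets -1 back would log a warning and move on to the next option,
which is the behaviour the patched script_set_env() has for unsafe values.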
Index: releng/10.3/sys/conf/newvers.sh
===================================================================
--- releng/10.3/sys/conf/newvers.sh	(revision 303983)
+++ releng/10.3/sys/conf/newvers.sh	(revision 303984)
@@ -1,232 +1,232 @@
 #!/bin/sh -
 #
 # Copyright (c) 1984, 1986, 1990, 1993
 #	The Regents of the University of California.  All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
 # are met:
 # 1. Redistributions of source code must retain the above copyright
 #    notice, this list of conditions and the following disclaimer.
 # 2. Redistributions in binary form must reproduce the above copyright
 #    notice, this list of conditions and the following disclaimer in the
 #    documentation and/or other materials provided with the distribution.
 # 4. Neither the name of the University nor the names of its contributors
 #    may be used to endorse or promote products derived from this software
 #    without specific prior written permission.
 #
 # THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 # ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 # ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 # FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 # DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 # OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 # HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 # OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 # SUCH DAMAGE.
 #
 #	@(#)newvers.sh	8.1 (Berkeley) 4/20/94
 # $FreeBSD$
 
 TYPE="FreeBSD"
 REVISION="10.3"
-BRANCH="RELEASE-p6"
+BRANCH="RELEASE-p7"
 if [ "X${BRANCH_OVERRIDE}" != "X" ]; then
 	BRANCH=${BRANCH_OVERRIDE}
 fi
 RELEASE="${REVISION}-${BRANCH}"
 VERSION="${TYPE} ${RELEASE}"
 
 if [ "X${SYSDIR}" = "X" ]; then
     SYSDIR=$(dirname $0)/..
 fi
 
 if [ "X${PARAMFILE}" != "X" ]; then
 	RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \
 		${PARAMFILE})
 else
 	RELDATE=$(awk '/__FreeBSD_version.*propagated to newvers/ {print $3}' \
 		${SYSDIR}/sys/param.h)
 fi
 
 b=share/examples/etc/bsd-style-copyright
 year=$(sed -Ee '/^Copyright .* The FreeBSD Project/!d;s/^.*1992-([0-9]*) .*$/\1/g' ${SYSDIR}/../COPYRIGHT)
 # look for copyright template
 for bsd_copyright in ../$b ../../$b ../../../$b /usr/src/$b /usr/$b
 do
 	if [ -r "$bsd_copyright" ]; then
 		COPYRIGHT=`sed \
 		    -e "s/\[year\]/1992-$year/" \
 		    -e 's/\[your name here\]\.* /The FreeBSD Project./' \
 		    -e 's/\[your name\]\.*/The FreeBSD Project./' \
 		    -e '/\[id for your version control system, if any\]/d' \
 		    $bsd_copyright` 
 		break
 	fi
 done
 
 # no copyright found, use a dummy
 if [ X"$COPYRIGHT" = X ]; then
 	COPYRIGHT="/*-
  * Copyright (c) 1992-$year The FreeBSD Project.
  * All rights reserved.
  *
  */"
 fi
 
 # add newline
 COPYRIGHT="$COPYRIGHT
 "
 
 LC_ALL=C; export LC_ALL
 if [ ! -r version ]
 then
 	echo 0 > version
 fi
 
 touch version
 v=`cat version` u=${USER:-root} d=`pwd` h=${HOSTNAME:-`hostname`}
 if [ -n "$SOURCE_DATE_EPOCH" ]; then
 	if ! t=`date -r $SOURCE_DATE_EPOCH 2>/dev/null`; then
 		echo "Invalid SOURCE_DATE_EPOCH" >&2
 		exit 1
 	fi
 else
 	t=`date`
 fi
 i=`${MAKE:-make} -V KERN_IDENT`
 compiler_v=$($(${MAKE:-make} -V CC) -v 2>&1 | grep 'version')
 
 for dir in /usr/bin /usr/local/bin; do
 	if [ ! -z "${svnversion}" ] ; then
 		break
 	fi
 	if [ -x "${dir}/svnversion" ] && [ -z ${svnversion} ] ; then
 		# Run svnversion from ${dir} on this script; if return code
 		# is not zero, the checkout might not be compatible with the
 		# svnversion being used.
 		${dir}/svnversion $(realpath ${0}) >/dev/null 2>&1
 		if [ $? -eq 0 ]; then
 			svnversion=${dir}/svnversion
 			break
 		fi
 	fi
 done
 
 if [ -z "${svnversion}" ] && [ -x /usr/bin/svnliteversion ] ; then
 	/usr/bin/svnliteversion $(realpath ${0}) >/dev/null 2>&1
 	if [ $? -eq 0 ]; then
 		svnversion=/usr/bin/svnliteversion
 	else
 		svnversion=
 	fi
 fi
 
 for dir in /usr/bin /usr/local/bin; do
 	if [ -x "${dir}/p4" ] && [ -z ${p4_cmd} ] ; then
 		p4_cmd=${dir}/p4
 	fi
 done
 if [ -d "${SYSDIR}/../.git" ] ; then
 	for dir in /usr/bin /usr/local/bin; do
 		if [ -x "${dir}/git" ] ; then
 			git_cmd="${dir}/git --git-dir=${SYSDIR}/../.git"
 			break
 		fi
 	done
 fi
 
 if [ -d "${SYSDIR}/../.hg" ] ; then
 	for dir in /usr/bin /usr/local/bin; do
 		if [ -x "${dir}/hg" ] ; then
 			hg_cmd="${dir}/hg -R ${SYSDIR}/.."
 			break
 		fi
 	done
 fi
 
 if [ -n "$svnversion" ] ; then
 	svn=`cd ${SYSDIR} && $svnversion 2>/dev/null`
 	case "$svn" in
 	[0-9]*)	svn=" r${svn}" ;;
 	*)	unset svn ;;
 	esac
 fi
 
 if [ -n "$git_cmd" ] ; then
 	git=`$git_cmd rev-parse --verify --short HEAD 2>/dev/null`
 	svn=`$git_cmd svn find-rev $git 2>/dev/null`
 	if [ -n "$svn" ] ; then
 		svn=" r${svn}"
 		git="=${git}"
 	else
 		svn=`$git_cmd log | fgrep 'git-svn-id:' | head -1 | \
 		     sed -n 's/^.*@\([0-9][0-9]*\).*$/\1/p'`
 		if [ -z "$svn" ] ; then
 			svn=`$git_cmd log --format='format:%N' | \
 			     grep '^svn ' | head -1 | \
 			     sed -n 's/^.*revision=\([0-9][0-9]*\).*$/\1/p'`
 		fi
 		if [ -n "$svn" ] ; then
 			svn=" r${svn}"
 			git="+${git}"
 		else
 			git=" ${git}"
 		fi
 	fi
 	git_b=`$git_cmd rev-parse --abbrev-ref HEAD`
 	if [ -n "$git_b" ] ; then
 		git="${git}(${git_b})"
 	fi
 	if $git_cmd --work-tree=${SYSDIR}/.. diff-index \
 	    --name-only HEAD | read dummy; then
 		git="${git}-dirty"
 	fi
 fi
 
 if [ -n "$p4_cmd" ] ; then
 	p4version=`cd ${SYSDIR} && $p4_cmd changes -m1 "./...#have" 2>&1 | \
 		awk '{ print $2 }'`
 	case "$p4version" in
 	[0-9]*)
 		p4version=" ${p4version}"
 		p4opened=`cd ${SYSDIR} && $p4_cmd opened ./... 2>&1`
 		case "$p4opened" in
 		File*) ;;
 		//*)	p4version="${p4version}+edit" ;;
 		esac
 		;;
 	*)	unset p4version ;;
 	esac
 fi
 
 if [ -n "$hg_cmd" ] ; then
 	hg=`$hg_cmd id 2>/dev/null`
 	svn=`$hg_cmd svn info 2>/dev/null | \
 		awk -F': ' '/Revision/ { print $2 }'`
 	if [ -n "$svn" ] ; then
 		svn=" r${svn}"
 	fi
 	if [ -n "$hg" ] ; then
 		hg=" ${hg}"
 	fi
 fi
 
 cat << EOF > vers.c
 $COPYRIGHT
 #define SCCSSTR "@(#)${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}"
 #define VERSTR "${VERSION} #${v}${svn}${git}${hg}${p4version}: ${t}\\n    ${u}@${h}:${d}\\n"
 #define RELSTR "${RELEASE}"
 
 char sccs[sizeof(SCCSSTR) > 128 ? sizeof(SCCSSTR) : 128] = SCCSSTR;
 char version[sizeof(VERSTR) > 256 ? sizeof(VERSTR) : 256] = VERSTR;
 char compiler_version[] = "${compiler_v}";
 char ostype[] = "${TYPE}";
 char osrelease[sizeof(RELSTR) > 32 ? sizeof(RELSTR) : 32] = RELSTR;
 int osreldate = ${RELDATE};
 char kern_ident[] = "${i}";
 EOF
 
 echo $((v + 1)) > version
Index: releng/10.3/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c
===================================================================
--- releng/10.3/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/storvsc/hv_storvsc_drv_freebsd.c	(revision 303984)
@@ -1,2145 +1,2195 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 /**
  * StorVSC driver for Hyper-V.  This driver presents a SCSI HBA interface
  * to the Comman Access Method (CAM) layer.  CAM control blocks (CCBs) are
  * converted into VSCSI protocol messages which are delivered to the parent
  * partition StorVSP driver over the Hyper-V VMBUS.
  */
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/proc.h>
 #include <sys/condvar.h>
 #include <sys/time.h>
 #include <sys/systm.h>
 #include <sys/sockio.h>
 #include <sys/mbuf.h>
 #include <sys/malloc.h>
 #include <sys/module.h>
 #include <sys/kernel.h>
 #include <sys/queue.h>
 #include <sys/lock.h>
 #include <sys/sx.h>
 #include <sys/taskqueue.h>
 #include <sys/bus.h>
 #include <sys/mutex.h>
 #include <sys/callout.h>
 #include <vm/vm.h>
 #include <vm/pmap.h>
 #include <vm/uma.h>
 #include <sys/lock.h>
 #include <sys/sema.h>
 #include <sys/sglist.h>
 #include <machine/bus.h>
 #include <sys/bus_dma.h>
 
 #include <cam/cam.h>
 #include <cam/cam_ccb.h>
 #include <cam/cam_periph.h>
 #include <cam/cam_sim.h>
 #include <cam/cam_xpt_sim.h>
 #include <cam/cam_xpt_internal.h>
 #include <cam/cam_debug.h>
 #include <cam/scsi/scsi_all.h>
 #include <cam/scsi/scsi_message.h>
 
 #include <dev/hyperv/include/hyperv.h>
 #include "hv_vstorage.h"
 
 #define STORVSC_RINGBUFFER_SIZE		(20*PAGE_SIZE)
 #define STORVSC_MAX_LUNS_PER_TARGET	(64)
 #define STORVSC_MAX_IO_REQUESTS		(STORVSC_MAX_LUNS_PER_TARGET * 2)
 #define BLKVSC_MAX_IDE_DISKS_PER_TARGET	(1)
 #define BLKVSC_MAX_IO_REQUESTS		STORVSC_MAX_IO_REQUESTS
 #define STORVSC_MAX_TARGETS		(2)
 
-#define STORVSC_WIN7_MAJOR 4
-#define STORVSC_WIN7_MINOR 2
-
-#define STORVSC_WIN8_MAJOR 5
-#define STORVSC_WIN8_MINOR 1
-
 #define VSTOR_PKT_SIZE	(sizeof(struct vstor_packet) - vmscsi_size_delta)
 
 #define HV_ALIGN(x, a) roundup2(x, a)
 
 struct storvsc_softc;
 
 struct hv_sgl_node {
 	LIST_ENTRY(hv_sgl_node) link;
 	struct sglist *sgl_data;
 };
 
 struct hv_sgl_page_pool{
 	LIST_HEAD(, hv_sgl_node) in_use_sgl_list;
 	LIST_HEAD(, hv_sgl_node) free_sgl_list;
 	boolean_t                is_init;
 } g_hv_sgl_page_pool;
 
 #define STORVSC_MAX_SG_PAGE_CNT STORVSC_MAX_IO_REQUESTS * HV_MAX_MULTIPAGE_BUFFER_COUNT
 
 enum storvsc_request_type {
 	WRITE_TYPE,
 	READ_TYPE,
 	UNKNOWN_TYPE
 };
 
 struct hv_storvsc_request {
 	LIST_ENTRY(hv_storvsc_request) link;
 	struct vstor_packet	vstor_packet;
 	hv_vmbus_multipage_buffer data_buf;
 	void *sense_data;
 	uint8_t sense_info_len;
 	uint8_t retries;
 	union ccb *ccb;
 	struct storvsc_softc *softc;
 	struct callout callout;
 	struct sema synch_sema; /*Synchronize the request/response if needed */
 	struct sglist *bounce_sgl;
 	unsigned int bounce_sgl_count;
 	uint64_t not_aligned_seg_bits;
 };
 
 struct storvsc_softc {
 	struct hv_device		*hs_dev;
 	LIST_HEAD(, hv_storvsc_request)	hs_free_list;
 	struct mtx			hs_lock;
 	struct storvsc_driver_props	*hs_drv_props;
 	int 				hs_unit;
 	uint32_t			hs_frozen;
 	struct cam_sim			*hs_sim;
 	struct cam_path 		*hs_path;
 	uint32_t			hs_num_out_reqs;
 	boolean_t			hs_destroy;
 	boolean_t			hs_drain_notify;
 	boolean_t			hs_open_multi_channel;
 	struct sema 			hs_drain_sema;	
 	struct hv_storvsc_request	hs_init_req;
 	struct hv_storvsc_request	hs_reset_req;
 };
 
 
 /**
  * HyperV storvsc timeout testing cases:
  * a. IO returned after first timeout;
  * b. IO returned after second timeout and queue freeze;
  * c. IO returned while timer handler is running
  * The first can be tested by "sg_senddiag -vv /dev/daX",
  * and the second and third can be done by
  * "sg_wr_mode -v -p 08 -c 0,1a -m 0,ff /dev/daX".
  */
 #define HVS_TIMEOUT_TEST 0
 
 /*
  * Bus/adapter reset functionality on the Hyper-V host is
  * buggy and it will be disabled until
  * it can be further tested.
  */
 #define HVS_HOST_RESET 0
 
 struct storvsc_driver_props {
 	char		*drv_name;
 	char		*drv_desc;
 	uint8_t		drv_max_luns_per_target;
 	uint8_t		drv_max_ios_per_target;
 	uint32_t	drv_ringbuffer_size;
 };
 
 enum hv_storage_type {
 	DRIVER_BLKVSC,
 	DRIVER_STORVSC,
 	DRIVER_UNKNOWN
 };
 
 #define HS_MAX_ADAPTERS 10
 
 #define HV_STORAGE_SUPPORTS_MULTI_CHANNEL 0x1
 
 /* {ba6163d9-04a1-4d29-b605-72e2ffb1dc7f} */
 static const hv_guid gStorVscDeviceType={
 	.data = {0xd9, 0x63, 0x61, 0xba, 0xa1, 0x04, 0x29, 0x4d,
 		 0xb6, 0x05, 0x72, 0xe2, 0xff, 0xb1, 0xdc, 0x7f}
 };
 
 /* {32412632-86cb-44a2-9b5c-50d1417354f5} */
 static const hv_guid gBlkVscDeviceType={
 	.data = {0x32, 0x26, 0x41, 0x32, 0xcb, 0x86, 0xa2, 0x44,
 		 0x9b, 0x5c, 0x50, 0xd1, 0x41, 0x73, 0x54, 0xf5}
 };
 
 static struct storvsc_driver_props g_drv_props_table[] = {
 	{"blkvsc", "Hyper-V IDE Storage Interface",
 	 BLKVSC_MAX_IDE_DISKS_PER_TARGET, BLKVSC_MAX_IO_REQUESTS,
 	 STORVSC_RINGBUFFER_SIZE},
 	{"storvsc", "Hyper-V SCSI Storage Interface",
 	 STORVSC_MAX_LUNS_PER_TARGET, STORVSC_MAX_IO_REQUESTS,
 	 STORVSC_RINGBUFFER_SIZE}
 };
 
 /*
  * Sense buffer size changed in win8; have a run-time
  * variable to track the size we should use.
  */
-static int sense_buffer_size;
+static int sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE;
 
 /*
  * The size of the vmscsi_request has changed in win8. The
  * additional size is for the newly added elements in the
  * structure. These elements are valid only when we are talking
  * to a win8 host.
  * Track the correct size we need to apply.
  */
 static int vmscsi_size_delta;
+/*
+ * The storage protocol version is determined during the
+ * initial exchange with the host.  It will indicate which
+ * storage functionality is available in the host.
+ */
+static int vmstor_proto_version;
 
-static int storvsc_current_major;
-static int storvsc_current_minor;
+struct vmstor_proto {
+	int proto_version;
+	int sense_buffer_size;
+	int vmscsi_size_delta;
+};
 
+static const struct vmstor_proto vmstor_proto_list[] = {
+	{
+		VMSTOR_PROTOCOL_VERSION_WIN10,
+		POST_WIN7_STORVSC_SENSE_BUFFER_SIZE,
+		0
+	},
+	{
+		VMSTOR_PROTOCOL_VERSION_WIN8_1,
+		POST_WIN7_STORVSC_SENSE_BUFFER_SIZE,
+		0
+	},
+	{
+		VMSTOR_PROTOCOL_VERSION_WIN8,
+		POST_WIN7_STORVSC_SENSE_BUFFER_SIZE,
+		0
+	},
+	{
+		VMSTOR_PROTOCOL_VERSION_WIN7,
+		PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE,
+		sizeof(struct vmscsi_win8_extension)
+	},
+	{
+		VMSTOR_PROTOCOL_VERSION_WIN6,
+		PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE,
+		sizeof(struct vmscsi_win8_extension)
+	}
+};
+
 /* static functions */
 static int storvsc_probe(device_t dev);
 static int storvsc_attach(device_t dev);
 static int storvsc_detach(device_t dev);
 static void storvsc_poll(struct cam_sim * sim);
 static void storvsc_action(struct cam_sim * sim, union ccb * ccb);
 static int create_storvsc_request(union ccb *ccb, struct hv_storvsc_request *reqp);
 static void storvsc_free_request(struct storvsc_softc *sc, struct hv_storvsc_request *reqp);
 static enum hv_storage_type storvsc_get_storage_type(device_t dev);
 static void hv_storvsc_rescan_target(struct storvsc_softc *sc);
 static void hv_storvsc_on_channel_callback(void *context);
 static void hv_storvsc_on_iocompletion( struct storvsc_softc *sc,
 					struct vstor_packet *vstor_packet,
 					struct hv_storvsc_request *request);
 static int hv_storvsc_connect_vsp(struct hv_device *device);
 static void storvsc_io_done(struct hv_storvsc_request *reqp);
 static void storvsc_copy_sgl_to_bounce_buf(struct sglist *bounce_sgl,
 				bus_dma_segment_t *orig_sgl,
 				unsigned int orig_sgl_count,
 				uint64_t seg_bits);
 void storvsc_copy_from_bounce_buf_to_sgl(bus_dma_segment_t *dest_sgl,
 				unsigned int dest_sgl_count,
 				struct sglist* src_sgl,
 				uint64_t seg_bits);
 
 static device_method_t storvsc_methods[] = {
 	/* Device interface */
 	DEVMETHOD(device_probe,		storvsc_probe),
 	DEVMETHOD(device_attach,	storvsc_attach),
 	DEVMETHOD(device_detach,	storvsc_detach),
 	DEVMETHOD(device_shutdown,      bus_generic_shutdown),
 	DEVMETHOD_END
 };
 
 static driver_t storvsc_driver = {
 	"storvsc", storvsc_methods, sizeof(struct storvsc_softc),
 };
 
 static devclass_t storvsc_devclass;
 DRIVER_MODULE(storvsc, vmbus, storvsc_driver, storvsc_devclass, 0, 0);
 MODULE_VERSION(storvsc, 1);
 MODULE_DEPEND(storvsc, vmbus, 1, 1, 1);
 
 
 /**
  * The host is capable of sending messages to us that are
  * completely unsolicited. So, we need to address the race
  * condition where we may be in the process of unloading the
  * driver when the host may send us an unsolicited message.
  * We address this issue by implementing a sequentially
  * consistent protocol:
  *
  * 1. Channel callback is invoked while holding the the channel lock
  *    and an unloading driver will reset the channel callback under
  *    the protection of this channel lock.
  *
  * 2. To ensure bounded wait time for unloading a driver, we don't
  *    permit outgoing traffic once the device is marked as being
  *    destroyed.
  *
  * 3. Once the device is marked as being destroyed, we only
  *    permit incoming traffic to properly account for
  *    packets already sent out.
  */
 static inline struct storvsc_softc *
 get_stor_device(struct hv_device *device,
 				boolean_t outbound)
 {
 	struct storvsc_softc *sc;
 
 	sc = device_get_softc(device->device);
 	if (sc == NULL) {
 		return NULL;
 	}
 
 	if (outbound) {
 		/*
 		 * Here we permit outgoing I/O only
 		 * if the device is not being destroyed.
 		 */
 
 		if (sc->hs_destroy) {
 			sc = NULL;
 		}
 	} else {
 		/*
 		 * inbound case; if being destroyed
 		 * only permit to account for
 		 * messages already sent out.
 		 */
 		if (sc->hs_destroy && (sc->hs_num_out_reqs == 0)) {
 			sc = NULL;
 		}
 	}
 	return sc;
 }
 
 /**
  * @brief Callback handler, will be invoked when receive mutil-channel offer
  *
  * @param context  new multi-channel
  */
 static void
 storvsc_handle_sc_creation(void *context)
 {
 	hv_vmbus_channel *new_channel;
 	struct hv_device *device;
 	struct storvsc_softc *sc;
 	struct vmstor_chan_props props;
 	int ret = 0;
 
 	new_channel = (hv_vmbus_channel *)context;
 	device = new_channel->primary_channel->device;
 	sc = get_stor_device(device, TRUE);
 	if (sc == NULL)
 		return;
 
 	if (FALSE == sc->hs_open_multi_channel)
 		return;
 	
 	memset(&props, 0, sizeof(props));
 
 	ret = hv_vmbus_channel_open(new_channel,
 	    sc->hs_drv_props->drv_ringbuffer_size,
   	    sc->hs_drv_props->drv_ringbuffer_size,
 	    (void *)&props,
 	    sizeof(struct vmstor_chan_props),
 	    hv_storvsc_on_channel_callback,
 	    new_channel);
 
 	return;
 }
 
 /**
  * @brief Send multi-channel creation request to host
  *
  * @param device  a Hyper-V device pointer
  * @param max_chans  the max channels supported by vmbus
  */
 static void
 storvsc_send_multichannel_request(struct hv_device *dev, int max_chans)
 {
 	struct storvsc_softc *sc;
 	struct hv_storvsc_request *request;
 	struct vstor_packet *vstor_packet;	
 	int request_channels_cnt = 0;
 	int ret;
 
 	/* get multichannels count that need to create */
 	request_channels_cnt = MIN(max_chans, mp_ncpus);
 
 	sc = get_stor_device(dev, TRUE);
 	if (sc == NULL) {
 		printf("Storvsc_error: get sc failed while send mutilchannel "
 		    "request\n");
 		return;
 	}
 
 	request = &sc->hs_init_req;
 
 	/* Establish a handler for multi-channel */
 	dev->channel->sc_creation_callback = storvsc_handle_sc_creation;
 
 	/* request the host to create multi-channel */
 	memset(request, 0, sizeof(struct hv_storvsc_request));
 	
 	sema_init(&request->synch_sema, 0, ("stor_synch_sema"));
 
 	vstor_packet = &request->vstor_packet;
 	
 	vstor_packet->operation = VSTOR_OPERATION_CREATE_MULTI_CHANNELS;
 	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 	vstor_packet->u.multi_channels_cnt = request_channels_cnt;
 
 	ret = hv_vmbus_channel_send_packet(
 	    dev->channel,
 	    vstor_packet,
 	    VSTOR_PKT_SIZE,
 	    (uint64_t)(uintptr_t)request,
 	    HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 	    HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
 	/* wait for 5 seconds */
 	ret = sema_timedwait(&request->synch_sema, 5 * hz);
 	if (ret != 0) {		
 		printf("Storvsc_error: create multi-channel timeout, %d\n",
 		    ret);
 		return;
 	}
 
 	if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO ||
 	    vstor_packet->status != 0) {		
 		printf("Storvsc_error: create multi-channel invalid operation "
 		    "(%d) or statue (%u)\n",
 		    vstor_packet->operation, vstor_packet->status);
 		return;
 	}
 
 	sc->hs_open_multi_channel = TRUE;
 
 	if (bootverbose)
 		printf("Storvsc create multi-channel success!\n");
 }
 
 /**
  * @brief initialize channel connection to parent partition
  *
  * @param dev  a Hyper-V device pointer
  * @returns  0 on success, non-zero error on failure
  */
 static int
 hv_storvsc_channel_init(struct hv_device *dev)
 {
-	int ret = 0;
+	int ret = 0, i;
 	struct hv_storvsc_request *request;
 	struct vstor_packet *vstor_packet;
 	struct storvsc_softc *sc;
 	uint16_t max_chans = 0;
 	boolean_t support_multichannel = FALSE;
 
 	max_chans = 0;
 	support_multichannel = FALSE;
 
 	sc = get_stor_device(dev, TRUE);
 	if (sc == NULL)
 		return (ENODEV);
 
 	request = &sc->hs_init_req;
 	memset(request, 0, sizeof(struct hv_storvsc_request));
 	vstor_packet = &request->vstor_packet;
 	request->softc = sc;
 
 	/**
 	 * Initiate the vsc/vsp initialization protocol on the open channel
 	 */
 	sema_init(&request->synch_sema, 0, ("stor_synch_sema"));
 
 	vstor_packet->operation = VSTOR_OPERATION_BEGININITIALIZATION;
 	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 
 
 	ret = hv_vmbus_channel_send_packet(
 			dev->channel,
 			vstor_packet,
 			VSTOR_PKT_SIZE,
 			(uint64_t)(uintptr_t)request,
 			HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 			HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
 	if (ret != 0)
 		goto cleanup;
 
 	/* wait 5 seconds */
 	ret = sema_timedwait(&request->synch_sema, 5 * hz);
 	if (ret != 0)
 		goto cleanup;
 
 	if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO ||
 		vstor_packet->status != 0) {
 		goto cleanup;
 	}
 
-	/* reuse the packet for version range supported */
+	for (i = 0; i < nitems(vmstor_proto_list); i++) {
+		/* reuse the packet for version range supported */
 
-	memset(vstor_packet, 0, sizeof(struct vstor_packet));
-	vstor_packet->operation = VSTOR_OPERATION_QUERYPROTOCOLVERSION;
-	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
+		memset(vstor_packet, 0, sizeof(struct vstor_packet));
+		vstor_packet->operation = VSTOR_OPERATION_QUERYPROTOCOLVERSION;
+		vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 
-	vstor_packet->u.version.major_minor =
-	    VMSTOR_PROTOCOL_VERSION(storvsc_current_major, storvsc_current_minor);
+		vstor_packet->u.version.major_minor =
+			vmstor_proto_list[i].proto_version;
 
-	/* revision is only significant for Windows guests */
-	vstor_packet->u.version.revision = 0;
+		/* revision is only significant for Windows guests */
+		vstor_packet->u.version.revision = 0;
 
-	ret = hv_vmbus_channel_send_packet(
+		ret = hv_vmbus_channel_send_packet(
 			dev->channel,
 			vstor_packet,
 			VSTOR_PKT_SIZE,
 			(uint64_t)(uintptr_t)request,
 			HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 			HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
-	if (ret != 0)
-		goto cleanup;
+		if (ret != 0)
+			goto cleanup;
 
-	/* wait 5 seconds */
-	ret = sema_timedwait(&request->synch_sema, 5 * hz);
+		/* wait 5 seconds */
+		ret = sema_timedwait(&request->synch_sema, 5 * hz);
 
-	if (ret)
-		goto cleanup;
+		if (ret)
+			goto cleanup;
 
-	/* TODO: Check returned version */
-	if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO ||
-		vstor_packet->status != 0)
-		goto cleanup;
+		if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO) {
+			ret = EINVAL;
+			goto cleanup;
+		}
+		if (vstor_packet->status == 0) {
+			vmstor_proto_version =
+				vmstor_proto_list[i].proto_version;
+			sense_buffer_size =
+				vmstor_proto_list[i].sense_buffer_size;
+			vmscsi_size_delta =
+				vmstor_proto_list[i].vmscsi_size_delta;
+			break;
+		}
+	}
 
+	if (vstor_packet->status != 0) {
+		ret = EINVAL;
+		goto cleanup;
+	}
 	/**
 	 * Query channel properties
 	 */
 	memset(vstor_packet, 0, sizeof(struct vstor_packet));
 	vstor_packet->operation = VSTOR_OPERATION_QUERYPROPERTIES;
 	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 
 	ret = hv_vmbus_channel_send_packet(
 				dev->channel,
 				vstor_packet,
 				VSTOR_PKT_SIZE,
 				(uint64_t)(uintptr_t)request,
 				HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 				HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
 	if ( ret != 0)
 		goto cleanup;
 
 	/* wait 5 seconds */
 	ret = sema_timedwait(&request->synch_sema, 5 * hz);
 
 	if (ret != 0)
 		goto cleanup;
 
 	/* TODO: Check returned version */
 	if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO ||
 	    vstor_packet->status != 0) {
 		goto cleanup;
 	}
 
 	/* multi-channels feature is supported by WIN8 and above version */
 	max_chans = vstor_packet->u.chan_props.max_channel_cnt;
 	if ((hv_vmbus_protocal_version != HV_VMBUS_VERSION_WIN7) &&
 	    (hv_vmbus_protocal_version != HV_VMBUS_VERSION_WS2008) &&
 	    (vstor_packet->u.chan_props.flags &
 	     HV_STORAGE_SUPPORTS_MULTI_CHANNEL)) {
 		support_multichannel = TRUE;
 	}
 
 	memset(vstor_packet, 0, sizeof(struct vstor_packet));
 	vstor_packet->operation = VSTOR_OPERATION_ENDINITIALIZATION;
 	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 
 	ret = hv_vmbus_channel_send_packet(
 			dev->channel,
 			vstor_packet,
 			VSTOR_PKT_SIZE,
 			(uint64_t)(uintptr_t)request,
 			HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 			HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
 	if (ret != 0) {
 		goto cleanup;
 	}
 
 	/* wait 5 seconds */
 	ret = sema_timedwait(&request->synch_sema, 5 * hz);
 
 	if (ret != 0)
 		goto cleanup;
 
 	if (vstor_packet->operation != VSTOR_OPERATION_COMPLETEIO ||
 	    vstor_packet->status != 0)
 		goto cleanup;
 
 	/*
 	 * If multi-channel is supported, send multichannel create
 	 * request to host.
 	 */
 	if (support_multichannel)
 		storvsc_send_multichannel_request(dev, max_chans);
 
 cleanup:
 	sema_destroy(&request->synch_sema);
 	return (ret);
 }
 
 /**
  * @brief Open channel connection to paraent partition StorVSP driver
  *
  * Open and initialize channel connection to parent partition StorVSP driver.
  *
  * @param pointer to a Hyper-V device
  * @returns 0 on success, non-zero error on failure
  */
 static int
 hv_storvsc_connect_vsp(struct hv_device *dev)
 {	
 	int ret = 0;
 	struct vmstor_chan_props props;
 	struct storvsc_softc *sc;
 
 	sc = device_get_softc(dev->device);
 		
 	memset(&props, 0, sizeof(struct vmstor_chan_props));
 
 	/*
 	 * Open the channel
 	 */
 
 	ret = hv_vmbus_channel_open(
 		dev->channel,
 		sc->hs_drv_props->drv_ringbuffer_size,
 		sc->hs_drv_props->drv_ringbuffer_size,
 		(void *)&props,
 		sizeof(struct vmstor_chan_props),
 		hv_storvsc_on_channel_callback,
 		dev->channel);
 
 	if (ret != 0) {
 		return ret;
 	}
 
 	ret = hv_storvsc_channel_init(dev);
 
 	return (ret);
 }
 
 #if HVS_HOST_RESET
 static int
 hv_storvsc_host_reset(struct hv_device *dev)
 {
 	int ret = 0;
 	struct storvsc_softc *sc;
 
 	struct hv_storvsc_request *request;
 	struct vstor_packet *vstor_packet;
 
 	sc = get_stor_device(dev, TRUE);
 	if (sc == NULL) {
 		return ENODEV;
 	}
 
 	request = &sc->hs_reset_req;
 	request->softc = sc;
 	vstor_packet = &request->vstor_packet;
 
 	sema_init(&request->synch_sema, 0, "stor synch sema");
 
 	vstor_packet->operation = VSTOR_OPERATION_RESETBUS;
 	vstor_packet->flags = REQUEST_COMPLETION_FLAG;
 
 	ret = hv_vmbus_channel_send_packet(dev->channel,
 			vstor_packet,
 			VSTOR_PKT_SIZE,
 			(uint64_t)(uintptr_t)&sc->hs_reset_req,
 			HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 			HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 
 	if (ret != 0) {
 		goto cleanup;
 	}
 
 	ret = sema_timedwait(&request->synch_sema, 5 * hz); /* KYS 5 seconds */
 
 	if (ret) {
 		goto cleanup;
 	}
 
 
 	/*
 	 * At this point, all outstanding requests in the adapter
 	 * should have been flushed out and return to us
 	 */
 
 cleanup:
 	sema_destroy(&request->synch_sema);
 	return (ret);
 }
 #endif /* HVS_HOST_RESET */
 
 /**
  * @brief Function to initiate an I/O request
  *
  * @param device Hyper-V device pointer
  * @param request pointer to a request structure
  * @returns 0 on success, non-zero error on failure
  */
 static int
 hv_storvsc_io_request(struct hv_device *device,
 					  struct hv_storvsc_request *request)
 {
 	struct storvsc_softc *sc;
 	struct vstor_packet *vstor_packet = &request->vstor_packet;
 	struct hv_vmbus_channel* outgoing_channel = NULL;
 	int ret = 0;
 
 	sc = get_stor_device(device, TRUE);
 
 	if (sc == NULL) {
 		return ENODEV;
 	}
 
 	vstor_packet->flags |= REQUEST_COMPLETION_FLAG;
 
 	vstor_packet->u.vm_srb.length = VSTOR_PKT_SIZE;
 	
 	vstor_packet->u.vm_srb.sense_info_len = sense_buffer_size;
 
 	vstor_packet->u.vm_srb.transfer_len = request->data_buf.length;
 
 	vstor_packet->operation = VSTOR_OPERATION_EXECUTESRB;
 
 	outgoing_channel = vmbus_select_outgoing_channel(device->channel);
 
 	mtx_unlock(&request->softc->hs_lock);
 	if (request->data_buf.length) {
 		ret = hv_vmbus_channel_send_packet_multipagebuffer(
 				outgoing_channel,
 				&request->data_buf,
 				vstor_packet,
 				VSTOR_PKT_SIZE,
 				(uint64_t)(uintptr_t)request);
 
 	} else {
 		ret = hv_vmbus_channel_send_packet(
 			outgoing_channel,
 			vstor_packet,
 			VSTOR_PKT_SIZE,
 			(uint64_t)(uintptr_t)request,
 			HV_VMBUS_PACKET_TYPE_DATA_IN_BAND,
 			HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
 	}
 	mtx_lock(&request->softc->hs_lock);
 
 	if (ret != 0) {
 		printf("Unable to send packet %p ret %d", vstor_packet, ret);
 	} else {
 		atomic_add_int(&sc->hs_num_out_reqs, 1);
 	}
 
 	return (ret);
 }
 
 
 /**
  * Process IO_COMPLETION_OPERATION and ready
  * the result to be completed for upper layer
  * processing by the CAM layer.
  */
 static void
 hv_storvsc_on_iocompletion(struct storvsc_softc *sc,
 			   struct vstor_packet *vstor_packet,
 			   struct hv_storvsc_request *request)
 {
 	struct vmscsi_req *vm_srb;
 
 	vm_srb = &vstor_packet->u.vm_srb;
 
+	/*
+	 * Copy some fields of the host's response into the request structure,
+	 * because the fields will be used later in storvsc_io_done().
+	 */
+	request->vstor_packet.u.vm_srb.scsi_status = vm_srb->scsi_status;
+	request->vstor_packet.u.vm_srb.transfer_len = vm_srb->transfer_len;
+
 	if (((vm_srb->scsi_status & 0xFF) == SCSI_STATUS_CHECK_COND) &&
 			(vm_srb->srb_status & SRB_STATUS_AUTOSENSE_VALID)) {
 		/* Autosense data available */
 
 		KASSERT(vm_srb->sense_info_len <= request->sense_info_len,
 				("vm_srb->sense_info_len <= "
 				 "request->sense_info_len"));
 
 		memcpy(request->sense_data, vm_srb->u.sense_data,
 			vm_srb->sense_info_len);
 
 		request->sense_info_len = vm_srb->sense_info_len;
 	}
 
 	/* Complete request by passing to the CAM layer */
 	storvsc_io_done(request);
 	atomic_subtract_int(&sc->hs_num_out_reqs, 1);
 	if (sc->hs_drain_notify && (sc->hs_num_out_reqs == 0)) {
 		sema_post(&sc->hs_drain_sema);
 	}
 }
 
 static void
 hv_storvsc_rescan_target(struct storvsc_softc *sc)
 {
 	path_id_t pathid;
 	target_id_t targetid;
 	union ccb *ccb;
 
 	pathid = cam_sim_path(sc->hs_sim);
 	targetid = CAM_TARGET_WILDCARD;
 
 	/*
 	 * Allocate a CCB and schedule a rescan.
 	 */
 	ccb = xpt_alloc_ccb_nowait();
 	if (ccb == NULL) {
 		printf("unable to alloc CCB for rescan\n");
 		return;
 	}
 
 	if (xpt_create_path(&ccb->ccb_h.path, NULL, pathid, targetid,
 	    CAM_LUN_WILDCARD) != CAM_REQ_CMP) {
 		printf("unable to create path for rescan, pathid: %d,"
 		    "targetid: %d\n", pathid, targetid);
 		xpt_free_ccb(ccb);
 		return;
 	}
 
 	if (targetid == CAM_TARGET_WILDCARD)
 		ccb->ccb_h.func_code = XPT_SCAN_BUS;
 	else
 		ccb->ccb_h.func_code = XPT_SCAN_TGT;
 
 	xpt_rescan(ccb);
 }
 
 static void
 hv_storvsc_on_channel_callback(void *context)
 {
 	int ret = 0;
 	hv_vmbus_channel *channel = (hv_vmbus_channel *)context;
 	struct hv_device *device = NULL;
 	struct storvsc_softc *sc;
 	uint32_t bytes_recvd;
 	uint64_t request_id;
 	uint8_t packet[roundup2(sizeof(struct vstor_packet), 8)];
 	struct hv_storvsc_request *request;
 	struct vstor_packet *vstor_packet;
 
 	if (channel->primary_channel != NULL){
 		device = channel->primary_channel->device;
 	} else {
 		device = channel->device;
 	}
 
 	KASSERT(device, ("device is NULL"));
 
 	sc = get_stor_device(device, FALSE);
 	if (sc == NULL) {
 		printf("Storvsc_error: get stor device failed.\n");
 		return;
 	}
 
 	ret = hv_vmbus_channel_recv_packet(
 			channel,
 			packet,
 			roundup2(VSTOR_PKT_SIZE, 8),
 			&bytes_recvd,
 			&request_id);
 
 	while ((ret == 0) && (bytes_recvd > 0)) {
 		request = (struct hv_storvsc_request *)(uintptr_t)request_id;
 
 		if ((request == &sc->hs_init_req) ||
 			(request == &sc->hs_reset_req)) {
 			memcpy(&request->vstor_packet, packet,
 				   sizeof(struct vstor_packet));
 			sema_post(&request->synch_sema);
 		} else {
 			vstor_packet = (struct vstor_packet *)packet;
 			switch(vstor_packet->operation) {
 			case VSTOR_OPERATION_COMPLETEIO:
 				if (request == NULL)
 					panic("VMBUS: storvsc received a "
 					    "packet with NULL request id in "
 					    "COMPLETEIO operation.");
 
 				hv_storvsc_on_iocompletion(sc,
 							vstor_packet, request);
 				break;
 			case VSTOR_OPERATION_REMOVEDEVICE:
 				printf("VMBUS: storvsc operation %d not "
 				    "implemented.\n", vstor_packet->operation);
 				/* TODO: implement */
 				break;
 			case VSTOR_OPERATION_ENUMERATE_BUS:
 				hv_storvsc_rescan_target(sc);
 				break;
 			default:
 				break;
 			}			
 		}
 		ret = hv_vmbus_channel_recv_packet(
 				channel,
 				packet,
 				roundup2(VSTOR_PKT_SIZE, 8),
 				&bytes_recvd,
 				&request_id);
 	}
 }
 
 /**
  * @brief StorVSC probe function
  *
  * Device probe function.  Returns 0 if the input device is a StorVSC
  * device.  Otherwise, a ENXIO is returned.  If the input device is
  * for BlkVSC (paravirtual IDE) device and this support is disabled in
  * favor of the emulated ATA/IDE device, return ENXIO.
  *
  * @param a device
  * @returns 0 on success, ENXIO if not a matcing StorVSC device
  */
 static int
 storvsc_probe(device_t dev)
 {
 	int ata_disk_enable = 0;
 	int ret	= ENXIO;
 	
-	if (hv_vmbus_protocal_version == HV_VMBUS_VERSION_WS2008 ||
-	    hv_vmbus_protocal_version == HV_VMBUS_VERSION_WIN7) {
-		sense_buffer_size = PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE;
-		vmscsi_size_delta = sizeof(struct vmscsi_win8_extension);
-		storvsc_current_major = STORVSC_WIN7_MAJOR;
-		storvsc_current_minor = STORVSC_WIN7_MINOR;
-	} else {
-		sense_buffer_size = POST_WIN7_STORVSC_SENSE_BUFFER_SIZE;
-		vmscsi_size_delta = 0;
-		storvsc_current_major = STORVSC_WIN8_MAJOR;
-		storvsc_current_minor = STORVSC_WIN8_MINOR;
-	}
-	
 	switch (storvsc_get_storage_type(dev)) {
 	case DRIVER_BLKVSC:
 		if(bootverbose)
 			device_printf(dev, "DRIVER_BLKVSC-Emulated ATA/IDE probe\n");
 		if (!getenv_int("hw.ata.disk_enable", &ata_disk_enable)) {
 			if(bootverbose)
 				device_printf(dev,
 					"Enlightened ATA/IDE detected\n");
 			ret = BUS_PROBE_DEFAULT;
 		} else if(bootverbose)
 			device_printf(dev, "Emulated ATA/IDE set (hw.ata.disk_enable set)\n");
 		break;
 	case DRIVER_STORVSC:
 		if(bootverbose)
 			device_printf(dev, "Enlightened SCSI device detected\n");
 		ret = BUS_PROBE_DEFAULT;
 		break;
 	default:
 		ret = ENXIO;
 	}
 	return (ret);
 }
 
 /**
  * @brief StorVSC attach function
  *
  * Function responsible for allocating per-device structures,
  * setting up CAM interfaces and scanning for available LUNs to
  * be used for SCSI device peripherals.
  *
  * @param a device
  * @returns 0 on success or an error on failure
  */
 static int
 storvsc_attach(device_t dev)
 {
 	struct hv_device *hv_dev = vmbus_get_devctx(dev);
 	enum hv_storage_type stor_type;
 	struct storvsc_softc *sc;
 	struct cam_devq *devq;
 	int ret, i, j;
 	struct hv_storvsc_request *reqp;
 	struct root_hold_token *root_mount_token = NULL;
 	struct hv_sgl_node *sgl_node = NULL;
 	void *tmp_buff = NULL;
 
 	/*
 	 * We need to serialize storvsc attach calls.
 	 */
 	root_mount_token = root_mount_hold("storvsc");
 
 	sc = device_get_softc(dev);
 	if (sc == NULL) {
 		ret = ENOMEM;
 		goto cleanup;
 	}
 
 	stor_type = storvsc_get_storage_type(dev);
 
 	if (stor_type == DRIVER_UNKNOWN) {
 		ret = ENODEV;
 		goto cleanup;
 	}
 
 	bzero(sc, sizeof(struct storvsc_softc));
 
 	/* fill in driver specific properties */
 	sc->hs_drv_props = &g_drv_props_table[stor_type];
 
 	/* fill in device specific properties */
 	sc->hs_unit	= device_get_unit(dev);
 	sc->hs_dev	= hv_dev;
 	device_set_desc(dev, g_drv_props_table[stor_type].drv_desc);
 
 	LIST_INIT(&sc->hs_free_list);
 	mtx_init(&sc->hs_lock, "hvslck", NULL, MTX_DEF);
 
 	for (i = 0; i < sc->hs_drv_props->drv_max_ios_per_target; ++i) {
 		reqp = malloc(sizeof(struct hv_storvsc_request),
 				 M_DEVBUF, M_WAITOK|M_ZERO);
 		reqp->softc = sc;
 
 		LIST_INSERT_HEAD(&sc->hs_free_list, reqp, link);
 	}
 
 	/* create sg-list page pool */
 	if (FALSE == g_hv_sgl_page_pool.is_init) {
 		g_hv_sgl_page_pool.is_init = TRUE;
 		LIST_INIT(&g_hv_sgl_page_pool.in_use_sgl_list);
 		LIST_INIT(&g_hv_sgl_page_pool.free_sgl_list);
 
 		/*
 		 * Pre-create SG list, each SG list with
 		 * HV_MAX_MULTIPAGE_BUFFER_COUNT segments, each
 		 * segment has one page buffer
 		 */
 		for (i = 0; i < STORVSC_MAX_IO_REQUESTS; i++) {
 	        	sgl_node = malloc(sizeof(struct hv_sgl_node),
 			    M_DEVBUF, M_WAITOK|M_ZERO);
 
 			sgl_node->sgl_data =
 			    sglist_alloc(HV_MAX_MULTIPAGE_BUFFER_COUNT,
 			    M_WAITOK|M_ZERO);
 
 			for (j = 0; j < HV_MAX_MULTIPAGE_BUFFER_COUNT; j++) {
 				tmp_buff = malloc(PAGE_SIZE,
 				    M_DEVBUF, M_WAITOK|M_ZERO);
 
 				sgl_node->sgl_data->sg_segs[j].ss_paddr =
 				    (vm_paddr_t)tmp_buff;
 			}
 
 			LIST_INSERT_HEAD(&g_hv_sgl_page_pool.free_sgl_list,
 			    sgl_node, link);
 		}
 	}
 
 	sc->hs_destroy = FALSE;
 	sc->hs_drain_notify = FALSE;
 	sc->hs_open_multi_channel = FALSE;
 	sema_init(&sc->hs_drain_sema, 0, "Store Drain Sema");
 
 	ret = hv_storvsc_connect_vsp(hv_dev);
 	if (ret != 0) {
 		goto cleanup;
 	}
 
 	/*
 	 * Create the device queue.
 	 * Hyper-V maps each target to one SCSI HBA
 	 */
 	devq = cam_simq_alloc(sc->hs_drv_props->drv_max_ios_per_target);
 	if (devq == NULL) {
 		device_printf(dev, "Failed to alloc device queue\n");
 		ret = ENOMEM;
 		goto cleanup;
 	}
 
 	sc->hs_sim = cam_sim_alloc(storvsc_action,
 				storvsc_poll,
 				sc->hs_drv_props->drv_name,
 				sc,
 				sc->hs_unit,
 				&sc->hs_lock, 1,
 				sc->hs_drv_props->drv_max_ios_per_target,
 				devq);
 
 	if (sc->hs_sim == NULL) {
 		device_printf(dev, "Failed to alloc sim\n");
 		cam_simq_free(devq);
 		ret = ENOMEM;
 		goto cleanup;
 	}
 
 	mtx_lock(&sc->hs_lock);
 	/* bus_id is set to 0, need to get it from VMBUS channel query? */
 	if (xpt_bus_register(sc->hs_sim, dev, 0) != CAM_SUCCESS) {
 		cam_sim_free(sc->hs_sim, /*free_devq*/TRUE);
 		mtx_unlock(&sc->hs_lock);
 		device_printf(dev, "Unable to register SCSI bus\n");
 		ret = ENXIO;
 		goto cleanup;
 	}
 
 	if (xpt_create_path(&sc->hs_path, /*periph*/NULL,
 		 cam_sim_path(sc->hs_sim),
 		CAM_TARGET_WILDCARD, CAM_LUN_WILDCARD) != CAM_REQ_CMP) {
 		xpt_bus_deregister(cam_sim_path(sc->hs_sim));
 		cam_sim_free(sc->hs_sim, /*free_devq*/TRUE);
 		mtx_unlock(&sc->hs_lock);
 		device_printf(dev, "Unable to create path\n");
 		ret = ENXIO;
 		goto cleanup;
 	}
 
 	mtx_unlock(&sc->hs_lock);
 
 	root_mount_rel(root_mount_token);
 	return (0);
 
 
 cleanup:
 	root_mount_rel(root_mount_token);
 	while (!LIST_EMPTY(&sc->hs_free_list)) {
 		reqp = LIST_FIRST(&sc->hs_free_list);
 		LIST_REMOVE(reqp, link);
 		free(reqp, M_DEVBUF);
 	}
 
 	while (!LIST_EMPTY(&g_hv_sgl_page_pool.free_sgl_list)) {
 		sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list);
 		LIST_REMOVE(sgl_node, link);
 		for (j = 0; j < HV_MAX_MULTIPAGE_BUFFER_COUNT; j++) {
 			if (NULL !=
 			    (void*)sgl_node->sgl_data->sg_segs[j].ss_paddr) {
 				free((void*)sgl_node->sgl_data->sg_segs[j].ss_paddr, M_DEVBUF);
 			}
 		}
 		sglist_free(sgl_node->sgl_data);
 		free(sgl_node, M_DEVBUF);
 	}
 
 	return (ret);
 }
 
 /**
  * @brief StorVSC device detach function
  *
  * This function is responsible for safely detaching a
  * StorVSC device.  This includes waiting for inbound responses
  * to complete and freeing associated per-device structures.
  *
  * @param dev a device
  * returns 0 on success
  */
 static int
 storvsc_detach(device_t dev)
 {
 	struct storvsc_softc *sc = device_get_softc(dev);
 	struct hv_storvsc_request *reqp = NULL;
 	struct hv_device *hv_device = vmbus_get_devctx(dev);
 	struct hv_sgl_node *sgl_node = NULL;
 	int j = 0;
 
 	mtx_lock(&hv_device->channel->inbound_lock);
 	sc->hs_destroy = TRUE;
 	mtx_unlock(&hv_device->channel->inbound_lock);
 
 	/*
 	 * At this point, all outbound traffic should be disabled. We
 	 * only allow inbound traffic (responses) to proceed so that
 	 * outstanding requests can be completed.
 	 */
 
 	sc->hs_drain_notify = TRUE;
 	sema_wait(&sc->hs_drain_sema);
 	sc->hs_drain_notify = FALSE;
 
 	/*
 	 * Since we have already drained, we don't need to busy wait.
 	 * The call to close the channel will reset the callback
 	 * under the protection of the incoming channel lock.
 	 */
 
 	hv_vmbus_channel_close(hv_device->channel);
 
 	mtx_lock(&sc->hs_lock);
 	while (!LIST_EMPTY(&sc->hs_free_list)) {
 		reqp = LIST_FIRST(&sc->hs_free_list);
 		LIST_REMOVE(reqp, link);
 
 		free(reqp, M_DEVBUF);
 	}
 	mtx_unlock(&sc->hs_lock);
 
 	while (!LIST_EMPTY(&g_hv_sgl_page_pool.free_sgl_list)) {
 		sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list);
 		LIST_REMOVE(sgl_node, link);
 		for (j = 0; j < HV_MAX_MULTIPAGE_BUFFER_COUNT; j++){
 			if (NULL !=
 			    (void*)sgl_node->sgl_data->sg_segs[j].ss_paddr) {
 				free((void*)sgl_node->sgl_data->sg_segs[j].ss_paddr, M_DEVBUF);
 			}
 		}
 		sglist_free(sgl_node->sgl_data);
 		free(sgl_node, M_DEVBUF);
 	}
 	
 	return (0);
 }
 
 #if HVS_TIMEOUT_TEST
 /**
  * @brief unit test for timed out operations
  *
  * This function provides unit testing capability to simulate
  * timed out operations.  Recompilation with HV_TIMEOUT_TEST=1
  * is required.
  *
  * @param reqp pointer to a request structure
  * @param opcode SCSI operation being performed
  * @param wait if 1, wait for I/O to complete
  */
 static void
 storvsc_timeout_test(struct hv_storvsc_request *reqp,
 		uint8_t opcode, int wait)
 {
 	int ret;
 	union ccb *ccb = reqp->ccb;
 	struct storvsc_softc *sc = reqp->softc;
 
 	if (reqp->vstor_packet.vm_srb.cdb[0] != opcode) {
 		return;
 	}
 
 	if (wait) {
 		mtx_lock(&reqp->event.mtx);
 	}
 	ret = hv_storvsc_io_request(sc->hs_dev, reqp);
 	if (ret != 0) {
 		if (wait) {
 			mtx_unlock(&reqp->event.mtx);
 		}
 		printf("%s: io_request failed with %d.\n",
 				__func__, ret);
 		ccb->ccb_h.status = CAM_PROVIDE_FAIL;
 		mtx_lock(&sc->hs_lock);
 		storvsc_free_request(sc, reqp);
 		xpt_done(ccb);
 		mtx_unlock(&sc->hs_lock);
 		return;
 	}
 
 	if (wait) {
 		xpt_print(ccb->ccb_h.path,
 				"%u: %s: waiting for IO return.\n",
 				ticks, __func__);
 		ret = cv_timedwait(&reqp->event.cv, &reqp->event.mtx, 60*hz);
 		mtx_unlock(&reqp->event.mtx);
 		xpt_print(ccb->ccb_h.path, "%u: %s: %s.\n",
 				ticks, __func__, (ret == 0)?
 				"IO return detected" :
 				"IO return not detected");
 		/*
 		 * Now both the timer handler and io done are running
 		 * simultaneously. We want to confirm the io done always
 		 * finishes after the timer handler exits. So reqp used by
 		 * timer handler is not freed or stale. Do busy loop for
 		 * another 1/10 second to make sure io done does
 		 * wait for the timer handler to complete.
 		 */
 		DELAY(100*1000);
 		mtx_lock(&sc->hs_lock);
 		xpt_print(ccb->ccb_h.path,
 				"%u: %s: finishing, queue frozen %d, "
 				"ccb status 0x%x scsi_status 0x%x.\n",
 				ticks, __func__, sc->hs_frozen,
 				ccb->ccb_h.status,
 				ccb->csio.scsi_status);
 		mtx_unlock(&sc->hs_lock);
 	}
 }
 #endif /* HVS_TIMEOUT_TEST */
 
+#ifdef notyet
 /**
  * @brief timeout handler for requests
  *
  * This function is called as a result of a callout expiring.
  *
  * @param arg pointer to a request
  */
 static void
 storvsc_timeout(void *arg)
 {
 	struct hv_storvsc_request *reqp = arg;
 	struct storvsc_softc *sc = reqp->softc;
 	union ccb *ccb = reqp->ccb;
 
 	if (reqp->retries == 0) {
 		mtx_lock(&sc->hs_lock);
 		xpt_print(ccb->ccb_h.path,
 		    "%u: IO timed out (req=0x%p), wait for another %u secs.\n",
 		    ticks, reqp, ccb->ccb_h.timeout / 1000);
 		cam_error_print(ccb, CAM_ESF_ALL, CAM_EPF_ALL);
 		mtx_unlock(&sc->hs_lock);
 
 		reqp->retries++;
 		callout_reset_sbt(&reqp->callout, SBT_1MS * ccb->ccb_h.timeout,
 		    0, storvsc_timeout, reqp, 0);
 #if HVS_TIMEOUT_TEST
 		storvsc_timeout_test(reqp, SEND_DIAGNOSTIC, 0);
 #endif
 		return;
 	}
 
 	mtx_lock(&sc->hs_lock);
 	xpt_print(ccb->ccb_h.path,
 		"%u: IO (reqp = 0x%p) did not return for %u seconds, %s.\n",
 		ticks, reqp, ccb->ccb_h.timeout * (reqp->retries+1) / 1000,
 		(sc->hs_frozen == 0)?
 		"freezing the queue" : "the queue is already frozen");
 	if (sc->hs_frozen == 0) {
 		sc->hs_frozen = 1;
 		xpt_freeze_simq(xpt_path_sim(ccb->ccb_h.path), 1);
 	}
 	mtx_unlock(&sc->hs_lock);
 	
 #if HVS_TIMEOUT_TEST
 	storvsc_timeout_test(reqp, MODE_SELECT_10, 1);
 #endif
 }
+#endif
 
 /**
  * @brief StorVSC device poll function
  *
  * This function is responsible for servicing requests when
  * interrupts are disabled (i.e when we are dumping core.)
  *
  * @param sim a pointer to a CAM SCSI interface module
  */
 static void
 storvsc_poll(struct cam_sim *sim)
 {
 	struct storvsc_softc *sc = cam_sim_softc(sim);
 
 	mtx_assert(&sc->hs_lock, MA_OWNED);
 	mtx_unlock(&sc->hs_lock);
 	hv_storvsc_on_channel_callback(sc->hs_dev->channel);
 	mtx_lock(&sc->hs_lock);
 }
 
 /**
  * @brief StorVSC device action function
  *
  * This function is responsible for handling SCSI operations which
  * are passed from the CAM layer.  The requests are in the form of
  * CAM control blocks which indicate the action being performed.
  * Not all actions require converting the request to a VSCSI protocol
  * message - these actions can be responded to by this driver.
  * Requests which are destined for a backend storage device are converted
  * to a VSCSI protocol message and sent on the channel connection associated
  * with this device.
  *
  * @param sim pointer to a CAM SCSI interface module
  * @param ccb pointer to a CAM control block
  */
 static void
 storvsc_action(struct cam_sim *sim, union ccb *ccb)
 {
 	struct storvsc_softc *sc = cam_sim_softc(sim);
 	int res;
 
 	mtx_assert(&sc->hs_lock, MA_OWNED);
 	switch (ccb->ccb_h.func_code) {
 	case XPT_PATH_INQ: {
 		struct ccb_pathinq *cpi = &ccb->cpi;
 
 		cpi->version_num = 1;
 		cpi->hba_inquiry = PI_TAG_ABLE|PI_SDTR_ABLE;
 		cpi->target_sprt = 0;
 		cpi->hba_misc = PIM_NOBUSRESET;
 		cpi->hba_eng_cnt = 0;
 		cpi->max_target = STORVSC_MAX_TARGETS;
 		cpi->max_lun = sc->hs_drv_props->drv_max_luns_per_target;
 		cpi->initiator_id = cpi->max_target;
 		cpi->bus_id = cam_sim_bus(sim);
 		cpi->base_transfer_speed = 300000;
 		cpi->transport = XPORT_SAS;
 		cpi->transport_version = 0;
 		cpi->protocol = PROTO_SCSI;
 		cpi->protocol_version = SCSI_REV_SPC2;
 		strncpy(cpi->sim_vid, "FreeBSD", SIM_IDLEN);
 		strncpy(cpi->hba_vid, sc->hs_drv_props->drv_name, HBA_IDLEN);
 		strncpy(cpi->dev_name, cam_sim_name(sim), DEV_IDLEN);
 		cpi->unit_number = cam_sim_unit(sim);
 
 		ccb->ccb_h.status = CAM_REQ_CMP;
 		xpt_done(ccb);
 		return;
 	}
 	case XPT_GET_TRAN_SETTINGS: {
 		struct  ccb_trans_settings *cts = &ccb->cts;
 
 		cts->transport = XPORT_SAS;
 		cts->transport_version = 0;
 		cts->protocol = PROTO_SCSI;
 		cts->protocol_version = SCSI_REV_SPC2;
 
 		/* enable tag queuing and disconnected mode */
 		cts->proto_specific.valid = CTS_SCSI_VALID_TQ;
 		cts->proto_specific.scsi.valid = CTS_SCSI_VALID_TQ;
 		cts->proto_specific.scsi.flags = CTS_SCSI_FLAGS_TAG_ENB;
 		cts->xport_specific.valid = CTS_SPI_VALID_DISC;
 		cts->xport_specific.spi.flags = CTS_SPI_FLAGS_DISC_ENB;
 			
 		ccb->ccb_h.status = CAM_REQ_CMP;
 		xpt_done(ccb);
 		return;
 	}
 	case XPT_SET_TRAN_SETTINGS:	{
 		ccb->ccb_h.status = CAM_REQ_CMP;
 		xpt_done(ccb);
 		return;
 	}
 	case XPT_CALC_GEOMETRY:{
 		cam_calc_geometry(&ccb->ccg, 1);
 		xpt_done(ccb);
 		return;
 	}
 	case  XPT_RESET_BUS:
 	case  XPT_RESET_DEV:{
 #if HVS_HOST_RESET
 		if ((res = hv_storvsc_host_reset(sc->hs_dev)) != 0) {
 			xpt_print(ccb->ccb_h.path,
 				"hv_storvsc_host_reset failed with %d\n", res);
 			ccb->ccb_h.status = CAM_PROVIDE_FAIL;
 			xpt_done(ccb);
 			return;
 		}
 		ccb->ccb_h.status = CAM_REQ_CMP;
 		xpt_done(ccb);
 		return;
 #else
 		xpt_print(ccb->ccb_h.path,
 				  "%s reset not supported.\n",
 				  (ccb->ccb_h.func_code == XPT_RESET_BUS)?
 				  "bus" : "dev");
 		ccb->ccb_h.status = CAM_REQ_INVALID;
 		xpt_done(ccb);
 		return;
 #endif	/* HVS_HOST_RESET */
 	}
 	case XPT_SCSI_IO:
 	case XPT_IMMED_NOTIFY: {
 		struct hv_storvsc_request *reqp = NULL;
 
 		if (ccb->csio.cdb_len == 0) {
 			panic("cdl_len is 0\n");
 		}
 
 		if (LIST_EMPTY(&sc->hs_free_list)) {
 			ccb->ccb_h.status = CAM_REQUEUE_REQ;
 			if (sc->hs_frozen == 0) {
 				sc->hs_frozen = 1;
 				xpt_freeze_simq(sim, /* count*/1);
 			}
 			xpt_done(ccb);
 			return;
 		}
 
 		reqp = LIST_FIRST(&sc->hs_free_list);
 		LIST_REMOVE(reqp, link);
 
 		bzero(reqp, sizeof(struct hv_storvsc_request));
 		reqp->softc = sc;
 		
 		ccb->ccb_h.status |= CAM_SIM_QUEUED;
 		if ((res = create_storvsc_request(ccb, reqp)) != 0) {
 			ccb->ccb_h.status = CAM_REQ_INVALID;
 			xpt_done(ccb);
 			return;
 		}
 
+#ifdef notyet
 		if (ccb->ccb_h.timeout != CAM_TIME_INFINITY) {
 			callout_init(&reqp->callout, CALLOUT_MPSAFE);
 			callout_reset_sbt(&reqp->callout,
 			    SBT_1MS * ccb->ccb_h.timeout, 0,
 			    storvsc_timeout, reqp, 0);
 #if HVS_TIMEOUT_TEST
 			cv_init(&reqp->event.cv, "storvsc timeout cv");
 			mtx_init(&reqp->event.mtx, "storvsc timeout mutex",
 					NULL, MTX_DEF);
 			switch (reqp->vstor_packet.vm_srb.cdb[0]) {
 				case MODE_SELECT_10:
 				case SEND_DIAGNOSTIC:
 					/* To have timer send the request. */
 					return;
 				default:
 					break;
 			}
 #endif /* HVS_TIMEOUT_TEST */
 		}
+#endif
 
 		if ((res = hv_storvsc_io_request(sc->hs_dev, reqp)) != 0) {
 			xpt_print(ccb->ccb_h.path,
 				"hv_storvsc_io_request failed with %d\n", res);
 			ccb->ccb_h.status = CAM_PROVIDE_FAIL;
 			storvsc_free_request(sc, reqp);
 			xpt_done(ccb);
 			return;
 		}
 		return;
 	}
 
 	default:
 		ccb->ccb_h.status = CAM_REQ_INVALID;
 		xpt_done(ccb);
 		return;
 	}
 }
 
 /**
  * @brief destroy bounce buffer
  *
  * This function is responsible for destroy a Scatter/Gather list
  * that create by storvsc_create_bounce_buffer()
  *
  * @param sgl- the Scatter/Gather need be destroy
  * @param sg_count- page count of the SG list.
  *
  */
 static void
 storvsc_destroy_bounce_buffer(struct sglist *sgl)
 {
 	struct hv_sgl_node *sgl_node = NULL;
 
 	sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.in_use_sgl_list);
 	LIST_REMOVE(sgl_node, link);
 	if (NULL == sgl_node) {
 		printf("storvsc error: not enough in use sgl\n");
 		return;
 	}
 	sgl_node->sgl_data = sgl;
 	LIST_INSERT_HEAD(&g_hv_sgl_page_pool.free_sgl_list, sgl_node, link);
 }
 
 /**
  * @brief create bounce buffer
  *
  * This function is responsible for create a Scatter/Gather list,
  * which hold several pages that can be aligned with page size.
  *
  * @param seg_count- SG-list segments count
  * @param write - if WRITE_TYPE, set SG list page used size to 0,
  * otherwise set used size to page size.
  *
  * return NULL if create failed
  */
 static struct sglist *
 storvsc_create_bounce_buffer(uint16_t seg_count, int write)
 {
 	int i = 0;
 	struct sglist *bounce_sgl = NULL;
 	unsigned int buf_len = ((write == WRITE_TYPE) ? 0 : PAGE_SIZE);
 	struct hv_sgl_node *sgl_node = NULL;	
 
 	/* get struct sglist from free_sgl_list */
 	sgl_node = LIST_FIRST(&g_hv_sgl_page_pool.free_sgl_list);
 	LIST_REMOVE(sgl_node, link);
 	if (NULL == sgl_node) {
 		printf("storvsc error: not enough free sgl\n");
 		return NULL;
 	}
 	bounce_sgl = sgl_node->sgl_data;
 	LIST_INSERT_HEAD(&g_hv_sgl_page_pool.in_use_sgl_list, sgl_node, link);
 
 	bounce_sgl->sg_maxseg = seg_count;
 
 	if (write == WRITE_TYPE)
 		bounce_sgl->sg_nseg = 0;
 	else
 		bounce_sgl->sg_nseg = seg_count;
 
 	for (i = 0; i < seg_count; i++)
 	        bounce_sgl->sg_segs[i].ss_len = buf_len;
 
 	return bounce_sgl;
 }
 
 /**
  * @brief copy data from SG list to bounce buffer
  *
  * This function is responsible for copy data from one SG list's segments
  * to another SG list which used as bounce buffer.
  *
  * @param bounce_sgl - the destination SG list
  * @param orig_sgl - the segment of the source SG list.
  * @param orig_sgl_count - the count of segments.
  * @param orig_sgl_count - indicate which segment need bounce buffer,
  *  set 1 means need.
  *
  */
 static void
 storvsc_copy_sgl_to_bounce_buf(struct sglist *bounce_sgl,
 			       bus_dma_segment_t *orig_sgl,
 			       unsigned int orig_sgl_count,
 			       uint64_t seg_bits)
 {
 	int src_sgl_idx = 0;
 
 	for (src_sgl_idx = 0; src_sgl_idx < orig_sgl_count; src_sgl_idx++) {
 		if (seg_bits & (1 << src_sgl_idx)) {
 			memcpy((void*)bounce_sgl->sg_segs[src_sgl_idx].ss_paddr,
 			    (void*)orig_sgl[src_sgl_idx].ds_addr,
 			    orig_sgl[src_sgl_idx].ds_len);
 
 			bounce_sgl->sg_segs[src_sgl_idx].ss_len =
 			    orig_sgl[src_sgl_idx].ds_len;
 		}
 	}
 }
 
 /**
  * @brief copy data from SG list which used as bounce to another SG list
  *
  * This function is responsible for copy data from one SG list with bounce
  * buffer to another SG list's segments.
  *
  * @param dest_sgl - the destination SG list's segments
  * @param dest_sgl_count - the count of destination SG list's segment.
  * @param src_sgl - the source SG list.
  * @param seg_bits - indicate which segment used bounce buffer of src SG-list.
  *
  */
 void
 storvsc_copy_from_bounce_buf_to_sgl(bus_dma_segment_t *dest_sgl,
 				    unsigned int dest_sgl_count,
 				    struct sglist* src_sgl,
 				    uint64_t seg_bits)
 {
 	int sgl_idx = 0;
 	
 	for (sgl_idx = 0; sgl_idx < dest_sgl_count; sgl_idx++) {
 		if (seg_bits & (1 << sgl_idx)) {
 			memcpy((void*)(dest_sgl[sgl_idx].ds_addr),
 			    (void*)(src_sgl->sg_segs[sgl_idx].ss_paddr),
 			    src_sgl->sg_segs[sgl_idx].ss_len);
 		}
 	}
 }
 
 /**
  * @brief check SG list with bounce buffer or not
  *
  * This function is responsible for check if need bounce buffer for SG list.
  *
  * @param sgl - the SG list's segments
  * @param sg_count - the count of SG list's segment.
  * @param bits - segmengs number that need bounce buffer
  *
  * return -1 if SG list needless bounce buffer
  */
 static int
 storvsc_check_bounce_buffer_sgl(bus_dma_segment_t *sgl,
 				unsigned int sg_count,
 				uint64_t *bits)
 {
 	int i = 0;
 	int offset = 0;
 	uint64_t phys_addr = 0;
 	uint64_t tmp_bits = 0;
 	boolean_t found_hole = FALSE;
 	boolean_t pre_aligned = TRUE;
 
 	if (sg_count < 2){
 		return -1;
 	}
 
 	*bits = 0;
 	
 	phys_addr = vtophys(sgl[0].ds_addr);
 	offset =  phys_addr - trunc_page(phys_addr);
 
 	if (offset != 0) {
 		pre_aligned = FALSE;
 		tmp_bits |= 1;
 	}
 
 	for (i = 1; i < sg_count; i++) {
 		phys_addr = vtophys(sgl[i].ds_addr);
 		offset =  phys_addr - trunc_page(phys_addr);
 
 		if (offset == 0) {
 			if (FALSE == pre_aligned){
 				/*
 				 * This segment is aligned, if the previous
 				 * one is not aligned, find a hole
 				 */
 				found_hole = TRUE;
 			}
 			pre_aligned = TRUE;
 		} else {
 			tmp_bits |= 1 << i;
 			if (!pre_aligned) {
 				if (phys_addr != vtophys(sgl[i-1].ds_addr +
 				    sgl[i-1].ds_len)) {
 					/*
 					 * Check whether connect to previous
 					 * segment,if not, find the hole
 					 */
 					found_hole = TRUE;
 				}
 			} else {
 				found_hole = TRUE;
 			}
 			pre_aligned = FALSE;
 		}
 	}
 
 	if (!found_hole) {
 		return (-1);
 	} else {
 		*bits = tmp_bits;
 		return 0;
 	}
 }
 
 /**
  * @brief Fill in a request structure based on a CAM control block
  *
  * Fills in a request structure based on the contents of a CAM control
  * block.  The request structure holds the payload information for
  * VSCSI protocol request.
  *
  * @param ccb pointer to a CAM contorl block
  * @param reqp pointer to a request structure
  */
 static int
 create_storvsc_request(union ccb *ccb, struct hv_storvsc_request *reqp)
 {
 	struct ccb_scsiio *csio = &ccb->csio;
 	uint64_t phys_addr;
 	uint32_t bytes_to_copy = 0;
 	uint32_t pfn_num = 0;
 	uint32_t pfn;
 	uint64_t not_aligned_seg_bits = 0;
 	
 	/* refer to struct vmscsi_req for meanings of these two fields */
 	reqp->vstor_packet.u.vm_srb.port =
 		cam_sim_unit(xpt_path_sim(ccb->ccb_h.path));
 	reqp->vstor_packet.u.vm_srb.path_id =
 		cam_sim_bus(xpt_path_sim(ccb->ccb_h.path));
 
 	reqp->vstor_packet.u.vm_srb.target_id = ccb->ccb_h.target_id;
 	reqp->vstor_packet.u.vm_srb.lun = ccb->ccb_h.target_lun;
 
 	reqp->vstor_packet.u.vm_srb.cdb_len = csio->cdb_len;
 	if(ccb->ccb_h.flags & CAM_CDB_POINTER) {
 		memcpy(&reqp->vstor_packet.u.vm_srb.u.cdb, csio->cdb_io.cdb_ptr,
 			csio->cdb_len);
 	} else {
 		memcpy(&reqp->vstor_packet.u.vm_srb.u.cdb, csio->cdb_io.cdb_bytes,
 			csio->cdb_len);
 	}
 
 	switch (ccb->ccb_h.flags & CAM_DIR_MASK) {
 	case CAM_DIR_OUT:
 		reqp->vstor_packet.u.vm_srb.data_in = WRITE_TYPE;	
 		break;
 	case CAM_DIR_IN:
 		reqp->vstor_packet.u.vm_srb.data_in = READ_TYPE;
 		break;
 	case CAM_DIR_NONE:
 		reqp->vstor_packet.u.vm_srb.data_in = UNKNOWN_TYPE;
 		break;
 	default:
 		reqp->vstor_packet.u.vm_srb.data_in = UNKNOWN_TYPE;
 		break;
 	}
 
 	reqp->sense_data     = &csio->sense_data;
 	reqp->sense_info_len = csio->sense_len;
 
 	reqp->ccb = ccb;
 
 	if (0 == csio->dxfer_len) {
 		return (0);
 	}
 
 	reqp->data_buf.length = csio->dxfer_len;
 
 	switch (ccb->ccb_h.flags & CAM_DATA_MASK) {
 	case CAM_DATA_VADDR:
 	{
 		bytes_to_copy = csio->dxfer_len;
 		phys_addr = vtophys(csio->data_ptr);
 		reqp->data_buf.offset = phys_addr & PAGE_MASK;
 		
 		while (bytes_to_copy != 0) {
 			int bytes, page_offset;
 			phys_addr =
 			    vtophys(&csio->data_ptr[reqp->data_buf.length -
 			    bytes_to_copy]);
 			pfn = phys_addr >> PAGE_SHIFT;
 			reqp->data_buf.pfn_array[pfn_num] = pfn;
 			page_offset = phys_addr & PAGE_MASK;
 
 			bytes = min(PAGE_SIZE - page_offset, bytes_to_copy);
 
 			bytes_to_copy -= bytes;
 			pfn_num++;
 		}
 		break;
 	}
 
 	case CAM_DATA_SG:
 	{
 		int i = 0;
 		int offset = 0;
 		int ret;
 
 		bus_dma_segment_t *storvsc_sglist =
 		    (bus_dma_segment_t *)ccb->csio.data_ptr;
 		u_int16_t storvsc_sg_count = ccb->csio.sglist_cnt;
 
 		printf("Storvsc: get SG I/O operation, %d\n",
 		    reqp->vstor_packet.u.vm_srb.data_in);
 
 		if (storvsc_sg_count > HV_MAX_MULTIPAGE_BUFFER_COUNT){
 			printf("Storvsc: %d segments is too much, "
 			    "only support %d segments\n",
 			    storvsc_sg_count, HV_MAX_MULTIPAGE_BUFFER_COUNT);
 			return (EINVAL);
 		}
 
 		/*
 		 * We create our own bounce buffer function currently. Idealy
 		 * we should use BUS_DMA(9) framework. But with current BUS_DMA
 		 * code there is no callback API to check the page alignment of
 		 * middle segments before busdma can decide if a bounce buffer
 		 * is needed for particular segment. There is callback,
 		 * "bus_dma_filter_t *filter", but the parrameters are not
 		 * sufficient for storvsc driver.
 		 * TODO:
 		 *	Add page alignment check in BUS_DMA(9) callback. Once
 		 *	this is complete, switch the following code to use
 		 *	BUS_DMA(9) for storvsc bounce buffer support.
 		 */
 		/* check if we need to create bounce buffer */
 		ret = storvsc_check_bounce_buffer_sgl(storvsc_sglist,
 		    storvsc_sg_count, &not_aligned_seg_bits);
 		if (ret != -1) {
 			reqp->bounce_sgl =
 			    storvsc_create_bounce_buffer(storvsc_sg_count,
 			    reqp->vstor_packet.u.vm_srb.data_in);
 			if (NULL == reqp->bounce_sgl) {
 				printf("Storvsc_error: "
 				    "create bounce buffer failed.\n");
 				return (ENOMEM);
 			}
 
 			reqp->bounce_sgl_count = storvsc_sg_count;
 			reqp->not_aligned_seg_bits = not_aligned_seg_bits;
 
 			/*
 			 * if it is write, we need copy the original data
 			 *to bounce buffer
 			 */
 			if (WRITE_TYPE == reqp->vstor_packet.u.vm_srb.data_in) {
 				storvsc_copy_sgl_to_bounce_buf(
 				    reqp->bounce_sgl,
 				    storvsc_sglist,
 				    storvsc_sg_count,
 				    reqp->not_aligned_seg_bits);
 			}
 
 			/* transfer virtual address to physical frame number */
 			if (reqp->not_aligned_seg_bits & 0x1){
  				phys_addr =
 				    vtophys(reqp->bounce_sgl->sg_segs[0].ss_paddr);
 			}else{
  				phys_addr =
 					vtophys(storvsc_sglist[0].ds_addr);
 			}
 			reqp->data_buf.offset = phys_addr & PAGE_MASK;
 
 			pfn = phys_addr >> PAGE_SHIFT;
 			reqp->data_buf.pfn_array[0] = pfn;
 			
 			for (i = 1; i < storvsc_sg_count; i++) {
 				if (reqp->not_aligned_seg_bits & (1 << i)) {
 					phys_addr =
 					    vtophys(reqp->bounce_sgl->sg_segs[i].ss_paddr);
 				} else {
 					phys_addr =
 					    vtophys(storvsc_sglist[i].ds_addr);
 				}
 
 				pfn = phys_addr >> PAGE_SHIFT;
 				reqp->data_buf.pfn_array[i] = pfn;
 			}
 		} else {
 			phys_addr = vtophys(storvsc_sglist[0].ds_addr);
 
 			reqp->data_buf.offset = phys_addr & PAGE_MASK;
 
 			for (i = 0; i < storvsc_sg_count; i++) {
 				phys_addr = vtophys(storvsc_sglist[i].ds_addr);
 				pfn = phys_addr >> PAGE_SHIFT;
 				reqp->data_buf.pfn_array[i] = pfn;
 			}
 
 			/* check the last segment cross boundary or not */
 			offset = phys_addr & PAGE_MASK;
 			if (offset) {
 				phys_addr =
 				    vtophys(storvsc_sglist[i-1].ds_addr +
 				    PAGE_SIZE - offset);
 				pfn = phys_addr >> PAGE_SHIFT;
 				reqp->data_buf.pfn_array[i] = pfn;
 			}
 			
 			reqp->bounce_sgl_count = 0;
 		}
 		break;
 	}
 	default:
 		printf("Unknow flags: %d\n", ccb->ccb_h.flags);
 		return(EINVAL);
 	}
 
 	return(0);
 }
 
 /*
- * Modified based on scsi_print_inquiry which is responsible to
- * print the detail information for scsi_inquiry_data.
- *
+ * Check the peripheral qualifier and device type of SCSI INQUIRY data.
+ * A qualifier of 011b means the device server is not capable
+ * of supporting a peripheral device on this logical unit; in that
+ * case the type should be set to 1Fh.
+ *
  * Return 1 if it is valid, 0 otherwise.
  */
 static inline int
 is_inquiry_valid(const struct scsi_inquiry_data *inq_data)
 {
 	uint8_t type;
-	char vendor[16], product[48], revision[16];
-
-	/*
-	 * Check device type and qualifier
-	 */
-	if (!(SID_QUAL_IS_VENDOR_UNIQUE(inq_data) ||
-	    SID_QUAL(inq_data) == SID_QUAL_LU_CONNECTED))
+	if (SID_QUAL(inq_data) != SID_QUAL_LU_CONNECTED) {
 		return (0);
-
+	}
 	type = SID_TYPE(inq_data);
-	switch (type) {
-	case T_DIRECT:
-	case T_SEQUENTIAL:
-	case T_PRINTER:
-	case T_PROCESSOR:
-	case T_WORM:
-	case T_CDROM:
-	case T_SCANNER:
-	case T_OPTICAL:
-	case T_CHANGER:
-	case T_COMM:
-	case T_STORARRAY:
-	case T_ENCLOSURE:
-	case T_RBC:
-	case T_OCRW:
-	case T_OSD:
-	case T_ADC:
-		break;
-	case T_NODEVICE:
-	default:
+	if (type == T_NODEVICE) {
 		return (0);
 	}
-
-	/*
-	 * Check vendor, product, and revision
-	 */
-	cam_strvis(vendor, inq_data->vendor, sizeof(inq_data->vendor),
-	    sizeof(vendor));
-	cam_strvis(product, inq_data->product, sizeof(inq_data->product),
-	    sizeof(product));
-	cam_strvis(revision, inq_data->revision, sizeof(inq_data->revision),
-	    sizeof(revision));
-	if (strlen(vendor) == 0  ||
-	    strlen(product) == 0 ||
-	    strlen(revision) == 0)
-		return (0);
-
 	return (1);
 }
 
 /**
  * @brief completion function before returning to CAM
  *
  * I/O process has been completed and the result needs
  * to be passed to the CAM layer.
  * Free resources related to this request.
  *
  * @param reqp pointer to a request structure
  */
 static void
 storvsc_io_done(struct hv_storvsc_request *reqp)
 {
 	union ccb *ccb = reqp->ccb;
 	struct ccb_scsiio *csio = &ccb->csio;
 	struct storvsc_softc *sc = reqp->softc;
 	struct vmscsi_req *vm_srb = &reqp->vstor_packet.u.vm_srb;
 	bus_dma_segment_t *ori_sglist = NULL;
 	int ori_sg_count = 0;
 
 	/* destroy bounce buffer if it is used */
 	if (reqp->bounce_sgl_count) {
 		ori_sglist = (bus_dma_segment_t *)ccb->csio.data_ptr;
 		ori_sg_count = ccb->csio.sglist_cnt;
 
 		/*
 		 * If it is READ operation, we should copy back the data
 		 * to original SG list.
 		 */
 		if (READ_TYPE == reqp->vstor_packet.u.vm_srb.data_in) {
 			storvsc_copy_from_bounce_buf_to_sgl(ori_sglist,
 			    ori_sg_count,
 			    reqp->bounce_sgl,
 			    reqp->not_aligned_seg_bits);
 		}
 
 		storvsc_destroy_bounce_buffer(reqp->bounce_sgl);
 		reqp->bounce_sgl_count = 0;
 	}
 		
 	if (reqp->retries > 0) {
 		mtx_lock(&sc->hs_lock);
 #if HVS_TIMEOUT_TEST
 		xpt_print(ccb->ccb_h.path,
 			"%u: IO returned after timeout, "
 			"waking up timer handler if any.\n", ticks);
 		mtx_lock(&reqp->event.mtx);
 		cv_signal(&reqp->event.cv);
 		mtx_unlock(&reqp->event.mtx);
 #endif
 		reqp->retries = 0;
 		xpt_print(ccb->ccb_h.path,
 			"%u: IO returned after timeout, "
 			"stopping timer if any.\n", ticks);
 		mtx_unlock(&sc->hs_lock);
 	}
 
+#ifdef notyet
 	/*
 	 * callout_drain() will wait for the timer handler to finish
 	 * if it is running. So we don't need any lock to synchronize
 	 * between this routine and the timer handler.
 	 * Note that we need to make sure reqp is not freed when timer
 	 * handler is using or will use it.
 	 */
 	if (ccb->ccb_h.timeout != CAM_TIME_INFINITY) {
 		callout_drain(&reqp->callout);
 	}
+#endif
 
 	ccb->ccb_h.status &= ~CAM_SIM_QUEUED;
 	ccb->ccb_h.status &= ~CAM_STATUS_MASK;
 	if (vm_srb->scsi_status == SCSI_STATUS_OK) {
 		const struct scsi_generic *cmd;
-
 		/*
 		 * Check whether the data for INQUIRY cmd is valid or
 		 * not.  Windows 10 and Windows 2016 send all zero
 		 * inquiry data to VM even for unpopulated slots.
 		 */
 		cmd = (const struct scsi_generic *)
 		    ((ccb->ccb_h.flags & CAM_CDB_POINTER) ?
 		     csio->cdb_io.cdb_ptr : csio->cdb_io.cdb_bytes);
-		if (cmd->opcode == INQUIRY &&
-		    is_inquiry_valid(
-		    (const struct scsi_inquiry_data *)csio->data_ptr) == 0) {
+		if (cmd->opcode == INQUIRY) {
+		    /*
+		     * A Windows 10 or Windows Server 2016 host responds to an
+		     * INQUIRY for a nonexistent device with invalid data:
+		     *	[0x7f 0x0 0x5 0x2 0x1f ... ]
+		     * whereas a Windows Server 2012 R2 host responds with:
+		     *	[0x7f 0x0 0x0 0x0 0x0 ]
+		     * The INQUIRY response is therefore validated here.
+		     * Validation is skipped for short responses, i.e. those
+		     * shorter than SHORT_INQUIRY_LENGTH (36) bytes.
+		     *
+		     * For more information about INQUIRY, please refer to:
+		     *  ftp://ftp.avc-pioneer.com/Mtfuji_7/Proposal/Jun09/INQUIRY.pdf
+		     */
+		    const struct scsi_inquiry_data *inq_data =
+			(const struct scsi_inquiry_data *)csio->data_ptr;
+		    uint8_t* resp_buf = (uint8_t*)csio->data_ptr;
+		    /* Get the buffer length reported by host */
+		    int resp_xfer_len = vm_srb->transfer_len;
+		    /* Get the available buffer length */
+		    int resp_buf_len = resp_xfer_len >= 5 ? resp_buf[4] + 5 : 0;
+		    int data_len = (resp_buf_len < resp_xfer_len) ? resp_buf_len : resp_xfer_len;
+		    if (data_len < SHORT_INQUIRY_LENGTH) {
+			ccb->ccb_h.status |= CAM_REQ_CMP;
+			if (bootverbose && data_len >= 5) {
+				mtx_lock(&sc->hs_lock);
+				xpt_print(ccb->ccb_h.path,
+				    "storvsc skips the validation for short inquiry (%d)"
+				    " [%x %x %x %x %x]\n",
+				    data_len,resp_buf[0],resp_buf[1],resp_buf[2],
+				    resp_buf[3],resp_buf[4]);
+				mtx_unlock(&sc->hs_lock);
+			}
+		    } else if (is_inquiry_valid(inq_data) == 0) {
 			ccb->ccb_h.status |= CAM_DEV_NOT_THERE;
+			if (bootverbose && data_len >= 5) {
+				mtx_lock(&sc->hs_lock);
+				xpt_print(ccb->ccb_h.path,
+				    "storvsc uninstalled invalid device"
+				    " [%x %x %x %x %x]\n",
+				resp_buf[0],resp_buf[1],resp_buf[2],resp_buf[3],resp_buf[4]);
+				mtx_unlock(&sc->hs_lock);
+			}
+		    } else {
+			ccb->ccb_h.status |= CAM_REQ_CMP;
 			if (bootverbose) {
 				mtx_lock(&sc->hs_lock);
 				xpt_print(ccb->ccb_h.path,
-				    "storvsc uninstalled device\n");
+				    "storvsc has passed inquiry response (%d) validation\n",
+				    data_len);
 				mtx_unlock(&sc->hs_lock);
 			}
+		    }
 		} else {
 			ccb->ccb_h.status |= CAM_REQ_CMP;
 		}
 	} else {
 		mtx_lock(&sc->hs_lock);
 		xpt_print(ccb->ccb_h.path,
 			"storvsc scsi_status = %d\n",
 			vm_srb->scsi_status);
 		mtx_unlock(&sc->hs_lock);
 		ccb->ccb_h.status |= CAM_SCSI_STATUS_ERROR;
 	}
 
 	ccb->csio.scsi_status = (vm_srb->scsi_status & 0xFF);
 	ccb->csio.resid = ccb->csio.dxfer_len - vm_srb->transfer_len;
 
 	if (reqp->sense_info_len != 0) {
 		csio->sense_resid = csio->sense_len - reqp->sense_info_len;
 		ccb->ccb_h.status |= CAM_AUTOSNS_VALID;
 	}
 
 	mtx_lock(&sc->hs_lock);
 	if (reqp->softc->hs_frozen == 1) {
 		xpt_print(ccb->ccb_h.path,
 			"%u: storvsc unfreezing softc 0x%p.\n",
 			ticks, reqp->softc);
 		ccb->ccb_h.status |= CAM_RELEASE_SIMQ;
 		reqp->softc->hs_frozen = 0;
 	}
 	storvsc_free_request(sc, reqp);
 	xpt_done(ccb);
 	mtx_unlock(&sc->hs_lock);
 }
 
 /**
  * @brief Free a request structure
  *
  * Free a request structure by returning it to the free list
  *
  * @param sc pointer to a softc
  * @param reqp pointer to a request structure
  */	
 static void
 storvsc_free_request(struct storvsc_softc *sc, struct hv_storvsc_request *reqp)
 {
 
 	LIST_INSERT_HEAD(&sc->hs_free_list, reqp, link);
 }
 
 /**
  * @brief Determine type of storage device from GUID
  *
  * Using the type GUID, determine if this is a StorVSC (paravirtual
  * SCSI or BlkVSC (paravirtual IDE) device.
  *
  * @param dev a device
  * returns an enum
  */
 static enum hv_storage_type
 storvsc_get_storage_type(device_t dev)
 {
 	const char *p = vmbus_get_type(dev);
 
 	if (!memcmp(p, &gBlkVscDeviceType, sizeof(hv_guid))) {
 		return DRIVER_BLKVSC;
 	} else if (!memcmp(p, &gStorVscDeviceType, sizeof(hv_guid))) {
 		return DRIVER_STORVSC;
 	}
 	return (DRIVER_UNKNOWN);
 }
 
Index: releng/10.3/sys/dev/hyperv/storvsc/hv_vstorage.h
===================================================================
--- releng/10.3/sys/dev/hyperv/storvsc/hv_vstorage.h	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/storvsc/hv_vstorage.h	(revision 303984)
@@ -1,266 +1,271 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef __HV_VSTORAGE_H__
 #define __HV_VSTORAGE_H__
 
 /*
  * Major/minor macros.  Minor version is in LSB, meaning that earlier flat
  * version numbers will be interpreted as "0.x" (i.e., 1 becomes 0.1).
  */
 
 #define VMSTOR_PROTOCOL_MAJOR(VERSION_)         (((VERSION_) >> 8) & 0xff)
 #define VMSTOR_PROTOCOL_MINOR(VERSION_)         (((VERSION_)     ) & 0xff)
 #define VMSTOR_PROTOCOL_VERSION(MAJOR_, MINOR_) ((((MAJOR_) & 0xff) << 8) | \
                                                  (((MINOR_) & 0xff)     ))
 
+#define VMSTOR_PROTOCOL_VERSION_WIN6       VMSTOR_PROTOCOL_VERSION(2, 0)
+#define VMSTOR_PROTOCOL_VERSION_WIN7       VMSTOR_PROTOCOL_VERSION(4, 2)
+#define VMSTOR_PROTOCOL_VERSION_WIN8       VMSTOR_PROTOCOL_VERSION(5, 1)
+#define VMSTOR_PROTOCOL_VERSION_WIN8_1     VMSTOR_PROTOCOL_VERSION(6, 0)
+#define VMSTOR_PROTOCOL_VERSION_WIN10      VMSTOR_PROTOCOL_VERSION(6, 2)
 /*
  * Invalid version.
  */
 #define VMSTOR_INVALID_PROTOCOL_VERSION  -1
 
 /*
  * Version history:
  * V1 Beta                    0.1
  * V1 RC < 2008/1/31          1.0
  * V1 RC > 2008/1/31          2.0
  * Win7: 4.2
  * Win8: 5.1
  */
 
 #define VMSTOR_PROTOCOL_VERSION_CURRENT	VMSTOR_PROTOCOL_VERSION(5, 1)
 
 /**
  *  Packet structure ops describing virtual storage requests.
  */
 enum vstor_packet_ops {
 	VSTOR_OPERATION_COMPLETEIO            = 1,
 	VSTOR_OPERATION_REMOVEDEVICE          = 2,
 	VSTOR_OPERATION_EXECUTESRB            = 3,
 	VSTOR_OPERATION_RESETLUN              = 4,
 	VSTOR_OPERATION_RESETADAPTER          = 5,
 	VSTOR_OPERATION_RESETBUS              = 6,
 	VSTOR_OPERATION_BEGININITIALIZATION   = 7,
 	VSTOR_OPERATION_ENDINITIALIZATION     = 8,
 	VSTOR_OPERATION_QUERYPROTOCOLVERSION  = 9,
 	VSTOR_OPERATION_QUERYPROPERTIES       = 10,
 	VSTOR_OPERATION_ENUMERATE_BUS         = 11,
 	VSTOR_OPERATION_FCHBA_DATA            = 12,
 	VSTOR_OPERATION_CREATE_MULTI_CHANNELS = 13,
 	VSTOR_OPERATION_MAXIMUM               = 13
 };
 
 
 /*
  *  Platform neutral description of a scsi request -
  *  this remains the same across the write regardless of 32/64 bit
  *  note: it's patterned off the Windows DDK SCSI_PASS_THROUGH structure
  */
 
 #define CDB16GENERIC_LENGTH			0x10
 #define SENSE_BUFFER_SIZE			0x14
 #define MAX_DATA_BUFFER_LENGTH_WITH_PADDING	0x14
 
 #define POST_WIN7_STORVSC_SENSE_BUFFER_SIZE	0x14
 #define PRE_WIN8_STORVSC_SENSE_BUFFER_SIZE	0x12
 
 
 struct vmscsi_win8_extension {
 	/*
 	 * The following were added in Windows 8
 	 */
 	uint16_t reserve;
 	uint8_t  queue_tag;
 	uint8_t  queue_action;
 	uint32_t srb_flags;
 	uint32_t time_out_value;
 	uint32_t queue_sort_ey;
 } __packed;
 
 struct vmscsi_req {
 	uint16_t length;
 	uint8_t  srb_status;
 	uint8_t  scsi_status;
 
 	/* HBA number, set to the order number detected by initiator. */
 	uint8_t  port;
 	/* SCSI bus number or bus_id, different from CAM's path_id. */
 	uint8_t  path_id;
 
 	uint8_t  target_id;
 	uint8_t  lun;
 
 	uint8_t  cdb_len;
 	uint8_t  sense_info_len;
 	uint8_t  data_in;
 	uint8_t  reserved;
 
 	uint32_t transfer_len;
 
 	union {
 	    uint8_t cdb[CDB16GENERIC_LENGTH];
 
 	    uint8_t sense_data[SENSE_BUFFER_SIZE];
 
 	    uint8_t reserved_array[MAX_DATA_BUFFER_LENGTH_WITH_PADDING];
 	} u;
 
 	/*
 	 * The following was added in win8.
 	 */
 	struct vmscsi_win8_extension win8_extension;
 
 } __packed;
 
 /**
  *  This structure is sent during the initialization phase to get the different
  *  properties of the channel.
  */
 
 struct vmstor_chan_props {
 	uint16_t proto_ver;
 	uint8_t  path_id;
 	uint8_t  target_id;
 
 	uint16_t max_channel_cnt;
 
 	/**
 	 * Note: port number is only really known on the client side
 	 */
 	uint16_t port;
 	uint32_t flags;
 	uint32_t max_transfer_bytes;
 
 	/**
 	 *  This id is unique for each channel and will correspond with
 	 *  vendor specific data in the inquiry_ata
 	 */
 	uint64_t unique_id;
 
 } __packed;
 
 /**
  *  This structure is sent during the storage protocol negotiations.
  */
 
 struct vmstor_proto_ver
 {
 	/**
 	 * Major (MSW) and minor (LSW) version numbers.
 	 */
 	uint16_t major_minor;
 
 	uint16_t revision;			/* always zero */
 } __packed;
 
 /**
  * Channel Property Flags
  */
 
 #define STORAGE_CHANNEL_REMOVABLE_FLAG                  0x1
 #define STORAGE_CHANNEL_EMULATED_IDE_FLAG               0x2
 
 
 struct vstor_packet {
 	/**
 	 * Requested operation type
 	 */
 	enum vstor_packet_ops operation;
 
 	/*
 	 * Flags - see below for values
 	 */
 	uint32_t flags;
 
 	/**
 	 * Status of the request returned from the server side.
 	 */
 	uint32_t status;
 
 	union
 	{
 	    /**
 	     * Structure used to forward SCSI commands from the client to
 	     * the server.
 	     */
 	    struct vmscsi_req vm_srb;
 
 	    /**
 	     * Structure used to query channel properties.
 	     */
 	    struct vmstor_chan_props chan_props;
 
 	    /**
 	     * Used during version negotiations.
 	     */
 	    struct vmstor_proto_ver version;
 
 	    /**
              * Number of multichannels to create
 	     */
 	    uint16_t multi_channels_cnt;
 	} u;
 
 } __packed;
 
 
 /**
  * SRB (SCSI Request Block) Status Codes
  */
 #define SRB_STATUS_PENDING		0x00
 #define SRB_STATUS_SUCCESS		0x01
 #define SRB_STATUS_ABORTED		0x02
 #define SRB_STATUS_ABORT_FAILED	0x03
 #define SRB_STATUS_ERROR 		0x04
 #define SRB_STATUS_BUSY			0x05
 
 /**
  * SRB Status Masks (can be combined with above status codes)
  */
 #define SRB_STATUS_QUEUE_FROZEN		0x40
 #define SRB_STATUS_AUTOSENSE_VALID	0x80
 
 
 /**
  *  Packet flags
  */
 
 /**
  *  This flag indicates that the server should send back a completion for this
  *  packet.
  */
 #define REQUEST_COMPLETION_FLAG	0x1
 
 /**
  *  This is the set of flags that the vsc can set in any packets it sends
  */
 #define VSC_LEGAL_FLAGS (REQUEST_COMPLETION_FLAG)
 
 #endif /* __HV_VSTORAGE_H__ */
Index: releng/10.3/sys/dev/hyperv/vmbus/hv_channel.c
===================================================================
--- releng/10.3/sys/dev/hyperv/vmbus/hv_channel.c	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/vmbus/hv_channel.c	(revision 303984)
@@ -1,882 +1,882 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/systm.h>
 #include <sys/mbuf.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <machine/bus.h>
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 
 #include "hv_vmbus_priv.h"
 
 static int 	vmbus_channel_create_gpadl_header(
 			/* must be phys and virt contiguous*/
 			void*				contig_buffer,
 			/* page-size multiple */
 			uint32_t 			size,
 			hv_vmbus_channel_msg_info**	msg_info,
 			uint32_t*			message_count);
 
 static void 	vmbus_channel_set_event(hv_vmbus_channel* channel);
 
 /**
  *  @brief Trigger an event notification on the specified channel
  */
 static void
 vmbus_channel_set_event(hv_vmbus_channel *channel)
 {
 	hv_vmbus_monitor_page *monitor_page;
 
 	if (channel->offer_msg.monitor_allocated) {
 		/* Each uint32_t represents 32 channels */
 		synch_set_bit((channel->offer_msg.child_rel_id & 31),
 			((uint32_t *)hv_vmbus_g_connection.send_interrupt_page
 				+ ((channel->offer_msg.child_rel_id >> 5))));
 
 		monitor_page = (hv_vmbus_monitor_page *)
 			hv_vmbus_g_connection.monitor_pages;
 
 		monitor_page++; /* Get the child to parent monitor page */
 
 		synch_set_bit(channel->monitor_bit,
 			(uint32_t *)&monitor_page->
 				trigger_group[channel->monitor_group].u.pending);
 	} else {
 		hv_vmbus_set_event(channel);
 	}
 
 }
 
 /**
  * @brief Open the specified channel
  */
 int
 hv_vmbus_channel_open(
 	hv_vmbus_channel*		new_channel,
 	uint32_t			send_ring_buffer_size,
 	uint32_t			recv_ring_buffer_size,
 	void*				user_data,
 	uint32_t			user_data_len,
 	hv_vmbus_pfn_channel_callback	pfn_on_channel_callback,
 	void* 				context)
 {
 
 	int ret = 0;
 	void *in, *out;
 	hv_vmbus_channel_open_channel*	open_msg;
 	hv_vmbus_channel_msg_info* 	open_info;
 
 	mtx_lock(&new_channel->sc_lock);
 	if (new_channel->state == HV_CHANNEL_OPEN_STATE) {
 	    new_channel->state = HV_CHANNEL_OPENING_STATE;
 	} else {
 	    mtx_unlock(&new_channel->sc_lock);
 	    if(bootverbose)
 		printf("VMBUS: Trying to open channel <%p> which in "
 		    "%d state.\n", new_channel, new_channel->state);
 	    return (EINVAL);
 	}
 	mtx_unlock(&new_channel->sc_lock);
 
 	new_channel->on_channel_callback = pfn_on_channel_callback;
 	new_channel->channel_callback_context = context;
 
 	/* Allocate the ring buffer */
 	out = contigmalloc((send_ring_buffer_size + recv_ring_buffer_size),
 	    M_DEVBUF, M_ZERO, 0UL, BUS_SPACE_MAXADDR, PAGE_SIZE, 0);
 	KASSERT(out != NULL,
 	    ("Error VMBUS: contigmalloc failed to allocate Ring Buffer!"));
 	if (out == NULL)
 		return (ENOMEM);
 
 	in = ((uint8_t *) out + send_ring_buffer_size);
 
 	new_channel->ring_buffer_pages = out;
 	new_channel->ring_buffer_page_count = (send_ring_buffer_size +
 	    recv_ring_buffer_size) >> PAGE_SHIFT;
 	new_channel->ring_buffer_size = send_ring_buffer_size +
 	    recv_ring_buffer_size;
 
 	hv_vmbus_ring_buffer_init(
 		&new_channel->outbound,
 		out,
 		send_ring_buffer_size);
 
 	hv_vmbus_ring_buffer_init(
 		&new_channel->inbound,
 		in,
 		recv_ring_buffer_size);
 
 	/**
 	 * Establish the gpadl for the ring buffer
 	 */
 	new_channel->ring_buffer_gpadl_handle = 0;
 
 	ret = hv_vmbus_channel_establish_gpadl(new_channel,
 		new_channel->outbound.ring_buffer,
 		send_ring_buffer_size + recv_ring_buffer_size,
 		&new_channel->ring_buffer_gpadl_handle);
 
 	/**
 	 * Create and init the channel open message
 	 */
 	open_info = (hv_vmbus_channel_msg_info*) malloc(
 		sizeof(hv_vmbus_channel_msg_info) +
 			sizeof(hv_vmbus_channel_open_channel),
 		M_DEVBUF,
 		M_NOWAIT);
 	KASSERT(open_info != NULL,
 	    ("Error VMBUS: malloc failed to allocate Open Channel message!"));
 
 	if (open_info == NULL)
 		return (ENOMEM);
 
 	sema_init(&open_info->wait_sema, 0, "Open Info Sema");
 
 	open_msg = (hv_vmbus_channel_open_channel*) open_info->msg;
 	open_msg->header.message_type = HV_CHANNEL_MESSAGE_OPEN_CHANNEL;
 	open_msg->open_id = new_channel->offer_msg.child_rel_id;
 	open_msg->child_rel_id = new_channel->offer_msg.child_rel_id;
 	open_msg->ring_buffer_gpadl_handle =
 		new_channel->ring_buffer_gpadl_handle;
 	open_msg->downstream_ring_buffer_page_offset = send_ring_buffer_size
 		>> PAGE_SHIFT;
 	open_msg->target_vcpu = new_channel->target_vcpu;
 
 	if (user_data_len)
 		memcpy(open_msg->user_data, user_data, user_data_len);
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_INSERT_TAIL(
 		&hv_vmbus_g_connection.channel_msg_anchor,
 		open_info,
 		msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	ret = hv_vmbus_post_message(
 		open_msg, sizeof(hv_vmbus_channel_open_channel));
 
 	if (ret != 0)
 	    goto cleanup;
 
 	ret = sema_timedwait(&open_info->wait_sema, 5 * hz); /* KYS 5 seconds */
 
 	if (ret) {
 	    if(bootverbose)
 		printf("VMBUS: channel <%p> open timeout.\n", new_channel);
 	    goto cleanup;
 	}
 
 	if (open_info->response.open_result.status == 0) {
 	    new_channel->state = HV_CHANNEL_OPENED_STATE;
 	    if(bootverbose)
 		printf("VMBUS: channel <%p> open success.\n", new_channel);
 	} else {
 	    if(bootverbose)
 		printf("Error VMBUS: channel <%p> open failed - %d!\n",
 			new_channel, open_info->response.open_result.status);
 	}
 
 	cleanup:
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_REMOVE(
 		&hv_vmbus_g_connection.channel_msg_anchor,
 		open_info,
 		msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 	sema_destroy(&open_info->wait_sema);
 	free(open_info, M_DEVBUF);
 
 	return (ret);
 }
 
 /**
  * @brief Create a gpadl for the specified buffer
  */
 static int
 vmbus_channel_create_gpadl_header(
 	void*				contig_buffer,
 	uint32_t			size,	/* page-size multiple */
 	hv_vmbus_channel_msg_info**	msg_info,
 	uint32_t*			message_count)
 {
 	int				i;
 	int				page_count;
 	unsigned long long 		pfn;
 	uint32_t			msg_size;
 	hv_vmbus_channel_gpadl_header*	gpa_header;
 	hv_vmbus_channel_gpadl_body*	gpadl_body;
 	hv_vmbus_channel_msg_info*	msg_header;
 	hv_vmbus_channel_msg_info*	msg_body;
 
 	int pfnSum, pfnCount, pfnLeft, pfnCurr, pfnSize;
 
 	page_count = size >> PAGE_SHIFT;
 	pfn = hv_get_phys_addr(contig_buffer) >> PAGE_SHIFT;
 
 	/*do we need a gpadl body msg */
 	pfnSize = HV_MAX_SIZE_CHANNEL_MESSAGE
 	    - sizeof(hv_vmbus_channel_gpadl_header)
 	    - sizeof(hv_gpa_range);
 	pfnCount = pfnSize / sizeof(uint64_t);
 
 	if (page_count > pfnCount) { /* if(we need a gpadl body)	*/
 	    /* fill in the header		*/
 	    msg_size = sizeof(hv_vmbus_channel_msg_info)
 		+ sizeof(hv_vmbus_channel_gpadl_header)
 		+ sizeof(hv_gpa_range)
 		+ pfnCount * sizeof(uint64_t);
 	    msg_header = malloc(msg_size, M_DEVBUF, M_NOWAIT | M_ZERO);
 	    KASSERT(
 		msg_header != NULL,
 		("Error VMBUS: malloc failed to allocate Gpadl Message!"));
 	    if (msg_header == NULL)
 		return (ENOMEM);
 
 	    TAILQ_INIT(&msg_header->sub_msg_list_anchor);
 	    msg_header->message_size = msg_size;
 
 	    gpa_header = (hv_vmbus_channel_gpadl_header*) msg_header->msg;
 	    gpa_header->range_count = 1;
 	    gpa_header->range_buf_len = sizeof(hv_gpa_range)
 		+ page_count * sizeof(uint64_t);
 	    gpa_header->range[0].byte_offset = 0;
 	    gpa_header->range[0].byte_count = size;
 	    for (i = 0; i < pfnCount; i++) {
 		gpa_header->range[0].pfn_array[i] = pfn + i;
 	    }
 	    *msg_info = msg_header;
 	    *message_count = 1;
 
 	    pfnSum = pfnCount;
 	    pfnLeft = page_count - pfnCount;
 
 	    /*
 	     *  figure out how many pfns we can fit
 	     */
 	    pfnSize = HV_MAX_SIZE_CHANNEL_MESSAGE
 		- sizeof(hv_vmbus_channel_gpadl_body);
 	    pfnCount = pfnSize / sizeof(uint64_t);
 
 	    /*
 	     * fill in the body
 	     */
 	    while (pfnLeft) {
 		if (pfnLeft > pfnCount) {
 		    pfnCurr = pfnCount;
 		} else {
 		    pfnCurr = pfnLeft;
 		}
 
 		msg_size = sizeof(hv_vmbus_channel_msg_info) +
 		    sizeof(hv_vmbus_channel_gpadl_body) +
 		    pfnCurr * sizeof(uint64_t);
 		msg_body = malloc(msg_size, M_DEVBUF, M_NOWAIT | M_ZERO);
 		KASSERT(
 		    msg_body != NULL,
 		    ("Error VMBUS: malloc failed to allocate Gpadl msg_body!"));
 		if (msg_body == NULL)
 		    return (ENOMEM);
 
 		msg_body->message_size = msg_size;
 		(*message_count)++;
 		gpadl_body =
 		    (hv_vmbus_channel_gpadl_body*) msg_body->msg;
 		/*
 		 * gpadl_body->gpadl = kbuffer;
 		 */
 		for (i = 0; i < pfnCurr; i++) {
 		    gpadl_body->pfn[i] = pfn + pfnSum + i;
 		}
 
 		TAILQ_INSERT_TAIL(
 		    &msg_header->sub_msg_list_anchor,
 		    msg_body,
 		    msg_list_entry);
 		pfnSum += pfnCurr;
 		pfnLeft -= pfnCurr;
 	    }
 	} else { /* else everything fits in a header */
 
 	    msg_size = sizeof(hv_vmbus_channel_msg_info) +
 		sizeof(hv_vmbus_channel_gpadl_header) +
 		sizeof(hv_gpa_range) +
 		page_count * sizeof(uint64_t);
 	    msg_header = malloc(msg_size, M_DEVBUF, M_NOWAIT | M_ZERO);
 	    KASSERT(
 		msg_header != NULL,
 		("Error VMBUS: malloc failed to allocate Gpadl Message!"));
 	    if (msg_header == NULL)
 		return (ENOMEM);
 
 	    msg_header->message_size = msg_size;
 
 	    gpa_header = (hv_vmbus_channel_gpadl_header*) msg_header->msg;
 	    gpa_header->range_count = 1;
 	    gpa_header->range_buf_len = sizeof(hv_gpa_range) +
 		page_count * sizeof(uint64_t);
 	    gpa_header->range[0].byte_offset = 0;
 	    gpa_header->range[0].byte_count = size;
 	    for (i = 0; i < page_count; i++) {
 		gpa_header->range[0].pfn_array[i] = pfn + i;
 	    }
 
 	    *msg_info = msg_header;
 	    *message_count = 1;
 	}
 
 	return (0);
 }
 
 /**
  * @brief Establish a GPADL for the specified buffer
  */
 int
 hv_vmbus_channel_establish_gpadl(
 	hv_vmbus_channel*	channel,
 	void*			contig_buffer,
 	uint32_t		size, /* page-size multiple */
 	uint32_t*		gpadl_handle)
 
 {
 	int ret = 0;
 	hv_vmbus_channel_gpadl_header*	gpadl_msg;
 	hv_vmbus_channel_gpadl_body*	gpadl_body;
 	hv_vmbus_channel_msg_info*	msg_info;
 	hv_vmbus_channel_msg_info*	sub_msg_info;
 	uint32_t			msg_count;
 	hv_vmbus_channel_msg_info*	curr;
 	uint32_t			next_gpadl_handle;
 
 	next_gpadl_handle = hv_vmbus_g_connection.next_gpadl_handle;
 	atomic_add_int((int*) &hv_vmbus_g_connection.next_gpadl_handle, 1);
 
 	ret = vmbus_channel_create_gpadl_header(
 		contig_buffer, size, &msg_info, &msg_count);
 
 	if(ret != 0) { /* if(allocation failed) return immediately */
 	    /* reverse atomic_add_int above */
 	    atomic_subtract_int((int*)
 		    &hv_vmbus_g_connection.next_gpadl_handle, 1);
 	    return ret;
 	}
 
 	sema_init(&msg_info->wait_sema, 0, "Open Info Sema");
 	gpadl_msg = (hv_vmbus_channel_gpadl_header*) msg_info->msg;
 	gpadl_msg->header.message_type = HV_CHANNEL_MESSAGEL_GPADL_HEADER;
 	gpadl_msg->child_rel_id = channel->offer_msg.child_rel_id;
 	gpadl_msg->gpadl = next_gpadl_handle;
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_INSERT_TAIL(
 		&hv_vmbus_g_connection.channel_msg_anchor,
 		msg_info,
 		msg_list_entry);
 
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	ret = hv_vmbus_post_message(
 		gpadl_msg,
 		msg_info->message_size -
 		    (uint32_t) sizeof(hv_vmbus_channel_msg_info));
 
 	if (ret != 0)
 	    goto cleanup;
 
 	if (msg_count > 1) {
 	    TAILQ_FOREACH(curr,
 		    &msg_info->sub_msg_list_anchor, msg_list_entry) {
 		sub_msg_info = curr;
 		gpadl_body =
 		    (hv_vmbus_channel_gpadl_body*) sub_msg_info->msg;
 
 		gpadl_body->header.message_type =
 		    HV_CHANNEL_MESSAGE_GPADL_BODY;
 		gpadl_body->gpadl = next_gpadl_handle;
 
 		ret = hv_vmbus_post_message(
 			gpadl_body,
 			sub_msg_info->message_size
 			    - (uint32_t) sizeof(hv_vmbus_channel_msg_info));
 		 /* if (the post message failed) give up and clean up */
 		if(ret != 0)
 		    goto cleanup;
 	    }
 	}
 
 	ret = sema_timedwait(&msg_info->wait_sema, 5 * hz); /* KYS 5 seconds*/
 	if (ret != 0)
 	    goto cleanup;
 
 	*gpadl_handle = gpadl_msg->gpadl;
 
 cleanup:
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_REMOVE(&hv_vmbus_g_connection.channel_msg_anchor,
 		msg_info, msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	sema_destroy(&msg_info->wait_sema);
 	free(msg_info, M_DEVBUF);
 
 	return (ret);
 }
 
 /**
  * @brief Teardown the specified GPADL handle
  */
 int
 hv_vmbus_channel_teardown_gpdal(
 	hv_vmbus_channel*	channel,
 	uint32_t		gpadl_handle)
 {
 	int					ret = 0;
 	hv_vmbus_channel_gpadl_teardown*	msg;
 	hv_vmbus_channel_msg_info*		info;
 
 	info = (hv_vmbus_channel_msg_info *)
 		malloc(	sizeof(hv_vmbus_channel_msg_info) +
 			sizeof(hv_vmbus_channel_gpadl_teardown),
 				M_DEVBUF, M_NOWAIT);
 	KASSERT(info != NULL,
 	    ("Error VMBUS: malloc failed to allocate Gpadl Teardown Msg!"));
 	if (info == NULL) {
 	    ret = ENOMEM;
 	    goto cleanup;
 	}
 
 	sema_init(&info->wait_sema, 0, "Open Info Sema");
 
 	msg = (hv_vmbus_channel_gpadl_teardown*) info->msg;
 
 	msg->header.message_type = HV_CHANNEL_MESSAGE_GPADL_TEARDOWN;
 	msg->child_rel_id = channel->offer_msg.child_rel_id;
 	msg->gpadl = gpadl_handle;
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_INSERT_TAIL(&hv_vmbus_g_connection.channel_msg_anchor,
 			info, msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	ret = hv_vmbus_post_message(msg,
 			sizeof(hv_vmbus_channel_gpadl_teardown));
 	if (ret != 0) 
 	    goto cleanup;
 	
 	ret = sema_timedwait(&info->wait_sema, 5 * hz); /* KYS 5 seconds */
 
 cleanup:
 	/*
 	 * Received a torndown response
 	 */
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_REMOVE(&hv_vmbus_g_connection.channel_msg_anchor,
 			info, msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 	sema_destroy(&info->wait_sema);
 	free(info, M_DEVBUF);
 
 	return (ret);
 }
 
 static void
 hv_vmbus_channel_close_internal(hv_vmbus_channel *channel)
 {
 	int ret = 0;
 	hv_vmbus_channel_close_channel* msg;
 	hv_vmbus_channel_msg_info* info;
 
 	channel->state = HV_CHANNEL_OPEN_STATE;
 	channel->sc_creation_callback = NULL;
 
 	/*
 	 * Grab the lock to prevent a race condition between receiving a
 	 * packet and unloading the driver.
 	 */
 	mtx_lock(&channel->inbound_lock);
 	channel->on_channel_callback = NULL;
 	mtx_unlock(&channel->inbound_lock);
 
 	/**
 	 * Send a closing message
 	 */
 	info = (hv_vmbus_channel_msg_info *)
 		malloc(	sizeof(hv_vmbus_channel_msg_info) +
 			sizeof(hv_vmbus_channel_close_channel),
 				M_DEVBUF, M_NOWAIT);
 	KASSERT(info != NULL, ("VMBUS: malloc failed hv_vmbus_channel_close!"));
 	if(info == NULL)
 	    return;
 
 	msg = (hv_vmbus_channel_close_channel*) info->msg;
 	msg->header.message_type = HV_CHANNEL_MESSAGE_CLOSE_CHANNEL;
 	msg->child_rel_id = channel->offer_msg.child_rel_id;
 
 	ret = hv_vmbus_post_message(
 		msg, sizeof(hv_vmbus_channel_close_channel));
 
 	/* Tear down the gpadl for the channel's ring buffer */
 	if (channel->ring_buffer_gpadl_handle) {
 		hv_vmbus_channel_teardown_gpdal(channel,
 			channel->ring_buffer_gpadl_handle);
 	}
 
 	/* TODO: Send a msg to release the childRelId */
 
 	/* cleanup the ring buffers for this channel */
 	hv_ring_buffer_cleanup(&channel->outbound);
 	hv_ring_buffer_cleanup(&channel->inbound);
 
 	contigfree(channel->ring_buffer_pages, channel->ring_buffer_size,
 	    M_DEVBUF);
 
 	free(info, M_DEVBUF);
 }
 
 /**
  * @brief Close the specified channel
  */
 void
 hv_vmbus_channel_close(hv_vmbus_channel *channel)
 {
 	hv_vmbus_channel*	sub_channel;
 
 	if (channel->primary_channel != NULL) {
 		/*
 		 * We only close multi-channels when the primary is
 		 * closed.
 		 */
 		return;
 	}
 
 	/*
 	 * Close all multi-channels first.
 	 */
 	TAILQ_FOREACH(sub_channel, &channel->sc_list_anchor,
 	    sc_list_entry) {
 		if (sub_channel->state != HV_CHANNEL_OPENED_STATE)
 			continue;
 		hv_vmbus_channel_close_internal(sub_channel);
 	}
 	/*
 	 * Then close the primary channel.
 	 */
 	hv_vmbus_channel_close_internal(channel);
 }
 
 /**
  * @brief Send the specified buffer on the given channel
  */
 int
 hv_vmbus_channel_send_packet(
 	hv_vmbus_channel*	channel,
 	void*			buffer,
 	uint32_t		buffer_len,
 	uint64_t		request_id,
 	hv_vmbus_packet_type	type,
 	uint32_t		flags)
 {
 	int			ret = 0;
 	hv_vm_packet_descriptor	desc;
 	uint32_t		packet_len;
 	uint64_t		aligned_data;
 	uint32_t		packet_len_aligned;
 	boolean_t		need_sig;
 	hv_vmbus_sg_buffer_list	buffer_list[3];
 
 	packet_len = sizeof(hv_vm_packet_descriptor) + buffer_len;
 	packet_len_aligned = HV_ALIGN_UP(packet_len, sizeof(uint64_t));
 	aligned_data = 0;
 
 	/* Setup the descriptor */
 	desc.type = type;   /* HV_VMBUS_PACKET_TYPE_DATA_IN_BAND;             */
 	desc.flags = flags; /* HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED */
 			    /* in 8-bytes granularity */
 	desc.data_offset8 = sizeof(hv_vm_packet_descriptor) >> 3;
 	desc.length8 = (uint16_t) (packet_len_aligned >> 3);
 	desc.transaction_id = request_id;
 
 	buffer_list[0].data = &desc;
 	buffer_list[0].length = sizeof(hv_vm_packet_descriptor);
 
 	buffer_list[1].data = buffer;
 	buffer_list[1].length = buffer_len;
 
 	buffer_list[2].data = &aligned_data;
 	buffer_list[2].length = packet_len_aligned - packet_len;
 
 	ret = hv_ring_buffer_write(&channel->outbound, buffer_list, 3,
 	    &need_sig);
 
 	/* TODO: We should determine if this is optional */
 	if (ret == 0 && need_sig) {
 		vmbus_channel_set_event(channel);
 	}
 
 	return (ret);
 }
 
 /**
  * @brief Send a range of single-page buffer packets using
  * a GPADL Direct packet type
  */
 int
 hv_vmbus_channel_send_packet_pagebuffer(
 	hv_vmbus_channel*	channel,
 	hv_vmbus_page_buffer	page_buffers[],
 	uint32_t		page_count,
 	void*			buffer,
 	uint32_t		buffer_len,
 	uint64_t		request_id)
 {
 
 	int					ret = 0;
 	int					i = 0;
 	boolean_t				need_sig;
 	uint32_t				packet_len;
 	uint32_t				packetLen_aligned;
 	hv_vmbus_sg_buffer_list			buffer_list[3];
 	hv_vmbus_channel_packet_page_buffer	desc;
 	uint32_t				descSize;
 	uint64_t				alignedData = 0;
 
 	if (page_count > HV_MAX_PAGE_BUFFER_COUNT)
 		return (EINVAL);
 
 	/*
 	 * Adjust the size down since hv_vmbus_channel_packet_page_buffer
 	 *  is the largest size we support
 	 */
 	descSize = sizeof(hv_vmbus_channel_packet_page_buffer) -
 			((HV_MAX_PAGE_BUFFER_COUNT - page_count) *
 			sizeof(hv_vmbus_page_buffer));
 	packet_len = descSize + buffer_len;
 	packetLen_aligned = HV_ALIGN_UP(packet_len, sizeof(uint64_t));
 
 	/* Setup the descriptor */
 	desc.type = HV_VMBUS_PACKET_TYPE_DATA_USING_GPA_DIRECT;
 	desc.flags = HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED;
 	desc.data_offset8 = descSize >> 3; /* in 8-bytes granularity */
 	desc.length8 = (uint16_t) (packetLen_aligned >> 3);
 	desc.transaction_id = request_id;
 	desc.range_count = page_count;
 
 	for (i = 0; i < page_count; i++) {
 		desc.range[i].length = page_buffers[i].length;
 		desc.range[i].offset = page_buffers[i].offset;
 		desc.range[i].pfn = page_buffers[i].pfn;
 	}
 
 	buffer_list[0].data = &desc;
 	buffer_list[0].length = descSize;
 
 	buffer_list[1].data = buffer;
 	buffer_list[1].length = buffer_len;
 
 	buffer_list[2].data = &alignedData;
 	buffer_list[2].length = packetLen_aligned - packet_len;
 
 	ret = hv_ring_buffer_write(&channel->outbound, buffer_list, 3,
 	    &need_sig);
 
 	/* TODO: We should determine if this is optional */
 	if (ret == 0 && need_sig) {
 		vmbus_channel_set_event(channel);
 	}
 
 	return (ret);
 }
 
 /**
  * @brief Send a multi-page buffer packet using a GPADL Direct packet type
  */
 int
 hv_vmbus_channel_send_packet_multipagebuffer(
 	hv_vmbus_channel*		channel,
 	hv_vmbus_multipage_buffer*	multi_page_buffer,
 	void*				buffer,
 	uint32_t			buffer_len,
 	uint64_t			request_id)
 {
 
 	int			ret = 0;
 	uint32_t		desc_size;
 	boolean_t		need_sig;
 	uint32_t		packet_len;
 	uint32_t		packet_len_aligned;
 	uint32_t		pfn_count;
 	uint64_t		aligned_data = 0;
 	hv_vmbus_sg_buffer_list	buffer_list[3];
 	hv_vmbus_channel_packet_multipage_buffer desc;
 
 	pfn_count =
 	    HV_NUM_PAGES_SPANNED(
 		    multi_page_buffer->offset,
 		    multi_page_buffer->length);
 
 	if ((pfn_count == 0) || (pfn_count > HV_MAX_MULTIPAGE_BUFFER_COUNT))
 	    return (EINVAL);
 	/*
 	 * Adjust the size down since hv_vmbus_channel_packet_multipage_buffer
 	 * is the largest size we support
 	 */
 	desc_size =
 	    sizeof(hv_vmbus_channel_packet_multipage_buffer) -
 		    ((HV_MAX_MULTIPAGE_BUFFER_COUNT - pfn_count) *
 			sizeof(uint64_t));
 	packet_len = desc_size + buffer_len;
 	packet_len_aligned = HV_ALIGN_UP(packet_len, sizeof(uint64_t));
 
 	/*
 	 * Setup the descriptor
 	 */
 	desc.type = HV_VMBUS_PACKET_TYPE_DATA_USING_GPA_DIRECT;
 	desc.flags = HV_VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED;
 	desc.data_offset8 = desc_size >> 3; /* in 8-bytes granularity */
 	desc.length8 = (uint16_t) (packet_len_aligned >> 3);
 	desc.transaction_id = request_id;
 	desc.range_count = 1;
 
 	desc.range.length = multi_page_buffer->length;
 	desc.range.offset = multi_page_buffer->offset;
 
 	memcpy(desc.range.pfn_array, multi_page_buffer->pfn_array,
 		pfn_count * sizeof(uint64_t));
 
 	buffer_list[0].data = &desc;
 	buffer_list[0].length = desc_size;
 
 	buffer_list[1].data = buffer;
 	buffer_list[1].length = buffer_len;
 
 	buffer_list[2].data = &aligned_data;
 	buffer_list[2].length = packet_len_aligned - packet_len;
 
 	ret = hv_ring_buffer_write(&channel->outbound, buffer_list, 3,
 	    &need_sig);
 
 	/* TODO: We should determine if this is optional */
 	if (ret == 0 && need_sig) {
 	    vmbus_channel_set_event(channel);
 	}
 
 	return (ret);
 }
 
 /**
  * @brief Retrieve the user packet on the specified channel
  */
 int
 hv_vmbus_channel_recv_packet(
 	hv_vmbus_channel*	channel,
 	void*			Buffer,
 	uint32_t		buffer_len,
 	uint32_t*		buffer_actual_len,
 	uint64_t*		request_id)
 {
 	int			ret;
 	uint32_t		user_len;
 	uint32_t		packet_len;
 	hv_vm_packet_descriptor	desc;
 
 	*buffer_actual_len = 0;
 	*request_id = 0;
 
 	ret = hv_ring_buffer_peek(&channel->inbound, &desc,
 		sizeof(hv_vm_packet_descriptor));
 	if (ret != 0)
 		return (0);
 
 	packet_len = desc.length8 << 3;
 	user_len = packet_len - (desc.data_offset8 << 3);
 
 	*buffer_actual_len = user_len;
 
 	if (user_len > buffer_len)
 		return (EINVAL);
 
 	*request_id = desc.transaction_id;
 
 	/* Copy over the packet to the user buffer */
 	ret = hv_ring_buffer_read(&channel->inbound, Buffer, user_len,
 		(desc.data_offset8 << 3));
 
 	return (0);
 }
 
 /**
  * @brief Retrieve the raw packet on the specified channel
  */
 int
 hv_vmbus_channel_recv_packet_raw(
 	hv_vmbus_channel*	channel,
 	void*			buffer,
 	uint32_t		buffer_len,
 	uint32_t*		buffer_actual_len,
 	uint64_t*		request_id)
 {
 	int		ret;
 	uint32_t	packetLen;
 	uint32_t	userLen;
 	hv_vm_packet_descriptor	desc;
 
 	*buffer_actual_len = 0;
 	*request_id = 0;
 
 	ret = hv_ring_buffer_peek(
 		&channel->inbound, &desc,
 		sizeof(hv_vm_packet_descriptor));
 
 	if (ret != 0)
 	    return (0);
 
 	packetLen = desc.length8 << 3;
 	userLen = packetLen - (desc.data_offset8 << 3);
 
 	*buffer_actual_len = packetLen;
 
 	if (packetLen > buffer_len)
 	    return (ENOBUFS);
 
 	*request_id = desc.transaction_id;
 
 	/* Copy over the entire packet to the user buffer */
 	ret = hv_ring_buffer_read(&channel->inbound, buffer, packetLen, 0);
 
 	return (0);
 }
Index: releng/10.3/sys/dev/hyperv/vmbus/hv_channel_mgmt.c
===================================================================
--- releng/10.3/sys/dev/hyperv/vmbus/hv_channel_mgmt.c	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/vmbus/hv_channel_mgmt.c	(revision 303984)
@@ -1,851 +1,851 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/mbuf.h>
 
 #include "hv_vmbus_priv.h"
 
 /*
  * Internal functions
  */
 
 static void vmbus_channel_on_offer(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_open_result(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_offer_rescind(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_gpadl_created(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_gpadl_torndown(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_offers_delivered(hv_vmbus_channel_msg_header* hdr);
 static void vmbus_channel_on_version_response(hv_vmbus_channel_msg_header* hdr);
 
 /**
  * Channel message dispatch table
  */
 hv_vmbus_channel_msg_table_entry
     g_channel_message_table[HV_CHANNEL_MESSAGE_COUNT] = {
 	{ HV_CHANNEL_MESSAGE_INVALID,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_OFFER_CHANNEL,
 		0, vmbus_channel_on_offer },
 	{ HV_CHANNEL_MESSAGE_RESCIND_CHANNEL_OFFER,
 		0, vmbus_channel_on_offer_rescind },
 	{ HV_CHANNEL_MESSAGE_REQUEST_OFFERS,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_ALL_OFFERS_DELIVERED,
 		1, vmbus_channel_on_offers_delivered },
 	{ HV_CHANNEL_MESSAGE_OPEN_CHANNEL,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_OPEN_CHANNEL_RESULT,
 		1, vmbus_channel_on_open_result },
 	{ HV_CHANNEL_MESSAGE_CLOSE_CHANNEL,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGEL_GPADL_HEADER,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_GPADL_BODY,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_GPADL_CREATED,
 		1, vmbus_channel_on_gpadl_created },
 	{ HV_CHANNEL_MESSAGE_GPADL_TEARDOWN,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_GPADL_TORNDOWN,
 		1, vmbus_channel_on_gpadl_torndown },
 	{ HV_CHANNEL_MESSAGE_REL_ID_RELEASED,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_INITIATED_CONTACT,
 		0, NULL },
 	{ HV_CHANNEL_MESSAGE_VERSION_RESPONSE,
 		1, vmbus_channel_on_version_response },
 	{ HV_CHANNEL_MESSAGE_UNLOAD,
 		0, NULL }
 };
 
 
 /**
  * Implementation of the work abstraction.
  */
 static void
 work_item_callback(void *work, int pending)
 {
 	struct hv_work_item *w = (struct hv_work_item *)work;
 
 	/*
 	 * Serialize work execution.
 	 */
 	if (w->wq->work_sema != NULL) {
 		sema_wait(w->wq->work_sema);
 	}
 
 	w->callback(w->context);
 
 	if (w->wq->work_sema != NULL) {
 		sema_post(w->wq->work_sema);
 	} 
 
 	free(w, M_DEVBUF);
 }
 
 struct hv_work_queue*
 hv_work_queue_create(char* name)
 {
 	static unsigned int	qid = 0;
 	char			qname[64];
 	int			pri;
 	struct hv_work_queue*	wq;
 
 	wq = malloc(sizeof(struct hv_work_queue), M_DEVBUF, M_NOWAIT | M_ZERO);
 	KASSERT(wq != NULL, ("Error VMBUS: Failed to allocate work_queue\n"));
 	if (wq == NULL)
 	    return (NULL);
 
 	/*
 	 * We use work abstraction to handle messages
 	 * coming from the host and these are typically offers.
 	 * Some FreeBSD drivers appear to have a concurrency issue
 	 * where probe/attach needs to be serialized. We ensure that
 	 * by having only one thread process work elements in a 
 	 * specific queue by serializing work execution.
 	 *
 	 */
 	if (strcmp(name, "vmbusQ") == 0) {
 	    pri = PI_DISK;
 	} else { /* control */
 	    pri = PI_NET;
 	    /*
 	     * Initialize semaphore for this queue by pointing
 	     * to the global semaphore used for synchronizing all
 	     * control messages.
 	     */
 	    wq->work_sema = &hv_vmbus_g_connection.control_sema;
 	}
 
 	sprintf(qname, "hv_%s_%u", name, qid);
 
 	/*
 	 * Fixme:  FreeBSD 8.2 has a different prototype for
 	 * taskqueue_create(), and for certain other taskqueue functions.
 	 * We need to research the implications of these changes.
 	 * Fixme:  Not sure when the changes were introduced.
 	 */
 	wq->queue = taskqueue_create(qname, M_NOWAIT, taskqueue_thread_enqueue,
 	    &wq->queue
 	    #if __FreeBSD_version < 800000
 	    , &wq->proc
 	    #endif
 	    );
 
 	if (wq->queue == NULL) {
 	    free(wq, M_DEVBUF);
 	    return (NULL);
 	}
 
 	if (taskqueue_start_threads(&wq->queue, 1, pri, "%s taskq", qname)) {
 	    taskqueue_free(wq->queue);
 	    free(wq, M_DEVBUF);
 	    return (NULL);
 	}
 
 	qid++;
 
 	return (wq);
 }
 
 void
 hv_work_queue_close(struct hv_work_queue *wq)
 {
 	/*
 	 * KYS: Need to drain the taskqueue
 	 * before we close the hv_work_queue.
 	 */
 	/*KYS: taskqueue_drain(wq->tq, ); */
 	taskqueue_free(wq->queue);
 	free(wq, M_DEVBUF);
 }
 
 /**
  * @brief Create work item
  */
 int
 hv_queue_work_item(
 	struct hv_work_queue *wq,
 	void (*callback)(void *), void *context)
 {
 	struct hv_work_item *w = malloc(sizeof(struct hv_work_item),
 					M_DEVBUF, M_NOWAIT | M_ZERO);
 	KASSERT(w != NULL, ("Error VMBUS: Failed to allocate WorkItem\n"));
 	if (w == NULL)
 	    return (ENOMEM);
 
 	w->callback = callback;
 	w->context = context;
 	w->wq = wq;
 
 	TASK_INIT(&w->work, 0, work_item_callback, w);
 
 	return (taskqueue_enqueue(wq->queue, &w->work));
 }
 
 
 /**
  * @brief Allocate and initialize a vmbus channel object
  */
 hv_vmbus_channel*
 hv_vmbus_allocate_channel(void)
 {
 	hv_vmbus_channel* channel;
 
 	channel = (hv_vmbus_channel*) malloc(
 					sizeof(hv_vmbus_channel),
 					M_DEVBUF,
 					M_NOWAIT | M_ZERO);
 	KASSERT(channel != NULL, ("Error VMBUS: Failed to allocate channel!"));
 	if (channel == NULL)
 	    return (NULL);
 
 	mtx_init(&channel->inbound_lock, "channel inbound", NULL, MTX_DEF);
 	mtx_init(&channel->sc_lock, "vmbus multi channel", NULL, MTX_DEF);
 
 	TAILQ_INIT(&channel->sc_list_anchor);
 
 	return (channel);
 }
 
 /**
  * @brief Release the vmbus channel object itself
  */
 static inline void
 ReleaseVmbusChannel(void *context)
 {
 	hv_vmbus_channel* channel = (hv_vmbus_channel*) context;
 	free(channel, M_DEVBUF);
 }
 
 /**
  * @brief Release the resources used by the vmbus channel object
  */
 void
 hv_vmbus_free_vmbus_channel(hv_vmbus_channel* channel)
 {
 	mtx_destroy(&channel->sc_lock);
 	mtx_destroy(&channel->inbound_lock);
 	/*
 	 * We have to release the channel's workqueue/thread in
 	 *  the vmbus's workqueue/thread context
 	 * ie we can't destroy ourselves
 	 */
 	hv_queue_work_item(hv_vmbus_g_connection.work_queue,
 	    ReleaseVmbusChannel, (void *) channel);
 }
 
 /**
  * @brief Process the offer by creating a channel/device
  * associated with this offer
  */
 static void
 vmbus_channel_process_offer(hv_vmbus_channel *new_channel)
 {
 	boolean_t		f_new;
 	hv_vmbus_channel*	channel;
 	int			ret;
 	uint32_t                relid;
 
 	f_new = TRUE;
 	channel = NULL;
 	relid = new_channel->offer_msg.child_rel_id;
 	/*
 	 * Make sure this is a new offer
 	 */
 	mtx_lock(&hv_vmbus_g_connection.channel_lock);
 	hv_vmbus_g_connection.channels[relid] = new_channel;
 
 	TAILQ_FOREACH(channel, &hv_vmbus_g_connection.channel_anchor,
 	    list_entry)
 	{
 		if (memcmp(&channel->offer_msg.offer.interface_type,
 		    &new_channel->offer_msg.offer.interface_type,
 		    sizeof(hv_guid)) == 0 &&
 		    memcmp(&channel->offer_msg.offer.interface_instance,
 		    &new_channel->offer_msg.offer.interface_instance,
 		    sizeof(hv_guid)) == 0) {
 			f_new = FALSE;
 			break;
 		}
 	}
 
 	if (f_new) {
 		/* Insert at tail */
 		TAILQ_INSERT_TAIL(
 		    &hv_vmbus_g_connection.channel_anchor,
 		    new_channel,
 		    list_entry);
 	}
 	mtx_unlock(&hv_vmbus_g_connection.channel_lock);
 
 	/*XXX add new channel to percpu_list */
 
 	if (!f_new) {
 		/*
 		 * Check if this is a sub channel.
 		 */
 		if (new_channel->offer_msg.offer.sub_channel_index != 0) {
 			/*
 			 * It is a sub channel offer, process it.
 			 */
 			new_channel->primary_channel = channel;
 			mtx_lock(&channel->sc_lock);
 			TAILQ_INSERT_TAIL(
 			    &channel->sc_list_anchor,
 			    new_channel,
 			    sc_list_entry);
 			mtx_unlock(&channel->sc_lock);
 
 			/* Insert new channel into channel_anchor. */
 			printf("VMBUS get multi-channel offer, rel=%u,sub=%u\n",
 			    new_channel->offer_msg.child_rel_id,
 			    new_channel->offer_msg.offer.sub_channel_index);	
 			mtx_lock(&hv_vmbus_g_connection.channel_lock);
 			TAILQ_INSERT_TAIL(&hv_vmbus_g_connection.channel_anchor,
 			    new_channel, list_entry);				
 			mtx_unlock(&hv_vmbus_g_connection.channel_lock);
 
 			if(bootverbose)
 				printf("VMBUS: new multi-channel offer <%p>, "
 				    "its primary channel is <%p>.\n",
 				    new_channel, new_channel->primary_channel);
 
 			/*XXX add it to percpu_list */
 
 			new_channel->state = HV_CHANNEL_OPEN_STATE;
 			if (channel->sc_creation_callback != NULL) {
 				channel->sc_creation_callback(new_channel);
 			}
 			return;
 		}
 
 	    hv_vmbus_free_vmbus_channel(new_channel);
 	    return;
 	}
 
 	new_channel->state = HV_CHANNEL_OPEN_STATE;
 
 	/*
 	 * Start the process of binding this offer to the driver
 	 * (We need to set the device field before calling
 	 * hv_vmbus_child_device_add())
 	 */
 	new_channel->device = hv_vmbus_child_device_create(
 	    new_channel->offer_msg.offer.interface_type,
 	    new_channel->offer_msg.offer.interface_instance, new_channel);
 
 	/*
 	 * Add the new device to the bus. This will kick off device-driver
 	 * binding which eventually invokes the device driver's AddDevice()
 	 * method.
 	 */
 	ret = hv_vmbus_child_device_register(new_channel->device);
 	if (ret != 0) {
 		mtx_lock(&hv_vmbus_g_connection.channel_lock);
 		TAILQ_REMOVE(
 		    &hv_vmbus_g_connection.channel_anchor,
 		    new_channel,
 		    list_entry);
 		mtx_unlock(&hv_vmbus_g_connection.channel_lock);
 		hv_vmbus_free_vmbus_channel(new_channel);
 	}
 }
 
 /**
  * Array of device guids that are performance critical. We try to distribute
  * the interrupt load for these devices across all online cpus. 
  */
 static const hv_guid high_perf_devices[] = {
 	{HV_NIC_GUID, },
 	{HV_IDE_GUID, },
 	{HV_SCSI_GUID, },
 };
 
 enum {
 	PERF_CHN_NIC = 0,
 	PERF_CHN_IDE,
 	PERF_CHN_SCSI,
 	MAX_PERF_CHN,
 };
 
 /*
  * We use this static number to distribute the channel interrupt load.
  */
 static uint32_t next_vcpu;
 
 /**
  * Starting with Win8, we can statically distribute the incoming
  * channel interrupt load by binding a channel to VCPU. We
  * implement here a simple round robin scheme for distributing
  * the interrupt load.
  * We will bind channels that are not performance critical to cpu 0 and
  * performance critical channels (IDE, SCSI and Network) will be uniformly
  * distributed across all available CPUs.
  */
 static void
 vmbus_channel_select_cpu(hv_vmbus_channel *channel, hv_guid *guid)
 {
 	uint32_t current_cpu;
 	int i;
 	boolean_t is_perf_channel = FALSE;
 
 	for (i = PERF_CHN_NIC; i < MAX_PERF_CHN; i++) {
 		if (memcmp(guid->data, high_perf_devices[i].data,
 		    sizeof(hv_guid)) == 0) {
 			is_perf_channel = TRUE;
 			break;
 		}
 	}
 
 	if ((hv_vmbus_protocal_version == HV_VMBUS_VERSION_WS2008) ||
 	    (hv_vmbus_protocal_version == HV_VMBUS_VERSION_WIN7) ||
 	    (!is_perf_channel)) {
 		/* Host's view of guest cpu */
 		channel->target_vcpu = 0;
 		/* Guest's own view of cpu */
 		channel->target_cpu = 0;
 		return;
 	}
 	/* mp_ncpus should have the number of cpus currently online */
 	current_cpu = (++next_vcpu % mp_ncpus);
 	channel->target_cpu = current_cpu;
 	channel->target_vcpu =
 	    hv_vmbus_g_context.hv_vcpu_index[current_cpu];
 	if (bootverbose)
 		printf("VMBUS: Total online cpus %d, assign perf channel %d "
 		    "to vcpu %d, cpu %d\n", mp_ncpus, i, channel->target_vcpu,
 		    current_cpu);
 }
 
 /**
  * @brief Handler for channel offers from Hyper-V/Azure
  *
  * Handler for channel offers from vmbus in parent partition. We ignore
  * all offers except network and storage offers. For each network and storage
  * offers, we create a channel object and queue a work item to the channel
  * object to process the offer synchronously
  */
 static void
 vmbus_channel_on_offer(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_offer_channel* offer;
 	hv_vmbus_channel* new_channel;
 
 	offer = (hv_vmbus_channel_offer_channel*) hdr;
 
 	hv_guid *guidType;
 	hv_guid *guidInstance;
 
 	guidType = &offer->offer.interface_type;
 	guidInstance = &offer->offer.interface_instance;
 
 	/* Allocate the channel object and save this offer */
 	new_channel = hv_vmbus_allocate_channel();
 	if (new_channel == NULL)
 	    return;
 
 	/*
 	 * By default we setup state to enable batched
 	 * reading. A specific service can choose to
 	 * disable this prior to opening the channel.
 	 */
 	new_channel->batched_reading = TRUE;
 
 	new_channel->signal_event_param =
 	    (hv_vmbus_input_signal_event *)
 	    (HV_ALIGN_UP((unsigned long)
 		&new_channel->signal_event_buffer,
 		HV_HYPERCALL_PARAM_ALIGN));
 
  	new_channel->signal_event_param->connection_id.as_uint32_t = 0;	
 	new_channel->signal_event_param->connection_id.u.id =
 	    HV_VMBUS_EVENT_CONNECTION_ID;
 	new_channel->signal_event_param->flag_number = 0;
 	new_channel->signal_event_param->rsvd_z = 0;
 
 	if (hv_vmbus_protocal_version != HV_VMBUS_VERSION_WS2008) {
 		new_channel->is_dedicated_interrupt =
 		    (offer->is_dedicated_interrupt != 0);
 		new_channel->signal_event_param->connection_id.u.id =
 		    offer->connection_id;
 	}
 
 	/*
 	 * Bind the channel to a chosen cpu.
 	 */
 	vmbus_channel_select_cpu(new_channel,
 	    &offer->offer.interface_type);
 
 	memcpy(&new_channel->offer_msg, offer,
 	    sizeof(hv_vmbus_channel_offer_channel));
 	new_channel->monitor_group = (uint8_t) offer->monitor_id / 32;
 	new_channel->monitor_bit = (uint8_t) offer->monitor_id % 32;
 
 	vmbus_channel_process_offer(new_channel);
 }
 
 /**
  * @brief Rescind offer handler.
  *
  * We queue a work item to process this offer
  * synchronously
  */
 static void
 vmbus_channel_on_offer_rescind(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_rescind_offer*	rescind;
 	hv_vmbus_channel*		channel;
 
 	rescind = (hv_vmbus_channel_rescind_offer*) hdr;
 
 	channel = hv_vmbus_g_connection.channels[rescind->child_rel_id];
 	if (channel == NULL) 
 	    return;
 
 	hv_vmbus_child_device_unregister(channel->device);
 	mtx_lock(&hv_vmbus_g_connection.channel_lock);
 	hv_vmbus_g_connection.channels[rescind->child_rel_id] = NULL;
 	mtx_unlock(&hv_vmbus_g_connection.channel_lock);
 }
 
 /**
  *
  * @brief Invoked when all offers have been delivered.
  */
 static void
 vmbus_channel_on_offers_delivered(hv_vmbus_channel_msg_header* hdr)
 {
 }
 
 /**
  * @brief Open result handler.
  *
  * This is invoked when we received a response
  * to our channel open request. Find the matching request, copy the
  * response and signal the requesting thread.
  */
 static void
 vmbus_channel_on_open_result(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_open_result*	result;
 	hv_vmbus_channel_msg_info*	msg_info;
 	hv_vmbus_channel_msg_header*	requestHeader;
 	hv_vmbus_channel_open_channel*	openMsg;
 
 	result = (hv_vmbus_channel_open_result*) hdr;
 
 	/*
 	 * Find the open msg, copy the result and signal/unblock the wait event
 	 */
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	TAILQ_FOREACH(msg_info, &hv_vmbus_g_connection.channel_msg_anchor,
 	    msg_list_entry) {
 	    requestHeader = (hv_vmbus_channel_msg_header*) msg_info->msg;
 
 	    if (requestHeader->message_type ==
 		    HV_CHANNEL_MESSAGE_OPEN_CHANNEL) {
 		openMsg = (hv_vmbus_channel_open_channel*) msg_info->msg;
 		if (openMsg->child_rel_id == result->child_rel_id
 		    && openMsg->open_id == result->open_id) {
 		    memcpy(&msg_info->response.open_result, result,
 			sizeof(hv_vmbus_channel_open_result));
 		    sema_post(&msg_info->wait_sema);
 		    break;
 		}
 	    }
 	}
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 }
 
 /**
  * @brief GPADL created handler.
  *
  * This is invoked when we received a response
  * to our gpadl create request. Find the matching request, copy the
  * response and signal the requesting thread.
  */
 static void
 vmbus_channel_on_gpadl_created(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_gpadl_created*		gpadl_created;
 	hv_vmbus_channel_msg_info*		msg_info;
 	hv_vmbus_channel_msg_header*		request_header;
 	hv_vmbus_channel_gpadl_header*		gpadl_header;
 
 	gpadl_created = (hv_vmbus_channel_gpadl_created*) hdr;
 
 	/* Find the establish msg, copy the result and signal/unblock
 	 * the wait event
 	 */
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_FOREACH(msg_info, &hv_vmbus_g_connection.channel_msg_anchor,
 		msg_list_entry) {
 	    request_header = (hv_vmbus_channel_msg_header*) msg_info->msg;
 	    if (request_header->message_type ==
 		    HV_CHANNEL_MESSAGEL_GPADL_HEADER) {
 		gpadl_header =
 		    (hv_vmbus_channel_gpadl_header*) request_header;
 
 		if ((gpadl_created->child_rel_id == gpadl_header->child_rel_id)
 		    && (gpadl_created->gpadl == gpadl_header->gpadl)) {
 		    memcpy(&msg_info->response.gpadl_created,
 			gpadl_created,
 			sizeof(hv_vmbus_channel_gpadl_created));
 		    sema_post(&msg_info->wait_sema);
 		    break;
 		}
 	    }
 	}
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 }
 
 /**
  * @brief GPADL torndown handler.
  *
  * This is invoked when we received a response
  * to our gpadl teardown request. Find the matching request, copy the
  * response and signal the requesting thread
  */
 static void
 vmbus_channel_on_gpadl_torndown(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_gpadl_torndown*	gpadl_torndown;
 	hv_vmbus_channel_msg_info*		msg_info;
 	hv_vmbus_channel_msg_header*		requestHeader;
 	hv_vmbus_channel_gpadl_teardown*	gpadlTeardown;
 
 	gpadl_torndown = (hv_vmbus_channel_gpadl_torndown*)hdr;
 
 	/*
 	 * Find the teardown msg, copy the result and signal/unblock the
 	 * wait event.
 	 */
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	TAILQ_FOREACH(msg_info, &hv_vmbus_g_connection.channel_msg_anchor,
 		msg_list_entry) {
 	    requestHeader = (hv_vmbus_channel_msg_header*) msg_info->msg;
 
 	    if (requestHeader->message_type
 		    == HV_CHANNEL_MESSAGE_GPADL_TEARDOWN) {
 		gpadlTeardown =
 		    (hv_vmbus_channel_gpadl_teardown*) requestHeader;
 
 		if (gpadl_torndown->gpadl == gpadlTeardown->gpadl) {
 		    memcpy(&msg_info->response.gpadl_torndown,
 			gpadl_torndown,
 			sizeof(hv_vmbus_channel_gpadl_torndown));
 		    sema_post(&msg_info->wait_sema);
 		    break;
 		}
 	    }
 	}
-    mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+    mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 }
 
 /**
  * @brief Version response handler.
  *
  * This is invoked when we received a response
  * to our initiate contact request. Find the matching request, copy the
  * response and signal the requesting thread.
  */
 static void
 vmbus_channel_on_version_response(hv_vmbus_channel_msg_header* hdr)
 {
 	hv_vmbus_channel_msg_info*		msg_info;
 	hv_vmbus_channel_msg_header*		requestHeader;
 	hv_vmbus_channel_initiate_contact*	initiate;
 	hv_vmbus_channel_version_response*	versionResponse;
 
 	versionResponse = (hv_vmbus_channel_version_response*)hdr;
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_FOREACH(msg_info, &hv_vmbus_g_connection.channel_msg_anchor,
 	    msg_list_entry) {
 	    requestHeader = (hv_vmbus_channel_msg_header*) msg_info->msg;
 	    if (requestHeader->message_type
 		== HV_CHANNEL_MESSAGE_INITIATED_CONTACT) {
 		initiate =
 		    (hv_vmbus_channel_initiate_contact*) requestHeader;
 		memcpy(&msg_info->response.version_response,
 		    versionResponse,
 		    sizeof(hv_vmbus_channel_version_response));
 		sema_post(&msg_info->wait_sema);
 	    }
 	}
-    mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+    mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 }
 
 /**
  * @brief Handler for channel protocol messages.
  *
  * This is invoked in the vmbus worker thread context.
  */
 void
 hv_vmbus_on_channel_message(void *context)
 {
 	hv_vmbus_message*		msg;
 	hv_vmbus_channel_msg_header*	hdr;
 	int				size;
 
 	msg = (hv_vmbus_message*) context;
 	hdr = (hv_vmbus_channel_msg_header*) msg->u.payload;
 	size = msg->header.payload_size;
 
 	if (hdr->message_type >= HV_CHANNEL_MESSAGE_COUNT) {
 	    free(msg, M_DEVBUF);
 	    return;
 	}
 
 	if (g_channel_message_table[hdr->message_type].messageHandler) {
 	    g_channel_message_table[hdr->message_type].messageHandler(hdr);
 	}
 
 	/* Free the msg that was allocated in VmbusOnMsgDPC() */
 	free(msg, M_DEVBUF);
 }
 
 /**
  *  @brief Send a request to get all our pending offers.
  */
 int
 hv_vmbus_request_channel_offers(void)
 {
 	int				ret;
 	hv_vmbus_channel_msg_header*	msg;
 	hv_vmbus_channel_msg_info*	msg_info;
 
 	msg_info = (hv_vmbus_channel_msg_info *)
 	    malloc(sizeof(hv_vmbus_channel_msg_info)
 		    + sizeof(hv_vmbus_channel_msg_header), M_DEVBUF, M_NOWAIT);
 
 	if (msg_info == NULL) {
 	    if(bootverbose)
 		printf("Error VMBUS: malloc failed for Request Offers\n");
 	    return (ENOMEM);
 	}
 
 	msg = (hv_vmbus_channel_msg_header*) msg_info->msg;
 	msg->message_type = HV_CHANNEL_MESSAGE_REQUEST_OFFERS;
 
 	ret = hv_vmbus_post_message(msg, sizeof(hv_vmbus_channel_msg_header));
 
 	if (msg_info)
 	    free(msg_info, M_DEVBUF);
 
 	return (ret);
 }
 
 /**
  * @brief Release channels that are unattached/unconnected (i.e., no drivers associated)
  */
 void
 hv_vmbus_release_unattached_channels(void) 
 {
 	hv_vmbus_channel *channel;
 
 	mtx_lock(&hv_vmbus_g_connection.channel_lock);
 
 	while (!TAILQ_EMPTY(&hv_vmbus_g_connection.channel_anchor)) {
 	    channel = TAILQ_FIRST(&hv_vmbus_g_connection.channel_anchor);
 	    TAILQ_REMOVE(&hv_vmbus_g_connection.channel_anchor,
 			    channel, list_entry);
 
 	    hv_vmbus_child_device_unregister(channel->device);
 	    hv_vmbus_free_vmbus_channel(channel);
 	}
 	bzero(hv_vmbus_g_connection.channels, 
 		sizeof(hv_vmbus_channel*) * HV_CHANNEL_MAX_COUNT);
 	mtx_unlock(&hv_vmbus_g_connection.channel_lock);
 }
 
 /**
  * @brief Select the best outgoing channel
  * 
  * The channel whose vcpu binding is closest to the current vcpu will
  * be selected.
  * If no multi-channel, always select primary channel
  * 
  * @param primary - primary channel
  */
 struct hv_vmbus_channel *
 vmbus_select_outgoing_channel(struct hv_vmbus_channel *primary)
 {
 	hv_vmbus_channel *new_channel = NULL;
 	hv_vmbus_channel *outgoing_channel = primary;
 	int old_cpu_distance = 0;
 	int new_cpu_distance = 0;
 	int cur_vcpu = 0;
 	int smp_pro_id = PCPU_GET(cpuid);
 
 	if (TAILQ_EMPTY(&primary->sc_list_anchor)) {
 		return outgoing_channel;
 	}
 
 	if (smp_pro_id >= MAXCPU) {
 		return outgoing_channel;
 	}
 
 	cur_vcpu = hv_vmbus_g_context.hv_vcpu_index[smp_pro_id];
 	
 	TAILQ_FOREACH(new_channel, &primary->sc_list_anchor, sc_list_entry) {
 		if (new_channel->state != HV_CHANNEL_OPENED_STATE){
 			continue;
 		}
 
 		if (new_channel->target_vcpu == cur_vcpu){
 			return new_channel;
 		}
 
 		old_cpu_distance = ((outgoing_channel->target_vcpu > cur_vcpu) ?
 		    (outgoing_channel->target_vcpu - cur_vcpu) :
 		    (cur_vcpu - outgoing_channel->target_vcpu));
 
 		new_cpu_distance = ((new_channel->target_vcpu > cur_vcpu) ?
 		    (new_channel->target_vcpu - cur_vcpu) :
 		    (cur_vcpu - new_channel->target_vcpu));
 
 		if (old_cpu_distance < new_cpu_distance) {
 			continue;
 		}
 
 		outgoing_channel = new_channel;
 	}
 
 	return(outgoing_channel);
 }
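Note on vmbus_select_outgoing_channel() above: the selection rule can be
sketched in user space as follows. This is a minimal illustration, not the
driver code. The struct chan type, its fields, and the helper names are
hypothetical stand-ins for hv_vmbus_channel and its sub-channel TAILQ. As in
the function above, an exact vcpu match returns immediately, sub-channels
that are not opened are skipped, and a tie in distance goes to the
sub-channel seen later in the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for hv_vmbus_channel. */
struct chan {
	int		target_vcpu;	/* vcpu the channel is bound to */
	int		opened;		/* nonzero when usable */
	struct chan	*next;		/* next sub-channel, or NULL */
};

/* Absolute distance between two vcpu indices. */
static int
vcpu_distance(int a, int b)
{
	return (a > b ? a - b : b - a);
}

/*
 * Pick the channel whose vcpu binding is closest to cur_vcpu.
 * With no usable sub-channels the primary channel is returned.
 */
static struct chan *
select_outgoing(struct chan *primary, struct chan *subs, int cur_vcpu)
{
	struct chan *best = primary;
	struct chan *c;

	for (c = subs; c != NULL; c = c->next) {
		if (!c->opened)
			continue;
		if (c->target_vcpu == cur_vcpu)
			return (c);
		/* Keep the current best only if it is strictly closer. */
		if (vcpu_distance(best->target_vcpu, cur_vcpu) <
		    vcpu_distance(c->target_vcpu, cur_vcpu))
			continue;
		best = c;
	}
	return (best);
}
```

With a primary bound to vcpu 0 and sub-channels bound to vcpus 2 and 5, a
caller running on vcpu 4 would be handed the sub-channel on vcpu 5, and a
caller on vcpu 0 would stay on the primary channel.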
Index: releng/10.3/sys/dev/hyperv/vmbus/hv_connection.c
===================================================================
--- releng/10.3/sys/dev/hyperv/vmbus/hv_connection.c	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/vmbus/hv_connection.c	(revision 303984)
@@ -1,522 +1,526 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/systm.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <machine/bus.h>
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 
 #include "hv_vmbus_priv.h"
 
 /*
  * Globals
  */
 hv_vmbus_connection hv_vmbus_g_connection =
 	{ .connect_state = HV_DISCONNECTED,
 	  .next_gpadl_handle = 0xE1E10, };
 
 uint32_t hv_vmbus_protocal_version = HV_VMBUS_VERSION_WS2008;
 
 static uint32_t
 hv_vmbus_get_next_version(uint32_t current_ver)
 {
 	switch (current_ver) {
 	case (HV_VMBUS_VERSION_WIN7):
 		return(HV_VMBUS_VERSION_WS2008);
 
 	case (HV_VMBUS_VERSION_WIN8):
 		return(HV_VMBUS_VERSION_WIN7);
 
 	case (HV_VMBUS_VERSION_WIN8_1):
 		return(HV_VMBUS_VERSION_WIN8);
 
 	case (HV_VMBUS_VERSION_WS2008):
 	default:
 		return(HV_VMBUS_VERSION_INVALID);
 	}
 }
 
 /**
  * Negotiate the highest supported hypervisor version.
  */
 static int
 hv_vmbus_negotiate_version(hv_vmbus_channel_msg_info *msg_info,
 	uint32_t version)
 {
 	int					ret = 0;
 	hv_vmbus_channel_initiate_contact	*msg;
 
 	sema_init(&msg_info->wait_sema, 0, "Msg Info Sema");
 	msg = (hv_vmbus_channel_initiate_contact*) msg_info->msg;
 
 	msg->header.message_type = HV_CHANNEL_MESSAGE_INITIATED_CONTACT;
 	msg->vmbus_version_requested = version;
 
 	msg->interrupt_page = hv_get_phys_addr(
 		hv_vmbus_g_connection.interrupt_page);
 
 	msg->monitor_page_1 = hv_get_phys_addr(
 		hv_vmbus_g_connection.monitor_pages);
 
 	msg->monitor_page_2 =
 		hv_get_phys_addr(
 			((uint8_t *) hv_vmbus_g_connection.monitor_pages
 			+ PAGE_SIZE));
 
 	/**
 	 * Add to list before we send the request since we may receive the
 	 * response before returning from this routine
 	 */
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	TAILQ_INSERT_TAIL(
 		&hv_vmbus_g_connection.channel_msg_anchor,
 		msg_info,
 		msg_list_entry);
 
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	ret = hv_vmbus_post_message(
 		msg,
 		sizeof(hv_vmbus_channel_initiate_contact));
 
 	if (ret != 0) {
-		mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+		mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 		TAILQ_REMOVE(
 			&hv_vmbus_g_connection.channel_msg_anchor,
 			msg_info,
 			msg_list_entry);
-		mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+		mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 		return (ret);
 	}
 
 	/**
 	 * Wait for the connection response
 	 */
 	ret = sema_timedwait(&msg_info->wait_sema, 5 * hz); /* KYS 5 seconds */
 
-	mtx_lock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_lock(&hv_vmbus_g_connection.channel_msg_lock);
 	TAILQ_REMOVE(
 		&hv_vmbus_g_connection.channel_msg_anchor,
 		msg_info,
 		msg_list_entry);
-	mtx_unlock_spin(&hv_vmbus_g_connection.channel_msg_lock);
+	mtx_unlock(&hv_vmbus_g_connection.channel_msg_lock);
 
 	/**
 	 * Check if successful
 	 */
 	if (msg_info->response.version_response.version_supported) {
 		hv_vmbus_g_connection.connect_state = HV_CONNECTED;
 	} else {
 		ret = ECONNREFUSED;
 	}
 
 	return (ret);
 }
 
 /**
  * Send a connect request on the partition service connection
  */
 int
 hv_vmbus_connect(void) {
 	int					ret = 0;
 	uint32_t				version;
 	hv_vmbus_channel_msg_info*		msg_info = NULL;
 
 	/**
 	 * Make sure we are not connecting or connected
 	 */
 	if (hv_vmbus_g_connection.connect_state != HV_DISCONNECTED) {
 		return (-1);
 	}
 
 	/**
 	 * Initialize the vmbus connection
 	 */
 	hv_vmbus_g_connection.connect_state = HV_CONNECTING;
 	hv_vmbus_g_connection.work_queue = hv_work_queue_create("vmbusQ");
 	sema_init(&hv_vmbus_g_connection.control_sema, 1, "control_sema");
 
 	TAILQ_INIT(&hv_vmbus_g_connection.channel_msg_anchor);
 	mtx_init(&hv_vmbus_g_connection.channel_msg_lock, "vmbus channel msg",
-		NULL, MTX_SPIN);
+		NULL, MTX_DEF);
 
 	TAILQ_INIT(&hv_vmbus_g_connection.channel_anchor);
 	mtx_init(&hv_vmbus_g_connection.channel_lock, "vmbus channel",
 		NULL, MTX_DEF);
 
 	/**
 	 * Setup the vmbus event connection for channel interrupt abstraction
 	 * stuff
 	 */
 	hv_vmbus_g_connection.interrupt_page = contigmalloc(
 					PAGE_SIZE, M_DEVBUF,
 					M_NOWAIT | M_ZERO, 0UL,
 					BUS_SPACE_MAXADDR,
 					PAGE_SIZE, 0);
 	KASSERT(hv_vmbus_g_connection.interrupt_page != NULL,
 	    ("Error VMBUS: malloc failed to allocate Channel"
 		" Request Event message!"));
 	if (hv_vmbus_g_connection.interrupt_page == NULL) {
 	    ret = ENOMEM;
 	    goto cleanup;
 	}
 
 	hv_vmbus_g_connection.recv_interrupt_page =
 		hv_vmbus_g_connection.interrupt_page;
 
 	hv_vmbus_g_connection.send_interrupt_page =
 		((uint8_t *) hv_vmbus_g_connection.interrupt_page +
 		    (PAGE_SIZE >> 1));
 
 	/**
 	 * Set up the monitor notification facility. The 1st page for
 	 * parent->child and the 2nd page for child->parent
 	 */
 	hv_vmbus_g_connection.monitor_pages = contigmalloc(
 		2 * PAGE_SIZE,
 		M_DEVBUF,
 		M_NOWAIT | M_ZERO,
 		0UL,
 		BUS_SPACE_MAXADDR,
 		PAGE_SIZE,
 		0);
 	KASSERT(hv_vmbus_g_connection.monitor_pages != NULL,
 	    ("Error VMBUS: malloc failed to allocate Monitor Pages!"));
 	if (hv_vmbus_g_connection.monitor_pages == NULL) {
 	    ret = ENOMEM;
 	    goto cleanup;
 	}
 
 	msg_info = (hv_vmbus_channel_msg_info*)
 		malloc(sizeof(hv_vmbus_channel_msg_info) +
 			sizeof(hv_vmbus_channel_initiate_contact),
 			M_DEVBUF, M_NOWAIT | M_ZERO);
 	KASSERT(msg_info != NULL,
 	    ("Error VMBUS: malloc failed for Initiate Contact message!"));
 	if (msg_info == NULL) {
 	    ret = ENOMEM;
 	    goto cleanup;
 	}
 
 	hv_vmbus_g_connection.channels = malloc(sizeof(hv_vmbus_channel*) *
 		HV_CHANNEL_MAX_COUNT,
 		M_DEVBUF, M_WAITOK | M_ZERO);
 	/*
 	 * Find the highest vmbus version number we can support.
 	 */
 	version = HV_VMBUS_VERSION_CURRENT;
 
 	do {
 		ret = hv_vmbus_negotiate_version(msg_info, version);
 		if (ret == EWOULDBLOCK) {
 			/*
 			 * We timed out.
 			 */
 			goto cleanup;
 		}
 
 		if (hv_vmbus_g_connection.connect_state == HV_CONNECTED)
 			break;
 
 		version = hv_vmbus_get_next_version(version);
 	} while (version != HV_VMBUS_VERSION_INVALID);
 
 	hv_vmbus_protocal_version = version;
 	if (bootverbose)
 		printf("VMBUS: Protocol Version: %d.%d\n",
 		    version >> 16, version & 0xFFFF);
 
 	sema_destroy(&msg_info->wait_sema);
 	free(msg_info, M_DEVBUF);
 
 	return (0);
 
 	/*
 	 * Cleanup after failure!
 	 */
 	cleanup:
 
 	hv_vmbus_g_connection.connect_state = HV_DISCONNECTED;
 
 	hv_work_queue_close(hv_vmbus_g_connection.work_queue);
 	sema_destroy(&hv_vmbus_g_connection.control_sema);
 	mtx_destroy(&hv_vmbus_g_connection.channel_lock);
 	mtx_destroy(&hv_vmbus_g_connection.channel_msg_lock);
 
 	if (hv_vmbus_g_connection.interrupt_page != NULL) {
 		contigfree(
 			hv_vmbus_g_connection.interrupt_page,
 			PAGE_SIZE,
 			M_DEVBUF);
 		hv_vmbus_g_connection.interrupt_page = NULL;
 	}
 
 	if (hv_vmbus_g_connection.monitor_pages != NULL) {
 		contigfree(
 			hv_vmbus_g_connection.monitor_pages,
 			2 * PAGE_SIZE,
 			M_DEVBUF);
 		hv_vmbus_g_connection.monitor_pages = NULL;
 	}
 
 	if (msg_info) {
 		sema_destroy(&msg_info->wait_sema);
 		free(msg_info, M_DEVBUF);
 	}
 
 	free(hv_vmbus_g_connection.channels, M_DEVBUF);
 	return (ret);
 }
 
 /**
  * Send a disconnect request on the partition service connection
  */
 int
 hv_vmbus_disconnect(void) {
 	int			 ret = 0;
 	hv_vmbus_channel_unload* msg;
 
 	msg = malloc(sizeof(hv_vmbus_channel_unload),
 	    M_DEVBUF, M_NOWAIT | M_ZERO);
 	KASSERT(msg != NULL,
 	    ("Error VMBUS: malloc failed to allocate Channel Unload Msg!"));
 	if (msg == NULL)
 	    return (ENOMEM);
 
 	msg->message_type = HV_CHANNEL_MESSAGE_UNLOAD;
 
 	ret = hv_vmbus_post_message(msg, sizeof(hv_vmbus_channel_unload));
 
 
 	contigfree(hv_vmbus_g_connection.interrupt_page, PAGE_SIZE, M_DEVBUF);
 
 	mtx_destroy(&hv_vmbus_g_connection.channel_msg_lock);
 
 	hv_work_queue_close(hv_vmbus_g_connection.work_queue);
 	sema_destroy(&hv_vmbus_g_connection.control_sema);
 
 	free(hv_vmbus_g_connection.channels, M_DEVBUF);
 	hv_vmbus_g_connection.connect_state = HV_DISCONNECTED;
 
 	free(msg, M_DEVBUF);
 
 	return (ret);
 }
 
 /**
  * Process a channel event notification
  */
 static void
 VmbusProcessChannelEvent(uint32_t relid) 
 {
 	void* arg;
 	uint32_t bytes_to_read;
 	hv_vmbus_channel* channel;
 	boolean_t is_batched_reading;
 
 	/**
 	 * Find the channel based on this relid and invokes
 	 * the channel callback to process the event
 	 */
 
 	channel = hv_vmbus_g_connection.channels[relid];
 
 	if (channel == NULL) {
 		return;
 	}
 	/**
 	 * To deal with the race condition where we might
 	 * receive a packet while the relevant driver is 
 	 * being unloaded, dispatch the callback while 
 	 * holding the channel lock. The unloading driver
 	 * will acquire the same channel lock to set the
 	 * callback to NULL. This closes the window.
 	 */
 
 	/*
 	 * Disable the lock due to newly added WITNESS check in r277723.
 	 * Will seek other way to avoid race condition.
 	 * -- whu
 	 */
 	// mtx_lock(&channel->inbound_lock);
 	if (channel->on_channel_callback != NULL) {
 		arg = channel->channel_callback_context;
 		is_batched_reading = channel->batched_reading;
 		/*
 		 * Optimize host to guest signaling by ensuring:
 		 * 1. While reading the channel, we disable interrupts from
 		 *    host.
 		 * 2. Ensure that we process all posted messages from the host
 		 *    before returning from this callback.
 		 * 3. Once we return, enable signaling from the host. Once this
 		 *    state is set we check to see if additional packets are
 		 *    available to read. In this case we repeat the process.
 		 */
 		do {
 			if (is_batched_reading)
 				hv_ring_buffer_read_begin(&channel->inbound);
 
 			channel->on_channel_callback(arg);
 
 			if (is_batched_reading)
 				bytes_to_read =
 				    hv_ring_buffer_read_end(&channel->inbound);
 			else
 				bytes_to_read = 0;
 		} while (is_batched_reading && (bytes_to_read != 0));
 	}
 	// mtx_unlock(&channel->inbound_lock);
 }
 
 /**
  * Handler for events
  */
 void
 hv_vmbus_on_events(void *arg) 
 {
 	int bit;
 	int cpu;
 	int dword;
 	void *page_addr;
 	uint32_t* recv_interrupt_page = NULL;
 	int rel_id;
 	int maxdword;
 	hv_vmbus_synic_event_flags *event;
 	/* int maxdword = PAGE_SIZE >> 3; */
 
 	cpu = (int)(long)arg;
 	KASSERT(cpu <= mp_maxid, ("VMBUS: hv_vmbus_on_events: "
 	    "cpu out of range!"));
 
 	if ((hv_vmbus_protocal_version == HV_VMBUS_VERSION_WS2008) ||
 	    (hv_vmbus_protocal_version == HV_VMBUS_VERSION_WIN7)) {
 		maxdword = HV_MAX_NUM_CHANNELS_SUPPORTED >> 5;
 		/*
 		 * receive size is 1/2 page and divide that by 4 bytes
 		 */
 		recv_interrupt_page =
 		    hv_vmbus_g_connection.recv_interrupt_page;
 	} else {
 		/*
 		 * On Host with Win8 or above, the event page can be
 		 * checked directly to get the id of the channel
 		 * that has the pending interrupt.
 		 */
 		maxdword = HV_EVENT_FLAGS_DWORD_COUNT;
 		page_addr = hv_vmbus_g_context.syn_ic_event_page[cpu];
 		event = (hv_vmbus_synic_event_flags *)
 		    page_addr + HV_VMBUS_MESSAGE_SINT;
 		recv_interrupt_page = event->flags32;
 	}
 
 	/*
 	 * Check events
 	 */
 	if (recv_interrupt_page != NULL) {
 	    for (dword = 0; dword < maxdword; dword++) {
 		if (recv_interrupt_page[dword]) {
 		    for (bit = 0; bit < HV_CHANNEL_DWORD_LEN; bit++) {
 			if (synch_test_and_clear_bit(bit,
 			    (uint32_t *) &recv_interrupt_page[dword])) {
 			    rel_id = (dword << 5) + bit;
 			    if (rel_id == 0) {
 				/*
 				 * Special case -
 				 * vmbus channel protocol msg.
 				 */
 				continue;
 			    } else {
 				VmbusProcessChannelEvent(rel_id);
 
 			    }
 			}
 		    }
 		}
 	    }
 	}
 
 	return;
 }
 
 /**
  * Send a msg on the vmbus's message connection
  */
-int hv_vmbus_post_message(void *buffer, size_t bufferLen) {
-	int ret = 0;
+int hv_vmbus_post_message(void *buffer, size_t bufferLen)
+{
 	hv_vmbus_connection_id connId;
-	unsigned retries = 0;
+	sbintime_t time = SBT_1MS;
+	int retries;
+	int ret;
 
-	/* NetScaler delays from previous code were consolidated here */
-	static int delayAmount[] = {100, 100, 100, 500, 500, 5000, 5000, 5000};
+	connId.as_uint32_t = 0;
+	connId.u.id = HV_VMBUS_MESSAGE_CONNECTION_ID;
 
-	/* for(each entry in delayAmount) try to post message,
-	 *  delay a little bit before retrying
+	/*
+	 * Retry to cope with transient failures caused by insufficient
+	 * resources on the host side. 20 attempts should suffice in practice.
 	 */
-	for (retries = 0;
-	    retries < sizeof(delayAmount)/sizeof(delayAmount[0]); retries++) {
-	    connId.as_uint32_t = 0;
-	    connId.u.id = HV_VMBUS_MESSAGE_CONNECTION_ID;
-	    ret = hv_vmbus_post_msg_via_msg_ipc(connId, 1, buffer, bufferLen);
-	    if (ret != HV_STATUS_INSUFFICIENT_BUFFERS)
-		break;
-	    /* TODO: KYS We should use a blocking wait call */
-	    DELAY(delayAmount[retries]);
+	for (retries = 0; retries < 20; retries++) {
+		ret = hv_vmbus_post_msg_via_msg_ipc(connId, 1, buffer,
+						    bufferLen);
+		if (ret == HV_STATUS_SUCCESS)
+			return (0);
+
+		pause_sbt("pstmsg", time, 0, C_HARDCLOCK);
+		if (time < SBT_1S * 2)
+			time *= 2;
 	}
 
-	KASSERT(ret == 0, ("Error VMBUS: Message Post Failed\n"));
+	KASSERT(ret == HV_STATUS_SUCCESS,
+		("Error VMBUS: Message Post Failed, ret=%d\n", ret));
 
-	return (ret);
+	return (EAGAIN);
 }
 
 /**
  * Send an event notification to the parent
  */
 int
 hv_vmbus_set_event(hv_vmbus_channel *channel) {
 	int ret = 0;
 	uint32_t child_rel_id = channel->offer_msg.child_rel_id;
 
 	/* Each uint32_t represents 32 channels */
 
 	synch_set_bit(child_rel_id & 31,
 		(((uint32_t *)hv_vmbus_g_connection.send_interrupt_page
 			+ (child_rel_id >> 5))));
 	ret = hv_vmbus_signal_event(channel->signal_event_param);
 
 	return (ret);
 }
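The retry loop in hv_vmbus_post_message() above starts with a 1 ms pause and
doubles it after every failed attempt until the pause exceeds 2 s. The sketch
below models that schedule in user space. It assumes plain millisecond units
instead of sbintime_t, and backoff_delay_ms() and total_backoff_ms() are
hypothetical names used only for illustration; neither appears in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed units: milliseconds. The kernel code uses sbintime_t instead. */
#define BACKOFF_START_MS	1u
#define BACKOFF_CAP_MS		2000u

/*
 * Pause that follows failed attempt number 'attempt' (0-based).
 * Mirrors the patch: the delay doubles only while it is below the cap,
 * so the last doubling may overshoot the cap once.
 */
uint32_t
backoff_delay_ms(int attempt)
{
	uint32_t delay = BACKOFF_START_MS;
	int i;

	assert(attempt >= 0);
	for (i = 0; i < attempt; i++) {
		if (delay < BACKOFF_CAP_MS)
			delay *= 2;
	}
	return (delay);
}

/* Total time slept if the first 'attempts' attempts all fail. */
uint32_t
total_backoff_ms(int attempts)
{
	uint32_t total = 0;
	int i;

	for (i = 0; i < attempts; i++)
		total += backoff_delay_ms(i);
	return (total);
}
```

Under these assumptions the pauses run 1, 2, 4, ..., 1024, 2048 ms and then
stay at 2048 ms, so 20 failed attempts sleep for roughly 20.5 seconds in total
before EAGAIN is returned. The real sleep times depend on how pause_sbt()
rounds with C_HARDCLOCK.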
Index: releng/10.3/sys/dev/hyperv/vmbus/hv_hv.c
===================================================================
--- releng/10.3/sys/dev/hyperv/vmbus/hv_hv.c	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/vmbus/hv_hv.c	(revision 303984)
@@ -1,429 +1,521 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 /**
  * Implements low-level interactions with Hypver-V/Azure
  */
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
 #include <sys/param.h>
+#include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/pcpu.h>
 #include <sys/timetc.h>
 #include <machine/bus.h>
 #include <machine/md_var.h>
 #include <vm/vm.h>
 #include <vm/vm_param.h>
 #include <vm/pmap.h>
 
 
 #include "hv_vmbus_priv.h"
 
 #define HV_NANOSECONDS_PER_SEC		1000000000L
 
 
 static u_int hv_get_timecount(struct timecounter *tc);
 
+u_int	hyperv_features;
+u_int	hyperv_recommends;
+
 /**
  * Globals
  */
 hv_vmbus_context hv_vmbus_g_context = {
 	.syn_ic_initialized = FALSE,
 	.hypercall_page = NULL,
 };
 
 static struct timecounter hv_timecounter = {
 	hv_get_timecount, 0, ~0u, HV_NANOSECONDS_PER_SEC/100, "Hyper-V", HV_NANOSECONDS_PER_SEC/100
 };
 
 static u_int
 hv_get_timecount(struct timecounter *tc)
 {
 	u_int now = rdmsr(HV_X64_MSR_TIME_REF_COUNT);
 	return (now);
 }
 
 /**
  * @brief Query the cpuid for presence of windows hypervisor
  */
 int
 hv_vmbus_query_hypervisor_presence(void) 
 {
 	if (vm_guest != VM_GUEST_HV)
 		return (0);
 
 	return (hv_high >= HV_X64_CPUID_MIN && hv_high <= HV_X64_CPUID_MAX);
 }
 
 /**
  * @brief Get version of the windows hypervisor
  */
 static int
 hv_vmbus_get_hypervisor_version(void) 
 {
 	u_int regs[4];
 	unsigned int maxLeaf;
 	unsigned int op;
 
 	/*
 	 * Its assumed that this is called after confirming that
 	 * Viridian is present
 	 * Query id and revision.
 	 */
 	op = HV_CPU_ID_FUNCTION_HV_VENDOR_AND_MAX_FUNCTION;
 	do_cpuid(op, regs);
 
 	maxLeaf = regs[0];
 	op = HV_CPU_ID_FUNCTION_HV_INTERFACE;
 	do_cpuid(op, regs);
 
 	if (maxLeaf >= HV_CPU_ID_FUNCTION_MS_HV_VERSION) {
 	    op = HV_CPU_ID_FUNCTION_MS_HV_VERSION;
 	    do_cpuid(op, regs);
 	}
 	return (maxLeaf);
 }
 
 /**
  * @brief Invoke the specified hypercall
  */
 static uint64_t
 hv_vmbus_do_hypercall(uint64_t control, void* input, void* output)
 {
 #ifdef __x86_64__
 	uint64_t hv_status = 0;
 	uint64_t input_address = (input) ? hv_get_phys_addr(input) : 0;
 	uint64_t output_address = (output) ? hv_get_phys_addr(output) : 0;
 	volatile void* hypercall_page = hv_vmbus_g_context.hypercall_page;
 
 	__asm__ __volatile__ ("mov %0, %%r8" : : "r" (output_address): "r8");
 	__asm__ __volatile__ ("call *%3" : "=a"(hv_status):
 				"c" (control), "d" (input_address),
 				"m" (hypercall_page));
 	return (hv_status);
 #else
 	uint32_t control_high = control >> 32;
 	uint32_t control_low = control & 0xFFFFFFFF;
 	uint32_t hv_status_high = 1;
 	uint32_t hv_status_low = 1;
 	uint64_t input_address = (input) ? hv_get_phys_addr(input) : 0;
 	uint32_t input_address_high = input_address >> 32;
 	uint32_t input_address_low = input_address & 0xFFFFFFFF;
 	uint64_t output_address = (output) ? hv_get_phys_addr(output) : 0;
 	uint32_t output_address_high = output_address >> 32;
 	uint32_t output_address_low = output_address & 0xFFFFFFFF;
 	volatile void* hypercall_page = hv_vmbus_g_context.hypercall_page;
 
 	__asm__ __volatile__ ("call *%8" : "=d"(hv_status_high),
 				"=a"(hv_status_low) : "d" (control_high),
 				"a" (control_low), "b" (input_address_high),
 				"c" (input_address_low),
 				"D"(output_address_high),
 				"S"(output_address_low), "m" (hypercall_page));
 	return (hv_status_low | ((uint64_t)hv_status_high << 32));
 #endif /* __x86_64__ */
 }
 
 /**
  *  @brief Main initialization routine.
  *
  *  This routine must be called
  *  before any other routines in here are called
  */
 int
 hv_vmbus_init(void) 
 {
 	int					max_leaf;
 	hv_vmbus_x64_msr_hypercall_contents	hypercall_msr;
 	void* 					virt_addr = 0;
 
 	memset(
 	    hv_vmbus_g_context.syn_ic_event_page,
 	    0,
 	    sizeof(hv_vmbus_handle) * MAXCPU);
 
 	memset(
 	    hv_vmbus_g_context.syn_ic_msg_page,
 	    0,
 	    sizeof(hv_vmbus_handle) * MAXCPU);
 
 	if (vm_guest != VM_GUEST_HV)
 	    goto cleanup;
 
 	max_leaf = hv_vmbus_get_hypervisor_version();
 
 	/*
 	 * Write our OS info
 	 */
 	uint64_t os_guest_info = HV_FREEBSD_GUEST_ID;
 	wrmsr(HV_X64_MSR_GUEST_OS_ID, os_guest_info);
 	hv_vmbus_g_context.guest_id = os_guest_info;
 
 	/*
 	 * See if the hypercall page is already set
 	 */
 	hypercall_msr.as_uint64_t = rdmsr(HV_X64_MSR_HYPERCALL);
 	virt_addr = malloc(PAGE_SIZE, M_DEVBUF, M_NOWAIT | M_ZERO);
 	KASSERT(virt_addr != NULL,
 	    ("Error VMBUS: malloc failed to allocate page during init!"));
 	if (virt_addr == NULL)
 	    goto cleanup;
 
 	hypercall_msr.u.enable = 1;
 	hypercall_msr.u.guest_physical_address =
 	    (hv_get_phys_addr(virt_addr) >> PAGE_SHIFT);
 	wrmsr(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64_t);
 
 	/*
 	 * Confirm that hypercall page did get set up
 	 */
 	hypercall_msr.as_uint64_t = 0;
 	hypercall_msr.as_uint64_t = rdmsr(HV_X64_MSR_HYPERCALL);
 
 	if (!hypercall_msr.u.enable)
 	    goto cleanup;
 
 	hv_vmbus_g_context.hypercall_page = virt_addr;
 
-	tc_init(&hv_timecounter); /* register virtual timecount */
-
 	hv_et_init();
 	
 	return (0);
 
 	cleanup:
 	if (virt_addr != NULL) {
 	    if (hypercall_msr.u.enable) {
 		hypercall_msr.as_uint64_t = 0;
 		wrmsr(HV_X64_MSR_HYPERCALL,
 					hypercall_msr.as_uint64_t);
 	    }
 
 	    free(virt_addr, M_DEVBUF);
 	}
 	return (ENOTSUP);
 }
 
 /**
  * @brief Cleanup routine, called normally during driver unloading or exiting
  */
 void
 hv_vmbus_cleanup(void) 
 {
 	hv_vmbus_x64_msr_hypercall_contents hypercall_msr;
 
 	if (hv_vmbus_g_context.guest_id == HV_FREEBSD_GUEST_ID) {
 	    if (hv_vmbus_g_context.hypercall_page != NULL) {
 		hypercall_msr.as_uint64_t = 0;
 		wrmsr(HV_X64_MSR_HYPERCALL,
 					hypercall_msr.as_uint64_t);
 		free(hv_vmbus_g_context.hypercall_page, M_DEVBUF);
 		hv_vmbus_g_context.hypercall_page = NULL;
 	    }
 	}
 }
 
 /**
  * @brief Post a message using the hypervisor message IPC.
  * (This involves a hypercall.)
  */
 hv_vmbus_status
 hv_vmbus_post_msg_via_msg_ipc(
 	hv_vmbus_connection_id	connection_id,
 	hv_vmbus_msg_type	message_type,
 	void*			payload,
 	size_t			payload_size)
 {
 	struct alignedinput {
 	    uint64_t alignment8;
 	    hv_vmbus_input_post_message msg;
 	};
 
 	hv_vmbus_input_post_message*	aligned_msg;
 	hv_vmbus_status 		status;
 	size_t				addr;
 
 	if (payload_size > HV_MESSAGE_PAYLOAD_BYTE_COUNT)
 	    return (EMSGSIZE);
 
 	addr = (size_t) malloc(sizeof(struct alignedinput), M_DEVBUF,
 			    M_ZERO | M_NOWAIT);
 	KASSERT(addr != 0,
 	    ("Error VMBUS: malloc failed to allocate message buffer!"));
 	if (addr == 0)
 	    return (ENOMEM);
 
 	aligned_msg = (hv_vmbus_input_post_message*)
 	    (HV_ALIGN_UP(addr, HV_HYPERCALL_PARAM_ALIGN));
 
 	aligned_msg->connection_id = connection_id;
 	aligned_msg->message_type = message_type;
 	aligned_msg->payload_size = payload_size;
 	memcpy((void*) aligned_msg->payload, payload, payload_size);
 
 	status = hv_vmbus_do_hypercall(
 		    HV_CALL_POST_MESSAGE, aligned_msg, 0) & 0xFFFF;
 
 	free((void *) addr, M_DEVBUF);
 	return (status);
 }
 
 /**
  * @brief Signal an event on the specified connection using the hypervisor
  * event IPC. (This involves a hypercall.)
  */
 hv_vmbus_status
 hv_vmbus_signal_event(void *con_id)
 {
 	hv_vmbus_status status;
 
 	status = hv_vmbus_do_hypercall(
 		    HV_CALL_SIGNAL_EVENT,
 		    con_id,
 		    0) & 0xFFFF;
 
 	return (status);
 }
 
 /**
  * @brief hv_vmbus_synic_init
  */
 void
 hv_vmbus_synic_init(void *arg)
 
 {
 	int			cpu;
 	uint64_t		hv_vcpu_index;
 	hv_vmbus_synic_simp	simp;
 	hv_vmbus_synic_siefp	siefp;
 	hv_vmbus_synic_scontrol sctrl;
 	hv_vmbus_synic_sint	shared_sint;
 	uint64_t		version;
 	hv_setup_args* 		setup_args = (hv_setup_args *)arg;
 
 	cpu = PCPU_GET(cpuid);
 
 	if (hv_vmbus_g_context.hypercall_page == NULL)
 	    return;
 
 	/*
 	 * TODO: Check the version
 	 */
 	version = rdmsr(HV_X64_MSR_SVERSION);
 	
 	hv_vmbus_g_context.syn_ic_msg_page[cpu] =
 	    setup_args->page_buffers[2 * cpu];
 	hv_vmbus_g_context.syn_ic_event_page[cpu] =
 	    setup_args->page_buffers[2 * cpu + 1];
 
 	/*
 	 * Setup the Synic's message page
 	 */
 
 	simp.as_uint64_t = rdmsr(HV_X64_MSR_SIMP);
 	simp.u.simp_enabled = 1;
 	simp.u.base_simp_gpa = ((hv_get_phys_addr(
 	    hv_vmbus_g_context.syn_ic_msg_page[cpu])) >> PAGE_SHIFT);
 
 	wrmsr(HV_X64_MSR_SIMP, simp.as_uint64_t);
 
 	/*
 	 * Setup the Synic's event page
 	 */
 	siefp.as_uint64_t = rdmsr(HV_X64_MSR_SIEFP);
 	siefp.u.siefp_enabled = 1;
 	siefp.u.base_siefp_gpa = ((hv_get_phys_addr(
 	    hv_vmbus_g_context.syn_ic_event_page[cpu])) >> PAGE_SHIFT);
 
 	wrmsr(HV_X64_MSR_SIEFP, siefp.as_uint64_t);
 
 	/*HV_SHARED_SINT_IDT_VECTOR + 0x20; */
 	shared_sint.as_uint64_t = 0;
 	shared_sint.u.vector = setup_args->vector;
 	shared_sint.u.masked = FALSE;
 	shared_sint.u.auto_eoi = TRUE;
 
 	wrmsr(HV_X64_MSR_SINT0 + HV_VMBUS_MESSAGE_SINT,
 	    shared_sint.as_uint64_t);
 
 	/* Enable the global synic bit */
 	sctrl.as_uint64_t = rdmsr(HV_X64_MSR_SCONTROL);
 	sctrl.u.enable = 1;
 
 	wrmsr(HV_X64_MSR_SCONTROL, sctrl.as_uint64_t);
 
 	hv_vmbus_g_context.syn_ic_initialized = TRUE;
 
 	/*
 	 * Set up the cpuid mapping from Hyper-V to FreeBSD.
 	 * The array is indexed using FreeBSD cpuid.
 	 */
 	hv_vcpu_index = rdmsr(HV_X64_MSR_VP_INDEX);
 	hv_vmbus_g_context.hv_vcpu_index[cpu] = (uint32_t)hv_vcpu_index;
 
 	return;
 }
 
 /**
  * @brief Cleanup routine for hv_vmbus_synic_init()
  */
 void hv_vmbus_synic_cleanup(void *arg)
 {
 	hv_vmbus_synic_sint	shared_sint;
 	hv_vmbus_synic_simp	simp;
 	hv_vmbus_synic_siefp	siefp;
 
 	if (!hv_vmbus_g_context.syn_ic_initialized)
 	    return;
 
 	shared_sint.as_uint64_t = rdmsr(
 	    HV_X64_MSR_SINT0 + HV_VMBUS_MESSAGE_SINT);
 
 	shared_sint.u.masked = 1;
 
 	/*
 	 * Disable the interrupt
 	 */
 	wrmsr(
 	    HV_X64_MSR_SINT0 + HV_VMBUS_MESSAGE_SINT,
 	    shared_sint.as_uint64_t);
 
 	simp.as_uint64_t = rdmsr(HV_X64_MSR_SIMP);
 	simp.u.simp_enabled = 0;
 	simp.u.base_simp_gpa = 0;
 
 	wrmsr(HV_X64_MSR_SIMP, simp.as_uint64_t);
 
 	siefp.as_uint64_t = rdmsr(HV_X64_MSR_SIEFP);
 	siefp.u.siefp_enabled = 0;
 	siefp.u.base_siefp_gpa = 0;
 
 	wrmsr(HV_X64_MSR_SIEFP, siefp.as_uint64_t);
 }
 
+static bool
+hyperv_identify(void)
+{
+	u_int regs[4];
+	unsigned int maxLeaf;
+	unsigned int op;
+
+	if (vm_guest != VM_GUEST_HV)
+		return (false);
+
+	op = HV_CPU_ID_FUNCTION_HV_VENDOR_AND_MAX_FUNCTION;
+	do_cpuid(op, regs);
+	maxLeaf = regs[0];
+	if (maxLeaf < HV_CPU_ID_FUNCTION_MS_HV_IMPLEMENTATION_LIMITS)
+		return (false);
+
+	op = HV_CPU_ID_FUNCTION_HV_INTERFACE;
+	do_cpuid(op, regs);
+	if (regs[0] != 0x31237648 /* HV#1 */)
+		return (false);
+
+	op = HV_CPU_ID_FUNCTION_MS_HV_FEATURES;
+	do_cpuid(op, regs);
+	if ((regs[0] & HV_FEATURE_MSR_HYPERCALL) == 0) {
+		/*
+		 * Hyper-V w/o Hypercall is impossible; someone
+		 * is faking Hyper-V.
+		 */
+		return (false);
+	}
+	hyperv_features = regs[0];
+
+	op = HV_CPU_ID_FUNCTION_MS_HV_VERSION;
+	do_cpuid(op, regs);
+	printf("Hyper-V Version: %d.%d.%d [SP%d]\n",
+	    regs[1] >> 16, regs[1] & 0xffff, regs[0], regs[2]);
+
+	printf("  Features: 0x%b\n", hyperv_features,
+	    "\020"
+	    "\001VPRUNTIME"
+	    "\002TMREFCNT"
+	    "\003SYNCIC"
+	    "\004SYNCTM"
+	    "\005APIC"
+	    "\006HYPERCALL"
+	    "\007VPINDEX"
+	    "\010RESET"
+	    "\011STATS"
+	    "\012REFTSC"
+	    "\013IDLE"
+	    "\014TMFREQ"
+	    "\015DEBUG");
+
+	op = HV_CPU_ID_FUNCTION_MS_HV_ENLIGHTENMENT_INFORMATION;
+	do_cpuid(op, regs);
+	hyperv_recommends = regs[0];
+	if (bootverbose)
+		printf("  Recommends: %08x %08x\n", regs[0], regs[1]);
+
+	op = HV_CPU_ID_FUNCTION_MS_HV_IMPLEMENTATION_LIMITS;
+	do_cpuid(op, regs);
+	if (bootverbose) {
+		printf("  Limits: Vcpu:%d Lcpu:%d Int:%d\n",
+		    regs[0], regs[1], regs[2]);
+	}
+
+	if (maxLeaf >= HV_CPU_ID_FUNCTION_MS_HV_HARDWARE_FEATURE) {
+		op = HV_CPU_ID_FUNCTION_MS_HV_HARDWARE_FEATURE;
+		do_cpuid(op, regs);
+		if (bootverbose) {
+			printf("  HW Features: %08x AMD: %08x\n",
+			    regs[0], regs[3]);
+		}
+	}
+
+	return (true);
+}
+
+static void
+hyperv_init(void *dummy __unused)
+{
+	if (!hyperv_identify())
+		return;
+
+	if (hyperv_features & HV_FEATURE_MSR_TIME_REFCNT) {
+		/* Register the Hyper-V reference counter as a timecounter. */
+		tc_init(&hv_timecounter);
+	}
+}
+SYSINIT(hyperv_initialize, SI_SUB_HYPERVISOR, SI_ORDER_FIRST, hyperv_init, NULL);
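hyperv_identify() above prints the feature word with the kernel printf(9) "%b"
conversion. Its descriptor string starts with the output base (\020 is 16)
and continues with pairs of a 1-based bit number and a name. The sketch below
is a user-space approximation of that decoding, offered for illustration only.
decode_bits() and append() are hypothetical helpers, not kernel APIs, and only
bases 8, 10 and 16 are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append at most n bytes of s to the NUL-terminated buffer out. */
static void
append(char *out, size_t outlen, const char *s, size_t n)
{
	size_t used = strlen(out);

	if (used + 1 >= outlen)
		return;
	if (n > outlen - used - 1)
		n = outlen - used - 1;
	memcpy(out + used, s, n);
	out[used + n] = '\0';
}

/*
 * Format 'value' the way a "%b" descriptor describes it: the number in
 * the requested base, followed by <NAME,NAME,...> for every named bit
 * that is set. Bit numbers in the descriptor are 1-based.
 */
void
decode_bits(unsigned int value, const char *desc, char *out, size_t outlen)
{
	int base = (unsigned char)*desc++;
	int any = 0;

	assert(out != NULL && outlen > 0);
	snprintf(out, outlen,
	    base == 8 ? "%o" : base == 10 ? "%u" : "%x", value);
	if (value == 0)
		return;

	while (*desc != '\0') {
		int bit = (unsigned char)*desc++;
		size_t namelen = 0;

		/* A name is the run of characters above ' '. */
		while ((unsigned char)desc[namelen] > ' ')
			namelen++;
		if (bit >= 1 && bit <= 32 && (value & (1u << (bit - 1)))) {
			append(out, outlen, any ? "," : "<", 1);
			append(out, outlen, desc, namelen);
			any = 1;
		}
		desc += namelen;
	}
	if (any)
		append(out, outlen, ">", 1);
}
```

With a descriptor of "\020\002TMREFCNT\006HYPERCALL", a value with bits 1 and
5 set (counting from zero, matching HV_FEATURE_MSR_TIME_REFCNT and
HV_FEATURE_MSR_HYPERCALL) is rendered as the hex number followed by both names
in angle brackets.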
Index: releng/10.3/sys/dev/hyperv/vmbus/hv_vmbus_priv.h
===================================================================
--- releng/10.3/sys/dev/hyperv/vmbus/hv_vmbus_priv.h	(revision 303983)
+++ releng/10.3/sys/dev/hyperv/vmbus/hv_vmbus_priv.h	(revision 303984)
@@ -1,771 +1,782 @@
 /*-
  * Copyright (c) 2009-2012 Microsoft Corp.
  * Copyright (c) 2012 NetApp Inc.
  * Copyright (c) 2012 Citrix Inc.
  * All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
  * 1. Redistributions of source code must retain the above copyright
  *    notice unmodified, this list of conditions, and the following
  *    disclaimer.
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
  *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
  * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
  * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  *
  * $FreeBSD$
  */
 
 #ifndef __HYPERV_PRIV_H__
 #define __HYPERV_PRIV_H__
 
 #include <sys/param.h>
 #include <sys/lock.h>
 #include <sys/mutex.h>
 #include <sys/sema.h>
 
 #include <dev/hyperv/include/hyperv.h>
 
 
 /*
  *  Status codes for hypervisor operations.
  */
 
 typedef uint16_t hv_vmbus_status;
 
 #define HV_MESSAGE_SIZE                 (256)
 #define HV_MESSAGE_PAYLOAD_BYTE_COUNT   (240)
 #define HV_MESSAGE_PAYLOAD_QWORD_COUNT  (30)
 #define HV_ANY_VP                       (0xFFFFFFFF)
 
 /*
  * Synthetic interrupt controller flag constants.
  */
 
 #define HV_EVENT_FLAGS_COUNT        (256 * 8)
 #define HV_EVENT_FLAGS_BYTE_COUNT   (256)
 #define HV_EVENT_FLAGS_DWORD_COUNT  (256 / sizeof(uint32_t))
 
 /**
  * max channel count <== event_flags_dword_count * bit_of_dword
  */
 #define HV_CHANNEL_DWORD_LEN        (32)
 #define HV_CHANNEL_MAX_COUNT        \
 	((HV_EVENT_FLAGS_DWORD_COUNT) * HV_CHANNEL_DWORD_LEN)
 /*
  * MessageId: HV_STATUS_INSUFFICIENT_BUFFERS
  * MessageText:
  *    You did not supply enough message buffers to send a message.
  */
 
+#define HV_STATUS_SUCCESS                ((uint16_t)0)
 #define HV_STATUS_INSUFFICIENT_BUFFERS   ((uint16_t)0x0013)
 
 typedef void (*hv_vmbus_channel_callback)(void *context);
 
 typedef struct {
 	void*		data;
 	uint32_t	length;
 } hv_vmbus_sg_buffer_list;
 
 typedef struct {
 	uint32_t	current_interrupt_mask;
 	uint32_t	current_read_index;
 	uint32_t	current_write_index;
 	uint32_t	bytes_avail_to_read;
 	uint32_t	bytes_avail_to_write;
 } hv_vmbus_ring_buffer_debug_info;
 
 typedef struct {
 	uint32_t 		rel_id;
 	hv_vmbus_channel_state	state;
 	hv_guid			interface_type;
 	hv_guid			interface_instance;
 	uint32_t		monitor_id;
 	uint32_t		server_monitor_pending;
 	uint32_t		server_monitor_latency;
 	uint32_t		server_monitor_connection_id;
 	uint32_t		client_monitor_pending;
 	uint32_t		client_monitor_latency;
 	uint32_t		client_monitor_connection_id;
 	hv_vmbus_ring_buffer_debug_info	inbound;
 	hv_vmbus_ring_buffer_debug_info	outbound;
 } hv_vmbus_channel_debug_info;
 
 typedef union {
 	hv_vmbus_channel_version_supported	version_supported;
 	hv_vmbus_channel_open_result		open_result;
 	hv_vmbus_channel_gpadl_torndown		gpadl_torndown;
 	hv_vmbus_channel_gpadl_created		gpadl_created;
 	hv_vmbus_channel_version_response	version_response;
 } hv_vmbus_channel_msg_response;
 
 /*
  * Represents each channel msg on the vmbus connection
  * This is a variable-size data structure depending on
  * the msg type itself
  */
 typedef struct hv_vmbus_channel_msg_info {
 	/*
 	 * Bookkeeping stuff
 	 */
 	TAILQ_ENTRY(hv_vmbus_channel_msg_info)  msg_list_entry;
 	/*
 	 * So far, this is only used to handle
 	 * gpadl body message
 	 */
 	TAILQ_HEAD(, hv_vmbus_channel_msg_info) sub_msg_list_anchor;
 	/*
 	 * Synchronize the request/response if
 	 * needed.
 	 * KYS: Use a semaphore for now.
 	 * Not perf critical.
 	 */
 	struct sema				wait_sema;
 	hv_vmbus_channel_msg_response		response;
 	uint32_t				message_size;
 	/**
 	 * The channel message that goes out on
 	 *  the "wire". It will contain at
 	 *  minimum the
 	 *  hv_vmbus_channel_msg_header
 	 * header.
 	 */
 	unsigned char 				msg[0];
 } hv_vmbus_channel_msg_info;
 
 /*
  * The format must be the same as hv_vm_data_gpa_direct
  */
 typedef struct hv_vmbus_channel_packet_page_buffer {
 	uint16_t		type;
 	uint16_t		data_offset8;
 	uint16_t		length8;
 	uint16_t		flags;
 	uint64_t		transaction_id;
 	uint32_t		reserved;
 	uint32_t		range_count;
 	hv_vmbus_page_buffer	range[HV_MAX_PAGE_BUFFER_COUNT];
 } __packed hv_vmbus_channel_packet_page_buffer;
 
 /*
  * The format must be the same as hv_vm_data_gpa_direct
  */
 typedef struct hv_vmbus_channel_packet_multipage_buffer {
 	uint16_t 			type;
 	uint16_t 			data_offset8;
 	uint16_t 			length8;
 	uint16_t 			flags;
 	uint64_t			transaction_id;
 	uint32_t 			reserved;
 	uint32_t			range_count; /* Always 1 in this case */
 	hv_vmbus_multipage_buffer	range;
 } __packed hv_vmbus_channel_packet_multipage_buffer;
 
 enum {
 	HV_VMBUS_MESSAGE_CONNECTION_ID	= 1,
 	HV_VMBUS_MESSAGE_PORT_ID	= 1,
 	HV_VMBUS_EVENT_CONNECTION_ID	= 2,
 	HV_VMBUS_EVENT_PORT_ID		= 2,
 	HV_VMBUS_MONITOR_CONNECTION_ID	= 3,
 	HV_VMBUS_MONITOR_PORT_ID	= 3,
 	HV_VMBUS_MESSAGE_SINT		= 2
 };
 
 #define HV_PRESENT_BIT		0x80000000
 
 #define HV_HYPERCALL_PARAM_ALIGN sizeof(uint64_t)
 
 typedef struct {
 	uint64_t	guest_id;
 	void*		hypercall_page;
 	hv_bool_uint8_t	syn_ic_initialized;
 
 	hv_vmbus_handle	syn_ic_msg_page[MAXCPU];
 	hv_vmbus_handle	syn_ic_event_page[MAXCPU];
 	/*
 	 * For FreeBSD cpuid to Hyper-V vcpuid mapping.
 	 */
 	uint32_t	hv_vcpu_index[MAXCPU];
 	/*
 	 * Each cpu has its own software interrupt handler for channel
 	 * event and msg handling.
 	 */
 	struct intr_event		*hv_event_intr_event[MAXCPU];
 	struct intr_event		*hv_msg_intr_event[MAXCPU];
 	void				*event_swintr[MAXCPU];
 	void				*msg_swintr[MAXCPU];
 	/*
 	 * Host use this vector to intrrupt guest for vmbus channel
 	 * event and msg.
 	 */
 	unsigned int			hv_cb_vector;
 } hv_vmbus_context;
 
 /*
  * Define hypervisor message types
  */
 typedef enum {
 
 	HV_MESSAGE_TYPE_NONE				= 0x00000000,
 
 	/*
 	 * Memory access messages
 	 */
 	HV_MESSAGE_TYPE_UNMAPPED_GPA			= 0x80000000,
 	HV_MESSAGE_TYPE_GPA_INTERCEPT			= 0x80000001,
 
 	/*
 	 * Timer notification messages
 	 */
 	HV_MESSAGE_TIMER_EXPIRED			= 0x80000010,
 
 	/*
 	 * Error messages
 	 */
 	HV_MESSAGE_TYPE_INVALID_VP_REGISTER_VALUE	= 0x80000020,
 	HV_MESSAGE_TYPE_UNRECOVERABLE_EXCEPTION		= 0x80000021,
 	HV_MESSAGE_TYPE_UNSUPPORTED_FEATURE		= 0x80000022,
 
 	/*
 	 * Trace buffer complete messages
 	 */
 	HV_MESSAGE_TYPE_EVENT_LOG_BUFFER_COMPLETE	= 0x80000040,
 
 	/*
 	 * Platform-specific processor intercept messages
 	 */
 	HV_MESSAGE_TYPE_X64_IO_PORT_INTERCEPT		= 0x80010000,
 	HV_MESSAGE_TYPE_X64_MSR_INTERCEPT		= 0x80010001,
 	HV_MESSAGE_TYPE_X64_CPU_INTERCEPT		= 0x80010002,
 	HV_MESSAGE_TYPE_X64_EXCEPTION_INTERCEPT		= 0x80010003,
 	HV_MESSAGE_TYPE_X64_APIC_EOI			= 0x80010004,
 	HV_MESSAGE_TYPE_X64_LEGACY_FP_ERROR		= 0x80010005
 
 } hv_vmbus_msg_type;
 
 /*
  * Define port identifier type
  */
 typedef union _hv_vmbus_port_id {
 	uint32_t	as_uint32_t;
 	struct {
 		uint32_t	id:24;
 		uint32_t	reserved:8;
 	} u ;
 } hv_vmbus_port_id;
 
 /*
  * Define synthetic interrupt controller message flag
  */
 typedef union {
 	uint8_t	as_uint8_t;
 	struct {
 		uint8_t	message_pending:1;
 		uint8_t	reserved:7;
 	} u;
 } hv_vmbus_msg_flags;
 
 typedef uint64_t hv_vmbus_partition_id;
 
 /*
  * Define synthetic interrupt controller message header
  */
 typedef struct {
 	hv_vmbus_msg_type	message_type;
 	uint8_t			payload_size;
 	hv_vmbus_msg_flags	message_flags;
 	uint8_t			reserved[2];
 	union {
 		hv_vmbus_partition_id	sender;
 		hv_vmbus_port_id	port;
 	} u;
 } hv_vmbus_msg_header;
 
 /*
  *  Define synthetic interrupt controller message format
  */
 typedef struct {
 	hv_vmbus_msg_header	header;
 	union {
 		uint64_t	payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
 	} u ;
 } hv_vmbus_message;
 
 /*
  *  Maximum channels is determined by the size of the interrupt
  *  page which is PAGE_SIZE. 1/2 of PAGE_SIZE is for
  *  send endpoint interrupt and the other is receive
  *  endpoint interrupt.
  *
  *   Note: (PAGE_SIZE >> 1) << 3 allocates 16348 channels
  */
 #define HV_MAX_NUM_CHANNELS			(PAGE_SIZE >> 1) << 3
 
 /*
  * (The value here must be in multiple of 32)
  */
 #define HV_MAX_NUM_CHANNELS_SUPPORTED		256
 
 /*
  * VM Bus connection states
  */
 typedef enum {
 	HV_DISCONNECTED,
 	HV_CONNECTING,
 	HV_CONNECTED,
 	HV_DISCONNECTING
 } hv_vmbus_connect_state;
 
 #define HV_MAX_SIZE_CHANNEL_MESSAGE	HV_MESSAGE_PAYLOAD_BYTE_COUNT
 
 
 typedef struct {
 	hv_vmbus_connect_state			connect_state;
 	uint32_t				next_gpadl_handle;
 	/**
 	 * Represents channel interrupts. Each bit position
 	 * represents a channel.
 	 * When a channel sends an interrupt via VMBUS, it
 	 * finds its bit in the send_interrupt_page, set it and
 	 * calls Hv to generate a port event. The other end
 	 * receives the port event and parse the
 	 * recv_interrupt_page to see which bit is set
 	 */
 	void					*interrupt_page;
 	void					*send_interrupt_page;
 	void					*recv_interrupt_page;
 	/*
 	 * 2 pages - 1st page for parent->child
 	 * notification and 2nd is child->parent
 	 * notification
 	 */
 	void					*monitor_pages;
 	TAILQ_HEAD(, hv_vmbus_channel_msg_info)	channel_msg_anchor;
 	struct mtx				channel_msg_lock;
 	/**
 	 * List of primary channels. Sub channels will be linked
 	 * under their primary channel.
 	 */
 	TAILQ_HEAD(, hv_vmbus_channel)		channel_anchor;
 	struct mtx				channel_lock;
 
 	/**
 	 * channel table for fast lookup through id.
 	 */
 	hv_vmbus_channel                        **channels;
 	hv_vmbus_handle				work_queue;
 	struct sema				control_sema;
 } hv_vmbus_connection;
 
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint64_t build_number		: 16;
 		uint64_t service_version	: 8; /* Service Pack, etc. */
 		uint64_t minor_version		: 8;
 		uint64_t major_version		: 8;
 		/*
 		 * HV_GUEST_OS_MICROSOFT_IDS (If Vendor=MS)
 		 * HV_GUEST_OS_VENDOR
 		 */
 		uint64_t os_id			: 8;
 		uint64_t vendor_id		: 16;
 	} u;
 } hv_vmbus_x64_msr_guest_os_id_contents;
 
 
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint64_t enable :1;
 		uint64_t reserved :11;
 		uint64_t guest_physical_address :52;
 	} u;
 } hv_vmbus_x64_msr_hypercall_contents;
 
 typedef union {
 	uint32_t as_uint32_t;
 	struct {
 		uint32_t group_enable :4;
 		uint32_t rsvd_z :28;
 	} u;
 } hv_vmbus_monitor_trigger_state;
 
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint32_t pending;
 		uint32_t armed;
 	} u;
 } hv_vmbus_monitor_trigger_group;
 
 typedef struct {
 	hv_vmbus_connection_id	connection_id;
 	uint16_t		flag_number;
 	uint16_t		rsvd_z;
 } hv_vmbus_monitor_parameter;
 
 /*
  * hv_vmbus_monitor_page Layout
  * ------------------------------------------------------
  * | 0   | trigger_state (4 bytes) | Rsvd1 (4 bytes)     |
  * | 8   | trigger_group[0]                              |
  * | 10  | trigger_group[1]                              |
  * | 18  | trigger_group[2]                              |
  * | 20  | trigger_group[3]                              |
  * | 28  | Rsvd2[0]                                      |
  * | 30  | Rsvd2[1]                                      |
  * | 38  | Rsvd2[2]                                      |
  * | 40  | next_check_time[0][0] | next_check_time[0][1] |
  * | ...                                                 |
  * | 240 | latency[0][0..3]                              |
  * | 340 | Rsvz3[0]                                      |
  * | 440 | parameter[0][0]                               |
  * | 448 | parameter[0][1]                               |
  * | ...                                                 |
  * | 840 | Rsvd4[0]                                      |
  * ------------------------------------------------------
  */
 
 typedef struct {
 	hv_vmbus_monitor_trigger_state	trigger_state;
 	uint32_t			rsvd_z1;
 
 	hv_vmbus_monitor_trigger_group	trigger_group[4];
 	uint64_t			rsvd_z2[3];
 
 	int32_t				next_check_time[4][32];
 
 	uint16_t			latency[4][32];
 	uint64_t			rsvd_z3[32];
 
 	hv_vmbus_monitor_parameter	parameter[4][32];
 
 	uint8_t				rsvd_z4[1984];
 } hv_vmbus_monitor_page;
 
 /*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
  * is set by CPUID(HV_CPU_ID_FUNCTION_VERSION_AND_FEATURES).
  */
 typedef enum {
 	HV_CPU_ID_FUNCTION_VERSION_AND_FEATURES			= 0x00000001,
 	HV_CPU_ID_FUNCTION_HV_VENDOR_AND_MAX_FUNCTION		= 0x40000000,
 	HV_CPU_ID_FUNCTION_HV_INTERFACE				= 0x40000001,
 	/*
 	 * The remaining functions depend on the value
 	 * of hv_cpu_id_function_interface
 	 */
 	HV_CPU_ID_FUNCTION_MS_HV_VERSION			= 0x40000002,
 	HV_CPU_ID_FUNCTION_MS_HV_FEATURES			= 0x40000003,
 	HV_CPU_ID_FUNCTION_MS_HV_ENLIGHTENMENT_INFORMATION	= 0x40000004,
-	HV_CPU_ID_FUNCTION_MS_HV_IMPLEMENTATION_LIMITS		= 0x40000005
-
+	HV_CPU_ID_FUNCTION_MS_HV_IMPLEMENTATION_LIMITS		= 0x40000005,
+	HV_CPU_ID_FUNCTION_MS_HV_HARDWARE_FEATURE		= 0x40000006
 } hv_vmbus_cpuid_function;
 
+#define	HV_FEATURE_MSR_TIME_REFCNT	(1 << 1)
+#define	HV_FEATURE_MSR_SYNCIC		(1 << 2)
+#define	HV_FEATURE_MSR_STIMER		(1 << 3)
+#define	HV_FEATURE_MSR_APIC		(1 << 4)
+#define	HV_FEATURE_MSR_HYPERCALL	(1 << 5)
+#define	HV_FEATURE_MSR_GUEST_IDLE	(1 << 10)
+
 /*
  * Define the format of the SIMP register
  */
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint64_t simp_enabled	: 1;
 		uint64_t preserved	: 11;
 		uint64_t base_simp_gpa	: 52;
 	} u;
 } hv_vmbus_synic_simp;
 
 /*
  * Define the format of the SIEFP register
  */
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint64_t siefp_enabled	: 1;
 		uint64_t preserved	: 11;
 		uint64_t base_siefp_gpa	: 52;
 	} u;
 } hv_vmbus_synic_siefp;
 
 /*
  * Define synthetic interrupt source
  */
 typedef union {
 	uint64_t as_uint64_t;
 	struct {
 		uint64_t vector		: 8;
 		uint64_t reserved1	: 8;
 		uint64_t masked		: 1;
 		uint64_t auto_eoi	: 1;
 		uint64_t reserved2	: 46;
 	} u;
 } hv_vmbus_synic_sint;
 
 /*
  * Timer configuration register.
  */
 union hv_timer_config {
 	uint64_t as_uint64;
 	struct {
 		uint64_t enable:1;
 		uint64_t periodic:1;
 		uint64_t lazy:1;
 		uint64_t auto_enable:1;
 		uint64_t reserved_z0:12;
 		uint64_t sintx:4;
 		uint64_t reserved_z1:44;
 	};
 };
 
 /*
  * Define syn_ic control register
  */
 typedef union _hv_vmbus_synic_scontrol {
     uint64_t as_uint64_t;
     struct {
         uint64_t enable		: 1;
         uint64_t reserved	: 63;
     } u;
 } hv_vmbus_synic_scontrol;
 
 /*
  *  Define the hv_vmbus_post_message hypercall input structure
  */
 typedef struct {
 	hv_vmbus_connection_id	connection_id;
 	uint32_t		reserved;
 	hv_vmbus_msg_type	message_type;
 	uint32_t		payload_size;
 	uint64_t		payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
 } hv_vmbus_input_post_message;
 
 /*
  * Define the synthetic interrupt controller event flags format
  */
 typedef union {
 	uint8_t		flags8[HV_EVENT_FLAGS_BYTE_COUNT];
 	uint32_t	flags32[HV_EVENT_FLAGS_DWORD_COUNT];
 } hv_vmbus_synic_event_flags;
 
 #define HV_X64_CPUID_MIN	(0x40000005)
 #define HV_X64_CPUID_MAX	(0x4000ffff)
 
 /*
  * Declare the MSR used to identify the guest OS
  */
 #define HV_X64_MSR_GUEST_OS_ID	(0x40000000)
 /*
  *  Declare the MSR used to setup pages used to communicate with the hypervisor
  */
 #define HV_X64_MSR_HYPERCALL	(0x40000001)
 /* MSR used to provide vcpu index */
 #define	HV_X64_MSR_VP_INDEX	(0x40000002)
 
 #define HV_X64_MSR_TIME_REF_COUNT      (0x40000020)
 
 /*
  * Define synthetic interrupt controller model specific registers
  */
 #define HV_X64_MSR_SCONTROL   (0x40000080)
 #define HV_X64_MSR_SVERSION   (0x40000081)
 #define HV_X64_MSR_SIEFP      (0x40000082)
 #define HV_X64_MSR_SIMP       (0x40000083)
 #define HV_X64_MSR_EOM        (0x40000084)
 
 #define HV_X64_MSR_SINT0      (0x40000090)
 #define HV_X64_MSR_SINT1      (0x40000091)
 #define HV_X64_MSR_SINT2      (0x40000092)
 #define HV_X64_MSR_SINT3      (0x40000093)
 #define HV_X64_MSR_SINT4      (0x40000094)
 #define HV_X64_MSR_SINT5      (0x40000095)
 #define HV_X64_MSR_SINT6      (0x40000096)
 #define HV_X64_MSR_SINT7      (0x40000097)
 #define HV_X64_MSR_SINT8      (0x40000098)
 #define HV_X64_MSR_SINT9      (0x40000099)
 #define HV_X64_MSR_SINT10     (0x4000009A)
 #define HV_X64_MSR_SINT11     (0x4000009B)
 #define HV_X64_MSR_SINT12     (0x4000009C)
 #define HV_X64_MSR_SINT13     (0x4000009D)
 #define HV_X64_MSR_SINT14     (0x4000009E)
 #define HV_X64_MSR_SINT15     (0x4000009F)
 
 /*
  * Synthetic Timer MSRs. Four timers per vcpu.
  */
 #define HV_X64_MSR_STIMER0_CONFIG		0x400000B0
 #define HV_X64_MSR_STIMER0_COUNT		0x400000B1
 #define HV_X64_MSR_STIMER1_CONFIG		0x400000B2
 #define HV_X64_MSR_STIMER1_COUNT		0x400000B3
 #define HV_X64_MSR_STIMER2_CONFIG		0x400000B4
 #define HV_X64_MSR_STIMER2_COUNT		0x400000B5
 #define HV_X64_MSR_STIMER3_CONFIG		0x400000B6
 #define HV_X64_MSR_STIMER3_COUNT		0x400000B7
 
 /*
  * Declare the various hypercall operations
  */
 typedef enum {
 	HV_CALL_POST_MESSAGE	= 0x005c,
 	HV_CALL_SIGNAL_EVENT	= 0x005d,
 } hv_vmbus_call_code;
 
 /**
  * Global variables
  */
 
 extern hv_vmbus_context		hv_vmbus_g_context;
 extern hv_vmbus_connection	hv_vmbus_g_connection;
+
+extern u_int			hyperv_features;
+extern u_int			hyperv_recommends;
 
 typedef void (*vmbus_msg_handler)(hv_vmbus_channel_msg_header *msg);
 
 typedef struct hv_vmbus_channel_msg_table_entry {
 	hv_vmbus_channel_msg_type    messageType;
 
 	bool   handler_no_sleep; /* true: the handler doesn't sleep */
 	vmbus_msg_handler   messageHandler;
 } hv_vmbus_channel_msg_table_entry;
 
 extern hv_vmbus_channel_msg_table_entry	g_channel_message_table[];
 
 /*
  * Private, VM Bus functions
  */
 
 int			hv_vmbus_ring_buffer_init(
 				hv_vmbus_ring_buffer_info	*ring_info,
 				void				*buffer,
 				uint32_t			buffer_len);
 
 void			hv_ring_buffer_cleanup(
 				hv_vmbus_ring_buffer_info	*ring_info);
 
 int			hv_ring_buffer_write(
 				hv_vmbus_ring_buffer_info	*ring_info,
 				hv_vmbus_sg_buffer_list		sg_buffers[],
 				uint32_t			sg_buff_count,
 				boolean_t			*need_sig);
 
 int			hv_ring_buffer_peek(
 				hv_vmbus_ring_buffer_info	*ring_info,
 				void				*buffer,
 				uint32_t			buffer_len);
 
 int			hv_ring_buffer_read(
 				hv_vmbus_ring_buffer_info	*ring_info,
 				void				*buffer,
 				uint32_t			buffer_len,
 				uint32_t			offset);
 
 uint32_t		hv_vmbus_get_ring_buffer_interrupt_mask(
 				hv_vmbus_ring_buffer_info	*ring_info);
 
 void			hv_vmbus_dump_ring_info(
 				hv_vmbus_ring_buffer_info	*ring_info,
 				char				*prefix);
 
 void			hv_ring_buffer_read_begin(
 				hv_vmbus_ring_buffer_info	*ring_info);
 
 uint32_t		hv_ring_buffer_read_end(
 				hv_vmbus_ring_buffer_info	*ring_info);
 
 hv_vmbus_channel*	hv_vmbus_allocate_channel(void);
 void			hv_vmbus_free_vmbus_channel(hv_vmbus_channel *channel);
 void			hv_vmbus_on_channel_message(void *context);
 int			hv_vmbus_request_channel_offers(void);
 void			hv_vmbus_release_unattached_channels(void);
 int			hv_vmbus_init(void);
 void			hv_vmbus_cleanup(void);
 
 uint16_t		hv_vmbus_post_msg_via_msg_ipc(
 				hv_vmbus_connection_id	connection_id,
 				hv_vmbus_msg_type	message_type,
 				void			*payload,
 				size_t			payload_size);
 
 uint16_t		hv_vmbus_signal_event(void *con_id);
 void			hv_vmbus_synic_init(void *irq_arg);
 void			hv_vmbus_synic_cleanup(void *arg);
 int			hv_vmbus_query_hypervisor_presence(void);
 
 struct hv_device*	hv_vmbus_child_device_create(
 				hv_guid			device_type,
 				hv_guid			device_instance,
 				hv_vmbus_channel	*channel);
 
 int			hv_vmbus_child_device_register(
 					struct hv_device *child_dev);
 int			hv_vmbus_child_device_unregister(
 					struct hv_device *child_dev);
 
 /**
  * Connection interfaces
  */
 int			hv_vmbus_connect(void);
 int			hv_vmbus_disconnect(void);
 int			hv_vmbus_post_message(void *buffer, size_t buf_size);
 int			hv_vmbus_set_event(hv_vmbus_channel *channel);
 void			hv_vmbus_on_events(void *);
 
 /**
  * Event Timer interfaces
  */
 void			hv_et_init(void);
 void			hv_et_intr(struct trapframe*);
 
 /*
  * The guest OS needs to register the guest ID with the hypervisor.
  * The guest ID is a 64 bit entity and the structure of this ID is
  * specified in the Hyper-V specification:
  *
  * http://msdn.microsoft.com/en-us/library/windows/
  * hardware/ff542653%28v=vs.85%29.aspx
  *
  * While the current guideline does not specify how FreeBSD guest ID(s)
  * need to be generated, our plan is to publish the guidelines for
  * FreeBSD and other guest operating systems that currently are hosted
  * on Hyper-V. The implementation here conforms to this yet
  * unpublished guidelines.
  *
  * Bit(s)
  * 63 - Indicates if the OS is Open Source or not; 1 is Open Source
  * 62:56 - Os Type; Linux is 0x100, FreeBSD is 0x200
  * 55:48 - Distro specific identification
  * 47:16 - FreeBSD kernel version number
  * 15:0  - Distro specific identification
  *
  */
 
 #define HV_FREEBSD_VENDOR_ID	0x8200
 #define HV_FREEBSD_GUEST_ID	hv_generate_guest_id(0,0)
 
 static inline  uint64_t hv_generate_guest_id(
 	uint8_t distro_id_part1,
 	uint16_t distro_id_part2)
 {
 	uint64_t guest_id;
 	guest_id =  (((uint64_t)HV_FREEBSD_VENDOR_ID) << 48);
 	guest_id |= (((uint64_t)(distro_id_part1)) << 48);
 	guest_id |= (((uint64_t)(__FreeBSD_version)) << 16); /* in param.h */
 	guest_id |= ((uint64_t)(distro_id_part2));
 	return guest_id;
 }
 
 typedef struct {
 	unsigned int	vector;
 	void		*page_buffers[2 * MAXCPU];
 } hv_setup_args;
 
 #endif  /* __HYPERV_PRIV_H__ */
Index: releng/10.3
===================================================================
--- releng/10.3	(revision 303983)
+++ releng/10.3	(revision 303984)

Property changes on: releng/10.3
___________________________________________________________________
Modified: svn:mergeinfo
## -0,0 +0,2 ##
   Merged /head:r297219,297635,297802-297804,298038,298385,299505,302541,302605
   Merged /stable/10:r299153,299156,300656,301924-301925,301942,302863
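
Reviewer note on the hv_vmbus_priv.h hunk above: the sketch below is a standalone userland illustration, not part of the patch. It shows how the new HV_FEATURE_MSR_* bits might be tested and how hv_generate_guest_id() packs its fields. The macro values are copied from the header. The names example_has_feature(), example_guest_id() and EXAMPLE_FREEBSD_VERSION are hypothetical. It assumes hyperv_features holds the feature bits reported by CPUID leaf HV_CPU_ID_FUNCTION_MS_HV_FEATURES, which this hunk does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from hv_vmbus_priv.h in the hunk above. */
#define	HV_FEATURE_MSR_TIME_REFCNT	(1 << 1)
#define	HV_FEATURE_MSR_SYNCIC		(1 << 2)
#define	HV_FEATURE_MSR_STIMER		(1 << 3)
#define	HV_FREEBSD_VENDOR_ID		0x8200

/*
 * Hypothetical stand-in for __FreeBSD_version from <sys/param.h>,
 * so the sketch builds outside a FreeBSD source tree.
 */
#define	EXAMPLE_FREEBSD_VERSION		1003000

/*
 * Return nonzero when every bit in 'mask' is set in 'features'.
 * 'features' is assumed to mirror the kernel's hyperv_features.
 */
static inline int
example_has_feature(unsigned int features, unsigned int mask)
{
	return ((features & mask) == mask);
}

/*
 * Same packing as hv_generate_guest_id() in the header:
 * vendor ID and distro part 1 at bit 48, version at bit 16,
 * distro part 2 in the low 16 bits.
 */
static inline uint64_t
example_guest_id(uint8_t distro_id_part1, uint16_t distro_id_part2)
{
	uint64_t guest_id;

	guest_id = ((uint64_t)HV_FREEBSD_VENDOR_ID) << 48;
	guest_id |= ((uint64_t)distro_id_part1) << 48;
	guest_id |= ((uint64_t)EXAMPLE_FREEBSD_VERSION) << 16;
	guest_id |= (uint64_t)distro_id_part2;
	return (guest_id);
}
```

With these inputs, example_guest_id(0, 0) has bit 63 set (open source), 0x02 in bits 62:56, and the version number in bits 47:16. A caller would typically check example_has_feature(features, HV_FEATURE_MSR_TIME_REFCNT) before reading HV_X64_MSR_TIME_REF_COUNT.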